dbt

Crunchy Bridge and Crunchy Data Warehouse integrate with Data Build Tool (dbt) to update, insert, or transform data. Installing both the dbt-core (version 1.9.2+) and the database adapter dbt-postgres(version 1.9.0+) are required.

Connection

dbt runs externally to Crunchy Bridge over a Postgres connection. Superuser access might be needed depending on the number of changes expected.

my_dbt_project:
  outputs:
    dev:
      type: postgres
      host: p.vbjrfujv5beutaoelw725gvi3i.db.postgresbridge.com
      user: postgres
      password: qNvJYuEDV2f7Oqtn7wtazQKv
      port: 5432
      dbname: postgres
      schema: public
      threads: 4
  target: dev

S3 file location

While configuring dbt, you will need to create an environment variable to point dbt at S3, like this:

export ICEBERG_LOCATION_PREFIX=<S3 location>

If you’re using the built-in Iceberg appliance for Crunchy Data Warehouse, your S3 bucket location can be accessed from psql with a show command, like this:

psql <connection-string> -c 'show crunchy_iceberg.default_location_prefix'

Models configuration

The model configuration controls how the transformation process behaves.

  • materialized='incremental': This tells dbt to perform incremental updates instead of fully rebuilding the table each time.
  • unique_key='created_at': This specifies the unique identifier for each record, used to detect new records.
  • pre_hook and post_hook: These hooks are executed before and after the model runs. In this case, the pre_hook sets the default access method to iceberg and configures the location prefix for storing Iceberg tables in S3. The post_hook resets these settings after the model has completed.
{{ config(
    materialized='incremental',
    unique_key='created_at',
    pre_hook="SET default_table_access_method TO 'iceberg'; SET crunchy_iceberg.default_location_prefix = '{{ env_var('ICEBERG_LOCATION_PREFIX', '') }}';",
    post_hook="RESET default_table_access_method; RESET crunchy_iceberg.default_location_prefix;"
) }}

DBT for Postgres and Crunchy Bridge

With dbt you can run the full range of features in the dbt-postgres adaptor for loading data and performing SQL operations in Postgres.

  • Create and manage Postgres tables inside Crunchy Bridge & Crunchy Data Warehouse
  • Load data from outside sources to Postgres
  • Utilize incremental processing for new or updated data instead of full table refreshes. MERGE actions supported

DBT for Iceberg and Crunchy Data Warehouse

dbt can be especially helpful for adding data to Iceberg for fast analytics with Crunchy Data Warehouse. With any data source, dbt can be configured to create and populate Iceberg table data. Included in the support for dbt with Iceberg is:

  • Creating and managing Iceberg tables inside Crunchy Data Warehouse
  • Data loading from outside sources to Iceberg
  • Incrementally processing and transforming data in Iceberg. This is done with a delete+insert strategy. Merge is not currently supported on Iceberg tables, but it can be used directly with the Postgres instance.