Storage

Managed storage

Crunchy Data Warehouse comes with object storage built into the appliance. Files are written to managed storage when you create a table with the USING iceberg syntax (see Iceberg tables for details). Warehouse object storage is billed based on usage ($0.046/GB/month) and has no fixed capacity limit.
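As a minimal sketch of what writing to managed storage could look like, the statements below create an Iceberg table and add one row. The table name measurements and its columns are hypothetical and chosen only for illustration; see Iceberg tables for the options the feature actually supports.

```sql
-- Hypothetical table; the name and columns are illustrative only.
-- With no storage location specified, the table's files are assumed
-- to be written to the warehouse's managed storage.
CREATE TABLE measurements (
    station_id  int,
    recorded_at timestamptz,
    temperature double precision
)
USING iceberg;

-- Rows inserted into the table are stored as files in managed storage.
INSERT INTO measurements VALUES (1, now(), 21.5);
```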

Files in your storage can be accessed from the cluster overview screen.

Note that the name of your storage location follows the pattern 'database name'/'schema name'/'table name'/'table OID'. If your Iceberg table is in the default public schema for Postgres, the schema component of the path will be public. Regardless of the name displayed, storage repositories reside in a private network and are not publicly accessible.
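To see which values make up the storage location for a given table, you can look them up in the standard Postgres catalogs. This is a sketch only: public.measurements is a hypothetical table name, and the query just lists the four path components rather than the exact path the service displays.

```sql
-- List the components of the storage location for one table.
-- 'public.measurements' is a hypothetical table name; substitute your own.
SELECT current_database() AS database_name,
       n.nspname          AS schema_name,
       c.relname          AS table_name,
       c.oid              AS table_oid
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE c.oid = 'public.measurements'::regclass;
```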

Connecting to external storage

Crunchy Data Warehouse also supports connecting to external Amazon S3 buckets, either to store managed Iceberg tables or to import, export, and query data lake files. Support for Google Cloud Storage and Azure Blob Storage is in development but not currently available in Crunchy Data Warehouse.

Credentials for your object store are managed under Team settings > Data lake, where you enter the connection details. Each new warehouse cluster can be connected to new or existing credentials.

Connecting via any URL

Crunchy Data Warehouse can also read any publicly available Parquet file via an HTTPS URL path with the CREATE FOREIGN TABLE command:

CREATE FOREIGN TABLE taxi_another_trips()
SERVER crunchy_lake_analytics
OPTIONS (path 'https://d37ci6vzurychx.cloudfront.net/trip-data/fhvhv_tripdata_2023-01.parquet');
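Once the foreign table exists, it can be queried like any other table. The sketch below assumes that the empty column list in the statement above means the columns are taken from the Parquet file itself; the first query uses the standard information_schema view to show what was defined.

```sql
-- Inspect the columns defined on the foreign table
-- (assumed here to be inferred from the Parquet file).
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_name = 'taxi_another_trips'
ORDER BY ordinal_position;

-- Count the rows in the remote file.
SELECT count(*) FROM taxi_another_trips;
```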

Granting URL access to database users

By default, regular database users cannot perform operations that read from or write to an arbitrary URL, regardless of whether that URL points to public or private data.

Crunchy Data Warehouse comes with three different roles that you can GRANT to existing users/roles.

  • crunchy_lake_read - permission to read from an arbitrary URL via COPY ... FROM or creating a crunchy_lake_analytics table
  • crunchy_lake_write - permission to write to an arbitrary URL via COPY ... TO or creating an Iceberg table
  • crunchy_lake_read_write - permission to both read and write

For example, you can give a user permission to import from URLs with: GRANT crunchy_lake_read TO importer;
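Because these are ordinary Postgres roles, standard role functions and commands should apply to them. As a sketch, using the importer user from the example above, you could check whether the grant is in place and later withdraw it:

```sql
-- Returns true if importer is a member of crunchy_lake_read.
SELECT pg_has_role('importer', 'crunchy_lake_read', 'member');

-- Remove URL read access again when it is no longer needed.
REVOKE crunchy_lake_read FROM importer;
```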

Note that granting one of these roles implicitly gives access to all data in managed and external storage. To limit a user to specific tables instead, use table-level grants, for example to give read-only access on individual tables.
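A table-level grant uses the standard Postgres GRANT ... ON syntax. In this sketch, analyst is a hypothetical role and taxi_another_trips is the foreign table created earlier; the role receives read access to that one table without being given any of the crunchy_lake roles.

```sql
-- Hypothetical role used only for illustration.
CREATE ROLE analyst LOGIN;

-- Read-only access to a single table.
GRANT SELECT ON taxi_another_trips TO analyst;

-- Confirm the privilege (returns true once the grant is in place).
SELECT has_table_privilege('analyst', 'taxi_another_trips', 'SELECT');
```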