Getting started
Create a team for your Crunchy Data warehouse
Before you create your warehouse cluster, you'll need to decide which Team it belongs in. You can create a new Team for your warehouse clusters or provision them in existing Teams.
Creating a new warehouse
Provision a new warehouse cluster inside your team by clicking the down arrow next to the Create Cluster button. The option to create a Warehouse cluster will appear in the dropdown:
Select the region and instance size for the cluster:
Block storage
When provisioning a new warehouse cluster, you'll be asked to choose the amount of block storage for the local Postgres instance. This storage holds the PostgreSQL data directory and heap tables.
Managed object storage
In addition to local storage, Crunchy Data Warehouse comes with Amazon S3 object storage built in. This is the default location for managed Iceberg tables. You also have the option to provide your own S3 bucket for storage, provided it is in the same AWS region as your warehouse cluster.
See Iceberg tables for specific commands you can use to work with Iceberg tables. You can also set up external projects to connect to Iceberg.
Warehouse object storage is billed based on usage ($0.046/GB/month) and has no preset capacity limit.
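To illustrate the usage-based pricing, the monthly charge is simply gigabytes stored times the per-GB rate. The 500 GB figure below is a hypothetical workload, not a quota:

```sql
-- Hypothetical example: 500 GB of Iceberg data at $0.046/GB/month
select 500 * 0.046 as monthly_storage_cost_usd;  -- 23.0
```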
Creating and querying Iceberg tables
Crunchy Data Warehouse comes with specific syntax for creating Iceberg tables.
Iceberg tables can be created from data inside your database or from external sources. Here's example syntax that uses the `using iceberg` clause to load data from an external URL:
-- Convert a file directly into an Iceberg table
create table taxi_yellow ()
using iceberg
with (load_from = 'https://d37ci6vzurychx.cloudfront.net/trip-data/yellow_tripdata_2024-01.parquet');
No additional steps are needed to query the Iceberg table. Query it with regular SQL syntax, using the table name from the create statement:
select * from taxi_yellow limit 10;
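Iceberg tables can also be created from data already in your database. A minimal sketch, assuming a regular heap table named `events` already exists (the table names here are illustrative, and the `create table ... as` form is an assumption about the supported syntax):

```sql
-- Copy an existing Postgres heap table into a new Iceberg table
create table events_iceberg
using iceberg
as select * from events;
```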
Regions
The region of the managed object storage for the Iceberg tables will automatically be set to the same region as your cluster. If you want to specify a different S3 bucket location for Iceberg tables, it will have to be in the same region as well. You can import, export and query data lake files in different S3 regions, although that may result in network charges. In such cases, the warehouse will automatically detect regions and configure S3 access accordingly.
If you want to pass in a special region parameter when specifying the URL of a bucket, you can add the optional `?s3_region=[bucket_region]` parameter, but in the vast majority of cases you should not need to do this.
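If you do need the override, the parameter is appended to the bucket URL. A sketch combining it with `load_from` (the bucket name, object key, and region below are placeholders):

```sql
-- Explicitly pin the bucket's region when loading from S3
create table trips ()
using iceberg
with (load_from = 's3://my-bucket/trips.parquet?s3_region=us-west-2');
```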