Getting started

Create a team for analytics

Before you create your analytics cluster, you'll need to decide which Team it belongs in. You can create a new Team for your analytics clusters or provision them in existing Teams.

Creating a new analytics cluster

To provision a new Analytics cluster inside your Team, click the down arrow next to the Create Cluster button and choose the option to create an Analytics cluster from the dropdown.

Then select the region and instance size for the cluster.

Granting access to existing users

Crunchy Bridge for Analytics comes with three roles that you should GRANT to your existing users or roles:

  • crunchy_lake_read
  • crunchy_lake_write
  • crunchy_lake_read_write

For example:

GRANT crunchy_lake_read_write TO application;
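
The other roles are granted the same way. The following is a minimal sketch, assuming (as the role names suggest) that crunchy_lake_read covers read-only access. The role name reporting is hypothetical and must already exist on your cluster. pg_has_role is a standard PostgreSQL function that can be used to confirm the membership afterwards.

```sql
-- "reporting" is a hypothetical, pre-existing role; substitute your own.
GRANT crunchy_lake_read TO reporting;

-- Confirm the grants took effect (each query returns true if the
-- user is a member of the named role).
SELECT pg_has_role('reporting', 'crunchy_lake_read', 'MEMBER');
SELECT pg_has_role('application', 'crunchy_lake_read_write', 'MEMBER');
```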

Connecting to cloud object store

Crunchy Bridge for Analytics currently supports connections to Amazon S3 (Google Cloud Storage and Azure Blob Storage are in development). Object store credentials are managed under Team Settings > Analytics Credentials, where you enter the connection details. Each Analytics cluster can be connected to new or existing credentials.

Connecting via any URL

Crunchy Bridge for Analytics can also read any publicly available Parquet file over an HTTPS URL by using the CREATE FOREIGN TABLE command.

CREATE FOREIGN TABLE taxi_another_trips()
SERVER crunchy_lake_analytics OPTIONS
(path 'https://d37ci6vzurychx.cloudfront.net/trip-data/fhvhv_tripdata_2023-01.parquet');
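
Once the foreign table exists, it can be queried like any other table. The sketch below assumes the empty column list in the statement above lets the columns be inferred from the Parquet file. It uses only standard PostgreSQL queries against the taxi_another_trips table defined above.

```sql
-- List the columns of the foreign table and their types.
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_name = 'taxi_another_trips'
ORDER BY ordinal_position;

-- Count the rows in the remote Parquet file.
SELECT count(*) FROM taxi_another_trips;
```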

Regions

We recommend keeping your Analytics cluster and the S3 buckets you access frequently in the same region to maximize query performance and avoid potential network charges. Buckets in other regions work without extra setup: Analytics automatically detects the regions and configures S3 access accordingly. If you need to specify the region explicitly, you can append the optional ?s3_region=[bucket_region] parameter to the URL of the bucket, but in the vast majority of cases you should not need to.
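
As an illustration of the optional region parameter, here is a sketch of a foreign table definition with an explicit region. The bucket name, object path, table name, and region are all hypothetical, and the s3:// URL form is an assumption. Only the ?s3_region= parameter itself comes from the description above.

```sql
-- Hypothetical bucket, path, and region; replace with your own values.
CREATE FOREIGN TABLE events_eu()
SERVER crunchy_lake_analytics OPTIONS
(path 's3://my-example-bucket/events/2023-01.parquet?s3_region=eu-west-1');
```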

Connecting to public S3

For testing purposes or experimentation, you may want to connect a Bridge Analytics cluster to public data in S3.

There are two kinds of public S3 buckets:

  • Public access (access granted to everyone). For these buckets, you can skip the credential setup during cluster initialization. You will continue to see a warning on the cluster overview page asking you to enter credentials for full functionality, but your Analytics cluster will still be operational without them.
  • Authenticated users group (anyone with an AWS account). For these buckets, your cluster needs to be connected to S3 with credentials so that your AWS account is recognized.