Crunchy Bridge for Analytics allows you to query files stored in object storage directly. Caching enhances query performance by storing recently accessed files on the database server, so later queries read them locally instead of downloading them from object storage again.

When you first query a file, our caching layer will start moving data to your instance to optimize performance for subsequent queries on that file. Once the files you’re working with are cached, your query performance will improve significantly.

Automatic cache management

After you access a foreign table for the first time, Crunchy Bridge for Analytics automatically begins to cache the files. The system uses a simple method to manage the cache:

  • Fetches and stores recently accessed files.
  • Removes older files from the cache to free up space when necessary.

This process is automatic and works in the background, allowing you to enjoy faster query execution without manual intervention.
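
The behavior described above can be pictured with a small model. The following Python sketch is illustrative only: it assumes a least-recently-used eviction policy and a fixed capacity in bytes, neither of which is stated in this documentation, and the class and method names are hypothetical rather than part of Crunchy Bridge for Analytics.

```python
from collections import OrderedDict


class FileCacheSketch:
    """Toy model of a size-bounded file cache (assumed LRU eviction)."""

    def __init__(self, capacity_bytes):
        self.capacity_bytes = capacity_bytes
        self.used_bytes = 0
        # Maps file path -> size in bytes, ordered from least to most
        # recently used.
        self._files = OrderedDict()

    def access(self, path, size_bytes):
        """Record a query against `path`; return True on a cache hit."""
        if path in self._files:
            # Hit: mark the file as most recently used.
            self._files.move_to_end(path)
            return True
        # Miss: a real system would download the file from object storage here.
        if size_bytes > self.capacity_bytes:
            # Assumption of this model: a file larger than the whole cache
            # is simply not cached.
            return False
        # Evict the least recently used files until the new file fits.
        while self.used_bytes + size_bytes > self.capacity_bytes:
            _, evicted_size = self._files.popitem(last=False)
            self.used_bytes -= evicted_size
        self._files[path] = size_bytes
        self.used_bytes += size_bytes
        return False

    def cached_paths(self):
        """Return cached paths, least recently used first."""
        return list(self._files)


# Example: a 100-byte cache and two files that do not fit together.
cache = FileCacheSketch(capacity_bytes=100)
cache.access("s3://your_bucket_name/a.parquet", 60)  # miss, file is cached
cache.access("s3://your_bucket_name/a.parquet", 60)  # hit, served locally
cache.access("s3://your_bucket_name/b.parquet", 50)  # miss, a.parquet is evicted
```

In this model the first query on a file is always a miss, which matches the point above that only subsequent queries benefit from the cache.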

Manual cache management

If you want more control over which files are cached, Crunchy Bridge for Analytics provides functions that let you manage the cache manually. You can use these functions to specify which files to cache or remove as needed.

crunchy_file_cache.add(PATH text, REFRESH boolean [default false])
  Adds a single file to the cache.
  PATH: path (URL) of the file to add.
  REFRESH (optional): forces the file to be downloaded into the cache again even if it is already there.

crunchy_file_cache.remove(PATH text)
  Removes a single file from the cache.
  PATH: path (URL) of the file to remove.

crunchy_file_cache.list()
  Lists all files in the cache.

Example caching calls:

-- Manually start downloading a file into the cache; download it again if it is already cached
SELECT crunchy_file_cache.add('s3://your_bucket_name/file_to_be_cached.xx', true);

-- Manually remove a file from the cache
SELECT crunchy_file_cache.remove('s3://your_bucket_name/file_to_be_removed_from_cached.xx');

-- List all files in the cache
SELECT crunchy_file_cache.list();