Citus terminology

Below is a quick collection of useful Citus terminology.

  • Cluster Group - A group of Postgres clusters defined within Crunchy Bridge that run within the same network and region to allow Citus to connect between nodes.
  • Coordinator node - This is the primary node in Citus that you will connect to in lieu of a standard Postgres instance. The coordinator routes queries to the worker nodes, at times performs final aggregation, and returns results to your application. Because most data is stored on worker nodes, the coordinator doesn't keep the bulk of data in memory, so your coordinator node is typically more CPU-focused than memory-focused.
  • Worker node - These are the nodes that will do the heavy lifting for your Citus cluster. When you distribute or shard a table, the data will then reside in shards on worker nodes. Worker nodes can be scaled independently of each other and are sized relative to your workload. Worker nodes are typically more memory-focused than CPU-focused. Worker nodes can be added to an existing Citus cluster, and once added you can use rebalancing functionality to have data moved onto new workers.
  • Distributed table - A distributed table is one whose rows are split into shards that are spread across worker nodes, so each shard contains a subset of the full dataset. This is also sometimes called a sharded table.
  • Colocation - Tables that are sharded on the same key, so that rows with the same key value are located on the same node, are referred to as co-located tables.
  • Rebalancing - The act of moving shards between nodes. Co-located shards will be rebalanced together.
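
To make the terms above concrete, here is a minimal sketch of how they might look in SQL when connected to the coordinator node. The tables accounts and events and their columns are hypothetical and used only for illustration. create_distributed_table and rebalance_table_shards are standard Citus functions, but the exact behavior can vary by Citus version (newer releases also offer citus_rebalance_start for running the rebalance in the background), so treat this as an outline rather than a definitive setup.

```sql
-- Hypothetical tables for illustration; run these on the coordinator node.
CREATE TABLE accounts (
    account_id bigint PRIMARY KEY,
    name       text NOT NULL
);

CREATE TABLE events (
    account_id bigint NOT NULL,
    event_id   bigint NOT NULL,
    payload    jsonb,
    -- Unique constraints on a distributed table must include the distribution column.
    PRIMARY KEY (account_id, event_id)
);

-- Distributed table: shard accounts on account_id across the worker nodes.
SELECT create_distributed_table('accounts', 'account_id');

-- Colocation: shard events on the same key and place matching shards
-- on the same worker nodes as the accounts shards.
SELECT create_distributed_table('events', 'account_id', colocate_with => 'accounts');

-- The coordinator routes this query to the worker holding account 42;
-- the join runs locally there because the two tables are co-located.
SELECT a.name, count(*) AS event_count
FROM accounts a
JOIN events e USING (account_id)
WHERE a.account_id = 42
GROUP BY a.name;

-- Rebalancing: after a new worker node has been added to the cluster,
-- move shards onto it. Co-located shards are moved together.
SELECT rebalance_table_shards();
```

Sharding both tables on account_id is the design choice that matters here: it lets joins on that key run on a single worker instead of moving data between nodes.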