Metrics and monitoring
You can find system metrics on the Crunchy Bridge Dashboard under the Metrics tab.
Metrics are automatically available on any cluster provisioned after September 28, 2022.
If your Metrics graphs are blank, Metrics are likely not enabled for the cluster. You can enable them by initiating a Cluster Refresh maintenance to replace the instance.
CPU
This graph displays processing load broken out into system load, user load, iowait, and percent CPU steal. System CPU time reflects operating system (i.e. kernel) functions, while user time reflects processing in the running Postgres instance itself.
Note: Hobby-tier plans have burstable vCPUs. This means that you can temporarily use 20x more CPU than your instance is allotted, but the CPU will be throttled back to baseline when all burst credits are depleted. This is likely to manifest as a huge drop in performance on a hobby-tier database.
If you're having performance challenges on a hobby-tier cluster, look for spikes in percent steal in the CPU graph, typically following a spike in another measure of CPU load. High percent CPU steal indicates CPU burst credit exhaustion.
Burst credits will accumulate again over time, but you may need to upgrade your cluster to achieve more consistent performance. Review plans and pricing to determine which tier is right for your use case.
Note: If you don't see % CPU steal in your cluster Metrics, you may need to refresh your cluster to receive the latest Crunchy Bridge features.
IOPS
IOPS (input/output operations per second) is available in I/O RTPS (read transactions per second) and I/O WTPS (write transactions per second). IOPS capacity varies by plan; the Plans and Pricing page shows the specifications of each plan.
To determine which queries are contributing to IOPS usage, look for ones that use a lot of disk. Crunchy Bridge runs pg_stat_statements by default on all instances, so statistics are available to review. You can query pg_stat_statements to look for a low hit rate on shared blocks, which would indicate that proportionally more data is read from disk than is being provided by cache: a low shared_blks_hit / (shared_blks_hit + shared_blks_read) ratio.
Here's an example query you can use to find queries with a low hit rate:

```sql
SELECT pss.queryid AS Query_ID
    ,pd.datname AS DB_Name
    ,pss.rows AS Total_Row_Count
    ,(pss.total_exec_time / 1000 / 60) AS Total_Exec_Mins
    ,((pss.total_exec_time / 1000 / 60) / calls) AS Total_Avg_Exec_Time
    ,shared_blks_hit / nullif(shared_blks_hit + shared_blks_read, 0)::float AS Hit_Rate
FROM pg_stat_statements AS pss
INNER JOIN pg_database AS pd ON pss.dbid = pd.oid
WHERE calls > 1000
ORDER BY 6;
```
To dig into a query shown in the output, you can run the following statement with a given queryid:

```sql
select query from pg_stat_statements where queryid = <queryid>;
```
Query and index tuning can be a big help in increasing the hit rate on the cache and thereby reducing IOPS usage. For a deeper dive, check out Query Optimization in Postgres with
pg_stat_statements on the blog.
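To see how a specific query interacts with the cache, EXPLAIN with the BUFFERS option reports shared blocks hit (served from shared_buffers) versus read (fetched from disk). A minimal sketch, using a hypothetical orders table and column:

```sql
-- ANALYZE executes the query; BUFFERS adds shared hit/read counts per plan node.
EXPLAIN (ANALYZE, BUFFERS)
SELECT *
FROM orders               -- hypothetical table
WHERE customer_id = 42;   -- hypothetical column, a candidate for an index

-- In the output, look for lines like:
--   Buffers: shared hit=12 read=3487
-- A large "read" relative to "hit" means the query is driving disk I/O;
-- an index on the filtered column may reduce the blocks scanned.
```

Run this against a query surfaced by the pg_stat_statements hit-rate query above to confirm where its disk reads come from.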
Load average
Load average shows average CPU load over the indicated time period. A load average equal to your vCPU count indicates full utilization of all CPUs. A load average in excess of your vCPU count means that processes had to wait for CPU time, with higher values meaning more time spent waiting.
Number of vCPUs varies by plan. Check the Plans and Pricing page for details about specific plans. If you are consistently seeing a high load average, you should look at tuning expensive queries or consider upgrading to a larger plan.
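Load average itself is an operating-system metric, but you can approximate the demand side from within Postgres by counting backends that are actively executing statements. A minimal sketch:

```sql
-- Count backends currently running a statement; sustained counts at or above
-- your vCPU count roughly correspond to a high load average.
SELECT count(*) AS active_backends
FROM pg_stat_activity
WHERE state = 'active';
```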
Memory
This shows the amount of process memory and the amount of swap you are using based on the plan you have provisioned. Check the Plans and Pricing page for details about specific plans.
Note that swap usage is not necessarily a bad thing. However, if you often need swap and your baseline memory usage is high, you likely need additional memory.
Postgres uses memory at a few different levels. If you're interested in the details, check out our blog post on data storage and flow.
Additionally, Postgres memory usage can be tricky to interpret. With regard to Postgres memory utilization, there are three main things at play.
Process Memory - This is memory being taken up by each backend process for its own use, including:
- the main Postmaster process
- utility processes (checkpointer, archiver, autovacuum launcher, etc.)
- any client processes, i.e. those executing query statements
These processes allocate (by default) 4 MB each for process memory, but they also reserve additional memory based on parameters like work_mem and maintenance_work_mem.
Shared Memory - This is memory used by all processes for data and transaction log caching. That is the sum of shared_buffers, wal_buffers, CLOG buffers, etc. By default we allocate 25% of system memory to shared_buffers.
Kernel Memory - Memory not being used by Postgres processes is generally used by the kernel for disk cache. The kernel is (generally) smarter about what to keep and what to push to disk.
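You can check how the main memory-related parameters are set on your cluster from a psql session. A small sketch:

```sql
-- Show the primary memory settings and their units.
SELECT name, setting, unit
FROM pg_settings
WHERE name IN ('shared_buffers', 'work_mem', 'maintenance_work_mem', 'wal_buffers');
```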
The memory graph on your Crunchy Bridge dashboard currently shows a memory_used metric which includes all memory allocated by processes. The PostgreSQL server process allocates various buffers shared by all processes, so this value includes the sum of all the Process Memory and Shared Memory described above.
On Standard and Memory instances this will usually account for 25%-30% of memory usage, although it may be larger if you have a high connection count or query activity which consumes a lot of memory. However on Hobby instances this process memory will represent a larger fraction of overall memory usage, and it's not uncommon to see this value consistently reporting 80-85% of memory in use.
The important thing to note is that Linux makes intelligent use of available memory, using it to reduce load on disks. If processes need the memory, the OS will give up some of its disk cache.
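One way to see how well the caching layers are serving your workload is the aggregate buffer cache hit ratio from pg_stat_database. A minimal sketch:

```sql
-- Fraction of block requests served from shared_buffers rather than read from
-- "disk" (reads counted here may still be satisfied by the kernel's page cache).
SELECT sum(blks_hit) / nullif(sum(blks_hit) + sum(blks_read), 0)::float AS cache_hit_ratio
FROM pg_stat_database;
```

Values persistently well below ~0.99 on a busy cluster suggest the working set does not fit in memory.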
Connections
Shows the number of connected Postgres clients over time. The Y axis uses your existing setting for max_connections. This can be altered by updating the max_connections configuration parameter.
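To compare current usage against the configured ceiling from within Postgres, you can count sessions in pg_stat_activity. A minimal sketch:

```sql
-- Current sessions versus the configured connection limit.
SELECT count(*) AS current_connections,
       current_setting('max_connections') AS max_connections
FROM pg_stat_activity;
```

If you are regularly approaching the limit, consider a connection pooler before raising max_connections, since each connection carries per-process memory overhead.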