Autovacuum tips

PostgreSQL has an AUTOVACUUM process that runs in the background and automatically performs vacuum operations on tables and indexes when necessary. The autovacuum process helps to manage disk space and performance on your cluster by marking dead tuples as reusable, thus helping to reduce bloat. Read more about what bloat is in our Insights and Metrics section.

Autovacuum defaults

Autovacuum is enabled and configured by default for all Crunchy Bridge clusters. The default settings for autovacuum on Crunchy Bridge clusters are:

// Minimum # of updated or deleted tuples needed to trigger a VACUUM
autovacuum_vacuum_threshold = 50

// Fraction of table size; added to autovacuum_vacuum_threshold parameter when deciding whether to trigger a VACUUM
autovacuum_vacuum_scale_factor = 0.2

Tuning autovacuum

The default threshold for triggering autovacuum is set to 20% of the rowcount. This works well for small tables, but maintaining performant large, busy tables can require much more frequent vacuuming. Fortunately, you can configure autovacuum to run more frequently on a per-table basis.

A good rule of thumb for large tables with lots of turnover is to set the autovacuum_vacuum_scale_factor so that autovacuum is triggered when there are about 2 million tuples that have been updated or deleted. You can use the current row count for a given table to find an appropriate scale factor: autovacuum_vacuum_scale_factor = 2M / current row count.

Then you can use ALTER TABLE to set it to your computed scale factor. We'll use 0.015 here, which means autovacuum will be triggered when the number of updated or deleted tuples in this table equals autovacuum_vacuum_threshold + 1.5% of the total size of our very_large_table:

ALTER TABLE very_large_table SET (autovacuum_vacuum_scale_factor = 0.015);

Checking vacuum settings

You can quickly review a cluster's Vacuum Statistics in the Insights tab in the dashboard. This table will show you the last time your tables have been vacuumed, along with related information.

You can also review information about the most recent vacuum along with currently set thresholds (including any tuning/overrides) using this query:

SELECT psut.relname,
       to_char(psut.last_vacuum, 'YYYY-MM-DD HH24:MI') as last_vacuum,
       to_char(psut.last_autovacuum, 'YYYY-MM-DD HH24:MI') as last_autovacuum,
       to_char(pg_class.reltuples, '9G999G999G999') AS n_tup,
       to_char(psut.n_dead_tup, '999G999G999G999') AS dead_tup,
       COALESCE((SELECT split_part(x, '=', 2) FROM unnest(reloptions) q (x) WHERE x ~ '^autovacuum_vacuum_scale_factor=' ), current_setting('autovacuum_vacuum_scale_factor'))::float8 AS av_scale_factor,
       to_char(CAST(current_setting('autovacuum_vacuum_threshold') AS bigint) + (COALESCE((SELECT split_part(x, '=', 2) FROM unnest(reloptions) q (x) WHERE x ~ '^autovacuum_vacuum_scale_factor=' ), current_setting('autovacuum_vacuum_scale_factor'))::float8 * pg_class.reltuples), '9G999G999G999') AS av_threshold,
       CASE
          WHEN CAST(current_setting('autovacuum_vacuum_threshold') AS bigint) + (COALESCE((SELECT split_part(x, '=', 2) FROM unnest(reloptions) q (x) WHERE x ~ '^autovacuum_vacuum_scale_factor=' ), current_setting('autovacuum_vacuum_scale_factor'))::float8 * pg_class.reltuples) < psut.n_dead_tup
          THEN '*'
          ELSE ''
       END AS expect_av
 FROM pg_stat_user_tables psut
      JOIN pg_class
        ON psut.relid = pg_class.oid
ORDER BY
      CAST(current_setting('autovacuum_vacuum_threshold') AS bigint) + (COALESCE((SELECT split_part(x, '=', 2) FROM unnest(reloptions) q (x) WHERE x ~ '^autovacuum_vacuum_scale_factor=' ), current_setting('autovacuum_vacuum_scale_factor'))::float8 * pg_class.reltuples) DESC;