From Amazon RDS

Datasets under 100 GB: Migration with dump/restore

If your dataset is under 100 GB, the simplest migration approach is to use Postgres' pg_dump and pg_restore commands. We strongly recommend validating that this approach will work for you by testing it in a staging environment before moving forward with your production clusters.
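
If you want to confirm your dataset size before choosing this approach, a quick check you can run on the RDS side is:

SELECT pg_size_pretty(pg_database_size(current_database()));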

Disable pgaudit

All Crunchy Bridge instances are created with database auditing enabled for the postgres role. To reduce excessive logging during the migration, temporarily disable pgaudit for the postgres role. While connected to your Crunchy Bridge instance as the postgres role, run:

ALTER ROLE postgres RESET pgaudit.log;

Pipe pg_dump on RDS to pg_restore on Crunchy Bridge

From a session that has access to both your RDS instance and your Crunchy Bridge instance, you can pipe the logical backup from pg_dump directly into pg_restore on your empty Crunchy Bridge instance. The pg_dump will be current as of the moment it is executed, so if you are running this as part of a production cutover, put your application in maintenance mode to ensure there are no changes to the data during the migration. It is also preferable to run the migration from a location as geographically close to both databases as possible. For example, you could use a VM in the same region as your RDS and Crunchy Bridge instances, rather than executing it from your local workstation.

$ pg_dump -Fc $SOURCE_DB_URI | pg_restore --no-acl --no-owner -d $TARGET_DB_URI
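
If the single-stream pipe proves slow, one variation is to dump into a directory-format backup first so both pg_dump and pg_restore can use parallel jobs. A sketch, with illustrative paths and job counts:

$ pg_dump -Fd -j 4 -f /tmp/dump $SOURCE_DB_URI
$ pg_restore --no-acl --no-owner -j 4 -d $TARGET_DB_URI /tmp/dump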

Update ownership and permissions

Since the restore will be executed using the postgres role, all the restored objects will also be owned by the postgres role, which means the Crunchy Bridge application role will have no default permissions on them. To give the application role appropriate database permissions, you will need to reset object ownership to the application role. To do this, run the SQL below to generate the correct commands, and then run the generated output as well.

Info

This is a two-step process. The statements below generate the commands used to grant permissions. Run the statements below in your Crunchy cluster as the postgres role, then take the output of each one, which will be ALTER statements specific to your particular schema, and run those statements to update ownership and permissions in your cluster.

-- Tables
SELECT 'ALTER TABLE '|| schemaname || '."' || tablename ||'" OWNER TO application;'
FROM pg_tables WHERE NOT schemaname IN ('pg_catalog', 'information_schema')
ORDER BY schemaname, tablename;

-- Sequences
SELECT 'ALTER SEQUENCE '|| sequence_schema || '."' || sequence_name ||'" OWNER TO application;'
FROM information_schema.sequences
WHERE NOT sequence_schema IN ('pg_catalog', 'information_schema')
ORDER BY sequence_schema, sequence_name;

-- Views
SELECT 'ALTER VIEW '|| table_schema || '."' || table_name ||'" OWNER TO application;'
FROM information_schema.views
WHERE NOT table_schema IN ('pg_catalog', 'information_schema')
AND NOT table_name IN ('pg_stat_statements', 'pg_stat_statements_info')
ORDER BY table_schema, table_name;

-- Materialized Views
SELECT 'ALTER TABLE '|| oid::regclass::text ||' OWNER TO application;'
FROM pg_class WHERE relkind = 'm'
ORDER BY oid;

Here is an example of running the first statement block above, and then running the resulting output, an ALTER TABLE statement, in order to produce the change in permissions:

postgres=# SELECT 'ALTER TABLE '|| schemaname || '."' || tablename ||'" OWNER TO application;'
FROM pg_tables WHERE NOT schemaname IN ('pg_catalog', 'information_schema')
ORDER BY schemaname, tablename;
                       ?column?
-------------------------------------------------------
 ALTER TABLE public."sampledata" OWNER TO application;
(1 row)
Time: 65.915 ms

postgres=# ALTER TABLE public."sampledata" OWNER TO application;
ALTER TABLE
Time: 292.640 ms

postgres=#
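
If you are running these queries in psql, you can skip the copy-and-paste step by ending each generator query with the \gexec meta-command, which executes every row of the output as a statement (note there is no semicolon before \gexec):

postgres=# SELECT 'ALTER TABLE '|| schemaname || '."' || tablename ||'" OWNER TO application;'
FROM pg_tables WHERE NOT schemaname IN ('pg_catalog', 'information_schema')
\gexec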

Set up any additional roles

Be sure to create any other needed roles and set appropriate permissions depending on how they are used (for example, read-only roles used by your application). Note also that the application role is intended for use by your application, while the postgres role / superuser account is intended for administration only (refer to User management concepts for details).
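
For example, a minimal read-only role might look like the following sketch (the role name, password, and schema are placeholders for illustration):

CREATE ROLE readonly_app WITH LOGIN PASSWORD 'change-me';
GRANT USAGE ON SCHEMA public TO readonly_app;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO readonly_app;
-- Tables created later by the current role will also be readable
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO readonly_app;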

Re-enable pgaudit

Once you've completed all the earlier steps, you should re-enable pgaudit on your Crunchy Bridge instance by running the following as the postgres role:

ALTER ROLE postgres SET pgaudit.log='all';

Datasets over 100 GB: Migration with logical replication

The architecture of logical replication is straightforward. The primary data source (your Amazon RDS instance) will be the publisher, and the receiving database (your new, empty Crunchy Bridge database) will be the subscriber.

In the initial load, all data is copied from the publisher to the subscriber. Following the initial data copy, any transactions made on the publisher side are sent to the subscriber.

Migrate schema from Amazon RDS to Crunchy Bridge

Logical replication only replicates data changes (DML: INSERT, UPDATE, DELETE), so you must ensure that the target database has the correct schema beforehand.

To get a schema-only dump of your source and apply it to your Crunchy Bridge database, run:

pg_dump -Fc -s $SOURCE_DB_URI | pg_restore --no-acl --no-owner -d $TARGET_DB_URI

Logical replication does not replicate DDL, so if application development continues while your migration is in progress, you must apply any schema changes made on your source database to the receiving database as well.

Publisher / Amazon RDS settings

To set up logical replication, it must be enabled in RDS by setting the rds.logical_replication parameter to 1 in a custom parameter group. This may require a reboot of your instance.
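
As a sketch, enabling it with the AWS CLI might look like the following, assuming your instance is already attached to a custom parameter group (the group name here is a placeholder):

$ aws rds modify-db-parameter-group \
    --db-parameter-group-name my-custom-params \
    --parameters "ParameterName=rds.logical_replication,ParameterValue=1,ApplyMethod=pending-reboot"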

You can check replication settings using this query:

SELECT name,setting FROM pg_settings WHERE name IN ('wal_level','rds.logical_replication');

          name           | setting
-------------------------+---------
 rds.logical_replication | on
 wal_level               | logical
(2 rows)

Slot configuration

Review the replication slot settings below and adjust them based on your usage to make sure there are sufficient resources. For very large replication projects, the defaults may need to be changed. Reach out to support if you have questions.

  • max_replication_slots
  • max_wal_senders
  • max_logical_replication_workers
  • max_worker_processes
  • max_sync_workers_per_subscription

For details on how these parameters should be set, see the PostgreSQL manual's chapter on Logical Replication Configuration Settings.
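
You can inspect the current values with a query like the one below. The slot and sender settings matter on the RDS publisher, while the worker settings apply on the Crunchy Bridge subscriber:

SELECT name, setting FROM pg_settings
WHERE name IN ('max_replication_slots', 'max_wal_senders',
               'max_logical_replication_workers', 'max_worker_processes',
               'max_sync_workers_per_subscription');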

Firewall

Make sure your network and firewall rules (for example, your RDS security group) allow inbound connections from your Crunchy Bridge database. Your cluster's CIDR can be found on the Cluster Networking page.

Replication user for the Crunchy Bridge subscriber

Ideally you will use a user with the rds_replication role or an rds_superuser account to set up replication. You can also create a dedicated user for this purpose by granting it the rds_replication role and granting it SELECT on all tables.
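
If you create a dedicated user, the setup on the RDS side might look like this sketch (the user name and password are placeholders):

CREATE ROLE migration_user WITH LOGIN PASSWORD 'change-me';
GRANT rds_replication TO migration_user;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO migration_user;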

Tables without primary keys or UNIQUE indexes

For logical replication, Postgres needs a way to uniquely identify rows to be updated/deleted. For tables with primary keys, that key is used, so first identify any tables that lack primary keys:

select tab.table_schema,
       tab.table_name
from information_schema.tables tab
left join information_schema.table_constraints tco 
          on tab.table_schema = tco.table_schema
          and tab.table_name = tco.table_name 
          and tco.constraint_type = 'PRIMARY KEY'
where tab.table_type = 'BASE TABLE'
      and tab.table_schema not in ('pg_catalog', 'information_schema')
      and tco.constraint_name is null
order by table_schema,
         table_name;

For tables without primary keys, any UNIQUE index can be used:

ALTER TABLE tablename REPLICA IDENTITY USING INDEX idx_some_unique_index;

If there are no existing UNIQUE indexes, one can be created, or the table can be set to REPLICA IDENTITY FULL, in which case the entire row is treated as its own "key". Note that FULL makes replicated UPDATEs and DELETEs more expensive, since whole rows must be compared.

ALTER TABLE tablename REPLICA IDENTITY FULL;

Create the publication of all tables

Next create a publication, which is a grouping of tables you intend to replicate. In most cases you will create a publication FOR ALL TABLES:

CREATE PUBLICATION bridge_migration FOR ALL TABLES;

Check that your tables are ready for publication:

SELECT * FROM pg_publication_tables;

In some cases, for example if your database has one very large table, you may want to group your tables into multiple separate publications. If you have any questions about this, please get in touch with us to discuss.
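
As a sketch, splitting a hypothetical very large table into its own publication could look like this (table names are placeholders); each publication then gets its own subscription on the Crunchy Bridge side:

-- Table names are illustrative
CREATE PUBLICATION bridge_migration_large FOR TABLE public.events;
CREATE PUBLICATION bridge_migration_rest FOR TABLE public.users, public.orders;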

Subscriber / Crunchy Bridge database settings

On the Crunchy Bridge side, create a subscription to each publication to begin receiving the published data. Crunchy Bridge can connect using a user with the rds_replication role, or another replication-capable user if you've made a different one.

Create the subscription

CREATE SUBSCRIPTION bridge_migration CONNECTION 'host={host} port=5432 dbname={database} user={login} password={password}' PUBLICATION bridge_migration;

Creating the subscription in this way will create a replication slot on the publisher and begin copying data from tables specified in the publication. A separate temporary slot will be created for each table for the duration of its initial data synchronization copy. You can limit how many tables are synchronized at once with the max_sync_workers_per_subscription setting on your Crunchy Bridge subscriber cluster.
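
You can confirm that the slots were created on the RDS side by querying the publisher:

SELECT slot_name, active, confirmed_flush_lsn FROM pg_replication_slots;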

Monitoring replication process

You may want to monitor this process. The pg_stat_subscription view shows the state of the workers on the subscriber side:

postgres=# select * from pg_stat_subscription;

-[ RECORD 1 ]---------+------------------------------
subid                 | 27183
subname               | bridge_migration
worker_type           | table synchronization
pid                   | 1197139
leader_pid            |
relid                 | 26721
received_lsn          |
last_msg_send_time    | 2025-02-26 15:54:45.095215+00
last_msg_receipt_time | 2025-02-26 15:54:45.095215+00
latest_end_lsn        |
latest_end_time       | 2025-02-26 15:54:45.095215+00
-[ RECORD 2 ]---------+------------------------------
subid                 | 27183
subname               | bridge_migration
worker_type           | apply
pid                   | 47075
leader_pid            |
relid                 |
received_lsn          | 4E32/7092F6F8
last_msg_send_time    | 2025-02-26 15:55:11.020012+00
last_msg_receipt_time | 2025-02-26 15:55:11.021989+00
latest_end_lsn        | 4E32/7092F3E0
latest_end_time       | 2025-02-26 15:55:10.843251+00

You can also look at the pg_subscription_rel view to see the synchronization state of each table:

select * from pg_subscription_rel;

Here, the srsubstate column can tell you about each table:

  • i - initialize
  • d - data is being copied
  • f - finished table copy
  • s - synchronized
  • r - ready (normal replication)
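
For a quick overview of how many tables are in each state, you can aggregate (a small convenience query):

SELECT srsubstate, count(*) FROM pg_subscription_rel GROUP BY srsubstate;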

Because of table bloat and other factors, you won't be able to directly compare table sizes between the source and the target. You can, however, run select count(*) on both sides to compare row counts.

Fix sequences

While logical replication will copy over all the data from the source, it doesn't update any sequences. For this reason, we recommend that you update your sequences post-cutover, before you begin production operations. The best approach is to generate setval commands for all sequences in your source database(s), which you can do with this query:

SELECT
    'SELECT setval(' || quote_literal(quote_ident(n.nspname) || '.' || quote_ident(c.relname)) || ', ' || s.last_value || ');'
FROM
    pg_class c
    JOIN pg_namespace n ON n.oid = c.relnamespace
    JOIN pg_sequences s ON s.schemaname = n.nspname
        AND s.sequencename = c.relname
WHERE
    c.relkind = 'S';

The resulting statements can then be executed on your Crunchy Bridge database to synchronize all sequences.
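
One way to run this end to end, assuming the generator query above is saved to a file named fix_sequences.sql (a hypothetical name), is to pipe the generated statements from the source straight into the target:

$ psql $SOURCE_DB_URI -AXqt -f fix_sequences.sql | psql $TARGET_DB_URI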

Testing and cutover

Now you can begin testing your application against the Crunchy Bridge database. Once you have confirmed that all the data is present, you can do a migration cutover. This will require stopping transactions on the RDS side, fixing your sequences, and pointing your application to the Crunchy Bridge database.
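
Once cutover is complete and you are sure you no longer need replication, you may also want to drop the subscription on the Crunchy Bridge side, which cleans up the replication slot on RDS:

DROP SUBSCRIPTION bridge_migration;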

Still need help?

Welcome to Crunchy Bridge, we're happy you're here! 🎉 If you find that you need help before, during, or after the end of your migration, don't hesitate to get in touch with us.