Logical replication and logical decoding in Azure Database for PostgreSQL - Flexible Server

Azure Database for PostgreSQL - Flexible Server supports the following logical data extraction and replication methodologies:

  1. Logical replication
    1. Using PostgreSQL native logical replication to replicate data objects. Logical replication allows fine-grained control over the data replication, including table-level data replication.
    2. Using pglogical extension that provides logical streaming replication and additional capabilities such as copying initial schema of the database, support for TRUNCATE, ability to replicate DDL etc.
  2. Logical decoding which is implemented by decoding the content of write-ahead log (WAL).

Comparing logical replication and logical decoding

Logical replication and logical decoding have several similarities. They both

The two technologies have their differences: Logical replication

  • allows you to specify a table or set of tables to be replicated
  • replicates data between PostgreSQL instances

Logical decoding

  • extracts changes across all tables in a database
  • cannot directly send data between PostgreSQL instances.

Note

As at this time, Flexible server does not support cross-region read replicas. Depending on the type of workload, you may choose to use logical replication feature for cross-region disaster recovery (DR) purpose.

Pre-requisites for logical replication and logical decoding

  1. Go to server parameters page on the portal.
  2. Set the server parameter wal_level to logical.
  3. If you want to use pglogical extension, search for the shared_preload_libaries parameter, and select pglogical from the drop-down box.
  4. Update max_worker_processes parameter value to at least 16. Otherwise, you may run into issues like WARNING: out of background worker slots.
  5. Save the changes and restart the server to apply the wal_level change.
  6. Confirm that your PostgreSQL instance allows network traffic from your connecting resource.
  7. Grant the admin user replication permissions.
    ALTER ROLE <adminname> WITH REPLICATION;
    
  8. You may want to make sure the role you are using has privileges on the schema that you are replicating. Otherwise, you may run into errors such as Permission denied for schema.

Using logical replication and logical decoding

Native logical replication

Logical replication uses the terms 'publisher' and 'subscriber'.

  • The publisher is the PostgreSQL database you are sending data from.
  • The subscriber is the PostgreSQL database you are sending data to.

Here's some sample code you can use to try out logical replication.

  1. Connect to the publisher database. Create a table and add some data.

    CREATE TABLE basic(id SERIAL, name varchar(40));
    INSERT INTO basic(name) VALUES ('apple');
    INSERT INTO basic(name) VALUES ('banana');
    
  2. Create a publication for the table.

    CREATE PUBLICATION pub FOR TABLE basic;
    
  3. Connect to the subscriber database. Create a table with the same schema as on the publisher.

    CREATE TABLE basic(id SERIAL, name varchar(40));
    
  4. Create a subscription that will connect to the publication you created earlier.

    CREATE SUBSCRIPTION sub CONNECTION 'host=<server>.postgres.database.azure.com user=<admin> dbname=<dbname> password=<password>' PUBLICATION pub;
    
  5. You can now query the table on the subscriber. You will see that it has received data from the publisher.

    SELECT * FROM basic;
    

    You can add more rows to the publisher's table and view the changes on the subscriber.

    If you are not able to see the data, enable the login privilege for azure_pg_admin and check the table content.

    ALTER ROLE azure_pg_admin login;
    

Visit the PostgreSQL documentation to understand more about logical replication.

pglogical extension

Here is an example of configuring pglogical at the provider database server and the subscriber. Please refer to pglogical extension documentation for more details. Also make sure you have performed pre-requisite tasks listed above.

  1. Install pglogical extension in the database in both the provider and the subscriber database servers.

    \C myDB
    CREATE EXTENSION pglogical;
    
  2. At the provider (source/publisher) database server, create the provider node.

    select pglogical.create_node( node_name := 'provider1', 
    dsn := ' host=myProviderServer.postgres.database.azure.com port=5432 dbname=myDB user=myUser password=myPassword');
    
  3. Create a replication set.

    select pglogical.create_replication_set('myreplicationset');
    
  4. Add all tables in the database to the replication set.

    SELECT pglogical.replication_set_add_all_tables('myreplicationset', '{public}'::text[]);
    

    As an alternate method, ou can also add tables from a specific schema (for example, testUser) to a default replication set.

    SELECT pglogical.replication_set_add_all_tables('default', ARRAY['testUser']);
    
  5. At the subscriber database server, create a subscriber node.

    select pglogical.create_node( node_name := 'subscriber1', 
    dsn := ' host=mySubscriberServer.postgres.database.azure.com port=5432 dbname=myDB user=myUser password=myPasword' );
    
  6. Create a subscription to start the synchronization and the replication process.

    select pglogical.create_subscription (
    subscription_name := 'subscription1',
    replication_sets := array['myreplicationset'],
    provider_dsn := 'host=myProviderServer.postgres.database.azure.com port=5432 dbname=myDB user=myUser password=myPassword');
    
  7. You can then verify the subscription status.

    SELECT subscription_name, status FROM pglogical.show_subscription_status();
    

Logical decoding

Logical decoding can be consumed via the streaming protocol or SQL interface.

Streaming protocol

Consuming changes using the streaming protocol is often preferable. You can create your own consumer / connector, or use a third-party service like Debezium.

Visit the wal2json documentation for an example using the streaming protocol with pg_recvlogical.

SQL interface

In the example below, we use the SQL interface with the wal2json plugin.

  1. Create a slot.

    SELECT * FROM pg_create_logical_replication_slot('test_slot', 'wal2json');
    
  2. Issue SQL commands. For example:

    CREATE TABLE a_table (
       id varchar(40) NOT NULL,
       item varchar(40),
       PRIMARY KEY (id)
    );
    
    INSERT INTO a_table (id, item) VALUES ('id1', 'item1');
    DELETE FROM a_table WHERE id='id1';
    
  3. Consume the changes.

    SELECT data FROM pg_logical_slot_get_changes('test_slot', NULL, NULL, 'pretty-print', '1');
    

    The output will look like:

    {
          "change": [
          ]
    }
    {
          "change": [
                   {
                            "kind": "insert",
                            "schema": "public",
                            "table": "a_table",
                            "columnnames": ["id", "item"],
                            "columntypes": ["character varying(40)", "character varying(40)"],
                            "columnvalues": ["id1", "item1"]
                   }
          ]
    }
    {
          "change": [
                   {
                            "kind": "delete",
                            "schema": "public",
                            "table": "a_table",
                            "oldkeys": {
                                  "keynames": ["id"],
                                  "keytypes": ["character varying(40)"],
                                  "keyvalues": ["id1"]
                            }
                   }
          ]
    }
    
  4. Drop the slot once you are done using it.

    SELECT pg_drop_replication_slot('test_slot'); 
    

Visit the PostgreSQL documentation to understand more about logical decoding.

Monitoring

You must monitor logical decoding. Any unused replication slot must be dropped. Slots hold on to Postgres WAL logs and relevant system catalogs until changes have been read. If your subscriber or consumer fails or has not been properly configured, the unconsumed logs will pile up and fill your storage. Also, unconsumed logs increase the risk of transaction ID wraparound. Both situations can cause the server to become unavailable. Therefore, it is critical that logical replication slots are consumed continuously. If a logical replication slot is no longer used, drop it immediately.

The 'active' column in the pg_replication_slots view will indicate whether there is a consumer connected to a slot.

SELECT * FROM pg_replication_slots;

Set alerts on the Maximum Used Transaction IDs and Storage Used flexible server metrics to notify you when the values increase past normal thresholds.

Limitations

  • Logical replication limitations apply as documented here.
  • Read replicas - Azure Database for PostgreSQL read replicas are not currently supported with flexible servers.
  • Slots and HA failover - Logical replication slots on the primary server are not available on the standby server in your secondary AZ. This applies to you if your server uses the zone-redundant high availability option. In the event of a failover to the standby server, logical replication slots will not be available on the standby.

Next steps