Postgres Setup Guide
Prerequisites
(Optional) Create a dedicated read-only user
Creating a dedicated read-only user is recommended for better permission control and auditing. If you choose to skip this step, you can use an existing Postgres user in your database.
To create a dedicated user, run the following command:
CREATE USER <user_name> PASSWORD 'your_password_here';
Grant the user access to the relevant schema:
GRANT USAGE ON SCHEMA <schema_name> TO <user_name>
Grant the user read-only access to the relevant tables:
GRANT SELECT ON ALL TABLES IN SCHEMA <schema_name> TO <user_name>;
Grant the user read access to tables created in the future:
ALTER DEFAULT PRIVILEGES IN SCHEMA <schema_name> GRANT SELECT ON TABLES TO <user_name>;
(Optional) Grant the user read access to the pg_read_all_stats system role, this is required for us to monitor concurrent index creation and prevent locking the process.
GRANT pg_read_all_stats TO <user_name>;
Change Data Capture (CDC)
-
Ensure that your database has enough free space for logical replication. The amount of additional disk space that the logs consume is proportional to the number of changes committed on the server.
-
In your postgresql.conf file, add or verify the following settings:
- Disable the
statement_timeoutandwal_sender_timeout parametersettings by setting both to0. - Set the
wal_levelparameter to'logical'. - Ensure that your
max_replication_slotsvalue is equal to or higher than the number of PostgreSQL connectors that use WAL plus the number of other replication slots your database uses. - Ensure that the
max_wal_sendersparameter, which specifies the maximum number of concurrent connections to the WAL, is at least twice the total number of logical replication slots. - Restart the server for changes to take effect.
- Disable the
-
Log into your PostgreSQL console as a superuser or, for AWS Aurora, one that has the
rds_superuserrole. -
Create a publication. You can create a publication for only certain tables so that you add or remove tables from the publication later on.
-
To create publication for all tables execute the following query:
CREATE PUBLICATION matia_pub FOR ALL TABLES; -
To create publication for only certain tables execute the following query:
CREATE PUBLICATION matia_pub FOR TABLE table1, table2, table3;Note that we will be able to get changes only for tables included to publication.
-
-
Create a dedicated replication slot for the database you want to sync. Matia will use this slot to track changes in your database and flush it during each successful sync.
SELECT pg_create_logical_replication_slot('name_of_your_slot', 'pgoutput') -
Grant replication privileges to your PostgreSQL user.
ALTER ROLE <user_name> WITH REPLICATION;For AWS Aurora/RDS, execute the following query instead:
GRANT rds_replication TO <user_name>; -
Verify that your PostgreSQL user can read the replication slot by running the following command. Replace
matia_slotwith your replication slot name.SELECT count(*) FROM pg_logical_slot_peek_binary_changes('matia_slot', NULL, NULL, 'proto_version', '1', 'publication_names', 'matia_pub');
Setup Guide
- Fill in the Username, Password, Hostname, Port and Database for your Postgres database.
- If the SSL configuration of your database is either
allow,requireorprefer, set the SSL toggle to ON. - Enter a Name for the connector.
- (Optional) Enter a Description for the connector.
- Select the Owner of the connector.
- (Optional) Verify that your Postgres database is successfully connected by clicking on Test Connection.
- Click Connect.
Supported Sync Modes
- Full Refresh
- Incremental
- Incremental XMIN
Incremental XMIN
The XMIN Incremental Sync is based on the hidden XMIN system column that is present in all PostgreSQL tables. This will be used as the default cursor if a user-defined cursor is not selected. To enable XMIN, ensure that the statement_timeout setting on your server is either 0 (the default value to disable the timeout) or greater than 5 minutes.
Note: An XMIN Incremental Sync will not sync deleted records.
Supported Cursors
- TIMESTAMP
- TIMESTAMP_WITH_TIMEZONE
- TIME
- TIME_WITH_TIMEZONE
- DATE
- BIT
- BOOLEAN
- TINYINT/SMALLINT
- INTEGER
- BIGINT
- FLOAT/DOUBLE
- REAL
- NUMERIC/DECIMAL
- CHAR/NCHAR/NVARCHAR/VARCHAR/LONGVARCHAR
- BINARY/BLOB