Commit b490581

author
Cary Huang
committed
updates
1 parent 1c2b500 commit b490581

5 files changed

+557
-5
lines changed

Lines changed: 136 additions & 1 deletion
@@ -1 +1,136 @@
1-
# Oracle CDC to PostgreSQL
1+
# Oracle CDC to PostgreSQL
2+
3+
## Prepare Oracle Database for SynchDB
4+
5+
Before SynchDB can replicate from Oracle, Oracle must be configured according to the procedure outlined [here](https://docs.synchdb.com/getting-started/remote_database_setups/).
6+
7+
Ensure that supplemental log data is enabled for all columns of each table that SynchDB will replicate. SynchDB needs this to handle UPDATE and DELETE operations correctly.
8+
9+
For example, the following enables supplemental log data for all columns of the `customer` and `products` tables. Add more tables as needed.
10+
11+
```sql
12+
ALTER TABLE customer ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;
13+
ALTER TABLE products ADD SUPPLEMENTAL LOG DATA (ALL) COLUMNS;
14+
-- ... repeat for each additional table
15+
```
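One way to confirm that the setting took effect is to query Oracle's `DBA_LOG_GROUPS` dictionary view. This is a sketch, assuming the session can read the DBA views and that the tables were created with unquoted (and therefore uppercase) names; neither is stated in this tutorial.

```sql
-- List the supplemental log groups for the two example tables.
-- Tables enabled with (ALL) COLUMNS are expected to report
-- LOG_GROUP_TYPE = 'ALL COLUMN LOGGING'.
SELECT owner, table_name, log_group_type
FROM dba_log_groups
WHERE table_name IN ('CUSTOMER', 'PRODUCTS');
```

A table that does not appear in the result has no supplemental log group yet and still needs the `ALTER TABLE` statement above.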
16+
17+
## Create an Oracle Connector
18+
19+
Create a connector that targets all the tables in the `FREE` database in Oracle.
20+
```sql
21+
SELECT
22+
synchdb_add_conninfo(
23+
'oracleconn', '127.0.0.1', 1521,
24+
'c##dbzuser', 'dbz', 'FREE', 'postgres',
25+
'null', 'null', 'oracle');
26+
```
27+
28+
## Initial Snapshot + CDC
29+
30+
Starting the connector in `initial` mode performs the initial snapshot of all designated tables (all of them in this case). After the snapshot completes, the change data capture (CDC) process begins streaming new changes.
31+
32+
```sql
33+
SELECT synchdb_start_engine_bgw('oracleconn', 'initial');
34+
35+
-- or
36+
37+
SELECT synchdb_start_engine_bgw('oracleconn');
38+
```
39+
40+
The connector stage should be `initial snapshot` the first time it runs:
41+
```sql
42+
postgres=# select * from synchdb_state_view where name='oracleconn';
43+
name | connector_type | pid | stage | state | err | last_dbz_offset
44+
------------+----------------+--------+------------------+---------+----------+-----------------------------
45+
oracleconn | oracle | 528146 | initial snapshot | polling | no error | offset file not flushed yet
46+
(1 row)
47+
48+
```
49+
50+
A new schema called `free` will be created, and all tables streamed by the connector will be replicated under that schema.
51+
```sql
52+
postgres=# set search_path=public,free;
53+
SET
54+
postgres=# \d
55+
List of relations
56+
Schema | Name | Type | Owner
57+
--------+--------------------+-------+--------
58+
free | orders | table | ubuntu
59+
public | synchdb_att_view | view | ubuntu
60+
public | synchdb_attribute | table | ubuntu
61+
public | synchdb_conninfo | table | ubuntu
62+
public | synchdb_objmap | table | ubuntu
63+
public | synchdb_state_view | view | ubuntu
64+
public | synchdb_stats_view | view | ubuntu
65+
(7 rows)
66+
67+
```
68+
69+
After the initial snapshot is completed and at least one subsequent change has been received and processed, the connector stage changes from `initial snapshot` to `change data capture`.
70+
```sql
71+
postgres=# select * from synchdb_state_view where name='oracleconn';
72+
name | connector_type | pid | stage | state | err |
73+
last_dbz_offset
74+
------------+----------------+--------+---------------------+---------+----------+-------------------------------
75+
-------------------------------------------------------
76+
oracleconn | oracle | 528414 | change data capture | polling | no error | {"commit_scn":"3118146:1:02001
77+
f00c0020000","snapshot_scn":"3081987","scn":"3118125"}
78+
(1 row)
79+
80+
81+
```
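Because `last_dbz_offset` is wide enough to wrap in the output above, it can be easier to select only the status columns when polling the connector. A minimal sketch, using the column names shown in the view output:

```sql
-- Show only the status columns so the row fits on one line.
SELECT name, stage, state, err
FROM synchdb_state_view
WHERE name = 'oracleconn';
```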
82+
83+
This means that the connector is now streaming new changes to the designated tables. Restarting the connector in `initial` mode resumes replication from the last successful point; the initial snapshot is not re-run.
84+
85+
## Initial Snapshot Only and no CDC
86+
87+
Starting the connector in `initial_only` mode performs only the initial snapshot of all designated tables (all of them in this case) and does not perform CDC afterwards.
88+
89+
```sql
90+
SELECT synchdb_start_engine_bgw('oracleconn', 'initial_only');
91+
92+
```
93+
94+
The connector's state will still appear as `polling`, but no changes will be captured because Debezium has internally stopped CDC. You can shut the connector down at this point. Restarting the connector in `initial_only` mode will not rebuild the tables, as they have already been built.
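If the connector is no longer needed after the snapshot, it can be stopped. The following is a sketch, assuming this SynchDB installation provides a `synchdb_stop_engine_bgw` function that takes the connector name. The function is not shown elsewhere in this tutorial, so check the function reference for the installed version.

```sql
-- Assumption: synchdb_stop_engine_bgw(name) stops the background worker
-- that synchdb_start_engine_bgw started for the same connector.
SELECT synchdb_stop_engine_bgw('oracleconn');

-- Check the connector's state afterwards.
SELECT name, stage, state, err
FROM synchdb_state_view
WHERE name = 'oracleconn';
```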
95+
96+
## Capture Table Schema Only + CDC
97+
98+
Starting the connector in `no_data` mode performs the schema capture only: it builds the corresponding tables in PostgreSQL but does not replicate existing table data (the initial snapshot is skipped). After the schema capture is completed, the connector enters CDC mode and starts capturing subsequent changes to the tables.
99+
100+
```sql
101+
SELECT synchdb_start_engine_bgw('oracleconn', 'no_data');
102+
103+
```
104+
105+
Restarting the connector in `no_data` mode will not rebuild the schema, and CDC resumes from the last successful point.
106+
107+
## CDC only
108+
109+
Starting the connector in `never` mode skips schema capture and the initial snapshot entirely and goes straight to CDC mode to capture subsequent changes. Note that the connector expects all captured tables to have been created in PostgreSQL before it is started in `never` mode. If the tables do not exist, the connector will encounter an error when it tries to apply a CDC change to a non-existent table.
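As an illustration of preparing the destination by hand, the sketch below creates the `free` schema and one table before the connector is started. The column names and types are hypothetical; they must match the source table definition and the data type mapping that SynchDB applies, neither of which is shown here.

```sql
-- Hypothetical destination table for an Oracle table named ORDERS.
CREATE SCHEMA IF NOT EXISTS free;

CREATE TABLE IF NOT EXISTS free.orders (
    order_number integer PRIMARY KEY,  -- hypothetical column
    order_date   timestamp,            -- hypothetical column
    quantity     integer               -- hypothetical column
);
```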
110+
111+
```sql
112+
SELECT synchdb_start_engine_bgw('oracleconn', 'never');
113+
114+
```
115+
116+
Restarting the connector in `never` mode resumes CDC from the last successful point.
117+
118+
## Always do Initial Snapshot + CDC
119+
120+
Starting the connector in `always` mode always captures the schemas of the captured tables, always redoes the initial snapshot, and then goes to CDC. This is similar to a reset button because everything is rebuilt in this mode. Use it with caution, especially when a large number of tables are being captured, as the rebuild could take a long time to finish. After the rebuild, CDC resumes as normal.
121+
122+
```sql
123+
SELECT synchdb_start_engine_bgw('oracleconn', 'always');
124+
125+
```
126+
127+
However, it is possible to redo the initial snapshot for only a subset of tables by using the `snapshottable` option of the connector. Tables matching the criteria in `snapshottable` will redo the initial snapshot; all other tables will have their initial snapshot skipped. If `snapshottable` is null or empty, all the tables specified in the `table` option of the connector will, by default, redo the initial snapshot under `always` mode.
128+
129+
This example makes the connector redo the initial snapshot of only the `free.customers` table. All other tables will have their snapshot skipped.
130+
```sql
131+
UPDATE synchdb_conninfo
132+
SET data = jsonb_set(data, '{snapshottable}', '"free.customers"')
133+
WHERE name = 'oracleconn';
134+
```
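To check what the connector will use, the stored value can be read back. This sketch relies only on `data` being a `jsonb` column, as implied by the `jsonb_set` call above:

```sql
-- Read back the snapshottable setting for the connector.
SELECT name, data ->> 'snapshottable' AS snapshottable
FROM synchdb_conninfo
WHERE name = 'oracleconn';
```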
135+
136+
After the initial snapshot, CDC will begin. Restarting a connector in `always` mode will repeat the same process described above.
Lines changed: 144 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1 +1,144 @@
1-
# SQL Server CDC to PostgreSQL
1+
# SQL Server CDC to PostgreSQL
2+
3+
## Prepare SQL Server Database for SynchDB
4+
5+
Before SynchDB can replicate from SQL Server, SQL Server must be configured according to the procedure outlined [here](https://docs.synchdb.com/getting-started/remote_database_setups/).
6+
7+
Ensure that the desired tables have already been enabled as CDC tables in SQL Server. The following commands can be run in a SQL Server client to enable CDC for `dbo.customer`, `dbo.district`, and `dbo.history`. Add more tables as needed.
8+
9+
```sql
10+
USE MyDB
11+
GO
12+
EXEC sys.sp_cdc_enable_table @source_schema = 'dbo', @source_name = 'customer', @role_name = NULL, @supports_net_changes = 0;
13+
EXEC sys.sp_cdc_enable_table @source_schema = 'dbo', @source_name = 'district', @role_name = NULL, @supports_net_changes = 0;
14+
EXEC sys.sp_cdc_enable_table @source_schema = 'dbo', @source_name = 'history', @role_name = NULL, @supports_net_changes = 0;
15+
GO
16+
```
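CDC status can be verified from the SQL Server catalog views before starting the connector. The following is a sketch using `sys.databases` and `sys.tables`, run in the same database as the commands above:

```sql
-- Database-level CDC must be on before table-level CDC can be enabled.
SELECT name, is_cdc_enabled
FROM sys.databases
WHERE name = DB_NAME();

-- Each example table should report is_tracked_by_cdc = 1.
SELECT s.name AS schema_name, t.name AS table_name, t.is_tracked_by_cdc
FROM sys.tables AS t
JOIN sys.schemas AS s ON s.schema_id = t.schema_id
WHERE s.name = 'dbo'
  AND t.name IN ('customer', 'district', 'history');
```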
17+
18+
## Create a SQL Server Connector
19+
20+
Create a connector that targets all the tables in the `testDB` database in SQL Server.
21+
```sql
22+
SELECT
23+
synchdb_add_conninfo(
24+
'sqlserverconn', '127.0.0.1', 1433,
25+
'sa', 'Password!', 'testDB', 'postgres',
26+
'null', 'null', 'sqlserver');
27+
```
28+
29+
## Initial Snapshot + CDC
30+
31+
Starting the connector in `initial` mode performs the initial snapshot of all designated tables (all of them in this case). After the snapshot completes, the change data capture (CDC) process begins streaming new changes.
32+
33+
```sql
34+
SELECT synchdb_start_engine_bgw('sqlserverconn', 'initial');
35+
36+
-- or
37+
38+
SELECT synchdb_start_engine_bgw('sqlserverconn');
39+
```
40+
41+
The connector stage should be `initial snapshot` the first time it runs:
42+
```sql
43+
postgres=# select * from synchdb_state_view where name='sqlserverconn';
44+
name | connector_type | pid | stage | state | err | last_dbz_offset
45+
---------------+----------------+--------+------------------+---------+----------+-----------------------------
46+
sqlserverconn | sqlserver | 526003 | initial snapshot | polling | no error | offset file not flushed yet
47+
(1 row)
48+
49+
50+
```
51+
52+
A new schema called `testdb` will be created, and all tables streamed by the connector will be replicated under that schema.
53+
```sql
54+
postgres=# set search_path=public,testdb;
55+
SET
56+
postgres=# \d
57+
List of relations
58+
Schema | Name | Type | Owner
59+
--------+-------------------------+----------+--------
60+
public | synchdb_att_view | view | ubuntu
61+
public | synchdb_attribute | table | ubuntu
62+
public | synchdb_conninfo | table | ubuntu
63+
public | synchdb_objmap | table | ubuntu
64+
public | synchdb_state_view | view | ubuntu
65+
public | synchdb_stats_view | view | ubuntu
66+
testdb | customers | table | ubuntu
67+
testdb | customers_id_seq | sequence | ubuntu
68+
testdb | orders | table | ubuntu
69+
testdb | orders_order_number_seq | sequence | ubuntu
70+
testdb | products | table | ubuntu
71+
testdb | products_id_seq | sequence | ubuntu
72+
testdb | products_on_hand | table | ubuntu
73+
(13 rows)
74+
75+
```
76+
77+
After the initial snapshot is completed and at least one subsequent change has been received and processed, the connector stage changes from `initial snapshot` to `change data capture`.
78+
```sql
79+
postgres=# select * from synchdb_state_view where name='sqlserverconn';
80+
name | connector_type | pid | stage | state | err |
81+
last_dbz_offset
82+
---------------+----------------+--------+---------------------+---------+----------+-----------------------------
83+
----------------------------------------------------------------------
84+
sqlserverconn | sqlserver | 526290 | change data capture | polling | no error | {"event_serial_no":1,"commit
85+
_lsn":"0000002b:000004d8:0004","change_lsn":"0000002b:000004d8:0003"}
86+
(1 row)
87+
88+
```
89+
90+
This means that the connector is now streaming new changes to the designated tables. Restarting the connector in `initial` mode resumes replication from the last successful point; the initial snapshot is not re-run.
91+
92+
## Initial Snapshot Only and no CDC
93+
94+
Starting the connector in `initial_only` mode performs only the initial snapshot of all designated tables (all of them in this case) and does not perform CDC afterwards.
95+
96+
```sql
97+
SELECT synchdb_start_engine_bgw('sqlserverconn', 'initial_only');
98+
99+
```
100+
101+
The connector's state will still appear as `polling`, but no changes will be captured because Debezium has internally stopped CDC. You can shut the connector down at this point. Restarting the connector in `initial_only` mode will not rebuild the tables, as they have already been built.
102+
103+
104+
## Capture Table Schema Only + CDC
105+
106+
Starting the connector in `no_data` mode performs the schema capture only: it builds the corresponding tables in PostgreSQL but does not replicate existing table data (the initial snapshot is skipped). After the schema capture is completed, the connector enters CDC mode and starts capturing subsequent changes to the tables.
107+
108+
```sql
109+
SELECT synchdb_start_engine_bgw('sqlserverconn', 'no_data');
110+
111+
```
112+
113+
Restarting the connector in `no_data` mode will not rebuild the schema, and CDC resumes from the last successful point.
114+
115+
## CDC only
116+
117+
Starting the connector in `never` mode skips schema capture and the initial snapshot entirely and goes straight to CDC mode to capture subsequent changes. Note that the connector expects all captured tables to have been created in PostgreSQL before it is started in `never` mode. If the tables do not exist, the connector will encounter an error when it tries to apply a CDC change to a non-existent table.
118+
119+
```sql
120+
SELECT synchdb_start_engine_bgw('sqlserverconn', 'never');
121+
122+
```
123+
124+
Restarting the connector in `never` mode resumes CDC from the last successful point.
125+
126+
## Always do Initial Snapshot + CDC
127+
128+
Starting the connector in `always` mode always captures the schemas of the captured tables, always redoes the initial snapshot, and then goes to CDC. This is similar to a reset button because everything is rebuilt in this mode. Use it with caution, especially when a large number of tables are being captured, as the rebuild could take a long time to finish. After the rebuild, CDC resumes as normal.
129+
130+
```sql
131+
SELECT synchdb_start_engine_bgw('sqlserverconn', 'always');
132+
133+
```
134+
135+
However, it is possible to redo the initial snapshot for only a subset of tables by using the `snapshottable` option of the connector. Tables matching the criteria in `snapshottable` will redo the initial snapshot; all other tables will have their initial snapshot skipped. If `snapshottable` is null or empty, all the tables specified in the `table` option of the connector will, by default, redo the initial snapshot under `always` mode.
136+
137+
This example makes the connector redo the initial snapshot of only the `testDB.dbo.customers` table. All other tables will have their snapshot skipped.
138+
```sql
139+
UPDATE synchdb_conninfo
140+
SET data = jsonb_set(data, '{snapshottable}', '"testDB.dbo.customers"')
141+
WHERE name = 'sqlserverconn';
142+
```
143+
144+
After the initial snapshot, CDC will begin. Restarting a connector in `always` mode will repeat the same process described above.

docs/zh/tutorial/mysql_cdc_to_postgresql.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -2,7 +2,7 @@
22

33
## Prepare the MySQL Database for SynchDB
44

5-
Before using SynchDB to replicate from MySQL, configure MySQL according to the steps outlined [here](https://docs.synchdb.com/getting-started/remote_database_setups/)
5+
Before using SynchDB to replicate from MySQL, configure MySQL according to the steps outlined [here](https://docs.synchdb.com/zh/getting-started/remote_database_setups/)
66

77
## Create a MySQL Connector
88
