Commit ddffe91

author
Cary Huang
committed
updates
1 parent 17a8f9d commit ddffe91

3 files changed: +95 -3 lines changed

Lines changed: 47 additions & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -1 +1,47 @@
1-
# Table Snapshot and Re-snapshot
1+
# Table Snapshot and Re-snapshot
2+
3+
## Initial Snapshot
4+
An "initial snapshot" (or table snapshot) in SynchDB copies the table schema and the initial data for all designated tables. This is similar to "table sync" in PostgreSQL logical replication. When a connector is started in the default `initial` mode, it automatically performs the initial snapshot before entering the Change Data Capture (CDC) stage. The snapshot can be skipped entirely with mode `never` or partially skipped with mode `no_data`. See [here](https://docs.synchdb.com/user-guide/start_stop_connector/) for all snapshot options.
5+
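As a rough sketch of how these modes are selected, the calls below pass each mode as the second argument to `synchdb_start_engine_bgw`, mirroring the `always` calls used later on this page. It assumes a connector named `mysqlconn` already exists and that `initial`, `never` and `no_data` are accepted in the same argument position. Only one of the statements would be issued for a given start.

```sql
-- Assumption: 'mysqlconn' was already created with synchdb_add_conninfo.
-- Run only ONE of these statements per start.

-- Default mode: take the initial snapshot, then switch to CDC.
SELECT synchdb_start_engine_bgw('mysqlconn', 'initial');

-- Skip the initial snapshot entirely and go straight to CDC.
SELECT synchdb_start_engine_bgw('mysqlconn', 'never');

-- Partially skip the snapshot, as described for `no_data` above.
SELECT synchdb_start_engine_bgw('mysqlconn', 'no_data');
```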
6+
Once the initial snapshot has completed, the connector does not repeat it on subsequent restarts; it simply resumes CDC from the last incomplete offset. This behavior is controlled by the metadata files managed by the Debezium engine. See [here](https://docs.synchdb.com/architecture/metadata_files/) for more about metadata files.
7+
8+
9+
## Re-snapshot
10+
If, for any reason, you need to perform the initial snapshot again to rebuild **all the designated tables** and all of their initial data, use the `always` snapshot mode, which causes the connector to obtain the schema and initial data again when it starts. Because the tables and data may already exist, you may need to drop all the designated tables, or clear all the data in them, before SynchDB attempts to create the tables and populate the initial data again.
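A minimal sketch of that clean-up step follows. It assumes the connector is stopped, that it is named `mysqlconn`, and that the designated tables were replicated into PostgreSQL as `inventory.orders`, `inventory.customers` and `inventory.products`. The destination schema and table names are assumptions; adjust them to wherever SynchDB created the tables in your setup.

```sql
-- Assumption: the destination tables live in an "inventory" schema in PostgreSQL.
BEGIN;

-- Option 1: keep the table definitions but clear their data.
TRUNCATE TABLE inventory.orders, inventory.customers, inventory.products;

-- Option 2 (use instead of option 1): drop the tables so they can be recreated.
-- DROP TABLE IF EXISTS inventory.orders, inventory.customers, inventory.products;

COMMIT;

-- Start the connector in `always` mode so the snapshot is taken again.
SELECT synchdb_start_engine_bgw('mysqlconn', 'always');
```

Wrapping the clean-up in a transaction means a failed `TRUNCATE` or `DROP` leaves all three tables untouched rather than partially cleared.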
11+
12+
Please be cautious, as this can be an aggressive action to perform in your setup. A better alternative may be a selective snapshot, where only the selected tables are re-snapshotted in `always` snapshot mode. See below.
13+
14+
15+
## Selective Snapshot
16+
A selective snapshot can be configured on a connector during creation or changed at run time. This is done by specifying the list of tables to snapshot in the `snapshot table` parameter. For example:
17+
18+
19+
**During creation:**
20+
This example creates a connector that performs CDC on the `inventory.orders`, `inventory.customers` and `inventory.products` tables, but redoes the initial snapshot only for `inventory.products` when the connector starts in `always` snapshot mode.
21+
22+
```sql
-- Create the connector: three tables are replicated, one is listed for snapshot
SELECT synchdb_add_conninfo(
    'mysqlconn',
    '127.0.0.1',
    3306,
    'mysqluser',
    'mysqlpwd',
    'inventory',
    'postgres',
    'inventory.orders,inventory.customers,inventory.products',  -- tables covered by CDC
    'inventory.products',                                        -- snapshot table list
    'mysql');

-- Start in `always` mode so the snapshot table is snapshotted again
SELECT synchdb_start_engine_bgw('mysqlconn', 'always');
```
37+
38+
**Alter existing connector:**
39+
This example sets the snapshot table field to `inventory.products`. When the connector is started in `always` mode, only the `inventory.products` table is re-snapshotted.
40+
41+
```sql
-- jsonb_set takes a jsonb value, so the table name is written as a JSON string
UPDATE synchdb_conninfo
SET data = jsonb_set(data, '{snapshottable}', '"inventory.products"', true)
WHERE name = 'mysqlconn';

SELECT synchdb_start_engine_bgw('mysqlconn', 'always');
```
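To confirm the change before restarting the connector, the stored value can be read back from the same table. This is only a sketch; it assumes, as the `UPDATE` above implies, that `synchdb_conninfo` has a `name` column and a JSONB `data` column holding a `snapshottable` key.

```sql
-- Read the snapshot table list back as text for one connector.
SELECT name,
       data ->> 'snapshottable' AS snapshot_tables
FROM synchdb_conninfo
WHERE name = 'mysqlconn';
```

If the key is absent, `snapshot_tables` comes back as NULL.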
Lines changed: 45 additions & 1 deletion
@@ -1 +1,45 @@
1-
# Table Snapshot and Re-snapshot
1+
# Table Snapshot and Re-snapshot
2+
3+
## Initial Snapshot
4+
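In SynchDB, an "initial snapshot" (or table snapshot) copies the table schema and the initial data for all designated tables. This is similar to "table sync" in PostgreSQL logical replication. When a connector is started in the default `initial` mode, it automatically performs the initial snapshot before entering the Change Data Capture (CDC) stage. The snapshot can be skipped entirely with mode `never` or partially skipped with mode `no_data`. For all snapshot options, see [here](https://docs.synchdb.com/zh/user-guide/start_stop_connector/).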
5+
6+
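Once the initial snapshot has completed, the connector does not repeat it on subsequent restarts; it resumes CDC from the last incomplete offset instead. This behavior is controlled by the metadata files managed by the Debezium engine. For more about metadata files, see [here](https://docs.synchdb.com/zh/architecture/metadata_files/).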
7+
8+
## Re-snapshot
9+
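If, for any reason, you need to perform the initial snapshot again to rebuild **all the designated tables** and all of their initial data, use the `always` snapshot mode, which causes the connector to obtain the schema and initial data again when it starts. Because the tables and data may already exist, you may need to drop all the designated tables, or clear all the data in them, before SynchDB attempts to recreate the tables and populate the initial data.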
10+
11+
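Please be cautious, as this can be an overly aggressive action in your setup. A better alternative may be a selective snapshot, where only the selected tables are re-snapshotted in `always` snapshot mode. See below.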
12+
13+
## Selective Snapshot
14+
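A selective snapshot can be configured when a connector is created or changed at run time. This is done by specifying the list of tables to snapshot in the `snapshot table` parameter. For example: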
15+
16+
**During creation:**
17+
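This example creates a connector that performs CDC on the `inventory.orders`, `inventory.customers` and `inventory.products` tables, but redoes the initial snapshot only for `inventory.products` when the connector starts in `always` snapshot mode.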
18+
19+
```sql
-- Create the connector: three tables are replicated, one is listed for snapshot
SELECT synchdb_add_conninfo(
    'mysqlconn',
    '127.0.0.1',
    3306,
    'mysqluser',
    'mysqlpwd',
    'inventory',
    'postgres',
    'inventory.orders,inventory.customers,inventory.products',  -- tables covered by CDC
    'inventory.products',                                        -- snapshot table list
    'mysql');

-- Start in `always` mode so the snapshot table is snapshotted again
SELECT synchdb_start_engine_bgw('mysqlconn', 'always');
```
34+
35+
**Alter existing connector:**
36+
This example sets the snapshot table field to `inventory.products`. When the connector is started in `always` mode, only the `inventory.products` table is re-snapshotted.
37+
38+
```sql
-- jsonb_set takes a jsonb value, so the table name is written as a JSON string
UPDATE synchdb_conninfo
SET data = jsonb_set(data, '{snapshottable}', '"inventory.products"', true)
WHERE name = 'mysqlconn';

SELECT synchdb_start_engine_bgw('mysqlconn', 'always');
```

mkdocs.yml

Lines changed: 3 additions & 1 deletion
@@ -79,6 +79,7 @@ plugins:
79 79
MySQL CDC to PostgreSQL: CDC de MySQL a PostgreSQL
80 80
SQL Server CDC to PostgreSQL: CDC de SQL Server a PostgreSQL
81 81
Oracle CDC to PostgreSQL: CDC de Oracle a PostgreSQL
82+
Table Snapshot and Re-snapshot: Instantánea de tabla y nueva instantánea
82 83
- locale: zh
83 84
name: 中文
84 85
site_name: SynchDB 文档
@@ -117,6 +118,7 @@ plugins:
117 118
MySQL CDC to PostgreSQL: MySQL CDC 到 PostgreSQL
118 119
SQL Server CDC to PostgreSQL: SQL Server CDC 到 PostgreSQL
119 120
Oracle CDC to PostgreSQL: Oracle CDC 到 PostgreSQL
121+
Table Snapshot and Re-snapshot: 表快照和重新快照
120 122
- mike:
121 123
version_selector: true
122 124
css_dir: css
@@ -154,7 +156,7 @@ nav:
154 156
- Oracle CDC to PostgreSQL: tutorial/oracle_cdc_to_postgresql.md
155 157
- Selective Table Replication: tutorial/selective_table_sync.md
156 158
# - Data Transformation: tutorial/data_transformation.md
157-
# - Table Snapshot and Re-snapshot: tutorial/table_snapshot_resnapshot.md
159+
- Table Snapshot and Re-snapshot: tutorial/table_snapshot_resnapshot.md
158 160
# - Schema Only Synchronization: tutorial/schema_only_sync.md
159 161
# - Multi Source Replication: tutorial/multi_source_replication.md
160 162
- Object Mapping Workflow: tutorial/object_mapping_workflow.md