Commit 1312a1d

hailin0 and TaoZex authored
[Feature][CDC] Support custom table primary key (apache#6106)
* [Feature][CDC] Support custom table primary key
* Update seatunnel-connectors-v2/connector-cdc/connector-cdc-base/src/main/java/org/apache/seatunnel/connectors/cdc/base/option/JdbcSourceOptions.java
  Co-authored-by: TaoZex <[email protected]>
* Update seatunnel-connectors-v2/connector-cdc/connector-cdc-base/src/main/java/org/apache/seatunnel/connectors/cdc/base/utils/CatalogTableUtils.java
  Co-authored-by: TaoZex <[email protected]>

---------

Co-authored-by: TaoZex <[email protected]>
1 parent edcaace commit 1312a1d

19 files changed: +873 -15 lines changed

docs/en/connector-v2/source/MySQL-CDC.md

+32 -3
@@ -153,6 +153,7 @@ When an initial consistent snapshot is made for large databases, your establishe
 | password | String | Yes | - | Password to use when connecting to the database server. |
 | database-names | List | No | - | Database name of the database to monitor. |
 | table-names | List | Yes | - | Table name of the database to monitor. The table name needs to include the database name, for example: `database_name.table_name` |
+| table-names-config | List | No | - | Table config list. for example: [{"table": "db1.schema1.table1","primaryKeys":["key1"]}] |
 | startup.mode | Enum | No | INITIAL | Optional startup mode for MySQL CDC consumer, valid enumerations are `initial`, `earliest`, `latest` and `specific`. <br/> `initial`: Synchronize historical data at startup, and then synchronize incremental data.<br/> `earliest`: Startup from the earliest offset possible.<br/> `latest`: Startup from the latest offset.<br/> `specific`: Startup from user-supplied specific offsets. |
 | startup.specific-offset.file | String | No | - | Start from the specified binlog file name. **Note, This option is required when the `startup.mode` option used `specific`.** |
 | startup.specific-offset.pos | Long | No | - | Start from the specified binlog file position. **Note, This option is required when the `startup.mode` option used `specific`.** |
@@ -190,9 +191,6 @@ env {

 source {
   MySQL-CDC {
-    catalog = {
-      factory = MySQL
-    }
     base-url = "jdbc:mysql://localhost:3306/testdb"
     username = "root"
     password = "root@123"
@@ -212,6 +210,37 @@ sink {

 > Must be used with kafka connector sink, see [compatible debezium format](../formats/cdc-compatible-debezium-json.md) for details

+### Support custom primary key for table
+
+```
+env {
+  parallelism = 1
+  job.mode = "STREAMING"
+  checkpoint.interval = 10000
+}
+
+source {
+  MySQL-CDC {
+    base-url = "jdbc:mysql://localhost:3306/testdb"
+    username = "root"
+    password = "root@123"
+
+    table-names = ["testdb.table1", "testdb.table2"]
+    table-names-config = [
+      {
+        table = "testdb.table2"
+        primaryKeys = ["id"]
+      }
+    ]
+  }
+}
+
+sink {
+  Console {
+  }
+}
+```
+
 ## Changelog

 - Add MySQL CDC Source Connector

docs/en/connector-v2/source/SqlServer-CDC.md

+32
@@ -60,6 +60,7 @@ Please download and put SqlServer driver in `${SEATUNNEL_HOME}/lib/` dir. For ex
 | password | String | Yes | - | Password to use when connecting to the database server. |
 | database-names | List | Yes | - | Database name of the database to monitor. |
 | table-names | List | Yes | - | Table name is a combination of schema name and table name (databaseName.schemaName.tableName). |
+| table-names-config | List | No | - | Table config list. for example: [{"table": "db1.schema1.table1","primaryKeys":["key1"]}] |
 | base-url | String | Yes | - | URL has to be with database, like "jdbc:sqlserver://localhost:1433;databaseName=test". |
 | startup.mode | Enum | No | INITIAL | Optional startup mode for SqlServer CDC consumer, valid enumerations are "initial", "earliest", "latest" and "specific". |
 | startup.timestamp | Long | No | - | Start from the specified epoch timestamp (in milliseconds).<br/> **Note, This option is required when** the **"startup.mode" option used `'timestamp'`.** |
@@ -186,3 +187,34 @@ sink {
 }
 ```

+### Support custom primary key for table
+
+```
+env {
+  parallelism = 1
+  job.mode = "STREAMING"
+  checkpoint.interval = 5000
+}
+
+source {
+  SqlServer-CDC {
+    base-url = "jdbc:sqlserver://localhost:1433;databaseName=column_type_test"
+    username = "sa"
+    password = "Y.sa123456"
+    database-names = ["column_type_test"]
+
+    table-names = ["column_type_test.dbo.simple_types", "column_type_test.dbo.full_types"]
+    table-names-config = [
+      {
+        table = "column_type_test.dbo.full_types"
+        primaryKeys = ["id"]
+      }
+    ]
+  }
+}
+
+sink {
+  console {
+  }
+```
+

JdbcSourceTableConfig.java

+29
@@ -0,0 +1,29 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.seatunnel.connectors.cdc.base.config;
+
+import lombok.Data;
+
+import java.io.Serializable;
+import java.util.List;
+
+@Data
+public class JdbcSourceTableConfig implements Serializable {
+    private String table;
+    private List<String> primaryKeys;
+}
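
For readers following the diff, here is a minimal sketch of what a per-table primary key override could look like in plain Java. It is an illustration under stated assumptions, not the commit's implementation: the real logic lives in `CatalogTableUtils`, which is not shown in this section. The class `PrimaryKeyOverrideSketch`, the stand-in `TableConfig`, and the method `resolvePrimaryKeys` are hypothetical names, and the rule that a non-empty configured key list replaces the discovered one is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; none of these names come from the commit.
final class PrimaryKeyOverrideSketch {

    // Stand-in for JdbcSourceTableConfig without the Lombok dependency.
    static final class TableConfig {
        final String table;
        final List<String> primaryKeys;

        TableConfig(String table, List<String> primaryKeys) {
            this.table = table;
            this.primaryKeys = primaryKeys;
        }
    }

    // Assumed rule: a non-empty configured key list replaces the discovered one
    // for a monitored table; every other entry is left untouched.
    static Map<String, List<String>> resolvePrimaryKeys(
            Map<String, List<String>> discovered, List<TableConfig> configs) {
        Map<String, List<String>> resolved = new LinkedHashMap<>(discovered);
        for (TableConfig config : configs) {
            boolean monitored = resolved.containsKey(config.table);
            boolean hasKeys = config.primaryKeys != null && !config.primaryKeys.isEmpty();
            if (monitored && hasKeys) {
                resolved.put(config.table, new ArrayList<>(config.primaryKeys));
            }
        }
        return resolved;
    }

    public static void main(String[] args) {
        // Keys as they might be discovered from the database; table2 has none.
        Map<String, List<String>> discovered = new LinkedHashMap<>();
        discovered.put("testdb.table1", Arrays.asList("id"));
        discovered.put("testdb.table2", Collections.<String>emptyList());

        // Mirrors the table-names-config entry from the MySQL-CDC doc example.
        List<TableConfig> configs =
                Collections.singletonList(new TableConfig("testdb.table2", Arrays.asList("id")));

        System.out.println(resolvePrimaryKeys(discovered, configs));
    }
}
```

The sketch ignores config entries for tables that are not monitored; whether the connector does the same is not visible from this diff.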

seatunnel-connectors-v2/connector-cdc/connector-cdc-base/src/main/java/org/apache/seatunnel/connectors/cdc/base/option/JdbcSourceOptions.java

+14
@@ -19,6 +19,7 @@

 import org.apache.seatunnel.api.configuration.Option;
 import org.apache.seatunnel.api.configuration.Options;
+import org.apache.seatunnel.connectors.cdc.base.config.JdbcSourceTableConfig;
 import org.apache.seatunnel.connectors.cdc.base.source.IncrementalSource;

 import java.time.ZoneId;
@@ -141,4 +142,17 @@ public class JdbcSourceOptions extends SourceOptions {
                                     + "The value represents the denominator of the sampling rate fraction. "
                                     + "For example, a value of 1000 means a sampling rate of 1/1000. "
                                     + "This parameter is used when the sample sharding strategy is triggered.");
+
+    public static final Option<List<JdbcSourceTableConfig>> TABLE_NAMES_CONFIG =
+            Options.key("table-names-config")
+                    .listType(JdbcSourceTableConfig.class)
+                    .noDefaultValue()
+                    .withDescription(
+                            "Config table configs. Example: "
+                                    + "["
+                                    + " {"
+                                    + " \"table\": \"db1.schema1.table1\","
+                                    + " \"primaryKeys\": [\"key1\",\"key2\"]"
+                                    + " }"
+                                    + "]");
 }
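
Since `table-names-config` has no default and is keyed by table name, a job author may want to sanity-check it against `table-names` before submitting a job. The sketch below shows one way to do that in plain Java. It is a hypothetical pre-flight check, not something this commit adds and not part of SeaTunnel's API: `TableNamesConfigCheckSketch` and `findProblems` are made-up names, and the two rules it enforces (every configured table is also in `table-names`, and `primaryKeys` is non-empty) are assumptions about sensible usage.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical pre-flight check; not part of the commit or of SeaTunnel's API.
final class TableNamesConfigCheckSketch {

    // tableNamesConfig maps each "table" value to its "primaryKeys" list.
    static List<String> findProblems(
            List<String> tableNames, Map<String, List<String>> tableNamesConfig) {
        Set<String> monitored = new HashSet<>(tableNames);
        List<String> problems = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : tableNamesConfig.entrySet()) {
            if (!monitored.contains(entry.getKey())) {
                problems.add("not listed in table-names: " + entry.getKey());
            }
            if (entry.getValue() == null || entry.getValue().isEmpty()) {
                problems.add("empty primaryKeys: " + entry.getKey());
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        // Values taken from the SqlServer-CDC doc example in this commit.
        List<String> tableNames = Arrays.asList(
                "column_type_test.dbo.simple_types", "column_type_test.dbo.full_types");
        Map<String, List<String>> config = new LinkedHashMap<>();
        config.put("column_type_test.dbo.full_types", Collections.singletonList("id"));

        // An empty list means the two assumed rules hold.
        System.out.println(findProblems(tableNames, config));
    }
}
```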
