Support databendwritter (wgzhao#717)
TCeason authored Dec 23, 2022
1 parent 4b48105 commit 930a1fc
Showing 22 changed files with 1,380 additions and 0 deletions.
55 changes: 55 additions & 0 deletions docs/assets/jobs/databendwriter.json
@@ -0,0 +1,55 @@
{
  "job": {
    "setting": {
      "speed": {
        "channel": 2
      }
    },
    "content": {
      "writer": {
        "name": "databendwriter",
        "parameter": {
          "preSql": [
            "truncate table @table"
          ],
          "postSql": [
          ],
          "username": "u1",
          "password": "123",
          "database": "example_db",
          "table": "table1",
          "jdbcUrl": "jdbc:mysql://127.0.0.1:3307/example_db",
          "loadUrl": ["127.0.0.1:8000","127.0.0.1:8000"],
          "fieldDelimiter": "\\x01",
          "lineDelimiter": "\\x02",
          "column": ["*"],
          "format": "csv"
        }
      },
      "reader": {
        "name": "streamreader",
        "parameter": {
          "column": [
            {
              "random": "1,500",
              "type": "long"
            },
            {
              "random": "1,127",
              "type": "long"
            },
            {
              "value": "this is a text",
              "type": "string"
            },
            {
              "random": "5,200",
              "type": "long"
            }
          ],
          "sliceRecordCount": 100
        }
      }
    }
  }
}
Binary file modified docs/images/supported_databases.png
65 changes: 65 additions & 0 deletions docs/writer/databendwriter.md
@@ -0,0 +1,65 @@
# DatabendWriter

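The Databend plugin writes data to a [Databend](https://databend.rs/zh-CN/doc/) database in a streaming fashion. It connects to Databend's HTTP endpoint (port 8000)
and loads the data through [stream load](https://databend.rs/zh-CN/doc/integrations/api/streaming-load).
This is considerably more efficient than `insert into`, and it is the officially recommended way to load data in production.

Databend is a database backend compatible with the MySQL protocol, so data can be read from Databend with [MySQLReader](../../reader/mysqlreader).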

## Example

Assume the target table was created with the following statements:

```sql
CREATE DATABASE example_db;
CREATE TABLE `example_db`.`table1` (
`siteid` INT DEFAULT CAST(10 AS INT),
`citycode` INT,
`username` VARCHAR,
`pv` BIGINT
);
```

The following job file reads data from memory and writes it to the Databend table:

```json
--8<-- "jobs/databendwriter.json"
```

Save the configuration above as `job/stream2databend.json`.

Then run the following command:

```shell
bin/addax.sh job/stream2databend.json
```
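
The `preSql` entry in the job file uses an `@table` placeholder. The sketch below shows one way such a placeholder could be expanded before the statements are executed over `jdbcUrl`. It assumes that the plugin substitutes the configured `table` value for `@table`; the helper name `render_sql_templates` is hypothetical and is not taken from the plugin source.

```python
def render_sql_templates(statements, table):
    """Replace the @table placeholder in every non-blank statement.

    Assumption: @table stands for the value of the `table` parameter.
    """
    return [s.replace("@table", table) for s in statements if s and s.strip()]


# Values copied from the example job file.
parameter = {
    "table": "table1",
    "preSql": ["truncate table @table"],
    "postSql": [],
}

pre_sql = render_sql_templates(parameter["preSql"], parameter["table"])
post_sql = render_sql_templates(parameter["postSql"], parameter["table"])
print(pre_sql)   # ['truncate table table1']
print(post_sql)  # []
```

An empty `postSql` list simply yields no statements, so nothing would run after the load.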

## Parameters

| Option         | Required | Type   | Default   | Description |
|:---------------|:--------:|--------|-----------|-------------|
| jdbcUrl        |          | string |           | JDBC connection information for the target database, used to execute `preSql` and `postSql` |
| loadUrl        |          | string |           | Addresses of the Databend query nodes used for StreamLoad, each in the form `query_ip:query_http_port`; several addresses may be given, and writes rotate across them in round-robin order |
| username       |          | string |           | Account for HTTP signature authentication |
| password       |          | string |           | Password for HTTP signature authentication |
| database       |          | string |           | Name of the database that contains the Databend table |
| table          |          | string |           | Name of the Databend table to write to |
| column         |          | list   |           | Names of the columns to synchronize in the configured table; see [rdbmswriter](../rdbmswriter) for details |
| maxBatchRows   |          | int    | 500000    | Number of rows exchanged between the plugin and the database server per batch; raising this value may cause Addax to run out of memory (OOM), or cause the target database's transaction commit to fail and hang |
| maxBatchSize   |          | int    | 104857600 | Maximum number of bytes imported by a single StreamLoad |
| flushInterval  |          | int    | 300000    | Interval between the end of one StreamLoad and the start of the next, in ms |
| endpoint       |          | string |           | HTTP connection address of Databend; only the host and port are needed, the plugin assembles the path automatically |
| batchSize      |          | int    | 1024      |             |
| lineDelimiter  |          | string | `\n`      | Row delimiter; high-order bytes are supported, for example `\\x02` |
| fieldDelimiter |          | string | `\t`      | Column delimiter; high-order bytes are supported, for example `\\x01` |
| format         |          | string | `csv`     | Imported data is converted to the format specified here |
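
How `loadUrl`, `maxBatchRows` and `maxBatchSize` might interact can be sketched as follows. This is an illustration under stated assumptions, not the plugin's implementation: it assumes a batch is flushed as soon as either threshold is reached and that each flushed batch goes to the next address in round-robin order. The class name `BatchBuffer` and the second host address are hypothetical, and `flushInterval` is not modeled.

```python
import itertools


class BatchBuffer:
    """Buffers serialized rows and hands out (host, rows) batches."""

    def __init__(self, load_urls, max_batch_rows=500000, max_batch_size=104857600):
        self._hosts = itertools.cycle(load_urls)  # round-robin over loadUrl
        self.max_batch_rows = max_batch_rows
        self.max_batch_size = max_batch_size
        self.rows = []
        self.size = 0  # buffered bytes

    def add(self, row: bytes):
        """Buffer one row; return a batch if a threshold was reached, else None."""
        self.rows.append(row)
        self.size += len(row)
        if len(self.rows) >= self.max_batch_rows or self.size >= self.max_batch_size:
            return self.flush()
        return None

    def flush(self):
        """Return the pending batch with its target host, or None if empty."""
        if not self.rows:
            return None
        batch = (next(self._hosts), self.rows)
        self.rows, self.size = [], 0
        return batch


# Tiny thresholds so the behavior is visible; the second host is made up.
buf = BatchBuffer(["127.0.0.1:8000", "127.0.0.2:8000"], max_batch_rows=2)
batches = []
for i in range(5):
    batch = buf.add(f"row{i}".encode())
    if batch is not None:
        batches.append(batch)
batches.append(buf.flush())  # final partial batch
for host, rows in batches:
    print(host, rows)
```

With five rows and a two-row threshold, the sketch produces three batches, and the third wraps around to the first address again.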


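## Type conversion

By default, all incoming data is converted to strings and assembled into a `csv` file, with `\t` as the column delimiter and `\n` as the row delimiter, which is then imported through StreamLoad.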
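
The conversion described here can be sketched in a few lines. The example assumes that a configured delimiter such as `\\x01` (which reaches the program as the four characters `\x01` after JSON decoding) is turned into the single byte it names. The function names are hypothetical. NULL handling and a trailing row delimiter are not described in this document, so the sketch leaves both out.

```python
import re


def decode_delimiter(raw: str) -> str:
    # Turn each literal backslash-x-hex-hex sequence into the character it names.
    # Plain delimiters such as a tab or newline pass through unchanged.
    return re.sub(r"\\x([0-9a-fA-F]{2})", lambda m: chr(int(m.group(1), 16)), raw)


def serialize(records, field_delimiter="\t", line_delimiter="\n") -> bytes:
    """Convert every value to a string and join rows into one payload."""
    fd = decode_delimiter(field_delimiter)
    ld = decode_delimiter(line_delimiter)
    lines = [fd.join(str(value) for value in record) for record in records]
    return ld.join(lines).encode("utf-8")


# Two rows shaped like the example table: siteid, citycode, username, pv.
records = [(10, 101, "this is a text", 42), (11, 102, "this is a text", 7)]

default_payload = serialize(records)
custom_payload = serialize(records, r"\x01", r"\x02")
print(default_payload)
print(custom_payload)
```

Using non-printing bytes such as `0x01` and `0x02` as delimiters avoids clashes with tabs or newlines that may occur inside string columns.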
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -77,6 +77,7 @@ nav:
- writer/clickhousewriter.md
- writer/dbfwriter.md
- writer/doriswriter.md
- writer/databendwriter.md
- writer/elasticsearchwriter.md
- writer/excelwriter.md
- writer/ftpwriter.md
8 changes: 8 additions & 0 deletions package.xml
@@ -322,6 +322,14 @@
<fileMode>0644</fileMode>
<outputDirectory>addax-${project.version}</outputDirectory>
</fileSet>
<fileSet>
<directory>plugin/writer/databendwriter/target/databendwriter-${project.version}/</directory>
<includes>
<include>**/*.*</include>
</includes>
<fileMode>0644</fileMode>
<outputDirectory>addax-${project.version}</outputDirectory>
</fileSet>
<fileSet>
<directory>plugin/writer/doriswriter/target/doriswriter-${project.version}/</directory>
<includes>
37 changes: 37 additions & 0 deletions plugin/writer/databendwriter/package.xml
@@ -0,0 +1,37 @@
<assembly
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.2"
xsi:schemaLocation="http://maven.apache.org/plugins/maven-assembly-plugin/assembly/1.1.2 http://maven.apache.org/xsd/assembly-component-1.1.2.xsd">
<id>release</id>
<formats>
<format>dir</format>
</formats>
<includeBaseDirectory>false</includeBaseDirectory>
<fileSets>
<fileSet>
<directory>src/main/resources</directory>
<includes>
<include>*.json</include>
</includes>
<outputDirectory>plugin/writer/${project.artifactId}</outputDirectory>
</fileSet>
<fileSet>
<directory>target/</directory>
<includes>
<include>${project.artifactId}-${project.version}.jar</include>
</includes>
<outputDirectory>plugin/writer/${project.artifactId}</outputDirectory>
</fileSet>
</fileSets>

<dependencySets>
<dependencySet>
<useProjectArtifact>false</useProjectArtifact>
<outputDirectory>plugin/writer/${project.artifactId}/libs</outputDirectory>
<scope>runtime</scope>
<excludes>
<exclude>com.wgzhao.addax:*</exclude>
</excludes>
</dependencySet>
</dependencySets>
</assembly>
100 changes: 100 additions & 0 deletions plugin/writer/databendwriter/pom.xml
@@ -0,0 +1,100 @@
<project xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://maven.apache.org/POM/4.0.0" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
<modelVersion>4.0.0</modelVersion>
<parent>
<groupId>com.wgzhao.addax</groupId>
<artifactId>addax-all</artifactId>
<version>4.0.11-SNAPSHOT</version>
<relativePath>../../../pom.xml</relativePath>
</parent>

<artifactId>databendwriter</artifactId>
<name>databend-writer</name>
<packaging>jar</packaging>

<dependencies>
<dependency>
<groupId>com.wgzhao.addax</groupId>
<artifactId>addax-common</artifactId>
<version>${project.version}</version>
<exclusions>
<exclusion>
<artifactId>slf4j-log4j12</artifactId>
<groupId>org.slf4j</groupId>
</exclusion>
</exclusions>
</dependency>
<dependency>
<groupId>org.slf4j</groupId>
<artifactId>slf4j-api</artifactId>
</dependency>

<dependency>
<groupId>com.wgzhao.addax</groupId>
<artifactId>addax-rdbms</artifactId>
<version>${project.version}</version>
</dependency>
<dependency>
<groupId>commons-codec</groupId>
<artifactId>commons-codec</artifactId>
<version>${commons.codec.version}</version>
</dependency>
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-lang3</artifactId>
<version>${commons.lang3.version}</version>
</dependency>
<dependency>
<groupId>commons-logging</groupId>
<artifactId>commons-logging</artifactId>
<version>${commons.logging.version}</version>
</dependency>
<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpcore</artifactId>
<version>${httpcore.version}</version>
</dependency>
<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpclient</artifactId>
<version>${httpclient.version}</version>
</dependency>
<dependency>
<groupId>com.alibaba</groupId>
<artifactId>fastjson</artifactId>
<version>${fastjson2.version}</version>
</dependency>
<dependency>
<groupId>mysql</groupId>
<artifactId>mysql-connector-java</artifactId>
<version>${mysql.jdbc.version}</version>
</dependency>
<dependency>
<groupId>org.apache.httpcomponents</groupId>
<artifactId>httpmime</artifactId>
<version>4.5.13</version>
</dependency>
</dependencies>
<build>
<plugins>
<plugin>
<artifactId>maven-assembly-plugin</artifactId>
<configuration>
<descriptors>
<descriptor>package.xml</descriptor>
</descriptors>
<finalName>${project.artifactId}-${project.version}</finalName>
</configuration>
<executions>
<execution>
<id>release</id>
<phase>package</phase>
<goals>
<goal>single</goal>
</goals>
</execution>
</executions>
</plugin>
</plugins>
</build>

</project>