
Commit ce39948

Carl-Zhou-CN and zhouyao authored
[Docs][Connector-V2][Hudi] Reconstruct the Hudi connector document (apache#4905)
Co-authored-by: zhouyao <[email protected]>
1 parent 0e4190a commit ce39948

1 file changed: +44 lines, -38 lines

docs/en/connector-v2/source/Hudi.md

> Hudi source connector

## Support Those Engines

> Spark<br/>
> Flink<br/>
> SeaTunnel Zeta<br/>

## Key Features

- [x] [batch](../../concept/connector-v2-features.md)
- [ ] [stream](../../concept/connector-v2-features.md)
- [x] [exactly-once](../../concept/connector-v2-features.md)
- [ ] [column projection](../../concept/connector-v2-features.md)
- [x] [parallelism](../../concept/connector-v2-features.md)
- [ ] [support user-defined split](../../concept/connector-v2-features.md)

## Description

Used to read data from Hudi. Currently, only Hudi COW tables and Snapshot Query in Batch Mode are supported.

In order to use this connector, you must ensure that your Spark/Flink cluster has already integrated Hive. The tested Hive version is 2.3.9.

## Supported DataSource Info

:::tip

* Currently, only Hudi COW tables and Snapshot Query in Batch Mode are supported.

:::

## Data Type Mapping

| Hudi Data Type | SeaTunnel Data Type |
|----------------|---------------------|
| ALL TYPE       | STRING              |
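
Because every Hudi column arrives as a STRING, a downstream cast is a common follow-up. Below is a minimal sketch using the SQL transform linked from the example further down; the table and field names (`hudi_rows`, `id`, `price`) are hypothetical and depend on your own schema:

```hocon
transform {
  Sql {
    # Hypothetical table names: the source's output table name is set via the
    # common option result_table_name; the fields id/price are assumptions.
    source_table_name = "hudi_rows"
    result_table_name = "hudi_rows_typed"
    # Cast the all-STRING columns back to typed values downstream.
    query = "select cast(id as bigint) as id, cast(price as double) as price from hudi_rows"
  }
}
```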
## Source Options

| Name                    | Type    | Required                     | Default | Description                                                                                                                                                                                            |
|-------------------------|---------|------------------------------|---------|--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| table.path              | String  | Yes                          | -       | The HDFS root path of the Hudi table, such as 'hdfs://nameservice/data/hudi/hudi_table/'.                                                                                                              |
| table.type              | String  | Yes                          | -       | The type of the Hudi table. Currently only 'cow' is supported; 'mor' is not supported yet.                                                                                                             |
| conf.files              | String  | Yes                          | -       | The environment conf file path list (local paths), used to init the HDFS client that reads the Hudi table files, e.g. '/home/test/hdfs-site.xml;/home/test/core-site.xml;/home/test/yarn-site.xml'.    |
| use.kerberos            | Boolean | No                           | false   | Whether to enable Kerberos; defaults to false.                                                                                                                                                         |
| kerberos.principal      | String  | Yes when use.kerberos = true | -       | When Kerberos is enabled, the Kerberos principal, such as 'test_user@xxx'.                                                                                                                             |
| kerberos.principal.file | String  | Yes when use.kerberos = true | -       | When Kerberos is enabled, the Kerberos principal keytab file, such as '/home/test/test_user.keytab'.                                                                                                   |
| common-options          | config  | No                           | -       | Source plugin common parameters; please refer to [Source Common Options](common-options.md) for details.                                                                                               |
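
If Kerberos is not enabled, only the three required options are needed. A minimal sketch, reusing the placeholder paths from the table above:

```hocon
source {
  Hudi {
    # Required: HDFS root path of the Hudi table (placeholder path)
    table.path = "hdfs://nameservice/data/hudi/hudi_table/"
    # Required: only 'cow' is supported for now
    table.type = "cow"
    # Required: local Hadoop conf files used to initialize the HDFS client
    conf.files = "/home/test/hdfs-site.xml;/home/test/core-site.xml;/home/test/yarn-site.xml"
  }
}
```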

## Task Example

### Simple:

> This example reads from a Hudi COW table and configures Kerberos for the environment, printing to the console.

```hocon
# Defining the runtime environment
env {
  # You can set flink configuration here
  execution.parallelism = 2
  job.mode = "BATCH"
}

source {
  Hudi {
    table.path = "hdfs://nameservice/data/hudi/hudi_table/"
    table.type = "cow"
    conf.files = "/home/test/hdfs-site.xml;/home/test/core-site.xml;/home/test/yarn-site.xml"
    use.kerberos = true
    kerberos.principal = "test_user@xxx"
    kerberos.principal.file = "/home/test/test_user.keytab"
  }
}

transform {
  # If you would like to get more information about how to configure seatunnel and see full list of transform plugins,
  # please go to https://seatunnel.apache.org/docs/transform-v2/sql/
}

sink {
  Console {}
}
```
