diff --git a/docs/content.zh/docs/connectors/table/hive/hive_dialect.md b/docs/content.zh/docs/connectors/table/hive/hive_dialect.md
index c70554429a4f2..9840494a4c365 100644
--- a/docs/content.zh/docs/connectors/table/hive/hive_dialect.md
+++ b/docs/content.zh/docs/connectors/table/hive/hive_dialect.md
@@ -335,26 +335,85 @@ CREATE FUNCTION function_name AS class_name;
 DROP FUNCTION [IF EXISTS] function_name;
 ```
 
-## DML
+## DML & DQL _`Beta`_
 
-### INSERT
+Hive 方言支持常用的 Hive [DML](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML)
+和 [DQL](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select)。以下列出了一些 Hive 方言支持的语法。
 
-```sql
-INSERT (INTO|OVERWRITE) [TABLE] table_name [PARTITION partition_spec] SELECT ...;
-```
+- [SORT/CLUSTER/DISTRIBUTE BY](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SortBy)
+- [Group By](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+GroupBy)
+- [Join](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Joins)
+- [Union](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Union)
+- [LATERAL VIEW](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView)
+- [Window Functions](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics)
+- [SubQueries](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SubQueries)
+- [CTE](https://cwiki.apache.org/confluence/display/Hive/Common+Table+Expression)
+- [INSERT INTO dest schema](https://issues.apache.org/jira/browse/HIVE-9481)
+- [Implicit type conversions](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-AllowedImplicitConversions)
+
+为了实现更好的语法和语义的兼容，强烈建议使用 [HiveModule]({{< ref "docs/connectors/table/hive/hive_functions" >}}#use-hive-built-in-functions-via-hivemodule)
+并将其放在 Module 列表的首位，以便在函数解析时优先使用 Hive 内置函数。
 
-如果指定了 `partition_spec`,可以是完整或者部分分区列。如果是部分指定,则可以省略动态分区的列名。
+Hive 方言不再支持 [Flink SQL 语法]({{< ref "docs/dev/table/sql/queries" >}})。若需使用 Flink 语法，请切换到 `default` 方言。
+
+以下是一个使用 Hive 方言的示例。
+
+```bash
+Flink SQL> create catalog myhive with ('type' = 'hive', 'hive-conf-dir' = '/opt/hive-conf');
+[INFO] Execute statement succeed.
 
-## DQL
+Flink SQL> use catalog myhive;
+[INFO] Execute statement succeed.
 
-目前，对于DQL语句 Hive 方言和 Flink SQL 支持的语法相同。有关更多详细信息，请参考[Flink SQL 查询]({{< ref "docs/dev/table/sql/queries" >}})。并且建议切换到 `default` 方言来执行 DQL 语句。
+Flink SQL> load module hive;
+[INFO] Execute statement succeed.
+
+Flink SQL> use modules hive,core;
+[INFO] Execute statement succeed.
+
+Flink SQL> set table.sql-dialect=hive;
+[INFO] Session property has been set.
+
+Flink SQL> select explode(array(1,2,3)); -- call hive udtf
++-----+
+| col |
++-----+
+|   1 |
+|   2 |
+|   3 |
++-----+
+3 rows in set
+
+Flink SQL> create table tbl (key int,value string);
+[INFO] Execute statement succeed.
+
+Flink SQL> insert overwrite table tbl values (5,'e'),(1,'a'),(1,'a'),(3,'c'),(2,'b'),(3,'c'),(3,'c'),(4,'d');
+[INFO] Submitting SQL update statement to the cluster...
+[INFO] SQL update statement has been successfully submitted to the cluster:
+
+Flink SQL> select * from tbl cluster by key; -- run cluster by
+2021-04-22 16:13:57,005 INFO  org.apache.hadoop.mapred.FileInputFormat [] - Total input paths to process : 1
++-----+-------+
+| key | value |
++-----+-------+
+|   1 | a     |
+|   1 | a     |
+|   5 | e     |
+|   2 | b     |
+|   3 | c     |
+|   3 | c     |
+|   3 | c     |
+|   4 | d     |
++-----+-------+
+8 rows in set
+```
 
 ## 注意
 
 以下是使用 Hive 方言的一些注意事项。
 
-- Hive 方言只能用于操作 Hive 表，不能用于一般表。Hive 方言应与[HiveCatalog]({{< ref "docs/connectors/table/hive/hive_catalog" >}})一起使用。
+- Hive 方言只能用于操作 Hive 对象，并要求当前 Catalog 是一个 [HiveCatalog]({{< ref "docs/connectors/table/hive/hive_catalog" >}})。
+- Hive 方言只支持 `db.table` 这种两级的标识符，不支持带有 Catalog 名字的标识符。
 - 虽然所有 Hive 版本支持相同的语法，但是一些特定的功能是否可用仍取决于你使用的[Hive 版本]({{< ref "docs/connectors/table/hive/overview" >}}#支持的hive版本)。例如，更新数据库位置
 只在 Hive-2.4.0 或更高版本支持。
-- Hive 和 Calcite 有不同的保留关键字集合。例如，`default` 是 Calcite 的保留关键字，却不是 Hive 的保留关键字。即使使用 Hive 方言, 也必须使用反引号 ( ` ) 引用此类关键字才能将其用作标识符。
-- 由于扩展的查询语句的不兼容性，在 Flink 中创建的视图是不能在 Hive 中查询的。
+- 执行 DML 和 DQL 时应该使用 [HiveModule]({{< ref "docs/connectors/table/hive/hive_functions" >}}#use-hive-built-in-functions-via-hivemodule)。
diff --git a/docs/content.zh/docs/connectors/table/hive/overview.md b/docs/content.zh/docs/connectors/table/hive/overview.md
index 1f43865a29704..a32a3ff4d1809 100644
--- a/docs/content.zh/docs/connectors/table/hive/overview.md
+++ b/docs/content.zh/docs/connectors/table/hive/overview.md
@@ -127,6 +127,9 @@ export HADOOP_CLASSPATH=`hadoop classpath`
       // Hive dependencies
       hive-exec-2.3.4.jar
 
+      // add antlr-runtime if you need to use hive dialect
+      antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< tab "Hive 1.0.0" >}}
@@ -146,6 +149,9 @@ export HADOOP_CLASSPATH=`hadoop classpath`
       orc-core-1.4.3-nohive.jar
       aircompressor-0.8.jar // transitive dependency of orc-core
 
+      // add antlr-runtime if you need to use hive dialect
+      antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< tab "Hive 1.1.0" >}}
@@ -165,6 +171,9 @@ export HADOOP_CLASSPATH=`hadoop classpath`
       orc-core-1.4.3-nohive.jar
       aircompressor-0.8.jar // transitive dependency of orc-core
 
+      // add antlr-runtime if you need to use hive dialect
+      antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< tab "Hive 1.2.1" >}}
@@ -184,6 +193,9 @@ export HADOOP_CLASSPATH=`hadoop classpath`
       orc-core-1.4.3-nohive.jar
       aircompressor-0.8.jar // transitive dependency of orc-core
 
+      // add antlr-runtime if you need to use hive dialect
+      antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< tab "Hive 2.0.0" >}}
@@ -197,6 +209,9 @@ export HADOOP_CLASSPATH=`hadoop classpath`
       // Hive dependencies
       hive-exec-2.0.0.jar
 
+      // add antlr-runtime if you need to use hive dialect
+      antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< tab "Hive 2.1.0" >}}
@@ -210,6 +225,9 @@ export HADOOP_CLASSPATH=`hadoop classpath`
       // Hive dependencies
       hive-exec-2.1.0.jar
 
+      // add antlr-runtime if you need to use hive dialect
+      antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< tab "Hive 2.2.0" >}}
@@ -227,6 +245,9 @@ export HADOOP_CLASSPATH=`hadoop classpath`
       orc-core-1.4.3.jar
       aircompressor-0.8.jar // transitive dependency of orc-core
 
+      // add antlr-runtime if you need to use hive dialect
+      antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< tab "Hive 3.1.0" >}}
@@ -241,6 +262,9 @@ export HADOOP_CLASSPATH=`hadoop classpath`
       hive-exec-3.1.0.jar
       libfb303-0.9.3.jar // libfb303 is not packed into hive-exec in some versions, need to add it separately
 
+      // add antlr-runtime if you need to use hive dialect
+      antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< /tabs >}}
diff --git a/docs/content/docs/connectors/table/hive/hive_dialect.md b/docs/content/docs/connectors/table/hive/hive_dialect.md
index a1c71fdeff0d6..47384b5337ae6 100644
--- a/docs/content/docs/connectors/table/hive/hive_dialect.md
+++ b/docs/content/docs/connectors/table/hive/hive_dialect.md
@@ -300,8 +300,6 @@ CREATE VIEW [IF NOT EXISTS] view_name [(column_name, ...) ]
 
 #### Alter
 
-**NOTE**: Altering view only works in Table API, but not supported via SQL client.
-
 ##### Rename
 
 ```sql
@@ -346,33 +344,90 @@ CREATE FUNCTION function_name AS class_name;
 DROP FUNCTION [IF EXISTS] function_name;
 ```
 
-## DML
+## DML & DQL _`Beta`_
 
-### INSERT
+Hive dialect supports a commonly-used subset of Hive's [DML](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DML)
+and [DQL](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Select). The following lists some examples of
+HiveQL supported by the Hive dialect.
 
-```sql
-INSERT (INTO|OVERWRITE) [TABLE] table_name [PARTITION partition_spec] SELECT ...;
-```
+- [SORT/CLUSTER/DISTRIBUTE BY](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SortBy)
+- [Group By](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+GroupBy)
+- [Join](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Joins)
+- [Union](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Union)
+- [LATERAL VIEW](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+LateralView)
+- [Window Functions](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+WindowingAndAnalytics)
+- [SubQueries](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+SubQueries)
+- [CTE](https://cwiki.apache.org/confluence/display/Hive/Common+Table+Expression)
+- [INSERT INTO dest schema](https://issues.apache.org/jira/browse/HIVE-9481)
+- [Implicit type conversions](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+Types#LanguageManualTypes-AllowedImplicitConversions)
+
+In order to have better syntax and semantic compatibility, it's highly recommended to use [HiveModule]({{< ref "docs/connectors/table/hive/hive_functions" >}}#use-hive-built-in-functions-via-hivemodule)
+and place it first in the module list, so that Hive built-in functions can be picked up during function resolution.
+
+Hive dialect no longer supports [Flink SQL queries]({{< ref "docs/dev/table/sql/queries" >}}). Please switch to the `default`
+dialect if you'd like to write in Flink syntax.
+
+The following is an example of using the Hive dialect to run some queries.
 
-The `partition_spec`, if present, can be either a full spec or partial spec. If the `partition_spec` is a partial
-spec, the dynamic partition column names can be omitted.
+```bash
+Flink SQL> create catalog myhive with ('type' = 'hive', 'hive-conf-dir' = '/opt/hive-conf');
+[INFO] Execute statement succeed.
+
+Flink SQL> use catalog myhive;
+[INFO] Execute statement succeed.
+
+Flink SQL> load module hive;
+[INFO] Execute statement succeed.
 
-## DQL
+Flink SQL> use modules hive,core;
+[INFO] Execute statement succeed.
 
-At the moment, Hive dialect supports the same syntax as Flink SQL for DQLs. Refer to
-[Flink SQL queries]({{< ref "docs/dev/table/sql/queries" >}}) for more details. And it's recommended to switch to
-`default` dialect to execute DQLs.
+Flink SQL> set table.sql-dialect=hive;
+[INFO] Session property has been set.
+
+Flink SQL> select explode(array(1,2,3)); -- call hive udtf
++-----+
+| col |
++-----+
+|   1 |
+|   2 |
+|   3 |
++-----+
+3 rows in set
+
+Flink SQL> create table tbl (key int,value string);
+[INFO] Execute statement succeed.
+
+Flink SQL> insert overwrite table tbl values (5,'e'),(1,'a'),(1,'a'),(3,'c'),(2,'b'),(3,'c'),(3,'c'),(4,'d');
+[INFO] Submitting SQL update statement to the cluster...
+[INFO] SQL update statement has been successfully submitted to the cluster:
+
+Flink SQL> select * from tbl cluster by key; -- run cluster by
+2021-04-22 16:13:57,005 INFO  org.apache.hadoop.mapred.FileInputFormat [] - Total input paths to process : 1
++-----+-------+
+| key | value |
++-----+-------+
+|   1 | a     |
+|   1 | a     |
+|   5 | e     |
+|   2 | b     |
+|   3 | c     |
+|   3 | c     |
+|   3 | c     |
+|   4 | d     |
++-----+-------+
+8 rows in set
+```
 
 ## Notice
 
 The following are some precautions for using the Hive dialect.
 
-- Hive dialect should only be used to manipulate Hive tables, not generic tables. And Hive dialect should be used together
-with a [HiveCatalog]({{< ref "docs/connectors/table/hive/hive_catalog" >}}).
+- Hive dialect should only be used to process Hive meta objects, and requires the current catalog to be a
+[HiveCatalog]({{< ref "docs/connectors/table/hive/hive_catalog" >}}).
+- Hive dialect only supports 2-part identifiers, so you can't specify a catalog name in an identifier.
 - While all Hive versions support the same syntax, whether a specific feature is available still depends on the
 [Hive version]({{< ref "docs/connectors/table/hive/overview" >}}#supported-hive-versions) you use. For example, updating database location
 is only supported in Hive-2.4.0 or later.
-- Hive and Calcite have different sets of reserved keywords. For example, `default` is a reserved keyword in Calcite and
-a non-reserved keyword in Hive. Even with Hive dialect, you have to quote such keywords with backtick ( ` ) in order to
-use them as identifiers.
-- Due to expanded query incompatibility, views created in Flink cannot be queried in Hive.
+- Use [HiveModule]({{< ref "docs/connectors/table/hive/hive_functions" >}}#use-hive-built-in-functions-via-hivemodule)
+to run DML and DQL.
diff --git a/docs/content/docs/connectors/table/hive/overview.md b/docs/content/docs/connectors/table/hive/overview.md
index dd3e21a187c5f..e8956ecabb182 100644
--- a/docs/content/docs/connectors/table/hive/overview.md
+++ b/docs/content/docs/connectors/table/hive/overview.md
@@ -131,6 +131,9 @@ Please find the required dependencies for different Hive major versions below.
       // Hive dependencies
       hive-exec-2.3.4.jar
 
+      // add antlr-runtime if you need to use hive dialect
+      antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< tab "Hive 1.0.0" >}}
@@ -150,6 +153,9 @@ Please find the required dependencies for different Hive major versions below.
       orc-core-1.4.3-nohive.jar
       aircompressor-0.8.jar // transitive dependency of orc-core
 
+      // add antlr-runtime if you need to use hive dialect
+      antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< tab "Hive 1.1.0" >}}
@@ -169,6 +175,9 @@ Please find the required dependencies for different Hive major versions below.
       orc-core-1.4.3-nohive.jar
       aircompressor-0.8.jar // transitive dependency of orc-core
 
+      // add antlr-runtime if you need to use hive dialect
+      antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< tab "Hive 1.2.1" >}}
@@ -188,6 +197,9 @@ Please find the required dependencies for different Hive major versions below.
       orc-core-1.4.3-nohive.jar
       aircompressor-0.8.jar // transitive dependency of orc-core
 
+      // add antlr-runtime if you need to use hive dialect
+      antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< tab "Hive 2.0.0" >}}
@@ -201,6 +213,9 @@ Please find the required dependencies for different Hive major versions below.
       // Hive dependencies
       hive-exec-2.0.0.jar
 
+      // add antlr-runtime if you need to use hive dialect
+      antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< tab "Hive 2.1.0" >}}
@@ -214,6 +229,9 @@ Please find the required dependencies for different Hive major versions below.
       // Hive dependencies
       hive-exec-2.1.0.jar
 
+      // add antlr-runtime if you need to use hive dialect
+      antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< tab "Hive 2.2.0" >}}
@@ -231,6 +249,9 @@ Please find the required dependencies for different Hive major versions below.
       orc-core-1.4.3.jar
       aircompressor-0.8.jar // transitive dependency of orc-core
 
+      // add antlr-runtime if you need to use hive dialect
+      antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< tab "Hive 3.1.0" >}}
@@ -245,6 +266,9 @@ Please find the required dependencies for different Hive major versions below.
       hive-exec-3.1.0.jar
       libfb303-0.9.3.jar // libfb303 is not packed into hive-exec in some versions, need to add it separately
 
+      // add antlr-runtime if you need to use hive dialect
+      antlr-runtime-3.5.2.jar
+
 ```
 {{< /tab >}}
 {{< /tabs >}}
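
As a supplement to the CLI session added in `hive_dialect.md`, here is a minimal, hypothetical sketch of two of the HiveQL constructs listed there (LATERAL VIEW and CLUSTER BY) under the Hive dialect. The `page_views` table is an assumption made purely for illustration, and the catalog, module, and dialect setup from the session above is taken as given.

```sql
-- assumes the current catalog is a HiveCatalog and the hive module is loaded ahead of core, as in the session above
set table.sql-dialect=hive;

-- hypothetical table, created only for this illustration
create table page_views (page_id int, tags array<string>);

-- expand the tags array with the Hive explode UDTF, then cluster the output by page_id
select page_id, tag
from page_views
lateral view explode(tags) t as tag
cluster by page_id;
```

Both constructs are parsed by the Hive parser rather than Flink's default parser, so they are only expected to work after `table.sql-dialect` has been set to `hive`.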