[FLINK-18224][docs] Add document about sql client's tableau result mode
This closes apache#12569
KurtYoung authored Jun 11, 2020
1 parent 88cc44a commit 6fed1a1
Showing 3 changed files with 83 additions and 9 deletions.
45 changes: 41 additions & 4 deletions docs/dev/table/sqlClient.md
@@ -66,7 +66,7 @@ SELECT 'Hello World';

This query requires no table source and produces a single row result. The CLI will retrieve results from the cluster and visualize them. You can close the result view by pressing the `Q` key.

The CLI supports **two modes** for maintaining and visualizing results.
The CLI supports **three modes** for maintaining and visualizing results.

The **table mode** materializes results in memory and visualizes them in a regular, paginated table representation. It can be enabled by executing the following command in the CLI:

@@ -80,7 +80,18 @@ The **changelog mode** does not materialize results and visualizes the result st
SET execution.result-mode=changelog;
{% endhighlight %}

You can use the following query to see both result modes in action:
The **tableau mode** is closer to a traditional database client: it prints the results directly to the screen in a tabular (tableau) format.
What is displayed depends on the query execution type (`execution.type`).

{% highlight text %}
SET execution.result-mode=tableau;
{% endhighlight %}

Note that when you use this mode with a streaming query, the result is printed to the console continuously. If the input data of
the query is bounded, the job terminates after Flink has processed all input data, and the printing stops automatically.
Otherwise, to terminate a running query, press `CTRL-C`; this stops both the job and the printing.
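
As a rough sketch, a CLI session that switches to the batch variant of the tableau output might look like the following. This assumes that `execution.type` can be changed with `SET` in the same way as `execution.result-mode`; that is not shown in this change, so verify it against your Flink version.

{% highlight text %}
SET execution.type=batch;
SET execution.result-mode=tableau;
{% endhighlight %}

Setting `execution.type` back to `streaming` should then produce the changelog-style tableau output with the `+/-` column.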

You can use the following query to see all the result modes in action:

{% highlight sql %}
SELECT name, COUNT(*) AS cnt FROM (VALUES ('Bob'), ('Alice'), ('Greg'), ('Bob')) AS NameTable(name) GROUP BY name;
@@ -106,9 +117,35 @@ Alice, 1
Greg, 1
{% endhighlight %}

Both result modes can be useful during the prototyping of SQL queries. In both modes, results are stored in the Java heap memory of the SQL Client. In order to keep the CLI interface responsive, the changelog mode only shows the latest 1000 changes. The table mode allows for navigating through bigger results that are only limited by the available main memory and the configured [maximum number of rows](sqlClient.html#configuration) (`max-table-result-rows`).
In *tableau mode*, if you ran the query in streaming mode, the displayed result would be:
{% highlight text %}
+-----+----------------------+----------------------+
| +/- |                 name |                  cnt |
+-----+----------------------+----------------------+
|   + |                  Bob |                    1 |
|   + |                Alice |                    1 |
|   + |                 Greg |                    1 |
|   - |                  Bob |                    1 |
|   + |                  Bob |                    2 |
+-----+----------------------+----------------------+
Received a total of 5 rows
{% endhighlight %}

And if you ran the query in batch mode, the displayed result would be:
{% highlight text %}
+-------+-----+
| name  | cnt |
+-------+-----+
| Alice |   1 |
| Bob   |   2 |
| Greg  |   1 |
+-------+-----+
3 rows in set
{% endhighlight %}

All these result modes can be useful during the prototyping of SQL queries. In all these modes, results are stored in the Java heap memory of the SQL Client. In order to keep the CLI interface responsive, the changelog mode only shows the latest 1000 changes. The table mode allows for navigating through bigger results that are only limited by the available main memory and the configured [maximum number of rows](sqlClient.html#configuration) (`max-table-result-rows`).

<span class="label label-danger">Attention</span> Queries that are executed in a batch environment can only be retrieved using the `table` result mode.
<span class="label label-danger">Attention</span> Queries that are executed in a batch environment can only be retrieved using the `table` or `tableau` result mode.

After a query is defined, it can be submitted to the cluster as a long-running, detached Flink job. For this, a target system that stores the results needs to be specified using the [INSERT INTO statement](sqlClient.html#detached-sql-queries). The [configuration section](sqlClient.html#configuration) explains how to declare table sources for reading data, how to declare table sinks for writing data, and how to configure other table program properties.

45 changes: 41 additions & 4 deletions docs/dev/table/sqlClient.zh.md
@@ -65,7 +65,7 @@ SELECT 'Hello World';

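This query requires no table source and produces a single row result. The CLI will retrieve results from the cluster and visualize them. Press the `Q` key to close the result view.

The CLI supports **two modes** for maintaining and visualizing results.
The CLI supports **three modes** for maintaining and visualizing results.

The **table mode** materializes results in memory and visualizes them in a regular, paginated table representation. Enable it by executing the following command: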

@@ -79,7 +79,18 @@ SET execution.result-mode=table;
SET execution.result-mode=changelog;
{% endhighlight %}

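You can use the following query to see both result modes in action:
The **tableau mode** is closer to a traditional database: it prints the results directly to the screen in a tabular format. What is displayed depends on
the execution type of the job (`execution.type`):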

{% highlight text %}
SET execution.result-mode=tableau;
{% endhighlight %}

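Note that when you use this mode to run a streaming query, Flink prints the result to the screen continuously. If the input of the streaming query is a bounded data set,
Flink stops the job automatically after it has processed all the data, and the printing stops as well. To end the query early, press
`CTRL-C`; this stops both the job and the printing.

You can use the following query to see all three result modes in action: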

{% highlight sql %}
SELECT name, COUNT(*) AS cnt FROM (VALUES ('Bob'), ('Alice'), ('Greg'), ('Bob')) AS NameTable(name) GROUP BY name;
@@ -105,9 +116,35 @@ Alice, 1
Greg, 1
{% endhighlight %}

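Both result modes can be useful during the prototyping of SQL queries. In both modes, results are stored in the Java heap memory of the SQL Client. In order to keep the CLI interface responsive, the changelog mode only shows the latest 1000 changes. The table mode allows for navigating through bigger results that are only limited by the available main memory and the configured [maximum number of rows](sqlClient.html#configuration) (`max-table-result-rows`).
In *tableau mode*, if the query is executed in streaming mode, the following is displayed: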
{% highlight text %}
+-----+----------------------+----------------------+
| +/- |                 name |                  cnt |
+-----+----------------------+----------------------+
|   + |                  Bob |                    1 |
|   + |                Alice |                    1 |
|   + |                 Greg |                    1 |
|   - |                  Bob |                    1 |
|   + |                  Bob |                    2 |
+-----+----------------------+----------------------+
Received a total of 5 rows
{% endhighlight %}

If the query is executed in batch mode, the following is displayed:
{% highlight text %}
+-------+-----+
| name  | cnt |
+-------+-----+
| Alice |   1 |
| Bob   |   2 |
| Greg  |   1 |
+-------+-----+
3 rows in set
{% endhighlight %}

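All of these result modes can be useful during the prototyping of SQL queries. In all of these modes, results are stored in the Java heap memory of the SQL Client. In order to keep the CLI interface responsive, the changelog mode only shows the latest 1000 changes. The table mode allows for navigating through bigger results that are only limited by the available main memory and the configured [maximum number of rows](sqlClient.html#configuration) (`max-table-result-rows`).

<span class="label label-danger">Attention</span> Queries that are executed in a batch environment can only be retrieved using the table mode.
<span class="label label-danger">Attention</span> Queries that are executed in a batch environment can only be retrieved using the table mode or the tableau mode.

After a query is defined, it can be submitted to the cluster as a long-running, detached Flink job. For this, a target system that stores the results needs to be specified using the [INSERT INTO statement](sqlClient.html#detached-sql-queries). The [configuration section](sqlClient.html#configuration) explains how to declare table sources for reading data, how to declare table sinks for writing data, and how to configure other table program properties.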

2 changes: 1 addition & 1 deletion flink-table/flink-sql-client/conf/sql-client-defaults.yaml
@@ -104,7 +104,7 @@ execution:
  time-characteristic: event-time
  # interval in ms for emitting periodic watermarks
  periodic-watermarks-interval: 200
  # 'changelog' or 'table' presentation of results
  # 'changelog', 'table' or 'tableau' presentation of results
  result-mode: table
  # maximum number of maintained rows in 'table' presentation of results
  max-table-result-rows: 1000000
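
For illustration, an environment file that makes tableau the default presentation might contain a fragment like the one below. This is a minimal sketch rather than part of the commit: only the `result-mode` value comes from the changed comment, and the `type` key is assumed to sit in the same `execution` section as in the shipped defaults file.

{% highlight yaml %}
execution:
  # assumption: 'streaming' or 'batch', as in the shipped defaults file
  type: streaming
  # one of 'changelog', 'table' or 'tableau'
  result-mode: tableau
{% endhighlight %}

The same setting can be changed per session with `SET execution.result-mode=tableau;`, as described in the documentation change above.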
