Modify according to new module (apache#19651)
* Refactor document

Remove concepts document

* refactor

* update feature sharding

* update according new module

* update according to new module

* update according to new module

* update according to new module

* modify according to new module

* update according new module

* update

* update

* update according new module

* update according to new module

* update

* update

* update according to new module
Mike0601 authored Jul 28, 2022
1 parent 46f16e4 commit 24bbf9b
Showing 9 changed files with 241 additions and 183 deletions.
5 changes: 2 additions & 3 deletions docs/document/content/reference/sharding/_index.cn.md
@@ -5,9 +5,8 @@ weight = 3
chapter = true
+++

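The main data sharding workflow is identical across all three ShardingSphere products. Depending on whether query optimization is performed, it can be divided into the Standard kernel process and the Federation execution engine process.
The Standard kernel process consists of `SQL parsing => SQL routing => SQL rewriting => SQL execution => result merging` and is mainly used to handle SQL execution in standard sharding scenarios.
The Federation execution engine process consists of `SQL parsing => logical optimization => physical optimization => optimized execution => Standard kernel process`. The Federation execution engine performs logical and physical optimization internally and, in the optimized execution phase, relies on the Standard kernel process to route, rewrite, execute, and merge the optimized logical SQL.
The figure below shows how ShardingSphere data sharding works. Depending on whether query optimization is required, it can be divided into the Simple Push Down process and the SQL Federation execution engine process. The Simple Push Down process consists of `SQL parsing => SQL binding => SQL routing => SQL rewriting => SQL execution => result merging` and is mainly used to handle SQL execution in standard sharding scenarios.
The SQL Federation execution engine process consists of `SQL parsing => SQL binding => logical optimization => physical optimization => data fetching => operator execution`. The SQL Federation execution engine performs logical and physical optimization internally and, in the optimized execution phase, relies on the Standard kernel process to route, rewrite, execute, and merge the optimized logical SQL.

![Sharding Architecture Diagram](https://shardingsphere.apache.org/document/current/img/sharding/sharding_architecture_cn_v2.png)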

23 changes: 13 additions & 10 deletions docs/document/content/reference/sharding/_index.en.md
@@ -5,32 +5,35 @@ weight = 3
chapter = true
+++

The major sharding processes of all the three ShardingSphere products are identical. According to whether query optimization is performed, they can be divided into standard kernel process and federation executor engine process.
The standard kernel process consists of `SQL Parse => SQL Route => SQL Rewrite => SQL Execute => Result Merge`, which is used to process SQL execution in standard sharding scenarios.
The federation executor engine process consists of `SQL Parse => Logical Plan Optimize => Physical Plan Optimize => Plan Execute => Standard Kernel Process`. The federation executor engine performs logical plan optimization and physical plan optimization. In the optimization execution phase, it relies on the standard kernel process to route, rewrite, execute, and merge the optimized logical SQL.
The figure below shows how sharding works. Depending on whether query optimization is needed, the workflow is divided into the Simple Push Down process and the SQL Federation execution engine process.
The Simple Push Down process consists of `SQL parser => SQL binder => SQL router => SQL rewriter => SQL executor => result merger` and is mainly used to handle SQL execution in standard sharding scenarios.
The SQL Federation execution engine process consists of `SQL parser => SQL binder => logical optimization => physical optimization => data fetcher => operator calculation`.
This process performs logical and physical optimization internally. In the execution phase, it relies on the standard kernel process to route, rewrite, execute, and merge the optimized logical SQL.

![Sharding Architecture Diagram](https://shardingsphere.apache.org/document/current/img/sharding/sharding_architecture_en_v2.png)

## SQL Parsing
## SQL Parser

It is divided into lexical parsing and syntactic parsing. The lexical parser will split SQL into inseparable words, and then the syntactic parser will analyze SQL and extract the parsing context, which can include tables, options, ordering items, grouping items, aggregation functions, pagination information, query conditions and placeholders that may be revised.
SQL parsing is divided into lexical parsing and syntactic parsing. The lexical parser first splits SQL into indivisible words (tokens).

The syntactic parser then analyzes the SQL and extracts the parsing context, which can include tables, select items, ordering items, grouping items, aggregation functions, pagination information, query conditions, and placeholders that may be modified.
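
To illustrate the lexical step only, here is a minimal sketch of splitting SQL into indivisible words. The token categories and the `tokenize` helper are assumptions made for this example and are not ShardingSphere's actual parser, which is far more complete.

```python
import re

# Hypothetical token pattern: quoted strings, numbers, identifiers/keywords,
# the `?` placeholder, and a few operators and punctuation marks.
TOKEN_PATTERN = re.compile(
    r"\s*(?:(?P<string>'[^']*')"
    r"|(?P<number>\d+)"
    r"|(?P<word>[A-Za-z_][A-Za-z0-9_]*)"
    r"|(?P<placeholder>\?)"
    r"|(?P<symbol><=|>=|<>|[=<>*,().]))"
)


def tokenize(sql):
    """Split SQL text into (kind, text) pairs; raise on unknown characters."""
    tokens = []
    pos = 0
    sql = sql.strip()
    while pos < len(sql):
        match = TOKEN_PATTERN.match(sql, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}: {sql[pos]!r}")
        kind = match.lastgroup  # name of the alternative that matched
        tokens.append((kind, match.group(kind)))
        pos = match.end()
    return tokens
```

A syntactic parser would then consume these pairs to build the parsing context (tables, conditions, placeholders, and so on).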

## SQL Route

It is the sharding strategy that matches users’ configurations according to the parsing context and the route path can be generated. It supports sharding route and broadcast route currently.
The sharding strategy configured by the user is matched according to the parsing context, and the routing path is generated. Currently, sharding routes and broadcast routes are supported.
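
The two kinds of route can be sketched as follows. This assumes a modulo sharding strategy on a single integer sharding key and a hypothetical `<logic_table>_<index>` naming scheme. Neither comes from the text above, and `route` is an illustrative helper, not a ShardingSphere API.

```python
def route(logic_table, sharding_value, shard_count=2):
    """Return the actual tables a statement should be sent to.

    Assumption: actual tables are named <logic_table>_<index> and the
    sharding strategy is `sharding_value % shard_count`.
    """
    if sharding_value is None:
        # Broadcast route: no sharding key is available, so target every table.
        return [f"{logic_table}_{i}" for i in range(shard_count)]
    # Sharding route: the sharding key selects exactly one actual table.
    return [f"{logic_table}_{sharding_value % shard_count}"]
```

For example, `route("t_order", 5)` targets a single table, while `route("t_order", None)` targets all of them.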

## SQL Rewrite

It rewrites SQL as statement that can be rightly executed in the real database, and can be divided into correctness rewrite and optimization rewrite.
It rewrites SQL into statements that can be executed correctly in a real database. SQL rewriting is divided into rewriting for correctness and rewriting for optimization.
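
As a small illustration of rewriting for correctness, the sketch below replaces a logic table name with an actual table name. It works on raw text with a word-boundary regex purely for brevity. The `rewrite_table_name` helper is hypothetical, and a real rewriter would rely on the parsing context rather than text search.

```python
import re


def rewrite_table_name(sql, logic_table, actual_table):
    """Replace whole-word occurrences of the logic table with the actual table."""
    pattern = rf"\b{re.escape(logic_table)}\b"
    # A function replacement avoids interpreting backslashes in actual_table.
    return re.sub(pattern, lambda _match: actual_table, sql)
```

Applied to `SELECT order_id FROM t_order WHERE order_id = 1` with the actual table `t_order_1`, only the table name changes. The column `order_id` is untouched because it does not contain the whole word `t_order`.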

## SQL Execution

Through multi-thread executor, it executes asynchronously.
It executes asynchronously through a multithreaded executor.
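
The idea of running one statement per shard on a multithreaded executor can be sketched as below. The in-memory `FAKE_SHARDS` data and the `execute_on_shard` function are stand-ins invented for this example. A real executor would send SQL over database connections.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "shards" standing in for real database tables.
FAKE_SHARDS = {
    "t_order_0": [(2, "b"), (4, "d")],
    "t_order_1": [(1, "a"), (3, "c")],
}


def execute_on_shard(table):
    """Pretend to run a query against one actual table and return its rows."""
    return FAKE_SHARDS[table]


def execute_all(tables, max_workers=4):
    """Run the per-shard queries concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(execute_on_shard, tables))
```

Each entry of the returned list is one result set, ready to be handed to the result merger.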

## Result Merger

It merges multiple execution result sets to output through unified JDBC interface. Result merger includes methods as stream merger, memory merger and addition merger using decorator merger.
It merges multiple execution result sets and outputs them through the unified JDBC interface. Result merging includes stream merging, memory merging, and appended merging using the decorator pattern.
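
A rough sketch of stream merging with a decorator-style step on top: each shard is assumed to return rows already sorted by the ordering key, so a k-way merge can yield rows one at a time without loading everything into memory. The `stream_merge` and `limit` helpers are illustrative names, and treating pagination as the decorating step is an assumption of this example.

```python
import heapq
from itertools import islice


def stream_merge(sorted_result_sets, key=None):
    """Lazily merge result sets that are each already sorted by `key`."""
    return heapq.merge(*sorted_result_sets, key=key)


def limit(rows, offset, count):
    """Decorator-style step: wrap any merged row iterator with pagination."""
    return islice(rows, offset, offset + count)
```

Because both helpers return iterators, `limit(stream_merge(...), offset, count)` stops pulling rows from the shards as soon as the requested page is complete.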

## Query Optimization

Supported by federation executor engine(under development), optimization is performed on complex query such as join query and subquery. It also supports distributed query across multiple database instances. It uses relational algebra internally to optimize query plan, and then get query result through the best query plan.
Supported by the experimental Federation Execution Engine, it optimizes complex queries such as join queries and subqueries, and supports distributed queries across multiple database instances. Internally, it uses relational algebra to optimize the query plan and obtains query results through the optimal plan.