diff --git a/.github/workflows/deploy-github.yml b/.github/workflows/deploy-github.yml
index 7d72ad9b99a..c135dc420f3 100644
--- a/.github/workflows/deploy-github.yml
+++ b/.github/workflows/deploy-github.yml
@@ -39,6 +39,7 @@ jobs:
uses: JamesIves/github-pages-deploy-action@v4.3.0
with:
repository-name: ClickHouse/clickhouse-docs-content
+ token: ${{ secrets.GITHUB_TOKEN }}
branch: gh-pages
folder: .
env:
diff --git a/.gitignore b/.gitignore
index fb9a46a1b5a..c11afe4cd36 100644
--- a/.gitignore
+++ b/.gitignore
@@ -1,6 +1,5 @@
node_modules
.docusaurus
build
-docs/en/reference
**/.DS_Store
diff --git a/docs/en/reference/development/_category_.yml b/docs/en/reference/development/_category_.yml
new file mode 100644
index 00000000000..9e622150f74
--- /dev/null
+++ b/docs/en/reference/development/_category_.yml
@@ -0,0 +1,8 @@
+position: 100
+label: 'Building ClickHouse'
+collapsible: true
+collapsed: true
+link:
+ type: generated-index
+ title: Building ClickHouse
+ slug: /en/development
\ No newline at end of file
diff --git a/docs/en/reference/development/adding_test_queries.md b/docs/en/reference/development/adding_test_queries.md
new file mode 100644
index 00000000000..9b993a96ed5
--- /dev/null
+++ b/docs/en/reference/development/adding_test_queries.md
@@ -0,0 +1,157 @@
+---
+sidebar_label: Adding Test Queries
+sidebar_position: 63
+description: Instructions on how to add a test case to ClickHouse continuous integration
+---
+
+# How to add test queries to ClickHouse CI
+
+ClickHouse has hundreds (or even thousands) of features. Every commit gets checked by a complex set of tests containing many thousands of test cases.
+
+The core functionality is very well tested, but some corner cases and combinations of features may be left uncovered by ClickHouse CI.
+
+Most of the bugs/regressions we see happen in that 'grey area' where test coverage is poor.
+
+We are very interested in covering with tests most of the possible scenarios and feature combinations used in real life.
+
+## Why add tests
+
+Why and when you should add a test case to the ClickHouse code:
+1) you use some complicated scenarios / feature combinations / you have some corner case which is probably not widely used
+2) you see that certain behavior gets changed between versions without any notification in the changelog
+3) you just want to help improve ClickHouse quality and ensure the features you use will not be broken in future releases
+4) once the test is added/accepted, you can be sure the corner case you check will never be accidentally broken.
+5) you will be a part of a great open-source community
+6) your name will be visible in the `system.contributors` table!
+7) you will make the world a bit better :)
+
+### Steps to do
+
+#### Prerequisite
+
+We assume you are running a Linux machine (you can use Docker or virtual machines on other operating systems), have a modern browser and an internet connection, and have some basic Linux and SQL skills.
+
+No highly specialized knowledge is needed (you don't need to know C++ or anything about how ClickHouse CI works).
+
+
+#### Preparation
+
+1) [create a GitHub account](https://github.com/join) (if you don't have one yet)
+2) [setup git](https://docs.github.com/en/free-pro-team@latest/github/getting-started-with-github/set-up-git)
+```bash
+# for Ubuntu
+sudo apt-get update
+sudo apt-get install git
+
+git config --global user.name "John Doe" # fill with your name
+git config --global user.email "email@example.com" # fill with your email
+
+```
+3) [fork ClickHouse project](https://docs.github.com/en/free-pro-team@latest/github/getting-started-with-github/fork-a-repo) - just open [https://github.com/ClickHouse/ClickHouse](https://github.com/ClickHouse/ClickHouse) and press the fork button in the top right corner:
+![fork repo](https://github-images.s3.amazonaws.com/help/bootcamp/Bootcamp-Fork.png)
+
+4) clone your fork to some folder on your PC, for example, `~/workspace/ClickHouse`
+```
+mkdir ~/workspace && cd ~/workspace
+git clone https://github.com/<your GitHub username>/ClickHouse
+cd ClickHouse
+git remote add upstream https://github.com/ClickHouse/ClickHouse
+```
+
+#### New branch for the test
+
+1) create a new branch from the latest ClickHouse master
+```
+cd ~/workspace/ClickHouse
+git fetch upstream
+git checkout -b name_for_a_branch_with_my_test upstream/master
+```
+
+#### Install & run ClickHouse
+
+1) install `clickhouse-server` (follow [official docs](https://clickhouse.com/docs/en/getting-started/install/))
+2) install the test configurations (they use a ZooKeeper mock implementation and adjust some settings)
+```
+cd ~/workspace/ClickHouse/tests/config
+sudo ./install.sh
+```
+3) run clickhouse-server
+```
+sudo systemctl restart clickhouse-server
+```
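+
+To quickly check that the server is up and accepting connections, you can run a trivial query (just a sanity check; any simple query will do):
+```bash
+clickhouse-client --query "SELECT version()"
+```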
+
+#### Creating the test file
+
+
+1) find the number for your test - find the file with the biggest number in `tests/queries/0_stateless/`
+
+```sh
+$ cd ~/workspace/ClickHouse
+$ ls tests/queries/0_stateless/[0-9]*.reference | tail -n 1
+tests/queries/0_stateless/01520_client_print_query_id.reference
+```
+Currently, the last number for the test is `01520`, so my test will have the number `01521`.
+
+2) create an SQL file with the next number and name of the feature you test
+
+```sh
+touch tests/queries/0_stateless/01521_dummy_test.sql
+```
+
+3) edit the SQL file with your favorite editor (see the hints on creating a good test below)
+```sh
+vim tests/queries/0_stateless/01521_dummy_test.sql
+```
+
+
+4) run the test, and put the result of that into the reference file:
+```
+clickhouse-client -nmT < tests/queries/0_stateless/01521_dummy_test.sql | tee tests/queries/0_stateless/01521_dummy_test.reference
+```
+
+5) ensure everything is correct. If the test output is incorrect (due to some bug, for example), adjust the reference file using a text editor.
+
+#### How to create a good test
+
+- A test should be
+ - minimal - create only tables related to tested functionality, remove unrelated columns and parts of query
+ - fast - should not take longer than a few seconds (better subseconds)
+ - correct - fails when the feature is not working
+ - deterministic
+ - isolated / stateless
+ - don't rely on the environment
+ - don't rely on timing where possible
+- try to cover corner cases (zeros / Nulls / empty sets / throwing exceptions)
+- to test that a query returns an error, you can put a special comment after the query: `-- { serverError 60 }` or `-- { clientError 20 }`
+- don't switch databases (unless necessary)
+- you can create several table replicas on the same node if needed
+- you can use one of the test cluster definitions when needed (see system.clusters)
+- use the `numbers` / `numbers_mt` / `zeros` / `zeros_mt` table functions and similar for queries / to initialize data when applicable
+- clean up the created objects after the test and also before it (DROP IF EXISTS), in case of some dirty state
+- prefer sync mode of operations (mutations, merges, etc.)
+- use other SQL files in the `0_stateless` folder as an example
+- ensure the feature / feature combination you want to test is not yet covered with existing tests
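+
+For illustration, here is a rough sketch of a complete test that tries to follow the rules above (the table name `t_01521` and the queries are only an example, not an existing test):
+```sh
+# write the test queries (a hypothetical minimal example)
+cat > tests/queries/0_stateless/01521_dummy_test.sql <<'EOF'
+DROP TABLE IF EXISTS t_01521;
+CREATE TABLE t_01521 (x UInt64) ENGINE = MergeTree ORDER BY x;
+INSERT INTO t_01521 SELECT number FROM numbers(5);
+SELECT count(), sum(x) FROM t_01521;
+DROP TABLE t_01521;
+EOF
+
+# generate the reference file from the actual output, as described above
+clickhouse-client -nmT < tests/queries/0_stateless/01521_dummy_test.sql | tee tests/queries/0_stateless/01521_dummy_test.reference
+```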
+
+#### Test naming rules
+
+It's important to name tests correctly, so one could turn off some subsets of tests in a `clickhouse-test` invocation.
+
+| Tester flag| What should be in test name | When flag should be added |
+|---|---|---|
+| `--[no-]zookeeper`| "zookeeper" or "replica" | Test uses tables from ReplicatedMergeTree family |
+| `--[no-]shard` | "shard" or "distributed" or "global" | Test uses connections to 127.0.0.2 or similar |
+| `--[no-]long` | "long" or "deadlock" or "race" | Test runs longer than 60 seconds |
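+
+For example, with these naming rules in place, tests that need ZooKeeper or take a long time can be skipped roughly like this (a sketch, assuming the `clickhouse-test` runner in the `tests` directory):
+```sh
+cd ~/workspace/ClickHouse/tests
+# run the stateless tests, skipping those that need ZooKeeper or run long
+./clickhouse-test --no-zookeeper --no-long
+```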
+
+#### Commit / push / create PR
+
+1) commit & push your changes
+```sh
+cd ~/workspace/ClickHouse
+git add tests/queries/0_stateless/01521_dummy_test.sql
+git add tests/queries/0_stateless/01521_dummy_test.reference
+git commit # use some nice commit message when possible
+git push origin HEAD
+```
+2) use the link shown during the push to create a PR into the main repo
+3) adjust the PR title and contents: in `Changelog category (leave one)` keep
+`Build/Testing/Packaging Improvement`, and fill in the rest of the fields if you want.
diff --git a/docs/en/reference/development/architecture.md b/docs/en/reference/development/architecture.md
new file mode 100644
index 00000000000..b5cb6c321ac
--- /dev/null
+++ b/docs/en/reference/development/architecture.md
@@ -0,0 +1,202 @@
+---
+sidebar_label: Architecture Overview
+sidebar_position: 62
+---
+
+# Overview of ClickHouse Architecture
+
+ClickHouse is a true column-oriented DBMS. Data is stored by columns, and during the execution of queries, data is processed by arrays (vectors or chunks of columns).
+Whenever possible, operations are dispatched on arrays, rather than on individual values. It is called “vectorized query execution” and it helps lower the cost of actual data processing.
+
+> This idea is nothing new. It dates back to the `APL` (A programming language, 1957) and its descendants: `A+` (APL dialect), `J` (1990), `K` (1993), and `Q` (programming language from Kx Systems, 2003). Array programming is used in scientific data processing. Neither is this idea something new in relational databases: for example, it is used in the `VectorWise` system (also known as Actian Vector Analytic Database by Actian Corporation).
+
+There are two different approaches for speeding up query processing: vectorized query execution and runtime code generation. The latter removes all indirection and dynamic dispatch. Neither of these approaches is strictly better than the other. Runtime code generation can be better when it fuses many operations, thus fully utilizing CPU execution units and the pipeline. Vectorized query execution can be less practical because it involves temporary vectors that must be written to the cache and read back. If the temporary data does not fit in the L2 cache, this becomes an issue. But vectorized query execution more easily utilizes the SIMD capabilities of the CPU. A [research paper](http://15721.courses.cs.cmu.edu/spring2016/papers/p5-sompolski.pdf) written by our friends shows that it is better to combine both approaches. ClickHouse uses vectorized query execution and has limited initial support for runtime code generation.
+
+## Columns {#columns}
+
+`IColumn` interface is used to represent columns in memory (actually, chunks of columns). This interface provides helper methods for the implementation of various relational operators. Almost all operations are immutable: they do not modify the original column, but create a new modified one. For example, the `IColumn::filter` method accepts a filter byte mask. It is used for the `WHERE` and `HAVING` relational operators. Additional examples: the `IColumn::permute` method to support `ORDER BY`, the `IColumn::cut` method to support `LIMIT`.
+
+Various `IColumn` implementations (`ColumnUInt8`, `ColumnString`, and so on) are responsible for the memory layout of columns. The memory layout is usually a contiguous array. For the integer type of columns, it is just one contiguous array, like `std::vector`. For `String` and `Array` columns, it is two vectors: one for all array elements, placed contiguously, and a second one for offsets to the beginning of each array. There is also `ColumnConst` that stores just one value in memory, but looks like a column.
+
+## Field {#field}
+
+Nevertheless, it is possible to work with individual values as well. To represent an individual value, the `Field` is used. `Field` is just a discriminated union of `UInt64`, `Int64`, `Float64`, `String` and `Array`. `IColumn` has the `operator []` method to get the n-th value as a `Field`, and the `insert` method to append a `Field` to the end of a column. These methods are not very efficient, because they require dealing with temporary `Field` objects representing an individual value. There are more efficient methods, such as `insertFrom`, `insertRangeFrom`, and so on.
+
+`Field` does not have enough information about a specific data type for a table. For example, `UInt8`, `UInt16`, `UInt32`, and `UInt64` are all represented as `UInt64` in a `Field`.
+
+## Leaky Abstractions {#leaky-abstractions}
+
+`IColumn` has methods for common relational transformations of data, but they do not meet all needs. For example, `ColumnUInt64` does not have a method to calculate the sum of two columns, and `ColumnString` does not have a method to run a substring search. These countless routines are implemented outside of `IColumn`.
+
+Various functions on columns can be implemented in a generic, non-efficient way using `IColumn` methods to extract `Field` values, or in a specialized way using knowledge of the inner memory layout of data in a specific `IColumn` implementation. This is done by casting functions to a specific `IColumn` type and dealing with the internal representation directly. For example, `ColumnUInt64` has the `getData` method that returns a reference to an internal array, then a separate routine reads or fills that array directly. We have “leaky abstractions” to allow efficient specializations of various routines.
+
+## Data Types {#data_types}
+
+`IDataType` is responsible for serialization and deserialization: for reading and writing chunks of columns or individual values in binary or text form. `IDataType` directly corresponds to data types in tables. For example, there are `DataTypeUInt32`, `DataTypeDateTime`, `DataTypeString` and so on.
+
+`IDataType` and `IColumn` are only loosely related to each other. Different data types can be represented in memory by the same `IColumn` implementations. For example, `DataTypeUInt32` and `DataTypeDateTime` are both represented by `ColumnUInt32` or `ColumnConstUInt32`. In addition, the same data type can be represented by different `IColumn` implementations. For example, `DataTypeUInt8` can be represented by `ColumnUInt8` or `ColumnConstUInt8`.
+
+`IDataType` only stores metadata. For instance, `DataTypeUInt8` does not store anything at all (except virtual pointer `vptr`) and `DataTypeFixedString` stores just `N` (the size of fixed-size strings).
+
+`IDataType` has helper methods for various data formats. Examples are methods to serialize a value with possible quoting, to serialize a value for JSON, and to serialize a value as part of the XML format. There is no direct correspondence to data formats. For example, the different data formats `Pretty` and `TabSeparated` can use the same `serializeTextEscaped` helper method from the `IDataType` interface.
+
+## Block {#block}
+
+A `Block` is a container that represents a subset (chunk) of a table in memory. It is just a set of triples: `(IColumn, IDataType, column name)`. During query execution, data is processed by `Block`s. If we have a `Block`, we have data (in the `IColumn` object), we have information about its type (in `IDataType`) that tells us how to deal with that column, and we have the column name. It could be either the original column name from the table or some artificial name assigned for getting temporary results of calculations.
+
+When we calculate some function over columns in a block, we add another column with its result to the block, and we do not touch columns for arguments of the function because operations are immutable. Later, unneeded columns can be removed from the block, but not modified. It is convenient for the elimination of common subexpressions.
+
+Blocks are created for every processed chunk of data. Note that for the same type of calculation, the column names and types remain the same for different blocks, and only column data changes. It is better to split block data from the block header because small block sizes have a high overhead of temporary strings for copying shared_ptrs and column names.
+
+## Block Streams {#block-streams}
+
+Block streams are for processing data. We use streams of blocks to read data from somewhere, perform data transformations, or write data to somewhere. `IBlockInputStream` has the `read` method to fetch the next block while available. `IBlockOutputStream` has the `write` method to push the block somewhere.
+
+Streams are responsible for:
+
+1. Reading or writing to a table. The table just returns a stream for reading or writing blocks.
+2. Implementing data formats. For example, if you want to output data to a terminal in `Pretty` format, you create a block output stream where you push blocks, and it formats them.
+3. Performing data transformations. Let’s say you have `IBlockInputStream` and want to create a filtered stream. You create `FilterBlockInputStream` and initialize it with your stream. Then when you pull a block from `FilterBlockInputStream`, it pulls a block from your stream, filters it, and returns the filtered block to you. Query execution pipelines are represented this way.
+
+There are more sophisticated transformations. For example, when you pull from `AggregatingBlockInputStream`, it reads all data from its source, aggregates it, and then returns a stream of aggregated data for you. Another example: `UnionBlockInputStream` accepts many input sources in the constructor and also a number of threads. It launches multiple threads and reads from multiple sources in parallel.
+
+> Block streams use the “pull” approach to control flow: when you pull a block from the first stream, it consequently pulls the required blocks from nested streams, and the entire execution pipeline will work. Neither “pull” nor “push” is the best solution, because control flow is implicit, and that limits the implementation of various features like simultaneous execution of multiple queries (merging many pipelines together). This limitation could be overcome with coroutines or just running extra threads that wait for each other. We may have more possibilities if we make control flow explicit: if we locate the logic for passing data from one calculation unit to another outside of those calculation units. Read this [article](http://journal.stuffwithstuff.com/2013/01/13/iteration-inside-and-out/) for more thoughts.
+
+We should note that the query execution pipeline creates temporary data at each step. We try to keep block size small enough so that temporary data fits in the CPU cache. With that assumption, writing and reading temporary data is almost free in comparison with other calculations. We could consider an alternative, which is to fuse many operations in the pipeline together. It could make the pipeline as short as possible and remove much of the temporary data, which could be an advantage, but it also has drawbacks. For example, a split pipeline makes it easy to implement caching intermediate data, stealing intermediate data from similar queries running at the same time, and merging pipelines for similar queries.
+
+## Formats {#formats}
+
+Data formats are implemented with block streams. There are “presentational” formats only suitable for the output of data to the client, such as `Pretty` format, which provides only `IBlockOutputStream`. And there are input/output formats, such as `TabSeparated` or `JSONEachRow`.
+
+There are also row streams: `IRowInputStream` and `IRowOutputStream`. They allow you to pull/push data by individual rows, not by blocks. And they are only needed to simplify the implementation of row-oriented formats. The wrappers `BlockInputStreamFromRowInputStream` and `BlockOutputStreamFromRowOutputStream` allow you to convert row-oriented streams to regular block-oriented streams.
+
+## I/O {#io}
+
+For byte-oriented input/output, there are `ReadBuffer` and `WriteBuffer` abstract classes. They are used instead of C++ `iostream`s. Don’t worry: every mature C++ project is using something other than `iostream`s for good reasons.
+
+`ReadBuffer` and `WriteBuffer` are just a contiguous buffer and a cursor pointing to the position in that buffer. Implementations may own or not own the memory for the buffer. There is a virtual method to fill the buffer with the following data (for `ReadBuffer`) or to flush the buffer somewhere (for `WriteBuffer`). The virtual methods are rarely called.
+
+Implementations of `ReadBuffer`/`WriteBuffer` are used for working with files and file descriptors and network sockets, for implementing compression (`CompressedWriteBuffer` is initialized with another WriteBuffer and performs compression before writing data to it), and for other purposes – the names `ConcatReadBuffer`, `LimitReadBuffer`, and `HashingWriteBuffer` speak for themselves.
+
+Read/WriteBuffers only deal with bytes. There are functions from `ReadHelpers` and `WriteHelpers` header files to help with formatting input/output. For example, there are helpers to write a number in decimal format.
+
+Let’s look at what happens when you want to write a result set in `JSON` format to stdout. You have a result set ready to be fetched from `IBlockInputStream`. You create `WriteBufferFromFileDescriptor(STDOUT_FILENO)` to write bytes to stdout. You create `JSONRowOutputStream`, initialized with that `WriteBuffer`, to write rows in `JSON` to stdout. You create `BlockOutputStreamFromRowOutputStream` on top of it, to represent it as `IBlockOutputStream`. Then you call `copyData` to transfer data from `IBlockInputStream` to `IBlockOutputStream`, and everything works. Internally, `JSONRowOutputStream` will write various JSON delimiters and call the `IDataType::serializeTextJSON` method with a reference to `IColumn` and the row number as arguments. Consequently, `IDataType::serializeTextJSON` will call a method from `WriteHelpers.h`: for example, `writeText` for numeric types and `writeJSONString` for `DataTypeString`.
+
+## Tables {#tables}
+
+The `IStorage` interface represents tables. Different implementations of that interface are different table engines. Examples are `StorageMergeTree`, `StorageMemory`, and so on. Instances of these classes are just tables.
+
+The key `IStorage` methods are `read` and `write`. There are also `alter`, `rename`, `drop`, and so on. The `read` method accepts the following arguments: the set of columns to read from a table, the `AST` query to consider, and the desired number of streams to return. It returns one or multiple `IBlockInputStream` objects and information about the stage of data processing that was completed inside a table engine during query execution.
+
+In most cases, the read method is only responsible for reading the specified columns from a table, not for any further data processing. All further data processing is done by the query interpreter and is outside the responsibility of `IStorage`.
+
+But there are notable exceptions:
+
+- The AST query is passed to the `read` method, and the table engine can use it to derive index usage and to read less data from a table.
+- Sometimes the table engine can process data itself to a specific stage. For example, `StorageDistributed` can send a query to remote servers, ask them to process data to a stage where data from different remote servers can be merged, and return that preprocessed data. The query interpreter then finishes processing the data.
+
+The table’s `read` method can return multiple `IBlockInputStream` objects to allow parallel data processing. These multiple block input streams can read from a table in parallel. Then you can wrap these streams with various transformations (such as expression evaluation or filtering) that can be calculated independently and create a `UnionBlockInputStream` on top of them, to read from multiple streams in parallel.
+
+There are also `TableFunction`s. These are functions that return a temporary `IStorage` object to use in the `FROM` clause of a query.
+
+To get a quick idea of how to implement your table engine, look at something simple, like `StorageMemory` or `StorageTinyLog`.
+
+> As a result of the `read` method, `IStorage` returns `QueryProcessingStage` – information about what parts of the query were already calculated inside storage.
+
+## Parsers {#parsers}
+
+A hand-written recursive descent parser parses a query. For example, `ParserSelectQuery` just recursively calls the underlying parsers for various parts of the query. Parsers create an `AST`. The `AST` is represented by nodes, which are instances of `IAST`.
+
+> Parser generators are not used for historical reasons.
+
+## Interpreters {#interpreters}
+
+Interpreters are responsible for creating the query execution pipeline from an `AST`. There are simple interpreters, such as `InterpreterExistsQuery` and `InterpreterDropQuery`, or the more sophisticated `InterpreterSelectQuery`. The query execution pipeline is a combination of block input or output streams. For example, the result of interpreting the `SELECT` query is the `IBlockInputStream` to read the result set from; the result of the INSERT query is the `IBlockOutputStream` to write data for insertion to, and the result of interpreting the `INSERT SELECT` query is the `IBlockInputStream` that returns an empty result set on the first read, but that copies data from `SELECT` to `INSERT` at the same time.
+
+`InterpreterSelectQuery` uses `ExpressionAnalyzer` and `ExpressionActions` machinery for query analysis and transformations. This is where most rule-based query optimizations are done. `ExpressionAnalyzer` is quite messy and should be rewritten: various query transformations and optimizations should be extracted to separate classes to allow modular transformations of query.
+
+## Functions {#functions}
+
+There are ordinary functions and aggregate functions. For aggregate functions, see the next section.
+
+Ordinary functions do not change the number of rows – they work as if they are processing each row independently. In fact, functions are not called for individual rows, but for `Block`s of data to implement vectorized query execution.
+
+There are some miscellaneous functions, like [blockSize](../sql-reference/functions/other-functions.md#function-blocksize), [rowNumberInBlock](../sql-reference/functions/other-functions.md#function-rownumberinblock), and [runningAccumulate](../sql-reference/functions/other-functions.md#runningaccumulate), that exploit block processing and violate the independence of rows.
+
+ClickHouse has strong typing, so there’s no implicit type conversion. If a function does not support a specific combination of types, it throws an exception. But functions can work (be overloaded) for many different combinations of types. For example, the `plus` function (to implement the `+` operator) works for any combination of numeric types: `UInt8` + `Float32`, `UInt16` + `Int8`, and so on. Also, some variadic functions can accept any number of arguments, such as the `concat` function.
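+
+This overloading is visible at the SQL level. A quick illustration via `clickhouse-client` (the particular casts and literals are arbitrary):
+```bash
+clickhouse-client --query "
+    SELECT
+        toUInt8(1) + toFloat32(2.5) AS v,
+        toTypeName(toUInt8(1) + toFloat32(2.5)) AS v_type,
+        concat('Click', 'House') AS s
+"
+```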
+
+Implementing a function may be slightly inconvenient because a function explicitly dispatches supported data types and supported `IColumns`. For example, the `plus` function has code generated by instantiation of a C++ template for each combination of numeric types, and constant or non-constant left and right arguments.
+
+It is an excellent place to implement runtime code generation to avoid template code bloat. Also, it makes it possible to add fused functions like fused multiply-add or to make multiple comparisons in one loop iteration.
+
+Due to vectorized query execution, functions are not short-circuited. For example, if you write `WHERE f(x) AND g(y)`, both sides are calculated, even for rows where `f(x)` is zero (except when `f(x)` is a zero constant expression). But if the selectivity of the `f(x)` condition is high, and calculation of `f(x)` is much cheaper than `g(y)`, it’s better to implement multi-pass calculation. It would first calculate `f(x)`, then filter columns by the result, and then calculate `g(y)` only for smaller, filtered chunks of data.
+
+## Aggregate Functions {#aggregate-functions}
+
+Aggregate functions are stateful functions. They accumulate passed values into some state and allow you to get results from that state. They are managed with the `IAggregateFunction` interface. States can be rather simple (the state for `AggregateFunctionCount` is just a single `UInt64` value) or quite complex (the state of `AggregateFunctionUniqCombined` is a combination of a linear array, a hash table, and a `HyperLogLog` probabilistic data structure).
+
+States are allocated in `Arena` (a memory pool) to deal with multiple states while executing a high-cardinality `GROUP BY` query. States can have a non-trivial constructor and destructor: for example, complicated aggregation states can allocate additional memory themselves. It requires some attention to creating and destroying states and properly passing their ownership and destruction order.
+
+Aggregation states can be serialized and deserialized to pass over the network during distributed query execution or to write them on the disk where there is not enough RAM. They can even be stored in a table with the `DataTypeAggregateFunction` to allow incremental aggregation of data.
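+
+For example, intermediate states can be produced and merged explicitly with the `-State` and `-Merge` aggregate function combinators. A small illustration (not tied to any particular table):
+```bash
+clickhouse-client --query "
+    SELECT uniqMerge(u)
+    FROM
+    (
+        SELECT uniqState(number % 10) AS u
+        FROM numbers(1000)
+    )
+"
+```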
+
+> The serialized data format for aggregate function states is not versioned right now. It is ok if aggregate states are only stored temporarily. But we have the `AggregatingMergeTree` table engine for incremental aggregation, and people are already using it in production. It is the reason why backward compatibility is required when changing the serialized format for any aggregate function in the future.
+
+## Server {#server}
+
+The server implements several different interfaces:
+
+- An HTTP interface for any foreign clients.
+- A TCP interface for the native ClickHouse client and for cross-server communication during distributed query execution.
+- An interface for transferring data for replication.
+
+Internally, it is just a primitive multithreaded server without coroutines or fibers. The server is designed not to process a high rate of simple queries but to process a relatively low rate of complex queries, each of which can process a vast amount of data for analytics.
+
+The server initializes the `Context` class with the necessary environment for query execution: the list of available databases, users and access rights, settings, clusters, the process list, the query log, and so on. Interpreters use this environment.
+
+We maintain full backward and forward compatibility for the server TCP protocol: old clients can talk to new servers, and new clients can talk to old servers. But we do not want to maintain it eternally, and we are removing support for old versions after about one year.
+
+:::note
+For most external applications, we recommend using the HTTP interface because it is simple and easy to use. The TCP protocol is more tightly linked to internal data structures: it uses an internal format for passing blocks of data, and it uses custom framing for compressed data. We haven’t released a C library for that protocol because it requires linking most of the ClickHouse codebase, which is not practical.
+:::
+
+## Distributed Query Execution {#distributed-query-execution}
+
+Servers in a cluster setup are mostly independent. You can create a `Distributed` table on one or all servers in a cluster. The `Distributed` table does not store data itself – it only provides a “view” to all local tables on multiple nodes of a cluster. When you SELECT from a `Distributed` table, it rewrites that query, chooses remote nodes according to load balancing settings, and sends the query to them. The `Distributed` table requests remote servers to process a query just up to a stage where intermediate results from different servers can be merged. Then it receives the intermediate results and merges them. The distributed table tries to distribute as much work as possible to remote servers and does not send much intermediate data over the network.
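+
+A `Distributed` table is defined on top of existing local tables. A sketch of what that looks like (the cluster name `my_cluster` and the table names are placeholders):
+```bash
+clickhouse-client --query "
+    CREATE TABLE hits_distributed AS default.hits_local
+    ENGINE = Distributed(my_cluster, default, hits_local, rand())
+"
+```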
+
+Things become more complicated when you have subqueries in IN or JOIN clauses, and each of them uses a `Distributed` table. We have different strategies for the execution of these queries.
+
+There is no global query plan for distributed query execution. Each node has its local query plan for its part of the job. We only have simple one-pass distributed query execution: we send queries to remote nodes and then merge the results. But this is not feasible for complicated queries with high cardinality GROUP BYs or with a large amount of temporary data for JOIN. In such cases, we need to “reshuffle” data between servers, which requires additional coordination. ClickHouse does not support that kind of query execution, and we need to work on it.
+
+## Merge Tree {#merge-tree}
+
+`MergeTree` is a family of storage engines that supports indexing by primary key. The primary key can be an arbitrary tuple of columns or expressions. Data in a `MergeTree` table is stored in “parts”. Each part stores data in the primary key order, so data is ordered lexicographically by the primary key tuple. All the table columns are stored in separate `column.bin` files in these parts. The files consist of compressed blocks. Each block is usually from 64 KB to 1 MB of uncompressed data, depending on the average value size. The blocks consist of column values placed contiguously one after the other. Column values are in the same order for each column (the primary key defines the order), so when you iterate by many columns, you get values for the corresponding rows.
+
+The primary key itself is “sparse”. It does not address every single row, but only some ranges of data. A separate `primary.idx` file has the value of the primary key for each N-th row, where N is called `index_granularity` (usually, N = 8192). Also, for each column, we have `column.mrk` files with “marks”, which are offsets to each N-th row in the data file. Each mark is a pair: the offset in the file to the beginning of the compressed block, and the offset in the decompressed block to the beginning of data. Usually, compressed blocks are aligned by marks, and the offset in the decompressed block is zero. Data for `primary.idx` always resides in memory, and data for `column.mrk` files is cached.
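+
+At the SQL level, the index granularity is controlled in the table definition. A minimal sketch (table and column names are placeholders):
+```bash
+clickhouse-client --query "
+    CREATE TABLE hits_example
+    (
+        CounterID UInt32,
+        EventDate Date,
+        UserID UInt64
+    )
+    ENGINE = MergeTree
+    ORDER BY (CounterID, EventDate)
+    SETTINGS index_granularity = 8192
+"
+```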
+
+When we are going to read something from a part in `MergeTree`, we look at `primary.idx` data and locate ranges that could contain requested data, then look at `column.mrk` data and calculate offsets for where to start reading those ranges. Because of sparseness, excess data may be read. ClickHouse is not suitable for a high load of simple point queries, because the entire range with `index_granularity` rows must be read for each key, and the entire compressed block must be decompressed for each column. We made the index sparse because we must be able to maintain trillions of rows per single server without noticeable memory consumption for the index. Also, because the primary key is sparse, it is not unique: it cannot check the existence of the key in the table at INSERT time. You could have many rows with the same key in a table.
+
+When you `INSERT` a bunch of data into `MergeTree`, that bunch is sorted by primary key order and forms a new part. There are background threads that periodically select some parts and merge them into a single sorted part to keep the number of parts relatively low. That’s why it is called `MergeTree`. Of course, merging leads to “write amplification”. All parts are immutable: they are only created and deleted, but not modified. When SELECT is executed, it holds a snapshot of the table (a set of parts). After merging, we also keep old parts for some time to make a recovery after failure easier, so if we see that some merged part is probably broken, we can replace it with its source parts.
+
+`MergeTree` is not an LSM tree because it does not contain MEMTABLE and LOG: inserted data is written directly to the filesystem. This behavior makes MergeTree much more suitable to insert data in batches. Therefore, frequently inserting small numbers of rows is not ideal for MergeTree. For example, a couple of rows per second is OK, but doing it a thousand times a second is not optimal for MergeTree. However, there is an async insert mode for small inserts to overcome this limitation. We did it this way for simplicity’s sake, and because we are already inserting data in batches in our applications.
+
+There are MergeTree engines that are doing additional work during background merges. Examples are `CollapsingMergeTree` and `AggregatingMergeTree`. This could be treated as special support for updates. Keep in mind that these are not real updates because users usually have no control over the time when background merges are executed, and data in a `MergeTree` table is almost always stored in more than one part, not in completely merged form.
+
+## Replication {#replication}
+
+Replication in ClickHouse can be configured on a per-table basis. You could have some replicated and some non-replicated tables on the same server. You could also have tables replicated in different ways, such as one table with two-factor replication and another with three-factor.
+
+Replication is implemented in the `ReplicatedMergeTree` storage engine. The path in `ZooKeeper` is specified as a parameter for the storage engine. All tables with the same path in `ZooKeeper` become replicas of each other: they synchronize their data and maintain consistency. Replicas can be added and removed dynamically simply by creating or dropping a table.
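+
+A sketch of how the `ZooKeeper` path and replica name are passed to the engine (the path and the `{shard}` / `{replica}` macros are placeholders that must match your server configuration):
+```bash
+clickhouse-client --query "
+    CREATE TABLE hits_replicated
+    (
+        EventDate Date,
+        UserID UInt64
+    )
+    ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/hits_replicated', '{replica}')
+    ORDER BY (EventDate, UserID)
+"
+```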
+
+Replication uses an asynchronous multi-master scheme. You can insert data into any replica that has a session with `ZooKeeper`, and data is replicated to all other replicas asynchronously. Because ClickHouse does not support UPDATEs, replication is conflict-free. As there is no quorum acknowledgment of inserts, just-inserted data might be lost if one node fails.
+
+Metadata for replication is stored in ZooKeeper. There is a replication log that lists what actions to do. Actions are: get part; merge parts; drop a partition, and so on. Each replica copies the replication log to its queue and then executes the actions from the queue. For example, on insertion, the “get the part” action is created in the log, and every replica downloads that part. Merges are coordinated between replicas to get byte-identical results. All parts are merged in the same way on all replicas. One of the leaders initiates a new merge first and writes “merge parts” actions to the log. Multiple replicas (or all) can be leaders at the same time. A replica can be prevented from becoming a leader using the `merge_tree` setting `replicated_can_become_leader`. The leaders are responsible for scheduling background merges.
+
+Replication is physical: only compressed parts are transferred between nodes, not queries. Merges are processed on each replica independently in most cases to lower the network costs by avoiding network amplification. Large merged parts are sent over the network only in cases of significant replication lag.
+
+Besides, each replica stores its state in ZooKeeper as the set of parts and their checksums. When the state on the local filesystem diverges from the reference state in ZooKeeper, the replica restores its consistency by downloading missing and broken parts from other replicas. When there is some unexpected or broken data in the local filesystem, ClickHouse does not remove it, but moves it to a separate directory and forgets it.
+
+:::note
+The ClickHouse cluster consists of independent shards, and each shard consists of replicas. The cluster is **not elastic**, so after adding a new shard, data is not rebalanced between shards automatically. Instead, the cluster load is supposed to be adjusted to be uneven. This implementation gives you more control, and it is ok for relatively small clusters, such as tens of nodes. But for clusters with hundreds of nodes that we are using in production, this approach becomes a significant drawback. We should implement a table engine that spans across the cluster with dynamically replicated regions that could be split and balanced between clusters automatically.
+:::
+
+[Original article](https://clickhouse.com/docs/en/development/architecture/)
diff --git a/docs/en/reference/development/browse-code.md b/docs/en/reference/development/browse-code.md
new file mode 100644
index 00000000000..da924c359ff
--- /dev/null
+++ b/docs/en/reference/development/browse-code.md
@@ -0,0 +1,13 @@
+---
+sidebar_label: Source Code Browser
+sidebar_position: 72
+description: Various ways to browse and edit the source code
+---
+
+# Browse ClickHouse Source Code
+
+You can use the **Woboq** online code browser available [here](https://clickhouse.com/codebrowser/ClickHouse/src/index.html). It provides code navigation and semantic highlighting, search and indexing. The code snapshot is updated daily.
+
+Also, you can browse sources on [GitHub](https://github.com/ClickHouse/ClickHouse) as usual.
+
+If you’re wondering which IDE to use, we recommend CLion, QT Creator, VS Code, and KDevelop (with caveats). You can use any IDE you like. Vim and Emacs also count.
diff --git a/docs/en/reference/development/build-cross-arm.md b/docs/en/reference/development/build-cross-arm.md
new file mode 100644
index 00000000000..305c09ae217
--- /dev/null
+++ b/docs/en/reference/development/build-cross-arm.md
@@ -0,0 +1,38 @@
+---
+sidebar_position: 67
+sidebar_label: Build on Linux for AARCH64 (ARM64)
+---
+
+# How to Build ClickHouse on Linux for AARCH64 (ARM64) Architecture
+
+This is for the case when you have a Linux machine and want to use it to build the `clickhouse` binary that will run on another Linux machine with the AARCH64 CPU architecture.
+This is intended for continuous integration checks that run on Linux servers.
+
+The cross-build for AARCH64 is based on the [Build instructions](../development/build.md), follow them first.
+
+## Install Clang-13
+
+Follow the instructions from https://apt.llvm.org/ for your Ubuntu or Debian setup or do
+```
+sudo bash -c "$(wget -O - https://apt.llvm.org/llvm.sh)"
+```
+
+## Install Cross-Compilation Toolset {#install-cross-compilation-toolset}
+
+``` bash
+cd ClickHouse
+mkdir -p build-aarch64/cmake/toolchain/linux-aarch64
+wget 'https://developer.arm.com/-/media/Files/downloads/gnu-a/8.3-2019.03/binrel/gcc-arm-8.3-2019.03-x86_64-aarch64-linux-gnu.tar.xz?revision=2e88a73f-d233-4f96-b1f4-d8b36e9bb0b9&la=en' -O gcc-arm-8.3-2019.03-x86_64-aarch64-linux-gnu.tar.xz
+tar xJf gcc-arm-8.3-2019.03-x86_64-aarch64-linux-gnu.tar.xz -C build-aarch64/cmake/toolchain/linux-aarch64 --strip-components=1
+```
+
+## Build ClickHouse {#build-clickhouse}
+
+``` bash
+cd ClickHouse
+mkdir build-arm64
+CC=clang-13 CXX=clang++-13 cmake . -Bbuild-arm64 -DCMAKE_TOOLCHAIN_FILE=cmake/linux/toolchain-aarch64.cmake
+ninja -C build-arm64
+```
+
+The resulting binary will run only on Linux with the AARCH64 CPU architecture.
diff --git a/docs/en/reference/development/build-cross-osx.md b/docs/en/reference/development/build-cross-osx.md
new file mode 100644
index 00000000000..1dbd0ec6430
--- /dev/null
+++ b/docs/en/reference/development/build-cross-osx.md
@@ -0,0 +1,62 @@
+---
+sidebar_position: 66
+sidebar_label: Build on Linux for Mac OS X
+---
+
+# How to Build ClickHouse on Linux for Mac OS X
+
+This is for the case when you have a Linux machine and want to use it to build the `clickhouse` binary that will run on OS X.
+This is intended for continuous integration checks that run on Linux servers. If you want to build ClickHouse directly on Mac OS X, proceed with [these instructions](../development/build-osx.md) instead.
+
+The cross-build for Mac OS X is based on the [Build instructions](../development/build.md), follow them first.
+
+## Install Clang-13
+
+Follow the instructions from https://apt.llvm.org/ for your Ubuntu or Debian setup.
+For example, the commands for Bionic are:
+
+``` bash
+echo "deb [trusted=yes] http://apt.llvm.org/bionic/ llvm-toolchain-bionic-13 main" | sudo tee -a /etc/apt/sources.list
+sudo apt-get update
+sudo apt-get install clang-13
+```
+
+## Install Cross-Compilation Toolset {#install-cross-compilation-toolset}
+
+Let’s remember the path where we install `cctools` as `${CCTOOLS}`:
+
+``` bash
+export CCTOOLS=$(pwd)/cctools  # example location; adjust to wherever you want cctools installed
+mkdir ${CCTOOLS}
+cd ${CCTOOLS}
+
+git clone https://github.com/tpoechtrager/apple-libtapi.git
+cd apple-libtapi
+INSTALLPREFIX=${CCTOOLS} ./build.sh
+./install.sh
+cd ..
+
+git clone https://github.com/tpoechtrager/cctools-port.git
+cd cctools-port/cctools
+./configure --prefix=$(readlink -f ${CCTOOLS}) --with-libtapi=$(readlink -f ${CCTOOLS}) --target=x86_64-apple-darwin
+make install
+```
+
+Also, we need to download the macOS X SDK into the working tree.
+
+``` bash
+cd ClickHouse
+wget 'https://github.com/phracker/MacOSX-SDKs/releases/download/10.15/MacOSX10.15.sdk.tar.xz'
+mkdir -p build-darwin/cmake/toolchain/darwin-x86_64
+tar xJf MacOSX10.15.sdk.tar.xz -C build-darwin/cmake/toolchain/darwin-x86_64 --strip-components=1
+```
+
+## Build ClickHouse {#build-clickhouse}
+
+``` bash
+cd ClickHouse
+mkdir build-darwin
+cd build-darwin
+CC=clang-13 CXX=clang++-13 cmake -DCMAKE_AR:FILEPATH=${CCTOOLS}/bin/x86_64-apple-darwin-ar -DCMAKE_INSTALL_NAME_TOOL=${CCTOOLS}/bin/x86_64-apple-darwin-install_name_tool -DCMAKE_RANLIB:FILEPATH=${CCTOOLS}/bin/x86_64-apple-darwin-ranlib -DLINKER_NAME=${CCTOOLS}/bin/x86_64-apple-darwin-ld -DCMAKE_TOOLCHAIN_FILE=cmake/darwin/toolchain-x86_64.cmake ..
+ninja
+```
+
+The resulting binary will have a Mach-O executable format and can’t be run on Linux.
diff --git a/docs/en/reference/development/build-cross-riscv.md b/docs/en/reference/development/build-cross-riscv.md
new file mode 100644
index 00000000000..94c0f47a05d
--- /dev/null
+++ b/docs/en/reference/development/build-cross-riscv.md
@@ -0,0 +1,30 @@
+---
+sidebar_position: 68
+sidebar_label: Build on Linux for RISC-V 64
+---
+
+# How to Build ClickHouse on Linux for RISC-V 64 Architecture
+
+As of writing (11.11.2021), building for RISC-V is considered to be highly experimental. Not all features can be enabled.
+
+This is for the case when you have a Linux machine and want to use it to build the `clickhouse` binary that will run on another Linux machine with the RISC-V 64 CPU architecture. This is intended for continuous integration checks that run on Linux servers.
+
+The cross-build for RISC-V 64 is based on the [Build instructions](../development/build.md), follow them first.
+
+## Install Clang-13
+
+Follow the instructions from https://apt.llvm.org/ for your Ubuntu or Debian setup or do
+```
+sudo bash -c "$(wget -O - https://apt.llvm.org/llvm.sh)"
+```
+
+## Build ClickHouse {#build-clickhouse}
+
+``` bash
+cd ClickHouse
+mkdir build-riscv64
+CC=clang-13 CXX=clang++-13 cmake . -Bbuild-riscv64 -G Ninja -DCMAKE_TOOLCHAIN_FILE=cmake/linux/toolchain-riscv64.cmake -DGLIBC_COMPATIBILITY=OFF -DENABLE_LDAP=OFF -DOPENSSL_NO_ASM=ON -DENABLE_JEMALLOC=ON -DENABLE_PARQUET=OFF -DENABLE_ORC=OFF -DUSE_UNWIND=OFF -DENABLE_GRPC=OFF -DENABLE_HDFS=OFF -DENABLE_MYSQL=OFF
+ninja -C build-riscv64
+```
+
+The resulting binary will run only on Linux with the RISC-V 64 CPU architecture.
diff --git a/docs/en/reference/development/build-osx.md b/docs/en/reference/development/build-osx.md
new file mode 100644
index 00000000000..05ef10ad020
--- /dev/null
+++ b/docs/en/reference/development/build-osx.md
@@ -0,0 +1,154 @@
+---
+sidebar_position: 65
+sidebar_label: Build on Mac OS X
+description: How to build ClickHouse on Mac OS X
+---
+
+# How to Build ClickHouse on Mac OS X
+
+:::info You don't have to build ClickHouse yourself!
+You can install pre-built ClickHouse as described in [Quick Start](https://clickhouse.com/#quick-start). Follow **macOS (Intel)** or **macOS (Apple silicon)** installation instructions.
+:::
+
+The build should work on x86_64 (Intel) and arm64 (Apple silicon) based macOS 10.15 (Catalina) and higher with Homebrew's vanilla Clang.
+It is always recommended to use the vanilla `clang` compiler.
+
+:::note
+It is possible to use Xcode's `apple-clang` or `gcc`, but it's strongly discouraged.
+:::
+
+## Install Homebrew {#install-homebrew}
+
+``` bash
+/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
+# ...and follow the printed instructions on any additional steps required to complete the installation.
+```
+
+## Install Xcode and Command Line Tools {#install-xcode-and-command-line-tools}
+
+Install the latest [Xcode](https://apps.apple.com/am/app/xcode/id497799835?mt=12) from App Store.
+
+Open it at least once to accept the end-user license agreement and automatically install the required components.
+
+Then, make sure that the latest Command Line Tools are installed and selected in the system:
+
+``` bash
+sudo rm -rf /Library/Developer/CommandLineTools
+sudo xcode-select --install
+```
+
+## Install Required Compilers, Tools, and Libraries {#install-required-compilers-tools-and-libraries}
+
+``` bash
+brew update
+brew install cmake ninja libtool gettext llvm gcc binutils
+```
+
+## Checkout ClickHouse Sources {#checkout-clickhouse-sources}
+
+``` bash
+git clone --recursive git@github.com:ClickHouse/ClickHouse.git
+# ...alternatively, you can use https://github.com/ClickHouse/ClickHouse.git as the repo URL.
+```
+
+## Build ClickHouse {#build-clickhouse}
+
+To build using Homebrew's vanilla Clang compiler (the only **recommended** way):
+
+``` bash
+cd ClickHouse
+rm -rf build
+mkdir build
+cd build
+cmake -DCMAKE_C_COMPILER=$(brew --prefix llvm)/bin/clang -DCMAKE_CXX_COMPILER=$(brew --prefix llvm)/bin/clang++ -DCMAKE_AR=$(brew --prefix llvm)/bin/llvm-ar -DCMAKE_RANLIB=$(brew --prefix llvm)/bin/llvm-ranlib -DOBJCOPY_PATH=$(brew --prefix llvm)/bin/llvm-objcopy -DCMAKE_BUILD_TYPE=RelWithDebInfo ..
+cmake --build . --config RelWithDebInfo
+# The resulting binary will be created at: ./programs/clickhouse
+```
+
+To build using Xcode's native AppleClang compiler in Xcode IDE (this option is only for development builds and workflows, and is **not recommended** unless you know what you are doing):
+
+``` bash
+cd ClickHouse
+rm -rf build
+mkdir build
+cd build
+XCODE_IDE=1 ALLOW_APPLECLANG=1 cmake -G Xcode -DCMAKE_BUILD_TYPE=Debug -DENABLE_JEMALLOC=OFF ..
+cmake --open .
+# ...then, in Xcode IDE select ALL_BUILD scheme and start the building process.
+# The resulting binary will be created at: ./programs/Debug/clickhouse
+```
+
+To build using Homebrew's vanilla GCC compiler (this option is only for development experiments, and is **absolutely not recommended** unless you really know what you are doing):
+
+``` bash
+cd ClickHouse
+rm -rf build
+mkdir build
+cd build
+cmake -DCMAKE_C_COMPILER=$(brew --prefix gcc)/bin/gcc-11 -DCMAKE_CXX_COMPILER=$(brew --prefix gcc)/bin/g++-11 -DCMAKE_AR=$(brew --prefix gcc)/bin/gcc-ar-11 -DCMAKE_RANLIB=$(brew --prefix gcc)/bin/gcc-ranlib-11 -DOBJCOPY_PATH=$(brew --prefix binutils)/bin/objcopy -DCMAKE_BUILD_TYPE=RelWithDebInfo ..
+cmake --build . --config RelWithDebInfo
+# The resulting binary will be created at: ./programs/clickhouse
+```
+
+## Caveats {#caveats}
+
+If you intend to run `clickhouse-server`, make sure to increase the system’s maxfiles variable.
+
+:::note
+You’ll need to use sudo.
+:::
+
+To do so, create the `/Library/LaunchDaemons/limit.maxfiles.plist` file with the following content:
+
+``` xml
+<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN"
+        "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
+<plist version="1.0">
+  <dict>
+    <key>Label</key>
+    <string>limit.maxfiles</string>
+    <key>ProgramArguments</key>
+    <array>
+      <string>launchctl</string>
+      <string>limit</string>
+      <string>maxfiles</string>
+      <string>524288</string>
+      <string>524288</string>
+    </array>
+    <key>RunAtLoad</key>
+    <true/>
+    <key>ServiceIPC</key>
+    <false/>
+  </dict>
+</plist>
+```
+
+Give the file correct permissions:
+
+``` bash
+sudo chown root:wheel /Library/LaunchDaemons/limit.maxfiles.plist
+```
+
+Validate that the file is correct:
+
+``` bash
+plutil /Library/LaunchDaemons/limit.maxfiles.plist
+```
+
+Load the file (or reboot):
+
+``` bash
+sudo launchctl load -w /Library/LaunchDaemons/limit.maxfiles.plist
+```
+
+To check if it’s working, use the `ulimit -n` or `launchctl limit maxfiles` commands.
+
+## Running ClickHouse server
+
+``` bash
+cd ClickHouse
+./build/programs/clickhouse-server --config-file ./programs/server/config.xml
+```
+
+[Original article](https://clickhouse.com/docs/en/development/build_osx/)
diff --git a/docs/en/reference/development/build.md b/docs/en/reference/development/build.md
new file mode 100644
index 00000000000..b128412a55e
--- /dev/null
+++ b/docs/en/reference/development/build.md
@@ -0,0 +1,181 @@
+---
+sidebar_position: 64
+sidebar_label: Build on Linux
+description: How to build ClickHouse on Linux
+---
+
+# How to Build ClickHouse on Linux
+
+Supported platforms:
+
+- x86_64
+- AArch64
+- Power9 (experimental)
+
+## Normal Build for Development on Ubuntu
+
+The following tutorial is based on the Ubuntu Linux system. With appropriate changes, it should also work on any other Linux distribution.
+
+### Install Git, CMake, Python and Ninja {#install-git-cmake-python-and-ninja}
+
+``` bash
+$ sudo apt-get install git cmake python ninja-build
+```
+
+Or use cmake3 instead of cmake on older systems.
+
+### Install the latest clang (recommended)
+
+On Ubuntu/Debian you can use the automatic installation script (check [official webpage](https://apt.llvm.org/))
+
+```bash
+sudo bash -c "$(wget -O - https://apt.llvm.org/llvm.sh)"
+```
+
+For other Linux distributions, check the availability of [prebuilt packages](https://releases.llvm.org/download.html) or build clang [from sources](https://clang.llvm.org/get_started.html).
+
+#### Use the latest clang for Builds
+
+``` bash
+$ export CC=clang-14
+$ export CXX=clang++-14
+```
+
+In this example we use version 14, which is the latest as of Feb 2022.
+
+GCC can also be used, though it is discouraged.
+
+### Checkout ClickHouse Sources {#checkout-clickhouse-sources}
+
+``` bash
+$ git clone --recursive git@github.com:ClickHouse/ClickHouse.git
+```
+
+or
+
+``` bash
+$ git clone --recursive https://github.com/ClickHouse/ClickHouse.git
+```
+
+### Build ClickHouse {#build-clickhouse}
+
+``` bash
+$ cd ClickHouse
+$ mkdir build
+$ cd build
+$ cmake ..
+$ ninja
+```
+
+To create an executable, run `ninja clickhouse`.
+This will create the `programs/clickhouse` executable, which can be used with `client` or `server` arguments.
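+
+For example, the same binary can then be started as a server or used as a client (a quick sketch; it assumes you run it from the `build` directory and point it at the default config shipped in the sources):
+``` bash
+./programs/clickhouse server --config-file=../programs/server/config.xml
+# in another terminal:
+./programs/clickhouse client
+```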
+
+## How to Build ClickHouse on Any Linux {#how-to-build-clickhouse-on-any-linux}
+
+The build requires the following components:
+
+- Git (used only to checkout the sources, not needed for the build itself)
+- CMake 3.10 or newer
+- Ninja
+- C++ compiler: clang-13 or newer
+- Linker: lld
+
+If all the components are installed, you may build in the same way as the steps above.
+
+Example for Ubuntu Eoan:
+``` bash
+sudo apt update
+sudo apt install git cmake ninja-build clang++ python
+git clone --recursive https://github.com/ClickHouse/ClickHouse.git
+mkdir build && cd build
+cmake ../ClickHouse
+ninja
+```
+
+Example for OpenSUSE Tumbleweed:
+``` bash
+sudo zypper install git cmake ninja clang-c++ python lld
+git clone --recursive https://github.com/ClickHouse/ClickHouse.git
+mkdir build && cd build
+cmake ../ClickHouse
+ninja
+```
+
+Example for Fedora Rawhide:
+``` bash
+sudo yum update
+yum --nogpg install git cmake make clang-c++ python3
+git clone --recursive https://github.com/ClickHouse/ClickHouse.git
+mkdir build && cd build
+cmake ../ClickHouse
+make -j $(nproc)
+```
+
+Here is an example of how to build `clang` and all the llvm infrastructure from sources:
+
+```
+ git clone git@github.com:llvm/llvm-project.git
+ mkdir llvm-build && cd llvm-build
+ cmake -DCMAKE_BUILD_TYPE:STRING=Release -DLLVM_ENABLE_PROJECTS=all ../llvm-project/llvm/
+ make -j16
+ sudo make install
+ hash clang
+ clang --version
+```
+
+You can install an older clang, such as clang-11, from packages and then use it to build the new clang from sources.
+
+Here is an example of how to install the new `cmake` from the official website:
+
+```
+wget https://github.com/Kitware/CMake/releases/download/v3.22.2/cmake-3.22.2-linux-x86_64.sh
+chmod +x cmake-3.22.2-linux-x86_64.sh
+./cmake-3.22.2-linux-x86_64.sh
+export PATH=/home/milovidov/work/cmake-3.22.2-linux-x86_64/bin/:${PATH}
+hash cmake
+```
+
+## How to Build ClickHouse Debian Package {#how-to-build-clickhouse-debian-package}
+
+### Install Git {#install-git}
+
+``` bash
+$ sudo apt-get update
+$ sudo apt-get install git python debhelper lsb-release fakeroot sudo debian-archive-keyring debian-keyring
+```
+
+### Checkout ClickHouse Sources {#checkout-clickhouse-sources-1}
+
+``` bash
+$ git clone --recursive --branch master https://github.com/ClickHouse/ClickHouse.git
+$ cd ClickHouse
+```
+
+### Run Release Script {#run-release-script}
+
+``` bash
+$ ./release
+```
+
+## You Don’t Have to Build ClickHouse {#you-dont-have-to-build-clickhouse}
+
+ClickHouse is available in pre-built binaries and packages. Binaries are portable and can be run on any Linux flavour.
+
+They are built for stable, prestable and testing releases, as well as for every commit to master and for every pull request.
+
+To find the freshest build from `master`, go to the [commits page](https://github.com/ClickHouse/ClickHouse/commits/master), click on the first green check mark or red cross near the commit, and click the “Details” link right after “ClickHouse Build Check”.
+
+## Faster builds for development: Split build configuration {#split-build}
+
+Normally, ClickHouse is statically linked into a single static `clickhouse` binary with minimal dependencies. This is convenient for distribution, but it means that on every change the entire binary needs to be linked, which is slow and may be inconvenient for development. There is an alternative configuration which instead creates dynamically loaded shared libraries and separate binaries `clickhouse-server`, `clickhouse-client` etc., allowing for faster incremental builds. To use it, add the following flags to your `cmake` invocation:
+```
+-DUSE_STATIC_LIBRARIES=0 -DSPLIT_SHARED_LIBRARIES=1 -DCLICKHOUSE_SPLIT_BINARY=1
+```
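+
+For instance, a complete configuration command for a split build might look like the following (a sketch; the compiler variables and the separate `build_split` directory are assumptions about your setup):
+
+``` bash
+$ export CC=clang CXX=clang++
+$ mkdir build_split && cd build_split
+$ cmake -DUSE_STATIC_LIBRARIES=0 -DSPLIT_SHARED_LIBRARIES=1 -DCLICKHOUSE_SPLIT_BINARY=1 ..
+$ ninja
+```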
+
+Note that the split build has several drawbacks:
+* There is no single `clickhouse` binary, and you have to run `clickhouse-server`, `clickhouse-client`, etc.
+* Risk of segfault if you run any of the programs while rebuilding the project.
+* You cannot run the integration tests since they only work with a single complete binary.
+* You can't easily copy the binaries elsewhere. Instead of moving a single binary you'll need to copy all binaries and libraries.
+
+[Original article](https://clickhouse.com/docs/en/development/build/)
diff --git a/docs/en/reference/development/continuous-integration.md b/docs/en/reference/development/continuous-integration.md
new file mode 100644
index 00000000000..b3172d103f0
--- /dev/null
+++ b/docs/en/reference/development/continuous-integration.md
@@ -0,0 +1,193 @@
+---
+sidebar_position: 62
+sidebar_label: Continuous Integration Checks
+description: When you submit a pull request, some automated checks are run for your code by the ClickHouse continuous integration (CI) system
+---
+
+# Continuous Integration Checks
+
+When you submit a pull request, some automated checks are run for your code by
+the ClickHouse [continuous integration (CI) system](tests.md#test-automation).
+This happens after a repository maintainer (someone from ClickHouse team) has
+screened your code and added the `can be tested` label to your pull request.
+The results of the checks are listed on the GitHub pull request page as
+described in the [GitHub checks
+documentation](https://docs.github.com/en/github/collaborating-with-issues-and-pull-requests/about-status-checks).
+If a check is failing, you might be required to fix it. This page gives an
+overview of checks you may encounter, and what you can do to fix them.
+
+If it looks like the check failure is not related to your changes, it may be
+some transient failure or an infrastructure problem. Push an empty commit to
+the pull request to restart the CI checks:
+```
+git reset
+git commit --allow-empty
+git push
+```
+
+If you are not sure what to do, ask a maintainer for help.
+
+
+## Merge With Master
+
+Verifies that the PR can be merged to master. If not, it will fail with the
+message 'Cannot fetch mergecommit'. To fix this check, resolve the conflict as
+described in the [GitHub
+documentation](https://docs.github.com/en/github/collaborating-with-issues-and-pull-requests/resolving-a-merge-conflict-on-github),
+or merge the `master` branch to your pull request branch using git.
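+
+For example, assuming you have an `upstream` remote pointing at `ClickHouse/ClickHouse` (the remote name is an assumption about your local setup), the merge could look like this:
+```
+git fetch upstream
+git merge upstream/master
+# resolve any conflicts, commit, then update the PR
+git push
+```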
+
+
+## Docs check
+
+Tries to build the ClickHouse documentation website. It can fail if you changed
+something in the documentation. The most probable reason is that some cross-link in
+the documentation is wrong. Go to the check report and look for `ERROR` and `WARNING` messages.
+
+### Report Details
+
+- [Status page example](https://clickhouse-test-reports.s3.yandex.net/12550/eabcc293eb02214caa6826b7c15f101643f67a6b/docs_check.html)
+- `docs_output.txt` contains the building log. [Successful result example](https://clickhouse-test-reports.s3.yandex.net/12550/eabcc293eb02214caa6826b7c15f101643f67a6b/docs_check/docs_output.txt)
+
+
+## Description Check
+
+Check that the description of your pull request conforms to the template
+[PULL_REQUEST_TEMPLATE.md](https://github.com/ClickHouse/ClickHouse/blob/master/.github/PULL_REQUEST_TEMPLATE.md).
+You have to specify a changelog category for your change (e.g., Bug Fix), and
+write a user-readable message describing the change for [CHANGELOG.md](../whats-new/changelog/index.md).
+
+
+## Push To Dockerhub
+
+Builds docker images used for build and tests, then pushes them to DockerHub.
+
+
+## Marker Check
+
+This check means that the CI system started to process the pull request. When it has 'pending' status, it means that not all checks have been started yet. After all checks have been started, it changes status to 'success'.
+
+
+## Style Check
+
+Performs some simple regex-based checks of code style, using the [`utils/check-style/check-style`](https://github.com/ClickHouse/ClickHouse/blob/master/utils/check-style/check-style) binary (note that it can be run locally).
+If it fails, fix the style errors following the [code style guide](style.md).
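+
+A minimal sketch of running the check locally (assuming you are in the repository root; the script's exact behaviour may differ between versions):
+```
+./utils/check-style/check-style
+```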
+
+### Report Details
+- [Status page example](https://clickhouse-test-reports.s3.yandex.net/12550/659c78c7abb56141723af6a81bfae39335aa8cb2/style_check.html)
+- `output.txt` contains the errors found by the check (invalid tabulation, etc.); a blank page means no errors. [Successful result example](https://clickhouse-test-reports.s3.yandex.net/12550/659c78c7abb56141723af6a81bfae39335aa8cb2/style_check/output.txt).
+
+
+## Fast Test
+Normally this is the first check that is run for a PR. It builds ClickHouse and
+runs most of [stateless functional tests](tests.md#functional-tests), omitting
+some. If it fails, further checks are not started until it is fixed. Look at
+the report to see which tests fail, then reproduce the failure locally as
+described [here](tests.md#functional-test-locally).
+
+### Report Details
+[Status page example](https://clickhouse-test-reports.s3.yandex.net/12550/67d716b5cc3987801996c31a67b31bf141bc3486/fast_test.html)
+
+#### Status Page Files
+- `runlog.out.log` is the general log that includes all other logs.
+- `test_log.txt`
+- `submodule_log.txt` contains the messages about cloning and checking out the needed submodules.
+- `stderr.log`
+- `stdout.log`
+- `clickhouse-server.log`
+- `clone_log.txt`
+- `install_log.txt`
+- `clickhouse-server.err.log`
+- `build_log.txt`
+- `cmake_log.txt` contains messages about the C/C++ and Linux flags check.
+
+#### Status Page Columns
+
+- *Test name* contains the name of the test (without the path; e.g. all types of tests are stripped down to the name).
+- *Test status* -- one of _Skipped_, _Success_, or _Fail_.
+- *Test time, sec.* -- empty on this test.
+
+
+## Build Check {#build-check}
+
+Builds ClickHouse in various configurations for use in further steps. You have to fix the builds that fail. Build logs often have enough information to fix the error, but you might have to reproduce the failure locally. The `cmake` options can be found in the build log by grepping for `cmake`. Use these options and follow the [general build process](../development/build.md).
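+
+For example, to pull the configuration options out of a downloaded build log (the file name is illustrative):
+```
+grep -m 1 cmake build_log.txt
+```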
+
+### Report Details
+
+[Status page example](https://clickhouse-builds.s3.yandex.net/12550/67d716b5cc3987801996c31a67b31bf141bc3486/clickhouse_build_check/report.html).
+
+- **Compiler**: `gcc-9` or `clang-10` (or `clang-10-xx` for other architectures e.g. `clang-10-freebsd`).
+- **Build type**: `Debug` or `RelWithDebInfo` (cmake).
+- **Sanitizer**: `none` (without sanitizers), `address` (ASan), `memory` (MSan), `undefined` (UBSan), or `thread` (TSan).
+- **Splitted**: `splitted` means a [split build](../development/build.md#split-build).
+- **Status**: `success` or `fail`
+- **Build log**: link to the building and files copying log, useful when build failed.
+- **Build time**.
+- **Artifacts**: build result files (with `XXX` being the server version e.g. `20.8.1.4344`).
+ - `clickhouse-client_XXX_all.deb`
+ - `clickhouse-common-static-dbg_XXX[+asan, +msan, +ubsan, +tsan]_amd64.deb`
+ - `clickhouse-common-staticXXX_amd64.deb`
+ - `clickhouse-server_XXX_all.deb`
+ - `clickhouse_XXX_amd64.buildinfo`
+ - `clickhouse_XXX_amd64.changes`
+ - `clickhouse`: Main built binary.
+ - `clickhouse-odbc-bridge`
+ - `unit_tests_dbms`: GoogleTest binary with ClickHouse unit tests.
+ - `shared_build.tgz`: build with shared libraries.
+ - `performance.tgz`: Special package for performance tests.
+
+
+## Special Build Check
+Performs static analysis and code style checks using `clang-tidy`. The report is similar to the [build check](#build-check). Fix the errors found in the build log.
+
+
+## Functional Stateless Tests
+Runs [stateless functional tests](tests.md#functional-tests) for ClickHouse
+binaries built in various configurations -- release, debug, with sanitizers,
+etc. Look at the report to see which tests fail, then reproduce the failure
+locally as described [here](tests.md#functional-test-locally). Note that you
+have to use the correct build configuration to reproduce -- a test might fail
+under AddressSanitizer but pass in Debug. Download the binary from [CI build
+checks page](../development/build.md#you-dont-have-to-build-clickhouse), or build it locally.
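+
+A sketch of reproducing a single test locally (assumes a server built in the matching configuration is running, and the test name is hypothetical):
+```
+cd tests
+./clickhouse-test 01234_example_test
+```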
+
+
+## Functional Stateful Tests
+Runs [stateful functional tests](tests.md#functional-tests). Treat them in the same way as the functional stateless tests. The difference is that they require `hits` and `visits` tables from the [clickstream dataset](../example-datasets/metrica.md) to run.
+
+
+## Integration Tests
+Runs [integration tests](tests.md#integration-tests).
+
+
+## Testflows Check
+Runs some tests using the Testflows test system. See [here](https://github.com/ClickHouse/ClickHouse/tree/master/tests/testflows#running-tests-locally) for how to run them locally.
+
+
+## Stress Test
+Runs stateless functional tests concurrently from several clients to detect
+concurrency-related errors. If it fails:
+
+ * Fix all other test failures first;
+ * Look at the report to find the server logs and check them for possible causes
+ of error.
+
+
+## Split Build Smoke Test
+
+Checks that the server build in [split build](../development/build.md#split-build)
+configuration can start and run simple queries. If it fails:
+
+ * Fix other test errors first;
+ * Build the server in [split build](../development/build.md#split-build) configuration
+   locally and check whether it can start and run `select 1`, for example as sketched below.
+
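+A sketch of such a local check (the build directory name and binary paths are assumptions about your setup):
+```
+./build_split/programs/clickhouse-server --config-file ./programs/server/config.xml &
+./build_split/programs/clickhouse-client --query "SELECT 1"
+```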
+
+## Compatibility Check
+Checks that `clickhouse` binary runs on distributions with old libc versions. If it fails, ask a maintainer for help.
+
+
+## AST Fuzzer
+Runs randomly generated queries to catch program errors. If it fails, ask a maintainer for help.
+
+
+## Performance Tests
+Measures changes in query performance. This is the longest check, taking just under 6 hours to run. The performance test report is described in detail [here](https://github.com/ClickHouse/ClickHouse/tree/master/docker/test/performance-comparison#how-to-read-the-report).
diff --git a/docs/en/reference/development/contrib.md b/docs/en/reference/development/contrib.md
new file mode 100644
index 00000000000..7cbe32fdd8b
--- /dev/null
+++ b/docs/en/reference/development/contrib.md
@@ -0,0 +1,107 @@
+---
+sidebar_position: 71
+sidebar_label: Third-Party Libraries
+description: A list of third-party libraries used
+---
+
+# Third-Party Libraries Used
+
+The list of third-party libraries:
+
+| Library name | License type |
+|:-|:-|
+| abseil-cpp | [Apache](https://github.com/ClickHouse-Extras/abseil-cpp/blob/4f3b686f86c3ebaba7e4e926e62a79cb1c659a54/LICENSE) |
+| AMQP-CPP | [Apache](https://github.com/ClickHouse-Extras/AMQP-CPP/blob/1a6c51f4ac51ac56610fa95081bd2f349911375a/LICENSE) |
+| arrow | [Apache](https://github.com/ClickHouse-Extras/arrow/blob/078e21bad344747b7656ef2d7a4f7410a0a303eb/LICENSE.txt) |
+| avro | [Apache](https://github.com/ClickHouse-Extras/avro/blob/e43c46e87fd32eafdc09471e95344555454c5ef8/LICENSE.txt) |
+| aws | [Apache](https://github.com/ClickHouse-Extras/aws-sdk-cpp/blob/7d48b2c8193679cc4516e5bd68ae4a64b94dae7d/LICENSE.txt) |
+| aws-c-common | [Apache](https://github.com/ClickHouse-Extras/aws-c-common/blob/736a82d1697c108b04a277e66438a7f4e19b6857/LICENSE) |
+| aws-c-event-stream | [Apache](https://github.com/ClickHouse-Extras/aws-c-event-stream/blob/3bc33662f9ccff4f4cbcf9509cc78c26e022fde0/LICENSE) |
+| aws-checksums | [Apache](https://github.com/ClickHouse-Extras/aws-checksums/blob/519d6d9093819b6cf89ffff589a27ef8f83d0f65/LICENSE) |
+| base64 | [BSD 2-clause](https://github.com/ClickHouse-Extras/Turbo-Base64/blob/af9b331f2b4f30b41c70f3a571ff904a8251c1d3/LICENSE) |
+| boost | [Boost](https://github.com/ClickHouse-Extras/boost/blob/9cf09dbfd55a5c6202dedbdf40781a51b02c2675/LICENSE_1_0.txt) |
+| boringssl | [BSD](https://github.com/ClickHouse-Extras/boringssl/blob/a6a2e2ab3e44d97ce98e51c558e989f211de7eb3/LICENSE) |
+| brotli | [MIT](https://github.com/google/brotli/blob/63be8a99401992075c23e99f7c84de1c653e39e2/LICENSE) |
+| capnproto | [MIT](https://github.com/capnproto/capnproto/blob/a00ccd91b3746ef2ab51d40fe3265829949d1ace/LICENSE) |
+| cassandra | [Apache](https://github.com/ClickHouse-Extras/cpp-driver/blob/eb9b68dadbb4417a2c132ad4a1c2fa76e65e6fc1/LICENSE.txt) |
+| cctz | [Apache](https://github.com/ClickHouse-Extras/cctz/blob/c0f1bcb97fd2782f7c3f972fadd5aad5affac4b8/LICENSE.txt) |
+| cityhash102 | [MIT](https://github.com/ClickHouse/ClickHouse/blob/master/contrib/cityhash102/COPYING) |
+| cppkafka | [BSD 2-clause](https://github.com/mfontanini/cppkafka/blob/5a119f689f8a4d90d10a9635e7ee2bee5c127de1/LICENSE) |
+| croaring | [Apache](https://github.com/RoaringBitmap/CRoaring/blob/2c867e9f9c9e2a3a7032791f94c4c7ae3013f6e0/LICENSE) |
+| curl | [Apache](https://github.com/curl/curl/blob/3b8bbbbd1609c638a3d3d0acb148a33dedb67be3/docs/LICENSE-MIXING.md) |
+| cyrus-sasl | [BSD 2-clause](https://github.com/ClickHouse-Extras/cyrus-sasl/blob/e6466edfd638cc5073debe941c53345b18a09512/COPYING) |
+| double-conversion | [BSD 3-clause](https://github.com/google/double-conversion/blob/cf2f0f3d547dc73b4612028a155b80536902ba02/LICENSE) |
+| dragonbox | [Apache](https://github.com/ClickHouse-Extras/dragonbox/blob/923705af6fd953aa948fc175f6020b15f7359838/LICENSE-Apache2-LLVM) |
+| fast_float | [Apache](https://github.com/fastfloat/fast_float/blob/7eae925b51fd0f570ccd5c880c12e3e27a23b86f/LICENSE) |
+| fastops | [MIT](https://github.com/ClickHouse-Extras/fastops/blob/88752a5e03cf34639a4a37a4b41d8b463fffd2b5/LICENSE) |
+| flatbuffers | [Apache](https://github.com/ClickHouse-Extras/flatbuffers/blob/eb3f827948241ce0e701516f16cd67324802bce9/LICENSE.txt) |
+| fmtlib | [Unknown](https://github.com/fmtlib/fmt/blob/c108ee1d590089ccf642fc85652b845924067af2/LICENSE.rst) |
+| gcem | [Apache](https://github.com/kthohr/gcem/blob/8d4f1b5d76ea8f6ff12f3f4f34cda45424556b00/LICENSE) |
+| googletest | [BSD 3-clause](https://github.com/google/googletest/blob/e7e591764baba0a0c3c9ad0014430e7a27331d16/LICENSE) |
+| grpc | [Apache](https://github.com/ClickHouse-Extras/grpc/blob/60c986e15cae70aade721d26badabab1f822fdd6/LICENSE) |
+| h3 | [Apache](https://github.com/ClickHouse-Extras/h3/blob/c7f46cfd71fb60e2fefc90e28abe81657deff735/LICENSE) |
+| hyperscan | [Boost](https://github.com/ClickHouse-Extras/hyperscan/blob/e9f08df0213fc637aac0a5bbde9beeaeba2fe9fa/LICENSE) |
+| icu | [Public Domain](https://github.com/unicode-org/icu/blob/a56dde820dc35665a66f2e9ee8ba58e75049b668/icu4c/LICENSE) |
+| icudata | [Public Domain](https://github.com/ClickHouse-Extras/icudata/blob/72d9a4a7febc904e2b0a534ccb25ae40fac5f1e5/LICENSE) |
+| jemalloc | [BSD 2-clause](https://github.com/ClickHouse-Extras/jemalloc/blob/e6891d9746143bf2cf617493d880ba5a0b9a3efd/COPYING) |
+| krb5 | [MIT](https://github.com/ClickHouse-Extras/krb5/blob/5149dea4e2be0f67707383d2682b897c14631374/src/lib/gssapi/LICENSE) |
+| libc-headers | [LGPL](https://github.com/ClickHouse-Extras/libc-headers/blob/a720b7105a610acbd7427eea475a5b6810c151eb/LICENSE) |
+| libcpuid | [BSD 2-clause](https://github.com/ClickHouse-Extras/libcpuid/blob/8db3b8d2d32d22437f063ce692a1b9bb15e42d18/COPYING) |
+| libcxx | [Apache](https://github.com/ClickHouse-Extras/libcxx/blob/2fa892f69acbaa40f8a18c6484854a6183a34482/LICENSE.TXT) |
+| libcxxabi | [Apache](https://github.com/ClickHouse-Extras/libcxxabi/blob/df8f1e727dbc9e2bedf2282096fa189dc3fe0076/LICENSE.TXT) |
+| libdivide | [zLib](https://github.com/ClickHouse/ClickHouse/blob/master/contrib/libdivide/LICENSE.txt) |
+| libfarmhash | [MIT](https://github.com/ClickHouse/ClickHouse/blob/master/contrib/libfarmhash/COPYING) |
+| libgsasl | [LGPL](https://github.com/ClickHouse-Extras/libgsasl/blob/383ee28e82f69fa16ed43b48bd9c8ee5b313ab84/LICENSE) |
+| libhdfs3 | [Apache](https://github.com/ClickHouse-Extras/libhdfs3/blob/095b9d48b400abb72d967cb0539af13b1e3d90cf/LICENSE.txt) |
+| libmetrohash | [Apache](https://github.com/ClickHouse/ClickHouse/blob/master/contrib/libmetrohash/LICENSE) |
+| libpq | [Unknown](https://github.com/ClickHouse-Extras/libpq/blob/e071ea570f8985aa00e34f5b9d50a3cfe666327e/COPYRIGHT) |
+| libpqxx | [BSD 3-clause](https://github.com/ClickHouse-Extras/libpqxx/blob/357608d11b7a1961c3fb7db2ef9a5dbb2e87da77/COPYING) |
+| librdkafka | [MIT](https://github.com/ClickHouse-Extras/librdkafka/blob/b8554f1682062c85ba519eb54ef2f90e02b812cb/LICENSE.murmur2) |
+| libunwind | [Apache](https://github.com/ClickHouse-Extras/libunwind/blob/6b816d2fba3991f8fd6aaec17d92f68947eab667/LICENSE.TXT) |
+| libuv | [BSD](https://github.com/ClickHouse-Extras/libuv/blob/e2e9b7e9f978ce8a1367b5fe781d97d1ce9f94ab/LICENSE) |
+| llvm | [Apache](https://github.com/ClickHouse-Extras/llvm/blob/e5751459412bce1391fb7a2e9bbc01e131bf72f1/llvm/LICENSE.TXT) |
+| lz4 | [BSD](https://github.com/lz4/lz4/blob/f39b79fb02962a1cd880bbdecb6dffba4f754a11/LICENSE) |
+| mariadb-connector-c | [LGPL](https://github.com/ClickHouse-Extras/mariadb-connector-c/blob/5f4034a3a6376416504f17186c55fe401c6d8e5e/COPYING.LIB) |
+| miniselect | [Boost](https://github.com/danlark1/miniselect/blob/be0af6bd0b6eb044d1acc4f754b229972d99903a/LICENSE_1_0.txt) |
+| msgpack-c | [Boost](https://github.com/msgpack/msgpack-c/blob/46684265d50b5d1b062d4c5c428ba08462844b1d/LICENSE_1_0.txt) |
+| murmurhash | [Public Domain](https://github.com/ClickHouse/ClickHouse/blob/master/contrib/murmurhash/LICENSE) |
+| NuRaft | [Apache](https://github.com/ClickHouse-Extras/NuRaft/blob/7ecb16844af6a9c283ad432d85ecc2e7d1544676/LICENSE) |
+| openldap | [Unknown](https://github.com/ClickHouse-Extras/openldap/blob/0208811b6043ca06fda8631a5e473df1ec515ccb/LICENSE) |
+| orc | [Apache](https://github.com/ClickHouse-Extras/orc/blob/0a936f6bbdb9303308973073f8623b5a8d82eae1/LICENSE) |
+| poco | [Boost](https://github.com/ClickHouse-Extras/poco/blob/7351c4691b5d401f59e3959adfc5b4fa263b32da/LICENSE) |
+| protobuf | [BSD 3-clause](https://github.com/ClickHouse-Extras/protobuf/blob/75601841d172c73ae6bf4ce8121f42b875cdbabd/LICENSE) |
+| rapidjson | [MIT](https://github.com/ClickHouse-Extras/rapidjson/blob/c4ef90ccdbc21d5d5a628d08316bfd301e32d6fa/bin/jsonschema/LICENSE) |
+| re2 | [BSD 3-clause](https://github.com/google/re2/blob/13ebb377c6ad763ca61d12dd6f88b1126bd0b911/LICENSE) |
+| replxx | [BSD 3-clause](https://github.com/ClickHouse-Extras/replxx/blob/c81be6c68b146f15f2096b7ef80e3f21fe27004c/LICENSE.md) |
+| rocksdb | [BSD 3-clause](https://github.com/ClickHouse-Extras/rocksdb/blob/b6480c69bf3ab6e298e0d019a07fd4f69029b26a/LICENSE.leveldb) |
+| s2geometry | [Apache](https://github.com/ClickHouse-Extras/s2geometry/blob/20ea540d81f4575a3fc0aea585aac611bcd03ede/LICENSE) |
+| sentry-native | [MIT](https://github.com/ClickHouse-Extras/sentry-native/blob/94644e92f0a3ff14bd35ed902a8622a2d15f7be4/LICENSE) |
+| simdjson | [Apache](https://github.com/simdjson/simdjson/blob/8df32cea3359cb30120795da6020b3b73da01d38/LICENSE) |
+| snappy | [Public Domain](https://github.com/google/snappy/blob/3f194acb57e0487531c96b97af61dcbd025a78a3/COPYING) |
+| sparsehash-c11 | [BSD 3-clause](https://github.com/sparsehash/sparsehash-c11/blob/cf0bffaa456f23bc4174462a789b90f8b6f5f42f/LICENSE) |
+| stats | [Apache](https://github.com/kthohr/stats/blob/b6dd459c10a88c7ea04693c007e9e35820c5d9ad/LICENSE) |
+| thrift | [Apache](https://github.com/apache/thrift/blob/010ccf0a0c7023fea0f6bf4e4078ebdff7e61982/LICENSE) |
+| unixodbc | [LGPL](https://github.com/ClickHouse-Extras/UnixODBC/blob/b0ad30f7f6289c12b76f04bfb9d466374bb32168/COPYING) |
+| xz | [Public Domain](https://github.com/xz-mirror/xz/blob/869b9d1b4edd6df07f819d360d306251f8147353/COPYING) |
+| zlib-ng | [zLib](https://github.com/ClickHouse-Extras/zlib-ng/blob/6a5e93b9007782115f7f7e5235dedc81c4f1facb/LICENSE.md) |
+| zstd | [BSD](https://github.com/facebook/zstd/blob/a488ba114ec17ea1054b9057c26a046fc122b3b6/LICENSE) |
+
+The list of third-party libraries can be obtained by the following query:
+
+``` sql
+SELECT library_name, license_type, license_path FROM system.licenses ORDER BY library_name COLLATE 'en';
+```
+
+[Example](https://gh-api.clickhouse.com/play?user=play#U0VMRUNUIGxpYnJhcnlfbmFtZSwgbGljZW5zZV90eXBlLCBsaWNlbnNlX3BhdGggRlJPTSBzeXN0ZW0ubGljZW5zZXMgT1JERVIgQlkgbGlicmFyeV9uYW1lIENPTExBVEUgJ2VuJw==)
+
+## Guidelines for adding new third-party libraries and maintaining custom changes in them {#adding-third-party-libraries}
+
+1. All external third-party code should reside in the dedicated directories under `contrib` directory of ClickHouse repo. Prefer Git submodules, when available.
+2. Fork/mirror the official repo in [Clickhouse-extras](https://github.com/ClickHouse-Extras). Prefer official GitHub repos, when available.
+3. Branch from the branch you want to integrate, e.g., `master` -> `clickhouse/master`, or `release/vX.Y.Z` -> `clickhouse/release/vX.Y.Z`.
+4. All forks in [Clickhouse-extras](https://github.com/ClickHouse-Extras) can be automatically synchronized with upstreams. `clickhouse/...` branches will remain unaffected, since virtually nobody is going to use that naming pattern in their upstream repos.
+5. Add submodules under the `contrib` directory of the ClickHouse repo that refer to the above forks/mirrors. Set the submodules to track the corresponding `clickhouse/...` branches (a sketch of this step is shown after this list).
+6. Every time custom changes have to be made in the library code, a dedicated branch should be created, like `clickhouse/my-fix`. Then this branch should be merged into the branch that is tracked by the submodule, e.g., `clickhouse/master` or `clickhouse/release/vX.Y.Z`.
+7. No code should be pushed to any branch of the forks in [Clickhouse-extras](https://github.com/ClickHouse-Extras) whose name does not follow the `clickhouse/...` pattern.
+8. Always write the custom changes with the official repo in mind. Once the PR is merged from (a feature/fix branch in) your personal fork into the fork in [Clickhouse-extras](https://github.com/ClickHouse-Extras), and the submodule is bumped in the ClickHouse repo, consider opening another PR from (a feature/fix branch in) the fork in [Clickhouse-extras](https://github.com/ClickHouse-Extras) to the official repo of the library. This will make sure that 1) the contribution has more than a single use case and importance, 2) others will also benefit from it, and 3) the change will not remain a maintenance burden solely on ClickHouse developers.
+9. When a submodule needs to start using newer code from the original branch (e.g., `master`), and since the custom changes might be merged into the branch it is tracking (e.g., `clickhouse/master`) and so may diverge from its original counterpart (i.e., `master`), a careful merge should be carried out first, i.e., `master` -> `clickhouse/master`, and only then can the submodule be bumped in ClickHouse.
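+
+As an illustration of step 5, adding a hypothetical library `somelib` as a submodule that tracks its `clickhouse/master` branch might look like this (the name and URL are placeholders):
+
+``` bash
+git submodule add -b clickhouse/master https://github.com/ClickHouse-Extras/somelib.git contrib/somelib
+git commit -m "Add somelib submodule"
+```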
diff --git a/docs/en/reference/development/developer-instruction.md b/docs/en/reference/development/developer-instruction.md
new file mode 100644
index 00000000000..291e57fef66
--- /dev/null
+++ b/docs/en/reference/development/developer-instruction.md
@@ -0,0 +1,278 @@
+---
+sidebar_position: 61
+sidebar_label: Getting Started
+description: Prerequisites and an overview of how to build ClickHouse
+---
+
+# Getting Started Guide for Building ClickHouse
+
+The building of ClickHouse is supported on Linux, FreeBSD and Mac OS X.
+
+If you use Windows, you need to create a virtual machine with Ubuntu. To start working with a virtual machine please install VirtualBox. You can download Ubuntu from the website: https://www.ubuntu.com/#download. Please create a virtual machine from the downloaded image (you should reserve at least 4GB of RAM for it). To run a command-line terminal in Ubuntu, please locate a program containing the word “terminal” in its name (gnome-terminal, konsole etc.) or just press Ctrl+Alt+T.
+
+ClickHouse cannot work or build on a 32-bit system. You should acquire access to a 64-bit system before continuing.
+
+## Creating a Repository on GitHub {#creating-a-repository-on-github}
+
+To start working with ClickHouse repository you will need a GitHub account.
+
+You probably already have one, but if you do not, please register at https://github.com. In case you do not have SSH keys, you should generate them and then upload them to GitHub. They are required for sending over your patches. It is also possible to use the same SSH keys that you use with any other SSH servers - probably you already have those.
+
+Create a fork of the ClickHouse repository. To do that, please click the “fork” button in the upper right corner at https://github.com/ClickHouse/ClickHouse. It will create your own copy of ClickHouse/ClickHouse in your account.
+
+The development process consists of first committing the intended changes into your fork of ClickHouse and then creating a “pull request” for these changes to be accepted into the main repository (ClickHouse/ClickHouse).
+
+To work with git repositories, please install `git`.
+
+To do that in Ubuntu you would run in the command line terminal:
+
+ sudo apt update
+ sudo apt install git
+
+A brief manual on using Git can be found here: https://education.github.com/git-cheat-sheet-education.pdf.
+For a detailed manual on Git see https://git-scm.com/book/en/v2.
+
+## Cloning a Repository to Your Development Machine {#cloning-a-repository-to-your-development-machine}
+
+Next, you need to download the source files onto your working machine. This is called “to clone a repository” because it creates a local copy of the repository on your working machine.
+
+In the command line terminal run:
+
+ git clone --recursive git@github.com:your_github_username/ClickHouse.git
+ cd ClickHouse
+
+Note: please substitute *your_github_username* with what is appropriate!
+
+This command will create a directory `ClickHouse` containing the working copy of the project.
+
+It is important that the path to the working directory contains no whitespace, as it may lead to problems with running the build system.
+
+Please note that ClickHouse repository uses `submodules`. That is what the references to additional repositories are called (i.e. external libraries on which the project depends). It means that when cloning the repository you need to specify the `--recursive` flag as in the example above. If the repository has been cloned without submodules, to download them you need to run the following:
+
+ git submodule init
+ git submodule update
+
+You can check the status with the command: `git submodule status`.
+
+If you get the following error message:
+
+ Permission denied (publickey).
+ fatal: Could not read from remote repository.
+
+ Please make sure you have the correct access rights
+ and the repository exists.
+
+It generally means that the SSH keys for connecting to GitHub are missing. These keys are normally located in `~/.ssh`. For SSH keys to be accepted you need to upload them in the settings section of GitHub UI.
+
+You can also clone the repository via https protocol:
+
+ git clone --recursive https://github.com/ClickHouse/ClickHouse.git
+
+This, however, will not let you send your changes to the server. You can still use it temporarily and add the SSH keys later, replacing the remote address of the repository with the `git remote` command.
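+
+For example, once your SSH keys are uploaded, switching the existing clone to the SSH address might look like this (substitute *your_github_username*):
+
+    git remote set-url origin git@github.com:your_github_username/ClickHouse.git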
+
+You can also add the original ClickHouse repo’s address to your local repository to pull updates from there:
+
+ git remote add upstream git@github.com:ClickHouse/ClickHouse.git
+
+After successfully running this command you will be able to pull updates from the main ClickHouse repo by running `git pull upstream master`.
+
+### Working with Submodules {#working-with-submodules}
+
+Working with submodules in git can be painful. The following commands will help to manage them:
+
+    # ! each command accepts --recursive
+    # Update remote URLs for submodules (rarely needed)
+ git submodule sync
+ # Add new submodules
+ git submodule init
+ # Update existing submodules to the current state
+ git submodule update
+ # Two last commands could be merged together
+ git submodule update --init
+
+The following commands will help you reset all submodules to their initial state (WARNING: any changes inside them will be deleted):
+
+ # Synchronizes submodules' remote URL with .gitmodules
+ git submodule sync
+    # Update the registered submodules and initialize those not yet initialized
+ git submodule update --init
+ # Reset all changes done after HEAD
+ git submodule foreach git reset --hard
+ # Clean files from .gitignore
+ git submodule foreach git clean -xfd
+    # Repeat the last 4 commands for all submodules
+ git submodule foreach git submodule sync
+ git submodule foreach git submodule update --init
+ git submodule foreach git submodule foreach git reset --hard
+ git submodule foreach git submodule foreach git clean -xfd
+
+## Build System {#build-system}
+
+ClickHouse uses CMake and Ninja for building.
+
+CMake - a meta-build system that can generate Ninja files (build tasks).
+Ninja - a smaller build system with a focus on speed, used to execute those CMake-generated tasks.
+
+To install on Ubuntu, Debian or Mint run `sudo apt install cmake ninja-build`.
+
+On CentOS or RedHat, run `sudo yum install cmake ninja-build`.
+
+If you use Arch or Gentoo, you probably know yourself how to install CMake.
+
+For installing CMake and Ninja on Mac OS X first install Homebrew and then install everything else via brew:
+
+ /usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
+ brew install cmake ninja
+
+Next, check the version of CMake: `cmake --version`. If it is below 3.12, you should install a newer version from the website: https://cmake.org/download/.
+
+## C++ Compiler {#c-compiler}
+
+Clang starting from version 11 is supported for building ClickHouse.
+
+Clang should be used instead of gcc, though our continuous integration (CI) platform runs checks for about a dozen build combinations.
+
+On Ubuntu/Debian you can use the automatic installation script (check the [official webpage](https://apt.llvm.org/)):
+
+```bash
+sudo bash -c "$(wget -O - https://apt.llvm.org/llvm.sh)"
+```
+
+Mac OS X build is also supported. Just run `brew install llvm`.
+
+
+## The Building Process {#the-building-process}
+
+Now that you are ready to build ClickHouse, we recommend that you create a separate directory `build` inside `ClickHouse` that will contain all of the build artefacts:
+
+ mkdir build
+ cd build
+
+You can have several different directories (build_release, build_debug, etc.) for different types of builds.
+
+While inside the `build` directory, configure your build by running CMake. Before the first run, you need to define environment variables that specify the compiler.
+
+ export CC=clang CXX=clang++
+ cmake ..
+
+If you installed clang using the automatic installation script above, also specify the version of clang installed in the first command, e.g. `export CC=clang-13 CXX=clang++-13`. The clang version will be in the script output.
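+
+For example, if the script installed clang-13, the configuration step might look like this:
+
+    export CC=clang-13 CXX=clang++-13
+    cmake ..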
+
+The `CC` variable specifies the compiler for C (short for C Compiler), and the `CXX` variable specifies which C++ compiler is to be used for the build.
+
+For a faster build, you can resort to the `debug` build type - a build with no optimizations. For that, supply the `-D CMAKE_BUILD_TYPE=Debug` parameter:
+
+ cmake -D CMAKE_BUILD_TYPE=Debug ..
+
+You can change the type of build by running this command in the `build` directory.
+
+Run ninja to build:
+
+ ninja clickhouse-server clickhouse-client
+
+Only the required binaries are going to be built in this example.
+
+If you need to build all the binaries (utilities and tests), run ninja with no parameters:
+
+ ninja
+
+A full build requires about 30GB of free disk space, or 15GB to build the main binaries.
+
+When the build machine does not have a large amount of RAM, you should limit the number of build tasks run in parallel with the `-j` param:
+
+ ninja -j 1 clickhouse-server clickhouse-client
+
+On machines with 4GB of RAM it is recommended to specify `-j 1`; for 8GB of RAM, `-j 2` is recommended.
+
+If you get the message: `ninja: error: loading 'build.ninja': No such file or directory`, it means that generating a build configuration has failed and you need to inspect the message above.
+
+Upon the successful start of the building process, you’ll see the build progress - the number of processed tasks and the total number of tasks.
+
+While building, messages about protobuf files in the libhdfs2 library such as `libprotobuf WARNING` may show up. They affect nothing and are safe to ignore.
+
+Upon successful build you get an executable file `ClickHouse/build/programs/clickhouse`:
+
+ ls -l programs/clickhouse
+
+## Running the Built Executable of ClickHouse {#running-the-built-executable-of-clickhouse}
+
+To run the server under the current user you need to navigate to `ClickHouse/programs/server/` (located outside of `build`) and run:
+
+ ../../build/programs/clickhouse server
+
+In this case, ClickHouse will use config files located in the current directory. You can run `clickhouse server` from any directory specifying the path to a config file as a command-line parameter `--config-file`.
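+
+For instance, from the repository root something like the following might work (the paths are illustrative):
+
+    ./build/programs/clickhouse server --config-file ./programs/server/config.xml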
+
+To connect to ClickHouse with clickhouse-client in another terminal, navigate to `ClickHouse/build/programs/` and run `./clickhouse client`.
+
+If you get a `Connection refused` message on Mac OS X or FreeBSD, try specifying the host address 127.0.0.1:
+
+ clickhouse client --host 127.0.0.1
+
+You can replace the production version of ClickHouse binary installed in your system with your custom-built ClickHouse binary. To do that install ClickHouse on your machine following the instructions from the official website. Next, run the following:
+
+ sudo service clickhouse-server stop
+ sudo cp ClickHouse/build/programs/clickhouse /usr/bin/
+ sudo service clickhouse-server start
+
+Note that `clickhouse-client`, `clickhouse-server` and others are symlinks to the commonly shared `clickhouse` binary.
+
+You can also run your custom-built ClickHouse binary with the config file from the ClickHouse package installed on your system:
+
+ sudo service clickhouse-server stop
+ sudo -u clickhouse ClickHouse/build/programs/clickhouse server --config-file /etc/clickhouse-server/config.xml
+
+## IDE (Integrated Development Environment) {#ide-integrated-development-environment}
+
+If you do not know which IDE to use, we recommend CLion. CLion is commercial software, but it offers a 30-day free trial. It is also free of charge for students. CLion can be used both on Linux and on Mac OS X.
+
+KDevelop and QTCreator are other great IDE alternatives for developing ClickHouse. KDevelop is a very handy IDE, although unstable. If KDevelop crashes a while after opening a project, you should click the “Stop All” button as soon as it has opened the list of the project’s files. After doing so, KDevelop should be fine to work with.
+
+As simple code editors, you can use Sublime Text or Visual Studio Code, or Kate (all of which are available on Linux).
+
+Just in case, it is worth mentioning that CLion creates the `build` path on its own, selects `debug` as the build type, uses the version of CMake defined in CLion rather than the one installed by you, and runs build tasks with `make` instead of `ninja`. This is normal behaviour; just keep it in mind to avoid confusion.
+
+## Writing Code {#writing-code}
+
+The description of ClickHouse architecture can be found here: https://clickhouse.com/docs/en/development/architecture/
+
+The Code Style Guide: https://clickhouse.com/docs/en/development/style/
+
+Adding third-party libraries: https://clickhouse.com/docs/en/development/contrib/#adding-third-party-libraries
+
+Writing tests: https://clickhouse.com/docs/en/development/tests/
+
+List of tasks: https://github.com/ClickHouse/ClickHouse/issues?q=is%3Aopen+is%3Aissue+label%3Ahacktoberfest
+
+## Test Data {#test-data}
+
+Developing ClickHouse often requires loading realistic datasets. This is particularly important for performance testing. We have a specially prepared set of anonymized web analytics data. It additionally requires some 3GB of free disk space. Note that this data is not required to accomplish most of the development tasks.
+
+ sudo apt install wget xz-utils
+
+ wget https://datasets.clickhouse.com/hits/tsv/hits_v1.tsv.xz
+ wget https://datasets.clickhouse.com/visits/tsv/visits_v1.tsv.xz
+
+ xz -v -d hits_v1.tsv.xz
+ xz -v -d visits_v1.tsv.xz
+
+ clickhouse-client
+
+ CREATE DATABASE IF NOT EXISTS test
+
+ CREATE TABLE test.hits ( WatchID UInt64, JavaEnable UInt8, Title String, GoodEvent Int16, EventTime DateTime, EventDate Date, CounterID UInt32, ClientIP UInt32, ClientIP6 FixedString(16), RegionID UInt32, UserID UInt64, CounterClass Int8, OS UInt8, UserAgent UInt8, URL String, Referer String, URLDomain String, RefererDomain String, Refresh UInt8, IsRobot UInt8, RefererCategories Array(UInt16), URLCategories Array(UInt16), URLRegions Array(UInt32), RefererRegions Array(UInt32), ResolutionWidth UInt16, ResolutionHeight UInt16, ResolutionDepth UInt8, FlashMajor UInt8, FlashMinor UInt8, FlashMinor2 String, NetMajor UInt8, NetMinor UInt8, UserAgentMajor UInt16, UserAgentMinor FixedString(2), CookieEnable UInt8, JavascriptEnable UInt8, IsMobile UInt8, MobilePhone UInt8, MobilePhoneModel String, Params String, IPNetworkID UInt32, TraficSourceID Int8, SearchEngineID UInt16, SearchPhrase String, AdvEngineID UInt8, IsArtifical UInt8, WindowClientWidth UInt16, WindowClientHeight UInt16, ClientTimeZone Int16, ClientEventTime DateTime, SilverlightVersion1 UInt8, SilverlightVersion2 UInt8, SilverlightVersion3 UInt32, SilverlightVersion4 UInt16, PageCharset String, CodeVersion UInt32, IsLink UInt8, IsDownload UInt8, IsNotBounce UInt8, FUniqID UInt64, HID UInt32, IsOldCounter UInt8, IsEvent UInt8, IsParameter UInt8, DontCountHits UInt8, WithHash UInt8, HitColor FixedString(1), UTCEventTime DateTime, Age UInt8, Sex UInt8, Income UInt8, Interests UInt16, Robotness UInt8, GeneralInterests Array(UInt16), RemoteIP UInt32, RemoteIP6 FixedString(16), WindowName Int32, OpenerName Int32, HistoryLength Int16, BrowserLanguage FixedString(2), BrowserCountry FixedString(2), SocialNetwork String, SocialAction String, HTTPError UInt16, SendTiming Int32, DNSTiming Int32, ConnectTiming Int32, ResponseStartTiming Int32, ResponseEndTiming Int32, FetchTiming Int32, RedirectTiming Int32, DOMInteractiveTiming Int32, DOMContentLoadedTiming Int32, DOMCompleteTiming Int32, LoadEventStartTiming Int32, LoadEventEndTiming Int32, NSToDOMContentLoadedTiming Int32, FirstPaintTiming Int32, RedirectCount Int8, SocialSourceNetworkID UInt8, SocialSourcePage String, ParamPrice Int64, ParamOrderID String, ParamCurrency FixedString(3), ParamCurrencyID UInt16, GoalsReached Array(UInt32), OpenstatServiceName String, OpenstatCampaignID String, OpenstatAdID String, OpenstatSourceID String, UTMSource String, UTMMedium String, UTMCampaign String, UTMContent String, UTMTerm String, FromTag String, HasGCLID UInt8, RefererHash UInt64, URLHash UInt64, CLID UInt32, YCLID UInt64, ShareService String, ShareURL String, ShareTitle String, `ParsedParams.Key1` Array(String), `ParsedParams.Key2` Array(String), `ParsedParams.Key3` Array(String), `ParsedParams.Key4` Array(String), `ParsedParams.Key5` Array(String), `ParsedParams.ValueDouble` Array(Float64), IslandID FixedString(16), RequestNum UInt32, RequestTry UInt8) ENGINE = MergeTree PARTITION BY toYYYYMM(EventDate) SAMPLE BY intHash32(UserID) ORDER BY (CounterID, EventDate, intHash32(UserID), EventTime);
+
+ CREATE TABLE test.visits ( CounterID UInt32, StartDate Date, Sign Int8, IsNew UInt8, VisitID UInt64, UserID UInt64, StartTime DateTime, Duration UInt32, UTCStartTime DateTime, PageViews Int32, Hits Int32, IsBounce UInt8, Referer String, StartURL String, RefererDomain String, StartURLDomain String, EndURL String, LinkURL String, IsDownload UInt8, TraficSourceID Int8, SearchEngineID UInt16, SearchPhrase String, AdvEngineID UInt8, PlaceID Int32, RefererCategories Array(UInt16), URLCategories Array(UInt16), URLRegions Array(UInt32), RefererRegions Array(UInt32), IsYandex UInt8, GoalReachesDepth Int32, GoalReachesURL Int32, GoalReachesAny Int32, SocialSourceNetworkID UInt8, SocialSourcePage String, MobilePhoneModel String, ClientEventTime DateTime, RegionID UInt32, ClientIP UInt32, ClientIP6 FixedString(16), RemoteIP UInt32, RemoteIP6 FixedString(16), IPNetworkID UInt32, SilverlightVersion3 UInt32, CodeVersion UInt32, ResolutionWidth UInt16, ResolutionHeight UInt16, UserAgentMajor UInt16, UserAgentMinor UInt16, WindowClientWidth UInt16, WindowClientHeight UInt16, SilverlightVersion2 UInt8, SilverlightVersion4 UInt16, FlashVersion3 UInt16, FlashVersion4 UInt16, ClientTimeZone Int16, OS UInt8, UserAgent UInt8, ResolutionDepth UInt8, FlashMajor UInt8, FlashMinor UInt8, NetMajor UInt8, NetMinor UInt8, MobilePhone UInt8, SilverlightVersion1 UInt8, Age UInt8, Sex UInt8, Income UInt8, JavaEnable UInt8, CookieEnable UInt8, JavascriptEnable UInt8, IsMobile UInt8, BrowserLanguage UInt16, BrowserCountry UInt16, Interests UInt16, Robotness UInt8, GeneralInterests Array(UInt16), Params Array(String), `Goals.ID` Array(UInt32), `Goals.Serial` Array(UInt32), `Goals.EventTime` Array(DateTime), `Goals.Price` Array(Int64), `Goals.OrderID` Array(String), `Goals.CurrencyID` Array(UInt32), WatchIDs Array(UInt64), ParamSumPrice Int64, ParamCurrency FixedString(3), ParamCurrencyID UInt16, ClickLogID UInt64, ClickEventID Int32, ClickGoodEvent Int32, ClickEventTime DateTime, ClickPriorityID Int32, ClickPhraseID Int32, ClickPageID Int32, ClickPlaceID Int32, ClickTypeID Int32, ClickResourceID Int32, ClickCost UInt32, ClickClientIP UInt32, ClickDomainID UInt32, ClickURL String, ClickAttempt UInt8, ClickOrderID UInt32, ClickBannerID UInt32, ClickMarketCategoryID UInt32, ClickMarketPP UInt32, ClickMarketCategoryName String, ClickMarketPPName String, ClickAWAPSCampaignName String, ClickPageName String, ClickTargetType UInt16, ClickTargetPhraseID UInt64, ClickContextType UInt8, ClickSelectType Int8, ClickOptions String, ClickGroupBannerID Int32, OpenstatServiceName String, OpenstatCampaignID String, OpenstatAdID String, OpenstatSourceID String, UTMSource String, UTMMedium String, UTMCampaign String, UTMContent String, UTMTerm String, FromTag String, HasGCLID UInt8, FirstVisit DateTime, PredLastVisit Date, LastVisit Date, TotalVisits UInt32, `TraficSource.ID` Array(Int8), `TraficSource.SearchEngineID` Array(UInt16), `TraficSource.AdvEngineID` Array(UInt8), `TraficSource.PlaceID` Array(UInt16), `TraficSource.SocialSourceNetworkID` Array(UInt8), `TraficSource.Domain` Array(String), `TraficSource.SearchPhrase` Array(String), `TraficSource.SocialSourcePage` Array(String), Attendance FixedString(16), CLID UInt32, YCLID UInt64, NormalizedRefererHash UInt64, SearchPhraseHash UInt64, RefererDomainHash UInt64, NormalizedStartURLHash UInt64, StartURLDomainHash UInt64, NormalizedEndURLHash UInt64, TopLevelDomain UInt64, URLScheme UInt64, OpenstatServiceNameHash UInt64, OpenstatCampaignIDHash UInt64, OpenstatAdIDHash UInt64, 
OpenstatSourceIDHash UInt64, UTMSourceHash UInt64, UTMMediumHash UInt64, UTMCampaignHash UInt64, UTMContentHash UInt64, UTMTermHash UInt64, FromHash UInt64, WebVisorEnabled UInt8, WebVisorActivity UInt32, `ParsedParams.Key1` Array(String), `ParsedParams.Key2` Array(String), `ParsedParams.Key3` Array(String), `ParsedParams.Key4` Array(String), `ParsedParams.Key5` Array(String), `ParsedParams.ValueDouble` Array(Float64), `Market.Type` Array(UInt8), `Market.GoalID` Array(UInt32), `Market.OrderID` Array(String), `Market.OrderPrice` Array(Int64), `Market.PP` Array(UInt32), `Market.DirectPlaceID` Array(UInt32), `Market.DirectOrderID` Array(UInt32), `Market.DirectBannerID` Array(UInt32), `Market.GoodID` Array(String), `Market.GoodName` Array(String), `Market.GoodQuantity` Array(Int32), `Market.GoodPrice` Array(Int64), IslandID FixedString(16)) ENGINE = CollapsingMergeTree(Sign) PARTITION BY toYYYYMM(StartDate) SAMPLE BY intHash32(UserID) ORDER BY (CounterID, StartDate, intHash32(UserID), VisitID);
+
+ clickhouse-client --max_insert_block_size 100000 --query "INSERT INTO test.hits FORMAT TSV" < hits_v1.tsv
+ clickhouse-client --max_insert_block_size 100000 --query "INSERT INTO test.visits FORMAT TSV" < visits_v1.tsv
+
+## Creating Pull Request {#creating-pull-request}
+
+Navigate to your fork repository in GitHub’s UI. If you have been developing in a branch, you need to select that branch. There will be a “Pull request” button located on the screen. In essence, this means “create a request for accepting my changes into the main repository”.
+
+A pull request can be created even if the work is not completed yet. In this case please put the word “WIP” (work in progress) at the beginning of the title; it can be changed later. This is useful for cooperative reviewing and discussion of changes as well as for running all of the available tests. It is important that you provide a brief description of your changes, as it will later be used for generating release changelogs.
+
+Testing will commence as soon as ClickHouse employees label your PR with the “can be tested” tag. The results of the first checks (e.g. code style) will come in within several minutes. Build check results will arrive within half an hour. The main set of tests will report within an hour.
+
+The system will prepare ClickHouse binary builds for your pull request individually. To retrieve these builds, click the “Details” link next to the “ClickHouse build check” entry in the list of checks. There you will find direct links to the built .deb packages of ClickHouse which you can deploy even on your production servers (if you have no fear).
+
+Most probably some of the builds will fail at first. This is due to the fact that we check builds both with gcc and with clang, with almost all existing warnings enabled (always with the `-Werror` flag) for clang. On that same page, you can find all of the build logs so that you do not have to build ClickHouse in all of the possible ways.
diff --git a/docs/en/reference/development/style.md b/docs/en/reference/development/style.md
new file mode 100644
index 00000000000..82cd9273680
--- /dev/null
+++ b/docs/en/reference/development/style.md
@@ -0,0 +1,832 @@
+---
+sidebar_position: 69
+sidebar_label: C++ Guide
+description: A list of recommendations regarding coding style, naming convention, formatting and more
+---
+
+# How to Write C++ Code
+
+## General Recommendations {#general-recommendations}
+
+**1.** The following are recommendations, not requirements.
+
+**2.** If you are editing code, it makes sense to follow the formatting of the existing code.
+
+**3.** Code style is needed for consistency. Consistency makes it easier to read the code, and it also makes it easier to search the code.
+
+**4.** Many of the rules do not have logical reasons; they are dictated by established practices.
+
+## Formatting {#formatting}
+
+**1.** Most of the formatting will be done automatically by `clang-format`.
+
+**2.** Indents are 4 spaces. Configure your development environment so that a tab adds four spaces.
+
+**3.** Opening and closing curly brackets must be on a separate line.
+
+``` cpp
+inline void readBoolText(bool & x, ReadBuffer & buf)
+{
+ char tmp = '0';
+ readChar(tmp, buf);
+ x = tmp != '0';
+}
+```
+
+**4.** If the entire function body is a single `statement`, it can be placed on a single line. Place spaces around curly braces (besides the space at the end of the line).
+
+``` cpp
+inline size_t mask() const { return buf_size() - 1; }
+inline size_t place(HashValue x) const { return x & mask(); }
+```
+
+**5.** For functions, don’t put spaces around brackets.
+
+``` cpp
+void reinsert(const Value & x)
+```
+
+``` cpp
+memcpy(&buf[place_value], &x, sizeof(x));
+```
+
+**6.** In `if`, `for`, `while` and other expressions, a space is inserted in front of the opening bracket (as opposed to function calls).
+
+``` cpp
+for (size_t i = 0; i < rows; i += storage.index_granularity)
+```
+
+**7.** Add spaces around binary operators (`+`, `-`, `*`, `/`, `%`, …) and the ternary operator `?:`.
+
+``` cpp
+UInt16 year = (s[0] - '0') * 1000 + (s[1] - '0') * 100 + (s[2] - '0') * 10 + (s[3] - '0');
+UInt8 month = (s[5] - '0') * 10 + (s[6] - '0');
+UInt8 day = (s[8] - '0') * 10 + (s[9] - '0');
+```
+
+**8.** If a line feed is entered, put the operator on a new line and increase the indent before it.
+
+``` cpp
+if (elapsed_ns)
+ message << " ("
+ << rows_read_on_server * 1000000000 / elapsed_ns << " rows/s., "
+ << bytes_read_on_server * 1000.0 / elapsed_ns << " MB/s.) ";
+```
+
+**9.** You can use spaces for alignment within a line, if desired.
+
+``` cpp
+dst.ClickLogID = click.LogID;
+dst.ClickEventID = click.EventID;
+dst.ClickGoodEvent = click.GoodEvent;
+```
+
+**10.** Don’t use spaces around the operators `.`, `->`.
+
+If necessary, the operator can be wrapped to the next line. In this case, the offset in front of it is increased.
+
+**11.** Do not use a space to separate unary operators (`--`, `++`, `*`, `&`, …) from the argument.
+
+**12.** Put a space after a comma, but not before it. The same rule goes for a semicolon inside a `for` expression.
+
+**13.** Do not use spaces to separate the `[]` operator.
+
+**14.** In a `template <...>` expression, use a space between `template` and `<`; no spaces after `<` or before `>`.
+
+``` cpp
+template <typename T>
+struct AggregatedStatElement
+{}
+```
+
+**15.** In classes and structures, write `public`, `private`, and `protected` on the same level as `class/struct`, and indent the rest of the code.
+
+``` cpp
+template <typename T>
+class MultiVersion
+{
+public:
+    /// Version of object for usage. shared_ptr manages lifetime of version.
+    using Version = std::shared_ptr<const T>;
+ ...
+}
+```
+
+**16.** If the same `namespace` is used for the entire file, and there isn’t anything else significant, an offset is not necessary inside `namespace`.
+
+**17.** If the block for an `if`, `for`, `while`, or other expression consists of a single `statement`, the curly brackets are optional. Place the `statement` on a separate line, instead. This rule is also valid for nested `if`, `for`, `while`, …
+
+But if the inner `statement` contains curly brackets or `else`, the external block should be written in curly brackets.
+
+``` cpp
+/// Finish write.
+for (auto & stream : streams)
+ stream.second->finalize();
+```
+
+**18.** There shouldn’t be any spaces at the ends of lines.
+
+**19.** Source files are UTF-8 encoded.
+
+**20.** Non-ASCII characters can be used in string literals.
+
+``` cpp
+<< ", " << (timer.elapsed() / chunks_stats.hits) << " μsec/hit.";
+```
+
+**21.** Do not write multiple expressions in a single line.
+
+**22.** Group sections of code inside functions and separate them with no more than one empty line.
+
+**23.** Separate functions, classes, and so on with one or two empty lines.
+
+**24.** A `const` (related to a value) must be written before the type name.
+
+``` cpp
+//correct
+const char * pos
+const std::string & s
+//incorrect
+char const * pos
+```
+
+**25.** When declaring a pointer or reference, the `*` and `&` symbols should be separated by spaces on both sides.
+
+``` cpp
+//correct
+const char * pos
+//incorrect
+const char* pos
+const char *pos
+```
+
+**26.** When using template types, alias them with the `using` keyword (except in the simplest cases).
+
+In other words, the template parameters are specified only in `using` and aren’t repeated in the code.
+
+`using` can be declared locally, such as inside a function.
+
+``` cpp
+//correct
+using FileStreams = std::map<std::string, std::shared_ptr<Stream>>;
+FileStreams streams;
+//incorrect
+std::map<std::string, std::shared_ptr<Stream>> streams;
+```
+
+**27.** Do not declare several variables of different types in one statement.
+
+``` cpp
+//incorrect
+int x, *y;
+```
+
+**28.** Do not use C-style casts.
+
+``` cpp
+//incorrect
+std::cerr << (int)c << std::endl;
+//correct
+std::cerr << static_cast<int>(c) << std::endl;
+```
+
+**29.** In classes and structs, group members and functions separately inside each visibility scope.
+
+**30.** For small classes and structs, it is not necessary to separate the method declaration from the implementation.
+
+The same is true for small methods in any classes or structs.
+
+For templated classes and structs, do not separate the method declarations from the implementation (because otherwise they must be defined in the same translation unit).
+
+**31.** You can wrap lines at 140 characters, instead of 80.
+
+**32.** Always use the prefix increment/decrement operators if postfix is not required.
+
+``` cpp
+for (Names::const_iterator it = column_names.begin(); it != column_names.end(); ++it)
+```
+
+## Comments {#comments}
+
+**1.** Be sure to add comments for all non-trivial parts of code.
+
+This is very important. Writing the comment might help you realize that the code isn’t necessary, or that it is designed wrong.
+
+``` cpp
+/** Part of piece of memory, that can be used.
+ * For example, if internal_buffer is 1MB, and there was only 10 bytes loaded to buffer from file for reading,
+ * then working_buffer will have size of only 10 bytes
+ * (working_buffer.end() will point to position right after those 10 bytes available for read).
+ */
+```
+
+**2.** Comments can be as detailed as necessary.
+
+**3.** Place comments before the code they describe. In rare cases, comments can come after the code, on the same line.
+
+``` cpp
+/** Parses and executes the query.
+*/
+void executeQuery(
+ ReadBuffer & istr, /// Where to read the query from (and data for INSERT, if applicable)
+ WriteBuffer & ostr, /// Where to write the result
+ Context & context, /// DB, tables, data types, engines, functions, aggregate functions...
+ BlockInputStreamPtr & query_plan, /// Here could be written the description on how query was executed
+ QueryProcessingStage::Enum stage = QueryProcessingStage::Complete /// Up to which stage process the SELECT query
+ )
+```
+
+**4.** Comments should be written in English only.
+
+**5.** If you are writing a library, include detailed comments explaining it in the main header file.
+
+**6.** Do not add comments that do not provide additional information. In particular, do not leave empty comments like this:
+
+``` cpp
+/*
+* Procedure Name:
+* Original procedure name:
+* Author:
+* Date of creation:
+* Dates of modification:
+* Modification authors:
+* Original file name:
+* Purpose:
+* Intent:
+* Designation:
+* Classes used:
+* Constants:
+* Local variables:
+* Parameters:
+* Date of creation:
+* Purpose:
+*/
+```
+
+The example is borrowed from the resource http://home.tamk.fi/~jaalto/course/coding-style/doc/unmaintainable-code/.
+
+**7.** Do not write garbage comments (author, creation date ..) at the beginning of each file.
+
+**8.** Single-line comments begin with three slashes: `///` and multi-line comments begin with `/**`. These comments are considered “documentation”.
+
+Note: You can use Doxygen to generate documentation from these comments. But Doxygen is not generally used because it is more convenient to navigate the code in the IDE.
+
+**9.** Multi-line comments must not have empty lines at the beginning and end (except the line that closes a multi-line comment).
+
+**10.** For commenting out code, use basic comments, not “documenting” comments.
+
+**11.** Delete the commented out parts of the code before committing.
+
+**12.** Do not use profanity in comments or code.
+
+**13.** Do not use uppercase letters. Do not use excessive punctuation.
+
+``` cpp
+/// WHAT THE FAIL???
+```
+
+**14.** Do not use comments to make delimiters.
+
+``` cpp
+///******************************************************
+```
+
+**15.** Do not start discussions in comments.
+
+``` cpp
+/// Why did you do this stuff?
+```
+
+**16.** There’s no need to write a comment at the end of a block describing what it was about.
+
+``` cpp
+/// for
+```
+
+## Names {#names}
+
+**1.** Use lowercase letters with underscores in the names of variables and class members.
+
+``` cpp
+size_t max_block_size;
+```
+
+**2.** For the names of functions (methods), use camelCase beginning with a lowercase letter.
+
+``` cpp
+std::string getName() const override { return "Memory"; }
+```
+
+**3.** For the names of classes (structs), use CamelCase beginning with an uppercase letter. Prefixes other than I are not used for interfaces.
+
+``` cpp
+class StorageMemory : public IStorage
+```
+
+**4.** `using` aliases are named the same way as classes.
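+
+For example, an illustrative alias (the names are arbitrary):
+
+``` cpp
+using StoragePtr = std::shared_ptr<IStorage>;
+```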
+
+**5.** Names of template type arguments: in simple cases, use `T`; `T`, `U`; `T1`, `T2`.
+
+For more complex cases, either follow the rules for class names, or add the prefix `T`.
+
+``` cpp
+template <typename TKey, typename TValue>
+struct AggregatedStatElement
+```
+
+**6.** Names of template constant arguments: either follow the rules for variable names, or use `N` in simple cases.
+
+``` cpp
+template <bool without_www>
+struct ExtractDomain
+```
+
+**7.** For abstract classes (interfaces) you can add the `I` prefix.
+
+``` cpp
+class IBlockInputStream
+```
+
+**8.** If you use a variable locally, you can use the short name.
+
+In all other cases, use a name that describes the meaning.
+
+``` cpp
+bool info_successfully_loaded = false;
+```
+
+**9.** Names of `define`s and global constants use ALL_CAPS with underscores.
+
+``` cpp
+#define MAX_SRC_TABLE_NAMES_TO_STORE 1000
+```
+
+**10.** File names should use the same style as their contents.
+
+If a file contains a single class, name the file the same way as the class (CamelCase).
+
+If the file contains a single function, name the file the same way as the function (camelCase).
+
+**11.** If the name contains an abbreviation, then:
+
+- For variable names, the abbreviation should use lowercase letters: `mysql_connection` (not `mySQL_connection`).
+- For names of classes and functions, keep the uppercase letters in the abbreviation: `MySQLConnection` (not `MySqlConnection`).
+
+**12.** Constructor arguments that are used just to initialize the class members should be named the same way as the class members, but with an underscore at the end.
+
+``` cpp
+FileQueueProcessor(
+ const std::string & path_,
+ const std::string & prefix_,
+    std::shared_ptr<Handler> handler_)
+ : path(path_),
+ prefix(prefix_),
+ handler(handler_),
+ log(&Logger::get("FileQueueProcessor"))
+{
+}
+```
+
+The underscore suffix can be omitted if the argument is not used in the constructor body.
+
+**13.** There is no difference in the names of local variables and class members (no prefixes required).
+
+``` cpp
+timer (not m_timer)
+```
+
+**14.** For the constants in an `enum`, use CamelCase with a capital letter. ALL_CAPS is also acceptable. If the `enum` is non-local, use an `enum class`.
+
+``` cpp
+enum class CompressionMethod
+{
+ QuickLZ = 0,
+ LZ4 = 1,
+};
+```
+
+**15.** All names must be in English. Transliteration of Hebrew words is not allowed.
+
+ not T_PAAMAYIM_NEKUDOTAYIM
+
+**16.** Abbreviations are acceptable if they are well known (when you can easily find the meaning of the abbreviation in Wikipedia or in a search engine).
+
+ `AST`, `SQL`.
+
+ Not `NVDH` (some random letters)
+
+Incomplete words are acceptable if the shortened version is in common use.
+
+You can also use an abbreviation if the full name is included next to it in the comments.
+
+**17.** File names with C++ source code must have the `.cpp` extension. Header files must have the `.h` extension.
+
+## How to Write Code {#how-to-write-code}
+
+**1.** Memory management.
+
+Manual memory deallocation (`delete`) can only be used in library code.
+
+In library code, the `delete` operator can only be used in destructors.
+
+In application code, memory must be freed by the object that owns it.
+
+Examples:
+
+- The easiest way is to place an object on the stack, or make it a member of another class.
+- For a large number of small objects, use containers.
+- For automatic deallocation of a small number of objects that reside in the heap, use `shared_ptr/unique_ptr` (see the sketch below).
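+
+A minimal sketch of these options, using hypothetical types:
+
+``` cpp
+#include <memory>
+#include <vector>
+
+struct Block {};
+
+struct Processor
+{
+    /// Owned as a member: freed automatically together with the Processor.
+    std::vector<Block> blocks;
+
+    /// A single heap object with automatic deallocation.
+    std::unique_ptr<Block> current = std::make_unique<Block>();
+};
+```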
+
+**2.** Resource management.
+
+Use `RAII` and see above.
+
+**3.** Error handling.
+
+Use exceptions. In most cases, you only need to throw an exception, and do not need to catch it (because of `RAII`).
+
+In offline data processing applications, it’s often acceptable to not catch exceptions.
+
+In servers that handle user requests, it’s usually enough to catch exceptions at the top level of the connection handler.
+
+In thread functions, you should catch and keep all exceptions to rethrow them in the main thread after `join`.
+
+``` cpp
+/// If there weren't any calculations yet, calculate the first block synchronously
+if (!started)
+{
+ calculate();
+ started = true;
+}
+else /// If calculations are already in progress, wait for the result
+ pool.wait();
+
+if (exception)
+ exception->rethrow();
+```
+
+Never hide exceptions without handling them. Never just blindly write all exceptions to the log.
+
+``` cpp
+// Not correct
+catch (...) {}
+```
+
+If you need to ignore some exceptions, do so only for specific ones and rethrow the rest.
+
+``` cpp
+catch (const DB::Exception & e)
+{
+ if (e.code() == ErrorCodes::UNKNOWN_AGGREGATE_FUNCTION)
+ return nullptr;
+ else
+ throw;
+}
+```
+
+When using functions with response codes or `errno`, always check the result and throw an exception in case of error.
+
+``` cpp
+if (0 != close(fd))
+ throwFromErrno("Cannot close file " + file_name, ErrorCodes::CANNOT_CLOSE_FILE);
+```
+
+You can use assert to check invariants in code.
+
+**4.** Exception types.
+
+There is no need to use a complex exception hierarchy in application code. The exception text should be understandable to a system administrator.
+
+**5.** Throwing exceptions from destructors.
+
+This is not recommended, but it is allowed.
+
+Use the following options:
+
+- Create a function (`done()` or `finalize()`) that will do, in advance, all the work that might lead to an exception. If that function was called, there should be no exceptions in the destructor later (a minimal sketch follows this list).
+- Tasks that are too complex (such as sending messages over the network) can be put in a separate method that the class user will have to call before destruction.
+- If there is an exception in the destructor, it’s better to log it than to hide it (if the logger is available).
+- In simple applications, it is acceptable to rely on `std::terminate` (for cases of `noexcept` by default in C++11) to handle exceptions.
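+
+A minimal sketch of the `finalize()` approach from the first option (hypothetical class):
+
+``` cpp
+class Writer
+{
+public:
+    /// Does all the work that may throw; must be called before destruction.
+    void finalize()
+    {
+        flush();    /// may throw
+        finalized = true;
+    }
+
+    ~Writer()
+    {
+        /// Nothing here can throw; at most, report that finalize() was forgotten.
+        if (!finalized)
+        {
+            /// log the problem if a logger is available
+        }
+    }
+
+private:
+    void flush() {}
+    bool finalized = false;
+};
+```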
+
+**6.** Anonymous code blocks.
+
+You can create a separate code block inside a single function in order to make certain variables local, so that the destructors are called when exiting the block.
+
+``` cpp
+Block block = data.in->read();
+
+{
+ std::lock_guard lock(mutex);
+ data.ready = true;
+ data.block = block;
+}
+
+ready_any.set();
+```
+
+**7.** Multithreading.
+
+In offline data processing programs:
+
+- Try to get the best possible performance on a single CPU core. You can then parallelize your code if necessary.
+
+In server applications:
+
+- Use the thread pool to process requests. At this point, we haven’t had any tasks that required userspace context switching.
+
+Fork is not used for parallelization.
+
+**8.** Syncing threads.
+
+Often it is possible to make different threads use different memory cells (even better: different cache lines) and not use any thread synchronization (except `joinAll`).
+
+If synchronization is required, in most cases, it is sufficient to use mutex under `lock_guard`.
+
+In other cases use system synchronization primitives. Do not use busy wait.
+
+Atomic operations should be used only in the simplest cases.
+
+Do not try to implement lock-free data structures unless it is your primary area of expertise.
+
+**9.** Pointers vs references.
+
+In most cases, prefer references.
+
+**10.** const.
+
+Use constant references, pointers to constants, `const_iterator`, and const methods.
+
+Consider `const` to be default and use non-`const` only when necessary.
+
+When passing variables by value, using `const` usually does not make sense.
+
+**11.** unsigned.
+
+Use `unsigned` if necessary.
+
+**12.** Numeric types.
+
+Use the types `UInt8`, `UInt16`, `UInt32`, `UInt64`, `Int8`, `Int16`, `Int32`, and `Int64`, as well as `size_t`, `ssize_t`, and `ptrdiff_t`.
+
+Don’t use these types for numbers: `signed/unsigned long`, `long long`, `short`, `signed/unsigned char`, `char`.
+
+**13.** Passing arguments.
+
+Pass complex values by value if they are going to be moved, and use `std::move`; pass by reference if you want to update a value in a loop.
+
+If a function captures ownership of an object created in the heap, make the argument type `shared_ptr` or `unique_ptr`.
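+
+A short sketch of both conventions (hypothetical functions):
+
+``` cpp
+#include <memory>
+#include <string>
+#include <utility>
+#include <vector>
+
+struct Block {};
+
+/// The argument is going to be moved: pass by value and use std::move.
+void setName(std::string name_, std::string & target) { target = std::move(name_); }
+
+/// The function takes ownership of a heap-allocated object.
+void enqueue(std::vector<std::unique_ptr<Block>> & queue, std::unique_ptr<Block> block)
+{
+    queue.push_back(std::move(block));
+}
+```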
+
+**14.** Return values.
+
+In most cases, just use `return`. Do not write `return std::move(res)`.
+
+If the function allocates an object on heap and returns it, use `shared_ptr` or `unique_ptr`.
+
+In rare cases (updating a value in a loop) you might need to return the value via an argument. In this case, the argument should be a reference.
+
+``` cpp
+using AggregateFunctionPtr = std::shared_ptr<IAggregateFunction>;
+
+/** Allows creating an aggregate function by its name.
+ */
+class AggregateFunctionFactory
+{
+public:
+ AggregateFunctionFactory();
+ AggregateFunctionPtr get(const String & name, const DataTypes & argument_types) const;
+```
+
+**15.** namespace.
+
+There is no need to use a separate `namespace` for application code.
+
+Small libraries do not need this, either.
+
+For medium to large libraries, put everything in a `namespace`.
+
+In the library’s `.h` file, you can use `namespace detail` to hide implementation details not needed for the application code.
+
+In a `.cpp` file, you can use a `static` or anonymous namespace to hide symbols.
+
+Also, a `namespace` can be used for an `enum` to prevent the corresponding names from falling into an external `namespace` (but it’s better to use an `enum class`).
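+
+A minimal sketch of these conventions (hypothetical library code):
+
+``` cpp
+#include <cstddef>
+
+/// mylib.h: public interface plus a detail namespace for internals.
+namespace MyLib
+{
+    namespace detail
+    {
+        size_t parseInternal(const char * pos, const char * end);
+    }
+
+    size_t parse(const char * pos, const char * end);
+}
+
+/// mylib.cpp: an anonymous namespace hides symbols from other translation units.
+namespace
+{
+    constexpr size_t default_buffer_size = 4096;
+}
+```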
+
+**16.** Deferred initialization.
+
+If arguments are required for initialization, then you normally shouldn’t write a default constructor.
+
+If later you’ll need to delay initialization, you can add a default constructor that will create an invalid object. Or, for a small number of objects, you can use `shared_ptr/unique_ptr`.
+
+``` cpp
+Loader(DB::Connection * connection_, const std::string & query, size_t max_block_size_);
+
+/// For deferred initialization
+Loader() {}
+```
+
+**17.** Virtual functions.
+
+If the class is not intended for polymorphic use, you do not need to make functions virtual. This also applies to the destructor.
+
+**18.** Encodings.
+
+Use UTF-8 everywhere. Use `std::string` and `char *`. Do not use `std::wstring` and `wchar_t`.
+
+**19.** Logging.
+
+See the examples everywhere in the code.
+
+Before committing, delete all meaningless and debug logging, and any other types of debug output.
+
+Logging in cycles should be avoided, even on the Trace level.
+
+Logs must be readable at any logging level.
+
+Logging should only be used in application code, for the most part.
+
+Log messages must be written in English.
+
+The log should preferably be understandable for the system administrator.
+
+Do not use profanity in the log.
+
+Use UTF-8 encoding in the log. In rare cases you can use non-ASCII characters in the log.
+
+**20.** Input-output.
+
+Don’t use `iostreams` in internal cycles that are critical for application performance (and never use `stringstream`).
+
+Use the `DB/IO` library instead.
+
+**21.** Date and time.
+
+See the `DateLUT` library.
+
+**22.** include.
+
+Always use `#pragma once` instead of include guards.
+
+**23.** using.
+
+`using namespace` is not used. You can use `using` with something specific. But make it local inside a class or function.
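+
+For example, a specific `using` kept local to a function (illustrative):
+
+``` cpp
+#include <utility>
+#include <vector>
+
+void process(std::vector<int> & values)
+{
+    using std::swap;    /// specific and local to this function
+    if (values.size() > 1)
+        swap(values.front(), values.back());
+}
+```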
+
+**24.** Do not use `trailing return type` for functions unless necessary.
+
+``` cpp
+auto f() -> void
+```
+
+**25.** Declaration and initialization of variables.
+
+``` cpp
+// right way
+std::string s = "Hello";
+std::string s{"Hello"};
+
+// wrong way
+auto s = std::string{"Hello"};
+```
+
+**26.** For virtual functions, write `virtual` in the base class, but write `override` instead of `virtual` in descendent classes.
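+
+For example (hypothetical classes):
+
+``` cpp
+#include <string>
+
+class IDataSource
+{
+public:
+    virtual ~IDataSource() = default;
+    virtual std::string getName() const = 0;
+};
+
+class MemoryDataSource : public IDataSource
+{
+public:
+    std::string getName() const override { return "Memory"; }
+};
+```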
+
+## Unused Features of C++ {#unused-features-of-c}
+
+**1.** Virtual inheritance is not used.
+
+**2.** Exception specifiers from C++03 are not used.
+
+## Platform {#platform}
+
+**1.** We write code for a specific platform.
+
+But other things being equal, cross-platform or portable code is preferred.
+
+**2.** Language: C++20 (see the list of available [C++20 features](https://en.cppreference.com/w/cpp/compiler_support#C.2B.2B20_features)).
+
+**3.** Compiler: `clang`. At this time (April 2021), the code is compiled using clang version 11. (It can also be compiled using `gcc` version 10, but it's untested and not suitable for production usage).
+
+The standard library is used (`libc++`).
+
+**4.** OS: Linux Ubuntu, not older than Precise.
+
+**5.** Code is written for the x86_64 CPU architecture.
+
+The CPU instruction set is the minimum supported set among our servers. Currently, it is SSE 4.2.
+
+**6.** Use `-Wall -Wextra -Werror` compilation flags. Also `-Weverything` is used, with a few exceptions.
+
+**7.** Use static linking with all libraries except those that are difficult to connect to statically (see the output of the `ldd` command).
+
+**8.** Code is developed and debugged with release settings.
+
+## Tools {#tools}
+
+**1.** KDevelop is a good IDE.
+
+**2.** For debugging, use `gdb`, `valgrind` (`memcheck`), `strace`, `-fsanitize=...`, or `tcmalloc_minimal_debug`.
+
+**3.** For profiling, use `Linux Perf`, `valgrind` (`callgrind`), or `strace -cf`.
+
+**4.** Sources are in Git.
+
+**5.** Assembly uses `CMake`.
+
+**6.** Programs are released using `deb` packages.
+
+**7.** Commits to master must not break the build.
+
+However, only selected revisions are considered workable.
+
+**8.** Make commits as often as possible, even if the code is only partially ready.
+
+Use branches for this purpose.
+
+If your code in the `master` branch is not buildable yet, exclude it from the build before the `push`. You’ll need to finish it or remove it within a few days.
+
+**9.** For non-trivial changes, use branches and publish them on the server.
+
+**10.** Unused code is removed from the repository.
+
+## Libraries {#libraries}
+
+**1.** The C++20 standard library is used (experimental extensions are allowed), as well as `boost` and `Poco` frameworks.
+
+**2.** It is not allowed to use libraries from OS packages. It is also not allowed to use pre-installed libraries. All libraries should be placed in form of source code in `contrib` directory and built with ClickHouse. See [Guidelines for adding new third-party libraries](contrib.md#adding-third-party-libraries) for details.
+
+**3.** Preference is always given to libraries that are already in use.
+
+## General Recommendations {#general-recommendations-1}
+
+**1.** Write as little code as possible.
+
+**2.** Try the simplest solution.
+
+**3.** Don’t write code until you know how it’s going to work and how the inner loop will function.
+
+**4.** In the simplest cases, use `using` instead of classes or structs.
+
+**5.** If possible, do not write copy constructors, assignment operators, destructors (other than a virtual one, if the class contains at least one virtual function), move constructors or move assignment operators. In other words, the compiler-generated functions must work correctly. You can use `default`.
+
+**6.** Code simplification is encouraged. Reduce the size of your code where possible.
+
+## Additional Recommendations {#additional-recommendations}
+
+**1.** Explicitly specifying `std::` for types from `stddef.h`
+
+is not recommended. In other words, we recommend writing `size_t` instead of `std::size_t`, because it’s shorter.
+
+It is acceptable to add `std::`.
+
+**2.** Explicitly specifying `std::` for functions from the standard C library
+
+is not recommended. In other words, write `memcpy` instead of `std::memcpy`.
+
+The reason is that there are similar non-standard functions, such as `memmem`. We do use these functions on occasion. These functions do not exist in `namespace std`.
+
+If you write `std::memcpy` instead of `memcpy` everywhere, then `memmem` without `std::` will look strange.
+
+Nevertheless, you can still use `std::` if you prefer it.
+
+**3.** Using functions from C when the same ones are available in the standard C++ library.
+
+This is acceptable if it is more efficient.
+
+For example, use `memcpy` instead of `std::copy` for copying large chunks of memory.
+
+**4.** Multiline function arguments.
+
+Any of the following wrapping styles are allowed:
+
+``` cpp
+function(
+ T1 x1,
+ T2 x2)
+```
+
+``` cpp
+function(
+ size_t left, size_t right,
+    const RangesInDataParts & ranges,
+ size_t limit)
+```
+
+``` cpp
+function(size_t left, size_t right,
+    const RangesInDataParts & ranges,
+ size_t limit)
+```
+
+``` cpp
+function(size_t left, size_t right,
+      const RangesInDataParts & ranges,
+      size_t limit)
+```
+
+``` cpp
+function(
+ size_t left,
+ size_t right,
+    const RangesInDataParts & ranges,
+ size_t limit)
+```
+
+[Original article](https://clickhouse.com/docs/en/development/style/)
diff --git a/docs/en/reference/development/tests.md b/docs/en/reference/development/tests.md
new file mode 100644
index 00000000000..29b69f0b697
--- /dev/null
+++ b/docs/en/reference/development/tests.md
@@ -0,0 +1,297 @@
+---
+sidebar_position: 70
+sidebar_label: Testing
+description: Most of ClickHouse features can be tested with functional tests and they are mandatory to use for every change in ClickHouse code that can be tested that way.
+---
+
+# ClickHouse Testing
+
+## Functional Tests
+
+Functional tests are the simplest and most convenient to use. Most ClickHouse features can be tested with functional tests, and they are mandatory to use for every change in ClickHouse code that can be tested that way.
+
+Each functional test sends one or multiple queries to the running ClickHouse server and compares the result with the reference.
+
+Tests are located in the `queries` directory. There are two subdirectories: `stateless` and `stateful`. Stateless tests run queries without any preloaded test data - they often create small synthetic datasets on the fly, within the test itself. Stateful tests require preloaded test data from ClickHouse, and it is available to the general public.
+
+Each test can be one of two types: `.sql` and `.sh`. A `.sql` test is a simple SQL script that is piped to `clickhouse-client --multiquery --testmode`. A `.sh` test is a script that is run by itself. SQL tests are generally preferable to `.sh` tests. You should use `.sh` tests only when you have to test some feature that cannot be exercised from pure SQL, such as piping some input data into `clickhouse-client` or testing `clickhouse-local`.
+
+### Running a Test Locally {#functional-test-locally}
+
+Start the ClickHouse server locally, listening on the default port (9000). To
+run, for example, the test `01428_hash_set_nan_key`, change to the repository
+folder and run the following command:
+
+```
+PATH=$PATH:<path to clickhouse-client> tests/clickhouse-test 01428_hash_set_nan_key
+```
+
+For more options, see `tests/clickhouse-test --help`. You can simply run all tests or run a subset of tests filtered by a substring in the test name: `./clickhouse-test substring`. There are also options to run tests in parallel or in randomized order.
+
+### Adding a New Test
+
+To add a new test, create a `.sql` or `.sh` file in the `queries/0_stateless` directory, check it manually, and then generate the `.reference` file in the following way: `clickhouse-client -n --testmode < 00000_test.sql > 00000_test.reference` or `./00000_test.sh > ./00000_test.reference`.
+
+Tests should use (create, drop, etc.) only tables in the `test` database, which is assumed to be created beforehand; tests can also use temporary tables.
+
+### Choosing the Test Name
+
+The name of the test starts with a five-digit prefix followed by a descriptive name, such as `00422_hash_function_constexpr.sql`. To choose the prefix, find the largest prefix already present in the directory and increment it by one. Other tests might be added with the same numeric prefix in the meantime; this is OK and does not lead to any problems, and you don't have to change it later.
+
+Some tests are marked with `zookeeper`, `shard` or `long` in their names. `zookeeper` is for tests that are using ZooKeeper. `shard` is for tests that require the server to listen on `127.0.0.*`; `distributed` or `global` have the same meaning. `long` is for tests that run slightly longer than one second. You can disable these groups of tests using the `--no-zookeeper`, `--no-shard` and `--no-long` options, respectively. Make sure to add a proper prefix to your test name if it needs ZooKeeper or distributed queries.
+
+### Checking for an Error that Must Occur
+
+Sometimes you want to test that a server error occurs for an incorrect query. We support special annotations for this in SQL tests, in the following form:
+```
+select x; -- { serverError 49 }
+```
+This test ensures that the server returns an error with code 49 about unknown column `x`. If there is no error, or the error is different, the test will fail. If you want to ensure that an error occurs on the client side, use `clientError` annotation instead.
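+
+For example, a test line that expects a client-side error might look like this (the query and the error code are illustrative, not taken from an actual test):
+
+```
+select; -- { clientError 62 }
+```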
+
+Do not check for a particular wording of the error message; it may change in the future, and the test will needlessly break. Check only the error code. If the existing error code is not precise enough for your needs, consider adding a new one.
+
+### Testing a Distributed Query
+
+If you want to use distributed queries in functional tests, you can leverage the `remote` table function with `127.0.0.{1..2}` addresses for the server to query itself, or you can use predefined test clusters in the server configuration file, such as `test_shard_localhost`. Remember to add the words `shard` or `distributed` to the test name, so that it is run in CI in the correct configurations, where the server is configured to support distributed queries.
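+
+For example, a self-distributed query using the `remote` table function might look like this (illustrative):
+
+```
+SELECT count() FROM remote('127.0.0.{1,2}', system.one);
+```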
+
+
+## Known Bugs {#known-bugs}
+
+If we know some bugs that can be easily reproduced by functional tests, we place prepared functional tests in the `tests/queries/bugs` directory. These tests will be moved to `tests/queries/0_stateless` when the bugs are fixed.
+
+## Integration Tests {#integration-tests}
+
+Integration tests allow testing ClickHouse in a clustered configuration and ClickHouse interaction with other servers like MySQL, Postgres, MongoDB. They are useful to emulate network splits, packet drops, etc. These tests are run under Docker and create multiple containers with various software.
+
+See `tests/integration/README.md` on how to run these tests.
+
+Note that integration of ClickHouse with third-party drivers is not tested. Also, we currently do not have integration tests with our JDBC and ODBC drivers.
+
+## Unit Tests {#unit-tests}
+
+Unit tests are useful when you want to test not ClickHouse as a whole, but a single isolated library or class. You can enable or disable the build of tests with the `ENABLE_TESTS` CMake option. Unit tests (and other test programs) are located in `tests` subdirectories across the code. To run unit tests, type `ninja test`. Some tests use `gtest`, but some are just programs that return a non-zero exit code on test failure.
+
+It’s not necessary to have unit tests if the code is already covered by functional tests (and functional tests are usually much simpler to use).
+
+You can run individual gtest checks by calling the executable directly, for example:
+
+```bash
+$ ./src/unit_tests_dbms --gtest_filter=LocalAddress*
+```
+
+## Performance Tests {#performance-tests}
+
+Performance tests allow measuring and comparing the performance of some isolated part of ClickHouse on synthetic queries. Tests are located at `tests/performance`. Each test is represented by a `.xml` file with a description of the test case. Tests are run with the `docker/tests/performance-comparison` tool. See the readme file for invocation.
+
+Each test runs one or multiple queries (possibly with combinations of parameters) in a loop. Some tests can contain preconditions on a preloaded test dataset.
+
+If you want to improve the performance of ClickHouse in some scenario, and if improvements can be observed on simple queries, it is highly recommended to write a performance test. It always makes sense to use `perf top` or other perf tools during your tests.
+
+## Test Tools and Scripts {#test-tools-and-scripts}
+
+Some programs in the `tests` directory are not prepared tests, but test tools. For example, for `Lexer` there is a tool `src/Parsers/tests/lexer` that just does tokenization of stdin and writes the colorized result to stdout. You can use these kinds of tools as code examples and for exploration and manual testing.
+
+## Miscellaneous Tests {#miscellaneous-tests}
+
+There are tests for machine-learned models in `tests/external_models`. These tests are not updated and must be transferred to integration tests.
+
+There is a separate test for quorum inserts. This test runs a ClickHouse cluster on separate servers and emulates various failure cases: network split, packet drop (between ClickHouse nodes, between ClickHouse and ZooKeeper, between the ClickHouse server and client, etc.), `kill -9`, `kill -STOP` and `kill -CONT`, like [Jepsen](https://aphyr.com/tags/Jepsen). Then the test checks that all acknowledged inserts were written and all rejected inserts were not.
+
+The quorum test was written by a separate team before ClickHouse was open-sourced. This team no longer works with ClickHouse. The test was accidentally written in Java. For these reasons, the quorum test must be rewritten and moved to integration tests.
+
+## Manual Testing {#manual-testing}
+
+When you develop a new feature, it is reasonable to also test it manually. You can do it with the following steps:
+
+Build ClickHouse. Run ClickHouse from the terminal: change directory to `programs/clickhouse-server` and run it with `./clickhouse-server`. It will use configuration (`config.xml`, `users.xml` and files within `config.d` and `users.d` directories) from the current directory by default. To connect to ClickHouse server, run `programs/clickhouse-client/clickhouse-client`.
+
+Note that all clickhouse tools (server, client, etc) are just symlinks to a single binary named `clickhouse`. You can find this binary at `programs/clickhouse`. All tools can also be invoked as `clickhouse tool` instead of `clickhouse-tool`.
+
+Alternatively, you can install a ClickHouse package: either a stable release from the ClickHouse repository, or you can build a package yourself with `./release` in the ClickHouse sources root. Then start the server with `sudo clickhouse start` (or stop it with `sudo clickhouse stop`). Look for logs at `/etc/clickhouse-server/clickhouse-server.log`.
+
+When ClickHouse is already installed on your system, you can build a new `clickhouse` binary and replace the existing binary:
+
+``` bash
+$ sudo clickhouse stop
+$ sudo cp ./clickhouse /usr/bin/
+$ sudo clickhouse start
+```
+
+You can also stop the system clickhouse-server and run your own with the same configuration but with logging to the terminal:
+
+``` bash
+$ sudo clickhouse stop
+$ sudo -u clickhouse /usr/bin/clickhouse server --config-file /etc/clickhouse-server/config.xml
+```
+
+Example with gdb:
+
+``` bash
+$ sudo -u clickhouse gdb --args /usr/bin/clickhouse server --config-file /etc/clickhouse-server/config.xml
+```
+
+If the system clickhouse-server is already running and you do not want to stop it, you can change the port numbers in your `config.xml` (or override them in a file in the `config.d` directory), provide an appropriate data path, and run it.
+
+The `clickhouse` binary has almost no dependencies and works across a wide range of Linux distributions. For a quick and dirty test of your changes on a server, you can simply `scp` your freshly built `clickhouse` binary to the server and then run it as in the examples above.
+
+## Build Tests {#build-tests}
+
+Build tests allow checking that the build is not broken on various alternative configurations and on some foreign systems. These tests are automated as well.
+
+Examples:
+- cross-compile for Darwin x86_64 (Mac OS X)
+- cross-compile for FreeBSD x86_64
+- cross-compile for Linux AArch64
+- build on Ubuntu with libraries from system packages (discouraged)
+- build with shared linking of libraries (discouraged)
+
+For example, a build with system packages is bad practice, because we cannot guarantee what exact version of packages a system will have. But this is really needed by Debian maintainers, so we at least have to support this build variant. Another example: shared linking is a common source of trouble, but it is needed for some enthusiasts.
+
+Though we cannot run all tests on all variants of builds, we want to check at least that various build variants are not broken. For this purpose we use build tests.
+
+We also test that there are no translation units that are too long to compile or require too much RAM.
+
+We also test that there are no overly large stack frames.
+
+## Testing for Protocol Compatibility {#testing-for-protocol-compatibility}
+
+When we extend ClickHouse network protocol, we test manually that old clickhouse-client works with new clickhouse-server and new clickhouse-client works with old clickhouse-server (simply by running binaries from corresponding packages).
+
+We also test some cases automatically with integration tests:
+- if data written by an old version of ClickHouse can be successfully read by the new version;
+- whether distributed queries work in a cluster with different ClickHouse versions.
+
+## Help from the Compiler {#help-from-the-compiler}
+
+The main ClickHouse code (located in the `dbms` directory) is built with `-Wall -Wextra -Werror` and with some additional warnings enabled. However, these options are not enabled for third-party libraries.
+
+Clang has even more useful warnings - you can look for them with `-Weverything` and pick something for the default build.
+
+For production builds, clang is used, but we also test gcc builds. For development, clang is usually more convenient to use. You can build on your own machine in debug mode (to save your laptop's battery), but please note that the compiler is able to generate more warnings with `-O3` due to better control flow and inter-procedural analysis. When building with clang in debug mode, the debug version of `libc++` is used, which allows catching more errors at runtime.
+
+## Sanitizers {#sanitizers}
+
+### Address sanitizer
+We run functional, integration, stress and unit tests under ASan on a per-commit basis.
+
+### Thread sanitizer
+We run functional, integration, stress and unit tests under TSan on a per-commit basis.
+
+### Memory sanitizer
+We run functional, integration, stress and unit tests under MSan on a per-commit basis.
+
+### Undefined behaviour sanitizer
+We run functional, integration, stress and unit tests under UBSan on a per-commit basis. The code of some third-party libraries is not sanitized for UB.
+
+### Valgrind (Memcheck)
+We used to run functional tests under Valgrind overnight, but we don't do it anymore. It takes multiple hours. Currently there is one known false positive in the `re2` library, see [this article](https://research.swtch.com/sparse).
+
+## Fuzzing {#fuzzing}
+
+ClickHouse fuzzing is implemented both using [libFuzzer](https://llvm.org/docs/LibFuzzer.html) and random SQL queries.
+All the fuzz testing should be performed with sanitizers (Address and Undefined).
+
+LibFuzzer is used for isolated fuzz testing of library code. Fuzzers are implemented as part of the test code and have “_fuzzer” name suffixes.
+Fuzzer example can be found at `src/Parsers/tests/lexer_fuzzer.cpp`. LibFuzzer-specific configs, dictionaries and corpus are stored at `tests/fuzz`.
+We encourage you to write fuzz tests for every functionality that handles user input.
+
+Fuzzers are not built by default. To build fuzzers, both the `-DENABLE_FUZZING=1` and `-DENABLE_TESTS=1` options should be set.
+We recommend disabling Jemalloc while building fuzzers. The configuration used to integrate ClickHouse fuzzing into
+Google OSS-Fuzz can be found at `docker/fuzz`.
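+
+A possible build sketch based on the flags above (the `ENABLE_JEMALLOC=0` option for disabling Jemalloc is an assumption; check the CMake options in your checkout):
+
+```bash
+mkdir build_fuzzers && cd build_fuzzers
+cmake .. -G Ninja -DENABLE_FUZZING=1 -DENABLE_TESTS=1 -DENABLE_JEMALLOC=0  # ENABLE_JEMALLOC=0 is an assumption
+ninja
+```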
+
+We also use a simple fuzz test to generate random SQL queries and to check that the server does not die while executing them.
+You can find it in `00746_sql_fuzzy.pl`. This test should be run continuously (overnight and longer).
+
+We also use a sophisticated AST-based query fuzzer that is able to find a huge number of corner cases. It does random permutations and substitutions in the query AST. It remembers AST nodes from previous tests and uses them for fuzzing subsequent tests, processing them in random order. You can learn more about this fuzzer in [this blog article](https://clickhouse.com/blog/en/2021/fuzzing-clickhouse/).
+
+## Stress test
+
+Stress tests are another kind of fuzzing. They run all functional tests in parallel, in random order, against a single server. Results of the tests are not checked.
+
+It is checked that:
+- server does not crash, no debug or sanitizer traps are triggered;
+- there are no deadlocks;
+- the database structure is consistent;
+- server can successfully stop after the test and start again without exceptions.
+
+There are five variants (Debug, ASan, TSan, MSan, UBSan).
+
+## Thread Fuzzer
+
+Thread Fuzzer (not to be confused with Thread Sanitizer) is another kind of fuzzing that randomizes the order of thread execution. It helps to find even more special cases.
+
+## Security Audit
+
+Our Security Team did a basic review of ClickHouse capabilities from a security standpoint.
+
+## Static Analyzers {#static-analyzers}
+
+We run `clang-tidy` on a per-commit basis. `clang-static-analyzer` checks are also enabled. `clang-tidy` is also used for some style checks.
+
+We have evaluated `clang-tidy`, `Coverity`, `cppcheck`, `PVS-Studio`, `tscancode`, `CodeQL`. You will find instructions for usage in `tests/instructions/` directory.
+
+If you use `CLion` as an IDE, you can leverage some `clang-tidy` checks out of the box.
+
+We also use `shellcheck` for static analysis of shell scripts.
+
+## Hardening {#hardening}
+
+In debug builds we use a custom allocator that does ASLR of user-level allocations.
+
+We also manually protect memory regions that are expected to be readonly after allocation.
+
+In debug builds we also use a customized libc that ensures that no "harmful" (obsolete, insecure, not thread-safe) functions are called.
+
+Debug assertions are used extensively.
+
+In debug builds, if an exception with a "logical error" code (which implies a bug) is thrown, the program is terminated prematurely. This allows using exceptions in release builds while treating them as assertions in debug builds.
+
+The debug version of jemalloc is used for debug builds.
+The debug version of libc++ is used for debug builds.
+
+## Runtime Integrity Checks
+
+Data stored on disk is checksummed. Data in MergeTree tables is checksummed in three ways simultaneously* (compressed data blocks, uncompressed data blocks, the total checksum across blocks). Data transferred over network between client and server or between servers is also checksummed. Replication ensures bit-identical data on replicas.
+
+This is required to protect against faulty hardware (bit rot on storage media, bit flips in RAM on the server, bit flips in the RAM of the network controller, bit flips in the RAM of the network switch, bit flips in the RAM of the client, bit flips on the wire). Note that bit flips are common and likely to occur even with ECC RAM and in the presence of TCP checksums (if you manage to run thousands of servers processing petabytes of data each day). [See the video (Russian)](https://www.youtube.com/watch?v=ooBAQIe0KlQ).
+
+ClickHouse provides diagnostics that will help ops engineers find faulty hardware.
+
+\* and it is not slow.
+
+## Code Style {#code-style}
+
+Code style rules are described [here](style.md).
+
+To check for some common style violations, you can use `utils/check-style` script.
+
+To force proper style of your code, you can use `clang-format`. The `.clang-format` file is located at the sources root. It mostly corresponds to our actual code style. But it’s not recommended to apply `clang-format` to existing files because it makes formatting worse. You can use the `clang-format-diff` tool that you can find in the clang source repository.
+
+Alternatively, you can try the `uncrustify` tool to reformat your code. The configuration is in `uncrustify.cfg` in the sources root. It is less tested than `clang-format`.
+
+`CLion` has its own code formatter that has to be tuned for our code style.
+
+We also use `codespell` to find typos in code. It is automated as well.
+
+## Test Coverage {#test-coverage}
+
+We also track test coverage, but only for functional tests and only for clickhouse-server. It is performed on a daily basis.
+
+## Tests for Tests
+
+There is an automated check for flaky tests. It runs all new tests 100 times (for functional tests) or 10 times (for integration tests). If the test fails at least once, it is considered flaky.
+
+## Testflows
+
+[Testflows](https://testflows.com/) is an enterprise-grade open-source testing framework, which is used to test a subset of ClickHouse.
+
+## Test Automation {#test-automation}
+
+We run tests with [GitHub Actions](https://github.com/features/actions).
+
+Build jobs and tests are run in Sandbox on a per-commit basis. Resulting packages and test results are published in GitHub and can be downloaded by direct links. Artifacts are stored for several months. When you send a pull request on GitHub, we tag it as “can be tested” and our CI system will build ClickHouse packages (release, debug, with address sanitizer, etc) for you.
+
+We do not use Travis CI due to the limit on time and computational power.
+We do not use Jenkins. It was used before, and now we are happy not to be using it.
+
+[Original article](https://clickhouse.com/docs/en/development/tests/)
diff --git a/docs/en/reference/engines/_category_.yml b/docs/en/reference/engines/_category_.yml
new file mode 100644
index 00000000000..a82c53bc65e
--- /dev/null
+++ b/docs/en/reference/engines/_category_.yml
@@ -0,0 +1,8 @@
+position: 30
+label: 'Database & Table Engines'
+collapsible: true
+collapsed: true
+link:
+ type: generated-index
+ title: Database & Table Engines
+ slug: /en/table-engines
\ No newline at end of file
diff --git a/docs/en/reference/engines/database-engines/atomic.md b/docs/en/reference/engines/database-engines/atomic.md
new file mode 100644
index 00000000000..878307121aa
--- /dev/null
+++ b/docs/en/reference/engines/database-engines/atomic.md
@@ -0,0 +1,61 @@
+---
+sidebar_label: Atomic
+sidebar_position: 10
+---
+
+# Atomic
+
+The `Atomic` engine supports non-blocking [DROP TABLE](#drop-detach-table) and [RENAME TABLE](#rename-table) queries and atomic [EXCHANGE TABLES](#exchange-tables) queries. The `Atomic` database engine is used by default.
+
+## Creating a Database {#creating-a-database}
+
+``` sql
+CREATE DATABASE test [ENGINE = Atomic];
+```
+
+## Specifics and recommendations {#specifics-and-recommendations}
+
+### Table UUID {#table-uuid}
+
+All tables in an `Atomic` database have a persistent [UUID](../../sql-reference/data-types/uuid.md) and store their data in the directory `/clickhouse_path/store/xxx/xxxyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy/`, where `xxxyyyyy-yyyy-yyyy-yyyy-yyyyyyyyyyyy` is the UUID of the table.
+Usually, the UUID is generated automatically, but the user can also specify it explicitly when creating the table (this is not recommended).
+
+For example:
+
+```sql
+CREATE TABLE name UUID '28f1c61c-2970-457a-bffe-454156ddcfef' (n UInt64) ENGINE = ...;
+```
+
+:::note
+You can use the [show_table_uuid_in_table_create_query_if_not_nil](../../operations/settings/settings.md#show_table_uuid_in_table_create_query_if_not_nil) setting to display the UUID with the `SHOW CREATE` query.
+:::
+
+### RENAME TABLE {#rename-table}
+
+[RENAME](../../sql-reference/statements/rename.md) queries are performed without changing the UUID or moving table data. These queries do not wait for the completion of queries using the table and are executed instantly.
+
+### DROP/DETACH TABLE {#drop-detach-table}
+
+On `DROP TABLE`, no data is removed: the `Atomic` database just marks the table as dropped by moving its metadata to `/clickhouse_path/metadata_dropped/` and notifies a background thread. The delay before the final table data deletion is specified by the [database_atomic_delay_before_drop_table_sec](../../operations/server-configuration-parameters/settings.md#database_atomic_delay_before_drop_table_sec) setting.
+You can specify synchronous mode using the `SYNC` modifier, or enable it for all queries with the [database_atomic_wait_for_drop_and_detach_synchronously](../../operations/settings/settings.md#database_atomic_wait_for_drop_and_detach_synchronously) setting. In this case `DROP` waits for running `SELECT`, `INSERT` and other queries which are using the table to finish. The table will actually be removed when it is not in use.
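+
+For example, to drop a table and wait until its data is actually removed (the table name is illustrative):
+
+``` sql
+DROP TABLE test.my_table SYNC;
+```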
+
+### EXCHANGE TABLES/DICTIONARIES {#exchange-tables}
+
+[EXCHANGE](../../sql-reference/statements/exchange.md) query swaps tables or dictionaries atomically. For instance, instead of this non-atomic operation:
+
+```sql
+RENAME TABLE new_table TO tmp, old_table TO new_table, tmp TO old_table;
+```
+you can use one atomic query:
+
+``` sql
+EXCHANGE TABLES new_table AND old_table;
+```
+
+### ReplicatedMergeTree in Atomic Database {#replicatedmergetree-in-atomic-database}
+
+For [ReplicatedMergeTree](../table-engines/mergetree-family/replication.md#table_engines-replication) tables, it is recommended not to specify the engine parameters (the path in ZooKeeper and the replica name). In this case, the configuration parameters [default_replica_path](../../operations/server-configuration-parameters/settings.md#default_replica_path) and [default_replica_name](../../operations/server-configuration-parameters/settings.md#default_replica_name) will be used. If you want to specify the engine parameters explicitly, it is recommended to use the `{uuid}` macro, so that unique paths are automatically generated for each table in ZooKeeper.
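+
+For example, a sketch of explicitly specified engine parameters with the `{uuid}` macro (the table name and columns are illustrative; the path layout mirrors the default):
+
+``` sql
+CREATE TABLE test.hits (dt Date, id UInt64)
+ENGINE = ReplicatedMergeTree('/clickhouse/tables/{uuid}/{shard}', '{replica}')
+ORDER BY id;
+```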
+
+## See Also
+
+- [system.databases](../../operations/system-tables/databases.md) system table
diff --git a/docs/en/reference/engines/database-engines/index.md b/docs/en/reference/engines/database-engines/index.md
new file mode 100644
index 00000000000..0cee580abcd
--- /dev/null
+++ b/docs/en/reference/engines/database-engines/index.md
@@ -0,0 +1,25 @@
+---
+toc_folder_title: Database Engines
+toc_priority: 27
+toc_title: Introduction
+---
+
+# Database Engines {#database-engines}
+
+Database engines allow you to work with tables. By default, ClickHouse uses the [Atomic](../../engines/database-engines/atomic.md) database engine, which provides configurable [table engines](../../engines/table-engines/index.md) and an [SQL dialect](../../sql-reference/syntax.md).
+
+Here is a complete list of available database engines. Follow the links for more details:
+
+- [Atomic](../../engines/database-engines/atomic.md)
+
+- [MySQL](../../engines/database-engines/mysql.md)
+
+- [MaterializedMySQL](../../engines/database-engines/materialized-mysql.md)
+
+- [Lazy](../../engines/database-engines/lazy.md)
+
+- [PostgreSQL](../../engines/database-engines/postgresql.md)
+
+- [Replicated](../../engines/database-engines/replicated.md)
+
+- [SQLite](../../engines/database-engines/sqlite.md)
diff --git a/docs/en/reference/engines/database-engines/lazy.md b/docs/en/reference/engines/database-engines/lazy.md
new file mode 100644
index 00000000000..b95ade19df4
--- /dev/null
+++ b/docs/en/reference/engines/database-engines/lazy.md
@@ -0,0 +1,16 @@
+---
+sidebar_label: Lazy
+sidebar_position: 20
+---
+
+# Lazy {#lazy}
+
+Keeps tables in RAM only for `expiration_time_in_seconds` seconds after the last access. Can be used only with \*Log tables.
+
+It’s optimized for storing many small \*Log tables, for which there is a long time interval between accesses.
+
+## Creating a Database {#creating-a-database}
+
+ CREATE DATABASE testlazy ENGINE = Lazy(expiration_time_in_seconds);
+
+[Original article](https://clickhouse.com/docs/en/database_engines/lazy/)
diff --git a/docs/en/reference/engines/database-engines/materialized-mysql.md b/docs/en/reference/engines/database-engines/materialized-mysql.md
new file mode 100644
index 00000000000..df072682097
--- /dev/null
+++ b/docs/en/reference/engines/database-engines/materialized-mysql.md
@@ -0,0 +1,290 @@
+---
+sidebar_label: MaterializedMySQL
+sidebar_position: 70
+---
+
+# [experimental] MaterializedMySQL
+
+:::warning
+This is an experimental feature that should not be used in production.
+:::
+
+Creates a ClickHouse database with all the tables existing in MySQL, and all the data in those tables. The ClickHouse server works as a MySQL replica. It reads the `binlog` and performs DDL and DML queries.
+
+## Creating a Database {#creating-a-database}
+
+``` sql
+CREATE DATABASE [IF NOT EXISTS] db_name [ON CLUSTER cluster]
+ENGINE = MaterializedMySQL('host:port', ['database' | database], 'user', 'password') [SETTINGS ...]
+[TABLE OVERRIDE table1 (...), TABLE OVERRIDE table2 (...)]
+```
+
+**Engine Parameters**
+
+- `host:port` — MySQL server endpoint.
+- `database` — MySQL database name.
+- `user` — MySQL user.
+- `password` — User password.
+
+**Engine Settings**
+
+- `max_rows_in_buffer` — Maximum number of rows that are allowed to be cached in memory (for a single table; the cached data cannot be queried). When this number is exceeded, the data will be materialized. Default: `65 505`.
+- `max_bytes_in_buffer` — Maximum number of bytes that are allowed to be cached in memory (for a single table; the cached data cannot be queried). When this number is exceeded, the data will be materialized. Default: `1 048 576`.
+- `max_flush_data_time` — Maximum number of milliseconds that data is allowed to stay cached in memory (for the whole database; the cached data cannot be queried). When this time is exceeded, the data will be materialized. Default: `1000`.
+- `max_wait_time_when_mysql_unavailable` — Retry interval when MySQL is not available (milliseconds). A negative value disables retries. Default: `1000`.
+- `allows_query_when_mysql_lost` — Allows querying a materialized table when MySQL is lost. Default: `0` (`false`).
+- `materialized_mysql_tables_list` — A comma-separated list of MySQL database tables to be replicated by the MaterializedMySQL database engine. Default value: empty list, which means all tables will be replicated.
+
+```sql
+CREATE DATABASE mysql ENGINE = MaterializedMySQL('localhost:3306', 'db', 'user', '***')
+ SETTINGS
+ allows_query_when_mysql_lost=true,
+ max_wait_time_when_mysql_unavailable=10000;
+```
+
+**Settings on MySQL-server Side**
+
+For `MaterializedMySQL` to work correctly, there are a few mandatory `MySQL`-side configuration settings that must be set:
+
+- `default_authentication_plugin = mysql_native_password` since `MaterializedMySQL` can only authorize with this method.
+- `gtid_mode = on` since GTID-based logging is mandatory for providing correct `MaterializedMySQL` replication.
+
+:::note
+While turning on `gtid_mode` you should also specify `enforce_gtid_consistency = on`.
+:::
+
+## Virtual Columns {#virtual-columns}
+
+When working with the `MaterializedMySQL` database engine, [ReplacingMergeTree](../../engines/table-engines/mergetree-family/replacingmergetree.md) tables are used with virtual `_sign` and `_version` columns (an example query follows the list below).
+
+- `_version` — Transaction counter. Type [UInt64](../../sql-reference/data-types/int-uint.md).
+- `_sign` — Deletion mark. Type [Int8](../../sql-reference/data-types/int-uint.md). Possible values:
+ - `1` — Row is not deleted,
+ - `-1` — Row is deleted.
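+
+For example, you can inspect these columns explicitly (using the `mysql.test` table from the examples below):
+
+``` sql
+SELECT a, b, _sign, _version FROM mysql.test;
+```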
+
+## Data Types Support {#data_types-support}
+
+| MySQL | ClickHouse |
+|-------------------------|--------------------------------------------------------------|
+| TINY | [Int8](../../sql-reference/data-types/int-uint.md) |
+| SHORT | [Int16](../../sql-reference/data-types/int-uint.md) |
+| INT24 | [Int32](../../sql-reference/data-types/int-uint.md) |
+| LONG | [UInt32](../../sql-reference/data-types/int-uint.md) |
+| LONGLONG | [UInt64](../../sql-reference/data-types/int-uint.md) |
+| FLOAT | [Float32](../../sql-reference/data-types/float.md) |
+| DOUBLE | [Float64](../../sql-reference/data-types/float.md) |
+| DECIMAL, NEWDECIMAL | [Decimal](../../sql-reference/data-types/decimal.md) |
+| DATE, NEWDATE | [Date](../../sql-reference/data-types/date.md) |
+| DATETIME, TIMESTAMP | [DateTime](../../sql-reference/data-types/datetime.md) |
+| DATETIME2, TIMESTAMP2 | [DateTime64](../../sql-reference/data-types/datetime64.md) |
+| YEAR | [UInt16](../../sql-reference/data-types/int-uint.md) |
+| TIME | [Int64](../../sql-reference/data-types/int-uint.md) |
+| ENUM | [Enum](../../sql-reference/data-types/enum.md) |
+| STRING | [String](../../sql-reference/data-types/string.md) |
+| VARCHAR, VAR_STRING | [String](../../sql-reference/data-types/string.md) |
+| BLOB | [String](../../sql-reference/data-types/string.md) |
+| GEOMETRY | [String](../../sql-reference/data-types/string.md) |
+| BINARY | [FixedString](../../sql-reference/data-types/fixedstring.md) |
+| BIT | [UInt64](../../sql-reference/data-types/int-uint.md) |
+| SET | [UInt64](../../sql-reference/data-types/int-uint.md) |
+
+[Nullable](../../sql-reference/data-types/nullable.md) is supported.
+
+The data of TIME type in MySQL is converted to microseconds in ClickHouse.
+
+Other types are not supported. If a MySQL table contains a column of such a type, ClickHouse throws the "Unhandled data type" exception and stops replication.
+
+## Specifics and Recommendations {#specifics-and-recommendations}
+
+### Compatibility Restrictions {#compatibility-restrictions}
+
+Apart from the data type limitations, there are a few restrictions compared to `MySQL` databases that should be resolved before replication is possible:
+
+- Each table in `MySQL` should contain a `PRIMARY KEY`.
+
+- Replication will not work for tables containing rows with `ENUM` field values outside the range specified in the `ENUM` signature.
+
+### DDL Queries {#ddl-queries}
+
+MySQL DDL queries are converted into the corresponding ClickHouse DDL queries ([ALTER](../../sql-reference/statements/alter/index.md), [CREATE](../../sql-reference/statements/create/index.md), [DROP](../../sql-reference/statements/drop.md), [RENAME](../../sql-reference/statements/rename.md)). If ClickHouse cannot parse some DDL query, the query is ignored.
+
+### Data Replication {#data-replication}
+
+`MaterializedMySQL` does not support direct `INSERT`, `DELETE` and `UPDATE` queries. However, they are supported in terms of data replication:
+
+- MySQL `INSERT` query is converted into `INSERT` with `_sign=1`.
+
+- MySQL `DELETE` query is converted into `INSERT` with `_sign=-1`.
+
+- MySQL `UPDATE` query is converted into `INSERT` with `_sign=-1` and `INSERT` with `_sign=1` if the primary key has been changed, or
+ `INSERT` with `_sign=1` if not.
+
+### Selecting from MaterializedMySQL Tables {#select}
+
+A `SELECT` query from `MaterializedMySQL` tables has some specifics:
+
+- If `_version` is not specified in the `SELECT` query, the
+ [FINAL](../../sql-reference/statements/select/from.md#select-from-final) modifier is used, so only rows with
+ `MAX(_version)` are returned for each primary key value.
+
+- If `_sign` is not specified in the `SELECT` query, `WHERE _sign=1` is used by default. So the deleted rows are not
+ included into the result set.
+
+- The result includes column comments if they exist in the MySQL database tables.
+
+### Index Conversion {#index-conversion}
+
+MySQL `PRIMARY KEY` and `INDEX` clauses are converted into `ORDER BY` tuples in ClickHouse tables.
+
+ClickHouse has only one physical order, which is determined by `ORDER BY` clause. To create a new physical order, use
+[materialized views](../../sql-reference/statements/create/view.md#materialized).
+
+**Notes**
+
+- Rows with `_sign=-1` are not deleted physically from the tables.
+- Cascade `UPDATE/DELETE` queries are not supported by the `MaterializedMySQL` engine, as they are not visible in the
+ MySQL binlog.
+- Replication can be easily broken.
+- Manual operations on database and tables are forbidden.
+- `MaterializedMySQL` is affected by the [optimize_on_insert](../../operations/settings/settings.md#optimize-on-insert)
+ setting. Data is merged in the corresponding table in the `MaterializedMySQL` database when a table in the MySQL
+ server changes.
+
+### Table Overrides {#table-overrides}
+
+Table overrides can be used to customize the ClickHouse DDL queries, allowing you to make schema optimizations for your
+application. This is especially useful for controlling partitioning, which is important for the overall performance of
+MaterializedMySQL.
+
+These are the schema conversion manipulations you can do with table overrides for MaterializedMySQL:
+
+ * Modify column type. Must be compatible with the original type, or replication will fail. For example,
+ you can modify a UInt32 column to UInt64, but you can not modify a String column to Array(String).
+ * Modify [column TTL](../table-engines/mergetree-family/mergetree/#mergetree-column-ttl).
+ * Modify [column compression codec](../../sql-reference/statements/create/table/#codecs).
+ * Add [ALIAS columns](../../sql-reference/statements/create/table/#alias).
+ * Add [skipping indexes](../table-engines/mergetree-family/mergetree/#table_engine-mergetree-data_skipping-indexes)
+ * Add [projections](../table-engines/mergetree-family/mergetree/#projections). Note that projection optimizations are
+ disabled when using `SELECT ... FINAL` (which MaterializedMySQL does by default), so their utility is limited here.
+  `INDEX ... TYPE hypothesis` as [described in the v21.12 blog post](https://clickhouse.com/blog/en/2021/clickhouse-v21.12-released/)
+ may be more useful in this case.
+ * Modify [PARTITION BY](../table-engines/mergetree-family/custom-partitioning-key/)
+ * Modify [ORDER BY](../table-engines/mergetree-family/mergetree/#mergetree-query-clauses)
+ * Modify [PRIMARY KEY](../table-engines/mergetree-family/mergetree/#mergetree-query-clauses)
+ * Add [SAMPLE BY](../table-engines/mergetree-family/mergetree/#mergetree-query-clauses)
+ * Add [table TTL](../table-engines/mergetree-family/mergetree/#mergetree-query-clauses)
+
+```sql
+CREATE DATABASE db_name ENGINE = MaterializedMySQL(...)
+[SETTINGS ...]
+[TABLE OVERRIDE table_name (
+ [COLUMNS (
+ [col_name [datatype] [ALIAS expr] [CODEC(...)] [TTL expr], ...]
+ [INDEX index_name expr TYPE indextype[(...)] GRANULARITY val, ...]
+ [PROJECTION projection_name (SELECT [GROUP BY] [ORDER BY]), ...]
+ )]
+ [ORDER BY expr]
+ [PRIMARY KEY expr]
+ [PARTITION BY expr]
+ [SAMPLE BY expr]
+ [TTL expr]
+), ...]
+```
+
+Example:
+
+```sql
+CREATE DATABASE db_name ENGINE = MaterializedMySQL(...)
+TABLE OVERRIDE table1 (
+ COLUMNS (
+ userid UUID,
+ category LowCardinality(String),
+ timestamp DateTime CODEC(Delta, Default)
+ )
+ PARTITION BY toYear(timestamp)
+),
+TABLE OVERRIDE table2 (
+ COLUMNS (
+ client_ip String TTL created + INTERVAL 72 HOUR
+ )
+ SAMPLE BY ip_hash
+)
+```
+
+The `COLUMNS` list is sparse; existing columns are modified as specified, extra ALIAS columns are added. It is not
+possible to add ordinary or MATERIALIZED columns. Modified columns with a different type must be assignable from the
+original type. There is currently no validation of this or similar issues when the `CREATE DATABASE` query executes, so
+extra care needs to be taken.
+
+You may specify overrides for tables that do not exist yet.
+
+:::warning
+It is easy to break replication with table overrides if not used with care. For example:
+
+* If an ALIAS column is added with a table override, and a column with the same name is later added to the source
+ MySQL table, the converted ALTER TABLE query in ClickHouse will fail and replication stops.
+* It is currently possible to add overrides that reference nullable columns where non-nullable columns are required, such as in
+  `ORDER BY` or `PARTITION BY`. This will cause CREATE TABLE queries to fail, which also stops replication.
+:::
+
+## Examples of Use {#examples-of-use}
+
+Queries in MySQL:
+
+``` sql
+mysql> CREATE DATABASE db;
+mysql> CREATE TABLE db.test (a INT PRIMARY KEY, b INT);
+mysql> INSERT INTO db.test VALUES (1, 11), (2, 22);
+mysql> DELETE FROM db.test WHERE a=1;
+mysql> ALTER TABLE db.test ADD COLUMN c VARCHAR(16);
+mysql> UPDATE db.test SET c='Wow!', b=222;
+mysql> SELECT * FROM test;
+```
+
+```text
+┌─a─┬───b─┬─c────┐
+│ 2 │ 222 │ Wow! │
+└───┴─────┴──────┘
+```
+
+Database in ClickHouse, exchanging data with the MySQL server:
+
+The database and the table created:
+
+``` sql
+CREATE DATABASE mysql ENGINE = MaterializedMySQL('localhost:3306', 'db', 'user', '***');
+SHOW TABLES FROM mysql;
+```
+
+``` text
+┌─name─┐
+│ test │
+└──────┘
+```
+
+After inserting data:
+
+``` sql
+SELECT * FROM mysql.test;
+```
+
+``` text
+┌─a─┬──b─┐
+│ 1 │ 11 │
+│ 2 │ 22 │
+└───┴────┘
+```
+
+After deleting data, adding the column and updating:
+
+``` sql
+SELECT * FROM mysql.test;
+```
+
+``` text
+┌─a─┬───b─┬─c────┐
+│ 2 │ 222 │ Wow! │
+└───┴─────┴──────┘
+```
+
+[Original article](https://clickhouse.com/docs/en/engines/database-engines/materialized-mysql/)
diff --git a/docs/en/reference/engines/database-engines/materialized-postgresql.md b/docs/en/reference/engines/database-engines/materialized-postgresql.md
new file mode 100644
index 00000000000..ff8f7b192e0
--- /dev/null
+++ b/docs/en/reference/engines/database-engines/materialized-postgresql.md
@@ -0,0 +1,279 @@
+---
+sidebar_label: MaterializedPostgreSQL
+sidebar_position: 60
+---
+
+# [experimental] MaterializedPostgreSQL {#materialize-postgresql}
+
+Creates a ClickHouse database with tables from a PostgreSQL database. First, a database with the `MaterializedPostgreSQL` engine creates a snapshot of the PostgreSQL database and loads the required tables. The required tables can include any subset of tables from any subset of schemas of the specified database. Along with the snapshot, the database engine acquires the LSN, and once the initial dump of tables is performed, it starts pulling updates from the WAL. After the database is created, tables newly added to the PostgreSQL database are not automatically added to replication. They have to be added manually with the `ATTACH TABLE db.table` query.
+
+Replication is implemented with the PostgreSQL Logical Replication Protocol, which does not allow replicating DDL, but makes it possible to know whether replication-breaking changes happened (column type changes, adding/removing columns). Such changes are detected, and the corresponding tables stop receiving updates. These tables can be automatically reloaded in the background if the required setting is turned on (available starting from version 22.1). The safest way for now is to use `DETACH`/`ATTACH` queries to reload a table completely. If the DDL does not break replication (for example, renaming a column), the table will still receive updates (insertion is done by position).
+
+## Creating a Database {#creating-a-database}
+
+``` sql
+CREATE DATABASE [IF NOT EXISTS] db_name [ON CLUSTER cluster]
+ENGINE = MaterializedPostgreSQL('host:port', 'database', 'user', 'password') [SETTINGS ...]
+```
+
+**Engine Parameters**
+
+- `host:port` — PostgreSQL server endpoint.
+- `database` — PostgreSQL database name.
+- `user` — PostgreSQL user.
+- `password` — User password.
+
+## Example of Use {#example-of-use}
+
+``` sql
+CREATE DATABASE postgres_db
+ENGINE = MaterializedPostgreSQL('postgres1:5432', 'postgres_database', 'postgres_user', 'postgres_password');
+
+SHOW TABLES FROM postgres_db;
+
+┌─name───┐
+│ table1 │
+└────────┘
+
+SELECT * FROM postgres_db.table1;
+```
+
+## Dynamically adding new tables to replication {#dynamically-adding-table-to-replication}
+
+After a `MaterializedPostgreSQL` database is created, it does not automatically detect new tables in the corresponding PostgreSQL database. Such tables can be added manually:
+
+``` sql
+ATTACH TABLE postgres_database.new_table;
+```
+
+:::warning
+Before version 22.1, adding a table to replication left behind a temporary replication slot (named `{db_name}_ch_replication_slot_tmp`). If you attach tables in a ClickHouse version before 22.1, make sure to delete the slot manually (`SELECT pg_drop_replication_slot('{db_name}_ch_replication_slot_tmp')`). Otherwise disk usage will grow. This issue is fixed in 22.1.
+:::
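+
+A minimal sketch of the cleanup on the PostgreSQL side (the slot name follows the `{db_name}_ch_replication_slot_tmp` pattern; `db1` is an assumed database name):
+
+``` sql
+-- Run on PostgreSQL: list replication slots, then drop the leftover temporary one.
+SELECT slot_name, active FROM pg_replication_slots;
+SELECT pg_drop_replication_slot('db1_ch_replication_slot_tmp');
+```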
+
+## Dynamically removing tables from replication {#dynamically-removing-table-from-replication}
+
+It is possible to remove specific tables from replication:
+
+``` sql
+DETACH TABLE postgres_database.table_to_remove;
+```
+
+## PostgreSQL schema {#schema}
+
+A PostgreSQL [schema](https://www.postgresql.org/docs/9.1/ddl-schemas.html) can be configured in 3 ways (starting from version 21.12).
+
+1. One schema for one `MaterializedPostgreSQL` database engine. Requires the `materialized_postgresql_schema` setting.
+Tables are accessed via table name only:
+
+``` sql
+CREATE DATABASE postgres_database
+ENGINE = MaterializedPostgreSQL('postgres1:5432', 'postgres_database', 'postgres_user', 'postgres_password')
+SETTINGS materialized_postgresql_schema = 'postgres_schema';
+
+SELECT * FROM postgres_database.table1;
+```
+
+2. Any number of schemas with a specified set of tables for one `MaterializedPostgreSQL` database engine. Requires the `materialized_postgresql_tables_list` setting. Each table is written along with its schema.
+Tables are accessed via schema name and table name together:
+
+``` sql
+CREATE DATABASE database1
+ENGINE = MaterializedPostgreSQL('postgres1:5432', 'postgres_database', 'postgres_user', 'postgres_password')
+SETTINGS materialized_postgresql_tables_list = 'schema1.table1,schema2.table2,schema1.table3',
+ materialized_postgresql_tables_list_with_schema = 1;
+
+SELECT * FROM database1.`schema1.table1`;
+SELECT * FROM database1.`schema2.table2`;
+```
+
+In this case all tables in `materialized_postgresql_tables_list` must be written with their schema name, and
+`materialized_postgresql_tables_list_with_schema = 1` is required.
+
+Warning: in this case dots in table names are not allowed.
+
+3. Any number of schemas with the full set of tables for one `MaterializedPostgreSQL` database engine. Requires the `materialized_postgresql_schema_list` setting.
+
+``` sql
+CREATE DATABASE database1
+ENGINE = MaterializedPostgreSQL('postgres1:5432', 'postgres_database', 'postgres_user', 'postgres_password')
+SETTINGS materialized_postgresql_schema_list = 'schema1,schema2,schema3';
+
+SELECT * FROM database1.`schema1.table1`;
+SELECT * FROM database1.`schema1.table2`;
+SELECT * FROM database1.`schema2.table2`;
+```
+
+Warning: in this case dots in table names are not allowed.
+
+
+## Requirements {#requirements}
+
+1. The [wal_level](https://www.postgresql.org/docs/current/runtime-config-wal.html) setting must have the value `logical` and the `max_replication_slots` parameter must have a value of at least `2` in the PostgreSQL config file.
+
+2. Each replicated table must have one of the following [replica identities](https://www.postgresql.org/docs/10/sql-altertable.html#SQL-CREATETABLE-REPLICA-IDENTITY):
+
+- primary key (by default)
+
+- index
+
+``` bash
+postgres# CREATE TABLE postgres_table (a Integer NOT NULL, b Integer, c Integer NOT NULL, d Integer, e Integer NOT NULL);
+postgres# CREATE unique INDEX postgres_table_index on postgres_table(a, c, e);
+postgres# ALTER TABLE postgres_table REPLICA IDENTITY USING INDEX postgres_table_index;
+```
+
+The primary key is always checked first. If it is absent, then the index defined as the replica identity index is checked.
+If an index is used as the replica identity, there must be only one such index in the table.
+You can check which type is used for a specific table with the following command:
+
+``` bash
+postgres# SELECT CASE relreplident
+ WHEN 'd' THEN 'default'
+ WHEN 'n' THEN 'nothing'
+ WHEN 'f' THEN 'full'
+ WHEN 'i' THEN 'index'
+ END AS replica_identity
+FROM pg_class
+WHERE oid = 'postgres_table'::regclass;
+```
+
+:::warning
+Replication of [**TOAST**](https://www.postgresql.org/docs/9.5/storage-toast.html) values is not supported. The default value for the data type will be used.
+:::
+
+## Settings {#settings}
+
+1. `materialized_postgresql_tables_list` {#materialized-postgresql-tables-list}
+
+ Sets a comma-separated list of PostgreSQL database tables, which will be replicated via [MaterializedPostgreSQL](../../engines/database-engines/materialized-postgresql.md) database engine.
+
+ Default value: empty list — means whole PostgreSQL database will be replicated.
+
+2. `materialized_postgresql_schema` {#materialized-postgresql-schema}
+
+ Default value: empty string. (Default schema is used)
+
+3. `materialized_postgresql_schema_list` {#materialized-postgresql-schema-list}
+
+ Default value: empty list. (Default schema is used)
+
+4. `materialized_postgresql_allow_automatic_update` {#materialized-postgresql-allow-automatic-update}
+
+ Do not use this setting in versions before 22.1.
+
+ Allows reloading tables in the background when schema changes are detected. DDL queries on the PostgreSQL side are not replicated via the ClickHouse [MaterializedPostgreSQL](../../engines/database-engines/materialized-postgresql.md) engine, because this is not allowed by the PostgreSQL logical replication protocol, but the fact of DDL changes is detected transactionally. In this case, the default behaviour is to stop replicating those tables once DDL is detected. However, if this setting is enabled, then, instead of stopping the replication of those tables, they will be reloaded in the background via a database snapshot without data loss, and replication will continue for them.
+
+ Possible values:
+
+ - 0 — The table is not automatically updated in the background, when schema changes are detected.
+ - 1 — The table is automatically updated in the background, when schema changes are detected.
+
+ Default value: `0`.
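+
+ For example (a sketch; the server address and credentials are placeholders), the setting can be enabled when the database is created:
+
+ ``` sql
+ CREATE DATABASE postgres_db
+ ENGINE = MaterializedPostgreSQL('postgres1:5432', 'postgres_database', 'postgres_user', 'postgres_password')
+ SETTINGS materialized_postgresql_allow_automatic_update = 1;
+ ```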
+
+5. `materialized_postgresql_max_block_size` {#materialized-postgresql-max-block-size}
+
+ Sets the number of rows collected in memory before flushing data into PostgreSQL database table.
+
+ Possible values:
+
+ - Positive integer.
+
+ Default value: `65536`.
+
+6. `materialized_postgresql_replication_slot` {#materialized-postgresql-replication-slot}
+
+ A user-created replication slot. Must be used together with `materialized_postgresql_snapshot`.
+
+7. `materialized_postgresql_snapshot` {#materialized-postgresql-snapshot}
+
+ A text string identifying a snapshot, from which [initial dump of PostgreSQL tables](../../engines/database-engines/materialized-postgresql.md) will be performed. Must be used together with `materialized_postgresql_replication_slot`.
+
+ ``` sql
+ CREATE DATABASE database1
+ ENGINE = MaterializedPostgreSQL('postgres1:5432', 'postgres_database', 'postgres_user', 'postgres_password')
+ SETTINGS materialized_postgresql_tables_list = 'table1,table2,table3';
+
+ SELECT * FROM database1.table1;
+ ```
+
+ The settings can be changed, if necessary, using a DDL query. But it is impossible to change the setting `materialized_postgresql_tables_list`. To update the list of tables in this setting use the `ATTACH TABLE` query.
+
+ ``` sql
+ ALTER DATABASE postgres_database MODIFY SETTING materialized_postgresql_max_block_size = <new_size>;
+ ```
+
+## Notes {#notes}
+
+### Failover of the logical replication slot {#logical-replication-slot-failover}
+
+Logical Replication Slots which exist on the primary are not available on standby replicas.
+So if there is a failover, the new primary (the old physical standby) will not be aware of any slots that existed on the old primary. This will break replication from PostgreSQL.
+A solution is to manage replication slots yourself and define a permanent replication slot (some information can be found [here](https://patroni.readthedocs.io/en/latest/SETTINGS.html)). You'll need to pass the slot name via the `materialized_postgresql_replication_slot` setting, and it has to be exported with the `EXPORT SNAPSHOT` option. The snapshot identifier needs to be passed via the `materialized_postgresql_snapshot` setting.
+
+Please note that this should be used only if it is actually needed. If there is no real need for it, or no full understanding of why it is needed, it is better to allow the table engine to create and manage its own replication slot.
+
+**Example (from [@bchrobot](https://github.com/bchrobot))**
+
+1. Configure replication slot in PostgreSQL.
+
+ ```yaml
+ apiVersion: "acid.zalan.do/v1"
+ kind: postgresql
+ metadata:
+ name: acid-demo-cluster
+ spec:
+ numberOfInstances: 2
+ postgresql:
+ parameters:
+ wal_level: logical
+ patroni:
+ slots:
+ clickhouse_sync:
+ type: logical
+ database: demodb
+ plugin: pgoutput
+ ```
+
+2. Wait for replication slot to be ready, then begin a transaction and export the transaction snapshot identifier:
+
+ ```sql
+ BEGIN;
+ SELECT pg_export_snapshot();
+ ```
+
+3. In ClickHouse create database:
+
+ ```sql
+ CREATE DATABASE demodb
+ ENGINE = MaterializedPostgreSQL('postgres1:5432', 'postgres_database', 'postgres_user', 'postgres_password')
+ SETTINGS
+ materialized_postgresql_replication_slot = 'clickhouse_sync',
+ materialized_postgresql_snapshot = '0000000A-0000023F-3',
+ materialized_postgresql_tables_list = 'table1,table2,table3';
+ ```
+
+4. End the PostgreSQL transaction once replication to ClickHouse DB is confirmed. Verify that replication continues after failover:
+
+ ```bash
+ kubectl exec acid-demo-cluster-0 -c postgres -- su postgres -c 'patronictl failover --candidate acid-demo-cluster-1 --force'
+ ```
+
+### Required permissions
+
+1. [CREATE PUBLICATION](https://postgrespro.ru/docs/postgresql/14/sql-createpublication) -- create query privilege.
+
+2. [CREATE_REPLICATION_SLOT](https://postgrespro.ru/docs/postgrespro/10/protocol-replication#PROTOCOL-REPLICATION-CREATE-SLOT) -- replication privilege.
+
+3. [pg_drop_replication_slot](https://postgrespro.ru/docs/postgrespro/9.5/functions-admin#functions-replication) -- replication privilege or superuser.
+
+4. [DROP PUBLICATION](https://postgrespro.ru/docs/postgresql/10/sql-droppublication) -- owner of publication (`username` in MaterializedPostgreSQL engine itself).
+
+It is possible to avoid executing commands `2` and `3` and having those permissions by using the settings `materialized_postgresql_replication_slot` and `materialized_postgresql_snapshot`, but this should be done with great care.
+
+Access to tables:
+
+1. pg_publication
+
+2. pg_replication_slots
+
+3. pg_publication_tables
diff --git a/docs/en/reference/engines/database-engines/mysql.md b/docs/en/reference/engines/database-engines/mysql.md
new file mode 100644
index 00000000000..89a0786a9ec
--- /dev/null
+++ b/docs/en/reference/engines/database-engines/mysql.md
@@ -0,0 +1,151 @@
+---
+sidebar_position: 50
+sidebar_label: MySQL
+---
+
+# MySQL
+
+Allows connecting to databases on a remote MySQL server and performing `INSERT` and `SELECT` queries to exchange data between ClickHouse and MySQL.
+
+The `MySQL` database engine translates queries to the MySQL server, so you can perform operations such as `SHOW TABLES` or `SHOW CREATE TABLE`.
+
+You cannot perform the following queries:
+
+- `RENAME`
+- `CREATE TABLE`
+- `ALTER`
+
+## Creating a Database {#creating-a-database}
+
+``` sql
+CREATE DATABASE [IF NOT EXISTS] db_name [ON CLUSTER cluster]
+ENGINE = MySQL('host:port', ['database' | database], 'user', 'password')
+```
+
+**Engine Parameters**
+
+- `host:port` — MySQL server address.
+- `database` — Remote database name.
+- `user` — MySQL user.
+- `password` — User password.
+
+## Data Types Support {#data_types-support}
+
+| MySQL | ClickHouse |
+|----------------------------------|--------------------------------------------------------------|
+| UNSIGNED TINYINT | [UInt8](../../sql-reference/data-types/int-uint.md) |
+| TINYINT | [Int8](../../sql-reference/data-types/int-uint.md) |
+| UNSIGNED SMALLINT | [UInt16](../../sql-reference/data-types/int-uint.md) |
+| SMALLINT | [Int16](../../sql-reference/data-types/int-uint.md) |
+| UNSIGNED INT, UNSIGNED MEDIUMINT | [UInt32](../../sql-reference/data-types/int-uint.md) |
+| INT, MEDIUMINT | [Int32](../../sql-reference/data-types/int-uint.md) |
+| UNSIGNED BIGINT | [UInt64](../../sql-reference/data-types/int-uint.md) |
+| BIGINT | [Int64](../../sql-reference/data-types/int-uint.md) |
+| FLOAT | [Float32](../../sql-reference/data-types/float.md) |
+| DOUBLE | [Float64](../../sql-reference/data-types/float.md) |
+| DATE | [Date](../../sql-reference/data-types/date.md) |
+| DATETIME, TIMESTAMP | [DateTime](../../sql-reference/data-types/datetime.md) |
+| BINARY | [FixedString](../../sql-reference/data-types/fixedstring.md) |
+
+All other MySQL data types are converted into [String](../../sql-reference/data-types/string.md).
+
+[Nullable](../../sql-reference/data-types/nullable.md) is supported.
+
+## Global Variables Support {#global-variables-support}
+
+For better compatibility you may address global variables in MySQL style, as `@@identifier`.
+
+These variables are supported:
+- `version`
+- `max_allowed_packet`
+
+:::warning
+Currently these variables are stubs and do not correspond to anything real.
+:::
+
+Example:
+
+``` sql
+SELECT @@version;
+```
+
+## Examples of Use {#examples-of-use}
+
+Table in MySQL:
+
+``` text
+mysql> USE test;
+Database changed
+
+mysql> CREATE TABLE `mysql_table` (
+ -> `int_id` INT NOT NULL AUTO_INCREMENT,
+ -> `float` FLOAT NOT NULL,
+ -> PRIMARY KEY (`int_id`));
+Query OK, 0 rows affected (0,09 sec)
+
+mysql> insert into mysql_table (`int_id`, `float`) VALUES (1,2);
+Query OK, 1 row affected (0,00 sec)
+
+mysql> select * from mysql_table;
++--------+-------+
+| int_id | float |
++--------+-------+
+|      1 |     2 |
++--------+-------+
+1 row in set (0,00 sec)
+```
+
+Database in ClickHouse, exchanging data with the MySQL server:
+
+``` sql
+CREATE DATABASE mysql_db ENGINE = MySQL('localhost:3306', 'test', 'my_user', 'user_password')
+```
+
+``` sql
+SHOW DATABASES
+```
+
+``` text
+┌─name─────┐
+│ default │
+│ mysql_db │
+│ system │
+└──────────┘
+```
+
+``` sql
+SHOW TABLES FROM mysql_db
+```
+
+``` text
+┌─name─────────┐
+│ mysql_table │
+└──────────────┘
+```
+
+``` sql
+SELECT * FROM mysql_db.mysql_table
+```
+
+``` text
+┌─int_id─┬─float─┐
+│      1 │     2 │
+└────────┴───────┘
+```
+
+``` sql
+INSERT INTO mysql_db.mysql_table VALUES (3,4)
+```
+
+``` sql
+SELECT * FROM mysql_db.mysql_table
+```
+
+``` text
+┌─int_id─┬─float─┐
+│      1 │     2 │
+│      3 │     4 │
+└────────┴───────┘
+```
+
+[Original article](https://clickhouse.com/docs/en/database_engines/mysql/)
diff --git a/docs/en/reference/engines/database-engines/postgresql.md b/docs/en/reference/engines/database-engines/postgresql.md
new file mode 100644
index 00000000000..bc5e93d0923
--- /dev/null
+++ b/docs/en/reference/engines/database-engines/postgresql.md
@@ -0,0 +1,139 @@
+---
+sidebar_position: 40
+sidebar_label: PostgreSQL
+---
+
+# PostgreSQL {#postgresql}
+
+Allows connecting to databases on a remote [PostgreSQL](https://www.postgresql.org) server. Supports read and write operations (`SELECT` and `INSERT` queries) to exchange data between ClickHouse and PostgreSQL.
+
+Gives real-time access to the table list and table structure of the remote PostgreSQL server with the help of `SHOW TABLES` and `DESCRIBE TABLE` queries.
+
+Supports table structure modifications (`ALTER TABLE ... ADD|DROP COLUMN`). If the `use_table_cache` parameter (see the Engine Parameters below) is set to `1`, the table structure is cached and not checked for modifications, but it can be updated with `DETACH` and `ATTACH` queries.
+
+## Creating a Database {#creating-a-database}
+
+``` sql
+CREATE DATABASE test_database
+ENGINE = PostgreSQL('host:port', 'database', 'user', 'password'[, `schema`, `use_table_cache`]);
+```
+
+**Engine Parameters**
+
+- `host:port` — PostgreSQL server address.
+- `database` — Remote database name.
+- `user` — PostgreSQL user.
+- `password` — User password.
+- `schema` — PostgreSQL schema.
+- `use_table_cache` — Defines if the database table structure is cached or not. Optional. Default value: `0`.
+
+## Data Types Support {#data_types-support}
+
+| PostgreSQL       | ClickHouse                                                   |
+|------------------|--------------------------------------------------------------|
+| DATE | [Date](../../sql-reference/data-types/date.md) |
+| TIMESTAMP | [DateTime](../../sql-reference/data-types/datetime.md) |
+| REAL | [Float32](../../sql-reference/data-types/float.md) |
+| DOUBLE | [Float64](../../sql-reference/data-types/float.md) |
+| DECIMAL, NUMERIC | [Decimal](../../sql-reference/data-types/decimal.md) |
+| SMALLINT | [Int16](../../sql-reference/data-types/int-uint.md) |
+| INTEGER | [Int32](../../sql-reference/data-types/int-uint.md) |
+| BIGINT | [Int64](../../sql-reference/data-types/int-uint.md) |
+| SERIAL | [UInt32](../../sql-reference/data-types/int-uint.md) |
+| BIGSERIAL | [UInt64](../../sql-reference/data-types/int-uint.md) |
+| TEXT, CHAR | [String](../../sql-reference/data-types/string.md) |
+| INTEGER | Nullable([Int32](../../sql-reference/data-types/int-uint.md))|
+| ARRAY | [Array](../../sql-reference/data-types/array.md) |
+
+
+## Examples of Use {#examples-of-use}
+
+Database in ClickHouse, exchanging data with the PostgreSQL server:
+
+``` sql
+CREATE DATABASE test_database
+ENGINE = PostgreSQL('postgres1:5432', 'test_database', 'postgres', 'mysecretpassword', 1);
+```
+
+``` sql
+SHOW DATABASES;
+```
+
+``` text
+┌─name──────────┐
+│ default │
+│ test_database │
+│ system │
+└───────────────┘
+```
+
+``` sql
+SHOW TABLES FROM test_database;
+```
+
+``` text
+┌─name───────┐
+│ test_table │
+└────────────┘
+```
+
+Reading data from the PostgreSQL table:
+
+``` sql
+SELECT * FROM test_database.test_table;
+```
+
+``` text
+┌─id─┬─value─┐
+│ 1 │ 2 │
+└────┴───────┘
+```
+
+Writing data to the PostgreSQL table:
+
+``` sql
+INSERT INTO test_database.test_table VALUES (3,4);
+SELECT * FROM test_database.test_table;
+```
+
+``` text
+┌─id─┬─value─┐
+│  1 │     2 │
+│  3 │     4 │
+└────┴───────┘
+```
+
+Consider the table structure was modified in PostgreSQL:
+
+``` sql
+postgre> ALTER TABLE test_table ADD COLUMN data Text
+```
+
+As the `use_table_cache` parameter was set to `1` when the database was created, the table structure in ClickHouse was cached and therefore not modified:
+
+``` sql
+DESCRIBE TABLE test_database.test_table;
+```
+``` text
+┌─name───┬─type──────────────┐
+│ id │ Nullable(Integer) │
+│ value │ Nullable(Integer) │
+└────────┴───────────────────┘
+```
+
+After detaching the table and attaching it again, the structure was updated:
+
+``` sql
+DETACH TABLE test_database.test_table;
+ATTACH TABLE test_database.test_table;
+DESCRIBE TABLE test_database.test_table;
+```
+``` text
+┌─name───┬─type──────────────┐
+│ id │ Nullable(Integer) │
+│ value │ Nullable(Integer) │
+│ data │ Nullable(String) │
+└────────┴───────────────────┘
+```
+
+[Original article](https://clickhouse.com/docs/en/database-engines/postgresql/)
diff --git a/docs/en/reference/engines/database-engines/replicated.md b/docs/en/reference/engines/database-engines/replicated.md
new file mode 100644
index 00000000000..63d955dc889
--- /dev/null
+++ b/docs/en/reference/engines/database-engines/replicated.md
@@ -0,0 +1,123 @@
+---
+sidebar_position: 30
+sidebar_label: Replicated
+---
+
+# [experimental] Replicated {#replicated}
+
+The engine is based on the [Atomic](../../engines/database-engines/atomic.md) engine. It supports replication of metadata via a DDL log that is written to ZooKeeper and executed on all of the replicas of a given database.
+
+One ClickHouse server can have multiple replicated databases running and updating at the same time. However, there cannot be multiple replicas of the same replicated database on one server.
+
+## Creating a Database {#creating-a-database}
+``` sql
+ CREATE DATABASE testdb ENGINE = Replicated('zoo_path', 'shard_name', 'replica_name') [SETTINGS ...]
+```
+
+**Engine Parameters**
+
+- `zoo_path` — ZooKeeper path. The same ZooKeeper path corresponds to the same database.
+- `shard_name` — Shard name. Database replicas are grouped into shards by `shard_name`.
+- `replica_name` — Replica name. Replica names must be different for all replicas of the same shard.
+
+:::warning
+For [ReplicatedMergeTree](../table-engines/mergetree-family/replication.md#table_engines-replication) tables, if no arguments are provided, the default arguments are used: `/clickhouse/tables/{uuid}/{shard}` and `{replica}`. These can be changed in the server settings [default_replica_path](../../operations/server-configuration-parameters/settings.md#default_replica_path) and [default_replica_name](../../operations/server-configuration-parameters/settings.md#default_replica_name). The `{uuid}` macro is unfolded to the table's UUID, while `{shard}` and `{replica}` are unfolded to values from the server config, not from the database engine arguments. In the future, it will be possible to use the `shard_name` and `replica_name` of the Replicated database.
+:::
+
+## Specifics and Recommendations {#specifics-and-recommendations}
+
+DDL queries with `Replicated` database work in a similar way to [ON CLUSTER](../../sql-reference/distributed-ddl.md) queries, but with minor differences.
+
+First, the DDL request tries to execute on the initiator (the host that originally received the request from the user). If the request is not fulfilled, then the user immediately receives an error, other hosts do not try to fulfill it. If the request has been successfully completed on the initiator, then all other hosts will automatically retry until they complete it. The initiator will try to wait for the query to be completed on other hosts (no longer than [distributed_ddl_task_timeout](../../operations/settings/settings.md#distributed_ddl_task_timeout)) and will return a table with the query execution statuses on each host.
+
+The behavior in case of errors is regulated by the [distributed_ddl_output_mode](../../operations/settings/settings.md#distributed_ddl_output_mode) setting; for a `Replicated` database it is better to set it to `null_status_on_timeout`, i.e. if some hosts did not have time to execute the request within [distributed_ddl_task_timeout](../../operations/settings/settings.md#distributed_ddl_task_timeout), do not throw an exception, but show the `NULL` status for them in the table.
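+
+For example (a minimal sketch of a session-level setting; adjust to your setup):
+
+``` sql
+SET distributed_ddl_output_mode = 'null_status_on_timeout';
+```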
+
+The [system.clusters](../../operations/system-tables/clusters.md) system table contains a cluster named like the replicated database, which consists of all replicas of the database. This cluster is updated automatically when creating/deleting replicas, and it can be used for [Distributed](../../engines/table-engines/special/distributed.md#distributed) tables.
+
+When creating a new replica of the database, this replica creates tables by itself. If the replica has been unavailable for a long time and has lagged behind the replication log, it compares its local metadata with the current metadata in ZooKeeper, moves the extra tables with data to a separate non-replicated database (so as not to accidentally delete anything superfluous), creates the missing tables, and updates the table names if they have been renamed. The data is replicated at the `ReplicatedMergeTree` level, i.e. if the table is not replicated, the data will not be replicated (the database is responsible only for metadata).
+
+[`ALTER TABLE ATTACH|FETCH|DROP|DROP DETACHED|DETACH PARTITION|PART`](../../sql-reference/statements/alter/partition.md) queries are allowed but not replicated. The database engine will only add/fetch/remove the partition/part to the current replica. However, if the table itself uses a Replicated table engine, then the data will be replicated after using `ATTACH`.
+
+## Usage Example {#usage-example}
+
+Creating a cluster with three hosts:
+
+``` sql
+node1 :) CREATE DATABASE r ENGINE=Replicated('some/path/r','shard1','replica1');
+node2 :) CREATE DATABASE r ENGINE=Replicated('some/path/r','shard1','other_replica');
+node3 :) CREATE DATABASE r ENGINE=Replicated('some/path/r','other_shard','{replica}');
+```
+
+Running the DDL-query:
+
+``` sql
+CREATE TABLE r.rmt (n UInt64) ENGINE=ReplicatedMergeTree ORDER BY n;
+```
+
+``` text
+┌─────hosts────────────┬──status─┬─error─┬─num_hosts_remaining─┬─num_hosts_active─┐
+│ shard1|replica1 │ 0 │ │ 2 │ 0 │
+│ shard1|other_replica │ 0 │ │ 1 │ 0 │
+│ other_shard|r1 │ 0 │ │ 0 │ 0 │
+└──────────────────────┴─────────┴───────┴─────────────────────┴──────────────────┘
+```
+
+Showing the system table:
+
+``` sql
+SELECT cluster, shard_num, replica_num, host_name, host_address, port, is_local
+FROM system.clusters WHERE cluster='r';
+```
+
+``` text
+┌─cluster─┬─shard_num─┬─replica_num─┬─host_name─┬─host_address─┬─port─┬─is_local─┐
+│ r │ 1 │ 1 │ node3 │ 127.0.0.1 │ 9002 │ 0 │
+│ r │ 2 │ 1 │ node2 │ 127.0.0.1 │ 9001 │ 0 │
+│ r │ 2 │ 2 │ node1 │ 127.0.0.1 │ 9000 │ 1 │
+└─────────┴───────────┴─────────────┴───────────┴──────────────┴──────┴──────────┘
+```
+
+Creating a distributed table and inserting the data:
+
+``` sql
+node2 :) CREATE TABLE r.d (n UInt64) ENGINE=Distributed('r','r','rmt', n % 2);
+node3 :) INSERT INTO r.d SELECT * FROM numbers(10);
+node1 :) SELECT materialize(hostName()) AS host, groupArray(n) FROM r.d GROUP BY host;
+```
+
+``` text
+┌─hosts─┬─groupArray(n)─┐
+│ node1 │ [1,3,5,7,9] │
+│ node2 │ [0,2,4,6,8] │
+└───────┴───────────────┘
+```
+
+Adding a replica on one more host:
+
+``` sql
+node4 :) CREATE DATABASE r ENGINE=Replicated('some/path/r','other_shard','r2');
+```
+
+The cluster configuration will look like this:
+
+``` text
+┌─cluster─┬─shard_num─┬─replica_num─┬─host_name─┬─host_address─┬─port─┬─is_local─┐
+│ r │ 1 │ 1 │ node3 │ 127.0.0.1 │ 9002 │ 0 │
+│ r │ 1 │ 2 │ node4 │ 127.0.0.1 │ 9003 │ 0 │
+│ r │ 2 │ 1 │ node2 │ 127.0.0.1 │ 9001 │ 0 │
+│ r │ 2 │ 2 │ node1 │ 127.0.0.1 │ 9000 │ 1 │
+└─────────┴───────────┴─────────────┴───────────┴──────────────┴──────┴──────────┘
+```
+
+The distributed table will also get data from the new host:
+
+```sql
+node2 :) SELECT materialize(hostName()) AS host, groupArray(n) FROM r.d GROUP BY host;
+```
+
+```text
+┌─hosts─┬─groupArray(n)─┐
+│ node2 │ [1,3,5,7,9] │
+│ node4 │ [0,2,4,6,8] │
+└───────┴───────────────┘
+```
\ No newline at end of file
diff --git a/docs/en/reference/engines/database-engines/sqlite.md b/docs/en/reference/engines/database-engines/sqlite.md
new file mode 100644
index 00000000000..2f8b44c9a09
--- /dev/null
+++ b/docs/en/reference/engines/database-engines/sqlite.md
@@ -0,0 +1,80 @@
+---
+sidebar_position: 55
+sidebar_label: SQLite
+---
+
+# SQLite {#sqlite}
+
+Allows connecting to an [SQLite](https://www.sqlite.org/index.html) database and performing `INSERT` and `SELECT` queries to exchange data between ClickHouse and SQLite.
+
+## Creating a Database {#creating-a-database}
+
+``` sql
+ CREATE DATABASE sqlite_database
+ ENGINE = SQLite('db_path')
+```
+
+**Engine Parameters**
+
+- `db_path` — Path to a file with SQLite database.
+
+## Data Types Support {#data_types-support}
+
+| SQLite | ClickHouse |
+|---------------|---------------------------------------------------------|
+| INTEGER | [Int32](../../sql-reference/data-types/int-uint.md) |
+| REAL | [Float32](../../sql-reference/data-types/float.md) |
+| TEXT | [String](../../sql-reference/data-types/string.md) |
+| BLOB | [String](../../sql-reference/data-types/string.md) |
+
+## Specifics and Recommendations {#specifics-and-recommendations}
+
+SQLite stores the entire database (definitions, tables, indices, and the data itself) as a single cross-platform file on a host machine. During writes, SQLite locks the entire database file, therefore write operations are performed sequentially. Read operations can run concurrently.
+SQLite does not require service management (such as startup scripts) or access control based on `GRANT` and passwords. Access control is handled by means of file-system permissions given to the database file itself.
+
+## Usage Example {#usage-example}
+
+Database in ClickHouse, connected to SQLite:
+
+``` sql
+CREATE DATABASE sqlite_db ENGINE = SQLite('sqlite.db');
+SHOW TABLES FROM sqlite_db;
+```
+
+``` text
+┌──name───┐
+│ table1 │
+│ table2 │
+└─────────┘
+```
+
+Showing the data from `sqlite_db.table1`:
+
+``` sql
+SELECT * FROM sqlite_db.table1;
+```
+
+``` text
+┌─col1──┬─col2─┐
+│ line1 │ 1 │
+│ line2 │ 2 │
+│ line3 │ 3 │
+└───────┴──────┘
+```
+Inserting data into SQLite table from ClickHouse table:
+
+``` sql
+CREATE TABLE clickhouse_table(`col1` String,`col2` Int16) ENGINE = MergeTree() ORDER BY col2;
+INSERT INTO clickhouse_table VALUES ('text',10);
+INSERT INTO sqlite_db.table1 SELECT * FROM clickhouse_table;
+SELECT * FROM sqlite_db.table1;
+```
+
+``` text
+┌─col1──┬─col2─┐
+│ line1 │ 1 │
+│ line2 │ 2 │
+│ line3 │ 3 │
+│ text │ 10 │
+└───────┴──────┘
+```
diff --git a/docs/en/reference/engines/table-engines/index.md b/docs/en/reference/engines/table-engines/index.md
new file mode 100644
index 00000000000..09e0147bbf7
--- /dev/null
+++ b/docs/en/reference/engines/table-engines/index.md
@@ -0,0 +1,89 @@
+---
+toc_folder_title: Table Engines
+toc_priority: 26
+toc_title: Introduction
+---
+
+# Table Engines {#table_engines}
+
+The table engine (type of table) determines:
+
+- How and where data is stored, where to write it to, and where to read it from.
+- Which queries are supported, and how.
+- Concurrent data access.
+- Use of indexes, if present.
+- Whether multithreaded request execution is possible.
+- Data replication parameters.
+
+## Engine Families {#engine-families}
+
+### MergeTree {#mergetree}
+
+The most universal and functional table engines for high-load tasks. The property shared by these engines is quick data insertion with subsequent background data processing. `MergeTree` family engines support data replication (with [Replicated\*](../../engines/table-engines/mergetree-family/replication.md#table_engines-replication) versions of engines), partitioning, secondary data-skipping indexes, and other features not supported in other engines.
+
+Engines in the family:
+
+- [MergeTree](../../engines/table-engines/mergetree-family/mergetree.md#mergetree)
+- [ReplacingMergeTree](../../engines/table-engines/mergetree-family/replacingmergetree.md#replacingmergetree)
+- [SummingMergeTree](../../engines/table-engines/mergetree-family/summingmergetree.md#summingmergetree)
+- [AggregatingMergeTree](../../engines/table-engines/mergetree-family/aggregatingmergetree.md#aggregatingmergetree)
+- [CollapsingMergeTree](../../engines/table-engines/mergetree-family/collapsingmergetree.md#table_engine-collapsingmergetree)
+- [VersionedCollapsingMergeTree](../../engines/table-engines/mergetree-family/versionedcollapsingmergetree.md#versionedcollapsingmergetree)
+- [GraphiteMergeTree](../../engines/table-engines/mergetree-family/graphitemergetree.md#graphitemergetree)
+
+### Log {#log}
+
+Lightweight [engines](../../engines/table-engines/log-family/index.md) with minimum functionality. They’re the most effective when you need to quickly write many small tables (up to approximately 1 million rows) and read them later as a whole.
+
+Engines in the family:
+
+- [TinyLog](../../engines/table-engines/log-family/tinylog.md#tinylog)
+- [StripeLog](../../engines/table-engines/log-family/stripelog.md#stripelog)
+- [Log](../../engines/table-engines/log-family/log.md#log)
+
+### Integration Engines {#integration-engines}
+
+Engines for communicating with other data storage and processing systems.
+
+Engines in the family:
+
+
+- [ODBC](../../engines/table-engines/integrations/odbc.md)
+- [JDBC](../../engines/table-engines/integrations/jdbc.md)
+- [MySQL](../../engines/table-engines/integrations/mysql.md)
+- [MongoDB](../../engines/table-engines/integrations/mongodb.md)
+- [HDFS](../../engines/table-engines/integrations/hdfs.md)
+- [S3](../../engines/table-engines/integrations/s3.md)
+- [Kafka](../../engines/table-engines/integrations/kafka.md)
+- [EmbeddedRocksDB](../../engines/table-engines/integrations/embedded-rocksdb.md)
+- [RabbitMQ](../../engines/table-engines/integrations/rabbitmq.md)
+- [PostgreSQL](../../engines/table-engines/integrations/postgresql.md)
+
+### Special Engines {#special-engines}
+
+Engines in the family:
+
+- [Distributed](../../engines/table-engines/special/distributed.md#distributed)
+- [MaterializedView](../../engines/table-engines/special/materializedview.md#materializedview)
+- [Dictionary](../../engines/table-engines/special/dictionary.md#dictionary)
+- [Merge](../../engines/table-engines/special/merge.md#merge)
+- [File](../../engines/table-engines/special/file.md#file)
+- [Null](../../engines/table-engines/special/null.md#null)
+- [Set](../../engines/table-engines/special/set.md#set)
+- [Join](../../engines/table-engines/special/join.md#join)
+- [URL](../../engines/table-engines/special/url.md#table_engines-url)
+- [View](../../engines/table-engines/special/view.md#table_engines-view)
+- [Memory](../../engines/table-engines/special/memory.md#memory)
+- [Buffer](../../engines/table-engines/special/buffer.md#buffer)
+
+## Virtual Columns {#table_engines-virtual_columns}
+
+A virtual column is an integral table engine attribute that is defined in the engine source code.
+
+You shouldn’t specify virtual columns in the `CREATE TABLE` query and you can’t see them in `SHOW CREATE TABLE` and `DESCRIBE TABLE` query results. Virtual columns are also read-only, so you can’t insert data into virtual columns.
+
+To select data from a virtual column, you must specify its name in the `SELECT` query. `SELECT *` does not return values from virtual columns.
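+
+For example (a minimal sketch, assuming a hypothetical MergeTree table named `t`; `_part` is a MergeTree virtual column holding the data part name):
+
+``` sql
+-- _part must be named explicitly, because SELECT * does not return virtual columns.
+SELECT _part, * FROM t;
+```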
+
+If you create a table with a column that has the same name as one of the table virtual columns, the virtual column becomes inaccessible. We do not recommend doing this. To help avoid conflicts, virtual column names are usually prefixed with an underscore.
+
+[Original article](https://clickhouse.com/docs/en/engines/table-engines/)
diff --git a/docs/en/reference/engines/table-engines/integrations/ExternalDistributed.md b/docs/en/reference/engines/table-engines/integrations/ExternalDistributed.md
new file mode 100644
index 00000000000..c9aae1934db
--- /dev/null
+++ b/docs/en/reference/engines/table-engines/integrations/ExternalDistributed.md
@@ -0,0 +1,56 @@
+---
+sidebar_position: 12
+sidebar_label: ExternalDistributed
+---
+
+# ExternalDistributed {#externaldistributed}
+
+The `ExternalDistributed` engine allows performing `SELECT` queries on data that is stored on remote MySQL or PostgreSQL servers. It accepts the [MySQL](../../../engines/table-engines/integrations/mysql.md) or [PostgreSQL](../../../engines/table-engines/integrations/postgresql.md) engines as an argument, so sharding is possible.
+
+## Creating a Table {#creating-a-table}
+
+``` sql
+CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
+(
+ name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1] [TTL expr1],
+ name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2] [TTL expr2],
+ ...
+) ENGINE = ExternalDistributed('engine', 'host:port', 'database', 'table', 'user', 'password');
+```
+
+See a detailed description of the [CREATE TABLE](../../../sql-reference/statements/create/table.md#create-table-query) query.
+
+The table structure can differ from the original table structure:
+
+- Column names should be the same as in the original table, but you can use just some of these columns and in any order.
+- Column types may differ from those in the original table. ClickHouse tries to [cast](../../../sql-reference/functions/type-conversion-functions.md#type_conversion_function-cast) values to the ClickHouse data types.
+
+**Engine Parameters**
+
+- `engine` — The table engine `MySQL` or `PostgreSQL`.
+- `host:port` — MySQL or PostgreSQL server address.
+- `database` — Remote database name.
+- `table` — Remote table name.
+- `user` — User name.
+- `password` — User password.
+
+## Implementation Details {#implementation-details}
+
+Multiple replicas are supported and must be separated by `|`, while shards must be separated by `,`. For example:
+
+```sql
+CREATE TABLE test_shards (id UInt32, name String, age UInt32, money UInt32) ENGINE = ExternalDistributed('MySQL', 'mysql{1|2}:3306,mysql{3|4}:3306', 'clickhouse', 'test_replicas', 'root', 'clickhouse');
+```
+
+When specifying replicas, one of the available replicas is selected for each of the shards when reading. If the connection fails, the next replica is selected, and so on for all the replicas. If the connection attempt fails for all the replicas, the attempt is repeated the same way several times.
+
+You can specify any number of shards and any number of replicas for each shard.
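+
+For example (a sketch with assumed host names and credentials), the same shard/replica syntax can be used with the `PostgreSQL` engine:
+
+```sql
+CREATE TABLE test_shards_pg (id UInt32, name String, age UInt32, money UInt32) ENGINE = ExternalDistributed('PostgreSQL', 'postgres{1|2}:5432,postgres{3|4}:5432', 'clickhouse', 'test_replicas', 'postgres', 'password');
+```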
+
+**See Also**
+
+- [MySQL table engine](../../../engines/table-engines/integrations/mysql.md)
+- [PostgreSQL table engine](../../../engines/table-engines/integrations/postgresql.md)
+- [Distributed table engine](../../../engines/table-engines/special/distributed.md)
+
+
+[Original article](https://clickhouse.com/docs/en/engines/table-engines/integrations/ExternalDistributed/)
diff --git a/docs/en/reference/engines/table-engines/integrations/embedded-rocksdb.md b/docs/en/reference/engines/table-engines/integrations/embedded-rocksdb.md
new file mode 100644
index 00000000000..701d190f022
--- /dev/null
+++ b/docs/en/reference/engines/table-engines/integrations/embedded-rocksdb.md
@@ -0,0 +1,84 @@
+---
+sidebar_position: 9
+sidebar_label: EmbeddedRocksDB
+---
+
+# EmbeddedRocksDB Engine {#EmbeddedRocksDB-engine}
+
+This engine allows integrating ClickHouse with [rocksdb](http://rocksdb.org/).
+
+## Creating a Table {#table_engine-EmbeddedRocksDB-creating-a-table}
+
+``` sql
+CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
+(
+ name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
+ name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2],
+ ...
+) ENGINE = EmbeddedRocksDB PRIMARY KEY(primary_key_name)
+```
+
+Required parameters:
+
+- `primary_key_name` – any column name in the column list.
+- A `primary key` must be specified; only one column is supported in the primary key. The primary key is serialized in binary as a RocksDB key.
+- Columns other than the primary key are serialized in binary as the RocksDB value, in the corresponding order.
+- Queries with key `equals` or `in` filtering are optimized to multi-key lookups from RocksDB (see the lookup sketch after the example below).
+
+Example:
+
+``` sql
+CREATE TABLE test
+(
+ `key` String,
+ `v1` UInt32,
+ `v2` String,
+ `v3` Float32
+)
+ENGINE = EmbeddedRocksDB
+PRIMARY KEY key
+```
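+
+Point lookups on the primary key of the `test` table above are then served as direct RocksDB key lookups (a sketch; the key values are illustrative):
+
+``` sql
+SELECT * FROM test WHERE key = 'some_key';
+SELECT * FROM test WHERE key IN ('key-1', 'key-2', 'key-3');
+```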
+
+## Metrics
+
+There is also a `system.rocksdb` table that exposes RocksDB statistics:
+
+```sql
+SELECT
+ name,
+ value
+FROM system.rocksdb
+
+┌─name──────────────────────┬─value─┐
+│ no.file.opens │ 1 │
+│ number.block.decompressed │ 1 │
+└───────────────────────────┴───────┘
+```
+
+## Configuration
+
+You can also change any [RocksDB options](https://github.com/facebook/rocksdb/wiki/Option-String-and-Option-Map) using the server config:
+
+```xml
+<rocksdb>
+    <options>
+        <max_background_jobs>8</max_background_jobs>
+    </options>
+    <column_family_options>
+        <num_levels>2</num_levels>
+    </column_family_options>
+    <tables>
+        <table>
+            <name>TABLE</name>
+            <options>
+                <max_background_jobs>8</max_background_jobs>
+            </options>
+            <column_family_options>
+                <num_levels>2</num_levels>
+            </column_family_options>
+        </table>
+    </tables>
+</rocksdb>
+```
+
+[Original article](https://clickhouse.com/docs/en/engines/table-engines/integrations/embedded-rocksdb/)
diff --git a/docs/en/reference/engines/table-engines/integrations/hdfs.md b/docs/en/reference/engines/table-engines/integrations/hdfs.md
new file mode 100644
index 00000000000..503bd779abf
--- /dev/null
+++ b/docs/en/reference/engines/table-engines/integrations/hdfs.md
@@ -0,0 +1,230 @@
+---
+sidebar_position: 6
+sidebar_label: HDFS
+---
+
+# HDFS {#table_engines-hdfs}
+
+This engine provides integration with the [Apache Hadoop](https://en.wikipedia.org/wiki/Apache_Hadoop) ecosystem by allowing you to manage data on [HDFS](https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html) via ClickHouse. This engine is similar to the [File](../../../engines/table-engines/special/file.md#table_engines-file) and [URL](../../../engines/table-engines/special/url.md#table_engines-url) engines, but provides Hadoop-specific features.
+
+## Usage {#usage}
+
+``` sql
+ENGINE = HDFS(URI, format)
+```
+
+**Engine Parameters**
+
+- `URI` - the whole file URI in HDFS. The path part of `URI` may contain globs; in this case the table is read-only.
+- `format` - specifies one of the available file formats. To perform
+`SELECT` queries, the format must be supported for input, and to perform
+`INSERT` queries – for output. The available formats are listed in the
+[Formats](../../../interfaces/formats.md#formats) section.
+
+**Example:**
+
+**1.** Set up the `hdfs_engine_table` table:
+
+``` sql
+CREATE TABLE hdfs_engine_table (name String, value UInt32) ENGINE=HDFS('hdfs://hdfs1:9000/other_storage', 'TSV')
+```
+
+**2.** Fill file:
+
+``` sql
+INSERT INTO hdfs_engine_table VALUES ('one', 1), ('two', 2), ('three', 3)
+```
+
+**3.** Query the data:
+
+``` sql
+SELECT * FROM hdfs_engine_table LIMIT 2
+```
+
+``` text
+┌─name─┬─value─┐
+│ one │ 1 │
+│ two │ 2 │
+└──────┴───────┘
+```
+
+## Implementation Details {#implementation-details}
+
+- Reads and writes can be parallel.
+- [Zero-copy](../../../operations/storing-data.md#zero-copy) replication is supported.
+- Not supported:
+ - `ALTER` and `SELECT...SAMPLE` operations.
+ - Indexes.
+
+**Globs in path**
+
+Multiple path components can have globs. To be processed, a file must exist and match the whole path pattern. The listing of files is determined during `SELECT` (not at `CREATE` time).
+
+- `*` — Substitutes any number of any characters except `/` including empty string.
+- `?` — Substitutes any single character.
+- `{some_string,another_string,yet_another_one}` — Substitutes any of strings `'some_string', 'another_string', 'yet_another_one'`.
+- `{N..M}` — Substitutes any number in range from N to M including both borders.
+
+Constructions with `{}` are similar to the [remote](../../../sql-reference/table-functions/remote.md) table function.
+
+**Example**
+
+1. Suppose we have several files in TSV format with the following URIs on HDFS:
+
+ - 'hdfs://hdfs1:9000/some_dir/some_file_1'
+ - 'hdfs://hdfs1:9000/some_dir/some_file_2'
+ - 'hdfs://hdfs1:9000/some_dir/some_file_3'
+ - 'hdfs://hdfs1:9000/another_dir/some_file_1'
+ - 'hdfs://hdfs1:9000/another_dir/some_file_2'
+ - 'hdfs://hdfs1:9000/another_dir/some_file_3'
+
+1. There are several ways to make a table consisting of all six files:
+
+
+
+``` sql
+CREATE TABLE table_with_range (name String, value UInt32) ENGINE = HDFS('hdfs://hdfs1:9000/{some,another}_dir/some_file_{1..3}', 'TSV')
+```
+
+Another way:
+
+``` sql
+CREATE TABLE table_with_question_mark (name String, value UInt32) ENGINE = HDFS('hdfs://hdfs1:9000/{some,another}_dir/some_file_?', 'TSV')
+```
+
+The table consists of all the files in both directories (all files should match the format and schema described in the query):
+
+``` sql
+CREATE TABLE table_with_asterisk (name String, value UInt32) ENGINE = HDFS('hdfs://hdfs1:9000/{some,another}_dir/*', 'TSV')
+```
+
+:::warning
+If the listing of files contains number ranges with leading zeros, use the construction with braces for each digit separately or use `?`.
+:::
+
+**Example**
+
+Create table with files named `file000`, `file001`, … , `file999`:
+
+``` sql
+CREATE TABLE big_table (name String, value UInt32) ENGINE = HDFS('hdfs://hdfs1:9000/big_dir/file{0..9}{0..9}{0..9}', 'CSV')
+```
+## Configuration {#configuration}
+
+Similar to GraphiteMergeTree, the HDFS engine supports extended configuration using the ClickHouse config file. There are two configuration keys that you can use: global (`hdfs`) and user-level (`hdfs_*`). The global configuration is applied first, and then the user-level configuration is applied (if it exists).
+
+``` xml
+<hdfs>
+    <hadoop_kerberos_keytab>/tmp/keytab/clickhouse.keytab</hadoop_kerberos_keytab>
+    <hadoop_kerberos_principal>clickuser@TEST.CLICKHOUSE.TECH</hadoop_kerberos_principal>
+    <hadoop_security_authentication>kerberos</hadoop_security_authentication>
+</hdfs>
+
+<hdfs_root>
+    <hadoop_kerberos_principal>root@TEST.CLICKHOUSE.TECH</hadoop_kerberos_principal>
+</hdfs_root>
+```
+
+### Configuration Options {#configuration-options}
+
+#### Supported by libhdfs3 {#supported-by-libhdfs3}
+
+
+| **parameter** | **default value** |
+| - | - |
+| rpc\_client\_connect\_tcpnodelay | true |
+| dfs\_client\_read\_shortcircuit | true |
+| output\_replace-datanode-on-failure | true |
+| input\_notretry-another-node | false |
+| input\_localread\_mappedfile | true |
+| dfs\_client\_use\_legacy\_blockreader\_local | false |
+| rpc\_client\_ping\_interval | 10 * 1000 |
+| rpc\_client\_connect\_timeout | 600 * 1000 |
+| rpc\_client\_read\_timeout | 3600 * 1000 |
+| rpc\_client\_write\_timeout | 3600 * 1000 |
+| rpc\_client\_socekt\_linger\_timeout | -1 |
+| rpc\_client\_connect\_retry | 10 |
+| rpc\_client\_timeout | 3600 * 1000 |
+| dfs\_default\_replica | 3 |
+| input\_connect\_timeout | 600 * 1000 |
+| input\_read\_timeout | 3600 * 1000 |
+| input\_write\_timeout | 3600 * 1000 |
+| input\_localread\_default\_buffersize | 1 * 1024 * 1024 |
+| dfs\_prefetchsize | 10 |
+| input\_read\_getblockinfo\_retry | 3 |
+| input\_localread\_blockinfo\_cachesize | 1000 |
+| input\_read\_max\_retry | 60 |
+| output\_default\_chunksize | 512 |
+| output\_default\_packetsize | 64 * 1024 |
+| output\_default\_write\_retry | 10 |
+| output\_connect\_timeout | 600 * 1000 |
+| output\_read\_timeout | 3600 * 1000 |
+| output\_write\_timeout | 3600 * 1000 |
+| output\_close\_timeout | 3600 * 1000 |
+| output\_packetpool\_size | 1024 |
+| output\_heeartbeat\_interval | 10 * 1000 |
+| dfs\_client\_failover\_max\_attempts | 15 |
+| dfs\_client\_read\_shortcircuit\_streams\_cache\_size | 256 |
+| dfs\_client\_socketcache\_expiryMsec | 3000 |
+| dfs\_client\_socketcache\_capacity | 16 |
+| dfs\_default\_blocksize | 64 * 1024 * 1024 |
+| dfs\_default\_uri | "hdfs://localhost:9000" |
+| hadoop\_security\_authentication | "simple" |
+| hadoop\_security\_kerberos\_ticket\_cache\_path | "" |
+| dfs\_client\_log\_severity | "INFO" |
+| dfs\_domain\_socket\_path | "" |
+
+
+[HDFS Configuration Reference](https://hawq.apache.org/docs/userguide/2.3.0.0-incubating/reference/HDFSConfigurationParameterReference.html) might explain some parameters.
+
+
+#### ClickHouse extras {#clickhouse-extras}
+
+| **parameter** | **default value** |
+| - | - |
+|hadoop\_kerberos\_keytab | "" |
+|hadoop\_kerberos\_principal | "" |
+|hadoop\_kerberos\_kinit\_command | kinit |
+|libhdfs3\_conf | "" |
+
+### Limitations {#limitations}
+* `hadoop_security_kerberos_ticket_cache_path` and `libhdfs3_conf` can be global only, not user specific
+
+## Kerberos support {#kerberos-support}
+
+If the `hadoop_security_authentication` parameter has the value `kerberos`, ClickHouse authenticates via Kerberos.
+Parameters are [here](#clickhouse-extras) and `hadoop_security_kerberos_ticket_cache_path` may be of help.
+Note that due to libhdfs3 limitations only the old-fashioned approach is supported;
+datanode communications are not secured by SASL (`HADOOP_SECURE_DN_USER` is a reliable indicator of such a
+security approach). Use `tests/integration/test_storage_kerberized_hdfs/hdfs_configs/bootstrap.sh` for reference.
+
+If `hadoop_kerberos_keytab`, `hadoop_kerberos_principal` or `hadoop_kerberos_kinit_command` is specified, `kinit` will be invoked. `hadoop_kerberos_keytab` and `hadoop_kerberos_principal` are mandatory in this case. `kinit` tool and krb5 configuration files are required.
+
+## HDFS Namenode HA support {#namenode-ha}
+
+libhdfs3 supports HDFS namenode HA.
+
+- Copy `hdfs-site.xml` from an HDFS node to `/etc/clickhouse-server/`.
+- Add the following piece to the ClickHouse config file:
+
+``` xml
+<libhdfs3_conf>/etc/clickhouse-server/hdfs-site.xml</libhdfs3_conf>
+```
+
+- Then use the `dfs.nameservices` tag value from `hdfs-site.xml` as the namenode address in the HDFS URI. For example, replace `hdfs://appadmin@192.168.101.11:8020/abc/` with `hdfs://appadmin@my_nameservice/abc/`.
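+
+For example (a sketch with assumed names), a table can then be created against the nameservice rather than a single namenode:
+
+``` sql
+CREATE TABLE hdfs_ha_table (name String, value UInt32) ENGINE = HDFS('hdfs://my_nameservice/some_dir/some_file', 'TSV')
+```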
+
+
+## Virtual Columns {#virtual-columns}
+
+- `_path` — Path to the file.
+- `_file` — Name of the file.
+
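+For example (a minimal sketch using the `hdfs_engine_table` created earlier; virtual columns must be selected explicitly):
+
+``` sql
+SELECT _path, _file, name, value FROM hdfs_engine_table;
+```
+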
+**See Also**
+
+- [Virtual columns](../../../engines/table-engines/index.md#table_engines-virtual_columns)
+
+[Original article](https://clickhouse.com/docs/en/engines/table-engines/integrations/hdfs/)
diff --git a/docs/en/reference/engines/table-engines/integrations/hive.md b/docs/en/reference/engines/table-engines/integrations/hive.md
new file mode 100644
index 00000000000..6731f0e7559
--- /dev/null
+++ b/docs/en/reference/engines/table-engines/integrations/hive.md
@@ -0,0 +1,410 @@
+---
+sidebar_position: 4
+sidebar_label: Hive
+---
+
+# Hive {#hive}
+
+The Hive engine allows you to perform `SELECT` queries on HDFS Hive tables. Currently it supports the following input formats:
+
+- Text: only supports simple scalar column types except `binary`
+
+- ORC: supports simple scalar column types except `char`; among complex types, only `array` is supported
+
+- Parquet: supports all simple scalar column types; among complex types, only `array` is supported
+
+## Creating a Table {#creating-a-table}
+
+``` sql
+CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
+(
+ name1 [type1] [ALIAS expr1],
+ name2 [type2] [ALIAS expr2],
+ ...
+) ENGINE = Hive('thrift://host:port', 'database', 'table')
+PARTITION BY expr
+```
+See a detailed description of the [CREATE TABLE](../../../sql-reference/statements/create/table.md#create-table-query) query.
+
+The table structure can differ from the original Hive table structure:
+- Column names should be the same as in the original Hive table, but you can use just some of these columns, in any order; you can also use alias columns calculated from other columns (see the sketch after this list).
+- Column types should be the same as those in the original Hive table.
+- The partition-by expression should be consistent with the original Hive table, and the columns in the partition-by expression should be in the table structure.
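+
+For instance (a sketch with assumed metastore address, database, table, and column names), an alias column can be derived from another column on the ClickHouse side:
+
+``` sql
+CREATE TABLE hive_events
+(
+    `user_id` UInt64,
+    `raw_url` String,
+    -- Hypothetical alias column computed in ClickHouse from raw_url:
+    `url_domain` String ALIAS domain(raw_url),
+    `day` String
+)
+ENGINE = Hive('thrift://hive-metastore:9083', 'db', 'events')
+PARTITION BY day
+```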
+
+**Engine Parameters**
+
+- `thrift://host:port` — Hive Metastore address
+
+- `database` — Remote database name.
+
+- `table` — Remote table name.
+
+## Usage Example {#usage-example}
+
+### How to Use Local Cache for HDFS Filesystem
+We strongly advise you to enable a local cache for remote filesystems. Benchmarks show that it is almost 2x faster with the cache.
+
+Before using the cache, add it to `config.xml`:
+``` xml
+<local_cache_for_remote_fs>
+    <enable>true</enable>
+    <root_dir>local_cache</root_dir>
+    <limit_size>559096952</limit_size>
+    <bytes_read_before_flush>1048576</bytes_read_before_flush>
+</local_cache_for_remote_fs>
+```
+
+- enable: ClickHouse will maintain a local cache for the remote filesystem (HDFS) after startup if true.
+- root_dir: Required. The root directory for storing local cache files for the remote filesystem.
+- limit_size: Required. The maximum size (in bytes) of local cache files.
+- bytes_read_before_flush: Controls the number of bytes read before flushing to the local filesystem when downloading a file from the remote filesystem. The default value is 1MB.
+
+When ClickHouse is started with the local cache for remote filesystems enabled, users can still choose not to use the cache with `SETTINGS use_local_cache_for_remote_fs = 0` in their query. `use_local_cache_for_remote_fs` is `false` by default.
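+
+For example (a hypothetical table name; a per-query toggle of the cache):
+
+``` sql
+SELECT * FROM hive_table SETTINGS use_local_cache_for_remote_fs = 1;
+SELECT * FROM hive_table SETTINGS use_local_cache_for_remote_fs = 0;
+```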
+
+### Query Hive Table with ORC Input Format
+
+#### Create Table in Hive
+``` text
+hive > CREATE TABLE `test`.`test_orc`(
+ `f_tinyint` tinyint,
+ `f_smallint` smallint,
+ `f_int` int,
+ `f_integer` int,
+ `f_bigint` bigint,
+ `f_float` float,
+ `f_double` double,
+ `f_decimal` decimal(10,0),
+ `f_timestamp` timestamp,
+ `f_date` date,
+ `f_string` string,
+ `f_varchar` varchar(100),
+ `f_bool` boolean,
+ `f_binary` binary,
+ `f_array_int` array<int>,
+ `f_array_string` array<string>,
+ `f_array_float` array<float>,
+ `f_array_array_int` array<array<int>>,
+ `f_array_array_string` array<array<string>>,
+ `f_array_array_float` array<array<float>>)
+PARTITIONED BY (
+ `day` string)
+ROW FORMAT SERDE
+ 'org.apache.hadoop.hive.ql.io.orc.OrcSerde'
+STORED AS INPUTFORMAT
+ 'org.apache.hadoop.hive.ql.io.orc.OrcInputFormat'
+OUTPUTFORMAT
+ 'org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat'
+LOCATION
+ 'hdfs://testcluster/data/hive/test.db/test_orc'
+
+OK
+Time taken: 0.51 seconds
+
+hive > insert into test.test_orc partition(day='2021-09-18') select 1, 2, 3, 4, 5, 6.11, 7.22, 8.333, current_timestamp(), current_date(), 'hello world', 'hello world', 'hello world', true, 'hello world', array(1, 2, 3), array('hello world', 'hello world'), array(float(1.1), float(1.2)), array(array(1, 2), array(3, 4)), array(array('a', 'b'), array('c', 'd')), array(array(float(1.11), float(2.22)), array(float(3.33), float(4.44)));
+OK
+Time taken: 36.025 seconds
+
+hive > select * from test.test_orc;
+OK
+1 2 3 4 5 6.11 7.22 8 2021-11-05 12:38:16.314 2021-11-05 hello world hello world hello world true hello world [1,2,3] ["hello world","hello world"] [1.1,1.2] [[1,2],[3,4]] [["a","b"],["c","d"]] [[1.11,2.22],[3.33,4.44]] 2021-09-18
+Time taken: 0.295 seconds, Fetched: 1 row(s)
+```
+
+#### Create Table in ClickHouse
+Table in ClickHouse, retrieving data from the Hive table created above:
+``` sql
+CREATE TABLE test.test_orc
+(
+ `f_tinyint` Int8,
+ `f_smallint` Int16,
+ `f_int` Int32,
+ `f_integer` Int32,
+ `f_bigint` Int64,
+ `f_float` Float32,
+ `f_double` Float64,
+ `f_decimal` Float64,
+ `f_timestamp` DateTime,
+ `f_date` Date,
+ `f_string` String,
+ `f_varchar` String,
+ `f_bool` Bool,
+ `f_binary` String,
+ `f_array_int` Array(Int32),
+ `f_array_string` Array(String),
+ `f_array_float` Array(Float32),
+ `f_array_array_int` Array(Array(Int32)),
+ `f_array_array_string` Array(Array(String)),
+ `f_array_array_float` Array(Array(Float32)),
+ `day` String
+)
+ENGINE = Hive('thrift://202.168.117.26:9083', 'test', 'test_orc')
+PARTITION BY day
+
+```
+
+``` sql
+SELECT * FROM test.test_orc settings input_format_orc_allow_missing_columns = 1\G
+```
+
+``` text
+SELECT *
+FROM test.test_orc
+SETTINGS input_format_orc_allow_missing_columns = 1
+
+Query id: c3eaffdc-78ab-43cd-96a4-4acc5b480658
+
+Row 1:
+──────
+f_tinyint: 1
+f_smallint: 2
+f_int: 3
+f_integer: 4
+f_bigint: 5
+f_float: 6.11
+f_double: 7.22
+f_decimal: 8
+f_timestamp: 2021-12-04 04:00:44
+f_date: 2021-12-03
+f_string: hello world
+f_varchar: hello world
+f_bool: true
+f_binary: hello world
+f_array_int: [1,2,3]
+f_array_string: ['hello world','hello world']
+f_array_float: [1.1,1.2]
+f_array_array_int: [[1,2],[3,4]]
+f_array_array_string: [['a','b'],['c','d']]
+f_array_array_float: [[1.11,2.22],[3.33,4.44]]
+day: 2021-09-18
+
+
+1 rows in set. Elapsed: 0.078 sec.
+```
+
+### Query Hive Table with Parquet Input Format
+
+#### Create Table in Hive
+``` text
+hive >
+CREATE TABLE `test`.`test_parquet`(
+ `f_tinyint` tinyint,
+ `f_smallint` smallint,
+ `f_int` int,
+ `f_integer` int,
+ `f_bigint` bigint,
+ `f_float` float,
+ `f_double` double,
+ `f_decimal` decimal(10,0),
+ `f_timestamp` timestamp,
+ `f_date` date,
+ `f_string` string,
+ `f_varchar` varchar(100),
+ `f_char` char(100),
+ `f_bool` boolean,
+ `f_binary` binary,
+  `f_array_int` array<int>,
+  `f_array_string` array<string>,
+  `f_array_float` array<float>,
+  `f_array_array_int` array<array<int>>,
+  `f_array_array_string` array<array<string>>,
+  `f_array_array_float` array<array<float>>)
+PARTITIONED BY (
+ `day` string)
+ROW FORMAT SERDE
+ 'org.apache.hadoop.hive.ql.io.parquet.serde.ParquetHiveSerDe'
+STORED AS INPUTFORMAT
+ 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetInputFormat'
+OUTPUTFORMAT
+ 'org.apache.hadoop.hive.ql.io.parquet.MapredParquetOutputFormat'
+LOCATION
+ 'hdfs://testcluster/data/hive/test.db/test_parquet'
+OK
+Time taken: 0.51 seconds
+
+hive > insert into test.test_parquet partition(day='2021-09-18') select 1, 2, 3, 4, 5, 6.11, 7.22, 8.333, current_timestamp(), current_date(), 'hello world', 'hello world', 'hello world', true, 'hello world', array(1, 2, 3), array('hello world', 'hello world'), array(float(1.1), float(1.2)), array(array(1, 2), array(3, 4)), array(array('a', 'b'), array('c', 'd')), array(array(float(1.11), float(2.22)), array(float(3.33), float(4.44)));
+OK
+Time taken: 36.025 seconds
+
+hive > select * from test.test_parquet;
+OK
+1 2 3 4 5 6.11 7.22 8 2021-12-14 17:54:56.743 2021-12-14 hello world hello world hello world true hello world [1,2,3] ["hello world","hello world"] [1.1,1.2] [[1,2],[3,4]] [["a","b"],["c","d"]] [[1.11,2.22],[3.33,4.44]] 2021-09-18
+Time taken: 0.766 seconds, Fetched: 1 row(s)
+```
+
+#### Create Table in ClickHouse
+Table in ClickHouse, retrieving data from the Hive table created above:
+``` sql
+CREATE TABLE test.test_parquet
+(
+ `f_tinyint` Int8,
+ `f_smallint` Int16,
+ `f_int` Int32,
+ `f_integer` Int32,
+ `f_bigint` Int64,
+ `f_float` Float32,
+ `f_double` Float64,
+ `f_decimal` Float64,
+ `f_timestamp` DateTime,
+ `f_date` Date,
+ `f_string` String,
+ `f_varchar` String,
+ `f_char` String,
+ `f_bool` Bool,
+ `f_binary` String,
+ `f_array_int` Array(Int32),
+ `f_array_string` Array(String),
+ `f_array_float` Array(Float32),
+ `f_array_array_int` Array(Array(Int32)),
+ `f_array_array_string` Array(Array(String)),
+ `f_array_array_float` Array(Array(Float32)),
+ `day` String
+)
+ENGINE = Hive('thrift://localhost:9083', 'test', 'test_parquet')
+PARTITION BY day
+```
+
+``` sql
+SELECT * FROM test.test_parquet settings input_format_parquet_allow_missing_columns = 1\G
+```
+
+``` text
+SELECT *
+FROM test_parquet
+SETTINGS input_format_parquet_allow_missing_columns = 1
+
+Query id: 4e35cf02-c7b2-430d-9b81-16f438e5fca9
+
+Row 1:
+──────
+f_tinyint: 1
+f_smallint: 2
+f_int: 3
+f_integer: 4
+f_bigint: 5
+f_float: 6.11
+f_double: 7.22
+f_decimal: 8
+f_timestamp: 2021-12-14 17:54:56
+f_date: 2021-12-14
+f_string: hello world
+f_varchar: hello world
+f_char: hello world
+f_bool: true
+f_binary: hello world
+f_array_int: [1,2,3]
+f_array_string: ['hello world','hello world']
+f_array_float: [1.1,1.2]
+f_array_array_int: [[1,2],[3,4]]
+f_array_array_string: [['a','b'],['c','d']]
+f_array_array_float: [[1.11,2.22],[3.33,4.44]]
+day: 2021-09-18
+
+1 rows in set. Elapsed: 0.357 sec.
+```
+
+### Query Hive Table with Text Input Format
+#### Create Table in Hive
+``` text
+hive >
+CREATE TABLE `test`.`test_text`(
+ `f_tinyint` tinyint,
+ `f_smallint` smallint,
+ `f_int` int,
+ `f_integer` int,
+ `f_bigint` bigint,
+ `f_float` float,
+ `f_double` double,
+ `f_decimal` decimal(10,0),
+ `f_timestamp` timestamp,
+ `f_date` date,
+ `f_string` string,
+ `f_varchar` varchar(100),
+ `f_char` char(100),
+ `f_bool` boolean,
+ `f_binary` binary,
+  `f_array_int` array<int>,
+  `f_array_string` array<string>,
+  `f_array_float` array<float>,
+  `f_array_array_int` array<array<int>>,
+  `f_array_array_string` array<array<string>>,
+  `f_array_array_float` array<array<float>>)
+PARTITIONED BY (
+ `day` string)
+ROW FORMAT SERDE
+ 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'
+STORED AS INPUTFORMAT
+ 'org.apache.hadoop.mapred.TextInputFormat'
+OUTPUTFORMAT
+ 'org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat'
+LOCATION
+ 'hdfs://testcluster/data/hive/test.db/test_text'
+Time taken: 0.1 seconds, Fetched: 34 row(s)
+
+
+hive > insert into test.test_text partition(day='2021-09-18') select 1, 2, 3, 4, 5, 6.11, 7.22, 8.333, current_timestamp(), current_date(), 'hello world', 'hello world', 'hello world', true, 'hello world', array(1, 2, 3), array('hello world', 'hello world'), array(float(1.1), float(1.2)), array(array(1, 2), array(3, 4)), array(array('a', 'b'), array('c', 'd')), array(array(float(1.11), float(2.22)), array(float(3.33), float(4.44)));
+OK
+Time taken: 36.025 seconds
+
+hive > select * from test.test_text;
+OK
+1 2 3 4 5 6.11 7.22 8 2021-12-14 18:11:17.239 2021-12-14 hello world hello world hello world true hello world [1,2,3] ["hello world","hello world"] [1.1,1.2] [[1,2],[3,4]] [["a","b"],["c","d"]] [[1.11,2.22],[3.33,4.44]] 2021-09-18
+Time taken: 0.624 seconds, Fetched: 1 row(s)
+```
+
+#### Create Table in ClickHouse
+
+Table in ClickHouse, retrieving data from the Hive table created above:
+``` sql
+CREATE TABLE test.test_text
+(
+ `f_tinyint` Int8,
+ `f_smallint` Int16,
+ `f_int` Int32,
+ `f_integer` Int32,
+ `f_bigint` Int64,
+ `f_float` Float32,
+ `f_double` Float64,
+ `f_decimal` Float64,
+ `f_timestamp` DateTime,
+ `f_date` Date,
+ `f_string` String,
+ `f_varchar` String,
+ `f_char` String,
+ `f_bool` Bool,
+ `day` String
+)
+ENGINE = Hive('thrift://localhost:9083', 'test', 'test_text')
+PARTITION BY day
+```
+
+``` sql
+SELECT * FROM test.test_text settings input_format_skip_unknown_fields = 1, input_format_with_names_use_header = 1, date_time_input_format = 'best_effort'\G
+```
+
+``` text
+SELECT *
+FROM test.test_text
+SETTINGS input_format_skip_unknown_fields = 1, input_format_with_names_use_header = 1, date_time_input_format = 'best_effort'
+
+Query id: 55b79d35-56de-45b9-8be6-57282fbf1f44
+
+Row 1:
+──────
+f_tinyint: 1
+f_smallint: 2
+f_int: 3
+f_integer: 4
+f_bigint: 5
+f_float: 6.11
+f_double: 7.22
+f_decimal: 8
+f_timestamp: 2021-12-14 18:11:17
+f_date: 2021-12-14
+f_string: hello world
+f_varchar: hello world
+f_char: hello world
+f_bool: true
+day: 2021-09-18
+```
+
+[Original article](https://clickhouse.com/docs/en/engines/table-engines/integrations/hive/)
diff --git a/docs/en/reference/engines/table-engines/integrations/index.md b/docs/en/reference/engines/table-engines/integrations/index.md
new file mode 100644
index 00000000000..9230ad624ba
--- /dev/null
+++ b/docs/en/reference/engines/table-engines/integrations/index.md
@@ -0,0 +1,23 @@
+---
+sidebar_position: 40
+sidebar_label: Integrations
+---
+
+# Table Engines for Integrations {#table-engines-for-integrations}
+
+ClickHouse provides various means for integrating with external systems, including table engines. Like with all other table engines, the configuration is done using `CREATE TABLE` or `ALTER TABLE` queries. From a user perspective, the configured integration then looks like a normal table, but queries to it are proxied to the external system. This transparent querying is one of the key advantages of this approach over alternative integration methods, such as external dictionaries or table functions, which require the use of custom query methods on each use.
+
+List of supported integrations:
+
+- [ODBC](../../../engines/table-engines/integrations/odbc.md)
+- [JDBC](../../../engines/table-engines/integrations/jdbc.md)
+- [MySQL](../../../engines/table-engines/integrations/mysql.md)
+- [MongoDB](../../../engines/table-engines/integrations/mongodb.md)
+- [HDFS](../../../engines/table-engines/integrations/hdfs.md)
+- [S3](../../../engines/table-engines/integrations/s3.md)
+- [Kafka](../../../engines/table-engines/integrations/kafka.md)
+- [EmbeddedRocksDB](../../../engines/table-engines/integrations/embedded-rocksdb.md)
+- [RabbitMQ](../../../engines/table-engines/integrations/rabbitmq.md)
+- [PostgreSQL](../../../engines/table-engines/integrations/postgresql.md)
+- [SQLite](../../../engines/table-engines/integrations/sqlite.md)
+- [Hive](../../../engines/table-engines/integrations/hive.md)
diff --git a/docs/en/reference/engines/table-engines/integrations/jdbc.md b/docs/en/reference/engines/table-engines/integrations/jdbc.md
new file mode 100644
index 00000000000..0ce31f36070
--- /dev/null
+++ b/docs/en/reference/engines/table-engines/integrations/jdbc.md
@@ -0,0 +1,95 @@
+---
+sidebar_position: 3
+sidebar_label: JDBC
+---
+
+# JDBC {#table-engine-jdbc}
+
+Allows ClickHouse to connect to external databases via [JDBC](https://en.wikipedia.org/wiki/Java_Database_Connectivity).
+
+To implement the JDBC connection, ClickHouse uses the separate program [clickhouse-jdbc-bridge](https://github.com/ClickHouse/clickhouse-jdbc-bridge) that should run as a daemon.
+
+This engine supports the [Nullable](../../../sql-reference/data-types/nullable.md) data type.
+
+## Creating a Table {#creating-a-table}
+
+``` sql
+CREATE TABLE [IF NOT EXISTS] [db.]table_name
+(
+ columns list...
+)
+ENGINE = JDBC(datasource_uri, external_database, external_table)
+```
+
+**Engine Parameters**
+
+- `datasource_uri` — URI or name of an external DBMS.
+
+    URI Format: `jdbc:<driver_name>://<host_name>:<port>/?user=<username>&password=<password>`.
+    Example for MySQL: `jdbc:mysql://localhost:3306/?user=root&password=root`.
+
+- `external_database` — Database in an external DBMS.
+
+- `external_table` — Name of the table in `external_database` or a select query like `select * from table1 where column1=1`.
+
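+As noted above, the third argument can also be a `SELECT` query instead of a table name. A minimal sketch, reusing the hypothetical MySQL connection from the usage example below:
+
+``` sql
+CREATE TABLE jdbc_query_table
+(
+    `int_id` Int32
+)
+ENGINE = JDBC('jdbc:mysql://localhost:3306/?user=root&password=root', 'test', 'select int_id from test where int_id = 1')
+```
+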
+## Usage Example {#usage-example}
+
+Creating a table on the MySQL server by connecting directly with its console client:
+
+``` text
+mysql> CREATE TABLE `test`.`test` (
+ -> `int_id` INT NOT NULL AUTO_INCREMENT,
+ -> `int_nullable` INT NULL DEFAULT NULL,
+ -> `float` FLOAT NOT NULL,
+ -> `float_nullable` FLOAT NULL DEFAULT NULL,
+ -> PRIMARY KEY (`int_id`));
+Query OK, 0 rows affected (0,09 sec)
+
+mysql> insert into test (`int_id`, `float`) VALUES (1,2);
+Query OK, 1 row affected (0,00 sec)
+
+mysql> select * from test;
++--------+--------------+-------+----------------+
+| int_id | int_nullable | float | float_nullable |
++--------+--------------+-------+----------------+
+|      1 |         NULL |     2 |           NULL |
++--------+--------------+-------+----------------+
+1 row in set (0,00 sec)
+```
+
+Creating a table in ClickHouse server and selecting data from it:
+
+``` sql
+CREATE TABLE jdbc_table
+(
+ `int_id` Int32,
+ `int_nullable` Nullable(Int32),
+ `float` Float32,
+ `float_nullable` Nullable(Float32)
+)
+ENGINE JDBC('jdbc:mysql://localhost:3306/?user=root&password=root', 'test', 'test')
+```
+
+``` sql
+SELECT *
+FROM jdbc_table
+```
+
+``` text
+┌─int_id─┬─int_nullable─┬─float─┬─float_nullable─┐
+│ 1 │ ᴺᵁᴸᴸ │ 2 │ ᴺᵁᴸᴸ │
+└────────┴──────────────┴───────┴────────────────┘
+```
+
+``` sql
+INSERT INTO jdbc_table(`int_id`, `float`)
+SELECT toInt32(number), toFloat32(number * 1.0)
+FROM system.numbers
+```
+
+## See Also {#see-also}
+
+- [JDBC table function](../../../sql-reference/table-functions/jdbc.md).
+
+[Original article](https://clickhouse.com/docs/en/engines/table-engines/integrations/jdbc/)
diff --git a/docs/en/reference/engines/table-engines/integrations/kafka.md b/docs/en/reference/engines/table-engines/integrations/kafka.md
new file mode 100644
index 00000000000..3a8d98e1ca9
--- /dev/null
+++ b/docs/en/reference/engines/table-engines/integrations/kafka.md
@@ -0,0 +1,198 @@
+---
+sidebar_position: 8
+sidebar_label: Kafka
+---
+
+# Kafka {#kafka}
+
+This engine works with [Apache Kafka](http://kafka.apache.org/).
+
+Kafka lets you:
+
+- Publish or subscribe to data flows.
+- Organize fault-tolerant storage.
+- Process streams as they become available.
+
+## Creating a Table {#table_engine-kafka-creating-a-table}
+
+``` sql
+CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
+(
+ name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
+ name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2],
+ ...
+) ENGINE = Kafka()
+SETTINGS
+ kafka_broker_list = 'host:port',
+ kafka_topic_list = 'topic1,topic2,...',
+ kafka_group_name = 'group_name',
+ kafka_format = 'data_format'[,]
+ [kafka_row_delimiter = 'delimiter_symbol',]
+ [kafka_schema = '',]
+ [kafka_num_consumers = N,]
+ [kafka_max_block_size = 0,]
+ [kafka_skip_broken_messages = N,]
+ [kafka_commit_every_batch = 0,]
+ [kafka_thread_per_consumer = 0]
+```
+
+Required parameters:
+
+- `kafka_broker_list` — A comma-separated list of brokers (for example, `localhost:9092`).
+- `kafka_topic_list` — A list of Kafka topics.
+- `kafka_group_name` — A group of Kafka consumers. Reading margins are tracked for each group separately. If you do not want messages to be duplicated in the cluster, use the same group name everywhere.
+- `kafka_format` — Message format. Uses the same notation as the SQL `FORMAT` function, such as `JSONEachRow`. For more information, see the [Formats](../../../interfaces/formats.md) section.
+
+Optional parameters:
+
+- `kafka_row_delimiter` — Delimiter character, which ends the message.
+- `kafka_schema` — Parameter that must be used if the format requires a schema definition. For example, [Cap’n Proto](https://capnproto.org/) requires the path to the schema file and the name of the root `schema.capnp:Message` object.
+- `kafka_num_consumers` — The number of consumers per table. Default: `1`. Specify more consumers if the throughput of one consumer is insufficient. The total number of consumers should not exceed the number of partitions in the topic, since only one consumer can be assigned per partition.
+- `kafka_max_block_size` — The maximum batch size (in messages) for poll (default: `max_block_size`).
+- `kafka_skip_broken_messages` — Kafka message parser tolerance to schema-incompatible messages per block. Default: `0`. If `kafka_skip_broken_messages = N` then the engine skips *N* Kafka messages that cannot be parsed (a message equals a row of data).
+- `kafka_commit_every_batch` — Commit every consumed and handled batch instead of a single commit after writing a whole block (default: `0`).
+- `kafka_thread_per_consumer` — Provide an independent thread for each consumer (default: `0`). When enabled, every consumer flushes the data independently and in parallel (otherwise, rows from several consumers are squashed to form one block).
+
+Examples:
+
+``` sql
+ CREATE TABLE queue (
+ timestamp UInt64,
+ level String,
+ message String
+ ) ENGINE = Kafka('localhost:9092', 'topic', 'group1', 'JSONEachRow');
+
+ SELECT * FROM queue LIMIT 5;
+
+ CREATE TABLE queue2 (
+ timestamp UInt64,
+ level String,
+ message String
+ ) ENGINE = Kafka SETTINGS kafka_broker_list = 'localhost:9092',
+ kafka_topic_list = 'topic',
+ kafka_group_name = 'group1',
+ kafka_format = 'JSONEachRow',
+ kafka_num_consumers = 4;
+
+ CREATE TABLE queue3 (
+ timestamp UInt64,
+ level String,
+ message String
+ ) ENGINE = Kafka('localhost:9092', 'topic', 'group1')
+ SETTINGS kafka_format = 'JSONEachRow',
+ kafka_num_consumers = 4;
+```
+
+
+<details markdown="1">
+
+<summary>Deprecated Method for Creating a Table</summary>
+
+:::warning
+Do not use this method in new projects. If possible, switch old projects to the method described above.
+:::
+
+``` sql
+Kafka(kafka_broker_list, kafka_topic_list, kafka_group_name, kafka_format
+ [, kafka_row_delimiter, kafka_schema, kafka_num_consumers, kafka_skip_broken_messages])
+```
+
+</details>
+
+## Description {#description}
+
+The delivered messages are tracked automatically, so each message in a group is only counted once. If you want to get the data twice, then create a copy of the table with another group name.
+
+Groups are flexible and synced on the cluster. For instance, if you have 10 topics and 5 copies of a table in a cluster, then each copy gets 2 topics. If the number of copies changes, the topics are redistributed across the copies automatically. Read more about this at http://kafka.apache.org/intro.
+
+`SELECT` is not particularly useful for reading messages (except for debugging), because each message can be read only once. It is more practical to create real-time threads using materialized views. To do this:
+
+1. Use the engine to create a Kafka consumer and consider it a data stream.
+2. Create a table with the desired structure.
+3. Create a materialized view that converts data from the engine and puts it into a previously created table.
+
+When the `MATERIALIZED VIEW` joins the engine, it starts collecting data in the background. This allows you to continually receive messages from Kafka and convert them to the required format using `SELECT`.
+One Kafka table can have as many materialized views as you like. They do not read data from the Kafka table directly, but receive new records (in blocks). This way you can write to several tables with different levels of detail (with grouping and aggregation, or without).
+
+Example:
+
+``` sql
+ CREATE TABLE queue (
+ timestamp UInt64,
+ level String,
+ message String
+ ) ENGINE = Kafka('localhost:9092', 'topic', 'group1', 'JSONEachRow');
+
+ CREATE TABLE daily (
+ day Date,
+ level String,
+ total UInt64
+ ) ENGINE = SummingMergeTree(day, (day, level), 8192);
+
+ CREATE MATERIALIZED VIEW consumer TO daily
+ AS SELECT toDate(toDateTime(timestamp)) AS day, level, count() as total
+ FROM queue GROUP BY day, level;
+
+ SELECT level, sum(total) FROM daily GROUP BY level;
+```
+To improve performance, received messages are grouped into blocks the size of [max_insert_block_size](../../../operations/settings/settings.md#settings-max_insert_block_size). If the block wasn’t formed within [stream_flush_interval_ms](../../../operations/settings/settings.md/#stream-flush-interval-ms) milliseconds, the data will be flushed to the table regardless of the completeness of the block.
+
+To stop receiving topic data or to change the conversion logic, detach the materialized view:
+
+``` sql
+ DETACH TABLE consumer;
+ ATTACH TABLE consumer;
+```
+
+If you want to change the target table by using `ALTER`, we recommend disabling the materialized view to avoid discrepancies between the target table and the data from the view.
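+
+A sketch of that sequence, assuming the `daily` table and the `consumer` materialized view from the example above (the added column is hypothetical and shown only for illustration):
+
+``` sql
+DETACH TABLE consumer;
+ALTER TABLE daily ADD COLUMN max_level String;  -- hypothetical new column
+ATTACH TABLE consumer;
+```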
+
+## Configuration {#configuration}
+
+Similar to GraphiteMergeTree, the Kafka engine supports extended configuration using the ClickHouse config file. There are two configuration keys that you can use: global (`kafka`) and topic-level (`kafka_*`). The global configuration is applied first, and then the topic-level configuration is applied (if it exists).
+
+``` xml
+  <!-- Global configuration options for all tables of Kafka engine type -->
+  <kafka>
+    <debug>cgrp</debug>
+    <auto_offset_reset>smallest</auto_offset_reset>
+  </kafka>
+
+  <!-- Configuration specific to the "logs" topic -->
+  <kafka_logs>
+    <retry_backoff_ms>250</retry_backoff_ms>
+    <fetch_min_bytes>100000</fetch_min_bytes>
+  </kafka_logs>
+```
+
+For a list of possible configuration options, see the [librdkafka configuration reference](https://github.com/edenhill/librdkafka/blob/master/CONFIGURATION.md). Use the underscore (`_`) instead of a dot in the ClickHouse configuration. For example, `check.crcs=true` will be `<check_crcs>true</check_crcs>`.
+
+### Kerberos support {#kafka-kerberos-support}
+
+To deal with Kerberos-aware Kafka, add the `security_protocol` child element with the `sasl_plaintext` value. It is enough if a Kerberos ticket-granting ticket is obtained and cached by OS facilities.
+ClickHouse is able to maintain Kerberos credentials using a keytab file. Consider the `sasl_kerberos_service_name`, `sasl_kerberos_keytab` and `sasl_kerberos_principal` child elements.
+
+Example:
+
+``` xml
+  <!-- Kerberos-aware Kafka -->
+  <kafka>
+    <security_protocol>SASL_PLAINTEXT</security_protocol>
+    <sasl_kerberos_keytab>/home/kafkauser/kafkauser.keytab</sasl_kerberos_keytab>
+    <sasl_kerberos_principal>kafkauser/kafkahost@EXAMPLE.COM</sasl_kerberos_principal>
+  </kafka>
+```
+
+## Virtual Columns {#virtual-columns}
+
+- `_topic` — Kafka topic.
+- `_key` — Key of the message.
+- `_offset` — Offset of the message.
+- `_timestamp` — Timestamp of the message.
+- `_timestamp_ms` — Timestamp in milliseconds of the message.
+- `_partition` — Partition of Kafka topic.
+
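+The virtual columns can be selected alongside regular columns, e.g. for debugging (a sketch assuming the `queue` table from the examples above; remember that a plain `SELECT` consumes the messages it reads):
+
+``` sql
+SELECT _topic, _partition, _offset, _key, timestamp, level, message
+FROM queue
+LIMIT 5;
+```
+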
+**See Also**
+
+- [Virtual columns](../../../engines/table-engines/index.md#table_engines-virtual_columns)
+- [background_message_broker_schedule_pool_size](../../../operations/settings/settings.md#background_message_broker_schedule_pool_size)
+
+[Original article](https://clickhouse.com/docs/en/engines/table-engines/integrations/kafka/)
diff --git a/docs/en/reference/engines/table-engines/integrations/materialized-postgresql.md b/docs/en/reference/engines/table-engines/integrations/materialized-postgresql.md
new file mode 100644
index 00000000000..61f97961ddb
--- /dev/null
+++ b/docs/en/reference/engines/table-engines/integrations/materialized-postgresql.md
@@ -0,0 +1,59 @@
+---
+sidebar_position: 12
+sidebar_label: MaterializedPostgreSQL
+---
+
+# MaterializedPostgreSQL {#materialize-postgresql}
+
+Creates a ClickHouse table with an initial data dump of a PostgreSQL table and starts the replication process, i.e. it executes a background job to apply new changes as they happen on the PostgreSQL table in the remote PostgreSQL database.
+
+If more than one table is required, it is highly recommended to use the [MaterializedPostgreSQL](../../../engines/database-engines/materialized-postgresql.md) database engine instead of the table engine, together with the `materialized_postgresql_tables_list` setting, which specifies the tables to be replicated (it will also be possible to add a database `schema`). This is much better in terms of CPU and results in fewer connections and fewer replication slots inside the remote PostgreSQL database.
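+
+A minimal sketch of that recommendation (connection parameters and the table list are hypothetical, mirroring the table-engine example below):
+
+``` sql
+CREATE DATABASE postgres_db
+ENGINE = MaterializedPostgreSQL('postgres1:5432', 'postgres_database', 'postgres_user', 'postgres_password')
+SETTINGS materialized_postgresql_tables_list = 'postgresql_replica';
+```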
+
+## Creating a Table {#creating-a-table}
+
+``` sql
+CREATE TABLE postgresql_db.postgresql_replica (key UInt64, value UInt64)
+ENGINE = MaterializedPostgreSQL('postgres1:5432', 'postgres_database', 'postgresql_replica', 'postgres_user', 'postgres_password')
+PRIMARY KEY key;
+```
+
+**Engine Parameters**
+
+- `host:port` — PostgreSQL server address.
+- `database` — Remote database name.
+- `table` — Remote table name.
+- `user` — PostgreSQL user.
+- `password` — User password.
+
+## Requirements {#requirements}
+
+1. The [wal_level](https://www.postgresql.org/docs/current/runtime-config-wal.html) setting must have the value `logical` and the `max_replication_slots` parameter must have a value of at least `2` in the PostgreSQL config file.
+
+2. A table with `MaterializedPostgreSQL` engine must have a primary key — the same as a replica identity index (by default: primary key) of a PostgreSQL table (see [details on replica identity index](../../../engines/database-engines/materialized-postgresql.md#requirements)).
+
+3. Only the [Atomic](https://en.wikipedia.org/wiki/Atomicity_(database_systems)) database engine is allowed.
+
+## Virtual columns {#virtual-columns}
+
+- `_version` — Transaction counter. Type: [UInt64](../../../sql-reference/data-types/int-uint.md).
+
+- `_sign` — Deletion mark. Type: [Int8](../../../sql-reference/data-types/int-uint.md). Possible values:
+ - `1` — Row is not deleted,
+ - `-1` — Row is deleted.
+
+These columns do not need to be added when a table is created. They are always accessible in a `SELECT` query.
+The `_version` column equals the `LSN` position in the `WAL`, so it can be used to check how up-to-date the replication is.
+
+``` sql
+CREATE TABLE postgresql_db.postgresql_replica (key UInt64, value UInt64)
+ENGINE = MaterializedPostgreSQL('postgres1:5432', 'postgres_database', 'postgresql_replica', 'postgres_user', 'postgres_password')
+PRIMARY KEY key;
+
+SELECT key, value, _version FROM postgresql_db.postgresql_replica;
+```
+
+:::warning
+Replication of [**TOAST**](https://www.postgresql.org/docs/9.5/storage-toast.html) values is not supported. The default value for the data type will be used.
+:::
+
+[Original article](https://clickhouse.com/docs/en/engines/table-engines/integrations/materialized-postgresql)
diff --git a/docs/en/reference/engines/table-engines/integrations/mongodb.md b/docs/en/reference/engines/table-engines/integrations/mongodb.md
new file mode 100644
index 00000000000..d212ab4720f
--- /dev/null
+++ b/docs/en/reference/engines/table-engines/integrations/mongodb.md
@@ -0,0 +1,79 @@
+---
+sidebar_position: 5
+sidebar_label: MongoDB
+---
+
+# MongoDB {#mongodb}
+
+The MongoDB engine is a read-only table engine which allows reading data (`SELECT` queries) from a remote MongoDB collection. The engine supports only non-nested data types. `INSERT` queries are not supported.
+
+## Creating a Table {#creating-a-table}
+
+``` sql
+CREATE TABLE [IF NOT EXISTS] [db.]table_name
+(
+ name1 [type1],
+ name2 [type2],
+ ...
+) ENGINE = MongoDB(host:port, database, collection, user, password [, options]);
+```
+
+**Engine Parameters**
+
+- `host:port` — MongoDB server address.
+
+- `database` — Remote database name.
+
+- `collection` — Remote collection name.
+
+- `user` — MongoDB user.
+
+- `password` — User password.
+
+- `options` — MongoDB connection string options (optional parameter).
+
+## Usage Example {#usage-example}
+
+Create a table in ClickHouse which allows reading data from a MongoDB collection:
+
+``` sql
+CREATE TABLE mongo_table
+(
+ key UInt64,
+ data String
+) ENGINE = MongoDB('mongo1:27017', 'test', 'simple_table', 'testuser', 'clickhouse');
+```
+
+To read from an SSL secured MongoDB server:
+
+``` sql
+CREATE TABLE mongo_table_ssl
+(
+ key UInt64,
+ data String
+) ENGINE = MongoDB('mongo2:27017', 'test', 'simple_table', 'testuser', 'clickhouse', 'ssl=true');
+```
+
+Query:
+
+``` sql
+SELECT COUNT() FROM mongo_table;
+```
+
+``` text
+┌─count()─┐
+│ 4 │
+└─────────┘
+```
+
+You can also adjust connection timeout:
+
+``` sql
+CREATE TABLE mongo_table
+(
+ key UInt64,
+ data String
+) ENGINE = MongoDB('mongo2:27017', 'test', 'simple_table', 'testuser', 'clickhouse', 'connectTimeoutMS=100000');
+```
+
+[Original article](https://clickhouse.com/docs/en/engines/table-engines/integrations/mongodb/)
diff --git a/docs/en/reference/engines/table-engines/integrations/mysql.md b/docs/en/reference/engines/table-engines/integrations/mysql.md
new file mode 100644
index 00000000000..e962db58873
--- /dev/null
+++ b/docs/en/reference/engines/table-engines/integrations/mysql.md
@@ -0,0 +1,152 @@
+---
+sidebar_position: 4
+sidebar_label: MySQL
+---
+
+# MySQL {#mysql}
+
+The MySQL engine allows you to perform `SELECT` and `INSERT` queries on data that is stored on a remote MySQL server.
+
+## Creating a Table {#creating-a-table}
+
+``` sql
+CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
+(
+ name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1] [TTL expr1],
+ name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2] [TTL expr2],
+ ...
+) ENGINE = MySQL('host:port', 'database', 'table', 'user', 'password'[, replace_query, 'on_duplicate_clause'])
+SETTINGS
+ [connection_pool_size=16, ]
+ [connection_max_tries=3, ]
+ [connection_wait_timeout=5, ] /* 0 -- do not wait */
+ [connection_auto_close=true ]
+;
+```
+
+See a detailed description of the [CREATE TABLE](../../../sql-reference/statements/create/table.md#create-table-query) query.
+
+The table structure can differ from the original MySQL table structure:
+
+- Column names should be the same as in the original MySQL table, but you can use just some of these columns and in any order.
+- Column types may differ from those in the original MySQL table. ClickHouse tries to [cast](../../../engines/database-engines/mysql.md#data_types-support) values to the ClickHouse data types.
+- The [external_table_functions_use_nulls](../../../operations/settings/settings.md#external-table-functions-use-nulls) setting defines how to handle Nullable columns. Default value: 1. If 0, the table function does not make Nullable columns and inserts default values instead of nulls. This is also applicable for NULL values inside arrays.
+
+**Engine Parameters**
+
+- `host:port` — MySQL server address.
+
+- `database` — Remote database name.
+
+- `table` — Remote table name.
+
+- `user` — MySQL user.
+
+- `password` — User password.
+
+- `replace_query` — Flag that converts `INSERT INTO` queries to `REPLACE INTO`. If `replace_query=1`, the query is substituted.
+
+- `on_duplicate_clause` — The `ON DUPLICATE KEY on_duplicate_clause` expression that is added to the `INSERT` query.
+
+ Example: `INSERT INTO t (c1,c2) VALUES ('a', 2) ON DUPLICATE KEY UPDATE c2 = c2 + 1`, where `on_duplicate_clause` is `UPDATE c2 = c2 + 1`. See the [MySQL documentation](https://dev.mysql.com/doc/refman/8.0/en/insert-on-duplicate.html) to find which `on_duplicate_clause` you can use with the `ON DUPLICATE KEY` clause.
+
+ To specify `on_duplicate_clause` you need to pass `0` to the `replace_query` parameter. If you simultaneously pass `replace_query = 1` and `on_duplicate_clause`, ClickHouse generates an exception.
+
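+Putting the optional parameters together, a hedged sketch of a table whose inserts get an `ON DUPLICATE KEY` clause appended (the connection details are placeholders and the update expression is only an illustration):
+
+``` sql
+CREATE TABLE mysql_upsert_table
+(
+    `int_id` Int32,
+    `float` Float32
+)
+ENGINE = MySQL('localhost:3306', 'test', 'test', 'bayonet', '123', 0, 'UPDATE `float` = `float` + 1')
+```
+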
+Simple `WHERE` clauses such as `=, !=, >, >=, <, <=` are executed on the MySQL server.
+
+The rest of the conditions and the `LIMIT` sampling constraint are executed in ClickHouse only after the query to MySQL finishes.
+
+Supports multiple replicas that must be listed by `|`. For example:
+
+```sql
+CREATE TABLE test_replicas (id UInt32, name String, age UInt32, money UInt32) ENGINE = MySQL(`mysql{2|3|4}:3306`, 'clickhouse', 'test_replicas', 'root', 'clickhouse');
+```
+
+## Usage Example {#usage-example}
+
+Table in MySQL:
+
+``` text
+mysql> CREATE TABLE `test`.`test` (
+ -> `int_id` INT NOT NULL AUTO_INCREMENT,
+ -> `int_nullable` INT NULL DEFAULT NULL,
+ -> `float` FLOAT NOT NULL,
+ -> `float_nullable` FLOAT NULL DEFAULT NULL,
+ -> PRIMARY KEY (`int_id`));
+Query OK, 0 rows affected (0,09 sec)
+
+mysql> insert into test (`int_id`, `float`) VALUES (1,2);
+Query OK, 1 row affected (0,00 sec)
+
+mysql> select * from test;
++--------+--------------+-------+----------------+
+| int_id | int_nullable | float | float_nullable |
++--------+--------------+-------+----------------+
+|      1 |         NULL |     2 |           NULL |
++--------+--------------+-------+----------------+
+1 row in set (0,00 sec)
+```
+
+Table in ClickHouse, retrieving data from the MySQL table created above:
+
+``` sql
+CREATE TABLE mysql_table
+(
+ `float_nullable` Nullable(Float32),
+ `int_id` Int32
+)
+ENGINE = MySQL('localhost:3306', 'test', 'test', 'bayonet', '123')
+```
+
+``` sql
+SELECT * FROM mysql_table
+```
+
+``` text
+┌─float_nullable─┬─int_id─┐
+│ ᴺᵁᴸᴸ │ 1 │
+└────────────────┴────────┘
+```
+
+## Settings {#mysql-settings}
+
+Default settings are not very efficient, since they do not even reuse connections. These settings allow you to increase the number of queries run by the server per second.
+
+### connection_auto_close {#connection-auto-close}
+
+Allows automatically closing the connection after query execution, i.e. disabling connection reuse.
+
+Possible values:
+
+- 1 — Auto-close connection is allowed, so the connection reuse is disabled
+- 0 — Auto-close connection is not allowed, so the connection reuse is enabled
+
+Default value: `1`.
+
+### connection_max_tries {#connection-max-tries}
+
+Sets the number of retries for pool with failover.
+
+Possible values:
+
+- Positive integer.
+- 0 — There are no retries for pool with failover.
+
+Default value: `3`.
+
+### connection_pool_size {#connection-pool-size}
+
+Size of the connection pool (if all connections are in use, the query will wait until a connection is freed).
+
+Possible values:
+
+- Positive integer.
+
+Default value: `16`.
+
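+For illustration, a hedged sketch that combines these settings with the usage example above (the values are arbitrary):
+
+``` sql
+CREATE TABLE mysql_table_pooled
+(
+    `float_nullable` Nullable(Float32),
+    `int_id` Int32
+)
+ENGINE = MySQL('localhost:3306', 'test', 'test', 'bayonet', '123')
+SETTINGS connection_pool_size = 32, connection_max_tries = 5, connection_auto_close = 0
+```
+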
+## See Also {#see-also}
+
+- [The mysql table function](../../../sql-reference/table-functions/mysql.md)
+- [Using MySQL as a source of external dictionary](../../../sql-reference/dictionaries/external-dictionaries/external-dicts-dict-sources.md#dicts-external_dicts_dict_sources-mysql)
+
+[Original article](https://clickhouse.com/docs/en/engines/table-engines/integrations/mysql/)
diff --git a/docs/en/reference/engines/table-engines/integrations/odbc.md b/docs/en/reference/engines/table-engines/integrations/odbc.md
new file mode 100644
index 00000000000..ed2b77d7ca3
--- /dev/null
+++ b/docs/en/reference/engines/table-engines/integrations/odbc.md
@@ -0,0 +1,131 @@
+---
+sidebar_position: 2
+sidebar_label: ODBC
+---
+
+# ODBC {#table-engine-odbc}
+
+Allows ClickHouse to connect to external databases via [ODBC](https://en.wikipedia.org/wiki/Open_Database_Connectivity).
+
+To safely implement ODBC connections, ClickHouse uses a separate program `clickhouse-odbc-bridge`. If the ODBC driver is loaded directly from `clickhouse-server`, driver problems can crash the ClickHouse server. ClickHouse automatically starts `clickhouse-odbc-bridge` when it is required. The ODBC bridge program is installed from the same package as the `clickhouse-server`.
+
+This engine supports the [Nullable](../../../sql-reference/data-types/nullable.md) data type.
+
+## Creating a Table {#creating-a-table}
+
+``` sql
+CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
+(
+ name1 [type1],
+ name2 [type2],
+ ...
+)
+ENGINE = ODBC(connection_settings, external_database, external_table)
+```
+
+See a detailed description of the [CREATE TABLE](../../../sql-reference/statements/create/table.md#create-table-query) query.
+
+The table structure can differ from the source table structure:
+
+- Column names should be the same as in the source table, but you can use just some of these columns and in any order.
+- Column types may differ from those in the source table. ClickHouse tries to [cast](../../../sql-reference/functions/type-conversion-functions.md#type_conversion_function-cast) values to the ClickHouse data types.
+- The [external_table_functions_use_nulls](../../../operations/settings/settings.md#external-table-functions-use-nulls) setting defines how to handle Nullable columns. Default value: 1. If 0, the table function does not make Nullable columns and inserts default values instead of nulls. This is also applicable for NULL values inside arrays.
+
+**Engine Parameters**
+
+- `connection_settings` — Name of the section with connection settings in the `odbc.ini` file.
+- `external_database` — Name of a database in an external DBMS.
+- `external_table` — Name of a table in the `external_database`.
+
+## Usage Example {#usage-example}
+
+**Retrieving data from the local MySQL installation via ODBC**
+
+This example is checked for Ubuntu Linux 18.04 and MySQL server 5.7.
+
+Ensure that unixODBC and MySQL Connector are installed.
+
+By default (if installed from packages), ClickHouse starts as user `clickhouse`. Thus, you need to create and configure this user in the MySQL server.
+
+``` bash
+$ sudo mysql
+```
+
+``` sql
+mysql> CREATE USER 'clickhouse'@'localhost' IDENTIFIED BY 'clickhouse';
+mysql> GRANT ALL PRIVILEGES ON *.* TO 'clickhouse'@'localhost' WITH GRANT OPTION;
+```
+
+Then configure the connection in `/etc/odbc.ini`.
+
+``` bash
+$ cat /etc/odbc.ini
+[mysqlconn]
+DRIVER = /usr/local/lib/libmyodbc5w.so
+SERVER = 127.0.0.1
+PORT = 3306
+DATABASE = test
+USERNAME = clickhouse
+PASSWORD = clickhouse
+```
+
+You can check the connection using the `isql` utility from the unixODBC installation.
+
+``` bash
+$ isql -v mysqlconn
++-------------------------+
+| Connected! |
+| |
+...
+```
+
+Table in MySQL:
+
+``` text
+mysql> CREATE TABLE `test`.`test` (
+ -> `int_id` INT NOT NULL AUTO_INCREMENT,
+ -> `int_nullable` INT NULL DEFAULT NULL,
+ -> `float` FLOAT NOT NULL,
+ -> `float_nullable` FLOAT NULL DEFAULT NULL,
+ -> PRIMARY KEY (`int_id`));
+Query OK, 0 rows affected (0,09 sec)
+
+mysql> insert into test (`int_id`, `float`) VALUES (1,2);
+Query OK, 1 row affected (0,00 sec)
+
+mysql> select * from test;
++--------+--------------+-------+----------------+
+| int_id | int_nullable | float | float_nullable |
++--------+--------------+-------+----------------+
+|      1 |         NULL |     2 |           NULL |
++--------+--------------+-------+----------------+
+1 row in set (0,00 sec)
+```
+
+Table in ClickHouse, retrieving data from the MySQL table:
+
+``` sql
+CREATE TABLE odbc_t
+(
+ `int_id` Int32,
+ `float_nullable` Nullable(Float32)
+)
+ENGINE = ODBC('DSN=mysqlconn', 'test', 'test')
+```
+
+``` sql
+SELECT * FROM odbc_t
+```
+
+``` text
+┌─int_id─┬─float_nullable─┐
+│ 1 │ ᴺᵁᴸᴸ │
+└────────┴────────────────┘
+```
+
+## See Also {#see-also}
+
+- [ODBC external dictionaries](../../../sql-reference/dictionaries/external-dictionaries/external-dicts-dict-sources.md#dicts-external_dicts_dict_sources-odbc)
+- [ODBC table function](../../../sql-reference/table-functions/odbc.md)
+
+[Original article](https://clickhouse.com/docs/en/engines/table-engines/integrations/odbc/)
diff --git a/docs/en/reference/engines/table-engines/integrations/postgresql.md b/docs/en/reference/engines/table-engines/integrations/postgresql.md
new file mode 100644
index 00000000000..d6826000a1a
--- /dev/null
+++ b/docs/en/reference/engines/table-engines/integrations/postgresql.md
@@ -0,0 +1,178 @@
+---
+sidebar_position: 11
+sidebar_label: PostgreSQL
+---
+
+# PostgreSQL {#postgresql}
+
+The PostgreSQL engine allows you to perform `SELECT` and `INSERT` queries on data that is stored on a remote PostgreSQL server.
+
+## Creating a Table {#creating-a-table}
+
+``` sql
+CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
+(
+ name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1] [TTL expr1],
+ name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2] [TTL expr2],
+ ...
+) ENGINE = PostgreSQL('host:port', 'database', 'table', 'user', 'password'[, `schema`]);
+```
+
+See a detailed description of the [CREATE TABLE](../../../sql-reference/statements/create/table.md#create-table-query) query.
+
+The table structure can differ from the original PostgreSQL table structure:
+
+- Column names should be the same as in the original PostgreSQL table, but you can use just some of these columns and in any order.
+- Column types may differ from those in the original PostgreSQL table. ClickHouse tries to [cast](../../../engines/database-engines/postgresql.md#data_types-support) values to the ClickHouse data types.
+- The [external_table_functions_use_nulls](../../../operations/settings/settings.md#external-table-functions-use-nulls) setting defines how to handle Nullable columns. Default value: 1. If 0, the table function does not make Nullable columns and inserts default values instead of nulls. This is also applicable for NULL values inside arrays.
+
+**Engine Parameters**
+
+- `host:port` — PostgreSQL server address.
+- `database` — Remote database name.
+- `table` — Remote table name.
+- `user` — PostgreSQL user.
+- `password` — User password.
+- `schema` — Non-default table schema. Optional.
+- `on conflict ...` — example: `ON CONFLICT DO NOTHING`. Optional. Note: adding this option will make insertion less efficient.
+
+or via config (since version 21.11):
+
+``` xml
+<clickhouse>
+    <named_collections>
+        <postgres1>
+            <host>localhost</host>
+            <port>5432</port>
+            <user>postgres_user</user>
+            <password>postgres_password</password>
+            <database>postgres_database</database>
+            <schema>schema1</schema>
+            <table>table1</table>
+        </postgres1>
+    </named_collections>
+</clickhouse>
+```
+
+Some parameters can be overridden by key-value arguments:
+``` sql
+SELECT * FROM postgresql(postgres1, schema='schema1', table='table1');
+```
+
+## Implementation Details {#implementation-details}
+
+`SELECT` queries on the PostgreSQL side run as `COPY (SELECT ...) TO STDOUT` inside a read-only PostgreSQL transaction, with a commit after each `SELECT` query.
+
+Simple `WHERE` clauses such as `=`, `!=`, `>`, `>=`, `<`, `<=`, and `IN` are executed on the PostgreSQL server.
+
+All joins, aggregations, sorting, `IN [ array ]` conditions and the `LIMIT` sampling constraint are executed in ClickHouse only after the query to PostgreSQL finishes.
+
+`INSERT` queries on the PostgreSQL side run as `COPY "table_name" (field1, field2, ... fieldN) FROM STDIN` inside a PostgreSQL transaction, with auto-commit after each `INSERT` statement.
+
+PostgreSQL `Array` types are converted into ClickHouse arrays.
+
+:::warning
+Be careful: in PostgreSQL, array data created as `type_name[]` may contain multi-dimensional arrays of different dimensions in different rows of the same column. In ClickHouse, multidimensional arrays in the same column must have the same number of dimensions in all rows.
+:::
+
+Supports multiple replicas that must be listed by `|`. For example:
+
+```sql
+CREATE TABLE test_replicas (id UInt32, name String) ENGINE = PostgreSQL(`postgres{2|3|4}:5432`, 'clickhouse', 'test_replicas', 'postgres', 'mysecretpassword');
+```
+
+Replica priority for the PostgreSQL dictionary source is supported. The bigger the number in the map, the lower the priority. The highest priority is `0`.
+
+In the example below replica `example01-1` has the highest priority:
+
+```xml
+<postgresql>
+    <port>5432</port>
+    <user>clickhouse</user>
+    <password>qwerty</password>
+    <replica>
+        <host>example01-1</host>
+        <priority>1</priority>
+    </replica>
+    <replica>
+        <host>example01-2</host>
+        <priority>2</priority>
+    </replica>
+    <db>db_name</db>
+    <table>table_name</table>
+    <where>id=10</where>
+    <invalidate_query>SQL_QUERY</invalidate_query>
+</postgresql>
+```
+
+## Usage Example {#usage-example}
+
+Table in PostgreSQL:
+
+``` text
+postgres=# CREATE TABLE "public"."test" (
+"int_id" SERIAL,
+"int_nullable" INT NULL DEFAULT NULL,
+"float" FLOAT NOT NULL,
+"str" VARCHAR(100) NOT NULL DEFAULT '',
+"float_nullable" FLOAT NULL DEFAULT NULL,
+PRIMARY KEY (int_id));
+
+CREATE TABLE
+
+postgres=# INSERT INTO test (int_id, str, "float") VALUES (1,'test',2);
+INSERT 0 1
+
+postgresql> SELECT * FROM test;
+ int_id | int_nullable | float | str | float_nullable
+ --------+--------------+-------+------+----------------
+ 1 | | 2 | test |
+ (1 row)
+```
+
+Table in ClickHouse, retrieving data from the PostgreSQL table created above:
+
+``` sql
+CREATE TABLE default.postgresql_table
+(
+ `float_nullable` Nullable(Float32),
+ `str` String,
+ `int_id` Int32
+)
+ENGINE = PostgreSQL('localhost:5432', 'public', 'test', 'postgres_user', 'postgres_password');
+```
+
+``` sql
+SELECT * FROM postgresql_table WHERE str IN ('test');
+```
+
+``` text
+┌─float_nullable─┬─str──┬─int_id─┐
+│ ᴺᵁᴸᴸ │ test │ 1 │
+└────────────────┴──────┴────────┘
+```
+
+Using Non-default Schema:
+
+```text
+postgres=# CREATE SCHEMA "nice.schema";
+
+postgres=# CREATE TABLE "nice.schema"."nice.table" (a integer);
+
+postgres=# INSERT INTO "nice.schema"."nice.table" SELECT i FROM generate_series(0, 99) as t(i)
+```
+
+```sql
+CREATE TABLE pg_table_schema_with_dots (a UInt32)
+        ENGINE PostgreSQL('localhost:5432', 'clickhouse', 'nice.table', 'postgresql_user', 'password', 'nice.schema');
+```
+
+**See Also**
+
+- [The `postgresql` table function](../../../sql-reference/table-functions/postgresql.md)
+- [Using PostgreSQL as a source of external dictionary](../../../sql-reference/dictionaries/external-dictionaries/external-dicts-dict-sources.md#dicts-external_dicts_dict_sources-postgresql)
+
+[Original article](https://clickhouse.com/docs/en/engines/table-engines/integrations/postgresql/)
diff --git a/docs/en/reference/engines/table-engines/integrations/rabbitmq.md b/docs/en/reference/engines/table-engines/integrations/rabbitmq.md
new file mode 100644
index 00000000000..6653b76594a
--- /dev/null
+++ b/docs/en/reference/engines/table-engines/integrations/rabbitmq.md
@@ -0,0 +1,175 @@
+---
+sidebar_position: 10
+sidebar_label: RabbitMQ
+---
+
+# RabbitMQ Engine {#rabbitmq-engine}
+
+This engine allows integrating ClickHouse with [RabbitMQ](https://www.rabbitmq.com).
+
+`RabbitMQ` lets you:
+
+- Publish or subscribe to data flows.
+- Process streams as they become available.
+
+## Creating a Table {#table_engine-rabbitmq-creating-a-table}
+
+``` sql
+CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
+(
+ name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
+ name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2],
+ ...
+) ENGINE = RabbitMQ SETTINGS
+ rabbitmq_host_port = 'host:port' [or rabbitmq_address = 'amqp(s)://guest:guest@localhost/vhost'],
+ rabbitmq_exchange_name = 'exchange_name',
+ rabbitmq_format = 'data_format'[,]
+ [rabbitmq_exchange_type = 'exchange_type',]
+ [rabbitmq_routing_key_list = 'key1,key2,...',]
+ [rabbitmq_secure = 0,]
+ [rabbitmq_row_delimiter = 'delimiter_symbol',]
+ [rabbitmq_schema = '',]
+ [rabbitmq_num_consumers = N,]
+ [rabbitmq_num_queues = N,]
+ [rabbitmq_queue_base = 'queue',]
+ [rabbitmq_deadletter_exchange = 'dl-exchange',]
+ [rabbitmq_persistent = 0,]
+ [rabbitmq_skip_broken_messages = N,]
+ [rabbitmq_max_block_size = N,]
+ [rabbitmq_flush_interval_ms = N]
+ [rabbitmq_queue_settings_list = 'x-dead-letter-exchange=my-dlx,x-max-length=10,x-overflow=reject-publish']
+```
+
+Required parameters:
+
+- `rabbitmq_host_port` – host:port (for example, `localhost:5672`).
+- `rabbitmq_exchange_name` – RabbitMQ exchange name.
+- `rabbitmq_format` – Message format. Uses the same notation as the SQL `FORMAT` function, such as `JSONEachRow`. For more information, see the [Formats](../../../interfaces/formats.md) section.
+
+Optional parameters:
+
+- `rabbitmq_exchange_type` – The type of RabbitMQ exchange: `direct`, `fanout`, `topic`, `headers`, `consistent_hash`. Default: `fanout`.
+- `rabbitmq_routing_key_list` – A comma-separated list of routing keys.
+- `rabbitmq_row_delimiter` – Delimiter character, which ends the message.
+- `rabbitmq_schema` – Parameter that must be used if the format requires a schema definition. For example, [Cap’n Proto](https://capnproto.org/) requires the path to the schema file and the name of the root `schema.capnp:Message` object.
+- `rabbitmq_num_consumers` – The number of consumers per table. Default: `1`. Specify more consumers if the throughput of one consumer is insufficient.
+- `rabbitmq_num_queues` – Total number of queues. Default: `1`. Increasing this number can significantly improve performance.
+- `rabbitmq_queue_base` - Specify a hint for queue names. Use cases of this setting are described below.
+- `rabbitmq_deadletter_exchange` - Specify name for a [dead letter exchange](https://www.rabbitmq.com/dlx.html). You can create another table with this exchange name and collect messages in cases when they are republished to dead letter exchange. By default dead letter exchange is not specified.
+- `rabbitmq_persistent` - If set to 1 (true), in insert query delivery mode will be set to 2 (marks messages as 'persistent'). Default: `0`.
+- `rabbitmq_skip_broken_messages` – RabbitMQ message parser tolerance to schema-incompatible messages per block. Default: `0`. If `rabbitmq_skip_broken_messages = N` then the engine skips *N* RabbitMQ messages that cannot be parsed (a message equals a row of data).
+- `rabbitmq_max_block_size`
+- `rabbitmq_flush_interval_ms`
+- `rabbitmq_queue_settings_list` - allows to set RabbitMQ settings when creating a queue. Available settings: `x-max-length`, `x-max-length-bytes`, `x-message-ttl`, `x-expires`, `x-priority`, `x-max-priority`, `x-overflow`, `x-dead-letter-exchange`, `x-queue-type`. The `durable` setting is enabled automatically for the queue.
+
+SSL connection:
+
+Use either `rabbitmq_secure = 1` or `amqps` in connection address: `rabbitmq_address = 'amqps://guest:guest@localhost/vhost'`.
+The default behaviour of the used library is not to check whether the created TLS connection is sufficiently secure. Whether the certificate is expired, self-signed, missing, or invalid, the connection is simply permitted. More strict checking of certificates can possibly be implemented in the future.
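+
+A minimal sketch of a table that uses a secure connection (the host, credentials and exchange name are placeholders):
+
+``` sql
+CREATE TABLE queue_tls (key UInt64, value UInt64)
+  ENGINE = RabbitMQ SETTINGS rabbitmq_address = 'amqps://guest:guest@localhost/vhost',
+                             rabbitmq_exchange_name = 'exchange1',
+                             rabbitmq_format = 'JSONEachRow';
+```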
+
+Format settings can also be added along with RabbitMQ-related settings.
+
+Example:
+
+``` sql
+ CREATE TABLE queue (
+ key UInt64,
+ value UInt64,
+ date DateTime
+ ) ENGINE = RabbitMQ SETTINGS rabbitmq_host_port = 'localhost:5672',
+ rabbitmq_exchange_name = 'exchange1',
+ rabbitmq_format = 'JSONEachRow',
+ rabbitmq_num_consumers = 5,
+ date_time_input_format = 'best_effort';
+```
+
+The RabbitMQ server configuration should be added using the ClickHouse config file.
+
+Required configuration:
+
+``` xml
+ <rabbitmq>
+    <username>root</username>
+    <password>clickhouse</password>
+ </rabbitmq>
+```
+
+Additional configuration:
+
+``` xml
+ <rabbitmq>
+    <vhost>clickhouse</vhost>
+ </rabbitmq>
+```
+
+## Description {#description}
+
+`SELECT` is not particularly useful for reading messages (except for debugging), because each message can be read only once. It is more practical to create real-time threads using [materialized views](../../../sql-reference/statements/create/view.md). To do this:
+
+1. Use the engine to create a RabbitMQ consumer and consider it a data stream.
+2. Create a table with the desired structure.
+3. Create a materialized view that converts data from the engine and puts it into a previously created table.
+
+When the `MATERIALIZED VIEW` joins the engine, it starts collecting data in the background. This allows you to continually receive messages from RabbitMQ and convert them to the required format using `SELECT`.
+One RabbitMQ table can have as many materialized views as you like.
+
+Data can be channeled based on `rabbitmq_exchange_type` and the specified `rabbitmq_routing_key_list`.
+There can be no more than one exchange per table. One exchange can be shared between multiple tables - it enables routing into multiple tables at the same time.
+
+Exchange type options:
+
+- `direct` - Routing is based on the exact matching of keys. Example table key list: `key1,key2,key3,key4,key5`, message key can equal any of them.
+- `fanout` - Routing to all tables (where exchange name is the same) regardless of the keys.
+- `topic` - Routing is based on patterns with dot-separated keys. Examples: `*.logs`, `records.*.*.2020`, `*.2018,*.2019,*.2020`.
+- `headers` - Routing is based on `key=value` matches with a setting `x-match=all` or `x-match=any`. Example table key list: `x-match=all,format=logs,type=report,year=2020`.
+- `consistent_hash` - Data is evenly distributed between all bound tables (where the exchange name is the same). Note that this exchange type must be enabled with RabbitMQ plugin: `rabbitmq-plugins enable rabbitmq_consistent_hash_exchange`.
+
+Setting `rabbitmq_queue_base` may be used for the following cases:
+
+- to let different tables share queues, so that multiple consumers can be registered for the same queues, which gives better performance. If the `rabbitmq_num_consumers` and/or `rabbitmq_num_queues` settings are used, the exact match of queues is achieved when these parameters are the same.
+- to be able to restore reading from certain durable queues when not all messages were successfully consumed. To resume consumption from one specific queue, set its name in the `rabbitmq_queue_base` setting and do not specify `rabbitmq_num_consumers` and `rabbitmq_num_queues` (they default to 1), as shown in the sketch after this list. To resume consumption from all queues which were declared for a specific table, just specify the same settings: `rabbitmq_queue_base`, `rabbitmq_num_consumers`, `rabbitmq_num_queues`. By default, queue names will be unique to tables.
+- to reuse queues as they are declared durable and not auto-deleted. (Can be deleted via any of RabbitMQ CLI tools.)
+
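+A hedged sketch of resuming consumption from one specific durable queue, as mentioned in the list above (connection details and names are placeholders):
+
+``` sql
+CREATE TABLE queue_resume (key UInt64, value UInt64)
+  ENGINE = RabbitMQ SETTINGS rabbitmq_host_port = 'localhost:5672',
+                             rabbitmq_exchange_name = 'exchange1',
+                             rabbitmq_format = 'JSONEachRow',
+                             rabbitmq_queue_base = 'durable_queue_1';
+```
+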
+To improve performance, received messages are grouped into blocks the size of [max_insert_block_size](../../../operations/server-configuration-parameters/settings.md#settings-max_insert_block_size). If the block wasn’t formed within [stream_flush_interval_ms](../../../operations/server-configuration-parameters/settings.md) milliseconds, the data will be flushed to the table regardless of the completeness of the block.
+
+If `rabbitmq_num_consumers` and/or `rabbitmq_num_queues` settings are specified along with `rabbitmq_exchange_type`, then:
+
+- `rabbitmq-consistent-hash-exchange` plugin must be enabled.
+- `message_id` property of the published messages must be specified (unique for each message/batch).
+
+For an insert query there is message metadata, which is added for each published message: `messageID` and a `republished` flag (true if the message was published more than once). They can be accessed via message headers.
+
+Do not use the same table for inserts and materialized views.
+
+Example:
+
+``` sql
+ CREATE TABLE queue (
+ key UInt64,
+ value UInt64
+ ) ENGINE = RabbitMQ SETTINGS rabbitmq_host_port = 'localhost:5672',
+ rabbitmq_exchange_name = 'exchange1',
+ rabbitmq_exchange_type = 'headers',
+ rabbitmq_routing_key_list = 'format=logs,type=report,year=2020',
+ rabbitmq_format = 'JSONEachRow',
+ rabbitmq_num_consumers = 5;
+
+ CREATE TABLE daily (key UInt64, value UInt64)
+ ENGINE = MergeTree() ORDER BY key;
+
+ CREATE MATERIALIZED VIEW consumer TO daily
+ AS SELECT key, value FROM queue;
+
+ SELECT key, value FROM daily ORDER BY key;
+```
+
+## Virtual Columns {#virtual-columns}
+
+- `_exchange_name` - RabbitMQ exchange name.
+- `_channel_id` - ChannelID on which the consumer that received the message was declared.
+- `_delivery_tag` - DeliveryTag of the received message. Scoped per channel.
+- `_redelivered` - `redelivered` flag of the message.
+- `_message_id` - messageID of the received message; non-empty if was set, when message was published.
+- `_timestamp` - timestamp of the received message; non-empty if was set, when message was published.
+
+[Original article](https://clickhouse.com/docs/en/engines/table-engines/integrations/rabbitmq/)
diff --git a/docs/en/reference/engines/table-engines/integrations/s3.md b/docs/en/reference/engines/table-engines/integrations/s3.md
new file mode 100644
index 00000000000..42abc2a0b1e
--- /dev/null
+++ b/docs/en/reference/engines/table-engines/integrations/s3.md
@@ -0,0 +1,163 @@
+---
+sidebar_position: 7
+sidebar_label: S3
+---
+
+# S3 Table Engine {#table-engine-s3}
+
+This engine provides integration with the [Amazon S3](https://aws.amazon.com/s3/) ecosystem. It is similar to the [HDFS](../../../engines/table-engines/integrations/hdfs.md) engine, but provides S3-specific features.
+
+## Create Table {#creating-a-table}
+
+``` sql
+CREATE TABLE s3_engine_table (name String, value UInt32)
+ ENGINE = S3(path, [aws_access_key_id, aws_secret_access_key,] format, [compression])
+ [SETTINGS ...]
+```
+
+**Engine parameters**
+
+- `path` — Bucket URL with a path to the file. Supports the following wildcards in readonly mode: `*`, `?`, `{abc,def}` and `{N..M}` where `N`, `M` — numbers, `'abc'`, `'def'` — strings. For more information see [below](#wildcards-in-path).
+- `format` — The [format](../../../interfaces/formats.md#formats) of the file.
+- `aws_access_key_id`, `aws_secret_access_key` - Long-term credentials for the [AWS](https://aws.amazon.com/) account user. You can use these to authenticate your requests. Parameter is optional. If credentials are not specified, they are used from the configuration file. For more information see [Using S3 for Data Storage](../mergetree-family/mergetree.md#table_engine-mergetree-s3).
+- `compression` — Compression type. Supported values: `none`, `gzip/gz`, `brotli/br`, `xz/LZMA`, `zstd/zst`. Parameter is optional. By default, it will autodetect compression by file extension.
+
+**Example**
+
+``` sql
+CREATE TABLE s3_engine_table (name String, value UInt32)
+ ENGINE=S3('https://clickhouse-public-datasets.s3.amazonaws.com/my-test-bucket-768/test-data.csv.gz', 'CSV', 'gzip')
+ SETTINGS input_format_with_names_use_header = 0;
+
+INSERT INTO s3_engine_table VALUES ('one', 1), ('two', 2), ('three', 3);
+
+SELECT * FROM s3_engine_table LIMIT 2;
+```
+
+```text
+┌─name─┬─value─┐
+│ one │ 1 │
+│ two │ 2 │
+└──────┴───────┘
+```
+## Virtual columns {#virtual-columns}
+
+- `_path` — Path to the file.
+- `_file` — Name of the file.
+
+For more information about virtual columns see [here](../../../engines/table-engines/index.md#table_engines-virtual_columns).
+
+## Implementation Details {#implementation-details}
+
+- Reads and writes can be parallel.
+- [Zero-copy](../../../operations/storing-data.md#zero-copy) replication is supported.
+- Not supported:
+ - `ALTER` and `SELECT...SAMPLE` operations.
+ - Indexes.
+
+## Wildcards In Path {#wildcards-in-path}
+
+The `path` argument can specify multiple files using bash-like wildcards. To be processed, a file should exist and match the whole path pattern. The listing of files is determined during `SELECT` (not at `CREATE` moment).
+
+- `*` — Substitutes any number of any characters except `/` including empty string.
+- `?` — Substitutes any single character.
+- `{some_string,another_string,yet_another_one}` — Substitutes any of strings `'some_string', 'another_string', 'yet_another_one'`.
+- `{N..M}` — Substitutes any number in range from N to M including both borders. N and M can have leading zeroes e.g. `000..078`.
+
+Constructions with `{}` are similar to the [remote](../../../sql-reference/table-functions/remote.md) table function.
+
+:::warning
+If the listing of files contains number ranges with leading zeros, use the construction with braces for each digit separately or use `?`.
+:::
+
+**Example with wildcards 1**
+
+Create table with files named `file-000.csv`, `file-001.csv`, … , `file-999.csv`:
+
+``` sql
+CREATE TABLE big_table (name String, value UInt32)
+ ENGINE = S3('https://clickhouse-public-datasets.s3.amazonaws.com/my-bucket/my_folder/file-{000..999}.csv', 'CSV');
+```
+
+**Example with wildcards 2**
+
+Suppose we have several files in CSV format with the following URIs on S3:
+
+- 'https://clickhouse-public-datasets.s3.amazonaws.com/my-bucket/some_folder/some_file_1.csv'
+- 'https://clickhouse-public-datasets.s3.amazonaws.com/my-bucket/some_folder/some_file_2.csv'
+- 'https://clickhouse-public-datasets.s3.amazonaws.com/my-bucket/some_folder/some_file_3.csv'
+- 'https://clickhouse-public-datasets.s3.amazonaws.com/my-bucket/another_folder/some_file_1.csv'
+- 'https://clickhouse-public-datasets.s3.amazonaws.com/my-bucket/another_folder/some_file_2.csv'
+- 'https://clickhouse-public-datasets.s3.amazonaws.com/my-bucket/another_folder/some_file_3.csv'
+
+
+There are several ways to make a table consisting of all six files:
+
+1. Specify the range of file postfixes:
+
+``` sql
+CREATE TABLE table_with_range (name String, value UInt32)
+    ENGINE = S3('https://clickhouse-public-datasets.s3.amazonaws.com/my-bucket/{some,another}_folder/some_file_{1..3}.csv', 'CSV');
+```
+
+2. Take all files with the `some_file_` prefix (there should be no extra files with such a prefix in either folder):
+
+``` sql
+CREATE TABLE table_with_question_mark (name String, value UInt32)
+    ENGINE = S3('https://clickhouse-public-datasets.s3.amazonaws.com/my-bucket/{some,another}_folder/some_file_?.csv', 'CSV');
+```
+
+3. Take all the files in both folders (all files should satisfy the format and schema described in the query):
+
+``` sql
+CREATE TABLE table_with_asterisk (name String, value UInt32)
+ ENGINE = S3('https://clickhouse-public-datasets.s3.amazonaws.com/my-bucket/{some,another}_folder/*', 'CSV');
+```
+
+## S3-related Settings {#settings}
+
+The following settings can be set before query execution or placed into the configuration file.
+
+- `s3_max_single_part_upload_size` — The maximum size of an object to upload using single-part upload to S3. Default value is `64Mb`.
+- `s3_min_upload_part_size` — The minimum size of a part to upload during multipart upload to [S3 Multipart upload](https://docs.aws.amazon.com/AmazonS3/latest/dev/uploadobjusingmpu.html). Default value is `512Mb`.
+- `s3_max_redirects` — The maximum number of S3 redirect hops allowed. Default value is `10`.
+- `s3_single_read_retries` — The maximum number of attempts during a single read. Default value is `4`.
+
+Security consideration: if a malicious user can specify arbitrary S3 URLs, `s3_max_redirects` must be set to zero to avoid [SSRF](https://en.wikipedia.org/wiki/Server-side_request_forgery) attacks; alternatively, `remote_host_filter` must be specified in the server configuration.
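+
+As a minimal hardening sketch (assuming the setting is applied at the session level and reusing the `s3_engine_table` from the example above), the redirect limit can be disabled before querying an untrusted bucket:
+
+``` sql
+-- Disable redirect following for this session to mitigate SSRF
+SET s3_max_redirects = 0;
+SELECT * FROM s3_engine_table LIMIT 2;
+```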
+
+## Endpoint-based Settings {#endpoint-settings}
+
+The following settings can be specified in the configuration file for a given endpoint (which will be matched by the exact prefix of a URL):
+
+- `endpoint` — Specifies the prefix of an endpoint. Mandatory.
+- `access_key_id` and `secret_access_key` — Specify the credentials to use with the given endpoint. Optional.
+- `use_environment_credentials` — If set to `true`, the S3 client will try to obtain credentials from environment variables and [Amazon EC2](https://en.wikipedia.org/wiki/Amazon_Elastic_Compute_Cloud) metadata for the given endpoint. Optional, default value is `false`.
+- `region` — Specifies the S3 region name. Optional.
+- `use_insecure_imds_request` — If set to `true`, the S3 client will use an insecure IMDS request while obtaining credentials from Amazon EC2 metadata. Optional, default value is `false`.
+- `header` — Adds the specified HTTP header to a request to the given endpoint. Optional, can be specified multiple times.
+- `server_side_encryption_customer_key_base64` — If specified, the required headers for accessing S3 objects with SSE-C encryption will be set. Optional.
+- `max_single_read_retries` — The maximum number of attempts during a single read. Default value is `4`. Optional.
+
+**Example:**
+
+``` xml
+<s3>
+    <endpoint-name>
+        <endpoint>https://clickhouse-public-datasets.s3.amazonaws.com/my-test-bucket-768/</endpoint>
+        <!-- <access_key_id>ACCESS_KEY_ID</access_key_id> -->
+        <!-- <secret_access_key>SECRET_ACCESS_KEY</secret_access_key> -->
+        <!-- <region>us-west-1</region> -->
+        <!-- <use_environment_credentials>false</use_environment_credentials> -->
+        <!-- <use_insecure_imds_request>false</use_insecure_imds_request> -->
+        <!-- <header>Authorization: Bearer SOMETOKEN</header> -->
+        <!-- <server_side_encryption_customer_key_base64>BASE64-ENCODED-KEY</server_side_encryption_customer_key_base64> -->
+        <!-- <max_single_read_retries>4</max_single_read_retries> -->
+    </endpoint-name>
+</s3>
+```
+
+## See also
+
+- [s3 table function](../../../sql-reference/table-functions/s3.md)
+
+[Original article](https://clickhouse.com/docs/en/engines/table-engines/integrations/s3/)
diff --git a/docs/en/reference/engines/table-engines/integrations/sqlite.md b/docs/en/reference/engines/table-engines/integrations/sqlite.md
new file mode 100644
index 00000000000..45cc1cfc28a
--- /dev/null
+++ b/docs/en/reference/engines/table-engines/integrations/sqlite.md
@@ -0,0 +1,62 @@
+---
+sidebar_position: 7
+sidebar_label: SQLite
+---
+
+# SQLite {#sqlite}
+
+The engine allows importing and exporting data to SQLite and supports queries against SQLite tables directly from ClickHouse.
+
+## Creating a Table {#creating-a-table}
+
+``` sql
+ CREATE TABLE [IF NOT EXISTS] [db.]table_name
+ (
+ name1 [type1],
+ name2 [type2], ...
+ ) ENGINE = SQLite('db_path', 'table')
+```
+
+**Engine Parameters**
+
+- `db_path` — Path to SQLite file with a database.
+- `table` — Name of a table in the SQLite database.
+
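+For illustration only, a table backed by an existing SQLite database might be declared like this (the file name `sqlite.db` and the table name `table2` are assumptions borrowed from the usage example below):
+
+``` sql
+CREATE TABLE sqlite_table
+(
+    `col1` Nullable(Int32),
+    `col2` Nullable(String)
+)
+ENGINE = SQLite('sqlite.db', 'table2');
+```
+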
+## Usage Example {#usage-example}
+
+Shows a query creating the SQLite table:
+
+```sql
+SHOW CREATE TABLE sqlite_db.table2;
+```
+
+``` text
+CREATE TABLE SQLite.table2
+(
+ `col1` Nullable(Int32),
+ `col2` Nullable(String)
+)
+ENGINE = SQLite('sqlite.db','table2');
+```
+
+Returns the data from the table:
+
+``` sql
+SELECT * FROM sqlite_db.table2 ORDER BY col1;
+```
+
+```text
+┌─col1─┬─col2──┐
+│ 1 │ text1 │
+│ 2 │ text2 │
+│ 3 │ text3 │
+└──────┴───────┘
+```
+
+**See Also**
+
+- [SQLite](../../../engines/database-engines/sqlite.md) engine
+- [sqlite](../../../sql-reference/table-functions/sqlite.md) table function
+
+
+[Original article](https://clickhouse.com/docs/en/engines/table-engines/integrations/sqlite/)
diff --git a/docs/en/reference/engines/table-engines/log-family/index.md b/docs/en/reference/engines/table-engines/log-family/index.md
new file mode 100644
index 00000000000..89eb08ad7b9
--- /dev/null
+++ b/docs/en/reference/engines/table-engines/log-family/index.md
@@ -0,0 +1,46 @@
+---
+sidebar_position: 20
+sidebar_label: Log Family
+---
+
+# Log Engine Family {#log-engine-family}
+
+These engines were developed for scenarios when you need to quickly write many small tables (up to about 1 million rows) and read them later as a whole.
+
+Engines of the family:
+
+- [StripeLog](../../../engines/table-engines/log-family/stripelog.md)
+- [Log](../../../engines/table-engines/log-family/log.md)
+- [TinyLog](../../../engines/table-engines/log-family/tinylog.md)
+
+`Log` family table engines can store data to [HDFS](../../../engines/table-engines/mergetree-family/mergetree.md#table_engine-mergetree-hdfs) or [S3](../../../engines/table-engines/mergetree-family/mergetree.md#table_engine-mergetree-s3) distributed file systems.
+
+## Common Properties {#common-properties}
+
+Engines:
+
+- Store data on a disk.
+
+- Append data to the end of file when writing.
+
+- Support locks for concurrent data access.
+
+ During `INSERT` queries, the table is locked, and other queries for reading and writing data both wait for the table to unlock. If there are no data writing queries, any number of data reading queries can be performed concurrently.
+
+- Do not support [mutations](../../../sql-reference/statements/alter/index.md#alter-mutations).
+
+- Do not support indexes.
+
+ This means that `SELECT` queries for ranges of data are not efficient.
+
+- Do not write data atomically.
+
+ You can get a table with corrupted data if something breaks the write operation, for example, abnormal server shutdown.
+
+## Differences {#differences}
+
+The `TinyLog` engine is the simplest in the family and provides the poorest functionality and lowest efficiency. The `TinyLog` engine does not support parallel data reading by several threads in a single query. It reads data slower than the other engines in the family that support parallel reading from a single query, and it uses almost as many file descriptors as the `Log` engine because it stores each column in a separate file. Use it only in simple scenarios.
+
+The `Log` and `StripeLog` engines support parallel data reading. When reading data, ClickHouse uses multiple threads. Each thread processes a separate data block. The `Log` engine uses a separate file for each column of the table. `StripeLog` stores all the data in one file. As a result, the `StripeLog` engine uses fewer file descriptors, but the `Log` engine provides higher efficiency when reading data.
+
+[Original article](https://clickhouse.com/docs/en/operations/table_engines/log_family/)
diff --git a/docs/en/reference/engines/table-engines/log-family/log.md b/docs/en/reference/engines/table-engines/log-family/log.md
new file mode 100644
index 00000000000..8858699f045
--- /dev/null
+++ b/docs/en/reference/engines/table-engines/log-family/log.md
@@ -0,0 +1,15 @@
+---
+toc_priority: 33
+toc_title: Log
+---
+
+# Log {#log}
+
+The engine belongs to the family of `Log` engines. See the common properties of `Log` engines and their differences in the [Log Engine Family](../../../engines/table-engines/log-family/index.md) article.
+
+`Log` differs from [TinyLog](../../../engines/table-engines/log-family/tinylog.md) in that a small file of "marks" resides with the column files. These marks are written on every data block and contain offsets that indicate where to start reading the file in order to skip the specified number of rows. This makes it possible to read table data in multiple threads.
+For concurrent data access, the read operations can be performed simultaneously, while write operations block reads and each other.
+The `Log` engine does not support indexes. Also, if writing to a table fails, the table is broken, and reading from it returns an error. The `Log` engine is appropriate for temporary data, write-once tables, and for testing or demonstration purposes.
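+
+A minimal usage sketch (the table and column names below are illustrative, not taken from this document):
+
+``` sql
+CREATE TABLE log_example (id UInt64, message String) ENGINE = Log;
+
+INSERT INTO log_example VALUES (1, 'first'), (2, 'second');
+
+SELECT * FROM log_example;
+```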
+
+[Original article](https://clickhouse.com/docs/en/engines/table-engines/log-family/log/)
+
diff --git a/docs/en/reference/engines/table-engines/log-family/stripelog.md b/docs/en/reference/engines/table-engines/log-family/stripelog.md
new file mode 100644
index 00000000000..62703245062
--- /dev/null
+++ b/docs/en/reference/engines/table-engines/log-family/stripelog.md
@@ -0,0 +1,93 @@
+---
+toc_priority: 32
+toc_title: StripeLog
+---
+
+# StripeLog {#stripelog}
+
+This engine belongs to the family of log engines. See the common properties of log engines and their differences in the [Log Engine Family](../../../engines/table-engines/log-family/index.md) article.
+
+Use this engine in scenarios when you need to write many tables with a small amount of data (less than 1 million rows).
+
+## Creating a Table {#table_engines-stripelog-creating-a-table}
+
+``` sql
+CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
+(
+ column1_name [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
+ column2_name [type2] [DEFAULT|MATERIALIZED|ALIAS expr2],
+ ...
+) ENGINE = StripeLog
+```
+
+See the detailed description of the [CREATE TABLE](../../../sql-reference/statements/create/table.md#create-table-query) query.
+
+## Writing the Data {#table_engines-stripelog-writing-the-data}
+
+The `StripeLog` engine stores all the columns in one file. For each `INSERT` query, ClickHouse appends the data block to the end of a table file, writing columns one by one.
+
+For each table ClickHouse writes the files:
+
+- `data.bin` — Data file.
+- `index.mrk` — File with marks. Marks contain offsets for each column of each data block inserted.
+
+The `StripeLog` engine does not support the `ALTER UPDATE` and `ALTER DELETE` operations.
+
+## Reading the Data {#table_engines-stripelog-reading-the-data}
+
+The file with marks allows ClickHouse to parallelize the reading of data. This means that a `SELECT` query returns rows in an unpredictable order. Use the `ORDER BY` clause to sort rows.
+
+## Example of Use {#table_engines-stripelog-example-of-use}
+
+Creating a table:
+
+``` sql
+CREATE TABLE stripe_log_table
+(
+ timestamp DateTime,
+ message_type String,
+ message String
+)
+ENGINE = StripeLog
+```
+
+Inserting data:
+
+``` sql
+INSERT INTO stripe_log_table VALUES (now(),'REGULAR','The first regular message')
+INSERT INTO stripe_log_table VALUES (now(),'REGULAR','The second regular message'),(now(),'WARNING','The first warning message')
+```
+
+We used two `INSERT` queries to create two data blocks inside the `data.bin` file.
+
+ClickHouse uses multiple threads when selecting data. Each thread reads a separate data block and returns resulting rows independently as it finishes. As a result, the order of blocks of rows in the output does not match the order of the same blocks in the input in most cases. For example:
+
+``` sql
+SELECT * FROM stripe_log_table
+```
+
+``` text
+┌───────────timestamp─┬─message_type─┬─message────────────────────┐
+│ 2019-01-18 14:27:32 │ REGULAR │ The second regular message │
+│ 2019-01-18 14:34:53 │ WARNING │ The first warning message │
+└─────────────────────┴──────────────┴────────────────────────────┘
+┌───────────timestamp─┬─message_type─┬─message───────────────────┐
+│ 2019-01-18 14:23:43 │ REGULAR │ The first regular message │
+└─────────────────────┴──────────────┴───────────────────────────┘
+```
+
+Sorting the results (ascending order by default):
+
+``` sql
+SELECT * FROM stripe_log_table ORDER BY timestamp
+```
+
+``` text
+┌───────────timestamp─┬─message_type─┬─message────────────────────┐
+│ 2019-01-18 14:23:43 │ REGULAR │ The first regular message │
+│ 2019-01-18 14:27:32 │ REGULAR │ The second regular message │
+│ 2019-01-18 14:34:53 │ WARNING │ The first warning message │
+└─────────────────────┴──────────────┴────────────────────────────┘
+```
+
+[Original article](https://clickhouse.com/docs/en/operations/table_engines/stripelog/)
diff --git a/docs/en/reference/engines/table-engines/log-family/tinylog.md b/docs/en/reference/engines/table-engines/log-family/tinylog.md
new file mode 100644
index 00000000000..2407355a857
--- /dev/null
+++ b/docs/en/reference/engines/table-engines/log-family/tinylog.md
@@ -0,0 +1,14 @@
+---
+toc_priority: 34
+toc_title: TinyLog
+---
+
+# TinyLog {#tinylog}
+
+The engine belongs to the log engine family. See [Log Engine Family](../../../engines/table-engines/log-family/index.md) for common properties of log engines and their differences.
+
+This table engine is typically used with the write-once method: write data one time, then read it as many times as necessary. For example, you can use `TinyLog`-type tables for intermediate data that is processed in small batches. Note that storing data in a large number of small tables is inefficient.
+
+Queries are executed in a single stream. In other words, this engine is intended for relatively small tables (up to about 1,000,000 rows). It makes sense to use this table engine if you have many small tables, since it’s simpler than the [Log](../../../engines/table-engines/log-family/log.md) engine (fewer files need to be opened).
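+
+For example, a small write-once table could be sketched as follows (the names are illustrative):
+
+``` sql
+CREATE TABLE tiny_log_example (id UInt64, payload String) ENGINE = TinyLog;
+
+INSERT INTO tiny_log_example VALUES (1, 'batch-1'), (2, 'batch-1');
+
+SELECT * FROM tiny_log_example;
+```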
+
+[Original article](https://clickhouse.com/docs/en/operations/table_engines/tinylog/)
diff --git a/docs/en/reference/engines/table-engines/mergetree-family/aggregatingmergetree.md b/docs/en/reference/engines/table-engines/mergetree-family/aggregatingmergetree.md
new file mode 100644
index 00000000000..7be10cec2f5
--- /dev/null
+++ b/docs/en/reference/engines/table-engines/mergetree-family/aggregatingmergetree.md
@@ -0,0 +1,104 @@
+---
+sidebar_position: 60
+sidebar_label: AggregatingMergeTree
+---
+
+# AggregatingMergeTree {#aggregatingmergetree}
+
+The engine inherits from [MergeTree](../../../engines/table-engines/mergetree-family/mergetree.md#table_engines-mergetree), altering the logic for data parts merging. ClickHouse replaces all rows with the same primary key (or more accurately, with the same [sorting key](../../../engines/table-engines/mergetree-family/mergetree.md)) with a single row (within a single data part) that stores a combination of states of aggregate functions.
+
+You can use `AggregatingMergeTree` tables for incremental data aggregation, including for aggregated materialized views.
+
+The engine processes all columns with the following types:
+
+- [AggregateFunction](../../../sql-reference/data-types/aggregatefunction.md)
+- [SimpleAggregateFunction](../../../sql-reference/data-types/simpleaggregatefunction.md)
+
+It is appropriate to use `AggregatingMergeTree` if it reduces the number of rows by orders of magnitude.
+
+## Creating a Table {#creating-a-table}
+
+``` sql
+CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
+(
+ name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
+ name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2],
+ ...
+) ENGINE = AggregatingMergeTree()
+[PARTITION BY expr]
+[ORDER BY expr]
+[SAMPLE BY expr]
+[TTL expr]
+[SETTINGS name=value, ...]
+```
+
+For a description of query parameters, see the [query description](../../../sql-reference/statements/create/table.md).
+
+**Query clauses**
+
+When creating an `AggregatingMergeTree` table, the same [clauses](../../../engines/table-engines/mergetree-family/mergetree.md) are required as when creating a `MergeTree` table.
+
+
+
+**Deprecated Method for Creating a Table**
+
+:::warning
+Do not use this method in new projects and, if possible, switch the old projects to the method described above.
+:::
+
+``` sql
+CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
+(
+ name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
+ name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2],
+ ...
+) ENGINE [=] AggregatingMergeTree(date-column [, sampling_expression], (primary, key), index_granularity)
+```
+
+All of the parameters have the same meaning as in `MergeTree`.
+
+
+## SELECT and INSERT {#select-and-insert}
+
+To insert data, use an [INSERT SELECT](../../../sql-reference/statements/insert-into.md) query with aggregate `-State` functions.
+When selecting data from an `AggregatingMergeTree` table, use a `GROUP BY` clause and the same aggregate functions as when inserting data, but with the `-Merge` suffix.
+
+In the results of a `SELECT` query, the values of `AggregateFunction` type have an implementation-specific binary representation for all of the ClickHouse output formats. If you dump data into, for example, the `TabSeparated` format with a `SELECT` query, then this dump can be loaded back using an `INSERT` query.
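+
+A small sketch of this workflow (the `agg_example` table, the `source_table` source and the column names are assumptions made for illustration):
+
+``` sql
+CREATE TABLE agg_example
+(
+    key UInt32,
+    uniq_users AggregateFunction(uniq, UInt64)
+)
+ENGINE = AggregatingMergeTree()
+ORDER BY key;
+
+-- Insert partial aggregation states with the -State suffix
+INSERT INTO agg_example
+SELECT key, uniqState(user_id)
+FROM source_table
+GROUP BY key;
+
+-- Read them back with the matching -Merge suffix
+SELECT key, uniqMerge(uniq_users) AS uniq_users
+FROM agg_example
+GROUP BY key;
+```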
+
+## Example of an Aggregated Materialized View {#example-of-an-aggregated-materialized-view}
+
+An `AggregatingMergeTree` materialized view that watches the `test.visits` table:
+
+``` sql
+CREATE MATERIALIZED VIEW test.basic
+ENGINE = AggregatingMergeTree() PARTITION BY toYYYYMM(StartDate) ORDER BY (CounterID, StartDate)
+AS SELECT
+ CounterID,
+ StartDate,
+ sumState(Sign) AS Visits,
+ uniqState(UserID) AS Users
+FROM test.visits
+GROUP BY CounterID, StartDate;
+```
+
+Inserting data into the `test.visits` table.
+
+``` sql
+INSERT INTO test.visits ...
+```
+
+The data is inserted into both the table and the `test.basic` view, which will perform the aggregation.
+
+To get the aggregated data, we need to execute a query such as `SELECT ... GROUP BY ...` from the view `test.basic`:
+
+``` sql
+SELECT
+ StartDate,
+ sumMerge(Visits) AS Visits,
+ uniqMerge(Users) AS Users
+FROM test.basic
+GROUP BY StartDate
+ORDER BY StartDate;
+```
+
+[Original article](https://clickhouse.com/docs/en/operations/table_engines/aggregatingmergetree/)
diff --git a/docs/en/reference/engines/table-engines/mergetree-family/collapsingmergetree.md b/docs/en/reference/engines/table-engines/mergetree-family/collapsingmergetree.md
new file mode 100644
index 00000000000..22863611e79
--- /dev/null
+++ b/docs/en/reference/engines/table-engines/mergetree-family/collapsingmergetree.md
@@ -0,0 +1,307 @@
+---
+sidebar_position: 70
+sidebar_label: CollapsingMergeTree
+---
+
+# CollapsingMergeTree {#table_engine-collapsingmergetree}
+
+The engine inherits from [MergeTree](../../../engines/table-engines/mergetree-family/mergetree.md) and adds the logic of rows collapsing to data parts merge algorithm.
+
+`CollapsingMergeTree` asynchronously deletes (collapses) pairs of rows if all of the fields in a sorting key (`ORDER BY`) are equivalent except for the particular field `Sign`, which can have `1` and `-1` values. Rows without a pair are kept. For more details see the [Collapsing](#table_engine-collapsingmergetree-collapsing) section of the document.
+
+The engine may significantly reduce the volume of storage and increase the efficiency of `SELECT` queries as a consequence.
+
+## Creating a Table {#creating-a-table}
+
+``` sql
+CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
+(
+ name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
+ name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2],
+ ...
+) ENGINE = CollapsingMergeTree(sign)
+[PARTITION BY expr]
+[ORDER BY expr]
+[SAMPLE BY expr]
+[SETTINGS name=value, ...]
+```
+
+For a description of query parameters, see [query description](../../../sql-reference/statements/create/table.md).
+
+**CollapsingMergeTree Parameters**
+
+- `sign` — Name of the column with the type of row: `1` is a “state” row, `-1` is a “cancel” row.
+
+ Column data type — `Int8`.
+
+**Query clauses**
+
+When creating a `CollapsingMergeTree` table, the same [query clauses](../../../engines/table-engines/mergetree-family/mergetree.md#table_engine-mergetree-creating-a-table) are required, as when creating a `MergeTree` table.
+
+
+
+**Deprecated Method for Creating a Table**
+
+:::warning
+Do not use this method in new projects and, if possible, switch old projects to the method described above.
+:::
+
+``` sql
+CREATE TABLE [IF NOT EXISTS] [db.]table_name [ON CLUSTER cluster]
+(
+ name1 [type1] [DEFAULT|MATERIALIZED|ALIAS expr1],
+ name2 [type2] [DEFAULT|MATERIALIZED|ALIAS expr2],
+ ...
+) ENGINE [=] CollapsingMergeTree(date-column [, sampling_expression], (primary, key), index_granularity, sign)
+```
+
+All of the parameters except `sign` have the same meaning as in `MergeTree`.
+
+- `sign` — Name of the column with the type of row: `1` — “state” row, `-1` — “cancel” row.
+
+ Column Data Type — `Int8`.
+
+
+
+## Collapsing {#table_engine-collapsingmergetree-collapsing}
+
+### Data {#data}
+
+Consider the situation where you need to save continually changing data for some object. It sounds logical to have one row per object and update it on any change, but the update operation is expensive and slow for a DBMS because it requires rewriting the data in storage. If you need to write data quickly, updates are not acceptable, but you can write the changes of an object sequentially as follows.
+
+Use the particular column `Sign`. If `Sign = 1` it means that the row is a state of an object, let’s call it the “state” row. If `Sign = -1` it means the cancellation of the state of an object with the same attributes, let’s call it the “cancel” row.
+
+For example, we want to calculate how many pages users viewed on some site and how long they stayed there. At some moment we write the following row with the state of user activity:
+
+``` text
+┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┐
+│ 4324182021466249494 │ 5 │ 146 │ 1 │
+└─────────────────────┴───────────┴──────────┴──────┘
+```
+
+At some moment later we register the change of user activity and write it with the following two rows.
+
+``` text
+┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┐
+│ 4324182021466249494 │ 5 │ 146 │ -1 │
+│ 4324182021466249494 │ 6 │ 185 │ 1 │
+└─────────────────────┴───────────┴──────────┴──────┘
+```
+
+The first row cancels the previous state of the object (user). It should copy the sorting key fields of the cancelled state except for `Sign`.
+
+The second row contains the current state.
+
+As we need only the last state of user activity, the rows
+
+``` text
+┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┐
+│ 4324182021466249494 │ 5 │ 146 │ 1 │
+│ 4324182021466249494 │ 5 │ 146 │ -1 │
+└─────────────────────┴───────────┴──────────┴──────┘
+```
+
+can be deleted, collapsing the invalid (old) state of the object. `CollapsingMergeTree` does this while merging the data parts.
+
+To learn why we need two rows for each change, read the [Algorithm](#table_engine-collapsingmergetree-collapsing-algorithm) paragraph.
+
+**Peculiar properties of this approach**
+
+1. The program that writes the data should remember the state of an object to be able to cancel it. The “cancel” row should contain copies of the sorting key fields of the “state” row and the opposite `Sign`. This increases the initial size of storage but allows writing the data quickly.
+2. Long growing arrays in columns reduce the efficiency of the engine due to the load for writing. The more straightforward the data, the higher the efficiency.
+3. The `SELECT` results depend strongly on the consistency of the object changes history. Be accurate when preparing data for inserting. You can get unpredictable results from inconsistent data, for example, negative values for non-negative metrics such as session depth.
+
+### Algorithm {#table_engine-collapsingmergetree-collapsing-algorithm}
+
+When ClickHouse merges data parts, each group of consecutive rows with the same sorting key (`ORDER BY`) is reduced to not more than two rows, one with `Sign = 1` (“state” row) and another with `Sign = -1` (“cancel” row). In other words, entries collapse.
+
+For each resulting data part ClickHouse saves:
+
+1. The first “cancel” and the last “state” rows, if the number of “state” and “cancel” rows matches and the last row is a “state” row.
+2. The last “state” row, if there are more “state” rows than “cancel” rows.
+3. The first “cancel” row, if there are more “cancel” rows than “state” rows.
+4. None of the rows, in all other cases.
+
+Also, when there are at least 2 more “state” rows than “cancel” rows, or at least 2 more “cancel” rows than “state” rows, the merge continues, but ClickHouse treats this situation as a logical error and records it in the server log. This error can occur if the same data were inserted more than once.
+
+Thus, collapsing should not change the results of calculating statistics.
+Changes are gradually collapsed so that in the end only the last state of almost every object is left.
+
+The `Sign` is required because the merging algorithm does not guarantee that all of the rows with the same sorting key will be in the same resulting data part and even on the same physical server. ClickHouse processes `SELECT` queries with multiple threads, and it cannot predict the order of rows in the result. Aggregation is required if there is a need to get completely “collapsed” data from a `CollapsingMergeTree` table.
+
+To finalize collapsing, write a query with `GROUP BY` clause and aggregate functions that account for the sign. For example, to calculate quantity, use `sum(Sign)` instead of `count()`. To calculate the sum of something, use `sum(Sign * x)` instead of `sum(x)`, and so on, and also add `HAVING sum(Sign) > 0`.
+
+The aggregates `count`, `sum` and `avg` can be calculated this way. The aggregate `uniq` can be calculated if an object has at least one non-collapsed state. The aggregates `min` and `max` cannot be calculated because `CollapsingMergeTree` does not save the value history of the collapsed states.
+
+If you need to extract data without aggregation (for example, to check whether rows are present whose newest values match certain conditions), you can use the `FINAL` modifier for the `FROM` clause. This approach is significantly less efficient.
+
+## Example of Use {#example-of-use}
+
+Example data:
+
+``` text
+┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┐
+│ 4324182021466249494 │ 5 │ 146 │ 1 │
+│ 4324182021466249494 │ 5 │ 146 │ -1 │
+│ 4324182021466249494 │ 6 │ 185 │ 1 │
+└─────────────────────┴───────────┴──────────┴──────┘
+```
+
+Creation of the table:
+
+``` sql
+CREATE TABLE UAct
+(
+ UserID UInt64,
+ PageViews UInt8,
+ Duration UInt8,
+ Sign Int8
+)
+ENGINE = CollapsingMergeTree(Sign)
+ORDER BY UserID
+```
+
+Insertion of the data:
+
+``` sql
+INSERT INTO UAct VALUES (4324182021466249494, 5, 146, 1)
+```
+
+``` sql
+INSERT INTO UAct VALUES (4324182021466249494, 5, 146, -1),(4324182021466249494, 6, 185, 1)
+```
+
+We use two `INSERT` queries to create two different data parts. If we insert the data with a single query, ClickHouse creates one data part and will never perform any merge.
+
+Getting the data:
+
+``` sql
+SELECT * FROM UAct
+```
+
+``` text
+┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┐
+│ 4324182021466249494 │ 5 │ 146 │ -1 │
+│ 4324182021466249494 │ 6 │ 185 │ 1 │
+└─────────────────────┴───────────┴──────────┴──────┘
+┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┐
+│ 4324182021466249494 │ 5 │ 146 │ 1 │
+└─────────────────────┴───────────┴──────────┴──────┘
+```
+
+What do we see and where is collapsing?
+
+With two `INSERT` queries, we created two data parts. The `SELECT` query was performed in two threads, and we got a random order of rows. Collapsing did not occur because the data parts have not been merged yet. ClickHouse merges data parts at an unknown moment which we cannot predict.
+
+Thus we need aggregation:
+
+``` sql
+SELECT
+ UserID,
+ sum(PageViews * Sign) AS PageViews,
+ sum(Duration * Sign) AS Duration
+FROM UAct
+GROUP BY UserID
+HAVING sum(Sign) > 0
+```
+
+``` text
+┌──────────────UserID─┬─PageViews─┬─Duration─┐
+│ 4324182021466249494 │ 6 │ 185 │
+└─────────────────────┴───────────┴──────────┘
+```
+
+If we do not need aggregation and want to force collapsing, we can use the `FINAL` modifier for the `FROM` clause.
+
+``` sql
+SELECT * FROM UAct FINAL
+```
+
+``` text
+┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┐
+│ 4324182021466249494 │ 6 │ 185 │ 1 │
+└─────────────────────┴───────────┴──────────┴──────┘
+```
+
+This way of selecting the data is very inefficient. Don’t use it for big tables.
+
+## Example of Another Approach {#example-of-another-approach}
+
+Example data:
+
+``` text
+┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┐
+│ 4324182021466249494 │ 5 │ 146 │ 1 │
+│ 4324182021466249494 │ -5 │ -146 │ -1 │
+│ 4324182021466249494 │ 6 │ 185 │ 1 │
+└─────────────────────┴───────────┴──────────┴──────┘
+```
+
+The idea is that merges take into account only key fields. In the “cancel” row we can specify negative values that cancel out the previous version of the row when summing, without using the `Sign` column. For this approach, it is necessary to change the data types of `PageViews` and `Duration` to store negative values: UInt8 -\> Int16.
+
+``` sql
+CREATE TABLE UAct
+(
+ UserID UInt64,
+ PageViews Int16,
+ Duration Int16,
+ Sign Int8
+)
+ENGINE = CollapsingMergeTree(Sign)
+ORDER BY UserID
+```
+
+Let’s test the approach:
+
+``` sql
+insert into UAct values(4324182021466249494, 5, 146, 1);
+insert into UAct values(4324182021466249494, -5, -146, -1);
+insert into UAct values(4324182021466249494, 6, 185, 1);
+
+select * from UAct final; -- avoid using FINAL in production (it is fine for a test or for small tables)
+```
+
+``` text
+┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┐
+│ 4324182021466249494 │ 6 │ 185 │ 1 │
+└─────────────────────┴───────────┴──────────┴──────┘
+```
+
+``` sql
+SELECT
+ UserID,
+ sum(PageViews) AS PageViews,
+ sum(Duration) AS Duration
+FROM UAct
+GROUP BY UserID
+```
+
+``` text
+┌──────────────UserID─┬─PageViews─┬─Duration─┐
+│ 4324182021466249494 │ 6 │ 185 │
+└─────────────────────┴───────────┴──────────┘
+```
+
+``` sql
+select count() FROM UAct
+```
+
+``` text
+┌─count()─┐
+│ 3 │
+└─────────┘
+```
+
+``` sql
+optimize table UAct final;
+
+select * FROM UAct
+```
+
+``` text
+┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┐
+│ 4324182021466249494 │ 6 │ 185 │ 1 │
+└─────────────────────┴───────────┴──────────┴──────┘
+```
+
+[Original article](https://clickhouse.com/docs/en/operations/table_engines/collapsingmergetree/)
diff --git a/docs/en/reference/engines/table-engines/mergetree-family/custom-partitioning-key.md b/docs/en/reference/engines/table-engines/mergetree-family/custom-partitioning-key.md
new file mode 100644
index 00000000000..716528f8d77
--- /dev/null
+++ b/docs/en/reference/engines/table-engines/mergetree-family/custom-partitioning-key.md
@@ -0,0 +1,136 @@
+---
+sidebar_position: 30
+sidebar_label: Custom Partitioning Key
+---
+
+# Custom Partitioning Key {#custom-partitioning-key}
+
+:::warning
+In most cases you do not need a partition key, and in most other cases you do not need a partition key more granular than by month. Partitioning does not speed up queries (in contrast to the ORDER BY expression).
+
+You should never use partitioning that is too granular. Don't partition your data by client identifiers or names. Instead, make the client identifier or name the first column in the ORDER BY expression.
+:::
+
+Partitioning is available for the [MergeTree](../../../engines/table-engines/mergetree-family/mergetree.md) family tables (including [replicated](../../../engines/table-engines/mergetree-family/replication.md) tables). [Materialized views](../../../engines/table-engines/special/materializedview.md#materializedview) based on MergeTree tables support partitioning, as well.
+
+A partition is a logical combination of records in a table by a specified criterion. You can set a partition by an arbitrary criterion, such as by month, by day, or by event type. Each partition is stored separately to simplify manipulations of this data. When accessing the data, ClickHouse uses the smallest subset of partitions possible.
+
+The partition is specified in the `PARTITION BY expr` clause when [creating a table](../../../engines/table-engines/mergetree-family/mergetree.md#table_engine-mergetree-creating-a-table). The partition key can be any expression from the table columns. For example, to specify partitioning by month, use the expression `toYYYYMM(date_column)`:
+
+``` sql
+CREATE TABLE visits
+(
+ VisitDate Date,
+ Hour UInt8,
+ ClientID UUID
+)
+ENGINE = MergeTree()
+PARTITION BY toYYYYMM(VisitDate)
+ORDER BY Hour;
+```
+
+The partition key can also be a tuple of expressions (similar to the [primary key](../../../engines/table-engines/mergetree-family/mergetree.md#primary-keys-and-indexes-in-queries)). For example:
+
+``` sql
+ENGINE = ReplicatedCollapsingMergeTree('/clickhouse/tables/name', 'replica1', Sign)
+PARTITION BY (toMonday(StartDate), EventType)
+ORDER BY (CounterID, StartDate, intHash32(UserID));
+```
+
+In this example, we set partitioning by the event types that occurred during the current week.
+
+By default, a floating-point partition key is not supported. To use it, enable the setting [allow_floating_point_partition_key](../../../operations/settings/merge-tree-settings.md#allow_floating_point_partition_key).
+
+When inserting new data into a table, this data is stored as a separate part (chunk) sorted by the primary key. Within 10-15 minutes after inserting, the parts of the same partition are merged into a single part.
+
+:::info
+A merge only works for data parts that have the same value for the partitioning expression. This means **you shouldn’t make overly granular partitions** (more than about a thousand partitions). Otherwise, the `SELECT` query performs poorly because of an unreasonably large number of files in the file system and open file descriptors.
+:::
+
+Use the [system.parts](../../../operations/system-tables/parts.md#system_tables-parts) table to view the table parts and partitions. For example, let’s assume that we have a `visits` table with partitioning by month. Let’s perform the `SELECT` query for the `system.parts` table:
+
+``` sql
+SELECT
+ partition,
+ name,
+ active
+FROM system.parts
+WHERE table = 'visits'
+```
+
+``` text
+┌─partition─┬─name──────────────┬─active─┐
+│ 201901 │ 201901_1_3_1 │ 0 │
+│ 201901 │ 201901_1_9_2_11 │ 1 │
+│ 201901 │ 201901_8_8_0 │ 0 │
+│ 201901 │ 201901_9_9_0 │ 0 │
+│ 201902 │ 201902_4_6_1_11 │ 1 │
+│ 201902 │ 201902_10_10_0_11 │ 1 │
+│ 201902 │ 201902_11_11_0_11 │ 1 │
+└───────────┴───────────────────┴────────┘
+```
+
+The `partition` column contains the names of the partitions. There are two partitions in this example: `201901` and `201902`. You can use this column value to specify the partition name in [ALTER … PARTITION](../../../sql-reference/statements/alter/partition.md) queries.
+
+The `name` column contains the names of the partition data parts. You can use this column to specify the name of the part in the [ALTER ATTACH PART](../../../sql-reference/statements/alter/partition.md#alter_attach-partition) query.
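+
+For example, a partition can be manipulated by the name taken from the listing above (a sketch, not required for normal operation):
+
+``` sql
+-- Detach the partition and attach it back by its name from system.parts
+ALTER TABLE visits DETACH PARTITION 201901;
+ALTER TABLE visits ATTACH PARTITION 201901;
+```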
+
+Let’s break down the name of the part: `201901_1_9_2_11`:
+
+- `201901` is the partition name.
+- `1` is the minimum number of the data block.
+- `9` is the maximum number of the data block.
+- `2` is the chunk level (the depth of the merge tree it is formed from).
+- `11` is the mutation version (if a part mutated)
+
+:::info
+The parts of old-type tables have the name: `20190117_20190123_2_2_0` (minimum date - maximum date - minimum block number - maximum block number - level).
+:::
+
+The `active` column shows the status of the part. `1` is active; `0` is inactive. The inactive parts are, for example, source parts remaining after merging to a larger part. The corrupted data parts are also indicated as inactive.
+
+As you can see in the example, there are several separate parts of the same partition (for example, `201901_1_3_1` and `201901_1_9_2_11`). This means that these parts are not merged yet. ClickHouse merges the inserted parts of data periodically, approximately 15 minutes after inserting. In addition, you can perform a non-scheduled merge using the [OPTIMIZE](../../../sql-reference/statements/optimize.md) query. Example:
+
+``` sql
+OPTIMIZE TABLE visits PARTITION 201902;
+```
+
+``` text
+┌─partition─┬─name─────────────┬─active─┐
+│ 201901 │ 201901_1_3_1 │ 0 │
+│ 201901 │ 201901_1_9_2_11 │ 1 │
+│ 201901 │ 201901_8_8_0 │ 0 │
+│ 201901 │ 201901_9_9_0 │ 0 │
+│ 201902 │ 201902_4_6_1 │ 0 │
+│ 201902 │ 201902_4_11_2_11 │ 1 │
+│ 201902 │ 201902_10_10_0 │ 0 │
+│ 201902 │ 201902_11_11_0 │ 0 │
+└───────────┴──────────────────┴────────┘
+```
+
+Inactive parts will be deleted approximately 10 minutes after merging.
+
+Another way to view a set of parts and partitions is to go into the directory of the table: `/var/lib/clickhouse/data/<database>/<table>/`.
+
+Insert example:
+
+``` text
+Some header
+Page views: 5, User id: 4324182021466249494, Useless field: hello, Duration: 146, Sign: -1
+Page views: 6, User id: 4324182021466249494, Useless field: world, Duration: 185, Sign: 1
+Total rows: 2
+```
+
+``` sql
+INSERT INTO UserActivity FORMAT Template SETTINGS
+format_template_resultset = '/some/path/resultset.format', format_template_row = '/some/path/row.format'
+```
+
+`/some/path/resultset.format`:
+
+``` text
+Some header\n${data}\nTotal rows: ${:CSV}\n
+```
+
+`/some/path/row.format`:
+
+``` text
+Page views: ${PageViews:CSV}, User id: ${UserID:CSV}, Useless field: ${:CSV}, Duration: ${Duration:CSV}, Sign: ${Sign:CSV}
+```
+
+`PageViews`, `UserID`, `Duration` and `Sign` inside the placeholders are names of columns in the table. Values after `Useless field` in rows and after `\nTotal rows:` in the suffix will be ignored.
+All delimiters in the input data must be strictly equal to the delimiters in the specified format strings.
+
+## TemplateIgnoreSpaces {#templateignorespaces}
+
+This format is suitable only for input.
+Similar to `Template`, but skips whitespace characters between delimiters and values in the input stream. However, if format strings contain whitespace characters, these characters will be expected in the input stream. It also allows specifying empty placeholders (`${}` or `${:None}`) to split a delimiter into separate parts to ignore spaces between them. Such placeholders are used only for skipping whitespace characters.
+It is possible to read `JSON` using this format if the values of columns have the same order in all rows. For example, the following request can be used for inserting data from the output example of the [JSON](#json) format:
+
+``` sql
+INSERT INTO table_name FORMAT TemplateIgnoreSpaces SETTINGS
+format_template_resultset = '/some/path/resultset.format', format_template_row = '/some/path/row.format', format_template_rows_between_delimiter = ','
+```
+
+`/some/path/resultset.format`:
+
+``` text
+{${}"meta"${}:${:JSON},${}"data"${}:${}[${data}]${},${}"totals"${}:${:JSON},${}"extremes"${}:${:JSON},${}"rows"${}:${:JSON},${}"rows_before_limit_at_least"${}:${:JSON}${}}
+```
+
+`/some/path/row.format`:
+
+``` text
+{${}"SearchPhrase"${}:${}${phrase:JSON}${},${}"c"${}:${}${cnt:JSON}${}}
+```
+
+## TSKV {#tskv}
+
+Similar to TabSeparated, but outputs a value in name=value format. Names are escaped the same way as in TabSeparated format, and the = symbol is also escaped.
+
+``` text
+SearchPhrase= count()=8267016
+SearchPhrase=bathroom interior design count()=2166
+SearchPhrase=clickhouse count()=1655
+SearchPhrase=2014 spring fashion count()=1549
+SearchPhrase=freeform photos count()=1480
+SearchPhrase=angelina jolie count()=1245
+SearchPhrase=omsk count()=1112
+SearchPhrase=photos of dog breeds count()=1091
+SearchPhrase=curtain designs count()=1064
+SearchPhrase=baku count()=1000
+```
+
+[NULL](../sql-reference/syntax.md) is formatted as `\N`.
+
+``` sql
+SELECT * FROM t_null FORMAT TSKV
+```
+
+``` text
+x=1 y=\N
+```
+
+When there is a large number of small columns, this format is ineffective, and there is generally no reason to use it. Nevertheless, it is no worse than JSONEachRow in terms of efficiency.
+
+Both data output and parsing are supported in this format. For parsing, any order is supported for the values of different columns. It is acceptable for some values to be omitted – they are treated as equal to their default values. In this case, zeros and blank rows are used as default values. Complex values that could be specified in the table are not supported as defaults.
+
+Parsing allows the presence of the additional field `tskv` without the equal sign or a value. This field is ignored.
+
+## CSV {#csv}
+
+Comma Separated Values format ([RFC](https://tools.ietf.org/html/rfc4180)).
+
+When formatting, rows are enclosed in double-quotes. A double quote inside a string is output as two double quotes in a row. There are no other rules for escaping characters. Date and date-time are enclosed in double-quotes. Numbers are output without quotes. Values are separated by a delimiter character, which is `,` by default. The delimiter character is defined in the setting [format_csv_delimiter](../operations/settings/settings.md#settings-format_csv_delimiter). Rows are separated using the Unix line feed (LF). Arrays are serialized in CSV as follows: first, the array is serialized to a string as in TabSeparated format, and then the resulting string is output to CSV in double-quotes. Tuples in CSV format are serialized as separate columns (that is, their nesting in the tuple is lost).
+
+``` bash
+$ clickhouse-client --format_csv_delimiter="|" --query="INSERT INTO test.csv FORMAT CSV" < data.csv
+```
+
+\*By default, the delimiter is `,`. See the [format_csv_delimiter](../operations/settings/settings.md#settings-format_csv_delimiter) setting for more information.
+
+When parsing, all values can be parsed either with or without quotes. Both double and single quotes are supported. Rows can also be arranged without quotes. In this case, they are parsed up to the delimiter character or line feed (CR or LF). In violation of the RFC, when parsing rows without quotes, the leading and trailing spaces and tabs are ignored. For the line feed, Unix (LF), Windows (CR LF) and Mac OS Classic (CR) types are all supported.
+
+If setting [input_format_csv_empty_as_default](../operations/settings/settings.md#settings-input_format_csv_empty_as_default) is enabled,
+empty unquoted input values are replaced with default values. For complex default expressions [input_format_defaults_for_omitted_fields](../operations/settings/settings.md#settings-input_format_defaults_for_omitted_fields) must be enabled too.
+
+`NULL` is formatted according to setting [format_csv_null_representation](../operations/settings/settings.md#settings-format_csv_null_representation) (default value is `\N`).
+
+In input data, ENUM values can be represented as names or as ids. First, we try to match the input value to the ENUM name. If we fail and the input value is a number, we try to match this number to ENUM id.
+If input data contains only ENUM ids, it's recommended to enable the setting [input_format_csv_enum_as_number](../operations/settings/settings.md#settings-input_format_csv_enum_as_number) to optimize ENUM parsing.
+
+The CSV format supports the output of totals and extremes the same way as `TabSeparated`.
+
+## CSVWithNames {#csvwithnames}
+
+Also prints the header row with column names, similar to [TabSeparatedWithNames](#tabseparatedwithnames).
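+
+For instance (reusing the `t_null` table from the TSKV example above), the header row is requested simply by choosing the format:
+
+``` sql
+SELECT * FROM t_null FORMAT CSVWithNames;
+```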
+
+## CSVWithNamesAndTypes {#csvwithnamesandtypes}
+
+Also prints two header rows with column names and types, similar to [TabSeparatedWithNamesAndTypes](#tabseparatedwithnamesandtypes).
+
+## CustomSeparated {#format-customseparated}
+
+Similar to [Template](#format-template), but it prints or reads all names and types of columns and uses escaping rule from [format_custom_escaping_rule](../operations/settings/settings.md#format-custom-escaping-rule) setting and delimiters from [format_custom_field_delimiter](../operations/settings/settings.md#format-custom-field-delimiter), [format_custom_row_before_delimiter](../operations/settings/settings.md#format-custom-row-before-delimiter), [format_custom_row_after_delimiter](../operations/settings/settings.md#format-custom-row-after-delimiter), [format_custom_row_between_delimiter](../operations/settings/settings.md#format-custom-row-between-delimiter), [format_custom_result_before_delimiter](../operations/settings/settings.md#format-custom-result-before-delimiter) and [format_custom_result_after_delimiter](../operations/settings/settings.md#format-custom-result-after-delimiter) settings, not from format strings.
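+
+A rough sketch of how the settings-driven variant might be used (the delimiter and escaping rule values are arbitrary choices, and `t_null` is reused from the TSKV example above):
+
+``` sql
+SET format_custom_escaping_rule = 'CSV';
+SET format_custom_field_delimiter = ';';
+
+SELECT * FROM t_null FORMAT CustomSeparated;
+```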
+
+There is also `CustomSeparatedIgnoreSpaces` format, which is similar to [TemplateIgnoreSpaces](#templateignorespaces).
+
+## CustomSeparatedWithNames {#customseparatedwithnames}
+
+Also prints the header row with column names, similar to [TabSeparatedWithNames](#tabseparatedwithnames).
+
+## CustomSeparatedWithNamesAndTypes {#customseparatedwithnamesandtypes}
+
+Also prints two header rows with column names and types, similar to [TabSeparatedWithNamesAndTypes](#tabseparatedwithnamesandtypes).
+
+## JSON {#json}
+
+Outputs data in JSON format. Besides data tables, it also outputs column names and types, along with some additional information: the total number of output rows, and the number of rows that could have been output if there weren’t a LIMIT. Example:
+
+``` sql
+SELECT SearchPhrase, count() AS c FROM test.hits GROUP BY SearchPhrase WITH TOTALS ORDER BY c DESC LIMIT 5 FORMAT JSON
+```
+
+``` json
+{
+ "meta":
+ [
+ {
+ "name": "'hello'",
+ "type": "String"
+ },
+ {
+ "name": "multiply(42, number)",
+ "type": "UInt64"
+ },
+ {
+ "name": "range(5)",
+ "type": "Array(UInt8)"
+ }
+ ],
+
+ "data":
+ [
+ {
+ "'hello'": "hello",
+ "multiply(42, number)": "0",
+ "range(5)": [0,1,2,3,4]
+ },
+ {
+ "'hello'": "hello",
+ "multiply(42, number)": "42",
+ "range(5)": [0,1,2,3,4]
+ },
+ {
+ "'hello'": "hello",
+ "multiply(42, number)": "84",
+ "range(5)": [0,1,2,3,4]
+ }
+ ],
+
+ "rows": 3,
+
+ "rows_before_limit_at_least": 3
+}
+```
+
+The JSON is compatible with JavaScript. To ensure this, some characters are additionally escaped: the slash `/` is escaped as `\/`; alternative line breaks `U+2028` and `U+2029`, which break some browsers, are escaped as `\uXXXX`. ASCII control characters are escaped: backspace, form feed, line feed, carriage return, and horizontal tab are replaced with `\b`, `\f`, `\n`, `\r`, `\t` , as well as the remaining bytes in the 00-1F range using `\uXXXX` sequences. Invalid UTF-8 sequences are changed to the replacement character � so the output text will consist of valid UTF-8 sequences. For compatibility with JavaScript, Int64 and UInt64 integers are enclosed in double-quotes by default. To remove the quotes, you can set the configuration parameter [output_format_json_quote_64bit_integers](../operations/settings/settings.md#session_settings-output_format_json_quote_64bit_integers) to 0.
+
+`rows` – The total number of output rows.
+
+`rows_before_limit_at_least` – The minimal number of rows there would have been without LIMIT. Output only if the query contains LIMIT.
+If the query contains GROUP BY, `rows_before_limit_at_least` is the exact number of rows there would have been without a LIMIT.
+
+`totals` – Total values (when using WITH TOTALS).
+
+`extremes` – Extreme values (when extremes are set to 1).
+
+This format is only appropriate for outputting a query result, but not for parsing (retrieving data to insert in a table).
+
+ClickHouse supports [NULL](../sql-reference/syntax.md), which is displayed as `null` in the JSON output. To enable `+nan`, `-nan`, `+inf`, `-inf` values in output, set the [output_format_json_quote_denormals](../operations/settings/settings.md#settings-output_format_json_quote_denormals) to 1.
+
+**See Also**
+
+- [JSONEachRow](#jsoneachrow) format
+- [output_format_json_array_of_rows](../operations/settings/settings.md#output-format-json-array-of-rows) setting
+
+## JSONStrings {#jsonstrings}
+
+Differs from JSON only in that data fields are output in strings, not in typed JSON values.
+
+Example:
+
+```json
+{
+ "meta":
+ [
+ {
+ "name": "'hello'",
+ "type": "String"
+ },
+ {
+ "name": "multiply(42, number)",
+ "type": "UInt64"
+ },
+ {
+ "name": "range(5)",
+ "type": "Array(UInt8)"
+ }
+ ],
+
+ "data":
+ [
+ {
+ "'hello'": "hello",
+ "multiply(42, number)": "0",
+ "range(5)": "[0,1,2,3,4]"
+ },
+ {
+ "'hello'": "hello",
+ "multiply(42, number)": "42",
+ "range(5)": "[0,1,2,3,4]"
+ },
+ {
+ "'hello'": "hello",
+ "multiply(42, number)": "84",
+ "range(5)": "[0,1,2,3,4]"
+ }
+ ],
+
+ "rows": 3,
+
+ "rows_before_limit_at_least": 3
+}
+```
+
+## JSONAsString {#jsonasstring}
+
+In this format, a single JSON object is interpreted as a single value. If the input has several JSON objects (comma separated), they are interpreted as separate rows. If the input data is enclosed in square brackets, it is interpreted as an array of JSON objects.
+
+This format can only be parsed for a table with a single field of type [String](../sql-reference/data-types/string.md). The remaining columns must be set to [DEFAULT](../sql-reference/statements/create/table.md#default) or [MATERIALIZED](../sql-reference/statements/create/table.md#materialized), or omitted. Once you collect the whole JSON object into a string, you can use [JSON functions](../sql-reference/functions/json-functions.md) to process it.
+
+**Examples**
+
+Query:
+
+``` sql
+DROP TABLE IF EXISTS json_as_string;
+CREATE TABLE json_as_string (json String) ENGINE = Memory;
+INSERT INTO json_as_string (json) FORMAT JSONAsString {"foo":{"bar":{"x":"y"},"baz":1}},{},{"any json stucture":1}
+SELECT * FROM json_as_string;
+```
+
+Result:
+
+``` text
+┌─json──────────────────────────────┐
+│ {"foo":{"bar":{"x":"y"},"baz":1}} │
+│ {} │
+│ {"any json stucture":1} │
+└───────────────────────────────────┘
+```
+
+**An array of JSON objects**
+
+Query:
+
+``` sql
+CREATE TABLE json_square_brackets (field String) ENGINE = Memory;
+INSERT INTO json_square_brackets FORMAT JSONAsString [{"id": 1, "name": "name1"}, {"id": 2, "name": "name2"}];
+
+SELECT * FROM json_square_brackets;
+```
+
+Result:
+
+```text
+┌─field──────────────────────┐
+│ {"id": 1, "name": "name1"} │
+│ {"id": 2, "name": "name2"} │
+└────────────────────────────┘
+```
+
+## JSONCompact {#jsoncompact}
+## JSONCompactStrings {#jsoncompactstrings}
+
+Differs from JSON only in that data rows are output in arrays, not in objects.
+
+Example:
+
+```
+// JSONCompact
+{
+ "meta":
+ [
+ {
+ "name": "'hello'",
+ "type": "String"
+ },
+ {
+ "name": "multiply(42, number)",
+ "type": "UInt64"
+ },
+ {
+ "name": "range(5)",
+ "type": "Array(UInt8)"
+ }
+ ],
+
+ "data":
+ [
+ ["hello", "0", [0,1,2,3,4]],
+ ["hello", "42", [0,1,2,3,4]],
+ ["hello", "84", [0,1,2,3,4]]
+ ],
+
+ "rows": 3,
+
+ "rows_before_limit_at_least": 3
+}
+```
+
+```
+// JSONCompactStrings
+{
+ "meta":
+ [
+ {
+ "name": "'hello'",
+ "type": "String"
+ },
+ {
+ "name": "multiply(42, number)",
+ "type": "UInt64"
+ },
+ {
+ "name": "range(5)",
+ "type": "Array(UInt8)"
+ }
+ ],
+
+ "data":
+ [
+ ["hello", "0", "[0,1,2,3,4]"],
+ ["hello", "42", "[0,1,2,3,4]"],
+ ["hello", "84", "[0,1,2,3,4]"]
+ ],
+
+ "rows": 3,
+
+ "rows_before_limit_at_least": 3
+}
+```
+
+## JSONEachRow {#jsoneachrow}
+## JSONStringsEachRow {#jsonstringseachrow}
+## JSONCompactEachRow {#jsoncompacteachrow}
+## JSONCompactStringsEachRow {#jsoncompactstringseachrow}
+
+When using these formats, ClickHouse outputs rows as separate, newline-delimited JSON values, but the data as a whole is not valid JSON.
+
+``` json
+{"some_int":42,"some_str":"hello","some_tuple":[1,"a"]} // JSONEachRow
+[42,"hello",[1,"a"]] // JSONCompactEachRow
+["42","hello","(1,'a')"] // JSONCompactStringsEachRow
+```
+
+When inserting the data, you should provide a separate JSON value for each row.
+
+## JSONEachRowWithProgress {#jsoneachrowwithprogress}
+## JSONStringsEachRowWithProgress {#jsonstringseachrowwithprogress}
+
+Differs from `JSONEachRow`/`JSONStringsEachRow` in that ClickHouse will also yield progress information as JSON values.
+
+```json
+{"row":{"'hello'":"hello","multiply(42, number)":"0","range(5)":[0,1,2,3,4]}}
+{"row":{"'hello'":"hello","multiply(42, number)":"42","range(5)":[0,1,2,3,4]}}
+{"row":{"'hello'":"hello","multiply(42, number)":"84","range(5)":[0,1,2,3,4]}}
+{"progress":{"read_rows":"3","read_bytes":"24","written_rows":"0","written_bytes":"0","total_rows_to_read":"3"}}
+```
+
+## JSONCompactEachRowWithNames {#jsoncompacteachrowwithnames}
+
+Differs from `JSONCompactEachRow` format in that it also prints the header row with column names, similar to [TabSeparatedWithNames](#tabseparatedwithnames).
+
+## JSONCompactEachRowWithNamesAndTypes {#jsoncompacteachrowwithnamesandtypes}
+
+Differs from `JSONCompactEachRow` format in that it also prints two header rows with column names and types, similar to [TabSeparatedWithNamesAndTypes](#tabseparatedwithnamesandtypes).
+
+## JSONCompactStringsEachRowWithNames {#jsoncompactstringseachrowwithnames}
+
+Differs from `JSONCompactStringsEachRow` in that it also prints the header row with column names, similar to [TabSeparatedWithNames](#tabseparatedwithnames).
+
+## JSONCompactStringsEachRowWithNamesAndTypes {#jsoncompactstringseachrowwithnamesandtypes}
+
+Differs from `JSONCompactStringsEachRow` in that it also prints two header rows with column names and types, similar to [TabSeparatedWithNamesAndTypes](#tabseparatedwithnamesandtypes).
+
+```json
+["'hello'", "multiply(42, number)", "range(5)"]
+["String", "UInt64", "Array(UInt8)"]
+["hello", "0", "[0,1,2,3,4]"]
+["hello", "42", "[0,1,2,3,4]"]
+["hello", "84", "[0,1,2,3,4]"]
+```
+
+### Inserting Data {#inserting-data}
+
+``` sql
+INSERT INTO UserActivity FORMAT JSONEachRow {"PageViews":5, "UserID":"4324182021466249494", "Duration":146,"Sign":-1} {"UserID":"4324182021466249494","PageViews":6,"Duration":185,"Sign":1}
+```
+
+ClickHouse allows:
+
+- Any order of key-value pairs in the object.
+- Omitting some values.
+
+ClickHouse ignores spaces between elements and commas after the objects. You can pass all the objects in one line. You do not have to separate them with line breaks.
+
+**Omitted values processing**
+
+ClickHouse substitutes omitted values with the default values for the corresponding [data types](../sql-reference/data-types/index.md).
+
+If `DEFAULT expr` is specified, ClickHouse uses different substitution rules depending on the [input_format_defaults_for_omitted_fields](../operations/settings/settings.md#session_settings-input_format_defaults_for_omitted_fields) setting.
+
+Consider the following table:
+
+``` sql
+CREATE TABLE IF NOT EXISTS example_table
+(
+ x UInt32,
+ a DEFAULT x * 2
+) ENGINE = Memory;
+```
+
+- If `input_format_defaults_for_omitted_fields = 0`, then the default value for `x` and `a` equals `0` (as the default value for the `UInt32` data type).
+- If `input_format_defaults_for_omitted_fields = 1`, then the default value for `x` equals `0`, but the default value of `a` equals `x * 2`.
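+
+A sketch of an insert that omits `a` (reusing `example_table` from above; the literal values are arbitrary):
+
+``` sql
+SET input_format_defaults_for_omitted_fields = 1;
+
+INSERT INTO example_table FORMAT JSONEachRow {"x":10};
+
+-- With the setting enabled, a is calculated from its DEFAULT expression as x * 2 = 20
+SELECT * FROM example_table;
+```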
+
+:::warning
+When inserting data with `input_format_defaults_for_omitted_fields = 1`, ClickHouse consumes more computational resources, compared to insertion with `input_format_defaults_for_omitted_fields = 0`.
+:::
+
+### Selecting Data {#selecting-data}
+
+Consider the `UserActivity` table as an example:
+
+``` text
+┌──────────────UserID─┬─PageViews─┬─Duration─┬─Sign─┐
+│ 4324182021466249494 │ 5 │ 146 │ -1 │
+│ 4324182021466249494 │ 6 │ 185 │ 1 │
+└─────────────────────┴───────────┴──────────┴──────┘
+```
+
+The query `SELECT * FROM UserActivity FORMAT JSONEachRow` returns:
+
+``` text
+{"UserID":"4324182021466249494","PageViews":5,"Duration":146,"Sign":-1}
+{"UserID":"4324182021466249494","PageViews":6,"Duration":185,"Sign":1}
+```
+
+Unlike the [JSON](#json) format, there is no substitution of invalid UTF-8 sequences. Values are escaped in the same way as for `JSON`.
+
+:::info
+Any set of bytes can be output in the strings. Use the `JSONEachRow` format if you are sure that the data in the table can be formatted as JSON without losing any information.
+:::
+
+### Usage of Nested Structures {#jsoneachrow-nested}
+
+If you have a table with [Nested](../sql-reference/data-types/nested-data-structures/nested.md) data type columns, you can insert JSON data with the same structure. Enable this feature with the [input_format_import_nested_json](../operations/settings/settings.md#settings-input_format_import_nested_json) setting.
+
+For example, consider the following table:
+
+``` sql
+CREATE TABLE json_each_row_nested (n Nested (s String, i Int32) ) ENGINE = Memory
+```
+
+As you can see in the `Nested` data type description, ClickHouse treats each component of the nested structure as a separate column (`n.s` and `n.i` for our table). You can insert data in the following way:
+
+``` sql
+INSERT INTO json_each_row_nested FORMAT JSONEachRow {"n.s": ["abc", "def"], "n.i": [1, 23]}
+```
+
+To insert data as a hierarchical JSON object, set [input_format_import_nested_json=1](../operations/settings/settings.md#settings-input_format_import_nested_json).
+
+``` json
+{
+ "n": {
+ "s": ["abc", "def"],
+ "i": [1, 23]
+ }
+}
+```
+
+Without this setting, ClickHouse throws an exception.
+
+``` sql
+SELECT name, value FROM system.settings WHERE name = 'input_format_import_nested_json'
+```
+
+``` text
+┌─name────────────────────────────┬─value─┐
+│ input_format_import_nested_json │ 0 │
+└─────────────────────────────────┴───────┘
+```
+
+``` sql
+INSERT INTO json_each_row_nested FORMAT JSONEachRow {"n": {"s": ["abc", "def"], "i": [1, 23]}}
+```
+
+``` text
+Code: 117. DB::Exception: Unknown field found while parsing JSONEachRow format: n: (at row 1)
+```
+
+``` sql
+SET input_format_import_nested_json=1
+INSERT INTO json_each_row_nested FORMAT JSONEachRow {"n": {"s": ["abc", "def"], "i": [1, 23]}}
+SELECT * FROM json_each_row_nested
+```
+
+``` text
+┌─n.s───────────┬─n.i────┐
+│ ['abc','def'] │ [1,23] │
+└───────────────┴────────┘
+```
+
+## Native {#native}
+
+The most efficient format. Data is written and read by blocks in binary format. For each block, the number of rows, number of columns, column names and types, and parts of columns in this block are recorded one after another. In other words, this format is “columnar” – it does not convert columns to rows. This is the format used in the native interface for interaction between servers, for using the command-line client, and for C++ clients.
+
+You can use this format to quickly generate dumps that can only be read by the ClickHouse DBMS. It does not make sense to work with this format yourself.
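+
+For example, a dump-and-restore sketch (the table and file names are hypothetical; the target table must have the same structure):
+
+``` bash
+# Dump a table to a file in the Native format and load it back into another table
+$ clickhouse-client --query="SELECT * FROM some_table FORMAT Native" > some_table.native
+$ cat some_table.native | clickhouse-client --query="INSERT INTO another_table FORMAT Native"
+```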
+
+## Null {#null}
+
+Nothing is output. However, the query is processed, and when using the command-line client, data is transmitted to the client. This is used for tests, including performance testing.
+Obviously, this format is only appropriate for output, not for parsing.
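+
+For example, a sketch that runs a query only to measure its execution time (assuming the client’s `--time` option), discarding the result:
+
+``` bash
+# The query is fully executed, but nothing is printed except the timing reported by --time
+$ clickhouse-client --time --query="SELECT * FROM system.numbers LIMIT 10000000 FORMAT Null"
+```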
+
+## Pretty {#pretty}
+
+Outputs data as Unicode-art tables, also using ANSI-escape sequences for setting colours in the terminal.
+A full grid of the table is drawn, and each row occupies two lines in the terminal.
+Each result block is output as a separate table. This is necessary so that blocks can be output without buffering results (buffering would be necessary in order to pre-calculate the visible width of all the values).
+
+[NULL](../sql-reference/syntax.md) is output as `ᴺᵁᴸᴸ`.
+
+Example (shown for the [PrettyCompact](#prettycompact) format):
+
+``` sql
+SELECT * FROM t_null
+```
+
+``` text
+┌─x─┬────y─┐
+│ 1 │ ᴺᵁᴸᴸ │
+└───┴──────┘
+```
+
+Rows are not escaped in Pretty\* formats. Example is shown for the [PrettyCompact](#prettycompact) format:
+
+``` sql
+SELECT 'String with \'quotes\' and \t character' AS Escaping_test
+```
+
+``` text
+┌─Escaping_test────────────────────────┐
+│ String with 'quotes' and character │
+└──────────────────────────────────────┘
+```
+
+To avoid dumping too much data to the terminal, only the first 10,000 rows are printed. If the number of rows is greater than or equal to 10,000, the message “Showed first 10 000” is printed.
+This format is only appropriate for outputting a query result, but not for parsing (retrieving data to insert in a table).
+
+The Pretty format supports outputting total values (when using WITH TOTALS) and extremes (when ‘extremes’ is set to 1). In these cases, total values and extreme values are output after the main data, in separate tables. Example (shown for the [PrettyCompact](#prettycompact) format):
+
+``` sql
+SELECT EventDate, count() AS c FROM test.hits GROUP BY EventDate WITH TOTALS ORDER BY EventDate FORMAT PrettyCompact
+```
+
+``` text
+┌──EventDate─┬───────c─┐
+│ 2014-03-17 │ 1406958 │
+│ 2014-03-18 │ 1383658 │
+│ 2014-03-19 │ 1405797 │
+│ 2014-03-20 │ 1353623 │
+│ 2014-03-21 │ 1245779 │
+│ 2014-03-22 │ 1031592 │
+│ 2014-03-23 │ 1046491 │
+└────────────┴─────────┘
+
+Totals:
+┌──EventDate─┬───────c─┐
+│ 1970-01-01 │ 8873898 │
+└────────────┴─────────┘
+
+Extremes:
+┌──EventDate─┬───────c─┐
+│ 2014-03-17 │ 1031592 │
+│ 2014-03-23 │ 1406958 │
+└────────────┴─────────┘
+```
+
+## PrettyCompact {#prettycompact}
+
+Differs from [Pretty](#pretty) in that the grid is drawn between rows and the result is more compact.
+This format is used by default in the command-line client in interactive mode.
+
+## PrettyCompactMonoBlock {#prettycompactmonoblock}
+
+Differs from [PrettyCompact](#prettycompact) in that up to 10,000 rows are buffered, then output as a single table, not by blocks.
+
+## PrettyNoEscapes {#prettynoescapes}
+
+Differs from Pretty in that ANSI-escape sequences aren’t used. This is necessary for displaying this format in a browser, as well as for using the ‘watch’ command-line utility.
+
+Example:
+
+``` bash
+$ watch -n1 "clickhouse-client --query='SELECT event, value FROM system.events FORMAT PrettyCompactNoEscapes'"
+```
+
+You can use the HTTP interface for displaying in the browser.
+
+### PrettyCompactNoEscapes {#prettycompactnoescapes}
+
+Differs from [PrettyCompact](#prettycompact) in that ANSI-escape sequences aren’t used.
+
+### PrettySpaceNoEscapes {#prettyspacenoescapes}
+
+Differs from [PrettySpace](#prettyspace) in that ANSI-escape sequences aren’t used.
+
+## PrettySpace {#prettyspace}
+
+Differs from [PrettyCompact](#prettycompact) in that whitespace (space characters) is used instead of the grid.
+
+## RowBinary {#rowbinary}
+
+Formats and parses data by row in binary format. Rows and values are listed consecutively, without separators.
+This format is less efficient than the Native format since it is row-based.
+
+Integers use fixed-length little-endian representation. For example, UInt64 uses 8 bytes.
+DateTime is represented as UInt32 containing the Unix timestamp as the value.
+Date is represented as a UInt16 object that contains the number of days since 1970-01-01 as the value.
+String is represented as a varint length (unsigned [LEB128](https://en.wikipedia.org/wiki/LEB128)), followed by the bytes of the string.
+FixedString is represented simply as a sequence of bytes.
+
+Array is represented as a varint length (unsigned [LEB128](https://en.wikipedia.org/wiki/LEB128)), followed by successive elements of the array.
+
+For [NULL](../sql-reference/syntax.md#null-literal) support, an additional byte containing 1 or 0 is added before each [Nullable](../sql-reference/data-types/nullable.md) value. If 1, then the value is `NULL` and this byte is interpreted as a separate value. If 0, the value after the byte is not `NULL`.
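+
+For example, a sketch that inspects the raw encoding of a single row (a `UInt8` and a `String`) with `xxd`:
+
+``` bash
+# Expected bytes: 0x01 for the UInt8 value, then 0x03 (varint length) followed by the three bytes of 'abc'
+$ clickhouse-client --query="SELECT 1 AS x, 'abc' AS s FORMAT RowBinary" | xxd
+```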
+
+## RowBinaryWithNames {#rowbinarywithnames}
+
+Similar to [RowBinary](#rowbinary), but with added header:
+
+- [LEB128](https://en.wikipedia.org/wiki/LEB128)-encoded number of columns (N)
+- N `String`s specifying column names
+
+## RowBinaryWithNamesAndTypes {#rowbinarywithnamesandtypes}
+
+Similar to [RowBinary](#rowbinary), but with added header:
+
+- [LEB128](https://en.wikipedia.org/wiki/LEB128)-encoded number of columns (N)
+- N `String`s specifying column names
+- N `String`s specifying column types
+
+## Values {#data-format-values}
+
+Prints every row in brackets. Rows are separated by commas. There is no comma after the last row. The values inside the brackets are also comma-separated. Numbers are output in a decimal format without quotes. Arrays are output in square brackets. Strings, dates, and dates with times are output in quotes. Escaping rules and parsing are similar to the [TabSeparated](#tabseparated) format. During formatting, extra spaces aren’t inserted, but during parsing, they are allowed and skipped (except for spaces inside array values, which are not allowed). [NULL](../sql-reference/syntax.md) is represented as `NULL`.
+
+The minimum set of characters that you need to escape when passing data in Values format: single quotes and backslashes.
+
+This is the format that is used in `INSERT INTO t VALUES ...`, but you can also use it for formatting query results.
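+
+For example, a sketch of formatting a query result as `Values`:
+
+``` bash
+$ clickhouse-client --query="SELECT number, 'hello' FROM system.numbers LIMIT 3 FORMAT Values"
+# (0,'hello'),(1,'hello'),(2,'hello')
+```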
+
+See also: [input_format_values_interpret_expressions](../operations/settings/settings.md#settings-input_format_values_interpret_expressions) and [input_format_values_deduce_templates_of_expressions](../operations/settings/settings.md#settings-input_format_values_deduce_templates_of_expressions) settings.
+
+## Vertical {#vertical}
+
+Prints each value on a separate line with the column name specified. This format is convenient for printing just one or a few rows if each row consists of a large number of columns.
+
+[NULL](../sql-reference/syntax.md) is output as `ᴺᵁᴸᴸ`.
+
+Example:
+
+``` sql
+SELECT * FROM t_null FORMAT Vertical
+```
+
+``` text
+Row 1:
+──────
+x: 1
+y: ᴺᵁᴸᴸ
+```
+
+Rows are not escaped in Vertical format:
+
+``` sql
+SELECT 'string with \'quotes\' and \t with some special \n characters' AS test FORMAT Vertical
+```
+
+``` text
+Row 1:
+──────
+test: string with 'quotes' and with some special
+ characters
+```
+
+This format is only appropriate for outputting a query result, but not for parsing (retrieving data to insert in a table).
+
+## XML {#xml}
+
+XML format is suitable only for output, not for parsing. Example:
+
+``` xml
+<?xml version='1.0' encoding='UTF-8' ?>
+<result>
+        <meta>
+                <columns>
+                        <column>
+                                <name>SearchPhrase</name>
+                                <type>String</type>
+                        </column>
+                        <column>
+                                <name>count()</name>
+                                <type>UInt64</type>
+                        </column>
+                </columns>
+        </meta>
+        <data>
+                <row>
+                        <SearchPhrase></SearchPhrase>
+                        <field>8267016</field>
+                </row>
+                <row>
+                        <SearchPhrase>bathroom interior design</SearchPhrase>
+                        <field>2166</field>
+                </row>
+                <row>
+                        <SearchPhrase>clickhouse</SearchPhrase>
+                        <field>1655</field>
+                </row>
+                <row>
+                        <SearchPhrase>2014 spring fashion</SearchPhrase>
+                        <field>1549</field>
+                </row>
+                <row>
+                        <SearchPhrase>freeform photos</SearchPhrase>
+                        <field>1480</field>
+                </row>
+                <row>
+                        <SearchPhrase>angelina jolie</SearchPhrase>
+                        <field>1245</field>
+                </row>
+                <row>
+                        <SearchPhrase>omsk</SearchPhrase>
+                        <field>1112</field>
+                </row>
+                <row>
+                        <SearchPhrase>photos of dog breeds</SearchPhrase>
+                        <field>1091</field>
+                </row>
+                <row>
+                        <SearchPhrase>curtain designs</SearchPhrase>
+                        <field>1064</field>
+                </row>
+                <row>
+                        <SearchPhrase>baku</SearchPhrase>
+                        <field>1000</field>
+                </row>
+        </data>
+        <rows>10</rows>
+        <rows_before_limit_at_least>141137</rows_before_limit_at_least>
+</result>
+```
+
+If the column name does not have an acceptable format, just ‘field’ is used as the element name. In general, the XML structure follows the JSON structure.
+Just as for JSON, invalid UTF-8 sequences are changed to the replacement character � so the output text will consist of valid UTF-8 sequences.
+
+In string values, the characters `<` and `&` are escaped as `&lt;` and `&amp;`.
+
+Arrays are output as `<array><elem>Hello</elem><elem>World</elem>...</array>`, and tuples as `<tuple><elem>Hello</elem><elem>World</elem>...</tuple>`.
+
+## CapnProto {#capnproto}
+
+CapnProto is a binary message format similar to [Protocol Buffers](https://developers.google.com/protocol-buffers/) and [Thrift](https://en.wikipedia.org/wiki/Apache_Thrift), but not like [JSON](#json) or [MessagePack](https://msgpack.org/).
+
+CapnProto messages are strictly typed and not self-describing, meaning they need an external schema description. The schema is applied on the fly and cached for each query.
+
+See also [Format Schema](#formatschema).
+
+### Data Types Matching {#data_types-matching-capnproto}
+
+The table below shows supported data types and how they match ClickHouse [data types](../sql-reference/data-types/index.md) in `INSERT` and `SELECT` queries.
+
+| CapnProto data type (`INSERT`) | ClickHouse data type | CapnProto data type (`SELECT`) |
+|--------------------------------|-----------------------------------------------------------|--------------------------------|
+| `UINT8`, `BOOL` | [UInt8](../sql-reference/data-types/int-uint.md) | `UINT8` |
+| `INT8` | [Int8](../sql-reference/data-types/int-uint.md) | `INT8` |
+| `UINT16` | [UInt16](../sql-reference/data-types/int-uint.md), [Date](../sql-reference/data-types/date.md) | `UINT16` |
+| `INT16` | [Int16](../sql-reference/data-types/int-uint.md) | `INT16` |
+| `UINT32` | [UInt32](../sql-reference/data-types/int-uint.md), [DateTime](../sql-reference/data-types/datetime.md) | `UINT32` |
+| `INT32` | [Int32](../sql-reference/data-types/int-uint.md) | `INT32` |
+| `UINT64` | [UInt64](../sql-reference/data-types/int-uint.md) | `UINT64` |
+| `INT64` | [Int64](../sql-reference/data-types/int-uint.md), [DateTime64](../sql-reference/data-types/datetime.md) | `INT64` |
+| `FLOAT32` | [Float32](../sql-reference/data-types/float.md) | `FLOAT32` |
+| `FLOAT64` | [Float64](../sql-reference/data-types/float.md) | `FLOAT64` |
+| `TEXT, DATA` | [String](../sql-reference/data-types/string.md), [FixedString](../sql-reference/data-types/fixedstring.md) | `TEXT, DATA` |
+| `union(T, Void), union(Void, T)` | [Nullable(T)](../sql-reference/data-types/nullable.md) | `union(T, Void), union(Void, T)` |
+| `ENUM` | [Enum(8\|16)](../sql-reference/data-types/enum.md) | `ENUM` |
+| `LIST` | [Array](../sql-reference/data-types/array.md) | `LIST` |
+| `STRUCT` | [Tuple](../sql-reference/data-types/tuple.md) | `STRUCT` |
+
+For working with `Enum` in CapnProto format use the [format_capn_proto_enum_comparising_mode](../operations/settings/settings.md#format-capn-proto-enum-comparising-mode) setting.
+
+Arrays can be nested and can have a value of the `Nullable` type as an argument. `Tuple` type also can be nested.
+
+### Inserting and Selecting Data {#inserting-and-selecting-data-capnproto}
+
+You can insert CapnProto data from a file into a ClickHouse table with the following command:
+
+``` bash
+$ cat capnproto_messages.bin | clickhouse-client --query "INSERT INTO test.hits FORMAT CapnProto SETTINGS format_schema = 'schema:Message'"
+```
+
+Where `schema.capnp` looks like this:
+
+``` capnp
+struct Message {
+    SearchPhrase @0 :Text;
+    c @1 :UInt64;
+}
+```
+
+You can select data from a ClickHouse table and save it to a file in the CapnProto format with the following command:
+
+``` bash
+$ clickhouse-client --query="SELECT * FROM test.hits FORMAT CapnProto SETTINGS format_schema = 'schema:Message'" > capnproto_messages.bin
+```
+
+## Protobuf {#protobuf}
+
+Protobuf is the [Protocol Buffers](https://developers.google.com/protocol-buffers/) format.
+
+This format requires an external format schema. The schema is cached between queries.
+ClickHouse supports both `proto2` and `proto3` syntaxes. Repeated/optional/required fields are supported.
+
+Usage examples:
+
+``` sql
+SELECT * FROM test.table FORMAT Protobuf SETTINGS format_schema = 'schemafile:MessageType'
+```
+
+``` bash
+cat protobuf_messages.bin | clickhouse-client --query "INSERT INTO test.table FORMAT Protobuf SETTINGS format_schema='schemafile:MessageType'"
+```
+
+where the file `schemafile.proto` looks like this:
+
+``` protobuf
+syntax = "proto3";
+
+message MessageType {
+ string name = 1;
+ string surname = 2;
+ uint32 birthDate = 3;
+ repeated string phoneNumbers = 4;
+};
+```
+
+To find the correspondence between table columns and fields of Protocol Buffers’ message type, ClickHouse compares their names.
+This comparison is case-insensitive, and the characters `_` (underscore) and `.` (dot) are considered equal.
+If the types of a column and a field of the Protocol Buffers’ message differ, the necessary conversion is applied.
+
+Nested messages are supported. For example, for the field `z` in the following message type
+
+``` protobuf
+message MessageType {
+  message XType {
+    message YType {
+      int32 z = 1;
+    };
+    repeated YType y = 1;
+  };
+  XType x = 1;
+};
+```
+
+ClickHouse tries to find a column named `x.y.z` (or `x_y_z` or `X.y_Z` and so on).
+Nested messages are suitable for input or output of [nested data structures](../sql-reference/data-types/nested-data-structures/nested.md).
+
+Default values defined in a protobuf schema like this
+
+``` protobuf
+syntax = "proto2";
+
+message MessageType {
+ optional int32 result_per_page = 3 [default = 10];
+}
+```
+
+are not applied; the [table defaults](../sql-reference/statements/create/table.md#create-default-values) are used instead.
+
+ClickHouse inputs and outputs protobuf messages in the `length-delimited` format.
+This means that every message is preceded by its length written as a [varint](https://developers.google.com/protocol-buffers/docs/encoding#varints).
+See also [how to read/write length-delimited protobuf messages in popular languages](https://cwiki.apache.org/confluence/display/GEODE/Delimiting+Protobuf+Messages).
+
+## ProtobufSingle {#protobufsingle}
+
+Same as [Protobuf](#protobuf) but for storing/parsing a single Protobuf message without length delimiters.
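+
+For example, a sketch that loads one message (no length prefix) using the `schemafile.proto` from the [Protobuf](#protobuf) section; the input file name is illustrative:
+
+``` bash
+$ cat single_message.bin | clickhouse-client --query="INSERT INTO test.table FORMAT ProtobufSingle SETTINGS format_schema = 'schemafile:MessageType'"
+```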
+
+## Avro {#data-format-avro}
+
+[Apache Avro](https://avro.apache.org/) is a row-oriented data serialization framework developed within Apache’s Hadoop project.
+
+ClickHouse Avro format supports reading and writing [Avro data files](https://avro.apache.org/docs/current/spec.html#Object+Container+Files).
+
+### Data Types Matching {#data_types-matching}
+
+The table below shows supported data types and how they match ClickHouse [data types](../sql-reference/data-types/index.md) in `INSERT` and `SELECT` queries.
+
+| Avro data type `INSERT` | ClickHouse data type | Avro data type `SELECT` |
+|---------------------------------------------|-----------------------------------------------------------------------------------------------------------------------|------------------------------|
+| `boolean`, `int`, `long`, `float`, `double` | [Int(8\|16\|32)](../sql-reference/data-types/int-uint.md), [UInt(8\|16\|32)](../sql-reference/data-types/int-uint.md) | `int` |
+| `boolean`, `int`, `long`, `float`, `double` | [Int64](../sql-reference/data-types/int-uint.md), [UInt64](../sql-reference/data-types/int-uint.md) | `long` |
+| `boolean`, `int`, `long`, `float`, `double` | [Float32](../sql-reference/data-types/float.md) | `float` |
+| `boolean`, `int`, `long`, `float`, `double` | [Float64](../sql-reference/data-types/float.md) | `double` |
+| `bytes`, `string`, `fixed`, `enum` | [String](../sql-reference/data-types/string.md) | `bytes` or `string` \* |
+| `bytes`, `string`, `fixed` | [FixedString(N)](../sql-reference/data-types/fixedstring.md) | `fixed(N)` |
+| `enum` | [Enum(8\|16)](../sql-reference/data-types/enum.md) | `enum` |
+| `array(T)` | [Array(T)](../sql-reference/data-types/array.md) | `array(T)` |
+| `union(null, T)`, `union(T, null)` | [Nullable(T)](../sql-reference/data-types/date.md) | `union(null, T)` |
+| `null` | [Nullable(Nothing)](../sql-reference/data-types/special-data-types/nothing.md) | `null` |
+| `int (date)` \** | [Date](../sql-reference/data-types/date.md) | `int (date)` \** |
+| `long (timestamp-millis)` \** | [DateTime64(3)](../sql-reference/data-types/datetime.md) | `long (timestamp-millis)` \** |
+| `long (timestamp-micros)` \** | [DateTime64(6)](../sql-reference/data-types/datetime.md) | `long (timestamp-micros)` \** |
+
+\* `bytes` is default, controlled by [output_format_avro_string_column_pattern](../operations/settings/settings.md#settings-output_format_avro_string_column_pattern)
+\** [Avro logical types](https://avro.apache.org/docs/current/spec.html#Logical+Types)
+
+Unsupported Avro data types: `record` (non-root), `map`
+
+Unsupported Avro logical data types: `time-millis`, `time-micros`, `duration`
+
+### Inserting Data {#inserting-data-1}
+
+To insert data from an Avro file into a ClickHouse table:
+
+``` bash
+$ cat file.avro | clickhouse-client --query="INSERT INTO {some_table} FORMAT Avro"
+```
+
+The root schema of the input Avro file must be of `record` type.
+
+To find the correspondence between table columns and fields of the Avro schema, ClickHouse compares their names. This comparison is case-sensitive.
+Unused fields are skipped.
+
+Data types of ClickHouse table columns can differ from the corresponding fields of the Avro data inserted. When inserting data, ClickHouse interprets data types according to the table above and then [casts](../sql-reference/functions/type-conversion-functions.md#type_conversion_function-cast) the data to the corresponding column type.
+
+### Selecting Data {#selecting-data-1}
+
+To select data from a ClickHouse table into an Avro file:
+
+``` bash
+$ clickhouse-client --query="SELECT * FROM {some_table} FORMAT Avro" > file.avro
+```
+
+Column names must:
+
+- start with `[A-Za-z_]`
+- subsequently contain only `[A-Za-z0-9_]`
+
+Output Avro file compression and sync interval can be configured with [output_format_avro_codec](../operations/settings/settings.md#settings-output_format_avro_codec) and [output_format_avro_sync_interval](../operations/settings/settings.md#settings-output_format_avro_sync_interval) respectively.
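+
+For example, a sketch that writes a compressed Avro file (the table and file names, the codec, and the sync interval value are illustrative):
+
+``` bash
+$ clickhouse-client --query="SELECT * FROM some_table FORMAT Avro SETTINGS output_format_avro_codec = 'snappy', output_format_avro_sync_interval = 32768" > some_table.avro
+```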
+
+## AvroConfluent {#data-format-avro-confluent}
+
+AvroConfluent supports decoding single-object Avro messages commonly used with [Kafka](https://kafka.apache.org/) and [Confluent Schema Registry](https://docs.confluent.io/current/schema-registry/index.html).
+
+Each Avro message embeds a schema id that can be resolved to the actual schema with the help of the Schema Registry.
+
+Schemas are cached once resolved.
+
+Schema Registry URL is configured with [format_avro_schema_registry_url](../operations/settings/settings.md#format_avro_schema_registry_url).
+
+### Data Types Matching {#data_types-matching-1}
+
+Same as [Avro](#data-format-avro).
+
+### Usage {#usage}
+
+To quickly verify schema resolution you can use [kafkacat](https://github.com/edenhill/kafkacat) with [clickhouse-local](../operations/utilities/clickhouse-local.md):
+
+``` bash
+$ kafkacat -b kafka-broker -C -t topic1 -o beginning -f '%s' -c 3 | clickhouse-local --input-format AvroConfluent --format_avro_schema_registry_url 'http://schema-registry' -S "field1 Int64, field2 String" -q 'select * from table'
+1 a
+2 b
+3 c
+```
+
+To use `AvroConfluent` with [Kafka](../engines/table-engines/integrations/kafka.md):
+
+``` sql
+CREATE TABLE topic1_stream
+(
+ field1 String,
+ field2 String
+)
+ENGINE = Kafka()
+SETTINGS
+kafka_broker_list = 'kafka-broker',
+kafka_topic_list = 'topic1',
+kafka_group_name = 'group1',
+kafka_format = 'AvroConfluent';
+
+SET format_avro_schema_registry_url = 'http://schema-registry';
+
+SELECT * FROM topic1_stream;
+```
+
+:::warning
+The `format_avro_schema_registry_url` setting needs to be configured in `users.xml` to maintain its value after a restart. Also, you can use the `format_avro_schema_registry_url` setting of the `Kafka` table engine.
+:::
+
+## Parquet {#data-format-parquet}
+
+[Apache Parquet](https://parquet.apache.org/) is a columnar storage format widespread in the Hadoop ecosystem. ClickHouse supports read and write operations for this format.
+
+### Data Types Matching {#data_types-matching-2}
+
+The table below shows supported data types and how they match ClickHouse [data types](../sql-reference/data-types/index.md) in `INSERT` and `SELECT` queries.
+
+| Parquet data type (`INSERT`) | ClickHouse data type | Parquet data type (`SELECT`) |
+|------------------------------|-----------------------------------------------------------|------------------------------|
+| `UINT8`, `BOOL` | [UInt8](../sql-reference/data-types/int-uint.md) | `UINT8` |
+| `INT8` | [Int8](../sql-reference/data-types/int-uint.md) | `INT8` |
+| `UINT16` | [UInt16](../sql-reference/data-types/int-uint.md) | `UINT16` |
+| `INT16` | [Int16](../sql-reference/data-types/int-uint.md) | `INT16` |
+| `UINT32` | [UInt32](../sql-reference/data-types/int-uint.md) | `UINT32` |
+| `INT32` | [Int32](../sql-reference/data-types/int-uint.md) | `INT32` |
+| `UINT64` | [UInt64](../sql-reference/data-types/int-uint.md) | `UINT64` |
+| `INT64` | [Int64](../sql-reference/data-types/int-uint.md) | `INT64` |
+| `FLOAT`, `HALF_FLOAT` | [Float32](../sql-reference/data-types/float.md) | `FLOAT` |
+| `DOUBLE` | [Float64](../sql-reference/data-types/float.md) | `DOUBLE` |
+| `DATE32` | [Date](../sql-reference/data-types/date.md) | `UINT16` |
+| `DATE64`, `TIMESTAMP` | [DateTime](../sql-reference/data-types/datetime.md) | `UINT32` |
+| `STRING`, `BINARY` | [String](../sql-reference/data-types/string.md) | `BINARY` |
+| — | [FixedString](../sql-reference/data-types/fixedstring.md) | `BINARY` |
+| `DECIMAL` | [Decimal](../sql-reference/data-types/decimal.md) | `DECIMAL` |
+| `LIST` | [Array](../sql-reference/data-types/array.md) | `LIST` |
+| `STRUCT` | [Tuple](../sql-reference/data-types/tuple.md) | `STRUCT` |
+| `MAP` | [Map](../sql-reference/data-types/map.md) | `MAP` |
+
+Arrays can be nested and can have a value of the `Nullable` type as an argument. `Tuple` and `Map` types also can be nested.
+
+ClickHouse supports configurable precision of `Decimal` type. The `INSERT` query treats the Parquet `DECIMAL` type as the ClickHouse `Decimal128` type.
+
+Unsupported Parquet data types: `TIME32`, `FIXED_SIZE_BINARY`, `JSON`, `UUID`, `ENUM`.
+
+Data types of ClickHouse table columns can differ from the corresponding fields of the Parquet data inserted. When inserting data, ClickHouse interprets data types according to the table above and then [casts](../sql-reference/functions/type-conversion-functions.md#type_conversion_function-cast) the data to the data type set for the ClickHouse table column.
+
+### Inserting and Selecting Data {#inserting-and-selecting-data}
+
+You can insert Parquet data from a file into a ClickHouse table with the following command:
+
+``` bash
+$ cat {filename} | clickhouse-client --query="INSERT INTO {some_table} FORMAT Parquet"
+```
+
+To insert data into [Nested](../sql-reference/data-types/nested-data-structures/nested.md) columns as an array of structures, you must enable the [input_format_parquet_import_nested](../operations/settings/settings.md#input_format_parquet_import_nested) setting.
+
+You can select data from a ClickHouse table and save it to a file in the Parquet format with the following command:
+
+``` bash
+$ clickhouse-client --query="SELECT * FROM {some_table} FORMAT Parquet" > {some_file.pq}
+```
+
+To exchange data with Hadoop, you can use [HDFS table engine](../engines/table-engines/integrations/hdfs.md).
+
+## Arrow {#data-format-arrow}
+
+[Apache Arrow](https://arrow.apache.org/) comes with two built-in columnar storage formats. ClickHouse supports read and write operations for these formats.
+
+`Arrow` is Apache Arrow’s "file mode" format. It is designed for in-memory random access.
+
+### Data Types Matching {#data_types-matching-arrow}
+
+The table below shows supported data types and how they match ClickHouse [data types](../sql-reference/data-types/index.md) in `INSERT` and `SELECT` queries.
+
+| Arrow data type (`INSERT`) | ClickHouse data type | Arrow data type (`SELECT`) |
+|----------------------------|-----------------------------------------------------|----------------------------|
+| `UINT8`, `BOOL` | [UInt8](../sql-reference/data-types/int-uint.md) | `UINT8` |
+| `INT8` | [Int8](../sql-reference/data-types/int-uint.md) | `INT8` |
+| `UINT16` | [UInt16](../sql-reference/data-types/int-uint.md) | `UINT16` |
+| `INT16` | [Int16](../sql-reference/data-types/int-uint.md) | `INT16` |
+| `UINT32` | [UInt32](../sql-reference/data-types/int-uint.md) | `UINT32` |
+| `INT32` | [Int32](../sql-reference/data-types/int-uint.md) | `INT32` |
+| `UINT64` | [UInt64](../sql-reference/data-types/int-uint.md) | `UINT64` |
+| `INT64` | [Int64](../sql-reference/data-types/int-uint.md) | `INT64` |
+| `FLOAT`, `HALF_FLOAT` | [Float32](../sql-reference/data-types/float.md) | `FLOAT32` |
+| `DOUBLE` | [Float64](../sql-reference/data-types/float.md) | `FLOAT64` |
+| `DATE32` | [Date](../sql-reference/data-types/date.md) | `UINT16` |
+| `DATE64`, `TIMESTAMP` | [DateTime](../sql-reference/data-types/datetime.md) | `UINT32` |
+| `STRING`, `BINARY` | [String](../sql-reference/data-types/string.md) | `BINARY` |
+| `STRING`, `BINARY` | [FixedString](../sql-reference/data-types/fixedstring.md) | `BINARY` |
+| `DECIMAL` | [Decimal](../sql-reference/data-types/decimal.md) | `DECIMAL` |
+| `DECIMAL256` | [Decimal256](../sql-reference/data-types/decimal.md)| `DECIMAL256` |
+| `LIST` | [Array](../sql-reference/data-types/array.md) | `LIST` |
+| `STRUCT` | [Tuple](../sql-reference/data-types/tuple.md) | `STRUCT` |
+| `MAP` | [Map](../sql-reference/data-types/map.md) | `MAP` |
+
+Arrays can be nested and can have a value of the `Nullable` type as an argument. `Tuple` and `Map` types also can be nested.
+
+The `DICTIONARY` type is supported for `INSERT` queries, and for `SELECT` queries there is an [output_format_arrow_low_cardinality_as_dictionary](../operations/settings/settings.md#output-format-arrow-low-cardinality-as-dictionary) setting that allows outputting the [LowCardinality](../sql-reference/data-types/lowcardinality.md) type as a `DICTIONARY` type.
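+
+For example, a sketch that keeps `LowCardinality` columns as `DICTIONARY` in the output (the table and file names are hypothetical):
+
+``` bash
+$ clickhouse-client --query="SELECT * FROM some_table FORMAT Arrow SETTINGS output_format_arrow_low_cardinality_as_dictionary = 1" > some_table.arrow
+```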
+
+ClickHouse supports configurable precision of the `Decimal` type. The `INSERT` query treats the Arrow `DECIMAL` type as the ClickHouse `Decimal128` type.
+
+Unsupported Arrow data types: `TIME32`, `FIXED_SIZE_BINARY`, `JSON`, `UUID`, `ENUM`.
+
+The data types of ClickHouse table columns do not have to match the corresponding Arrow data fields. When inserting data, ClickHouse interprets data types according to the table above and then [casts](../sql-reference/functions/type-conversion-functions.md#type_conversion_function-cast) the data to the data type set for the ClickHouse table column.
+
+### Inserting Data {#inserting-data-arrow}
+
+You can insert Arrow data from a file into a ClickHouse table with the following command:
+
+``` bash
+$ cat filename.arrow | clickhouse-client --query="INSERT INTO some_table FORMAT Arrow"
+```
+
+To insert data into [Nested](../sql-reference/data-types/nested-data-structures/nested.md) columns as an array of structures, you must enable the [input_format_arrow_import_nested](../operations/settings/settings.md#input_format_arrow_import_nested) setting.
+
+### Selecting Data {#selecting-data-arrow}
+
+You can select data from a ClickHouse table and save it to a file in the Arrow format with the following command:
+
+``` bash
+$ clickhouse-client --query="SELECT * FROM {some_table} FORMAT Arrow" > {filename.arrow}
+```
+
+## ArrowStream {#data-format-arrow-stream}
+
+`ArrowStream` is Apache Arrow’s “stream mode” format. It is designed for in-memory stream processing.
+
+## ORC {#data-format-orc}
+
+[Apache ORC](https://orc.apache.org/) is a columnar storage format widespread in the [Hadoop](https://hadoop.apache.org/) ecosystem.
+
+### Data Types Matching {#data_types-matching-3}
+
+The table below shows supported data types and how they match ClickHouse [data types](../sql-reference/data-types/index.md) in `INSERT` and `SELECT` queries.
+
+| ORC data type (`INSERT`) | ClickHouse data type | ORC data type (`SELECT`) |
+|--------------------------|-----------------------------------------------------|--------------------------|
+| `UINT8`, `BOOL` | [UInt8](../sql-reference/data-types/int-uint.md) | `UINT8` |
+| `INT8` | [Int8](../sql-reference/data-types/int-uint.md) | `INT8` |
+| `UINT16` | [UInt16](../sql-reference/data-types/int-uint.md) | `UINT16` |
+| `INT16` | [Int16](../sql-reference/data-types/int-uint.md) | `INT16` |
+| `UINT32` | [UInt32](../sql-reference/data-types/int-uint.md) | `UINT32` |
+| `INT32` | [Int32](../sql-reference/data-types/int-uint.md) | `INT32` |
+| `UINT64` | [UInt64](../sql-reference/data-types/int-uint.md) | `UINT64` |
+| `INT64` | [Int64](../sql-reference/data-types/int-uint.md) | `INT64` |
+| `FLOAT`, `HALF_FLOAT` | [Float32](../sql-reference/data-types/float.md) | `FLOAT` |
+| `DOUBLE` | [Float64](../sql-reference/data-types/float.md) | `DOUBLE` |
+| `DATE32` | [Date](../sql-reference/data-types/date.md) | `DATE32` |
+| `DATE64`, `TIMESTAMP` | [DateTime](../sql-reference/data-types/datetime.md) | `TIMESTAMP` |
+| `STRING`, `BINARY` | [String](../sql-reference/data-types/string.md) | `BINARY` |
+| `DECIMAL` | [Decimal](../sql-reference/data-types/decimal.md) | `DECIMAL` |
+| `LIST` | [Array](../sql-reference/data-types/array.md) | `LIST` |
+| `STRUCT` | [Tuple](../sql-reference/data-types/tuple.md) | `STRUCT` |
+| `MAP` | [Map](../sql-reference/data-types/map.md) | `MAP` |
+
+Arrays can be nested and can have a value of the `Nullable` type as an argument. `Tuple` and `Map` types also can be nested.
+
+ClickHouse supports configurable precision of the `Decimal` type. The `INSERT` query treats the ORC `DECIMAL` type as the ClickHouse `Decimal128` type.
+
+Unsupported ORC data types: `TIME32`, `FIXED_SIZE_BINARY`, `JSON`, `UUID`, `ENUM`.
+
+The data types of ClickHouse table columns do not have to match the corresponding ORC data fields. When inserting data, ClickHouse interprets data types according to the table above and then [casts](../sql-reference/functions/type-conversion-functions.md#type_conversion_function-cast) the data to the data type set for the ClickHouse table column.
+
+### Inserting Data {#inserting-data-2}
+
+You can insert ORC data from a file into a ClickHouse table with the following command:
+
+``` bash
+$ cat filename.orc | clickhouse-client --query="INSERT INTO some_table FORMAT ORC"
+```
+
+To insert data into [Nested](../sql-reference/data-types/nested-data-structures/nested.md) columns as an array of structures, you must enable the [input_format_orc_import_nested](../operations/settings/settings.md#input_format_orc_import_nested) setting.
+
+### Selecting Data {#selecting-data-2}
+
+You can select data from a ClickHouse table and save it to a file in the ORC format with the following command:
+
+``` bash
+$ clickhouse-client --query="SELECT * FROM {some_table} FORMAT ORC" > {filename.orc}
+```
+
+To exchange data with Hadoop, you can use [HDFS table engine](../engines/table-engines/integrations/hdfs.md).
+
+## LineAsString {#lineasstring}
+
+In this format, every line of input data is interpreted as a single string value. This format can only be parsed for a table with a single field of type [String](../sql-reference/data-types/string.md). The remaining columns must be set to [DEFAULT](../sql-reference/statements/create/table.md#default) or [MATERIALIZED](../sql-reference/statements/create/table.md#materialized), or omitted.
+
+**Example**
+
+Query:
+
+``` sql
+DROP TABLE IF EXISTS line_as_string;
+CREATE TABLE line_as_string (field String) ENGINE = Memory;
+INSERT INTO line_as_string FORMAT LineAsString "I love apple", "I love banana", "I love orange";
+SELECT * FROM line_as_string;
+```
+
+Result:
+
+``` text
+┌─field─────────────────────────────────────────────┐
+│ "I love apple", "I love banana", "I love orange"; │
+└───────────────────────────────────────────────────┘
+```
+
+## Regexp {#data-format-regexp}
+
+Each line of imported data is parsed according to the regular expression.
+
+When working with the `Regexp` format, you can use the following settings:
+
+- `format_regexp` — [String](../sql-reference/data-types/string.md). Contains regular expression in the [re2](https://github.com/google/re2/wiki/Syntax) format.
+
+- `format_regexp_escaping_rule` — [String](../sql-reference/data-types/string.md). The following escaping rules are supported:
+
+ - CSV (similarly to [CSV](#csv))
+ - JSON (similarly to [JSONEachRow](#jsoneachrow))
+ - Escaped (similarly to [TSV](#tabseparated))
+ - Quoted (similarly to [Values](#data-format-values))
+ - Raw (extracts subpatterns as a whole, no escaping rules, similarly to [TSVRaw](#tabseparatedraw))
+
+- `format_regexp_skip_unmatched` — [UInt8](../sql-reference/data-types/int-uint.md). Defines the need to throw an exception if the `format_regexp` expression does not match the imported data. Can be set to `0` or `1`.
+
+**Usage**
+
+The regular expression from the `format_regexp` setting is applied to every line of imported data. The number of subpatterns in the regular expression must be equal to the number of columns in the imported dataset.
+
+Lines of the imported data must be separated by the newline character `'\n'` or the DOS-style newline `"\r\n"`.
+
+The content of every matched subpattern is parsed with the method of the corresponding data type, according to the `format_regexp_escaping_rule` setting.
+
+If the regular expression does not match the line and `format_regexp_skip_unmatched` is set to 1, the line is silently skipped. If `format_regexp_skip_unmatched` is set to 0, an exception is thrown.
+
+**Example**
+
+Consider the file data.tsv:
+
+```text
+id: 1 array: [1,2,3] string: str1 date: 2020-01-01
+id: 2 array: [1,2,3] string: str2 date: 2020-01-02
+id: 3 array: [1,2,3] string: str3 date: 2020-01-03
+```
+and the table:
+
+```sql
+CREATE TABLE imp_regex_table (id UInt32, array Array(UInt32), string String, date Date) ENGINE = Memory;
+```
+
+Import command:
+
+```bash
+$ cat data.tsv | clickhouse-client --query "INSERT INTO imp_regex_table FORMAT Regexp SETTINGS format_regexp='id: (.+?) array: (.+?) string: (.+?) date: (.+?)', format_regexp_escaping_rule='Escaped', format_regexp_skip_unmatched=0;"
+```
+
+Query:
+
+```sql
+SELECT * FROM imp_regex_table;
+```
+
+Result:
+
+```text
+┌─id─┬─array───┬─string─┬───────date─┐
+│ 1 │ [1,2,3] │ str1 │ 2020-01-01 │
+│ 2 │ [1,2,3] │ str2 │ 2020-01-02 │
+│ 3 │ [1,2,3] │ str3 │ 2020-01-03 │
+└────┴─────────┴────────┴────────────┘
+```
+
+## Format Schema {#formatschema}
+
+The file name containing the format schema is set by the setting `format_schema`.
+This setting is required when using one of the formats `Cap'n Proto` or `Protobuf`.
+The format schema is a combination of a file name and the name of a message type in this file, delimited by a colon,
+e.g. `schemafile.proto:MessageType`.
+If the file has the standard extension for the format (for example, `.proto` for `Protobuf`),
+it can be omitted and in this case, the format schema looks like `schemafile:MessageType`.
+
+If you input or output data via the [client](../interfaces/cli.md) in the [interactive mode](../interfaces/cli.md#cli_usage), the file name specified in the format schema
+can contain an absolute path or a path relative to the current directory on the client.
+If you use the client in the [batch mode](../interfaces/cli.md#cli_usage), the path to the schema must be relative due to security reasons.
+
+If you input or output data via the [HTTP interface](../interfaces/http.md) the file name specified in the format schema
+should be located in the directory specified in [format_schema_path](../operations/server-configuration-parameters/settings.md#server_configuration_parameters-format_schema_path)
+in the server configuration.
+
+## Skipping Errors {#skippingerrors}
+
+Some formats such as `CSV`, `TabSeparated`, `TSKV`, `JSONEachRow`, `Template`, `CustomSeparated` and `Protobuf` can skip a broken row if a parsing error occurs and continue parsing from the beginning of the next row. See the [input_format_allow_errors_num](../operations/settings/settings.md#settings-input_format_allow_errors_num) and
+[input_format_allow_errors_ratio](../operations/settings/settings.md#settings-input_format_allow_errors_ratio) settings.
+Limitations:
+- In case of a parsing error `JSONEachRow` skips all data until the new line (or EOF), so rows must be delimited by `\n` to count errors correctly.
+- `Template` and `CustomSeparated` use the delimiter after the last column and the delimiter between rows to find the beginning of the next row, so skipping errors works only if at least one of them is not empty.
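+
+For example, a sketch that tolerates up to 10 malformed rows in a CSV load (the file and table names are hypothetical):
+
+``` bash
+$ cat data.csv | clickhouse-client --input_format_allow_errors_num=10 --query="INSERT INTO some_table FORMAT CSV"
+```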
+
+## RawBLOB {#rawblob}
+
+In this format, all input data is read to a single value. It is possible to parse only a table with a single field of type [String](../sql-reference/data-types/string.md) or similar.
+The result is output in binary format without delimiters and escaping. If more than one value is output, the format is ambiguous, and it will be impossible to read the data back.
+
+Below is a comparison of the formats `RawBLOB` and [TabSeparatedRaw](#tabseparatedraw).
+`RawBLOB`:
+- data is output in binary format, no escaping;
+- there are no delimiters between values;
+- no newline at the end of each value.
+[TabSeparatedRaw](#tabseparatedraw):
+- data is output without escaping;
+- the rows contain values separated by tabs;
+- there is a line feed after the last value in every row.
+
+The following is a comparison of the `RawBLOB` and [RowBinary](#rowbinary) formats.
+`RawBLOB`:
+- String fields are output without being prefixed by length.
+`RowBinary`:
+- String fields are represented as length in varint format (unsigned [LEB128](https://en.wikipedia.org/wiki/LEB128)), followed by the bytes of the string.
+
+When empty data is passed to the `RawBLOB` input, ClickHouse throws an exception:
+
+``` text
+Code: 108. DB::Exception: No data to insert
+```
+
+**Example**
+
+``` bash
+$ clickhouse-client --query "CREATE TABLE {some_table} (a String) ENGINE = Memory;"
+$ cat {filename} | clickhouse-client --query="INSERT INTO {some_table} FORMAT RawBLOB"
+$ clickhouse-client --query "SELECT * FROM {some_table} FORMAT RawBLOB" | md5sum
+```
+
+Result:
+
+``` text
+f9725a22f9191e064120d718e26862a9 -
+```
+
+## MsgPack {#msgpack}
+
+ClickHouse supports reading and writing [MessagePack](https://msgpack.org/) data files.
+
+### Data Types Matching {#data-types-matching-msgpack}
+
+| MessagePack data type (`INSERT`) | ClickHouse data type | MessagePack data type (`SELECT`) |
+|--------------------------------------------------------------------|-----------------------------------------------------------|------------------------------------|
+| `uint N`, `positive fixint` | [UIntN](../sql-reference/data-types/int-uint.md) | `uint N` |
+| `int N` | [IntN](../sql-reference/data-types/int-uint.md) | `int N` |
+| `bool` | [UInt8](../sql-reference/data-types/int-uint.md) | `uint 8` |
+| `fixstr`, `str 8`, `str 16`, `str 32`, `bin 8`, `bin 16`, `bin 32` | [String](../sql-reference/data-types/string.md) | `bin 8`, `bin 16`, `bin 32` |
+| `fixstr`, `str 8`, `str 16`, `str 32`, `bin 8`, `bin 16`, `bin 32` | [FixedString](../sql-reference/data-types/fixedstring.md) | `bin 8`, `bin 16`, `bin 32` |
+| `float 32` | [Float32](../sql-reference/data-types/float.md) | `float 32` |
+| `float 64` | [Float64](../sql-reference/data-types/float.md) | `float 64` |
+| `uint 16` | [Date](../sql-reference/data-types/date.md) | `uint 16` |
+| `uint 32` | [DateTime](../sql-reference/data-types/datetime.md) | `uint 32` |
+| `uint 64` | [DateTime64](../sql-reference/data-types/datetime.md) | `uint 64` |
+| `fixarray`, `array 16`, `array 32` | [Array](../sql-reference/data-types/array.md) | `fixarray`, `array 16`, `array 32` |
+| `fixmap`, `map 16`, `map 32` | [Map](../sql-reference/data-types/map.md) | `fixmap`, `map 16`, `map 32` |
+
+Example:
+
+Writing to a `.msgpk` file:
+
+```bash
+$ clickhouse-client --query="CREATE TABLE msgpack (array Array(UInt8)) ENGINE = Memory;"
+$ clickhouse-client --query="INSERT INTO msgpack VALUES ([0, 1, 2, 3, 42, 253, 254, 255]), ([255, 254, 253, 42, 3, 2, 1, 0])";
+$ clickhouse-client --query="SELECT * FROM msgpack FORMAT MsgPack" > tmp_msgpack.msgpk;
+```
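+
+Reading the data back from the file (a sketch reusing the `msgpack` table and the file written above):
+
+``` bash
+$ cat tmp_msgpack.msgpk | clickhouse-client --query="INSERT INTO msgpack FORMAT MsgPack";
+```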
diff --git a/docs/en/reference/interfaces/grpc.md b/docs/en/reference/interfaces/grpc.md
new file mode 100644
index 00000000000..6ada38c6220
--- /dev/null
+++ b/docs/en/reference/interfaces/grpc.md
@@ -0,0 +1,99 @@
+---
+sidebar_position: 19
+sidebar_label: gRPC Interface
+---
+
+# gRPC Interface {#grpc-interface}
+
+## Introduction {#grpc-interface-introduction}
+
+ClickHouse supports the [gRPC](https://grpc.io/) interface. gRPC is an open source remote procedure call system that uses HTTP/2 and [Protocol Buffers](https://en.wikipedia.org/wiki/Protocol_Buffers). The implementation of gRPC in ClickHouse supports:
+
+- SSL;
+- authentication;
+- sessions;
+- compression;
+- parallel queries through the same channel;
+- cancellation of queries;
+- getting progress and logs;
+- external tables.
+
+The specification of the interface is described in [clickhouse_grpc.proto](https://github.com/ClickHouse/ClickHouse/blob/master/src/Server/grpc_protos/clickhouse_grpc.proto).
+
+## gRPC Configuration {#grpc-interface-configuration}
+
+To use the gRPC interface, set `grpc_port` in the main [server configuration](../operations/configuration-files.md). Other configuration options are shown in the following example:
+
+```xml
+<grpc_port>9100</grpc_port>
+    <grpc>
+        <enable_ssl>false</enable_ssl>
+
+        <!-- The following two files are used only if enable_ssl=1 -->
+        <ssl_cert_file>/path/to/ssl_cert_file</ssl_cert_file>
+        <ssl_key_file>/path/to/ssl_key_file</ssl_key_file>
+
+        <!-- Whether the server requests a certificate from the client -->
+        <ssl_require_client_auth>false</ssl_require_client_auth>
+
+        <!-- The following file is used only if ssl_require_client_auth=1 -->
+        <ssl_ca_cert_file>/path/to/ssl_ca_cert_file</ssl_ca_cert_file>
+
+        <!-- Default compression algorithm (applied if the client doesn't specify another one) -->
+        <compression>deflate</compression>
+
+        <!-- Default compression level (applied if the client doesn't specify another one) -->
+        <compression_level>medium</compression_level>
+
+        <!-- Send/receive message size limits in bytes. -1 means unlimited -->
+        <max_send_message_size>-1</max_send_message_size>
+        <max_receive_message_size>-1</max_receive_message_size>
+
+        <!-- Enable if you want very detailed logs -->
+        <verbose_logs>false</verbose_logs>
+    </grpc>
+```
+
+## Built-in Client {#grpc-client}
+
+You can write a client in any of the programming languages supported by gRPC using the provided [specification](https://github.com/ClickHouse/ClickHouse/blob/master/src/Server/grpc_protos/clickhouse_grpc.proto).
+Or you can use a built-in Python client. It is placed in [utils/grpc-client/clickhouse-grpc-client.py](https://github.com/ClickHouse/ClickHouse/blob/master/utils/grpc-client/clickhouse-grpc-client.py) in the repository. The built-in client requires [grpcio and grpcio-tools](https://grpc.io/docs/languages/python/quickstart) Python modules.
+
+The client supports the following arguments:
+
+- `--help` – Shows a help message and exits.
+- `--host HOST, -h HOST` – A server name. Default value: `localhost`. You can also use IPv4 or IPv6 addresses.
+- `--port PORT` – A port to connect to. This port should be enabled in the ClickHouse server configuration (see `grpc_port`). Default value: `9100`.
+- `--user USER_NAME, -u USER_NAME` – A user name. Default value: `default`.
+- `--password PASSWORD` – A password. Default value: empty string.
+- `--query QUERY, -q QUERY` – A query to process when using non-interactive mode.
+- `--database DATABASE, -d DATABASE` – A default database. If not specified, the current database set in the server settings is used (`default` by default).
+- `--format OUTPUT_FORMAT, -f OUTPUT_FORMAT` – A result output [format](formats.md). Default value for interactive mode: `PrettyCompact`.
+- `--debug` – Enables showing debug information.
+
+To run the client in interactive mode, call it without the `--query` argument.
+
+In batch mode, query data can be passed via `stdin`.
+
+**Client Usage Example**
+
+In the following example a table is created and loaded with data from a CSV file. Then the content of the table is queried.
+
+``` bash
+./clickhouse-grpc-client.py -q "CREATE TABLE grpc_example_table (id UInt32, text String) ENGINE = MergeTree() ORDER BY id;"
+echo "0,Input data for" > a.txt ; echo "1,gRPC protocol example" >> a.txt
+cat a.txt | ./clickhouse-grpc-client.py -q "INSERT INTO grpc_example_table FORMAT CSV"
+
+./clickhouse-grpc-client.py --format PrettyCompact -q "SELECT * FROM grpc_example_table;"
+```
+
+Result:
+
+``` text
+┌─id─┬─text──────────────────┐
+│ 0 │ Input data for │
+│ 1 │ gRPC protocol example │
+└────┴───────────────────────┘
+```
diff --git a/docs/en/reference/interfaces/http.md b/docs/en/reference/interfaces/http.md
new file mode 100644
index 00000000000..a97cf6671b2
--- /dev/null
+++ b/docs/en/reference/interfaces/http.md
@@ -0,0 +1,664 @@
+---
+sidebar_position: 19
+sidebar_label: HTTP Interface
+---
+
+# HTTP Interface {#http-interface}
+
+The HTTP interface lets you use ClickHouse on any platform from any programming language. We use it for working from Java and Perl, as well as shell scripts. In other departments, the HTTP interface is used from Perl, Python, and Go. The HTTP interface is more limited than the native interface, but it has better compatibility.
+
+By default, `clickhouse-server` listens for HTTP on port 8123 (this can be changed in the config).
+
+Sometimes, the `curl` command is not available on user operating systems. On Ubuntu or Debian, run `sudo apt install curl`. Please refer to this [documentation](https://curl.se/download.html) to install it before running the examples.
+
+If you make a `GET /` request without parameters, it returns the 200 response code and the string defined in [http_server_default_response](../operations/server-configuration-parameters/settings.md#server_configuration_parameters-http_server_default_response), which is “Ok.” by default (with a line feed at the end).
+
+``` bash
+$ curl 'http://localhost:8123/'
+Ok.
+```
+
+Web UI can be accessed here: `http://localhost:8123/play`.
+
+![Web UI](../images/play.png)
+
+
+In health-check scripts, use the `GET /ping` request. This handler always returns “Ok.” (with a line feed at the end). Available from version 18.12.13. See also `/replicas_status` to check a replica's delay.
+
+``` bash
+$ curl 'http://localhost:8123/ping'
+Ok.
+$ curl 'http://localhost:8123/replicas_status'
+Ok.
+```
+
+Send the request as a URL ‘query’ parameter, or as a POST. Or send the beginning of the query in the ‘query’ parameter, and the rest in the POST (we’ll explain later why this is necessary). The size of the URL is limited to 16 KB, so keep this in mind when sending large queries.
+
+If successful, you receive the 200 response code and the result in the response body.
+If an error occurs, you receive the 500 response code and an error description text in the response body.
+
+When using the GET method, ‘readonly’ is set. In other words, for queries that modify data, you can only use the POST method. You can send the query itself either in the POST body or in the URL parameter.
+
+Examples:
+
+``` bash
+$ curl 'http://localhost:8123/?query=SELECT%201'
+1
+
+$ wget -nv -O- 'http://localhost:8123/?query=SELECT 1'
+1
+
+$ echo -ne 'GET /?query=SELECT%201 HTTP/1.0\r\n\r\n' | nc localhost 8123
+HTTP/1.0 200 OK
+Date: Wed, 27 Nov 2019 10:30:18 GMT
+Connection: Close
+Content-Type: text/tab-separated-values; charset=UTF-8
+X-ClickHouse-Server-Display-Name: clickhouse.ru-central1.internal
+X-ClickHouse-Query-Id: 5abe861c-239c-467f-b955-8a201abb8b7f
+X-ClickHouse-Summary: {"read_rows":"0","read_bytes":"0","written_rows":"0","written_bytes":"0","total_rows_to_read":"0"}
+
+1
+```
+
+As you can see, `curl` is somewhat inconvenient in that spaces must be URL escaped.
+Although `wget` escapes everything itself, we do not recommend using it because it does not work well over HTTP 1.1 when using keep-alive and Transfer-Encoding: chunked.
+
+``` bash
+$ echo 'SELECT 1' | curl 'http://localhost:8123/' --data-binary @-
+1
+
+$ echo 'SELECT 1' | curl 'http://localhost:8123/?query=' --data-binary @-
+1
+
+$ echo '1' | curl 'http://localhost:8123/?query=SELECT' --data-binary @-
+1
+```
+
+If part of the query is sent in the parameter, and part in the POST, a line feed is inserted between these two data parts.
+Example (this won’t work):
+
+``` bash
+$ echo 'ECT 1' | curl 'http://localhost:8123/?query=SEL' --data-binary @-
+Code: 59, e.displayText() = DB::Exception: Syntax error: failed at position 0: SEL
+ECT 1
+, expected One of: SHOW TABLES, SHOW DATABASES, SELECT, INSERT, CREATE, ATTACH, RENAME, DROP, DETACH, USE, SET, OPTIMIZE., e.what() = DB::Exception
+```
+
+By default, data is returned in [TabSeparated](formats.md#tabseparated) format.
+
+You use the FORMAT clause of the query to request any other format.
+
+Also, you can use the ‘default_format’ URL parameter or the ‘X-ClickHouse-Format’ header to specify a default format other than TabSeparated.
+
+``` bash
+$ echo 'SELECT 1 FORMAT Pretty' | curl 'http://localhost:8123/?' --data-binary @-
+┏━━━┓
+┃ 1 ┃
+┡━━━┩
+│ 1 │
+└───┘
+```
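+
+A sketch of requesting `Pretty` output with the ‘default_format’ URL parameter or the ‘X-ClickHouse-Format’ header instead of the FORMAT clause:
+
+``` bash
+$ echo 'SELECT 1' | curl 'http://localhost:8123/?default_format=Pretty' --data-binary @-
+$ echo 'SELECT 1' | curl -H 'X-ClickHouse-Format: Pretty' 'http://localhost:8123/' --data-binary @-
+```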
+
+The POST method of transmitting data is necessary for `INSERT` queries. In this case, you can write the beginning of the query in the URL parameter, and use POST to pass the data to insert. The data to insert could be, for example, a tab-separated dump from MySQL. In this way, the `INSERT` query replaces `LOAD DATA LOCAL INFILE` from MySQL.
+
+**Examples**
+
+Creating a table:
+
+``` bash
+$ echo 'CREATE TABLE t (a UInt8) ENGINE = Memory' | curl 'http://localhost:8123/' --data-binary @-
+```
+
+Using the familiar INSERT query for data insertion:
+
+``` bash
+$ echo 'INSERT INTO t VALUES (1),(2),(3)' | curl 'http://localhost:8123/' --data-binary @-
+```
+
+Data can be sent separately from the query:
+
+``` bash
+$ echo '(4),(5),(6)' | curl 'http://localhost:8123/?query=INSERT%20INTO%20t%20VALUES' --data-binary @-
+```
+
+You can specify any data format. The ‘Values’ format is the same as what is used when writing INSERT INTO t VALUES:
+
+``` bash
+$ echo '(7),(8),(9)' | curl 'http://localhost:8123/?query=INSERT%20INTO%20t%20FORMAT%20Values' --data-binary @-
+```
+
+To insert data from a tab-separated dump, specify the corresponding format:
+
+``` bash
+$ echo -ne '10\n11\n12\n' | curl 'http://localhost:8123/?query=INSERT%20INTO%20t%20FORMAT%20TabSeparated' --data-binary @-
+```
+
+Reading the table contents. Data is output in random order due to parallel query processing:
+
+``` bash
+$ curl 'http://localhost:8123/?query=SELECT%20a%20FROM%20t'
+7
+8
+9
+10
+11
+12
+1
+2
+3
+4
+5
+6
+```
+
+Deleting the table:
+
+``` bash
+$ echo 'DROP TABLE t' | curl 'http://localhost:8123/' --data-binary @-
+```
+
+For successful requests that do not return a data table, an empty response body is returned.
+
+
+## Compression {#compression}
+
+You can use compression to reduce network traffic when transmitting a large amount of data or for creating dumps that are immediately compressed.
+
+You can use the internal ClickHouse compression format when transmitting data. The compressed data has a non-standard format, and you need `clickhouse-compressor` program to work with it. It is installed with the `clickhouse-client` package. To increase the efficiency of data insertion, you can disable server-side checksum verification by using the [http_native_compression_disable_checksumming_on_decompress](../operations/settings/settings.md#settings-http_native_compression_disable_checksumming_on_decompress) setting.
+
+If you specify `compress=1` in the URL, the server will compress the data it sends to you. If you specify `decompress=1` in the URL, the server will decompress the data which you pass in the `POST` method.
+
+You can also choose to use [HTTP compression](https://en.wikipedia.org/wiki/HTTP_compression). ClickHouse supports the following [compression methods](https://en.wikipedia.org/wiki/HTTP_compression#Content-Encoding_tokens):
+
+- `gzip`
+- `br`
+- `deflate`
+- `xz`
+
+To send a compressed `POST` request, append the request header `Content-Encoding: compression_method`.
+In order for ClickHouse to compress the response, enable compression with [enable_http_compression](../operations/settings/settings.md#settings-enable_http_compression) setting and append `Accept-Encoding: compression_method` header to the request. You can configure the data compression level in the [http_zlib_compression_level](../operations/settings/settings.md#settings-http_zlib_compression_level) setting for all compression methods.
+
+:::info
+Some HTTP clients might decompress data from the server by default (with `gzip` and `deflate`) and you might get decompressed data even if you use the compression settings correctly.
+:::
+
+**Examples**
+
+``` bash
+# Sending compressed data to the server
+$ echo "SELECT 1" | gzip -c | \
+ curl -sS --data-binary @- -H 'Content-Encoding: gzip' 'http://localhost:8123/'
+```
+
+``` bash
+# Receiving compressed data archive from the server
+$ curl -vsS "http://localhost:8123/?enable_http_compression=1" \
+ -H 'Accept-Encoding: gzip' --output result.gz -d 'SELECT number FROM system.numbers LIMIT 3'
+$ zcat result.gz
+0
+1
+2
+```
+
+```bash
+# Receiving compressed data from the server and using the gunzip to receive decompressed data
+$ curl -sS "http://localhost:8123/?enable_http_compression=1" \
+ -H 'Accept-Encoding: gzip' -d 'SELECT number FROM system.numbers LIMIT 3' | gunzip -
+0
+1
+2
+```
+
+## Default Database {#default-database}
+
+You can use the ‘database’ URL parameter or the ‘X-ClickHouse-Database’ header to specify the default database.
+
+``` bash
+$ echo 'SELECT number FROM numbers LIMIT 10' | curl 'http://localhost:8123/?database=system' --data-binary @-
+0
+1
+2
+3
+4
+5
+6
+7
+8
+9
+```
+
+By default, the database that is registered in the server settings is used as the default database. By default, this is the database called ‘default’. Alternatively, you can always specify the database using a dot before the table name.
+
+The username and password can be indicated in one of three ways:
+
+1. Using HTTP Basic Authentication. Example:
+
+
+
+``` bash
+$ echo 'SELECT 1' | curl 'http://user:password@localhost:8123/' -d @-
+```
+
+2. In the ‘user’ and ‘password’ URL parameters. Example:
+
+
+
+``` bash
+$ echo 'SELECT 1' | curl 'http://localhost:8123/?user=user&password=password' -d @-
+```
+
+3. Using ‘X-ClickHouse-User’ and ‘X-ClickHouse-Key’ headers. Example:
+
+
+
+``` bash
+$ echo 'SELECT 1' | curl -H 'X-ClickHouse-User: user' -H 'X-ClickHouse-Key: password' 'http://localhost:8123/' -d @-
+```
+
+If the user name is not specified, the `default` name is used. If the password is not specified, the empty password is used.
+You can also use the URL parameters to specify any settings for processing a single query or entire profiles of settings. Example: `http://localhost:8123/?profile=web&max_rows_to_read=1000000000&query=SELECT+1`
+
+For more information, see the [Settings](../operations/settings/index.md) section.
+
+``` bash
+$ echo 'SELECT number FROM system.numbers LIMIT 10' | curl 'http://localhost:8123/?' --data-binary @-
+0
+1
+2
+3
+4
+5
+6
+7
+8
+9
+```
+
+For information about other parameters, see the section “SET”.
+
+Similarly, you can use ClickHouse sessions in the HTTP protocol. To do this, you need to add the `session_id` GET parameter to the request. You can use any string as the session ID. By default, the session is terminated after 60 seconds of inactivity. To change this timeout, modify the `default_session_timeout` setting in the server configuration, or add the `session_timeout` GET parameter to the request. To check the session status, use the `session_check=1` parameter. Only one query at a time can be executed within a single session.
+
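+For example, a minimal sketch of using a session to share a temporary table between requests (the session ID is an arbitrary string chosen by the client):
+
+``` bash
+# Create a temporary table within a session, insert into it, then read it back
+$ curl -sS 'http://localhost:8123/?session_id=abc123&session_timeout=600' \
+    -d 'CREATE TEMPORARY TABLE t (a UInt8)'
+$ curl -sS 'http://localhost:8123/?session_id=abc123' -d 'INSERT INTO t VALUES (1),(2),(3)'
+$ curl -sS 'http://localhost:8123/?session_id=abc123' -d 'SELECT sum(a) FROM t'
+6
+```
+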
+You can receive information about the progress of a query in `X-ClickHouse-Progress` response headers. To do this, enable [send_progress_in_http_headers](../operations/settings/settings.md#settings-send_progress_in_http_headers). Example of the header sequence:
+
+``` text
+X-ClickHouse-Progress: {"read_rows":"2752512","read_bytes":"240570816","total_rows_to_read":"8880128"}
+X-ClickHouse-Progress: {"read_rows":"5439488","read_bytes":"482285394","total_rows_to_read":"8880128"}
+X-ClickHouse-Progress: {"read_rows":"8783786","read_bytes":"819092887","total_rows_to_read":"8880128"}
+```
+
+Possible header fields:
+
+- `read_rows` — Number of rows read.
+- `read_bytes` — Volume of data read in bytes.
+- `total_rows_to_read` — Total number of rows to be read.
+- `written_rows` — Number of rows written.
+- `written_bytes` — Volume of data written in bytes.
+
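+A sketch of observing these headers with curl: the `-v` flag prints the response headers, `-o /dev/null` discards the result body, and the query is just a long-running example:
+
+``` bash
+# Response headers, including X-ClickHouse-Progress, are printed to stderr
+$ curl -vsS 'http://localhost:8123/?send_progress_in_http_headers=1' \
+    -d 'SELECT max(number) FROM numbers(500000000)' -o /dev/null
+```
+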
+Running requests do not stop automatically if the HTTP connection is lost. Parsing and data formatting are performed on the server side, so using the network might be inefficient.
+The optional `query_id` parameter can be passed as the query ID (any string). For more information, see the section “Settings, replace_running_query”.
+
+The optional `quota_key` parameter can be passed as the quota key (any string). For more information, see the section “Quotas”.
+
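+For example (a sketch; both identifier values are arbitrary strings chosen by the client):
+
+``` bash
+# Tag the query with an ID and a quota key
+$ echo 'SELECT 1' | curl 'http://localhost:8123/?query_id=nightly-report-42&quota_key=team-analytics' --data-binary @-
+1
+```
+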
+The HTTP interface allows passing external data (external temporary tables) for querying. For more information, see the section “External data for query processing”.
+
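+A minimal sketch of passing an external table in a multipart request (the table name `ext`, its structure, and the inline data are all illustrative):
+
+``` bash
+# Send a small external table named "ext" and query it in the same request
+$ printf '1\n2\n3\n' | curl -sS -F 'ext=@-' \
+    'http://localhost:8123/?query=SELECT+count()+FROM+ext&ext_structure=x+UInt32'
+3
+```
+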
+## Response Buffering {#response-buffering}
+
+You can enable response buffering on the server-side. The `buffer_size` and `wait_end_of_query` URL parameters are provided for this purpose.
+
+`buffer_size` determines the number of bytes in the result to buffer in the server memory. If a result body is larger than this threshold, the buffer is written to the HTTP channel, and the remaining data is sent directly to the HTTP channel.
+
+To ensure that the entire response is buffered, set `wait_end_of_query=1`. In this case, the data that is not stored in memory will be buffered in a temporary server file.
+
+Example:
+
+``` bash
+$ curl -sS 'http://localhost:8123/?max_result_bytes=4000000&buffer_size=3000000&wait_end_of_query=1' -d 'SELECT toUInt8(number) FROM system.numbers LIMIT 9000000 FORMAT RowBinary'
+```
+
+Use buffering to avoid situations where a query processing error occurred after the response code and HTTP headers were sent to the client. In this situation, an error message is written at the end of the response body, and on the client-side, the error can only be detected at the parsing stage.
+
+### Queries with Parameters {#cli-queries-with-parameters}
+
+You can create a query with parameters and pass values for them from the corresponding HTTP request parameters. For more information, see [Queries with Parameters for CLI](../interfaces/cli.md#cli-queries-with-parameters).
+
+### Example {#example}
+
+``` bash
+$ curl -sS "http://localhost:8123/?param_id=2&param_phrase=test" -d "SELECT * FROM table WHERE int_column = {id:UInt8} and string_column = {phrase:String}"
+```
+
+## Predefined HTTP Interface {#predefined_http_interface}
+
+ClickHouse supports specific queries through the HTTP interface. For example, you can write data to a table as follows:
+
+``` bash
+$ echo '(4),(5),(6)' | curl 'http://localhost:8123/?query=INSERT%20INTO%20t%20VALUES' --data-binary @-
+```
+
+ClickHouse also supports a predefined HTTP interface, which can help you integrate more easily with third-party tools like the [Prometheus exporter](https://github.com/percona-lab/clickhouse_exporter).
+
+Example:
+
+- First, add this section to the server configuration file:
+
+
+``` xml
+<http_handlers>
+    <rule>
+        <url>/predefined_query</url>
+        <methods>POST,GET</methods>
+        <handler>
+            <type>predefined_query_handler</type>
+            <query>SELECT * FROM system.metrics LIMIT 5 FORMAT Template SETTINGS format_template_resultset = 'prometheus_template_output_format_resultset', format_template_row = 'prometheus_template_output_format_row', format_template_rows_between_delimiter = '\n'</query>
+        </handler>
+    </rule>
+    <rule>...</rule>
+    <rule>...</rule>
+</http_handlers>
+```
+
+- You can now request the URL directly for data in the Prometheus format:
+
+
+``` bash
+$ curl -v 'http://localhost:8123/predefined_query'
+* Trying ::1...
+* Connected to localhost (::1) port 8123 (#0)
+> GET /predefined_query HTTP/1.1
+> Host: localhost:8123
+> User-Agent: curl/7.47.0
+> Accept: */*
+>
+< HTTP/1.1 200 OK
+< Date: Tue, 28 Apr 2020 08:52:56 GMT
+< Connection: Keep-Alive
+< Content-Type: text/plain; charset=UTF-8
+< X-ClickHouse-Server-Display-Name: i-mloy5trc
+< Transfer-Encoding: chunked
+< X-ClickHouse-Query-Id: 96fe0052-01e6-43ce-b12a-6b7370de6e8a
+< X-ClickHouse-Format: Template
+< X-ClickHouse-Timezone: Asia/Shanghai
+< Keep-Alive: timeout=3
+< X-ClickHouse-Summary: {"read_rows":"0","read_bytes":"0","written_rows":"0","written_bytes":"0","total_rows_to_read":"0"}
+<
+# HELP "Query" "Number of executing queries"
+# TYPE "Query" counter
+"Query" 1
+
+# HELP "Merge" "Number of executing background merges"
+# TYPE "Merge" counter
+"Merge" 0
+
+# HELP "PartMutation" "Number of mutations (ALTER DELETE/UPDATE)"
+# TYPE "PartMutation" counter
+"PartMutation" 0
+
+# HELP "ReplicatedFetch" "Number of data parts being fetched from replica"
+# TYPE "ReplicatedFetch" counter
+"ReplicatedFetch" 0
+
+# HELP "ReplicatedSend" "Number of data parts being sent to replicas"
+# TYPE "ReplicatedSend" counter
+"ReplicatedSend" 0
+
+* Connection #0 to host localhost left intact
+```
+
+As you can see from the example, `http_handlers` is configured in the config.xml file and can contain many `rules`. ClickHouse matches the received HTTP requests to the predefined type in `rule`, and the first match runs the handler. Then ClickHouse executes the corresponding predefined query if the match is successful.
+
+Now `rule` can configure `method`, `headers`, `url`, `handler`:
+- `method` is responsible for matching the method part of the HTTP request. `method` fully conforms to the definition of [method](https://developer.mozilla.org/en-US/docs/Web/HTTP/Methods) in the HTTP protocol. It is an optional configuration. If it is not defined in the configuration file, it does not match the method portion of the HTTP request.
+
+- `url` is responsible for matching the URL part of the HTTP request. It is compatible with [RE2](https://github.com/google/re2)’s regular expressions. It is an optional configuration. If it is not defined in the configuration file, it does not match the URL portion of the HTTP request.
+
+- `headers` are responsible for matching the header part of the HTTP request. It is compatible with RE2’s regular expressions. It is an optional configuration. If it is not defined in the configuration file, it does not match the header portion of the HTTP request.
+
+- `handler` contains the main processing part. Now `handler` can configure `type`, `status`, `content_type`, `response_content`, `query`, `query_param_name`.
+ `type` currently supports three types: [predefined_query_handler](#predefined_query_handler), [dynamic_query_handler](#dynamic_query_handler), [static](#static).
+
+ - `query` — use with `predefined_query_handler` type, executes query when the handler is called.
+
+ - `query_param_name` — use with `dynamic_query_handler` type, extracts and executes the value corresponding to the `query_param_name` value in HTTP request params.
+
+ - `status` — use with `static` type, response status code.
+
+ - `content_type` — use with `static` type, response [content-type](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Type).
+
+  - `response_content` — use with `static` type, response content sent to the client; when using the prefix ‘file://’ or ‘config://’, the content is read from the file or the configuration and sent to the client.
+
+Next are the configuration methods for different `type`.
+
+### predefined_query_handler {#predefined_query_handler}
+
+`predefined_query_handler` supports setting `Settings` and `query_params` values. You can configure `query` in the `predefined_query_handler` type.
+
+The `query` value is a predefined query of `predefined_query_handler`, which is executed by ClickHouse when an HTTP request is matched and the result of the query is returned. It is a required configuration.
+
+The following example defines the values of [max_threads](../operations/settings/settings.md#settings-max_threads) and `max_final_threads` settings, then queries the system table to check whether these settings were set successfully.
+
+:::warning
+To keep the default `handlers` such as `query`, `play`, `ping`, add the `<defaults/>` rule.
+:::
+
+Example:
+
+``` xml
+<http_handlers>
+    <rule>
+        <url><![CDATA[/query_param_with_url/\w+/(?P<name_1>[^/]+)(/(?P<name_2>[^/]+))?]]></url>
+        <methods>GET</methods>
+        <headers>
+            <XXX>TEST_HEADER_VALUE</XXX>
+            <PARAMS_XXX><![CDATA[(?P<name_1>[^/]+)(/(?P<name_2>[^/]+))?]]></PARAMS_XXX>
+        </headers>
+        <handler>
+            <type>predefined_query_handler</type>
+            <query>SELECT value FROM system.settings WHERE name = {name_1:String}</query>
+            <query>SELECT name, value FROM system.settings WHERE name = {name_2:String}</query>
+        </handler>
+    </rule>
+    <defaults/>
+</http_handlers>
+```
+
+``` bash
+$ curl -H 'XXX:TEST_HEADER_VALUE' -H 'PARAMS_XXX:max_threads' 'http://localhost:8123/query_param_with_url/1/max_threads/max_final_threads?max_threads=1&max_final_threads=2'
+1
+max_final_threads 2
+```
+
+:::warning
+One `predefined_query_handler` supports only one `query` of an insert type.
+:::
+
+### dynamic_query_handler {#dynamic_query_handler}
+
+In `dynamic_query_handler`, the query is written in the form of a parameter of the HTTP request, while in `predefined_query_handler` the query is written in the configuration file. You can configure `query_param_name` in `dynamic_query_handler`.
+
+ClickHouse extracts and executes the value corresponding to the `query_param_name` value in the URL of the HTTP request. The default value of `query_param_name` is `/query`. It is an optional configuration. If it is not defined in the configuration file, the parameter is not passed in.
+
+To experiment with this functionality, the example defines the values of [max_threads](../operations/settings/settings.md#settings-max_threads) and `max_final_threads` and queries whether the settings were set successfully.
+
+Example:
+
+``` xml
+<http_handlers>
+    <rule>
+        <headers>
+            <XXX>TEST_HEADER_VALUE_DYNAMIC</XXX>
+        </headers>
+        <handler>
+            <type>dynamic_query_handler</type>
+            <query_param_name>query_param</query_param_name>
+        </handler>
+    </rule>
+    <defaults/>
+</http_handlers>
+```
+
+``` bash
+$ curl -H 'XXX:TEST_HEADER_VALUE_DYNAMIC' 'http://localhost:8123/own?max_threads=1&max_final_threads=2&param_name_1=max_threads&param_name_2=max_final_threads&query_param=SELECT%20name,value%20FROM%20system.settings%20where%20name%20=%20%7Bname_1:String%7D%20OR%20name%20=%20%7Bname_2:String%7D'
+max_threads 1
+max_final_threads 2
+```
+
+### static {#static}
+
+`static` can return [content_type](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Type), [status](https://developer.mozilla.org/en-US/docs/Web/HTTP/Status) and `response_content`. `response_content` can return the specified content.
+
+Example:
+
+Return a message.
+
+``` xml
+<http_handlers>
+    <rule>
+        <methods>GET</methods>
+        <headers><XXX>xxx</XXX></headers>
+        <url>/hi</url>
+        <handler>
+            <type>static</type>
+            <status>402</status>
+            <content_type>text/html; charset=UTF-8</content_type>
+            <response_content>Say Hi!</response_content>
+        </handler>
+    </rule>
+    <defaults/>
+</http_handlers>
+```
+
+``` bash
+$ curl -vv -H 'XXX:xxx' 'http://localhost:8123/hi'
+* Trying ::1...
+* Connected to localhost (::1) port 8123 (#0)
+> GET /hi HTTP/1.1
+> Host: localhost:8123
+> User-Agent: curl/7.47.0
+> Accept: */*
+> XXX:xxx
+>
+< HTTP/1.1 402 Payment Required
+< Date: Wed, 29 Apr 2020 03:51:26 GMT
+< Connection: Keep-Alive
+< Content-Type: text/html; charset=UTF-8
+< Transfer-Encoding: chunked
+< Keep-Alive: timeout=3
+< X-ClickHouse-Summary: {"read_rows":"0","read_bytes":"0","written_rows":"0","written_bytes":"0","total_rows_to_read":"0"}
+<
+* Connection #0 to host localhost left intact
+Say Hi!%
+```
+
+Find the content from the configuration and send it to the client.
+
+``` xml
+