[docs] Update Pulsar SQL configuration and deployment (apache#5651)
### Motivation
The structure and content need improvement.

### Modifications
1. Improve the structure.
2. Improve the deployment steps.
Jennifer88huang-zz authored and sijie committed Nov 21, 2019
1 parent fa02970 commit f68ee8b
Showing 1 changed file with 34 additions and 38 deletions.
72 changes: 34 additions & 38 deletions site2/docs/sql-deployment-configurations.md
@@ -1,14 +1,13 @@
---
id: sql-deployment-configurations
title: Pulsar SQL Deployment and Configuration
sidebar_label: Deployment and Configuration
title: Pulsar SQL configuration and deployment
sidebar_label: Configuration and deployment
---

Below is a list configurations for the Presto Pulsar connector and instruction on how to deploy a cluster.
You can configure the Presto Pulsar connector and deploy a cluster with the following instructions.

## Presto Pulsar Connector Configurations
There are several configurations for the Presto Pulsar Connector. The properties file that contain these configurations can be found at ```${project.root}/conf/presto/catalog/pulsar.properties```.
The configurations for the connector and its default values are discribed below.
## Configure Presto Pulsar Connector
You can configure the Presto Pulsar connector in the `${project.root}/conf/presto/catalog/pulsar.properties` file. The connector settings and their default values are as follows.

```properties
# name of the connector to be displayed in the catalog
@@ -27,21 +26,22 @@ pulsar.entry-read-batch-size=100
pulsar.target-num-splits=4
```
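
As a minimal sketch, one way to change a single connector setting from the shell is shown below. The `set_prop` helper and the scratch file are hypothetical and only for illustration. The two property names and their defaults are taken from the listing above, and the new value `8` is an arbitrary example.

```bash
# Hypothetical helper: set KEY=VALUE in a properties file, replacing any existing entry.
set_prop() {
  file=$1; key=$2; value=$3
  # Keep every line whose key differs, then append the new entry.
  awk -F= -v k="$key" '$1 != k' "$file" > "$file.tmp"
  printf '%s=%s\n' "$key" "$value" >> "$file.tmp"
  mv "$file.tmp" "$file"
}

# Work on a scratch copy so the real pulsar.properties is untouched.
PROPS="$(mktemp)"
printf '%s\n' 'pulsar.entry-read-batch-size=100' 'pulsar.target-num-splits=4' > "$PROPS"

set_prop "$PROPS" pulsar.target-num-splits 8   # 8 is an arbitrary example value
cat "$PROPS"
```

To apply the same idea to a real deployment, point `PROPS` at `${project.root}/conf/presto/catalog/pulsar.properties` after backing the file up.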

## Query Pulsar from Existing Presto Cluster
## Query data from existing Presto clusters

If you already have an existing Presto cluster, you can copy Presto Pulsar connector plugin to your existing cluster. You can download the archived plugin package via:
If you already have a Presto cluster, you can copy the Presto Pulsar connector plugin to your existing cluster. Download the archived plugin package with the following command.

```bash
$ wget pulsar:binary_release_url
```

## Deploying a new cluster
## Deploy a new cluster

Please note that the [Getting Started](sql-getting-started.md) guide shows you how to easily setup a standalone single node enviroment to experiment with.
Because Pulsar SQL is powered by [Presto](https://prestodb.io), the Pulsar SQL worker uses the same deployment configuration as Presto.

Pulsar SQL is powered by [Presto](https://prestodb.io) thus many of the configurations for deployment is the same for the Pulsar SQL worker.
> Note
> To set up a standalone single-node environment, refer to [Query data](sql-getting-started.md).
You can use the same CLI args as the Presto launcher:
You can use the same CLI args as the Presto launcher.

```bash
$ ./bin/pulsar sql-worker --help
@@ -72,27 +72,27 @@ Options:

```

There is a set of default configs for the cluster located in ```${project.root}/conf/presto``` that will be used by default. You can change them to customize your deployment
The default configuration for the cluster is located in `${project.root}/conf/presto`. You can customize your deployment by modifying the default configuration.

You can also set the worker to read from a different configuration directory as well as set a different directory for writing its data:
You can set the worker to read from a different configuration directory, or set a different directory to write data.

```bash
$ ./bin/pulsar sql-worker run --etc-dir /tmp/incubator-pulsar/conf/presto --data-dir /tmp/presto-1
```
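
The following sketch shows one way to prepare a separate configuration directory and data directory before passing them to `--etc-dir` and `--data-dir`. The `PULSAR_HOME`, `WORK_DIR`, `ETC_DIR`, and `DATA_DIR` variables are assumptions made for this example, not names used by Pulsar. The launch command is only printed, not run.

```bash
# Hypothetical locations; adjust both to your environment.
PULSAR_HOME="${PULSAR_HOME:-$PWD}"
WORK_DIR="$(mktemp -d)"
ETC_DIR="$WORK_DIR/etc"
DATA_DIR="$WORK_DIR/data"
mkdir -p "$ETC_DIR" "$DATA_DIR"

# Seed the new configuration directory from the defaults, if they exist here.
if [ -d "$PULSAR_HOME/conf/presto" ]; then
  cp -R "$PULSAR_HOME/conf/presto/." "$ETC_DIR/"
fi

# Print the launch command rather than running it, so the sketch has no side effects.
echo "$PULSAR_HOME/bin/pulsar sql-worker run --etc-dir $ETC_DIR --data-dir $DATA_DIR"
```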

You can also start the worker as daemon process:
You can start the worker as a daemon process.

```bash
$ ./bin/pulsar sql-worker start
```

### Deploying to a 3 node cluster
### Deploy a cluster on multiple nodes

For example, if I wanted to deploy a Pulsar SQL/Presto cluster on 3 nodes, you can do the following:
You can deploy a Pulsar SQL cluster or Presto cluster on multiple nodes. The following example shows how to deploy a cluster on three nodes.

First, copy the Pulsar binary distribution to all three nodes.
1. Copy the Pulsar binary distribution to all three nodes.

The first node, will run the Presto coordinator. The mininal configuration in ```${project.root}/conf/presto/config.properties``` can be the following
The first node runs as the Presto coordinator. The minimal configuration in the `${project.root}/conf/presto/config.properties` file is as follows.

```properties
coordinator=true
@@ -104,37 +104,37 @@ discovery-server.enabled=true
discovery.uri=<coordinator-url>
```

Also, modify ```pulsar.broker-service-url``` and ```pulsar.zookeeper-uri``` configs in ```${project.root}/conf/presto/catalog/pulsar.properties``` on those nodes accordingly

Afterwards, you can start the coordinator by just running

```$ ./bin/pulsar sql-worker run```

For the other two nodes that will only serve as worker nodes, the configurations can be the following:
The other two nodes serve as worker nodes. You can use the following configuration for them.

```properties
coordinator=false
http-server.http.port=8080
query.max-memory=50GB
query.max-memory-per-node=1GB
discovery.uri=<coordinator-url>

```

Also, modify ```pulsar.broker-service-url``` and ```pulsar.zookeeper-uri``` configs in ```${project.root}/conf/presto/catalog/pulsar.properties``` accordingly
2. Modify the `pulsar.broker-service-url` and `pulsar.zookeeper-uri` settings in the `${project.root}/conf/presto/catalog/pulsar.properties` file on all three nodes to match your environment.

3. Start the coordinator node.

```bash
$ ./bin/pulsar sql-worker run
```

You can also start the worker by just running:
4. Start worker nodes.

```$ ./bin/pulsar sql-worker run```
```bash
$ ./bin/pulsar sql-worker run
```

You can check the status of your cluster from the SQL CLI. To start the SQL CLI:
5. Start the SQL CLI and check the status of your cluster.

```bash
$ ./bin/pulsar sql --server <coordinator-url>

```

You can then run the following command to check the status of your nodes:
6. Check the status of your nodes.

```bash
presto> SELECT * FROM system.runtime.nodes;
@@ -145,8 +145,4 @@ presto> SELECT * FROM system.runtime.nodes;
2 | http://192.168.2.3:8081 | testversion | false | active
```
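
The worker-node configuration from the steps above can also be generated with a small script, which may help keep the worker nodes consistent. This is a sketch only: `write_worker_config` is a hypothetical helper, the output directory is a scratch location, and the coordinator URL is a placeholder. The property names and values mirror the worker example shown earlier.

```bash
# Hypothetical helper: write a worker-node config.properties into the given directory.
write_worker_config() {
  dir=$1; coordinator_url=$2
  mkdir -p "$dir"
  cat > "$dir/config.properties" <<EOF
coordinator=false
http-server.http.port=8080
query.max-memory=50GB
query.max-memory-per-node=1GB
discovery.uri=$coordinator_url
EOF
}

OUT_DIR="$(mktemp -d)"
# The URL below is a placeholder, not a real endpoint.
write_worker_config "$OUT_DIR" "http://coordinator.example.com:8080"
cat "$OUT_DIR/config.properties"
```

On a real worker node, the target directory would be `${project.root}/conf/presto` and the URL would be your coordinator's address.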


For more information about deployment in Presto, please reference:

[Deploying Presto](https://prestodb.io/docs/current/installation/deployment.html)

For more information about deployment in Presto, refer to [Presto deployment](https://prestodb.io/docs/current/installation/deployment.html).
