[Doc] Documentation about deploying and configuring pulsar functions-worker (apache#4229)
sijie authored May 18, 2019
1 parent 353ca73 commit db9f789
Showing 7 changed files with 245 additions and 2 deletions.
Binary file added site2/docs/assets/functions-worker-corun.png
Binary file added site2/docs/assets/functions-worker-separated.png
4 changes: 3 additions & 1 deletion site2/docs/deploy-bare-metal.md
@@ -342,7 +342,7 @@ webServicePortTls=8443
If you want to enable [Pulsar Functions](functions-overview.md), you can follow the instructions as below:
1. Edit `conf/broker.conf` to enable function worker, by setting `functionsWorkerEnabled` to `true`.
1. Edit `conf/broker.conf` to enable functions worker, by setting `functionsWorkerEnabled` to `true`.
```conf
functionsWorkerEnabled=true
@@ -354,6 +354,8 @@ If you want to enable [Pulsar Functions](functions-overview.md), you can follow
pulsarFunctionsCluster: pulsar-cluster-1
```
If you would like to learn more about the options for deploying functions worker, see [Deploy and manage functions worker](functions-worker.md).
### Starting Brokers
You can then provide any other configuration changes that you'd like in the [`conf/broker.conf`](reference-configuration.md#broker) file. Once you've decided on a configuration, you can start up the brokers for your Pulsar cluster. Like ZooKeeper and BookKeeper, brokers can be started either in the foreground or in the background, using nohup.
240 changes: 240 additions & 0 deletions site2/docs/functions-worker.md
@@ -0,0 +1,240 @@
---
id: functions-worker
title: Deploy and manage functions worker
sidebar_label: Functions Worker
---

Pulsar `functions-worker` is a logical component that runs Pulsar Functions in cluster mode. You can deploy it in either of two ways, depending on your requirements.
- [run it with brokers](#run-functions-worker-with-brokers)
- [run it separately](#run-functions-worker-separately) as a separate process on separate machines

> Note
> The `--- Service Urls---` lines in the following diagrams represent Pulsar service URLs that Pulsar client and admin use to connect to a Pulsar cluster.
## Run Functions-worker with brokers

The following diagram illustrates the deployment of functions-workers running along with brokers.

![assets/functions-worker-corun.png](assets/functions-worker-corun.png)

To enable functions-worker running as part of a broker, you need to set `functionsWorkerEnabled` to `true` in the `broker.conf` file.

```conf
functionsWorkerEnabled=true
```

Setting `functionsWorkerEnabled` to `true` starts functions-worker as part of the broker. To customize functions-worker, edit the `conf/functions_worker.yml` file.

Before you run functions-worker with brokers, configure functions-worker, and then start it with the brokers.

### Configure Functions-Worker to run with brokers
In this mode, `functions-worker` runs as part of the broker, so most of its settings are inherited from your broker configuration (for example, configurationStore settings, authentication settings, and so on).

Pay attention to the following required settings when configuring functions-worker in this mode.

- `numFunctionPackageReplicas`: The number of replicas to store function packages. The default value is `1`, which is good for standalone deployment. For production deployment, to ensure high availability, set it to be more than `2`.
- `pulsarFunctionsCluster`: Set the value to your Pulsar cluster name (same as the `clusterName` setting in the broker configuration).

If authentication is enabled on the BookKeeper cluster, configure the following BookKeeper authentication settings.

- `bookkeeperClientAuthenticationPlugin`: the BookKeeper client authentication plugin name.
- `bookkeeperClientAuthenticationParametersName`: the BookKeeper client authentication plugin parameters name.
- `bookkeeperClientAuthenticationParameters`: the BookKeeper client authentication plugin parameters.
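
For reference, the required settings above might appear in `conf/functions_worker.yml` as in the following sketch. The replica count and the cluster name `pulsar-cluster-1` are illustrative assumptions. The BookKeeper lines are commented-out placeholders that only apply if authentication is enabled on the BookKeeper cluster.

```yaml
# conf/functions_worker.yml (functions-worker running with brokers)
# All values are illustrative assumptions; replace them with your own.

# Store more than two copies of each function package for high availability.
numFunctionPackageReplicas: 3

# Must match the `clusterName` setting in the broker configuration.
pulsarFunctionsCluster: pulsar-cluster-1

# Only needed if authentication is enabled on the BookKeeper cluster.
# bookkeeperClientAuthenticationPlugin: "<plugin-class-name>"
# bookkeeperClientAuthenticationParametersName: "<parameters-name>"
# bookkeeperClientAuthenticationParameters: "<parameters>"
```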

### Start Functions-worker with broker

Once you have configured the `functions_worker.yml` file, you can start or restart your broker.

Then use the following command to verify that `functions-worker` is running.

```bash
curl <broker-ip>:8080/admin/v2/worker/cluster
```

The command returns a list of the active functions workers in the cluster. The output is similar to the following.

```json
[{"workerId":"<worker-id>","workerHostname":"<worker-hostname>","port":8080}]
```

## Run Functions-worker separately

This section illustrates how to run `functions-worker` as a separate process on separate machines.

![assets/functions-worker-separated.png](assets/functions-worker-separated.png)

> Note
> In this mode, make sure `functionsWorkerEnabled` is set to `false`, so that you don't start `functions-worker` with brokers by mistake.

### Configure Functions-worker to run separately

To run functions-worker separately, you have to configure the following parameters.

#### Worker parameters

- `workerId`: A string that identifies a worker machine. It must be unique across clusters.
- `workerHostname`: The hostname of the worker machine.
- `workerPort`: The port that the worker server listens on. Keep it as default if you don't customize it.
- `workerPortTls`: The TLS port that the worker server listens on. Keep it as default if you don't customize it.

#### Function package parameter

- `numFunctionPackageReplicas`: The number of replicas to store function packages. The default value is `1`.

#### Function metadata parameter

- `pulsarServiceUrl`: The Pulsar service URL for your broker cluster.
- `pulsarWebServiceUrl`: The Pulsar web service URL for your broker cluster.
- `pulsarFunctionsCluster`: Set the value to your Pulsar cluster name (same as the `clusterName` setting in the broker configuration).

If authentication is enabled for your broker cluster, you *should* configure the authentication plugin and parameters for the functions worker to communicate with the brokers.

- `clientAuthenticationPlugin`
- `clientAuthenticationParameters`
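
Putting the worker, function package, and function metadata parameters together, a `conf/functions_worker.yml` for a separately running worker might look like the following sketch. The worker ID, hostnames, ports, cluster name, and token path are illustrative assumptions, not values from this guide, so check the defaults shipped in your own file. The client authentication lines are commented out and only apply if authentication is enabled on the brokers; the token plugin shown there is just one possible choice.

```yaml
# conf/functions_worker.yml (functions-worker running separately)
# All values are illustrative assumptions; replace them with your own.

# Worker parameters
workerId: functions-worker-1             # hypothetical ID, unique across clusters
workerHostname: worker-1.example.com     # hypothetical hostname of this machine
workerPort: 6750                         # assumed default; keep your file's value
workerPortTls: 6751                      # assumed default; keep your file's value

# Function package parameter
numFunctionPackageReplicas: 3

# Function metadata parameters
pulsarServiceUrl: pulsar://broker.example.com:6650
pulsarWebServiceUrl: http://broker.example.com:8080
pulsarFunctionsCluster: pulsar-cluster-1 # same as the broker's clusterName

# Only needed if authentication is enabled on the broker cluster.
# clientAuthenticationPlugin: org.apache.pulsar.client.impl.auth.AuthenticationToken
# clientAuthenticationParameters: "file:///path/to/token.txt"
```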

#### Security settings

If you want to enable security on functions workers, you *should*:
- [Enable TLS transport encryption](#enable-tls-transport-encryption)
- [Enable Authentication Provider](#enable-authentication-provider)
- [Enable Authorization Provider](#enable-authorization-provider)

**Enable TLS transport encryption**

To enable TLS transport encryption, configure the following settings.

```yaml
tlsEnabled: true
tlsCertificateFilePath: /path/to/functions-worker.cert.pem
tlsKeyFilePath: /path/to/functions-worker.key-pk8.pem
tlsTrustCertsFilePath: /path/to/ca.cert.pem
```

For details on TLS encryption, refer to [Transport Encryption using TLS](security-tls-transport.md).

**Enable Authentication Provider**

To enable authentication on Functions Worker, configure the following settings.
> Note
> Substitute the *providers list* with the providers you want to enable.

```yaml
authenticationEnabled: true
authenticationProviders: [ provider1, provider2 ]
```

For *SASL Authentication* provider, add `saslJaasClientAllowedIds` and `saslJaasBrokerSectionName`
under `properties` if needed.

```yaml
properties:
  saslJaasClientAllowedIds: .*pulsar.*
  saslJaasBrokerSectionName: Broker
```

For *Token Authentication* provider, add necessary settings under `properties` if needed.
See [Token Authentication](security-token-admin.md) for more details.

```yaml
properties:
  tokenSecretKey: file://my/secret.key
  # If using public/private
  # tokenPublicKey: file:///path/to/public.key
```

**Enable Authorization Provider**

To enable authorization on Functions Worker, you need to configure `authorizationEnabled` and `configurationStoreServers`. The authorization provider connects to `configurationStoreServers` to receive namespace policies.

```yaml
authorizationEnabled: true
configurationStoreServers: <configuration-store-servers>
```
You should also configure a list of superuser roles. The superuser roles are able to access any admin API. The following is a configuration example.
```yaml
superUserRoles:
- role1
- role2
- role3
```
#### BookKeeper Authentication
If authentication is enabled on the BookKeeper cluster, you should configure the BookKeeper authentication settings as follows:
- `bookkeeperClientAuthenticationPlugin`: the plugin name of BookKeeper client authentication.
- `bookkeeperClientAuthenticationParametersName`: the plugin parameters name of BookKeeper client authentication.
- `bookkeeperClientAuthenticationParameters`: the plugin parameters of BookKeeper client authentication.

### Start Functions-worker

Once you have finished configuring the `functions_worker.yml` configuration file, you can use the following command to start a `functions-worker`:

```bash
bin/pulsar functions-worker
```

### Configure Proxies for Functions-workers

When you run `functions-worker` in a separate cluster, the admin REST endpoints are split across two clusters. The `functions`, `function-worker`, `source` and `sink` endpoints are served
by the `functions-worker` cluster, while all other endpoints are served by the broker cluster.
As a result, you need to point `pulsar-admin` at the right service URL for each request.

To avoid this inconvenience, you can start a proxy cluster that routes the admin REST requests to the right cluster, which gives you one central entry point for your admin service.

If you already have a proxy cluster, continue reading. If you haven't set up a proxy cluster yet, follow the [instructions](http://pulsar.apache.org/docs/en/administration-proxy/) to
start proxies.

![assets/functions-worker-separated-proxy.png](assets/functions-worker-separated-proxy.png)

To route functions-related admin requests to `functions-worker` through a proxy, edit the following settings in the `proxy.conf` file:

```conf
functionWorkerWebServiceURL=<pulsar-functions-worker-web-service-url>
functionWorkerWebServiceURLTLS=<pulsar-functions-worker-web-service-url>
```

## Compare the Run-with-Broker and Run-separately modes

As described above, you can run functions-worker with brokers or run it separately. Running functions-workers along with brokers is more convenient. However, running functions-workers in a separate cluster provides better resource isolation for running functions in `Process` or `Thread` mode.

Use the following guidelines to decide which mode fits your case.

Use the `Run-with-Broker` mode in the following cases:
- a) you do not require resource isolation when running functions in `Process` or `Thread` mode;
- b) you configure the functions-worker to run functions on Kubernetes (where the resource isolation problem is addressed by Kubernetes).

Use the `Run-separately` mode in the following cases:
- a) you don't have a Kubernetes cluster;
- b) you want to run functions and brokers separately.

## Troubleshooting

**Error message: Namespace missing local cluster name in clusters list**

```
Failed to get partitioned topic metadata: org.apache.pulsar.client.api.PulsarClientException$BrokerMetadataException: Namespace missing local cluster name in clusters list: local_cluster=xyz ns=public/functions clusters=[standalone]
```
This error occurs in either of the following cases:
- a) a broker is started with `functionsWorkerEnabled=true`, but `pulsarFunctionsCluster` is not set to the correct cluster in the `conf/functions_worker.yml` file;
- b) a geo-replicated Pulsar cluster is set up with `functionsWorkerEnabled=true`, and brokers in one cluster run well while brokers in the other cluster do not.

**Workaround**

If either of these cases occurs, follow the instructions below to fix the problem:

1. Get the current clusters list of the `public/functions` namespace.

```bash
bin/pulsar-admin namespaces get-clusters public/functions
```

2. Check if the cluster is in the clusters list. If the cluster is not in the list, add it to the list and update the clusters list.

```bash
bin/pulsar-admin namespaces set-clusters --cluster=<existing-clusters>,<new-cluster> public/functions
```

3. Set the correct cluster name in `pulsarFunctionsCluster` in the `conf/functions_worker.yml` file.
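
As an illustration of step 3, assume the broker's cluster name is `xyz`, which is the `local_cluster` value in the error message above. Under that assumption, the matching entry in `conf/functions_worker.yml` would look like the following sketch; substitute your own cluster name.

```yaml
# conf/functions_worker.yml
# Assumption: the local cluster is named `xyz`, as in the error message above.
pulsarFunctionsCluster: xyz
```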
3 changes: 2 additions & 1 deletion site2/website/sidebars.json
@@ -25,7 +25,8 @@
"functions-deploying",
"functions-guarantees",
"functions-state",
"functions-metrics"
"functions-metrics",
"functions-worker"
],
"Pulsar IO": [
"io-overview",