feat: docs migration - Administration (apache#12204)
Signed-off-by: LiLi <[email protected]>
urfreespace authored Sep 28, 2021
1 parent 91697c5 commit 043dddc
Showing 30 changed files with 4,342 additions and 14 deletions.
69 changes: 69 additions & 0 deletions site2/website-next/docs/administration-dashboard.md
@@ -0,0 +1,69 @@
---
id: administration-dashboard
title: Pulsar dashboard
sidebar_label: Dashboard
---

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';


:::note

Pulsar dashboard is deprecated. We recommend you use [Pulsar Manager](administration-pulsar-manager.md) to manage and monitor the stats of your topics.

:::

Pulsar dashboard is a web application that enables users to monitor current stats for all [topics](reference-terminology.md#topic) in tabular form.

The dashboard is a data collector that polls stats from all the brokers in a Pulsar instance (across multiple clusters) and stores all the information in a [PostgreSQL](https://www.postgresql.org/) database.

You can use the [Django](https://www.djangoproject.com) web app to render the collected data.

## Install

The easiest way to use the dashboard is to run it inside a [Docker](https://www.docker.com/products/docker) container.

```shell
$ SERVICE_URL=http://broker.example.com:8080/
$ docker run -p 80:80 \
-e SERVICE_URL=$SERVICE_URL \
apachepulsar/pulsar-dashboard:@pulsar:version@
```

You can find the {@inject: github:Dockerfile:/dashboard/Dockerfile} in the `dashboard` directory and build an image from scratch as well:

```shell
$ docker build -t apachepulsar/pulsar-dashboard dashboard
```

If token authentication is enabled, pass the token to the container:

> The provided token should have super-user access.

```shell
$ SERVICE_URL=http://broker.example.com:8080/
$ JWT_TOKEN=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6IkpvaG4gRG9lIiwiaWF0IjoxNTE2MjM5MDIyfQ.SflKxwRJSMeKKF2QT4fwpMeJf36POk6yJV_adQssw5c
$ docker run -p 80:80 \
-e SERVICE_URL=$SERVICE_URL \
-e JWT_TOKEN=$JWT_TOKEN \
apachepulsar/pulsar-dashboard
```

You need to specify only one service URL for a Pulsar cluster. Internally, the collector discovers all the existing clusters and the brokers from which it needs to pull metrics. If you connect the dashboard to Pulsar running in standalone mode, the URL is `http://<broker-ip>:8080` by default, where `<broker-ip>` is the IP address or hostname of the machine that runs Pulsar standalone. The IP address or hostname must be accessible from the dashboard running in the Docker container.

Once the Docker container starts, the web dashboard is accessible via `localhost` or whichever host Docker uses.

> The `SERVICE_URL` that the dashboard uses needs to be reachable from inside the Docker container.
> If the Pulsar service runs in standalone mode on `localhost`, the `SERVICE_URL` has to
> use the IP address of the machine instead of `localhost`.

Similarly, because Pulsar standalone advertises itself as `localhost` by default, you need to
explicitly set the advertised address to the host IP address. For example:

```shell
$ bin/pulsar standalone --advertised-address 1.2.3.4
```
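
The reachability rule above can be checked before the container starts. The following is a minimal sketch, not part of Pulsar: `is_loopback_url` is a hypothetical helper that flags a `SERVICE_URL` whose host is `localhost` or a `127.x.x.x` address, which inside a container typically resolves to the container itself.

```shell
# Hypothetical helper (not part of Pulsar): succeeds when the URL's host is a
# loopback name or address.
is_loopback_url() {
  host=${1#*://}     # drop the scheme, e.g. "http://"
  host=${host%%/*}   # drop any path
  host=${host%%:*}   # drop any port
  case "$host" in
    localhost|127.*) return 0 ;;
    *) return 1 ;;
  esac
}

SERVICE_URL=http://localhost:8080/
if is_loopback_url "$SERVICE_URL"; then
  echo "warning: $SERVICE_URL points at a loopback address; use the host IP instead" >&2
fi
```

The check only inspects the URL string. It does not confirm that the broker is actually reachable from the container.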

### Known issues

Currently, only Pulsar Token [authentication](security-overview.md#authentication-providers) is supported.
201 changes: 201 additions & 0 deletions site2/website-next/docs/administration-geo.md
@@ -0,0 +1,201 @@
---
id: administration-geo
title: Pulsar geo-replication
sidebar_label: Geo-replication
---

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';


*Geo-replication* is the replication of persistently stored message data across multiple clusters of a Pulsar instance.

## How geo-replication works

The diagram below illustrates the process of geo-replication across Pulsar clusters:

![Replication Diagram](/assets/geo-replication.png)

In this diagram, whenever **P1**, **P2**, and **P3** producers publish messages to the **T1** topic on **Cluster-A**, **Cluster-B**, and **Cluster-C** clusters respectively, those messages are instantly replicated across clusters. Once the messages are replicated, **C1** and **C2** consumers can consume those messages from their respective clusters.

Without geo-replication, **C1** and **C2** consumers are not able to consume messages that **P3** producer publishes.

## Geo-replication and Pulsar properties

You must enable geo-replication on a per-tenant basis in Pulsar. You can enable geo-replication between clusters only when a tenant is created that allows access to both clusters.

Although geo-replication must be enabled between two clusters, it is actually managed at the namespace level. You must complete the following tasks to enable geo-replication for a namespace:

* [Enable geo-replication namespaces](#enable-geo-replication-namespaces)
* Configure that namespace to replicate across two or more provisioned clusters

Any message published on *any* topic in that namespace is replicated to all clusters in the specified set.

## Local persistence and forwarding

When messages are produced on a Pulsar topic, messages are first persisted in the local cluster, and then forwarded asynchronously to the remote clusters.

In normal cases, when there are no connectivity issues, messages are replicated immediately, at the same time as they are dispatched to local consumers. Typically, the network [round-trip time](https://en.wikipedia.org/wiki/Round-trip_delay_time) (RTT) between the remote regions defines end-to-end delivery latency.

Applications can create producers and consumers in any of the clusters, even when the remote clusters are not reachable (like during a network partition).

Producers and consumers can publish messages to and consume messages from any cluster in a Pulsar instance. Subscriptions are not only local to the cluster where they are created; once replicated subscription is enabled, they can also be transferred between clusters, and their state is kept in sync. As a result, when a topic is asynchronously replicated across multiple geographical regions, a consumer can resume consuming messages from the failure point in a different cluster after a failover.

In the aforementioned example, the **T1** topic is replicated among three clusters, **Cluster-A**, **Cluster-B**, and **Cluster-C**.

All messages produced in any of the three clusters are delivered to all subscriptions in other clusters. In this case, **C1** and **C2** consumers receive all messages that **P1**, **P2**, and **P3** producers publish. Ordering is still guaranteed on a per-producer basis.

## Configure replication

As stated in the [Geo-replication and Pulsar properties](#geo-replication-and-pulsar-properties) section, geo-replication in Pulsar is enabled per [tenant](reference-terminology.md#tenant) and managed at the namespace level.

The following example connects three clusters: **us-east**, **us-west**, and **us-cent**.

### Connect replication clusters

To replicate data among clusters, you need to configure each cluster to connect to the other. You can use the [`pulsar-admin`](https://pulsar.apache.org/tools/pulsar-admin/) tool to create a connection.

**Example**

Suppose that you have 3 replication clusters: `us-west`, `us-cent`, and `us-east`.

1. Configure the connection from `us-west` to `us-east`.

Run the following command on `us-west`.

```shell
$ bin/pulsar-admin clusters create \
--broker-url pulsar://<DNS-OF-US-EAST>:<PORT> \
--url http://<DNS-OF-US-EAST>:<PORT> \
us-east
```

:::tip


If you want to use a secure connection for a cluster, you can use the flags `--broker-url-secure` and `--url-secure`. For more information, see [pulsar-admin clusters create](https://pulsar.apache.org/tools/pulsar-admin/).

:::

2. Configure the connection from `us-west` to `us-cent`.

Run the following command on `us-west`.

```shell
$ bin/pulsar-admin clusters create \
--broker-url pulsar://<DNS-OF-US-CENT>:<PORT> \
--url http://<DNS-OF-US-CENT>:<PORT> \
us-cent
```

3. Run similar commands on `us-east` and `us-cent` to create connections among clusters.
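
The full set of connections can be generated rather than typed by hand. The sketch below only prints the `clusters create` commands for every ordered pair of clusters, so nothing is executed. The hostnames (`broker.<cluster>.example.com`) and the ports `6650` and `8080` are assumptions for illustration; substitute the real DNS names and ports of your clusters.

```shell
# Cluster names come from the example above; hostnames and ports below are
# hypothetical placeholders, so replace them with your own.
CLUSTERS="us-west us-east us-cent"

print_mesh_commands() {
  for local_cluster in $CLUSTERS; do
    for remote in $CLUSTERS; do
      # A cluster does not need a connection to itself.
      if [ "$local_cluster" = "$remote" ]; then
        continue
      fi
      printf '# run on %s\n' "$local_cluster"
      printf 'bin/pulsar-admin clusters create --broker-url pulsar://broker.%s.example.com:6650 --url http://broker.%s.example.com:8080 %s\n' \
        "$remote" "$remote" "$remote"
    done
  done
}

print_mesh_commands
```

With three clusters this prints six commands, two to run on each cluster. Review the output before running any of it.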

### Grant permissions to properties

To replicate to a cluster, the tenant needs permission to use that cluster. You can grant permission to the tenant when you create the tenant or grant it later.

Specify all the intended clusters when you create a tenant:

```shell
$ bin/pulsar-admin tenants create my-tenant \
--admin-roles my-admin-role \
--allowed-clusters us-west,us-east,us-cent
```

To update permissions of an existing tenant, use `update` instead of `create`.

### Enable geo-replication namespaces

You can create a namespace with the following command.

```shell
$ bin/pulsar-admin namespaces create my-tenant/my-namespace
```

Initially, the namespace is not assigned to any cluster. You can assign the namespace to clusters using the `set-clusters` subcommand:

```shell
$ bin/pulsar-admin namespaces set-clusters my-tenant/my-namespace \
--clusters us-west,us-east,us-cent
```

You can change the replication clusters for a namespace at any time, without disruption to ongoing traffic. Replication channels are immediately set up or stopped in all clusters as soon as the configuration changes.
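
Because a tenant needs permission for every cluster it replicates to, it can help to compare the two lists before calling `set-clusters`. The following is a minimal sketch: `clusters_allowed` is a hypothetical helper, and the script prints the `set-clusters` command instead of running it.

```shell
# Hypothetical helper: succeeds only if every cluster in the comma-separated
# list $2 also appears in the comma-separated list $1.
clusters_allowed() {
  allowed=",$1,"
  for cluster in $(printf '%s' "$2" | tr ',' ' '); do
    case "$allowed" in
      *",$cluster,"*) ;;   # this cluster is permitted, keep checking
      *) return 1 ;;       # at least one cluster is not permitted
    esac
  done
  return 0
}

ALLOWED=us-west,us-east,us-cent   # value passed to --allowed-clusters
WANTED=us-west,us-east            # value intended for --clusters

if clusters_allowed "$ALLOWED" "$WANTED"; then
  echo "bin/pulsar-admin namespaces set-clusters my-tenant/my-namespace --clusters $WANTED"
else
  echo "tenant is not allowed to use every cluster in: $WANTED" >&2
fi
```

In a real setup, the allowed list would come from the tenant configuration rather than a hard-coded variable.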

### Use topics with geo-replication

Once you create a geo-replication namespace, any topics that producers or consumers create within that namespace are replicated across clusters. Typically, each application uses the `serviceUrl` for the local cluster.

#### Selective replication

By default, messages are replicated to all clusters configured for the namespace. You can restrict replication selectively by specifying a replication list for a message, and then that message is replicated only to the subset in the replication list.

The following is an example for the [Java API](client-libraries-java.md). Note the use of the `replicationClusters` method when you construct the {@inject: javadoc:Message:/client/org/apache/pulsar/client/api/Message} object:

```java
import java.util.Arrays;
import java.util.List;
import org.apache.pulsar.client.api.Producer;

// Replicate this message only to us-west and us-east.
List<String> restrictReplicationTo = Arrays.asList(
        "us-west",
        "us-east"
);

// `client` is an existing PulsarClient instance.
Producer<byte[]> producer = client.newProducer()
        .topic("some-topic")
        .create();

producer.newMessage()
        .value("my-payload".getBytes())
        .replicationClusters(restrictReplicationTo)
        .send();
```

#### Topic stats

Topic-specific statistics for geo-replication topics are available via the [`pulsar-admin`](reference-pulsar-admin.md) tool and {@inject: rest:REST:/} API:

```shell
$ bin/pulsar-admin topics stats persistent://my-tenant/my-namespace/my-topic
```

Each cluster reports its own local stats, including the incoming and outgoing replication rates and backlogs.

#### Delete a geo-replication topic

Given that geo-replication topics exist in multiple regions, directly deleting a geo-replication topic is not possible. Instead, you should rely on automatic topic garbage collection.

In Pulsar, a topic is automatically deleted when it meets the following three conditions:

- no producers or consumers are connected to it;
- it has no subscriptions;
- no more messages are kept for retention.

For geo-replication topics, each region uses a fault-tolerant mechanism to decide when deleting the topic locally is safe.

You can explicitly disable topic garbage collection by setting `brokerDeleteInactiveTopicsEnabled` to `false` in your [broker configuration](reference-configuration.md#broker).

To delete a geo-replication topic, close all producers and consumers on the topic, and delete all of its local subscriptions in every replication cluster. When Pulsar determines that no valid subscription for the topic remains across the system, it will garbage collect the topic.
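
The three garbage-collection conditions can be written as a single predicate. This is an illustrative sketch only: `topic_is_collectable` is a hypothetical helper that takes counts you would obtain yourself (for example, from the topic stats), and it does not reproduce the fault-tolerant, cross-region check that the brokers perform.

```shell
# Hypothetical helper: succeeds when a topic meets all local conditions for
# automatic deletion.
# Arguments: <producers> <consumers> <subscriptions> <retained-messages>
topic_is_collectable() {
  [ "$1" -eq 0 ] && [ "$2" -eq 0 ] && [ "$3" -eq 0 ] && [ "$4" -eq 0 ]
}

# Example: no clients are connected, but one subscription still exists.
if topic_is_collectable 0 0 1 0; then
  echo "topic can be garbage collected"
else
  echo "topic is still in use"
fi
```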

## Replicated subscriptions

Pulsar supports replicated subscriptions, so you can keep subscription state in sync, within a sub-second timeframe, in the context of a topic that is being asynchronously replicated across multiple geographical regions.

In case of failover, a consumer can restart consuming from the failure point in a different cluster.

### Enable replicated subscription

Replicated subscription is disabled by default. You can enable replicated subscription when creating a consumer.

```java
Consumer<String> consumer = client.newConsumer(Schema.STRING)
.topic("my-topic")
.subscriptionName("my-subscription")
.replicateSubscriptionState(true)
.subscribe();
```

### Advantages

* It is easy to implement the logic.
* You can choose to enable or disable replicated subscription.
* When you enable it, the overhead is low, and it is easy to configure.
* When you disable it, the overhead is zero.

### Limitations

When you enable replicated subscription, Pulsar creates a consistent distributed snapshot to establish an association between message IDs from different clusters. Snapshots are taken periodically, every 1 second by default. This means that a consumer failing over to a different cluster can potentially receive up to 1 second of duplicates. You can configure the snapshot frequency in the `broker.conf` file.
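
To get a feel for what "1 second of duplicates" means in message counts, a rough estimate is the consumption rate multiplied by the snapshot interval. The sketch below assumes a steady rate and treats one snapshot interval as the worst case; both are simplifying assumptions, and `max_duplicates` is a hypothetical helper, not a Pulsar tool.

```shell
# Hypothetical helper: rough upper estimate of duplicate messages after a
# failover, assuming a steady rate and one snapshot interval of exposure.
# Arguments: <messages-per-second> <snapshot-interval-ms>
max_duplicates() {
  rate_per_sec=$1
  snapshot_ms=$2
  echo $(( rate_per_sec * snapshot_ms / 1000 ))
}

max_duplicates 500 1000   # prints 500
```

Lowering the snapshot interval shrinks this window at the cost of more frequent snapshots, so consumers should still be written to tolerate duplicates.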
126 changes: 126 additions & 0 deletions site2/website-next/docs/administration-isolation.md
@@ -0,0 +1,126 @@
---
id: administration-isolation
title: Pulsar isolation
sidebar_label: Pulsar isolation
---

import Tabs from '@theme/Tabs';
import TabItem from '@theme/TabItem';


In an organization, a Pulsar instance provides services to multiple teams. When organizing resources across multiple teams, you want a suitable isolation plan that avoids resource competition between teams and applications and provides a high-quality messaging service. In this case, you need to take resource isolation into consideration and weigh your intended actions against expected and unexpected consequences.

To enforce resource isolation, you can use the Pulsar isolation policy, which allows you to allocate resources (**broker** and **bookie**) for the namespace.

## Broker isolation

In Pulsar, when namespaces (more specifically, namespace bundles) are assigned dynamically to brokers, the namespace isolation policy limits the set of brokers that can be used for assignment. Before topics are assigned to brokers, you can set the namespace isolation policy with a primary or a secondary regex to select desired brokers.
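
Before setting a policy, it can be useful to preview which brokers a primary or secondary regex selects. The sketch below uses `grep -Ex` as a stand-in and assumes the regex must match the whole broker address; Pulsar evaluates the pattern with Java regular expressions, so results may differ for advanced syntax. `matching_brokers` and the broker addresses are hypothetical.

```shell
# Hypothetical helper: prints the broker addresses that the regex matches in
# full. Arguments: <regex> <broker-address>...
matching_brokers() {
  regex=$1
  shift
  printf '%s\n' "$@" | grep -Ex -- "$regex"
}

# Preview which of three example brokers the primary regex would select.
matching_brokers '10.193.216.*' 10.193.216.11 10.193.216.12 10.193.217.5
```

Note that an unescaped `.` matches any character, so a stricter pattern such as `10\.193\.216\..*` may be preferable.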

You can set a namespace isolation policy for a cluster using one of the following methods.

<Tabs
defaultValue="Admin CLI"
values={[
{
"label": "Admin CLI",
"value": "Admin CLI"
},
{
"label": "REST API",
"value": "REST API"
},
{
"label": "Java admin API",
"value": "Java admin API"
}
]}>

<TabItem value="Admin CLI">

```
pulsar-admin ns-isolation-policy set options
```

For more information about the command `pulsar-admin ns-isolation-policy set options`, see [here](https://pulsar.apache.org/tools/pulsar-admin/).

**Example**

```shell
bin/pulsar-admin ns-isolation-policy set \
--auto-failover-policy-type min_available \
--auto-failover-policy-params min_limit=1,usage_threshold=80 \
--namespaces my-tenant/my-namespace \
--primary "10.193.216.*" my-cluster policy-name
```

</TabItem>
<TabItem value="REST API">

[PUT /admin/v2/namespaces/{tenant}/{namespace}](https://pulsar.apache.org/admin-rest-api/?version=master&apiversion=v2#operation/createNamespace)

</TabItem>
<TabItem value="Java admin API">

For how to set namespace isolation policy using Java admin API, see [here](https://github.com/apache/pulsar/blob/master/pulsar-client-admin/src/main/java/org/apache/pulsar/client/admin/internal/NamespacesImpl.java#L251).

</TabItem>

</Tabs>

## Bookie isolation

A namespace can be isolated into user-defined groups of bookies, which guarantees that all the data belonging to the namespace is stored on the desired bookies. The bookie affinity group uses the BookKeeper [rack-aware placement policy](https://bookkeeper.apache.org/docs/latest/api/javadoc/org/apache/bookkeeper/client/EnsemblePlacementPolicy.html) and is a way to supply rack information, which is stored in JSON format in a znode.

You can set a bookie affinity group using one of the following methods.

<Tabs
defaultValue="Admin CLI"
values={[
{
"label": "Admin CLI",
"value": "Admin CLI"
},
{
"label": "REST API",
"value": "REST API"
},
{
"label": "Java admin API",
"value": "Java admin API"
}
]}>

<TabItem value="Admin CLI">

```
pulsar-admin namespaces set-bookie-affinity-group options
```

For more information about the command `pulsar-admin namespaces set-bookie-affinity-group options`, see [here](https://pulsar.apache.org/tools/pulsar-admin/).

**Example**

```shell
bin/pulsar-admin bookies set-bookie-rack \
--bookie 127.0.0.1:3181 \
--hostname 127.0.0.1:3181 \
--group group-bookie1 \
--rack rack1

bin/pulsar-admin namespaces set-bookie-affinity-group public/default \
--primary-group group-bookie1
```

</TabItem>
<TabItem value="REST API">

[POST /admin/v2/namespaces/{tenant}/{namespace}/persistence/bookieAffinity](https://pulsar.apache.org/admin-rest-api/?version=master&apiversion=v2#operation/setBookieAffinityGroup)

</TabItem>
<TabItem value="Java admin API">

For how to set bookie affinity group for a namespace using Java admin API, see [here](https://github.com/apache/pulsar/blob/master/pulsar-client-admin/src/main/java/org/apache/pulsar/client/admin/internal/NamespacesImpl.java#L1164).

</TabItem>

</Tabs>