Skip to content

Commit

Permalink
Add full document for version 2.6.0 (apache#7310)
Browse files Browse the repository at this point in the history
Master Issue: apache#7083 

### Motivation

currently, website is released in a versioned way. but each versioned directory, contains only part of md files. This is hard to maintain the content of each version.
It would be better to make each versioned directory contain all the md files.


### Modifications

* Generate full document for 2.6.0
  • Loading branch information
tuteng authored Jun 27, 2020
1 parent a077788 commit 9c01a07
Show file tree
Hide file tree
Showing 99 changed files with 19,572 additions and 0 deletions.
265 changes: 265 additions & 0 deletions site2/website/versioned_docs/version-2.6.0/adaptors-kafka.md

Large diffs are not rendered by default.

77 changes: 77 additions & 0 deletions site2/website/versioned_docs/version-2.6.0/adaptors-spark.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,77 @@
---
id: version-2.6.0-adaptors-spark
title: Pulsar adaptor for Apache Spark
sidebar_label: Apache Spark
original_id: adaptors-spark
---

The Spark Streaming receiver for Pulsar is a custom receiver that enables Apache [Spark Streaming](https://spark.apache.org/streaming/) to receive data from Pulsar.

An application can receive data in [Resilient Distributed Dataset](https://spark.apache.org/docs/latest/programming-guide.html#resilient-distributed-datasets-rdds) (RDD) format via the Spark Streaming Pulsar receiver and can process it in a variety of ways.

## Prerequisites

To use the receiver, include a dependency for the `pulsar-spark` library in your Java configuration.

### Maven

If you're using Maven, add this to your `pom.xml`:

```xml
<!-- in your <properties> block -->
<pulsar.version>{{pulsar:version}}</pulsar.version>

<!-- in your <dependencies> block -->
<dependency>
<groupId>org.apache.pulsar</groupId>
<artifactId>pulsar-spark</artifactId>
<version>${pulsar.version}</version>
</dependency>
```

### Gradle

If you're using Gradle, add this to your `build.gradle` file:

```groovy
def pulsarVersion = "{{pulsar:version}}"
dependencies {
compile group: 'org.apache.pulsar', name: 'pulsar-spark', version: pulsarVersion
}
```

## Usage

Pass an instance of `SparkStreamingPulsarReceiver` to the `receiverStream` method in `JavaStreamingContext`:

```java
String serviceUrl = "pulsar://localhost:6650/";
String topic = "persistent://public/default/test_src";
String subs = "test_sub";

SparkConf sparkConf = new SparkConf().setMaster("local[*]").setAppName("Pulsar Spark Example");

JavaStreamingContext jsc = new JavaStreamingContext(sparkConf, Durations.seconds(60));

ConsumerConfigurationData<byte[]> pulsarConf = new ConsumerConfigurationData();

Set<String> set = new HashSet<>();
set.add(topic);
pulsarConf.setTopicNames(set);
pulsarConf.setSubscriptionName(subs);

SparkStreamingPulsarReceiver pulsarReceiver = new SparkStreamingPulsarReceiver(
serviceUrl,
pulsarConf,
new AuthenticationDisabled());

JavaReceiverInputDStream<byte[]> lineDStream = jsc.receiverStream(pulsarReceiver);
```


## Example

You can find a complete example [here](https://github.com/apache/pulsar/tree/master/examples/spark/src/main/java/org/apache/spark/streaming/receiver/example/SparkStreamingPulsarReceiverExample.java).
In this example, the number of messages which contain the string "Pulsar" in received messages is counted.

91 changes: 91 additions & 0 deletions site2/website/versioned_docs/version-2.6.0/adaptors-storm.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,91 @@
---
id: version-2.6.0-adaptors-storm
title: Pulsar adaptor for Apache Storm
sidebar_label: Apache Storm
original_id: adaptors-storm
---

Pulsar Storm is an adaptor for integrating with [Apache Storm](http://storm.apache.org/) topologies. It provides core Storm implementations for sending and receiving data.

An application can inject data into a Storm topology via a generic Pulsar spout, as well as consume data from a Storm topology via a generic Pulsar bolt.

## Using the Pulsar Storm Adaptor

Include dependency for Pulsar Storm Adaptor:

```xml
<dependency>
<groupId>org.apache.pulsar</groupId>
<artifactId>pulsar-storm</artifactId>
<version>${pulsar.version}</version>
</dependency>
```

## Pulsar Spout

The Pulsar Spout allows for the data published on a topic to be consumed by a Storm topology. It emits a Storm tuple based on the message received and the `MessageToValuesMapper` provided by the client.

The tuples that fail to be processed by the downstream bolts will be re-injected by the spout with an exponential backoff, within a configurable timeout (the default is 60 seconds) or a configurable number of retries, whichever comes first, after which it is acknowledged by the consumer. Here's an example construction of a spout:

```java
MessageToValuesMapper messageToValuesMapper = new MessageToValuesMapper() {

@Override
public Values toValues(Message msg) {
return new Values(new String(msg.getData()));
}

@Override
public void declareOutputFields(OutputFieldsDeclarer declarer) {
// declare the output fields
declarer.declare(new Fields("string"));
}
};

// Configure a Pulsar Spout
PulsarSpoutConfiguration spoutConf = new PulsarSpoutConfiguration();
spoutConf.setServiceUrl("pulsar://broker.messaging.usw.example.com:6650");
spoutConf.setTopic("persistent://my-property/usw/my-ns/my-topic1");
spoutConf.setSubscriptionName("my-subscriber-name1");
spoutConf.setMessageToValuesMapper(messageToValuesMapper);

// Create a Pulsar Spout
PulsarSpout spout = new PulsarSpout(spoutConf);
```

## Pulsar Bolt

The Pulsar bolt allows data in a Storm topology to be published on a topic. It publishes messages based on the Storm tuple received and the `TupleToMessageMapper` provided by the client.

A partitioned topic can also be used to publish messages on different topics. In the implementation of the `TupleToMessageMapper`, a "key" will need to be provided in the message which will send the messages with the same key to the same topic. Here's an example bolt:

```java
TupleToMessageMapper tupleToMessageMapper = new TupleToMessageMapper() {

@Override
public TypedMessageBuilder<byte[]> toMessage(TypedMessageBuilder<byte[]> msgBuilder, Tuple tuple) {
String receivedMessage = tuple.getString(0);
// message processing
String processedMsg = receivedMessage + "-processed";
return msgBuilder.value(processedMsg.getBytes());
}

@Override
public void declareOutputFields(OutputFieldsDeclarer declarer) {
// declare the output fields
}
};

// Configure a Pulsar Bolt
PulsarBoltConfiguration boltConf = new PulsarBoltConfiguration();
boltConf.setServiceUrl("pulsar://broker.messaging.usw.example.com:6650");
boltConf.setTopic("persistent://my-property/usw/my-ns/my-topic2");
boltConf.setTupleToMessageMapper(tupleToMessageMapper);

// Create a Pulsar Bolt
PulsarBolt bolt = new PulsarBolt(boltConf);
```

## Example

You can find a complete example [here](https://github.com/apache/pulsar/tree/master/pulsar-storm/src/test/java/org/apache/pulsar/storm/example/StormExample.java).
210 changes: 210 additions & 0 deletions site2/website/versioned_docs/version-2.6.0/admin-api-clusters.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,210 @@
---
id: version-2.6.0-admin-api-clusters
title: Managing Clusters
sidebar_label: Clusters
original_id: admin-api-clusters
---

Pulsar clusters consist of one or more Pulsar [brokers](reference-terminology.md#broker), one or more [BookKeeper](reference-terminology.md#bookkeeper)
servers (aka [bookies](reference-terminology.md#bookie)), and a [ZooKeeper](https://zookeeper.apache.org) cluster that provides configuration and coordination management.

Clusters can be managed via:

* The [`clusters`](reference-pulsar-admin.md#clusters) command of the [`pulsar-admin`](reference-pulsar-admin.md) tool
* The `/admin/v2/clusters` endpoint of the admin {@inject: rest:REST:/} API
* The `clusters` method of the {@inject: javadoc:PulsarAdmin:/admin/org/apache/pulsar/client/admin/PulsarAdmin} object in the [Java API](client-libraries-java.md)

## Clusters resources

### Provision

New clusters can be provisioned using the admin interface.

> Please note that this operation requires superuser privileges.
#### pulsar-admin

You can provision a new cluster using the [`create`](reference-pulsar-admin.md#clusters-create) subcommand. Here's an example:

```shell
$ pulsar-admin clusters create cluster-1 \
--url http://my-cluster.org.com:8080 \
--broker-url pulsar://my-cluster.org.com:6650
```

#### REST API

{@inject: endpoint|PUT|/admin/v2/clusters/:cluster|operation/createCluster}

#### Java

```java
ClusterData clusterData = new ClusterData(
serviceUrl,
serviceUrlTls,
brokerServiceUrl,
brokerServiceUrlTls
);
admin.clusters().createCluster(clusterName, clusterData);
```

### Initialize cluster metadata

When provision a new cluster, you need to initialize that cluster's [metadata](concepts-architecture-overview.md#metadata-store). When initializing cluster metadata, you need to specify all of the following:

* The name of the cluster
* The local ZooKeeper connection string for the cluster
* The configuration store connection string for the entire instance
* The web service URL for the cluster
* A broker service URL enabling interaction with the [brokers](reference-terminology.md#broker) in the cluster

You must initialize cluster metadata *before* starting up any [brokers](admin-api-brokers.md) that will belong to the cluster.

> #### No cluster metadata initialization through the REST API or the Java admin API
>
> Unlike most other admin functions in Pulsar, cluster metadata initialization cannot be performed via the admin REST API
> or the admin Java client, as metadata initialization involves communicating with ZooKeeper directly.
> Instead, you can use the [`pulsar`](reference-cli-tools.md#pulsar) CLI tool, in particular
> the [`initialize-cluster-metadata`](reference-cli-tools.md#pulsar-initialize-cluster-metadata) command.
Here's an example cluster metadata initialization command:

```shell
bin/pulsar initialize-cluster-metadata \
--cluster us-west \
--zookeeper zk1.us-west.example.com:2181 \
--configuration-store zk1.us-west.example.com:2184 \
--web-service-url http://pulsar.us-west.example.com:8080/ \
--web-service-url-tls https://pulsar.us-west.example.com:8443/ \
--broker-service-url pulsar://pulsar.us-west.example.com:6650/ \
--broker-service-url-tls pulsar+ssl://pulsar.us-west.example.com:6651/
```

You'll need to use `--*-tls` flags only if you're using [TLS authentication](security-tls-authentication.md) in your instance.

### Get configuration

You can fetch the [configuration](reference-configuration.md) for an existing cluster at any time.

#### pulsar-admin

Use the [`get`](reference-pulsar-admin.md#clusters-get) subcommand and specify the name of the cluster. Here's an example:

```shell
$ pulsar-admin clusters get cluster-1
{
"serviceUrl": "http://my-cluster.org.com:8080/",
"serviceUrlTls": null,
"brokerServiceUrl": "pulsar://my-cluster.org.com:6650/",
"brokerServiceUrlTls": null
"peerClusterNames": null
}
```

#### REST API

{@inject: endpoint|GET|/admin/v2/clusters/:cluster|operation/getCluster}

#### Java

```java
admin.clusters().getCluster(clusterName);
```

### Update

You can update the configuration for an existing cluster at any time.

#### pulsar-admin

Use the [`update`](reference-pulsar-admin.md#clusters-update) subcommand and specify new configuration values using flags.

```shell
$ pulsar-admin clusters update cluster-1 \
--url http://my-cluster.org.com:4081 \
--broker-url pulsar://my-cluster.org.com:3350
```

#### REST

{@inject: endpoint|POST|/admin/v2/clusters/:cluster|operation/updateCluster}

#### Java

```java
ClusterData clusterData = new ClusterData(
serviceUrl,
serviceUrlTls,
brokerServiceUrl,
brokerServiceUrlTls
);
admin.clusters().updateCluster(clusterName, clusterData);
```

### Delete

Clusters can be deleted from a Pulsar [instance](reference-terminology.md#instance).

#### pulsar-admin

Use the [`delete`](reference-pulsar-admin.md#clusters-delete) subcommand and specify the name of the cluster.

```
$ pulsar-admin clusters delete cluster-1
```

#### REST API

{@inject: endpoint|DELETE|/admin/v2/clusters/:cluster|operation/deleteCluster}

#### Java

```java
admin.clusters().deleteCluster(clusterName);
```

### List

You can fetch a list of all clusters in a Pulsar [instance](reference-terminology.md#instance).

#### pulsar-admin

Use the [`list`](reference-pulsar-admin.md#clusters-list) subcommand.

```shell
$ pulsar-admin clusters list
cluster-1
cluster-2
```

#### REST API

{@inject: endpoint|GET|/admin/v2/clusters|operation/getClusters}

###### Java

```java
admin.clusters().getClusters();
```

### Update peer-cluster data

Peer clusters can be configured for a given cluster in a Pulsar [instance](reference-terminology.md#instance).

#### pulsar-admin

Use the [`update-peer-clusters`](reference-pulsar-admin.md#clusters-update-peer-clusters) subcommand and specify the list of peer-cluster names.

```
$ pulsar-admin update-peer-clusters cluster-1 --peer-clusters cluster-2
```

#### REST API

{@inject: endpoint|POST|/admin/v2/clusters/:cluster/peers|operation/setPeerClusterNames}

#### Java

```java
admin.clusters().updatePeerClusterNames(clusterName, peerClusterList);
```
Loading

0 comments on commit 9c01a07

Please sign in to comment.