Skip to content

Commit

Permalink
Add remaining files to 2.6.4 release doc set (apache#10799)
Browse files Browse the repository at this point in the history
- copied from 2.6.3 doc set
  • Loading branch information
lhotari authored Jun 3, 2021
1 parent cd0bf4a commit 78458bf
Show file tree
Hide file tree
Showing 88 changed files with 17,052 additions and 0 deletions.
265 changes: 265 additions & 0 deletions site2/website/versioned_docs/version-2.6.4/adaptors-kafka.md

Large diffs are not rendered by default.

72 changes: 72 additions & 0 deletions site2/website/versioned_docs/version-2.6.4/adaptors-spark.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,72 @@
---
id: version-2.6.4-adaptors-spark
title: Pulsar adaptor for Apache Spark
sidebar_label: Apache Spark
original_id: adaptors-spark
---

The Spark Streaming receiver for Pulsar is a custom receiver that enables Apache [Spark Streaming](https://spark.apache.org/streaming/) to receive data from Pulsar.

An application can receive data in [Resilient Distributed Dataset](https://spark.apache.org/docs/latest/programming-guide.html#resilient-distributed-datasets-rdds) (RDD) format via the Spark Streaming Pulsar receiver and can process it in a variety of ways.

## Prerequisites

To use the receiver, include a dependency for the `pulsar-spark` library in your Java configuration.

### Maven

If you're using Maven, add this to your `pom.xml`:

```xml
<!-- in your <properties> block -->
<pulsar.version>{{pulsar:version}}</pulsar.version>

<!-- in your <dependencies> block -->
<dependency>
<groupId>org.apache.pulsar</groupId>
<artifactId>pulsar-spark</artifactId>
<version>${pulsar.version}</version>
</dependency>
```

### Gradle

If you're using Gradle, add this to your `build.gradle` file:

```groovy
def pulsarVersion = "{{pulsar:version}}"
dependencies {
compile group: 'org.apache.pulsar', name: 'pulsar-spark', version: pulsarVersion
}
```

## Usage

Pass an instance of `SparkStreamingPulsarReceiver` to the `receiverStream` method in `JavaStreamingContext`:

```java
String serviceUrl = "pulsar://localhost:6650/";
String topic = "persistent://public/default/test_src";
String subs = "test_sub";

SparkConf sparkConf = new SparkConf().setMaster("local[*]").setAppName("Pulsar Spark Example");

JavaStreamingContext jsc = new JavaStreamingContext(sparkConf, Durations.seconds(60));

ConsumerConfigurationData<byte[]> pulsarConf = new ConsumerConfigurationData();

Set<String> set = new HashSet<>();
set.add(topic);
pulsarConf.setTopicNames(set);
pulsarConf.setSubscriptionName(subs);

SparkStreamingPulsarReceiver pulsarReceiver = new SparkStreamingPulsarReceiver(
serviceUrl,
pulsarConf,
new AuthenticationDisabled());

JavaReceiverInputDStream<byte[]> lineDStream = jsc.receiverStream(pulsarReceiver);
```

For a complete example, click [here](https://github.com/apache/pulsar-adapters/blob/master/examples/spark/src/main/java/org/apache/spark/streaming/receiver/example/SparkStreamingPulsarReceiverExample.java). In this example, the number of messages that contain the string "Pulsar" in received messages is counted.
89 changes: 89 additions & 0 deletions site2/website/versioned_docs/version-2.6.4/adaptors-storm.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,89 @@
---
id: version-2.6.4-adaptors-storm
title: Pulsar adaptor for Apache Storm
sidebar_label: Apache Storm
original_id: adaptors-storm
---

Pulsar Storm is an adaptor for integrating with [Apache Storm](http://storm.apache.org/) topologies. It provides core Storm implementations for sending and receiving data.

An application can inject data into a Storm topology via a generic Pulsar spout, as well as consume data from a Storm topology via a generic Pulsar bolt.

## Using the Pulsar Storm Adaptor

Include dependency for Pulsar Storm Adaptor:

```xml
<dependency>
<groupId>org.apache.pulsar</groupId>
<artifactId>pulsar-storm</artifactId>
<version>${pulsar.version}</version>
</dependency>
```

## Pulsar Spout

The Pulsar Spout allows for the data published on a topic to be consumed by a Storm topology. It emits a Storm tuple based on the message received and the `MessageToValuesMapper` provided by the client.

The tuples that fail to be processed by the downstream bolts will be re-injected by the spout with an exponential backoff, within a configurable timeout (the default is 60 seconds) or a configurable number of retries, whichever comes first, after which it is acknowledged by the consumer. Here's an example construction of a spout:

```java
MessageToValuesMapper messageToValuesMapper = new MessageToValuesMapper() {

@Override
public Values toValues(Message msg) {
return new Values(new String(msg.getData()));
}

@Override
public void declareOutputFields(OutputFieldsDeclarer declarer) {
// declare the output fields
declarer.declare(new Fields("string"));
}
};

// Configure a Pulsar Spout
PulsarSpoutConfiguration spoutConf = new PulsarSpoutConfiguration();
spoutConf.setServiceUrl("pulsar://broker.messaging.usw.example.com:6650");
spoutConf.setTopic("persistent://my-property/usw/my-ns/my-topic1");
spoutConf.setSubscriptionName("my-subscriber-name1");
spoutConf.setMessageToValuesMapper(messageToValuesMapper);

// Create a Pulsar Spout
PulsarSpout spout = new PulsarSpout(spoutConf);
```

For a complete example, click [here](https://github.com/apache/pulsar-adapters/blob/master/pulsar-storm/src/test/java/org/apache/pulsar/storm/PulsarSpoutTest.java).

## Pulsar Bolt

The Pulsar bolt allows data in a Storm topology to be published on a topic. It publishes messages based on the Storm tuple received and the `TupleToMessageMapper` provided by the client.

A partitioned topic can also be used to publish messages on different topics. In the implementation of the `TupleToMessageMapper`, a "key" will need to be provided in the message which will send the messages with the same key to the same topic. Here's an example bolt:

```java
TupleToMessageMapper tupleToMessageMapper = new TupleToMessageMapper() {

@Override
public TypedMessageBuilder<byte[]> toMessage(TypedMessageBuilder<byte[]> msgBuilder, Tuple tuple) {
String receivedMessage = tuple.getString(0);
// message processing
String processedMsg = receivedMessage + "-processed";
return msgBuilder.value(processedMsg.getBytes());
}

@Override
public void declareOutputFields(OutputFieldsDeclarer declarer) {
// declare the output fields
}
};

// Configure a Pulsar Bolt
PulsarBoltConfiguration boltConf = new PulsarBoltConfiguration();
boltConf.setServiceUrl("pulsar://broker.messaging.usw.example.com:6650");
boltConf.setTopic("persistent://my-property/usw/my-ns/my-topic2");
boltConf.setTupleToMessageMapper(tupleToMessageMapper);

// Create a Pulsar Bolt
PulsarBolt bolt = new PulsarBolt(boltConf);
```
210 changes: 210 additions & 0 deletions site2/website/versioned_docs/version-2.6.4/admin-api-clusters.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,210 @@
---
id: version-2.6.4-admin-api-clusters
title: Managing Clusters
sidebar_label: Clusters
original_id: admin-api-clusters
---

Pulsar clusters consist of one or more Pulsar [brokers](reference-terminology.md#broker), one or more [BookKeeper](reference-terminology.md#bookkeeper)
servers (aka [bookies](reference-terminology.md#bookie)), and a [ZooKeeper](https://zookeeper.apache.org) cluster that provides configuration and coordination management.

Clusters can be managed via:

* The [`clusters`](reference-pulsar-admin.md#clusters) command of the [`pulsar-admin`](reference-pulsar-admin.md) tool
* The `/admin/v2/clusters` endpoint of the admin {@inject: rest:REST:/} API
* The `clusters` method of the {@inject: javadoc:PulsarAdmin:/admin/org/apache/pulsar/client/admin/PulsarAdmin} object in the [Java API](client-libraries-java.md)

## Clusters resources

### Provision

New clusters can be provisioned using the admin interface.

> Please note that this operation requires superuser privileges.
#### pulsar-admin

You can provision a new cluster using the [`create`](reference-pulsar-admin.md#clusters-create) subcommand. Here's an example:

```shell
$ pulsar-admin clusters create cluster-1 \
--url http://my-cluster.org.com:8080 \
--broker-url pulsar://my-cluster.org.com:6650
```

#### REST API

{@inject: endpoint|PUT|/admin/v2/clusters/:cluster|operation/createCluster?version=[[pulsar:version_number]]}

#### Java

```java
ClusterData clusterData = new ClusterData(
serviceUrl,
serviceUrlTls,
brokerServiceUrl,
brokerServiceUrlTls
);
admin.clusters().createCluster(clusterName, clusterData);
```

### Initialize cluster metadata

When provision a new cluster, you need to initialize that cluster's [metadata](concepts-architecture-overview.md#metadata-store). When initializing cluster metadata, you need to specify all of the following:

* The name of the cluster
* The local ZooKeeper connection string for the cluster
* The configuration store connection string for the entire instance
* The web service URL for the cluster
* A broker service URL enabling interaction with the [brokers](reference-terminology.md#broker) in the cluster

You must initialize cluster metadata *before* starting up any [brokers](admin-api-brokers.md) that will belong to the cluster.

> #### No cluster metadata initialization through the REST API or the Java admin API
>
> Unlike most other admin functions in Pulsar, cluster metadata initialization cannot be performed via the admin REST API
> or the admin Java client, as metadata initialization involves communicating with ZooKeeper directly.
> Instead, you can use the [`pulsar`](reference-cli-tools.md#pulsar) CLI tool, in particular
> the [`initialize-cluster-metadata`](reference-cli-tools.md#pulsar-initialize-cluster-metadata) command.
Here's an example cluster metadata initialization command:

```shell
bin/pulsar initialize-cluster-metadata \
--cluster us-west \
--zookeeper zk1.us-west.example.com:2181 \
--configuration-store zk1.us-west.example.com:2184 \
--web-service-url http://pulsar.us-west.example.com:8080/ \
--web-service-url-tls https://pulsar.us-west.example.com:8443/ \
--broker-service-url pulsar://pulsar.us-west.example.com:6650/ \
--broker-service-url-tls pulsar+ssl://pulsar.us-west.example.com:6651/
```

You'll need to use `--*-tls` flags only if you're using [TLS authentication](security-tls-authentication.md) in your instance.

### Get configuration

You can fetch the [configuration](reference-configuration.md) for an existing cluster at any time.

#### pulsar-admin

Use the [`get`](reference-pulsar-admin.md#clusters-get) subcommand and specify the name of the cluster. Here's an example:

```shell
$ pulsar-admin clusters get cluster-1
{
"serviceUrl": "http://my-cluster.org.com:8080/",
"serviceUrlTls": null,
"brokerServiceUrl": "pulsar://my-cluster.org.com:6650/",
"brokerServiceUrlTls": null
"peerClusterNames": null
}
```

#### REST API

{@inject: endpoint|GET|/admin/v2/clusters/:cluster|operation/getCluster?version=[[pulsar:version_number]]}

#### Java

```java
admin.clusters().getCluster(clusterName);
```

### Update

You can update the configuration for an existing cluster at any time.

#### pulsar-admin

Use the [`update`](reference-pulsar-admin.md#clusters-update) subcommand and specify new configuration values using flags.

```shell
$ pulsar-admin clusters update cluster-1 \
--url http://my-cluster.org.com:4081 \
--broker-url pulsar://my-cluster.org.com:3350
```

#### REST

{@inject: endpoint|POST|/admin/v2/clusters/:cluster|operation/updateCluster?version=[[pulsar:version_number]]}

#### Java

```java
ClusterData clusterData = new ClusterData(
serviceUrl,
serviceUrlTls,
brokerServiceUrl,
brokerServiceUrlTls
);
admin.clusters().updateCluster(clusterName, clusterData);
```

### Delete

Clusters can be deleted from a Pulsar [instance](reference-terminology.md#instance).

#### pulsar-admin

Use the [`delete`](reference-pulsar-admin.md#clusters-delete) subcommand and specify the name of the cluster.

```
$ pulsar-admin clusters delete cluster-1
```

#### REST API

{@inject: endpoint|DELETE|/admin/v2/clusters/:cluster|operation/deleteCluster?version=[[pulsar:version_number]]}

#### Java

```java
admin.clusters().deleteCluster(clusterName);
```

### List

You can fetch a list of all clusters in a Pulsar [instance](reference-terminology.md#instance).

#### pulsar-admin

Use the [`list`](reference-pulsar-admin.md#clusters-list) subcommand.

```shell
$ pulsar-admin clusters list
cluster-1
cluster-2
```

#### REST API

{@inject: endpoint|GET|/admin/v2/clusters|operation/getClusters?version=[[pulsar:version_number]]}

###### Java

```java
admin.clusters().getClusters();
```

### Update peer-cluster data

Peer clusters can be configured for a given cluster in a Pulsar [instance](reference-terminology.md#instance).

#### pulsar-admin

Use the [`update-peer-clusters`](reference-pulsar-admin.md#clusters-update-peer-clusters) subcommand and specify the list of peer-cluster names.

```
$ pulsar-admin update-peer-clusters cluster-1 --peer-clusters cluster-2
```

#### REST API

{@inject: endpoint|POST|/admin/v2/clusters/:cluster/peers|operation/setPeerClusterNames?version=[[pulsar:version_number]]}

#### Java

```java
admin.clusters().updatePeerClusterNames(clusterName, peerClusterList);
```
Loading

0 comments on commit 78458bf

Please sign in to comment.