Add full document for version 2.6.0 (apache#7310)
Master Issue: apache#7083

### Motivation

Currently, the website is released in a versioned way, but each versioned directory contains only part of the md files. This makes the content of each version hard to maintain. It would be better for each versioned directory to contain all the md files.

### Modifications

* Generate the full document set for 2.6.0
Showing 99 changed files with 19,572 additions and 0 deletions.
265 changes: 265 additions & 0 deletions
site2/website/versioned_docs/version-2.6.0/adaptors-kafka.md

Large diffs are not rendered by default.
77 changes: 77 additions & 0 deletions
site2/website/versioned_docs/version-2.6.0/adaptors-spark.md
---
id: version-2.6.0-adaptors-spark
title: Pulsar adaptor for Apache Spark
sidebar_label: Apache Spark
original_id: adaptors-spark
---

The Spark Streaming receiver for Pulsar is a custom receiver that enables Apache [Spark Streaming](https://spark.apache.org/streaming/) to receive data from Pulsar.

An application can receive data in [Resilient Distributed Dataset](https://spark.apache.org/docs/latest/programming-guide.html#resilient-distributed-datasets-rdds) (RDD) format via the Spark Streaming Pulsar receiver and can process it in a variety of ways.

## Prerequisites

To use the receiver, include a dependency for the `pulsar-spark` library in your Java configuration.

### Maven

If you're using Maven, add this to your `pom.xml`:

```xml
<!-- in your <properties> block -->
<pulsar.version>{{pulsar:version}}</pulsar.version>

<!-- in your <dependencies> block -->
<dependency>
  <groupId>org.apache.pulsar</groupId>
  <artifactId>pulsar-spark</artifactId>
  <version>${pulsar.version}</version>
</dependency>
```

### Gradle

If you're using Gradle, add this to your `build.gradle` file:

```groovy
def pulsarVersion = "{{pulsar:version}}"
dependencies {
    compile group: 'org.apache.pulsar', name: 'pulsar-spark', version: pulsarVersion
}
```

## Usage

Pass an instance of `SparkStreamingPulsarReceiver` to the `receiverStream` method in `JavaStreamingContext`:

```java
String serviceUrl = "pulsar://localhost:6650/";
String topic = "persistent://public/default/test_src";
String subs = "test_sub";

SparkConf sparkConf = new SparkConf().setMaster("local[*]").setAppName("Pulsar Spark Example");

JavaStreamingContext jsc = new JavaStreamingContext(sparkConf, Durations.seconds(60));

ConsumerConfigurationData<byte[]> pulsarConf = new ConsumerConfigurationData();

Set<String> set = new HashSet<>();
set.add(topic);
pulsarConf.setTopicNames(set);
pulsarConf.setSubscriptionName(subs);

SparkStreamingPulsarReceiver pulsarReceiver = new SparkStreamingPulsarReceiver(
    serviceUrl,
    pulsarConf,
    new AuthenticationDisabled());

JavaReceiverInputDStream<byte[]> lineDStream = jsc.receiverStream(pulsarReceiver);
```

## Example
You can find a complete example [here](https://github.com/apache/pulsar/tree/master/examples/spark/src/main/java/org/apache/spark/streaming/receiver/example/SparkStreamingPulsarReceiverExample.java).
In this example, the application counts the number of received messages that contain the string "Pulsar".

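As a rough sketch of that counting logic (not the linked example verbatim), the `lineDStream` from the usage snippet above could be decoded, filtered, and counted per batch; decoding the payload as a UTF-8 string is an assumption about the producer:

```java
import java.nio.charset.StandardCharsets;
import org.apache.spark.streaming.api.java.JavaDStream;

// Continues the usage snippet above: lineDStream and jsc are assumed to already exist.
// Decode each message payload as a UTF-8 string.
JavaDStream<String> lines = lineDStream.map(bytes -> new String(bytes, StandardCharsets.UTF_8));

// Keep only the messages that contain "Pulsar" and print a per-batch count.
JavaDStream<String> pulsarLines = lines.filter(line -> line.contains("Pulsar"));
pulsarLines.count().print();

// Start the streaming job and wait for it to finish.
jsc.start();
jsc.awaitTermination();
```

See the linked example for the full, runnable program.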
91 changes: 91 additions & 0 deletions
site2/website/versioned_docs/version-2.6.0/adaptors-storm.md
---
id: version-2.6.0-adaptors-storm
title: Pulsar adaptor for Apache Storm
sidebar_label: Apache Storm
original_id: adaptors-storm
---

Pulsar Storm is an adaptor for integrating with [Apache Storm](http://storm.apache.org/) topologies. It provides core Storm implementations for sending and receiving data.

An application can inject data into a Storm topology via a generic Pulsar spout, as well as consume data from a Storm topology via a generic Pulsar bolt.
## Using the Pulsar Storm Adaptor

Include a dependency for the Pulsar Storm adaptor:

```xml
<dependency>
  <groupId>org.apache.pulsar</groupId>
  <artifactId>pulsar-storm</artifactId>
  <version>${pulsar.version}</version>
</dependency>
```

## Pulsar Spout
The Pulsar spout allows data published on a topic to be consumed by a Storm topology. It emits a Storm tuple based on the message received and the `MessageToValuesMapper` provided by the client.

Tuples that fail to be processed by the downstream bolts are re-injected by the spout with an exponential backoff, up to a configurable timeout (the default is 60 seconds) or a configurable number of retries, whichever comes first, after which the message is acknowledged by the consumer. Here's an example construction of a spout:
```java
MessageToValuesMapper messageToValuesMapper = new MessageToValuesMapper() {

    @Override
    public Values toValues(Message msg) {
        return new Values(new String(msg.getData()));
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // declare the output fields
        declarer.declare(new Fields("string"));
    }
};

// Configure a Pulsar Spout
PulsarSpoutConfiguration spoutConf = new PulsarSpoutConfiguration();
spoutConf.setServiceUrl("pulsar://broker.messaging.usw.example.com:6650");
spoutConf.setTopic("persistent://my-property/usw/my-ns/my-topic1");
spoutConf.setSubscriptionName("my-subscriber-name1");
spoutConf.setMessageToValuesMapper(messageToValuesMapper);

// Create a Pulsar Spout
PulsarSpout spout = new PulsarSpout(spoutConf);
```

## Pulsar Bolt
The Pulsar bolt allows data in a Storm topology to be published on a topic. It publishes messages based on the Storm tuple received and the `TupleToMessageMapper` provided by the client.

A partitioned topic can also be used to distribute messages across its partitions. In the implementation of the `TupleToMessageMapper`, provide a "key" in the message; messages with the same key are routed to the same partition. Here's an example bolt:
```java
TupleToMessageMapper tupleToMessageMapper = new TupleToMessageMapper() {

    @Override
    public TypedMessageBuilder<byte[]> toMessage(TypedMessageBuilder<byte[]> msgBuilder, Tuple tuple) {
        String receivedMessage = tuple.getString(0);
        // message processing
        String processedMsg = receivedMessage + "-processed";
        return msgBuilder.value(processedMsg.getBytes());
    }

    @Override
    public void declareOutputFields(OutputFieldsDeclarer declarer) {
        // declare the output fields
    }
};

// Configure a Pulsar Bolt
PulsarBoltConfiguration boltConf = new PulsarBoltConfiguration();
boltConf.setServiceUrl("pulsar://broker.messaging.usw.example.com:6650");
boltConf.setTopic("persistent://my-property/usw/my-ns/my-topic2");
boltConf.setTupleToMessageMapper(tupleToMessageMapper);

// Create a Pulsar Bolt
PulsarBolt bolt = new PulsarBolt(boltConf);
```
## Example

You can find a complete example [here](https://github.com/apache/pulsar/tree/master/pulsar-storm/src/test/java/org/apache/pulsar/storm/example/StormExample.java).
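As a rough sketch of how the spout and bolt above might be wired into a topology (the component IDs and the local-cluster setup are illustrative assumptions, not the linked example verbatim):

```java
import org.apache.storm.Config;
import org.apache.storm.LocalCluster;
import org.apache.storm.topology.TopologyBuilder;

// Wire the spout and bolt from the snippets above into a topology.
TopologyBuilder builder = new TopologyBuilder();
builder.setSpout("pulsar-spout", spout, 1);
builder.setBolt("pulsar-bolt", bolt, 1).shuffleGrouping("pulsar-spout");

// Run the topology in a local cluster for testing purposes.
Config conf = new Config();
LocalCluster cluster = new LocalCluster();
cluster.submitTopology("pulsar-storm-demo", conf, builder.createTopology());
```

In a production deployment the topology would typically be submitted with `StormSubmitter` rather than run in a `LocalCluster`.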
210 changes: 210 additions & 0 deletions
site2/website/versioned_docs/version-2.6.0/admin-api-clusters.md
---
id: version-2.6.0-admin-api-clusters
title: Managing Clusters
sidebar_label: Clusters
original_id: admin-api-clusters
---

Pulsar clusters consist of one or more Pulsar [brokers](reference-terminology.md#broker), one or more [BookKeeper](reference-terminology.md#bookkeeper) servers (aka [bookies](reference-terminology.md#bookie)), and a [ZooKeeper](https://zookeeper.apache.org) cluster that provides configuration and coordination management.

Clusters can be managed via:

* The [`clusters`](reference-pulsar-admin.md#clusters) command of the [`pulsar-admin`](reference-pulsar-admin.md) tool
* The `/admin/v2/clusters` endpoint of the admin {@inject: rest:REST:/} API
* The `clusters` method of the {@inject: javadoc:PulsarAdmin:/admin/org/apache/pulsar/client/admin/PulsarAdmin} object in the [Java API](client-libraries-java.md)

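The Java snippets on this page assume a `PulsarAdmin` client named `admin` has already been constructed. A minimal sketch, assuming the admin web service is reachable at `http://localhost:8080` and authentication is disabled:

```java
import org.apache.pulsar.client.admin.PulsarAdmin;

// Build an admin client pointed at the cluster's web service URL
// (the URL and the absence of authentication are assumptions for this sketch).
PulsarAdmin admin = PulsarAdmin.builder()
        .serviceHttpUrl("http://localhost:8080")
        .build();
```

In a secured deployment you would also configure authentication and TLS settings on this builder.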
## Clusters resources

### Provision

New clusters can be provisioned using the admin interface.

> Please note that this operation requires superuser privileges.

#### pulsar-admin
You can provision a new cluster using the [`create`](reference-pulsar-admin.md#clusters-create) subcommand. Here's an example:

```shell
$ pulsar-admin clusters create cluster-1 \
  --url http://my-cluster.org.com:8080 \
  --broker-url pulsar://my-cluster.org.com:6650
```

#### REST API

{@inject: endpoint|PUT|/admin/v2/clusters/:cluster|operation/createCluster}

#### Java

```java
ClusterData clusterData = new ClusterData(
    serviceUrl,
    serviceUrlTls,
    brokerServiceUrl,
    brokerServiceUrlTls
);
admin.clusters().createCluster(clusterName, clusterData);
```

### Initialize cluster metadata
When provisioning a new cluster, you need to initialize that cluster's [metadata](concepts-architecture-overview.md#metadata-store). When initializing cluster metadata, you need to specify all of the following:

* The name of the cluster
* The local ZooKeeper connection string for the cluster
* The configuration store connection string for the entire instance
* The web service URL for the cluster
* A broker service URL enabling interaction with the [brokers](reference-terminology.md#broker) in the cluster

You must initialize cluster metadata *before* starting up any [brokers](admin-api-brokers.md) that will belong to the cluster.

> #### No cluster metadata initialization through the REST API or the Java admin API
>
> Unlike most other admin functions in Pulsar, cluster metadata initialization cannot be performed via the admin REST API
> or the admin Java client, as metadata initialization involves communicating with ZooKeeper directly.
> Instead, you can use the [`pulsar`](reference-cli-tools.md#pulsar) CLI tool, in particular
> the [`initialize-cluster-metadata`](reference-cli-tools.md#pulsar-initialize-cluster-metadata) command.

Here's an example cluster metadata initialization command:
```shell
bin/pulsar initialize-cluster-metadata \
  --cluster us-west \
  --zookeeper zk1.us-west.example.com:2181 \
  --configuration-store zk1.us-west.example.com:2184 \
  --web-service-url http://pulsar.us-west.example.com:8080/ \
  --web-service-url-tls https://pulsar.us-west.example.com:8443/ \
  --broker-service-url pulsar://pulsar.us-west.example.com:6650/ \
  --broker-service-url-tls pulsar+ssl://pulsar.us-west.example.com:6651/
```

You'll need to use `--*-tls` flags only if you're using [TLS authentication](security-tls-authentication.md) in your instance.

### Get configuration

You can fetch the [configuration](reference-configuration.md) for an existing cluster at any time.

#### pulsar-admin

Use the [`get`](reference-pulsar-admin.md#clusters-get) subcommand and specify the name of the cluster. Here's an example:
```shell
$ pulsar-admin clusters get cluster-1
{
  "serviceUrl": "http://my-cluster.org.com:8080/",
  "serviceUrlTls": null,
  "brokerServiceUrl": "pulsar://my-cluster.org.com:6650/",
  "brokerServiceUrlTls": null,
  "peerClusterNames": null
}
```
#### REST API

{@inject: endpoint|GET|/admin/v2/clusters/:cluster|operation/getCluster}

#### Java

```java
admin.clusters().getCluster(clusterName);
```

### Update

You can update the configuration for an existing cluster at any time.

#### pulsar-admin

Use the [`update`](reference-pulsar-admin.md#clusters-update) subcommand and specify new configuration values using flags.

```shell
$ pulsar-admin clusters update cluster-1 \
  --url http://my-cluster.org.com:4081 \
  --broker-url pulsar://my-cluster.org.com:3350
```

#### REST API

{@inject: endpoint|POST|/admin/v2/clusters/:cluster|operation/updateCluster}

#### Java

```java
ClusterData clusterData = new ClusterData(
    serviceUrl,
    serviceUrlTls,
    brokerServiceUrl,
    brokerServiceUrlTls
);
admin.clusters().updateCluster(clusterName, clusterData);
```

### Delete

Clusters can be deleted from a Pulsar [instance](reference-terminology.md#instance).

#### pulsar-admin

Use the [`delete`](reference-pulsar-admin.md#clusters-delete) subcommand and specify the name of the cluster.
```shell
$ pulsar-admin clusters delete cluster-1
```
#### REST API

{@inject: endpoint|DELETE|/admin/v2/clusters/:cluster|operation/deleteCluster}

#### Java

```java
admin.clusters().deleteCluster(clusterName);
```

### List

You can fetch a list of all clusters in a Pulsar [instance](reference-terminology.md#instance).

#### pulsar-admin

Use the [`list`](reference-pulsar-admin.md#clusters-list) subcommand.

```shell
$ pulsar-admin clusters list
cluster-1
cluster-2
```

#### REST API

{@inject: endpoint|GET|/admin/v2/clusters|operation/getClusters}

#### Java

```java
admin.clusters().getClusters();
```

### Update peer-cluster data

Peer clusters can be configured for a given cluster in a Pulsar [instance](reference-terminology.md#instance).

#### pulsar-admin

Use the [`update-peer-clusters`](reference-pulsar-admin.md#clusters-update-peer-clusters) subcommand and specify the list of peer-cluster names.
```shell
$ pulsar-admin clusters update-peer-clusters cluster-1 --peer-clusters cluster-2
```
#### REST API

{@inject: endpoint|POST|/admin/v2/clusters/:cluster/peers|operation/setPeerClusterNames}

#### Java

```java
admin.clusters().updatePeerClusterNames(clusterName, peerClusterList);
```
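The `peerClusterList` above is simply the set of peer-cluster names. A minimal sketch of building it, assuming the admin API accepts a `LinkedHashSet` of names (check the `PulsarAdmin` Javadoc for the exact parameter type); the cluster names are illustrative:

```java
import java.util.LinkedHashSet;

// Peer-cluster names to associate with cluster-1.
LinkedHashSet<String> peerClusterList = new LinkedHashSet<>();
peerClusterList.add("cluster-2");

admin.clusters().updatePeerClusterNames("cluster-1", peerClusterList);
```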