
Commit

schema version (apache#7417)
Motivation
Pulsar 2.4.0 added schema versioning to support producing and consuming multi-version messages (apache#3876, apache#3670, apache#4211, apache#4325, apache#4548), but the doc was not updated accordingly.

Modifications
Update the schema version in the Pulsar registry doc for releases 2.4.0/2.4.1/2.4.2.
Huanli-Meng authored Jul 1, 2020
1 parent 842fe06 commit 3c422f0
Showing 3 changed files with 41 additions and 1 deletion.
@@ -39,6 +39,14 @@ Pulsar schemas are fairly simple data structures that consist of:

## Schema versions

Each schema stored with a topic has a version. Schema versions track the schema changes that happen within a topic.

Messages produced with a given schema are tagged with a schema version. Therefore, when a message is consumed by a Pulsar client, the client can use the schema version to retrieve the corresponding schema and deserialize the data.

Schemas are versioned in succession. The schema is stored in the broker that handles the associated topic, so that version assignments can be made.

Once a version is assigned to or fetched for a schema, all subsequent messages produced by that producer are tagged with the appropriate version.
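
The assign-or-fetch behavior described above can be sketched with a toy, in-memory model. This is not the Pulsar API: the `ToySchemaRegistry` class, its `register` and `lookup` methods, the string schema definitions, and the choice to start version numbers at 0 are all hypothetical, and the broker's compatibility check is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy, in-memory model of per-topic schema versioning; not the Pulsar API. */
public class ToySchemaRegistry {
    // topic -> schema definitions, where the list index is the version number
    private final Map<String, List<String>> schemasByTopic = new HashMap<>();

    /** Returns the version of an already stored schema, or assigns the next one. */
    public long register(String topic, String schemaDefinition) {
        List<String> versions =
                schemasByTopic.computeIfAbsent(topic, t -> new ArrayList<>());
        int existing = versions.indexOf(schemaDefinition);
        if (existing >= 0) {
            return existing; // schema already stored: fetch its version
        }
        versions.add(schemaDefinition); // new schema: assign the next version in succession
        return versions.size() - 1;
    }

    /** Resolves the schema a message was tagged with, as a consumer would. */
    public String lookup(String topic, long version) {
        // Sketch only: assumes the topic and version were registered earlier.
        return schemasByTopic.get(topic).get((int) version);
    }

    public static void main(String[] args) {
        ToySchemaRegistry registry = new ToySchemaRegistry();
        long first = registry.register("my-topic", "{name}");
        long second = registry.register("my-topic", "{name, age}");
        long again = registry.register("my-topic", "{name}");
        System.out.println(first + " " + second + " " + again); // prints "0 1 0"
        System.out.println(registry.lookup("my-topic", second)); // prints "{name, age}"
    }
}
```

In this sketch, a producer that reconnects with a schema that is already stored gets the existing version back, while a new schema receives the next number; a consumer only needs the version tag on a message to find the schema it was written with.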

In order to illustrate how schema versioning works, let's walk through an example. Imagine that the Pulsar [Java client](client-libraries-java.md) created using the code below attempts to connect to Pulsar and begin sending messages:

```java
@@ -62,6 +70,12 @@ A schema already exists; the producer connects using a new schema that is compat

> Schemas are versioned in succession. Schema storage happens in the broker that handles the associated topic so that version assignments can be made. Once a version is assigned to or fetched for a schema, all subsequent messages produced by that producer are tagged with the appropriate version.

If you do not know the schema type of a Pulsar topic in advance, you can use the AUTO schema to produce or consume generic records to or from brokers.

- `AUTO_PRODUCE` schema helps a producer validate whether the bytes it sends are compatible with the schema of a topic.
- `AUTO_CONSUME` schema helps validate whether the bytes sent by a Pulsar topic are compatible with a consumer; that is, messages are deserialized into language-specific objects using the schema retrieved from the broker side.

In `AUTO_CONSUME` mode, you can set the `useProvidedSchemaAsReaderSchema` flag to `false` so that messages are decoded based on the schema associated with the messages.

## Supported schema formats

@@ -39,6 +39,14 @@ Pulsar schemas are fairly simple data structures that consist of:

## Schema versions

Each schema stored with a topic has a version. Schema versions track the schema changes that happen within a topic.

Messages produced with a given schema are tagged with a schema version. Therefore, when a message is consumed by a Pulsar client, the client can use the schema version to retrieve the corresponding schema and deserialize the data.

Schemas are versioned in succession. The schema is stored in the broker that handles the associated topic, so that version assignments can be made.

Once a version is assigned to or fetched for a schema, all subsequent messages produced by that producer are tagged with the appropriate version.

In order to illustrate how schema versioning works, let's walk through an example. Imagine that the Pulsar [Java client](client-libraries-java.md) created using the code below attempts to connect to Pulsar and begin sending messages:

```java
@@ -62,6 +70,12 @@ A schema already exists; the producer connects using a new schema that is compat

> Schemas are versioned in succession. Schema storage happens in the broker that handles the associated topic so that version assignments can be made. Once a version is assigned to or fetched for a schema, all subsequent messages produced by that producer are tagged with the appropriate version.

If you do not know the schema type of a Pulsar topic in advance, you can use the AUTO schema to produce or consume generic records to or from brokers.

- `AUTO_PRODUCE` schema helps a producer validate whether the bytes it sends are compatible with the schema of a topic.
- `AUTO_CONSUME` schema helps validate whether the bytes sent by a Pulsar topic are compatible with a consumer; that is, messages are deserialized into language-specific objects using the schema retrieved from the broker side.

In `AUTO_CONSUME` mode, you can set the `useProvidedSchemaAsReaderSchema` flag to `false` so that messages are decoded based on the schema associated with the messages.

## Supported schema formats

@@ -39,6 +39,14 @@ Pulsar schemas are fairly simple data structures that consist of:

## Schema versions

Each schema stored with a topic has a version. Schema versions track the schema changes that happen within a topic.

Messages produced with a given schema are tagged with a schema version. Therefore, when a message is consumed by a Pulsar client, the client can use the schema version to retrieve the corresponding schema and deserialize the data.

Schemas are versioned in succession. The schema is stored in the broker that handles the associated topic, so that version assignments can be made.

Once a version is assigned to or fetched for a schema, all subsequent messages produced by that producer are tagged with the appropriate version.

In order to illustrate how schema versioning works, let's walk through an example. Imagine that the Pulsar [Java client](client-libraries-java.md) created using the code below attempts to connect to Pulsar and begin sending messages:

```java
@@ -60,8 +68,12 @@ No schema exists for the topic | The producer is created using the given schema.
A schema already exists; the producer connects using the same schema that's already stored | The schema is transmitted to the Pulsar broker. The broker determines that the schema is compatible. The broker attempts to store the schema in [BookKeeper](concepts-architecture-overview.md#persistent-storage) but then determines that it's already stored, so it's then used to tag produced messages.
A schema already exists; the producer connects using a new schema that is compatible | The producer transmits the schema to the broker. The broker determines that the schema is compatible and stores the new schema as the current version (with a new version number).

> Schemas are versioned in succession. Schema storage happens in the broker that handles the associated topic so that version assignments can be made. Once a version is assigned to or fetched for a schema, all subsequent messages produced by that producer are tagged with the appropriate version.

If you do not know the schema type of a Pulsar topic in advance, you can use the AUTO schema to produce or consume generic records to or from brokers.

- `AUTO_PRODUCE` schema helps a producer validate whether the bytes it sends are compatible with the schema of a topic.
- `AUTO_CONSUME` schema helps validate whether the bytes sent by a Pulsar topic are compatible with a consumer; that is, messages are deserialized into language-specific objects using the schema retrieved from the broker side.

In `AUTO_CONSUME` mode, you can set the `useProvidedSchemaAsReaderSchema` flag to `false` so that messages are decoded based on the schema associated with the messages.

## Supported schema formats
