[docs] Add how to SerDe for Java function (apache#8278)
* add info for schema in Java function

* fix typos

* address comments
Jennifer88huang-zz authored Oct 19, 2020
1 parent 5df23b5 commit 96804b7
Showing 1 changed file with 9 additions and 11 deletions.
20 changes: 9 additions & 11 deletions site2/docs/functions-develop.md
@@ -4,7 +4,7 @@ title: Develop Pulsar Functions
sidebar_label: "How-to: Develop"
---

-This tutorial walks you through how to develop Pulsar Functions.
+You learn how to develop Pulsar Functions with different APIs for Java, Python and Go.

## Available APIs
In Java and Python, you have two options to write Pulsar Functions. In Go, you can use Pulsar Functions SDK for Go.
@@ -104,22 +104,14 @@ For complete code, see [here](https://github.com/apache/pulsar/blob/master/pulsa
<!--END_DOCUSAURUS_CODE_TABS-->

## Schema registry
-Pulsar has a built in schema registry and comes bundled with a variety of popular schema types(avro, json and protobuf). Pulsar Functions can leverage existing schema information from input topics and derive the input type. The schema registry applies for output topic as well.
+Pulsar has a built-in schema registry and is bundled with popular schema types, such as Avro, JSON and Protobuf. Pulsar Functions can leverage the existing schema information from input topics and derive the input type. The schema registry applies to output topics as well.

## SerDe
SerDe stands for **Ser**ialization and **De**serialization. Pulsar Functions uses SerDe when publishing data to and consuming data from Pulsar topics. How SerDe works by default depends on the language you use for a particular function.

<!--DOCUSAURUS_CODE_TABS-->
<!--Java-->
-When you write Pulsar Functions in Java, the following basic Java types are built in and supported by default:
-
-* `String`
-* `Double`
-* `Integer`
-* `Float`
-* `Long`
-* `Short`
-* `Byte`
+When you write Pulsar Functions in Java, the following basic Java types are built in and supported by default: `String`, `Double`, `Integer`, `Float`, `Long`, `Short`, and `Byte`.

To customize Java types, you need to implement the following interface.

@@ -129,6 +121,12 @@ public interface SerDe<T> {
byte[] serialize(T input);
}
```
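As an illustration of implementing this interface, here is a minimal sketch of a custom SerDe. Only `serialize` is visible in this hunk, so the sketch assumes the interface also declares `T deserialize(byte[] input)`. The `Tweet` type, the `TweetSerDe` class, and the `username|content` wire format are hypothetical, and the local `SerDe` interface is a stand-in so the example compiles without the Pulsar dependency.

```java
import java.nio.charset.StandardCharsets;

// Local stand-in mirroring the SerDe interface discussed above (assumption:
// it declares both deserialize and serialize). In a real function, implement
// the interface shipped with the Pulsar Functions API instead.
interface SerDe<T> {
    T deserialize(byte[] input);
    byte[] serialize(T input);
}

// Hypothetical payload type, used only for illustration.
final class Tweet {
    final String username;
    final String content;

    Tweet(String username, String content) {
        this.username = username;
        this.content = content;
    }
}

// Encodes a Tweet as "username|content" in UTF-8.
// Assumes the username never contains '|'; the content may.
final class TweetSerDe implements SerDe<Tweet> {
    @Override
    public Tweet deserialize(byte[] input) {
        String raw = new String(input, StandardCharsets.UTF_8);
        // Limit of 2 keeps any further '|' characters inside the content.
        String[] parts = raw.split("\\|", 2);
        if (parts.length != 2) {
            throw new IllegalArgumentException("Expected 'username|content'");
        }
        return new Tweet(parts[0], parts[1]);
    }

    @Override
    public byte[] serialize(Tweet input) {
        return (input.username + "|" + input.content).getBytes(StandardCharsets.UTF_8);
    }
}
```

Calling `serialize` and then `deserialize` on the same `TweetSerDe` instance should return a `Tweet` with the same fields. A delimiter-based format is chosen here only for brevity; a production SerDe would more likely rely on a schema-aware encoding.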
+SerDe works in the following ways in Java Functions.
+- If the input and output topics have schemas, Pulsar Functions use those schemas for SerDe.
+- If the input or output topics do not exist, Pulsar Functions adopt the following rules to determine SerDe:
+  - If the schema type is specified, Pulsar Functions use the specified schema type.
+  - If SerDe is specified, Pulsar Functions use the specified SerDe, and the schema type for input and output topics is `Byte`.
+  - If neither the schema type nor SerDe is specified, Pulsar Functions use the built-in SerDe. For non-primitive schema types, the built-in SerDe serializes and deserializes objects in the `JSON` format.
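The rules above can be read as a simple decision procedure. The following sketch models it in plain Java; it is not Pulsar's internal code. The names `SerDeChoice` and `SerDeResolver` are hypothetical. It also makes two assumptions the text does not state: that the listed order is the precedence order when both a schema type and a SerDe are specified, and that a topic without a schema is handled like a topic that does not exist.

```java
// Hypothetical names throughout: an illustration of the rules, not Pulsar's API.
enum SerDeChoice {
    TOPIC_SCHEMA,          // the topic already has a schema
    SPECIFIED_SCHEMA_TYPE, // the user specified a schema type
    SPECIFIED_SERDE,       // the user specified a SerDe class
    BUILT_IN_SERDE         // fall back to the built-in SerDe
}

final class SerDeResolver {
    private SerDeResolver() {}

    // Returns which mechanism handles serialization for one topic.
    // schemaType and serdeClassName may be null or empty when not specified.
    static SerDeChoice resolve(boolean topicHasSchema, String schemaType, String serdeClassName) {
        if (topicHasSchema) {
            return SerDeChoice.TOPIC_SCHEMA;
        }
        // Assumption: a specified schema type takes precedence over a SerDe.
        if (isSet(schemaType)) {
            return SerDeChoice.SPECIFIED_SCHEMA_TYPE;
        }
        if (isSet(serdeClassName)) {
            return SerDeChoice.SPECIFIED_SERDE;
        }
        return SerDeChoice.BUILT_IN_SERDE;
    }

    private static boolean isSet(String value) {
        return value != null && !value.isEmpty();
    }
}
```

For example, `SerDeResolver.resolve(false, null, "com.example.TweetSerDe")` yields `SPECIFIED_SERDE`, where the class name is a placeholder.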

<!--Python-->
In Python, the default SerDe is identity, meaning that the type is serialized as whatever type the producer function returns.