
kafka-avro-codec

Avro encoder/decoder for use as serializer.class in Apache Kafka 0.8




Usage

Background: Avro and Strings

By default Avro uses CharSequence for fields defined as string in an Avro schema. Oftentimes this is not what you want -- see the discussions at AVRO-803 and Apache Avro: map uses CharSequence as key for details.

For the unit tests included in this project we explicitly configure Avro to use java.lang.String instead of CharSequence (or Avro's own Utf8 class), which provides us with the behavior you would expect when working with string-like objects in Java/Scala.

How you would configure Avro's "string behavior" depends on your project setup. In our case we are using sbt and sbt-avro to compile Avro schemas into Java code. Here, build.sbt is the place to configure Avro through sbt-avro's stringType setting. If you are not using sbt but, say, Maven then you would need to add <stringType>String</stringType> to the configuration of avro-maven-plugin in your pom.xml.
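As a sketch, the sbt-avro configuration in build.sbt might look like the following (the exact setting keys depend on the sbt-avro version you use, so treat this as an illustration rather than a drop-in snippet):

```scala
// build.sbt -- sketch only; setting names vary across sbt-avro versions

// Enable the sbt-avro plugin's settings
seq(sbtavro.SbtAvro.avroSettings: _*)

// Generate java.lang.String instead of CharSequence for Avro `string` fields
(stringType in avroConfig) := "String"
```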

Using the codec in Kafka

Assumptions

The codec assumes you are using Avro with code generation, i.e. you create an Avro schema (see twitter.avsc) and then compile the schema into Java code. "With code generation" is the scenario where you would use SpecificDatumWriter and SpecificDatumReader in Avro (which is what this codec does) instead of GenericDatumWriter and GenericDatumReader.

See the Avro documentation for details.
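To illustrate what "with code generation" means in practice, here is a sketch of an Avro round trip using the specific API with a generated class (we assume a `Tweet` class generated from twitter.avsc; this mirrors what the codec does internally):

```scala
import java.io.ByteArrayOutputStream
import org.apache.avro.io.{DecoderFactory, EncoderFactory}
import org.apache.avro.specific.{SpecificDatumReader, SpecificDatumWriter}
import com.miguno.avro.Tweet

val tweet = new Tweet() // populate the fields defined in twitter.avsc

// Serialize with SpecificDatumWriter (the encoder side of this codec)
val out = new ByteArrayOutputStream()
val encoder = EncoderFactory.get().binaryEncoder(out, null)
new SpecificDatumWriter[Tweet](Tweet.getClassSchema).write(tweet, encoder)
encoder.flush()

// Deserialize with SpecificDatumReader (the decoder side)
val decoder = DecoderFactory.get().binaryDecoder(out.toByteArray, null)
val roundTripped = new SpecificDatumReader[Tweet](Tweet.getClassSchema).read(null, decoder)
```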

Dependency management

Note: The build artifacts published in Clojars.org are built against Java 6.

When using sbt add the following lines to build.sbt:

resolvers ++= Seq(
  "clojars-repository" at "https://clojars.org/repo"
)

libraryDependencies ++= Seq(
  "com.miguno" %% "kafka-avro-codec" % "0.1.1"
)

When using gradle add the following lines to build.gradle:

repositories {
  mavenCentral()
  maven { url 'https://clojars.org/repo' }
}

dependencies {
  compile 'com.miguno:kafka-avro-codec_2.10:0.1.1'
}

Implementing a custom codec for your data records

Let us assume you have defined an Avro schema for Twitter tweets, and you want to use this schema for the data messages that you send to a Kafka topic (see twitter.avsc). To implement an accompanying Avro encoder/decoder pair for Kafka you would need to extend the two base classes AvroEncoder and AvroDecoder:

Example:

package your.app

import kafka.utils.VerifiableProperties
import com.miguno.kafka.avro.{AvroDecoder, AvroEncoder}
import com.miguno.avro.Tweet // <-- This is your Avro record, see twitter.avsc

class TweetAvroEncoder(props: VerifiableProperties = null)
  extends AvroEncoder[Tweet](props, Tweet.getClassSchema)

class TweetAvroDecoder(props: VerifiableProperties = null)
  extends AvroDecoder[Tweet](props, Tweet.getClassSchema)

Using the custom codec in a Kafka producer

TODO
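In the meantime, here is a minimal sketch against Kafka 0.8's Scala producer API. The broker list and topic name are placeholders, and `your.app.TweetAvroEncoder` refers to the encoder class defined above:

```scala
import java.util.Properties
import kafka.producer.{KeyedMessage, Producer, ProducerConfig}
import com.miguno.avro.Tweet

val props = new Properties()
props.put("metadata.broker.list", "127.0.0.1:9092") // placeholder broker address
props.put("serializer.class", "your.app.TweetAvroEncoder")

val producer = new Producer[AnyRef, Tweet](new ProducerConfig(props))

val tweet = new Tweet() // populate the fields defined in twitter.avsc
producer.send(new KeyedMessage[AnyRef, Tweet]("zerg.hydra", tweet))
producer.close()
```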

Using the custom codec in a Kafka consumer

val topic = "zerg.hydra" // name of the Kafka topic you are reading from
val numThreads = 3       // number of threads reading from that topic (the topic should have >= 3 partitions in this example)
val topicCountMap = Map(topic -> numThreads)
val valueDecoder = new TweetAvroDecoder()
val keyDecoder = valueDecoder // or use `null` if you explicitly do not want to use keys (note that in Kafka 0.8
                              // this means some topic partitions may never see data)
val consumerMap = consumerConnector.createMessageStreams(topicCountMap, keyDecoder, valueDecoder)

// Then read Tweet records from the resulting streams, typically one thread per stream:
for (stream <- consumerMap(topic)) {
  for (messageAndMetadata <- stream) {
    val tweet: Tweet = messageAndMetadata.message()
    // process the tweet
  }
}

Development

Build requirements

  • Scala 2.10.5
  • sbt 0.13.8
  • Java JDK 6 or 7
  • Avro 1.7.6

Building the code

$ ./sbt clean compile

Running tests

$ ./sbt clean test

Packaging the code

$ ./sbt clean package

Publishing build artifacts

Configure Clojars.org user credentials

Create a file ~/.clojars-credentials.txt with the following contents:

realm=clojars
host=clojars.org
user=YOUR_CLOJARS_USERNAME
password=YOUR_CLOJARS_PASSWORD

For security reasons it is also recommended to restrict access to this file via `chmod 600 ~/.clojars-credentials.txt`.

Publish to Clojars.org

TODO: Explain versioning etc.

Publish the build artifacts via:

$ ./sbt clean test publish

Contributing to kafka-avro-codec

Code contributions, bug reports, feature requests etc. are all welcome.

If you are new to GitHub please read Contributing to a project for how to send patches and pull requests to kafka-avro-codec.

License

Copyright © 2014 Michael G. Noll

See LICENSE for licensing information.
