---
title: 'Quickstart: Ingest data from Kafka into Azure Data Explorer'
description: In this quickstart, you learn how to ingest (load) data into Azure Data Explorer from Kafka.
services: data-explorer
author: orspod
ms.author: v-orspod
ms.reviewer: mblythe
ms.service: data-explorer
ms.topic: quickstart
ms.date: 11/19/2018
---
Azure Data Explorer is a fast and highly scalable data exploration service for log and telemetry data. Azure Data Explorer offers ingestion (data loading) from Kafka. Kafka is a distributed streaming platform that lets you build real-time streaming data pipelines that reliably move data between systems or applications.
- If you don't have an Azure subscription, create a free Azure account before you begin.
- A sample app that generates data and sends it to Kafka.
- Visual Studio 2017 version 15.3.2 or later to run the sample app.
Kafka Connect is a tool for scalably and reliably streaming data between Apache Kafka and other systems. It makes it simple to quickly define connectors that move large collections of data into and out of Kafka. The ADX Kafka Sink serves as the connector from Kafka.
Kafka can load a .jar file as a plugin that acts as a custom connector. To produce such a .jar file, clone the code locally and build it using Maven:

```bash
git clone git://github.com/Azure/kafka-sink-azure-kusto.git
cd ./kafka-sink-azure-kusto/kafka/
```
Build locally with Maven to produce a .jar file complete with dependencies. Inside the root directory kafka-sink-azure-kusto, run:

```bash
mvn clean compile assembly:single
```
Load the plugin into Kafka. A deployment example using Docker can be found at kafka-sink-azure-kusto. Detailed documentation on Kafka connectors and how to deploy them can be found at Kafka Connect.
Configure the connector with settings like the following, updating the placeholder values for your cluster:

```properties
name=KustoSinkConnector
connector.class=com.microsoft.azure.kusto.kafka.connect.sink.KustoSinkConnector
kusto.sink.flush_interval_ms=300000
key.converter=org.apache.kafka.connect.storage.StringConverter
value.converter=org.apache.kafka.connect.storage.StringConverter
tasks.max=1
topics=testing1
kusto.tables.topics_mapping=[{'topic': 'testing1','db': 'daniel', 'table': 'TestTable','format': 'json', 'mapping':'TestMapping'}]
kusto.auth.authority=XXX
kusto.url=https://ingest-{mycluster}.kusto.windows.net/
kusto.auth.appid=XXX
kusto.auth.appkey=XXX
kusto.sink.tempdir=/var/tmp/
kusto.sink.flush_size=1000
```
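The `kusto.tables.topics_mapping` value routes each Kafka topic to a target database, table, format, and ingestion mapping. As a minimal illustration (not part of the connector itself), the single-quoted form above can be parsed with Python's `ast.literal_eval`:

```python
import ast

# The single-quoted topics mapping string from the connector config above.
topics_mapping = ("[{'topic': 'testing1','db': 'daniel', 'table': 'TestTable',"
                  "'format': 'json', 'mapping':'TestMapping'}]")

# ast.literal_eval accepts Python-style single-quoted literals,
# which json.loads would reject.
routes = ast.literal_eval(topics_mapping)

for route in routes:
    print(f"topic {route['topic']} -> {route['db']}.{route['table']} "
          f"({route['format']}, mapping {route['mapping']})")
# -> topic testing1 -> daniel.TestTable (json, mapping TestMapping)
```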
Create a table in ADX to which Kafka can send data. Create the table in the cluster and database provisioned in the Prerequisites.
1. In the Azure portal, navigate to your cluster and select **Query**.

1. Copy the following command into the window and select **Run**.

    ```Kusto
    .create table TestTable (TimeStamp: datetime, Name: string, Metric: int, Source:string)
    ```

1. Copy the following command into the window and select **Run**.

    ```Kusto
    .create table TestTable ingestion json mapping 'TestMapping' '[{"column":"TimeStamp","path":"$.timeStamp","datatype":"datetime"},{"column":"Name","path":"$.name","datatype":"string"},{"column":"Metric","path":"$.metric","datatype":"int"},{"column":"Source","path":"$.source","datatype":"string"}]'
    ```

    This command maps incoming JSON data to the column names and data types of the table (TestTable).
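To see how that mapping behaves, the following sketch (illustration only, not connector code, with a made-up sample message) applies the same column-to-path rules to a JSON record:

```python
import json

# The ingestion mapping from the .create command above.
mapping = json.loads(
    '[{"column":"TimeStamp","path":"$.timeStamp","datatype":"datetime"},'
    '{"column":"Name","path":"$.name","datatype":"string"},'
    '{"column":"Metric","path":"$.metric","datatype":"int"},'
    '{"column":"Source","path":"$.source","datatype":"string"}]')

# A hypothetical JSON message; field values are invented for illustration.
message = {"timeStamp": "2018-11-19T12:00:00Z", "name": "demo",
           "metric": 104, "source": "SampleApp"}

# Each '$.field' path addresses a top-level property; strip the '$.' prefix.
row = {m["column"]: message[m["path"][2:]] for m in mapping}
print(row)
```

Each message field lands in the table column named by its mapping entry, so the record above becomes one row of TestTable.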
Now that the Kafka cluster is connected to Azure Data Explorer, use the sample app you downloaded to generate data.

Clone the sample app locally:

```bash
git clone git://github.com/Azure/azure-kusto-samples-dotnet.git
cd ./azure-kusto-samples-dotnet/kafka/
```
1. Open the sample app solution in Visual Studio.

1. In the `Program.cs` file, update the `connectionString` constant to your Kafka connection string.

    ```csharp
    const string connectionString = @"<YourConnectionString>";
    ```

1. Build and run the app. The app sends messages to the Kafka cluster, and it prints out its status every ten seconds.

1. After the app has sent a few messages, move on to the next step.
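If you want to see the shape of the data without running the full solution, here is a hedged sketch of a message builder. The field names follow the TestMapping paths above; the generator logic itself is an assumption for illustration, not the sample app's actual code:

```python
import json
import random
from datetime import datetime, timezone

def build_message() -> str:
    """Build one JSON record whose fields match the TestMapping paths."""
    record = {
        "timeStamp": datetime.now(timezone.utc).isoformat(),
        "name": "demo",                       # hypothetical name value
        "metric": random.randint(0, 100),     # hypothetical metric value
        "source": "SampleApp",                # hypothetical source label
    }
    return json.dumps(record)

# Each call produces one message body ready to be sent to the Kafka topic.
print(build_message())
```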
1. To make sure no errors occurred during ingestion:

    ```Kusto
    .show ingestion failures
    ```

1. To see the newly ingested data:

    ```Kusto
    TestTable | count
    ```

1. To see the content of the messages:

    ```Kusto
    TestTable
    ```
The result set should look like the following:
> [!div class="nextstepaction"]
> Quickstart: Query data in Azure Data Explorer