
Streaming Movie Ratings on Kubernetes

This document explains how to build the «Streaming Movies Demo» app Docker image and run it on Kubernetes.

Build a Docker image and publish it to a Docker registry

The build configuration (see ../build.gradle) supports:

  1. an Artifactory Docker registry

  2. a GCP Docker registry

  3. credHelper for GCP

Build an image and push it to the Docker registry: ./gradlew build jib

You can also build to a local tar file instead: ./gradlew build jibBuildTar

For a local test via Docker:

  docker pull ${DOCKER_REGISTRY_URL}/streaming-movie-ratings:latest

  docker run -ti -e "JAVA_TOOL_OPTIONS=-DLOGLEVEL=INFO" --rm ${DOCKER_REGISTRY_URL}/streaming-movie-ratings:latest
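The docker commands above assume DOCKER_REGISTRY_URL is exported in your shell. A minimal sketch of setting it up (the registry host and project name here are hypothetical placeholders, not from this repo):

```shell
# Hypothetical registry host/project; substitute your own.
DOCKER_REGISTRY_URL="us.gcr.io/my-project"

# This is the full image reference the pull/run commands above expand to.
IMAGE="${DOCKER_REGISTRY_URL}/streaming-movie-ratings:latest"
echo "${IMAGE}"
```

For a GCP registry, `gcloud auth configure-docker` is the usual way to wire up the Docker credential helper before pulling or pushing.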

Demo Playbook

Load initial data
cat ./data/movies.dat | ccloud produce -t raw-movies
cat ./data/ratings.dat | ccloud produce -t raw-ratings
Consume results from the Avro topic
confluent consume rated-movies --cloud --value-format avro --property schema.registry.url=https://sr.confluent.cloud --property basic.auth.credentials.source=USER_INFO --property schema.registry.basic.auth.user.info=<user_key> --from-beginning

or use CCloud UI (TBD add screenshot)
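The commands above read Confluent Cloud connection settings from a local config file. A minimal sketch of what such a properties file might contain, with placeholder credentials (check your own Confluent Cloud cluster settings for the real values):

```properties
# Placeholder values - substitute your cluster endpoint and API credentials.
bootstrap.servers=<your-bootstrap-server>:9092
security.protocol=SASL_SSL
sasl.mechanism=PLAIN
ssl.endpoint.identification.algorithm=https
sasl.jaas.config=org.apache.kafka.common.security.plain.PlainLoginModule required username="<api-key>" password="<api-secret>";
```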

Generate Test Load

  • Start raw rating generator

    ./gradlew loader:streamWithRawRatingStreamer -PconfigPath=$HOME/.ccloud/config
    Note
I recommend running the raw rating generator in a separate terminal window so you can interrupt it with Ctrl+C.

TODO

  • ❏ Create standalone generator app + docker image

Troubleshooting

To start over

Reset Kafka Streams state
kafka-streams-application-reset --application-id kafka-films --bootstrap-servers your.bootstrap.server:9092 --config-file ~/.ccloud/config --input-topics raw-movies,raw-ratings

Running a Kubernetes cluster on GKE

Create GKE cluster
gcloud container clusters create kafka-streams-cluster \
  --num-nodes 2 \
  --machine-type n1-standard-1 \
  --zone us-east1-c
Deploy the container (stateless) - one instance, as specified in the deployment:
kubectl create -f movie-ratings-deployment.yaml

You can open another terminal window to watch the logs:

kubectl logs `kubectl get pods -l app=streaming-movie-ratings -o=name` -f
You can watch the output:
confluent consume rated-movies --cloud --value-format avro --property schema.registry.url=https://sr.confluent.cloud --property basic.auth.credentials.source=USER_INFO --property schema.registry.basic.auth.user.info=your_sr_api_key --from-beginning
Scale up to 3 instances
kubectl scale deployment streaming-movie-ratings --replicas=3 #(1)
  1. Having more than 3 instances is pointless, since our topic only has 3 partitions

Since we configured only 2 nodes and our deployment has anti-affinity rules, only 2 of the 3 instances will be scheduled (one on each node); the third will stay pending.
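The one-pod-per-node behavior comes from a podAntiAffinity rule in the deployment spec. A sketch of what such a stanza looks like (the field values here are illustrative, check movie-ratings-deployment.yaml for the real one):

```yaml
# Illustrative sketch of a pod anti-affinity rule, not the repo's actual manifest.
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchLabels:
              app: streaming-movie-ratings
          topologyKey: "kubernetes.io/hostname"   # at most one pod per node
```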

Note
If you wish to add more nodes to the Kubernetes cluster's node pool, you can do so with the command gcloud container clusters resize kafka-streams-cluster --size=3 --zone us-east1-c

You can see that by running:

kubectl get pods

Watch the logs for the rebalance and the output to see that the job just keeps running!

And scale back down
kubectl scale deployment streaming-movie-ratings --replicas=1
Finally, you can just delete the whole deployment:
kubectl delete -f movie-ratings-deployment.yaml

To run with a StatefulSet

This example has very little state, so if a pod restarts and its local state is lost and has to be re-created, it's no big deal. But if you have large state, you'll want to preserve it between restarts.

Note
I configured shared storage, but I didn't worry about stable network identity, since this example doesn't include interactive queries.
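A StatefulSet preserves state by giving each pod its own stable PersistentVolumeClaim, which survives pod restarts. A sketch of the relevant stanza (the names and size here are illustrative, not from the repo's manifest):

```yaml
# Illustrative volumeClaimTemplates stanza for a Kafka Streams StatefulSet.
volumeClaimTemplates:
  - metadata:
      name: streams-state          # would be mounted at the Kafka Streams state.dir
    spec:
      accessModes: ["ReadWriteOnce"]
      resources:
        requests:
          storage: 1Gi             # small state, small volume
```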
  1. Start the stateful set: kubectl create -f kafka-streams-stockstats-stateful.yaml

  2. Watch the pods getting created, and note how each one has a stable identity: kubectl get pods -w -l app=streams-stock-stats

  3. Delete a pod and watch it restart with its old state: kubectl delete pods streams-stock-stats-1

  4. And finally, we can get rid of the entire set. Note that the storage will remain: kubectl delete -f kafka-streams-stockstats-stateful.yaml