Multi-armed bandit using Thompson Sampling, implemented on Kafka Streams.
A multi-armed bandit is a simple reinforcement learning algorithm: a player plays on multiple slot machines and optimizes the time spent on the machines with the best reward.
Thompson Sampling uses the Beta distribution, which has two shape parameters, α and β, kept as per-arm weights. When an arm is pulled, its β is increased; when it yields a reward, its α is increased. A score is then computed for each arm by drawing a sample from the Beta distribution with that arm's weights. Higher α and lower β lead to higher scores. The player always chooses the arm with the best score.
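For illustration, a minimal sketch of the scoring step in Scala, assuming Apache Commons Math for the Beta distribution (the `Arm` case class is hypothetical, not the repo's actual model; the weights are taken from the example output below):

```scala
import org.apache.commons.math3.distribution.BetaDistribution

// Hypothetical arm representation for this sketch.
case class Arm(label: String, alpha: Double, beta: Double)

// The score is a single sample from Beta(alpha, beta); its mean is
// alpha / (alpha + beta), so higher alpha and lower beta score higher.
def score(arm: Arm): Double =
  new BetaDistribution(arm.alpha, arm.beta).sample()

val arms = Seq(Arm("red", 6, 25), Arm("green", 61, 40), Arm("blue", 11, 1))

// The player always chooses the arm with the best sampled score.
val chosen = arms.maxBy(score)
```

Because the score is sampled rather than taken as the distribution's mean, arms with few observations still get explored occasionally.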
https://towardsdatascience.com/how-not-to-sort-by-popularity-92745397a7ae
Suppose a website wants to optimize which color to use for an action button. It asks a REST service, which has access to an aggregated dataset of past choices, which color to show at any given moment.
Afterwards, the following procedure is applied (a topology sketch follows the list):
- Send a message via a Kafka queue recording which color was shown to the user.
- Observe the reward: the user clicking the button.
- Send a reward message with the chosen color.
- A Kafka Streams job receives the draw and reward messages.
- Messages are mapped to arm choices with adjusted alpha/beta weights (β increases on a draw, α on a reward).
- The arms' alpha and beta weights are reduced to a scores dataset by sampling from the Beta distribution.
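A minimal sketch of such a topology, assuming the kafka-streams-scala DSL, hypothetical topic names `bandit-draws` and `bandit-rewards`, and records keyed by arm label (none of these names are taken from the repo):

```scala
import org.apache.commons.math3.distribution.BetaDistribution
import org.apache.kafka.streams.scala.StreamsBuilder
import org.apache.kafka.streams.scala.ImplicitConversions._
import org.apache.kafka.streams.scala.serialization.Serdes._

val builder = new StreamsBuilder()

// Each draw message increments the arm's beta weight.
val betas = builder.stream[String, String]("bandit-draws")
  .groupByKey
  .count()

// Each reward message increments the arm's alpha weight.
val alphas = builder.stream[String, String]("bandit-rewards")
  .groupByKey
  .count()

// Reduce the per-arm weights to a score by sampling from the Beta
// distribution. Weights are offset by 1 so the distribution is well
// defined before any events arrive (Beta(1, 1) is uniform).
val scores = alphas.join(betas) { (alpha, beta) =>
  new BetaDistribution(alpha + 1.0, beta + 1.0).sample()
}
```

The table-table join emits an updated score whenever either count changes, so the scores dataset tracks the live draw and reward streams.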
Run:

```sh
docker-compose up
```
In a separate terminal, produce sample events and query the current scores:

```sh
docker-compose exec bandits-server sbt produceEvents
curl localhost:8080/bandits/colors | jq .
```
```json
{
  "issue": "colors",
  "arms": [
    {
      "issue": "colors",
      "label": "red",
      "alpha": 6,
      "beta": 25,
      "score": 0.09209356761521977
    },
    {
      "issue": "colors",
      "label": "green",
      "alpha": 61,
      "beta": 40,
      "score": 0.5210357518035633
    },
    {
      "issue": "colors",
      "label": "blue",
      "alpha": 11,
      "beta": 1,
      "score": 0.9719363670994956
    }
  ]
}
```