Skip to content

piddubnyi/kafka-nobleprog

Repository files navigation

Igor Piddubnyi github.com/piddubnyi [email protected]

Overview of Distributed Messaging Systems

Alternatives to Kafka

  • ActiveMQ / RabbitMQ (no real data distribution)
  • zmq (no disk persistence)
  • nats (no disk persistence)
  • apache pulsar

https://github.com/tszmytka/benchmark-mq

Use Cases

  • Messaging
  • Website Activity Tracking
  • Metrics
  • Log Aggregation
  • Stream Processing
  • Event Sourcing

Deployment and startup

  • native
  • docker
  • k8s

#Kafka API

./bin/kafka-topics.sh --bootstrap-server 127.0.0.1:9092 --describe *

./kafka-console-consumer.sh --bootstrap-server 127.0.0.1:9092 --topic my-topic

  • Simple Producer
  • Producer API
  • Simple Consumer
  • Consumer API

#Configuration

Broker

  • broker.id
  • log.dirs
  • zookeeper.connect
  • advertised.listeners
  • advertised.port
  • log.retention.bytes
  • log.retention.hours

Topic

  • compression.type

Consumer and Producer Configuration

  • bootstrap.servers
  • (Producer) acks
  • compression.type
  • (Consumer) group.id group.instance.id

#Design

Persistence

  • All data is immediately written to a persistent log on the filesystem without necessarily flushing to disk.
  • In effect this just means that it is transferred into the kernel's pagecache.

Efficiency

  • Data buffering (avoid small IO and frequent flushes)
  • Reduce sendfile and using the zero-copy
  • !Batch Compression!

The Producer

  • key is used for partitioning
  • async sending + batches
  • no producer persistence

The Consumer

  • pull mode (+blocking)
  • event mode
  • consumer position (offset)
  • Static Membership

Message Delivery Semantics

At most once — Messages may be lost but are never redelivered.
    Consumer read the messages, then save its position in the log, and finally process the messages
    Consumer process crashes after saving its position but before saving the output
At least once — Messages are never lost but may be redelivered.
    Consumer read the messages, process the messages, and finally save its position
    Crashes after processing messages but before saving its position
Exactly once — each message is delivered once and only once.
    The consumer's position is stored as a message in a topic, 
    Write the offset to Kafka in the same transaction as the output topics receiving the processed data. 
    If the transaction is aborted, the consumer's position will revert
  • Как копровать данные из одного топика на другой на постоянной основе?

  • Как архивировать данные из одого топика в такой же, но архивный на постоянной основе

  • Как создать аргегированный топик из нескольких?

  • Авторризация в разрезе топиков https://kafka.apache.org/documentation/#security_authz

  • имплементация ksql и для стримов

  • Replication

  • Implementation

  • API Design

  • Network Layer

  • Messages

  • Message format

  • Log

  • Distribution

Basic Kafka Operations

  • Adding and removing topics
  • Modifying topics
  • Graceful shutdown
  • Balancing leadership
  • Checking consumer position
  • Mirroring data between clusters
  • Expanding your cluster
  • Decommissioning brokers
  • Increasing replication factor

Datacenters

  • Important Server Configs

  • Important Client Configs

  • Production Server Configs

  • Monitoring

  • ZooKeeper

  • как удобно искать по тегам сообщение в топике, было ли или не было

  • интеграция авро в уже существующее решение

  • log compacted топики

  • Tools for troubleshooting

  • Monitoring Approaches

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages