Company
Date Published
Author
Evan Mouzakitis, David M. Lentz
Word count
5423
Language
English
Hacker News points
3

Summary

Kafka is a distributed, partitioned, replicated log service developed by LinkedIn and open sourced in 2011. It is designed to handle the real-time data feeds of large companies. Kafka differs from other message queue systems such as RabbitMQ, ActiveMQ, or Redis's Pub/Sub in several ways: it is a replicated log service, it uses a custom binary TCP-based protocol, it is very fast even on small clusters, and it provides strong ordering semantics and durability guarantees. Many organizations use Kafka, including LinkedIn, Pinterest, Twitter, and Datadog. At the time of writing, the latest release is version 2.4.1. A Kafka deployment consists of brokers that act as intermediaries between producer applications and consumer applications. Producers push messages to brokers in batches, while consumers pull messages from the log at their own rate. Related messages are stored in topics; each topic is divided into partitions, and partitions are assigned to brokers. The more partitions a topic has, the more concurrent consumers it can support. Kafka's replication feature provides high availability by persisting each partition on multiple brokers. Kafka deployments use ZooKeeper to maintain information about brokers and topics, apply quotas that govern traffic, and store information about replicas. Monitoring ZooKeeper metrics is therefore key to maintaining a healthy Kafka cluster. Key ZooKeeper metrics include outstanding requests, average latency, number of alive connections, pending syncs, bytes sent/received, usable memory, swap usage, and disk latency. A properly functioning Kafka cluster can handle significant amounts of data, but monitoring its health and performance is crucial to keeping dependent applications reliable.
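
Several of the ZooKeeper metrics named in the summary (outstanding requests, average latency, alive connections, pending syncs) appear in the output of ZooKeeper's `mntr` four-letter command as tab-separated key/value lines. The following is a minimal sketch of how that output might be parsed, not the article's own tooling. The sample text and its values are invented for illustration, and the helper names `parse_mntr` and `watched_metrics` are hypothetical.

```python
# Illustrative `mntr`-style output. The values are made up, not measurements.
SAMPLE_MNTR = (
    "zk_version\t3.5.7\n"
    "zk_avg_latency\t2\n"
    "zk_outstanding_requests\t0\n"
    "zk_num_alive_connections\t12\n"
    "zk_pending_syncs\t0\n"
)

# The subset of keys that corresponds to metrics named in the summary.
WATCHED = (
    "zk_outstanding_requests",
    "zk_avg_latency",
    "zk_num_alive_connections",
    "zk_pending_syncs",
)


def parse_mntr(text):
    """Split tab-separated `key<TAB>value` lines into a dict of strings."""
    metrics = {}
    for line in text.splitlines():
        key, sep, value = line.partition("\t")
        if not sep:
            continue  # skip blank or malformed lines
        metrics[key.strip()] = value.strip()
    return metrics


def watched_metrics(text):
    """Return the watched metrics as floats, skipping any that are absent."""
    parsed = parse_mntr(text)
    return {name: float(parsed[name]) for name in WATCHED if name in parsed}


if __name__ == "__main__":
    for name, value in watched_metrics(SAMPLE_MNTR).items():
        print(f"{name}: {value}")
```

In a live deployment the text would typically come from sending `mntr` to a ZooKeeper server's client port, provided the command is enabled on that server. Missing keys are skipped rather than treated as errors because some values, such as `zk_pending_syncs`, are only reported by the ensemble leader.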