Horizontally scaling Kafka consumers with rendezvous hashing
Blog post from Tinybird
Apache Kafka is a widely used distributed event streaming platform for building scalable, fault-tolerant streaming applications, which makes it a core component for enterprises that handle event data. Tinybird, a platform for developers to create low-latency APIs, ran into a problem with its Kafka connector: Kafka costs were increasing exponentially while the customer base grew only linearly. Tinybird's Kafka consumers were initially optimized for throughput and availability, but that design opened too many Kafka connections and did not scale. To address this, Tinybird implemented rendezvous hashing, which let them reduce the number of Kafka connections while maintaining high throughput and availability. The solution distributed agents across hundreds of topics, minimized rebalancing when the set of agents or topics changed, and significantly reduced infrastructure costs. Tinybird now supports ingestion from numerous concurrent topics, giving developers a cost-efficient and scalable way to build analytics on Kafka data.
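
To make the idea concrete, here is a minimal sketch of rendezvous (highest random weight) hashing in Python. It is not Tinybird's implementation: the names `score` and `assign`, the `replicas` parameter, the agent and topic names, and the choice of SHA-256 are all assumptions made for illustration. Each (agent, topic) pair gets a deterministic score, and a topic is owned by the agents with the highest scores for it.

```python
import hashlib


def score(agent: str, topic: str) -> int:
    """Deterministic pseudo-random weight for an (agent, topic) pair.

    Hypothetical helper: SHA-256 is an illustrative choice of hash.
    """
    digest = hashlib.sha256(f"{agent}:{topic}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def assign(topic: str, agents: list[str], replicas: int = 1) -> list[str]:
    """Return the `replicas` agents with the highest score for this topic.

    `replicas` is an assumed knob: more than one owner per topic trades
    extra connections for availability.
    """
    ranked = sorted(agents, key=lambda a: score(a, topic), reverse=True)
    return ranked[:replicas]


# Hypothetical agents and topics, used only to show the rebalancing behavior.
agents = ["agent-a", "agent-b", "agent-c"]
topics = [f"topic-{i}" for i in range(100)]

before = {t: assign(t, agents)[0] for t in topics}
survivors = [a for a in agents if a != "agent-b"]
after = {t: assign(t, survivors)[0] for t in topics}

# Only topics that were owned by the removed agent change owner.
moved = [t for t in topics if before[t] != after[t]]
print(f"{len(moved)} of {len(topics)} topics moved")
```

Because every agent can compute the same ranking independently, no central coordinator is needed to decide ownership, and removing an agent only reassigns the topics that agent owned. That property is what keeps rebalancing small when the set of agents changes.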