How to scale a real-time data platform
Blog post from Tinybird
Tinybird is an enterprise-grade data platform designed for large-scale data processing, with a focus on real-time analytics, security, performance, and scalability. The platform is built around the real-time database ClickHouse and spreads load across several database replicas, whose number can grow or shrink with customer needs. Tinybird's scaling philosophy is to optimize SQL queries to minimize resource usage before expanding infrastructure, because hardware is costly while logical improvements are not. To support high-concurrency, low-latency APIs, the platform scales query concurrency by adding database replicas or CPUs, and scales ingestion by using shared storage and by handling data efficiently before it enters the database. This approach lets Tinybird manage significant data loads: for example, it supported a top-five global clothing retailer during Black Friday, with 4.3 billion events, while maintaining low latency. Features such as Materialized Views, which pre-calculate aggregates, help clients make queries more efficient without additional hardware. The company continues to refine its scaling methods, supported by its contributions to the open-source ClickHouse project and by ongoing improvements in how it handles high-concurrency systems.
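To make the idea of pre-calculating aggregates concrete, here is a minimal Python sketch of the general technique: aggregate state is updated at insert time so that a read becomes a single lookup instead of a scan over raw events. It is an in-memory illustration only, not how Tinybird or ClickHouse implement Materialized Views; the names `SalesByProduct`, `on_insert`, `query`, and `scan_query` are hypothetical and chosen for this example.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class SalesByProduct:
    """Hypothetical in-memory stand-in for a materialized view.

    Keeps a running [count, revenue] per product, updated on every
    insert, so reads never rescan the raw events.
    """
    totals: dict = field(default_factory=lambda: defaultdict(lambda: [0, 0.0]))

    def on_insert(self, events):
        # Fold each newly ingested batch into the stored aggregate state.
        for product, amount in events:
            state = self.totals[product]
            state[0] += 1
            state[1] += amount

    def query(self, product):
        # Read cost is one dictionary lookup, independent of event count.
        count, revenue = self.totals.get(product, (0, 0.0))
        return count, revenue


def scan_query(raw_events, product):
    """Baseline: aggregate at query time by scanning every raw event."""
    count, revenue = 0, 0.0
    for name, amount in raw_events:
        if name == product:
            count += 1
            revenue += amount
    return count, revenue


if __name__ == "__main__":
    raw_events = []
    view = SalesByProduct()
    batches = [[("shirt", 20.0), ("jeans", 35.5)], [("shirt", 10.0)]]
    for batch in batches:
        raw_events.extend(batch)  # raw table keeps every event
        view.on_insert(batch)     # view is updated incrementally
    # Both paths give the same answer; only the read cost differs.
    print(view.query("shirt") == scan_query(raw_events, "shirt"))
```

The trade-off in this sketch is a little extra work on each insert in exchange for reads whose cost does not grow with the number of stored events, which is the sense in which a logical improvement can stand in for additional hardware.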