Blog post from Tinybird
This post explores high-speed data ingestion by trying to reproduce Tesla's reported ingestion rate of 1 billion rows per second with ClickHouse®, a columnar database management system. The author tests the claim on a MacBook M4 Pro and shows that adding shards to a ClickHouse® cluster can increase ingestion capacity linearly. The setup involves creating a ClickHouse® cluster on Google Cloud Platform, configuring its nodes, and tuning how data is inserted to reach high throughput. The post also covers the challenges of real-time ingestion: handling failures, retries, and database upgrades, and balancing throughput against part size and merges. It suggests that Tesla likely buffers data in Kafka and runs a real-time ETL process before ingestion into ClickHouse®, which is probably split into multiple shards and tuned with various optimizations for efficient data processing. The author concludes that reaching such ingestion rates requires careful architecture planning and possibly additional resources, so a setup this complex mainly makes sense for companies that genuinely need very high insert rates.
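To make the linear-scaling argument concrete, here is a minimal back-of-the-envelope sketch in Python. It assumes that ingestion capacity grows linearly with the number of shards, as the post reports, and that each insert creates one new part on the shard that receives it. The per-shard rate and the batch size below are hypothetical placeholders, not figures from the post; only the 1 billion rows per second target comes from the text.

```python
def shards_needed(target_rows_per_sec: int, rows_per_sec_per_shard: int) -> int:
    """Shards required to reach the target, assuming capacity scales linearly."""
    if rows_per_sec_per_shard <= 0:
        raise ValueError("rows_per_sec_per_shard must be positive")
    # Integer ceiling division: a partially used shard still counts as a shard.
    return (target_rows_per_sec + rows_per_sec_per_shard - 1) // rows_per_sec_per_shard


def parts_per_sec_per_shard(rows_per_sec_per_shard: int, rows_per_insert: int) -> float:
    """New parts created per second on one shard, assuming one part per insert."""
    if rows_per_insert <= 0:
        raise ValueError("rows_per_insert must be positive")
    return rows_per_sec_per_shard / rows_per_insert


TARGET_ROWS_PER_SEC = 1_000_000_000  # the rate attributed to Tesla in the post
ROWS_PER_SEC_PER_SHARD = 12_500_000  # hypothetical: measure this on your own hardware
ROWS_PER_INSERT = 1_000_000          # hypothetical batch size per insert

shards = shards_needed(TARGET_ROWS_PER_SEC, ROWS_PER_SEC_PER_SHARD)
parts = parts_per_sec_per_shard(ROWS_PER_SEC_PER_SHARD, ROWS_PER_INSERT)

print(f"shards needed: {shards}")
print(f"new parts per second per shard: {parts}")
```

The second number is what ties throughput to part size and merges: for a fixed per-shard rate, smaller inserts mean more parts per second for background merges to absorb, while larger inserts reduce merge pressure at the cost of more buffering upstream.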