How we processed 12 trillion rows during Black Friday
Blog post from Tinybird
Faced with the challenge of hosting a real-time analytics API for a retail client during Black Friday, the team built the system on their Tinybird backend, with Nginx and Varnish sitting in front of a ClickHouse cluster that stored and queried large volumes of transactional data. To handle upserts without giving up real-time queries, they used materialized views and split the main sales table in two: one table for recent data and another for historical data.

Despite some hiccups, including brief downtime during peak traffic caused by a script error, the system ingested over 650 billion rows and served queries that read more than 12 trillion rows in total, sustaining a median of 50 queries per second with peaks of up to 300 QPS.

Performance work focused on reducing query times and bytes scanned, using lightweight operations and strategic caching in Varnish, which brought the average API response time to 600ms.
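The two-table split for upserts can be sketched in ClickHouse SQL. This is a minimal illustration, not the post's actual schema: all table and column names (sales_recent, sales_historical, sale_id, and so on) are hypothetical. The idea is that a ReplacingMergeTree table absorbs updates to recent, still-mutable sales, while an append-only MergeTree table holds settled history, and queries union the two:

```sql
-- Hypothetical sketch of the two-table split; none of these names
-- come from the post.

-- Recent, still-mutable sales: ReplacingMergeTree keeps the row with
-- the highest updated_at per sale_id, giving upsert-like semantics.
CREATE TABLE sales_recent
(
    sale_id    UInt64,
    store_id   UInt32,
    amount     Decimal(18, 2),
    updated_at DateTime
)
ENGINE = ReplacingMergeTree(updated_at)
ORDER BY sale_id;

-- Settled historical sales: plain append-only MergeTree, cheap to scan.
CREATE TABLE sales_historical
(
    sale_id    UInt64,
    store_id   UInt32,
    amount     Decimal(18, 2),
    updated_at DateTime
)
ENGINE = MergeTree
ORDER BY (store_id, sale_id);

-- API queries read both tables. FINAL forces deduplication on the
-- recent side at read time; the historical side needs no such treatment.
SELECT store_id, sum(amount) AS revenue
FROM
(
    SELECT store_id, amount FROM sales_recent FINAL
    UNION ALL
    SELECT store_id, amount FROM sales_historical
)
GROUP BY store_id;
```

Keeping the mutable table small is what makes FINAL affordable at read time; a periodic job would move rows older than some cutoff into the historical table.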
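The bytes-scanned optimization typically comes down to letting queries hit pre-aggregated data rather than raw rows. Here is a hedged sketch of how a materialized view could serve that purpose, again with hypothetical names and not necessarily the views the team actually used:

```sql
-- Hypothetical hourly rollup: the materialized view aggregates each
-- insert block as it arrives, and SummingMergeTree merges the partial
-- sums, so the API reads a handful of pre-aggregated rows instead of
-- scanning billions of raw ones.
CREATE MATERIALIZED VIEW sales_by_hour_mv
ENGINE = SummingMergeTree
ORDER BY (store_id, hour)
AS
SELECT
    store_id,
    toStartOfHour(updated_at) AS hour,
    sum(amount) AS revenue
FROM sales_historical
GROUP BY store_id, hour;

-- Reads still aggregate, because background merges are eventual,
-- not immediate.
SELECT store_id, hour, sum(revenue) AS revenue
FROM sales_by_hour_mv
GROUP BY store_id, hour
ORDER BY hour;
```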