A Quadrillion Rows across three Clouds: scaling LogHouse

Blog post from ClickHouse

Post Details
Company: ClickHouse
Date Published: -
Author: -
Word Count: 4,820
Company Posts That Month: 82
Language: English
Hacker News Points: -
Summary

LogHouse, the internal logging platform for ClickHouse Cloud, now manages 431 PiB of uncompressed data across 1.59 quadrillion rows, a 23-fold increase over two years. It runs in 30+ regions across three cloud providers and handles up to 80 GiB/s and 190 million rows per second at peak. A geosharding strategy supports this growth: writes stay local to their region, so each region scales independently and cross-region costs stay low. Async Inserts absorb small write operations efficiently, and a three-level table hierarchy hides the complex topology from users while keeping cross-cloud queries low-latency. Distributed tables let users query data across regions seamlessly, and S3 serves as a persistent buffer so that data delivery stays reliable even during outages. LogHouse continues to evolve, with ongoing work on reducing memory consumption, improving the durability of async inserts, collecting telemetry without customer impact, and covering more data types, including more OpenTelemetry traces and metrics.
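
To put the headline figures in perspective, the following is a minimal back-of-the-envelope sketch in Python. It uses only the numbers quoted in the summary. The variable names are illustrative, and two assumptions are made that the summary does not confirm: that the 23-fold increase refers to uncompressed data volume, and that the two peak figures (80 GiB/s and 190 million rows per second) describe the same moment.

```python
# Back-of-the-envelope arithmetic on the figures quoted in the summary.
# Binary units: 1 PiB = 2**50 bytes, 1 GiB = 2**30 bytes.
PIB = 2**50
GIB = 2**30

total_bytes = 431 * PIB          # uncompressed data under management
total_rows = 1.59e15             # 1.59 quadrillion rows
growth_factor = 23               # assumed to apply to data volume

peak_bytes_per_s = 80 * GIB      # peak throughput in bytes
peak_rows_per_s = 190e6          # peak throughput in rows (assumed simultaneous)

# Average uncompressed size of a stored row.
avg_row_bytes = total_bytes / total_rows

# Average uncompressed size of a row at peak throughput.
peak_row_bytes = peak_bytes_per_s / peak_rows_per_s

# Implied data volume two years earlier, if the 23x growth applies to bytes.
start_pib = 431 / growth_factor

print(f"average stored row size: {avg_row_bytes:.0f} bytes")
print(f"average row size at peak: {peak_row_bytes:.0f} bytes")
print(f"implied starting volume: {start_pib:.1f} PiB")
```

Under these assumptions, a stored row averages a few hundred bytes uncompressed, and the implied starting point two years earlier is just under 19 PiB.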
