Content Deep Dive

How we scaled raw GROUP BY to 100 B+ rows in under a second

Blog post from ClickHouse

Post Details
Company: ClickHouse
Date Published:
Author: Tom Schreiber
Word Count: 5,440
Company Posts That Month: 20
Language: English
Hacker News Points: -
Summary

ClickHouse Cloud introduces parallel replicas, a feature that scales a single query horizontally by distributing it across thousands of cores. With it, complex queries such as GROUP BY aggregations run on datasets as large as 100 billion rows in under half a second, without pre-aggregation or data reshuffling. The feature treats multiple nodes as virtual replicas, which makes scaling elastic: adding nodes increases query speed without moving any data. Parallel replicas are currently in beta. They are intended to deliver interactive query speed across very large datasets and high data throughput, extending ClickHouse's foundational goal of rapid data aggregation from single-node efficiency to cloud-scale operation. Users can scale workloads from individual nodes to large distributed clusters with minimal configuration changes.
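
The idea of splitting one GROUP BY across several virtual replicas and merging the results can be illustrated with a small Python sketch. This is a toy model only, not ClickHouse's implementation: the function names `partial_group_by` and `parallel_group_by`, the round-robin assignment of rows, and the use of threads in place of nodes are all assumptions made for illustration. Each "replica" aggregates only its own share of the rows into partial (sum, count) states, and an initiator merges those states, so no rows are reshuffled between workers.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partial_group_by(rows):
    """Aggregate one replica's share of rows into partial per-key sums and counts."""
    sums, counts = Counter(), Counter()
    for key, value in rows:
        sums[key] += value
        counts[key] += 1
    return sums, counts


def parallel_group_by(rows, num_replicas):
    """Toy model: split rows across replicas, aggregate in parallel, merge partial states."""
    # Assumption: rows are assigned round-robin; a real system assigns work differently.
    shares = [rows[i::num_replicas] for i in range(num_replicas)]

    # Threads stand in for separate nodes in this sketch.
    with ThreadPoolExecutor(max_workers=num_replicas) as pool:
        partials = list(pool.map(partial_group_by, shares))

    # The initiator merges the small partial states, never the raw rows.
    total_sums, total_counts = Counter(), Counter()
    for sums, counts in partials:
        total_sums.update(sums)
        total_counts.update(counts)
    return {key: (total_sums[key], total_counts[key]) for key in total_sums}


rows = [("a", 1), ("b", 2), ("a", 3), ("c", 4), ("b", 5)]
result = parallel_group_by(rows, num_replicas=4)
```

The merged result is the same for any number of replicas, which is the property that lets more workers be added without changing the query or moving data.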

Trends Found in this Post
Trend | Post Mentions | Total Month Mentions | Posts | Companies | MoM
Observability | 3 | 1,462 | 347 | 128 | -22%