
How ClickHouse became fast at joins

Blog post from ClickHouse

Post Details
Company
Date Published
Author
Two years of focused join engineering
Word Count
3,341
Language
English
Hacker News Points
-
Summary

Over the past two years, ClickHouse has become 26× faster on join-heavy analytical workloads, measured on the TPC-H SF100 benchmark against version 22.4. The gain came from a deliberate engineering effort to make joins a core strength of the system. In the first year, foundational work such as faster parallel hash joins, smarter query planning, and aggressive filter pushdown produced a 4.4× speedup by version 25.4. The second year added support for correlated subqueries, lazy column replication, runtime filters, and statistics-based join reordering, which together contributed a further 6× speedup (4.4 × 6 ≈ 26). As a result, ClickHouse runs complex join queries faster and at lower cost, which lets it compete with platforms such as Snowflake, Databricks, BigQuery, and Redshift. The company plans to keep optimizing join performance, including distributed joins for even larger workloads.