ClickHouse Joins Under the Hood - Direct Join

Blog post from ClickHouse

Post Details
Company: ClickHouse
Author: Tom Schreiber
Word Count: 3,240
Language: English
Summary

The direct join algorithm is the fastest join algorithm in ClickHouse. It is applicable when the underlying storage of the right-hand side table supports low-latency key-value requests, and it outperforms all other ClickHouse join algorithms by a significant margin in execution time, especially with large right-hand side tables. The algorithm requires the right table to be backed by key-value storage such as a dictionary, which serves lookups in O(1) time. In the post's benchmark, the direct join run against a dictionary with a flat memory layout is ~25 times faster than the hash join run and ~15 times faster than the parallel hash join run of the same query. Even when the dictionary's bytes_allocated is added to the query's peak memory consumption, the total remains lower than that of the hash-based join runs.
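
The difference between the two approaches can be illustrated with a minimal Python sketch. It is a conceptual model, not ClickHouse code: FlatDictionary, hash_join, and direct_join are hypothetical names, and the "flat layout" is assumed here to mean an array indexed directly by an integer key. The hash join builds a hash table from the right table on every run, whereas the direct join skips the build phase and probes a key-value structure that already exists.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

LeftRow = Tuple[int, str]      # (join key, payload)
RightRow = Tuple[int, str]     # (unique key, attribute)
Joined = Tuple[int, str, str]  # (key, payload, attribute)


class FlatDictionary:
    """Hypothetical stand-in for a flat-layout dictionary:
    the integer key is used directly as an array index."""

    def __init__(self, rows: Iterable[RightRow], size: int) -> None:
        self._values: List[Optional[str]] = [None] * size
        for key, value in rows:
            self._values[key] = value

    def get(self, key: int) -> Optional[str]:
        # Constant-time lookup: one bounds check and one array access.
        if 0 <= key < len(self._values):
            return self._values[key]
        return None


def hash_join(left: Iterable[LeftRow], right: Iterable[RightRow]) -> Iterator[Joined]:
    # Build phase: the right table is loaded into a hash table for this query.
    table = dict(right)
    # Probe phase: stream the left table and look up each key.
    for key, payload in left:
        if key in table:
            yield key, payload, table[key]


def direct_join(left: Iterable[LeftRow], dictionary: FlatDictionary) -> Iterator[Joined]:
    # No build phase: the dictionary is already in memory,
    # so each left row costs a single key-value lookup.
    for key, payload in left:
        value = dictionary.get(key)
        if value is not None:
            yield key, payload, value
```

Both functions return the same inner-join result for the same inputs. The direct variant avoids per-query build time and the memory of a temporary hash table, which is consistent with the speed and memory figures summarized above.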