Company
Date Published
Author
Tom Schreiber
Word count
4914
Language
English
Hacker News points
None

Summary

ClickHouse supports several join algorithms that trade off performance against memory usage. The Hash join algorithm is fast but single-threaded, while the Parallel hash join algorithm can be faster with large right-hand side tables but requires more memory. The Grace hash join algorithm is not memory-bound: it joins data in two phases, first splitting the data into buckets and then processing those buckets sequentially in memory, with buckets offloaded to disk in the meantime. ClickHouse automatically chooses one of 30+ variations of the join algorithms based on the specifics of the query, such as whether multiple join keys are used and which join strictness is set. The choice of algorithm can significantly impact performance and memory usage, with the Grace hash join offering a good balance between execution time and memory consumption.
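
The two-phase, bucketed approach of a grace hash join can be illustrated with a minimal Python sketch. This is a textbook-style illustration, not ClickHouse's implementation: the function name `grace_hash_join`, the `num_buckets` parameter, and the row format (lists of dicts) are all assumptions made for the example, and the buckets are kept in plain in-memory lists where a real engine could offload them to disk.

```python
from collections import defaultdict


def grace_hash_join(left, right, key, num_buckets=4):
    """Inner-join two lists of dicts on `key` (hypothetical helper)."""
    # Phase 1: split both inputs into buckets by hashing the join key.
    # Rows with equal keys always land in the same bucket number.
    # Assumption: buckets stay in memory here; a real engine could
    # write all but the current bucket to disk.
    left_buckets = defaultdict(list)
    right_buckets = defaultdict(list)
    for row in right:
        right_buckets[hash(row[key]) % num_buckets].append(row)
    for row in left:
        left_buckets[hash(row[key]) % num_buckets].append(row)

    # Phase 2: process one bucket at a time, so only a single bucket's
    # hash table has to be held in memory at any moment.
    result = []
    for bucket in range(num_buckets):
        # Build a hash table from the right-hand side rows of this bucket.
        table = defaultdict(list)
        for row in right_buckets.get(bucket, []):
            table[row[key]].append(row)
        # Probe it with the left-hand side rows of the same bucket.
        for left_row in left_buckets.get(bucket, []):
            for right_row in table.get(left_row[key], []):
                result.append({**right_row, **left_row})
    return result


# Example data (made up for illustration).
actors = [{"id": 1, "name": "A"}, {"id": 2, "name": "B"}]
roles = [{"id": 1, "role": "X"}, {"id": 1, "role": "Y"}, {"id": 3, "role": "Z"}]
rows = grace_hash_join(roles, actors, key="id", num_buckets=2)
```

In this sketch, a larger `num_buckets` means smaller per-bucket hash tables and therefore lower peak memory, at the cost of more partitioning work, which mirrors the trade-off between execution time and memory consumption described above.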