Building a faster hash table for high performance SQL joins
Blog post from QuestDB
QuestDB, an open-source time-series database, uses a specialized hash table called FastMap to speed up SQL execution, particularly for JOIN and GROUP BY. FastMap differs from a general-purpose hash table in three ways: it is grow-only (entries are added or updated, not removed one by one), it stores its data in native memory outside the JVM heap, and it preserves insertion order, which helps with sorting during query execution. The structure uses open addressing with linear probing and a low load factor, and it supports variable-size keys paired with fixed-size values. Further design choices, including caching hash codes and using a compact hash function, reduce CPU overhead, while off-heap storage avoids garbage-collection pressure. In microbenchmarks, FastMap outperforms the standard Java HashMap on both reads and writes. FastMap is tailored to QuestDB's use cases, and work on it continues: the team is exploring optimizations such as Robin Hood hashing to improve its efficiency further.
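The design points above can be illustrated with a small Java sketch. This is not QuestDB's FastMap code: the class name `GrowOnlyMapSketch`, its methods, the 0.5 load factor, and the FNV-1a hash are all assumptions chosen for illustration. It shows a grow-only map with open addressing and linear probing, variable-size keys and fixed-size `long` values appended to an off-heap buffer in insertion order, and cached hash codes that let a resize skip re-hashing the keys.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of the ideas described above; not QuestDB's implementation.
final class GrowOnlyMapSketch {
    private static final double LOAD_FACTOR = 0.5; // assumed value; a low load factor keeps probe runs short
    private ByteBuffer entries; // off-heap, append-only: [keyLen:int][key bytes][value:long]
    private int[] offsets;      // slot -> entry offset + 1 (0 marks an empty slot)
    private int[] hashes;       // slot -> cached hash code of the key stored there
    private int mask;
    private int size;

    GrowOnlyMapSketch(int initialSlots, int initialBytes) {
        // Round the slot count up to a power of two so "hash & mask" selects a slot.
        int slots = Integer.highestOneBit(Math.max(2, initialSlots) - 1) << 1;
        offsets = new int[slots];
        hashes = new int[slots];
        mask = slots - 1;
        entries = ByteBuffer.allocateDirect(initialBytes);
    }

    int size() { return size; }

    void put(byte[] key, long value) {
        int h = hash(key);
        int slot = h & mask;
        while (offsets[slot] != 0) { // linear probing: step to the next slot on collision
            if (hashes[slot] == h && keyEquals(offsets[slot] - 1, key)) {
                entries.putLong(offsets[slot] - 1 + 4 + key.length, value); // update in place
                return;
            }
            slot = (slot + 1) & mask;
        }
        offsets[slot] = append(key, value) + 1;
        hashes[slot] = h;
        if (++size > offsets.length * LOAD_FACTOR) rehash();
    }

    long get(byte[] key, long missing) {
        int h = hash(key);
        for (int slot = h & mask; offsets[slot] != 0; slot = (slot + 1) & mask) {
            // Compare the cached hash first; touch the key bytes only on a hash match.
            if (hashes[slot] == h && keyEquals(offsets[slot] - 1, key)) {
                return entries.getLong(offsets[slot] - 1 + 4 + key.length);
            }
        }
        return missing;
    }

    // Entries are appended sequentially, so a linear walk yields insertion order.
    long[] valuesInInsertionOrder() {
        long[] out = new long[size];
        int off = 0;
        for (int i = 0; i < size; i++) {
            int keyLen = entries.getInt(off);
            out[i] = entries.getLong(off + 4 + keyLen);
            off += 4 + keyLen + 8;
        }
        return out;
    }

    private int append(byte[] key, long value) {
        int need = 4 + key.length + 8;
        if (entries.remaining() < need) { // grow the off-heap buffer and copy existing entries
            ByteBuffer bigger = ByteBuffer.allocateDirect(
                    Math.max(entries.capacity() * 2, entries.position() + need));
            entries.flip();
            bigger.put(entries);
            entries = bigger;
        }
        int off = entries.position();
        entries.putInt(key.length).put(key).putLong(value);
        return off;
    }

    private void rehash() {
        int[] oldOffsets = offsets, oldHashes = hashes;
        offsets = new int[oldOffsets.length * 2];
        hashes = new int[offsets.length];
        mask = offsets.length - 1;
        for (int i = 0; i < oldOffsets.length; i++) {
            if (oldOffsets[i] == 0) continue;
            int slot = oldHashes[i] & mask; // reuse the cached hash; keys are not re-read
            while (offsets[slot] != 0) slot = (slot + 1) & mask;
            offsets[slot] = oldOffsets[i];
            hashes[slot] = oldHashes[i];
        }
    }

    private boolean keyEquals(int off, byte[] key) {
        if (entries.getInt(off) != key.length) return false;
        for (int i = 0; i < key.length; i++) {
            if (entries.get(off + 4 + i) != key[i]) return false;
        }
        return true;
    }

    private static int hash(byte[] key) {
        int h = 0x811c9dc5; // FNV-1a, a placeholder; the hash function FastMap uses is not specified here
        for (byte b : key) { h ^= (b & 0xff); h *= 0x01000193; }
        return h;
    }
}
```

In this sketch, only the two small `int` arrays live on the JVM heap; key and value bytes sit in a direct buffer, so the garbage collector has no per-entry objects to trace. Because nothing is ever removed, the entry buffer has no holes, which is what makes the insertion-order walk a simple sequential scan.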