
Blocked Bloom Filters: Speeding Up Point Lookups in Tiger Postgres' Native Columnstore

Blog post from Tiger Data

Post Details
Company
Date Published
Author
Jacky Liang
Word Count
2,759
Language
English
Summary

The article explains how Bloom filters are integrated into TimescaleDB's columnstore to speed up point lookups, particularly for large-scale time-series data and analytics workloads. Traditional columnstores struggle with queries on unsorted fields because every block must be decompressed and scanned, which is slow. A Bloom filter lets the database quickly determine that a value is definitely not in a batch, so it can skip that batch and significantly reduce the number of blocks it scans. This is particularly effective for queries on non-temporal identifiers, such as UUIDs or transaction IDs, with performance improvements of up to 100x. The article also covers the mechanics of Bloom filters, including their efficiency and limitations, and highlights TimescaleDB's implementation of "blocked Bloom filters," which optimize performance by reducing I/O operations. Bloom filters excel at exact-match queries but do not help with range queries or not-equal searches. Even so, they offer a substantial speedup for many common use cases without requiring manual configuration, in line with TimescaleDB's goal of offering speed without sacrificing flexibility.
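To make the batch-skipping idea concrete, here is a minimal Python sketch of a blocked Bloom filter. It is an illustration only and not TimescaleDB's implementation: the class name `BlockedBloomFilter`, the helpers `build_filter` and `batches_to_scan`, the SHA-256 hashing, and the sizes (64 blocks of 256 bits, 4 probes) are all assumptions chosen for readability. The property it demonstrates is the one the summary relies on: a "no" answer is always correct, so a batch whose filter says "no" can be skipped without decompressing it.

```python
import hashlib


class BlockedBloomFilter:
    """Toy blocked Bloom filter: every value touches exactly one small block.

    Sizes and hash choice are illustrative assumptions, not TimescaleDB's.
    """

    def __init__(self, num_blocks=64, block_bits=256, num_hashes=4):
        self.num_blocks = num_blocks
        self.block_bits = block_bits
        self.num_hashes = num_hashes
        # One Python int per block, used as a fixed-width bit set.
        self.blocks = [0] * num_blocks

    def _positions(self, value):
        digest = hashlib.sha256(str(value).encode("utf-8")).digest()
        # The first 8 bytes pick the block; all probes stay inside it.
        block = int.from_bytes(digest[:8], "big") % self.num_blocks
        # Double hashing derives the probe positions within the block.
        h1 = int.from_bytes(digest[8:16], "big")
        h2 = int.from_bytes(digest[16:24], "big") | 1  # force an odd step
        bits = [(h1 + i * h2) % self.block_bits for i in range(self.num_hashes)]
        return block, bits

    def add(self, value):
        block, bits = self._positions(value)
        for b in bits:
            self.blocks[block] |= 1 << b

    def might_contain(self, value):
        # False means "definitely absent"; True means "possibly present".
        block, bits = self._positions(value)
        word = self.blocks[block]
        return all((word >> b) & 1 for b in bits)


def build_filter(values):
    """Build one filter for a batch of values (hypothetical helper)."""
    bloom = BlockedBloomFilter()
    for v in values:
        bloom.add(v)
    return bloom


def batches_to_scan(filters, needle):
    """Return indexes of batches that cannot be ruled out for `needle`."""
    return [i for i, f in enumerate(filters) if f.might_contain(needle)]
```

A lookup such as `batches_to_scan(filters, "txn-3-42")` always includes the batch that really holds the value and may occasionally include others, because false positives are possible; those extra batches are scanned and rejected. The filter answers equality probes only, which is why the same trick does not help range or not-equal predicates.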