Lifting the Index Size Limit of Prometheus with Postings Compression
Blog post from Grafana Labs
Prometheus's time series database (TSDB) stores its index postings as 32-bit values, which caps the addressable index at 64 GiB: each posting refers to a 16-byte-aligned series entry, so 2^32 × 16 bytes = 64 GiB. That cap limits scalability as data volume and series counts grow. During Google Summer of Code 2019, Alec Wang was mentored on a project to lift the limit by experimenting with 64-bit postings. On their own, wider postings risk slowing queries, because more data has to be read from disk. To offset that cost, compression techniques were tested and implemented, specifically a prefix compression scheme modeled on roaring bitmaps. The upper 48 bits of a posting are stored once as a key, and only the lower 16 bits are stored for each posting that shares that key, which keeps querying efficient while raising the index size limit. The project, which is undergoing review, shows promising results in tests with both synthetic and real-world data.
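To make the key/value split concrete, here is a minimal Python sketch of the idea. It is an illustration, not the code under review in Prometheus: the function names (`compress`, `decompress`, `encoded_size`) are hypothetical, and the byte accounting assumes a simplified layout of 6 bytes per key and 2 bytes per low value, ignoring any counts or offsets a real on-disk format would need.

```python
from itertools import groupby

LOW_BITS = 16                    # bits stored per posting
LOW_MASK = (1 << LOW_BITS) - 1   # 0xFFFF


def compress(postings):
    """Group sorted 64-bit postings by their upper 48 bits.

    Returns a list of (key, lows) pairs, where key is the shared
    48-bit prefix and lows holds the 16-bit remainders in order.
    The input must be sorted, as postings lists are.
    """
    blocks = []
    for key, group in groupby(postings, key=lambda p: p >> LOW_BITS):
        blocks.append((key, [p & LOW_MASK for p in group]))
    return blocks


def decompress(blocks):
    """Rebuild the original 64-bit postings from (key, lows) pairs."""
    return [(key << LOW_BITS) | low for key, lows in blocks for low in lows]


def encoded_size(blocks):
    """Rough size in bytes under the assumed layout:
    6 bytes per key plus 2 bytes per low value (no counts or offsets)."""
    return sum(6 + 2 * len(lows) for _, lows in blocks)


postings = [1, 2, 65535, 65536, 65540, 1 << 40]
blocks = compress(postings)
```

With this input, the six postings fall under three keys, so the sketch estimates 30 bytes against 48 bytes for six raw 64-bit values. The saving grows as more postings share a key and shrinks toward nothing when every posting has a distinct prefix.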