Company
Date Published
Author
Mikhail Bautin and Kannan Muthukkaruppan
Word count
1499
Language
English
Hacker News points
None

Summary

RocksDB's block cache has been made scan-resistant by dividing the LRU into two portions and requiring multiple touches before a block is promoted to the hot portion. RocksDB's SSTable files now store bloom filters and indexes as multi-level, block-oriented structures, so they can be demand-paged into the block cache just like data blocks. Each DocDB node dedicates one RocksDB instance per tablet rather than sharing a single instance across multiple tablets. This design enables efficient cluster rebalancing when a node fails or is added, and it simplifies deletion. DocDB also allows per-table storage policies and compression options, including in-memory delta-encoding schemes. A global memstore limit avoids premature flushing of memstores, and separate queues handle large and small compactions. Smart load balancing across multiple disks is achieved by distributing RocksDB instances uniformly across the available SSDs on a per-table basis. Other optimizations include avoiding double journaling in the write-ahead log, removing unnecessary functionality from RocksDB, and implementing multi-version concurrency control using hybrid timestamps.
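
The scan-resistance idea in the summary (a two-part LRU where a block must be touched more than once before it reaches the hot portion) can be illustrated with a minimal Python sketch. This is not the RocksDB or DocDB implementation. The class name `ScanResistantLRU`, the `promote_after` threshold, and the choice to demote blocks evicted from the hot portion back to the cold portion are all assumptions made for illustration.

```python
from collections import OrderedDict


class ScanResistantLRU:
    """Illustrative two-portion LRU (hypothetical, not RocksDB's code).

    New blocks land in the cold portion. A block moves to the hot portion
    only once it has been touched `promote_after` times, so a one-pass scan
    can only churn the cold portion.
    """

    def __init__(self, cold_capacity, hot_capacity, promote_after=2):
        self.cold = OrderedDict()  # key -> (value, touch_count), LRU first
        self.hot = OrderedDict()   # key -> value, LRU first
        self.cold_capacity = cold_capacity
        self.hot_capacity = hot_capacity
        self.promote_after = promote_after  # assumed threshold

    def put(self, key, value):
        if key in self.hot:
            self.hot[key] = value
            self.hot.move_to_end(key)
            return
        # The insert counts as the first touch; re-inserts keep their count.
        touches = self.cold.pop(key)[1] if key in self.cold else 1
        self.cold[key] = (value, touches)
        self._trim_cold()

    def get(self, key):
        if key in self.hot:
            self.hot.move_to_end(key)
            return self.hot[key]
        if key not in self.cold:
            return None
        value, touches = self.cold.pop(key)
        touches += 1
        if touches >= self.promote_after:
            self._insert_hot(key, value)
        else:
            self.cold[key] = (value, touches)  # back in as most recent
        return value

    def _insert_hot(self, key, value):
        self.hot[key] = value
        while len(self.hot) > self.hot_capacity:
            # Assumption: demote the coldest hot block instead of dropping it.
            old_key, old_value = self.hot.popitem(last=False)
            self.cold[old_key] = (old_value, 1)
        self._trim_cold()

    def _trim_cold(self):
        while len(self.cold) > self.cold_capacity:
            self.cold.popitem(last=False)  # evict least recently used


cache = ScanResistantLRU(cold_capacity=2, hot_capacity=2)
cache.put("a", "A")
cache.get("a")                 # second touch promotes "a" to the hot portion
for i in range(10):            # a long scan of blocks that are read once
    cache.put(f"s{i}", i)
```

After the simulated scan, `"a"` is still cached in the hot portion while the scanned blocks have only displaced one another in the cold portion, which is the behavior the summary describes as scan resistance.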