Virtualizing Our Storage Engine

Blog post from Honeycomb

Post Details
Company: Honeycomb
Date Published:
Author: Hazel Edmands
Word Count: 1,748
Language: English
Hacker News Points: -
Summary

Honeycomb's storage engine, Retriever, has been reworked internally to improve performance without changing the user experience. Retriever runs as two processes: a writer that appends event data to disk, and a reader that answers queries by computing aggregations over the stored data. When Honeycomb adopted a new data model built around environments and services, Retriever was adapted to support multi-dataset queries, allowing queries to run across multiple services more efficiently. This led to the development of "virtual datasets," which map datasets to container datasets and make queries more efficient by reducing the amount of irrelevant data that must be read. The change improved query runtimes significantly, especially for complex environments with numerous services, cutting median query duration from around 20 seconds to 0.2 seconds, roughly a 100-fold reduction. The new system also leaves room to add features without hurting query performance, and Honeycomb plans to keep building on this model to improve query efficiency further.
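
To make the idea in the summary concrete, here is a minimal Python sketch of a writer that appends events and a reader that aggregates across several datasets through a virtual-to-container mapping. It is a toy illustration only, not Retriever's implementation: the Store class, the VIRTUAL_TO_CONTAINER table, the dataset and container names, and the duration_ms field are all hypothetical, and in-memory lists stand in for files on disk.

```python
from collections import defaultdict

# Hypothetical mapping from virtual dataset name to container dataset name.
VIRTUAL_TO_CONTAINER = {
    "checkout": "container-a",
    "cart": "container-a",
    "billing": "container-b",
}


class Store:
    """Toy stand-in for on-disk storage: one append-only list per container."""

    def __init__(self):
        self.containers = defaultdict(list)
        self.scanned = []  # which containers each query had to read

    def write(self, dataset, event):
        # Writer role: append the event to the container behind the dataset,
        # tagging it with the virtual dataset it belongs to.
        container = VIRTUAL_TO_CONTAINER[dataset]
        self.containers[container].append({"dataset": dataset, **event})

    def query_avg(self, datasets, field):
        # Reader role: resolve the requested datasets to their containers,
        # scan each needed container once, and aggregate matching events.
        wanted = set(datasets)
        needed = sorted({VIRTUAL_TO_CONTAINER[d] for d in wanted})
        total, count = 0.0, 0
        for container in needed:
            self.scanned.append(container)
            for event in self.containers[container]:
                if event["dataset"] in wanted and field in event:
                    total += event[field]
                    count += 1
        return total / count if count else None


store = Store()
store.write("checkout", {"duration_ms": 120})
store.write("cart", {"duration_ms": 80})
store.write("billing", {"duration_ms": 300})

# One query spanning two services; only their shared container is read.
average = store.query_avg(["checkout", "cart"], "duration_ms")
```

In this sketch the two-service query touches only container-a and never reads container-b, which is one simple way a mapping layer could avoid reading irrelevant data. How Retriever actually lays out and prunes data is not described in the summary.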