The Problem with Pre-aggregated Metrics: Part 2, the “aggregated”
Blog post from Honeycomb
Pre-aggregated metrics are fast and cheap to store, but they sharply limit data exploration and problem-solving. They rely on a fixed set of well-known metrics chosen in advance, and their storage strategy breaks each event down into discrete counters. This reduces data granularity and makes it harder to separate signal from noise. Each new attribute can multiply the number of unique metrics by the number of distinct values it takes, so high-cardinality attributes such as user IDs or OS versions can make the metric count, and the storage required, balloon. That forces a difficult choice between managing storage costs and losing valuable detail. For example, identifying problematic queries at Parse would have been challenging with pre-aggregated metrics, because tracking would have had to prioritize certain users over others. Honeycomb offers an approach that allows more flexible data segmentation and detailed breakdowns, so users can better understand and respond to diverse traffic patterns without being constrained by their tools.
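To make the storage growth concrete, here is a minimal Python sketch. The attribute names, cardinalities, and sample events are hypothetical, chosen only for illustration, and the code is not a description of how Honeycomb or any particular metrics store works. It shows two things: the worst-case number of counters when every combination of attribute values is tracked, and how an attribute left out of the pre-aggregation can no longer be queried, while the raw events can still be grouped by it.

```python
from collections import Counter
from math import prod

# Hypothetical attribute cardinalities, chosen only for illustration.
CARDINALITIES = {
    "endpoint": 20,
    "status_code": 5,
    "os_version": 50,
    "user_id": 100_000,
}


def worst_case_series(attributes):
    """Upper bound on distinct counters if every value combination is tracked."""
    return prod(CARDINALITIES[name] for name in attributes)


def pre_aggregate(events, attributes):
    """Collapse events into one counter per combination of the chosen attributes."""
    return Counter(tuple(event[name] for name in attributes) for event in events)


def group_by(events, attribute):
    """Break raw events down by any attribute, decided at query time."""
    return Counter(event[attribute] for event in events)


# Hypothetical raw events; a real system would carry many more fields.
EVENTS = [
    {"endpoint": "/query", "status_code": 200, "os_version": "9.1", "user_id": "u1"},
    {"endpoint": "/query", "status_code": 500, "os_version": "9.1", "user_id": "u2"},
    {"endpoint": "/query", "status_code": 500, "os_version": "9.2", "user_id": "u2"},
    {"endpoint": "/login", "status_code": 200, "os_version": "9.2", "user_id": "u3"},
]

if __name__ == "__main__":
    chosen = []
    for name in CARDINALITIES:
        chosen.append(name)
        label = " x ".join(chosen)
        print(f"{label}: {worst_case_series(chosen):,} possible counters")

    # Counters keyed only by (endpoint, status_code): user_id is not recoverable.
    print(pre_aggregate(EVENTS, ["endpoint", "status_code"]))

    # The raw events can still answer a per-user question after the fact.
    print(group_by(EVENTS, "user_id"))
```

In this sketch, dropping `user_id` keeps the counter count small, but the per-user breakdown is then gone for good; keeping the raw events defers that choice to query time.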