Refinery and EMA Sampling
Blog post from Honeycomb
Refinery, Honeycomb’s sampling proxy, helps large customers get more value from their telemetry by offering several samplers. One of them, dynamic sampling, adjusts sample rates based on data volume so that rare events are kept in preference to common ones. It uses a key built from low-cardinality fields to distinguish event types, which lets Refinery predict and set a sample rate for each key in the next interval. The Exponential Moving Average (EMA) sampler is a dynamic sampler that adapts sample rates smoothly over time, but it needs proper tuning and a suitable AdjustmentInterval to work well. If the set of keys seen in one interval differs substantially from the set seen in the next, sample rates become unstable, so the interval must be long enough to capture most keys consistently. Work to improve EMA involves tracking the cardinality of the key space over multiple intervals and adjusting the interval length automatically to keep rates stable. These enhancements are expected to make sampling behave more smoothly under variable conditions in future Refinery releases.
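To make the moving parts concrete, here is a minimal Python sketch of the ideas described above: smoothing per-key counts with an exponential moving average, turning those averages into per-key sample rates, and checking how much the key set changes between intervals. This is an illustration under stated assumptions, not Refinery's implementation. The function names, the `weight` and `goal_rate` parameters, the even split of the budget across keys, and the use of Jaccard overlap as a stability signal are all hypothetical choices made for the example.

```python
def update_ema(ema, counts, weight):
    """Blend one interval's per-key counts into the running averages.

    Assumption: a key missing from either side is treated as 0, so new
    keys ramp up gradually and vanished keys decay gradually.
    """
    keys = set(ema) | set(counts)
    return {
        k: weight * counts.get(k, 0) + (1 - weight) * ema.get(k, 0.0)
        for k in keys
    }


def sample_rates(ema, goal_rate):
    """Derive a per-key sample rate (keep 1 in N) from the averages.

    Simplification: the total keep budget (total / goal_rate) is split
    evenly across keys, so rare keys end up with low rates and common
    keys with high ones. A real sampler may weight keys differently.
    """
    total = sum(ema.values())
    if total == 0:
        return {}
    budget_per_key = total / goal_rate / len(ema)
    return {k: max(1, round(v / budget_per_key)) for k, v in ema.items()}


def key_overlap(prev_keys, curr_keys):
    """Jaccard overlap of two intervals' key sets (1.0 = identical).

    A low value suggests the interval is too short to see most keys
    consistently, which is the instability described in the text.
    """
    union = prev_keys | curr_keys
    if not union:
        return 1.0
    return len(prev_keys & curr_keys) / len(union)


if __name__ == "__main__":
    ema = update_ema({"200": 100.0}, {"200": 200, "500": 10}, weight=0.5)
    print(ema["200"], ema["500"])  # 150.0 5.0
    print(sample_rates({"200": 900, "500": 100}, goal_rate=10))
    print(key_overlap({"a", "b", "c"}, {"b", "c", "d"}))  # 0.5
```

With averages of 900 and 100 events and a goal rate of 10, the sketch keeps roughly 1 in 18 of the common key and 1 in 2 of the rare one. An overlap of 0.5 between consecutive intervals would be a hint, under these assumptions, that a longer interval is worth trying.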