
Shipping on a Spent Error Budget

Blog post from Honeycomb

Post Details
Company: Honeycomb
Author: Paul Osman
Word Count: 1,030
Language: English
Summary

Engineering teams routinely have to balance high availability against shipping new features, and service level objectives (SLOs) and error budgets give them a way to prioritize that work. Honeycomb ran into this tradeoff when adding OpenTelemetry Protocol (OTLP) support to its ingest service: an incident caused high response times and burned error budgets. To fix the problem without affecting existing traffic, Honeycomb created a separate cluster for OTLP traffic, using the gRPC support in AWS Application Load Balancers. The separate cluster let the team iterate and test safely while keeping its main services reliable. The incident showed how SLOs can frame discussions about product development and infrastructure investment, and the approach ultimately let Honeycomb deploy OTLP support and gather customer feedback.
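As a rough illustration of the error-budget arithmetic behind this kind of decision, the sketch below computes how many failures an SLO allows in a window and how much of that budget is left. The 99.9% target, the event counts, and the function names are all hypothetical; they are not Honeycomb's actual SLOs or tooling.

```python
def error_budget(slo_target: float, total_events: int) -> int:
    """Number of events allowed to fail in the window under the SLO.

    slo_target is a fraction, e.g. 0.999 for a hypothetical 99.9% SLO.
    """
    return round((1.0 - slo_target) * total_events)


def budget_remaining(slo_target: float, total_events: int, failed_events: int) -> float:
    """Fraction of the error budget still unspent.

    A negative result means the budget is overspent ("burned").
    """
    budget = error_budget(slo_target, total_events)
    if budget == 0:
        # A 100% target leaves no room for failure at all.
        return 1.0 if failed_events == 0 else 0.0
    return (budget - failed_events) / budget


# Hypothetical numbers: 1,000,000 requests in the window, 99.9% target.
allowed = error_budget(0.999, 1_000_000)
healthy = budget_remaining(0.999, 1_000_000, 250)
burned = budget_remaining(0.999, 1_000_000, 1_500)
print(allowed, healthy, burned)
```

With these assumed numbers, a service that has already failed more requests than the budget allows gets a negative remaining fraction. That is the situation in which a team has to decide whether to pause feature work or, as in the post, isolate the risky change from existing traffic.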