Company
Date Published
Author
Dan Holloran, Elisa Binette
Word count
2708
Language
English
Hacker News points
None

Summary

At New Relic, defining and setting service level indicators (SLIs) and service level objectives (SLOs) is a core part of site reliability engineering (SRE) practice. The authors use a simplified version of the company's architecture to show how SLIs and SLOs apply to a complex, modern software platform. Focusing on system boundaries simplifies measurement and concentrates it on the measurements that matter most. The team establishes a baseline for service boundaries with one click, sets SLIs and SLOs using a simple recipe, and measures customer experience to define SLIs and SLOs for UIs. Hard dependencies, such as the network tier, require higher SLOs because they have a significant impact on the overall platform's reliability. By defining SLIs and SLOs for specific capabilities at system boundaries, combining them into single SLOs, documenting and sharing contracts, and expecting them to evolve over time, New Relic aims to help teams build resilient and reliable software architectures.
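
To make the point about hard dependencies concrete, here is a minimal sketch in Python. It assumes an SLI is measured as the fraction of good events over total events, and that hard dependencies fail independently, so their availabilities multiply. The function names and all figures are hypothetical illustrations, not taken from New Relic's tooling or data.

```python
from math import prod


def sli(good_events: int, total_events: int) -> float:
    """Return the fraction of events that met the success criterion."""
    if total_events == 0:
        # Assumption: a window with no traffic is treated as fully successful.
        return 1.0
    return good_events / total_events


def meets_slo(sli_value: float, slo_target: float) -> bool:
    """Return True when the measured SLI is at or above the SLO target."""
    return sli_value >= slo_target


def serial_availability(dependency_availabilities: list[float]) -> float:
    """Estimate the availability of a capability that needs every listed
    hard dependency to be up, assuming the dependencies fail independently."""
    return prod(dependency_availabilities)


# Hypothetical figures: 9,990 good requests out of 10,000 in the window.
measured = sli(9_990, 10_000)
print(measured, meets_slo(measured, 0.999))

# Three hard dependencies at 99.9% each give less than 99.9% combined.
print(serial_availability([0.999, 0.999, 0.999]))
```

Under these assumptions, a capability that sits on top of several hard dependencies cannot promise more than their product, which is why a shared tier such as the network needs a stricter target than the services that rely on it.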