Company
Date Published
Author
Matthew Flaming
Word count
1376
Language
English
Hacker News points
None

Summary

Reliability is hard to achieve in modern software development: teams contend with edge cases, variations across architectures and tiers, and the complexity of scale. To address these challenges, engineering managers at New Relic ask seven questions to check whether their teams follow reliability best practices. The questions cover robust deploy and rollback tooling, game-day testing, reliable pre-production testing, an up-to-date risk matrix, adequate free capacity, defensive rate limiting, and whether systems can scale for the next 12 months without meaningful architectural changes. By assessing these areas regularly, teams can identify gaps and improve their reliability.