Company
Date Published
Author
Beth Adele Long
Word count
1209
Language
English
Hacker News points
None

Summary

New Relic has grown significantly in scale and complexity over the past few years, and now handles over 30 million HTTP requests per minute, 600 million new data points, and 50 billion events queried daily. To keep pace with this growth, the company has overhauled its reliability practices, drawing on key lessons from high-severity incidents in 2014 and 2015. The measures it has adopted include automating manual processes, defining realistic reliability metrics, making the right answer the easy one every time, acting without waiting for perfect answers, and giving teams autonomy while maintaining organizational support. Through these changes, New Relic aims to build a culture that prioritizes reliability in large-scale systems.