8 Takeaways From New Relic’s New SRE Handbook

Post Details

Company

New Relic

Date Published

June 1, 2018

Author

Fredric Paul, Editor in Chief

Word Count

835

Company Posts That Month

12

Language

English

Hacker News Points

-

Source URL

newrelic.com/blog/best-practices/site-reliability-engineering-sre-handbook

Summary

Site Reliability Engineering (SRE) is a critical function at many companies. The discipline was created at Google by Benjamin Treynor Sloss to keep production systems healthy at scale. SREs are in demand at both tech companies and legacy enterprises, and LinkedIn has ranked the role among the most promising jobs in tech. SRE is often described as the purest form of DevOps: its goal is greater reliability with less manual intervention as a system scales. To reach that goal, SREs rely on automation to improve the reliability of everything they touch without slowing down software delivery. Successful SREs have to think big, weighing how their work affects the larger infrastructure and making decisions that span multiple systems and teams. They use service level objectives (SLOs) to track reliability, prioritize their efforts, and adjust course in line with company strategy. The SRE role is spreading to more companies, and each organization defines it in its own way, from tech giants like Google to smaller companies like New Relic.
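As a rough illustration of tracking reliability against an SLO, the sketch below compares a measured success ratio with a target. It is a minimal example, not a method taken from the handbook: the function name `evaluate_slo`, the `SloReport` type, the request counts, and the 99.9% target are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SloReport:
    """Result of comparing a measured success ratio with an SLO target."""
    target: float    # e.g. 0.999 means 99.9% of events should succeed
    measured: float  # observed fraction of successful events
    met: bool        # True when measured >= target


def evaluate_slo(good_events: int, total_events: int, target: float) -> SloReport:
    """Compare the observed success ratio with the SLO target (hypothetical helper)."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    if not 0 <= good_events <= total_events:
        raise ValueError("good_events must be between 0 and total_events")
    measured = good_events / total_events
    return SloReport(target=target, measured=measured, met=measured >= target)


# Illustrative numbers only: 1,000,000 requests, 999,412 of them successful.
report = evaluate_slo(good_events=999_412, total_events=1_000_000, target=0.999)
print(f"target={report.target:.3%} measured={report.measured:.3%} met={report.met}")
```

A team could run a check like this per service and per time window, then use the results to decide where reliability work matters most.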
