Company
Date Published
Author
Caitlin Halla
Word count
850
Language
English
Hacker News points
None

Summary

New Relic practices chaos engineering as part of its reliability work: teams carefully inject harm into their systems to test efficiency and resiliency under hazardous conditions and to prepare for outages. The company used its GraphQL API as an entry point for internal chaos engineering and built a tool called Chaos Panda to test against latency and field errors in the API. With Chaos Panda, teams can configure the API to add latency or to make certain fields fail at specific failure rates, which helps them find places where their services' code could be more resilient. The tool is designed to be simple and easy to use: a mutation kicks off a chaos session for specific fields and slows their responses by a specified amount of time.
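
The summary does not show how Chaos Panda is implemented or what its mutation looks like, so the following is only a minimal sketch of the general idea it describes: per-field settings that add latency or fail a field at a given rate. Every name here (`ChaosConfig`, `ChaosSession`, `configure`, `wrap`) is hypothetical and chosen for illustration; none of them comes from New Relic's API.

```python
import random
import time
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class ChaosConfig:
    """Hypothetical per-field chaos settings (illustrative names only)."""
    latency_seconds: float = 0.0  # delay added before the field resolves
    failure_rate: float = 0.0     # probability in [0, 1] that the field errors


class ChaosSession:
    """Holds chaos settings for the fields targeted in one session."""

    def __init__(
        self,
        rng: Optional[random.Random] = None,
        sleep: Callable[[float], Any] = time.sleep,
    ) -> None:
        self._fields: Dict[str, ChaosConfig] = {}
        self._rng = rng or random.Random()
        self._sleep = sleep  # injectable so tests need not really wait

    def configure(self, field_name: str, config: ChaosConfig) -> None:
        """Register (or replace) the chaos settings for one field."""
        if not 0.0 <= config.failure_rate <= 1.0:
            raise ValueError("failure_rate must be between 0 and 1")
        if config.latency_seconds < 0:
            raise ValueError("latency_seconds must not be negative")
        self._fields[field_name] = config

    def wrap(self, field_name: str, resolver: Callable[..., Any]) -> Callable[..., Any]:
        """Return a resolver that applies this session's chaos to the field."""

        def chaotic_resolver(*args: Any, **kwargs: Any) -> Any:
            config = self._fields.get(field_name)
            if config is None:
                # Field is not targeted: behave exactly like the original.
                return resolver(*args, **kwargs)
            if config.latency_seconds > 0:
                self._sleep(config.latency_seconds)
            # random() returns a value in [0, 1), so a rate of 0 never
            # fails and a rate of 1 always fails.
            if self._rng.random() < config.failure_rate:
                raise RuntimeError(f"chaos: injected failure for field '{field_name}'")
            return resolver(*args, **kwargs)

        return chaotic_resolver
```

In this sketch, a client team would call something like `session.configure("account", ChaosConfig(latency_seconds=2.0, failure_rate=0.25))` and then watch how its service copes with slow or failing responses for that one field. Looking up the settings at call time, rather than when the resolver is wrapped, lets a session be started or changed without rebuilding the resolvers.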