How to run an adversarial game day
Blog post from New Relic
New Relic runs "adversarial game days," a variant of chaos engineering, to assess the resilience and reliability of its systems by simulating failures and observing how teams respond. The exercises, inspired by the practice of deliberately injecting faults into systems, reveal gaps in a team's mental models and operational processes so that the team is better prepared for real-world incidents. Game days bring cross-functional groups together and test communication tools and incident response strategies in environments that mimic production. The outcomes highlight areas for improvement, such as enhancing monitoring and alerting, refining incident response protocols, and updating documentation, with the ultimate aim of reducing mean time to resolution (MTTR) during actual failures. Beyond strengthening each team's understanding of its systems, the exercises encourage teams to share insights and best practices with one another, fostering a culture of continuous improvement and preparedness.
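Since the stated goal is a lower MTTR, it can help to see how that number is derived from game-day timings. The following is a minimal sketch, not New Relic's tooling: the `Incident` record, the `mean_time_to_resolution` helper, and the sample timings are all hypothetical, and it assumes resolution time is measured from detection to resolution.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    """One simulated failure from a game day (hypothetical record shape)."""
    name: str
    detected_at: datetime
    resolved_at: datetime

    @property
    def time_to_resolution(self) -> timedelta:
        # Assumption: the clock starts at detection, not at fault injection.
        return self.resolved_at - self.detected_at


def mean_time_to_resolution(incidents: list[Incident]) -> timedelta:
    """Average time from detection to resolution across incidents."""
    if not incidents:
        raise ValueError("need at least one incident to compute MTTR")
    total = sum((i.time_to_resolution for i in incidents), timedelta())
    return total / len(incidents)


# Illustrative, made-up timings; not data from New Relic.
game_day = [
    Incident("database failover", datetime(2024, 1, 10, 10, 0), datetime(2024, 1, 10, 10, 30)),
    Incident("queue backlog", datetime(2024, 1, 10, 11, 0), datetime(2024, 1, 10, 11, 50)),
    Incident("expired certificate", datetime(2024, 1, 10, 13, 0), datetime(2024, 1, 10, 13, 10)),
]

mttr = mean_time_to_resolution(game_day)
print(f"MTTR: {mttr.total_seconds() / 60:.0f} minutes")  # prints "MTTR: 30 minutes"
```

Tracking the same figure across successive game days gives a rough signal of whether changes to monitoring, response protocols, and documentation are shortening resolution times.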