Continuous Verification with Steadybit: Boost Resilience
Blog post from Steadybit
Continuous verification through resilience testing, as exemplified by Steadybit, builds confidence in complex systems by confirming that they remain reliable as they evolve. This article discusses how Steadybit brings chaotic and turbulent conditions into existing testing methodologies, such as unit, integration, end-to-end, and performance/load tests, through experiments that combine attacks, checks, probes, and arbitrary actions. In one example, Steadybit was used to address deployment issues in a Kubernetes environment that were caused by an AWS ALB misconfiguration related to sticky sessions. The example shows how observations from resilience testing can be turned into fast, repeatable, and cost-effective verification checks that also serve as documentation. These checks verify that no rollouts are pending, all pods are ready, no synthetic checks are degraded, and API calls succeed. They are built with Steadybit's experiment designer and its integrations with other tools such as Checkly and Prometheus. The article underscores the importance of continuous learning and verification for keeping the understanding of system risks current and for increasing confidence in the system through automation.
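
To make the four verification checks more concrete, here is a minimal Python sketch that evaluates them against a simplified snapshot of system state. It is only an illustration of the idea and is not Steadybit's API: the `SystemSnapshot` type, its fields, and the function names are all hypothetical. In a real setup the values would come from the Kubernetes API, a synthetic monitoring tool such as Checkly, and live HTTP calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SystemSnapshot:
    """Hypothetical, simplified view of the system at one point in time."""
    desired_replicas: int                 # replicas the deployment asks for
    updated_replicas: int                 # replicas already on the new version
    ready_replicas: int                   # replicas passing readiness probes
    degraded_synthetic_checks: List[str]  # names of degraded synthetic checks
    api_status_codes: List[int]           # HTTP status codes of sampled API calls


def no_pending_rollout(s: SystemSnapshot) -> bool:
    # The rollout is finished once every replica runs the new version.
    return s.updated_replicas == s.desired_replicas


def all_pods_ready(s: SystemSnapshot) -> bool:
    return s.ready_replicas == s.desired_replicas


def no_degraded_synthetic_checks(s: SystemSnapshot) -> bool:
    return not s.degraded_synthetic_checks


def api_calls_successful(s: SystemSnapshot) -> bool:
    # Require at least one sampled call, and every call must return 2xx.
    return bool(s.api_status_codes) and all(
        200 <= code < 300 for code in s.api_status_codes
    )


# The ordered list of checks doubles as documentation of what "healthy" means.
CHECKS: Dict[str, Callable[[SystemSnapshot], bool]] = {
    "no pending rollouts": no_pending_rollout,
    "all pods ready": all_pods_ready,
    "no degraded synthetic checks": no_degraded_synthetic_checks,
    "API calls successful": api_calls_successful,
}


def verify(snapshot: SystemSnapshot) -> Dict[str, bool]:
    """Run every check and report pass/fail per check name."""
    return {name: check(snapshot) for name, check in CHECKS.items()}


def failed_checks(snapshot: SystemSnapshot) -> List[str]:
    """Return the names of the checks that did not pass, in definition order."""
    return [name for name, ok in verify(snapshot).items() if not ok]


if __name__ == "__main__":
    healthy = SystemSnapshot(3, 3, 3, [], [200, 200, 204])
    print(failed_checks(healthy))  # prints []
```

Keeping each check as a small named function means the same set can be run before, during, and after an experiment, and a failing check name points directly at the assumption that no longer holds.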