Simplifying Chaos Fault Validation with an E2E Testing Framework
Blog post from Harness
Vedant Shrotria describes a developer-friendly end-to-end (E2E) testing framework for chaos fault validation, built to reduce setup friction while preserving control and correctness in chaos engineering. Previously, validating a chaos fault meant manually installing dependencies, configuring environment variables, and writing YAML-based workflows, which slowed feedback loops and hindered adoption. The new framework offers an API-driven model, real-time log streaming, and intelligent target discovery, along with dual-phase validation that verifies both the fault's impact and the system's recovery. Its architecture comprises an Experiment Runner that orchestrates the experiment lifecycle, an Experiment Monitor that tracks status, and a Validation Framework that verifies concrete chaos impact. Together these allow faster execution and better scalability, with test setup now taking under five minutes. Although the framework is proprietary, the post highlights practices that apply to similar testing infrastructure, emphasizing developer experience, automation, and knowledge sharing.
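Because the framework itself is proprietary, the dual-phase validation idea can only be illustrated with a stand-in. The sketch below is a minimal, hypothetical Python model, not the Harness implementation: `FakeTarget`, `ExperimentRunner`, `ValidationResult`, `Phase`, and `run_replica_loss_demo` are all invented names, and the "fault" simply lowers a replica counter in memory. It shows one plausible shape for the flow the summary describes: inject the fault, check that the impact is observable, revert, then check that the target recovered.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Callable, List, Tuple


class Phase(Enum):
    """Hypothetical lifecycle states a monitor could report."""
    PENDING = "pending"
    INJECTING = "injecting"
    RECOVERING = "recovering"
    COMPLETED = "completed"


@dataclass
class FakeTarget:
    """In-memory stand-in for a real workload (assumption for this sketch)."""
    healthy_replicas: int = 3


@dataclass
class ValidationResult:
    impact_ok: bool    # phase 1: the fault had a measurable effect
    recovery_ok: bool  # phase 2: the target returned to its baseline

    @property
    def passed(self) -> bool:
        return self.impact_ok and self.recovery_ok


@dataclass
class ExperimentRunner:
    """Hypothetical runner: drives the lifecycle and records each phase."""
    inject: Callable[[], None]
    revert: Callable[[], None]
    impact_check: Callable[[], bool]
    recovery_check: Callable[[], bool]
    phases: List[Phase] = field(default_factory=lambda: [Phase.PENDING])

    def run(self) -> ValidationResult:
        self.phases.append(Phase.INJECTING)
        self.inject()
        try:
            # Phase 1: validate impact while the fault is still active.
            impact_ok = self.impact_check()
        finally:
            # Always revert, even if the impact check raises.
            self.phases.append(Phase.RECOVERING)
            self.revert()
        # Phase 2: validate recovery after the fault is removed.
        recovery_ok = self.recovery_check()
        self.phases.append(Phase.COMPLETED)
        return ValidationResult(impact_ok, recovery_ok)


def run_replica_loss_demo(lost: int = 1) -> Tuple[ValidationResult, List[Phase]]:
    """Simulate losing `lost` replicas and validate both phases."""
    target = FakeTarget()
    baseline = target.healthy_replicas
    runner = ExperimentRunner(
        inject=lambda: setattr(target, "healthy_replicas", baseline - lost),
        revert=lambda: setattr(target, "healthy_replicas", baseline),
        impact_check=lambda: target.healthy_replicas < baseline,
        recovery_check=lambda: target.healthy_replicas == baseline,
    )
    result = runner.run()
    return result, runner.phases


if __name__ == "__main__":
    result, phases = run_replica_loss_demo()
    print(result.passed, [p.value for p in phases])
```

Keeping the impact and recovery checks as separate callables means a fault that silently does nothing fails the run: calling `run_replica_loss_demo(lost=0)` leaves the replica count unchanged, so the impact check returns false even though the recovery check succeeds.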