How We Used Agentic AI to Automate Fixes for Flaky Tests
Blog post from Kong
Kong has built an agentic AI workflow to deal with flaky tests in Kong Gateway's CI: tests that fail intermittently and force costly reruns. Using Claude Code, the workflow identifies, diagnoses, and fixes flaky tests more efficiently than manual effort, which improves CI stability and saves engineering time. It is structured around an orchestrating agent, "fix-flakes," and subagents such as "flake-fixer" and "flake-verifier," which together identify flaky tests, analyze logs, propose fixes, and verify that the fixes work. The approach fixed 12 of the 15 flakiest tests identified and uncovered two previously unnoticed bugs in the codebase, while reducing token usage and keeping context management streamlined. The result has been faster PR merges, shorter CI queue times, and greater productivity and confidence in the testing framework, which demonstrates the potential of agentic coding for maintaining and scaling large engineering projects.
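The orchestrator-and-subagent flow described above can be pictured with a minimal Python sketch. This is an illustration under stated assumptions, not Kong's implementation: `CiRun`, `rank_flaky_tests`, and the `fixer` and `verifier` callables are hypothetical stand-ins for the "fix-flakes," "flake-fixer," and "flake-verifier" agents, and "flaky" is simplified to mean a test that both passed and failed in the recorded CI history.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class CiRun:
    """One recorded CI result for one test (hypothetical record shape)."""
    test_name: str
    passed: bool


def rank_flaky_tests(runs: Iterable[CiRun], top_n: int = 15) -> list[str]:
    """Return up to top_n tests that both passed and failed, worst first."""
    totals: dict[str, int] = defaultdict(int)
    failures: dict[str, int] = defaultdict(int)
    for run in runs:
        totals[run.test_name] += 1
        if not run.passed:
            failures[run.test_name] += 1
    # Assumption: a test that always fails is broken, not flaky, so skip it.
    flaky = [name for name in totals if 0 < failures[name] < totals[name]]
    # Highest failure rate first; ties are broken by name for stable output.
    flaky.sort(key=lambda name: (-failures[name] / totals[name], name))
    return flaky[:top_n]


def fix_flakes(
    runs: Iterable[CiRun],
    fixer: Callable[[str], Optional[str]],
    verifier: Callable[[str, str, int], bool],
    top_n: int = 15,
    verify_reruns: int = 20,
) -> dict[str, bool]:
    """Orchestrate: pick flaky tests, request a fix, then verify it.

    fixer stands in for a "flake-fixer" subagent and returns a patch or None.
    verifier stands in for a "flake-verifier" subagent and reports whether
    the patched test stayed green across verify_reruns reruns.
    """
    results: dict[str, bool] = {}
    for name in rank_flaky_tests(runs, top_n):
        patch = fixer(name)
        # The orchestrator keeps only a name and a boolean per test, so the
        # detailed logs stay inside the subagents (an assumed design choice).
        results[name] = patch is not None and verifier(name, patch, verify_reruns)
    return results
```

In this sketch the orchestrator never sees logs or diffs, only a pass/fail outcome per test. That is one plausible way to keep its context small; the source does not describe the actual mechanism in detail, and the `top_n` and `verify_reruns` defaults are illustrative.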