Flaky Tests: a journey to beat them all
Blog post from Kestra
Flaky tests pass or fail unpredictably without any code change, typically because of timing issues or resource contention, and they are a significant challenge in software development. At Kestra, where over 6,000 tests run across repositories, the problem became pronounced and prompted a journey to address it. The team first tried retrying failed tests with JUnit's retry annotation, but this only partially alleviated the problem: retries inflated test times and masked underlying issues. The team then focused on fixing the tests themselves, improving resource management through custom JUnit extensions, yet some tests remained problematic or had to be disabled. Ultimately, Kestra accepted that some tests will fail and flagged them as flaky, which allows them to fail in CI without affecting the overall result. This pragmatic approach separates critical tests from those allowed to fail, so the CI signal stays reliable while the flaky tests keep running and contributing coverage.
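To make the final idea concrete, here is a minimal sketch in plain Java of how a CI verdict could ignore failures from tests flagged as flaky while still reporting them. It is not Kestra's implementation: the class `FlakyAwareCi`, the `TestResult` type, and the methods `buildPasses` and `flakyFailures` are hypothetical names chosen for illustration, and a real setup would more likely rely on JUnit tags or a custom extension than on a hand-rolled aggregator.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical sketch: not Kestra's code, only an illustration of the idea.
public class FlakyAwareCi {

    // Outcome of one test. `flaky` marks a test that is allowed to fail.
    static final class TestResult {
        final String name;
        final boolean passed;
        final boolean flaky;

        TestResult(String name, boolean passed, boolean flaky) {
            this.name = name;
            this.passed = passed;
            this.flaky = flaky;
        }
    }

    // The build passes when every test NOT flagged as flaky has passed.
    static boolean buildPasses(List<TestResult> results) {
        return results.stream().allMatch(r -> r.passed || r.flaky);
    }

    // Flaky tests that failed: reported for visibility, but not blocking.
    static List<String> flakyFailures(List<TestResult> results) {
        return results.stream()
                .filter(r -> r.flaky && !r.passed)
                .map(r -> r.name)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<TestResult> results = Arrays.asList(
                new TestResult("criticalTest", true, false),
                new TestResult("timingSensitiveTest", false, true));

        System.out.println("build passes: " + buildPasses(results));
        System.out.println("flaky failures: " + flakyFailures(results));
    }
}
```

The design choice here is that a flaky flag changes only the verdict, not the execution: flagged tests still run and their failures stay visible, whereas a failing unflagged test still breaks the build.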