Building a Self-Fuzzing CI Pipeline for Dragonfly
Blog post from Dragonfly
To make Dragonfly, a high-performance in-memory data store compatible with Redis and Memcached, more robust, the team moved from manual fuzzing to an automated CI pipeline in which an LLM generates targeted attack vectors for every pull request. The first attempt was a basic setup called df-afl that, despite its rudimentary design, showed the potential of fuzzing by finding real bugs through random command generation. Integrating AFL++ into Dragonfly's build system, particularly through persistent mode, significantly improved efficiency by processing multiple inputs in a continuous loop, and it uncovered edge-case bugs that traditional unit tests missed. The team also developed custom protocol-aware mutators that operate at the command level rather than the byte level, which helps the fuzzer reach command logic and improves detection of logic bugs. The CI integration runs nightly fuzzing campaigns and, on pull requests, targeted fuzzing with LLM-generated seeds aimed at the specific code paths affected by recent changes. The system caught potential production issues early and streamlined bug reproduction through features like AFL_PERSISTENT_RECORD. The focus now shifts to extending fuzzing to Dragonfly's cluster mode and replication and to developing mechanisms for hang detection. The fuzzing infrastructure is available as open source for further improvement.
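To illustrate what mutating at the command level rather than the byte level can look like, here is a minimal sketch of a protocol-aware mutator for RESP, the Redis wire protocol. It is not Dragonfly's actual mutator: the `COMMANDS` table, the `INTERESTING` values, and the helper names are hypothetical, and the `init`/`fuzz` hooks are assumed to follow the shape of AFL++'s Python custom mutator interface.

```python
import random

# Hypothetical command table (name -> argument count), for illustration only.
COMMANDS = {b"SET": 2, b"GET": 1, b"LPUSH": 2, b"EXPIRE": 2}
# Hypothetical boundary values that tend to exercise argument parsing.
INTERESTING = [b"", b"0", b"-1", b"9223372036854775807"]

_rng = random.Random()


def encode_resp(args):
    """Encode one command as a RESP array of bulk strings."""
    out = b"*%d\r\n" % len(args)
    for arg in args:
        out += b"$%d\r\n%s\r\n" % (len(arg), arg)
    return out


def parse_resp(data):
    """Parse concatenated RESP arrays, stopping at the first malformed one."""
    commands, pos = [], 0
    try:
        while pos < len(data) and data[pos:pos + 1] == b"*":
            end = data.index(b"\r\n", pos)
            count = int(data[pos + 1:end])
            pos = end + 2
            args = []
            for _ in range(count):
                if data[pos:pos + 1] != b"$":
                    raise ValueError("expected bulk string")
                end = data.index(b"\r\n", pos)
                size = int(data[pos + 1:end])
                start = end + 2
                args.append(bytes(data[start:start + size]))
                pos = start + size + 2
            commands.append(args)
    except ValueError:
        pass  # keep whatever parsed cleanly before the malformed part
    return commands


def mutate_commands(commands, rng):
    """Apply one command-level mutation: insert, tweak an argument, or duplicate."""
    commands = [list(cmd) for cmd in commands]
    choice = rng.randrange(3)
    if choice == 0 or not commands:
        name = rng.choice(sorted(COMMANDS))
        args = [name] + [b"k%d" % rng.randrange(4) for _ in range(COMMANDS[name])]
        commands.insert(rng.randrange(len(commands) + 1), args)
    elif choice == 1:
        cmd = rng.choice(commands)
        if len(cmd) > 1:  # never overwrite the command name itself
            cmd[rng.randrange(1, len(cmd))] = rng.choice(INTERESTING)
    else:
        commands.append(list(rng.choice(commands)))
    return commands


def init(seed):
    """Hook assumed to be called once by the fuzzer with a seed."""
    _rng.seed(seed)


def fuzz(buf, add_buf, max_size):
    """Hook assumed to be called per input; returns the mutated test case."""
    commands = mutate_commands(parse_resp(bytes(buf)), _rng)
    out = b"".join(encode_resp(cmd) for cmd in commands)
    # Fall back to the original input rather than truncating mid-command.
    return bytearray(out) if len(out) <= max_size else bytearray(buf)
```

Because every mutation is re-encoded with correct length prefixes, the output stays syntactically valid RESP, so the fuzzer's effort goes into command handling instead of being rejected by the protocol parser.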
| Trend | Post Mentions | Total Month Mentions | Posts | Companies | MoM |
|---|---|---|---|---|---|
| LLM | 4 | 6,078 | 960 | 218 | +18% |