
Comfy Internals | How we got four rival AI labs to fight over our code reviews

Blog post from Comfy

Post Details
Company: Comfy
Date Published:
Author: Matt Miller
Word Count: 2,023
Language: English
Hacker News Points: -
Summary

Comfy built a code review system that sends each pull request (PR) diff to AI models from four labs: OpenAI, Anthropic, Google, and Moonshot. The premise is that models from different labs bring different perspectives, so together they cover blind spots that any single model might have. Each model reviews the diff in two passes, and a single judge model consolidates the results. The goal is to catch nuanced bugs, such as concurrency issues and API contract drift, that human reviewers might miss through fatigue and that models sharing the same training priors might all overlook. The system runs in continuous integration (CI) as a $200/month GitHub Action and is designed to resist manipulation by malicious PRs. So far it has identified significant bugs in about 110 PRs, and the architecture is open-sourced to encourage further development and feedback from the engineering community.
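The fan-out, two-pass, single-judge flow described in the summary can be sketched roughly as follows. This is a minimal illustration, not Comfy's implementation: the `Finding` type, the `fan_out` and `judge` functions, and the stub reviewers are all hypothetical names, no real model API is called, and the judge here is a deterministic ranking by cross-lab agreement standing in for the judge model the post describes.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable


# Hypothetical data model; the summary does not describe the real one.
@dataclass(frozen=True)
class Finding:
    file: str
    line: int
    issue: str


# A reviewer takes (diff, pass_number) and returns its findings.
Reviewer = Callable[[str, int], list[Finding]]

PASSES_PER_MODEL = 2  # from the summary: two review passes per model


def fan_out(diff: str, reviewers: dict[str, Reviewer]) -> dict[Finding, set[str]]:
    """Run every (lab, pass) review in parallel; map each finding to the labs that reported it."""
    jobs = [(lab, n) for lab in reviewers for n in range(1, PASSES_PER_MODEL + 1)]
    reported: dict[Finding, set[str]] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(jobs))) as pool:
        results = pool.map(lambda job: (job[0], reviewers[job[0]](diff, job[1])), jobs)
        for lab, findings in results:
            for finding in findings:
                reported.setdefault(finding, set()).add(lab)
    return reported


def judge(reported: dict[Finding, set[str]]) -> list[tuple[Finding, set[str]]]:
    """Assumed stand-in for the judge model: rank findings by how many labs agree."""
    return sorted(reported.items(), key=lambda kv: (-len(kv[1]), kv[0].file, kv[0].line))


def make_stub(findings_by_pass: dict[int, list[Finding]]) -> Reviewer:
    """Build a fake reviewer that returns canned findings per pass (illustration only)."""
    def review(diff: str, pass_number: int) -> list[Finding]:
        return findings_by_pass.get(pass_number, [])
    return review


race = Finding("cache.py", 42, "unlocked write to shared dict")
drift = Finding("api.py", 7, "response field renamed without version bump")

# Four stub reviewers, keyed by lab name; real ones would call each lab's model.
reviewers = {
    "openai": make_stub({1: [race]}),
    "anthropic": make_stub({1: [race], 2: [drift]}),
    "google": make_stub({2: [race]}),
    "moonshot": make_stub({}),
}

ranked = judge(fan_out("<diff text>", reviewers))
for finding, labs in ranked:
    print(f"{finding.file}:{finding.line} {finding.issue} (labs: {sorted(labs)})")
```

In this sketch, a finding raised by several labs ranks above one raised by a single lab. That is only one plausible consolidation rule; a judge model could equally well keep a single-lab finding if it is well argued, which is the kind of blind-spot coverage the summary emphasizes.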