Hands-On Evaluation of NVIDIA Nemotron 3 Super: Punching Above Its Weight Class
Blog post from Greptile
Greptile is collaborating with multiple AI labs to build agents that review and test code changes, with a focus on validating pull requests. One of the models it evaluated is NVIDIA's Nemotron 3 Super, a 120-billion-parameter hybrid model that is recommended for multi-agent workflows because of its high accuracy and long context window. Greptile ran the model through an internal evaluation harness on a dataset of buggy code changes and found it highly effective despite being smaller than other frontier models. In one test, a 134KB diff spanning 19 files, Nemotron 3 Super flagged a significant CORS regression as well as smaller logic errors, and it did so quickly and with minimal exploration. The model performed well on issues that can be identified from the patch itself, but it was less effective when finding the issue required deeper context. Overall, Nemotron 3 Super showed strong potential as a first-pass code review tool that combines speed, efficiency, and accuracy, which makes it a promising candidate for further development and integration into code validation workflows.
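To make the evaluation setup concrete, here is a minimal sketch of how a reviewer could be scored against a dataset of buggy code changes. It is not Greptile's harness, whose internals the post summary does not describe. Every name in it (`Finding`, `BuggyChange`, `is_hit`, `recall`, `stub_reviewer`), the file paths, and the three-line matching tolerance are assumptions made for illustration, and the stub stands in for a real model call.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Finding:
    """A review comment or known bug: a file path and a line number (hypothetical shape)."""
    path: str
    line: int


@dataclass(frozen=True)
class BuggyChange:
    """One dataset entry: a diff plus the known bug locations inside it."""
    diff: str
    known_bugs: tuple[Finding, ...]


def is_hit(bug: Finding, findings: Iterable[Finding], tolerance: int = 3) -> bool:
    """Count a bug as caught if any finding is in the same file within `tolerance` lines."""
    return any(
        f.path == bug.path and abs(f.line - bug.line) <= tolerance
        for f in findings
    )


def recall(
    dataset: Iterable[BuggyChange],
    reviewer: Callable[[str], list[Finding]],
) -> float:
    """Fraction of known bugs that the reviewer flagged across the whole dataset."""
    caught = 0
    total = 0
    for change in dataset:
        findings = reviewer(change.diff)
        for bug in change.known_bugs:
            total += 1
            caught += is_hit(bug, findings)
    return caught / total if total else 0.0


def stub_reviewer(diff: str) -> list[Finding]:
    """Stand-in for a model call: flags one fixed location when the diff mentions CORS."""
    return [Finding("server/cors.py", 12)] if "CORS" in diff else []


# Two made-up entries: one bug visible in the patch, one the stub cannot see.
dataset = [
    BuggyChange(diff="+ allow all origins  # CORS", known_bugs=(Finding("server/cors.py", 10),)),
    BuggyChange(diff="+ page = offset / size", known_bugs=(Finding("utils/paging.py", 40),)),
]
score = recall(dataset, stub_reviewer)
```

Recall on known bugs is only one possible metric. A fuller harness would likely also track false positives, latency, and how much of the repository the reviewer explored, since the summary highlights speed and minimal exploration alongside accuracy.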