Training-Free Reasoning at 88.89% on GPQA Diamond: How Darwin Family Hit Frontier Scores Without a Single Gradient Step
Blog post from Hugging Face
VIDRAFT's Darwin Family introduces an approach to building frontier-level reasoning large language models (LLMs) without any gradient-based training. Instead of fine-tuning, the framework recombines the weight spaces of existing model checkpoints using a 14-dimensional adaptive genome, MRI-Trust Fusion, and an Architecture Mapper that lets checkpoints with different architectures be merged. The result is Darwin-28B-Opus, a model that scores 88.89% on the challenging GPQA Diamond benchmark without a single gradient step. Because no training run is required, the approach sharply reduces the compute cost typically associated with producing high-capability models, and it demonstrates that open-source LLMs already contain latent capabilities that recombination can unlock. The framework's success suggests a shift in focus from traditional training toward extracting and recombining existing model capabilities, potentially lowering the barrier to producing state-of-the-art reasoning models.
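The post does not publish the internals of the genome, MRI-Trust Fusion, or the Architecture Mapper, but the core idea of genome-driven checkpoint recombination can be sketched with a much simpler, hypothetical stand-in: a per-layer-group interpolation of two checkpoints' parameter tensors, where a small "genome" vector of mixing coefficients decides how much each group borrows from each parent. The function name `merge_checkpoints`, the layer-group prefixes, and the genome layout below are all illustrative assumptions, not Darwin's actual method.

```python
import numpy as np

def merge_checkpoints(ckpt_a, ckpt_b, genome, default_alpha=0.5):
    """Interpolate two checkpoints' parameter tensors, group by group.

    `genome` maps a parameter-name prefix (a layer group) to a mixing
    coefficient in [0, 1]: 0 keeps model A's weights, 1 takes model B's.
    This is a toy stand-in for Darwin's 14-dimensional adaptive genome,
    whose real semantics are not public.
    """
    merged = {}
    for name, w_a in ckpt_a.items():
        w_b = ckpt_b[name]  # assumes identical architectures / key sets
        # pick the coefficient for this parameter's layer group
        alpha = next((g for prefix, g in genome.items()
                      if name.startswith(prefix)), default_alpha)
        merged[name] = (1.0 - alpha) * w_a + alpha * w_b
    return merged

# toy two-layer "checkpoints"
ckpt_a = {"attn.w": np.zeros((2, 2)), "mlp.w": np.zeros((2, 2))}
ckpt_b = {"attn.w": np.ones((2, 2)),  "mlp.w": np.ones((2, 2))}
genome = {"attn": 0.25, "mlp": 0.75}  # per-group mixing coefficients

merged = merge_checkpoints(ckpt_a, ckpt_b, genome)
# attention weights sit 25% of the way toward model B, MLP weights 75%
```

A real recombination system would also need the Architecture Mapper's role, i.e. aligning parameters across checkpoints whose shapes and layer counts differ, which this sketch sidesteps by requiring identical key sets.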