Training-Free Reasoning at 88.89% on GPQA Diamond: How Darwin Family Hit Frontier Scores Without a Single Gradient Step
Blog post from Hugging Face
VIDRAFT's Darwin Family introduces an approach to building frontier-level reasoning large language models (LLMs) without any gradient-based training. Instead of fine-tuning, the framework recombines the weight spaces of existing model checkpoints using a 14-dimensional adaptive genome, MRI-Trust Fusion, and an Architecture Mapper that lets checkpoints with different architectures be merged. The result is Darwin-28B-Opus, a model that scores 88.89% on the challenging GPQA Diamond benchmark without a single gradient step. Because no training run is required, the approach sharply reduces the compute cost typically associated with producing high-capability models, and it demonstrates that open-source LLMs already contain latent capabilities that recombination can unlock. The framework's success suggests a shift in focus from traditional training toward extracting and recombining existing model capabilities, potentially lowering the barrier to producing state-of-the-art reasoning models.
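The post does not publish the internals of the genome, MRI-Trust Fusion, or the Architecture Mapper, but the core idea of genome-driven checkpoint recombination can be sketched with a much simpler, hypothetical stand-in: a per-layer-group interpolation of two checkpoints' parameter tensors, where a small "genome" vector of mixing coefficients decides how much each group borrows from each parent. The function name `merge_checkpoints`, the layer-group prefixes, and the genome layout below are all illustrative assumptions, not Darwin's actual method.

```python
import numpy as np

def merge_checkpoints(ckpt_a, ckpt_b, genome, default_alpha=0.5):
    """Interpolate two checkpoints' parameter tensors, group by group.

    `genome` maps a parameter-name prefix (a layer group) to a mixing
    coefficient in [0, 1]: 0 keeps model A's weights, 1 takes model B's.
    This is a toy stand-in for Darwin's 14-dimensional adaptive genome,
    whose real semantics are not public.
    """
    merged = {}
    for name, w_a in ckpt_a.items():
        w_b = ckpt_b[name]  # assumes identical architectures / key sets
        # pick the coefficient for this parameter's layer group
        alpha = next((g for prefix, g in genome.items()
                      if name.startswith(prefix)), default_alpha)
        merged[name] = (1.0 - alpha) * w_a + alpha * w_b
    return merged

# toy two-layer "checkpoints"
ckpt_a = {"attn.w": np.zeros((2, 2)), "mlp.w": np.zeros((2, 2))}
ckpt_b = {"attn.w": np.ones((2, 2)),  "mlp.w": np.ones((2, 2))}
genome = {"attn": 0.25, "mlp": 0.75}  # per-group mixing coefficients

merged = merge_checkpoints(ckpt_a, ckpt_b, genome)
# attention weights sit 25% of the way toward model B, MLP weights 75%
```

A real recombination system would also need the Architecture Mapper's role, i.e. aligning parameters across checkpoints whose shapes and layer counts differ, which this sketch sidesteps by requiring identical key sets.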