"Darwin-27B-Opus: Surpassing the Foundation Model Without Training"
Blog post from Hugging Face
Darwin-27B-Opus is a 27-billion-parameter model that scored 86.9% on the GPQA Diamond benchmark, placing it fifth globally, without undergoing any training. The result challenges the usual recipe for improving language models: more data, more GPUs, and longer training runs. Instead, Darwin-27B-Opus uses an approach called evolutionary crossbreeding, which reorganizes the knowledge already present in pretrained models by transplanting Feed-Forward Network (FFN) layers between architecturally compatible models while leaving the attention layers intact. The Covariance Matrix Adaptation Evolution Strategy (CMA-ES) optimizes the layer-specific blending ratios, and the resulting model outperforms much larger models without additional training. The findings suggest that the open-source model ecosystem holds substantial latent value, with potential implications for reducing compute requirements and for advancing model development through composition, using existing models as building blocks rather than starting from scratch.
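To make the idea of layer-specific blending ratios concrete, here is a minimal sketch in Python with NumPy. It is not the Darwin-27B-Opus implementation. Everything in it is an assumption made for illustration: the toy "models" are lists of dictionaries holding random matrices, the names `blend_ffn`, `evolve_ratios`, and `toy_fitness` are hypothetical, and the search loop is a simplified evolution strategy (isotropic Gaussian sampling with a decaying step size) standing in for CMA-ES, which additionally adapts a full covariance matrix. In a real setting the fitness function would presumably be a benchmark score for the merged model rather than the synthetic distance used here.

```python
import numpy as np


def blend_ffn(parent_a, parent_b, ratios):
    """Build a child model layer by layer.

    Attention weights are copied unchanged from parent_a; FFN weights are a
    per-layer linear blend, where ratio 0 keeps parent_a's FFN and ratio 1
    transplants parent_b's FFN. (Hypothetical helper, for illustration only.)
    """
    child = []
    for layer_a, layer_b, r in zip(parent_a, parent_b, ratios):
        child.append({
            "attn": layer_a["attn"].copy(),
            "ffn": (1.0 - r) * layer_a["ffn"] + r * layer_b["ffn"],
        })
    return child


def evolve_ratios(fitness, n_layers, generations=30, pop_size=16, sigma=0.3, seed=0):
    """Simplified evolution strategy over per-layer ratios in [0, 1].

    Stand-in for CMA-ES: no covariance adaptation, only a mean update from
    the top quarter of each generation and a geometrically decaying sigma.
    Returns the best ratio vector seen and its fitness (higher is better).
    """
    rng = np.random.default_rng(seed)
    mean = np.full(n_layers, 0.5)
    best, best_score = mean.copy(), fitness(mean)
    for _ in range(generations):
        noise = rng.standard_normal((pop_size, n_layers))
        population = np.clip(mean + sigma * noise, 0.0, 1.0)
        scores = np.array([fitness(candidate) for candidate in population])
        order = np.argsort(scores)[::-1]              # best candidates first
        mean = population[order[: pop_size // 4]].mean(axis=0)
        if scores[order[0]] > best_score:             # keep the best-so-far
            best, best_score = population[order[0]].copy(), scores[order[0]]
        sigma *= 0.9
    return best, best_score


# Toy setup: two random "parents" with the same (assumed) architecture.
data_rng = np.random.default_rng(42)
n_layers, dim = 4, 8


def random_model():
    return [
        {"attn": data_rng.standard_normal((dim, dim)),
         "ffn": data_rng.standard_normal((dim, dim))}
        for _ in range(n_layers)
    ]


parent_a, parent_b = random_model(), random_model()

# Synthetic objective: pretend an ideal child exists at some hidden ratios.
hidden_ratios = np.array([0.1, 0.8, 0.3, 0.6])
target = blend_ffn(parent_a, parent_b, hidden_ratios)


def toy_fitness(ratios):
    """Negative squared distance between the child's FFNs and the target's."""
    child = blend_ffn(parent_a, parent_b, ratios)
    return -sum(np.sum((c["ffn"] - t["ffn"]) ** 2) for c, t in zip(child, target))


best_ratios, best_score = evolve_ratios(toy_fitness, n_layers)
print("best ratios found:", np.round(best_ratios, 3))
```

The design point the sketch tries to capture is that the search space is tiny compared with training: one scalar per layer rather than billions of weights, so each candidate costs only a merge and an evaluation, with no gradient computation.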