Unsloth Dynamic 2.0 GGUFs
Blog post from Unsloth
Unsloth has introduced Dynamic v2.0, a significant upgrade to its quantization method that sets new benchmarks for 5-shot MMLU accuracy and KL divergence among quantized large language models (LLMs) across a range of inference engines. The method selects the quantization type per layer and per model, extending beyond the previous focus on MoE architectures, and uses a high-quality, conversation-oriented calibration dataset to improve chat performance.

Whereas some other quantization methods calibrate on Wikipedia-style data and risk overfitting to it, Dynamic 2.0's evaluations are run fairly and under controlled conditions on standardized datasets. The team also built a robust evaluation framework that accurately benchmarks and replicates published MMLU scores, addressing implementation issues that previously made replication difficult.

In extensive benchmarks against earlier Unsloth versions and other popular quantization techniques, Dynamic 2.0 consistently leads in accuracy and efficiency, and it also resolves significant issues in models such as Llama 4.
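KL divergence, one of the two headline metrics, measures how far a quantized model's next-token distribution drifts from the full-precision model's. The blog does not publish its exact evaluation code, so the following is only a minimal sketch of the underlying formula, computing per-token KL(P_ref || P_quant) from raw logits; the function and variable names are illustrative, not Unsloth's.

```python
import numpy as np

def kl_divergence(logits_ref, logits_quant):
    """Per-token KL(P_ref || P_quant) between two models' logits."""
    def softmax(x):
        # Subtract the row max for numerical stability before exponentiating.
        e = np.exp(x - x.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    p = softmax(np.asarray(logits_ref, dtype=np.float64))
    q = softmax(np.asarray(logits_quant, dtype=np.float64))
    eps = 1e-12  # guard against log(0)
    return np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=-1)

# Identical logits give zero divergence; diverging logits give a positive score.
ref = np.array([[2.0, 1.0, 0.1]])
kl_divergence(ref, ref)  # 0.0 for identical distributions
```

Averaging this quantity over a held-out token stream gives a single drift score per quant, which is how a lower-is-better KL comparison between quantization schemes is typically reported.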