Run DeepSeek-V3-0324
Blog post from Unsloth
DeepSeek's V3-0324 model is positioned as a competitor to OpenAI's GPT-4.5 and Claude 3.7 Sonnet, posting strong results across a range of benchmarks. The model can be run with Unsloth's Dynamic GGUFs on a variety of inference frameworks, and detailed guidance is available for running it locally.

To balance accuracy against size, not every layer is quantized to the same precision: the MoE layers are selectively quantized down to low bit-widths, while the attention and other dense layers are kept at 4- or 6-bit. The model's capabilities are demonstrated with tests such as the Heptagon puzzle and a Flappy Bird game, where the dynamic 2.71-bit quantization nearly matches full 8-bit output quality and outperforms standard quantization methods at comparable sizes.

At least 160GB of combined VRAM and RAM is recommended to run the model, although it can operate without a GPU, albeit slowly. Several dynamic quantized versions are provided, with the 2.71-bit Dynamic version recommended as the best trade-off between size and performance.
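As a rough illustration of the workflow described above, the sketch below downloads only the 2.71-bit dynamic quant from Hugging Face and loads it through llama.cpp's Python bindings. The repo id unsloth/DeepSeek-V3-0324-GGUF, the UD-Q2_K_XL folder/pattern naming, and the exact shard filename are assumptions based on Unsloth's usual upload layout; verify the actual filenames after the download completes.

```python
# Minimal sketch: fetch the 2.71-bit Dynamic GGUF and run a prompt locally.
# Assumes: repo id "unsloth/DeepSeek-V3-0324-GGUF" and the "UD-Q2_K_XL"
# naming convention (both unverified here); pip install huggingface_hub llama-cpp-python.
from huggingface_hub import snapshot_download
from llama_cpp import Llama

# Download only the shards matching the 2.71-bit dynamic quant,
# skipping the other quantized variants in the repo.
snapshot_download(
    repo_id="unsloth/DeepSeek-V3-0324-GGUF",   # assumed repo id
    local_dir="DeepSeek-V3-0324-GGUF",
    allow_patterns=["*UD-Q2_K_XL*"],           # 2.71-bit Dynamic version
)

# Point llama.cpp at the first shard; the remaining split files are
# picked up automatically from the same directory. The filename below
# is illustrative -- check what snapshot_download actually wrote to disk.
llm = Llama(
    model_path="DeepSeek-V3-0324-GGUF/UD-Q2_K_XL/DeepSeek-V3-0324-UD-Q2_K_XL-00001-of-00006.gguf",
    n_gpu_layers=-1,  # offload as many layers as VRAM allows; set 0 for CPU-only
    n_ctx=4096,
)

# One of the blog post's test prompts, run as a plain completion.
out = llm("Create a Flappy Bird game in Python.", max_tokens=512)
print(out["choices"][0]["text"])
```

With n_gpu_layers=0 this runs entirely on CPU, which matches the post's note that a GPU is optional but slow; with a GPU, raising n_gpu_layers until VRAM is exhausted is the usual way to approach the 160GB combined VRAM+RAM recommendation.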