Run DeepSeek-V3-0324
Blog post from Unsloth
DeepSeek's V3-0324 model is positioned as a competitor to OpenAI's GPT-4.5 and Claude 3.7 Sonnet, posting strong results across a range of benchmarks. The model can be run with Unsloth's Dynamic GGUFs on a variety of inference frameworks, and detailed guidance is available for running it locally.

To balance accuracy against size, not every layer is quantized to the same precision: the MoE layers are selectively quantized down to low bit-widths, while the attention and other dense layers are kept at 4- or 6-bit. The model's capabilities are demonstrated with tests such as the Heptagon puzzle and a Flappy Bird game, where the dynamic 2.71-bit quantization nearly matches full 8-bit output quality and outperforms standard quantization methods at comparable sizes.

At least 160GB of combined VRAM and RAM is recommended to run the model, although it can operate without a GPU, albeit slowly. Several dynamic quantized versions are provided, with the 2.71-bit Dynamic version recommended as the best trade-off between size and performance.
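As a rough illustration of the workflow described above, the sketch below downloads only the 2.71-bit dynamic quant from Hugging Face and loads it through llama.cpp's Python bindings. The repo id unsloth/DeepSeek-V3-0324-GGUF, the UD-Q2_K_XL folder/pattern naming, and the exact shard filename are assumptions based on Unsloth's usual upload layout; verify the actual filenames after the download completes.

```python
# Minimal sketch: fetch the 2.71-bit Dynamic GGUF and run a prompt locally.
# Assumes: repo id "unsloth/DeepSeek-V3-0324-GGUF" and the "UD-Q2_K_XL"
# naming convention (both unverified here); pip install huggingface_hub llama-cpp-python.
from huggingface_hub import snapshot_download
from llama_cpp import Llama

# Download only the shards matching the 2.71-bit dynamic quant,
# skipping the other quantized variants in the repo.
snapshot_download(
    repo_id="unsloth/DeepSeek-V3-0324-GGUF",   # assumed repo id
    local_dir="DeepSeek-V3-0324-GGUF",
    allow_patterns=["*UD-Q2_K_XL*"],           # 2.71-bit Dynamic version
)

# Point llama.cpp at the first shard; the remaining split files are
# picked up automatically from the same directory. The filename below
# is illustrative -- check what snapshot_download actually wrote to disk.
llm = Llama(
    model_path="DeepSeek-V3-0324-GGUF/UD-Q2_K_XL/DeepSeek-V3-0324-UD-Q2_K_XL-00001-of-00006.gguf",
    n_gpu_layers=-1,  # offload as many layers as VRAM allows; set 0 for CPU-only
    n_ctx=4096,
)

# One of the blog post's test prompts, run as a plain completion.
out = llm("Create a Flappy Bird game in Python.", max_tokens=512)
print(out["choices"][0]["text"])
```

With n_gpu_layers=0 this runs entirely on CPU, which matches the post's note that a GPU is optional but slow; with a GPU, raising n_gpu_layers until VRAM is exhausted is the usual way to approach the 160GB combined VRAM+RAM recommendation.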