Run DeepSeek-V3.1 Dynamic 1-bit GGUFs
Blog post from Unsloth
DeepSeek-V3.1 is an updated hybrid reasoning model from DeepSeek, positioned to rival frontier models such as OpenAI's GPT-4.5 and Google's Gemini 2.5 Pro. The full 671B-parameter model requires about 715GB of disk space; Unsloth's selective quantization shrinks it to roughly 170GB, a reduction of around 75%.

The model can be run effectively using Unsloth's 1-bit Dynamic 2.0 GGUFs on popular inference frameworks such as llama.cpp and Ollama. Recommended settings for best results: temperature 0.6 (to minimize repetition), top_p 0.95, and a context length of 128K tokens.

The update also fixes chat template issues to improve compatibility with various inference engines, and the blog provides detailed instructions for running the model locally or via platforms like Hugging Face. Users are encouraged to leverage community resources and platforms for support and updates.
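As a concrete illustration, the recommended settings above can be assembled into a llama.cpp command line. This is a minimal sketch: the binary path and GGUF filename below are assumptions for illustration, not taken from the blog, so adjust them to match your download.

```python
# Sketch: build a llama.cpp invocation using the recommended sampling
# settings for DeepSeek-V3.1 (temperature 0.6, top_p 0.95, 128K context).
# The binary path and model filename are placeholders -- adjust for your setup.
import shlex

def build_llama_cli_cmd(model_path, n_ctx=131072, temp=0.6, top_p=0.95):
    """Return the argument list for llama-cli with the recommended settings."""
    return [
        "./llama-cli",        # llama.cpp CLI binary (path is an assumption)
        "-m", model_path,     # path to the downloaded GGUF file
        "-c", str(n_ctx),     # context length: 128K tokens
        "--temp", str(temp),  # temperature 0.6 to minimize repetition
        "--top-p", str(top_p),# nucleus sampling cutoff 0.95
    ]

# Hypothetical filename for the 1-bit dynamic quant; use your actual file.
cmd = build_llama_cli_cmd("DeepSeek-V3.1-UD-TQ1_0.gguf")
print(shlex.join(cmd))
```

Running the script prints the assembled command, which can then be executed against a local GGUF download. The flags `-m`, `-c`, `--temp`, and `--top-p` are standard llama.cpp CLI options.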