
Run & Finetune DeepSeek-R1

Blog post from Unsloth

Post Details
Company: Unsloth
Date Published:
Author: Daniel & Michael
Word Count: 889
Language: English
Hacker News Points: -
Summary

DeepSeek's new R1 model posts strong reasoning benchmarks, performing on par with OpenAI's o1, and builds on the previously released DeepSeek-V3. DeepSeek has also distilled R1 into smaller models based on Llama 3 and Qwen 2.5, which can be run locally and, because they use the standard Llama and Qwen architectures, fine-tuned with Unsloth. The R1 series is available in several versions, including GGUFs and a new 1.58-bit Dynamic GGUF that is 80% smaller than the original. DeepSeek-R1 is run with llama.cpp; a GPU isn't required, but a CPU-only setup needs substantial RAM and disk space. Unsloth also provides Colab notebooks for free fine-tuning and invites the community to engage on Reddit and Discord.
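
To make the size and disk-space points concrete, here is a minimal Python sketch. It only illustrates the arithmetic behind an "80% smaller" quantized file and a basic free-disk check before downloading. The helper names `quantized_size_gb` and `has_free_disk` are hypothetical, and the 100 GB figure is a placeholder, since the summary does not state the original model size.

```python
import shutil


def quantized_size_gb(original_gb: float, reduction: float = 0.80) -> float:
    """Return the size left after shrinking a file by `reduction`.

    reduction=0.80 means the result is 80% smaller than the original.
    """
    if not 0.0 <= reduction < 1.0:
        raise ValueError("reduction must be in [0, 1)")
    return original_gb * (1.0 - reduction)


def has_free_disk(path: str, required_gb: float) -> bool:
    """Check whether the filesystem holding `path` has enough free space."""
    free_gb = shutil.disk_usage(path).free / 1e9  # bytes to decimal GB
    return free_gb >= required_gb


if __name__ == "__main__":
    # Placeholder size: the real original size is not given in the summary.
    needed = quantized_size_gb(100.0)
    print(f"Estimated quantized size: {needed:.1f} GB")
    print("Enough free disk here:", has_free_disk(".", needed))
```

A check like this is worth running before fetching any large GGUF file, because a download that fails partway through wastes both time and bandwidth.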