
Qwen 2.5 Coder Fine-tuning

Blog post from Unsloth

Post Details
Company: Unsloth
Date Published:
Author: Daniel & Michael
Word Count: 344
Language: English
Hacker News Points: -
Summary

Unsloth has announced support for the Qwen 2.5 and Qwen 2.5 Coder models, with fine-tuning running 2x faster and using 60% less memory than Flash Attention 2 with Hugging Face. Google Colab notebooks are provided for fine-tuning on a free Tesla T4, and the models' original 32K context length has been extended to 128K using YaRN, with all uploads available on Hugging Face. An update on November 13, 2024 fixed the GGUF YaRN settings, and analysis uncovered several bugs, including improper use of the `<|endoftext|>` token, which caused infinite generations during fine-tuning, and untrained `<|im_start|>` and `<|im_end|>` tokens in the base models. The community is encouraged to participate via Discord, Twitter, and Substack, and the developers, Daniel and Michael Han, express gratitude for ongoing support and engagement.
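As a rough illustration of the workflow the post describes, the sketch below loads a Qwen 2.5 Coder model through Unsloth and attaches LoRA adapters for memory-efficient fine-tuning. The repository name, sequence length, and LoRA hyperparameters are illustrative assumptions, not values taken from the post.

```python
# Minimal sketch of Qwen 2.5 Coder fine-tuning with Unsloth.
# The model name "unsloth/Qwen2.5-Coder-7B" and all hyperparameters
# below are assumptions for illustration only.
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Qwen2.5-Coder-7B",  # assumed Hugging Face upload name
    max_seq_length=32768,                   # native 32K context; 128K requires YaRN
    load_in_4bit=True,                      # 4-bit loading to fit a free Tesla T4
)

# Attach LoRA adapters so only a small fraction of weights is trained,
# which is where the speed and memory savings come from.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    lora_alpha=16,
)
```

The resulting `model` can then be passed to a standard Hugging Face `SFTTrainer`/`Trainer` loop, as in the Colab notebooks the post links to.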