Content Deep Dive

2x faster Gemma 2 finetuning + 63% less VRAM

Blog post from Unsloth

Post Details
Company: Unsloth
Date Published:
Author: Daniel & Michael
Word Count: 1,163
Language: English
Hacker News Points: -
Summary

Unsloth's latest advancements in finetuning Google's Gemma 2 models significantly boost performance and efficiency over previous methods. The Gemma 2 (9B) model can now be finetuned 2x faster with 63% less VRAM, while Gemma 2 (27B) achieves 1.9x faster finetuning with a 51% VRAM reduction. Unsloth also enables longer context lengths, up to 4-5x longer for the 9B model, and implements Gemma 2's softcapping mechanism in a way that preserves training accuracy while reducing VRAM usage. QLoRA and gradient checkpointing further enhance the training process, and Unsloth has been updated to support Microsoft's Phi-3 mini refresh. Additionally, Unsloth has contributed fixes to the Gemma 2 PyTorch repository addressing issues with mixed-precision training, and the team actively participated in the AI Engineer World's Fair, engaging with the community through workshops and talks.
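
For readers unfamiliar with the softcapping mentioned above, the rough idea is to bound logits smoothly with a tanh rather than letting them grow without limit. The sketch below is illustrative only, not Unsloth's fused implementation; the cap values (roughly 50.0 for attention scores and 30.0 for the final LM-head logits) follow the publicly released Gemma 2 configuration, and the tensor shapes are toy values.

```python
import torch

def softcap(logits: torch.Tensor, cap: float) -> torch.Tensor:
    # Smoothly squashes logits into (-cap, cap) with a tanh instead of
    # letting them grow unbounded; this is what "softcapping" refers to.
    return cap * torch.tanh(logits / cap)

# Gemma 2 applies softcapping in two places (cap values from its public config):
attn_scores = torch.randn(2, 8, 64, 64) * 100.0   # toy attention scores
capped_attn = softcap(attn_scores, 50.0)           # attention logit softcapping
lm_logits = torch.randn(2, 64, 256) * 100.0        # toy LM-head logits
capped_lm = softcap(lm_logits, 30.0)               # final logit softcapping
```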
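
Likewise, a minimal sketch of how QLoRA finetuning with Unsloth's gradient checkpointing is typically set up for Gemma 2 is shown below; the model id, sequence length, and LoRA hyperparameters are illustrative assumptions rather than values taken from the post.

```python
from unsloth import FastLanguageModel

# Load Gemma 2 (9B) with 4-bit quantized base weights (QLoRA).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/gemma-2-9b",  # illustrative model id
    max_seq_length = 8192,              # illustrative context length
    load_in_4bit = True,
)

# Attach LoRA adapters and enable Unsloth's offloaded gradient
# checkpointing, which helps support the longer context lengths.
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    lora_alpha = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing = "unsloth",
)
```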