
BitDelta: Your Fine-Tune May Only Be Worth One Bit

Blog post from Together AI

Post Details

Company: Together AI
Date Published: -
Author: James Liu, Guangxuan Xiao, Kai Li, Jason D. Lee, Song Han, Tri Dao, Tianle Cai
Word Count: 1,690
Language: English
Hacker News Points: -
Summary

The pretrain-finetune paradigm has revolutionized machine learning, enabling LLMs to be aligned with distinct user preferences or specialized task requirements through fine-tuning. However, serving many fine-tuned models at once (multi-tenant serving) is expensive, since each model must be stored and held in GPU memory in full. The researchers propose BitDelta, which decomposes a fine-tuned model's weights into the pre-trained base weights plus a delta, and quantizes that delta to 1 bit without compromising performance. Because tenants can then share a single copy of the base weights while each tenant's delta is tiny, the approach reduces both storage and GPU memory requirements and yields inference speedups. BitDelta is fast, general, and retains all sorts of fine-tuning information, making it a promising solution for multi-tenant serving.
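To make the core idea concrete, here is a minimal PyTorch sketch of 1-bit delta compression as described above; it is an illustration, not Together AI's actual implementation. It assumes a single per-matrix scale alpha initialized to the mean absolute value of the delta (the paper further calibrates these scales, e.g. via distillation, which is omitted here), and the function names `bitdelta_compress` and `bitdelta_decompress` are hypothetical.

```python
import torch

def bitdelta_compress(w_base: torch.Tensor, w_fine: torch.Tensor):
    """Compress the fine-tune delta of one weight matrix to 1 bit per weight.

    The delta (w_fine - w_base) is approximated by sign(delta) * alpha, where
    the +-1 sign mask costs 1 bit per weight and alpha is one high-precision
    scale chosen to preserve the delta's average magnitude.
    """
    delta = w_fine - w_base
    # +-1 mask (packable into 1 bit per weight); ties at zero map to +1.
    sign = torch.where(delta >= 0, torch.ones_like(delta), -torch.ones_like(delta))
    # Per-matrix scale; the paper calibrates this further, omitted in this sketch.
    alpha = delta.abs().mean()
    return sign, alpha

def bitdelta_decompress(w_base: torch.Tensor, sign: torch.Tensor,
                        alpha: torch.Tensor) -> torch.Tensor:
    """Reconstruct an approximation of the fine-tuned weight matrix."""
    return w_base + alpha * sign

# Multi-tenant sketch: one shared base matrix, one 1-bit delta per tenant.
base = torch.randn(4096, 4096)
tenants = {
    name: bitdelta_compress(base, base + 0.01 * torch.randn_like(base))
    for name in ("chat", "code")  # stand-ins for fine-tuned variants
}
w_chat = bitdelta_decompress(base, *tenants["chat"])
```

In this setup only `base` needs to live in GPU memory at full precision; each additional tenant adds just a packed sign mask and a scalar, which is what makes multi-tenant storage and serving cheap.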