Author: Jason Liu
Word count: 2384
Language: English
Hacker News points: 6

Summary

With just a handful of examples, fine-tuned open-source embedding models can deliver higher accuracy at lower cost than proprietary models. Custom models matter for companies like Netflix and Spotify, which continuously use data to improve their recommendation systems. Large pre-trained models with permissive licenses have simplified the bootstrap step: organizations can start with an off-the-shelf model and expect it to perform reasonably well on their task. Fine-tuning then kicks off the data flywheel; as data accumulates, the fine-tuned model can outperform the off-the-shelf one.

Fine-tuning involves design decisions such as finding or creating a dataset, choosing a base model, and provisioning training infrastructure. Running a grid search over fine-tuning hyperparameters is an effective way to explore the experimental space, and Modal's autoscaling infrastructure can run those experiments in parallel. Even with just a few hundred examples, it is possible to beat proprietary models on a simple question-answering task. The next step is to operationalize this process: collect more data and iterate on the model.
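The grid search described above can be sketched as follows. The hyperparameter values and the `train_and_evaluate` stub are illustrative assumptions, not the post's actual code; on Modal, each trial would typically be a remote function mapped over the configurations so they run in parallel on autoscaling infrastructure.

```python
from itertools import product

# Hypothetical hyperparameter grid for fine-tuning an embedding model.
# These values are illustrative; the post does not specify the exact grid.
grid = {
    "learning_rate": [1e-5, 3e-5, 1e-4],
    "epochs": [2, 4],
    "batch_size": [16, 32],
}

def train_and_evaluate(learning_rate, epochs, batch_size):
    """Stub standing in for a real fine-tuning run that returns
    retrieval accuracy on a held-out question-answering set."""
    # A real implementation would fine-tune the base model here
    # and score it against the evaluation set.
    return 0.0  # placeholder score

# Enumerate every combination of hyperparameters (the grid search).
configs = [dict(zip(grid, values)) for values in product(*grid.values())]
results = [(cfg, train_and_evaluate(**cfg)) for cfg in configs]

# With Modal, `train_and_evaluate` would be a remote function and the
# comprehension above a `.map` over configs, running trials in parallel.
best_config, best_score = max(results, key=lambda item: item[1])
print(len(configs))  # 12 combinations: 3 * 2 * 2
```

The same enumeration works for any grid shape; swapping the stub for a real training function is the only change needed to run the search for real.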