The Llama-2-7B-32K-Instruct model achieves state-of-the-art performance on long-context tasks such as summarization and multi-document question answering, while matching the base Llama-2-7B model at shorter context lengths. It was fine-tuned with the Together API, which lets developers build custom models in under 200 lines of Python. The fine-tuning process involves four main steps: distilling instructions from human inputs, training the model on a mixture of data sources, testing it in the Together Playgrounds, and deploying it via the Together Inference API. On long-context benchmarks, the model outperforms baselines including GPT-3.5-Turbo-16k, Llama-2-7b-chat, Longchat-7b-16k, and Longchat-7b-v1.5-32k, demonstrating robustness across these tasks. The model is now publicly available through the Together API.
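As a rough illustration of the final deployment step, the sketch below shows how a developer might query the model once it is served behind the Together Inference API. The `[INST] ... [/INST]` prompt wrapper, the endpoint URL, and the response shape are assumptions based on the common Llama-2 instruction format and a typical completion-style REST API; check the Together API reference and the model card for the exact details.

```python
import json
import os
import urllib.request


def build_prompt(instruction: str) -> str:
    # Llama-2-style instruction template; treat the exact delimiters as an
    # assumption and verify against the model card before relying on them.
    return f"[INST]\n{instruction}\n[/INST]\n\n"


def query_model(instruction: str, max_tokens: int = 256) -> str:
    # Endpoint and payload layout are hypothetical placeholders modeled on a
    # completion-style API; consult the official Together documentation.
    payload = {
        "model": "togethercomputer/Llama-2-7B-32K-Instruct",
        "prompt": build_prompt(instruction),
        "max_tokens": max_tokens,
    }
    req = urllib.request.Request(
        "https://api.together.xyz/v1/completions",  # assumed endpoint
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['TOGETHER_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # Assumed OpenAI-compatible response structure.
    return body["choices"][0]["text"]
```

With a valid `TOGETHER_API_KEY` in the environment, `query_model("Summarize the following report: ...")` would send a single long-context completion request to the hosted model.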