Company:
Date Published:
Author: Together
Word count: 1831
Language: English
Hacker News points: None

Summary

Together AI has released LLaMA-2-7B-32K, a 32K-context model built with Position Interpolation along with Together AI's data recipe and system optimizations. The model extends the original LLaMA-2 to a 32K-token context, achieving perplexity and quality comparable to state-of-the-art closed-source models. The power of this base model lies in its ability to be fine-tuned for targeted applications, such as multi-document question answering and summarization. To support this, Together AI has updated its inference and training stack with FlashAttention-2 and other optimizations, enabling efficient inference and fine-tuning at 32K context. The community is encouraged to build on this work by exploring ways to extend the context length of open-source models, preparing better data for long-context tasks, and improving system support for long-context training and inference.
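Position Interpolation, the technique named above, extends a RoPE-based model's context window by rescaling position indices so that longer sequences fall within the position range the model saw during pretraining. The sketch below is a minimal illustration of that idea, not code from the Together release; the function names are illustrative, and the 4,096 → 32,768 lengths are assumptions based on LLaMA-2's original context window and the extended 32K window.

```python
import torch

def rope_inverse_frequencies(head_dim: int, base: float = 10000.0) -> torch.Tensor:
    # Standard RoPE inverse frequencies for an attention head of size head_dim.
    return 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))

def interpolated_rope_angles(seq_len: int, head_dim: int,
                             original_max_len: int = 4096,
                             extended_max_len: int = 32768) -> torch.Tensor:
    # Position Interpolation: scale positions down by original/extended so a
    # 32K-token sequence maps onto the 0..4095 position range seen in pretraining.
    scale = original_max_len / extended_max_len  # 4096 / 32768 = 0.125
    positions = torch.arange(seq_len, dtype=torch.float32) * scale
    inv_freq = rope_inverse_frequencies(head_dim)
    # One rotation angle per (position, frequency) pair; these angles feed the
    # usual sin/cos rotation applied to query and key vectors.
    return torch.outer(positions, inv_freq)

# Example: angles for an 8K-token sequence with 128-dim attention heads.
angles = interpolated_rope_angles(seq_len=8192, head_dim=128)
print(angles.shape)  # torch.Size([8192, 64])
```

Rescaling alone is not enough; fine-tuning on long-context data with the interpolated positions is what lets the model actually use the full 32K window, which is where the data recipe and system optimizations mentioned above come in.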