Company:
Date Published:
Author: Together
Word count: 1831
Language: English
Hacker News points: None

Summary

Together AI has released LLaMA-2-7B-32K, a 32K-context model built with Position Interpolation along with Together AI's data recipe and system optimizations. The model extends the original LLaMA-2 to a 32K-token context, achieving perplexity and quality comparable to state-of-the-art closed-source models. The power of this base model lies in its ability to be fine-tuned for targeted applications, such as multi-document question answering and summarization. To support this, Together AI has updated its inference and training stack with FlashAttention-2 and other optimizations, enabling efficient inference and fine-tuning at 32K context. The community is encouraged to build on this work by exploring ways to extend the context length of open-source models, preparing better data for long-context tasks, and improving system support for long-context training and inference.
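Position Interpolation, the technique named above, extends a RoPE-based model's context window by rescaling position indices so that longer sequences fall within the position range the model saw during pretraining. The sketch below is a minimal illustration of that idea, not code from the Together release; the function names are illustrative, and the 4,096 → 32,768 lengths are assumptions based on LLaMA-2's original context window and the extended 32K window.

```python
import torch

def rope_inverse_frequencies(head_dim: int, base: float = 10000.0) -> torch.Tensor:
    # Standard RoPE inverse frequencies for an attention head of size head_dim.
    return 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))

def interpolated_rope_angles(seq_len: int, head_dim: int,
                             original_max_len: int = 4096,
                             extended_max_len: int = 32768) -> torch.Tensor:
    # Position Interpolation: scale positions down by original/extended so a
    # 32K-token sequence maps onto the 0..4095 position range seen in pretraining.
    scale = original_max_len / extended_max_len  # 4096 / 32768 = 0.125
    positions = torch.arange(seq_len, dtype=torch.float32) * scale
    inv_freq = rope_inverse_frequencies(head_dim)
    # One rotation angle per (position, frequency) pair; these angles feed the
    # usual sin/cos rotation applied to query and key vectors.
    return torch.outer(positions, inv_freq)

# Example: angles for an 8K-token sequence with 128-dim attention heads.
angles = interpolated_rope_angles(seq_len=8192, head_dim=128)
print(angles.shape)  # torch.Size([8192, 64])
```

Rescaling alone is not enough; fine-tuning on long-context data with the interpolated positions is what lets the model actually use the full 32K window, which is where the data recipe and system optimizations mentioned above come in.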