Company
Date Published
Author
-
Word count
899
Language
English
Hacker News points
None

Summary

Claude 2.1, a state-of-the-art language model, offers a 200K token context window, equivalent to roughly 500 pages of information. It excels at real-world retrieval tasks across long contexts, with a 30% reduction in incorrect answers compared to its predecessor, Claude 2.0. However, the model can be reluctant to answer questions based on an individual sentence in a document, especially if that sentence is out of place or injected. A minor prompting edit removes this reluctance and yields excellent performance on these tasks. To use the long context window effectively, users should prompt the model to look for relevant sentences first, which can be achieved by adding a specific sentence to the prompt. This approach improves the model's performance on both out-of-place and in-place sentences, reaching 90-95% accuracy in some cases. The model is trained on a mix of data aimed at reducing inaccuracies, including declining to answer questions when a document does not contain enough information to justify an answer.
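The prompting edit described above can be sketched as prefilling the start of the assistant's reply so the model scans the context for the relevant sentence before answering. This is an illustrative sketch, not the verbatim method from the summary: the helper name, the example inputs, and the exact preamble sentence are assumptions based on the technique as described.

```python
def build_messages(context: str, question: str) -> list[dict]:
    """Build a message list for a long-context retrieval question,
    prefilling the assistant turn so the model looks for the most
    relevant sentence first instead of declining to answer."""
    return [
        {
            # The long document and the question go in the user turn.
            "role": "user",
            "content": f"{context}\n\n{question}",
        },
        {
            # Prefilled start of the assistant's reply; this preamble
            # sentence is an assumed example of the prompting edit.
            "role": "assistant",
            "content": "Here is the most relevant sentence in the context:",
        },
    ]

# Example (the document and question here are placeholders):
messages = build_messages("<long document text>", "What did the memo say about Q3 revenue?")
print(messages[-1]["content"])
```

A message list shaped like this can then be passed to a chat-style completion API; the key design point is that the model continues from the prefilled sentence, so its first step is locating supporting text rather than judging whether to answer at all.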