Company
Date Published
Author
-
Word count
899
Language
English
Hacker News points
None

Summary

Claude 2.1, a state-of-the-art language model, offers a 200K token context window, equivalent to roughly 500 pages of information. It excels at real-world retrieval tasks across long contexts, with a 30% reduction in incorrect answers compared to its predecessor, Claude 2.0. However, the model can be reluctant to answer questions based on an individual sentence in a document, especially if that sentence is out of place or injected. A minor prompting edit removes this reluctance and yields excellent performance on these tasks. To use the long context window effectively, users should prompt the model to look for relevant sentences first, which can be achieved by adding a specific sentence to the prompt. This approach improves the model's performance on both out-of-place and in-place sentences, reaching 90-95% accuracy in some cases. The model is trained on a mix of data aimed at reducing inaccuracies, including declining to answer questions when a document does not contain enough information to justify an answer.
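The prompting edit described above can be sketched as prefilling the start of the assistant's reply so the model scans the context for the relevant sentence before answering. This is an illustrative sketch, not the verbatim method from the summary: the helper name, the example inputs, and the exact preamble sentence are assumptions based on the technique as described.

```python
def build_messages(context: str, question: str) -> list[dict]:
    """Build a message list for a long-context retrieval question,
    prefilling the assistant turn so the model looks for the most
    relevant sentence first instead of declining to answer."""
    return [
        {
            # The long document and the question go in the user turn.
            "role": "user",
            "content": f"{context}\n\n{question}",
        },
        {
            # Prefilled start of the assistant's reply; this preamble
            # sentence is an assumed example of the prompting edit.
            "role": "assistant",
            "content": "Here is the most relevant sentence in the context:",
        },
    ]

# Example (the document and question here are placeholders):
messages = build_messages("<long document text>", "What did the memo say about Q3 revenue?")
print(messages[-1]["content"])
```

A message list shaped like this can then be passed to a chat-style completion API; the key design point is that the model continues from the prefilled sentence, so its first step is locating supporting text rather than judging whether to answer at all.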