
Benchmarking Supermaven's Long-Context Capabilities

Blog post from Supermaven

Post Details
Company: Supermaven
Author: Jacob Jackson, CEO
Word Count: 663
Language: English
Summary

Supermaven, a code completion tool with a 300,000-token context window, is tested to check whether it actually makes use of that much context. In the "needle in a haystack" test, Supermaven successfully retrieves a specific piece of information embedded in a large body of text, but the test is comparatively easy because the inserted "needle" is distinctive and stands out from its surroundings. To pose a harder challenge, the post devises a dense retrieval task in which Supermaven must recall key-value pairs across a long sequence, a more demanding test of memory. The results indicate that the model retrieves information well when the occurrences fall near the beginning or end of the sequence but struggles with mid-sequence retrieval, achieving around 75% accuracy when the occurrences are separated by 50,000 tokens. Finally, an analysis of prediction error against context length shows that error rates decrease as more context is included, which suggests that Supermaven makes use of the full context available to it.
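
To make the dense retrieval idea concrete, here is a minimal sketch of how such a task might be built and scored. It is not Supermaven's actual benchmark: the summary does not give the sequence format, so the layout (every key appears exactly twice, and the model is scored at the second appearance), the function names, and the use of pair positions rather than tokens as the distance measure are all assumptions made for illustration. The `perfect_recall` function is a hypothetical stand-in for a real model.

```python
import random
from collections import defaultdict


def make_dense_retrieval_sequence(num_keys, seed=0):
    """Build a shuffled list of (key, value) pairs.

    Assumption for illustration: every key appears exactly twice,
    both times with the same value.
    """
    rng = random.Random(seed)
    keys = [f"k{idx:06d}" for idx in range(num_keys)]
    values = {key: rng.randrange(10_000) for key in keys}
    pairs = [(key, values[key]) for key in keys] * 2
    rng.shuffle(pairs)
    return pairs


def accuracy_by_distance(pairs, predict, bucket_size):
    """Score `predict` at the second occurrence of each key.

    Results are grouped by the distance (measured in pairs, not tokens)
    between the two occurrences, in buckets of `bucket_size`.
    """
    first_seen = {}
    hits = defaultdict(int)
    totals = defaultdict(int)
    for position, (key, value) in enumerate(pairs):
        if key in first_seen:
            bucket = (position - first_seen[key]) // bucket_size
            guess = predict(pairs[:position], key)
            totals[bucket] += 1
            hits[bucket] += int(guess == value)
        else:
            first_seen[key] = position
    return {bucket: hits[bucket] / totals[bucket] for bucket in sorted(totals)}


def perfect_recall(context, key):
    """Hypothetical stand-in for a model: look the key up in the context."""
    for seen_key, seen_value in reversed(context):
        if seen_key == key:
            return seen_value
    return None
```

Replacing `perfect_recall` with a call to a real completion model, and measuring distance in tokens instead of pairs, would give an accuracy-versus-separation curve of the kind the summary describes.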