
16k Context LLM Models Now Available On Runpod

Blog post from RunPod

Post Details
Company: RunPod
Date Published:
Author: Brendan McKeag
Word Count: 612
Language: English
Hacker News Points: -
Summary

Panchovix has released a new set of merged models with a 16,384-token context window, doubling the capacity of the previous releases and matching what the latest version of text-generation-webui (Oobabooga) supports. The larger window pays off in long, detailed interactions such as question-answer sessions and roleplay, but it also demands more VRAM: a fully loaded context can consume roughly 63% of an A100's 80 GB of memory. A wider context window allows for richer interactions and reduces the chance of earlier parts of a conversation being pushed out once the window fills up, yet it can also raise perplexity, since the model must weigh many more tokens when predicting each subsequent word. Merged models like these can deliver impressive results, but the tradeoff between context capacity and model performance should be weighed against the use case: the extended context is worth it for roleplay scenarios, while the base model remains the better choice for straightforward tasks such as sentiment analysis.
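To make the VRAM figure concrete, here is a rough back-of-the-envelope sketch of how an fp16 KV cache grows with context length for a LLaMA-style model. The layer and head counts below are illustrative assumptions (a 13B-class architecture), not numbers taken from the post, and real usage also depends on quantization, activation memory, and framework overhead.

```python
def kv_cache_bytes(n_layers: int, n_heads: int, head_dim: int,
                   seq_len: int, bytes_per_elem: int = 2) -> int:
    """Memory for the key and value caches across all layers (fp16 by default)."""
    # 2x for keys + values; one entry per layer, head, position, and head dim.
    return 2 * n_layers * n_heads * head_dim * seq_len * bytes_per_elem

# Illustrative 13B-class dimensions (assumed for this sketch, not from the post).
n_layers, n_heads, head_dim = 40, 40, 128

for seq_len in (8_192, 16_384):
    gib = kv_cache_bytes(n_layers, n_heads, head_dim, seq_len) / 1024**3
    print(f"{seq_len:>6} tokens -> ~{gib:.1f} GiB of KV cache")
```

Doubling the window doubles this cache, and it sits on top of the model weights themselves, which is why a fully loaded 16k context can push total usage toward the share of an 80 GB A100 quoted above.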