
16k Context LLM Models Now Available On Runpod

Blog post from RunPod

Post Details
Company: RunPod
Date Published:
Author: Brendan McKeag
Word Count: 612
Language: English
Hacker News Points: -
Summary

Panchovix has released a new set of merged models with a 16,384-token context window, doubling the capacity of the previous releases and matching what the latest version of text-generation-webui (Oobabooga) supports. The larger window pays off in long, detailed interactions such as question-answer sessions and roleplay, but it also demands more VRAM: a fully loaded context can consume roughly 63% of an A100's 80 GB of memory. A wider context window allows for richer interactions and reduces the chance of earlier parts of a conversation being pushed out once the window fills up, yet it can also raise perplexity, since the model must weigh many more tokens when predicting each subsequent word. Merged models like these can deliver impressive results, but the tradeoff between context capacity and model performance should be weighed against the use case: the extended context is worth it for roleplay scenarios, while the base model remains the better choice for straightforward tasks such as sentiment analysis.
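To make the VRAM figure concrete, here is a rough back-of-the-envelope sketch of how an fp16 KV cache grows with context length for a LLaMA-style model. The layer and head counts below are illustrative assumptions (a 13B-class architecture), not numbers taken from the post, and real usage also depends on quantization, activation memory, and framework overhead.

```python
def kv_cache_bytes(n_layers: int, n_heads: int, head_dim: int,
                   seq_len: int, bytes_per_elem: int = 2) -> int:
    """Memory for the key and value caches across all layers (fp16 by default)."""
    # 2x for keys + values; one entry per layer, head, position, and head dim.
    return 2 * n_layers * n_heads * head_dim * seq_len * bytes_per_elem

# Illustrative 13B-class dimensions (assumed for this sketch, not from the post).
n_layers, n_heads, head_dim = 40, 40, 128

for seq_len in (8_192, 16_384):
    gib = kv_cache_bytes(n_layers, n_heads, head_dim, seq_len) / 1024**3
    print(f"{seq_len:>6} tokens -> ~{gib:.1f} GiB of KV cache")
```

Doubling the window doubles this cache, and it sits on top of the model weights themselves, which is why a fully loaded 16k context can push total usage toward the share of an 80 GB A100 quoted above.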