SuperHOT 8k Token Context Models Are Here For Text Generation
Blog post from RunPod
Esteemed contributor TheBloke has released quantizations of several well-known AI models, including WizardLM, Vicuna, and Manticore, on Hugging Face, allowing for increased context retention in AI storytelling.

This addresses a common frustration: unlike a human writing partner, who can recall earlier events at will, an AI model forgets critical scene details as soon as they fall outside its context window. With a traditional 2k token limit, details like character positioning and attire are quickly pushed out of context, which can break immersion. An expanded 8k context limit lets users incorporate far more detailed character histories and scene context, improving narrative continuity.

To take advantage of the larger context with these Llama-based models, load them with ExLlama and adjust the sequence-length settings; in text-generation-webui this typically means raising max_seq_len to 8192 and setting compress_pos_emb to 4. Further guidance is available through the community Discord.
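The reason a compression setting is involved at all is positional interpolation: token positions are divided by a scale factor so that an 8192-token sequence still maps into the 0 to 2048 position range the base model was trained on (8192 / 2048 = 4). Here is a minimal sketch of that idea; the function name and structure are illustrative, not ExLlama's actual implementation:

```python
def scaled_positions(seq_len: int, compress_factor: int) -> list[float]:
    """Map positions 0..seq_len-1 into the base model's trained
    position range by dividing each position by the factor."""
    return [i / compress_factor for i in range(seq_len)]

# With an 8192-token context and a compression factor of 4, every
# position lands inside the 0..2048 range the model was trained on.
positions = scaled_positions(8192, 4)
print(max(positions))  # 2047.75, within the trained range
```

This is why the two settings go together: max_seq_len declares the longer window, and the compression factor squeezes those positions back into territory the model understands.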