
Extending Context Window of a 7B LLM from 8k to 32k using PoSE (Positional Skip-wisE)

Blog post from SuperAGI

Post Details
Company: SuperAGI
Author: admin_sagi
Word Count: 1,150
Language: English
Summary

Positional Skip-wisE (PoSE) training is introduced as an efficient method to extend the context window of Large Language Models (LLMs) without the high computational cost of full-length fine-tuning. Rather than training on full-length sequences, PoSE manipulates position indices within a fixed-length training window to simulate much longer inputs, cutting memory and time overhead while preserving performance. The approach was applied to extend the context window of the Mistral 7B model from 8K to 32K tokens, with minimal degradation on language modeling and information-extraction tasks. Because it only alters position indices, PoSE is compatible with all RoPE-based LLMs and with position interpolation strategies, making it a cost-effective route to handling very long contexts. The PoSE-extended model is available on Hugging Face.
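A minimal sketch of the core idea, manipulating position indices so a short input covers a long position range, might look like the following. The chunking scheme, function name, and parameters here are illustrative assumptions, not SuperAGI's or the PoSE authors' actual implementation:

```python
import random

def pose_position_ids(train_len, target_len, num_chunks=2, seed=0):
    """Assign position indices spanning target_len to a train_len input.

    PoSE-style sketch: split the fixed training window into chunks and
    add a random skip bias to every chunk after the first, so the model
    sees position values drawn from the full target range without ever
    processing a target_len-long sequence.
    """
    rng = random.Random(seed)
    # Random chunk boundaries within the training window.
    cuts = sorted(rng.sample(range(1, train_len), num_chunks - 1))
    bounds = [0] + cuts + [train_len]

    max_total_skip = target_len - train_len  # total skip budget
    position_ids, bias = [], 0
    for i in range(num_chunks):
        start, end = bounds[i], bounds[i + 1]
        if i > 0:  # first chunk keeps its original positions
            bias += rng.randint(0, max_total_skip - bias)
        position_ids.extend(range(start + bias, end + bias))
    return position_ids
```

Feeding position ids like these (instead of the default 0..train_len-1) to a RoPE-based model during fine-tuning is what lets an 8K-token batch exercise positions across the full 0..32K range.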