How to Build LLM Streams That Survive Reconnects, Refreshes, and Crashes
Blog post from Upstash
The article describes how to build durable, resumable large language model (LLM) streams that survive client-side disruptions such as network outages, page refreshes, and device disconnections. The key idea is to decouple the client from the generation process: a separate stream generator keeps producing the LLM output even while no client is connected.

Each chunk of the response is appended to a Redis Stream, giving it persistent, replayable storage, while Redis Pub/Sub notifies connected consumers in real time that new data is available. On reconnect, a client automatically catches up on everything it missed, with no duplicates and no gaps. Session management additionally lets users follow the same stream on multiple devices at once, which is especially useful for applications like LLM chat services.
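To make the resume-without-duplicates idea concrete, here is a minimal in-memory sketch of the pattern. The class names (`InMemoryStream`, `ResumableConsumer`) and all details are hypothetical stand-ins, not the article's actual code: the append-only log with increasing entry IDs plays the role of a Redis Stream (its `read_after` mimics an `XRANGE` over unseen entries), and the subscriber callbacks stand in for Pub/Sub notifications. The consumer tracks the last entry ID it delivered, so after a "disconnect" it replays exactly the missed chunks.

```python
import itertools
from typing import Callable, List, Tuple


class InMemoryStream:
    """Toy stand-in for a Redis Stream: an append-only log of
    (id, chunk) entries with monotonically increasing ids."""

    def __init__(self) -> None:
        self.entries: List[Tuple[int, str]] = []
        self._ids = itertools.count(1)
        # Stand-in for Pub/Sub: callbacks fired when a new entry arrives.
        self.subscribers: List[Callable[[int], None]] = []

    def add(self, chunk: str) -> int:
        entry_id = next(self._ids)
        self.entries.append((entry_id, chunk))
        for notify in self.subscribers:  # alert live consumers
            notify(entry_id)
        return entry_id

    def read_after(self, last_id: int) -> List[Tuple[int, str]]:
        # Everything the client has not yet seen (like XRANGE past last_id).
        return [(i, c) for (i, c) in self.entries if i > last_id]


class ResumableConsumer:
    """Remembers the last delivered entry id so a reconnect resumes
    exactly where the previous connection stopped."""

    def __init__(self, stream: InMemoryStream) -> None:
        self.stream = stream
        self.last_id = 0
        self.received: List[str] = []

    def catch_up(self) -> None:
        for entry_id, chunk in self.stream.read_after(self.last_id):
            self.received.append(chunk)
            self.last_id = entry_id


# Producer writes while the consumer is connected, then while it is away.
stream = InMemoryStream()
consumer = ResumableConsumer(stream)
stream.add("Hello")
stream.add(" wor")
consumer.catch_up()        # client receives the first two chunks
stream.add("ld")           # generated while the client is disconnected
consumer.catch_up()        # reconnect: only the missed chunk is replayed
print("".join(consumer.received))  # → Hello world
```

In the real system, `last_id` would be the Redis Stream entry ID the client last acknowledged, persisted alongside the session so any device can resume the same stream.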