Resume tokens and last-event IDs for LLM streaming: How they work & what they cost to build

Blog post from Ably

Post Details
Company: Ably
Author: Amber Dawson
Word Count: 1,639
Summary

Amber Dawson examines what it takes to implement resume tokens and last-event IDs for LLM streaming in AI applications, mechanisms that let a stream resume seamlessly after a disconnection. She explains the foundational components of resumable streaming (message identifiers, client state tracking, reconnection protocols, and catch-up delivery) and highlights the limitations of Server-Sent Events (SSE) for bidirectional messaging and distributed infrastructure. Dawson discusses the complexity of building resume functionality on WebSockets, which requires custom session management and token-level storage that can become a performance bottleneck at high data volumes. The article also covers duplicates and gaps in message delivery, the difficulty of maintaining multi-device continuity, and the significant effort required to build a reliable, production-grade resumable streaming system. Ultimately, Dawson suggests that while building a custom solution is feasible, using transport infrastructure with built-in resume capabilities can alleviate many of these challenges and let teams focus on application logic rather than infrastructure concerns.
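
To make the four components named in the summary concrete, here is a minimal in-memory sketch of message identifiers, client state tracking, a reconnect step, and catch-up delivery, with duplicate and gap handling on the client. It is an illustration under stated assumptions, not the article's code or Ably's API: `StreamBuffer`, `StreamClient`, and `demo` are hypothetical names, everything runs in a single process, and IDs are plain integers starting at 1.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class StreamBuffer:
    """Hypothetical server side: assigns sequential IDs and keeps tokens for catch-up."""

    events: list[tuple[int, str]] = field(default_factory=list)

    def append(self, token: str) -> int:
        # Message identifier: IDs start at 1 and increase by exactly 1.
        event_id = len(self.events) + 1
        self.events.append((event_id, token))
        return event_id

    def replay_after(self, last_event_id: int) -> list[tuple[int, str]]:
        # Catch-up delivery: return every event newer than the client's last ID.
        return [(i, t) for i, t in self.events if i > last_event_id]


@dataclass
class StreamClient:
    """Hypothetical client side: tracks the last applied ID, drops duplicates, flags gaps."""

    last_event_id: int = 0
    text: str = ""

    def receive(self, event_id: int, token: str) -> None:
        if event_id <= self.last_event_id:
            return  # duplicate: already applied, ignore it
        if event_id != self.last_event_id + 1:
            raise ValueError(
                f"gap: expected {self.last_event_id + 1}, got {event_id}"
            )
        self.text += token
        self.last_event_id = event_id


def demo() -> str:
    buffer = StreamBuffer()
    client = StreamClient()
    for token in ["Resum", "able ", "stream", "ing"]:
        buffer.append(token)

    # The client receives the first two tokens, then the connection drops.
    for event_id, token in buffer.events[:2]:
        client.receive(event_id, token)

    # A redelivered event after reconnecting is ignored as a duplicate.
    client.receive(2, "able ")

    # Reconnect: the client presents its last ID and the server replays the rest.
    for event_id, token in buffer.replay_after(client.last_event_id):
        client.receive(event_id, token)
    return client.text
```

Calling `demo()` returns the full text even though delivery was interrupted after the second token. In a distributed deployment, the buffer would need to live in shared storage reachable by whichever server handles the reconnect, which is where the storage and session-management costs described in the summary come from.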