Home / Companies / LangChain / Blog / Post Details
Content Deep Dive

Prompt Caching with Deep Agents

Blog post from LangChain

Post Details
Company
Date Published
Author
Alex Olsen
Word Count
1,006
Company Posts That Month
24
Language
English
Hacker News Points
-
Summary

Prompt caching is a cost-effective feature for running AI agents at scale, offering significant reductions in token costs by storing and reusing snapshots of a model's state after processing a prompt. The Deep Agents harness leverages prompt caching across various model providers to minimize API costs by automatically setting explicit cache breakpoints when supported, and opting into implicit caching otherwise, to maximize cache reads. While different providers offer varied levels of support for features like explicit breakpoints, configurable TTLs, and cache prewarming, Deep Agents ensures that users can switch providers without losing cost-saving benefits. Real-world evaluations with models like claude-haiku-4-5, gpt-5.4-mini, and gemini-3.5-flash have shown token cost reductions ranging from 49% to 80%. As the feature landscape evolves, Deep Agents will integrate new capabilities, while tools like LangSmith provide observability into API costs and caching efficiency to further optimize agent performance.

Trends Found in this Post
Trend Post Mentions Total Month Mentions Posts Companies MoM
AI Agents 1 4,874 1,103 240 -1%
Observability 1 3,430 674 183 +0%