Prompt Caching with Deep Agents

Post Details

Company

LangChain

Date Published

June 26, 2026

Author

Alex Olsen

Word Count

1,006

Company Posts That Month

24

Language

English

Hacker News Points

-

Source URL

www.langchain.com/blog/deep-agents-prompt-caching

Summary

Prompt caching reduces the cost of running AI agents at scale: the provider stores a snapshot of the model's state after it processes a prompt and reuses that snapshot on later requests, which can significantly reduce token costs. The Deep Agents harness applies prompt caching across model providers to minimize API costs. It automatically sets explicit cache breakpoints where the provider supports them and opts into implicit caching otherwise, in both cases aiming to maximize cache reads. Providers differ in their support for explicit breakpoints, configurable TTLs, and cache prewarming, but Deep Agents lets users switch providers without losing the cost savings. Real-world evaluations with claude-haiku-4-5, gpt-5.4-mini, and gemini-3.5-flash showed token cost reductions ranging from 49% to 80%. As provider features evolve, Deep Agents will integrate new capabilities, and tools like LangSmith provide observability into API costs and caching efficiency to help optimize agent performance further.
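
The arithmetic behind savings of this kind can be sketched roughly as follows. This is an illustrative model only, not the method used in the post's evaluations: the function name and both parameters are hypothetical, it assumes cached tokens are billed at a fixed fraction of the base input price, and it ignores cache-write surcharges and output tokens.

```python
def input_cost_reduction(cached_fraction: float, read_discount: float) -> float:
    """Estimate the fractional reduction in input-token cost from cache reads.

    cached_fraction: share of input tokens served from cache (0.0 to 1.0).
    read_discount:   price of a cached token relative to an uncached one
                     (hypothetical; actual rates vary by provider).
    """
    if not 0.0 <= cached_fraction <= 1.0:
        raise ValueError("cached_fraction must be in [0, 1]")
    if not 0.0 <= read_discount <= 1.0:
        raise ValueError("read_discount must be in [0, 1]")
    # Uncached tokens cost full price; cached tokens cost read_discount of it.
    relative_cost = (1.0 - cached_fraction) + cached_fraction * read_discount
    return 1.0 - relative_cost


if __name__ == "__main__":
    # Hypothetical inputs: half the tokens are cache reads at 10% of base price.
    print(f"{input_cost_reduction(0.5, 0.1):.0%}")
```

Under these assumptions, if half of the input tokens are cache reads billed at 10% of the base price, the model gives a 45% reduction in input cost. The savings grow with the share of tokens read from cache, which is why maximizing cache reads matters.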

Trends Found in this Post

Trend	Post Mentions	Total Month Mentions	Posts	Companies	MoM
AI Agents	1	4,874	1,103	240	-1%
Observability	1	3,430	674	183	+0%