
Scaling MCP Tools with Anthropic's Defer Loading

Blog post from Unified.to

Post Details
Word Count
1,027
Summary

Anthropic's defer_loading option lets AI applications built on Unified's MCP server load tool definitions on demand instead of sending all of them upfront. This addresses two problems that appear as the tool count grows: context window bloat, because every tool definition consumes tokens, and degraded tool selection, because the model is less likely to pick the right tool from a long list. The Unified MCP server can expose over 22,000 tools across its integrations, while loading even a few hundred tools upfront already strains both token consumption and selection accuracy.

Deferred tools are discovered through tool search, which comes in two variants: a regex variant for pattern-based matching and a BM25 variant for natural-language queries ranked by keyword relevance. By deferring tool loading, developers can significantly reduce context usage, improve tool selection accuracy, and support multi-tenant applications whose customers connect different data sources.

Best practices include keeping core tools non-deferred, using permissions to scope tools, restricting the set of tools exposed, monitoring token usage, and combining deferred loading with prompt caching in multi-turn conversations.
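The deferral idea described above can be illustrated with a small local sketch. It is not Anthropic's or Unified's implementation and makes no API calls. The only name taken from the post is the defer_loading flag. The tool catalog, the helper functions, and the regex matcher are hypothetical stand-ins meant to show the shape of the technique: core tools stay in the initial context, and everything else is found by search when needed.

```python
import re

# Hypothetical tool catalog; names and descriptions are invented for illustration.
CATALOG = [
    {"name": "list_crm_contacts", "description": "List contacts from a CRM integration"},
    {"name": "create_crm_deal", "description": "Create a deal in a CRM integration"},
    {"name": "list_hris_employees", "description": "List employees from an HRIS integration"},
    {"name": "get_current_user", "description": "Return the signed-in user"},
]

# Core tools are kept non-deferred, following the best practice in the summary.
CORE_TOOLS = {"get_current_user"}


def mark_deferred(catalog, core):
    """Return copies of the tool definitions with defer_loading set on non-core tools."""
    tools = []
    for tool in catalog:
        entry = dict(tool)  # copy so the original catalog is left unchanged
        if tool["name"] not in core:
            entry["defer_loading"] = True
        tools.append(entry)
    return tools


def initial_context(tools):
    """Names of the tools that would be loaded upfront (non-deferred only)."""
    return [t["name"] for t in tools if not t.get("defer_loading")]


def regex_search(tools, pattern, limit=5):
    """Toy stand-in for regex tool search over deferred tools (assumed behavior)."""
    rx = re.compile(pattern, re.IGNORECASE)
    hits = [
        t["name"]
        for t in tools
        if t.get("defer_loading")
        and (rx.search(t["name"]) or rx.search(t["description"]))
    ]
    return hits[:limit]


if __name__ == "__main__":
    tools = mark_deferred(CATALOG, CORE_TOOLS)
    print("loaded upfront:", initial_context(tools))
    print("found on demand:", regex_search(tools, r"crm"))
```

In this sketch only one of the four definitions occupies the initial context, and the CRM tools surface only when a query asks for them. With a catalog of hundreds or thousands of tools the same split is what keeps token usage bounded.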