Scaling MCP Tools with Anthropic's (& OpenAI's) Defer Loading
Blog post from Unified.to
Anthropic's defer_loading feature, integrated with OpenAI's tools and Unified's MCP server, addresses the problem of managing hundreds of tools in an AI application: tools are loaded on demand rather than upfront, which reduces context window bloat and improves tool selection accuracy. The Unified MCP server supports over 22,000 tools, and the traditional approach of sending every tool definition with every request becomes inefficient at that scale. The feature offers two search variants: Regex Tool Search, which matches tools against a pattern, and BM25 Tool Search, which ranks tools by keyword relevance to a natural-language query (BM25 is a lexical ranking function, so it scores term overlap rather than semantic similarity). With deferred loading, users can scale tools across multiple integrations while keeping accuracy high and context usage low, which makes it easier to build AI applications that interact with diverse data sources. The approach is most beneficial with more than 20 tools or with multiple integrations.
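To make the two search variants concrete, here is a minimal, self-contained sketch of how a deferred tool catalog could be searched. It is an illustration only, not Anthropic's or Unified's implementation: the tool names, the `CATALOG` structure, and the helper functions `regex_search` and `bm25_search` are all hypothetical, and the BM25 parameters (`k1`, `b`) are common textbook defaults rather than values taken from either product.

```python
import math
import re
from collections import Counter

# Hypothetical catalog. Names and descriptions are invented for illustration.
# In a real setup each entry would also carry its full input schema, which is
# the part that deferred loading keeps out of the context window.
CATALOG = [
    {"name": "crm_list_contacts", "description": "List contacts in the CRM", "defer_loading": True},
    {"name": "crm_create_deal", "description": "Create a new deal in the CRM pipeline", "defer_loading": True},
    {"name": "hris_list_employees", "description": "List employees from the HR system", "defer_loading": True},
    {"name": "ats_list_candidates", "description": "List job candidates in the applicant tracking system", "defer_loading": True},
]


def tokenize(text):
    """Lowercase and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


def regex_search(pattern, catalog, limit=3):
    """Return names of tools whose name or description matches the pattern."""
    rx = re.compile(pattern, re.IGNORECASE)
    hits = [t["name"] for t in catalog
            if rx.search(t["name"]) or rx.search(t["description"])]
    return hits[:limit]


def bm25_search(query, catalog, limit=3, k1=1.5, b=0.75):
    """Rank tools by BM25 score of the query against name + description."""
    docs = [tokenize(t["name"] + " " + t["description"]) for t in catalog]
    n = len(docs)
    avgdl = sum(len(d) for d in docs) / n
    # Document frequency: in how many tool entries each term appears.
    df = Counter(term for d in docs for term in set(d))
    scored = []
    for tool, doc in zip(catalog, docs):
        tf = Counter(doc)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(doc) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        if score > 0:
            scored.append((score, tool["name"]))
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda s: (-s[0], s[1]))
    return [name for _, name in scored[:limit]]


if __name__ == "__main__":
    print(regex_search(r"^crm_", CATALOG))
    print(bm25_search("create a deal", CATALOG))
```

In this sketch only the names returned by a search would have their full definitions expanded into the model's context; everything else in the catalog stays deferred. The regex variant suits cases where the caller already knows a naming convention (such as a `crm_` prefix), while the BM25 variant suits free-text queries where the exact tool name is unknown.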