
We Threw 4,000 Tools at Anthropic's New Tool Search. Here's What Happened.

Blog post from Arcade

Post Details
Company: Arcade
Author: Eric Gustin
Word Count: 542
Language: English
Summary

Anthropic's new Tool Search is meant to let Claude access thousands of tools without overwhelming its context window, a promising step toward leaner agent workflows. Arcade.dev tested the feature by loading 4,027 tools spanning platforms such as Gmail, Slack, and Salesforce, then running 25 routine tasks. Regex-based search succeeded 56% of the time and BM25-based search 64%, which leaves significant room for improvement in tool retrieval accuracy. Some tasks went smoothly, such as creating Google Calendar events and sending Microsoft Teams messages. Others failed, most notably when the search could not retrieve common tools for sending emails or creating tickets. The post still credits Anthropic's approach for tackling context bloat and offering potential token savings, while cautioning that current retrieval accuracy may not yet meet enterprise reliability standards. Arcade says it remains committed to improving agent performance and promises further updates on tool interaction capabilities.
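
To make the two retrieval styles mentioned above concrete, here is a minimal sketch of regex matching and BM25 ranking over a tiny tool catalog. It is not Anthropic's implementation or Arcade's test harness: the four tool names and descriptions, the `tokenize` helper, and the `k1`/`b` defaults are all assumptions chosen for illustration. It only shows that both methods are lexical, so they can match only on words that literally appear in a tool's name or description.

```python
import math
import re
from collections import Counter

# Hypothetical catalog: names and descriptions are illustrative, not Arcade's real tools.
TOOLS = {
    "Gmail_SendEmail": "Send an email message from a Gmail account",
    "Slack_SendMessage": "Send a message to a Slack channel or user",
    "GoogleCalendar_CreateEvent": "Create an event on a Google Calendar",
    "Salesforce_CreateCase": "Create a support case in Salesforce",
}


def tokenize(text):
    """Lowercase and split into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def regex_search(pattern, tools):
    """Return tools whose name or description matches the pattern."""
    rx = re.compile(pattern, re.IGNORECASE)
    return [name for name, desc in tools.items()
            if rx.search(name) or rx.search(desc)]


def bm25_search(query, tools, k1=1.5, b=0.75, top_k=3):
    """Rank tools by a standard BM25 score over name plus description."""
    docs = {name: tokenize(name + " " + desc) for name, desc in tools.items()}
    n = len(docs)
    avgdl = sum(len(d) for d in docs.values()) / n
    # Document frequency: number of tools containing each term.
    df = Counter(term for d in docs.values() for term in set(d))
    scores = {}
    for name, d in docs.items():
        tf = Counter(d)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log(1 + (n - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        if score > 0:
            scores[name] = score
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


print(regex_search(r"send.*email", TOOLS))
print(regex_search(r"ticket", TOOLS))
print(bm25_search("send an email", TOOLS))
print(bm25_search("create a ticket", TOOLS))
```

In this toy catalog, a regex for `ticket` matches nothing because no tool uses that word, while BM25 still ranks the "create" tools through partial term overlap. Whether anything similar explains the misses reported in the post is not something the summary states.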