Code Mode: give agents an entire API in 1,000 tokens
Blog post from Cloudflare
Model Context Protocol (MCP) has emerged as a standard for AI agents to utilize external tools, but it faces a challenge where adding more tools limits the space available for tasks in the model's context window. Addressing this issue, Code Mode allows models to write and execute code using a typed SDK, thereby creating a compact plan for tool operations and reducing context window usage. This technique, also explored by Anthropic, has been applied to a new MCP server for the Cloudflare API, which consolidates the API's extensive operations into just two tools: search() and execute(). By using Code Mode, this server significantly reduces token usage by 99.9%, maintaining a fixed footprint regardless of the number of API endpoints. This efficiency is achieved by minimizing the input tokens to around 1,000, compared to over a million tokens without Code Mode. The approach is being open-sourced in the Cloudflare Agents SDK, enabling broader application in other MCP servers and AI agents.