
Your agent would rather write code

Blog post from Pydantic

Post Details
Company:
Date Published:
Author: -
Word Count: 1,666
Language: English
Hacker News Points: -
Summary

While building AI observability tooling for Logfire, the team first created more than 40 carefully designed MCP tools for tasks such as running SQL queries and managing dashboards, only to find that a single exec tool that runs Python code was more efficient. The shift came from the realization that large language models (LLMs) are better at writing code than at selecting from long tool menus, a point also illustrated by Cloudflare's similar move to fewer tools. The exec tool, powered by a Python interpreter called Monty, collapses a multi-step task into a single script that executes server-side, which reduces API calls, execution time, and token usage. Challenges remain despite these gains: models still make "vibe coding" errors, and Monty is still being developed to support more Python features. For security, the team tightly controls Monty's external interactions, and it uses an evaluation framework to check the exec tool's reliability in multi-step processes. As MCP evolves with features like interactive UIs and human-in-the-loop capabilities, the focus shifts toward fewer tools with more expressive execution, on the view that letting models write code plays to their strengths.
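
To make the single-tool idea concrete, here is a minimal sketch of an exec-style tool. It is an illustration under stated assumptions, not the post's implementation: Python's built-in exec() stands in for Monty (and is not a secure sandbox), and the names exec_tool, query_spans, FAKE_SPANS, and MODEL_SCRIPT are all hypothetical. What it shows is the shape of the approach: the host exposes a small allowlist of functions, and one model-written script does the querying, filtering, and aggregating that would otherwise take several separate tool calls.

```python
# Toy stand-in for an exec-style tool. This is NOT Monty and NOT a real
# sandbox: the built-in exec() is used only to illustrate the shape.

# Hypothetical in-memory data standing in for an observability backend.
FAKE_SPANS = [
    {"service": "api", "duration_ms": 120, "error": False},
    {"service": "api", "duration_ms": 950, "error": True},
    {"service": "worker", "duration_ms": 300, "error": False},
    {"service": "worker", "duration_ms": 1800, "error": True},
]


def query_spans(service):
    """Hypothetical host function: return the spans for one service."""
    return [s for s in FAKE_SPANS if s["service"] == service]


# Only these names are visible to the script; nothing else is exposed.
ALLOWED_HOST_FUNCTIONS = {"query_spans": query_spans}
SAFE_BUILTINS = {"len": len, "max": max, "sum": sum, "sorted": sorted}


def exec_tool(script):
    """Run a model-written script and return what it assigns to `result`."""
    namespace = {"__builtins__": SAFE_BUILTINS, **ALLOWED_HOST_FUNCTIONS}
    exec(script, namespace)
    return namespace.get("result")


# One script replaces several round trips (query, filter, aggregate).
MODEL_SCRIPT = """
summary = {}
for service in ["api", "worker"]:
    spans = query_spans(service)
    errors = [s for s in spans if s["error"]]
    summary[service] = {
        "count": len(spans),
        "errors": len(errors),
        "slowest_ms": max(s["duration_ms"] for s in spans),
    }
result = summary
"""

output = exec_tool(MODEL_SCRIPT)
print(output)
```

In this sketch the model sends one script and receives one compact result, so intermediate data never passes back through the conversation. Restricting the builtins here only hints at the idea of controlling what a script can reach; a real deployment would need an actual isolated interpreter.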