Your agent would rather write code

Post Details

Company

Pydantic

Date Published

April 8, 2026

Author

-

Word Count

1,666

Company Posts That Month

13

Language

English

Hacker News Points

-

Post removed?

No

Source URL

pydantic.dev/articles/your-agent-would-rather-write-code

Summary

While exploring how to optimize AI observability with Logfire, the developers first built more than 40 carefully designed MCP tools for tasks such as running SQL queries and managing dashboards, then found that a single exec tool that runs Python code was more efficient. The change followed from the observation that large language models (LLMs) are better at writing code than at choosing from a long tool menu, a point also illustrated by Cloudflare's similar move to fewer tools. The exec tool is powered by Monty, a Python interpreter, and lets a multi-step task be written as one script that runs server-side, which reduces API calls, execution time, and token usage. Challenges remain: models still make "vibe coding" errors, and Monty is still being developed to support more Python features. For security, the team tightly controls Monty's external interactions, and it uses an evaluation framework to check that the exec tool is reliable in multi-step processes. As MCP gains features such as interactive UIs and human-in-the-loop capabilities, the emphasis shifts toward fewer tools with more expressive execution, since letting models write code fits their capabilities better.
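The single-tool idea in the summary can be sketched roughly as follows. This is an illustration under stated assumptions, not the Logfire or Monty implementation: `run_query`, `ALLOWED`, and `exec_tool` are hypothetical names, the query returns canned rows, and Python's built-in `exec` stands in for a real sandboxed interpreter (it is not a security boundary on its own).

```python
from typing import Any, Callable


# Hypothetical host function the server chooses to expose (illustrative only).
def run_query(sql: str) -> list[dict[str, Any]]:
    """Stand-in for a server-side SQL query; ignores `sql` and returns canned rows."""
    return [
        {"service": "api", "duration_ms": 120},
        {"service": "api", "duration_ms": 480},
        {"service": "worker", "duration_ms": 95},
    ]


# Allow-list of names a model-written script may call.
ALLOWED: dict[str, Callable[..., Any]] = {"run_query": run_query}


def exec_tool(code: str) -> Any:
    """Run a script with only allow-listed names and a few builtins visible.

    Assumption: built-in exec() is used purely for illustration. It is NOT
    a sandbox; a real deployment would use a purpose-built interpreter.
    """
    namespace: dict[str, Any] = {
        "__builtins__": {"len": len, "max": max, "sum": sum},
        **ALLOWED,
    }
    exec(code, namespace)
    # By convention in this sketch, the script leaves its answer in `result`.
    return namespace.get("result")


# One script replaces several round trips: query, filter, aggregate.
script = """
rows = run_query("SELECT service, duration_ms FROM spans")
slow = [r for r in rows if r["duration_ms"] > 100]
result = {"slow_count": len(slow), "max_ms": max(r["duration_ms"] for r in slow)}
"""

output = exec_tool(script)
print(output)
```

The query, the filter, and the aggregation happen in one server-side call, so only the small `result` value needs to travel back to the model rather than every intermediate row.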

Trends Found in this Post

Trend	Post Mentions	Total Month Mentions	Posts	Companies	MoM
MCP	10	6,108	613	170	+36%
LLM	2	5,932	1,046	223	-2%
OpenTelemetry	2	1,197	139	44	+92%
Harness engineering	1	164	111	62	+6%
Observability	1	4,496	812	176	+40%
