How we made a SQL query optimization agent 59% more accurate using autoresearch and LLM Observability
Blog post from Datadog
The Datadog team set out to improve the automated query optimization recommendations in Database Monitoring (DBM) by pairing an AI agent with the existing multi-source heuristic engine. The heuristic engine was already precise (P=0.903). The AI agent was less precise at first but could identify a broader set of potential optimizations, so the team built a rigorous evaluation dataset and experiment infrastructure for rapid iteration. Using Karpathy's autoresearch tool, they ran 23 autonomous experiments that raised the agent's precision from P=0.54 to P=0.86, a relative gain of about 59%. The improvement came from three changes: optimizing the agent's system prompt and tool descriptions, compressing the solution to a smaller model for a better cost-performance balance, and adding a two-pass approach to reach the precision goal. LLM Observability Experiments gave the process a structured way to track and analyze each experiment in detail, and the same methodology applies to AI agent development beyond query optimization.
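The headline figure follows from the two precision values quoted above. The sketch below is a minimal illustration, not the team's evaluation code: it uses the standard definition of precision (true positives divided by all flagged recommendations) and computes the relative gain from 0.54 to 0.86. The function names and the example counts are hypothetical.

```python
def precision(true_positives: int, false_positives: int) -> float:
    """Standard precision: the share of flagged recommendations that were correct."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("precision is undefined when nothing was flagged")
    return true_positives / flagged


def relative_improvement(before: float, after: float) -> float:
    """Relative change from `before` to `after`, as a fraction of `before`."""
    if before == 0:
        raise ValueError("relative improvement is undefined for a zero baseline")
    return (after - before) / before


# Precision values quoted in the summary.
baseline_precision = 0.54
final_precision = 0.86

gain = relative_improvement(baseline_precision, final_precision)
print(f"relative precision gain: {gain:.1%}")  # prints "relative precision gain: 59.3%"

# Hypothetical counts, for illustration only: 86 correct out of 100 flagged.
print(f"example precision: {precision(86, 14):.2f}")  # prints "example precision: 0.86"
```

Note that the 59% is a relative improvement. In absolute terms, precision rose by 0.32, which still leaves the agent slightly below the heuristic engine's P=0.903.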