Self-Improving Agents, Powered by Your Evals
Blog post from Fireworks AI
Eval Protocol now integrates with GEPA to optimize prompts for open-source models without modifying model weights. The same unified evaluation interface that supports reinforcement learning (RL) can turn the failure signals from your evals into concrete prompt improvements, raising model accuracy efficiently.

In a case study with a Text2SQL agent, GEPA's reflective prompt optimization delivered a substantial accuracy increase on both the validation and test sets, showing how much headroom prompt adjustments alone can unlock.

With Eval Protocol, the evals you write do double duty: they diagnose failures and drive improvement. The result is a continuous loop in which evals both measure and raise performance, with a seamless on-ramp to techniques like reinforcement fine-tuning (RFT) for even greater accuracy gains.
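To make the loop concrete, here is a minimal, self-contained sketch of the idea in Python. This is not Eval Protocol's or GEPA's actual API; every name (`EVAL_SET`, `model`, `evaluate`, `reflect`, `optimize`) is hypothetical, and the "model" and "reflection" steps are deterministic stand-ins for real LLM calls. It only illustrates the pattern: run evals, collect failure signals, reflect them back into the prompt, and keep the variant that scores best.

```python
# Toy eval set: each case pairs a question with the SQL construct the
# gold query needs. Real Text2SQL evals would compare executed queries.
EVAL_SET = [
    {"question": "count users", "expect": "COUNT"},
    {"question": "total revenue", "expect": "SUM"},
    {"question": "newest order", "expect": "ORDER BY"},
]

def model(prompt: str, question: str) -> str:
    """Stand-in for an LLM call (hypothetical): it 'succeeds' only when
    the prompt already mentions the construct the question needs."""
    for hint, keyword in [("count", "COUNT"), ("total", "SUM"),
                          ("newest", "ORDER BY")]:
        if hint in question and keyword.lower() in prompt.lower():
            return keyword
    return "SELECT *"  # wrong answer

def evaluate(prompt: str) -> tuple[float, list[str]]:
    """Run the eval suite; return accuracy plus failure signals."""
    failures = [case["expect"] for case in EVAL_SET
                if model(prompt, case["question"]) != case["expect"]]
    return 1 - len(failures) / len(EVAL_SET), failures

def reflect(prompt: str, failures: list[str]) -> str:
    """Stand-in for the reflective step: turn failure signals into a
    concrete prompt edit. A real system would ask an LLM to do this."""
    return prompt + " Use " + ", ".join(failures) + \
        " where the question calls for it."

def optimize(prompt: str, rounds: int = 3) -> str:
    """Keep the best-scoring prompt variant across reflection rounds."""
    best_score, failures = evaluate(prompt)
    for _ in range(rounds):
        if not failures:
            break
        candidate = reflect(prompt, failures)
        score, cand_failures = evaluate(candidate)
        if score > best_score:
            prompt, best_score, failures = candidate, score, cand_failures
    return prompt

base = "Translate the question into SQL."
tuned = optimize(base)
```

The key design point the sketch captures is that the eval is both judge and teacher: the same failure list that scores a prompt is fed to the reflection step that rewrites it, so no model weights ever change.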