
Introducing online evals in Pydantic Logfire

Blog post from Pydantic

Post Details
- Company: Pydantic
- Date Published:
- Author: Karina Ung
- Word Count: 1,035
- Language: English
- Hacker News Points: -
Summary

Pydantic Logfire's online evaluations score AI agents in real time on live production data, complementing the offline evaluations run during development. By attaching evaluators to functions or agents, users can sample a fraction of production traffic and review the results in the Logfire UI, enabling continuous monitoring of metrics such as hallucination rates, tool-use accuracy, and response quality. Online evaluations use the same Evaluator classes as offline ones, so scoring criteria stay consistent and production performance gets immediate feedback; this makes it possible to spot regressions or improvements after deployment. Because Logfire is built on OpenTelemetry, evaluation data flows through the same pipeline as traces and metrics. The Logfire UI presents trend lines and event filtering, turning evaluations into a queryable surface rather than just a dashboard. Online evaluations do not replace human review; instead they streamline it, letting human reviewers focus on low-scoring traces and edge cases and thereby refine both agent performance and evaluator accuracy over time.