
Online evals in AI Configs is now GA

Blog post from LaunchDarkly

Post Details

Company: LaunchDarkly
Date Published:
Author: Kelvin Yap
Word Count: 660
Language: English
Hacker News Points: -
Summary

Online evaluations, now generally available in AI Configs, automatically assess AI output quality using large language models (LLMs) as judges, and the release adds customizable judges that let teams define their own criteria for what counts as "good" output. This flexibility lets teams tailor evaluations to their specific needs, ensuring that AI behavior stays aligned with the intended experience and the policy boundaries of their industry, brand, or workflow. For instance, a banking chatbot must maintain a professional tone to build trust, avoiding casual language that, while accurate, could undermine user confidence. Custom judges let teams score outputs against these nuanced requirements, producing actionable signals during rollouts so they can pause or revert a change if necessary. The resulting scores complement existing release metrics such as latency and cost, and judges are managed through the same workflow as other AI Configs, so evaluation criteria can be iterated on and refined continuously.
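
To make the judge-scoring idea concrete, here is a minimal sketch of the general LLM-as-judge pattern the post describes: a custom criterion is expressed as a judging prompt, each output gets a score, and an aggregate score can gate a rollout. This is not LaunchDarkly's AI Configs API; the JUDGE_PROMPT wording, the call_llm() helper, the 1-5 scale, and the threshold value are all illustrative assumptions.

```python
# Illustrative sketch of an LLM-as-judge custom evaluation.
# Not LaunchDarkly's API: JUDGE_PROMPT, call_llm(), the 1-5 scale,
# and the pause threshold are assumptions made for this example.

JUDGE_PROMPT = """You are evaluating a banking chatbot reply.
Criterion: the reply must use a professional, trust-building tone and avoid
casual or slang language, even when the content is factually correct.
Score the reply from 1 (fails the criterion) to 5 (fully meets it).
Reply with the number only.

Chatbot reply:
{output}
"""


def call_llm(prompt: str) -> str:
    """Placeholder for a call to whichever judge model the team chooses."""
    raise NotImplementedError("wire this to your LLM provider")


def judge_tone(output: str) -> int:
    """Ask the judge model to score one output against the custom criterion."""
    raw = call_llm(JUDGE_PROMPT.format(output=output))
    return int(raw.strip())


def should_pause_rollout(outputs: list[str], threshold: float = 4.0) -> bool:
    """Flag the rollout for pause/revert if the average judge score drops below threshold."""
    scores = [judge_tone(o) for o in outputs]
    return sum(scores) / len(scores) < threshold
```

In the product itself, these judge scores surface alongside metrics like latency and cost during a release, so the pause-or-revert decision sketched above is made in the same workflow used to manage the rest of an AI Config.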