Company
Date Published
Author
Caroline Borders July
Word count
486
Language
English
Hacker News points
None

Summary

The latest Opik releases improve the evaluation and optimization of multi-step agentic systems by focusing on action groups rather than individual LLM calls, which gives a more complete view of how an AI application performs. Users can now evaluate entire multi-turn conversations: they can invite human experts to review and score them, and apply conversation-level metrics such as user frustration and conversational coherence. Opik’s agent optimizer SDK now supports automated optimization of multi-step agents, alongside features such as thread evaluation and thread-level expert feedback that improve collaboration and clarity. The platform integrates with tools like LangGraph and PydanticAI and gives users more control by letting them use separate LLMs for optimization and evaluation. Dmitrii Krasnov from Zencoder also shares how his team uses Opik to improve their AI-powered code assistant, citing better research efficiency and trace visibility. Finally, Opik invites AI developers to several upcoming events to connect, learn, and collaborate with the community.