
monday Service + LangSmith: Building a Code-First Evaluation Strategy from Day 1

Blog post from LangChain

Post Details
Company: LangChain
Date Published: -
Author: -
Word Count: 1,826
Language: English
Hacker News Points: -
Summary

monday.com has built an evals-driven development framework for its AI Native Enterprise Service Management platform, which automates and resolves inquiries across service departments. By treating evaluation as a core component from the outset, the team has sharply accelerated its feedback loops: comprehensive test suites spanning numerous examples now run in minutes rather than hours, thanks to an architecture that executes evaluations in parallel.

The framework is dual-layered. Offline evaluations act as a safety net, running against curated datasets to verify that core logic and specific edge cases remain robust; online evaluations continuously monitor quality in real-time production environments. All evaluations are managed as version-controlled code and deployed through a GitOps-style CI/CD pipeline, which strengthens agent observability and keeps AI interactions at a consistently high standard. Built on tools such as LangSmith and Vitest, this structured approach holds evaluations to the same rigor as production code, keeping the AI service workforce reliable and adaptable across enterprise service management use cases.
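To make the offline-evaluation idea concrete, here is a minimal TypeScript sketch of a code-first harness. All names (`classifyTicket`, `runOfflineEval`, the sample dataset) are hypothetical stand-ins, not monday.com's actual code or the LangSmith API; in practice the agent call would be a traced LLM chain and the evaluator could be an LLM-as-judge, with the suite wired into Vitest and CI/CD. The `Promise.all` call illustrates the parallel execution that turns hours of sequential runs into minutes.

```typescript
// Hypothetical offline-eval harness: curated dataset in, pass rate out.
type Example = { input: string; expected: string };

// A curated dataset of service-desk routing examples (illustrative only).
const dataset: Example[] = [
  { input: "reset my password", expected: "IT" },
  { input: "request a new laptop", expected: "IT" },
  { input: "question about payroll", expected: "HR" },
];

// Stand-in for the real agent call (e.g., a LangSmith-traced chain).
async function classifyTicket(input: string): Promise<string> {
  return input.includes("payroll") ? "HR" : "IT";
}

// Simple exact-match evaluator; production evaluators are often richer.
function score(expected: string, actual: string): boolean {
  return expected === actual;
}

// Run every example concurrently and report the fraction that passed.
async function runOfflineEval(examples: Example[]): Promise<number> {
  const results = await Promise.all(
    examples.map(async (ex) => score(ex.expected, await classifyTicket(ex.input)))
  );
  return results.filter(Boolean).length / examples.length;
}

runOfflineEval(dataset).then((passRate) => {
  console.log(`pass rate: ${passRate}`);
});
```

Because each example is independent, a CI job can gate merges on the pass rate, mirroring the GitOps-style deployment the post describes.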