Content Deep Dive

How to scale agentic evaluation: Lessons from 200,000 SWE-bench runs

Blog post from AI21 Labs

Post Details
Company: AI21 Labs
Date Published:
Author: Yaron Sternbach, VP Engineering
Word Count: 1,384
Language: English
Hacker News Points: -