Company:
Date Published:
Author: Ornella Altunyan, Winnie Tam, Sophie Gao
Word count: 1110
Language: English
Hacker News points: None

Summary

Coursera has built a structured evaluation process to quickly ship reliable AI features that customers love. The company adopted large language models to enhance its user experience, most notably the Coursera Coach chatbot and AI-assisted grading tools, but soon recognized the need for a better evaluation workflow: before a formal framework existed, teams relied on fragmented offline jobs, spreadsheets, and manual human-labeling processes, which made it difficult to validate AI features and push them to production with confidence. The business impact of these features is measurable. Coursera Coach serves as a 24/7 learning assistant and source of psychological support for students, maintaining a 90% learner satisfaction rating, while automated grading addresses a critical scaling challenge in Coursera's educational model. To evaluate AI features, Coursera uses a four-step approach: defining clear evaluation criteria upfront, curating targeted datasets, implementing both heuristic and model-based scorers, and running evaluations and iterating rapidly; a sketch of this loop follows below. This structured framework has transformed Coursera's AI development process, raising confidence in what ships, shortening the path from concept to release, and enabling more comprehensive testing.
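
To make the four-step approach concrete, here is a minimal, framework-agnostic Python sketch of an evaluation loop of the kind described. It is an illustration, not Coursera's actual code: the dataset, the scorer names, and the stand-in grader and judge functions are all hypothetical. It shows criteria encoded as explicit scorers (one heuristic, one model-based), a small curated dataset, and a run step that aggregates scores so each iteration can be compared against the last.

```python
# Hypothetical sketch of the four-step eval loop: criteria -> dataset
# -> scorers -> run & iterate. No names here come from Coursera's code.

from dataclasses import dataclass
from typing import Callable


# Step 1: evaluation criteria, made explicit as named scorers.

def heuristic_scorer(output: str, expected: str) -> float:
    """Cheap deterministic check: exact match on the grade label."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def model_based_scorer(output: str, expected: str,
                       judge_fn: Callable[[str], str]) -> float:
    """LLM-as-judge check. judge_fn is any callable that takes a prompt
    and returns a yes/no answer (e.g. a wrapper around a model API)."""
    prompt = (f"Does the output '{output}' convey the same grade as the "
              f"reference '{expected}'? Answer yes or no.")
    return 1.0 if judge_fn(prompt).strip().lower().startswith("yes") else 0.0


# Step 2: a small, targeted dataset of failure-prone cases.

@dataclass
class Case:
    submission: str
    expected_grade: str


dataset = [
    Case("The mitochondria is the powerhouse of the cell.", "pass"),
    Case("I don't know.", "fail"),
]


# Steps 3-4: run every scorer over every case, aggregate, then iterate
# on prompts or models until the scores are acceptable.

def run_eval(grade_fn: Callable[[str], str],
             judge_fn: Callable[[str], str]) -> dict:
    scores = {"heuristic": [], "model_based": []}
    for case in dataset:
        output = grade_fn(case.submission)
        scores["heuristic"].append(
            heuristic_scorer(output, case.expected_grade))
        scores["model_based"].append(
            model_based_scorer(output, case.expected_grade, judge_fn))
    return {name: sum(vals) / len(vals) for name, vals in scores.items()}


if __name__ == "__main__":
    # Stand-ins so the sketch runs without an API key: a trivial grader
    # and a judge that always agrees. In practice these would be the
    # production grading pipeline and a real judge-model call.
    def fake_grader(text: str) -> str:
        return "fail" if "don't know" in text else "pass"

    def fake_judge(prompt: str) -> str:
        return "yes"

    print(run_eval(fake_grader, fake_judge))
```

Keeping the scorers as plain callables over a shared dataset is what enables the "iterate rapidly" step: swapping in a new prompt or model only changes `grade_fn`, so successive runs produce directly comparable aggregate scores.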