
In this session, you'll learn how to build evaluation systems for real-world AI applications. We’ll cover how to design evaluation workflows, run evals using the OpenAI Evals API, and use structured testing to measure AI system quality and reliability in practice.
You’ll also learn how to define graders, structure evaluation datasets, and run evals against real inputs to generate meaningful quality signals. We’ll walk through how to interpret results, identify failure patterns, and compare system performance across different configurations.
This session is geared toward intermediate builders who want to design and run evals in practice and move from prototyping to reliable, production-ready AI systems.
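To give a sense of the workflow, here is a minimal sketch of defining an eval with a single string-check grader using the OpenAI Python SDK. The eval name, dataset fields (ticket, correct_label), and grader name below are hypothetical examples; the overall shape (data_source_config, testing_criteria, template variables like {{ item.correct_label }}) follows the Evals API reference at the time of writing and may evolve.

    from openai import OpenAI

    client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

    # Define an eval: a dataset schema plus a grader (testing criterion).
    # The string-check grader compares the model's output to the labeled answer.
    evaluation = client.evals.create(
        name="Support ticket triage",  # hypothetical example eval
        data_source_config={
            "type": "custom",
            "item_schema": {
                "type": "object",
                "properties": {
                    "ticket": {"type": "string"},
                    "correct_label": {"type": "string"},
                },
                "required": ["ticket", "correct_label"],
            },
            "include_sample_schema": True,
        },
        testing_criteria=[
            {
                "type": "string_check",
                "name": "Label matches reference",
                "input": "{{ sample.output_text }}",
                "operation": "eq",
                "reference": "{{ item.correct_label }}",
            }
        ],
    )

    # Use this eval id with client.evals.runs.create(...) to run it
    # against a model and a dataset of real inputs.
    print(evaluation.id)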
Speakers
Sean Lubbers
Technical Enablement Manager @ OpenAI
April 09, 5:00 PM GMT
Online
Organized by OpenAI Academy

