Episode 50: A Field Guide to Rapidly Improving AI Products -- With Hamel Husain
If we want AI systems that actually work, we need to get much better at evaluating them, not just building more pipelines, agents, and frameworks.
In this episode, Hugo talks with Hamel Husain (ex-Airbnb, GitHub, DataRobot) about how teams can improve AI products by focusing on error analysis, data inspection, and systematic iteration. The conversation is based on Hamel’s blog post A Field Guide to Rapidly Improving AI Products, which he joined Hugo’s class to discuss.
They cover:
🔍 Why most teams struggle to measure whether their systems are actually improving
📊 How error analysis helps you prioritize what to fix (and when to write evals)
🧮 Why evaluation isn’t just a metric — but a full development process
⚠️ Common mistakes when debugging LLM and agent systems
🛠️ How to think about the tradeoffs in adding more evals vs. fixing obvious issues
👥 Why enabling domain experts — not just engineers — can accelerate iteration
If you’ve ever built an AI system and found yourself unsure how to make it better, this conversation is for you.
LINKS
- A Field Guide to Rapidly Improving AI Products by Hamel Husain
- Vanishing Gradients YouTube Channel
- Upcoming Events on Luma
- Hugo's recent newsletter about upcoming events and more!
🎓 Learn more:
- Hugo's course: Building LLM Applications for Data Scientists and Software Engineers — next cohort starts July 8: https://maven.com/s/course/d56067f338
- Hamel & Shreya's course: Evals for LLMs — use code GOHUGORGOHOME for $800 off
📺 Watch the video version on YouTube