Interview-based podcast on all things data science, entrepreneurship, statistics, machine learning, open source, and Python. Formerly PyData Deep Dive.
Thomas Wiecki Podcasts
A podcast about all things data, brought to you by data scientist Hugo Bowne-Anderson. It's time for more critical conversations about the challenges in our industry in order to build better compasses for the solution space! To this end, this podcast will consist of long-format conversations between Hugo and other people who work broadly in the data science, machine learning, and AI spaces. We'll dive deep into all the moving parts of the data world, so if you're new to the space, you'll hav…

Episode 58: Building GenAI Systems That Make Business Decisions with Thomas Wiecki (PyMC Labs)
1:00:45
While most conversations about generative AI focus on chatbots, Thomas Wiecki (PyMC Labs, PyMC) has been building systems that help companies make actual business decisions. In this episode, he shares how Bayesian modeling and synthetic consumers can be combined with LLMs to simulate customer reactions, guide marketing spend, and support strategy. …

Episode 59: Patterns and Anti-Patterns For Building with AI
47:37
John Berryman (Arcturus Labs; early GitHub Copilot engineer; co-author of Relevant Search and Prompt Engineering for LLMs) has spent years figuring out what makes AI applications actually work in production. In this episode, he shares the “seven deadly sins” of LLM development — and the practical fixes that keep projects from stalling. From context…

Episode 57: AI Agents and LLM Judges at Scale: Processing Millions of Documents (Without Breaking the Bank)
41:27
While many people talk about “agents,” Shreya Shankar (UC Berkeley) has been building the systems that make them reliable. In this episode, she shares how AI agents and LLM judges can be used to process millions of documents accurately and cheaply. Drawing from work on projects ranging from databases of police misconduct reports to large-scale cust…

Episode 56: DeepMind Just Dropped Gemma 270M... And Here’s Why It Matters
45:40
While much of the AI world chases ever-larger models, Ravin Kumar (Google DeepMind) and his team build across the size spectrum, from billions of parameters down to this week’s release: Gemma 270M, the smallest member yet of the Gemma 3 open-weight family. At just 270 million parameters, a quarter the size of Gemma 1B, it’s designed for speed, effi…

Episode 55: From Frittatas to Production LLMs: Breakfast at SciPy
38:08
Traditional software expects 100% passing tests. In LLM-powered systems, that’s not just unrealistic — it’s a feature, not a bug. Eric Ma leads research data science in Moderna’s data science and AI group, and over breakfast at SciPy we explored why AI products break the old rules, what skills different personas bring (and miss), and how to keep sy…

Episode 54: Scaling AI: From Colab to Clusters — A Practitioner’s Guide to Distributed Training and Inference
41:17
Colab is cozy. But production won’t fit on a single GPU. Zach Mueller leads Accelerate at Hugging Face and spends his days helping people go from solo scripts to scalable systems. In this episode, he joins me to demystify distributed training and inference — not just for research labs, but for any ML engineer trying to ship real software. We talk t…

Episode 53: Human-Seeded Evals & Self-Tuning Agents: Samuel Colvin on Shipping Reliable LLMs
44:49
Demos are easy; durability is hard. Samuel Colvin has spent a decade building guardrails in Python (first with Pydantic, now with Logfire), and he’s convinced most LLM failures have nothing to do with the model itself. They appear where the data is fuzzy, the prompts drift, or no one bothered to measure real-world behavior. Samuel joins me to show …

Episode 52: Why Most LLM Products Break at Retrieval (And How to Fix Them)
28:38
Most LLM-powered features do not break at the model. They break at the context. So how do you retrieve the right information to get useful results, even under vague or messy user queries? In this episode, we hear from Eric Ma, who leads data science research in the Data Science and AI group at Moderna. He shares what it takes to move beyond toy dem…

Episode 51: Why We Built an MCP Server and What Broke First
47:41
What does it take to actually ship LLM-powered features, and what breaks when you connect them to real production data? In this episode, we hear from Philip Carter — then a Principal PM at Honeycomb and now a Product Management Director at Salesforce. In early 2023, he helped build one of the first LLM-powered SaaS features to ship to real users. M…

Episode 50: A Field Guide to Rapidly Improving AI Products -- With Hamel Husain
27:42
If we want AI systems that actually work, we need to get much better at evaluating them, not just building more pipelines, agents, and frameworks. In this episode, Hugo talks with Hamel Husain (ex-Airbnb, GitHub, DataRobot) about how teams can improve AI products by focusing on error analysis, data inspection, and systematic iteration. The convers…

Episode 49: Why Data and AI Still Break at Scale (and What to Do About It)
1:21:45
If we want AI systems that actually work in production, we need better infrastructure—not just better models. In this episode, Hugo talks with Akshay Agrawal (Marimo, ex-Google Brain, Netflix, Stanford) about why data and AI pipelines still break down at scale, and how we can fix the fundamentals: reproducibility, composability, and reliable execut…

Episode 48: How to Benchmark AGI with Greg Kamradt
1:04:25
If we want to make progress toward AGI, we need a clear definition of intelligence—and a way to measure it. In this episode, Hugo talks with Greg Kamradt, President of the ARC Prize Foundation, about ARC-AGI: a benchmark built on Francois Chollet’s definition of intelligence as “the efficiency at which you learn new things.” Unlike most evals that …

Episode 47: The Great Pacific Garbage Patch of Code Slop with Joe Reis
1:19:12
What if the cost of writing code dropped to zero — but the cost of understanding it skyrocketed? In this episode, Hugo sits down with Joe Reis to unpack how AI tooling is reshaping the software development lifecycle — from experimentation and prototyping to deployment, maintainability, and everything in between. Joe is the co-author of Fundamentals…

Episode 46: Software Composition Is the New Vibe Coding
1:08:57
What if building software felt more like composing than coding? In this episode, Hugo and Greg explore how LLMs are reshaping the way we think about software development—from deterministic programming to a more flexible, prompt-driven, and collaborative style of building. It’s not just hype or grift—it’s a real shift in how we express intent, reaso…

Episode 45: Your AI application is broken. Here’s what to do about it.
1:17:30
Too many teams are building AI applications without truly understanding why their models fail. Instead of jumping straight to LLM evaluations, dashboards, or vibe checks, how do you actually fix a broken AI app? In this episode, Hugo speaks with Hamel Husain, longtime ML engineer, open-source contributor, and consultant, about why debugging generat…

Episode 44: The Future of AI Coding Assistants: Who’s Really in Control?
1:34:11
AI coding assistants are reshaping how developers write, debug, and maintain code—but who’s really in control? In this episode, Hugo speaks with Tyler Dunn, CEO and co-founder of Continue, an open-source AI-powered code assistant that gives developers more customization and flexibility in their workflows. We dive into: The trade-of…

Episode 43: Tales from 400+ LLM Deployments: Building Reliable AI Agents in Production
1:01:03
Hugo speaks with Alex Strick van Linschoten, Machine Learning Engineer at ZenML and creator of a comprehensive LLMOps database documenting over 400 deployments. Alex's extensive research into real-world LLM implementations gives him unique insight into what actually works—and what doesn't—when deploying AI agents in production. In this episode, we …

Episode 42: Learning, Teaching, and Building in the Age of AI
1:20:03
In this episode of Vanishing Gradients, the tables turn as Hugo sits down with Alex Andorra, host of Learning Bayesian Statistics. Hugo shares his journey from mathematics to AI, reflecting on how Bayesian inference shapes his approach to data science, teaching, and building AI-powered applications. They dive into the realities of deploying LLM app…

Episode 41: Beyond Prompt Engineering: Can AI Learn to Set Its Own Goals?
43:51
Hugo Bowne-Anderson hosts a panel discussion from the MLOps World and Generative AI Summit in Austin, exploring the long-term growth of AI by distinguishing real problem-solving from trend-based solutions. If you're navigating the evolving landscape of generative AI, productionizing models, or questioning the hype, this episode dives into the tough…

Episode 40: What Every LLM Developer Needs to Know About GPUs
1:43:34
Hugo speaks with Charles Frye, Developer Advocate at Modal and someone who really knows GPUs inside and out. If you’re a data scientist, machine learning engineer, AI researcher, or just someone trying to make sense of hardware for LLMs and AI workflows, this episode is for you. Charles and Hugo dive into the practical side of GPUs—from running inf…

Episode 39: From Models to Products: Bridging Research and Practice in Generative AI at Google Labs
1:43:28
Hugo speaks with Ravin Kumar, Senior Research Data Scientist at Google Labs. Ravin’s career has taken him from building rockets at SpaceX to driving data science and technology at Sweetgreen, and now to advancing generative AI research and applications at Google Labs and DeepMind. His multidisciplinary experience gives him a rare perspective on bui…

Episode 38: The Art of Freelance AI Consulting and Products: Data, Dollars, and Deliverables
1:23:47
Hugo speaks with Jason Liu, an independent AI consultant with experience at Meta and Stitch Fix. At Stitch Fix, Jason developed impactful AI systems, like a $50 million product similarity search and the widely adopted Flight recommendation framework. Now, he helps startups and enterprises design and deploy production-level AI applications, with a f…

Episode 37: Prompt Engineering, Security in Generative AI, and the Future of AI Research Part 2
50:36
Hugo speaks with three leading figures from the world of AI research: Sander Schulhoff, a recent University of Maryland graduate and lead contributor to the Learn Prompting initiative; Philip Resnik, professor at the University of Maryland, known for his pioneering work in computational linguistics; and Dennis Peskoff, a researcher from Princeton s…

Episode 36: Prompt Engineering, Security in Generative AI, and the Future of AI Research Part 1
1:03:46
Hugo speaks with three leading figures from the world of AI research: Sander Schulhoff, a recent University of Maryland graduate and lead contributor to the Learn Prompting initiative; Philip Resnik, professor at the University of Maryland, known for his pioneering work in computational linguistics; and Dennis Peskoff, a researcher from Princeton s…

Episode 35: Open Science at NASA -- Measuring Impact and the Future of AI
58:13
Hugo speaks with Dr. Chelle Gentemann, Open Science Program Scientist for NASA’s Office of the Chief Science Data Officer, about NASA’s ambitious efforts to integrate AI across the research lifecycle. In this episode, we’ll dive deeper into how AI is transforming NASA’s approach to science, making data more accessible and advancing open science pra…

Episode 34: The AI Revolution Will Not Be Monopolized
1:42:51
Hugo speaks with Ines Montani and Matthew Honnibal, the creators of spaCy and founders of Explosion AI. Collectively, they've had a huge impact on the fields of industrial natural language processing (NLP), ML, and AI through their widely-used open-source library spaCy and their innovative annotation tool Prodigy. These tools have become essential …

Episode 33: What We Learned Teaching LLMs to 1,000s of Data Scientists
1:25:10
Hugo speaks with Dan Becker and Hamel Husain, two veterans in the world of data science, machine learning, and AI education. Collectively, they’ve worked at Google, DataRobot, Airbnb, and GitHub (where Hamel built out the precursor to Copilot and more), and they both currently work as independent LLM and Generative AI consultants. Dan and Hamel recently…

Episode 32: Building Reliable and Robust ML/AI Pipelines
1:15:10
Hugo speaks with Shreya Shankar, a researcher at UC Berkeley focusing on data management systems with a human-centered approach. Shreya's work is at the cutting edge of human-computer interaction (HCI) and AI, particularly in the realm of large language models (LLMs). Her impressive background includes being the first ML engineer at Viaduct, doing …

Episode 31: Rethinking Data Science, Machine Learning, and AI
1:36:04
Hugo speaks with Vincent Warmerdam, a senior data professional and machine learning engineer at :probabl, the exclusive brand operator of scikit-learn. Vincent is known for challenging common assumptions and exploring innovative approaches in data science and machine learning. In this episode, they dive deep into rethinking established methods in d…

Episode 30: Lessons from a Year of Building with LLMs (Part 2)
1:15:23
Hugo speaks about Lessons Learned from a Year of Building with LLMs with Eugene Yan from Amazon, Bryan Bischof from Hex, Charles Frye from Modal, Hamel Husain from Parlance Labs, and Shreya Shankar from UC Berkeley. These five guests, along with Jason Liu who couldn't join us, have spent the past year building real-world applications with Large Lan…

Episode 29: Lessons from a Year of Building with LLMs (Part 1)
1:30:21
Hugo speaks about Lessons Learned from a Year of Building with LLMs with Eugene Yan from Amazon, Bryan Bischof from Hex, Charles Frye from Modal, Hamel Husain from Parlance Labs, and Shreya Shankar from UC Berkeley. These five guests, along with Jason Liu who couldn't join us, have spent the past year building real-world applications with Large Lan…

Episode 28: Beyond Supervised Learning: The Rise of In-Context Learning with LLMs
1:05:38
Hugo speaks with Alan Nichol, co-founder and CTO of Rasa, where they build software to enable developers to create enterprise-grade conversational AI and chatbot systems across industries like telcos, healthcare, fintech, and government. What's super cool is that Alan and the Rasa team have been doing this type of thing for over a decade, giving th…

Episode 27: How to Build Terrible AI Systems
1:32:24
Hugo speaks with Jason Liu, an independent consultant who uses his expertise in recommendation systems to help fast-growing startups build out their RAG applications. He was previously at Meta and Stitch Fix, and is also the creator of Instructor and Flight, as well as an ML and data science educator. They talk about how Jason approaches consulting companies acro…

Episode 26: Developing and Training LLMs From Scratch
1:51:35
Hugo speaks with Sebastian Raschka, a machine learning & AI researcher, programmer, and author. As Staff Research Engineer at Lightning AI, he focuses on the intersection of AI research, software development, and large language models (LLMs). How do you build LLMs? How can you use them, both in prototype and production settings? What are the buildi…

Episode 25: Fully Reproducible ML & AI Workflows
1:20:38
Hugo speaks with Omoju Miller, a machine learning guru and founder and CEO of Fimio, where she is building 21st century dev tooling. In the past, she was Technical Advisor to the CEO at GitHub, spent time co-leading non-profit investment in Computer Science Education for Google, and served as a volunteer advisor to the Obama administration’s White …

Episode 24: LLM and GenAI Accessibility
1:30:03
Hugo speaks with Johno Whitaker, a Data Scientist/AI Researcher doing R&D with answer.ai. His current focus is on generative AI, flitting between different modalities. He also likes teaching and making courses, having worked with both Hugging Face and fast.ai in these capacities. Johno recently reminded Hugo how hard everything was 10 years ago: “W…

Episode 23: Statistical and Algorithmic Thinking in the AI Age
1:20:37
Hugo speaks with Allen Downey, a curriculum designer at Brilliant, Professor Emeritus at Olin College, and the author of Think Python, Think Bayes, Think Stats, and other computer science and data science books. In 2019-20 he was a Visiting Professor at Harvard University. He previously taught at Wellesley College and Colby College and was a Visiti…

Episode 22: LLMs, OpenAI, and the Existential Crisis for Machine Learning Engineering
1:20:07
Jeremy Howard (fast.ai), Shreya Shankar (UC Berkeley), and Hamel Husain (Parlance Labs) join Hugo Bowne-Anderson to talk about how LLMs and OpenAI are changing the worlds of data science, machine learning, and machine learning engineering. Jeremy Howard is co-founder of fast.ai, an ex-Chief Scientist at Kaggle, and creator of the ULMFiT approach on…

Episode 21: Deploying LLMs in Production: Lessons Learned
1:08:11
Hugo speaks with Hamel Husain, a machine learning engineer who loves building machine learning infrastructure and tools 👷. Hamel leads and contributes to many popular open-source machine learning projects. He also has extensive experience (20+ years) as a machine learning engineer across various industries, including large tech companies like Airbn…

Episode 20: Data Science: Past, Present, and Future
1:26:39
Hugo speaks with Chris Wiggins (Columbia, NYTimes) and Matthew Jones (Princeton) about their recent book How Data Happened, and the Columbia course it expands upon, data: past, present, and future. Chris is an associate professor of applied mathematics at Columbia University and the New York Times’ chief data scientist, and Matthew is a professor o…

Episode 19: Privacy and Security in Data Science and Machine Learning
1:23:19
Hugo speaks with Katharine Jarmul about privacy and security in data science and machine learning. Katharine is a Principal Data Scientist at Thoughtworks Germany focusing on privacy, ethics, and security for data science workflows. Previously, she has held numerous roles at large companies and startups in the US and Germany, implementing data proc…

Episode 18: Research Data Science in Biotech
1:12:42
Hugo speaks with Eric Ma about Research Data Science in Biotech. Eric leads the Research team in the Data Science and Artificial Intelligence group at Moderna Therapeutics. Prior to that, he was part of a special ops data science team at the Novartis Institutes for Biomedical Research's Informatics department. In this episode, Hugo and Eric talk ab…

Episode 17: End-to-End Data Science
1:16:04
Hugo speaks with Tanya Cashorali, a data scientist and consultant who helps businesses get the most out of data, about what end-to-end data science looks like across many industries, such as retail, defense, biotech, and sports, including scoping out projects, figuring out the correct questions to ask, how projects can change, delivering on the pr…

Episode 16: Data Science and Decision Making Under Uncertainty
1:23:15
Hugo speaks with JD Long, agricultural economist, quant, and stochastic modeler, about decision making under uncertainty and how we can use our knowledge of risk, uncertainty, probabilistic thinking, causal inference, and more to help us use data science and machine learning to make better decisions in an uncertain world. This is part 2 of a two pa…

Episode 15: Uncertainty, Risk, and Simulation in Data Science
53:30
Hugo speaks with JD Long, agricultural economist, quant, and stochastic modeler, about decision making under uncertainty and how we can use our knowledge of risk, uncertainty, probabilistic thinking, causal inference, and more to help us use data science and machine learning to make better decisions in an uncertain world. This is part 1 of a two pa…

Episode 14: Decision Science, MLOps, and Machine Learning Everywhere
1:09:01
Hugo Bowne-Anderson, host of Vanishing Gradients, reads 3 audio essays about decision science, MLOps, and what happens when machine learning models are everywhere.
Links: Our upcoming Vanishing Gradients live recording of Data Science and Decision Making Under Uncertainty with Hugo and JD Long! Decision-Making in a Time of Crisis by Hugo Bowne-Ander…

Episode 13: The Data Science Skills Gap, Economics, and Public Health
1:22:41
Hugo speaks with Norma Padron about data science education and continuous learning for people working in healthcare, broadly construed, along with how we can think about the democratization of data science skills more generally. Norma is CEO of EmpiricaLab, where her team's mission is to bridge work and training and empower healthcare teams to focus…

Episode 12: Data Science for Social Media: Twitter and Reddit
1:32:45
Hugo speaks with Katie Bauer about her time working in data science at both Twitter and Reddit. At the time of recording, Katie was a data science manager at Twitter and prior to that, a founding member of the data team at Reddit. She’s now Head of Data Science at Gloss Genius so congrats on the new job, Katie! In this conversation, we dive into wha…

Episode 11: Data Science: The Great Stagnation
1:45:38
Hugo speaks with Mark Saroufim, an Applied AI Engineer at Meta who works on PyTorch where his team’s main focus is making it as easy as possible for people to deploy PyTorch in production outside Meta. Mark first came on our radar with an essay he wrote called Machine Learning: the Great Stagnation, which was concerned with the stagnation in machin…

Episode 10: Investing in Machine Learning
1:26:33
Hugo speaks with Sarah Catanzaro, General Partner at Amplify Partners, about investing in data science and machine learning tooling and where we see progress happening in the space. Sarah invests in the tools that we both wish we had earlier in our careers: tools that enable data scientists and machine learners to collect, store, manage, analyze, a…