Search a title or topic

Over 20 million podcasts, powered by 

Player FM logo
Artwork

Content provided by Prateek Joshi. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Prateek Joshi or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://podcastplayer.com/legal.
Player FM - Podcast App
Go offline with the Player FM app!

Diffusion LLMs - The Fastest LLMs Ever Built | Stefano Ermon, cofounder of Inception Labs

39:09
 
Share
 

Manage episode 512580470 series 3370867
Content provided by Prateek Joshi. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Prateek Joshi or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://podcastplayer.com/legal.

Stefano Ermon is the cofounder of Inception Labs and an associate professor at Stanford. Inception is developing a new type of AI models called Diffusion LLMs.
Stefano's favorite book: If on a Winter's Night a Traveler (Author: Italo Calvino)
(00:01) Introduction
(00:38) What are autoregressive LLMs and how do they work
(02:28) How diffusion LLMs rethink generation
(04:02) The ceiling of autoregressive LLMs: cost, latency, reliability
(06:19) Why diffusion LLMs are commercially viable now
(09:12) Parallel refinement: how diffusion models generate text
(12:05) Understanding diffusion steps and efficiency
(13:49) Hardest engineering challenges at Inception
(15:23) From research to production: the power of data
(16:24) Where diffusion LLMs still lag behind
(18:18) Evaluations and benchmarks for diffusion LLMs
(20:20) Developer experience and OpenAI-compatible API
(21:47) Economics and GPU efficiency
(23:38) Hardware and runtime stack
(24:58) Competition and the evolving diffusion LLM landscape
(27:01) Where diffusion will win first — coding and agentic systems
(30:13) How diffusion changes infra, serving, and hardware design
(33:04) What’s next at Inception: reasoning and multimodality
(35:20) Rapid Fire Round
--------
Where to find Stefano Ermon:
LinkedIn: https://www.linkedin.com/in/ermon/
--------
Where to find Prateek Joshi:

Research column: https://www.infrastartups.com
Newsletter: https://prateekjoshi.substack.com
Website: https://prateekj.com
LinkedIn: https://www.linkedin.com/in/prateek-joshi-infinite
X: https://x.com/prateekvjoshi

  continue reading

188 episodes

Artwork
iconShare
 
Manage episode 512580470 series 3370867
Content provided by Prateek Joshi. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Prateek Joshi or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://podcastplayer.com/legal.

Stefano Ermon is the cofounder of Inception Labs and an associate professor at Stanford. Inception is developing a new type of AI models called Diffusion LLMs.
Stefano's favorite book: If on a Winter's Night a Traveler (Author: Italo Calvino)
(00:01) Introduction
(00:38) What are autoregressive LLMs and how do they work
(02:28) How diffusion LLMs rethink generation
(04:02) The ceiling of autoregressive LLMs: cost, latency, reliability
(06:19) Why diffusion LLMs are commercially viable now
(09:12) Parallel refinement: how diffusion models generate text
(12:05) Understanding diffusion steps and efficiency
(13:49) Hardest engineering challenges at Inception
(15:23) From research to production: the power of data
(16:24) Where diffusion LLMs still lag behind
(18:18) Evaluations and benchmarks for diffusion LLMs
(20:20) Developer experience and OpenAI-compatible API
(21:47) Economics and GPU efficiency
(23:38) Hardware and runtime stack
(24:58) Competition and the evolving diffusion LLM landscape
(27:01) Where diffusion will win first — coding and agentic systems
(30:13) How diffusion changes infra, serving, and hardware design
(33:04) What’s next at Inception: reasoning and multimodality
(35:20) Rapid Fire Round
--------
Where to find Stefano Ermon:
LinkedIn: https://www.linkedin.com/in/ermon/
--------
Where to find Prateek Joshi:

Research column: https://www.infrastartups.com
Newsletter: https://prateekjoshi.substack.com
Website: https://prateekj.com
LinkedIn: https://www.linkedin.com/in/prateek-joshi-infinite
X: https://x.com/prateekvjoshi

  continue reading

188 episodes

All episodes

×
 
Loading …

Welcome to Player FM!

Player FM is scanning the web for high-quality podcasts for you to enjoy right now. It's the best podcast app and works on Android, iPhone, and the web. Signup to sync subscriptions across devices.

 

Copyright 2025 | Privacy Policy | Terms of Service | | Copyright
Listen to this show while you explore
Play