Content provided by Lukas Biewald. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Lukas Biewald or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://podcastplayer.com/legal.

The CEO Behind the Fastest-Growing AI Inference Company | Tuhin Srivastava

59:13
 

In this episode of Gradient Dissent, Lukas Biewald talks with Tuhin Srivastava, CEO and founder of Baseten, one of the fastest-growing companies in the AI inference ecosystem. Tuhin shares the real story behind Baseten’s rise and how the market finally aligned with the infrastructure they’d spent years building.

They get into the core challenges of modern inference, including why dedicated deployments matter, how runtime and infrastructure bottlenecks stack up, and what makes serving large models fundamentally different from smaller ones.

Tuhin also explains how vLLM, TensorRT-LLM, and SGLang differ in practice, what it takes to tune workloads for new chips like the B200, and why reliability becomes harder as systems scale.

The conversation dives into company-building, from killing product lines to avoiding premature scaling while navigating a market that shifts every few weeks.

Connect with us here:

Tuhin Srivastava: https://www.linkedin.com/in/tuhin-srivastava/

Lukas Biewald: https://www.linkedin.com/in/lbiewald/

Weights & Biases: https://www.linkedin.com/company/wandb/

131 episodes
