Content provided by Google Developer Podcasts. All podcast content, including episodes, graphics, and podcast descriptions, is uploaded and provided directly by Google Developer Podcasts or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here: https://podcastplayer.com/legal.

Conversation with Bibo Xu: How agent conversations are evolving with Google AI

57:42
 

Bibo Xu is a Product Manager at Google DeepMind and leads Gemini's multimodal modeling. This video dives into Google AI's journey from basic voice commands to advanced dialogue systems that comprehend not just what is said, but also tone, emotion, and visual context. Check out this conversation to gain a deeper understanding of the challenges and opportunities in integrating diverse AI capabilities when creating universal assistants.

Resources:

Chapters:
0:00 - Intro
1:43 - Introducing Bibo Xu
2:40 - Bibo's Journey: From business school to voice AI
3:59 - The genesis of Google Assistant and Google Home
6:50 - Milestones in speech recognition technology
13:30 - Shifting from command-based AI to natural dialogue
19:00 - The power of multimodal AI for human interaction
21:20 - Real-time multilingual translation with LLMs
25:20 - Project Astra: Building a universal assistant
28:40 - Developer challenges in multimodal AI integration
29:50 - Unpacking the "can't see" debugging story
35:10 - The importance of low latency and interruption
38:30 - Seamless dialogue and background noise filtering
40:00 - Redefining human-computer interaction
41:00 - Ethical considerations for humanlike AI
44:00 - Responding to user emotions and frustration
45:50 - Politeness and expectations in AI conversations
49:10 - AI as a catalyst for research and automation
52:00 - The future of AI assistants and tool use
52:40 - AI interacting with interfaces
54:50 - Transforming the future of work and communication
55:19 - AI for enhanced writing and idea generation
57:13 - Conclusion and future outlook for AI development

Subscribe to Google for Developers → https://goo.gle/developers

Speakers: Bibo Xu, Christina Warren, Ashley Oldacre

Products Mentioned: Google AI, Gemini, Generative AI, Android, Google Home, Google Voice, Project Astra, Gemini Live, Google DeepMind

39 episodes
