#202 - Qwen-32B, Anthropic's $3.5 billion, LLM Cognitive Behaviors
Our 202nd episode with a summary and discussion of last week's big AI news!
Recorded on 03/07/2025
Hosted by Andrey Kurenkov and Jeremie Harris.
Feel free to email us your questions and feedback at [email protected] and/or [email protected]
Check out our text newsletter and comment on the podcast at https://lastweekin.ai/.
Join our Discord here! https://discord.gg/nTyezGSKwP
In this episode:
- Alibaba released QwQ-32B, its latest reasoning model, on par with leading models like DeepSeek's R1.
- Anthropic raised $3.5 billion in a funding round, valuing the company at $61.5 billion and solidifying its position as a key competitor to OpenAI.
- DeepMind introduced BIG-Bench Extra Hard, a more challenging benchmark for evaluating the reasoning capabilities of large language models.
- Reinforcement learning pioneers Andrew Barto and Rich Sutton were awarded the prestigious Turing Award for their contributions to the field.
 
Timestamps + Links:
- (00:00:00) Intro / Banter
 - (00:01:41) Episode Preview
 - (00:02:50) GPT-4.5 Discussion
 
- (00:14:13) Alibaba’s New QwQ 32B Model is as Good as DeepSeek-R1; Outperforms OpenAI’s o1-mini
 - (00:21:29) With Alexa Plus, Amazon finally reinvents its best product
 - (00:26:08) Another DeepSeek moment? General AI agent Manus shows ability to handle complex tasks
 - (00:29:14) Microsoft’s new Dragon Copilot is an AI assistant for healthcare
 - (00:32:24) Mistral’s new OCR API turns any PDF document into an AI-ready Markdown file
 
- (00:33:19) A.I. Start-Up Anthropic Closes Deal That Values It at $61.5 Billion
 - (00:35:49) Nvidia-Backed CoreWeave Files for IPO, Shows Growing Revenue
 - (00:38:05) Waymo and Uber's Austin robotaxi expansion begins today
 - (00:38:54) UK competition watchdog drops Microsoft-OpenAI probe
 - (00:41:17) Scale AI announces multimillion-dollar defense deal, a major step in U.S. military automation
 
- (00:44:43) DeepSeek Open Source Week: A Complete Summary
 - (00:45:25) DeepSeek AI Releases DualPipe: A Bidirectional Pipeline Parallelism Algorithm for Computation-Communication Overlap in V3/R1 Training
 - (00:53:00) Physical Intelligence open-sources Pi0 robotics foundation model
 - (00:54:23) BIG-Bench Extra Hard
 
- (00:56:10) Cognitive Behaviors that Enable Self-Improving Reasoners
 - (01:01:49) The MASK Benchmark: Disentangling Honesty From Accuracy in AI Systems
 - (01:05:32) Pioneers of Reinforcement Learning Win the Turing Award
 - (01:06:56) OpenAI launches $50M grant program to help fund academic research
 
- (01:07:25) The Nuclear-Level Risk of Superintelligent AI
 - (01:13:34) METR’s GPT-4.5 pre-deployment evaluations
 - (01:17:16) Chinese buyers are getting Nvidia Blackwell chips despite US export controls
 
See Privacy Policy at https://art19.com/privacy and California Privacy Notice at https://art19.com/privacy#do-not-sell-my-info.