#211 - Claude Voice, Flux Kontext, wrong RL research?
Our 211th episode with a summary and discussion of last week's big AI news!
Recorded on 05/31/2025
Hosted by Andrey Kurenkov and Jeremie Harris.
Feel free to email us your questions and feedback at [email protected] and/or [email protected]
Check out our text newsletter and comment on the podcast at https://lastweekin.ai/.
Join our Discord here! https://discord.gg/nTyezGSKwP
In this episode:
- This episode covers significant AI news: startups, new tools, applications, investments in hardware, and research advancements.
- Discussions cover new tools and applications, including Black Forest Labs' Flux Kontext models for image generation and editing and Perplexity's new spreadsheet and dashboard features.
- A notable segment covers OpenAI's partnership with the UAE and proposed legislation that would prevent states from regulating AI for a decade.
- Concerns about model behavior and safety are discussed, including Claude Opus 4's blackmail attempt and Palisade Research's tests showing AI models bypassing shutdown commands.
Timestamps + Links:
- (00:00:10) Intro / Banter
- (00:01:39) News Preview
- (00:02:50) Response to Listener Comments
- Tools & Apps
- (00:07:10) Anthropic launches a voice mode for Claude
- (00:10:35) Black Forest Labs’ Kontext AI models can edit pics as well as generate them
- (00:15:30) Perplexity’s new tool can generate spreadsheets, dashboards, and more
- (00:18:43) xAI to pay Telegram $300M to integrate Grok into the chat app
- (00:22:42) Opera’s new AI browser promises to write code while you sleep
- (00:24:17) Google Photos debuts redesigned editor with new AI tools
- Applications & Business
- (00:25:13) Top Chinese memory maker expected to abandon DDR4 manufacturing at the behest of Beijing
- (00:30:04) Oracle to Buy $40 Billion Worth of Nvidia Chips for First Stargate Data Center
- (00:31:47) UAE makes ChatGPT Plus subscription free for all residents as part of deal with OpenAI
- (00:35:34) NVIDIA Corporation (NVDA) to Launch Cheaper Blackwell AI Chip for China, Says Report
- (00:38:39) The New York Times and Amazon ink AI licensing deal
- Projects & Open Source
- (00:41:11) DeepSeek’s distilled new R1 AI model can run on a single GPU
- (00:45:19) Google Unveils SignGemma, an AI Model That Can Translate Sign Language Into Spoken Text
- (00:47:08) Open-sourcing circuit tracing tools
- (00:49:42) Hugging Face unveils two new humanoid robots
- Research & Advancements
- (00:52:33) Pangu Pro MoE: Mixture of Grouped Experts for Efficient Sparsity
- (00:58:55) DataRater: Meta-Learned Dataset Curation
- (01:05:05) Incorrect Baseline Evaluations Call into Question Recent LLM-RL Claims
- (01:10:17) Maximizing Confidence Alone Improves Reasoning
- (01:11:00) Guided by Gut: Efficient Test-Time Scaling with Reinforced Intrinsic Confidence
- (01:11:44) One RL to See Them All
- (01:15:05) Efficient Reinforcement Finetuning via Adaptive Curriculum Learning
- Policy & Safety
- (01:17:58) Trump's 'Big Beautiful Bill' could ban states from regulating AI for a decade
- (01:24:31) Researchers claim ChatGPT o3 bypassed shutdown in controlled test
- (01:30:10) Anthropic’s new AI model turns to blackmail when engineers try to take it offline
- (01:31:09) Anthropic Faces Backlash As Claude 4 Opus Can Autonomously Alert Authorities
- (01:35:37) Claude helps users make bioweapons
- (01:35:49) The Claude 4 System Card is a Wild Read