Is GPT-5.1 Really an Upgrade? But Models Can Auto-Hack Govts, so … there’s that
A lot was released in the last 36 hours, and it will all affect hundreds of millions of people. Here are 10 details you would miss if you only read the headlines, from GPT 5.1 regressions, to how Claude hacked government agencies, to SIMA 2 and musical Turing tests.
https://assemblyai.com/aiexplained
Chapters:
00:00 - Introduction
00:56 - GPT 5.1 Smarter?
01:47 - Some Regressions
03:22 - Sycophancy?
05:22 - Claude Auto-Hacking
06:16 - Jailbreaking through Granularity
08:22 - This Will be Re-used
09:30 - Hallucinating Hacker
09:57 - Surprisingly Neutral Tone
12:18 - SIMA 2
14:10 - Alpha Parallels
17:24 - AI Music
GPT 5.1 Announcement: https://openai.com/index/gpt-5-1/
System Card: https://cdn.openai.com/pdf/4173ec8d-1229-47db-96de-06d87147e07e/5_1_system_card.pdf
Benchmarks: https://openai.com/index/gpt-5-1-for-developers/
Simple Bench: https://lmcouncil.ai/benchmarks
Auto-Hacking: https://x.com/AnthropicAI/status/1989033793190277618
https://www.anthropic.com/news/disrupting-AI-espionage
SIMA 2 Announcement: https://deepmind.google/blog/sima-2-an-agent-that-plays-reasons-and-learns-with-you-in-virtual-3d-worlds/
https://x.com/amoufarek/status/1988986075331858693
Voyager: https://voyager.minedojo.org/
Reuters Music: https://www.reuters.com/legal/litigation/are-you-listening-bots-survey-shows-ai-music-is-virtually-undetectable-2025-11-12/