Covering the biggest news of the century - the arrival of smarter-than-human AI. From the author of Simple Bench, which reveals the remaining gap between LLM and human reasoning. Hype-free, and the British accent is a freebie bonus.
Philip, host of AI Explained YT
Podcasts
Bubble or No Bubble, AI Keeps Progressing (ft. Relentless Learning + Introspection)
12:53
Don’t let headlines about bubbles distract you from the real avenues of progress being explored in AI every week, including what had been thought to be a long-term blocker - continual learning (learning on the fly). https://app.grayswan.ai/ai-explained This, plus models introspecting (hesitate before you berate), Nano Banana 2 possibly spotted, Chi…
Sora 2 - It will only get more realistic from here
15:43
Sora 2 - the start of the infinite slop-feed or a key step to a generalist agent? Better than Veo 3 or over-hyped? I bring out 6 details you may have missed, contrast the announcement to Periodic Labs and even squeeze in some Claude Sonnet 4.5 analysis. Maybe I should make my videos longer… https://80000hours.org/aiexplained AI Insiders ($9!): http…
OpenAI Tests if GPT-5 Can Automate Your Job - 4 Unexpected Findings
14:06
An OpenAI report released in the last 24 hours is the best look we have as to whether 2025 AI can automate your job. I’ll go through 4 unexpected findings, from which model is best at what, to practical tips and massive caveats. Plus UFC robots, radiologist essay, don’t trust videos and the blockers to the singularity. Gray Swan: https://app.graysw…
ChatGPT Will Guess your Age, Flirt if Asked, and Can Call the Cops
11:31
Sam Altman, CEO of OpenAI, announced a set of new ‘protections’ and ‘privileges’ for ChatGPT users, requiring a significant amount of trust from users. From predicting your age based on your chat to calling law enforcement if you are at risk of harm, to allowing non-minors to flirt. But amidst all of these announcements, there are interview snippet…
An ‘AI Bubble’? What Altman Actually said, the Facts and Nano Banana
18:54
Wait, why did Sam Altman say AI was in a bubble? Or did he? Is it? 8 points for you to consider, before we all get distracted by Nano Banana. Chapters: 00:00 - Introduction 01:14 - Sam Altman Clarification 02:30 - Media Calls a Bubble (for the tenth time) 03:40 - MIT and McKinsey Analysed 08:21 - Incremental Progress Deceptive 12:07 - Reasoning Bre…
GPT-5 will change how hundreds of millions of people use AI. Yes, you might have to forgive the chart crimes, the underwhelming livestream and Altman hype… But it’s a good model. I have read the 50 page system card in full, have the benchmark scores, coding tests, and things you might have missed. https://app.grayswan.ai/ai-explained Announcement: …
Genie 3: The World Becomes Playable (DeepMind)
11:54
Soon, anything will be playable. A photo becomes an interactive world, a selfie becomes a new game. Genie 3 from Google, debuting just 2 hours ago, is what I mean, and I have the full analysis, plus the pushback I gave the authors (will it really lead to reliable AI agents? Is that even the point?). You make your own mind up, but it’s certainly fas…
How Not to Read a Headline on AI (ft. new Olympiad Gold, GPT-5 …)
17:19
GPT-5 did what? OpenAI ahead of Google? There are 9 ways to misread the headlines of the last 48 hours, so this video is here to tell you what happened, sans sizzle. It’s been a fairly momentous last few days, so let’s dive in to the International Math Olympiad Gold, GPT-5 alpha release, whether mathematicians are out of jobs, and the white collar …
Grok 4 is here, but did you know these 10 things about the new model? From benchmark caveats to soloing science, $300 a month secrets to Grok 5 promises, here are 10 new things to know in just under 12 minutes. AI Insiders ($9!): https://www.patreon.com/AIExplained Chapters: 00:00 - Introduction 00:22 - Benchmark Results 02:11 - Benchmark Caveats 02:…
When Will AI Models Blackmail You, and Why?
26:19
In the last few days Anthropic have released an impressively honest account of how all models blackmail, no matter what goal they have, and despite prompt warnings and other preventions. But do these models *want* this? Thanks to Storyblocks for sponsoring this video! Download unlimited stock media at one set price with Storyblocks: storyblocks.com/…
Apple’s ‘AI Can’t Reason’ Claim Seen By 13M+, What You Need to Know
14:00
What to make of those headlines that AI can’t reason, seen by tens of millions? I cover the paper in layman’s terms, what it means and doesn’t mean, and what’s next. Thanks to Storyblocks for sponsoring this video! Download unlimited stock media at one set price with Storyblocks: https://storyblocks.com/AIExplained Plus o3-pro and whether it is my …
AI Accelerates: New Gemini Model + AI Unemployment Stories Analysed
16:41
There’s a new best language model, so let’s go through the ups and downs of Gemini 2.5 Pro 06-05. Record-breaking common-sense, but dumb mistakes remain. And it’s not even their best model, which remains behind the scenes - Gemini 2.5 Ultra. Plus Sundar Pichai’s AGI date and an analysis of whether the current AI unemployment headlines are justified,…
Claude 4: Full 120 Page Breakdown … Is it the Best New Model?
19:04
Not only did I get early access and run my own tests, as per the title I read both the 120 page Claude 4 Opus and Claude 4 Sonnet System Card, and 25 page report on ASL-3 being triggered, plus the 2 hour launch video, and surrounding coverage. Ft. coding tests, Simple, Twitter controversies, deep alignment coverage, spiritual bliss and much more! h…
Google Takes No Prisoners Amid Torrent of AI Announcements
17:07
Google just announced at least 12 things that are each worthy of a video, but here are the top I/O highlights. From Veo 3 to Deep Research now being useable, Deep Think breaking records to Gemini Diffusion, Gemini 2.5 Flash changing how AI is priced and GemmaVerse, SynthID Detector and Imagen 4. And even this intro is missing other announcements co…
AlphaEvolve is not the first system to exhibit self-improvement, but it may be the most impressive yet. AI is literally improving the hardware, architectures, data and training methods of AI itself. A deep dive into the paper, drawing on two previous interviews and 5 other papers. Plus a snippet on OpenAI’s new Codex system. Gray Swan: http://app.g…
o3 breaks (some) records, but AI becomes pay-to-win
14:33
A green card, o3 vs Gemini 2.5, 6 Benchmarks and a whole bunch of my thoughts on what on earth is happening in AI, from here to 2030. Plus, how AI is becoming pay-to-win, and why. Crazy times, 14 mins probably wasn’t enough. https://app.grayswan.ai/ai-explained AI Insiders ($9!): https://www.patreon.com/AIExplained Chapters: 00:00 - Introduction 00…
o3 and o4-mini - they’re great, but easy to over-hype
14:24
Critical analysis of the two most powerful new models behind ChatGPT, o3 and o4-mini. Not just the system cards, benchmarks, and my own tests, but some you may not have seen before. Yes, they can whip up amazing front-end in a few seconds, but you always have to ask what is in their data. Either way, they prove the gains from RL are just beginning……
‘Speaking Dolphin’ to AI Data Dominance, 4.1 + Kling 2: 7 Developments Critically Analysed
20:09
This pod won’t just be about the release of GPT 4.1 in the last 48 hours, o3 build-up, Kling 2.0, a sneak peek at the next OpenAI model, or even the new Dolphin language tool. It will be about 7 such stories that contextualise where we are in AI and what is happening. https://www.emergentmind.com/ Chapters: 00:00 - Introduction 00:30 - Kling 2.0 01…
AI CEO: ‘Stock Crash Could Stop AI Progress’, Llama 4 Anti-climax +‘Superintelligence in 2027’...
23:51
The latest on Llama 4, and whether it signals a slowdown in AI, or solid progress. Plus, a deep dive on that viral prediction of superintelligence by 2027, and Amodei’s cautionary words on what could stop AI progress in its tracks. o3 news, and more, as well. Weights & Biases: https://weave-docs.wandb.ai/?utm_source=sponsorship&utm_medium=simple_be…
Gemini 2.5 Pro - It’s a Smart Chatbot … (New Simple High Score)
21:21
Gemini gets a new record on Simple Bench, and several other benchmarks. I’ll go deep to explore its nuances, including how it deceptively reverse engineers answers, does better on certain coding benchmarks than others, may have a universal ‘conceptual language’ … https://weave-docs.wandb.ai/?utm_source=sponsorship&utm_medium=simple_bench&utm_campai…
Did AI Just Get Commoditized? Gemini 2.5, New DeepSeek V3, & Microsoft vs OpenAI
13:47
Gemini 2.5 is out, on the same day as the new DeepSeek V3 (which should power DeepSeek R2). Do both models prove AI is being commoditized? Let’s find out, on this blockbuster day of AI releases. Plus exclusives from the Information, Simple indications, Vista Bench, LM Arena and more… AI Insiders ($9!): https://www.patreon.com/AIExplained Chapters: …
Manus AI - The Calm Before the Hypestorm … (vs Deep Research + Grok 3)
12:58
Is Manus AI the memecoin of the AI world, or legit? I’ll compare it to OpenAI’s Deep Research, Operator, Grok 3 DeepSearch and more to find out. I’ll also let you in on some of the secrets of what makes a good hype campaign, the estimated costs of Manus AI, and where it is strong. Other news (yes, Gemini image editing and research hacking, I mean y…
GPT 4.5 is here, and do you remember when AI lab CEOs like Sam Altman and Dario Amodei were betting everything on scaling up base models like this one? Well let’s find out what would have happened if the future of AI rested on models like GPT 4.5. You’ll see all the benchmarks, highlights of the paper, emotional intelligence and humor tests, Simple…
Claude 3.7 is More Significant than its Name Implies (ft DeepSeek R2 + GPT 4.5 coming soon)
27:39
Claude 3.7 is here, hot on the heels of Grok 3 and a host of other developments, but how good is it really? And what does it say about the next few months in AI? I’ve read the papers, played with the model for hours, and benched it on Simple. Things aren’t slowing down. Plus the latest in humanoid robots, led by Helix and freaked out by Protoclone.…
AGI: (gets close), Humans: ‘Who Gets the Money?’
22:17
A 'frontier reasoning model' from just 1000 examples (s1). A $100B Musk bid for power. Gemini 2, Rand and a warning from Amodei. Here are 7-8 developments you may have missed but which I would argue help us understand how the next few years will play out. From labour vs capital to automating rival companies and countries, and from non-profit shenanigan…