20250418 - ‘Reasoning’ AI is LYING to you! — or maybe it’s just hallucinating again

Duration: 5:58
The chatbot is definitely trying to kill you, maybe. Send us money.

Text version: https://pivot-to-ai.com/2025/04/18/reasoning-ai-is-lying-to-you-or-maybe-its-just-hallucinating-again/

Sources:

Anthropic: Reasoning models don't always say what they think https://www.anthropic.com/research/reasoning-models-dont-say-think

paper (PDF) https://assets.anthropic.com/m/71876fabef0f0ed4/original/reasoning_models_paper.pdf

Introducing Transluce https://transluce.org/introducing-transluce

Investigating truthfulness in a pre-release o3 model https://transluce.org/investigating-o3-truthfulness

Transluce: "These behaviors are surprising." https://x.com/TransluceAI/status/1912552068717637980

(Ars Technica article, edited) Researchers concerned to find AI models misrepresenting their “reasoning” processes https://arstechnica.com/ai/2025/04/researchers-concerned-to-find-ai-models-hiding-their-true-reasoning-processes/

(Ars Technica article, original) Researchers concerned to find AI models hiding their true “reasoning” processes https://web.archive.org/web/20250410231357/https://arstechnica.com/ai/2025/04/researchers-concerned-to-find-ai-models-hiding-their-true-reasoning-processes/

Copyscape is nice for quickly comparing web pages https://copyscape.com

Previously:

Anthropic, Apollo astounded to find a chatbot will lie to you if you tell it to lie to you https://pivot-to-ai.com/2024/12/19/anthropic-and-apollo-astounded-to-find-that-a-chatbot-will-lie-to-you-if-you-tell-it-to-lie-to-you/

How Sam Altman got fired from OpenAI in 2023: not being an AI doom crank (and lying a lot) https://pivot-to-ai.com/2025/04/06/how-sam-altman-got-fired-from-openai-in-2023-not-being-an-ai-doom-crank-and-lying-a-lot/

video: https://www.youtube.com/watch?v=xlrBjeAtJUk&list=UU9rJrMVgcXTfa8xuMnbhAEA

T-shirt store now open! https://pivot-to-ai.redbubble.com

Enhance the channel: https://www.amazon.co.uk/hz/wishlist/ls/3Q8VZW46J6DM6

Please fund my vital AI safety research! The fate of humanity is at stake!

Patreon: https://www.patreon.com/davidgerard

Ko-Fi: https://ko-fi.com/A1529D5
