The Illusion of Thinking
This Apple paper (https://ml-site.cdn-apple.com/papers/the-illusion-of-thinking.pdf) examines the reasoning capabilities of Large Reasoning Models (LRMs) against standard Large Language Models (LLMs) by testing both on controlled puzzle environments whose complexity can be scaled precisely. The researchers found that LRM accuracy collapses completely beyond a certain complexity threshold, and, counterintuitively, that reasoning effort (measured in thinking tokens) begins to decline as problems approach that threshold, even when token budget remains. The study identifies three complexity regimes: at low complexity, standard LLMs perform better; at medium complexity, the additional thinking of LRMs gives them an advantage; at high complexity, both collapse. Analysis of the intermediate "thinking" traces shows that LRMs can "overthink" simple tasks, continuing to explore incorrect alternatives after finding the right answer, and reason inconsistently across different puzzles. The findings suggest that current LRMs may face fundamental limits in generalizable reasoning and exact computation.
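The puzzle setup is easy to picture with a concrete example. Below is a minimal sketch (my own illustration, not the paper's actual evaluation harness) of the kind of controlled environment the study describes: Tower of Hanoi, one of the puzzles the paper uses, where difficulty scales cleanly with the number of disks and any proposed move sequence can be verified exactly, which is what lets the researchers score intermediate reasoning steps rather than only final answers.

```python
# Sketch of a controlled puzzle environment in the spirit of the paper:
# Tower of Hanoi, with complexity scaled by disk count and deterministic
# verification of any candidate move sequence.

def optimal_solution(n, src=0, aux=1, dst=2):
    """Return the optimal move list for n disks (2**n - 1 moves)."""
    if n == 0:
        return []
    return (optimal_solution(n - 1, src, dst, aux)
            + [(src, dst)]
            + optimal_solution(n - 1, aux, src, dst))

def is_valid_solution(n, moves):
    """Simulate a move sequence and report whether it solves the puzzle."""
    pegs = [list(range(n, 0, -1)), [], []]    # disk n (largest) at the bottom
    for src, dst in moves:
        if not pegs[src]:
            return False                      # illegal: moving from an empty peg
        disk = pegs[src].pop()
        if pegs[dst] and pegs[dst][-1] < disk:
            return False                      # illegal: larger disk onto smaller
        pegs[dst].append(disk)
    return pegs[2] == list(range(n, 0, -1))   # solved: all disks on target peg

if __name__ == "__main__":
    for n in range(1, 11):                    # complexity sweep: 1..10 disks
        moves = optimal_solution(n)           # in the study, a model would propose these
        assert is_valid_solution(n, moves)
        print(f"{n} disks: {len(moves)} moves (optimal = {2**n - 1})")
```

In an evaluation like the paper's, the move sequence would come from the model under test instead of the recursive solver, and the verifier would score it at each complexity level, making the collapse point directly observable.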