AI Is Even More Biased Than We Are: Mahzarin Banaji on the Disturbing Truth Behind LLMs
This week I sat down with the woman who permanently rewired my understanding of human nature — and now she’s turning her attention to the nature of the machines we’ve gone crazy for.
Harvard psychologist Mahzarin Banaji coined the term “implicit bias” and has conducted research for decades into the blind spots we don’t admit even to ourselves. The work that blew my hair back shows how prejudice has and hasn’t changed since 2007. Take one of the tests here — I was deeply disappointed by my results.
More recently, she’s been running new experiments on today’s large language models.
What has she learned?
They’re far more biased than humans.
Sometimes twice or three times as biased.
They show shocking behavior — like a model declaring “I am a white male” or demonstrating literal self-love toward its own company. And as their most raw and objectionable responses are papered over, our ability to understand just how prejudiced they really are is being whitewashed, she says.
In this conversation, Banaji explains:
Why LLMs amplify bias instead of neutralizing it
How guardrails and “alignment” may hide what the model really thinks
Why kids, judges, doctors, and lonely users are uniquely exposed
How these systems form a narrowing “artificial hive mind”
And why we may not be mature enough to automate judgment at all
Banaji is working at the very cutting edge of the science, and delivers a clear and unsettling picture of what AI is amplifying in our minds.
00:00 — AI Will Warp Our Decisions
Banaji on why future decision-making may “suck” if we trust biased systems.
01:20 — The Woman Who Changed How We Think About Bias
Jake introduces Banaji’s life’s work charting the hidden prejudices wired into all of us.
03:00 — When Internet Language Revealed Human Bias
How early word-embedding research mirrored decades of psychological findings.
05:30 — AI Learns the One-Drop Rule
CLIP models absorb racial logic humans barely admit.
07:00 — The Moment GPT Said “I Am a White Male”
Banaji recounts the shocking early answer that launched her LLM research.
10:00 — The Rise of Guardrails… and the Disappearance of Honesty
Why the cleaned-up versions of models may tell us less about their true thinking.
12:00 — What “Alignment” Gets Fatally Wrong
The Silicon Valley fantasy of “universal human values” collides with actual psychology.
15:00 — When AI Corrects Itself in Stupid Ways
The Gemini fiasco, and why “fixing” bias often produces fresh distortions.
17:00 — Should We Even Build AGI?
Banaji on why specialized models may be safer than one general mind.
19:00 — Can We Automate Judgment When We Don’t Know Ourselves?
The paradox at the heart of AI development.
21:00 — Machines Can Be Manipulated Just Like Humans
Cialdini’s persuasion principles work frighteningly well on LLMs.
23:00 — Why AI Seems So Trustworthy (and Why That’s Dangerous)
The credibility illusion baked into every polished chatbot.
25:00 — The Discovery of Machine “Self-Love”
How models prefer themselves, their creators, and their own CEOs.
28:00 — The Hidden Line of Code That Made It All Make Sense
What changes when a model is told its own name.
31:00 — Artificial Hive Mind: What 70 LLMs Have in Common
The narrowing of creativity across models and why it matters.
34:00 — Why LLM Bias Is More Extreme Than Human Bias
Banaji explains effect sizes that blow past anything seen in psychology.
37:00 — A Global Problem: From U.S. Race Bias to India’s Caste Bias
How Western-built models export prejudice worldwide.
40:00 — The Loan Officer Problem: When “Truth to the Data” Is Immoral
A real-world example of why bias-blind AI is dangerous.
43:00 — Bayesian Hypocrisy: Humans Do It… and AI Does It More
Models replicate our irrational judgments — just with sharper edges.
48:00 — Are We Mature Enough to Hand Off Our Thinking?
Banaji on the risks of relying on a mind we didn’t design and barely understand.
50:00 — The Big Question: Can AI Ever Make Us More Rational?