AI Is Even More Biased Than We Are: Mahzarin Banaji on the Disturbing Truth Behind LLMs

Duration: 1:06:30

This week I sat down with the woman who permanently rewired my understanding of human nature — and now she’s turning her attention to the nature of the machines we’ve gone crazy for.

Harvard psychologist Mahzarin Banaji coined the term “implicit bias” and has conducted research for decades into the blind spots we don’t admit even to ourselves. The work that blew my hair back shows how prejudice has and hasn’t changed since 2007. Take one of the tests here — I was deeply disappointed by my results.

More recently, she’s been running new experiments on today’s large language models.

What has she learned?

They’re far more biased than humans.

Sometimes twice or three times as biased.

They show shocking behavior — like a model declaring “I am a white male” or demonstrating literal self-love toward its own company. And as their rawest, most objectionable responses are papered over, she says, the evidence of just how prejudiced these systems really are is being whitewashed.

In this conversation, Banaji explains:

  • Why LLMs amplify bias instead of neutralizing it

  • How guardrails and “alignment” may hide what the model really thinks

  • Why kids, judges, doctors, and lonely users are uniquely exposed

  • How these systems form a narrowing “artificial hive mind”

  • And why we may not be mature enough to automate judgment at all

Banaji is working at the very cutting edge of this science, and she delivers a clear, unsettling picture of what AI is amplifying in our minds.

00:00 — AI Will Warp Our Decisions

Banaji on why future decision-making may “suck” if we trust biased systems.

01:20 — The Woman Who Changed How We Think About Bias

Jake introduces Banaji’s life’s work charting the hidden prejudices wired into all of us.

03:00 — When Internet Language Revealed Human Bias

How early word-embedding research mirrored decades of psychological findings.

05:30 — AI Learns the One-Drop Rule

CLIP models absorb racial logic humans barely admit.

07:00 — The Moment GPT Said “I Am a White Male”

Banaji recounts the shocking early answer that launched her LLM research.

10:00 — The Rise of Guardrails… and the Disappearance of Honesty

Why the cleaned-up versions of models may tell us less about their true thinking.

12:00 — What “Alignment” Gets Fatally Wrong

The Silicon Valley fantasy of “universal human values” collides with actual psychology.

15:00 — When AI Corrects Itself in Stupid Ways

The Gemini fiasco, and why “fixing” bias often produces fresh distortions.

17:00 — Should We Even Build AGI?

Banaji on why specialized models may be safer than one general mind.

19:00 — Can We Automate Judgment When We Don’t Know Ourselves?

The paradox at the heart of AI development.

21:00 — Machines Can Be Manipulated Just Like Humans

Cialdini’s persuasion principles work frighteningly well on LLMs.

23:00 — Why AI Seems So Trustworthy (and Why That’s Dangerous)

The credibility illusion baked into every polished chatbot.

25:00 — The Discovery of Machine “Self-Love”

How models prefer themselves, their creators, and their own CEOs.

28:00 — The Hidden Line of Code That Made It All Make Sense

What changes when a model is told its own name.

31:00 — Artificial Hive Mind: What 70 LLMs Have in Common

The narrowing of creativity across models and why it matters.

34:00 — Why LLM Bias Is More Extreme Than Human Bias

Banaji explains effect sizes that blow past anything seen in psychology.

37:00 — A Global Problem: From U.S. Race Bias to India’s Caste Bias

How Western-built models export prejudice worldwide.

40:00 — The Loan Officer Problem: When “Truth to the Data” Is Immoral

A real-world example of why bias-blind AI is dangerous.

43:00 — Bayesian Hypocrisy: Humans Do It… and AI Does It More

Models replicate our irrational judgments — just with sharper edges.

48:00 — Are We Mature Enough to Hand Off Our Thinking?

Banaji on the risks of relying on a mind we didn’t design and barely understand.

50:00 — The Big Question: Can AI Ever Make Us More Rational?
