12 - AI Existential Risk with Paul Christiano

2:49:36

Why would advanced AI systems pose an existential risk, and what would it look like to develop safer systems? In this episode, I interview Paul Christiano about his views on how AI could become so dangerous, what bad AI scenarios could look like, and what he thinks of various techniques to reduce this risk.

Topics we discuss, and timestamps:

- 00:00:38 - How AI may pose an existential threat

- 00:13:36 - AI timelines

- 00:24:49 - Why we might build risky AI

- 00:33:58 - Takeoff speeds

- 00:51:33 - Why AI could have bad motivations

- 00:56:33 - Lessons from our current world

- 01:08:23 - "Superintelligence"

- 01:15:21 - Technical causes of AI x-risk

- 01:19:32 - Intent alignment

- 01:33:52 - Outer and inner alignment

- 01:43:45 - Thoughts on agent foundations

- 01:49:35 - Possible technical solutions to AI x-risk

- 01:49:35 - Imitation learning, inverse reinforcement learning, and ease of evaluation

- 02:00:34 - Paul's favorite outer alignment solutions

- 02:01:20 - Solutions researched by others

- 02:06:13 - Decoupling planning from knowledge

- 02:17:18 - Factored cognition

- 02:25:34 - Possible solutions to inner alignment

- 02:31:56 - About Paul

- 02:31:56 - Paul's research style

- 02:36:36 - Disagreements and uncertainties

- 02:46:08 - Some favorite organizations

- 02:48:21 - Following Paul's work

The transcript: axrp.net/episode/2021/12/02/episode-12-ai-xrisk-paul-christiano.html

Paul's blog posts on AI alignment: ai-alignment.com

Material that we mention:

- Cold Takes - The Most Important Century: cold-takes.com/most-important-century

- Open Philanthropy reports on:

  - Modeling the human trajectory: openphilanthropy.org/blog/modeling-human-trajectory

  - The computational power of the human brain: openphilanthropy.org/blog/new-report-brain-computation

  - AI timelines (draft): alignmentforum.org/posts/KrJfoZzpSDpnrv9va/draft-report-on-ai-timelines

  - Whether AI could drive explosive economic growth: openphilanthropy.org/blog/report-advanced-ai-drive-explosive-economic-growth

- Takeoff speeds: sideways-view.com/2018/02/24/takeoff-speeds

- Superintelligence: Paths, Dangers, Strategies: en.wikipedia.org/wiki/Superintelligence:_Paths,_Dangers,_Strategies

- Wei Dai on metaphilosophical competence:

  - Two neglected problems in human-AI safety: alignmentforum.org/posts/HTgakSs6JpnogD6c2/two-neglected-problems-in-human-ai-safety

  - The argument from philosophical difficulty: alignmentforum.org/posts/w6d7XBCegc96kz4n3/the-argument-from-philosophical-difficulty

  - Some thoughts on metaphilosophy: alignmentforum.org/posts/EByDsY9S3EDhhfFzC/some-thoughts-on-metaphilosophy

- AI safety via debate: arxiv.org/abs/1805.00899

- Iterated distillation and amplification: ai-alignment.com/iterated-distillation-and-amplification-157debfd1616

- Scalable agent alignment via reward modeling: a research direction: arxiv.org/abs/1811.07871

- Learning the prior: alignmentforum.org/posts/SL9mKhgdmDKXmxwE4/learning-the-prior

- Imitative generalisation (AKA 'learning the prior'): alignmentforum.org/posts/JKj5Krff5oKMb8TjT/imitative-generalisation-aka-learning-the-prior-1

- When is unaligned AI morally valuable?: ai-alignment.com/sympathizing-with-ai-e11a4bf5ef6e
