Content provided by Turpentine, Erik Torenberg, and Nathan Labenz. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Turpentine, Erik Torenberg, and Nathan Labenz or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://player.fm/legal.

Embryology of AI: How Training Data Shapes AI Development w/ Timaeus' Jesse Hoogland & Daniel Murfet

1:44:39
 

Jesse Hoogland and Daniel Murfet, founders of Timaeus, introduce their mathematically rigorous approach to AI safety: "developmental interpretability", based on Singular Learning Theory. They explain that neural network loss landscapes are complex, jagged surfaces full of "singularities" where models can change internally without affecting external behavior, potentially masking dangerous misalignment. Using their Local Learning Coefficient measure, they have identified critical phase changes during training in models of up to 7 billion parameters, offering an approach complementary to mechanistic interpretability. This work aims to move beyond trial-and-error neural network training toward a more principled engineering discipline that could catch safety issues during training rather than after deployment.


Sponsors:

Oracle Cloud Infrastructure: Oracle Cloud Infrastructure (OCI) is the next-generation cloud that delivers better performance, faster speeds, and significantly lower costs, including up to 50% less for compute, 70% for storage, and 80% for networking. Run any workload, from infrastructure to AI, in a high-availability environment and try OCI for free with zero commitment at https://oracle.com/cognitive

The AGNTCY: The AGNTCY is an open-source collective dedicated to building the Internet of Agents, enabling AI agents to communicate and collaborate seamlessly across frameworks. Join a community of engineers focused on high-quality multi-agent software and support the initiative at https://agntcy.org/?utmcampaign=fy25q4agntcyamerpaid-mediaagntcy-cognitiverevolutionpodcast&utmchannel=podcast&utmsource=podcast

NetSuite by Oracle: NetSuite by Oracle is the AI-powered business management suite trusted by over 41,000 businesses, offering a unified platform for accounting, financial management, inventory, and HR. Gain total visibility and control to make quick decisions and automate everyday tasks—download the free ebook, Navigating Global Trade: Three Insights for Leaders, at https://netsuite.com/cognitive


PRODUCED BY:

https://aipodcast.ing


CHAPTERS:

(00:00) Teaser

(04:44) About the Episode

(09:28) Introduction and Background

(11:01) Timaeus Origins and Philosophy

(14:18) Mathematical Foundations and SLT

(17:11) Developmental Interpretability Approach (Part 1)

(20:53) Sponsors: Oracle Cloud Infrastructure | The AGNTCY

(22:53) Developmental Interpretability Approach (Part 2)

(24:08) Proto-Paradigm and SAEs

(29:21) Generalization Theory Deep Dive

(34:59) Central Dogma Framework (Part 1)

(36:57) Sponsor: NetSuite by Oracle

(38:21) Central Dogma Framework (Part 2)

(39:19) Loss Landscape Geometry

(45:25) Degeneracies and Singularities

(52:09) Structure and Generalization

(01:00:20) Essential Dynamics Research

(01:05:04) Grokking vs Typical Learning

(01:12:03) Double Descent Discussion

(01:14:39) Interpretability and Alignment Applications

(01:22:01) Reward Hacking and Overgeneralization

(01:30:03) Future Training Vision

(01:36:20) Scaling and Compute Requirements

(01:38:19) Future Research Directions

(01:41:27) Outro


255 episodes

