Content provided by Center for AI Safety. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Center for AI Safety or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://podcastplayer.com/legal.

AISN #45: Center for AI Safety 2024 Year in Review

11:31
 

As 2024 draws to a close, we want to thank you for your continued support for AI safety and review what we’ve been able to accomplish. In this special-edition newsletter, we highlight some of our most important projects from the year.

The mission of the Center for AI Safety is to reduce societal-scale risks from AI. We focus on three pillars of work: research, field-building, and advocacy.

Research

CAIS conducts both technical and conceptual research on AI safety. Here are some highlights from our research in 2024:

Circuit Breakers. We published breakthrough research showing how circuit breakers can prevent AI models from behaving dangerously by interrupting crime-enabling outputs. In a jailbreaking competition with a prize pool of tens of thousands of dollars, it took twenty thousand attempts to jailbreak a model trained with circuit breakers. The paper was accepted to NeurIPS 2024.

The WMDP Benchmark. We developed the Weapons [...]

---

Outline:

(00:34) Research

(04:25) Advocacy

(06:44) Field-Building

(10:38) Looking Ahead

---

First published:
December 19th, 2024

Source:
https://newsletter.safe.ai/p/aisn-45-center-for-ai-safety-2024

---

Want more? Check out our ML Safety Newsletter for technical safety research.

Narrated by TYPE III AUDIO.



71 episodes

