ChatGPT Jailbreaks: The Grandma Exploit
How do you extract prohibited information from ChatGPT? The Grandma and DAN exploits trick language models into violating their own policies. This solo episode on LLM security covers why these techniques work, what they reveal about LLM architecture, and how companies protect against prompt injection attacks.
To stay in touch, sign up for our newsletter at https://www.superprompt.fm
30 episodes