Hacking AI And Retraining LLMs The API Hour podcast

6d ago 1:01:37

Content provided by Christine Bevilacqua. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Christine Bevilacqua or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://podcastplayer.com/legal.

Artificial Intelligence is transforming every industry, but with that transformation comes new security risks. In this episode of The API Hour, host Dan Barahona interviews Robert Herbig, Senior Engineer at SEP and instructor of the APIsec University course, Building Security into AI, to explore the emerging world of AI attacks, data poisoning, and model tampering.

From poisoned stop sign datasets to prompt injections that trick LLMs into revealing dangerous information, this episode is packed with eye-opening examples of how AI can be manipulated, and what builders and security teams can do to defend against it.

What You’ll Learn

Data poisoning in action: how mislabeled stop signs and manipulated datasets can cause catastrophic AI failures
Watering hole attacks & typosquatting: why malicious datasets and libraries pose a hidden risk
Prompt injection & jailbreaking: real-world cases where LLMs were manipulated into revealing restricted information
Black box vs. white box attacks: what attackers can infer just by observing model confidence scores
Retraining & RAG: how AI models ingest new information and why continuous updates create new vulnerabilities
The API connection: why exposing models via APIs ties AI security directly to API security best practices

Episode Timestamps

00:45 – Stop signs, stripes, and poisoned training data
07:00 – Data poisoning in Gmail spam detection
17:00 – SEO hacks and AI summaries: a new frontier for attackers
22:00 – Typo-squatting and malicious packages
25:00 – Pliny the Liberator and “memetic viruses” in training data
33:00 – Black box vs. white box attacks on computer vision models
43:00 – Prompt injection and roleplay exploits
52:00 – APIs and AI security: two sides of the same coin

3 episodes

What You’ll Learn

Data poisoning in action: how mislabeled stop signs and manipulated datasets can cause catastrophic AI failures
Watering hole attacks & typosquatting: why malicious datasets and libraries pose a hidden risk
Prompt injection & jailbreaking: real-world cases where LLMs were manipulated into revealing restricted information
Black box vs. white box attacks: what attackers can infer just by observing model confidence scores
Retraining & RAG: how AI models ingest new information and why continuous updates create new vulnerabilities
The API connection: why exposing models via APIs ties AI security directly to API security best practices

Episode Timestamps

00:45 – Stop signs, stripes, and poisoned training data
07:00 – Data poisoning in Gmail spam detection
17:00 – SEO hacks and AI summaries: a new frontier for attackers
22:00 – Typo-squatting and malicious packages
25:00 – Pliny the Liberator and “memetic viruses” in training data
33:00 – Black box vs. white box attacks on computer vision models
43:00 – Prompt injection and roleplay exploits
52:00 – APIs and AI security: two sides of the same coin

Podcasts Worth a Listen

The API Hour »
Hacking AI and Retraining LLMs