Content provided by Evan Kirstel. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by Evan Kirstel or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://podcastplayer.com/legal.
How AI Chatbots Go Off The Rails And What To Do About It

16:32
 

Interested in being a guest? Email us at [email protected]

Your most powerful product might also be your biggest liability: an AI agent making decisions you can’t see and answers you can’t predict. We sat down with Andre Scott of Coralogix to unpack how to make black-box systems measurable, accountable, and—most importantly—improvable over time.
We trace the journey from monoliths to microservices to LLMs and explain why old-school “index everything, analyze later” monitoring breaks under today’s data explosion. Andre introduces an analytics-first approach that processes telemetry in-stream and then stores what matters in your own object storage. That shift delivers cost control and true data ownership, turning observability from an insurance policy into a growth engine. We dig into open tooling like an LLM trace kit built on OpenTelemetry that captures prompts, responses, and metadata, so you can evaluate correctness, flag prompt injection, and enforce guardrails at runtime.
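To make the idea concrete, here is a minimal sketch of the kind of per-call trace record such a kit might capture: prompt, response, and metadata on one span, plus a runtime guardrail signal. The field names are illustrative, loosely modeled on OpenTelemetry GenAI attribute conventions, and the injection check is a naive keyword heuristic, not Coralogix’s actual schema or detection logic.

```python
import time
import uuid

def record_llm_span(prompt: str, response: str, model: str) -> dict:
    """Build one trace record for a single LLM call (illustrative shape only)."""
    return {
        "trace_id": uuid.uuid4().hex,
        "name": "llm.chat",
        "start_time": time.time(),
        "attributes": {
            # Attribute names echo OpenTelemetry GenAI conventions (assumed here).
            "gen_ai.request.model": model,
            "gen_ai.prompt": prompt,
            "gen_ai.completion": response,
            # Runtime guardrail signal: a deliberately naive prompt-injection check.
            "guardrail.injection_suspected":
                "ignore previous instructions" in prompt.lower(),
        },
    }

span = record_llm_span(
    "What is our refund policy?", "Refunds within 30 days.", "example-model"
)
```

Once prompts and completions land on spans like this, correctness evaluation and guardrail enforcement become queries over telemetry rather than changes to application code.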
Bias and hallucinations don’t announce themselves; they creep in through context loss, retrieval misses, and model updates. The fix is continuous evaluation with small, purpose-trained models that run outside your app to score tone, safety, factuality, and leakage risks. Think of agents like employees: give them performance reviews, train them with real data, and escalate when risk spikes. We also explore Olly, Coralogix’s agentic SRE that reads your telemetry, answers business-grade questions, and recommends alerts and remediations—especially handy when cloud outages ripple through your stack.
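As a rough illustration of that out-of-band evaluation loop (not Coralogix’s evaluators): a small scorer rates each captured response on a few risk axes and flags when to escalate. Real deployments would use purpose-trained models; simple keyword heuristics stand in for them here.

```python
def evaluate_response(response: str) -> dict:
    """Score one response on illustrative risk axes, outside the app itself."""
    text = response.lower()
    return {
        "possible_leakage": any(t in text for t in ("api key", "password", "ssn")),
        "possible_hedging": any(t in text for t in ("i think", "probably", "not sure")),
        "length_ok": 1 <= len(response.split()) <= 300,
    }

def needs_escalation(scores: dict) -> bool:
    # Escalate to a human reviewer when risk spikes, like a performance-review flag.
    return scores["possible_leakage"] or not scores["length_ok"]
```

Running this continuously over captured traces, rather than once at launch, is what catches the drift that model updates and retrieval misses introduce.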
Regulation is coming fast, and accountability rests with the teams who ship AI into production. If you deploy it, you own the risk. The practical playbook is clear: embrace analytics-first observability, capture LLM telemetry, make evaluators your crown jewels, and keep the data that teaches your models to improve. Subscribe, share this with your engineering and product teams, and leave a review with the one place you’d add guardrails first.

Support the show

More at https://linktr.ee/EvanKirstel


Chapters

1. Meet Andre And Coralogix’s Mission (00:00:00)

2. From Monoliths To Microservices To AI (00:00:42)

3. Analytics-First Observability And Data Ownership (00:01:54)

4. Why Chatbots Fail And Prompt Injection (00:02:32)

5. The AI Center And LLM Trace Telemetry (00:04:20)

6. Bias, Hallucinations, And Evaluations (00:06:08)

7. Continuous Evaluation And Maturity Model (00:07:12)

8. Self-Healing Agents And Today’s Limits (00:08:31)

9. Olly: An Agentic SRE In Your Pocket (00:09:19)

10. Accountability, Risks, And Attack Surface (00:10:35)

11. War Stories: DPD And Costly Mistakes (00:12:03)

12. Regulation Pressure And Readiness (00:13:20)

13. Monitoring Doesn’t Kill Autonomy (00:14:07)

14. Roadmap, re:Invent, And Next Steps (00:14:40)

15. Closing And Where To Follow (00:16:07)
