Search a title or topic

Over 20 million podcasts, powered by 

Player FM logo
Artwork

Content provided by The Mad Botter. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by The Mad Botter or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://player.fm/legal.
Player FM - Podcast App
Go offline with the Player FM app!

636: Red Hat's James Haung

20:53
 
Share
 

Manage episode 525035138 series 2440919
Content provided by The Mad Botter. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by The Mad Botter or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://podcastplayer.com/legal.

Links
James on LinkedIn
Mike on LinkedIn
Mike's Blog
Show on Discord

Alice Promo

  1. AI on Red Hat Enterprise Linux (RHEL)

Trust and Stability: RHEL provides the mission-critical foundation needed for workloads where security and reliability cannot be compromised.

Predictive vs. Generative: Acknowledging the hype of GenAI while maintaining support for traditional machine learning algorithms.

Determinism: The challenge of bringing consistency and security to emerging AI technologies in production environments.

  1. Rama-Llama & Containerization

Developer Simplicity: Rama-Llama helps developers run local LLMs easily without being "locked in" to specific engines; it supports Podman, Docker, and various inference engines like Llama.cpp and Whisper.cpp.

Production Path: The tool is designed to "fade away" after helping package the model and stack into a container that can be deployed directly to Kubernetes.

Behind the Firewall: Addressing the needs of industries (like aircraft maintenance) that require AI to stay strictly on-premises.

  1. Enterprise AI Infrastructure

Red Hat AI: A commercial product offering tools for model customization, including pre-training, fine-tuning, and RAG (Retrieval-Augmented Generation).

Inference Engines: James highlights the difference between Llama.cpp (for smaller/edge hardware) and vLLM, which has become the enterprise standard for multi-GPU data center inferencing.

  continue reading

584 episodes

Artwork

636: Red Hat's James Haung

Coder Radio

1,183 subscribers

published

iconShare
 
Manage episode 525035138 series 2440919
Content provided by The Mad Botter. All podcast content including episodes, graphics, and podcast descriptions are uploaded and provided directly by The Mad Botter or their podcast platform partner. If you believe someone is using your copyrighted work without your permission, you can follow the process outlined here https://podcastplayer.com/legal.

Links
James on LinkedIn
Mike on LinkedIn
Mike's Blog
Show on Discord

Alice Promo

  1. AI on Red Hat Enterprise Linux (RHEL)

Trust and Stability: RHEL provides the mission-critical foundation needed for workloads where security and reliability cannot be compromised.

Predictive vs. Generative: Acknowledging the hype of GenAI while maintaining support for traditional machine learning algorithms.

Determinism: The challenge of bringing consistency and security to emerging AI technologies in production environments.

  1. Rama-Llama & Containerization

Developer Simplicity: Rama-Llama helps developers run local LLMs easily without being "locked in" to specific engines; it supports Podman, Docker, and various inference engines like Llama.cpp and Whisper.cpp.

Production Path: The tool is designed to "fade away" after helping package the model and stack into a container that can be deployed directly to Kubernetes.

Behind the Firewall: Addressing the needs of industries (like aircraft maintenance) that require AI to stay strictly on-premises.

  1. Enterprise AI Infrastructure

Red Hat AI: A commercial product offering tools for model customization, including pre-training, fine-tuning, and RAG (Retrieval-Augmented Generation).

Inference Engines: James highlights the difference between Llama.cpp (for smaller/edge hardware) and vLLM, which has become the enterprise standard for multi-GPU data center inferencing.

  continue reading

584 episodes

All episodes

×
 
Loading …

Welcome to Player FM!

Player FM is scanning the web for high-quality podcasts for you to enjoy right now. It's the best podcast app and works on Android, iPhone, and the web. Signup to sync subscriptions across devices.

 

Copyright 2025 | Privacy Policy | Terms of Service | | Copyright
Listen to this show while you explore
Play