LLM Evaluation Podcasts

 
All Things LLM is your go-to podcast for demystifying Large Language Models! We break down their core concepts—like tokens, embeddings, and the self-attention that powers GPT-4 and Llama. Learn how LLMs are built, trained, and fine-tuned (SFT, RLHF, PEFT) on massive datasets. Discover real-world use cases in healthcare, finance, chatbots, code, RAG, and more. We explore the LLM ecosystem, covering open-source vs. closed models, LLMaaS, LangChain, and LLMOps tools. Plus, we tackle challenges— ...
 
The Everyday AI podcast is a daily livestream, podcast and free newsletter where we help everyday people grow their careers with AI. The Everyday AI podcast is hosted by Jordan Wilson, a former journalist who's now the owner of a boutique digital strategy company with 20 years of martech experience. Our main focus is to help you keep up with AI trends to make your job easier. Get your work done faster. Increase your output. - Sign up for our free Prime Prompt Polish ChatGPT course: https://p ...
 
Software engineers, architects and team leads have found inspiration to drive change and innovation in their team by listening to the weekly InfoQ Podcast. They have received essential information that helped them validate their software development map. We have achieved that by interviewing some of the top CTOs, engineers and technology directors from companies like Uber, Netflix and more. Over 1,200,000 downloads in the last 3 years.
 
AWS Podcast

Amazon Web Services

Weekly
 
The Official AWS Podcast is a podcast for developers and IT professionals looking for the latest news and trends in storage, security, infrastructure, serverless, and more. Join Simon Elisha and Hawn Nguyen-Loughren for regular updates, deep dives, launches, and interviews. Whether you’re training machine learning models, developing open source projects, or building cloud solutions, the Official AWS Podcast has something for you.
 
Machine learning and artificial intelligence are dramatically changing the way businesses operate and people live. The TWIML AI Podcast brings the top minds and ideas from the world of ML and AI to a broad and influential community of ML/AI researchers, data scientists, engineers and tech-savvy business and IT leaders. Hosted by Sam Charrington, a sought-after industry analyst, speaker, commentator and thought leader. Technologies covered include machine learning, artificial intelligence, de ...
 
AXRP (pronounced axe-urp) is the AI X-risk Research Podcast where I, Daniel Filan, have conversations with researchers about their papers. We discuss the paper, and hopefully get a sense of why it's been written and how it might reduce the risk of AI causing an existential catastrophe: that is, permanently and drastically curtailing humanity's future potential. You can visit the website and read transcripts at axrp.net.
 
 
My guest on this episode of the podcast is Luca Fiaschi, a machine learning expert who previously held executive data science roles at MistPlay, StitchFix, and HelloFresh. Luca is now a Partner for the Generative AI vertical at PyMC Labs, a consultancy that specializes in the application of Bayesian methods to business problems and which maintains …
 
How can you measure ROI on GenAI for your team? 🤔 Internal evaluations and intentionality. We've helped thousands of orgs put LLMs to work and ACTUALLY save time. On today's show, we're dishing the 7 steps you need to follow. What’s the best LLM for your team? 7 Steps to evaluate and create ROI for AI -- An Everyday AI chat with Jordan Wilson Newsl…
 
Is it AI failure or AI success? 🤔 We see massive trillion dollar valuations for AI companies, yet constant ‘AI bubble’ bust stories. And we see stories playing out in the media that say AI is both an enterprise boon and a complete waste of time. Welp…. A new study from Wharton will hopefully put this to rest. Among other things, it shows that 74% o…
 
Three words to Google Gemini and you can kiss your PowerPoint woes goodbye. 👋 Google Gemini quietly rolled out a kinda secret feature that TBH was deserving of a keynote. So how do you create slides in Google Gemini? And what are the pros and the limitations? Tune in as we put AI to Work on Wednesdays. The New Secret Google Gemini Feature that Quie…
 
In this episode, Carina Hong, founder and CEO of Axiom, joins us to discuss her work building an "AI Mathematician." Carina explains why this is a pivotal moment for AI in mathematics, citing a convergence of three key areas: the advanced reasoning capabilities of modern LLMs, the rise of formal proof languages like Lean, and breakthroughs in code …
 
OpenAI: reportedly losing $12 billion a quarter. 🥵 Also OpenAI: reportedly going public at a $1 trillion market cap. 🤑 The math aint mathin, right? Or is it? Join Everyday AI for our Hot Take Tuesday breaking down OpenAI's new structure, its new deal with Microsoft, and whether they're likely to keep losing money or be Wall Street's next darling. N…
 
Big AI deals. 🤝 Titans chasing startups. 🏃 Vibe coding with never-before-seen ease and enterprise AI following you to the apps you use every day. 🪄 As always, we saw some huge AI moves this week. How big? A (potential) $1 trillion IPO, hundreds of millions of users and 30K jobs cut. If you missed all the AI movement, we'll get you caught up and hel…
 
Magdalena Picariello reframes how we think about AI, moving the conversation from algorithms and metrics to business impact and outcomes. She champions evaluation systems that don't just measure accuracy but also demonstrate real-world business value, and advocates for iterative development with continuous feedback to build optimal applications. Rea…
 
The sixth installment of the Mobile Dev Memo mailbag features app monetization expert Sylvain Gauchet. Sylvain formerly served as Babbel's US Director of Revenue Strategy and now works with a number of subscription apps on revenue growth as an advisor and fractional executive. Additionally, Sylvain runs the GrowthGems newsletter, for which he scour…
 
ChatGPT Agents and Atlas have taken all the spotlight. 🤖 But these 5 underrated ChatGPT features can instantly improve your results. Join us as we uncover them and give you a leg up on everyone else. 5 Underrated ChatGPT Features You Should Be Using But Aren’t -- An Everyday AI Chat with Jordan Wilson (Replay) Newsletter: Sign up for our free daily…
 
In an AI push, Amazon has already axed 14,000 jobs and that total is reportedly going to hit 30,000. 🪓 Is this because Amazon overhired during the pandemic? Or is this a sign that AI is now capable enough that most enterprises will cut thousands of jobs? Tune in as we discuss. Amazon Cuts 30,000 jobs in AI push. What this means for the U.S. ec…
 
One small but fatal flaw of most LLMs? 💩 All your insights and deliverables kinda sit and die in those deserted chats. It can be tricky or nearly impossible to have AI chatbots simply create file types consistently. That's changing with this ONE overlooked feature inside Anthropic's Claude. Tune in as we put AI to Work on Wednesdays and start savin…
 
In this episode, Hung Bui, Technology Vice President at Qualcomm, joins us to explore the latest high-efficiency techniques for running generative AI, particularly diffusion models, on-device. We dive deep into the technical challenges of deploying these models, which are powerful but computationally expensive due to their iterative sampling proces…
 
ChatGPT ads are coming. 📰 They’re gonna be both crazy intrusive yet also pretty useful. That’s a given. But the real hot take here: personalized ChatGPT ads are actually gonna change how the internet works and conversational commerce is going to be the new norm. Every single company — including yours — is going to have to quickly adapt. We lay out …
 
Apparently this week was the week of Agentic Browsers? 🤷‍♂️ But, OpenAI's Atlas might not even be a top 3 AI news story of the week. We had hundreds of AI job cuts at Meta, Microsoft unveiled dozens of new AI features and AI music giant Suno may have a surprise competitor. Get caught up and get ahead with Everyday AI's weekly AI News That Matters s…
 
When Boyan Slat found more plastic than fish on a dive in Greece, he asked a simple question: "Why can't we just clean this up?" He was 16. What began as a humble project funded with pocket money has grown into a global initiative, removing millions of pounds of plastic from the world's rivers and oceans in the last decade. But simple questions don'…
 
Jenish Shah, a back-end engineer focused on distributed systems at Netflix, provides more insights on how to handle failures in a distributed systems setup. He shares details on how he built a library that handles exceptions uniformly, regardless of the underlying communication protocol. Read a transcript of this interview: http://bit.ly/3JpmIBn Sub…
 
Blink and you’ve missed a few dozen Microsoft AI updates. And obviously agentic browser updates in Edge. If you missed Microsoft’s Copilot Sessions Fall Update, then you might be stuck scratching your head trying to decipher AI updates like that one street sign that no one understands. Don’t worry. We did the homework for you. Join us as we break d…
 
Dr. Aida Nematzadeh is a Senior Staff Research Scientist at Google DeepMind, where her research focuses on multimodal AI models. She works on developing evaluation methods and analyzing models' learning abilities to detect failure modes and guide improvements. Before joining DeepMind, she was a postdoctoral researcher at UC Berkeley and completed her …
 
(Kinda) Hot take 🔥 AI agents kinda stink. (For now.) If you want to get more done with AI, ditch the “general” agents until they catch up. Want gains today? Agentic browsers are the real winners. (Like OpenAI's just-released Atlas browser.) Agentic browsers are powered by the world's smartest models and actually keep your context and finish multi-s…
 
Today, we're joined by Alexandre Pesant, AI lead at Lovable, who joins us to discuss the evolution and practice of vibe coding. Alex shares his take on how AI is enabling a shift in software development from typing characters to expressing intent, creating a new layer of abstraction similar to how high-level code compiles to machine code. We explor…
 
ChatGPT just released its agentic browser, Atlas. 🌏 Will it kill Chrome? What does it do? How does it incorporate ChatGPT? We'll answer those questions and more on today's show. ChatGPT’s New Agentic browser: Hands on with OpenAI’s Atlas -- An Everyday AI Chat with Jordan Wilson Newsletter: Sign up for our free daily newsletter More on this Episo…
 
In this episode of the podcast, members of the InfoQ editorial staff and friends of InfoQ will discuss current trends in the cloud and DevOps domains as part of our annual trends report creation process. These reports provide InfoQ readers with a high-level overview of key topics to watch and also help the editorial team focus on innovative technol…
 
In this week's episode of the podcast, I speak with Daphne Tideman, a product growth expert who runs the Growth Waves newsletter. The topic of our conversation is "zero-to-one growth": the tactics developers can utilize to validate and optimize their product to ultimately enable scaled user acquisition. Among other things, we cover: The purpose of …
 
Uber is paying its drivers as little as $1 to train LLMs. 😯 Smart business move or eerie sign of what's to come? On this Hot Take Tuesday episode, we uncover the trend of dirt cheap data labeling, why it's a good thing and a bad thing, and how this is actually a sign of what's next. Uber paying drivers $1 to train AI models? A sign of what’s next --…
 
Could the combo of ChatGPT and Wal-Mart take on Amazon? 🥊 What are Claude Skills and why do they matter? 🤔 Did Sora 2 already get dethroned by Veo 3.1? 📹 A lot happened in the AI world this week. We'll break it down so you don't have any questions. ChatGPT and Wal-Mart team up for AI shopping, Google drops Veo 3.1, Claude Skills get released and mor…
 
AI hype is well over. 🥱 Using AI is no longer a competitive edge, y'all. It’s as basic as having internet access. (You wouldn't put that on your marketing materials, would ya?) If your business is still treating AI as something special, or stuck trying to look innovative just by using it, this episode is for you. Find out what actually matters now and…
 
Jeetu Patel knows a few AI secrets. As the President of one of the largest companies in the world, he's helped pave the AI adoption roadmap. At Cisco, they provide full-stack, enterprise AI solutions spanning infrastructure, security, observability, and operations to the world's largest companies. So naturally, Jeetu could write a legit playbook on…
 
You haven't used ChatGPT's Apps yet? 🫠 Oh.... you like wasting time? Even for free users, ChatGPT rolled out its new Apps mode that promises to shift the future of work. Don't know how to work it? Don't know where to start? Join us as we share 3 practical ways to start saving time today. ChatGPT Apps: 3 Hands-on approaches to save time today -- An …
 
In this episode, we're joined by Kunle Olukotun, professor of electrical engineering and computer science at Stanford University and co-founder and chief technologist at SambaNova Systems, to discuss reconfigurable dataflow architectures for AI inference. Kunle explains the core idea of building computers that are dynamically configured to match th…
 
In this week's episode of the podcast, I speak with Kate Minogue, a fractional CPO and advisor for consumer and ad tech companies. Kate also runs the AI Leadership Lab, an AI leadership course. Previously, Kate worked in marketing measurement at Meta. This episode is the fifth installment of the MDM Mailbag series, in which I bring experts onto the…
 
Will this be AI's 'App Store Moment'? 🤔 OpenAI's Apps are live, and the consensus is split. Some are calling them a revolutionary step forward while others are saying it's another marketing flop. What's our hot take? Join us and find out. AI’s App Store Moment? Are ChatGPT’s Apps The Next Big Thing or Smoke and Mirrors? An Everyday AI Chat with Jor…
 
OpenAI debuted the future of ChatGPT with Agents and Apps. How will that impact work? 🤖 Google dropped Gemini for Enterprise. Does that make them the top AI option for the big players? 🏢 Everyone is talking about the AI bubble. Is it real and will it burst? 🫧 If you have questions over what's happening in the world of AI news, we've got answers. Jo…
 
In this episode of the AWS Podcast, host Jillian Forde discusses the migration journey of Booking.com to AWS with Ali and Sarah. They explore the challenges faced by Booking.com, the benefits of using CloudFront and Lambda@Edge, and the importance of observability and cost optimization. The conversation also delves into chaos engineering practi…
 
In this podcast, Michael Stiefel spoke with Nimisha Asthagiri about the importance of system thinking, multi-agent systems, the consequences of society applying a technology into an area for which it was not designed, and whether we can ever have a healthy relationship with artificial intelligence. System thinking emphasizes the importance of menta…
 
Breaking: Google just released Gemini Enterprise. 🚨 Will it be a ChatGPT or Microsoft Copilot killer? We got our hands on a version of the newest release and will break down everything you need to know, including the ONE feature that could ultimately set Gemini Enterprise apart. Google Gemini Enterprise: Coming for ChatGPT and Microsoft Copilot? --…
 
Have you been sleeping on NotebookLM? 😴 If so, you're leaving hours of productivity (and probably a lot of money) at the door. But real talk -- the team is shipping fast. The NotebookLM you met last year from the viral Audio Overviews is not the NotebookLM of today. It's slowly turned into a robust, multimedia powerhouse. And the last feature updat…
 
My guest on this week's episode of the podcast is Dan Pantelo, the CEO and founder of Marpipe, a platform that enables eCommerce companies to build dynamic product ads. In our conversation, we discuss: The necessity of exhaustive creative experimentation in eCommerce advertising Whether and how advertisers can create an effective feedback loop betw…
 
Today, we're joined by Jacob Buckman, co-founder and CEO of Manifest AI to discuss achieving long context in transformers. We discuss the bottlenecks of scaling context length and recent techniques to overcome them, including windowed attention, grouped query attention, and latent space attention. We explore the idea of weight-state balance and the…
 
Will this be the AI update that finally brings AI agents to millions? 🤔 Probably. OpenAI had a straight up feast of AI drops at its Dev Day conference, but one of the biggest was its drag-and-drop agent builder. Oh, and literally bringing entire website experiences into ChatGPT via apps. Don't miss this one. Ep 626: ChatGPT’s new Agent Builder, App…
 
OpenAI is dropping a visual agent builder 🤯 There's a HUGE report on AI job losses.... Sora 2 gets good and bad news... And that's just the beginning. Make sure to join us for this week's AI News That Matters! EP 625: Sora 2 release and update, OpenAI and AMD partner, ChatGPT Visual Agent Builder incoming and more Newsletter: Sign up for our free d…
 
In this podcast, InfoQ spoke with Elena Samuylova from Evidently AI, on best practices in evaluating Large Language Model (LLM) based applications. She also discussed the tools for evaluating, testing and monitoring applications powered by AI technologies. Read a transcript of this interview: https://bit.ly/4mHAKvN Subscribe to the Software Architec…
 
Copyright 2025 | Privacy Policy | Terms of Service