How AI Is Built

Nicolay Gerold

Weekly
 
Real engineers. Real deployments. Zero hype. We interview the top engineers who actually put AI in production. Learn what the best engineers have figured out through years of experience. Hosted by Nicolay Gerold, CEO of Aisbach and CTO at Proxdeal and Multiply Content.
 
 
Today on How AI Is Built, Nicolay Gerold sits down with Jorge Arango, an expert in information architecture. Jorge emphasizes that aligning systems with users' mental models is more important than optimizing backend logic alone. He shares a clear framework with four practical steps. Key Points: Information architecture should bridge user mental mod…
 
Modern search is broken. There are too many pieces that are glued together:
- Vector databases for semantic search
- Text engines for keywords
- Rerankers to fix the results
- LLMs to understand queries
- Metadata filters for precision
Each piece works well alone. Together, they often become a mess. When you glue these systems together, you create: Data Cons…
 
John Berryman moved from aerospace engineering to search, then to ML and LLMs. His path: Eventbrite search → GitHub code search → data science → GitHub Copilot. He was drawn to more math and ML throughout his career. RAG Explained: "RAG is not a thing. RAG is two things." It breaks into:
- Search: finding relevant information
- Prompt engineering: pre…
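Berryman's "RAG is two things" framing can be sketched in a few lines. The overlap scorer and the prompt template below are illustrative stand-ins, not his implementation:

```python
# Sketch of "RAG is two things": (1) search finds relevant documents,
# (2) prompt engineering packs them into a prompt for the LLM.
# The word-overlap scorer and template here are illustrative assumptions.

def retrieve(query, docs, k=2):
    """Rank documents by naive word overlap with the query."""
    q_words = set(query.lower().split())
    scored = sorted(docs,
                    key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query, context_docs):
    """Pack retrieved context into a prompt."""
    context = "\n".join(f"- {d}" for d in context_docs)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "GitHub Copilot suggests code in the editor.",
    "Eventbrite lets you search for events.",
    "BM25 ranks documents by term statistics.",
]
hits = retrieve("How does Copilot suggest code", docs)
prompt = build_prompt("How does Copilot suggest code", hits)
```

In a real system the overlap scorer would be a search engine and the prompt would go to an LLM; the split between the two halves is the point.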
 
Kuzu is an embedded graph database that implements Cypher as a library. It can be easily integrated into various environments—from scripts and Android apps to serverless platforms. Its design supports both ephemeral, in-memory graphs (ideal for temporary computations) and large-scale persistent graphs where traditional systems struggle with perform…
 
Metadata is the foundation of any enterprise knowledge graph. By organizing both technical and business metadata, organizations create a “brain” that supports advanced applications like AI-driven data assistants. The goal is to achieve economies of scale—making data reusable, traceable, and ultimately more valuable. Juan Sequeda is a leading expert…
 
Daniel Davis is an expert on knowledge graphs. He has a background in risk assessment and complex systems—from aerospace to cybersecurity. Now he is working on “Temporal RAG” in TrustGraph. Time is a critical—but often ignored—dimension in data. Whether it’s threat intelligence, legal contracts, or API documentation, every data point has a temporal…
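The temporal dimension Davis describes can be sketched as facts carrying validity intervals, with retrieval filtering by an as-of date. The schema below is an illustrative assumption, not TrustGraph's actual data model:

```python
from datetime import date

# Illustrative sketch: every fact carries a validity interval,
# and retrieval filters by an as-of date.

facts = [
    {"fact": "API v1 is current",
     "valid_from": date(2020, 1, 1), "valid_to": date(2022, 6, 1)},
    {"fact": "API v2 is current",
     "valid_from": date(2022, 6, 1), "valid_to": date.max},
]

def as_of(facts, when):
    """Return only the facts that were valid at the given date."""
    return [f["fact"] for f in facts
            if f["valid_from"] <= when < f["valid_to"]]
```

Asking `as_of(facts, date(2021, 5, 1))` returns only the v1 fact; the same question a year later returns v2. Ignoring the time axis would return both.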
 
Robert Caulk runs Emergent Methods, a research lab building news knowledge graphs. With a Ph.D. in computational mechanics, he spent 12 years creating open-source tools for machine learning and data analysis. His work on projects like Flowdapt (model serving) and FreqAI (adaptive modeling) has earned over 1,000 academic citations. His team built As…
 
When you store vectors, each number takes up 32 bits. With 1000 numbers per vector and millions of vectors, costs explode. A simple chatbot can cost thousands per month just to store and search through vectors. The Fix: Quantization Think of it like image compression. JPEGs look almost as good as raw photos but take up far less space. Quantization …
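The compression idea can be sketched as scalar quantization: each 32-bit float becomes an 8-bit integer plus a shared scale. This is a minimal sketch of the core idea; real libraries add calibration and finer-grained scales:

```python
# Scalar quantization sketch: store int8 codes plus one scale factor,
# cutting per-component storage from 32 bits to 8 bits (4x smaller).

def quantize(vector):
    """Map float components to the int8 range [-127, 127]."""
    scale = max(abs(x) for x in vector) / 127 or 1.0
    return [round(x / scale) for x in vector], scale

def dequantize(codes, scale):
    """Approximate reconstruction of the original floats."""
    return [c * scale for c in codes]

vec = [0.12, -0.5, 0.33, 0.98]
codes, scale = quantize(vec)
approx = dequantize(codes, scale)
# like a JPEG: almost the same picture, a fraction of the space
```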
 
Alex Garcia is a developer focused on making vector search accessible and practical. As he puts it: “I’m a SQLite guy. I use SQLite for a lot of projects... I want an easier vector search thing that I don’t have to install 10,000 dependencies to use.” Core Mantra: “Simple, Local, Scalable” Why SQLite Vec? “I didn’t go along thinking, ‘Oh, I want to…
 
Today, I (Nicolay Gerold) sit down with Trey Grainger, author of the book AI-Powered Search. We discuss the different techniques for search and recommendations and how to combine them. While RAG (Retrieval-Augmented Generation) has become a buzzword in AI, Trey argues that the current understanding of "RAG" is overly simplified – it's actually a bi…
 
Today we are back continuing our series on search. We are talking to Brandon Smith, about his work for Chroma. He led one of the largest studies in the field on different chunking techniques. So today we will look at how we can unfuck our RAG systems from badly chosen chunking hyperparameters. The biggest lie in RAG is that semantic search is simpl…
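The hyperparameters in question, chunk size and overlap, can be sketched with a simple word-level chunker. This is an illustrative toy, not Chroma's evaluation code:

```python
# Sketch of the two chunking hyperparameters the study varies:
# chunk size and overlap between consecutive chunks.

def chunk_words(text, size=5, overlap=2):
    """Split text into word chunks of `size`, overlapping by `overlap`."""
    words = text.split()
    step = size - overlap
    return [" ".join(words[i:i + size])
            for i in range(0, max(len(words) - overlap, 1), step)]

chunks = chunk_words("the quick brown fox jumps over the lazy dog")
```

Too small a `size` splits ideas across chunks; too little `overlap` loses context at the boundaries. Those are exactly the knobs that quietly break retrieval.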
 
Most LLMs you use today already use synthetic data. It’s not a thing of the future. The large labs use a large model (e.g. gpt-4o) to generate training data for a smaller one (gpt-4o-mini). This lets you build fast, cheap models that do one thing well. This is “distillation”. But the vision for synthetic data is much bigger. Enable people to train …
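Distillation as described, a large teacher generating training pairs for a small student, can be sketched with a stub teacher. The stub stands in for an API call to a large model like gpt-4o:

```python
# Toy sketch of distillation: a "teacher" produces outputs that become
# training data for a smaller "student". `teacher` is a stub standing
# in for a call to a large model.

def teacher(prompt):
    """Stub for a large model; in practice this is an API call."""
    return prompt.upper()  # placeholder "answer"

def build_training_set(prompts):
    """Synthetic data: pair each prompt with the teacher's output."""
    return [{"input": p, "target": teacher(p)} for p in prompts]

dataset = build_training_set(["summarize this", "translate that"])
# The student model is then fine-tuned on `dataset`.
```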
 
Modern RAG systems build on flexibility. At their core, they match each query with the best tool for the job. They know which tool fits each task. When you ask about sales numbers, they reach for SQL. When you need company policies, they use vector search or BM25. The key is switching tools smoothly. A question about sales figures might need SQL…
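The tool-switching idea can be sketched as a router. The keyword heuristic below is an illustrative assumption; production routers often use a trained classifier or an LLM call instead:

```python
# Sketch of query routing: quantitative questions go to SQL,
# policy-style questions go to vector search. The keyword list is
# an illustrative stand-in for a real classifier.

def route(query):
    """Pick the retrieval tool for a query."""
    q = query.lower()
    if any(word in q for word in ("sales", "revenue", "count", "average")):
        return "sql"
    return "vector_search"
```

Example: `route("What were Q3 sales numbers?")` picks SQL, while a question about company policy falls through to vector search.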
 
Many companies use Elastic or OpenSearch and use 10% of the capacity. They have to build ETL pipelines, get data normalized, and worry about race conditions. At the moment, when you want to do search on top of your transactional data, you are forced to build a distributed system. Not anymore. ParadeDB is building an open-source PostgreSQL …
 
RAG isn't a magic fix for search problems. While it works well at first, most teams find it's not good enough for production out of the box. The key is to make it better step by step, using good testing and smart data creation. Today, we are talking to Saahil Ognawala from Jina AI to start to understand RAG. To build a good RAG system, you need thr…
 
Documentation quality is the silent killer of RAG systems. A single ambiguous sentence might corrupt an entire set of responses. But the hardest part isn't fixing errors - it's finding them. Today we are talking to Max Buckley on how to find and fix these errors. Max works at Google and has built a lot of interesting experiments with LLMs on using …
 
Ever wondered why vector search isn't always the best path for information retrieval? Join us as we dive deep into BM25 and its unmatched efficiency in our latest podcast episode with David Tippett from GitHub. Discover how BM25 transforms search efficiency, even at GitHub's immense scale. BM25, short for Best Match 25, uses term frequency (TF) and …
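BM25's use of term frequency and document-length normalization can be sketched from scratch. The defaults k1=1.5 and b=0.75 are common conventions, not values from the episode:

```python
import math

# From-scratch BM25 sketch: term frequency saturates via k1,
# and b normalizes for document length relative to the average.

def bm25_score(query, doc, docs, k1=1.5, b=0.75):
    """Score one document against a query within a corpus."""
    words = doc.lower().split()
    avgdl = sum(len(d.lower().split()) for d in docs) / len(docs)
    score = 0.0
    for term in query.lower().split():
        tf = words.count(term)
        df = sum(term in d.lower().split() for d in docs)
        idf = math.log((len(docs) - df + 0.5) / (df + 0.5) + 1)
        score += (idf * tf * (k1 + 1)
                  / (tf + k1 * (1 - b + b * len(words) / avgdl)))
    return score
```

Because of the k1 saturation, a document repeating "cat" three times scores higher than one mention, but not three times higher.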
 
Ever wondered why your vector search becomes painfully slow after scaling past a million vectors? You're not alone - even tech giants struggle with this. Charles Xie, founder of Zilliz (company behind Milvus), shares how they solved vector database scaling challenges at 100B+ vector scale: Key Insights: Multi-tier storage strategy: GPU memory (1% o…
 
Modern search systems face a complex balancing act between performance, relevancy, and cost, requiring careful architectural decisions at each layer. While vector search generates buzz, hybrid approaches combining traditional text search with vector capabilities yield better results. The architecture typically splits into three core components: ing…
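One common way to combine a text ranking with a vector ranking is Reciprocal Rank Fusion. The summary does not name the fusion method, so RRF here is an assumption; the constant k=60 follows the usual RRF default:

```python
# Reciprocal Rank Fusion: merge ranked lists from different retrievers
# (e.g. BM25 and vector search) using only the ranks, not raw scores.

def rrf(rankings, k=60):
    """Fuse ranked lists of doc ids into one combined ranking."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

text_hits = ["d1", "d2", "d3"]     # from the text engine
vector_hits = ["d3", "d1", "d4"]   # from the vector index
fused = rrf([text_hits, vector_hits])
```

Documents that rank well in both lists (d1, d3) float to the top, which is exactly the behavior that makes hybrid setups outperform either retriever alone.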
 
Today we are talking to Michael Günther, a senior machine learning scientist at Jina about his work on JINA Clip. Some key points: Uni-modal embeddings convert a single type of input (text, images, audio) into vectors Multimodal embeddings learn a joint embedding space that can handle multiple types of input, enabling cross-modal search (e.g., sear…
 
Imagine a world where data bottlenecks, slow data loaders, or memory issues on the VM don't hold back machine learning. Machine learning and AI success depends on the speed you can iterate. LanceDB is here to enable fast experiments on top of terabytes of unstructured data. It is the database for AI. Dive with us into how LanceDB was built, what…
 
Today’s guest is Mór Kapronczay. Mór is the Head of ML at Superlinked. Superlinked is a compute framework for your information retrieval and feature engineering systems, where they turn anything into embeddings. When most people think about embeddings, they think about OpenAI’s ada. You just take your text and throw it in there. But that’s too crude…
 
Today we have Jessica Talisman with us, who is working as an Information Architect at Adobe. She is (in my opinion) the expert on taxonomies and ontologies. That’s what you will learn today in this episode of How AI Is Built. Taxonomies, ontologies, knowledge graphs. Everyone is talking about them; no one knows how to build them. But before we look …
 
ColPali makes us rethink how we approach document processing. ColPali revolutionizes visual document search by combining late interaction scoring with visual language models. This approach eliminates the need for extensive text extraction and preprocessing, handling messy real-world data more effectively than traditional methods. In this episode, J…
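Late interaction scoring of the kind ColPali uses can be sketched as MaxSim: sum, over query tokens, the best similarity against any document patch. The toy 2-d vectors below stand in for real model embeddings:

```python
# MaxSim sketch (late interaction, as in ColBERT/ColPali): keep one
# embedding per query token and per document patch, then sum the
# per-token best matches instead of comparing single pooled vectors.

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def maxsim(query_embs, doc_embs):
    """Sum over query tokens of the best-matching patch similarity."""
    return sum(max(dot(q, d) for d in doc_embs) for q in query_embs)

query = [[1.0, 0.0], [0.0, 1.0]]             # two query-token embeddings
doc = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]   # three patch embeddings
score = maxsim(query, doc)
```

Each query token gets to pick its own best patch, which is what lets the method match fine-grained visual detail without extracting text first.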
 
Today, we're talking to Aamir Shakir, the founder and baker at mixedbread.ai, where he's building some of the best embedding and re-ranking models out there. We go into the world of rerankers, looking at how they can classify, deduplicate documents, prioritize LLM outputs, and delve into models like ColBERT. We discuss: The role of rerankers in ret…
 
Text embeddings have limitations when it comes to handling long documents and out-of-domain data. Today, we are talking to Nils Reimers. He is one of the researchers who kickstarted the field of dense embeddings, developed sentence transformers, started HuggingFace’s Neural Search team and now leads the development of search foundational models at …
 
Hey! Welcome back. Today we look at how we can get our RAG system ready for scale. We discuss common problems and their solutions, when you introduce more users and more requests to your system. For this we are joined by Nirant Kasliwal, the author of fastembed. Nirant shares practical insights on metadata extraction, evaluation strategies, and eme…
 
In this episode of How AI is Built, Nicolay Gerold interviews Doug Turnbull, a search engineer at Reddit and author of “Relevant Search”. They discuss how methods and technologies, including large language models (LLMs) and semantic search, contribute to relevant search results. Key Highlights: Defining relevance is challenging and depends heavily …
 
In this episode, we talk data-driven search optimizations with Charlie Hull. Charlie is a search expert from Open Source Connections. He has built Flax, one of the leading open source search companies in the UK, has written “Searching the Enterprise”, and is one of the main voices on data-driven search. We discuss strategies to improve search syste…
 
Welcome back to How AI Is Built. We have got a very special episode to kick off season two. Daniel Tunkelang is a search consultant currently working with Algolia. He is a leader in the field of information retrieval, recommender systems, and AI-powered search. He worked for Canva, Algolia, Cisco, Gartner, Handshake, to pick a few. His core focus i…
 
Today we are launching season 2 of How AI Is Built. Over the last few weeks, we spoke to a lot of regular listeners and past guests, collected feedback, and analyzed our episode data. We will be applying the learnings to season 2. This season will be all about search. We are trying to make it better, more actionable, and more in-depth. The goal i…
 
In this episode of "How AI is Built," host Nicolay Gerold interviews Jonathan Yarkoni, founder of Reach Latent. Jonathan shares his expertise in extracting value from unstructured data using AI, discussing challenging projects, the impact of ChatGPT, and the future of generative AI. From weather prediction to legal tech, Jonathan provides valuable …
 
This episode of "How AI Is Built" is all about data processing for AI. Abhishek Choudhary and Nicolay discuss Spark and alternatives to process data so it is AI-ready. Spark is a distributed system that allows for fast data processing by utilizing memory. It uses the RDD (Resilient Distributed Dataset) abstraction to simplify data processing. When should you use Spar…
 
In this episode, Nicolay talks with Rahul Parundekar, founder of AI Hero, about the current state and future of AI agents. Drawing from over a decade of experience working on agent technology at companies like Toyota, Rahul emphasizes the importance of focusing on realistic, bounded use cases rather than chasing full autonomy. They dive into the ke…
 
In this conversation, Nicolay and Richmond Alake discuss various topics related to building AI agents and using MongoDB in the AI space. They cover the use of agents and multi-agents, the challenges of controlling agent behavior, and the importance of prompt compression. When you are building agents, build them iteratively. Start with simple LLM ca…
 
In this episode, Kirk Marple, CEO and founder of Graphlit, shares his expertise on building efficient data integrations. Kirk breaks down his approach using relatable concepts: The "Two-Sided Funnel": This model streamlines data flow by converting various data sources into a standard format before distributing it. Universal Data Streams: Kirk expla…
 
In our latest episode, we sit down with Derek Tu, Founder and CEO of Carbon, a cutting-edge ETL tool designed specifically for large language models (LLMs). Carbon is streamlining AI development by providing a platform for integrating unstructured data from various sources, enabling businesses to build innovative AI applications more efficiently wh…
 
In this episode, Nicolay sits down with Hugo Lu, founder and CEO of Orchestra, a modern data orchestration platform. As data pipelines and analytics workflows become increasingly complex, spanning multiple teams, tools and cloud services, the need for unified orchestration and visibility has never been greater. Orchestra is a serverless data orches…
 
Ever wondered how AI systems handle images and videos, or how they make lightning-fast recommendations? Tune in as Nicolay chats with Zain Hassan, an expert in vector databases from Weaviate. They break down complex topics like quantization, multi-vector search, and the potential of multimodal search, making them accessible for all listeners. Zain …
 
In this episode of "How AI is Built", data architect Anjan Banerjee provides an in-depth look at the world of data architecture and building complex AI and data systems. Anjan breaks down the basics using simple analogies, explaining how data architecture involves sorting, cleaning, and painting a picture with data, much like organizing Lego bricks…
 
Jorrit Sandbrink, a data engineer specializing in open table formats, discusses the advantages of decoupling storage and compute, the importance of choosing the right table format, and strategies for optimizing your data pipelines. This episode is full of practical advice for anyone looking to build a high-performance data analytics platform. Lake …
 
Kirk Marple, CEO and founder of Graphlit, discusses the evolution of his company from a data cataloging tool to a platform designed for ETL (Extract, Transform, Load) and knowledge retrieval for Large Language Models (LLMs). Graphlit empowers users to build custom applications on top of its API that go beyond naive RAG. Key Points: Knowledge Graph…
 
From Problem to Requirements to Architecture. In this episode, Nicolay Gerold and Jon Erich Kemi Warghed discuss the landscape of data engineering, sharing insights on selecting the right tools, implementing effective data governance, and leveraging powerful concepts like software-defined assets. They discuss the challenges of keeping up with the e…
 
In this episode, Nicolay Gerold interviews John Wessel, the founder of Agreeable Data, about data orchestration. They discuss the evolution of data orchestration tools, the popularity of Apache Airflow, the crowded market of orchestration tools, and the key problem that orchestrators solve. They also explore the components of a data orchestrator, t…
 
In this episode of "How AI is Built", we learn how to build and evaluate real-world language model applications with Shahul and Jithin, creators of Ragas. Ragas is a powerful open-source library that helps developers test, evaluate, and fine-tune Retrieval Augmented Generation (RAG) applications, streamlining their path to production readiness. Mai…
 
In this episode of Changelog, Weston Pace dives into the latest updates to LanceDB, an open-source vector database and file format. Lance's new V2 file format redefines the traditional notion of columnar storage, allowing for more efficient handling of large multimodal datasets like images and embeddings. Weston discusses the goals driving LanceDB'…
 
Had a fantastic conversation with Christopher Williams, Solutions Architect at Supabase, about setting up Postgres the right way for AI. We dug deep into Supabase, exploring: Core components and how they power real-time AI solutions Optimizing Postgres for AI workloads The magic of PG Vector and other key extensions Supabase’s future and exciting n…
 
If you've ever wanted a simpler way to integrate AI directly into your database, SuperDuperDB might be the answer. SuperDuperDB lets you easily apply AI processes to your data while keeping everything up-to-date with real-time calculations. It works with various databases and aims to make AI development less of a headache. In this podcast, we explo…
 
Supabase just acquired OrioleDB, a storage engine for PostgreSQL. Oriole gets creative with MVCC! It uses an UNDO log rather than keeping multiple versions of an entire data row (tuple). This means when you update data, Oriole tracks the changes needed to "undo" the update if necessary. Think of this like the "undo" function in a text editor. Inste…
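The undo-log idea can be sketched in a few lines. This mirrors the text-editor analogy from the summary, not OrioleDB's internals:

```python
# Toy UNDO-log sketch: instead of keeping whole old row versions,
# record only the inverse change needed to roll back each update,
# like the "undo" stack in a text editor.

row = {"id": 1, "balance": 100}
undo_log = []

def update(row, field, new_value):
    """Apply a change and push its inverse onto the undo log."""
    undo_log.append((field, row[field]))  # remember the old value only
    row[field] = new_value

def rollback(row):
    """Replay the undo log in reverse to restore the original row."""
    while undo_log:
        field, old_value = undo_log.pop()
        row[field] = old_value

update(row, "balance", 150)
update(row, "balance", 200)
rollback(row)  # row is back to its original state
```

The storage win is that committed data lives in one place; only the small undo records are extra, rather than full copies of every prior tuple version.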
 
Today’s guest is Antonio Bustamante, a serial entrepreneur who previously built Kite and Silo and is now working to fix bad data. He is building bem, the data tool to transform any data into the schema your AI and software needs. bem.ai is a data tool that focuses on transforming any data into the schema needed for AI and software. It acts as a sys…
 