Similarity Searching With Vectors (Chapter 8)

The Memriq AI Inference Brief – Engineering Edition

4d ago 20:34


Unlock the power of similarity search with vectors in this episode of Memriq Inference Digest – Engineering Edition. We explore how dense and sparse vector techniques combine to enable scalable, accurate semantic retrieval for AI systems, inspired by Chapter 8 of Keith Bourne’s book. Join us and special guest Keith Bourne as we unpack the engineering trade-offs, indexing algorithms, hybrid search strategies, and real-world applications that make vector search foundational in modern AI workflows.

In this episode:

- The fundamentals of representing data as high-dimensional embeddings and retrieving nearest neighbors

- How hybrid search fuses dense semantic embeddings with sparse keyword vectors to boost relevance

- Deep dive into Approximate Nearest Neighbor algorithms like HNSW for billion-scale indexing

- Practical trade-offs between open-source embedding models and managed vector stores

- Engineering tips on tuning ANN parameters, handling persistence, and combining retrieval results with Reciprocal Rank Fusion

- Real-world use cases in enterprise search, recommendation engines, and retrieval-augmented generation systems
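For readers who want a feel for the fusion step mentioned above, here is a minimal sketch of Reciprocal Rank Fusion in plain Python. It is not code from the episode or the book. The function name, the document ids, and the constant k=60 (a commonly used default) are assumptions made for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document ids into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is an assumed default, not a value
    taken from the episode.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document id.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs from a dense retriever and a sparse (BM25) retriever.
dense_hits = ["doc_a", "doc_b", "doc_c"]
sparse_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print(fused)
```

Because RRF uses only rank positions, it can combine dense and sparse results without first putting their raw scores on a common scale.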

Key tools and technologies mentioned:

- sentence_transformers (e.g., all-mpnet-base-v2)

- BM25Retriever

- LangChain and Chroma

- FAISS, HNSW, ANNOY

- Reciprocal Rank Fusion (RRF)

- Pinecone, Weaviate, Google Vertex AI Vector Search
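As a point of reference for what the ANN libraries above approximate, here is a small sketch of exact nearest-neighbor search by cosine similarity. It uses only the Python standard library. The toy three-dimensional vectors and the names `toy_index` and `nearest_neighbors` are hypothetical, and real embeddings (for example from all-mpnet-base-v2) have hundreds of dimensions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_neighbors(query, vectors, top_k=2):
    """Exact search: score every stored vector, keep the top_k most similar."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in vectors.items()]
    scored.sort(reverse=True)
    return [key for _, key in scored[:top_k]]


# Hypothetical toy "embeddings", hand-written for illustration only.
toy_index = {
    "cats": [0.9, 0.1, 0.0],
    "dogs": [0.8, 0.3, 0.1],
    "stocks": [0.0, 0.2, 0.9],
}

neighbors = nearest_neighbors([1.0, 0.2, 0.0], toy_index)
print(neighbors)
```

This brute-force scan compares the query against every stored vector, so its cost grows linearly with the collection. Index structures such as HNSW give up a small amount of recall to avoid that full scan.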

Timestamps:

0:00 - Introduction and episode overview

2:00 - The power of hybrid search: dense + sparse vectors

5:30 - ANN algorithms and indexing techniques (HNSW, LSH)

9:00 - Trade-offs: open-source embeddings vs commercial APIs

11:30 - Reciprocal Rank Fusion and ranking strategies

14:00 - Engineering challenges: persistence, tuning, and latency

16:30 - Real-world applications and production system considerations

19:00 - Final thoughts and resources

Resources:

- "Unlocking Data with Generative AI and RAG" by Keith Bourne - Search for 'Keith Bourne' on Amazon and grab the 2nd edition

- Visit Memriq.ai for advanced AI engineering guides and resources

Thanks for tuning into Memriq Inference Digest – Engineering Edition. Stay sharp, and see you next time!
