Dense Retrieval
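Information retrieval using learned vector embeddings to find semantically similar documents. Called 'dense' because each document is represented by a compact numerical vector in which most or all dimensions carry non-zero values, in contrast to the mostly-zero vectors used in sparse retrieval.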
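The ranking step can be sketched in a few lines. The example below is a minimal illustration, not a production implementation: the three-dimensional vectors are hand-made stand-ins for what a learned embedding model would produce, and the names `cosine`, `dense_retrieve`, `DOC_VECS`, and `QUERY_VEC` are hypothetical. Cosine similarity is assumed as the scoring function; real systems may use dot product or another measure.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def dense_retrieve(query_vec, doc_vecs, k=2):
    """Return the k (doc_id, score) pairs most similar to the query vector."""
    scored = [(doc_id, cosine(query_vec, vec)) for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy vectors chosen by hand (assumption); a real system would get these
# from an embedding model with hundreds of dimensions.
DOC_VECS = {
    "reducing employee turnover": [0.9, 0.1, 0.2],
    "quarterly sales forecast": [0.1, 0.9, 0.3],
    "office parking policy": [0.2, 0.3, 0.9],
}
# Stands in for the embedding of "how to keep staff from leaving".
QUERY_VEC = [0.8, 0.2, 0.1]

top = dense_retrieve(QUERY_VEC, DOC_VECS, k=2)
for doc_id, score in top:
    print(f"{score:.3f}  {doc_id}")
```

Scoring every document, as done here, is only practical for small collections; larger ones typically rely on an approximate nearest-neighbor index.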
Why It Matters
Dense retrieval matches on meaning rather than on keywords, so it can find relevant content even when the query and the document share no words.
Example
Finding documents about 'reducing employee turnover' when the user searches 'how to keep staff from leaving' — same meaning, zero keyword overlap.
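The "zero keyword overlap" claim can be checked directly. This small sketch assumes a naive lowercase whitespace tokenizer (the helper `tokens` is hypothetical); a real lexical engine would also apply stemming and stop-word removal, which does not change the result for this pair.

```python
def tokens(text):
    """Lowercase and split on whitespace; a deliberately naive tokenizer."""
    return set(text.lower().split())

query = "how to keep staff from leaving"
document = "reducing employee turnover"

# The intersection is what a pure keyword matcher would have to work with.
shared = tokens(query) & tokens(document)
print(shared)
```

With no shared terms, a purely lexical scorer has nothing to match on for this pair, which is the gap dense retrieval is meant to close.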
Think of it like...
Like asking a knowledgeable person for help instead of running a keyword search: the person understands what you mean, not just what you said.
Related Terms
Embedding
A numerical representation of data (text, images, etc.) as a vector of numbers in a high-dimensional space. Similar items are placed closer together in this space, so semantic relatedness can be estimated by measuring the distance or angle between vectors.
Semantic Search
Search that understands the meaning and intent behind a query rather than just matching keywords. It uses embeddings to find results that are conceptually related even if they use different words.
Vector Database
A specialized database designed to store, index, and search high-dimensional vector embeddings efficiently. It enables fast similarity searches across millions or billions of vectors.
Hybrid Search
A search approach that combines keyword-based (lexical) search with semantic (vector) search to get the benefits of both — exact matching for specific terms and meaning-based matching for conceptual queries.
Sparse Retrieval
Information retrieval using traditional keyword matching and term-frequency methods (such as BM25). Called 'sparse' because each document is represented by a vector with one dimension per vocabulary term, and most of those entries are zero.
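Hybrid Search, listed above, needs some way to merge a sparse ranking with a dense one. The sketch below uses reciprocal rank fusion, one common option among several; it is chosen here for illustration and is not prescribed by this entry. The function name, the document IDs, and the two input rankings are hypothetical, and the constant `k=60` is a conventional default rather than a required value.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of doc IDs into one list.

    Each document earns 1 / (k + rank) from every list it appears in,
    where rank starts at 1; higher totals rank first.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Made-up results from a keyword (sparse) ranker and a vector (dense) ranker.
sparse_ranking = ["doc_b", "doc_a", "doc_c"]
dense_ranking = ["doc_a", "doc_d", "doc_b"]

fused = reciprocal_rank_fusion([sparse_ranking, dense_ranking])
print(fused)
```

Because the fusion uses only rank positions, it avoids having to put BM25 scores and similarity scores on a common scale.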