Artificial Intelligence

BM25

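Best Matching 25: a widely used ranking function for keyword-based information retrieval. BM25 scores each document against a query using three signals: how often the query terms appear in the document (term frequency), how rare those terms are across the corpus (inverse document frequency), and how long the document is relative to the corpus average.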
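
For readers who want the formula behind that description, one common formulation is shown here. The symbols are naming choices for this sketch: f(q_i, D) is the count of query term q_i in document D, |D| is the document length in tokens, avgdl is the average document length in the corpus, N is the number of documents, and n(q_i) is the number of documents containing q_i. The parameters k_1 and b are tunable; k_1 = 1.2 and b = 0.75 are commonly used defaults. Implementations differ in small details, particularly in the exact form of the IDF term.

```latex
\mathrm{score}(D, Q) = \sum_{i=1}^{|Q|} \mathrm{IDF}(q_i) \cdot
  \frac{f(q_i, D)\,(k_1 + 1)}
       {f(q_i, D) + k_1 \left(1 - b + b \, \frac{|D|}{\mathrm{avgdl}}\right)}

\mathrm{IDF}(q_i) = \ln\!\left(1 + \frac{N - n(q_i) + 0.5}{n(q_i) + 0.5}\right)
```

Setting b = 0 turns length normalization off, and a larger k_1 lets repeated occurrences of a term keep adding to the score for longer before the contribution levels off.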

Why It Matters

BM25 remains surprisingly competitive even in the era of neural search. It is fast, interpretable, and requires no training, which makes it an essential baseline for evaluating any retrieval system.

Example

Elasticsearch uses BM25 to rank documents for the query 'machine learning optimization', scoring documents higher when they use these terms frequently and discounting matches in documents that are much longer than average.
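
The ranking behaviour in this example can be sketched in a few lines of Python. This is a minimal illustration and not how Elasticsearch is implemented: the function name bm25_scores, the whitespace tokenizer, and the three sample documents are all assumptions made for the sketch, and the IDF form is one common variant.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Return one BM25 score per document (assumes docs is non-empty)."""
    # Naive tokenization: lowercase and split on whitespace (an assumption).
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avgdl = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: in how many documents each term appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Length normalization factor for this document.
        norm = k1 * (1 - b + b * len(tokens) / avgdl)
        score = 0.0
        for term in query.lower().split():
            if tf[term] == 0:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            score += idf * tf[term] * (k1 + 1) / (tf[term] + norm)
        scores.append(score)
    return scores


# Hypothetical sample corpus for illustration only.
docs = [
    "machine learning optimization with gradient descent",
    "optimization of delivery routes " + "and many other unrelated topics " * 10,
    "cooking recipes for busy weeknights",
]
scores = bm25_scores("machine learning optimization", docs)
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranking)
```

The first document matches all three query terms and is short, so it ranks first; the second matches only one term inside a long text; the third matches nothing and scores zero.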

Think of it like...

Like a librarian who recommends books based on how often they mention your topic, adjusted for book length: a short book that mentions your topic 10 times is likely more relevant than a 1000-page book that mentions it 10 times.
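
The librarian's length adjustment can be checked with a little arithmetic. The sketch below computes only the term-frequency part of BM25 for a single term. The function name tf_component and the word counts for the two books and the library average are hypothetical values chosen for illustration; k1 = 1.2 and b = 0.75 are commonly used defaults.

```python
def tf_component(tf, doc_len, avg_len, k1=1.2, b=0.75):
    """Term-frequency part of BM25 for one term, with length normalization."""
    norm = k1 * (1 - b + b * doc_len / avg_len)
    return tf * (k1 + 1) / (tf + norm)


AVG_BOOK_WORDS = 100_000  # hypothetical average book length in the library

# Both books mention the topic 10 times, but differ in length.
short_book = tf_component(tf=10, doc_len=20_000, avg_len=AVG_BOOK_WORDS)
long_book = tf_component(tf=10, doc_len=400_000, avg_len=AVG_BOOK_WORDS)

print(short_book > long_book)  # True: the shorter book gets the higher score
```

The component can never exceed k1 + 1, so extra mentions add less and less: a book does not become ten times more relevant by repeating the topic ten times as often.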

Related Terms