Hybrid Search

Search that fuses lexical (BM25) and semantic (vector) retrieval, combining their scores or ranks so that exact matches and meaning-based matches both surface.

Hybrid search runs two retrievers in parallel — a keyword index (BM25 / TF-IDF) and a vector index (embedding similarity) — and fuses their results. The fusion is usually done with Reciprocal Rank Fusion (RRF), weighted score combination, or by feeding both signals into a learned re-ranker.
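
As an illustration of the rank-based option, here is a minimal sketch of Reciprocal Rank Fusion in Python. It assumes each retriever returns an ordered list of document IDs, best first. The function name, the sample IDs, and the constant k=60 (a commonly used default) are illustrative assumptions, not part of any particular search engine's API.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs with RRF.

    Each document earns 1 / (k + rank) from every list it appears in,
    where rank starts at 1. Documents missing from a list earn nothing
    from that list.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by document ID
    # so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from the two retrievers (best match first).
keyword_hits = ["sku-123", "sku-456", "sku-789"]
vector_hits = ["sku-456", "sku-999", "sku-123"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, while a document found by only one retriever still appears further down. Because only ranks are used, the raw scores from each retriever never need to be compared.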

It exists because pure-keyword search misses synonyms and natural-language queries, and pure-vector search misses exact matches (SKU codes, brand names, dimensions). Real ecommerce queries are a mix of both. A shopper searching “air max 90 size 10” needs both lexical precision and intent understanding.

Implementation note: the two signals are on incompatible scales. Raw BM25 scores are unbounded and vary with the query and the corpus (values in the tens are common), while cosine similarity is bounded to the range [-1, 1]. Don’t add them directly: normalize first, or use rank-based fusion, which sidesteps the scale problem.
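
The normalize-first route can be sketched as follows. This example assumes min-max normalization per result list and a single blending weight, here called alpha. Both are illustrative choices rather than a prescribed method, and the function names and sample scores are hypothetical.

```python
def min_max_normalize(scores):
    """Rescale a {doc_id: score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so treat every document as a top match.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def weighted_fusion(lexical, semantic, alpha=0.5):
    """Blend normalized lexical and semantic scores.

    alpha is the weight given to the lexical signal (assumed 0..1).
    A document missing from one retriever contributes 0 for that signal.
    """
    lex = min_max_normalize(lexical)
    sem = min_max_normalize(semantic)
    fused = {
        doc_id: alpha * lex.get(doc_id, 0.0) + (1 - alpha) * sem.get(doc_id, 0.0)
        for doc_id in set(lex) | set(sem)
    }
    # Highest fused score first; ties are broken by document ID.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical raw scores: BM25 values on one scale, cosine similarity on another.
bm25_scores = {"a": 12.0, "b": 3.0, "c": 7.5}
cosine_scores = {"a": 0.62, "b": 0.91, "d": 0.80}

ranked = weighted_fusion(bm25_scores, cosine_scores, alpha=0.5)
print([doc_id for doc_id, _ in ranked])
```

Min-max normalization is sensitive to outliers, since one unusually high BM25 score compresses every other value toward zero. That sensitivity is one reason rank-based fusion is often the simpler starting point.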

Related terms