Generative search reframes the search box as a conversational answer engine. Instead of returning a ranked list, an LLM reads the top-N retrieved products and produces a direct answer: a recommendation, a comparison, or a “best for X” short-list, with the underlying products linked as citations.
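To make the shape concrete, here’s a minimal sketch of the answer step, assuming an OpenAI-style chat client and a hypothetical `Product` record; the model name, prompt wording, and numbered-citation format are illustrative, not any particular store’s implementation.

```python
# Minimal sketch of the "answer" step: feed the top-N retrieved products
# to an LLM and ask for a cited recommendation. `Product` and the prompt
# are illustrative assumptions.
from dataclasses import dataclass

from openai import OpenAI  # pip install openai


@dataclass
class Product:
    id: str
    title: str
    price: float
    description: str


def generate_answer(query: str, products: list[Product]) -> str:
    """Produce a direct answer grounded in the retrieved products."""
    # Number each product so the model can cite it as [1], [2], ...
    context = "\n".join(
        f"[{i}] {p.title} (${p.price:.2f}): {p.description}"
        for i, p in enumerate(products, start=1)
    )
    client = OpenAI()
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a shopping assistant. Answer using ONLY the "
                    "numbered products below, and cite each pick like [2]."
                ),
            },
            {
                "role": "user",
                "content": f"Products:\n{context}\n\nQuestion: {query}",
            },
        ],
    )
    return response.choices[0].message.content
```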
For ecommerce, the killer use case is exploratory, natural-language queries: “what should I get my niece who likes climbing for her 10th birthday under $50?” Traditional keyword search struggles with queries like that; generative search shines. But the retriever (BM25 plus vectors) still does the heavy lifting: the LLM only sees what the retriever surfaces, so retrieval quality remains the bottleneck.
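One common way to combine the two retrieval signals is reciprocal rank fusion (RRF), sketched below; the SKU names are made up, and RRF is just one option here (score interpolation is another), not the method every store uses.

```python
# Hedged sketch of hybrid retrieval via reciprocal rank fusion (RRF):
# merge best-first ranked ID lists from BM25 and vector search.
from collections import defaultdict


def rrf_fuse(rankings: list[list[str]], k: int = 60, top_n: int = 10) -> list[str]:
    """Fuse several best-first ranked ID lists into one ranking."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, product_id in enumerate(ranking, start=1):
            # A product ranked highly by either system gets a large boost;
            # k=60 is the constant from the original RRF paper.
            scores[product_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.__getitem__, reverse=True)[:top_n]


# Example: BM25 and vector search disagree; fusion rewards consensus.
bm25_ranking = ["sku-chalk-bag", "sku-rope", "sku-harness"]
vector_ranking = ["sku-harness", "sku-chalk-bag", "sku-crash-pad"]
print(rrf_fuse([bm25_ranking, vector_ranking]))
# sku-chalk-bag and sku-harness come out on top: both retrievers surfaced them.
```

Whatever the fusion method, the output of this step is the only context the LLM ever sees, which is why retrieval quality caps answer quality.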
Cost and latency are the trade-offs: an LLM call adds roughly 1–4 seconds and a few cents per search, versus milliseconds and near-zero marginal cost for classical search. Stores typically route only long-form or low-result queries to the generative path, where the extra cost is clearly worth it.
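That routing decision can be a simple heuristic. The sketch below routes short keyword-style queries to the cheap classical path and falls back to the generative one otherwise; the thresholds (5 words, 3 results) are placeholder assumptions, not tuned values.

```python
# Hedged sketch of a query router; the thresholds are illustrative
# placeholders and would be tuned per store in practice.
def should_use_generative(query: str, classical_result_count: int) -> bool:
    """Route long-form or low-result queries to the LLM answer path."""
    is_long_form = len(query.split()) >= 5       # natural-language question
    is_low_result = classical_result_count < 3   # classical search came up short
    return is_long_form or is_low_result


# "red shoes" with plenty of hits stays on the millisecond path:
assert not should_use_generative("red shoes", classical_result_count=120)
# The exploratory gift query goes to the generative path:
assert should_use_generative(
    "what should I get my niece who likes climbing for her 10th birthday",
    classical_result_count=40,
)
```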