
This case study describes a real engagement. Client identity, proprietary details, and specific metrics are anonymized or approximated under NDA.

E-Commerce

AI-Powered Product Search and Discovery

The Problem

Keyword-based site search was returning irrelevant results, and 40% of search sessions ended without a click. The engine had no semantic understanding of product relationships: a search for "office chair with lumbar support" returned results based on keyword overlap rather than product characteristics.

The Solution

Vector search with embedding-based retrieval, query understanding layer, and personalized re-ranking using session and purchase history signals. Replaced the existing keyword search index with a hybrid retrieval system that handles semantic queries, misspellings, and attribute-based filtering.

2.4x Search Click-Through
180ms P95 Search Latency
31% Conversion Lift
Overview

This engagement replaced a legacy keyword-based search system with a vector retrieval architecture covering 180,000+ active product SKUs. The system combines dense vector retrieval using OpenAI embeddings with sparse BM25 retrieval in a hybrid ranking layer, which outperforms either approach alone on the query types that dominate the platform's search logs. Product embeddings are pre-computed and stored in Pinecone; query embeddings are generated at request time. Re-ranking applies a lightweight personalization model that adjusts result ordering based on category affinity derived from the user's session and purchase history. Total P95 search latency at production load is 180ms, including embedding generation, retrieval, and re-ranking.


The Challenge

The primary challenge was building an embedding strategy that captures product characteristics accurately across a catalog with significant attribute inconsistency. Product descriptions and specifications were submitted by multiple merchants with different terminology, levels of detail, and formatting conventions — the same product attribute might appear as "lumbar support", "ergonomic back", or "lower back cushion" depending on the merchant. Embedding quality is directly dependent on input text quality, requiring a normalization pass before embedding generation. The personalization re-ranking layer needed to be fast enough to stay within the overall latency budget, which ruled out approaches that required expensive model inference at re-rank time.
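The terminology normalization described above can be sketched as a canonicalization pass over product text before embedding. The synonym entries below are illustrative only; the real mapping table was built from merchant data under NDA.

```python
import re

# Hypothetical synonym map: variant attribute phrasing -> canonical form.
ATTRIBUTE_SYNONYMS = {
    "ergonomic back": "lumbar support",
    "lower back cushion": "lumbar support",
    "faux leather": "synthetic leather",
}

def normalize_attributes(text: str) -> str:
    """Rewrite known attribute variants to one canonical phrase so that
    equivalent products embed close together regardless of merchant wording."""
    out = text.lower()
    for variant, canonical in ATTRIBUTE_SYNONYMS.items():
        out = re.sub(re.escape(variant), canonical, out)
    return out

print(normalize_attributes("Office chair with ergonomic back"))
# -> office chair with lumbar support
```

A lookup table like this is cheap enough to run inside the nightly embedding pipeline, and it keeps the fix on the input-text side rather than trying to compensate for inconsistent terminology at query time.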


How We Built It

01

Search log analysis and query taxonomy (Weeks 1–2): We analyzed 30 days of search logs (4.2M queries) to understand the distribution of query types: navigational (brand + product name), attribute-based (descriptive queries with specific requirements), category browsing (broad terms), and long-tail queries. The zero-click rate was significantly higher for attribute-based and long-tail queries, confirming that keyword matching was failing on precisely the queries where semantic understanding would add the most value. This analysis informed the embedding strategy and the test set selection for evaluation.
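A bucketing rule along these lines could be used to tag logged queries with the four taxonomy categories. The brand list, category terms, and constraint cues below are placeholders; the engagement's actual classifier and lexicons are not public.

```python
KNOWN_BRANDS = {"acme", "contoso"}             # placeholder brand lexicon
BROAD_CATEGORIES = {"chairs", "desks"}         # placeholder category terms
ATTRIBUTE_MARKERS = {"with", "for", "under"}   # cues for descriptive constraints

def classify_query(query: str) -> str:
    """Heuristic bucketing of a raw query into one taxonomy category."""
    tokens = query.lower().split()
    if any(t in KNOWN_BRANDS for t in tokens):
        return "navigational"
    if len(tokens) == 1 and tokens[0] in BROAD_CATEGORIES:
        return "category"
    if any(t in ATTRIBUTE_MARKERS for t in tokens):
        return "attribute"
    return "long_tail"

print(classify_query("office chair with lumbar support"))  # -> attribute
```

Even a coarse heuristic like this is enough to segment zero-click rates by query type, which is the analysis that justified the investment in semantic retrieval.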

02

Product embedding pipeline (Weeks 3–4): We built a product representation pipeline that combines title, description, category hierarchy, and structured attributes (material, dimensions, color, compatibility) into a normalized text document for each SKU. Normalization handles common attribute terminology variants, strips promotional language that does not reflect product characteristics, and standardizes units and formats. OpenAI's text-embedding-3-large model generates embeddings for each product document, which are stored in Pinecone. The embedding pipeline runs nightly to pick up new products and changed listings, with incremental updates for changed SKUs only.
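The per-SKU document construction described above can be sketched as a flattening function. The field names and the example SKU are hypothetical; the embedding and Pinecone upsert calls are noted in comments rather than executed.

```python
def build_product_document(sku: dict) -> str:
    """Flatten title, category hierarchy, description, and structured
    attributes into one normalized text document per SKU."""
    parts = [
        sku["title"],
        " > ".join(sku.get("category_path", [])),
        sku.get("description", ""),
    ]
    # Sort attributes so the document is stable across pipeline runs.
    for name, value in sorted(sku.get("attributes", {}).items()):
        parts.append(f"{name}: {value}")
    return "\n".join(p for p in parts if p)

doc = build_product_document({
    "title": "ErgoFlex Task Chair",
    "category_path": ["Furniture", "Office Chairs"],
    "attributes": {"material": "mesh", "lumbar support": "yes"},
})
# The document would then be embedded, e.g.
#   client.embeddings.create(model="text-embedding-3-large", input=[doc])
# and the vector upserted to Pinecone keyed by SKU id; nightly runs would
# diff against the previous document hash to embed changed SKUs only.
```

Keeping the document builder deterministic (stable field order, stable attribute sorting) is what makes the incremental "changed SKUs only" update cheap: an unchanged document hashes to the same value and skips re-embedding.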

03

Hybrid retrieval and query understanding (Weeks 5–6): The query processing layer handles spelling correction, query expansion for known synonym sets, and attribute extraction for queries that contain specific constraints (e.g., extracting "lumbar support" as a required attribute filter). Retrieval combines dense vector search from Pinecone with sparse BM25 retrieval from a Redis inverted index, with a learned fusion weight that was calibrated against the annotated relevance evaluation set. The hybrid approach outperformed pure vector retrieval on navigational queries and pure BM25 on semantic queries.
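The fusion step above can be sketched as a weighted combination of normalized score lists. The fixed `alpha` below stands in for the learned fusion weight, which was calibrated offline against the annotated relevance set.

```python
def fuse_scores(dense: dict, sparse: dict, alpha: float = 0.6) -> list:
    """Combine per-document dense (vector) and sparse (BM25) scores.

    Scores from the two retrievers live on different scales, so each list
    is min-max normalized before the weighted sum. Returns (doc, score)
    pairs sorted best-first.
    """
    def minmax(scores: dict) -> dict:
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        return {doc: (s - lo) / span for doc, s in scores.items()}

    d, s = minmax(dense), minmax(sparse)
    docs = set(d) | set(s)
    return sorted(
        ((doc, alpha * d.get(doc, 0.0) + (1 - alpha) * s.get(doc, 0.0))
         for doc in docs),
        key=lambda pair: pair[1],
        reverse=True,
    )
```

A document found by only one retriever still gets a partial score, which is how the hybrid layer covers both the navigational queries where BM25 excels and the semantic queries where vector search does.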

04

Personalization re-ranking and production deployment (Week 6 through post-launch): The re-ranking layer applies a lightweight gradient boosting model that adjusts the base retrieval score using category affinity features (computed from session and 90-day purchase history), price range affinity, and brand preference signals. The model runs in under 3ms per query, small enough to fit within the latency budget without requiring a dedicated inference service. The full search stack is served from a Go API layer that handles authentication, caching (Redis for popular queries), and request routing. P95 latency at production load measured 180ms in pre-launch load testing.
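The category-affinity signal feeding the re-ranker can be sketched as below. The production re-ranker is a gradient boosting model over several feature families; the linear boost here is only a stand-in to show where the history signal enters the scoring.

```python
from collections import Counter

def category_affinity(history: list) -> dict:
    """Turn a list of category interactions (session views, 90-day
    purchases) into normalized affinity weights."""
    counts = Counter(history)
    total = sum(counts.values()) or 1
    return {cat: n / total for cat, n in counts.items()}

def rerank(results: list, history: list, boost: float = 0.2) -> list:
    """Nudge base retrieval scores toward categories the user favors.

    `results` is a list of (doc_id, category, base_score) tuples from the
    hybrid retrieval stage.
    """
    aff = category_affinity(history)
    rescored = [
        (doc, score * (1 + boost * aff.get(cat, 0.0)))
        for doc, cat, score in results
    ]
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)
```

Because the affinity table is precomputable per session, the per-query work is a dictionary lookup and a multiply per candidate, which is how a re-rank stage stays in the low single-digit milliseconds.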


What We Delivered

Search click-through rate increased 2.4x from pre-deployment baseline (measured as the proportion of search sessions resulting in at least one product click). Zero-click search sessions dropped from 40% to 17%. The improvement was largest in the attribute-based and long-tail query categories, consistent with the pre-project hypothesis that semantic understanding would most benefit the query types where keyword matching was weakest.

P95 search latency at production load measured 180ms end-to-end, against a 200ms target. Pinecone retrieval accounts for approximately 80ms of the budget; embedding generation, Redis cache check, and re-ranking account for the remainder. Cache hit rate for popular queries is approximately 68%, bringing P95 latency for cached queries under 40ms.

Conversion rate for sessions containing at least one search query increased 31% in the 30 days following full deployment. The improvement was attributed primarily to the reduction in dead-end search experiences that previously caused buyers to navigate away from the platform rather than refine their query.

Tech Stack: Python, Pinecone, OpenAI Embeddings, Next.js, Go, Redis
Timeline: 6 weeks
Team Size: 2 engineers
