AI Product Recommendations

Metricuno

May 17, 2026

4 min read

Quick answer

AI product recommendations use collaborative filtering, content-based matching, or LLM-driven attribute understanding to surface the right next product. Here's how the engines work and the lift to expect.

Definition

AI Optimization

AI Product Recommendations

Algorithmic suggestions that match shoppers to products using behavioural, content, or LLM-derived signals.

AI product recommendations are the ranked product suggestions that appear in product detail page (PDP) carousels, cart drawers, search results, and post-purchase emails, generated by an engine that scores every candidate SKU against a shopper's context in real time. The three main approaches are collaborative filtering (people who viewed this also viewed that), content-based matching (similar attributes: fabric, brand, price band), and, increasingly, LLM-driven engines that read product copy and reviews to understand intent.

Done well, the 'you might also like' rail stops being decorative and starts driving 10-30% of total revenue. Done badly, it shows the item the shopper just added to cart.

Also known as

personalised recommendations

product recommendation engine

recsys

Recommendation surfaces have multiplied. A Shopify apparel store today might run rails on the home page, the PDP, the cart drawer, the empty-search state, the order-confirmation page, and the abandonment email — each with different intent and different ranking logic. Treating them as one block is the most common mistake.

Modern engines blend signals rather than picking one model. A typical hybrid score weights recent browse behaviour, attribute similarity, margin, and stock level. As a slice of AI Optimization, recommendations are where most stores see the fastest payback because the surface area is huge and the baselines (random or 'bestseller') are weak.

Formula

score(u, i) = w1 · CF(u, i) + w2 · Content(u, i) + w3 · Margin(i) - w4 · Recency_penalty(u, i)

Variables

CF(u, i)

Collaborative filtering score

Probability user u co-engages with item i based on similar shoppers' behaviour.

Content(u, i)

Content similarity score

Cosine similarity between user's recently-viewed attribute vector and item i's attributes.

Margin(i)

Margin uplift

Normalised gross margin of item i — used as a tiebreaker, not a primary driver.

Recency_penalty(u, i)

Already-seen penalty

Down-weights items the user has just viewed, added to cart, or purchased.

w1..w4

Tunable weights

Learned per surface — PDP weights content high, cart weights CF and margin high.
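
To make the Content(u, i) term concrete, here is a minimal Python sketch of cosine similarity over sparse attribute vectors. The attribute names and weights are hypothetical, and a production engine would likely use learned embeddings rather than hand-set weights.

```python
import math


def cosine_similarity(a: dict[str, float], b: dict[str, float]) -> float:
    """Cosine similarity between two sparse vectors (attribute -> weight)."""
    dot = sum(weight * b.get(attr, 0.0) for attr, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # no attributes on one side means no measurable similarity
    return dot / (norm_a * norm_b)


# Hypothetical data: attributes the shopper viewed recently, weighted by view count.
user_profile = {"sensitive skin": 2.0, "fragrance-free": 1.0, "serum": 1.0}
# Hypothetical attribute tags for one candidate item.
item_attrs = {"sensitive skin": 1.0, "serum": 1.0, "vegan": 1.0}

content_score = cosine_similarity(user_profile, item_attrs)
print(round(content_score, 3))
```

The score rises when the item shares the attributes the shopper has viewed most, and falls to zero when there is no overlap.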

Worked example

Beauty store ranking candidates for a PDP 'complete the routine' rail

CF score (cleanser → serum): 0.62

Content score (both 'sensitive skin'): 0.78

Margin score (normalised): 0.55

Recency penalty (not yet viewed): 0

Weights w1..w4: 0.5, 0.3, 0.1, 0.4

→ Score ≈ 0.5·0.62 + 0.3·0.78 + 0.1·0.55 - 0.4·0.0 = 0.60

A score of 0.60 on a 0-1 scale puts this serum comfortably in the top slots of the rail. Swap in a recently purchased cleanser and the recency term works against it: at a penalty of 1.0, w4 alone subtracts 0.4 from the score, enough to drop it off the rail.
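
The arithmetic in the worked example can be reproduced with a short Python sketch of the formula. The Weights class and hybrid_score function are illustrative names, not part of any particular engine, and the penalty of 1.0 in the second call is an assumed value for a just-purchased item.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Weights:
    cf: float       # w1
    content: float  # w2
    margin: float   # w3
    recency: float  # w4


def hybrid_score(cf: float, content: float, margin: float,
                 recency_penalty: float, w: Weights) -> float:
    """score(u, i) = w1*CF + w2*Content + w3*Margin - w4*Recency_penalty."""
    return (w.cf * cf + w.content * content
            + w.margin * margin - w.recency * recency_penalty)


# Weights and component scores taken from the worked example.
PDP_WEIGHTS = Weights(cf=0.5, content=0.3, margin=0.1, recency=0.4)

serum = hybrid_score(0.62, 0.78, 0.55, 0.0, PDP_WEIGHTS)
print(round(serum, 3))  # 0.599

# Same component scores, but with an assumed penalty of 1.0 (just purchased).
penalised = hybrid_score(0.62, 0.78, 0.55, 1.0, PDP_WEIGHTS)
print(round(penalised, 3))  # 0.199
```

Keeping the weights in their own object makes it straightforward to hold one set per surface, in line with the advice to tune per surface rather than per store.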

The weights matter more than the model choice. A PDP rail that over-weights margin shows expensive irrelevant products and tanks click-through. A cart rail that ignores margin leaves money on the table because the shopper is already buying. Tune per surface, not per store.

Benchmark

Typical revenue lift from AI product recommendations by surface (vs. no rail or static bestsellers)

Surface	CTR on rail	Attributed revenue lift	Time to positive ROI
Home page rail	3-6%	1-3%	2-4 weeks
PDP 'you might also like'	8-14%	4-8%	1-3 weeks
PDP 'complete the look/routine'	10-18%	6-12%	1-3 weeks
Cart drawer cross-sell	12-22%	5-10%	1-2 weeks
Post-purchase upsell	6-11%	3-7%	2-4 weeks
Abandonment email	4-9%	2-5%	3-6 weeks

The cart drawer and PDP 'complete the look' rails consistently outperform because they catch buying intent at its peak. Home-page rails are the lowest-leverage surface, with the lowest CTR and the weakest attributed revenue in the table, because the shopper hasn't formed intent yet. If you're running one rail, run it on the PDP.

Frequently asked

AI product recommendations: FAQ

What's the difference between collaborative filtering and content-based recommendations?

Collaborative filtering uses behavioural co-occurrence: shoppers who viewed A also viewed B. Content-based matching uses product attributes: A and B share fabric, brand, or price band. Collaborative filtering needs traffic to work; content-based matching works on day one but misses non-obvious affinities. Most production engines blend both.
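
As a rough illustration of the behavioural co-occurrence idea, the Python sketch below counts how often two items are viewed in the same session and returns the most frequent companions. It is a deliberately simple item-to-item approach with made-up session data; real engines typically normalise for popularity and use far more traffic.

```python
from collections import Counter
from itertools import combinations


def co_view_counts(sessions: list[list[str]]) -> Counter:
    """Count, for every ordered pair (a, b), the sessions where both were viewed."""
    counts: Counter = Counter()
    for viewed in sessions:
        # De-duplicate within a session so repeat views count once.
        for a, b in combinations(sorted(set(viewed)), 2):
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return counts


def also_viewed(item: str, counts: Counter, top_n: int = 3) -> list[str]:
    """Items most often co-viewed with `item`, ties broken alphabetically."""
    related = [(other, n) for (src, other), n in counts.items() if src == item]
    related.sort(key=lambda pair: (-pair[1], pair[0]))
    return [other for other, _ in related[:top_n]]


# Hypothetical browsing sessions.
sessions = [
    ["cleanser", "serum", "moisturiser"],
    ["cleanser", "serum"],
    ["cleanser", "sunscreen"],
    ["serum", "moisturiser"],
]
counts = co_view_counts(sessions)
print(also_viewed("cleanser", counts))  # ['serum', 'moisturiser', 'sunscreen']
```

With only four sessions, most pairs have a count of one, which is the sparsity problem in miniature: the ranking only becomes trustworthy once there is enough traffic behind each pair.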

How much traffic do I need before AI recommendations work?

Pure collaborative filtering needs roughly 50k+ sessions a month to escape sparsity issues. Content-based and LLM-driven engines work from session one because they read product attributes, not behaviour. If you're below 30k sessions, start content-based and layer collaborative signals as traffic grows.

Do AI recommendations slow down my PDP?

A well-built engine returns ranked SKUs in 30-80ms server-side, and the rail itself loads lazily below the fold. The performance hit comes from heavy client-side widgets that re-fetch on every scroll. Audit Largest Contentful Paint before and after: if it moves by more than 100ms, the integration is the problem, not the recommendations.

How do LLM-driven recommendations differ from traditional engines?

LLMs read product titles, descriptions, and review text to understand attributes the catalogue doesn't expose: 'good for narrow feet', 'vegan', 'gift for a teen'. They're strongest on long-tail and cold-start SKUs where collaborative signals are thin. They don't replace CF; they add a semantic layer on top.

Should I show 'you might also like' or 'frequently bought together'?

Use both on different surfaces. 'Frequently bought together' belongs on the cart and on the PDP near the buy button because it's a complement signal. 'You might also like' belongs lower on the PDP or on the empty-search state because it's a discovery signal. Mixing them on one rail confuses the ranking.

How do I measure if my recommendations are working?

Track three metrics: rail CTR, attach rate (% of orders containing a recommended SKU), and incremental revenue from a holdout A/B test. CTR alone is vanity: a rail can drive clicks while cannibalising organic discovery. The holdout test is the only true measure of incrementality.
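
The three metrics can be computed with a few lines of Python. This is a sketch under simple assumptions: the function names, the order representation, and all the numbers are hypothetical, and a real analysis would also check statistical significance before trusting the holdout difference.

```python
def rail_ctr(impressions: int, clicks: int) -> float:
    """Share of rail impressions that led to a click."""
    return clicks / impressions if impressions else 0.0


def attach_rate(orders: list[tuple[set[str], set[str]]]) -> float:
    """Share of orders containing at least one recommended SKU.

    Each order is (purchased SKUs, SKUs the rail showed that shopper).
    """
    if not orders:
        return 0.0
    attached = sum(1 for purchased, shown in orders if purchased & shown)
    return attached / len(orders)


def incremental_rps(treated_revenue: float, treated_sessions: int,
                    holdout_revenue: float, holdout_sessions: int) -> float:
    """Revenue per session with the rail minus revenue per session without it."""
    return treated_revenue / treated_sessions - holdout_revenue / holdout_sessions


# Hypothetical numbers for illustration only.
ctr = rail_ctr(impressions=2000, clicks=240)
attach = attach_rate([
    ({"serum", "cleanser"}, {"serum", "toner"}),  # bought a recommended SKU
    ({"sunscreen"}, {"serum"}),
    ({"moisturiser"}, {"toner"}),
    ({"cleanser"}, set()),                        # rail was never shown
])
uplift = incremental_rps(52_000.0, 20_000, 5_000.0, 2_000)
print(ctr, attach, round(uplift, 2))
```

Only the last figure comes from the holdout comparison, which is why it is the one that speaks to incrementality rather than activity on the rail.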

Can I A/B test different recommendation algorithms?

Yes, and you should. Split traffic 50/50 between two engines (or one engine vs. static bestsellers) for at least two full weekly cycles. Measure revenue per session, not CTR. Most stores find their first big win in the test, not in switching engines later.
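
One common way to get a stable 50/50 split is to hash a visitor identifier into a bucket, so the same shopper always sees the same engine. The Python sketch below assumes you have some persistent visitor ID; the experiment name, variant labels, and revenue figures are hypothetical.

```python
import hashlib


def assign_variant(visitor_id: str, experiment: str = "rec-engine-test") -> str:
    """Deterministically place a visitor in one of two equal-sized arms."""
    key = f"{experiment}:{visitor_id}".encode("utf-8")
    # Take the first 32 bits of the SHA-256 digest and map them to 0..99.
    bucket = int(hashlib.sha256(key).hexdigest()[:8], 16) % 100
    return "engine_a" if bucket < 50 else "engine_b"


def relative_lift(control_rps: float, variant_rps: float) -> float:
    """Relative change in revenue per session versus the control arm."""
    return (variant_rps - control_rps) / control_rps


# The same visitor always lands in the same arm.
print(assign_variant("visitor-123") == assign_variant("visitor-123"))  # True

# Hypothetical revenue per session for each arm after the test window.
print(round(relative_lift(control_rps=2.50, variant_rps=2.75), 2))  # 0.1
```

Including the experiment name in the hash means a later test reshuffles visitors instead of reusing the same two groups.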

What's a realistic revenue lift from adding AI recommendations?

Stores moving from no rail (or static bestsellers) to a personalised engine typically see a 5-15% lift in revenue per session within four weeks, concentrated on PDP and cart surfaces. Lift plateaus around month three, once the model has learned the catalogue.

How do I avoid recommending out-of-stock or low-margin products?

Apply hard filters before ranking: exclude out-of-stock (OOS) items, exclude SKUs below a margin floor, and exclude items the shopper just viewed or bought. Then let the model rank what's left. Trying to teach the model to avoid these via training data is slower and less reliable than a filter step.
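
A filter-then-rank step might look like the Python sketch below. The Candidate fields, the 0.2 margin floor, and the sample catalogue are assumptions for illustration; the score is whatever your ranking model already produces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    sku: str
    in_stock: bool
    margin: float  # normalised gross margin, 0-1
    score: float   # model score, computed upstream


def rank_with_hard_filters(candidates: list[Candidate],
                           recently_seen: set[str],
                           margin_floor: float = 0.2,
                           top_n: int = 4) -> list[str]:
    """Drop ineligible SKUs first, then rank the remainder by model score."""
    eligible = [
        c for c in candidates
        if c.in_stock                     # exclude out-of-stock items
        and c.margin >= margin_floor      # exclude SKUs below the margin floor
        and c.sku not in recently_seen    # exclude just-viewed or just-bought items
    ]
    eligible.sort(key=lambda c: c.score, reverse=True)
    return [c.sku for c in eligible[:top_n]]


# Hypothetical candidates for a cart rail.
candidates = [
    Candidate("serum", True, 0.55, 0.60),
    Candidate("toner", False, 0.50, 0.72),      # out of stock
    Candidate("sample-kit", True, 0.05, 0.65),  # below the margin floor
    Candidate("cleanser", True, 0.40, 0.58),    # already in the cart
    Candidate("moisturiser", True, 0.45, 0.41),
]
rail = rank_with_hard_filters(candidates, recently_seen={"cleanser"})
print(rail)  # ['serum', 'moisturiser']
```

The three highest-scoring SKUs here are all removed by a filter, which is exactly the case that is hard to guarantee through training data alone.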

Do I need a dedicated recommendation tool, or is my platform's built-in engine enough?

Shopify's native 'related products' is content-based and acceptable as a baseline. You typically outgrow it once you have 500+ SKUs or want behaviour-driven ranking. The signal: if your rail CTR is below 4% and attach rate is flat, the built-in engine is the bottleneck.

Get an AI expert review of your site

Paste your URL — Metricuno's AI runs the same heuristic checks a senior CRO consultant would, scoring your page and prioritising the fixes that'll move conversion fastest.

Run a free expert review


