Impact Estimation

Metricuno

May 18, 2026

Quick answer

Impact estimation forecasts the revenue lift a winning A/B test would deliver — the input that decides which experiments are worth running. Here's the formula, typical ranges, and how to use it.

Definition

Impact Estimation

Forecasting the expected revenue lift from a winning A/B test before you run it, used to decide which experiments are worth the queue slot.

Impact estimation is the practice of projecting how much extra revenue a test would generate if its winning variant rolled out to 100% of traffic. The classic form multiplies five levers: the expected effect size (relative conversion lift), the baseline conversion rate, the audience size exposed to the change, the average order value of that audience, and the time window over which the lift accrues.

It is the quantitative input to almost every prioritization framework — ICE, PIE, PXL — and it is what separates a test queue ranked by gut from one ranked by expected euros. A well-built estimate is conservative on effect size, honest about audience reach, and explicit about the runtime assumption baked in.

Also known as

Expected lift

Revenue forecast

Test impact modeling

Most CRO programs lose money not on losing tests but on winning tests that were never worth running. A 4% lift on a page that gets 800 sessions a month is a rounding error; the same lift on checkout pays for the entire tool stack. Impact estimation forces that distinction before a developer touches the variant.

It is also the bridge to finance. When you tell a Head of E-commerce that a checkout test is queued, the next question is always "how much?" — and a defensible estimate (with assumptions written down) is what keeps the experimentation roadmap funded. It feeds directly into experiment prioritization as the I or the Impact term in whichever scoring model your team uses.

Formula

Estimated Lift (€) = Effect Size × Baseline Conversion Rate × Audience Size × AOV × Runtime

Variables

Variable	Meaning	Description
Effect Size	Relative lift	Expected % uplift in conversion rate from the winning variant (e.g. 0.05 for a 5% lift)
Baseline Conversion Rate	Current CVR	Conversion rate of the page or flow being tested, as a decimal
Audience Size	Eligible visitors	Sessions per period that will encounter the change once rolled out
AOV	Average order value	Mean revenue per converting session for the audience segment
Runtime	Forecast window	Number of periods you are projecting over (e.g. 12 for annualized)
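As a rough illustration, the formula can be written as a single function. This is a minimal sketch, not a reference implementation: the function name `estimated_lift`, its parameter names, and the input checks are assumptions made for the example, and the figures in the call are hypothetical.

```python
def estimated_lift(
    effect_size: float,
    baseline_cvr: float,
    audience_size: float,
    aov: float,
    runtime: float,
) -> float:
    """Winning-case revenue lift: the product of the five formula terms."""
    if not 0.0 <= baseline_cvr <= 1.0:
        raise ValueError("baseline_cvr must be a decimal between 0 and 1")
    if audience_size < 0 or aov < 0 or runtime < 0:
        raise ValueError("audience_size, aov and runtime must be non-negative")
    return effect_size * baseline_cvr * audience_size * aov * runtime


# Hypothetical inputs, chosen only to show the call shape.
lift = estimated_lift(
    effect_size=0.05,     # 5% relative lift, as a decimal
    baseline_cvr=0.03,    # 3% baseline conversion rate
    audience_size=50_000, # sessions per month
    aov=60.0,             # euros per order
    runtime=12,           # months
)
print(f"Estimated lift: €{lift:,.0f}")
```

Audience Size and Runtime must use the same period: sessions per month pair with a runtime counted in months.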

Worked example

An apparel Shopify store estimates the annual lift from a product-page test on dresses.

Effect Size: 0.06 (6% relative lift)

Baseline Conversion Rate: 0.022 (2.2%)

Audience Size: 120,000 sessions/month

AOV: €78

Runtime: 12 months

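→ €148,262 annualized estimated lift

A 6% relative lift on a 2.2% baseline raises conversion to 2.33%, adding ~158 orders per month at €78 AOV, or about €12,355 per month. Over 12 months that's just under €150k in the winning case, before any hit-rate or audience-reach discount. For scale, the same test on a tenth of the traffic (12,000 sessions/month) would yield just under €15k: enough to justify a one-week test, not enough to justify a six-week build.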
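The arithmetic of this example can be reproduced step by step, which makes it easy to check the order of magnitude of the annual figure against the monthly one. The sketch below is a plain walk-through using only the inputs listed above; the variable names are illustrative.

```python
# Inputs from the worked example.
effect_size = 0.06            # 6% relative lift
baseline_cvr = 0.022          # 2.2% baseline conversion rate
sessions_per_month = 120_000
aov_eur = 78.0
months = 12

# Conversion rate after the lift: 2.2% * 1.06
new_cvr = baseline_cvr * (1 + effect_size)

# Extra orders come only from the added conversion rate.
extra_orders_per_month = sessions_per_month * baseline_cvr * effect_size

# Convert extra orders into revenue, then extend over the forecast window.
extra_revenue_per_month = extra_orders_per_month * aov_eur
annual_lift = extra_revenue_per_month * months

print(f"New conversion rate:  {new_cvr:.2%}")
print(f"Extra orders / month: {extra_orders_per_month:.1f}")
print(f"Extra revenue / month: €{extra_revenue_per_month:,.2f}")
print(f"Annualized lift:      €{annual_lift:,.2f}")
```

The intermediate values are about 158.4 extra orders and €12,355 per month, which annualizes to roughly €148,262.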

Pick effect sizes from a defensible distribution, not optimism. The honest range for most page-level changes on an established store is 2-8% relative lift; structural changes to checkout or PDP can reach 10-15% but those are the exception, not the planning assumption. Use the table below as a sanity check before you commit to a forecast.

Benchmark

Typical relative conversion lift by test type (winning variants only)

Test type	Median lift	Top-quartile lift	Hit rate
Headline / copy on PDP	2-4%	6-8%	~25%
Pricing display & urgency	3-6%	9-12%	~30%
Product-page layout / imagery	4-7%	10-14%	~22%
Cart & checkout UX	5-9%	12-18%	~35%
Navigation & category pages	2-5%	7-10%	~20%
Add-to-cart button & PDP CTA	3-5%	8-11%	~28%

Two adjustments separate amateur estimates from credible ones. First, multiply by the historical hit rate of similar tests on your store — if only 30% of checkout tests win, the expected value is 0.3 × the winning-case lift. Second, discount audience size by the fraction of traffic that actually sees the change (mobile-only tests, geo-targeted tests, returning-visitor tests all need this cut applied).
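The two adjustments can be sketched as a small helper. This is a minimal illustration, assuming the winning-case lift was computed for the full audience. Because the formula is linear in audience size, scaling the final figure by the reach fraction is equivalent to discounting the audience first. `expected_value` and `reach_fraction` are hypothetical names, the €100,000 winning-case figure and the 0.6 reach are invented for the example, and the 0.30 hit rate is the one used in the paragraph above.

```python
def expected_value(
    winning_case_lift: float,
    hit_rate: float,
    reach_fraction: float = 1.0,
) -> float:
    """Discount a winning-case lift by hit rate and by audience reach."""
    for name, value in (("hit_rate", hit_rate), ("reach_fraction", reach_fraction)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    return winning_case_lift * hit_rate * reach_fraction


# Hypothetical: €100,000 winning-case lift, 30% of similar tests win,
# and only 60% of traffic would actually see the change.
planning_figure = expected_value(100_000.0, hit_rate=0.30, reach_fraction=0.6)
print(f"Expected value: €{planning_figure:,.0f}")
```

Keeping the winning-case lift and the discounted figure as separate numbers makes it possible to show stakeholders the upside while planning revenue against the expected value.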

Frequently asked questions

What's the difference between impact estimation and experiment prioritization?

Prioritization is the ranking decision; impact estimation is one of the inputs that feeds it. Frameworks like ICE and PIE multiply impact by confidence and ease (or similar) to produce a score. Without a credible impact number, the score is just opinion.

How do I pick an effect size without prior test data?

Use a conservative anchor from public CRO benchmarks for that test type, usually 3-5% relative lift for a well-designed change on an established store. Then halve it for your planning estimate. You can revise up once you have your own historical win rates.

Should I estimate using winning lift or expected value?

Both. Show stakeholders the winning-case lift (the upside) and the expected value adjusted for hit rate (winning lift × probability of winning). The first sells the test; the second is what you actually plan revenue against.

How does AOV factor in if my test only affects conversion rate, not basket size?

AOV converts the extra orders into euros. If the test changes basket size too (e.g. a cross-sell), model that as a separate AOV uplift term. Don't compound it into the conversion-rate effect size, or you'll double-count.

What runtime should I use in the forecast?

Use 12 months for annualized impact when discussing ROI with finance, and use the realistic shelf-life of the change for prioritization (some seasonal tests only matter for 8 weeks). Always state the window explicitly in the estimate.

How do I handle tests on small audiences like checkout?

Lower audience but higher baseline conversion rate and higher AOV usually make checkout tests the highest-impact slot in the queue. Run the formula end-to-end rather than discounting by traffic alone; the math often surprises teams.

Should I include LTV instead of AOV?

For acquisition-stage tests, yes: a conversion lift on a first-time buyer is worth their full LTV, not one order. For repeat-purchase or upsell tests, AOV is the right unit. Don't mix the two in the same estimate.

How conservative should I be with the effect size?

Aim for an estimate you'd be willing to defend after the test runs. If your post-test reality consistently lands below your forecasts, you're being optimistic; if it lands above, your queue is probably under-prioritizing high-effort tests.

Does impact estimation apply to qualitative tests like new copy or imagery?

Yes. The formula doesn't care about the change type, only the levers. Copy and imagery tests typically have lower effect sizes (2-4%) and lower hit rates, which is exactly why estimating them up-front prevents the queue from filling up with low-impact creative swaps.

How accurate are these estimates in practice?

A disciplined team lands within ±30% of the forecast on most tests, with checkout and pricing tests being the most predictable and copy tests the noisiest. The point is not perfect accuracy; it's having a consistent yardstick to rank a queue of 40 ideas down to the 5 that actually run.
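One way to check forecasts against the ±30% band discussed above is to log each forecast next to the measured result and compute the relative error. The sketch below is only an illustration: the helper names are hypothetical and the `history` figures are invented, not real test results.

```python
def relative_error(forecast: float, actual: float) -> float:
    """Signed error of the measured result relative to the forecast."""
    if forecast == 0:
        raise ValueError("forecast must be non-zero")
    return (actual - forecast) / forecast


def share_within_band(pairs, band: float = 0.30) -> float:
    """Fraction of tests whose result landed within ±band of the forecast."""
    errors = [relative_error(f, a) for f, a in pairs]
    return sum(1 for e in errors if abs(e) <= band) / len(errors)


def mean_bias(pairs) -> float:
    """Average signed error; negative means forecasts ran optimistic."""
    errors = [relative_error(f, a) for f, a in pairs]
    return sum(errors) / len(errors)


# Invented (forecast, actual) pairs in euros, for illustration only.
history = [(10_000, 8_500), (20_000, 27_000), (5_000, 4_800), (12_000, 6_000)]

print(f"Within ±30%: {share_within_band(history):.0%}")
print(f"Mean bias:   {mean_bias(history):+.1%}")
```

A consistently negative mean bias suggests effect sizes are being picked optimistically; a consistently positive one suggests the estimates are too cautious.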