Opportunity Scoring

Metricuno
May 18, 2026
Quick answer

Opportunity Scoring ranks customer needs by importance and satisfaction to find under-served jobs worth testing — a research input, not a test-ranking framework.

Definition
A research method that ranks customer needs by importance and satisfaction to surface under-served jobs worth testing.

Opportunity Scoring is a customer-research technique popularised by Tony Ulwick (Outcome-Driven Innovation) and echoed in Sean Ellis's growth playbook. You survey customers on two dimensions for each job-to-be-done or product attribute: how important it is, and how satisfied they are with the current experience. A simple formula combines both into an opportunity score that highlights gaps — high importance, low satisfaction — where investment is likely to move the needle.

It is a research input, not a backlog ranking system. The score tells you WHAT to work on (which problem area is under-served); frameworks like ICE, PIE or RICE then rank the test ideas that address it.

Also known as
Outcome-Driven Opportunity Score
Importance-Satisfaction Gap Analysis
ODI Opportunity Score

The method assumes that needs customers rate as highly important but poorly satisfied are the ones competitors have failed to solve. That gap is where new revenue tends to live, and where a CRO test is most likely to produce a real lift rather than a flat result.

On an e-commerce site the method is easy to apply. You might survey buyers on jobs like "choose the right size", "trust the brand before paying", or "track the order after checkout". An apparel store whose buyers rate size confidence as highly important but poorly satisfied has a clear opportunity area (fit guides, model height callouts, return-policy reassurance) long before anyone writes a single test card.

Formula

Opportunity Score = Importance + max(Importance - Satisfaction, 0)

Variables

Importance (importance rating): customer rating (typically 1-10) of how important the outcome or job is.

Satisfaction (satisfaction rating): customer rating (1-10) of how well the current solution delivers on that outcome.

Worked example

A Shopify apparel store surveys 400 recent buyers about the job "choose the right size on the first try". Average Importance = 9.2, average Satisfaction = 5.4.

Importance: 9.2

Satisfaction: 5.4

9.2 + max(9.2 - 5.4, 0) = 13.0
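The arithmetic above can be reproduced in a few lines of Python. This is a minimal sketch of the formula as stated; the function name `opportunity_score` is an illustrative choice, not part of any library.

```python
def opportunity_score(importance: float, satisfaction: float) -> float:
    """Opportunity Score = Importance + max(Importance - Satisfaction, 0)."""
    gap = max(importance - satisfaction, 0.0)
    return importance + gap


# Average ratings from the worked example (400 surveyed buyers).
score = opportunity_score(importance=9.2, satisfaction=5.4)
print(round(score, 1))  # 13.0
```

Rounding to one decimal matches the precision of the survey averages and hides floating-point noise.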

A score of 13.0 falls in the under-served band (12 or above is generally considered actionable; 15 or above is urgent). Size-confidence work (fit finders, customer-photo galleries, free-return banners) is a strong candidate for the test backlog.

Two practical notes on the math. When satisfaction lags importance, the score works out to 2 × Importance - Satisfaction, so importance counts twice and even a moderately important need can score high if customers really dislike the status quo. And the max() clamp keeps the gap term from going negative for over-served needs (where satisfaction exceeds importance): they simply score equal to importance, signalling "nothing to fix here".
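To make the two notes concrete, the sketch below compares the formula with a hypothetical variant that drops the max() clamp. The unclamped function is not part of the method and appears only for contrast; the ratings are made up for illustration.

```python
def clamped_score(importance: float, satisfaction: float) -> float:
    # The formula as given: the gap term is floored at zero.
    return importance + max(importance - satisfaction, 0.0)


def unclamped_score(importance: float, satisfaction: float) -> float:
    # Hypothetical variant without max(), shown only for comparison.
    return importance + (importance - satisfaction)


# Under-served need: satisfaction lags, so both versions equal 2*I - S.
print(clamped_score(7.0, 2.0), unclamped_score(7.0, 2.0))  # 12.0 12.0

# Over-served need: satisfaction exceeds importance, so only the clamped
# version stays at the importance rating.
print(clamped_score(6.0, 9.0), unclamped_score(6.0, 9.0))  # 6.0 3.0
```

Without the clamp, the over-served need would be pulled below its own importance rating; with it, the score bottoms out at importance.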

Benchmark

Typical Opportunity Score bands and what to do with them

Score range | Read | Recommended action
≥ 15 | Severely under-served | Top of backlog: design 2-3 tests targeting this job within the quarter.
12 - 15 | Under-served | Strong test candidates: write hypotheses, prioritise with ICE/RICE.
10 - 12 | Moderate gap | Worth a lightweight test or qualitative follow-up before committing dev time.
7 - 10 | Appropriately served | Defend, don't disrupt: risk of negative tests is higher here.
< 7 | Over-served | Candidate for simplification or feature removal, not new investment.
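The bands can be turned into a small lookup. This is a sketch, and it assumes that a score sitting exactly on a shared boundary (12, 10 or 7) belongs to the higher band, which the table does not spell out. The name `score_band` is illustrative.

```python
def score_band(score: float) -> str:
    """Map an Opportunity Score to the band names in the benchmark table.

    Assumption: each band includes its lower bound, so 12.0 counts as
    "Under-served" and 10.0 as "Moderate gap".
    """
    if score >= 15:
        return "Severely under-served"
    if score >= 12:
        return "Under-served"
    if score >= 10:
        return "Moderate gap"
    if score >= 7:
        return "Appropriately served"
    return "Over-served"


print(score_band(13.0))  # Under-served
```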

Opportunity Scoring pairs naturally with Experiment Prioritization. Run the survey once a quarter to refresh your map of under-served jobs, then feed the top 3-5 opportunity areas into your ICE or RICE scoring as the "impact" anchor. Tests sourced this way tend to win more often because they target real, measured customer pain rather than internal hunches.
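Picking the top opportunity areas from a survey is a simple sort. The sketch below uses the three example jobs mentioned earlier; only the size ratings come from the worked example, while the other two pairs are made-up placeholders. The function names are illustrative.

```python
# Average (importance, satisfaction) per job. The first pair is from the
# worked example; the other two are hypothetical values for illustration.
survey = {
    "choose the right size": (9.2, 5.4),
    "trust the brand before paying": (8.5, 7.0),
    "track the order after checkout": (7.0, 8.0),
}


def opportunity_score(importance, satisfaction):
    return importance + max(importance - satisfaction, 0.0)


def top_opportunities(ratings, n=3):
    """Return the n highest-scoring jobs as (job, score) pairs."""
    scored = [
        (job, round(opportunity_score(imp, sat), 1))
        for job, (imp, sat) in ratings.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:n]


for job, score in top_opportunities(survey, n=2):
    print(f"{score:>5}  {job}")
```

The resulting shortlist is what would be handed to ICE or RICE scoring; how a score translates into an "impact" rating is left to the team.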

Opportunity Scoring FAQ

ICE and RICE rank test ideas you've already written. Opportunity Scoring tells you which customer problems deserve test ideas in the first place. It sits one step earlier in the workflow — research, not backlog grooming.

150-400 responses per segment is the usual sweet spot. Below 100 the averages get noisy; above 500 you stop learning much. If you're segmenting by behaviour (e.g. first-time vs returning buyers), aim for at least 150 per segment.
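One way to see why small samples give noisy averages is the standard error of the mean, which shrinks with the square root of the sample size. The sketch below assumes individual ratings have a standard deviation of 2.0 points on the 1-10 scale; that figure is an assumption for illustration, not a number from any survey.

```python
import math


def standard_error(std_dev: float, n: int) -> float:
    """Standard error of a mean rating from n independent responses."""
    return std_dev / math.sqrt(n)


ASSUMED_STD_DEV = 2.0  # assumption: spread of individual 1-10 ratings

for n in (50, 100, 150, 400, 500):
    se = standard_error(ASSUMED_STD_DEV, n)
    # Roughly 95% of sample means land within about 2 standard errors.
    print(f"n={n:>3}  standard error={se:.2f}  approx. margin=±{2 * se:.2f}")
```

Under this assumption, quadrupling the sample from 100 to 400 only halves the error, which is the diminishing return behind the upper end of the range.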

Survey recent buyers for outcome ratings (they have ground truth on whether the job got done) and browsers or cart-abandoners for friction signals. Mixing both segments often surfaces the largest gaps because abandoners rate satisfaction much lower on the same jobs.

A 1-10 scale is the classic Ulwick format and what the formula was calibrated for. A 1-5 scale works but compresses the score range and makes bands harder to read. Avoid mixing scales across questions in the same survey.
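The compression is easy to quantify by computing the lowest and highest score the formula can produce on each scale. This is a sketch; `score_range` is an illustrative helper name.

```python
def opportunity_score(importance, satisfaction):
    return importance + max(importance - satisfaction, 0)


def score_range(scale_min: int, scale_max: int) -> tuple:
    """Lowest and highest score the formula can produce on a rating scale."""
    lowest = opportunity_score(scale_min, scale_max)   # least important, fully satisfied
    highest = opportunity_score(scale_max, scale_min)  # most important, least satisfied
    return lowest, highest


print(score_range(1, 10))  # (1, 19)
print(score_range(1, 5))   # (1, 9)
```

On a 1-5 scale the ceiling is 9, so the thresholds of 12 and 15 can never be reached and the bands would need to be redefined.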

Without a survey, you can approximate the scores from support tickets, review mining, and session replays: code each piece of feedback by job, then tally frequency (a proxy for importance) and sentiment (a proxy for satisfaction). It's directional, not statistical, but useful when survey infrastructure isn't in place.
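The tallying step might look like the sketch below. The feedback entries, the -1/0/+1 sentiment coding, and the function name are all hypothetical choices made for illustration.

```python
from collections import defaultdict

# Hypothetical hand-coded feedback: (job, sentiment), where sentiment is
# -1 (negative), 0 (neutral) or +1 (positive). All entries are made up.
coded_feedback = [
    ("choose the right size", -1),
    ("choose the right size", -1),
    ("choose the right size", 0),
    ("track the order after checkout", 1),
    ("track the order after checkout", -1),
    ("trust the brand before paying", 1),
]


def tally_by_job(feedback):
    """Return {job: (mention_count, mean_sentiment)}."""
    sentiments = defaultdict(list)
    for job, sentiment in feedback:
        sentiments[job].append(sentiment)
    return {job: (len(vals), sum(vals) / len(vals)) for job, vals in sentiments.items()}


# Most-mentioned jobs first: frequency stands in for importance,
# mean sentiment for satisfaction.
ranked = sorted(tally_by_job(coded_feedback).items(), key=lambda kv: -kv[1][0])
for job, (count, mean) in ranked:
    print(f"{job}: mentions={count}, mean sentiment={mean:+.2f}")
```

These proxies are not on the 1-10 rating scale, so they should be read as a ranking of where to look, not plugged into the formula or the score bands.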

Re-running the survey quarterly is typical. Site changes, seasonal shifts, and competitive moves can change which jobs are under-served. A stale opportunity map quietly biases your backlog toward problems you already solved last quarter.

Scores at or above 12 are generally worth testing; 15+ is urgent. Below 10 the expected lift from a test is small enough that you're better off defending the experience or simplifying it rather than redesigning.

The method also works for visitors who have not bought yet: frame the jobs around the visit instead of the purchase, such as "quickly understand what the brand sells", "see if my size is in stock", or "feel confident this isn't a scam". The same importance-versus-satisfaction logic applies.

Use the top-scoring opportunities as the "impact" or "reach" input when scoring test ideas with ICE or RICE. Tests anchored to a high-opportunity job tend to win more often because the underlying need has already been measured.

You can score both features and jobs-to-be-done, but jobs give cleaner results. Feature-level scoring tends to reflect what customers know exists, while job-level scoring captures unmet needs they can't articulate as a feature request. Start with jobs, then drill into features within the top opportunity areas.
