How to use Device Analysis

Metricuno
May 18, 2026
Quick answer

Splitting test results by desktop, mobile, and tablet is often the most decision-changing cut you can make. Here's how to do it without fooling yourself.

Definition
Experimentation

Device Analysis

Segmenting A/B test results by device type — desktop, mobile, tablet — to surface effects that the pooled number hides.

Device analysis is the practice of slicing experiment results by the device the visitor used: desktop, mobile, and tablet at minimum, often broken further by OS or viewport. It exists because these populations behave like different products. A mobile shopper on iOS browsing during a commute and a desktop shopper researching at home have different intent, different ergonomics, and very different baseline conversion rates.

A single pooled lift number averages across all of them, so a result can look flat overall while hiding, for example, a +12% win on mobile and a -8% regression on desktop. Device analysis is the cheapest, highest-leverage segmentation you can apply to an experiment, and on most DTC sites it is the split that changes the ship/kill decision more often than any other.

Also known as
Device segmentation
Per-device test analysis

Most stores already know mobile and desktop convert differently. The trap is treating that as a baseline fact and then ignoring it when reading test results. If your overall sample is 70% mobile and the variant helps desktop, the desktop signal gets drowned out by the larger mobile cohort.
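
To make the pooling effect concrete, here is a minimal sketch of how a flat pooled number can hide opposite per-device effects. The visitor and conversion counts are invented for illustration (they are not benchmark data), and `lift_by_cohort` and `pooled_lift` are hypothetical helper names.

```python
# Hypothetical per-arm counts, chosen so the pooled result comes out flat.
COHORTS = {
    "mobile":  {"visitors": 40_000, "control_conv": 800,   "variant_conv": 896},
    "desktop": {"visitors": 30_000, "control_conv": 1_200, "variant_conv": 1_104},
}


def relative_lift(control_conv, variant_conv, visitors):
    """Relative lift of variant over control, assuming equal visitors per arm."""
    control_rate = control_conv / visitors
    variant_rate = variant_conv / visitors
    return variant_rate / control_rate - 1.0


def lift_by_cohort(cohorts):
    """Read each device cohort as its own experiment."""
    return {
        name: relative_lift(c["control_conv"], c["variant_conv"], c["visitors"])
        for name, c in cohorts.items()
    }


def pooled_lift(cohorts):
    """Lift computed on the summed counts, which is what a pooled report shows."""
    visitors = sum(c["visitors"] for c in cohorts.values())
    control = sum(c["control_conv"] for c in cohorts.values())
    variant = sum(c["variant_conv"] for c in cohorts.values())
    return relative_lift(control, variant, visitors)


per_device = lift_by_cohort(COHORTS)
overall = pooled_lift(COHORTS)
for name, lift in per_device.items():
    print(f"{name}: {lift:+.1%}")
print(f"pooled: {overall:+.1%}")
```

With these invented counts, mobile is up 12%, desktop is down 8%, and the pooled lift is zero, because the two cohorts contribute offsetting conversion counts.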

Device analysis fixes that by reading each cohort as its own experiment. It's a routine step inside broader experiment analysis — usually the first segmentation cut, before audience or traffic-source splits — because device is one of the few attributes that's reliable, ubiquitous, and clearly tied to user experience.

Why device is the most meaningful split

Three things diverge between devices: intent, ergonomics, and baseline conversion. Mobile traffic skews toward discovery and lower-commitment sessions; desktop skews toward considered purchases and higher AOV. A change that helps one mode of shopping can quietly hurt the other.

Ergonomics matter more than people credit. A sticky add-to-cart bar that feels redundant on desktop is often the single biggest mobile conversion lever. A multi-column layout that improves desktop scannability collapses into a slow, scroll-heavy mess on a 390px viewport.

Baseline conversion rates also differ by 1.5-3x between desktop and mobile on most apparel and beauty stores. That gap means the same percentage lift is a different absolute change in conversions per visitor on each device, so a pooled percentage hides how much each cohort actually moved. The lower mobile baseline also makes mobile lift estimates noisier at a given sample size, so you need to read significance per device, not pooled.

Rule of thumb

If one device accounts for more than about 60% of your traffic, the pooled result is mostly telling you about that majority cohort. Always look at the minority device before shipping: that's where the regressions hide.

How to run a clean device analysis

Pre-register the device split. Decide before you launch the test that you will read results on desktop and mobile separately, with significance evaluated per cohort. This avoids the post-hoc fishing problem where you only look at devices when the overall number disappoints.
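
Evaluating significance per cohort could look like the sketch below. It assumes a pooled two-proportion z-test, which is one common choice and not necessarily what your test platform uses; the counts and the `PRE_REGISTERED` mapping are hypothetical.

```python
import math
from statistics import NormalDist


def two_proportion_p_value(conv_control, n_control, conv_variant, n_variant):
    """Two-sided p-value from a pooled two-proportion z-test."""
    rate_control = conv_control / n_control
    rate_variant = conv_variant / n_variant
    pooled_rate = (conv_control + conv_variant) / (n_control + n_variant)
    std_err = math.sqrt(
        pooled_rate * (1 - pooled_rate) * (1 / n_control + 1 / n_variant)
    )
    z = (rate_variant - rate_control) / std_err
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Cohorts fixed before launch. Each tuple is
# (control conversions, control visitors, variant conversions, variant visitors).
PRE_REGISTERED = {
    "mobile":  (800, 40_000, 896, 40_000),
    "desktop": (1_200, 30_000, 1_104, 30_000),
}
ALPHA = 0.05  # threshold applied to each pre-registered cohort

p_values = {
    name: two_proportion_p_value(*counts)
    for name, counts in PRE_REGISTERED.items()
}

# The pooled read sums the same counts across cohorts.
pooled_counts = [sum(column) for column in zip(*PRE_REGISTERED.values())]
p_pooled = two_proportion_p_value(*pooled_counts)

for name, p in p_values.items():
    print(f"{name}: p={p:.4f} significant={p < ALPHA}")
print(f"pooled: p={p_pooled:.4f}")
```

In this made-up data both cohorts clear the threshold in opposite directions while the pooled test sees no difference at all, which is the case the pre-registered split is meant to catch.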

Power each cohort independently. If desktop is 30% of your traffic, a test sized for the pooled audience will be under-powered on desktop. Either run longer, accept that desktop is directional-only, or limit the test to a single device when the hypothesis is clearly device-specific.
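
A rough way to size this is sketched below, assuming the standard normal-approximation formula for a two-sided two-proportion test. The baseline rate, target lift, and daily traffic are hypothetical inputs, and `visitors_per_arm` and `days_to_power` are illustrative names rather than functions from any particular tool.

```python
import math
from statistics import NormalDist


def visitors_per_arm(baseline_rate, relative_lift, alpha=0.05, power=0.80):
    """Approximate visitors needed per arm for a two-sided two-proportion test."""
    p_control = baseline_rate
    p_variant = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_variant * (1 - p_variant)
    return math.ceil(
        (z_alpha + z_power) ** 2 * variance / (p_variant - p_control) ** 2
    )


def days_to_power(n_per_arm, daily_visitors, cohort_share, arms=2):
    """Days until one cohort collects n_per_arm visitors in every arm."""
    per_arm_per_day = daily_visitors * cohort_share / arms
    return math.ceil(n_per_arm / per_arm_per_day)


# Hypothetical inputs: 4% baseline, +10% relative lift, 10,000 visitors a day.
needed = visitors_per_arm(baseline_rate=0.04, relative_lift=0.10)
days_pooled = days_to_power(needed, daily_visitors=10_000, cohort_share=1.0)
days_desktop = days_to_power(needed, daily_visitors=10_000, cohort_share=0.30)
print(needed, days_pooled, days_desktop)
```

With the same baseline and target lift, a cohort that is 30% of traffic needs about 1 / 0.30, or roughly 3.3 times, the runtime of the pooled read. In practice each device has its own baseline, so it is worth running the calculation once per cohort.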

Chart

Typical conversion-rate gap by device (apparel & beauty stores)

[Chart axes: conversion rate (0% to 4%) by device (desktop, tablet, mobile)]

Treat tablet carefully. Tablet traffic is usually 2-5% of sessions, which means it almost never reaches significance on its own. Most teams either fold tablet into desktop (if tablets are served the desktop layout, for example with a breakpoint at 768px) or report it as a watch-only cohort and don't gate decisions on it.
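
One way to encode the tablet rule before analysis is sketched below. The device labels, the `assign_cohort` name, and the 768px default are assumptions for illustration; use whatever labels and breakpoint your analytics and layout actually have.

```python
def assign_cohort(device_type, viewport_width,
                  fold_tablet_into_desktop=True, desktop_min_width=768):
    """Return the analysis cohort for one session.

    device_type is assumed to be 'desktop', 'mobile', or 'tablet'.
    Tablets join the desktop cohort only when folding is enabled and the
    viewport is wide enough to get the desktop layout; otherwise they are
    reported separately and not used to gate decisions.
    """
    if device_type != "tablet":
        return device_type
    if fold_tablet_into_desktop and viewport_width >= desktop_min_width:
        return "desktop"
    return "tablet_watch_only"


print(assign_cohort("tablet", 820))
print(assign_cohort("tablet", 820, fold_tablet_into_desktop=False))
print(assign_cohort("mobile", 390))
```

Fixing this mapping before launch keeps the tablet decision part of the pre-registered plan instead of something chosen after seeing the results.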

What to expect from the data

Around a third of tests show a meaningful device interaction — meaning the lift on one device differs from the other by more than measurement noise. That's high enough that you should expect it, not be surprised by it. Checkout, PDP, and navigation tests are the most likely to split; pricing and copy tests are the least likely.
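
To check whether two device lifts differ by more than noise, one option is a z-test on the difference of log conversion-rate ratios. This is a sketch under that assumption (other interaction tests exist), with hypothetical counts and helper names.

```python
import math
from statistics import NormalDist


def log_lift(conv_control, n_control, conv_variant, n_variant):
    """Log of the conversion-rate ratio and its approximate variance."""
    ratio = (conv_variant / n_variant) / (conv_control / n_control)
    variance = (1 / conv_variant - 1 / n_variant
                + 1 / conv_control - 1 / n_control)
    return math.log(ratio), variance


def interaction_p_value(cohort_a, cohort_b):
    """Two-sided p-value for 'the relative lift differs between cohorts'."""
    log_a, var_a = log_lift(*cohort_a)
    log_b, var_b = log_lift(*cohort_b)
    z = (log_a - log_b) / math.sqrt(var_a + var_b)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Hypothetical counts:
# (control conversions, control visitors, variant conversions, variant visitors).
mobile = (800, 40_000, 896, 40_000)
desktop = (1_200, 30_000, 1_104, 30_000)

p_interaction = interaction_p_value(mobile, desktop)
print(f"device interaction p-value: {p_interaction:.4f}")
```

A small p-value here says the two devices responded differently, which is a separate question from whether either device's lift is significant on its own.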

Below are typical patterns we see across DTC categories. Use this as a sanity check, not a substitute for your own data — your traffic mix and price band will shift the numbers.

Benchmark

Common device-interaction patterns by test type

Test type | Mobile lift tendency | Desktop lift tendency | Device split likely?
Sticky add-to-cart | +4% to +12% | 0% to +1% | Yes
Image gallery / zoom | +1% to +5% | +3% to +8% | Moderate
Checkout field reduction | +2% to +6% | +1% to +3% | Yes
Free-shipping threshold copy | +1% to +4% | +1% to +4% | Rarely
Sticky nav / mega-menu | 0% to +2% | +2% to +5% | Yes
Trust badges on PDP | 0% to +2% | 0% to +2% | Rarely

The table reflects a simple truth: layout and ergonomic changes split by device, semantic changes (copy, pricing logic, trust) usually don't. If your hypothesis is about screen real estate or thumb reach, expect a device interaction. If it's about persuasion, expect a uniform effect.

Pitfalls that flip your decision

The biggest pitfall is multiple-comparisons inflation. If you check desktop, mobile, tablet, iOS, Android, and small-mobile-vs-large-mobile, you're running six tests, not one. At a 95% confidence threshold each check has a 5% false-positive rate, so across six independent checks there is roughly a one-in-four chance (1 - 0.95^6 ≈ 26%) that at least one looks significant even when nothing is happening.
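
The size of the inflation can be worked out directly. The sketch below assumes the checks are independent, which real device cuts are not (they overlap), so treat the numbers as an approximation; the Bonferroni threshold is shown as one simple correction, not the only one.

```python
def family_wise_error_rate(alpha, checks):
    """Chance of at least one false positive across independent checks."""
    return 1 - (1 - alpha) ** checks


def expected_false_positives(alpha, checks):
    """Average number of false positives when no check has a real effect."""
    return alpha * checks


def bonferroni_threshold(alpha, checks):
    """Per-check threshold that keeps the family-wise rate at or below alpha."""
    return alpha / checks


ALPHA = 0.05
for checks in (1, 2, 3, 6):
    print(
        checks,
        round(family_wise_error_rate(ALPHA, checks), 3),
        round(expected_false_positives(ALPHA, checks), 2),
        round(bonferroni_threshold(ALPHA, checks), 4),
    )
```

Two or three pre-registered cohorts keep the family-wise rate near 10-14% without any correction, while six cuts push it past 26%.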

Limit yourself to the splits you pre-registered, and treat any extra cut as exploratory: a hypothesis for the next test, not a ship decision. Also watch for device misclassification: large tablets reporting as desktop, or in-app browsers reporting as mobile web, can put 2-5% of traffic in the wrong cohort.

Don't ship the average

If a test is flat overall but +8% on mobile and -6% on desktop, the answer is rarely 'ship to everyone.' Either ship per-device (variant on mobile, control on desktop) or treat it as a learning and design a desktop-specific follow-up.
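
As a sketch, the per-device ship rule could be written down like this. The `CohortResult` type, the `ship_plan` function, and the exact rule are illustrative assumptions; a real decision should also weigh effect size and the cost of maintaining two experiences.

```python
from dataclasses import dataclass


@dataclass
class CohortResult:
    lift: float        # relative lift, e.g. 0.08 for +8%
    significant: bool  # passed the pre-registered per-cohort threshold?


def ship_plan(results):
    """Map each device to 'variant' or 'control' (illustrative rule only)."""
    winners = {d for d, r in results.items() if r.significant and r.lift > 0}
    losers = {d for d, r in results.items() if r.significant and r.lift < 0}
    if not winners:
        # Nothing to capture: keep control everywhere.
        return {d: "control" for d in results}
    if not losers:
        # A win with no regression: neutral cohorts can take the variant too.
        return {d: "variant" for d in results}
    # Mixed result: ship the variant only where it won.
    return {d: ("variant" if d in winners else "control") for d in results}


mixed = ship_plan({
    "mobile": CohortResult(lift=0.08, significant=True),
    "desktop": CohortResult(lift=-0.06, significant=True),
})
print(mixed)
```

In the mixed case the plan ships the variant on mobile and keeps control on desktop, which leaves desktop open for a device-specific follow-up test.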

Frequently asked questions

When should I run device analysis?

Always, as the first segmentation cut. Device is reliable, ubiquitous, and tied to user experience. Pre-register the split before launch so the analysis isn't post-hoc fishing.

Should I run a separate test for each device?

Not usually. One test with a pre-registered device split is fine for most cases. Run separate tests when the hypothesis is clearly device-specific (e.g. a sticky mobile bar) or when one device is so under-powered that pooling wastes runtime.

How should I handle tablet traffic?

Most stores either fold tablet into desktop (if tablets are served the desktop layout) or report it as watch-only and don't gate decisions on it. Tablet rarely reaches significance on its own because it's 2-5% of sessions.

What should I do if the variant wins on mobile?

Look at the desktop number. If desktop is meaningfully negative, ship per-device or kill the variant. If desktop is flat, you can usually ship to everyone: the mobile win carries the result and desktop is neutral.

How many device cohorts should I analyze?

Stick to 2-3 pre-registered cohorts (typically desktop, mobile, and optionally tablet). Going deeper into iOS vs Android or browser splits inflates the false-positive rate and rarely changes the decision.

Why does mobile convert lower than desktop?

Mobile traffic skews toward discovery and lower-commitment sessions, ergonomics are harder (small screens, thumb reach, slower networks), and intent is mixed with research and social browsing. A 1.5-3x desktop-to-mobile gap is typical for apparel and beauty stores.

Should I weight the per-device results by traffic share to get one number?

Your test platform already does this: the pooled result is a traffic-weighted average by construction. The problem is that the weighting hides per-device divergence, which is exactly what device analysis is for.

Is device the only segmentation I need?

No. It's the first cut, not the only one. New vs returning, traffic source, and logged-in status are the other splits worth pre-registering. Device just tends to be the most decision-changing one for DTC stores.

How long should I run the test?

Long enough that the smallest device cohort you care about reaches your required sample size. If desktop is 30% of traffic, plan for roughly 3x the runtime of a pooled-only analysis (1 / 0.30 is about 3.3, assuming a similar baseline and target lift) to power desktop independently.

Can I ship the variant to only one device?

Yes, and it's often the right call. Most experimentation tools and CMS platforms support device-targeted deployment. Per-device shipping is how you capture wins without inheriting the regression on the other cohort.
