Holdout Group Size Calculator

Determine the optimal holdout group size for statistically significant campaign measurement.

Holdout Size: 53,209 (53.2%)
Treatment Size: 46,791
Statistical Power: 80%

Why Holdout Group Sizing Matters

Holdout testing is the gold standard for measuring the causal impact of marketing campaigns. Unlike attribution-based measurement, holdout tests create a true controlled experiment by randomly withholding a portion of your audience from receiving a campaign. The difference in outcomes between the treatment and holdout groups represents the incremental effect of your marketing.
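As a minimal sketch of that comparison, the snippet below computes the conversion rate in each group and the lift between them. The function name and the user and conversion counts are hypothetical, chosen only for illustration, and the code assumes the holdout group has at least one conversion.

```python
def incremental_lift(treated_users, treated_conversions,
                     holdout_users, holdout_conversions):
    """Return treatment rate, holdout rate, absolute lift and relative lift."""
    treatment_rate = treated_conversions / treated_users
    holdout_rate = holdout_conversions / holdout_users
    # Absolute lift: difference in conversion rates between the two groups.
    absolute_lift = treatment_rate - holdout_rate
    # Relative lift: the same difference as a fraction of the holdout rate.
    # Assumes holdout_rate > 0; a real report would need to guard against zero.
    relative_lift = absolute_lift / holdout_rate
    return treatment_rate, holdout_rate, absolute_lift, relative_lift


# Hypothetical counts: 90,000 treated users and 10,000 held out.
t_rate, h_rate, abs_lift, rel_lift = incremental_lift(90_000, 2_970, 10_000, 300)
print(f"Treatment rate: {t_rate:.2%}")
print(f"Holdout rate:   {h_rate:.2%}")
print(f"Absolute lift:  {abs_lift:.2%}")
print(f"Relative lift:  {rel_lift:.1%}")
```

An observed difference like this is only a point estimate. Whether it can be distinguished from noise depends on the group sizes, which is what the rest of this page addresses.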

However, the size of your holdout group directly determines whether your results will be statistically meaningful. Too small a holdout group and your test will lack the statistical power to detect real effects, leading to inconclusive results. Too large a holdout group and you are unnecessarily withholding campaigns from customers who could benefit from them, leaving revenue on the table.

The optimal holdout size depends on four factors: your total audience size, the baseline conversion rate, the minimum lift you want to detect, and your desired confidence level. This calculator uses the standard two-proportion z-test sample size formula to determine the minimum holdout group size needed to achieve 80% statistical power at your chosen confidence level.

Understanding the Math

The sample size formula for a two-proportion z-test is: n = (Z_alpha/2 + Z_beta)^2 x (p1(1-p1) + p2(1-p2)) / (p1-p2)^2, where n is the required sample size per group, p1 is the baseline conversion rate, p2 is the expected conversion rate with lift, Z_alpha/2 is the two-sided critical value for your confidence level (1.96 at 95% confidence), and Z_beta is the critical value for your desired power (approximately 0.84 for 80% power).
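The formula can be sketched in a few lines of Python using only the standard library. This is an illustrative implementation, not necessarily the calculator's exact code: the function name `holdout_sample_size` and its parameters are assumptions, and the lift is treated as relative to the baseline rate.

```python
from math import ceil
from statistics import NormalDist


def holdout_sample_size(baseline_rate, relative_lift, confidence=0.95, power=0.80):
    """Per-group sample size for a two-sided, two-proportion z-test."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)  # expected rate with lift
    alpha = 1 - confidence
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # Z_alpha/2
    z_beta = NormalDist().inv_cdf(power)           # Z_beta
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance_sum / (p1 - p2) ** 2
    # Round up: a fractional user cannot be assigned to a group.
    return ceil(n)


# 3% baseline, 10% relative lift, 95% confidence, 80% power.
n = holdout_sample_size(0.03, 0.10)
print(f"Required users per group: {n:,}")
```

Because the required size scales with 1 / (p1-p2)^2, halving the detectable lift roughly quadruples the sample needed. Small differences in the final digit can arise from how the critical values are rounded.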

For example, with a 3% baseline conversion rate and a 10% relative minimum detectable lift, p2 = 3.3%. The small absolute difference (0.3 percentage points) means you need a relatively large sample to distinguish signal from noise. At 95% confidence and 80% power, the formula gives roughly 53,000 users in the holdout group.
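Plugging these numbers into the formula step by step gives the following, assuming the rounded critical values 1.96 (95% confidence) and 0.8416 (80% power):

```latex
\begin{align*}
p_1 &= 0.03, \qquad p_2 = 0.03 \times 1.10 = 0.033 \\
p_1(1-p_1) + p_2(1-p_2) &= 0.0291 + 0.031911 = 0.061011 \\
(p_1 - p_2)^2 &= (0.003)^2 = 9 \times 10^{-6} \\
(Z_{\alpha/2} + Z_{\beta})^2 &= (1.96 + 0.8416)^2 \approx 7.849 \\
n &\approx 7.849 \times \frac{0.061011}{9 \times 10^{-6}} \approx 53{,}200
\end{align*}
```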

Scalversion automatically sizes holdout groups for every campaign based on these statistical principles, ensuring you always have enough power to detect meaningful lift without over-allocating to holdout.

Frequently Asked Questions

What is a holdout group?

A holdout group is a randomly selected subset of your audience that does not receive a marketing campaign. By comparing the behavior of the holdout group to the treatment group (who received the campaign), you can measure the true causal impact of your marketing.

How large should my holdout group be?

Holdout group size depends on your audience size, baseline conversion rate, the minimum lift you want to detect, and your desired confidence level. Larger holdout groups give more statistical power but mean fewer customers receive your campaign. As a rule of thumb, 5-15% of your audience is a common starting point, but a low baseline conversion rate or a small minimum detectable lift can require a much larger share.

What is statistical power?

Statistical power is the probability that your test will detect a real effect when one exists. A power of 80% means there is an 80% chance of detecting the true lift if it is at least as large as your minimum detectable effect. Higher power requires larger sample sizes.
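To make the definition concrete, the sketch below estimates the power of a two-sided, two-proportion z-test for given group sizes using the normal approximation. The function name and the example inputs are assumptions for illustration, and the negligible probability of a significant result in the wrong direction is ignored.

```python
from math import sqrt
from statistics import NormalDist


def achieved_power(p1, p2, n_holdout, n_treatment, confidence=0.95):
    """Approximate power to detect a true difference between p1 and p2."""
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Standard error of the difference between the two observed rates.
    se = sqrt(p1 * (1 - p1) / n_holdout + p2 * (1 - p2) / n_treatment)
    # Probability that the test statistic clears the critical value.
    return NormalDist().cdf(abs(p2 - p1) / se - z_alpha)


# 3% baseline vs. 3.3% with lift, at two hypothetical group sizes.
power_large = achieved_power(0.03, 0.033, 53_208, 53_208)
power_small = achieved_power(0.03, 0.033, 5_000, 5_000)
print(f"Power with 53,208 users per group: {power_large:.1%}")
print(f"Power with 5,000 users per group:  {power_small:.1%}")
```

With groups of only a few thousand users, power for this scenario falls well below 80%, which is why an undersized holdout so often produces inconclusive results.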

What confidence level should I use?

A 95% confidence level is standard in marketing experimentation. This means that if the campaign truly had no effect, there would be only a 5% chance of a false positive (concluding the campaign had an effect when it did not). For high-stakes decisions, use 99%. For quick directional reads, 90% may suffice.
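Each confidence level maps to a two-sided critical value that feeds into the sample size formula. A short sketch using Python's standard library (the helper name `critical_value` is an assumption for illustration):

```python
from statistics import NormalDist


def critical_value(confidence):
    """Two-sided critical value Z_alpha/2 for a given confidence level."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)


for confidence in (0.90, 0.95, 0.99):
    z = critical_value(confidence)
    print(f"{confidence:.0%} confidence -> Z_alpha/2 = {z:.3f}")
```

The critical values are approximately 1.645, 1.960, and 2.576. Because the critical value enters the sample size formula inside a squared term, moving from 95% to 99% confidence noticeably increases the required holdout size.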

Automatic holdout sizing on every campaign

Scalversion handles holdout group assignment and statistical measurement automatically.
