What Is the Bonferroni Correction?
The Bonferroni correction is a statistical adjustment that controls the familywise error rate when you're conducting multiple simultaneous hypothesis tests. The core idea is simple: divide your desired alpha level by the number of comparisons you're making. If you're running 5 tests at α = 0.05, each individual test must meet a stricter threshold of 0.05/5 = 0.01 to be declared significant. Named after Italian mathematician Carlo Emilio Bonferroni, this correction ensures that the overall probability of making at least one Type I error (false positive) across all your tests stays at or below your original alpha level. It's the most widely used and straightforward multiple-comparison correction in market research, survey analysis, and experimental design.
Why the Bonferroni Correction Matters
Every time you run an additional statistical test, you increase the chance of a false positive. With 20 independent tests at α = 0.05, you'd expect about one "significant" result by chance alone (20 × 0.05 = 1), and the probability of seeing at least one is roughly 64%, even if nothing real is happening. The Bonferroni correction prevents you from mistaking statistical noise for genuine findings, which is critical when you're making business decisions based on survey data or experiment results.
How the Bonferroni Correction Works
The Formula
Adjusted α = α / m
Where α is the desired familywise error rate (typically 0.05) and m is the number of comparisons or tests being conducted.
Equivalently, you can multiply each p-value by m and compare to the original α:
Adjusted p-value = p × m
If the adjusted p-value exceeds 1.0, cap it at 1.0.
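As a minimal sketch, both forms of the rule fit in a few lines of Python. The function names `bonferroni_alpha` and `bonferroni_adjust` and the five example p-values are illustrative assumptions, not part of any particular library or dataset.

```python
from typing import List, Sequence


def bonferroni_alpha(alpha: float, m: int) -> float:
    """Per-test significance threshold: alpha / m."""
    if m < 1:
        raise ValueError("m must be at least 1")
    return alpha / m


def bonferroni_adjust(p_values: Sequence[float]) -> List[float]:
    """Adjusted p-values: p * m, capped at 1.0."""
    m = len(p_values)
    return [min(p * m, 1.0) for p in p_values]


# Five tests at a familywise alpha of 0.05 (made-up p-values).
threshold = bonferroni_alpha(0.05, 5)
adjusted = bonferroni_adjust([0.004, 0.020, 0.300, 0.012, 0.049])

print(f"per-test threshold: {threshold}")
print("adjusted p-values:", [round(p, 3) for p in adjusted])
```

Comparing a raw p-value with `threshold` and comparing its adjusted value with the original α lead to the same decision, so you can report whichever is easier for your audience to read.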
Worked Example
You're testing whether customer satisfaction differs across four product categories. After a significant one-way ANOVA, you run all pairwise comparisons. With four groups, there are m = 4(4-1)/2 = 6 pairwise comparisons.
Adjusted α = 0.05 / 6 ≈ 0.0083
| Comparison | Raw p-value | Adjusted p-value | Significant? |
|---|---|---|---|
| A vs. B | 0.003 | 0.018 | Yes |
| A vs. C | 0.001 | 0.006 | Yes |
| A vs. D | 0.042 | 0.252 | No |
| B vs. C | 0.210 | 1.000 | No |
| B vs. D | 0.008 | 0.048 | Yes |
| C vs. D | 0.031 | 0.186 | No |
Notice that A vs. D was significant at the uncorrected α = 0.05 (p = 0.042) but fails the Bonferroni threshold (adjusted p = 0.252). The same is true of C vs. D (p = 0.031, adjusted p = 0.186). Without the correction, you would have reported two differences that could easily be false positives.
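The table above can be reproduced with a short script. This is a sketch that takes the six raw p-values as given; the variable names are illustrative.

```python
ALPHA = 0.05

# Raw p-values from the worked example.
raw_p = {
    "A vs. B": 0.003,
    "A vs. C": 0.001,
    "A vs. D": 0.042,
    "B vs. C": 0.210,
    "B vs. D": 0.008,
    "C vs. D": 0.031,
}

m = len(raw_p)  # 6 pairwise comparisons for four groups

results = {}
for name, p in raw_p.items():
    adjusted = min(p * m, 1.0)        # multiply by m, cap at 1.0
    significant = p <= ALPHA / m      # same decision as adjusted <= ALPHA
    results[name] = (round(adjusted, 3), significant)

for name, (adjusted, significant) in results.items():
    verdict = "Yes" if significant else "No"
    print(f"{name}: adjusted p = {adjusted:.3f}, significant? {verdict}")
```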
Why It Works
The Bonferroni correction is based on the Bonferroni inequality (also known as Boole's inequality or the union bound) from probability theory. For m tests, whether or not they are independent, the probability of at least one Type I error satisfies:
P(at least one false positive) ≤ m × α_individual
By setting α_individual = α/m, you guarantee that:
P(at least one false positive) ≤ m × (α/m) = α
The inequality means the actual familywise error rate is at or below α, often below, which is why the Bonferroni correction is described as conservative.
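A small simulation can make the guarantee concrete. The sketch below assumes every null hypothesis is true and that the m tests are independent with uniformly distributed p-values; independence is assumed only to keep the simulation simple. The function name `simulate_fwer`, the seed, and the trial count are arbitrary choices.

```python
import random


def simulate_fwer(m: int, per_test_alpha: float, trials: int, seed: int = 0) -> float:
    """Estimate P(at least one false positive) when all m nulls are true."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Under a true null, a p-value from a continuous test is uniform on [0, 1].
        if any(rng.random() < per_test_alpha for _ in range(m)):
            hits += 1
    return hits / trials


alpha, m = 0.05, 6
uncorrected = simulate_fwer(m, alpha, trials=20_000)
corrected = simulate_fwer(m, alpha / m, trials=20_000)

print(f"uncorrected familywise error rate: about {uncorrected:.3f}")
print(f"Bonferroni familywise error rate:  about {corrected:.3f}")
```

Under these assumptions the uncorrected estimate should land near 1 − 0.95^6 ≈ 0.26, while the corrected estimate should sit just under 0.05.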
When It Gets Too Conservative
The correction becomes increasingly conservative as the number of comparisons grows. With 50 comparisons, each test must meet α = 0.001, making it very difficult to detect real effects. This is the main criticism: the Bonferroni correction reduces Type I errors at the cost of inflating Type II errors (missing genuine effects).
Alternatives for large numbers of comparisons include:
- Holm-Bonferroni (step-down): Ranks p-values from smallest to largest and compares them to progressively less strict thresholds (α/m, α/(m-1), ..., α), stopping at the first p-value that fails. Never less powerful than standard Bonferroni, and often more powerful, while still controlling the familywise error rate.
- Benjamini-Hochberg (FDR): Controls the false discovery rate rather than the familywise error rate. Appropriate when you're screening many variables and can tolerate some false positives among your findings.
- Tukey's HSD: Purpose-built for all pairwise comparisons after ANOVA and generally more powerful than Bonferroni for that specific situation.
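To see how the first two alternatives behave, here is a hedged sketch of Holm-Bonferroni and Benjamini-Hochberg applied to the six raw p-values from the worked example. The function names are illustrative, the code returns reject/retain decisions rather than adjusted p-values, and Tukey's HSD is left out because it needs the underlying group data rather than p-values alone.

```python
def holm_reject(p_values, alpha=0.05):
    """Holm step-down: list of booleans (True = reject) in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, i in enumerate(order):  # rank 0 is the smallest p-value
        if p_values[i] <= alpha / (m - rank):
            reject[i] = True
        else:
            break  # first failure: this and all larger p-values are retained
    return reject


def benjamini_hochberg_reject(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up: controls the false discovery rate."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * alpha.
    cutoff = 0
    for k, i in enumerate(order, start=1):
        if p_values[i] <= k / m * alpha:
            cutoff = k
    reject = [False] * m
    for i in order[:cutoff]:
        reject[i] = True
    return reject


# A vs. B, A vs. C, A vs. D, B vs. C, B vs. D, C vs. D
p = [0.003, 0.001, 0.042, 0.210, 0.008, 0.031]

bonferroni = [x <= 0.05 / len(p) for x in p]
holm = holm_reject(p)
bh = benjamini_hochberg_reject(p)

print("Bonferroni rejections:", sum(bonferroni))
print("Holm rejections:      ", sum(holm))
print("BH rejections:        ", sum(bh))
```

On these particular p-values Holm rejects the same three comparisons as Bonferroni, while Benjamini-Hochberg also rejects C vs. D (p = 0.031), reflecting its looser, FDR-based guarantee.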
Bonferroni vs. No Correction
| Scenario | No correction | Bonferroni |
|---|---|---|
| 3 comparisons | ~14% false-positive risk | ≤5% false-positive risk |
| 10 comparisons | ~40% false-positive risk | ≤5% false-positive risk |
| 20 comparisons | ~64% false-positive risk | ≤5% false-positive risk |
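The percentages in the table follow from a one-line formula if you assume the tests are independent and every null hypothesis is true. The sketch below computes both columns under that assumption; the function names are illustrative.

```python
def uncorrected_fwer(m: int, alpha: float = 0.05) -> float:
    """P(at least one false positive) across m independent tests at alpha."""
    return 1 - (1 - alpha) ** m


def bonferroni_fwer(m: int, alpha: float = 0.05) -> float:
    """Same probability when each independent test uses alpha / m."""
    return 1 - (1 - alpha / m) ** m


for m in (3, 10, 20):
    print(
        f"{m:>2} comparisons: "
        f"no correction {uncorrected_fwer(m):.1%}, "
        f"Bonferroni {bonferroni_fwer(m):.2%}"
    )
```

With independent tests the Bonferroni figure comes out slightly below 5% for every m, which is the conservatism described earlier.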
When to Use the Bonferroni Correction
- Post-hoc pairwise comparisons after a significant ANOVA, especially when the number of comparisons is moderate (under 10-15)
- Multiple survey items tested for group differences, e.g., comparing segments on 8 different satisfaction attributes
- Subgroup analyses where you're testing the same hypothesis across several demographic cuts
- Multiple endpoints in A/B tests where you're tracking several metrics simultaneously
- Any analysis where you're running the same type of test multiple times on the same dataset
Common Mistakes to Avoid
- Applying Bonferroni to every test in an entire study: it should be applied within a family of related tests, not across all analyses in a report
- Using Bonferroni when the number of comparisons is very large: switch to Holm-Bonferroni or FDR methods for 15+ comparisons to maintain reasonable power
- Forgetting to report whether a correction was applied: reviewers and stakeholders need to know the correction method to evaluate your findings
How Quali-Fi Supports Multiple Comparison Corrections
Quali-Fi automatically applies Bonferroni corrections when significance testing is performed across multiple groups or items in cross-tabulation reports, with the option to switch to alternative methods. The Research plan ($1,061/month) provides configurable correction settings so you can match the approach to your study design.
Frequently Asked Questions
Is the Bonferroni correction too conservative?
It can be, especially with many comparisons. The Holm-Bonferroni method is never less powerful and controls the same familywise error rate, so there's no statistical reason to prefer the standard Bonferroni over Holm-Bonferroni. However, the standard version remains popular because it's easy to explain and widely understood.
When should I not use the Bonferroni correction?
Skip it when you're running a single pre-planned comparison (no correction needed) or when you have dozens of comparisons in an exploratory analysis (use FDR instead). Also skip it when the tests address completely independent research questions, because the correction applies to families of related tests.
Does the Bonferroni correction apply to confidence intervals?
Yes. To construct Bonferroni-adjusted confidence intervals, use the α/(2m) critical value instead of α/2. For m = 5 comparisons at α = 0.05, each confidence interval puts α/(2×5) = 0.005 in each tail, yielding 99% individual confidence intervals that jointly provide at least 95% familywise coverage.
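As a rough sketch, the adjusted critical value can be computed with Python's standard library. This assumes a normal (large-sample) approximation; with small samples you would typically use a t critical value instead. The function names and the example estimate (a mean difference of 0.40 with standard error 0.10) are hypothetical.

```python
from statistics import NormalDist


def bonferroni_z(alpha: float, m: int) -> float:
    """Two-sided normal critical value with alpha / (2 * m) in each tail."""
    return NormalDist().inv_cdf(1 - alpha / (2 * m))


def bonferroni_ci(estimate: float, std_error: float, alpha: float = 0.05, m: int = 1):
    """Bonferroni-adjusted interval: estimate +/- z * standard error."""
    z = bonferroni_z(alpha, m)
    return estimate - z * std_error, estimate + z * std_error


z_single = bonferroni_z(0.05, 1)  # ordinary 95% interval
z_family = bonferroni_z(0.05, 5)  # each of 5 intervals at the 99% level

# Hypothetical estimate: mean difference 0.40, standard error 0.10.
low, high = bonferroni_ci(0.40, 0.10, alpha=0.05, m=5)

print(f"z for a single interval: {z_single:.3f}")
print(f"z for 5 intervals:       {z_family:.3f}")
print(f"adjusted interval:       ({low:.3f}, {high:.3f})")
```

The wider critical value (roughly 2.58 instead of 1.96) is the price of the familywise guarantee: each interval gets wider so that all five cover their true values together at least 95% of the time.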