What Is Power Analysis?
Power analysis is a statistical method used to determine the minimum sample size needed to detect an effect of a given size with a specified level of confidence. Statistical power is the probability that a test will correctly reject a false null hypothesis. In plain terms, it's the chance you'll find a real effect when one actually exists. Power is expressed as a value between 0 and 1, and the conventional target is 0.80 (80%). A power of 0.80 means that if a real effect exists, your study has an 80% chance of detecting it and a 20% chance of missing it. Running a power analysis before data collection keeps you from fielding an underpowered study that wastes time and budget.
Why Power Analysis Matters
Underpowered studies are one of the most common problems in applied research. If your sample is too small, you won't detect real differences between groups, real preferences in concept tests, or real shifts in satisfaction scores, and you'll conclude that nothing happened when something actually did. Power analysis catches this problem before you spend money on data collection. It also prevents over-sampling, which wastes budget on precision you don't need.
How Power Analysis Works
The Four Components
Every power analysis involves four interconnected values. If you know three, you can solve for the fourth:
- Sample size (n): how many observations you'll collect
- Effect size: the minimum meaningful difference you want to detect
- Significance level (α): the threshold for rejecting the null hypothesis (typically 0.05)
- Power (1 - β): the probability of detecting the effect (typically 0.80)
In practice, you usually set α = 0.05 and power = 0.80, specify the effect size you care about, and solve for sample size.
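The "know three, solve for the fourth" idea can be sketched in a few lines of Python. This is a minimal illustration, not a replacement for a dedicated power calculator: it assumes a two-group comparison of means with equal group sizes, a two-tailed test, and a normal approximation to the t-test. The function names are made up for this example.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal: mean 0, SD 1


def n_per_group(d, alpha=0.05, power=0.80):
    """Solve for sample size per group, given effect size, alpha, and power."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_power = _STD_NORMAL.inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


def achieved_power(d, n, alpha=0.05):
    """Solve for power, given effect size, per-group sample size, and alpha."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    shift = d * sqrt(n / 2)  # expected test statistic when the effect is real
    # Probability of landing in either rejection region
    return _STD_NORMAL.cdf(shift - z_alpha) + _STD_NORMAL.cdf(-shift - z_alpha)


def detectable_effect(n, alpha=0.05, power=0.80):
    """Solve for the smallest detectable effect size, given n, alpha, and power."""
    z_alpha = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_power = _STD_NORMAL.inv_cdf(power)
    return (z_alpha + z_power) * sqrt(2 / n)


n = n_per_group(d=0.5)           # sample size for a "medium" effect
check = achieved_power(0.5, n)   # power actually achieved at that n
floor_d = detectable_effect(n)   # effect size that n can detect at 80% power
```

Because the normal approximation ignores the extra uncertainty in the t-distribution, exact t-test calculations typically come out a respondent or two higher per group.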
Effect Size
Effect size quantifies how large the difference or relationship needs to be for it to matter. Common effect size measures include:
- Cohen's d for comparing two means: d = (M₁ - M₂) / SD_pooled, with conventional benchmarks of small (d = 0.2), medium (d = 0.5), and large (d = 0.8)
- Cohen's w for chi-square tests
- Cohen's f for ANOVA
- Pearson's r for correlations
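If you have pilot data, Cohen's d can be computed directly from the two samples. The sketch below uses the usual pooled standard deviation (sample variances weighted by their degrees of freedom). The scores are hypothetical and only there to make the example runnable.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_1, group_2):
    """Cohen's d: difference in means divided by the pooled standard deviation."""
    n1, n2 = len(group_1), len(group_2)
    # Pool the two sample variances, weighting each by its degrees of freedom
    pooled_var = (
        (n1 - 1) * variance(group_1) + (n2 - 1) * variance(group_2)
    ) / (n1 + n2 - 2)
    return (mean(group_1) - mean(group_2)) / sqrt(pooled_var)


# Hypothetical pilot satisfaction scores for two product versions
version_a = [72, 85, 64, 90, 78, 69, 81, 75]
version_b = [70, 66, 59, 84, 73, 62, 77, 68]

d = cohens_d(version_a, version_b)
print(f"Cohen's d = {d:.2f}")
```

With samples this small the estimate is noisy, so treat a pilot-based d as a rough guide rather than a precise input.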
The tricky part is choosing the right effect size. If you set it too large, you'll need fewer respondents but might miss moderate effects that still matter to your business. If you set it too small, you'll need a huge sample. The best approach is to base it on pilot data, prior research, or the smallest difference that would change your decision.
Worked Example
You want to compare satisfaction scores between two product versions. Based on past data, the standard deviation of satisfaction scores is 15 points. You consider a 5-point difference meaningful enough to act on.
Effect size: d = 5 / 15 = 0.33 (small-to-medium)
Parameters: α = 0.05, power = 0.80, two-tailed test
Using the power analysis formula for a two-sample t-test, you'd need approximately 145 respondents per group (290 total) to reliably detect that 5-point difference.
If you only collected 50 per group, your power would drop to roughly 0.38, meaning you'd miss the effect about 62% of the time even if it's real.
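The numbers in this example can be reproduced approximately with the standard normal-approximation formula, n per group = 2(z₁₋α/₂ + z_power)² / d². The sketch below is an assumption about how such a calculation might be done, not the exact method behind the figures above; an exact t-test calculation gives slightly different values.

```python
from math import ceil, sqrt
from statistics import NormalDist

z = NormalDist()  # standard normal distribution

sd = 15.0          # standard deviation of satisfaction scores (from the example)
difference = 5.0   # smallest difference worth acting on
alpha, target_power = 0.05, 0.80

d_exact = difference / sd       # 0.333...
d_rounded = round(d_exact, 2)   # 0.33, as quoted in the example

z_alpha = z.inv_cdf(1 - alpha / 2)   # two-tailed critical value
z_power = z.inv_cdf(target_power)


def n_per_group(d):
    """Respondents needed per group (normal approximation)."""
    return ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


def power_at(n, d):
    """Power of a two-tailed test with n respondents per group."""
    shift = d * sqrt(n / 2)
    return z.cdf(shift - z_alpha) + z.cdf(-shift - z_alpha)


n_rounded = n_per_group(d_rounded)   # uses d = 0.33
n_exact = n_per_group(d_exact)       # uses d = 1/3
power_50 = power_at(50, d_exact)     # power with only 50 per group

print(n_rounded, n_exact, round(power_50, 2))
```

Rounding d to 0.33 gives 145 per group, while the unrounded 1/3 gives 142 under this approximation. The required sample is sensitive to small changes in effect size because d is squared in the denominator. At 50 per group, the approximation puts power in the high 0.30s.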
A Priori vs. Post Hoc Power Analysis
A priori power analysis (before data collection) is the standard approach. You calculate the sample size you need and then collect that many observations.
Post hoc power analysis (after data collection) is widely discouraged. Calculating power after you've seen the results is mathematically redundant, because post hoc power computed from the observed effect is just a transformation of the p-value you already have. If you got a non-significant result at α = 0.05, post hoc power will always be low (roughly 50% or less). It doesn't tell you anything new. The time to think about power is before you field the study.
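To see why post hoc power adds nothing, note that it can be computed from the p-value alone. The sketch below assumes a two-sided z-test, where the mapping is exact. For t-tests the relationship also depends on degrees of freedom, but the pattern is the same.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal distribution


def observed_power(p_value, alpha=0.05):
    """Post hoc ("observed") power implied by a two-sided z-test p-value."""
    z_obs = z.inv_cdf(1 - p_value / 2)   # |z| implied by the p-value
    z_crit = z.inv_cdf(1 - alpha / 2)    # critical value for the test
    # Treat the observed effect as the true effect and compute power
    return z.cdf(z_obs - z_crit) + z.cdf(-z_obs - z_crit)


for p in (0.01, 0.05, 0.20, 0.50):
    print(f"p = {p:.2f} -> observed power = {observed_power(p):.2f}")
```

No sample size or effect size goes into the function: the p-value determines the result completely. A p-value right at 0.05 maps to observed power of about 0.50, and anything less significant maps to a lower value.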
Factors That Affect Power
Power increases when you:
- Increase sample size: more data means more precision
- Target a larger effect size: bigger effects are easier to detect
- Use a less conservative alpha: α = 0.10 gives more power than α = 0.05 (but more false positives)
- Use a one-tailed test: appropriate only if you're sure about the direction of the effect
- Reduce measurement noise: better instruments and more reliable scales lower variability
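Each of these levers can be compared against a baseline with a small power function. This is an illustrative sketch using a normal approximation for a two-group comparison. The baseline values (d = 0.33, n = 50 per group) and the reduced standard deviation of 12 are assumptions chosen for the example.

```python
from math import sqrt
from statistics import NormalDist

z = NormalDist()  # standard normal distribution


def power(d, n, alpha=0.05, tails=2):
    """Approximate power for comparing two group means, n per group."""
    z_crit = z.inv_cdf(1 - alpha / tails)
    shift = d * sqrt(n / 2)  # expected test statistic under the true effect
    result = z.cdf(shift - z_crit)
    if tails == 2:
        result += z.cdf(-shift - z_crit)  # opposite rejection region
    return result


baseline = power(d=0.33, n=50)
scenarios = {
    "baseline (d=0.33, n=50, alpha=0.05, two-tailed)": baseline,
    "larger sample (n=100)": power(d=0.33, n=100),
    "larger effect (d=0.50)": power(d=0.50, n=50),
    "less conservative alpha (0.10)": power(d=0.33, n=50, alpha=0.10),
    "one-tailed test": power(d=0.33, n=50, tails=1),
    # Same 5-point difference, but SD reduced from 15 to 12
    "less noise (d = 5/12)": power(d=5 / 12, n=50),
}

for label, value in scenarios.items():
    print(f"{label}: {value:.2f}")
```

Reducing noise works through the effect size: the raw difference stays the same, but a smaller standard deviation makes the standardized effect larger.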
When to Use Power Analysis
- Before fielding any quantitative study to ensure you're collecting enough data to answer your research questions
- During proposal or budgeting phases to justify sample size requirements to stakeholders
- When comparing study designs to see which approach delivers adequate power at lower cost
- Before running A/B tests to determine how long the test needs to run based on expected conversion rate differences
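For the A/B testing case, test duration follows from the required sample per arm and your traffic. The sketch below uses a common normal-approximation formula for comparing two proportions. The conversion rates and daily visitor count are hypothetical, and the function names are made up for this example.

```python
from math import ceil, sqrt
from statistics import NormalDist

z = NormalDist()  # standard normal distribution


def n_per_arm(p_control, p_variant, alpha=0.05, power=0.80):
    """Visitors needed per arm to detect p_control vs. p_variant (two-tailed)."""
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    p_bar = (p_control + p_variant) / 2  # pooled rate under the null
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_power * sqrt(p_control * (1 - p_control) + p_variant * (1 - p_variant))
    ) ** 2
    return ceil(numerator / (p_control - p_variant) ** 2)


def days_to_run(n_arm, daily_visitors, arms=2):
    """Days needed if daily_visitors are split evenly across all arms."""
    return ceil(n_arm * arms / daily_visitors)


# Hypothetical inputs: 10% baseline conversion, hoping to detect a lift to 12%
n = n_per_arm(0.10, 0.12)
days = days_to_run(n, daily_visitors=1000)
print(f"{n} visitors per arm, about {days} days at 1,000 visitors/day")
```

With these inputs the formula calls for about 3,841 visitors per arm. Halving the detectable lift roughly quadruples the required sample, which is why small expected differences translate into long tests.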
Common Mistakes to Avoid
- Running post hoc power analysis to explain non-significant results: this is circular reasoning; use confidence intervals instead to show which effect sizes your data can and can't rule out
- Using "medium" effect size as a default without justification: Cohen's benchmarks are conventions, not universal truths; the effect size should reflect what matters for your specific decision
- Ignoring attrition and data quality: if 20% of survey responses will be removed for quality reasons, divide your required sample size by 0.80 (a 25% increase, not 20%) to maintain adequate power after cleaning
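The adjustment for removed responses can be sketched as dividing the required completes by the expected retention rate. Simply adding the removal percentage on top falls short, because the removals are taken out of the inflated total. The figures below are illustrative.

```python
from math import ceil


def fielding_target(n_required, removal_rate):
    """Completes to collect so that n_required remain after cleaning."""
    if not 0 <= removal_rate < 1:
        raise ValueError("removal_rate must be at least 0 and less than 1")
    return ceil(n_required / (1 - removal_rate))


n_required = 145      # per-group sample size from the power analysis
removal_rate = 0.20   # expected share of responses removed in cleaning

target = fielding_target(n_required, removal_rate)
naive_target = ceil(n_required * (1 + removal_rate))  # "just add 20%"

print(f"Collect {target} per group; {target * (1 - removal_rate):.0f} remain")
print(f"Adding 20% gives {naive_target}; only {naive_target * 0.8:.0f} remain")
```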
How Quali-Fi Supports Power Analysis
Quali-Fi's Research plan includes a built-in sample size calculator that runs power analysis for common research designs, including A/B tests, concept tests, and subgroup comparisons. You set your desired confidence level and minimum detectable difference, and the platform recommends the number of completes you need before fielding.
Frequently Asked Questions
What's the difference between power and confidence level?
Confidence level (1 - α) controls the false positive rate: how often you'd incorrectly reject a true null hypothesis. Power (1 - β) controls the false negative rate: how often you'd miss a real effect. Both are set before data collection, but they address different types of errors. A study can be 95% confident but only 50% powered, meaning it's unlikely to flag a false alarm but has only a coin-flip chance of catching a real signal.
Is 80% power always the right target?
It's the most common convention, but not a universal rule. For high-stakes decisions (like launching a new product line), you might target 90% power to reduce the chance of missing a real difference. For exploratory research where you'll follow up with more studies, 70% might be acceptable. The cost of a false negative should drive your power target.
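The cost of a higher power target shows up directly in sample size. As a rough sketch, assuming a two-group comparison, a two-tailed α of 0.05, a normal approximation, and an illustrative effect size of d = 0.33:

```python
from math import ceil
from statistics import NormalDist

z = NormalDist()  # standard normal distribution


def n_per_group(d, power, alpha=0.05):
    """Respondents needed per group at a given power target."""
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


d = 0.33  # illustrative effect size
sizes = {target: n_per_group(d, target) for target in (0.70, 0.80, 0.90)}

for target, n in sizes.items():
    print(f"power {target:.0%}: {n} per group")
```

Under these assumptions, moving from 80% to 90% power adds roughly a third to the required sample, so the higher target makes sense mainly when a missed effect would be expensive.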
Can I do a power analysis for qualitative research?
Not in the traditional sense, since power analysis is designed for hypothesis testing with quantitative data. For qualitative research, sample size planning uses different frameworks, like thematic saturation, where you recruit until new interviews stop revealing new themes. The typical range is 12-30 interviews, depending on the research question's complexity.