What Is Fisher's Exact Test?
Fisher's exact test is a statistical test for determining whether two categorical variables are associated in a 2×2 contingency table, particularly when sample sizes are small. Unlike the chi-square test of independence, which relies on a large-sample approximation, Fisher's exact test calculates the exact probability of observing the data (or something more extreme) under the null hypothesis of no association. This makes it reliable even when expected cell frequencies fall below the thresholds where chi-square becomes inaccurate. Developed by R.A. Fisher in the 1930s, it uses the hypergeometric distribution to compute probabilities directly. In market research, you'll reach for it when comparing binary outcomes across two small groups, such as whether 12 participants in a concept test preferred option A over B at different rates than 10 participants in another condition.
Why Fisher's Exact Test Matters
Chi-square tests break down with small samples. The standard rule of thumb is that chi-square requires all expected cell frequencies to be at least 5 (some texts say at least 10). When your cells contain small counts, common in pilot studies, qualitative follow-ups, or niche segment analyses, chi-square p-values become unreliable. Fisher's exact test gives you a valid p-value regardless of sample size, so you're never stuck without an answer.
How Fisher's Exact Test Works
The Setup
You have a 2×2 contingency table:
| | Outcome A | Outcome B | Row Total |
|---|---|---|---|
| Group 1 | a | b | a + b |
| Group 2 | c | d | c + d |
| Column Total | a + c | b + d | N |
The Formula
The probability of the observed table, given fixed row and column totals, is:
P = [(a+b)! × (c+d)! × (a+c)! × (b+d)!] / [N! × a! × b! × c! × d!]
The p-value is the sum of probabilities for all tables that are as extreme as or more extreme than the observed table (in the direction specified by your hypothesis).
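As a minimal sketch of the formula above, the probability of a single table can be computed from binomial coefficients. This is algebraically the same as the factorial expression but avoids very large intermediate factorials. The function name `table_probability` is an illustrative choice, not part of any library.

```python
from math import comb

def table_probability(a: int, b: int, c: int, d: int) -> float:
    """Hypergeometric probability of one 2x2 table with fixed margins.

    Equivalent to (a+b)!(c+d)!(a+c)!(b+d)! / (N! a! b! c! d!).
    """
    n = a + b + c + d
    # Choose which a of row 1 and which c of row 2 fall in column A,
    # divided by all ways to pick the (a + c) column-A members from N.
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)
```

For example, `table_probability(1, 0, 0, 1)` is 0.5: with every row and column total equal to 1, only two tables are possible and they are equally likely.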
Worked Example
You're testing whether a new product demo increases purchase intent. You show 15 shoppers the demo and 12 shoppers the standard display, then record whether each one expressed intent to purchase:
| | Intent: Yes | Intent: No | Total |
|---|---|---|---|
| Demo | 10 | 5 | 15 |
| Standard | 4 | 8 | 12 |
| Total | 14 | 13 | 27 |
Expected cell frequencies: the expected count for "Demo + Yes" = (15 × 14)/27 = 7.78. Check every expected value; if any falls below 5, chi-square is inappropriate.
Expected values: 7.78, 7.22, 6.22, 5.78. The smallest is 5.78, which is borderline. Fisher's exact test is the safer choice.
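The expected-count check can be scripted. The following is a small sketch using only the Python standard library; the helper name `expected_counts` is illustrative.

```python
def expected_counts(table):
    """Expected cell counts under independence: row total x column total / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Rows: Demo, Standard. Columns: Intent Yes, Intent No.
observed = [[10, 5], [4, 8]]
expected = expected_counts(observed)
smallest = min(min(row) for row in expected)

print([[round(x, 2) for x in row] for row in expected])
# [[7.78, 7.22], [6.22, 5.78]]
print(f"smallest expected count: {smallest:.2f}")
# smallest expected count: 5.78
```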
Fisher's exact test p-value = 0.128 (two-tailed)
At α = 0.05, this isn't significant. The chi-square test on the same data (without continuity correction) gives p = 0.085, noticeably smaller because it's an approximation. With small samples, these discrepancies matter, and Fisher's gives the exact answer.
If the hypothesis were one-tailed (the demo group has higher intent), the one-tailed p = 0.091, which is smaller but still not significant at α = 0.05. This illustrates why pre-specifying directionality matters: the one-tailed and two-tailed p-values can differ substantially, so choose the direction before looking at the data.
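The p-values for a 2×2 table can be computed by enumerating every table that shares the observed row and column totals. The sketch below uses only the Python standard library, and the function names are illustrative. The two-tailed rule (summing all tables no more probable than the observed one) is an assumption that follows the convention used by R's `fisher.test` and SciPy's `fisher_exact`; other two-tailed definitions exist. The chi-square helper is the uncorrected Pearson statistic with 1 degree of freedom.

```python
from math import comb, erfc, sqrt

def fisher_exact_2x2(a, b, c, d):
    """Return (one_tailed_greater, two_tailed) exact p-values for a 2x2 table."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    denom = comb(n, col1)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Probability of each table with the same margins, keyed by the top-left cell.
    probs = {k: comb(row1, k) * comb(row2, col1 - k) / denom
             for k in range(lo, hi + 1)}
    p_obs = probs[a]
    # One-tailed: top-left cell at least as large as observed.
    one_tailed = sum(p for k, p in probs.items() if k >= a)
    # Two-tailed: tables no more probable than the observed one
    # (small relative tolerance guards against floating-point ties).
    two_tailed = sum(p for p in probs.values() if p <= p_obs * (1 + 1e-7))
    return one_tailed, min(1.0, two_tailed)

def chi_square_p_2x2(a, b, c, d):
    """Pearson chi-square p-value (df = 1, no continuity correction)."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return erfc(sqrt(stat / 2))  # upper tail of chi-square with 1 df
```

Calling `fisher_exact_2x2(10, 5, 4, 8)` returns the one-tailed and two-tailed p-values for the demo table, and `chi_square_p_2x2(10, 5, 4, 8)` gives the large-sample approximation for comparison.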
When to Use Fisher's vs. Chi-Square
| Criterion | Fisher's Exact | Chi-Square |
|---|---|---|
| Sample size | Any (best for small) | Large (expected cells ≥ 5) |
| Computation | Exact probability | Approximation |
| Table size | Traditionally 2×2 | Any size |
| Speed | Slow for large N | Fast |
| Accuracy | Always exact | Approximate, poor with small N |
Modern software handles Fisher's exact test efficiently even for larger samples, so some statisticians recommend using it routinely for all 2×2 tables. The chi-square test is still preferred for larger tables (3×3 and above) where exact computation becomes intensive, though Freeman-Halton extensions of Fisher's test handle larger tables too.
Relationship to Odds Ratio
Fisher's exact test and the odds ratio are natural companions. The odds ratio for a 2×2 table is:
OR = (a × d) / (b × c)
In the worked example: OR = (10 × 8) / (5 × 4) = 80/20 = 4.0. Shoppers who saw the demo had 4 times the odds of expressing purchase intent, though this didn't reach significance with the small sample.
Fisher's exact test can be thought of as testing whether the odds ratio equals 1.0 (no association).
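The odds ratio calculation is easy to script. The sketch below also returns an approximate 95% confidence interval using the log-scale (Woolf) method. That interval method is an assumption chosen for illustration; exact conditional intervals are often preferred for very small samples. The function name is hypothetical.

```python
from math import exp, log, sqrt

def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Sample odds ratio and an approximate log-scale (Woolf) confidence interval."""
    if 0 in (a, b, c, d):
        raise ValueError("a zero cell makes the odds ratio or its standard error undefined")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the large-sample approximation.
    se_log = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = exp(log(odds_ratio) - z * se_log)
    upper = exp(log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper

estimate, lower, upper = odds_ratio_with_ci(10, 5, 4, 8)
print(f"OR = {estimate:.1f}, approx. 95% CI = ({lower:.2f}, {upper:.2f})")
```

For the worked example the interval spans 1.0, which is consistent with the test not reaching significance.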
When to Use Fisher's Exact Test
- Small sample studies where any expected cell frequency is below 5 in a 2×2 table
- Pilot testing with limited participants where you need to check for categorical differences before scaling up
- Niche segment comparisons where the subgroup of interest has few respondents
- Medical or safety research where accuracy matters more than convenience
- Any 2×2 table when you want the exact p-value rather than an approximation
Common Mistakes to Avoid
- Defaulting to chi-square without checking expected frequencies: if any expected cell is below 5, chi-square results may be misleading
- Confusing one-tailed and two-tailed p-values: Fisher's test can be run in either direction; make sure the version matches your hypothesis (most software defaults to two-tailed)
- Applying the test to paired data: Fisher's exact test is for independent groups; for paired binary data, use McNemar's test
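For the paired-data case in the last bullet, an exact version of McNemar's test needs only the two discordant-pair counts. The sketch below is a minimal illustration using the standard library; the function name and the example counts are hypothetical.

```python
from math import comb

def mcnemar_exact(b: int, c: int) -> float:
    """Two-tailed exact McNemar p-value.

    b = pairs that were Yes in condition 1 and No in condition 2,
    c = pairs that were No in condition 1 and Yes in condition 2.
    Under the null, b follows a Binomial(b + c, 0.5) distribution.
    """
    n = b + c
    if n == 0:
        return 1.0  # no discordant pairs, so no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Hypothetical counts: 1 respondent switched one way, 9 switched the other.
print(mcnemar_exact(1, 9))
```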
How Quali-Fi Supports Small-Sample Testing
Quali-Fi automatically switches from chi-square to Fisher's exact test when expected cell frequencies fall below the reliability threshold, ensuring accurate significance testing regardless of sample size. The Research plan ($1,061/month) reports both the exact p-value and the odds ratio with confidence intervals for all 2×2 comparisons.
Frequently Asked Questions
Is Fisher's exact test always better than chi-square?
For 2×2 tables, it's the safer default: its p-value is exact rather than approximate, though it can be slightly conservative. The only practical drawback used to be computational time with large samples, but modern software eliminates this concern. For larger tables (3×3+), chi-square or the Freeman-Halton extension of Fisher's test are the options.
Can Fisher's exact test detect small effects?
With small samples, no test has high power. Fisher's exact test is accurate (it keeps the false-positive rate at or below your chosen α), but it may lack the power to detect real effects if your sample is very small. This is a sample-size limitation, not a test limitation. Power analysis before data collection is essential.
What if my table is larger than 2×2?
The Freeman-Halton extension generalizes Fisher's exact test to r×c tables. Most modern statistical software (R, SPSS, Stata) can compute this, though it's computationally intensive for large tables. For practical purposes, when tables are larger than about 2×4 with reasonable expected frequencies, chi-square works well.