What Is a Chi-Square Test?
A chi-square test is a statistical method used to determine whether there's a significant association between two categorical variables. It compares the frequencies you actually observed in your data to the frequencies you'd expect if the variables were completely independent of each other. If the difference between observed and expected values is large enough, you conclude the variables are related. The most common version, the chi-square test of independence, is the go-to method for analyzing survey cross-tabs where both variables are categorical, like comparing purchase behavior across age groups or product preference across regions.
Why Chi-Square Tests Matter in Research
Survey data is full of categorical variables: yes/no responses, demographic groups, product choices, satisfaction tiers. When you need to know whether one category is related to another (does gender affect brand preference? does job title affect which feature customers prioritize?), the chi-square test gives you a formal answer. Without it, you're left eyeballing frequency tables and guessing whether the differences are real or just sampling noise.
How Chi-Square Tests Work
The Formula
chi-square = SUM((O - E)^2 / E)
Where:
- O is the observed frequency (what you actually counted)
- E is the expected frequency (what you'd predict if the variables were independent)
- The sum is taken over all cells in the contingency table
Calculating Expected Frequencies
For each cell in a contingency table:
E = (row total * column total) / grand total
Worked Example: 2x2 Contingency Table
A SaaS company wants to know if trial-to-paid conversion differs between customers who received an onboarding call and those who didn't.
Observed data:
| | Converted | Did Not Convert | Row Total |
|---|---|---|---|
| Onboarding Call | 85 | 115 | 200 |
| No Call | 60 | 140 | 200 |
| Column Total | 145 | 255 | 400 |
Step 1. Calculate expected frequencies:
- Expected (Call + Converted) = (200 * 145) / 400 = 72.5
- Expected (Call + Not Converted) = (200 * 255) / 400 = 127.5
- Expected (No Call + Converted) = (200 * 145) / 400 = 72.5
- Expected (No Call + Not Converted) = (200 * 255) / 400 = 127.5
Expected frequencies table:
| | Converted | Did Not Convert |
|---|---|---|
| Onboarding Call | 72.5 | 127.5 |
| No Call | 72.5 | 127.5 |
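Step 1 can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `expected_frequencies` is made up for this example, and the table is assumed to be a list of rows of raw counts.

```python
def expected_frequencies(observed):
    """Expected counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Observed counts from the onboarding-call example (rows: Call, No Call;
# columns: Converted, Did Not Convert).
observed = [[85, 115], [60, 140]]
expected = expected_frequencies(observed)
print(expected)  # [[72.5, 127.5], [72.5, 127.5]]
```

Because both rows have the same total (200), the expected counts are identical across rows; with unequal row totals they would differ.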
Step 2. Calculate chi-square for each cell:
- Cell 1: (85 - 72.5)^2 / 72.5 = (12.5)^2 / 72.5 = 156.25 / 72.5 = 2.155
- Cell 2: (115 - 127.5)^2 / 127.5 = (-12.5)^2 / 127.5 = 156.25 / 127.5 = 1.225
- Cell 3: (60 - 72.5)^2 / 72.5 = (-12.5)^2 / 72.5 = 156.25 / 72.5 = 2.155
- Cell 4: (140 - 127.5)^2 / 127.5 = (12.5)^2 / 127.5 = 156.25 / 127.5 = 1.225
Step 3. Sum all cells: chi-square = 2.155 + 1.225 + 2.155 + 1.225 = 6.76
Step 4. Determine degrees of freedom: df = (rows - 1) * (columns - 1) = (2 - 1) * (2 - 1) = 1
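Step 5. Find the p-value: With chi-square = 6.76 and df = 1, the p-value is approximately 0.0093. This is the uncorrected Pearson chi-square; some statistical software applies Yates' continuity correction to 2x2 tables by default, which gives a slightly smaller statistic and a slightly larger p-value.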
Result: Since 0.0093 < 0.05, there's a statistically significant association between receiving an onboarding call and converting to a paid plan. Customers who got the call converted at 42.5% versus 30% for those who didn't.
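The five steps above can be reproduced with a short script. This is a sketch under a few assumptions: the function names are invented for illustration, no continuity correction is applied, and the p-value shortcut shown works only for df = 1 (where a chi-square variable is the square of a standard normal, so the upper tail equals `erfc(sqrt(x / 2))`).

```python
import math


def chi_square_independence(observed):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n  # expected count for this cell
            stat += (o - e) ** 2 / e               # this cell's contribution
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, df


def p_value_df1(stat):
    """Upper-tail p-value for a chi-square statistic with exactly 1 df."""
    return math.erfc(math.sqrt(stat / 2))


stat, df = chi_square_independence([[85, 115], [60, 140]])
p = p_value_df1(stat)
print(round(stat, 2), df, round(p, 4))  # 6.76 1 0.0093
```

For tables larger than 2x2 the statistic and df are computed the same way, but the p-value needs the chi-square distribution for the matching df rather than this one-line shortcut.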
Degrees of Freedom
Degrees of freedom (df) for a chi-square test of independence equals (number of rows - 1) times (number of columns - 1). A 2x2 table has 1 degree of freedom. A 3x4 table has 6. The degrees of freedom determine which chi-square distribution to use when looking up the p-value. Higher degrees of freedom shift the critical value upward, requiring a larger chi-square statistic to achieve significance.
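To see how df changes the verdict for the same statistic, the sketch below computes the upper-tail probability of a chi-square distribution for any positive integer df using only Python's standard library. The function name `chi2_sf` is an assumption for this example, and the approach (building up from the closed forms for df = 1 and df = 2 with the incomplete-gamma recurrence) is one convenient option rather than the only one; it is intended for the small df values typical of survey tables.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    y = x / 2
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)                  # closed form for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))       # closed form for df = 1
    # Recurrence Q(a + 1, y) = Q(a, y) + y**a * exp(-y) / Gamma(a + 1),
    # applied until a reaches df / 2.
    while a < df / 2:
        q += y ** a * math.exp(-y) / math.gamma(a + 1)
        a += 1
    return q


# The same statistic is significant at df = 1 but not at df = 6.
for df in (1, 2, 6):
    print(df, chi2_sf(6.76, df))
```

With chi-square = 6.76, the p-value is below 0.05 for a 2x2 table (df = 1) but well above it for a 3x4 table (df = 6), which is the practical meaning of the critical value shifting upward.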
Types of Chi-Square Tests
Chi-square test of independence (most common): Tests whether two categorical variables are related. Used with contingency tables from survey cross-tabulations.
Chi-square goodness-of-fit test: Tests whether a single categorical variable follows a specific expected distribution. For example, testing whether responses are evenly split across four product categories (25% each) or follow some other expected pattern.
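The formula is the same for both types; the difference is in how you define the expected frequencies. For a goodness-of-fit test with a fully specified expected distribution, the degrees of freedom equal the number of categories minus 1.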
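A goodness-of-fit calculation might look like the sketch below. The counts are hypothetical (100 invented responses across four product categories), the function name is made up for this example, and df = categories - 1 assumes the expected proportions were fixed in advance rather than estimated from the data.

```python
def goodness_of_fit_statistic(observed, expected_proportions):
    """Return (chi-square statistic, df) for counts against hypothesized proportions."""
    n = sum(observed)
    expected = [n * p for p in expected_proportions]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return stat, df


# Hypothetical data: 100 responses, tested against an even 25% split.
observed = [30, 20, 25, 25]
stat, df = goodness_of_fit_statistic(observed, [0.25, 0.25, 0.25, 0.25])
print(stat, df)  # 2.0 3
```

Here each category is expected to receive 25 responses, so only the first two categories contribute to the statistic (1.0 each).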
Effect Size: Cramer's V
A significant chi-square tells you an association exists but not how strong it is. Cramer's V measures effect size:
V = sqrt(chi-square / (n * (k - 1)))
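Where n is total sample size and k is the smaller of (rows, columns). V ranges from 0 (no association) to 1 (perfect association). As a rough guide for tables where k = 2, values around 0.10 are small, 0.30 are medium, and 0.50+ are large; for larger tables, the same labels apply at somewhat lower values of V.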
For the worked example: V = sqrt(6.76 / (400 * 1)) = sqrt(0.0169) = 0.13, a small effect.
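The same calculation as a small helper, for illustration only; the function name and argument order are assumptions of this sketch, not part of any particular library.

```python
import math


def cramers_v(chi_square, n, n_rows, n_cols):
    """Cramer's V: sqrt(chi-square / (n * (k - 1))), with k = min(rows, columns)."""
    k = min(n_rows, n_cols)
    return math.sqrt(chi_square / (n * (k - 1)))


# Worked example: chi-square = 6.76, n = 400, 2x2 table.
v = cramers_v(6.76, 400, 2, 2)
print(round(v, 2))  # 0.13
```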
When to Use a Chi-Square Test
- Survey cross-tabulations: testing whether demographic variables (age, gender, region) are associated with categorical outcomes (product choice, satisfaction tier, yes/no questions)
- A/B test analysis with categorical outcomes: conversion (yes/no), plan selection (basic/pro/enterprise), support ticket status
- Market segmentation: determining whether customer segments differ in their behavioral patterns or preferences
- Goodness-of-fit testing: checking whether response distributions match expected patterns or prior benchmarks
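- Pre-post comparisons: comparing categorical outcome distributions before and after an intervention, provided the "before" and "after" groups are independent samples (for the same respondents measured twice, use McNemar's test instead)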
Common Mistakes to Avoid
- Using chi-square when expected cell counts are below 5: the chi-square approximation breaks down with small expected frequencies. Use Fisher's exact test instead for 2x2 tables, or combine categories to increase cell counts.
- Applying chi-square to continuous data: the test requires categorical variables. If you have continuous data, either categorize it first (with meaningful cutoffs) or use a different test like a t-test or ANOVA.
- Confusing significance with strength: a significant chi-square tells you the variables are associated, not how strongly. Always report Cramer's V or another effect size measure alongside the p-value.
- Ignoring the direction of association: a significant chi-square in a table larger than 2x2 doesn't tell you which specific cells are driving the association. Examine standardized residuals to identify where observed counts deviate most from expected.
- Using chi-square on paired or repeated data: the standard chi-square assumes independent observations. For paired categorical data (same subjects measured twice), use McNemar's test.
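Two of the checks above are easy to automate: flagging cells with low expected counts and computing residuals. The sketch below uses hypothetical helper names and computes Pearson residuals, (O - E) / sqrt(E), whose squares sum to the chi-square statistic. Adjusted standardized residuals, which some tools report instead, apply an additional margin-based scaling that is not shown here.

```python
import math


def expected_counts(observed):
    """Expected counts under independence for a table given as a list of rows."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def low_expected_cells(observed, threshold=5):
    """Return (row, column, expected) for every cell whose expected count is below threshold."""
    expected = expected_counts(observed)
    return [(i, j, e)
            for i, row in enumerate(expected)
            for j, e in enumerate(row)
            if e < threshold]


def pearson_residuals(observed):
    """Per-cell (O - E) / sqrt(E); large absolute values mark cells driving the result."""
    expected = expected_counts(observed)
    return [[(o - e) / math.sqrt(e) for o, e in zip(o_row, e_row)]
            for o_row, e_row in zip(observed, expected)]


table = [[85, 115], [60, 140]]   # onboarding-call example
print(low_expected_cells(table))  # [] means every expected count is at least 5
print(pearson_residuals(table))
```

In the onboarding-call table, the positive residual in the Call/Converted cell and the negative one in the No Call/Converted cell show the direction of the association.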
How Quali-Fi Supports Chi-Square Testing
Quali-Fi runs chi-square tests automatically whenever you cross-tabulate categorical survey variables in the analytics dashboard. Significant associations are highlighted with confidence markers so you can spot meaningful patterns without manual calculations. The platform also displays cell-level residuals to show which specific combinations are driving the overall significance, essential for interpreting tables larger than 2x2.
Frequently Asked Questions
What's the minimum sample size for a chi-square test?
There's no hard minimum for total sample size, but the rule of thumb is that every expected cell count should be at least 5. For a 2x2 table, this typically requires a total sample of at least 20-40, depending on how the data splits. Larger tables need proportionally larger samples.
Can I use chi-square with more than two variables?
The standard chi-square test works with two variables at a time. For three or more categorical variables simultaneously, use log-linear analysis or stratified chi-square tests (Cochran-Mantel-Haenszel). Alternatively, test pairs of variables separately, adjusting for multiple comparisons.
What's the difference between chi-square and Fisher's exact test?
Both test the association between categorical variables. Chi-square uses an approximation that works well with large samples. Fisher's exact test computes the exact probability and is preferred when sample sizes are small or expected cell counts fall below 5. With large samples, the results are virtually identical.
How do I report chi-square results?
Standard format: chi-square(df) = value, p = value, V = value. For the worked example: chi-square(1) = 6.76, p = 0.009, V = 0.13. Include the contingency table with both observed counts and percentages so readers can see the pattern behind the statistic.
Can I run a chi-square test on Likert scale data?
Technically yes: you can treat each response option (1-5) as a category. However, this loses ordinal information. If you want to test whether two groups differ on a Likert item, Mann-Whitney U (which accounts for ordering) is often a better choice. If you collapse the scale into categories (e.g., "agree" vs. "disagree"), chi-square is appropriate.
Related Topics
- P-Value
- Statistical Significance
- Correlation Coefficient
- ANOVA
- Hypothesis Testing
- Null Hypothesis
- Statistical Significance Calculator