Chi-Square Test Explained

Learn what a chi-square test is, how to calculate it with a contingency table, and when to use it for categorical data analysis in survey research.

What Is a Chi-Square Test?

A chi-square test is a statistical method used to determine whether there's a significant association between two categorical variables. It compares the frequencies you actually observed in your data to the frequencies you'd expect if the variables were completely independent of each other. If the difference between observed and expected values is large enough, you conclude the variables are related. The most common version, the chi-square test of independence, is the go-to method for analyzing survey cross-tabs where both variables are categorical, like comparing purchase behavior across age groups or product preference across regions.

Why Chi-Square Tests Matter in Research

Survey data is full of categorical variables: yes/no responses, demographic groups, product choices, satisfaction tiers. When you need to know whether one category is related to another (does gender affect brand preference? does job title affect which feature customers prioritize?), the chi-square test gives you a formal answer. Without it, you're left eyeballing frequency tables and guessing whether the differences are real or just sampling noise.

How Chi-Square Tests Work

The Formula

chi-square = SUM((O - E)^2 / E)

Where:

O is the observed frequency (what you actually counted)
E is the expected frequency (what you'd predict if the variables were independent)
The sum is taken over all cells in the contingency table

Calculating Expected Frequencies

For each cell in a contingency table:

E = (row total * column total) / grand total
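The expected-frequency rule above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `expected_frequencies` and the 2x3 table of counts are hypothetical and chosen only to show the arithmetic.

```python
def expected_frequencies(observed):
    """Return expected counts under independence for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    # E = (row total * column total) / grand total, for every cell
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Hypothetical 2x3 table: two regions (rows) by three product choices (columns).
observed = [[20, 30, 50],
            [30, 30, 40]]
expected = expected_frequencies(observed)

for row in expected:
    print(row)
```

Because both rows have the same total here, every column's expected count is simply split evenly between the two rows.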

Worked Example: 2x2 Contingency Table

A SaaS company wants to know if trial-to-paid conversion differs between customers who received an onboarding call and those who didn't.

Observed data:

	Converted	Did Not Convert	Row Total
Onboarding Call	85	115	200
No Call	60	140	200
Column Total	145	255	400

Step 1. Calculate expected frequencies:

Expected (Call + Converted) = (200 * 145) / 400 = 72.5
Expected (Call + Not Converted) = (200 * 255) / 400 = 127.5
Expected (No Call + Converted) = (200 * 145) / 400 = 72.5
Expected (No Call + Not Converted) = (200 * 255) / 400 = 127.5

Expected frequencies table:

	Converted	Did Not Convert
Onboarding Call	72.5	127.5
No Call	72.5	127.5

Step 2. Calculate chi-square for each cell:

Cell 1: (85 - 72.5)^2 / 72.5 = (12.5)^2 / 72.5 = 156.25 / 72.5 = 2.155
Cell 2: (115 - 127.5)^2 / 127.5 = (-12.5)^2 / 127.5 = 156.25 / 127.5 = 1.225
Cell 3: (60 - 72.5)^2 / 72.5 = (-12.5)^2 / 72.5 = 156.25 / 72.5 = 2.155
Cell 4: (140 - 127.5)^2 / 127.5 = (12.5)^2 / 127.5 = 156.25 / 127.5 = 1.225

Step 3. Sum all cells: chi-square = 2.155 + 1.225 + 2.155 + 1.225 = 6.76

Step 4. Determine degrees of freedom: df = (rows - 1) * (columns - 1) = (2 - 1) * (2 - 1) = 1

Step 5. Find the p-value: With chi-square = 6.76 and df = 1, the p-value is approximately 0.0093.

Result: Since 0.0093 < 0.05, there's a statistically significant association between receiving an onboarding call and converting to a paid plan. Customers who got the call converted at 42.5% versus 30% for those who didn't.
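The five steps can be checked with a short script. This is a sketch using only the Python standard library; the variable names are illustrative. The p-value line relies on an identity that holds only for df = 1, where the upper-tail chi-square probability equals erfc(sqrt(x / 2)).

```python
import math

# Observed counts from the worked example.
observed = [[85, 115],   # onboarding call: converted, did not convert
            [60, 140]]   # no call: converted, did not convert

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Steps 1-3: expected count per cell, then sum (O - E)^2 / E over all cells.
chi_square = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / n
        chi_square += (o - e) ** 2 / e

# Step 4: degrees of freedom.
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Step 5: p-value. Valid for df = 1 only: P(X > x) = erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi_square / 2))

print(f"chi-square = {chi_square:.2f}, df = {df}, p = {p_value:.4f}")
```

If you compare against a statistics library, note that some apply Yates' continuity correction to 2x2 tables by default (SciPy's `chi2_contingency` does unless called with `correction=False`), which gives a slightly smaller statistic than the uncorrected calculation shown here.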

Degrees of Freedom

Degrees of freedom (df) for a chi-square test of independence equals (number of rows - 1) times (number of columns - 1). A 2x2 table has 1 degree of freedom. A 3x4 table has 6. The degrees of freedom determine which chi-square distribution to use when looking up the p-value. Higher degrees of freedom shift the critical value upward, requiring a larger chi-square statistic to achieve significance.
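To see how degrees of freedom change the p-value, here is a hedged sketch that computes the upper-tail probability of a chi-square distribution for integer df using only the standard library. The function name `chi_square_p_value` is hypothetical. It uses the closed forms for df = 1 and df = 2 and the standard recurrence that steps df up by two; a statistics library would normally be used instead.

```python
import math


def chi_square_p_value(x, df):
    """Upper-tail probability P(X > x) for a chi-square distribution, integer df >= 1."""
    if df < 1 or x < 0:
        raise ValueError("df must be >= 1 and x must be >= 0")
    half_x = x / 2
    if df % 2 == 0:
        k = 2
        p = math.exp(-half_x)              # closed form for df = 2
    else:
        k = 1
        p = math.erfc(math.sqrt(half_x))   # closed form for df = 1
    # Recurrence: Q(x; k + 2) = Q(x; k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
    while k < df:
        p += half_x ** (k / 2) * math.exp(-half_x) / math.gamma(k / 2 + 1)
        k += 2
    return p


# The same statistic is less impressive in a larger table.
for df in (1, 2, 6):
    print(df, round(chi_square_p_value(6.76, df), 4))
```

The loop evaluates the worked example's statistic against distributions with more degrees of freedom, showing the p-value rising as df grows.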

Types of Chi-Square Tests

Chi-square test of independence (most common): Tests whether two categorical variables are related. Used with contingency tables from survey cross-tabulations.

Chi-square goodness-of-fit test: Tests whether a single categorical variable follows a specific expected distribution. For example, testing whether responses are evenly split across four product categories (25% each) or follow some other expected pattern.

The formula is the same for both types; the difference is in how you define the expected frequencies.
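A goodness-of-fit calculation might look like the sketch below. The function name `goodness_of_fit` and the response counts are hypothetical. It assumes the expected proportions are fixed in advance rather than estimated from the data, in which case the degrees of freedom are the number of categories minus 1.

```python
def goodness_of_fit(observed, expected_proportions):
    """Return (chi-square, df) for observed counts against hypothesised proportions."""
    if len(observed) != len(expected_proportions):
        raise ValueError("need one expected proportion per category")
    if abs(sum(expected_proportions) - 1.0) > 1e-9:
        raise ValueError("expected proportions must sum to 1")
    n = sum(observed)
    expected = [n * p for p in expected_proportions]
    chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    return chi_square, df


# Hypothetical counts: 100 respondents choosing among four product categories,
# tested against an even 25% split.
observed = [30, 20, 25, 25]
chi_square, df = goodness_of_fit(observed, [0.25, 0.25, 0.25, 0.25])
print(chi_square, df)
```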

Effect Size: Cramer's V

A significant chi-square tells you an association exists but not how strong it is. Cramer's V measures effect size:

V = sqrt(chi-square / (n * (k - 1)))

Where n is total sample size and k is the smaller of (rows, columns). V ranges from 0 (no association) to 1 (perfect association). As a rough guide when k = 2, values around 0.10 are small, around 0.30 medium, and 0.50 or above large; for larger tables the conventional thresholds are lower.

For the worked example: V = sqrt(6.76 / (400 * 1)) = sqrt(0.0169) = 0.13, a small effect.
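The same calculation as a small helper, offered as a sketch; the function name `cramers_v` and its argument names are hypothetical.

```python
import math


def cramers_v(chi_square, n, n_rows, n_cols):
    """Cramer's V from a chi-square statistic, sample size, and table dimensions."""
    k = min(n_rows, n_cols)
    return math.sqrt(chi_square / (n * (k - 1)))


# Values from the worked example: chi-square = 6.76, n = 400, 2x2 table.
v = cramers_v(6.76, 400, 2, 2)
print(round(v, 2))
```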

When to Use a Chi-Square Test

Survey cross-tabulations: testing whether demographic variables (age, gender, region) are associated with categorical outcomes (product choice, satisfaction tier, yes/no questions)
A/B test analysis with categorical outcomes: conversion (yes/no), plan selection (basic/pro/enterprise), support ticket status
Market segmentation: determining whether customer segments differ in their behavioral patterns or preferences
Goodness-of-fit testing: checking whether response distributions match expected patterns or prior benchmarks
Pre-post comparisons: comparing categorical outcome distributions before and after an intervention, provided the two waves are independent samples rather than the same respondents measured twice

Common Mistakes to Avoid

Using chi-square when expected cell counts are below 5: the chi-square approximation breaks down with small expected frequencies. Use Fisher's exact test instead for 2x2 tables, or combine categories to increase cell counts.
Applying chi-square to continuous data: the test requires categorical variables. If you have continuous data, either categorize it first (with meaningful cutoffs) or use a different test like a t-test or ANOVA.
Confusing significance with strength: a significant chi-square tells you the variables are associated, not how strongly. Always report Cramer's V or another effect size measure alongside the p-value.
Ignoring the direction of association: a significant chi-square in a table larger than 2x2 doesn't tell you which specific cells are driving the association. Examine standardized residuals to identify where observed counts deviate most from expected.
Using chi-square on paired or repeated data: the standard chi-square assumes independent observations. For paired categorical data (same subjects measured twice), use McNemar's test.
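For the residuals mentioned in the list above, one simple option is the Pearson residual, (O - E) / sqrt(E), computed per cell. The sketch below is illustrative: the function name `pearson_residuals` and the 3x2 table are hypothetical, and some software uses "standardized residuals" to mean adjusted residuals, which additionally divide by a term based on the row and column proportions.

```python
import math


def pearson_residuals(observed):
    """Return (O - E) / sqrt(E) for every cell of a contingency table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    residuals = []
    for i, row in enumerate(observed):
        out_row = []
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            out_row.append((o - e) / math.sqrt(e))
        residuals.append(out_row)
    return residuals


# Hypothetical 3x2 table: three customer segments by (renewed, churned).
observed = [[30, 20],
            [25, 25],
            [10, 40]]
for row in pearson_residuals(observed):
    print([round(r, 2) for r in row])
```

The squared Pearson residuals sum to the chi-square statistic, so the cells with the largest absolute residuals contribute most to it. Positive values mean more observations than expected, negative values fewer.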

How Quali-Fi Supports Chi-Square Testing

Quali-Fi runs chi-square tests automatically whenever you cross-tabulate categorical survey variables in the analytics dashboard. Significant associations are highlighted with confidence markers so you can spot meaningful patterns without manual calculations. The platform also displays cell-level residuals to show which specific combinations are driving the overall significance, essential for interpreting tables larger than 2x2.

Frequently Asked Questions

What's the minimum sample size for a chi-square test?

There's no hard minimum for total sample size, but the rule of thumb is that every expected cell count should be at least 5. For a 2x2 table, this typically requires a total sample of at least 20-40, depending on how the data splits. Larger tables need proportionally larger samples.
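A quick way to apply the rule of thumb is to compute the smallest expected count before running the test. This is a sketch; the function name `smallest_expected_count` and the small table are hypothetical.

```python
def smallest_expected_count(observed):
    """Return the minimum expected cell count under independence."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return min(r * c / n for r in row_totals for c in col_totals)


# Hypothetical small pilot survey: 20 respondents in a 2x2 table.
pilot = [[3, 7],
         [2, 8]]
smallest = smallest_expected_count(pilot)
if smallest < 5:
    print(f"Smallest expected count is {smallest}; the approximation may be unreliable.")
```

Note that the check uses expected counts, not observed ones, so a table with 20 respondents can still fail it when one outcome is rare.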

Can I use chi-square with more than two variables?

The standard chi-square test works with two variables at a time. For three or more categorical variables simultaneously, use log-linear analysis or stratified chi-square tests (Cochran-Mantel-Haenszel). Alternatively, test pairs of variables separately, adjusting for multiple comparisons.

What's the difference between chi-square and Fisher's exact test?

Both test the association between categorical variables. Chi-square uses an approximation that works well with large samples. Fisher's exact test computes the exact probability and is preferred when sample sizes are small or expected cell counts fall below 5. With large samples, the results are virtually identical.
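For illustration, a two-sided Fisher's exact test for a 2x2 table can be written with the standard library alone. This sketch assumes one common definition of the two-sided p-value: the sum of the probabilities of all tables with the same margins that are no more likely than the observed one. The function name `fisher_exact_two_sided` and the example counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance so equally likely tables are not lost to rounding.
    total = sum(p for p in (prob(x) for x in range(lowest, highest + 1))
                if p <= p_observed * (1 + 1e-9))
    return min(total, 1.0)


# Hypothetical tiny sample: 8 respondents, every expected count is 2.
p = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p, 4))
```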

How do I report chi-square results?

Standard format: chi-square(df) = value, p = value, V = value. For the worked example: chi-square(1) = 6.76, p = 0.009, V = 0.13. Include the contingency table with both observed counts and percentages so readers can see the pattern behind the statistic.
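If you report many tests, a small formatter keeps the output consistent. This is a sketch with a hypothetical function name; printing very small p-values as "p < 0.001" is a common convention and an assumption here, not something this guide prescribes.

```python
def format_chi_square_report(chi_square, df, p_value, cramers_v):
    """Format results as: chi-square(df) = value, p = value, V = value."""
    if p_value < 0.001:
        p_text = "p < 0.001"   # assumed convention for very small p-values
    else:
        p_text = f"p = {p_value:.3f}"
    return f"chi-square({df}) = {chi_square:.2f}, {p_text}, V = {cramers_v:.2f}"


# Values from the worked example.
report = format_chi_square_report(6.76, 1, 0.0093, 0.13)
print(report)
```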

Can I run a chi-square test on Likert scale data?

Technically yes, you can treat each response option (1-5) as a category. However, this loses ordinal information. If you want to test whether two groups differ on a Likert item, Mann-Whitney U (which accounts for ordering) is often a better choice. If you collapse the scale into categories (e.g., "agree" vs. "disagree"), chi-square is appropriate.

Want automatic chi-square testing in your survey cross-tabs? Start your free 14-day Quali-Fi trial, no credit card required.
