
Kruskal-Wallis Test: Nonparametric One-Way Comparison


Learn what the Kruskal-Wallis test is, how it compares to one-way ANOVA, and when to use it for ordinal or non-normal data with three or more groups.

What Is the Kruskal-Wallis Test?

The Kruskal-Wallis test is a nonparametric statistical test that compares three or more independent groups to determine whether their distributions differ. It's the nonparametric counterpart of one-way ANOVA, used when the dependent variable is ordinal or when the assumption of normality required by ANOVA isn't met. Like other rank-based tests, it works by ranking all observations from all groups together and then testing whether the average ranks differ significantly across groups. The test produces an H statistic that, under the null hypothesis, approximately follows a chi-square distribution with k - 1 degrees of freedom, where k is the number of groups. In market research, the Kruskal-Wallis test is common when comparing customer segments, demographic groups, or experimental conditions on Likert-scale ratings or other ordinal outcomes where parametric assumptions are questionable.

Why the Kruskal-Wallis Test Matters

Comparing three or more groups is one of the most frequent analyses in research: segment comparisons, multi-cell experiments, regional breakdowns. When your data is ordinal, heavily skewed, or drawn from small groups, one-way ANOVA can produce misleading results. The Kruskal-Wallis test provides a valid alternative that doesn't require normality or equal variances, so its conclusions hold up better with messy real-world data.

How the Kruskal-Wallis Test Works

The Procedure

  1. Combine all observations from all groups
  2. Rank them from lowest to highest (tied values get the average rank)
  3. Calculate the average rank for each group
  4. Compute the H statistic, which measures how much the group rank means deviate from the overall average rank
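
Steps 1 to 3 can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function names (`average_ranks`, `mean_rank_by_group`) and the toy data are assumptions made for the example, not part of any particular package.

```python
from collections import defaultdict


def average_ranks(values):
    """Return 1-based ranks for `values`, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend `end` across the run of observations tied with position `start`.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of positions start+1 .. end+1
        for pos in range(start, end + 1):
            ranks[order[pos]] = shared
        start = end + 1
    return ranks


def mean_rank_by_group(groups):
    """Pool every observation, rank once, then average the ranks within each group."""
    labels, pooled = [], []
    for name, scores in groups.items():
        labels.extend([name] * len(scores))
        pooled.extend(scores)
    totals = defaultdict(float)
    for name, rank in zip(labels, average_ranks(pooled)):
        totals[name] += rank
    return {name: totals[name] / len(scores) for name, scores in groups.items()}


# Hypothetical toy data (not from the article): the two 20s share ranks 2 and 3.
toy = {"a": [10, 20], "b": [20, 30], "c": [40, 50]}
toy_mean_ranks = mean_rank_by_group(toy)
print(toy_mean_ranks)
```

The key design point is that ranking happens once on the pooled data, never within each group separately; group membership only matters when the ranks are averaged.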

The Formula

H = [12 / (N(N + 1))] × Σ(R²_j / n_j) - 3(N + 1)

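Where N is the total number of observations, the sum runs over the k groups (j = 1, ..., k), n_j is the sample size for group j, and R_j is the sum of ranks in group j.

H is compared to the chi-square distribution with k - 1 degrees of freedom. The chi-square reference is an approximation, and it is least accurate when the groups are very small.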
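
The formula translates almost term for term into code. The sketch below assumes the rank sums have already been computed from pooled ranks; `h_statistic` is a hypothetical helper name, and the version shown applies no tie correction, matching the formula as written above.

```python
def h_statistic(rank_sums, sizes):
    """Kruskal-Wallis H from per-group rank sums R_j and sizes n_j (no tie correction)."""
    if len(rank_sums) != len(sizes):
        raise ValueError("need one rank sum per group")
    n_total = sum(sizes)
    # Sanity check: pooled ranks 1..N always sum to N(N + 1) / 2.
    if abs(sum(rank_sums) - n_total * (n_total + 1) / 2) > 1e-9:
        raise ValueError("rank sums do not add up to N(N + 1) / 2")
    weighted = sum(r * r / n for r, n in zip(rank_sums, sizes))
    return 12 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


# Hypothetical toy case: groups [1, 2, 3], [4, 5, 6], [7, 8, 9] have rank sums 6, 15, 24.
h_toy = h_statistic([6, 15, 24], [3, 3, 3])
df_toy = 3 - 1  # k - 1 degrees of freedom
print(round(h_toy, 3), df_toy)  # 7.2 2
```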

Worked Example

You compare satisfaction ratings (1-7 scale) across three customer service channels: phone (n = 6), chat (n = 6), and email (n = 6). N = 18.

Phone Chat Email
5 6 3
4 7 4
6 5 2
5 6 3
3 7 4
4 6 3

Ranked data (1-18):

Phone has {5, 4, 6, 5, 3, 4}, chat has {6, 7, 5, 6, 7, 6}, and email has {3, 4, 2, 3, 4, 3}. Pooling all 18 scores gives these counts: 2 (×1), 3 (×4), 4 (×4), 5 (×3), 6 (×4), 7 (×2).

Assigning average ranks:

Score Count Ranks Occupied Average Rank
2 1 1 1.0
3 4 2-5 3.5
4 4 6-9 7.5
5 3 10-12 11.0
6 4 13-16 14.5
7 2 17-18 17.5

R_phone = 11 + 7.5 + 14.5 + 11 + 3.5 + 7.5 = 55

R_chat = 14.5 + 17.5 + 11 + 14.5 + 17.5 + 14.5 = 89.5

R_email = 3.5 + 7.5 + 1 + 3.5 + 7.5 + 3.5 = 26.5

Check: 55 + 89.5 + 26.5 = 171, which matches N(N + 1)/2 = 18(19)/2 = 171.

H = [12 / (18 × 19)] × [(55²/6) + (89.5²/6) + (26.5²/6)] - 3(19)

H = [12/342] × [504.2 + 1335.0 + 117.0] - 57

H = 0.0351 × 1956.2 - 57 = 68.6 - 57 = 11.6

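With df = k - 1 = 2, the critical chi-square at α = 0.05 is 5.99. Since H = 11.6 > 5.99, the three channels produce significantly different satisfaction ratings (p ≈ 0.003, so p < 0.01). Because these ratings contain many ties, statistical software will usually apply a tie correction and report a slightly larger H (about 12.1 here); the conclusion is the same.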
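
The worked example can be reproduced with a short, standard-library-only script, shown here as a sketch and not as a reference implementation. Two things in it go beyond the hand calculation and should be read as additions: the tie correction, which divides H by 1 - Σ(t³ - t) / (N³ - N) where t is the size of each group of tied scores (libraries such as SciPy's `scipy.stats.kruskal` apply it by default), and the p-value shortcut exp(-H/2), which is exact for the chi-square distribution only when df = 2.

```python
from collections import Counter
from math import exp

groups = {
    "phone": [5, 4, 6, 5, 3, 4],
    "chat": [6, 7, 5, 6, 7, 6],
    "email": [3, 4, 2, 3, 4, 3],
}

pooled = [score for scores in groups.values() for score in scores]
n_total = len(pooled)

# Average rank per distinct score: the mean of the 1-based positions it occupies.
counts = Counter(pooled)
rank_of, next_rank = {}, 1
for score in sorted(counts):
    t = counts[score]
    rank_of[score] = next_rank + (t - 1) / 2
    next_rank += t

rank_sums = {name: sum(rank_of[s] for s in scores) for name, scores in groups.items()}

h = 12 / (n_total * (n_total + 1)) * sum(
    rank_sums[name] ** 2 / len(scores) for name, scores in groups.items()
) - 3 * (n_total + 1)

# Tie correction (not part of the hand calculation): divide H by the tie factor.
tie_factor = 1 - sum(t**3 - t for t in counts.values()) / (n_total**3 - n_total)
h_corrected = h / tie_factor

# With df = 2 the chi-square upper-tail probability is exactly exp(-H / 2).
p_uncorrected = exp(-h / 2)
p_corrected = exp(-h_corrected / 2)

print(rank_sums)
print(round(h, 2), round(h_corrected, 2))
print(round(p_uncorrected, 4), round(p_corrected, 4))
```

Both versions of H lead to the same decision at α = 0.05; the corrected value is the one to expect when checking the result against statistical software.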

Follow-Up Tests

Use pairwise Mann-Whitney U tests with Bonferroni correction (or Dunn's test) to determine which groups differ. With 3 groups, that's 3 comparisons, each tested at an adjusted α = 0.05/3 ≈ 0.017.
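
A rough sketch of the pairwise follow-up on the same three channels is shown below. It assumes a tie-corrected normal approximation for the Mann-Whitney U test with no continuity correction; `mann_whitney` is a hypothetical helper written for this example. With only six ratings per group the approximation is coarse, so p-values that land near the adjusted threshold should be rechecked with an exact method or with Dunn's test.

```python
from collections import Counter
from itertools import combinations
from math import erfc, sqrt

groups = {
    "phone": [5, 4, 6, 5, 3, 4],
    "chat": [6, 7, 5, 6, 7, 6],
    "email": [3, 4, 2, 3, 4, 3],
}


def mann_whitney(x, y):
    """Return (U for x, two-sided p) from a tie-corrected normal approximation."""
    n1, n2 = len(x), len(y)
    n = n1 + n2
    counts = Counter(x + y)
    # Average rank per distinct value in the pooled pair of samples.
    rank_of, next_rank = {}, 1
    for value in sorted(counts):
        t = counts[value]
        rank_of[value] = next_rank + (t - 1) / 2
        next_rank += t
    u = sum(rank_of[v] for v in x) - n1 * (n1 + 1) / 2
    tie_term = sum(t**3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u - n1 * n2 / 2) / sqrt(variance)  # assumes the values are not all identical
    return u, erfc(abs(z) / sqrt(2))  # erfc(|z| / sqrt(2)) is the two-sided normal tail


pairs = list(combinations(groups, 2))
alpha_adjusted = 0.05 / len(pairs)  # Bonferroni: split alpha across the comparisons

results = {}
for a, b in pairs:
    u, p = mann_whitney(groups[a], groups[b])
    results[(a, b)] = (u, p, p < alpha_adjusted)
    print(f"{a} vs {b}: U = {u}, p = {p:.4f}, significant = {p < alpha_adjusted}")
```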

When to Use the Kruskal-Wallis Test

  • Comparing three or more independent groups on ordinal or non-normal continuous data
  • Segment analysis with Likert-scale outcomes when groups are unequal in size or data is skewed
  • Small group sizes where normality assumptions for ANOVA are questionable
  • Survey data where response options are limited and distributions are lumpy
  • Replacing one-way ANOVA when diagnostic checks reveal non-normal residuals or unequal variances

Common Mistakes to Avoid

  • Using it for paired/repeated measures: for related groups, use the Friedman test instead
  • Stopping at the omnibus test: a significant H statistic means at least one group differs, but you need post-hoc comparisons to identify which ones
  • Assuming it tests medians: like the Mann-Whitney U, it tests for differences between distributions, which amounts to a comparison of medians only when the group distributions have the same shape

How Quali-Fi Supports Nonparametric Group Comparisons

Quali-Fi's Research plan ($1,061/month) offers the Kruskal-Wallis test as a standard option for multi-group comparisons, with automated post-hoc testing and Bonferroni correction. The platform selects the appropriate test based on your data's characteristics and presents results in clear comparison tables.

Compare multiple groups with Quali-Fi

Frequently Asked Questions

How is the Kruskal-Wallis test different from one-way ANOVA?

One-way ANOVA compares group means and assumes normally distributed data with equal variances. The Kruskal-Wallis test compares rank distributions and makes no normality assumption. When ANOVA assumptions are met, ANOVA is more powerful. When they're not, Kruskal-Wallis is more reliable.

Can I use the Kruskal-Wallis test with only two groups?

Technically yes. With two groups it is equivalent to the two-sided Mann-Whitney U test based on the normal approximation, so the conclusion is the same. But with only two groups, most researchers use the Mann-Whitney directly since no follow-up comparisons are needed.

What effect size should I report?

Epsilon-squared (ε²) = H / [(N² - 1) / (N + 1)], which simplifies to H / (N - 1), is one option. More commonly, eta-squared based on ranks (η²_H) = (H - k + 1) / (N - k) is reported. Values around 0.01, 0.06, and 0.14 correspond to small, medium, and large effects.
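
Both effect sizes are one-line calculations. The sketch below reads the epsilon-squared formula as H divided by (N² - 1) / (N + 1), which is the same as H / (N - 1); the function names are hypothetical, and the inputs are taken from this article's worked example (H = 11.64 before tie correction, N = 18, k = 3).

```python
def epsilon_squared(h, n_total):
    """Epsilon-squared: H / ((N^2 - 1) / (N + 1)), which simplifies to H / (N - 1)."""
    return h / ((n_total**2 - 1) / (n_total + 1))


def eta_squared_h(h, n_total, k):
    """Rank-based eta-squared: (H - k + 1) / (N - k)."""
    return (h - k + 1) / (n_total - k)


# Inputs from the worked example: H = 11.64 (uncorrected), N = 18, k = 3.
eps2 = epsilon_squared(11.64, 18)
eta2 = eta_squared_h(11.64, 18, 3)
print(round(eps2, 3), round(eta2, 3))  # 0.685 0.643
```

Both values sit far above the 0.14 benchmark, which fits an example where the chat and email ratings barely overlap.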

