Friedman Test: Nonparametric Repeated Measures Analysis

Learn what the Friedman test is, how it compares to repeated-measures ANOVA, and when to use it for ranked or ordinal repeated-measures data.

What Is the Friedman Test?

The Friedman test is a nonparametric statistical test used to detect differences across three or more related groups, typically repeated measures from the same participants. It's the nonparametric alternative to repeated-measures ANOVA, designed for situations where the dependent variable is ordinal or where the assumptions of normality required by parametric tests aren't met. Instead of comparing raw means, the Friedman test ranks the scores within each participant across conditions, then tests whether the average ranks differ significantly across conditions. It was developed by economist Milton Friedman (yes, that Milton Friedman) in 1937. In market research, you'd use it when the same respondents rate or rank multiple items, such as evaluating three product concepts in sequence, and the data is ordinal or heavily skewed.

Why the Friedman Test Matters

Repeated-measures designs are efficient because each participant serves as their own control, which reduces variability. But repeated-measures ANOVA assumes interval-level measurement and approximately normally distributed data. When you're working with Likert-scale ratings, ranks, or satisfaction tiers, the Friedman test gives you a valid way to test for differences without forcing parametric assumptions onto ordinal data.

How the Friedman Test Works

The Procedure

For each participant, rank the scores across the k conditions (1 = lowest, k = highest). Tied ranks receive the average of the ranks they would have occupied.
Sum the ranks for each condition across all participants.
Calculate the Friedman test statistic.
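The first two steps can be sketched in Python. This is a minimal illustration, assuming NumPy and SciPy are installed; the `scores` array is made-up data for four participants and three conditions, not results from a real study.

```python
import numpy as np
from scipy.stats import rankdata

# Hypothetical ratings: one row per participant, one column per condition.
scores = np.array([
    [5, 3, 6],
    [4, 4, 7],
    [6, 2, 5],
    [3, 5, 6],
])

# Step 1: rank within each participant (row). rankdata's default
# "average" method gives tied scores the mean of the ranks they span.
ranks = np.apply_along_axis(rankdata, 1, scores)

# Step 2: sum the ranks for each condition (column).
rank_sums = ranks.sum(axis=0)

print(ranks)
print(rank_sums)
```

Ranking row by row is what makes the test a within-participant comparison: a participant who rates everything high contributes the same ranks as one who rates everything low.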

The Formula

χ²_F = [12 / (nk(k + 1))] × ΣR²_j - 3n(k + 1)

Where n is the number of participants, k is the number of conditions, and R_j is the sum of ranks for condition j.

Under the null hypothesis, the test statistic approximately follows a chi-square distribution with k - 1 degrees of freedom. The approximation improves as n grows.
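As a rough sketch, the formula translates into a few lines of Python. The function names `friedman_statistic` and `friedman_p_value` are illustrative, not part of any library, and the rank sums in the example are hypothetical. No tie correction is applied here.

```python
from scipy.stats import chi2

def friedman_statistic(rank_sums, n):
    """Friedman chi-square from per-condition rank sums (no tie correction)."""
    k = len(rank_sums)
    sum_sq = sum(r ** 2 for r in rank_sums)
    return 12.0 / (n * k * (k + 1)) * sum_sq - 3 * n * (k + 1)

def friedman_p_value(statistic, k):
    """Upper-tail chi-square probability with k - 1 degrees of freedom."""
    return chi2.sf(statistic, df=k - 1)

# Hypothetical case: 4 participants who all rank 3 conditions identically,
# so the rank sums are 4 * 1, 4 * 2 and 4 * 3.
stat = friedman_statistic([4, 8, 12], n=4)
p = friedman_p_value(stat, k=3)
print(stat, p)
```

When every participant ranks the conditions in the same order, the statistic reaches its maximum of n(k - 1), which is 8 in this example. When all rank sums are equal, it is 0.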

Worked Example

Eight customers taste-test three flavors of a beverage and rate each on a 1-7 scale:

Customer	Flavor A	Flavor B	Flavor C
1	5	3	6
2	4	4	7
3	6	2	5
4	3	5	6
5	5	3	7
6	4	4	5
7	6	1	7
8	5	3	6

Step 1. Rank within each customer:

Customer	Rank A	Rank B	Rank C
1	2	1	3
2	1.5	1.5	3
3	3	1	2
4	1	2	3
5	2	1	3
6	1.5	1.5	3
7	2	1	3
8	2	1	3

Step 2. Sum ranks per condition:

R_A = 15, R_B = 10, R_C = 23

Step 3. Calculate:

χ²_F = [12 / (8 × 3 × 4)] × (15² + 10² + 23²) - 3(8)(4)

χ²_F = [12 / 96] × (225 + 100 + 529) - 96

χ²_F = 0.125 × 854 - 96 = 106.75 - 96 = 10.75

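With df = 2, the critical chi-square at α = 0.05 is 5.99. Since 10.75 > 5.99, we reject the null hypothesis: at least one flavor is rated differently from the others. Because customers 2 and 6 each gave tied ratings, software that applies a tie correction will report a slightly larger statistic (about 11.47); the conclusion is the same.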
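The arithmetic in this example can be checked with a short script. This is a sketch assuming NumPy and SciPy; it uses the uncorrected formula from this guide, so it reproduces the hand calculation rather than a tie-corrected value.

```python
import numpy as np
from scipy.stats import chi2, rankdata

# Ratings from the table above: rows are customers, columns are Flavors A, B, C.
ratings = np.array([
    [5, 3, 6],
    [4, 4, 7],
    [6, 2, 5],
    [3, 5, 6],
    [5, 3, 7],
    [4, 4, 5],
    [6, 1, 7],
    [5, 3, 6],
])
n, k = ratings.shape

# Step 1 and Step 2: within-customer ranks, then rank sums per flavor.
ranks = np.apply_along_axis(rankdata, 1, ratings)
rank_sums = ranks.sum(axis=0)  # R_A, R_B, R_C

# Step 3: uncorrected Friedman statistic, critical value and p-value.
statistic = 12.0 / (n * k * (k + 1)) * np.sum(rank_sums ** 2) - 3 * n * (k + 1)
critical = chi2.ppf(0.95, df=k - 1)
p_value = chi2.sf(statistic, df=k - 1)

print(rank_sums, statistic, critical, p_value)
```

The script should give rank sums of 15, 10, and 23 and a statistic of 10.75, with a p-value a little under 0.005.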

Follow-Up Tests

A significant Friedman test tells you that at least one condition differs, but not which ones. Use pairwise Wilcoxon signed-rank tests with Bonferroni correction to identify specific differences. With three conditions, you'd make 3 comparisons and use α = 0.05/3 ≈ 0.017 for each.
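A minimal sketch of this follow-up procedure, assuming SciPy's `wilcoxon` with its default settings and reusing the taste-test ratings from the worked example. With only eight customers and some tied or zero differences, SciPy may fall back to a normal approximation and emit a warning, so treat the p-values as approximate.

```python
from itertools import combinations
from scipy.stats import wilcoxon

# Ratings per flavor, in customer order (from the worked example).
ratings = {
    "A": [5, 4, 6, 3, 5, 4, 6, 5],
    "B": [3, 4, 2, 5, 3, 4, 1, 3],
    "C": [6, 7, 5, 6, 7, 5, 7, 6],
}

pairs = list(combinations(ratings, 2))  # (A, B), (A, C), (B, C)
alpha = 0.05
adjusted_alpha = alpha / len(pairs)     # Bonferroni: 0.05 / 3

results = {}
for first, second in pairs:
    p = wilcoxon(ratings[first], ratings[second]).pvalue
    results[(first, second)] = (p, p < adjusted_alpha)

for pair, (p, significant) in results.items():
    print(pair, round(float(p), 4), bool(significant))
```

Comparing each raw p-value against the adjusted threshold is equivalent to multiplying each p-value by the number of comparisons and comparing against 0.05.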

Friedman Test vs. Repeated-Measures ANOVA

Feature	Friedman Test	Repeated-Measures ANOVA
Data level	Ordinal or non-normal continuous	Interval/ratio, approximately normal
Uses	Ranks	Raw scores
Sphericity assumption	Not required	Required (or corrected)
Power with normal data	Lower	Higher
Sample size needs	Smaller okay	Larger preferred
Effect size	Kendall's W	Partial eta-squared

If your data is continuous and reasonably normal, repeated-measures ANOVA is more powerful. If the data is ordinal, heavily skewed, or comes from small samples where normality is questionable, the Friedman test is the safer choice.

When to Use the Friedman Test

Taste tests or concept evaluations where the same respondents rate multiple options on ordinal scales
Before-during-after designs with three or more time points and non-normal data
Ranking tasks where participants rank items rather than rating them on a continuous scale
Small sample sizes where you can't confidently assume normality
Likert-scale data when you're treating the scale as ordinal rather than interval

Common Mistakes to Avoid

Using the Friedman test for independent groups: it's for related (repeated) measures only; use Kruskal-Wallis for independent groups
Skipping post-hoc comparisons after a significant result: the omnibus test doesn't tell you where the differences are
Ignoring ties: when many scores are tied, the basic formula needs a correction factor for ties; most statistical software handles this automatically
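To make the tie correction concrete, here is a sketch that divides the basic statistic by the factor 1 - Σ(t³ - t) / (nk(k² - 1)), where t is the size of each group of tied scores within a participant. The helper name `tie_corrected_friedman` is illustrative. SciPy's `friedmanchisquare` applies a correction of this form, so the two values should agree on the worked-example data.

```python
import numpy as np
from scipy.stats import friedmanchisquare, rankdata

def tie_corrected_friedman(scores):
    """Return (uncorrected, tie-corrected) Friedman statistics."""
    scores = np.asarray(scores, dtype=float)
    n, k = scores.shape
    ranks = np.apply_along_axis(rankdata, 1, scores)
    rank_sums = ranks.sum(axis=0)
    uncorrected = 12.0 / (n * k * (k + 1)) * np.sum(rank_sums ** 2) - 3 * n * (k + 1)

    # Sum t^3 - t over every group of t tied scores within a participant.
    # Untied scores have t = 1 and contribute nothing.
    tie_term = 0.0
    for row in scores:
        _, counts = np.unique(row, return_counts=True)
        tie_term += np.sum(counts ** 3 - counts)
    correction = 1.0 - tie_term / (n * k * (k ** 2 - 1))
    return uncorrected, uncorrected / correction

# Taste-test ratings from the worked example (customers 2 and 6 have ties).
ratings = np.array([
    [5, 3, 6], [4, 4, 7], [6, 2, 5], [3, 5, 6],
    [5, 3, 7], [4, 4, 5], [6, 1, 7], [5, 3, 6],
])

uncorrected, corrected = tie_corrected_friedman(ratings)
scipy_stat = friedmanchisquare(*ratings.T).statistic
print(uncorrected, corrected, scipy_stat)
```

The correction factor is always at most 1, so correcting for ties can only increase the statistic. The sketch does not guard against the degenerate case where every participant gives identical scores to all conditions, which would make the factor zero.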

How Quali-Fi Supports Nonparametric Analysis

Quali-Fi's Research plan ($1,061/month) includes nonparametric testing options for repeated-measures designs, automatically selecting the appropriate test based on your data structure and measurement level. The platform handles tied ranks and provides follow-up pairwise comparisons with Bonferroni correction.

Frequently Asked Questions

How many participants do I need for the Friedman test?

With three conditions, a minimum of about 10-12 participants is often cited, though more is always better. For small samples (n < 10), use exact p-values, which most statistical software can provide, rather than the chi-square approximation.
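If your software does not report an exact p-value, one alternative is a Monte Carlo permutation test, which approximates the exact distribution by repeatedly shuffling each participant's ranks. The sketch below is an illustration under that assumption, not the exact-table method itself; the function names, the resample count, and the seed are arbitrary choices.

```python
import numpy as np
from scipy.stats import rankdata

def friedman_from_ranks(ranks):
    """Uncorrected Friedman statistic from an (n, k) array of within-row ranks."""
    n, k = ranks.shape
    rank_sums = ranks.sum(axis=0)
    return 12.0 / (n * k * (k + 1)) * np.sum(rank_sums ** 2) - 3 * n * (k + 1)

def permutation_p_value(scores, n_resamples=10000, seed=0):
    """Monte Carlo estimate of the permutation p-value."""
    rng = np.random.default_rng(seed)
    ranks = np.apply_along_axis(rankdata, 1, np.asarray(scores, dtype=float))
    observed = friedman_from_ranks(ranks)
    hits = 0
    for _ in range(n_resamples):
        # Under the null, each participant's ranks are exchangeable
        # across conditions, so shuffle every row independently.
        shuffled = np.array([rng.permutation(row) for row in ranks])
        if friedman_from_ranks(shuffled) >= observed - 1e-12:
            hits += 1
    # Add-one adjustment keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)

# Taste-test ratings from the worked example.
ratings = np.array([
    [5, 3, 6], [4, 4, 7], [6, 2, 5], [3, 5, 6],
    [5, 3, 7], [4, 4, 5], [6, 1, 7], [5, 3, 6],
])

p_perm = permutation_p_value(ratings)
print(p_perm)
```

Because the estimate depends on random resampling, it will vary slightly with the seed and the number of resamples.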

Can the Friedman test handle missing data?

Standard implementations require complete data: each participant must have scores for all conditions. If a participant is missing one condition, they're typically excluded entirely. Some software offers adjustments for missing data, but the safest approach is to address missingness before running the test.
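Excluding incomplete participants (listwise deletion) can be sketched as follows. The data are hypothetical, with `np.nan` standing in for a missing response. Whether deletion is appropriate depends on why the data are missing, which this snippet does not address.

```python
import numpy as np
from scipy.stats import friedmanchisquare

# Hypothetical ratings; np.nan marks a missing response.
scores = np.array([
    [5, 3, 6],
    [4, np.nan, 7],
    [6, 2, 5],
    [3, 5, 6],
    [5, 3, np.nan],
    [4, 4, 5],
])

# Keep only participants with a score in every condition.
complete = scores[~np.isnan(scores).any(axis=1)]
n_dropped = len(scores) - len(complete)

# Each column of the complete data is passed as one condition.
result = friedmanchisquare(*complete.T)
print(n_dropped, result.statistic, result.pvalue)
```

Reporting `n_dropped` alongside the test result makes the loss of sample size visible to readers.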

What's the effect size for the Friedman test?

Kendall's W (coefficient of concordance) is the standard effect size. It ranges from 0 (no agreement in rankings) to 1 (perfect agreement). W = χ²_F / (n(k-1)). By a common rule of thumb, values around 0.1 are small, 0.3 medium, and 0.5+ large.
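As a quick illustration, the formula can be applied to the taste-test example from this guide (χ²_F = 10.75, n = 8, k = 3). The function name `kendalls_w` is illustrative.

```python
def kendalls_w(chi_square, n, k):
    """Kendall's W from a Friedman chi-square, n participants, k conditions."""
    return chi_square / (n * (k - 1))

# Values from the worked example: 10.75 / (8 * 2).
w = kendalls_w(10.75, n=8, k=3)
print(round(w, 3))
```

This gives W of about 0.67, which falls in the large range under the thresholds above.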

Related Guides

Wilcoxon Signed-Rank Test: Paired Nonparametric Comparison

Kruskal-Wallis Test: Nonparametric One-Way Comparison

Bonferroni Correction: Formula, Examples, and When to Use It

Post-Hoc Tests: Tukey, Bonferroni, and Scheffé Compared

Mann-Whitney U Test: Independent Nonparametric Comparison
