Interquartile Range (IQR): What It Is, How to Calculate It, and Outlier Detection

Learn what the interquartile range is, how to calculate it, and how the 1.5 IQR rule identifies outliers in survey and research data.

What Is the Interquartile Range?

The interquartile range (IQR) is the difference between the third quartile (Q3) and the first quartile (Q1) of a dataset. It measures the spread of the middle 50% of observations, ignoring the top and bottom 25%. Because it excludes extreme values, the IQR is a robust measure of variability that isn't distorted by outliers, unlike the range or standard deviation. If Q1 is 30 and Q3 is 55, the IQR is 25, meaning the central half of your data spans 25 units. The IQR is the width of the box in a box plot and the foundation of one of the most common outlier detection rules in applied statistics.

Why the Interquartile Range Matters

Standard deviation is easiest to interpret when your data is roughly symmetric and normally distributed. When data is skewed or contains outliers, the standard deviation gets inflated and gives a misleading picture of typical variability. The IQR doesn't have this problem: it describes the spread of the middle portion of the data regardless of what's happening at the extremes. In market research, where survey data often contains straightliners, speeders, or genuinely extreme respondents, the IQR provides a more honest measure of how spread out the core responses really are.

How the Interquartile Range Works

The Formula

IQR = Q3 - Q1

That's it. Calculate Q1 (25th percentile) and Q3 (75th percentile), then subtract.

Worked Example

Customer satisfaction scores (1-100) from 16 respondents, sorted:

22, 35, 40, 45, 48, 52, 55, 58, 60, 63, 65, 68, 72, 75, 80, 95

Q1: Median of the lower half (22, 35, 40, 45, 48, 52, 55, 58) = (45 + 48) / 2 = 46.5

Q3: Median of the upper half (60, 63, 65, 68, 72, 75, 80, 95) = (68 + 72) / 2 = 70

IQR = 70 - 46.5 = 23.5

The middle 50% of satisfaction scores spans 23.5 points. Compare this to the full range (95 - 22 = 73): the IQR tells you that the bulk of the data is far more concentrated than the extremes suggest.
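The worked example above can be reproduced with a short Python sketch. It assumes the median-of-halves method used in this guide; the function name `quartiles` is illustrative, not part of any library. Other quartile conventions, such as the interpolation defaults in many statistics packages, can return slightly different values for the same data.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q3) using the median-of-halves method.

    For an odd number of values the overall median is left out of both halves.
    """
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values to compute quartiles")
    half = len(data) // 2
    lower = data[:half]    # values below the median position
    upper = data[-half:]   # values above the median position
    return median(lower), median(upper)


scores = [22, 35, 40, 45, 48, 52, 55, 58, 60, 63, 65, 68, 72, 75, 80, 95]

q1, q3 = quartiles(scores)
iqr = q3 - q1
full_range = max(scores) - min(scores)

print(f"Q1={q1}, Q3={q3}, IQR={iqr}, range={full_range}")
```

With the 16 scores above, this gives Q1 = 46.5, Q3 = 70, and an IQR of 23.5 against a full range of 73.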

The 1.5 IQR Rule for Outlier Detection

The most widely used application of the IQR is identifying outliers. The rule defines boundaries beyond which values are considered unusually extreme:

Lower boundary = Q1 - 1.5 × IQR

Upper boundary = Q3 + 1.5 × IQR

Any value below the lower boundary or above the upper boundary is flagged as a potential outlier.

Using our example:

Lower boundary = 46.5 - 1.5 × 23.5 = 46.5 - 35.25 = 11.25

Upper boundary = 70 + 1.5 × 23.5 = 70 + 35.25 = 105.25

Since all values fall between 11.25 and 105.25, there are no outliers by this rule. But if one respondent had scored 5 (below 11.25), it would be flagged.

Some analysts use a wider threshold of 3 × IQR, which flags only the most extreme outliers:

Lower extreme boundary = 46.5 - 3 × 23.5 = -24 (below the scale minimum, so irrelevant here)

Upper extreme boundary = 70 + 3 × 23.5 = 140.5
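As a minimal sketch, both sets of boundaries can be computed with one function that takes the multiplier as a parameter. The names `iqr_fences` and `is_flagged` are hypothetical, and the quartiles again use the median-of-halves method from the worked example.

```python
from statistics import median


def iqr_fences(values, k=1.5):
    """Return (lower, upper) boundaries at k x IQR beyond Q1 and Q3."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values to compute fences")
    half = len(data) // 2
    q1 = median(data[:half])    # median of the lower half
    q3 = median(data[-half:])   # median of the upper half
    spread = q3 - q1
    return q1 - k * spread, q3 + k * spread


def is_flagged(value, fences):
    """True if value lies outside the (lower, upper) boundaries."""
    lower, upper = fences
    return value < lower or value > upper


scores = [22, 35, 40, 45, 48, 52, 55, 58, 60, 63, 65, 68, 72, 75, 80, 95]

standard = iqr_fences(scores)         # 1.5 x IQR rule
extreme = iqr_fences(scores, k=3.0)   # 3 x IQR rule

flagged = [s for s in scores if is_flagged(s, standard)]
print(standard, extreme, flagged)
print(is_flagged(5, standard))  # the hypothetical score of 5 from the text
```

Note that the hypothetical score of 5 is checked against boundaries computed from the original 16 scores, as in the text. Adding it to the dataset first would shift the quartiles slightly.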

Why 1.5?

The 1.5 multiplier was proposed by John Tukey, who developed the box plot. For normally distributed data, the 1.5 × IQR boundaries capture approximately 99.3% of observations, meaning roughly 0.7% of values would be flagged as outliers even in a perfectly normal distribution. It's a practical convention that balances sensitivity (catching genuine outliers) with specificity (not flagging too many normal observations).
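The 99.3% figure can be checked numerically. The sketch below uses `statistics.NormalDist` from the Python standard library and assumes a standard normal distribution (mean 0, standard deviation 1); the result is the same for any normal distribution because the boundaries scale with it.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0.0, sigma=1.0)

# Theoretical quartiles of the standard normal distribution
q1 = std_normal.inv_cdf(0.25)
q3 = std_normal.inv_cdf(0.75)
iqr = q3 - q1

# 1.5 x IQR boundaries
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# Share of a normal population that falls inside the boundaries
coverage = std_normal.cdf(upper) - std_normal.cdf(lower)

print(f"boundaries: {lower:.3f} to {upper:.3f}")
print(f"share inside boundaries: {coverage:.4f}")
print(f"share flagged: {1 - coverage:.4f}")
```

The boundaries land at roughly ±2.7 standard deviations, which is where the approximately 0.7% of flagged values comes from.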

IQR vs. Standard Deviation

Aspect | IQR | Standard Deviation
What it measures | Spread of the middle 50% | Average distance from the mean
Sensitivity to outliers | Not affected | Heavily affected
Assumptions | None | Assumes symmetry/normality for interpretation
Best for | Skewed data, data with outliers | Symmetric, normally distributed data
Used in | Box plots, outlier detection | Confidence intervals, hypothesis tests

For a normal distribution, the IQR is about 1.35 standard deviations (more precisely, IQR ≈ 1.349σ). The more skewed or outlier-prone the data, the more the two measures diverge.
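To illustrate the difference in outlier sensitivity, the sketch below compares both measures on the satisfaction scores from the worked example and on a copy in which the top score is mistyped as 950. The mistyped value is a made-up data-entry error for illustration, and `iqr` is an illustrative helper using the median-of-halves method.

```python
from statistics import median, stdev


def iqr(values):
    """IQR using the median-of-halves method."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values to compute an IQR")
    half = len(data) // 2
    return median(data[-half:]) - median(data[:half])


clean = [22, 35, 40, 45, 48, 52, 55, 58, 60, 63, 65, 68, 72, 75, 80, 95]
dirty = clean[:-1] + [950]   # hypothetical typo: 95 entered as 950

print(f"clean: IQR={iqr(clean)}, SD={stdev(clean):.1f}")
print(f"dirty: IQR={iqr(dirty)}, SD={stdev(dirty):.1f}")
```

The IQR is identical for both lists because the mistyped value stays in the top quarter of the data, while the sample standard deviation grows by more than a factor of ten.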

IQR in Practice: Survey Data Quality

In survey research, the IQR-based outlier rule is commonly applied to:

  • Completion time: Respondents who finish much faster (below Q1 - 1.5 × IQR) may be speeding through without reading
  • Straight-line detection: After computing response variance per respondent, those with variance below Q1 - 1.5 × IQR may be straightlining
  • Open-end length: Unusually short open-ended responses (below the lower boundary) may indicate low effort

These applications treat the IQR rule as a screening tool, not an automatic removal criterion. Flagged cases should be reviewed manually before being excluded.
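A speeder screen along these lines might look like the following sketch. The respondent IDs and completion times are invented for illustration, `lower_fence` is a hypothetical helper, and the quartiles use the median-of-halves method from the worked example. The output is a review list, not a removal list.

```python
from statistics import median


def lower_fence(values, k=1.5):
    """Return Q1 - k x IQR, using median-of-halves quartiles."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values to compute a fence")
    half = len(data) // 2
    q1 = median(data[:half])
    q3 = median(data[-half:])
    return q1 - k * (q3 - q1)


# Hypothetical completion times in seconds, keyed by respondent ID
completion_seconds = {
    "r01": 45, "r02": 290, "r03": 305, "r04": 310,
    "r05": 320, "r06": 330, "r07": 335, "r08": 340,
    "r09": 350, "r10": 360, "r11": 365, "r12": 380,
}

fence = lower_fence(completion_seconds.values())

# Flag for manual review; nothing is dropped automatically
needs_review = sorted(
    rid for rid, secs in completion_seconds.items() if secs < fence
)

print(f"lower fence: {fence} seconds")
print(f"flag for review: {needs_review}")
```

In this made-up sample the lower boundary is 236.25 seconds, so only the 45-second completion is flagged. With strongly right-skewed completion times the lower boundary can fall below zero, in which case this screen flags nobody and a different cutoff is needed.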

When to Use the Interquartile Range

  • Describing spread in skewed or non-normal distributions where standard deviation is misleading
  • Detecting outliers using the 1.5 × IQR rule as a data quality check
  • Comparing variability across groups using box plots
  • Survey data cleaning to identify speeders, straightliners, and low-effort respondents
  • Reporting results to stakeholders who need to understand data spread without technical statistical knowledge

Common Mistakes to Avoid

  • Automatically removing all IQR-flagged outliers: the 1.5 × IQR rule identifies unusual values, not necessarily invalid ones; a genuinely high-satisfaction customer scoring in the 99th percentile isn't an error
  • Using IQR with categorical or nominal data: the IQR requires at least ordinal data and is most meaningful with continuous data
  • Reporting only the IQR without context: an IQR of 15 on a 100-point scale means something very different from an IQR of 15 on a 20-point scale; always relate it to the measurement range

How Quali-Fi Supports IQR-Based Analysis

Quali-Fi's data quality module uses IQR-based rules to automatically flag potential speeders and outlier respondents during data collection. The platform marks flagged cases for review rather than removing them automatically, letting you make informed decisions about data cleaning before analysis.

Frequently Asked Questions

Can the IQR be zero?

Yes, if Q1 and Q3 are the same value, meaning at least 50% of the data has the same value. This can happen with discrete data that has a limited number of response options (like a 3-point scale where most people choose the middle option). An IQR of zero makes the outlier rule useless: both boundaries collapse to that single value, so every response that differs from it is flagged.
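A small sketch of this degenerate case, using made-up responses on a 3-point scale and the median-of-halves quartiles from the worked example:

```python
from statistics import median

# Hypothetical responses on a 3-point scale; most people chose 2
responses = [1, 2, 2, 2, 2, 2, 2, 2, 2, 3]

data = sorted(responses)
half = len(data) // 2
q1 = median(data[:half])    # median of the lower half
q3 = median(data[-half:])   # median of the upper half
iqr = q3 - q1

lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
flagged = [r for r in responses if r < lower or r > upper]

print(f"IQR={iqr}, boundaries=({lower}, {upper}), flagged={flagged}")
```

Both boundaries equal 2, so the perfectly valid answers 1 and 3 are flagged. On short scales like this, a frequency table is usually more informative than the IQR rule.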

Is the IQR the same as the semi-interquartile range?

No. The semi-interquartile range (SIQR) is half the IQR: SIQR = IQR / 2. Some older textbooks use the SIQR as a measure of spread, but the full IQR is far more common in modern practice.

Why not just use the full range instead of the IQR?

The range (maximum minus minimum) is determined entirely by the two most extreme values, so a single outlier changes it dramatically. The IQR is resistant to this because it depends only on the middle 50% of the data. For any dataset with potential outliers, which is nearly all real-world data, the IQR gives a more stable and representative measure of spread.
