The interquartile range and the standard deviation are two ways to measure the spread of values in a dataset.
This tutorial provides a brief explanation of each metric along with the similarities and differences between the two.
Interquartile Range
The interquartile range (IQR) of a dataset is the difference between the first quartile (the 25th percentile) and the third quartile (the 75th percentile). It measures the spread of the middle 50% of values.
IQR = Q3 – Q1
For example, suppose we have the following dataset:
Dataset: 1, 4, 8, 11, 13, 17, 19, 19, 20, 23, 24, 24, 25, 28, 29, 31, 32
According to the Interquartile Range Calculator, the quartiles and the interquartile range (IQR) for this dataset are as follows. Note that other quartile conventions can give slightly different values:
- Q1: 12
- Q3: 26.5
- IQR = Q3 – Q1 = 14.5
This tells us that the middle 50% of values in the dataset have a spread of 14.5.
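The calculation above can be reproduced with a short Python sketch. It assumes one common quartile convention: Q1 and Q3 are the medians of the lower and upper halves of the sorted data, with the overall median left out when the number of values is odd. The function name `quartiles` is illustrative only, and this is not necessarily the exact method the calculator uses, although it yields the same values for this dataset.

```python
from statistics import median

def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumed convention: for an odd number of values, the overall
    median is excluded from both halves.
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    lower = ordered[:half]    # values below the median position
    upper = ordered[-half:]   # values above the median position
    return median(lower), median(upper)

data = [1, 4, 8, 11, 13, 17, 19, 19, 20, 23, 24, 24, 25, 28, 29, 31, 32]
q1, q3 = quartiles(data)
iqr = q3 - q1
print(q1, q3, iqr)  # 12.0 26.5 14.5
```

Libraries that interpolate percentiles differently may report a different Q1, Q3, and IQR for the same data, so it is worth checking which convention a tool uses before comparing results.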
Standard Deviation
The standard deviation of a dataset is a way to measure the typical deviation of individual values from the mean value. It is calculated as:
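$$s = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$$
where $x_i$ is the i-th value in the dataset, $\bar{x}$ is the sample mean, and $n$ is the sample size.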
For example, consider the same dataset as before:
Dataset: 1, 4, 8, 11, 13, 17, 19, 19, 20, 23, 24, 24, 25, 28, 29, 31, 32
We can use a calculator to find that the sample standard deviation of this dataset is 9.25. This gives us an idea of how far the typical value lies from the mean.
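As a rough illustration, the sample standard deviation formula can be applied step by step in Python. The variable names are illustrative, and the final comparison against the standard library's `statistics.stdev` is only a sanity check on the manual calculation.

```python
from math import sqrt
from statistics import stdev

data = [1, 4, 8, 11, 13, 17, 19, 19, 20, 23, 24, 24, 25, 28, 29, 31, 32]

n = len(data)
mean = sum(data) / n

# Sum of squared deviations from the mean, divided by n - 1
# (the sample version of the formula), then square-rooted.
squared_deviations = [(x - mean) ** 2 for x in data]
s = sqrt(sum(squared_deviations) / (n - 1))

print(round(s, 2))            # 9.25
print(round(stdev(data), 2))  # 9.25, same result from the standard library
```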
Similarities & Differences
The interquartile range and standard deviation share the following similarity:
- Both metrics measure the spread of values in a dataset.
However, the interquartile range and standard deviation have the following key difference:
- The interquartile range (IQR) is resistant to extreme outliers. For example, an extremely small or extremely large value in a dataset will have little or no effect on the IQR because the IQR depends only on the values at the 25th percentile and 75th percentile of the dataset.
- The standard deviation is affected by extreme outliers. For example, an extremely large value in a dataset will cause the standard deviation to be much larger since the standard deviation uses every single value in a dataset in its formula.
When to Use Each
You should use the interquartile range to measure the spread of values in a dataset when there are extreme outliers present.
Conversely, you should use the standard deviation to measure the spread of values when there are no extreme outliers present.
To illustrate why, consider the following dataset:
Dataset: 1, 4, 8, 11, 13, 17, 19, 19, 20, 23, 24, 24, 25, 28, 29, 31, 32
Earlier in the article we calculated the following metrics for this dataset:
- IQR: 14.5
- Standard Deviation: 9.25
Now suppose the dataset contained one extreme outlier:
Dataset: 1, 4, 8, 11, 13, 17, 19, 19, 20, 23, 24, 24, 25, 28, 29, 31, 32, 378
We could use a calculator to find the following metrics for this dataset:
- IQR: 15
- Standard Deviation: 85.02
Notice that the interquartile range barely changes when an outlier is present (from 14.5 to 15), while the standard deviation increases from 9.25 all the way to 85.02.
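The comparison can be sketched in Python as follows. The helper `iqr` is a hypothetical name and assumes the median-of-halves quartile convention (overall median excluded when the count is odd), which reproduces the figures quoted in this article. Other quartile conventions would give a slightly different IQR but the same overall picture.

```python
from statistics import median, stdev

def iqr(values):
    """IQR using medians of the lower and upper halves (assumed convention)."""
    ordered = sorted(values)
    half = len(ordered) // 2
    return median(ordered[-half:]) - median(ordered[:half])

original = [1, 4, 8, 11, 13, 17, 19, 19, 20, 23, 24, 24, 25, 28, 29, 31, 32]
with_outlier = original + [378]

# Print the IQR and the sample standard deviation for each dataset.
for label, values in (("original", original), ("with outlier", with_outlier)):
    print(label, "IQR:", iqr(values), "SD:", round(stdev(values), 2))
```

Running this shows the IQR moving from 14.5 to 15 while the standard deviation jumps from 9.25 to 85.02, because the single value 378 contributes a very large squared deviation to the standard deviation but only shifts the quartile positions slightly.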