Chapter 4 Measures of Dispersion
4.1 Range
Providing the range for a set of values is so easy that most people don’t even realize it is an actual statistical measure of dispersion. If you have ever said something to the effect of “I have friends whose ages vary between seventeen and twenty-seven” or “my scores on these exams vary from 25/100 to 95/100”, you have effectively been providing the range of your friends’ ages or the range of your exam scores.
To give you the more technical definition, the range of a variable is the difference between its highest and lowest values. That is, to get the range, we simply subtract the lowest value from the highest value:
range = highest value - lowest value
In the two quick examples above, the range of your friends’ ages would be (27-17=) 10 years, and the range of your exam scores would be (95-25=) 70 points.
I’ll use an older, familiar example for the longer work-through, below.
Example 4.1 The Range for Textbook Prices Paid in One Semester
Recall Example 3.5 from Section 3.4 (https://pressbooks.bccampus.ca/simplestats/chapter/3-4-mean/) where we calculated the mean price of textbooks we imagined you paid in a particular semester. The books’ prices were $120, $230, $300, $65, and $30. The cheapest book (i.e., the lowest value) was $30 and the most expensive book (i.e., the highest value) was $300. Thus:
range = $300 - $30 = $270
That is, now we have found that the range of textbook prices for that semester was $270, with prices you paid ranging between $30 and $300.
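If you would like to check this arithmetic on a computer, here is a minimal Python sketch of the same calculation. The function name `value_range` is my own label for illustration (it is not a built-in); it relies only on Python's built-in `max` and `min`.

```python
def value_range(values):
    """Return the range: the highest value minus the lowest value."""
    if not values:
        raise ValueError("the range is undefined for an empty set of values")
    return max(values) - min(values)


# Textbook prices from Example 4.1, in dollars
prices = [120, 230, 300, 65, 30]

print(min(prices))          # lowest value: 30
print(max(prices))          # highest value: 300
print(value_range(prices))  # range: 270
```

The same function reproduces the quick examples from earlier: `value_range([17, 27])` gives 10 and `value_range([25, 95])` gives 70.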
One thing to note here is that in order to have a difference, i.e., in order to be able to do a mathematical operation like subtraction, we need to have numerical values.
In truth, as you are about to see, all measures of dispersion are obtained through mathematical operations and, as such, require numerical values. Since interval/ratio variables are the only variables which contain actual numerical values, all dispersion measures (including the range) are only applicable to interval/ratio variables.[1]
A final point about the range is that it is a rather unsophisticated measure of dispersion, as you may have already noticed. (Hence the very short section about it.) By taking into account solely the highest and the lowest values, the range effectively ignores all other values, be they more clustered or more spread out.
After all, if you recall from Section 3.6 (https://pressbooks.bccampus.ca/simplestats/chapter/3-6-outliers/), outliers do exist. In the presence of outliers, the range can end up being quite large, even if the majority of the observations are closely clustered. Therefore, we’d better find a dispersion measure which takes into account more than just the two extremes of a variable’s distribution.
The interquartile range is one such measure which provides a bit more information about the variability of the distribution. Alas, the cost of this information is, of course, an increased complexity in obtaining that measure. (An ominous foreshadowing for what’s to come!)
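To see how much a single outlier can inflate the range, here is a small Python sketch. The exam scores below are hypothetical, invented purely for illustration; they do not come from any earlier example.

```python
# Hypothetical exam scores, invented purely for illustration
clustered = [48, 49, 50, 51, 52]
with_outlier = clustered + [95]  # the same scores plus one unusually high score

range_clustered = max(clustered) - min(clustered)
range_with_outlier = max(with_outlier) - min(with_outlier)

print(range_clustered)     # 4
print(range_with_outlier)  # 47
```

One extra score moves the range from 4 to 47 points, even though five of the six scores still sit within a few points of each other. The range alone cannot tell these two situations apart from ones where the scores really are spread out.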
[1] Some people find it useful to provide something like a range for ordinal variables: after all, they do have a "lowest" category and a "highest" category. While technically not a statistical measure of dispersion (as no difference can be computed), it can still be useful to add a description about the categories ranging between the lowest and highest points, e.g., "respondents' agreement with the statement varies between 'strongly disagree' and 'strongly agree'". Considering that the categories of nominal variables have no inherent order, nothing of the sort can be applied to them. All in all, providing a qualitative description of dispersion for ordinal variables (like the agreement one I just mentioned) is optional and, strictly speaking, not a statistical measure.