Chapter 2 What Data Looks Like and Summarizing Data

2.3.3 Summing Up: Adding Cumulative Percentages

The last thing to add to our frequency table is there only for convenience’s sake. It can be useful to know, for example, what percentage of the 21 people in our original group do not have graduate degrees, or what percentage of people have not gone to university, etc. Of course, in our specific educational attainment example it would be easy to do the quick-and-dirty calculation of adding 11.1 percent (those with Master’s degrees) to 5.6 percent (those with PhDs), thus finding that 16.7 percent of our respondents have graduate degrees; or adding 5.6 percent (those without a degree) to 33.3 percent (those with Secondary/High School) and finding that 38.9 percent of our respondents have not gone to university. Doing such calculations every time a new question comes up gets tedious at best, however, and at worst it produces incorrect results (hence the “quick-and-dirty” appellation).

 

Let’s then improve on our frequency table-in-progress a final time, shall we? The version below is the final version, ta-da!

 

Example 2.2 (E) Frequency Table for Educational Attainment

Table 2.4 Educational Attainment by Frequency, Percent, Valid Percent and Cumulative Percent

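| | Degree | Frequency | Percent | Valid Percent | Cumulative Percent |
|---|---|---|---|---|---|
| Valid | No degree | 1 | 4.7 | 5.6 | 5.6 |
| | Secondary/High School | 6 | 28.6 | 33.3 | 38.9 |
| | Associate’s | 3 | 14.3 | 16.7 | 55.6 |
| | Bachelor’s | 5 | 23.8 | 27.8 | 83.3 |
| | Master’s | 2 | 9.5 | 11.1 | 94.4 |
| | PhD | 1 | 4.7 | 5.6 | 100.0 |
| | Total Valid | 18 | 85.6 | 100.0 | |
| Missing | Didn’t answer | 3 | 14.3 | | |
| | Total Missing | 3 | 14.3 | | |
| | TOTAL | 21 | 100.0 | | |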

 

The final column I have added in our Table 2.4 is called Cumulative Percent. What it does is keep a sort of “running total”: it adds the second category’s frequency to the first and reports the first two categories as a percentage of the valid total (18 here); it adds the third category’s frequency to the total of the first two and reports the first three categories as a percentage of the valid total, etc. In effect, it adds each subsequent category to the total of all preceding ones, one by one, until all categories are added together.
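To make the running total concrete, here is a minimal Python sketch that rebuilds the Cumulative Percent column from the valid frequencies in Table 2.4. The variable names are illustrative assumptions, and Python is only one of many tools that could do this. Rounding is applied once, at the end, for display only.

```python
from itertools import accumulate

# Valid frequencies from Table 2.4, in category order
frequencies = {
    "No degree": 1,
    "Secondary/High School": 6,
    "Associate's": 3,
    "Bachelor's": 5,
    "Master's": 2,
    "PhD": 1,
}

n_valid = sum(frequencies.values())  # 18 valid responses

# Running total of the frequencies: 1, 7, 10, 15, 17, 18
running_totals = list(accumulate(frequencies.values()))

# Each running total as a percentage of the valid total,
# rounded to one decimal place for display only
cumulative_percent = {
    degree: round(total / n_valid * 100, 1)
    for degree, total in zip(frequencies, running_totals)
}

for degree, pct in cumulative_percent.items():
    print(f"{degree:<22} {pct:5.1f}")
```

The printed values should match the Cumulative Percent column of Table 2.4, ending at 100.0 for the last category.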

 

Note, however, that you should not add the percentages in the Valid Percent column to obtain cumulative percentages. Despite the quick-and-dirty trick I did before, I actually calculated the cumulative percentages based on the added categories’ frequencies, and so should you, if you have to create a frequency table from scratch.

 

Like this: there is 1 person without a degree and 6 people with secondary/high school degrees, or 7 people combined. Therefore, the cumulative percent of these two categories is obtained thus:

 

    \[\frac{f_1+f_2}{N}(100)=\frac{1+6}{18}(100)=\frac{7}{18}(100)\approx 0.389(100)=38.9\%\]

 

and not by adding 5.6 percent (the person with no degree) to 33.3 percent (the ones with secondary/high school degrees), even if, in this case, both produce the same result, 38.9 percent.
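The same calculation can be written for any category, not just the second one. As a sketch, with the index \(k\) and the symbol \(c_k\) introduced here only for illustration (they are assumptions, not notation used elsewhere in this section), the cumulative percent up to and including the \(k\)-th category is:

    \[c_k=\frac{f_1+f_2+\dots+f_k}{N}(100)\]

For the third category in Table 2.4 (Associate’s), for instance, this gives:

    \[c_3=\frac{1+6+3}{18}(100)=\frac{10}{18}(100)\approx 0.556(100)=55.6\%\]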

 

The reason why we need to add the original frequencies and not the valid percentages themselves is rounding. The percentages reported in the frequency table are rounded to one digit after the decimal point; adding rounded numbers inevitably adds imprecision to the result, which, depending on the situation, might end up being crucial. In our two-category example, it makes no difference, but do note that the percentages reported in the Percent column actually only add up to 99.9 percent, not 100 percent; similarly, the percentages reported in the Valid Percent column actually add up to 100.1 percent rather than 100 percent. These differences, as negligible as they seem when working with a variable with few categories like the one here, can add up and become more significant in variables with numerous categories (like interval/ratio variables, for example).
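The effect of rounding can be checked directly. The Python sketch below, with illustrative variable names that are assumptions of this example, compares two ways of building cumulative percentages from the valid frequencies in Table 2.4: summing the already-rounded Valid Percent values, and summing the frequencies first and rounding once at the end.

```python
from itertools import accumulate

# Valid frequencies from Table 2.4, in category order
valid_freq = [1, 6, 3, 5, 2, 1]
n_valid = sum(valid_freq)  # 18

# Valid Percent column as printed: each value rounded to one decimal place
rounded_valid_pct = [round(f / n_valid * 100, 1) for f in valid_freq]

# Quick-and-dirty: running sum of the already-rounded percentages
from_rounded = [round(s, 1) for s in accumulate(rounded_valid_pct)]

# Recommended: running sum of the frequencies, rounded once at the end
from_freq = [round(s / n_valid * 100, 1) for s in accumulate(valid_freq)]

print(from_rounded)  # [5.6, 38.9, 55.6, 83.4, 94.5, 100.1]
print(from_freq)     # [5.6, 38.9, 55.6, 83.3, 94.4, 100.0]
```

The two approaches agree for the first three categories and then drift apart by a tenth of a percentage point, with the rounded sums ending at 100.1 percent instead of 100 percent.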

 

You can see examples of real-data frequency tables in the next subsection.

License

Simple Stats Tools Copyright © by Mariana Gatzeva. All Rights Reserved.
