Chapter 2 What Data Looks Like and Summarizing Data
2.3 Frequency Tables
As usual, let’s start ground-up with an example and work our way up to the concept under study. Consider the following raw (unorganized) data.
Example 2.2 (A) Hypothetical Raw Data on Educational Attainment
Imagine that a group of 21 people were asked about the highest educational degree they have attained. These are their responses:
Secondary/High School | Bachelor’s | Secondary/High School | No Degree | Bachelor’s | Didn’t answer | Master’s |
Associate’s | Master’s | Secondary/High School | Bachelor’s | Secondary/High School | Secondary/High School | Didn’t answer |
Didn’t answer | Bachelor’s | Secondary/High School | PhD | Bachelor’s | Associate’s | Associate’s |
What can we glean from this presentation of the information? Can we easily see which educational degree is the most frequently obtained in the group? How many people hold each degree? What fraction/proportion of the total does each degree represent?
Of course, we could always count, but what if I had asked you to imagine a group of 36 people? Of 72? Or 200? Or 2,000? Or more? Are you still going to painstakingly count the different responses?
You may be surprised, but the answer is “yes, if we had to”. In the past, researchers did exactly that, a lot. Nowadays, of course, we have computers to do it for us. SPSS can easily summarize this data, but to understand the process better, we’ll start from scratch.
The most obvious way we can organize the raw data above into something less chaotic is the following:
Example 2.2 (B) Hypothetical Data on Educational Attainment, Organized
Table 2.1 Educational Attainment by Frequency
Degree | Count (a.k.a. frequency) |
No degree | 1 |
Secondary/High School | 6 |
Associate’s | 3 |
Bachelor’s | 5 |
Master’s | 2 |
PhD | 1 |
Didn’t answer | 3 |
TOTAL | 21 |
In the most basic sense, this is a frequency table. It lists the different categories of a variable along with their observed count, a.k.a. frequency. That is, we essentially count how many times any given category appears, i.e., we count how frequent a response is among the respondents, and then indicate the number for each category/response. Frequency is usually denoted by f in statistical notation.
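To see that the counting itself is purely mechanical, here is a minimal sketch of the same tally in Python. Python is an assumption made only for illustration (this text works with SPSS), and the names `responses`, `categories`, and `frequency` are hypothetical. The list simply retypes the 21 responses from Example 2.2 (A), and the row order is set by hand to match Table 2.1.

```python
from collections import Counter

# The 21 raw responses from Example 2.2 (A), retyped by hand.
responses = [
    "Secondary/High School", "Bachelor's", "Secondary/High School",
    "No degree", "Bachelor's", "Didn't answer", "Master's",
    "Associate's", "Master's", "Secondary/High School", "Bachelor's",
    "Secondary/High School", "Secondary/High School", "Didn't answer",
    "Didn't answer", "Bachelor's", "Secondary/High School", "PhD",
    "Bachelor's", "Associate's", "Associate's",
]

# Row order of Table 2.1, chosen by hand (Counter has no notion of it).
categories = [
    "No degree", "Secondary/High School", "Associate's", "Bachelor's",
    "Master's", "PhD", "Didn't answer",
]

# Counter tallies how often each distinct response occurs: the frequency f.
frequency = Counter(responses)

# Print one row per category, then the total number of respondents.
for category in categories:
    print(f"{category:<22} | {frequency[category]}")
print(f"{'TOTAL':<22} | {sum(frequency.values())}")
```

The printed counts should match Table 2.1, and the total should come out to 21. Calling `frequency.most_common(1)` would additionally pick out the most frequent response, which answers the first question we asked of the raw data.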
Real frequency tables, however, usually contain more information than a simple count. The following few sub-sections provide the details, while we work our way through creating a full frequency table.