Chapter 9 Testing Associations I: Difference of Means, F-test, and χ² Test
9.1 Between a Discrete and a Continuous Variable: The t-test
For this part, you need to recall (from Section 7.2.1, https://pressbooks.bccampus.ca/simplestats/chapter/7-2-1-between-a-discrete-and-a-continuous-variable/) how we described bivariate associations between two variables, one of which is treated as discrete and one as continuous. In this case we essentially compared the groups (categories of the discrete variable) by their mean (or median) value on the continuous variable. We examine the potential association between such variables visually through boxplots and numerically through a difference of means.
Now the question in front of us is: even if we do see a difference in the means of the different groups in sample data, how certain can we be that this association is real and reflective of the population? As we learned in Chapter 8, to answer this question, we need to test the difference for statistical significance.
We start with a few theoretical notes, which we will then apply to the example I used in Chapter 7 about the potential gender difference in average income. In this way we will be able to test whether the difference observed in the NHS 2011 data ($16,401 in favour of men to be precise) is statistically significant or not. In the latter half of this section we will see what happens when there are more than two groups’ means to compare.
Testing the difference of two means. Recall from Section 8.3 (https://pressbooks.bccampus.ca/simplestats/chapter/8-3-hypothesis-testing/) that we tested whether the employees who took a training course indeed had a higher average productivity by simply calculating the z-value (or, using the estimated standard error, the t-value with a given df) for the mean and then finding its associated p-value. We could then compare the p-value to the preselected α-level and make a conclusion regarding the null hypothesis.
You will be happy to know that testing a difference of means follows the same principle: obtain the z-value (or rather, the t-value), get the associated p-value, and compare it to α. What differs is that we are now expressly testing a difference of two means, so we need the t-value for the difference. It turns out we can calculate one as easily as ever, as long as we have the standard error of the difference[1].
The standard error of a difference of two means is a combination of their separate standard errors:
\(\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\frac{\sigma_1^2}{N_1} + \frac{\sigma_2^2}{N_2}}\) = standard error of the difference of two means
where the subscripts refer to the first and second group being compared.
The z-value for a difference of two means follows the ordinary z-value formula, but with the difference taking the place of the single mean:
\[z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sigma_{\bar{x}_1 - \bar{x}_2}}\]
However, under the null hypothesis we hypothesize there is no difference in the population means; as such, \(\mu_1 = \mu_2\), and thus \(\mu_1 - \mu_2 = 0\). Accounting for that in the formula, along with substituting the standard error with its own formula from above, we get:
\[z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{N_1} + \frac{\sigma_2^2}{N_2}}}\]
Finally, since we generally don’t know the population parameters but work with sample data, we estimate the standard error σ with the sample standard error s, thus moving to the t-value through which we test the difference for statistical significance:
\(t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}}}\) = t-test for the difference of means[2]
Note that unlike the single-mean case, where df=N-1, when working with a difference of means of two groups, df=N-2.
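For readers who like to see a formula as a calculation, the steps can be sketched in a few lines of Python (the chapter itself relies on SPSS, so treat this only as an illustration). The function name `t_for_difference` and the numbers fed into it are made up for the sketch; the calculation combines the two groups' squared standard deviations divided by their sample sizes, under the unequal-variances version of the standard error.

```python
import math

def t_for_difference(mean1, sd1, n1, mean2, sd2, n2):
    """Return (standard error, t-value, df) for a difference of two means."""
    # Standard error of the difference: square root of the summed
    # squared standard errors of the two groups (unequal variances version).
    se_diff = math.sqrt(sd1**2 / n1 + sd2**2 / n2)
    # Under the null hypothesis the population difference is 0,
    # so the numerator is just the observed difference of sample means.
    t_value = (mean1 - mean2) / se_diff
    # Degrees of freedom as given in the text: total N minus 2.
    df = n1 + n2 - 2
    return se_diff, t_value, df

# Hypothetical illustration (made-up numbers, not from any dataset).
se, t, df = t_for_difference(mean1=52.0, sd1=10.0, n1=100,
                             mean2=50.0, sd2=10.0, n2=100)
print(f"SE = {se:.3f}, t = {t:.3f}, df = {df}")
```

With these invented inputs the t-value falls short of 1.96, so a two-point difference would not count as statistically significant at α=0.05.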
Before your eyes glaze over (completely), rest assured that SPSS calculates this for you; I only provide it here to show you that the logic of hypothesis testing is the same, only the formulas change to accommodate the testing of a difference of means rather than a single mean.
From this point on, it’s easy: you only need to check the p-value of the t-value you have obtained (given the specific df)[3], and compare it to the significance level, and voila — you have yourself a significance test!
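The "check the p-value, compare it to α" step can also be sketched in Python. One loud assumption: the standard library has no t-distribution, so this sketch treats the t-value as a z-value and uses the normal curve, which is only reasonable when the df is large (as it is in the example that follows). The function names `two_tailed_p` and `is_significant` are invented for the illustration.

```python
from statistics import NormalDist

def two_tailed_p(t_value):
    """Approximate two-tailed p-value, treating t as a z-value (large df only)."""
    # Probability in both tails beyond |t| under the standard normal curve.
    return 2 * (1 - NormalDist().cdf(abs(t_value)))

def is_significant(t_value, alpha=0.05):
    """Reject the null hypothesis when the p-value is at or below alpha."""
    return two_tailed_p(t_value) <= alpha

print(round(two_tailed_p(1.96), 3))       # about 0.05
print(is_significant(1.5))                # False at alpha = 0.05
print(is_significant(2.58, alpha=0.01))   # True at alpha = 0.01
```

For small samples the normal curve understates the p-value, so a proper t-distribution calculator (or SPSS) remains the tool to use there.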
Let’s see how this all works out in an example. A few sections back I promised you to test the gender differences in average income, didn’t I?
Example 9.1 Testing Gender Differences in Average Income, NHS 2011
As in Example 7.2 in Section 7.2.1, I use a random sample of about 3 percent of the entire NHS 2011 data, this time resulting in N=21,902[4].
We are still interested in whether women and men on average earn differently per year, i.e., whether gender affects income:
- H0: The average annual income of women and men is the same, \(\mu_m = \mu_f\)
- Ha: The average annual income of women and men is different, \(\mu_m \neq \mu_f\)
There are 11,323 women (\(N_f = 11{,}323\)) and 10,579 men (\(N_m = 10{,}579\)) in the sample. The men earn an average of $48,113 (\(\bar{x}_m\)) and women earn an average of $31,519 (\(\bar{x}_f\)). The respective standard deviations are $68,214 for men (\(s_m\)) and $34,760 for women (\(s_f\)).
The difference of means is therefore:
\[\bar{x}_m - \bar{x}_f = 48{,}113 - 31{,}519 = 16{,}594\]
The question is whether this $16,594 is due to sampling variation (i.e., statistically not different from a population difference of means of $0), or unusual enough that a population difference of $0 is unlikely (i.e., the difference is statistically significant).
To test this, we need to calculate the standard error of the difference. Once we have the standard error of the difference, we can calculate the t-value.
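The standard error of the difference is:
\[s_{\bar{x}_m - \bar{x}_f} = \sqrt{\frac{s_m^2}{N_m} + \frac{s_f^2}{N_f}} = \sqrt{\frac{68{,}214^2}{10{,}579} + \frac{34{,}760^2}{11{,}323}} \approx 739.3\]
The t-value is then:
\[t = \frac{\bar{x}_m - \bar{x}_f}{s_{\bar{x}_m - \bar{x}_f}} = \frac{16{,}594}{739.3} \approx 22.45\]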
Given the large N, even just looking at the t-value should make it clear that the difference is statistically significant: after all, in a two-tailed test, a t-value is significant from 1.96 upward (for α=0.05) and from 2.58 upward (for α=0.01).
Still, this is not the way to report a test; this is: With t=22.447, df=21,900, and p=0.000[5] (i.e., p<0.001[6]), we have enough evidence to reject the null hypothesis. Indeed, we can conclude with 99.99% certainty that there is a statistically significant difference between the average annual income of men and women (i.e., that the difference exists in the population).
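The arithmetic of Example 9.1 can be retraced from the sample statistics reported above. This is only a sketch: it works from the rounded means and standard deviations printed in the text (so the last decimal of the t-value can differ slightly from the reported 22.447), and it assumes the normal curve is a fair stand-in for the t-distribution at df=21,900. The variable names are chosen for the illustration.

```python
import math
from statistics import NormalDist

# Sample statistics reported in Example 9.1 (rounded, as printed in the text).
n_m, mean_m, sd_m = 10_579, 48_113, 68_214   # men
n_f, mean_f, sd_f = 11_323, 31_519, 34_760   # women

diff = mean_m - mean_f                               # difference of means
se_diff = math.sqrt(sd_m**2 / n_m + sd_f**2 / n_f)   # standard error of the difference
t_value = diff / se_diff                             # t under H0 (population difference = 0)
df = n_m + n_f - 2                                   # df = N - 2

# Assumption: with df this large, the normal curve approximates the t-distribution.
p_value = 2 * (1 - NormalDist().cdf(abs(t_value)))

print(f"difference = {diff}, SE = {se_diff:.1f}, t = {t_value:.2f}, df = {df}")
print("p < 0.001:", p_value < 0.001)
```

The printed t-value sits far beyond both 1.96 and 2.58, which is why the p-value is reported as smaller than 0.001.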
We can check this with a confidence interval too, again substituting the difference in place of a single value[7]:
95% CI: \((\bar{x}_m - \bar{x}_f) \pm 1.96 \times s_{\bar{x}_m - \bar{x}_f} = 16{,}594 \pm 1.96 \times 739.3 = 16{,}594 \pm 1{,}449\)
That is, we can say that the difference of average annual incomes between men and women will be between $15,145 and $18,043 with 95% certainty; or that 19 out of 20 such studies will find a difference of $16,594 ± $1,449. (We also see the correspondence with hypothesis testing: since the interval does not contain 0, 0 is not a plausible value for the difference.)
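The confidence interval for the difference can be checked the same way. The sketch below again uses the rounded statistics printed in Example 9.1 and the large-sample multiplier 1.96; the variable names are illustrative only.

```python
import math

# Rounded sample statistics from Example 9.1.
n_m, mean_m, sd_m = 10_579, 48_113, 68_214   # men
n_f, mean_f, sd_f = 11_323, 31_519, 34_760   # women

diff = mean_m - mean_f
se_diff = math.sqrt(sd_m**2 / n_m + sd_f**2 / n_f)

z_95 = 1.96                      # multiplier for a 95% interval (large N)
margin = z_95 * se_diff          # half-width of the interval
lower, upper = diff - margin, diff + margin

print(f"95% CI: {lower:,.0f} to {upper:,.0f}")
# If 0 lies outside the interval, the difference is significant at alpha = 0.05.
print("contains 0:", lower <= 0 <= upper)
```

Swapping 1.96 for 2.58 gives the 99% interval, which is wider but, in this example, still nowhere near 0.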
Inference is not doing too badly, no?
Again, SPSS will provide all the calculations but I advise you to still test your understanding of the procedure with the following exercise.
Do It!! 9.1 Gender Differences in Age of Actors in Main Roles
Studies find that due to the gendered social construction of aging (i.e., women are considered “older” and “mature” at younger ages than men), male actors are frequently paired with much younger female actors (Buchanan 2013; Follows 2015). For example, the average age of male and female Academy Award nominees is telling: in the Best Actor and Best Actress categories, the average age of men is 43.4 years while the average age of women is 37.2 years (Beckwith & Hester, 2018 [http://thedataface.com/2018/03/culture/oscar-nominees-age]).
Let’s say that you want to investigate this phenomenon yourself. You randomly select 100 male and 100 female Academy Award nominees, and calculate their age at nomination for an Academy Award. You find that men’s average age is 45 years and women’s is 36 years, with standard deviations of 15 years for men and 20 years for women. Test the hypothesis that the average age for women is different from that of men for the population of all Best Actor/Actress Oscar nominees. Create a 95% CI for the difference to see its correspondence with the hypothesis test.
Now that you understand the principle of testing the difference of two means, let’s see what we can do about non-binary discrete variables, in the next section. The SPSS guidelines for doing a t-test are below.
SPSS Tip 9.1 The t-test
- From the Main Menu, select Analyze, and from the pull-down menu, click on Compare Means and Independent Samples T Test;
- Select your continuous variable from the list of variables on the left and, using the top arrow, move it to the Test Variable(s) empty space on the right;
- Select your discrete variable from the list of variables on the left and, using the bottom arrow, move it to the Grouping Variable empty space on the right;
- Click on Define Groups, and in the new window, keep Use specified values selected; in the empty spaces for Group 1 and Group 2, enter the numeric values[8] corresponding to the two categories of your discrete variable; click Continue.
- In the Independent Samples T Test window click Options…; you can request a specific confidence interval in the new window (the default is 95%); click Continue;
- Click OK once back to the Independent Samples T Test window.
- SPSS will produce two tables in the Output window: a Group Statistics one (where you can see the sample size, mean, standard deviation, and standard error for each group, i.e., each category of the discrete variable), and an Independent Samples Test one (where you can find the t-value, df, p-value, mean difference, standard error of the difference, and the requested confidence interval)[9].
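To get a feel for what the two SPSS output tables contain, here is a rough Python sketch that builds the same quantities from raw scores. It is not how SPSS works internally, and the function names (`group_statistics`, `independent_samples_t`) and the two tiny groups of scores are invented for the illustration. It uses the unequal-variances standard error and the df=N-2 given in the text, and it stops short of a p-value, since small samples need a proper t-distribution.

```python
import math
import statistics

def group_statistics(values):
    """N, mean, standard deviation, and standard error of the mean for one group."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)          # sample standard deviation (divides by N - 1)
    return {"n": n, "mean": mean, "sd": sd, "se": sd / math.sqrt(n)}

def independent_samples_t(group1, group2):
    """Mean difference, its standard error, t-value, and df = N - 2."""
    g1, g2 = group_statistics(group1), group_statistics(group2)
    diff = g1["mean"] - g2["mean"]
    # Combine the two groups' standard errors (unequal variances version).
    se_diff = math.sqrt(g1["se"] ** 2 + g2["se"] ** 2)
    return {"diff": diff, "se_diff": se_diff, "t": diff / se_diff,
            "df": g1["n"] + g2["n"] - 2}

# Hypothetical scores for the two categories of a discrete variable (made-up data).
group_a = [12, 15, 11, 14, 13, 16]
group_b = [10, 9, 12, 11, 8, 10]

for name, scores in (("A", group_a), ("B", group_b)):
    print(name, group_statistics(scores))
print(independent_samples_t(group_a, group_b))
```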
- I hope you have not forgotten that \(z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}\), where the standard error \(\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{N}}\).
- The more observant of you will notice that the squared standard deviations of the two groups, i.e., \(s_1^2\) and \(s_2^2\) here, are of course the groups' variances (which we need if we are to have them under the square root). In this version of the formula, the groups are taken to have unequal variances, which is a more conservative assumption than assuming the variances of the two groups are equal. If we have a good reason to assume equal variances, then \(s_1^2\) and \(s_2^2\) will just be the same (combined, or pooled) variance \(s^2\), and the formula will look like this: \(t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{s^2}{N_1} + \frac{s^2}{N_2}}}\)
- You can do that through an online p-value calculator for the t-distribution like this one here: https://www.socscistatistics.com/pvalues/tdistribution.aspx.
- Since I use a new random sub-sample of the data, you can consider this an indirect illustration of sampling variation. For comparison of sample statistics as well as variable description, refer back to Example 7.2.
- You can check this with a p-value calculator; SPSS reports it too.
- That is, the probability of observing a difference of $16,594 in the sample if there were no difference in the population is smaller than 0.1%.
- I hope you remember that 95% CI: \(\bar{x} \pm 1.96 \times \sigma_{\bar{x}}\).
- That would be the "code": for example, gender may be coded as "1 female, 2 male", or "0 male, 1 female", etc., depending on the dataset. You have to know this beforehand; if unsure, go back to Variable View and check.
- The table provides two versions of the test: with and without equal variances assumed. Which one you should use depends on the size of the two groups' variances. If the variance of one group is twice (or more) as big as the other group's variance (like in Example 9.1 above, where the men's variance was much larger than the women's), use the test results in the bottom row, "equal variances not assumed". If the two groups' variances are relatively similar, you can use the top row, "equal variances assumed". You don't have to decide on your own, as SPSS provides a convenient indication for which one is better to use, under Levene's Test/F for comparing variances. If the F-test is significant (i.e., p≤0.05), the variances are too different and using the bottom row is better; if the F-test is non-significant (i.e., p>0.05) you can assume the variances are equal and use the top row of results.
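The "twice as big" rule of thumb from the last note is easy to check by hand or in code. The sketch below applies it to the standard deviations from Example 9.1; note that it implements only the rule of thumb, not Levene's test, and the function name `which_row` and its `ratio_threshold` parameter are invented for the illustration.

```python
def which_row(sd1, sd2, ratio_threshold=2.0):
    """Suggest an SPSS output row from the variance rule of thumb (not Levene's test)."""
    var1, var2 = sd1**2, sd2**2                 # variances are squared standard deviations
    ratio = max(var1, var2) / min(var1, var2)   # larger variance over smaller one
    if ratio >= ratio_threshold:
        return ratio, "equal variances not assumed"
    return ratio, "equal variances assumed"

# Standard deviations from Example 9.1: men $68,214, women $34,760.
ratio, row = which_row(68_214, 34_760)
print(f"variance ratio = {ratio:.2f} -> use '{row}'")
```

Here the men's variance is nearly four times the women's, which agrees with the note's advice to read the bottom row for this example.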