8 Hypotheses Testing
8.2 Hypotheses
Now that we have come to terms with the fact that we will not be making causal statements at this point, let’s turn our attention to establishing statistical associations. As I mentioned in the previous section, this is done through testing. In order to test variables’ associations we need to know how hypotheses are scientifically tested.
To have a hypothesis about something means to have an idea about how to explain it. This idea, or proposed explanation, might be based on any combination of logic, previous related observations, experience, etc. In science, hypotheses are formulated as relatively concise, testable statements. If a statement cannot be tested, it doesn’t qualify as a scientific hypothesis.
Most students unfamiliar with the scientific method of testing hypotheses are surprised to learn that the testing is done in a roundabout, method-of-exclusion kind of way: we don’t set out to confirm our hypothesis but rather to reject the opposite of what we claim. To baffle you further, if we reject the opposite, we have found evidence in support of our hypothesis but we have not proven that it’s true. Similarly, if we do not reject the opposite, it doesn’t mean that we’ve proven it true or that we have proven our hypothesis wrong. (Nothing is ever proven in science, as that would require 100 percent certainty, and we already established that this is impossible.) Thus, interpreting a hypothesis test requires careful, qualified language so as not to overstate findings.
Confused? Not to worry. I am getting ahead of myself here to give you a quick sketch of where we are headed in this section, but of course I will go over and explain the parts of the paragraph above in greater detail below. Also a heads-up: after the brief respite, things are about to get technical again (in the next section). But first things first.
To test a hypothesis of interest, we make two contradictory statements: one about what we hypothesize and another stating the exact opposite [1]. The “opposite” hypothesis is called the null hypothesis (frequently designated as H0) and is usually stated first; the original hypothesis of interest is called the alternative hypothesis (usually designated as Ha) and is stated second [2].
When we apply all this to testing variables’ associations, we end up with null hypotheses such as “the two variables are not associated”, “there is no association between the two variables”, “Variable 1 does not affect Variable 2”, or “the two variables are independent of each other”. The alternative hypotheses would then be something like “the two variables are associated”, “there is an association between the two variables”, or “Variable 1 does affect Variable 2”. (However, recall that when interpreting and reporting results it is always better to state the findings not only in terms of variables but also in terms of people.) See some examples in the box below.
Example 8.1 Stating Hypotheses
Hair colour and eye colour:
- H0: Hair colour and eye colour are not associated; e.g., dark-haired individuals are equally likely to have blue eyes as blond individuals are.
- Ha: Hair colour and eye colour are associated; e.g., dark-haired individuals’ and blond individuals’ likelihood of having blue eyes is different.
Smoking and lung disease:
- H0: Smoking and lung disease are not associated; e.g., smokers and non-smokers have the same odds of developing lung disease.
- Ha: Smoking and lung disease are associated; e.g., smokers and non-smokers have different odds of developing lung disease.
Gender and income:
- H0: Income is independent of gender; e.g., men and women have the same average income.
- Ha: Income is dependent on gender; e.g., women and men have different income on average.
Parental education and offspring education:
- H0: Parental education is unrelated to the education of their offspring; e.g., the level of parental education is not associated with the children’s level of education.
- Ha: Parental education and their offspring’s education are related; e.g., the level of parental education is associated with the children’s level of education.
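To make the first pair of hypotheses in the box a little more concrete, here is a minimal sketch in Python of what “not associated” would look like in numbers. The counts are invented purely for illustration (they are not real data), and the function names `blue_eye_share` and `expected_under_h0` are my own hypothetical helpers, not part of any library. The sketch only contrasts the observed shares with the shares we would see if H0 held exactly; it does not decide whether the gap is large enough to reject H0, which is the job of an actual test.

```python
# Hypothetical counts, invented for illustration only (not real data).
# Rows: hair colour; columns: eye colour.
observed = {
    "dark":  {"blue": 20, "other": 80},
    "blond": {"blue": 45, "other": 55},
}


def blue_eye_share(table, hair):
    """Proportion of people with the given hair colour who have blue eyes."""
    row = table[hair]
    return row["blue"] / sum(row.values())


def expected_under_h0(table):
    """Counts we would expect if hair and eye colour were not associated.

    Each cell is (row total * column total) / grand total, so every hair
    colour ends up with the same share of blue-eyed individuals.
    """
    row_totals = {hair: sum(row.values()) for hair, row in table.items()}
    col_totals = {}
    for row in table.values():
        for eye, count in row.items():
            col_totals[eye] = col_totals.get(eye, 0) + count
    grand_total = sum(row_totals.values())
    return {
        hair: {eye: row_totals[hair] * col_totals[eye] / grand_total
               for eye in col_totals}
        for hair in table
    }


expected = expected_under_h0(observed)
for hair in observed:
    print(hair, "- observed share with blue eyes:", blue_eye_share(observed, hair))
    print(hair, "- share if H0 held exactly:", blue_eye_share(expected, hair))
```

Under H0 the two hair-colour groups have the same share of blue-eyed individuals; under Ha the shares differ, without saying which group's share is higher.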
There are three things that you can learn from the examples presented above. First, the hypotheses are formulated as short statements that can be evaluated in a simple yes-or-no kind of way: “Average income is independent of gender”: YES, or “Average income is independent of gender”: NO. Thus you really need only one statement per hypothesis; if your proposed explanation is complicated and involves more than two variables, this means you are dealing with multiple hypotheses, each of which needs to be tested separately.
Second, while there are many ways you can state essentially the same hypothesis, try to word the null hypothesis as the direct negation of the alternative hypothesis, such as “…are not related/associated” and “…are related/associated”, or “…are independent” and “…are not independent”, etc.
Third, you may have noticed the slightly awkward way in which some of the alternative hypotheses are listed above. Couldn’t I have stated “women have lower income than men on average”? Or, “blond individuals are more likely to have blue eyes than dark-haired individuals”? I could, but then these would have been different alternative hypotheses. The reason I did not specify who is more or less likely to have blue eyes, or who has a higher income on average, but kept the statements as a generic “different likelihood” and “different income”, is that the wording affects the kind of test that needs to be used. Briefly, there is a general test for association/difference (aka two-tailed test), and a more specific version (aka one-tailed test) which implies “direction”; the former is more “open-minded” as it doesn’t rely on or imply prior knowledge and is therefore more conservative. The latter indicates not only that there is a difference (i.e., an association) but also its direction, so its usage needs to be justified. More on that in the next section, but for now keep in mind that, as beginner researchers, you are advised to use the general, two-tailed version of the test.
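If you would like to see the difference between the general and the directional version in numbers, the sketch below may help. It is only an illustration under stated assumptions: the incomes are invented, the function `permutation_p_values` is a hypothetical helper of my own, and the technique (an exact permutation test, which regroups the pooled values in every possible way) is simply a transparent stand-in, not necessarily the test used later in this book. For each regrouping it asks how often the difference in average income is at least as large as the one observed, either in any direction (two-tailed) or only in the direction “women earn less” (one-tailed). A proportion of this kind is commonly called a p-value.

```python
from itertools import combinations
from statistics import mean

# Hypothetical incomes in thousands, invented for illustration only.
women = [38, 41, 44, 47, 52]
men = [43, 48, 51, 55, 60]


def permutation_p_values(group_a, group_b):
    """Exact permutation test on the difference in means (a minus b).

    Returns (two_tailed, one_tailed). The one-tailed value is for the
    directional alternative "group a has the LOWER mean".
    """
    pooled = group_a + group_b
    observed = mean(group_a) - mean(group_b)
    n_a = len(group_a)
    indices = range(len(pooled))

    extreme_either_way = 0   # difference at least as large, any direction
    as_low_or_lower = 0      # difference at least as large, stated direction
    total = 0
    for chosen in combinations(indices, n_a):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen_set]
        diff = mean(a) - mean(b)
        total += 1
        if abs(diff) >= abs(observed):
            extreme_either_way += 1
        if diff <= observed:
            as_low_or_lower += 1
    return extreme_either_way / total, as_low_or_lower / total


two_tailed, one_tailed = permutation_p_values(women, men)
print("two-tailed proportion:", two_tailed)
print("one-tailed proportion:", one_tailed)
```

Because the two groups in this sketch are the same size, the two-tailed proportion comes out at exactly twice the one-tailed one. This is the sense in which the two-tailed version is more conservative: the same data count as weaker evidence against H0 when no direction is assumed in advance.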
Before we move to some actual hypothesis testing, see if you can formulate some hypotheses on your own.
Do It! 8.1 Stating Hypotheses
Formally state the null and alternative hypotheses about each of the following pairs of variables: class attendance and test scores, time spent on social media prior to a test and test scores, race/ethnicity and years of schooling, gender and belief in climate change, political affiliation and attitudes toward gun control. In fact, just go ahead and practice formulating hypotheses about anything you like.
[1] Why? Beyond what I already explained about proofs, it is also because scientists need to be impartial about what a test will reveal. As a scientist, you want to test a hypothesis with an open mind and to be equally prepared to accept the result whichever way it goes, so you cannot set out from the start to find your hypothesis supported.
[2] Do not get alarmed if you see different notation in published research. When researchers test many hypotheses in the same study, they may designate them as H1, H2, H3, etc. Even more importantly, experienced researchers don't explicitly state the null hypotheses in their studies; they are self-understood as the opposite of whatever each alternative hypothesis states. Further, some researchers never explicitly designate a hypothesis as such, since it is taken as evident that this is what they are doing. Beginner researchers like you, however, should practice stating, and clearly designating, both null and alternative hypotheses.