69 Inferential Statistics
As discussed earlier, inferential statistics are concerned with more than the characteristics of the sample: we also make inferences about the population based on what is known about the sample. In this section, we recap estimation procedures and discuss common statistical tests that allow us to make inferences about the population.
Recap on estimation procedures
You will remember from your social statistics class that population values can be estimated from sample values with either a point estimate or an interval estimate (e.g., a confidence interval, where the population value is estimated to lie within a certain range). Point estimates assume that the population value is the same as the sample statistic (either a mean or a proportion) (Healey, 2009, p. 174). With interval estimates, however, we calculate a range of values within which the population value is expected to fall. As the goal of this chapter is not to teach statistics, but to provide guidance on how to report your findings in your paper, we advise you to review your statistics notes if you want to refresh your statistical knowledge. You can also watch these videos by Dane McGuckian: for constructing point estimates, see Point Estimate for a Mean and Confidence Interval – YouTube; for confidence intervals, see The steps for constructing a confidence interval to estimate the mean – YouTube.
Table 10.4 below provides hypothetical SPSS output from a sample of UBC students.
Table 10.4 - Hypothetical SPSS Output from a Sample of UBC Students | |||||
---|---|---|---|---|---|
Variable | N | Minimum | Maximum | Mean | Standard Deviation |
Number of Hours Slept Each Week | 152 | 28.50 | 84.25 | 50.75 | 6.125 |
Valid N (Listwise) | 152 | | | | |
Suppose we want to estimate the population value based on the sample. You might remember the formula for constructing the confidence interval:
CI = X +/- Z (s/√n)
Where: CI = confidence interval; X = sample mean; Z = the Z score for the chosen confidence level; s = sample standard deviation; and n = sample size.
To construct the confidence interval at the 95% level (z= 1.96), we substitute the values in the SPSS output into the formula.
CI = 50.75 +/- 1.96 (6.125/√152)
CI = 50.75 +/- 1.96 (6.125/12.33)
CI = 50.75 +/- 1.96 (0.497)
CI = 50.75 +/- 0.97
In your papers, you would write: “We estimate, with 95% confidence, that UBC students slept between 49.78 and 51.72 hours each week, on average”.
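The confidence interval formula can also be checked with a few lines of code. The following is a minimal sketch in Python using only the standard library; the function name `confidence_interval` is our own illustrative choice, and the inputs are the hypothetical values from Table 10.4.

```python
import math


def confidence_interval(mean, sd, n, z=1.96):
    """Return the (lower, upper) bounds of mean +/- z * (sd / sqrt(n))."""
    standard_error = sd / math.sqrt(n)  # s / sqrt(n)
    margin = z * standard_error         # Z * (s / sqrt(n))
    return mean - margin, mean + margin


# Hypothetical values from Table 10.4: mean = 50.75, s = 6.125, n = 152
lower, upper = confidence_interval(mean=50.75, sd=6.125, n=152)
print(f"95% CI: {lower:.2f} to {upper:.2f} hours")
# prints: 95% CI: 49.78 to 51.72 hours
```

Note that the standard deviation is divided by the square root of the sample size before it is multiplied by Z, which is why the interval is narrow for a sample of this size.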
Hypothesis testing and regression
One of the reasons why you probably decided to do quantitative data analysis is to test hypotheses. Hypothesis testing involves analyzing your data to determine whether the patterns you observe are statistically significant (e.g., Are two means similar? Does variable A impact variable B?). If you are still undecided on what statistical analysis you will use in your thesis, now is a great time to refresh yourself on different statistical techniques. Below we summarize common research objectives and the kind of statistical technique that might be appropriate.
Table 10.5 - Common Research Objectives and their Statistical Techniques | ||
---|---|---|
Research Objective (To ….) | Statistical Procedure | Sample Research Question |
Test if the mean of a population is statistically different from a known or hypothesized value | One Sample T-test | Is the mean grade in SOCI 200 different from 70? |
Test the null hypothesis that the means of two groups are equal | Two sample T-test | Do males and females score the same in SOCI 200? |
Compare the means of two independent groups in order to determine whether there is statistical evidence that the associated population means are significantly different | Independent samples T-test | What is the difference in SOCI 200 scores from two different sections (e.g. section 103 and 104)? |
Compare means across three or more groups with one independent variable | One-way ANOVA | What is the difference in average SOCI 200 scores by faculty (attributes: Arts, Science, Engineering)? |
Compare means across groups with two or more independent variables | Two-way ANOVA | What is the difference in SOCI 200 grades according to gender and age? |
Examine the association between categorical variables in the same population | Chi-square | What is the effect of gender on marital status? |
Determine which independent variable(s) affect an outcome (dependent variable) for continuous variables | Linear Regression | What effect does the number of hours studied have on SOCI 200 grades? |
Determine which independent variable impacts an outcome (dependent variable) when the output is discrete (i.e., the presence or absence of the outcome) | Logistic Regression | Does gender affect whether students pass or fail SOCI 200? |
To help you decide on which technique to use, we provide a bit more detail on each of these below:
- The One and Two Sample T-test: The One Sample t Test is used to test whether a sample mean differs statistically from a known or hypothesized value of the mean in the population. Please note that this procedure cannot be used to compare sample means between multiple groups. If you are comparing the means of multiple groups to each other, you should consider an Independent Samples t Test (to compare the means of two groups) or a One-Way ANOVA (to compare the means of two or more groups). A two-sample T-test, in turn, tests whether the means of two groups are the same.
- Paired Samples T-Test: The Paired Samples t Test compares the means of two measurements taken from the same individual, object, or related units. In a paired design, each subject is measured twice, resulting in pairs of observations. “Paired” measurements can include measurements taken at two different times, for example, a pre-test and post-test score with an intervention administered between the two time points, such as measuring the impact of anti-racist education on attitudes toward minority groups. In this case, a researcher could distribute a survey to determine attitudes towards minority groups, then offer anti-racist education, followed by a repeat of the survey. The essence of the paired samples t-test is to determine whether the mean difference between paired observations is significantly different from zero. Kent State University Libraries (2017) provide additional cases where the Paired Samples t Test is commonly used, including:
- Statistical difference between two time points
- Statistical difference between two conditions
- Statistical difference between two measurements
- Statistical difference between a matched pair
Note that the Paired Samples t Test can only compare the means for two (and only two) related (paired) units on a continuous outcome that is normally distributed (Kent State University Libraries, 2017).
- Independent Samples T-Test: The Independent Samples t Test compares the means of two independent groups in order to determine whether there is statistical evidence that the associated population means are significantly different. It can only compare the means for two (and only two) groups; ANOVA should be used to make comparisons among more than two groups.
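To make the difference between these tests concrete, the sketch below computes the t statistic and degrees of freedom for each one in Python, using only the standard library. The function names and all of the data are hypothetical and for illustration only. The independent samples version assumes equal variances (the pooled-variance form), which is an assumption on our part. The sketch does not compute p values, since those require the t distribution; statistical packages such as SPSS report them alongside t and df.

```python
import math
import statistics


def one_sample_t(sample, hypothesized_mean):
    """t statistic and df for a sample mean versus a hypothesized value."""
    n = len(sample)
    standard_error = statistics.stdev(sample) / math.sqrt(n)
    t = (statistics.mean(sample) - hypothesized_mean) / standard_error
    return t, n - 1


def paired_t(first, second):
    """Paired test: a one-sample test on the pairwise differences vs. zero."""
    differences = [b - a for a, b in zip(first, second)]
    return one_sample_t(differences, 0)


def independent_t(group_a, group_b):
    """Independent samples test, assuming equal variances (pooled)."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_variance = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / df
    standard_error = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    t = (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error
    return t, df


# Hypothetical SOCI 200 grades for two sections
section_103 = [74, 68, 75, 71, 67]
section_104 = [66, 71, 64, 69, 60]
# Hypothetical attitude scores before and after an intervention
pre_test = [3, 4, 2, 5, 3]
post_test = [4, 5, 4, 5, 4]

for label, (t, df) in [
    ("One sample (vs. 70)", one_sample_t(section_103, 70)),
    ("Paired samples", paired_t(pre_test, post_test)),
    ("Independent samples", independent_t(section_103, section_104)),
]:
    print(f"{label}: t({df}) = {t:.2f}")
```

The paired test is simply a one-sample test applied to the differences within each pair, which is why it asks whether the mean difference is significantly different from zero.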
Reporting T-test results
Reporting your findings in your thesis is quite simple. You will need to report the t value, the degrees of freedom (df) and the significance level (p value). Your statement should do the following:
- Identify the technique used (e.g., independent samples or paired samples t-test) and the variables of interest.
- Note whether the means were significantly different (statistically, based on the p value).
- State the direction of the difference (which group is higher or lower, or whether the mean is different from a known value).
- Provide descriptive statistics to indicate the difference.
The text in your findings can follow the template below:
A ______(type of t-test e.g., independent sample) t-test was conducted to determine if the mean for ______(name of variable) was significantly different. There was a significant or non-significant effect for _____(name of variable), t(df) = ____, p = ___, with attribute A being higher/lower (M=, SD=) than attribute B (M =, SD =).
Here is an example:
A two sample t-test was conducted to determine if the mean grades in SOCI 200 by gender were significantly different. There was a significant effect for gender, t(152) = 5.43, p = .001, with females receiving higher scores (M = 72.1, SD = 2.2) than those identifying with other genders (M = 66.3, SD = 1.16).
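If you are reporting several t-tests, it can help to fill in the template programmatically so that every statement has the same structure. The following is a minimal sketch in Python; the function `report_t_test`, its parameters, and the 0.05 significance threshold are our own illustrative assumptions, not a standard tool.

```python
def report_t_test(test_type, variable, t_value, df, p_value,
                  group_a, group_b, alpha=0.05):
    """Fill in the reporting template.

    group_a and group_b are (label, mean, sd) tuples.
    alpha is the assumed significance threshold.
    """
    label_a, mean_a, sd_a = group_a
    label_b, mean_b, sd_b = group_b
    significance = "significant" if p_value < alpha else "non-significant"
    direction = "higher" if mean_a > mean_b else "lower"
    # Report p without a leading zero; very small values become "< .001"
    if p_value < 0.001:
        p_text = "< .001"
    else:
        p_text = "= " + f"{p_value:.3f}".lstrip("0")
    return (
        f"A {test_type} t-test was conducted to determine if the mean for "
        f"{variable} was significantly different. There was a {significance} "
        f"effect for {variable}, t({df}) = {t_value:.2f}, p {p_text}, "
        f"with {label_a} being {direction} (M = {mean_a}, SD = {sd_a}) "
        f"than {label_b} (M = {mean_b}, SD = {sd_b})."
    )


# Values taken from the SOCI 200 example
print(report_t_test(
    "two sample", "SOCI 200 grades", 5.43, 152, 0.001,
    group_a=("females", 72.1, 2.2),
    group_b=("other genders", 66.3, 1.16),
))
```

The generated sentence is only a starting point; you should still edit it so that it reads naturally in your findings section.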
Additional Resources
For further tutorials on how to run and interpret confidence intervals in SPSS, see UBC Research Commons: https://researchcommons.library.ubc.ca/introduction-to-spss-for-statistical-analysis/
Also check out this YouTube tutorial for Stata: Stata® tutorial: Confidence interval calculator for normal data – YouTube
Common research objectives and their appropriate statistical technique resources
See UBC Research Commons for tutorials on how to generate and interpret, in SPSS, the statistical procedures discussed in Table 10.5: https://researchcommons.library.ubc.ca/introduction-to-spss-for-statistical-analysis/
References
Healey, J. F. (2009). Statistics: A Tool for Social Research (8th ed.). Wadsworth Cengage Learning.
Kent State University Libraries. (2017, May 15). SPSS tutorials. https://libguides.library.kent.edu/SPSS