Chapter 6 Sampling, the Basis of Inference

6.4 Parameters, Statistics, and Estimators

The logic underlying statistical inference is this: we want to know something about a population of interest but, since we cannot study it directly, we study a subgroup of that population instead. Based on what we learn about the subgroup, we can then estimate (i.e., infer) things about the population. In the previous section, we established that not just any subgroup of the population will do: what we need is a randomly selected sample, created through one of the random sampling methods I listed (simple, systematic, stratified, and cluster). We collect data from/about the elements of a sample (e.g., respondents) with the explicit goal of drawing conclusions about the population. (Again, we can do this because random sampling allows us to use probability theory through the normal curve.)

 

Saying we want to find “something” about the population of interest is hardly formal (much less precise) terminology but I wanted to get the message across before I introduced you to the proper statistics jargon. Let’s do that now.

 

Populations have parameters and samples have statistics: we describe populations with their parameters and samples with their statistics. When we study something, we are interested in the parameters of the population; however, in most cases it is difficult to collect the information needed to calculate them. What we do instead is take a random sample of the population and calculate the sample's statistics. We then use the sample statistics to estimate (i.e., infer) the population parameters. Thus, sample statistics are also called estimators of population parameters.

 

For example, if we want to know the average age of Canadians, we could either conduct a census and ask everyone or simply take a nationally representative sample. Considering how expensive and time-consuming it would be to ask all 36.7 million Canadians (Statistics Canada conducts the official census only every five years), we can poll a random selection of people across Canada, calculate their average age, and use that as an estimate of the average age of all Canadians[1].
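The logic of this example can be sketched in a short simulation. This is a minimal illustration using made-up data (the ages and population size here are hypothetical, not real Statistics Canada figures): we create an artificial "population", compute its true mean age (the parameter), then draw one simple random sample and compute the sample mean (the statistic) as our estimate.

```python
import random
import statistics

random.seed(42)  # for reproducibility

# Hypothetical population: ages of 100,000 people (made-up data).
population = [random.randint(0, 95) for _ in range(100_000)]
mu = statistics.mean(population)   # the parameter (usually unknown in practice)

# A simple random sample of 1,000 people from that population.
sample = random.sample(population, 1_000)
x_bar = statistics.mean(sample)    # the statistic, used as an estimator of mu

print(f"population mean (parameter): {mu:.2f}")
print(f"sample mean (statistic):     {x_bar:.2f}")
```

The sample mean will land close to, but generally not exactly on, the population mean; that gap is the estimation uncertainty discussed later in this chapter.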

 

In this example, the average age calculated from the people in the sample is the statistic, which we use to estimate the average age of all Canadians, the population parameter. All measures of central tendency and dispersion describing variables based on sample data are statistics. On the other hand, if we calculate measures of central tendency and dispersion using data from the entire population, we have parameters.

 

Consider, if you will, the examples I have used in past chapters: whenever an example was based on actual data from a dataset, and SPSS was used, it involved sample data producing statistics[2]. Even though we haven't yet used these statistics in this way, they can be used to estimate things about Canadians as a whole. On the other hand, any example using hypothetical (imaginary) data about "your friends", "your classmates", "hours you have worked per week", etc. can be considered as using population data: we imagine we have all the information about those things, so there is nothing to estimate.

 

A final note concerns formal notation. To differentiate between statistics and parameters, we designate sample statistics with Latin letters and population parameters with Greek letters.

 

You have already seen a ready-made example of this rule: recall our discussion of variance and standard deviation. In Section 4.4 (https://pressbooks.bccampus.ca/simplestats/chapter/4-4-standard-deviation/) I introduced formulas for σ and σ², and I mentioned (without much explanation) that another "version" of these exists as s and s². In truth, when we calculated the variance and the standard deviation with the hypothetical data in the examples, we needed the population standard deviation and variance (i.e., σ and σ², respectively); but when we use SPSS with a dataset (i.e., sample data), we need the sample standard deviation and variance (i.e., s and s², respectively). Here they are again:

 

    \[\frac{\sum\limits_{i=1}^{N}{(x_i-\mu)^2}}{N} = \sigma^2 =\textrm{population variance}\]

 

    \[\sqrt{\frac{\sum\limits_{i=1}^{N}{(x_i-\mu)^2}}{N}} = \sqrt{\sigma^2}=\sigma=\textrm{population standard deviation}\]

 

    \[\frac{\sum\limits_{i=1}^{n}{(x_i-\overline{x})^2}}{n-1} = s^2 =\textrm{sample variance}\]

 

    \[\sqrt{\frac{\sum\limits_{i=1}^{n}{(x_i-\overline{x})^2}}{n-1}} = \sqrt{s^2}=s=\textrm{sample standard deviation}\]

 

I'll take this opportunity to finally explain why the formulas differ (i.e., why we divide by N-1 in the sample formulas but by N in the population formulas). A sample statistic estimates a population parameter but will generally differ from it: some uncertainty exists, as inference is not a perfect "guess". If we simply applied the population formula to sample data, the result would systematically underestimate the population variance, i.e., it would be a biased estimate. Dividing by N-1 is meant to correct that bias[3] (which it does exactly for the variance, and only to an extent for the standard deviation). What we have then is that s and s² are unbiased estimators of σ and σ², respectively.
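The bias correction can be seen in a small simulation with made-up data (the population values and sample size here are hypothetical, chosen only for illustration). We repeatedly draw small samples from a population with a known variance, then average the two versions of the variance formula across all the samples: the version dividing by the sample size consistently comes out too low, while the N-1 version averages out close to the true population variance.

```python
import random
import statistics

random.seed(1)  # for reproducibility

# Hypothetical population (made-up data) with a known variance near 100.
population = [random.gauss(50, 10) for _ in range(100_000)]
sigma2 = statistics.pvariance(population)   # true population variance (divide by N)

n = 5            # small samples make the bias easy to see
trials = 20_000
biased_sum = unbiased_sum = 0.0
for _ in range(trials):
    sample = random.sample(population, n)
    unbiased_sum += statistics.variance(sample)   # s^2: divides by n - 1
    biased_sum += statistics.pvariance(sample)    # divides by n (biased on samples)

print(f"true population variance:    {sigma2:.1f}")
print(f"average s^2 (divide by n-1): {unbiased_sum / trials:.1f}")  # close to sigma2
print(f"average of biased version:   {biased_sum / trials:.1f}")    # about (n-1)/n of sigma2
```

Note that Python's `statistics` module builds this distinction in: `variance` uses the n-1 (sample) divisor and `pvariance` uses the n (population) divisor, mirroring s² versus σ².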

 

Thus it should be clear why we use the s and s² formulas when working with datasets and SPSS: the actual data has been collected from respondents randomly selected from a population of interest, comprising a sample of a specific size. On the other hand, when we have data about everyone/everything we're interested in (as in the small-scale examples with made-up data), we have a de facto population on our hands, so the σ and σ² formulas are appropriate. In the former case, the findings can be extrapolated to the population (acknowledging that we are dealing with inferred estimates); in the latter case, there is nothing further to extrapolate, as we are calculating the parameters directly.

 

Another important parameter to note (as we will be using it a lot from now on) is the population mean, designated by the lowercase Greek letter for m (from mean): μ, which I introduced in Section 5.1.2 (https://pressbooks.bccampus.ca/simplestats/chapter/5-1-2-the-z-value/) without giving you a reason why. Unlike the correspondence between s and σ, however, we don't usually denote the sample mean with an m; as you know, we use \overline{x} instead (so that we know which variable's mean we have in mind).

 

Finally, when a parameter is estimated by an estimator, the estimate is designated with a "hat" on top: for example, if we have a sample statistic called a estimating a population parameter α[4], the estimated α will be \hat{\alpha}, pronounced "alpha-hat". By analogy, if a statistic b estimates a parameter β[5], the estimated β will be \hat{\beta}, pronounced "beta-hat".

 

Thus, the logic of inference tells us that while a = \hat{\alpha} and b = \hat{\beta} (i.e., the statistics are estimators for the parameters), a = \hat{\alpha}\neq\alpha and b = \hat{\beta}\neq\beta. That is, the statistics (a.k.a. estimators) are not the same as the parameters. More on this, next.


  1. When people who have no statistics background learn of this, they usually protest that the information is not accurate because it's not based on everyone. What you will learn in this chapter is that you don't need everyone, and a sample is perfectly enough because random samples of sufficient size are mathematically proven to produce the best (closest, truest, most unbiased) estimates of the population parameters. To the extent that there is a difference between a statistic and the parameter it estimates, this difference is accounted for by reporting levels of certainty/confidence. More on that later.
  2. All datasets used in this book are nationally representative data collected by Statistics Canada.
  3. This is called Bessel's correction, by the name of Friedrich Bessel who introduced it.
  4. This is the lowercase Greek letter a: α, pronounced "AL-pha".
  5. This is the lowercase Greek letter b: β, pronounced "BAY-ta".

License

Simple Stats Tools Copyright © by Mariana Gatzeva. All Rights Reserved.
