{"id":99,"date":"2018-10-31T17:41:36","date_gmt":"2018-10-31T21:41:36","guid":{"rendered":"https:\/\/pressbooks.bccampus.ca\/simplestats\/?post_type=chapter&#038;p=99"},"modified":"2019-10-18T18:47:32","modified_gmt":"2019-10-18T22:47:32","slug":"6-6-the-central-limit-theorem","status":"publish","type":"chapter","link":"https:\/\/pressbooks.bccampus.ca\/simplestats\/chapter\/6-6-the-central-limit-theorem\/","title":{"raw":"6.6 The Central Limit Theorem","rendered":"6.6 The Central Limit Theorem"},"content":{"raw":"[latexpage]\r\n\r\nDespite it's scary-sounding name, the <em>Central Limit Theorem<\/em>\u00a0(CLT) simply <em>describes<\/em> the sampling distribution -- and simultaneously explains why, and how, we can use sample statistics (like the mean of a variable, $\\overline{x}$, obtained through sample data) to estimate population parameters (like the true population mean of that variable,<em> \u03bc<\/em>).\r\n\r\n&nbsp;\r\n\r\nRecall what we use to describe a variable's frequency distribution: 1) a graph to visually display the distribution's shape; 2) measures of central tendency; and 3) measures of dispersion. In the previous section I also asked you to imagine the (entirely theoretical, i.e., <em>probability<\/em>) distribution of the mean\u00a0<span style=\"text-indent: 18.6667px;font-size: 14pt\">(again, in theory, o<\/span><span style=\"text-indent: 1em;font-size: 14pt\">ver infinitely repeated samples). What t<\/span>he CLT does then is provide information about all three of these elements (shape, central tendency, dispersion) but about the distribution of mean. 
<strong>In short, the CLT describes the sampling distribution of the mean.\u00a0<\/strong>\r\n\r\n&nbsp;\r\n\r\nThe sample size plays an important role: the CLT applies to \"large\u00a0<em>N<\/em>\", and is stated in terms of \"as the sample size grows\", bringing us back to the point that the larger the <em>N<\/em>, the better it is for inference.\r\n\r\n&nbsp;\r\n\r\nSpecifically, the CLT states that with random sampling, as <em>N<\/em> increases (i.e., for large <em>N<\/em>), the shape, central tendency, and the dispersion (of the sampling distribution) of the mean, $\\overline{x}$, will be the following:\r\n<ol>\r\n \t<li>The distribution of\u00a0$\\overline{x}$ will approach a normal distribution in shape. (That is, the sampling distribution is a bell-shaped curve.)<\/li>\r\n \t<li>The mean of the sampling distribution[footnote]You can think of it as \"the mean of the means\", or the mean of the hypothetical variable <em>mean<\/em>.[\/footnote] (denoted as $\\mu_\\overline{x}$)\u00a0will become the population mean, $\\mu$. (That is,\u00a0$\\mu_\\overline{x}$ $=\\mu$.)<\/li>\r\n \t<li>The standard deviation of the sampling distribution (denoted as $\\sigma_\\overline{x}$) is called <em>the standard error<\/em>, and is related to the population standard deviation,<em>\u00a0\u03c3<\/em>, by the formula $\\sigma_\\overline{x}$ $=\\frac{\\sigma}{\\sqrt{N}}$.<\/li>\r\n<\/ol>\r\n&nbsp;\r\n\r\nThis may seem like a lot to take in (what with all the jargon, notation, and all) but it really <em>is<\/em> simply a description of a distribution. The next paragraph clarifies each of the CLT's points in turn.\r\n\r\n&nbsp;\r\n\r\nAs brief as it is, the CLT is conveniently packed with all sorts of useful information: The sampling distribution is normal in shape -- so we can apply all we know about the normal distribution to it (for example, that it's bisected by its mean). Hence, the sampling distribution is <em>centered<\/em> on the population mean. 
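As a quick empirical check of the three claims above, here is a minimal simulation sketch (not part of the text; the exponential population and all constants are illustrative assumptions):

```python
# Illustrative sketch: empirically checking the CLT's three claims by drawing
# many samples from a deliberately non-normal (exponential) population.
# The population choice and all constants are assumptions for illustration.
import math
import random
import statistics

random.seed(42)

MU = 1.0      # population mean of an exponential(1) variable
SIGMA = 1.0   # its population standard deviation
N = 400       # size of each sample ("large N")
REPS = 5000   # number of hypothetical repeated samples

# Approximate the sampling distribution of the mean: one mean per sample.
sample_means = [
    statistics.fmean(random.expovariate(1.0) for _ in range(N))
    for _ in range(REPS)
]

# Claim 2: the mean of the sampling distribution is (close to) mu.
grand_mean = statistics.fmean(sample_means)

# Claim 3: its standard deviation (the standard error) is (close to) sigma/sqrt(N).
se_observed = statistics.stdev(sample_means)
se_theory = SIGMA / math.sqrt(N)   # = 0.05

# Claim 1 (normal shape): about 68% of the sample means lie within 1 standard
# error of mu, as the normal curve predicts.
share_within_1se = sum(abs(m - MU) <= se_theory for m in sample_means) / REPS

print(round(grand_mean, 3), round(se_observed, 3), round(share_within_1se, 2))
```

Even though each individual observation comes from a sharply skewed distribution, the means of repeated samples pile up symmetrically around μ with spread σ/√N, just as the three claims state.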
Finally, according to the formula for the sampling distribution's standard deviation (a.k.a. the standard error), as the sample size <em>N<\/em> grows, the standard error becomes smaller[footnote]After all, <em>N<\/em> is in the denominator.[\/footnote] -- so the distribution will be less variable\/spread out, and thus the estimates will be closer to the parameters[footnote]On the flip side, the larger the original variable's dispersion, the larger the standard error; and the smaller the original variable's dispersion, the smaller the standard error (as\u00a0<em>\u03c3<\/em>\u00a0is in the numerator).[\/footnote].\r\n\r\n&nbsp;\r\n\r\nTo summarize, the sampling distribution provides us with a bridge between sample statistics (i.e., estimators) and population parameters (i.e., the estimated). <strong>The CLT provides a description of the sampling distribution: by giving us information about an estimator <\/strong>(in hypothetical repeated sampling)<strong>, it decreases the uncertainty of the estimation, since now we can calculate how close the statistic is likely to be to the parameter.<\/strong>\r\n\r\n&nbsp;\r\n\r\nI say <em>estimator<\/em> and <em>statistic<\/em>, not <em>mean<\/em>, because <strong>the CLT (or a version thereof) applies to all statistical estimators, as they all approach a normal distribution as the sample size increases. <\/strong>The latter is noteworthy because<strong> it's true regardless of the shape of the original variable's distribution <\/strong>(in the population)<strong>: a variable might not be normally distributed but its mean (and other statistics) always is.<\/strong>[footnote]Many variables tend to be approximately normally distributed in the population. 
The point I'm emphasizing here is that even when they are not, the statistics of these variables based on random sample data <em>are<\/em> normally distributed. This relates to our discussion of how large\u00a0<em>N<\/em> should be: if the original variable's distribution in the population is close to normal to start with, a smaller <em>N<\/em> will be fine. On the other hand, if a variable is not normally distributed in the population (or is too widely dispersed\/has a lot of outliers, as reflected in <em>\u03c3<\/em>), a relatively large <em>N<\/em> will be needed to ensure the normality of the sampling distribution.[\/footnote]\r\n\r\n&nbsp;\r\n\r\nIf you are wondering about the connection between random sampling and the normal distribution, the following video might help:\r\n\r\n&nbsp;\r\n\r\nhttps:\/\/youtu.be\/Kq7e6cj2nDw\r\n\r\n&nbsp;\r\n\r\nThe video above uses a <em>Galton board<\/em> to demonstrate the connection between randomness and normal curves by showing that balls falling randomly end up distributed approximately into a bell-shaped curve -- with the majority in the centre, fewer to the sides, and fewer yet in the \"tails\". You can think of a sample mean as one of these balls (all other balls are the means of other samples of the same size). Thus, what we see is that the majority of means would fall in the centre, fewer to the sides, and fewer still in the tail ends. However, since we do not have many means at all but only one, produced by one sample, we are dealing with a probability distribution. 
In turn, this tells us that the highest probability is for the mean to fall in the centre region, with a smaller probability of falling to the sides but still close to the centre, and a further decreasing probability the farther it gets from the centre, just like with any normal probability curve[footnote]Of course, in the video you see an <em>approximation<\/em> of a normal curve; after all, this is a finite, not infinite, number of balls. That is why the perfectly normal distribution is only a theoretical concept.[\/footnote].\r\n\r\n&nbsp;\r\n\r\nIf you still find all this hopelessly abstract (as I'm sure most do), you can see exactly how we use the CLT for inference in the example below. (Unfortunately, your relief to be back to examples will be premature at this point: we have more necessary theory to cover ahead. On the bright side, we are more than half-way through the chapter, so cheer up, the end is near.)\r\n\r\n&nbsp;\r\n\r\nAs a heads-up, here's the rationale of what we'll do: In order to explain inference about populations based on samples, we'll reverse-engineer it. That is, we'll start with \"knowledge\" about the population and, based on the CLT, we'll \"infer\" the sample statistic. 
At the end we'll see that following the same logic (but in reverse) we can easily do the opposite -- to estimate the population parameter through a sample statistic -- which is exactly what we want to do in the first place.\r\n\r\n&nbsp;\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\"><em>Example 6.3\u00a0Price of Statistics Textbooks<\/em><\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\n&nbsp;\r\n\r\nLet's say that university students on average spend \$250 on a statistics textbook, with a standard deviation of \$100 -- i.e., we assume to know the population parameters:\r\n\r\n&nbsp;\r\n\r\n<em>\u03bc<\/em> = 250 and\u00a0<em>\u03c3<\/em> = 100\r\n\r\n&nbsp;\r\n\r\nWe draw a random sample of<em> N<\/em>=1,600 students. We want to know the probability that such a sample would have a specific mean price paid for statistics textbooks.\r\n\r\n&nbsp;\r\n\r\nTo get that probability, we first need the standard error, $\\sigma_\\overline{x}$:\r\n\r\n&nbsp;\r\n\r\n$\\sigma_\\overline{x}$ $=\\frac{\\sigma}{\\sqrt{N}}=\\frac{100}{\\sqrt{1600}}=\\frac{100}{40}=2.5$\r\n\r\n&nbsp;\r\n\r\nNext, we can draw the sampling distribution: bell-shaped, centered on\u00a0<em>\u03bc<\/em>, and with a standard deviation (called the standard error) of \$2.5. Applying what we know about the normal distribution in terms of the probability under the curve, we get the following Fig. 
6.1.\r\n\r\n&nbsp;\r\n\r\n<em>Figure 6.1\u00a0The Sampling Distribution of the Mean Price of Statistics Textbooks<\/em>\r\n\r\n<img src=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/09\/textbook-mean-price-250-with-probabilities.png\" alt=\"\" width=\"898\" height=\"454\" class=\"alignleft wp-image-1944 size-full\" \/>\r\n\r\n&nbsp;\r\n\r\nThat is, we see that 68% of the sample mean prices of statistics textbooks (in hypothetical repeated sampling) would fall between \$247.5 and \$252.5[footnote]That is, 250-2.5=247.5 and 250+2.5=252.5.[\/footnote] (i.e., within 1 standard error away from the mean, denoted with green in Fig. 6.1) and 95% of the sample means will fall approximately between \$245 and \$255[footnote]That is, 250-2(2.5)=250-5=245 and 250+2(2.5)=250+5=255.[\/footnote] (i.e., within about 2 standard errors away from the mean, denoted with blue in the graph).\r\n\r\n&nbsp;\r\n\r\nSince this is just a heuristic way to <em>imagine<\/em> the sampling distribution, we can restate our finding more correctly: 68 percent of the time, a single, one-off sample mean will fall between \$247.5 and \$252.5; and 95 percent of the time, it will fall between approximately \$245 and \$255.\r\n\r\n&nbsp;\r\n\r\nOr, even <em>more<\/em> precisely, we have a 68 percent probability that the average paid price for statistics books obtained from a random sample of 1,600 students will be between\u00a0\$247.5 and \$252.5, and a 95 percent probability that it will be approximately between \$245 and \$255. This means that we have a 95 percent chance that the sample mean, $\\overline{x}$, will fall within \$10 (i.e., \u00b1\$5) of the population mean,\u00a0<em>\u03bc<\/em>.\r\n\r\n&nbsp;\r\n\r\nQuite good as far as predictions go, eh?\r\n\r\n<\/div>\r\n<\/div>\r\n&nbsp;\r\n\r\nOf course, we would rarely have the population mean to go by, and we would <em>never<\/em> need to estimate a statistic -- usually, it's the other way around. 
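The arithmetic behind Example 6.3 can be sketched in a few lines (a minimal illustration, not part of the text, using only the values given in the example):

```python
# Minimal sketch of Example 6.3: the standard error and the approximate
# 68%/95% ranges for the sample mean, given the assumed population values.
import math

mu = 250       # assumed population mean textbook price ($)
sigma = 100    # assumed population standard deviation ($)
N = 1600       # sample size

se = sigma / math.sqrt(N)              # 100 / 40 = 2.5

range_68 = (mu - se, mu + se)          # within 1 standard error of mu
range_95 = (mu - 2 * se, mu + 2 * se)  # within about 2 standard errors

print(se)        # 2.5
print(range_68)  # (247.5, 252.5)
print(range_95)  # (245.0, 255.0)
```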
But the sampling distribution <em>is<\/em> the same, as we still go by the CLT: With large <em>N<\/em>, it is still a normal curve. With large <em>N<\/em>, the sample mean, $\\overline{x}$, still approaches the true population mean,\u00a0<em>\u03bc.<\/em>\u00a0And, with large <em>N<\/em>, the formula for the standard error is still the same, $\\sigma_\\overline{x}$ $=\\frac{\\sigma}{\\sqrt{N}}$. For statistical inference, we need only follow the logic presented in Example 6.3 above (albeit in reverse).\r\n\r\n&nbsp;\r\n\r\nHowever, there is one thing we normally do <em>not<\/em> have in order to proceed: the population standard deviation,\u00a0<em>\u03c3<\/em>. We typically use the sample standard deviation, <em>s<\/em>, as a substitute, even if this does increase the uncertainty of the estimates[footnote]We have a way to account for that, however, as we will see below in the section on the <em>t-distribution<\/em> and the concept of <em>degrees of freedom<\/em>.[\/footnote].\r\n\r\n&nbsp;\r\n\r\nThen, finally, here is <strong>how inference works<\/strong>, in one paragraph: <strong>we use sample statistics to estimate population parameters <\/strong>-- i.e., the statistics we calculate based on random sample data act as statistical estimators for what we truly want to know, the unknown population parameters.<strong> We do that by the postulates of the Central Limit Theorem,<\/strong> which describe the sampling distribution, the bridge between the statistics and the parameters. By the CLT, <strong>the sampling distribution is normal. <\/strong>Again, by the CLT,<strong> we can center the sampling distribution on the sample mean, and calculate the sampling distribution's standard error using the sample standard deviation. 
By applying the properties of the normal probability distribution to the sampling distribution, we then produce population estimates.<\/strong> Ta-da!\r\n\r\n&nbsp;\r\n\r\nI will end this section with an example to illustrate the full process from beginning to end.\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\"><em>Example 6.4<\/em>\u00a0<em>Average Annual Income<\/em><\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\n&nbsp;\r\n\r\nImagine you are interested in the average annual income in a medium-size city. You randomly select <em>N<\/em>=1,600 people living in that city and ask them about their annual income. You then calculate the mean of the resulting variable as \$50,000, and the standard deviation as \$12,000. I.e.,\r\n\r\n&nbsp;\r\n\r\n$\\overline{x}=50,000$ and <em>s<\/em> = 12,000\r\n\r\n&nbsp;\r\n\r\n<em>As a first guess<\/em>, you <em>could<\/em> say that the average annual income in the city is \$50,000. However, since we know this is an estimate, and random error exists, you can do better: you can also provide information about how certain you are about your estimate along with some margins of error.\r\n\r\n&nbsp;\r\n\r\nTo do that, you need to draw the sampling distribution of the mean. Following the CLT, you draw the sampling distribution as a normal curve centered on \$50,000. At this point, you also need information about the sampling distribution's dispersion, i.e., its standard error. You substitute the <em>s<\/em>\u00a0you do know for the\u00a0<em>\u03c3<\/em> you don't[footnote]Recall that a \"hat\" over a symbol indicates it being estimated.[\/footnote]:\r\n\r\n&nbsp;\r\n\r\n$\\hat\\sigma_\\overline{x}$ $=s_\\overline{x}$ $=\\frac{s}{\\sqrt{N}}= \\frac{12000}{\\sqrt{1600}}=\\frac{12000}{40}=300$\r\n\r\n&nbsp;\r\n\r\nFig. 
6.2 shows the resulting sampling distribution.\r\n\r\n&nbsp;\r\n\r\n<em>Figure 6.2\u00a0Average Annual Income<\/em>\r\n\r\n<img src=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/09\/income-mean-price-50-with-probabilities.png\" alt=\"\" width=\"898\" height=\"454\" class=\"alignleft wp-image-1945 size-full\" \/>\r\n\r\n&nbsp;\r\n\r\nBased on the figure above (and following the same logic as in the previous Example 6.3), we find that the average annual income of the city's population will be between \$49,400 and \$50,600 with 95 percent probability[footnote]We get these bounds (i.e., within about 2 standard errors away from the mean) through 50,000-2(300)=50,000-600=49,400 and 50,000+2(300)=50,000+600=50,600.[\/footnote]. That is, we can be 95 percent confident that the city's average annual income will be within \$1,200 (i.e., \u00b1\$600) of the sample average of \$50,000, or, that the city's average annual income is \$50,000\u00a0\u00b1\$600, with 95 percent certainty. (Don't worry, all this talk of <em>confidence<\/em> and <em>certainty<\/em> will be explained in the next section.)\r\n\r\n<\/div>\r\n<\/div>\r\n&nbsp;\r\n\r\nYou should be able to appreciate that this \"average annual income of \$50,000\u00a0\u00b1\$600\" is a much more qualified and precise statement than simply assuming the population average is the same as the sample average (which it is likely not). <strong>Now you <em>know<\/em> how much potential variability the population mean has, with a specific <\/strong>(and quite high!) <strong>level of certainty.<\/strong>\r\n\r\n&nbsp;\r\n\r\nThis is in no way trivial, and it is the best \"guess\" you can offer as an estimate of the population mean. No other research method using sample data can generalize sample findings to the population level more closely, much less with the mathematical, probability-theory-backed evidence offered by random sampling. 
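The Example 6.4 inference step can likewise be sketched in code (a minimal illustration using only the sample values given in the example):

```python
# Minimal sketch of Example 6.4: estimating the standard error from the
# sample standard deviation s (since sigma is unknown), then forming the
# approximate 95% interval around the sample mean.
import math

x_bar = 50_000   # sample mean annual income ($)
s = 12_000       # sample standard deviation ($)
N = 1600         # sample size

se_hat = s / math.sqrt(N)   # 12000 / 40 = 300.0

# Approximate 95% interval: sample mean +/- 2 estimated standard errors.
interval_95 = (x_bar - 2 * se_hat, x_bar + 2 * se_hat)

print(se_hat)       # 300.0
print(interval_95)  # (49400.0, 50600.0)
```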
This is what statistical inference does, and now you even know how and why it works! In the next section, you can try it for yourself.\r\n\r\n&nbsp;\r\n\r\nWe are almost but not quite done with this abstract monster of a chapter. There is a light at the end of the tunnel -- what is left is tying up some loose ends, formally introducing a concept we're already using (psst, that's the <em>confidence<\/em> I mentioned above), and providing some final details on inference in the next section -- and then we are good to go: we can start on some real research and working with variables again in Chapter 7!","rendered":"<p>Despite its scary-sounding name, the <em>Central Limit Theorem<\/em>\u00a0(CLT) simply <em>describes<\/em> the sampling distribution &#8212; and simultaneously explains why, and how, we can use sample statistics (like the mean of a variable, <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/ql-cache\/quicklatex.com-0d00c2da2b2541a97ae0ac3c10e1504e_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#120;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"11\" style=\"vertical-align: 0px;\" \/>, obtained through sample data) to estimate population parameters (like the true population mean of that variable,<em> \u03bc<\/em>).<\/p>\n<p>&nbsp;<\/p>\n<p>Recall what we use to describe a variable&#8217;s frequency distribution: 1) a graph to visually display the distribution&#8217;s shape; 2) measures of central tendency; and 3) measures of dispersion. In the previous section I also asked you to imagine the (entirely theoretical, i.e., <em>probability<\/em>) distribution of the mean\u00a0<span style=\"text-indent: 18.6667px;font-size: 14pt\">(again, in theory, o<\/span><span style=\"text-indent: 1em;font-size: 14pt\">ver infinitely repeated samples). 
What t<\/span>he CLT does then is provide information about all three of these elements (shape, central tendency, dispersion) but for the distribution of the mean. <strong>In short, the CLT describes the sampling distribution of the mean.\u00a0<\/strong><\/p>\n<p>&nbsp;<\/p>\n<p>The sample size plays an important role: the CLT applies to &#8220;large\u00a0<em>N<\/em>&#8221;, and is stated in terms of &#8220;as the sample size grows&#8221;, bringing us back to the point that the larger the <em>N<\/em>, the better it is for inference.<\/p>\n<p>&nbsp;<\/p>\n<p>Specifically, the CLT states that with random sampling, as <em>N<\/em> increases (i.e., for large <em>N<\/em>), the shape, central tendency, and the dispersion (of the sampling distribution) of the mean, <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/ql-cache\/quicklatex.com-0d00c2da2b2541a97ae0ac3c10e1504e_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#120;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"11\" style=\"vertical-align: 0px;\" \/>, will be the following:<\/p>\n<ol>\n<li>The distribution of\u00a0<img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/ql-cache\/quicklatex.com-0d00c2da2b2541a97ae0ac3c10e1504e_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#120;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"11\" style=\"vertical-align: 0px;\" \/> will approach a normal distribution in shape. 
(That is, the sampling distribution is a bell-shaped curve.)<\/li>\n<li>The mean of the sampling distribution<a class=\"footnote\" title=\"You can think of it as &quot;the mean of the means&quot;, or the mean of the hypothetical variable mean.\" id=\"return-footnote-99-1\" href=\"#footnote-99-1\" aria-label=\"Footnote 1\"><sup class=\"footnote\">[1]<\/sup><\/a> (denoted as <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/ql-cache\/quicklatex.com-b0f6659031b03d0225ccaadcac32d125_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#109;&#117;&#95;&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#120;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"20\" style=\"vertical-align: -4px;\" \/>)\u00a0 will become the population mean, <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/ql-cache\/quicklatex.com-461fe1a58a75801541487ddf10d32abd_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#109;&#117;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"11\" style=\"vertical-align: -4px;\" \/>. 
(That is,\u00a0<img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/ql-cache\/quicklatex.com-b0f6659031b03d0225ccaadcac32d125_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#109;&#117;&#95;&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#120;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"20\" style=\"vertical-align: -4px;\" \/> <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/ql-cache\/quicklatex.com-1bc592cc22578fa3843a56b786c46152_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#61;&#92;&#109;&#117;\" title=\"Rendered by QuickLaTeX.com\" height=\"12\" width=\"30\" style=\"vertical-align: -4px;\" \/>.)<\/li>\n<li>The standard deviation of the sampling distribution (denoted as <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/ql-cache\/quicklatex.com-de5382c00a55332dd89774492d104d0c_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#115;&#105;&#103;&#109;&#97;&#95;&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#120;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"19\" style=\"vertical-align: -3px;\" \/>) is called <em>the standard error<\/em>, and is related to the population standard deviation,<em>\u00a0\u03c3<\/em>, by the formula <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/ql-cache\/quicklatex.com-de5382c00a55332dd89774492d104d0c_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#115;&#105;&#103;&#109;&#97;&#95;&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#120;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"19\" style=\"vertical-align: -3px;\" \/> <img loading=\"lazy\" decoding=\"async\" 
src=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/ql-cache\/quicklatex.com-ff1cbb6cd1d0399c05c18824c0141efd_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#61;&#92;&#102;&#114;&#97;&#99;&#123;&#92;&#115;&#105;&#103;&#109;&#97;&#125;&#123;&#92;&#115;&#113;&#114;&#116;&#123;&#78;&#125;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"24\" width=\"45\" style=\"vertical-align: -11px;\" \/>.<\/li>\n<\/ol>\n<p>&nbsp;<\/p>\n<p>This may seem like a lot to take in (what with all the jargon, notation, and all) but it really <em>is<\/em> simply a description of a distribution. The next paragraph clarifies each of the CLT&#8217;s points in turn.<\/p>\n<p>&nbsp;<\/p>\n<p>As brief as it is, the CLT is conveniently packed with all sorts of useful information: The sampling distribution is normal in shape &#8212; so we can apply all we know about the normal distribution to it (for example, that it&#8217;s bisected by its mean). Hence, the sampling distribution is <em>centered<\/em> on the population mean. 
Finally, according to the formula for the sampling distribution&#8217;s standard deviation (a.k.a. the standard error), as the sample size <em>N<\/em> grows, the standard error becomes smaller<a class=\"footnote\" title=\"After all, N is in the denominator.\" id=\"return-footnote-99-2\" href=\"#footnote-99-2\" aria-label=\"Footnote 2\"><sup class=\"footnote\">[2]<\/sup><\/a> &#8212; so the distribution will be less variable\/spread out, and thus the estimates will be closer to the parameters<a class=\"footnote\" title=\"On the flip side, the larger the original variable's dispersion, the larger the standard error; and the smaller the original variable's dispersion, the smaller the standard error\u00a0(as\u00a0\u03c3\u00a0is in the numerator).\" id=\"return-footnote-99-3\" href=\"#footnote-99-3\" aria-label=\"Footnote 3\"><sup class=\"footnote\">[3]<\/sup><\/a>.<\/p>\n<p>&nbsp;<\/p>\n<p>To summarize, the sampling distribution provides us with a bridge between sample statistics (i.e., estimators) and population parameters (i.e., the estimated). <strong>The CLT provides a description of the sampling distribution: by giving us information about an estimator <\/strong>(in hypothetical repeated sampling)<strong>, it decreases the uncertainty of the estimation, since now we can calculate how close the statistic is likely to be to the parameter.<\/strong><\/p>\n<p>&nbsp;<\/p>\n<p>I say <em>estimator<\/em> and <em>statistic<\/em>, not <em>mean<\/em>, because <strong>the CLT (or a version thereof) applies to all statistical estimators, as they all approach a normal distribution as the sample size increases. <\/strong>The latter is noteworthy because<strong> it&#8217;s true regardless of the shape of the original variable&#8217;s distribution <\/strong>(in the population)<strong>: a variable might not be normally distributed but its mean (and other statistics) always is.<\/strong><a class=\"footnote\" title=\"Many variables tend to be approximately normally distributed in the population. 
The point I'm emphasizing here is that even when they are not, the statistics of these variables based on random sample data are normally distributed. This relates to our discussion of how large\u00a0N should be: if the original variable's distribution in the population is close to normal to start with, a smaller N will be fine. On the other hand, if a variable is not normally distributed in the population (or is too widely dispersed\/has a lot of outliers, as reflected in \u03c3), a relatively large N will be needed to ensure the normality of the sampling distribution.\" id=\"return-footnote-99-4\" href=\"#footnote-99-4\" aria-label=\"Footnote 4\"><sup class=\"footnote\">[4]<\/sup><\/a><\/p>\n<p>&nbsp;<\/p>\n<p>If you are wondering about the connection between random sampling and the normal distribution, the following video might help:<\/p>\n<p>&nbsp;<\/p>\n<p><iframe loading=\"lazy\" id=\"oembed-1\" title=\"The Galton Board\" width=\"500\" height=\"375\" src=\"https:\/\/www.youtube.com\/embed\/Kq7e6cj2nDw?feature=oembed&#38;rel=0\" frameborder=\"0\" allowfullscreen=\"allowfullscreen\"><\/iframe><\/p>\n<p>&nbsp;<\/p>\n<p>The video above uses a <em>Galton board<\/em> to demonstrate the connection between randomness and normal curves by showing that balls falling randomly end up distributed approximately into a bell-shaped curve &#8212; with the majority in the centre, fewer to the sides, and fewer yet in the &#8220;tails&#8221;. You can think of a sample mean as one of these balls (all other balls are the means of other samples of the same size). Thus, what we see is that the majority of means would fall in the centre, fewer to the sides, and fewer still in the tail ends. However, since we do not have many means at all but only one, produced by one sample, we are dealing with a probability distribution. 
In turn, this tells us that the highest probability is for the mean to fall in the centre region, with a smaller probability of falling to the sides but still close to the centre, and a further decreasing probability the farther it gets from the centre, just like with any normal probability curve<a class=\"footnote\" title=\"Of course, in the video you see an approximation of a normal curve; after all, this is a finite, not infinite, number of balls. That is why the perfectly normal distribution is only a theoretical concept.\" id=\"return-footnote-99-5\" href=\"#footnote-99-5\" aria-label=\"Footnote 5\"><sup class=\"footnote\">[5]<\/sup><\/a>.<\/p>\n<p>&nbsp;<\/p>\n<p>If you still find all this hopelessly abstract (as I&#8217;m sure most do), you can see exactly how we use the CLT for inference in the example below. (Unfortunately, your relief to be back to examples will be premature at this point: we have more necessary theory to cover ahead. On the bright side, we are more than half-way through the chapter, so cheer up, the end is near.)<\/p>\n<p>&nbsp;<\/p>\n<p>As a heads-up, here&#8217;s the rationale of what we&#8217;ll do: In order to explain inference about populations based on samples, we&#8217;ll reverse-engineer it. That is, we&#8217;ll start with &#8220;knowledge&#8221; about the population and, based on the CLT, we&#8217;ll &#8220;infer&#8221; the sample statistic. 
At the end we&#8217;ll see that following the same logic (but in reverse) we can easily do the opposite &#8212; to estimate the population parameter through a sample statistic &#8212; which is exactly what we want to do in the first place.<\/p>\n<p>&nbsp;<\/p>\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\"><em>Example 6.3\u00a0Price of Statistics Textbooks<\/em><\/p>\n<\/header>\n<div class=\"textbox__content\">\n<p>&nbsp;<\/p>\n<p>Let&#8217;s say that university students on average spend &#36;250 for a statistics textbook, with a standard deviation of &#36;100 &#8212; i.e., we assume to know the population parameters:<\/p>\n<p>&nbsp;<\/p>\n<p><em>\u03bc<\/em> = 250 and\u00a0<em>\u03c3<\/em> = 100<\/p>\n<p>&nbsp;<\/p>\n<p>We draw a random sample of<em> N<\/em>=1,600 students. We want to know the probability for that sample to have a specific mean price paid for statistics textbooks.<\/p>\n<p>&nbsp;<\/p>\n<p>To get that probability, we first need the standard error, <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/ql-cache\/quicklatex.com-de5382c00a55332dd89774492d104d0c_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#115;&#105;&#103;&#109;&#97;&#95;&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#120;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"19\" style=\"vertical-align: -3px;\" \/>:<\/p>\n<p>&nbsp;<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/ql-cache\/quicklatex.com-de5382c00a55332dd89774492d104d0c_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#115;&#105;&#103;&#109;&#97;&#95;&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#120;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"19\" style=\"vertical-align: -3px;\" \/> <img loading=\"lazy\" 
decoding=\"async\" src=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/ql-cache\/quicklatex.com-1311d383ed24521a07af4725c9a247b2_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#61;&#92;&#102;&#114;&#97;&#99;&#123;&#92;&#115;&#105;&#103;&#109;&#97;&#125;&#123;&#92;&#115;&#113;&#114;&#116;&#123;&#78;&#125;&#125;&#61;&#92;&#102;&#114;&#97;&#99;&#123;&#49;&#48;&#48;&#125;&#123;&#92;&#115;&#113;&#114;&#116;&#123;&#49;&#54;&#48;&#48;&#125;&#125;&#61;&#92;&#102;&#114;&#97;&#99;&#123;&#49;&#48;&#48;&#125;&#123;&#52;&#48;&#125;&#61;&#50;&#46;&#53;\" title=\"Rendered by QuickLaTeX.com\" height=\"27\" width=\"207\" style=\"vertical-align: -11px;\" \/><\/p>\n<p>&nbsp;<\/p>\n<p>Next, we can draw the sampling distribution: bell-shaped, centered on\u00a0<em>\u03bc<\/em>, and with a (standard deviation called) standard error of &#36;2.5. Applying what we know about the normal distribution in terms of the probability under the curve, we get the following Fig. 6.1.<\/p>\n<p>&nbsp;<\/p>\n<p><em>Figure 6.1\u00a0The Sampling Distribution of the Mean Price of Statistics Textbooks<\/em><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/09\/textbook-mean-price-250-with-probabilities.png\" alt=\"\" width=\"898\" height=\"454\" class=\"alignleft wp-image-1944 size-full\" srcset=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/09\/textbook-mean-price-250-with-probabilities.png 898w, https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/09\/textbook-mean-price-250-with-probabilities-300x152.png 300w, https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/09\/textbook-mean-price-250-with-probabilities-768x388.png 768w, https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/09\/textbook-mean-price-250-with-probabilities-65x33.png 65w, 
https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/09\/textbook-mean-price-250-with-probabilities-225x114.png 225w, https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/09\/textbook-mean-price-250-with-probabilities-350x177.png 350w\" sizes=\"auto, (max-width: 898px) 100vw, 898px\" \/><\/p>\n<p>&nbsp;<\/p>\n<p>That is, we see that 68% of the sample mean prices of statistics textbooks (in hypothetical repeated sampling) would fall between &#36;247.5 and &#36;252.5<a class=\"footnote\" title=\"That is, 250-2.5=247.5 and 250+2.5=252.5.\" id=\"return-footnote-99-6\" href=\"#footnote-99-6\" aria-label=\"Footnote 6\"><sup class=\"footnote\">[6]<\/sup><\/a> (i.e., within 1 standard error away from the mean, denoted with green in Fig. 6.1) and 95% of the sample means will fall approximately between &#36;245 and &#36;255<a class=\"footnote\" title=\"That is, 250-2(2.5)=250-5=245 and 250+2(2.5)=250+5=255.\" id=\"return-footnote-99-7\" href=\"#footnote-99-7\" aria-label=\"Footnote 7\"><sup class=\"footnote\">[7]<\/sup><\/a> (i.e., within about 2 standard errors away from the mean, denoted with blue in the graph).<\/p>\n<p>&nbsp;<\/p>\n<p>Since this is just a heuristic way to <em>imagine<\/em> the sampling distribution, we can restate our finding more correctly: a single, one-off sample mean will fall between &#36;247.5 and &#36;252.5 68 percent of the time, and between approximately &#36;245 and &#36;255 95 percent of the time.<\/p>\n<p>&nbsp;<\/p>\n<p>Or, even <em>more<\/em> precisely, we have a 68 percent probability that the average paid price for statistics books obtained from a random sample of 1,600 students will be between\u00a0&#36;247.5 and &#36;252.5, and a 95 percent probability that it will be approximately between &#36;245 and &#36;255. 
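<\/p>\n<p>&nbsp;<\/p>\n<p>The arithmetic behind these intervals is easy to check yourself. As a short sketch (again, the Python here is my illustration, not the book&#8217;s):<\/p>

```python
import math

# Assumed population parameters from Example 6.3, and the sample size.
mu, sigma, N = 250, 100, 1600
se = sigma / math.sqrt(N)        # standard error of the mean
print(se)                        # 2.5

# About 68% of sample means fall within 1 standard error of mu...
print(mu - se, mu + se)          # 247.5 252.5
# ...and about 95% within roughly 2 standard errors.
print(mu - 2 * se, mu + 2 * se)  # 245.0 255.0
```

<p>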
This means that we have a 95 percent chance that the sample mean, <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/ql-cache\/quicklatex.com-0d00c2da2b2541a97ae0ac3c10e1504e_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#120;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"11\" style=\"vertical-align: 0px;\" \/>, will fall within \u00b1&#36;5 (a &#36;10-wide interval) of the population mean,\u00a0<em>\u03bc<\/em>.<\/p>\n<p>&nbsp;<\/p>\n<p>Quite good as far as predictions go, eh?<\/p>\n<\/div>\n<\/div>\n<p>&nbsp;<\/p>\n<p>Of course, we would rarely have the population mean to go by, and we would <em>never<\/em> need to estimate a statistic &#8212; usually, it&#8217;s the other way around. But the sampling distribution <em>is<\/em> the same, as we still go by the CLT: with large <em>N<\/em>, it is still a normal curve. With large <em>N<\/em>, the sample mean, <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/ql-cache\/quicklatex.com-0d00c2da2b2541a97ae0ac3c10e1504e_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#120;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"11\" style=\"vertical-align: 0px;\" \/>, still approaches the true population mean,\u00a0<em>\u03bc<\/em>.\u00a0And, with large <em>N<\/em>, the formula for the standard error is still the same, <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/ql-cache\/quicklatex.com-de5382c00a55332dd89774492d104d0c_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#115;&#105;&#103;&#109;&#97;&#95;&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#120;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"19\" 
style=\"vertical-align: -3px;\" \/> <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/ql-cache\/quicklatex.com-ff1cbb6cd1d0399c05c18824c0141efd_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#61;&#92;&#102;&#114;&#97;&#99;&#123;&#92;&#115;&#105;&#103;&#109;&#97;&#125;&#123;&#92;&#115;&#113;&#114;&#116;&#123;&#78;&#125;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"24\" width=\"45\" style=\"vertical-align: -11px;\" \/>. For statistical inference, we<span style=\"text-indent: 1em;font-size: 14pt\">\u00a0need only follow the logic presented in Example 6.3 above (albeit in reverse).<\/span><\/p>\n<p>&nbsp;<\/p>\n<p>However, there is one thing we normally do <em>not<\/em> have in order to proceed: the population standard deviation,\u00a0<em>\u03c3<\/em>. We typically use the sample standard deviation, <em>s<\/em>, as a substitute, even if this does increase the uncertainty of the estimates<a class=\"footnote\" title=\"We have a way to account for that, however, as we will see in Section 6.6 on the t-distribution\u00a0below and the concept of degrees of freedom.\" id=\"return-footnote-99-8\" href=\"#footnote-99-8\" aria-label=\"Footnote 8\"><sup class=\"footnote\">[8]<\/sup><\/a>.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p>Then, finally, here is <strong>how inference works<\/strong>, in one paragraph: <strong>we use sample statistics to estimate population parameters <\/strong>&#8212; i.e., the statistics we calculate based on random sample data act as statistical estimators for what we truly want to know, the unknown population parameters.<strong> We do that by the postulates of the Central Limit Theorem <\/strong>which describe the sampling distribution, the bridge between the statistics and the parameters. By the CLT, we have <strong>the sampling distribution as normal. 
<\/strong>Again, by the CLT,<strong> we can center the sampling distribution on the sample mean, and calculate the sampling distribution&#8217;s standard error using the sample standard deviation. By applying the properties of the normal probability distribution to the sampling distribution<\/strong><span style=\"text-indent: 18.6667px;font-size: 14pt\"><strong>, we then produce population estimates.<\/strong> Ta-da!<\/span><\/p>\n<p>&nbsp;<\/p>\n<p>I will end this section with an example to illustrate the full process from the beginning to the end.<\/p>\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\"><em>Example 6.4<\/em>\u00a0<em>Average Annual Income<\/em><\/p>\n<\/header>\n<div class=\"textbox__content\">\n<p>&nbsp;<\/p>\n<p>Imagine you are interested in the average annual income in a medium-size city. You randomly select <em>N<\/em>=1,600 people living in that city and ask them about their annual income. You then calculate the mean of the resulting variable as &#36;50,000, and the standard deviation as &#36;12,000. I.e.,<\/p>\n<p>&nbsp;<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/ql-cache\/quicklatex.com-e621967facebca1611b0b6819b8a8d9f_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#120;&#125;&#61;&#53;&#48;&#44;&#48;&#48;&#48;\" title=\"Rendered by QuickLaTeX.com\" height=\"17\" width=\"86\" style=\"vertical-align: -4px;\" \/> and <em>s<\/em> = 12,000<\/p>\n<p>&nbsp;<\/p>\n<p><em>As a first guess<\/em>, you <em>could<\/em> say that the average annual income in the city is &#36;50,000. 
However, since we know this is an estimate, and random error exists, you can do better: you can also provide information about how certain you are about your estimate along with some margins for error.<\/p>\n<p>&nbsp;<\/p>\n<p>To do that, you need to draw the sampling distribution of the mean. Following the CLT, you draw the sampling distribution as a normal curve centered on &#36;50,000. At this point, you also need information about the sampling distribution&#8217;s dispersion, i.e., its standard error. You substitute the <em>s<\/em>\u00a0you do know for the\u00a0<em>\u03c3<\/em> you don&#8217;t<a class=\"footnote\" title=\"Recall that a &quot;hat&quot; over a symbol indicates it being estimated.\" id=\"return-footnote-99-9\" href=\"#footnote-99-9\" aria-label=\"Footnote 9\"><sup class=\"footnote\">[9]<\/sup><\/a>:<\/p>\n<p>&nbsp;<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/ql-cache\/quicklatex.com-785ae6e249be46641e4a402bd65210a5_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#92;&#104;&#97;&#116;&#92;&#115;&#105;&#103;&#109;&#97;&#95;&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#120;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"16\" width=\"19\" style=\"vertical-align: -3px;\" \/> <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/ql-cache\/quicklatex.com-1f908a1fb1b7a5465f07a0ace257e8ab_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" alt=\"&#61;&#115;&#95;&#92;&#111;&#118;&#101;&#114;&#108;&#105;&#110;&#101;&#123;&#120;&#125;\" title=\"Rendered by QuickLaTeX.com\" height=\"11\" width=\"36\" style=\"vertical-align: -3px;\" \/> <img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/ql-cache\/quicklatex.com-1f590c8557710f1e43b84094680eb2c5_l3.png\" class=\"ql-img-inline-formula quicklatex-auto-format\" 
alt=\"&#61;&#92;&#102;&#114;&#97;&#99;&#123;&#115;&#125;&#123;&#92;&#115;&#113;&#114;&#116;&#123;&#78;&#125;&#125;&#61;&#32;&#92;&#102;&#114;&#97;&#99;&#123;&#49;&#50;&#48;&#48;&#48;&#125;&#123;&#92;&#115;&#113;&#114;&#116;&#123;&#49;&#54;&#48;&#48;&#125;&#125;&#61;&#92;&#102;&#114;&#97;&#99;&#123;&#49;&#50;&#48;&#48;&#48;&#125;&#123;&#52;&#48;&#125;&#61;&#51;&#48;&#48;\" title=\"Rendered by QuickLaTeX.com\" height=\"27\" width=\"225\" style=\"vertical-align: -11px;\" \/><\/p>\n<p>&nbsp;<\/p>\n<p>Fig. 6.2 shows the resulting sampling distribution.<\/p>\n<p>&nbsp;<\/p>\n<p><em>Figure 6.2\u00a0Average Annual Income<\/em><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/09\/income-mean-price-50-with-probabilities.png\" alt=\"\" width=\"898\" height=\"454\" class=\"alignleft wp-image-1945 size-full\" srcset=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/09\/income-mean-price-50-with-probabilities.png 898w, https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/09\/income-mean-price-50-with-probabilities-300x152.png 300w, https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/09\/income-mean-price-50-with-probabilities-768x388.png 768w, https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/09\/income-mean-price-50-with-probabilities-65x33.png 65w, https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/09\/income-mean-price-50-with-probabilities-225x114.png 225w, https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/09\/income-mean-price-50-with-probabilities-350x177.png 350w\" sizes=\"auto, (max-width: 898px) 100vw, 898px\" \/><\/p>\n<p>&nbsp;<\/p>\n<p>Based on the figure above (and following the same logic as in the previous Example 6.3), we find that the average annual income of the 
city&#8217;s population will be between &#36;49,400 and &#36;50,600 with 95 percent probability<a class=\"footnote\" title=\"We get these bounds (i.e., within about 2 standard errors away from the mean) through 50,000-2(300)=50,000-600=49,400 and 50,000+2(300)=50,000+600=50,600.\" id=\"return-footnote-99-10\" href=\"#footnote-99-10\" aria-label=\"Footnote 10\"><sup class=\"footnote\">[10]<\/sup><\/a>. That is, we can be 95 percent confident that the city&#8217;s average annual income will be within &#36;600 of the sample average of &#36;50,000, or that the city&#8217;s average annual income is &#36;50,000\u00a0\u00b1&#36;600, with 95 percent certainty. (Don&#8217;t worry, all this talk of <em>confidence<\/em> and <em>certainty<\/em> will be explained in the next section.)<\/p>\n<\/div>\n<\/div>\n<p>&nbsp;<\/p>\n<p>You should be able to appreciate that this &#8220;average annual income of &#36;50,000\u00a0\u00b1&#36;600&#8221; is a much more qualified and precise statement than simply assuming the population average is the same as the sample average (which it is likely not). <strong>Now you <em>know<\/em> how much potential variability the population mean has, with a specific <\/strong>(and quite high!) <strong>level of certainty.<\/strong><\/p>\n<p>&nbsp;<\/p>\n<p>This is in no way trivial, and it is the best &#8220;guess&#8221; you can offer as an estimate of the population mean. No other research method using sample data can generalize sample findings to the population as closely, much less with the mathematical, probability-theory-backed evidence offered by random sampling. This is what statistical inference does, and now you even know how and why it works! In the next section, you can try it for yourself.<\/p>\n<p>&nbsp;<\/p>\n<p>We are almost but not quite done with this abstract monster of a chapter. 
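<\/p>\n<p>&nbsp;<\/p>\n<p>Before we push on, here is Example 6.4&#8217;s estimation condensed into a few lines (once more, the Python is my illustration, not the book&#8217;s):<\/p>

```python
import math

# Sample statistics from Example 6.4: mean income, standard deviation, size.
xbar, s, N = 50_000, 12_000, 1600
se_hat = s / math.sqrt(N)  # estimated standard error, substituting s for sigma
print(se_hat)              # 300.0

# Approximate 95% interval: within about 2 standard errors of the sample mean.
print(xbar - 2 * se_hat, xbar + 2 * se_hat)  # 49400.0 50600.0
```

<p>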
There is a light at the end of the tunnel &#8212; what is left is tying up a few loose ends, formally introducing a concept we&#8217;re already using (psst, that&#8217;s the <em>confidence<\/em> I mentioned above), and providing some final details on inference in the next section &#8212; and then we are good to go: we can start on some real research and working with variables again in Chapter 7!<\/p>\n<hr class=\"before-footnotes clear\" \/><div class=\"footnotes\"><ol><li id=\"footnote-99-1\">You can think of it as \"the mean of the means\", or the mean of the hypothetical variable <em>mean<\/em>. <a href=\"#return-footnote-99-1\" class=\"return-footnote\" aria-label=\"Return to footnote 1\">&crarr;<\/a><\/li><li id=\"footnote-99-2\">After all, <em>N<\/em> is in the denominator. <a href=\"#return-footnote-99-2\" class=\"return-footnote\" aria-label=\"Return to footnote 2\">&crarr;<\/a><\/li><li id=\"footnote-99-3\">On the flip side, the larger the original variable's dispersion, the larger the standard error; and the smaller the original variable's dispersion, the smaller the standard error (as <em>\u03c3<\/em> is in the numerator). <a href=\"#return-footnote-99-3\" class=\"return-footnote\" aria-label=\"Return to footnote 3\">&crarr;<\/a><\/li><li id=\"footnote-99-4\">Many variables tend to be approximately normally distributed in the population. The point I'm emphasizing here is that even when they are not, the statistics of these variables based on random sample data <em>are<\/em> normally distributed. This relates to our discussion of how large\u00a0<em>N<\/em> should be: if the original variable's distribution in the population is close to normal to start with, a smaller <em>N<\/em> will be fine. 
On the other hand, if a variable is not normally distributed in the population (or is too widely dispersed\/has a lot of outliers, as reflected in <em>\u03c3<\/em>), a relatively large <em>N<\/em> will be needed to ensure the normality of the sampling distribution. <a href=\"#return-footnote-99-4\" class=\"return-footnote\" aria-label=\"Return to footnote 4\">&crarr;<\/a><\/li><li id=\"footnote-99-5\">Of course, in the video you see an <em>approximation<\/em> of a normal curve; after all, this is a finite, not infinite, number of balls. That is why the perfectly normal distribution is only a theoretical concept. <a href=\"#return-footnote-99-5\" class=\"return-footnote\" aria-label=\"Return to footnote 5\">&crarr;<\/a><\/li><li id=\"footnote-99-6\">That is, 250-2.5=247.5 and 250+2.5=252.5. <a href=\"#return-footnote-99-6\" class=\"return-footnote\" aria-label=\"Return to footnote 6\">&crarr;<\/a><\/li><li id=\"footnote-99-7\">That is, 250-2(2.5)=250-5=245 and 250+2(2.5)=250+5=255. <a href=\"#return-footnote-99-7\" class=\"return-footnote\" aria-label=\"Return to footnote 7\">&crarr;<\/a><\/li><li id=\"footnote-99-8\">We have a way to account for that, however, as we will see in the section on the <em>t-distribution<\/em>\u00a0below and the concept of <em>degrees of freedom<\/em>. <a href=\"#return-footnote-99-8\" class=\"return-footnote\" aria-label=\"Return to footnote 8\">&crarr;<\/a><\/li><li id=\"footnote-99-9\">Recall that a \"hat\" over a symbol indicates that it is an estimate. <a href=\"#return-footnote-99-9\" class=\"return-footnote\" aria-label=\"Return to footnote 9\">&crarr;<\/a><\/li><li id=\"footnote-99-10\">We get these bounds (i.e., within about 2 standard errors away from the mean) through 50,000-2(300)=50,000-600=49,400 and 50,000+2(300)=50,000+600=50,600. 
<a href=\"#return-footnote-99-10\" class=\"return-footnote\" aria-label=\"Return to footnote 10\">&crarr;<\/a><\/li><\/ol><\/div>","protected":false},"author":533,"menu_order":6,"template":"","meta":{"pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-99","chapter","type-chapter","status-publish","hentry"],"part":32,"_links":{"self":[{"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/pressbooks\/v2\/chapters\/99","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/wp\/v2\/users\/533"}],"version-history":[{"count":25,"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/pressbooks\/v2\/chapters\/99\/revisions"}],"predecessor-version":[{"id":2052,"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/pressbooks\/v2\/chapters\/99\/revisions\/2052"}],"part":[{"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/pressbooks\/v2\/parts\/32"}],"metadata":[{"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/pressbooks\/v2\/chapters\/99\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/wp\/v2\/media?parent=99"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/pressbooks\/v2\/chapter-type?post=99"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/wp\/v2\/contributor?post=99"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/wp\/v2\/license?post=99"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{
rel}","templated":true}]}}