{"id":767,"date":"2019-03-07T16:28:54","date_gmt":"2019-03-07T21:28:54","guid":{"rendered":"https:\/\/pressbooks.bccampus.ca\/simplestats\/?post_type=chapter&#038;p=767"},"modified":"2019-08-19T19:32:11","modified_gmt":"2019-08-19T23:32:11","slug":"5-1-the-normal-distribution","status":"publish","type":"chapter","link":"https:\/\/pressbooks.bccampus.ca\/simplestats\/chapter\/5-1-the-normal-distribution\/","title":{"raw":"5.1 The Normal Distribution","rendered":"5.1 The Normal Distribution"},"content":{"raw":"&nbsp;\r\n\r\nYou might have already heard of bell curves (or bell-shaped curves), or even normal curves. If you have, you probably also know they look similar to the one in Fig. 5.1.\r\n\r\n&nbsp;\r\n\r\n<em>Figure 5.1 Body Mass Index of Respondents (CCHS 2015\/2016)<\/em>\r\n\r\n<img src=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/08\/normal-curve-bmi-cchs.png\" alt=\"\" width=\"462\" height=\"370\" class=\"alignnone wp-image-1679 size-full\" \/>\r\n\r\n&nbsp;\r\n\r\nFig. 5.1 shows a histogram with the distribution of the variable <em>body mass index<\/em> (or <em>BMI<\/em>) of respondents to the <em>CCHS 2015\/2016<\/em>. As the heights of its bars show, most cases cluster at the centre (i.e., most people's <em>BMI<\/em> is close to the average), while progressively fewer cases end up in the \"tails\" of the distribution (i.e., the further a <em>BMI<\/em> is from the average, the fewer cases there are).\r\n\r\n&nbsp;\r\n\r\nNotice that the distribution (as reflected in the green bars) is not perfectly symmetric but slightly positively skewed: the right \"tail\" is longer than the left. Still, its shape approximates a bell well enough (for comparison, note the black curve in Fig. 5.1, which is a true bell shape). 
<strong>We call this type of distribution <em>approximately normal<\/em><\/strong>.\r\n\r\n&nbsp;\r\n\r\nA great many interval\/ratio variables in the world tend to have an approximately normal distribution when plotted (true for both the social and natural sciences). That is, the majority of observations are clustered in the middle of the distribution (i.e., they tend to be <em>average<\/em>); we find fewer observations just below and just above the average, and fewer still that are much below or much above the average.\r\n\r\n&nbsp;\r\n\r\nThink about height, for example. Most people are of average height (that's why it's called <em>average<\/em> height after all), some people are above and some below average, fewer people are much taller or shorter, and only rarely are people extremely short or extremely tall. Variables like age, or weight (which you can see in Fig. 5.2 below[footnote]The reason you observe the \"double\" distribution -- one shorter (darker) while the other taller (lighter) -- is due to the self-reporting of weight. Most people tend to report their weight in whole numbers, and here some have done so, stating their weight as 65 kg or 85 kg, etc.; these are the tall bars. Others, however, may have reported it with grams and\/or in pounds (which when converted to kilograms would produce a non-whole number weight), thus resulting in weights such as 65.35 kg or 85.75 kg, etc., leading to the short bars and to the histogram appearing like two histograms plotted on top of each other. Had the responses been rounded to the nearest whole kilogram, the histogram would have taken a regular, \"single\" normal-curve shape.[\/footnote]) but also, say, test marks, or points scored per hockey game, or text messages sent per day, etc. are similar. There will be an average, and a continuous decrease in frequency the further one gets from that average.\r\n\r\n&nbsp;\r\n\r\n<em>Fig. 
5.2 Weight of Respondents (CCHS 2015\/2016)<\/em>\r\n\r\n<img src=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/08\/normal-curve-weight-cchs.png\" alt=\"\" width=\"462\" height=\"370\" class=\"alignnone wp-image-1680 size-full\" \/>\r\n\r\n&nbsp;\r\n\r\n<em>As fascinating as all this is<\/em>, you might be thinking now, <em>why do we care about it?<\/em> <em>It's just one type of distribution among many.<\/em>\r\n\r\n&nbsp;\r\n\r\nTrue, but as I already mentioned, the normal distribution is special, and not just because many variables' histograms tend to form an approximately normal curve. To understand why, we need to start exploring the normal distribution as a <em>theoretical<\/em> concept (or, to borrow from Max Weber, as an <em>ideal type<\/em>).\r\n\r\n&nbsp;","rendered":"<p>&nbsp;<\/p>\n<p>You might have already heard of bell curves (or bell-shaped curves), or even normal curves. If you have, you probably also know they look similar to the one in Fig. 
5.1.<\/p>\n<p>&nbsp;<\/p>\n<p><em>Figure 5.1 Body Mass Index of Respondents (CCHS 2015\/2016)<\/em><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/08\/normal-curve-bmi-cchs.png\" alt=\"\" width=\"462\" height=\"370\" class=\"alignnone wp-image-1679 size-full\" srcset=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/08\/normal-curve-bmi-cchs.png 462w, https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/08\/normal-curve-bmi-cchs-300x240.png 300w, https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/08\/normal-curve-bmi-cchs-65x52.png 65w, https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/08\/normal-curve-bmi-cchs-225x180.png 225w, https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/08\/normal-curve-bmi-cchs-350x280.png 350w\" sizes=\"auto, (max-width: 462px) 100vw, 462px\" \/><\/p>\n<p>&nbsp;<\/p>\n<p>Fig. 5.1 shows a histogram with the distribution of the variable <em>body mass index<\/em> (or <em>BMI<\/em>) of respondents to the <em>CCHS 2015\/2016<\/em>. As the heights of its bars show, most cases cluster at the centre (i.e., most people&#8217;s <em>BMI<\/em> is close to the average), while progressively fewer cases end up in the &#8220;tails&#8221; of the distribution (i.e., the further a <em>BMI<\/em> is from the average, the fewer cases there are).<\/p>\n<p>&nbsp;<\/p>\n<p>Notice that the distribution (as reflected in the green bars) is not perfectly symmetric but slightly positively skewed: the right &#8220;tail&#8221; is longer than the left. Still, its shape approximates a bell well enough (for comparison, note the black curve in Fig. 5.1, which is a true bell shape). 
<strong>We call this type of distribution <em>approximately normal<\/em><\/strong>.<\/p>\n<p>&nbsp;<\/p>\n<p>A great many interval\/ratio variables in the world tend to have an approximately normal distribution when plotted (true for both the social and natural sciences). That is, the majority of observations are clustered in the middle of the distribution (i.e., they tend to be <em>average<\/em>); we find fewer observations just below and just above the average, and fewer still that are much below or much above the average.<\/p>\n<p>&nbsp;<\/p>\n<p>Think about height, for example. Most people are of average height (that&#8217;s why it&#8217;s called <em>average<\/em> height after all), some people are above and some below average, fewer people are much taller or shorter, and only rarely are people extremely short or extremely tall. Variables like age, or weight (which you can see in Fig. 5.2 below<a class=\"footnote\" title=\"The reason you observe the &quot;double&quot; distribution -- one shorter (darker) while the other taller (lighter) -- is due to the self-reporting of weight. Most people tend to report their weight in whole numbers, and here some have done so, stating their weight as 65 kg or 85 kg, etc.; these are the tall bars. Others, however, may have reported it with grams and\/or in pounds (which when converted to kilograms would produce a non-whole number weight), thus resulting in weights such as 65.35 kg or 85.75 kg, etc., leading to the short bars and to the histogram appearing like two histograms plotted on top of each other. Had the responses been rounded to the nearest whole kilogram, the histogram would have taken a regular, &quot;single&quot; normal-curve shape.\" id=\"return-footnote-767-1\" href=\"#footnote-767-1\" aria-label=\"Footnote 1\"><sup class=\"footnote\">[1]<\/sup><\/a>) but also, say, test marks, or points scored per hockey game, or text messages sent per day, etc. are similar. 
There will be an average, and a continuous decrease in frequency the further one gets from that average.<\/p>\n<p>&nbsp;<\/p>\n<p><em>Fig. 5.2 Weight of Respondents (CCHS 2015\/2016)<\/em><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/08\/normal-curve-weight-cchs.png\" alt=\"\" width=\"462\" height=\"370\" class=\"alignnone wp-image-1680 size-full\" srcset=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/08\/normal-curve-weight-cchs.png 462w, https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/08\/normal-curve-weight-cchs-300x240.png 300w, https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/08\/normal-curve-weight-cchs-65x52.png 65w, https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/08\/normal-curve-weight-cchs-225x180.png 225w, https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/08\/normal-curve-weight-cchs-350x280.png 350w\" sizes=\"auto, (max-width: 462px) 100vw, 462px\" \/><\/p>\n<p>&nbsp;<\/p>\n<p><em>As fascinating as all this is<\/em>, you might be thinking now, <em>why do we care about it?<\/em> <em>It&#8217;s just one type of distribution among many.<\/em><\/p>\n<p>&nbsp;<\/p>\n<p>True, but as I already mentioned, the normal distribution is special, and not just because many variables&#8217; histograms tend to form an approximately normal curve. To understand why, we need to start exploring the normal distribution as a <em>theoretical<\/em> concept (or, to borrow from Max Weber, as an <em>ideal type<\/em>).<\/p>\n<p>&nbsp;<\/p>\n<hr class=\"before-footnotes clear\" \/><div class=\"footnotes\"><ol><li id=\"footnote-767-1\">The reason you observe the \"double\" distribution -- one shorter (darker) while the other taller (lighter) -- is due to the self-reporting of weight. 
Most people tend to report their weight in whole numbers, and here some have done so, stating their weight as 65 kg or 85 kg, etc.; these are the tall bars. Others, however, may have reported it with grams and\/or in pounds (which when converted to kilograms would produce a non-whole number weight), thus resulting in weights such as 65.35 kg or 85.75 kg, etc., leading to the short bars and to the histogram appearing like two histograms plotted on top of each other. Had the responses been rounded to the nearest whole kilogram, the histogram would have taken a regular, \"single\" normal-curve shape. <a href=\"#return-footnote-767-1\" class=\"return-footnote\" aria-label=\"Return to footnote 1\">&crarr;<\/a><\/li><\/ol><\/div>","protected":false},"author":533,"menu_order":1,"template":"","meta":{"pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-767","chapter","type-chapter","status-publish","hentry"],"part":28,"_links":{"self":[{"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/pressbooks\/v2\/chapters\/767","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/wp\/v2\/users\/533"}],"version-history":[{"count":11,"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/pressbooks\/v2\/chapters\/767\/revisions"}],"predecessor-version":[{"id":1699,"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/pressbooks\/v2\/chapters\/767\/revisions\/1699"}],"part":[{"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/pressbooks\/v2\/parts\/28"}],"metadata":[{"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/pressbooks\/v2\/chapters\/
767\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/wp\/v2\/media?parent=767"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/pressbooks\/v2\/chapter-type?post=767"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/wp\/v2\/contributor?post=767"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/wp\/v2\/license?post=767"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}