{"id":2106,"date":"2019-11-02T19:43:39","date_gmt":"2019-11-02T23:43:39","guid":{"rendered":"https:\/\/pressbooks.bccampus.ca\/simplestats\/?post_type=chapter&#038;p=2106"},"modified":"2019-11-02T19:47:50","modified_gmt":"2019-11-02T23:47:50","slug":"9-2-the-f-test","status":"publish","type":"chapter","link":"https:\/\/pressbooks.bccampus.ca\/simplestats\/chapter\/9-2-the-f-test\/","title":{"raw":"9.2 Between a Discrete and a Continuous Variable: The F-test","rendered":"9.2 Between a Discrete and a Continuous Variable: The F-test"},"content":{"raw":"&nbsp;\r\n\r\nWhen the discrete variable of interest has more than two categories, we can no longer use the simple <em>t<\/em>-test presented in the previous section. While we can still use a boxplot chart for visualizing the association between the two variables -- where instead of two boxplots, we will have as many boxplots as there are groups (categories of the discrete variable) -- we no longer have only one difference to test.\r\n\r\n&nbsp;\r\n\r\nTesting multiple means for statistical significance is done through a version of a test called an <em>F<\/em>-test. 
This <em>F<\/em>-test tests whether the means of several groups[footnote]Note that \"several groups\" includes the two-groups case as well: you <em>could<\/em> test the significance of a difference between the means of two groups with an <em>F<\/em>-test too (it will just provide less information).[\/footnote] are all equal (versus at least one of them not being the same as the rest) through an analysis of variance (aka ANOVA).\r\n\r\n&nbsp;\r\n\r\nAt this point you might feel like a treatment of the topic of the kind I offered about the <em>t<\/em>-test above would be a tad too much, and you would be correct: providing the full-on technical details and the formula of the <em>F<\/em>-test is beyond the scope of this book.\r\n\r\n&nbsp;\r\n\r\nBriefly, the ANOVA <em>F<\/em>-test calculates a ratio of variances (between groups to within groups, each obtained from a sum of squares divided by its degrees of freedom): the larger the ratio, the more evidence there is against the null hypothesis, and vice versa. The <em>F<\/em>-test statistic follows an <em>F<\/em>-distribution (not discussed here), which provides the <em>F<\/em>-value with its <em>p<\/em>-value, which is then compared to the\u00a0<em>\u03b1<\/em>-level and interpreted in the usual way. Example 9.2 illustrates.\r\n\r\n&nbsp;\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\"><em>Example 9.2 Education Differences in Average Income, NHS 2011<\/em><\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\n&nbsp;\r\n\r\nPresumably, college is worth it. You delay your full entry into the labour force and instead invest in your education, with the hope that you will then be able to have a better -- and <em>better<\/em>-<em>paying<\/em> job.\r\n\r\n&nbsp;\r\n\r\nLet's examine this question then -- do higher educational degrees translate into higher average income? -- using a random sample of about 3 percent of the <em>NHS 2011<\/em> data.
The variable <em>income<\/em> is the same one I used on previous occasions (i.e.,\u00a0<em>total income<\/em> in <em>NHS 2011<\/em>). The groups to compare are the categories of a variable called (highest) <em>degree<\/em>. The variable <em>degree<\/em> is a recoded version of the <em>NHS 2011<\/em>'s <em>highest certificate, diploma or degree<\/em>. I recoded the original variable's thirteen categories into <em>degree<\/em>'s six: 1) no high school, 2) high school, 3) certificate or diploma below Bachelor's, 4) Bachelor's, 5) Master's[footnote]This category includes\u00a0certificates above Bachelor's, and medical, dentistry, and veterinary degrees.[\/footnote], and 6) PhD.\r\n\r\n&nbsp;\r\n\r\nA brief descriptive investigation of the data reveals that the average income reported by the six education groups <em>looks<\/em> different: \\$19,433 for respondents without a high school degree, \\$30,455 for respondents with a high school degree, \\$41,971 for respondents with more than a high school degree but less than a Bachelor's degree, \\$60,360 for respondents with a Bachelor's degree, \\$71,593 for respondents with a Master's degree, and \\$93,924 for respondents with a PhD. This potential positive association (more education, more income) is also reflected in the boxplots in Figure 9.1.
While there are outliers with extremely high income in all groups (the most extreme were even truncated at the top), the median and the maximum income excluding outliers increase from left to right as the highest degree increases.\r\n\r\n&nbsp;\r\n\r\n<em>Figure 9.1 Average Income by Highest Degree, NHS 2011<\/em>\r\n\r\n<img src=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/04\/boxplot-degree-income-nhs.png\" alt=\"\" width=\"462\" height=\"410\" class=\"alignnone wp-image-1213 size-full\" \/>\r\n\r\n&nbsp;\r\n\r\nAre these differences statistically significant? In other words, are the differences observed in the sample a result of regular sampling variation, or reflective of differences in the population?\r\n<ul>\r\n \t<li>H<sub>0<\/sub>: The average income of all six education groups is the same.<\/li>\r\n \t<li>H<sub>a<\/sub>: The average income of at least one education group differs from that of the others.<\/li>\r\n<\/ul>\r\nSPSS reports a larger between-groups than within-groups variance;\u00a0<strong><em>F<\/em>=413.535 with <em>p<\/em>&lt;0.001.
With the probability of observing differences this large between the groups in the sample -- had there been no difference in the population (i.e., under the null hypothesis) -- being less than 1 in a thousand, we reject the null hypothesis and conclude that the differences in average income of groups with different highest degrees are statistically significant.\u00a0<\/strong>\r\n\r\n&nbsp;\r\n\r\n<\/div>\r\n<\/div>\r\n&nbsp;\r\n\r\nBefore we turn to testing associations between two discrete variables, SPSS Tip 9.2 below lists the steps of the ANOVA <em>F<\/em>-test procedure.\r\n\r\n&nbsp;\r\n<div class=\"textbox textbox--key-takeaways\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\"><em>SPSS Tip 9.2 The F-test<\/em><\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<ul>\r\n \t<li>From the <em>Main Menu<\/em>, select <em>Analyze<\/em>, and from the pull-down menu, click on <em>Compare Means<\/em> and then <em>One-Way ANOVA<\/em>;<\/li>\r\n \t<li>Select your continuous variable from the list of variables on the left and, using the top arrow, move it to the <em>Dependent List<\/em> empty space on the right;<\/li>\r\n \t<li>Select your discrete variable from the list of variables on the left and, using the bottom arrow, move it to the <em>Factor<\/em> empty space on the right; click OK.<\/li>\r\n \t<li>The <em>Output<\/em> window will present a <em>Oneway ANOVA<\/em> table, listing a breakdown of variances (by sums of squares), and most importantly, the resulting <em>F<\/em>-statistic and <em>p<\/em>-value.<\/li>\r\n<\/ul>\r\n<\/div>\r\n<\/div>\r\n&nbsp;","rendered":"<p>&nbsp;<\/p>\n<p>When the discrete variable of interest has more than two categories, we can no longer use the simple <em>t<\/em>-test presented in the previous section.
While we can still use a boxplot chart for visualizing the association between the two variables &#8212; where instead of two boxplots, we will have as many boxplots as there are groups (categories of the discrete variable) &#8212; we no longer have only one difference to test.<\/p>\n<p>&nbsp;<\/p>\n<p>Testing multiple means for statistical significance is done through a version of a test called an <em>F<\/em>-test. This <em>F<\/em>-test tests whether the means of several groups<a class=\"footnote\" title=\"Note that &quot;several groups&quot; includes the two-groups case as well: you could test the significance of a difference between the means of two groups with an F-test too (it will just provide less information).\" id=\"return-footnote-2106-1\" href=\"#footnote-2106-1\" aria-label=\"Footnote 1\"><sup class=\"footnote\">[1]<\/sup><\/a> are all equal (versus at least one of them not being the same as the rest) through an analysis of variance (aka ANOVA).<\/p>\n<p>&nbsp;<\/p>\n<p>At this point you might feel like a treatment of the topic of the kind I offered about the <em>t<\/em>-test above would be a tad too much, and you would be correct: providing the full-on technical details and the formula of the <em>F<\/em>-test is beyond the scope of this book.<\/p>\n<p>&nbsp;<\/p>\n<p>Briefly, the ANOVA <em>F<\/em>-test calculates a ratio of variances (between groups to within groups, each obtained from a sum of squares divided by its degrees of freedom): the larger the ratio, the more evidence there is against the null hypothesis, and vice versa. The <em>F<\/em>-test statistic follows an <em>F<\/em>-distribution (not discussed here), which provides the <em>F<\/em>-value with its <em>p<\/em>-value, which is then compared to the\u00a0<em>\u03b1<\/em>-level and interpreted in the usual way.
Example 9.2 illustrates.<\/p>\n<p>&nbsp;<\/p>\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\"><em>Example 9.2 Education Differences in Average Income, NHS 2011<\/em><\/p>\n<\/header>\n<div class=\"textbox__content\">\n<p>&nbsp;<\/p>\n<p>Presumably, college is worth it. You delay your full entry into the labour force and instead invest in your education, with the hope that you will then be able to have a better &#8212; and <em>better<\/em>-<em>paying<\/em> job.<\/p>\n<p>&nbsp;<\/p>\n<p>Let&#8217;s examine this question then &#8212; do higher educational degrees translate into higher average income? &#8212; using a random sample of about 3 percent of the <em>NHS 2011<\/em> data. The variable <em>income<\/em> is the same one I used on previous occasions (i.e.,\u00a0<em>total income<\/em> in <em>NHS 2011<\/em>). The groups to compare are the categories of a variable called (highest) <em>degree<\/em>. The variable <em>degree<\/em> is a recoded version of the <em>NHS 2011<\/em>&#8217;s <em>highest certificate, diploma or degree<\/em>.
I recoded the original variable&#8217;s thirteen categories into <em>degree<\/em>&#8217;s six: 1) no high school, 2) high school, 3) certificate or diploma below Bachelor&#8217;s, 4) Bachelor&#8217;s, 5) Master&#8217;s<a class=\"footnote\" title=\"This category includes\u00a0certificates above Bachelor's, and medical, dentistry, and veterinary degrees.\" id=\"return-footnote-2106-2\" href=\"#footnote-2106-2\" aria-label=\"Footnote 2\"><sup class=\"footnote\">[2]<\/sup><\/a>, and 6) PhD.<\/p>\n<p>&nbsp;<\/p>\n<p>A brief descriptive investigation of the data reveals that the average income reported by the six education groups <em>looks<\/em> different: \\$19,433 for respondents without a high school degree, \\$30,455 for respondents with a high school degree, \\$41,971 for respondents with more than a high school degree but less than a Bachelor&#8217;s degree, \\$60,360 for respondents with a Bachelor&#8217;s degree, \\$71,593 for respondents with a Master&#8217;s degree, and \\$93,924 for respondents with a PhD. This potential positive association (more education, more income) is also reflected in the boxplots in Figure 9.1.
While there are outliers with extremely high income in all groups (the most extreme were even truncated at the top), the median and the maximum income excluding outliers increase from left to right as the highest degree increases.<\/p>\n<p>&nbsp;<\/p>\n<p><em>Figure 9.1 Average Income by Highest Degree, NHS 2011<\/em><\/p>\n<p><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/04\/boxplot-degree-income-nhs.png\" alt=\"\" width=\"462\" height=\"410\" class=\"alignnone wp-image-1213 size-full\" srcset=\"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/04\/boxplot-degree-income-nhs.png 462w, https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/04\/boxplot-degree-income-nhs-300x266.png 300w, https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/04\/boxplot-degree-income-nhs-65x58.png 65w, https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/04\/boxplot-degree-income-nhs-225x200.png 225w, https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-content\/uploads\/sites\/564\/2019\/04\/boxplot-degree-income-nhs-350x311.png 350w\" sizes=\"auto, (max-width: 462px) 100vw, 462px\" \/><\/p>\n<p>&nbsp;<\/p>\n<p>Are these differences statistically significant? In other words, are the differences observed in the sample a result of regular sampling variation, or reflective of differences in the population?<\/p>\n<ul>\n<li>H<sub>0<\/sub>: The average income of all six education groups is the same.<\/li>\n<li>H<sub>a<\/sub>: The average income of at least one education group differs from that of the others.<\/li>\n<\/ul>\n<p>SPSS reports a larger between-groups than within-groups variance;\u00a0<strong><em>F<\/em>=413.535 with <em>p<\/em>&lt;0.001.
With the probability of observing differences this large between the groups in the sample &#8212; had there been no difference in the population (i.e., under the null hypothesis) &#8212; being less than 1 in a thousand, we reject the null hypothesis and conclude that the differences in average income of groups with different highest degrees are statistically significant.\u00a0<\/strong><\/p>\n<p>&nbsp;<\/p>\n<\/div>\n<\/div>\n<p>&nbsp;<\/p>\n<p>Before we turn to testing associations between two discrete variables, SPSS Tip 9.2 below lists the steps of the ANOVA <em>F<\/em>-test procedure.<\/p>\n<p>&nbsp;<\/p>\n<div class=\"textbox textbox--key-takeaways\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\"><em>SPSS Tip 9.2 The F-test<\/em><\/p>\n<\/header>\n<div class=\"textbox__content\">\n<ul>\n<li>From the <em>Main Menu<\/em>, select <em>Analyze<\/em>, and from the pull-down menu, click on <em>Compare Means<\/em> and then <em>One-Way ANOVA<\/em>;<\/li>\n<li>Select your continuous variable from the list of variables on the left and, using the top arrow, move it to the <em>Dependent List<\/em> empty space on the right;<\/li>\n<li>Select your discrete variable from the list of variables on the left and, using the bottom arrow, move it to the <em>Factor<\/em> empty space on the right; click OK.<\/li>\n<li>The <em>Output<\/em> window will present a <em>Oneway ANOVA<\/em> table, listing a breakdown of variances (by sums of squares), and most importantly, the resulting <em>F<\/em>-statistic and <em>p<\/em>-value.<\/li>\n<\/ul>\n<\/div>\n<\/div>\n<p>&nbsp;<\/p>\n<hr class=\"before-footnotes clear\" \/><div class=\"footnotes\"><ol><li id=\"footnote-2106-1\">Note that \"several groups\" includes the two-groups case as well: you <em>could<\/em> test the significance of a difference between the means of two groups with an <em>F<\/em>-test too (it will just provide less information).
<a href=\"#return-footnote-2106-1\" class=\"return-footnote\" aria-label=\"Return to footnote 1\">&crarr;<\/a><\/li><li id=\"footnote-2106-2\">This category includes\u00a0certificates above Bachelor's, and medical, dentistry, and veterinary degrees. <a href=\"#return-footnote-2106-2\" class=\"return-footnote\" aria-label=\"Return to footnote 2\">&crarr;<\/a><\/li><\/ol><\/div>","protected":false},"author":533,"menu_order":2,"template":"","meta":{"pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-2106","chapter","type-chapter","status-publish","hentry"],"part":120,"_links":{"self":[{"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/pressbooks\/v2\/chapters\/2106","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/wp\/v2\/users\/533"}],"version-history":[{"count":2,"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/pressbooks\/v2\/chapters\/2106\/revisions"}],"predecessor-version":[{"id":2109,"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/pressbooks\/v2\/chapters\/2106\/revisions\/2109"}],"part":[{"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/pressbooks\/v2\/parts\/120"}],"metadata":[{"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/pressbooks\/v2\/chapters\/2106\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/wp\/v2\/media?parent=2106"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/pressbooks\/v2\/chapter-type?post=2106"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/wp\/v2\/contributor?post=2106"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/wp\/v2\/license?post=2106"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}