{"id":322,"date":"2024-03-01T14:48:35","date_gmt":"2024-03-01T19:48:35","guid":{"rendered":"https:\/\/pressbooks.bccampus.ca\/1130sandbox\/?post_type=chapter&#038;p=322"},"modified":"2024-07-26T22:19:43","modified_gmt":"2024-07-27T02:19:43","slug":"chi-squared-test-of-independence","status":"publish","type":"chapter","link":"https:\/\/pressbooks.bccampus.ca\/1130sandbox\/chapter\/chi-squared-test-of-independence\/","title":{"raw":"Steps for Chi-Squared Test of Independence","rendered":"Steps for Chi-Squared Test of Independence"},"content":{"raw":"<div class=\"textbox textbox--learning-objectives\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">Learning Objectives<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\nDefine the steps and formulas required to perform a Chi-Squared Test of Independence\r\n\r\n<\/div>\r\n<\/div>\r\nLet us now present the steps and formulas we will need to perform a Chi-Squared Test. See the section below 'Another Explanation for the [latex]\\chi^2[\/latex] test' to better understand why we use these formulas and what they mean.\r\n<h2>Null and Alternate Hypotheses<\/h2>\r\nWe are, again, performing a hypothesis test, so we need to define our null and alternate hypotheses:\r\n<p style=\"padding-left: 40px\">H<sub>0<\/sub>: The two categorical variables are independent ([latex]\\chi^2 = 0[\/latex])<\/p>\r\n<p style=\"padding-left: 40px\">H<sub>A<\/sub>: The two categorical variables are dependent ([latex]\\chi^2 \\neq 0[\/latex])<\/p>\r\n\r\n<h2>Expected Value Formula<\/h2>\r\nWe calculate an expected value for each category for both populations\/groups. 
This is the frequency we would expect if the two categorical variables are independent.\r\n\r\n\\[\\text{Expected Value}= \\frac{\\text{Row Total}\\times \\text{Column Total}}{\\text{Grand Total}} \\]\r\n\r\nWe read down the table to determine the column total and read across to determine the row total (the total number of people\/events in that category). We multiply these two totals and then divide by the total number of people\/events overall ('Grand Total').\r\n<h2>\u03c7<sup>2<\/sup><sub>test<\/sub> Formula<\/h2>\r\nWe now compare each actual (observed) value with the expected value for that category, squaring each difference, dividing by the expected value, and summing over all categories:\r\n\r\n\\[ \\chi^2_{test} = \\sum \\frac{(obs - exp)^2}{exp} \\]\r\n\r\n[latex]\\chi^2[\/latex] is, essentially, a weighted sum of the squared differences between the actual and expected frequencies. If it is much larger than zero, then the actual values are very different from the values we would expect if the two categorical variables were independent.\r\n<h2>Degrees of Freedom and p-value Formula<\/h2>\r\nOnce we have determined the test statistic ([latex]\\chi^2_{test}[\/latex]), we next determine the associated p-value. 
Before we do that, we need to calculate the degrees of freedom for the problem:\r\n\r\n\\[ \\text{Degrees of Freedom} = df = (\\text{# rows} - 1)\\times (\\text{# columns} - 1) \\]\r\n\r\nWe now plug the test statistic and degrees of freedom into the <a href=\"https:\/\/support.microsoft.com\/en-us\/office\/chisq-dist-rt-function-dc4832e8-ed2b-49ae-8d7c-b28d5804c0f2#:~:text=RT%20function,-Excel%20for%20Microsoft&amp;text=Returns%20the%20right%2Dtailed%20probability,associated%20with%20a%20%CF%872%20test.\">CHISQ.DIST.RT(\u03c7<sup>2<\/sup><sub>test<\/sub>, df)<\/a> Excel function:\r\n\r\n\\[\\text{p-value} = \\text{CHISQ.DIST.RT}(\\chi^2_{test}, df) \\]\r\n\r\nIf the p-value returned is less than the level of significance, we conclude that the deviations between the observed and the expected counts are too large to be attributed to chance alone (there is evidence of a dependence between the categorical variables).\r\n<h2>Decision<\/h2>\r\nJust like all the other hypothesis tests we have performed, if the p-value returned is less than the level of significance, then we reject H<sub>0<\/sub>. If not, we fail to reject H<sub>0<\/sub>. I.e.:\r\n<ul>\r\n \t<li>if p-value &lt; L.O.S. (Level of Significance): Reject H<sub>0<\/sub><\/li>\r\n \t<li>if p-value &ge; L.O.S. (Level of Significance): Do not reject H<sub>0<\/sub><\/li>\r\n<\/ul>\r\n<h2>Conclusion<\/h2>\r\nAgain, like all of the other hypothesis tests in previous sections, if we reject H<sub>0<\/sub>, there is sufficient evidence to conclude whatever the alternate hypothesis claims. In this case, if we reject H<sub>0<\/sub>, there is sufficient evidence to conclude that the two categorical variables are dependent. 
I.e.:\r\n<ul>\r\n \t<li>Reject H<sub>0<\/sub>: There is sufficient evidence to conclude that the two categorical variables are dependent.<\/li>\r\n \t<li>Do not reject H<sub>0<\/sub>: There is not sufficient evidence to conclude that the two categorical variables are dependent.<\/li>\r\n<\/ul>\r\n<h1>Another Explanation for the \u03c7<sup>2<\/sup> Test<\/h1>\r\nAnother explanation for the [latex]\\chi^2[\/latex] test is:\r\n<ul>\r\n \t<li>We calculate the frequencies for each category that we would expect to get if there were no difference between the proportions for each group.<\/li>\r\n \t<li>I.e.: The expected values are calculated by assuming that the categories are independent of which population they belong to.<\/li>\r\n \t<li>We then calculate a weighted sum of the squared differences between the expected frequencies we calculated and the actual frequencies.<\/li>\r\n \t<li>This quantity is called [latex]\\chi^2_{test}[\/latex].<\/li>\r\n \t<li>If the value of [latex]\\chi^2_{test}[\/latex] is large, this means that the actual values are very different from the values we would expect to get if the categories were independent of the population they belong to.<\/li>\r\n \t<li>If that's the case, we conclude that the categories are not independent of the populations they belong to.<\/li>\r\n \t<li>Another conclusion we can draw is that the proportions are different between populations.<\/li>\r\n<\/ul>","rendered":"<div class=\"textbox textbox--learning-objectives\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">Learning Objectives<\/p>\n<\/header>\n<div class=\"textbox__content\">\n<p>Define the steps and formulas required to perform a Chi-Squared Test of Independence<\/p>\n<\/div>\n<\/div>\n<p>Let us now present the steps and formulas we will need to perform a Chi-Squared Test. 
See the section below &#8216;Another Explanation for the [latex]\\chi^2[\/latex] test&#8217; to better understand why we use these formulas and what they mean.<\/p>\n<h2>Null and Alternate Hypotheses<\/h2>\n<p>We are, again, performing a hypothesis test, so we need to define our null and alternate hypotheses:<\/p>\n<p style=\"padding-left: 40px\">H<sub>0<\/sub>: The two categorical variables are independent ([latex]\\chi^2 = 0[\/latex])<\/p>\n<p style=\"padding-left: 40px\">H<sub>A<\/sub>: The two categorical variables are dependent ([latex]\\chi^2 \\neq 0[\/latex])<\/p>\n<h2>Expected Value Formula<\/h2>\n<p>We calculate an expected value for each category for both populations\/groups. This is the frequency we would expect if the two categorical variables are independent.<\/p>\n<p>\\[\\text{Expected Value}= \\frac{\\text{Row Total}\\times \\text{Column Total}}{\\text{Grand Total}} \\]<\/p>\n<p>We read down the table to determine the column total and read across to determine the row total (the total number of people\/events in that category). We multiply these two totals and then divide by the total number of people\/events overall (&#8216;Grand Total&#8217;).<\/p>\n<h2>\u03c7<sup>2<\/sup><sub>test<\/sub> Formula<\/h2>\n<p>We now compare each actual (observed) value with the expected value for that category, squaring each difference, dividing by the expected value, and summing over all categories:<\/p>\n<p>\\[ \\chi^2_{test} = \\sum \\frac{(obs - exp)^2}{exp} \\]<\/p>\n<p>[latex]\\chi^2[\/latex] is, essentially, a weighted sum of the squared differences between the actual and expected frequencies. If it is much larger than zero, then the actual values are very different from the values we would expect if the two categorical variables were independent.<\/p>\n<h2>Degrees of Freedom and p-value Formula<\/h2>\n<p>Once we have determined the test statistic ([latex]\\chi^2_{test}[\/latex]), we next determine the associated p-value. 
Before we do that, we need to calculate the degrees of freedom for the problem:<\/p>\n<p>\\[ \\text{Degrees of Freedom} = df = (\\text{# rows} - 1)\\times (\\text{# columns} - 1) \\]<\/p>\n<p>We now plug the test statistic and degrees of freedom into the <a href=\"https:\/\/support.microsoft.com\/en-us\/office\/chisq-dist-rt-function-dc4832e8-ed2b-49ae-8d7c-b28d5804c0f2#:~:text=RT%20function,-Excel%20for%20Microsoft&amp;text=Returns%20the%20right%2Dtailed%20probability,associated%20with%20a%20%CF%872%20test.\">CHISQ.DIST.RT(\u03c7<sup>2<\/sup><sub>test<\/sub>, df)<\/a> Excel function:<\/p>\n<p>\\[\\text{p-value} = \\text{CHISQ.DIST.RT}(\\chi^2_{test}, df) \\]<\/p>\n<p>If the p-value returned is less than the level of significance, we conclude that the deviations between the observed and the expected counts are too large to be attributed to chance alone (there is evidence of a dependence between the categorical variables).<\/p>\n<h2>Decision<\/h2>\n<p>Just like all the other hypothesis tests we have performed, if the p-value returned is less than the level of significance, then we reject H<sub>0<\/sub>. If not, we fail to reject H<sub>0<\/sub>. I.e.:<\/p>\n<ul>\n<li>if p-value &lt; L.O.S. (Level of Significance): Reject H<sub>0<\/sub><\/li>\n<li>if p-value &ge; L.O.S. (Level of Significance): Do not reject H<sub>0<\/sub><\/li>\n<\/ul>\n<h2>Conclusion<\/h2>\n<p>Again, like all of the other hypothesis tests in previous sections, if we reject H<sub>0<\/sub>, there is sufficient evidence to conclude whatever the alternate hypothesis claims. In this case, if we reject H<sub>0<\/sub>, there is sufficient evidence to conclude that the two categorical variables are dependent. 
I.e.:<\/p>\n<ul>\n<li>Reject H<sub>0<\/sub>: There is sufficient evidence to conclude that the two categorical variables are dependent.<\/li>\n<li>Do not reject H<sub>0<\/sub>: There is not sufficient evidence to conclude that the two categorical variables are dependent.<\/li>\n<\/ul>\n<h1>Another Explanation for the \u03c7<sup>2<\/sup> Test<\/h1>\n<p>Another explanation for the [latex]\\chi^2[\/latex] test is:<\/p>\n<ul>\n<li>We calculate the frequencies for each category that we would expect to get if there were no difference between the proportions for each group.<\/li>\n<li>I.e.: The expected values are calculated by assuming that the categories are independent of which population they belong to.<\/li>\n<li>We then calculate a weighted sum of the squared differences between the expected frequencies we calculated and the actual frequencies.<\/li>\n<li>This quantity is called [latex]\\chi^2_{test}[\/latex].<\/li>\n<li>If the value of [latex]\\chi^2_{test}[\/latex] is large, this means that the actual values are very different from the values we would expect to get if the categories were independent of the population they belong to.<\/li>\n<li>If that&#8217;s the case, we conclude that the categories are not independent of the populations they belong to.<\/li>\n<li>Another conclusion we can draw is that the proportions are different between 
populations.<\/li>\n<\/ul>\n","protected":false},"author":883,"menu_order":1,"template":"","meta":{"pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-322","chapter","type-chapter","status-publish","hentry"],"part":2679,"_links":{"self":[{"href":"https:\/\/pressbooks.bccampus.ca\/1130sandbox\/wp-json\/pressbooks\/v2\/chapters\/322","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/pressbooks.bccampus.ca\/1130sandbox\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/pressbooks.bccampus.ca\/1130sandbox\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/1130sandbox\/wp-json\/wp\/v2\/users\/883"}],"version-history":[{"count":25,"href":"https:\/\/pressbooks.bccampus.ca\/1130sandbox\/wp-json\/pressbooks\/v2\/chapters\/322\/revisions"}],"predecessor-version":[{"id":2918,"href":"https:\/\/pressbooks.bccampus.ca\/1130sandbox\/wp-json\/pressbooks\/v2\/chapters\/322\/revisions\/2918"}],"part":[{"href":"https:\/\/pressbooks.bccampus.ca\/1130sandbox\/wp-json\/pressbooks\/v2\/parts\/2679"}],"metadata":[{"href":"https:\/\/pressbooks.bccampus.ca\/1130sandbox\/wp-json\/pressbooks\/v2\/chapters\/322\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/pressbooks.bccampus.ca\/1130sandbox\/wp-json\/wp\/v2\/media?parent=322"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/1130sandbox\/wp-json\/pressbooks\/v2\/chapter-type?post=322"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/1130sandbox\/wp-json\/wp\/v2\/contributor?post=322"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/1130sandbox\/wp-json\/wp\/v2\/license?post=322"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}