{"id":2690,"date":"2024-07-24T11:05:04","date_gmt":"2024-07-24T15:05:04","guid":{"rendered":"https:\/\/pressbooks.bccampus.ca\/1130sandbox\/?post_type=part&#038;p=2690"},"modified":"2024-07-25T23:05:52","modified_gmt":"2024-07-26T03:05:52","slug":"hypothesis-test-to-compare-two-population-means","status":"publish","type":"part","link":"https:\/\/pressbooks.bccampus.ca\/1130sandbox\/part\/hypothesis-test-to-compare-two-population-means\/","title":{"raw":"Hypothesis Tests to Compare Two Population Means","rendered":"Hypothesis Tests to Compare Two Population Means"},"content":{"raw":"In this chapter, we will learn how to test if there are differences between population means. We are really asking: are there differences between the two groups? For example:\r\n<ul>\r\n \t<li>Is there a difference in sales between two franchise locations?<\/li>\r\n \t<li>Does a new medication help reduce patients' cholesterol levels?<\/li>\r\n \t<li>Did a change in production increase your production levels?<\/li>\r\n<\/ul>\r\n<h2>Three Difference Tests<\/h2>\r\nWe perform different types of difference tests, depending on whether the samples are matched or independent and, for independent samples, on how similar their variances are:\r\n<table class=\"lines aligncenter\" style=\"border-collapse: collapse;width: 80%;height: 36px\" border=\"0\">\r\n<tbody>\r\n<tr style=\"height: 18px\">\r\n<th class=\"border\" style=\"height: 18px;vertical-align: middle;width: 33.2974%\">Matched-Pair Differences t-Test<\/th>\r\n<th style=\"vertical-align: middle;width: 33.2974%\" colspan=\"2\">Two Independent Samples Test<\/th>\r\n<\/tr>\r\n<tr style=\"height: 18px\">\r\n<td style=\"width: 33.2974%;height: 18px;vertical-align: middle\">Paired t-Test<\/td>\r\n<td style=\"width: 33.2974%;height: 18px;vertical-align: middle\">Pooled Variance t-Test<\/td>\r\n<td style=\"width: 33.3333%;height: 18px;vertical-align: middle\">Unpooled Variance t-Test<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nWe will step through when to use each test and what the differences are between them below.\r\n<h2>How to Perform the 
Tests?<\/h2>\r\n<ul>\r\n \t<li>We will step through examples of each type of test in the sections that follow.<\/li>\r\n \t<li>We will use Excel's <a href=\"https:\/\/support.microsoft.com\/en-us\/office\/use-the-analysis-toolpak-to-perform-complex-data-analysis-6c67ccf0-f4a9-487c-8dec-bdb5a2cefab6\">Data Analysis Toolpak<\/a> to do all of the required calculations.<\/li>\r\n \t<li>The formulas presented in the sections below are just for your reference.<\/li>\r\n<\/ul>\r\n<h1>Matched-Pairs<\/h1>\r\nWe use the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Paired_difference_test#:~:text=A%20paired%20difference%20test%2C%20better,whether%20their%20population%20means%20differ.\">matched-pair differences t-test<\/a> when we have two related populations:\r\n<ul>\r\n \t<li>Before and after studies on the same person or object.<\/li>\r\n \t<li>Twins, siblings or spouses are matched and one member of each pair is placed in each group.<\/li>\r\n \t<li>People who have been paired together from each group based on certain attributes. 
This is also called a type of <a href=\"https:\/\/en.wikipedia.org\/wiki\/Blocking_(statistics)\">blocking<\/a>.<\/li>\r\n<\/ul>\r\n<h2>Why is Matched-Pair Best?<\/h2>\r\nThis type of test is preferable whenever there is a 'match' between groups:\r\n<ul>\r\n \t<li>Matching reduces the variability between the two groups.<\/li>\r\n \t<li>This, in effect, helps to better isolate the variability due to the effect you want to study.<\/li>\r\n<\/ul>\r\n<h2>Example of a Matched Pair Test<\/h2>\r\n<ul>\r\n \t<li>You want to test whether a new medication has a helpful effect.<\/li>\r\n \t<li>You pair people by similar age, health status, genetic makeup, etc.<\/li>\r\n \t<li>One group receives the new medication.<\/li>\r\n \t<li>The other group receives either a placebo or another type of medication (depending on what you are trying to study and the health concern).<\/li>\r\n \t<li>This reduces the variability between groups due to age, health, etc.<\/li>\r\n \t<li>This, in effect, isolates the effect of the new medication.<\/li>\r\n<\/ul>\r\n<h1>Pooled Variance t-Test<\/h1>\r\nWe can't always have matched pairs in our studies. Sometimes, there are no similar individuals we can pair together. In this case, we consider the two samples independent. Note: We should collect random, unbiased samples that are independent of one another.\r\n\r\nWhen the two independent samples have similar variability, we use a Pooled Variance t-Test.\u00a0The variability is 'similar' enough if the following holds (the two conditions are equivalent):\r\n<ul>\r\n \t<li>The larger sample standard deviation is less than double the smaller one.<\/li>\r\n \t<li>The larger sample variance is less than four times the smaller one.<\/li>\r\n<\/ul>\r\n<h2>Pooling the Variance<\/h2>\r\nWe call this test 'pooled' because we assume that the two samples came from populations with the same variance. 
We then 'pool' the two samples' variances together to get a best guess for the population variance:\r\n\r\n\\[ s_{p}^2 = \\frac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2} \\]\r\n\r\nWe did something similar when calculating a 'pooled' population proportion estimate <a href=\"https:\/\/pressbooks.bccampus.ca\/1130sandbox\/chapter\/steps-for-testing-for-differences-in-proportions\/\">in the previous chapter<\/a>. In this case, Excel's Data Analysis Toolpak will return this value for us, so this formula is just for your reference. You will not need to use it.\r\n<h2>T-test Statistic<\/h2>\r\nWe can calculate the test statistic, [latex]t_{test}[\/latex], with the following formula:\r\n\r\n\\[ t_{test} = \\frac{\\bar{x}_1-\\bar{x}_2}{\\sqrt{s_{p}^2 \\left( \\frac{1}{n_1} + \\frac{1}{n_2} \\right)}} \\]\r\n\r\nAgain, we will have Excel calculate this test statistic. The formula is also just for your reference.\r\n<h2>P-Value<\/h2>\r\nIn order to calculate the p-value, we first need the degrees of freedom ([latex]df[\/latex]):\r\n\r\n\\[ df = n_1 + n_2 -2 \\]\r\n\r\nWe use these degrees of freedom and the test statistic in Excel's <a href=\"https:\/\/support.microsoft.com\/en-us\/office\/t-dist-function-4329459f-ae91-48c2-bba8-1ead1c6c21b2\">T.DIST<\/a>, <a href=\"https:\/\/support.microsoft.com\/en-us\/office\/t-dist-rt-function-20a30020-86f9-4b35-af1f-7ef6ae683eda\">T.DIST.RT<\/a> or <a href=\"https:\/\/support.microsoft.com\/en-us\/office\/t-dist-2t-function-198e9340-e360-4230-bd21-f52f22ff5c28\">T.DIST.2T<\/a> functions. 
See the <a href=\"https:\/\/pressbooks.bccampus.ca\/1130sandbox\/chapter\/hypothesis-testing-for-one-mean-examples\/\">hypothesis testing on one mean section<\/a> for guidance on which function to use in each case.\r\n<h1>Unpooled Variance t-Test<\/h1>\r\nWe use the final test, the unpooled variance t-test, if:\r\n<ul>\r\n \t<li>We want to determine if there is a difference between the two population means.<\/li>\r\n \t<li>The samples are independent of one another.<\/li>\r\n \t<li>The standard deviation of one sample is more than double the other sample's standard deviation.<\/li>\r\n \t<li>Equivalently, the variance of one sample is more than four times the variance of the other sample.<\/li>\r\n<\/ul>\r\nIn this case, we cannot assume the two populations have the same variance and therefore cannot 'pool' the sample variances.\r\n<h2>Test Statistic<\/h2>\r\nThe test statistic still approximately follows a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Student%27s_t-distribution\">t-distribution<\/a>. When performing the hypothesis test, we assume under the null hypothesis that the difference between the population means is zero, i.e. H<sub>0<\/sub>: \u03bc<sub>1<\/sub>\u2212\u03bc<sub>2<\/sub>\u00a0= 0 or, equivalently, H<sub>0<\/sub>: \u03bc<sub>1<\/sub>=\u03bc<sub>2<\/sub>. This simplifies the t<sub>test<\/sub> formula:\r\n\r\n\\[ t_{test} = \\frac{ (\\bar{x}_1- \\bar{x}_2) - (\\mu_1-\\mu_2) }{\\sqrt{\\frac{s_1^2}{n_1}+\\frac{s_2^2}{n_2}}} =\\frac{(\\bar{x}_1-\\bar{x}_2)-(0)}{\\sqrt{\\frac{s_1^2}{n_1}+\\frac{s_2^2}{n_2}}}=\\frac{\\bar{x}_1-\\bar{x}_2}{\\sqrt{\\frac{s_1^2}{n_1}+\\frac{s_2^2}{n_2}}} \\]\r\n<h2>Degrees of Freedom<\/h2>\r\nThe degrees of freedom formula, which is often fairly simple, becomes quite involved in this case. 
We first calculate [latex]C[\/latex]:\r\n\r\n\\[C = \\frac{\\frac{s_1^2}{n_1}}{\\frac{s_1^2}{n_1}+\\frac{s_2^2}{n_2}}\\]\r\n\r\nWe use this value in the degrees of freedom calculation:\r\n\r\n\\[ df = \\frac{(n_1-1)(n_2-1)}{(n_2-1)C^2+(1-C)^2(n_1-1)} \\]\r\n\r\nAgain, Excel's Data Analysis Toolpak will calculate the degrees of freedom, test statistic and p-value for us. See the following sections for examples.\r\n<h1>Required Assumptions<\/h1>\r\nIn all of the above cases, we also need the following to be true:\r\n<ul>\r\n \t<li>The samples are randomly collected and unbiased.<\/li>\r\n \t<li>The sample sizes are large enough, or the populations are known to be normally distributed.<\/li>\r\n<\/ul>","rendered":"<p>In this chapter, we will learn how to test if there are differences between population means. We are really asking: are there differences between the two groups? For example:<\/p>\n<ul>\n<li>Is there a difference in sales between two franchise locations?<\/li>\n<li>Does a new medication help reduce patients&#8217; cholesterol levels?<\/li>\n<li>Did a change in production increase your production levels?<\/li>\n<\/ul>\n<h2>Three Difference Tests<\/h2>\n<p>We perform different types of difference tests, depending on whether the samples are matched or independent and, for independent samples, on how similar their variances are:<\/p>\n<table class=\"lines aligncenter\" style=\"border-collapse: collapse;width: 80%;height: 36px\">\n<tbody>\n<tr style=\"height: 18px\">\n<th class=\"border\" style=\"height: 18px;vertical-align: middle;width: 33.2974%\">Matched-Pair Differences t-Test<\/th>\n<th style=\"vertical-align: middle;width: 33.2974%\" colspan=\"2\">Two Independent Samples Test<\/th>\n<\/tr>\n<tr style=\"height: 18px\">\n<td style=\"width: 33.2974%;height: 18px;vertical-align: middle\">Paired t-Test<\/td>\n<td style=\"width: 33.2974%;height: 18px;vertical-align: middle\">Pooled Variance t-Test<\/td>\n<td style=\"width: 33.3333%;height: 18px;vertical-align: middle\">Unpooled Variance t-Test<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>We will step through 
when to use each test and what the differences are between them below.<\/p>\n<h2>How to Perform the Tests?<\/h2>\n<ul>\n<li>We will step through examples of each type of test in the sections that follow.<\/li>\n<li>We will use Excel&#8217;s <a href=\"https:\/\/support.microsoft.com\/en-us\/office\/use-the-analysis-toolpak-to-perform-complex-data-analysis-6c67ccf0-f4a9-487c-8dec-bdb5a2cefab6\">Data Analysis Toolpak<\/a> to do all of the required calculations.<\/li>\n<li>The formulas presented in the sections below are just for your reference.<\/li>\n<\/ul>\n<h1>Matched-Pairs<\/h1>\n<p>We use the <a href=\"https:\/\/en.wikipedia.org\/wiki\/Paired_difference_test#:~:text=A%20paired%20difference%20test%2C%20better,whether%20their%20population%20means%20differ.\">matched-pair differences t-test<\/a> when we have two related populations:<\/p>\n<ul>\n<li>Before and after studies on the same person or object.<\/li>\n<li>Twins, siblings or spouses are matched and one member of each pair is placed in each group.<\/li>\n<li>People who have been paired together from each group based on certain attributes. 
This is also called a type of <a href=\"https:\/\/en.wikipedia.org\/wiki\/Blocking_(statistics)\">blocking<\/a>.<\/li>\n<\/ul>\n<h2>Why is Matched-Pair Best?<\/h2>\n<p>This type of test is preferable whenever there is a &#8216;match&#8217; between groups:<\/p>\n<ul>\n<li>Matching reduces the variability between the two groups.<\/li>\n<li>This, in effect, helps to better isolate the variability due to the effect you want to study.<\/li>\n<\/ul>\n<h2>Example of a Matched Pair Test<\/h2>\n<ul>\n<li>You want to test whether a new medication has a helpful effect.<\/li>\n<li>You pair people by similar age, health status, genetic makeup, etc.<\/li>\n<li>One group receives the new medication.<\/li>\n<li>The other group receives either a placebo or another type of medication (depending on what you are trying to study and the health concern).<\/li>\n<li>This reduces the variability between groups due to age, health, etc.<\/li>\n<li>This, in effect, isolates the effect of the new medication.<\/li>\n<\/ul>\n<h1>Pooled Variance t-Test<\/h1>\n<p>We can&#8217;t always have matched pairs in our studies. Sometimes, there are no similar individuals we can pair together. In this case, we consider the two samples independent. Note: We should collect random, unbiased samples that are independent of one another.<\/p>\n<p>When the two independent samples have similar variability, we use a Pooled Variance t-Test.\u00a0The variability is &#8216;similar&#8217; enough if the following holds (the two conditions are equivalent):<\/p>\n<ul>\n<li>The larger sample standard deviation is less than double the smaller one.<\/li>\n<li>The larger sample variance is less than four times the smaller one.<\/li>\n<\/ul>\n<h2>Pooling the Variance<\/h2>\n<p>We call this test &#8216;pooled&#8217; because we assume that the two samples came from populations with the same variance. 
We then &#8216;pool&#8217; the two samples&#8217; variances together to get a best guess for the population variance:<\/p>\n<p>\\[ s_{p}^2 = \\frac{(n_1-1)s_1^2+(n_2-1)s_2^2}{n_1+n_2-2} \\]<\/p>\n<p>We did something similar when calculating a &#8216;pooled&#8217; population proportion estimate <a href=\"https:\/\/pressbooks.bccampus.ca\/1130sandbox\/chapter\/steps-for-testing-for-differences-in-proportions\/\">in the previous chapter<\/a>. In this case, Excel&#8217;s Data Analysis Toolpak will return this value for us, so this formula is just for your reference. You will not need to use it.<\/p>\n<h2>T-test Statistic<\/h2>\n<p>We can calculate the test statistic, [latex]t_{test}[\/latex], with the following formula:<\/p>\n<p>\\[ t_{test} = \\frac{\\bar{x}_1-\\bar{x}_2}{\\sqrt{s_{p}^2 \\left( \\frac{1}{n_1} + \\frac{1}{n_2} \\right)}} \\]<\/p>\n<p>Again, we will have Excel calculate this test statistic. The formula is also just for your reference.<\/p>\n<h2>P-Value<\/h2>\n<p>In order to calculate the p-value, we first need the degrees of freedom ([latex]df[\/latex]):<\/p>\n<p>\\[ df = n_1 + n_2 -2 \\]<\/p>\n<p>We use these degrees of freedom and the test statistic in Excel&#8217;s <a href=\"https:\/\/support.microsoft.com\/en-us\/office\/t-dist-function-4329459f-ae91-48c2-bba8-1ead1c6c21b2\">T.DIST<\/a>, <a href=\"https:\/\/support.microsoft.com\/en-us\/office\/t-dist-rt-function-20a30020-86f9-4b35-af1f-7ef6ae683eda\">T.DIST.RT<\/a> or <a href=\"https:\/\/support.microsoft.com\/en-us\/office\/t-dist-2t-function-198e9340-e360-4230-bd21-f52f22ff5c28\">T.DIST.2T<\/a> functions. 
See the <a href=\"https:\/\/pressbooks.bccampus.ca\/1130sandbox\/chapter\/hypothesis-testing-for-one-mean-examples\/\">hypothesis testing on one mean section<\/a> for guidance on which function to use in each case.<\/p>\n<h1>Unpooled Variance t-Test<\/h1>\n<p>We use the final test, the unpooled variance t-test, if:<\/p>\n<ul>\n<li>We want to determine if there is a difference between the two population means.<\/li>\n<li>The samples are independent of one another.<\/li>\n<li>The standard deviation of one sample is more than double the other sample&#8217;s standard deviation.<\/li>\n<li>Equivalently, the variance of one sample is more than four times the variance of the other sample.<\/li>\n<\/ul>\n<p>In this case, we cannot assume the two populations have the same variance and therefore cannot &#8216;pool&#8217; the sample variances.<\/p>\n<h2>Test Statistic<\/h2>\n<p>The test statistic still approximately follows a <a href=\"https:\/\/en.wikipedia.org\/wiki\/Student%27s_t-distribution\">t-distribution<\/a>. When performing the hypothesis test, we assume under the null hypothesis that the difference between the population means is zero, i.e. H<sub>0<\/sub>: \u03bc<sub>1<\/sub>\u2212\u03bc<sub>2<\/sub>\u00a0= 0 or, equivalently, H<sub>0<\/sub>: \u03bc<sub>1<\/sub>=\u03bc<sub>2<\/sub>. This simplifies the t<sub>test<\/sub> formula:<\/p>\n<p>\\[ t_{test} = \\frac{ (\\bar{x}_1- \\bar{x}_2) - (\\mu_1-\\mu_2) }{\\sqrt{\\frac{s_1^2}{n_1}+\\frac{s_2^2}{n_2}}} =\\frac{(\\bar{x}_1-\\bar{x}_2)-(0)}{\\sqrt{\\frac{s_1^2}{n_1}+\\frac{s_2^2}{n_2}}}=\\frac{\\bar{x}_1-\\bar{x}_2}{\\sqrt{\\frac{s_1^2}{n_1}+\\frac{s_2^2}{n_2}}} \\]<\/p>\n<h2>Degrees of Freedom<\/h2>\n<p>The degrees of freedom formula, which is often fairly simple, becomes quite involved in this case. 
We first calculate [latex]C[\/latex]:<\/p>\n<p>\\[C = \\frac{\\frac{s_1^2}{n_1}}{\\frac{s_1^2}{n_1}+\\frac{s_2^2}{n_2}}\\]<\/p>\n<p>We use this value in the degrees of freedom calculation:<\/p>\n<p>\\[ df = \\frac{(n_1-1)(n_2-1)}{(n_2-1)C^2+(1-C)^2(n_1-1)} \\]<\/p>\n<p>Again, Excel&#8217;s Data Analysis Toolpak will calculate the degrees of freedom, test statistic and p-value for us. See the following sections for examples.<\/p>\n<h1>Required Assumptions<\/h1>\n<p>In all of the above cases, we also need the following to be true:<\/p>\n<ul>\n<li>The samples are randomly collected and unbiased.<\/li>\n<li>The sample sizes are large enough, or the populations are known to be normally distributed.<\/li>\n<\/ul>\n","protected":false},"parent":0,"menu_order":14,"template":"","meta":{"pb_part_invisible":false,"pb_part_invisible_string":""},"contributor":[],"license":[],"class_list":["post-2690","part","type-part","status-publish","hentry"],"_links":{"self":[{"href":"https:\/\/pressbooks.bccampus.ca\/1130sandbox\/wp-json\/pressbooks\/v2\/parts\/2690","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/pressbooks.bccampus.ca\/1130sandbox\/wp-json\/pressbooks\/v2\/parts"}],"about":[{"href":"https:\/\/pressbooks.bccampus.ca\/1130sandbox\/wp-json\/wp\/v2\/types\/part"}],"version-history":[{"count":25,"href":"https:\/\/pressbooks.bccampus.ca\/1130sandbox\/wp-json\/pressbooks\/v2\/parts\/2690\/revisions"}],"predecessor-version":[{"id":2839,"href":"https:\/\/pressbooks.bccampus.ca\/1130sandbox\/wp-json\/pressbooks\/v2\/parts\/2690\/revisions\/2839"}],"wp:attachment":[{"href":"https:\/\/pressbooks.bccampus.ca\/1130sandbox\/wp-json\/wp\/v2\/media?parent=2690"}],"wp:term":[{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/1130sandbox\/wp-json\/wp\/v2\/contributor?post=2690"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/1130sandbox\/wp-json\/wp\/v2\/license?post=2690"}],"curies":[{"na
me":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}