{"id":440,"date":"2021-08-10T13:00:46","date_gmt":"2021-08-10T17:00:46","guid":{"rendered":"https:\/\/pressbooks.bccampus.ca\/statspsych\/?post_type=chapter&#038;p=440"},"modified":"2022-06-04T03:29:34","modified_gmt":"2022-06-04T07:29:34","slug":"chapter-8","status":"publish","type":"chapter","link":"https:\/\/pressbooks.bccampus.ca\/statspsych\/chapter\/chapter-8\/","title":{"raw":"8. Analysis of Variance, Planned Contrasts and Posthoc Tests","rendered":"8. Analysis of Variance, Planned Contrasts and Posthoc Tests"},"content":{"raw":"<h1>8a. Analysis of Variance<\/h1>\r\nIn this chapter we graduate from teenage statistics to adult statistics! <strong>[pb_glossary id=\"441\"]Analysis of Variance[\/pb_glossary]<\/strong> is a technique that is very widely used in the analysis of data in psychology and many other disciplines. It is a system of analysis that is very flexible, and it is based on a statistical concept called the <strong>[pb_glossary id=\"442\"]general linear model[\/pb_glossary]<\/strong>. Once you learn how to use it, you can adapt it to nearly any situation.\r\n\r\nOur tasks for this lesson include grasping the concept of [pb_glossary id=\"445\"]<strong>partitioning variance<\/strong>[\/pb_glossary] into different buckets, like treatment effects vs. error, or between-groups vs. within-groups variance. Next, we will have a look at why <strong>Analysis of Variance<\/strong> is needed to analyze data from experimental designs with more than 2 groups. In particular we will examine the dangers of inflating the risk of Type I error, or alpha. And finally, we will demystify the <strong>analysis of variance<\/strong> system by conducting a one-way <strong>ANOVA<\/strong>. Just to give a little preview, in the following lessons, we will learn how to follow up on <strong>ANOVA<\/strong> with <strong>[pb_glossary id=\"452\"]planned contrasts[\/pb_glossary]<\/strong> and <strong>[pb_glossary id=\"453\"]post hoc tests[\/pb_glossary]<\/strong>, and then we will progress to a two-way <strong>ANOVA<\/strong> with factorial analysis.\r\n\r\nThe most important concept to grasp in order to intuitively understand what analysis of variance does, is the <strong>partitioning of variance<\/strong>. Variance should be a familiar concept by now. Variance is a statistic that summarizes the extent to which the individual scores in a dataset are spread out from the mean. It is calculated by the following steps:\r\n<div class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">Steps to calculate variance (sample-based estimate for a population)<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<ol>\r\n \t<li>Take the distance (\u201cdeviation\u201d) of each score from the mean.<\/li>\r\n \t<li>Next, Square each distance to get rid of the sign (because some deviations will be negative).<\/li>\r\n \t<li>Add up all the resulting \u201csquared deviations\u201d. This number is known as \u201csum of squares\u201d (SS).<\/li>\r\n \t<li>Divide the SS by the number of scores minus 1.<\/li>\r\n<\/ol>\r\n<\/div>\r\n<\/div>\r\nThis gives us an estimated variance based on a sample, that is appropriate to use in statistical analysis, in which we want to use the differences between sample means to make inferences about the differences between population means.\r\n\r\nThe way the <strong>partitioning of variance<\/strong> works is this. Differences among scores exist for all sorts of reasons. One of those reasons is the one we are actually interested in. 
The way the <strong>partitioning of variance<\/strong> works is this. Differences among scores exist for all sorts of reasons, and one of those reasons is the one we are actually interested in. Systematic differences caused by treatments, or associated with known characteristics of interest, are the differences we are hoping to see in the data. The difference in the amount of sleep that can be attributed to the effects of a new drug, for example. The difference in mood specifically caused by chocolate. These are differences between groups, or between samples, that can be explained by the variable of interest.\r\n\r\n[caption id=\"attachment_469\" align=\"aligncenter\" width=\"287\"]<img class=\"wp-image-469\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.5.png\" alt=\"\" width=\"287\" height=\"307\" \/> <em>Sources of variability in data from experimental (or quasi-experimental) research designs<\/em>[\/caption]\r\n\r\nHowever, there are also differences between scores that are not explained by the variable of interest. These are random, or unsystematic, differences. The individual differences among scores within an experimental or control condition count in this category. Errors in experimental design or in our measurements also go in this bin. When we want to make an objective decision about data, we need to separate out the systematic, explained differences, which we can label \u201cgood variance\u201d, from the random, unexplained differences, which we shall label \u201cbad variance\u201d. Note that good and bad in this context just mean that the variance counts toward (\"good\") or against (\"bad\") statistical significance.\r\n\r\n[h5p id=\"68\"]\r\n\r\nBefore we get to numeric examples of partitioning variance, maybe a visual example will help. At left we have a whole sample of dots of various colours.\r\n\r\n<img class=\"wp-image-448 alignleft\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.1.png\" alt=\"\" width=\"387\" height=\"501\" \/>\r\n\r\nWhat if we wanted to sort the data by hue, to achieve greater consistency in colour within each group? We can apply the <strong>[pb_glossary id=\"450\"]factor[\/pb_glossary]<\/strong> of hue to the dots, using three levels: green, blue and red. There are still variations of hue within each grouping, but some of the systematic variability has been separated out by grouping into these three levels. Thus we have accounted for (or explained) some proportion of the variance. The more variance we can explain, the more confident we can be in the effect of our <strong>factors<\/strong>. (In an experimental design, <strong>factors<\/strong> are independent variables.)\r\n\r\nIn the next chapter, we will see that applying an additional <strong>factor<\/strong> can further sort the colours, to account for even more of the variability among them. The more variance we can explain, through multiple <strong>factors<\/strong> and\/or multiple <strong>levels<\/strong>, the better! This is what we will be able to do with two-way <strong>ANOVA<\/strong> and factorial designs.\r\n\r\nNote: a one-way <strong>ANOVA<\/strong> includes one <strong>factor<\/strong>, whereas a two-way <strong>ANOVA<\/strong> includes two <strong>factors<\/strong>.\r\n\r\nThink of data analysis as a game in which the goal is to explain as much of the variability in the scores as possible through known <strong>factors<\/strong>. It\u2019s like imposing order over chaos in order to see patterns more clearly.\r\n\r\n<strong>Analysis of Variance<\/strong> becomes necessary when we have experimental designs that are more complex than the ones we have used to date. 
Up until now, we have covered statistical tests that can handle one-sample and two-sample experimental designs. But what if we are comparing three or more samples? <img class=\"alignleft wp-image-456 \" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.3-300x149.jpg\" alt=\"\" width=\"264\" height=\"131\" \/>For example, what if we have a drug trial in which we are comparing the mean pain levels of patients after receiving placebo, a low dose of the drug, or a high dose of the drug?\r\n\r\n<img class=\" wp-image-455 alignright\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.2-1024x376.jpg\" alt=\"\" width=\"302\" height=\"111\" \/>Or what if our memory test uses various types of stimuli, measuring memory for lists of words in black, red, blue or green? <strong>ANOVA<\/strong> can handle comparisons among 3, 4, or really any number of groups at once.\r\n\r\nLet\u2019s make sure we have a handle on the jargon that is used in <strong>ANOVA<\/strong>. First of all, the shortened term <strong>ANOVA<\/strong> came from making an acronym of sorts from the phrase <strong>Analysis of Variance<\/strong>. Secondly, the term <strong>factor<\/strong> is used to designate a nominal variable (or, in the case of an experimental design, the independent variable) that defines the groups being compared. If we have a drug trial in which we are comparing the mean pain scores of patients after receiving placebo, a low dose of the drug, or a high dose of the drug, the <strong>factor<\/strong> would be \u201cdrug dose.\u201d Finally, the term <strong>[pb_glossary id=\"458\"]levels[\/pb_glossary]<\/strong> refers to the individual conditions or values that make up a factor. In our drug trial example, we have three <strong>levels<\/strong> of drug dose: placebo, low dose, and high dose.\r\n\r\nSo how is this <strong>ANOVA<\/strong> thing different from the t-tests we already learned? Well, in fact, you can think of it as an extension of the t-test to more than 2 groups. If you run an <strong>ANOVA<\/strong> on just 2 groups, the results are equivalent to the t-test. The only difference is that you get an F-value instead of a t-value. Fun fact \u2013 the statistician who invented the t-test published it under the pseudonym \u201cStudent\u201d. Perhaps he was scared of the angst of the many students who would have to learn to use it. The F-test, however, is named for Fisher. He apparently had no such fear. Maybe he was that confident that students would love learning <strong>ANOVA<\/strong>. Hopefully he was right\u2026 ! Anyway, trust me \u2013 if you were to calculate the t-value and the F-value for the exact same two samples, the F-value would be the t-value squared.\r\n\r\n
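Actually, do not just take my word for that last claim \u2013 it is easy to check numerically. Here is a quick sketch in Python using SciPy (my choice of tools, not the chapter's; any statistics package would do):\r\n<pre># Check that for two groups, the one-way ANOVA F equals the squared t.\r\nfrom scipy import stats\r\n\r\ngroup1 = [4, 6, 5, 7, 6]    # invented data\r\ngroup2 = [8, 9, 7, 10, 9]\r\n\r\nt, _ = stats.ttest_ind(group1, group2)  # independent-means t-test\r\nf, _ = stats.f_oneway(group1, group2)   # one-way ANOVA on the same groups\r\nprint(t ** 2, f)                        # the two values match<\/pre>\r\n\r\n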
There is a nice part of using the F distribution and a not-so-nice part. The F distribution requires two degrees of freedom. Annoying, yes. But the nice part outweighs that annoyance, in my opinion: the F distribution starts from 0 and heads to the right. It has only positive values.\r\n\r\n<img class=\"aligncenter wp-image-461\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.4-1024x664.jpg\" alt=\"\" width=\"425\" height=\"275\" \/>\r\n\r\nWhat this means is no more distribution sketches, and no more one-tailed or two-tailed nonsense. So the logistics of the hypothesis test actually get a whole lot simpler.\r\n\r\n[h5p id=\"69\"]\r\n\r\nYou might be wondering: okay, so the <strong>ANOVA<\/strong> thing has some advantages, but since we already know the t-test, could we not just use multiple t-tests to compare each group within the <strong>factor<\/strong> against each other? The problem is that each comparison includes a risk of a Type I error. The risk of Type I error accumulates with multiple statistical tests on the same data, and the accumulated risk is called the <strong>[pb_glossary id=\"462\"]experimentwise alpha level[\/pb_glossary]<\/strong>. (For example, three independent tests at a .05 significance level give an experimentwise alpha of about .14, that is, 1 minus .95<sup>3<\/sup>.) <strong>ANOVA<\/strong> does one overall, or omnibus, test of treatment effects, to keep our risk of Type I error down. Inflating alpha is dangerous, and any statistical method we can use to keep it under control is a good thing.\r\n\r\n[h5p id=\"67\"]\r\n\r\n[h5p id=\"70\"]\r\n\r\nThe calculation method I will show you differs from more efficient methods you can find on the internet or in many other textbooks. Sorry about that, but the nice thing about the method shown here is that it has a beautiful symmetry to it and highlights the concept of partitioning of variance. In other words, these are conceptual rather than computational formulas. This is a deliberate choice to help you understand how <strong>ANOVA<\/strong> works, because, if we think about it, you will never need to calculate statistics by hand in the \"real world.\" You will always be able to use a computer instead. However, all the computers in the world cannot help you choose an appropriate statistic for a particular situation, or understand and articulate how the selected statistical test works. That is what an introduction to statistics is really about.\r\n\r\nThere will also be an inherent math double-check opportunity in this method, which I think you will appreciate. In reality, most people use a computer to calculate <strong>ANOVA<\/strong>. However, I do like to ask you to try calculating things by hand, so you can see how it works. My hope is that you gain a better conceptual understanding of the mechanisms behind these statistical tests by applying them, and by seeing how it all fits together like a puzzle, with tangible examples. Given that, I think this calculation system is better than the others you could use.\r\n\r\nSo, how does <strong>ANOVA<\/strong> work? Essentially, it works by calculating different kinds of sums of squares, which we will continue to abbreviate as SS. As you can see, there are three flavours of SS, each of which can be calculated using the formulas shown.\r\n\r\n<img class=\"aligncenter wp-image-474\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.6-1024x282.png\" alt=\"\" width=\"696\" height=\"192\" \/>\r\n\r\nThe Sum of Squares Between-groups (SSB) and the Sum of Squares Within-groups (SSW) should add up to the Sum of Squares Total (SST). So here you see the <strong>partitioning of variance<\/strong> coming in.\r\n\r\n
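To see this partition in action, here is a small numeric sketch (NumPy and the data are my own assumptions for illustration) that computes all three flavours of SS and confirms that the first two add up to the third:\r\n<pre>import numpy as np\r\n\r\n# Three groups of invented scores.\r\ngroups = [np.array([3, 4, 5]), np.array([6, 7, 8]), np.array([9, 10, 11])]\r\nall_scores = np.concatenate(groups)\r\ngrand_mean = all_scores.mean()   # the overall mean, M_o\r\n\r\n# SSB: for each group, N_g times the squared distance of M_g from M_o.\r\nssb = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)\r\n# SSW: squared deviation of each score from its own group mean.\r\nssw = sum(((g - g.mean()) ** 2).sum() for g in groups)\r\n# SST: squared deviation of each score from the grand mean.\r\nsst = ((all_scores - grand_mean) ** 2).sum()\r\n\r\nprint(ssb, ssw, sst)   # 54.0, 6.0 and 60.0: SSB + SSW = SST<\/pre>\r\n\r\n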
There are also three flavours of degrees of freedom, with matching labels. They also should add up.\r\n\r\n<img class=\"aligncenter wp-image-471\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.7.png\" alt=\"\" width=\"490\" height=\"100\" \/>\r\n\r\n[h5p id=\"73\"]\r\n\r\nNotice I used colour coding to help you track the \u201cgood\u201d variance in green and the \u201cbad\u201d variance in red.\r\n\r\nOnce you have the SS and degrees of freedom calculated, you can find the variances.\r\n\r\n<img class=\"aligncenter wp-image-472\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.8.png\" alt=\"\" width=\"175\" height=\"145\" \/>\r\n\r\nThe F-test is simple: it is the ratio of explained to unexplained variance, represented by the variance between and the variance within. You need more explained than unexplained variance to be able to reject the null.\r\n\r\n<img class=\"aligncenter wp-image-473\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.9.png\" alt=\"\" width=\"334\" height=\"106\" \/>\r\n\r\nHow large the ratio needs to be depends on the degrees of freedom. And that\u2019s where sample size becomes very important, just as we saw in the t-test.\r\n\r\n[h5p id=\"72\"]\r\n\r\nOne of the beautiful things about <strong>ANOVA<\/strong> is the calculation table. This is a way of organizing all the components of the workflow, and also of highlighting our two math double-checks. For both SS and degrees of freedom, the Between and Within numbers should add up to the total.\r\n\r\n<img class=\"aligncenter wp-image-475\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.10-1024x526.png\" alt=\"\" width=\"700\" height=\"360\" \/>\r\n\r\nThis table is a good reference for you to keep to hand, as a reminder of each formula and of how the ANOVA puzzle fits together.\r\n\r\nNote what each symbol in these formulas means by referring to the symbols key in the lower right of the table. The one element that tends to be confusing is N<sub>g<\/sub>. This symbol refers to the number of scores in the group \u2013 not to the number of groups in the study. This is important to interpret correctly. If you ever find that your SSB and SSW do not add up to your SST \u2013 like, really not even close \u2013 then that is the first thing to double-check: did you use the number of scores in the group when calculating SSB?\r\n\r\n
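Picking up the numeric sketch from earlier (same invented groups), the remaining columns of the table take only a few lines, with both double-checks included:\r\n<pre>k = len(groups)              # number of groups\r\nn_total = len(all_scores)    # total number of scores\r\n\r\ndf_b = k - 1                 # degrees of freedom between\r\ndf_w = n_total - k           # degrees of freedom within\r\ndf_t = n_total - 1           # degrees of freedom total\r\n\r\nassert np.isclose(ssb + ssw, sst)   # double-check 1: the SS add up\r\nassert df_b + df_w == df_t          # double-check 2: the df add up\r\n\r\ns2_between = ssb \/ df_b      # the 'good' (explained) variance\r\ns2_within = ssw \/ df_w       # the 'bad' (unexplained) variance\r\nf = s2_between \/ s2_within\r\nprint(f)                     # F = 27.0 for these invented data<\/pre>\r\n\r\n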
Now that our research designs are getting more complex, our statistical findings will need a little more in the way of descriptive statistics and graphical portrayal in order for us to easily interpret what the hypothesis test results really tell us. I would encourage you to always graph the means and standard deviations of your data before conducting your inferential statistics, so that you get a sense of the significance before you begin. Do not go blind into that statistical routine\u2026 remember, numbers are never as informative as a picture of the data.\r\n\r\n<img class=\" wp-image-477 alignleft\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.12.png\" alt=\"\" width=\"317\" height=\"268\" \/>\r\n\r\nI recommend a bar graph displaying the group means, with error bars as tall as the standard deviation (or standard error) added on top of each mean. You can also show the bars going one standard deviation downward as well, to capture the full range of typical scores in the group.\r\n\r\nIf the error bars eclipse the difference in group means, that is a bad sign if your goal is to report a significant difference among means. This visual allows you to preview the signal-to-noise ratio in your data \u2013 your between-to-within variance ratio.\r\n\r\n<img class=\" wp-image-476 alignright\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.11-1024x691.jpg\" alt=\"\" width=\"466\" height=\"315\" \/>Another really great visual is the group scatter plot, shown here. It is not really a standard way to view datasets, but I think it should be.\r\n\r\nStep 1 of hypothesis testing for an ANOVA truly becomes a formality. The hypotheses are always the same. Define a population for each group. Set the research hypothesis to be a general statement of difference among population means. Set the null to be a statement of equality among population means. There is no directionality with the F distribution, so we do not need to worry about the predicted direction of differences.\r\n\r\nUsing our drug-dose example with three levels, the populations and hypotheses would look something like this:\r\n<div>\r\n<div class=\"textbox\">\r\n<div><strong>Population 1<\/strong>: People who receive a low dose of the drug<\/div>\r\n<div><strong>Population 2<\/strong>: People who receive a high dose of the drug<\/div>\r\n<div><strong>Population 3<\/strong>: People who do not receive the drug<\/div>\r\n<div><\/div>\r\n<div><strong>Research Hypothesis<\/strong>: There exists at least one difference among the population means.<\/div>\r\n<div><strong>Null Hypothesis<\/strong>: <em>\u00b5<\/em><sub>1<\/sub> = <em>\u00b5<sub>2<\/sub><\/em> = <em>\u00b5<sub>3<\/sub><\/em> \u2013 all population means are equal.<\/div>\r\n<\/div>\r\n<img class=\" wp-image-478 alignright\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.13.png\" alt=\"\" width=\"177\" height=\"42\" \/>Now we can move on to step 2. The F distribution has two degrees of freedom.\r\n\r\nWe no longer have to worry about the mean or standard deviation of the comparison distribution; we just need to find the degrees of freedom between and within. The \"good\" variance is the differences between groups, so the degrees of freedom between is the number of groups minus 1. The within-groups variance, the \"bad\" variance, is the individual differences among the scores within each group. The degrees of freedom within, then, is the total number of scores in all groups, minus the number of groups.\r\n\r\nFor step 3, we can find the cutoff score in the F-tables if we know the significance level, the degrees of freedom between and the degrees of freedom within.\r\n\r\n<img class=\"aligncenter wp-image-481\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.15-1024x684.jpg\" alt=\"\" width=\"373\" height=\"249\" \/>\r\n\r\n
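If you do not have printed F-tables to hand, the same lookup is a single line of code. A sketch, again assuming SciPy (ppf is the inverse of the cumulative distribution function, so it turns a tail area into an F value):\r\n<pre>from scipy import stats\r\n\r\nalpha = 0.05\r\ndf_between, df_within = 2, 6\r\ncutoff = stats.f.ppf(1 - alpha, df_between, df_within)\r\nprint(cutoff)   # about 5.14, matching the printed F-table<\/pre>\r\n\r\n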
Step 4 is where things take some getting used to. Here we use this new system of formulas. Start with the Sum of Squares calculations (Between, Within, and Total), and double-check that both they and the degrees of freedom add up.\r\n\r\n[latex] \\[SS_{B}=\\sum [N_{g}(M_{g}-M_{o})^{2}]\\] [\/latex]\r\n\r\n[latex] \\[SS_{W}=\\sum (X-M_{g})^{2}\\] [\/latex]\r\n\r\n[latex] \\[SS_{T}=\\sum (X-M_{o})^{2}\\] [\/latex]\r\n\r\nThen move across the table, finding the good and the bad variance...\r\n\r\n<\/div>\r\n[latex] \\[S^{2}_{B}=\\frac{SS_{B}}{df_{B}}\\] [\/latex]\r\n\r\n[latex] \\[S^{2}_{W}=\\frac{SS_{W}}{df_{W}}\\] [\/latex]\r\n\r\n... <span style=\"text-align: initial;font-size: 1em\">and finally getting their ratio for the F-test result.<\/span>\r\n<div>\r\n\r\n[latex] \\[F=\\frac{S^{2}_{B}}{S^{2}_{W}}\\] [\/latex]\r\n\r\nTo make our decision in Step 5, we examine the calculated F value (from Step 4) and determine whether it exceeds the cutoff F score (from Step 3). If so, we reject the null hypothesis.\r\n\r\n<span class=\"pullquote-right\">\"There is a significant difference among the mean digit memory scores after listening to the three types of music (F<sub>2,6<\/sub> = 27.00, p &lt; 0.05).\"<\/span> Here is an example of how to express the results \u2013 note the phrase \u201csignificant difference among the means.\u201d If we do not reject the null, we can switch the statement of results to \u201cno significant difference.\u201d The test statistic and p-value are expressed here in common formats.\r\n\r\nWe can continue building a decision tree to help you decide which statistical test to use when you look at a research question. What are the circumstances in which you would need to use a one-way <strong>ANOVA<\/strong> test?\r\n\r\n<\/div>\r\n<img class=\"aligncenter wp-image-493\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.16-1024x338.png\" alt=\"\" width=\"688\" height=\"227\" \/>\r\n\r\n[h5p id=\"74\"]\r\n\r\n[h5p id=\"75\"]\r\n\r\n[h5p id=\"76\"]\r\n<h1>8b. Planned Contrasts and Post Hoc Tests<\/h1>\r\nIn the second part of this chapter, we will have a look at the follow-up tests we can conduct after an <strong>ANOVA<\/strong> hypothesis test, to investigate the findings in greater detail.\r\n\r\nPlanned contrasts and post hoc tests are commonly performed following <strong>Analysis of Variance<\/strong>. This is necessary in many instances because <strong>ANOVA<\/strong> compares all individual mean differences simultaneously, in one test (referred to as an omnibus test). If we run an <strong>ANOVA<\/strong> hypothesis test and the F-test comes out significant, this indicates that at least one of the mean differences is statistically significant. However, when the <strong>factor<\/strong> has more than two <strong>levels<\/strong>, it does not indicate which means differ significantly from each other.\r\n\r\n<img class=\"wp-image-494 alignright\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.17.png\" alt=\"\" width=\"335\" height=\"302\" \/>\r\n\r\nIn this example, a significant F-test result from a one-way <strong>ANOVA<\/strong> with the three drug dose conditions does not tell us where the significant difference lies. Is it between 0 and 100 mg? Or between 100 and 200 mg? Or is it only the biggest difference that is significant \u2013 0 vs. 200 mg?\r\n\r\n<strong>Planned contrasts<\/strong> and <strong>post hoc tests<\/strong> are additional tests to determine exactly which mean differences are significant, and which are not. 
Why is it that we cannot just do 3 independent-means t-tests here? Each time we conduct a t-test, we take on a certain risk of a Type I error. If we do 3, we have nearly triple the risk. So first we test for omnibus significance using the overall <strong>ANOVA<\/strong>, as detailed in the first part of this chapter. Then, if a statistically significant difference exists among the means, we do the pairwise comparisons with an adjustment to be more conservative. These follow-up tests are designed specifically to avoid inflating the risk of Type I error.\r\n\r\nNow, this is very important. We are <em>only<\/em> allowed to conduct these tests <em>if the F-test result was significant<\/em>. This procedural rule also helps protect us from the statistical sin of p-hacking, which is selectively hunting for and reporting significant results in a way that is biased and subjective.\r\n\r\n<strong>Planned contrasts<\/strong> are used when researchers know in advance which groups they expect to differ. For example, suppose that in our worksheet example we expect the pop group to differ from the classical group on our measure of working memory. We can then conduct a single comparison between these means without worrying about Type I error. Because we hypothesized this difference before we saw the data, perhaps based on prior research studies or a strong intuitive hunch, and because there is only one comparison to be analyzed, we need not be concerned about an inflated <strong>experimentwise alpha<\/strong>. If multiple comparisons are planned, then we will need to adjust the significance level.\r\n\r\nLet us take a look at how to conduct a single <strong>planned contrast<\/strong>. The process is quite simple, as it is just a modified <strong>ANOVA<\/strong> analysis. First, we calculate SSB with just the two groups involved in the planned contrast. We figure out the degrees of freedom between using just the two groups. Then we calculate the variance between using the new SSB and degrees of freedom, and we calculate an F-test for the comparison using the new variance between and the original overall variance within. To find out if the F-test result is significant, we use the new degrees of freedom but the original significance level for the cutoff. (Because there is just one pairwise comparison, we can use the original significance level.) The steps are summarized below, with a code sketch to follow.\r\n<div class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">Steps to calculate a planned contrast<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<ol>\r\n \t<li>Calculate SS<sub>Between<\/sub> with just those two groups.<\/li>\r\n \t<li>Find the df<sub>Between<\/sub> using just the two groups.<\/li>\r\n \t<li>Calculate S<sup>2<\/sup><sub>Between<\/sub> using the new SS<sub>Between<\/sub> and the new df<sub>Between<\/sub>.<\/li>\r\n \t<li>Calculate F using the new S<sup>2<\/sup><sub>Between<\/sub> and the overall S<sup>2<\/sup><sub>Within<\/sub>.<\/li>\r\n<\/ol>\r\n<\/div>\r\n<\/div>\r\n
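Here is that four-step recipe as a Python sketch (NumPy assumed; the function name and data layout are my own, and the overall within-groups variance is carried over from the full ANOVA you have already run):\r\n<pre>import numpy as np\r\n\r\ndef planned_contrast_f(group1, group2, s2_within_overall):\r\n    # Step 1: SS between, using just these two groups.\r\n    pair_mean = np.concatenate([group1, group2]).mean()\r\n    ssb = sum(len(g) * (np.mean(g) - pair_mean) ** 2\r\n              for g in (group1, group2))\r\n    # Step 2: df between for two groups is 2 - 1 = 1.\r\n    df_b = 1\r\n    # Step 3: the new variance between.\r\n    s2_b = ssb \/ df_b\r\n    # Step 4: F from the new variance between and the overall variance within.\r\n    return s2_b \/ s2_within_overall<\/pre>\r\nThe result is then compared against the cutoff for (1, overall df within) at the original significance level.\r\n\r\n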
If we were to perform multiple planned contrasts, things change a little. Suppose we had hypothesized in this experiment that each group would differ from the others. The <strong>[pb_glossary id=\"497\"]Bonferroni correction[\/pb_glossary]<\/strong> involves adjusting the significance level to protect against the inflation of the risk of Type I error. The procedure for each comparison is the same as for a single planned contrast. The difference is that the cutoff score used to determine statistical significance will come from a more conservative significance level. When we do multiple pairwise comparisons, the <strong>Bonferroni correction<\/strong> is to use the original significance level divided by the number of planned contrasts. The adjusted significance level is not likely to be in our F-tables, so to find the cutoff for such tests, we would need to use an <a href=\"http:\/\/statpages.org\/pdfs.html\" target=\"_blank\" rel=\"noopener\">online calculator<\/a> in reverse (that is, we enter the p-value and degrees of freedom, and look up the value on the F-distribution corresponding to that area in the tail).\r\n\r\n
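In code, that reverse lookup is the same ppf call from before, just with the Bonferroni-adjusted level. A sketch, assuming SciPy and three planned contrasts at an original significance level of .05:\r\n<pre>from scipy import stats\r\n\r\nalpha = 0.05\r\nn_contrasts = 3\r\nalpha_adj = alpha \/ n_contrasts             # Bonferroni-adjusted level\r\ncutoff = stats.f.ppf(1 - alpha_adj, 1, 6)   # df = (1, overall df within)\r\nprint(alpha_adj, cutoff)                    # about .0167, and a stricter cutoff<\/pre>\r\n\r\n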
What about <strong>post hoc tests<\/strong>? As the name suggests, these tests come into the picture when we are doing pairwise comparisons (usually all possible combinations) after the fact, to find out where the significant differences were. These tests do not require that we had an <em>a priori<\/em> hypothesis ahead of data collection. Essentially, they are an allowable and acceptable form of data-snooping. This is where we must be cautious about doing so many tests \u2013 we could end up with a huge risk of Type I error. If we used the <strong>Bonferroni correction<\/strong> that we saw for multiple planned comparisons on more than 3 tests, the significance level would become vanishingly small. This would make it nearly impossible to detect significant differences. For this reason, slightly more forgiving tests like <strong>[pb_glossary id=\"498\"]Scheff\u00e9\u2019s correction[\/pb_glossary]<\/strong>, Dunn\u2019s or Tukey\u2019s <strong>post hoc tests<\/strong> are more popular. There are many different post hoc tests out there, and the choice of which one researchers use is often a matter of convention in their area of research.\r\n\r\n[h5p id=\"77\"]\r\n\r\nNow we shall take a look at how to conduct <strong>post hoc tests<\/strong> using <strong>Scheff\u00e9\u2019s correction<\/strong>. In this example, we will test all pairwise comparisons. The <strong>Scheff\u00e9<\/strong> technique involves adjusting the F-test result, rather than adjusting the significance level. It works the same way as the <strong>planned contrast<\/strong> procedure, except at the very end: before we compare the F-test result to the cutoff score, we divide the F value by the overall degrees of freedom between, or the number of groups minus one. Thus, we keep the significance level at the original level, but divide the calculated F by the overall degrees of freedom between from the overall <strong>ANOVA<\/strong>. The steps are below, followed by a short code sketch.\r\n<div class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">Steps to calculate post hoc tests with Scheff\u00e9's correction<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\nFor each pairwise comparison:\r\n<ol>\r\n \t<li>Calculate SS<sub>Between<\/sub> with just those two groups.<\/li>\r\n \t<li>Find the df<sub>Between<\/sub> using just the two groups.<\/li>\r\n \t<li>Calculate S<sup>2<\/sup><sub>Between<\/sub> using the new SS<sub>Between<\/sub> and the new df<sub>Between<\/sub>.<\/li>\r\n \t<li>Calculate F using the new S<sup>2<\/sup><sub>Between<\/sub> and the overall S<sup>2<\/sup><sub>Within<\/sub>.<\/li>\r\n \t<li>Divide F by the overall df<sub>Between<\/sub>.<\/li>\r\n<\/ol>\r\n<\/div>\r\n<\/div>\r\n
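In code, the Scheff\u00e9 version is just the planned-contrast sketch from earlier with one extra division (planned_contrast_f is the illustrative function defined above):\r\n<pre>def scheffe_f(group1, group2, s2_within_overall, df_between_overall):\r\n    # Steps 1 to 4 are identical to a planned contrast...\r\n    f = planned_contrast_f(group1, group2, s2_within_overall)\r\n    # Step 5: ...then divide F by the overall df between (groups minus one).\r\n    return f \/ df_between_overall<\/pre>\r\nThe adjusted F is then compared to a cutoff at the original significance level.\r\n\r\n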
Next, we will have a look at why <strong>Analysis of Variance<\/strong> is needed to analyze data from experimental designs with more than 2 groups. In particular we will examine the dangers of inflating the risk of Type I error, or alpha. And finally, we will demystify the <strong>analysis of variance<\/strong> system by conducting a one-way <strong>ANOVA<\/strong>. Just to give a little preview, in the following lessons, we will learn how to follow up on <strong>ANOVA<\/strong> with <strong><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_440_452\">planned contrasts<\/a><\/strong> and <strong><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_440_453\">post hoc tests<\/a><\/strong>, and then we will progress to a two-way <strong>ANOVA<\/strong> with factorial analysis.<\/p>\n<p>The most important concept to grasp in order to intuitively understand what analysis of variance does, is the <strong>partitioning of variance<\/strong>. Variance should be a familiar concept by now. Variance is a statistic that summarizes the extent to which the individual scores in a dataset are spread out from the mean. It is calculated by the following steps:<\/p>\n<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">Steps to calculate variance (sample-based estimate for a population)<\/p>\n<\/header>\n<div class=\"textbox__content\">\n<ol>\n<li>Take the distance (\u201cdeviation\u201d) of each score from the mean.<\/li>\n<li>Next, Square each distance to get rid of the sign (because some deviations will be negative).<\/li>\n<li>Add up all the resulting \u201csquared deviations\u201d. This number is known as \u201csum of squares\u201d (SS).<\/li>\n<li>Divide the SS by the number of scores minus 1.<\/li>\n<\/ol>\n<\/div>\n<\/div>\n<p>This gives us an estimated variance based on a sample, that is appropriate to use in statistical analysis, in which we want to use the differences between sample means to make inferences about the differences between population means.<\/p>\n<p>The way the <strong>partitioning of variance<\/strong> works is this. Differences among scores exist for all sorts of reasons. One of those reasons is the one we are actually interested in. Systematic difference cause by treatments or associated with known characteristics of interest are the differences we are hoping to see in the data. The difference in amount of sleep that can be attributed to the effects of a new drug. The difference in mood specifically caused by chocolate. 
These are difference between groups, or between samples, that can be explained by the variable of interest.<\/p>\n<figure id=\"attachment_469\" aria-describedby=\"caption-attachment-469\" style=\"width: 287px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-469\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.5.png\" alt=\"\" width=\"287\" height=\"307\" srcset=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.5.png 653w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.5-281x300.png 281w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.5-65x69.png 65w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.5-225x240.png 225w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.5-350x374.png 350w\" sizes=\"auto, (max-width: 287px) 100vw, 287px\" \/><figcaption id=\"caption-attachment-469\" class=\"wp-caption-text\"><em>Sources of variability in data from experimental (or quasi-experimental) research designs<\/em><\/figcaption><\/figure>\n<p>However, there are also difference between scores that are not explained by the variable of interest. These are random, or unsystematic differences. The individual differences among scores within an experimental or control condition count in this category. Error in experimental design or in our measurements also go in this bin. When we want to make an objective decision about data, we need to separate out the systematic, explained differences, which we can label \u201cgood variance\u201d, from the random, unexplained differences, which we shall label \u201cbad variance\u201d. Note that good and bad in this context just means it counts toward (&#8220;good&#8221;) or against (&#8220;bad&#8221;) statistical significance.<\/p>\n<div id=\"h5p-68\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-68\" class=\"h5p-iframe\" data-content-id=\"68\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"Practice 8a.02. Source of experimental error.\"><\/iframe><\/div>\n<\/div>\n<p>Before we get to numeric examples of partitioning variance, maybe a visual example will help. At left we have a whole sample of dots of various colours.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-448 alignleft\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.1.png\" alt=\"\" width=\"387\" height=\"501\" srcset=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.1.png 659w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.1-232x300.png 232w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.1-65x84.png 65w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.1-225x291.png 225w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.1-350x453.png 350w\" sizes=\"auto, (max-width: 387px) 100vw, 387px\" \/><\/p>\n<p>What if we wanted to sort the data by hue, to achieve greater consistency in colour within each group. 
We can apply the <strong><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_440_450\">factor<\/a><\/strong> of hue to the dots, using three levels: green, blue and red. There are still variations of hue within each grouping, but some of the systematic variability has been separated out by grouping into these three levels. Thus we have accounted for (or explained) some proportion of the variance. The more variance we can explain, the more confident we can be in the effect of our <strong>factors<\/strong>. (In an experimental design, <strong>factors<\/strong> are independent variables.)<\/p>\n<p>In the next chapter, we will see that applying an additional <strong>factor<\/strong> can further sort the colours, to account for even more variability among colours. The more variance we can explain, through multiple <strong>factors<\/strong> and\/or multiple <strong>levels<\/strong>, the better! This is what we will be able to do with two-way <strong>ANOVA<\/strong> and factorial designs.<\/p>\n<p>Note: a one-way <strong>ANOVA<\/strong> includes one <strong>factor<\/strong>, whereas a two-way <strong>ANOVA<\/strong> includes two <strong>factors<\/strong>.<\/p>\n<p>Think of data analysis as a game in which the goal is to explain as much of the variability in the scores as possible through known <strong>factors<\/strong>. It\u2019s like imposing order over chaos in order to see patterns more clearly.<\/p>\n<p><strong>Analysis of Variance<\/strong> becomes necessary when we have experimental designs that are more complex than the ones we have used to date. Up until now, we have covered statistical tests that can handle one-sample and two-sample experimental designs. But what if we are comparing three or more samples? 
<img loading=\"lazy\" decoding=\"async\" class=\"alignleft wp-image-456\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.3-300x149.jpg\" alt=\"\" width=\"264\" height=\"131\" srcset=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.3-300x149.jpg 300w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.3-65x32.jpg 65w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.3-225x112.jpg 225w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.3-350x174.jpg 350w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.3.jpg 700w\" sizes=\"auto, (max-width: 264px) 100vw, 264px\" \/>For example, what if we have a drug trial in which we are comparing the mean pain levels of patients after receiving placebo, a low dose of the drug, or a high dose of the drug?<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-455 alignright\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.2-1024x376.jpg\" alt=\"\" width=\"302\" height=\"111\" srcset=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.2-1024x376.jpg 1024w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.2-300x110.jpg 300w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.2-768x282.jpg 768w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.2-65x24.jpg 65w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.2-225x83.jpg 225w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.2-350x129.jpg 350w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.2.jpg 1261w\" sizes=\"auto, (max-width: 302px) 100vw, 302px\" \/>Or what if our memory test using various types of stimuli measures memory for lists of words in black, red, blue or green? <strong>ANOVA<\/strong> can handle comparisons among 3, 4, or really any number of groups at once.<\/p>\n<p>Let\u2019s make sure we have a handle on the jargon that is used in <strong>ANOVA<\/strong>. First of all, the shortened term <strong>ANOVA<\/strong> came from making an acronym of sorts from the phrase <strong>Analysis of Variance<\/strong>. Secondly, the term <strong>factor<\/strong> is used to designate a nominal variable, or in the case of an experimental design, the independent variable, that designates the groups being compared. If we have a drug trial in which we are comparing the mean pain scores of patients after receiving placebo, a low dose of the drug, or a high dose of the drug, the <strong>factor<\/strong> would be \u201cdrug dose.\u201d Finally, the term <strong><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_440_458\">levels<\/a><\/strong> refers to the individual conditions or values that make up a factor. In our drug trial example, we have three <strong>levels<\/strong> of drug dose: placebo, low dose, and high dose.<\/p>\n<p>So how is this <strong>ANOVA<\/strong> thing different from the t-tests we already learned? 
Well, in fact, you can think of it as an extension of the t-test to more than 2 groups. If you run an <strong>ANOVA<\/strong> on just 2 groups, the results are equivalent to the t-test. The only difference is that you get an F-value instead of a t-value. Fun fact \u2013 the statistician who invented the t-test published it under a pseudonym \u201cStudent\u201d. Perhaps he was scared of the angst of the many students who would have to learn to use it. The F-test, however, is named for Fisher. He apparently had no such fear. Maybe he was that confident that students would love learning <strong>ANOVA<\/strong>. Hopefully he was right\u2026 ! Anyway, trust me \u2013 if you were to calculate the t-value and the F-value for the exact same two sample, the F-value would be the t-value squared.<\/p>\n<p>There is a nice part of using the F distribution and a not so nice part of it. The F distribution requires two degrees of freedom. Annoying, yes. But the nice part outweighs that annoyance, in my opinion. The F distribution starts from 0 and heads to the right. It has only positive values.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-461\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.4-1024x664.jpg\" alt=\"\" width=\"425\" height=\"275\" srcset=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.4-1024x664.jpg 1024w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.4-300x194.jpg 300w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.4-768x498.jpg 768w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.4-65x42.jpg 65w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.4-225x146.jpg 225w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.4-350x227.jpg 350w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.4.jpg 1196w\" sizes=\"auto, (max-width: 425px) 100vw, 425px\" \/><\/p>\n<p>What this means is no more distribution sketches, and no more one-tailed or two-tailed nonsense. So the logistics of the hypothesis test actually get a whole lot simpler.<\/p>\n<div id=\"h5p-69\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-69\" class=\"h5p-iframe\" data-content-id=\"69\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"Practice 8a.03. ANOVA vs T-test.\"><\/iframe><\/div>\n<\/div>\n<p>You might be wondering, okay so the <strong>ANOVA<\/strong> thing has some advantages, but we do already know the t-test, so could we not just use multiple t-tests to compare each group within the <strong>factor<\/strong> against each other? The problem is, each comparison includes a risk of a Type I error. The risk of Type I error accumulates with multiple statistical tests on the same data, and that is called the <strong><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_440_462\">experimentwise alpha level<\/a><\/strong>. <strong>ANOVA<\/strong> does one overall, or omnibus, test of treatment effects, to keep our risk of Type I error down. 
Inflating alpha is dangerous, and any statistical method we can use to keep it under control is a good thing.<\/p>\n<div id=\"h5p-67\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-67\" class=\"h5p-iframe\" data-content-id=\"67\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"Practice 8a.01. ANOVA vs T-test\"><\/iframe><\/div>\n<\/div>\n<div id=\"h5p-70\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-70\" class=\"h5p-iframe\" data-content-id=\"70\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"Practice 8a.04. ANOVA vs T-test.\"><\/iframe><\/div>\n<\/div>\n<p>The calculation method I will show you differs from more efficient methods you can find on the internet or in many other textbooks. Sorry for that, but the nice thing about the method I will show you here is that it has beautiful symmetry to it and highlights the concept of partitioning of variance. In other words these are conceptual rather than calculational formulas. This is a deliberate choice to help you understand how <strong>ANOVA<\/strong> works, because if we think about it, you will never need to calculate statistics by hand in the &#8220;real world.&#8221; You will always be able to use a computer instead. However, all the computers in the world cannot help you choose an appropriate statistic for a particular situation, or to understand\/articulate how the selected statistical test works. This is what an introduction to statistics is really about.<\/p>\n<p>&nbsp;<\/p>\n<p>There will also be an inherent math double-check opportunity in this method, which I think you will appreciate. In reality, most people use a computer to calculate <strong>ANOVA<\/strong>. However, I do like to ask you to try calculating things by hand, so you can see how it works. My hope is that you gain a better conceptual understanding of the mechanisms behind these statistical tests by applying them, and seeing how it all fits together like a puzzle, with tangible examples. Given that, I think this calculation system is better than others you can use.<\/p>\n<p>So, how does <strong>ANOVA<\/strong> work? Essentially it works by calculating different kinds of Sums of squares, which we will continue to abbreviate as SS. 
As you can see, there are three flavours of SS that can each be calculated using the formulas shown.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-474\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.6-1024x282.png\" alt=\"\" width=\"696\" height=\"192\" srcset=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.6-1024x282.png 1024w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.6-300x83.png 300w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.6-768x212.png 768w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.6-65x18.png 65w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.6-225x62.png 225w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.6-350x96.png 350w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.6.png 1324w\" sizes=\"auto, (max-width: 696px) 100vw, 696px\" \/><\/p>\n<p>The Sum of squares Between-groups (SSB) and Sum of Squares Within-groups (SSW) should add up to the Sum of Squares Total (SST). So here you see the <strong>partitioning of variance<\/strong> coming in.<\/p>\n<p>There are also three flavours of degrees of freedom, with matching labels. They also should add up.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-471\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.7.png\" alt=\"\" width=\"490\" height=\"100\" srcset=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.7.png 920w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.7-300x61.png 300w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.7-768x157.png 768w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.7-65x13.png 65w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.7-225x46.png 225w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.7-350x72.png 350w\" sizes=\"auto, (max-width: 490px) 100vw, 490px\" \/><\/p>\n<div id=\"h5p-73\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-73\" class=\"h5p-iframe\" data-content-id=\"73\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"Practice 8a.06. 
Degrees of freedom\"><\/iframe><\/div>\n<\/div>\n<p>Notice I used colour coding to help you track the \u201cgood\u201d variance in green and the \u201cbad\u201d variance in red.<\/p>\n<p>Once you have the SS and degrees of freedom calculated, you can find the variances.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-472\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.8.png\" alt=\"\" width=\"175\" height=\"145\" srcset=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.8.png 366w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.8-300x248.png 300w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.8-65x54.png 65w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.8-225x186.png 225w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.8-350x289.png 350w\" sizes=\"auto, (max-width: 175px) 100vw, 175px\" \/><\/p>\n<p>The F-test is simple: it is the ratio of explained to unexplained variance, which is represented by the variance between and the variance within. You need more explained than unexplained variance to be able to reject the null.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-473\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.9.png\" alt=\"\" width=\"334\" height=\"106\" srcset=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.9.png 623w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.9-300x95.png 300w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.9-65x21.png 65w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.9-225x72.png 225w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.9-350x111.png 350w\" sizes=\"auto, (max-width: 334px) 100vw, 334px\" \/><\/p>\n<p>How much the ratio needs to be depends on the degrees of freedom. And that\u2019s where sample size becomes very important, just as we saw in the t-test.<\/p>\n<div id=\"h5p-72\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-72\" class=\"h5p-iframe\" data-content-id=\"72\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"Practice 8a.07. F-ratio of Null hypothesis is true.\"><\/iframe><\/div>\n<\/div>\n<p>One of the beautiful things about <strong>ANOVA<\/strong> is the calculation table.\u00a0 This is a way of organizing all the components of the workflow, and also highlighting our two math double-checks. 
For both SS and degrees of freedom, the Between and Within numbers should add up to the total.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-475\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.10-1024x526.png\" alt=\"\" width=\"700\" height=\"360\" srcset=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.10-1024x526.png 1024w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.10-300x154.png 300w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.10-768x394.png 768w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.10-65x33.png 65w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.10-225x115.png 225w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.10-350x180.png 350w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.10.png 1319w\" sizes=\"auto, (max-width: 700px) 100vw, 700px\" \/><\/p>\n<p>This table is a good reference for you to keep to hand, as a reminder of each formula and how the ANOVA puzzle fits together.<\/p>\n<p>Note what each symbol in these formulas means, by referring to the symbols key in the lower right of the table. The one element that tends to be confusing is N<sub>g<\/sub>. This symbol refers to the number of scores in the group \u2013 not to the number of groups in the study. This is important to interpret correctly. If you ever find that your SSB and SSW do not add up to your SST, like really not even close, then that is the first thing to double check. Did you use the number of scores in the group when calculating SSB?<\/p>\n<p>Now that our research designs are getting more complex, our statistical findings will need a little more descriptive statistics and graphical portrayal in order to easily interpret what those hypothesis test results really tell us. I would encourage you to always graph the means and standard deviations of your data before conducting your inferential statistics, so that you get a sense for significance before you begin. Do not go blind into that statistical routine\u2026 remember, numbers are never as informative as a picture of the data.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-477 alignleft\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.12.png\" alt=\"\" width=\"317\" height=\"268\" srcset=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.12.png 520w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.12-300x254.png 300w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.12-65x55.png 65w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.12-225x190.png 225w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.12-350x296.png 350w\" sizes=\"auto, (max-width: 317px) 100vw, 317px\" \/><\/p>\n<p>I recommend a bar graph displaying group means, and adding error bars as tall as the standard deviation (or standard error) on top of the mean. 
<p>If the error bars eclipse the difference in group means, that is a bad sign for anyone hoping to report a significant difference among the means. This visual allows you to preview the signal-to-noise ratio in your data, or your between-to-within variance ratio.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-476 alignright\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.11-1024x691.jpg\" alt=\"\" width=\"466\" height=\"315\" srcset=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.11-1024x691.jpg 1024w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.11-300x202.jpg 300w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.11-768x518.jpg 768w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.11-1536x1036.jpg 1536w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.11-65x44.jpg 65w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.11-225x152.jpg 225w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.11-350x236.jpg 350w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.11.jpg 1771w\" sizes=\"auto, (max-width: 466px) 100vw, 466px\" \/>Another very useful visual is the group scatter plot, shown here. It is not a standard way to view datasets, but I think it should be.<\/p>\n<p>Step 1 of hypothesis testing for an ANOVA truly becomes a formality. The hypotheses are always the same. Define a population for each group. Set the research hypothesis to be a general statement of difference among population means. Set the null to be a statement of equality among population means. 
There is no directionality with the F distribution, so we do not need to worry about the predicted direction of differences.<\/p>\n<p>Using our drug-dose example with three levels, the populations and hypotheses would look something like this:<\/p>\n<div>\n<div class=\"textbox\">\n<div><strong>Population 1<\/strong>: People who receive a low dose of the drug<\/div>\n<div><strong>Population 2<\/strong>: People who receive a high dose of the drug<\/div>\n<div><strong>Population 3<\/strong>: People who do not receive the drug<\/div>\n<div><\/div>\n<div><strong>Research Hypothesis<\/strong>: There exists at least one difference among the population means.<\/div>\n<div><strong>Null Hypothesis<\/strong>: <em>\u00b5<\/em><sub>1<\/sub> = <em>\u00b5<\/em><sub>2<\/sub> = <em>\u00b5<\/em><sub>3<\/sub> \u2013 all population means are equal.<\/div>\n<\/div>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-478 alignright\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.13.png\" alt=\"\" width=\"177\" height=\"42\" srcset=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.13.png 346w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.13-300x71.png 300w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.13-65x15.png 65w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.13-225x53.png 225w\" sizes=\"auto, (max-width: 177px) 100vw, 177px\" \/>Now we can move on to step 2. The F distribution has two degrees of freedom.<\/p>\n<p>We no longer have to worry about the mean or standard deviation of the comparison distribution; we just need to find the degrees of freedom between and within. The &#8220;good&#8221; variance comes from the differences between groups, and so the degrees of freedom between is the number of groups \u2013 1. The within-groups variance, the &#8220;bad&#8221; variance, comes from the individual differences among the scores within each group. The degrees of freedom within, then, is the total number of scores in all groups, minus the number of groups.<\/p>\n<p>For step 3, we can find the cutoff score in the F-tables if we know the significance level, the degrees of freedom between, and the degrees of freedom within.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-481\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.15-1024x684.jpg\" alt=\"\" width=\"373\" height=\"249\" srcset=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.15-1024x684.jpg 1024w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.15-300x201.jpg 300w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.15-768x513.jpg 768w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.15-65x43.jpg 65w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.15-225x150.jpg 225w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.15-350x234.jpg 350w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.15.jpg 1158w\" sizes=\"auto, (max-width: 373px) 100vw, 373px\" \/><\/p>
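<p>If you do not have an F-table handy, the cutoff can also be computed directly. Here is a minimal sketch in Python, assuming SciPy is installed (the group counts are hypothetical):<\/p>\n<pre><code>from scipy.stats import f\n\nn_groups = 3       # hypothetical: three conditions\nn_scores = 9       # hypothetical: nine scores in total\nalpha = 0.05\n\ndf_between = n_groups - 1          # df for the 'good' variance\ndf_within = n_scores - n_groups    # df for the 'bad' variance\n\n# ppf is the inverse of the cumulative F distribution: the value\n# with (1 - alpha) of the area below it, i.e. alpha in the tail\ncutoff = f.ppf(1 - alpha, df_between, df_within)\nprint(round(cutoff, 2))   # 5.14 for df = (2, 6) at alpha = 0.05<\/code><\/pre>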
<p>Step 4 is where things take some getting used to. Here we use this new system of formulas. Start with the Sum of Squares calculations: Between, Within, and Total, and double-check that both the sums of squares and the degrees of freedom add up.<\/p>\n<p class=\"ql-center-displayed-equation\" style=\"line-height: 25px;\"><span class=\"ql-right-eqno\"> &nbsp; <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/ql-cache\/quicklatex.com-658f4be8fa5ebfc693cd4b836f659c25_l3.png\" height=\"25\" width=\"206\" class=\"ql-img-displayed-equation quicklatex-auto-format\" alt=\"SS_{B} = \\sum [N_{g}(M_{g} - M_{o})^{2}]\" title=\"Rendered by QuickLaTeX.com\" \/><\/p>\n<p class=\"ql-center-displayed-equation\" style=\"line-height: 25px;\"><span class=\"ql-right-eqno\"> &nbsp; <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/ql-cache\/quicklatex.com-310d82aa6481cef324934a3fe5919060_l3.png\" height=\"25\" width=\"171\" class=\"ql-img-displayed-equation quicklatex-auto-format\" alt=\"SS_{W} = \\sum (X - M_{g})^{2}\" title=\"Rendered by QuickLaTeX.com\" \/><\/p>\n<p class=\"ql-center-displayed-equation\" style=\"line-height: 25px;\"><span class=\"ql-right-eqno\"> &nbsp; <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/ql-cache\/quicklatex.com-e2b80b3aae3459a94014ca3cd2a4b153_l3.png\" height=\"25\" width=\"166\" class=\"ql-img-displayed-equation quicklatex-auto-format\" alt=\"SS_{T} = \\sum (X - M_{o})^{2}\" title=\"Rendered by QuickLaTeX.com\" \/><\/p>\n<p>Then move across the table, finding the good and the bad variance&#8230;<\/p>\n<\/div>\n<p class=\"ql-center-displayed-equation\" style=\"line-height: 41px;\"><span class=\"ql-right-eqno\"> &nbsp; <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/ql-cache\/quicklatex.com-94ac068d1f6fa970b59d402068ee0878_l3.png\" height=\"41\" width=\"84\" class=\"ql-img-displayed-equation quicklatex-auto-format\" alt=\"S^{2}_{B} = \\frac{SS_{B}}{df_{B}}\" title=\"Rendered by QuickLaTeX.com\" \/><\/p>\n<p class=\"ql-center-displayed-equation\" style=\"line-height: 41px;\"><span class=\"ql-right-eqno\"> &nbsp; <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/ql-cache\/quicklatex.com-e5fe2424674d311235f0a59ef07afcc6_l3.png\" height=\"41\" width=\"91\" class=\"ql-img-displayed-equation quicklatex-auto-format\" alt=\"S^{2}_{W} = \\frac{SS_{W}}{df_{W}}\" title=\"Rendered by QuickLaTeX.com\" \/><\/p>\n<p>&#8230; <span style=\"text-align: initial;font-size: 1em\">and finally getting their ratio for the F-test result.<\/span><\/p>\n<div>\n<p class=\"ql-center-displayed-equation\" style=\"line-height: 45px;\"><span class=\"ql-right-eqno\"> &nbsp; <\/span><span class=\"ql-left-eqno\"> &nbsp; <\/span><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/ql-cache\/quicklatex.com-10eb2646ad2196e5dcf5ec027e2f3e7d_l3.png\" height=\"45\" width=\"66\" class=\"ql-img-displayed-equation quicklatex-auto-format\" alt=\"F = \\frac{S^{2}_{B}}{S^{2}_{W}}\" title=\"Rendered by QuickLaTeX.com\" \/><\/p>\n<p>To make our decision in Step 5, we examine the calculated F value (from Step 4) and determine whether it exceeds the cutoff F score (from Step 3). If so, we reject the null hypothesis.<\/p>
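<p>Putting Steps 4 and 5 together in code may help the whole system click. Here is a minimal end-to-end sketch in Python, assuming NumPy and SciPy are installed (the scores are invented and are not the worksheet data):<\/p>\n<pre><code>import numpy as np\nfrom scipy.stats import f\n\n# Invented scores for three groups of three\ngroups = [np.array([4.0, 5.0, 6.0]),\n          np.array([8.0, 9.0, 10.0]),\n          np.array([5.0, 6.0, 7.0])]\n\nall_scores = np.concatenate(groups)\nm_o = all_scores.mean()   # grand mean, M_o\n\n# Step 4: sums of squares, variances, and the F ratio\nss_b = sum(len(g) * (g.mean() - m_o) ** 2 for g in groups)\nss_w = sum(np.sum((g - g.mean()) ** 2) for g in groups)\ndf_b = len(groups) - 1\ndf_w = len(all_scores) - len(groups)\ns2_b = ss_b \/ df_b   # 'good' variance between\ns2_w = ss_w \/ df_w   # 'bad' variance within\nF = s2_b \/ s2_w\n\n# Step 5: compare with the cutoff at alpha = 0.05\ncutoff = f.ppf(0.95, df_b, df_w)\nprint(round(F, 2), 'vs cutoff', round(cutoff, 2))\nprint('Reject the null' if F > cutoff else 'Fail to reject the null')<\/code><\/pre>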
<p><span class=\"pullquote-right\">&#8220;There is a significant difference among the mean digit memory scores after listening to the three types of music (<em>F<\/em><sub>2,6<\/sub> = 27.00, <em>p<\/em> &lt; 0.05).&#8221;<\/span> Here is an example of how to express the results \u2013 note the phrase \u201csignificant difference among the means.\u201d If we do not reject the null, we can switch the statement of results to \u201cno significant difference.\u201d The test statistic and <em>p<\/em>-value are expressed here in common formats.<\/p>\n<p>We can continue building a decision tree to help you decide which statistical test to use when you look at a research question. What are the circumstances in which you would need to use a one-way <strong>ANOVA<\/strong> test?<\/p>\n<\/div>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-493\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.16-1024x338.png\" alt=\"\" width=\"688\" height=\"227\" srcset=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.16-1024x338.png 1024w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.16-300x99.png 300w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.16-768x254.png 768w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.16-65x21.png 65w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.16-225x74.png 225w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.16-350x116.png 350w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.16.png 1410w\" sizes=\"auto, (max-width: 688px) 100vw, 688px\" \/><\/p>\n<div id=\"h5p-74\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-74\" class=\"h5p-iframe\" data-content-id=\"74\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"Practice 8a.08. 
F-ratio calculation example.\"><\/iframe><\/div>\n<\/div>\n<div id=\"h5p-75\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-75\" class=\"h5p-iframe\" data-content-id=\"75\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"Practice 8a.09. Continued. F-ratio calculation example.\"><\/iframe><\/div>\n<\/div>\n<div id=\"h5p-76\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-76\" class=\"h5p-iframe\" data-content-id=\"76\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"Practice 8a.10. Continued. F-ratio calculation example.\"><\/iframe><\/div>\n<\/div>\n<h1>8b. Planned Contrasts and Posthoc Tests<\/h1>\n<p>In the second part of this chapter we will have a look at follow-up tests we can conduct after an <strong>ANOVA<\/strong> hypothesis test, to investigate the findings in greater detail.<\/p>\n<p>Planned contrasts and post-hoc tests are commonly performed following <strong>Analysis of Variance<\/strong>. This is necessary in many instances, because <strong>ANOVA<\/strong> compares all individual mean differences simultaneously, in one test (referred to as an omnibus test). If we run an <strong>ANOVA<\/strong> hypothesis test, and the F-test comes out significant, this indicates that at least one of the mean differences is statistically significant. However, when the <strong>factor<\/strong> has more than two <strong>levels<\/strong>, it does not indicate which means differ significantly from each other.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-494 alignright\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.17.png\" alt=\"\" width=\"335\" height=\"302\" srcset=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.17.png 674w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.17-300x271.png 300w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.17-65x59.png 65w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.17-225x203.png 225w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/08\/Fig-8.17-350x316.png 350w\" sizes=\"auto, (max-width: 335px) 100vw, 335px\" \/><\/p>\n<p>In this example, a significant F-test result from a one-way <strong>ANOVA<\/strong> with the three drug dose conditions does not tell us where the significant difference lies. Is it between 0 and 100 mg? Or between 100 and 200 mg? Or is it only the biggest difference that is significant \u2013 0 vs. 200 mg?<\/p>\n<p><strong>Planned contrasts<\/strong> and <strong>post hoc tests<\/strong> are additional tests to determine exactly which mean differences are significant, and which are not. Why is it that we cannot just do 3 independent-means t-tests here? Each time we conduct a t-test we have a certain risk of a Type I error. If we do 3, we have nearly triple the risk. So first we test for omnibus significance using the overall <strong>ANOVA<\/strong> as detailed in the first part of this chapter. Then, if a statistically significant difference exists among the means, we do the pairwise comparisons with an adjustment to be more conservative. These follow-up tests are designed specifically to avoid inflating the risk of Type I error.<\/p>
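<p>The inflation is easy to quantify. If each test is run at \u03b1 = .05, the chance of at least one Type I error across k independent tests is 1 \u2212 (1 \u2212 \u03b1)<sup>k<\/sup>. A quick sketch in Python:<\/p>\n<pre><code>alpha = 0.05\nfor k in (1, 3, 10):\n    experimentwise = 1 - (1 - alpha) ** k\n    print(k, 'tests:', round(experimentwise, 3))\n# 1 tests: 0.05\n# 3 tests: 0.143  (nearly triple the single-test risk)\n# 10 tests: 0.401<\/code><\/pre>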
<p>Now, this is very important. We are <em>only<\/em> allowed to conduct these tests <em>if the F-test result was significant<\/em>. This procedural rule also helps protect us from the statistical sin of p-hacking, which is selectively hunting for and reporting significant results in a way that is biased and subjective.<\/p>\n<p><strong>Planned contrasts<\/strong> are used when researchers know in advance which groups they expect to differ. For example, suppose that, in our worksheet example, we expect the pop group to differ from the classical group on our measure of working memory. We can then conduct a single comparison between these means without worrying about Type I error. Because we hypothesized this difference before we saw the data, perhaps based on prior research studies or a strong intuitive hunch, and because there is only one comparison to be analyzed, we need not be concerned about inflated <strong>experimentwise alpha<\/strong>. If multiple comparisons are planned, then we will need to adjust the significance level.<\/p>\n<p>Let us take a look at how to conduct a single <strong>planned contrast<\/strong>, as summarized in the box below. The process is quite simple, as it is just a modified <strong>ANOVA<\/strong> analysis. First we calculate SSB with just the two groups involved in the planned contrast. We figure out the degrees of freedom between using just the two groups. Then, we calculate the variance between using the new SSB and degrees of freedom, and we calculate an F-test for the comparison using the new variance between and the original overall variance within. To find out if the F-test result is significant, we can use the new degrees of freedom but the original significance level for the cutoff. (Because there is just one pairwise comparison, we can use the original significance level.)<\/p>\n<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">Steps to calculate a planned contrast<\/p>\n<\/header>\n<div class=\"textbox__content\">\n<ol>\n<li>Calculate SS<sub>Between<\/sub> with just those two groups.<\/li>\n<li>Find the df<sub>Between<\/sub> using just the two groups.<\/li>\n<li>Calculate S<sup>2<\/sup><sub>Between<\/sub> using the new SS<sub>Between<\/sub> and the new df<sub>Between<\/sub>.<\/li>\n<li>Calculate F using the new S<sup>2<\/sup><sub>Between<\/sub> and the overall S<sup>2<\/sup><sub>Within<\/sub>.<\/li>\n<\/ol>\n<\/div>\n<\/div>\n<p>If we were to perform multiple planned contrasts, things change a little. What if we had hypothesized in this experiment that each group would differ from the others? The <strong><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_440_497\">Bonferroni correction<\/a><\/strong> involves adjusting the significance level to protect against the inflation of the risk of Type I error. The procedure for each comparison is the same as for a single planned contrast. The difference is that the cutoff score to determine statistical significance will use a more conservative significance level. When we do multiple pairwise comparisons, the <strong>Bonferroni correction<\/strong> is to use the original significance level divided by the number of planned contrasts. The adjusted significance level is not likely to be in our F-tables, so to find the cutoff for such tests, we would need to use an <a href=\"http:\/\/statpages.org\/pdfs.html\" target=\"_blank\" rel=\"noopener\">online calculator<\/a> in reverse (that is, we enter the p-value and degrees of freedom, and look up the value on the F-distribution corresponding to that area in the tail).<\/p>
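<p>The whole routine \u2013 the boxed steps above plus the Bonferroni-adjusted cutoff \u2013 can also be done in code. Here is a minimal sketch in Python, assuming NumPy and SciPy are installed, with invented data and two planned contrasts:<\/p>\n<pre><code>import numpy as np\nfrom scipy.stats import f\n\n# Invented three-group data (not the worksheet values)\ngroups = {'classical': np.array([8.0, 9.0, 10.0]),\n          'pop': np.array([4.0, 5.0, 6.0]),\n          'silence': np.array([5.0, 6.0, 7.0])}\n\n# Overall within-group ('bad') variance from the full ANOVA\nss_w = sum(np.sum((g - g.mean()) ** 2) for g in groups.values())\ndf_w = sum(len(g) for g in groups.values()) - len(groups)\ns2_w = ss_w \/ df_w\n\ndef contrast_f(a, b):\n    # Steps 1-4: SS between for just the two groups (df between = 1),\n    # then F using the overall within-group variance\n    both = np.concatenate([a, b])\n    ss_b = (len(a) * (a.mean() - both.mean()) ** 2\n            + len(b) * (b.mean() - both.mean()) ** 2)\n    return (ss_b \/ 1) \/ s2_w\n\nalpha, m = 0.05, 2                        # two planned contrasts\ncutoff = f.ppf(1 - alpha \/ m, 1, df_w)    # Bonferroni-adjusted cutoff\nF = contrast_f(groups['classical'], groups['pop'])\nprint(round(F, 2), 'vs cutoff', round(cutoff, 2))<\/code><\/pre>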
<p>What about <strong>post hoc tests<\/strong>? As the name suggests, these tests come into the picture when we are doing pairwise comparisons (usually all possible combinations) after the fact, to find out where the significant differences were. These are tests that do not require that we had an <em>a priori<\/em> hypothesis ahead of data collection. Essentially, they are an allowable and acceptable form of data-snooping. This is where we must be cautious about doing so many tests \u2013 we could end up with a huge risk of Type I error. If we use the <strong>Bonferroni correction<\/strong> that we saw for multiple planned comparisons on more than 3 tests, the significance level would become vanishingly small. This would make it nearly impossible to detect significant differences. For this reason, slightly more forgiving tests like <strong><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_440_498\">Scheff\u00e9\u2019s correction<\/a><\/strong>, Dunn\u2019s or Tukey\u2019s <strong>post-hoc tests<\/strong> are more popular. There are many different post-hoc tests out there, and the choice of which one researchers use is often a matter of convention in their area of research.<\/p>\n<div id=\"h5p-77\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-77\" class=\"h5p-iframe\" data-content-id=\"77\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"Practice 8b.01. Bonferroni correction.\"><\/iframe><\/div>\n<\/div>\n<p>Now we shall take a look at how to conduct <strong>post hoc tests<\/strong> using <strong>Scheff\u00e9\u2019s correction<\/strong>. In this example, we will test all pairwise comparisons. The <strong>Scheff\u00e9<\/strong> technique involves adjusting the F-test result, rather than adjusting the significance level. The way it works is the same as the <strong>planned contrast<\/strong> procedure, except for the very end. Before we compare the F-test result to the cutoff score, we divide the F value by the overall degrees of freedom between, or the number of groups minus one. Thus, we keep the significance level at the original level, but divide the calculated F by the degrees of freedom between from the overall <strong>ANOVA<\/strong>.<\/p>
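<p>The boxed steps below summarize the procedure. In code, a minimal sketch (again assuming NumPy and SciPy, with invented data, and assuming the cutoff is found as in the planned-contrast procedure) that runs all pairwise comparisons with Scheff\u00e9\u2019s correction might look like this:<\/p>\n<pre><code>import numpy as np\nfrom scipy.stats import f\n\n# Invented three-group data (not the worksheet values)\ngroups = [np.array([8.0, 9.0, 10.0]),\n          np.array([4.0, 5.0, 6.0]),\n          np.array([5.0, 6.0, 7.0])]\n\nss_w = sum(np.sum((g - g.mean()) ** 2) for g in groups)\ndf_w = sum(len(g) for g in groups) - len(groups)\ns2_w = ss_w \/ df_w\ndf_b_overall = len(groups) - 1    # number of groups minus one\n\ndef scheffe_f(a, b):\n    # Pairwise F exactly as in a planned contrast, then divided by\n    # the overall df between (Scheffe's correction)\n    both = np.concatenate([a, b])\n    ss_b = (len(a) * (a.mean() - both.mean()) ** 2\n            + len(b) * (b.mean() - both.mean()) ** 2)\n    return ((ss_b \/ 1) \/ s2_w) \/ df_b_overall\n\n# Cutoff as in a planned contrast, at the original significance level\ncutoff = f.ppf(0.95, 1, df_w)\nfor i in range(len(groups)):\n    for j in range(i + 1, len(groups)):\n        F = scheffe_f(groups[i], groups[j])\n        print(i, j, round(F, 2), 'vs cutoff', round(cutoff, 2))<\/code><\/pre>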
<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">Steps to calculate post-hoc tests with Scheff\u00e9&#8217;s correction<\/p>\n<\/header>\n<div class=\"textbox__content\">\n<p>For each pairwise comparison:<\/p>\n<ol>\n<li>Calculate SS<sub>Between<\/sub> with just those two groups.<\/li>\n<li>Find the df<sub>Between<\/sub> using just the two groups.<\/li>\n<li>Calculate S<sup>2<\/sup><sub>Between<\/sub> using the new SS<sub>Between<\/sub> and the new df<sub>Between<\/sub>.<\/li>\n<li>Calculate F using the new S<sup>2<\/sup><sub>Between<\/sub> and the overall S<sup>2<\/sup><sub>Within<\/sub>.<\/li>\n<li>Divide F by the overall df<sub>Between<\/sub>.<\/li>\n<\/ol>\n<\/div>\n<\/div>\n<div id=\"h5p-78\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-78\" class=\"h5p-iframe\" data-content-id=\"78\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"Practice 8b.02. Scheffe&#039;s correction\"><\/iframe><\/div>\n<\/div>\n<h1>Chapter Summary<\/h1>\n<p>In this chapter we introduced the concepts underlying <strong>Analysis of Variance<\/strong> and examined how to conduct a hypothesis test using this technique. We also saw how to follow up on a statistically significant F-test result in an <strong>ANOVA<\/strong> with more than two <strong>levels<\/strong> in a <strong>factor<\/strong>, in order to determine which levels were significantly different from each other.<\/p>\n<p>Key terms:<\/p>\n<table class=\"no-lines\" style=\"border-collapse: collapse;width: 100%\">\n<tbody>\n<tr>\n<td style=\"width: 33.3333%\"><strong>Analysis of Variance<\/strong><\/td>\n<td style=\"width: 33.3333%\"><strong>post hoc tests<\/strong><\/td>\n<td style=\"width: 33.3333%\"><strong>Bonferroni correction<\/strong><\/td>\n<\/tr>\n<tr>\n<td style=\"width: 33.3333%\"><strong>general linear model<\/strong><\/td>\n<td style=\"width: 33.3333%\"><strong>factor<\/strong><\/td>\n<td style=\"width: 33.3333%\"><strong>Scheff\u00e9 correction<\/strong><\/td>\n<\/tr>\n<tr>\n<td style=\"width: 33.3333%\"><strong>partitioning of variance<\/strong><\/td>\n<td style=\"width: 33.3333%\"><strong>levels<\/strong><\/td>\n<td style=\"width: 33.3333%\"><\/td>\n<\/tr>\n<tr>\n<td style=\"width: 33.3333%\"><strong>planned contrasts<\/strong><\/td>\n<td style=\"width: 33.3333%\" colspan=\"2\"><strong>experimentwise alpha level<\/strong><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<div class=\"glossary\"><span class=\"screen-reader-text\" id=\"definition\">definition<\/span><template id=\"term_440_441\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_440_441\"><div tabindex=\"-1\"><p>also called ANOVA, a system of data analysis that is very flexible and adaptable to a variety of research designs. 
It is based on a statistical concept called the general linear model and involves the technique of partitioning variance.<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_440_442\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_440_442\"><div tabindex=\"-1\"><p>an extension of the statistical technique of linear regression that is adaptable to various combinations of independent (nominal) and dependent (numeric) variables<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_440_445\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_440_445\"><div tabindex=\"-1\"><p>the allocation of variability among scores in numeric data into different buckets, like treatment effects vs. error, or between-groups vs. within-groups variance<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_440_452\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_440_452\"><div tabindex=\"-1\"><p>statistical tests of pairwise comparisons among groups, used to follow up on a significant ANOVA result, when researchers know in advance which groups they expect to differ<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_440_453\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_440_453\"><div tabindex=\"-1\"><p>statistical tests of pairwise comparisons among groups, used to follow up on a significant ANOVA result, when researchers do not know in advance which groups they expect to differ and wish to test all possible combinations<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_440_450\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_440_450\"><div tabindex=\"-1\"><p>in ANOVA, a grouping variable used to account for variance among scores; in an experiment, a factor is an independent variable<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_440_458\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_440_458\"><div tabindex=\"-1\"><p>the individual conditions or values that make up a factor, a nominal variable that forms the groups in analysis of variance<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_440_462\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_440_462\"><div tabindex=\"-1\"><p>the problem of accumulating risk of Type I error with multiple statistical tests on the same data<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_440_497\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_440_497\"><div tabindex=\"-1\"><p>adjustment to avoid inflation of experimentwise risk of Type I error, by dividing the significance level by the 
number of planned contrasts to be conducted <\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_440_498\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_440_498\"><div tabindex=\"-1\"><p>in posthoc analyses, an adjustment to correct for inflated experimentwise risk of Type I error, by dividing the F value by the overall degrees of freedom between from the original overall ANOVA analysis<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><\/div>","protected":false},"author":1394,"menu_order":8,"template":"","meta":{"pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[48],"contributor":[],"license":[],"class_list":["post-440","chapter","type-chapter","status-publish","hentry","chapter-type-numberless"],"part":3,"_links":{"self":[{"href":"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-json\/pressbooks\/v2\/chapters\/440","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-json\/wp\/v2\/users\/1394"}],"version-history":[{"count":25,"href":"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-json\/pressbooks\/v2\/chapters\/440\/revisions"}],"predecessor-version":[{"id":996,"href":"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-json\/pressbooks\/v2\/chapters\/440\/revisions\/996"}],"part":[{"href":"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-json\/pressbooks\/v2\/parts\/3"}],"metadata":[{"href":"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-json\/pressbooks\/v2\/chapters\/440\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-json\/wp\/v2\/media?parent=440"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-json\/pressbooks\/v2\/chapter-type?post=440"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-json\/wp\/v2\/contributor?post=440"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-json\/wp\/v2\/license?post=440"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}