{"id":5,"date":"2021-07-29T11:51:09","date_gmt":"2021-07-29T15:51:09","guid":{"rendered":"https:\/\/pressbooks.bccampus.ca\/statspsych\/?p=5"},"modified":"2022-07-09T17:20:09","modified_gmt":"2022-07-09T21:20:09","slug":"chapter-1","status":"publish","type":"chapter","link":"https:\/\/pressbooks.bccampus.ca\/statspsych\/chapter\/chapter-1\/","title":{"raw":"1. Why We Need Statistics and Displaying Data Using Tables and Graphs","rendered":"1. Why We Need Statistics and Displaying Data Using Tables and Graphs"},"content":{"raw":"<h1><a id=\"1a\"><\/a>1a. Why we need statistics<\/h1>\r\n<h6 style=\"text-align: right\"><a href=\"#video1a\">video lesson<\/a><\/h6>\r\nOne of the first things I think we need to accomplish in this course is to understand why statistics are important. Our objective in this first part of Chapter 1 is to be able to articulate the purpose of a course introducing statistical principles and techniques, and to be able to supply examples of situations in which the techniques you will learn in such a course may be necessary to use.\r\n\r\nFirst, let us establish that this is <em>not<\/em> a math course. This is a course that is primarily about decision making. Not just any decision making, but decisions that are made after analyzing data in order to make objective decisions that are guided by empirical evidence. Of course, we use some simple calculations in the course in order to process the data into a form that aids our decision making. However, the math is a necessary means to an end, not an end in itself.\r\n\r\nIn some situations this kind of decision making is not needed. When the decision can be made based on intuition and subjective personal preference, we do not need rigorous data-driven systems. For example, if I am trying to decide whom to date, or what style of clothing I like to wear that suits my personality, I likely am not going to conduct research and a formal data analysis to come to those decisions. 
Maybe you can think of another situation in which a good decision can be made without empirical evidence.\r\n\r\nOn the other hand, sometimes a decision that you need to make is one that affects others, or is so high stakes that you want to make an informed decision that is objective and based on empirical evidence. In this kind of decision making, you should check your intuition at the door, and walk in with an open mind, letting the data be your guide. Situations that call for an objective decision-making process include deciding whether a medical treatment is safe, or whether a proposed intervention is actually effective. Perhaps you need to find out if a crime prevention program is effective for urban and rural communities alike. Can you think of another kind of decision that should be made objectively based on data? What these scenarios have in common is that they are professional decisions, or are high stakes. In the professional workplace, we are often in situations where, if we operated on intuition alone, we might make serious mistakes, because we have not considered whether the course of action we decide on is the best choice for all people, all situations, or over time. The techniques you will learn in this course will help you apply data analysis, so that you can set up a decision-making framework that is objective and rigorous, and so that the decision you come to will be generalizable, to suit other people, situations, or time frames.\r\n\r\nWhy does a student in your field of study require statistics? Regardless of your field of study, I bet you are asked to be a critical thinker. 
If we look at the list of critical thinking guidelines below that make for good science, I bet you can see the value of these guidelines for your own program of study.\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">Critical thinking guidelines<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<ul>\r\n \t<li>Ask Questions: Be Willing to Wonder<\/li>\r\n \t<li>Define Your Terms<\/li>\r\n \t<li><strong>Examine the Evidence<\/strong><\/li>\r\n \t<li>Analyze Assumptions and Biases<\/li>\r\n \t<li><strong>Avoid Emotional Reasoning<\/strong><\/li>\r\n \t<li>Don\u2019t Oversimplify<\/li>\r\n \t<li>Consider Other Interpretations<\/li>\r\n \t<li>Tolerate Uncertainty<\/li>\r\n<\/ul>\r\nfrom Wade, Tavris &amp; Swinkels. (2017). <em>Psychology<\/em>. Boston: Pearson.\r\n\r\n<\/div>\r\n<\/div>\r\nStatistics represents a tool for examining evidence and allowing us to use data effectively. However, it is also important to realize that statistics can help us avoid emotional reasoning. Instead of relying on our intuitions about whether a drug is effective, or whether one choice is significantly better than another, statistical analysis allows us to make an objective decision.\r\n\r\nIn statistics, N stands for sample size. In other words, how many data points did you measure. Very often, in everyday life, we are tempted to make assumptions and derive conclusions from single data points. In the world of statistics, we call these situations, \u201can N of one\u201d. These are situations scientists are extremely wary of, because they are vulnerable to bias.\r\n\r\nFor example, let\u2019s say my friend has a really bad experience in one neighborhood. After that, even if there are no objective reports of comparative neighborhood safety that support this conclusion, I am likely to say to others that that\u2019s a bad neighborhood \u2013 one to avoid. 
We tend to be overly influenced by our own experiences and the experiences of those close to us. In such moments we should remind ourselves that until we have asked many individuals who have been in that neighborhood what their experiences were, we only have one observation, and it may not be typical or representative. If my friend\u2019s experience in the neighborhood were the one bad experience in 1000 experiences, would we still be tempted to consider it a \u201cbad\u201d neighborhood? Next time you face a situation like this in your daily life, just take a moment to pause and think to yourself... what information should I have to make the right decision?\r\n<h4><a id=\"intuition_return\"><\/a><a href=\"#intuition\">Concept Practice: Intuition<\/a><\/h4>\r\n<img class=\"aligncenter\" src=\"https:\/\/opentextbc.ca\/indigenizationfrontlineworkers\/wp-content\/uploads\/sites\/237\/2018\/06\/IndigenousHolisticFramework.png\" alt=\"&quot;&quot;\" width=\"351\" height=\"342\" \/>\r\n<p style=\"text-align: center;padding-left: 40px\">Fig. 2.2 from <em><a href=\"https:\/\/opentextbc.ca\/indigenizationfrontlineworkers\" rel=\"cc:attributionURL\">Pulling Together: A Guide for Front-Line Staff, Student Services, and Advisors<\/a><\/em>\u00a0by\u00a0Ian Cull, Robert L. A. Hancock, Stephanie McKeown, Michelle Pidgeon, and Adrienne Vedan<\/p>\r\nIn this course, we will be focusing on only one aspect of one way of knowing. Let us acknowledge that various cultures and systems place particular value on various ways of knowing. For example, if we refer to the Indigenous ways of knowing framework shown above, we might see this entire course as being one element of \u201cintellectual\u201d ways of knowing. Its role might be to contribute to responsibility and relevance by enhancing the generalizability of decision making, as we discussed before. 
However, no one should mistake statistics for a holistic system of knowing.\r\n\r\nI encourage you to think of what you learn in this course as one tool in the toolbox. The reason many academic disciplines require a statistics course is that this is a tool most people do not get in other areas of their lives. We tend not to learn statistics from our parents or by volunteering in the community. In fact, most of us are very bad at this form of decision making until we learn to use these tools.\r\n\r\nBy requiring you to learn statistics, disciplines like Psychology are not suggesting it is the only important decision-making tool. It is one we think you need to understand to be a good scientist and to better interpret some types of evidence to which you will have access in your professional life. I encourage you to learn more about holistic ways of knowing and to reflect on the place that formal, data-driven decision-making practices have in your own ways-of-knowing framework. I think we could all gain some insight by looking at such a model with an eye toward acknowledging areas in which we are weaker, because of our own individual experiences or because of the society in which we have grown up.\r\n<h1><a id=\"1b\"><\/a>1b. Displaying Data Using Tables and Graphs<\/h1>\r\n<h6 style=\"text-align: right\"><a href=\"#video1b\">video lesson<\/a><\/h6>\r\nHave you ever heard the saying, \u201ca picture is worth a thousand words\u201d? That is what the rest of this chapter is all about. 
First,<span style=\"text-align: initial;font-size: 1em\">\u00a0we need to cover some basic concepts and definitions, including the differences between <strong>[pb_glossary id=\"33\"]descriptive[\/pb_glossary]<\/strong> and <strong>[pb_glossary id=\"43\"]inferential[\/pb_glossary]<\/strong> statistics, and the meaning of the terms <strong>[pb_glossary id=\"45\"]variable[\/pb_glossary]<\/strong>, <strong>[pb_glossary id=\"46\"]value[\/pb_glossary]<\/strong> and <strong>[pb_glossary id=\"47\"]score[\/pb_glossary]<\/strong>. We will then need to learn to distinguish among levels of measurement to be able to choose the appropriate techniques for summarizing different types of data.<\/span>\r\n\r\nFinally, I will demonstrate how to generate <strong>[pb_glossary id=\"49\"]frequency tables[\/pb_glossary]<\/strong> and to graph a dataset, because the first step in data analysis is always to look at the data. Just as a picture is worth a thousand words, it is also worth a thousand numbers.\r\n\r\nAt first we will focus on <strong>descriptive<\/strong> statistics. These are ways to summarize or organize data from a research study \u2013 essentially allowing us to describe what the data are.\r\n\r\nA little later in the course, we will move into the realm of <strong>inferential<\/strong> statistics. These are analytical tools that allow us to draw conclusions based on data from a research study. In other words, we go beyond just saying what the data are, and make a statement about what they mean. Inferential statistics are used in research and policy as a tool to make decisions.\r\n<h4><a id=\"inferential_return\"><\/a><a href=\"#inferential\">Concept Practice: inferential statistics<\/a><\/h4>\r\n<h4><a href=\"#descriptive\">Concept Practice: descriptive statistics<\/a><\/h4>\r\nThree basic terms are essential jargon in statistics. A <strong>variable<\/strong> is a quality or a quantity that is different for different individuals. 
A <strong>variable<\/strong> could be a quality, like ethnicity, for which each person might have a different characteristic. Or it could be a quantity, like temperature, that could be different each time you take a reading, and is measured on a number scale. A <strong>value<\/strong> is just any possible number or category that a <strong>variable<\/strong> could take on. So for ethnicity you might have 6 categories in which you place individuals. Or for temperature there might be a numeric range from -100 to +100. That would be the full set of <strong>values<\/strong> for that <strong>variable<\/strong>. A <strong>score<\/strong> is a particular individual\u2019s <strong>value<\/strong> on the <strong>variable<\/strong>. For ethnicity, you would identify yourself with one particular category, and that would be your <strong>score<\/strong>. For temperature, if you check your weather app and see that it is 7 degrees outside, that is the <strong>score<\/strong> for that time and place.\r\n<h4><a id=\"score_return\"><\/a><a href=\"#score\">Concept Practice: variable, value, score<\/a><\/h4>\r\nMeasurement is the assignment of a number to the amount of something, or the assignment of labels to categories. This is often obvious (for example, we might measure time as number of seconds or number of minutes). Sometimes, however, it can be a bit more arbitrary. We might assign numbers to signify a category, for example 1 for male and 2 for female.\r\n\r\nBased on how we measure them, there are two major types of variables in statistics, and these will be important to keep in mind as we go through the semester. 
The type of variable determines how we can use it.\r\n<table class=\"lines\" style=\"height: 168px\">\r\n<tbody>\r\n<tr style=\"height: 29px\">\r\n<td style=\"width: 170.996px;height: 29px\"><em><strong>Type of variable<\/strong><\/em><\/td>\r\n<td style=\"width: 345.996px;height: 29px\"><em><strong>Characteristics<\/strong><\/em><\/td>\r\n<td style=\"width: 228.971px;height: 29px\"><em><strong>Examples<\/strong><\/em><\/td>\r\n<\/tr>\r\n<tr style=\"height: 46px\">\r\n<td style=\"width: 170.996px;height: 46px\">&nbsp;\r\n\r\n<strong>Nominal<\/strong>\/\r\n\r\nCategorical\r\n\r\n&nbsp;<\/td>\r\n<td style=\"width: 345.996px;height: 46px\">\r\n<div>\u2022Label and categorize<\/div>\r\n<div>\u2022If numbered, numbers are arbitrary<\/div><\/td>\r\n<td style=\"width: 228.971px;height: 46px\">\r\n<div>\u2022Gender<\/div>\r\n<div>\u2022Diagnosis<\/div>\r\n<div>\u2022Experimental or Control<\/div><\/td>\r\n<\/tr>\r\n<tr style=\"height: 93px\">\r\n<td style=\"width: 170.996px;height: 93px\">&nbsp;\r\n\r\n<strong>Numeric<\/strong>\/\r\n\r\nQuantitative\r\n\r\n&nbsp;<\/td>\r\n<td style=\"width: 345.996px;height: 93px\">\r\n<div>\u2022Numerical data<\/div>\r\n<div>\u2022Numbers reflect size or amount of something<\/div><\/td>\r\n<td style=\"width: 228.971px;height: 93px\">\r\n<div>\u2022Temperature<\/div>\r\n<div>\u2022IQ<\/div>\r\n<div>\u2022Golf scores (above\/below par)<\/div>\r\n<div>\u2022Number of correct answers<\/div>\r\n<div>\u2022Time to complete task<\/div>\r\n<div>\u2022Gain in height since last year<\/div><\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nThe first type of variable is <strong>[pb_glossary id=\"50\"]nominal[\/pb_glossary]<\/strong> or categorical (also called qualitative). <strong>Nominal<\/strong> variables label or categorize something, and any numbers used to measure these variables are arbitrary and do not indicate quantity or size. 
For example, if male is scored as 1 and female is scored as 2, that does not indicate that females are twice as good as, or double the size of, males. It is just a code.\r\n\r\n<strong>[pb_glossary id=\"51\"]Numeric[\/pb_glossary]<\/strong> or quantitative variables are ones for which numbers are actually meaningful. They indicate the size or amount of something.\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">Examples of Numeric variables<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<ul>\r\n \t<li>Temperature, in which 10 degrees is warmer than 0<\/li>\r\n \t<li>Golf scores, in which 2 below par means you did well<\/li>\r\n \t<li>IQ, in which 100 is average intelligence<\/li>\r\n \t<li>Number of correct answers, in which 4 correct answers is twice as many as 2 correct answers<\/li>\r\n<\/ul>\r\n<\/div>\r\n<\/div>\r\nWhen we calculate statistics, we will see that we can calculate an average IQ in a group of people, or an average temperature across several days. But we cannot calculate an average gender. What we will do is use groupings or categories as a basis for comparison of other variables; for example, does the experimental group have a higher number of correct answers than the control group?\r\n\r\nNow, a quick side note: If you take a course in research methods, you will learn that measurement is a really tricky thing in practice, particularly when you want to measure something internal about a person. The process of operationally defining something so that you can measure it numerically or in discrete categories is a real challenge. This is beyond the scope of this course, but just to give you a sense, try brainstorming a way in which you could measure aggression. 
Think of at least one way that would create a <strong>nominal variable<\/strong>, and one way that would create a <strong>numeric variable<\/strong>.\r\n\r\nIf you give that example some thought, you will quickly find that a seemingly simple variable like aggression can be fiendishly difficult to measure, and in the field of psychology a lot of effort is put into developing good ways to measure mental constructs. In experimental psychology, we often prefer to measure things as numbers, because then we can use statistical methods to summarize and to make inferences about the thing we measured.\r\n\r\nWe should return to our discussion of experimental research design. A <strong>variable<\/strong> is something that has different values for different individuals, and that we can measure. As an example, we can measure how fast someone is at completing a puzzle, and collect those scores for a group of people. This variable would be speed. We can also assign each of those people into categories or conditions: a high-stress vs. low-stress condition, for example. Research is the study of the relationship between <strong>variables<\/strong>. Therefore, there must be at least two <strong>variables<\/strong> in a research study (or there is no relationship to study). Typically an experimental study in psychology has one (or more) <strong>[pb_glossary id=\"52\"]independent variables[\/pb_glossary]<\/strong> and one (or more) <strong>[pb_glossary id=\"53\"]dependent variables[\/pb_glossary]<\/strong>.\r\n\r\nAn <strong>independent variable<\/strong> is one you manipulate -- most often it is categorical, or <strong>nominal<\/strong> (e.g. experimental group vs. control group). A <strong>dependent variable<\/strong> is one you measure to detect a difference\/change as a result of the manipulation -- most often it is <strong>numeric<\/strong> (e.g. 
time to complete a puzzle).\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">Example of Experimental Design<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n\r\nDo members of your experimental group (who were required to give a speech in front of a group of people) solve a puzzle in a shorter or longer amount of time than members of your control group (who were allowed to browse magazines)?\r\n\r\n<\/div>\r\n<\/div>\r\nIn the example above, the <strong>independent variable<\/strong> would be the manipulation: whether people are required to give a speech or are allowed to browse magazines. Note that this is a <strong>nominal variable<\/strong>. The <strong>dependent variable<\/strong> is what you measure after the manipulation: how long it takes the participants to solve a puzzle. Note that this would be a <strong>numeric variable<\/strong>.\r\n\r\nNow that you have some basic definitions and concepts down regarding types of data and how to measure them, we need to learn how to deal with <strong>numeric<\/strong> data.\r\n<div style=\"text-align: center\">\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">Example of Numeric Dataset<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">Stress ratings of 10 students: 8,7,4,10,8,6,8,9,9,7<\/div>\r\n<\/div>\r\n<span style=\"text-align: initial;font-size: 1em\">First\u2026 what can you say about this dataset from the list of numbers above? How would you describe the findings to someone?<\/span>\r\n\r\n<\/div>\r\nPerhaps you want to summarize a dataset in table form, to organize the data and make it easy to get an overview of the dataset quickly. A <strong>frequency table<\/strong> does just that. To create a <strong>frequency table<\/strong>, you just ask yourself: for each possible value on this variable, how many individuals have that score? 
That gives you the frequency of each value \u2013 or how often it occurred in the dataset. Let\u2019s look at an example. We measured the stress levels of 10 students on a scale of 1 to 10; their scores are listed above. Hard to make any sense out of that list, right? By following the steps below, we can create a <strong>frequency table<\/strong>.\r\n<div class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">Steps for Making a Frequency Table<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<ul>\r\n \t<li>Label the first row: Values, Frequency, and Percentage.<\/li>\r\n \t<li>In the first column, under the heading Values, list all the possible values the variable could take on. In this case, we have 10 possible values, so there should be 10 rows in the data portion of the table.<\/li>\r\n \t<li>Make a list down the page of each score, from lowest to highest, to make it easier to count them.<\/li>\r\n \t<li>Go one by one through the scores, making a mark for each next to its value on the list (e.g., how many 1\u2019s are there? 0. \u2026 How many 4\u2019s are there? 1. \u2026 How many 7\u2019s are there? 2.). Repeat that question for every value from 1 to 10. Write those frequencies, or counts, in the Frequency column.<\/li>\r\n \t<li>Figure the percentage of scores for each value. To calculate a percentage you take the frequency, divide by how many scores you have in the dataset (here we have 10 students, so 10 scores), and multiply that by 100 to move the decimal to the right two places. So for the value of 7, with frequency of 2, that becomes 2 divided by 10, times 100, or 20%. 
Calculate and list all the percentages.<\/li>\r\n<\/ul>\r\n<\/div>\r\n<\/div>\r\nHere is what the table should look like once you are done with those steps:\r\n<table class=\"grid aligncenter\" style=\"height: 442px\" width=\"396\" cellspacing=\"0\" cellpadding=\"0\">\r\n<tbody>\r\n<tr style=\"height: 29px\">\r\n<td style=\"width: 110.808px;height: 29px\"><em><strong>Values<\/strong><\/em><\/td>\r\n<td style=\"width: 128.927px;height: 29px\"><em><strong>Frequency<\/strong><\/em><\/td>\r\n<td style=\"width: 126.925px;height: 29px\"><em><strong>Percent<\/strong><\/em><\/td>\r\n<\/tr>\r\n<tr style=\"height: 29px\">\r\n<td style=\"width: 110.808px;height: 29px\">1<\/td>\r\n<td style=\"width: 128.927px;height: 29px\">0<\/td>\r\n<td style=\"width: 126.925px;height: 29px\">0%<\/td>\r\n<\/tr>\r\n<tr style=\"height: 26px\">\r\n<td style=\"width: 110.808px;height: 26px\">2<\/td>\r\n<td style=\"width: 128.927px;height: 26px\">0<\/td>\r\n<td style=\"width: 126.925px;height: 26px\">0%<\/td>\r\n<\/tr>\r\n<tr style=\"height: 29px\">\r\n<td style=\"width: 110.808px;height: 29px\">3<\/td>\r\n<td style=\"width: 128.927px;height: 29px\">0<\/td>\r\n<td style=\"width: 126.925px;height: 29px\">0%<\/td>\r\n<\/tr>\r\n<tr style=\"height: 29px\">\r\n<td style=\"width: 110.808px;height: 29px\">4<\/td>\r\n<td style=\"width: 128.927px;height: 29px\">1<\/td>\r\n<td style=\"width: 126.925px;height: 29px\">10%<\/td>\r\n<\/tr>\r\n<tr style=\"height: 29px\">\r\n<td style=\"width: 110.808px;height: 29px\">5<\/td>\r\n<td style=\"width: 128.927px;height: 29px\">0<\/td>\r\n<td style=\"width: 126.925px;height: 29px\">0%<\/td>\r\n<\/tr>\r\n<tr style=\"height: 29px\">\r\n<td style=\"width: 110.808px;height: 29px\">6<\/td>\r\n<td style=\"width: 128.927px;height: 29px\">1<\/td>\r\n<td style=\"width: 126.925px;height: 29px\">10%<\/td>\r\n<\/tr>\r\n<tr style=\"height: 29px\">\r\n<td style=\"width: 110.808px;height: 29px\">7<\/td>\r\n<td style=\"width: 128.927px;height: 29px\">2<\/td>\r\n<td style=\"width: 
126.925px;height: 29px\">20%<\/td>\r\n<\/tr>\r\n<tr style=\"height: 29px\">\r\n<td style=\"width: 110.808px;height: 29px\">8<\/td>\r\n<td style=\"width: 128.927px;height: 29px\">3<\/td>\r\n<td style=\"width: 126.925px;height: 29px\">30%<\/td>\r\n<\/tr>\r\n<tr style=\"height: 29px\">\r\n<td style=\"width: 110.808px;height: 29px\">9<\/td>\r\n<td style=\"width: 128.927px;height: 29px\">2<\/td>\r\n<td style=\"width: 126.925px;height: 29px\">20%<\/td>\r\n<\/tr>\r\n<tr style=\"height: 29px\">\r\n<td style=\"width: 110.808px;height: 29px\">10<\/td>\r\n<td style=\"width: 128.927px;height: 29px\">1<\/td>\r\n<td style=\"width: 126.925px;height: 29px\">10%<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nNow you can scan down the table and quickly see where most of the <strong>scores<\/strong> fall within the range of possible <strong>values<\/strong>. Now that you have an organized summary of the data, you can clearly see that the majority of students are reporting fairly high stress <strong>scores<\/strong>. By looking at the percentages, you have a quick way to report the proportion of students that are highly stressed. For example, just by doing some quick addition, you can say that 60% of surveyed students report stress levels 8 or higher.\r\n\r\nMost people find graphs easier to interpret at first glance than tables. What can you say about the dataset after looking at this graph?\r\n\r\n<img class=\" wp-image-23 aligncenter\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/Figure-1.2-300x251.png\" alt=\"\" width=\"338\" height=\"283\" \/>\r\n\r\nI bet you were able to say that most <strong>scores<\/strong> pile up at the upper end of the graph, at the higher end of the range of stress score <strong>values<\/strong>.\r\n\r\nThe graphical version of a frequency table is a <strong>[pb_glossary id=\"55\"]histogram[\/pb_glossary]<\/strong>. 
The X axis on a <strong>histogram<\/strong> should have the values of the variable listed, from lowest to highest. The Y axis should represent the frequencies of each value in the dataset. In other words, the <strong>histogram<\/strong> is a <strong>frequency table<\/strong> that has been turned on its side. The added benefit comes from the visual representation of the frequency as the height of the bars in the graph, rather than just a number. You can thus see a clear shape in the dataset.\r\n\r\nIn some circumstances, a <strong>frequency table<\/strong> is not an effective way to summarize a dataset. This is the case if the range of <strong>values<\/strong> is too large. For example, what if you were summarizing temperature <strong>scores<\/strong>, which can range from 0-100? This would mean more than 100 rows in the table. That is not so helpful. In such a case, a <strong>[pb_glossary id=\"56\"]grouped frequency table[\/pb_glossary]<\/strong> is a much better option.\r\n\r\nA <strong>grouped frequency table<\/strong> defines ranges of values in the first column, and reports the frequency of <strong>scores<\/strong> that fall within each range. 
In this example, we have surveyed 30 students\u2019 stress levels on a scale of 1-10:\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">Example of Numeric Dataset<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<div>Stress ratings of 30 students: 8,7,4,10,8,6,8,9,9,7,3,7,6,5,1,9,10,7,7,3,6,7,5,2,1,6,7,10,8,8<\/div>\r\n<\/div>\r\n<\/div>\r\nIf we wanted to get the table into the ideal format of 4-8 rows, we could create grouped frequencies, with two <strong>values<\/strong> in each row.\r\n\r\nBy following these steps, we can create a <strong>grouped frequency table<\/strong>:\r\n<div class=\"textbox textbox--exercises\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">Steps for Making a Grouped Frequency Table<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<ul>\r\n \t<li>Label the first row: Values, Frequency, and Percentage<\/li>\r\n \t<li>Decide on the ranges of values you need. You want to choose ranges that will leave you with 4-8 rows in the table<\/li>\r\n \t<li>In the first column, under the heading Values, list all the possible value ranges the variable could take on. In this case, we are grouping by twos, so there should be 5 rows in the data portion of the table.<\/li>\r\n \t<li>Make a list down the page of each score, from lowest to highest, to make it easier to count them.<\/li>\r\n \t<li>Go one by one through the scores, making a mark for each next to its value range on the list (e.g., how many 1\u2019s and 2\u2019s are there? 3. \u2026 How many 3\u2019s and 4\u2019s are there? 3. \u2026 and so on). Repeat that question for every value range. Write those frequencies, or counts, in the Frequency column.<\/li>\r\n \t<li>Figure the percentage of scores for each value range. 
To calculate a percentage you take the frequency, divide by how many scores you have in the dataset (here we have 30 students, so 30 scores), and multiply that by 100 to move the decimal to the right two places. So for the value range of 1-2, with frequency of 3, that becomes 3 divided by 30 times 100, or 10%. Calculate and list all the percentages.<\/li>\r\n<\/ul>\r\n<\/div>\r\n<\/div>\r\nHere is the completed <strong>grouped frequency table<\/strong> for the dataset of 30 students:\r\n<table class=\"grid aligncenter\" cellspacing=\"0\" cellpadding=\"0\">\r\n<tbody>\r\n<tr style=\"height: 29px\">\r\n<td style=\"width: 110.808px;height: 29px\"><em><strong>Values<\/strong><\/em><\/td>\r\n<td style=\"width: 128.927px;height: 29px\"><em><strong>Frequency<\/strong><\/em><\/td>\r\n<td style=\"width: 126.925px;height: 29px\"><em><strong>Percent<\/strong><\/em><\/td>\r\n<\/tr>\r\n<tr style=\"height: 29px\">\r\n<td style=\"width: 110.808px;height: 29px\">1-2<\/td>\r\n<td style=\"width: 128.927px;height: 29px\">3<\/td>\r\n<td style=\"width: 126.925px;height: 29px\">10%<\/td>\r\n<\/tr>\r\n<tr style=\"height: 26px\">\r\n<td style=\"width: 110.808px;height: 26px\">3-4<\/td>\r\n<td style=\"width: 128.927px;height: 26px\">3<\/td>\r\n<td style=\"width: 126.925px;height: 26px\">10%<\/td>\r\n<\/tr>\r\n<tr style=\"height: 29px\">\r\n<td style=\"width: 110.808px;height: 29px\">5-6<\/td>\r\n<td style=\"width: 128.927px;height: 29px\">6<\/td>\r\n<td style=\"width: 126.925px;height: 29px\">20%<\/td>\r\n<\/tr>\r\n<tr style=\"height: 29px\">\r\n<td style=\"width: 110.808px;height: 29px\">7-8<\/td>\r\n<td style=\"width: 128.927px;height: 29px\">12<\/td>\r\n<td style=\"width: 126.925px;height: 29px\">40%<\/td>\r\n<\/tr>\r\n<tr style=\"height: 29px\">\r\n<td style=\"width: 110.808px;height: 29px\">9-10<\/td>\r\n<td style=\"width: 128.927px;height: 29px\">6<\/td>\r\n<td style=\"width: 126.925px;height: 29px\">20%<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nNote that we can and should double 
check our work. Simply add up all the frequencies and make sure the sum is 30. Also add up all the percentages and make sure they add up to 100%.\r\n\r\nHere is a <strong>histogram<\/strong> of the <strong>grouped frequency table<\/strong> we just generated:\r\n\r\n<img class=\" wp-image-24 aligncenter\" style=\"font-size: 1em\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/Figure-1.3-300x276.png\" alt=\"\" width=\"313\" height=\"288\" \/>\r\n\r\nNote that <strong>histograms<\/strong> are appropriate for plotting <strong>numeric<\/strong> datasets. The X axis should be labeled with numbers, rather than with categories. That is how a <strong>histogram<\/strong> differs from your typical bar graph. The other difference is the width of the bar. In <strong>histograms<\/strong>, because the data are <strong>numeric<\/strong> or continuous, the bars should appear to touch \u2013 with no break in between the bars. This gives a unitary appearance to the shape of the graph. 
If you were to draw a smooth line over the shape of the distribution, or overall pattern of the data, you would get the impression of a curve.\r\n\r\nIf you drew a smooth line over the shape of the dataset in a <strong>histogram<\/strong>, you could describe the shape that is generated with two types of descriptors:\r\n<div class=\"textbox textbox--examples\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">Describing a Distribution<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<ul>\r\n \t<li>How many peaks there are<\/li>\r\n \t<li>How symmetrical the shape is<\/li>\r\n<\/ul>\r\n<\/div>\r\n<\/div>\r\nSkewness is the term for describing symmetry: is the distribution of data symmetrical (or very close) \u2013 meaning a mirror image from left to right, or is it skewed right\/positively, or left\/negatively? To determine the direction of skew, you need to check the direction of the \u201ctail\u201d. If the tail points right, it is <strong>[pb_glossary id=\"57\"]right skewed[\/pb_glossary]<\/strong>. In this case, the tail points left, so it is <strong>[pb_glossary id=\"58\"]left skewed[\/pb_glossary]<\/strong>.\r\n\r\n<img class=\"wp-image-25 aligncenter\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/Figure-1.4-300x279.png\" alt=\"\" width=\"336\" height=\"313\" \/>\r\n\r\nHow many peaks the distribution contains is described as <strong>[pb_glossary id=\"60\"]unimodal[\/pb_glossary]<\/strong> or <strong>[pb_glossary id=\"61\"]bimodal[\/pb_glossary]<\/strong>. <strong>Unimodal<\/strong> distributions show one single collection of scores, whereas <strong>bimodal<\/strong> distributions look more like a camel\u2019s back, with two clear lumps. Do not jump to a <strong>bimodal<\/strong> description unless the two peaks are clear and distinct, with some low frequency bins in between. The peaks should also be fairly similar in size to be considered <strong>bimodal<\/strong>. 
This <strong>histogram<\/strong> above clearly displays just one peak, so we would describe it as <strong>unimodal<\/strong>.\r\n\r\nHow would the shape of our stress <strong>scores<\/strong> distribution look if I measured stress <strong>scores<\/strong> once early in the semester and then again late in the semester? One could speculate that the distribution could become <strong>bimodal<\/strong>, with the early-in-semester <strong>scores<\/strong> piling up on the low end of the stress scale, and late-in-semester scores piling up on the high end of the stress scale (as exam and assignment \u201ccrunch time\u201d has set in).\r\n<h4><a id=\"distribution_return\"><\/a><a href=\"#distCT\">Concept Practice: distribution shape<\/a><\/h4>\r\n<strong>Frequency tables<\/strong> and <strong>histograms<\/strong> are useful for summarizing <strong>numeric<\/strong> datasets. What about qualitative data, from <strong>nominal<\/strong> <strong>variables<\/strong>? Bar graphs and pie charts are excellent ways to summarize those types of data.\u00a0Note the gap between bars in a bar graph, as opposed to the touching bars in a <strong>histogram<\/strong>. This indicates the arbitrariness of the categories. We are still portraying how many of the measured individuals fall into each category, but those categories are not associated with <strong>numeric<\/strong> values, so no continuity should be implied. Pie charts are excellent for highlighting the relative proportion of <strong>scores<\/strong> that fall into each category.\r\n\r\n<img class=\"wp-image-29 aligncenter\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/Figure-1.5-3-300x191.png\" alt=\"\" width=\"357\" height=\"227\" \/>\r\n<h1>Chapter Summary<\/h1>\r\nIn this chapter, we reviewed why we need statistics. We also introduced some key terms, listed below. 
We then saw how to summarize data effectively using tables and graphs and describe the patterns the distributions of data make.\r\n\r\nKey terms:\r\n<table class=\"no-lines\" style=\"border-collapse: collapse;width: 92.0574%;height: 90px\" border=\"0\">\r\n<tbody>\r\n<tr style=\"height: 15px\">\r\n<td style=\"width: 25.0319%;height: 15px\"><strong>descriptive<\/strong><\/td>\r\n<td style=\"width: 34.8659%;height: 15px\"><strong>nominal<\/strong><\/td>\r\n<td style=\"width: 31.9974%;height: 15px\"><strong>independent variable<\/strong><\/td>\r\n<\/tr>\r\n<tr style=\"height: 15px\">\r\n<td style=\"width: 25.0319%;height: 15px\"><strong>inferential<\/strong><\/td>\r\n<td style=\"width: 34.8659%;height: 15px\"><strong>numeric<\/strong><\/td>\r\n<td style=\"width: 31.9974%;height: 15px\"><strong>dependent variable <\/strong><\/td>\r\n<\/tr>\r\n<tr style=\"height: 15px\">\r\n<td style=\"width: 25.0319%;height: 15px\"><strong>variable<\/strong><\/td>\r\n<td style=\"width: 34.8659%;height: 15px\"><strong>frequency table<\/strong><\/td>\r\n<td style=\"width: 31.9974%;height: 15px\"><strong>right skewed<\/strong><\/td>\r\n<\/tr>\r\n<tr style=\"height: 15px\">\r\n<td style=\"width: 25.0319%;height: 15px\"><strong>value<\/strong><\/td>\r\n<td style=\"width: 34.8659%;height: 15px\"><strong>histogram<\/strong><\/td>\r\n<td style=\"width: 31.9974%;height: 15px\"><strong>left skewed<\/strong><\/td>\r\n<\/tr>\r\n<tr style=\"height: 15px\">\r\n<td style=\"width: 25.0319%;height: 15px\"><strong>score<\/strong><\/td>\r\n<td style=\"width: 34.8659%;height: 15px\"><strong>grouped frequency table<\/strong><\/td>\r\n<td style=\"width: 31.9974%;height: 15px\"><strong>unimodal<\/strong><\/td>\r\n<\/tr>\r\n<tr style=\"height: 15px\">\r\n<td style=\"width: 25.0319%;height: 15px\"><\/td>\r\n<td style=\"width: 34.8659%;height: 15px\"><\/td>\r\n<td style=\"width: 31.9974%;height: 15px\"><strong>bimodal<\/strong><\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\n<em>Note: concise definitions of all 
key terms can be found in the <a href=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/back-matter\/key-terms-list\/\" target=\"_blank\" rel=\"noopener\">Key Terms List<\/a> at the end of the book.<\/em>\r\n<h1>Concept Practice<\/h1>\r\n<a id=\"intuition\"><\/a>[h5p id=\"84\"]\r\n<h6 style=\"text-align: right\">Return to <a href=\"#intuition_return\">text<\/a><\/h6>\r\n<a id=\"inferential\"><\/a>[h5p id=\"90\"]\r\n<a id=\"descriptive\"><\/a>[h5p id=\"91\"]\r\n<h6 style=\"text-align: right\">Return to <a href=\"#inferential_return\">text<\/a><\/h6>\r\n<a id=\"score\"><\/a>[h5p id=\"1\"]\r\n<h6 style=\"text-align: right\">Return to <a href=\"#score_return\">text<\/a><\/h6>\r\n<a id=\"distribution\"><\/a>[h5p id=\"94\"]\r\n<h6 style=\"text-align: right\">Return to <a href=\"#distribution_return\">text<\/a><\/h6>\r\nReturn to <a href=\"#1a\">1a. Why we need statistics<\/a>\r\n<h6>Download <a href=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2022\/06\/1a.-Class-worksheet.pdf\">Worksheet 1a.<\/a><\/h6>\r\nReturn to <a href=\"#1b\">1b. 
Displaying Data Using Tables and Graphs<\/a>\r\n<h6>Try interactive <a href=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/chapter\/worksheet-1b\/\" target=\"_blank\" rel=\"noopener\">Worksheet 1b.<\/a> or download <a href=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2022\/06\/1b.-Class-worksheet.pdf\">Worksheet 1b.<\/a><\/h6>\r\n<a id=\"video1a\" style=\"font-size: 1em;background-image: url('img\/anchor.gif')\"><\/a>video 1a\r\n\r\n[video width=\"1280\" height=\"720\" mp4=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/PSYC2300-Chapter-1a.mp4\"][\/video]\r\n\r\n<a id=\"video1b\" style=\"font-size: 1em;background-image: url('img\/anchor.gif')\"><\/a>video 1b1\r\n\r\n[video  width=\"1440\" height=\"1080\" mp4=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/PSYC2300-Chapter-1b1.mp4\"][\/video]\r\n\r\nvideo 1b2\r\n\r\n[video width=\"1440\" height=\"1080\" mp4=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/PSYC2300-Chapter-1b2.mp4\"][\/video]","rendered":"<h1><a id=\"1a\"><\/a>1a. Why we need statistics<\/h1>\n<h6 style=\"text-align: right\"><a href=\"#video1a\">video lesson<\/a><\/h6>\n<p>One of the first things I think we need to accomplish in this course is to understand why statistics are important. Our objective in this first part of Chapter 1 is to be able to articulate the purpose of a course introducing statistical principles and techniques, and to be able to supply examples of situations in which the techniques you will learn in such a course may be necessary to use.<\/p>\n<p>First, let us establish that this is <em>not<\/em> a math course. This is a course that is primarily about decision making. Not just any decision making, but decisions that are made after analyzing data in order to make objective decisions that are guided by empirical evidence. 
Of course, we use some simple calculations in the course in order to process the data into a form that aids our decision making. However, the math is a necessary means to an end, not an end in itself.<\/p>\n<p>In some situations this kind of decision making is not needed. When the decision can be made based on intuition and subjective personal preference, we do not need rigorous data-driven systems. For example, if I am trying to decide whom to date, or what style of clothing I like to wear that suits my personality, I likely am not going to conduct research and a formal data analysis to come to those decisions. Maybe you can think of another situation, in which a good decision can be made without empirical evidence.<\/p>\n<p>On the other hand, sometimes a decision that you need to make is one that affects others, or is so high stakes that you want to make an informed decision that is objective and based on empirical evidence. In this kind of decision making, you should check your intuition at the door, and walk in with an open mind, letting the data be your guide. Examples of situations in which an objective decision making process might be necessary would be when you are trying to decide whether a medical treatment is safe, or whether a proposed intervention is actually effective. Perhaps you need to find out if a crime prevention program is effective for urban and rural communities alike. Can you think of another kind of decision that should be made objectively based on data? What these scenarios have in common is that they are professional decisions, or are high stakes. In the professional workplace, we are often in situations where, if we just operated based on our intuition, we may make serious mistakes, because we have not considered whether the course of action we decide on is the best choice for all people, all situations, or over time. 
The techniques you will learn in this course will help you apply data analysis, so that you can set up a decision making framework that is objective and rigorous, and so that the decision you come to will be generalizable, to suit other people, situations, or time frames.<\/p>\n<p>Why does a student in your field of study require statistics? Regardless of your field of study, I bet you are asked to be a critical thinker. If we look at the list of critical thinking guidelines below that make for good science, I bet you can see the value of these guidelines for your own program of study.<\/p>\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">Critical thinking guidelines<\/p>\n<\/header>\n<div class=\"textbox__content\">\n<ul>\n<li>Ask Questions: Be Willing to Wonder<\/li>\n<li>Define Your Terms<\/li>\n<li><strong>Examine the Evidence<\/strong><\/li>\n<li>Analyze Assumptions and Biases<\/li>\n<li><strong>Avoid Emotional Reasoning<\/strong><\/li>\n<li>Don\u2019t Oversimplify<\/li>\n<li>Consider Other Interpretations<\/li>\n<li>Tolerate Uncertainty<\/li>\n<\/ul>\n<p>from Wade, Tavris &amp; Swinkels. (2017). <em>Psychology<\/em>. Boston: Pearson.<\/p>\n<\/div>\n<\/div>\n<p>Statistics represents a tool for examining evidence and allowing us to use data effectively. However, it is also important to realize that statistics can help us avoid emotional reasoning. Instead of relying on our intuitions about whether a drug is effective, or whether one choice is significantly better than another, statistical analysis allows us to make an objective decision.<\/p>\n<p>In statistics, N stands for sample size. In other words, how many data points did you measure. Very often, in everyday life, we are tempted to make assumptions and derive conclusions from single data points. In the world of statistics, we call these situations, \u201can N of one\u201d. 
These are situations scientists are extremely wary of, because they are vulnerable to bias.<\/p>\n<p>For example, let\u2019s say my friend has a really bad experience in one neighborhood. After that, even if there are no objective reports of comparative neighborhood safety that support this conclusion, I am likely to say to others that that\u2019s a bad neighborhood \u2013 one to avoid. We are always overly influenced by our own experiences and the experiences of those close to us. In such moments we should always remind ourselves that until we have asked many individuals who have been in that neighborhood what their experiences were, we only have one observation, and it may not be typical or representative. If my friend\u2019s experience in the neighborhood were the one bad experience in 1000 experiences, would we still be tempted to consider it a \u201cbad\u201d neighbourhood? Next time you face a situation like this in your daily life, just take a moment to pause and think to yourself&#8230; what information should I have to make the right decision?<\/p>\n<h4><a id=\"intuition_return\"><\/a><a href=\"&quot;#intuition\">Concept Practice: Intuition<\/a><\/h4>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter\" src=\"https:\/\/opentextbc.ca\/indigenizationfrontlineworkers\/wp-content\/uploads\/sites\/237\/2018\/06\/IndigenousHolisticFramework.png\" alt=\"&quot;&quot;\" width=\"351\" height=\"342\" \/><\/p>\n<p style=\"text-align: center;padding-left: 40px\">Fig. 2.2 from <em><a href=\"https:\/\/opentextbc.ca\/indigenizationfrontlineworkers\" rel=\"cc:attributionURL\">Pulling Together: A Guide for Front-Line Staff, Student Services, and Advisors<\/a><\/em>\u00a0by\u00a0Ian Cull, Robert L. A. Hancock, Stephanie McKeown, Michelle Pidgeon, and Adrienne Vedan<\/p>\n<p>In this course, we will be focusing on only one aspect of one way of knowing. Let us acknowledge the fact that various cultures and systems place particular value on various ways of knowing. 
For example, if we refer to the Indigenous ways of knowing framework shown above, we might see this entire course as being one element of \u201cintellectual\u201d ways of knowing. Its role might be to contribute to responsibility and relevance by enhancing the generalizability of decision making, as we discussed before. However, no one should mistake statistics for a holistic system of knowing.<\/p>\n<p>I encourage you to think of what you learn in this course as one tool in the toolbox. The reason many academic disciplines require a statistics course is that this is a tool most people do not get in other areas of their lives. We tend not to learn statistics from our parents or by volunteering in the community. In fact, most of us are very bad at this form of decision making until we learn to use these tools.<\/p>\n<p>By requiring you to learn statistics, disciplines like Psychology are not suggesting it is the only important decision-making tool. It is one we think you need to understand to be a good scientist and to better interpret some types of evidence to which you will have access in your professional life. I encourage you to learn more about holistic ways of knowing and to reflect on the place that formal, data-driven decision-making practices have in your own ways-of-knowing framework. I think we could all gain some insight by looking at such a model with an eye toward acknowledging areas in which we are weaker, because of our own individual experiences or because of the society in which we have grown up.<\/p>\n<h1><a id=\"1b\"><\/a>1b. Displaying Data Using Tables and Graphs<\/h1>\n<h6 style=\"text-align: right\"><a href=\"#video1b\">video lesson<\/a><\/h6>\n<p>Have you ever heard the saying, \u201ca picture is worth a thousand words\u201d? That is what the rest of this chapter is all about. 
First,<span style=\"text-align: initial;font-size: 1em\">\u00a0we need to cover some basic concepts and definitions, including the differences between <strong><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_5_33\">descriptive<\/a><\/strong> and <strong><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_5_43\">inferential<\/a><\/strong> statistics, and the meaning of the terms <strong><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_5_45\">variable<\/a><\/strong>, <strong><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_5_46\">value<\/a><\/strong> and <strong><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_5_47\">score<\/a><\/strong>. We will then need to learn to distinguish among levels of measurement to be able to choose the appropriate techniques for summarizing different types of data.<\/span><\/p>\n<p>Finally, I will demonstrate how to generate <strong><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_5_49\">frequency tables<\/a><\/strong> and to graph a dataset, because the first step in data analysis is always to look at it. Just as a picture is worth a thousand words, it is also worth a thousand numbers.<\/p>\n<p>At first we will focus on <strong>descriptive<\/strong> statistics. These are ways to summarize or organize data from a research study \u2013 essentially allowing us to describe what the data are.<\/p>\n<p>A little later in the course, we will move into the realm of <strong>inferential<\/strong> statistics. These are analytical tools that allow us to draw conclusions based on data from a research study. In other words, we go beyond just saying what the data are, and make a statement about what they mean. 
Inferential statistics are used in research and policy as a tool to make decisions.<\/p>\n<h4><a id=\"inferential_return\"><\/a><a href=\"&quot;#inferential\">Concept Practice: inferential statistics<\/a><\/h4>\n<h4><a href=\"&quot;#descriptive\">Concept Practice: descriptive statistics<\/a><\/h4>\n<p>Three basic terms are essential jargon in statistics. A <strong>variable<\/strong> is a quality or a quantity that is different for different individuals. A <strong>variable<\/strong> could be a quality, like ethnicity, for which each person might have a different characteristic. Or it could be a quantity, like temperature, that could be different each time you take a reading, and is measured on a number scale. A <strong>value<\/strong> is just any possible number or category that a <strong>variable<\/strong> could take on. So for ethnicity you might have 6 categories in which you place individuals. Or for temperature there might be a numeric range from -100 to +100. Those would be the full set of <strong>values<\/strong> for that <strong>variable<\/strong>. A score is a particular individual\u2019s <strong>value<\/strong> on the <strong>variable<\/strong>. For ethnicity, you would identify yourself as one particular category, and that would be your <strong>score<\/strong>. For temperature, if you check your weather app and see that it is 7 degrees outside, that is the <strong>score<\/strong> for that time and place.<\/p>\n<h4><a id=\"score_return\"><\/a><a href=\"#score\">Concept Practice: variable, value, score<\/a><\/h4>\n<p>Measurement is the assignment of a number to the amount of something, or the assignment of labels to categories. This is often obvious (for example, we might measure time as number of seconds or number of minutes). Sometimes, however, it can be a bit more arbitrary. 
We might assign numbers to signify a category, for example 1 for male and 2 for female.<\/p>\n<p>Based on how we measure them, there are two major types of variables in statistics, and these will be important to keep in mind as we go through the semester. The type of variable determines how we can use it.<\/p>\n<table class=\"lines\" style=\"height: 168px\">\n<tbody>\n<tr style=\"height: 29px\">\n<td style=\"width: 170.996px;height: 29px\"><em><strong>Type of variable<\/strong><\/em><\/td>\n<td style=\"width: 345.996px;height: 29px\"><em><strong>Characteristics<\/strong><\/em><\/td>\n<td style=\"width: 228.971px;height: 29px\"><em><strong>Examples<\/strong><\/em><\/td>\n<\/tr>\n<tr style=\"height: 46px\">\n<td style=\"width: 170.996px;height: 46px\">&nbsp;<\/p>\n<p><strong>Nominal<\/strong>\/<\/p>\n<p>Categorical<\/p>\n<p>&nbsp;<\/td>\n<td style=\"width: 345.996px;height: 46px\">\n<div>\u2022Label and categorize<\/div>\n<div>\u2022If numbered, numbers are arbitrary<\/div>\n<\/td>\n<td style=\"width: 228.971px;height: 46px\">\n<div>\u2022Gender<\/div>\n<div>\u2022Diagnosis<\/div>\n<div>\u2022Experimental or Control<\/div>\n<\/td>\n<\/tr>\n<tr style=\"height: 93px\">\n<td style=\"width: 170.996px;height: 93px\">&nbsp;<\/p>\n<p><strong>Numeric<\/strong>\/<\/p>\n<p>Quantitative<\/p>\n<p>&nbsp;<\/td>\n<td style=\"width: 345.996px;height: 93px\">\n<div>\u2022Numerical data<\/div>\n<div>\u2022Numbers reflect size or amount of something<\/div>\n<\/td>\n<td style=\"width: 228.971px;height: 93px\">\n<div>\u2022Temperature<\/div>\n<div>\u2022IQ<\/div>\n<div>\u2022Golf scores (above\/below par)<\/div>\n<div>\u2022Number of correct answers<\/div>\n<div>\u2022Time to complete task<\/div>\n<div>\u2022Gain in height since last year<\/div>\n<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>The first type of variable is <strong><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_5_50\">nominal<\/a><\/strong> or categorical (also called 
qualitative). <strong>Nominal<\/strong> variables label or categorize something, and any numbers used to measure these variables are arbitrary and do not indicate quantity or size. For example, if male is scored as 1 and female is scored as 2, that does not indicate that females are twice as good or double the size of males. It is just a code.<\/p>\n<p><strong><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_5_51\">Numeric<\/a><\/strong> or quantitative variables are ones for which numbers are actually meaningful. They indicate the size or amount of something.<\/p>\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">Examples of Numeric variables<\/p>\n<\/header>\n<div class=\"textbox__content\">\n<ul>\n<li>Temperature, in which 10 degrees is warmer than 0<\/li>\n<li>Golf scores, in which 2 below par means you did well<\/li>\n<li>IQ, in which 100 is average intelligence<\/li>\n<li>Number of correct answers, in which 4 correct answers is twice as many as 2 correct answers<\/li>\n<\/ul>\n<\/div>\n<\/div>\n<p>When we calculate statistics, we will see that we can calculate an average IQ in a group of people, or an average temperature across several days. But we cannot calculate an average gender. What we will do is use groupings or categories as a basis for comparison of other variables; for example, does the experimental group have a higher number of correct answers than the control group?<\/p>\n<p>Now, a quick side note: If you take a course in research methods, you will learn that measurement is a really tricky thing in practice, particularly when you want to measure something internal about a person. The process of operationally defining something so that you can measure it numerically or in discrete categories is a real challenge. This is beyond the scope of this course, but just to give you a sense, try brainstorming a way in which you could measure aggression. 
Think of at least one way that would create a <strong>nominal variable<\/strong>, and one way that would create a <strong>numeric variable<\/strong>.<\/p>\n<p>If you give that example some thought, you will quickly find that a relatively simple variable like aggression can be fiendishly difficult to measure, and in the field of psychology a lot of effort is put into developing good ways to measure mental constructs. In experimental psychology, we often prefer to measure things as numbers, because then we can use statistical methods to summarize and to make inferences about the thing we measured.<\/p>\n<p>We should return to our discussion of experimental research design. A <strong>variable<\/strong> is something that has different values for different individuals, and that we can measure. As an example, we can measure how fast someone is at completing a puzzle, and get those scores for a bunch of people. This variable would be speed. We can also assign each of those people into categories or conditions: a high-stress vs. low-stress condition, for example. Research is the study of the relationship between <strong>variables<\/strong>. Therefore, there must be at least two <strong>variables<\/strong> in a research study (or there is no relationship to study). Typically an experimental study in psychology has one (or more) <strong><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_5_52\">independent variables<\/a><\/strong> and one (or more) <strong><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_5_53\">dependent variables<\/a><\/strong>.<\/p>\n<p>An <strong>independent variable<\/strong> is one you manipulate &#8212; most often it is categorical, or <strong>nominal<\/strong> (e.g. experimental group vs. control group). 
A <strong>dependent variable<\/strong> is one you measure to detect a difference\/change as a result of the manipulation &#8212; most often it is <strong>numeric<\/strong> (e.g. time to complete a puzzle).<\/p>\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">Example of Experimental Design<\/p>\n<\/header>\n<div class=\"textbox__content\">\n<p>Do members of your experimental group (who were required to give a speech in front of a group of people) solve a puzzle in a shorter or longer amount of time than members of your control group (who were allowed to browse magazines)?<\/p>\n<\/div>\n<\/div>\n<p>In the example above, the <strong>independent variable<\/strong> would be the manipulation: whether people are required to give a speech or are allowed to browse magazines. Note that this is a <strong>nominal variable<\/strong>. The <strong>dependent variable<\/strong> is what you measure after the manipulation: how long it takes the participants to solve a puzzle. Note that this would be a <strong>numeric variable<\/strong>.<\/p>\n<p>Now that you have some basic definitions and concepts down regarding types of data and how to measure them, we need to learn how to deal with <strong>numeric<\/strong> data.<\/p>\n<div style=\"text-align: center\">\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">Example of Numeric Dataset<\/p>\n<\/header>\n<div class=\"textbox__content\">Stress ratings of 10 students: 8,7,4,10,8,6,8,9,9,7<\/div>\n<\/div>\n<p><span style=\"text-align: initial;font-size: 1em\">First\u2026 what can you say about this data set from the list of numbers above? How would you describe the findings to someone?<\/span><\/p>\n<\/div>\n<p>Perhaps you want to summarize a dataset in table form, to organize the data and make it easy to get an overview of the dataset quickly. A <strong>frequency table<\/strong> does just that. 
To create a <strong>frequency table<\/strong>, you just ask yourself: for each possible value on this variable, how many individuals have that value as their score? That gives you the frequency of each value \u2013 or how often it occurred in the dataset. Let\u2019s look at an example. We measure the stress levels of 10 students, on a scale of 1 to 10, and above are their scores. Hard to make any sense out of that list, right? By following the steps below, we can create a <strong>frequency table<\/strong>.<\/p>\n<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">Steps for Making a Frequency Table<\/p>\n<\/header>\n<div class=\"textbox__content\">\n<ul>\n<li>Label the first row: Values, Frequency, and Percentage.<\/li>\n<li>In the first column, under the heading Values, list all the possible values the variable could take on. In this case, we have 10 possible values, so there should be 10 rows in the data portion of the table.<\/li>\n<li>Make a list down the page of each score, from lowest to highest, to make it easier to count them.<\/li>\n<li>Go one by one through the scores, making a mark for each next to its value on the list (e.g., how many 1\u2019s are there? 0. \u2026 How many 4\u2019s are there? 1. \u2026 How many 7\u2019s are there? 2.) Repeat that question for every value from 1 to 10. Write those frequencies, or counts, in the Frequency column.<\/li>\n<li>Figure the percentage of scores for each value. To calculate a percentage you take the frequency, divide by how many scores you have in the dataset (here we have 10 students, so 10 scores), and multiply that by 100 to move the decimal to the right two places. So for the value of 7, with frequency of 2, that becomes 2 divided by 10 times 100, or 20%. 
Calculate and list all the percentages.<\/li>\n<\/ul>\n<\/div>\n<\/div>\n<p>Here is what the table should look like once you are done with those steps:<\/p>\n<table class=\"grid aligncenter\" style=\"height: 442px; width: 396px; border-spacing: 0px;\" cellpadding=\"0\">\n<tbody>\n<tr style=\"height: 29px\">\n<td style=\"width: 110.808px;height: 29px\"><em><strong>Values<\/strong><\/em><\/td>\n<td style=\"width: 128.927px;height: 29px\"><em><strong>Frequency<\/strong><\/em><\/td>\n<td style=\"width: 126.925px;height: 29px\"><em><strong>Percent<\/strong><\/em><\/td>\n<\/tr>\n<tr style=\"height: 29px\">\n<td style=\"width: 110.808px;height: 29px\">1<\/td>\n<td style=\"width: 128.927px;height: 29px\">0<\/td>\n<td style=\"width: 126.925px;height: 29px\">0%<\/td>\n<\/tr>\n<tr style=\"height: 26px\">\n<td style=\"width: 110.808px;height: 26px\">2<\/td>\n<td style=\"width: 128.927px;height: 26px\">0<\/td>\n<td style=\"width: 126.925px;height: 26px\">0%<\/td>\n<\/tr>\n<tr style=\"height: 29px\">\n<td style=\"width: 110.808px;height: 29px\">3<\/td>\n<td style=\"width: 128.927px;height: 29px\">0<\/td>\n<td style=\"width: 126.925px;height: 29px\">0%<\/td>\n<\/tr>\n<tr style=\"height: 29px\">\n<td style=\"width: 110.808px;height: 29px\">4<\/td>\n<td style=\"width: 128.927px;height: 29px\">1<\/td>\n<td style=\"width: 126.925px;height: 29px\">10%<\/td>\n<\/tr>\n<tr style=\"height: 29px\">\n<td style=\"width: 110.808px;height: 29px\">5<\/td>\n<td style=\"width: 128.927px;height: 29px\">0<\/td>\n<td style=\"width: 126.925px;height: 29px\">0%<\/td>\n<\/tr>\n<tr style=\"height: 29px\">\n<td style=\"width: 110.808px;height: 29px\">6<\/td>\n<td style=\"width: 128.927px;height: 29px\">1<\/td>\n<td style=\"width: 126.925px;height: 29px\">10%<\/td>\n<\/tr>\n<tr style=\"height: 29px\">\n<td style=\"width: 110.808px;height: 29px\">7<\/td>\n<td style=\"width: 128.927px;height: 29px\">2<\/td>\n<td style=\"width: 126.925px;height: 29px\">20%<\/td>\n<\/tr>\n<tr style=\"height: 29px\">\n<td 
style=\"width: 110.808px;height: 29px\">8<\/td>\n<td style=\"width: 128.927px;height: 29px\">3<\/td>\n<td style=\"width: 126.925px;height: 29px\">30%<\/td>\n<\/tr>\n<tr style=\"height: 29px\">\n<td style=\"width: 110.808px;height: 29px\">9<\/td>\n<td style=\"width: 128.927px;height: 29px\">2<\/td>\n<td style=\"width: 126.925px;height: 29px\">20%<\/td>\n<\/tr>\n<tr style=\"height: 29px\">\n<td style=\"width: 110.808px;height: 29px\">10<\/td>\n<td style=\"width: 128.927px;height: 29px\">1<\/td>\n<td style=\"width: 126.925px;height: 29px\">10%<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Now you can scan down the table and quickly see where most of the <strong>scores<\/strong> fall within the range of possible <strong>values<\/strong>. Now that you have an organized summary of the data, you can clearly see that the majority of students are reporting fairly high stress <strong>scores<\/strong>. By looking at the percentages, you have a quick way to report the proportion of students that are highly stressed. For example, just by doing some quick addition, you can say that 60% of surveyed students report stress levels 8 or higher.<\/p>\n<p>Most people find graphs easier to interpret at first glance than tables. 
What can you say about the dataset after looking at this graph?<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-23 aligncenter\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/Figure-1.2-300x251.png\" alt=\"\" width=\"338\" height=\"283\" srcset=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/Figure-1.2-300x251.png 300w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/Figure-1.2-65x54.png 65w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/Figure-1.2-225x188.png 225w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/Figure-1.2-350x292.png 350w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/Figure-1.2.png 595w\" sizes=\"auto, (max-width: 338px) 100vw, 338px\" \/><\/p>\n<p>I bet you were able to say that most <strong>scores<\/strong> pile up at the upper end of the graph, at the higher end of the range of stress score <strong>values<\/strong>.<\/p>\n<p>The graphical version of a frequency table is a <strong><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_5_55\">histogram<\/a><\/strong>. The X axis on a <strong>histogram<\/strong> should have the values of the variable listed, from lowest to highest. The Y axis should represent the frequencies of each value in the dataset. In other words, the <strong>histogram<\/strong> is a <strong>frequency table<\/strong> that has been turned on its side. The added benefit comes from the visual representation of the frequency as the height of the bars in the graph, rather than just a number. You can thus see a clear shape in the dataset.<\/p>\n<p>In some circumstances, a <strong>frequency table<\/strong> is not an effective way to summarize a dataset. 
This is the case if the range of <strong>values<\/strong> is too large. For example, what if you were summarizing temperature <strong>scores<\/strong>, which can range from 0-100? This would mean more than 100 rows in the table. That is not so helpful. In such a case, a <strong><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_5_56\">grouped frequency table<\/a><\/strong> is a much better option.<\/p>\n<p>A <strong>grouped frequency table<\/strong> defines ranges of values in the first column, and reports the frequency of <strong>scores<\/strong> that fall within each range. In this example, we have surveyed 30 students\u2019 stress levels on a scale of 1-10:<\/p>\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">Example of Numeric Dataset<\/p>\n<\/header>\n<div class=\"textbox__content\">\n<div>Stress ratings of 30 students: 8,7,4,10,8,6,8,9,9,7,3,7,6,5,1,9,10,7,7,3,6,7,5,2,1,6,7,10,8,8<\/div>\n<\/div>\n<\/div>\n<p>If we wanted to get the table into the ideal format of 4-8 rows, we could create grouped frequencies, with two <strong>values<\/strong> in each row.<\/p>\n<p>By following these steps, we can create a <strong>grouped frequency table<\/strong>:<\/p>\n<div class=\"textbox textbox--exercises\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">Steps for Making a Grouped Frequency Table<\/p>\n<\/header>\n<div class=\"textbox__content\">\n<ul>\n<li>Label the first row: Values, Frequency, and Percentage<\/li>\n<li>Decide on the ranges of values you need. You want to choose ranges that will leave you with 4-8 rows in the table.<\/li>\n<li>In the first column, under the heading Values, list all the possible value ranges the variable could take on. 
In this case, we are grouping by twos, so there should be 5 rows in the data portion of the table.<\/li>\n<li>Make a list down the page of each score, from lowest to highest, to make it easier to count them.<\/li>\n<li>Go one by one through the scores, making a mark for each next to its value range on the list (e.g., how many 1\u2019s and 2\u2019s are there? 3. \u2026 How many 3\u2019s and 4\u2019s are there? 3. \u2026 and so on). Repeat that question for every value range. Write those frequencies, or counts, in the Frequency column.<\/li>\n<li>Figure out the percentage of scores for each value range. To calculate a percentage, you take the frequency, divide by how many scores you have in the dataset (here we have 30 students, so 30 scores), and multiply that by 100 to move the decimal to the right two places. So for the value range of 1-2, with a frequency of 3, that becomes 3 divided by 30 times 100, or 10%. Calculate and list all the percentages.<\/li>\n<\/ul>\n<\/div>\n<\/div>\n<p>Here is the completed <strong>grouped frequency table<\/strong> for the dataset of 30 students:<\/p>\n<table class=\"grid aligncenter\" cellpadding=\"0\" style=\"border-spacing: 0px;\">\n<tbody>\n<tr style=\"height: 29px\">\n<td style=\"width: 110.808px;height: 29px\"><em><strong>Values<\/strong><\/em><\/td>\n<td style=\"width: 128.927px;height: 29px\"><em><strong>Frequency<\/strong><\/em><\/td>\n<td style=\"width: 126.925px;height: 29px\"><em><strong>Percent<\/strong><\/em><\/td>\n<\/tr>\n<tr style=\"height: 29px\">\n<td style=\"width: 110.808px;height: 29px\">1-2<\/td>\n<td style=\"width: 128.927px;height: 29px\">3<\/td>\n<td style=\"width: 126.925px;height: 29px\">10%<\/td>\n<\/tr>\n<tr style=\"height: 26px\">\n<td style=\"width: 110.808px;height: 26px\">3-4<\/td>\n<td style=\"width: 128.927px;height: 26px\">3<\/td>\n<td style=\"width: 126.925px;height: 26px\">10%<\/td>\n<\/tr>\n<tr style=\"height: 29px\">\n<td style=\"width: 110.808px;height: 29px\">5-6<\/td>\n<td style=\"width: 
128.927px;height: 29px\">6<\/td>\n<td style=\"width: 126.925px;height: 29px\">20%<\/td>\n<\/tr>\n<tr style=\"height: 29px\">\n<td style=\"width: 110.808px;height: 29px\">7-8<\/td>\n<td style=\"width: 128.927px;height: 29px\">12<\/td>\n<td style=\"width: 126.925px;height: 29px\">40%<\/td>\n<\/tr>\n<tr style=\"height: 29px\">\n<td style=\"width: 110.808px;height: 29px\">9-10<\/td>\n<td style=\"width: 128.927px;height: 29px\">6<\/td>\n<td style=\"width: 126.925px;height: 29px\">20%<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Note that we can and should double check our work. Simply add up all the frequencies and make sure the sum is 30. Also add up all the percentages and make sure they add up to 100%.<\/p>\n<p>Here is a <strong>histogram<\/strong> of the <strong>grouped frequency table<\/strong> we just generated:<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-24 aligncenter\" style=\"font-size: 1em\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/Figure-1.3-300x276.png\" alt=\"\" width=\"313\" height=\"288\" srcset=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/Figure-1.3-300x276.png 300w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/Figure-1.3-65x60.png 65w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/Figure-1.3-225x207.png 225w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/Figure-1.3-350x323.png 350w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/Figure-1.3.png 573w\" sizes=\"auto, (max-width: 313px) 100vw, 313px\" \/><\/p>\n<p>Note that 
<strong>histograms<\/strong> are appropriate for plotting <strong>numeric<\/strong> datasets. The X axis should be labeled with numbers, rather than with categories. That is how a <strong>histogram<\/strong> differs from your typical bar graph. The other difference is the width of the bar. In <strong>histograms<\/strong>, because the data are <strong>numeric<\/strong> or continuous, the bars should appear to touch \u2013 with no break in between the bars. This gives a unitary appearance to the shape of the graph. If you were to draw a smooth line over the shape of the distribution, or overall pattern of the data, you would get the impression of a curve.<\/p>\n<p>If you drew a smooth line over the shape of the dataset in a <strong>histogram<\/strong>, you could describe the shape that is generated with two types of descriptors:<\/p>\n<div class=\"textbox textbox--examples\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">Describing a Distribution<\/p>\n<\/header>\n<div class=\"textbox__content\">\n<ul>\n<li>How many peaks there are<\/li>\n<li>How symmetrical the shape is<\/li>\n<\/ul>\n<\/div>\n<\/div>\n<p>Skewness is the term for describing symmetry: is the distribution of data symmetrical (or very close) \u2013 meaning a mirror image from left to right, or is it skewed right\/positively, or left\/negatively? To determine the direction of skew, you need to check the direction of the \u201ctail\u201d. If the tail points right, it is <strong><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_5_57\">right skewed<\/a><\/strong>. 
In this case, the tail points left, so it is <strong><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_5_58\">left skewed<\/a><\/strong>.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-25 aligncenter\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/Figure-1.4-300x279.png\" alt=\"\" width=\"336\" height=\"313\" srcset=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/Figure-1.4-300x279.png 300w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/Figure-1.4-1024x952.png 1024w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/Figure-1.4-768x714.png 768w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/Figure-1.4-65x60.png 65w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/Figure-1.4-225x209.png 225w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/Figure-1.4-350x325.png 350w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/Figure-1.4.png 1199w\" sizes=\"auto, (max-width: 336px) 100vw, 336px\" \/><\/p>\n<p>How many peaks the distribution contains is described as <strong><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_5_60\">unimodal<\/a><\/strong> or <strong><a class=\"glossary-term\" aria-haspopup=\"dialog\" aria-describedby=\"definition\" href=\"#term_5_61\">bimodal<\/a><\/strong>. <strong>Unimodal<\/strong> distributions show one single collection of scores, whereas <strong>bimodal<\/strong> distributions look more like a camel\u2019s back, with two clear lumps. Do not jump to a <strong>bimodal<\/strong> description unless the two peaks are clear and distinct, with some low frequency bins in between. 
The peaks should also be fairly similar in size to be considered <strong>bimodal<\/strong>. The <strong>histogram<\/strong> above clearly displays just one peak, so we would describe it as <strong>unimodal<\/strong>.<\/p>\n<p>How would the shape of our stress <strong>scores<\/strong> distribution look if I measured stress <strong>scores<\/strong> once early in the semester and then again late in the semester? One could speculate that the distribution could become <strong>bimodal<\/strong>, with the early-in-semester <strong>scores<\/strong> piling up on the low end of the stress scale, and late-in-semester scores piling up on the high end of the stress scale (as exam and assignment \u201ccrunch time\u201d has set in).<\/p>\n<h4><a id=\"distribution_return\"><\/a><a href=\"#distribution\">Concept Practice: distribution shape<\/a><\/h4>\n<p><strong>Frequency tables<\/strong> and <strong>histograms<\/strong> are useful for summarizing <strong>numeric<\/strong> datasets. What about qualitative data, from <strong>nominal<\/strong> <strong>variables<\/strong>? Bar graphs and pie charts are excellent ways to summarize those types of data. Note the gap between bars in a bar graph, as opposed to the touching bars in a <strong>histogram<\/strong>. This indicates the arbitrariness of the categories. We are still portraying how many of the measured individuals fall into each category, but those categories are not associated with <strong>numeric<\/strong> values, so no continuity should be implied. 
Pie charts are excellent for highlighting the relative proportion of <strong>scores<\/strong> that fall into each category.<\/p>\n<p><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-29 aligncenter\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/Figure-1.5-3-300x191.png\" alt=\"\" width=\"357\" height=\"227\" srcset=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/Figure-1.5-3-300x191.png 300w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/Figure-1.5-3-768x489.png 768w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/Figure-1.5-3-65x41.png 65w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/Figure-1.5-3-225x143.png 225w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/Figure-1.5-3-350x223.png 350w, https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/Figure-1.5-3.png 899w\" sizes=\"auto, (max-width: 357px) 100vw, 357px\" \/><\/p>\n<h1>Chapter Summary<\/h1>\n<p>In this chapter, we reviewed why we need statistics. We also introduced some key terms, listed below. 
We then saw how to summarize data effectively using tables and graphs and describe the patterns the distributions of data make.<\/p>\n<p>Key terms:<\/p>\n<table class=\"no-lines\" style=\"border-collapse: collapse;width: 92.0574%;height: 90px\">\n<tbody>\n<tr style=\"height: 15px\">\n<td style=\"width: 25.0319%;height: 15px\"><strong>descriptive<\/strong><\/td>\n<td style=\"width: 34.8659%;height: 15px\"><strong>nominal<\/strong><\/td>\n<td style=\"width: 31.9974%;height: 15px\"><strong>independent variable<\/strong><\/td>\n<\/tr>\n<tr style=\"height: 15px\">\n<td style=\"width: 25.0319%;height: 15px\"><strong>inferential<\/strong><\/td>\n<td style=\"width: 34.8659%;height: 15px\"><strong>numeric<\/strong><\/td>\n<td style=\"width: 31.9974%;height: 15px\"><strong>dependent variable <\/strong><\/td>\n<\/tr>\n<tr style=\"height: 15px\">\n<td style=\"width: 25.0319%;height: 15px\"><strong>variable<\/strong><\/td>\n<td style=\"width: 34.8659%;height: 15px\"><strong>frequency table<\/strong><\/td>\n<td style=\"width: 31.9974%;height: 15px\"><strong>right skewed<\/strong><\/td>\n<\/tr>\n<tr style=\"height: 15px\">\n<td style=\"width: 25.0319%;height: 15px\"><strong>value<\/strong><\/td>\n<td style=\"width: 34.8659%;height: 15px\"><strong>histogram<\/strong><\/td>\n<td style=\"width: 31.9974%;height: 15px\"><strong>left skewed<\/strong><\/td>\n<\/tr>\n<tr style=\"height: 15px\">\n<td style=\"width: 25.0319%;height: 15px\"><strong>score<\/strong><\/td>\n<td style=\"width: 34.8659%;height: 15px\"><strong>grouped frequency table<\/strong><\/td>\n<td style=\"width: 31.9974%;height: 15px\"><strong>unimodal<\/strong><\/td>\n<\/tr>\n<tr style=\"height: 15px\">\n<td style=\"width: 25.0319%;height: 15px\"><\/td>\n<td style=\"width: 34.8659%;height: 15px\"><\/td>\n<td style=\"width: 31.9974%;height: 15px\"><strong>bimodal<\/strong><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><em>Note: concise definitions of all key terms can be found in the <a 
href=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/back-matter\/key-terms-list\/\" target=\"_blank\" rel=\"noopener\">Key Terms List<\/a> at the end of the book.<\/em><\/p>\n<h1>Concept Practice<\/h1>\n<p><a id=\"intuition\"><\/a><\/p>\n<div id=\"h5p-84\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-84\" class=\"h5p-iframe\" data-content-id=\"84\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"Practice 1a.01. Intuitive and subjective vs empirical evidence\"><\/iframe><\/div>\n<\/div>\n<h6 style=\"text-align: right\">Return to <a href=\"#intuition_return\">text<\/a><\/h6>\n<p><a id=\"inferential\"><\/a><\/p>\n<div id=\"h5p-90\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-90\" class=\"h5p-iframe\" data-content-id=\"90\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"Practice 1b.01. Descriptive statistics vs inferential statistics\"><\/iframe><\/div>\n<\/div>\n<p><a id=\"descriptive\"><\/a><\/p>\n<div id=\"h5p-91\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-91\" class=\"h5p-iframe\" data-content-id=\"91\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"Practice 1b.02. Descriptive statistics vs. Inferential statistics\"><\/iframe><\/div>\n<\/div>\n<h6 style=\"text-align: right\">Return to <a href=\"#inferential_return\">text<\/a><\/h6>\n<p><a id=\"score\"><\/a><\/p>\n<div id=\"h5p-1\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-1\" class=\"h5p-iframe\" data-content-id=\"1\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"Practice 1b. 
Key terms: variable, value, score\"><\/iframe><\/div>\n<\/div>\n<h6 style=\"text-align: right\">Return to <a href=\"#score_return\">text<\/a><\/h6>\n<p><a id=\"distribution\"><\/a><\/p>\n<div id=\"h5p-94\">\n<div class=\"h5p-iframe-wrapper\"><iframe id=\"h5p-iframe-94\" class=\"h5p-iframe\" data-content-id=\"94\" style=\"height:1px\" src=\"about:blank\" frameBorder=\"0\" scrolling=\"no\" title=\"Practice 1b.05. Distribution shape\"><\/iframe><\/div>\n<\/div>\n<h6 style=\"text-align: right\">Return to <a href=\"#distribution_return\">text<\/a><\/h6>\n<p>Return to <a href=\"#1a\">1a. Why we need statistics<\/a><\/p>\n<h6>Download <a href=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2022\/06\/1a.-Class-worksheet.pdf\">Worksheet 1a.<\/a><\/h6>\n<p>Return to <a href=\"#1b\">1b. Displaying Data Using Tables and Graphs<\/a><\/p>\n<h6>Try interactive <a href=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/chapter\/worksheet-1b\/\" target=\"_blank\" rel=\"noopener\">Worksheet 1b.<\/a> or download <a href=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2022\/06\/1b.-Class-worksheet.pdf\">Worksheet 1b.<\/a><\/h6>\n<p><a id=\"video1a\" style=\"font-size: 1em;background-image: url('img\/anchor.gif')\"><\/a>video 1a<\/p>\n<div style=\"width: 1280px;\" class=\"wp-video\"><video class=\"wp-video-shortcode\" id=\"video-5-1\" width=\"1280\" height=\"720\" preload=\"metadata\" controls=\"controls\"><source type=\"video\/mp4\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/PSYC2300-Chapter-1a.mp4?_=1\" \/><a href=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/PSYC2300-Chapter-1a.mp4\">https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/PSYC2300-Chapter-1a.mp4<\/a><\/video><\/div>\n<p><a id=\"video1b\" style=\"font-size: 1em;background-image: url('img\/anchor.gif')\"><\/a>video 
1b1<\/p>\n<div style=\"width: 1440px;\" class=\"wp-video\"><video class=\"wp-video-shortcode\" id=\"video-5-2\" width=\"1440\" height=\"1080\" preload=\"metadata\" controls=\"controls\"><source type=\"video\/mp4\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/PSYC2300-Chapter-1b1.mp4?_=2\" \/><a href=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/PSYC2300-Chapter-1b1.mp4\">https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/PSYC2300-Chapter-1b1.mp4<\/a><\/video><\/div>\n<p>video 1b2<\/p>\n<div style=\"width: 1440px;\" class=\"wp-video\"><video class=\"wp-video-shortcode\" id=\"video-5-3\" width=\"1440\" height=\"1080\" preload=\"metadata\" controls=\"controls\"><source type=\"video\/mp4\" src=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/PSYC2300-Chapter-1b2.mp4?_=3\" \/><a href=\"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/PSYC2300-Chapter-1b2.mp4\">https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-content\/uploads\/sites\/1469\/2021\/07\/PSYC2300-Chapter-1b2.mp4<\/a><\/video><\/div>\n<div class=\"glossary\"><span class=\"screen-reader-text\" id=\"definition\">definition<\/span><template id=\"term_5_33\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_5_33\"><div tabindex=\"-1\"><p>ways to summarize or organize data from a research study \u2013 essentially allowing us to describe what the data are<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_5_43\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_5_43\"><div tabindex=\"-1\"><p>analytical tools that allow us to draw conclusions based on data from a research study -- essentially allowing us to make a statement about what the data 
mean<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_5_45\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_5_45\"><div tabindex=\"-1\"><p>a quality or a quantity that is different for different individuals<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_5_46\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_5_46\"><div tabindex=\"-1\"><p>any possible number or category that a variable could take on<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_5_47\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_5_47\"><div tabindex=\"-1\"><p>a particular individual\u2019s value on the variable<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_5_49\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_5_49\"><div tabindex=\"-1\"><p>a way to summarize a dataset in table form, to organize the data and make it easy to get an overview of the dataset quickly<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_5_50\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_5_50\"><div tabindex=\"-1\"><p>variables that label or categorize something, and any numbers used to measure these variables are arbitrary and do not indicate quantity or size<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template 
id=\"term_5_51\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_5_51\"><div tabindex=\"-1\"><p>variables for which numbers are actually meaningful -- they indicate the size or amount of something<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_5_52\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_5_52\"><div tabindex=\"-1\"><p>a variable you manipulate -- most often it is categorical, or nominal<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_5_53\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_5_53\"><div tabindex=\"-1\"><p>a variable you measure to detect a difference\/change as a result of the manipulation -- most often it is numeric<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_5_55\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_5_55\"><div tabindex=\"-1\"><p>a graph for summarizing numeric data that essentially is a frequency table that has been turned on its side, with the added benefit of a visual representation of the frequency as the height of the bars in the graph, rather than just a number<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_5_56\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_5_56\"><div tabindex=\"-1\"><p>a frequency table that defines ranges of values in the first column, and reports the frequency of scores that fall within each range<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close 
definition<\/span><\/button><\/div><\/template><template id=\"term_5_57\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_5_57\"><div tabindex=\"-1\"><p>a descriptor of a distribution that indicates asymmetry, specifically with a low frequency tail leading off to the right<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_5_58\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_5_58\"><div tabindex=\"-1\"><p>a descriptor of a distribution that indicates asymmetry, specifically with a low frequency tail leading off to the left<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_5_60\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_5_60\"><div tabindex=\"-1\"><p>a descriptor of a distribution indicating that there is one peak, or a single collection of scores<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close definition<\/span><\/button><\/div><\/template><template id=\"term_5_61\"><div class=\"glossary__definition\" role=\"dialog\" data-id=\"term_5_61\"><div tabindex=\"-1\"><p>a descriptor of a distribution indicating that there are two peaks, or two collections of scores<\/p>\n<\/div><button><span aria-hidden=\"true\">&times;<\/span><span class=\"screen-reader-text\">Close 
definition<\/span><\/button><\/div><\/template><\/div>","protected":false},"author":1394,"menu_order":1,"template":"","meta":{"pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[48],"contributor":[],"license":[],"class_list":["post-5","chapter","type-chapter","status-publish","hentry","chapter-type-numberless"],"part":3,"_links":{"self":[{"href":"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-json\/pressbooks\/v2\/chapters\/5","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-json\/wp\/v2\/users\/1394"}],"version-history":[{"count":26,"href":"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-json\/pressbooks\/v2\/chapters\/5\/revisions"}],"predecessor-version":[{"id":1125,"href":"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-json\/pressbooks\/v2\/chapters\/5\/revisions\/1125"}],"part":[{"href":"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-json\/pressbooks\/v2\/parts\/3"}],"metadata":[{"href":"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-json\/pressbooks\/v2\/chapters\/5\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-json\/wp\/v2\/media?parent=5"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-json\/pressbooks\/v2\/chapter-type?post=5"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-json\/wp\/v2\/contributor?post=5"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/statspsych\/wp-json\/wp\/v2\/license?post=5"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}