{"id":1448,"date":"2019-07-30T17:01:27","date_gmt":"2019-07-30T21:01:27","guid":{"rendered":"https:\/\/pressbooks.bccampus.ca\/simplestats\/?post_type=chapter&#038;p=1448"},"modified":"2019-08-12T20:10:43","modified_gmt":"2019-08-13T00:10:43","slug":"1-5-discrete-and-continuous-variables","status":"publish","type":"chapter","link":"https:\/\/pressbooks.bccampus.ca\/simplestats\/chapter\/1-5-discrete-and-continuous-variables\/","title":{"raw":"1.5 Discrete and Continuous Variables","rendered":"1.5 Discrete and Continuous Variables"},"content":{"raw":"&nbsp;\r\n\r\nI will introduce a final useful typology by which variables can be grouped: discrete and continuous.\r\n\r\n&nbsp;\r\n\r\nBy definition, variables called <em>discrete<\/em> (note, not discreet!) have a finite number of categories (i.e., there is \"space\" between the categories, and nothing occupies that space), while variables called <em>continuous<\/em>\u00a0have a potentially infinite number of values (i.e., a value can exist between any two given values, so the \"spaces\" between values can be divided into smaller and smaller ones, <em>infinitely<\/em>). To make things easier to understand, and with more than a little risk of oversimplification, <strong>in a very broad sense you can think of nominal and ordinal variables as discrete and of interval\/ratio variables as continuous<\/strong>.[footnote]Technically speaking, in theory nominal and some ordinal variables are categorical, ordinal variables with numerical categories are discrete, and interval\/ratio variables are continuous. 
In practice, things are less clear cut.[\/footnote] For example, <em>hair colour, religious affiliation,<\/em> and <em>educational attainment<\/em> (as measured in educational degrees) are all discrete: they have a finite number of <em>discrete<\/em> categories.\r\n\r\n&nbsp;\r\n\r\nOn the other hand, age, income, and exam scores are all continuous: a number (value) can exist between any two given values, depending on how precise you want your measurement to be.\u00a0To take <em>age<\/em>, for example, if two people report being 20 and 22, respectively, it's obviously possible that another person is 21. However, we need not round to full years; between two people aged 21 and 22, a value of 21.5 (or 21 years and 6 months) is possible. Further, between the ages of 21 years and 21 years and 6 months, we can have a value of 21 years and 3 months, and so on, until we are down to counting days, then hours, then minutes, then seconds, then milliseconds, then microseconds, then nanoseconds, etc. The point is that, in theory, there is always another number between any two numbers (which can be represented by the possibility of an infinite number of digits after the decimal point). The same applies to income and exam scores.\r\n\r\n&nbsp;\r\n\r\nIn practice, however, things are different. In sociological\u00a0<span style=\"text-indent: 37.3333px;font-size: 14pt\">research\u00a0<\/span><span style=\"text-align: initial;text-indent: 2em;font-size: 14pt\">(as with other similar disciplines), the data collected is <em>empirically<\/em> discrete, as the values collected are finite in number and are typically rounded to whole numbers: we don't bother to measure age in anything but years, income in anything but dollars (not cents), etc. 
Still, w<\/span><span style=\"text-indent: 18.6667px;font-size: 14pt\">e usually call interval\/ratio variables continuous<\/span><span style=\"text-indent: 1em;font-size: 14pt\">, because of the <\/span><em style=\"text-indent: 1em;font-size: 14pt\">potential<\/em><span style=\"text-indent: 1em;font-size: 14pt\"> for an infinite number of values.<\/span>\r\n\r\n&nbsp;\r\n\r\nAt the same time, however, some ratio variables are truly discrete. Think, for example, about a measure called <em>number of children\u00a0<\/em>of the respondent. Clearly, there is no possibility of an infinite number of values, just as with any \"number of people\"-type variable: people can only be counted in whole numbers, and the count is always finite.\r\n\r\n&nbsp;\r\n\r\nAll this is undoubtedly confusing, so here is a practical tip for applied research, which is what you need to focus on. Regardless of whether a variable is discrete or continuous <em>in theory<\/em>, in practice all variables you will encounter in real-life datasets will be discrete. <strong>What we do is <em>treat<\/em> some variables as discrete, and other variables as continuous <em>for the purposes of statistical analysis<\/em><\/strong>. The rule of thumb is to make the differentiation based on the number of categories\/values: <strong><em>typically<\/em> nominal and ordinal variables have relatively few categories, so we treat them as discrete, while interval\/ratio variables <em>typically<\/em> have a relatively large number of values, so we treat them as continuous.<\/strong> If, however, an ordinal variable has a relatively large number of categories, it may be treated as continuous, and, on the flip side, if an interval\/ratio variable has relatively few values, it may be treated as discrete. 
Generally, and\u00a0<span style=\"text-indent: 18.6667px;font-size: 14pt\">assuming proper justification (i.e., the number of categories\/values),\u00a0<\/span><span style=\"text-indent: 1em;font-size: 14pt\">the decision to treat an ordinal variable as continuous or an interval\/ratio variable as discrete remains a matter of the researcher's discretion.<\/span>\r\n\r\n&nbsp;\r\n\r\n<span style=\"text-indent: 1em;font-size: 14pt\">Finally, what is the magic number in the \"relatively large number of categories\/values\" rule? This also depends, but from what I have seen in practice, the number is around 7-10 categories\/values in most cases (i.e., if a variable has more categories\/values than that, it is treated as continuous, and if it has fewer categories\/values than that, it is treated as discrete).<\/span>\r\n\r\n&nbsp;","rendered":"<p>&nbsp;<\/p>\n<p>I will introduce a final useful typology by which variables can be grouped: discrete and continuous.<\/p>\n<p>&nbsp;<\/p>\n<p>By definition, variables called <em>discrete<\/em> (note, not discreet!) have a finite number of categories (i.e., there is &#8220;space&#8221; between the categories, and nothing occupies that space), while variables called <em>continuous<\/em>\u00a0have a potentially infinite number of values (i.e., a value can exist between any two given values, so the &#8220;spaces&#8221; between values can be divided into smaller and smaller ones, <em>infinitely<\/em>). To make things easier to understand, and with more than a little risk of oversimplification, <strong>in a very broad sense you can think of nominal and ordinal variables as discrete and of interval\/ratio variables as continuous<\/strong>.<a class=\"footnote\" title=\"Technically speaking, in theory nominal and some ordinal variables are categorical, ordinal variables with numerical categories are discrete, and interval\/ratio variables are continuous. 
In practice, things are less clear cut.\" id=\"return-footnote-1448-1\" href=\"#footnote-1448-1\" aria-label=\"Footnote 1\"><sup class=\"footnote\">[1]<\/sup><\/a> For example, <em>hair colour, religious affiliation,<\/em> and <em>educational attainment<\/em> (as measured in educational degrees) are all discrete: they have a finite number of <em>discrete<\/em> categories.<\/p>\n<p>&nbsp;<\/p>\n<p>On the other hand, age, income, and exam scores are all continuous: a number (value) can exist between any two given values, depending on how precise you want your measurement to be.\u00a0To take <em>age<\/em>, for example, if two people report being 20 and 22, respectively, it&#8217;s obviously possible that another person is 21. However, we need not round to full years; between two people aged 21 and 22, a value of 21.5 (or 21 years and 6 months) is possible. Further, between the ages of 21 years and 21 years and 6 months, we can have a value of 21 years and 3 months, and so on, until we are down to counting days, then hours, then minutes, then seconds, then milliseconds, then microseconds, then nanoseconds, etc. The point is that, in theory, there is always another number between any two numbers (which can be represented by the possibility of an infinite number of digits after the decimal point). The same applies to income and exam scores.<\/p>\n<p>&nbsp;<\/p>\n<p>In practice, however, things are different. In sociological\u00a0<span style=\"text-indent: 37.3333px;font-size: 14pt\">research\u00a0<\/span><span style=\"text-align: initial;text-indent: 2em;font-size: 14pt\">(as with other similar disciplines), the data collected is <em>empirically<\/em> discrete, as the values collected are finite in number and are typically rounded to whole numbers: we don&#8217;t bother to measure age in anything but years, income in anything but dollars (not cents), etc. 
Still, w<\/span><span style=\"text-indent: 18.6667px;font-size: 14pt\">e usually call interval\/ratio variables continuous<\/span><span style=\"text-indent: 1em;font-size: 14pt\">, because of the <\/span><em style=\"text-indent: 1em;font-size: 14pt\">potential<\/em><span style=\"text-indent: 1em;font-size: 14pt\"> for an infinite number of values.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p>At the same time, however, some ratio variables are truly discrete. Think, for example, about a measure called <em>number of children\u00a0<\/em>of the respondent. Clearly, there is no possibility of an infinite number of values, just as with any &#8220;number of people&#8221;-type variable: people can only be counted in whole numbers, and the count is always finite.<\/p>\n<p>&nbsp;<\/p>\n<p>All this is undoubtedly confusing, so here is a practical tip for applied research, which is what you need to focus on. Regardless of whether a variable is discrete or continuous <em>in theory<\/em>, in practice all variables you will encounter in real-life datasets will be discrete. <strong>What we do is <em>treat<\/em> some variables as discrete, and other variables as continuous <em>for the purposes of statistical analysis<\/em><\/strong>. The rule of thumb is to make the differentiation based on the number of categories\/values: <strong><em>typically<\/em> nominal and ordinal variables have relatively few categories, so we treat them as discrete, while interval\/ratio variables <em>typically<\/em> have a relatively large number of values, so we treat them as continuous.<\/strong> If, however, an ordinal variable has a relatively large number of categories, it may be treated as continuous, and, on the flip side, if an interval\/ratio variable has relatively few values, it may be treated as discrete. 
Generally, and\u00a0<span style=\"text-indent: 18.6667px;font-size: 14pt\">assuming proper justification (i.e., the number of categories\/values),\u00a0<\/span><span style=\"text-indent: 1em;font-size: 14pt\">the decision to treat an ordinal variable as continuous or an interval\/ratio variable as discrete remains a matter of the researcher&#8217;s discretion.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"text-indent: 1em;font-size: 14pt\">Finally, what is the magic number in the &#8220;relatively large number of categories\/values&#8221; rule? This also depends, but from what I have seen in practice, the number is around 7-10 categories\/values in most cases (i.e., if a variable has more categories\/values than that, it is treated as continuous, and if it has fewer categories\/values than that, it is treated as discrete).<\/span><\/p>\n<p>&nbsp;<\/p>\n<hr class=\"before-footnotes clear\" \/><div class=\"footnotes\"><ol><li id=\"footnote-1448-1\">Technically speaking, in theory nominal and some ordinal variables are categorical, ordinal variables with numerical categories are discrete, and interval\/ratio variables are continuous. In practice, things are less clear cut. 
<a href=\"#return-footnote-1448-1\" class=\"return-footnote\" aria-label=\"Return to footnote 1\">&crarr;<\/a><\/li><\/ol><\/div>","protected":false},"author":533,"menu_order":8,"template":"","meta":{"pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-1448","chapter","type-chapter","status-publish","hentry"],"part":3,"_links":{"self":[{"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/pressbooks\/v2\/chapters\/1448","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/wp\/v2\/users\/533"}],"version-history":[{"count":5,"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/pressbooks\/v2\/chapters\/1448\/revisions"}],"predecessor-version":[{"id":1581,"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/pressbooks\/v2\/chapters\/1448\/revisions\/1581"}],"part":[{"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/pressbooks\/v2\/parts\/3"}],"metadata":[{"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/pressbooks\/v2\/chapters\/1448\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/wp\/v2\/media?parent=1448"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/pressbooks\/v2\/chapter-type?post=1448"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/wp\/v2\/contributor?post=1448"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/simplestats\/wp-json\/wp\/v2\/license?post=1448"}],"curies":[{"name":"wp","href":"https:\
/\/api.w.org\/{rel}","templated":true}]}}