{"id":317,"date":"2023-10-18T13:09:50","date_gmt":"2023-10-18T17:09:50","guid":{"rendered":"https:\/\/pressbooks.bccampus.ca\/businessanalytics\/chapter\/fun-soccer-worksheet\/"},"modified":"2023-10-18T17:09:49","modified_gmt":"2023-10-18T21:09:49","slug":"fun-soccer-worksheet","status":"publish","type":"chapter","link":"https:\/\/pressbooks.bccampus.ca\/businessanalytics\/chapter\/fun-soccer-worksheet\/","title":{"raw":"Data Wrangling: A case Study","rendered":"Data Wrangling: A case Study"},"content":{"raw":"<div class=\"container-fluid main-container\">\r\n<div id=\"header\" class=\"fluid-row\">\r\n<h1 class=\"date\">Case Study: What uniform pieces are we waiting for?<\/h1>\r\nA volunteer with a children's soccer team was trying to keep track of which child ordered which piece of equipment, what the size was, and whether the item was received.\u00a0 The team uses Excel, and the following Excel file was received:\r\n\r\n<\/div>\r\n&nbsp;\r\n<div id=\"header\" class=\"fluid-row\">\r\n\r\n[embed]https:\/\/onedrive.live.com\/embed?resid=3700FC25C4CB2CE9%21753&amp;authkey=%21AL3kRflgotiNEt8&amp;em=2&amp;wdAllowInteractivity=False&amp;wdHideGridlines=True&amp;wdHideHeaders=True&amp;wdDownloadButton=True&amp;wdInConfigurator=True&amp;wdInConfigurator=True[\/embed]\r\n\r\n<\/div>\r\n<div class=\"container-fluid main-container\">\r\n<div><\/div>\r\n<\/div>\r\n<div class=\"container-fluid main-container\">\r\n<div id=\"header\" class=\"fluid-row\">\r\n\r\n&nbsp;\r\n\r\n<\/div>\r\n[NOTE: INSERT EXCEL SHEET]\r\n\r\n[caption id=\"attachment_318\" align=\"aligncenter\" width=\"357\"]<img class=\" wp-image-318\" src=\"https:\/\/pressbooks.bccampus.ca\/businessanalytics\/wp-content\/uploads\/sites\/1653\/2023\/10\/soccer_excel-300x84.png\" alt=\"screen grab of an Excel sheet\" width=\"357\" height=\"100\" \/> This is the original wide data[\/caption]\r\n\r\nNote that this is a well organized table, but using it, it's difficult to see how many pairs of Youth Median (YM) black shorts are still outstanding.\u00a0 This is a job for data wrangling - organizing the data in a way that makes our work clearer.\r\n<h2>Tables should be long - not wide<\/h2>\r\n<\/div>\r\nThis table is short and wide.\u00a0 There is a row for each child (identified by their jersey number) and columns for each item, item size, and status (received or not).\u00a0 The solution in this case was to wrangle this data in R, creating a long but narrow table.\r\n\r\n&nbsp;\r\n<div class=\"container-fluid main-container\">\r\n<div id=\"open-the-libraries-we-will-need\" class=\"section level2\">\r\n<h2>Open the libraries we will need:<\/h2>\r\nIn the boxes you will find all of the R code needed for this project.\u00a0 Note that that this supposes that we have the `soccer.xlsx` file in our working directory\r\n<pre class=\"r\"><code>\u00a0<\/code><\/pre>\r\n<div class=\"textbox textbox--learning-objectives\">\r\n<div>\r\n<pre class=\"r\"><code>\r\nlibrary(dplyr) ## dplyr is for working with data frames \r\n<\/code><code>library(openxlsx) ## lets us liaise with Excel\r\nlibrary(stringr) ## Extract words to make it faster!\r\n\r\n## read in the data\u00a0\r\n<\/code>df &lt;- read.xlsx(\"soccer.xlsx\", sheet = \"Detailorders\", startRow = 2)<\/pre>\r\n<\/div>\r\n<\/div>\r\n<pre class=\"r\"><code><\/code><\/pre>\r\n<h2>## Inspecting the data<\/h2>\r\nExploratory Data Analysis (EDA) is a fancy word for what we naturally do:\u00a0 look at the data and explore.\u00a0 Here's what happens when we look around.\u00a0 First we look at what the structure of the data is:\r\n\r\n<\/div>\r\n<div id=\"basic-data\" class=\"section level2\">\r\n<pre class=\"r\"><code><\/code><\/pre>\r\n<div class=\"textbox textbox--learning-objectives\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\"><span style=\"color: #ffffff\"><code><span style=\"color: #ffffff\">df %&gt;% str()<\/span><\/code><\/span><\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<pre class=\"r\"><code><\/code><\/pre>\r\n<pre><code><\/code><\/pre>\r\n<pre><code class=\"hljs\">## 'data.frame':    12 obs. of  28 variables:\r\n##  $ Player               : chr  \"1\" \"3\" \"5\" \"8\" ...\r\n##  $ Training.Jersey.Size : chr  \"YL\" \"YM\" \"YL\" \"YL\" ...\r\n##  $ Jersey.Received      : chr  \"Yes\" \"Yes\" \"Yes\" \"Yes\" ...\r\n##  $ Training.Short       : chr  \"YL\" \"YL\" \"YL\" \"YM\" ...\r\n##  $ Shorts.Received      : chr  \"Yes\" \"Yes\" \"Yes\" \"Yes\" ...\r\n##  $ Training.Sock        : chr  \"SM\" \"XS\" \"SM\" \"SM\" ...\r\n##  $ Socks.Received       : chr  \"Yes\" \"Yes\" \"Yes\" \"Yes\" ...\r\n##  $ Jersey.Grey          : chr  NA \"YL\" \"YXL\" \"YL\" ...\r\n##  $ Jersey.Grey.Received : chr  NA \"No\" \"No\" \"No\" ...\r\n##  $ Jersey.White         : chr  NA \"YL\" \"YXL\" \"YL\" ...\r\n##  $ Jersey.White.Received: chr  NA \"No\" \"No\" \"No\" ...\r\n##  $ Short.Black          : chr  NA \"YL\" \"YL\" \"YM\" ...\r\n##  $ Short.Received       : chr  NA \"No\" \"No\" \"No\" ...\r\n##  $ Black.Sock           : chr  NA \"XS\" \"SM\" \"SM\" ...\r\n##  $ Black.Sock.Received  : chr  NA \"No\" \"No\" \"No\" ...\r\n##  $ Track.Jacket         : chr  \"YL\" \"YL\" \"YXL\" \"YL\" ...\r\n##  $ Track.Jacket.Received: chr  \"No\" \"No\" \"No\" \"No\" ...\r\n##  $ Track.Pant           : chr  \"YL\" \"YL\" \"YL\" \"YL\" ...\r\n##  $ Track.Pant.Received  : chr  \"No\" \"No\" \"No\" \"No\" ...\r\n##  $ Cinch.Pack           : chr  \"Ordered \" \"Ordered \" \"Ordered \" \"Ordered \" ...\r\n##  $ Cinch.Pack.Received  : chr  \"No\" \"No\" \"No\" \"No\" ...\r\n##  $ GK.Jersey            : chr  \"YL\" NA NA NA ...\r\n##  $ GK.Jersey.Received   : chr  \"No\" \"No\" \"No\" \"No\" ...\r\n##  $ GK.Short             : chr  \"YL\" NA NA NA ...\r\n##  $ GK.Short.Received    : chr  \"No\" \"No\" \"No\" \"No\" ...\r\n##  $ GK.Socks             : chr  \"SM\" NA NA NA ...\r\n##  $ GK.Sock.Received     : chr  \"No\" \"No\" \"No\" \"No\" ...\r\n##  $ Comments             : chr  NA NA NA \" \" ...<\/code><\/pre>\r\n<pre><code><\/code><\/pre>\r\n<\/div>\r\n<\/div>\r\n<pre class=\"r\"><code><\/code><\/pre>\r\n<code>\u00a0<\/code>\r\n\r\nWe will also take a minute to look at the data frame that we have:\r\n<div class=\"textbox textbox--learning-objectives\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\"><code><span style=\"color: #ffffff\">df<\/span> <\/code><\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<div class=\"js-comment-actions comment-actions\">\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\">\r\n<div class=\"js-comment-actions comment-actions\">\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>##    Player Training.Jersey.Size Jersey.Received Training.Short Shorts.Received\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 1       1                   YL             Yes             YL             Yes\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 2       3                   YM             Yes             YL             Yes\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 3       5                   YL             Yes             YL             Yes\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 4       8                   YL             Yes             YM             Yes\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 5       9                  YXL              No             YL             Yes\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 6      10                   YM             Yes             YM             Yes\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 7       2                   YL             Yes             YL             Yes\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 8       4                  YXL             Yes            YXL             Yes\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 9       6                   YL             Yes            YXL             Yes\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 10      7                  YXL              No            YXL             Yes\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 11     11                  YXL              No            YXL             Yes\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 12     12                   YL             Yes             YL             Yes\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>##    Training.Sock Socks.Received Jersey.Grey Jersey.Grey.Received Jersey.White\r\n## 1             SM            Yes        &lt;NA&gt;                 &lt;NA&gt;         &lt;NA&gt;\r\n## 2             XS            Yes          YL                   No           YL\r\n## 3             SM            Yes         YXL                   No          YXL\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 4             SM            Yes          YL                   No           YL\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 5             SM            Yes         YXL                   No          YXL\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 6             XS            Yes          YL                   No           YL\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 7            MED             No         YXL                   No          YXL\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 8            MED             No         YXL                   No          YXL\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 9            MED             No         YXL                   No          YXL\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 10           MED             No         YXL                   No          YXL\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 11           MED             No         YXL                   No          YXL\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 12            SM             No          YL                   No           YL\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>##    Jersey.White.Received Short.Black Short.Received Black.Sock\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 1                   &lt;NA&gt;        &lt;NA&gt;           &lt;NA&gt;       &lt;NA&gt;\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 2                     No          YL             No         XS\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 3                     No          YL             No         SM\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 4                     No          YM             No         SM\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 5                     No          YL             No         SM\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 6                     No          YM             No         XS\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 7                     No          YL             No        MED\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 8                     No         YXL             No        MED\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 9                     No         YXL             No        MED\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 10                    No         YXL             No        MED\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 11                    No         YXL             No        MED\r\n<\/code><\/div>\r\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 12                    No          YL             No         SM\r\n##    Black.Sock.Received Track.Jacket Track.Jacket.Received Track.Pant\r\n## 1                 &lt;NA&gt;           YL                    No         YL\r\n## 2                   No           YL                    No         YL\r\n## 3                   No          YXL                    No         YL\r\n## 4                   No           YL                    No         YL\r\n## 5                   No           YL                    No         YL\r\n## 6                   No           YL                    No         YM\r\n## 7                   No           YL                    No         YL\r\n## 8                   No           YL                    No         YL\r\n## 9                   No          YXL                    No         YL\r\n## 10                  No          YXL                    No         YL\r\n## 11                  No          YXL                    No        YXL\r\n## 12                  No           YM                    No         YL\r\n##    Track.Pant.Received Cinch.Pack Cinch.Pack.Received GK.Jersey\r\n## 1                   No   Ordered                   No        YL\r\n## 2                   No   Ordered                   No      &lt;NA&gt;\r\n## 3                   No   Ordered                   No      &lt;NA&gt;\r\n## 4                   No   Ordered                   No      &lt;NA&gt;\r\n## 5                   No   Ordered                   No      &lt;NA&gt;\r\n## 6                   No   Ordered                   No      &lt;NA&gt;\r\n## 7                   No   Ordered                   No      &lt;NA&gt;\r\n## 8                   No   Ordered                   No      &lt;NA&gt;\r\n## 9                   No   Ordered                   No      &lt;NA&gt;\r\n## 10                  No   Ordered                   No      &lt;NA&gt;\r\n## 11                  No   Ordered                   No      &lt;NA&gt;\r\n## 12                  No   Ordered                   No      &lt;NA&gt;\r\n##    GK.Jersey.Received GK.Short GK.Short.Received GK.Socks GK.Sock.Received\r\n## 1                  No       YL                No       SM               No\r\n## 2                  No     &lt;NA&gt;                No     &lt;NA&gt;               No\r\n## 3                  No     &lt;NA&gt;                No     &lt;NA&gt;               No\r\n## 4                  No     &lt;NA&gt;                No     &lt;NA&gt;               No\r\n## 5                  No     &lt;NA&gt;                No     &lt;NA&gt;               No\r\n## 6                  No     &lt;NA&gt;                No     &lt;NA&gt;               No\r\n## 7                  No     &lt;NA&gt;                No     &lt;NA&gt;               No\r\n## 8                  No     &lt;NA&gt;                No     &lt;NA&gt;               No\r\n## 9                  No     &lt;NA&gt;                No     &lt;NA&gt;               No\r\n## 10                 No     &lt;NA&gt;                No     &lt;NA&gt;               No\r\n## 11                 No     &lt;NA&gt;                No     &lt;NA&gt;               No\r\n## 12                 No     &lt;NA&gt;                No     &lt;NA&gt;               No\r\n##                                    Comments\r\n## 1                                      &lt;NA&gt;\r\n## 2                                      &lt;NA&gt;\r\n## 3                                      &lt;NA&gt;\r\n## 4\r\n## 5                     Never received jersey\r\n## 6                                      &lt;NA&gt;\r\n## 7                         Need size 2 socks\r\n## 8                         Need size 2 socks\r\n## 9                         Need size 2 socks\r\n## 10  Never received jersey , Need Siz 2 sock\r\n## 11 Never received jersey, need size 2 socks\r\n## 12                        Need size 2 socks<\/code><\/div>\r\n<\/div>\r\n<div class=\"comment-text js-comment-text-and-form\">\r\n<div class=\"comment-body js-comment-edit-hide\"><span class=\"comment-copy\"><code><\/code> <\/span><\/div>\r\n<\/div>\r\n<\/div>\r\n<\/div>\r\n<div class=\"comment-text js-comment-text-and-form\">\r\n<div class=\"comment-body js-comment-edit-hide\"><span class=\"comment-copy\"><code><\/code> <\/span><\/div>\r\n<\/div>\r\n<div class=\"dataTables_scrollHead\">\r\n<div class=\"dataTables_scrollHeadInner\"><\/div>\r\n<\/div>\r\n<div class=\"dataTables_scrollBody\"><\/div>\r\n<\/div>\r\n<\/div>\r\n&nbsp;\r\n<pre class=\"r\"><code><\/code><\/pre>\r\n<\/div>\r\n<div id=\"reshape-the-data\" class=\"section level2\">\r\n<h2>Reshape the data:<\/h2>\r\nWe want to create a table that is narrow (not too many columns) and long (many rows).\u00a0 The way to do this is to create a row for each piece of equipment ordered.\u00a0 In the end we should have data that looks like this:\r\n<table class=\"lines aligncenter\">\r\n<thead>\r\n<tr class=\"header\">\r\n<th style=\"width: 175px;text-align: center\">Player<\/th>\r\n<th style=\"width: 160px;text-align: center\">Item<\/th>\r\n<th style=\"width: 128px;text-align: center\">Size<\/th>\r\n<th style=\"width: 225px;text-align: center\">Received<\/th>\r\n<\/tr>\r\n<\/thead>\r\n<tbody>\r\n<tr class=\"odd\">\r\n<td style=\"width: 176px;text-align: center\">1<\/td>\r\n<td style=\"width: 161px;text-align: center\">Jersey<\/td>\r\n<td style=\"width: 129px;text-align: center\">YL<\/td>\r\n<td style=\"width: 225px;text-align: center\">Yes<\/td>\r\n<\/tr>\r\n<tr class=\"even\">\r\n<td style=\"width: 176px;text-align: center\">7<\/td>\r\n<td style=\"width: 161px;text-align: center\">Socks<\/td>\r\n<td style=\"width: 129px;text-align: center\">Med<\/td>\r\n<td style=\"width: 225px;text-align: center\">No<\/td>\r\n<\/tr>\r\n<\/tbody>\r\n<\/table>\r\nThere are many ways to do this, but here is one method:\r\n<pre class=\"r\"><code><\/code><\/pre>\r\n<div class=\"textbox textbox--learning-objectives\"><header class=\"textbox__header\">\r\n<h2 class=\"textbox__title\"><span style=\"color: #ffffff\"><code><span style=\"color: #ffffff\">## make a string of names<\/span> <\/code><\/span><\/h2>\r\n<p class=\"textbox__title\"><span style=\"color: #ffffff\"><code><span style=\"color: #ffffff\">col_names &lt;- colnames(df)<\/span> <\/code><\/span><\/p>\r\n\r\n<h2 class=\"textbox__title\"><span style=\"color: #ffffff\"><code><span style=\"color: #ffffff\">## cut the entry level stuff:<\/span> <\/code><\/span><\/h2>\r\n<p class=\"textbox__title\"><span style=\"color: #ffffff\"><code><span style=\"color: #ffffff\">interm_list &lt;- col_names[6:31]<\/span><\/code><\/span><\/p>\r\n<p class=\"textbox__title\"><code><span style=\"color: #ffffff\"> interm_list<\/span><\/code><\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<pre class=\"r\"><code>\u00a0<\/code><\/pre>\r\n<pre><code>##  [1] \"Training.Jersey.Size\"  \"Jersey.Received\"      \r\n##  [3] \"Training.Short\"        \"Shorts.Received\"      \r\n##  [5] \"Training.Sock\"         \"Socks.Received\"       \r\n##  [7] \"Jersey.Grey\"           \"Jersey.Grey.Received\" \r\n##  [9] \"Jersey.White\"          \"Jersey.White.Received\"\r\n## [11] \"Short.Black\"           \"Short.Received\"       \r\n## [13] \"Black.Sock\"            \"Black.Sock.Received\"  \r\n## [15] \"Track.Jacket\"          \"Track.Jacket.Received\"\r\n## [17] \"Track.Pant\"            \"Track.Pant.Received\"  \r\n## [19] \"Cinch.Pack\"            \"Cinch.Pack.Received\"  \r\n## [21] \"GK.Jersey\"             \"GK.Jersey.Received\"   \r\n## [23] \"GK.Short\"              \"GK.Short.Received\"    \r\n## [25] \"GK.Socks\"              \"GK.Sock.Received\"<\/code><\/pre>\r\n<\/div>\r\n<\/div>\r\n<pre class=\"r\"><code><\/code><\/pre>\r\n&nbsp;\r\n<pre class=\"r\"><span style=\"color: #ffffff\"><code>\u00a0<\/code><\/span><\/pre>\r\n<pre><span style=\"color: #ffffff\"><code><\/code><\/span><\/pre>\r\n<pre class=\"r\"><span style=\"color: #ffffff\"><code><\/code><\/span><\/pre>\r\n<div class=\"textbox textbox--learning-objectives\" style=\"text-align: left\"><header class=\"textbox__header\"><span style=\"color: #ffffff\"><code><span style=\"color: #ffffff\">## What are the sizes?<\/span><\/code><\/span>\r\n<p class=\"textbox__title\"><code><span style=\"color: #ffffff\">size_list &lt;- c() <\/span><\/code><\/p>\r\n<p class=\"textbox__title\"><code><span style=\"color: #ffffff\">rec_list &lt;- c() <\/span><\/code><\/p>\r\n<p class=\"textbox__title\"><code><span style=\"color: #ffffff\">for (i in 1:13){ <\/span><\/code><\/p>\r\n<p class=\"textbox__title\"><code><span style=\"color: #ffffff\">\u00a0\u00a0\u00a0 size_list &lt;- c(size_list,interm_list[2*i-1]) <\/span><\/code><\/p>\r\n<p class=\"textbox__title\">\u00a0 \u00a0\u00a0 \u00a0\u00a0 \u00a0\u00a0 <code><span style=\"color: #ffffff\">rec_list &lt;- c(rec_list,interm_list[2*i]) <\/span><\/code><\/p>\r\n<p class=\"textbox__title\"><code><span style=\"color: #ffffff\">} <\/span><\/code><\/p>\r\n<p class=\"textbox__title\"><code><span style=\"color: #ffffff\">item_list &lt;- str_remove(rec_list,\".Received\") <\/span><\/code><\/p>\r\n\r\n<h2 class=\"textbox__title\"><code><span style=\"color: #ffffff\">## Make a blank dataframe<\/span><\/code><\/h2>\r\n<p class=\"textbox__title\"><code><span style=\"color: #ffffff\">Soccer_df &lt;- data.frame(\"Player\"=numeric(), \"Item\"= character(), \"Size\"=character(), \"Recieved\"=character())<\/span><\/code><\/p>\r\n<span style=\"color: #ffffff\"><code><span style=\"color: #ffffff\">for (i in 1:13){ <\/span><\/code><\/span>\r\n<p style=\"padding-left: 40px\"><span style=\"color: #ffffff\"><code><span style=\"color: #ffffff\">newdf &lt;- df %&gt;% <\/span><\/code><\/span><\/p>\r\n<p style=\"padding-left: 80px\"><span style=\"color: #ffffff\"><code><span style=\"color: #ffffff\">select(Player, size_list[i], rec_list[i]) <\/span><\/code><\/span><\/p>\r\n<p style=\"padding-left: 40px\"><span style=\"color: #ffffff\"><code><span style=\"color: #ffffff\">names(newdf) &lt;- c(\"Player\", \"Size\", \"Received\")<\/span> <\/code><\/span><\/p>\r\n<p style=\"padding-left: 40px\"><span style=\"color: #ffffff\"><code><span style=\"color: #ffffff\">newdf$Item &lt;- item_list[i] newdf &lt;- newdf %&gt;% <\/span><\/code><\/span><\/p>\r\n<p style=\"padding-left: 80px\"><span style=\"color: #ffffff\"><code><span style=\"color: #ffffff\">select(Player, Item, Size, Received) <\/span><\/code><\/span><\/p>\r\n<p style=\"padding-left: 40px\"><span style=\"color: #ffffff\"><code><span style=\"color: #ffffff\">Soccer_df &lt;- rbind(Soccer_df, newdf) <\/span><\/code><\/span><\/p>\r\n<p style=\"padding-left: 40px\"><span style=\"color: #ffffff\"><code><span style=\"color: #ffffff\">}<\/span><\/code><\/span><\/p>\r\n\r\n<h2><code><span style=\"color: #ffffff\">Inspect the first 5 rows:\r\n<\/span><\/code><\/h2>\r\n<code><span style=\"color: #ffffff\">Soccer_df %&gt;% head()<\/span><\/code>\r\n\r\n&nbsp;\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<pre class=\"r\"><code>\r\n<\/code><\/pre>\r\n<pre><code>##   Player   Item Size Received\r\n## 1      1 Jersey   YL      Yes\r\n## 2      3 Jersey   YM      Yes\r\n## 3      5 Jersey   YL      Yes\r\n## 4      8 Jersey   YL      Yes\r\n## 5      9 Jersey  YXL       No\r\n## 6     10 Jersey   YM      Yes<\/code><\/pre>\r\n<\/div>\r\n<div class=\"textbox__content\"><\/div>\r\n<\/div>\r\n&nbsp;\r\n<pre class=\"r\"><code>\u00a0<\/code><code>\r\n\r\n<\/code><code><\/code><\/pre>\r\nNow we will look at grouping the data.\r\n\r\nI filtered out anything where the size was \u201cNA\u201d, as we assume that means that there is no item ordered.\r\n<pre class=\"r\"><code><\/code><\/pre>\r\n<div class=\"textbox textbox--learning-objectives\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\"><code><span style=\"color: #ffffff\">summary &lt;- Soccer_df %&gt;%\r\nfilter(!is.na(Size)) <\/span><\/code>\r\n\r\n<p class=\"textbox__title\" style=\"padding-left: 40px\"><code><span style=\"color: #ffffff\">%&gt;%\r\ngroup_by(Item, Size, Received) <\/span><\/code>\r\n\r\n<p class=\"textbox__title\" style=\"padding-left: 40px\"><code><span style=\"color: #ffffff\">%&gt;%\r\nsummarize(Count = n())<\/span><\/code>\r\n\r\n&nbsp;\r\n\r\n<p class=\"textbox__title\"><code><span style=\"color: #ffffff\">to_save &lt;- Soccer_df <\/span><\/code><\/p>\r\n<p class=\"textbox__title\" style=\"padding-left: 40px\"><code><span style=\"color: #ffffff\">%&gt;%\r\nfilter(!is.na(Size)) <\/span><\/code>\r\n\r\n<p class=\"textbox__title\" style=\"padding-left: 40px\"><code><span style=\"color: #ffffff\">%&gt;%\r\ngroup_by(Item, Size, Received) <\/span><\/code>\r\n\r\n<p class=\"textbox__title\" style=\"padding-left: 40px\"><code><span style=\"color: #ffffff\">%&gt;%\r\nsummarize(Count = n()) <\/span><\/code>\r\n\r\n<p class=\"textbox__title\" style=\"padding-left: 40px\"><code><span style=\"color: #ffffff\">%&gt;%\r\nfilter(Received == \"No\")<\/span><\/code>\r\n\r\n&nbsp;\r\n\r\n&nbsp;\r\n\r\n<p class=\"textbox__title\"><code><span style=\"color: #ffffff\">to_save<\/span><\/code><\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<pre class=\"r\"><code>\u00a0<\/code><\/pre>\r\n<pre><code>## # A tibble: 23 x 4\r\n## # Groups:   Item, Size [23]\r\n##    Item        Size       Received Count\r\n##    &lt;chr&gt;       &lt;chr&gt;      &lt;chr&gt;    &lt;int&gt;\r\n##  1 Black.Sock  MED        No           5\r\n##  2 Black.Sock  SM         No           4\r\n##  3 Black.Sock  XS         No           2\r\n##  4 Cinch.Pack  \"Ordered \" No          12\r\n##  5 GK.Jersey   YL         No           1\r\n##  6 GK.Short    YL         No           1\r\n##  7 GK.Sock     SM         No           1\r\n##  8 Jersey      YXL        No           3\r\n##  9 Jersey.Grey YL         No           4\r\n## 10 Jersey.Grey YXL        No           7\r\n## # ... with 13 more rows<\/code><\/pre>\r\n<\/div>\r\n<\/div>\r\n<pre class=\"r\"><code><\/code><\/pre>\r\n&nbsp;\r\n<pre class=\"r\"><code><\/code><\/pre>\r\n<\/div>\r\n<div id=\"save\" class=\"section level2\">\r\n<h2>Save<\/h2>\r\nFinally, we cansave that file as a new Excel sheet - or you could mkae it an old Excel Sheet, whatever!\r\n<div class=\"textbox textbox--learning-objectives\"><header class=\"textbox__header\">\r\n<p class=\"textbox__title\">Learning Objectives<\/p>\r\n\r\n<\/header>\r\n<div class=\"textbox__content\">\r\n<pre class=\"r\"><code>file_name &lt;- paste0(\"Soccer_\",Sys.Date(),\".xlsx\")\r\nfile_name<\/code><\/pre>\r\n<pre><code>## [1] \"Soccer_2022-04-08.xlsx\"<\/code><\/pre>\r\n<pre class=\"r\"><code>wb &lt;- createWorkbook(\"Soccer_Gear\")\r\n\r\naddWorksheet(wb, sheetName = \"original\")\r\nwriteData(wb, sheet = \"original\", x = df)\r\n\r\naddWorksheet(wb, sheetName = \"Not_Received\")\r\nwriteData(wb, sheet = \"Not_Received\", x = to_save)\r\n\r\naddWorksheet(wb, sheetName = \"Summary\")\r\nwriteData(wb, sheet = \"Summary\", x = summary)\r\n\r\n\r\n\r\nsaveWorkbook(wb, file_name, overwrite = TRUE)<\/code><\/pre>\r\n<pre><code><\/code><\/pre>\r\n<\/div>\r\n<\/div>\r\n&nbsp;\r\n\r\n<code><\/code>\r\n\r\n<\/div>\r\n<\/div>\r\n<\/div>","rendered":"<div class=\"container-fluid main-container\">\n<div id=\"header\" class=\"fluid-row\">\n<h1 class=\"date\">Case Study: What uniform pieces are we waiting for?<\/h1>\n<p>A volunteer with a children&#8217;s soccer team was trying to keep track of which child ordered which piece of equipment, what the size was, and whether the item was received.\u00a0 The team uses Excel, and the following Excel file was received:<\/p>\n<\/div>\n<p>&nbsp;<\/p>\n<div id=\"header\" class=\"fluid-row\">\n<p><a href=\"https:\/\/onedrive.live.com\/embed?resid=3700FC25C4CB2CE9%21753&#38;authkey=%21AL3kRflgotiNEt8&#38;em=2&#38;wdAllowInteractivity=False&#38;wdHideGridlines=True&#38;wdHideHeaders=True&#38;wdDownloadButton=True&#38;wdInConfigurator=True&#38;wdInConfigurator=True\">https:\/\/onedrive.live.com\/embed?resid=3700FC25C4CB2CE9%21753&amp;authkey=%21AL3kRflgotiNEt8&amp;em=2&amp;wdAllowInteractivity=False&amp;wdHideGridlines=True&amp;wdHideHeaders=True&amp;wdDownloadButton=True&amp;wdInConfigurator=True&amp;wdInConfigurator=True<\/a><\/p>\n<\/div>\n<div class=\"container-fluid main-container\">\n<div><\/div>\n<\/div>\n<div class=\"container-fluid main-container\">\n<div id=\"header\" class=\"fluid-row\">\n<p>&nbsp;<\/p>\n<\/div>\n<p>[NOTE: INSERT EXCEL SHEET]<\/p>\n<figure id=\"attachment_318\" aria-describedby=\"caption-attachment-318\" style=\"width: 357px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"wp-image-318\" src=\"https:\/\/pressbooks.bccampus.ca\/businessanalytics\/wp-content\/uploads\/sites\/1653\/2023\/10\/soccer_excel-300x84.png\" alt=\"screen grab of an Excel sheet\" width=\"357\" height=\"100\" srcset=\"https:\/\/pressbooks.bccampus.ca\/businessanalytics\/wp-content\/uploads\/sites\/1653\/2023\/10\/soccer_excel-300x84.png 300w, https:\/\/pressbooks.bccampus.ca\/businessanalytics\/wp-content\/uploads\/sites\/1653\/2023\/10\/soccer_excel-1024x287.png 1024w, https:\/\/pressbooks.bccampus.ca\/businessanalytics\/wp-content\/uploads\/sites\/1653\/2023\/10\/soccer_excel-768x215.png 768w, https:\/\/pressbooks.bccampus.ca\/businessanalytics\/wp-content\/uploads\/sites\/1653\/2023\/10\/soccer_excel-1536x430.png 1536w, https:\/\/pressbooks.bccampus.ca\/businessanalytics\/wp-content\/uploads\/sites\/1653\/2023\/10\/soccer_excel-2048x573.png 2048w, https:\/\/pressbooks.bccampus.ca\/businessanalytics\/wp-content\/uploads\/sites\/1653\/2023\/10\/soccer_excel-65x18.png 65w, https:\/\/pressbooks.bccampus.ca\/businessanalytics\/wp-content\/uploads\/sites\/1653\/2023\/10\/soccer_excel-225x63.png 225w, https:\/\/pressbooks.bccampus.ca\/businessanalytics\/wp-content\/uploads\/sites\/1653\/2023\/10\/soccer_excel-350x98.png 350w\" sizes=\"auto, (max-width: 357px) 100vw, 357px\" \/><figcaption id=\"caption-attachment-318\" class=\"wp-caption-text\">This is the original wide data<\/figcaption><\/figure>\n<p>Note that this is a well organized table, but using it, it&#8217;s difficult to see how many pairs of Youth Median (YM) black shorts are still outstanding.\u00a0 This is a job for data wrangling &#8211; organizing the data in a way that makes our work clearer.<\/p>\n<h2>Tables should be long &#8211; not wide<\/h2>\n<\/div>\n<p>This table is short and wide.\u00a0 There is a row for each child (identified by their jersey number) and columns for each item, item size, and status (received or not).\u00a0 The solution in this case was to wrangle this data in R, creating a long but narrow table.<\/p>\n<p>&nbsp;<\/p>\n<div class=\"container-fluid main-container\">\n<div id=\"open-the-libraries-we-will-need\" class=\"section level2\">\n<h2>Open the libraries we will need:<\/h2>\n<p>In the boxes you will find all of the R code needed for this project.\u00a0 Note that that this supposes that we have the `soccer.xlsx` file in our working directory<\/p>\n<pre class=\"r\"><code>\u00a0<\/code><\/pre>\n<div class=\"textbox textbox--learning-objectives\">\n<div>\n<pre class=\"r\"><code>\r\nlibrary(dplyr) ## dplyr is for working with data frames \r\n<\/code><code>library(openxlsx) ## lets us liaise with Excel\r\nlibrary(stringr) ## Extract words to make it faster!\r\n\r\n## read in the data\u00a0\r\n<\/code>df &lt;- read.xlsx(\"soccer.xlsx\", sheet = \"Detailorders\", startRow = 2)<\/pre>\n<\/div>\n<\/div>\n<pre class=\"r\"><code><\/code><\/pre>\n<h2>## Inspecting the data<\/h2>\n<p>Exploratory Data Analysis (EDA) is a fancy word for what we naturally do:\u00a0 look at the data and explore.\u00a0 Here&#8217;s what happens when we look around.\u00a0 First we look at what the structure of the data is:<\/p>\n<\/div>\n<div id=\"basic-data\" class=\"section level2\">\n<pre class=\"r\"><code><\/code><\/pre>\n<div class=\"textbox textbox--learning-objectives\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\"><span style=\"color: #ffffff\"><code><span style=\"color: #ffffff\">df %&gt;% str()<\/span><\/code><\/span><\/p>\n<\/header>\n<div class=\"textbox__content\">\n<pre class=\"r\"><code><\/code><\/pre>\n<pre><code><\/code><\/pre>\n<pre><code class=\"hljs\">## 'data.frame':    12 obs. of  28 variables:\r\n##  $ Player               : chr  \"1\" \"3\" \"5\" \"8\" ...\r\n##  $ Training.Jersey.Size : chr  \"YL\" \"YM\" \"YL\" \"YL\" ...\r\n##  $ Jersey.Received      : chr  \"Yes\" \"Yes\" \"Yes\" \"Yes\" ...\r\n##  $ Training.Short       : chr  \"YL\" \"YL\" \"YL\" \"YM\" ...\r\n##  $ Shorts.Received      : chr  \"Yes\" \"Yes\" \"Yes\" \"Yes\" ...\r\n##  $ Training.Sock        : chr  \"SM\" \"XS\" \"SM\" \"SM\" ...\r\n##  $ Socks.Received       : chr  \"Yes\" \"Yes\" \"Yes\" \"Yes\" ...\r\n##  $ Jersey.Grey          : chr  NA \"YL\" \"YXL\" \"YL\" ...\r\n##  $ Jersey.Grey.Received : chr  NA \"No\" \"No\" \"No\" ...\r\n##  $ Jersey.White         : chr  NA \"YL\" \"YXL\" \"YL\" ...\r\n##  $ Jersey.White.Received: chr  NA \"No\" \"No\" \"No\" ...\r\n##  $ Short.Black          : chr  NA \"YL\" \"YL\" \"YM\" ...\r\n##  $ Short.Received       : chr  NA \"No\" \"No\" \"No\" ...\r\n##  $ Black.Sock           : chr  NA \"XS\" \"SM\" \"SM\" ...\r\n##  $ Black.Sock.Received  : chr  NA \"No\" \"No\" \"No\" ...\r\n##  $ Track.Jacket         : chr  \"YL\" \"YL\" \"YXL\" \"YL\" ...\r\n##  $ Track.Jacket.Received: chr  \"No\" \"No\" \"No\" \"No\" ...\r\n##  $ Track.Pant           : chr  \"YL\" \"YL\" \"YL\" \"YL\" ...\r\n##  $ Track.Pant.Received  : chr  \"No\" \"No\" \"No\" \"No\" ...\r\n##  $ Cinch.Pack           : chr  \"Ordered \" \"Ordered \" \"Ordered \" \"Ordered \" ...\r\n##  $ Cinch.Pack.Received  : chr  \"No\" \"No\" \"No\" \"No\" ...\r\n##  $ GK.Jersey            : chr  \"YL\" NA NA NA ...\r\n##  $ GK.Jersey.Received   : chr  \"No\" \"No\" \"No\" \"No\" ...\r\n##  $ GK.Short             : chr  \"YL\" NA NA NA ...\r\n##  $ GK.Short.Received    : chr  \"No\" \"No\" \"No\" \"No\" ...\r\n##  $ GK.Socks             : chr  \"SM\" NA NA NA ...\r\n##  $ GK.Sock.Received     : chr  \"No\" \"No\" \"No\" \"No\" ...\r\n##  $ Comments             : chr  NA NA NA \" \" ...<\/code><\/pre>\n<pre><code><\/code><\/pre>\n<\/div>\n<\/div>\n<pre class=\"r\"><code><\/code><\/pre>\n<p><code>\u00a0<\/code><\/p>\n<p>We will also take a minute to look at the data frame that we have:<\/p>\n<div class=\"textbox textbox--learning-objectives\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\"><code><span style=\"color: #ffffff\">df<\/span> <\/code><\/p>\n<\/header>\n<div class=\"textbox__content\">\n<div class=\"js-comment-actions comment-actions\">\n<div class=\"comment-score js-comment-score js-comment-edit-hide\">\n<div class=\"js-comment-actions comment-actions\">\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>##    Player Training.Jersey.Size Jersey.Received Training.Short Shorts.Received<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 1       1                   YL             Yes             YL             Yes<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 2       3                   YM             Yes             YL             Yes<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 3       5                   YL             Yes             YL             Yes<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 4       8                   YL             Yes             YM             Yes<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 5       9                  YXL              No             YL             Yes<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 6      10                   YM             Yes             YM             Yes<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 7       2                   YL             Yes             YL             Yes<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 8       4                  YXL             Yes            YXL             Yes<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 9       6                   YL             Yes            YXL             Yes<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 10      7                  YXL              No            YXL             Yes<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 11     11                  YXL              No            YXL             Yes<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 12     12                   YL             Yes             YL             Yes<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>##    Training.Sock Socks.Received Jersey.Grey Jersey.Grey.Received Jersey.White<br \/>\n## 1             SM            Yes        &lt;NA&gt;                 &lt;NA&gt;         &lt;NA&gt;<br \/>\n## 2             XS            Yes          YL                   No           YL<br \/>\n## 3             SM            Yes         YXL                   No          YXL<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 4             SM            Yes          YL                   No           YL<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 5             SM            Yes         YXL                   No          YXL<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 6             XS            Yes          YL                   No           YL<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 7            MED             No         YXL                   No          YXL<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 8            MED             No         YXL                   No          YXL<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 9            MED             No         YXL                   No          YXL<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 10           MED             No         YXL                   No          YXL<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 11           MED             No         YXL                   No          YXL<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 12            SM             No          YL                   No           YL<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>##    Jersey.White.Received Short.Black Short.Received Black.Sock<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 1                   &lt;NA&gt;        &lt;NA&gt;           &lt;NA&gt;       &lt;NA&gt;<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 2                     No          YL             No         XS<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 3                     No          YL             No         SM<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 4                     No          YM             No         SM<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 5                     No          YL             No         SM<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 6                     No          YM             No         XS<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 7                     No          YL             No        MED<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 8                     No         YXL             No        MED<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 9                     No         YXL             No        MED<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 10                    No         YXL             No        MED<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 11                    No         YXL             No        MED<br \/>\n<\/code><\/div>\n<div class=\"comment-score js-comment-score js-comment-edit-hide\"><code>## 12                    No          YL             No         SM<br \/>\n##    Black.Sock.Received Track.Jacket Track.Jacket.Received Track.Pant<br \/>\n## 1                 &lt;NA&gt;           YL                    No         YL<br \/>\n## 2                   No           YL                    No         YL<br \/>\n## 3                   No          YXL                    No         YL<br \/>\n## 4                   No           YL                    No         YL<br \/>\n## 5                   No           YL                    No         YL<br \/>\n## 6                   No           YL                    No         YM<br \/>\n## 7                   No           YL                    No         YL<br \/>\n## 8                   No           YL                    No         YL<br \/>\n## 9                   No          YXL                    No         YL<br \/>\n## 10                  No          YXL                    No         YL<br \/>\n## 11                  No          YXL                    No        YXL<br \/>\n## 12                  No           YM                    No         YL<br \/>\n##    Track.Pant.Received Cinch.Pack Cinch.Pack.Received GK.Jersey<br \/>\n## 1                   No   Ordered                   No        YL<br \/>\n## 2                   No   Ordered                   No      &lt;NA&gt;<br \/>\n## 3                   No   Ordered                   No      &lt;NA&gt;<br \/>\n## 4                   No   Ordered                   No      &lt;NA&gt;<br \/>\n## 5                   No   Ordered                   No      &lt;NA&gt;<br \/>\n## 6                   No   Ordered                   No      &lt;NA&gt;<br \/>\n## 7                   No   Ordered                   No      &lt;NA&gt;<br \/>\n## 8                   No   Ordered                   No      &lt;NA&gt;<br \/>\n## 9                   No   Ordered                   No      &lt;NA&gt;<br \/>\n## 10                  No   Ordered                   No      &lt;NA&gt;<br \/>\n## 11                  No   Ordered                   No      &lt;NA&gt;<br \/>\n## 12                  No   Ordered                   No      &lt;NA&gt;<br \/>\n##    GK.Jersey.Received GK.Short GK.Short.Received GK.Socks GK.Sock.Received<br \/>\n## 1                  No       YL                No       SM               No<br \/>\n## 2                  No     &lt;NA&gt;                No     &lt;NA&gt;               No<br \/>\n## 3                  No     &lt;NA&gt;                No     &lt;NA&gt;               No<br \/>\n## 4                  No     &lt;NA&gt;                No     &lt;NA&gt;               No<br \/>\n## 5                  No     &lt;NA&gt;                No     &lt;NA&gt;               No<br \/>\n## 6                  No     &lt;NA&gt;                No     &lt;NA&gt;               No<br \/>\n## 7                  No     &lt;NA&gt;                No     &lt;NA&gt;               No<br \/>\n## 8                  No     &lt;NA&gt;                No     &lt;NA&gt;               No<br \/>\n## 9                  No     &lt;NA&gt;                No     &lt;NA&gt;               No<br \/>\n## 10                 No     &lt;NA&gt;                No     &lt;NA&gt;               No<br \/>\n## 11                 No     &lt;NA&gt;                No     &lt;NA&gt;               No<br \/>\n## 12                 No     &lt;NA&gt;                No     &lt;NA&gt;               No<br \/>\n##                                    Comments<br \/>\n## 1                                      &lt;NA&gt;<br \/>\n## 2                                      &lt;NA&gt;<br \/>\n## 3                                      &lt;NA&gt;<br \/>\n## 4<br \/>\n## 5                     Never received jersey<br \/>\n## 6                                      &lt;NA&gt;<br \/>\n## 7                         Need size 2 socks<br \/>\n## 8                         Need size 2 socks<br \/>\n## 9                         Need size 2 socks<br \/>\n## 10  Never received jersey , Need Siz 2 sock<br \/>\n## 11 Never received jersey, need size 2 socks<br \/>\n## 12                        Need size 2 socks<\/code><\/div>\n<\/div>\n<div class=\"comment-text js-comment-text-and-form\">\n<div class=\"comment-body js-comment-edit-hide\"><span class=\"comment-copy\"><code><\/code> <\/span><\/div>\n<\/div>\n<\/div>\n<\/div>\n<div class=\"comment-text js-comment-text-and-form\">\n<div class=\"comment-body js-comment-edit-hide\"><span class=\"comment-copy\"><code><\/code> <\/span><\/div>\n<\/div>\n<div class=\"dataTables_scrollHead\">\n<div class=\"dataTables_scrollHeadInner\"><\/div>\n<\/div>\n<div class=\"dataTables_scrollBody\"><\/div>\n<\/div>\n<\/div>\n<p>&nbsp;<\/p>\n<pre class=\"r\"><code><\/code><\/pre>\n<\/div>\n<div id=\"reshape-the-data\" class=\"section level2\">\n<h2>Reshape the data:<\/h2>\n<p>We want to create a table that is narrow (not too many columns) and long (many rows).\u00a0 The way to do this is to create a row for each piece of equipment ordered.\u00a0 In the end we should have data that looks like this:<\/p>\n<table class=\"lines aligncenter\">\n<thead>\n<tr class=\"header\">\n<th style=\"width: 175px;text-align: center\">Player<\/th>\n<th style=\"width: 160px;text-align: center\">Item<\/th>\n<th style=\"width: 128px;text-align: center\">Size<\/th>\n<th style=\"width: 225px;text-align: center\">Received<\/th>\n<\/tr>\n<\/thead>\n<tbody>\n<tr class=\"odd\">\n<td style=\"width: 176px;text-align: center\">1<\/td>\n<td style=\"width: 161px;text-align: center\">Jersey<\/td>\n<td style=\"width: 129px;text-align: center\">YL<\/td>\n<td style=\"width: 225px;text-align: center\">Yes<\/td>\n<\/tr>\n<tr class=\"even\">\n<td style=\"width: 176px;text-align: center\">7<\/td>\n<td style=\"width: 161px;text-align: center\">Socks<\/td>\n<td style=\"width: 129px;text-align: center\">Med<\/td>\n<td style=\"width: 225px;text-align: center\">No<\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>There are many ways to do this, but here is one method:<\/p>\n<pre class=\"r\"><code><\/code><\/pre>\n<div class=\"textbox textbox--learning-objectives\">\n<header class=\"textbox__header\">\n<h2 class=\"textbox__title\"><span style=\"color: #ffffff\"><code><span style=\"color: #ffffff\">## make a string of names<\/span> <\/code><\/span><\/h2>\n<p class=\"textbox__title\"><span style=\"color: #ffffff\"><code><span style=\"color: #ffffff\">col_names &lt;- colnames(df)<\/span> <\/code><\/span><\/p>\n<h2 class=\"textbox__title\"><span style=\"color: #ffffff\"><code><span style=\"color: #ffffff\">## cut the entry level stuff:<\/span> <\/code><\/span><\/h2>\n<p class=\"textbox__title\"><span style=\"color: #ffffff\"><code><span style=\"color: #ffffff\">interm_list &lt;- col_names[6:31]<\/span><\/code><\/span><\/p>\n<p class=\"textbox__title\"><code><span style=\"color: #ffffff\"> interm_list<\/span><\/code><\/p>\n<\/header>\n<div class=\"textbox__content\">\n<pre class=\"r\"><code>\u00a0<\/code><\/pre>\n<pre><code>##  [1] \"Training.Jersey.Size\"  \"Jersey.Received\"      \r\n##  [3] \"Training.Short\"        \"Shorts.Received\"      \r\n##  [5] \"Training.Sock\"         \"Socks.Received\"       \r\n##  [7] \"Jersey.Grey\"           \"Jersey.Grey.Received\" \r\n##  [9] \"Jersey.White\"          \"Jersey.White.Received\"\r\n## [11] \"Short.Black\"           \"Short.Received\"       \r\n## [13] \"Black.Sock\"            \"Black.Sock.Received\"  \r\n## [15] \"Track.Jacket\"          \"Track.Jacket.Received\"\r\n## [17] \"Track.Pant\"            \"Track.Pant.Received\"  \r\n## [19] \"Cinch.Pack\"            \"Cinch.Pack.Received\"  \r\n## [21] \"GK.Jersey\"             \"GK.Jersey.Received\"   \r\n## [23] \"GK.Short\"              \"GK.Short.Received\"    \r\n## [25] \"GK.Socks\"              \"GK.Sock.Received\"<\/code><\/pre>\n<\/div>\n<\/div>\n<pre class=\"r\"><code><\/code><\/pre>\n<p>&nbsp;<\/p>\n<pre class=\"r\"><span style=\"color: #ffffff\"><code>\u00a0<\/code><\/span><\/pre>\n<pre><span style=\"color: #ffffff\"><code><\/code><\/span><\/pre>\n<pre class=\"r\"><span style=\"color: #ffffff\"><code><\/code><\/span><\/pre>\n<div class=\"textbox textbox--learning-objectives\" style=\"text-align: left\">\n<header class=\"textbox__header\"><span style=\"color: #ffffff\"><code><span style=\"color: #ffffff\">## What are the sizes?<\/span><\/code><\/span><\/p>\n<p class=\"textbox__title\"><code><span style=\"color: #ffffff\">size_list &lt;- c() <\/span><\/code><\/p>\n<p class=\"textbox__title\"><code><span style=\"color: #ffffff\">rec_list &lt;- c() <\/span><\/code><\/p>\n<p class=\"textbox__title\"><code><span style=\"color: #ffffff\">for (i in 1:13){ <\/span><\/code><\/p>\n<p class=\"textbox__title\"><code><span style=\"color: #ffffff\">\u00a0\u00a0\u00a0 size_list &lt;- c(size_list,interm_list[2*i-1]) <\/span><\/code><\/p>\n<p class=\"textbox__title\">\u00a0 \u00a0\u00a0 \u00a0\u00a0 \u00a0\u00a0 <code><span style=\"color: #ffffff\">rec_list &lt;- c(rec_list,interm_list[2*i]) <\/span><\/code><\/p>\n<p class=\"textbox__title\"><code><span style=\"color: #ffffff\">} <\/span><\/code><\/p>\n<p class=\"textbox__title\"><code><span style=\"color: #ffffff\">item_list &lt;- str_remove(rec_list,\".Received\") <\/span><\/code><\/p>\n<h2 class=\"textbox__title\"><code><span style=\"color: #ffffff\">## Make a blank dataframe<\/span><\/code><\/h2>\n<p class=\"textbox__title\"><code><span style=\"color: #ffffff\">Soccer_df &lt;- data.frame(\"Player\"=numeric(), \"Item\"= character(), \"Size\"=character(), \"Recieved\"=character())<\/span><\/code><\/p>\n<p><span style=\"color: #ffffff\"><code><span style=\"color: #ffffff\">for (i in 1:13){ <\/span><\/code><\/span><\/p>\n<p style=\"padding-left: 40px\"><span style=\"color: #ffffff\"><code><span style=\"color: #ffffff\">newdf &lt;- df %&gt;% <\/span><\/code><\/span><\/p>\n<p style=\"padding-left: 80px\"><span style=\"color: #ffffff\"><code><span style=\"color: #ffffff\">select(Player, size_list[i], rec_list[i]) <\/span><\/code><\/span><\/p>\n<p style=\"padding-left: 40px\"><span style=\"color: #ffffff\"><code><span style=\"color: #ffffff\">names(newdf) &lt;- c(\"Player\", \"Size\", \"Received\")<\/span> <\/code><\/span><\/p>\n<p style=\"padding-left: 40px\"><span style=\"color: #ffffff\"><code><span style=\"color: #ffffff\">newdf$Item &lt;- item_list[i] newdf &lt;- newdf %&gt;% <\/span><\/code><\/span><\/p>\n<p style=\"padding-left: 80px\"><span style=\"color: #ffffff\"><code><span style=\"color: #ffffff\">select(Player, Item, Size, Received) <\/span><\/code><\/span><\/p>\n<p style=\"padding-left: 40px\"><span style=\"color: #ffffff\"><code><span style=\"color: #ffffff\">Soccer_df &lt;- rbind(Soccer_df, newdf) <\/span><\/code><\/span><\/p>\n<p style=\"padding-left: 40px\"><span style=\"color: #ffffff\"><code><span style=\"color: #ffffff\">}<\/span><\/code><\/span><\/p>\n<h2><code><span style=\"color: #ffffff\">Inspect the first 5 rows:<br \/>\n<\/span><\/code><\/h2>\n<p><code><span style=\"color: #ffffff\">Soccer_df %&gt;% head()<\/span><\/code><\/p>\n<p>&nbsp;<\/p>\n<\/header>\n<div class=\"textbox__content\">\n<pre class=\"r\"><code>\r\n<\/code><\/pre>\n<pre><code>##   Player   Item Size Received\r\n## 1      1 Jersey   YL      Yes\r\n## 2      3 Jersey   YM      Yes\r\n## 3      5 Jersey   YL      Yes\r\n## 4      8 Jersey   YL      Yes\r\n## 5      9 Jersey  YXL       No\r\n## 6     10 Jersey   YM      Yes<\/code><\/pre>\n<\/div>\n<div class=\"textbox__content\"><\/div>\n<\/div>\n<p>&nbsp;<\/p>\n<pre class=\"r\"><code>\u00a0<\/code><code>\r\n\r\n<\/code><code><\/code><\/pre>\n<p>Now we will look at grouping the data.<\/p>\n<p>I filtered out anything where the size was \u201cNA\u201d, as we assume that means that there is no item ordered.<\/p>\n<pre class=\"r\"><code><\/code><\/pre>\n<div class=\"textbox textbox--learning-objectives\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\"><code><span style=\"color: #ffffff\">summary &lt;- Soccer_df %&gt;%<br \/>\nfilter(!is.na(Size)) <\/span><\/code><\/p>\n<p class=\"textbox__title\" style=\"padding-left: 40px\"><code><span style=\"color: #ffffff\">%&gt;%<br \/>\ngroup_by(Item, Size, Received) <\/span><\/code><\/p>\n<p class=\"textbox__title\" style=\"padding-left: 40px\"><code><span style=\"color: #ffffff\">%&gt;%<br \/>\nsummarize(Count = n())<\/span><\/code><\/p>\n<p>&nbsp;<\/p>\n<p class=\"textbox__title\"><code><span style=\"color: #ffffff\">to_save &lt;- Soccer_df <\/span><\/code><\/p>\n<p class=\"textbox__title\" style=\"padding-left: 40px\"><code><span style=\"color: #ffffff\">%&gt;%<br \/>\nfilter(!is.na(Size)) <\/span><\/code><\/p>\n<p class=\"textbox__title\" style=\"padding-left: 40px\"><code><span style=\"color: #ffffff\">%&gt;%<br \/>\ngroup_by(Item, Size, Received) <\/span><\/code><\/p>\n<p class=\"textbox__title\" style=\"padding-left: 40px\"><code><span style=\"color: #ffffff\">%&gt;%<br \/>\nsummarize(Count = n()) <\/span><\/code><\/p>\n<p class=\"textbox__title\" style=\"padding-left: 40px\"><code><span style=\"color: #ffffff\">%&gt;%<br \/>\nfilter(Received == \"No\")<\/span><\/code><\/p>\n<p>&nbsp;<\/p>\n<p>&nbsp;<\/p>\n<p class=\"textbox__title\"><code><span style=\"color: #ffffff\">to_save<\/span><\/code><\/p>\n<\/header>\n<div class=\"textbox__content\">\n<pre class=\"r\"><code>\u00a0<\/code><\/pre>\n<pre><code>## # A tibble: 23 x 4\r\n## # Groups:   Item, Size [23]\r\n##    Item        Size       Received Count\r\n##    &lt;chr&gt;       &lt;chr&gt;      &lt;chr&gt;    &lt;int&gt;\r\n##  1 Black.Sock  MED        No           5\r\n##  2 Black.Sock  SM         No           4\r\n##  3 Black.Sock  XS         No           2\r\n##  4 Cinch.Pack  \"Ordered \" No          12\r\n##  5 GK.Jersey   YL         No           1\r\n##  6 GK.Short    YL         No           1\r\n##  7 GK.Sock     SM         No           1\r\n##  8 Jersey      YXL        No           3\r\n##  9 Jersey.Grey YL         No           4\r\n## 10 Jersey.Grey YXL        No           7\r\n## # ... with 13 more rows<\/code><\/pre>\n<\/div>\n<\/div>\n<pre class=\"r\"><code><\/code><\/pre>\n<p>&nbsp;<\/p>\n<pre class=\"r\"><code><\/code><\/pre>\n<\/div>\n<div id=\"save\" class=\"section level2\">\n<h2>Save<\/h2>\n<p>Finally, we cansave that file as a new Excel sheet &#8211; or you could mkae it an old Excel Sheet, whatever!<\/p>\n<div class=\"textbox textbox--learning-objectives\">\n<header class=\"textbox__header\">\n<p class=\"textbox__title\">Learning Objectives<\/p>\n<\/header>\n<div class=\"textbox__content\">\n<pre class=\"r\"><code>file_name &lt;- paste0(\"Soccer_\",Sys.Date(),\".xlsx\")\r\nfile_name<\/code><\/pre>\n<pre><code>## [1] \"Soccer_2022-04-08.xlsx\"<\/code><\/pre>\n<pre class=\"r\"><code>wb &lt;- createWorkbook(\"Soccer_Gear\")\r\n\r\naddWorksheet(wb, sheetName = \"original\")\r\nwriteData(wb, sheet = \"original\", x = df)\r\n\r\naddWorksheet(wb, sheetName = \"Not_Received\")\r\nwriteData(wb, sheet = \"Not_Received\", x = to_save)\r\n\r\naddWorksheet(wb, sheetName = \"Summary\")\r\nwriteData(wb, sheet = \"Summary\", x = summary)\r\n\r\n\r\n\r\nsaveWorkbook(wb, file_name, overwrite = TRUE)<\/code><\/pre>\n<pre><code><\/code><\/pre>\n<\/div>\n<\/div>\n<p>&nbsp;<\/p>\n<p><code><\/code><\/p>\n<\/div>\n<\/div>\n<\/div>\n","protected":false},"author":883,"menu_order":11,"template":"","meta":{"pb_show_title":"on","pb_short_title":"","pb_subtitle":"","pb_authors":[],"pb_section_license":""},"chapter-type":[],"contributor":[],"license":[],"class_list":["post-317","chapter","type-chapter","status-publish","hentry"],"part":98,"_links":{"self":[{"href":"https:\/\/pressbooks.bccampus.ca\/businessanalytics\/wp-json\/pressbooks\/v2\/chapters\/317","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/pressbooks.bccampus.ca\/businessanalytics\/wp-json\/pressbooks\/v2\/chapters"}],"about":[{"href":"https:\/\/pressbooks.bccampus.ca\/businessanalytics\/wp-json\/wp\/v2\/types\/chapter"}],"author":[{"embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/businessanalytics\/wp-json\/wp\/v2\/users\/883"}],"version-history":[{"count":12,"href":"https:\/\/pressbooks.bccampus.ca\/businessanalytics\/wp-json\/pressbooks\/v2\/chapters\/317\/revisions"}],"predecessor-version":[{"id":332,"href":"https:\/\/pressbooks.bccampus.ca\/businessanalytics\/wp-json\/pressbooks\/v2\/chapters\/317\/revisions\/332"}],"part":[{"href":"https:\/\/pressbooks.bccampus.ca\/businessanalytics\/wp-json\/pressbooks\/v2\/parts\/98"}],"metadata":[{"href":"https:\/\/pressbooks.bccampus.ca\/businessanalytics\/wp-json\/pressbooks\/v2\/chapters\/317\/metadata\/"}],"wp:attachment":[{"href":"https:\/\/pressbooks.bccampus.ca\/businessanalytics\/wp-json\/wp\/v2\/media?parent=317"}],"wp:term":[{"taxonomy":"chapter-type","embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/businessanalytics\/wp-json\/pressbooks\/v2\/chapter-type?post=317"},{"taxonomy":"contributor","embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/businessanalytics\/wp-json\/wp\/v2\/contributor?post=317"},{"taxonomy":"license","embeddable":true,"href":"https:\/\/pressbooks.bccampus.ca\/businessanalytics\/wp-json\/wp\/v2\/license?post=317"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}