Primary exercises

  1. In the survey dataset, add a new column feet with height reported in feet (1 foot = 30.48 cm).
mutate(survey, feet=height/30.48)
# A tibble: 233 × 14
   name    gender span1 span2 hand  fold    pulse clap    exercise smokes height m.i        age  feet
   <chr>   <chr>  <dbl> <dbl> <chr> <chr>   <dbl> <chr>   <chr>    <chr>   <dbl> <chr>    <dbl> <dbl>
 1 Alyson  female  18.5  18   right right      92 left    some     never    173  metric    18.2  5.68
 2 Todd    male    19.5  20.5 left  right     104 left    none     regul    178. imperial  17.6  5.83
 3 Gerald  male    18    13.3 right left       87 neither none     occas     NA  <NA>      16.9 NA   
 4 Robert  male    18.8  18.9 right right      NA neither none     never    160  metric    20.3  5.25
 5 Dustin  male    20    20   right neither    35 right   some     never    165  metric    23.7  5.41
 6 Abby    female  18    17.7 right left       64 right   some     never    173. imperial  21    5.67
 7 Andre   male    17.7  17.7 right left       83 right   freq     never    183. imperial  18.8  6   
 8 Michael female  17    17.3 right right      74 right   freq     never    157  metric    35.8  5.15
 9 Edward  male    20    19.5 right right      72 right   some     never    175  metric    19    5.74
10 Carl    male    18.5  18.5 right right      90 right   some     never    167  metric    22.3  5.48
# … with 223 more rows
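
As the row for Gerald shows, a missing height carries through to the new column. The following is a minimal sketch of that behaviour on a small made-up tibble; the name toy, its values, and the constant cm_per_foot are assumptions for illustration, not part of the survey data.

```r
library(dplyr)

# Toy stand-in for the survey data (hypothetical values)
toy <- tibble(
  name   = c("A", "B", "C"),
  height = c(152.4, NA, 182.88)   # heights in cm, one missing
)

cm_per_foot <- 30.48

# Arithmetic on NA yields NA, so the second row has no feet value
toy_feet <- mutate(toy, feet = height / cm_per_foot)
toy_feet
```

Naming the conversion factor is optional, but it keeps the unit visible in the code.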
  2. In the survey dataset, add a new column diffWritingHandSpan: the difference between span1 (writing hand) and span2 (non-writing hand).
mutate(survey, diffWritingHandSpan=span1-span2)
# A tibble: 233 × 14
   name    gender span1 span2 hand  fold    pulse clap    exercise smokes height m.i        age diffWritingHandSpan
   <chr>   <chr>  <dbl> <dbl> <chr> <chr>   <dbl> <chr>   <chr>    <chr>   <dbl> <chr>    <dbl>               <dbl>
 1 Alyson  female  18.5  18   right right      92 left    some     never    173  metric    18.2               0.5  
 2 Todd    male    19.5  20.5 left  right     104 left    none     regul    178. imperial  17.6              -1    
 3 Gerald  male    18    13.3 right left       87 neither none     occas     NA  <NA>      16.9               4.7  
 4 Robert  male    18.8  18.9 right right      NA neither none     never    160  metric    20.3              -0.100
 5 Dustin  male    20    20   right neither    35 right   some     never    165  metric    23.7               0    
 6 Abby    female  18    17.7 right left       64 right   some     never    173. imperial  21                 0.300
 7 Andre   male    17.7  17.7 right left       83 right   freq     never    183. imperial  18.8               0    
 8 Michael female  17    17.3 right right      74 right   freq     never    157  metric    35.8              -0.300
 9 Edward  male    20    19.5 right right      72 right   some     never    175  metric    19                 0.5  
10 Carl    male    18.5  18.5 right right      90 right   some     never    167  metric    22.3               0    
# … with 223 more rows
  3. In the pulse dataset, add new weight variables pound and stone (1 kg = 2.20462 pounds = 0.157473 stone).
mutate(pulse, pound = weight * 2.20462, stone = weight * 0.157473 )
# A tibble: 110 × 15
   id     name      height weight   age gender smokes alcohol exercise ran   pulse1 pulse2  year pound stone
   <chr>  <chr>      <dbl>  <dbl> <dbl> <chr>  <chr>  <chr>   <chr>    <chr>  <dbl>  <dbl> <dbl> <dbl> <dbl>
 1 1993_A Bonnie       173     57    18 female no     yes     moderate sat       86     88  1993  126.  8.98
 2 1993_B Melanie      179     58    19 female no     yes     moderate ran       82    150  1993  128.  9.13
 3 1993_C Consuelo     167     62    18 female no     yes     high     ran       96    176  1993  137.  9.76
 4 1993_D Travis       195     84    18 male   no     yes     high     sat       71     73  1993  185. 13.2 
 5 1993_E Lauri        173     64    18 female no     yes     low      sat       90     88  1993  141. 10.1 
 6 1993_F George       184     74    22 male   no     yes     low      ran       78    141  1993  163. 11.7 
 7 1993_G Cherry       162     57    20 female no     yes     moderate sat       68     72  1993  126.  8.98
 8 1993_H Francesca    169     55    18 female no     yes     moderate sat       71     77  1993  121.  8.66
 9 1993_I Sonja        164     56    19 female no     yes     high     sat       68     68  1993  123.  8.82
10 1993_J Troy         168     60    23 male   no     yes     moderate ran       88    150  1993  132.  9.45
# … with 100 more rows
  4. In the survey dataset, convert the variable smokes from character to a factor with levels {“never”, “occas”, “regul”, “heavy”}, in that order.
mutate(survey, smokes = fct_relevel(factor(smokes), "never","occas","regul", "heavy")) 
# A tibble: 233 × 13
   name    gender span1 span2 hand  fold    pulse clap    exercise smokes height m.i        age
   <chr>   <chr>  <dbl> <dbl> <chr> <chr>   <dbl> <chr>   <chr>    <fct>   <dbl> <chr>    <dbl>
 1 Alyson  female  18.5  18   right right      92 left    some     never    173  metric    18.2
 2 Todd    male    19.5  20.5 left  right     104 left    none     regul    178. imperial  17.6
 3 Gerald  male    18    13.3 right left       87 neither none     occas     NA  <NA>      16.9
 4 Robert  male    18.8  18.9 right right      NA neither none     never    160  metric    20.3
 5 Dustin  male    20    20   right neither    35 right   some     never    165  metric    23.7
 6 Abby    female  18    17.7 right left       64 right   some     never    173. imperial  21  
 7 Andre   male    17.7  17.7 right left       83 right   freq     never    183. imperial  18.8
 8 Michael female  17    17.3 right right      74 right   freq     never    157  metric    35.8
 9 Edward  male    20    19.5 right right      72 right   some     never    175  metric    19  
10 Carl    male    18.5  18.5 right right      90 right   some     never    167  metric    22.3
# … with 223 more rows
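
The level order can also be set directly with the levels argument of base R's factor(), which may be a useful alternative when forcats is not loaded. This is a minimal sketch on made-up data; the names toy and smoke_levels are assumptions for illustration.

```r
library(dplyr)

# Toy stand-in for the smokes column (hypothetical values, one missing)
toy <- tibble(smokes = c("never", "regul", "occas", NA, "heavy", "never"))

smoke_levels <- c("never", "occas", "regul", "heavy")

# factor() with an explicit levels argument fixes the level order;
# values not listed in levels would become NA
toy_fct <- mutate(toy, smokes = factor(smokes, levels = smoke_levels))

levels(toy_fct$smokes)
```

One difference to keep in mind: with factor(levels = ...), any value absent from the levels vector is silently turned into NA, so the vector should list every category that occurs in the data.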

Extra exercises

In the survey dataset:

  1. Add a new column diffHandSpan: the absolute difference between span1 (writing hand) and span2 (non-writing hand). Hint: use the abs function (?abs).
mutate(survey, diffHandSpan=abs(span1-span2))
# A tibble: 233 × 14
   name    gender span1 span2 hand  fold    pulse clap    exercise smokes height m.i        age diffHandSpan
   <chr>   <chr>  <dbl> <dbl> <chr> <chr>   <dbl> <chr>   <chr>    <chr>   <dbl> <chr>    <dbl>        <dbl>
 1 Alyson  female  18.5  18   right right      92 left    some     never    173  metric    18.2        0.5  
 2 Todd    male    19.5  20.5 left  right     104 left    none     regul    178. imperial  17.6        1    
 3 Gerald  male    18    13.3 right left       87 neither none     occas     NA  <NA>      16.9        4.7  
 4 Robert  male    18.8  18.9 right right      NA neither none     never    160  metric    20.3        0.100
 5 Dustin  male    20    20   right neither    35 right   some     never    165  metric    23.7        0    
 6 Abby    female  18    17.7 right left       64 right   some     never    173. imperial  21          0.300
 7 Andre   male    17.7  17.7 right left       83 right   freq     never    183. imperial  18.8        0    
 8 Michael female  17    17.3 right right      74 right   freq     never    157  metric    35.8        0.300
 9 Edward  male    20    19.5 right right      72 right   some     never    175  metric    19          0.5  
10 Carl    male    18.5  18.5 right right      90 right   some     never    167  metric    22.3        0    
# … with 223 more rows
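
To see how the signed and the absolute difference relate, both can be computed in one mutate call, since a column created earlier in the call is available to later arguments. This is a minimal sketch on made-up span values; the tibble toy is an assumption for illustration.

```r
library(dplyr)

# Toy stand-in for the two span columns (hypothetical values)
toy <- tibble(
  span1 = c(18.5, 19.5, 20),
  span2 = c(18,   20.5, 20)
)

toy_diff <- mutate(
  toy,
  diffWritingHandSpan = span1 - span2,           # signed difference
  diffHandSpan        = abs(diffWritingHandSpan) # reuses the column just created
)
toy_diff
```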
  2. Change the height unit from cm to inches (1 cm = 0.393701 inch).
mutate(survey, height=height*0.393701)
# A tibble: 233 × 13
   name    gender span1 span2 hand  fold    pulse clap    exercise smokes height m.i        age
   <chr>   <chr>  <dbl> <dbl> <chr> <chr>   <dbl> <chr>   <chr>    <chr>   <dbl> <chr>    <dbl>
 1 Alyson  female  18.5  18   right right      92 left    some     never    68.1 metric    18.2
 2 Todd    male    19.5  20.5 left  right     104 left    none     regul    70.0 imperial  17.6
 3 Gerald  male    18    13.3 right left       87 neither none     occas    NA   <NA>      16.9
 4 Robert  male    18.8  18.9 right right      NA neither none     never    63.0 metric    20.3
 5 Dustin  male    20    20   right neither    35 right   some     never    65.0 metric    23.7
 6 Abby    female  18    17.7 right left       64 right   some     never    68.0 imperial  21  
 7 Andre   male    17.7  17.7 right left       83 right   freq     never    72.0 imperial  18.8
 8 Michael female  17    17.3 right right      74 right   freq     never    61.8 metric    35.8
 9 Edward  male    20    19.5 right right      72 right   some     never    68.9 metric    19  
10 Carl    male    18.5  18.5 right right      90 right   some     never    65.7 metric    22.3
# … with 223 more rows
  3. Produce a tibble containing the personal information of only those whose height is between 6.0 and 6.5 feet, inclusive.
filter(
  mutate(
    select(survey, Name = name, Age = age, Gender = gender, Height = height),
    feet = Height / 30.48
  ),
  feet >= 6.0 & feet <= 6.5
)
# A tibble: 36 × 5
   Name        Age Gender Height  feet
   <chr>     <dbl> <chr>   <dbl> <dbl>
 1 Andre      18.8 male     183.  6   
 2 Virgil     18.3 male     183.  6   
 3 Ken        19.8 male     190.  6.25
 4 Virgil     17.9 male     190.  6.25
 5 Frank      18.2 male     184   6.04
 6 Nathaniel  18   male     190   6.23
 7 Ben        35.5 male     185   6.07
 8 Bernard    17.5 male     190   6.23
 9 Felix      20.3 male     187   6.14
10 Patrick    18   male     183   6.00
# … with 26 more rows
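
The nested call reads inside-out: select first, then mutate, then filter. The same steps can be written top-to-bottom with the pipe, and dplyr's between() expresses an inclusive range. This is a minimal sketch on made-up data; the tibble toy and its values are assumptions for illustration.

```r
library(dplyr)

# Toy stand-in for the survey data (hypothetical values)
toy <- tibble(
  name   = c("A", "B", "C", "D"),
  age    = c(18, 19, 20, 21),
  gender = c("male", "female", "male", "male"),
  height = c(175, 185, 195, 200)   # cm
)

tall <- toy %>%
  select(Name = name, Age = age, Gender = gender, Height = height) %>%
  mutate(feet = Height / 30.48) %>%
  filter(between(feet, 6.0, 6.5))  # inclusive on both ends
tall
```

Rows whose feet value is NA are dropped by filter in both versions.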
  4. How many observations does the following call return? Try to reason it out before running the statement.
filter(mutate(filter(survey, height> 190),feet=height/30.48),height<=190 )
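
One way to reason about it is to unfold the nested call into its three steps. The inner filter keeps only rows with height above 190, mutate adds a column without changing height, and the outer filter then asks for height of at most 190, which no remaining row can satisfy. The sketch below walks through this on a made-up tibble; toy and the step names are assumptions for illustration.

```r
library(dplyr)

# Toy stand-in for the height column (hypothetical values)
toy <- tibble(height = c(160, 175, 190, 191, 200, NA))

step1 <- filter(toy, height > 190)             # keeps 191 and 200; NA is dropped
step2 <- mutate(step1, feet = height / 30.48)  # adds a column, height unchanged
step3 <- filter(step2, height <= 190)          # contradicts step1, nothing is left

nrow(step3)
```

The two conditions are mutually exclusive, so the result has no rows regardless of the data.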
  5. The following table is one of many BMI classifications:
  classification       bmi
1    underweight     <18.5
2         normal 18.5-24.9
3     overweight   25-29.9
4          obese      >=30

Add the variable BMI_class to the pulse dataset. Note that you will need to round the BMI calculation to one decimal digit (see ?round and its additional argument digits) so that it fits the classification values in the table above. Use the helper function case_when.

pulse_bmi <- mutate(pulse, BMI=round( weight/((height/100)^2), digits = 1)  ) 
mutate(pulse_bmi, BMI_class = case_when(
                BMI < 18.5 ~ "underweight",
                BMI >= 18.5 & BMI <= 24.9 ~ "normal",
                BMI >= 25   & BMI <= 29.9 ~ "overweight",
                BMI >= 30 ~ "obese"
              )                     
      )
# A tibble: 110 × 15
   id     name      height weight   age gender smokes alcohol exercise ran   pulse1 pulse2  year   BMI BMI_class  
   <chr>  <chr>      <dbl>  <dbl> <dbl> <chr>  <chr>  <chr>   <chr>    <chr>  <dbl>  <dbl> <dbl> <dbl> <chr>      
 1 1993_A Bonnie       173     57    18 female no     yes     moderate sat       86     88  1993  19   normal     
 2 1993_B Melanie      179     58    19 female no     yes     moderate ran       82    150  1993  18.1 underweight
 3 1993_C Consuelo     167     62    18 female no     yes     high     ran       96    176  1993  22.2 normal     
 4 1993_D Travis       195     84    18 male   no     yes     high     sat       71     73  1993  22.1 normal     
 5 1993_E Lauri        173     64    18 female no     yes     low      sat       90     88  1993  21.4 normal     
 6 1993_F George       184     74    22 male   no     yes     low      ran       78    141  1993  21.9 normal     
 7 1993_G Cherry       162     57    20 female no     yes     moderate sat       68     72  1993  21.7 normal     
 8 1993_H Francesca    169     55    18 female no     yes     moderate sat       71     77  1993  19.3 normal     
 9 1993_I Sonja        164     56    19 female no     yes     high     sat       68     68  1993  20.8 normal     
10 1993_J Troy         168     60    23 male   no     yes     moderate ran       88    150  1993  21.3 normal     
# … with 100 more rows
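
Because case_when evaluates its conditions in order and uses the first one that is TRUE, the lower bounds can be left implicit once the earlier classes have been ruled out. This is a minimal sketch of that variant on made-up data; the tibble toy and its values are assumptions for illustration.

```r
library(dplyr)

# Toy stand-in for the pulse data (hypothetical values, one missing weight)
toy <- tibble(
  weight = c(50, 60, 85, 100, NA),   # kg
  height = c(170, 170, 170, 170, 170) # cm
)

toy_bmi <- toy %>%
  mutate(
    BMI = round(weight / (height / 100)^2, digits = 1),
    BMI_class = case_when(
      BMI < 18.5 ~ "underweight",
      BMI < 25   ~ "normal",      # reached only if BMI >= 18.5
      BMI < 30   ~ "overweight",  # reached only if BMI >= 25
      BMI >= 30  ~ "obese"
    )
  )
toy_bmi
```

A missing BMI matches none of the conditions, so its BMI_class stays NA rather than being assigned to a class.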
  6. Age classification:
  classification           age notation
1          adult           >19   (19,∞)
2     adolescent >=10 and <=19  [10,19]
3          child   >=1 and <=9    [1,9]
4         infant            <1   (-∞,1)

Add the variable age_group to the survey dataset.

mutate(survey, age_group = case_when(
                age > 19 ~ "adult",
                age >= 10 & age <= 19 ~ "adolescent",
                age >= 1  & age <= 9 ~ "child",
                age < 1 ~ "infant"
              )
      )
# A tibble: 233 × 14
   name    gender span1 span2 hand  fold    pulse clap    exercise smokes height m.i        age age_group 
   <chr>   <chr>  <dbl> <dbl> <chr> <chr>   <dbl> <chr>   <chr>    <chr>   <dbl> <chr>    <dbl> <chr>     
 1 Alyson  female  18.5  18   right right      92 left    some     never    173  metric    18.2 adolescent
 2 Todd    male    19.5  20.5 left  right     104 left    none     regul    178. imperial  17.6 adolescent
 3 Gerald  male    18    13.3 right left       87 neither none     occas     NA  <NA>      16.9 adolescent
 4 Robert  male    18.8  18.9 right right      NA neither none     never    160  metric    20.3 adult     
 5 Dustin  male    20    20   right neither    35 right   some     never    165  metric    23.7 adult     
 6 Abby    female  18    17.7 right left       64 right   some     never    173. imperial  21   adult     
 7 Andre   male    17.7  17.7 right left       83 right   freq     never    183. imperial  18.8 adolescent
 8 Michael female  17    17.3 right right      74 right   freq     never    157  metric    35.8 adult     
 9 Edward  male    20    19.5 right right      72 right   some     never    175  metric    19   adolescent
10 Carl    male    18.5  18.5 right right      90 right   some     never    167  metric    22.3 adult     
# … with 223 more rows
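
The survey ages are decimal numbers, while the table is written in whole years, so an age strictly between 9 and 10 falls into none of the four classes. The sketch below applies the table's bounds as written to a few made-up boundary ages; the tibble toy is an assumption for illustration.

```r
library(dplyr)

# Toy ages chosen to sit on and between the class boundaries (hypothetical)
toy <- tibble(age = c(0.5, 1, 9, 9.5, 10, 19, 19.1, NA))

toy_age <- mutate(toy, age_group = case_when(
  age > 19              ~ "adult",
  age >= 10 & age <= 19 ~ "adolescent",
  age >= 1  & age <= 9  ~ "child",
  age < 1               ~ "infant"
))
toy_age
```

In this sketch the age 9.5 and the missing age both end up with NA in age_group. No survey row shown above is that young, so the gap does not affect the displayed output.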


Copyright © 2022 Biomedical Data Sciences (BDS) | LUMC