Primary exercises

  1. In the survey dataset, add a new column feet containing the height in feet (1 foot = 30.48 cm).
mutate(survey, feet=height/30.48)
# A tibble: 233 × 14
   name    gender span1 span2 hand  fold    pulse clap    exercise smokes height m.i        age  feet
   <chr>   <chr>  <dbl> <dbl> <chr> <chr>   <dbl> <chr>   <chr>    <chr>   <dbl> <chr>    <dbl> <dbl>
 1 Alyson  female  18.5  18   right right      92 left    some     never    173  metric    18.2  5.68
 2 Todd    male    19.5  20.5 left  right     104 left    none     regul    178. imperial  17.6  5.83
 3 Gerald  male    18    13.3 right left       87 neither none     occas     NA  <NA>      16.9 NA   
 4 Robert  male    18.8  18.9 right right      NA neither none     never    160  metric    20.3  5.25
 5 Dustin  male    20    20   right neither    35 right   some     never    165  metric    23.7  5.41
 6 Abby    female  18    17.7 right left       64 right   some     never    173. imperial  21    5.67
 7 Andre   male    17.7  17.7 right left       83 right   freq     never    183. imperial  18.8  6   
 8 Michael female  17    17.3 right right      74 right   freq     never    157  metric    35.8  5.15
 9 Edward  male    20    19.5 right right      72 right   some     never    175  metric    19    5.74
10 Carl    male    18.5  18.5 right right      90 right   some     never    167  metric    22.3  5.48
# … with 223 more rows
  1. In the survey dataset, add a new column diffWritingHandSpan: the difference between span1 (writing hand) and span2 (non-writing hand).
mutate(survey, diffWritingHandSpan=span1-span2)
# A tibble: 233 × 14
   name    gender span1 span2 hand  fold    pulse clap    exercise smokes height m.i        age diffWr…¹
   <chr>   <chr>  <dbl> <dbl> <chr> <chr>   <dbl> <chr>   <chr>    <chr>   <dbl> <chr>    <dbl>    <dbl>
 1 Alyson  female  18.5  18   right right      92 left    some     never    173  metric    18.2    0.5  
 2 Todd    male    19.5  20.5 left  right     104 left    none     regul    178. imperial  17.6   -1    
 3 Gerald  male    18    13.3 right left       87 neither none     occas     NA  <NA>      16.9    4.7  
 4 Robert  male    18.8  18.9 right right      NA neither none     never    160  metric    20.3   -0.100
 5 Dustin  male    20    20   right neither    35 right   some     never    165  metric    23.7    0    
 6 Abby    female  18    17.7 right left       64 right   some     never    173. imperial  21      0.300
 7 Andre   male    17.7  17.7 right left       83 right   freq     never    183. imperial  18.8    0    
 8 Michael female  17    17.3 right right      74 right   freq     never    157  metric    35.8   -0.300
 9 Edward  male    20    19.5 right right      72 right   some     never    175  metric    19      0.5  
10 Carl    male    18.5  18.5 right right      90 right   some     never    167  metric    22.3    0    
# … with 223 more rows, and abbreviated variable name ¹​diffWritingHandSpan
  1. In the pulse dataset, add new weight variables pound and stone (1 kg = 2.20462 pounds = 0.157473 stone).
mutate(pulse, pound = weight * 2.20462, stone = weight * 0.157473 )
# A tibble: 110 × 15
   id     name   height weight   age gender smokes alcohol exerc…¹ ran   pulse1 pulse2  year pound stone
   <chr>  <chr>   <dbl>  <dbl> <dbl> <chr>  <chr>  <chr>   <chr>   <chr>  <dbl>  <dbl> <dbl> <dbl> <dbl>
 1 1993_A Bonnie    173     57    18 female no     yes     modera… sat       86     88  1993  126.  8.98
 2 1993_B Melan…    179     58    19 female no     yes     modera… ran       82    150  1993  128.  9.13
 3 1993_C Consu…    167     62    18 female no     yes     high    ran       96    176  1993  137.  9.76
 4 1993_D Travis    195     84    18 male   no     yes     high    sat       71     73  1993  185. 13.2 
 5 1993_E Lauri     173     64    18 female no     yes     low     sat       90     88  1993  141. 10.1 
 6 1993_F George    184     74    22 male   no     yes     low     ran       78    141  1993  163. 11.7 
 7 1993_G Cherry    162     57    20 female no     yes     modera… sat       68     72  1993  126.  8.98
 8 1993_H Franc…    169     55    18 female no     yes     modera… sat       71     77  1993  121.  8.66
 9 1993_I Sonja     164     56    19 female no     yes     high    sat       68     68  1993  123.  8.82
10 1993_J Troy      168     60    23 male   no     yes     modera… ran       88    150  1993  132.  9.45
# … with 100 more rows, and abbreviated variable name ¹​exercise
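Within a single mutate() call, expressions are evaluated in order, so a later column can be built from an earlier one. The sketch below uses that to cross-check the two conversion factors, relying on 1 stone = 14 pounds. The data frame pulse_demo and its weights are made up for illustration and are not taken from pulse.

```r
library(dplyr)

# Hypothetical weights in kg, for illustration only
pulse_demo <- tibble::tibble(weight = c(57, 60, 84))

converted <- mutate(
  pulse_demo,
  pound = weight * 2.20462,  # kg to pounds
  stone = pound / 14         # uses the pound column created just above
)

# stone derived from pound agrees with the direct factor 0.157473
# up to the rounding of the published constants
all.equal(converted$stone, pulse_demo$weight * 0.157473, tolerance = 1e-4)
```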
  1. In the survey dataset, convert the variable smokes from character to factor with levels {"never", "occas", "regul", "heavy"}, in that order.
mutate(survey, smokes = fct_relevel(factor(smokes), "never","occas","regul", "heavy")) 
# A tibble: 233 × 13
   name    gender span1 span2 hand  fold    pulse clap    exercise smokes height m.i        age
   <chr>   <chr>  <dbl> <dbl> <chr> <chr>   <dbl> <chr>   <chr>    <fct>   <dbl> <chr>    <dbl>
 1 Alyson  female  18.5  18   right right      92 left    some     never    173  metric    18.2
 2 Todd    male    19.5  20.5 left  right     104 left    none     regul    178. imperial  17.6
 3 Gerald  male    18    13.3 right left       87 neither none     occas     NA  <NA>      16.9
 4 Robert  male    18.8  18.9 right right      NA neither none     never    160  metric    20.3
 5 Dustin  male    20    20   right neither    35 right   some     never    165  metric    23.7
 6 Abby    female  18    17.7 right left       64 right   some     never    173. imperial  21  
 7 Andre   male    17.7  17.7 right left       83 right   freq     never    183. imperial  18.8
 8 Michael female  17    17.3 right right      74 right   freq     never    157  metric    35.8
 9 Edward  male    20    19.5 right right      72 right   some     never    175  metric    19  
10 Carl    male    18.5  18.5 right right      90 right   some     never    167  metric    22.3
# … with 223 more rows
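The tibble header only shows that smokes is now a <fct>; the level order itself can be inspected with levels(). Without an explicit order, factor() sorts the levels alphabetically, which is why a reordering step is needed. As a minimal sketch on a made-up vector (the values below are illustrative, not taken from survey), base R's factor() with a levels argument is one alternative way to fix the order:

```r
# Hypothetical smoking statuses, for illustration only
smokes <- c("never", "regul", "occas", "never", "heavy", NA)

# Default: levels are sorted alphabetically
default_levels <- levels(factor(smokes))
default_levels

# Explicit levels fix the order in a single step
smokes_fct <- factor(smokes, levels = c("never", "occas", "regul", "heavy"))
levels(smokes_fct)

# The underlying integer codes follow the chosen order
as.integer(smokes_fct)
```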

Extra exercises

In the survey dataset:

  1. Add a new column diffWritingHandSpan: the absolute difference between span1 (writing hand) and span2 (non-writing hand). Hint: use the abs function (see ?abs).
mutate(survey, diffWritingHandSpan=abs(span1-span2))
# A tibble: 233 × 14
   name    gender span1 span2 hand  fold    pulse clap    exercise smokes height m.i        age diffWr…¹
   <chr>   <chr>  <dbl> <dbl> <chr> <chr>   <dbl> <chr>   <chr>    <chr>   <dbl> <chr>    <dbl>    <dbl>
 1 Alyson  female  18.5  18   right right      92 left    some     never    173  metric    18.2    0.5  
 2 Todd    male    19.5  20.5 left  right     104 left    none     regul    178. imperial  17.6    1    
 3 Gerald  male    18    13.3 right left       87 neither none     occas     NA  <NA>      16.9    4.7  
 4 Robert  male    18.8  18.9 right right      NA neither none     never    160  metric    20.3    0.100
 5 Dustin  male    20    20   right neither    35 right   some     never    165  metric    23.7    0    
 6 Abby    female  18    17.7 right left       64 right   some     never    173. imperial  21      0.300
 7 Andre   male    17.7  17.7 right left       83 right   freq     never    183. imperial  18.8    0    
 8 Michael female  17    17.3 right right      74 right   freq     never    157  metric    35.8    0.300
 9 Edward  male    20    19.5 right right      72 right   some     never    175  metric    19      0.5  
10 Carl    male    18.5  18.5 right right      90 right   some     never    167  metric    22.3    0    
# … with 223 more rows, and abbreviated variable name ¹​diffWritingHandSpan
  1. Change the unit of height from cm to inches (1 cm = 0.393701 inch).
mutate(survey, height=height*0.393701)
# A tibble: 233 × 13
   name    gender span1 span2 hand  fold    pulse clap    exercise smokes height m.i        age
   <chr>   <chr>  <dbl> <dbl> <chr> <chr>   <dbl> <chr>   <chr>    <chr>   <dbl> <chr>    <dbl>
 1 Alyson  female  18.5  18   right right      92 left    some     never    68.1 metric    18.2
 2 Todd    male    19.5  20.5 left  right     104 left    none     regul    70.0 imperial  17.6
 3 Gerald  male    18    13.3 right left       87 neither none     occas    NA   <NA>      16.9
 4 Robert  male    18.8  18.9 right right      NA neither none     never    63.0 metric    20.3
 5 Dustin  male    20    20   right neither    35 right   some     never    65.0 metric    23.7
 6 Abby    female  18    17.7 right left       64 right   some     never    68.0 imperial  21  
 7 Andre   male    17.7  17.7 right left       83 right   freq     never    72.0 imperial  18.8
 8 Michael female  17    17.3 right right      74 right   freq     never    61.8 metric    35.8
 9 Edward  male    20    19.5 right right      72 right   some     never    68.9 metric    19  
10 Carl    male    18.5  18.5 right right      90 right   some     never    65.7 metric    22.3
# … with 223 more rows
  1. Produce a tibble with the personal information (name, age, gender, height) of only those whose height is between 6.0 and 6.5 feet, inclusive.
filter(
  mutate(
    select(survey, Name=name, Age=age, Gender=gender, Height=height),
    feet=Height/30.48
  ),
  feet >= 6.0 & feet <= 6.5
)
# A tibble: 36 × 5
   Name        Age Gender Height  feet
   <chr>     <dbl> <chr>   <dbl> <dbl>
 1 Andre      18.8 male     183.  6   
 2 Virgil     18.3 male     183.  6   
 3 Ken        19.8 male     190.  6.25
 4 Virgil     17.9 male     190.  6.25
 5 Frank      18.2 male     184   6.04
 6 Nathaniel  18   male     190   6.23
 7 Ben        35.5 male     185   6.07
 8 Bernard    17.5 male     190   6.23
 9 Felix      20.3 male     187   6.14
10 Patrick    18   male     183   6.00
# … with 26 more rows
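The nested call has to be read from the inside out: select, then mutate, then filter. The same steps can be chained with the pipe operator %>%, which dplyr re-exports. The sketch below runs on a small made-up tibble; survey_demo and its values are hypothetical stand-ins for survey.

```r
library(dplyr)

# Hypothetical stand-in for survey, heights in cm
survey_demo <- tibble::tibble(
  name   = c("A", "B", "C", "D"),
  age    = c(18, 21, 35, 19),
  gender = c("male", "female", "male", "male"),
  height = c(185, 170, 195, 200)
)

tall <- survey_demo %>%
  select(Name = name, Age = age, Gender = gender, Height = height) %>%
  mutate(feet = Height / 30.48) %>%      # cm to feet
  filter(feet >= 6.0, feet <= 6.5)       # comma-separated conditions are combined with AND

tall$Name
```

Here only A (about 6.07 feet) and C (about 6.40 feet) are kept; B is too short and D, at about 6.56 feet, is above the upper bound.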
  1. How many observations does the following call return? Try to reason it out before running the statement.
filter(mutate(filter(survey, height > 190), feet = height/30.48), height <= 190)
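One way to reason about it: the inner filter() keeps only rows with height above 190, mutate() adds a column without changing the number of rows, and the outer filter() then keeps only rows with height of at most 190. No row can satisfy both conditions, and filter() also drops rows where the condition is NA, so the result has zero rows whatever the data. A sketch on made-up heights (demo is hypothetical, not the survey data):

```r
library(dplyr)

# Hypothetical heights in cm, including a missing value
demo <- tibble::tibble(height = c(165, 190, 191, 200, NA))

result <- filter(
  mutate(filter(demo, height > 190), feet = height / 30.48),
  height <= 190
)

nrow(result)  # 0: the two conditions exclude each other
```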
  1. The following table is one of many BMI classifications:
  classification bmi
1 underweight    <18.5
2 normal         18.5-24.9
3 overweight     25-29.9
4 obese          >=30

Add the variable BMI_class to the pulse dataset. BMI is the weight in kg divided by the squared height in metres. Round the computed BMI to one decimal place (see ?round and its digits argument) so that every value falls into one of the classes in the table above. Use the helper function case_when.

pulse_bmi <- mutate(pulse, BMI = round(weight/((height/100)^2), digits = 1))
mutate(pulse_bmi, BMI_class = case_when(
  BMI < 18.5 ~ "underweight",
  BMI >= 18.5 & BMI <= 24.9 ~ "normal",
  BMI >= 25   & BMI <= 29.9 ~ "overweight",
  BMI >= 30 ~ "obese"
))
# A tibble: 110 × 15
   id    name  height weight   age gender smokes alcohol exerc…¹ ran   pulse1 pulse2  year   BMI BMI_c…²
   <chr> <chr>  <dbl>  <dbl> <dbl> <chr>  <chr>  <chr>   <chr>   <chr>  <dbl>  <dbl> <dbl> <dbl> <chr>  
 1 1993… Bonn…    173     57    18 female no     yes     modera… sat       86     88  1993  19   normal 
 2 1993… Mela…    179     58    19 female no     yes     modera… ran       82    150  1993  18.1 underw…
 3 1993… Cons…    167     62    18 female no     yes     high    ran       96    176  1993  22.2 normal 
 4 1993… Trav…    195     84    18 male   no     yes     high    sat       71     73  1993  22.1 normal 
 5 1993… Lauri    173     64    18 female no     yes     low     sat       90     88  1993  21.4 normal 
 6 1993… Geor…    184     74    22 male   no     yes     low     ran       78    141  1993  21.9 normal 
 7 1993… Cher…    162     57    20 female no     yes     modera… sat       68     72  1993  21.7 normal 
 8 1993… Fran…    169     55    18 female no     yes     modera… sat       71     77  1993  19.3 normal 
 9 1993… Sonja    164     56    19 female no     yes     high    sat       68     68  1993  20.8 normal 
10 1993… Troy     168     60    23 male   no     yes     modera… ran       88    150  1993  21.3 normal 
# … with 100 more rows, and abbreviated variable names ¹​exercise, ²​BMI_class
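The rounding step matters because the table's classes are stated to one decimal place: an unrounded BMI such as 24.96 lies between 24.9 and 25 and matches none of the conditions, in which case case_when returns NA. A minimal sketch with made-up BMI values (classify_bmi is a hypothetical helper wrapping the same conditions):

```r
library(dplyr)

# Hypothetical helper: the same conditions as in the table above
classify_bmi <- function(bmi) {
  case_when(
    bmi < 18.5 ~ "underweight",
    bmi >= 18.5 & bmi <= 24.9 ~ "normal",
    bmi >= 25 & bmi <= 29.9 ~ "overweight",
    bmi >= 30 ~ "obese"
  )
}

# Made-up unrounded BMI values
raw_bmi <- c(17.0, 24.96, 31.2)

raw_class     <- classify_bmi(raw_bmi)                     # 24.96 matches no class
rounded_class <- classify_bmi(round(raw_bmi, digits = 1))  # 24.96 becomes 25

raw_class
rounded_class
```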
  1. Age classification:
  classification age              notation
1 adult          >19              (19,∞)
2 adolescent     >=10 and <=19    [10,19]
3 child          >=1 and <=9      [1,9]
4 infant         <1               (-∞,1)

Add the variable age_group to the survey dataset.

mutate(survey, age_group = case_when(
  age > 19 ~ "adult",
  age >= 10 & age <= 19 ~ "adolescent",
  age >= 1 & age <= 9 ~ "child",   # [1,9] includes age 1
  age < 1 ~ "infant"               # (-∞,1) excludes age 1
))
# A tibble: 233 × 14
   name    gender span1 span2 hand  fold    pulse clap    exercise smokes height m.i        age age_gr…¹
   <chr>   <chr>  <dbl> <dbl> <chr> <chr>   <dbl> <chr>   <chr>    <chr>   <dbl> <chr>    <dbl> <chr>   
 1 Alyson  female  18.5  18   right right      92 left    some     never    173  metric    18.2 adolesc…
 2 Todd    male    19.5  20.5 left  right     104 left    none     regul    178. imperial  17.6 adolesc…
 3 Gerald  male    18    13.3 right left       87 neither none     occas     NA  <NA>      16.9 adolesc…
 4 Robert  male    18.8  18.9 right right      NA neither none     never    160  metric    20.3 adult   
 5 Dustin  male    20    20   right neither    35 right   some     never    165  metric    23.7 adult   
 6 Abby    female  18    17.7 right left       64 right   some     never    173. imperial  21   adult   
 7 Andre   male    17.7  17.7 right left       83 right   freq     never    183. imperial  18.8 adolesc…
 8 Michael female  17    17.3 right right      74 right   freq     never    157  metric    35.8 adult   
 9 Edward  male    20    19.5 right right      72 right   some     never    175  metric    19   adolesc…
10 Carl    male    18.5  18.5 right right      90 right   some     never    167  metric    22.3 adult   
# … with 223 more rows, and abbreviated variable name ¹​age_group
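The class boundaries are easiest to check on a handful of hand-picked ages. The sketch below applies the table's intervals to made-up values. Note that the table, as written, leaves ages strictly between 9 and 10 unclassified. This does not matter for ages in whole years, but the ages in survey are decimals, so such a value would get NA.

```r
library(dplyr)

# Made-up ages chosen to sit on or near the class boundaries
age <- c(0.5, 1, 9, 9.5, 10, 19, 19.1)

age_group <- case_when(
  age > 19 ~ "adult",
  age >= 10 & age <= 19 ~ "adolescent",
  age >= 1 & age <= 9 ~ "child",
  age < 1 ~ "infant"
)

# 9.5 matches no interval of the table and therefore becomes NA
tibble::tibble(age, age_group)
```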


Copyright © 2023 Biomedical Data Sciences (BDS) | LUMC