Primary exercises

Create tibble

  1. Create a tibble exercise_group for a group of individuals with names {Sonja, Steven, Ines, Robert, Tim}, heights {164, 188, 164, 180, 170}, weights {56.0, 87.0, 54.0, 80.0, 58.5} and frequency of exercise {high, high, low, moderate, low}.

  2. Update the tibble exercise_group with Ella and Oscar, leaving their height, weight and exercise values as missing (NA). Avoid copying and pasting the code from the previous exercise with the new names added; instead, try to reuse the columns already inside exercise_group.

  3. Add the sex variable to exercise_group with values male and female.

  4. Create a tibble that keeps track of smoking habits over the years: Julio (age 21) started smoking at age 17 and stopped in 2020; Camille (age 20) started smoking in 2021; and Travis (age 19) started at age 16.
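
As a hint for the exercises above, the following is a minimal sketch of the building blocks on an unrelated toy tibble. The pets data and its column names are hypothetical, and the sketch assumes the tibble package is installed; it is one possible approach, not necessarily the one used in the lecture.

```r
library(tibble)

# Hypothetical toy data, unrelated to the exercise
pets <- tibble(
  name   = c("Rex", "Milo", "Luna"),
  weight = c(30.5, 4.2, 5.1)
)

# Extend the tibble by reusing its existing columns;
# the new entry gets a missing value (NA) for weight
pets <- tibble(
  name   = c(pets$name, "Bella"),
  weight = c(pets$weight, NA)
)

# Add a new variable; its length must match the number of rows
pets$species <- c("dog", "cat", "cat", "dog")

pets
```

Reusing pets$name and pets$weight avoids retyping the values that are already stored in the tibble.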

Tibble subset

  1. Take the tibble exercise_group from the previous exercise and create a new tibble exercise_group_sub without the height and weight variables, using [ for selection.

  2. Create a tibble called exercise_group_sub with the 1st and 3rd columns.

Extract variables as vectors

  1. Given the tibble favourite_colour, how old were the subjects by the end of 2021?

  2. What is the mean height in exercise_group? Use the mean function (see ?mean).
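
As a hint, the sketch below shows how a column can be pulled out of a tibble as a plain vector and then used in a calculation. The pets tibble is hypothetical toy data; only $, [[ and mean are standard R.

```r
library(tibble)

# Hypothetical toy data with one missing weight
pets <- tibble(
  name   = c("Rex", "Milo", "Luna", "Bella"),
  weight = c(30.5, 4.2, 5.1, NA)
)

# Both $ and [[ return the column as a plain vector
w  <- pets$weight
w2 <- pets[["weight"]]

# Arithmetic on a vector is applied element-wise
w_grams <- w * 1000

# mean() returns NA when values are missing, unless na.rm = TRUE
mean(w)
mean_weight <- mean(w, na.rm = TRUE)
mean_weight
```

The na.rm argument is worth keeping in mind whenever a tibble contains rows with missing values.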

Read tibbles from file

  1. Read the pulse.csv data set into R and inspect its dimensions.

  2. Read the survey.csv data set into R.

  • Inspect the dimensions.

  • Show the first 9 and the last 7 rows.

  • Calculate the mean age.

  • Calculate the mean height in the survey data.
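
The steps above can be sketched on a small stand-in file. This assumes the readr package and its read_csv function are used for reading; the temporary file and the pets data are hypothetical, created only so that the example is self-contained.

```r
library(readr)

# Write a small hypothetical CSV to a temporary file
path <- tempfile(fileext = ".csv")
writeLines(
  c("name,weight", "Rex,30.5", "Milo,4.2", "Luna,5.1", "Bella,NA"),
  path
)

# read_csv returns a tibble; show_col_types = FALSE silences the column message
pets <- read_csv(path, show_col_types = FALSE)

dim(pets)          # number of rows and number of columns
head(pets, n = 2)  # first 2 rows
tail(pets, n = 3)  # last 3 rows
```

For the exercise, the path would point to the downloaded csv file instead of a temporary one.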

Extra exercises

  1. In the survey data:

     1. What is the mean height of the last 30 observations?

     2. The variable age is the last column in the survey data. Make a tibble where the variable age comes directly after name.

  2. Create the favourite_colour tibble from the lecture, but now with the colour variable as a factor. Print the counts for each level.
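
As a hint for the extra exercises, here is a minimal sketch of a factor column, its level counts and a column reordering. The pets tibble is hypothetical toy data, and using table for the counts is one option among several.

```r
library(tibble)

# Hypothetical toy data with a categorical variable stored as a factor
pets <- tibble(
  name    = c("Rex", "Milo", "Luna", "Bella"),
  species = factor(c("dog", "cat", "cat", "dog"))
)

# The distinct categories of the factor
levels(pets$species)

# Number of observations per level
counts <- table(pets$species)
counts

# Reorder columns by listing their names in the desired order
reordered <- pets[, c("species", "name")]
reordered
```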

[<row>, <column>]: row and column selection based on a range of indices.

Using a single square bracket [ one can select a range of rows, a range of columns, or a combination of both. For example, take the exercise_group tibble from the primary exercises above, then:

exercise_group[c(2,3),]  # returns rows in the range 2 to 3
# A tibble: 2 × 5
  name   height weight exercise sex   
  <chr>   <dbl>  <dbl> <chr>    <chr> 
1 Steven    188     87 high     male  
2 Ines      164     54 low      female
exercise_group[,c(1,3)]  # returns columns 1 and 3
# A tibble: 7 × 2
  name   weight
  <chr>   <dbl>
1 Sonja    56  
2 Steven   87  
3 Ines     54  
4 Robert   80  
5 Tim      58.5
6 Ella     NA  
7 Oscar    NA  
exercise_group[c(2,3),c(1,3)]  # combination of the above
# A tibble: 2 × 2
  name   weight
  <chr>   <dbl>
1 Steven     87
2 Ines       54
  1. Reproduce the following tibbles from exercise_group:
# A tibble: 2 × 5
  name   height weight exercise sex   
  <chr>   <dbl>  <dbl> <chr>    <chr> 
1 Sonja     164     56 high     female
2 Robert    180     80 moderate male  
# A tibble: 4 × 3
  name   weight exercise
  <chr>   <dbl> <chr>   
1 Steven   87   high    
2 Ines     54   low     
3 Robert   80   moderate
4 Tim      58.5 low     
# A tibble: 2 × 3
  height weight exercise
   <dbl>  <dbl> <chr>   
1     NA     NA <NA>    
2     NA     NA <NA>    


Copyright © 2023 Biomedical Data Sciences (BDS) | LUMC