Use \({\small \;\;{\%}{>}{\%}\;\;} \) to chain calculations.
The \({\small \;\;{\%}{>}{\%}\;\;} \) operator lets you write function calls in sequence from left to right, instead of nesting them or storing results in intermediate variables.
For example, recall exercise 4 in the filter practice section on the survey data:
Solution:
nrow( filter(survey, smokes!="never" & exercise=="none" & gender=="male") )
[1] 4
Here, the function filter is first run on the tibble survey with some conditions, resulting in a tibble with the rows fulfilling those conditions. The function nrow is then called on the resulting tibble, which gives our answer, 4.
In RStudio, use Ctrl+Shift+M (Cmd+Shift+M on macOS) to insert the \({\small \;\;{\%}{>}{\%}\;\;} \) symbol.
Solution with %>% :
survey %>% filter(smokes!="never" & exercise=="none" & gender=="male") %>% nrow()
[1] 4
The solution now reads from left to right \(survey {\small \;\;{\%}{>}{\%}\;\;} filter(...) {\small \;\;{\%}{>}{\%}\;\;} nrow\) instead of \(nrow(filter(survey,...))\).
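As a minimal sketch of the three styles mentioned above (intermediate variable, nested calls, pipe), the same count can be computed in all three ways. The tibble `toy` and its values are hypothetical and only stand in for the course's survey data; the sketch assumes the dplyr package is installed.

```r
library(dplyr)  # provides %>%, filter() and tibble()

# Hypothetical toy data, NOT the course's survey dataset
toy <- tibble(
  name   = c("Ann", "Bob", "Cas", "Dov"),
  gender = c("female", "male", "male", "male"),
  smokes = c("never", "often", "never", "sometimes")
)

# 1. Intermediate variable: store the filtered tibble, then count its rows
male_smokers   <- filter(toy, smokes != "never" & gender == "male")
n_intermediate <- nrow(male_smokers)

# 2. Nested calls: read from the inside out
n_nested <- nrow(filter(toy, smokes != "never" & gender == "male"))

# 3. Pipe: read from left to right
n_piped <- toy %>% filter(smokes != "never" & gender == "male") %>% nrow()

# All three approaches give the same count
c(n_intermediate, n_nested, n_piped)
```

The piped version avoids both the extra variable and the inside-out reading order, which matters more as the number of steps grows.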
Observation: In the \({\small \;\;{\%}{>}{\%}\;\;} \) example above, note that filter does not have survey as its first argument; the same is true for nrow. This is what the \({\small \;\;{\%}{>}{\%}\;\;} \) operator does: it takes the result of the left-hand side and places it as the first argument of the function on its right-hand side.
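The first-argument rule can be checked directly on a small vector. This is an illustrative sketch using base R functions and the magrittr package (assumed to be installed); the vector `x` is made up for the example.

```r
library(magrittr)  # provides %>%

x <- c(3.14159, 2.71828)  # hypothetical values for illustration

# x %>% round(2) is evaluated as round(x, 2):
# the left-hand side becomes the first argument of round()
piped  <- x %>% round(2)
direct <- round(x, 2)
identical(piped, direct)

# A chain of two calls: the same as sum(sqrt(c(1, 4, 9)))
total <- c(1, 4, 9) %>% sqrt() %>% sum()
total
```

Any further arguments written inside the parentheses, such as the `2` in `round(2)`, simply follow the piped value.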
Examples:
From the pulse dataset produce a tibble of personal information (name, age and gender):
pulse %>% filter(height>190) %>%
select(name,age,gender)
# A tibble: 5 × 3
name age gender
<chr> <dbl> <chr>
1 Travis 18 male
2 John 19 male
3 Albert 25 male
4 Lance 21 male
5 Christopher 18 male
pulse %>% filter(weight>40 & weight<50) %>%
select(name,age,gender)
# A tibble: 8 × 3
name age gender
<chr> <dbl> <chr>
1 Tisha 18 female
2 Marissa 18 female
3 Adeline 20 female
4 Bridgett 19 female
5 Katrina 22 female
6 Julianne 19 female
7 Sherri 23 female
8 Bettie 19 female
pulse %>% filter(smokes=="no" & alcohol=="no") %>%
select(name,age,gender)
# A tibble: 40 × 3
name age gender
<chr> <dbl> <chr>
1 Frederick 19 male
2 Leslie 19 male
3 Maura 19 female
4 Jerome 19 male
5 Arlene 34 female
6 Glenna 20 female
7 John 19 male
8 Erma 18 female
9 Olga 21 female
10 Laurie 19 female
# … with 30 more rows
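Complex example: We are asked to list only the names of the females in the pulse dataset with average pulse \(>110\). We can break the problem down into four steps: keep only the females (filter), add a variable averagePulse (mutate), keep only the rows with averagePulse above 110 (filter), and extract the names (pull), i.e.: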
pulse %>% filter(gender=="female") %>% # tibble females
mutate(averagePulse=(pulse1+pulse2)/2) %>% # tibble females with averagePulse
filter(averagePulse>110) %>% # tibble females with averagePulse>110
pull(name) # vector of names
[1] "Melanie" "Consuelo" "Kelli" "Eliza" "Maude" "Lizzie"
⚠️ When a sequence of piped commands is spread over several lines, as in an R Markdown file, every line except the last must end with the \({\small \;\;{\%}{>}{\%}\;\;} \) symbol; otherwise R treats the line as a complete command and the continuation fails. See the example above. Note also that a line may contain multiple \({\small \;\;{\%}{>}{\%}\;\;} \) symbols, e.g. the first line in the example above, as long as it ends with a \({\small \;\;{\%}{>}{\%}\;\;} \) before the command on the next line. This way R knows that more commands follow on the next line.
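The line-break rule can be illustrated with a short sketch. The tibble `toy` is hypothetical, and the sketch assumes the dplyr package is installed. The incorrect layout is passed to `parse()` as text, so the script itself still runs while showing that R rejects it.

```r
library(dplyr)  # provides %>%, filter(), select(), pull() and tibble()

# Hypothetical toy data for illustration
toy <- tibble(name = c("Ann", "Bob", "Cas"), age = c(18, 25, 31))

# Correct: every line except the last ends with %>%
older <- toy %>% filter(age > 20) %>%
  select(name) %>%
  pull(name)
older

# Incorrect: the first line is already a complete command,
# so the second line starts with %>% and cannot be parsed
bad_code <- "toy %>% filter(age > 20)\n  %>% pull(name)"
bad <- try(parse(text = bad_code), silent = TRUE)
is_syntax_error <- inherits(bad, "try-error")
is_syntax_error
```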
Copyright © 2023 Biomedical Data Sciences (BDS) | LUMC