Primary exercises
- Manually created factor.
In a study, participants were asked whether their sport activity is none, oncePerWeek, severalPerWeek or daily.
Build a proper factor for the responses below and store it in a variable w.
Print the factor.
Write the code to count the number of occurrences of each level and
print the counts.
severalPerWeek, none, none, oncePerWeek, oncePerWeek, oncePerWeek, oncePerWeek, ?, none, none
v <- c( "severalPerWeek", "none", "none", "oncePerWeek", "oncePerWeek", "oncePerWeek", "oncePerWeek", NA, "none", "none" )
w <- factor( v, levels = c( "none", "oncePerWeek", "severalPerWeek", "daily" ) )
w
[1] severalPerWeek none none oncePerWeek oncePerWeek oncePerWeek
[7] oncePerWeek <NA> none none
Levels: none oncePerWeek severalPerWeek daily
fct_count( w )
# A tibble: 5 × 2
  f                  n
  <fct>          <int>
1 none               4
2 oncePerWeek        4
3 severalPerWeek     1
4 daily              0
5 <NA>               1
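The same counts can also be obtained without the tidyverse. The following is a minimal sketch in base R, assuming the missing response (the ? in the data) is recorded as NA; the variable name counts is only illustrative and not part of the exercise.

```r
# Same responses as in the exercise; "?" is recorded as NA (missing).
v <- c( "severalPerWeek", "none", "none", "oncePerWeek", "oncePerWeek",
        "oncePerWeek", "oncePerWeek", NA, "none", "none" )
w <- factor( v, levels = c( "none", "oncePerWeek", "severalPerWeek", "daily" ) )

# table() ignores NA by default; useNA = "ifany" adds an NA entry
# whenever missing values are present.
counts <- table( w, useNA = "ifany" )
print( counts )
```

Because the levels were given explicitly, the unused level daily is still listed, with a count of 0.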
- A factor with random content.
Read the help page for the function sample.
Then study and try the following lines of code to understand the results.
Next, work out why the last call generates an error and use the replace
argument to generate a vector with 100 samples.
Store this vector in a variable v and build a factor w from it.
Finally, count the number of occurrences of each level in w.
Ensure that the levels are in the order provided in the variable lvs.
lvs <- c( "none", "oncePerWeek", "severalPerWeek", "daily" )
sample( lvs, 3 )
[1] "severalPerWeek" "oncePerWeek" "none"
sample( lvs, 3 )
[1] "oncePerWeek" "daily" "severalPerWeek"
sample( lvs, 3 )
[1] "none" "daily" "severalPerWeek"
sample( lvs, 100 )
Error in sample.int(length(x), size, replace, prob): cannot take a sample larger than the population when 'replace = FALSE'
v <- sample( lvs, 100, replace = TRUE )
w <- factor( v, levels = lvs )
w
[1] oncePerWeek daily oncePerWeek daily severalPerWeek oncePerWeek
[7] oncePerWeek severalPerWeek none severalPerWeek severalPerWeek daily
[13] severalPerWeek none none oncePerWeek oncePerWeek severalPerWeek
[19] severalPerWeek oncePerWeek none daily oncePerWeek none
[25] oncePerWeek severalPerWeek oncePerWeek none none severalPerWeek
[31] severalPerWeek daily severalPerWeek none severalPerWeek severalPerWeek
[37] daily severalPerWeek none oncePerWeek daily none
[43] oncePerWeek severalPerWeek daily daily severalPerWeek oncePerWeek
[49] oncePerWeek none daily severalPerWeek severalPerWeek oncePerWeek
[55] oncePerWeek severalPerWeek none severalPerWeek severalPerWeek none
[61] daily none none oncePerWeek none severalPerWeek
[67] daily none severalPerWeek oncePerWeek daily daily
[73] daily oncePerWeek daily none oncePerWeek oncePerWeek
[79] none daily oncePerWeek severalPerWeek none daily
[85] none oncePerWeek none oncePerWeek severalPerWeek severalPerWeek
[91] oncePerWeek oncePerWeek severalPerWeek oncePerWeek oncePerWeek daily
[97] none daily daily oncePerWeek
Levels: none oncePerWeek severalPerWeek daily
fct_count( w )
# A tibble: 4 × 2
  f                  n
  <fct>          <int>
1 none              23
2 oncePerWeek       29
3 severalPerWeek    27
4 daily             21
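Since sample draws at random, the counts above will differ on every run. The sketch below, written in base R only, shows why the call without replacement fails and how a fixed seed makes the result repeatable. The seed value 42 and the variable name msg are arbitrary assumptions for illustration, not part of the exercise.

```r
lvs <- c( "none", "oncePerWeek", "severalPerWeek", "daily" )

# Without replacement at most length( lvs ) = 4 values can be drawn,
# so asking for 100 fails; tryCatch() captures the error message.
msg <- tryCatch( sample( lvs, 100 ),
                 error = function( e ) conditionMessage( e ) )
print( msg )

# Assumption: a fixed seed (42 is arbitrary) so the draw can be repeated.
set.seed( 42 )
v <- sample( lvs, 100, replace = TRUE )

# Passing levels = lvs keeps the levels in the order given in lvs
# instead of the default alphabetical order.
w <- factor( v, levels = lvs )
print( table( w ) )
```

Without the levels argument, factor would sort the levels alphabetically (daily, none, oncePerWeek, severalPerWeek), which is not the order requested in the exercise.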
- Reordering factor levels.
When a factor is shown on an axis of a plot, the order is given by its
levels.
The factor w from the previous exercise will therefore be shown in this order: none, oncePerWeek, severalPerWeek, daily.
But for a figure in a manuscript the following order might be needed: daily, severalPerWeek, oncePerWeek, none.
Apply to w one of the fct_ functions from the tidyverse library to produce a factor w2 with the requested order.
Show the levels of w2.
Again show the number of elements of each level in w2 and
compare it with the table of the previous exercise.
w2 <- fct_relevel( w, c( "daily", "severalPerWeek", "oncePerWeek", "none" ) )
levels( w2 )
[1] "daily" "severalPerWeek" "oncePerWeek" "none"
fct_count( w2 )
# A tibble: 4 × 2
  f                  n
  <fct>          <int>
1 daily             21
2 severalPerWeek    27
3 oncePerWeek       29
4 none              23
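The requested order happens to be the exact reverse of the original one, so the levels can also be reversed instead of listed by hand (forcats offers fct_rev for this). As a cross-check, here is a minimal base R sketch that rebuilds w with an arbitrary seed and confirms that reordering the levels changes neither the data nor the counts; the seed and the name same_counts are assumptions for illustration.

```r
lvs <- c( "none", "oncePerWeek", "severalPerWeek", "daily" )
set.seed( 1 )  # arbitrary seed, only to make this sketch repeatable
w <- factor( sample( lvs, 100, replace = TRUE ), levels = lvs )

# Rebuild the factor with the levels reversed; the values stay the same.
w2 <- factor( w, levels = rev( levels( w ) ) )
print( levels( w2 ) )

# The counts per level are identical, only listed in the opposite order.
same_counts <- all( as.vector( table( w )[ levels( w2 ) ] ) ==
                    as.vector( table( w2 ) ) )
print( same_counts )
```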