Primary exercises
On the survey dataset:
- Report mean hand spans {span1,span2} per gender.
survey %>% group_by(gender) %>%
summarise(meanWritingHand=mean(span1),meanNonWritingHand=mean(span2), .groups='drop')
# A tibble: 2 × 3
  gender meanWritingHand meanNonWritingHand
  <chr>            <dbl>              <dbl>
1 female            17.6               17.5
2 male              19.7               19.7
- Report mean and median age per gender, including the size of each gender group.
survey %>% group_by(gender) %>%
summarise(size=n(), meanAge=mean(age), medianAge=median(age), .groups='drop')
# A tibble: 2 × 4
  gender  size meanAge medianAge
  <chr>  <int>   <dbl>     <dbl>
1 female   117    20.4      18.4
2 male     116    20.3      18.9
- Report mean pulse per gender and exercise group, including the size of each group. Do the same for gender and smokes group.
# gender,exercise group
survey %>% group_by(gender, exercise) %>%
summarise(size=n(), meanPulse=mean(pulse, na.rm=TRUE), .groups='drop')
# A tibble: 6 × 4
  gender exercise  size meanPulse
  <chr>  <chr>    <int>     <dbl>
1 female freq        48      73.7
2 female none        11      71.4
3 female some        58      77
4 male   freq        65      70.7
5 male   none        12      80.6
6 male   some        39      75.5
# gender,smokes group
survey %>% group_by(gender, smokes) %>%
summarise(size=n(), meanPulse=mean(pulse, na.rm=TRUE), .groups='drop')
# A tibble: 8 × 4
  gender smokes  size meanPulse
  <chr>  <chr>  <int>     <dbl>
1 female heavy      5      75
2 female never     98      75.7
3 female occas      9      73.4
4 female regul      5      69.2
5 male   heavy      6      82.7
6 male   never     88      72.4
7 male   occas     10      74.5
8 male   regul     12      75.2
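The pulse summaries above pass na.rm=TRUE, which suggests that pulse has missing values. Note that n() counts every row in a group, including rows where pulse is missing. The following is a minimal sketch on a small made-up tibble (toy and the nPulse column are hypothetical, not part of the survey data or the original solution) of how the number of recorded pulse values could be reported next to the group size:

```r
library(dplyr)

# Hypothetical toy data, NOT the survey dataset
toy <- tibble(
  gender = c("female", "female", "female", "male", "male"),
  pulse  = c(70, NA, 80, 60, NA)
)

pulse_summary <- toy %>%
  group_by(gender) %>%
  summarise(
    size      = n(),                       # all rows, including missing pulse
    nPulse    = sum(!is.na(pulse)),        # rows with a recorded pulse only
    meanPulse = mean(pulse, na.rm = TRUE), # mean over recorded values
    .groups   = "drop"
  )

pulse_summary
```

If size and nPulse differ for a group, the reported mean is based on fewer observations than the group size suggests.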
- Report the mean age of those who exercise frequently per gender, including the group size.
# Solution 1:
survey %>% filter(exercise=='freq') %>%
group_by(gender) %>%
summarise(size=n(), meanAgeFreqExercise=mean(age), .groups='drop')
# A tibble: 2 × 3
  gender  size meanAgeFreqExercise
  <chr>  <int>               <dbl>
1 female    48                20.2
2 male      65                20.5
# Solution 2: first group and summarise, then keep only the 'freq' exercise group. This is
# less efficient because it computes the summary for all exercise groups, whereas
# Solution 1 summarises a smaller table that contains only the 'freq' exercise group.
#
# survey %>% group_by(gender,exercise) %>%
# summarise(size=n(), meanAgeFreqExercise=mean(age), .groups='drop') %>%
# filter(exercise=='freq') %>%
# select(-exercise)
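
To check that the two solutions agree, both can be run on the same input and compared. A minimal sketch, assuming a small made-up tibble (toy is hypothetical and stands in for survey; sol1 and sol2 are illustrative names):

```r
library(dplyr)

# Hypothetical toy data, NOT the survey dataset
toy <- tibble(
  gender   = c("female", "female", "male", "male", "male"),
  exercise = c("freq", "none", "freq", "freq", "some"),
  age      = c(20, 30, 18, 22, 40)
)

# Solution 1: filter first, then group and summarise
sol1 <- toy %>%
  filter(exercise == "freq") %>%
  group_by(gender) %>%
  summarise(size = n(), meanAgeFreqExercise = mean(age), .groups = "drop")

# Solution 2: summarise every gender/exercise group, then keep 'freq'
sol2 <- toy %>%
  group_by(gender, exercise) %>%
  summarise(size = n(), meanAgeFreqExercise = mean(age), .groups = "drop") %>%
  filter(exercise == "freq") %>%
  select(-exercise)

# Compare as plain data frames
all.equal(as.data.frame(sol1), as.data.frame(sol2))
```

Both pipelines return one row per gender; the difference is only in how many groups are summarised before the filter.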