(ADV) Fitting multiple simple linear regressions to parts of a
table.
Load the pulse.csv dataset into the pulse
variable.
Let’s define the following goal: for each gender separately,
calculate a linear fit of
weight as a function of height.
First split the pulse table by
gender into the pulseByGender variable.
Now pulseByGender is a list, and its elements can be accessed with
the [[...]] operator.
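The split step can be sketched as follows. This is only a minimal illustration: the small data frame is made up and merely stands in for the real pulse table, and the column names gender, height and weight are assumed from the exercise text.

```r
# Made-up toy stand-in for the pulse table (NOT the real pulse.csv data).
# Column names gender/height/weight are assumptions based on the text.
pulse <- data.frame(
  gender = c("female", "male", "female", "male", "female", "male"),
  height = c(160, 175, 165, 180, 170, 185),
  weight = c(55, 70, 60, 80, 63, 84)
)

# split() returns a list with one data frame per gender.
pulseByGender <- split(pulse, pulse$gender)

class(pulseByGender)           # the result is a plain list
names(pulseByGender)           # one element per gender
pulseByGender[["female"]]      # [[...]] extracts a single data frame
```

With the real data, the first statement would presumably be replaced by something like pulse <- read.csv( "pulse.csv" ), with the path adjusted to wherever the file is stored.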
Write the code to perform a linear fit of weight as a
function of height for females only
(taken from the pulseByGender list).
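One possible way to write the females-only fit is sketched below. The data frame is again a made-up stand-in for the pulse table, and fitFemale is a hypothetical variable name chosen for this illustration.

```r
# Made-up toy stand-in for the pulse table (NOT the real pulse.csv data).
pulse <- data.frame(
  gender = c("female", "male", "female", "male", "female", "male"),
  height = c(160, 175, 165, 180, 170, 185),
  weight = c(55, 70, 60, 80, 63, 84)
)
pulseByGender <- split(pulse, pulse$gender)

# Fit weight as a linear function of height, using the female rows only.
# fitFemale is a hypothetical name, not taken from the exercise.
fitFemale <- lm(weight ~ height, data = pulseByGender[["female"]])

fitFemale          # prints the call and the fitted coefficients
coef(fitFemale)    # intercept and slope as a named numeric vector
```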
The next goal is to perform the above fit multiple times, separately
for each gender.
Define a variable genders with all genders present in the
split object.
Now a loop is needed.
A variable gender should iterate over all
genders.
For each value of gender the respective linear fit should
be performed.
Understand and try the following code:
for( gender in genders ) {
    lm( weight ~ height, data = pulseByGender[[ gender ]] )
}
The above code calculates all the needed fits but does not show or store
the results anywhere.
The results need to be stored, for example in a list.
Understand/modify the code as follows:
fitByGender <- list()
for( gender in genders ) {
    fitByGender[[ gender ]] <- lm( weight ~ height, data = pulseByGender[[ gender ]] )
}
fitByGender
fitByGender[[ "female" ]]
A special function, lapply, allows the above code to be written
differently.
In R this is the preferred notation (despite being less intuitive at
first glance).
Rewrite the for loop using lapply as
follows:
fitByGender <- lapply( genders, function( gender ) {
    lm( weight ~ height, data = pulseByGender[[ gender ]] )
} )
fitByGender
names( fitByGender )
As you can see, fitByGender is a list, but its elements
are not named.
This is because lapply gives the result elements the same names as
the elements of the input it iterates over.
Check the names of genders (you will see
NULL, because a plain character vector has no names).
To fix this, set the names of the elements of genders to be
equal to the element values.
Now, let’s repeat the lapply loop and check the
result.
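The naming step and the repeated lapply call could look roughly like this. It is a sketch on made-up toy data standing in for the pulse table, and it assumes that genders is taken from the names of the split object.

```r
# Made-up toy stand-in for the pulse table (NOT the real pulse.csv data).
pulse <- data.frame(
  gender = c("female", "male", "female", "male", "female", "male"),
  height = c(160, 175, 165, 180, 170, 185),
  weight = c(55, 70, 60, 80, 63, 84)
)
pulseByGender <- split(pulse, pulse$gender)

# Assumption: the genders are read off the names of the split object.
genders <- names(pulseByGender)
names(genders)               # NULL: the character vector has no names yet

# Name each element after its own value.
names(genders) <- genders

# lapply copies the names of its input onto the result list.
fitByGender <- lapply(genders, function(gender) {
  lm(weight ~ height, data = pulseByGender[[gender]])
})

names(fitByGender)           # now the gender labels instead of NULL
fitByGender[["female"]]      # access a single fit by name
```

An equivalent shortcut for the naming step is genders <- setNames( genders, genders ).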
Check (by eye) that fitByGender[[ 'female' ]] shows the
same result as the females-only fit at the top of this example.