Primary exercises
- For this exercise we will first split the survey dataset into two separate tables in order to join them again. Call these df1 and df2. They will have disjoint sets of variables except name and age; the variables name and age combined are unique across all observations and will be used later for joining. Take, for example, all variables related to arm or hand in df1 and the rest in df2:
df1 : "name" "span1" "span2" "hand" "fold" "clap" "age"
df2 : "name" "gender" "pulse" "exercise" "smokes" "height" "m.i" "age"
(a) Join df1 and df2 by name and age such that you obtain the original survey table.
(b) In exercise (a), does it make any difference whether you choose inner_join, left_join or full_join? Hint: compare two tables with the function all_equal.
(c) Is the pair name and age also a good candidate for the key, i.e. does the combination of name and age uniquely identify each observation in the survey data? What about the combination of name with span1 or span2?
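
The workflow in these exercises (split, re-join, compare join types, check candidate keys) can be sketched on a small made-up table. This is only an illustration under stated assumptions: the tibble toy and all of its values are hypothetical and stand in for the survey data, only a few of the listed variables are included, and dplyr is assumed to be installed. Since all_equal is marked as deprecated in recent dplyr releases, the sketch compares tables with base R's all.equal instead; with an older dplyr, all_equal could be used in the same places.

```r
library(dplyr)

# Hypothetical toy table standing in for the survey data (values are invented).
toy <- tibble(
  name   = c("Ann", "Ann", "Bob", "Cat"),
  age    = c(20, 31, 20, 25),
  span1  = c(18.5, 18.5, 18.5, 17.0),
  hand   = c("Right", "Left", "Right", "Right"),
  gender = c("Female", "Female", "Male", "Female"),
  pulse  = c(70, 64, 80, 72)
)

# Split into two tables that share only the key columns name and age.
df1 <- toy %>% select(name, span1, hand, age)
df2 <- toy %>% select(name, gender, pulse, age)

# Join the two tables back together on the composite key, once per join type.
by_key <- c("name", "age")
inner  <- inner_join(df1, df2, by = by_key)
left   <- left_join(df1, df2, by = by_key)
full   <- full_join(df1, df2, by = by_key)

# Compare the join results with each other.
same_inner_left <- isTRUE(all.equal(inner, left))
same_inner_full <- isTRUE(all.equal(inner, full))

# Restore the original column order before comparing with the unsplit table.
rebuilt          <- inner %>% select(all_of(names(toy)))
matches_original <- isTRUE(all.equal(rebuilt, toy))

# Key check: count rows per key combination; any count above 1 is a duplicate key.
dup_name_age   <- toy %>% count(name, age) %>% filter(n > 1)
dup_name_span1 <- toy %>% count(name, span1) %>% filter(n > 1)
```

On this toy table the three joins agree because every key in df1 has exactly one match in df2; they would differ as soon as one table contained a key that the other lacks. An empty dup_name_age means name and age form a valid key here, while the toy values were chosen so that name with span1 does not. Whether the same holds for the real survey data is what exercises (b) and (c) ask you to verify.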