Primary exercises
- For this exercise we will first split the survey dataset into two separate tables in order to join them again. Call these df1 and df2. They will have disjoint sets of variables, except for name and age. The variables name and age combined are unique across all observations and will be used later for joining. Take, for example, all variables related to arm or hand in df1 and the rest in df2:
df1 : "name"   "span1" "span2" "hand"  "fold"   "clap"  "age"
df2 : "name"   "gender"  "pulse"  "exercise"   "smokes"  "height" "m.i" "age"
- Join df1 and df2 by name and age such that you obtain the original survey table.
- In exercise (a), i.e. the join of df1 and df2 by name and age, does it make any difference whether you choose inner_join, left_join or full_join? Hint: compare the two resulting tables with the function all_equal.
- Is the pair name and age also a good candidate for a key, i.e. does the combination of name and age uniquely identify each observation in the survey data? What about the combination of name with span1 or span2?
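
The pattern behind these exercises (split a table on shared key columns, join it back, and check whether a candidate key is unique) can be sketched on a small made-up table. This is only an illustration, assuming dplyr is installed: the table toy, its values, and the names t1, t2, joined, dup_name and dup_name_age are hypothetical and are not part of the survey data.

```r
library(dplyr)

# Hypothetical toy table standing in for the survey data (values are made up)
toy <- tibble(
  name   = c("Ann", "Bob", "Ann", "Cy"),
  age    = c(20, 21, 23, 20),
  span1  = c(18.5, 19.0, 18.5, 20.1),
  height = c(170, 180, 165, 175)
)

# Split into two tables that share only the key columns name and age
t1 <- toy %>% select(name, age, span1)
t2 <- toy %>% select(name, age, height)

# Join the two tables back together on the shared key
joined <- inner_join(t1, t2, by = c("name", "age"))

# A candidate key is unique when no combination of its values occurs twice:
# count the combinations and keep those that appear more than once
dup_name     <- toy %>% count(name) %>% filter(n > 1)
dup_name_age <- toy %>% count(name, age) %>% filter(n > 1)
```

In this toy table name alone is repeated, so dup_name is non-empty, while dup_name_age has no rows, which is what a usable join key should look like. The same count-and-filter check can be applied to any other combination of columns.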