Primary exercises
- Dietary intakes. (Create a vector, use it in a calculation.)
 Four patients had daily dietary intakes of 2314, 2178, 1922, 2004 kcal.
 Make a vector intakesKCal of these four values.
 What is the class of this vector?
 Convert the values into kJ using 1 kcal = 4.184 kJ.
intakesKCal <- c( 2314, 2178, 1922, 2004 )
intakesKCal
[1] 2314 2178 1922 2004
class( intakesKCal )
[1] "numeric"
intakesKCal * 4.184
[1] 9681.776 9112.752 8041.648 8384.736
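The conversion above works because R applies arithmetic to every element of a vector. A minimal sketch, assuming the hypothetical names kJPerKCal and intakesKJ (neither is part of the exercise), that stores the converted values and converts them back as a sanity check:

```r
# Values from the exercise above
intakesKCal <- c( 2314, 2178, 1922, 2004 )

# Hypothetical name for the conversion factor (1 kcal = 4.184 kJ)
kJPerKCal <- 4.184

# Multiplication is applied to each element of the vector
intakesKJ <- intakesKCal * kJPerKCal
intakesKJ

# Converting back should recover the original values (up to rounding error)
all.equal( intakesKJ / kJPerKCal, intakesKCal )
```

Storing the factor in a variable is only a readability choice; typing 4.184 directly, as in the solution, gives the same result.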
- More dietary intakes. (Combining/appending/merging vectors.)
 An additional set of intakes is provided: 2122, 2616, NA, 1771 kcal.
 Use c() to append the new intakes after the values in intakesKCal and store the result in allIntakesKCal.
 Print the combined vector and print its length, calculated with length().
intakesKCal2 <- c( 2122, 2616, NA, 1771 )
allIntakesKCal <- c( intakesKCal, intakesKCal2 )
allIntakesKCal
[1] 2314 2178 1922 2004 2122 2616   NA 1771
length( allIntakesKCal )
[1] 8
- The average and total intakes. (Calculating means and sums, skipping missing values.)
 Calculate the mean intake for patients in vector intakesKCal.
 Next, calculate the mean intake for patients in vector allIntakesKCal.
 Can you explain the result?
 Check the help for ?mean, in particular the na.rm argument.
 Use the extra argument na.rm=TRUE to calculate the mean of non-NA elements of allIntakesKCal.
 Check the help for ?sum to see how to omit NA elements in the sum calculation.
 Now, calculate the total sum of allIntakesKCal intakes, ignoring the NA element.
mean( intakesKCal )
[1] 2104.5
mean( allIntakesKCal )
[1] NA
# since one element is missing, the mean is unknown
# ?mean, adding argument na.rm=TRUE will omit NA elements
mean( allIntakesKCal, na.rm = TRUE )
[1] 2132.429
# ?sum also allows na.rm=TRUE argument to skip NA elements
sum( allIntakesKCal, na.rm = TRUE )
[1] 14927
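The same behaviour is not specific to mean and sum. As a small sketch that goes slightly beyond the exercise, the base R functions median, min and max also return NA when an element is missing, and they accept the same na.rm argument:

```r
# Combined intakes from the exercise above, including one missing value
allIntakesKCal <- c( 2314, 2178, 1922, 2004, 2122, 2616, NA, 1771 )

# Without na.rm the missing element makes the result unknown
median( allIntakesKCal )
min( allIntakesKCal )

# With na.rm = TRUE the NA element is dropped before the calculation
median( allIntakesKCal, na.rm = TRUE )
min( allIntakesKCal, na.rm = TRUE )
max( allIntakesKCal, na.rm = TRUE )
```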
- Selecting valid intakes. (Selecting non-missing elements; logical vectors.)
 Understand the result of is.na( allIntakesKCal ).
 Now, negate the above result with the ! operator.
 Use each of the above vectors as the argument to sum to calculate the number of missing and non-missing elements in allIntakesKCal.
 Understand allIntakesKCal[ !is.na( allIntakesKCal ) ].
is.na( allIntakesKCal )         # TRUE marks positions with missing data
[1] FALSE FALSE FALSE FALSE FALSE FALSE  TRUE FALSE
!is.na( allIntakesKCal )        # TRUE marks positions with available data
[1]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE  TRUE
sum( is.na( allIntakesKCal ) )                # number of missing elements
[1] 1
sum( !is.na( allIntakesKCal ) )               # number of non-missing elements
[1] 7
allIntakesKCal[ !is.na( allIntakesKCal ) ]    # keeps elements which are not NA
[1] 2314 2178 1922 2004 2122 2616 1771
sum( allIntakesKCal[ !is.na( allIntakesKCal ) ] )    # same as sum( allIntakesKCal, na.rm = TRUE )
[1] 14927
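Summing a logical vector counts the TRUE values because TRUE is treated as 1 and FALSE as 0. The sketch below illustrates this and, as an addition not requested by the exercise, uses which to locate the missing element; validIntakes is a hypothetical variable name:

```r
allIntakesKCal <- c( 2314, 2178, 1922, 2004, 2122, 2616, NA, 1771 )

# TRUE is counted as 1 and FALSE as 0 in arithmetic
as.numeric( c( TRUE, FALSE, TRUE ) )

# which() returns the positions of the TRUE elements, here the position of the NA
which( is.na( allIntakesKCal ) )

# Hypothetical name for the vector of available values
validIntakes <- allIntakesKCal[ !is.na( allIntakesKCal ) ]

# Mean computed by hand; should agree with mean( allIntakesKCal, na.rm = TRUE )
sum( validIntakes ) / length( validIntakes )
```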
- Generating random kcal intakes. (Generating normally distributed random numbers; descriptive statistics.)
 The code v <- rnorm( 10 ) would sample 10 numbers from the normal distribution and store them as a vector in v.
 Print v. Then repeat v <- rnorm( 10 ) and print v again. Has v changed?
 Next, read the manual of rnorm and find how to generate random numbers with a given mean and standard deviation (sd).
 Now, in v simulate kcal intake by generating 15 random numbers with mean=2000 and sd=300.
 Print v and find by eye the smallest and the largest of these numbers.
 Try to use the functions min and max on v. Did they return the same numbers you found by eye?
 Calculate the mean, median and the standard deviation (sd) of v.
v <- rnorm( 10 ) # a vector of random numbers
v
 [1]  0.6950653 -1.5075282  0.6950654  0.5490401  2.0878614  0.2727368  0.7177340 -0.9031716  1.0851185 -0.5855618
v <- rnorm( 10 ) # another vector of random numbers
v
 [1] -0.0656511 -1.5421484 -0.9155879 -0.4190172 -0.5047704  0.1704609 -0.8082240  0.6948146  0.4087415 -0.7390048
v <- rnorm( n = 15, mean = 2000, sd = 300 )
v
 [1] 1845.645 1979.962 1762.429 2439.641 2334.950 2080.101 1694.140 1964.535 2082.280 2229.064 2141.860 1930.351 2007.813 1694.550 2173.657
min( v )
[1] 1694.14
max( v )
[1] 2439.641
mean( v )    # is it close to 2000? try several random v vectors and see the effect of growing n
[1] 2024.065
median( v )
[1] 2007.813
sd( v )      # is it close to 300? try several random v vectors and see the effect of growing n
[1] 221.426
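Each call to rnorm gives different numbers, so your output will differ from the output shown above. If reproducible numbers are needed, one option (not part of the exercise) is the base R function set.seed. A minimal sketch, where the seed value 123 and the names v1 and v2 are arbitrary choices for illustration:

```r
# Fixing the seed makes the random number sequence repeatable
set.seed( 123 )      # 123 is an arbitrary, illustrative seed
v1 <- rnorm( n = 15, mean = 2000, sd = 300 )

set.seed( 123 )      # same seed again
v2 <- rnorm( n = 15, mean = 2000, sd = 300 )

identical( v1, v2 )  # the two vectors contain the same numbers

# With a much larger n the estimates tend to be closer to 2000 and 300
big <- rnorm( n = 10000, mean = 2000, sd = 300 )
mean( big )
sd( big )
```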
- Selecting and counting some kcal intakes. (Selecting elements by a condition; logical vectors.)
 In v simulate kcal intake by generating 15 random numbers with mean=2000 and sd=300.
 Type v < 2000 and understand the result.
 How to interpret the number produced by sum( v < 2000 )?
 How to interpret the number produced by sum( !( v < 2000 ) )?
v <- rnorm( n = 15, mean = 2000, sd = 300 )
v
 [1] 1737.473 1714.682 2029.537 2284.039 1899.772 1750.727 2132.973 2326.592 1748.889 2461.049 1813.361 2273.268 1857.315 2112.167 1858.562
v < 2000             # TRUE corresponds to elements of vector v SMALLER THAN 2000
 [1]  TRUE  TRUE FALSE FALSE  TRUE  TRUE FALSE FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE
v[ v < 2000 ]        # selected elements of v smaller than 2000
[1] 1737.473 1714.682 1899.772 1750.727 1748.889 1813.361 1857.315 1858.562
sum( v < 2000 )      # number of elements in vector v smaller than 2000
[1] 8
sum( !( v < 2000 ) ) # number of elements in vector v GREATER THAN OR EQUAL TO 2000
[1] 7
sum( v >= 2000 )     # same as above
[1] 7
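Since TRUE counts as 1, taking the mean of a logical vector gives the proportion of elements that satisfy the condition. The sketch below is an addition to the exercise and, to stay reproducible, reuses the values printed in the solution above rather than drawing new random numbers:

```r
# Values copied from the printed solution above
v <- c( 1737.473, 1714.682, 2029.537, 2284.039, 1899.772,
        1750.727, 2132.973, 2326.592, 1748.889, 2461.049,
        1813.361, 2273.268, 1857.315, 2112.167, 1858.562 )

# Proportion of elements smaller than 2000
mean( v < 2000 )

# The two counts are complementary: together they cover the whole vector
sum( v < 2000 ) + sum( v >= 2000 ) == length( v )
```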
- Head and tail.
 Often there is a need to quickly look at the beginning (head) or at the end (tail) of a vector.
 Try these functions to show the first 5 and the last 7 elements of a randomly generated vector v <- rnorm( 20 ).
v <- rnorm( 20 )
v
 [1]  1.1608627 -0.7247818  0.8077146  0.9885814 -0.7746906 -0.3789386  0.4849407 -1.4240067  0.4541046 -0.8707405  0.8812410  0.1355738
[13]  1.3743604 -0.6058297  2.4109237 -0.2175324  0.4867582 -1.1017018  0.7452004  0.9761972
head( v, 5 )
[1]  1.1608627 -0.7247818  0.8077146  0.9885814 -0.7746906
tail( v, 7 )
[1] -0.6058297  2.4109237 -0.2175324  0.4867582 -1.1017018  0.7452004  0.9761972
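Two further behaviours of head and tail may be handy, although the exercise does not ask for them: without a second argument they return 6 elements, and a negative second argument means "all but". A short sketch, using the integers 1 to 20 instead of random numbers so the result is easy to check by eye:

```r
# Illustrative vector; the integers 1..20 stand in for the random numbers
v <- 1:20

head( v )        # default: the first 6 elements
tail( v )        # default: the last 6 elements

head( v, -15 )   # all but the last 15 elements, i.e. 1 to 5
tail( v, -15 )   # all but the first 15 elements, i.e. 16 to 20
```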
- Elements of a vector.
 Let’s assume that eight persons had caloric intakes of 2122, 2616, NA, 1771, 2314, 2178, 1922, 2004 kcal.
 Make a vector intakesKCal of these eight values (in the given order).
 Use the square brackets to get the 4th element of intakesKCal.
 Use the square brackets and the colon operator (:) to get the elements from the second to the fifth (inclusive).
 Define another vector poses with values 1, 3, 5, 7. Use it to get the 1st, 3rd, 5th and 7th element of intakesKCal.
 Finally, get the 1st, 3rd, 5th and 7th element of intakesKCal by typing the numbers directly inside [...] (without using an extra poses variable).
intakesKCal <- c( 2122, 2616, NA, 1771, 2314, 2178, 1922, 2004 )
intakesKCal
[1] 2122 2616   NA 1771 2314 2178 1922 2004
intakesKCal[ 4 ]
[1] 1771
intakesKCal[ 2:5 ]
[1] 2616   NA 1771 2314
poses <- c(1,3,5,7)
intakesKCal[ poses ]
[1] 2122   NA 2314 1922
intakesKCal[ c(1,3,5,7) ]
[1] 2122   NA 2314 1922
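The positions 1, 3, 5, 7 can also be generated rather than typed, and negative positions drop elements instead of selecting them. Both are additions to the exercise, shown here as a brief sketch:

```r
intakesKCal <- c( 2122, 2616, NA, 1771, 2314, 2178, 1922, 2004 )

# seq() builds the positions 1, 3, 5, 7
seq( 1, 7, by = 2 )

# Same selection as intakesKCal[ c(1,3,5,7) ]
intakesKCal[ seq( 1, 7, by = 2 ) ]

# A negative index removes that position: everything except the 3rd element
intakesKCal[ -3 ]
```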