E Appendix: Short Intro to Matrix Algebra

In this appendix, we’ll cover the basics of working with matrices in R, which is useful for a number of applications in statistics.

E.0.1 Using the matrix() function to store 2D data in R


Imagine the following data, which has counts for four different variables (in rows) under two different conditions (in columns). As a reminder, “columns hang down”.

count A B
\(x_1\) 25 10
\(x_2\) 12 18
\(x_3\) 16 4
\(x_4\) 9 21

How would we store this? We’ll use the matrix() function as follows:

a <- matrix(c(25, 12, 16, 9, 10, 18, 4, 21), 4, 2)
a
##      [,1] [,2]
## [1,]   25   10
## [2,]   12   18
## [3,]   16    4
## [4,]    9   21
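By default matrix() fills column by column, which is why the four values for condition A are listed first above. As a minimal sketch of an alternative, the same counts can be typed row by row with byrow = TRUE and labelled with dimnames. The object name a_named and the labels "x1" to "x4" are chosen here for illustration only.

```r
# Same counts as above, entered row by row and labelled.
a_named <- matrix(
  c(25, 10,
    12, 18,
    16,  4,
     9, 21),
  nrow = 4, ncol = 2, byrow = TRUE,
  dimnames = list(c("x1", "x2", "x3", "x4"), c("A", "B"))
)
a_named
dim(a_named)          # number of rows, then number of columns
a_named["x3", "B"]    # look up a single cell by its row and column name
```

Typing the data in the same layout as the printed table makes entry errors easier to spot.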
  • How could I find the row and column totals easily? Use the apply() function:
apply(a, 1, sum)  # the value 1 means to apply across rows
## [1] 35 30 20 30
apply(a, 2, sum)  # the value 2 means to apply down the columns
## [1] 62 53
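For plain sums, base R also provides rowSums() and colSums(), which should give the same totals as the apply() calls above. A short sketch that rebuilds a so it can be run on its own:

```r
a <- matrix(c(25, 12, 16, 9, 10, 18, 4, 21), 4, 2)

rowSums(a)   # same values as apply(a, 1, sum)
colSums(a)   # same values as apply(a, 2, sum)
sum(a)       # grand total of all eight counts
```

apply() remains the more general tool, since it accepts any function, not only sum.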
  • If I assumed independence between the rows and columns of a, how could I calculate expected values?

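First, create matrices that contain the row and column totals. The row totals go in a 4 x 1 matrix and the column totals in a 1 x 2 matrix:

rt <- matrix(apply(a, 1, sum), 4, 1)  # row totals (margin 1), 4 x 1
ct <- matrix(apply(a, 2, sum), 1, 2)  # column totals (margin 2), 1 x 2

And then use matrix multiplication. A 4 x 1 matrix times a 1 x 2 matrix gives a 4 x 2 matrix whose entry in row i and column j is row total i times column total j. Dividing by the grand total gives the expected counts:

round(rt%*%ct/sum(a), 1)
##      [,1] [,2]
## [1,] 18.9 16.1
## [2,] 16.2 13.8
## [3,] 10.8  9.2
## [4,] 16.2 13.8
e <- round(rt%*%ct/sum(a), 1)

Finally, compare the observed and expected counts by summing (observed - expected)^2 / expected over all cells:

sum((a-e)^2/e)
## [1] 19.04647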
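Under independence, each expected count is (row total x column total) / grand total, and the statistic above has the form of Pearson's chi-squared statistic. The sketch below is one way to check the calculation, assuming the same matrix a: it uses outer() in place of explicit matrix multiplication, keeps the expected counts unrounded, and compares the result with R's built-in chisq.test(). The names expected, stat and fit are introduced here for illustration.

```r
a <- matrix(c(25, 12, 16, 9, 10, 18, 4, 21), 4, 2)

# outer() forms every (row total x column total) product, giving a 4 x 2 matrix.
expected <- outer(rowSums(a), colSums(a)) / sum(a)
round(expected, 1)

# Test statistic computed from the unrounded expected counts.
stat <- sum((a - expected)^2 / expected)

# Cross-check against the built-in test; both comparisons should be TRUE.
fit <- chisq.test(a)
all.equal(unname(fit$expected), expected, check.attributes = FALSE)
all.equal(unname(fit$statistic), stat)
```

Because the expected counts are not rounded here, stat will differ slightly from a value computed with rounded expected counts.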

E.0.2 Guided Practice

  1. Create a 3x3 matrix that contains the numbers 1 to 9 in random order (hint: use the sample() function). Because the order is random, your output will differ from the one shown.
d <- matrix(sample(c(1:9),9), 3, 3)
d
##      [,1] [,2] [,3]
## [1,]    9    6    4
## [2,]    3    8    7
## [3,]    2    5    1
  2. Create a 3x3 matrix that contains the number 3 throughout the first row, 5 throughout the second row and 7 throughout the third row.
f <- matrix(c(3,5,7,3,5,7,3,5,7), 3, 3)
f
##      [,1] [,2] [,3]
## [1,]    3    3    3
## [2,]    5    5    5
## [3,]    7    7    7
  3. Assuming the matrix from (1) is our observed data and the matrix from (2) is our expected data, calculate the test statistic using one line of code.
sum((d-f)^2/f)
## [1] 28.01905
  4. Unrelated to the above, if our expected proportions for four categories were 10%, 20%, 30% and 40%, and we had 68 observations in total, calculate the expected counts using one line of code.
b <- c(0.1, 0.2, 0.3, 0.4)
b*68
## [1]  6.8 13.6 20.4 27.2
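The practice answers above can be written a little more compactly and made repeatable. This is only a sketch: the seed value 1 is arbitrary, the name p is introduced here for the proportions, and the matrix d it produces will generally not match the one printed above.

```r
set.seed(1)                              # any fixed seed makes the shuffle repeatable
d <- matrix(sample(1:9), 3, 3)           # sample(1:9) returns 1..9 in random order
f <- matrix(c(3, 5, 7), 3, 3)            # c(3, 5, 7) is recycled down each column
sum((d - f)^2 / f)                       # value depends on the seed

p <- c(0.1, 0.2, 0.3, 0.4)
stopifnot(isTRUE(all.equal(sum(p), 1)))  # proportions should sum to 1
68 * p                                   # expected counts, which sum to 68
```

Recycling works here because matrix() fills column by column, so each column receives 3, 5, 7 and every row ends up constant.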