In R, a matrix is a 2 dimensional data structure that contains atomic elements of the same type. Just like a vector, a matrix can hold numeric values, logical values, characters or other atomic data types. A matrix consists of 2 or more rows and 2 or more columns. When talking about matrices, the letter m refers to number of rows and n refer to the number of columns.

Matrix Construction

To construct a matrix in R, you can use the matrix() function which takes several arguments:

In [1]:

X <- matrix(data = c(1,2,3,4,5,6),  # a vector used to construct the matrix
            nrow = 2,               # number of rows in the matrix
            ncol = 3,               # number of columns in the matrix
            byrow = FALSE)          # fill the matrix by rows or columns?

print(X)

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

Matrices are typically assigned to capital lettered variables to distinguish them from vectors.

You can add to a matrix you've already constructed with the rbind() and cbind() functions. rbind takes a sequence of vectors, matrices or data frames (we'll cover those later) as arguments and combines them by rows, while cbind() combines sequences of data objects by columns:

In [2]:

X <- rbind(X, c(7,8,9) )  # Adds the vector c(7,8,9) as a new row
print(X)

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6
[3,]    7    8    9

In [3]:

Y <- matrix( seq(10,15,1), 3, 2)  # Create a new 3x2 matrix
print (Y)

X <- cbind(X,  Y)  # Add the new matrix to X by column
print(X)

     [,1] [,2]
[1,]   10   13
[2,]   11   14
[3,]   12   15
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    3    5   10   13
[2,]    2    4    6   11   14
[3,]    7    8    9   12   15

*Note: the arguments to rbind() should have the same number of columns and the arguments to cbind() should have the same number of rows.

Since cbind() and rbind() work on vectors, you can use them to construct matrices from vectors on a row by row, or column by column basis:

In [4]:

Z <- rbind(c(1,2,3), c(4,5,6), c(7,8,9))
print(Z)

     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9

You can turn a matrix's rows into columns and columns into rows using the transpose function t():

In [5]:

X <- t(X)
print (X)

     [,1] [,2] [,3]
[1,]    1    2    7
[2,]    3    4    8
[3,]    5    6    9
[4,]   10   11   12
[5,]   13   14   15

Transpose essentially flips a matrix along the main diagonal.

You can also convert a matrix into a vector using the c() function:

In [6]:

c(X)  # Convert a matrix into a vector by column

# If you want to convert a matrix to vector by row, take the transpose first:

c(t(X))

Out[6]:

Out[6]:

Matrix Indexing

Similar to vectors, you can access the elements inside a matrix with indexing. You may have noticed that when we printed matrices above, the rows and columns were labeled with values in square brackets. Those labels are the index values of the rows and columns. Since matrices have two dimensions, they have two indexes, a row index and a column index, which are separated by a comma. You can use indices to grab specific values, rows or columns in a matrix:

In [7]:

X[3,2]   # Get the value at row 3 column 2

# Leave the row or column index blank to take the entire row or column:

X[3, ]   # Get row 3

X[ ,2]   # Get column 2

Out[7]:

Out[7]:

Out[7]:

You can also take slices of rows and columns, just like you can with vectors:

In [8]:

print( X[4:5, 2:3] )  # Get data points in rows 4 and 5 and columns 2 and 3

     [,1] [,2]
[1,]   11   12
[2,]   14   15

All the vector indexing operations discussed last time work for matrices:

In [9]:

print( X[c(1,3,5), c(1,3)] ) # Vector indexing

     [,1] [,2]
[1,]    1    7
[2,]    5    9
[3,]   13   15

In [10]:

print( X[ -2, -2] ) # Remove row 2 and remove column 2

     [,1] [,2]
[1,]    1    7
[2,]    5    9
[3,]   10   12
[4,]   13   15

In [11]:

X_logical <- (X %% 2 == 0)   # Create a logical matrix identifying even numbers
print(X_logical)

      [,1]  [,2]  [,3]
[1,] FALSE  TRUE FALSE
[2,] FALSE  TRUE  TRUE
[3,] FALSE  TRUE FALSE
[4,]  TRUE FALSE  TRUE
[5,] FALSE  TRUE FALSE

In [12]:

X[X_logical]      # Use the logical matrix as an index to get even values in X

Out[12]:

In [13]:

X[X %in% c(2,6,8,9,10,15,100)]   # Get matrix values contained in a vector

Out[13]:

A matrix can also have named dimensions. You can assign dimension names when creating a matrix by passing a list of two vectors the dimnames argument:

In [14]:

Z <- matrix(c(1,2,3,4), 2, 2, dimnames = list( c("r1","r2"), c("c1","c2")) )

print(Z)

   c1 c2
r1  1  3
r2  2  4

You can also create or reassign dimnames after creating a matrix:

In [15]:

dimnames(Z) <- list( c("first_r","second_r"), c("first_c","second_c")) 

print(Z)

         first_c second_c
first_r        1        3
second_r       2        4

When dimensions are named, you can index the matrix using the dimension names or the normal numeric indexes:

In [16]:

Z[2,2]                    # Get the value at 2,2 with numeric indexing
Z["second_r","second_c"]  # Get the value at 2,2 with named dimensions

 Out[16]:

Out[16]:

Matrix Operations

Matrices in R offer many of the same conveniences as vectors. For instance, you can perform element-wise math operations on matrices of same dimensions by using the standard math symbols:

In [17]:

X <- Y <- matrix(c(1,2,-1,1,1,2,1,2,3),3,3)  #Make two new identical matrices
print(X)
print(Y)

# You can use the dim() function to check matrix dimensions:
dim(X)
dim(Y)

     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    2    1    2
[3,]   -1    2    3
     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    2    1    2
[3,]   -1    2    3

Out[17]:

Out[17]:

In [18]:

print(X + Y) # Element-wise addition

     [,1] [,2] [,3]
[1,]    2    2    2
[2,]    4    2    4
[3,]   -2    4    6

In [19]:

print(X * Y) # Element-wise multiplication

     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    4    1    4
[3,]    1    4    9

In [20]:

print(X / Y) # Element-wise division

     [,1] [,2] [,3]
[1,]    1    1    1
[2,]    1    1    1
[3,]    1    1    1

For true matrix multiplication, use the %*% operator:

In [21]:

print( X %*% Y )

     [,1] [,2] [,3]
[1,]    2    4    6
[2,]    2    7   10
[3,]    0    7   12

Here are a few other useful matrix operations:

In [22]:

diag(X)  # Get elements on the main diagonal of a matrix

Out[22]:

In [23]:

solve(X)  # Get the inverse of a square matrix

Out[23]:

0.25	0.25	-0.25
2	-1	0
-1.25	0.75	0.25

In [24]:

eigen(X) # Get the eigenvectors and eigenvalues of a matrix

Out[24]:

$values

4.32340427608648
1.3579263675185
-0.681330643604978

$vectors

0.3914710	0.5815460	-0.1960489
0.6358665	0.6698230	0.8383244
0.6651498	-0.4616724	-0.5087013

In [25]:

rowSums(X)    # Get the sums of the rows

colSums(X)    # Get the sums of the columns

rowMeans(X)   # Get the means of the rows

colMeans(X)   #Get the means of the columns

Out[25]:

Out[25]:

Out[25]:

1
1.66666666666667
1.33333333333333

Out[25]:

0.666666666666667
1.33333333333333
2

In [26]:

sum(X)  # Sum all the values in X

Out[26]:

In [27]:

min(X)  # Get the min value in X

Out[27]:

-1

In [28]:

max(X)  # Get the max value in X

Out[28]:

In [29]:

mean(X) # Get the mean of all the values in X

Out[29]:

1.33333333333333

*Note: sum, min max and mean also work on vectors

Any time you want to perform a matrix operation, there's a good chance R has a function that does just what you need built in or available in a package. Google is your friend.

R also contains an array data structure that stores elements of the same atomic type in an arbitrary number of dimensions. An array is just a vector stored with an extra attribute "dim" that specifies its dimensions:

In [30]:

A <- array(1:64, dim = c(4,4,4))    #create a 4 x 4 x 4 array

dim(A)    #Check the dimensions of the array
print(A)

Out[30]:

, , 1

     [,1] [,2] [,3] [,4]
[1,]    1    5    9   13
[2,]    2    6   10   14
[3,]    3    7   11   15
[4,]    4    8   12   16

, , 2

     [,1] [,2] [,3] [,4]
[1,]   17   21   25   29
[2,]   18   22   26   30
[3,]   19   23   27   31
[4,]   20   24   28   32

, , 3

     [,1] [,2] [,3] [,4]
[1,]   33   37   41   45
[2,]   34   38   42   46
[3,]   35   39   43   47
[4,]   36   40   44   48

, , 4

     [,1] [,2] [,3] [,4]
[1,]   49   53   57   61
[2,]   50   54   58   62
[3,]   51   55   59   63
[4,]   52   56   60   64

Arrays generally support the same types of indexing and vector operations as vectors and matrices, just with different numbers of dimensions. Arrays are uncommon so we won't study them any further.

Up till now, all the data structures we've covered hold values of the same atomic type. Real world data comes in all shapes and forms so the data structures we know aren't sufficient to tackle most real data sets. Next time we'll learn about our first heterogeneous data structure: lists.

Life Is Study

Thursday, July 23, 2015

Introduction to R Part 6: Matrices

Matrix Construction

Matrix Indexing

Matrix Operations

Next time: Introduction to R Part 7: Lists

No comments:

Post a Comment