Matrix Multiplication
Created | Updated Dec 9, 2002
This entry explains how to perform multiplication of matrices. A matrix is simply a rectangular array of numbers, arranged in rows and columns. Matrices are studied in linear algebra in connection with many different applications, such as solving systems of equations, performing linear transformations, and breaking down the behaviour of complex systems into simpler parts.
You can actually use this entry to learn how to do matrix multiplication. There is more than one way of learning this process, and in this entry, we do it by talking about linear combinations of columns, because that's an idea that has applications elsewhere in linear algebra. If you learned a different way to multiply matrices, don't panic; it's probably equivalent.
Matrix Addition and Scalar Multiplication
Before getting into matrix multiplication, which is weird, let's look at some fairly straightforward things you can do with matrices. Matrix addition is no problem; if two matrices are the same size, you can just add them directly:
[ 4 11  8] + [1 -9  2] = [5 2 10]
[-3  6  0]   [3 -4 -8]   [0 2 -8]
Starting with the first entries in the first rows, 4 from the first matrix is added to 1 from the second one to get 5 in the sum. Using the same process, you get the answer, entry by entry.
You can also multiply a matrix by an ordinary number without anything very surprising happening.
10 * [ 4 11  8] = [ 40 110  80]
     [-3  6  0]   [-30  60   0]
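Both operations are simple enough to sketch in a few lines of Python. This is only an illustration, assuming a matrix is stored as a list of rows; the function names add_matrices and scale_matrix are made up for this entry and do not come from any library.

```python
def add_matrices(a, b):
    """Add two matrices of the same size, entry by entry."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("matrices must be the same size")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale_matrix(k, a):
    """Multiply every entry of the matrix a by the ordinary number k."""
    return [[k * x for x in row] for row in a]


# The two 2x3 matrices from the addition example above.
A = [[4, 11, 8], [-3, 6, 0]]
B = [[1, -9, 2], [3, -4, -8]]

print(add_matrices(A, B))   # [[5, 2, 10], [0, 2, -8]]
print(scale_matrix(10, A))  # [[40, 110, 80], [-30, 60, 0]]
```

The size check in add_matrices reflects the rule stated above: addition only makes sense when the two matrices are the same size.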
Operations with Columns
Now, let's start looking at the parts of a matrix. First, a point of notation. The size of a matrix is specified according to how many rows and columns it has. Thus, a 2x3 matrix has 2 rows and 3 columns, for a total of 6 entries. The matrices in the above examples were 2x3 (NOT 3x2 - that would be 3 rows and 2 columns). Let's look at this one for a while:
[ 4 11  8]
[-3  6  0]
In this discussion of matrix multiplication, we will often talk about single columns of a matrix. For example, the second column of the above matrix is:
[11]
[ 6]
You can do basic arithmetic with the columns of a matrix, just like we did with matrices in the above section. You could multiply the above column by 2, for example:
2 * [11] = [22]
    [ 6]   [12]
Also, you can add one column to another one. Let's add the second and third columns of our matrix:
[11] + [8] = [19]
[ 6]   [0]   [ 6]
Linear Combinations
Putting these column operations together, you could take all the columns of your matrix, multiply them all by different numbers, and then add all the results together!
-1 * [ 4] + 2 * [11] + 3 * [8] = [-4] + [22] + [24] = [42]
     [-3]       [ 6]       [0]   [ 3]   [12]   [ 0]   [15]
The final result of this calculation is called a linear combination of the columns of the matrix. The numbers that the columns are multiplied by are called weights. Odd as it may seem, mathematicians find themselves making linear combinations of the columns of matrices all the time!
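For readers who like to check such things by machine, here is a minimal Python sketch of a linear combination of columns. It assumes a matrix is stored as a list of rows; the helper names column and linear_combination are invented for this illustration.

```python
def column(matrix, j):
    """Return column j (counting from 0) of a matrix stored as a list of rows."""
    return [row[j] for row in matrix]


def linear_combination(columns, weights):
    """Multiply each column by its weight, then add the results entry by entry."""
    if len(columns) != len(weights):
        raise ValueError("need exactly one weight per column")
    total = [0] * len(columns[0])
    for col, w in zip(columns, weights):
        total = [t + w * x for t, x in zip(total, col)]
    return total


A = [[4, 11, 8], [-3, 6, 0]]
cols = [column(A, j) for j in range(3)]  # [[4, -3], [11, 6], [8, 0]]

print(linear_combination(cols, [-1, 2, 3]))  # [42, 15]
```

The weights -1, 2 and 3 are the ones used in the worked calculation above, so the printed result matches it.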
Since this is such a common process, it's nice to have a compact way to write it down. All the information we really need to know is (A) the original matrix, and (B) the weights for each column. The conventional way to do this is to write the weights down in a column of their own, which we'll call a vector, and say that we're multiplying the original matrix by that vector. Note that this vector will have three entries - the number of columns of the original matrix. That's because we need three weights. So, here's our shorthand for the above linear combination:
[ 4 11  8] * [-1] = [42]
[-3  6  0]   [ 2]   [15]
             [ 3]
Systems of Equations
Rather than just asking you to accept that making linear combinations of the columns of a matrix is such a natural thing to do, we can say a little bit about why that should be the case. Suppose you're trying to solve some equations, with some unknown variables in them:
4x + 11y + 8z = 42
-3x + 6y = 15
A little bit of rearranging terms shows that this system of equations is just the example we've been playing with!
x * [ 4] + y * [11] + z * [8] = [42]
    [-3]       [ 6]       [0]   [15]
Therefore, we know that {x = -1, y = 2, z = 3} is a solution to the system! There may be other solutions, but that's beyond the scope of this entry. Let's get on with matrix multiplication.
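If you'd like to confirm this, a quick check is to substitute the values back into the two equations. The short Python sketch below does nothing more than that; the variable names first and second are just labels chosen for this illustration.

```python
# The claimed solution of the system.
x, y, z = -1, 2, 3

# Left-hand sides of the two equations from the text.
first = 4 * x + 11 * y + 8 * z
second = -3 * x + 6 * y

print(first, second)  # 42 15
```

Both left-hand sides agree with the right-hand sides of the original equations, which is all it takes for the values to count as a solution.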
A Matrix Times a Vector - Definition and Examples
To formalize what we said above: The product of a matrix and a vector is defined when the vector has as many entries as the matrix has columns. In this case, the product is a linear combination of the columns of the matrix, using the entries of the vector as weights.
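The definition translates fairly directly into code. The following Python sketch assumes a matrix is stored as a list of rows and a vector as a plain list; mat_vec is a name made up for this entry, not a library function.

```python
def mat_vec(matrix, vector):
    """Product of a matrix and a vector, built as a linear combination of columns."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    if len(vector) != n_cols:
        raise ValueError("vector needs one entry per column of the matrix")
    result = [0] * n_rows
    for j, weight in enumerate(vector):
        # Add weight times column j of the matrix to the running total.
        for i in range(n_rows):
            result[i] += weight * matrix[i][j]
    return result


A = [[4, 11, 8], [-3, 6, 0]]
print(mat_vec(A, [-1, 2, 3]))  # [42, 15]
```

Passing a vector with the wrong number of entries, such as mat_vec(A, [1, 2]), raises a ValueError, which mirrors the condition under which the product is defined.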
The reader might want to try to work out the following examples:
[1 4] * [-2] =
[2 5]   [ 1]
[3 6]

[3 4 -5  2] * [10] =
[1 1  9 -6]   [-2]
              [ 0]
              [ 2]
Solutions
[1 4] * [-2] = [2]
[2 5]   [ 1]   [1]
[3 6]          [0]

[3 4 -5  2] * [10] = [26]
[1 1  9 -6]   [-2]   [-4]
              [ 0]
              [ 2]
Matrix Multiplication
So, we're multiplying matrices by vectors, and that's great. What if you want to multiply the same matrix by a whole bunch of different vectors, though? It gets tiresome to keep writing it out over and over again. The solution is to line up all the vectors, side-by-side, and make them the columns of a new matrix. To keep things straight, let's call the original matrix 'A', and the new matrix will be 'B'.
Remember that each vector making up B has as many entries as A has columns. That means that the number of rows in B will equal the number of columns in A. Now we can take each column of B, use its entries as weights to combine the columns of A, and produce new columns, which we'll line up for our solution. Let's look at an example, where B is made of two different vectors:
[ 4 11  8] * [-1  1] = [42  1]
[-3  6  0]   [ 2 -1]   [15 -9]
             [ 3  1]
Each column of B gives you one column of the solution, by giving you the weights with which to combine the columns of A.
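The column-by-column recipe can be sketched in Python as follows. As before, this is an illustration rather than a definitive implementation: it assumes matrices are stored as lists of rows, and the names mat_vec and mat_mul are invented for this entry.

```python
def mat_vec(matrix, vector):
    """Linear combination of the columns of matrix, weighted by vector."""
    if len(vector) != len(matrix[0]):
        raise ValueError("vector needs one entry per column of the matrix")
    return [sum(w * x for w, x in zip(vector, row)) for row in matrix]


def mat_mul(a, b):
    """Product a * b: each column of b supplies weights for the columns of a."""
    if len(b) != len(a[0]):
        raise ValueError("b needs as many rows as a has columns")
    # One column of the product for each column of b.
    product_columns = [
        mat_vec(a, [row[j] for row in b]) for j in range(len(b[0]))
    ]
    # Line the columns up side by side, converting back to a list of rows.
    return [list(row) for row in zip(*product_columns)]


A = [[4, 11, 8], [-3, 6, 0]]
B = [[-1, 1], [2, -1], [3, 1]]

print(mat_mul(A, B))  # [[42, 1], [15, -9]]
```

Here mat_vec computes each entry as a row-times-vector sum, which gives the same numbers as combining the columns; this is one of the equivalent ways of multiplying mentioned at the start of the entry.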
to be continued.....