Introduction: Matrices – Serlo

In this article, we introduce matrices as an efficient representation of linear maps. A matrix (of a linear map $f\colon K^n \to K^m$) is a rectangular arrangement of elements from $K$ ("numbers") that specifies where the standard basis of $K^n$ is mapped by $f$.

Derivation

Let $K$ be a field and $f\colon K^n \to K^m$ a linear map. We want to describe this map in the most efficient way. We know from the article "vector space of a linear map" that the space of linear maps from $K^n$ to $K^m$ has dimension $m \cdot n$, and $f$ is an element of this space. So we need $m \cdot n$ numbers to describe our linear map. We are looking for a way to write down these numbers in an organized way.

Let $e_1, \dots, e_n$ be the standard basis of $K^n$. Then, following the principle of linear continuation, $f$ is already completely determined by the vectors $f(e_1), \dots, f(e_n)$: If $v \in K^n$ is an arbitrary vector, we can write it as a linear combination $v = \sum_{j=1}^n \lambda_j e_j$ of the basis elements, and because of linearity we know the value $f(v) = \sum_{j=1}^n \lambda_j f(e_j)$.

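So we need the "data" $f(e_1), \dots, f(e_n)$ to describe the linear map. These data are vectors in $K^m$. So we can write them as

$$f(e_1) = \begin{pmatrix} a_{11} \\ \vdots \\ a_{m1} \end{pmatrix}, \quad \dots, \quad f(e_n) = \begin{pmatrix} a_{1n} \\ \vdots \\ a_{mn} \end{pmatrix}$$

for certain "numbers" $a_{ij} \in K$. This is a notation for tracking all necessary data of the linear map. But we can still make it more efficient: We just omit the "$f(e_j) =$" and agree on the convention that the $j$-th column describes the image of the $j$-th basis vector:

$$\begin{pmatrix} a_{11} \\ \vdots \\ a_{m1} \end{pmatrix}, \quad \dots, \quad \begin{pmatrix} a_{1n} \\ \vdots \\ a_{mn} \end{pmatrix}$$

To save even more space, we can also combine the entries of these vectors into a single "table", still with the image of the $j$-th basis vector being in the $j$-th column:

$$\begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}$$

We call this "table in parentheses" a matrix. It is the matrix associated with the linear map $f$.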

The matrix completely determines $f$ and it consists of $m \cdot n$ numbers as entries, which is consistent with our considerations above.
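The construction above (evaluate the map on each standard basis vector and use the images as columns) can be sketched in Python. This is only an illustration under stated assumptions: vectors are plain lists, the helper names `standard_basis` and `matrix_of` are introduced here, and the map `g` is a hypothetical example that does not appear in the article.

```python
def standard_basis(n):
    """Return the standard basis e_1, ..., e_n of K^n as lists."""
    return [[1 if i == j else 0 for i in range(n)] for j in range(n)]


def matrix_of(f, n):
    """Return the matrix of a linear map f: K^n -> K^m as a list of m rows.

    The j-th column of the result is the image f(e_j).
    """
    columns = [f(e) for e in standard_basis(n)]
    m = len(columns[0])
    # Entry (i, j) is the i-th component of f(e_j).
    return [[columns[j][i] for j in range(n)] for i in range(m)]


# Hypothetical linear map K^3 -> K^2, chosen only for illustration.
def g(v):
    x, y, z = v
    return [x + 2 * y, 3 * z - y]


print(matrix_of(g, 3))  # [[1, 2, 0], [0, -1, 3]]
```

The result has 2 rows and 3 columns, so it stores $2 \cdot 3 = 6$ numbers, in line with the dimension count above.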


Definition (Matrix)

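Let $K$ be a field and $m, n \in \mathbb{N}$. Let $a_{ij} \in K$ for all $1 \le i \le m$ and $1 \le j \le n$. Then we call

$$A = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}$$

an $(m \times n)$-matrix. We denote the set of all $(m \times n)$-matrices by $K^{m \times n}$.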

Example (Linear map from to )

We consider the linear map

Checking that $f$ is indeed linear is left as an exercise.

In the derivation we have seen that we can describe $f$ by a matrix. We want to compute this matrix explicitly here. To do so, we need to determine the images of the standard basis vectors

For these,

Thus the three vectors

contain all the information of the linear map $f$. If we write these side by side in a table, we get the matrix

which represents $f$.

Example (Embedding )

Let us now consider the standard embedding of into , that is, the linear map

For the vectors of the standard basis, we have

So the embedding is represented by the matrix

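Example (Reflection of $\mathbb{R}^2$ along an axis)

Finally, let us examine the reflection of $\mathbb{R}^2$ along the x-axis. When we mirror a vector along the x-axis, we keep its x-component fixed and change the sign of its y-component. The reflection is thus given by

$$f\colon \mathbb{R}^2 \to \mathbb{R}^2, \quad \begin{pmatrix} x \\ y \end{pmatrix} \mapsto \begin{pmatrix} x \\ -y \end{pmatrix}.$$

The first basis vector lies on the x-axis and is therefore not affected by the reflection. Formally: $f(e_1) = \begin{pmatrix} 1 \\ 0 \end{pmatrix} = e_1$.
The second basis vector is perpendicular to the x-axis and is therefore mapped to its negative. Formally: $f(e_2) = \begin{pmatrix} 0 \\ -1 \end{pmatrix} = -e_2$.

As the matrix associated with this reflection, we thus obtain:

$$\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$$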
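As a quick check, the reflection along the x-axis can be evaluated on the two standard basis vectors in Python, with the images placed side by side as columns. This is a minimal sketch; the names `reflect_x`, `columns` and `matrix` are introduced here for illustration only.

```python
def reflect_x(v):
    """Reflect a vector of R^2 along the x-axis: keep x, flip the sign of y."""
    x, y = v
    return [x, -y]


e1, e2 = [1, 0], [0, 1]

# The images of the basis vectors become the columns of the matrix.
columns = [reflect_x(e1), reflect_x(e2)]
matrix = [[col[i] for col in columns] for i in range(2)]

print(matrix)  # [[1, 0], [0, -1]]
```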

Matrix-Vector Multiplication


We have just seen how we can represent a linear map by a matrix. Suppose we are now not given a linear map, but only its associated matrix. What does the image of an arbitrary vector under this linear map look like?

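First, for simplicity, let us consider the vector space $K^2$ and let $f\colon K^2 \to K^2$ be a linear map whose associated matrix we know to be

$$A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}.$$

That means we have

$$f(e_1) = \begin{pmatrix} a_{11} \\ a_{21} \end{pmatrix}, \quad f(e_2) = \begin{pmatrix} a_{12} \\ a_{22} \end{pmatrix}.$$

We want to calculate the image of an arbitrary vector $x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \in K^2$ under the map $f$, using the entries of the matrix $A$.

To do so, we represent our vector as a linear combination of the standard basis vectors, i.e.

$$x = x_1 e_1 + x_2 e_2.$$

Now we can exploit the linearity of $f$ and calculate:

$$f(x) = f(x_1 e_1 + x_2 e_2) = x_1 f(e_1) + x_2 f(e_2) = x_1 \begin{pmatrix} a_{11} \\ a_{21} \end{pmatrix} + x_2 \begin{pmatrix} a_{12} \\ a_{22} \end{pmatrix} = \begin{pmatrix} a_{11} x_1 + a_{12} x_2 \\ a_{21} x_1 + a_{22} x_2 \end{pmatrix}.$$

By this calculation, we can describe the effect of applying a linear map to a vector using only the matrix $A$. This calculation works for any vector $x \in K^2$ and any $(2 \times 2)$-matrix. To simplify the notation, let us define a "multiplication operation" for matrices and vectors:

$$\begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \cdot \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} := \begin{pmatrix} a_{11} x_1 + a_{12} x_2 \\ a_{21} x_1 + a_{22} x_2 \end{pmatrix}$$

We call this the "matrix-vector multiplication" and formally write it as a product. The generalization from a $(2 \times 2)$-matrix to an $(m \times n)$-matrix is given in the following exercise: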


Exercise

Let $f\colon K^n \to K^m$ be a linear map and $A = (a_{ij}) \in K^{m \times n}$ the associated matrix. Find a formula to calculate the value $f(x)$ for a given vector $x \in K^n$ by using the entries of the matrix $A$.


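Solution

We write $x$ as a linear combination of the standard basis vectors: let $x_1, \dots, x_n \in K$ be the "coordinates", such that $x = \sum_{j=1}^n x_j e_j$ holds. That $A$ is the matrix associated with $f$ means that $f(e_j) = \begin{pmatrix} a_{1j} \\ \vdots \\ a_{mj} \end{pmatrix}$ is satisfied for all $1 \le j \le n$. Thus, it follows for $f(x)$ that

$$f(x) = f\left(\sum_{j=1}^n x_j e_j\right) = \sum_{j=1}^n x_j f(e_j) = x_1 \begin{pmatrix} a_{11} \\ \vdots \\ a_{m1} \end{pmatrix} + \dots + x_n \begin{pmatrix} a_{1n} \\ \vdots \\ a_{mn} \end{pmatrix} = \begin{pmatrix} a_{11} x_1 + \dots + a_{1n} x_n \\ \vdots \\ a_{m1} x_1 + \dots + a_{mn} x_n \end{pmatrix}.$$

Using the sum notation, we can write the result as

$$f(x) = \begin{pmatrix} \sum_{j=1}^n a_{1j} x_j \\ \vdots \\ \sum_{j=1}^n a_{mj} x_j \end{pmatrix}.$$

The solution of this exercise provides us with a formula to calculate the image of a vector under a linear map, using the associated matrix. We now define the matrix-vector product using the formula found in the solution.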


Definition (Matrix-Vector Multiplication)

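Let $K$ be a field, $A = (a_{ij}) \in K^{m \times n}$ and $x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \in K^n$. Then we define

$$A \cdot x := \begin{pmatrix} \sum_{j=1}^n a_{1j} x_j \\ \vdots \\ \sum_{j=1}^n a_{mj} x_j \end{pmatrix} \in K^m.$$

From another point of view this means: If we consider the matrix $A$ as a collection of column vectors $c_1, \dots, c_n \in K^m$,

$$A = \begin{pmatrix} c_1 & \cdots & c_n \end{pmatrix},$$

then the product $A \cdot x$ is a linear combination of the columns of $A$ with the coefficients in $x$, namely $A \cdot x = x_1 c_1 + \dots + x_n c_n$.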
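The two readings of the product (the entrywise sum formula and the linear combination of columns) can be compared in a short Python sketch. It assumes matrices are stored as lists of rows; the function names and the sample matrix are chosen here for illustration and are not part of the article.

```python
def mat_vec(A, x):
    """Entrywise formula: the i-th entry of A*x is sum_j a_ij * x_j."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


def column_combination(A, x):
    """Column view: x_1*c_1 + ... + x_n*c_n, where c_j is the j-th column of A."""
    m, n = len(A), len(x)
    result = [0] * m
    for j in range(n):
        for i in range(m):
            # Add x_j times the i-th entry of column c_j.
            result[i] += x[j] * A[i][j]
    return result


A = [[1, 2, 0],
     [0, -1, 3]]
x = [2, 1, 4]

print(mat_vec(A, x))             # [4, 11]
print(column_combination(A, x))  # [4, 11]
```

Both functions perform the same multiplications and only group the additions differently, which is why the results agree.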

How can you best remember how applying a matrix to a vector works?

To apply a matrix to a vector, you need to compute "row times column".

You may perform a matrix-vector multiplication by using the rule "row times column": The first entry of the result is the first row of the matrix times the column vector. The second entry is the second row of the matrix times the column vector, etc. for larger matrices. For each "row times column" product, you multiply the related entries (first times first, second times second, etc.) and add the results.

It is important that the type of the matrix and the type of the vector match. If you have set up everything correctly so far, this should always be the case, because a linear map $f\colon K^n \to K^m$ has an associated $(m \times n)$-matrix. You can apply this matrix to vectors of $K^n$, since the rows of the matrix and the column vector both have length $n$.
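The "row times column" rule, together with the requirement that the sizes match, might be written in Python as follows. This is a sketch under the assumption that a matrix is a list of rows; `apply_matrix` is a name introduced here, and raising `ValueError` on a size mismatch is one possible design choice among several.

```python
def apply_matrix(A, x):
    """Apply an (m x n) matrix A (list of rows) to a vector x of length n."""
    for row in A:
        if len(row) != len(x):
            raise ValueError(
                f"row of length {len(row)} cannot be multiplied "
                f"with a vector of length {len(x)}"
            )
    # "Row times column": multiply corresponding entries and add them up.
    return [sum(a * b for a, b in zip(row, x)) for row in A]


A = [[1, 2],
     [3, 4],
     [5, 6]]  # a (3 x 2) matrix, so it acts on vectors with 2 entries

print(apply_matrix(A, [1, 1]))  # [3, 7, 11]
```

Calling `apply_matrix(A, [1, 1, 1])` raises a `ValueError`, because the rows have length 2 while the vector has length 3.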

Reverse direction: The induced linear map

We have seen that every linear map has an associated matrix. Given a linear map $f\colon K^n \to K^m$, we constructed a matrix $A$ such that $f(x) = A \cdot x$ for all $x \in K^n$. That is, some matrices define a linear map. But do all matrices define a linear map? And if yes, what does the corresponding mapping look like?

If a matrix $A$ is derived from a linear map $f$, then we can get $f$ back from $A$ by defining it as the map $x \mapsto A \cdot x$. More generally, we can apply this rule to any matrix $A$ and obtain a corresponding linear map $f_A$.

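So let $A = (a_{ij})$ be an $(m \times n)$-matrix. We consider $f_A\colon K^n \to K^m, \; x \mapsto A \cdot x$. This map is indeed linear: for all $x, y \in K^n$ and $\lambda \in K$, the $i$-th entry of $A \cdot (\lambda x + y)$ is

$$\sum_{j=1}^n a_{ij} (\lambda x_j + y_j) = \lambda \sum_{j=1}^n a_{ij} x_j + \sum_{j=1}^n a_{ij} y_j,$$

which is the $i$-th entry of $\lambda (A \cdot x) + A \cdot y$. Hence $f_A(\lambda x + y) = \lambda f_A(x) + f_A(y)$.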

That means, every matrix defines a linear map.
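Linearity of the map defined by a matrix can also be spot-checked numerically. The following Python sketch does this for one sample matrix, two sample vectors and one scalar, all chosen here for illustration. It uses exact fractions to avoid rounding issues, and it is a sanity check on one instance, not a proof.

```python
from fractions import Fraction


def induced_map(A):
    """Return the map x -> A*x for a matrix A given as a list of rows."""
    return lambda x: [sum(a * b for a, b in zip(row, x)) for row in A]


A = [[2, 0, 1],
     [1, -3, 4]]
f_A = induced_map(A)

x, y = [1, 2, 3], [0, -1, 5]
lam = Fraction(1, 2)

# Left-hand side: apply the map to lam*x + y.
lhs = f_A([lam * xi + yi for xi, yi in zip(x, y)])
# Right-hand side: lam * f_A(x) + f_A(y).
rhs = [lam * u + v for u, v in zip(f_A(x), f_A(y))]

print(lhs == rhs)  # True
```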

Definition (Induced linear map)

Let $A \in K^{m \times n}$ be a matrix over the field $K$. Then the linear map

$$f_A\colon K^n \to K^m, \quad x \mapsto A \cdot x$$

is called the linear map induced by the matrix $A$.

Thus, we now know that for each linear map there is an associated matrix, and for each matrix there is an associated linear map. For a linear map $f$, we call the associated matrix $M(f)$. Our construction of the induced mapping is built exactly such that $f_{M(f)} = f$. This is quite intuitive: the linear map induced by the matrix associated with a linear map is just the map itself. We can now ask the "reverse question": If we consider the associated matrix of a linear map induced by some original matrix, is this the original matrix again? So in mathematical terms: Is $M(f_A) = A$? The following theorem answers this question in the affirmative:


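Theorem

The mappings $f \mapsto M(f)$ and $A \mapsto f_A$ between the set of linear maps $K^n \to K^m$ and the set $K^{m \times n}$ of matrices are bijections and each other's inverse. In particular, $M(f_A) = A$ for every $A \in K^{m \times n}$.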


Proof

To show that the two mappings are inverse to each other, it suffices to show that applying them one after the other (in either order) yields the identity. This directly implies that both mappings are bijective. So it suffices to show that $f_{M(f)} = f$ for every linear map $f$ and that $M(f_A) = A$ for every matrix $A$. We already know that the first equation holds. So it only remains to show the second. Let $A$ be any $(m \times n)$-matrix. Let $a_{ij}$ be the entry in the $i$-th row and $j$-th column of $A$ and let $b_{ij}$ be the corresponding entry of the matrix $M(f_A)$.

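By definition of $f_A$ we have

$$f_A(e_j) = A \cdot e_j = \begin{pmatrix} a_{1j} \\ \vdots \\ a_{mj} \end{pmatrix},$$

because the only nonzero entry of $e_j$ is the $j$-th one, which equals $1$. So the $i$-th entry of the vector $f_A(e_j)$ is equal to $a_{ij}$. That is, $(f_A(e_j))_i = a_{ij}$.

By definition of the matrix associated with $f_A$, the $j$-th column of $M(f_A)$ is equal to the image of $e_j$ under $f_A$. Thus,

$$\begin{pmatrix} b_{1j} \\ \vdots \\ b_{mj} \end{pmatrix} = f_A(e_j).$$

In particular, it follows for the $i$-th entry of $f_A(e_j)$ that

$$b_{ij} = (f_A(e_j))_i.$$

Overall, we get $b_{ij} = (f_A(e_j))_i = a_{ij}$. Since $i$ and $j$ were arbitrarily chosen, all entries of the two matrices are equal and indeed $M(f_A) = A$.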

We have thus shown that matrices and linear maps are in a "one-to-one correspondence".
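Both directions of this correspondence can be tried out on small examples. The Python sketch below rebuilds a sample matrix from its induced map, and then recovers the values of a sample linear map from its matrix. The helper names, the matrix `A` and the map `f` are assumptions made here for illustration, and the check covers only these instances.

```python
def induced_map(A):
    """Matrix -> linear map: return x -> A*x for A given as a list of rows."""
    return lambda x: [sum(a * b for a, b in zip(row, x)) for row in A]


def matrix_of(f, n):
    """Linear map -> matrix: the j-th column is the image of e_j under f."""
    basis = [[1 if i == j else 0 for i in range(n)] for j in range(n)]
    columns = [f(e) for e in basis]
    return [[col[i] for col in columns] for i in range(len(columns[0]))]


# Start with a matrix, pass to the induced map, and go back to a matrix.
A = [[1, 2, 0],
     [0, -1, 3]]
print(matrix_of(induced_map(A), 3) == A)  # True


# Start with a (hypothetical) linear map, pass to its matrix, and go back.
def f(v):
    return [v[0] + v[1], 2 * v[0]]


M = matrix_of(f, 2)
v = [5, 7]
print(induced_map(M)(v) == f(v))  # True
```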