Monomorphisms – Serlo

Linear maps preserve linear combinations. We now learn about special linear maps that preserve linear independence. These are called monomorphisms.

Motivation

We have introduced linear maps as functions between vector spaces that preserve linear combinations. That is, the image of a linear combination is the corresponding linear combination of the images:

${\displaystyle f\left(\sum _{i=1}^{n}\lambda _{i}v_{i}\right)=\sum _{i=1}^{n}\lambda _{i}f(v_{i}).}$

Using linear combinations, we have defined the property of linear independence. Recall: For a vector space ${\displaystyle V}$ over a field ${\displaystyle K}$, a finite set of vectors ${\displaystyle \{v_{1},\ldots ,v_{n}\}\subset V}$ is linearly independent if and only if the only linear combination with coefficients ${\displaystyle \lambda _{1},\ldots ,\lambda _{n}\in K}$ that yields the zero vector (${\displaystyle \lambda _{1}v_{1}+\ldots +\lambda _{n}v_{n}=0\in V}$) is the trivial one, i.e., ${\displaystyle \lambda _{1}=\ldots =\lambda _{n}=0}$.

An alternative characterization is the uniqueness of the coefficients: if

${\displaystyle \sum _{i=1}^{n}\lambda _{i}v_{i}=\sum _{i=1}^{n}\mu _{i}v_{i},}$

then the coefficients ${\displaystyle \lambda _{i},\mu _{i}\in K}$ must agree, i.e., ${\displaystyle \lambda _{i}=\mu _{i}}$ for all ${\displaystyle i}$.

Is this property preserved by linear maps? Certainly, there are linear maps that do not preserve linear independence, e.g. the zero map: ${\displaystyle f(v)=0\;\forall v\in V}$. Any set of vectors containing the zero vector is linearly dependent, as there is a non-trivial linear combination leading to the zero vector, e.g. with ${\displaystyle \lambda =1}$: ${\displaystyle 1\cdot f(v)=1\cdot 0=0}$. But are there linear maps that do preserve linear independence?

The answer is yes, and they are called monomorphisms.

What additional property does a linear map need to have in order to preserve linear independence? We take some linearly independent vectors ${\displaystyle v_{1},\ldots ,v_{n}}$. For a linear map ${\displaystyle f}$ to preserve linear independence, it needs to satisfy:

${\displaystyle \sum _{i=1}^{n}\lambda _{i}f(v_{i})=\sum _{i=1}^{n}\mu _{i}f(v_{i})\implies \forall i\leq n:\lambda _{i}=\mu _{i}.}$

We transform:

${\displaystyle {\begin{aligned}&\left(\sum _{i=1}^{n}\lambda _{i}f(v_{i})=\sum _{i=1}^{n}\mu _{i}f(v_{i})\implies \forall i\leq n:\lambda _{i}=\mu _{i}\right)\\[0.3em]&{\color {OliveGreen}\left\downarrow \ {\text{homogeneity of }}f\right.}\\[0.3em]\iff &\left(\sum _{i=1}^{n}f(\lambda _{i}v_{i})=\sum _{i=1}^{n}f(\mu _{i}v_{i})\implies \forall i\leq n:\lambda _{i}=\mu _{i}\right)\\[0.3em]&{\color {OliveGreen}\left\downarrow \ {\text{additivity of }}f\right.}\\[0.3em]\iff &\left(f\left(\sum _{i=1}^{n}\lambda _{i}v_{i}\right)=f\left(\sum _{i=1}^{n}\mu _{i}v_{i}\right)\implies \forall i\leq n:\lambda _{i}=\mu _{i}\right).\end{aligned}}}$

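Since ${\displaystyle v_{1},\ldots ,v_{n}}$ are linearly independent, the coefficients agree (${\displaystyle \lambda _{i}=\mu _{i}}$ for all ${\displaystyle i}$) exactly when the two linear combinations agree. Therefore ${\displaystyle f}$ must have the following property to preserve linear independence: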

${\displaystyle f\left(\sum _{i=1}^{n}\lambda _{i}v_{i}\right)=f\left(\sum _{i=1}^{n}\mu _{i}v_{i}\right)\implies \sum _{i=1}^{n}\lambda _{i}v_{i}=\sum _{i=1}^{n}\mu _{i}v_{i}.}$

By setting ${\textstyle x:=\sum _{i=1}^{n}\lambda _{i}v_{i}}$ and ${\textstyle y:=\sum _{i=1}^{n}\mu _{i}v_{i}}$, it becomes clearer what this property is. We get that

${\displaystyle f(x)=f(y)\implies x=y}$

for all ${\displaystyle x,y\in V}$ which can be written as a linear combination of ${\displaystyle v_{1},\ldots ,v_{n}}$.

This statement should be valid for all linearly independent sets and therefore also for bases. In the case of a basis, however, all ${\displaystyle x,y}$ can be written as such a linear combination, which means that ${\displaystyle f}$ must be injective. Thus injectivity is a necessary condition for a linear map to preserve linear independence.

Is injectivity also a sufficient condition for this property? To find out, let ${\displaystyle f}$ be an injective linear map and let ${\displaystyle v_{1},\ldots ,v_{n}\in V}$ be linearly independent vectors. We want to find out whether ${\displaystyle f(v_{1}),\ldots ,f(v_{n})}$ are also linearly independent. According to our considerations above, it is enough to show the following for scalars ${\displaystyle \lambda _{i}}$ and ${\displaystyle \mu _{i}\in K}$:

${\displaystyle \left(f\left(\sum _{i=1}^{n}\lambda _{i}v_{i}\right)=f\left(\sum _{i=1}^{n}\mu _{i}v_{i}\right)\implies \forall i\leq n:\lambda _{i}=\mu _{i}\right).}$

Let

${\displaystyle f\left(\sum _{i=1}^{n}\lambda _{i}v_{i}\right)=f\left(\sum _{i=1}^{n}\mu _{i}v_{i}\right)}$

Then, we have from the injectivity of ${\displaystyle f}$ that

${\displaystyle \sum _{i=1}^{n}\lambda _{i}v_{i}=\sum _{i=1}^{n}\mu _{i}v_{i}}$.

Because ${\displaystyle v_{1},\ldots ,v_{n}}$ are linearly independent, we have that ${\displaystyle \lambda _{i}=\mu _{i}}$ for all ${\displaystyle i}$. Thus we have shown the above statement and ${\displaystyle f}$ preserves linear independence.

Thus, a linear map preserves linear independence if and only if it is injective. We call injective linear maps monomorphisms.
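The following sketch illustrates this conclusion numerically. It is only an illustration under assumptions that are not part of the text: linear maps are represented by hypothetical matrices with rational entries, and the helper names `rank`, `is_independent` and `apply` are chosen just for this example.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a matrix (given as a list of rows) via Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    n_cols = len(m[0]) if m else 0
    for col in range(n_cols):
        # Find a row at or below position r with a non-zero entry in this column.
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            factor = m[i][col] / m[r][col]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

def is_independent(vectors):
    """Finitely many vectors are linearly independent iff their rank equals their number."""
    return rank(vectors) == len(vectors)

def apply(matrix, v):
    """Apply the linear map given by `matrix` to the vector `v`."""
    return [sum(a * x for a, x in zip(row, v)) for row in matrix]

injective = [[1, 0], [0, 1], [1, 1]]   # (x, y) -> (x, y, x + y)
projection = [[1, 0], [0, 0]]          # (x, y) -> (x, 0), not injective

vectors = [[1, 2], [3, 4]]             # linearly independent in R^2
print(is_independent(vectors))                                    # True
print(is_independent([apply(injective, v) for v in vectors]))     # True
print(is_independent([apply(projection, v) for v in vectors]))    # False
```

The injective map keeps the two vectors independent, while the projection maps them to multiples of each other.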

Definition

Definition (monomorphism)

A monomorphism is an injective linear map ${\displaystyle f\colon V\to W}$ between two ${\displaystyle K}$-vector spaces ${\displaystyle V}$ and ${\displaystyle W}$.

That is, ${\displaystyle f}$ is a linear map such that for all ${\displaystyle v,{\tilde {v}}\in V}$ the statement ${\displaystyle f(v)=f({\tilde {v}})}$ implies that also ${\displaystyle v={\tilde {v}}}$.

Equivalent characterization of monomorphisms

We have seen in the motivation that monomorphisms should be exactly those linear maps that preserve linear independence of vectors. We now prove this rigorously:

Theorem (monomorphisms preserve linear independence)

Let ${\displaystyle f:V\to W}$ be a linear map. Then, we have that ${\displaystyle f}$ is injective if and only if the image of every linearly independent subset ${\displaystyle M\subseteq V}$ is again linearly independent.

Thus, the linear map ${\displaystyle f}$ preserves linear independence exactly if ${\displaystyle f}$ is a monomorphism.

How to get to the proof? (monomorphisms preserve linear independence)

We follow the preliminary considerations from the motivation. What we would like to show are two implications: "${\displaystyle f}$ is injective ${\displaystyle \implies }$ the image of every linearly independent subset ${\displaystyle M\subseteq V}$ is linearly independent." and "The image of every linearly independent subset ${\displaystyle M\subseteq V}$ is linearly independent ${\displaystyle \implies }$ ${\displaystyle f}$ is injective."

However, it is easier to prove linear dependence than linear independence, because for linear dependence of a set we only need to find one non-trivial linear combination that yields 0. For linear independence, we need to prove that every finite subset of the set is linearly independent. Therefore, we do not show the above implications directly, but prove their contrapositives instead.

Proof (monomorphisms preserve linear independence)

We show "There exists a linearly independent subset ${\displaystyle M\subseteq V}$ such that ${\displaystyle f(M)}$ is linearly dependent" ${\displaystyle \iff }$ "${\displaystyle f}$ is not injective"

Proof step: ${\displaystyle \implies }$

So let ${\displaystyle M\subseteq V}$ be linearly independent, but ${\displaystyle f(M)\subseteq W}$ be linearly dependent.

Then ${\displaystyle f(M)}$ contains a finite linearly dependent subset ${\displaystyle \{w_{1},\ldots ,w_{n}\}}$ of pairwise distinct vectors. For each ${\displaystyle w_{i}}$ we choose a preimage ${\displaystyle v_{i}\in M}$, so ${\displaystyle w_{i}=f(v_{i})}$; the vectors ${\displaystyle v_{1},\ldots ,v_{n}}$ are then pairwise distinct as well. Since ${\displaystyle w_{1},\ldots ,w_{n}}$ are linearly dependent, there exist scalars ${\displaystyle \mu _{1},\ldots ,\mu _{n}\in K}$, not all zero, such that

${\displaystyle 0_{W}=\mu _{1}w_{1}+\cdots +\mu _{n}w_{n}=\mu _{1}\cdot f(v_{1})+\cdots +\mu _{n}\cdot f(v_{n})=f(\mu _{1}\cdot v_{1}+\cdots +\mu _{n}\cdot v_{n}).}$

Then, we have ${\displaystyle v:=\mu _{1}\cdot v_{1}+\cdots +\mu _{n}\cdot v_{n}\neq 0_{V}}$, since at least one ${\displaystyle \mu _{i}\neq 0}$ and the vectors ${\displaystyle v_{1},\ldots ,v_{n}\in M}$ are linearly independent. Now on the one hand ${\displaystyle f(v)=0_{W}}$, and on the other hand ${\displaystyle f(0_{V})=0_{W}}$. Since ${\displaystyle v\neq 0_{V}}$, the map ${\displaystyle f}$ is not injective.

Proof step: ${\displaystyle \Longleftarrow }$

Since ${\displaystyle f}$ is not injective, there are some ${\displaystyle v_{1},v_{2}\in V}$ with ${\displaystyle v_{1}\neq v_{2}}$, but ${\displaystyle f(v_{1})=f(v_{2})}$. For ${\displaystyle {\tilde {v}}:=v_{1}-v_{2}}$ we have that then ${\displaystyle f({\tilde {v}})=f(v_{1}-v_{2})=f(v_{1})-f(v_{2})=0_{W}}$.

Now define the set ${\displaystyle M:=\lbrace {\tilde {v}}\rbrace }$. Since ${\displaystyle v_{1}\neq v_{2}}$, we have ${\displaystyle {\tilde {v}}\neq 0}$, so ${\displaystyle M}$ is linearly independent, but ${\displaystyle f(M)=f(\lbrace {\tilde {v}}\rbrace )=\lbrace 0_{W}\rbrace }$ is linearly dependent.
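The construction in the second proof step can be retraced on a concrete case. This is a minimal sketch: the map ${\displaystyle f(x,y)=x+y}$ and the vectors below are hypothetical choices made only for illustration.

```python
def apply(matrix, v):
    """Apply the linear map given by `matrix` to the vector `v`."""
    return [sum(a * x for a, x in zip(row, v)) for row in matrix]

f = [[1, 1]]               # f(x, y) = x + y, a non-injective map R^2 -> R^1
v1, v2 = [1, 0], [0, 1]    # two different vectors with the same image

print(apply(f, v1) == apply(f, v2))   # True

# The difference is a non-zero vector, so M = {v_tilde} is linearly independent.
v_tilde = [a - b for a, b in zip(v1, v2)]
print(v_tilde)             # [1, -1]

# Its image is the zero vector, so f(M) = {0} is linearly dependent.
image = apply(f, v_tilde)
print(image)               # [0]
```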

We can derive a different criterion for a linear map being a monomorphism: Suppose we have linearly independent vectors ${\displaystyle v_{1},\dots ,v_{n}\in V}$. The linear independence means that the vectors describe "independent information". We have seen above that monomorphisms preserve linear independence. This means that monomorphisms map independent information to independent information. So monomorphisms preserve all information.

Suppose we have a monomorphism ${\displaystyle f\colon V\to W}$, another vector space ${\displaystyle U}$ and linear maps ${\displaystyle a,b\colon U\to V}$ such that ${\displaystyle f\circ a=f\circ b}$ holds. Since no information was lost by the application of ${\displaystyle f}$, the maps ${\displaystyle a}$ and ${\displaystyle b}$ must have been the same before the application. So for a monomorphism ${\displaystyle f}$, one can conclude ${\displaystyle a=b}$ from ${\displaystyle f\circ a=f\circ b}$. One also says that the monomorphism can be left shortened. The next theorem verifies that the ability to be left shortened is equivalent to a linear map being a monomorphism.

Theorem (monomorphisms can be "left shortened")

Let ${\displaystyle f\colon V\to W}$ be a linear map. Then, we have: ${\displaystyle f}$ is a monomorphism if and only if for all vector spaces ${\displaystyle U}$ and for all linear maps ${\displaystyle a,b\colon U\to V}$ with ${\displaystyle f\circ a=f\circ b}$ we have that ${\displaystyle a=b}$.

One also says that ${\displaystyle f}$ can be left shortened. In the literature, this property is usually called being left cancellable.

Proof (monomorphisms can be "left shortened")

Proof step: ${\displaystyle \implies }$, by a direct proof

Let ${\displaystyle f\colon V\to W}$ be a monomorphism, i.e. an injective linear map. Let ${\displaystyle U}$ be another vector space and ${\displaystyle a,b\colon U\to V}$ be linear maps with ${\displaystyle f\circ a=f\circ b}$. Let ${\displaystyle v\in U}$. Then ${\displaystyle f(a(v))=f(b(v))}$. Since ${\displaystyle f}$ is injective, it follows that ${\displaystyle a(v)=b(v)}$. Since ${\displaystyle v}$ was chosen arbitrarily, we obtain ${\displaystyle a=b}$.

Proof step: ${\displaystyle \Longleftarrow }$, proof by contradiction

Suppose that ${\displaystyle f}$ is not a monomorphism, i.e. ${\displaystyle f}$ is not injective. Then there exist ${\displaystyle v}$ and ${\displaystyle w\in V}$ with ${\displaystyle v\neq w}$ and ${\displaystyle f(v)=f(w)}$.

Without loss of generality, let ${\displaystyle v\neq 0}$ (otherwise swap ${\displaystyle v}$ and ${\displaystyle w}$).

We extend ${\displaystyle \lbrace v\rbrace }$ to a basis ${\displaystyle C}$ of ${\displaystyle V}$.

Then we consider the two linear maps ${\displaystyle a=id_{V}}$ and ${\displaystyle b\colon V\to V,v\mapsto w,c\mapsto c}$ for all ${\displaystyle c\in C\setminus \lbrace v\rbrace }$. (The second linear map is obtained by linear extension from its values on the basis vectors.)

We now show that ${\displaystyle f\circ a=f\circ b}$ holds. For all ${\displaystyle c\in C\setminus \lbrace v\rbrace }$ we have that ${\displaystyle f(a(c))=f(c)=f(b(c))}$, and for ${\displaystyle v}$ we have ${\displaystyle f(a(v))=f(v)=f(w)=f(b(v))}$. Hence the linear maps ${\displaystyle f\circ a}$ and ${\displaystyle f\circ b}$ agree on all elements of the basis ${\displaystyle C}$ of ${\displaystyle V}$, and therefore ${\displaystyle f\circ a=f\circ b}$.

But we also have that ${\displaystyle a\neq b}$, since ${\displaystyle a(v)=v\neq w=b(v)}$.

This contradicts the assumption that ${\displaystyle f}$ can be left shortened. It follows that ${\displaystyle f}$ is a monomorphism.
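The construction from the second proof step can be tried out on a small case. The following is a sketch with hypothetical matrices: ${\displaystyle f(x,y)=x+y}$ is not injective, ${\displaystyle v=(1,0)^{T}}$ and ${\displaystyle w=(0,1)^{T}}$ have the same image, and ${\displaystyle \lbrace v,w\rbrace }$ serves as the basis ${\displaystyle C}$.

```python
def matmul(A, B):
    """Matrix product A * B, which represents the composition of the two linear maps."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))]
            for i in range(len(A))]

f = [[1, 1]]             # f(x, y) = x + y, so f(v) = f(w) = 1

a = [[1, 0],             # a = id_V
     [0, 1]]
b = [[0, 0],             # b sends v = (1, 0) to w = (0, 1) and fixes (0, 1);
     [1, 1]]             # the columns of b are the images of the basis vectors

print(matmul(f, a))      # [[1, 1]]
print(matmul(f, b))      # [[1, 1]]
print(a == b)            # False
```

Here ${\displaystyle f\circ a=f\circ b}$ although ${\displaystyle a\neq b}$, so this non-injective ${\displaystyle f}$ cannot be left shortened.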

Hint

This theorem is useful because sometimes it is easier to show that ${\displaystyle f\circ a=f\circ b}$ holds than to prove ${\displaystyle a=b}$ directly. The theorem gives us a kind of "rule of calculation" for linear maps. Moreover, the property of being left shortened does not refer to individual elements of the vector spaces. This allows us to generalize the concept of monomorphism to categories that you may encounter in further study.

Examples

Example

The map ${\displaystyle h\colon \mathbb {R} ^{2}\to \mathbb {R} ^{3}}$ with the following mapping rule is a vector space monomorphism:

${\displaystyle h\left({\begin{pmatrix}x\\y\end{pmatrix}}\right)={\begin{pmatrix}x\\y\\x+y\end{pmatrix}}}$

Indeed, from ${\displaystyle h(x_{1},y_{1})=h(x_{2},y_{2})}$, it follows:

${\displaystyle {\begin{pmatrix}x_{1}\\y_{1}\\x_{1}+y_{1}\end{pmatrix}}={\begin{pmatrix}x_{2}\\y_{2}\\x_{2}+y_{2}\end{pmatrix}}}$

But then ${\displaystyle x_{1}=x_{2}}$ and ${\displaystyle y_{1}=y_{2}}$ must hold, and so the equality of the arguments follows: ${\displaystyle (x_{1},y_{1})^{T}=(x_{2},y_{2})^{T}}$. This shows that ${\displaystyle h}$ is injective.
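The argument above reads the arguments ${\displaystyle x}$ and ${\displaystyle y}$ off the first two components of the image. The following sketch spells this out; the helper name `recover` and the sample points are assumptions made for illustration, and checking finitely many samples is only a sanity check, not a proof.

```python
def h(x, y):
    """The map h from the example: (x, y) -> (x, y, x + y)."""
    return (x, y, x + y)

def recover(w):
    """Read the original argument off the first two components of an image vector."""
    return (w[0], w[1])

samples = [(0, 0), (1, 2), (-3, 5), (2, -2)]

# Every sample can be reconstructed from its image, so no two samples share an image.
print(all(recover(h(x, y)) == (x, y) for x, y in samples))   # True
print(h(1, 2))                                               # (1, 2, 3)
```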

Relation to the kernel

Alternative derivation of a monomorphism

Linear maps preserve linear independence if and only if they are injective. We call these maps monomorphisms. To derive this, we first recalled how linear independence can be characterized, namely via the uniqueness of the representation of vectors as a linear combination. However, as recalled at the beginning, linear independence can also be defined using only the representation of the zero vector: ${\displaystyle v_{1},\dots ,v_{n}}$ are linearly independent if ${\textstyle \sum _{i=1}^{n}\lambda _{i}v_{i}=0}$ implies that all coefficients satisfy ${\displaystyle \lambda _{i}=0}$.

What if we tried to derive the definition of a monomorphism from this definition? Again we are looking for a property of a linear map ${\displaystyle f}$ that allows us to infer the linear independence of ${\displaystyle f(v_{i})}$ from the linear independence of ${\displaystyle v_{i}}$. To this end, let ${\displaystyle v_{1},\ldots ,v_{n}}$ be linearly independent. We want to show that:

${\displaystyle \sum _{i=1}^{n}\lambda _{i}f(v_{i})=0\implies \forall i\leq n:\lambda _{i}=0.}$

By the linearity of ${\displaystyle f}$, this is equivalent to

${\displaystyle f\left(\sum _{i=1}^{n}\lambda _{i}v_{i}\right)=0\implies \forall i\leq n:\lambda _{i}=0.}$

Our desired property must guarantee that ${\textstyle f\left(\sum _{i=1}^{n}\lambda _{i}v_{i}\right)=0}$ implies ${\textstyle \sum _{i=1}^{n}\lambda _{i}v_{i}=0}$. Then the linear independence of ${\displaystyle v_{i}}$ yields ${\displaystyle \lambda _{i}=0}$ for all ${\displaystyle i}$, which proves the linear independence of ${\displaystyle f(v_{i})}$.

So ${\displaystyle f}$ needs to fulfil the property: ${\displaystyle f(v)=0\implies v=0}$ for all vectors ${\displaystyle v}$. By the principle of contraposition, this property is equivalent to ${\displaystyle v\neq 0\implies f(v)\neq 0}$. So the property we are looking for is: "The set of elements that are mapped to zero consists only of the zero vector." This property, by the way, is the special case of injectivity at the point ${\displaystyle 0}$ and states that only the zero element of the domain vector space is mapped to the zero element of the image vector space.

Definition of the kernel

So the set of elements that are mapped to zero has a special meaning in this context. That is why it has its own name: it is called the kernel of the map.

Definition (Kernel of a linear map)

Let ${\displaystyle f\colon V\to W}$ be a linear map between two ${\displaystyle K}$-vector spaces ${\displaystyle V}$ and ${\displaystyle W}$. The kernel of the map ${\displaystyle f}$ is the set of all vectors from ${\displaystyle V}$ that are mapped to ${\displaystyle 0_{_{W}}}$ by ${\displaystyle f}$ and is denoted ${\displaystyle \ker f}$. In mathematical terms:

${\displaystyle \ker f=f^{-1}\left(\left\{0_{_{W}}\right\}\right)=\left\{\,v\in V:f(v)=0_{_{W}}\,\right\}\subseteq V.}$

Reading off injectivity from the kernel

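We now know two properties of linear maps which guarantee that they preserve linear independence: on the one hand injectivity, and on the other hand the kernel of the linear map being trivial (i.e., containing only the zero vector). Both properties have the same effect, so we may conjecture that they are equivalent. This is indeed the case. If ${\displaystyle f}$ is injective and ${\displaystyle f(v)=0_{W}}$, then ${\displaystyle f(v)=f(0_{V})}$ and hence ${\displaystyle v=0_{V}}$, so the kernel is trivial. Conversely, if the kernel is trivial and ${\displaystyle f(v)=f(w)}$, then ${\displaystyle f(v-w)=f(v)-f(w)=0_{W}}$ by linearity, so ${\displaystyle v-w\in \ker f=\lbrace 0_{V}\rbrace }$ and therefore ${\displaystyle v=w}$.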

Alternative definition of a monomorphism

So we have learned a second property with which one can characterize monomorphisms. A linear map is a monomorphism if and only if its kernel consists only of the zero vector. We also say that the kernel is "trivial". We can thus formulate an alternative definition for monomorphisms:

Definition (monomorphism)

A monomorphism is a linear map ${\displaystyle f\colon V\to W}$ between two ${\displaystyle K}$-vector spaces ${\displaystyle V}$ and ${\displaystyle W}$ for which one (and hence each) of the following equivalent statements holds:

• ${\displaystyle f}$ is injective.
• For all ${\displaystyle v\in V}$ we have that ${\displaystyle (v\neq 0\implies f(v)\neq 0)}$.
• For all ${\displaystyle v\in V}$ we have that ${\displaystyle (f(v)=0\implies v=0)}$.
• The kernel of ${\displaystyle f}$ is trivial, i.e., ${\displaystyle \ker f=\lbrace 0_{_{V}}\rbrace }$.
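Over a finite field, the equivalent statements above can be checked by brute force, directly from the definitions. The following is a minimal sketch assuming the field ${\displaystyle K=\mathbb {F} _{2}}$ (arithmetic modulo 2), which the text does not single out; the matrices and helper names are hypothetical.

```python
from itertools import product

def apply_mod2(matrix, v):
    """Apply the linear map given by `matrix` to `v`, with arithmetic modulo 2."""
    return tuple(sum(a * x for a, x in zip(row, v)) % 2 for row in matrix)

def kernel(matrix, n):
    """All vectors of F_2^n that are mapped to the zero vector."""
    return [v for v in product((0, 1), repeat=n) if not any(apply_mod2(matrix, v))]

def is_injective(matrix, n):
    """Injectivity by definition: different vectors of F_2^n have different images."""
    images = [apply_mod2(matrix, v) for v in product((0, 1), repeat=n)]
    return len(set(images)) == len(images)

mono = [[1, 0], [0, 1], [1, 1]]    # (x, y) -> (x, y, x + y)
not_mono = [[1, 1], [1, 1]]        # (x, y) -> (x + y, x + y)

print(kernel(mono, 2), is_injective(mono, 2))           # [(0, 0)] True
print(kernel(not_mono, 2), is_injective(not_mono, 2))   # [(0, 0), (1, 1)] False
```

In both cases the kernel is trivial exactly when the map is injective.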

Exercises

Exercise (Verification of a monomorphism)

Show that for ${\displaystyle m\geq n}$, the map ${\displaystyle f\colon \mathbb {R} ^{n}\to \mathbb {R} ^{m}:\quad (x_{1},x_{2},\ldots ,x_{n})^{T}\mapsto (x_{1},x_{2},\ldots ,x_{n},\underbrace {0,\ldots ,0} _{(m-n){\text{ times}}})^{T}}$ is a monomorphism. This shows that one can map the "smaller" vector space ${\displaystyle \mathbb {R} ^{n}}$ injectively into the "bigger" vector space ${\displaystyle \mathbb {R} ^{m}}$, as long as ${\displaystyle m\geq n}$.

Solution (Verification of a monomorphism)

Let ${\displaystyle v=(v_{1},\ldots ,v_{n})^{T}\in \mathbb {R} ^{n}}$ and ${\displaystyle w=(w_{1},\ldots ,w_{n})^{T}\in \mathbb {R} ^{n}}$, as well as ${\displaystyle \lambda ,\mu \in \mathbb {R} }$. By definition of the map ${\displaystyle f}$, we have that:

${\displaystyle f(\lambda v+\mu w)=f\left({\begin{pmatrix}\lambda v_{1}+\mu w_{1}\\\vdots \\\lambda v_{n}+\mu w_{n}\end{pmatrix}}\right)={\begin{pmatrix}\lambda v_{1}+\mu w_{1}\\\vdots \\\lambda v_{n}+\mu w_{n}\\0\\\vdots \\0\end{pmatrix}}=\lambda \cdot {\begin{pmatrix}v_{1}\\\vdots \\v_{n}\\0\\\vdots \\0\end{pmatrix}}+\mu \cdot {\begin{pmatrix}w_{1}\\\vdots \\w_{n}\\0\\\vdots \\0\end{pmatrix}}=\lambda f(v)+\mu f(w).}$

So ${\displaystyle f}$ is linear. It remains to be shown that ${\displaystyle f}$ is injective. To show the injectivity of ${\displaystyle f}$, there are (at least) two ways:

1st way

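From the definition of the linear map ${\displaystyle f}$ it is clear that only the zero vector of ${\displaystyle \mathbb {R} ^{n}}$ is mapped by ${\displaystyle f}$ to the zero vector of ${\displaystyle \mathbb {R} ^{m}}$: the first ${\displaystyle n}$ components of ${\displaystyle f(v)}$ are exactly the components of ${\displaystyle v}$. Thus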

${\displaystyle \operatorname {ker} f=\left\{(0,\ldots ,0)^{T}\right\}.}$

Thus the kernel of ${\displaystyle f}$ contains only the zero vector. By the theorem on the relation between kernel and injectivity of a linear map, it follows that ${\displaystyle f}$ is injective. Together with the linearity of ${\displaystyle f}$ it is thus shown that ${\displaystyle f}$ is a monomorphism.

2nd way

A second way to prove the injectivity of the map ${\displaystyle f}$ is to verify the definition of injectivity directly:

Let ${\displaystyle f(v)=f(w)}$. This is equivalent to the statement ${\displaystyle f(v)-f(w)=0}$. In other words

${\displaystyle f\left({\begin{pmatrix}v_{1}\\\vdots \\v_{n}\end{pmatrix}}\right)-f\left({\begin{pmatrix}w_{1}\\\vdots \\w_{n}\end{pmatrix}}\right)={\begin{pmatrix}v_{1}\\\vdots \\v_{n}\\0\\\vdots \\0\end{pmatrix}}-{\begin{pmatrix}w_{1}\\\vdots \\w_{n}\\0\\\vdots \\0\end{pmatrix}}=0.}$

From this representation one recognizes immediately that ${\displaystyle v_{1}=w_{1},v_{2}=w_{2},\ldots ,v_{n}=w_{n}}$ must hold. Thus ${\displaystyle v=w}$ and hence, ${\displaystyle f}$ is injective. Together with the linearity of ${\displaystyle f}$ we have therefore shown that ${\displaystyle f}$ is a monomorphism.
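The map from this exercise can also be written down as a short program. The following sketch uses plain Python lists for vectors; the function names `embed` and `lin_comb` and the sample vectors are hypothetical, and the checks only test the claims on examples rather than proving them.

```python
def embed(v, m):
    """Map a vector of R^n to R^m (m >= n) by appending m - n zeros."""
    n = len(v)
    if m < n:
        raise ValueError("embed requires m >= n")
    return list(v) + [0] * (m - n)

def lin_comb(lam, v, mu, w):
    """The linear combination lam * v + mu * w, computed componentwise."""
    return [lam * a + mu * b for a, b in zip(v, w)]

v, w = [1, 2, 3], [4, 5, 6]
lam, mu, m = 2, -1, 5

# Linearity on one example: f(lam*v + mu*w) == lam*f(v) + mu*f(w).
left = embed(lin_comb(lam, v, mu, w), m)
right = lin_comb(lam, embed(v, m), mu, embed(w, m))
print(left == right)          # True

# The first n components of the image give back the argument,
# so different arguments have different images.
print(embed(v, m))            # [1, 2, 3, 0, 0]
print(embed(v, m)[:len(v)])   # [1, 2, 3]
```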