# Linear independence – Serlo

## Motivation

### Basic motivation

Maybe, you learned about vectors in school, where they were drawn as arrows in the plane or in space. Both a plane and space are vector spaces. But how do they differ?

A spontaneous answer could be: "The plane is two-dimensional and the space is three-dimensional". But this brings us immediately to further questions:

• What is the dimension of a vector space?
• How can we define it?

In the definition of the vector space the term "dimension" does not occur...

### Intuition of a dimension

The term "dimension" describes in how many independent directions geometric objects can be extended in a space. The objects can also move in just as many independent directions in space ("degrees of freedom of motion").

The plane has two dimensions - the width and the length. It is flat, no object of the plane can reach out of it "into height". A sphere as a three-dimensional object cannot be part of the plane. In contrast, the space with length, width and height has three dimensions. A sphere can thus be part of space.

We summarize: The dimension intuitively corresponds to the number of independent directions into which a geometric object can expand or move. So, for the definition of dimension, we need to answer the following questions:

• What is a direction in a vector space?
• When are two directions independent?
• How can the number of independent directions be determined?

## Derivation of the Definition

### What is a direction within a vector space?

Let's take the vector space of the plane as an example. We can represent a direction with an arrow:

Now an arrow is nothing but a vector. So with the help of vectors, directions can be represented. Here we must not use the zero vector. As an arrow of length zero it has no direction. We can generalize this to arbitrary vector spaces:

Every vector not equal to the zero vector represents a direction in a vector space.

The direction in which the vector points is ${\displaystyle \{\alpha \cdot v:\alpha \in \mathbb {R} \}}$, that is, the span ${\displaystyle \operatorname {span} (\{v\})}$ of the vector ${\displaystyle v}$. This span contains all scalings ${\displaystyle \alpha v}$ of the direction vector ${\displaystyle v}$ and thus describes the straight line spanned by ${\displaystyle v}$:

### From the line to the plane

To get from the straight line to the plane, we need not just one vector but at least two vectors ${\displaystyle v,w}$. This is intuitively clear, because a plane can only be spanned by two vectors. So we need a further vector that is independent of the first one. What does "independent" mean in this case? First, we notice that the new vector must not be the zero vector, since the zero vector does not give any direction. Furthermore, the new vector must not be a multiple of the original vector, i.e. ${\displaystyle w\neq \alpha v}$. This also excludes reflections of the direction vector, which correspond to multiplication by a negative factor.

We conclude: The new vector ${\displaystyle w}$ is independent of the direction vector ${\displaystyle v}$ exactly when ${\displaystyle w}$ does not lie on the straight line spanned by ${\displaystyle v}$. So we need ${\displaystyle w\neq \alpha v}$ for all real numbers ${\displaystyle \alpha }$. In other words, the new vector must not lie in the span of the old one; the two spans intersect only in the zero vector.

### From the plane to space

We have just seen that we can characterize a plane by two independent vectors. Now we want to go from the plane to space. Here, we also have to add an independent direction. But what is a direction independent of the plane?

The new vector must not be the zero vector, because this vector does not indicate a direction. The new vector must also not lie in the plane, because in that case, no new direction would be described. Only if the new vector does not lie in the plane does it point in a new, independent direction:

How can we formulate this insight mathematically? Let ${\displaystyle v}$ and ${\displaystyle w}$ be the two direction vectors spanning the plane. This plane is then equal to the set ${\displaystyle \{\alpha v+\beta w:\alpha ,\beta \in \mathbb {R} \}}$. Hence, the plane is the set of all sums ${\displaystyle \alpha v+\beta w}$ for real numbers ${\displaystyle \alpha ,\beta \in \mathbb {R} }$. In order for the new vector ${\displaystyle u}$ not to be in the plane, we must have ${\displaystyle u\neq \alpha v+\beta w}$ for all ${\displaystyle \alpha ,\beta \in \mathbb {R} }$. Thus, ${\displaystyle u}$ is independent of ${\displaystyle v}$ and ${\displaystyle w}$ if it cannot be written as such a sum.

Question: We had first required that the new vector ${\displaystyle u}$ must not be the zero vector. Why is it sufficient that ${\displaystyle u\neq \alpha v+\beta w}$ for all ${\displaystyle \alpha ,\beta \in \mathbb {R} }$? And why does this imply ${\displaystyle u\neq 0}$ ?

For ${\displaystyle \alpha =\beta =0}$ we have ${\displaystyle \alpha v+\beta w=0}$. Since ${\displaystyle u}$ must also differ from ${\displaystyle \alpha v+\beta w}$ for ${\displaystyle \alpha =\beta =0}$, we have ${\displaystyle u\neq 0}$.

Question: Is it sufficient that ${\displaystyle u}$ is no multiple of ${\displaystyle v}$ or ${\displaystyle w}$ ?

No, take for example ${\displaystyle u={\tfrac {1}{2}}v+{\tfrac {1}{2}}w}$. If ${\displaystyle v}$ is independent of ${\displaystyle w}$ , then ${\displaystyle u}$ is neither a stretched version of ${\displaystyle v}$ , nor one of ${\displaystyle w}$. However, this vector lies in the plane spanned by ${\displaystyle v}$ and ${\displaystyle w}$ , so it does not point into a direction independent of ${\displaystyle v}$ and ${\displaystyle w}$.

### A first criterion for linear independence

Let's summarize: To describe a straight line we needed a vector ${\displaystyle v}$ not equal to the zero vector. In the transition from the straight line to the plane, we had to add a vector ${\displaystyle w}$ independent of ${\displaystyle v}$. Independence of ${\displaystyle w}$ from ${\displaystyle v}$ means that ${\displaystyle w}$ does not lie on the line described by ${\displaystyle v}$. So we need to have ${\displaystyle w\neq \alpha v}$ for all ${\displaystyle \alpha \in \mathbb {R} }$.

In the second step, we added a new direction ${\displaystyle u}$ to the plane, which is independent of the two vectors ${\displaystyle v}$ and ${\displaystyle w}$. Here independence manifests itself in the fact that ${\displaystyle u}$ is not in the plane spanned by ${\displaystyle v}$ and ${\displaystyle w}$ . Hence, we need ${\displaystyle u\neq \alpha v+\beta w}$ for all real numbers ${\displaystyle \alpha }$ and ${\displaystyle \beta }$. We can generalise this to an arbitrary number of vectors (but it is not so easy to visualize anymore):

The vector ${\displaystyle w}$ is independent of the vectors ${\displaystyle v_{1},v_{2}\ldots v_{n}}$, if ${\displaystyle w\neq \lambda _{1}v_{1}+\lambda _{2}v_{2}+\ldots +\lambda _{n}v_{n}}$ for all ${\displaystyle \lambda _{1},\lambda _{2},\ldots \lambda _{n}\in \mathbb {R} }$.

In the above description, the sum ${\displaystyle \lambda _{1}v_{1}+\lambda _{2}v_{2}+\ldots +\lambda _{n}v_{n}}$ appears. Such a sum is called a linear combination of the vectors ${\displaystyle v_{1}}$ to ${\displaystyle v_{n}}$. We may also say that ${\displaystyle w}$ is linearly independent of ${\displaystyle v_{1},\ldots ,v_{n}}$ if ${\displaystyle w\notin \operatorname {span} \{v_{1},...,v_{n}\}}$. The description can thus be rephrased as:

The vector ${\displaystyle w}$ is independent of the vectors ${\displaystyle v_{1},v_{2},\ldots ,v_{n}}$, if ${\displaystyle w}$ cannot be written as a linear combination of the vectors ${\displaystyle v_{1}}$ to ${\displaystyle v_{n}}$.

Here we have clarified when a vector is independent of other vectors. Is this sufficient to describe the independence of several vectors? Take the following three vectors ${\displaystyle a}$, ${\displaystyle b}$ and ${\displaystyle c}$ as an example:

Since no vector is a multiple of another vector, the three vectors, seen in pairs, point in independent directions. For example, ${\displaystyle a}$ is independent of ${\displaystyle b}$, and ${\displaystyle c}$ is independent of ${\displaystyle a}$. Nevertheless, the three vectors are not independent of each other, because they all lie in one plane. We have ${\displaystyle c=a+b}$, so ${\displaystyle c}$ is not independent of ${\displaystyle a}$ and ${\displaystyle b}$. Accordingly, we have to demand the following for linear independence of ${\displaystyle a}$, ${\displaystyle b}$ and ${\displaystyle c}$:

• ${\displaystyle a}$ is independent of ${\displaystyle b}$ and ${\displaystyle c}$: We have ${\displaystyle a\neq \beta b+\gamma c}$ for all ${\displaystyle \beta ,\gamma \in \mathbb {R} }$.
• ${\displaystyle b}$ is independent of ${\displaystyle a}$ and ${\displaystyle c}$: We have ${\displaystyle b\neq \alpha a+\gamma c}$ for all ${\displaystyle \alpha ,\gamma \in \mathbb {R} }$.
• ${\displaystyle c}$ is independent of ${\displaystyle a}$ and ${\displaystyle b}$: We have ${\displaystyle c\neq \alpha a+\beta b}$ for all ${\displaystyle \alpha ,\beta \in \mathbb {R} }$.

It should be emphasised at this point that it is necessary to require all three conditions. If we dropped the last two conditions, the first requirement would guarantee that the vector ${\displaystyle a}$ is linearly independent of the vectors ${\displaystyle b}$ and ${\displaystyle c}$, but it would not rule out that ${\displaystyle b}$ and ${\displaystyle c}$ are linearly dependent on each other. In that case, the three vectors would again not be linearly independent of each other.

Therefore, none of the three vectors may be representable as a linear combination of the other two vectors. Otherwise at least one of the vectors depends on the others. We can generalise this to any number of vectors:

Definition (First criterion for linear independence)

Some vectors ${\displaystyle v_{1}}$ to ${\displaystyle v_{n}}$ are linearly independent if none of the vectors can be written as a linear combination of the other vectors. This means that the following must apply:

• ${\displaystyle v_{1}\neq \lambda _{2}v_{2}+\lambda _{3}v_{3}+\dots +\lambda _{n}v_{n}}$ for all ${\displaystyle \lambda _{2},\lambda _{3},\ldots \lambda _{n}\in \mathbb {R} }$.
• ${\displaystyle v_{2}\neq \lambda _{1}v_{1}+\lambda _{3}v_{3}+\dots +\lambda _{n}v_{n}}$ for all ${\displaystyle \lambda _{1},\lambda _{3},\ldots \lambda _{n}\in \mathbb {R} }$.
• ...
• ${\displaystyle v_{n}\neq \lambda _{1}v_{1}+\lambda _{2}v_{2}+\dots +\lambda _{n-1}v_{n-1}}$ for all ${\displaystyle \lambda _{1},\lambda _{2},\ldots \lambda _{n-1}\in \mathbb {R} }$.

So ${\displaystyle v_{1}}$ to ${\displaystyle v_{n}}$ are linearly independent, if ${\displaystyle v_{i}\neq \sum _{j\neq i}\lambda _{j}v_{j}}$ for all ${\displaystyle i\in \{1,2,\ldots ,n\}}$ and ${\displaystyle \lambda _{j}\in \mathbb {R} }$.
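To see how this criterion can be checked in practice, here is a minimal numerical sketch in Python (numpy assumed; the helper name `independent_by_first_criterion` is made up for illustration): for each vector we try to write it as a linear combination of the others via least squares and test whether the residual vanishes.

```python
import numpy as np

def independent_by_first_criterion(vectors, tol=1e-10):
    """Check (numerically) that no vector is a linear combination of the others."""
    for i, v in enumerate(vectors):
        others = [w for j, w in enumerate(vectors) if j != i]
        if not others:
            return not np.allclose(v, 0)          # a single vector: independent iff nonzero
        A = np.column_stack(others)               # columns are the other vectors
        coeffs, *_ = np.linalg.lstsq(A, v, rcond=None)
        if np.linalg.norm(A @ coeffs - v) < tol:  # v lies (numerically) in their span
            return False
    return True

# concrete stand-ins for the example above: a, b and c = a + b
a, b = np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0])
print(independent_by_first_criterion([a, b]))        # True
print(independent_by_first_criterion([a, b, a + b])) # False: c = a + b is a combination
```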

### From the first criterion to the formal definition

With the first criterion found above, we already have a suitable definition for linear independence of vectors. In the following, we will try to find a more concise equivalent criterion, with which we can examine the linear independence of vectors more easily.

Vectors are independent if no vector can be represented as a linear combination of the other vectors. From this we will derive another criterion for linear independence, which is less computationally demanding. Let us take vectors ${\displaystyle v_{1}}$, ${\displaystyle v_{2}}$ to ${\displaystyle v_{n}}$ from a vector space ${\displaystyle V}$ that are not independent. So there is one vector that can be represented by the others; without loss of generality, let ${\displaystyle v_{1}}$ be this vector. There are thus stretching factors (scalars) ${\displaystyle \lambda _{2}}$ to ${\displaystyle \lambda _{n}}$, such that

${\displaystyle v_{1}=\lambda _{2}v_{2}+\lambda _{3}v_{3}+\dots +\lambda _{n}v_{n}}$

We can transform this equation by adding ${\displaystyle -v_{1}}$ on both sides (${\displaystyle 0_{V}}$ is the zero vector of the vector space ${\displaystyle V}$):

${\displaystyle 0_{V}=-v_{1}+\lambda _{2}v_{2}+\lambda _{3}v_{3}+\dots +\lambda _{n}v_{n}}$

This is a so-called nontrivial linear combination of the zero vector. A nontrivial linear combination of the zero vector is a linear combination with the result ${\displaystyle 0_{V}}$ where at least one coefficient is not equal to ${\displaystyle 0}$. For ${\displaystyle \lambda _{1}=\lambda _{2}=\dots =\lambda _{n}=0}$ we would trivially have ${\displaystyle \lambda _{1}v_{1}+\lambda _{2}v_{2}+\lambda _{3}v_{3}+\dots +\lambda _{n}v_{n}=0_{V}}$. This is the so-called trivial linear combination of the zero vector, where all coefficients are equal to ${\displaystyle 0}$. You can always form this trivial linear combination, no matter which vectors ${\displaystyle v_{1}}$ to ${\displaystyle v_{n}}$ you choose. So it does not carry information. If ${\displaystyle v_{1}}$ to ${\displaystyle v_{n}}$ are dependent, there is at least one non-trivial linear combination of the zero vector (as we saw above) in addition to the trivial linear combination. So:

If ${\displaystyle v_{1}}$ to ${\displaystyle v_{n}}$ are linearly dependent, then the zero vector can be represented by at least one non-trivial linear combination of ${\displaystyle v_{1}}$ to ${\displaystyle v_{n}}$.

In other words:

${\displaystyle v_{1},\dots ,v_{n}}$ linearly dependent ${\displaystyle \implies }$ There exists a non-trivial linear combination of ${\displaystyle 0_{V}}$ using ${\displaystyle v_{1},\dots ,v_{n}}$

Now we can apply the principle of contraposition: ${\displaystyle A\implies B}$ holds if and only if ${\displaystyle \neg B\implies \neg A}$. So:

There is no non-trivial linear combination of ${\displaystyle 0_{V}}$ using ${\displaystyle v_{1},\dots ,v_{n}\implies v_{1},\dots ,v_{n}}$ are linearly independent

With this we have found a criterion for linear independence. If the zero vector can only be represented trivially by a linear combination of ${\displaystyle v_{1}}$ to ${\displaystyle v_{n}}$, then these vectors are linearly independent. However, this criterion can also be used as a definition of linear independence. To do this, we need to show the converse direction of the above implication. If there is a non-trivial linear combination of the zero vector, then the vectors under consideration are linearly dependent.

So let ${\displaystyle v_{1}}$ to ${\displaystyle v_{n}}$ be vectors for which there exists a non-trivial linear combination of the zero vector. This means, there are coefficients (scalars) ${\displaystyle \lambda _{1}}$ to ${\displaystyle \lambda _{n}}$, such that ${\displaystyle \lambda _{1}v_{1}+\dots +\lambda _{n}v_{n}=0_{V}}$ where at least one of the coefficients ${\displaystyle \lambda _{1}}$ to ${\displaystyle \lambda _{n}}$ is not ${\displaystyle 0}$. Let ${\displaystyle \lambda _{i}}$ be this coefficient. Then

${\displaystyle 0_{V}=\lambda _{1}v_{1}+\lambda _{2}v_{2}+\dots +\lambda _{i}v_{i}+\dots +\lambda _{n}v_{n}}$

Since ${\displaystyle \lambda _{i}\neq 0}$ we can multiply both sides by ${\displaystyle -\lambda _{i}^{-1}=-{\tfrac {1}{\lambda _{i}}}}$ . Then,

${\displaystyle 0_{V}=-{\frac {\lambda _{1}}{\lambda _{i}}}v_{1}-{\frac {\lambda _{2}}{\lambda _{i}}}v_{2}-\dots -v_{i}-\dots -{\frac {\lambda _{n}}{\lambda _{i}}}v_{n}}$

On both sides we can now add ${\displaystyle v_{i}}$:

${\displaystyle v_{i}=-{\frac {\lambda _{1}}{\lambda _{i}}}v_{1}-{\frac {\lambda _{2}}{\lambda _{i}}}v_{2}-\dots -{\frac {\lambda _{i-1}}{\lambda _{i}}}v_{i-1}-{\frac {\lambda _{i+1}}{\lambda _{i}}}v_{i+1}-\dots -{\frac {\lambda _{n}}{\lambda _{i}}}v_{n}}$

Thus ${\displaystyle v_{i}}$ can be represented as a linear combination of the other vectors and hence the vectors ${\displaystyle v_{1}}$ to ${\displaystyle v_{n}}$ are linearly dependent. This proves that the following definition of linear independence is equivalent to the first one:

Definition (Second criterion for linear independence)

The vectors ${\displaystyle v_{1},\ldots ,v_{n}}$ are linearly independent if the only linear combination of them resulting in the zero vector is the trivial linear combination, i.e. if we have ${\displaystyle \alpha _{1},\ldots ,\alpha _{n}\in K}$ with ${\displaystyle 0_{V}=\sum _{i=1}^{n}\alpha _{i}v_{i}}$, then ${\displaystyle \alpha _{i}=0}$ must hold for all ${\displaystyle i\in \{1,\ldots ,n\}}$.

If there is at least one non-trivial linear combination of the zero vector, the considered vectors are linearly dependent.
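The second criterion translates directly into a computation: writing the vectors as the columns of a matrix ${\displaystyle A}$, the zero vector has only the trivial representation exactly when ${\displaystyle Ax=0}$ has only the solution ${\displaystyle x=0}$, i.e. when the rank of ${\displaystyle A}$ equals the number of vectors. A small Python sketch of this check (numpy assumed, example vectors from ${\displaystyle \mathbb {R} ^{3}}$):

```python
import numpy as np

def linearly_independent(vectors):
    """Second criterion: A x = 0 has only the trivial solution
    iff rank(A) equals the number of column vectors."""
    A = np.column_stack(vectors)
    return np.linalg.matrix_rank(A) == len(vectors)

u1, u2, u3 = np.array([1, 0, 0]), np.array([1, 1, 0]), np.array([1, 1, 1])
print(linearly_independent([u1, u2, u3]))      # True: only the trivial combination gives 0
print(linearly_independent([u1, u2, u1 + u2])) # False: u1 + u2 - (u1 + u2) = 0 is non-trivial
```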

### Definition of a family

We have talked above about several vectors ${\displaystyle v_{1},\ldots ,v_{n}}$ being linearly independent. But what is this "collection" of vectors ${\displaystyle v_{1},\ldots ,v_{n}}$ from a mathematical point of view? We already know the notion of a set. So it is obvious to understand ${\displaystyle M=\{v_{1},\ldots ,v_{n}\}}$ also as a set. Does this view intuitively fit linear independence? It turns out to be problematic if we have two equal vectors ${\displaystyle v,v}$ with ${\displaystyle v\neq 0}$. Both point in the same direction and do not span two independent directions. Thus they are intuitively linearly dependent. And indeed, one can be written as a linear combination of the other as ${\displaystyle v=1\cdot v}$. Thus the vectors ${\displaystyle v,v}$ are also strictly mathematically linearly dependent. However, a set may only contain distinct elements. That is, the set containing ${\displaystyle v}$ and ${\displaystyle v}$ is ${\displaystyle M=\{v,v\}=\{v\}}$. So the set ${\displaystyle M}$ contains only one element and does not capture duplications of vectors.

So we need a new mathematical term that also captures duplications. This is the concept of family:

Definition (family)

A family ${\displaystyle (a_{i})_{i\in I}}$ of elements from a set ${\displaystyle A}$ consists of an index set ${\displaystyle I}$, such that every index ${\displaystyle i\in I}$ gets assigned an element ${\displaystyle a_{i}\in A}$.

If ${\displaystyle I}$ is a finite set, we call the family a finite family.

If ${\displaystyle I\subseteq J}$, then one calls ${\displaystyle (a_{i})_{i\in I}}$ a sub-family of ${\displaystyle (a_{i})_{i\in J}}$. Conversely, ${\displaystyle (a_{i})_{i\in J}}$ is then called a super-family of ${\displaystyle (a_{i})_{i\in I}}$.

Formally, a family can be seen as a mapping of the index set ${\displaystyle I}$ into the set ${\displaystyle A}$. In contrast to sets, elements may occur more than once in families, namely if they belong to different indices.

If the set ${\displaystyle I}$ is countable, the elements of the family can be numbered: ${\displaystyle (a_{1},a_{2},\ldots )}$. However, the index set ${\displaystyle I}$ may also be uncountable, e.g. ${\displaystyle I=\mathbb {R} }$. In this case ${\displaystyle (a_{i})_{i\in \mathbb {R} }}$ cannot be written as a sequence ${\displaystyle (a_{1},a_{2},\ldots )}$. The notion of a family thus covers all sequences and even larger "collections" of mathematical objects.

So when we say the vectors ${\displaystyle v}$ and ${\displaystyle v}$ are linearly dependent we can express it by saying that the family ${\displaystyle (v_{i})_{i\in \lbrace 1,2\rbrace }}$ with ${\displaystyle v_{1}=v_{2}=v}$ is linearly dependent.
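This difference between sets and families can also be mirrored in a short Python sketch (purely illustrative, with a tuple standing in for a vector): an indexed collection keeps the duplicate, while a set silently collapses it.

```python
# a "family" keeps duplicate entries because each index has its own element,
# while a set collapses them into a single element
v = (1.0, 2.0)            # some vector, written as a tuple so it is hashable

family = {1: v, 2: v}     # index set I = {1, 2} with v_1 = v_2 = v
as_set = {v, v}           # the set {v, v} = {v}

print(len(family))  # 2 -- the family really contains the vector twice
print(len(as_set))  # 1 -- the set cannot express the duplication
```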

Often one writes (with slight abuse of notation) ${\displaystyle (a_{i})\subseteq A}$ if the ${\displaystyle a_{i}}$ are elements of ${\displaystyle A}$ and it is clear from the context what the index set ${\displaystyle I}$ looks like. Similarly, ${\displaystyle a\in (a_{i})}$ means that there is an ${\displaystyle i\in I}$ with ${\displaystyle a_{i}=a}$.

With this we can rewrite the second definition of linear independence:

Definition (Second criterion for linear independence, new version)

The family ${\displaystyle (v_{1},\ldots ,v_{n})}$ of vectors is linearly independent if the only linear combination representing the zero vector is the trivial linear combination, i.e. if ${\displaystyle \alpha _{1},\ldots ,\alpha _{n}\in K}$ with ${\displaystyle 0_{V}=\sum _{i=1}^{n}\alpha _{i}v_{i}}$, then ${\displaystyle \alpha _{i}=0}$ for all ${\displaystyle i\in \{1,\ldots ,n\}}$.

## General definition of linear independence

### Motivation

We have learned above two definitions for the fact that finitely many vectors ${\displaystyle v_{1},\dots ,v_{n}\in V}$ are linearly independent:

1. A somewhat unwieldy one: the vectors are independent if no vector ${\displaystyle v_{k}}$ can be written as a linear combination of the others. So ${\displaystyle v_{k}=\lambda _{1}v_{1}+\ldots +\lambda _{k-1}v_{k-1}+\lambda _{k+1}v_{k+1}+\ldots +\lambda _{n}v_{n}}$ must not occur.
2. A somewhat more compact one: The zero vector ${\displaystyle 0_{V}}$ can only be represented as a trivial linear combination. So ${\displaystyle 0_{V}=\lambda _{1}v_{1}+\ldots +\lambda _{n}v_{n}}$ implies ${\displaystyle \lambda _{1}=\ldots =\lambda _{n}=0}$.

So far we have only considered finitely many vectors. What happens with infinitely many vectors? Can there even be an infinite number of linearly independent vectors? We would need a vector space that has infinitely many independent directions. We know intuitively that the vector space ${\displaystyle \mathbb {R} ^{2}}$ has at most two and ${\displaystyle \mathbb {R} ^{3}}$ at most three independent directions. So we need a much "bigger" vector space to get infinitely many independent directions. We therefore consider a vector space ${\displaystyle V}$ where every vector has infinitely many coordinates: ${\displaystyle v=(x_{1},x_{2},\ldots )}$ with ${\displaystyle x_{1},x_{2},\ldots \in \mathbb {R} }$. Accordingly, ${\displaystyle v}$ corresponds to a real sequence ${\displaystyle (x_{i})_{i\in \mathbb {N} }}$ and ${\displaystyle V}$ is the vector space of sequences, or sequence space.

In ${\displaystyle \mathbb {R} ^{d}}$ we have the linearly independent unit vectors ${\displaystyle (1,0,\dots ,0),(0,1,\dots ,0),\dots ,(0,\dots ,0,1)}$. We can continue this construction and obtain for ${\displaystyle i\in \mathbb {N} }$ the vectors ${\displaystyle e_{i}=(0,\dots ,0,1,0,\ldots )}$ with the ${\displaystyle 1}$ at the ${\displaystyle i}$-th place and otherwise ${\displaystyle 0}$.

The infinitely many vectors ${\displaystyle e_{1},e_{2},\ldots }$ form a family ${\displaystyle (e_{i})_{i\in \mathbb {N} }}$. This family intuitively represents "infinitely many different directions" in ${\displaystyle V}$ and is thus intuitively linearly independent. So it makes sense to define linear independence for infinitely many vectors in such a way that ${\displaystyle (e_{i})_{i\in \mathbb {N} }}$ is a linearly independent family. The "somewhat unwieldy definition 1." above would be suitable for this in principle: We could simply copy it and say "a family of vectors ${\displaystyle (v_{i})_{i\in \mathbb {N} }}$ is linearly independent if no ${\displaystyle v_{i}}$ can be written as a linear combination of the others". In fact, in ${\displaystyle (e_{i})_{i\in \mathbb {N} }}$ none of the ${\displaystyle e_{i}}$ can be written as a linear combination of the other vectors. Therefore, the definition already makes sense at this point. However, there are infinitely many ${\displaystyle e_{i}}$ and thus infinitely many conditions!

We prefer to consider the "somewhat more compact definition 2.": "Vectors ${\displaystyle (v_{i})_{i\in \mathbb {N} }}$ are linearly independent if ${\displaystyle 0_{V}}$ can only be represented by the trivial linear combination." What does this formulation mean explicitly in this example? We are given a linear combination of ${\displaystyle 0_{V}\in V}$. Linear combinations are finite, that is, we have finitely many vectors ${\displaystyle e_{i_{1}},\dots ,e_{i_{n}}}$ and ${\displaystyle \lambda _{1},\dots ,\lambda _{n}\in \mathbb {R} }$ such that

${\displaystyle 0_{V}=\lambda _{1}e_{i_{1}}+\dots +\lambda _{n}e_{i_{n}}.}$

We now have to show that all ${\displaystyle \lambda _{k}=0}$, since then the above linear combination of ${\displaystyle 0_{V}\in V}$ is trivial. This works exactly as in ${\displaystyle \mathbb {R} ^{d}}$: comparing the entries at the positions ${\displaystyle i_{1},\dots ,i_{n}}$ immediately gives ${\displaystyle \lambda _{1}=\dots =\lambda _{n}=0}$; the only difference is that the vectors now have infinitely many entries.
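To make this entry-comparison concrete, here is a small illustrative Python sketch (representing sequences as functions of the position is just one possible modelling choice): evaluating a finite linear combination of unit sequences at the positions ${\displaystyle i_{1},\dots ,i_{n}}$ returns exactly the coefficients, so the combination is the zero sequence only if all coefficients vanish.

```python
def e(i):
    """Unit sequence e_i: 1 at position i, 0 everywhere else."""
    return lambda k: 1.0 if k == i else 0.0

def combination(coefficients):
    """Finite linear combination sum_i coefficients[i] * e_i, again as a sequence."""
    return lambda k: sum(lam * e(i)(k) for i, lam in coefficients.items())

# reading off entry i of the combination gives exactly the coefficient of e_i,
# so the combination can only be the zero sequence if every coefficient is zero
coeffs = {1: 2.0, 4: -3.0, 7: 0.5}
s = combination(coeffs)
print([s(i) for i in coeffs])   # [2.0, -3.0, 0.5] -- the coefficients themselves
```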

What do we have to do now to get a general definition for general families and general vector spaces? The "somewhat more compact definition 2." carries over almost literally: "A family ${\displaystyle (v_{i})_{i\in I}}$ of vectors is linearly independent if ${\displaystyle 0_{V}}$ can only be represented by the trivial linear combination." For the written out implication, we can make use of our language of families: We replace the double indices by the word "sub-family".

### Definition

Definition (Linear dependence and independence of vectors)

Let ${\displaystyle K}$ be a field, ${\displaystyle V}$ a ${\displaystyle K}$-vector space and ${\displaystyle (v_{i})_{i\in I}\subseteq V}$ a family of vectors from ${\displaystyle V}$.

${\displaystyle (v_{i})_{i\in I}}$ is called linearly independent, if for every finite sub-family ${\displaystyle (v_{j})_{j\in J}}$ and all ${\displaystyle \lambda _{j}\in K}$ with ${\displaystyle j\in J}$ the following holds:

${\displaystyle \sum _{j\in J}\lambda _{j}v_{j}=0_{V}\implies \lambda _{j}=0{\text{ for all }}j\in J}$

A family ${\displaystyle (v_{i})_{i\in I}\subseteq V}$ is called linearly dependent, if it is not linearly independent.

Warning

Linear combinations of elements of a set always consist of finitely many summands, even if the set is infinite.

E.g. the family ${\displaystyle (e^{x},1,x,x^{2},\ldots )}$ is linearly independent in the vector space ${\displaystyle \operatorname {Fun} (\mathbb {R} ,\mathbb {R} )}$, although ${\displaystyle e^{x}=\sum _{k=0}^{\infty }{\dfrac {x^{k}}{k!}}}$. This is because the exponential function is not a finite linear combination of the monomials.

Hint

You may often find the term "linearly independent set" instead of "linearly independent family". We have already considered above that it makes more sense to use families here, because unlike sets, families also cover duplications of elements. From every set ${\displaystyle M}$ one can construct the family ${\displaystyle (m)_{m\in M}}$. Thus the term "linearly (in)dependent" carries over to sets. Linearly independent families do not contain a vector twice, and families without duplicate elements correspond to sets via the above construction. If we want to have a family of linearly independent vectors (e.g. in the assumptions of a theorem), we can therefore also ask for a set of linearly independent vectors. However, if we want to test whether a family of vectors is linearly independent, we must not first convert it into a set: duplicates would disappear there, even though they cause linear dependence of the family.

Hint

The definition of linear (in)dependence refers to families of vectors of a vector space. These may contain vectors several times and may even be uncountable.

For finite families, we alternatively talk about the elements, i.e. the statement "The family ${\displaystyle (v_{1},v_{2},\ldots v_{n})}$ is linearly (in)dependent" becomes "${\displaystyle v_{1},v_{2},\ldots v_{n}}$ are linearly (in)dependent".

## Implications of the definition

### Re-formulating the definition for finite sub-families

We have a definition of linear independence for arbitrary subfamilies of a vector space ${\displaystyle V}$. Does this agree with our old definition for finite subfamilies? Intuitively, they should agree for finite subfamilies, since we derived the general definition from our old definition. The following theorem actually proves this:

Theorem (Linear independence for finitely many vectors)

1. The vectors ${\displaystyle v_{1},\ldots ,v_{n}\in V}$ are linearly independent if and only if ${\displaystyle \lambda _{1}v_{1}+\cdots +\lambda _{n}v_{n}=0_{V}}$ with ${\displaystyle \lambda _{1},\ldots ,\lambda _{n}\in K}$ implies ${\displaystyle \lambda _{1}=\cdots =\lambda _{n}=0}$.
2. The vectors ${\displaystyle v_{1},\ldots ,v_{n}\in V}$ are linearly dependent if and only if there are ${\displaystyle \lambda _{1},\ldots ,\lambda _{n}\in K}$, not all equal to ${\displaystyle 0}$, such that ${\displaystyle \lambda _{1}v_{1}+\cdots +\lambda _{n}v_{n}=0_{V}}$.

Proof (Linear independence for finitely many vectors)

We first prove the first statement. We have to establish an equivalence.

Let ${\displaystyle (v_{1},\ldots ,v_{n})}$ be linearly independent. By the definition of linear independence we obtain that for every finite sub-family ${\displaystyle (v_{j})_{j\in J}}$ of ${\displaystyle (v_{1},\ldots ,v_{n})}$ and for all scalars ${\displaystyle \lambda _{j}\in K}$ with ${\displaystyle j\in J}$ we have:

${\displaystyle \sum _{j\in J}\lambda _{j}\cdot v_{j}=0_{V}\ \implies \ \lambda _{j}=0{\text{ for all }}j\in J}$

${\displaystyle (v_{1},\ldots ,v_{n})}$ is a finite sub-family of itself. Therefore for all ${\displaystyle \lambda _{1},\ldots ,\lambda _{n}\in K}$ from ${\displaystyle \lambda _{1}v_{1}+\cdots +\lambda _{n}v_{n}=0_{V}}$, we get that ${\displaystyle \lambda _{i}=0}$ for all ${\displaystyle i\in \{1,\ldots ,n\}}$.

Conversely, assume that for all ${\displaystyle \lambda _{1},\ldots ,\lambda _{n}\in K}$ from ${\displaystyle \lambda _{1}v_{1}+\cdots +\lambda _{n}v_{n}=0_{V}}$ it follows that ${\displaystyle \lambda _{1}=\cdots =\lambda _{n}=0}$. We would like to show that ${\displaystyle (v_{1},\ldots ,v_{n})}$ is linearly independent. So let ${\displaystyle (v_{j})_{j\in J}}$ be a finite sub-family of ${\displaystyle (v_{1},\ldots ,v_{n})}$. That means ${\displaystyle J\subseteq \{1,\ldots ,n\}}$. Let ${\displaystyle \lambda _{j}\in K}$ with ${\displaystyle j\in J}$ be scalars with

${\displaystyle \sum _{j\in J}\lambda _{j}\cdot v_{j}=0_{V}.}$

We extend this sum so that it covers all ${\displaystyle i\in \{1,\ldots ,n\}}$. This is done by defining ${\displaystyle \lambda _{i}:=0}$ for all ${\displaystyle i\in \{1,\ldots ,n\}\setminus J}$. Then

{\displaystyle {\begin{aligned}0_{V}&=\sum _{j\in J}\lambda _{j}\cdot v_{j}\\[0.3em]&\ {\color {OliveGreen}\left\downarrow \ {\text{add }}0_{V}\right.}\\[0.3em]&=\sum _{j\in J}\lambda _{j}\cdot v_{j}+0_{V}\\[0.3em]&\ {\color {OliveGreen}\left\downarrow \ 0_{V}=\sum _{i\in \{1,\ldots ,n\}\setminus J}0\cdot v_{i}\right.}\\[0.3em]&=\sum _{j\in J}\lambda _{j}\cdot v_{j}+\sum _{i\in \{1,\ldots ,n\}\setminus J}0\cdot v_{i}\\[0.3em]&\ {\color {OliveGreen}\left\downarrow \ 0=\lambda _{i}\right.}\\[0.3em]&=\sum _{j\in J}\lambda _{j}\cdot v_{j}+\sum _{i\in \{1,\ldots ,n\}\setminus J}\lambda _{i}\cdot v_{i}\\[0.3em]&=\sum _{k\in \{1,\ldots ,n\}}\lambda _{k}\cdot v_{k}.\end{aligned}}}

It follows from our premise that ${\displaystyle \lambda _{1}=\ldots =\lambda _{n}=0}$ and hence ${\displaystyle \lambda _{j}=0}$ for all ${\displaystyle j\in J}$. So ${\displaystyle (v_{1},\ldots ,v_{n})}$ is linearly independent.

The second statement is exactly the logical contraposition of the first. For we have shown ${\displaystyle A\iff B}$ with the two statements

${\displaystyle A=}$"${\displaystyle (v_{1},\ldots ,v_{n})}$ is linearly independent"

${\displaystyle B=}$ "${\displaystyle \forall \lambda _{1},\ldots ,\lambda _{n}\in K:\lambda _{1}v_{1}+\cdots +\lambda _{n}v_{n}=0_{V}\implies \lambda _{1}=\cdots =\lambda _{n}=0}$"

The second point is the statement ${\displaystyle \neg A\iff \neg B}$. But this is equivalent to ${\displaystyle A\iff B}$ and thus equivalent to the first statement.

### Reducing the definition to finite sub-families

We have defined linear independence for any family ${\displaystyle (v_{i})_{i\in I}}$ of vectors, so also for infinitely many vectors. But in the definition we only need to show a statement for finite subfamilies ${\displaystyle (v_{j})_{j\in J}}$: For all ${\displaystyle \lambda _{j}\in K}$ with ${\displaystyle j\in J}$ we need the following:

${\displaystyle \sum _{j\in J}\lambda _{j}v_{j}=0_{V}\implies \lambda _{j}=0{\text{ for all }}j\in J}$

In the previous theorem we have seen that this statement is exactly linear independence of ${\displaystyle (v_{j})_{j\in J}}$.

Theorem (Criterion via finite sub-families)

1. A family ${\displaystyle (v_{i})_{i\in I}\subseteq V}$ is linearly independent if and only if every finite sub-family ${\displaystyle (v_{j})_{j\in J}}$ is linearly independent.
2. A family ${\displaystyle (v_{i})_{i\in I}\subseteq V}$ is linearly dependent if and only if it contains a finite linearly dependent sub-family ${\displaystyle (v_{j})_{j\in J}}$.

Proof (Criterion via finite sub-families)

First we prove the first statement. We have to establish an equivalence. Let ${\displaystyle (v_{i})_{i\in I}}$ be a linearly independent family of vectors from ${\displaystyle V}$. We show that every finite sub-family of ${\displaystyle (v_{i})_{i\in I}}$ is linearly independent.

For this, let ${\displaystyle (v_{j})_{j\in J}}$ be a finite sub-family of ${\displaystyle (v_{i})_{i\in I}}$. From our definition of linear independence it follows that for all scalars ${\displaystyle \lambda _{j}\in K}$ with ${\displaystyle j\in J}$ the following holds:

${\displaystyle \sum _{j\in J}\lambda _{j}\cdot v_{j}=0_{V}\ \implies \ \lambda _{j}=0{\text{ for all }}j\in J}$

Using the previous theorem, we get that ${\displaystyle (v_{j})_{j\in J}}$ is linearly independent.

Conversely, let every finite sub-family of ${\displaystyle (v_{i})_{i\in I}}$ be linearly independent. We show that ${\displaystyle (v_{i})_{i\in I}}$ is linearly independent. For this, let ${\displaystyle (v_{j})_{j\in J}}$ be a finite sub-family of ${\displaystyle (v_{i})_{i\in I}}$. We want to show that for all scalars ${\displaystyle \lambda _{j}\in K}$ with ${\displaystyle j\in J}$ the following holds:

${\displaystyle \sum _{j\in J}\lambda _{j}\cdot v_{j}=0_{V}\ \implies \ \lambda _{j}=0{\text{ for all }}j\in J}$

According to our premise, ${\displaystyle (v_{j})_{j\in J}}$ is linearly independent. So it follows again with the previous theorem that for all scalars ${\displaystyle \lambda _{j}\in K}$ with ${\displaystyle j\in J}$

${\displaystyle \sum _{j\in J}\lambda _{j}\cdot v_{j}=0_{V}\ \implies \ \lambda _{j}=0{\text{ for all }}j\in J}$

holds.

The second statement is exactly the logical contraposition of the first. For we have shown ${\displaystyle A\iff B}$ with the two statements

${\displaystyle A=}$"${\displaystyle (v_{i})_{i\in I}}$ is linearly independent"

${\displaystyle B=}$ "every finite sub-family of ${\displaystyle (v_{i})_{i\in I}}$ is linearly independent"

The second point is the statement ${\displaystyle \neg A\iff \neg B}$. But this is equivalent to ${\displaystyle A\iff B}$ and thus equivalent to the first statement.

### Overview

The following properties can be derived from the definition of linear independence with a few proof steps. Let ${\displaystyle K}$ be a field and ${\displaystyle V}$ a ${\displaystyle K}$-vector space:

1. Every sub-family of a family of linearly independent vectors is linearly independent. Conversely, every super-family of a family of linearly dependent vectors is again linearly dependent.
2. Let ${\displaystyle v\in V}$ be a single vector. Then ${\displaystyle v}$ is linearly independent if and only if ${\displaystyle v\neq 0_{V}}$. So "almost always". Conversely, every family (no matter how large) is linearly dependent as soon as it contains the zero vector.
3. Let ${\displaystyle v,\,w\in V}$. The vectors ${\displaystyle v}$ and ${\displaystyle w}$ are linearly dependent if and only if there is a ${\displaystyle \lambda \in K}$ with the property ${\displaystyle w=\lambda \cdot v}$ or ${\displaystyle v=\lambda \cdot w}$.
4. If a family of vectors is linearly dependent, one of them can be represented as a linear combination of the others.

### Sub-families of linearly independent vectors are linearly independent

A linearly independent family remains linearly independent if you take vectors away. Linear dependence, on the other hand, is preserved if you add more vectors. Intuitively, adding vectors tends to "destroy" linear independence, and once lost, independence cannot be restored by adding further vectors.

Theorem

1. Every sub-family of a family of linearly independent vectors is again linearly independent.
2. Every super-family of a family of linearly dependent vectors is again linearly dependent.

Proof

We start with the first statement. Let ${\displaystyle B\subseteq V}$ be a family of linearly independent vectors from ${\displaystyle V}$ and ${\displaystyle A\subseteq B}$ any sub-family of ${\displaystyle B}$. Let ${\displaystyle v_{1},\dots ,v_{k}\in A}$ and ${\displaystyle \lambda _{1},\dots ,\lambda _{k}\in K}$ with

${\displaystyle 0_{V}=\lambda _{1}v_{1}+\dots +\lambda _{k}v_{k}.}$

Since ${\displaystyle A\subseteq B}$ , the vectors ${\displaystyle v_{1},\dots ,v_{k}}$ are also in ${\displaystyle B}$. And as ${\displaystyle B}$ is linearly independent, we have that ${\displaystyle \lambda _{1}=\lambda _{2}=\ldots =\lambda _{k}=0}$. So ${\displaystyle A}$ is linearly independent.

From this we deduce the second statement. Let ${\displaystyle A\subseteq V}$ be a family of linearly dependent vectors from ${\displaystyle V}$ and ${\displaystyle B}$ any super-family of ${\displaystyle A}$. Assume that ${\displaystyle B}$ is linearly independent. Then, by the first statement, ${\displaystyle A}$, as a sub-family of ${\displaystyle B}$, would also be linearly independent. But this is a contradiction, because ${\displaystyle A}$ is linearly dependent.

### Families including the zero vector are linearly dependent

When is a family with exactly one vector linearly independent? This question is easy to answer: whenever this vector is not the zero vector. Conversely, every family containing the zero vector is linearly dependent, including the family that consists of the zero vector alone.

Theorem (Families including the zero vector are linearly dependent)

1. The zero vector is linearly dependent.
2. If ${\displaystyle v\in V}$ is linearly dependent, then ${\displaystyle v=0_{V}}$.
3. A family of vectors containing the zero vector is always linearly dependent.

Proof (Families including the zero vector are linearly dependent)

1. We have that ${\displaystyle 1_{K}\cdot 0_{V}=0_{V}}$. There is therefore a non-trivial linear combination of the vector ${\displaystyle 0_{V}}$ which has ${\displaystyle 0_{V}}$ as a result. Hence, ${\displaystyle 0_{V}}$ is linearly dependent.
2. If ${\displaystyle v\in V}$ is linearly dependent, then there is a ${\displaystyle \lambda \in K}$ with ${\displaystyle \lambda \neq 0}$ and ${\displaystyle \lambda \cdot v=0_{V}}$. Since ${\displaystyle \lambda \neq 0}$, there is a multiplicative inverse ${\displaystyle \lambda ^{-1}\in K}$. Multiplying the equation ${\displaystyle \lambda \cdot v=0_{V}}$ by ${\displaystyle \lambda ^{-1}}$ we get ${\displaystyle v=\lambda ^{-1}\lambda v=\lambda ^{-1}0_{V}=0_{V}}$. So ${\displaystyle v}$ must be the zero vector ${\displaystyle 0_{V}}$.
3. This assertion follows simply from 1. and the theorem about the linear dependence of superfamilies of linearly dependent families.

### Two vectors are linearly dependent if one is a stretched version of the other

When is a family with two vectors linearly independent? We can answer the question by saying when the opposite is the case. So when are two vectors linearly dependent? Linear dependence of two vectors holds if and only if both "lie on a straight line", i.e. one vector is a stretched version of the other.

Theorem

Let ${\displaystyle v,\,w\in V}$. The vectors ${\displaystyle v}$ and ${\displaystyle w}$ are linearly dependent if and only if there is a ${\displaystyle \lambda \in K}$ with ${\displaystyle w=\lambda \cdot v}$ or ${\displaystyle v=\lambda \cdot w}$.

Proof

We need to prove the following two implications:

1. ${\displaystyle \exists \ \lambda \in K:\ w=\lambda \cdot v}$ or ${\displaystyle v=\lambda \cdot w\implies v}$ and ${\displaystyle w}$ are linearly dependent
2. ${\displaystyle v}$ and ${\displaystyle w}$ are linearly dependent ${\displaystyle \implies \exists \ \lambda \in K:\ w=\lambda \cdot v}$ or ${\displaystyle v=\lambda \cdot w}$

Proof step: First implication

If one of the two vectors is the zero vector, then according to the previous theorem, ${\displaystyle v}$ and ${\displaystyle w}$ are linearly dependent. So let ${\displaystyle v\neq 0_{V}}$ and ${\displaystyle w\neq 0_{V}}$. Further, let ${\displaystyle \lambda \in K}$ be chosen such that ${\displaystyle w=\lambda \cdot v}$. This is w.l.o.g. possible, because if it cannot be done, we swap the labels of the two vectors. So we use ${\displaystyle v}$ instead of ${\displaystyle w}$ and ${\displaystyle w}$ instead of ${\displaystyle v}$. According to the premise, there must exist a ${\displaystyle \lambda \in K}$, such that the equation with the new labels holds.

Now we have that ${\displaystyle 1\cdot w+(-\lambda )\cdot v=0_{V}}$. So we have the zero vector represented as a non-trivial linear combination. This means that ${\displaystyle v}$ and ${\displaystyle w}$ are linearly dependent.

Proof step: Second implication

Let ${\displaystyle v}$ and ${\displaystyle w}$ be linearly dependent. Then, by definition, there is a non-trivial linear combination of the zero vector. Thus there exist ${\displaystyle \alpha ,\beta \in K}$, not both zero, such that the equation ${\displaystyle \alpha \cdot v+\beta \cdot w=0_{V}}$ holds. We consider the case ${\displaystyle \beta \neq 0_{K}}$. Then, from the equation ${\displaystyle \beta \cdot w=-\alpha \cdot v}$ we conclude

{\displaystyle {\begin{aligned}\beta \cdot w&=-\alpha \cdot v&&\\[0.3em]&\ {\color {OliveGreen}\downarrow {\text{multiply both sides with }}\beta ^{-1}}\\[0.3em]w&=\,\beta ^{-1}\cdot (-\alpha \cdot v)&&\\[0.3em]&\ {\color {OliveGreen}\downarrow {\text{associative law of scalar multiplication}}}\\[0.3em]&=\,-(\beta ^{-1}\alpha )\cdot v&&\\[0.3em]&\ {\color {OliveGreen}\downarrow {\text{set }}{\lambda \colon =-(\beta ^{-1}\alpha )}}\\[0.3em]&=\,\lambda \cdot v\end{aligned}}}

However, if ${\displaystyle \beta =0_{K}}$ , then we need to have ${\displaystyle \alpha \neq 0_{K}}$. Analogously to the calculation above you can then get ${\displaystyle v=\lambda \cdot w}$ with ${\displaystyle \lambda =-(\alpha ^{-1}\beta )}$.
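For concrete vectors in ${\displaystyle \mathbb {R} ^{m}}$, this characterization is easy to test: ${\displaystyle v}$ and ${\displaystyle w}$ are linearly dependent exactly when the matrix with columns ${\displaystyle v,w}$ has rank at most ${\displaystyle 1}$. A quick sketch of such a check in Python (numpy assumed):

```python
import numpy as np

def dependent_pair(v, w):
    """v, w linearly dependent  <=>  one is a multiple of the other
    <=>  the matrix with columns v, w has rank at most 1."""
    return np.linalg.matrix_rank(np.column_stack([v, w])) <= 1

print(dependent_pair(np.array([1.0, 2.0]), np.array([2.0, 4.0])))  # True:  w = 2*v
print(dependent_pair(np.array([1.0, 2.0]), np.array([0.0, 0.0])))  # True:  w = 0*v
print(dependent_pair(np.array([1.0, 2.0]), np.array([2.0, 1.0])))  # False: no multiple works
```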

### With linear dependence, one vector is a linear combination of the others

For finitely many vectors, we started with the definition that vectors are linearly dependent if one of the vectors can be written as a linear combination of the others (first definition). We have already seen that this definition is equivalent to the zero vector being representable as a non-trivial linear combination of the vectors (second definition). For the general definition with possibly infinitely many vectors, we have used the version with the zero vector (the second) as our definition. And one can indeed show that even in the general case the first definition is equivalent to it:

Theorem

Let ${\displaystyle V}$ be a ${\displaystyle K}$-vector space and let ${\displaystyle v_{1},v_{2},\ldots ,v_{n},w\in V}$ be linearly dependent vectors such that ${\displaystyle (v_{1},v_{2},\ldots ,v_{n})}$ is linearly independent. Then, there exist ${\displaystyle \lambda _{1},\lambda _{2},\ldots ,\lambda _{n}\in K}$ such that ${\displaystyle w=\sum _{i=1}^{n}\lambda _{i}v_{i}}$.

How to get to the proof?

Because of the linear dependence of ${\displaystyle v_{1},v_{2},\ldots ,v_{n},w}$ there exist ${\displaystyle \alpha _{1},\alpha _{2},\ldots ,\alpha _{n},\mu \in K}$, not all equal to ${\displaystyle 0}$, such that ${\displaystyle \sum _{i=1}^{n}\alpha _{i}v_{i}+\mu w=0}$. We want to write ${\displaystyle w}$ as a linear combination of the ${\displaystyle v_{i}}$. That means, we want to solve the equation ${\displaystyle \sum _{i=1}^{n}\alpha _{i}v_{i}+\mu w=0}$ for ${\displaystyle w}$. We can transform this equation into

${\displaystyle \mu w=-\left(\sum _{i=1}^{n}\alpha _{i}v_{i}\right).}$

Now we want to divide by ${\displaystyle \mu }$. This only works if ${\displaystyle \mu \neq 0}$. So we show that the case ${\displaystyle \mu =0}$ cannot occur. Suppose ${\displaystyle \mu =0}$. Then, we have

${\displaystyle 0_{V}=\sum _{i=1}^{n}\alpha _{i}v_{i}.}$

We know that ${\displaystyle v_{1},v_{2},\dots ,v_{n}}$ are linearly independent. So all ${\displaystyle \alpha _{i}}$ are equal to ${\displaystyle 0}$. Hence, all ${\displaystyle \alpha _{1},\alpha _{2},\ldots ,\alpha _{n},\mu }$ are equal to ${\displaystyle 0}$. That is a contradiction. Therefore ${\displaystyle \mu =0}$ cannot occur.

Proof

Since ${\displaystyle v_{1},v_{2},\ldots ,v_{n},w\in V}$ are linearly dependent, there exist ${\displaystyle \alpha _{1},\alpha _{2},\ldots ,\alpha _{n},\mu \in K}$ with ${\displaystyle \sum _{i=1}^{n}\alpha _{i}v_{i}+\mu w=0}$, where not all of ${\displaystyle \alpha _{1},\alpha _{2},\ldots ,\alpha _{n},\mu }$ are equal to ${\displaystyle 0}$. Hence,

${\displaystyle \mu w=-\left(\sum _{i=1}^{n}\alpha _{i}v_{i}\right)}$

We first show ${\displaystyle \mu \neq 0}$. Assume that ${\displaystyle \mu =0}$. Then, we would have

${\displaystyle 0_{V}=\sum _{i=1}^{n}\alpha _{i}v_{i}+0\cdot w=\sum _{i=1}^{n}\alpha _{i}v_{i}.}$

By linear independence of ${\displaystyle v_{i}}$ we have ${\displaystyle \alpha _{i}=0_{K}}$ for all ${\displaystyle i\in \{1,\dots ,n\}}$. But this is not possible, since ${\displaystyle \alpha _{i},\mu }$ are not all equal to 0. So we have ${\displaystyle \mu \neq 0}$. We can hence divide by ${\displaystyle \mu }$ and the linear combination we are looking for is

${\displaystyle w=-\left(\sum _{i=1}^{n}{\frac {\alpha _{i}}{\mu }}v_{i}\right)=\sum _{i=1}^{n}\left(-{\frac {\alpha _{i}}{\mu }}\right)v_{i}}$

Now set ${\displaystyle \lambda _{i}:=\left(-{\frac {\alpha _{i}}{\mu }}\right)}$. Then, ${\displaystyle w=\sum _{i=1}^{n}\lambda _{i}v_{i}}$, which is what we wanted to show.
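For concrete vectors one can also compute the coefficients ${\displaystyle \lambda _{i}}$ from the theorem by solving a linear system. A small Python sketch (numpy assumed; the vectors ${\displaystyle v_{1},v_{2}}$ and the dependent vector ${\displaystyle w}$ are made-up example data):

```python
import numpy as np

# independent vectors v1, v2 and a vector w that is a combination of them
v1, v2 = np.array([1.0, 0.0, 0.0]), np.array([0.0, 2.0, 1.0])
w = 3.0 * v1 - 0.5 * v2                    # so v1, v2, w are linearly dependent

A = np.column_stack([v1, v2])              # columns are the v_i
lambdas, residual, rank, _ = np.linalg.lstsq(A, w, rcond=None)

print(lambdas)                             # the coefficients lambda_1 = 3, lambda_2 = -0.5
print(np.allclose(A @ lambdas, w))         # True: w = 3*v1 - 0.5*v2
```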

## Linear independence and unique linear combinations

In this section, we will take a closer look at the connection between linear independence and linear combinations. To do this, we recall what it means that the vectors ${\displaystyle v_{1},\ldots ,v_{n}}$ are linearly dependent or independent. Suppose the vectors ${\displaystyle v_{1},\ldots ,v_{n}}$ are linearly dependent. From our definition of linear independence, we know that there must then be a non-trivial representation of zero, i.e. a linear combination of the zero vector in which at least one scalar ${\displaystyle \lambda _{i}\neq 0}$ for some ${\displaystyle 1\leq i\leq n}$. We illustrate this with the following example.

Example (Linear independence and non-trivial representation of 0)

Let us consider the vectors ${\displaystyle (1,0,0)^{T},\left(1,1,{\tfrac {1}{2}}\right)^{T},(0,2,1)^{T}\in \mathbb {R} ^{3}}$. These are linearly dependent, since

${\displaystyle {\begin{pmatrix}1\\1\\{\frac {1}{2}}\end{pmatrix}}=1\cdot {\begin{pmatrix}1\\0\\0\end{pmatrix}}+{\frac {1}{2}}\cdot {\begin{pmatrix}0\\2\\1\end{pmatrix}}}$

By transforming this equation we obtain a representation of the zero vector:

${\displaystyle {\begin{pmatrix}0\\0\\0\end{pmatrix}}=1\cdot {\begin{pmatrix}1\\0\\0\end{pmatrix}}-1\cdot {\begin{pmatrix}1\\1\\{\frac {1}{2}}\end{pmatrix}}+{\frac {1}{2}}\cdot {\begin{pmatrix}0\\2\\1\end{pmatrix}}}$

In addition to this representation, there is also the so-called trivial representation of the zero vector, in which every pre-factor is equal to zero:

${\displaystyle {\begin{pmatrix}0\\0\\0\end{pmatrix}}=0\cdot {\begin{pmatrix}1\\0\\0\end{pmatrix}}+0\cdot {\begin{pmatrix}1\\1\\{\frac {1}{2}}\end{pmatrix}}+0\cdot {\begin{pmatrix}0\\2\\1\end{pmatrix}}}$

Because of the linear dependence, the zero vector can be represented in two ways via a linear combination.

Regardless of whether the considered family of vectors is linearly independent or not, there is always the trivial zero representation, in which all scalars ${\displaystyle \lambda _{1},...,\lambda _{n}}$ have the value ${\displaystyle 0}$:

${\displaystyle 0=0\cdot v_{1}+...+0\cdot v_{n}}$

In case of linear dependence of the vectors, the representation of the zero vector is no longer unique. We can summarise our results so far in a theorem and generalise them:

Theorem (Linear independence and unique linear combination)

Let ${\displaystyle V}$ be a vector space and ${\displaystyle M\subseteq V}$.

All linear combinations of vectors from ${\displaystyle M}$ are unique ${\displaystyle \iff M}$ is linearly independent.

Proof (Linear independence and unique linear combination)

We show the contraposition:

There is a linear combination of vectors from ${\displaystyle M}$ that is not unique ${\displaystyle \iff M}$ is linearly dependent.

"${\displaystyle \implies }$" We assume there was a ${\displaystyle v\in V}$, such that at least two different representations of ${\displaystyle v}$ are possible using vectors from ${\displaystyle M}$:

Let ${\displaystyle v=\lambda _{1}v_{1}+\ldots +\lambda _{n}v_{n}}$ with ${\displaystyle \lambda _{1},...,\lambda _{n}\in K}$ and ${\displaystyle v=\mu _{1}v_{1}+\ldots +\mu _{n}v_{n}}$ with ${\displaystyle \mu _{1},...,\mu _{n}\in K}$ and ${\displaystyle v_{1},\ldots ,v_{n}\in M}$. Subtraction of the two equations gives

${\displaystyle 0_{V}=(\lambda _{1}-\mu _{1})\cdot v_{1}+\ldots +(\lambda _{n}-\mu _{n})\cdot v_{n}}$

Since the representations of ${\displaystyle v}$ are different, there is at least one factor ${\displaystyle \lambda _{i}-\mu _{i}\neq 0}$ for ${\displaystyle 1\leq i\leq n}$. Hence, the vectors ${\displaystyle v_{1},...,v_{n}}$ are linearly dependent by definition and thus ${\displaystyle M}$ is also linearly dependent.

"${\displaystyle \Longleftarrow }$" If ${\displaystyle M}$ is linearly independent, then ${\displaystyle M}$ contains a linearly independent subset ${\displaystyle v_{1}\ldots ,v_{n}}$.

Then, apart from the trivial representation of zero ${\displaystyle 0_{V}}$, there is at least one more: because of the linear dependence, there are factors ${\displaystyle \lambda _{i}}$ that are not all zero, with

${\displaystyle \lambda _{1}v_{1}+\ldots +\lambda _{n}v_{n}=0_{V}}$

So we have shown that there are then two representations of ${\displaystyle 0_{V}}$ as linear combinations of these vectors. Thus linear combinations are not unique.

## Exercises

### Exercise 1

Exercise (Linear independence)

Show that the vectors ${\displaystyle (1,1,0)^{T},\,(0,1,0)^{T},\,(0,1,1)^{T}\in \mathbb {R} ^{3}}$ are linearly independent.

Solution (Linear independence)

We have to show that the zero vector can only be represented trivially by the given vectors. This means that the following equation with the real numbers ${\displaystyle \rho _{1},\rho _{2},\rho _{3}\in \mathbb {R} }$ only has the solutions ${\displaystyle \rho _{1}=\rho _{2}=\rho _{3}=0}$:

${\displaystyle \rho _{1}\cdot {\begin{pmatrix}1\\1\\0\end{pmatrix}}+\rho _{2}\cdot {\begin{pmatrix}0\\1\\0\end{pmatrix}}+\rho _{3}\cdot {\begin{pmatrix}0\\1\\1\end{pmatrix}}={\begin{pmatrix}0\\0\\0\end{pmatrix}}}$

This implies:

{\displaystyle {\begin{aligned}&\rho _{1}\cdot {\begin{pmatrix}1\\1\\0\end{pmatrix}}+\rho _{2}\cdot {\begin{pmatrix}0\\1\\0\end{pmatrix}}+\rho _{3}\cdot {\begin{pmatrix}0\\1\\1\end{pmatrix}}={\begin{pmatrix}0\\0\\0\end{pmatrix}}\\[0.3em]&\quad {\color {OliveGreen}\left\downarrow \ {\text{scalar multiplication}}\right.}\\[0.3em]\implies &{\begin{pmatrix}\rho _{1}\\\rho _{1}\\0\end{pmatrix}}+{\begin{pmatrix}0\\\rho _{2}\\0\end{pmatrix}}+{\begin{pmatrix}0\\\rho _{3}\\\rho _{3}\end{pmatrix}}={\begin{pmatrix}0\\0\\0\end{pmatrix}}\\[0.3em]&\quad {\color {OliveGreen}\left\downarrow \ {\text{vector addition}}\right.}\\[0.3em]\implies &{\begin{pmatrix}\rho _{1}\\\rho _{1}+\rho _{2}+\rho _{3}\\\rho _{3}\end{pmatrix}}={\begin{pmatrix}0\\0\\0\end{pmatrix}}\end{aligned}}}

Now, two column vectors are equal if every component is equal. So the following equations must hold:

{\displaystyle {\begin{aligned}\rho _{1}&=0\\\rho _{1}+\rho _{2}+\rho _{3}&=0\\\rho _{3}&=0\end{aligned}}}

We hence have ${\displaystyle \rho _{1}=\rho _{3}=0}$. Plugging this into ${\displaystyle \rho _{1}+\rho _{2}+\rho _{3}=0}$ , we obtain ${\displaystyle \rho _{2}=0}$. With this we have shown that from the equation ${\displaystyle \rho _{1}\cdot (1,1,0)^{T}+\rho _{2}\cdot (0,1,0)^{T}+\rho _{3}\cdot (0,1,1)^{T}=(0,0,0)^{T}}$ we get that all coefficients ${\displaystyle \rho _{1}}$, ${\displaystyle \rho _{2}}$ and ${\displaystyle \rho _{3}}$ are equal to ${\displaystyle 0}$. Thus, the three vectors are linearly independent.
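The same conclusion can be double-checked by machine: the matrix with the three vectors as columns has full rank ${\displaystyle 3}$, so the homogeneous system has only the trivial solution. A quick Python sketch (numpy assumed):

```python
import numpy as np

A = np.column_stack([[1, 1, 0], [0, 1, 0], [0, 1, 1]])   # the three vectors as columns
print(np.linalg.matrix_rank(A))          # 3: full rank, so only the trivial solution exists
print(np.linalg.solve(A, np.zeros(3)))   # [0. 0. 0.]: rho_1 = rho_2 = rho_3 = 0
```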

### Exercise 2

Exercise (Linear dependence)

Show that the following set ${\displaystyle M}$ of four vectors is linearly dependent:

${\displaystyle M=\left\{{\begin{pmatrix}1\\0\\0\end{pmatrix}},{\begin{pmatrix}0\\1\\0\end{pmatrix}},{\begin{pmatrix}0\\0\\1\end{pmatrix}},{\begin{pmatrix}1\\1\\1\end{pmatrix}}\right\}}$

Solution (Linear dependence)

By definition, the vectors ${\displaystyle (1,0,0)^{T},(0,1,0)^{T},(0,0,1)^{T}}$ and ${\displaystyle (1,1,1)^{T}}$ are linearly dependent if and only if we can find a nontrivial linear combination of zero. Such a combination is for example given by

${\displaystyle {\begin{pmatrix}1\\0\\0\end{pmatrix}}+{\begin{pmatrix}0\\1\\0\end{pmatrix}}+{\begin{pmatrix}0\\0\\1\end{pmatrix}}-{\begin{pmatrix}1\\1\\1\end{pmatrix}}=0}$

Therefore the vectors are linearly dependent.

Solution (Linear dependence, alternative)

Vectors are linearly dependent if one of them can be represented as a linear combination of the other ones. Now the vector ${\displaystyle (1,1,1)^{T}}$ can be represented as a linear combination of the others:

${\displaystyle {\begin{pmatrix}1\\1\\1\end{pmatrix}}={\begin{pmatrix}1\\0\\0\end{pmatrix}}+{\begin{pmatrix}0\\1\\0\end{pmatrix}}+{\begin{pmatrix}0\\0\\1\end{pmatrix}}}$

Thus the vectors are linearly dependent.
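Numerically, the dependence is also visible from the rank: four vectors in ${\displaystyle \mathbb {R} ^{3}}$ can span at most a three-dimensional space. A quick Python sketch (numpy assumed) that also verifies the non-trivial combination from the solution:

```python
import numpy as np

vectors = [np.array([1, 0, 0]), np.array([0, 1, 0]),
           np.array([0, 0, 1]), np.array([1, 1, 1])]
A = np.column_stack(vectors)

print(np.linalg.matrix_rank(A))   # 3 < 4, so the four vectors are linearly dependent
coeffs = np.array([1, 1, 1, -1])  # the non-trivial combination from the solution
print(A @ coeffs)                 # [0 0 0]
```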

### Exercise 3

Exercise (Trigonometrical polynomials)

Let ${\displaystyle f:\mathbb {R} \to \mathbb {R} }$ with ${\displaystyle f(x)=f(x+2\pi )}$ for all ${\displaystyle x\in \mathbb {R} }$. That means, ${\displaystyle f}$ is a ${\displaystyle 2\pi }$-periodic function. We consider the set ${\displaystyle \operatorname {Abb} _{2\pi }(\mathbb {R} ,\mathbb {R} )}$ of ${\displaystyle 2\pi }$-periodic functions. These form an ${\displaystyle \mathbb {R} }$-vector space.

Are the functions ${\displaystyle \cos(0\cdot x),\cos(1\cdot x),\cos(2\cdot x)}$ linearly independent?

How to get to the proof? (Trigonometrical polynomials)

We investigate how to write the zero function as a linear combination of the three functions. To do this, we determine the values of ${\displaystyle \lambda _{1},\lambda _{2},\lambda _{3}}$ in the equation ${\displaystyle \lambda _{1}\cos(0)+\lambda _{2}\cos(x)+\lambda _{3}\cos(2x)=0}$. We can do this by inserting three different values for ${\displaystyle x}$ and then solving the resulting system of equations.

Values for which we explicitly know the exact values of the cosine are suitable for this - for example ${\displaystyle 0,{\frac {\pi }{2}}}$ and ${\displaystyle \pi }$. For those, we know ${\displaystyle \cos(0)=\cos(2\pi )=1,\cos \left({\dfrac {\pi }{2}}\right)=0}$ and ${\displaystyle \cos(\pi )=-1}$.

Solution (Trigonometrical polynomials)

Let ${\displaystyle \lambda _{1},\lambda _{2},\lambda _{3}\in \mathbb {R} }$, such that

${\displaystyle \lambda _{1}\cos(0\cdot x)+\lambda _{2}\cos(1\cdot x)+\lambda _{3}\cos(2\cdot x)=0}$

for all ${\displaystyle x\in \mathbb {R} }$. We would like to establish ${\displaystyle \lambda _{1}=\lambda _{2}=\lambda _{3}=0}$. Plugging in the values ${\displaystyle 0,{\frac {\pi }{2}}}$ and ${\displaystyle \pi }$ for ${\displaystyle x}$, we obtain the following system of equations for the ${\displaystyle \lambda _{i}}$:

{\displaystyle {\begin{aligned}x={\frac {\pi }{2}}&:&\lambda _{1}-\lambda _{3}&=0\\[0.3em]x=\pi &:&\lambda _{1}-\lambda _{2}+\lambda _{3}&=0\\[0.3em]x=0&:&\lambda _{1}+\lambda _{2}+\lambda _{3}&=0\end{aligned}}}

The system of equations can now be solved in different ways. We transform the first equation and get ${\displaystyle \lambda _{1}=\lambda _{3}}$. We can substitute this into the second equation and get ${\displaystyle \lambda _{1}-\lambda _{2}+\lambda _{1}=0}$, so ${\displaystyle 2\lambda _{1}=\lambda _{2}}$. If we now substitute our results into the third equation, we have ${\displaystyle \lambda _{1}+2\lambda _{1}+\lambda _{1}=0}$. This is equivalent to ${\displaystyle \lambda _{1}=0}$ . From the other equations, we conclude ${\displaystyle \lambda _{2}=\lambda _{3}=0}$.

Thus we have uniquely determined the coefficients ${\displaystyle \lambda _{1},\lambda _{2}}$ and ${\displaystyle \lambda _{3}}$: they all vanish. That is, there is no non-trivial linear combination of the zero function. The functions are therefore linearly independent.
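The small linear system obtained from ${\displaystyle x={\tfrac {\pi }{2}},\pi ,0}$ can also be set up and checked numerically; its coefficient matrix is invertible, so the trivial solution is the only one. A possible sketch in Python (numpy assumed):

```python
import numpy as np

# rows: lambda_1*cos(0*x) + lambda_2*cos(x) + lambda_3*cos(2x) = 0
# evaluated at x = pi/2, pi and 0
xs = [np.pi / 2, np.pi, 0.0]
A = np.array([[np.cos(0 * x), np.cos(x), np.cos(2 * x)] for x in xs])

print(np.round(A))                       # rows (1, 0, -1), (1, -1, 1), (1, 1, 1)
print(np.linalg.det(A))                  # nonzero (about -4), so A is invertible
print(np.linalg.solve(A, np.zeros(3)))   # [0. 0. 0.]: only the trivial solution
```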

### Exercise 4

Exercise (Linear (in-)dependence?)

Prove or disprove the following statement:

Let ${\displaystyle u,v,w\in \mathbb {R} ^{3}}$. The set ${\displaystyle \{u,v,w\}}$ is linearly dependent if and only if each of the vectors is a linear combination of the other two.

How to get to the proof? (Linear (in-)dependence?)

For the set to be linearly dependent, it is sufficient if two of the three vectors are multiples of each other, while the third can be linearly independent of the two. With this consideration we can construct a counterexample.

Solution (Linear (in-)dependence?)

The statement is not correct. We consider the set

${\displaystyle \left\{{\begin{pmatrix}1\\0\\0\end{pmatrix}},{\begin{pmatrix}2\\0\\0\end{pmatrix}},{\begin{pmatrix}0\\0\\1\end{pmatrix}}\right\}}$

Then we can represent the zero vector as a non-trivial linear combination of the three vectors:

${\displaystyle {\begin{pmatrix}0\\0\\0\end{pmatrix}}=-2\cdot {\begin{pmatrix}1\\0\\0\end{pmatrix}}+{\begin{pmatrix}2\\0\\0\end{pmatrix}}+0\cdot {\begin{pmatrix}0\\0\\1\end{pmatrix}}}$

Thus the set is linearly dependent. However, the vector ${\displaystyle \left(0,0,1\right)^{T}}$ is not a linear combination of the other two.

### Exercise 5

Exercise (Linearly independent vectors in ${\displaystyle K^{n}}$)

Prove: Within the vector space ${\displaystyle K^{n}}$ of ${\displaystyle n}$-tuples over the field ${\displaystyle K}$ , the vectors ${\displaystyle e_{1}=(1,0,0,\ldots ,0)^{T}}$, ${\displaystyle e_{2}=(0,1,0,\ldots ,0)^{T}}$ up to ${\displaystyle e_{n}:=(0,0,0,\ldots ,1)^{T}}$ are linearly independent.

Solution (Linearly independent vectors in ${\displaystyle K^{n}}$)

We have to show that the zero vector can be represented as a linear combination of the vectors ${\displaystyle e_{1},...,e_{n}}$ only in the trivial way. So let us consider a linear combination of these vectors with ${\displaystyle \lambda _{i}\in K}$ for ${\displaystyle 1\leq i\leq n}$ that yields the zero vector:

${\displaystyle {\begin{pmatrix}0\\0\\\vdots \\0\end{pmatrix}}{\overset {!}{=}}\lambda _{1}\cdot {\begin{pmatrix}1\\0\\\vdots \\0\end{pmatrix}}+\lambda _{2}\cdot {\begin{pmatrix}0\\1\\\vdots \\0\end{pmatrix}}+...+\lambda _{n}\cdot {\begin{pmatrix}0\\0\\\vdots \\1\end{pmatrix}}}$

Comparing components, we can interpret this as the linear system of equations

{\displaystyle {\begin{aligned}0&=\lambda _{1}\\0&=\lambda _{2}\\&\vdots \\0&=\lambda _{n}\end{aligned}}}

Thus we have shown that the ${\displaystyle \lambda _{i}}$ are uniquely determined and all equal to ${\displaystyle 0}$. By the definition of linear independence, the vectors are therefore linearly independent.

### Exercise 6

Exercise (Linearly independent vectors and endomorphisms)

Let ${\displaystyle K}$ be a field and ${\displaystyle F:V\to V}$ an endomorphism of the ${\displaystyle K}$-vector space ${\displaystyle V}$. Let ${\displaystyle v\in V}$, such that for a fixed natural number ${\displaystyle n}$, we have: ${\displaystyle F^{i}(v)\neq 0}$ for ${\displaystyle i=1,...,n}$ and ${\displaystyle F^{n+1}(v)=0}$. Here, ${\displaystyle F^{i}(v)=F(F(...(F(v))...))}$ denotes the ${\displaystyle i}$-fold application of ${\displaystyle F}$ to the vector ${\displaystyle v}$. Prove that then the vectors ${\displaystyle v,F(v),...,F^{n}(v)}$ are linearly independent.

How to get to the proof? (Linearly independent vectors and endomorphisms)

We need to show that for ${\displaystyle \lambda _{0},\dots ,\lambda _{n}\in K}$ with

${\displaystyle \lambda _{0}v+\lambda _{1}F(v)+\dots +\lambda _{n}F^{n}(v)=0}$

we already have ${\displaystyle \lambda _{0}=\dots =\lambda _{n}=0}$ . We can try to get the individual ${\displaystyle \lambda _{i}}$ from this equation: We know that ${\displaystyle F^{n+1}(v)=0}$ . If we now apply ${\displaystyle F}$ to this equation, we get

{\displaystyle {\begin{aligned}0=F(0)&=F(\lambda _{0}v+\lambda _{1}F(v)+\dots +\lambda _{n}F^{n}(v))\\[0.3em]&=\lambda _{0}F(v)+\lambda _{1}F^{2}(v)+\dots +\lambda _{n}F^{n+1}(v)\\[0.3em]&=\lambda _{0}F(v)+\lambda _{1}F^{2}(v)+\dots +\lambda _{n-1}F^{n}(v).\end{aligned}}}

We have thus eliminated one summand. With this we have reduced our problem to a case with ${\displaystyle n}$ summands. That is, by proceeding with induction, we can now infer the statement.

Solution (Linearly independent vectors and endomorphisms)

We perform an induction over ${\displaystyle n}$ to iterate the idea of reducing the number of vectors one-by-one.

Claim to be proven for all ${\displaystyle n\in \mathbb {N} }$:

If ${\displaystyle v\in V}$ with ${\displaystyle F^{i}(v)\neq 0}$ for all ${\displaystyle i\leq n}$ and ${\displaystyle F^{n+1}(v)=0}$, then ${\displaystyle v,F(v),\dots ,F^{n}(v)}$ is linearly independent.

1. Base case:

We need to show that ${\displaystyle v}$ and ${\displaystyle F(v)}$ are linearly independent, if ${\displaystyle F^{2}(v)=0}$ and ${\displaystyle F(v)\neq 0}$ hold. That means we have to show that for ${\displaystyle \lambda ,\mu \in K}$ with ${\displaystyle \lambda \cdot v+\mu \cdot F(v)=0}$ we already have ${\displaystyle \lambda =\mu =0}$. Now,

{\displaystyle {\begin{aligned}0=F(0)&=F(\lambda \cdot v+\mu \cdot F(v))\\[0.3em]&=\lambda \cdot F(v)+\mu \cdot F^{2}(v)\\[0.3em]&=\lambda \cdot F(v)\\[0.3em]\end{aligned}}}

Since ${\displaystyle F(v)\neq 0}$, we have ${\displaystyle \lambda =0}$. Plugging ${\displaystyle \lambda =0}$ into ${\displaystyle \lambda \cdot v+\mu \cdot F(v)=0}$ gives ${\displaystyle \mu \cdot F(v)=0}$. With the same argument, we now obtain ${\displaystyle \mu =0}$.

2. Inductive step:

2a. Inductive hypothesis:

If ${\displaystyle v\in V}$ with ${\displaystyle F^{i}(v)\neq 0}$ for all ${\displaystyle i\leq n}$ and ${\displaystyle F^{n+1}(v)=0}$, then ${\displaystyle v,F(v),\dots ,F^{n}(v)}$ is linearly independent.

2b. Induction claim:

If ${\displaystyle v\in V}$ with ${\displaystyle F^{i}(v)\neq 0}$ for all ${\displaystyle i\leq n+1}$ and ${\displaystyle F^{n+2}(v)=0}$, then ${\displaystyle v,F(v),\dots ,F^{n+1}(v)}$ is linearly independent.

2c. Proof of the induction step:

Let ${\displaystyle \lambda _{0},\dots ,\lambda _{n+1}\in K}$, such that ${\displaystyle \lambda _{0}\cdot v+\lambda _{1}\cdot F(v)+\dots +\lambda _{n+1}\cdot F^{n+1}(v)=0}$ . Then,

{\displaystyle {\begin{aligned}0=F(0)&=F(\lambda _{0}\cdot v+\lambda _{1}\cdot F(v)+\dots +\lambda _{n+1}\cdot F^{n+1}(v))\\[0.3em]&=\lambda _{0}\cdot F(v)+\lambda _{1}\cdot F^{2}(v)+\dots +\lambda _{n+1}\cdot F^{n+2}(v)\\[0.3em]&=\lambda _{0}\cdot F(v)+\dots +\lambda _{n}\cdot F^{n+1}(v)\end{aligned}}}

Applying the induction assumption to ${\displaystyle F(v)}$ , we get that ${\displaystyle \lambda _{0}=\dots =\lambda _{n}=0}$ . Hence, ${\displaystyle \lambda _{n+1}F^{n+1}(v)=0}$. Since ${\displaystyle F^{n+1}(v)\neq 0}$ , we also have ${\displaystyle \lambda _{n+1}=0}$. So ${\displaystyle v,F(v),\dots ,F^{n+1}(v)}$ is linearly independent.
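A concrete instance of this situation, useful for experimenting: take for ${\displaystyle F}$ the nilpotent "shift" matrix on ${\displaystyle \mathbb {R} ^{n+1}}$ that maps each standard basis vector to the next one, and ${\displaystyle v=e_{1}}$. Then ${\displaystyle F^{i}(v)=e_{i+1}\neq 0}$ for ${\displaystyle i\leq n}$ and ${\displaystyle F^{n+1}(v)=0}$, and the iterates are indeed linearly independent. A small Python sketch of this example (numpy assumed; the choice of matrix and vector is just one possibility):

```python
import numpy as np

n = 3
# F shifts the standard basis: F(e_i) = e_{i+1} and F(e_{n+1}) = 0, so F^(n+1) = 0
F = np.eye(n + 1, k=-1)           # ones on the subdiagonal: a nilpotent shift matrix
v = np.zeros(n + 1); v[0] = 1.0   # v = e_1, so F^i(v) = e_{i+1} != 0 for i <= n

iterates = [v]
for _ in range(n):
    iterates.append(F @ iterates[-1])                   # v, F(v), ..., F^n(v)

print(np.allclose(np.linalg.matrix_power(F, n + 1), 0))            # True: F^(n+1) = 0
print(np.linalg.matrix_rank(np.column_stack(iterates)) == n + 1)   # True: independent
```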