# Derivatives of higher order – Serlo

## Motivation

The derivative ${\displaystyle f'}$ describes the instantaneous rate of change of the function ${\displaystyle f}$. The derivative function ${\displaystyle f'}$ can itself be differentiated, provided that it is differentiable. The resulting derivative of the derivative is called the second derivative or derivative of second order and is denoted ${\displaystyle f''}$ or ${\displaystyle f^{(2)}}$. This process can be repeated as long as the resulting function is differentiable: if the second derivative is again differentiable, a third derivative ${\displaystyle f^{(3)}}$ can be constructed, then a fourth derivative ${\displaystyle f^{(4)}}$, and so on.

These higher derivatives allow statements about the shape of a function graph. The second derivative tells us whether a graph is curved upwards ("convex") or curved downwards ("concave"). If a function has a convex graph, its slope increases monotonically. For convexity, ${\displaystyle f''(x)>0}$ for all ${\displaystyle x}$ is a sufficient condition: if the second derivative is always positive, then the first derivative is strictly monotonically increasing. Analogously, it follows from ${\displaystyle f''(x)<0}$ that the graph is concave and the derivative is strictly monotonically decreasing.

Higher-order derivatives do not only tell us more about abstract functions; they can also have a physical meaning. Consider the function ${\displaystyle f:[a,b]\to \mathbb {R} }$ with ${\displaystyle f(t)=t^{3}+2}$, which describes the position ${\displaystyle f(t)}$ of a car at time ${\displaystyle t}$. We already know that the first derivative gives the speed of the car at time ${\displaystyle t}$: ${\displaystyle f'(t)=f^{(1)}(t)=3t^{2}}$. What does the derivative ${\displaystyle f''(t)}$ of ${\displaystyle f'}$ tell us? It is the instantaneous rate of change of the speed, and thus the acceleration of the car: ${\displaystyle f''(t)=f^{(2)}(t)=6t}$. So second derivatives describe accelerations.

Now we can differentiate this second derivative again, which gives the rate of change of the acceleration ${\displaystyle f'''(t)=f^{(3)}(t)=6}$. This is called jerk in vehicle dynamics and indicates how fast a car increases its acceleration or how abruptly it initiates braking. For example, a large jerk occurs during emergency braking. Since ${\displaystyle f^{(3)}(t)<0}$ at the onset of an emergency stop, the graph of the speed ${\displaystyle f'}$ is concave there: the speed decreases faster and faster. In our example, the fourth derivative ${\displaystyle f^{(4)}(t)=0}$ tells us that the jerk is constant, that is, its instantaneous rate of change is zero.
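The car example can be reproduced with a few lines of code. The following is only a minimal sketch and not part of the original text: it assumes that a polynomial is stored as a list of coefficients, and the helper names `differentiate` and `evaluate` are hypothetical.

```python
def differentiate(coeffs):
    """Differentiate a polynomial given as [c0, c1, c2, ...],
    where c_i is the coefficient of t**i (assumed representation)."""
    return [i * c for i, c in enumerate(coeffs)][1:]


def evaluate(coeffs, t):
    """Evaluate the polynomial at t (the empty list is the zero polynomial)."""
    return sum(c * t**i for i, c in enumerate(coeffs))


position = [2, 0, 0, 1]                  # f(t) = t^3 + 2
velocity = differentiate(position)       # f'(t) = 3 t^2
acceleration = differentiate(velocity)   # f''(t) = 6 t
jerk = differentiate(acceleration)       # f'''(t) = 6
fourth = differentiate(jerk)             # f''''(t) = 0

for name, poly in [("velocity", velocity), ("acceleration", acceleration),
                   ("jerk", jerk), ("fourth derivative", fourth)]:
    print(name, "at t = 2:", evaluate(poly, 2))
```

Each call to `differentiate` applies the power rule term by term, so the four derivatives above correspond to the four functions named in the text.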

## Definition

Definition (Derivatives of higher order)

Let ${\displaystyle f:D\to W}$ with ${\displaystyle D,W\subseteq \mathbb {R} }$ be a real function. We set ${\displaystyle f^{(0)}:=f}$ and in the case of differentiability ${\displaystyle f^{(1)}=f'}$. We define the second derivative via ${\displaystyle f^{(2)}=\left(f^{(1)}\right)'}$, the third derivative via ${\displaystyle f^{(3)}=\left(f^{(2)}\right)'}$ etc., if these higher derivatives exist. Overall, we define recursively for ${\displaystyle k\in \mathbb {N} _{0}}$:

${\displaystyle f^{(k+1)}:=\left(f^{(k)}\right)'}$

We say that ${\displaystyle f}$ is ${\displaystyle k}$ times differentiable if the ${\displaystyle k}$-th derivative ${\displaystyle f^{(k)}}$ of ${\displaystyle f}$ exists. The function ${\displaystyle f}$ is called ${\displaystyle k}$ times continuously differentiable if, in addition, ${\displaystyle f^{(k)}}$ is continuous (which is a stronger requirement).

The set of all ${\displaystyle k}$ times continuously differentiable functions with domain ${\displaystyle D}$ and target set ${\displaystyle W}$ is denoted ${\displaystyle C^{k}(D,W)}$. In particular, ${\displaystyle C^{0}(D,W)=C(D,W)}$ consists of the continuous functions. If the function ${\displaystyle f}$ can be differentiated arbitrarily often, we write ${\displaystyle f\in C^{\infty }(D,W)}$. If ${\displaystyle W=\mathbb {R} }$, we can write ${\displaystyle C^{k}(D)}$ or ${\displaystyle C^{\infty }(D)}$ for short. These sets of functions satisfy the chain of inclusions:

${\displaystyle C(D,W)\supseteq C^{1}(D,W)\supseteq C^{2}(D,W)\supseteq \ldots \supseteq C^{k}(D,W)\supseteq C^{k+1}(D,W)}$

Question: Are the following statements true or false?

1. ${\displaystyle |x|\in C(\mathbb {R} )}$
2. ${\displaystyle |x|\in C^{1}(\mathbb {R} )}$
3. ${\displaystyle \operatorname {sgn}(x)\in C(\mathbb {R} )}$
4. ${\displaystyle \operatorname {sgn}(x)\in C^{1}(\mathbb {R} )}$
5. ${\displaystyle x\in C(\mathbb {R} )}$
6. ${\displaystyle x\in C^{1}(\mathbb {R} )}$

Solutions:

1. true
2. false
3. false
4. false
5. true
6. true

## Examples for higher derivatives

### Derivatives of the power function

Example (Derivatives of the power function)

We consider the function ${\displaystyle f:\mathbb {R} \to \mathbb {R} ,x\mapsto x^{2}}$. This function is infinitely often differentiable, since for all ${\displaystyle x\in \mathbb {R} }$ and all ${\displaystyle k\in \mathbb {N} }$ with ${\displaystyle k\geq 3}$ we have:

{\displaystyle {\begin{aligned}&\ f'(x)=2x\\[0.3em]&\ f''(x)=2\\[0.3em]&\ f^{(k)}=0\\[0.3em]\end{aligned}}}

In general, for ${\displaystyle f:\mathbb {R} \to \mathbb {R} ,x\mapsto x^{n}}$ with ${\displaystyle n\in \mathbb {N} }$ we have:

{\displaystyle {\begin{aligned}&\ f'(x)=nx^{n-1}\\[0.3em]&\ f^{(k)}(x)=n(n-1)\cdot \ldots \cdot (n-k+1)x^{n-k}{\text{ for }}k\leq n\\[0.3em]&\ f^{(k)}=0{\text{ for }}k>n\\[0.3em]\end{aligned}}}
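The general formula for the power function can be turned into a small computation. The sketch below is an illustration under assumptions: the function name `power_derivative` and the encoding of a monomial as a pair `(coefficient, exponent)` are hypothetical choices, not part of the original text.

```python
from math import factorial


def power_derivative(n, k):
    """Return (coefficient, exponent) of the k-th derivative of x**n.

    Assumed encoding: the pair (c, e) stands for c * x**e,
    and (0, 0) stands for the zero function (case k > n).
    """
    if k > n:
        return (0, 0)
    coefficient = 1
    for j in range(k):
        coefficient *= n - j   # builds n (n-1) ... (n-k+1)
    return (coefficient, n - k)


print(power_derivative(2, 1))   # first derivative of x^2
print(power_derivative(5, 3))   # third derivative of x^5
print(power_derivative(5, 5) == (factorial(5), 0))
```

For ${\displaystyle k=n}$ the product ${\displaystyle n(n-1)\cdot \ldots \cdot 1}$ is just ${\displaystyle n!}$, which the last line checks.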

### Derivatives of the exponential function

Example (Derivatives of the exponential function)

The exponential function ${\displaystyle \exp :\mathbb {R} \to \mathbb {R} }$ satisfies ${\displaystyle \exp '(x)=\exp(x)}$ for all ${\displaystyle x\in \mathbb {R} }$. Hence it is infinitely often differentiable, that is ${\displaystyle \exp \in C^{\infty }(\mathbb {R} ,\mathbb {R} )}$, and for all ${\displaystyle k\in \mathbb {N} }$ we have:

${\displaystyle \exp ^{(k)}(x)=\exp(x)}$

### Derivatives of the sine function

Example (Derivatives of the sine function)

The function ${\displaystyle \sin :\mathbb {R} \to \mathbb {R} }$ is infinitely often continuously differentiable. For all ${\displaystyle x\in \mathbb {R} }$ we have:

{\displaystyle {\begin{aligned}&\sin '(x)=\cos(x)\\[0.3em]&\sin ''(x)=\cos '(x)=-\sin(x)\\[0.3em]&\sin '''(x)=(-\sin )'(x)=-\cos(x)\\[0.3em]&\sin ^{(4)}(x)=(-\cos )'(x)=\sin(x)\\[0.3em]\end{aligned}}}

In general, for all ${\displaystyle n\in \mathbb {N} _{0}}$ we have:

{\displaystyle {\begin{aligned}&\sin ^{(4n+1)}(x)=\cos(x)\\[0.3em]&\sin ^{(4n+2)}(x)=-\sin(x)\\[0.3em]&\sin ^{(4n+3)}(x)=-\cos(x)\\[0.3em]&\sin ^{(4n)}(x)=\sin(x)\\[0.3em]\end{aligned}}}

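Question: What are the derivatives of ${\displaystyle \cos :\mathbb {R} \to \mathbb {R} }$?

We use that ${\displaystyle \sin '(x)=\cos(x)}$, so the ${\displaystyle k}$-th derivative of ${\displaystyle \cos }$ is the ${\displaystyle (k+1)}$-th derivative of ${\displaystyle \sin }$. For ${\displaystyle x\in \mathbb {R} }$ we have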

{\displaystyle {\begin{aligned}&\cos '(x)=-\sin(x)\\[0.3em]&\cos ''(x)=-\cos(x)\\[0.3em]&\cos '''(x)=\sin(x)\\[0.3em]&\cos ^{(4)}(x)=\cos(x)\end{aligned}}}

In general, for all ${\displaystyle n\in \mathbb {N} _{0}}$ we have:

{\displaystyle {\begin{aligned}&\cos ^{(4n+1)}(x)=-\sin(x)\\[0.3em]&\cos ^{(4n+2)}(x)=-\cos(x)\\[0.3em]&\cos ^{(4n+3)}(x)=\sin(x)\\[0.3em]&\cos ^{(4n)}(x)=\cos(x)\end{aligned}}}
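The period-four pattern of the derivatives of sine and cosine can be expressed as a table lookup. This is a minimal sketch, assuming the hypothetical names `SIN_CYCLE`, `sin_derivative` and `cos_derivative`; it evaluates the ${\displaystyle k}$-th derivative at a point by reducing ${\displaystyle k}$ modulo 4.

```python
import math

# The four functions that the derivatives of sin cycle through:
# sin, cos, -sin, -cos, then sin again.
SIN_CYCLE = [
    math.sin,
    math.cos,
    lambda x: -math.sin(x),
    lambda x: -math.cos(x),
]


def sin_derivative(k, x):
    """Value of the k-th derivative of sin at x, for k >= 0."""
    return SIN_CYCLE[k % 4](x)


def cos_derivative(k, x):
    """Value of the k-th derivative of cos at x.

    Uses cos = sin', so cos^(k) = sin^(k+1).
    """
    return sin_derivative(k + 1, x)


x = 0.3
print(sin_derivative(1001, x), math.cos(x))    # 1001 = 4*250 + 1
print(cos_derivative(1001, x), -math.sin(x))
```

The two numbers printed on each line agree, since ${\displaystyle 1001=4\cdot 250+1}$ falls into the first case of both general formulas.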

## Exercises: higher derivatives

### Derivatives of the logarithm function

Exercise (Derivatives of the logarithm function)

Show that the logarithm function ${\displaystyle \ln :\mathbb {R} ^{+}\to \mathbb {R} }$ is arbitrarily often differentiable and that for all ${\displaystyle n\in \mathbb {N} }$ we have:

${\displaystyle \ln ^{(n)}(x)=(-1)^{n-1}{\frac {(n-1)!}{x^{n}}}}$

Proof (Derivatives of the logarithm function)

Statement to be proven for all ${\displaystyle n\in \mathbb {N} }$:

${\displaystyle \ln ^{(n)}(x)=(-1)^{n-1}{\frac {(n-1)!}{x^{n}}}}$

1. Base case:

${\displaystyle \ln '(x)={\frac {1}{x}}=(-1)^{0}{\frac {0!}{x^{1}}}}$

2. Inductive step:

2a. Inductive hypothesis:

${\displaystyle \ln ^{(n)}(x)=(-1)^{n-1}{\frac {(n-1)!}{x^{n}}}}$

2b. Inductive claim:

${\displaystyle \ln ^{(n+1)}(x)=(-1)^{n}{\frac {n!}{x^{n+1}}}}$

2c. Proof of the inductive step:

{\displaystyle {\begin{aligned}\ln ^{(n+1)}(x)&=(\ln ^{(n)}(x))'{\underset {\text{assumption}}{\overset {\text{induction}}{=}}}\left((-1)^{n-1}{\frac {(n-1)!}{x^{n}}}\right)'\\[0.3em]&=\left((-1)^{n-1}(n-1)!x^{-n}\right)'{\underset {\text{rules}}{\overset {\text{derivative}}{=}}}(-1)^{n-1}(n-1)!(-n)x^{-n-1}\\[0.3em]&=(-1)^{n}{\frac {n!}{x^{n+1}}}\end{aligned}}}
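The induction can be mirrored by a mechanical check for small ${\displaystyle n}$. The following sketch is an illustration, not a proof: it assumes the hypothetical encoding of ${\displaystyle c\cdot x^{e}}$ as a pair `(c, e)` and hypothetical function names, repeats the power rule starting from ${\displaystyle \ln '(x)=x^{-1}}$, and compares the result with the closed form.

```python
from math import factorial


def differentiate_monomial(coefficient, exponent):
    """Power rule: (c * x**e)' = (c * e) * x**(e - 1)."""
    return coefficient * exponent, exponent - 1


def log_derivative(n):
    """n-th derivative of ln for n >= 1, as a pair (coefficient, exponent)."""
    term = (1, -1)                      # ln'(x) = 1 * x**(-1)
    for _ in range(n - 1):
        term = differentiate_monomial(*term)
    return term


def closed_form(n):
    """The claimed formula (-1)**(n-1) * (n-1)! * x**(-n)."""
    return (-1) ** (n - 1) * factorial(n - 1), -n


for n in range(1, 11):
    print(n, log_derivative(n), log_derivative(n) == closed_form(n))
```

Each loop iteration in `log_derivative` is exactly one application of the inductive step.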

### Exactly once differentiable function

Exercise (Exactly once differentiable function)

Prove that the following function is differentiable once, but not twice:

${\displaystyle f:\mathbb {R} \to \mathbb {R} ,x\mapsto {\begin{cases}x^{2}\sin \left({\frac {1}{x}}\right)&,x\neq 0\\0&,x=0\end{cases}}}$

Solution (Exactly once differentiable function)

This function is differentiable at every point ${\displaystyle {\tilde {x}}\neq 0}$, since for all ${\displaystyle x}$ in the open neighbourhood ${\displaystyle (0,2{\tilde {x}})}$ for ${\displaystyle {\tilde {x}}>0}$ or ${\displaystyle (2{\tilde {x}},0)}$ for ${\displaystyle {\tilde {x}}<0}$ we have ${\displaystyle f(x)=x^{2}\sin \left({\frac {1}{x}}\right)}$. Consequently, by the product and chain rules,

${\displaystyle f'({\tilde {x}})=2{\tilde {x}}\cdot \sin \left({\frac {1}{\tilde {x}}}\right)+{\tilde {x}}^{2}\cos \left({\frac {1}{\tilde {x}}}\right)\cdot {\frac {-1}{{\tilde {x}}^{2}}}=2{\tilde {x}}\cdot \sin \left({\frac {1}{\tilde {x}}}\right)-\cos \left({\frac {1}{\tilde {x}}}\right)}$

For ${\displaystyle {\tilde {x}}=0}$ we obtain

{\displaystyle {\begin{aligned}&\ \lim \limits _{x\to 0}{\frac {f(x)-f(0)}{x-0}}\\[0.3em]&\ =\lim \limits _{x\to 0}{\frac {x^{2}\sin \left({\frac {1}{x}}\right)-0}{x}}\\[0.3em]&\ =\lim \limits _{x\to 0}{x\sin \left({\frac {1}{x}}\right)}\\[0.3em]&\ =0\end{aligned}}}

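The last step holds because ${\displaystyle \sin \left({\frac {1}{x}}\right)\in [-1,1]}$ for all ${\displaystyle x\in \mathbb {R} \setminus \{0\}}$, so this factor is bounded and ${\displaystyle x\sin \left({\frac {1}{x}}\right)}$ tends to ${\displaystyle 0}$. Hence ${\displaystyle f}$ is differentiable on all of ${\displaystyle \mathbb {R} }$, with derivative function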

${\displaystyle f':\mathbb {R} \to \mathbb {R} ,x\mapsto {\begin{cases}2x\cdot \sin \left({\frac {1}{x}}\right)-\cos \left({\frac {1}{x}}\right)&,x\neq 0\\0&,x=0\end{cases}}}$

However, this function is not differentiable at ${\displaystyle {\tilde {x}}=0}$. We approach 0 by taking two sequences ${\displaystyle (a_{n})_{n\in \mathbb {N} }}$ and ${\displaystyle (b_{n})_{n\in \mathbb {N} }}$, where we define for all ${\displaystyle n\in \mathbb {N} }$

{\displaystyle {\begin{aligned}&\ a_{n}={\frac {1}{{\tfrac {\pi }{2}}+2\pi n}}\\[0.3em]&\ b_{n}={\frac {1}{{\tfrac {3\pi }{2}}+2\pi n}}\\[0.3em]\end{aligned}}}

Then ${\displaystyle \lim \limits _{n\to \infty }{a_{n}}=0}$ and ${\displaystyle \lim \limits _{n\to \infty }{b_{n}}=0}$. Furthermore, for all ${\displaystyle n\in \mathbb {N} }$ we have

{\displaystyle {\begin{aligned}&\ f'(a_{n})=2{\frac {1}{{\tfrac {\pi }{2}}+2\pi n}}\cdot \sin \left({\tfrac {\pi }{2}}+2\pi n\right)-\cos \left({\tfrac {\pi }{2}}+2\pi n\right)={\frac {2}{{\tfrac {\pi }{2}}+2\pi n}}\\[0.3em]&\ f'(b_{n})=2{\frac {1}{{\tfrac {3\pi }{2}}+2\pi n}}\cdot \sin \left({\tfrac {3\pi }{2}}+2\pi n\right)-\cos \left({\tfrac {3\pi }{2}}+2\pi n\right)={\frac {-2}{{\tfrac {3\pi }{2}}+2\pi n}}\\[0.3em]\end{aligned}}}

So we have

{\displaystyle {\begin{aligned}&\ \lim \limits _{n\to \infty }{\frac {f'(a_{n})-f'(0)}{a_{n}-0}}\\[0.3em]&\ =\lim \limits _{n\to \infty }{\frac {{\frac {2}{{\tfrac {\pi }{2}}+2\pi n}}-0}{{\frac {1}{{\tfrac {\pi }{2}}+2\pi n}}-0}}=2\end{aligned}}}

But

{\displaystyle {\begin{aligned}&\ \lim \limits _{n\to \infty }{\frac {f'(b_{n})-f'(0)}{b_{n}-0}}\\[0.3em]&\ =\lim \limits _{n\to \infty }{\frac {{\frac {-2}{{\tfrac {3\pi }{2}}+2\pi n}}-0}{{\frac {1}{{\tfrac {3\pi }{2}}+2\pi n}}-0}}=-2\end{aligned}}}

Consequently, the limit ${\displaystyle \lim _{x\to 0}{\tfrac {f'(x)-f'(0)}{x-0}}}$ does not exist, and therefore ${\displaystyle f'}$ is not differentiable at ${\displaystyle 0}$.

Additional question: Is ${\displaystyle f'}$ continuous at ${\displaystyle 0}$?

No. Consider the two sequences:

{\displaystyle {\begin{aligned}&\ (a_{n})_{n\in \mathbb {N} }=({\tfrac {1}{2n\pi }})_{n\in \mathbb {N} }\\[0.3em]&\ (b_{n})_{n\in \mathbb {N} }=({\tfrac {1}{(2n+1)\pi }})_{n\in \mathbb {N} }\\[0.3em]\end{aligned}}}

For these sequences, we have ${\displaystyle \lim _{n\to \infty }a_{n}=\lim _{n\to \infty }b_{n}=0}$. However,

{\displaystyle {\begin{aligned}&\ f'(a_{n})=2{\frac {1}{2n\pi }}\cdot \underbrace {\sin \left(2n\pi \right)} _{=0}-\underbrace {\cos \left(2n\pi \right)} _{=1}=-1\\[0.3em]&\ f'(b_{n})=2{\frac {1}{(2n+1)\pi }}\cdot \underbrace {\sin \left((2n+1)\pi \right)} _{=0}-\underbrace {\cos \left((2n+1)\pi \right)} _{=-1}=1\end{aligned}}}

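So ${\displaystyle f'(a_{n})\to -1}$ and ${\displaystyle f'(b_{n})\to 1}$, which means that ${\displaystyle \lim _{x\to 0}f'(x)}$ does not exist. By the sequence criterion, ${\displaystyle f'}$ is therefore not continuous at ${\displaystyle 0}$.

Remark: Since differentiability implies continuity, this also shows that ${\displaystyle f'}$ is not differentiable at ${\displaystyle 0}$.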
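The behaviour of ${\displaystyle f'}$ along the two sequences from the additional question can also be observed numerically. The sketch below is only an illustration in floating-point arithmetic, so the values are approximate; the function names `f_prime`, `a` and `b` are hypothetical.

```python
import math


def f_prime(x):
    """Derivative of x**2 * sin(1/x), with the value f'(0) = 0 at the origin."""
    if x == 0:
        return 0.0
    return 2 * x * math.sin(1 / x) - math.cos(1 / x)


def a(n):
    """Sequence a_n = 1 / (2 n pi), which tends to 0."""
    return 1 / (2 * n * math.pi)


def b(n):
    """Sequence b_n = 1 / ((2 n + 1) pi), which also tends to 0."""
    return 1 / ((2 * n + 1) * math.pi)


for n in (1, 10, 100, 1000):
    # Both arguments shrink towards 0, but the values of f' stay
    # near -1 and near +1 respectively.
    print(n, a(n), f_prime(a(n)), b(n), f_prime(b(n)))
```

Up to rounding errors, the values along ${\displaystyle (a_{n})}$ stay at ${\displaystyle -1}$ and those along ${\displaystyle (b_{n})}$ stay at ${\displaystyle 1}$, so no single limit at ${\displaystyle 0}$ can exist.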

## Computation rules for higher derivatives

### Linearity

The linearity of the derivative is also "inherited" by higher derivatives: If ${\displaystyle f}$ and ${\displaystyle g}$ are differentiable, then for ${\displaystyle a,b\in \mathbb {R} }$ the function ${\displaystyle af+bg}$ is also differentiable with

${\displaystyle (af+bg)'=af'+bg'}$

If ${\displaystyle f}$ and ${\displaystyle g}$ are even twice differentiable, then we have

${\displaystyle (af+bg)''=(af'+bg')'=af''+bg''}$

Continuing in this way, we obtain the following theorem:

Theorem (Linearity of higher derivatives)

Let ${\displaystyle n\in \mathbb {N} _{0}}$, let ${\displaystyle a,b\in \mathbb {R} }$ and let ${\displaystyle f,g:D\to \mathbb {R} }$ be ${\displaystyle n}$ times differentiable. Then ${\displaystyle af+bg:D\to \mathbb {R} }$ is also ${\displaystyle n}$ times differentiable, and we have:

${\displaystyle (af+bg)^{(n)}=af^{(n)}+bg^{(n)}}$

Example (Linearity of higher derivatives)

Since ${\displaystyle \sin ^{(4n+1)}=\cos }$ and ${\displaystyle \cos ^{(4n+1)}=-\sin }$ for all ${\displaystyle n\in \mathbb {N} _{0}}$, we have

${\displaystyle (3\sin(x)+4\cos(x))^{(1001)}=3\sin ^{(1001)}(x)+4\cos ^{(1001)}(x)=3\sin ^{(4\cdot 250+1)}(x)+4\cos ^{(4\cdot 250+1)}(x)=3\cos(x)-4\sin(x)}$
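Linearity means that differentiation acts on the coefficients of ${\displaystyle a\sin +b\cos }$ as a simple linear map. The following sketch makes this concrete under an assumed encoding: the function ${\displaystyle a\sin +b\cos }$ is stored as the pair `(a, b)`, and the names `differentiate` and `nth_derivative` are hypothetical.

```python
def differentiate(coeffs):
    """Derivative of a*sin + b*cos, stored as the pair (a, b).

    (a sin + b cos)' = a cos - b sin, so the new pair is (-b, a).
    """
    a, b = coeffs
    return (-b, a)


def nth_derivative(coeffs, n):
    """Apply the derivative n times, one step after another."""
    for _ in range(n):
        coeffs = differentiate(coeffs)
    return coeffs


print(nth_derivative((3, 4), 1001))   # coefficients of sin and cos
print(nth_derivative((3, 4), 4))      # four derivatives give the original pair
```

The first result is the pair with sine coefficient ${\displaystyle -4}$ and cosine coefficient ${\displaystyle 3}$, which matches ${\displaystyle 3\cos(x)-4\sin(x)}$ from the example.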

Proof (Linearity of higher derivatives)

Statement to be proven for all ${\displaystyle n\in \mathbb {N} _{0}}$:

${\displaystyle (af+bg)^{(n)}=af^{(n)}+bg^{(n)}}$

1. Base case:

${\displaystyle (af+bg)^{(0)}=af+bg=af^{(0)}+bg^{(0)}}$

2. Inductive step:

2a. Inductive hypothesis:

${\displaystyle (af+bg)^{(n)}=af^{(n)}+bg^{(n)}}$

2b. Inductive claim:

${\displaystyle (af+bg)^{(n+1)}=af^{(n+1)}+bg^{(n+1)}}$

2c. Proof of the inductive step:

{\displaystyle {\begin{aligned}(af+bg)^{(n+1)}&=((af+bg)^{(n)})'\\[0.3em]&{\color {OliveGreen}\left\downarrow \ {\text{induction assumption}}\right.}\\[0.3em]&=(af^{(n)}+bg^{(n)})'\\[0.3em]&{\color {OliveGreen}\left\downarrow \ {\text{linearity of the derivative}}\right.}\\[0.3em]&=af^{(n+1)}+bg^{(n+1)}\end{aligned}}}

### Leibniz rule for product functions

We now try to determine a general formula for the ${\displaystyle n}$-th derivative of the product ${\displaystyle fg}$ of two arbitrarily often differentiable functions ${\displaystyle f}$ and ${\displaystyle g}$. By applying the factor, sum and product rules several times, we obtain for ${\displaystyle n=1,2,3}$

{\displaystyle {\begin{aligned}(fg)'&=f'g+fg'\\(fg)''&=(f'g+g'f)'=(f'g)'+(fg')'\\&=f''g+f'g'+f'g'+fg''\\&=f''g+2f'g'+fg''\\(fg)'''&=(f''g+2f'g'+fg'')'\\&=(f''g)'+(2f'g')'+(fg'')'\\&=f'''g+f''g'+2f''g'+2f'g''+f'g''+fg'''\\&=f'''g+3f''g'+3f'g''+fg'''\end{aligned}}}

If we replace ${\displaystyle f}$ by ${\displaystyle x}$ and ${\displaystyle g}$ by ${\displaystyle y}$, and the derivatives of ${\displaystyle f}$ and ${\displaystyle g}$ by the corresponding powers of ${\displaystyle x}$ and ${\displaystyle y}$, we see a clear analogy to the binomial theorem:

{\displaystyle {\begin{aligned}(x+y)^{1}&=x^{1}y^{0}+x^{0}y^{1}\\(x+y)^{2}&=x^{2}+2xy+y^{2}\\&=x^{2}y^{0}+2x^{1}y^{1}+x^{0}y^{2}\\(x+y)^{3}&=x^{3}+3x^{2}y+3xy^{2}+y^{3}\\&=x^{3}y^{0}+3x^{2}y^{1}+3x^{1}y^{2}+x^{0}y^{3}\end{aligned}}}

This analogy can be made precise as follows:

For every ${\displaystyle k\in \mathbb {N} _{0}}$ we associate the derivative ${\displaystyle f^{(k)}}$ with the power ${\displaystyle x^{k}}$, and the derivative ${\displaystyle g^{(k)}}$ with the power ${\displaystyle y^{k}}$. In particular, the ${\displaystyle 0}$-th derivative ${\displaystyle f^{(0)}=f}$ corresponds to the ${\displaystyle 0}$-th power ${\displaystyle x^{0}=1}$. By the product rule, the derivative of the term ${\displaystyle f^{(k)}g^{(l)}}$ is

${\displaystyle (f^{(k)}g^{(l)})'=f^{(k+1)}g^{(l)}+f^{(k)}g^{(l+1)}}$

The expression ${\displaystyle f^{(k+1)}g^{(l)}+f^{(k)}g^{(l+1)}}$ now corresponds in our analogy to the sum ${\displaystyle x^{k+1}y^{l}+x^{k}y^{l+1}}$. We get this term from ${\displaystyle x^{k}y^{l}}$ by multiplication with ${\displaystyle x+y}$. For our polynomials, the distributive law yields

${\displaystyle (x^{k}y^{l})(x+y)=x^{k+1}y^{l}+x^{k}y^{l+1}}$

Therefore, the application of the product rule corresponds to the multiplication with the sum ${\displaystyle x+y}$. Thus the ${\displaystyle n}$-th derivative ${\displaystyle (fg)^{(n)}}$ corresponds to the power ${\displaystyle (x+y)^{n}}$. From the binomial theorem

${\displaystyle (x+y)^{n}=\sum _{k=0}^{n}{\binom {n}{k}}x^{k}y^{n-k}}$

we hence get the

Theorem (Leibniz rule for derivatives)

Let ${\displaystyle n\in \mathbb {N} _{0}}$ and let ${\displaystyle f,g:D\to \mathbb {R} }$ be ${\displaystyle n}$ times differentiable functions. Then ${\displaystyle fg:D\to \mathbb {R} }$ is also ${\displaystyle n}$ times differentiable, and we have:

${\displaystyle (fg)^{(n)}=\sum _{k=0}^{n}{\binom {n}{k}}f^{(k)}g^{(n-k)}}$

Example (Leibniz rule for derivatives)

Using the Leibniz rule we calculate ${\displaystyle (x^{3}e^{x})^{(2016)}}$. The rule is applicable because ${\displaystyle x\mapsto x^{3}}$ and ${\displaystyle x\mapsto e^{x}}$ are arbitrarily often differentiable on ${\displaystyle \mathbb {R} }$. We have

{\displaystyle {\begin{aligned}(x^{3}e^{x})^{(2016)}&{\underset {\text{rule}}{\overset {\text{Leibniz}}{=}}}\sum _{k=0}^{2016}{\binom {2016}{k}}(x^{3})^{(k)}(e^{x})^{(2016-k)}\\[0.3em]&\color {Gray}\left\downarrow \ (x^{3})'=3x^{2},\ (x^{3})''=6x,\ (x^{3})^{(3)}=6,\ (x^{3})^{(k)}=0{\text{ for }}k\geq 4,\ (e^{x})^{(k)}=e^{x}{\text{ for all }}k\in \mathbb {N} \right.\\[0.3em]&={\binom {2016}{0}}x^{3}e^{x}+{\binom {2016}{1}}\cdot 3x^{2}e^{x}+{\binom {2016}{2}}\cdot 6xe^{x}+{\binom {2016}{3}}\cdot 6e^{x}\\[0.3em]&=x^{3}e^{x}+2016\cdot 3x^{2}e^{x}+{\frac {2016\cdot 2015}{2}}\cdot 6xe^{x}+{\frac {2016\cdot 2015\cdot 2014}{3\cdot 2}}\cdot 6e^{x}\\[0.3em]&=x^{3}e^{x}+2016\cdot 3x^{2}e^{x}+2016\cdot 2015\cdot 3xe^{x}+2016\cdot 2015\cdot 2014e^{x}\end{aligned}}}
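The result of this example can be cross-checked with exact integer arithmetic. The sketch below rests on assumptions that are not part of the original text: a polynomial is stored as a coefficient list, all function names are hypothetical, and the direct computation uses the product-rule identity ${\displaystyle (p(x)e^{x})'=(p(x)+p'(x))e^{x}}$ instead of the Leibniz rule.

```python
from math import comb


def differentiate_poly(coeffs):
    """Derivative of a polynomial [c0, c1, ...] (c_i multiplies x**i)."""
    return [i * c for i, c in enumerate(coeffs)][1:]


def add_poly(p, q):
    """Coefficient-wise sum, padding the shorter list with zeros."""
    length = max(len(p), len(q))
    p = p + [0] * (length - len(p))
    q = q + [0] * (length - len(q))
    return [x + y for x, y in zip(p, q)]


def direct(p, n):
    """Polynomial factor of (p(x) e^x)^(n), via (p e^x)' = (p + p') e^x."""
    for _ in range(n):
        p = add_poly(p, differentiate_poly(p))
    return p


def leibniz(p, n):
    """Polynomial factor of (p(x) e^x)^(n), via the Leibniz rule.

    Every derivative of e^x is e^x, so only binom(n, k) * p^(k) remains.
    """
    result = [0]
    derivative = p
    for k in range(n + 1):
        if not derivative:          # all further derivatives of p vanish
            break
        result = add_poly(result, [comb(n, k) * c for c in derivative])
        derivative = differentiate_poly(derivative)
    return result


x_cubed = [0, 0, 0, 1]
print(leibniz(x_cubed, 2016))
print(direct(x_cubed, 2016) == leibniz(x_cubed, 2016))
```

The printed list contains the coefficients of ${\displaystyle x^{0},x^{1},x^{2},x^{3}}$ in the polynomial that multiplies ${\displaystyle e^{x}}$; they can be compared with the last line of the calculation above.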

Proof (Leibniz rule for derivatives)

Statement to be proven for all ${\displaystyle n\in \mathbb {N} _{0}}$:

${\displaystyle (fg)^{(n)}=\sum _{k=0}^{n}{\binom {n}{k}}f^{(k)}g^{(n-k)}}$

1. Base case:

${\displaystyle (fg)^{(0)}=fg=\sum _{k=0}^{0}{\binom {0}{k}}f^{(k)}g^{(0-k)}}$

2. Inductive step:

2a. Inductive hypothesis:

${\displaystyle (fg)^{(n)}=\sum _{k=0}^{n}{\binom {n}{k}}f^{(k)}g^{(n-k)}}$

2b. Inductive claim:

${\displaystyle (fg)^{(n+1)}=\sum _{k=0}^{n+1}{\binom {n+1}{k}}f^{(k)}g^{(n+1-k)}}$

2c. Proof of the inductive step:

{\displaystyle {\begin{aligned}(fg)^{(n+1)}&=\left[(fg)^{(n)}\right]'\\[0.3em]&\color {Gray}\left\downarrow \ {\text{ induction assumption}}\right.\\[0.3em]&=\left[\sum _{k=0}^{n}{\binom {n}{k}}f^{(k)}g^{(n-k)}\right]'\\[0.3em]&\color {Gray}\left\downarrow \ {\text{ linearity of the derivative}}\right.\\[0.3em]&=\sum _{k=0}^{n}{\binom {n}{k}}\left[f^{(k)}g^{(n-k)}\right]'\\[0.3em]&\color {Gray}\left\downarrow \ {\text{ product rule}}\right.\\[0.3em]&=\sum _{k=0}^{n}{\binom {n}{k}}\left[f^{(k+1)}g^{(n-k)}+f^{(k)}g^{(n-k+1)}\right]\\[0.3em]&=\sum _{k=0}^{n}{\binom {n}{k}}f^{(k+1)}g^{(n-k)}+\sum _{k=0}^{n}{\binom {n}{k}}f^{(k)}g^{(n-k+1)}\\[0.3em]&\color {Gray}\left\downarrow \ {\text{ index shift}}\right.\\[0.3em]&=\sum _{k=1}^{n+1}{\binom {n}{k-1}}f^{(k)}g^{(n-(k-1))}+\sum _{k=0}^{n}{\binom {n}{k}}f^{(k)}g^{(n-k+1)}\\[0.3em]&\color {Gray}\left\downarrow \ {\binom {n}{-1}}=0{\text{ and }}{\binom {n}{n+1}}=0\right.\\[0.3em]&=\sum _{k=0}^{n+1}{\binom {n}{k-1}}f^{(k)}g^{(n-k+1)}+\sum _{k=0}^{n+1}{\binom {n}{k}}f^{(k)}g^{(n-k+1)}\\[0.3em]&=\sum _{k=0}^{n+1}\left[{\binom {n}{k-1}}+{\binom {n}{k}}\right]f^{(k)}g^{(n-k+1)}\\[0.3em]&\color {Gray}\left\downarrow \ {\text{ sum of binomial coefficients}}\right.\\[0.3em]&=\sum _{k=0}^{n+1}{\binom {n+1}{k}}f^{(k)}g^{((n+1)-k)}\end{aligned}}}
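The Leibniz rule can also be tested on concrete functions. The following sketch checks it for two polynomials with exact integer arithmetic; it is a sanity check rather than a proof, and the coefficient-list representation, the function names and the two sample polynomials are assumptions chosen for illustration.

```python
from math import comb


def differentiate(p, times=1):
    """Differentiate the polynomial [c0, c1, ...] the given number of times."""
    for _ in range(times):
        p = [i * c for i, c in enumerate(p)][1:]
    return p


def multiply(p, q):
    """Product of two polynomials (the empty list is the zero polynomial)."""
    if not p or not q:
        return []
    result = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            result[i + j] += a * b
    return result


def add(p, q):
    """Coefficient-wise sum of two polynomials of possibly different length."""
    length = max(len(p), len(q))
    return [(p[i] if i < len(p) else 0) + (q[i] if i < len(q) else 0)
            for i in range(length)]


def strip(p):
    """Drop trailing zeros so that equal polynomials compare equal."""
    while p and p[-1] == 0:
        p = p[:-1]
    return p


def leibniz(f, g, n):
    """Right-hand side of the Leibniz rule: sum of binom(n,k) f^(k) g^(n-k)."""
    total = []
    for k in range(n + 1):
        term = multiply(differentiate(f, k), differentiate(g, n - k))
        total = add(total, [comb(n, k) * c for c in term])
    return strip(total)


f = [1, 2, 0, 3]   # 1 + 2x + 3x^3 (sample polynomial)
g = [0, -1, 4]     # -x + 4x^2 (sample polynomial)

for n in range(7):
    left = strip(differentiate(multiply(f, g), n))
    print(n, left == leibniz(f, g, n))
```

For every ${\displaystyle n}$ in the loop, differentiating the product directly and summing the Leibniz terms give the same coefficient list, including the case ${\displaystyle n=6}$, where both sides are the zero polynomial.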