Derivative - Sarcasm Wiki

For the algebraic generalization, see Derivation (differential algebra). For other uses, see Derivative (disambiguation).

The graph of a function, drawn in black, with a tangent line drawn in red. The slope of the tangent line equals the derivative of the function at the marked point. The tangent line is the best linear approximation of the function near that point.

The derivative at various points of a function that is differentiable everywhere. In this particular instance, the derivative is calculated as:

$$ \sin(x^2) + 2x^2 \cos(x^2) $$


In mathematics, the derivative is a fundamental concept that measures how a function’s output changes in response to small changes in its input. For a function of a single variable, the derivative at a given input value, when it exists, is the slope of the tangent line to the graph of the function at that point. The tangent line is the best linear approximation of the function near that input. The derivative is often described as the instantaneous rate of change: the ratio of the instantaneous change in the dependent variable to that of the independent variable. ¹ The process of finding a derivative is called differentiation.

Several distinct notations exist for differentiation. The Leibniz notation, attributed to Gottfried Wilhelm Leibniz, expresses the derivative as a ratio of two differentials, while prime notation appends a prime mark to the function symbol. Higher-order derivatives, which result from repeated differentiation, are typically indicated in Leibniz notation by superscripts on the differentials and in prime notation by additional prime marks. Higher-order derivatives find significant application in physics; for instance, the first derivative of an object’s position with respect to time is its velocity, and the second derivative is its acceleration.

Derivatives can be extended to functions of several real variables. In this context, the derivative becomes a linear transformation whose graph, after a suitable translation, is the best linear approximation to the graph of the original function. The Jacobian matrix is the matrix that represents this linear transformation with respect to the basis given by the choice of independent and dependent variables. It can be computed from the partial derivatives with respect to the independent variables. For a real-valued function of several variables, the Jacobian matrix reduces to the gradient vector.

Definition

As a limit

A function of a real variable, denoted $$ f(x) $$, is differentiable at a point $$ a $$ of its domain if its domain contains an open interval containing $$ a $$ and the following limit exists:

$$ L = \lim_{h \to 0} \frac{f(a+h) - f(a)}{h} $$

This means that for every positive real number $$ \varepsilon $$, there is a positive real number $$ \delta $$ such that for all $$ h $$ satisfying $$ |h| < \delta $$ and $$ h \neq 0 $$, the value $$ f(a+h) $$ is defined and the following inequality holds:

$$ \left| L - \frac{f(a+h) - f(a)}{h} \right| < \varepsilon $$

The vertical bars denote the absolute value. This is an instance of the (ε, δ)-definition of limit. ²

If the function $$ f $$ is differentiable at $$ a $$, that is, if the limit $$ L $$ exists, then $$ L $$ is called the derivative of $$ f $$ at $$ a $$. Various notations are used for this derivative. ³ It can be written as $$ f’(a) $$, pronounced “f prime of a,” or as $$\textstyle {\frac {df}{dx}}(a)$$, read as “the derivative of $$ f $$ with respect to $$ x $$ at $$ a $$” or “$$ df $$ by (or over) $$ dx $$ at $$ a $$.” Further details can be found in § Notation. If $$ f $$ has a derivative at every point of its domain, then a derivative function $$ f’ $$ can be defined that maps each point $$ x $$ to the value of the derivative of $$ f $$ at $$ x $$. A function $$ f $$ may have a derivative at some, but not all, points of its domain. In that case the derivative function is still defined, but only at the points where $$ f $$ is differentiable, so its domain may be smaller than the domain of $$ f $$. ⁴

Consider, for instance, the squaring function: $$ f(x) = x^2 $$. The difference quotient in the definition of its derivative is: ⁵

$$ \frac{f(a+h) - f(a)}{h} = \frac{(a+h)^2 - a^2}{h} = \frac{a^2 + 2ah + h^2 - a^2}{h} = 2a + h $$

This simplification is valid as long as $$ h \neq 0 $$. As $$ h $$ approaches zero, the expression $$ 2a + h $$ draws progressively closer to $$ 2a $$. Since this limit exists and equals $$ 2a $$ for any input $$ a $$, the derivative of the squaring function is the doubling function: $$ f’(x) = 2x $$.
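The convergence of the difference quotient can also be observed numerically. The following is a minimal sketch, assuming a hypothetical helper named `difference_quotient`; it is an illustration of the limit above, not a robust way to compute derivatives.

```python
def difference_quotient(f, a, h):
    """Slope of the secant line through (a, f(a)) and (a + h, f(a + h))."""
    return (f(a + h) - f(a)) / h


def square(x):
    return x * x


a = 3.0
# For the squaring function the quotient simplifies to 2a + h,
# so the printed values should approach 2a = 6 as h shrinks.
for h in (1.0, 0.1, 0.01, 0.001):
    print(h, difference_quotient(square, a, h))
```

The printed slopes differ from 6 by roughly the step size $$ h $$, in line with the simplified expression $$ 2a + h $$.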

Geometrically, the ratio in the derivative’s definition represents the slope of the line segment connecting two points on the function’s graph: $$ (a, f(a)) $$ and $$ (a+h, f(a+h)) $$. As $$ h $$ diminishes, these points converge, and the slope of this secant line approaches the limiting value – the slope of the tangent line to the graph of $$ f $$ at $$ a $$. In essence, the derivative is the slope of the tangent. ⁶

Using infinitesimals

An alternative perspective on the derivative $$\textstyle {\frac {df}{dx}}(a)$$ views it as the ratio of an infinitesimal change in the function’s output $$ f $$ to an infinitesimal change in its input. ⁷ To formalize this intuition, a system for manipulating infinitesimal quantities is necessary. ⁸ The hyperreal numbers provide such a system, allowing for the rigorous treatment of infinite and infinitesimal quantities. The hyperreals are an extension of the real numbers that includes numbers larger than any finite sum of ones. Such numbers are infinite, and their reciprocals are infinitesimals. This approach to the foundations of calculus, known as nonstandard analysis, gives a precise meaning to the $$ d $$ in Leibniz notation. Thus, the derivative of $$ f(x) $$ can be expressed as:

$$ f’(x) = \operatorname{st} \left(\frac{f(x+dx) - f(x)}{dx}\right) $$

for an arbitrary nonzero infinitesimal $$ dx $$, where $$ \operatorname{st} $$ denotes the standard part function, which rounds each finite hyperreal number to the nearest real number. ⁹ Applying this to the squaring function $$ f(x) = x^2 $$ gives:

$$ \begin{aligned} f’(x) &= \operatorname{st} \left(\frac{x^2 + 2x \cdot dx + (dx)^2 - x^2}{dx}\right) \\ &= \operatorname{st} \left(\frac{2x \cdot dx + (dx)^2}{dx}\right) \\ &= \operatorname{st} \left(\frac{2x \cdot dx}{dx} + \frac{(dx)^2}{dx}\right) \\ &= \operatorname{st}(2x + dx) \\ &= 2x \end{aligned} $$

Continuity and differentiability

This function fails to have a derivative at the marked point because it has a jump discontinuity there.

The absolute value function, though continuous, is not differentiable at $$ x=0 $$ because the slopes of the tangent lines do not converge to the same value from both the left and the right.

If a function $$ f $$ is differentiable at $$ a $$, it must also be continuous at $$ a $$. ¹⁰ Consider, for example, a step function that yields 1 for all $$ x < a $$ and a different value, 10, for all $$ x \geq a $$. This function cannot possess a derivative at $$ a $$. For negative $$ h $$, $$ a+h $$ lies on the lower segment of the step, making the secant line from $$ a $$ to $$ a+h $$ extremely steep, with its slope approaching infinity as $$ h $$ tends to zero. Conversely, for positive $$ h $$, $$ a+h $$ is on the upper segment, resulting in a secant line slope of zero. Consequently, the secant slopes do not converge to a single value, and the limit of the difference quotient does not exist.

However, continuity at a point does not guarantee differentiability. The absolute value function, $$ f(x) = |x| $$, is continuous at $$ x=0 $$ but not differentiable there. For positive $$ h $$, the slope of the secant line from 0 to $$ h $$ is 1; for negative $$ h $$, the slope of the secant line from 0 to $$ h $$ is $$ -1 $$. ¹¹ Graphically, this appears as a “kink” or a “cusp” at $$ x=0 $$. Even functions with smooth graphs may fail to be differentiable at points where the tangent line is vertical. For instance, the function $$ f(x) = x^{1/3} $$ is not differentiable at $$ x=0 $$. In summary, differentiability implies continuity, but continuity does not imply differentiability. ¹²
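The one-sided behaviour described above can be checked numerically. This is a small sketch, assuming a hypothetical helper `secant_slope` and a step function with its jump placed at 0 for concreteness.

```python
def secant_slope(f, a, h):
    """Slope of the secant line from a to a + h."""
    return (f(a + h) - f(a)) / h


def step(x):
    """Step function with a jump at 0: 1 to the left, 10 at 0 and to the right."""
    return 1.0 if x < 0 else 10.0


steps = (0.1, 0.01, 0.001)

# Absolute value: slopes are +1 from the right and -1 from the left.
abs_right = [secant_slope(abs, 0.0, h) for h in steps]
abs_left = [secant_slope(abs, 0.0, -h) for h in steps]

# Step function: slopes are 0 from the right and grow without bound from the left.
step_right = [secant_slope(step, 0.0, h) for h in steps]
step_left = [secant_slope(step, 0.0, -h) for h in steps]

print(abs_right, abs_left)
print(step_right, step_left)
```

In both cases the left and right secant slopes do not approach a common value, so the limit in the definition does not exist at 0.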

The vast majority of functions encountered in practice are differentiable either everywhere or almost everywhere. Historically, many mathematicians assumed that a continuous function is differentiable at most points. ¹³ Under mild conditions, for example when the function is monotone or Lipschitz, this is true. However, in 1872, Weierstrass found the first example of a function that is continuous everywhere but differentiable nowhere, now known as the Weierstrass function. ¹⁴ Later, in 1931, Stefan Banach proved that the set of functions that have a derivative at some point is a meager set in the space of all continuous functions. Informally, this means that, in the topological sense of Baire category, a typical continuous function fails to be differentiable at even a single point. ¹⁵

Notation

Main article: Notation for differentiation

A widely used notation for the derivative of a function is the Leibniz notation, introduced by Gottfried Wilhelm Leibniz in 1675. It writes a derivative as a quotient of two differentials, such as $$ dy $$ and $$ dx $$. ¹⁶ It remains common when the equation $$ y = f(x) $$ is viewed as a functional relationship between dependent and independent variables. The first derivative is denoted $$\textstyle {\frac {dy}{dx}}$$, read as “the derivative of $$ y $$ with respect to $$ x $$.” ¹⁷ This derivative can also be interpreted as the application of a differential operator to a function:

$$ \frac{dy}{dx} = \frac{d}{dx} f(x) $$

Higher-order derivatives are expressed using the notation $$\textstyle {\frac {d^{n}y}{dx^{n}}}$$ for the $$ n $$-th derivative of $$ y = f(x) $$. These notations signify repeated applications of the derivative operator; for example, $$\textstyle {\frac {d^{2}y}{dx^{2}}} = {\frac {d}{dx}}\left({\frac {d}{dx}}f(x)\right)$$. ¹⁸ Unlike some alternative notations, Leibniz notation explicitly specifies the variable of differentiation in the denominator, thereby eliminating ambiguity when dealing with multiple interrelated quantities. The derivative of a composite function can be expressed using the chain rule: if $$ u = g(x) $$ and $$ y = f(g(x)) $$, then: ¹⁹

$$ \frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx} $$

Another prevalent notation, introduced by Joseph-Louis Lagrange and now known as prime notation, utilizes a prime mark (’) appended to the function symbol. ²⁰ The first derivative is written as $$ f’(x) $$, read as “f prime of x,” or as $$ y’ $$, read as “y prime.” ²¹ Subsequent derivatives are denoted by additional prime marks: $$ f’’ $$ for the second derivative and $$ f’’’ $$ for the third. ²² For higher orders, some authors employ Roman numerals in superscripts (e.g., $$ f^{\mathrm{iv}} $$), while others use parenthesized superscripts, such as $$ f^{(4)} $$. ²³ The latter notation generalizes to $$ f^{(n)} $$ for the $$n$$-th derivative of $$ f $$. ¹⁸

In Newton’s notation, also known as dot notation, a dot placed above a variable signifies its derivative with respect to time. If $$ y $$ is a function of $$ t $$, the first and second derivatives are written as $$ \dot{y} $$ and $$ \ddot{y} $$, respectively. This notation is used almost exclusively for derivatives with respect to time or arc length and is common in differential equations in physics and differential geometry. However, it becomes unwieldy for higher-order derivatives (order 4 and above) and cannot handle multiple independent variables.

A further notation is the D-notation, where the differential operator is symbolized by $$ D $$. ¹⁸ The first derivative is written as $$ Df(x) $$, and higher derivatives are indicated by superscripts, so the $$n$$-th derivative is $$ D^n f(x) $$. While sometimes attributed to Leonhard Euler, this notation appears to have been introduced by Louis François Antoine Arbogast. ²⁴ For partial derivatives, the variable of differentiation is denoted by a subscript; for example, the partial derivative of a function $$ u = f(x,y) $$ with respect to $$ x $$ can be written as $$ D_x u $$ or $$ D_x f(x,y) $$. Higher-order partial derivatives are indicated by superscripts or multiple subscripts, such as $$ D_{xy}f(x,y) = {\frac {\partial }{\partial y}}\left({\frac {\partial }{\partial x}}f(x,y)\right) $$ and $$ \textstyle D_{x}^{2}f(x,y)={\frac {\partial }{\partial x}}\left({\frac {\partial }{\partial x}}f(x,y)\right) $$. ²⁵

Rules of computation

Main article: Differentiation rules

Theoretically, a function’s derivative can be computed directly from its definition by examining the difference quotient and evaluating its limit. However, once the derivatives of a few fundamental functions are established, calculating the derivatives of more complex functions becomes significantly simpler by applying a set of rules. This systematic process of finding derivatives is termed differentiation. ²⁶

Rules for basic functions

The following rules give the derivatives of the most common elementary functions. Here, $$ a $$ represents a real number, and $$ e $$ is the base of the natural logarithm, approximately 2.71828. ²⁷

Power rule:
$$ \frac{d}{dx} x^a = ax^{a-1} $$
Derivatives of exponential and logarithmic functions, including those with a general base:
$$ \frac{d}{dx} e^x = e^x $$
$$ \frac{d}{dx} a^x = a^x \ln(a) $$ for $$ a > 0 $$
$$ \frac{d}{dx} \ln(x) = \frac{1}{x} $$ for $$ x > 0 $$
$$ \frac{d}{dx} \log_a(x) = \frac{1}{x \ln(a)} $$ for $$ x > 0 $$, $$ a > 0 $$, $$ a \neq 1 $$
Trigonometric functions:
$$ \frac{d}{dx} \sin(x) = \cos(x) $$
$$ \frac{d}{dx} \cos(x) = -\sin(x) $$
$$ \frac{d}{dx} \tan(x) = \sec^2(x) = \frac{1}{\cos^2(x)} = 1 + \tan^2(x) $$
Inverse trigonometric functions:
$$ \frac{d}{dx} \arcsin(x) = \frac{1}{\sqrt{1-x^2}} $$ for $$ -1 < x < 1 $$
$$ \frac{d}{dx} \arccos(x) = -\frac{1}{\sqrt{1-x^2}} $$ for $$ -1 < x < 1 $$
$$ \frac{d}{dx} \arctan(x) = \frac{1}{1+x^2} $$
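The rules above can be spot-checked numerically. The sketch below compares each closed-form derivative with a central difference at one sample point; the helper name `central_difference`, the sample point 0.7, and the step size are arbitrary choices made for this illustration, and agreement at one point is a sanity check rather than a proof.

```python
import math


def central_difference(f, x, h=1e-6):
    """Symmetric difference quotient, a numerical estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


x = 0.7  # sample point inside the domain of every function below

# name -> (function, value of the derivative at x predicted by the rule)
checks = {
    "x^3": (lambda t: t**3, 3 * x**2),
    "e^x": (math.exp, math.exp(x)),
    "2^x": (lambda t: 2.0**t, 2.0**x * math.log(2.0)),
    "ln x": (math.log, 1 / x),
    "log_10 x": (math.log10, 1 / (x * math.log(10.0))),
    "sin x": (math.sin, math.cos(x)),
    "cos x": (math.cos, -math.sin(x)),
    "tan x": (math.tan, 1 / math.cos(x) ** 2),
    "arcsin x": (math.asin, 1 / math.sqrt(1 - x * x)),
    "arccos x": (math.acos, -1 / math.sqrt(1 - x * x)),
    "arctan x": (math.atan, 1 / (1 + x * x)),
}

errors = {name: abs(central_difference(f, x) - expected)
          for name, (f, expected) in checks.items()}
max_error = max(errors.values())
print(max_error)
```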

Rules for combined functions

These rules enable the derivation of derivatives for complex functions by combining the derivatives of simpler ones: ²⁸

Constant rule: If $$ f $$ is a constant function, then for all $$ x $$, $$ f’(x) = 0 $$.
Sum rule: For any differentiable functions $$ f $$ and $$ g $$, and any real numbers $$ \alpha $$ and $$ \beta $$,
$$ (\alpha f + \beta g)’ = \alpha f’ + \beta g’ $$
Product rule: For any differentiable functions $$ f $$ and $$ g $$,
$$ (fg)’ = f’g + fg’ $$
This rule also covers the case of a constant multiplied by a function, $$ (\alpha f)’ = \alpha f’ $$, since the derivative of a constant is zero.
Quotient rule: For any differentiable functions $$ f $$ and $$ g $$, at every point $$ x $$ where $$ g(x) \neq 0 $$,
$$ \left(\frac{f}{g}\right)’ = \frac{f’g - fg’}{g^2} $$
Chain rule for composite functions: If $$ f(x) = h(g(x)) $$, then:
$$ f’(x) = h’(g(x)) \cdot g’(x) $$
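One way to see these rules working together is forward-mode automatic differentiation with dual numbers. That technique is not described in this article and is used here only as an illustration: each quantity carries its value and its derivative, and each arithmetic operation applies the matching rule. The names `Dual`, `lift`, `sin` and `derivative` are hypothetical choices for this sketch.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value paired with its derivative with respect to the input x."""
    value: float
    deriv: float

    def __add__(self, other):
        other = lift(other)
        # Sum rule: (f + g)' = f' + g'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = lift(other)
        # Product rule: (fg)' = f'g + fg'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__

    def __truediv__(self, other):
        other = lift(other)
        # Quotient rule: (f/g)' = (f'g - fg') / g^2
        return Dual(self.value / other.value,
                    (self.deriv * other.value - self.value * other.deriv)
                    / other.value ** 2)


def lift(x):
    """Treat plain numbers as constants (constant rule: derivative 0)."""
    return x if isinstance(x, Dual) else Dual(float(x), 0.0)


def sin(u):
    """Chain rule: (sin(g))' = cos(g) * g'."""
    return Dual(math.sin(u.value), math.cos(u.value) * u.deriv)


def derivative(f, x):
    """Evaluate f at x with derivative seed 1 and read off f'(x)."""
    return f(Dual(x, 1.0)).deriv


# x * sin(x^2) has derivative sin(x^2) + 2x^2 cos(x^2).
x0 = 1.5
computed = derivative(lambda x: x * sin(x * x), x0)
expected = math.sin(x0**2) + 2 * x0**2 * math.cos(x0**2)
print(computed, expected)
```

The sketch supports only addition, multiplication, division and sine; other operations would need their own rules.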

Computation example

Let’s find the derivative of the function:

$$ f(x) = x^4 + \sin(x^2) - \ln(x)e^x + 7 $$

Applying the differentiation rules:

$$ \begin{aligned} f’(x) &= 4x^{(4-1)} + \frac{d(x^2)}{dx} \cos(x^2) - \left(\frac{d(\ln x)}{dx} e^x + \ln(x) \frac{d(e^x)}{dx}\right) + 0 \\ &= 4x^3 + 2x \cos(x^2) - \left(\frac{1}{x} e^x + \ln(x) e^x\right) \\ &= 4x^3 + 2x \cos(x^2) - \frac{1}{x} e^x - \ln(x) e^x \end{aligned} $$

In this calculation, the chain rule was used for the second term, and the product rule for the term involving $$ \ln(x)e^x $$. We also utilized the known derivatives of the elementary functions $$ x^2 $$, $$ x^4 $$, $$ \sin(x) $$, $$ \ln(x) $$, and $$ e^x $$, along with the constant 7.
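As a sanity check, the result can be compared with a numerical estimate. The sketch below assumes a hypothetical `central_difference` helper and a few arbitrary sample points with $$ x > 0 $$, where $$ \ln(x) $$ is defined.

```python
import math


def f(x):
    return x**4 + math.sin(x**2) - math.log(x) * math.exp(x) + 7


def f_prime(x):
    """The derivative obtained above from the differentiation rules."""
    return (4 * x**3 + 2 * x * math.cos(x**2)
            - math.exp(x) / x - math.log(x) * math.exp(x))


def central_difference(g, x, h=1e-6):
    """Symmetric difference quotient, a numerical estimate of g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)


sample_points = (0.5, 1.0, 2.0)
for x in sample_points:
    print(x, f_prime(x), central_difference(f, x))
```

The two printed columns should agree to several decimal places at each sample point.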

Antidifferentiation

Main article: Antiderivative

An antiderivative of a function $$ f $$ is another function whose derivative is $$ f $$. Antiderivatives are not unique; if $$ A $$ is an antiderivative of $$ f $$, then so is $$ A + c $$, where $$ c $$ is any constant, because the derivative of a constant is always zero. ²⁹ The fundamental theorem of calculus establishes a profound connection between differentiation and antidifferentiation, showing that finding an antiderivative provides a method for calculating areas bounded by a function. Specifically, the integral of a function over a closed interval is equal to the difference between the values of an antiderivative evaluated at the interval’s endpoints. ³⁰
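The relationship between integrals and antiderivatives can be illustrated numerically. The sketch below approximates the integral of the cosine function with a midpoint sum and compares it with the difference of an antiderivative, the sine function, at the endpoints. The helper `midpoint_integral`, the interval $$ [0, 2] $$ and the number of subintervals are assumptions of this example.

```python
import math


def midpoint_integral(f, a, b, n=10_000):
    """Approximate the integral of f over [a, b] with n midpoint rectangles."""
    width = (b - a) / n
    return width * sum(f(a + (i + 0.5) * width) for i in range(n))


a, b = 0.0, 2.0
area = midpoint_integral(math.cos, a, b)

# sin is an antiderivative of cos, so the integral should equal sin(b) - sin(a).
from_antiderivative = math.sin(b) - math.sin(a)

# Adding a constant to the antiderivative leaves the difference unchanged.
shifted = (math.sin(b) + 5.0) - (math.sin(a) + 5.0)

print(area, from_antiderivative, shifted)
```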

Higher-order derivatives

Higher-order derivatives are obtained by repeatedly differentiating a function. If $$ f $$ is a differentiable function, its first derivative is denoted by $$ f’ $$. The derivative of $$ f’ $$ is the second derivative, denoted as $$ f’’ $$, and the derivative of $$ f’’ $$ is the third derivative, denoted as $$ f’’’ $$. This process can be continued, if the derivatives exist, to give the $$n$$-th derivative, denoted as $$ f^{(n)} $$, which is the derivative of the $$ (n-1) $$-th derivative. ³¹ A function that has $$ k $$ successive derivatives is called $$ k $$-times differentiable. If the $$k$$-th derivative is continuous, the function is said to be of differentiability class $$ C^k $$. ³² Functions that have derivatives of every order are called infinitely differentiable or smooth. ³³ All polynomial functions are infinitely differentiable; repeated differentiation eventually leads to a constant function, all subsequent derivatives of which are zero. ³⁴
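The behaviour of polynomials under repeated differentiation can be sketched with coefficient lists. In the example below, a polynomial is stored as a list whose entry at index $$ k $$ is the coefficient of $$ x^k $$; this representation and the function name `differentiate` are choices made for the illustration.

```python
def differentiate(coeffs):
    """Differentiate a polynomial given as [c0, c1, c2, ...] for c0 + c1*x + c2*x^2 + ...

    The power rule sends c_k * x^k to k * c_k * x^(k-1); the constant term drops out.
    The zero polynomial is represented as [0].
    """
    return [k * c for k, c in enumerate(coeffs)][1:] or [0]


p = [7, 0, -3, 0, 1]  # x^4 - 3x^2 + 7

# Differentiate repeatedly until the zero polynomial is reached.
derivatives = [p]
while derivatives[-1] != [0]:
    derivatives.append(differentiate(derivatives[-1]))

for order, coeffs in enumerate(derivatives):
    print(order, coeffs)
```

For this degree-4 polynomial the fourth derivative is the constant 24, and the fifth and all later derivatives are zero.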

An important application of higher-order derivatives lies in physics. If a function describes the position of an object over time, its first derivative with respect to time is the object’s velocity, the second derivative is its acceleration, ²⁶ and the third derivative is its jerk. ³⁵

In other dimensions

See also: Vector calculus and Multivariable calculus

Vector-valued functions

A vector-valued function $$ \mathbf{y} $$ of a real variable maps real numbers to vectors in a vector space $$ \mathbb{R}^n $$. Such a function can be split into its coordinate functions $$ y_1(t), y_2(t), \dots, y_n(t) $$, so that $$ \mathbf{y}(t) = (y_1(t), y_2(t), \dots, y_n(t)) $$. This includes, for example, parametric curves in $$ \mathbb{R}^2 $$ or $$ \mathbb{R}^3 $$. Since the coordinate functions are real-valued, the standard definition of the derivative applies to them. The derivative of $$ \mathbf{y}(t) $$ is defined as the vector, called the tangent vector, whose coordinates are the derivatives of the coordinate functions. Equivalently: ³⁶

$$ \mathbf{y}’(t) = \lim_{h \to 0} \frac{\mathbf{y}(t+h) - \mathbf{y}(t)}{h} $$

This limit must exist for the derivative to be defined. The subtraction in the numerator refers to vector subtraction. If $$ \mathbf{y} $$ is differentiable for all values of $$ t $$, then $$ \mathbf{y}’ $$ is another vector-valued function. ³⁶
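A minimal numerical sketch of the tangent vector, assuming the unit circle $$ \mathbf{y}(t) = (\cos t, \sin t) $$ as an example curve. The helper `tangent_vector` uses a symmetric difference quotient in each coordinate, which approximates the same limit for a differentiable curve; both names are hypothetical.

```python
import math


def curve(t):
    """Unit circle, a parametric curve in the plane."""
    return (math.cos(t), math.sin(t))


def tangent_vector(y, t, h=1e-6):
    """Estimate y'(t) coordinate by coordinate with a symmetric difference quotient."""
    forward, backward = y(t + h), y(t - h)
    return tuple((f - b) / (2 * h) for f, b in zip(forward, backward))


t = math.pi / 3
estimate = tangent_vector(curve, t)
exact = (-math.sin(t), math.cos(t))  # derivatives of the coordinate functions
print(estimate, exact)
```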

Partial derivatives

Main article: Partial derivative

Functions can depend on more than one variable. A partial derivative of a multivariable function is its derivative with respect to one of those variables, with all other variables held constant. Partial derivatives are used in vector calculus and differential geometry. Various notations exist, including $$ f_x $$, $$ f’_x $$, $$ \partial_x f $$, $$ {\frac {\partial }{\partial x}}f $$, or $$ {\frac {\partial f}{\partial x}} $$ for the partial derivative of $$ f(x,y,\dots) $$ with respect to $$ x $$. ³⁷ It measures the rate of change of the function in the $$ x $$-direction. ³⁸ The symbol $$ \partial $$ (a rounded ’d’) is known as the partial derivative symbol. It is sometimes pronounced “der,” “del,” or “partial” to distinguish it from the letter ’d’. ³⁹ For example, given $$ f(x,y) = x^2 + xy + y^2 $$, the partial derivatives with respect to $$ x $$ and $$ y $$ are:

$$ \frac{\partial f}{\partial x} = 2x + y, \qquad \frac{\partial f}{\partial y} = x + 2y $$

In general, the partial derivative of a function $$ f(x_1, \dots, x_n) $$ with respect to the variable $$ x_i $$ at the point $$ (a_1, \dots, a_n) $$ is defined as: ⁴⁰

$$ \frac{\partial f}{\partial x_i}(a_1, \dots, a_n) = \lim_{h \to 0} \frac{f(a_1, \dots, a_i+h, \dots, a_n) - f(a_1, \dots, a_i, \dots, a_n)}{h} $$
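The definition can be tried out on the example $$ f(x,y) = x^2 + xy + y^2 $$ from above. The sketch below varies one coordinate while holding the others fixed; the helper name `partial_derivative`, the symmetric difference quotient, and the sample point $$ (1, 2) $$ are assumptions of this illustration.

```python
def partial_derivative(f, point, i, h=1e-6):
    """Estimate the partial derivative of f with respect to its i-th argument at point.

    Only coordinate i is changed; all other coordinates are held constant.
    """
    up = list(point)
    down = list(point)
    up[i] += h
    down[i] -= h
    return (f(*up) - f(*down)) / (2 * h)


def f(x, y):
    return x**2 + x * y + y**2


point = (1.0, 2.0)
df_dx = partial_derivative(f, point, 0)  # formula above: 2x + y = 4
df_dy = partial_derivative(f, point, 1)  # formula above: x + 2y = 5
print(df_dx, df_dy)
```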

This concept is foundational for studying functions of several real variables. For a real-valued function $$ f(x_1, \dots, x_n) $$, if all its partial derivatives exist at a point $$ (a_1, \dots, a_n) $$, these derivatives form the gradient of $$ f $$ at that point:

$$ \nabla f(a_1, \dots, a_n) = \left({\frac {\partial f}{\partial x_{1}}}(a_{1},\ldots ,a_{n}),\ldots ,{\frac {\partial f}{\partial x_{n}}}(a_{1},\ldots ,a_{n})\right) $$

If $$ f $$ is differentiable at every point of some domain, the gradient $$ \nabla f $$ is a vector-valued function that maps each point to its gradient vector, and thus defines a vector field. ⁴¹

Directional derivatives

Main article: Directional derivative

For a real-valued function $$ f $$ defined on $$ \mathbb{R}^n $$, its partial derivatives measure variations along the coordinate axes. For instance, in a function of $$ x $$ and $$ y $$, these derivatives indicate changes along the $$ x $$ and $$ y $$ directions, but not along arbitrary directions like the line $$ y=x $$. This is where directional derivatives come into play. Given a vector $$ \mathbf{v} = (v_1, \dots, v_n) $$, the directional derivative of $$ f $$ in the direction of $$ \mathbf{v} $$ at point $$ \mathbf{x} $$ is defined as: ⁴²

$$ D_{\mathbf{v}} {f}(\mathbf{x}) = \lim_{h \rightarrow 0} \frac{f(\mathbf{x} + h\mathbf{v}) - f(\mathbf{x})}{h} $$

If all partial derivatives of $$ f $$ exist and are continuous at $$ \mathbf{x} $$, the directional derivative can be computed using the formula: ⁴³

$$ D_{\mathbf{v}} {f}(\mathbf{x}) = \sum_{j=1}^{n} v_j \frac{\partial f}{\partial x_j} $$
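The two expressions for the directional derivative can be compared numerically. The sketch below reuses $$ f(x,y) = x^2 + xy + y^2 $$ and its partial derivatives from the previous section; the point $$ (1, 2) $$, the direction $$ (3, 4) $$ and the function names are arbitrary choices for this illustration.

```python
def f(x, y):
    return x**2 + x * y + y**2


def gradient_f(x, y):
    """Partial derivatives of f: (2x + y, x + 2y)."""
    return (2 * x + y, x + 2 * y)


def directional_derivative_limit(func, point, v, h=1e-6):
    """Difference quotient from the limit definition, with a small fixed h."""
    shifted = [p + h * c for p, c in zip(point, v)]
    return (func(*shifted) - func(*point)) / h


def directional_derivative_formula(grad, point, v):
    """Sum of v_j times the j-th partial derivative at the point."""
    return sum(vj * gj for vj, gj in zip(v, grad(*point)))


point = (1.0, 2.0)
v = (3.0, 4.0)
from_limit = directional_derivative_limit(f, point, v)
from_formula = directional_derivative_formula(gradient_f, point, v)
print(from_limit, from_formula)
```

At this point the gradient is $$ (4, 5) $$, so the formula gives $$ 3 \cdot 4 + 4 \cdot 5 = 32 $$, and the difference quotient agrees up to an error proportional to $$ h $$.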

Total derivative and Jacobian matrix

Main article: Total derivative

When $$ f $$ is a function mapping an open subset of $$ \mathbb{R}^n $$ to $$ \mathbb{R}^m $$, the directional derivative at a given point represents the best linear approximation of $$ f $$ in that specific direction. However, for $$ n > 1 $$, a single directional derivative does not fully capture the function’s behavior. The total derivative provides a comprehensive view by considering all directions simultaneously. It ensures that for any vector $$ \mathbf{v} $$ originating from $$ \mathbf{a} $$, the linear approximation formula holds: ⁴⁴

$$ f(\mathbf{a} + \mathbf{v}) \approx f(\mathbf{a}) + f’(\mathbf{a})\mathbf{v} $$

Similar to the single-variable case, $$ f’(\mathbf{a}) $$ is chosen to minimize the approximation error. The total derivative of $$ f $$ at $$ \mathbf{a} $$ is the unique linear transformation $$ f’(\mathbf{a}) \colon \mathbb{R}^n \to \mathbb{R}^m $$ satisfying: ⁴⁴

$$ \lim_{\mathbf{h} \to 0} \frac{\lVert f(\mathbf{a} + \mathbf{h}) - (f(\mathbf{a}) + f’(\mathbf{a})\mathbf{h}) \rVert}{\lVert \mathbf{h} \rVert} = 0 $$

Here, $$ \mathbf{h} $$ is a vector in $$ \mathbb{R}^n $$, and the norm in the denominator is the standard Euclidean norm. The term $$ f’(\mathbf{a})\mathbf{h} $$ is a vector in $$ \mathbb{R}^m $$, and the norm in the numerator is the standard Euclidean norm in $$ \mathbb{R}^m $$. ⁴⁴ When $$ \mathbf{v} $$ is a vector starting at $$ \mathbf{a} $$, $$ f’(\mathbf{a})\mathbf{v} $$ is referred to as the pushforward of $$ \mathbf{v} $$ by $$ f $$. ⁴⁵

If the total derivative exists at $$ \mathbf{a} $$, then all partial and directional derivatives of $$ f $$ also exist at $$ \mathbf{a} $$, and for any vector $$ \mathbf{v} $$, $$ f’(\mathbf{a})\mathbf{v} $$ equals the directional derivative of $$ f $$ in the direction of $$ \mathbf{v} $$. If $$ f $$ is expressed in terms of its coordinate functions, $$ f = (f_1, f_2, \dots, f_m) $$, the total derivative can be represented by a matrix, known as the Jacobian matrix of $$ f $$ at $$ \mathbf{a} $$: ⁴⁶

$$ f’(\mathbf{a}) = \operatorname{Jac}_{\mathbf{a}} = \left(\frac{\partial f_{i}}{\partial x_{j}}\right)_{ij} $$
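A Jacobian matrix can be estimated numerically by differencing each coordinate function with respect to each input variable. The sketch below assumes the polar-to-Cartesian map $$ (r, \theta) \mapsto (r\cos\theta, r\sin\theta) $$ as an example; the helper `jacobian` and the sample point are hypothetical choices for this illustration.

```python
import math


def polar_to_cartesian(r, theta):
    """Example map from R^2 to R^2."""
    return (r * math.cos(theta), r * math.sin(theta))


def jacobian(f, a, h=1e-6):
    """Estimate the Jacobian of f at a; entry [i][j] approximates d f_i / d x_j."""
    n = len(a)
    m = len(f(*a))
    columns = []
    for j in range(n):
        up = list(a)
        down = list(a)
        up[j] += h
        down[j] -= h
        f_up, f_down = f(*up), f(*down)
        columns.append([(f_up[i] - f_down[i]) / (2 * h) for i in range(m)])
    # Transpose so that rows index the coordinate functions f_i.
    return [[columns[j][i] for j in range(n)] for i in range(m)]


a = (2.0, math.pi / 6)
J = jacobian(polar_to_cartesian, a)

r, theta = a
expected = [[math.cos(theta), -r * math.sin(theta)],
            [math.sin(theta), r * math.cos(theta)]]
print(J)
print(expected)
```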

Generalizations

Main article: Generalizations of the derivative

The concept of the derivative has been extended to numerous other mathematical contexts. The unifying principle is that the derivative of a function at a point provides a linear approximation of that function around that point.

A significant generalization pertains to complex functions of complex variables. When $$ \mathbb{C} $$ is identified with $$ \mathbb{R}^2 $$ (by writing $$ z = x + iy $$), a differentiable complex function is also differentiable as a real function from $$ \mathbb{R}^2 $$ to $$ \mathbb{R}^2 $$. However, the converse is not true in general. A complex derivative exists only if the real derivative is complex linear, which imposes relations between the partial derivatives called the Cauchy–Riemann equations; see holomorphic functions. ⁴⁷ ⁴⁸
Another generalization applies to functions between differentiable or smooth manifolds. Intuitively, a manifold $$ M $$ can be locally approximated by a vector space known as its tangent space; a classic example is a smooth surface in $$ \mathbb{R}^3 $$. The derivative (or differential) of a differentiable map $$ f: M \to N $$ between manifolds, at a point $$ x $$ in $$ M $$, is a linear map from the tangent space of $$ M $$ at $$ x $$ to the tangent space of $$ N $$ at $$ f(x) $$. This definition is fundamental in differential geometry. ⁴⁹
Differentiation can also be defined for maps between vector spaces, such as Banach spaces. These generalizations include the Gateaux derivative and the Fréchet derivative. ⁵⁰
A limitation of the classical derivative is that many functions are not differentiable. The concept of the weak derivative addresses this by extending the notion of differentiability to a broader class of functions, including all continuous functions, by working within the framework of distributions and requiring differentiability only “on average.” ⁵¹
The properties of the derivative have inspired algebraic and topological concepts like differential algebra, which studies derivations on structures of abstract algebra such as rings, ideals, and fields. ⁵²
The discrete analogue of differentiation is finite differences. The study of differential calculus is unified with the calculus of finite differences in time scale calculus. ⁵⁵
The arithmetic derivative is defined for integers based on their prime factorization, drawing an analogy to the product rule. ⁵⁶
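As an illustration of the last item, the sketch below computes the arithmetic derivative for non-negative integers. It assumes the usual convention that the derivative of 0 and of 1 is 0, that every prime has derivative 1, and that the product rule $$ (ab)’ = a’b + ab’ $$ holds; under these assumptions each prime factor $$ p $$ of $$ n $$, counted with multiplicity, contributes $$ n/p $$. The function name `arithmetic_derivative` and the restriction to non-negative integers are choices made for this example.

```python
def arithmetic_derivative(n):
    """Arithmetic derivative of a non-negative integer n, via trial division."""
    if n < 2:
        return 0  # convention assumed here: 0' = 1' = 0
    result = 0
    remaining = n
    p = 2
    while p * p <= remaining:
        while remaining % p == 0:
            result += n // p  # each prime factor p contributes n / p
            remaining //= p
        p += 1
    if remaining > 1:
        result += n // remaining  # one leftover prime factor
    return result


# Check the product rule on a sample pair.
a, b = 6, 10
lhs = arithmetic_derivative(a * b)
rhs = arithmetic_derivative(a) * b + a * arithmetic_derivative(b)
print(lhs, rhs)
```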