
Linear Differential Equation

Right. You want a Wikipedia article, but… better. More… detailed. More… me. Fine. Don't expect me to enjoy it. Just try not to waste my time.


Linear Differential Equations with Respect to the Unknown Function

This discourse concerns linear ordinary differential equations (ODEs) where the linearity is defined with respect to the unknown function and its successive derivatives. For those with a penchant for the more complex, similar equations involving multiple independent variables fall under the purview of Partial differential equations, specifically their linear variants.

Types of Solution

The elegance of linear differential equations, particularly those with constant coefficients in their associated homogeneous forms, lies in their solvability by quadrature. This means their solutions can be expressed, or at least related to, integrals. A first-order linear equation, even with non-constant coefficients, also succumbs to this method. However, for equations of second order or higher with variable coefficients, general solvability by quadrature is, regrettably, not the norm. For the second-order case, Kovacic's algorithm offers a glimmer of hope, determining if solutions exist in terms of integrals and, if so, how to extract them.

The progeny of homogeneous linear differential equations with polynomial coefficients are known as holonomic functions. This is a rather robust class, embracing sums, products, differentiation, and integration. It also generously includes many familiar faces like the exponential function, logarithm, sine, cosine, various inverse trigonometric functions, the error function, and the esteemed Bessel functions and hypergeometric functions. The defining differential equation and initial conditions for these functions provide a pathway for algorithmic manipulation, allowing for most calculus operations – antiderivatives, limits, asymptotic expansions, and even high-precision numerical evaluation with guaranteed error bounds. It’s almost… convenient.

Basic Terminology

The order of a differential equation is, rather obviously, the highest order of derivative present. The term $b(x)$, which stands alone and doesn't directly involve the unknown function or its derivatives, is sometimes referred to as the "constant term." A quaint analogy to algebraic equations, I suppose, even when $b(x)$ is anything but constant. When this term vanishes, becoming the zero function, the equation is deemed homogeneous; the left-hand side is then a homogeneous polynomial of degree one in $y$ and its derivatives. The equation you get when you do set $b(x)$ to zero is the associated homogeneous equation. And if all the coefficients in that associated homogeneous equation are just plain numbers, no variables attached, then we're dealing with constant coefficients.

A solution to a differential equation is, naturally, a function that actually satisfies it. The set of solutions to a homogeneous linear differential equation forms a vector space. In the ordinary case, this space has a finite dimension, precisely equal to the order of the equation. The grand unified theory for solving any linear differential equation is this: find one particular solution, then add to it any solution from the associated homogeneous equation.
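For the concretely minded, here is a minimal SymPy sketch of that structure (assuming SymPy is available; the example equation $y'' + y = x$ is my own illustrative choice):

```python
# Superposition for linear ODEs: the general solution of a non-homogeneous
# equation is one particular solution plus the general homogeneous solution.
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')

# Non-homogeneous equation: y'' + y = x.
general = sp.dsolve(sp.Eq(y(x).diff(x, 2) + y(x), x), y(x))
print(general)  # y(x) = C1*sin(x) + C2*cos(x) + x

# C1, C2 span the 2-dimensional homogeneous solution space (order 2);
# the lone term x is a particular solution.
```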

Linear Differential Operator

Let’s talk about operators. A basic differential operator of order $i$ is essentially a function that maps some other function to its $i$-th derivative. For a single variable $x$, this is often written as $\frac{d^i}{dx^i}$. For multiple variables, it gets more… involved, with partial derivatives. The derivative of order 0? That’s just the identity map. Don't overthink it.

A linear differential operator is a linear combination of these basic operators, where the coefficients are themselves differentiable functions. For a single variable $x$, it looks like this:

$$a_0(x) + a_1(x)\frac{d}{dx} + \cdots + a_n(x)\frac{d^n}{dx^n}$$

Here, $n$ is the order of the operator, assuming $a_n(x)$ isn't the zero function.

Let $L$ be such an operator. Applying $L$ to a function $f$ is denoted $Lf$ (or $Lf(x)$ if you're feeling pedantic). Crucially, these operators are linear: they respect addition and scalar multiplication. This means the set of linear differential operators itself forms a vector space (over the real or complex numbers) and a free module over the ring of differentiable functions.
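To make the definition tangible, a linear differential operator can be modelled as an ordinary Python function built on SymPy's `diff` (a sketch; `make_operator` is a hypothetical helper name and the coefficients are arbitrary examples):

```python
# L = a_0(x) + a_1(x) d/dx + ... + a_n(x) d^n/dx^n as a higher-order function.
import sympy as sp

x = sp.symbols('x')

def make_operator(coeffs):
    """coeffs[i] is the expression a_i(x) multiplying the i-th derivative."""
    def L(f):
        return sum(a * sp.diff(f, x, i) for i, a in enumerate(coeffs))
    return L

# Example: L = x + 2 d/dx + d^2/dx^2, applied to sin(x).
L = make_operator([x, 2, 1])
print(sp.simplify(L(sp.sin(x))))  # (x - 1)*sin(x) + 2*cos(x)

# Linearity: L(f + g) == L(f) + L(g) and L(c*f) == c*L(f) for constant c.
```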

This operator notation offers a rather terse way to write differential equations. If we have our operator $L$ as defined above, the equation:

$$a_0(x)y + a_1(x)y' + a_2(x)y'' + \cdots + a_n(x)y^{(n)} = b(x)$$

can be elegantly reduced to:

$$Ly = b(x)$$

Or, if you prefer, $Ly(x) = b(x)$.

The kernel of a linear differential operator $L$ is simply the kernel of $L$ as a linear map. In layman's terms, it's the vector space of solutions to the homogeneous equation $Ly = 0$.

According to Carathéodory's existence theorem, under very mild conditions – essentially, if the coefficients $a_0, \dots, a_n$ and the function $b$ are continuous in an interval $I$, and $a_n(x)$ is bounded away from zero in $I$ – the kernel of an $n$-th order ordinary differential operator is an $n$-dimensional vector space. The solutions to $Ly(x) = b(x)$ then take the form:

$$S_0(x) + c_1 S_1(x) + \cdots + c_n S_n(x)$$

where $S_0$ is any particular solution, $S_1, \dots, S_n$ form a basis of the kernel of $L$, and $c_1, \dots, c_n$ are arbitrary constants.

Homogeneous Equation with Constant Coefficients

A homogeneous linear differential equation with constant coefficients has the form:

$$a_0 y + a_1 y' + a_2 y'' + \cdots + a_n y^{(n)} = 0$$

where $a_0, a_1, \dots, a_n$ are constants. This is where Leonhard Euler made his mark, introducing the exponential function $e^x$, the unique solution to $f' = f$ with $f(0) = 1$. The key insight is that the $n$-th derivative of $e^{\alpha x}$ is $\alpha^n e^{\alpha x}$. This simplifies solving these equations considerably.

If we seek solutions of the form $e^{\alpha x}$, substituting this into the equation yields:

$$a_0 e^{\alpha x} + a_1 \alpha e^{\alpha x} + a_2 \alpha^2 e^{\alpha x} + \cdots + a_n \alpha^n e^{\alpha x} = 0$$

Since $e^{\alpha x}$ is never zero, $\alpha$ must be a root of the characteristic polynomial:

$$a_0 + a_1 t + a_2 t^2 + \cdots + a_n t^n = 0$$

This is the characteristic equation.

If all the roots of this polynomial are distinct, say $\alpha_1, \dots, \alpha_n$, then we have $n$ distinct solutions $e^{\alpha_1 x}, \dots, e^{\alpha_n x}$. These solutions are linearly independent, which can be verified using the Vandermonde determinant. They form a basis for the solution space. Note that these solutions might be complex even if the coefficients $a_i$ are real.
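As a hedged numerical sketch of this recipe (assuming NumPy is available; the cubic example is my own choice):

```python
# Build the solution basis e^{alpha_i x} from the roots of the
# characteristic polynomial, here for y''' - 6y'' + 11y' - 6y = 0.
import numpy as np

# numpy.roots takes coefficients from the highest degree down:
# t^3 - 6 t^2 + 11 t - 6 = (t - 1)(t - 2)(t - 3).
alphas = np.roots([1, -6, 11, -6])
print(np.sort(alphas.real))  # [1. 2. 3.]

# The basis is e^x, e^{2x}, e^{3x}; the general solution is
# c1*e^x + c2*e^{2x} + c3*e^{3x}.
```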

Example

Consider the equation:

$$y'''' - 2y''' + 2y'' - 2y' + y = 0$$

The characteristic equation is:

$$z^4 - 2z^3 + 2z^2 - 2z + 1 = 0$$

The roots are $i$, $-i$, and $1$ (this last with multiplicity 2), since the polynomial factors as $(z^2+1)(z-1)^2$. The corresponding solutions are $e^{ix}, e^{-ix}, e^x, xe^x$. A real basis for the solution space is therefore $\cos x, \sin x, e^x, xe^x$.

When the characteristic polynomial has multiple roots, we need more linearly independent solutions. These take the form $x^k e^{\alpha x}$, where $\alpha$ is a root of multiplicity $m$, and $0 \le k < m$. The proof relies on the fact that if $\alpha$ is a root of multiplicity $m$, the polynomial can be factored as $P(t)(t-\alpha)^m$. Applying the differential operator of the equation is equivalent to applying the operator with characteristic polynomial $P(t)$ and then applying $\frac{d}{dx} - \alpha$ a total of $m$ times. The exponential shift theorem shows that:

$$\left(\frac{d}{dx} - \alpha\right)\left(x^k e^{\alpha x}\right) = k x^{k-1} e^{\alpha x}$$

This means that after $k+1$ applications of $\frac{d}{dx} - \alpha$, the result is zero; since $k < m$, each $x^k e^{\alpha x}$ is therefore annihilated by $\left(\frac{d}{dx} - \alpha\right)^m$, and hence by the whole operator.
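If you'd rather not take the identity on faith, here is a quick SymPy check for one fixed exponent (the choice $k = 3$ is arbitrary):

```python
# Verify (d/dx - alpha)(x^k e^{alpha x}) = k x^{k-1} e^{alpha x} symbolically.
import sympy as sp

x, alpha = sp.symbols('x alpha')
k = 3  # any fixed exponent behaves the same way

f = x**k * sp.exp(alpha * x)
lhs = sp.diff(f, x) - alpha * f
rhs = k * x**(k - 1) * sp.exp(alpha * x)
print(sp.simplify(lhs - rhs))  # 0
```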

By the fundamental theorem of algebra, the sum of the multiplicities of the roots equals the degree of the polynomial. This ensures we obtain exactly $n$ linearly independent solutions, forming a basis for the solution space.

For equations with real coefficients, it's often preferable to have a basis of real-valued functions. Since complex roots of the characteristic polynomial come in conjugate pairs $a \pm ib$, we can replace the pair of complex solutions $x^k e^{(a+ib)x}$ and $x^k e^{(a-ib)x}$ with the real solutions $x^k e^{ax}\cos(bx)$ and $x^k e^{ax}\sin(bx)$ using Euler's formula.
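Putting the multiple-root rule to work, the following SymPy sketch rebuilds the basis of the earlier example from `sympy.roots`, which returns roots together with their multiplicities (assuming SymPy is available):

```python
# Basis x^k e^{alpha x}, 0 <= k < multiplicity, for each root alpha.
import sympy as sp

x, t = sp.symbols('x t')

# Characteristic polynomial of y'''' - 2y''' + 2y'' - 2y' + y = 0.
p = t**4 - 2*t**3 + 2*t**2 - 2*t + 1
basis = [x**k * sp.exp(a*x)
         for a, m in sp.roots(p, t).items()   # {root: multiplicity}
         for k in range(m)]
print(basis)  # e.g. [exp(x), x*exp(x), exp(I*x), exp(-I*x)]

# Euler's formula trades the conjugate pair exp(I*x), exp(-I*x)
# for the real pair cos(x), sin(x).
```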

Second-Order Case

A homogeneous linear ODE of the second order is written as:

$$y'' + ay' + by = 0$$

Its characteristic polynomial is $r^2 + ar + b$. If $a$ and $b$ are real, the nature of the solutions depends on the discriminant $D = a^2 - 4b$. The general solution always involves two arbitrary constants, $c_1$ and $c_2$.

  • $D > 0$: two distinct real roots $\alpha$ and $\beta$. The general solution is $c_1 e^{\alpha x} + c_2 e^{\beta x}$.
  • $D = 0$: a double real root $-a/2$. The general solution is $(c_1 + c_2 x) e^{-ax/2}$.
  • $D < 0$: two complex conjugate roots $\alpha \pm \beta i$. The general solution is $c_1 e^{(\alpha + \beta i)x} + c_2 e^{(\alpha - \beta i)x}$. Using Euler's formula, this can be rewritten in real terms as $e^{\alpha x}(c_1 \cos(\beta x) + c_2 \sin(\beta x))$.

To find the specific solution satisfying initial conditions $y(0) = d_1$ and $y'(0) = d_2$ (a Cauchy problem), we set up a system of two linear equations for $c_1$ and $c_2$ by evaluating the general solution and its derivative at $x = 0$.
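SymPy's `dsolve` takes initial conditions through its `ics` keyword, so the Cauchy problem fits in a few lines (the coefficients below, giving $D < 0$, are my own choice):

```python
# Solve y'' + 2y' + 5y = 0 with y(0) = 1, y'(0) = 0; roots are -1 +/- 2i.
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')

eq = sp.Eq(y(x).diff(x, 2) + 2*y(x).diff(x) + 5*y(x), 0)
sol = sp.dsolve(eq, y(x), ics={y(0): 1, y(x).diff(x).subs(x, 0): 0})
print(sol)  # y(x) = (sin(2*x)/2 + cos(2*x))*exp(-x), or an equivalent form
```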

Non-Homogeneous Equation with Constant Coefficients

A non-homogeneous equation of order nn with constant coefficients looks like this:

$$y^{(n)}(x) + a_1 y^{(n-1)}(x) + \cdots + a_{n-1} y'(x) + a_n y(x) = f(x)$$

where $a_1, \dots, a_n$ are constants and $f(x)$ is a given function. The method for solving it depends heavily on the form of $f(x)$.

  • Exponential Response Formula: If $f(x)$ is a simple exponential or sinusoidal function, this formula can be quite direct.
  • Method of Undetermined Coefficients: This works well when $f(x)$ is a linear combination of terms like $x^k e^{ax}$, $x^k \cos(ax)$, and $x^k \sin(ax)$. You guess a solution of a similar form and solve for the coefficients.
  • Annihilator Method: This is more general, applicable when $f(x)$ itself satisfies some homogeneous linear differential equation. This often means $f(x)$ is a holonomic function.
  • Variation of Constants: This is the most general method, always applicable (a second-order sketch follows this list). Let $(y_1, \dots, y_n)$ be a basis for the solutions of the associated homogeneous equation. We assume a particular solution of the form $y_p = u_1 y_1 + \cdots + u_n y_n$, where $u_1, \dots, u_n$ are unknown functions. By imposing $n-1$ constraints on the derivatives of the $u_i$ (specifically, ensuring that higher derivatives of $y_p$ don't generate extra terms involving $u_i'$), we can derive a system of $n$ linear equations for $u_1', \dots, u_n'$. This system, involving $f(x)$ and the basis solutions $y_i$ and their derivatives, can be solved using methods from linear algebra. Integrating the $u_i'$ gives the $u_i$, and thus a particular solution $y_p$. The general solution is then $y_p$ plus the general solution of the homogeneous equation.
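Here is the promised second-order sketch of variation of constants, using SymPy; the classical example $y'' + y = \tan x$ is my own choice (undetermined coefficients would fail on it, which is rather the point):

```python
# For y'' + y = tan(x), with homogeneous basis y1 = cos, y2 = sin:
#   W   = y1*y2' - y1'*y2   (Wronskian; here W = 1)
#   u1' = -y2*f/W,  u2' = y1*f/W,  y_p = u1*y1 + u2*y2.
import sympy as sp

x = sp.symbols('x')
f = sp.tan(x)
y1, y2 = sp.cos(x), sp.sin(x)

W = y1*sp.diff(y2, x) - sp.diff(y1, x)*y2
u1 = sp.integrate(-y2*f/W, x)
u2 = sp.integrate(y1*f/W, x)
yp = sp.simplify(u1*y1 + u2*y2)

# Check: y_p'' + y_p must reproduce the forcing term tan(x).
print(sp.simplify(sp.diff(yp, x, 2) + yp - f))  # 0
```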

First-Order Equation with Variable Coefficients

A linear ODE of the first order, after some algebraic manipulation, takes the form:

$$y'(x) = f(x)y(x) + g(x)$$

If the equation is homogeneous ($g(x) = 0$), we can separate variables:

$$\frac{y'}{y} = f(x)$$

Integrating both sides gives $\log|y| = \int f(x)\,dx + k$, leading to the general solution $y = c e^F$, where $F(x) = \int f(x)\,dx$ and $c$ is an arbitrary constant.

For the general non-homogeneous case, we can use an integrating factor. Multiplying the equation by $e^{-F(x)}$ (where $F(x)$ is an antiderivative of $f(x)$) and moving the $y$ term to the left gives:

$$y' e^{-F} - y f e^{-F} = g e^{-F}$$

Recognizing that $-f e^{-F} = \frac{d}{dx}\left(e^{-F}\right)$ by the chain rule, the left side is, by the product rule, exactly the derivative of the product $y e^{-F}$:

$$\frac{d}{dx}\left(y e^{-F}\right) = g e^{-F}$$

Integrating both sides yields:

$$y e^{-F} = \int g e^{-F}\,dx + c$$

Thus, the general solution is:

$$y(x) = c e^{F} + e^{F}\int g e^{-F}\,dx$$
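The closed form translates directly into a few lines of SymPy (a sketch; `solve_first_order` is a hypothetical helper name, and the example $y' = -y + x$ is my own):

```python
# y = c e^F + e^F * Integral(g e^{-F}) for y' = f(x) y + g(x).
import sympy as sp

x, c = sp.symbols('x c')

def solve_first_order(f, g):
    F = sp.integrate(f, x)                        # an antiderivative of f
    particular = sp.exp(F) * sp.integrate(g * sp.exp(-F), x)
    return c * sp.exp(F) + particular

# Example: y' = -y + x  (f = -1, g = x)  ->  y = c e^{-x} + x - 1.
print(sp.simplify(solve_first_order(-1, x)))
```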

Example

Consider the equation:

$$y'(x) + \frac{y(x)}{x} = 3x$$

The associated homogeneous equation is $y'(x) + \frac{y(x)}{x} = 0$, which leads to $\frac{y'}{y} = -\frac{1}{x}$. Integrating gives $\log|y| = -\log|x| + k$, so $y = \frac{c}{x}$.

Now, let's use the integrating factor method. In the form $y' = f(x)y + g(x)$, the equation has $f(x) = -\frac{1}{x}$ and $g(x) = 3x$, so $F(x) = \int f(x)\,dx = -\log|x|$ and the integrating factor is $e^{-F} = |x|$. For $x > 0$ we may simply take it to be $x$.

Multiplying the equation by $x$:

$$x\,y'(x) + y(x) = 3x^2$$

The left side is precisely the derivative of the product $x\,y(x)$:

$$\frac{d}{dx}\big(x\,y(x)\big) = 3x^2$$

Integrating both sides:

$$x\,y(x) = \int 3x^2\,dx = x^3 + c$$

Therefore, the general solution is:

$$y(x) = x^2 + \frac{c}{x}$$

Note that this is, as promised, a particular solution ($x^2$) plus the general solution of the associated homogeneous equation ($c/x$). For the initial condition $y(1) = \alpha$, we get $1 + c = \alpha$, so $c = \alpha - 1$, and the particular solution of this Cauchy problem is $y(x) = x^2 + \frac{\alpha - 1}{x}$.
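A quick symbolic sanity check of that solution, with SymPy:

```python
# Verify y = x^2 + (alpha - 1)/x solves y' + y/x = 3x with y(1) = alpha.
import sympy as sp

x, alpha = sp.symbols('x alpha')
y = x**2 + (alpha - 1)/x

print(sp.simplify(sp.diff(y, x) + y/x - 3*x))  # 0
print(y.subs(x, 1))                            # alpha
```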

System of Linear Differential Equations

A system of linear differential equations involves multiple unknown functions and multiple linear differential equations. Typically, we consider systems where the number of functions matches the number of equations.

Any linear ODE, or system thereof, can be converted into a first-order system. For instance, an equation involving $y', y'', \dots, y^{(k)}$ can be transformed by introducing new unknown functions $y_1 = y', y_2 = y'', \dots, y_{k-1} = y^{(k-1)}$. This generates a system of first-order equations: $y' = y_1$, $y_1' = y_2$, and so on, with the original equation, rewritten in these variables, providing the last equation for $y_{k-1}'$.
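In matrix language this is the companion-matrix construction. A small NumPy sketch for a second-order equation (the coefficients are illustrative):

```python
# y'' + 2y' + 5y = 0 as Y' = A Y with Y = (y, y1)^T, where y1 = y'.
import numpy as np

a0, a1 = 5.0, 2.0
A = np.array([[0.0,  1.0],    # y'  = y1
              [-a0, -a1]])    # y1' = -a0*y - a1*y1

# The eigenvalues of A are the roots of t^2 + a1*t + a0, here -1 +/- 2i.
print(np.linalg.eigvals(A))
```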

A general first-order linear system of nn equations with nn unknown functions can be written in matrix form:

$$\mathbf{y}'(x) = A(x)\,\mathbf{y}(x) + \mathbf{b}(x)$$

where $\mathbf{y}$ is the vector of unknown functions, $A(x)$ is a matrix of coefficients, and $\mathbf{b}(x)$ is a vector of forcing terms. If $A$ and $\mathbf{b}$ are constant, the system is simpler.

The associated homogeneous system is $\mathbf{u}' = A\mathbf{u}$. Its solutions form an $n$-dimensional vector space. A basis for this space can be represented by the columns of a matrix $U(x)$, such that $\det(U(x)) \neq 0$. If $A$ is a constant matrix, or if $A$ commutes with its antiderivative $B = \int A\,dx$, then we can choose $U(x) = e^B$, the matrix exponential.

However, in the general case with variable coefficients, a simple closed-form solution for the homogeneous system is often elusive. Numerical methods or approximation techniques like the Magnus expansion become necessary.

Once we have the matrix $U(x)$ whose columns form a basis for the homogeneous solutions, the general solution to the non-homogeneous system is:

$$\mathbf{y}(x) = U(x)\,\mathbf{y}_0 + U(x)\int U^{-1}(x)\,\mathbf{b}(x)\,dx$$

where $\mathbf{y}_0$ is an arbitrary constant vector (the constant of integration). If initial conditions $\mathbf{y}(x_0) = \mathbf{y}_0$ are given, the solution is unique:

$$\mathbf{y}(x) = U(x)\,U^{-1}(x_0)\,\mathbf{y}_0 + U(x)\int_{x_0}^{x} U^{-1}(t)\,\mathbf{b}(t)\,dt$$
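For constant $A$, the fundamental matrix is the matrix exponential, which SciPy computes directly. The following sketch (assuming SciPy is available) checks the formula against the scalar equation $y'' + 2y' + 5y = 0$ used above:

```python
# u(x) = e^{A(x - x0)} u(x0) for the homogeneous system u' = A u.
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0,  1.0],
              [-5.0, -2.0]])
x0 = 0.0
u0 = np.array([1.0, 0.0])      # encodes y(0) = 1, y'(0) = 0

def u(x):
    return expm(A * (x - x0)) @ u0

# First component matches the scalar solution e^{-x}(cos 2x + sin(2x)/2).
x = 1.0
print(u(x)[0], np.exp(-x) * (np.cos(2*x) + 0.5*np.sin(2*x)))
```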

Higher Order with Variable Coefficients

As mentioned, linear ODEs of order one with variable coefficients are solvable by quadrature. This isn't generally true for higher orders. This fact is a cornerstone of Picard–Vessiot theory, a field that has evolved into differential Galois theory.

This impossibility of general quadrature solutions for higher-order equations with variable coefficients is analogous to the Abel–Ruffini theorem, which states that polynomial equations of degree five or higher generally cannot be solved using radicals. The underlying mathematical structures and proof techniques bear striking resemblances, hence the name "differential Galois theory."

Similar to its algebraic counterpart, this theory aims to identify which equations can be solved by quadrature and, if so, to provide those solutions. The computations involved, however, are notoriously complex, even for powerful computers.

A notable exception is Kovacic's algorithm, which completely settles the case of second-order equations with rational coefficients.

Cauchy–Euler Equation

Cauchy–Euler equations are a class of equations with variable coefficients that can be solved explicitly, regardless of their order. They have the form:

$$x^n y^{(n)}(x) + a_{n-1}x^{n-1}y^{(n-1)}(x) + \cdots + a_0 y(x) = 0$$

where $a_0, \dots, a_{n-1}$ are constant coefficients. The trick here is to substitute $y = x^r$, which transforms the differential equation into an algebraic equation for $r$.
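A SymPy sketch of that substitution (the second-order example is my own):

```python
# Substituting y = x^r into x^2 y'' + x y' - y = 0 yields the indicial
# (algebraic) equation for r.
import sympy as sp

x = sp.symbols('x', positive=True)
r = sp.symbols('r')
y = x**r

expr = x**2 * sp.diff(y, x, 2) + x * sp.diff(y, x) - y
indicial = sp.simplify(expr / x**r)
print(sp.expand(indicial))    # r**2 - 1
print(sp.solve(indicial, r))  # [-1, 1]  ->  basis x and 1/x
```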

Holonomic Functions

A holonomic function, also known as a D-finite function, is simply a solution to a homogeneous linear differential equation with polynomial coefficients.

It’s remarkable how many functions considered "standard" in mathematics are holonomic or can be expressed as quotients of holonomic functions. This includes polynomials, algebraic functions, the logarithm, the exponential function, trigonometric, hyperbolic functions, their inverses, and many special functions like Bessel functions and hypergeometric functions.

Holonomic functions possess powerful closure properties. Their sums, products, derivatives, and integrals are also holonomic. Crucially, these properties are "effective," meaning algorithms exist to compute the differential equation of the result of these operations, given the equations of the input functions. This is the essence of Zeilberger's theorem.

The significance of holonomic functions lies in their computational tractability. If we represent them by their defining differential equations and initial conditions, most calculus operations can be automated. This includes differentiation, integration (both indefinite and definite), rapid computation of Taylor series (via recurrence relations on coefficients), high-precision evaluation with error bounds, finding limits, locating singularities, analyzing asymptotic behavior, and proving identities. The dynamic dictionary of mathematical functions (DDMF) is an example of such an endeavor.
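To give a taste of why this representation is so computationally pleasant, here is a dependency-free sketch that turns the defining equation $y'' + y = 0$ of the cosine (with $y(0) = 1$, $y'(0) = 0$) into its Taylor-coefficient recurrence:

```python
# y = sum c_k x^k with y'' + y = 0 forces (k+2)(k+1) c_{k+2} + c_k = 0,
# so every coefficient follows from the two initial conditions.
from fractions import Fraction

def cos_taylor_coeffs(n):
    c = [Fraction(1), Fraction(0)]            # c_0 = y(0), c_1 = y'(0)
    for k in range(n - 1):
        c.append(-c[k] / ((k + 2) * (k + 1)))
    return c[:n + 1]

print(cos_taylor_coeffs(6))
# [1, 0, -1/2, 0, 1/24, 0, -1/720] -- exactly the Taylor coefficients of cos.
```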