
Linear Algebra

Honestly, Wikipedia. You want me to rewrite this? It's already dense enough to qualify as a black hole. But fine. If you insist on wading through the abstract, I’ll drag you through it. Just don't expect me to hold your hand.

Branch of Mathematics

Linear algebra. It’s the study of the elegant, if somewhat cold, world of linear equations. Think of those equations like a set of strict, unyielding rules:

a_1 x_1 + \cdots + a_n x_n = b

And linear maps, which are like functions that only follow those rules, transforming points in space without bending the underlying structure:

(x_1, \ldots, x_n) \mapsto a_1 x_1 + \cdots + a_n x_n

These aren't just abstract concepts; they're the skeletal framework of our understanding, represented by the precise, unfeeling grids of matrices. [1] [2] [3]

Imagine three planes in Euclidean space. Their common intersection is the solution set of a system of three linear equations. If you're lucky, you get a single point – a unique, unassailable truth. Two of the planes on their own might share an entire line. This is the clean, geometric language of linear algebra.

It’s not just geometry, though. Linear algebra is the bedrock upon which much of modern mathematics is built. It’s how we define the very essence of lines and planes, how we describe rotations without losing our bearings. Even functional analysis, that sprawling domain of mathematical analysis, can be seen as linear algebra applied to the infinite dimensions of function spaces.

And it’s not confined to theoretical realms. The sciences and engineering lean on it heavily. Why? Because it’s the most efficient way to model a world that, at its core, often behaves linearly. Even when things get messy, when systems are nonlinear, linear algebra steps in with first-order approximations, using the differential of a multivariate function to capture the local truth.
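
If you want to see that first-order approximation in action, here is a minimal NumPy sketch. The nonlinear map f, its hand-computed Jacobian, and the point p are my own inventions, purely for illustration; the point is that the differential supplies the best linear approximation near p.

```python
import numpy as np

# A hypothetical nonlinear map R^2 -> R^2, invented purely for illustration.
def f(v):
    x, y = v
    return np.array([np.sin(x) + y**2, x * y])

def jacobian(v):
    # Matrix of partial derivatives of f, worked out by hand for this example.
    x, y = v
    return np.array([[np.cos(x), 2 * y],
                     [y,         x]])

p = np.array([0.5, 1.0])              # expansion point
h = np.array([1e-3, -2e-3])           # a small displacement

exact  = f(p + h)
linear = f(p) + jacobian(p) @ h       # first-order (linear) approximation

print(exact - linear)                 # residual is on the order of ||h||^2, i.e. tiny
```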

History

The ancient Chinese, bless their methodical hearts, were already tackling systems of linear equations with methods that would eventually be called Gaussian elimination. It’s all laid out in texts like The Nine Chapters on the Mathematical Art, with problems involving everything from two to five equations.

Europe caught up, or perhaps just rediscovered it, when René Descartes decided to overlay numbers onto geometry in 1637. Suddenly, lines and planes weren't just abstract shapes; they were described by linear equations in what we now call Cartesian geometry. Solving for intersections meant solving systems of linear equations.

Then came the fascination with determinants. Leibniz tinkered with them in 1693, and Gabriel Cramer formalized their use in solving systems with his eponymous rule in 1750. Gauss himself refined the elimination method, even applying it to geodesy. [5]

But the real conceptual leaps came later. In 1844, Hermann Grassmann unleashed his "Theory of Extension," laying down entirely new foundations. Then, in 1848, James Joseph Sylvester coined the term "matrix," a Latin word for "womb," suggesting a container for mathematical ideas.

The complex plane also played a role, hinting at how numbers could represent not just magnitude but also direction. Then came the astonishing discovery of quaternions by W.R. Hamilton in 1843, a four-dimensional system that pushed the boundaries of what we considered "numbers." The concept of a "vector" emerged, a representation of a point in space.

Arthur Cayley brought it all together, introducing matrix multiplication and the inverse matrix in 1856, formalizing the general linear group. He saw matrices not just as collections of numbers but as single entities, declaring, "There would be many things to say about this theory of matrices which should, it seems to me, precede the theory of determinants." [5]

Benjamin Peirce and his son Charles Sanders Peirce delved into Linear Associative Algebra in the late 19th century.

The advent of the telegraph and Maxwell's groundbreaking A Treatise on Electricity and Magnetism in 1873, with its field theory, demanded new mathematical tools. Linear algebra, in its "flat" form, found its place in describing the tangent spaces of manifolds. And the symmetries of spacetime itself, described by Lorentz transformations, have linear algebra at their core; much of the subject's later history is tangled up with theirs.

By 1888, Peano had provided a more rigorous definition of a vector space, and by 1900, the theory of linear transformations for finite dimensions was solidifying. The 20th century saw linear algebra blossom into its modern, abstract form, especially with the rise of computers, which turned it into an indispensable tool for modeling and simulation. [5]

Vector Spaces

Forget the old-school approach with just equations. The modern way, the elegant way, is through vector spaces. It's more general, more abstract, and frankly, less tedious.

A vector space, over a field F (usually the real numbers or complex numbers), is a collection of things called vectors, V. These vectors can be added together (vector addition), and they can be scaled by numbers from F, called scalars (scalar multiplication). It all has to follow a strict set of rules, like a meticulously choreographed dance:

  • u + (v + w) = (u + v) + w. Addition is associative. No surprises there.
  • u + v = v + u. Addition is commutative. Order doesn't matter.
  • There exists 0 ∈ V such that v + 0 = v. There's a neutral element, the zero vector.
  • For every v ∈ V, there exists -v ∈ V such that v + (-v) = 0. Every vector has an additive inverse.
  • a(u + v) = au + av. Scalar multiplication distributes over vector addition.
  • (a + b)v = av + bv. Scalar multiplication distributes over field addition.
  • a(bv) = (ab)v. Scaling by scalars is compatible with field multiplication.
  • 1v = v. Multiplying by the multiplicative identity of F does nothing.

These first four axioms mean V is an abelian group under addition. It’s a group that plays nice.

The "vectors" themselves can be anything: tuples, sequences, functions, polynomials, even matrices. Linear algebra is about the properties that hold true across all these disparate forms.

Linear Maps

Linear maps are the navigators of vector spaces. They're functions that respect the structure, the rules of the game. Given two vector spaces, V and W, over a field F, a linear map T: V → W must satisfy:

T(u + v) = T(u) + T(v), \qquad T(av) = aT(v)

For any vectors u, v ∈ V and scalar a ∈ F.

In simpler terms, you can add vectors before or after applying the map, and you get the same result. Same with scaling. It’s all about preserving the linear relationships.

An equivalent, and often more useful, condition is:

T(au + bv) = aT(u) + bT(v)

for any vectors u, v ∈ V and scalars a, b ∈ F.

When the map goes from a vector space to itself (T: V → V), it's called a linear operator.

The real power comes when we talk about bijective linear maps. If such a map exists between V and W, they are isomorphic. From a linear algebra perspective, they're indistinguishable. The crucial questions are: is a map an isomorphism? If not, what's its range? What maps to the zero vector (the kernel)? Gaussian elimination and its kin are the tools for answering these.
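
Numerically, those questions come down to rank and null-space computations. A minimal NumPy sketch, using an illustrative matrix of my own choosing and the SVD to extract the kernel:

```python
import numpy as np

# An illustrative linear map R^3 -> R^2, written as a 2x3 matrix; the second
# row is twice the first, so the map is far from being an isomorphism.
M = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])

rank = np.linalg.matrix_rank(M)              # dimension of the image (range)

# Kernel: right singular vectors whose singular values are numerically zero.
_, s, Vt = np.linalg.svd(M)
tol = s.max() * max(M.shape) * np.finfo(float).eps
kernel_basis = Vt[np.sum(s > tol):]          # rows spanning the null space

print("dim image =", rank)                   # 1
print("dim kernel =", kernel_basis.shape[0]) # 2, and rank + nullity = 3
```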

Subspaces, Span, and Basis

Within a vector space, certain subsets behave like smaller vector spaces themselves. These are linear subspaces. They're closed under addition and scalar multiplication. Think of them as self-contained universes within the larger one.

For instance, the image of a linear map T: V → W (all the vectors in W that T can reach) is a subspace of W. And the kernel of T (all vectors in V that T maps to zero) is a subspace of V.

We can build subspaces by taking linear combinations of a set of vectors:

a_1 \mathbf{v}_1 + a_2 \mathbf{v}_2 + \cdots + a_k \mathbf{v}_k

The set of all such combinations is the span of those vectors. It's the smallest subspace containing them.

A set of vectors is linearly independent if none of them can be written as a linear combination of the others. It's like having fundamental building blocks, each unique. Equivalently: if the only way to get the zero vector from a linear combination is by using all zero coefficients, then the set is linearly independent.

A set of vectors that spans the entire vector space and is also linearly independent? That's a basis. It's the minimal set of generators, the most efficient description of the space.
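
In coordinates, checking independence is just a rank computation. A small NumPy example, with three invented vectors where the third is deliberately the sum of the first two:

```python
import numpy as np

# Three invented vectors in R^3; the third is the sum of the first two.
vectors = np.array([[1.0, 0.0, 2.0],
                    [0.0, 1.0, 1.0],
                    [1.0, 1.0, 3.0]])

rank = np.linalg.matrix_rank(vectors)
print("linearly independent?", rank == len(vectors))   # False: the third one is redundant
print("dimension of the span:", rank)                  # 2, so any two of them form a basis of the span
```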

Crucially, any two bases for a vector space have the same number of elements. This number is the dimension of the space. Two vector spaces over the same field are isomorphic if and only if they have the same dimension. [9] If a space has a finite basis, it's finite-dimensional. If subspace U is inside V, its dimension can't exceed V's.

There's a neat formula for the dimensions of sums and intersections of subspaces:

\dim(U_1 + U_2) = \dim U_1 + \dim U_2 - \dim(U_1 \cap U_2)

where U_1 + U_2 is the span of U_1 ∪ U_2. [10]
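
You can watch the formula work on a hand-picked example: two coordinate planes in R^3 whose intersection is a line, so all the dimensions are known in advance, with NumPy's rank function supplying the dimensions of the summands and the sum.

```python
import numpy as np

# U1 = span{e1, e2} and U2 = span{e2, e3} inside R^3; their intersection is span{e2}.
U1 = np.array([[1.0, 0.0, 0.0],
               [0.0, 1.0, 0.0]])
U2 = np.array([[0.0, 1.0, 0.0],
               [0.0, 0.0, 1.0]])

dim_U1  = np.linalg.matrix_rank(U1)                     # 2
dim_U2  = np.linalg.matrix_rank(U2)                     # 2
dim_sum = np.linalg.matrix_rank(np.vstack([U1, U2]))    # U1 + U2 = span of the union: 3

assert dim_sum == dim_U1 + dim_U2 - 1                   # the intersection has dimension 1
```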

Matrices

Matrices are the tangible representation of linear algebra, especially for finite-dimensional spaces. They are the language we use to do things.

If V is a finite-dimensional vector space with basis (v_1, …, v_m), then any vector can be uniquely written as a linear combination of these basis vectors. This gives us an isomorphism between V and F^m, the space of m-tuples of scalars. Vectors can then be represented as column matrices.

\begin{bmatrix} a_1 \\ \vdots \\ a_m \end{bmatrix}

When we have a linear map f: W → V, where W has basis (w_1, …, w_n), the map is entirely determined by where it sends the basis vectors of W. If f(w_j) = a_{1,j} v_1 + ⋯ + a_{m,j} v_m, then the map f is represented by an m × n matrix:

\begin{bmatrix} a_{1,1} & \cdots & a_{1,n} \\ \vdots & \ddots & \vdots \\ a_{m,1} & \cdots & a_{m,n} \end{bmatrix}

This is where matrix multiplication comes in: it mirrors the composition of linear maps. Multiplying a matrix by a column vector gives the column vector representing the transformed vector. Matrices and linear maps are two dialects of the same language; every statement about one translates into a statement about the other.
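
If you doubt that matrix multiplication is composition in disguise, here is a short NumPy check, with two randomly generated matrices standing in for invented maps f and g:

```python
import numpy as np

rng = np.random.default_rng(1)

# Matrices of two invented linear maps: g: R^2 -> R^3 and f: R^3 -> R^2.
G = rng.standard_normal((3, 2))     # matrix of g
F = rng.standard_normal((2, 3))     # matrix of f

x = rng.standard_normal(2)          # an arbitrary vector in R^2

# Applying g and then f agrees with applying the single matrix F @ G.
assert np.allclose(F @ (G @ x), (F @ G) @ x)
print(F @ G)                        # the 2x2 matrix of the composed map f ∘ g
```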

Matrices representing the same linear map with respect to different choices of bases are called equivalent; for an endomorphism, where a single basis change applies on both sides, they are called similar. Transforming one matrix into another by elementary row and column operations is the matrix-level version of changing bases. Ultimately, any matrix can be simplified to a canonical form that reveals the core structure of the transformation, and Gaussian elimination is the algorithm that gets you there.

Linear Systems

A collection of linear equations in a set of variables—that’s a linear system. Think of it as a set of constraints. Historically, linear algebra was forged in the fire of solving these systems.

Consider this system:

\begin{aligned} 2x + y - z &= 8 \\ -3x - y + 2z &= -11 \\ -2x + y + 2z &= -3 \end{aligned}

We can represent this with a matrix M and a vector v:

M = \begin{bmatrix} 2 & 1 & -1 \\ -3 & -1 & 2 \\ -2 & 1 & 2 \end{bmatrix}, \quad \mathbf{v} = \begin{bmatrix} 8 \\ -11 \\ -3 \end{bmatrix}

A solution X = (x, y, z)^T is a column vector such that MX = v. The homogeneous system MX = 0 describes the kernel of the transformation M.

Gaussian elimination is the workhorse. By applying elementary row operations to the augmented matrix:

\left[\begin{array}{rrr|r} 2 & 1 & -1 & 8 \\ -3 & -1 & 2 & -11 \\ -2 & 1 & 2 & -3 \end{array}\right]

we can transform it into reduced row echelon form:

\left[\begin{array}{rrr|r} 1 & 0 & 0 & 2 \\ 0 & 1 & 0 & 3 \\ 0 & 0 & 1 & -1 \end{array}\right]

This immediately tells us the unique solution: x = 2, y = 3, z = -1.

In general, a system Ax = b with m equations and n variables can be solved using these methods. If A is a square, invertible matrix, the solution is simply x = A^{-1}b. The same algorithms used for solving systems are fundamental for computing matrix ranks, kernels, and inverses.
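
In practice you let a library do the elimination. A minimal NumPy sketch that solves the system above; np.linalg.solve runs an LU factorization, which is Gaussian elimination with pivoting under the hood:

```python
import numpy as np

# The 3x3 system from above.
M = np.array([[ 2.0,  1.0, -1.0],
              [-3.0, -1.0,  2.0],
              [-2.0,  1.0,  2.0]])
v = np.array([8.0, -11.0, -3.0])

x = np.linalg.solve(M, v)            # LU factorization, i.e. Gaussian elimination with pivoting
print(x)                             # [ 2.  3. -1.]  ->  x = 2, y = 3, z = -1

# Forming the inverse explicitly gives the same answer, but solve() is the better habit.
assert np.allclose(np.linalg.inv(M) @ v, x)
```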

Endomorphisms and Square Matrices

A linear map from a vector space to itself is an endomorphism. When the space is finite-dimensional, say of dimension n, an endomorphism is represented by an n × n square matrix. These are special because they describe transformations that send the space back into itself, leading to concepts like geometric transformations, coordinate changes, and quadratic forms.

Determinant

The determinant of a square matrix is a scalar value that tells us something crucial about the transformation it represents. It’s defined as:

\sum_{\sigma \in S_n} (-1)^{\sigma} a_{1\sigma(1)} \cdots a_{n\sigma(n)}

where S_n is the set of all permutations of {1, …, n}, and (-1)^σ is the parity (sign) of the permutation σ.

A matrix is invertible if and only if its determinant is non-zero. It signifies whether the transformation collapses the space into a lower dimension. Cramer's rule provides an explicit formula for solutions using determinants, but it's computationally inefficient for anything beyond small systems. The determinant of an endomorphism is invariant under basis changes.
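
A quick numerical check, reusing the matrix M from the linear-system example plus a deliberately singular one:

```python
import numpy as np

# The matrix from the linear-system example: determinant -1, so it is invertible.
M = np.array([[ 2.0,  1.0, -1.0],
              [-3.0, -1.0,  2.0],
              [-2.0,  1.0,  2.0]])
print(np.linalg.det(M))              # -1.0 (up to rounding)

# A deliberately singular matrix: its second row is twice its first,
# so the transformation collapses R^3 onto a plane and the determinant vanishes.
S = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0],
              [0.0, 1.0, 1.0]])
print(np.linalg.det(S))              # 0.0
```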

Eigenvalues and Eigenvectors

The property to look for in a linear endomorphism f of a vector space V is that it maps some non-zero vector v to a scalar multiple of itself: f(v) = av. That vector v is an eigenvector, and the scalar a is its corresponding eigenvalue.

For a matrix M, this becomes Mz = az. Rearranging, (M - aI)z = 0. For a non-zero solution z to exist, the matrix M - aI must be singular, meaning its determinant is zero: det(M - aI) = 0. The eigenvalues are the roots of the characteristic polynomial, det(xI - M).

If a vector space has a basis consisting entirely of eigenvectors, the matrix representing the endomorphism in that basis becomes a diagonal matrix with the eigenvalues on the diagonal. Such a matrix (and the corresponding endomorphism) is called diagonalizable. Not all matrices are diagonalizable, but they can often be brought to a simpler form, like the Jordan normal form, which reveals their essential structure. A real symmetric matrix, for instance, is always diagonalizable.
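
Here is what diagonalization looks like numerically. The 2x2 symmetric matrix is a toy example of my own; np.linalg.eigh is NumPy's routine for symmetric and Hermitian matrices:

```python
import numpy as np

# A toy real symmetric matrix, guaranteed diagonalizable.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigvals, eigvecs = np.linalg.eigh(A)     # eigh handles symmetric/Hermitian matrices
print(eigvals)                           # [1. 3.]

# Columns of eigvecs are orthonormal eigenvectors, so A = P D P^T with D diagonal.
D = np.diag(eigvals)
P = eigvecs
assert np.allclose(A, P @ D @ P.T)
```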

Duality

The concept of a dual space is fundamental. If V is a vector space, its dual space, V*, is the space of all linear forms on V. A linear form is a linear map from V to the field of scalars F.

If V is finite-dimensional with basis (v_1, …, v_n), then its dual space V* has a corresponding dual basis (v_1*, …, v_n*), where v_i*(v_j) = δ_ij (1 if i = j, 0 otherwise). This duality is remarkably symmetric: for finite-dimensional spaces, V is isomorphic to its double dual, (V*)*. The symmetry is elegantly captured by the bra–ket notation ⟨f, x⟩.

Dual Map

Given a linear map f: V → W, there's a corresponding "dual" or transpose map f*: W* → V*. If M is the matrix of f, then the matrix of f* (with respect to the dual bases) is the transpose M^T. This reflects a deep symmetry in how linear transformations interact with their dual spaces.
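
In coordinates, the dual map really is just the transpose. A short NumPy sanity check with an invented matrix M, a linear form written as a row of coefficients, and a test vector:

```python
import numpy as np

rng = np.random.default_rng(2)

M   = rng.standard_normal((3, 2))    # matrix of an invented map f: V -> W, dim V = 2, dim W = 3
phi = rng.standard_normal(3)         # a linear form on W, written in the dual basis
v   = rng.standard_normal(2)         # a test vector in V

# The dual map sends phi to phi ∘ f; in coordinates that is multiplication by M^T.
assert np.allclose((M.T @ phi) @ v,   # (f*(phi))(v)
                   phi @ (M @ v))     # phi(f(v))
```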

Inner Product Spaces

When we equip a vector space with an inner product, we give it a geometric structure, allowing us to define lengths and angles. An inner product ⟨·, ·⟩: V × V → F must satisfy:

  • Conjugate symmetry: ⟨u, v⟩ is the complex conjugate of ⟨v, u⟩. (Plain symmetry over the reals.)
  • Linearity in the first argument: ⟨au, v⟩ = a⟨u, v⟩ and ⟨u + v, w⟩ = ⟨u, w⟩ + ⟨v, w⟩.
  • Positive-definiteness: ⟨v, v⟩ ≥ 0, with equality only if v = 0.

The length (or norm) of a vector v is defined by ‖v‖² = ⟨v, v⟩. The Cauchy–Schwarz inequality, |⟨u, v⟩| ≤ ‖u‖ ‖v‖, is a direct consequence, and it allows us to define the "cosine" of the angle between vectors.

Two vectors are orthogonal if ⟨u, v⟩ = 0. An orthonormal basis is a basis in which all vectors have length 1 and are mutually orthogonal. The Gram–Schmidt procedure can construct such a basis. For orthonormal bases, extracting the coefficients of a vector is simple: a_i = ⟨v, v_i⟩.
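
Here is a bare-bones sketch of the Gram–Schmidt procedure for the standard dot product on R^n, with three invented input vectors. In real numerical work you would reach for np.linalg.qr, which does the same job more stably, but the loop below is the textbook idea:

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthonormalize linearly independent vectors (standard dot product on R^n)."""
    basis = []
    for v in vectors:
        w = np.array(v, dtype=float)
        for q in basis:
            w -= np.dot(q, w) * q            # remove the component along each earlier direction
        norm = np.linalg.norm(w)
        if norm < 1e-12:
            raise ValueError("vectors are linearly dependent")
        basis.append(w / norm)               # normalize to length 1
    return np.array(basis)

Q = gram_schmidt([[1.0, 1.0, 0.0],
                  [1.0, 0.0, 1.0],
                  [0.0, 1.0, 1.0]])
assert np.allclose(Q @ Q.T, np.eye(3))       # rows are orthonormal
```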

The inner product also defines the Hermitian conjugate T* of a linear operator T, satisfying ⟨Tu, v⟩ = ⟨u, T*v⟩. Operators with TT* = T*T are called normal, and they possess a complete orthonormal set of eigenvectors.

Relationship with Geometry

The connection between linear algebra and geometry is ancient, dating back to Descartes and his Cartesian coordinates. Points became tuples of numbers, and lines and planes became linear equations. Solving geometric problems transformed into solving linear systems.

Most geometric transformations – translations, rotations, reflections, projections – map lines to lines. This makes them expressible as linear maps. Even more complex transformations, like homographies in projective space, rely on linear algebra for their description.

While ancient geometry relied on axioms (synthetic geometry), modern approaches often define geometric spaces using vector spaces (affine space, projective space). The two perspectives are largely equivalent. Linear algebra provides the framework for studying geometry not just over the reals but over any field, including finite fields.

Usage and Applications

Linear algebra is not just an academic exercise; it's woven into the fabric of almost every scientific discipline.

Functional Analysis

Functional analysis, which studies function spaces (often infinite-dimensional vector spaces like Hilbert spaces), is built on linear algebra. It's essential for quantum mechanics and Fourier analysis.

Scientific Computation

Virtually all scientific computations involve linear algebra. This has led to highly optimized libraries like BLAS and LAPACK, and specialized hardware from early array processors to modern graphics processing units (GPUs) designed for matrix operations.

Geometry of Ambient Space

Describing ambient space relies on geometry, which in turn relies on linear algebra. This is crucial for mechanics, robotics, geodesy, computer vision, and computer graphics.

Study of Complex Systems

Many physical phenomena, often described by partial differential equations, are tackled by discretizing space into interacting cells. The interactions are frequently modeled linearly, even for nonlinear systems. This leads to massive matrices, as seen in applications like weather forecasting.
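
A toy version of that discretization, as a hedged illustration rather than anything production-grade: the one-dimensional Poisson problem -u″ = f on a grid, where each cell talks only to its two neighbours and the result is a tridiagonal matrix. The grid size and source term are arbitrary choices of mine.

```python
import numpy as np

# A toy discretization: -u''(x) = 1 on [0, 1] with u(0) = u(1) = 0.
# Each interior cell interacts only with its two neighbours, giving a tridiagonal matrix.
n = 6                                          # number of interior grid points (arbitrary)
h = 1.0 / (n + 1)
A = (np.diag(2.0 * np.ones(n))
     - np.diag(np.ones(n - 1),  1)
     - np.diag(np.ones(n - 1), -1)) / h**2

x = np.linspace(h, 1 - h, n)                   # interior grid points
u = np.linalg.solve(A, np.ones(n))             # discrete solution of the boundary-value problem

print(np.max(np.abs(u - 0.5 * x * (1 - x))))   # essentially zero: matches the exact solution x(1-x)/2
```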

Fluid Mechanics, Fluid Dynamics, and Thermal Energy Systems

Linear algebra is indispensable in fluid mechanics and fluid dynamics. It helps model and simulate fluid flow, linearizing complex differential equations like the Navier–Stokes equations for computational analysis. In thermal energy systems, it's used for analyzing and optimizing power systems, from generation to distribution. Concepts like eigenvalue problems are vital for efficiency and reliability.

Extensions and Generalizations

Module Theory

If we relax the requirement that the scalars come from a field and allow them to come from a ring RR, we get a module over RR. Modules are a generalization of vector spaces, but they don't always have bases. Those that do are called free modules. Matrices over rings are a thing, but their theory is more complex, especially when the ring isn't commutative. Modules over the integers are essentially abelian groups.

Multilinear Algebra and Tensors

Multilinear algebra deals with functions that are linear in each of several variables. This leads to the concept of tensors, which are generalizations of vectors and linear forms, and are essential in physics and differential geometry. The dual space is a key player here.

Topological Vector Spaces

For infinite-dimensional vector spaces, we often need more structure. A normed vector space has a notion of "size" (a norm), which induces a topology. Complete normed vector spaces are Banach spaces, and if they also have an inner product, they're Hilbert spaces. These are the workhorses of functional analysis.


So there you have it. Linear algebra. It’s the quiet architect of the quantitative world. Don't expect me to find it fascinating. But I can explain it. If you can keep up.