
Change of variables



Mathematical technique for simplification

(For the concept in partial differential equations, see Change of variables (PDE).)

One might assume that understanding how to simplify a problem is intuitive. Evidently, for some, it requires a dedicated article. This particular entry, much like most things in life, could benefit from more rigorous substantiation. It ‘needs additional citations for verification,’ apparently. As if the inherent logic isn’t enough for the perpetually bewildered. Feel free to improve this article by adding citations to reliable sources. Or don’t. It’s not my universe. Unsourced material, of course, ‘may be challenged and removed.’ Such drama for basic mathematics.


In the grand scheme of mathematics, a change of variables is presented as a fundamental technique. It’s essentially a workaround, deployed to simplify problems where the initial variables are, for whatever reason, too cumbersome. The core idea is to replace these original variables with functions of new, presumably more manageable, variables. The underlying hope—or rather, the explicit intention—is that by expressing the problem in this fresh set of variables, it might become less convoluted, or perhaps even transform into something already understood, a problem you should have recognized in the first place.

This operation, a change of variables, is often confused with mere substitution. And while they are indeed related, like a distant cousin you tolerate at family gatherings, they are distinct. This becomes painfully obvious when one delves into the realms of differentiation (where the chain rule makes its unwelcome appearance) or integration (where integration by substitution demands a bit more rigor than simply swapping symbols). It’s not just a rename; it’s a transformation, ideally for the better, though sometimes it feels like trading one headache for another.

A rather straightforward illustration of how a judicious variable change can be useful manifests in the task of identifying the roots of a sixth-degree polynomial equation:

$$x^{6}-9x^{3}+8=0.$$

There is no general formula in radicals for sixth-degree polynomial equations, a fact that might comfort those perpetually struggling with algebra (see the Abel–Ruffini theorem). However, this particular equation, displaying a certain structural redundancy, can be expressed in a more revealing form:

$$(x^{3})^{2}-9(x^{3})+8=0.$$

This is a rather transparent instance of what’s known as a polynomial decomposition . The equation, with a keen eye, practically begs for simplification. One can achieve this by introducing a new variable, let’s call it $u$, defined as:

$$u=x^{3}.$$

Substituting $x$ with ${\sqrt[{3}]{u}}$ into the original polynomial (or, more efficiently, recognizing the pattern directly) yields a much more approachable form:

$$u^{2}-9u+8=0,$$

which, to anyone with a modicum of algebraic competence, is recognizable as a simple quadratic equation. This simplified equation readily provides two solutions for $u$:

$$u=1\quad {\text{and}}\quad u=8.$$

To obtain the solutions in terms of the original variable, $x$, one merely needs to back-substitute $x^3$ for $u$. This gives us:

$$x^{3}=1\quad {\text{and}}\quad x^{3}=8.$$

Assuming, for the sake of simplicity (or perhaps because the problem statement implied it), that one is primarily interested in the real number solutions, the answers to the original equation are then quite trivial:

$$x=(1)^{1/3}=1\quad {\text{and}}\quad x=(8)^{1/3}=2.$$

Without this change of variable, one might spend an inordinate amount of time grappling with an unnecessarily complex problem, proving that sometimes, seeing the forest for the trees requires a different perspective.
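
For those who would rather let a machine confirm the obvious, the entire substitution can be sketched in a few lines of Python (the function name and the rounding step are purely illustrative):

```python
import math

def solve_sextic():
    """Real roots of x^6 - 9x^3 + 8 = 0 via the substitution u = x^3."""
    # After substituting u = x^3, the equation becomes u^2 - 9u + 8 = 0.
    a, b, c = 1.0, -9.0, 8.0
    disc = b * b - 4 * a * c                        # 81 - 32 = 49
    u_roots = [(-b - math.sqrt(disc)) / (2 * a),    # u = 1
               (-b + math.sqrt(disc)) / (2 * a)]    # u = 8
    # Back-substitute: each positive real u yields one real x = u^(1/3).
    return sorted(round(u ** (1.0 / 3.0), 9) for u in u_roots)

print(solve_sextic())  # [1.0, 2.0]
```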

Simple example

Let’s consider another example, a system of equations that might initially seem a touch more intimidating, particularly if one is prone to overthinking.

$$xy+x+y=71$$ $$x^{2}y+xy^{2}=880$$

Here, $x$ and $y$ are specified as positive integers, with the additional constraint that $x>y$. (This gem, by the way, comes from the 1991 American Invitational Mathematics Examination (AIME) – a testament to the fact that even competitive math problems often rely on clever simplification rather than brute force.)

Solving this system directly, while not impossible, would likely involve a fair amount of tedious algebraic manipulation. However, a moment of observation reveals that the second equation can be factored rather neatly:

$$xy(x+y)=880.$$

Now, if you haven’t already seen the pattern, allow me to point out the obvious. We can introduce two new variables to simplify this structure. Let’s define:

$$s=x+y$$ $$t=xy$$

With these substitutions, the original system of equations magically transforms into a much more manageable pair:

$$s+t=71$$ $$st=880$$

This new system is, in essence, asking for two numbers ($s$ and $t$) whose sum is 71 and whose product is 880. This is a classic setup that can be solved by recognizing that $s$ and $t$ are the roots of the quadratic equation $z^2 - 71z + 880 = 0$. Solving this yields two possible ordered pairs for $(s,t)$:

$$(s,t)=(16,55)\quad {\text{and}}\quad (s,t)=(55,16).$$

Now, we perform the back-substitution. Taking the first ordered pair, $(s,t)=(16,55)$, we get:

$$x+y=16$$ $$xy=55$$ $$x>y$$

Again, we’re looking for two numbers whose sum is 16 and product is 55. These are clearly 11 and 5. Given the condition $x>y$, this leads directly to the solution:

$$(x,y)=(11,5).$$

Next, let’s consider the second ordered pair, $(s,t)=(55,16)$:

$$x+y=55$$ $$xy=16$$ $$x>y$$

Here, we’re seeking two numbers that sum to 55 and multiply to 16. A quick inspection of factors of 16 (1, 2, 4, 8, 16) reveals no pair that sums to 55. Therefore, this branch yields no valid solutions under the given constraints.

Thus, the unique solution that satisfies the entire system is:

$$(x,y)=(11,5).$$

A simple change of variables, and what initially looked like a minor chore becomes almost trivial. It’s almost as if some problems are designed to filter out those who lack the foresight to simplify.
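
The whole argument, including the rejection of the second branch, can be checked mechanically. A minimal Python sketch (the helper name is mine, not the AIME’s):

```python
import math

def solve_aime_system():
    """Solve xy + x + y = 71 and x^2*y + x*y^2 = 880 for positive
    integers with x > y, via the substitution s = x + y, t = xy."""
    solutions = []
    # s and t are the roots of z^2 - 71z + 880 = 0 (sum 71, product 880).
    disc = 71 * 71 - 4 * 880                 # 1521 = 39^2
    z1 = (71 - math.isqrt(disc)) // 2        # 16
    z2 = (71 + math.isqrt(disc)) // 2        # 55
    for s, t in [(z1, z2), (z2, z1)]:
        # Back-substitute: x and y are the roots of w^2 - s*w + t = 0.
        d = s * s - 4 * t
        if d < 0 or math.isqrt(d) ** 2 != d:
            continue                         # no integer roots on this branch
        x = (s + math.isqrt(d)) // 2
        y = (s - math.isqrt(d)) // 2
        if x > y > 0 and x * y + x + y == 71:
            solutions.append((x, y))
    return solutions

print(solve_aime_system())  # [(11, 5)]
```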

Formal introduction

For those who appreciate the excruciating precision of formal definitions, let us delve into the more abstract underpinnings. Consider two smooth manifolds, $A$ and $B$. Let $\Phi: A \rightarrow B$ be a $C^r$-diffeomorphism between them. If that string of jargon didn’t immediately clarify things, allow me to elaborate:

A map $\Phi$ is considered a $C^r$-diffeomorphism if it satisfies specific, rather demanding criteria:

  1. $\Phi$ must be a bijective map. This means it’s both injective (each element of $A$ maps to a unique element of $B$) and surjective (every element of $B$ has a corresponding element in $A$). In simpler terms, it’s a perfect one-to-one correspondence.
  2. $\Phi$ itself must be $r$ times continuously differentiable . This implies a certain smoothness; the function and its first $r$ derivatives exist and are continuous.
  3. Its inverse, $\Phi^{-1}: B \rightarrow A$, must also be $r$ times continuously differentiable. This ensures that the transformation can be smoothly undone.

Here, $r$ can be any natural number (even zero, though that’s less interesting for “smoothness”), $\infty$ (implying the map is smooth to all orders of differentiation), or $\omega$ (denoting an analytic function, which is infinitely differentiable and locally given by a convergent power series).

This meticulously defined map $\Phi$ is formally referred to as a regular coordinate transformation or a regular variable substitution. The term “regular” here specifically refers to the $C^r$-ness of $\Phi$, highlighting its well-behaved nature. Conventionally, one writes $x = \Phi(y)$ to denote the replacement of the original variable $x$ by the new variable $y$. This signifies that for every instance of $x$, you are to substitute the value that $\Phi$ produces when applied to $y$. It’s a precise instruction for transforming your entire problem space, not just a casual swap.
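
As a concrete (and admittedly humble) instance of the definition, $\Phi = \exp$ is a $C^\infty$-diffeomorphism from $\mathbb{R}$ onto $(0,\infty)$, with smooth inverse $\log$. A quick numerical round-trip check in Python, purely for reassurance:

```python
import math

# exp: R -> (0, inf) is bijective and smooth, with smooth inverse log,
# so it is a C^infinity-diffeomorphism onto its image.
phi, phi_inv = math.exp, math.log

for y in [-2.0, -0.5, 0.0, 1.0, 3.0]:
    assert abs(phi_inv(phi(y)) - y) < 1e-12   # Phi^{-1}(Phi(y)) = y
for x in [0.1, 1.0, 7.5]:
    assert abs(phi(phi_inv(x)) - x) < 1e-9    # Phi(Phi^{-1}(x)) = x
print("round-trips verified")
```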

Other examples

Coordinate transformation

Sometimes, the inherent geometry of a problem is simply incompatible with the chosen coordinate system. Trying to solve a problem with radial symmetry using Cartesian coordinates is like trying to hammer a screw. It works, eventually, but it’s inefficient and painful. This is where coordinate transformations shine.

Consider, for instance, the equation:

$$U(x,y):=(x^{2}+y^{2}){\sqrt {1-{\frac {x^{2}}{x^{2}+y^{2}}}}}=0.$$

This might represent a potential energy function in some esoteric physical scenario, or perhaps just a particularly convoluted algebraic expression. If a solution isn’t immediately obvious (and why would it be?), one might consider a transformation to polar coordinates. This involves the substitution:

$$(x,y)=\Phi (r,\theta )$$

where $\Phi$ is given by:

$$\Phi (r,\theta )=(r\cos(\theta ),r\sin(\theta )).$$

A crucial detail here, often overlooked by the less meticulous, is the bijectivity of the map $\Phi$. If $\theta$ is allowed to range freely, for example, beyond a $2\pi$-length interval such as $[0, 2\pi)$, the map $\Phi$ ceases to be bijective: a point $(r, \theta)$ maps to the same $(x,y)$ as $(r, \theta + 2\pi)$. To maintain bijectivity, $\Phi$ must be restricted, for instance, to a domain like $(0, \infty) \times [0, 2\pi)$. Notice the exclusion of $r=0$: at the origin, $\Phi$ is not injective, since $(0, \theta)$ maps to $(0, 0)$ for every value of $\theta$. Such details are not mere pedantry; they are fundamental to the validity of the transformation.

Upon replacing all occurrences of the original variables ($x$ and $y$) with their new expressions as prescribed by $\Phi$, and leveraging the undeniably useful trigonometric identity $\sin^2 x + \cos^2 x = 1$, we arrive at:

$$V(r,\theta )=r^{2}{\sqrt {1-{\frac {r^{2}\cos ^{2}\theta }{r^{2}}}}}=r^{2}{\sqrt {1-\cos ^{2}\theta }}=r^{2}\left|\sin \theta \right|.$$

Now, the solutions become painfully obvious: for $V(r,\theta)$ to be zero, either $r=0$ (which we’ve already excluded for bijectivity reasons, and is not a solution to the original problem), or $\sin(\theta)=0$. This implies $\theta=0$ or $\theta=\pi$. Applying the inverse of $\Phi$ back to Cartesian coordinates reveals that this is equivalent to $y=0$, provided $x \not= 0$. Indeed, one can readily observe that for $y=0$, the original function $U(x,y)$ vanishes, everywhere except for the origin itself.

It’s worth reiterating the importance of the bijectivity of $\Phi$. Had we carelessly allowed $r=0$ in our polar coordinate domain, the origin $(0,0)$ would have appeared as a solution in the transformed space, even though it does not satisfy the original equation (the term under the square root is undefined at the origin). The original function, for $x,y \in \mathbb{R}$, is always non-negative, which is why the absolute value around $\sin \theta$ is necessary, a detail easily overlooked by the less attentive.
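
Sceptics can verify the transformed expression numerically. A short Python check that $V(r,\theta)=r^2|\sin\theta|$ agrees with $U$ at corresponding points, and that $y=0$, $x\neq 0$ indeed kills $U$ (the function names are mine):

```python
import math

def U(x, y):
    """Original Cartesian expression; defined for (x, y) != (0, 0)."""
    r2 = x * x + y * y
    return r2 * math.sqrt(1.0 - x * x / r2)

def V(r, theta):
    """Polar form after the change of variables: r^2 * |sin(theta)|."""
    return r * r * abs(math.sin(theta))

# V agrees with U at corresponding points (r > 0).
for r, theta in [(1.0, 0.3), (2.5, 2.0), (0.7, 4.0)]:
    x, y = r * math.cos(theta), r * math.sin(theta)
    assert abs(U(x, y) - V(r, theta)) < 1e-9

# theta in {0, pi} corresponds to the solution set y = 0, x != 0.
assert U(3.0, 0.0) == 0.0 and U(-1.5, 0.0) == 0.0
print("polar transformation verified")
```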

Differentiation

The chain rule is perhaps the most ubiquitous application of a change of variables in differential calculus. It serves to simplify the differentiation of composite functions, essentially breaking down a complex problem into more manageable, successive steps. For instance, consider the task of calculating the derivative:

$${\frac {d}{dx}}\sin(x^{2}).$$

One could, of course, attempt to differentiate this directly, but the chain rule offers a more elegant and systematic approach. We introduce an intermediate variable. Let $y = \sin u$, where $u = x^2$. This establishes a clear hierarchy of dependencies: $y$ depends on $u$, and $u$ depends on $x$. Then, the steps unfold as follows:

$${\begin{aligned}{\frac {d}{dx}}\sin(x^{2})&={\frac {dy}{dx}}\\[6pt]&={\frac {dy}{du}}\,{\frac {du}{dx}}&&{\text{This part is the chain rule.}}\\[6pt]&=\left({\frac {d}{du}}\sin u\right)\left({\frac {d}{dx}}x^{2}\right)\\[6pt]&=(\cos u)(2x)\end{aligned}}$$

At this point, having performed the differentiation with respect to the intermediate variable, we simply substitute back the original expression for $u$:

$${\begin{aligned}&=\left(\cos(x^{2})\right)(2x)\\&=2x\cos(x^{2})\end{aligned}}$$

The chain rule, fundamentally, is a change of variables for derivatives. It allows us to transform a derivative with respect to one variable into a product of derivatives with respect to intermediate variables, simplifying the process considerably. It’s a testament to the power of modular thinking, even in mathematics.
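
One can confirm the chain-rule result with a central finite difference, for the terminally suspicious (helper names are illustrative):

```python
import math

def f(x):
    return math.sin(x * x)

def f_prime(x):
    """Chain-rule result: (d/du sin u) * (d/dx x^2) = cos(x^2) * 2x."""
    return 2 * x * math.cos(x * x)

def numeric_derivative(g, x, h=1e-6):
    """Central finite-difference approximation of g'(x)."""
    return (g(x + h) - g(x - h)) / (2 * h)

for x in [0.0, 0.5, 1.3, 2.0]:
    assert abs(numeric_derivative(f, x) - f_prime(x)) < 1e-6
print("chain rule confirmed numerically")
```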

Integration

Just as differentiation benefits from strategic variable changes, so too does integration. Many integrals that initially appear intractable can often be elegantly evaluated by transforming the variables. This technique is formally supported by the substitution rule for integrals, which is, in essence, the integral counterpart to the chain rule in differentiation. It allows one to convert a complex integral into a simpler form by introducing a new variable, effectively changing the domain of integration.

Beyond direct substitution, more sophisticated changes of variables are employed, particularly in multivariable calculus. Difficult multiple integrals, for example, can often be simplified by transforming the coordinate system itself. This is facilitated by the use of the Jacobian matrix and determinant. The Jacobian determinant quantifies how infinitesimal volumes (or areas) are scaled by a coordinate transformation. When changing variables in a multiple integral, one must multiply the integrand by the absolute value of the Jacobian determinant of the transformation. This ensures that the integral’s value remains invariant despite the change in coordinates. This reliance on the Jacobian determinant and the corresponding variable change forms the very foundation of commonly used coordinate systems such as polar, cylindrical, and spherical coordinate systems, each designed to simplify integrals over specific types of regions or integrands. [1]
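
To see the Jacobian at work, here is a crude midpoint-rule computation in Python of the area of the unit disk in polar coordinates, where the factor $r$ is exactly the Jacobian determinant of $(r,\theta)\mapsto(r\cos\theta, r\sin\theta)$ (the function name and grid size are arbitrary):

```python
import math

def disk_area_polar(n=200):
    """Area of the unit disk via the polar double integral
    int_0^{2pi} int_0^1 (1) * r dr dtheta, midpoint rule in both variables.
    The factor r is the Jacobian determinant of the polar map."""
    dr = 1.0 / n
    dt = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr                 # midpoint in r
        for _ in range(n):
            total += 1.0 * r * dr * dt     # integrand 1, times Jacobian r
    return total

print(disk_area_polar())   # close to pi
assert abs(disk_area_polar() - math.pi) < 1e-9
```

The midpoint rule is exact here because the integrand $r$ is linear, so the result matches $\pi$ up to rounding.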

Change of variables formula in terms of Lebesgue measure

For those who demand unwavering rigor, particularly in the realm of measure theory, the change of variables formula takes on a more abstract and generalized form, specifically concerning the Lebesgue measure. This theorem provides a precise mechanism for relating integrals with respect to the Lebesgue measure to an equivalent integral under a transformed measure, which arises from a parameterization $G$. [2] The proof, a rather involved affair, typically proceeds through approximations using the Jordan content, a precursor to the Lebesgue measure concept.

Let us define the stage: Suppose $\Omega$ is an open subset of $\mathbb{R}^n$ (the $n$-dimensional Euclidean space), and $G: \Omega \rightarrow \mathbb{R}^n$ is a $C^1$-diffeomorphism. This means $G$ is continuously differentiable with a continuously differentiable inverse, ensuring a “smooth” and invertible transformation.

Under these conditions, the theorem asserts two crucial points:

  • If $f$ is a Lebesgue measurable function defined on the image $G(\Omega)$, then the composite function $f \circ G$ (which means $f$ applied to $G(x)$) is Lebesgue measurable on the original domain $\Omega$. Furthermore, if $f \geq 0$ (meaning $f$ is a non-negative function) or if $f \in L^1(G(\Omega), m)$ (meaning $f$ is Lebesgue integrable over $G(\Omega)$ with respect to the Lebesgue measure $m$), then the fundamental change of variables formula holds:

    $$\int _{G(\Omega )}f(x)\,dx=\int _{\Omega }f\circ G(x)\,|{\text{det}}D_{x}G|\,dx.$$

    Here, $D_x G$ represents the Jacobian matrix of the transformation $G$ at point $x$, and $|{\text{det}}D_x G|$ is its absolute determinant. This term, often referred to as the Jacobian determinant, accounts for the scaling of volume elements during the transformation. It’s the rigorous justification for why those pesky $r$, $r^2 \sin\phi$ terms appear in polar or spherical integrals.

  • If $E \subset \Omega$ is a Lebesgue measurable set, then its image under the transformation, $G(E)$, is also Lebesgue measurable. In this case, the Lebesgue measure of the transformed set $G(E)$ can be computed by integrating the absolute Jacobian determinant over the original set $E$:

    $$m(G(E))=\int _{E}|{\text{det}}D_{x}G|\,dx.$$

    This second point is essentially a special case of the first, where $f(x)$ is taken to be the indicator function of the set $G(E)$. It directly relates the “size” (measure) of a set before and after a smooth transformation.
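
The second point can be sanity-checked for a linear map, where $|{\text{det}}D_{x}G|$ is constant and $m(G(E))$ should equal $|\det A|\cdot m(E)$. A rough Monte Carlo sketch in Python (the function name, sampling scheme, and tolerance are all mine):

```python
import random

def image_area_monte_carlo(a, b, c, d, n=200_000):
    """Estimate m(G(E)) for the linear map G(x, y) = (a x + b y, c x + d y)
    and E = [0, 1]^2, then compare with |det DG| = |a d - b c|."""
    det = a * d - b * c
    # The image is a parallelogram with vertices (0,0), (a,c), (b,d), (a+b,c+d).
    xs = [0.0, a, b, a + b]
    ys = [0.0, c, d, c + d]
    x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)
    box_area = (x1 - x0) * (y1 - y0)
    rng = random.Random(0)                  # fixed seed for reproducibility
    hits = 0
    for _ in range(n):
        u, v = rng.uniform(x0, x1), rng.uniform(y0, y1)
        # (u, v) lies in G(E) iff G^{-1}(u, v) lies in the unit square.
        x = (d * u - b * v) / det
        y = (-c * u + a * v) / det
        if 0.0 <= x <= 1.0 and 0.0 <= y <= 1.0:
            hits += 1
    return hits / n * box_area, abs(det)

estimate, jacobian = image_area_monte_carlo(2.0, 1.0, 0.5, 3.0)
print(estimate, jacobian)   # estimate should be close to |det| = 5.5
assert abs(estimate - jacobian) < 0.1
```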

As a direct consequence of this comprehensive theorem, one can then precisely compute the Radon–Nikodym derivatives for both the pullback and pushforward measures of $m$ (the Lebesgue measure) under a suitable transformation $T$. These derivatives are crucial for understanding how measures transform and interact under mappings.

Pullback measure and transformation formula

The concept of a pullback measure is defined in terms of a transformation $T$. Specifically, for a measurable set $A$ in the source space, the pullback measure $T^*\mu$ is given by:

$$(T^{*}\mu )(A):=\mu (T(A)).$$

This definition effectively “pulls back” the measure from the target space to the source space, evaluating the measure of the image of a set under $T$. The corresponding change of variables formula for these pullback measures is elegantly expressed as:

$$\int _{T(\Omega )}g\,d\mu =\int _{\Omega }g\circ T\,dT^{*}\mu.$$

This equation states that integrating a function $g$ over the transformed domain $T(\Omega)$ with respect to the measure $\mu$ is equivalent to integrating the composite function $g \circ T$ over the original domain $\Omega$ with respect to the pullback measure $T^*\mu$. It’s a formal statement of how integrals adjust when the underlying measure space is transformed.

Pushforward measure and transformation formula

Conversely, the pushforward measure, also defined in relation to a transformation $T$, considers the measure of preimages. For a set $A$ in the target space, the pushforward measure $T_*\mu$ is defined as:

$$(T_{*}\mu )(A):=\mu (T^{-1}(A)).$$

Here, the measure is “pushed forward” from the source space to the target space, using the inverse transformation $T^{-1}$. The associated change of variables formula for pushforward measures is given by:

$$\int _{\Omega }g\circ T\,d\mu =\int _{T(\Omega )}g\,dT_{*}\mu.$$

This formula indicates that integrating the composite function $g \circ T$ over the original domain $\Omega$ with respect to $\mu$ is equivalent to integrating $g$ over the transformed domain $T(\Omega)$ with respect to the pushforward measure $T_*\mu$. These concepts are indispensable for advanced analysis, particularly in fields like probability theory and stochastic processes.

As a direct consequence of the change of variables formula for the Lebesgue measure, we can derive specific expressions for the Radon-Nikodym derivatives and the corresponding integral formulas:

  • The Radon-Nikodym derivative of the pullback measure $T^*m$ with respect to the Lebesgue measure $m$ is given by the absolute value of the Jacobian determinant of the transformation $T$:

    $${\frac {dT^{*}m}{dm}}(x)=|{\text{det}}D_{x}T|.$$

    This quantifies how the density of the pullback measure relates to the original Lebesgue measure.

  • Similarly, the Radon-Nikodym derivative of the pushforward measure $T_*m$ with respect to the Lebesgue measure $m$ involves the absolute value of the Jacobian determinant of the inverse transformation $T^{-1}$:

    $${\frac {dT_{*}m}{dm}}(x)=|{\text{det}}D_{x}T^{-1}|.$$

    This is a natural extension, reflecting the inverse nature of the pushforward.

From these derivatives, we can then obtain the explicit forms of the change of variables formulas for these specific measures:

  • The change of variables formula for pullback measure, incorporating the Jacobian, becomes:

    $$\int _{T(\Omega )}g\,dm=\int _{\Omega }g\circ T\,dT^{*}m=\int _{\Omega }g\circ T\,|{\text{det}}D_{x}T|\,dm(x).$$

    This formula is the rigorous generalization of the integral substitution rule for multiple dimensions and arbitrary smooth transformations.

  • The change of variables formula for pushforward measure, also incorporating the Jacobian, is expressed as:

    $$\int _{\Omega }g\,dm=\int _{T(\Omega )}g\circ T^{-1}\,dT_{*}m=\int _{T(\Omega )}g\circ T^{-1}\,|{\text{det}}D_{x}T^{-1}|\,dm(x).$$

    This provides an alternative perspective, relating an integral over the original domain to one over the transformed domain using the inverse mapping and its Jacobian. These formulas are the bedrock for many advanced mathematical constructs, though they often remain hidden beneath layers of applied computation.
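
In one dimension these formulas become refreshingly concrete. For $T(x)=2x$ on $\Omega=(0,1)$, the pushforward density is $|{\text{det}}D_{x}T^{-1}| = 1/2$, and the pushforward formula reduces to $\int_0^1 g(2x)\,dx = \int_0^2 g(u)\cdot\tfrac{1}{2}\,du$. A midpoint-rule check in Python (the choice $g(u)=u^2$ is arbitrary):

```python
def g(u):
    return u * u

n = 100_000
# Left side: integral of g(T(x)) = g(2x) over Omega = (0, 1).
lhs = sum(g(2.0 * (i + 0.5) / n) * (1.0 / n) for i in range(n))
# Right side: integral of g(u) times the density 1/2 over T(Omega) = (0, 2).
rhs = sum(g((i + 0.5) * (2.0 / n)) * 0.5 * (2.0 / n) for i in range(n))

assert abs(lhs - rhs) < 1e-9
assert abs(lhs - 4.0 / 3.0) < 1e-6   # exact value: int_0^1 4 x^2 dx = 4/3
print(lhs, rhs)
```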

Differential equations

The utility of variable changes extends far beyond mere algebraic simplifications and basic calculus. In the realm of differential equations, they are not just a convenience but often a necessity, a lifeline to render otherwise intractable problems solvable. While the elementary applications for differentiation and integration are taught early in calculus courses, and the steps are often so ingrained they’re performed almost unconsciously, the true breadth of variable changes becomes apparent when tackling the complexities of differential equations.

Here, the transformation can involve either the independent variables (often facilitated by the ubiquitous chain rule) or the dependent variables, which necessitates a more intricate process of differentiation. Moreover, more “exotic” transformations exist, such as the mingling of dependent and independent variables in point transformations and contact transformations. These can be extraordinarily complicated to implement, requiring a deep understanding of the underlying geometric structures. However, they offer a vast degree of freedom, allowing mathematicians and physicists to exploit symmetries and inherent structures within differential equations that would be entirely obscured in their original form.

Very often, rather than guessing the perfect transformation, a general form for a change of variables is initially substituted into a problem. Parameters within this general form are then meticulously chosen along the way, guided by the objective of maximally simplifying the problem. It’s an iterative process of educated guesswork and algebraic refinement, aimed at reducing the equation to a known solvable form or at least one that is more amenable to analysis.

Scaling and shifting

Perhaps the most fundamental, and arguably the most frequently employed, change of variables is the simple scaling and shifting of existing variables. This involves replacing variables with new ones that are merely “stretched” (scaled) and “moved” (shifted) by constant amounts. This technique is remarkably common in practical applications, particularly in engineering and physics, where it serves to extract dimensionless physical parameters from complex equations. It’s about normalizing the problem, stripping away the arbitrary units to reveal the underlying relationships.

For an $n$-th order derivative, the change in variables results in a straightforward scaling factor:

$${\frac {d^{n}y}{dx^{n}}}={\frac {y_{\text{scale}}}{x_{\text{scale}}^{n}}}{\frac {d^{n}{\hat {y}}}{d{\hat {x}}^{n}}}$$

where the original variables are related to the new, “hatted” variables by:

$$x={\hat {x}}x_{\text{scale}}+x_{\text{shift}}$$ $$y={\hat {y}}y_{\text{scale}}+y_{\text{shift}}.$$

This relationship can be readily demonstrated through repeated application of the chain rule and by leveraging the inherent linearity of differentiation.
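
A numerical spot-check of the scaling rule for $n=2$, using finite differences (the sample function and constants are arbitrary):

```python
import math

# x = xhat * x_scale + x_shift,  y = yhat * y_scale + y_shift.
x_scale, x_shift = 2.0, 0.3
y_scale, y_shift = 5.0, -1.0

def y_of_x(x):
    return math.sin(x)              # arbitrary smooth test function

def yhat_of_xhat(xhat):
    """y expressed in the hatted (scaled/shifted) variables."""
    return (y_of_x(xhat * x_scale + x_shift) - y_shift) / y_scale

def second_derivative(f, x, h=1e-4):
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)

xhat = 0.7
x = xhat * x_scale + x_shift
lhs = second_derivative(y_of_x, x)                               # d^2y/dx^2
rhs = (y_scale / x_scale ** 2) * second_derivative(yhat_of_xhat, xhat)
assert abs(lhs - rhs) < 1e-5
print(lhs, rhs)   # both approximate -sin(1.7)
```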

To illustrate its practical utility, consider the boundary value problem for fluid flow:

$$\mu {\frac {d^{2}u}{dy^{2}}}={\frac {dp}{dx}}\quad ;\quad u(0)=u(L)=0$$

This equation describes the parallel flow of a viscous fluid between two flat, solid walls separated by a distance $L$. Here, $\mu$ represents the viscosity of the fluid, and $dp/dx$ is the pressure gradient along the flow direction, both of which are assumed to be constant. This problem, while not overly complex, contains several physical parameters that clutter the equation.

By strategically scaling the variables, the problem can be transformed into a far more elegant and universal form. Let’s introduce dimensionless variables:

$$y={\hat {y}}L\qquad {\text{and}}\qquad u={\hat {u}}{\frac {L^{2}}{\mu }}{\frac {dp}{dx}}.$$

Substituting these into the original equation and simplifying yields:

$${\frac {d^{2}{\hat {u}}}{d{\hat {y}}^{2}}}=1\quad ;\quad {\hat {u}}(0)={\hat {u}}(1)=0$$

This scaled version is remarkably simpler. All the physical constants have been bundled into the scaling factors, leaving a pure mathematical problem.
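
The scaled problem is simple enough to solve by hand ($\hat u(\hat y) = (\hat y^2 - \hat y)/2$), but it also makes a tidy test case for a finite-difference solver. A sketch in Python using the Thomas algorithm for the tridiagonal system (all names are illustrative):

```python
def solve_scaled_poiseuille(n=100):
    """Solve d^2 uhat / dyhat^2 = 1 with uhat(0) = uhat(1) = 0 using
    second-order central differences on n intervals."""
    h = 1.0 / n
    # Interior unknowns uhat_1 .. uhat_{n-1}; tridiagonal system
    # (u[i-1] - 2 u[i] + u[i+1]) / h^2 = 1 with zero boundary values.
    sub = [1.0] * (n - 1)
    diag = [-2.0] * (n - 1)
    sup = [1.0] * (n - 1)
    rhs = [h * h] * (n - 1)
    # Thomas algorithm: forward elimination, then back substitution.
    for i in range(1, n - 1):
        w = sub[i] / diag[i - 1]
        diag[i] -= w * sup[i - 1]
        rhs[i] -= w * rhs[i - 1]
    u = [0.0] * (n - 1)
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 3, -1, -1):
        u[i] = (rhs[i] - sup[i] * u[i + 1]) / diag[i]
    return [0.0] + u + [0.0], h

u, h = solve_scaled_poiseuille()
# Compare with the exact solution uhat = (yhat^2 - yhat) / 2.
for i, ui in enumerate(u):
    yh = i * h
    assert abs(ui - (yh * yh - yh) / 2.0) < 1e-10
print("minimum of uhat:", min(u))   # about -0.125, at yhat = 0.5
```

The scheme is exact here (the solution is quadratic), which is exactly the sort of clean check the dimensionless form makes possible.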

Scaling is invaluable for several reasons:

  1. Simplification of Analysis: It reduces the number of independent parameters, making the underlying mathematical structure clearer and less cluttered.
  2. Normalization: Proper scaling can normalize variables, transforming them into a sensible, unitless range (e.g., from 0 to 1), which aids in interpretation and comparison across different physical systems.
  3. Computational Efficiency: For problems requiring numerical solutions, fewer parameters translate directly into a smaller computational burden, leading to faster and more efficient simulations.

It’s a testament to the idea that sometimes, the most profound insights come from stripping away the superficial details to reveal the fundamental core.

Momentum vs. velocity

In the realm of classical mechanics, certain systems of equations can become more transparent with a simple, yet powerful, change of variables. Consider a system described by:

$${\begin{aligned}m{\dot {v}}&=-{\frac {\partial H}{\partial x}}\\[5pt]m{\dot {x}}&={\frac {\partial H}{\partial v}}\end{aligned}}$$

This pair of equations might arise from a Hamiltonian formulation for a given function $H(x,v)$, where $m$ is mass, $x$ is position, and $v$ is velocity ($\dot{v}$ and $\dot{x}$ denote time derivatives of velocity and position, respectively). The mass $m$ appears in both equations, somewhat obscuring the symmetry.

The mass can be elegantly eliminated by a seemingly trivial, yet effective, substitution. We introduce a new variable, $p$, representing momentum, defined such that $v = \Phi(p) = (1/m) \cdot p$. Clearly, this is a bijective map from $\mathbb{R}$ to $\mathbb{R}$, meaning it’s a perfectly reversible transformation.

Under this substitution $v = \Phi(p)$, the system of equations transforms into:

$${\begin{aligned}{\dot {p}}&=-{\frac {\partial H}{\partial x}}\\[5pt]{\dot {x}}&={\frac {\partial H}{\partial p}}\end{aligned}}$$

This new form, where $p$ replaces $v$, is the canonical representation of Hamilton’s equations. It reveals a beautiful symmetry between position and momentum, a symmetry that is somewhat veiled when expressed in terms of velocity. This transformation is not just a cosmetic change; it’s a fundamental shift in perspective that simplifies the theoretical framework and highlights the conserved quantities inherent in the system. It’s a classic example of how choosing the “right” variables can illuminate the underlying physics.
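
The symmetric form lends itself directly to numerical integration. A minimal Python sketch: Hamilton’s equations for a harmonic oscillator $H(x,p) = p^2/2m + kx^2/2$, stepped with the symplectic Euler method (the constants and step size are arbitrary), with the energy staying close to its initial value:

```python
m, k = 2.0, 3.0          # mass and spring constant (arbitrary)
x, p = 1.0, 0.0          # initial position and momentum
dt, steps = 1e-3, 10_000

def energy(x, p):
    """H(x, p) = p^2 / (2m) + k x^2 / 2 for the harmonic oscillator."""
    return p * p / (2.0 * m) + k * x * x / 2.0

e0 = energy(x, p)
for _ in range(steps):
    p += -k * x * dt     # pdot = -dH/dx
    x += (p / m) * dt    # xdot =  dH/dp, using the updated p (symplectic)
e1 = energy(x, p)

assert abs(e1 - e0) / e0 < 1e-2   # symplectic: energy drift stays bounded
print(e0, e1)
```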

Lagrangian mechanics

The transition from Newtonian mechanics to Lagrangian mechanics is one of the most profound examples of the power of variable transformations in physics. Given a force field $\varphi(t,x,v)$, Isaac Newton’s traditional equations of motion are expressed in terms of Cartesian coordinates ($x$) and second time derivatives:

$$m{\ddot {x}}=\varphi (t,x,v).$$

These equations, while fundamental, can become incredibly cumbersome for systems with complex constraints or non-Cartesian geometries. Joseph-Louis Lagrange sought a more general formulation. He meticulously examined how these Newtonian equations of motion would transform under an arbitrary substitution of variables. Instead of sticking to Cartesian coordinates, he introduced generalized coordinates, $y$, related to $x$ by a transformation $x = \Psi(t,y)$. The corresponding generalized velocities, $w$, are then related to the original velocity $v$ by:

$$v={\frac {\partial \Psi (t,y)}{\partial t}}+{\frac {\partial \Psi (t,y)}{\partial y}}\cdot w.$$

Through this generalized change of variables, Lagrange discovered that the equations of motion could be expressed in a far more elegant and universally applicable form, known as the Euler–Lagrange equations:

$${\frac {\partial {L}}{\partial y}}={\frac {\mathrm {d} }{\mathrm {d} t}}{\frac {\partial {L}}{\partial {w}}}$$

These equations are found to be entirely equivalent to Newton’s equations, but they operate on a new function, the Lagrangian, $L$, defined as $L = T - V$. Here, $T$ represents the kinetic energy of the system, and $V$ represents its potential energy.

The genius of Lagrangian mechanics lies precisely in this change of variables. When the substitution $\Psi$ is chosen judiciously—often by exploiting the inherent symmetries and constraints of the system—these Euler-Lagrange equations become significantly simpler to solve than Newton’s equations expressed in their original Cartesian form. They allow for a more natural description of motion in curvilinear coordinates, effectively side-stepping the need to explicitly calculate constraint forces. It’s a testament to the idea that sometimes, the problem isn’t the physics, but the lens through which you’re viewing it. A well-chosen transformation can turn an intractable mess into a beautifully solvable puzzle.

See also