From the three-body problem to KAM
I believe many physicists have come across KAM theory in the physics literature. For my part, I have read about KAM many times in papers discussing the discrete regime of wave turbulence at low nonlinearity. Nevertheless, I never had a deep understanding of the concept beyond knowing that the system behaves "like a linear one" or "like an integrable one". Recently, in a project with one of my students, we found a breather solution in a derivative NLS, for which we suspected that the oscillating part is supported on a KAM torus. This motivated me to seek a deeper understanding of KAM theory. However, most of the mathematical materials I found online are not immediately accessible, as many are written in the language of symplectic manifolds, which requires further effort to digest. That was until I finally found the book "The KAM Story: A Friendly Introduction to the Content, History, and Significance of Classical Kolmogorov-Arnold-Moser Theory". The contents of the book are very understandable to me (and, I believe, to physicists in general), and are presented from the perspective of historical development (of which I am a big fan). Although the book does not contain a detailed mathematical proof of the KAM theorem, it provides plenty of the intuition underlying the theory, including its connections to concepts such as integrable systems, chaos, and ergodicity. In this article, I summarize and reorganize the material I learned from reading the book (with many descriptions taken directly from it), following the order of historical development.
Newton, the Principia and integrability
Newton's monograph, the Principia, inaugurated the modern form of mathematical physics: modeling phenomena by way of differential equations. From this principle Newton was able to describe precisely the movements of the planets. On a mathematical level, the calculation of such movements can be cast as the n-body problem.
n-body problem: the dynamical system that describes the motions of $n$ given masses interacting by mutual attraction according to Newton’s law of gravity.
The simplest non-trivial member of the n-body family is Kepler's problem, which describes the motion of a point mass moving in an inward central force field with force inversely proportional to the square of the distance from the center. One can visualize Kepler's problem as a special case of the two-body problem ($n=2$) in which the mass of one body is much greater than that of the other (e.g., the Sun and the Earth). Kepler's problem is analytically solvable, and Newton solved it, providing theoretical support for Kepler's laws of planetary motion, which were originally obtained from observation. The general two-body problem ($n=2$ with comparable masses) is still straightforward to solve analytically, since it reduces to Kepler's problem through a very simple manipulation (passing to center-of-mass and relative coordinates).
The real trouble with the n-body problem begins when $n$ is three or greater, as Newton discovered to his dismay when he tried to use his newly enunciated laws to calculate the motions of the earth-moon-sun system (in particular, to describe the motion of the moon). Newton famously remarked that "his head never ached but with his studies on the moon". Newton's unsuccessful attempt at the three-body problem led to deeper questions on at least two fronts: (1) Is the solar system stable? That is, even if its motions aren't periodic, or completely predictable, can we determine whether it will "hang together", or instead suffer calamity (fly apart or suffer an internal collision) over some given long interval of time? (2) What kinds of differential equations are exactly solvable, in the sense that the system's solution is given by explicit formulas or integrals (which may not be expressible in terms of elementary functions)?
Question (1) is not easy to answer, and we will come back to it later in the article. Here we mention that at the beginning of the 18th century, Newton famously wrote that the solar system needed occasional divine intervention (presumably a nudge here and there from the hand of God) in order to remain stable. This has been interpreted to mean that Newton believed his mathematical model of the solar system, the n-body problem, did not have stable solutions.
To answer question (2), we need to discuss the concept of integrable system. Let us consider a Hamiltonian system
$\frac{dq_i}{dt}=\frac{\partial H}{\partial p_i}$, $\frac{dp_i}{dt}=-\frac{\partial H}{\partial q_i}$, $i=1,...,n$,
with $n$ degrees of freedom and a $2n$-dimensional phase space $M$. This Hamiltonian system is completely integrable (in the sense of Liouville integrability) if it has $n$ constants of motion $f_1, f_2,..., f_n$ which are independent and in involution.
Here, two smooth functions $f_i$, $f_j$ on $M$ are in involution if $\{f_i,f_j\}=0$ (i.e., their Poisson bracket vanishes on $M$). A finite set of smooth functions $\{ f_1, f_2, ..., f_n \}$ is independent on $M$ if the corresponding gradient vectors $\{ Df_1, Df_2, ..., Df_n \}$ are linearly independent at almost every point of $M$. A non-constant smooth function $f:M\rightarrow \mathbb{R}$ is a first integral (or constant of motion) if it is constant when evaluated along solutions of the Hamiltonian system, i.e., $f(q(t),p(t))=\text{const}$ for any solution $(q(t), p(t))$, or equivalently $\{f,H\}=0$.
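These definitions are easy to test numerically. The following minimal sketch (my own illustration, not from the book) evaluates the Poisson bracket via central finite differences for the planar Kepler problem, checking that the angular momentum $L = q_1 p_2 - q_2 p_1$ is a first integral, i.e. in involution with $H$; the phase-space point and step size are arbitrary choices.

```python
import math

def grad(f, x, h=1e-6):
    # central-difference gradient of f : R^{2n} -> R at the point x
    g = []
    for i in range(len(x)):
        xp, xm = list(x), list(x)
        xp[i] += h
        xm[i] -= h
        g.append((f(xp) - f(xm)) / (2 * h))
    return g

def poisson(f, g, x):
    # {f,g} = sum_i df/dq_i dg/dp_i - df/dp_i dg/dq_i, with x = (q1,...,qn,p1,...,pn)
    Df, Dg = grad(f, x), grad(g, x)
    n = len(x) // 2
    return sum(Df[i] * Dg[n + i] - Df[n + i] * Dg[i] for i in range(n))

# planar Kepler problem: H = |p|^2/2 - 1/|q|, angular momentum L = q1 p2 - q2 p1
H = lambda x: 0.5 * (x[2] ** 2 + x[3] ** 2) - 1.0 / math.hypot(x[0], x[1])
L = lambda x: x[0] * x[3] - x[1] * x[2]

x0 = [1.0, 0.3, -0.2, 0.9]          # arbitrary phase-space point
print(abs(poisson(L, H, x0)))        # ~0: L and H are in involution
```

The same routine can be reused to check independence and involution of any candidate set of first integrals before trusting an "integrability" claim.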
While the definition of integrability involves some technical points, its main idea can be described as follows: for a $2n$-dimensional phase space $M$, setting each of the $n$ first integrals equal to a (separate) constant cuts out an $n$-dimensional smooth surface (manifold). Because it is "cut out" in this way by first integrals, this surface is clearly invariant under the flow; i.e., any solution beginning on the surface remains on it for all later times (instead of wandering through the whole phase space). Furthermore, the structure of this surface is described by the LMAJ theorem.
Liouville-Mineur-Arnold-Jost (LMAJ) theorem: Whenever such a surface is bounded and connected, it is an $n$-dimensional torus, denoted $\mathbb{T}^n$. Moreover, near these tori there are special canonical coordinates $(\theta, I)=(\theta_1, ..., \theta_n, I_1, ..., I_n)$, called action-angle variables, in which the Hamiltonian takes the simple form $H=h(I)$ with canonical equations
$\frac{dI_k}{dt}=0$, $\frac{d\theta_k}{dt}=\omega_k$, $k=1,...,n$
having very simple solutions
$I_k(t)=I_k^0$, $\theta_k(t)=\theta_k^0 + \omega_k t \pmod{1}$, $k=1,...,n$.
Here $I_k^0$, $\theta_k^0$ are the initial conditions and $\omega_k=\frac{\partial h(I)}{\partial I_k}$ are the frequencies.
Remarks: The invariant surfaces that are tori are called invariant tori (or Lagrangian tori). When they occupy a region of $M$, mathematicians say that the $n$-dimensional tori foliate the region of $2n$ dimensional phase space they occupy. Since action values are fixed under the flow, each torus may be labeled (or parameterized) by the initial action $I_0$ of a trajectory on it. Once an individual torus $I = I_0$ is selected, the angles $\theta = (\theta_1, . . . , \theta_n)$ form a coordinate system on it. Individual trajectories of the flow wrap around the torus with constant angular frequency $\omega_k$ in each angular coordinate $\theta_k$ (the $\omega_k=\omega_k(I)$ are constant on each torus because the actions $I=I_0$ are constant there). Some simple examples of integrable systems with phase spaces and trajectories are shown in the following figure.
In general, if a system is integrable, then it is exactly "solvable", as guaranteed by the LMAJ theorem. For Kepler's problem with 3 degrees of freedom, one can find 5 independent first integrals (the total energy, the three components of the angular momentum, and the three components of the Laplace-Runge-Lenz vector, these seven quantities being related by two equations), which makes the problem superintegrable and its bounded solutions periodic. For n-body problems with $n\geq 3$, it was Poincaré who showed that such problems are not completely integrable, which will be discussed in the next section.
Before going into Poincaré's work, we will need to introduce two concepts that will become important later.
Resonant vs. non-resonant tori: On an $n$-dimensional torus with frequency vector $\omega =(\omega_1, ..., \omega_n)$, consider the equation $k\cdot\omega=0$, where $k=(k_1, ..., k_n)$ is a vector with integer components. If this equation has a non-zero solution $k$, we say the frequency $\omega$ is resonant. If the only solution is $k=0$, we say it is non-resonant. Non-resonant linear flow on $\mathbb{T}^n$ is also called quasi-periodic flow. See the following figure for examples on $\mathbb{T}^2$.
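The definition can be illustrated by a brute-force search over small integer vectors $k$; the truncation `kmax` and tolerance below are arbitrary illustrative choices (a finite search can of course only certify resonance, never non-resonance).

```python
from itertools import product

def find_resonance(omega, kmax=10, tol=1e-9):
    # search for a nonzero integer vector k with |k_i| <= kmax and k . omega = 0
    for k in product(range(-kmax, kmax + 1), repeat=len(omega)):
        if any(k) and abs(sum(ki * wi for ki, wi in zip(k, omega))) < tol:
            return k
    return None

print(find_resonance([2.0, 3.0]))       # resonant: some k with 2*k1 + 3*k2 = 0
print(find_resonance([1.0, 2 ** 0.5]))  # None found: sqrt(2) is irrational
```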
The era of Poincaré and the aftermath
This part of the story starts with Karl Weierstrass in the early 1880s, during his professorship at the University of Berlin. As the world's leading figure in mathematics, Weierstrass took a keen interest in the n-body problem. He calculated formal series solutions (akin to the so-called Lindstedt series) for the n-body problem, suspected that they were convergent, but had difficulty establishing the convergence rigorously. Meanwhile, in Stockholm, Weierstrass' protégé Gösta Mittag-Leffler had risen to a high station in Swedish academic circles, high enough that the scientifically enlightened Oscar II, King of Sweden and Norway, asked him to suggest questions for a scientific prize competition that would honor the king's upcoming 60th birthday in 1889. Mittag-Leffler in turn consulted Weierstrass and Sofia Kovalevskaya (a former student of Weierstrass), who suggested the convergence problem. One of the three prize questions was therefore on the convergence of these series in the three-body problem, or, as it was dramatically phrased in the competition announcement, "on the stability of our planetary system".
The prize was, of course, later awarded to the young French mathematician Henri Poincaré for his magnificent 270-page memoir published in Acta Mathematica (in fact, Poincaré's first version of the paper contained a fundamental error, discovered by L.E. Phragmén, and he had to work over the next year to correct and recreate the "great manuscript" we know today). Poincaré is often called the "last universalist of mathematics" and the "father of dynamical systems", and he stands second only to Newton in revolutionizing the science of mechanics. In order to understand what Poincaré did (that changed the course of astronomical dynamics forever), let us first introduce Hamiltonian perturbation theory (HPT), whose central question Poincaré called "the fundamental problem of dynamics".
Hamiltonian perturbation theory (HPT): we consider a smooth (or even analytic) completely integrable n-degree-of-freedom Hamiltonian system with Hamiltonian $h=h(I)$ in the action-angle variables $(\theta,I)$, valid in a nice region $U$ of phase space. To this integrable system, we add a smooth (or analytic) function of both actions and angles having the form $\epsilon f(\theta, I, \epsilon)$, called a perturbation of $h$. This gives a so-called perturbed Hamiltonian system (or nearly integrable Hamiltonian system) $H(\theta, I, \epsilon) = h(I) + \epsilon f(\theta, I, \epsilon)$.
This idea of starting from a known entity (here $h = h(I)$), then "perturbing" it slightly as a way of exploring the unknown, was introduced by Newton in the Principia and soon became the standard means of studying the n-body problem for $n \geq 3$. In this case, a two-body problem is the known entity, and smaller additional bodies comprise the perturbation. It was by varying this approach that many of the fundamental advances in planetary motion were made in the 18th and 19th centuries.
In HPT, one is interested in whether the perturbed system $H(\theta, I, \epsilon)$ is integrable or not. The basic method of verifying integrability, due to C.G.J. Jacobi, is to look for a smooth canonical change of coordinates $(\theta,I) \rightarrow (\phi, J)$ such that in the new action-angle variables $(\phi, J)$ the Hamiltonian $H$ depends only on $J$. Such a coordinate change can be conveniently formulated through the Lie series method (which in fact appeared after Poincaré's time).
Lie series method: We look for a generating function $\chi(\phi, J)$, which defines the Lie operator $L_{\chi}(H) \equiv \{ \chi, H \}$ and a near-identity change of variables $e^{\epsilon L_{\chi}}=\mathrm{id}+\epsilon L_{\chi} + O(\epsilon^2)$. Consider the symplectic change of variables $T: D' \rightarrow D$ defined as the time-1 flow of the Hamiltonian $\epsilon \chi$, with $(\phi, J) \in D'$ and $(\theta, I) \in D$. Then the new Hamiltonian is $K(\phi, J) = e^{\epsilon L_{\chi}} H = H \circ T$.
Formal proof: Let $\Psi^t (\phi, J)$ be the time-$t$ flow map of $\epsilon \chi(\phi, J)$. By Taylor expansion, $H\circ \Psi^1 = H\circ \Psi^0 + \frac{d}{dt} (H\circ \Psi^t) |_{t=0} + \frac{1}{2}\frac{d^2}{dt^2} (H\circ \Psi^t) |_{t=0} + ...$, with $\frac{d}{dt} (H\circ \Psi^t) = \nabla H(\Psi^t(\phi, J)) \cdot \frac{d}{dt} \Psi^t(\phi, J) = \nabla H(\Psi^t(\phi, J)) \cdot \epsilon (\frac{\partial \chi}{\partial J}, -\frac{\partial \chi}{\partial \phi}) |_{\Psi^t} = \epsilon \{ \chi, H \}\circ\Psi^t (\phi, J)$. Applying this identity repeatedly to the Taylor series gives $H \circ T(\phi, J) = H\circ \Psi^1(\phi, J)= e^{\epsilon L_{\chi}} H(\phi, J) = K(\phi, J)$.
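The first-order statement $K = H + \epsilon\{\chi,H\} + O(\epsilon^2)$ can be checked numerically. In the sketch below, both the Hamiltonian $H$ (a pendulum) and the generating function $\chi$ are arbitrary illustrative choices of mine; the time-1 flow of $\epsilon\chi$ is approximated by RK4, and halving $\epsilon$ should roughly quarter the residual, consistent with the $O(\epsilon^2)$ error.

```python
import math

def chi_flow(theta, I, eps, steps=200):
    # RK4 time-1 flow of the Hamiltonian eps*chi with chi(theta, I) = I*sin(theta):
    # dtheta/dt = eps*dchi/dI = eps*sin(theta), dI/dt = -eps*dchi/dtheta = -eps*I*cos(theta)
    h = 1.0 / steps
    def rhs(th, i):
        return eps * math.sin(th), -eps * i * math.cos(th)
    for _ in range(steps):
        k1 = rhs(theta, I)
        k2 = rhs(theta + h / 2 * k1[0], I + h / 2 * k1[1])
        k3 = rhs(theta + h / 2 * k2[0], I + h / 2 * k2[1])
        k4 = rhs(theta + h * k3[0], I + h * k3[1])
        theta += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        I += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return theta, I

H = lambda th, i: 0.5 * i * i + math.cos(th)  # pendulum Hamiltonian
# {chi, H} with the sign convention above, so that d/dt (H o Psi^t) = eps*{chi,H} o Psi^t
PB = lambda th, i: math.sin(th) * (-math.sin(th)) - i * math.cos(th) * i

th0, I0 = 0.7, 1.3  # arbitrary starting point

def residual(eps):
    # | H o Psi^1  -  (H + eps*{chi,H}) |, which should be O(eps^2)
    th1, I1 = chi_flow(th0, I0, eps)
    return abs(H(th1, I1) - (H(th0, I0) + eps * PB(th0, I0)))

r1, r2 = residual(1e-2), residual(5e-3)
print(r1 / r2)  # close to 4: halving eps quarters the residual
```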
Let us first describe ergodicity, which can be viewed as a slightly stronger form of chaos (see the Appendix for the mathematical formulation of the ergodic theorem, which appeared in the 1930s after the development of measure theory). An ergodic system has trajectories that do not confine themselves to separate portions of phase space; instead, trajectories pass through almost all points of phase space that are energetically accessible. After the introduction of the ergodicity concept by Ludwig Boltzmann, the keen interest in ergodic systems merged with Poincaré's proof of the nonintegrability of generic Hamiltonian systems. Many of the best researchers of the time enthusiastically followed Boltzmann's idea and thought, or hoped, that nonintegrability entails ergodicity (which was later shown not to be true after the birth of KAM). Such researchers include Enrico Fermi and George Birkhoff (Poincaré's foremost disciple), although Fermi was later "disappointed" by the FPUT numerical experiment, which surprisingly showed the system's resistance to thermalization (i.e., to ergodicity).
Theoreticians' over-indulgence in chaos and ergodicity led to a long-standing paradox in mechanics. While theoreticians believed that as soon as $\epsilon$ is turned on in HPT, the integrability of the system is destroyed, those who worked with solutions of such systems in practice saw things differently. In the experience of the latter, for small values of $\epsilon$, most of their approximate solutions (in particular, from series computed by perturbation methods) of the perturbed system $H = h + \epsilon f$ stayed fairly close to solutions of the integrable system $h$ for long intervals of time, and both the closeness and the time interval could be controlled by varying $\epsilon$. In other words, experience taught them that there was some kind of continuity to integrable behavior, despite the immediate breakdown claimed by theoreticians.
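The practitioners' experience is easy to reproduce numerically. As a minimal sketch (the perturbed rotator $H = I^2/2 + \epsilon\cos\theta$ is my illustrative choice, not an example from the book), the maximum drift of the action away from its initial value stays of order $\epsilon$ over a long time interval, shrinking as the perturbation is reduced:

```python
import math

def max_action_drift(I0, eps, T=200.0, dt=0.01):
    # perturbed rotator H = I^2/2 + eps*cos(theta):
    # dtheta/dt = I, dI/dt = eps*sin(theta); RK4, tracking max |I(t) - I0|
    th, I, drift = 0.0, I0, 0.0
    def rhs(th, i):
        return i, eps * math.sin(th)
    for _ in range(int(T / dt)):
        k1 = rhs(th, I)
        k2 = rhs(th + dt / 2 * k1[0], I + dt / 2 * k1[1])
        k3 = rhs(th + dt / 2 * k2[0], I + dt / 2 * k2[1])
        k4 = rhs(th + dt * k3[0], I + dt * k3[1])
        th += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        I += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        drift = max(drift, abs(I - I0))
    return drift

d1 = max_action_drift(1.0, 0.01)
d2 = max_action_drift(1.0, 0.001)
print(d1, d2)  # drift shrinks roughly linearly with eps
```

In this one-degree-of-freedom example the control actually follows from energy conservation, but it captures the qualitative observation that motivated the paradox.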
Poincaré had partly explained this paradox by showing that most of the series used in celestial mechanics were divergent asymptotic expansions (although they remain useful approximations when only a finite number of terms is retained). For the Lindstedt series that were the subject of King Oscar's prize, after showing that the series are plagued by small divisor problems and therefore cannot be everywhere uniformly convergent, Poincaré considered the question of whether the series might converge for special initial conditions chosen in advance. He asserted his belief twice in his manuscript that, even in these special cases, the Lindstedt series diverge. But in each case, he qualified the statement by saying that his arguments did not allow him to assert this divergence with absolute rigor. It was the birth of KAM that finally resolved the paradox, as a result of which mathematicians came to understand that the Lindstedt series is indeed convergent for special initial conditions. Everyone schooled in the tradition of Newton and Poincaré was about to be re-educated in a dramatic way.
The birth of KAM
The first essential breakthrough on the path to KAM theory was the idea for overcoming the small divisor problem, first developed by the German number theorist Carl Ludwig Siegel. At that time, Siegel was in fact not working on the HPT problem, so we shall directly describe how Andrey Kolmogorov adapted Siegel's techniques for the purposes of KAM theory (although there is no direct evidence that Kolmogorov knew about Siegel's earlier work). Let us now return to the Lie series construction above and recall that, following Poincaré, the presence of a dense set of resonant tori means that small divisors prevent the generating function $\chi$, and hence also $T$ or $e^{\epsilon L_{\chi}}$, from being defined classically. This is still true, but Kolmogorov took a different approach. The basic idea is not to insist on a classical transformation defined on a nice set (one with interior), but simply to look for some non-empty set on which the transformation can be defined.
Let us be more specific. Consider a special set of frequency vectors (Diophantine vectors, defined by the Diophantine condition): $D^{\omega}(\gamma, \tau) = \{ \omega\in\mathbb{R}^n \,|\, \text{for each nonzero } k\in\mathbb{Z}^n, |k\cdot\omega|\geq \frac{\gamma}{|k|^\tau} \}$, and the corresponding set of $J$-values $D^J(\gamma, \tau) = \{ J\in\mathbb{R}^n \,|\, \text{for each nonzero } k\in\mathbb{Z}^n, |k\cdot\omega(J)|\geq \frac{\gamma}{|k|^\tau} \}$. These two sets are nonempty provided $\gamma>0$ is small enough, $\tau > n-1$, and the unperturbed Hamiltonian $h=h(J)$ is nondegenerate (so that the frequency map $\omega(J)=\partial h/\partial J$ preserves the basic structure of the $\omega$'s when mapped back to the $J$'s). The frequencies in $D^{\omega}(\gamma, \tau)$ form a strange sort of set closely related to a Cantor set, a type of fractal with no interior (see the figure below for the most common construction of the Cantor ternary set and an example of $D^{\omega}(\gamma, \tau)$ with $n=2$). And because the frequency map is nondegenerate, the preimage $D^J(\gamma, \tau)$ in $J$-space is also a Cantor-like set. The set $D^J(\gamma, \tau)$ has no interior, but for small $\gamma$ it is still large in the sense of Lebesgue measure, and it can be shown to generate uncountably many invariant tori for the perturbed system.
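The Diophantine condition is concrete enough to explore numerically for $n=2$, writing $\omega = (1, \alpha)$. The sketch below truncates the condition to $|k|_\infty \le$ `kmax` and uses illustrative values of $\gamma$ and $\tau$ (all my choices). It shows the two key features at once: the condition fails near every rational $\alpha$, yet a large fraction of $\alpha$-values survives, in line with the "no interior, but large Lebesgue measure" picture.

```python
from itertools import product

def is_diophantine(alpha, gamma=0.03, tau=1.5, kmax=20):
    # truncated Diophantine test for omega = (1, alpha):
    # require |k . omega| >= gamma/|k|^tau for all nonzero k with |k|_inf <= kmax
    for k1, k2 in product(range(-kmax, kmax + 1), repeat=2):
        if (k1, k2) == (0, 0):
            continue
        norm = max(abs(k1), abs(k2))
        if abs(k1 + k2 * alpha) < gamma / norm ** tau:
            return False
    return True

# fraction of alpha in [0,1] passing the truncated test (sampled at midpoints)
N = 500
frac = sum(is_diophantine((i + 0.5) / N) for i in range(N)) / N
print(frac)  # most of [0,1] survives, despite a dense excluded set
```

Shrinking $\gamma$ pushes the surviving fraction toward 1, which mirrors the measure estimate quoted in the text.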
Other results in HPT
Now let us return to HPT and mention some other relevant concepts/results. In a paper published in 1999, P. Lochak divides HPT into three basic parts: results pertaining to geometric stability, classical stability, and instability.
Geometric stability refers to the invariant sets in phase space (such as invariant tori or invariant manifolds generally) to which orbits are confined for all time, thus providing stability over infinite time intervals. KAM tori belong to the class (but are not the only objects) of geometric stability results in HPT. Classical stability is the sort of stability arising out of classical Hamiltonian perturbation theory. In other words, when faced with the small divisor problem, one way to proceed is to "go non-classical", ultimately leading to the set of KAM tori, interpreted as transformation to integrable form on a Cantor-like set. But, on the other hand, one can instead "go classical", requiring transformations to be defined on sets with interior. This leads to stability over finite — but very long — time intervals. There’s a trade-off with respect to geometric stability: one loses the infinite time intervals, but gains a nice, classical set of initial conditions for which results hold (see Nekhoroshev theorem described below for more details). Though perhaps less esthetically satisfying than KAM theory, under the right conditions (long enough time intervals, big enough thresholds, etc.) this is probably the kind of stability result best suited to physical applications. Instability in HPT is basically everything else. In systems far from integrable, solutions are not subject to either kind of stability mentioned above, and may behave chaotically even over relatively short time intervals. For near-integrable systems, solutions may be subject to classical stability over long time intervals, but display a weak drifting behavior beyond these intervals.
We now discuss the Nekhoroshev theorem (first laid out in the 1970s by Arnold's student N.N. Nekhoroshev), which is often referred to as the "physicists' KAM theory". This is so partly because the threshold $\epsilon_0$ in the Nekhoroshev theorem can be much greater than the one in KAM, making it more suitable for applications to physical problems (the $\epsilon_0$ in KAM is often absurdly small for any practical situation). Theorems of this sort show adiabatic invariance of the action variables for analytic nearly integrable systems satisfying certain steepness or "convexity" conditions, which replace the nondegeneracy conditions of KAM theory. Adiabatic invariance means that the actions $I = I(t)$ stay near their initial values $I(0)$ over some (long) time interval. We state a prototype Nekhoroshev theorem below.
Prototype Nekhoroshev theorem: Suppose that the nearly integrable Hamiltonian $H(\theta, I, \epsilon) = h(I) + \epsilon f(\theta, I, \epsilon)$ is analytic and that the unperturbed part $h = h(I)$ is steep (or convex, or quasiconvex) on some nice domain. Then there is a threshold $\epsilon_0 >0$ and positive constants $R$, $T$, $a$, and $b$ such that whenever $|\epsilon|<\epsilon_0$, for all initial actions $I(0)$ in the domain (and far enough from its boundary) we have $\|I(t)-I(0)\| \leq R \epsilon^b$ for times $t\leq T\exp(\epsilon^{-a})$.
Remark 1: To say that $h = h(I)$ is convex on a domain means that its Hessian matrix $\partial^2 h/\partial I^2$ is (uniformly positive, or uniformly negative) definite on the $I$-domain; and $h$ is "quasiconvex" if it is convex on every fixed energy level $h(I) = E$ (over a range of appropriate energies $E_1 < E < E_2$).
Remark 2: Note that the conclusion $\|I(t)-I(0)\| \leq R\epsilon^b$ is weaker than (but still closely analogous to) the conclusion that trajectories lie on invariant tori, and that it is further weakened by holding for finite rather than infinite times. But there is an important compensating gain: the conclusion holds for all initial conditions in the domain, rather than only for those in the Cantor family $\mathcal{T}$ of the KAM theorems. This is another fact that makes the Nekhoroshev theorem a (hypothetically) better choice in physical applications than the KAM theorem, especially when one realizes that exponential time intervals (of length $T\exp(\epsilon^{-a})$) may be "practically as long as infinity" (e.g., as long as the age of the universe), depending on the parameters involved.
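To get a feel for how fast these exponential times grow, here is a two-line computation with hypothetical constants $T=1$ and $a=1/2$ (purely illustrative; the true constants depend on the system), in units where $t=1$ is, say, one orbital period:

```python
import math

# hypothetical Nekhoroshev constants T = 1, a = 0.5 (illustration only)
for eps in (1e-1, 1e-2, 1e-3):
    # stability time T*exp(eps^-a) explodes as the perturbation shrinks
    print(eps, math.exp(eps ** -0.5))
```

Already at $\epsilon = 10^{-3}$ the guaranteed time exceeds $10^{13}$ periods, which is the sense in which "exponentially long" can be practically as good as forever.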
Finally, we mention two types of instability, named after Chirikov and Arnold. Chirikov diffusion is the type of instability found in the Chirikov regime, which comprises systems whose perturbations are larger than the Nekhoroshev threshold. In this regime the KAM tori are sparse or nonexistent, and chaotic behavior of various kinds predominates in phase space. Arnold diffusion is the type of instability found in the Nekhoroshev regime, in which systems are close to integrable. In this regime, KAM tori fill a large part of phase space, and any large-scale instability must manifest itself over very long times, beyond the exponentially long timescales on which Nekhoroshev theory guarantees stability of the actions.
Is the solar system stable?
Equipped with all the above knowledge, let us come back to our original problem, rooted in Newton's work: Is our solar system stable or not? To make things clear from the start, we have to point out that, unfortunately, KAM theory says very little about our own physical solar system. Although the masses of the planets appear very small, with the Sun at mass 1 (in solar-mass units), Jupiter at 0.001, Saturn at 0.0003, Neptune at 0.00005, and so on, by the standards of rigorous KAM theory they are "astronomical"! The study of the stability and chaotic behavior of the actual solar system had to wait for the emergence of modern computers. However, KAM theory has a great deal to say about small subsystems and model n-body problems, which we review first.
We consider the scaled $1+n$-body problem, where a small parameter $\epsilon>0$ roughly represents the ratio of the masses of the planets ($i\geq 1$) to the mass of the sun ($i=0$). The simplest case in this class of problems is the "restricted, planar, circular three-body problem", or RPC3BP for short. Specifically, the word "restricted" means that the third body is a kind of ghost, or infinitesimal body with no mass. In other words, it is assumed to be so small that it exerts no force whatsoever on the first two bodies; it exists only as a "sensor" to test the motion of a body of negligible mass under the gravitational influence of the two bodies with non-negligible mass. (One often imagines an asteroid moving between the Sun and Jupiter, or a high-orbit satellite between the Earth and Moon.) Next, the word "planar" simply means that all three bodies remain in a fixed plane in space. Finally, "circular" means that the integrable motion of the two massive bodies is circular. The RPC3BP can be summarized in this way: given two bodies in circular orbit around their barycenter, what is the motion of a massless test particle restricted to the orbital plane?
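The RPC3BP is also pleasant to simulate. Below is a minimal sketch of the standard equations of motion in the co-rotating frame (with the two primaries fixed on the $x$-axis), integrated with RK4; the run checks conservation of the Jacobi constant, the one known first integral of this system. The mass ratio, initial condition, and step size are my illustrative choices.

```python
import math

MU = 0.001  # mass ratio of the primaries (roughly Jupiter/Sun; illustrative)

def accel(x, y, vx, vy):
    # planar CR3BP in the rotating frame; primaries at (-MU, 0) and (1-MU, 0)
    r1 = math.hypot(x + MU, y)
    r2 = math.hypot(x - 1 + MU, y)
    ax = x + 2 * vy - (1 - MU) * (x + MU) / r1 ** 3 - MU * (x - 1 + MU) / r2 ** 3
    ay = y - 2 * vx - (1 - MU) * y / r1 ** 3 - MU * y / r2 ** 3
    return ax, ay

def jacobi(x, y, vx, vy):
    # Jacobi constant C = x^2 + y^2 + 2(1-mu)/r1 + 2mu/r2 - v^2
    r1 = math.hypot(x + MU, y)
    r2 = math.hypot(x - 1 + MU, y)
    return x * x + y * y + 2 * (1 - MU) / r1 + 2 * MU / r2 - vx * vx - vy * vy

def step(s, dt):
    # one RK4 step for the state s = (x, y, vx, vy)
    def f(s):
        ax, ay = accel(*s)
        return (s[2], s[3], ax, ay)
    k1 = f(s)
    k2 = f(tuple(si + dt / 2 * ki for si, ki in zip(s, k1)))
    k3 = f(tuple(si + dt / 2 * ki for si, ki in zip(s, k2)))
    k4 = f(tuple(si + dt * ki for si, ki in zip(s, k3)))
    return tuple(si + dt / 6 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))

s = (0.5, 0.0, 0.0, 0.8)   # massless "sensor" between the two primaries
C0 = jacobi(*s)
for _ in range(20000):
    s = step(s, 0.001)
print(abs(jacobi(*s) - C0))  # tiny: the Jacobi constant is conserved
```

A single first integral is of course far from integrability; it is precisely in this setting that KAM tori, rather than explicit solutions, provide the stability statements quoted next.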
KAM theory can be applied directly to the RPC3BP; the result roughly states that when the small parameter $\epsilon$ (the mass ratio of the two primaries) is below a threshold, there exists a set of analytic KAM tori of positive Lebesgue measure. Furthermore, there is also a KAM theorem for the $1+n$-body problem, proved by J. Féjoz in 2004.
Appendix 1: Rational and irrational numbers
The rational numbers are dense in $\mathbb{R}$ because (intuitively) every irrational number can be obtained as a limit point of rational numbers. To see that the rational numbers have measure zero (so that the irrational numbers have full measure), we use the fact that the set of rationals $S=\{ s_1, s_2, s_3, ... \}$ is countable (i.e., it can be put in one-to-one correspondence with $\mathbb{N}=\{ 1,2,3,... \}$). For a set to have measure zero, we need to show that, given any $\epsilon >0$, it can be covered by sets whose total measure (length, area, or volume; here, length) is less than $\epsilon$. We pick an interval around $s_1$ that covers $( s_1-\epsilon/8,\, s_1+\epsilon/8 )$, an interval around $s_2$ of the form $( s_2-\epsilon/16,\, s_2+\epsilon/16 )$, etc. After this is done for the countably many rational numbers, we have an infinite collection of intervals whose lengths are $\epsilon/4, \epsilon/8, \epsilon/16, ...$, and whose sum is $\epsilon/2$, which is less than $\epsilon$. This completes the proof.
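The geometric series at the end of this covering argument can be verified in a couple of lines:

```python
# interval around the k-th rational has length eps/2**(k+1), k = 1, 2, 3, ...
# i.e. eps/4, eps/8, eps/16, ...; the total length sums to eps/2 < eps
eps = 1.0
total = sum(eps / 2 ** k for k in range(2, 60))  # truncated geometric series
print(total)  # -> eps/2 (up to a 2^-59 truncation error)
```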
Appendix 2: Ergodic theorem
It took a long time to develop the first mathematically precise results along the lines of the ergodic hypothesis—six decades, if one counts from Boltzmann’s first mention of it in 1871 to the first rigorous ergodic theorems announced in 1931 (proved by J. von Neumann and G.D. Birkhoff). The key mathematical tool which gave proper expression to the ergodic hypothesis was the Lebesgue-Borel theory of measure and integration. The new theory of integration allowed the evolution law (flow map) $\phi_t: M\rightarrow M$ to be abstracted as a measure-preserving flow on a probability space $(M, \Sigma, \mu)$. Or, to explain in more detail, if we consider that $\phi_t$ is a Hamiltonian flow, we know from Liouville’s theorem that the canonical volume element (Liouville measure) on $M$ is preserved by $\phi_t$. We can then use the volume element to define a measure $\rho$ on a natural class $\Sigma$ of measurable subsets of $M$, so $(M, \Sigma, \rho)$ becomes a measure space, with measure $\rho$ invariant with respect to $\phi_t$. Next, assuming the energy surface $M$ has finite measure $\rho(M)$, we can normalize $\rho$ so that it becomes the invariant probability measure $\mu=\rho/\rho(M)$ (in other words, $\mu(M)=1$), and we have the probability space $(M, \Sigma, \mu)$. Finally, we can now give a more precise meaning to the "space average": For any $\mu$-Lebesgue-integrable function $f : M \rightarrow R$, we define its space average $\bar{f}$ by $\bar{f}:=\int_M f(x) d\mu(x)$.
Birkhoff's ergodic theorem: Consider the probability space $(M, \Sigma, \mu)$ described above, with $\mu$ invariant with respect to $\phi_t$. Then given any observable $f$ that is Lebesgue-integrable on $M$ with respect to $\mu$, the time average $\langle f \rangle (x) := \lim_{T\rightarrow \infty} \frac{1}{T}\int_0^T f(\phi_t(x))\, dt$ exists for almost every $x$ in $M$ (and $\langle f \rangle$ is Lebesgue-integrable on $M$). Furthermore, the space average of $\langle f \rangle$ equals the space average of $f$.
Note that this theorem does not have an equality like "space average equals time average" as its conclusion; instead, this follows as a corollary for a certain class of flows $\phi_t$ called ergodic, which are defined as follows.
Ergodic flow: If $(M, \Sigma, \mu)$ is a probability space, the measure-preserving flow $\phi_t$ : $M \rightarrow M$ is ergodic with respect to $\mu$ provided every measurable subset $A$ of $M$ which is invariant under $\phi_t$ has either $\mu(A)=0$ or $\mu(A) = 1$.
This very economical definition says that any measurable subset $B$ of $M$ with $0<\mu(B)<1$ must be "moved around all over the place" by $\phi_t$. Any exceptional sets must either have measure zero (and are thus negligible), or full measure 1 (thus essentially all of $M$, having "nowhere to move"). It is this property of "moving sets around all over the place" that takes the place of Boltzmann’s trajectories that pass through every point of $M$. This subtle shift is key, and allows one to prove the following corollary to Birkhoff’s ergodic theorem.
Corollary: Suppose $(M, \Sigma, \mu)$ is a probability space and the measure-preserving flow $\phi_t: M \rightarrow M$ is ergodic with respect to $\mu$. Then given any Lebesgue-integrable function $f : M \rightarrow \mathbb{R}$, for almost all $x\in M$ the time average exists and equals the space average, i.e., $\langle f \rangle (x) = \int_M f(y)\, d\mu(y)$.
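A standard concrete example of an ergodic flow is the irrational rotation of the circle, $\phi_t(x) = x + \alpha t \pmod 1$, with Lebesgue measure. The sketch below (the observable and all parameters are illustrative choices of mine) checks the corollary numerically: the time average along one orbit converges to the space average $\int_0^1 \sin^2(2\pi x)\,dx = 1/2$.

```python
import math

# irrational rotation of the circle: phi_t(x) = x + alpha*t (mod 1),
# ergodic with respect to Lebesgue measure when alpha is irrational
alpha = math.sqrt(2)
f = lambda x: math.sin(2 * math.pi * x) ** 2  # observable; space average = 1/2

def time_average(x0, T, dt=0.01):
    # Riemann-sum approximation of (1/T) * integral_0^T f(phi_t(x0)) dt
    n = int(T / dt)
    return sum(f((x0 + alpha * k * dt) % 1.0) for k in range(n)) / n

avg = time_average(0.3, 500.0)
print(avg)  # -> close to the space average 0.5, for (almost) any x0
```

Replacing `alpha` by a rational number breaks ergodicity: the orbit is periodic, and the time average then depends on the starting point $x_0$.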
This finally gave precise mathematical expression to Boltzmann's intuition from six decades earlier, and launched the mathematical discipline of ergodic theory, which continues as a large and vigorous branch of dynamical systems to the present day. In addition, the physicist's/engineer's definition of ergodicity, in terms of "spatial/ensemble average equals time average", also originated here.