Entropy and Dimensions (Following Landau and Lifshitz)

Some time ago I wrote about volumes of spheres in multi-dimensional phase space – as needed in integrals in statistical mechanics.

The post was primarily about the curious fact that the ‘bulk of the volume’ of such spheres is contained in a thin shell beneath their hyperspherical surfaces. The trick to calculate something reasonable is to spot expressions you can Taylor-expand in the exponent.

Large numbers ‘do not get much bigger’ if multiplied by a factor – as can be demonstrated again by Taylor-expanding such a large number in the exponent; I used this example:

Assuming N is about 10^{25}, its natural logarithm is about 58, and Ne^N = e^{\ln(N)+N} = e^{58+10^{25}} – so the 58 can be neglected compared to N itself.

However, in the real world numbers associated with locations and momenta of particles come with units. Calling the unit ‘length’ in phase space R_0, the large volume can be written as aN{(\frac{r}{R_0})}^N = ae^{\ln{(N)} + N\ln{(\frac{r}{R_0})}} , and the impact of an additional factor N also depends on the unit length chosen.
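To make that ‘large numbers in the exponent’ argument tangible, here is a minimal numerical sketch – working in the log domain, since e^{10^{25}} itself cannot be represented as a floating-point number; the ratio r/R_0 is just a made-up example value:

```python
import math

N = 1e25            # number of particles (order of magnitude)
r_over_R0 = 2.0     # hypothetical ratio of radius to unit length

# exponent of a*e^{ln(N) + N*ln(r/R0)}, split into its two terms
term_ln_N = math.log(N)                  # ~ 58
term_power = N * math.log(r_over_R0)     # ~ 7e24

print(f"ln(N)       = {term_ln_N:.1f}")
print(f"N*ln(r/R0)  = {term_power:.3e}")
print(f"ratio       = {term_ln_N / term_power:.1e}")
# The additional factor N shifts the exponent by ~58 out of ~7e24 - negligible.
```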

I did not yet mention the related issues with the definition of entropy. In this post I will follow the way Landau and Lifshitz introduce entropy in Statistical Physics, Volume 5 of their Course of Theoretical Physics.

Landau and Lifshitz introduce statistical mechanics top-down, starting from fundamental principles and from Hamiltonian classical mechanics: no applications, no definitions of ‘heat’ and ‘work’, nor historical references needed for motivation. Classical phenomenological thermodynamics is only introduced after they are done with the statistical foundations. Both entropy and temperature are defined – these are useful fundamental properties spotted in the mathematical derivations and thus deserve special names. They cover both classical and quantum statistics in a small number of pages – LL’s style has been called terse or elegant.

The behaviour of a system with a large number of particles is encoded in a probability distribution function in phase space, a density. In the classical case this is a continuous function of phase-space co-ordinates. In the quantum case you consider distinct states – whose energy levels are densely packed together though. Moving from classical to quantum statistics means counting those states rather than integrating the smooth density function over a volume. There are equivalent states created by permutations of identical particles – but factoring that in is postponed and not required for a first definition of entropy. A quasi-classical description is sufficient: using a minimum cell in phase space, whose dimensions are defined by Planck’s constant h, which has a dimension of action – length times momentum.

Entropy as statistical weight

Entropy S is defined as the logarithm of the statistical weight \Delta \Gamma – the number of quantum states associated with the part of phase space used by the (sub)-system. (Landau and Lifshitz use the concept of a – still large – subsystem embedded in a larger volume most consistently, in order to avoid reliance on the ergodic hypothesis, as mentioned in the preface). In the quasi-classical view the statistical weight is the volume in phase space occupied by the system divided by the size of the minimum unit cell defined by Planck’s constant h. Denoting momenta by p, positions by q, and using \Delta p and \Delta q as a shortcut for multiple dimensions, equivalent to s degrees of freedom…

S = \log \Delta \Gamma = \log \frac {\Delta p \Delta q}{(2 \pi \hbar)^s}

An example from solid state physics: if the system is considered a rectangular box in the physical world, possible quantum states related to vibrations can be visualized in terms of possible standing waves that ‘fit’ into the box. The statistical weight would then single out the bunch of states the system actually ‘has’ / ‘uses’ / ‘occupies’ in the long run.

Different sorts of statistical functions are introduced, and one reason for writing this article is to emphasize the difference between them: The density function associates each point in phase space – each possible configuration of a system characterized by the momenta and locations of all particles – with a probability. These points are also called microstates. Taking into account the probabilities to find a system in any of these microstates gives you the so-called macrostate characterized by the statistical weight: How large or small a part of phase space the system will use when watched for a long time.

The canonical example is an ideal gas in a vessel: The most probable spatial distribution of particles is to find them spread out evenly, the most unlikely configuration is to have them concentrated in (nearly) the same location, like one corner of the box. The density function assigns probabilities to these configurations. As the even distribution is so much more likely, the \Delta q part of the statistical weight would cover all of the physical volume available. The statistical weight function has to attain a maximum value in the most likely case, in equilibrium.
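As a quick back-of-the-envelope illustration of how lopsided these probabilities are, here is a minimal sketch – the particle number is deliberately tiny, since the real 10^{25} only makes the effect more extreme:

```python
import math

N = 100   # a tiny toy 'gas' - real systems have ~1e25 particles

# Probability that all N non-interacting particles happen to be
# in the left half of the box at the same time: (1/2)^N
log10_p = N * math.log10(0.5)
print(f"P(all {N} particles in one half) ~ 10^{log10_p:.0f}")

# For N = 1e25 the exponent would be of the order of -3e24:
print(f"exponent for N = 1e25: {1e25 * math.log10(0.5):.2e}")
```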

The significance of energies – and why there are logarithms everywhere.

Different sufficiently large subsystems of one big system are statistically independent – as their properties are defined by their bulk volume rather than their surfaces interfacing with other subsystems – and the larger the volume, the larger the ratio of volume to surface. Thus the probability density function for the combined system – as a function of momenta and locations of all particles in the total phase space – has to be equal to the product of the densities for each subsystem. Denoting the classical density with \rho and adding a subscript for the set of momenta and positions referring to a subsystem:

\rho(q,p) = \rho_1(q_1,p_1) \rho_2(q_2,p_2)

(Since these are probability densities, the actual probability is always obtained by multiplying with the differential(s) dqdp).

This means that the logarithm of the composite density is equal to the sum of the logarithms of the individual densities. This is the root cause of having logarithms show up everywhere in statistical mechanics.

A mechanical system of particles is characterized by only 7 ‘meaningful’ additive integrals: Energy, momentum and angular momentum – they add up when you combine systems, in contrast to all the other millions of integration constants that would appear when solving the equations of motion exactly. Momentum and angular momentum are not that interesting thermodynamically, as one can change to a frame moving and rotating with the system (LL also cover rotating systems). So energy remains as the integral of outstanding importance.

From counting states to energy intervals

What we want is to relate entropy to energy, so assertions about numbers of states covered need to be translated to statements about energy and energy ranges.

LL denote the probability to find a system in (micro-)state n with energy E_n as w_n – the quantum equivalent of density \rho . The logarithm of w_n has to be a linear function of the energy E_n of this micro-state, as per the additivity just mentioned above; so w_n depends on the state n only via its energy, and LL omit the subscript n for w:

w_n = w(E_n)

(They omit any symbol they possibly can to keep their notation succinct ;-))

A thermodynamic system has an enormous number of (mechanical) degrees of freedom. Fluctuations are small as per the law of large numbers in statistics, and the probability to find a system with a certain energy can be approximated by a sharp, delta-function-like peak at the system’s energy E. So in thermal equilibrium its energy has a very sharp peak. It occupies a very thin ‘layer’ of thickness \Delta E in phase space – around the hyperplane that characterizes its average energy E.
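A quick numerical illustration of that sharpening – a sketch only, assuming (just for concreteness) independent, exponentially distributed single-particle energies:

```python
import numpy as np

rng = np.random.default_rng(42)
n_samples = 1_000

for N in (10, 1_000, 100_000):
    # total energy of N particles, sampled n_samples times
    E_total = np.array([rng.exponential(scale=1.0, size=N).sum()
                        for _ in range(n_samples)])
    rel_width = E_total.std() / E_total.mean()
    print(f"N = {N:>7}: relative width of E ~ {rel_width:.4f} "
          f"(expected ~ 1/sqrt(N) = {N**-0.5:.4f})")
```

The relative width of the total-energy distribution shrinks like 1/\sqrt{N} – extrapolated to 10^{25} particles the peak is delta-function-like for all practical purposes.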

Statistical weight \Delta \Gamma can be considered the width of the related function: Energy-wise broadening of the macroscopic state \Delta E needs to be translated to a broadening related to the number of quantum states.

We change variables, so the connection between Γ and E is made via the derivative of Γ with respect to E. E is an integral, statistical property of the whole system, and the probability for the system to have energy E in equilibrium is W(E)dE . E is not discrete so this is again a  probability density. It is capital W now – in contrast to w_n which says something about the ‘population’ of each quantum state with energy E_n.

A quasi-continuous number of states per energy Γ is related to E by the differential:

d\Gamma = \frac{d\Gamma}{dE} dE.

As E peaks so sharply and the energy levels are packed so densely it is reasonable to use the function (small) w but calculate it for an argument value E. Capital W(E) is a probability density as a function of total energy, small w(E) is a function of discrete energies denoting states – so it has to be multiplied by the number of states in the range in question:

W(E)dE = w(E)d\Gamma

Thus…

W(E) = w(E)\frac{d\Gamma}{dE}.

The delta-function-like functions (of energy or states) have to be normalized, and the widths ΔΓ and ΔE multiplied by the respective heights W and w taken at the average energy E_\text{avg} have to be 1, respectively:

W(E_\text{avg}) \Delta E = 1
w(E_\text{avg}) \Delta \Gamma = 1

(… and the ‘average’ energy is what is simply called ‘the’ energy in classical thermodynamics).

So \Delta \Gamma is inversely proportional to the probability of the most likely state (of average energy). This can also be concluded from the quasi-classical definition: If you imagine a box full of particles, the least probable state is equivalent to all particles occupying a single cell in phase space. The probability for that is smaller than the probability to find the particles evenly distributed over the whole box by a factor of (size of the unit cell) over (size of the box) – and the inverse of that factor is exactly the definition of \Delta \Gamma.

The statistical weight is finally:

\Delta \Gamma =  \frac{d\Gamma(E_\text{avg})}{dE} \Delta E.

… the broadening in \Gamma , proportional to the broadening in E

The more familiar (?) definition of entropy

From that, you can recover another familiar definition of entropy, perhaps the more common one. Taking the logarithm…

S = \log (\Delta \Gamma) = -\log (w(E_\text{avg})).

As \log w is linear in E, taking it at the average energy is the same as averaging \log w itself over all states. Then the definition of ‘averaging over states n’ can be used: multiply the value for each state n by its probability w_n and sum up:

S = - \sum_{n} w_n \log w_n.

… which is the first statistical expression for entropy I had once learned.
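A minimal numerical sanity check that the two expressions agree – using a made-up toy distribution that is flat over ΔΓ states and zero elsewhere:

```python
import numpy as np

n_states_total = 10_000
delta_gamma = 250            # number of states the system actually 'uses'

# toy distribution: uniform over delta_gamma states, zero elsewhere
w = np.zeros(n_states_total)
w[:delta_gamma] = 1.0 / delta_gamma

nz = w > 0                   # skip zero entries (0*log(0) -> 0)
S_sum = -np.sum(w[nz] * np.log(w[nz]))   # -sum_n w_n log w_n
S_weight = np.log(delta_gamma)           # log of the statistical weight

print(S_sum, S_weight)       # both ~ 5.52
```

For a sharper, non-uniform distribution the two values differ slightly, but for macroscopic systems that difference is negligible compared to the huge value of log ΔΓ.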

LL do not introduce Boltzmann’s constant k here

It is effectively set to 1 – so entropy is defined without a reference to k. k is only mentioned in passing later: In case one wishes to measure energy and temperature in different units. But there is no need to do so, if you define entropy and temperature based on first principles.

Back to units

In a purely classical description based on the volume in phase space instead of the number of states there would be no cell of minimum size, and instead of the statistical weight we would simply have this volume. But then entropy would be calculated in a very awkward ‘unit’, the logarithm of an action. Every change of the unit for measuring volumes in phase space would result in an additive constant – the deeper reason why entropy in a classical context is only defined up to such a constant.

So the natural unit called R_0 above should actually be Planck’s constant taken to the power defined by the number of particles.

Temperature

The first task to be solved in statistical mechanics is to find a general way of formulating a proper density function small w_n as a function of energy E_n. You can either assume that the system has a clearly defined energy upfront – the system lives on an ‘energy hyperplane’ in phase space – or you can consider it immersed in a larger system later identified with a ‘heat bath’ which causes the system to reach thermal equilibrium. These two concepts are called the micro-canonical and the canonical distribution (or Gibbs distribution), and the actual distribution functions don’t differ much because the energy peaks so sharply also in the canonical case. It’s in that type of calculation that those hyperspheres are actually needed.

Temperature as a concept emerges from a closer look at these distributions, but LL introduce it upfront from simpler considerations: It is sufficient to know that 1) entropy only depends on energy, 2) both are additive functions of subsystems, and 3) entropy is a maximum in equilibrium. You divide one system into two subsystems. The total change in entropy has to be zero as this is a maximum (in equilibrium), and whatever energy dE_1 leaves one subsystem has to be received as dE_2 = -dE_1 by the other. Taking a look at the total entropy S as a function of the energy of one subsystem:

0 = \frac{dS}{dE_1} = \frac{dS_1}{dE_1} + \frac{dS_2}{dE_1}
= \frac{dS_1}{dE_1} + \frac{dS_2}{dE_2} \frac{dE_2}{dE_1}
= \frac{dS_1}{dE_1} - \frac{dS_2}{dE_2}

So \frac{dS_x}{dE_x} has to be the same for each subsystem x. Cutting one of the subsystems in two  you can use the same argument again. So there is one very interesting quantity that is the same for every subsystem – \frac{dS}{dE}. Let’s call it 1/T and let’s call T the temperature.
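Here is a minimal numerical sketch of that argument – the two entropy functions are made-up, ideal-gas-like toy models S_i(E_i) = N_i log(E_i), not anything taken from LL:

```python
import numpy as np

# toy subsystem entropies (ideal-gas-like): S_i(E_i) = N_i * log(E_i)
N1, N2 = 3e23, 7e23       # 'particle numbers' of the two subsystems
E_total = 1000.0          # total energy, arbitrary units

E1 = np.linspace(1.0, E_total - 1.0, 200_000)
S_total = N1 * np.log(E1) + N2 * np.log(E_total - E1)

E1_eq = E1[np.argmax(S_total)]       # energy split that maximizes total S
dS1_dE1 = N1 / E1_eq                 # = 1/T_1
dS2_dE2 = N2 / (E_total - E1_eq)     # = 1/T_2

print(f"equilibrium split: E1 = {E1_eq:.2f}, E2 = {E_total - E1_eq:.2f}")
print(f"dS1/dE1 = {dS1_dE1:.4e}, dS2/dE2 = {dS2_dE2:.4e}")   # (nearly) equal
```

The maximum of the total entropy sits exactly at the split where both derivatives – both ‘inverse temperatures’ – coincide.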

Learning Physics, Metaphors, and Quantum Fields

In my series on Quantum Field Theory I wanted to document my own learning endeavors but it has turned into a meta-contemplation on the ‘explain-ability’ of theoretical physics.

Initially I had been motivated by a comment David Tong made in his introductory lecture: Comparing different QFT books he states that Steven Weinberg‘s books are hard reads because at the time of writing Weinberg was probably the person who knew more about Quantum Field Theory than anyone else in the world. In contrast, Weinberg’s book on General Relativity is accessible, which Tong attributes to Weinberg still learning GR himself while he was writing that textbook.

Probably I figured nothing can go awry if I don’t know too much myself. Of course you should know what you are talking about – not masking ignorance with vague phrases such as scientists proved, experts said, in a very complicated process XY has been done.

Yet my lengthy posts on phase space didn’t score too high on the accessibility scale. Science writer Jennifer Ouellette blames readers’ confusion on writers not knowing their target audience:

This is quite possibly the most difficult task of all. You might be surprised at how many scientists and science writers get the level of discourse wrong when attempting to write “popular science.” Brian Greene’s The Elegant Universe was an undeniably important book, and it started off quite promising, with one of the best explications of relativity my layperson’s brain has yet encountered. But the minute he got into the specifics of string theory — his area of expertise — the level of discourse shot into the stratosphere. The prose became littered with jargon and densely packed technical details. Even highly science-literate general readers found the latter half of the book rough going.

Actually, I have experienced this effect myself as a reader of popular physics books. I haven’t read The Elegant Universe, but Lisa Randall’s Warped Passages or her Knocking on Heaven’s Door are in my opinion similar with respect to an exponential learning curve.

Authors go to great lengths in explaining the mysteries of ordinary quantum mechanics: the double-slit experiment, Schrödinger’s cat, the wave-particle dualism, probably a version of Schrödinger’s equation motivated by analogies to hydrodynamics.

Curved space

An icon of a science metaphor – curved space (Wikimedia, NASA).

Then tons of different fundamental particles get introduced – hard to keep track of if you don’t have a print-out of the standard model of particle physics at hand, but still doable. But suddenly you find yourself in a universe you lost touch with. Re-reading such books again now I find full-blown lectures on QFT compressed into single sentences. The compression rate here is much higher than for the petty QM explanations.

I have a theory:

The comprehensibility of a popular physics text is inversely proportional to the compression factor of the math used (even if math is not explicitly referenced).

In PI in the Sky John Barrow mulls over succinct laws of nature in terms of the unreasonable effectiveness of mathematics. An aside: Yet Barrow is as critical as Nassim Taleb with respect to the allure of Platonicity. ‘What is most remarkable about the success of mathematics in [particle physics and cosmology] is that they are most remote from human experience’ (quote from PI in the Sky).

Important concepts in QM can be explained in high school math. My old high school physics textbook contained a calculation of the zero point energy of a Fermi gas of electrons in metals.

Equations in advanced theoretical physics might still appear simple, still using symbols taken from the Latin or Greek alphabet. But unfortunately these letters denote mathematical objects that are not simple numbers – this is highly efficient, compressed notation. These objects are the proverbial mathematical machinery(*) that acts on other objects. Sounds like the vague phrases I scathed before, doesn’t it? These operators are rather like software programs, using the thing to the right of the machine as an input – but that’s already too much of a metaphor as the ‘input’ is not a number either.
(*) I used the also common term mathematical crank in earlier posts which I avoid now due to obvious reasons.

You can create rather precise metaphors for differential operators in classical physics, using references to soft rolling hills and things changing in time or (three-dimensional) space. You might be able to introduce the curly small d’s in partial derivatives when applying these concepts to three-dimensional space. More than three dimensions can be explained by resorting to the beetle-on-balloon or ant-in-the-hose metaphors.

But if it gets more advanced than that I frankly run out of metaphors I am comfortable with. You ought to explain some purely mathematical concepts before you continue to discuss physics.

I think comprehension of those popular texts on advanced topics works this way:

  • You can understand anything perfectly if you have once developed a feeling for the underlying math. For example you can appreciate descriptions of physical macroscopic objects moving under the influence of gravity, such as in celestial mechanics. Even if you have forgotten the details of your high school calculus lectures you might remember some facts on acceleration and speed you need to study when cramming for your driver license test.
  • When authors start to introduce new theoretical concepts there is a grey area of understanding – allowing for stretching your current grasp of math a bit. So it might be possible to understand a gradient vector as a slope of a three-dimensional hill even if you never studied vector calculus.
  • Suddenly you are not sure if the content presented is related to anything you have a clue of or if metaphors rather lead you astray. This is where new mathematical concepts have been introduced silently.

The effect of silently introduced cloaked math may even be worse as readers believe they understand but have been led astray. Theoretical physicist (and seasoned science blogger) Sabine Hossenfelder states in her post on metaphors in science:

Love: Analogies and metaphors build on existing knowledge and thus help us to understand something quickly and intuitively.

Hate: This intuition is eventually always misleading. If a metaphor were exact, it wouldn’t be a metaphor.

And while in writing, art, and humor most of us are easily able to tell when an analogy ceases to work, in science it isn’t always so obvious.

My plan has been to balance metaphors and rigor by reading textbooks in parallel with popular science books. I am mainly using Zee’s Quantum Field Theory in a Nutshell, Klauber’s Student Friendly Quantum Field Theory, and Tong’s lecture notes and videos.

Feynman penguin diagram

Feynman diagrams are often used in pop-sci texts to explain particle decay paths and interactions. Actually they are shortcuts for calculating terms in daunting integrals. The penguin is not a metaphor but a crib – a funny name for a specific class of diagrams that sort of resemble penguins.

But I also enjoyed Sean Carroll’s The Particle at the End of the Universe – my favorite QFT- / Higgs-related pop-sci book. Reading his chapters on quantum fields I felt he has boldly gone where no other physicist writing pop-sci had gone before. In many popular accounts of the Higgs boson and Higgs field we find somewhat poetic accounts of particles that communicate forces, such as the photon being the intermediary of electromagnetic forces.

Sean Carroll goes to the mathematical essence of the relationship of (rather abstract) symmetries, connection fields and forces:

The connection fields define invisible ski slopes at every point in space, leading to forces that push particles in different directions, depending on how they interact. There’s a gravitational ski slope that affects every particle in the same way, an electromagnetic ski slope that pushes positively charged particles one way and negatively charged particles in the opposite direction, a strong-interaction ski slope that is only felt by quarks and gluons, and a weak-interaction ski slope that is felt by all the fermions of the Standard Model, as well as by the Higgs boson itself. 

Indeed, in his blog Carroll writes:

So in the end, recognizing that it’s a subtle topic and the discussion might prove unsatisfying, I bit the bullet and tried my best to explain why this kind of symmetry leads directly to what we think of as a force. Part of that involved explaining what a “connection” is in this context, which I’m not sure anyone has ever tried before in a popular book. And likely nobody ever will try again!

This is the best popular account of symmetries and forces I could find so far – yet I confess: I could not make 100% sense of this before I had plowed through the respective chapters in Zee’s book. This is the right place to add a disclaimer: Of course I hold myself accountable for a possibly slow absorbing power or wrong approach of self-studying, as well as for confusing my readers. My brain is just the only one I have access to for empirical analysis right now and the whole QFT thing is an experiment. I should maybe just focus on writing about current research in an accessible way or keeping a textbook-style learner’s blog similar to this one.

Back to metaphors: Symmetries are usually explained by invoking rotating regular objects and crystals, but I am not sure if this image will inspire anything close to gauge symmetry in readers’ minds. Probably worse: I had recalled gauge symmetry in electrodynamics, but it was not straight-forward how to apply and generalize it to quantum fields – I needed to see some equations.

Sabine Hossenfelder says:

If you spend some time with a set of equations, pushing them back and forth, you’ll come to understand how the mathematical relationships play together. But they’re not like anything. They are what they are and have to be understood on their own terms.

Actually I had planned a post on the different routes to QFT – complementary to my post on the different ways to view classical mechanics. Unfortunately I feel the mathematically formidable path integrals would lend themselves more to metaphoric popularization – and thus more confusion.

You could either start with fields and quantize them, which turns the classical fields (numbers attached to any point in space and time) into mathematical operators that actually create and destroy particles. Depending on the book you pick this is introduced as something straight-forward or as a big conceptual leap. My initial struggles with re-learning QFT concepts were actually due to the fact I had been taught the ‘dull’ approach (many years ago):

  • Simple QM deals with single particles. Mathematically, the state of those is described by the probability of a particle occupying this state. Our mathematical operators let you take the proverbial quantum leap – from one state to another. In QM lingo you destroy or create states.
  • There are many particles in condensed matter, thus we just extend our abstract space. The system is not only described by the properties of each particle, but also by the number of particles present. Special relativity might not matter.
  • Thus it is somehow natural that our machinery now creates or destroys particles.

The applications presented in relation to this approach were all taken from solid state physics where you deal with lots of particles anyway and creating and destroying some was not a big deal. It is more exciting if virtual particles are created from the vacuum and violating the conservation of energy for a short time, in line with the uncertainty principle.
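For the condensed-matter flavour of that machinery, here is a minimal sketch of such operators in a truncated number-state basis – a standard textbook construction, not anything specific to the books mentioned above:

```python
import numpy as np

n_levels = 6
# annihilation operator a, truncated to n_levels states: a|n> = sqrt(n)|n-1>
a = np.diag(np.sqrt(np.arange(1, n_levels)), k=1)
a_dag = a.conj().T                      # creation operator

state = np.zeros(n_levels)
state[3] = 1.0                          # the state |3>

print(a @ state)            # ~ sqrt(3) * |2>: one particle/quantum destroyed
print(a_dag @ state)        # ~ sqrt(4) * |4>: one particle/quantum created
print(np.diag(a_dag @ a))   # number operator: eigenvalues 0, 1, ..., n_levels-1
```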

The alternative route to this one (technically called canonical quantization) is the so-called path integral formalism. Zee introduces it via an anecdote of a wise guy student (called Feynman) who pesters his teacher with questions on the classic double-slit experiment: A particle emitted from a source passes through one of two holes and a detector records a spatially varying intensity based on interference. Now the wise guy asks: What if we drill a third hole, a fourth hole, a fifth hole? What if we add a second screen, a third screen? The answer is that, as we add additional paths the particle might take, the amplitudes related to these paths will also contribute to the interference pattern.

Now the final question is: What if we remove all screens – drilling infinite holes into those screens? Then all possible paths the particle can traverse from source to detector would contribute. You sum over all (potential) histories.
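A minimal sketch of that amplitude-adding recipe for a single screen with a handful of slits – geometry and wavelength are made-up numbers; each path from source via slit to detector contributes exp(i·k·L), and the intensity is the squared magnitude of the sum:

```python
import numpy as np

wavelength = 1.0
k = 2 * np.pi / wavelength

# geometry: source at x=0, screen with slits at x=50, detector plane at x=100
slit_y = np.array([-3.0, -1.0, 1.0, 3.0])        # a 4-slit 'screen'
detector_y = np.linspace(-30, 30, 601)

intensity = []
for yd in detector_y:
    amplitude = 0j
    for ys in slit_y:
        # path length source -> slit -> detector point
        L = np.hypot(50.0, ys) + np.hypot(50.0, yd - ys)
        amplitude += np.exp(1j * k * L)          # add one 'arrow' per path
    intensity.append(abs(amplitude) ** 2)

print(f"max / min intensity on the detector: {max(intensity):.2f} / {min(intensity):.4f}")
```

Drilling more holes or adding more screens just means more terms in the sum; removing all screens turns the sum into an integral over all paths.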

I guess a reasonable pop-sci article would probably not go into further details of what it means to sum over an infinite number of paths and yet get reasonable – finite – results, or to expound why on earth this should be similar to operators destroying particles. We should add that the whole amplitude-adding business was presented as an axiom. This is weird, but this is how the world seems to work! (Paraphrasing Feynman).

Then we would insert an opaque blackbox [something about the complicated machinery – see details on path integrals if you really want to] and jump directly to things that can eventually be calculated, like scattering cross-sections and predictions of how particles will interact with each other in the LHC … and gossip about Nobel Prize winners.

Yet it is so tempting to ponder on how the classical action (introduced here) is related to this path integral: Everything we ‘know about the world’ is stuffed into the field-theoretical counterpart of the action. The action defines the phase (‘angle’) attached to a path. (Also Feynman talks about rotating arrows!) Quantum phenomena emerge when the action becomes comparable to Planck’s constant. If the action is much bigger, most of the paths are cancelled out: if phases fluctuate wildly, the contributions of different paths cancel each other.

“I am not gonna simplify it. If you don’t like it – that’s too bad!”

On the Relation of Jurassic Park and Alien Jelly Flowing through Hyperspace

Yes, this is a serious physics post – no. 3 in my series on Quantum Field Theory.

I promised to explain what Quantization is. I will also argue – again – that classical mechanics is unjustly associated with pictures like this:

Steampunk wall clock (Wikimedia)

… although it is more like this:

Timelines in Back to the Future | By TheHYPO [CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0) or GFDL (http://www.gnu.org/copyleft/fdl.html)], via Wikimedia Commons

This shows the timelines in Back to the Future – in case you haven’t recognized it immediately.

What I am trying to say here – again – is that so-called classical theory is as geeky, as weird, and as fascinating as quantum physics.

Experts: In case I get carried away by my metaphors – please see the bottom of this post for technical jargon and what I actually try to do here.

Get a New Perspective: Phase Space

I am using my favorite simple example: A point-shaped mass connected to a massless spring, or a pendulum, oscillating forever – not subject to friction.

The speed of the mass is zero when the motion changes from ‘upward’ to ‘downward’. It is maximum when the pendulum reaches the point of minimum height. Everything oscillates: Kinetic energy is transferred to potential energy and back. Position, velocity and acceleration all follow wavy sine or cosine functions.

For purely aesthetic reasons I could also plot the velocity versus position:

Simple Harmonic Motion Orbit | By Mazemaster (Own work) [Public domain], via Wikimedia Commons

From a mathematical perspective this is similar to creating those beautiful Lissajous curves: Connecting a signal representing position to the x input of an oscilloscope and the velocity signal to the y input results in a circle or an ellipse:

Lissajous curves | User Fiducial, Wikimedia

This picture of the spring’s or pendulum’s motion is called a phase portrait in phase space. Actually we use momentum, that is: velocity times mass, but this is a technicality.

The phase portrait is a way of depicting what a physical system does or can do – in a picture that allows for quick assessment.
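A minimal sketch that draws such a phase portrait for the frictionless oscillator – unit mass and unit spring constant are assumed, so the orbits come out as circles:

```python
import numpy as np
import matplotlib.pyplot as plt

omega = 1.0                            # angular frequency (m = k = 1 assumed)
t = np.linspace(0, 2 * np.pi, 400)

for amplitude in (0.5, 1.0, 1.5):
    q = amplitude * np.cos(omega * t)              # position
    p = -amplitude * omega * np.sin(omega * t)     # momentum (unit mass)
    plt.plot(q, p, label=f"amplitude {amplitude}")

plt.xlabel("position q")
plt.ylabel("momentum p")
plt.gca().set_aspect("equal")
plt.title("Phase portrait of a simple harmonic oscillator")
plt.legend()
plt.show()
```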

Non-Dull Phase Portraits

Real-life oscillating systems do not follow simple cycles. The so-called Van der Pol oscillator is a model system subject to damping. It is also non-linear because the force of friction depends on the position squared and the velocity. Non-linearity is not uncommon; also the friction an airplane or car ‘feels’ in the air is proportional to the velocity squared.

The stronger this non-linear interaction is (the parameter mu in the figure below) the more will the phase portrait deviate from the circular shape:

Van der pols equation phase portrait | By Krishnavedala (Own work) [CC-BY-SA-3.0 (http://creativecommons.org/licenses/by-sa/3.0) or GFDL (http://www.gnu.org/copyleft/fdl.html)], via Wikimedia Commons

Searching for this image I have learned from Wikipedia that the Van der Pol oscillator is used as a model in biology – here the physical quantity considered is not a position but the action potential of a neuron (the electrical voltage across the cell’s membrane).

Thus plotting the rate of change of a quantity we can measure versus the quantity itself makes sense for diverse kinds of systems. This is not limited to natural sciences – you could also determine the phase portrait of an economic system!
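For the record, a minimal sketch for generating such a Van der Pol phase portrait numerically – using the standard textbook form of the equation; the parameter values are just examples:

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.integrate import solve_ivp

def van_der_pol(t, state, mu):
    x, v = state
    # x'' - mu*(1 - x^2)*x' + x = 0, rewritten as two first-order equations
    return [v, mu * (1 - x**2) * v - x]

t_eval = np.linspace(0, 50, 5_000)
for mu in (0.1, 1.0, 3.0):
    sol = solve_ivp(van_der_pol, (0, 50), [0.5, 0.0], args=(mu,), t_eval=t_eval)
    plt.plot(sol.y[0], sol.y[1], label=f"mu = {mu}")

plt.xlabel("x")
plt.ylabel("dx/dt")
plt.title("Van der Pol oscillator: phase portraits")
plt.legend()
plt.show()
```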

Addicts of popular culture memes might have guessed already which phase portrait needs to be depicted in this post:

Reconnecting to Popular Science

Chaos Theory has become popular via the elaborations of Dr. Ian Malcolm (Jeff Goldblum) in the movie Jurassic Park. Chaotic systems exhibit phase portraits that are called Strange Attractors. An attractor is the set of points in phase space a system ‘gravitates’ to if you leave it to itself.

There is no attractor for the simple spring: This system will trace out a specific circle in phase space forever – a larger one the bigger the initial push on the spring is.

The most popular strange attractor is probably the Lorenz Attractor. It was initially associated with physical properties characteristic of temperature and the flow of air in the earth’s atmosphere, but it can be re-interpreted as a system modeling chaotic phenomena in lasers.

It might be apocryphal but I have been told that it is not the infamous flap of the butterfly’s wing that gave the related effect its name, but rather the shape of the three-dimensional attractor:

Lorenz system r28 s10 b2-6666 | By Computed in Fractint by Wikimol [Public domain], via Wikimedia Commons
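A minimal sketch that traces this attractor with the classic parameter values from the image caption (σ = 10, r = 28, b = 8/3), using a crude Euler integration and plotting a 2D projection:

```python
import numpy as np
import matplotlib.pyplot as plt

sigma, r, b = 10.0, 28.0, 8.0 / 3.0
dt, steps = 0.005, 40_000

xyz = np.empty((steps, 3))
xyz[0] = (1.0, 1.0, 1.0)
for i in range(steps - 1):
    x, y, z = xyz[i]
    # Lorenz equations; simple Euler step (crude, but fine for a picture)
    xyz[i + 1] = xyz[i] + dt * np.array([sigma * (y - x),
                                         x * (r - z) - y,
                                         x * y - b * z])

plt.plot(xyz[:, 0], xyz[:, 2], lw=0.3)
plt.xlabel("x")
plt.ylabel("z")
plt.title("Lorenz attractor (sigma = 10, r = 28, b = 8/3), x-z projection")
plt.show()
```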

We had Jurassic Park – here comes the jelly!

A single point-particle on a spring can move only along a line – it has a single degree of freedom. You need just a two-dimensional plane to plot its velocity over position.

Allowing for motion in three-dimensional space means we need to add additional dimensions: The motion is fully characterized by the (x,y,z) positions in 3D space plus the 3 components of velocity. Actually, this three-dimensional vector is called velocity – its size is called speed.

Thus we need already 6 dimensions in phase space to describe the motion of an idealized point-shaped particle. Now throw in an additional point-particle: We need 12 numbers to track both particles – hence 12 dimensions in phase space.

Why can’t the two particles simply use the same space?(*) Both particles still live in the same 3D space, and they could also inhabit the same 6D phase space. The 12D representation has an advantage though: The whole system is represented by a single dot, which makes our lives easier if we contemplate different systems at once.

Now consider a system consisting of zillions of individual particles. Consider 1 cubic meter of air containing about 10^{25} molecules. Viewing these particles in a Newtonian, classical way means to track their individual positions and velocities. In a pre-quantum mechanical deterministic assessment of the world you know the past and the future by calculating these particles’ trajectories from their positions and velocities at a certain point of time.

Of course this is not doable and leads to practical non-determinism due to calculation errors piling up and amplifying. This is a 10^{25}-body problem, much much much more difficult than the three-body problem.

Fortunately we don’t really need all those numbers in detail – useful properties of a gas such as the temperature constitute gross statistical averages of the individual particles’ properties. Thus we want to get a feeling how the phase portrait develops ‘on average’, not looking too meticulously at every dot.

The full-blown phase space of the system of all molecules in a cubic meter of air has about 10^{26} dimensions – 6 for each of the 10^{25} particles (Physicists don’t care about a factor of 6 versus a factor of 10). Each state of the system is sort of a snapshot of what the system really does at a point of time. It is a vector in 10^{26} dimensional space – a looooong ordered collection of numbers, but nonetheless conceptually not different from the familiar 3D ‘arrow-vector’.

Since we are interested in averages and probabilities we don’t watch a single point in phase space. We don’t follow a particular system.

We rather imagine an enormous number of different systems under different conditions.

Considering the gas in the cubic vessel this means: We imagine molecule 1 being at the center and very fast, whereas molecule 10 is slow and in the upper right corner, and molecule 666 is in the lower left corner and has medium speed. Now extend this description to 10^{25} particles.

But we know something about all of these configurations: There is a maximum x, y and z particles can have – the phase portrait is limited by these maximum dimensions, as the circle representing the spring was. The particles have all kinds of speeds in all kinds of directions, but there is a most probable speed related to temperature.

The collection of the states of all possible systems occupies a patch in 10^{26} dimensional phase space.

This patch gradually peters out at the edges in velocities’ directions.

Now let’s allow the vessel to grow: The patch will become bigger in the spatial dimensions as particles can have any position in the larger cube. Since the temperature will decrease due to the expansion, the mean velocity will decrease – assuming the cube is insulated.

The time evolution of the system (of these systems, each representing a possible system) is represented by this hyper-dimensional patch transforming and morphing. Since we consider so many different states – otherwise probabilities don’t make sense – we don’t see the granular nature due to individual points – it’s like a piece of jelly moving and transforming:

Precisely defined initial configurations of systems have a tendency to get mangled and smeared out. Note again that each point in the jelly is not equivalent to a molecule of gas but is a point in an abstract configuration space with a huge number of dimensions. We can only make it accessible via projections into our 3D world or a 2D plane.

The analogy to jelly or honey or any fluid is more apt than it may seem

The temporal evolution in this hyperspace is indeed governed by equations that are amazingly similar to those governing an incompressible liquid – such as water. There is continuity and locality: Hyper-jelly can’t get lost or be created. Any increase in hyper-jelly in a tiny volume of phase space can only be attributed to jelly flowing into this volume from adjacent little volumes.
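A minimal sketch of that incompressibility for the simplest possible case, the harmonic oscillator with unit mass and frequency: a patch of initial conditions is evolved in time, and its phase-space area stays put.

```python
import numpy as np

def patch_area(q, p):
    # shoelace formula for the area enclosed by the patch boundary
    return 0.5 * abs(np.dot(q, np.roll(p, 1)) - np.dot(p, np.roll(q, 1)))

# boundary of an initial 'blob' of possible systems in phase space (an ellipse)
theta = np.linspace(0, 2 * np.pi, 400, endpoint=False)
q0 = 1.0 + 0.5 * np.cos(theta)
p0 = 0.2 * np.sin(theta)

for t in (0.0, 1.0, 5.0):
    # exact harmonic-oscillator flow (m = omega = 1): a rotation in phase space
    q = q0 * np.cos(t) + p0 * np.sin(t)
    p = p0 * np.cos(t) - q0 * np.sin(t)
    print(f"t = {t}: phase-space area of the patch = {patch_area(q, p):.4f}")
```

The blob moves around, but like an incompressible drop of jelly it always encloses the same area (in higher dimensions: the same volume).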

In summary: Classical mechanical systems comprising many degrees of freedom – that is: many components that have freedom to move in a different way than other parts of the system – can be best viewed in the multi-dimensional space whose dimensions are (something like) positions and (something like) the related momenta.

Can it get more geeky than that in quantum theory?

Finally: Quantization

I said in the previous post that quantization of fields or waves is like turning down intensity in order to bring out the particle-like rippled nature of that wave. In the same way you could say that you add blurry waviness to idealized point-shaped particles.

Another way is to consider the loss in information via Heisenberg’s Uncertainty Principle: You cannot know both the position and the momentum of a particle or a classical wave exactly at the same time. By the way, this is why we picked momenta and not velocities to generate phase space.

You calculate positions and momenta of the small little volumes that constitute those flowing and crawling patches of jelly at a point of time from the positions and momenta at the point of time before. That’s the essence of Newtonian mechanics (and conservation of matter) applied to fluids.

Doing numerical calculation in hydrodynamics you think of jelly as divided into small little flexible cubes – you divide it mentally using a grid, and you apply a mathematical operation that creates the new state of this digitized jelly from the old one.

Since we are still discussing a classical world we do know positions and momenta with certainty. This translates to stating (in math) that it does not matter if you do calculations involving positions first or momenta first.

There are different ways of carrying out steps in these calculations because you could do them one way or the other – they are commutative.

Calculating something in this respect is similar to asking nature for a property or measuring that quantity.

Thus when we apply a quantum viewpoint and quantize a classical system calculating momentum first and position second or doing it the other way around will yield different results.

The quantum way of handling the system of those 10^{25} particles looks the same as the classical equations at first glance. The difference is in the rules for carrying out calculations involving positions and momenta – so-called conjugate variables.

Thus quantization means you take the classical equations of motion and give the mathematical symbols a new meaning and impose new, restricting rules.
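A minimal symbolic sketch of those new rules in one dimension, with the momentum operator represented as -iħ d/dx acting on a test function – a standard representation, not specific to any of the texts mentioned: the results of ‘position then momentum’ and ‘momentum then position’ differ by exactly iħ times the test function.

```python
import sympy as sp

x, hbar = sp.symbols("x hbar", real=True)
f = sp.Function("f")(x)             # an arbitrary test function

def p_op(g):
    # momentum operator in the position representation: -i*hbar*d/dx
    return -sp.I * hbar * sp.diff(g, x)

x_then_p = x * p_op(f)              # apply p, then multiply by x
p_then_x = p_op(x * f)              # multiply by x, then apply p

print(sp.simplify(x_then_p - p_then_x))   # I*hbar*f(x), i.e. [x, p] = i*hbar
```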

I probably could just have stated that without going off on that tangent.

However, any system of interest in the real world is not composed of isolated particles. We live in a world of those enormous phase spaces.

In addition, working with large abstract spaces like this is at the heart of quantum field theory: We start with something spread out in space – a field with infinitely many degrees of freedom. Considering different state vectors in these quantum systems is considering all possible configurations of this field at every point in space!

(*) This was a question asked on G+. I edited the post to incorporate the answer.

_______________________________________

Expert information:

I have taken a detour through statistical mechanics: Introducing the Liouville equation as an equation of continuity in a multi-dimensional phase space. The operations mentioned – related to positions and momenta – are the replacement of time derivatives via Hamilton’s equations. I resisted the temptation to mention the hyper-planes of constant energy. Replacing the Poisson bracket in classical mechanics with the commutator in quantum mechanics turns the Liouville equation into its quantum counterpart, also called the Von Neumann equation.

I know that a discussion about the true nature of temperature is opening a can of worms. We should describe temperature as the width of a distribution rather than the average, as a beam of molecules all travelling in the same direction at the same speed has a temperature of zero Kelvin – not an option due to zero point energy.

The Lorenz equations have been applied to the electrical fields in lasers by Haken – here is a related paper. I did not go into the difference between the phase portrait of a system, showing its time evolution, and the attractor, which is the system’s final state. I also didn’t stress that this is a three-dimensional image of the Lorenz attractor and that in this case the ‘velocities’ are not depicted. You could say it is the 3D projection of the 6D phase portrait. I basically wanted to demonstrate – using catchy images, admittedly – that representations in phase space allow for a quick assessment of a system.

I also tried to introduce the notion of a state vector in classical terms, not jumping to bras and kets in the quantum world as if a state vector does not have a classical counterpart.

I have picked an example of a system undergoing a change in temperature (non-stationary – not the example you would start with in statistical thermodynamics) and swept all considerations on ergodicity and related meaningful time evolutions of systems in phase space under the rug.

May the Force Field Be with You: Primer on Quantum Mechanics and Why We Need Quantum Field Theory

As Feynman explains so eloquently – and yet in a refreshingly down-to-earth way – understanding and learning physics works like this: There are no true axioms, you can start from anywhere. Your physics knowledge is like a messy landscape, built from different interconnected islands of insights. You will not memorize them all, but you need to recapture how to get from one island to another – how to connect the dots.

The beauty of theoretical physics is in jumping from dot to dot in different ways – and in pondering on the seemingly different ‘philosophical’ worldviews that different routes may provide.

This is the second post in my series about Quantum Field Theory, and I try to give a brief overview of the concept of a field in general, and of why we need QFT to complement or replace Quantum Mechanics. I cannot avoid reiterating some of that often quoted wave-particle paraphernalia in order to set the stage.

From sharp linguistic analysis we might conclude that it is the notion of Field that distinguishes Quantum Field Theory from mere Quantum Theory.

I start with an example everybody uses: a so-called temperature field, which is simply: a temperature – a value, a number – attached to every point in space. An animation of monthly mean surface air temperature could be called the temporal evolution of the temperature field:

Monthly Mean Temperature

Solar energy is absorbed at the earth’s surface. In summer the net energy flow is directed from the air to the ground, in winter the energy stored in the soil is flowing to the surface again. Temperature waves are slowly propagating perpendicular to the surface of the earth.

The gradual evolution of temperature is dictated by the fact that heat flows from the hotter to the colder regions. When you deposit a lump of heat underground – Feynman once used an atomic bomb to illustrate this point – you start with a temperature field consisting of a sharp maximum, a peak, located in a region the size of the bomb. Wait for some minutes and this peak will peter out. Heat will flow outward, the temperature will rise in the outer regions and decrease in the center:

Diffluence of a bucket of heat, governed by the Heat Transfer Equation

Modelling the temperature field (as I did – in relation to a specific source of heat placed underground) requires solving the Heat Transfer Equation, which is the mathy equivalent of the previous paragraph. The temperature is calculated step by step numerically: The temperature at a certain point in space determines the flow of heat nearby – the heat transferred changes the temperature – the temperature in the next minute determines the flow – and on and on.
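That step-by-step recipe is exactly what a simple explicit finite-difference scheme does. Here is a minimal 1D sketch – the material parameters and the grid are made-up values, not those of my original underground heat-source model:

```python
import numpy as np

alpha = 1e-6           # thermal diffusivity in m^2/s (made-up value)
dx, dt = 0.01, 20.0    # grid spacing (m) and time step (s)
r = alpha * dt / dx**2
assert r < 0.5, "explicit scheme would be unstable"

# initial temperature field: a sharp 'lump of heat' in the middle
T = np.zeros(201)
T[95:106] = 100.0

for step in range(20_000):
    # each point is updated from its neighbours: heat flows from hot to cold
    T[1:-1] += r * (T[2:] - 2 * T[1:-1] + T[:-2])

print(f"peak temperature after {20_000 * dt / 3600:.0f} h: {T.max():.1f}")
```

The peak peters out exactly as described above: the maximum drops while the neighbouring regions warm up.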

This mundane example should tell us something about a fundamental principle – an idea that explains why fields of a more abstract variety are so important in physics: Locality.

It would not violate the principle of the conservation of energy if a bucket of heat suddenly disappeared in one place and appeared in another, separated from the first one by a light year. Intuitively we know that this is not going to happen: Any disturbance or ripple is transported by impacting something nearby.

All sorts of field equations do reflect locality, and ‘unfortunately’ this is the reason why all fundamental equations in physics require calculus. Those equations describe in a formal way how small changes in time and small variations in space do affect each other. Consider the way a sudden displacement traverses a rope:

Propagation of a wave

Sound waves travelling through air are governed by local field equations. So are light rays or X-rays – electromagnetic waves – travelling through empty space. The term wave is really a specific instance of the more generic field.

An electromagnetic wave can be generated by shaking an electrical charge. The disturbance is a local variation in the electrical field which gives rise to a changing magnetic field, which in turn gives rise to a disturbance in the electrical field …

Electromagnetic wave in 3D

Electromagnetic fields are more interesting than temperature fields: Temperature, after all, is not fundamental – it can be traced back to wiggling of atoms. Sound waves are equivalent to periodic changes of pressure and velocity in a gas.

Quantum Field Theory, however, should finally cover fundamental phenomena. QFT tries to explain tangible matter only in terms of ethereal fields, no less. It does not make sense to ask what these fields actually are.

I have picked light waves deliberately because those are fundamental. Due to historical reasons we are rather familiar with the wavy nature of light – such as the colorful patterns we see on our CDs whose grooves act as a diffraction grating:

Michael Faraday had introduced the concept of fields in electromagnetism, mathematically fleshed out by James C. Maxwell. Depending on the experiment (that is: on the way you prod nature to give an answer to a specifically framed question) light may behave more like a particle, a little bullet, the photon – as stipulated by Einstein.

In Compton Scattering a photon partially transfers energy when colliding with an electron: The change in the photon’s frequency corresponds to its loss in energy. Based on the angle between the trajectories of the electron and the photon, energy and momentum transfer can be calculated – using the same reasoning that can be applied to colliding billiard balls.

Compton Effect
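A minimal numerical sketch of that bookkeeping – the scattering angle and the incoming wavelength are arbitrary example values; the physical constants come from scipy:

```python
import numpy as np
from scipy.constants import h, m_e, c, e

theta = np.deg2rad(60)          # scattering angle, example value
lambda_in = 20e-12              # incoming photon wavelength: 20 pm (hard X-ray)

# Compton formula: the wavelength shift depends only on the angle
delta_lambda = (h / (m_e * c)) * (1 - np.cos(theta))
lambda_out = lambda_in + delta_lambda

E_in = h * c / lambda_in / e    # photon energies in eV
E_out = h * c / lambda_out / e
print(f"wavelength shift: {delta_lambda * 1e12:.3f} pm")
print(f"photon energy: {E_in / 1e3:.1f} keV -> {E_out / 1e3:.1f} keV "
      f"(the electron picks up {(E_in - E_out) / 1e3:.1f} keV)")
```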

We tend to consider electrons fundamental particles. But they give proof of their wave-like properties when beams of accelerated electrons are utilized in analyzing the microstructure of materials. In transmission electron microscopy diffraction patterns are generated that allow for identification of the underlying crystal lattice:

A complete quantum description of an electron or a photon does contain both the wave and particle aspects. Diffraction patterns like this can be interpreted as highlighting the regions where the probabilities to encounter a particle are maximum.

Schrödinger has given the world that famous equation named after him that does allow for calculating those probabilities. It is his equation that lets us imagine point-shaped particles as blurred wave packets:

Schrödinger’s equation explains all of chemistry: It allows for calculating the shape of electrons’ orbitals. It explains the size of the hydrogen atom and it explains why electrons can inhabit stable ‘orbits’ at all – in contrast to the older picture of the orbiting point charge that would lose energy all  the time and finally fall into the nucleus.
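For instance, the characteristic size and binding energy of hydrogen follow from nothing but the constants appearing in Schrödinger’s equation – a minimal sketch using scipy’s physical constants:

```python
from scipy.constants import hbar, m_e, e, epsilon_0, pi

# Bohr radius: the length scale Schroedinger's equation sets for hydrogen
a0 = 4 * pi * epsilon_0 * hbar**2 / (m_e * e**2)

# ground-state (binding) energy of the electron
E1 = -m_e * e**4 / (2 * (4 * pi * epsilon_0)**2 * hbar**2)

print(f"Bohr radius:         {a0 * 1e12:.1f} pm")    # ~52.9 pm
print(f"Ground-state energy: {E1 / e:.2f} eV")       # ~ -13.6 eV
```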

But this so-called quantum mechanical picture does not explain some essential phenomena:

  • Pauli’s exclusion principle explains why matter is extended in space – particles need to be put into different orbitals, different little volumes in space. But it is a rule you fill in by hand, phenomenologically!
  • Schrödinger’s equation describes single particles as blurry probability waves, but it still makes sense to call these the equivalents of well-defined single particles. It does not make sense anymore if we take into account special relativity.

Heisenberg’s uncertainty principle – a consequence of Schrödinger’s equation – dictates that we cannot know both position and momentum or both energy and time of a particle. For a very short period of time conservation of energy can be violated which means the energy associated with ‘a particle’ is allowed to fluctuate.

As per the most famous formula in the world energy is equivalent to mass. When the energy of ‘a particle’ fluctuates wildly virtual particles – whose energy is roughly equal to the allowed fluctuations – can pop into existence intermittently.

However, in order to make quantum mechanics compatible with special relativity it was not sufficient to tweak Schrödinger’s equation just a bit.

Relativistically correct Quantum Field Theory is rather based on the concept of an underlying field pervading space. Particles are just ripples in this ur-stuff – I owe that metaphor to Frank Wilczek. A different field is attributed to each variety of fundamental particles.

You need to take a quantum leap… It takes some mathematical rules to move from the classical description of the world to the quantum one, sometimes called quantization. Using a very crude analogy quantization is like making a beam of light dimmer and dimmer until it reveals its granular nature – turning the wavy ray of light into a cascade of photonic bullets.

In QFT you start from a classical field that should represent particles and then apply the machinery of quantization to that field (which is called second quantization, although you do not quantize twice). Amazingly, the electron’s spin and Pauli’s principle are a natural consequence if you do it right. Paul Dirac‘s achievement in crafting the first relativistically correct equation for the electron cannot be overstated.

I found these fields the most difficult concepts to digest, but probably for technical reasons:

Historically  – and this includes some of those old text books I am so fond of – candidate versions of alleged quantum mechanical wave equations have been tested to no avail, such as the Klein-Gordon equation. However this equation turned out to make sense later – when re-interpreted as a classical field equation that still needs to be quantized.

It is hard to make sense of those fields intuitively. However, there is one field we are already familiar with: Photons are ripples arising from the electromagnetic field. Maxwell’s equations describing these fields had been compatible with special relativity – they predate the theory of relativity, and the speed of light shows up as a natural constant. No tweaks required!

I will work hard to turn the math of quantization into comprehensible explanations, risking epic failure. For now I hand over to MinutePhysics for an illustration of the correspondence of particles and fields:

Disclaimer – Bonus Track:

In this series I do not attempt to cover latest research on unified field theories, quantum gravity and the like. But just as I started crafting this article and writing about locality, that article on an alleged simple way to replace field-theoretical calculations went viral. The principle of locality may not hold anymore when things get really interesting – in the regime of tiny local dimensions and high energy.