The Heat Source Paradox

It is not a paradox – it is a straightforward relation between a heat pump system’s key data:

The lower a heat pump’s performance factor is, the smaller the source can be built.

I would not have written this post had I not found a version of this statement, with a positive twist, used in an advert!

In this post I consider a heat pump a blackbox that converts input energy into output heat energy – it ‘multiplies’ energy by a performance factor. A traditional mechanical heat pump uses electrical input energy to drive a mechanical compressor. The uncommon Rotation Heat Pump utilizes the pressure gradient created by centrifugal forces and thus again by electrical power.

But a pressure difference can also be maintained by adsorption/desorption processes or by changing the amount of one fluid dissolved in another; Einstein’s famous refrigerator uses a more complex combination of such dissolution/evaporation processes. Evaporation or desorption can be directly driven by heat: A gas heat pump thus ‘multiplies’ the energy from burning natural gas (and in addition, a heat pump and a gas boiler can be combined in one unit).

The overall performance factor of a gas heat pump – kWh heating energy out over kWh gas in – is about 1.5 – 2. This is lower than the 4 – 5 available with mechanical compressors. But the assessment depends on the cost of a kWh of gas versus a kWh of electrical energy: If gas is four times cheaper (which is nearly the case in Germany), then burning natural gas in a traditional boiler without any ‘heat pump multiplication’ can be more economical than using a heat pump with an electrical compressor. If gas is ‘only’ two times as cheap, then a gas heat pump with an overall performance number of ‘only’ 2 will still match an electrical heat pump with a performance factor of 4.
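The price comparison can be checked with a few lines of Python. This is only a sketch: the electricity price is a hypothetical number in arbitrary currency units, and the function name `heat_cost_per_kwh` is made up for this illustration. The only relation used is that the cost of one kWh of heat is the price of the input energy divided by the performance factor.

```python
def heat_cost_per_kwh(input_price_per_kwh: float, performance_factor: float) -> float:
    """Cost of one kWh of heat: price of the input energy over the performance factor."""
    return input_price_per_kwh / performance_factor


# Hypothetical electricity price (arbitrary currency units per kWh), not from the post.
electricity_price = 0.28

for price_ratio in (4.0, 2.0):
    gas_price = electricity_price / price_ratio
    options = {
        "gas boiler (factor 1)": heat_cost_per_kwh(gas_price, 1.0),
        "gas heat pump (factor 2)": heat_cost_per_kwh(gas_price, 2.0),
        "electrical heat pump (factor 4)": heat_cost_per_kwh(electricity_price, 4.0),
    }
    print(f"electricity costs {price_ratio:.0f} times as much as gas:")
    for name, cost in options.items():
        print(f"  {name}: {cost:.4f} per kWh of heat")
```

With these round numbers the boiler and the electrical heat pump cost the same at a price ratio of 4, and the gas heat pump and the electrical heat pump cost the same at a ratio of 2, so any real comparison hinges on the actual prices and performance factors.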

While the gas heat pump may have its merits under certain market conditions, its performance number is low: For one kWh of gas you only get two kWh of heating energy. This  means you only need to provide one kWh of ‘ambient’ energy from your source – geothermal, water, or air. If the performance factor of an electrical heat pump is 4, you multiply each kWh of input energy by 4. But the heat source has to be able to supply the required 3 kWh. This is the whole ‘paradox’: The better the heat pump’s performance is in terms of heating energy over input energy, the more energy has to be released by a properly designed heat source, like ground loops sufficiently large, a ground-water well providing sufficient flow-rate, an air heat pump’s ventilator powerful enough, or our combination of a big enough solar/air collector plus water tank.
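The arithmetic behind the ‘paradox’ fits into one small function. A minimal sketch, assuming the black-box view used in this post (output heat equals input energy plus ambient energy); the function name and the heating demand of 12 kWh are hypothetical.

```python
def ambient_energy(heating_energy_kwh: float, performance_factor: float) -> float:
    """Energy the heat source has to deliver: output heat minus input energy."""
    if performance_factor < 1.0:
        raise ValueError("performance factor must be at least 1")
    input_energy = heating_energy_kwh / performance_factor
    return heating_energy_kwh - input_energy


heating_demand_kwh = 12.0  # hypothetical target output, fixed by the building
for factor in (1.0, 2.0, 4.0):
    needed = ambient_energy(heating_demand_kwh, factor)
    print(f"performance factor {factor:.0f}: source must supply {needed:.1f} kWh")
```

For a fixed heating demand the required ambient energy grows with the performance factor, from nothing at a factor of 1 to three quarters of the output at a factor of 4.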

Illustration of the ‘heat source paradox’: The lower the performance number (ratio of output and input energy), the lower is the required ambient energy that has to be provided by ‘the environment’. The output heating energy in red is the target number that has to be met – it is tied to the building’s design heat load.

If you wish to state it that way, a heat pump with inferior performance characteristics has the ‘advantage’ that the source can be smaller – fewer pipes to be buried in the ground or a smaller water tank. And in an advert for a gas heat pump I found it spelled out exactly in this way, as a pro argument compared to other heat pumps:

The heat source can be built much smaller – investment costs are lower!

It is not wrong, but it is highly misleading. It is like saying that heating electrically with a resistive heating element – and thus a performance number of 1 – is superior because you do not need to invest in building any source of ambient energy at all.

Entropy and Dimensions (Following Landau and Lifshitz)

Some time ago I wrote about volumes of spheres in multi-dimensional phase space – as needed in integrals in statistical mechanics.

The post was primarily about the curious fact that the ‘bulk of the volume’ of such spheres is contained in a thin shell beneath their hyperspherical surfaces. The trick to calculate something reasonable is to spot expressions you can Taylor-expand in the exponent.

Large numbers ‘do not get much bigger’ if multiplied by a factor, to be demonstrated again by Taylor-expanding such a large number in the exponent; I used this example:

Assuming N is about 10^{25}, its natural logarithm is about 58 and Ne^N = e^{\ln(N)+N} = e^{58+10^{25}} , so 58 can be neglected compared to N itself.
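The same estimate can be reproduced numerically, as long as you work with the exponent and never try to form e^N itself. A small sketch in Python; nothing here goes beyond the numbers quoted above.

```python
import math

N = 1e25
ln_N = math.log(N)           # about 57.6
log_of_product = ln_N + N    # ln(N * e^N), computed without forming e^N

print(f"ln(N) = {ln_N:.2f}")
print(f"relative size of the correction: {ln_N / N:.1e}")
# In double precision the correction is smaller than the rounding step of N:
print(log_of_product == N)
```

The last line prints `True`: adding 58 to 10^{25} does not even change the floating-point number.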

However, in the real world numbers associated with locations and momenta of particles come with units. Calling the unit ‘length’ in phase space R_0 the large volume can be written as aN{(\frac{r}{R_0})}^N = ae^{\ln{(N)} + N\ln{(\frac{r}{R_0})}} , and the impact of an additional factor N also depends on the unit length chosen.

I did not yet mention the related issues with the definition of entropy. In this post I will follow the way Landau and Lifshitz introduce entropy in Statistical Physics, Volume 5 of their Course of Theoretical Physics.

Landau and Lifshitz introduce statistical mechanics top-down, starting from fundamental principles and from Hamiltonian classical mechanics: no applications, no definitions of ‘heat’ and ‘work’, no historical references needed for motivation. Classical phenomenological thermodynamics is only introduced after they are done with the statistical foundations. Both entropy and temperature are defined – these are useful fundamental properties spotted in the mathematical derivations and thus deserve special names. They cover both classical and quantum statistics in a small number of pages – LL’s style has been called terse or elegant.

The behaviour of a system with a large number of particles is encoded in a probability distribution function in phase space, a density. In the classical case this is a continuous function of phase-space co-ordinates. In the quantum case you consider distinct states – whose energy levels are densely packed together though. Moving from classical to quantum statistics means to count those states rather than to integrate the smooth density function over a volume. There are equivalent states created by permutations of identical particles – but factoring in that is postponed and not required for a first definition of entropy. A quasi-classical description is sufficient: using a minimum cell in phase space, whose dimensions are defined by Planck’s constant h that has a dimension of action – length times momentum.

Entropy as statistical weight

Entropy S is defined as the logarithm of the statistical weight \Delta \Gamma – the number of quantum states associated with the part of phase space used by the (sub)system. (Landau and Lifshitz use the concept of a – still large – subsystem embedded in a larger volume most consistently, in order to avoid reliance on the ergodic hypothesis, as mentioned in the preface.) In the quasi-classical view the statistical weight is the volume in phase space occupied by the system divided by the size of the minimum unit cell defined by Planck’s constant h. Denoting momenta by p, positions by q, using \Delta p and \Delta q as a shortcut for products over multiple dimensions, equivalent to s degrees of freedom…

S = log \Delta \Gamma = log \frac {\Delta p \Delta q}{(2 \pi \hbar)^s}

An example from solid state physics: if the system is considered a rectangular box in the physical world, possible quantum states related to vibrations can be visualized in terms of possible standing waves that ‘fit’ into the box. The statistical weight would then single out that bunch of states the system actually ‘has’ / ‘uses’ / ‘occupies’ in the long run.

Different sorts of statistical functions are introduced, and one reason for writing this article is to emphasize the difference between them: The density function associates each point in phase space – each possible configuration of a system characterized by the momenta and locations of all particles – with a probability. These points are also called microstates. Taking into account the probabilities to find a system in any of these microstates gives you the so-called macrostate characterized by the statistical weight: How large or small a part of phase space the system will use when watched for a long time.

The canonical example is an ideal gas in a vessel: The most probable spatial distribution of particles is to find them spread out evenly, the most unlikely configuration is to have them concentrated in (nearly) the same location, like one corner of the box. The density function assigns probabilities to these configurations. As the even distribution is so much more likely, the \Delta q part of the statistical weight would cover all of the physical volume available. The statistical weight function has to attain its maximum value in the most likely case, in equilibrium.

The significance of energies – and why there are logarithms everywhere.

Different sufficiently large subsystems of one big system are statistically independent – as their properties are defined by their bulk volume rather than their surfaces interfacing with other subsystems – and the larger the volume, the larger the ratio of volume and surface. Thus the probability density function for the combined system – as a function of momenta and locations of all particles in the total phase space – has to be equal to the product of the densities for each subsystem. Denoting the classical density with \rho and adding a subscript for the set of momenta and positions referring to a subsystem:

\rho(q,p) = \rho_1(q_1,p_1) \rho_2(q_2,p_2)

(Since these are probability densities, the actual probability is always obtained by multiplying with the differential(s) dqdp).

This means that the logarithm of the composite density is equal to the sum of the logarithms of the individual densities. This is the root cause of having logarithms show up everywhere in statistical mechanics.

A mechanical system of particles is characterized by only 7 ‘meaningful’ additive integrals: Energy, momentum and angular momentum – they add up when you combine systems, in contrast to all the other millions of integration constants that would appear when solving the equations of motion exactly. Momentum and angular momentum are not that interesting thermodynamically, as one can change to a frame moving and rotating with the system (LL also cover rotating systems). So energy remains as the integral of outstanding importance.

From counting states to energy intervals

What we want is to relate entropy to energy, so assertions about numbers of states covered need to be translated to statements about energy and energy ranges.

LL denote the probability to find a system in (micro-)state n with energy E_n as w_n – the quantum equivalent of density \rho . The logarithm of w_n has to be a linear function of the energy E_n of this micro-state as per the additivity just mentioned above, so w_n depends on the state only via its energy, and thus LL omit the subscript n for w:

w_n = w(E_n)

(They omit every symbol they possibly can to keep their notation succinct ;-))

A thermodynamic system has an enormous number of (mechanical) degrees of freedom. Fluctuations are small as per the law of large numbers in statistics, so in thermal equilibrium the probability to find the system with a certain energy can be approximated by a sharp delta-function-like peak at the system’s energy E. The system occupies a very thin ‘layer’ of thickness \Delta E in phase space – around the hypersurface that characterizes its average energy E.

Statistical weight \Delta \Gamma can be considered the width of the related function: Energy-wise broadening of the macroscopic state \Delta E needs to be translated to a broadening related to the number of quantum states.

We change variables, so the connection between Γ and E is made via the derivative of Γ with respect to E. E is an integral, statistical property of the whole system, and the probability for the system to have energy E in equilibrium is W(E)dE . E is not discrete so this is again a  probability density. It is capital W now – in contrast to w_n which says something about the ‘population’ of each quantum state with energy E_n.

A quasi-continuous number of states per energy Γ is related to E by the differential:

d\Gamma = \frac{d\Gamma}{dE} dE.

As E peaks so sharply and the energy levels are packed so densely it is reasonable to use the function (small) w but calculate it for an argument value E. Capital W(E) is a probability density as a function of total energy, small w(E) is a function of discrete energies denoting states – so it has to be multiplied by the number of states in the range in question:

W(E)dE = w(E)d\Gamma

Thus…

W(E) = w(E)\frac{d\Gamma}{dE}.

The delta-function-like functions (of energy or states) have to be normalized, and the widths ΔΓ and ΔE multiplied by the respective heights W and w taken at the average energy E_\text{avg} have to be 1, respectively:

W(E_\text{avg}) \Delta E = 1
w(E_\text{avg}) \Delta \Gamma = 1

(… and the ‘average’ energy is what is simply called ‘the’ energy in classical thermodynamics).

So \Delta \Gamma is inversely proportional to the probability of the most likely state (of average energy). This can also be concluded from the quasi-classical definition: If you imagine a box full of particles, the least probable state is equivalent to all particles occupying a single cell in phase space. The probability for that is smaller than the probability to find the particles evenly distributed over the whole box by a factor of (size of the box) over (size of the unit cell) … and this ratio is exactly the definition of \Delta \Gamma.

The statistical weight is finally:

\Delta \Gamma =  \frac{d\Gamma(E_\text{avg})}{dE} \Delta E.

… the broadening in \Gamma , proportional to the broadening in E

The more familiar (?) definition of entropy

From that, you can recover another familiar definition of entropy, perhaps the more common one. Taking the logarithm…

S = log (\Delta \Gamma) = -log (w(E_\text{avg})).

As log w is linear in E, the averaging of E can be extended to the whole log function. Then the definition of ‘averaging over states n’ can be used: To multiply the value for each state n by probability w_n and sum up:

S = - \sum_{n} w_n log w_n.

… which is the first statistical expression for entropy I had once learned.
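A quick numerical check connects the two definitions. The sketch below assumes the simplest possible case, which is not taken from LL: all states inside the statistical weight are equally probable and all others are empty. The number of states is a hypothetical small number chosen so the sum can be evaluated directly.

```python
import math


def entropy(probabilities):
    """S = -sum_n w_n log w_n, with k set to 1; empty states contribute nothing."""
    return -sum(w * math.log(w) for w in probabilities if w > 0.0)


number_of_states = 1000  # plays the role of the statistical weight (hypothetical)
uniform = [1.0 / number_of_states] * number_of_states

print(entropy(uniform))             # sum over states
print(math.log(number_of_states))   # logarithm of the statistical weight
```

Both lines print the same value up to rounding: for equally probable states the sum over states reduces to the logarithm of their number.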

LL do not introduce Boltzmann’s constant k here

It is effectively set to 1 – so entropy is defined without a reference to k. k is only mentioned in passing later: in case one wishes to measure energy and temperature in different units. But there is no need to do so if you define entropy and temperature based on first principles.

Back to units

In a purely classical description based on the volume in phase space instead of the number of states there would be no cell of minimum size, and then instead of the statistical weight we would simply have this volume. But then entropy would be calculated in a very awkward unit, the logarithm of action. Every change of the unit for measuring volumes in phase space would result in an additive constant – the deeper reason why entropy in a classical context is only defined up to such a constant.

So the natural unit called R_0 above should actually be Planck’s constant taken to the power defined by the number of particles.

Temperature

The first task to be solved in statistical mechanics is to find a general way of formulating a proper density function small w_n as a function of energy E_n. You can either assume that the system has a clearly defined energy upfront – the system lives on an ‘energy-hyperplane in phase space’ – or you can consider it immersed in a larger system later identified with a ‘heat bath’ which causes the system to reach thermal equilibrium. These two concepts are called the micro-canonical and the canonical distribution (or Gibbs distribution), and the actual distribution functions don’t differ much because the energy peaks so sharply also in the canonical case. It’s that type of calculation where those hyperspheres are actually needed.

Temperature as a concept emerges from a closer look at these distributions, but LL introduce it upfront from simpler considerations: It is sufficient to know that 1) entropy only depends on energy, 2) both are additive functions of subsystems, and 3) entropy is a maximum in equilibrium. You divide one system into two subsystems. The total change in entropy has to be zero as this is a maximum (in equilibrium), and what energy dE_1 leaves one system has to be received as dE_2 by the other system, so dE_2 = -dE_1. Taking a look at the total entropy S as a function of the energy of one subsystem:

0 = \frac{dS}{dE_1} = \frac{dS_1}{dE_1} + \frac{dS_2}{dE_1} =
= \frac{dS_1}{dE_1} + \frac{dS_2}{dE_2} \frac{dE_2}{dE_1} =
= \frac{dS_1}{dE_1} - \frac{dS_2}{dE_2}

So \frac{dS_x}{dE_x} has to be the same for each subsystem x. Cutting one of the subsystems in two  you can use the same argument again. So there is one very interesting quantity that is the same for every subsystem – \frac{dS}{dE}. Let’s call it 1/T and let’s call T the temperature.
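The argument can be tried out numerically. The sketch below is not from LL: it assumes a toy entropy S = (3/2) N log E for each subsystem (the energy dependence of a monatomic ideal gas with k set to 1 and additive constants dropped), and the particle numbers and total energy are hypothetical. It scans all ways of splitting the total energy and picks the split with maximum total entropy.

```python
import math


def entropy_toy(energy, particles):
    """Toy entropy S = (3/2) N log E, additive constants dropped (assumption)."""
    return 1.5 * particles * math.log(energy)


def inverse_temperature(energy, particles):
    """dS/dE for the toy entropy above."""
    return 1.5 * particles / energy


def equilibrium_split(total_energy, n1, n2, steps=100_000):
    """Scan E_1 and return the value that maximizes S_1(E_1) + S_2(E - E_1)."""
    best_e1, best_s = None, -math.inf
    for i in range(1, steps):
        e1 = total_energy * i / steps
        s = entropy_toy(e1, n1) + entropy_toy(total_energy - e1, n2)
        if s > best_s:
            best_e1, best_s = e1, s
    return best_e1


total_energy, n1, n2 = 90.0, 100, 200  # hypothetical numbers
e1 = equilibrium_split(total_energy, n1, n2)
e2 = total_energy - e1
print(f"E_1 = {e1:.3f}, E_2 = {e2:.3f}")
print(f"dS_1/dE_1 = {inverse_temperature(e1, n1):.4f}")
print(f"dS_2/dE_2 = {inverse_temperature(e2, n2):.4f}")
```

At the maximum the energy is shared in proportion to the particle numbers, and both derivatives agree within the resolution of the scan: the two subsystems have the same 1/T.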

The Collector Size Paradox

Recently I presented the usual update of our system’s and measurement data documentation. The PDF document contains consolidated numbers for each year and month of operations:

Total output heating energy (incl. hot tap water), electrical input energy (incl. brine pump) and their ratio – the performance factor. Seasons always start on Sept. 1, except the first season, which started in Nov. 2011. For ‘special experiments’ that had an impact on the results see the text and the PDF linked above.

It is finally time to tackle the fundamental questions:

What is the impact of the size of the solar/air collector?

or

What is the typical output power of the collector?

In 2014 the Chief Engineer had rebuilt the collector so that you can toggle between 12 m² and 24 m²:

TOP: Full collector – hydraulics as in seasons 2012, 2013. Active again since Sept. 2017. BOTTOM: Half of the collector, used in seasons 2014, 15, and 16.

Do we have data for seasons we can compare in a reasonable way – seasons that (mainly) differ by collector area?

We disregard seasons 2014 and 2016 – we had to get rid of a nearly 100-year-old roof truss and only heated the ground floor with the heat pump.

Attic rebuild project – point of maximum destruction – generation of fuel for the wood stove.

Season 2014 was atypical anyway because of the Ice Storage Challenge experiment.

Then seasonal heating energy should be comparable – so we don’t consider the cold seasons 2012 and 2016.

Remaining warm seasons: 2013 – where the full collector was used – and 2015 (half collector). The whole house was heated with the heat pump; heating energies and ambient energies were similar – and performance factors were basically identical. So we checked the numbers for the ice months Dec/Jan/Feb. Here a difference can be spotted, but it is far less dramatic than expected. For half the collector:

  • Collector harvest is about 10% lower
  • Performance factor is lower by about 0.2
  • Brine inlet temperature for the heat pump is about 1.5 K lower

The upper half of the collector is used, as indicated by hoarfrost.

It was counter-intuitive, and I scrutinized Data Kraken to check it for bugs.

But actually we forgot that we had predicted that years ago: Simulations show the trend correctly, and it suffices to do some basic theoretical calculations. You only need to know how to represent a heat exchanger’s power in two different ways:

Power is either determined by the temperature of the fluid when it enters and exits the exchanger tubes …

[1]   (T_brine_outlet – T_brine_inlet) * flow_rate * specific_heat

… but power can also be calculated from the heat energy flow from brine to air – over the surface area of the tubes:

[2]   delta_T_brine_air * Exchange_area * some_coefficient

Delta T is an average over the whole exchanger length (actually a logarithmic average, but using an arithmetic average is good enough for typical parameters). Some_coefficient is a parameter that characterizes heat transfer per area or per length of a tube, so Exchange_area * Some_coefficient could also be called the total heat transfer coefficient.

If several heat exchangers are connected in series their powers are not independent as they share common temperatures of the fluid at the intersection points:

The brine circuit connecting heat pump, collector and the underground water/ice storage tank. The three ‘interesting’ temperatures before/after the heat pump, collector and tank can be calculated from the current power of the heat pump, ambient air temperature, and tank temperature.

When the heat pump is off in ‘collector regeneration mode’ the collector and the heat exchanger in the tank necessarily transfer heat at the same power  per equation [1] – as one’s brine inlet temperature is the other one’s outlet temperature, the flow rate is the same, and also specific heat (whose temperature dependence can be ignored).

But powers can also be expressed by [2]: Each exchanger has a different area, a different heat transfer coefficient, and different mean temperature difference to the ambient medium.

So there are three equations…

  • Power for each exchanger as defined by [1]
  • 2 equations of type [2], one with specific parameters for collector and air, the other for the heat exchanger in the tank.

… and from those the three unknowns can be calculated: brine inlet temperature, brine outlet temperature, and harvesting power. All is simple and linear, so it is not a big surprise that collector harvesting power is proportional to the temperature difference between air and tank. The warmer the air, the more you harvest.

The combination of coefficient factors is the ratio of the product of total coefficients and their sum, like: \frac{f_1 * f_2}{f_1 + f_2} – the inverse of the sum of inverses.
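The three equations can be solved in a few lines. This is a sketch under the assumptions stated in the text (heat pump off, arithmetic mean for the brine temperature, so both exchangers see the same mean brine temperature); the function name and all numbers are hypothetical and not our system’s parameters.

```python
def series_exchangers(t_air, t_tank, f_collector, f_tank, flow_capacity):
    """Solve the three linear equations for two exchangers in series.

    f_collector, f_tank: total heat transfer coefficients in W/K
    flow_capacity: flow rate times specific heat in W/K
    Returns (power in W, brine temperature entering the collector,
             brine temperature leaving the collector).
    """
    f_combined = f_collector * f_tank / (f_collector + f_tank)
    power = f_combined * (t_air - t_tank)
    t_mean = t_air - power / f_collector        # equation [2] for the collector
    half_spread = power / flow_capacity / 2.0   # equation [1]
    return power, t_mean - half_spread, t_mean + half_spread


# Hypothetical values: air at 10 °C, tank at 0 °C, coefficients in W/K.
full, _, _ = series_exchangers(10.0, 0.0, 400.0, 400.0, 1000.0)
half, _, _ = series_exchangers(10.0, 0.0, 200.0, 400.0, 1000.0)
print(f"full collector: {full:.0f} W, half collector: {half:.0f} W")
print(f"ratio: {half / full:.2f}")
```

With two equally good exchangers, halving the collector’s coefficient reduces the power to two thirds, not to one half. The better the collector is compared to the tank exchanger, the closer this ratio gets to 1.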

This formula shows what you might have guessed intuitively: If one of the factors is much bigger than the other – if one of the heat exchangers is already much ‘better’ than the other – then it does not help to make the better one even better. In the denominator, the smaller number in the sum can be neglected before and after optimization, the superior properties always cancel out, and the ‘bad’ component fully determines performance. (If one of the ‘factors’ is zero, total power is zero.) Examples for ‘bad’ exchangers: If the heat exchanger tubes in the tank are much too short or if a flat plate collector is used instead of an unglazed collector.

On the other hand, if you make a formerly ‘worse’ exchanger much better, the ratio will change significantly. If both exchangers have properties of the same order of magnitude – which is what we design our systems for – optimizing one will change things for the better, but never linearly, as effects always cancel out to some extent (you increase numbers in both parts of the fraction).

So there is no ‘rated performance’ in kW or kW per area you could attach to a collector. Its effective performance also depends on the properties of the heat exchanger in the tank.

But there is a subtle consequence to consider: The smaller collector can deliver the same energy and thus ‘has’ twice the power per area. However, air temperature is given, and [2] must hold: In order to achieve this, the delta T between brine and air necessarily has to increase. So brine will be a bit colder and thus the heat pump’s Coefficient of Performance will be a bit lower. Over a full season including the warm periods of heating hot water only the effect is less pronounced – but we see a more significant change in performance data and brine inlet temperature for the ice months in the respective seasons.
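Equation [2] makes the size of this effect easy to estimate. A minimal sketch with hypothetical numbers (power, coefficient per area); it holds the power fixed, which overstates the effect compared to the coupled system where the harvest also drops a little.

```python
def brine_air_delta_t(power_w, area_m2, coefficient_w_per_m2_k):
    """Mean temperature difference required by [2] to transfer power_w."""
    return power_w / (area_m2 * coefficient_w_per_m2_k)


power_w = 2400.0        # hypothetical harvesting power
coefficient = 20.0      # hypothetical heat transfer coefficient in W/(m² K)

for area in (24.0, 12.0):
    delta_t = brine_air_delta_t(power_w, area, coefficient)
    print(f"{area:.0f} m²: brine is {delta_t:.1f} K colder than air on average")
```

Half the area at the same power means twice the temperature difference, so the brine has to run colder by exactly the amount the full collector needed in the first place.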

Tinkering, Science, and (Not) Sharing It

I stumbled upon this research paper called PVC polyhedra:

We describe how to construct a dodecahedron, tetrahedron, cube, and octahedron out of pvc pipes using standard fittings.

In particular, if we take a connector that takes three pipes each at 120 degree angles from the others (this is called a “true wye”) and we take elbows of the appropriate angle, we can make the edges come together below the center at exactly the correct angles.

A pivotal moment: What you consider tinkering is actually research-paper-worthy science. Here are some images from the Chief Engineer’s workbench.

The supporting constructions of our heat exchangers are built from standard parts connected at various angles:

The final result can be a cuboid for holding meandering tubes:

… or cascaded prisms with n-gon basis – for holding spirals of flexible tubes:

The implementation of this design is documented here (a German post whose charm would be lost in translation unless I wanted to create Internet Poetry).

But I also started up my time machine – in order to find traces of my polyhedra research in the early 1980s. From photos and drawings of the three-dimensional crystals in mineralogy books I figured out how to draw two-dimensional maps of maximally connected surface areas. I cut out the map, and glued together the remaining free edges. Today I would be made redundant by Origami AI.

I filled several shelves with polyhedra of increasing number of faces, starting with a tetrahedron and culminating with this rhombicosidodecahedron. If I recall correctly, I cheated a bit with this one and created some of the pyramids as completely separate items.

I think this was a rather standard hobby for the typical nerdy child, among things like growing crystals from solutions of toxic chemicals, building a makeshift rotatable telescope tripod from scraps, or verifying the laws of optics using prisms and lenses from ancient dismantled devices.

The actually interesting thing is that this photo is the only trace of any of these hobbies. In many years after creating this stuff – and destroying it again – I never thought about documenting it. Until today. It seems we weren’t into sharing in those days.

Heat Transport: What I Wrote So Far.

Don’t worry, The Subversive Elkement will publish the usual silly summer posting soon! Now I am just tying up loose ends.

In the next months I will keep writing about heat transport: Detailed simulations versus maverick’s rules of thumb, numerical solutions versus insights from the few things you can solve analytically, and applications to our heat pump system.

So I checked what I have already written – and I discovered a series which does not show up as such in various lists, tags, categories:

[2014-12-14] Cistern-Based Heat Pump – Research Done in 1993 in Iowa. Pioneering work, but the authors dismissed a solar collector for economic reasons. They used a steady-state estimate of the heat flow from ground to the tank, and did not test the setup in winter.

Cistern-Based Water-Source Heat Pump System Design, 1993

[2015-01-28] More Ice? Exploring Spacetime of Climate and Weather. A simplified simulation based on historical weather data – only using daily averages. Focus: Estimate of the maximum volume of ice per season, demonstration of yearly variations. As explained later (2017) in more detail I had to use information from detailed simulations though – to calculate the energy harvested by the collector correctly in such a simple model.

Simple simulations of volume of ice

[2015-04-01] Ice Storage Challenge: High Score! Our heat pump created an ice cube of about 15 m³ because we had turned the collector off. About 10 m³ of water remained unfrozen, most likely when / because the ice cube touched ground. Some qualitative discussions of heat transport phenomena involved and of relevant thermal parameters.

Ice formation during the 'ice storage challenge'

[2016-01-07] How Does It Work? (The Heat Pump System, That Is) Our system, in a slide-show of operating statuses throughout a typical year. For each period typical temperatures are given and the ‘typical’ direction of heat flow.

System in September - typical operations conditions

[2016-01-22] Temperature Waves and Geothermal Energy. ‘Geothermal’ energy used by heat pumps is mainly stored solar energy. A simple model: The temperature at the surface of the earth varies sinusoidally throughout the year – this is the boundary condition for the heat equation. This differential equation links the temporal change of temperature to its spatial variation. I solve the equation, show some figures, and check how results compare to the thermal diffusivity of ground obtained from measurements.

Measured 'wave' and propagation time

[2016-03-01] Rowboats, Laser Pulses, and Heat Energy (Boring Title: Dimensional Analysis). Re-visiting heat transport and heat diffusion length, this time only analyzing dimensional relationships. By looking at the heat equation (without the need to solve it) a characteristic length can be calculated: ‘How far does heat get in a certain time?’

Temperature waves in ground - attenuation length of about 10 meters

[2017-02-05] Earth, Air, Water, and Ice. Data analysis of the heating season 2014/15 (when we turned off the solar/air collector to simulate a harsher winter) – and an attempt to show energy storages, heat exchangers, and heat flows in one schematic. From the net energy ‘in the tank’ the contribution of ground can be calculated.

Energy storage, heat exchangers, heat flow

[2017-02-22] Ice Storage Hierarchy of Needs. Continued from the previous post – bird’s eye view: How much energy comes from which sources, and which input parameters are critical? I try to answer when and if simple energy accounting makes sense in comparison to detailed simulations.

Hierarchy of needs - ambient energy in ice months

[2017-05-02] Simulating Peak Ice. I compare measurements of the level in the tank with simulations of the evolution of the volume of ice. Simulations (1-minute intervals) comprise a model of the control logic, the varying performance factor of the heat pump, heat transport in ground, energy balances for the hot and cold tanks, and the heat exchangers connected in series.

Simulations of brine and tank temperature and volume of ice, based on system state in 1-minute intervals.

(Adding the following after having published this post. However, there is no guarantee I will update this post forever ;-))

[2017-08-17] Simulations: Levels of Consciousness. Bird’s Eye View: How does simulating heat transport fit into my big picture of simulating the heat pump system or buildings or heating systems in general? I consider it part of the ‘physics’ layer of a hierarchy of levels.

Simulation - levels of consciousness

Planned episodes? Later this year (2017) or next year I might write about the error made when considering simplified geometry – like modeling a linear 1D flow when the actual symmetry is e.g. spherical.

Spheres in a Space with Trillions of Dimensions

I don’t venture into speculative science writing – this is just about classical statistical mechanics; actually about a special mathematical aspect. It was one of the things I found particularly intriguing in my first encounters with statistical mechanics and thermodynamics a long time ago – a curious feature of volumes.

I was mulling over how to ‘briefly motivate’ the calculation below in a comprehensible way, a task I might have failed at years ago already, when I tried to use illustrations and metaphors (Here and here). When introducing the ‘kinetic theory’ in thermodynamics often the pressure of an ideal gas is calculated first, by considering averages over momenta transferred from particles hitting the wall of a container. This is rather easy to understand but still sort of an intermediate view – between phenomenological thermodynamics that does not explain the microscopic origin of properties like energy, and ‘true’ statistical mechanics. The latter makes use of a phase space whose number of dimensions is proportional to the number of particles. One cubic meter of gas contains ~10^{25} molecules. Each possible state of the system is depicted as a point in this abstract, so-called phase space. For each (point-like) particle 6 numbers are added to a gigantic vector – 3 for its position and 3 for its momentum (mass times velocity), so the space has ~6 x 10^{25} dimensions. Thermodynamic properties are averages taken over the state of one system watched for a long time or over a lot of ‘comparable’ systems starting from different initial conditions. At the heart of statistical mechanics are distribution functions that describe how a set of systems described by such gigantic vectors evolves. This function is like a density of an incompressible fluid in hydrodynamics. I resorted to using the metaphor of a jelly in hyperspace before.

Taking averages means multiplying the ‘mechanical’ property by the density function and integrating it over the space where these functions live. The volume of interest is a generalized N-ball, defined as the volume within a generalized sphere. A ‘sphere’ is the surface of all points at a certain distance (‘radius’ R) from an origin

x_1^2 + x_2^2 + ... + x_N^2 = R^2

(x_n being the co-ordinates in phase space and assuming that all co-ordinates of the origin are zero). Why a sphere? Because states are ordered or defined by energy, and larger energy means a greater ‘radius’ in phase space. It’s all about rounded surfaces enclosing each other. The simplest example for this is the ellipse of the phase diagram of the harmonic oscillator – more energy means a larger amplitude and a larger maximum velocity.
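
As a sketch of that simplest example: for a harmonic oscillator with mass m and angular frequency \omega (symbols assumed here for illustration, they are not used elsewhere in this post), all states of a fixed energy E satisfy

\frac{p^2}{2m} + \frac{m\omega^2x^2}{2} = E

This is an ellipse in the (x, p) plane with the semi-axes

x_{max} = \sqrt{\frac{2E}{m\omega^2}}, \quad p_{max} = \sqrt{2mE}

Both semi-axes grow with the square root of E, so the curve of a larger energy encloses the curves of all smaller energies.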

And here is finally the curious fact I actually want to talk about: Nearly all the volume of an N-ball with so many dimensions is concentrated in an extremely thin shell beneath its surface. Then an integral over a thin shell can be extended over the full volume of the sphere without adding much, while making integration simpler.

This can be seen immediately from plotting the volume of a sphere over radius: The volume of an N-ball is always equal to some numerical factor, times the radius to the power of the number of dimensions. In three dimensions the volume is the traditional, honest volume proportional to r^3, in two dimensions the ‘ball’ is a circle, and its ‘volume’ is its area. In a realistic thermodynamic system, the volume is then proportional to r^N with a very large N.

The power function r^N turns more and more into an L-shaped function with increasing exponent N. The volume increases enormously just by adding a small additional layer to the ball. In order to compare the function for different exponents, both ‘radius’ and ‘volume’ are shown in relation to the respective maximum value, R and R^N.
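
The L-shape can also be checked with a few numbers. The following is a minimal sketch, not part of the original calculation: the function name `shell_fraction` and the choice of a shell that is 1% of the radius are assumptions made for illustration. It only uses the fact that the volume is proportional to the radius to the power of N.

```python
def shell_fraction(n_dims: int, rel_thickness: float) -> float:
    """Fraction of the ball's volume within rel_thickness * R of the surface."""
    # Volume scales as r**n_dims, so the inner ball of radius
    # (1 - rel_thickness) * R holds (1 - rel_thickness)**n_dims of the total.
    return 1.0 - (1.0 - rel_thickness) ** n_dims


# Outer shell of 1% of the radius (an arbitrary illustrative choice).
for n_dims in (2, 3, 10, 100, 1000):
    print(n_dims, shell_fraction(n_dims, 0.01))
```

For a circle or an ordinary ball the outer 1% holds only a few percent of the volume; for 1000 dimensions it already holds nearly all of it.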

The interesting layer ‘with all the volume’ is certainly much smaller than the radius R, but of course it must not be too small to contain something. How thick the substantial shell has to be can be found by investigating the volume in more detail – using a ‘trick’ that is needed often in statistical mechanics: Taylor expanding in the exponent.

A function can be replaced by its tangent if it is sufficiently ‘straight’ at this point. Mathematically it means: If dx is added to the argument x, then the function at the new target is f(x + dx), which can be approximated by f(x) + [the slope df/dx] * dx. The next, higher-order term would be proportional to the curvature, the second derivative – then the function is replaced by a 2nd order polynomial. Joseph Nebus has recently published a more comprehensible and detailed post about how this works.

So the first terms of this so-called Taylor expansion are:

f(x + dx) = f(x) + dx{\frac{df}{dx}} + {\frac{dx^2}{2}}{\frac{d^2f}{dx^2}} + ...

If dx is small, higher-order terms can be neglected.

In the curious case of the ball in hyperspace we are interested in the ‘remaining volume’ V(r – dr). This should be small compared to V(r) = ar^N (a being the uninteresting constant numerical factor) after we remove a layer of thickness dr with the substantial ‘bulk of the volume’.

However, trying to expand the volume V(r – dr) = a(r – dr)^N, we get:

V(r - dr) = V(r) - adrNr^{N-1} + a{\frac{dr^2}{2}}N(N-1)r^{N-2} + ...
= ar^N(1 - N{\frac{dr}{r}} + {\frac{N(N-1)}{2}}({\frac{dr}{r}})^2) + ...

But this is not exactly what we want: In the end it is not an expansion, a polynomial, in the (small) ratio dr/r, but in N dr/r, and N is enormous.
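
How badly the truncated polynomial fails once N dr/r is not small can be seen in a small numerical sketch. The function names and the sample values (N = 1000, dr/r = 0.01, so N dr/r = 10) are assumptions chosen for illustration only; both functions return the ratio V(r – dr)/V(r).

```python
def exact_ratio(n_dims: int, x: float) -> float:
    """Exact V(r - dr) / V(r) for x = dr / r."""
    return (1.0 - x) ** n_dims


def truncated_ratio(n_dims: int, x: float) -> float:
    """The same ratio from the expansion above, cut off after the 2nd-order term."""
    return 1.0 - n_dims * x + 0.5 * n_dims * (n_dims - 1) * x ** 2


# Small N * x: the polynomial is a fine approximation.
print(exact_ratio(3, 0.01), truncated_ratio(3, 0.01))

# N * x = 10: the polynomial is useless, although x itself is small.
print(exact_ratio(1000, 0.01), truncated_ratio(1000, 0.01))
```

In the second case the exact ratio is a tiny positive number, while the truncated polynomial claims that the remaining volume is much larger than the original one.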

So here’s the trick: 1) Apply the definition of the natural logarithm ln:

V(r - dr) = ae^{N\ln(r - dr)} = ae^{N\ln(r(1 - {\frac{dr}{r}}))}
= ae^{N(\ln(r) + \ln(1 - {\frac{dr}{r}}))}
= ar^Ne^{N\ln(1 - {\frac{dr}{r}})} = V(r)e^{N\ln(1 - {\frac{dr}{r}})}

2) Spot a function that can be safely expanded in the exponent: The natural logarithm of 1 minus something small, dr/r. So we can expand near 1: The derivative of ln(x) is 1/x (thus equal to 1/1 at x=1) and ln(1) = 0. So ln(1 – x) is about -x for small x:

V(r - dr) \simeq V(r)e^{N(0 - {\frac{dr}{r}})} = V(r)e^{-N{\frac{dr}{r}}}

3) Re-arrange fractions …

V(r - dr) = V(r)e^{-\frac{dr}{(\frac{r}{N})}}

This is now the remaining volume, after the thin layer dr has been removed. It is small in comparison with V(r) if the exponential function is small, thus if {\frac{dr}{(\frac{r}{N})}} is large or if:

dr \gg \frac{r}{N}

Summarizing: The volume of the N-dimensional hyperball is contained mainly in a shell dr below the surface if the following inequalities hold:

{\frac{r}{N}} \ll dr \ll r

The second one is needed to state that the shell is thin, and to allow for the expansion in the exponent. The first one is needed to make the shell thick enough so that it contains something.
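
The first inequality can be tried out with the number of dimensions quoted above. This is only a sketch: the function name `remaining_fraction` and the tested shell thicknesses (multiples of r/N) are assumptions for illustration. The ratio is evaluated through the logarithm, as in the trick above, because in ordinary floating-point arithmetic 1 – dr/r would simply be rounded to 1.

```python
import math


def remaining_fraction(n_dims: float, rel_thickness: float) -> float:
    """V(r - dr) / V(r) for rel_thickness = dr / r, evaluated via the logarithm."""
    # log1p(-x) returns ln(1 - x) accurately even if x is far below
    # the floating-point resolution near 1.
    return math.exp(n_dims * math.log1p(-rel_thickness))


n_dims = 6e25  # about the number of phase-space dimensions for 1 cubic meter of gas

# Shell thickness dr expressed as a multiple of r / N.
for multiple in (0.1, 1.0, 10.0, 100.0):
    rel_thickness = multiple / n_dims
    print(multiple, remaining_fraction(n_dims, rel_thickness))
```

A shell of thickness 0.1 r/N leaves most of the volume inside, while a shell of 100 r/N, still an absurdly small fraction of r, leaves practically nothing.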

This might help to ‘visualize’ a closely related non-intuitive fact about large numbers, like e^N: If you multiply such a number by a factor ‘it does not get that much bigger’ in a sense – even if the factor is itself a large number:

Assuming N is about 10^25 then its natural logarithm is about 58 and…

Ne^N = e^{\ln(N)+N} = e^{58+10^{25}}

… 58 can be neglected compared to N itself. So a multiplicative factor becomes something to be neglected in a sum!
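
A quick numerical sketch of this comparison, working with the exponents only, as e^N itself is far too large for floating-point numbers. The variable names are made up for this illustration.

```python
import math

n = 1e25
log_n = math.log(n)         # natural logarithm of N, the 'about 58' from above
exponent = log_n + n        # exponent of N * e**N
relative_share = log_n / n  # how much ln(N) contributes relative to N

print(log_n, relative_share, exponent == n)
```

In double precision the sum ln(N) + N is not even distinguishable from N: The comparison in the last line prints True.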

I used a plain number – base e – deliberately as I am obsessed with units. ‘r’ in phase space would be associated with a unit incorporating lots of lengths and momenta. Note that I use the term ‘dimensions’ in two slightly different, but related ways here: One is the mathematical dimension of (an abstract) space, the other is about cross-checking the physical units in case a ‘number’ is something that can be measured – like meters. The co-ordinate numbers in the vector refer to measurable physical quantities. Applying the definition of the logarithm just to r^N would result in a dimensionless number N side-by-side with something that has dimensions of a logarithm of the unit.

Using r – a number with dimensions of length – as base, it has to be expressed as a plain number, a multiple of the unit length R_0 (like ‘1 meter’). So comparing the original volume of the ball a{(\frac{r}{R_0})}^N to one a factor of N bigger …

aN{(\frac{r}{R_0})}^N = ae^{\ln{(N)} + N\ln{(\frac{r}{R_0})}}

… then ln(N) can be neglected as long as \frac{r}{R_0} is not extreeeemely tiny. Using the same argument as for base e above, we are on the safe side (and can neglect factors) if r is of about the same order of magnitude as the ‘unit length’ R_0 . The argument about negligible factors is an argument about plain numbers – and those ‘don’t exist’ in the real world as one could always decide to measure the ‘radius’ in units of, say, 10^-30 ‘meters’, which would make the original absolute number small and thus the additional factor non-negligible. One might save the argument by saying that we would always use units that sort of match the typical dimensions (size) of a system.

Saying everything in another way: If the volume of a hyperball ~r^N is multiplied by a factor, this corresponds to multiplying the radius r by a factor very, very close to 1 – the Nth root of the factor for the volume. Only because the number of dimensions is so large, the volume is increased so much by such a small increase in radius.
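
How close to 1 the Nth root is can be estimated with a short sketch. The function name `radius_growth` is an assumption made for this illustration; it returns the relative increase of the radius, i.e. the Nth root of the volume factor minus 1.

```python
import math


def radius_growth(volume_factor: float, n_dims: float) -> float:
    """Relative radius increase that multiplies the ball's volume by volume_factor."""
    # volume_factor**(1 / n_dims) - 1, computed via expm1 so that the tiny
    # difference from 1 is not lost to rounding.
    return math.expm1(math.log(volume_factor) / n_dims)


# Ordinary 3D ball: a volume 8 times bigger needs twice the radius.
print(radius_growth(8.0, 3))

# N = 10**25 dimensions, volume multiplied by N itself.
print(radius_growth(1e25, 1e25))
```

In three dimensions the radius has to double to make the volume 8 times bigger. With 10^25 dimensions, a volume factor of 10^25 requires a relative radius increase of only ln(N)/N, a number of the order of 10^-24.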

As the ‘bulk of the volume’ is contained in a thin shell, the total volume is about the product of the surface area and the thickness of the shell dr. The N-ball is bounded by a ‘sphere’ with one dimension less than the ball. Increasing the volume by a factor means that the surface area and/or the thickness have to be increased by factors so that the product of these factors yields the volume increase factor. dr scales with r, and thus does not change much – the two inequalities derived above do still hold. Most of the volume factor ‘goes into’ the factor for increasing the surface. ‘The surface becomes the volume’.

This was long-winded. My excuse: Also Richard Feynman took great pleasure in explaining the same phenomenon in different ways. In his lectures you can hear him speak to himself when he says something along the lines of: Now let’s see if we really understood this – let’s try to derive it in another way…

And above all, he says (in a lecture that is more about math than about physics):

Now you may ask, “What is mathematics doing in a physics lecture?” We have several possible excuses: first, of course, mathematics is an important tool, but that would only excuse us for giving the formula in two minutes. On the other hand, in theoretical physics we discover that all our laws can be written in mathematical form; and that this has a certain simplicity and beauty about it. So, ultimately, in order to understand nature it may be necessary to have a deeper understanding of mathematical relationships. But the real reason is that the subject is enjoyable, and although we humans cut nature up in different ways, and we have different courses in different departments, such compartmentalization is really artificial, and we should take our intellectual pleasures where we find them.

___________________________________

Further reading / sources: Any theoretical physics textbook on classical thermodynamics / statistical mechanics. I am just re-reading mine.

You Never Know

… when obscure knowledge comes in handy!

You can dismantle an old gutter without effort, and without any special tools:

Just by gently setting it into a twisting motion, effectively applying ~1 Hz torsion waves that lead to fatigue fracture within a few minutes.

I knew my stint in steel research in the 1990s would finally be good for something.

If you want to create a meme from this and tag it with Work Smart Not Harder, don’t forget to give me proper credit.