Re-Visiting Carnot’s Theorem

The proof by contradiction used in physics textbooks is one of those arguments that first appear surprising, then self-evident, then deceptive in their simplicity. You – or maybe only I – cannot resist turning it over and over in your head, viewing it from different angles.

tl;dr: I just wanted to introduce the time-honored tradition of ASCII text art images to illustrate Carnot’s Theorem, but this post got out of hand when I mulled over how to refute an erroneous counter-argument. As there are still research papers being written about Carnot’s efficiency, I feel vindicated for writing a really long post though.

Carnot’s arguments prove that there is a maximum efficiency of a thermodynamic heat engine – a machine that turns heat into mechanical energy. He gives the maximum value by evaluating one specific, idealized process, and then proves that a machine with higher efficiency would give rise to a paradox. In a 4-step process, the engine uses part of the heat available in a large, hot reservoir and turns it into mechanical work and waste heat – the latter dumped to a colder ‘environment’. (Note that while our modern reformulation of the proof by contradiction refers to the Second Law of Thermodynamics, Carnot’s initial version was based on the caloric theory.)

The efficiency of such an engine η – mechanical energy per cycle over input heat energy – only depends on the two temperatures (More details and references here):

$\eta_\text{carnot} = \frac {T_1-T_2}{T_1}$

These are absolute temperatures in Kelvin; this universal efficiency can be used to define what we mean by absolute temperature.

I am going to use ‘nice’ numbers. To make ηcarnot equal to 1/2, the hot reservoir has T1 = 273°C = 546 K, and the colder ‘environment’ has T2 = 0°C = 273 K.

If this machine is run in reverse, it uses mechanical input energy to ‘pump’ energy from the cold environment to the hot reservoir – it is a heat pump using the ambient reservoir as a heat source. The Coefficient of Performance (COP, ε) of the heat pump is heat output over mechanical input, the inverse of the efficiency of the corresponding engine. εcarnot is 2 for the temperatures given above.
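
As a quick check of these numbers, here is a minimal Python sketch. The function names `carnot_efficiency` and `carnot_cop` are my own choice for illustration, and both temperatures are assumed to be absolute (Kelvin).

```python
def carnot_efficiency(t_hot: float, t_cold: float) -> float:
    """Maximum efficiency of an engine between two reservoirs (temperatures in K)."""
    if not 0 < t_cold < t_hot:
        raise ValueError("need 0 < t_cold < t_hot, both in Kelvin")
    return (t_hot - t_cold) / t_hot


def carnot_cop(t_hot: float, t_cold: float) -> float:
    """COP of the same ideal machine run in reverse as a heat pump."""
    return 1.0 / carnot_efficiency(t_hot, t_cold)


print(carnot_efficiency(546.0, 273.0))  # 0.5
print(carnot_cop(546.0, 273.0))         # 2.0
```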

If we combine two such perfect machines – an engine and a heat pump, both connected to the hot space and to the cold environment – their effects cancel out: The mechanical energy released by the engine drives the heat pump which ‘pumps back’ the same amount of energy.

In the ASCII images energies are translated to arrows, and the number of parallel arrows indicates the amount of energy per cycle (or power). For each device, the number of arrows flowing in and out is the same; energy is always conserved. I am viewing this from the heat pump’s perspective, so I call the cold environment the source and the hot environment the room.

Neither of the heat reservoirs is heated or cooled in this ideal case, as the same amount of energy flows from and to each of the heat reservoirs:

|----------------------------------------------------------|
|         Hot room at temperature T_1 = 273°C = 546 K      |
|----------------------------------------------------------|
| | | |                         | | | |
v v v v                         ^ ^ ^ ^
| | | |                         | | | |
|------------|                 |---------------|
|   Engine   |->->->->->->->->-|   Heat pump   |
|  Eta = 1/2 |->->->->->->->->-| COP=2 Eta=1/2 |
|------------|                 |---------------|
| |                             | |
v v                             ^ ^
| |                             | |
|----------------------------------------------------------|
|        Cold source at temperature T_2 = 0°C = 273 K      |
|----------------------------------------------------------|

If either of the two machines works less than perfectly in tandem with a perfect machine, everything is still fine:

If the engine is far less than perfect and has an efficiency of only 1/4 – while the heat pump still works perfectly – more of the engine’s heat energy input is now converted to waste heat and diverted to the environment:

|----------------------------------------------------------|
|         Hot room at temperature T_1 = 273°C = 546 K      |
|----------------------------------------------------------|
| | | |                           | |
v v v v                           ^ ^
| | | |                           | |
|------------|                 |---------------|
|   Engine   |->->->->->->->->-|   Heat pump   |
|  Eta = 1/4 |                 | COP=2 Eta=1/2 |
|------------|                 |---------------|
| | |                             |
v v v                             ^
| | |                             |
|----------------------------------------------------------|
|        Cold source at temperature T_2 = 0°C = 273 K      |
|----------------------------------------------------------|

Now two net units of energy flow from the hot room to the environment (summing up the arrows to and from the devices):

|----------------------------------------------------------|
|         Hot room at temperature T_1 = 273°C = 546 K      |
|----------------------------------------------------------|
| |
v v
| |
|------------------|
|   Combination:   |
| Eta=1/4; COP=2   |
|------------------|
| |
v v
| |
|----------------------------------------------------------|
|        Cold source at temperature T_2 = 0°C = 273 K      |
|----------------------------------------------------------|

Using a real-life heat pump with a COP of 3/2 (< 2) together with a perfect engine …

|----------------------------------------------------------|
|         Hot room at temperature T_1 = 273°C = 546 K      |
|----------------------------------------------------------|
| | | |                             | | |
v v v v                             ^ ^ ^
| | | |                             | | |
|------------|                 |-----------------|
|   Engine   |->->->->->->->->-|    Heat pump    |
|  Eta = 1/2 |->->->->->->->->-|     COP=3/2     |
|------------|                 |-----------------|
| |                                 |
v v                                 ^
| |                                 |
|----------------------------------------------------------|
|        Cold source at temperature T_2 = 0°C = 273 K      |
|----------------------------------------------------------|

… again causes a non-paradoxical net flow of one unit of energy from the room to the environment.

In the most extreme case, a poor heat pump (not worth this name) with a COP of 1 just translates mechanical energy into heat energy 1:1. This is a resistive heating element, a heating rod, and net heat fortunately flows from hot to cold without paradoxes:

|----------------------------------------------------------|
|         Hot room at temperature T_1 = 273°C = 546 K      |
|----------------------------------------------------------|
| |                                |
v v                                ^
| |                                |
|------------|                 |-----------------|
|   Engine   |->->->->->->->->-|   'Heat pump'   |
|  Eta = 1/2 |                 |     COP = 1     |
|------------|                 |-----------------|
|
v
|
|----------------------------------------------------------|
|        Cold source at temperature T_2 = 0°C = 273 K      |
|----------------------------------------------------------|

The textbook paradox is encountered when an ideal heat pump is combined with an allegedly better-than-possible engine, e.g. one with an efficiency of:

ηengine = 2/3 (> ηcarnot = 1/2)

|----------------------------------------------------------|
|         Hot room at temperature T_1 = 273°C = 546 K      |
|----------------------------------------------------------|
| | |                           | | | |
v v v                           ^ ^ ^ ^
| | |                           | | | |
|------------|                 |---------------|
|   Engine   |->->->->->->->->-|   Heat pump   |
|  Eta = 2/3 |->->->->->->->->-| COP=2 Eta=1/2 |
|------------|                 |---------------|
|                               | |
v                               ^ ^
|                               | |
|----------------------------------------------------------|
|        Cold source at temperature T_2 = 0°C = 273 K      |
|----------------------------------------------------------|

The net effect / heat flow is then:

|----------------------------------------------------------|
|        Hot room at temperature T_1 = 273°C = 546 K       |
|----------------------------------------------------------|
|
^
|
|------------------|
|   Combination:   |
| Eta=2/3; COP=2   |
|------------------|
|
^
|
|----------------------------------------------------------|
|       Cold source at temperature T_2 = 0°C = 273 K       |
|----------------------------------------------------------|

One unit of heat would flow from the environment to the room, from the colder to the warmer body without any other change being made to the system. The combination of these machines would violate the Second Law of Thermodynamics; it is a Perpetuum Mobile of the Second Kind.
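
The arrow counting in the diagrams above can be reproduced with a few lines of Python. This is only a sketch of the bookkeeping: the function `net_flows` and its parameter names are my own, and exact fractions are used so that the results match the number of arrows.

```python
from fractions import Fraction


def net_flows(eta, cop, heat_in=1):
    """Arrow bookkeeping for an engine driving a heat pump.

    eta      efficiency of the engine (work out / heat in)
    cop      COP of the heat pump (heat out / work in)
    heat_in  heat the engine takes from the hot room per cycle
    Returns (net heat leaving the hot room, net heat entering the cold source).
    """
    eta, cop, heat_in = Fraction(eta), Fraction(cop), Fraction(heat_in)
    work = eta * heat_in              # arrows from engine to heat pump
    engine_waste = heat_in - work     # engine -> cold source
    pump_out = cop * work             # heat pump -> hot room
    pump_in = pump_out - work         # cold source -> heat pump
    return heat_in - pump_out, engine_waste - pump_in


# (eta, COP, arrows into the engine) as drawn in the diagrams above
cases = [
    (Fraction(1, 2), 2, 4),               # two perfect machines
    (Fraction(1, 4), 2, 4),               # poor engine
    (Fraction(1, 2), Fraction(3, 2), 4),  # poor heat pump
    (Fraction(2, 3), 2, 3),               # 'impossible' engine
]
for eta, cop, heat_in in cases:
    from_hot, to_cold = net_flows(eta, cop, heat_in)
    print(f"eta={eta}, COP={cop}: net {from_hot} from hot, {to_cold} to cold")
```

A negative result means that net heat flows in the forbidden direction, from the cold source to the hot room.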

If the heat pump has a higher COP than the inverse of the perfect engine’s efficiency, a similar paradox arises, and again one unit of heat flows in the forbidden direction:

|----------------------------------------------------------|
|         Hot room at temperature T_1 = 273°C = 546 K      |
|----------------------------------------------------------|
| |                             | | |
v v                             ^ ^ ^
| |                             | | |
|------------|                 |---------------|
|   Engine   |->->->->->->->->-|   Heat pump   |
|  Eta = 1/2 |                 |    COP = 3    |
|------------|                 |---------------|
|                               | |
v                               ^ ^
|                               | |
|----------------------------------------------------------|
|        Cold source at temperature T_2 = 0°C = 273 K      |
|----------------------------------------------------------|

A weird question: Can’t we circumvent the paradoxes if we pair the impossible superior devices with poorer ones (of the reverse type)?

|----------------------------------------------------------|
|         Hot room at temperature T_1 = 273°C = 546 K      |
|----------------------------------------------------------|
| | |                             | |
v v v                             ^ ^
| | |                             | |
|------------|                 |---------------|
|   Engine   |->->->->->->->->-|   Heat pump   |
|  Eta = 2/3 |->->->->->->->->-|    COP = 1    |
|------------|                 |---------------|
|
v
|
|----------------------------------------------------------|
|        Cold source at temperature T_2 = 0°C = 273 K      |
|----------------------------------------------------------|

Indeed: If the COP of the heat pump (= 1) is smaller than the inverse of the (impossible) engine’s efficiency (3/2), there will be no apparent violation of the Second Law – one unit of net heat flows from hot to cold.

An engine with low efficiency 1/4 would ‘fix’ the second paradox involving the better-than-perfect heat pump:

|----------------------------------------------------------|
|         Hot room at temperature T_1 = 273°C = 546 K      |
|----------------------------------------------------------|
| | | |                          | | |
v v v v                          ^ ^ ^
| | | |                          | | |
|------------|                 |---------------|
|   Engine   |->->->->->->->->-|   Heat pump   |
|  Eta = 1/4 |                 |     COP=3     |
|------------|                 |---------------|
| | |                            | |
v v v                            ^ ^
| | |                            | |
|----------------------------------------------------------|
|        Cold source at temperature T_2 = 0°C = 273 K      |
|----------------------------------------------------------|

But we cannot combine heat pumps and engines at will, just to circumvent the paradox – one counter-example is sufficient: Any realistic engine combined with any realistic heat pump – plus all combinations of those machines with ‘worse’ ones – has to result in net flow from hot to cold …

The Second Law identifies such ‘sets’ of engines and heat pumps that will all work together nicely. It’s easier to see this when all examples are condensed into one formula:

The heat extracted in total from the hot room – Q1 – is the difference of heat used by the engine and heat delivered by the heat pump, both of which are defined in relation to the same mechanical work W:

$Q_1 = W\left (\frac{1}{\eta_\text{engine}}-\varepsilon_\text{heatpump}\right)$

This is also automatically equal to Q2, the net heat delivered to the cold source, as another quick calculation shows or by just considering that energy is conserved: Some heat goes into the combination of the two machines, and part of it – W – flows internally from the engine to the heat pump. But no part of the input Q1 can be lost, so the output of the combined machine has to match the input. Energy ‘losses’ such as heat generated by friction will flow to either of the heat reservoirs: If an engine is less-than-perfect, more heat will be wasted to the environment; and if the heat pump is less-than-perfect, a greater part of the mechanical energy will be translated to heat only 1:1. You might even be lucky: Some part of the heat generated by friction might end up in the hot room.
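
The ‘quick calculation’ can be spelled out as follows. Here Q2 denotes the net heat delivered to the cold source (my labelling, in analogy to Q1): the engine dumps its waste heat there, and the heat pump takes its ambient heat from it.

$Q_2 = W\left(\frac{1}{\eta_\text{engine}}-1\right) - W\left(\varepsilon_\text{heatpump}-1\right) = W\left(\frac{1}{\eta_\text{engine}}-\varepsilon_\text{heatpump}\right) = Q_1$

The first term is the engine's heat input minus the work it delivers, the second is the heat pump's heat output minus the work it consumes.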

As Q1 has to be ≥ 0 according to the Second Law, the performance numbers have to be related by this inequality:

$\frac{1}{\eta_\text{engine}}\geq\varepsilon_\text{heatpump}$

Equality holds if the effects of the two machines just cancel each other.

If we start from a combination of two perfect machines (ηengine = 1/2 = 1/εheatpump) and increase either ηengine or εheatpump, this condition would be violated and heat would flow from cold to hot without effort.

But an engine with efficiency = 1 would also work happily with the worst heat pump with COP = 1. No paradox would arise at first glance – as 1/1 ≥ 1:

|----------------------------------------------------------|
|         Hot room at temperature T_1 = 273°C = 546 K      |
|----------------------------------------------------------|
|                                |
v                                ^
|                                |
|------------|                 |-----------------|
|   Engine   |->->->->->->->->-|   'Heat pump'   |
|   Eta = 1  |                 |      COP=1      |
|------------|                 |-----------------|

|----------------------------------------------------------|
|        Cold source at temperature T_2 = 0°C = 273 K      |
|----------------------------------------------------------|

What’s wrong here?

Because of conservation of energy, ε is always greater than or equal to 1; so the set of valid combinations of machines all consistent with each other is defined by:

$\frac{1}{\eta_\text{engine}}\geq\varepsilon_\text{heatpump}\geq1$

… for all efficiencies η and COPs / ε of machines in a valid set. The combination η = ε = 1 is still not ruled out immediately.

But if the alleged best engine (in a ‘set’) had an efficiency of 1, then the alleged best heat pump would have a Coefficient of Performance of only 1 – and this is actually the only heat pump possible, as ε has to be both less than or equal to 1 and greater than or equal to 1. It cannot get better without creating paradoxes!

If one real-life heat pump is found that is just slightly better than a heating rod – say ε = 1.1 – then performance numbers for the set of consistent, non-paradoxical machines need to fulfill:

$\eta_\text{engine}\leq\eta_\text{best engine}$

and

$\varepsilon_\text{heatpump}\leq\varepsilon_\text{best heatpump}$

… in addition to the inequality relating η and ε.

If ε = 1.1 is a candidate for the best heat pump, a set of valid machines would comprise:

• All heat pumps with ε between 1 and 1.1 (as per limits on ε)
• All engines with η between 0 and 1/1.1 ≈ 0.91 (as per inequality following the Second Law plus limit on η).

Consistent sets of machines are thus given by a stronger condition – by adding a limit for both efficiency and COP ‘in between’:

$\frac{1}{\eta_\text{engine}}\geq\text{Some Number}\geq\varepsilon_\text{heatpump}\geq1$

Carnot designed a hypothetical ideal heat pump that could have a COP of εcarnot = 1/ηcarnot. It is a limiting case of a reversible machine, but feasible in principle. εcarnot is thus a valid upper limit for heat pumps, a candidate for Some Number. In order to make this inequality true for all sets of machines (ideal ones plus all worse ones), 1/ηcarnot = εcarnot then also constitutes a limit for engines:

$\frac{1}{\eta_\text{engine}}\geq\frac{1}{\eta_\text{carnot}}\geq\varepsilon_\text{heatpump}\geq1$
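
To play with this chain of inequalities, here is a small sketch that tests whether a set of machines is consistent for a given Some Number. The function name `consistent` and the example sets are my own; the value 2 is 1/ηcarnot for the two temperatures used throughout this post.

```python
def consistent(etas, cops, some_number):
    """Check 1/eta >= some_number >= cop >= 1 for every machine in the set."""
    engines_ok = all(0 < eta and 1 / eta >= some_number for eta in etas)
    pumps_ok = all(1 <= cop <= some_number for cop in cops)
    return engines_ok and pumps_ok


carnot_number = 2.0  # 1/eta_carnot = eps_carnot for 546 K and 273 K

print(consistent([0.25, 0.5], [1.0, 1.5, 2.0], carnot_number))  # True
print(consistent([2 / 3], [1.0], carnot_number))                # False
print(consistent([1.0], [1.0], carnot_number))                  # False
```

The last two sets are the ‘weird’ pairings from above: they pass the pairwise test 1/η ≥ ε, but fail as soon as the Carnot machine is admitted to the set.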

So in order to rule out all paradoxes, Some Number in Between has to be provided for each set of machines. But what defines a set? As machines of totally different making have to work with each other without violating this inequality, this number can only be a function of the only parameters characterizing the system – the two temperatures.

Carnot’s efficiency is only a function of the temperatures. His hypothetical process is reversible, the machine can work either as a heat pump or an engine. If we could come up with a better process for a reversible heat pump (ε > εcarnot), the machine run in reverse would be an engine with η less than ηcarnot, whereas a ‘better’ engine would lower the upper bound for heat pumps.

If you have found one truly reversible process, both η and ε associated with it are necessarily the upper bounds of performance of the respective machines, so you cannot push Some Number in one direction or the other, and the efficiencies of all reversible engines have to be equal – and thus equal to ηcarnot. The ‘resistive heater’ with ε = 1 is the iconic irreversible device. It will not turn into a perfect engine with η = 1 when ‘run in reverse’.

The seemingly odd thing is that 1/ηcarnot appears like a lower bound for ε at first glance if you just declare ηcarnot an upper bound for corresponding engines and take the inverse, while in practice and according to common sense it is the maximum value for all heat pumps, including irreversible ones. (As a rule of thumb, a typical heat pump for space heating has a COP of only about 50% of 1/ηcarnot.)

But this ‘contradiction’ is yet another way of stating that there is one universal performance indicator of all reversible machines making use of two heat reservoirs: The COP of a hypothetical ‘superior’ reversible heat pump would be at least 1/ηcarnot  … as good as Carnot’s reversible machine, maybe better. But the same is true for the hypothetical superior engine with an efficiency of at least ηcarnot. So the performance numbers of all reversible machines (all in one set, characterized by the two temperatures) have to be exactly the same.

Historical piston compressor (from the time when engines with pistons looked like the ones in textbooks), installed in 1878 in the salt mine of Bex, Switzerland. In 1943 it was still in operation. Such machines used in salt processing were considered the first heat pumps.

An Efficiency Greater Than 1?

No, my next project is not building a Perpetuum Mobile.

Sometimes I mull over definitions of performance indicators. It seems straightforward that the efficiency of a wood log or oil burner is smaller than 1 – you will never be able to turn the full calorific value into useful heat, due to various losses and incomplete combustion.

Our solar panels have an ‘efficiency’ or power ratio of about 16.5%. So 16.5% of the solar energy is converted to electrical energy, which does not seem a lot. However, that number is meaningless without adding economic context, as solar energy is free. Higher efficiency would allow for much smaller panels. If efficiency were only 1%, panels were incredibly cheap and I had ample roof space, I might not care though.

The coefficient of performance of a heat pump is 4-5 which sometimes leaves you with this weird feeling of using odd definitions. Electrical power is ‘multiplied’ by a factor always greater than one. Is that based on crackpottery?

Our heat pump. (5 connections: 2x heat source – brine, 3x heating water – hot water supply, heating water supply, joint return).

Actually, we are cheating here when considering the ‘input’ – in contrast to the way we view photovoltaic panels: If 1 kW of electrical power is magically converted to 4 kW of heating power, the remaining 3 kW are provided by a cold or lukewarm heat source. Since those are (economically) free, they don’t count. But you might still wonder, why the number is so much higher than 1.

There is an absolute minimum temperature, and our typical refrigerators and heat pumps operate well above it.

The efficiency of thermodynamic machines is most often explained by starting with an ideal process using an ideal substance – using a perfect gas as a refrigerant that runs in a closed circuit. (For more details see pointers in the Further Reading section below). The gas would be expanded at a low temperature. This low temperature is constant as heat is transferred from the heat source to the gas. At a higher temperature the gas is compressed and releases heat. The heat released is the sum of the heat taken in at the lower temperature plus the electrical energy fed into the compressor – so there is no violation of energy conservation. In order to ‘jump’ from the lower to the higher temperature, the gas is compressed – by a compressor run on electrical power – without exchanging heat with the environment. This process repeats itself again and again, and with every cycle the same heat energy is released at the higher temperature.

In defining the coefficient of performance the energy from the heat source is omitted, in contrast to the electrical energy:

$COP = \frac {\text{Heat released at higher temperature per cycle}}{\text{Electrical energy fed into the compressor per cycle}}$

The COP of an ideal heat pump is the inverse of the efficiency of an ideal engine – the same machine, running in reverse. The engine has an efficiency lower than 1, as expected. Just as the ambient energy fed into the heat pump is ‘free’, the related heat released by the engine to the environment is useless and thus not included in the engine’s ‘output’.

One of Austria’s last coal power plants – Kraftwerk Voitsberg, retired in 2006 (Florian Probst, Wikimedia). Thermodynamically, this is like a heat pump running in reverse. That’s why I don’t like it when a heat pump is said to ‘work like a refrigerator, just in reverse’ (hinting at: the useful heat provided by the heat pump is equivalent to the waste heat of the refrigerator). If you run the cycle backwards, a heat pump would become sort of a steam power plant.

The calculation (see below) results in a simple expression, as the efficiency only depends on temperatures. Naming the higher temperature (heating water) T1 and the temperature of the heat source (‘environment’, our water tank for example) T2 …

$COP = \frac {T_1}{T_1-T_2}$

The important thing here is that temperatures have to be given as absolute values: 0°C is equal to 273.15 Kelvin, so for a typical heat pump and floor loops the numerator is about 308 K (35°C), whereas the denominator is the difference between both temperature levels – 35°C and 0°C, so 35 K. Thus the theoretical COP is as high as 8.8!

Two silly examples:

• If the heat pump operated close to absolute zero, say, trying to pump heat from 5 K to 40 K, the COP would only be 40 / 35 = 1.14.
• On the other hand, using the sun as a heat source (6000 K), the COP would be 6035 / 35 = 172.
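
The three cases can be recomputed with a short sketch. The function `ideal_cop` and the labels are mine, and I assume the same temperature lift of 35 K in all three examples.

```python
ZERO_CELSIUS_IN_K = 273.15


def ideal_cop(t_hot_k, t_cold_k):
    """Ideal (Carnot) heat pump COP; both temperatures in Kelvin."""
    if not 0 < t_cold_k < t_hot_k:
        raise ValueError("need 0 < t_cold_k < t_hot_k")
    return t_hot_k / (t_hot_k - t_cold_k)


examples = {
    "floor heating, 0°C to 35°C": (35 + ZERO_CELSIUS_IN_K, 0 + ZERO_CELSIUS_IN_K),
    "close to absolute zero, 5 K to 40 K": (40.0, 5.0),
    "sun as heat source, 6000 K to 6035 K": (6035.0, 6000.0),
}
for label, (t_hot, t_cold) in examples.items():
    print(f"{label}: COP = {ideal_cop(t_hot, t_cold):.2f}")
```

For a fixed temperature difference the ideal COP grows in proportion to the absolute temperature of the hot side, which is all the two silly examples are meant to show.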

So, as heat pump owners we are lucky to live in an environment rather hot compared to absolute zero, on a planet where temperatures don’t vary that much in different places, compared to how far away we are from absolute zero.

__________________________

Richard Feynman often used unusual approaches and new perspectives when explaining the basics in his legendary Physics Lectures. He introduces (potential) energy at the very beginning of the course drawing on Carnot’s argument, even before he defines force, acceleration, velocity etc. (!) In deriving the efficiency of an ideal thermodynamic engine many chapters later he pictures a funny machine made from rubber bands, but otherwise he follows the classical arguments:

Chapter 44 of Feynman’s Physics Lectures Vol 1, The Laws of Thermodynamics.

For an ideal gas, heat energies and mechanical energies are calculated for the four steps of Carnot’s ideal process – based on the Ideal Gas Law. The result is the much more universal efficiency given above. There can’t be any better machine, as combining such a better engine with an ideal heat pump / refrigerator (the same type of machine running in reverse) would violate the second law of thermodynamics – stated as a principle: Heat cannot flow from a colder to a warmer body and be turned into mechanical energy, with the remaining system staying the same.

Pressure over Volume for Carnot’s process, when using the machine as an engine (run counter-clockwise, it describes a heat pump): AB: Expansion at constant high temperature, BC: Expansion without heat exchange (cooling), CD: Compression at constant low temperature, DA: Compression without heat exchange (gas heats up). (Image: Kara98, Wikimedia).
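
For reference, the ideal-gas calculation can be sketched in a few lines, using the corner labels A to D from the caption above. The symbols n (amount of gas), R (gas constant), V (volume at each corner) and γ (ratio of heat capacities) are standard textbook notation and not taken from this post.

Heat taken in during the expansion AB at T1, and heat released during the compression CD at T2:

$Q_1 = nRT_1\ln\frac{V_B}{V_A} \qquad Q_2 = nRT_2\ln\frac{V_C}{V_D}$

The two steps without heat exchange, BC and DA, connect the volumes:

$T_1V_B^{\gamma-1} = T_2V_C^{\gamma-1} \qquad T_1V_A^{\gamma-1} = T_2V_D^{\gamma-1} \qquad\Rightarrow\qquad \frac{V_B}{V_A} = \frac{V_C}{V_D}$

So the logarithms cancel, and the work per cycle over the heat input is:

$\eta = \frac{Q_1-Q_2}{Q_1} = 1-\frac{T_2}{T_1} = \frac{T_1-T_2}{T_1}$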

Feynman stated several times in his lectures that he did not want to teach the history of physics, or downplayed the importance of learning about the history of science a bit (though it seems he was well versed in it – as e.g. his efforts to follow Newton’s geometrical proof of Kepler’s Laws showed). For historical background on the evolution of Carnot’s ideas and his legacy see the definitive resource on classical thermodynamics and its history – Peter Mander’s blog carnotcycle.wordpress.com:

What once puzzled me is why we accidentally latched onto such a universal law, using just the Ideal Gas Law. The reason is that the Gas Law has the absolute temperature already included. Historically, it took quite a while until pressure, volume and temperature had been combined in a single equation – see Peter Mander’s excellent article on the historical background of this equation.

Having explained Carnot’s Cycle and efficiency, every course in thermodynamics reveals a deeper explanation: The efficiency of an ideal engine could actually be used as a starting point defining the new scale of temperature.

Carnot engines with different efficiencies due to different lower temperatures. If one of the temperatures is declared the reference temperature, the other can be determined by / defined by the efficiency of the ideal machine (Image: Olivier Cleynen, Wikimedia.)
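
Read as a recipe, this definition of temperature is a one-liner. In the sketch below the hot reservoir is taken as the reference; the function name and the choice of reference are my own assumptions for illustration.

```python
def temperature_from_efficiency(t_reference, eta):
    """Temperature of the colder reservoir, given the hot reference temperature
    and the efficiency of an ideal engine running between the two."""
    if not 0 <= eta < 1:
        raise ValueError("an ideal engine has 0 <= eta < 1")
    return t_reference * (1 - eta)


print(temperature_from_efficiency(546.0, 0.5))  # 273.0
```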

However, according to the following paper, Carnot did not rigorously prove that his ideal cycle would be the optimum one. But it can be done, applying variational principles – optimizing the process for maximum work done or maximum efficiency:

Carnot Theory: Derivation and Extension, paper by Liqiu Wang

Mastering Geometry is a Lost Art

I am trying to learn Quantum Field Theory the hard way: Alone and from textbooks. But there is something harder than the abstract math of advanced quantum physics:

You can aim at comprehending ancient texts on physics.

If you are an accomplished physicist, chemist or engineer – try to understand Sadi Carnot’s reasoning that was later called the effective discovery of the Second Law of Thermodynamics.

At Carnotcycle’s excellent blog on classical thermodynamics you can delve into thinking about well-known modern concepts in a new – or better: in an old – way. I found this article on the dawn of entropy a difficult read, even though we can recognize some familiar symbols and concepts such as circular processes, and despite or because of the fact that I was, at the time of reading this article, a heavy consumer of engineering thermodynamics textbooks. You have to translate now unused notions such as heat received and the expansive power into their modern counterparts. It is like reading a text in a foreign language by deciphering every single word instead of having developed a feeling for the language.

Stephen Hawking once published an anthology of the original works of the scientific giants of the past millennium: Copernicus, Galileo, Kepler, Newton and Einstein: On the Shoulders of Giants. So just in case you googled for Hawking – don’t expect your typical Hawking pop-sci bestseller with lots of artistic illustrations. This book is humbling. I found the so-called geometrical proofs most difficult and unfamiliar to follow. Actually, it is my difficulties in (not) taming that Pesky Triangle that motivated me to reflect on geometrical proofs.

I am used to proofs stacked upon proofs until you get to the real thing. In analysis lectures you get used to starting by proving that 1+1=2 (literally) until you learn about derivatives and slopes. However, Newton and his predecessor giants talk geometry all the way! I have learned a different language. Einstein is most familiar in the way he tackles problems, though his physics is in principle the most non-intuitive.

This amazon.com review is titled Now We Know why Geometry is Called the Queen of the Sciences and the reviewer perfectly nails it:

It is simply astounding how much mileage Copernicus, Galileo, Kepler, Newton, and Einstein got out of ordinary Euclidean geometry. In fact, it could be argued that Newton (along with Leibnitz) were forced to invent the calculus, otherwise they too presumably would have remained content to stick to Euclidean geometry.

Science writer Margaret Wertheim gives an account of a 20th century giant trying to recapture Isaac Newton’s original discovery of the law of gravitation in her book Physics on the Fringe. (The main topic of the book is outsider physicists’ theories; I have blogged about the book at length here.)

This giant was Richard Feynman.

Today the gravitational force, the gravitational potential and the related acceleration of objects in gravitational fields are presented by means of calculus: The potential is equivalent to a rubber membrane model – the steeper the membrane, the higher the force. (However, this is not a geometrical proof – this is an illustration of underlying calculus.)

Model of the gravitational potential. An object trapped in these wells moves along similar trajectories as bodies in a gravitational field. Depending on initial conditions (initial position and velocity) you end up with elliptical, parabolic or hyperbolic orbits. (Wikimedia, Invent2HelpAll)

(Today) you start from the equation of motion for an object under the action of a force that weakens with the inverse square of the distance between two massive objects, and out pops Kepler’s law about elliptical orbits. It takes some pages of derivation, and you need to recognize conic sections in formulas – but nothing too difficult for an undergraduate student of science.
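
If you would rather let a computer do those pages of derivation, a numerical experiment shows the same thing. The sketch below is my own illustration, not part of any of the sources: it integrates the equation of motion with a simple velocity-Verlet scheme, in arbitrary units with GM = 1 and starting values picked by me, and then tests the defining property of an ellipse.

```python
import math

GM = 1.0   # gravitational parameter in arbitrary units (assumption)
DT = 1e-3  # time step of the integrator


def acceleration(x, y):
    """Inverse-square attraction towards the sun at the origin."""
    r3 = (x * x + y * y) ** 1.5
    return -GM * x / r3, -GM * y / r3


def simulate(x, y, vx, vy, steps):
    """Velocity-Verlet integration; returns the list of visited positions."""
    points = [(x, y)]
    ax, ay = acceleration(x, y)
    for _ in range(steps):
        x += vx * DT + 0.5 * ax * DT * DT
        y += vy * DT + 0.5 * ay * DT * DT
        ax_new, ay_new = acceleration(x, y)
        vx += 0.5 * (ax + ax_new) * DT
        vy += 0.5 * (ay + ay_new) * DT
        ax, ay = ax_new, ay_new
        points.append((x, y))
    return points


# Start at the far end of the orbit (aphelion), moving sideways.
x0, y0, vx0, vy0 = 1.0, 0.0, 0.0, 0.8
energy = 0.5 * (vx0**2 + vy0**2) - GM / math.hypot(x0, y0)
a = -GM / (2 * energy)                 # semi-major axis from the energy
period = 2 * math.pi * math.sqrt(a**3 / GM)
orbit = simulate(x0, y0, vx0, vy0, int(period / DT))

# For an ellipse, the distances to the two foci always add up to 2a.
# One focus is the sun; the other lies on the x-axis, mirrored at the centre.
focus2 = (2 * (x0 - a), 0.0)
sums = [math.hypot(px, py) + math.hypot(px - focus2[0], py - focus2[1])
        for px, py in orbit]
print(f"2a = {2 * a:.4f}, focal sums between {min(sums):.4f} and {max(sums):.4f}")
```

The starting speed of 0.8 is below the circular-orbit speed of 1 at that distance, which is why the starting point is the aphelion and the second focus can be placed as in the code.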

Newton actually had to invent calculus while tinkering with the law of gravitation. In order to convince his peers he needed to use the geometrical language and the mental framework common back then. He uses all kinds of intricate theorems about triangles and intersecting lines (;-)) in order to say what we say today using the concise shortcuts of derivatives and differentials.

Wertheim states:

Feynman wasn’t doing this to advance the state of physics. He was doing it to experience the pleasure of building a law of the universe from scratch.

Feynman said to his students:

“For your entertainment and interest I want you to ride in a buggy for its elegance instead of a fancy automobile.”

But he underestimated the daunting nature of this task:

In the preparatory notes Feynman made for his lecture, he wrote: “Simple things have simple demonstrations.” Then, tellingly, he crossed out the second “simple” and replaced it with “elementary.” For it turns out there is nothing simple about Newton’s proof. Although it uses only rudimentary mathematical tools, it is a masterpiece of intricacy. So arcane is Newton’s proof that Feynman could not understand it.

Given the headache that even Copernicus’ original proofs in the Shoulders of Giants gave me I can attest to:

… in the age of calculus, physicists no longer learn much Euclidean geometry, which, like stonemasonry, has become something of a dying art.

Richard Feynman finally made up his own version of a geometrical proof to fully master Newton’s ideas, and Feynman’s version covered a hundred typewritten pages, according to Wertheim.

Everybody who indulges gleefully in wooden technical prose and takes pride in plowing through mathematical ideas can relate to this:

For a man who would soon be granted the highest honor in science, it was a DIY triumph whose only value was the pride and joy that derive from being able to say, “I did it!”

Richard Feynman gave a lecture on the motion of the planets in 1964 that was later called his Lost Lecture. In this lecture he presented his version of the geometrical proof, which was simpler than Newton’s.

The proof presented in the lecture has been turned into a series of videos by YouTube user Gary Rubinstein. Feynman’s original lecture was 40 minutes long and confusing, according to Rubinstein – who turned it into 8 chunks of videos, 10 minutes each.

The rest of the post is concerned with what I believe social media experts call curating. I am just trying to give an overview of the episodes of this video lecture. So my summaries most likely do not make a lot of sense if you don’t watch the videos. But even if you don’t watch the videos you might get an impression of what a geometrical proof actually is.

In Part I (embedded also below) Kepler’s laws are briefly introduced. The characteristic properties of an ellipse are shown – in the way used by gardeners to create an ellipse with a cord and a pencil. An ellipse can also be created within a circle by starting from a random point inside the circle, connecting it to a point on the circumference and creating the perpendicular bisector of this connecting line. The point where the bisector crosses the radius drawn to the same point on the circumference lies on the ellipse.
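The circle construction can be checked with a few lines of code. This is a sketch with names and numbers of my own choosing (`ellipse_point`, a circle of radius 2, an inner point at (0.6, 0)), not something taken from the videos. The bisector collects all points that are equally far from the inner point and from the point P on the circumference. So if X is where the bisector crosses the radius to P, the distance from the center to X plus the distance from X to the inner point equals the radius of the circle – which is the cord-and-pencil property with the center and the inner point as the two foci.

```python
import math


def ellipse_point(center, inner, radius, angle):
    """Crossing point of the radius center->P and the perpendicular bisector of inner->P.

    P is the point on the circumference in direction `angle`.
    """
    cx, cy = center
    fx, fy = inner
    ux, uy = math.cos(angle), math.sin(angle)
    dx, dy = cx - fx, cy - fy
    # X = center + t*u must satisfy |X - inner| = |X - P| = radius - t.
    # Squaring both sides and solving the (linear) equation for t gives:
    t = (radius ** 2 - (dx * dx + dy * dy)) / (2.0 * (radius + dx * ux + dy * uy))
    return cx + t * ux, cy + t * uy


def focal_distance_sums(center=(0.0, 0.0), inner=(0.6, 0.0), radius=2.0, n=12):
    """Sum of the distances to both foci for n constructed points."""
    sums = []
    for k in range(n):
        point = ellipse_point(center, inner, radius, 2.0 * math.pi * k / n)
        sums.append(math.dist(point, center) + math.dist(point, inner))
    return sums
```

Every entry of `focal_distance_sums()` should equal the radius of the circle up to rounding errors, no matter where inside the circle the inner point is placed.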

Part II starts with emphasizing that the bisector is actually a tangent to the ellipse (this will become an important ingredient in the proof later). Then Rubinstein switches to physics and shows how a planet effectively ‘falls into the sun’ according to Newton, that is, a deviation due to gravity is superimposed on its otherwise straight-lined motion.

Part III shows in detail why the triangles swept out by the radius vector need to stay the same. The way Newton defined the size of the force in terms of a parallelogram attached to the otherwise undisturbed path (no inverse square law yet mentioned!) gives rise to constant areas of the triangles – no matter what the size of the force is!
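Newton's polygon can be mimicked in a small sketch. The assumptions are mine: the sun sits at the origin, the planet moves in a straight line for one time step, and then receives a sudden kick pointing exactly at the sun. The function name `swept_areas` and the starting values are made up for illustration. The size of the kick is handed in as an arbitrary function of the distance, so you can see that the areas do not care about the force law as long as the force points at the sun.

```python
import math


def swept_areas(force, steps=200, dt=0.01, pos=(1.0, 0.0), vel=(0.0, 1.2)):
    """Areas of the triangles sun / old position / new position for each time step.

    `force(r)` is the size of the attraction at distance r from the sun.
    """
    x, y = pos
    vx, vy = vel
    areas = []
    for _ in range(steps):
        # straight-lined motion during one time step
        nx, ny = x + vx * dt, y + vy * dt
        # area of the triangle spanned by the old and the new radius vector
        areas.append(0.5 * abs(x * ny - y * nx))
        # sudden change in velocity, directed from the new position to the sun
        r = math.hypot(nx, ny)
        kick = force(r) * dt
        vx -= kick * nx / r
        vy -= kick * ny / r
        x, y = nx, ny
    return areas
```

Trying `swept_areas(lambda r: 1.0 / r**2)`, `swept_areas(lambda r: r)` or `swept_areas(lambda r: 5.0)` should give lists of identical areas each time (up to rounding), although the three paths look very different.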

In Part IV the inverse square law is introduced – the changing force is associated with one side of the parallelogram denoting the deviation from motion without force. Feynman has now introduced the velocity as distance over time, which is equal to the size of the tangential line segments over the areas of the triangles. He created a separate ‘velocity polygon’ of segments denoting velocities. Both polygons – for distances and for velocities – look elliptical at first glance, though the velocity polygon seems more circular (We will learn later that it has to be a circle).

In Part V Rubinstein expounds the geometrical equivalent of the change in velocity being proportional to 1 over radius squared times time elapsed, with time elapsed being equivalent to the size of the triangles (I silently translate back to dv = dt times acceleration). Now Feynman said that he was confused by Newton’s proof of the resulting polygon being an ellipse – and he proposed a different proof:
Newton started from what Rubinstein calls the sun ‘pulsing’ at the same intervals, that is: replacing the smooth path by a polygon, resulting in triangles of equal size swept out by the radius vector but in changing velocity. Feynman divided the spatial trajectory into parts to which triangles of varying area are attached. These triangles are made up of radius vectors all at the same angles to each other. On trying to relate these triangles to each other by scaling them he needs to consider that the area of a triangle scales with the square of its height. This also holds for non-similar triangles having one angle in common.

Part VI: Since ‘Feynman’s triangles’ have one angle in common, their respective areas scale with the squares of the heights of their equivalent isosceles triangles, thus basically the distance of the planet to the sun. The force is proportional to one over distance squared, and time is proportional to distance squared (as per the scaling law for these triangles). Thus the size of the change in velocity – being the product of both – is constant! This is what Rubinstein calls Feynman’s big insight. But not only are the sizes of the changes in velocity constant, but also the angles between adjacent line segments denoting those changes. Thus the changes in velocities make up a regular polygon (which seems to turn into a circle in the limiting case).
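Feynman's big insight can be played through numerically as well. This is a sketch under my own assumptions: units with GM = 1, an angular momentum per unit mass of 1.2, a starting velocity of (0, 1.2), 36 equal angles per revolution, and each kick pointing at the sun from the middle of its wedge. All names are hypothetical. The change in velocity is force times time, that is (GM / r²) times (r² times the angle, over the angular momentum) – the distance r cancels, so every kick has the same size and only its direction turns by the same angle each time.

```python
import math


def velocity_polygon(v0=(0.0, 1.2), gm=1.0, ang_mom=1.2, n=36):
    """Corner points of the velocity polygon for n equal angles around the sun."""
    dtheta = 2.0 * math.pi / n
    # |dv| = (gm / r^2) * (r^2 * dtheta / ang_mom): the distance r drops out
    kick = gm * dtheta / ang_mom
    vx, vy = v0
    corners = [(vx, vy)]
    for k in range(n - 1):
        theta = (k + 0.5) * dtheta  # direction from the sun to the planet (assumed: mid-wedge)
        vx -= kick * math.cos(theta)
        vy -= kick * math.sin(theta)
        corners.append((vx, vy))
    return corners


def radii_about_centroid(corners):
    """Centroid of the corner points and the distance of each corner from it."""
    cx = sum(c[0] for c in corners) / len(corners)
    cy = sum(c[1] for c in corners) / len(corners)
    return (cx, cy), [math.dist(c, (cx, cy)) for c in corners]
```

All distances returned by `radii_about_centroid(velocity_polygon())` should agree up to rounding, so the corners sit on a circle. The centroid returned alongside them is the center of that circle, and for these starting values it does not coincide with the origin of the velocity diagram.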

Part VII: The point used to build up the velocity polygon by attaching the velocity line segments to it is not the center of the polygon. If you draw connections from the center to the endpoints the angle corresponds to the angle the planet has travelled in space. The animation of the continuous motion of the planet in space – travelling along its elliptical orbit – is put side-by-side with the corresponding velocity diagram. Then Feynman relates the two diagrams, actually merges them, in order to track down the position of the planet using the clues given by the velocity diagram.

In Part VIII (embedded also below) Rubinstein finally shows why the planet traverses an elliptical orbit. The way the position of the planet was finally found in Part VII is equivalent to the insights into the properties of an ellipse found at the beginning of this tutorial. The planet needs to be on the ‘ray’, the direction determined by the velocity diagram. But it also needs to be on the perpendicular bisector of the velocity segment – as the force causes a change in velocity perpendicular to the previous velocity segment and the velocity needs to correspond to a tangent to the path.