The Amelioration of Uncertainty

The odds of choking from atmospheric fluctuations

This is the first of a two-part series about Bayes' Theorem and the Second Law of Thermodynamics. It begins with the question:

What are the odds that one hour from now, the diffuse air in the sealed and insulated room I’m in will position itself in the half where I’m not, thereby causing me to choke?

The air's microstate is a point $x = (q_1, p_1, \ldots, q_N, p_N)$ formed from the position and momentum of each particle. The microstate's details are unknown, but we do know the air's volume $V$ and energy $E$. This confines the possibilities for $x$ to some region $W$ of phase space, as shown below. Within that region, there's a subset of "bad" states $W_{\mathrm{bad}}$ which will evolve into states that only take up half the room. The probability $P_{\mathrm{bad}}$ we seek is the chance that the air is currently in $W_{\mathrm{bad}}$.

[Figure: SL]
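As a toy sketch of this counting (my own illustration, not from the post): ignore momenta and dynamics, treat each particle's position as independently uniform over the box, and estimate the chance that all of them land in one half.

```python
import random

def prob_all_left(n_particles, n_trials=100_000):
    """Monte Carlo estimate of the chance that every particle's
    position falls in the left half of a unit box."""
    hits = sum(
        all(random.random() < 0.5 for _ in range(n_particles))
        for _ in range(n_trials)
    )
    return hits / n_trials

# The exact answer is 2**-n, already tiny for modest n:
for n in (1, 5, 10):
    print(n, prob_all_left(n), 2.0**-n)
```

Even at $n = 10$ the exact value $2^{-10} \approx 0.001$ is already rare; a real room has $N \sim 10^{27}$ particles, which is where the absurd odds below come from.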

Using Liouville's Theorem and the entropy of an ideal gas,

    $S = kN\left[\ln\dfrac{V}{N} + \dfrac{3}{2}\ln\dfrac{E}{N} + s_0\right]$

where the constant $s_0$ depends on the gas, we get, assuming an ordinary-sized room at 1 atmosphere and T = 25 degrees Celsius,

    $P_{\mathrm{bad}} \approx 2^{-N} \approx 10^{-10^{26}}$
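To see where a number of this magnitude comes from, here is a back-of-the-envelope check (the room dimensions are my assumption; the post's exact figure is not shown):

```python
import math

# Rough count of gas molecules in a room via the ideal gas law (N = PV/kT),
# then the base-10 log of the probability 2**-N that all of them
# sit in one half of the room.
k_B = 1.380649e-23        # J/K, Boltzmann constant
P_atm = 101_325.0         # Pa, 1 atmosphere
T = 298.15                # K  (25 degrees Celsius)
V_room = 4.0 * 4.0 * 3.0  # m^3, assumed room dimensions

N = P_atm * V_room / (k_B * T)       # number of molecules
log10_P_bad = -N * math.log10(2.0)   # log10 of 2**-N

print(f"N ≈ {N:.2e} molecules")
print(f"P_bad ≈ 10^({log10_P_bad:.2e})")
```

The exponent itself is of order $-10^{26}$, so the precise room size is irrelevant to the conclusion.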

It's clear why this never happens: real gases are one of those $1 - 2^{-N}$ majorities and not one of the exceptions. In this sense physical systems don't make "transitions" $W_1 \to W_2$ whenever $S(W_2) < S(W_1)$, which is just the Second Law of Thermodynamics. Although a better way to express it is "the odds are against a state in $W_1$ winding up in $W_2$ whenever $S(W_2) \ll S(W_1)$".

Probability distributions are introduced solely to define regions $W$ and to count states. In this case the true state $x$ has to be in a region consistent with the measured energy $E_0$, where $W = \{x : H(x) \approx E_0\}$. This implicit definition of $W$ can be made explicit by maximizing the entropy functional $S[p] = -\int p\ln p\,dx$ subject to the constraints:

    $\displaystyle\int H(x)\,p(x)\,dx = E_0, \qquad \int \left(H(x)-E_0\right)^2 p(x)\,dx = \sigma^2$

The second constraint can usually be dropped for reasons given in the next post. The region of interest is then the high-probability region of $p(x)$.
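As a sketch of that maximization (keeping only the normalization and mean-energy constraints, per the remark above), the Lagrange-multiplier step gives the familiar canonical form:

```latex
% Maximize S[p] = -\int p \ln p \, dx  subject to
%   \int p \, dx = 1   and   \int H p \, dx = E_0.
% Stationarity of the Lagrangian in p:
-\ln p(x) - 1 - \lambda - \beta H(x) = 0
\quad\Longrightarrow\quad
p(x) = \frac{e^{-\beta H(x)}}{Z(\beta)},
\qquad
Z(\beta) = \int e^{-\beta H(x)}\,dx,
\qquad
-\frac{\partial \ln Z}{\partial \beta} = E_0 .
```

The multiplier $\beta$ is fixed by the last relation, which is just the mean-energy constraint restated.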

These probabilities are not the frequency of anything. They describe our uncertainty about $x$ and are similar to prior probabilities for fixed parameters. The frequency of states under repeated preparation, or of a single system over astronomically many lifetimes, is an interesting quantity, but it's separate from reasoning about the air currently in my room. It's likely that effects not considered would confine states to a small subset of $W$ contained entirely in that $1 - 2^{-N}$ majority, so we'd never observe an exception even if we could repeat the experiment an astronomical number of times, which we can't. The frequency interpretation is not known or likely to be true, not needed, and impossible to verify.

This understanding is associated with Jaynes (see references), but is quite old. Planck expressed the same views in a classic 1913 work. Planck saw entropy as a general tool, not limited to physics, which allows us to take partial knowledge (functions) of the true state and find conclusions implied by almost every state consistent with that knowledge. The only thing needed to make those conclusions a reality is for the true state to be one of that great majority. There is no mention of frequency/ergodic interpretations, and he even carries out an explicit maxent construction like the one above (section 139). As proof of the generality of these ideas, I used Planck's reasoning in this post to show why IID Normal assumptions work much better than Frequentists think they should.

So now imagine a Bayesian who takes $p(x)$ as a prior probability and observes aspects of the air as time progresses. With each measurement the Bayesian updates the distribution, producing a sequence $W \supset W_1 \supset W_2 \supset \cdots$ of regions which shrink around the true state $x$. The Bayesian thus creates a decreasing sequence of entropies $S(W_t)$ and concludes $W_t$ won't overlap with $W_{\mathrm{bad}}$ at all, so there is no chance of asphyxiation.
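A one-dimensional caricature of this shrinking sequence (the interval setup, noise model, and names are illustrative assumptions, not the post's): with a uniform prior and measurements of bounded noise, each Bayesian update is just an interval intersection, and the region around the true state can only shrink.

```python
import random

# Toy version of the shrinking-region story: a fixed unknown state theta,
# plus noisy measurements that progressively narrow an interval around it.
random.seed(0)
theta = 0.37                      # the true (unknown) state
lo, hi = 0.0, 1.0                 # prior region W_0 = [0, 1]
noise = 0.05                      # known bound on measurement error

widths = []
for _ in range(20):
    y = theta + random.uniform(-noise, noise)    # one measurement
    # With a uniform prior and bounded noise, Bayes' update is just
    # intersecting the current region with [y - noise, y + noise]:
    lo, hi = max(lo, y - noise), min(hi, y + noise)
    widths.append(hi - lo)

print("final region:", (round(lo, 4), round(hi, 4)))
```

The true state never leaves the sequence of regions, and their widths are non-increasing, mirroring the decreasing $S(W_t)$ in the post.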

[Figure: Bayes]

The Bayesian's results do not contradict the Second Law calculation, either as an inference or in its physical implications.

As inferences they are both perfectly valid. The Bayesian's additional knowledge allows them to place $x$ in a "funnel" in time + phase space smaller than the "cylinder" used in the Second Law calculation, but both accurately locate the true path in that space. So while they both reach the same no-choking conclusion, the Bayesian does so with less uncertainty, $S(W_t)$ rather than $S(W)$, but this merely reflects the truism that greater knowledge leads to less uncertainty. If you lacked the Bayesian's additional measurements, the Second Law inference would still be valid.

Nor are the various entropies in contradiction. Although both calculations use the functional $-\int p\ln p\,dx$, they reference different distributions $p$, which are functions of different variables, have different numerical values, and are being used in slightly different ways.

They agree physically as well. Either analysis predicts the mass density $\rho(\mathbf{r})$, which is proportional to the frequency of particles in each small volume, will maximize the empirical entropy $-\int \rho\ln\rho\,d^3r$ and diffuse out to a uniform density. Just as in the last post, the inferential entropy $S(W_t)$ decreasing in no way contradicts the empirical entropy increasing, or our intuitions about increasing disorder.
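A quick numerical caricature of that diffusion (bin count, particle count, and the ring geometry are my assumptions): start every particle in the left half and watch the histogram entropy climb toward its uniform maximum.

```python
import math, random

# Random-walk caricature of diffusion: begin with every particle in the
# left half, let each one jump around a ring of bins, and watch the
# empirical (histogram) entropy  -sum f_i ln f_i  rise toward ln(n_bins).
random.seed(1)
n_bins, n_particles, n_steps = 16, 5000, 400
pos = [random.randrange(n_bins // 2) for _ in range(n_particles)]  # left half only

def empirical_entropy(positions):
    counts = [0] * n_bins
    for p in positions:
        counts[p] += 1
    return -sum((c / n_particles) * math.log(c / n_particles)
                for c in counts if c)

s_start = empirical_entropy(pos)
for _ in range(n_steps):
    pos = [(p + random.choice((-1, 1))) % n_bins for p in pos]
s_end = empirical_entropy(pos)

print(f"S_empirical: {s_start:.3f} -> {s_end:.3f} (max = {math.log(n_bins):.3f})")
```

The empirical entropy rises even while a Bayesian who tracked the measurements would assign a shrinking region to the microstate, which is the post's point.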

So Bayes' Theorem is compatible with the Second Law. The implications of this for thermodynamics and statistical inference will be sketched in the next post.

REFERENCES:
Jaynes (1963) Information Theory and Statistical Mechanics
Jaynes (1965) Gibbs vs Boltzmann Entropies
Jaynes (1967) Foundations of Probability Theory and Statistical Mechanics
Jaynes (1988) The Evolution of Carnot's Principle
Jaynes (1992) The Gibbs Paradox
Jaynes (1998) The Second Law as Physical Fact and Human Inference

September 10, 2013
  • September 10, 2013: Joseph

    All,

    This is my sincere attempt to answer Cosma Shalizi’s questions from the last post, without poking Frequentists in the eye.

    This understanding radically changes the nature of non-equilibrium Statistical Mechanics and much else besides. I hope to give some sense of that in the next post.

    Regards,
    Joseph

  • September 10, 2013: Daniel Lakeland

    So, I think you've shown that you can use the maximum entropy formalism to construct measures on the phase space constrained by the energy and its uncertainty (which you say we can usually ignore) that give the right answers. You've also shown that if you obtain information you can construct Bayesian measures that give the right answers with less uncertainty. I think the only question left is in what specific sense the statement "entropy tends to increase in time" makes sense, i.e. for what general procedure for calculating a measure and computing its entropy does that statement come out true?

    Hopefully that’s what you’re going for next.

  • September 10, 2013: Joseph

    Daniel,

    I did show that in the post. Sometimes it makes sense to say,

        $\dfrac{dS}{dt} \geq 0$

    Other times it makes sense to say the empirical entropy $-\int \rho\ln\rho\,d^3r$ is increasing in time. But these are just special cases, and trying to generalize them too much is misleading and often wrong.

    A far more general way to look at the problem, in line with Planck, Jaynes and a few others, is to simply ask "what conclusions are implied by the vast majority of states compatible with what's known?".

    Sometimes in answering this question you’ll get an “entropy” which increases, and sometimes you won’t. Sometimes you’ll get multiple “entropies” doing different things in the same problem. So the key is to stop taking “entropy increasing in time” as the foundation of the subject. The real foundation is the question in the previous paragraph.

    Doing so opens up a whole world of opportunities which can never be reached by trying to generalize “entropy increases in time”.

  • September 10, 2013: Daniel Lakeland

    Got it. I think that clarifies your point.

  • September 11, 2013: Brendon J. Brewer

    Nice explanation, very clear.

    One thing I’m not totally comfortable with on the technical side is why knowing the energy constrains the expected value of your probability distribution, leading to the canonical distribution. The microcanonical distribution seems a much more reasonable description of “knowing the energy”, although it leads to harder mathematics and pretty much the same predictions.

  • September 11, 2013: Joseph

    Brendon,

    Suppose our knowledge of the actual energy $H(x)$ can be expressed by saying

    $H(x)$ is somewhere in the high probability manifold of a normal distribution $N(E_0, \sigma^2)$.

    Given this state of knowledge and the form of the function $H$, what distribution does this induce or imply on phase space?

    If you think about it long and hard, you'll get the answer in the post, with the mean-energy constraint set by $E_0$ and the spread constraint set by $\sigma$. That we can drop the latter constraint is due to the fact that the first constraint is already enough to get a decent answer to questions of interest, and including the second constraint wouldn't improve those answers appreciably.

  • September 11, 2013: Brendon J. Brewer

    “Given this state of knowledge and the form of the function $H$, what distribution does this induce or imply on phase space?”

    A normal mixture of microcanonical distributions :) That’s the MaxEnt result with the constraint on the marginal distribution for the energy, not just the expected value of the energy.

    This might be very well approximated by the canonical distribution if sigma is exactly right (it will need to be very small). If sigma is large then no single canonical distribution will work, but a mixture could work very well.

  • September 12, 2013: Corey

    Jaynes has a very nice and clear explanation for why Liouville's theorem is compatible with the Second Law. From the macrostate at $t_0$, we get a maxent distribution on phase space. We transform it according to the microstate time evolution rules (Newtonian or quantum) to $t_1$ and make a prediction of the macrostate. Because we included all of the physically relevant info in the definition of "macrostate", the prediction is accurate. Liouville's theorem then implies that if we run maxent again at the $t_1$ macrostate, the associated entropy must be at least as large as that of the $t_0$ macrostate.

    A curious feature of the above reasoning is that it holds for both $t_0 < t_1$ and $t_1 < t_0$. I have a fuzzy, incompletely thought-out notion about why it is nevertheless correct…

    Pedantic semantics: you mean "suffocation" rather than "choking"; the latter means obstruction of the airways, while the former is what happens when one lacks access to fresh air — perhaps due to choking.

  • September 12, 2013: Corey

    Rather than saying, “why it is nevertheless correct,” I should have said, “how it can be repaired.”

  • September 15, 2013: Joseph

    Corey,

    It’s never a problem when Canadians defend the English language from the ruffians and cowboys to the south.

    Since I used Liouville's theorem to derive the second law, I obviously don't see an incompatibility. The key is to drop the idea of a physical second law altogether and work with a purely inferential one. Probably a slight majority of the great physicists saw it that way, while the notion that Liouville's theorem is incompatible with the second law because it implies $\frac{dS}{dt} = 0$ is more common among rank and file physicists. I think people like Johnny von Neumann, for example, understood exactly what was going on.

    A more interesting question is to turn things around. Since empirically we don't see air accidentally bunch up on one side of the room and suffocate people, how much evidence does that provide for Liouville's theorem, or its quantum equivalent, at the microscopic level? Answer: basically none.

    Liouville's theorem could be hugely wrong, and you would still get a second law as described in the post. For example, if the dynamics failed to preserve phase volume exactly, it wouldn't change the result. Which is a good thing, because it's probably wrong in reality. No matter how much we try to make a closed system, the particles will be subject to constantly changing stray influences and forces of various kinds which make the true potential they experience non-conservative and very weird.

    So if we ask the question "given that we start out in $W_0$, where might we reasonably end up at a later time?", the answer is not going to be the region implied by Liouville's theorem; it's going to be a set big enough to take into consideration all those unknowns in the potential.

    You'll also get the same effect if you try to retrodict the past state rather than predict the future state. If you want to know what set at an earlier time the state could have been in to wind up in $W_0$, you'll need a set bigger than what Liouville's theorem implies. So knowing that the state is in $W_0$ now tells us that future and past states were somewhere in a kind of double cone which expands both into the future and into the past.

    Again my answer to Daniel above is key to all this. “Entropy increasing in time” is not the foundation of the subject. Fundamentally, this subject is the same as the rest of statistics: identify sets which contain the true state by any method that you can, and then count possibilities over those sets to identify (as Planck stated) conclusions which are true for almost every possibility.
