This is the first of a two-part series about Bayes’ Theorem and the Second Law of Thermodynamics. It begins with the question:
What are the odds that one hour from now, the diffuse air in the sealed and insulated room I’m in will position itself in the half where I’m not, thereby causing me to choke?
The air’s microstate is a point $x$ in the phase space $\mathbb{R}^{6N}$ formed from the position and momentum of each of the $N$ particles. The microstate’s details are unknown, but we do know its volume $V$ and energy $E$. This confines the possibilities for $x$ to some region $R$. Within that region, there’s a subset $B$ of “bad” states that will evolve into states which only take up half the room. The probability we seek is the chance that the air is currently in $B$.
Using Liouville’s Theorem and the entropy of an ideal gas,

$$S = kN\left(\ln\frac{V}{N} + \frac{3}{2}\ln\frac{E}{N} + c\right)$$

where $k$ is Boltzmann’s constant and the constant $c$ depends on the gas, we get, assuming a room at 1 atmosphere and $T = 25\,^{\circ}\mathrm{C}$,

$$P(x \in B) = \frac{|B|}{|R|} = e^{(S_B - S_R)/k} = 2^{-N} \approx 10^{-10^{26}}.$$
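To get a feel for where a number this absurdly small comes from, here is a back-of-envelope sketch. The room size (50 m³) is an assumption for illustration, not a figure from the post; everything else is the ideal gas law and the $2^{-N}$ counting argument.

```python
import math

# Assumed illustrative numbers: a 50 m^3 room at 1 atm and 25 C.
# P = pressure, V = volume, T = temperature, k = Boltzmann's constant.
P = 101325.0      # Pa
V = 50.0          # m^3
T = 298.15        # K
k = 1.380649e-23  # J/K

# Ideal gas law gives the number of molecules in the room.
N = P * V / (k * T)

# Probability that all N molecules sit in one half: (1/2)^N.
log10_prob = -N * math.log10(2)

print(f"N ~ {N:.2e} molecules")
print(f"log10 P(x in B) ~ {log10_prob:.2e}")
```

The exponent itself has 26 digits: the probability is not merely small, it is smaller than any quantity with an everyday-life interpretation.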
It’s clear why this never happens: real gas states are among the overwhelming majority in $R$ and not among the exceptions in $B$. In this sense physical systems don’t make “transitions” $R \to B$ whenever $S_B \ll S_R$, which is just the Second Law of Thermodynamics. Although a better way to express it is “the odds are against a state in $R$ winding up in $B$ whenever $S_B \ll S_R$”.
Probability distributions are introduced solely to define regions and to count states. In this case the true state has to be in a region $R$ consistent with the measured energy $E$, where $R = \{x : H(x) = E\}$ and $H$ is the Hamiltonian. This implicit definition of $R$ can be made explicit by maximizing the entropy functional

$$S[p] = -\int p(x)\ln p(x)\,dx$$

subject to the constraints:

$$\int p(x)\,dx = 1, \qquad \int p(x)H(x)\,dx = E.$$
The second constraint can usually be dropped for reasons given in the next post. The region of interest is then the support of $p$, the set where $p(x) > 0$.
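The maxent construction above can be carried out numerically in a toy discrete setting. The energy levels and target energy below are invented for illustration; the point is that maximizing $-\sum p \ln p$ under a normalization and a mean-energy constraint yields the familiar Gibbs form $p_i \propto e^{-\beta E_i}$, with the multiplier $\beta$ found here by bisection.

```python
import math

# Toy maxent: maximize -sum p ln p subject to sum p = 1 and
# sum p*E = E_target.  The solution has the Gibbs form
# p_i ∝ exp(-beta * E_i); we solve for beta by bisection.
# Energy levels and target are made-up illustrative numbers.
E = [0.0, 1.0, 2.0, 3.0]
E_target = 1.2

def mean_energy(beta):
    w = [math.exp(-beta * e) for e in E]
    Z = sum(w)
    return sum(wi * e for wi, e in zip(w, E)) / Z

# mean_energy is decreasing in beta, so bisect on it.
lo, hi = -50.0, 50.0
for _ in range(200):
    mid = (lo + hi) / 2
    if mean_energy(mid) > E_target:
        lo = mid   # beta too small: mean energy too high
    else:
        hi = mid
beta = (lo + hi) / 2

w = [math.exp(-beta * e) for e in E]
Z = sum(w)
p = [wi / Z for wi in w]
maxent_entropy = -sum(pi * math.log(pi) for pi in p)
print("p =", [round(pi, 4) for pi in p])
print("mean energy =", round(sum(pi * e for pi, e in zip(p, E)), 4))
```

Any other distribution satisfying the same two constraints has strictly lower entropy than this one.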
These probabilities are not the frequency of anything. They describe our uncertainty about $x$ and are similar to prior probabilities for fixed parameters. The frequency of states under repeated preparation, or of a single system over its lifetime, are interesting quantities, but they’re separate from reasoning about the air currently in my room. It’s likely that effects not considered would confine the states to a small subset of $R$ contained entirely in that majority, so we’d never observe an exception even if we could repeat the experiment an astronomical number of times, which we can’t. The frequency interpretation is not known or likely to be true, not needed, and impossible to verify.
This understanding is associated with Jaynes (see references), but is quite old. Planck expressed the same views in a classic 1913 work. Planck saw entropy as a general tool, not limited to physics, which allows us to take partial knowledge (functions) of the true state and find conclusions implied by almost every state consistent with that knowledge. The only thing needed to make those conclusions a reality is for the true state to be one of that great majority. There is no mention of frequency/ergodic interpretations, and he even carries out an explicit maxent construction like the one above (section 139). As proof of the generality of these ideas, I used Planck’s reasoning in this post to show why IID Normal assumptions work much better than Frequentists think they should.
So now imagine there is a Bayesian who takes $p(x)$ as a prior probability and observes aspects of the air as time progresses. With each measurement the Bayesian updates the distribution, producing a sequence of regions $R_t$ which shrink around the true state $x$. The Bayesian creates a decreasing sequence of entropies $S_t$ and concludes $R_t$ won’t overlap with $B$ at all, so there is no chance of asphyxiation.
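The shrinking-entropy behavior of Bayesian updating can be seen in a toy model. All numbers here are invented: a state living on ten discrete cells, a uniform prior, and three fixed noisy measurements with a Gaussian likelihood. Each update concentrates the posterior and lowers its entropy, mirroring the shrinking regions $R_t$.

```python
import math

def entropy(p):
    """Shannon entropy -sum p ln p of a discrete distribution."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

def likelihood(x, obs, sigma=1.5):
    """Gaussian measurement model (made-up noise level)."""
    return math.exp(-((x - obs) ** 2) / (2 * sigma ** 2))

p = [0.1] * 10                 # uniform prior over cells 0..9
entropies = [entropy(p)]
for obs in (4.2, 3.8, 4.1):    # fixed noisy measurements of the true cell
    p = [pi * likelihood(x, obs) for x, pi in enumerate(p)]
    Z = sum(p)                 # Bayes' rule: multiply by likelihood,
    p = [pi / Z for pi in p]   # then renormalize
    entropies.append(entropy(p))

print("posterior entropies:", [round(s, 3) for s in entropies])
```

The printed sequence starts at $\ln 10 \approx 2.303$ and decreases with every measurement: more knowledge, less uncertainty.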
The Bayesian’s results do not contradict the Second Law calculation, either as an inference or in its physical implications.
As inferences they are both perfectly valid. The Bayesian’s additional knowledge allows them to place $x$ in a “funnel” in time + phase space smaller than the “cylinder” used in the Second Law calculation, but both are accurately locating the true path in that space. So while they both reach the same no-choking conclusion, the Bayesian does so with less uncertainty, $S_t$ rather than $S_R$, but this merely reflects the truism that greater knowledge leads to less uncertainty. If you lacked the Bayesian’s additional measurements, the Second Law inference is still valid.
Nor are the various entropies in contradiction. Although both calculations use entropies of the form $-\int p \ln p$, they reference different distributions $p$, which are functions of different variables, have different numerical values, and are being used in slightly different ways.
They agree physically as well. Either analysis predicts the mass density $\rho$, which is proportional to the frequency of particles in each small volume, will maximize the empirical entropy $-\int \rho \ln \rho\,dV$ and diffuse out to a uniform density. Just as in the last post, the decreasing Bayesian entropy $S_t$ in no way contradicts the increasing empirical entropy or our intuitions about increasing disorder.
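That diffusion toward uniform density is easy to simulate. The setup below is a toy, not the post’s calculation: independent particles start in the left half of a 1-D box and take unbiased random steps with reflecting walls. The left-half fraction of particles relaxes from 1 toward 1/2, the frequency-of-particles statement both analyses agree on.

```python
import random

random.seed(0)
L = 100  # box sites 0..99
# Start all particles in the left half of the box.
particles = [random.randrange(L // 2) for _ in range(1000)]

def left_fraction(ps):
    """Fraction of particles currently in the left half."""
    return sum(1 for x in ps if x < L // 2) / len(ps)

fracs = [left_fraction(particles)]
for _ in range(6000):
    # Each particle takes an unbiased +/-1 step; clamping at the
    # walls acts as a reflecting boundary (uniform is stationary).
    particles = [min(L - 1, max(0, x + random.choice((-1, 1))))
                 for x in particles]
fracs.append(left_fraction(particles))

print("initial left fraction:", fracs[0])
print("final   left fraction:", round(fracs[1], 3))
```

The empirical density spreads out no matter what an observer knows about the microstate, which is why the Bayesian’s shrinking $S_t$ and the gas’s increasing disorder coexist without tension.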
So Bayes’ Theorem is compatible with the Second Law. The implications of this for thermodynamics and statistical inference will be sketched in the next post.
Jaynes (1963) Information Theory and Statistical Mechanics
Jaynes (1965) Gibbs vs Boltzmann Entropies
Jaynes (1967) Foundations of Probability Theory and Statistical Mechanics
Jaynes (1988) The Evolution of Carnot’s Principle
Jaynes (1992) The Gibbs Paradox
Jaynes (1998) The Second Law as Physical Fact and Human Inference