The Amelioration of Uncertainty

## Statistics is the Theory of Relevance

I’m working on a post defining probabilities. That’s the crux of the matter, so it’s worth laying out forthrightly. The unifying question probabilities aim to answer is: “When is A relevant for B?” It’s fair to surmise I’m the only living person who believes that, so as preparation for that future post, I brought in Jaynes for moral support.

The following is from his paper “Where do we Stand on Maximum Entropy”. Although this paper is only remembered for Jaynes’s dice example, it’s the most important philosophy of science paper of the second half of the 20th century. It’s not just for the philosophical, though; a Ph.D. in Statistics could fill a long and illustrious career by exploiting seams opened therein.

From the bottom of page 16:

From Boltzmann’s reasoning, then, we get a very unexpected and nontrivial dynamical prediction by an analysis that, seemingly, ignores the dynamics altogether! This is only the first of many such examples where it appears that we are “getting something for nothing,” the answer coming too easily to believe. Poincare, in his essays on “Science and Method” felt this paradox very keenly, and wondered how by exploiting our ignorance we can make correct predictions in a few lines of calculation, that would be quite impossible to obtain if we attempted a detailed calculation of the individual trajectories.

It requires very deep thought to understand why we are not, in this argument and others to come, getting something for nothing. In fact, Boltzmann’s argument does take the dynamics into account, but in a very efficient manner. Information about the dynamics entered his equations at two places: (1) the conservation of total energy; and (2) the fact that he defined his cells in terms of phase volume, which is conserved in the dynamical motion (Liouville’s theorem). The fact that this was enough to predict the correct spatial and velocity distribution of the molecules shows that the millions of intricate dynamical details that were not taken into account, were actually irrelevant to the predictions, and would have cancelled out anyway if he had taken the trouble to calculate them.

Boltzmann’s reasoning was super-efficient; far more so than he ever realized. Whether by luck or inspiration, he put into his equations only the dynamical information that happened to be relevant to the questions he was asking. Obviously, it would be of some importance to discover the secret of how this came about, and to understand it so well that we can exploit it in other problems.

Exploit it in other problems indeed! The future truly belongs to Bayesians.
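
As a small illustration of "only the relevant information goes in", here is a minimal sketch of the dice example mentioned above: the maximum-entropy distribution over six faces when the only thing known is the average. The function name `maxent_die`, the bisection bounds, and the target mean of 4.5 (the figure usually quoted for Jaynes's dice problem) are assumptions of this sketch, not something taken from the quoted passage.

```python
import math

FACES = range(1, 7)

def maxent_die(target_mean, lo=-10.0, hi=10.0, tol=1e-12):
    """Maximum-entropy distribution on faces 1..6 with a prescribed mean.

    The maximiser has the exponential form p_k proportional to exp(lam * k).
    The mean is increasing in lam, so lam can be found by bisection.
    """
    def dist(lam):
        weights = [math.exp(lam * k) for k in FACES]
        total = sum(weights)
        return [w / total for w in weights]

    def mean(lam):
        return sum(k * p for k, p in zip(FACES, dist(lam)))

    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean(mid) < target_mean:
            lo = mid
        else:
            hi = mid
    return dist(0.5 * (lo + hi))

if __name__ == "__main__":
    for face, p in zip(FACES, maxent_die(4.5)):
        print(face, round(p, 4))
```

The single constraint plays the role that energy conservation plays in Boltzmann's argument: nothing else about the die is specified, and the exponential form falls out of the maximisation.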

December 6, 2013
• December 6, 2013 Corey

The future belongs to those who read Jaynes. There are lots of people under the Bayesian tent who just wouldn’t get it, e.g., Bayesian philosophers of science, or personalistic Bayesian statisticians like Jay Kadane, who writes on page 1 of his 2011 text, Principles of Uncertainty,

Before we begin, I emphasize that the answers you give to the questions I ask you about your uncertainty are yours alone, and need not be the same as what someone else would say, even someone with the same information as you have, and facing the same decisions.

(I should point out that nothing else in the book actually relies on the truth of this assertion.)

• December 6, 2013 Joseph

Yeah, I’ve been shocked more times than I care to remember just how bad even prominent members of the Bayesian establishment can be. A big part of that is that I had been reading Jaynes’s papers for a long time before encountering other Bayesians. It turns my stomach to think I might have been one of those knuckleheads under less fortunate circumstances. Thank God for the Corps!

• December 6, 2013 Rasmus Bååth

“Thank God for the Corps!”

Just curious, did the Corps influence you in reading Jaynes?

• December 6, 2013 Joseph

Ha! no. Marines are very skeptical of eggheads. If you’d like to see what Marines read check out their official reading list. It’s set by the Commandant and usually changes slightly every year. Here’s the current version broken down by ranks:

It did get me away from Academia though, which has gotten noticeably worse over my lifetime.

• December 6, 2013 Joseph

I hadn’t looked at that list in a while, so it’s funny to see Taleb, Gladwell, Kuhn, and Kahneman on there.

• December 6, 2013 Daniel Lakeland

Corey, the quoted text doesn’t seem so bad when read generously enough.

Bayes’ rule applied to inference gives a consistent procedure for calculating your posterior uncertainty. The procedure relies on two aspects of your state of information: a Prior uncertainty and a Likelihood. When two Bayesians have the same Prior and Likelihood they SHOULD get the same answers. But several times on this blog we have discussed how both Priors and Likelihoods are in practice modeling decisions. We almost always throw away some information in constructing both of them (hence Gelman’s tendency to assert that answers that don’t “feel right” usually mean that some implicit knowledge was left out).

Since two people with the same knowledge could make different modeling decisions, it shouldn’t surprise us that two Bayesians with the same knowledge could get different results about their uncertainty. Two Bayesians with the same MODEL and information though should get the same results.
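
A minimal sketch of that point, assuming a Bernoulli likelihood with a conjugate Beta prior. The data (7 successes, 3 failures), the analysts' names, and the particular priors are all hypothetical choices made for illustration.

```python
from fractions import Fraction

def posterior_mean(prior_a, prior_b, successes, failures):
    """Posterior mean of a Bernoulli rate under a Beta(prior_a, prior_b) prior."""
    a = prior_a + successes
    b = prior_b + failures
    return Fraction(a, a + b)

# Hypothetical data shared by all three analysts.
data = (7, 3)

alice = posterior_mean(1, 1, *data)  # flat Beta(1, 1) prior
bob = posterior_mean(1, 1, *data)    # same model as Alice
carol = posterior_mean(5, 5, *data)  # same data, different modeling decision

print(alice, bob, carol)
```

Alice and Bob share both model and data, so their answers coincide; Carol differs only because her modeling decision differs, not because the arithmetic of Bayes' rule leaves any freedom.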

• December 6, 2013 Daniel Lakeland

The part that annoys me is where personalistic Bayesians assert that their results are more or less true by definition (i.e., no model checking needed). Ok fine, if you want to define truth as the result of a Bayesian computation with your personal prior and likelihood plugged in… then you could do that… but then it’s a true statement about your internal life, not a true statement about the external world. I’m much more interested in the external world. Telling me that you believe a particular elephant weighs between 1 and 3 pounds with 99% certainty might be a true statement about your internal state, but it’s more or less completely wrong with respect to the actual elephant.

So if that’s what you were getting at, then yes, I agree with you.

• December 6, 2013 Joseph

Daniel,

That’s what I thought too, but there really are people who think, as a matter of principle, that it’s ok to associate two different distributions with the same precisely defined state of information.

Note this isn’t a practical consideration where one person throws away some of the information for convenience, thereby implicitly using a different state of information. Nor is it a case where one analyst considers different states of information for their work. Or even a case where there is ambiguity in the state of information.

But even more than that, I took Corey to be referring to a whole family of blather, like the one you mentioned, that can be found in real Bayesian circles. I think he’s right: most Bayesians aren’t really going to get it any better than Frequentists will.

• December 6, 2013 Daniel Lakeland

“If we can learn how to recognize and remove irrelevant information at the beginning of a problem, we shall be spared having to carry out immense calculations, only to discover at the end that practically everything we calculated was irrelevant to the question we were asking”

Amen brother! So much of the stuff being done these days with enormous finite element or CFD models on supercomputers has this flavor to me. We do an enormous amount of computation, we get a result, this result is applicable in some very specific set of circumstances which will never happen exactly in practice, we wonder what will happen in practice, so we’re forced to re-calculate things under alternative circumstances… yikes.

Now admittedly, I don’t have a general way in which we can avoid all that stuff, but I think it’s worth considering this as a goal. Enormously precise CFD and other numerics are telling us enormously precise information about things we only care about in some relatively vague way.

Perfect examples are things like predicting the future climate from essentially weather models.

• December 6, 2013 Daniel Lakeland

Also, I’d like to mention that Jaynes says some useful stuff about ergodic sampling and so forth. In my work on fitting parameters for my dissertation I began to suspect that all the fancy Hamiltonian Monte Carlo and Markov chain theory in general were restricting the dynamics of MCMC-type simulations more than necessary.

I read some papers on adaptive Monte Carlo and various schemes that don’t satisfy detailed balance, and it seems to me that there’s a LOT to be done in that area. Although there are some technicalities related to continuous state spaces, there’s already a proof that detailed balance is too strong a condition. http://link.aip.org/link/JCPSA6/v110/i6/p2753/s1&Agg=doi

Although I love Stan and NUTS compared to other Bayesian computation schemes available today, I’d really love to see a system that is less restrictive of the type of models available (for example something that could handle ODEs and mixed continuous and discrete parameters) and yet still very efficient at sampling the relevant high probability region of the posterior. I suspect that a system which throws out strict detailed balance *and* uses continuous adaptation, yet still converges on the correct posterior distribution, will ultimately be the answer. Jaynes indicates related ideas in this paper.
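
To make the detailed-balance remark concrete, here is a toy sketch (not the construction from the linked paper, and not anything Stan does): a three-state chain that mostly circulates in one direction. The transition probabilities are hypothetical. The chain violates detailed balance, yet the uniform distribution is still stationary and the chain converges to it.

```python
N = 3

# Hypothetical transition matrix: stay with 0.2, step "forward" with 0.7,
# step "backward" with 0.1. Rows and columns both sum to 1.
P = [[0.0] * N for _ in range(N)]
for i in range(N):
    P[i][i] = 0.2
    P[i][(i + 1) % N] = 0.7
    P[i][(i - 1) % N] = 0.1

def step(dist, P):
    """One step of the chain applied to a distribution: dist @ P."""
    n = len(dist)
    return [sum(dist[i] * P[i][j] for i in range(n)) for j in range(n)]

def satisfies_global_balance(pi, P, tol=1e-12):
    """Check stationarity: pi P = pi."""
    return all(abs(a - b) < tol for a, b in zip(step(pi, P), pi))

def satisfies_detailed_balance(pi, P, tol=1e-12):
    """Check reversibility: pi_i P_ij = pi_j P_ji for every pair."""
    n = len(pi)
    return all(abs(pi[i] * P[i][j] - pi[j] * P[j][i]) < tol
               for i in range(n) for j in range(n))

uniform = [1.0 / N] * N

# Start from a point mass and iterate.
dist = [1.0, 0.0, 0.0]
for _ in range(200):
    dist = step(dist, P)

print(satisfies_global_balance(uniform, P))    # stationarity holds
print(satisfies_detailed_balance(uniform, P))  # reversibility does not
print([round(d, 6) for d in dist])
```

Stationarity (global balance) is what a sampler actually needs; detailed balance is just one convenient way to guarantee it.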

• December 6, 2013 Corey

Daniel,

I have a policy of taking people at their word very literally (when I don’t know of any incentive for them to be dishonest, anyway). So when Kadane writes that personalistic creed and specifically mentions “same information” I take him to be denying Jaynes’s desideratum (IIIc), which you can find on page 14 of PTLOS (a link for convenience: http://bayes.wustl.edu/etj/prob/book.pdf). That desideratum underlies the proper understanding of the Principle of Insufficient Reason (the explanation starts in the middle of page 34 of PTLOS), and also one of the most impressive resolutions of a probabilistic paradox and predictions of empirical frequencies (on the basis of pure thought!) that I know of. (Link.)

• December 7, 2013 Brendon J. Brewer

“The following is from his paper “Where do we Stand on Maximum Entropy”. Although this paper is only remembered for Jaynes’s dice example, it’s the most important philosophy of science paper of the second half of the 20th century.”

It is indeed brilliant. When I read something like this, I get disappointed at how few people can communicate so well.

With regard to the personalistic Bayesian ideas I understand the discomfort, but I do see the value of “eliciting an expert’s probabilities” mostly because it is much easier than analysing all of their prior information explicitly (i.e. becoming a better expert yourself).

• December 7, 2013 Joseph

I don’t think “eliciting an expert’s probabilities” is illegitimate at all. Basically, any method whereby you can construct a distribution where the true value is in the high probability manifold will work, including simply asking an expert where they think the true value is (if their guess is a good one, that is). I’m thinking about putting out a post on a time I did exactly that in Iraq and the results led to 2 named military operations. The experts in that case were the local EOD (Explosive Ordnance Disposal) team.

The math of the subjective Bayesians like Savage/De Finetti et al is often great stuff that really moves the subject forward. And what are called “subjective probabilities” are often fine if properly understood. But the philosophical stance of the subjective Bayesians, including their weird attachment to Bayes Theorem as opposed to the sum/product rule more generally, is all kinds of messed up.

Having said all that, Jaynes talks about meeting Savage once and them having a long conversation hashing out disagreements. He claimed that after properly understanding each other they were much closer to being on the same page than a naive look at subjective vs objective Bayesians would indicate.

• December 8, 2013 David Rohde

My entry point to learning about statistics was Jaynes. I was extremely impressed by his book and papers. I also get the impression that many people who enter statistics from the side, like I have done, particularly from engineering, computer science or physics take a similar route in finding Jaynes’ work an excellent starting place.

Jaynes’s role seems to be polarised within Bayesian groups: he either dominates to the near exclusion of everything else, or else is ignored. I prefer to see him as one of many important figures, but I like him a lot.

I read this paper: http://arxiv.org/abs/physics/0010064/ some time ago, when I was extremely impressed by Jaynes and found it quite difficult to accept (although, I am now pretty sympathetic). It relates directly to Corey’s comment:

“Our goal is that inferences are to be completely ‘objective’ in the sense that two persons with the prior information must assign the same prior probability.” [20] This is a very naïve idealistic statement of little practical relevance.

This is one difficulty with Jaynes. A reader of the book can (as I did) get the impression he has solved the problem of setting priors, in very general circumstances. In practice someone constructing a hierarchical model or even a plain old normal mixture model (like I did) pretty much has to use convenience priors to make practical progress. Reading about Jeffreys priors and transformation groups etc really won’t help you (as interesting and as brilliant as all this stuff is)… (FWIW I don’t think eliciting priors in these situations, except in the crudest sense, is particularly helpful either)

My more fundamental doubts about the objective Bayes approach come from the following dilemma:

Is it reasonable to use objective Bayesian probability in order to compute the expected utility of decisions?

I think the answer is sometimes yes, sometimes no. This leads me to assess which is more important if the two were to conflict. For me the important problem is the ordering of decisions, and I am willing to give up on the objective Bayesian ideal, as attractive as that is.

A tangential point addressing Joseph’s last comment: “including their weird attachment to Bayes Theorem as opposed to the sum/product rule”. I am a bit puzzled by that comment. If you read (say) Frank Lad, “Operational Subjective Statistical Methods”, Bayes theorem is introduced after the fundamental theorem of prevision on page 150. De Finetti’s Theory of Probability (which was Lad’s main inspiration) does much the same; Kadane, referred to above, is a little faster, introducing it in Chapter 2.

I agree with Joseph’s last paragraph. For me one of the most exciting and underappreciated ideas on the interface of probability theory and philosophy of science is the idea that an exchangeably extendable probability specification is a _restriction_ on being just an exchangeable probability sequence. This explains why, in a non-extendable probabilistic sequence, say when we count cards in blackjack, inference runs in the opposite direction to normal, i.e. many low cards seen means low cards in the future are _less_ likely. This was of course due to de Finetti but perhaps explained better by Jaynes in bayes.wustl.edu/etj/articles/applications.pdf. I find it interesting to see a version of this idea being used in quantum information in http://perimeterinstitute.ca/personal/cfuchs/ based on the constraints putting an exchangeable distribution on _many_ particles (disclaimer: I don’t know much at all about physics).
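
A small sketch of the two directions of inference. The deck composition (52 cards, 20 of them counted as "low") and the Beta(1, 1) mixing prior are hypothetical choices for illustration; both sequences are exchangeable, but only the second can be extended indefinitely.

```python
from fractions import Fraction

def finite_deck_predictive(low, other, seen_low, seen_other):
    """P(next card is low) when drawing without replacement from a finite deck."""
    remaining_low = low - seen_low
    remaining_total = low + other - seen_low - seen_other
    return Fraction(remaining_low, remaining_total)

def mixture_predictive(seen_low, seen_other, a=1, b=1):
    """P(next draw is low) for an i.i.d. mixture with a Beta(a, b) prior on the rate."""
    return Fraction(a + seen_low, a + b + seen_low + seen_other)

# Finite deck: seeing low cards makes further low cards LESS likely.
print(finite_deck_predictive(20, 32, 0, 0), finite_deck_predictive(20, 32, 5, 0))

# Infinitely extendable mixture: seeing low cards makes them MORE likely.
print(mixture_predictive(0, 0), mixture_predictive(5, 0))
```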

• December 9, 2013 Brendon J. Brewer

I hope this isn’t too much of a text dump, but I find this discussion by Ariel Caticha (from this paper: http://arxiv.org/pdf/0908.3212.pdf) very compelling, on the goals of inference and “subjective” vs “objective” Bayes.

“Different individuals may hold different beliefs and it is certainly important to figure out what those beliefs might be, perhaps by observing their gambling behavior, but this is not our present concern. Our objective is neither to assess nor to describe the subjective beliefs of any particular individual. Instead we deal with the altogether different but very common problem that arises when we are confused and we want some guidance about what we are supposed to believe. Our concern here is not so much with beliefs as they actually are, but rather, with beliefs as they ought to be. Rational beliefs are constrained beliefs. Indeed, the essence of rationality lies precisely in the existence of some constraints. The problem, of course, is to figure out what those constraints might be. We need to identify normative criteria of rationality. It must be stressed that the beliefs discussed here are meant to be those held by an idealized rational individual who is not subject to practical human limitations. We are concerned with those ideal standards of rationality that we ought to strive to attain at least when discussing scientific matters.

Here is our first criterion of rationality: whatever guidelines we pick they must be of general applicability; otherwise they fail when most needed, namely, when not much is known about a problem. Different rational individuals can reason about different topics, or about the same subject but on the basis of different information, and therefore they could hold different beliefs, but they must agree to follow the same rules.”

• December 9, 2013 Corey

David Rohde,

The point of noninformative measures isn’t necessarily to actually use them as priors in any given analysis — it’s to set the starting point for updating with whatever prior info you actually have. Zero is the starting point for a sequence of addition operations; unity is the starting point for a sequence of multiplication operations; a noninformative measure is the starting point for a sequence of Bayesian updates.

• December 10, 2013 Corey

David Rohde, I was a bit puzzled as to how one could start with Jaynes — the most uncompromising and acerbic writer on statistics I have ever read — and then get “converted” to the subjective Bayesian approach. But then I Googled up the Lad book and found this review (<- hides a link) that describes the book as "even outdo[ing] de Finetti’s far-ranging opinionatedness and stubborn consistency."

Ah, now I see: I hypothesize that Lad's and Jaynes's writing styles share a certain quality I might call "the quality of convincing exhortation". (Note to self: must master this style.)

• December 10, 2013 Joseph

Brendon,

I guess you’ll be seeing Caticha next week in Australia!

Corey,

“the most uncompromising and acerbic writer on statistics I have ever read” It’s funny you say this because I thought Jaynes was very mild. Indeed, I pictured myself the “bad cop” to his “good cop”. I thought this until recently, when it became clear what Jaynes’s reputation in the wider world really is. Still, I can’t think of him as anything other than a mild-mannered guy who had some definite, but highly constructive, ideas about statistics. I chalk up the differences in style between him and me as resulting from the fact that he was a Navy Officer, while I was a Marine Officer.

David,

There are those running around claiming the only way to ever change a distribution is through Bayes Theorem. It’s easy for anti-Bayesians to make this view look stupid. I believe it’s wrong in at least two ways: the philosophical definition of probabilities P(x|K) allows us to change from K_1 to K_2 whenever we feel like it (as long as they’re both true or hypothesized to be so), and secondly, the sum and product rules imply an infinite number of updating rules. See here:

http://www.entsophy.net/blog/?p=193

Jaynes definitely did not consider the problem of how to convert states of information K into probability distributions P(x|K) as solved. Quite the opposite, in fact: he not only thought this the main open question in Statistics, but he also thought it open-ended, since there are always new states of information K to consider. His main beef with Frequentists in practice (as opposed to the philosophical) was that they diverted enormous amounts of mathematical talent away from this (in general) unsolved problem toward irrelevancies.

The idea that Jaynes’s Objective Bayes is some kind of impractical ideal is a very serious misreading of Jaynes and the points he was making. Reread that quoted passage again. What he’s talking about is a tremendously powerful practical tool: namely the ability to throw away (or never learn in the first place) vast amounts of information and still get accurate answers to specific questions. This is not some impossible ideal, it’s the essence of practicality. Indeed it’s a good deal more practical than the current paradigm of fitting a distribution to a histogram and magically hoping the future looks like the past.

• December 10, 2013 Joseph

Also, David, it’s often the case that those convenient conjugate priors are far better justified than most people think. Basically, we can imagine increasing states of knowledge:

K_1, K_2, …, K_n

all of which are true. These lead to different distributions:

P(x|K_1), P(x|K_2), …, P(x|K_n)

with each distribution having smaller entropy than the one before it. While your true state of knowledge might be K_n, it’s perfectly ok to use something like P(x|K_1) in many instances. That in effect is what those convenient conjugate priors often are.

Far from this being in conflict with Jaynes’s Bayesian viewpoint, his is the only viewpoint I know of in which this makes perfect sense and which also explains when it will and won’t work.

Frequentists will think this wrong because they think there is only one “correct” distribution. Subjective Bayesians will think it’s wrong because it doesn’t represent your true beliefs. Or they’ll think it always works or some such.
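
Here is a toy sketch of increasing states of knowledge, all of them true, each giving a lower-entropy distribution than the last. The die example, the labels `K1` to `K3`, and the use of uniform distributions are assumptions made purely for illustration; with less tidy examples entropy need not fall at every step.

```python
import math

def uniform_on(support):
    """Uniform distribution over a finite set of possibilities."""
    p = 1.0 / len(support)
    return {x: p for x in support}

def entropy(dist):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in dist.values() if p > 0)

true_value = 4  # hypothetical true outcome of a die roll

# Three true statements about the roll, each implying the previous one.
states = [
    ("K1: it is a die face", [1, 2, 3, 4, 5, 6]),
    ("K2: it is even", [2, 4, 6]),
    ("K3: it is even and at least 4", [4, 6]),
]

entropies = []
for label, support in states:
    dist = uniform_on(support)
    entropies.append(entropy(dist))
    print(label, "entropy:", round(entropies[-1], 4),
          "P(true value):", round(dist[true_value], 4))
```

Every one of these distributions keeps the true value inside its high-probability region, which is why working with the cruder one is legitimate: it is merely less sharp, not wrong.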

• December 10, 2013 Brendon J. Brewer

“I guess you’ll be seeing Caticha next week in Australia!”

Unfortunately he couldn’t make it. John Skilling will be arguing against some of the parts of Ariel’s philosophy that I am also skeptical of, so that should be interesting.

• December 10, 2013 David Rohde

I am trying to remember what caused me to ‘convert’ from objective Bayes to subjective. I think I found this special issue: http://ba.stat.cmu.edu/vol01is03.php as well as Lindley’s philosophy of stats http://www.phil.vt.edu/dmayo/personal_website/Lindley_Philosophy_of_Statistics.pdf to be influential. Another review of ‘operational subjective statistical methods’ is this one (David Banks) http://link.springer.com/article/10.1007%2Fs003579900047?LI=true (paywalled). In style Lad is pretty different to Jaynes, although both write very well. When writing my thesis, I tried to emulate Jaynes’s writing style, but it didn’t work at all for me! Also, I was lucky enough to meet Lad at a past EBEB conference; incidentally, he seemed to be a fan of Caticha (and Jaynes).

It seems that Fuchs also ‘converted’ from a Jaynesian to an operational subjective view, see: http://perimeterinstitute.ca/personal/cfuchs/VaccineQPH.pdf. Note how I immodestly compare myself with a great physicist. FWIW, I am not trying to change anyone’s mind and expect all of you know most or all of these refs already.

I would agree that in terms of practical applications the Objective Bayesians have a good record. That said, I do think Jaynes’s desideratum (IIIc) is a problem in practice, but I don’t think we lose all that much by letting it go. I agree ignoring information in order to simplify a problem is very practical, but if Jaynes developed a general way to do this then I don’t see it.

Joseph: I also like your old post on Quantum Mechanics (at least it is consistent with my prejudices!). Although as I see it the updating rules you speak about seem to be just combinations of conditioning and marginalisation, something I think subjective Bayesians are on top of.

I find the rest of your comment interesting and challenging. I am not sure what you mean by K; it seems to be something more abstract than an observation, perhaps like I in Jaynes’s writings…

The case of ignoring an observation is I think very interesting, and practically necessary but philosophically troublesome (to me at least). Say we have a probability distribution P(a,b,c). Say we want to know a and we learn b and c but we decide to ignore c. There are at least two things we can do:

Marginalize to get P(a,b) and then condition to get P(a|b), but what does this distribution really mean? As Jaynes shows us, using non-sufficient statistics can cause problems http://bayes.wustl.edu/etj/articles/confidence.pdf but isn’t using non-sufficient statistics a big part of what approximation is all about?

Alternatively we could find the most extreme values of P(a,b,c) in order to compute an interval probability P_u(a|b) P_l(a|b). This has a clearer interpretation, but maybe the interval is too wide to be useful and maybe it is harder not easier to compute.

For context some examples might be useful e.g. imagine a is a quantity to predict, b are posterior samples and c is the data. Another example could be a is a quantity to predict, b are imprecise measurements and c are precise measurements.
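
A minimal sketch of the two options on a made-up joint distribution over three binary variables. The table `JOINT` is hypothetical, and reading the "interval probability" as the smallest and largest value of P(a|b,c) over the ignored c is one possible interpretation, assumed here for illustration.

```python
from fractions import Fraction

A_VALUES = B_VALUES = C_VALUES = (0, 1)

# Hypothetical joint P(a, b, c), given as integer weights out of 20.
_WEIGHTS = {
    (0, 0, 0): 4, (0, 0, 1): 2, (0, 1, 0): 1, (0, 1, 1): 3,
    (1, 0, 0): 1, (1, 0, 1): 3, (1, 1, 0): 4, (1, 1, 1): 2,
}
_TOTAL = sum(_WEIGHTS.values())
JOINT = {key: Fraction(w, _TOTAL) for key, w in _WEIGHTS.items()}

def conditional_after_marginalising(a, b):
    """Option 1: sum c out to get P(a, b), then condition to get P(a | b)."""
    numerator = sum(JOINT[(a, b, c)] for c in C_VALUES)
    denominator = sum(JOINT[(a2, b, c)] for a2 in A_VALUES for c in C_VALUES)
    return numerator / denominator

def bounds_over_ignored(a, b):
    """Option 2 (assumed reading): range of P(a | b, c) over the ignored c."""
    values = [
        JOINT[(a, b, c)] / sum(JOINT[(a2, b, c)] for a2 in A_VALUES)
        for c in C_VALUES
    ]
    return min(values), max(values)

point = conditional_after_marginalising(1, 1)
lower, upper = bounds_over_ignored(1, 1)
print(point, lower, upper)
```

Because P(a|b) is an average of P(a|b,c) weighted by P(c|b), the point value always falls inside the interval; the open question raised above is which of the two summaries is the more meaningful.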

• December 11, 2013 Corey

David Rohde, if you give up desideratum (IIIc), you’re giving up on the reasoning found in Jaynes’s article The Well-Posed Problem (link).

• December 11, 2013 Joseph

Corey,

I absolutely love that paper. I think it’s viewed as a kind of cute result not relevant to most of statistics. I couldn’t disagree more. I think it’s a devastating critique of both Frequentists and Subjective Bayesians using mathematics and experimental results, both of which are unassailable.

When I was still on speaking terms with Mayo, I tried to point out that the “frequency correspondence” stuff at the end was a deathblow to Frequentism, but she refused to even think about it. In retrospect, I think the mathematical sophistication of the paper is just a little beyond her and she didn’t want to admit it. So much for philosophers’ untiring quest for the Truth and all that.

Even those who are favorable to the paper commonly think it has no bearing on applications. While it’s true that it differs from the usual statistical work seen in the life and social sciences, most of that work is worthless crap anyway, so that isn’t much of a slight. It is however reminiscent of the way Physicists get their distributions, which in my physics education were always theoretically derived rather than learned from data. It’s worth noting that Physicists have had dramatically better results with their probability distributions than most others have.

So there are applications and then there are APPLICATIONS.

• December 11, 2013 Corey

Joseph, maybe I should post a “good parts” version a la The Princess Bride (only mine would be an actual abridgment, not a fictional one).

Just the other day I was talking to a frequentistically-trained (but not doctrinaire) coworker about how invariance approaches for point estimators and intervals can be extended to invariant distributions. I pointed him to the Wikipedia article on Bertrand’s paradox, which does not do full justice to Jaynes’s reasoning. He pointed out that Jaynes was, to all appearances, getting something (a frequency distribution) for nothing (a lack of information in the problem statement), so I gave a short summary of the argument from the “no-skill” limit. I was dissatisfied with leaving it at that, so yesterday I printed out a copy of the article and highlighted only those parts that contributed most succinctly to the philosophical approach underlying the derivation. It included Figure 1 and amounted to about 2/5 of the text.

• December 11, 2013 David Rohde

That’s a great paper, but why is it a problem to a subjective Bayesian?

quoting de Finetti (1970):

The main points of view that have been put forward are as follows.

The classical view, based on physical considerations of symmetry, in which one should be obliged to give the same probability to such ‘symmetric cases’. But which symmetry? And, in any case why? The original sentence becomes meaningful if reversed: the symmetry is probabilistically significant, in someone’s opinion, if it leads him to assign the same probabilities to such events.

• December 11, 2013 Joseph

“But which symmetry? And, in any case why?”

Both those questions are answered in the paper. This problem involves connecting an input fact to an output fact. The input is the “low skill” of the thrower, and the output is a definite shape to some histogram.

All the analysis, including the assumption of symmetry and the creation of a probability distribution, serves no other purpose than to show that in the vast, vast, vast, majority of cases where the former (input) fact is true, then so is the latter (output) fact. Admittedly the derivation is done in an incredibly slick way that makes it difficult to see that’s what’s actually going on, but as Jaynes explained that’s what it all means.

So if by “subjective” you mean nothing more than that recognition of facts is done by humans, then there’s no objection. But if anyone claims either of the following is just someone’s opinion:

(1) The mathematical derivation makes a definite prediction: if the thrower has low skill then we’ll likely see a histogram of a specified shape.
(2) Experimentally whenever the thrower does have low skill then the resulting histogram is observed to have that specified shape.

Then I would say they’ve lost touch with reality, because neither of those is just an opinion.

• December 12, 2013 David Rohde

I think the central idea of objective Bayes is that the symmetrical assignment of probabilities always ought to have decision theoretic force. I don’t think Jaynes convincingly makes that argument here or elsewhere. In the paper above, I think he describes the type of symmetry that he prefers to assign uniform probability to, reversing the direction of the argument.

Your point (2) suggests you yourself do not hold subjective probabilities that are i.i.d. replications of Jaynes’ solution. Presumably your willingness to look at experimental results suggests that if the histogram did not resemble Jaynes’ solution you would adjust your predictive distribution away from Jaynes’ and towards the histogram. If you are willing to do that, clearly your probabilities are not i.i.d. repetitions of Jaynes’ solution, but more likely an exchangeable sequence constructed using a mixture of Jaynes’ solution and other possibilities.

Anyway, I don’t have any huge beef with what you are saying, just suggesting that subjective Bayesians are probably less different and more sophisticated than your current (colourful bad cop) presentation suggests. And thanks for an interesting blog!

• December 12, 2013 David Rohde

I really wish I had known earlier that MaxEnt was in Oz; it only just clicked. I would have loved to have gone, it looks like a great program!

• December 12, 2013 Joseph

David,

The purpose of the blog is to aid the development of my arguments/explanations. So I’m not harping on you, just trying to modify my approach. From what you said last, you are very seriously misunderstanding both me and Jaynes. So here is a different route.

At heart Jaynes wasn’t making any argument about what kind of symmetry he wanted to assign a uniform distribution to. He was actually doing something of a very different nature. He was working out what reproducible consequences, if any, the assumption of a “low skilled” tosser would have.

You could think of it as a kind of sensitivity analysis. The assumption of “low skilled” implies a certain range of possibilities might occur. So the key question is:

“What conclusions are highly insensitive to possible outcomes within that range?”

Jaynes showed the approximate shape of the empirical histogram is one of those robust conclusions, in the sense that the vast majority of possibilities that a “low skilled” thrower could have achieved with their skill level yield similar shaped histograms.

Or if you think in terms of repeated trials, that approximate shape for the histogram should be reproducible from trial to trial.
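
That sensitivity analysis can be sketched by brute force. Below, "low skill" is modelled (as an assumption of this sketch, not as Jaynes's derivation) by straws whose position is scattered with a broad Gaussian and whose direction is uniform; the centres and spreads tried are hypothetical. For each thrower we record, among straws that cross a unit circle, the fraction whose chord is longer than the side of the inscribed equilateral triangle.

```python
import math
import random

def long_chord_fraction(center, sigma, n_throws=200_000, seed=0):
    """Fraction of chords longer than sqrt(3) among straws hitting the unit circle.

    Each straw is an infinite line through a point (x0, y0) drawn from a
    Gaussian with the given center and spread, with a uniformly random direction.
    """
    rng = random.Random(seed)
    hits = 0
    long_chords = 0
    for _ in range(n_throws):
        x0 = rng.gauss(center[0], sigma)
        y0 = rng.gauss(center[1], sigma)
        theta = rng.uniform(0.0, math.pi)
        # Perpendicular distance from the circle's center (the origin) to the line.
        d = abs(x0 * math.sin(theta) - y0 * math.cos(theta))
        if d < 1.0:
            hits += 1
            # Chord length is 2*sqrt(1 - d^2), which exceeds sqrt(3) iff d < 1/2.
            if d < 0.5:
                long_chords += 1
    return long_chords / hits

# Three different hypothetical "low skill" throwers.
throwers = [((0.0, 0.0), 10.0), ((3.0, -2.0), 10.0), ((-5.0, 4.0), 20.0)]
fractions = [long_chord_fraction(c, s, seed=i) for i, (c, s) in enumerate(throwers)]
print([round(f, 3) for f in fractions])
```

If the argument above is right, the fractions should barely move as the thrower's aim and scatter are changed, so long as the scatter stays much wider than the circle; that insensitivity is the reproducible consequence being claimed.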

All that symmetry/uniform-probability stuff which you are insisting is some kind of huge metaphysical assumption needing either objective or subjective Bayesian justification, is actually just a slick mathematical trick for carrying out that sensitivity analysis.

Mathematical tricks don’t need philosophical justification!