Part of the communication difficulty between Bayesians and Frequentists is that they’re modeling different things using similar mathematics. So it’s worth looking closely at a simple example to see what each is hoping to achieve with their methods.
Suppose we take a series of measurements $x_i = \mu + \epsilon_i$ in hopes of estimating the unknown parameter $\mu$. The “data generation mechanism” of the errors $\epsilon_i$ is IID draws from a fixed error distribution. To be concrete I simulated 10 errors from this distribution and got:
Of course these errors would be unknown to us, since we only directly observed the measurements $x_i$.
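Here's a minimal sketch of such a simulation, assuming standard normal errors and a true value of 5.3 purely for illustration (neither is meant to be the actual distribution or value in the example above):

```python
import numpy as np

rng = np.random.default_rng(0)

mu = 5.3                         # the unknown parameter (hypothetical value, known only to the simulator)
eps = rng.standard_normal(10)    # the ten one-off realized errors
x = mu + eps                     # the measurements we actually observe

print("errors:", np.round(eps, 2))
print("data:  ", np.round(x, 2))
```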
The goal of Frequentists is to model the “data generation mechanism”, which describes the propensity of the measuring device to give off errors $\epsilon_{11}, \epsilon_{12}, \dots$ on out to infinity. Their choice of an IID error distribution will be judged on how well it approximates the frequency distribution of the errors as $n \to \infty$.
Bayesians have a very different goal. Their job is to pin down those one-off, unique, never-to-be-repeated numbers in (1) as much as possible. Their choice of a distribution $P(\epsilon)$ will be judged on how well it identifies the location of the true error vector $(\epsilon_1, \dots, \epsilon_{10})$ in the space $\mathbb{R}^{10}$.
Our particular Frequentist happens to be an extremely good modeler of frequencies. Somehow they learn the exact IID “data generation mechanism” and use it to model the errors. They report the following 95% Confidence Interval for $\mu$:
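As a sketch of where such an interval comes from: if the errors are assumed to be IID $N(0,1)$ (an illustrative choice, not necessarily the distribution used above), the standard interval is $\bar{x} \pm 1.96/\sqrt{n}$:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, n = 5.3, 10                       # same illustrative setup as above
x = mu + rng.standard_normal(n)       # the observed measurements

# 95% Confidence Interval for mu, assuming the errors are IID N(0, 1):
# x_bar +/- 1.96 / sqrt(n), since the error variance is taken as known.
x_bar = x.mean()
half_width = 1.96 / np.sqrt(n)
print(f"95% CI for mu: ({x_bar - half_width:.2f}, {x_bar + half_width:.2f})")
```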
Not too shabby. The Bayesian isn't as good a modeler as the Frequentist, since they aren't able to intuit the exact properties of the “data generation mechanism” on out to infinity, but they're not total amateurs either. After some work, they're able to describe the potential values of the errors using the high probability region of a distribution $P(\epsilon)$.
These probabilities aren't equal to any frequencies, and as a model of the “data generation mechanism” they fail miserably. But they do a reasonable job of describing where the true error vector lies in $\mathbb{R}^{10}$. So the Bayesian computes the 95% Credibility Interval for $\mu$ and gets:
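For the mechanics of such a computation: with a flat prior on $\mu$, any choice of $P(\epsilon)$ yields a posterior proportional to $\prod_i P(x_i - \mu)$, and the credibility interval is read off from it. The sketch below uses a Laplace $P$ purely as a stand-in, not the distribution the Bayesian above actually chose:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
mu_true, n = 5.3, 10                    # same illustrative setup as above
x = mu_true + rng.standard_normal(n)    # the observed measurements

# A stand-in choice for the Bayesian's error distribution P.
P = stats.laplace(loc=0.0, scale=0.7)

# Flat prior on mu, so posterior(mu) is proportional to prod_i P(x_i - mu).
grid = np.linspace(x.mean() - 3.0, x.mean() + 3.0, 20001)
log_post = P.logpdf(x[:, None] - grid[None, :]).sum(axis=0)
post = np.exp(log_post - log_post.max())
dx = grid[1] - grid[0]
post /= post.sum() * dx                 # normalize to a density on the grid

# Central 95% credibility interval from the posterior CDF.
cdf = np.cumsum(post) * dx
lo, hi = np.interp([0.025, 0.975], cdf, grid)
print(f"95% credibility interval for mu: ({lo:.2f}, {hi:.2f})")
```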
The Bayesian answer is clearly an improvement over the Frequentist one.
Unfortunately, things only get worse for the Frequentist from here. There's no way for the Frequentist to improve their answer by improving their model. They already have the correct “data generation” model, which exactly describes how the data was generated. The Bayesian, however, can continue to improve their description of the true errors in ever more accurate detail. Eventually, if they're good enough, they'll identify the errors exactly, at which point they'll know $\mu$ exactly.
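The limiting case is simple to verify: once the realized errors are known exactly, $\mu$ follows from a single observation, with no appeal to long-run frequencies:

```python
import numpy as np

rng = np.random.default_rng(0)
mu_true = 5.3                        # same illustrative setup as above
eps = rng.standard_normal(10)        # the one-off realized errors
x = mu_true + eps                    # the observed measurements

# If the realized errors were ever pinned down exactly, mu would be
# recovered exactly from any single observation.
print(x[0] - eps[0])                      # recovers 5.3
print(np.allclose(x - eps, mu_true))      # True
```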
Frequentists seem to be modeling the wrong thing. So what motivates them to do this?
Well, they do it because, if their highly dubious assumptions about the “data generation mechanism” magically turn out to be true, they'll get intervals which wrongly identify the magnitude of $\mu$ a fixed percentage of the time, in repeated measurements that will never actually be made.
It’s left as an exercise for the reader to spot the flaws in that.
UPDATE: I'm not sure why my point here is so difficult for people to get. The $\mu$ is an unknown, but fixed, parameter. It doesn't have a frequency distribution of any kind. If you try to model its frequency distribution, then you're seriously clueless. If, on the other hand, you try to describe (using a function $P(\mu)$) where in $\mathbb{R}$ this fixed parameter resides, then you're in business. It's that simple.