Andrew Gelman recently ran a post titled “Why waste time philosophizing?” My answer is that different philosophies dramatically affect how, and how well, we get answers from probabilities. An example from finance illustrates this.
Suppose a Frequentist aims to trade the market next week. With a crystal ball they’d know the actual market behavior is:
Not knowing this, they collect up-down data going back a thousand weeks. Historically the market is up 50% of the time regardless of the weekday, so they create the model
A more perceptive Frequentist conditions their historical research on the previous week’s behavior and determines the market will only be up two days next week. So they use an improved model
Both are objective and well-calibrated models of the “data generating mechanism”. No one would argue the Frequentists failed to get this right, and indeed much quantitative market analysis is a version of their approach. But now a Bayesian comes along brazenly using the following:
Clearly the Bayesian will make a lot more money next week than either Frequentist, but the Bayesian’s probabilities aren’t the frequency of anything, and would be dismissed as subjective by Frequentists. Eager to resolve this conflict, Frequentists could equate the Bayesian’s probabilities to the fraction of alternate universes in which the market goes up. For many this is enough to make them respectable. Respectability, though, is in the eye of the beholder, and skeptics are mindful of the shocking lack of data from other universes.
There is a better way to understand the objective content of the Bayesian’s distribution. Although next week’s market moves seem like a “random variable”, the truth is they’re a singular event, unique to a given time and place, never to be repeated. Frequentists are flummoxed by the probabilities of singular events, but Bayesians deal with fixed and non-repeatable events all the time. A Bayesian happily considers the probability distribution for the speed of light or the probability Obama will win the 2012 election. All we need do is understand the Bayesian’s market distribution in the same way.
The Bayesian’s distribution works because it satisfies two objective conditions. These are expressed using the joint distribution over the 5-tuple describing next week’s market movements, one entry per trading day.
The Truthfulness Condition says the distribution should only imply true things about the true sequence of market moves:
Not every distribution is truthful in this sense. The two Frequentist models and the Bayesian’s distribution all satisfy this condition at the chosen level, but a distribution that wrongly implies the market is down all next week does not.
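One way to make the Truthfulness Condition concrete is sketched below. It is only an illustration under stated assumptions: the days are treated as independent, the "high-probability region" is taken to be the smallest set of sequences carrying at least `1 - alpha` of the probability (with ties at the boundary kept), and the true sequence, the daily probabilities, the level `alpha = 0.05`, and the function names are all hypothetical rather than taken from the models above.

```python
import math
from itertools import product

def sequence_prob(seq, p_up):
    """Probability of an up(1)/down(0) sequence, assuming independent days."""
    prob = 1.0
    for moved_up, p in zip(seq, p_up):
        prob *= p if moved_up else 1.0 - p
    return prob

def high_probability_region(p_up, alpha):
    """Smallest set of sequences with total probability >= 1 - alpha.

    Sequences tied with the last one admitted are also kept, so the
    region does not depend on an arbitrary ordering of equal outcomes.
    """
    seqs = sorted(product((1, 0), repeat=len(p_up)),
                  key=lambda s: sequence_prob(s, p_up), reverse=True)
    region, total, last_p = [], 0.0, 0.0
    for s in seqs:
        p = sequence_prob(s, p_up)
        enough = total >= 1.0 - alpha
        if enough and p < last_p and not math.isclose(p, last_p):
            break
        region.append(s)
        total += p
        last_p = p
    return region

def is_truthful(p_up, true_seq, alpha=0.05):
    """True if the true sequence lies in the high-probability region."""
    return tuple(true_seq) in high_probability_region(p_up, alpha)

# Hypothetical week: up, down, up, up, down.
true_seq = (1, 0, 1, 1, 0)
coin_flip = [0.5] * 5                  # 50% up every day
sharp = [0.9, 0.1, 0.9, 0.9, 0.1]      # confident and right
all_down = [0.01] * 5                  # confident and wrong

print(is_truthful(coin_flip, true_seq))
print(is_truthful(sharp, true_seq))
print(is_truthful(all_down, true_seq))
```

With these made-up numbers the coin-flip and sharp distributions contain the true sequence, while the all-down distribution concentrates on a single sequence that is not the true one.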
The Information Condition requires the distribution to be informative about the true sequence:
Here is where the Bayesian’s distribution improves on the others. Using Boltzmann’s insight that the size of the high-probability region is related to the entropy, we can determine which distribution is most informative:
This explains why the Bayesian model was so much more useful. The Frequentist models weren’t wrong; it’s just that the Bayesian, freed from the ideological constraint of making all probabilities equal to some frequency, found a distribution so informative that its high-probability region was 30 times smaller than its Frequentist counterpart.
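The link between entropy and the size of the high-probability region can be sketched numerically. This is a rough illustration, not the calculation used above: it assumes independent days, reads Boltzmann’s relation as "effective number of sequences is about exp(entropy)", and uses hypothetical daily probabilities and function names.

```python
import math

def entropy(p_up):
    """Shannon entropy in nats of independent up/down days."""
    h = 0.0
    for p in p_up:
        for q in (p, 1.0 - p):
            if q > 0.0:  # 0 * log(0) is taken as 0
                h -= q * math.log(q)
    return h

def effective_size(p_up):
    """Rough count of sequences in the high-probability region: exp(H)."""
    return math.exp(entropy(p_up))

coin_flip = [0.5] * 5                  # hypothetical: 50% up every day
sharp = [0.9, 0.1, 0.9, 0.9, 0.1]      # hypothetical confident forecast

# The ratio shows how many times smaller the sharper region is.
ratio = effective_size(coin_flip) / effective_size(sharp)
print(effective_size(coin_flip), effective_size(sharp), ratio)
```

A coin-flip distribution over five days spreads itself over all 32 possible sequences, while a distribution that is certain of every day has an effective size of 1. Lower entropy means a smaller region, which is only an advantage if that region still contains the true sequence.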
These considerations lead to the Law of Trading Edges:
The trader has an edge to the extent they can find a low-entropy distribution covering the trading period whose high-probability region contains the true price sequence. As long as these conditions are satisfied, it’s irrelevant to their performance over that period whether the distribution equals the frequency of anything.
One strategy for satisfying the Truthfulness Condition is to find the region where price sequences have fallen in the past and create a probability distribution whose high-probability region covers it. If the future resembles the past, then the true sequence will again be in this region. This is how the two Frequentist models were found. Indeed, it’s pretty much the only strategy Frequentists use, since they bizarrely believe anything else is subjective.
An even simpler but more robust strategy would be to enlarge the high-probability region so much it can’t help but contain the true sequence, regardless of how the past relates to the future. Maximizing the entropy often works this way when the question at hand doesn’t need a highly informative distribution. Applying this simple strategy will be the subject of the next post.