You’re a student of volatility. You wrote the book on volatility. You probably invented volatility and would rather find out you’re adopted than be accused of unlawful calculation. Nevertheless, you’re getting it wrong.
To calculate the volatility of end-of-day stock prices, the usual rule is to take the standard deviation of returns. This works if stocks are traded every day, but not for real data, since markets are closed almost a third of the year. The standard deviation overestimates the true volatility because it treats the return from Friday to Monday (72 hours) the same as the return from Monday to Tuesday (24 hours).
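To get a rough sense of the size of the effect, here is a back-of-the-envelope calculation. It assumes, purely for illustration, that log prices follow a driftless Brownian motion in calendar time with variance $\sigma^2$ per day, and that weekends are the only gaps. Each week then contributes four one-day returns and one three-day return, so the expected squared return $r^2$ is:

```latex
\mathbb{E}\left[r^2\right] = \frac{4 \cdot \sigma^2 + 1 \cdot 3\sigma^2}{5} = 1.4\,\sigma^2,
\qquad
\sqrt{1.4} \approx 1.18 .
```

Under those assumptions the plain standard deviation comes out roughly 18% above the true daily volatility. Holidays would push it a little higher.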
The good news is there’s a volatility estimator with these properties:
- It is as simple to calculate as the standard deviation of returns.
- It has a solid theoretical derivation.
- It seems to correct the well-known discrepancy between implied volatilities in option prices and realized volatilities.
- It has an amazing “invariance under missing data” property that allows it to accurately estimate the true volatility with 70-90% of the days missing.
- It cures baldness.
So here is the estimator in detail
Where is the price at time (for more particulars see my statistics thesis).
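As a hedged sketch, suppose the estimator is the maximum likelihood estimate for Brownian motion in log prices observed at uneven times, which is what the derivation described in this post would suggest. Writing $S_i$ for the price observed at time $t_i$ (notation assumed here), that estimate is $\hat\sigma^2 = \frac{1}{n}\sum_i (\Delta x_i - \hat\mu\,\Delta t_i)^2 / \Delta t_i$, with $\Delta x_i = \ln S_i - \ln S_{i-1}$, $\Delta t_i = t_i - t_{i-1}$ and $\hat\mu = \sum_i \Delta x_i / \sum_i \Delta t_i$. The function name `time_scaled_volatility` is hypothetical, and the exact form in the thesis may differ.

```python
import numpy as np

def time_scaled_volatility(prices, times):
    """Hypothetical helper: volatility per sqrt(unit of time) from unevenly spaced prices.

    Assumes log prices follow Brownian motion with constant drift and volatility.
    """
    prices = np.asarray(prices, dtype=float)
    times = np.asarray(times, dtype=float)
    dx = np.diff(np.log(prices))  # log return between consecutive observations
    dt = np.diff(times)           # elapsed time covered by each return
    if np.any(dt <= 0):
        raise ValueError("times must be strictly increasing")
    mu = dx.sum() / dt.sum()      # drift per unit time
    # Each squared return is scaled by the time it spans before averaging.
    return np.sqrt(np.mean((dx - mu * dt) ** 2 / dt))

# Toy data: two one-day returns followed by two four-day returns.
log_returns = [0.1, -0.1, 0.2, -0.2]
times = [0, 1, 2, 6, 10]
prices = 100.0 * np.exp(np.concatenate([[0.0], np.cumsum(log_returns)]))

print(time_scaled_volatility(prices, times))  # time-scaled estimate
print(np.std(np.diff(np.log(prices))))        # plain standard deviation of returns
```

On this toy data the time-scaled estimate is approximately 0.1 per day, while the plain standard deviation is approximately 0.158, because it counts the four-day returns as if they were one-day returns.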
The invariance under missing data property has to be seen to be believed. Here are pictures of it for simulated and real end-of-day stock data.
So where does this “invariance under missing data” property come from?
It’s clear from the equation how it works. The formula estimates the missing data and then proceeds as if it had a full set. This is a common trick for handling missing data but here it falls out of the formula automatically.
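A small simulation can illustrate the property. This is only a sketch: it assumes the estimator has the time-scaled form (squared log returns divided by the time they span, then averaged), and the helper name, seed, and parameters are all made up for the example. It simulates daily prices with a known volatility, throws away about 80% of the days at random, and compares the estimates.

```python
import numpy as np

def time_scaled_volatility(prices, times):
    # Assumed form of the estimator: squared log returns divided by elapsed time.
    dx = np.diff(np.log(np.asarray(prices, dtype=float)))
    dt = np.diff(np.asarray(times, dtype=float))
    mu = dx.sum() / dt.sum()  # drift per unit time
    return np.sqrt(np.mean((dx - mu * dt) ** 2 / dt))

rng = np.random.default_rng(0)
sigma = 0.02      # true daily volatility used in the simulation
n_days = 20_000

# Driftless Brownian motion in log prices, one step per day.
times = np.arange(n_days + 1)
steps = sigma * rng.standard_normal(n_days)
prices = 100.0 * np.exp(np.concatenate([[0.0], np.cumsum(steps)]))

# Keep roughly 20% of the days, always including the first and last.
keep = rng.random(n_days + 1) < 0.2
keep[0] = keep[-1] = True

vol_full = time_scaled_volatility(prices, times)
vol_thin = time_scaled_volatility(prices[keep], times[keep])
naive_thin = np.std(np.diff(np.log(prices[keep])))  # ignores the gaps

print("true sigma:            ", sigma)
print("time-scaled, all days: ", vol_full)
print("time-scaled, thinned:  ", vol_thin)
print("plain std, thinned:    ", naive_thin)
```

In this setup both time-scaled estimates should land close to the true value, while the plain standard deviation on the thinned data should come out well above it, since it treats a multi-day return as a one-day return.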
Theoretically, the estimator is derived from the Brownian motion transition probabilities and the Maximum Likelihood Principle. Both are well understood, but neither gives any hint where this property comes from. Seeing its origin requires a deeper understanding of Brownian motion.
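For readers who want the mechanics, the maximum likelihood step can be sketched as follows. The notation is assumed for this sketch: $x_i$ is the log price at time $t_i$, $\Delta x_i = x_i - x_{i-1}$, $\Delta t_i = t_i - t_{i-1}$, and the transition density is Gaussian with mean $\mu\,\Delta t_i$ and variance $\sigma^2 \Delta t_i$. The derivation in the thesis may be set up differently.

```latex
\ell(\mu, \sigma^2)
  = -\frac{1}{2} \sum_{i=1}^{n}
    \left[ \ln\!\left(2\pi \sigma^2 \Delta t_i\right)
         + \frac{(\Delta x_i - \mu\,\Delta t_i)^2}{\sigma^2 \Delta t_i} \right]

\frac{\partial \ell}{\partial \mu} = 0
  \;\Rightarrow\;
  \hat\mu = \frac{\sum_i \Delta x_i}{\sum_i \Delta t_i},
\qquad
\frac{\partial \ell}{\partial \sigma^2} = 0
  \;\Rightarrow\;
  \hat\sigma^2 = \frac{1}{n} \sum_{i=1}^{n}
    \frac{(\Delta x_i - \hat\mu\,\Delta t_i)^2}{\Delta t_i}
```

When every $\Delta t_i$ equals one day, this reduces to the usual variance of returns (with divisor $n$).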
Brownian motion has a “path integral” interpretation which deals with probability distributions on paths. In fact, Brownian motion is a key example, playing a role entirely similar to the Normal Distribution on the real line. Path integrals are familiar to Physicists from Quantum Field Theory but aren’t well known to Statisticians. Looking at it from a statistician’s viewpoint, one sees why it works: the unknown prices are treated as nuisance parameters that are integrated out exactly the way a Bayesian would want. The result is precisely the robustness against the unknown values that a Bayesian would expect.
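One way to make the “integrated out” step concrete, sketched here with assumed notation ($x_i$ for the log price at time $t_i$, $\mu$ and $\sigma$ for drift and volatility): if $x_j$ is an unobserved price between the observed prices $x_i$ and $x_k$, integrating over it collapses two Brownian transition densities into a single one over the combined interval.

```latex
\int p(x_k \mid x_j)\, p(x_j \mid x_i)\, dx_j = p(x_k \mid x_i),
\qquad
p(x_k \mid x_i) = \mathcal{N}\!\left(x_k;\; x_i + \mu\,(t_k - t_i),\; \sigma^2 (t_k - t_i)\right)
```

So the likelihood of the observed prices alone has the same functional form as the full-data likelihood, just with longer time steps.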
One consequence is that any estimator similarly derived will have the same property. Such path integral methods are ubiquitous in option pricing because of physicists-turned-quants and may yet be important for Economics proper. Much more about that latter topic another day.