Imagine a medical researcher is running a drug trial on 1000 people and wishes to compare the drug to a placebo. The researcher randomly assigns 500 to the placebo group and the rest to the treatment group. The hope is that if some unknown variable influences the efficacy of the drug, it will split evenly between the two groups, allowing the statistician to say whether the drug worked. Unfortunately, the belief that randomization removes the problem of unknown factors or influences is false.
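To see the difficulty one only has to ask: how many unknown variables are there? Or to put it another way, how many things can you measure about the human body? Thinking classically for convenience, one could measure the amount of Uranium-238 present in some portion of the body, that is, in some subset $S$ of its atoms:

$$U_S = \sum_{i \in S} u_i, \qquad u_i = \begin{cases} 1 & \text{if atom } i \text{ is Uranium-238} \\ 0 & \text{otherwise} \end{cases}$$

Or measure the total momentum in the same portion:

$$\vec{P}_S = \sum_{i \in S} m_i \vec{v}_i$$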
Since the sum is over an arbitrary subset of atoms in the body, there are at least as many factors as there are atoms in a typical person, or roughly $7 \times 10^{27}$ for a 70 kg adult. One could add many refinements to this estimate, but any more careful or realistic count will only increase the number. So without getting bogged down in unrelated subtleties, I’ll assume there are about that many independent variables which can be measured in the typical human.
The upshot of a number that large is that after randomization an astronomical number of factors will be systematically different between the treatment and control groups. In fact, this is true no matter what procedure is used to divide them, random or not.
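To make the counting argument concrete, here is a minimal Python sketch under strong simplifying assumptions: every hidden variable is an independent standard normal draw, and a variable counts as "imbalanced" when the difference in group means exceeds 1.96 standard errors. The names `z_score` and `count_imbalanced`, the cutoff, and the default sizes are all hypothetical choices made for illustration, not part of any real trial protocol.

```python
import random


def z_score(a, b):
    """Difference in means divided by its estimated standard error."""
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    va = sum((v - ma) ** 2 for v in a) / (len(a) - 1)
    vb = sum((v - mb) ** 2 for v in b) / (len(b) - 1)
    return (ma - mb) / (va / len(a) + vb / len(b)) ** 0.5


def count_imbalanced(n_people=1000, n_vars=300, z_cut=1.96, seed=0):
    """Randomize once, then count hidden variables that differ between groups.

    Assumption: each hidden variable is an independent standard normal draw.
    """
    rng = random.Random(seed)

    # Random assignment: shuffle the participants and split them in half.
    ids = list(range(n_people))
    rng.shuffle(ids)
    half = n_people // 2
    treat, ctrl = ids[:half], ids[half:]

    imbalanced = 0
    for _ in range(n_vars):
        # One hidden variable, measured on every participant.
        x = [rng.gauss(0.0, 1.0) for _ in range(n_people)]
        z = z_score([x[i] for i in treat], [x[i] for i in ctrl])
        if abs(z) > z_cut:
            imbalanced += 1
    return imbalanced


if __name__ == "__main__":
    n_vars = 300
    n_bad = count_imbalanced(n_vars=n_vars)
    print(f"{n_bad} of {n_vars} hidden variables differ by more than 1.96 standard errors")
```

Under these assumptions about 5% of the variables should land beyond the cutoff in any given split, so the count of imbalanced variables grows in proportion to the number of variables considered. Changing the seed changes which variables are imbalanced, not roughly how many.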
One objection is that researchers are only interested in “nice” variables like total momentum of the whole body. A factor like the momentum of a single protein in a person’s brain is too messy to be of concern. But there is a selection bias here. The total momentum of a person is easy to know, while the momentum of a single protein in living tissue is usually impractical to measure. Maybe lots of variables are important but we don’t know that because we only ever see the ones measured. Certainly when we can measure “weird” variables, like the amount of Uranium-238 in a tiny portion of the brain, they sometimes turn out to be highly important medically.
A related but more subtle objection is that only “relevant” variables are important. There may be an astronomical number of factors, but only five that actually make a difference. In that case randomization will likely split the population into two groups that are balanced on those five.
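How likely is "likely"? The sketch below estimates it by re-randomizing a fixed population many times and checking whether all five relevant variables come out balanced. It assumes the five variables are independent standard normal values and treats a variable as balanced when the group means differ by no more than 1.96 standard errors. The names `z_score` and `all_balanced_rate` and every default parameter are hypothetical, chosen only for illustration.

```python
import random


def z_score(a, b):
    """Difference in means divided by its estimated standard error."""
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    va = sum((v - ma) ** 2 for v in a) / (len(a) - 1)
    vb = sum((v - mb) ** 2 for v in b) / (len(b) - 1)
    return (ma - mb) / (va / len(a) + vb / len(b)) ** 0.5


def all_balanced_rate(n_people=1000, n_relevant=5, z_cut=1.96,
                      n_trials=200, seed=1):
    """Fraction of random splits in which every relevant variable is balanced.

    Assumption: the relevant variables are independent standard normal draws.
    """
    rng = random.Random(seed)

    # Fixed population: one column of values per relevant variable.
    columns = [[rng.gauss(0.0, 1.0) for _ in range(n_people)]
               for _ in range(n_relevant)]

    ids = list(range(n_people))
    half = n_people // 2
    balanced = 0
    for _ in range(n_trials):
        # A fresh random assignment of the same people.
        rng.shuffle(ids)
        treat, ctrl = ids[:half], ids[half:]
        if all(abs(z_score([col[i] for i in treat],
                           [col[i] for i in ctrl])) <= z_cut
               for col in columns):
            balanced += 1
    return balanced / n_trials


if __name__ == "__main__":
    rate = all_balanced_rate()
    print(f"all five relevant variables balanced in {rate:.0%} of random splits")
```

If the five variables really are independent, the rate should sit near 0.95^5, or about 0.77, so "likely" here means roughly three splits in four rather than a guarantee. Raising `n_relevant` shows how quickly that comfort erodes as the number of relevant factors grows.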
Amazingly, this kind of reduction does happen sometimes. You don’t need, for example, to take an astronomical number of measurements to know if someone’s pregnant. It only takes a few. (This is probably not an accident, by the way. Reproduction is a repeatable and reliable process. To be so, it has to be highly robust, which is another way of saying “unaffected by the value of most factors.”)
But to simply assume this reduction for the drug trial is a huge stretch given the astronomical number of variables and our ignorance about almost all of them. About the only time you can make this assumption is when you already understand quite a bit about the underlying mechanism involved.
Unfortunately though, randomized trials are often used in the life and social sciences precisely when there is little or no knowledge of the underlying mechanism and the danger is greatest.
Moreover, there is growing evidence this isn’t just a theoretical problem. The error rate among peer-reviewed statistical studies in the life and social sciences seems so great as to make a layman doubt any effect that isn’t large and obvious.
So why do statistics and trials at all? Well, if your goal is to determine the efficacy of a specific drug, you might get lucky and have only a few hidden relevant factors, in which case the conclusion will be valid. However, if you’re unlucky, there will be a wealth of unknown relevant factors, which will take different values when the trial is repeated, and your conclusions will be overturned.
On the other hand, if your goal is to find and understand underlying mechanisms, then things aren’t so bad. If a result is later contradicted, then that implies there were some undiscovered relevant factors that differed between the two populations. Congratulations: you’ve just found a clue which may illuminate the underlying mechanism.
In this view, randomization is good if it helps reveal unexpected relevant variables. How good a job it does at this in practice I’ll leave for you to decide.