Fact, Fiction, and Our Probabilistic World
How do we separate fact from fiction? We are frequently struck by seemingly unusual coincidences. Imagine seeing an inscription describing a fish in your morning reading; then at lunch you are served fish and the conversation turns to "April fish" (or April fools). That afternoon a work associate shows you several pictures of fish, and in the evening you are presented with an embroidery of fish-like sea monsters. The next morning a colleague tells you she dreamed of fish. This might be starting to seem spooky, but it turns out that we shouldn't find it surprising. The reason has a long history, one that led to the unintuitive insight of building randomness directly into our understanding of nature through the probability distribution.
Chance as Ignorance
Tolstoy was skeptical of our understanding of chance. He gave an example of a flock of sheep where one had been chosen for slaughter. This one sheep was given extra food separately from the others and Tolstoy imagined that the flock, with no knowledge of what was coming, must find the continually fattening sheep extraordinary—something he thought the sheep would assign to "chance" due to their limited viewpoint. Tolstoy's solution was for the sheep to stop thinking that things happen only for "the attainment of their sheep aims" and realize that there are hidden aims that explain everything perfectly well, and so there is no need to resort to the concept of chance.
Chance as an Unseen Force
Eighty-three years later Carl Jung published a similar idea in his well-known essay "Synchronicity, An Acausal Connecting Principle." He postulated the existence of a hidden force that is responsible for the occurrence of seemingly related events that otherwise appear to have no causal connection. The initial story of the six fish encounters is Jung's, taken from his book. He finds this string of events unusual, too unusual to be ascribable to chance. He thinks something else must be going on—and labels it the acausal connecting principle.
Persi Diaconis, Stanford professor and former professor of mine, thinks critically about Jung's example: suppose we encounter the concept of fish once a day on average according to what statisticians call a "Poisson process" (another fish reference!). The Poisson process is a standard mathematical model for counts; radioactive decay, for example, seems to follow a Poisson process. The model presumes a fixed average rate at which observations appear, with the timing of individual observations otherwise random. So we can consider a Poisson process for Jung's example with a long-run average rate of one observation per 24 hours and calculate the probability of seeing six or more observations of fish within some 24-hour window. Diaconis finds the chance to be about 22%. Seen from this perspective, Jung shouldn't have been surprised.
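To make the calculation concrete, here is a minimal Python sketch of the kind of model described above. It is not Diaconis's own computation. The essay does not say how long an observation period his figure refers to, so the 180-day horizon below is purely an assumption for illustration, and the function names are hypothetical. The sketch shows two things: the exact chance of six or more events in one pre-specified 24-hour window, which is tiny, and a simulated estimate of the chance that such a cluster turns up somewhere over a longer stretch, which grows with the length of the stretch.

```python
import math
import random


def poisson_tail(k, mean):
    """P(X >= k) for X ~ Poisson(mean), as 1 minus the sum of P(X = 0..k-1)."""
    lower = sum(math.exp(-mean) * mean**i / math.factorial(i) for i in range(k))
    return 1.0 - lower


def has_cluster(times, k, window):
    """True if some k consecutive (sorted) event times span at most `window`."""
    return any(times[i + k - 1] - times[i] <= window
               for i in range(len(times) - k + 1))


def simulate_cluster_chance(rate, horizon, k, window, trials, seed=0):
    """Monte Carlo estimate of the chance that a Poisson process with the
    given rate shows k or more events inside some sliding window of length
    `window` during an observation period of length `horizon`."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        t, times = 0.0, []
        while True:
            # Gaps between events of a Poisson process are exponential.
            t += rng.expovariate(rate)
            if t > horizon:
                break
            times.append(t)
        hits += has_cluster(times, k, window)
    return hits / trials


if __name__ == "__main__":
    # One fixed 24-hour window, rate of one fish per day: about 0.0006.
    print("fixed window:", poisson_tail(6, 1.0))
    # Any 24-hour window over an ASSUMED 180-day period (not from the essay).
    print("sliding window:", simulate_cluster_chance(1.0, 180.0, 6, 1.0, 2000))
```

The contrast between the two numbers is the point: a coincidence that would be remarkable if we had named the day in advance becomes unremarkable once we allow it to happen on any day.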
The Statistical Revolution: Chance in Models of Data Generation
Only about two decades after Tolstoy penned his lines about sheep, Karl Pearson brought about a statistical revolution in scientific thinking with a new idea of how observations arise, the same idea used by Diaconis in his probability calculation above. Pearson suggested that nature presents data from an unknown distribution but with some random scatter. His insight was that this scatter is a different concept from measurement error, which is introduced when the observations are actually recorded.
Before Pearson, science dealt with things that were "real," such as laws describing the movement of the planets or blood flow in horses, to use examples from David Salsburg's book, The Lady Tasting Tea. What Pearson made possible was a probabilistic conception of the world. Planets didn't follow laws with exact precision, even after accounting for measurement error. The exact course of blood flow differed in different horses, but the horse circulatory system wasn't purely random. In estimating distributions rather than the phenomena themselves, we are able to abstract a more accurate picture of the world.
Chance Described by Probability Distributions
That measurements themselves have a probability distribution was a marked shift from confining randomness to the errors in the measurement. Pearson's conceptualization is useful because it permits us to estimate whether what we see is likely or not, under the assumptions of the distribution. This reasoning is now our principal tool for judging whether or not we think an explanation is likely to be true.
We can, for example, quantify the likelihood of drug effectiveness or carry out particle detection in high energy physics. Is the distribution of the mean response difference between drug treatment and control groups centered at zero? If that seems likely, we can be skeptical of the drug's effectiveness. Are candidate signals so far from the distribution for known particles that they must be from a different distribution, suggesting a new particle? Detecting the Higgs boson requires such a probabilistic understanding of the data to differentiate Higgs signals from other events. In all these cases the key is that we want to know the characteristics of the underlying distribution that generated the phenomena of interest.
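The drug question above can be sketched as a permutation test, one of several ways to ask whether an observed difference in group means is compatible with a distribution centered at zero. This is an illustrative sketch, not a method taken from the essay: the function name is hypothetical and the response values are made-up numbers chosen only to show the mechanics.

```python
import random
from statistics import mean


def permutation_p_value(treated, control, n_perm=5000, seed=0):
    """Two-sided permutation test for a difference in group means.

    If the drug does nothing, group labels are arbitrary, so reshuffling
    them shows which differences arise by chance alone. Returns the
    observed difference and the estimated p-value.
    """
    rng = random.Random(seed)
    observed = mean(treated) - mean(control)
    pooled = list(treated) + list(control)
    n_t = len(treated)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = mean(pooled[:n_t]) - mean(pooled[n_t:])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimate strictly above zero.
    return observed, (extreme + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical response scores, invented for illustration only.
    treated = [5.1, 6.3, 5.8, 7.0, 6.1, 5.5]
    control = [4.9, 5.2, 5.0, 5.6, 4.7, 5.3]
    diff, p = permutation_p_value(treated, control)
    print("observed difference:", diff, "p-value:", p)
```

A large p-value means a difference of this size is common under a distribution centered at zero, which is grounds for skepticism about the drug; a small one means it would be rare.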
Pearson's incorporation of randomness directly into the probability distribution enables us to think critically about likelihoods and quantify our confidence in particular explanations. We can better evaluate when what we see has special meaning and when it does not, permitting us to better reach our "human aims."