Standard Deviation

Suppose we choose an American woman at random. Given just a couple of numbers that characterize American women's heights, we can be 95% certain (meaning we will be wrong only about once in every 20 times we do this) that her height will be between 4 feet 10 inches and 5 feet 10 inches. It is the statistical concept of standard deviation that allows us to say this. To show how, let us take a data set, the heights of 1,000 randomly chosen American women, and plot it as points on a graph where the x-axis shows height from 0 to 100 inches and the y-axis shows the number of women of that height (we use only whole numbers of inches). If we then connect all these points with a smooth line, we will get a curve that is bell-shaped. Data sets with this characteristic shape, including this one, are said to have a normal (also known as Gaussian) distribution, and a great many kinds of data are distributed this way.
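To make the plot described above concrete, here is a minimal Python sketch that simulates such a data set and tallies it by whole inches. The mean of 64 inches, the standard deviation of 3 inches, the random seed, and all function names are assumptions chosen for illustration; the heights are simulated, not real survey data.

```python
import random
from collections import Counter

def simulate_heights(n=1000, mean=64.0, sd=3.0, seed=0):
    """Draw n simulated heights in inches, rounded to whole inches.

    The mean and sd defaults are illustrative assumptions, not measured values.
    """
    rng = random.Random(seed)
    return [round(rng.gauss(mean, sd)) for _ in range(n)]

def tally(heights):
    """Count how many women have each whole-inch height."""
    return Counter(heights)

def text_histogram(counts, scale=4):
    """Return one line per height, drawing one '#' for every `scale` women."""
    return [
        f"{h:3d} in | {'#' * (counts[h] // scale)} {counts[h]}"
        for h in sorted(counts)
    ]

heights = simulate_heights()
counts = tally(heights)
for line in text_histogram(counts):
    print(line)
```

Read sideways, the rows of '#' marks trace out the bell shape: long rows near the middle heights and short rows at the extremes.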

For data that is strictly normally distributed, the highest point on our graph (in this example, the height in inches which occurs most frequently, and which is called the mode) would be at the average (mean) height for American women, which happens to be 64 inches, although with real-world data the mean and the mode can differ slightly. As we move rightward from that peak to greater heights along the x-axis, the curve will start sloping down, become steeper, and then gradually become less steep and peter out to zero as it hits the x-axis just past the height of the tallest woman (or women) in our sample. Exactly the same thing happens on the other side as we go to smaller heights, and this is how we get the familiar symmetrical bell shape. Suppose that the height of Japanese women is also 64 inches on average but varies less than that of American women because Japan is less ethnically and racially diverse than America. In this case, the bell-shaped curve will be thinner and higher and will fall to zero more quickly on either side. Standard deviation is a measure of how spread out the bell curve of a normal distribution is. The more the data is spread out, the greater the standard deviation.
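As a rough illustration of "more spread, greater standard deviation," the sketch below computes the standard deviation by hand (the square root of the average squared distance from the mean) for two simulated groups that share a mean of 64 inches. The spreads of 3 and 2 inches, the seed, and the function names are assumptions made up for this example, not figures for any real population.

```python
import math
import random

def standard_deviation(data):
    """Population standard deviation: root of the mean squared distance from the mean."""
    mean = sum(data) / len(data)
    return math.sqrt(sum((x - mean) ** 2 for x in data) / len(data))

def simulated_heights(sd, n=1000, mean=64.0, seed=1):
    """Simulated heights in inches; sd controls how spread out they are (assumed values)."""
    rng = random.Random(seed)
    return [rng.gauss(mean, sd) for _ in range(n)]

wide = simulated_heights(sd=3.0)    # hypothetical, more varied group
narrow = simulated_heights(sd=2.0)  # hypothetical, less varied group

for name, data in (("wide", wide), ("narrow", narrow)):
    mean = sum(data) / len(data)
    print(f"{name:6s} mean = {mean:.1f} in, standard deviation = {standard_deviation(data):.2f} in")
```

Both groups come out with nearly the same mean, but the narrower group has the smaller standard deviation, which is what a thinner, taller bell curve looks like in numbers.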

The standard deviation is half the width of the bell curve, measured from one side to the other at the level where the curve is about 60% of its maximum height (so it has the same units as the x-axis). It can be shown that about 68% of our data points will fall within plus or minus one standard deviation of the mean value. So in our example of the heights of American women, if the standard deviation is 3 inches, then about 68% of American women will have a height between 61 and 67 inches. Similarly, about 95% of our data points will fall within two standard deviations of the mean, so in our case, 95% of women will have heights between 58 and 70 inches. And it can likewise be calculated that about 99.7% of our data points will fall within three standard deviations of the mean, and so on for even greater degrees of certainty.
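These percentages can be checked numerically. For a normal distribution, the fraction of data within k standard deviations of the mean is erf(k / sqrt(2)), and the curve's height one standard deviation from the mean is exp(-1/2) of its peak. The sketch below uses the 64-inch mean and 3-inch standard deviation from the example; the function names are hypothetical.

```python
import math

def fraction_within(k):
    """Fraction of a normal distribution lying within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

def interval(mean, sd, k):
    """Range covered by the mean plus or minus k standard deviations."""
    return (mean - k * sd, mean + k * sd)

# Height of the bell curve one standard deviation from the mean, relative to its peak.
relative_height_at_one_sd = math.exp(-0.5)
print(f"curve height at one standard deviation: {relative_height_at_one_sd:.1%} of the peak")

for k in (1, 2, 3):
    low, high = interval(64, 3, k)
    print(f"within {k} standard deviation(s): {low} to {high} inches, "
          f"{fraction_within(k):.1%} of the data")
```

The exact figures are slightly different from the rounded ones quoted above (for instance, two standard deviations cover a little more than 95%), which is why they are usually stated as approximations.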

The reason that standard deviation is so important in science is that random errors in measurement usually follow a normal distribution, and every measurement has some random error associated with it. For example, even with something as simple as weighing a small object on a scale, if we weigh it 100 times, we may get many slightly different values. Suppose the mean of all our observations of its weight comes out to 1352 grams with a standard deviation of 5 grams. Then we can be 95% certain that any single weighing will fall between 1342 and 1362 grams (mean weight plus or minus two standard deviations), and the mean of all 100 weighings pins down the object's actual weight more tightly still. You may have heard reports, before the discovery of the Higgs boson at CERN was announced in 2012, that physicists had a "3 sigma" result showing a new particle. (The lowercase Greek letter sigma is the conventional notation for standard deviation, hence it is often just called "sigma.") The "3 sigma" meant, roughly, that we could be 99.7% certain the signal was real and not a random error. Eventually a "5 sigma" result was announced for the Higgs particle at CERN on July 4th, 2012, and that corresponds to about a 1 in 3.5 million chance that what was detected was due to random error.
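The sigma levels quoted above can be turned into probabilities with a few lines of Python. This sketch assumes perfectly normal random errors. It also assumes, as is conventional in particle physics, that the 1 in 3.5 million figure counts only fluctuations in one direction (a one-sided tail), whereas the 99.7% figure counts deviations on both sides of the mean. The function names are hypothetical.

```python
import math

def two_sided_tail(k):
    """Chance that a normal random error lands more than k sigma from the mean, either side."""
    return math.erfc(k / math.sqrt(2))

def one_sided_tail(k):
    """Chance that a normal random error lands more than k sigma above the mean."""
    return 0.5 * math.erfc(k / math.sqrt(2))

# Weighing example: mean 1352 grams, standard deviation 5 grams, two-sigma range.
mean_g, sd_g = 1352, 5
print(f"two-sigma range: {mean_g - 2 * sd_g} to {mean_g + 2 * sd_g} grams")

print(f"3 sigma, two-sided coverage: {1 - two_sided_tail(3):.2%}")
print(f"5 sigma, one-sided chance of a fluke: 1 in {1 / one_sided_tail(5):,.0f}")
```

Going from 3 sigma to 5 sigma shrinks the chance of a fluke from a few in a thousand to less than one in a million, which is why the higher threshold is the one used to announce a discovery.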

It is interesting that measurement error (or uncertainty in observations) is such a fundamental part of science now, yet it was only in the 19th century that scientists started routinely incorporating this idea in their measurements. The ancient Greeks, for example, while very sophisticated in some parts of their mathematical and conceptual apparatus, almost always reported observations with much greater precision than was actually warranted, and this often got amplified into major errors. A quick example of this is Aristarchus's impressive method of measuring the distance between the Earth and the Sun by measuring the angle between the line of sight to the Moon when it is exactly half full and the line of sight to the Sun. (The line between the Earth and the Moon would then be at 90 degrees to the line between the Moon and the Sun, and along with the line from the Earth to the Sun, this would form a right triangle.) He measured this angle as 87 degrees, which told him that the distance from the Earth to the Sun is about 20 times greater than the distance from the Earth to the Moon.

The problem is that the impeccable geometric reasoning he used is extremely sensitive to small errors in this measurement. The actual angle (as measured today with much greater precision) is 89.853 degrees, which puts the Sun about 390 times farther from the Earth than the Moon is. Had he made many different measurements and also had the concept of standard deviation, Aristarchus would have known that the possible error in his distance calculation was huge, even for a decent reliability of two standard deviations, or 95% certainty, in measuring that angle.
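The sensitivity is easy to see numerically. In the right triangle described earlier, the Earth-Moon distance is the side adjacent to the measured angle and the Earth-Sun distance is the hypotenuse, so their ratio is 1 / cos(angle). The short sketch below evaluates that ratio for a few angles; the function name and the intermediate angles are illustrative choices.

```python
import math

def sun_to_moon_distance_ratio(angle_deg):
    """Earth-Sun distance divided by Earth-Moon distance for a given Moon-Earth-Sun angle.

    With the right angle at the Moon, the Earth-Moon side is adjacent to the
    measured angle and the Earth-Sun side is the hypotenuse.
    """
    return 1.0 / math.cos(math.radians(angle_deg))

for angle in (87.0, 88.0, 89.0, 89.5, 89.853):
    ratio = sun_to_moon_distance_ratio(angle)
    print(f"angle {angle:7.3f} degrees -> Sun is about {ratio:6.1f} times farther than the Moon")
```

Because the cosine is close to zero near 90 degrees, a change of less than 3 degrees in the measured angle changes the computed distance by a factor of about 20.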