I’m going to talk statistics. My wife recently took an advanced statistics course as part of her Master’s program, and we ended up talking about what she learned. A lot. In particular, we spent a lot of time talking about how people misunderstand statistics. Common examples:
- Statistics apply only to large numbers. If you don’t have a large enough data set, you can’t get anything meaningful out of the statistic. For instance, if I poll just ten people, I have too small a sample set to come to a decent conclusion.
- Statistics apply globally, not locally. For instance, a woman has a 13% chance of getting breast cancer if she lives to age 85. This doesn’t mean your five-year-old daughter has a roughly 1 in 8 chance of having breast cancer right now. Take a group of people over the span of 85 years, and you end up with that stat.
- Drawing statistics from a small sample set, without looking at a similar larger sample set, leads to incorrect conclusions. If 97.7% of prison inmates eat meat, one might assume a correlation between eating meat and likelihood of becoming a prisoner, until you measure the general population and find that 2.3% of the US population are vegetarians. The statistic has no particular meaning in the more limited context; it just reflects the same statistic for the overall population.
- Correlation does not equal causation. Just because two statistics tend to have some sort of relationship to one another doesn’t mean one causes the other. It does mean, however, that there is some sort of relationship there which might be worth exploring.
- Discarding data that goes against your hypothesis. This is called “counting the hits and ignoring the misses”. For instance, people will often blame the weatherman for mis-calling a forecast, ignoring his 90% three-day-forecast success rate* the rest of the year. People remember the days he flubbed it, and forget the days things went as planned.
- Choosing one point in a data set. Psychics often point to one or a few tests in an examination that support their claim to be psychics. Those exceptional results are outliers on the bell curve of probability, but because they were not repeatable, they are not exceptional in the overall arc of testing.
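Two of the points above come down to arithmetic you can check yourself. Here is a rough Python sketch. The prison figures are the ones quoted above; the psychic setup (25 guesses per test, a 1 in 5 chance of guessing right, 10 hits counting as "impressive", 20 tests in total) is made up purely for illustration and does not come from any real study.

```python
from math import comb

def relative_rate(rate_in_group, rate_in_population):
    """How much more common something is in a subgroup than overall."""
    return rate_in_group / rate_in_population

def prob_at_least(k, n, p):
    """Chance of k or more hits in n independent guesses, each with hit chance p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

# Base-rate check, using the figures quoted in the list above.
meat_eaters_in_prison = 0.977
meat_eaters_overall = 1 - 0.023  # 2.3% vegetarians in the general population
ratio = relative_rate(meat_eaters_in_prison, meat_eaters_overall)
print(f"Inmates eat meat at {ratio:.2f}x the general rate")

# Cherry-picking check, using hypothetical numbers (see the note above).
p_single = prob_at_least(10, 25, 0.2)  # one "impressive" test by pure luck
p_any = 1 - (1 - p_single) ** 20       # at least one such test out of 20
print(f"Lucky on any one test: {p_single:.1%}")
print(f"Lucky on at least one of 20 tests: {p_any:.1%}")
```

A ratio near 1 means the prison number says nothing special about prisoners. And under these assumptions, a score that looks remarkable on any single test becomes much more likely to turn up somewhere once you run enough tests, which is why the one good result has to be repeatable to count.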
Now to my specific complaint:
If I hear one more person say “if scientists can’t even predict the weather, how can they predict global warming?”, I’m going to scream. Ongoing climate change, including the current upward temperature trend, isn’t a hypothesis based on weather forecasts!
The weather forecast in your home town is a teeny, tiny part of an enormous chaotic system. That chaotic system, however, is a self-contained ecosystem, and the overall ecosystem can be measured (and statistics drawn) many different ways to come up with accurate global statistics that have precisely jack to do with whether you should carry an umbrella tomorrow.
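If that sounds hand-wavy, here is a toy Python sketch of the idea. To be clear, this is synthetic data, not a climate model and not real measurements: the 15-degree baseline, the 0.03 degrees per year trend, the 3-degree daily noise, and the assumption that each day's noise is independent are all invented for illustration.

```python
import random
from statistics import mean

def simulate_daily_temps(years=50, trend_per_year=0.03, noise_sd=3.0, seed=42):
    """Synthetic daily temperatures: a slow upward trend plus big random daily swings.

    Every parameter here is a made-up illustrative value.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    temps = []
    for day in range(years * 365):
        baseline = 15.0 + trend_per_year * (day / 365)
        temps.append(baseline + rng.gauss(0.0, noise_sd))
    return temps

def decade_means(temps):
    """Average the daily values over consecutive 10-year blocks."""
    span = 10 * 365
    return [mean(temps[i:i + span]) for i in range(0, len(temps), span)]

temps = simulate_daily_temps()
decades = decade_means(temps)

# Typical size of the jump from one day to the next.
day_to_day = mean(abs(b - a) for a, b in zip(temps, temps[1:]))

print(f"Typical day-to-day swing: {day_to_day:.1f} degrees")
print("Decade averages:", [round(d, 2) for d in decades])
```

In this toy setup, tomorrow's number is dominated by the random swing, so guessing it is hard. The decade averages, on the other hand, smooth the swings out and the slow trend shows through, even though it is far smaller than any single day's wobble.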
For more information on how people misunderstand and abuse statistics, read How to Lie With Statistics.
* Note: Of course, that success rate depends on the ranges of temperatures and weather conditions you count as a hit or a miss. The forecast for the coming weekend is going to change a lot between Monday and Friday. However, three-day forecasts are currently pretty accurate most of the time, particularly if you look at aviation charts. The moral of the story: three-day forecasts are really useful to plan your weekend, and mostly accurate. Beyond that, anything can happen because it’s a chaotic system.

