17 - Statistics and Probability
The field of statistics, with its complex equations and mystical symbols, seems unfathomable to many laymen. As such, there is a tendency to disregard it entirely as effete sophistry, or to surrender entirely to it as a kind of religion. In reality, it deserves neither.
Statistics is merely a mathematical approach to probability, attempting to assign numerical values to the chances that a hypothesis is true. Its strength is in quantifying things, and its weaknesses are in ignoring things that cannot be quantified, failing to include factors, misidentifying factors, and misidentifying causal relationships.
As such, it can be a useful tool, if applied according to the dictates of logic, but should never be substituted for logic.
Evaluating Surveys and Sampling Studies
To assess the validity and applicability of a statistical study, the author proposes some basic questions:
What Exactly is the Finding?
The main conclusion should be questioned: what hypothesis the study meant to investigate, how the key terms were defined, and how the argument was formulated. In essence, the study represents a conclusion based on premises, and raises the same questions as any argument.
Since statistics is mathematical, its results should be expressed mathematically. A study may be interpreted to "prove" that sleeping more makes your life shorter, implying causation, but the actual results might show only that people who sleep more than eight hours have a higher death rate than those who sleep less than eight. That is a correlation, not a demonstration of causation, and confusing the two is a common failing in the use of statistics.
Definitions and key concepts are also significant: a survey might suggest 27% of students are Christians, but what does "Christian" mean in this context? Is it just a matter of claiming to be so, or does it involve regular church attendance? Are Mormons and Catholics considered to be Christians for the purposes of the study?
How Large is the Sample?
Small sample size is another common weakness of statistical studies: generally, the fewer the subjects, the less accurate the results are considered to be. If you ask a sample of one person, that person's responses are not likely to be representative of the population. If you ask one hundred, you will be better able to assess the full range of opinions.
Many studies, especially in the commercial sector, sacrifice accuracy for expediency and cost. Even in the academic sector, smaller samples are often used in preliminary or exploratory studies to determine whether it is worthwhile to spend the time and money to engage a larger sample, yet the findings of these studies are often reported, even when those who conducted them decided they were not valid or relevant enough to warrant further investigation.
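The effect of sample size on accuracy can be illustrated with a small simulation. The sketch below is not from the book: it assumes a hypothetical population in which 27% of people would answer "yes" to a question, runs many imaginary surveys of each size, and measures how widely the survey results scatter. The function name, the number of trials, and the fixed random seed are all illustrative choices.

```python
import random
import statistics

def spread_of_estimates(sample_size, true_share=0.27, trials=500, seed=0):
    """Standard deviation of the estimated share across repeated surveys."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        # Each respondent answers "yes" with probability true_share.
        yes = sum(rng.random() < true_share for _ in range(sample_size))
        estimates.append(yes / sample_size)
    return statistics.pstdev(estimates)

# Larger samples should produce estimates that cluster more tightly.
for n in (1, 10, 100, 1000):
    print(n, round(spread_of_estimates(n), 3))
```

Under these assumptions the scatter shrinks as the sample grows, though with diminishing returns: each tenfold increase in sample size narrows the spread by only about a factor of three.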
How is the Sample Chosen?
Sample selection greatly affects the reliability of conclusions. If you were to ask the opinions of attendees at a gun show about the ethics of hunting, you would get completely different answers than you would by asking the same question at a political rally for environmentalists. Many academic studies are significantly biased by their use of student samples: students are convenient to the researchers, but their opinions reflect a very specific demographic.
As such we should inquire as to the sampling method of a survey, and be extremely circumspect as to how the sample matches the characteristics of the population about which we seek to draw conclusions.
Some praise is given to random sampling as a method of overcoming bias, but this is a misperception. It does indicate that the biases of the researchers did not influence the sample, but it does not mean the sample was not biased: a random sample might happen to select a biased audience.
What Method is used to Investigate the Sample?
There are a number of ways in which the design of the survey instrument can bias the outcome of a study. Consider these:
- Denial. If you ask students whether they have cheated on exams or taken drugs, they will be inclined to deny both behaviors, leading to a false conclusion. Anonymity of a survey might decrease this tendency, but does not eliminate it, as there are many things a person would choose to deny even if no one knew their identity.
- Leading questions. A question may intentionally or unintentionally be phrased in a way that skews the response. Asking "would you give your children a pill to improve their health?" will solicit more positive responses than "would you give your children a pill?" because the former question implies that you do not value your children's health unless you are willing to give them the pill.
- Observer effect. People may change their answers depending on who is asking them. Particularly among college males, an attractive young female proctor will get different responses than an unattractive older male proctor.
To the designer of a survey experiment, it is important to avoid biases such as these inasmuch as possible. But when the results are interpreted, it is equally important to ensure, rather than assume, that care was taken to do so.
Margin of Error?
Many statistical surveys include a number to indicate their margin of error (and you should be suspicious of one that seems to wish to hide this). It is very important for interpreting the results, but a bit difficult to grasp.
Essentially, the margin of error arises as an effect of sample size. If you wanted to present the average weight of a Korean, and you weighed every single person in Korea, your result would carry no sampling error at all. If you weighed only one Korean, your result could be far from the true average. This goes back to the principle that larger sample sizes produce more accurate results, but also considers the size of the sample relative to the size of the population.
The margin of error is expressed in several ways: it may be reported as a confidence level (95% confidence in the accuracy), as a deviation (plus or minus five pounds), or as both (a 95% confidence level, plus or minus five pounds). Generally speaking, surveys seek the 95% confidence level, but this demonstrates adherence to practice rather than conformity to reality. The margin is often used comparatively: if two studies had slightly different results, but their stated margins put the outcomes in an overlapping range, the difference between them may not be meaningful.
If the margin of error is unknown (or hidden) we do not know how much trust we can place in the reported results. With a small sample size and a large margin of error, it may be entirely inaccurate. With a large sample size and a small margin of error, the opposite is true.
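The link between sample size and margin of error can be made concrete with the textbook approximation for a surveyed percentage. The sketch below is an illustration rather than anything given in the book. It assumes a simple random sample drawn from a much larger population and uses the conventional multiplier of 1.96 for a 95% confidence level. The function name is hypothetical, and the formula applies to proportions (such as "27% of students"), not to averages like the weight example.

```python
import math

def margin_of_error(sample_size, proportion=0.5, z=1.96):
    """Approximate margin of error for a surveyed proportion.

    Assumes a simple random sample from a much larger population;
    z=1.96 corresponds to the conventional 95% confidence level.
    proportion=0.5 is the worst case (widest margin).
    """
    return z * math.sqrt(proportion * (1 - proportion) / sample_size)

for n in (10, 100, 1000, 10000):
    points = margin_of_error(n) * 100
    print(f"n={n:>5}: plus or minus {points:.1f} percentage points")
```

Under these assumptions, a survey of one hundred people carries a margin of nearly ten percentage points, which is why a small study reporting a difference of a few points deserves caution.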
Finally, note that the margin of error represents the mathematical accuracy of calculations, not the strength or validity of the underlying logic, nor does it account for biases or methodological errors. Essentially, the margin of error declares "our math is good."
Sidebar: Statistics and advertisements
Because statistical studies can be conducted in a way that skews the results, they are often leveraged by those who wish to prove a desired outcome rather than seek the truth, and who essentially make mistakes on purpose in order to achieve it.
Particularly when a statistic is presented as a reason you should do something that is beneficial to another person (buy a product from them, give your vote to their candidate), you should be very cautious about their methodology.
Absolute Vs. Relative Quantity
There's a brief mention of absolute versus relative quantities, absolute being a specific number (the amount of revenue for one firm, the number of female professors at a given university) and relative quantities being about ratios and percentages (a firm's share of market or the ratio of female to male professors).
Sometimes relative numbers are more meaningful: a country may have a smaller GNP but a higher per-capita income, a city may have a decreasing incidence of crime because the population has fallen, etc. In other instances, percentages and ratios can be misleading because of proportions: eating wild salmon may "double" your risk of getting a disease if it occurs in 0.0000000002% of people who have consumed it and 0.0000000001% of those who have not.
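The salmon example can be worked through numerically to show how far apart the two framings are. The sketch below takes the two percentages from the text; the function name and the "per billion people" framing are illustrative additions.

```python
def compare_risks(risk_exposed, risk_unexposed):
    """Return (relative risk, absolute risk difference)."""
    relative = risk_exposed / risk_unexposed
    absolute = risk_exposed - risk_unexposed
    return relative, absolute

# The salmon figures from the text, converted from percentages to fractions.
exposed = 0.0000000002 / 100
unexposed = 0.0000000001 / 100

relative, absolute = compare_risks(exposed, unexposed)
print(f"relative risk: {relative:.1f}x")
print(f"absolute increase: {absolute:.1e}")
print(f"extra cases per billion people: {absolute * 1e9:.3f}")
```

The relative figure ("double the risk") and the absolute figure (roughly one extra case per trillion people) describe the same data, which is why disclosing both matters.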
Journalists and advertisers tend to prefer the approach that gives them the most sensational numbers, as do people who are very interested in winning an argument or proving their claims correct rather than describing the truth.
To be ethical in your argument, use the absolute or relative numbers that are most relevant to the conclusion you are investigating, and be sure to disclose both sets of figures. Likewise, to expose a flawed or unethical argument, ask for both sets of figures if they have not been disclosed.
Probability
Absolute certainty is not only unrealistic, but unreal. We act and lay plans based on assumptions of what is likely or unlikely to happen, with a level of certainty that decreases with the scale and time of our predictions. While it is beyond the scope of the present book to discuss the mathematics of probability, the author does wish to consider some common reasoning mistakes.
Gambler's Fallacy
The gambler's fallacy is the belief that the probability of an event will be influenced by recent occurrences: a coin that has come up heads ten times in a row is more likely to come up tails because things will "balance out" in the long term - or, conversely, that lucky "streaks" occur that make a person who has just won more likely to win on their next bet.
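A short simulation can test the belief directly. The sketch below is an illustration, not part of the book: it assumes a fair coin with independent flips, and checks how often heads comes up on the flip immediately after a run of heads. The function name, the streak length of three, and the fixed random seed are arbitrary choices.

```python
import random

def heads_rate_after_streak(streak=3, flips=200_000, seed=0):
    """Share of heads on the flip that follows `streak` or more heads in a row."""
    rng = random.Random(seed)
    run = 0          # length of the current run of heads
    followed = 0     # flips that came right after a full streak
    heads_after = 0  # how many of those flips were heads
    for _ in range(flips):
        is_heads = rng.random() < 0.5
        if run >= streak:
            followed += 1
            heads_after += is_heads
        run = run + 1 if is_heads else 0
    return heads_after / followed

print(round(heads_rate_after_streak(), 3))
```

If streaks "balanced out," the printed rate would fall well below one half; if luck ran in streaks, it would rise well above. For independent flips it stays close to one half either way.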
Regression Fallacy
Regression fallacy is likewise based on trends and repetitions.
One example is the belief that the stock market will go down after a sharp increase or jump back up after a sharp decrease - observing past patterns we see this occur and perceive that there are natural fluctuations - not recognizing (or perhaps pointedly ignoring) that this is not always so.
Another example is that we are generally aware that a headache will go away on its own, but if the pain becomes intense and we take a pill, we assume the pill caused the pain to dissipate faster than it would have if we hadn't taken the pill.
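The headache example can be sketched as a simulation in which nobody takes a pill at all. This is an illustration under assumed numbers, not a model from the book: each hypothetical person has a stable baseline pain level, and each day's reading is that baseline plus random day-to-day variation.

```python
import random
import statistics

rng = random.Random(0)

# Hypothetical model: a stable baseline per person plus daily noise.
baselines = [rng.gauss(5, 1) for _ in range(10_000)]
day_one = [b + rng.gauss(0, 2) for b in baselines]
day_two = [b + rng.gauss(0, 2) for b in baselines]

# Select the 1,000 people whose pain was worst on day one
# (the ones most likely to reach for a pill).
worst = sorted(range(len(baselines)), key=lambda i: day_one[i], reverse=True)[:1000]

mean_day_one = statistics.mean(day_one[i] for i in worst)
mean_day_two = statistics.mean(day_two[i] for i in worst)
print(round(mean_day_one, 2), round(mean_day_two, 2))
```

Under these assumptions the worst sufferers on day one report noticeably less pain on day two with no treatment whatsoever, simply because an extreme reading is partly bad luck that is unlikely to repeat. Anyone who took a pill in between would be tempted to credit the pill.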
There are also superstitions that arise from perception, such as things going from "bad to worse," in that if something unfortunate or unpleasant happens, it begins a cycle of bad luck. In reality, misfortune puts us in a state of mind to be sensitive to additional misfortune while ignoring good fortune.
Amazing Coincidences
For whatever reason, the author tosses in the phenomenon of amazing coincidences, which are remarkable stories about highly unusual events, often coupled with the suggestion that there is some cosmic force at work to make bizarre things happen, and the counsel that we should not discard highly implausible theories.
It is useful to recall "Littlewood's Law," named for a professor who suggested that it is inevitable for rare things to happen, just by virtue of the largeness of the world. If there is a one-in-a-billion chance of something happening to a person today, then in a world of roughly seven billion people it can be expected to happen to about seven people every single day.
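The arithmetic behind this can be checked in a few lines. The sketch below is illustrative: it assumes a world population of seven billion (the figure implied by the text) and that the event strikes each person independently. The function names are hypothetical.

```python
import math

def expected_occurrences(chance_per_person, population):
    """Expected number of people who experience the event in one day."""
    return chance_per_person * population

def chance_of_at_least_one(chance_per_person, population):
    """Probability that the event happens to at least one person."""
    # Equivalent to 1 - (1 - p) ** n; log1p/expm1 keep precision for tiny p.
    return -math.expm1(population * math.log1p(-chance_per_person))

population = 7_000_000_000  # assumed world population implied by the text
print(expected_occurrences(1e-9, population))
print(chance_of_at_least_one(1e-9, population))
```

Under these assumptions the expected count is about seven per day, and the chance that the "one-in-a-billion" event happens to nobody at all on a given day is under one in a thousand.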
There is also the tendency of people to look for evidence after the fact, even if they have to manufacture it. This is particularly true in the modern age, where there is a great deal of data and many people have access to the tools to parse through it.
Consider the prophecies of Nostradamus and the hidden messages people find by analyzing the Bible - working over the data long enough enables you to force meaning upon it. To disprove the mystical claims, people have shown the same thing can be done with any work of sufficient length, including Moby Dick or the Yellow Pages.