A favorite maxim of skeptics, right up there with the plural of anecdote not being data, is that correlation does not prove causation. Many times it does not even imply causation. To illustrate this point, there is a graph showing an almost identical correlation between organic food sales and autism, and another chart that demonstrates the stunning consistency of string cheese sales and persons dying while getting out of bed.
While few people would insist that nighttime provolone cravings are fatal, causation is often erroneously inferred in less obvious instances. Consider an example from the classroom. In general, student grades go down the farther back one sits. Yet some classrooms are arranged with the desks forming a circle and the teacher in the middle, so if distance from the teacher caused the grades, everyone would score the same. So sticking a D student front and center will not put him or her on the honor roll. The primary relationship between seating location and grades is that the more studious students want to be near the teacher and visual aids, while their more indifferent counterparts prefer to be out of view, in order to pass notes in days of yore, and to send text messages today.
The pairing of string cheese and bedtime deaths is only one example of unrelated items that produce mirror-image data. Another centers on ice cream sales and the commission of violent crimes. Overlaid graphs show almost identical peaks and valleys for the two. This is not because Rocky Road engenders Road Rage, despite the name similarity. Rather, violent criminals, like other people, get out much more often when it’s warm, which is also when folks buy most of their cool confectioneries.
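The ice cream and crime pattern can be reproduced with made-up numbers. The sketch below is an illustration only: every figure is simulated, the coefficients are arbitrary assumptions, and `pearson` and `residuals` are helper names invented for this example. Temperature drives both series, and neither series affects the other.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def residuals(ys, xs):
    """What is left of ys after removing its straight-line trend on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]

rng = random.Random(0)  # fixed seed so the simulation is repeatable

# Simulated daily temperatures (assumed range, not real weather data).
temps = [rng.uniform(0, 35) for _ in range(2000)]

# Both series depend on temperature plus independent noise.
# Neither one appears in the formula for the other.
ice_cream = [2.0 * t + rng.gauss(0, 10) for t in temps]
crime = [0.5 * t + rng.gauss(0, 3) for t in temps]

raw = pearson(ice_cream, crime)
adjusted = pearson(residuals(ice_cream, temps), residuals(crime, temps))

print(f"raw correlation:            {raw:.2f}")
print(f"after removing temperature: {adjusted:.2f}")
```

In this toy setup the raw correlation is strong, yet it largely vanishes once the shared dependence on temperature is subtracted out, which is what one would expect if warm weather is doing the work.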
Most people would understand this, perhaps even intuitively. But less obvious correlation/causation errors are regular features in online news sites and in links shared by members of your social media circle. They take the form of, “Persons who drink tea daily are half as likely to catch cold,” “Young professionals who have goals in writing will accumulate 10 times the net worth of those who don’t,” and, “Excellent grades in high school will lead to better health as adults.”
Putting goals in writing indicates drive, organization, and planning, three traits frequently seen among successful professionals. As to the supposed link between grades and health, persons with strong high school grades have, in general, more resources than those with lower marks. They have the latest gizmos and gadgets, may attend private schools, have access to high-quality tutors, and enjoy the type of health care that leaves them less vulnerable to a lengthy illness that would keep them from school for a long stretch. This same affluence will later allow them access to healthy food, gym memberships, and premium health care plans that result in better fitness.
But pointing out that correlation does not necessarily imply causation is only getting at half of the equation. The other half is determining when correlation and causation are interwoven. Three criteria must be met to determine this.
The first and easiest step is to verify that there is indeed a correlation. We learned earlier there is a correlation between violent crime and ice cream sales. There is no such correlation between violent crime and rap album sales, much as William Bennett and C. Delores Tucker might wish there were.
Second, for X to lead to Y, X must come first. I have seen some persons blame public school shootings on mandatory prayer being removed from these institutions. But there were 119 deadly school incidents prior to the 1962 Supreme Court ruling forbidding the practice. So without even getting into the post hoc nature of such an assertion, X could not lead to Y because Y came first.
Third, other potential causes must be ruled out. Persons who jump off the Empire State Building die. We can conclude that the jump (well, the landing, really) causes the persons’ demise because the leap precedes the death, there is a fatality rate of 100 percent, and there is no other factor in the deaths.
This example is obvious, but let’s look at how causality can be inferred from correlation in less clear instances. We will do this by focusing on necessity and sufficiency.
A condition is necessary if the effect cannot occur without it. For an unassisted triple play to occur, at least two runners must be on base with nobody out. The condition is necessary, but insufficient. The overwhelming majority of plays with two on and none out result in something other than an unassisted triple play.
A condition is sufficient if the effect always occurs when the condition is met. For example, the aforementioned Empire State Building jumpers always die. Here, the condition is sufficient for death but not necessary since there are many other ways of dying.
For a condition to be both necessary and sufficient, the effect must always occur when the condition is met, and never happen when it is not met. For example: To be married, you must have a spouse.
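These definitions can be written as two small checks over a list of observed cases. What follows is a minimal sketch: the function names and the handful of (condition, effect) pairs are invented for illustration, and a finite list of cases can only fail to contradict necessity or sufficiency, never prove it.

```python
def is_necessary(observations):
    """True if the effect never shows up without the condition."""
    return all(condition for condition, effect in observations if effect)

def is_sufficient(observations):
    """True if the effect shows up every time the condition is met."""
    return all(effect for condition, effect in observations if condition)

# Each pair is (condition_met, effect_occurred). The cases are hypothetical.

# Condition: two or more on base, none out. Effect: unassisted triple play.
triple_play = [(True, True), (True, False), (False, False)]

# Condition: jumped off the building. Effect: died.
jump = [(True, True), (False, True), (False, False)]

# Condition: has a spouse. Effect: is married.
married = [(True, True), (False, False)]

for name, cases in [("triple play", triple_play),
                    ("jump", jump),
                    ("married", married)]:
    print(name, "necessary:", is_necessary(cases),
          "sufficient:", is_sufficient(cases))
```

The triple play cases come out necessary but not sufficient, the jump cases sufficient but not necessary, and the marriage cases both.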
The most difficult case to detect is a cause that is neither necessary nor sufficient. To be a cause without these distinctions, it must be what James Randi Educational Foundation Programs Consultant Barbara Drescher calls “a non-redundant part of a sufficient condition.” To illustrate this, Drescher pondered a forest fire that resulted from a lit cigarette carelessly tossed aside.
The cigarette, of course, would not be a necessary condition for the blaze. Forest fires can also result from unattended campfires, arson, and lightning ripping a tree asunder (I woke up this morning determined to use that adverb).
Besides not being necessary to start a fire, a discarded lit cigarette is also insufficient. The initial spark would have to last long enough to combust, sufficient oxygen would be needed to fuel it, the surrounding brush, leaves, or sticks would need to be dry, the weather would need to be conducive to blazes, and no one with the desire and means to put it out could be nearby.
If all these criteria are met, the combined condition is sufficient for a forest fire to rage. For the cigarette to be a cause, though, it must still be non-redundant, meaning that nothing else in the equation can do the job of the cigarette. And, indeed, none of the other factors (combustion, weather, dryness, present oxygen, absent firefighters) does the job of the cigarette. Turning this around, the cigarette cannot do the job of any of the other factors. Oxygen is present with or without the cigarette, the weather and surroundings would be dry without it, and the cigarette neither combusts spontaneously nor causes an area to be unpopulated. Hence, the tossed tobacco product is a non-redundant part of a sufficient condition, so in this case the correlation and causation are related.
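The forest fire reasoning can be laid out as a toy model. This is a sketch under loud assumptions: the rule in `fire_starts` is a made-up simplification (any ignition source plus all five supporting factors), and the factor names and the `is_nonredundant_part` helper are invented here, not taken from Drescher.

```python
# Hypothetical factor names for the toy model.
IGNITION_SOURCES = {"cigarette", "lightning", "campfire", "arson"}
SUPPORTING = {"spark_lasts", "oxygen", "dry_brush",
              "fire_weather", "nobody_to_put_it_out"}

def fire_starts(factors):
    """Toy rule: a fire needs some ignition source and every supporting factor."""
    return bool(factors & IGNITION_SOURCES) and SUPPORTING <= factors

def is_nonredundant_part(factor, bundle, effect):
    """True if `factor` is a non-redundant part of a sufficient `bundle`."""
    bundle_is_sufficient = effect(bundle)
    # Non-redundant: take the factor away and the rest no longer does the job.
    rest_fails = not effect(bundle - {factor})
    # The factor on its own is not enough either.
    alone_fails = not effect({factor})
    return (factor in bundle and bundle_is_sufficient
            and rest_fails and alone_fails)

cigarette_bundle = SUPPORTING | {"cigarette"}

# Not necessary: lightning can stand in for the cigarette.
print(fire_starts(SUPPORTING | {"lightning"}))

# Not sufficient: the cigarette alone starts nothing.
print(fire_starts({"cigarette"}))

# Non-redundant part of a sufficient condition.
print(is_nonredundant_part("cigarette", cigarette_bundle, fire_starts))

# If lightning strikes the same spot, the cigarette becomes redundant.
print(is_nonredundant_part("cigarette",
                           cigarette_bundle | {"lightning"}, fire_starts))
```

Under this toy rule the same check also passes for oxygen or dry brush, which mirrors the point that the cigarette cannot do the job of the other factors any more than they can do its job.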