Of Correlations, Causations and the Divide Therein – Part Deux

In the first part of this post, I mentioned the important maxim in science, “Correlation does not imply causation”, providing a glimpse of its logical framework, and discussing how the scientific method is utilized to establish causality in observed relationships between/amongst variables.

And what happens when scientists, study authors, or investigators ignore this prime maxim?

Well, as a casual search would show, the internet is replete with examples of spurious relationships, where either it was mistakenly assumed that evidence of correlation implied a causal relationship (where none existed), or the effects of some other variable(s) were ignored/disregarded, thereby leading to often implausible, ridiculous, and frankly humorous conclusions. Here is a small sampling of such conclusions gleaned from the internet (Caveat lector: some of these may be apocryphal; however, they do illustrate the point).

  1. Sleeping with one’s shoes on is strongly correlated with waking up with a headache. Therefore, sleeping with one’s shoes on causes headaches.
  2. As ice-cream sales increase, the rate of drowning deaths increases sharply. Therefore, ice-cream causes drowning.
  3. Examination of the records of the Netherlands for many years revealed a strong positive correlation between (i) the annual number of storks, and (ii) the annual number of human babies born. Therefore, storks bring babies, since the opposite choice, neonates gathering a flock of storks, is unlikely.
  4. Over the same period of time in history, the number of pirates has decreased, and there has been an increase in global warming. Therefore, global warming is caused by a lack of pirates – a central tenet of Pastafarianism, the religion (N.B. those who are not familiar with this great religion, quickly check out the link!).

These examples, found in Wikipedia (see here and here), illustrate the erroneous conclusions reached by ignoring the plausible effects of a third variable, or the possibility of a coincidence:

  • In 1, a state of extreme inebriation may be responsible for the inability to remove one’s shoes, and the hangover the next morning, leading to the headache.
  • In 2, the warm months of summer may be instrumental in goading people to cool off either by eating ice-cream or by taking a swim (correspondingly increasing the chance of drowning) or both.
  • In 3, the relationship may be quite complex; the storks arrived at the onset of winter and established nests in chimneys and farm outbuildings – therefore, gathering more in rural areas than in cities. This may have been because of the presence of structures conducive to nesting, a cleaner, less-polluted rural environment, and/or the availability of more food and water in winter. For reasons completely unrelated to storks or any other bird, rural families tend to have more children than urban families; in addition, many babies are born in the spring in rural areas because of human (ahem!) behavior during the cold short winter days and long winter nights. Or, these two events may have been completely unrelated, a coincidence. (Note: For a humorous take on the Stork-Baby theory, check out this faux article (PDF) from Germany.)
  • In 4, we have another example (albeit made-up) of coincidence.
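To see how a third variable can manufacture a correlation all by itself, here is a minimal simulation of the ice-cream example. Everything in it is made up for illustration: the numbers, the variable names (`temperature`, `ice_cream`, `drownings`), and the assumption that both quantities depend linearly on temperature plus random noise. It is a sketch of the idea, not an analysis of any real data.

```python
import random
from math import sqrt


def mean(values):
    return sum(values) / len(values)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def residuals(ys, xs):
    """What is left of ys after removing its least-squares linear fit on xs."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    return [(y - my) - slope * (x - mx) for x, y in zip(xs, ys)]


rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical daily data: temperature drives both quantities,
# and neither quantity has any effect on the other.
temperature = [rng.uniform(0, 35) for _ in range(5000)]
ice_cream = [2.0 * t + rng.gauss(0, 10) for t in temperature]
drownings = [0.3 * t + rng.gauss(0, 2) for t in temperature]

# Correlation between the two "effects", ignoring temperature.
raw_r = pearson(ice_cream, drownings)

# Correlation once the linear influence of temperature is removed from both.
partial_r = pearson(residuals(ice_cream, temperature),
                    residuals(drownings, temperature))

print(f"raw correlation: {raw_r:.2f}")
print(f"after removing temperature: {partial_r:.2f}")
```

With these made-up parameters, the raw correlation between ice-cream sales and drownings comes out strong, while the correlation that remains after accounting for temperature hovers near zero. Of course, this trick only works here because we already know what the third variable is; in real data, one seldom has that luxury.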

Not to belabor the point, these are precisely the considerations that should give pause to investigators studying relationships between variables. However, oftentimes, the zeal of the investigator(s) in trying to find the postulated relationship becomes a hindrance to the objective assessment thereof. The following few examples illustrate how.

I have already talked about the poorly-analyzed study that purportedly showed an association between religiosity and prolonged survival following liver transplant. Other examples include the following studies:

  1. An older (1999) study published in Nature (no less!) falls squarely in this group. From the associations, the study opined that myopia in young people was caused by exposure to ambient night lights that were left on in their rooms. This assertion was later soundly refuted by two groups from Ohio and Boston, who found (a) no such association between pediatric myopia and ambient night lighting, and more importantly, (b) an association between parental and filial myopia, lending credence to the idea that myopic parents are more likely to leave the lights on in their children’s rooms.
  2. In an interesting report in Guardian Science, Matt Parker, a mathematician with the Maths Department at Queen Mary College of the University of London, looked up publicly available data on the number of mobile phone masts (Cell phone towers) in each county across the UK, and matched it against the live birth data for the same counties, finding an extremely strong and statistically significant correlation between the two numbers. As an intellectual exercise, Matt has released his findings publicly, stating that although it is a correlation-only finding, with no evidence of causality, he is curious to find out if others interpret a causal connection from it, given the mobile-phone health scare hysteria that arises from time to time despite a resounding lack of evidence.
  3. Living near Freeways is associated with autism, concluded the recent CHARGE study, which aims to uncover genetic and environmental links to autism.

The CHARGE study observed that even after adjusting for co-variates, such as maternal smoking and socio-economic and demographic factors, maternal residence during the 3rd trimester, as well as at the time of delivery, was more likely to be near a Freeway (but not any other major road) for mothers giving birth to autistic babies. The authors speculate that proximity to the Freeway may be a surrogate for exposure to traffic-related air pollution, which is known to have adverse prenatal effects, and call for a systematic examination of the possible association of the air pollutants with autism. However, unless that latter link is established, the assertion of the link between Freeways and autism is at best premature, and bound to increase confusion (and panic) in the interim, without answering several pertinent questions, such as:

  1. Why were Freeways ‘bad’ and not other major roads?
  2. What about autism incidence and prevalence data from other countries where people live close to busy major roadways equivalent to Freeways?
  3. Are there autism clusters around US cities that have/had high levels of urban air-pollution?
  4. What about autism incidence and prevalence data in rural US communities?

Let’s all take a deep breath and sing in chorus now: Correlation does not imply causation. If there is a true causal relationship between variables, rigorous application of the scientific method remains by far the best way to uncover it.

2 Comments

  1. Nice post on an important issue. However, as someone who works almost entirely on observational data, I’d just add that patterns of correlation between multiple variables can certainly rule out certain putative causes, and in some circumstances can provide pretty compelling evidence of causation. The key is not simply to accept a correlation, but to explore its implications. Bill Shipley’s book, Cause and Correlation in Biology, is very good on the philosophical issues associated with cause and correlation (including the historical figures, such as Pearson and Fisher, who’ve shaped our current views on the topic), and also is a primer on structural equation modelling / path analysis, which is a technique to establish cause from correlative data.

  2. patterns of correlation between multiple variables … in some circumstances can provide pretty compelling evidence of causation

    If there is a causal relationship, there will be correlation – otherwise, causality cannot be inferred. But – I think you’d agree – the mere presence of correlation should not lead to the presupposition of causality, which needs to be established empirically. I don’t know about the standards of proof in the physical sciences, but generally in biological sciences, causal relationships have plenty of corroborative evidence.

    The key is not simply to accept a correlation, but to explore its implications.

    I think you’ve nailed it with this statement, Tom.
