Monday, October 25, 2010

Causation, Correlation, Religion... where does Medicine stand?


Is a statistically significant correlation between two seemingly unrelated things worrisome? Not really. There might be many common causes, or it might happen just by random chance – at the usual 5% significance level, that IS expected to happen about 5% of the time even when no relationship exists. And all this assumes that the statistical analysis itself is accurate.

But really, unless you do a proper causal statistical analysis, you should steer clear of the temptation to make 'this causes that' kinds of interpretations. And while doing so, make sure to summon all available knowledge about the variables, and be aware of your assumptions.

That, sadly, is a practice oft violated by authors of applied statistics papers, who frequently use statistics to demonstrate what they already 'know at heart' to be true – basically using statistics 'as a drunken man uses lamp-posts – for support rather than illumination'. I will talk about my dismay at an article brought to my notice recently which, among multitudes of others, is a clear case of statistics being malpracticed (just like me verbing the noun, and doing it again) or misused.

(1)

I recently stopped by a talk organized by a Catholic student group at Harvard, not knowing that it was about 'miracles' that are 'unexplained' by medical knowledge. The talk, publicized as 'sure to appeal to anyone interested in the dynamic interplay between science and religion', did not seem to me to be so much about the said interplay as a descriptive advertisement, if I politely avoid the word propaganda.

There, as a closing comment, this recent paper was shown to argue that religion might indeed have an underlying beneficial effect on patients. That is a serious causal claim to be induced from an observational statistics paper whose title simply claims that the authors see “religiosity associated with prolonged survival in liver transplant recipients”.

Nevertheless, willing to be disappointed twice in a day, I looked through the said paper in search of something interesting. And to my utter surprise, they themselves actually make the causal claim! The reason is obvious, though – their belief in religion strongly enables them to extend the statistical association in a poorly analyzed study into a veritable statement on the happenings of nature. They might well have come across the book that says “Insight is not the same as scientific deduction, but even at that it may be more reliable than statistics.”

(1.1)
Poorly analyzed study? Yes. Speaking of the paper ‘Religiosity Associated with Prolonged Survival in Liver Transplant Recipients’ (Liver Transplantation, 16: 1158-1163, 2010; PMID: 20818656) by Bonaguidi et al., I have quite a few statistical concerns; my doctor friend discusses concerns from his side here. Let me mention my strongest ones.

My foremost concern is their unbelievable choice of endpoint. “The only endpoint of the study was patient survival, regardless of the cause of any deaths.” What?

First, the wording is sloppy: an endpoint is defined as an event that takes a patient out of the study, such as death; you don’t call survival an endpoint. But we understand – death was their endpoint.

Even then, when you follow patients after a treatment, you absolutely have to distinguish at least two, preferably three, types of endings: death from a physiological cause (illness or complication) related to the treatment performed, death from a physiological cause not directly attributable to (but perhaps indirectly related to) the treatment, and death from an external, unrelated cause (such as murder or a car accident). Only the first, or the first and second, should be treated as actual clinical endpoints for the survival analysis; deaths from external causes should be treated as censoring. Taking every type of death as an endpoint is certainly a bad choice, particularly when distinguishing these categories should be rather easy.
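In survival-analysis terms, this distinction is nothing more than the event indicator attached to each follow-up time. A minimal sketch of how the data should be coded (the patient records and cause labels below are invented purely for illustration):

```python
# Each patient: (follow-up time in months, cause of death, or None if alive).
# Hypothetical records for illustration only.
records = [
    (14, "treatment-related"),    # clearly an event
    (22, "other-physiological"),  # arguably an event too
    (9,  "car accident"),         # external cause -> treat as censored
    (36, None),                   # alive at end of follow-up -> censored
]

# Causes we choose to count as clinical endpoints (an analysis decision):
EVENT_CAUSES = {"treatment-related", "other-physiological"}

durations = [t for t, _ in records]
events = [1 if cause in EVENT_CAUSES else 0 for _, cause in records]

print(durations)  # [14, 22, 9, 36]
print(events)     # [1, 1, 0, 0] -- external death and survivor are censored
```

The point is that the `(duration, event)` pairs, not the raw death counts, are what a Kaplan-Meier or Cox analysis actually consumes, so this coding decision silently shapes every downstream result.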

(1.2)
The next concern is the issue of censoring. They say, “None of the enrolled patients were lost to follow-up.” That means no censoring during the study period. And since their follow-up plan was to follow everybody for at least 36 months, we would expect that until 36 months from the liver transplant, everybody should be under follow-up and therefore ‘at risk’ unless they die and thereby leave the study.

But, according to the data presented at the bottom of Figure 1, only 72+56=128 people are at risk after 12 months out of 89+90=179 patients. Where did the other 51 people go? According to them, only 18 patients died during the entire follow-up. Then? By the end of 36 months, only 48+30=78 patients remain at risk; what about the remaining 101 people?
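The arithmetic is easy to check. Using the numbers quoted above from the paper's Figure 1:

```python
enrolled = 89 + 90      # the two study groups
at_risk_12m = 72 + 56   # numbers at risk at 12 months, per Figure 1
at_risk_36m = 48 + 30   # numbers at risk at 36 months
deaths_total = 18       # total deaths reported over the whole follow-up

missing_12m = enrolled - at_risk_12m
missing_36m = enrolled - at_risk_36m

print(missing_12m)  # 51 people unaccounted for at 12 months
print(missing_36m)  # 101 people unaccounted for at 36 months

# Even in the extreme case where all 18 deaths happened before 36 months,
# at least this many patients must have been censored by then:
print(missing_36m - deaths_total)  # 83
```

So the claim of "no loss to follow-up" is flatly contradicted by the paper's own at-risk table: dozens of patients left the risk set without dying.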

It seems there must have been loss to follow-up. The median follow-up is reported to be 21 months, which is impossible with only 18 deaths out of 179 patients if there was no censoring.

And that raises an important question: standard survival analysis must assume that censoring is noninformative, i.e., that it happens at random, with a patient’s status not affecting his or her probability of dropping out. This is an untestable assumption and therefore has to be considered very carefully. Has it been considered in any way? I think not, because they never even mention the issue. And if censoring is biased in any way – which can never be ruled out, given the many possible underlying factors – there is a good chance the results would change, especially since the confidence intervals are so wide and, when significant, often have endpoints so close to 1 – in the range of 1.02, 1.07, and so on.
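A toy illustration of why informative censoring matters. Suppose six patients would die at months 2, 4, 6, 8, 10, 12, so the true chance of surviving beyond month 6 is 3/6 = 0.5. If the two sickest patients instead drop out at month 1 (censoring driven by patient status), the Kaplan-Meier estimate becomes inflated. A minimal hand-rolled sketch, with invented data and a plain-Python estimator:

```python
def km_survival(data, t):
    """Kaplan-Meier estimate of P(survive beyond t).
    data: list of (time, event) pairs; event=1 is death, event=0 is censored."""
    s = 1.0
    for td in sorted({time for time, e in data if e == 1}):
        if td > t:
            break
        n_at_risk = sum(1 for time, _ in data if time >= td)
        deaths = sum(1 for time, e in data if time == td and e == 1)
        s *= 1 - deaths / n_at_risk
    return s

# All six deaths observed: the estimator recovers the truth.
full = [(2, 1), (4, 1), (6, 1), (8, 1), (10, 1), (12, 1)]
# Informative censoring: the two earliest deaths leave the study at month 1.
biased = [(1, 0), (1, 0), (6, 1), (8, 1), (10, 1), (12, 1)]

print(km_survival(full, 6))    # ~0.5  (the truth)
print(km_survival(biased, 6))  # 0.75  (survival overestimated)
```

The Kaplan-Meier machinery itself has no way of detecting this; it trusts the noninformative-censoring assumption, which is exactly why a paper that never mentions the assumption is worrying.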

On a similar note, they use a Cox proportional hazards model; have they thought about any of its assumptions? Did they check whether proportional hazards even holds? Did they check for possible interaction terms? Nothing, it seems.

(1.3)
I do not have anything personal against this or any study that shows some measure of religiousness to be associated with patient survival. It might be; that is not statistically impossible even if religiousness doesn’t actually play a role. But the point of concern is that the paper pushes hard to go beyond the permissible conclusions of an observational study to espouse the hypothesis that religiosity improves the survival of patients with end-stage liver disease who have undergone orthotopic liver transplantation. (emphasis added)

No, no, no! This is a standard statistical study based on observational data – neither a designed experiment nor a causal statistical inference. Correlation is not causation. That is something they keep passively in mind in their second paragraph, while referring to some other studies showing that people with unhealthy habits do not go to religious services and (they avoid writing ‘and therefore’ here) die earlier.

But in analyzing their study, they seem all too eager to champion the ‘religiousness impacts survival’ theory. This is prominent in their statement that “we are inclined to believe that the relationship between religiosity and prognosis is not one of mere association; instead, faith as a way of coping is a real resource for seriously ill patients and helps to improve their prognosis.” With such strong beliefs, who needs causal inference?

(1.4)
Such a tendency to put religious interpretations into the mouth of simple data is visible over and over in their extremely long discussion section, where they rarely discuss the findings of their statistical analysis but instead go philosophical, discussing religious feelings and ideas and how other studies have reached similar conclusions.

To quote an example, “Nevertheless, the psychological evaluation provided evidence that the relationship of the patients with God was primarily intimate and private in nature and was experienced with a religious sentiment derived from their cultural context.” (emphasis added)

Evidence? What evidence? Nowhere in the findings was there anything statistical to indicate whether the patients’ ‘relationship with God’ is ‘intimate and private’ or public. It is purely philosophical to take a survey statement like “I sought God’s help in dealing with the situation” and call it a private relationship, and so forth. Were the patients asked whether they consider this relationship public or private? And where does ‘cultural context’ come into the study, never being defined or measured in any way?


It is not bad to start your experiment with a gut feeling that your study might establish something – that is usually how the idea of a study arises in the first place – but making interpretations based on such pre-fabricated beliefs, going beyond what statistical methodology permits, is not desirable in scientific literature. It has immense potential to be used – and is already being used – as a tool for religious lobbies and groups to claim that religion still has the upper hand over science in the end.

(2)

Survival analysis is typically a bit more complicated to handle than the ‘statistics for dummies’ type of tests, such as the t-test or ANOVA. Another good example to discuss is ‘Elvis to Eminem: quantifying the price of fame through early mortality of European and North American rock and pop stars’ (J Epidemiol Community Health. 2007 Oct; 61(10):896-901, PMID: 17873227) by Bellis et al.

There, on page 898, they mention that Figure 2 contains Kaplan-Meier survival curves for pop star survival. Their language is sloppy, though: they say those survival curves are ‘plotted against’ general population curves, which might seem to mean x-versus-y plotting. Thankfully, it is not; a look at Figure 2 shows that both sets of curves – the pop star survival curves and the general population curves – were merely plotted on the same axes.

Now the big thing – their survival curves go up and down over time!

For those unfamiliar with Kaplan-Meier curves: a survival curve at time t estimates the probability of surviving beyond time t, and therefore it can never increase, by definition. If your chance of surviving beyond age 50 is 40%, your chance of surviving beyond age 70 cannot rise to 60%.
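Monotonicity is the cheapest possible sanity check on a published survival curve. A reviewer (or author) could have run something like this on the plotted curve values (the numbers below are invented for illustration):

```python
def is_valid_survival_curve(values):
    """A survival curve must stay within [0, 1] and never increase."""
    in_range = all(0.0 <= v <= 1.0 for v in values)
    non_increasing = all(a >= b for a, b in zip(values, values[1:]))
    return in_range and non_increasing

print(is_valid_survival_curve([1.0, 0.9, 0.75, 0.75, 0.6]))  # True
print(is_valid_survival_curve([1.0, 0.8, 0.85, 0.7]))        # False: it goes up
```

Anything that fails a two-line check like this cannot be a Kaplan-Meier estimate, whatever else it might be.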


That this basic fact escaped the authors and, more importantly, the reviewers is a small demonstration of the relative importance and awareness bestowed on proper statistical methods in the applied fields, medicine in particular. With the objective of establishing some medical principle, and without thorough statistical training, many unknowingly engage in what is aptly described as ‘torture the numbers and they’ll confess to anything’.

Remember the old saying, “Statistics can be made to prove anything - even the truth”? And truth is all we seek – some, in God; some, in science. Let there be light.