Upload
others
View
3
Download
0
Embed Size (px)
Citation preview
Causality and causalinference in survivalanalysis
NORBIS course, Oslo
Friday, December 13, 2019
Odd O. Aalen
Dept. of Biostatistics,
Institute of Basic Medical Sciences
University of Oslo2
New England Journal of Medicine,
Editorial, Jan. 6, 2000, p. 42-49 The eleven most important developments in
medicine in the past millennium Elucidation of human anatomy and physiology
Discovery of cells and their substructures
Elucidation of the chemistry of life
Application of statistics to medicine
Development of anesthesia
Discovery of the relation of microbes to disease
Elucidation of inheritance and genetics
Knowledge of the immune system
Development of body imaging
Discovery of antimicrobial agents
Development of molecular pharmacotherapy
3
From NEJM: The origin of modern epidemiology:
1854, when John Snow demonstrated the transmission of cholera from contaminated water
The majority of people who got ill used the Broad Street Pump in London's Golden Square
He removed the pump handle from the polluted well and the spread of the disease stopped.
The original Broad Street pump
© Wellcome Library, London 4
From NEJM:
Earliest clinical trial in 1747
Scurvy (Serious disease: Magellan lost 80% of his men from scurvy)
James Lind treated 12 scorbutic ship passengers on a British navy ship with cider, an elixir of vitriol, vinegar, sea water, oranges and lemon
Those who got oranges and lemon did not get ill
Supply of lemon juice eliminated scurvy from the navy
5
So, it is all about causality
Statistics is important because it is conceived ascontributing to a causal understanding which is needed in prevention and treatment of disease.
Statistics can indicate causality even in the absence of a mechanistic understanding. Treatment of scurvy far ahead of the knowledge of
vitamin C John Snow: 20 years ahead of Pasteur
Going to modern times next: Causality and statistics – a happy couple?
6
Modern breakthrough based on statistics:
Sleeping position influences risk of crib death
Crib deaths in Norway 1986-2003
Num
ber
7
Sudden infant death syndrome (SIDS)
The risk of SIDS is strongly increased (RR up to
13) when the infant is sleeping on its stomach
compared to sleeping on its back.
This is simple because
An intervention could be conceived and was easy to carry out in practice
The effect was immediate
The effect was very strong
None of these conditions normally hold in epidemiology
Questions
Why is causality important in medicine and other fields?
How can statistics say something about causality?
What is causality?
8
Danger of e-cigarettes
Center for Disease
Control (CDC):
A form of vitamin E has been identified as a “very strong culprit” in lung injuries related to vaping THC, health officials reported on Friday, a major advance in a frightening outbreak that has killed 40 people and sickened 2,051.
Analysis of lung fluid. Statistical association.
9 10
CREDITZEPHYR/SCIENCE SOURCE
Surgery for Blocked
Arteries Is Often
Unwarranted,
Researchers FindDrug therapy alone may save lives as
effectively as bypass or stenting procedures, a
large federal study showed.
New York Times, 16.11.2019
Veierød11
Association is not causation
If A and Y are associated, then this is compatible with four different types of causal relationship:
Direct causation
Reverse causation
Confounding
Collider effect (selection)
12
A Y
A Y
A Y
U
A Y
S
1313
Causality is hard:Mechanisms in biologyare complex.Intervention may havea number of unexpectedeffects followingdifferent pathways.
- Closing a boulevardin Paris …
What is causality?
The simplest formulation: it
is about intervention
What is the effect of an action?
This is the focus of Judea
Pearl’s famous book
Seeing vs. doing
Pearl implies that statistics
per se says nothing about
interventions
14
Do-operator:P(y | do(x), z)
Interventions in medicine
Medical doctors, psychologists and otherhealth personell constantly intervene:
Giving treatment, medications, advice …
Statistics is expected to be a tool for validating such actions, to prove or disprove their validity
This cannot be done without taking thecausal issue seriously
15
Seeing vs doing (Pearl)
Pearl makes a fundamental distinction between
seeing and doing. Causality is about doing,
while most statistical data is about seeing
Seeing and doing may coincide in experiments
because of the ability to control the setting. The
“big” experiment in medicine is the randomized
clinical trial where the effect of doing is directly
visible
16
Registries contain data on seeing only, and not doing
We want to say something about the effect of intervention. BUT: the registry only contains a description of what has happened, there is no information about what could have happened if one acted differently. Therefore, you can’t (directly) say anything about the effect of intervention
This is the case for observational data in general
Still, causal inference can help us if we collect enough data and the right type of data…
17
Randomised clinical trials
The established solution to the confounder problem. We create both a factual and a counterfactual world
One of the great pillars of medical research. An unrivalled source of reliable information. Thousands of clinical trials carried out every year
But limitations: very many exclusions (children e.g.), could be distant from clinical practice, often extremely expensive. Clinical trials become unethical once a secure effect has been established
18
19
Lancet 2010; 376:
1741–50
Proximal colon
Distal colon
In the absence of randomization
We need to correct for confounders
But many of them will be unknown
Sensitivity analysis is an interesting option
20
Graphical models
Graphical models with arrows and boxes are common. However, Judea Pearl has lifted them to a new level
A number of rules for evaluating graphs can be defined
These are applicable in practice as shall be demonstrated
21
Directed acyclic graph – DAG
Graph with arrows, where you never return to the same node
22
A
U
Y
L
Outcome
Confounder
Treatment /
Risk factor
Mediator
Collider: where two or more arrows meet
The following rules decide whether
a path is open or closed
1. A path with colliding arrows is closed (→←). If there are no colliders the path is open.
2. To conditon on a non-collider closes thepath.
3. To condition on a collider (or descendantof a collider) opens the path
23
What do we mean by “to condition on”
We mean e.g. to include a variable in the regression.
To include a confounder is usually ok
To include a collider is dangerous
However, a collider may not be avoided if it represents inherent selection in the data
24
Keep causal paths open and non-causal paths closed Example (Hein Stigum): red arrow is causal, black path
is not causal (backdoor path). Conditioning on age (or obesity) blocks the back-door path
25
Vitamin E
Age
Birth defects
Obesity
Exercise (Hein Stigum)
We want the causal effect of physical activity on CHD (coronary heart disease). What should we adjust for?
26
2727
Birth defects. Adjustment for confounder?Source: Hernán et al, Amer. J. Epidem. 2002, 155, 176-184
When estimating the effect of E on D, shall you adjust for C? No, one should not adjust for a collider. Case-control study on folic acid supplementation and neural tube
defects. Adjusted OR: 0.80 (0.62, 1.21), non-adjusted OR 0.65 (0.46,0.94)
Exercise: A larger DAG
28
When is a DAG causal?
Two views
Robins and Hernán: A DAG is causal when
1. Lack of an arrow can be interpreted as lack of direct causal effect
2. All common causes, even if unmeasured, of any pair of variables on the graph are themselves on the graph
Note: this requires a concept of direct cause
Pearl: A DAG becomes causal if intervening on
a node has the effect of removing all arrows into
the node while the DAG is otherwise unchanged
29
Metformin and diabetes Metformin is a standard
treatment for diabetes type 2
It does improve the diabetes
However, it is not clear that it improves the negative longterm effects of diabetes, like cardiovascular disease
Possibly, high glucose is just a symptom which does not have a causal effect oncomplications
Same issue with LDL cholesterol and heartdisease 30
Metformin
Compli-
cations
Diabetes
type 2
Glucose
Metformin
Compli-
cations
Glucose
Survival paradoxes
There are a number of well known paradoxes in epidemiology.
Obesity paradox denotes cases where obesity is a risk factor for developing a condition, like heart failure, kidney failure or myocardial infarction, but once this condition has occurred obesity appears as a protective factor for survival.
Cholesterol paradox: low cholesterol is associated with a poor prognosis in heart failure and in myocardial infarction.
31
Obesity paradox
5070 references in Google Scholar
Apparent “obesity paradox” seen in diabetes,
coronary artery disease, heart failure, peripheral
arterial disease, hypertension, chronic
obstructive pulmonary disease, lung cancer,
esophageal adenocarcinoma, myocardial
infarction, coronary revascularization, acute
coronary syndromes, and among hemodialysis
patients. Preston and Stokes, Epidemiology,
201432
Examples from the literature
Obesity increases mortality in women, but
reduces mortality in diabetic women. Lajous et
al, Epidemiology, 2014
Obesity reduces mortality among heart failure
patients. Banack and Kaufman, Epidemiology,
2013
Smoking appears to protect against Alzheimer’s
disease. Part of the effect is due to survival
paradox. Chang, Alzheimer Dis Assoc Disord,
201233
Why these paradoxes
There would be a mix of explanations
Some biological, I do not look at those
Some statistical, only those are considered
here
So, how can biases be produced? The typical explanation concentrates on collider effects
34
Questions
Why do we denote findings like this as paradoxes?
Can you imagine a reason for these paradoxes?
35
Remember
When we condition on a collider we may activate a non-causal pathway, thereby creating bias.
We’ll come back to the obesity paradoxes in a moment, but first …
36
One collider is ageThe survival of the fittest
Age selection is unavoidable. People die. If you
concentrate on an elderly population it is
selected. People who are vulnerable to high
cholesterol, obesity, smoking etc. may have
died. The robust ones survive
So, you may measure something in a young
person and it may have a causal effect. Then the
causal effect dissipates with age and is
substituted by a selection effect
37
Frailty selection is well known
Assume a constant causal hazard ratio: r(t)= r
Observed hazard ratio:
38
• The relative risk may declineand even cross over due to artefacts.
• This may not reflect anychange in risk at theindividual level.
• This is a causal issue if youare interested in whathappens to individuals whentime passes.
• See related comments in Hernan (2010): “The hazard of hazard ratios”
Seeing vs. doing
Comparing two treatment groups by hazard ratio. End of trial, should all patients be put on the default treatment?
Assume hazard ratio equals 1. Apparently no difference in treatment effect. So, ok to change?
Surprise! The intervention yields a sharp change in the event rate in the group that changes treatment.
39
0 1 2 3 4 5Time
0.5
1.0
1.5
Hazard ratio
Intervene
Gulsvik et al (2012) demonstrate that high serum cholesterol appears as a "protective" factor with respect to general mortality at old age, something which could be an artefact, especially since statin treatments reduce disease incidence of cardiovascular disease in the elderly (Savarese et al, 2013). A strongly reduced risk with age was also seen for smoking in Gulsvik et al. (2012).
40
102-Year-Old Woman Quits Smoking (But Not for Her Health)
Collider effects and survival
Survival bias due to age
41
Survived
to time t
Randomized
treatment
Survived
to time
t+dt
Con-
founder
Questions
In what sense is age a collider?
Why may the apparent protective effect of high cholesterol in old age be a selection effect?
42
Banack and Kaufman: Epidemiology, 2013
Obesity paradox in heart failure
For patients with heart failure: The odds ratio of mortality was 0.79 (0.70–0.88) for overweight and 0.65 (0.57–0.74) for obese
43
We need computation as well
The collider effects just tell us that there might be something to worry about, but not whether the bias is large or small, or even nonexistent. (There do exist models where colliders have no effect at all)
This requires a mathematical model where we explicitly model individual variation
44
Risk model
Model for the first event (like diabetes etc)
Hazard rate equals rate due to special risk
factor (e.g. obesity) plus model with built-in risk variation Z
By doing some mathematics, we can calculate the rate of mortality after the first event has occurred. It turns out to be smaller for those with the special risk than for those without
45
Lower curve: with special risk factorUpper curve: without
46
Question
What is the collider in the heart failure example?
47
Mediation
Can we understand mechanisms by using statistics?
Path analysis (Wright, 1921) introduced the idea of direct, indirect and total effects and presented a simple calculus for these effects based on linear regression models.
A lot of recent and sophisticated development of these ideas in the causal inference literature
48
OutcomeTreatment
Mediator
How much of the effect passes
directly from treatment to
outcome (direct effect) and
how much passes through the
mediator (indirect effect)?
Mediation: Cholesterol treatment
49
Mediation and confounding. Can we estimate the direct effect of statin on coronary heart disease?Confounders between mediator an outcome may give a false impression of an increased indirect effect
50
Example
Psychotherapy: It is well documented that therapeutic alliance appears to be a mediator. However, there is an obvious possibility of confounding effects
Which type of effects could that be?
Psycho-
therapyOutcome
Therapeutic
alliance
Confounder
51
Quotes from Dunn et alStatist. Med. 2007; 26:4719–4745
“Although there is an enormous methodological literature on the estimation of the effects of mediators …., most of it completely ignores the technical challenges raised by measurement errors in the proposed mediators and by potential hidden confounding of mediator(s) and outcome.”
“The assumptions concerning the lack of hidden confounding and measurement errors are very rarely stated, let alone their validity discussed. One suspects that the majority of investigators are oblivious of these two requirements. One is left with the unsettling thought that the thousands of investigations of mediational mechanisms in the psychological and other literatures are of unknown and questionable value.”
One object of criticism is the famous paper by Baron & Kenny (1986). This has now 88 800 references according to Google Scholar.
You ignore causal inference at your own peril
Example: Hormone replacement therapy (HRT) at menopause used to be viewed as a good idea because it was supposed to improve the health and well being ofthe woman. In fact, HRT was dangerous, increasing risk of breast cancer, heart disease and stroke
But the epidemiological studies were wrongly interpreted
due to a lack of causal understanding, see Hernan et al, (2008). Randomized trials gave the correct answer
Frailty selection is a relevant issue (Stensrud et al, 2017)
Today, the attitude to use of HRT is very cautious
52
Pearl’s new book
Popularizing ideasfrom his first book
Many niceexamples
Discussing alsoartificialintelligence
53
Summary
Causal inference is a large and complex area
Various types of survival bias are always present
Causal inference more and more becomes the
norm of analysis and presentation in an
international epidemiological setting
54
Some references
Lange, T. and Hansen, J. V, (2011). Direct and Indirect Effects in a Survival
Context. Epidemiology, 22(4):575-581. The electronic supplement contains
some information on software
Røysland, K., Jon Michael Gran et al. (2011). Analyzing direct and indirect
effects of treatment using dynamic path analysis applied to data from the
Swiss HIV Cohort Study. Statistics in Medicine. DOI: 10.1002/sim.4324
Pearl J. Causality: Models, Reasoning, and Inference, 2nd edn. Cambridge
University Press: Cambridge, 2009.
Hernán MA, Robins JM. Instruments for causal inference: an epidemiologist’s
dream? Epidemiology 2006; 17:360–372.
Gran JM et al. A sequential Cox approach for estimating the causal effect of
treatment in the presence of time dependent confounding. Statistics in
Medicine 2010; 29:2757–2768
Robins JM, Hernan MA, Brumback B. Marginal structural models and causal
inference in epidemiology. Epidemiology 2000; 11(5):550–560.