Algorithms to Investigate Causal Paths to Explain the Incidence of Cardiovascular Disease
Simon Thornley, MPH, MBChB, FAFPHM
Professional Teaching Fellow, Research Fellow, PhD candidate
The University of Auckland, New Zealand
Slide 2
Summary
Background to study
Directed Acyclic Graphs (DAGs)
  What are they?
  What can they be used for?
  How do computers draw DAGs?
A look at a case study including risk factors for CVD
Slide 3
Slide 4
Slide 5
My PhD
Cardiovascular risk prediction
Screen healthy adults
Put high-risk ones on drugs
Distortion of the natural history of disease
How to deal with it when analysing CVD risk?
Slide 6
Primary prevention
In the 1970s, risk factors for CVD were identified from cohort studies:
  Raised blood pressure
  Diabetes status
  Cigarette smoking
  LDL cholesterol level
  Age
Targets for drug treatment.
Slide 7
Assumption
These are not just risk factors, but lie on the causal pathway to disease.
Slide 8
Slide 9
Assumption
These are not just risk factors, but lie on the causal pathway to disease.
Are they canaries or the miner?
Slide 10
Drug treatment: a summary
Slide 11
Drug effects in observational studies
Being on a drug indicates increased, rather than decreased, risk, after adjustment for all other factors?!
Explanations:
  Unmeasured confounding
  Measurement error
  Drug does harm
For example: Hippisley-Cox J, Coupland C, Vinogradova Y, Robson J, May M, Brindle P. Derivation and validation of QRISK, a new cardiovascular disease risk score for the United Kingdom: prospective open cohort study. BMJ 2007;335(7611):136.
Slide 12
Sydney: Professorial fellow
I've worked a lot with blood pressure epidemiology, and blood pressure-lowering drug use is always associated with higher risk in all observational studies. That is because people who get treated differ from those who don't in too many respects to be able to capture post-hoc. That's why observational studies can never replace randomised trials. Estimating causal effect [sic] can only be attempted under very special circumstances in observational studies.
Slide 13
Continued
After much flogging of the analyst: [If you followed my advice about the design of the study] you would probably find some evidence of a protective effect of statins (unless all RCTs of statins are wrong).
Slide 14
Statistics and causality
Statistics
  Assesses parameters of a distribution from samples.
  Infers associations.
  Estimates probabilities of past and future events, if experimental conditions remain the same.
Causal analysis
  Infers probabilities under conditions that are changing, e.g. treatments or interventions.
Slide 15
The problem: variable selection
Association with outcome
  Based on relationship with the outcome variable (p-value)
  Minimising an information metric (AIC, BIC, Mallows' Cp): fit of data to model; joint probability of data given model, penalised for model complexity
Causal relationship
  What about causal relationships between variables?
  Confounding: shared common cause of exposure and disease.
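As a rough illustration of the information metrics named on this slide, the sketch below uses the standard definitions AIC = 2k - 2 ln L and BIC = k ln n - 2 ln L. The two models, their log-likelihoods, and the sample size are invented for illustration and do not come from the study.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion: fit penalised by 2 per parameter."""
    return 2 * n_params - 2 * log_likelihood

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: penalty grows with log(sample size)."""
    return n_params * math.log(n_obs) - 2 * log_likelihood

# Hypothetical fitted models (values made up for this sketch).
n_obs = 1000
small_model = {"log_likelihood": -520.0, "n_params": 3}
large_model = {"log_likelihood": -518.5, "n_params": 5}

for name, model in (("small", small_model), ("large", large_model)):
    print(name,
          round(aic(model["log_likelihood"], model["n_params"]), 2),
          round(bic(model["log_likelihood"], model["n_params"], n_obs), 2))
```

Lower values are preferred under both criteria. Note that neither criterion uses any information about which variable causes which, which is the gap the slide points at.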
Slide 16
What are DAGs?
Graph: a picture of nodes (variables) and arcs or edges (causal influence)
Directed: the direction of each causal effect is shown by an arrow
Acyclic: no arrows from effects back to their causes, so no variable can cause itself
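To make the three terms concrete, here is a minimal sketch that stores a graph as a list of (cause, effect) arrows and checks the acyclic property with Kahn's topological-sort algorithm. The variable names and arrows are a hypothetical toy example, not the DAG from this study.

```python
from collections import deque

def is_acyclic(edges):
    """Return True if the directed graph has no directed cycle (Kahn's algorithm)."""
    nodes = set()
    for cause, effect in edges:
        nodes.update((cause, effect))
    children = {node: [] for node in nodes}
    indegree = {node: 0 for node in nodes}
    for cause, effect in edges:
        children[cause].append(effect)
        indegree[effect] += 1

    # Repeatedly remove nodes that have no remaining incoming arrows.
    queue = deque(node for node in nodes if indegree[node] == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    # Any node left over sits on a directed cycle.
    return removed == len(nodes)

# Hypothetical toy graph: each pair is an arrow cause -> effect.
dag_edges = [("smoking", "blood_pressure"),
             ("blood_pressure", "cvd"),
             ("smoking", "cvd")]
# Adding an arrow from the effect back to a cause creates a cycle.
cyclic_edges = dag_edges + [("cvd", "smoking")]

print(is_acyclic(dag_edges), is_acyclic(cyclic_edges))
```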
Slide 17
Why use DAGs?
Encode expert knowledge
Make assumptions about the research question explicit; allow debate
Link causal to statistical model for causal inference
What could give rise to an observed association between exposure and disease?
Slide 18
EXPLAINING OBSERVED ASSOCIATIONS
What do we use DAGs for?
Slide 19
Confounding
E and D share a common cause (confounding).
[Diagram: Confounder -> Exposure; Confounder -> Disease]
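A small simulation can show how a shared cause produces an association between exposure and disease even when exposure has no effect. This is only a sketch: the probabilities and variable names are invented, and exposure is deliberately given no causal effect on disease.

```python
import random

def risk_difference(rows):
    """Risk of disease among the exposed minus risk among the unexposed."""
    exposed = [disease for exposure, disease in rows if exposure]
    unexposed = [disease for exposure, disease in rows if not exposure]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)

rng = random.Random(0)  # fixed seed so the run is repeatable
people = []
for _ in range(100_000):
    confounder = rng.random() < 0.5
    # The confounder raises the chance of exposure and of disease.
    exposure = rng.random() < (0.8 if confounder else 0.2)
    # Assumption of this toy model: exposure has NO effect on disease.
    disease = rng.random() < (0.6 if confounder else 0.1)
    people.append((confounder, exposure, disease))

crude = risk_difference([(e, d) for _, e, d in people])
within_c1 = risk_difference([(e, d) for c, e, d in people if c])
within_c0 = risk_difference([(e, d) for c, e, d in people if not c])
print(f"crude: {crude:.3f}  confounder present: {within_c1:.3f}  absent: {within_c0:.3f}")
```

In this setup the crude risk difference should be clearly positive, while the difference within each level of the confounder should sit near zero, which is the pattern that stratifying on a confounder is meant to reveal.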
Slide 20
Collider
Association induced by conditioning on a common effect (collider) of Exposure and Disease, e.g. selection bias.
[Diagram: Exposure -> Hospitalisation <- Disease]
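The collider structure can be sketched the same way. Here exposure and disease are generated independently, and both raise the chance of hospitalisation; restricting the analysis to hospitalised people then induces an association. All probabilities are made up for illustration.

```python
import random

def risk_difference(rows):
    """Risk of disease among the exposed minus risk among the unexposed."""
    exposed = [disease for exposure, disease in rows if exposure]
    unexposed = [disease for exposure, disease in rows if not exposure]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)

rng = random.Random(1)  # fixed seed so the run is repeatable
people = []
for _ in range(100_000):
    exposure = rng.random() < 0.3
    disease = rng.random() < 0.3  # generated independently of exposure
    # Hospitalisation is a common effect of exposure and disease.
    hospitalised = rng.random() < 0.05 + 0.4 * exposure + 0.4 * disease
    people.append((exposure, disease, hospitalised))

everyone = risk_difference([(e, d) for e, d, _ in people])
in_hospital = risk_difference([(e, d) for e, d, h in people if h])
print(f"whole population: {everyone:.3f}  hospitalised only: {in_hospital:.3f}")
```

In the whole population the risk difference should be close to zero, while among the hospitalised it should be markedly negative, even though no arrow connects exposure and disease in the generating model.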
Slide 21
True causal association?
[Diagram: Exposure -> Disease]
Slide 22
Researcher-drawn DAG: serum urate and CVD
[Figure: DAG with nodes Urate, CVD, Nutrition, Obesity, HbA1c, Diabetes, BP (t-1), Gout, Sex, BP meds, BP, Creatinine, Propensity to take preventive treatment, HDL / Trigs / LDL (t-1), Ethnic group, Smoking, Statin therapy, HDL / Trigs / LDL (t)]
Slide 23
A computer can do it for us
Several algorithms are available (from computer science, artificial intelligence).
They start with chi-square tests of independence,
then move to conditional tests of independence (similar to the Mantel-Haenszel test).
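The two kinds of test can be sketched on a 2x2 example. The code below computes a Pearson chi-square test on the collapsed table and a Cochran-Mantel-Haenszel statistic across strata. The Mantel-Haenszel form is used here only as a stand-in, since the slide says the algorithm's conditional tests are similar to it; the counts, the stratifying variable, and the function names are hypothetical.

```python
import math

def chi2_sf_1df(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))

def pearson_chi2_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    return stat, chi2_sf_1df(stat)

def mantel_haenszel(strata):
    """Cochran-Mantel-Haenszel test (no continuity correction) over 2x2 strata."""
    observed = expected = variance = 0.0
    for a, b, c, d in strata:
        n = a + b + c + d
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        observed += a
        expected += row1 * col1 / n
        variance += row1 * row2 * col1 * col2 / (n * n * (n - 1))
    stat = (observed - expected) ** 2 / variance
    return stat, chi2_sf_1df(stat)

# Hypothetical counts per stratum of a third variable:
# (smokers with CVD, smokers without, non-smokers with CVD, non-smokers without)
strata = [(240, 160, 60, 40),   # stratum 1: CVD risk 0.6 in both groups
          (10, 90, 40, 360)]    # stratum 2: CVD risk 0.1 in both groups
collapsed = tuple(sum(cells) for cells in zip(*strata))

marginal_stat, marginal_p = pearson_chi2_2x2(*collapsed)
conditional_stat, conditional_p = mantel_haenszel(strata)
print(f"marginal: chi2={marginal_stat:.1f}, p={marginal_p:.3g}")
print(f"conditional: chi2={conditional_stat:.1f}, p={conditional_p:.3g}")
```

With these invented counts, smoking and CVD look strongly associated in the collapsed table but not once the third variable is held fixed. Comparing marginal with conditional results in this way is the kind of evidence a structure-learning algorithm can use when deciding whether to keep an edge between two nodes.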
Slide 24
Aim
Use an algorithm to draw a DAG for the variables used to assess CVD risk
Inform the structure of a regression model for causal enquiry and prediction
Slide 25
HOW THE ARTIFICIAL INTELLIGENCE ALGORITHM WORKS
Technical details may induce somnolence, so do not attempt to drive or operate large machinery after listening to this section.
Slide 26
Chi-square tests
Null: P(smoke, CVD) = P(smoke)P(CVD). No relationship.
Alt: P(smoke, CVD) ≠ P(smoke)P(CVD). Yes, a relationship exists (association).
The chi-square distribution gives the distribution of the test statistic assuming independence (null), if on tails of this (P