Algorithms to Investigate Causal Paths to Explain the Incidence of Cardiovascular Disease. Simon Thornley, MPH, MBChB, FAFPHM. [email protected]

  • Slide 1
  • Algorithms to Investigate Causal Paths to Explain the Incidence of Cardiovascular Disease. Simon Thornley, MPH, MBChB, FAFPHM. [email protected] Professional Teaching Fellow, Research Fellow, PhD candidate. The University of Auckland, New Zealand.
  • Slide 2
  • Summary: Background to study. Directed acyclic graphs (DAGs): What are they? What can they be used for? How do computers draw DAGs? A look at a case study including risk factors for CVD.
  • Slide 3
  • Slide 4
  • Slide 5
  • My PhD: cardiovascular risk prediction. Screen healthy adults; put high-risk ones on drugs. Distortion of the natural history of disease: how to deal with it when analysing CVD risk?
  • Slide 6
  • Primary prevention: In the 1970s, risk factors were identified for the treatment of CVD, from cohort studies: raised blood pressure; diabetes status; cigarette smoking; LDL cholesterol level; age. These became targets for drug treatment.
  • Slide 7
  • Assumption: these are not just risk factors, but lie on the causal pathway to disease.
  • Slide 8
  • Slide 9
  • Assumption: these are not just risk factors, but lie on the causal pathway to disease. Are they the canaries or the miner?
  • Slide 10
  • Drug treatment: a summary
  • Slide 11
  • Drug effects in observational studies: being on a drug indicates increased, rather than decreased, risk after adjustment for all other factors! Explanations: unmeasured confounding; measurement error; drug does harm. For example: Hippisley-Cox J, Coupland C, Vinogradova Y, Robson J, May M, Brindle P. Derivation and validation of QRISK, a new cardiovascular disease risk score for the United Kingdom: prospective open cohort study. BMJ 2007;335(7611):136.
  • Slide 12
  • Sydney professorial fellow: "I've worked a lot with blood pressure epidemiology, and blood pressure-lowering drug use is always associated with higher risk in all observational studies. That is because people who get treated differ from those who don't in too many respects to be able to capture post hoc. That's why observational studies can never replace randomised trials. Estimating causal effect [sic] can only be attempted under very special circumstances in observational studies."
  • Slide 13
  • Continued: After much flogging of the analyst, [if you followed my advice about the design of the study] you would probably find some evidence of a protective effect of statins (unless all RCTs of statins are wrong).
  • Slide 14
  • Statistics and causality. Statistics: assesses parameters of a distribution from samples; infers associations; estimates probabilities of past and future events, if experimental conditions remain the same. Causal analysis: infers probabilities under conditions that are changing, e.g. treatments or interventions.
  • Slide 15
  • The problem: variable selection. Association with outcome: selection based on the relationship with the outcome variable (p-value), or on minimising an information metric (AIC, BIC, Mallows' Cp), which measures fit of data to model (joint probability of data given model), penalised for model complexity. Causal relationship: what about causal relationships between variables? Confounding: shared common cause of exposure and disease.
  • Slide 16
  • What are DAGs? Graph: a picture of nodes (variables) and arcs or edges (causal influence). Directed: the direction of each causal effect is shown. Acyclic: no arrows lead from effects back to their causes.
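To make the "acyclic" requirement concrete, here is a minimal sketch that stores a graph as (cause, effect) pairs and checks that no chain of arrows loops back on itself. The function name `is_acyclic` and the three example edges are assumptions chosen for illustration; they are not taken from the talk's own DAG.

```python
from collections import deque

def is_acyclic(edges):
    """Return True if the directed graph given as (cause, effect) pairs has no cycle.

    Uses Kahn's algorithm: repeatedly remove nodes with no incoming arrows.
    If every node can be removed this way, the graph is acyclic.
    """
    children = {}
    indegree = {}
    for cause, effect in edges:
        children.setdefault(cause, []).append(effect)
        children.setdefault(effect, [])
        indegree[effect] = indegree.get(effect, 0) + 1
        indegree.setdefault(cause, 0)

    queue = deque(node for node, deg in indegree.items() if deg == 0)
    removed = 0
    while queue:
        node = queue.popleft()
        removed += 1
        for child in children[node]:
            indegree[child] -= 1
            if indegree[child] == 0:
                queue.append(child)
    return removed == len(indegree)

# Hypothetical edges, for illustration only.
dag = [("Smoking", "BP"), ("Smoking", "CVD"), ("BP", "CVD")]
cyclic = dag + [("CVD", "Smoking")]  # an effect pointing back at its cause

print(is_acyclic(dag))     # valid DAG
print(is_acyclic(cyclic))  # contains a cycle, so not a DAG
```

Adding the arrow from CVD back to Smoking creates a loop, which is exactly what the "acyclic" condition forbids.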
  • Slide 17
  • Why use DAGs? They encode expert knowledge; make assumptions about the research question explicit and allow debate; and link the causal model to the statistical model for causal inference. What could give rise to an observed association between exposure and disease?
  • Slide 18
  • EXPLAINING OBSERVED ASSOCIATIONS What do we use DAGs for?
  • Slide 19
  • Confounding: exposure (E) and disease (D) share a common cause. [Diagram: Confounder → Exposure; Confounder → Disease]
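A small simulation can show how a shared cause produces an association on its own. The sketch below assumes made-up probabilities and gives the exposure no effect on disease at all. It then compares the crude odds ratio with a Mantel-Haenszel odds ratio pooled across levels of the confounder. All function names and numbers are hypothetical.

```python
import random

def simulate(n=20000, seed=1):
    """Simulate a confounder that raises the chance of both exposure and disease.

    Exposure has NO effect on disease here; all probabilities are made up.
    """
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        confounder = rng.random() < 0.5
        exposure = rng.random() < (0.7 if confounder else 0.2)
        disease = rng.random() < (0.5 if confounder else 0.1)
        data.append((confounder, exposure, disease))
    return data

def table(rows):
    """Counts: exposed+diseased, exposed+well, unexposed+diseased, unexposed+well."""
    a = sum(1 for _, e, d in rows if e and d)
    b = sum(1 for _, e, d in rows if e and not d)
    c = sum(1 for _, e, d in rows if not e and d)
    d_ = sum(1 for _, e, d in rows if not e and not d)
    return a, b, c, d_

def crude_or(data):
    """Odds ratio ignoring the confounder."""
    a, b, c, d = table(data)
    return (a * d) / (b * c)

def mantel_haenszel_or(data):
    """Odds ratio pooled over strata of the confounder (Mantel-Haenszel estimator)."""
    num = den = 0.0
    for level in (False, True):
        a, b, c, d = table([r for r in data if r[0] == level])
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    return num / den

data = simulate()
crude = crude_or(data)
adjusted = mantel_haenszel_or(data)
print(f"crude OR = {crude:.2f}, Mantel-Haenszel OR = {adjusted:.2f}")
```

Under these assumptions the crude odds ratio is well above 1 even though the exposure does nothing. Stratifying on the confounder brings the estimate back close to 1.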
  • Slide 20
  • Collider: an association induced by conditioning on a common effect of exposure and disease (e.g. selection bias). [Diagram: Exposure → Hospitalisation ← Disease]
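Collider bias can be sketched the same way. Below, exposure and disease are generated independently, and each raises the chance of hospitalisation. The probabilities are invented for illustration. Restricting the analysis to hospitalised people then creates an association that does not exist in the whole population.

```python
import random

def odds_ratio(pairs):
    """Odds ratio for a list of (exposure, disease) boolean pairs."""
    a = sum(1 for e, d in pairs if e and d)
    b = sum(1 for e, d in pairs if e and not d)
    c = sum(1 for e, d in pairs if not e and d)
    d_ = sum(1 for e, d in pairs if not e and not d)
    return (a * d_) / (b * c)

rng = random.Random(2)
everyone, hospitalised = [], []
for _ in range(20000):
    exposure = rng.random() < 0.3
    disease = rng.random() < 0.3          # generated independently of exposure
    # Hospitalisation is a common effect of exposure and disease (made-up risks).
    p_hosp = 0.05 + 0.4 * exposure + 0.4 * disease
    everyone.append((exposure, disease))
    if rng.random() < p_hosp:
        hospitalised.append((exposure, disease))

or_all = odds_ratio(everyone)
or_hosp = odds_ratio(hospitalised)
print(f"OR in whole population = {or_all:.2f}")
print(f"OR among hospitalised  = {or_hosp:.2f}")
```

With these assumed risks the odds ratio in the full population stays near 1, while among the hospitalised it falls far below 1, purely because of how the sample was selected.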
  • Slide 21
  • True causal association? [Diagram: Exposure → Disease]
  • Slide 22
  • Researcher-drawn DAG: serum urate and CVD. [Diagram nodes: Urate; CVD; Nutrition; Obesity; HbA1c; Diabetes; BP t-1; Gout; Sex; BP meds; BP; Creatinine; Propensity to take preventive treatment; HDL, Trigs, LDL t-1; Ethnic group; Smoking; Statin therapy; HDL, Trigs, LDL t]
  • Slide 23
  • A computer can do it for us. Several algorithms are available (from computer science and artificial intelligence). They start with chi-square tests of independence and conditional tests (similar to the Mantel-Haenszel test).
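The chi-square test of independence that these algorithms start from can be sketched for a single 2x2 table as follows. This is a plain Pearson chi-square without continuity correction, written with the standard library only. The counts are hypothetical, and this is not the code of any particular structure-learning package.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test of independence for the 2x2 table [[a, b], [c, d]].

    Returns (statistic, p_value). With 1 degree of freedom the upper-tail
    probability of the chi-square distribution equals erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    rows = (a + b, c + d)
    cols = (a + c, b + d)
    observed = ((a, b), (c, d))
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = rows[i] * cols[j] / n  # count expected under independence
            stat += (observed[i][j] - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical counts: rows = smokers / non-smokers, columns = CVD / no CVD.
stat, p = chi_square_2x2(30, 70, 15, 85)
print(f"chi-square = {stat:.2f}, p = {p:.4f}")
```

A small p-value is evidence against independence, which a structure-learning algorithm would typically read as a reason to keep an edge between the two variables.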
  • Slide 24
  • Aim: use an algorithm to draw a DAG for the variables used to assess CVD risk; inform the structure of a regression model for causal enquiry and prediction.
  • Slide 25
  • HOW THE ARTIFICIAL INTELLIGENCE ALGORITHM WORKS Technical details may induce somnolence, so do not attempt to drive or operate large machinery after listening to this section.
  • Slide 26
  • Chi-square tests. Null: P(smoke, CVD) = P(smoke)P(CVD); no relationship. Alt: P(smoke, CVD) ≠ P(smoke)P(CVD); yes, a relationship exists (association). The chi-square distribution gives the distribution assuming independence (null), if on tails of this (P