MENDELIAN RANDOMIZATION
Maria Carolina Borges
Research Fellow
MRC Integrative Epidemiology Unit
University of Bristol
UK
Outline
• Motivation
• Assumptions
• One-sample MR
• Two-sample MR
• Recent extensions
• MR-Base
MOTIVATION
Why Mendelian randomization?
Causal inference & Epidemiology
Confounding Bias
Reverse causation
                          N pair-wise    Expected      Observed       P value for
                          associations   (P ≤ 0.01)    (P ≤ 0.01)     observed x expected
96 (non-genetic) traits   4560           45.6 (1%)     2036 (44.6%)   < 0.000001
23 SNPs                   253            2.5 (1%)      4 (1.6%)       0.33
96 traits x 23 SNPs       2208           22.1 (1%)     27 (1.1%)      0.29
Bone Marrow Transplant. 1991;7 Suppl 3:9-12.
Lancet 1986;i:507–08
Mendel’s Second Law: independent assortment
Gregor Mendel (1822–1884): “the behavior of each pair of differentiating characteristics in hybrid union
is independent of the other differences between the two original plants, and, further, the hybrid produces
just so many kinds of egg and pollen cells as there are possible constant combination forms...”
Inheritance of one trait is independent
of the inheritance of other traits
Randomization
                        Mendelian randomization          Randomized controlled trial
Randomization method    Random segregation of alleles    Random allocation
Exposed group           Allele A                         Intervention
Control group           Other allele                     No intervention
In both designs: confounders equal between groups; outcomes compared between groups
Adapted from Ebrahim, Davey-Smith, 2008
Mendelian randomization
[Diagram: Z → X → Y, with U affecting both X and Y]
Z: genetic instrument
X: exposure
Y: outcome
U: confounder
Hypothetical example: LDL-c → CHD
Instrumental variable: randomization to HMGCR variant (rs12345)
Z: rs12345; X: LDL-c; Y: CHD; U: confounders
Z → X: 0.05 mmol/L of LDL-c per T allele
Z → Y: 0.03 log odds of CHD per T allele
X → Y: 0.03/0.05 = 0.6 log odds of CHD per 1 mmol/L of LDL-c (OR = 1.82)
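The arithmetic in the hypothetical LDL-c → CHD example can be reproduced in a few lines. This is a minimal sketch: the function name `wald_ratio` is illustrative (not from the slides), and the inputs are the hypothetical per-allele values quoted above.

```python
import math


def wald_ratio(beta_zy, beta_zx):
    """Ratio estimate of the X -> Y effect from one instrument Z."""
    if beta_zx == 0:
        raise ValueError("SNP-exposure association must be non-zero")
    return beta_zy / beta_zx


# Hypothetical values from the slide (per T allele of rs12345).
beta_zx = 0.05  # mmol/L of LDL-c per allele
beta_zy = 0.03  # log odds of CHD per allele

beta_xy = wald_ratio(beta_zy, beta_zx)  # log odds of CHD per 1 mmol/L of LDL-c
odds_ratio = math.exp(beta_xy)          # convert log odds to an odds ratio

print(f"log odds per mmol/L: {beta_xy:.2f}, OR: {odds_ratio:.2f}")
```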
ASSUMPTIONS
Instrumental variable (IV) assumptions
[Diagram: Z → X → Y, with U affecting both X and Y]
• IV1: Z is strongly associated with X
• IV2: Z is independent of U
• IV3: Z is independent of Y, given X & U
Z: genetic IV
X: exposure
Y: outcome
U: confounder
Instrumental variable (IV) assumptions
• IV1: Z is strongly associated with X
Weak instruments
• Loss of power
• Bias
  • Finite samples: confounders will not be perfectly balanced between genotypic subgroups
  • If the IV is weak, this chance imbalance may explain more of the phenotypic differences than the IV does
  • Bias is towards the confounded estimate (one-sample MR) or towards the null (two-sample MR with no sample overlap)
Burgess et al., 2011; 2016
Instrumental variable (IV) assumptions
• IV2: Z is independent of U
Population stratification
[Diagram: Z, X, Y and U, where U is genetic ancestry]
Balding, 2006
Instrumental variable (IV) assumptions
• IV3: Z is independent of Y, given X & U
Violations of exclusion restriction
VanderWeele et al. 2014
Phenotype Abbreviation
Neurological phenotypes
Alzheimer disease AD
Migraine MIGR
Parkinson disease PD
Photic sneeze reflex PS
Schizophrenia SCZ
Anthropometric and social traits
Beighton hypermobility BHM
Breast size CUP
Body mass index BMI
Bone mineral density (femoral neck) FNBMD
Bone mineral density (lumbar spine) LSBMD
Chin dimples DIMP
Educational attainment EDU
Height HEIGHT
Male-pattern baldness MPB
Nearsightedness NST
Nose size NOSE
Waist–hip ratio WHR
Unibrow UB
Immune-related traits
Any allergies ALL
Asthma ATH
Childhood ear infections CEI
Crohn's disease CD
Hypothyroidism HTHY
Rheumatoid arthritis RA
Tonsillectomy TS
Ulcerative colitis UC
Metabolic phenotypes
Age at menarche AAM
Age at menarche (23andMe) AAM (23)
Age at voice drop AVD
Coronary artery disease CAD
Type 2 diabetes T2D
Fasting glucose FG
Low-density lipoproteins LDL
High-density lipoproteins HDL
Triglycerides TG
Total cholesterol TC
Hematopoietic traits
Hemoglobin HB
Mean cell hemoglobin concentration MCHC
Mean red blood cell volume MCV
Packed red blood cell volume PCV
Red blood cell count RBC
Platelet count PLT
Mean platelet volume MPV
Pickrell et al., 2016; Visscher, Yang, 2016
The ubiquity of pleiotropy
What is the underlying causal model?
[Diagrams: three candidate causal structures linking Z, X and Y]
✓ Vertical pleiotropy (mediation)
✗ Horizontal pleiotropy
ONE-SAMPLE MR
One-sample Mendelian randomization (MR)
Definition
Genotypes (Z), exposure (X), and outcome (Y)
available from individuals in the same sample
One-sample MR
[Diagram: Z → X → Y with confounder U; the associations βZX and βZY are labelled]
One-sample MR
Key aspects
• Identify genetic instruments for the exposure
• Explore violations of IV assumptions
• Generate MR estimates
• Run sensitivity analyses
Identifying genetic instruments
SNPs with well-understood functions, or SNPs identified via GWAS
Explore violations of IV assumptions
Instrument strength
Endogeneity
Overidentification
Burgess et al., 2011; Davies et al., 2013
IV1
IV2
IV3
One-sample MR estimates
[Figure: annotated regression output from a common one-sample MR estimator. Labelled parts: outcome, exposure, instruments (37 SNPs), first-stage regression, instrument strength, endogeneity test, overidentification test]
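The slide does not name the estimator, but two-stage least squares (2SLS) is a widely used choice for one-sample MR. The sketch below illustrates its logic on simulated data. It is not the analysis shown on the slide: it uses a single instrument, all variable names and effect sizes are made up, and it deliberately omits standard errors, because those from a naive second-stage regression are not valid.

```python
import random


def mean(values):
    return sum(values) / len(values)


def slope(x, y):
    """OLS slope of y on x (model with an intercept)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def two_stage_least_squares(z, x, y):
    """2SLS point estimate with one instrument z."""
    # Stage 1: regress the exposure on the instrument, keep fitted values.
    b1 = slope(z, x)
    a1 = mean(x) - b1 * mean(z)
    x_hat = [a1 + b1 * zi for zi in z]
    # Stage 2: regress the outcome on the genetically predicted exposure.
    return slope(x_hat, y)


# Simulated cohort (all numbers are assumptions for illustration).
rng = random.Random(1)
true_effect = 0.3
z, x, y = [], [], []
for _ in range(20000):
    zi = (rng.random() < 0.3) + (rng.random() < 0.3)  # allele count 0/1/2
    u = rng.gauss(0, 1)                               # unmeasured confounder
    xi = 0.5 * zi + u + rng.gauss(0, 1)
    yi = true_effect * xi + u + rng.gauss(0, 1)
    z.append(zi)
    x.append(xi)
    y.append(yi)

beta_ols = slope(x, y)                         # confounded by u
beta_2sls = two_stage_least_squares(z, x, y)   # should be near true_effect
print(f"OLS: {beta_ols:.2f}, 2SLS: {beta_2sls:.2f}, truth: {true_effect}")
```

In this simulation the conventional regression estimate is inflated by the confounder, whereas the 2SLS estimate should fall close to the simulated effect.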
One-sample MR using polygenic score
Polygenic score
- Increases variance explained (compared to single SNPs)
- Avoids many-weak-instruments bias (compared to many separate SNPs)
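A polygenic score is usually built as a (weighted) sum of exposure-increasing allele counts. The sketch below assumes that construction. The function name, genotypes and weights are hypothetical, and in practice the weights would ideally come from a sample independent of the one being analysed.

```python
def polygenic_score(allele_counts, weights=None):
    """Weighted sum of allele counts; unweighted if no weights are given."""
    if weights is None:
        weights = [1.0] * len(allele_counts)
    if len(weights) != len(allele_counts):
        raise ValueError("need one weight per SNP")
    return sum(w * g for w, g in zip(weights, allele_counts))


# One individual's exposure-increasing allele counts at three SNPs.
genotypes = [2, 0, 1]
# Hypothetical per-allele effects on the exposure.
weights = [0.10, 0.05, 0.20]

unweighted_score = polygenic_score(genotypes)
weighted_score = polygenic_score(genotypes, weights)
print(unweighted_score, round(weighted_score, 2))
```

The resulting score can then be used as a single instrument Z in place of the individual SNPs.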
TWO-SAMPLE MR
[Diagram: SNP-exposure association from one sample and SNP-outcome association from another, both drawn from the same underlying population, combined to estimate the effect of exposure on outcome]
Two-sample Mendelian randomization (MR)
Definition
SNP-exposure and SNP-outcome association
estimates from two independent samples from
the same underlying population
Two-sample MR
[Diagram: Z → X → Y with confounder U; βZX and βZY labelled. Panels: one-sample MR; two-sample MR]
Hartwig et al., 2017
Why did two-sample MR become so popular?
Genome-wide association studies (GWAS)
Key aspects
1. Identify genetic instruments for the exposure
2. Extract summary data
3. Harmonise the two datasets
4. Explore violations of IV assumptions
5. Generate MR estimates
6. Run sensitivity analyses
Harmonise datasets
• Ensure that all instruments in dataset 1 are associated with exposure in the
same direction
• Ensure datasets 1 and 2 are identically coded
• Check and correct palindromic SNPs
• Check quality of harmonisation
Hartwig et al., 2017
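The harmonisation checks above can be sketched for a single SNP as follows. This is a simplified illustration, not the procedure used by any particular package: the function and argument names are hypothetical, and palindromic SNPs are simply dropped here, whereas the slide suggests checking and correcting them (typically with allele frequencies).

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def is_palindromic(allele_1, allele_2):
    """A/T and C/G SNPs look identical on both strands."""
    return COMPLEMENT[allele_1] == allele_2


def harmonise(exp_ea, exp_oa, exp_beta, out_ea, out_oa, out_beta):
    """Align the outcome effect to the exposure effect allele.

    Returns (exposure_beta, outcome_beta) with exposure_beta >= 0,
    or None when the SNP cannot be aligned unambiguously.
    """
    if is_palindromic(exp_ea, exp_oa):
        return None  # strand is ambiguous without allele frequencies
    flipped_strand = (COMPLEMENT[out_ea], COMPLEMENT[out_oa])
    if (out_ea, out_oa) == (exp_ea, exp_oa) or flipped_strand == (exp_ea, exp_oa):
        pass  # same effect allele (possibly reported on the other strand)
    elif (out_ea, out_oa) == (exp_oa, exp_ea) or flipped_strand == (exp_oa, exp_ea):
        out_beta = -out_beta  # effect allele swapped: reverse the sign
    else:
        return None  # alleles do not match at all
    if exp_beta < 0:
        # Orient so the allele increases the exposure in every instrument.
        exp_beta, out_beta = -exp_beta, -out_beta
    return exp_beta, out_beta


same_coding = harmonise("A", "G", 0.1, "A", "G", 0.02)
swapped_alleles = harmonise("A", "G", 0.1, "G", "A", 0.02)
other_strand = harmonise("A", "G", 0.1, "T", "C", 0.02)
negative_exposure = harmonise("A", "G", -0.1, "A", "G", 0.02)
palindromic = harmonise("A", "T", 0.1, "A", "T", 0.02)
```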
Explore violations of IV assumptions
Instrument strength (IV1)
• F-statistics
[Equations: F statistic, and R2 for each IV*]
N: sample size
K: number of IVs
R2: variance of X explained by IVs
SD: standard deviation
α: SNP-exposure association in SD units
MAF: minor allele frequency
*If multiple independent SNPs are available, their R2 values can be added up to calculate the F statistic
Burgess et al., 2016
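The equations on this slide did not survive extraction. The sketch below uses the standard expressions that match the symbols defined above, namely R2 = 2 × MAF × (1 − MAF) × α² for each IV and F = ((N − K − 1) / K) × R2 / (1 − R2). Treat these as an assumption about what the slide showed. The function names and example numbers are hypothetical.

```python
def r2_single_snp(alpha_sd, maf):
    """Variance in X explained by one SNP (alpha in SD units per allele)."""
    return 2 * maf * (1 - maf) * alpha_sd ** 2


def f_statistic(r2, n, k):
    """F statistic for k instruments jointly explaining r2 in n people."""
    return ((n - k - 1) / k) * (r2 / (1 - r2))


# Hypothetical instrument: 0.1 SD per allele, MAF 0.3, N = 10,000.
r2 = r2_single_snp(alpha_sd=0.1, maf=0.3)
f_single = f_statistic(r2, n=10000, k=1)

# For several independent SNPs, sum the per-SNP R2 values first.
r2_total = sum(r2_single_snp(a, m) for a, m in [(0.1, 0.3), (0.05, 0.2)])
f_joint = f_statistic(r2_total, n=10000, k=2)
print(f"R2 = {r2:.4f}, F = {f_single:.1f}, joint F = {f_joint:.1f}")
```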
Explore violations of IV assumptions
Explore presence of horizontal pleiotropy (IV3)
Heterogeneity & Asymmetry
Heterogeneity
[Forest plot: causal estimate (βIV = βzy/βzx) for each of rs1 to rs6, x-axis from -1 to 1]
Cochran’s Q statistic
m: number of IVs
Substantial heterogeneity indicates that either modelling or IV assumptions are violated
Del Greco et al., 2015; Burgess et al., 2017
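Cochran's Q can be computed directly from the per-SNP ratio estimates. The sketch below assumes first-order inverse-variance weights (βzx/se(βzy))², which ignore uncertainty in the SNP-exposure estimates. Under homogeneity, Q is compared with a chi-squared distribution on m − 1 degrees of freedom. The function name and data are hypothetical.

```python
def cochran_q(beta_zx, beta_zy, se_zy):
    """Return (Q, degrees of freedom) for a set of ratio estimates."""
    ratios = [by / bx for bx, by in zip(beta_zx, beta_zy)]
    # First-order weights: inverse variance of each ratio estimate.
    weights = [(bx / se) ** 2 for bx, se in zip(beta_zx, se_zy)]
    pooled = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    q = sum(w * (r - pooled) ** 2 for w, r in zip(weights, ratios))
    return q, len(ratios) - 1


beta_zx = [0.1, 0.2, 0.3]
se_zy = [0.01, 0.01, 0.01]

# All three SNPs imply the same causal effect (0.5): Q is about zero.
q_homogeneous, df = cochran_q(beta_zx, [0.05, 0.10, 0.15], se_zy)

# The third SNP implies a different effect (1.0): Q becomes large.
q_heterogeneous, _ = cochran_q(beta_zx, [0.05, 0.10, 0.30], se_zy)
print(q_homogeneous, q_heterogeneous, df)
```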
Asymmetry
[Funnel plots: causal estimate (βxy) on the x-axis against precision (1/SE) on the y-axis]
Funnel plot symmetric: balanced pleiotropy (IVW OK)
Funnel plot asymmetric: directional pleiotropy (IVW biased)
MR estimates: single instrument
• For a single instrument → Wald ratio
$$\hat{\beta}_{IV} = \frac{\hat{\beta}_{ZY}}{\hat{\beta}_{ZX}}$$
• Where both $\hat{\beta}$'s on the right-hand side are regression coefficients
Assumption: no invalid instruments
MR estimates: multiple instruments
• Pooled Wald ratios (fixed- vs random-effects)
[Forest plot: causal estimate (βIV = βzy/βzx) for each of rs1 to rs6, x-axis from -1 to 1]
Assumption: no invalid instruments
Del Greco et al., 2015; Burgess et al., 2017
MR estimates: multiple instruments
• Inverse variance weighted (IVW) method
[Scatter plot: βzy (y-axis) against βzx (x-axis); the slope of the fitted line is βIVW]
βIVW: regress βzy ~ βzx [weights=1/seβzy^2] ***
*** With the intercept constrained to be zero
Assumption: no invalid instruments
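The weighted regression through the origin has a closed form, which the sketch below implements. It is illustrative only: the function name and data are hypothetical, and the standard error shown is the fixed-effect version, which assumes no heterogeneity between instruments.

```python
import math


def ivw_estimate(beta_zx, beta_zy, se_zy):
    """IVW slope of beta_zy on beta_zx with the intercept fixed at zero."""
    weights = [1 / se ** 2 for se in se_zy]
    numerator = sum(w * bx * by for w, bx, by in zip(weights, beta_zx, beta_zy))
    denominator = sum(w * bx ** 2 for w, bx in zip(weights, beta_zx))
    beta = numerator / denominator
    se_fixed = math.sqrt(1 / denominator)  # fixed-effect standard error
    return beta, se_fixed


# Hypothetical summary data in which every SNP implies an effect of 0.5.
beta_zx = [0.1, 0.2, 0.3]
beta_zy = [0.05, 0.10, 0.15]
se_zy = [0.01, 0.01, 0.01]

beta_ivw, se_ivw = ivw_estimate(beta_zx, beta_zy, se_zy)
print(f"IVW estimate: {beta_ivw:.2f} (SE {se_ivw:.3f})")
```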
Sensitivity analyses
• Many new methods relax the assumption of no invalid instruments, e.g.:
• MR-Egger
• Median-based estimator
• Mode-based estimator
• And many others ...
• Consistency of results across methods is key (different methods, different assumptions)
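MR-Egger
[Scatter plots of βzy (y-axis) against βzx (x-axis), showing the fitted slopes βIVW and βEgger and the MR-Egger intercept αEgger]
Panel with αEgger=0: IVW OK, MR-Egger OK
Panel with non-zero αEgger: IVW biased, MR-Egger OK
regress βzy βzx [aw=1/seβzy^2]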
αEgger → non-zero estimate is evidence for directional pleiotropy
βEgger → causal effect estimate adjusted for directional pleiotropy
Bowden et al., 2015, 2016; Burgess, Thompson, 2017
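The MR-Egger regression is the same weighted regression as IVW but with the intercept left free. The sketch below gives point estimates only (no standard errors). The function name and data are hypothetical, and each SNP is first oriented so that its exposure association is positive. In the example every SNP has a direct effect of 0.02 on the outcome, so the intercept is non-zero.

```python
def mr_egger(beta_zx, beta_zy, se_zy):
    """Return (intercept, slope) from a weighted regression of beta_zy on beta_zx."""
    xs, ys = [], []
    for bx, by in zip(beta_zx, beta_zy):
        if bx < 0:  # orient to the exposure-increasing allele
            bx, by = -bx, -by
        xs.append(bx)
        ys.append(by)
    w = [1 / se ** 2 for se in se_zy]
    sw = sum(w)
    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope


# Hypothetical data: causal effect 0.5 plus directional pleiotropy of 0.02.
# The fourth SNP is coded for the exposure-decreasing allele.
beta_zx = [0.1, 0.2, 0.3, -0.4]
beta_zy = [0.07, 0.12, 0.17, -0.22]
se_zy = [0.01, 0.01, 0.01, 0.01]

alpha_egger, beta_egger = mr_egger(beta_zx, beta_zy, se_zy)
print(f"intercept: {alpha_egger:.3f}, slope: {beta_egger:.3f}")
```

On these data a regression forced through the origin would give a slope of about 0.57, which illustrates how directional pleiotropy biases IVW.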
MR-Egger
• MR-Egger allows up to 100% of the IVs to be invalid, but requires InSIDE
• InSIDE (Instrument Strength Independent of Direct Effect) assumption: SNP-exposure effects should NOT correlate with the horizontal pleiotropic effects
• Low power particularly when the SNP-exposure effect sizes are
relatively homogeneous
• SNP-exposure estimates have to be oriented to be positive (and the
SNP-outcome effects flipped accordingly)
• More susceptible to regression dilution bias
• Individual outliers can have a large influence on causal estimates
Bowden et al., 2016; Burgess, Thompson, 2017
Median based estimators
[Scatter plots of βzy (y-axis) against βzx (x-axis). Panels: hypothetical example, finite sample; hypothetical example, infinite sample]
- InSIDE not required
- Assumption: at least 50% of the instruments are valid. If true, the median ratio estimate is a reliable estimate of the causal effect
- More efficient to use weighted analysis (assumption: the set of instruments accounting for 50% or more of the total weight is valid)
Bowden et al., 2016
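A minimal sketch of the simple (unweighted) median estimator, using hypothetical data in which three of five instruments are valid. The weighted version, which interpolates over cumulative weights, is not shown. The function name is illustrative.

```python
import statistics


def simple_median_estimate(beta_zx, beta_zy):
    """Median of the per-SNP ratio estimates beta_zy / beta_zx."""
    ratios = [by / bx for bx, by in zip(beta_zx, beta_zy)]
    return statistics.median(ratios)


# Hypothetical data: SNPs 1-3 are valid (ratio 0.5); SNPs 4-5 are
# invalid, with ratios of 1.5 and 2.0 caused by horizontal pleiotropy.
beta_zx = [0.1, 0.2, 0.1, 0.2, 0.1]
beta_zy = [0.05, 0.10, 0.05, 0.30, 0.20]

ratios = [by / bx for bx, by in zip(beta_zx, beta_zy)]
median_estimate = simple_median_estimate(beta_zx, beta_zy)
mean_estimate = statistics.mean(ratios)  # pulled upwards by the invalid SNPs
print(f"median: {median_estimate:.2f}, mean: {mean_estimate:.2f}")
```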
Mode based estimators (MBE)
[Scatter plots of βzy (y-axis) against βzx (x-axis). Panels: hypothetical example, Truth=MBE; hypothetical example, Truth≠MBE]
- InSIDE not required
- Zero Modal Pleiotropy Assumption (ZEMPA)
- Current implementation requires specification of smoothing parameter
Hartwig et al., 2017
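To show why a smoothing parameter is needed, here is a rough sketch of an unweighted mode estimate: the ratio estimates are smoothed with a Gaussian kernel and the point of highest density is taken. This is not the published implementation. The fixed `bandwidth`, the grid search and the function name are assumptions made for illustration.

```python
import math


def simple_mode_estimate(ratios, bandwidth, grid_points=1000):
    """Location of the peak of a Gaussian kernel density over the ratios."""
    lo, hi = min(ratios), max(ratios)

    def density(value):
        # Unnormalised kernel density; the constant does not move the peak.
        return sum(math.exp(-0.5 * ((value - r) / bandwidth) ** 2) for r in ratios)

    grid = [lo + (hi - lo) * i / grid_points for i in range(grid_points + 1)]
    return max(grid, key=density)


# Hypothetical ratio estimates: four cluster near 0.5 (the largest group
# of similar estimates), while two invalid instruments sit far away.
ratios = [0.49, 0.50, 0.50, 0.51, 1.50, -0.80]
mode_estimate = simple_mode_estimate(ratios, bandwidth=0.1)
print(f"mode-based estimate: {mode_estimate:.2f}")
```

Here only four of six instruments agree, yet the estimate follows the largest cluster. A very large bandwidth would blur the clusters together, which is why the choice of smoothing parameter matters.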
RECENT EXTENSIONS
Zheng et al., 2017
Multivariable MR
Factorial MR
Two-step MR (mediation)
Bidirectional MR
MR-BASE
What is MR-Base?
[Diagram: web server, web-based API, GWAS database]
MR-Base access
• Web interface: www.mrbase.org/beta
• R package: TwoSampleMR
Information on two-sample MR R package: https://github.com/MRCIEU/TwoSampleMR
Key messages
• MR uses genetic variants as proxies of modifiable exposures and can
overcome some key limitations of observational studies
• MR can reliably test for causal relations provided that IV assumptions are met
• Horizontal pleiotropy is one of the main threats to the validity of MR studies
• Two-sample MR can be performed with free, open access, summary data
from GWAS
• Consistency of results across methods is key to reliable causal inference
Acknowledgements
• Debbie A Lawlor
• Jack Bowden
• Sarah Lewis
• Chris Zheng
References
Motivation
• Davey Smith G, Ebrahim S. Epidemiology--is it time to call it a day? Int J Epidemiol. 2001;30(1):1-11.
• Smith GD, Lawlor DA, Harbord R, Timpson N, Day I, Ebrahim S. Clustered environments and
randomized genes: a fundamental distinction between conventional and genetic epidemiology. PLoS
Med. 2007;4(12):e352.
• Gray R, Wheatley K. How to avoid bias when comparing bone marrow transplantation with
chemotherapy. Bone Marrow Transplant. 1991;7 Suppl 3:9-12.
• Katan MB. Apolipoprotein E isoforms, serum cholesterol, and cancer. Lancet. 1986;1(8479):507-8.
• Smith GD, Ebrahim S. 'Mendelian randomization': can genetic epidemiology contribute to
understanding environmental determinants of disease? Int J Epidemiol. 2003;32(1):1-22.
• Ebrahim S., Davey Smith G. Mendelian randomization: can genetic epidemiology help redress the
failures of observational epidemiology? Hum. Genet. 2008;123:15–33.
Weak instrument bias
• Burgess S, Thompson SG; CRP CHD Genetics Collaboration. Avoiding bias from weak instruments in Mendelian randomization studies. Int J Epidemiol. 2011;40(3):755-64.
• Burgess S, Davies NM, Thompson SG. Bias due to participant overlap in two-sample Mendelian randomization. Genet Epidemiol. 2016;40(7):597-608.
Population stratification
• Balding DJ. A tutorial on statistical methods for population association studies. Nat Rev Genet. 2006;7(10):781-91.
Exclusion restriction assumption
• VanderWeele TJ et al. Methodological challenges in mendelian randomization. Epidemiology. 2014;25(3):427-35.
Pleiotropy
• Visscher PM, Yang J. A plethora of pleiotropy across complex traits. Nat Genet. 2016;48(7):707-8.
• Pickrell JK, Berisa T, Liu JZ, Ségurel L, Tung JY, Hinds DA. Detection and interpretation of shared genetic influences on 42 human traits. Nat Genet. 2016 Jul;48(7):709-17.
One-sample MR
• Burgess S, Thompson SG; CRP CHD Genetics Collaboration.
Avoiding bias from weak instruments in Mendelian randomization studies. Int J
Epidemiol. 2011;40(3):755-64.
• Davies NM, Smith GD, Windmeijer F, Martin RM. Issues in the reporting and conduct
of instrumental variable studies: a systematic review.
Epidemiology. 2013 May;24(3):363-9.
• Burgess S, Small DS, Thompson SG. A review of instrumental variable estimators
for Mendelian randomization. Stat Methods Med Res. 2017 Oct;26(5):2333-2355.
Two-sample MR
• Hartwig FP, Davies NM, Hemani G, Davey Smith G. Two-sample Mendelian randomization: avoiding the downsides of a powerful, widely applicable but potentially fallible technique. Int J Epidemiol. 2016 Dec 1;45(6):1717-1726.
• Burgess S, Davies NM, Thompson SG. Bias due to participant overlap in two-sample Mendelian randomization. Genet Epidemiol. 2016 Nov;40(7):597-608.
• Greco M FD, Minelli C, Sheehan NA, Thompson JR. Detecting pleiotropy in Mendelian randomisation studies with summary data and a continuous outcome. Stat Med. 2015 Sep 20;34(21):2926-40.
• Burgess S, Bowden J, Fall T, Ingelsson E, Thompson SG. Sensitivity Analyses for Robust Causal Inference from Mendelian Randomization Analyses with Multiple Genetic Variants. Epidemiology. 2017 Jan;28(1):30-42.
• Bowden J, Davey Smith G, Burgess S. Mendelian randomization with invalid instruments: effect estimation and bias detection through Egger regression. Int J Epidemiol. 2015;44(2):512-25.
• Burgess S, Thompson SG. Interpreting findings from Mendelian randomization using the MR-Egger method. Eur J Epidemiol. 2017;32(5):377-389.
• Bowden et al. Assessing the suitability of summary data for two-sample Mendelian randomization analyses using MR-Egger regression: the role of the I2 statistic. Int J Epidemiol. 2016;45(6):1961-1974.
• Bowden J et al. Consistent Estimation in Mendelian Randomization with Some Invalid Instruments Using a Weighted Median Estimator. Genet Epidemiol. 2016;40(4):304-14.
• Hartwig FP, Davey Smith G, Bowden J. Robust inference in summary data Mendelian randomization via the zero modal pleiotropy assumption. Int J Epidemiol. 2017;46(6):1985-1998.
MR extensions
• Zheng J, Baird D, Borges MC, Bowden J, Hemani G, Haycock P, Evans DM, Smith GD. Recent Developments in Mendelian Randomization Studies. Curr Epidemiol Rep. 2017;4(4):330-345.
MR-Base
• Hemani G et al. The MR-Base platform supports systematic causal inference across the human phenome. eLife 2018. doi: https://doi.org/10.7554/eLife.34408