Distinguishing genetic correlation from causation among 52 diseases and complex traits

Luke J O'Connor

Harvard T.H. Chan School of Public Health

Pre-print on biorxiv

What is a genetic correlation?

Psychiatric Genomics Consortium 2013 Nat Genet; Bulik-Sullivan et al. 2015b Nat Genet

[Figure panels: correlation across SNPs; correlation across individuals]

What is Mendelian Randomization?

Primary motivation: modifiable exposures to reduce disease risk

If LDL causes CAD: (Voight et al. 2012 Lancet)

– SNPs associated with high LDL are associated with higher CAD risk

– Individuals with high LDL alleles have higher CAD risk

If LDL does not cause CAD, and there is no genetic correlation due to pleiotropy:

– SNPs associated with high LDL are not associated with CAD

– Individuals with high LDL alleles have equal CAD risk

Mendelian randomization is genetic correlation restricted to top SNPs

Davey Smith and Hemani 2014 Hum Mol Genet; Bulik-Sullivan et al. 2015b Nat Genet

[Figure panels: regress SNP effects; regress genetic values]
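The link between the two quantities can be illustrated with a toy simulation. The sketch below is not the method of any cited paper: it assumes noiseless, independent SNPs with bivariate normal effects on two traits, and the function names (`simulate_effects`, `mr_slope_top_snps`) are hypothetical. It computes a genetic correlation as the correlation of effects across all SNPs, then an MR-style slope by regressing trait-2 effects on trait-1 effects using only the top trait-1 SNPs.

```python
import numpy as np

def simulate_effects(m, rho, seed=0):
    """Draw per-SNP effects on two traits with correlation rho (toy model)."""
    rng = np.random.default_rng(seed)
    beta1 = rng.standard_normal(m)
    beta2 = rho * beta1 + np.sqrt(1.0 - rho**2) * rng.standard_normal(m)
    return beta1, beta2

def genetic_correlation(beta1, beta2):
    """Correlation of effect sizes across all SNPs."""
    return float(np.corrcoef(beta1, beta2)[0, 1])

def mr_slope_top_snps(beta1, beta2, n_top):
    """Slope of a regression through the origin, restricted to the
    n_top SNPs with the largest absolute trait-1 effects."""
    top = np.argsort(np.abs(beta1))[-n_top:]
    return float(np.sum(beta1[top] * beta2[top]) / np.sum(beta1[top] ** 2))

beta1, beta2 = simulate_effects(m=50_000, rho=0.3)
rg_all = genetic_correlation(beta1, beta2)
slope_top = mr_slope_top_snps(beta1, beta2, n_top=1_000)
print(f"correlation across all SNPs: {rg_all:.3f}")
print(f"slope using top SNPs only:   {slope_top:.3f}")
```

In this toy setting both numbers estimate the same quantity because the two traits' effects have equal variance; a real analysis would also have to handle sampling noise and LD, which the sketch ignores.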

Pleiotropy is common

Pickrell et al 2016 Nat Genet

Pleiotropy: same variant affects multiple traits

– Effects may or may not be correlated

– May or may not be due to a causal relationship between traits

Genetic Correlations are common

Bulik-Sullivan et al 2015b Nat Genet

Bidirectional MR distinguishes correlation from causation?

If A has a causal effect on B, then:

Pickrell et al 2016 Nat Genet

– Most variants ascertained for B do not affect A

– All variants ascertained for A do affect B

Outline

1. Latent Causal Variable model to distinguish correlation from causation

2. Comparison with MR in simulations

3. Application to MI and other traits

Latent causal variable model

[Diagram: genotype, latent causal variable, and value of trait k. Arrow labels: effect of genotype on the latent causal variable; effect of the latent causal variable on trait k; effect of genotype on trait k not mediated by the latent causal variable]

When

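Special case: full genetic causality

[Diagram: genotype, latent causal variable, and value of trait k. Arrow labels: effect of genotype on the latent causal variable; effect of the latent causal variable on trait k; effect of genotype on trait k not mediated by the latent causal variable]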

Genetic causality proportion measures degree of partial causality

Genetic causality proportion (gcp): number x such that

gcp=1: trait 1 fully genetically causal for trait 2

gcp=-1: trait 2 fully genetically causal for trait 1

gcp=0: no partial causality
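The three reference values can be reproduced with a small helper. This is a sketch under a stated assumption, since the defining formula is not present in the text: it takes q1 and q2 to be the normalized effects of the latent causal variable on traits 1 and 2, with genetic correlation rho_g = q1 * q2, and defines gcp as the x solving q2^2 / q1^2 = (rho_g^2)^x. The function name `gcp_from_q` and the symbols are illustrative, not taken from the slides.

```python
import math

def gcp_from_q(q1: float, q2: float) -> float:
    """Solve q2^2 / q1^2 = (q1^2 * q2^2)^x for x (assumed gcp definition).

    Taking logs gives x = (log|q2| - log|q1|) / (log|q2| + log|q1|).
    """
    if not (0.0 < abs(q1) <= 1.0 and 0.0 < abs(q2) <= 1.0):
        raise ValueError("q1 and q2 must be nonzero with absolute value at most 1")
    a, b = math.log(abs(q1)), math.log(abs(q2))
    if a + b == 0.0:
        raise ValueError("gcp is undefined when |q1 * q2| = 1")
    return (b - a) / (b + a)

print(gcp_from_q(1.0, 0.2))   # trait 1 fully genetically causal for trait 2
print(gcp_from_q(0.2, 1.0))   # trait 2 fully genetically causal for trait 1
print(gcp_from_q(0.5, 0.5))   # symmetric case: no partial causality
print(gcp_from_q(0.8, 0.4))   # intermediate case: partial causality
```

Under this assumed definition, intermediate values arise when the latent variable is more strongly tied to one trait than the other without being identical to it.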

Key intuition: if trait 1 is causal, then SNPs affecting trait 1 have proportional effects on trait 2, but not vice versa

Key equation: relates mixed fourth moments with q under the LCV model

Inference using mixed fourth moments

Key equation relates mixed fourth moments with q

[Equation annotations: excess kurtosis (zero when Gaussian); estimate from summary statistics; genetic correlation (estimate using LDSC); quantity we want]
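A relation of this kind can be derived and checked numerically. The sketch below rests on assumptions that are not stated in the text: each trait's normalized SNP effect is written alpha_k = q_k * pi + gamma_k, where pi has unit variance and pi, gamma_1, gamma_2 are mutually independent with mean zero. Expanding the product under those assumptions gives E[alpha_1^3 * alpha_2] = kappa * q1^3 * q2 + 3 * rho_g, with kappa the excess kurtosis of pi and rho_g = q1 * q2. All symbol and function names here are illustrative.

```python
import numpy as np

def simulate_lcv_effects(m, q1, q2, seed=0):
    """Toy draw of normalized SNP effects alpha_k = q_k * pi + gamma_k."""
    rng = np.random.default_rng(seed)
    # Laplace with scale 1/sqrt(2) has unit variance and excess kurtosis 3.
    pi = rng.laplace(0.0, 1.0 / np.sqrt(2.0), m)
    gamma1 = np.sqrt(1.0 - q1**2) * rng.standard_normal(m)
    gamma2 = np.sqrt(1.0 - q2**2) * rng.standard_normal(m)
    return q1 * pi + gamma1, q2 * pi + gamma2

def predicted_moment(q_cubed, q_linear, excess_kurtosis):
    """kappa * q_cubed^3 * q_linear + 3 * rho_g, with rho_g = q_cubed * q_linear."""
    return excess_kurtosis * q_cubed**3 * q_linear + 3.0 * q_cubed * q_linear

q1, q2, kappa = 1.0, 0.2, 3.0   # full causality of trait 1 in this toy setup
alpha1, alpha2 = simulate_lcv_effects(1_000_000, q1, q2)

m31 = float(np.mean(alpha1**3 * alpha2))   # cubes the trait-1 effect
m13 = float(np.mean(alpha1 * alpha2**3))   # cubes the trait-2 effect
print(f"E[a1^3 a2]: simulated {m31:.3f}, predicted {predicted_moment(q1, q2, kappa):.3f}")
print(f"E[a1 a2^3]: simulated {m13:.3f}, predicted {predicted_moment(q2, q1, kappa):.3f}")
```

The two mixed moments differ only through q1^3 * q2 versus q1 * q2^3, so their asymmetry carries information about which trait is closer to the latent variable. If pi were Gaussian, kappa would be zero and both moments would reduce to 3 * rho_g.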

Posterior estimation of gcp

Mixed 4th moments, block jackknife → approximate likelihood

Uniform prior → posterior mean, standard error

Hypothesis testing: does gcp = 0?
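The two generic ingredients named here, a block jackknife and a posterior under a uniform prior, can be sketched as follows. This is not the authors' implementation: the slides do not specify how the approximate likelihood is built from the jackknife, so the Gaussian log-likelihood in the demo is a placeholder, and the function names are hypothetical.

```python
import numpy as np

def block_jackknife_se(per_snp_values, n_blocks):
    """Standard error of a mean via a leave-one-block-out jackknife."""
    blocks = np.array_split(np.asarray(per_snp_values, dtype=float), n_blocks)
    total = sum(b.sum() for b in blocks)
    n = sum(b.size for b in blocks)
    # Estimate of the mean with each block removed in turn.
    loo = np.array([(total - b.sum()) / (n - b.size) for b in blocks])
    return float(np.sqrt((n_blocks - 1) / n_blocks * np.sum((loo - loo.mean()) ** 2)))

def posterior_mean_sd(grid, log_likelihood):
    """Posterior mean and standard deviation on a grid under a uniform prior."""
    w = np.exp(log_likelihood - np.max(log_likelihood))  # subtract max for stability
    w /= w.sum()
    mean = float(np.sum(grid * w))
    sd = float(np.sqrt(np.sum((grid - mean) ** 2 * w)))
    return mean, sd

# Demo 1: jackknife standard error of a mean of 10,000 unit-variance values.
rng = np.random.default_rng(0)
se = block_jackknife_se(rng.standard_normal(10_000), n_blocks=100)
print(f"jackknife SE: {se:.4f}")

# Demo 2: uniform prior on [-1, 1] with a placeholder Gaussian log-likelihood.
grid = np.linspace(-1.0, 1.0, 201)
log_lik = -0.5 * ((grid - 0.5) / 0.1) ** 2
post_mean, post_sd = posterior_mean_sd(grid, log_lik)
print(f"posterior mean {post_mean:.3f}, posterior sd {post_sd:.3f}")
```

Jackknifing over contiguous blocks rather than single SNPs is the usual way to keep the standard error valid when neighboring SNPs are correlated through LD.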

Outline

1. Latent Causal Variable model to distinguish correlation from causation

2. Comparison with MR in simulations

3. Application to MI and other traits

Simulations: comparison with MR methods

● Comparison with:

– Two-sample MR (Burgess et al. 2013 Genet Epidemiol)

– MR-Egger (Bowden et al. 2015 Int J Epidemiol)

– Bidirectional MR (Pickrell et al. 2016 Nat Genet)

● M = 50k, no LD

● N = 100k, disjoint cohorts

● h2g = 0.3, h2GWAS ~ 0.15

Uncorrelated pleiotropic effects: all methods well calibrated

● Pleiotropic SNPs explaining 20% of heritability for both traits

● Zero genetic correlation

Nonzero genetic correlation: MR confounded

● SNPs with correlated pleiotropic effects explaining 20% of heritability for both traits

● Genetic correlation: 0.2
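Why a genetic correlation alone can mislead an MR-style analysis can be seen in a toy version of this scenario. The sketch below is a simplification, not the simulation behind the slide: it uses true effect sizes with no sampling noise or LD, and the SNP counts, function names, and the through-origin regression are assumptions chosen for illustration. A set of shared SNPs explains 20% of heritability for both traits, neither trait affects the other, and the regression on top SNPs is run in both directions.

```python
import numpy as np

def simulate_correlated_pleiotropy(m=50_000, rho=0.2, n_shared=1_000,
                                   n_only=4_000, seed=0):
    """Effects on two traits that share a latent component but are not causal."""
    rng = np.random.default_rng(seed)
    q = np.sqrt(rho)                       # shared SNPs explain q^2 = rho of each trait
    order = rng.permutation(m)
    shared = order[:n_shared]
    only1 = order[n_shared:n_shared + n_only]
    only2 = order[n_shared + n_only:n_shared + 2 * n_only]
    pi, gamma1, gamma2 = np.zeros(m), np.zeros(m), np.zeros(m)
    pi[shared] = rng.normal(0.0, np.sqrt(m / n_shared), n_shared)   # E[pi^2] = 1
    sd_only = np.sqrt((1.0 - rho) * m / n_only)                     # E[gamma^2] = 1 - rho
    gamma1[only1] = rng.normal(0.0, sd_only, n_only)
    gamma2[only2] = rng.normal(0.0, sd_only, n_only)
    return q * pi + gamma1, q * pi + gamma2

def top_snp_slope(exposure, outcome, n_top=500):
    """Regression through the origin using the top exposure SNPs only."""
    top = np.argsort(np.abs(exposure))[-n_top:]
    return float(np.sum(exposure[top] * outcome[top]) / np.sum(exposure[top] ** 2))

alpha1, alpha2 = simulate_correlated_pleiotropy()
rg = float(np.corrcoef(alpha1, alpha2)[0, 1])
slope_1_to_2 = top_snp_slope(alpha1, alpha2)
slope_2_to_1 = top_snp_slope(alpha2, alpha1)
print(f"genetic correlation: {rg:.3f}")
print(f"slope, trait 1 as exposure: {slope_1_to_2:.3f}")
print(f"slope, trait 2 as exposure: {slope_2_to_1:.3f}")
```

Both slopes come out clearly nonzero even though no causal effect was simulated, because some of the top SNPs for either trait are shared SNPs. The near symmetry of the two directions is the signature of correlation without causation in this toy model.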

Unequal polygenicity between traits: Bi-MR (and MR) confounded

● SNPs affecting trait 1 only: high per-SNP heritability

● SNPs affecting trait 2 only: low per-SNP heritability (4x difference)

● Genetic correlation: 0.2

● Similar results for unequal power

Full genetic causality: all methods (except MR-Egger) well powered

● All SNPs affecting trait 1 also affect trait 2

● Genetic correlation: 0.2

● High power in more challenging simulations as well

Unbiased posterior estimates in simulations with LD

● Real LD patterns

● gcp values drawn from prior distribution

● Unequal polygenicity and power

● Unbiasedness:

Outline

1. Latent Causal Variable model to distinguish correlation from causation

2. Comparison with MR in simulations

3. Application to MI and other traits

Application to 52 traits

● Summary statistic data:

– 37 UK Biobank traits including MI (N=460k)

– 16 other traits (average N=43k)

● Nominally significant genetic correlation: 430 trait pairs

● Significant partial causality: 63 trait pairs (1% FDR)

– Many have low gcp estimates: probably not full genetic causality

Traits affecting myocardial infarction: consistent with known biology

Trait 1            Gen corr      LCV p-val   gcp est       MR ref
BMI                0.34 (0.09)   5x10^-9     0.94 (0.11)   Holmes 2014
Triglycerides      0.30 (0.06)   2x10^-31    0.90 (0.08)   Do 2013
LDL                0.17 (0.08)   4x10^-31    0.73 (0.13)   Voight 2012
Hypothyroidism     0.26 (0.05)   1x10^-11    0.72 (0.16)   Zhao 2017 (null)
High cholesterol   0.52 (0.12)   2x10^-4     0.71 (0.19)   Voight 2012
Fasting glucose    0.19 (0.07)   4x10^-4     0.62 (0.23)   Ahmad 2015 (T2D)

Effect of LDL on BMD consistent with RCTs

Trait 1   Trait 2   Gen corr       LCV p-val   gcp est       MR ref
LDL       BMD       -0.12 (0.05)   7x10^-34    0.80 (0.12)

● Familial defective apolipoprotein B-100: leads to high LDL and low BMD (Yerges-Armstrong et al. 2013 J Endocrinol Metab)

● Effect of statins on BMD in 7-trial meta-analysis (Wang et al. 2016 Medicine)

– Not interpreted specifically as evidence for an effect of LDL

– Modest effect size concordant with modest genetic correlation

Summary

● MR methods can be confounded by genetic correlations

● Partial genetic causality measured by genetic causality proportion (gcp)

● LCV produces unbiased estimates of gcp and well-calibrated p-values

● LCV recapitulates known biology and identifies novel putative causal relationships

Acknowledgements

Alkes Price

Soumya Raychaudhuri

Ben Neale

Chirag Patel

Members of the Price lab

UK Biobank

Pre-print on biorxiv

Related work at ASHG2017: Morrison et al. poster 3004W
