PREDICTION MODELS USING GENOMIC PROFILING H. Zhang E. Warner D. Zhao

GENOMIC PROFILING

• Testing genes at multiple loci simultaneously
• Complex diseases: complex causal pathways
• High number of weak predictors
• Single genes: of limited predictive value

QUESTIONS OF INTEREST

• Is testing low-risk genes at multiple loci clinically useful, i.e. is its discriminative accuracy adequate?
• Can we predict individual genetic risk from GWAS?
• What is its value for assessing susceptibility to common diseases and targeting interventions?

PREDICTIVE TESTING FOR COMPLEX DISEASES USING MULTIPLE GENES: FACT OR FICTION?

Cecile J. W. Janssens, Yurii S. Aulchenko, Stefano Elefante, Gerard J. J. M. Borsboom, Ewout W. Steyerberg, Cornelia M. van Duijn

Genet Med 2006;8(7):395-400

METHODS

• Simulation test: 100,000 subjects, up to 400 genes
• Each gene: two alleles
• Hardy-Weinberg equilibrium
• Disease risk associated with genetic profiles:
  – Bayes’ theorem
  – Multiplicative model
  – No LD between genes
  – All genes predictive of the disease are known

METHODS (CONTINUED)

Part I: genes had the same risk allele frequencies and the same effect on the disease (same ORs)

Part II: genes with varying ORs and allele frequencies
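
As a rough illustration of the Part I setup, here is a minimal sketch that draws genotypes under Hardy-Weinberg equilibrium and assigns each subject a disease risk from a multiplicative odds model with one constant per-allele OR. It is not the authors' code. The function name `simulate_profiles`, its parameters, and the use of `baseline_risk` as the risk of a subject carrying no risk alleles are assumptions made for this example; the slides say only that risk was derived with Bayes' theorem under a multiplicative model.

```python
import random


def simulate_profiles(n_subjects, n_genes, risk_allele_freq, odds_ratio,
                      baseline_risk, seed=0):
    """Return a list of (n_risk_alleles, risk, diseased) tuples.

    Assumptions (illustrative, not taken from the paper):
    - every gene has the same risk allele frequency and per-allele OR (Part I);
    - genes are independent (no LD);
    - baseline_risk is the disease risk of a subject with zero risk alleles.
    """
    rng = random.Random(seed)
    baseline_odds = baseline_risk / (1.0 - baseline_risk)
    subjects = []
    for _ in range(n_subjects):
        # Under Hardy-Weinberg equilibrium the two alleles at a locus are
        # independent draws, so each contributes a risk allele with
        # probability risk_allele_freq.
        n_risk_alleles = sum(
            (rng.random() < risk_allele_freq) + (rng.random() < risk_allele_freq)
            for _ in range(n_genes)
        )
        # Multiplicative model on the odds scale: each risk allele
        # multiplies the odds of disease by odds_ratio.
        odds = baseline_odds * odds_ratio ** n_risk_alleles
        risk = odds / (1.0 + odds)
        diseased = rng.random() < risk
        subjects.append((n_risk_alleles, risk, diseased))
    return subjects


if __name__ == "__main__":
    sample = simulate_profiles(1000, 20, 0.3, 1.25, 0.02, seed=42)
    cases = sum(1 for _, _, diseased in sample if diseased)
    print(f"{cases} simulated cases out of {len(sample)} subjects")
```

Part II could be sketched the same way by passing one allele frequency and one OR per gene instead of single constants.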

DISCRIMINATIVE ACCURACY

AUC (Area Under the ROC Curve)

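Probability that the test correctly identifies the diseased subject in a randomly chosen pair of one diseased and one non-diseased subject (0.5 = no discrimination, 1.0 = perfect discrimination)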
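
The pairwise definition of AUC can be computed directly, as in the short sketch below. The function name `auc` and the convention of counting ties as one half are choices made for this example; the quadratic loop is meant for clarity, not for samples of 100,000 subjects.

```python
def auc(scores_diseased, scores_healthy):
    """Probability that a randomly chosen diseased subject has a higher
    predicted risk than a randomly chosen non-diseased subject.
    Ties count as one half."""
    wins = 0.0
    for d in scores_diseased:
        for h in scores_healthy:
            if d > h:
                wins += 1.0
            elif d == h:
                wins += 0.5
    return wins / (len(scores_diseased) * len(scores_healthy))


if __name__ == "__main__":
    # Three of the four diseased/healthy pairs are ranked correctly.
    print(auc([0.8, 0.3], [0.5, 0.1]))  # 0.75
```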

EXAMPLES OF AUC

FIG. 1. DISCRIMINATIVE ACCURACY OF GENETIC PROFILING (CONSTANT ORS)

FIG. 2. DISCRIMINATIVE ACCURACY OF GENETIC PROFILING (VARYING ORS)

FIG. 3. RELATIONSHIP BETWEEN HERITABILITY AND DISCRIMINATIVE ACCURACY

CONCLUSIONS

• Discriminative accuracy depends on:
  – Number of genes
  – Frequency of risk alleles
  – Risk associated with the genotypes
  – Heritability (few strong predictors or a large number of common susceptibility genes)
• The level of discriminative accuracy required for clinical application depends on the goal of testing, burden of disease, cost, treatment availability, etc.

PREDICTION OF INDIVIDUAL GENETIC RISK TO DISEASE FROM GENOME-WIDE ASSOCIATION STUDIES

Naomi R. Wray, Michael E. Goddard and Peter M. Visscher

Genome Res. 2007;17:1520-1528

PURPOSE

• Research question
  – Can we identify high-risk genetic profiles consisting of multiple risk alleles with small effects at any given locus?
• Aims
  – Investigate the relationship between the RR of genetic loci and the number of loci that contribute to disease risk
  – Investigate the number of loci underlying a complex disease of a given prevalence and heritability
  – Simulate a case-control study to investigate the prediction of genetic risk of disease from multiple loci in a genome-wide association study (GWAS)
  – Use SNPs selected from the simulation to see how accurately they predict risk in a random sample

METHODS

• Repeated simulations
• Parameters
  – Disease prevalence: 0.05 or 0.10
  – Heritability: 0.1 or 0.2
  – Allele frequency distribution: uniform (common disease-common variant) or U-shaped (neutral allele hypothesis)
  – GWAS
    • 500,000 SNPs
    • Number of disease risk loci: 10, 20, 50, 100, 300, 1000
    • 1000 or 10,000 cases and controls

RESULTS: # OF LOCI AND AVERAGE RELATIVE RISK

RESULTS: RISK ALLELE MODELS

RESULTS: SELECTED SNPS

RESULTS: ACCURACY

ASSUMPTIONS AND LIMITATIONS

• True causal SNPs were always included in the GWAS
• All genetic variance was attributable to variants of frequency 0.01 to 0.99
• No population stratification
• All genotypes are in Hardy-Weinberg equilibrium
• No LD between SNPs
• Did not consider gene-gene or gene-environment interactions

CONCLUSIONS

• Prediction of genetic risk is possible, even if there are hundreds of risk variants, each of small effect
• Genomic profiling may not be appropriate for rare diseases
• Implementing these procedures does not require knowledge of the causal mechanism

AN EPIDEMIOLOGIC ASSESSMENT OF GENOMIC PROFILING FOR MEASURING SUSCEPTIBILITY TO COMMON DISEASES AND TARGETING INTERVENTIONS

Muin J. Khoury, Quanhe Yang, Marta Gwinn, Julian Little, W. Dana Flanders

Genet Med 2004;6(1):38-47

PURPOSE

• Standard epidemiological methods already provide information about important exposure-disease associations that can be used to reduce disease burden
• What value does genomic profiling (genetic testing to predict susceptibility) add to the usual epidemiological methods?

TWO ASPECTS OF “VALUE”

• Clinical value (individual level)
  – Clinical validity: can genetic testing help predict future disease (positives and negatives)?
  – Clinical utility: can genetic testing help lower disease risk for people with a “positive” genetic test?
  – See the Task Force on Genetic Testing and the Secretary's Advisory Committee on Genetic Testing (references in paper)
• Public health value (population level)
  – Public health utility: how does the reduction of disease burden in the population based on genetic profiling compare with that of population-wide interventions?

GENERAL METHODS

• Model: Risk = Baseline + Gene1 + Gene2 + Gene3 + Modifiable exposure
• Posited hypothetical but “likely” data by varying the following parameters:
  – Lifetime risk of a disease
  – Number of loci in the genetic test
  – Frequency of genotypes
  – Strength of association between these loci and the disease
  – Strength of association between exposure and the disease
• Calculated value for these hypothetical data by calculating the impact of a targeted intervention on the exposure

CLINICAL VALIDITY AND UTILITY

Technical validity is assumed

PUBLIC HEALTH UTILITY

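Calculate the ratio of PAFt (the reduction of disease burden due to a targeted intervention) to PAF (the population attributable fraction, i.e. the reduction due to a population-wide intervention)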
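
The ratio can be illustrated with a toy calculation. The sketch below is not the paper's procedure. It assumes a single binary genetic test result and a binary modifiable exposure that are independent in the population, and an intervention that removes the exposure entirely. The function name `paf_ratio` and the example risks are hypothetical.

```python
def paf_ratio(genotype_freq, exposure_freq, risk):
    """Return (PAF, PAFt, PAFt / PAF) for a binary test and binary exposure.

    risk[g][e] is the disease risk for test result g (0 = negative,
    1 = positive) and exposure status e (0 = unexposed, 1 = exposed).
    Assumes test result and exposure are independent (illustrative only).
    """
    p_g = (1.0 - genotype_freq, genotype_freq)
    p_e = (1.0 - exposure_freq, exposure_freq)

    # Expected proportion of the population that develops disease.
    total = sum(p_g[g] * p_e[e] * risk[g][e] for g in (0, 1) for e in (0, 1))

    # Cases avoided if exposure is removed in everyone (population-wide).
    prevented_all = sum(
        p_g[g] * p_e[1] * (risk[g][1] - risk[g][0]) for g in (0, 1)
    )
    # Cases avoided if exposure is removed only in test-positive people.
    prevented_targeted = p_g[1] * p_e[1] * (risk[1][1] - risk[1][0])

    paf = prevented_all / total
    paf_t = prevented_targeted / total
    return paf, paf_t, paf_t / paf


if __name__ == "__main__":
    # Hypothetical risks: exposure adds more risk in test-positive people.
    example_risk = [[0.01, 0.02], [0.02, 0.06]]
    paf, paf_t, ratio = paf_ratio(0.10, 0.50, example_risk)
    print(f"PAF={paf:.3f}  PAFt={paf_t:.3f}  ratio={ratio:.3f}")
```

A ratio near 1 means targeting test-positive people captures most of the benefit of intervening on everyone. A ratio near 0 means most preventable cases arise among people who test negative.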

IMPLICATIONS

• There are other parameters that could be varied:
  – Higher synergy will lead to higher predictive values and population impact
  – Epistasis will lead to higher predictive values
• Tension between targeted and population interventions
  – Screening may be a good compromise: population-wide intervention of education and awareness plus targeted intervention
• Genetic testing has different added value under different conditions, and epidemiological methods can be used to determine the extent of that added value

SUMMARY

• Genetic tests should involve multiple loci
• Discriminative accuracy improves with higher heritability
• Number of loci needed to accurately identify association
  – Function of heritability, prevalence and RR
• Complicated relationships between accuracy, allele effect sizes and allele frequency

SUMMARY (CONTINUED)

• The accuracy of the predictive models in the presence of gene-gene or gene-environment interactions may be overstated
• Genetic testing must be applied to all subjects and can be resource-intensive