Genome-Wide Association (GWA) Studies
Teri A. Manolio, M.D., Ph.D.
Senior Advisor to the Director, NHGRI, for Population Genomics
Director, Office of Population Genomics
National Human Genome Research Institute
National Institutes of Health
U.S. Department of Health and Human Services
There’s a revolution going on…
Willard, AM. Spirit of ‘76
• Technologic advances now allow us to measure hundreds of thousands of variable points across human genome
– Relatively low cost
– Relatively little DNA
• Can be applied to unrelated individuals studied over years or decades
• Can identify multitude of subtle genetic effects increasing risk of “complex” disease
What is a GWA Study?
• Method for interrogating all 10 million variable points across human genome
• Variation inherited in groups, or blocks, so not all 10 million points have to be tested
• Blocks are shorter (so need to test more points) the less closely people are related
• Technology now allows studies in unrelated persons, assuming ~10,000 base pair lengths in common (300,000 - 500,000 markers)
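The marker counts quoted above follow from simple division. A minimal sketch, assuming a genome length of roughly 3 billion base pairs (a rounded figure not stated on the slide) and a hypothetical shorter block length of 6,000 base pairs to reach the upper end of the range:

```python
# Rough arithmetic behind the 300,000 - 500,000 marker range.
GENOME_LENGTH_BP = 3_000_000_000  # assumption: ~3 billion base pairs, rounded


def markers_needed(block_length_bp, genome_length_bp=GENOME_LENGTH_BP):
    """One marker per shared block: genome length divided by block length."""
    return genome_length_bp // block_length_bp


# 10,000 bp is the block length given on the slide;
# 6,000 bp is a hypothetical shorter block used only for illustration.
for block in (10_000, 6_000):
    print(f"{block:>6} bp blocks -> {markers_needed(block):,} markers")
```

Shorter shared blocks (less closely related people) push the required marker count up, which is the point of the third bullet.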
Christensen and Murray, N Engl J Med 2007; 356:1094-1097.
Mapping the Relationships Among SNPs
Christensen and Murray, N Engl J Med 2007; 356:1094-1097.
One SNP May Serve as Proxy for Many
Progress in Genotyping Technology
[Figure: cost per genotype (cents, USD, 1 to 10^2) versus number of SNPs (1 to 10^6), log scales, 2001 to 2005. Platforms shown: ABI TaqMan, ABI SNPlex, Illumina GoldenGate, Illumina Infinium/Sentrix, Affymetrix 10K, Affymetrix 100K/500K, Affymetrix MegAllele, Perlegen.]
Courtesy S. Chanock, NCI
Continued Progress in Genotyping Technology
[Figure: cost per person (USD, axis 0 to 1,800) from July 2005 to October 2006 for Affymetrix 500K, Illumina 317K, Illumina 550K, and Illumina 650Y.]
Courtesy S. Gabriel, Broad/MIT
Courtesy, K. Doheny, Johns Hopkins
Intensity Data for Three Combinations of Two Alleles
GWA Genotyping Data, Chromosome 22, Parkinson’s Study

                                 rs5747620            rs2236639
Study ID   Case/Control Status   Allele 1  Allele 2   Allele 1  Allele 2
14         Case                  T         T          G         G
20         Case                  T         C          G         G
41         Case                  T         C          G         G
412        Control               T         C          G         G
592        Control               C         C          G         G
665        Control               T         C          A         G

http://ccr.coriell.org/ninds/
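A first step with data in this shape is tallying alleles separately in cases and controls. A minimal sketch using only the six rows shown above; the tuple layout and the helper name `allele_counts` are illustrative choices, not part of the original dataset format:

```python
from collections import Counter

# Rows copied from the slide:
# (study ID, status, rs5747620 alleles, rs2236639 alleles)
GENOTYPES = [
    (14, "Case", ("T", "T"), ("G", "G")),
    (20, "Case", ("T", "C"), ("G", "G")),
    (41, "Case", ("T", "C"), ("G", "G")),
    (412, "Control", ("T", "C"), ("G", "G")),
    (592, "Control", ("C", "C"), ("G", "G")),
    (665, "Control", ("T", "C"), ("A", "G")),
]


def allele_counts(rows, snp_index):
    """Count alleles per case/control status for one SNP column (index 2 or 3)."""
    counts = {}
    for row in rows:
        status = row[1]
        counts.setdefault(status, Counter()).update(row[snp_index])
    return counts


for name, idx in (("rs5747620", 2), ("rs2236639", 3)):
    for status, tally in allele_counts(GENOTYPES, idx).items():
        print(name, status, dict(tally))
```

With six people the counts mean nothing statistically; the same tally over thousands of people is what feeds the association tables that follow.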
Association of rs2236639 Alleles with Development of Parkinson Disease (Made Up!)

                     Development of Disease
Variant Allele (A)   Develop Disease   Do Not Develop Disease   Total
Present              10                70                       80
Absent               40                880                      920
Total                50                950                      1,000

Relative Risk = Risk in Exposed / Risk in Unexposed = (10/80) / (40/920) = 12.5% / 4.3% = 2.9
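The relative risk arithmetic above can be checked in a few lines. A minimal sketch using the made-up counts from this slide; the function and argument names are illustrative:

```python
def relative_risk(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Risk of disease in the exposed divided by risk in the unexposed."""
    risk_exposed = exposed_cases / exposed_total
    risk_unexposed = unexposed_cases / unexposed_total
    return risk_exposed / risk_unexposed


# Made-up counts from the slide: variant allele present vs absent.
rr = relative_risk(exposed_cases=10, exposed_total=80,
                   unexposed_cases=40, unexposed_total=920)
print(f"Risk in exposed:   {10 / 80:.1%}")
print(f"Risk in unexposed: {40 / 920:.1%}")
print(f"Relative risk:     {rr:.1f}")
```

Note that this calculation needs the row totals (80 and 920), which are only available when a defined group is followed for disease.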
Measures of Association: The Odds Ratio
• Odds are related to probability: odds = p/(1-p)
– If probability of horse winning race is 50%, odds are 1/1
– If probability of horse winning race is 25%, odds are 1/3 for win or 3 to 1 against win
• If probability of exposed person getting disease is 25%, odds = p/(1-p) = 25/75 = 1/3
• When denominators for risk estimates are not available, can calculate odds ratio = cross-product ratio (“ad/bc”); computationally easier
• If disease is rare, odds ratio approximates relative risk but always overestimates effect
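The relationship between probability, odds, and the two ratios can be sketched as follows. The 2x2 counts reuse the made-up Parkinson table from the relative risk slide; the cell labels a, b, c, d follow the usual convention (exposed with/without disease, unexposed with/without disease), which the slide implies but does not spell out:

```python
def odds(p):
    """Convert a probability to odds: p / (1 - p)."""
    return p / (1 - p)


def risk_and_odds_ratio(a, b, c, d):
    """Return (relative risk, odds ratio) for a 2x2 table.

    a, b = exposed with / without disease
    c, d = unexposed with / without disease
    """
    rr = (a / (a + b)) / (c / (c + d))
    odds_ratio = (a * d) / (b * c)  # cross-product ratio "ad/bc"
    return rr, odds_ratio


print(f"odds at p=0.50: {odds(0.50):.2f}")
print(f"odds at p=0.25: {odds(0.25):.2f}")

# Made-up Parkinson counts: 10/70 with the variant, 40/880 without.
rr, odds_ratio = risk_and_odds_ratio(10, 70, 40, 880)
print(f"relative risk: {rr:.2f}, odds ratio: {odds_ratio:.2f}")
```

Here the disease affects 5% of the group, so the odds ratio is close to the relative risk but further from 1.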
Association of rsxxxx3207 Alleles with Occurrence of Myocardial Infarction

                     Presence of Disease
Variant Allele (G)   Present   Absent   Total
Present              813       3,061    ?
Absent               794       3,667    ?
Total                1,607     6,728    ??

OR = Odds in Exposed / Odds in Unexposed = (813 / 3,061) / (794 / 3,667) = (813 x 3,667) / (794 x 3,061) = 1.23

Helgadottir et al, Sciencexpress 3 May 2007.
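The two forms of the odds ratio on this slide, ratio of odds and cross-product, give the same number. A minimal sketch with the counts shown above; function names are illustrative:

```python
def odds_ratio_from_odds(a, b, c, d):
    """Odds in exposed (a/b) divided by odds in unexposed (c/d)."""
    return (a / b) / (c / d)


def odds_ratio_cross_product(a, b, c, d):
    """Cross-product form: ad / bc."""
    return (a * d) / (b * c)


# Counts from the slide: variant present (813 with MI, 3,061 without),
# variant absent (794 with MI, 3,667 without).
a, b, c, d = 813, 3061, 794, 3667
print(f"ratio of odds:  {odds_ratio_from_odds(a, b, c, d):.2f}")
print(f"cross-product:  {odds_ratio_cross_product(a, b, c, d):.2f}")
```

Neither form uses the row totals, which is why the odds ratio can be computed from case-control data where those totals are unknown.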
This is a tsunami of data…
Hokusai, K. The Great Wave
• New approaches needed for accessing, manipulating, visualizing
• Requires entirely new perspective
• Recognize potential for differences to be observed by chance alone
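The last bullet can be made concrete with a little arithmetic. A minimal sketch, assuming 500,000 independent tests (the upper marker count mentioned earlier) and a conventional alpha of 0.05; the Bonferroni correction shown is one common, conservative choice and is not named in these slides:

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of tests passing alpha by chance if no true effects exist."""
    return n_tests * alpha


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test threshold that keeps the overall chance of any false positive near alpha."""
    return alpha / n_tests


N_MARKERS = 500_000  # assumption: one test per marker, treated as independent

print(f"Expected chance findings at p < 0.05: "
      f"{expected_false_positives(N_MARKERS, 0.05):,.0f}")
print(f"Bonferroni per-test threshold: {bonferroni_threshold(N_MARKERS):.0e}")
```

Markers are correlated in practice, so treating every test as independent overstates the correction somewhat; the sketch is only meant to show the scale of the problem.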
A Few Epidemiologic Definitions
P-Value
Probability of finding result as extreme or more extreme by chance alone (0.0001 or 1 x 10^-4)
Type I error (α)
Probability of finding a difference when in fact none exists (also called “spurious association”)
Type II error (β)
Probability of failing to find a difference when in fact one does exist
Power
Probability of finding a difference when one in fact does exist, = 1 - β
Effect Size
Magnitude of risk associated with variant
Sample Size
Related factors: P-value, effect size, allele frequency, variability of measure
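How the P-value threshold, effect size, and allele frequency drive sample size can be sketched with the standard normal-approximation formula for comparing two proportions. This is an illustration under stated assumptions, not a method taken from these slides: effect size is expressed as a per-allele odds ratio, the control allele frequency stands in for the population frequency, and variability of a quantitative measure is not modeled.

```python
from math import ceil
from statistics import NormalDist


def case_allele_frequency(control_freq, odds_ratio):
    """Allele frequency in cases implied by a per-allele odds ratio."""
    return odds_ratio * control_freq / (1 - control_freq + odds_ratio * control_freq)


def alleles_per_group(control_freq, odds_ratio, alpha, power=0.8):
    """Approximate number of alleles needed in each group (cases, controls).

    Two-sided test of two proportions, unpooled variance.
    Divide by 2 for a rough count of people per group.
    """
    p0 = control_freq
    p1 = case_allele_frequency(p0, odds_ratio)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p0 * (1 - p0) + p1 * (1 - p1)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p0) ** 2)


# Hypothetical scenarios chosen only to show the direction of each effect.
for freq, oratio, alpha in [(0.30, 1.2, 0.05), (0.30, 1.2, 1e-7),
                            (0.30, 1.5, 1e-7), (0.05, 1.2, 1e-7)]:
    n = alleles_per_group(freq, oratio, alpha)
    print(f"freq={freq:.2f} OR={oratio} alpha={alpha:g} -> {n:,} alleles per group")
```

Stricter P-value thresholds, smaller effects, and rarer alleles each increase the required sample size.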
Klein et al, Science 2005; 308:385-389.
P Values of GWA Scan for Age-Related Macular Degeneration
http://www.broad.mit.edu/diabetes/scandinavs/type2.html
Genome-Wide Scan for Type 2 Diabetes in a Scandinavian Cohort
P-values for 8q24 SNPs Most Strongly Associated with Prostate Cancer
Haiman et al, Nat Genet 2007; 39:638-44.
P-values for Chromosome 11 SNPs Most Strongly Associated with Diabetes
Scott et al, Sciencexpress 26 April 2007.
Of 600 Gene-Disease Associations, Only 6 Significant in > 75% of Identified Studies

Disease/Trait               Gene    Polymorphism     Frequency
DVT                         F5      Arg506Gln        0.015
Graves’ Disease             CTLA4   Thr17Ala         0.62
Type 1 DM                   INS     5’ VNTR          0.67
HIV/AIDS                    CCR5    32 bp Ins/Del    0.05-0.07
Alzheimer’s                 APOE    Epsilon 2/3/4    0.16-0.24
Creutzfeldt-Jakob Disease   PRNP    Met129Val        0.37

Hirschhorn J et al, Genet Med 2002; 4:45-61.
Aspects of GWA Studies that Make Data Sharing Crucial
• Expensive, generate many “false positives”
• Replication held as sine qua non of valid association
• Large sample sizes and multiple studies needed to replicate findings
• Massive data sets, analysis requires huge and specialized effort
• Better analytic methods needed
• Once genome is measured can be related to just about anything
Larson, G. The Complete Far Side. 2003.
The revolution is here…
Willard, AM. Spirit of ‘76
• Extensive characterization of individual person’s genome now feasible
• Can be applied to unrelated individuals
• Many existing studies have carefully characterized thousands of persons
• New approaches to manipulating and interpreting data needed
• Responsible and widespread data sharing key to fully exploring GWA datasets
• Collaboration for replication and functional determination is crucial
Measures of Public Health Impact: Population Attributable Risk
• Measures the proportion of disease that would be eliminated if particular causal factor were eliminated
• Directly related to prevalence of risk factor and risk it conveys
• Almost always over-estimates proportion attributable to risk factor
PAR = [(Prevalence of exposure) x (Relative risk - 1)] / [1 + (Prevalence of exposure) x (Relative risk - 1)]
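The PAR formula can be tried on the made-up Parkinson table from the relative risk slide, where 80 of 1,000 people carry the variant allele. A minimal sketch; the function name is illustrative and the inputs are the made-up numbers, not real estimates:

```python
def population_attributable_risk(prevalence, relative_risk):
    """PAR = P(RR - 1) / (1 + P(RR - 1)), with P the prevalence of exposure."""
    excess = prevalence * (relative_risk - 1)
    return excess / (1 + excess)


# Made-up inputs: 80 of 1,000 carry the variant; RR = (10/80) / (40/920).
prevalence = 80 / 1000
rr = (10 / 80) / (40 / 920)
par = population_attributable_risk(prevalence, rr)
print(f"Prevalence of exposure: {prevalence:.0%}")
print(f"Relative risk:          {rr:.2f}")
print(f"PAR:                    {par:.1%}")
```

A common variant with a modest relative risk can yield a larger PAR than a rare variant with a high one, which is why both inputs appear in the formula.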
GWA Genotyping Data, Chromosome 22, One Person
412 1 T C G G G G A A A A C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T …
Chromosome 22, One Person, Continued
…G A A A A C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T TC C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T …
Chromosome 22, One Person, Continued…
…A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T TC C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T T T C C A A A A T C A G C C T C T T T C T C T T A G C C C A A A A T C A G C C T C T T T C T C T T A G C C A G A A T C C A A A A T C A G C C T C …