61
0 Analysis of Genome Wide AssociaBon Studies (GWAS) Lecture 20 David K. Gifford MassachuseQs InsOtute of Technology Computa(onal Analysis of QTLs

Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

0

Analysis of Genome Wide AssociaBon Studies (GWAS)

Lecture 20

David K.  Gifford

MassachuseQs InsOtute of Technology

Computa(onal  Analysis  of  QTLs  

Page 2: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

1

Today’s NarraBve Arc

1. We can discover human variants that are associated with aphenotype by studying the genotypes of case and controlpopulaOons– Approach 1 – Use allelic counts from SNP arrays (SNPs  called from

microarray data)– Approach 2 – Use read counts from sequencing (mulOple reads per

variant per individual)

2. We can prioriOze variants based upon their esOmatedimportance

3. Follow up confirmaOon is important because correlaOon isnot equivalent to causality

Computa(onal  Analysis  of  QTLs  

Page 3: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

  

2

Today’s ComputaBonal Approaches

1. ConOngency tables for allelic associaOon tests and genotypicassociaOon tests.

2. Methods of tesOng -­‐ Chi-­‐Square tests, Fisher’s  exact test3. Likelihood based tests of case/control posterior genotypes

Computa(onal  Analysis  of  QTLs  

Page 4: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

   

3

Out of  scope  for  today

1. Non-­‐random  genotyping failure2. Methods to correct for populaOon straOficaOon3. Structural variants (SVs) and copy number variaOons (CNVs)  

Computa(onal  Analysis  of  QTLs  

Page 5: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

4

Mendelian traits are caused by a single gene

Courtesy of David Altshuler. Used with permission.

Slide courtesy of David Altshuler, HMS/Broad

Computa(onal  Analysis  of  QTLs  

Page 6: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

ComputaOonal  Analysis  of  QTLs  AnimaOon: Itsik Pe’er, Columbia

Page 7: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

6

Healthy controlDisease casesComputa(onal  Analysis  of  QTLs  

Page 8: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

Healthy  controls  Disease  cases  7

Healthy  controls  Disease  cases  Computa(onal  Analysis  of  QTLs  

Page 9: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

8

Age-­‐related macular degeneraOon

Cohort – 2172 unrelated European descent individualsat least 60 years old

2004: LiQle known about cause of AMD

934controls

1238 cases

Photographs are in the public domain.

Computa(onal  Analysis  of  QTLs  

Page 10: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

9

SNP rs10611701238 individuals with AMD and 934 controls2172 individuals / 4333 alleles

2 2 (ad − bc) ( a + b + c + d)

χ = (a b)( c + d)( b + d)( a + c)+

X2 = 279 Df = (2 rows-­‐1)x(2 columns-­‐1) = 1

P-­‐value = 1.2 x 10-­‐62

Computa(onal  Analysis  of  QTLs  

Page 11: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

10

ConOngency Tables – χ2 test

n(a + b)(a + c)E1 = X 2 = ∑(a + b + c + d ) i=1

Df = (2 rows-­‐1)x(2 columns-­‐1) = 1

(Oi − Ei )2

Ei

Computa(onal  Analysis  of  QTLs  

Page 12: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

11

ConOngency Tables – Fisher’s  Exact Test

c + d

a + b + c + d a + c

c a + b a p =

$ ! $ ! &&%

##" &&%

##"

$ ! &&%

##"

Sum all probabiliOes for observed and all more extreme values with same marginal totals to compute probability of null hypothesis

Computa(onal  Analysis  of  QTLs  

Page 13: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

12

Does the affected or control group exhibitPopulaOon StraOficaOon?

• PopulaOon straOficaOon is when subpopulaOonsexhibit allelic variaOon because of ancestry

• Can cause false posiOves in an associaOon study ifthere are SNP  differences in the case and controlpopulaOon structures

• Control for this arOfact by tesOng control SNPs  forgeneral elevaOon in χ2 distribuOon between casesand controls

Computa(onal  Analysis  of  QTLs  

Page 14: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

13

Age-­‐related macular degeneraOo 2004: LiQle known about cause of AMD

2006: Three genes (5 common variants)Together explain >50% of risk

Courtesy of Macmillan Publishers Limited. Used with permission.

Source: Maller, Julian, Sarah George, et al. "Common Variation in Three Genes, Including a Noncoding

Variant in CFH, Strongly Influences Risk of Age-related Macular Degeneration." Nature Genetics 38,

no. 9 (2006): 1055-9.

Photographs are in the public domain.RelaOve risk ploQed as a funcOon of the geneOc load of the five variants thatinfluence risk of AMD. Two variants are in the CFH gene on chromosome 1:Y402H and rs1410996. Another common variant (A69S) is in hypotheOcal geneLOC387715 on chromosome 10. Two relaOvely rare variants are observed in theC2 and BF genes on chromosome 6. We find no evidence for interacOon betweenany of these variants, suggesOng an independent mode of acOon.

Edwards et al, Klein  et al, Haines et al Science (2005); JakobsdoRr et al, AJHG (2005); Gold et al Nature Gene-cs  (2006), Maller,  George, Purcell,  Fagerness,  Altshuler, Daly, Seddon,  Nature  GeneOcs (2006)

Computa(onal  Analysis  of  QTLs  

Page 15: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

14

Courtesy of Macmillan Publishers Limited. Used with permission.Source: Burton, Paul R., David G. Clayton, et al. "Genome-wide Association Study of 14,000 Cases of

Seven Common Diseases and 3,000 Shared Controls." Nature 447, no. 7145 (2007): 661-78.

Nature  Vol 447|7 June 2007| doi:10.1038/nature05911

Computa(onal  Analysis  of  QTLs  

Page 16: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

15

Linkage Disequilibrium (LD) between two loci L1 and L2 in gametes

At locus L1 pA probability L1 is A qa probability L1 is a

At locus L2 pB probability L2 is B qb probability L2 is b

D = Measure of linkage disequilibrium= 0 when L1 and L2 are in equilibrium

r2 = D2 / (pAqapBqb)

D=PABPab -­‐ PAbPaB

r is [0,1]  and is the correlaOon coefficient between allelic states in L1 and L2

Computa(onal  Analysis  of  QTLs  

Page 17: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

16

r2 from human chromosome 22 Computa(onal  Analysis  of  QTLs  

Page 18: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

17

LD organizes the genome into haplotype blocks

Human genome 5q31 region (associated with Inflammatory Bowel Disease)

© source unknown. All rights reserved. This content is excluded from our Creative

Commons license. For more information, see http://ocw.mit.edu/help/faq-fair-use/.

Computa(onal  Analysis  of  QTLs  

Page 19: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

18

The length of haplotype blocks vs Ome

© source unknown. All rights reserved. This content is excluded from our Creative

Commons license. For more information, see http://ocw.mit.edu/help/faq-fair-use/.

Computa(onal  Analysis  of  QTLs  

Page 20: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known
Page 21: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

    

20

Variant Phasing  

1. Phasing assigns alleles to their parental chromosome2. Set of ordered alleles along a chromosome is a haplotype3. Known  haplotypes can assist with phasing4. Phasing is criOcal for understanding the funcOonal status of

genes with more than one important SNPs  (are the non-­‐reference alleles on different chromosome? If so, the genemay not be funcOonal)

5. New  long read sequencing technologies phase variants inobserved reads

Computa(onal  Analysis  of  QTLs  

Page 22: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

21

Today’s NarraBve Arc

1. We can discover human variants that are associated with aphenotype by studying the genotypes of case and controlpopulaOons– Approach 1 – Use allelic counts from SNP  arrays (SNPs  called from

microarray data)– Approach 2 – Use read counts from sequencing (mulBple  reads per

variant  per  individual)

2. We can prioriOze variants based upon their esOmatedimportance

3. Follow up confirmaOon is important because correlaOon isnot equivalent to causality

Computa(onal  Analysis  of  QTLs  

Page 23: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

22  

22

© source unknown. All rights reserved. This content is excluded from our Creative

Commons license. For more information, see http://ocw.mit.edu/help/faq-fair-use/.

Computa(onal  Analysis  of  QTLs  

Page 24: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

23  

23

© source unknown. All rights reserved. This content is excluded from our Creative

Commons license. For more information, see http://ocw.mit.edu/help/faq-fair-use/.

Computa(onal  Analysis  of  QTLs  

Page 25: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

24  

24 Computa(onal  Analysis  of  QTLs  

Page 26: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

25  

25

© source unknown. All rights reserved. This content is excluded from our Creative

Commons license. For more information, see http://ocw.mit.edu/help/faq-fair-use/.

Computa(onal  Analysis  of  QTLs  

Page 27: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

26  

26

© source unknown. All rights reserved. This content is excluded from our Creative

Commons license. For more information, see http://ocw.mit.edu/help/faq-fair-use/.

Computa(onal  Analysis  of  QTLs  

Page 28: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

27

Genome Analysis Tool Kit (GATK) Computa(onal  Analysis  of  QTLs  

Courtesy of the Broad Institute. Used with permission. The most recent best practices can

be found at this website: https://www.broadinstitute.org/gatk/guide/best-practices.

Page 29: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

28

© source unknown. All rights reserved. This content is excluded from our Creative

Commons license. For more information, see http://ocw.mit.edu/help/faq-fair-use/.

Computa(onal  Analysis  of  QTLs  

Page 30: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

29

© source unknown. All rights reserved. This content is excluded from our Creative

Commons license. For more information, see http://ocw.mit.edu/help/faq-fair-use/.

Computa(onal  Analysis  of  QTLs  

Page 31: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

30 Computa(onal  Analysis  of  QTLs  

Courtesy of the Broad Institute. Used with permission. The most recent best practices can

be found at this website: https://www.broadinstitute.org/gatk/guide/best-practices.

Page 32: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

31

© source unknown. All rights reserved. This content is excluded from our Creative

Commons license. For more information, see http://ocw.mit.edu/help/faq-fair-use/.

Computa(onal  Analysis  of  QTLs  

Page 33: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

32

© source unknown. All rights reserved. This content is excluded from our Creative

Commons license. For more information, see http://ocw.mit.edu/help/faq-fair-use/.

Computa(onal  Analysis  of  QTLs  

Page 34: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

33

© source unknown. All rights reserved. This content is excluded from our Creative

Commons license. For more information, see http://ocw.mit.edu/help/faq-fair-use/.

Computa(onal  Analysis  of  QTLs  

Page 35: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

ComputaOona Analysi o QTLs 34

CompuOng genotypes

1= ∑ P(G)G∈{AA,AC ,...,TT }

Given the reads we observe we wish to compute P(Gp) at a SNP  for a populaOon p (p could be cases or controls)

Page 36: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

35

Joint esOmaOon of genotype frequencies

Computa(onal  Analysis  of  QTLs  

Page 37: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

36

Compute Bayesian posterior genotype frequencies (G) for each individual from their reads (D)

G=H1,H2

Computa(onal  Analysis  of  QTLs  

Page 38: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

37

Haploid likelihood considers the probability of errors

[Dj is a single read]

Computa(onal  Analysis  of  QTLs  

Page 39: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

38

Joint esOmaOon of genotype frequencies

Computa(onal  Analysis  of  QTLs  

Page 40: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

39

EM can be used to improve the esOmate of P(Gp)

n )( ) t)(t+1) 1 P(Di | Gp )P(GpP(G =

n ' 'i=1 | Gp)P G p p ∑

∑P(Di ( ) ( ) ' t' Gp

Computa(onal  Analysis  of  QTLs  

Page 41: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

40

TesOng for associaOons

Assume a reference allele (A) and a single non-­‐reference allele (a)

ψ = P(A) (1−ψ) = P(a) ε0 = P(AA) ε1 = P(Aa) ε2 = P(aa)

Computa(onal  Analysis  of  QTLs  

Page 42: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

41

TesOng for Hardy Weinberg Equilibrium (HWE)

When a populaOon is in HWE we can computegenotypic frequencies from allelic frequencies

We can test for HWE as follows –

P(D |ε0 ,ε1,ε2 )T3 = 2log P(D | (1−ψ)2,2ψ(1−ψ),ψ 2 )

Computa(onal  Analysis  of  QTLs  

Page 43: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

42

TesOng for associaOons

P(D[1] | ψ[1] )P(D[2] | ψ[2] )T1 = 2log P(D | ψ )

[1] and [2] are cases and controls. Do not use T2 when populaOon isin HWE as it will be underpowered (too many DOF)

P(D[1] |ε0[1] [1] [1] )P(D[2] |ε0

[2] [2] [2] ),ε1 ,ε2 ,ε1 ,ε2T2 = 2log P(D |ε0 ,ε1,ε2 )

Computa(onal  Analysis  of  QTLs  

Page 44: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

43 Computa(onal  Analysis  of  QTLs  

Page 45: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

44

© source unknown. All rights reserved. This content is excluded from our Creative

Commons license. For more information, see http://ocw.mit.edu/help/faq-fair-use/.

Computa(onal  Analysis  of  QTLs  

Page 46: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

45

© source unknown. All rights reserved. This content is excluded from our Creative

Commons license. For more information, see http://ocw.mit.edu/help/faq-fair-use/.

Computa(onal  Analysis  of  QTLs  

Page 47: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

46 Computa(onal  Analysis  of  QTLs  

Page 48: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

47

© source unknown. All rights reserved. This content is excluded from our Creative

Commons license. For more information, see http://ocw.mit.edu/help/faq-fair-use/.

Computa(onal  Analysis  of  QTLs  

Page 49: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

48 Computa(onal  Analysis  of  QTLs  

Page 50: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

49 Computa(onal  Analysis  of  QTLs  

Page 51: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

50

Today’s NarraBve Arc

1. We can discover human variants that are associated with aphenotype by studying the genotypes of case and controlpopulaOons– Approach 1 – Use allelic counts from SNP  arrays (SNPs  called from

microarray data)– Approach 2 – Use read counts from sequencing (mulOple reads per

variant per individual)

2. We can prioriBze  variants  based upon their  esBmatedimportance

3. Follow up confirmaOon is important because correlaOon isnot equivalent to causality

Computa(onal  Analysis  of  QTLs  

Page 52: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

51 Computa(onal  Analysis  of  QTLs  

© Macmillan Publishers Limited. All rights reserved. This content is excluded from our Creative

Commons license. For more information, see http://ocw.mit.edu/help/faq-fair-use/.Source: Weedon, Michael N., Inês Cebola, et al. "Recessive Mutations in a Distal PTF1A

Enhancer Cause Isolated Pancreatic Agenesis." Nature Genetics (2013).

Page 53: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

52

IdenOficaOon of possible funcOonal variants

Courtesy of Macmillan Publishers Limited. Used with permission.

Source: Weedon, Michael N., Inês Cebola, et al. "Recessive Mutations in a Distal PTF1A Enhancer

Cause Isolated Pancreatic Agenesis." Nature Genetics (2013).

Computa(onal  Analysis  of  QTLs  

Page 54: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

53

AssociaOon of variants with pedigrees

Courtesy of Macmillan Publishers Limited. Used with permission.

Source: Weedon, Michael N., Inês Cebola, et al. "Recessive Mutations in a Distal PTF1A Enhancer

Cause Isolated Pancreatic Agenesis." Nature Genetics (2013).

Computa(onal  Analysis  of  QTLs  

Page 55: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

54

Today’s NarraBve Arc

1. We can discover human variants that are associated with aphenotype by studying the genotypes of case and controlpopulaOons– Approach 1 – Use allelic counts from SNP  arrays (SNPs  called from

microarray data)– Approach 2 – Use read counts from sequencing (mulOple reads per

variant per individual)

2. We can prioriOze variants based upon their esOmatedimportance

3. Follow up confirmaBon is important because correlaBon isnot equivalent  to causality  

Computa(onal  Analysis  of  QTLs  

Page 56: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

55

ConfirmaOon of variant funcOon

Courtesy of Macmillan Publishers Limited. Used with permission.Source: Weedon, Michael N., Inês Cebola, et al. "Recessive Mutations in a Distal PTF1A Enhancer

Cause Isolated Pancreatic Agenesis." Nature Genetics (2013).

Computa(onal  Analysis  of  QTLs  

Page 57: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

56

© American Association for the Advancement of Science. All rights reserved. This content is excludedfrom our Creative Commons license. For more information, see http://ocw.mit.edu/help/faq-fair-use/.

Source: Roberts, Nicholas J., Joshua T. Vogelstein, et al. "The Predictive Capacity of Personal Genome

Sequencing." Science Translational Medicine 4, no. 133 (2012): 133ra58.

Computa(onal  Analysis  of  QTLs  

Page 58: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

57

© American Association for the Advancement of Science. All rights reserved. This content is excluded

from our Creative Commons license. For more information, see http://ocw.mit.edu/help/faq-fair-use/.

Source: Roberts, Nicholas J., Joshua T. Vogelstein, et al. "The Predictive Capacity of Personal Genome

Sequencing." Science Translational Medicine 4, no. 133 (2012): 133ra58.

Computa(onal  Analysis  of  QTLs  

Page 59: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

58

FIN

Computa(onal  Analysis  of  QTLs  

Page 60: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

59 Computa(onal  Analysis  of  QTLs  

Page 61: Analysis of Genome Wide AssociaBon Studies (GWAS) …...1. Phasing assigns alleles to their parental chromosome 2. Set of ordered alleles along a chromosome is a haplotype 3. Known

MIT OpenCourseWarehttp://ocw.mit.edu

7.91J / 20.490J / 20.390J / 7.36J / 6.802J / 6.874J / HST.506J Foundations of Computational and Systems Biology Spring 2014

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.