Transcript
Page 1: Haplotype Discovery and Modeling

Haplotype Discovery and Modeling

Page 2: Haplotype Discovery and Modeling

Identification of genes

Identify the phenotype → Map → Clone

Page 3: Haplotype Discovery and Modeling

QTL Mapping

[Figure: Marker 1, Marker 2, Marker 3, …, Marker k, and a QTL]

A QTL (quantitative trait locus) is a gene that affects a quantitative trait.

The QTL detected by the markers linked with it is a chromosomal segment.

The DNA structure of a QTL is unknown.


Page 4: Haplotype Discovery and Modeling

QTL Mapping Based on Linkage

[Figure: three-generation pedigree (generations I, II, III) with numbered individuals labeled by two-locus genotypes: AABb, AAbb, AaBb, Aabb, aaBb, aabb]

Page 5: Haplotype Discovery and Modeling

Mapping and sequencing

[Figure: scale labels 10000 Kb and 100 Kb; tracks labeled Markers and DNA clones]

Page 6: Haplotype Discovery and Modeling

SNPs (‘snips’)

• A SNP (single nucleotide polymorphism) is a site in the DNA at which different chromosomes carry different bases.

Page 7: Haplotype Discovery and Modeling

SNPs

Paternal allele: CCCGCCTTCTTGGCTTTACA

Maternal allele: CCCGCCTTCTCGGCTTTACA

Paternal allele: CCCGCCTTCTTGGCTTTACA

Maternal allele: CCCGCCTTCTTGGCTTTACA
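As a minimal sketch of how the two allele pairs above can be compared, the following Python snippet lists the positions at which two aligned sequences differ. The function name `snp_positions` and the variable names are illustrative and not part of the original slides.

```python
def snp_positions(seq_a, seq_b):
    """Return the 1-based positions where two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return [i + 1 for i, (x, y) in enumerate(zip(seq_a, seq_b)) if x != y]


paternal = "CCCGCCTTCTTGGCTTTACA"
maternal_first_pair = "CCCGCCTTCTCGGCTTTACA"   # first pair on the slide
maternal_second_pair = "CCCGCCTTCTTGGCTTTACA"  # second pair on the slide

# The first pair differs at one site (T versus C); the second pair is identical.
heterozygous_sites = snp_positions(paternal, maternal_first_pair)
homozygous_sites = snp_positions(paternal, maternal_second_pair)
```

In the first pair the individual is heterozygous at the single differing site; in the second pair both chromosomes carry the same base there.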

Page 8: Haplotype Discovery and Modeling

HapMap

Single Nucleotide Polymorphisms (SNPs)

Insensitive to drug

Sensitive to drug

Detecting specific DNA sequence variants that determine complex traits

The International HapMap Consortium (Nature, 2003, 2005)

Page 9: Haplotype Discovery and Modeling

Allele, Haplotype, and Diplotype

Basic concepts

Page 10: Haplotype Discovery and Modeling

Haplotyping a Phenotype

Basic concepts

Quantitative Trait Nucleotide (QTN)

Page 11: Haplotype Discovery and Modeling

Risk Haplotype and Composite Diplotype

Basic concepts

Consider a QTN composed of two SNPs:

Risk haplotype: [AB] = R

Non-risk haplotypes: [Ab], [aB], [ab] = r

Composite diplotypes: RR, Rr, rr

[Figure: illustrations of the composite diplotypes RR (2), Rr (1), and rr (0)]
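The collapsing of diplotypes into composite diplotypes can be sketched in a few lines of Python. This is only an illustration: the function name `composite_diplotype` and the representation of a diplotype as a pair of two-letter strings are assumptions made here.

```python
def composite_diplotype(diplotype, risk="AB"):
    """Collapse a diplotype (a pair of haplotypes) into RR, Rr, or rr.

    The label depends only on how many copies of the risk haplotype
    the diplotype carries: 2 -> RR, 1 -> Rr, 0 -> rr.
    """
    copies = sum(1 for haplotype in diplotype if haplotype == risk)
    return {2: "RR", 1: "Rr", 0: "rr"}[copies]


labels = [
    composite_diplotype(("AB", "AB")),  # two risk copies
    composite_diplotype(("AB", "ab")),  # one risk copy
    composite_diplotype(("Ab", "aB")),  # no risk copy
]
```

Passing a different `risk` argument, for example `risk="Ab"`, relabels the same diplotypes under a different assumed risk haplotype.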

Page 12: Haplotype Discovery and Modeling

Study design: a random sample of unrelated individuals from a natural population

Group  SNP 1  SNP 2  Diplotype              Obs.     Drug response trait
1      AA     BB     [AB][AB]               n11/11   y1 = (y_{11}, …, y_{1 n11/11})^T
2      AA     Bb     [AB][Ab]               n11/10   y2 = (y_{21}, …, y_{2 n11/10})^T
3      AA     bb     [Ab][Ab]               n11/00   y3 = (y_{31}, …, y_{3 n11/00})^T
4      Aa     BB     [AB][aB]               n10/11   y4 = (y_{41}, …, y_{4 n10/11})^T
5      Aa     Bb     [AB][ab] or [Ab][aB]   n10/10   y5 = (y_{51}, …, y_{5 n10/10})^T
6      Aa     bb     [Ab][ab]               n10/00   y6 = (y_{61}, …, y_{6 n10/00})^T
7      aa     BB     [aB][aB]               n00/11   y7 = (y_{71}, …, y_{7 n00/11})^T
8      aa     Bb     [aB][ab]               n00/10   y8 = (y_{81}, …, y_{8 n00/10})^T
9      aa     bb     [ab][ab]               n00/00   y9 = (y_{91}, …, y_{9 n00/00})^T
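The table shows that only the double heterozygote (Aa Bb) is consistent with two different diplotypes. A small Python sketch makes this explicit by enumerating the diplotypes compatible with an unphased two-SNP genotype. The function name `compatible_diplotypes` and the string encoding of genotypes are assumptions made for illustration.

```python
from itertools import product


def compatible_diplotypes(geno1, geno2):
    """List the unordered diplotypes consistent with unphased genotypes.

    geno1 and geno2 are two-character genotypes at SNP 1 and SNP 2,
    for example "Aa" and "Bb".
    """
    found = set()
    # Choose which allele at each SNP sits on the first chromosome;
    # the second chromosome receives the remaining alleles.
    for i, j in product(range(2), repeat=2):
        hap1 = geno1[i] + geno2[j]
        hap2 = geno1[1 - i] + geno2[1 - j]
        found.add(tuple(sorted((hap1, hap2))))
    return sorted(found)


double_het = compatible_diplotypes("Aa", "Bb")
single_het = compatible_diplotypes("AA", "Bb")
```

For every genotype other than Aa Bb the list has exactly one entry, which is why the phase ambiguity enters the likelihood only through group 5.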

Page 13: Haplotype Discovery and Modeling

Unifying Likelihood based on marker (S) and phenotype (y) data

There are two types of parameters:

- Haplotype frequencies (population genetic parameters Ω_p)
    [AB]: p11 = pq + D
    [Ab]: p10 = p(1-q) - D
    [aB]: p01 = (1-p)q - D
    [ab]: p00 = (1-p)(1-q) + D
  where p is the frequency of allele A at SNP 1, q is the frequency of allele B at SNP 2, and D is the linkage disequilibrium between the two SNPs.

- Haplotype effects and variation (quantitative genetic parameters Ω_q)
    RR: µ2 = µ + a   (a = additive effect)
    Rr: µ1 = µ + d   (d = dominance effect)
    rr: µ0 = µ - a

Liu, Johnson, Casella and Wu, 2004, Genetics
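The haplotype frequency parameterization can be checked numerically with a short Python sketch. The function name `haplotype_frequencies` and the example values of p, q, and D are hypothetical and chosen only for illustration.

```python
def haplotype_frequencies(p, q, D):
    """Haplotype frequencies from allele frequencies p, q and disequilibrium D."""
    freqs = {
        "AB": p * q + D,
        "Ab": p * (1 - q) - D,
        "aB": (1 - p) * q - D,
        "ab": (1 - p) * (1 - q) + D,
    }
    # D must keep every haplotype frequency non-negative.
    if min(freqs.values()) < 0:
        raise ValueError("D is outside the range allowed by p and q")
    return freqs


freqs = haplotype_frequencies(p=0.6, q=0.7, D=0.05)  # hypothetical values
total = sum(freqs.values())
# D can be recovered from the four frequencies as p11*p00 - p10*p01.
recovered_D = freqs["AB"] * freqs["ab"] - freqs["Ab"] * freqs["aB"]
```

The four frequencies always sum to one because the D terms cancel, so the model has three free population genetic parameters.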

Page 14: Haplotype Discovery and Modeling

Modeling Haplotype Frequencies

Group  SNP 1  SNP 2  Diplotype   Frequency    Obs.
1      AA     BB     [AB][AB]    p11^2        n11/11
2      AA     Bb     [AB][Ab]    2 p11 p10    n11/10
3      AA     bb     [Ab][Ab]    p10^2        n11/00
4      Aa     BB     [AB][aB]    2 p11 p01    n10/11
5      Aa     Bb     [AB][ab]    2 p11 p00    n10/10
                     [Ab][aB]    2 p10 p01
6      Aa     bb     [Ab][ab]    2 p10 p00    n10/00
7      aa     BB     [aB][aB]    p01^2        n00/11
8      aa     Bb     [aB][ab]    2 p01 p00    n00/10
9      aa     bb     [ab][ab]    p00^2        n00/00
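The frequency column can be reproduced with a short Python sketch. It assumes, as the squared and doubled terms in the table imply, that diplotypes form by random pairing of haplotypes. The function name and the example frequencies are hypothetical.

```python
def genotype_frequencies(p):
    """Expected two-SNP genotype frequencies from haplotype frequencies.

    p maps the haplotypes "AB", "Ab", "aB", "ab" to their frequencies.
    Random pairing of haplotypes is assumed.
    """
    return {
        ("AA", "BB"): p["AB"] ** 2,
        ("AA", "Bb"): 2 * p["AB"] * p["Ab"],
        ("AA", "bb"): p["Ab"] ** 2,
        ("Aa", "BB"): 2 * p["AB"] * p["aB"],
        # The double heterozygote pools two diplotypes.
        ("Aa", "Bb"): 2 * p["AB"] * p["ab"] + 2 * p["Ab"] * p["aB"],
        ("Aa", "bb"): 2 * p["Ab"] * p["ab"],
        ("aa", "BB"): p["aB"] ** 2,
        ("aa", "Bb"): 2 * p["aB"] * p["ab"],
        ("aa", "bb"): p["ab"] ** 2,
    }


hap = {"AB": 0.47, "Ab": 0.13, "aB": 0.23, "ab": 0.17}  # hypothetical values
geno = genotype_frequencies(hap)
geno_total = sum(geno.values())
```

The nine genotype frequencies sum to one whenever the four haplotype frequencies do.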

Page 15: Haplotype Discovery and Modeling

EM algorithm

E step

M step
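The formulas for this slide did not survive in the transcript. As a hedged sketch, a standard EM scheme for two-SNP haplotype frequencies works as follows: the E step computes the probability `phi` that a double heterozygote carries [AB][ab] rather than [Ab][aB], and the M step recounts haplotypes using that probability. The function name, the `counts` layout, and the example counts are assumptions made here, and the slide's own notation may differ.

```python
def em_haplotype_frequencies(counts, max_iter=500, tol=1e-12):
    """Estimate two-SNP haplotype frequencies from unphased genotype counts.

    counts maps (SNP 1 genotype, SNP 2 genotype), e.g. ("Aa", "Bb"), to the
    number of individuals observed; missing genotypes count as zero.
    """
    def n(g1, g2):
        return counts.get((g1, g2), 0)

    chromosomes = 2 * sum(counts.values())
    p = {"AB": 0.25, "Ab": 0.25, "aB": 0.25, "ab": 0.25}  # starting values
    double_het = n("Aa", "Bb")
    for _ in range(max_iter):
        # E step: probability that a double heterozygote is [AB][ab].
        cis = p["AB"] * p["ab"]
        trans = p["Ab"] * p["aB"]
        phi = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M step: count haplotypes, splitting double heterozygotes by phi.
        new_p = {
            "AB": (2 * n("AA", "BB") + n("AA", "Bb") + n("Aa", "BB")
                   + phi * double_het) / chromosomes,
            "Ab": (2 * n("AA", "bb") + n("AA", "Bb") + n("Aa", "bb")
                   + (1 - phi) * double_het) / chromosomes,
            "aB": (2 * n("aa", "BB") + n("Aa", "BB") + n("aa", "Bb")
                   + (1 - phi) * double_het) / chromosomes,
            "ab": (2 * n("aa", "bb") + n("Aa", "bb") + n("aa", "Bb")
                   + phi * double_het) / chromosomes,
        }
        converged = max(abs(new_p[h] - p[h]) for h in p) < tol
        p = new_p
        if converged:
            break
    return p


# Hypothetical genotype counts, for illustration only.
example_counts = {("AA", "BB"): 5, ("Aa", "Bb"): 10, ("aa", "bb"): 5}
estimate = em_haplotype_frequencies(example_counts)
```

When the sample contains no double heterozygotes the first iteration already gives the direct haplotype counts, so the iteration matters only for resolving group 5.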

Page 16: Haplotype Discovery and Modeling

Modeling Haplotype Effects

                                            Risk haplotype
Group  SNP 1  SNP 2  Diplotype   [AB]  [Ab]  [aB]  [ab]
1      AA     BB     [AB][AB]    RR    rr    rr    rr
2      AA     Bb     [AB][Ab]    Rr    Rr    rr    rr
3      AA     bb     [Ab][Ab]    rr    RR    rr    rr
4      Aa     BB     [AB][aB]    Rr    rr    Rr    rr
5      Aa     Bb     [AB][ab]    Rr    rr    rr    Rr
                     [Ab][aB]    rr    Rr    Rr    rr
6      Aa     bb     [Ab][ab]    rr    Rr    rr    Rr
7      aa     BB     [aB][aB]    rr    rr    RR    rr
8      aa     Bb     [aB][ab]    rr    rr    Rr    Rr
9      aa     bb     [ab][ab]    rr    rr    rr    RR

Likelihood                       L1    L2    L3    L4

Genotypic values of composite diplotypes: RR: µ2, Rr: µ1, rr: µ0

Page 17: Haplotype Discovery and Modeling

Mixture Model: assuming that [AB] is the risk haplotype
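The formula on this slide did not survive in the transcript. One plausible form, consistent with the composite diplotype table, is sketched here under stated assumptions: each composite diplotype has a normal phenotype density f_j with mean µ_j (j = 2, 1, 0) and common variance σ², and only the double heterozygotes contribute a two-component mixture. The symbols ω, n_{Rr}, and n_{rr} are introduced for this sketch and may not match the slide's notation.

$$
L(\mu_2, \mu_1, \mu_0, \sigma^2 \mid y, S) =
\prod_{i=1}^{n_{11/11}} f_2(y_i)
\prod_{i=1}^{n_{Rr}} f_1(y_i)
\prod_{i=1}^{n_{rr}} f_0(y_i)
\prod_{i=1}^{n_{10/10}} \left[ \omega f_1(y_i) + (1-\omega) f_0(y_i) \right]
$$

Here n_{Rr} = n_{11/10} + n_{10/11} counts the individuals known to be Rr, n_{rr} = n_{11/00} + n_{10/00} + n_{00/11} + n_{00/10} + n_{00/00} counts those known to be rr, and ω = p11 p00 / (p11 p00 + p10 p01) is the probability that a double heterozygote carries the diplotype [AB][ab].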

Page 18: Haplotype Discovery and Modeling

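EM Algorithm

• E step: for each double heterozygote (Aa Bb), compute the posterior probability that its diplotype is [AB][ab] (composite diplotype Rr) rather than [Ab][aB] (rr):

$$
\Pi_i = \frac{\omega f_1(y_i)}{\omega f_1(y_i) + (1-\omega) f_0(y_i)}
$$

where f_1 and f_0 are the phenotype densities of Rr and rr, and ω = p11 p00 / (p11 p00 + p10 p01) is the prior probability of diplotype [AB][ab] among double heterozygotes.

• M step:

$$
\hat{\mu}_2 = \frac{\sum_{i=1}^{n_{11/11}} y_i}{n_{11/11}}
$$

$$
\hat{\mu}_1 = \frac{\sum_{i=1}^{n_{Rr}} y_i + \sum_{i=1}^{n_{10/10}} \Pi_i y_i}{n_{Rr} + \sum_{i=1}^{n_{10/10}} \Pi_i}
$$

$$
\hat{\mu}_0 = \frac{\sum_{i=1}^{n_{rr}} y_i + \sum_{i=1}^{n_{10/10}} (1-\Pi_i) y_i}{n_{rr} + \sum_{i=1}^{n_{10/10}} (1-\Pi_i)}
$$

$$
\hat{\sigma}^2 = \frac{1}{n} \left\{ \sum_{i=1}^{n_{11/11}} (y_i - \hat{\mu}_2)^2 + \sum_{i=1}^{n_{Rr}} (y_i - \hat{\mu}_1)^2 + \sum_{i=1}^{n_{rr}} (y_i - \hat{\mu}_0)^2 + \sum_{i=1}^{n_{10/10}} \left[ \Pi_i (y_i - \hat{\mu}_1)^2 + (1-\Pi_i)(y_i - \hat{\mu}_0)^2 \right] \right\}
$$

Here n_{Rr} = n_{11/10} + n_{10/11} counts the individuals known to be Rr, n_{rr} = n_{11/00} + n_{10/00} + n_{00/11} + n_{00/10} + n_{00/00} counts those known to be rr, and n is the total sample size.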
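The E and M steps can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: it assumes normal phenotype densities with a common variance, takes [AB] as the risk haplotype, and holds the mixing probability `omega` fixed for simplicity. The function names, argument layout, and example phenotypes are hypothetical.

```python
import math


def normal_pdf(y, mu, sigma2):
    """Density of a normal distribution with mean mu and variance sigma2."""
    return math.exp(-(y - mu) ** 2 / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)


def em_mixture(y_RR, y_Rr, y_rr, y_dh, omega, n_iter=100):
    """EM estimates of (mu2, mu1, mu0, sigma2).

    y_RR, y_Rr, y_rr: phenotypes of individuals whose composite diplotype
    is known (each list assumed non-empty).
    y_dh: phenotypes of double heterozygotes, which are Rr with prior
    probability omega and rr otherwise.
    """
    all_y = y_RR + y_Rr + y_rr + y_dh
    n = len(all_y)
    mu2 = sum(y_RR) / len(y_RR)  # closed form, needs no iteration
    # Starting values taken from individuals with known diplotypes.
    mu1 = sum(y_Rr) / len(y_Rr)
    mu0 = sum(y_rr) / len(y_rr)
    grand_mean = sum(all_y) / n
    sigma2 = sum((v - grand_mean) ** 2 for v in all_y) / n
    for _ in range(n_iter):
        # E step: posterior probability that each double heterozygote is Rr.
        post = []
        for v in y_dh:
            rr_part = (1 - omega) * normal_pdf(v, mu0, sigma2)
            Rr_part = omega * normal_pdf(v, mu1, sigma2)
            post.append(Rr_part / (Rr_part + rr_part))
        # M step: weighted means and pooled variance.
        mu1 = (sum(y_Rr) + sum(w * v for w, v in zip(post, y_dh))) / (
            len(y_Rr) + sum(post))
        mu0 = (sum(y_rr) + sum((1 - w) * v for w, v in zip(post, y_dh))) / (
            len(y_rr) + sum(1 - w for w in post))
        sigma2 = (
            sum((v - mu2) ** 2 for v in y_RR)
            + sum((v - mu1) ** 2 for v in y_Rr)
            + sum((v - mu0) ** 2 for v in y_rr)
            + sum(w * (v - mu1) ** 2 + (1 - w) * (v - mu0) ** 2
                  for w, v in zip(post, y_dh))
        ) / n
    return mu2, mu1, mu0, sigma2


# Hypothetical phenotypes, for illustration only.
fit = em_mixture([29.0, 29.2], [27.5, 28.1], [32.0, 33.0, 32.5],
                 [27.9, 32.4], omega=0.5)
```

In practice `omega` would come from the estimated haplotype frequencies rather than being supplied by hand.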

Page 19: Haplotype Discovery and Modeling

Hypothesis Testing

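H0: µ2 = µ1 = µ0 (the composite diplotypes RR, Rr, and rr have equal genotypic values, i.e. a = d = 0)

H1: At least one of the equalities in H0 does not hold

$$
LR = -2 \left[ \ln L_0(\tilde{\Omega}_q \mid y) - \ln L_1(\hat{\Omega}_q \mid y, S, \hat{\Omega}_p) \right]
$$

where \tilde{\Omega}_q denotes the estimates of the quantitative genetic parameters under H0, and \hat{\Omega}_q and \hat{\Omega}_p denote the estimates of the quantitative and population genetic parameters under H1.

The threshold is determined empirically by permutation tests.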
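A permutation threshold can be sketched in Python as follows. To keep the example short, it uses a simplified LR statistic in which every individual's composite diplotype is treated as known, so it is not the mixture-based LR of the slide. The function names, the simulated data, and the group means are hypothetical.

```python
import math
import random


def lr_statistic(y, groups):
    """LR for equal group means under a normal model with common variance.

    With maximum likelihood variances, -2[ln L0 - ln L1] = n ln(RSS0 / RSS1).
    """
    n = len(y)
    grand_mean = sum(y) / n
    rss0 = sum((v - grand_mean) ** 2 for v in y)
    rss1 = 0.0
    for g in sorted(set(groups)):
        y_g = [v for v, k in zip(y, groups) if k == g]
        mean_g = sum(y_g) / len(y_g)
        rss1 += sum((v - mean_g) ** 2 for v in y_g)
    return n * math.log(rss0 / rss1)


def permutation_threshold(y, groups, n_perm=1000, alpha=0.05, seed=0):
    """Empirical (1 - alpha) quantile of the LR under shuffled phenotypes."""
    rng = random.Random(seed)
    shuffled = list(y)
    null = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype-phenotype association
        null.append(lr_statistic(shuffled, groups))
    null.sort()
    return null[min(n_perm - 1, math.ceil((1 - alpha) * n_perm) - 1)]


# Hypothetical simulated data with a real difference between groups.
sim = random.Random(1)
groups = ["RR"] * 5 + ["Rr"] * 10 + ["rr"] * 15
means = {"RR": 29.0, "Rr": 28.0, "rr": 32.5}
y = [sim.gauss(means[g], 1.0) for g in groups]
observed = lr_statistic(y, groups)
threshold = permutation_threshold(y, groups)
```

Shuffling the phenotypes while keeping the genotypes fixed preserves the genotype frequencies and the phenotype distribution, so the resulting quantile reflects the LR expected when no association exists.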

Page 20: Haplotype Discovery and Modeling

Genome-wide Scan

[Figure: LR profile plotted against SNPs on the genome, with the significance threshold marked]

Page 21: Haplotype Discovery and Modeling

Structural Variation in the Human Genome

Haplotype Blocks: Nearby SNPs are often distributed in block-like patterns

Hotspots and Coldspots: Recombination rates between SNPs from different blocks are higher than between SNPs within the same block

Tag SNPs: Haplotype diversity within each block can be well explained by a small portion of SNPs.

[Figure: Block 1, Block 2, Block 3, Block 4, … with recombination hot spots between blocks]

Page 22: Haplotype Discovery and Modeling

A Genetic Study: a candidate gene for human obesity

SNP A: A, G

SNP B: C, G

Four haplotypes: [AC], [AG], [GC], [GG]

• A total of 155 patients selected from a population
• Typed for the two SNPs
• Measured for body mass index (BMI)
• Question: Which haplotype triggers an effect on BMI?

Page 23: Haplotype Discovery and Modeling

Testing Risk Haplotype

Haplotype  LR               Risk (R) or non-risk (r)
[AC]       2.32             r
[AG]       1.52             r
[GC]       3.11             r
[GG]       10.35 (p<0.01)   R

RR: µ2 = µ + a = 30.83 - 1.77 = 29.06   (a = additive effect)
Rr: µ1 = µ + d = 30.83 - 3.05 = 27.78   (d = dominance effect)
rr: µ0 = µ - a = 30.83 + 1.77 = 32.60

• A patient who combines haplotype [GG] with any other haplotype is of normal weight.
• A patient who combines any two haplotypes from [AC], [AG], and [GC] is obese.
• A patient who carries two copies of haplotype [GG] is overweight.
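The arithmetic on this slide can be retraced with a few lines of Python, using only the numbers reported above. The variable names are illustrative. Note that the reported effects imply a negative additive effect (a = -1.77) and a negative dominance effect (d = -3.05).

```python
# LR values reported for each candidate risk haplotype.
lr_values = {"AC": 2.32, "AG": 1.52, "GC": 3.11, "GG": 10.35}

# The haplotype with the largest LR is taken as the risk haplotype.
risk_haplotype = max(lr_values, key=lr_values.get)

mu, a, d = 30.83, -1.77, -3.05  # overall mean, additive and dominance effects
genotypic_values = {
    "RR": round(mu + a, 2),  # two copies of [GG]
    "Rr": round(mu + d, 2),  # one copy of [GG]
    "rr": round(mu - a, 2),  # no copy of [GG]
}
```

Because the dominance effect is larger in magnitude than the additive effect, carriers of a single [GG] copy have the lowest estimated BMI of the three groups.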

Page 24: Haplotype Discovery and Modeling

Model Extensions

• Block-Block Interactions (Lin et al. 2007, Bioinformatics)

• Haplotype-Environment Interactions (Wang et al. 2008, Molecular Pain)

• Haplotype Imprinting Effects (Cheng et al., to be submitted)

• Multivariate high-dimensional drug response (PK-PD link, efficacy and toxicity…) – A systems approach

Page 25: Haplotype Discovery and Modeling

1000 Genomes Project

This sequencing effort will produce the most detailed map of human genetic variation to support disease studies.

Results will help to design personalized medication that can optimize drug therapy.

 

  

