Association studies
[Figure: table of phenotype (Disease / Responder vs. Control / Non-responder) against the two alleles of Marker A (Allele 0, Allele 1). Marker A is associated with the phenotype.]
Association studies
• Evaluate whether nucleotide polymorphisms are associated with phenotype
T A GA A
C G GA A
C G TA A
T A TC G
T G TA G
T G GA G
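One common way to evaluate such an association is a Pearson chi-square test on the 2x2 table of allele by phenotype. The sketch below is only an illustration: the function names and the counts are hypothetical and do not come from the slides, and the p-value uses the identity for one degree of freedom.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square statistic for a 2x2 table [[a, b], [c, d]].

    Rows are phenotype groups (e.g. disease, control) and columns are
    alleles (allele 0, allele 1). This layout is an assumption.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    margins = (a + b, c + d, a + c, b + d)
    if 0 in margins:
        raise ValueError("every row and column must have a nonzero total")
    return n * (a * d - b * c) ** 2 / math.prod(margins)

def p_value_1df(chi2):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))

# Hypothetical allele counts, for illustration only.
counts = [[30, 10],   # disease: allele 0, allele 1
          [10, 30]]   # control: allele 0, allele 1
stat = chi_square_2x2(counts)
print(stat, p_value_1df(stat))
```

A large statistic (small p-value) suggests that allele frequencies differ between the two phenotype groups.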
Hypothesis – Haplotype Blocks?
The genome consists largely of blocks of common SNPs with relatively little recombination within the blocks (Patil et al., Science, 2001; Jeffreys et al., Nature Genetics, 2001; Daly et al., Nature Genetics, 2001).
[Figure: a 200 kb stretch of DNA showing sense genes, antisense genes, SNPs, and haplotype blocks 1 to 4.]
Haplotype Block Structure
LD-Blocks and 4-Gamete Test Blocks
One definition of a block
• Based on the four-gamete test.
• Intuition: when all four gametes are observed between two SNPs, there is a recombination point somewhere between the two sites.
Four Gamete Block Test
• Hudson and Kaplan 1985
A segment of SNPs is a block if between every pair of SNPs at most 3 out of the 4 gametes (00, 01, 10, 11) are observed.
BLOCK:
0 0 1
0 1 1
1 1 0
1 1 1
VIOLATES THE BLOCK DEFINITION:
0 0 1
0 1 1
1 1 0
1 0 1
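The block definition above can be checked mechanically. The following is a minimal sketch, assuming biallelic SNPs coded as the characters 0 and 1 and haplotypes given as equal-length strings; the function names are illustrative. The two inputs are the slide's example matrices, read as four haplotypes over three SNPs.

```python
from itertools import combinations

def four_gamete_conflicts(haplotypes):
    """Return every pair of sites (i, j), i < j, at which all four
    gametes 00, 01, 10, 11 are observed among the haplotypes."""
    n_sites = len(haplotypes[0])
    conflicts = []
    for i, j in combinations(range(n_sites), 2):
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            conflicts.append((i, j))
    return conflicts

def is_block(haplotypes):
    """A segment is a block if no pair of sites shows all four gametes."""
    return not four_gamete_conflicts(haplotypes)

block = ["001", "011", "110", "111"]
not_a_block = ["001", "011", "110", "101"]

print(is_block(block))                      # True
print(four_gamete_conflicts(not_a_block))   # [(0, 1)]
```

In the second matrix the first two SNPs show 00, 01, 11 and 10, which is why it violates the definition.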
Finding Recombination Hotspots:
Many Possible Partitions into Blocks
A C T A G A T A G C C T
G T T C G A C A A C A T
A C T C T A T G A T C G
G T T A T A C G A C A T
A C T C T A T A G T A T
A C T A G C T G G C A T
All four gametes are present:
[Figure: the same six haplotypes, with the pairs of sites at which all four gametes occur marked as constraints.]
• Find the left-most right endpoint of any constraint and mark the site before it as a recombination site.
• Eliminate any constraints crossing that site.
• Repeat until all constraints are gone.
The final result is a minimum-size set of sites crossing all constraints.
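The greedy procedure above can be sketched as follows. This is one possible reading, not the slides' own code: a constraint is assumed to be a pair of site indices (i, j) with i < j at which all four gametes occur, and a recombination site b is taken to be the breakpoint between sites b and b + 1, so it crosses (i, j) when i <= b < j. The constraint list is hypothetical.

```python
def min_recombination_sites(constraints):
    """Greedily pick a minimum-size set of breakpoints crossing every constraint.

    constraints: iterable of (i, j) site-index pairs with i < j.
    Returns breakpoints b, each meaning "between site b and site b + 1".
    """
    sites = []
    # Visiting constraints by increasing right endpoint is equivalent to
    # repeatedly taking the left-most right endpoint that remains.
    for left, right in sorted(constraints, key=lambda c: c[1]):
        if sites and left <= sites[-1] < right:
            continue  # already crossed by the most recent breakpoint
        sites.append(right - 1)  # mark the site just before the right endpoint
    return sites

# Hypothetical constraints, for illustration only.
example = [(0, 3), (1, 2), (2, 5), (4, 6)]
print(min_recombination_sites(example))  # [1, 4]
```

Only the most recent breakpoint needs checking because breakpoints are chosen in increasing order, so it is always the one nearest the current constraint's right endpoint.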
Tagging SNPs
ACGATCGATCATGAT
GGTGATTGCATCGAT
ACGATCGGGCTTCCG
ACGATCGGCATCCCG
GGTGATTATCATGAT
A------A---TG--
G------G---CG--
A------G---TC--
A------G---CC--
G------A---TG--
An example of a real data set and its haplotype block structure. Colors refer to the founding population, one color for each founding haplotype.
Only 4 SNPs are needed to tag
all the different haplotypes
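A quick way to confirm that a chosen set of tag SNPs still tells the haplotypes apart is to project each haplotype onto the tag positions and check that the projections stay distinct. The sketch below uses the five haplotypes and the four unmasked positions from the Tagging SNPs example; the function name and the 0-based indexing are assumptions made for illustration.

```python
def tags_distinguish(haplotypes, tag_positions):
    """True if restricting each haplotype to tag_positions keeps all
    distinct haplotypes distinct."""
    projected = {tuple(h[p] for p in tag_positions) for h in haplotypes}
    return len(projected) == len(set(haplotypes))

haplotypes = [
    "ACGATCGATCATGAT",
    "GGTGATTGCATCGAT",
    "ACGATCGGGCTTCCG",
    "ACGATCGGCATCCCG",
    "GGTGATTATCATGAT",
]
tags = [0, 7, 11, 12]  # 0-based positions left unmasked in the tagged view

print(tags_distinguish(haplotypes, tags))  # True
print(tags_distinguish(haplotypes, [0]))   # False
```

This only checks that the haplotypes remain distinguishable; it does not search for the smallest tag set.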
Informativeness
A measure of the “information” a SNP contains about another SNP. Useful for designing SNP arrays and for selecting tagging SNPs.
[Figure: two haplotypes, h1 and h2, and a SNP s.]
0 1 0 0 1
0 1 1 0 0
s1 s2 s3 s4 s5
1 0 0 0 0
0 1 0 0 1
0 1 1 0 0
1 0 1 1 1
I({s3,s4},{s1,s2,s5}) = 3
S = {s3,s4} is a Minimal Informative Subset
Informativeness
Minimum Set Cover = Minimum Informative Subset
[Figure: bipartite graph with SNPs s1 to s5 on one side and edges e1 to e6 on the other, drawn for the haplotype matrix below.]
s1 s2 s3 s4 s5
1 0 0 0 0
0 1 0 0 1
0 1 1 0 0
1 0 1 1 1
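The set-cover view can be illustrated with a small exhaustive search. This is a sketch under explicit assumptions, since the slides do not spell out the construction: each element to cover is taken to be a pair of haplotypes that differ at some SNP, and a SNP covers the pairs it distinguishes. The function names and the toy matrix are hypothetical, and the search is exponential in the number of SNPs, so it only suits tiny inputs.

```python
from itertools import combinations

def distinguished_pairs(haplotypes, snp):
    """Pairs of haplotype indices whose alleles differ at the given SNP."""
    return {(i, j)
            for i, j in combinations(range(len(haplotypes)), 2)
            if haplotypes[i][snp] != haplotypes[j][snp]}

def minimum_informative_subset(haplotypes):
    """Smallest set of SNP indices that distinguishes every pair of
    haplotypes distinguished by the full SNP set (exhaustive search)."""
    n_snps = len(haplotypes[0])
    cover = {s: distinguished_pairs(haplotypes, s) for s in range(n_snps)}
    universe = set().union(*cover.values())
    for size in range(n_snps + 1):
        for subset in combinations(range(n_snps), size):
            covered = set().union(*(cover[s] for s in subset))
            if covered == universe:
                return subset
    return tuple(range(n_snps))

# Hypothetical toy matrix: three haplotypes over three SNPs.
toy = ["000", "011", "101"]
print(minimum_informative_subset(toy))  # (0, 1)
```

For larger inputs a greedy set-cover heuristic would be the usual substitute, at the cost of no longer guaranteeing a minimum.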
Graph theory insight
Informativeness