Haplotype mapping with Single Feature Polymorphisms in Arabidopsis
Justin Borevitz
Ecology & Evolution
University of Chicago
http://naturalvariation.org
Talk Outline
• Natural Variation in Light Response
• Single Feature Polymorphisms (SFPs)
– Potential deletions
• Haplotype analysis
• Patterns in gene Families
• Aquilegia
Light Affects the Entire Plant Life Cycle
Light response variation can be seen under constant conditions in the lab
Natural Variation under selection? Test in field
Seasons in the Growth Chamber
• Changing Day length
• Cycle Light Intensity
• Cycle Light Colors
• Cycle Temperature
[Figure: Day Length. x-axis: month (sep to aug); y-axis: hours (0:00 to 22:00); series: Sweden, Spain, standard]
[Figure: Light Intensity. x-axis: month (sep to aug); y-axis: W/m2 (0 to 1400); series: Sweden, Spain, standard]
[Figure: Temperature. x-axis: month (sep to aug); y-axis: degrees C (-10 to 35); series: Spain High, Spain Low, Sweden High, Sweden Low, standard]
Which arrays should be used?
• Spotted arrays, Arizona: 29,000 70-mers
• ATH1, Affymetrix expression GeneChip: 202,806 unique 25 bp oligonucleotide features
• AtTILE1, universal whole genome array: every ~35 bp, > 3 million PM features
• Re-sequencing array 120M*8bp
– 20 Accessions, Perlegen
– Max Planck (Weigel), USC (Nordborg)
GeneChip
RNA DNA
Universal Whole Genome Array
• Transcriptome Atlas: expression levels, tissue specificity
• Gene Discovery: gene model correction, non-coding/micro-RNA, antisense transcription
• Alternative Splicing
• Comparative Genome Hybridization (CGH): insertions/deletions
• Methylation
• Chromatin Immunoprecipitation: ChIP chip
• Polymorphism SFPs: discovery/genotyping
~35 bp tile, non-repetitive regions, “good” binding oligos, evenly spaced
ChipViewer: Mapping of transcriptional units of ORFeome
From 2000v At1g09750 (MIPS) to the latest AGI At1g09750
[Figure: ChipViewer tracks comparing the 2000 v Annotation (MIPS) with the latest AGI Annotation; x-axis: Chromosome (bp); tracks: conservation, Transcriptome Atlas; labels: SNP, SFP, ORFa, ORFb, start, deletion]
Improved Genome Annotation
Potential Deletions
False Discovery and Sensitivity
PM only

SAM threshold: 5% FDR
                    Total   SFPs    nonSFPs
GeneChip                    3806    89118     Cereon marker accuracy: 100%
Sequence            817     121     696
  Polymorphic       340     117     223       Sensitivity: 34%
  Non-polymorphic   477     4       473
False Discovery rate: 3%
Test for independence of all factors: Chisq = 177.34, df = 1, p-value = 1.845e-40

SAM threshold: 18% FDR
                    Total   SFPs    nonSFPs
GeneChip                    10627   82297     Cereon marker accuracy: 100%
Sequence            817     223     594
  Polymorphic       340     195     145       Sensitivity: 57%
  Non-polymorphic   477     28      449
False Discovery rate: 13%
Test for independence of all factors: Chisq = 265.13, df = 1, p-value = 1.309e-59
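The sensitivity, false discovery rate, and chi-square values on this slide can be recomputed from the four sequence-verified counts at each threshold. The following is a minimal sketch: the function name `sfp_call_summary` and its argument names are illustrative and not from the talk, and the chi-square is computed without a continuity correction, which is the variant that happens to reproduce the reported values.

```python
def sfp_call_summary(tp, fn, fp, tn):
    """Summarise SFP calls against sequence data (hypothetical helper).

    tp: polymorphic features called as SFPs
    fn: polymorphic features not called
    fp: non-polymorphic features called as SFPs
    tn: non-polymorphic features not called
    """
    n = tp + fn + fp + tn
    sensitivity = tp / (tp + fn)   # fraction of true polymorphisms detected
    fdr = fp / (tp + fp)           # fraction of SFP calls that are false
    # Pearson chi-square for a 2x2 table, no continuity correction
    chi2 = n * (tp * tn - fn * fp) ** 2 / (
        (tp + fn) * (fp + tn) * (tp + fp) * (fn + tn)
    )
    return {"sensitivity": sensitivity, "fdr": fdr, "chi2": chi2}


# Counts taken from the two tables on the slide
strict = sfp_call_summary(tp=117, fn=223, fp=4, tn=473)     # 5% FDR threshold
relaxed = sfp_call_summary(tp=195, fn=145, fp=28, tn=449)   # 18% FDR threshold
```

Relaxing the threshold trades a higher false discovery rate for higher sensitivity, which is the comparison the two tables make.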
3/4 Cvi markers were also confirmed in PHYB
90% 80% 70%
41% 53% 85%
90% 80% 70%
67% 85% 100%
Cereon may be a sequencing error
TIGR match is a match
Chip genotyping of a Recombinant Inbred Line
29kb interval
Discovery: 6 replicates × $500 / 12,000 SFPs = $0.25 per SFP
Typing: 1 replicate × $500 / 12,000 SFPs = $0.041 per SFP
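The per-SFP cost figures follow from dividing the total array cost by the number of SFPs scored. A small sketch of that arithmetic, where the function name `cost_per_sfp` is illustrative only:

```python
def cost_per_sfp(replicates, cost_per_array, n_sfps):
    """Array cost spread evenly over every SFP scored (illustrative helper)."""
    return replicates * cost_per_array / n_sfps


discovery = cost_per_sfp(replicates=6, cost_per_array=500, n_sfps=12_000)
typing = cost_per_sfp(replicates=1, cost_per_array=500, n_sfps=12_000)
```

The typing cost works out to about $0.0417, which the slide reports truncated as $0.041.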
Map bibb
100 bibb mutant plants
100 wt mutant plants
bibb mapping
ChipMap
AS1
Bulk segregant mapping using chip hybridization
bibb maps to Chromosome 2 near ASYMMETRIC LEAVES1
BIBB = ASYMMETRIC LEAVES1
Sequenced AS1 coding region from bib-1 …found g -> a change that would introduce a stop codon in the MYB domain
[Figure: bibb and as1-101 mutant plants; AS1 protein diagram with the MYB domain, marking bib-1 (W49*) and as1-101 (Q107*)]
AS1 (ASYMMETRIC LEAVES1) = MYB closely related to PHANTASTICA, located at 64 cM
Array Mapping
Hazen et al Plant Physiology (submitted)
[Figure: array mapping plot across chr1, chr2, chr3, chr4, chr5]
eXtreme Array Mapping
Histogram of Kas/Col RILs, Red light
[Figure: histogram; x-axis: hypocotyl length (mm), 6 to 14; y-axis: counts, 0 to 12]
15 tallest RILs pooled vs 15 shortest RILs pooled
Allele frequencies determined by SFP genotyping. Thresholds set by simulations
[Figure: LOD (y-axis, 0 to 16) along Chromosome 2 (x-axis, 0 to 100 cM); labels: Composite Interval Mapping, RED2 QTL, RED2 QTL 12cM]
Red light QTL RED2 from 100 Kas/ Col RILs (Wolyn et al Genetics 2004)
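The pooled-extremes idea (allele frequencies compared between the tall and short pools, with thresholds set by simulation) can be illustrated with a single-marker null simulation. This is only a sketch under stated assumptions: each RIL carries either parental allele with probability 0.5 at an unlinked marker, pools hold 15 lines each, and the threshold is a per-marker quantile, not a genome-wide one. The talk does not describe its simulation procedure, so the functions `null_threshold` and `exceeds_threshold` are hypothetical.

```python
import random


def null_threshold(pool_size=15, n_sims=10_000, alpha=0.05, seed=0):
    """Simulate |freq_high - freq_low| at a marker with no QTL effect.

    Assumes each pooled RIL independently carries the Kas allele with
    probability 0.5; returns the (1 - alpha) quantile of the absolute
    allele-frequency difference between the two pools.
    """
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_sims):
        high = sum(rng.random() < 0.5 for _ in range(pool_size)) / pool_size
        low = sum(rng.random() < 0.5 for _ in range(pool_size)) / pool_size
        diffs.append(abs(high - low))
    diffs.sort()
    return diffs[min(n_sims - 1, int((1 - alpha) * n_sims))]


def exceeds_threshold(freq_high, freq_low, threshold):
    """True when the pools differ more than expected under the null."""
    return abs(freq_high - freq_low) > threshold


threshold = null_threshold()
```

A marker where the tall pool is mostly one parental allele and the short pool mostly the other would exceed this threshold, while a marker at equal frequency in both pools would not.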
eXtreme Array Mapping BurC F2
XAM: Lz x Col F2
QTL: Lz x Ler F2
(Werner et al Genetics in press)
RED2 QTL
Select recombinants by PCR: >200 from >1250 plants
Interval between mark1 and mark2: ~2 Mb, ~8 cM, >400 SFPs
Pools: High, Low
Approximate genotype counts at mark1 (rows) and mark2 (columns):
          Col     het     Kas
Col       ~268    ~43     ~2
het       ~43     ~539    ~43
Kas       ~2      ~43     ~268
eXtreme Array Fine Mapping
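The approximate genotype counts on this slide (~268, ~43, ~2, ~539) are what one would expect for two markers about 8 cM apart in an F2 of 1250 plants. The sketch below assumes Haldane's mapping function and equal recombination in both parents; the talk does not say how its numbers were derived, and the function names are illustrative.

```python
import math


def haldane_r(cm):
    """Recombination fraction for a map distance in cM (Haldane's function)."""
    return 0.5 * (1.0 - math.exp(-2.0 * cm / 100.0))


def expected_f2_counts(cm, n_plants):
    """Expected two-marker genotype counts in an F2 (hypothetical helper).

    Keys are (genotype at mark1, genotype at mark2), each one of
    'Col', 'het', 'Kas'.
    """
    r = haldane_r(cm)
    parental, recombinant = (1 - r) / 2, r / 2
    gametes = {
        ("Col", "Col"): parental,
        ("Kas", "Kas"): parental,
        ("Col", "Kas"): recombinant,
        ("Kas", "Col"): recombinant,
    }
    counts = {}
    # Combine every pair of gametes; unlike alleles at a marker give 'het'
    for g1, f1 in gametes.items():
        for g2, f2 in gametes.items():
            geno = tuple(a if a == b else "het" for a, b in zip(g1, g2))
            counts[geno] = counts.get(geno, 0.0) + f1 * f2 * n_plants
    return counts


counts = expected_f2_counts(cm=8, n_plants=1250)
```

Plants whose genotypes differ between the two markers carry a recombination breakpoint inside the interval, and those are the ones worth genotyping on arrays.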
Potential Deletions
>500 potential deletions
45 confirmed by Ler sequence
23 (of 114) transposons
Disease Resistance(R) gene clusters
Single R gene deletions
Genes involved in Secondary metabolism
Unknown genes
Potential Deletions Suggest Candidate Genes
FLOWERING1 QTL
Chr1 (bp)
Flowering Time QTL caused by a natural deletion in FLM
MAF1
FLM natural deletion
(Werner et al PNAS 2005)
Fast Neutron deletions
FKF1 80kb deletion CHR1
cry2 10kb deletion CHR1
Het
Array Haplotyping
• What about Diversity/selection across the genome?
• A genome wide estimate of population genetics parameters: θw, π, Tajima’s D, ρ
• LD decay, Haplotype block size
• Deep population structure?
• Col, Lz, Bur, Ler, Bay, Shah, Cvi, Kas, C24, Est, Kin, Mt, Nd, Sorbo, Van, Ws2, Fl-1, Ita-0, Mr-0, St-0, Sah-0
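For reference, three of the listed parameters (θw, π, and Tajima's D) can be computed from a binary haplotype matrix using the standard formulas. This is a generic sketch, not the talk's pipeline: it treats each SFP as a biallelic site coded 0/1 per accession, ignores missing calls, and reports totals for a region, not per-base values. ρ is not covered here.

```python
import math


def diversity_stats(haplotypes):
    """Return (theta_w, pi, tajima_d) for a list of equal-length 0/1 lists.

    Each inner list is one accession; each column is one biallelic site.
    """
    n = len(haplotypes)
    n_sites = len(haplotypes[0])
    allele_counts = [sum(h[j] for h in haplotypes) for j in range(n_sites)]
    segregating = [k for k in allele_counts if 0 < k < n]
    s = len(segregating)

    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    theta_w = s / a1  # Watterson's estimator
    # Mean pairwise differences, summed over segregating sites
    pi = sum(2.0 * k * (n - k) / (n * (n - 1)) for k in segregating)
    if s == 0:
        return theta_w, pi, float("nan")

    # Variance terms from Tajima (1989)
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    d = (pi - theta_w) / math.sqrt(e1 * s + e2 * s * (s - 1))
    return theta_w, pi, d


# Toy example: 4 accessions, 2 segregating sites and 1 fixed site
theta_w, pi, tajima_d = diversity_stats(
    [[0, 0, 0], [0, 0, 0], [0, 1, 0], [1, 1, 0]]
)
```

Because SFP detection is incomplete and biased toward certain probes, values computed this way from array data are best read as relative comparisons between regions.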
Array Haplotyping
Inbred lines
Low effective recombination due to partial selfing
Extensive LD blocks
[Figure: haplotypes of Col, Ler, Cvi, Kas, Bay, Shah, Lz, Nd along Chromosome 1, ~500 kb]
Distribution of T-stats
[Figure: histogram of the T statistic (x-axis, bins from (-4,-3.5] to (3,3.5]) against frequency (y-axis, 0e+00 to 8e+04); series: null (permutation), actual; labels: Not Col, Col, NA, NA, duplications]
32,427 Calls
208,729
12,250 SFPs
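A permutation null for a per-feature t statistic, as contrasted with the actual distribution on this slide, can be sketched as follows. The assumptions are mine: a pooled-variance two-sample t statistic on hybridization intensities, reference (Col) replicates against another accession, and exhaustive relabelling of the arrays. The talk does not give these details, and the function names are illustrative.

```python
import math
from itertools import combinations
from statistics import mean, variance


def t_stat(a, b):
    """Pooled-variance two-sample t statistic (each group needs >= 2 values)."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    se = math.sqrt(pooled * (1.0 / na + 1.0 / nb))
    return (mean(a) - mean(b)) / se if se > 0 else 0.0


def permutation_p(a, b):
    """Two-sided p-value from all relabellings of the pooled observations."""
    observed = abs(t_stat(a, b))
    values = list(a) + list(b)
    indices = range(len(values))
    hits = total = 0
    for chosen in combinations(indices, len(a)):
        chosen_set = set(chosen)
        group_a = [values[i] for i in indices if i in chosen_set]
        group_b = [values[i] for i in indices if i not in chosen_set]
        total += 1
        # Small tolerance so the observed labelling always counts as a hit
        if abs(t_stat(group_a, group_b)) >= observed - 1e-9:
            hits += 1
    return hits / total


# Toy intensities for one feature: 3 Col arrays vs 3 arrays of another accession
col = [10.1, 10.0, 9.9]
other = [8.0, 8.1, 7.9]
t_value = t_stat(col, other)
p_value = permutation_p(col, other)
```

With only three replicates per group there are 20 possible relabellings, so the smallest attainable two-sided p-value is 0.1; pooling permutation statistics across many features is what gives a usable null distribution.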
Sequence confirmation of SFPs
SFP SNP Total FPR FDR Sensitivity
bay 44 61 1375 0.8% 25.0% 54.1%
bur 47 57 1320 1.1% 29.8% 57.9%
cvi 69 92 1325 1.2% 21.7% 58.7%
ler 41 51 1466 0.6% 22.0% 62.7%
lz 37 40 1441 0.5% 18.9% 75.0%
mr 67 87 1191 1.1% 17.9% 63.2%
mt 46 48 1413 0.9% 26.1% 70.8%
sorbo 37 53 1317 0.9% 29.7% 49.1%
ws 29 47 1369 0.3% 13.8% 53.2%
SFPs for reverse genetics
http://naturalvariation.org/sfp
14 Accessions 30,950 SFPs
Chromosome Wide Diversity
Diversity 50kb windows
Tajima’s D like 50kb windows
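Window-based summaries like these start by binning features by chromosome position. The sketch below counts SFPs per non-overlapping 50 kb window as a crude stand-in for diversity; the talk does not specify its exact window statistic, and `sfp_density` is an illustrative name.

```python
from collections import Counter


def sfp_density(positions_bp, window_bp=50_000):
    """Count SFPs in non-overlapping windows, keyed by window start (bp)."""
    counts = Counter(pos // window_bp for pos in positions_bp)
    return {w * window_bp: counts[w] for w in sorted(counts)}


# Toy positions on one chromosome
density = sfp_density([10, 49_999, 50_000, 120_000])
```

Windows with no SFPs are simply absent from the result, so a plotting step would need to fill them in with zeros.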
[Figure labels: RPS4, unknown]
R genes vs bHLH: Theta W
R genes vs bHLH: Tajima’s D
Selection
[Figure: histogram of a Tajima's D like statistic (x-axis, bins from (-1,-0.8] to (0.6,0.8]) against frequency (y-axis, 0 to 70); series: R genes, bHLH; RPS4 marked]
Review
• Single Feature Polymorphisms (SFPs) can be used to
– Identify recombination breakpoints
– eXtreme Array Mapping
– Potential deletions (candidate genes)
– Haplotyping
– Diversity/Selection
– Association Mapping
Aquilegia (Columbines)
Recent adaptive radiation, 350Mb genome
> 20k dbEST 11/14/2003
Animal lineage: good coverage
Plant lineage: crop plant coverage
NSF Genome Complexity
• 45,000 ESTs 5’ and 3’ ends
• 350 arrays, RNA and genotyping
– High density SFP Genetic Map
• Physical Map (BAC tiling path)
– Physical assignment of ESTs
• QTL for pollinator preference
– ~400 RILs, map abiotic stress
– QTL fine mapping/ LD mapping
• Develop transformation techniques
Scott Hodges (UCSB)
Elena Kramer (Harvard)
Magnus Nordborg (USC)
Justin Borevitz (U Chicago)
Jeff Tompkins (Clemson)
NaturalVariation.org
Salk
Jon Werner
Joanne Chory
Joseph Ecker
Max Planck
Detlef Weigel
UC San Diego
Charles Berry
Scripps
Sam Hazen
Elizabeth Winzeler
University of Chicago
Xu Zhang
Evadne Smith
UC Davis
Julin Maloof
University of Guelph, Canada
Dave Wolyn
Sainsbury Laboratory
Jonathan Jones