Presentation at the 11th Asian Maize Conference, which took place in Beijing, China, from November 7 to 11, 2011.
Association Mapping, Breeder Ready markers and Genomic Selection
Raman Babu, Jill Cairns, Gary Atlin, PH Zaidi, Pichet Grudloyma, George Mahuku, Sudha K Nair, Natalia Palacios, Kevin Pixley, Jose Crossa, BM Prasanna and all the Breeders of CIMMYT
Outline
Association Mapping for Drought Tolerance – CIMMYT's experience
● Are there large effect genes for GY_stress?
● Should we bother about "rare alleles" that have large effects?
Association Mapping for Disease Resistance
Association Mapping (Candidate-gene based) for Carotenoids
'Breeder-ready' markers for disease resistance and ProA
Integrating Genomic Selection in the breeding Pipeline
LD and Population structure in DTMA-AM panel based on 55K SNPs
Average distance between two markers is 55 kb and average EM-R2 is 0.26
LD in DTMA panel is low and hence suitable for association mapping
Population structure is 'mild' and association results were corrected for structure (through PCA) and kinship by MLM
[Figure: population structure plot; labeled groups: LaPosta Seq, CIM-CALI, DTP]
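The LD statistic quoted above can be illustrated with a minimal sketch. It assumes fully homozygous inbred lines with each marker coded 0/1, so r2 can be computed directly from allele frequencies; this is a simplification and not the EM-based estimate the slide reports. The function name `ld_r2` is hypothetical.

```python
def ld_r2(marker_a, marker_b):
    """r^2 between two biallelic markers coded 0/1 per inbred line.

    Assumption: lines are homozygous, so the 0/1 codes act as haplotypes.
    """
    if len(marker_a) != len(marker_b) or not marker_a:
        raise ValueError("markers must be non-empty and of equal length")
    n = len(marker_a)
    p_a = sum(marker_a) / n                       # allele frequency at marker A
    p_b = sum(marker_b) / n                       # allele frequency at marker B
    p_ab = sum(a * b for a, b in zip(marker_a, marker_b)) / n
    d = p_ab - p_a * p_b                          # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined for a monomorphic marker")
    return d * d / denom


# Two markers in complete LD, then two independent markers.
print(ld_r2([0, 0, 1, 1], [0, 0, 1, 1]))
print(ld_r2([0, 0, 1, 1], [0, 1, 0, 1]))
```

Averaging this quantity over adjacent marker pairs gives a panel-level summary comparable in spirit to the average r2 on the slide.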
DTMA-AM panel and 55K SNPs can identify large effect genes – 1. Grain Color
[Figure: association plot; peak at Psy1, R² = 37%]
SNP with largest significant association with grain color located within one of the exons of Phytoene Synthase1 (psy1) on chr.6
92 – Yellow lines (1)
186 – White lines (0)
DTMA-AM panel and 55K SNPs can identify large effect genes – 2. QPM
[Figure: association plot; peaks at Opaque2 at 7.01 (R² = 16%) and Ask2 (R² = 8%)]
Besides opaque-2 and ask-2, several minor QTL regions influencing kernel modification and tryptophan content were identified that overlap with previously reported regions…
10 – QPM lines (1)
268 – Normal lines (0)
Mapping Drought Tolerance
Strategy: GWAS; AM-panel ~300 inbreds – TCd with CML312
Known DT sources: La Posta Sequia C7, DTP C9, MBR etc.
Phenotyping: 10 locations – Stress & Optimal
Heritabilities: Kiboko-10-Late (0.64), M-10-Tlalti (0.67), Thailand-10 (0.49), M-Tlalti-09 (0.54), Zim-10 (0.22); across locations: 0.35
Phenotype used in GWAS: Combined BLUPs of TC_GY under stress, corrected for anthesis date
Genotyping: Genome-wide, high density markers – 55K SNPs and GBS markers (500K SNPs)
Statistical Model: General Linear Model (PCA correction) and Mixed Linear Model (PCA + Kinship – Q+K)
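The GLM step with PCA correction can be sketched as a comparison of two least-squares fits: phenotype on principal components only, and phenotype on principal components plus one SNP. The marker R2 is the share of total variance gained by adding the SNP. This is a minimal sketch under stated assumptions: the PC scores are taken as already computed, the helper names (`solve`, `rss`, `marker_r2`) are hypothetical, and the kinship term of the MLM (Q+K) is not modeled.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def rss(x, y):
    """Residual sum of squares of the OLS fit y ~ x (x holds the intercept)."""
    p = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in x]
    return sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))


def marker_r2(y, snp, pcs):
    """Variance explained by one SNP after adjusting for PC covariates."""
    reduced = [[1.0] + [float(v) for v in pc] for pc in pcs]
    full = [row + [float(g)] for row, g in zip(reduced, snp)]
    y_bar = sum(y) / len(y)
    tss = sum((v - y_bar) ** 2 for v in y)
    return (rss(reduced, y) - rss(full, y)) / tss


# Toy data: phenotype = 2 * SNP + 3 * PC1, six lines, one PC.
snp = [0, 1, 0, 1, 0, 1]
pcs = [[-1], [-1], [0], [0], [1], [1]]
y = [2 * g + 3 * pc[0] for g, pc in zip(snp, pcs)]
print(marker_r2(y, snp, pcs))
```

In a real scan the same comparison is repeated for every SNP and converted to a P value with an F test; dedicated association software also handles missing data and the kinship matrix.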
12-15 significant genomic regions identified for DT
[Figure: genome-wide association plot; labeled peaks with R2 of 7.3%, 6.2%, 5.7%, 7.0%, 5.7%, 5.5%, 5.1%, 5.8%, 4.9% and 5.1%]
Only 147 SNPs (~15 genomic regions) had R2 values more than 5%
Significant Genomic Regions associated with TC_GY_Stress
SNP | Chr | Position | P value | MAF | R2 (%) | Effect (kg/ha) | Candidate Gene
SYN39332 | 10 | 142655119 | 9.62E-06 | 0.49 | 7.6 | 29.3 | Starch Synthase
PZE-107042377 | 7 | 72216348 | 1.49E-05 | 0.32 | 7.3 | 35.6 | Myb family transcription factor-related protein
PZE-108046876 | 8 | 77237318 | 2.33E-05 | 0.35 | 6.9 | -34.4 |
PZE-110029252 | 10 | 50842298 | 7.77E-05 | 0.40 | 6.8 | -26.3 |
PZE-107032355 | 7 | 45011599 | 6.62E-05 | 0.38 | 6.2 | 38.6 |
SYN37988 | 2 | 146399448 | 3.84E-05 | 0.26 | 6.1 | 49.5 | TSA: Zea mays contig27975, mRNA sequence
PZE-101090321 | 1 | 80757998 | 1.67E-04 | 0.46 | 4.9 | -31.4 |
PZE-109041733 | 9 | 62608362 | 1.35E-04 | 0.42 | 4.9 | 25.5 |
PZE-104047052 | 4 | 78536398 | 1.21E-04 | 0.32 | 4.7 | 30.1 |
Average GY of the stress trials – 1.3 t/ha
Heritability across locations – 0.32
Rare Alleles with Large Effects
Marker | Chr | Position | P | Minor Allele | MAF | Average for DD | Average for Dd | Average for dd | Effect (kg/ha) | DD (n) | Dd (n) | dd (n)
PZE-104042524 | 4 | 67259441 | 3.70E-03 | A | 0.14 | 1499.5 | 1414.1 | 1382.8 | 116.6 | 7 | 59 | 188
PZE-101066401 | 1 | 49827350 | 1.54E-02 | A | 0.04 | 1487.8 | - | 1391.5 | 96.3 | 10 | 0 | 257
SYN36769 | 4 | 4914023 | 7.83E-03 | A | 0.06 | 1479.7 | 1355.9 | 1391.7 | 88.0 | 14 | 2 | 249
SYN26515 | 1 | 63053588 | 2.42E-03 | A | 0.06 | 1472.1 | 1253.0 | 1391.0 | 81.0 | 15 | 1 | 251
SYN1035 | 5 | 5786027 | 3.24E-02 | G | 0.07 | 1463.1 | 1395.2 | 1390.3 | 72.8 | 16 | 2 | 240
PZE-110053356 | 10 | 100124247 | 4.70E-03 | A | 0.11 | 1331.7 | 1381.2 | 1401.8 | -70.1 | 17 | 24 | 224
PZE-104113536 | 4 | 194565443 | 7.31E-04 | A | 0.13 | 1334.7 | 1282.7 | 1405.3 | -70.6 | 33 | 3 | 231
PZE-102096857 | 2 | 107898705 | 3.07E-03 | G | 0.08 | 1329.7 | - | 1400.4 | -70.6 | 20 | 0 | 238
PZE-109074314 | 9 | 116545321 | 2.21E-04 | G | 0.08 | 1328.8 | - | 1400.3 | -71.5 | 20 | 0 | 246
PZE-105127701 | 5 | 183968110 | 2.24E-04 | A | 0.07 | 1323.4 | - | 1400.7 | -77.4 | 19 | 0 | 249
PZE-102121069 | 2 | 162773047 | 8.92E-04 | G | 0.06 | 1320.1 | - | 1400.2 | -80.1 | 17 | 0 | 250
PZE-106064720 | 6 | 116886483 | 1.09E-02 | A | 0.07 | 1318.2 | 1307.6 | 1401.3 | -83.1 | 17 | 1 | 248
SYN14434 | 2 | 15813081 | 1.40E-03 | A | 0.08 | 1314.6 | 1433.4 | 1398.6 | -84.0 | 19 | 1 | 221
PZE-106056703 | 6 | 107499158 | 1.98E-04 | G | 0.06 | 1310.2 | 1307.6 | 1401.2 | -91.0 | 15 | 1 | 251
SYN8914 | 3 | 194356323 | 4.25E-03 | G | 0.08 | 1307.9 | 1381.6 | 1401.1 | -92.2 | 9 | 25 | 226
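The MAF and Effect columns in the rare-allele table can be cross-checked from the genotype counts and class means. The sketch below assumes that D denotes the minor allele and that the effect is the difference between the DD and dd class means; both readings are inferred from the numbers in the table rather than stated on the slide, and the function names are hypothetical.

```python
def minor_allele_freq(n_dd_minor, n_het, n_dd_major):
    """Minor allele frequency from counts of DD, Dd and dd lines."""
    total_alleles = 2 * (n_dd_minor + n_het + n_dd_major)
    return (2 * n_dd_minor + n_het) / total_alleles


def class_mean_contrast(mean_minor_hom, mean_major_hom):
    """Difference in mean GY (kg/ha) between DD and dd lines."""
    return mean_minor_hom - mean_major_hom


# First row of the table: PZE-104042524 with 7 DD, 59 Dd and 188 dd lines.
print(round(minor_allele_freq(7, 59, 188), 2))       # table reports MAF 0.14
print(round(class_mean_contrast(1499.5, 1382.8), 1)) # table reports 116.6
```

The small gap between the recomputed contrast and the tabulated effect is what one would expect if the slide's class means were rounded before display.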
Rare Alleles with Positive Large Effects
PZE-104042524, Chr 4, position 67259441, allele A – GY (kg/ha)
90[SPMATC4/P500(SELY)]#-B-4-2-B-B 1483.8
DTPYC9-F46-3-9-1-1-B-B 1461.7
La Posta Seq C7-F125-2-1-1-2-B-B-B 1436.8
La Posta Seq C7-F103-2-2-2-1-B-B-B 1626.9
La Posta Seq C7-F180-3-1-1-1-B-B-B 1593.5
La Posta Seq C7-F96-1-1-1-B-B 1482.1
DTPYC9-F72-1-2-1-1-B-B 1411.4
PZE-101066401, Chr 1, position 49827350, allele A – GY (kg/ha)
POB.502 c3 F2 10-3-2-1-BBBBBB-B 1429.0
POB.502c3 F2 9-14-1-2-B-B-B-B 1482.4
CLQ-RCYQ28=(CLQ6502*CLQ6601)-B-34-2-2-B*6-B 1476.1
DTPWC9-F24-4-3-1-B-B-B 1554.0
DTPWC9-F115-1-4-1-1-B-B-B 1483.4
DTPWC9-F103-2-1-1-1-B-B-B 1469.6
DTPYC9-F46-3-4-1-1-B-B-B 1535.9
DTPYC9-F46-3-9-1-1-B-B 1461.7
DTPYC9-F46-1-2-1-2-B-B 1606.1
DTPYC9-F13-2-1-1-2-B-B 1379.5
SYN36769, Chr 4, position 4914023, allele A – GY (kg/ha)
[SYN-USAB2/SYN-ELIB2]-12-1-1-2-BBB 1497.3
[CML440/[[[K64R/G16SR]-39-1/[K64R/G16SR]-20-2]-5-1-2-B*4/CML390]-B-39-2-B-4-#-1-B//ZM303c1-243-3-B-1-1-B]-9-1
[[KILIMA ST94A]-30/MSV-03-1-10-B-1-B-B-1xP84c1 F27-4-1-6-B-5-B] F8-3-2-2-1 x G16SeqC1F47-2-1-2-1-BBBB-B-xP84c1 F26-2-2-6-B-3-B]-3-1-B/CML395]-1-1 1419.5
[Pob.SEW-HG"B"c0F39-1-1-1-1xMBR C5 Bc F22-2-1-4-B-B-B-B-2-2-B-B-B/CML442]-1-1 1333.2
[Cuba/Guad C3 F34-2-1-1-B-B-B x CML264Q]-1-1 1376.4
CML-322 1428.5
DTPWC9-F115-1-4-1-1-B-B-B 1483.4
DTPWC9-F31-1-3-1-1-B-B-B 1492.0
DTPWC9-F67-1-2-1-2-B-B-B 1506.5
DTPWC9-F104-5-4-1-1-B-B-B 1454.3
DTPYC9-F46-3-4-1-1-B-B-B 1535.9
DTPYC9-F46-3-9-1-1-B-B 1461.7
DTPYC9-F46-1-2-1-1-B-B 1552.7
DTPYC9-F46-1-2-1-2-B-B 1606.1
DTPWC9-F67-2-2-1-B-B-B 1568.7
SYN26515, Chr 1, position 63053588, allele A – GY (kg/ha)
CML444-B 1501.9
S87P69Q(SIYF) 109-1-1-4-B 1518.4
CLQ-RCYQ40 = (CML165 x CLQ-6203)-B-9-1-1-B*8 1509.3
CML497=[CL-00331*v]-3-B-3-2-1-B*6 1443.1
DTPWC9-F115-1-4-1-1-B-B-B 1483.4
DTPWC9-F109-2-6-1-1-B-B-B 1467.8
DTPWC9-F67-1-2-1-2-B-B-B 1506.5
DTPWC9-F104-5-4-1-1-B-B-B 1454.3
DTPWC9-F128-1-1-1-1-B-B-B 1390.9
DTPYC9-F143-5-4-1-2-B-B-B 1442.1
DTPYC9-F143-1-6-1-B-B 1414.6
DTPWC9-F67-2-2-1-B-B-B 1568.7
Rare Alleles with Negative Large Effects
SYN8914, Chr 3, position 194356323, allele G – GY (kg/ha)
[CML198/ZSR923S4BULK-2-2-X-X-X-X-1-BB]-3-3-1-1-2-B*7 1196.562
S99TLWQ-B-8-1-B*5 1245.322
4001 1292.372
CLA41 1389.549
(A.I.Z.T.V.C. 20-3-1-1-2-B-B x A.I.Z.T.V.C.PR93A-17-1-3-1-1-B-B)-B-14TL-1-3-B-B 1252.957
[G16SeqC1F47-2-1-2-1-BBBB-B-xP84c1 F27-4-1-6-B-5-B] F23-1-3-1-1 x [KILIMA ST94A]-30/MSV-03-2-10-B-1-B-B-xP84c1 F27-4-1-6-B-5-B]-2-1-B/CML395]-1-1 1270.448
POB.501c3 F2 13-8-2-1-BBBB 1383.065
CL-RCY031=(CL-02410*CML-287)-B-9-1-1-2-B*7 1433.411
PZE-106056703, Chr 6, position 107499158, allele G – GY (kg/ha)
[CML444/CML395//DTPWC8F31-4-2-1-6]-2-1-1-1-B*4 1331.949
[(CML395/CML444)-B-4-1-3-1-B/CML395//DTPWC8F31-1-1-2-2]-5-1-2-2-BB 1346.993
CML 384xMBR/MDR C3 Bc F58-2-1-3-B-B-B-B-3-1-B-B-BB-B 1344.688
MBR C6 Bc F280-2-B-#-1-1-B-B-B-B-B-B 1256.056
[G16SeqC1F47-2-1-2-1-BBBB-B-xP84c1 F27-4-1-6-B-5-B] F23-2-1-2-3 x P43C9-1-1-1-1-1-BBBB-1-xP84c1 F26-2-2-6-B-3-B]-2-1-B/CML395]-1-1 1258.137
[M37W/ZM607#bF37sr-2-3sr-6-2-X]-8-2-X-1-BB-B-xP84c1 F27-4-3-3-B-1-B] F29-1-2-1-6 x [KILIMA ST94A]-30/MSV-03-2-10-B-1-B-B-xP84c1 F27-4-1-6-B-5-B]3-1-2-B/CML442]-1-1 1190.413
[Pob.SEW-HG"B"c0F39-1-1-1-1xMBR C5 Bc F22-2-1-4-B-B-B-B-2-2-B-B-B/CML442]-1-1 1333.209
[MBR Et/MBR Bc C1 F4-1-1-3-B-B-B-Bx1760B B1 Bco x Comp.-B-1-1-1-1-B-B-B/CML395]-1-1 1354.8
[CML 329/MBR C3 Am F103-1-1-2-B-B x CML486]-1-1 1346.293
[(87036/87923)-X-800-3-1-X-1-B-B-1-1-1-B-B-xP84c1 F26-2-2-4-B-2-B] F47-3-1-1-3 x M37W/ZM607#bF37sr-2-3sr-6-2-X]-8-2-X-1-BB-B-xP84c1 F27-4-3-3-B-1-B]-3-2-B x P33c3 F64-1-1-4-BB]-1-1 1295.392
P390amC3/285x287 F73-3-2-3xMIRTC5Am F96-1-1-1-3-1)-1-1-B 1399.776
CL-G1837=G18SeqC2-F141-2-2-1-1-1-2-##-2-B*4 1275.469
CML421=P31DMR#1-55-2-3-2-1-B*18-B 1252.385
DTPWC9-F73-2-1-1-1-B-B-B 1329.332
SYN14434, Chr 2, position 15813081, allele A – GY (kg/ha)
P501SRc0-F2-47-3-2-1-B-B 1268.038
[CML444/CML395//DTPWC8F31-1-1-2-2-BB]-4-2-2-2-2-BB-B 1267.39
[CML444/CML395//DTPWC8F31-1-1-2-2-BB]-4-2-2-2-1-BB-B 1408.142
02SADVL2B-#-17-1-1-B 1419.196
[CML440/[[[K64R/G16SR]-39-1/[K64R/G16SR]-20-2]-5-1-2-B*4/CML390]-B-39-2-B-4-#-1-B//ZM303c1-243-3-B-1-1-B]-9-1
[CML144/[CML144/CML395]F2-5sx]-1-3-1-3-B*4 1397.445
[CML198/ZSR923S4BULK-2-2-X-X-X-X-1-BB]-3-3-1-1-2-B*7 1196.562
[CML144/[CML144/CML395]F2-8sx]-1-1-1-B*5 1171.759
[CML144/[CML144/CML395]F2-8sx]-1-2-3-2-B*5 1203.073
CLA222 1337.217
[M37W/ZM607#bF37sr-2-3sr-6-2-X]-8-2-X-1-BB-B-xP84c1 F27-4-3-3-B-1-B] F29-1-2-1-6 x [KILIMA ST94A]-30/MSV-03-2-10-B-1-B-B-xP84c1 F27-4-1-6-B-5-B]3-1-2-B/CML442]-1-1 1190.413
[Cuba/Guad C3 F34-2-1-1-B-B-B x CML264Q]-1-1 1376.38
CA00344 / PAC777F2-6-1-1-BB-B-B-BB 1321.875
P44 C10MH8-30-4-B-4-1-B-B-B-B- 1329.436
P147-#136-5-1-B-1-BBB 1356.154
CLQ-6211=P62QC6HC13-1-3-BBB-6-B-7-6-BBBB-7-9-B-B-B-B 1311.726
CML269=P25STEC1F13-6-1-1-#-BBB-f-##-B*6-B 1407.819
CL-02143 P21C6S1MH247-5-B-1-1-2-BBB-1-##-B*10 1471.196
CML421=P31DMR#1-55-2-3-2-1-B*18-B 1252.385
DTPWC9-F66-2-1-1-2-B-B-B 1291.755
Rare Alleles – Candidate genes
Candidate genes identified by Rare Alleles | Putative function
Upstream of a DNA binding/membrane bound receptor | Many membrane bound receptors, like Rpk1, shown to confer DT in AT
DEAD box Helicase domain | Less documented; helicase domain proteins in AT proved for DT in CK dependent pathways
Related to ethylene insensitive2 | Cross-talk between ethylene signalling and drought response pathways well-documented
Extensin like cell wall protein | Glycoproteins rich in hydroxyproline were first studied in Tracheophytes which can withstand severe stress
Annexin IV domain | Role of Annexins in DT well-documented in AT
Peroxidase protein | Known for involvement in DT in rice, AT etc.
Major Facilitator Superfamily (MFS) Transporters | Play key roles in different stress conditions
Aminotransferase | Overexpression of Aspartate aminotransferase along with other genes has been patented for DT
CREB domain containing TF | Known component in stress related pathways
Ubiquitin subgroup | Known component in drought tolerance pathways
Traits for which AM analysis accomplished in DTMA-AM panel
GY_Stress_BLUPs
MSV
GLS
NDVI
Senescence
SPAD
Canopy senescence
ASI
Root traits (Shovelomics!)
Anaerobic Emergence
% reduction in shoot weight under waterlogged conditions
% reduction in root weight under waterlogged conditions
Following up the AM results
● BC-NILs for validation of important genomic regions
● Identify MARS progenies with contrasting genotypes and check the drought phenotypes
● Genotype the DH lines from DT x Normal crosses and check the phenotypes
● Introgress validated genomic regions into tester lines through MAS
Artesian – Recent Drought Tolerant Hybrid from Syngenta
[Figure: Base Hybrid vs. Artesian Hybrid]
Artesian – how was it developed?
Strategy: Association mapping (candidate gene-based); BC-MAS of 4-8 QTLs
DT source germplasm: CML333, CML322, Cateto SP VII (Brazil), Confite Morocho AYA 38 (Peru), or Tuxpeno VEN 692 (Venezuela)
AM-panel: 575 inbreds – 47 different testers (mostly S-2 and S-3 TCHs)
Phenotyping: 4 locations (Colorado, California and Chile) – Optimal & stress; yield reduction under stress was 40-60% from optimal
Genotyping: 85 polymorphisms (corresponding to 57 candidate genes) and 149 random polymorphisms across 600 lines – in total only ~250 markers
Effect sizes of identified genomic regions: 60 to 650 kg/ha
Minimum P value of any significant region: 0.0001
Significant Conclusions – DT mapping
LD in DTMA-AM panel is low and hence conducive for association mapping
55K genotype data is capable of identifying large effect genes
'Reasonably large effect' genomic regions (10-15) do exist for GY_Stress and co-locate with genes previously implicated for DT in At, rice and maize
9 genomic regions that had robust p-values together explained 35% of phenotypic variance for GY_Stress_Combined
Lines with multiple donor segments identified for validation and introgression
Two Key genes in carotenoid biosynthetic pathway identified
Lycopene epsilon cyclase (Harjes et al. 2008; Science)
Hydroxylase (CrtRB-1/HydB-1) (Yan et al. 2010; Nature Genetics)
Association Mapping based on candidate gene sequences
Breeder-ready markers developed and routinely being used in the H+ breeding program of CIMMYT for CrtRB1 and LcyE
AM leads to identification of key genes and polymorphisms
Polymorphisms validated in diverse tropical genetic backgrounds and breeder-ready high throughput markers developed
Routine use of markers and selection of favorable genotypes in H+ breeding program
MAS for LcyE + MAS for HydB + deep orange ears = High ProvitA maize!
Allele Mining for CrtRB1 (HydB1) across various Association Mapping Panels
Panel | Genotypes with Fav. allele/Total | White (W)/Yellow (Y)
CIMMYT_Syngenta CAM Panel | 24*/501 – 16 new sources | All Yellow (Y)
IMAS | 16/430 (6 from ARC, SA and 3 from KARI) | 14-W and 2-Y
Subtropical Collections | 71/1131 – many new sources | 24-W and 47-Y
ADP lines of SYNGENTA | 19/122 – "1" and 23/122 – "H" |
PS: * out of 24, 8 were previously fixed for fav. allele of CrtRB1 in the H+ breeding program through MAS
MSV – Harare 2010 data (Heritability = 0.79)
Association Mapping for Disease Resistance
GLS-combined analysis (Heritability = 0.6)
Marker | Chr | Position | Corr/Trend P value | Corr/Trend Chi-square | FDR (False discovery rate) | R2 (%) | Minor Allele | Minor Allele Freq. | Major Allele | Trait average for Minor allele | Trait average for Major allele
PZE-101093951 | 1 | 86065123 | 4.50E-08 | 29.92 | 0.002 | 11.5 | A | 0.34 | G | 1.83 | 3.08
PZE-101098418 | 1 | 92204598 | 6.47E-07 | 24.77 | 0.011 | 9.5 | G | 0.36 | A | 2.15 | 2.95
SYN36281 | 1 | 187128850 | 1.93E-06 | 22.67 | 0.019 | 8.7 | G | 0.11 | A | 2.21 | 2.72
PZE-101094082 | 1 | 86384320 | 2.45E-06 | 22.21 | 0.020 | 8.5 | G | 0.39 | A | 1.99 | 3.10
PZE-104024779 | 4 | 28770811 | 4.04E-06 | 21.24 | 0.022 | 8.2 | A | 0.15 | G | 2.26 | 2.73
PZE-101098295 | 1 | 91837910 | 5.31E-06 | 20.72 | 0.022 | 8.0 | A | 0.33 | G | 2.12 | 2.92
PZE-108038832 | 8 | 59948253 | 5.57E-06 | 20.63 | 0.021 | 7.9 | A | 0.47 | G | 2.63 | 2.70
PZE-103070254 | 3 | 111066077 | 6.36E-06 | 20.38 | 0.022 | 7.8 | G | 0.24 | A | 3.07 | 2.52
PZE-101094056 | 1 | 86365447 | 6.37E-06 | 20.37 | 0.021 | 7.8 | G | 0.50 | A | 2.16 | 3.16
PZE-108039819 | 8 | 62905375 | 7.00E-06 | 20.19 | 0.022 | 7.8 | G | 0.46 | A | 2.62 | 2.69
PZE-101090488 | 1 | 80905706 | 7.02E-06 | 20.19 | 0.020 | 7.8 | A | 0.29 | G | 1.83 | 3.00
PZE-104016598 | 4 | 16339600 | 7.13E-06 | 20.16 | 0.019 | 7.8 | A | 0.33 | C | 2.21 | 2.87
PZE-102080891 | 2 | 64845534 | 7.21E-06 | 20.14 | 0.019 | 7.7 | A | 0.28 | C | 2.19 | 2.84
PZE-101098960 | 1 | 93244458 | 7.76E-06 | 20.00 | 0.019 | 7.7 | A | 0.40 | G | 3.11 | 2.36
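The FDR column in the table reflects a multiple-testing adjustment across all SNPs scanned. The slides do not say which procedure was used; the sketch below assumes the common Benjamini-Hochberg adjustment, and the function name is hypothetical. The tabulated FDR values cannot be reproduced from the rows shown, because the adjustment depends on the P values of every marker tested.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by rising P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy example with four tests.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

A marker is then declared significant when its adjusted value falls below the chosen FDR threshold.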
MSV – Harare 2010 data (Heritability = 0.79)
Significant chromosomal regions (P < 1.0E-05) associated with MSV
resistance (Har-2010 data) based on DTMA-AM panel and 55K genotype
data (MLM)
[Figure: R and S groups at significant regions on Chr.1 (Msv1), Chr.3, Chr.4 and Chr.8; marker labels: PZE0175698629, PZE01132220936, PZA02090_1, PZA03527_1, PZA02614_2, PZA00529_4, PZA03651_1, PHM14104_23]
Validation of AM regions and Breeder-ready markers for MSV
PZE0186365075
csu1138_4
PZA00944_1
PZE0195148805
PZE01101110579
PZE01111422982
PZE0175698629
PZE-101093951
Candidate SNPs for MSV
Genomic Selection
Trait | RR-BLUP | B-LASSO | RP
Grain Color (Binary) | 0.8 | 0.82 | 0.87
QPM (Binary) | 0.96 | 0.96 | 0.95
ProA - Quant | 0.39 | 0.42 | 0.6
GLS - Quant | 0.52 | 0.53 | 0.55
MSV - Quant | 0.62 | 0.61 | 0.60
GY - Quant | 0.34 | 0.35 | 0.36
Using 55K SNP data across 300 individuals in the AM panels
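RR-BLUP, the first method in the table, shrinks all marker effects equally by solving a ridge-penalized least-squares problem. The following is a minimal sketch on toy data, not the analysis behind the table: the shrinkage parameter `lam` is passed in as an assumption (in practice it is estimated from variance components), markers are coded 0/1 as for inbred lines, and all function names are hypothetical.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= f * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def rr_blup(z, y, lam):
    """Ridge estimates of marker effects: (Z'Z + lam*I) u = Z'(y - mean)."""
    n, p = len(z), len(z[0])
    mu = sum(y) / n
    col_means = [sum(row[j] for row in z) / n for j in range(p)]
    zc = [[row[j] - col_means[j] for j in range(p)] for row in z]
    lhs = [[sum(r[i] * r[j] for r in zc) + (lam if i == j else 0.0)
            for j in range(p)] for i in range(p)]
    rhs = [sum(r[i] * (yi - mu) for r, yi in zip(zc, y)) for i in range(p)]
    return mu, col_means, solve(lhs, rhs)


def gebv(genotype, mu, col_means, effects):
    """Genomic estimated breeding value of one line from its marker codes."""
    return mu + sum((g - m) * u for g, m, u in zip(genotype, col_means, effects))


# Toy training set: 4 lines, 2 markers, phenotype = 10 + 2*m1 - 1*m2.
z = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [10.0, 9.0, 12.0, 11.0]
mu, means, effects = rr_blup(z, y, lam=1.0)
print(effects)                      # shrunken toward zero relative to (2, -1)
print(gebv([1, 0], mu, means, effects))
```

With `lam=1.0` the estimated effects are half the true values in this toy case, which illustrates the shrinkage; prediction quality is normally judged by cross-validation rather than by fit to the training lines.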
Integrating GS in breeding pipeline (DH + off-season nursery + GS)
Season | Activity
Summer:
• Grow 50-100 F2s/BC1s
• Select 50 plants/cross and cross to haploid inducer
Winter:
• Chromosome doubling of putative haploids to get DHs
• Seed chip (one kernel/DH) 2500 – 5000 DH kernels
• Discard disease susceptible DHs through specific marker screening
• Select DHs through GY-GEBVs and seed increase (top 5-10%)
Summer:
• Test cross GEBV-selected DH lines to one/two testers
• Yield trials of DH-TCHs
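The GEBV selection step in the winter season amounts to ranking DH lines by predicted value and keeping a fixed top fraction. A minimal sketch follows; the function name, the line names and the GEBV values are hypothetical, and ties are broken by input order.

```python
import math


def select_top_fraction(gebvs, fraction):
    """Return the names of the top `fraction` of lines ranked by GEBV."""
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    # Small tolerance guards against floating-point noise before rounding up.
    keep = max(1, math.ceil(len(gebvs) * fraction - 1e-9))
    ranked = sorted(gebvs.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:keep]]


# Ten hypothetical DH lines with made-up GEBVs for grain yield.
example = {f"DH{i}": value for i, value in enumerate(
    [1.2, 0.3, 2.5, -0.4, 0.9, 1.8, -1.1, 0.0, 2.1, 0.6], start=1)}
print(select_top_fraction(example, 0.2))
```

At the scale given on the slide, keeping 5% of 2500 DH kernels retains 125 lines and keeping 10% of 5000 retains 500.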
THANKS
% phenotypic variance explained by structure alone…in DTMA-AM panel
Trait/Location | % phenotypic variance explained by 10 PCs
GY_Stress_Combined_BLUP | 15.8
MSV (Harare2010+09-1) | 38.2
GLS_Combined | 25.1
GLS_Har_10 | 8.8
GLS_Kakamega | 11.5
GLS_Columbia_Scatalina | 30.2
GLS_San Pedro_Mexico | 29.6
GLS_Acatec_Mex | 23
GLS_Paraguacito_Columbia | 6.7