Genome wide association mapping of Sclerotinia sclerotiorum resistance in soybean
with a genotyping by sequencing approach
Maxime Bastien, Humira Sonah and François Belzile*
Département de Phytologie and Institut de Biologie Intégrative et des Systèmes (IBIS),
Université Laval, Quebec City, Quebec, Canada G1V 0A6
Received 4 October 2013.
*Corresponding author: [email protected]
The Plant Genome: Posted 20 Dec. 2013; doi: 10.3835/plantgenome2013.10.0030
Abstract
Sclerotinia stem rot (SSR) is one of the most important pests in cool soybean growing
regions of the Northeastern United States and Canada. However, the intensity of
infestations varies considerably from year to year according to weather conditions, thus
making it difficult for breeders to select under uniform disease pressure. Selection for
resistance to SSR would be greatly facilitated by the use of molecular markers. In this
work, a collection of 130 lines was inoculated using the cotton pad method and was
genetically characterized using a genotyping-by-sequencing protocol optimized for
soybean. Genome-wide association mapping and linkage disequilibrium (LD) analyses
were performed with 7,864 single nucleotide polymorphisms (SNPs). LD varied
considerably over physical distance, decaying to an r² value of 0.2 after 8.5 Mb in the
pericentromeric region and 0.5 Mb in the telomeric region. The mixed linear model
performed very well in accounting for population structure and relatedness, as only 5.5%
of the observed p-values were < 0.05. The strongest association was found on
chromosome Gm15 (p-value = 1.38 × 10⁻⁶; q-value = 0.011). Two additional SNP markers
in the vicinity had a q-value < 0.1. The peak marker was validated in the progeny of a
biparental cross, where F4:6 lines carrying the susceptibility allele developed lesions 17.6
mm longer than lines carrying the resistance allele. Interestingly, other genes
contributing to resistance to pathogens have been reported in this region of Gm15.
Three other association peaks having a q-value < 0.1 were detected on chromosomes
Gm01, Gm19 and Gm20.
Abbreviations: AM, association mapping; FDR, false discovery rate; GBS, genotyping-
by-sequencing; GLM, general linear model; GWAM, genome-wide association mapping;
LD, linkage disequilibrium; LRR, leucine-rich repeat; LSD, least significant difference;
MAF, minor allele frequency; MAS, marker-assisted selection; MLM, mixed linear model;
PC, principal component; PCA, principal component analysis; QTL, quantitative trait loci;
RAD, restriction site associated DNA; RIL, recombinant inbred lines; RRL, reduced
representation libraries; SLAF, specific-locus amplified fragment; SNP, single nucleotide
polymorphism; SSR, sclerotinia stem rot.
Sclerotinia stem rot (SSR) of soybean (Glycine max (L.) Merr.), caused by Sclerotinia
sclerotiorum (Lib.) de Bary, is one of the most important diseases in the northern United
States and Canada. In the United States, it was among the top ten diseases that
affected soybean yield in three of four years between 2006 and 2009, even ranking
second in 2009 (Koenning and Wrather, 2010). The development of SSR is highly
sensitive to fluctuations in humidity and temperature (Boland and Hall, 1987; Phillips,
1994; Workneh and Yang, 2000; Mila and Yang, 2008). Therefore, the disease can
cause considerable damage in soybean one year and be almost absent the following. In
Eastern Canada, and particularly in the province of Quebec, it is the most important
disease in soybean and resistance/susceptibility is assessed in official varietal
registration trials.
Chemical control of SSR is difficult to achieve because several preventive and systemic
treatments are required (Mueller et al., 2004). Biological control agents including
Coniothyrium minitans, Streptomyces lydicus and Trichoderma harzianum are all
effective in reducing the number of sclerotia, with C. minitans being the most
effective (Zeng et al., 2012). However, these products need to be applied yearly to
achieve the best efficacy, yet they prove unnecessary in years when climatic conditions
are not favorable for disease development. Cultural practices such as crop rotation
(Kurle et al., 2001; Rousseau et al., 2007), reduced tillage (Sutton and Peng, 1993;
Gracia-Garza et al., 2002; Mueller et al., 2002), and wide row spacing (Kurle et al.,
2001; Mila and Yang, 2008) have been reported to reduce the impact of SSR in soybean
fields. The rationale behind these practices is to decrease inoculum and/or maintain
unfavorable conditions for fungal development.
Although no complete resistance to S. sclerotiorum has been described in soybean,
important differences in susceptibility to the pathogen have been reported (Kim et al.,
1999; Hartman et al., 2000; Hoffman et al., 2002; Chen and Wang, 2005; Bastien et al.,
2012). Moreover, in the field, true physiological resistance may be confounded with
escape or avoidance mechanisms (Kim and Diers, 2000; Rousseau et al., 2004). Overall,
the development of cultivars exhibiting enhanced resistance is one of the most effective
means to manage SSR (Grau, 1988; Kurle et al., 2001) and is an important objective of
breeding programs targeted at northern soybean growing areas.
To date, quantitative trait loci (QTLs) for white mold resistance in soybean have been
reported by Kim and Diers (2000), Arahana et al. (2001), Guo et al. (2008), Han et al.
(2008), Vuong et al. (2008), Huynh et al. (2010), Li et al. (2010) and Sebastian et al.
(2010). The mapping populations were generated from crosses between a resistant and
a susceptible parent. In all but one study, it proved challenging to detect the same
QTLs in different trials. The exception, which is also the most reproducible work, identified three QTLs on
chromosomes 06 and 20, in three out of four different field trials (Huynh et al., 2010).
Selective phenotyping confirmed the impact of these putative QTLs in four additional
trials. The cotton pad method used in this study measures the length of lesions on the
main stem following inoculation with mycelium. It provides a reproducible measure of a
component of physiological resistance to SSR, namely to the infection via floral buds, in
both field and controlled conditions (Bastien et al., 2012).
Such QTL mapping in biparental crosses is limited both in the diversity sampled (two
parents per population) and in the resolution provided, owing to the low number of recombination
events incurred during population development. An alternative mapping approach,
genome-wide association mapping (GWAM), is gaining popularity to identify genes of
interest in plants. Compared to linkage mapping, it enables the study of many genotypes
at once and, if a sufficient number of markers is used, generates more precise QTL
positions. GWAM has been shown to have the potential to dissect the genetic basis of
complex traits in Arabidopsis (Atwell et al., 2010) and rice (Huang et al., 2012). However,
population structure and kinship present within the association mapping (AM) population
must be taken into consideration to avoid detecting spurious associations. In addition,
because a statistical test is performed between each marker and the trait, traditional p-
value cutoffs of 0.01 or 0.05 have to be made stricter to avoid an abundance of false
positive results.
In soybean, GWAM studies using fewer than 200 microsatellite markers have reported
associations to various quantitative traits (Hou et al., 2011; Li et al., 2011; Korir et al.,
2013; Niu et al., 2013; Zuo et al., 2013). Such low marker density inevitably leads to
low power to detect QTLs and to imprecise QTL localization, thus reducing the
usefulness of these markers for marker-assisted selection (MAS). Three other GWAM studies have
been conducted using the GoldenGate assay, a high-throughput analysis method
capable of genotyping 1,536 SNPs on sets of 96 samples (Hyten et al., 2008; 2010b). In
the first study, two populations of advanced soybean breeding lines were screened for
iron deficiency chlorosis tolerance (Mamidi et al., 2011). Using the best model to control
for population structure, 15.5% and 18.7% of marker-trait association p-values were
below 0.05. This obvious inflation of observed p-values relative to the expected p-values
could have led to the reporting of many false positives. The two other GWAM
studies were conducted to identify QTLs associated with chlorophyll, chlorophyll
fluorescence parameters, yield and yield components in soybean landraces (Hao et al.,
2012a; b). Here, no correction for multiple testing was used, potentially leading to the
reporting of many false positive associations.
All of these GWAM studies have relied on relatively small numbers of markers
(hundreds to a thousand). Recent developments in high throughput next-generation
sequencing technologies now offer the opportunity to considerably increase the number
of SNPs in such studies. Although re-sequencing still remains too expensive to routinely
carry out on a large set of genotypes, several methods have been developed that
involve sequencing only a small fraction of the entire genome. Four main complexity
reduction methods have been described to date: Restriction site Associated DNA (RAD)
sequencing, Reduced Representation Libraries (RRL), Specific-Locus Amplified
Fragment (SLAF) sequencing, and Genotyping By Sequencing (GBS). Initially
developed in animals and fungi (Miller et al., 2007; Baird et al., 2008), RAD sequencing
has been applied in barley (Chutimanitsakun et al., 2011) and rapeseed (Bus et al.,
2012) but not yet in soybean. Using the RRL approach in soybean, 1,682, 7,947, 25,047
and 14,550 SNPs were identified in four independent studies (Deschamps et al., 2010;
Wu et al., 2010; Hyten et al., 2010a; Varala et al., 2011). Although highly promising as a
SNP discovery tool, the RRL approach required between 6 and 50 µg of DNA per
sample, a quantity that is not very practical for genotyping on a large number of lines.
The SLAF sequencing approach proposed in soybean (Sun et al., 2013) requires two
digestions, PCR amplification and purification steps, in addition to size selection on gels.
With such complexity this approach is unlikely to gain much popularity in soybean.
The GBS method described by Elshire et al. (2011) in maize and barley involves a
greatly simplified library production procedure more amenable to use on large numbers
of individuals/lines. There is no size selection step of the digested DNAs, enabling it to
be carried out using small amounts of DNA (100 ng). When adapted to soybean with the
ApeKI enzyme, a total of 10,120 high quality SNPs was discovered among eight diverse
soybean lines (Sonah et al., 2013). Their distribution mirrored closely the distribution of
gene-rich regions in the soybean genome, thus making GBS an attractive approach to
rapidly and efficiently genotype a large number of soybean lines with thousands of SNP
markers.
This paper reports the use of the GBS approach for the identification of QTLs
contributing to SSR resistance in soybean via a GWAM approach. The results suggest
that AM is a useful strategy for dissecting complex traits in soybean, thus providing a
valuable tool to assist in plant breeding. The high number of SNPs also enabled an
evaluation of LD decay over physical distance.
Materials and Methods
Soybean Lines
A panel of 130 soybean lines was used (see list in Supplemental Table 1). The lines are
representative of the diversity present in a private breeding program (Semences Prograin
Inc.) in Eastern Canada and cover maturity groups (MG) 000-II, with the sole exception of
Williams 82 (MG III). Among these, three varieties were known to show good levels of resistance to
SSR (Karlo RR, Maple Donovan and S19-90), while three were moderately (OAC
Bayfield and Williams 82) or highly susceptible (Nattosan) (Bastien et al., 2012). Maple
Donovan and Nattosan are commercial cultivars from the Eastern Cereal and Oilseed
Research Centre (Agriculture and Agri-Food Canada, Ottawa, Canada), while S19-90 is
a commercial cultivar from Syngenta Seeds. Williams 82 was obtained from the
American Germplasm Resources Information Network. Karlo RR is a cultivar from
Semences Prograin (St-Césaire, QC, Canada). Seeds of the other lines were obtained
from Semences Prograin.
Validation Populations
Two populations of 192 F4:5 lines segregating for the SNP marker most highly
associated with SSR resistance were used for validation purposes. Population 1 was
generated from the cross PR918827 x PR935401. Both genotypes are advanced
breeding lines from Semences Prograin. PR918827 carries the resistance allele at this
locus while PR935401 carries the susceptibility allele. Population 2 was generated from
the cross PR918827 x Toma. PR918827 carries the resistance allele at this locus while
Toma carries the susceptibility allele.
Phenotypic Evaluation
The association mapping (AM) panel was sown in the greenhouse at Université Laval in
a randomized complete block design comprising four blocks separated in time. Planting
dates were 25 Sept 2009, 6 Nov 2009, 18 Dec 2009 and 29 Jan 2010. The validation
experiments were sown in the greenhouse on 21 Sept 2012 (population 1) and on 2 Nov
2012 (population 2). A randomized complete block design comprising three blocks was
used. For all experiments, experimental units consisted of a total of six plants grown in
pairs in three 6-L pots. The potting mix was made of 50% black earth, 30% perlite and
20% Promix (Premier Tech Horticulture, Rivière-du-Loup, QC, Canada). Seeds were
inoculated with RhizoStick® inoculant (Becker Underwood, Ames, IA) at sowing. Plants
were grown under natural light supplemented with 600 W high-pressure sodium lamps
(P.L. Light Systems, Beamsville, ON, Canada) to provide a 16-h photoperiod. During the
growing period prior to inoculations the day/night temperature was 26/22°C.
Inoculum was prepared from strain NB-5 (provided by Dr. S. Rioux of CEROM, Quebec
City, QC, Canada). Inoculations were performed over several days because of
differences in flowering date. A pot was inoculated when both plants had reached the R1
growth stage. The cotton pad method described in Bastien et al. (2012) was used.
Briefly, S. sclerotiorum grown in potato dextrose broth was homogenized for 30 s in a
Waring blender (New Hartford, CT). Pieces (2.7 x 5.5 cm) of cotton pad [U.S. Cotton
(Canada) Co., Montreal, QC, Canada] were then soaked in the suspension. The
inoculum was applied on the petiole of the lowest node bearing flowers. After inoculation,
plants were transferred to a different greenhouse where day/night temperatures were
22/18°C. Humidity was controlled based on water pressure deficit (maintained at 2.5
g/m³ with a fogging system). Lesion length was measured 7 d after inoculation.
DNA Extraction, Library Preparation and Sequencing
DNA was extracted from 50 mg fresh young leaves using the DNeasy 96 Plant kit
(Qiagen, cat. no. 69181) following the manufacturer’s protocol. DNA was quantified
using a Thermo Scientific Nanodrop 8000 spectrophotometer (Wilmington, DE). DNA
concentrations were normalized to 10 ng/µl and subsequently used for library
preparation. Three ApeKI libraries (48-plex) were prepared according to the GBS
protocol described by Elshire et al. (2011). Fourteen genotypes unrelated to this work
were included in one of the three GBS libraries for a total of 144 DNA samples. Single-
end sequencing was performed on three lanes of an Illumina HiSeq2000 (at the McGill
University-Génome Québec Innovation Center in Montreal, QC, Canada).
Processing of Illumina Raw Sequence Read Data and SNP Calling
The pipeline described by Sonah et al. (2013) was used for the processing of Illumina
108-bp reads. A maximum of one third of missing data per marker was tolerated. Finally,
missing information in the filtered set of high-quality SNPs was imputed using
fastPHASE (Scheet and Stevens, 2006).
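The missing-data criterion stated above can be illustrated with a minimal sketch. It is not the pipeline of Sonah et al. (2013); the genotype encoding (None for a missing call), the marker names and the function name are assumptions made for illustration only.

```python
from typing import Dict, List, Optional

# Assumed encoding: one allele call per inbred line, None when the call is missing.
Genotype = Optional[str]


def filter_by_missingness(
    markers: Dict[str, List[Genotype]], max_missing: float = 1 / 3
) -> Dict[str, List[Genotype]]:
    """Keep markers whose fraction of missing calls does not exceed max_missing."""
    kept = {}
    for name, calls in markers.items():
        missing_fraction = sum(1 for call in calls if call is None) / len(calls)
        if missing_fraction <= max_missing:
            kept[name] = calls
    return kept


# Hypothetical toy data: m1 has one third missing (kept), m2 has two thirds (dropped).
example = {"m1": ["A", "A", None], "m2": ["A", None, None]}
retained = filter_by_missingness(example)
```

Markers retained by such a filter would then be passed on to imputation.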
Genotypic Data Analysis
SNP marker data were imported into TASSEL version 3.0 standalone (Bradbury et al.,
2007). Indels, markers having a minor allele frequency (MAF) below 5% and markers
located on unanchored scaffolds were eliminated from the dataset. The remaining SNP
markers were used for analysis of population structure, linkage disequilibrium and
marker-trait associations.
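As an illustration of the MAF criterion, the following sketch computes the minor allele frequency of a biallelic SNP scored in inbred lines and applies the 5% threshold. The filtering in this work was done in TASSEL; the function names and the one-allele-per-line encoding are assumptions, and the removal of indels and unanchored markers is not shown.

```python
from collections import Counter
from typing import List, Optional


def minor_allele_frequency(calls: List[Optional[str]]) -> float:
    """MAF of a biallelic SNP; one allele call per inbred line, None = missing."""
    counts = Counter(call for call in calls if call is not None)
    total = sum(counts.values())
    if total == 0 or len(counts) < 2:
        return 0.0  # monomorphic or entirely missing
    return min(counts.values()) / total


def passes_maf(calls: List[Optional[str]], threshold: float = 0.05) -> bool:
    """True when the marker would be retained (MAF at or above the threshold)."""
    return minor_allele_frequency(calls) >= threshold
```

For example, a SNP carried by 1 line out of 20 sits exactly at the threshold and is retained, whereas a SNP carried by 1 line out of 100 is discarded.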
Linkage Disequilibrium Analysis
Decay of LD between marker loci was assessed using the squared allele frequency
correlation (r²) between pairs of loci located on the same chromosome (Hill and
Robertson, 1968). To take into account the variability of recombination over the genome,
chromosomes were divided into telomeric (low LD) and pericentromeric (high LD)
regions. Borders between these regions were determined based on the mean r² value
for markers located in a 1-Mb sliding window (0.1-Mb increments). Windows with fewer
than 10 marker pairs (i.e., <5 markers) were almost exclusively found in the repeat-rich
pericentromeric regions and were assigned a value of r² = 1. Starting from each end of a
chromosome, a series of 10 consecutive windows with r² = 1 was considered to mark the
beginning of the pericentromeric region. In a few cases, the first high LD zone
encountered was clearly separate from the large body of pericentric heterochromatin
(extended zone of high LD) reported in Schmutz et al. (2010). In such cases the second
“high LD” region was chosen to define the border. The precise boundary between
telomeric and pericentromeric regions was taken to be the midpoint of the first 1-Mb
window. LD plots of all chromosomes are found in Supplementary Fig. 1. A summary of
regions and SNP coverage is shown in Table 1.
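The window-based border criterion can be sketched as follows. This is an illustrative reading of the procedure, not the code used in this work: it assumes biallelic SNPs coded 0/1 in inbred lines, and all function names are hypothetical.

```python
from itertools import combinations
from statistics import mean
from typing import List, Optional, Sequence, Tuple


def r_squared(a: Sequence[int], b: Sequence[int]) -> float:
    """Squared allele-frequency correlation between two loci coded 0/1 per line."""
    n = len(a)
    pa, pb = sum(a) / n, sum(b) / n
    pab = sum(x * y for x, y in zip(a, b)) / n
    d = pab - pa * pb
    denom = pa * (1 - pa) * pb * (1 - pb)
    return d * d / denom if denom > 0 else 0.0


def window_means(
    positions: Sequence[int],
    genotypes: Sequence[Sequence[int]],
    chrom_length: int,
    window: int = 1_000_000,
    step: int = 100_000,
    min_pairs: int = 10,
) -> List[Tuple[int, float]]:
    """Mean r2 per sliding window; windows with too few marker pairs get r2 = 1."""
    results = []
    start = 0
    while start + window <= chrom_length:
        idx = [i for i, p in enumerate(positions) if start <= p < start + window]
        pairs = list(combinations(idx, 2))
        if len(pairs) < min_pairs:
            results.append((start, 1.0))
        else:
            values = [r_squared(genotypes[i], genotypes[j]) for i, j in pairs]
            results.append((start, mean(values)))
        start += step
    return results


def first_run_start(values: Sequence[float], run: int = 10) -> Optional[int]:
    """Index of the first window opening a series of `run` consecutive r2 = 1 windows."""
    streak = 0
    for i, value in enumerate(values):
        streak = streak + 1 if value == 1.0 else 0
        if streak == run:
            return i - run + 1
    return None
```

Scanning the window means from each chromosome end with first_run_start gives a candidate border; the manual adjustment described for isolated high-LD zones is not reproduced here.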
LD decay with physical distance was plotted both for the entire dataset and separately
for the telomeric and pericentromeric regions described above. A regression of r²
against distance was performed for all marker pairs using the R script LDit
(http://www.rilab.org/code/files/LDit.html, page verified 25 Sept 2013).
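To make the idea of reading a threshold distance off a fitted decay curve concrete, here is a minimal sketch. It does not reproduce the LDit script. It assumes the simple decay form r² = 1/(1 + c·d) and estimates c with a crude linearized least-squares fit through the origin, which is a stand-in chosen for brevity; the function names are hypothetical.

```python
from typing import Sequence


def fit_decay_rate(distances: Sequence[float], r2_values: Sequence[float]) -> float:
    """Estimate c in r2 = 1 / (1 + c * d) by regressing (1/r2 - 1) on d through the origin."""
    numerator = 0.0
    denominator = 0.0
    for d, r2 in zip(distances, r2_values):
        if d <= 0 or r2 <= 0:
            continue  # the linearization is undefined for these pairs
        y = 1.0 / r2 - 1.0
        numerator += d * y
        denominator += d * d
    if denominator == 0:
        raise ValueError("no usable marker pairs")
    return numerator / denominator


def distance_at_threshold(c: float, threshold: float = 0.2) -> float:
    """Distance at which the fitted curve falls to the given r2 threshold."""
    return (1.0 / threshold - 1.0) / c


# Synthetic check: data generated with c = 2e-6 per bp should cross r2 = 0.2 at 2 Mb.
true_c = 2e-6
dist = [1e5, 5e5, 1e6]
r2 = [1.0 / (1.0 + true_c * d) for d in dist]
fitted_c = fit_decay_rate(dist, r2)
crossing = distance_at_threshold(fitted_c)
```

Real r² values are noisy and bounded, so a nonlinear fit on the untransformed values would normally be preferred over this linearization.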
Association Analysis
All marker-trait association tests were run in TASSEL version 3.0 standalone (Bradbury
et al., 2007). A principal component analysis (PCA) was conducted to assess population
structure and a kinship (K) matrix was calculated to estimate familial relatedness
between lines. On the basis of the Scree plot, the first 16 PCs were used to capture
population structure in the association analyses. Four different models were tested: a) a
naïve analysis using only the general linear model (GLM); b) a GLM analysis with
principal components (P) as a cofactor; c) a mixed linear model (MLM) analysis using
the kinship matrix (K) as a cofactor; and d) a MLM analysis in which both population
structure and relatedness (P+K) were used as cofactors (Zhang et al., 2010). Quantile-
quantile plots were produced to assess the extent to which the analysis produced more
significant results than expected by chance.
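The calibration check behind a quantile-quantile plot can be sketched as below. The plots in this work were produced in TASSEL; this sketch assumes the common convention of expected quantiles i/(n + 1) under a uniform null, and the function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def qq_points(p_values: Sequence[float]) -> List[Tuple[float, float]]:
    """Pairs of (expected, observed) -log10(p), from least to most significant rank order."""
    n = len(p_values)
    observed = sorted(p_values)  # smallest p-value first
    return [
        (-math.log10((rank + 1) / (n + 1)), -math.log10(p))
        for rank, p in enumerate(observed)
    ]


def fraction_below(p_values: Sequence[float], alpha: float = 0.05) -> float:
    """Share of tests with p < alpha; close to alpha when the model is well calibrated."""
    return sum(p < alpha for p in p_values) / len(p_values)
```

Points lying well above the diagonal of the resulting plot, or a fraction_below value far above alpha, indicate inflation of the observed p-values.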
The critical values for assessing the significance of marker-trait associations were
calculated using QVALUE (Storey and Tibshirani, 2003). The q-value is a measure of
significance in terms of the false discovery rate, much as the p-value is a measure in terms of the
false positive rate. This approach limits the number of false positive results while
offering a more liberal criterion than the Bonferroni correction. Marker-trait
associations having a q-value below 0.1 were declared significant.
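As a rough illustration of false discovery rate control, the sketch below computes Benjamini-Hochberg adjusted p-values. This is not the QVALUE software: the Benjamini-Hochberg adjustment corresponds to a q-value estimate in which the proportion of true null hypotheses is taken to be 1, so it is somewhat more conservative. The function name is hypothetical.

```python
from typing import List, Sequence


def bh_adjust(p_values: Sequence[float]) -> List[float]:
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p-values; markers with an adjusted value below 0.1 would be declared significant.
adjusted = bh_adjust([0.01, 0.04, 0.03, 0.5])
significant = [a < 0.1 for a in adjusted]
```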
QTL Validation Experiments
A codominant cleaved amplified polymorphic sequence marker was developed and used
to genotype the candidate SNP on Gm15 in the crosses segregating for this marker. The
locus was amplified with specific primers (5’-TACCAAAATAACTTGTCTTGCAGCTTGG-
3’ and 5’- GCGGAGGAGCAAGCAGCTTATATGG-3’) and the resulting amplicon was
digested with ApeKI, with digestion of the amplicon signaling the resistance allele. For
each population, three F4:5 plants per row were genotyped and a line was considered to
have reached fixation at the marker when all three plants were homozygous for the
same allele. Based on this information, 24 F4:5 plants homozygous for each of the
resistance or the susceptibility allele were selected and F4:6 progeny of these 48 plants
were tested for their reaction to SSR as described above. Resistant genotypes Karlo RR
and S19-90 as well as susceptible genotypes Nattosan and OAC Bayfield were used as
checks in these experiments.
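The logic of scoring such a marker can be sketched in a few lines. ApeKI recognizes the sequence GCWGC (W = A or T), so an allele is cut only if the SNP completes that site. Because an amplicon may contain other ApeKI sites, the sketch looks only at sites that span the SNP position. The sequences below are hypothetical toy examples, not the actual amplicon, and the function names are assumptions.

```python
import re
from typing import List

# Lookahead so that overlapping GCWGC sites are all reported.
APEKI_SITE = re.compile(r"(?=GC[AT]GC)")


def apeki_site_starts(seq: str) -> List[int]:
    """0-based start positions of every ApeKI recognition site in seq."""
    return [match.start() for match in APEKI_SITE.finditer(seq.upper())]


def cut_at_snp(seq: str, snp_index: int) -> bool:
    """True if an ApeKI site (5 bp) spans the SNP position in this allele."""
    return any(start <= snp_index < start + 5 for start in apeki_site_starts(seq))


# Toy alleles differing at index 4: the A allele completes GCAGC, the G allele does not.
cut_allele = "TTGCAGCTT"
uncut_allele = "TTGCGGCTT"
```

With these toy alleles, cut_at_snp returns True for the first and False for the second, mirroring how digestion distinguishes the two alleles on a gel.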
Statistical Analysis of Phenotypic Data
An analysis of variance was performed on the phenotypic data using PROC GLM of
SAS (Version 9.3, SAS Institute, Cary, NC) for a randomized complete block design.
Analysis of residuals and PROC UNIVARIATE confirmed the assumptions that
experimental errors were normally distributed around a zero mean, and had a common
variance. Fisher’s protected least significant difference (LSD) at alpha = 0.05 was used
to test the differences among genotypes. Analyses of variance for the QTL validation
experiments were performed using PROC MIXED (SAS Release Version 9.3, SAS
Institute, Cary, NC). Alleles were analyzed as fixed effects, while genotypes and
replications were analyzed as random effects. Genotypes were nested within alleles.
Results
Evaluation of SSR Resistance in a Panel of 130 Soybean Lines
The distribution of lesion lengths among lines in the AM panel is shown in Fig. 1 and the
data for each line are provided in Supplemental Table 1. Lesion lengths covered a very
broad range (29 mm to 192 mm, mean of 114 mm), and the distribution was bell-shaped
with few highly resistant or highly susceptible lines and a large majority exhibiting
intermediate reactions. Resistant checks Karlo RR, S19-90 and Maple Donovan all
developed shorter lesions than the average (29, 44 and 81 mm, respectively). One of
the moderately susceptible checks, Williams 82, ranked near the average (118 mm),
while another, OAC Bayfield, developed longer lesions (160 mm). The highly susceptible
cultivar Nattosan developed the longest lesions among all checks (177 mm).
Marker Distribution
The pooled GBS libraries were sequenced in three lanes of one flow cell, generating
285.9 million reads for the 144 samples included in the three 48-plex libraries. The 130 lines
belonging to the association panel yielded 2.1 million reads per line on average, for a
total of 266.7 million reads. The number of reads per line varied between 0.5 and 5.8
million. A total of 12,193 SNPs was initially called for this set of lines. A large portion
(35.3%) of these was found to have a MAF < 5% and these SNPs were discarded,
leaving 7,893 SNPs with a MAF ≥ 5% (Fig. 2). Of these, 29 SNPs mapped to scaffolds
that are currently unassigned to a chromosome and the 7,864 that mapped onto one of
the 20 soybean chromosomes were used for the ensuing analyses. In total, these
markers cover 945.5 Mb of the genome. The largest number of SNPs was found on
chromosome 18 (626 SNPs), followed by chromosome 15 (516 SNPs), and the lowest
number of SNPs was observed on chromosomes 11 (173 SNPs) and 12 (248 SNPs).
The distribution of SNPs on each chromosome is presented in Table 1. For the purpose
of refining the analysis of LD, the chromosomes were divided into telomeric and
pericentromeric regions (as described in the Materials and Methods) to properly reflect
the marked differences in marker coverage, recombination and LD in these regions.
Summed over all chromosomes, the telomeric (“low LD”) regions spanned a total of 400.9
Mb and 5,579 SNPs were detected in this zone, for a coverage of one SNP every 71.9 kb.
In contrast, the pericentromeric (“high LD”) regions spanned 544.7 Mb and 2,285 SNPs
were detected in this zone, for a coverage of one SNP every 238.4 kb.
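The coverage figures follow directly from dividing the physical span of each region by its SNP count, as this short check shows (the function name is illustrative only).

```python
def kb_per_snp(span_mb: float, n_snps: int) -> float:
    """Average spacing, in kb, between SNPs spread over a region of span_mb megabases."""
    return span_mb * 1000 / n_snps


telomeric = round(kb_per_snp(400.9, 5579), 1)        # telomeric ("low LD") regions
pericentromeric = round(kb_per_snp(544.7, 2285), 1)  # pericentromeric ("high LD") regions
total_snps = 5579 + 2285                              # all SNPs used in the analyses
```

The two regions together account for all 7,864 SNPs retained for analysis.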
Linkage Disequilibrium Analysis
LD analysis was performed using 7,864 SNPs. Decay of LD over physical distance is
presented in Fig. 3. For the entire data set, the regression curve fitted to the LD plot falls
below r² = 0.2 at ~0.85 Mb (Fig. 3a). Within the telomeric regions of the chromosomes,
the regression curve crosses this threshold value at ~0.5 Mb (Fig. 3b). In stark contrast,
in the pericentromeric region, the LD curve falls below r² = 0.2 at ~8.5 Mb (Fig. 3c). As
the mean distance between markers was inflated due to a relatively small number of
marker pairs that were exceptionally distant, the median distance between SNPs was
deemed more representative. Within the telomeric region, the median distance between
markers is 26.7 kb and it is 65.3 kb in the pericentromeric region.
To examine how tightly the alleles at two loci are associated in these two distinct regions
of the genome, we then estimated the median r² between marker pairs separated by
<100 kb within either the pericentromeric or telomeric regions. As can be seen in Fig. 4a,
within the pericentromeric regions, markers remain tightly associated (r² > 0.8) at
distances as high as 21.1 kb, and even at large distances (~100 kb), marker pairs retain
a high level of association (r² > 0.67). In contrast, within the telomeric regions (Fig. 4b),
median r² values fall below 0.8 at only 3.6 kb and below 0.5 at 24.5 kb.
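A summary of this kind can be obtained by grouping marker pairs into distance classes and taking the median r² in each class. The sketch below is illustrative only: the 10-kb bin width is an assumption (the bin width used for Fig. 4 is not stated here), and the function name is hypothetical.

```python
from collections import defaultdict
from statistics import median
from typing import Dict, Iterable, List, Tuple


def median_r2_by_bin(
    pairs: Iterable[Tuple[int, float]],
    bin_size: int = 10_000,
    max_distance: int = 100_000,
) -> Dict[int, float]:
    """Median r2 per distance bin, keyed by bin start (bp), for pairs closer than max_distance."""
    bins: Dict[int, List[float]] = defaultdict(list)
    for distance, r2 in pairs:
        if 0 <= distance < max_distance:
            bins[(distance // bin_size) * bin_size].append(r2)
    return {start: median(values) for start, values in sorted(bins.items())}


# Hypothetical (distance in bp, r2) pairs; the last pair lies beyond 100 kb and is ignored.
toy_pairs = [(500, 0.9), (2_000, 0.7), (9_999, 0.8), (15_000, 0.4), (150_000, 0.1)]
summary = median_r2_by_bin(toy_pairs)
```

Running the function separately on telomeric and pericentromeric marker pairs would give the two profiles being compared.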
Population Structure Analysis
A principal component analysis (PCA) was performed on the 130 lines of the AM panel.
Principal component 1 (PC1) explained 5.9% of the variation in the data, while PC2 and
PC3 explained 4.7% and 4.5% of the variation, respectively. When observing these first
three axes of the PCA (Supplementary Fig. 2) we saw no clear grouping among lines,
indicating a low level of population structure. The first 16 PCs used in the association
analyses (as determined based on the Scree plot) captured 42.8% of the variability.
Association Mapping
The number of significant associations between SNPs and lesion length varied between
the statistical methods tested. The naïve GLM model detected the largest number of
significant associations at q-value < 0.1 (428). This method does not account for any
possible confounding effects that could lead to false positives, which resulted in an inflation of
the cumulative distribution of observed p-values relative to the expected p-values (Fig. 5). The
GLM model taking into account population structure (PCA) detected 13 significant
associations but still led to an inflation of the cumulative distribution of observed p-values relative
to expected p-values. The MLM model taking into account familial relatedness (K) did
not detect any significant association. The fact that this model provided fewer significant
results than expected by chance suggests that it may be overly conservative. The mixed
model correcting for both population structure and familial relatedness (PCA+K) yielded
the best fit to the theoretical distribution. In total, 5.5% of marker-trait associations had a
p-value below 5% and this model detected a total of 10 significant marker-trait
associations (q-value < 0.1) defining four genomic regions (Fig. 6 and Table 2).
The most significant region comprises three SNPs covering 312 kb on Gm15. The
marker with the strongest association with lesion length (q-value = 0.011) is located at
position 13,651,235 and explains 14.5% of the variation for SSR resistance. It is the sole
marker that was detected in both the PCA and PCA+K analyses. Genotypes carrying the
minor allele (A) at this locus developed lesions 15.1 mm shorter than genotypes carrying
the major allele (G). The second significant region comprises five consecutive SNPs on
Gm01 between positions 29,185,984 and 31,164,344. They share an identical q-value of
0.040 and explain 7.3% of the variation. Genotypes carrying the minor allele at these
loci developed lesions 5.4 mm shorter than genotypes carrying the major allele. The
next significant region is marked by a single SNP located on Gm20 at position
39,698,515 (q-value = 0.094). It explained 6.3% of the variation for resistance to SSR
and genotypes carrying the minor allele (A) at this locus developed lesions 8.0 mm
shorter than genotypes carrying the major allele (G). The last significant region is also
marked by a single SNP located on Gm19 at position 50,557,054 (q-value = 0.094). It
explained 7.2% of the variation for resistance to SSR. Genotypes carrying the minor
allele (A) at the locus on Gm19 developed lesions 9.9 mm longer than genotypes
carrying the major allele (C). Together, these four most significant markers
explained 35.3% of the phenotypic variation.
Validation Experiments
To validate the candidate marker associated with SSR resistance on Gm15, two
populations of F4:5 lines derived from parents contrasting for the peak marker on Gm15
were identified. The phenotypic contrast between parents of population 1 (PR918827 x
PR935401) was large (96.4 mm). In contrast, the parents of population 2 (PR918827 x
Toma) were both considered partially resistant and the phenotypic contrast between
them was small (15.8 mm). For each population, an equal number of F4:6 lines
homozygous either for the resistance or the susceptibility allele (24 each) were
evaluated for SSR resistance under greenhouse conditions (Fig. 7). With the sole
exception of OAC Bayfield that showed much smaller lesions than expected in the
Population 1 trial, all other checks developed lesions of the expected length in both trials.
In Population 1, among the group of lines homozygous for the resistance allele, lesion
length averaged 70.6 mm, whereas it averaged 82.9 mm among those homozygous for
the susceptibility allele. The contrast between the two groups of lines was not significant
(p = 0.170). In population 2, among the group of lines homozygous for the resistance
allele, lesion length averaged 38.6 mm, whereas it was 56.2 mm among those
homozygous for the susceptibility allele. The contrast between the two groups of lines
was significant (p = 0.027).
Genomic Landscape Near the Peak SNP on Gm15
To have a broad view of the genomic landscape in the vicinity of the peak SNP on
Gm15, we examined the interval defined by SNP markers having a q-value < 0.2. Six
SNPs have a q-value under this threshold, defining an interval that spans 590 kb
between positions 13,339,206 and 13,929,317. Annotation using the Blast2GO software
(Conesa et al., 2005) revealed that this region harbors 28 predicted genes, including
twelve that have a predicted function related to disease resistance (Fig. 8).
Discussion
Number of Markers
In this work we performed GWAM with 7,864 SNPs. This is considerably more than
previous work in soybean that employed between 55 and 186 microsatellite markers
(Hou et al., 2011; Li et al., 2011; Korir et al., 2013; Niu et al., 2013; Zuo et al., 2013). It is
also a significant increase compared to three GWAM studies where genotyping was
conducted with the GoldenGate Assay. One used 858 and 868 SNPs in two different AM
populations (Mamidi et al., 2011), while the other two studies were conducted with 1,142
SNPs (Hao et al., 2012a; b).
Linkage Disequilibrium
Assessing the decay of LD in an association mapping panel provides an estimate of the
number of markers required to detect QTLs. The method most widely used to describe
LD decay consists of regressing r2 against physical or genetic distance and
finding the intersection of the regression with a set threshold. In this work, we found that
applying this approach to physical distance did not provide an accurate portrayal of LD,
as it varies considerably across the genome (Comadran et al., 2009; Lee et al., 2013).
We devised a criterion, based on the mean level of LD in sliding windows, to define
borders between the telomeric (low LD) and the pericentromeric (high LD) regions of the
genome. This is similar to the annotation found in the soybean genome browser in
SoyBase in which pericentromeric regions are defined as ones having “near-zero rates
of recombination”, except that the zones defined here relate directly to the situation
encountered in this collection of lines. When we compared the pericentromeric regions
defined in this work with those in SoyBase, on average, there was a ~5% difference in
the portion of the chromosome that was labeled pericentromeric.
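As a rough illustration of a sliding-window criterion of this kind, the Python sketch below labels windows along one chromosome according to their mean r2. The function name classify_windows, the window size, the step and the cutoff are all assumptions made for illustration; the parameters actually used in this work are not restated in this section.

```python
from statistics import mean

def classify_windows(positions, r2_to_next, window_bp, step_bp, cutoff):
    """Label sliding windows by mean LD (hypothetical helper, not the authors' code).

    positions  : sorted SNP positions (bp) on one chromosome
    r2_to_next : r2 between each SNP and the next one (one value fewer than positions)
    window_bp, step_bp, cutoff : illustrative parameters, not the values used in the study
    """
    labels = []
    start = positions[0]
    last = positions[-1]
    while start <= last:
        end = start + window_bp
        # zip() stops at the shorter list, so the last SNP (no "next" SNP) is skipped
        vals = [r2 for pos, r2 in zip(positions, r2_to_next) if start <= pos < end]
        if vals:
            label = "pericentromeric" if mean(vals) >= cutoff else "telomeric"
            labels.append((start, end, label))
        start += step_bp
    return labels

# Toy values only, chosen to show the mechanics of the classification
toy = classify_windows(
    positions=[0, 100, 200, 300, 400, 500],
    r2_to_next=[0.1, 0.1, 0.9, 0.9, 0.1],
    window_bp=200, step_bp=200, cutoff=0.5,
)
```

Borders between regions would then be placed where consecutive windows change label; in practice overlapping windows (step smaller than window) would give smoother borders.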
As expected, LD decay varied considerably between the two regions, falling below
r2=0.2 after only 500 kb in the telomeric regions and extending up to 8.5 Mb in the
pericentromeric regions. The genome-wide figure (0.85 Mb) thus provides a relatively
poor description of what are two very distinct situations. Nonetheless, as previous
studies have used such global figures of LD decay, we will use this number to perform
comparisons on a similar basis. In the first work conducted on the question, 74
sequence tagged sites were used to study LD decay in three chromosomal regions
(Hyten et al., 2007). Among elite cultivars, LD reached r2=0.1 after 574 kb in one region
but never reached this threshold in the two other regions spanning 513 and 336 kb. In
another study conducted by Mamidi et al. (2011), it was reported that r2 fell below 0.1 at
7.0 Mb in one collection of lines using 858 SNPs and extended to 5.9 Mb in another
collection of lines using 868 SNPs. These two AM panels were respectively composed
of 141 or 143 advanced breeding lines adapted to the north central states of the United
States. Here, the AM panel comprised 130 cultivars and advanced breeding lines
representing the extent of diversity present within a single private breeding program.
Over all loci, r2 dropped below 0.1 at ~2.8 Mb. This less extensive LD is likely a
reflection of the broader scope of our panel, as it comprised genetically modified,
conventional and food-type soybeans belonging to maturity groups 000 to II. Lastly,
using 1,142 SNPs, Hao et al. (2012b) found that r2 reached 0.1 at only 500 kb in an AM
panel composed of 191 landraces from different geographic origins and with phenotypic
variations. It is not surprising to find such low levels of LD in a very diverse collection of
lines.
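The decay distances compared above come from fitting r2 against distance and reading off where the fitted curve crosses a threshold (0.2 or 0.1). A minimal Python sketch of that calculation follows, assuming a logarithmic regression of the form r2 = a + b ln(d), as used for Fig. 4. The function names are hypothetical, and the same fit would be run separately for telomeric and pericentromeric marker pairs.

```python
import math

def fit_log_decay(distances, r2_values):
    """Ordinary least-squares fit of r2 = a + b * ln(distance); returns (a, b)."""
    xs = [math.log(d) for d in distances]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(r2_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, r2_values))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def predicted_r2(a, b, distance):
    """r2 predicted by the fitted curve at a given distance."""
    return a + b * math.log(distance)

def decay_distance(a, b, threshold):
    """Distance at which the fitted curve crosses the chosen r2 threshold."""
    return math.exp((threshold - a) / b)
```

Given pairwise distances and r2 values for one region, decay_distance(a, b, 0.2) gives the distance at which the fitted r2 falls to 0.2, and predicted_r2 gives the expected r2 at any chosen marker spacing. Distances can be in any unit as long as it is used consistently.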
Although the number of SNP markers used in this study is much higher than in
previously published work, is this coverage sufficient to successfully detect QTLs?
Based on the study of LD in three small euchromatic chromosomal regions (336 to 574
kb), Hyten et al. (2007) estimated that the number of tag SNPs required to obtain
sufficient SNP coverage at r2=0.8 in elite material ranged between 9,600 and 29,400
markers. In this work, close to 8,000 SNP markers covering the entire genome were
available to examine this question on a larger scale. If we assume that the most
challenging location for detection of a QTL is midway between two flanking markers, we
need to estimate the degree of correlation between these flanking markers and a
hypothetical QTL midway between them. To do this, we estimated the median r2 value
between markers located at half the median distance between marker pairs in our
dataset. For the telomeric regions, the median distance between marker pairs was 26.7
kb and thus, for loci situated 13.3 kb apart, the logarithmic regression shown in Fig. 4
provides an estimate of r2 = 0.60. In other words, given the number and distribution of
SNPs described in this work, we estimate that the degree of correlation between any
QTL and a flanking SNP marker is expected to be greater than 0.6 in telomeric
(euchromatic) regions. Similarly, for the pericentromeric regions (in which the median
distance between markers was found to be 65.3 kb), the median r2 value between loci at
half this distance is estimated to be 0.76. If one uses r2 = 0.8 as a threshold above which
an association analysis would have high power to detect QTLs (Hyten et al., 2007), we
find that the coverage achieved in the telomeric regions is probably sufficient to detect
many QTLs either of moderate to large effect or located in more favorable positions
relative to flanking markers, but that it is still inadequate to provide high QTL detection
power throughout a genome of this size and composition.
The pericentromeric region spans 544.7 Mb and is covered by 2,285 SNP markers. The
median distance between a SNP and a QTL at which r2 > 0.8 is < 21.1 kb (Fig. 4a). Thus,
to ensure an even coverage at such high LD levels, a SNP every 42.2 kb, or 12,900
SNPs, would be needed in the pericentromeric region. Similarly, the telomeric region
spans 400.9 Mb and is covered by 5,579 SNP markers. The median distance between a
SNP and a QTL at which r2 > 0.8 (Fig. 4b) is < 3.6 kb. Therefore, on average, SNPs
must be present every 7.2 kb, for a total of 55,700 SNPs over the region. For the entire
genome, a total of 68,600 well-distributed SNPs would thus be needed. However, it is
important to underline that this number is based on the worst-case scenario of a
hypothetical QTL midway between two SNPs. This study has demonstrated that GWAM
can detect QTLs with far fewer SNPs.
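The arithmetic behind these estimates can be reproduced with a few lines of Python. The region sizes and SNP-to-QTL distances are taken from the text above; the function name snps_needed is only an illustrative label, and the calculation assumes evenly spaced markers.

```python
def snps_needed(region_mb, max_snp_to_qtl_kb):
    """Number of evenly spaced SNPs so that no point lies farther than
    max_snp_to_qtl_kb from the nearest SNP (spacing = twice that distance)."""
    spacing_kb = 2 * max_snp_to_qtl_kb
    return region_mb * 1000 / spacing_kb

# Values quoted in the text: region size (Mb) and distance (kb) at which r2 > 0.8
pericentromeric = snps_needed(544.7, 21.1)   # one SNP every 42.2 kb
telomeric = snps_needed(400.9, 3.6)          # one SNP every 7.2 kb
total = pericentromeric + telomeric

# Rounded to the nearest hundred, as in the text
print(round(pericentromeric, -2), round(telomeric, -2), round(total, -2))
```

Rounded to the nearest hundred, these evaluate to roughly 12,900, 55,700 and 68,600 SNPs, matching the figures given above.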
Validation of the QTL on Gm15
Although a number of studies have reported QTLs detected via association analyses in
soybean, validation of such candidate QTLs is rare. Here, we assessed the phenotypic
contrast between lines segregating for the peak SNP marker associated with SSR
resistance on Gm15. In the two segregating populations, the magnitude of the difference
in lesion length between the two genotypic classes was similar (12.3 mm in population 1
and 17.6 mm in population 2) and consistent with the allelic effect estimated in the
association analysis (15.1 mm, Table 2). In population 1, this contrast was not
statistically significant (p = 0.170). A plausible explanation for this is the presence of
other segregating QTLs in this cross (R x S; 96.4 mm difference in lesion length) that
could have blurred the allelic contrast at a single locus. In contrast, both parents of
population 2 were partially resistant to SSR (15.8 mm difference in lesion length) and
the observed contrast between genotypic classes (17.6 mm) was significant (p = 0.027)
and fully accounts for the contrast between the parents of this cross. To our
knowledge this is the first report of validation of QTLs detected for SSR resistance in
soybean.
Magnitude of SSR Resistance QTLs and Relevance to Breeding
The quantitative nature of SSR resistance in soybean is clearly supported in the present
study, as the four genomic regions identified each explained between 6.3 and 14.5% of
the variation. With the exception of one QTL on chromosome 6, which explained
between 18.9 and 23.6% of phenotypic variation (Huynh et al., 2010), all other reported
QTLs accounted for 16% or less of the variation in SSR resistance (Kim and Diers, 2000;
Arahana et al., 2001; Guo et al., 2008; Han et al., 2008; Vuong et al., 2008; Huynh et al., 2010;
Sebastian et al., 2010). Combined, the four candidate QTLs accounted for
38.4% of the variation in SSR resistance, leaving a large proportion of variation
unexplained. In simulations, QTL detection power was demonstrated to increase with
both heritability and population size (Bradbury et al., 2011). As our association panel
(130 lines) is at the lower end of tested population sizes (100-300 lines), we would
expect this to constitute a limitation. Finally, the R2 value is influenced by the LD
between the marker and the QTL and by the marker allele frequency. R2 is greatest
when LD is near 1 and the allele frequency is 0.5. Both conditions are likely to be fulfilled
in a bi-parental population but neither is likely to be the case in an AM population.
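The dependence of R2 on LD and allele frequency can be made concrete with a small Python sketch. It rests on assumptions that are not stated in the text: fully inbred lines (two homozygous classes), a purely additive QTL with effect a (half the difference between the two homozygotes), and a marker that captures a fraction r2 of the QTL variance. The function name marker_r2 is hypothetical.

```python
def marker_r2(ld_r2, allele_freq, effect, phenotypic_var):
    """Expected share of phenotypic variance explained by a marker.

    Assumes inbred lines and an additive QTL (illustrative model only):
      QTL variance    = 4 * p * (1 - p) * a**2
      marker variance = r2(marker, QTL) * QTL variance
    """
    qtl_var = 4 * allele_freq * (1 - allele_freq) * effect ** 2
    return ld_r2 * qtl_var / phenotypic_var

# Toy numbers (not from the study): same QTL effect, different LD and allele frequency
best_case = marker_r2(ld_r2=1.0, allele_freq=0.5, effect=1.0, phenotypic_var=4.0)
weaker_ld = marker_r2(ld_r2=0.5, allele_freq=0.5, effect=1.0, phenotypic_var=4.0)
rare_allele = marker_r2(ld_r2=1.0, allele_freq=0.1, effect=1.0, phenotypic_var=4.0)
```

Under this model the term p(1 - p) peaks at p = 0.5 and the marker R2 scales linearly with r2, which is why a biparental population (p near 0.5, long-range LD) tends to give larger R2 values than an AM panel for the same QTL.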
Although the number of markers that were used here is greater than in most AM studies
published in soybean to date, several thousand more SNP markers would be
required to ensure complete genome coverage. The impact of marker coverage was
examined in a recent work where the genomes of 226 accessions of the model legume
Medicago truncatula were sequenced, generating over 6 million SNPs that were used
to perform GWAM for several traits (Stanton-Geddes et al., 2013). In parallel, GWAM for
these traits was conducted with an in silico 250K SNP array. The comparison of AM
results revealed that candidates identified using the in silico arrays were often distant
from the top sequence-based candidates and highly biased towards common variants.
This implies that some QTLs for SSR resistance carried by rare variants may have gone
undetected because of insufficient marker coverage, and that the exact location of
candidate QTL will still need to be refined. However, from a breeder’s standpoint, the
mapping precision of QTLs achieved in this study is more than sufficient to allow their
exploitation in marker-assisted breeding.
Pyramiding QTLs identified in this study could potentially result in increased resistance,
as the most resistant lines among the association panel were not fixed for the resistance
alleles at all QTLs. For instance, Karlo RR, the most highly resistant line in the panel,
possesses the resistance allele for QTLs on Gm01 and Gm19 but the susceptibility
allele at QTLs on Gm15 and Gm20. Thus, through the crossing of complementary lines
and the use of marker-assisted selection, the identification of breeding lines combining
increased levels of resistance to SSR should be possible. However, the simultaneous
handling of many small-effect QTLs for several characters within a breeding program via
phenotypic selection could prove challenging. The development of genomic selection
schemes could provide an alternative way to achieve this goal (Poland et al., 2012).
Comparison with Previously Reported QTLs for SSR Resistance
Some of the candidate QTLs found in this work are located on chromosomes where
other SSR resistance QTLs have been found. The three QTLs previously reported on
Gm19 (Arahana et al., 2001; Han et al., 2008; Sebastian et al., 2010), however, map to
regions distinct from that reported here. QTLs on chromosomes 1 and 20 were also
reported (Arahana et al., 2001; Li et al., 2010), but again mapping to intervals distinct
from this study. Three QTLs for SSR resistance were described on chromosome 15. The
first encompasses a 44.7 Mb interval that includes our significant peak (Guo et al., 2008).
The second is located between microsatellite satt231 (position 50.5 Mb) and a restriction
fragment length polymorphism 18 cM upstream (Han et al., 2008). Both intervals are too
large to state if they detected the same QTL as ours. The third QTL was found near a
RAPD marker (OP_m12b), which is located between Satt720 (position 4.1 Mb) and
BARC-014271-01299 (position 14.0 Mb) (Arahana et al., 2001). This interval includes
the significant peak detected here. The resistance allele came from the susceptible
parent Williams 82 and the susceptibility allele from the resistant parent S19-90. In our
work the resistance allele is carried by S19-90, which suggests that the two QTLs are
distinct. Overall, this is the first report of association with SSR resistance in soybean in
these regions of chromosomes 1, 19 and 20.
Genomic Landscape Near the Association Peak on Gm15
Markers having a q-value < 0.2 on Gm15 define a 590 kb zone between positions
13,339,206 and 13,929,317 where twelve of the 28 predicted genes (42.9%) have a
predicted function related to disease resistance (Fig. 8). Within this zone there is a 260
kb gap between the peak SNP at position 13,651,235 and another SNP at position
13,911,191 (Fig. 9). This gap could have been caused by the presence of eight
consecutive predicted serine/threonine protein kinase genes spanning 83.5 kb. In
soybean, gene-rich regions that harbor clustered multigene families such as nucleotide-
binding and receptor-like protein classes have been shown to be the most enriched for
structural variation (McHale et al., 2012). Furthermore, evidence of SNP associations in
or adjacent to serine/threonine protein kinase genes with quantitative resistance
against fungal pathogens has been presented in maize (Kump et al., 2011; Poland et al.,
2011; Wang et al., 2012b). In soybean, genes encoding serine/threonine protein kinases
were shown to be involved in the reaction to infection with soybean rust caused by
Phakopsora pachyrhizi (Tremblay et al., 2011) and were suggested to be associated
with the higher level of partial resistance to Phytophthora sojae (Wang et al., 2012a).
Two other predicted proteins contain leucine-rich repeat (LRR) domains, a class of
genes involved in partial resistance to fungal pathogens in maize (Kump et al., 2011;
Poland et al., 2011). The recognition ability of polygalacturonase-inhibiting proteins,
extracellular plant proteins capable of inhibiting fungal endopolygalacturonases, resides
in their LRR structure, where solvent-exposed residues in the β-strand/β-turn motifs of
the LRRs are determinants of specificity (De Lorenzo et al., 2001). In the model legume
Medicago truncatula, LRR genes are significantly overrepresented in
high-recombination regions (Paape et al., 2012). Here the genes coding for the two putative LRR
proteins flank the eight consecutive predicted serine/threonine protein kinase genes. In
soybean, LRR-containing genes tend to co-localize with disease resistance QTL (Hayes
et al., 2004; Valdes-Lopez et al., 2011; Kang et al., 2012), notably in a QTL linked to
partial resistance to Phytophthora sojae (Jeong et al., 2001; Wang et al., 2010; Wang et
al., 2012a). Finally, two predicted genes contain ankyrin repeat domains, a feature of a
large family whose members are involved in a number of physiological and
developmental functions that include responses to biotic and abiotic stresses (Cao et al.,
1997; Yan et al., 2002; Yang et al., 2012). Ankyrin-containing proteins have been
reported to be associated with quantitative resistance to various pathogens in maize
(Kump et al., 2011) and rice (Mou et al., 2013). The overexpression of the rice gene
OsBIANK1 in Arabidopsis plants was shown to increase disease resistance to Botrytis
cinerea, a pathogen closely related to S. sclerotiorum (Li et al., 2013). Taken together,
these lines of evidence strongly support the presence of a disease resistance gene cluster
near the association peak on Gm15.
Conclusions
The dense marker coverage obtained through GBS enabled us to more finely describe
LD over the entire genome and to conduct GWAM at an unprecedented resolution in
soybean. We have found that LD persists over much longer distances in the
pericentromeric region than the telomeric region and that this information must be
considered when assessing marker coverage adequacy. We identified four regions
associated with SSR resistance and one of these was validated in a bi-parental
population. These QTLs can potentially be pyramided in new soybean cultivars to
achieve improved SSR resistance through marker-assisted or genomic selection.
Acknowledgments
We thank Semences Prograin Inc. and the Natural Sciences and Engineering Research
Council of Canada for providing the financial support for this research. We would like to
extend our gratitude to Denis Marois and Martin Lacroix for their technical assistance.
References
Arahana, V.S., G.L. Graef, J.E. Specht, J.R. Steadman and K.M. Eskridge. 2001.
Identification of QTLs for resistance to Sclerotinia sclerotiorum in soybean. Crop Sci.
41:180-188. doi:10.2135/cropsci2001.411180x
Atwell, S., Y.S. Huang, B.J. Vilhjalmsson, G. Willems, M. Horton, et al. 2010. Genome-
wide association study of 107 phenotypes in Arabidopsis thaliana inbred lines. Nature
465(7298):627-631. doi:10.1038/nature08800
Baird, N.A., P.D. Etter, T.S. Atwood, M.C. Currey, A.L. Shiver, et al. 2008. Rapid SNP
discovery and genetic mapping using sequenced RAD markers. PLoS One 3(10):e3376.
doi:10.1371/journal.pone.0003376
Bastien, M., T.T. Huynh, G. Giroux, E. Iquira, S. Rioux, and F. Belzile. 2012. A
reproducible assay for measuring partial resistance to Sclerotinia sclerotiorum in
soybean. Can. J. Plant Sci. 92:279-288. doi:10.4141/CJPS2011-101
Boland, G.J., and R. Hall. 1987. Evaluating soybean cultivars for resistance to
Sclerotinia sclerotiorum under field conditions. Plant Dis. 71:934-936.
Bradbury, P.J., Z. Zhang, D.E. Kroon, T.M. Casstevens, Y. Ramdoss, and E.S. Buckler.
2007. TASSEL: software for association mapping of complex traits in diverse samples.
Bioinformatics 23(19):2633-2635. doi:10.1093/bioinformatics/btm308
Bradbury, P., T. Parker, M. Hamblin, and J.-L. Jannink. 2011. Assessment of power and
false discovery rate in genome-wide association studies using the BarleyCAP germplasm.
Crop Sci. 51:52-59. doi:10.2135/cropsci2010.02.0064
Bus, A., J. Hecht, B. Huettel, R. Reinhardt, and B. Stich. 2012. High-throughput
polymorphism detection and genotyping in Brassica napus using next-generation RAD
sequencing. BMC Genomics 13:281. doi:10.1186/1471-2164-13-281
Cao, H., J. Glazebrook, J.D. Clarke, S. Volko, and X. Dong. 1997. The Arabidopsis
NPR1 gene that controls systemic acquired resistance encodes a novel protein
containing ankyrin repeats. Cell 88(1):57-63. doi:10.1016/S0092-8674(00)81858-9
Chen, Y., and D. Wang. 2005. Two convenient methods to evaluate soybean for
resistance to Sclerotinia sclerotiorum. Plant Dis. 89:1268-1272. doi:10.1094/PD-89-1268
Chutimanitsakun, Y., R.W. Nipper, A. Cuesta-Marcos, L. Cistué, A. Corey, et al. 2011.
Construction and application for QTL analysis of a Restriction site Associated DNA
(RAD) linkage map in barley. BMC Genomics 12(4). doi:10.1186/1471-2164-12-4
Comadran, J., W. Thomas, F. van Eeuwijk, S. Ceccarelli, S. Grando, et al. 2009.
Patterns of genetic diversity and linkage disequilibrium in a highly structured Hordeum
vulgare association-mapping population for the Mediterranean basin. Theor. Appl. Genet.
119(1):175-187. doi:10.1007/s00122-009-1027-0
Conesa, A., S. Götz, J.M. Garcia-Gomez, J. Terol, M. Talon, and M. Robles. 2005.
Blast2GO: a universal tool for annotation, visualization and analysis in functional
genomics research. Bioinformatics 21:3674-3676. doi:10.1093/bioinformatics/bti610
De Lorenzo, G., R. D’Ovidio, and F. Cervone. 2001. The role of polygalacturonase-
inhibiting proteins (PGIPs) in defense against pathogenic fungi. Annu. Rev. Phytopathol.
39:313-335. doi:10.1146/annurev.phyto.39.1.313
Deschamps, S., M. la Rota, J.P. Ratashak, P. Biddle, D. Thureen, et al. 2010. Rapid
genome-wide single nucleotide polymorphism discovery in soybean and rice via deep
resequencing of reduced representation libraries with the Illumina genome analyzer.
Plant Gen. 3(1):53-68. doi:10.3835/plantgenome2009.09.0026
Elshire, R.J., J.C. Glaubitz, Q. Sun, J.A. Poland, K. Kawamoto, et al. 2011. A robust,
simple Genotyping-by-Sequencing (GBS) approach for high diversity species. PLoS
ONE 6(5):e19379. doi:10.1371/journal.pone.0019379
Gracia-Garza, J.A., S. Neumann, T.J. Vyn, and G.J. Boland. 2002. Influence of crop
rotation and tillage on production of apothecia by Sclerotinia sclerotiorum. Can. J. Plant
Pathol. 24:137-143. doi:10.1080/07060660309506988
Grau, C.R. 1988. Sclerotinia stem rot of soybean. In: T.D. Wyllie and D.H. Scott editors,
Soybean diseases of the North Central region. APS, St. Paul, MN. p. 56-149.
Guo, X., D. Wang, S.G. Gordon, E. Helliwell, T. Smith, et al. 2008. Genetic mapping of
QTLs underlying partial resistance to Sclerotinia sclerotiorum in soybean PI 391589A
and PI 391589B. Crop Sci. 48:1129-1139. doi:10.2135/cropsci2007.04.0198
Han, F., M. Katt, W. Schuh, and D.M. Webb. 2008. QTL controlling Sclerotinia stem rot
resistance in soybean. U.S. Patent 7,250,552. Date issued: 18 September.
Hao, D., M. Chao, Z. Yin, and D. Yu. 2012a. Genome-wide association analysis
detecting significant single nucleotide polymorphisms for chlorophyll and chlorophyll
fluorescence parameters in soybean (Glycine max) landraces. Euphytica 186:919-931.
doi:10.1007/s10681-012-0697-x
Hao, D., H. Cheng, Z. Yin, S. Cui, D. Zhang, et al. 2012b. Identification of single
nucleotide polymorphisms and haplotypes associated with yield and yield components in
soybean (Glycine max) landraces across multiple environments. Theor. Appl. Genet.
124:447–458. doi:10.1007/s00122-011-1719-0
Hartman, G.L., M.E. Gardner, T. Hymowitz, and G.C. Naidoo. 2000. Evaluation of
perennial Glycine species for resistance to soybean fungal pathogens that cause
Sclerotinia stem rot and sudden death syndrome. Crop Sci. 40:545-549.
doi:10.2135/cropsci2000.402545x
Hayes, A.J., S.C. Jeong, M.A. Gore, Y.G. Yu, G.R. Buss, et al. 2004. Recombination
within a nucleotide-binding-site/leucine-rich-repeat gene cluster produces new variants
conditioning resistance to soybean mosaic virus in soybeans. Genetics 166(1):493-503.
doi:10.1534/genetics.166.1.493
Hill, W.G., and A. Robertson. 1968. Linkage disequilibrium in finite populations. Theor.
Appl. Genet. 38:226-231.
Hoffman, D.D., B.W. Diers, G.L. Hartman, C.D. Nickell, R.L. Nelson, et al. 2002.
Selected soybean plant introductions with partial resistance to Sclerotinia sclerotiorum.
Plant Dis. 86:971-980. doi:10.1094/PDIS.2002.86.9.971
Hou, J., C. Wang, X. Hong, J. Zhao, C. Xue, et al. 2011. Association analysis of
vegetable soybean quality traits with SSR markers. Plant Breed. 130(4):444-449.
doi:10.1111/j.1439-0523.2011.01852.x
Huang, X., Y. Zhao, X. Wei, C. Li, A. Wang, et al. 2012. Genome-wide association study
of flowering time and grain yield traits in a worldwide collection of rice germplasm.
Nature Genet. 44(1):32-39. doi:10.1038/ng.1018
Huynh, T.T., M. Bastien, E. Iquira, P. Turcotte, and F. Belzile. 2010. Identification of
QTLs associated with partial resistance to white mold in soybean using field-based
inoculation. Crop Sci. 50(3):969-979. doi:10.2135/cropsci2009.06.0311
Hyten, D.L., I.Y. Choi, Q.J. Song, R.C. Shoemaker, R.L. Nelson, et al. 2007. Highly
variable patterns of linkage disequilibrium in multiple soybean populations. Genetics
175:1937-1944. doi:10.1534/genetics.106.069740
Hyten, D.L., Q. Song, I.-Y. Choi, M.-S. Yoon, J.E. Specht, et al. 2008. High-throughput
genotyping with the GoldenGate assay in the complex genome of soybean. Theor. Appl.
Genet. 116(7):945-952. doi:10.1007/s00122-008-0726-2
Hyten, D.L., S.B. Cannon, Q. Song, N. Weeks, E.W. Fickus, et al. 2010a. High-
throughput SNP discovery through deep resequencing of a reduced representation
library to anchor and orient scaffolds in the soybean whole genome sequence. BMC
Genomics 11:38. doi:10.1186/1471-2164-11-38
Hyten, D.L., I.Y. Choi, Q. Song, J.E. Specht, T.E. Carter, et al. 2010b. A high density
integrated genetic linkage map of soybean and the development of a 1536 Universal
Soy Linkage Panel for quantitative trait locus mapping. Crop Sci. 50:960-968.
doi:10.2135/cropsci2009.06.0360
Jeong, S.C., A.J. Hayes, R.M. Biyashev, and M.A. Saghai Maroof. 2001. Diversity and
evolution of a non-TIR-NBS sequence family that clusters to a chromosomal “hotspot”
for disease resistance genes in soybean. Theor. Appl. Genet. 103:406-414.
doi:10.1007/s001220100567
Kang, Y.J., K.H. Kim, S. Shim, M.Y. Yoon, S. Sun, et al. 2012. Genome-wide mapping
of NBS-LRR genes and their association with disease resistance in soybean. BMC Plant
Biol. 12:139. doi:10.1186/1471-2229-12-139
Kim, H.S., C.H. Sneller, and B.W. Diers. 1999. Evaluation of soybean cultivars for
resistance to Sclerotinia stem rot in field environments. Crop Sci. 39:64-68.
doi:10.2135/cropsci1999.0011183X003900010010x
Kim, H.S., and B.W. Diers. 2000. Inheritance of partial resistance to Sclerotinia stem rot
in soybean. Crop Sci. 40:55-61. doi:10.2135/cropsci2000.40155x
Koenning, S.R., and J.A. Wrather. 2010. Suppression of soybean yield potential in the
continental United States by plant diseases from 2006 to 2009. Plant Health Progress.
doi:10.1094/PHP-2010-1122-01-RS
Korir, P.C., J. Zhang, K. Wu, T. Zhao, and J. Gai. 2013. Association mapping combined
with linkage analysis for aluminum tolerance among soybean cultivars released in
Yellow and Changjiang river valleys in China. Theor. Appl. Genet. 126:1659–1675.
doi:10.1007/s00122-013-2082-0
Kump, K.L., P.J. Bradbury, R.J. Wisser, E.S. Buckler, A.R. Belcher, et al. 2011.
Genome-wide association study of quantitative resistance to southern leaf blight in the
maize nested association mapping population. Nature Genet. 43(2):163-169.
doi:10.1038/ng.747
Kurle, J.E., C.R. Grau, E.S. Oplinger, and A. Mengistu. 2001. Tillage, crop sequence,
and cultivar effects on Sclerotinia stem rot incidence and yield in soybean. Agron. J.
93:973-982. doi:10.2134/agronj2001.935973x
Lee, W.K., N. Kim, J. Kim, J.-K. Moon, N. Jeong, et al. 2013. Dynamic genetic features of
chromosomes revealed by comparison of soybean genetic and sequence-based
physical maps. Theor. Appl. Genet. 126:1103-1119. doi:10.1007/s00122-012-2039-8
Li, D., M. Sun, Y. Han, W. Teng, and W. Li. 2010. Identification of QTL underlying
soluble pigment content in soybean stems related to resistance to soybean white mold
(Sclerotinia sclerotiorum). Euphytica 172(1):49-57. doi:10.1007/s10681-009-0036-z
Li, D., F. Wang, B. Liu, Y. Zhang, L. Huang, et al. 2013. Ectopic expression of rice
OsBIANK1, encoding an ankyrin repeat-containing protein, in Arabidopsis confers
enhanced disease resistance to Botrytis cinerea and Pseudomonas syringae. J.
Phytopathol. 161:27-34. doi:10.1111/jph.12023
Li, Y.-H., M.J.M. Smulders, R.-Z. Chang, and L.-J. Qiu. 2011. Genetic diversity and
association mapping in a collection of selected Chinese soybean accessions based on
SSR marker analysis. Conserv. Genet. 12:1145-1157. doi:10.1007/s10592-011-0216-y
Mamidi, S., S. Chikara, R.J. Goos, D.L. Hyten, D. Annam, et al. 2011. Genome-wide
association analysis identifies candidate genes associated with iron deficiency chlorosis
in soybean. Plant Gen. 4(3):154-164. doi:10.3835/plantgenome2011.04.0011
McHale, L.K., W.J. Haun, W.W. Xu, P.B. Bhaskar, J.E. Anderson, et al. 2012. Structural
variants in the soybean genome localize to clusters of biotic stress-response genes.
Plant Physiol. 159(4):1295-1308. doi:10.1104/pp.112.194605
Mila, A.L., and X.B. Yang. 2008. Effects of fluctuating soil temperature and water
potential on sclerotia germination and apothecial production of Sclerotinia sclerotiorum.
Plant Dis. 92:78-82. doi:10.1094/PDIS-92-1-0078
Miller, M.R., J.P. Dunham, A. Amores, W.A. Cresko, and E.A. Johnson. 2007. Rapid and
cost-effective polymorphism identification and genotyping using restriction site
associated DNA (RAD) markers. Genome Res. 17(2):240-248. doi:10.1101/gr.5681207
Mou, S., Z. Liu, D. Guan, A. Qiu, Y. Lai, and S. He. 2013. Functional analysis and
expressional characterization of rice ankyrin repeat-containing protein, OsPIANK1, in
basal defense against Magnaporthe oryzae attack. PLoS One 8(3):e59699.
doi:10.1371/journal.pone.0059699
Mueller, D.S., A.E. Dorrance, R.C. Derksen, E. Ozkan, J.E. Kurle, et al. 2002. Efficacy of
fungicides on Sclerotinia sclerotiorum and their potential for control of Sclerotinia stem
rot on soybean. Plant Dis. 86:26-31. doi:10.1094/PDIS.2002.86.1.26
Mueller, D.S., C.A. Bradley, C.R. Grau, J.M. Gaska, J.E. Kurle, and W.L. Pedersen.
2004. Application of thiophanate-methyl at different host growth stages for management
of Sclerotinia stem rot in soybean. Crop Prot. 23:983-988.
doi:10.1016/j.cropro.2004.02.013
Niu, Y., Y. Xu, X.F. Liu, S.X. Yang, S.P. Wei, et al. 2013. Association mapping for seed
size and shape traits in soybean cultivars. Mol. Breeding 31:785–794.
doi:10.1007/s11032-012-9833-5
Paape, T., P. Zhou, A. Branca, R. Briskine, N. Young, and P. Tiffin. 2012. Fine-scale
population recombination rates, hotspots, and correlates of recombination in the
Medicago truncatula genome. Genome Biol. Evol. 4(5):726–737.
doi:10.1093/gbe/evs046
Phillips, A.J.L. 1994. Influence of fluctuating temperatures and interrupted periods of
plant surface wetness on infection of bean leaves by ascospores of Sclerotinia
sclerotiorum. Ann. Appl. Biol. 124:413-427.
Poland, J.A., P.J. Bradbury, E.S. Buckler, and R.J. Nelson. 2011. Genome-wide nested
association mapping of quantitative resistance to northern leaf blight in maize. P. Natl.
Acad. Sci. USA 108(17):6893-6898. doi:10.1073/pnas.1010894108
Poland, J., J. Endelman, J. Dawson, J. Rutkoski, S. Wu, et al. 2012. Genomic selection
in wheat breeding using genotyping-by-sequencing. Plant Gen. 5(3):103-113.
doi:10.3835/plantgenome2012.06.0006
Rousseau, G., T.T. Huynh, D. Dostaler, and S. Rioux. 2004. Greenhouse and field
assessments of resistance in soybean inoculated with sclerotia, mycelium, and
ascospores of Sclerotinia sclerotiorum. Can. J. Plant Sci. 84:615-623. doi:10.4141/P03-
003
Rousseau, G.X., D. Dostaler, and S. Rioux. 2007. Effect of crop rotation and soil
amendments on Sclerotinia stem rot on soybean in two soils. Can. J. Plant Sci.
87:605-614. doi:10.4141/P05-137
Scheet, P., and M. Stephens. 2006. A fast and flexible statistical model for large-scale
population genotype data: Applications to inferring missing genotypes and haplotypic
phase. Am. J. Hum. Genet. 78:629-644. doi:0002-9297/2006/7804-0010
Schmutz, J., S.B. Cannon, J. Schlueter, J. Ma, T. Mitros, et al. 2010. Genome sequence
of the palaeopolyploid soybean. Nature 463(7278):178-183. doi:10.1038/nature08670
Sebastian, S.A., H. Lu, F. Han, D. Kyle, B.R. Hedges, et al. 2010. Genetic loci
associated with Sclerotinia tolerance in soybean. U.S. Patent 7,790,949 B2. Date issued:
7 September.
Sonah, H., M. Bastien, E. Iquira, A. Tardivel, G. Légaré, et al. 2013. An improved
genotyping by sequencing (GBS) approach offering increased versatility and efficiency
of SNP discovery and genotyping. PLoS ONE 8(1):e54603.
doi:10.1371/journal.pone.0054603
Stanton-Geddes, J., T. Paape, B. Epstein, R. Briskine, J. Yoder, et al. 2013. Candidate
genes and genetic architecture of symbiotic and agronomic traits revealed by whole-
genome, sequence-based association genetics in Medicago truncatula. PLoS ONE
8(5):e65688. doi:10.1371/journal.pone.0065688
Storey, J.D., and R. Tibshirani. 2003. Statistical significance for genomewide studies. P.
Natl. Acad. Sci. USA 100(16):9440-9445. doi:10.1073/pnas.1530509100
Sun, X., D. Liu, X. Zhang, W. Li, H. Liu, et al. 2013. SLAF-seq: An efficient method of
large-scale de novo SNP discovery and genotyping using high-throughput sequencing.
PLoS ONE 8(3):e58700. doi:10.1371/journal.pone.0058700
Sutton, J.C., and G. Peng. 1993. Manipulation and vectoring of biocontrol organisms to
manage foliage and fruit disease in cropping systems. Annu. Rev. Phytopathol.
31:473-493.
Tremblay, A., P. Hosseini, N.W. Alkharouf, S. Li, and B.F. Matthews. 2011. Gene
expression in leaves of susceptible Glycine max during infection with Phakopsora
pachyrhizi using next generation sequencing. Sequencing 2011.
doi:10.1155/2011/827250
Valdes-Lopez, O., S. Thibivilliers, J. Qiu, W.W. Xu, T.H.N. Nguyen, et al. 2011.
Identification of quantitative trait loci controlling gene expression during the innate
immunity response of soybean. Plant Physiol. 157(4):1975-1986.
doi:10.1104/pp.111.183327
Varala, K., K. Swaminathan, Y. Li, and M.E. Hudson. 2011. Rapid genotyping of
soybean cultivars using high throughput sequencing. PLoS ONE 6(9):e24811.
doi:10.1371/journal.pone.0024811
Vuong, T.D., B.W. Diers, and G.L. Hartman. 2008. Identification of QTL for resistance to
Sclerotinia stem rot in soybean plant introduction 194639. Crop Sci. 48(6):2209-2214.
doi:10.2135/cropsci2008.01.0019
Wang, H., L. Waller, S. Tripathy, S.K. St. Martin, L. Zhou, et al. 2010. Analysis of genes
underlying soybean quantitative trait loci conferring partial resistance to Phytophthora
sojae. Plant Gen. 3(1):23-40. doi:10.3835/plantgenome2009.12.0029
Wang, H., A. Wijeratne, S. Wijeratne, S. Lee, C.G. Taylor, et al. 2012a. Dissection of two
soybean QTL conferring partial resistance to Phytophthora sojae through sequence and
gene expression analysis. BMC Genomics 13:428. doi:10.1186/1471-2164-13-428
Wang, M., J. Yan, J. Zhao, X. Zhang, and Y. Xiao. 2012b. Genome-wide association
study (GWAS) of resistance to head smut in maize. Plant Sci. 196:125-131.
doi:10.1016/j.plantsci.2012.08.004
Workneh, F., and X.B. Yang. 2000. Prevalence of Sclerotinia stem rot of soybeans in the
north-central United States in relation to tillage, climate, and latitudinal positions.
Phytopathology 90:1375-1382. doi:10.1094/PHYTO.2000.90.12.1375
Wu, X., C. Ren, T. Joshi, T. Vuong, D. Xu, and H. Nguyen. 2010. SNP discovery by
high-throughput sequencing in soybean. BMC Genomics 11(1):469.
doi:10.1186/1471-2164-11-469
Yan, J., J. Wang, and H. Zhang. 2002. An ankyrin repeat-containing protein plays a role
in both disease resistance and antioxidation metabolism. Plant J. 29(2):193–202.
doi:10.1046/j.0960-7412.2001.01205.x
Yang, Y., Y. Zhang, P. Ding, K. Johnson, X. Li, and Y. Zhang. 2012. The ankyrin-repeat
transmembrane protein BDA1 functions downstream of the receptor-like protein SNC2
to regulate plant immunity. Plant Physiol. 159(4):1857–1865.
doi:10.1104/pp.112.197152
Zeng, W., W. Kirk, and J. Hao. 2012. Field management of Sclerotinia stem rot of
soybean using biological control agents. Biol. Control 60:141–147.
doi:10.1016/j.biocontrol.2011.09.012
Zhang, Z., E. Ersoz, C.-H. Lai, R.J. Todhunter, H.K. Tiwari, et al. 2010. Mixed linear
model approach adapted for genome-wide association studies. Nat. Genet. 42(4):355-
360. doi:10.1038/ng.546
Zuo, Q., J. Hou, B. Zhou, Z. Wen, S. Zhang, et al. 2013. Identification of QTLs for
growth period traits in soybean using association analysis and linkage mapping. Plant
Breed. 132(3):317-323. doi:10.1111/pbr.12060
Figures
Fig. 1. Distribution of disease susceptibility among 130 soybean lines.
Fig. 2. Distribution of SNPs according to their minor allele frequency (MAF). SNPs with
MAF < 5% were excluded from the analysis.
Fig. 3. Intrachromosomal linkage disequilibrium decay over physical distance. (a) Whole
chromosome; (b) Telomeric regions; (c) Pericentromeric regions.
Fig. 4. Plot of median r2 over physical distance. (a) Pericentromeric regions;
(b) Telomeric regions.
Fig. 5. Cumulative distribution of p-values from genome-wide association tests using
four statistical models. Under a model that accounts for all background effects, the
cumulative distribution of p-values is expected to equal the observed p-values (the
expected distribution).
Fig. 6. Genome-wide association scan for Sclerotinia stem rot (SSR) resistance. The
–log10(P) values from the genome-wide scan are plotted against the SNP positions on the
physical map of each chromosome. The significance threshold (q = 0.1) is indicated by
the horizontal line.
Fig. 7. Lesion length among classes of recombinant inbred lines contrasted for the SNP
on Gm15 in two validation populations.
Fig. 8. Predicted genes in the vicinity of the peak SNP at position 13,651,235 on Gm15.
Genes involved in disease resistance are colored in yellow, other genes are colored in
purple.
Fig. 9. Heat map of r2 values between SNP markers located in an interval containing
SNP markers associated with Sclerotinia stem rot resistance (at q < 0.2) on Gm15.
Supplementary Fig. 1. Average r2 values over physical distance for the 20 soybean
chromosomes. The arrows show the approximate extent of the pericentromeric region.
Supplementary Fig. 2. Projection of the 130 genotypes on the first three
eigenvectors of the principal component analysis.
Tables
Table 1. Marker distribution among chromosomal regions.
               SNP position      Borders†      Telomeric    Pericentromeric
Chr    Total   First    Last   Left   Right    Width     Nb    Width     Nb
        SNPs    (Mb)    (Mb)   (Mb)    (Mb)     (Mb)   SNPs     (Mb)   SNPs
1        287    0.29   55.90    4.6    46.0    14.21    170     41.4    117
2        479    0.11   51.55   12.1    41.2    22.34    349     29.1    130
3        328    0.21   47.74    8.6    35.8    20.33    254     27.2     74
4        421    0.01   49.08   12.4    41.2    20.27    312     28.8    109
5        357    0.02   41.92    9.4    26.9     24.4    351     17.5      6
6        420    0.05   50.49    8.4    43.2    15.64    184     34.8    236
7        324    0.11   44.54   11.4    34.8    21.03    227     23.4     97
8        358    0.66   46.90   15.3    42.1    19.44    215     26.8    143
9        418    0.30   46.83    3.3    32.9    16.93    242     29.6    176
10       419    0.13   50.95    6.6    36.6    20.82    250     30.0    169
11       173    0.25   39.10   10.9    35.9    13.85    124     25.0     49
12       248    0.02   40.10    7.8    33.9    13.98    203     26.1     45
13       490    0.01   44.34    5.1    23.9    25.53    335     18.8    155
14       392    0.06   49.68   11.1    45.9    14.82    260     34.8    132
15       516    0.14   50.89   14.2    39.3    25.65    348     25.1    168
16       452    0.05   37.13    8.4    22.8    22.68    364     14.4     88
17       355    0.35   41.84   15.2    31.6    25.09    301     16.4     54
18       626    0.01   62.19   13.5    50.9    24.78    477     37.4    149
19       435    0.03   50.56    5.9    34.9    21.53    292     29.0    143
20       366    0.11   46.75    3.6    32.7    17.54    321     29.1     45
Total  7,864                                   400.9  5,579    544.7  2,285
† Borders between the pericentromeric and telomeric regions
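The region widths in Table 1 appear to follow directly from the border positions and the first and last SNP positions. The short Python sketch below illustrates that relationship for three rows copied from the table. The formulas are inferred from the tabulated values rather than stated by the authors, and the names `region_widths` and `TABLE1_ROWS` are hypothetical.

```python
def region_widths(first_snp, last_snp, left_border, right_border):
    """Return (telomeric_width, pericentromeric_width) in Mb.

    Assumption (inferred from the table, not stated in the text):
    the telomeric width is the sum of the two chromosome arms' ends
    covered by SNPs, and the pericentromeric width is the distance
    between the two borders.
    """
    telomeric = (left_border - first_snp) + (last_snp - right_border)
    pericentromeric = right_border - left_border
    return round(telomeric, 2), round(pericentromeric, 1)


# Rows copied from Table 1: chromosome -> (first, last, left, right), in Mb.
TABLE1_ROWS = {
    1: (0.29, 55.90, 4.6, 46.0),
    5: (0.02, 41.92, 9.4, 26.9),
    15: (0.14, 50.89, 14.2, 39.3),
}

widths = {chrom: region_widths(*row) for chrom, row in TABLE1_ROWS.items()}

for chrom, (telo, peri) in widths.items():
    print(f"Gm{chrom:02d}: telomeric {telo} Mb, pericentromeric {peri} Mb")
```

For these three rows the computed widths agree with the tabulated ones (for example, 14.21 Mb and 41.4 Mb for chromosome 1).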
Table 2. Most significant single nucleotide polymorphisms (SNPs) associated with
resistance.
Chr†  Position‡    p-value        q-value  MAF   MA effect§   R2¶    LLR#   LLS††  a‡‡
15    13,651,235   1.38 x 10^-6   0.011    0.32  Resistant    0.145  103.7  118.8  15.1
1     29,185,984   3.09 x 10^-5   0.040    0.39  Resistant    0.073  110.7  116.1   5.4
20    39,698,515   1.19 x 10^-4   0.094    0.32  Resistant    0.063  108.6  116.5   8.0
19    50,557,054   1.21 x 10^-4   0.094    0.05  Susceptible  0.072  113.4  123.3   9.9
† Chromosome number
‡ Position of peak marker on the physical map
§ Indicates whether the minor allele provides increased resistance or susceptibility
¶ Indicates the proportion of total phenotypic variation accounted for by the marker
# Mean lesion length in mm of genotypes carrying the resistance allele
†† Mean lesion length in mm of genotypes carrying the susceptibility allele
‡‡ Average change in lesion length following allele substitution
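The allele substitution effect (a) in Table 2 appears to equal the difference between the mean lesion lengths of the two allele classes (LLS minus LLR). The Python sketch below recomputes it from the rounded means in the table. This reading is an assumption based on the tabulated values, and the names `TABLE2` and `allele_effect` are hypothetical.

```python
# Rows copied from Table 2: (chromosome, position, LLR in mm, LLS in mm).
TABLE2 = [
    (15, 13_651_235, 103.7, 118.8),
    (1, 29_185_984, 110.7, 116.1),
    (20, 39_698_515, 108.6, 116.5),
    (19, 50_557_054, 113.4, 123.3),
]


def allele_effect(ll_resistant, ll_susceptible):
    """Difference in mean lesion length (mm) between allele classes.

    Assumption: a = LLS - LLR, inferred from the table values.
    """
    return round(ll_susceptible - ll_resistant, 1)


effects = {chrom: allele_effect(llr, lls) for chrom, _pos, llr, lls in TABLE2}

for chrom, effect in effects.items():
    print(f"Gm{chrom:02d}: a = {effect} mm")
```

Three of the four recomputed values match the table exactly. For chromosome 20 the rounded means give 7.9 mm against a tabulated 8.0 mm, a gap that is presumably due to the published effect being computed from unrounded means.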
Supplementary Table 1. Lesion length (mm) among the 130 soybean lines.
Genotype LL Genotype LL Genotype LL Genotype LL
Karlo RR 28.6 15 88.5 40 116.6 2 138.0
PRO 275 37.6 Tundra 90.8 70 116.6 37 139.8
S19-90 43.5 90 92.2 Azur 117.9 S04273.09 140.1
Toma 49.2 46 92.4 Williams 82 118.0 PR9031LL 140.3
93 49.5 124 93.2 122 118.8 99 140.4
21 50.8 72 93.5 Accent 119.5 Oria 141.5
Majesta 56.9 S04273.19 93.8 12 119.6 Saska 143.8
Prius RR 57.7 31 94.0 Acora 119.8 9 149.8
S04297.18 60.8 24 97.2 Victoria 120.0 Nova 150.1
Maple Arrow 61.0 110 97.3 36 120.7 109 150.8
PR918827 64.9 23 101.2 PR938626 121.0 118 151.3
64 65.8 Jutra 101.3 35 122.9 PR935413 152.0
S04280.44 65.8 27 101.5 PRO 25-53 123.0 6 152.7
45 72.0 97 101.8 102 123.3 Bixi LL 152.7
116 72.7 44 102.0 123 123.6 28 153.7
30 74.5 Venus 103.8 4 125.1 Amasa 153.8
52 75.8 80 104.6 PR939402 126.5 25 155.0
34 79.0 13 105.3 38 126.9 107 155.0
100 80.1 119 106.0 20 128.2 29 156.1
22 80.5 Korus 106.7 92 130.2 112 156.3
Maple Donovan 81.3 PR9368B25 107.5 111 130.9 1 157.8
121 82.0 5 109.0 2601R 131.2 OAC Bayfield 159.9
A x N-1-55 82.3 Delta 109.3 49 133.7 56 161.3
65 82.8 101 109.8 Lotus 133.8 PR935401 161.3
69 83.1 Kolia 110.4 95 134.2 117 163.2
17 83.3 108 112.3 62 134.6 26 169.0
61 84.5 115 112.7 67 135.0 Supra 169.5
66 84.6 77 113.8 Naya 135.5 7 173.6
PRD 419 84.9 PR9423B31 113.8 94 136.8 10 173.7
125 86.3 41 114.5 113 136.8 8 176.1
Bakara 86.7 51 114.7 19 137.7 Nattosan 176.6
39 87.4 63 115.4 98 137.8 96 192.4
33 87.9 Aquita 116.0
[Fig. 1: histogram of the number of lines (y-axis, 0 to 25) by lesion length in mm (x-axis, classes 21-36 to 181-196). Labeled lines: Karlo RR (R), S19-90 (R), Maple Donovan (R), Williams 82 (S), OAC Bayfield (S), Nattosan (S).]
[Fig. 2: bar chart of % of SNPs (y-axis, 0 to 40) by minor allele frequency class (x-axis: <0.05, 0.05-0.1, 0.1-0.2, 0.2-0.3, 0.3-0.4, 0.4-0.5).]
[Fig. 3: three panels (a, b, c), each plotting r2 (y-axis) against distance in Mb (x-axis).]
[Fig. 4: two panels (a, b), each plotting r2 (y-axis, 0 to 1) against distance in kb (x-axis, 0 to 100).]
[Fig. 5: cumulative P (y-axis, 0 to 1) against observed P (x-axis, 0 to 1). Series: Expected Distribution, Naive, PCA, K, PCA + K.]
[Fig. 6: -Log(p-value) (y-axis, 0 to 6) by chromosome (x-axis, 1 to 20).]
[Fig. 7: lesion length in mm (y-axis, 35 to 85) for lines carrying the susceptible allele and the resistant allele in Population 1 and Population 2.]
[Fig. 8: gene map around the peak SNP on Gm15. SNP positions shown: 13,339,206; 13,339,218; 13,651,235; 13,911,191; 13,911,995; 13,929,317. Gene models shown: Glyma15g17230.1, Glyma15g17240.1, Glyma15g17310.1, Glyma15g17360.1, Glyma15g17370.1, Glyma15g17390.1, Glyma15g17410.1, Glyma15g17420.1, Glyma15g17430.1, Glyma15g17450.1, Glyma15g17460.1, Glyma15g17540.1. Annotation labels shown: Ankyrin repeats; TIR domain – Leucine-rich repeats (two occurrences); Serine/threonine protein kinase.]
[Fig. 9: heat map for SNPs at positions 13,339,206; 13,339,218; 13,516,378; 13,525,925; 13,529,976; 13,651,175; 13,651,235; 13,911,191; 13,911,995; 13,929,317 on Gm15. Upper: r2 (scale 0.1 to 1). Lower: p-values (classes > 0.01, < 0.01, < 0.001, < 0.0001).]
[Supplementary Fig. 1: twenty panels, one per chromosome (Gm01 to Gm20), each plotting R2 (y-axis, 0 to 1) against physical position in Mb (x-axis).]
[Supplementary Fig. 2: two scatter plots, PC2 vs PC1 and PC3 vs PC1, with PC 1 on the x-axis and PC 2 or PC 3 on the y-axis.]