Mark E. Sorrells and Flavio Breseghello Department of Plant Breeding & Genetics Cornell University Association Mapping as a Breeding Strategy

Page 1:

Mark E. Sorrells and Flavio Breseghello

Department of Plant Breeding & Genetics

Cornell University

Association Mapping as a Breeding Strategy

Page 2: Presentation Overview

• A Genetic Model for Association Mapping in Plant Breeding Populations

• Comparison of Different Plant Breeding Materials for Association Mapping

• Association Mapping of Kernel Size and Milling Quality in Soft Winter Wheat Cultivars

Page 3: A Genetic Model for AM in Plant Breeding Populations: Association as Conditional Probabilities

[Diagram: a gene and a marker separated by recombination (c). Breeding pool: Gene={a}, Marker={m,M}. New parent: (A,M). Recombination (c) and selection on A or M (w) act over t generations.]

Haplotype frequencies after the new parent is introduced:

Pr(A,M)=φ

Pr(a,M)=θ

Pr(a,m)=1-φ-θ

Pr(A,m)=0

(Hedrick 2005)

Pr(A|M,c,t,φ,θ,w): “Probability of a plant with marker allele M to have gene allele A, t generations after the introduction of A”

Population genetics theory
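Under the simplest assumptions, the conditional probability on this slide has a closed form. The derivation below is a sketch assuming random mating, no selection (w = 1), and linkage disequilibrium that shrinks by a factor (1 − c) each generation. It is not necessarily the exact model the authors used. The symbols p_A, p_M, D_0 and D_t are notation introduced here.

```latex
\begin{align*}
p_A &= \varphi, \qquad p_M = \varphi + \theta,\\
D_0 &= \Pr(A,M) - p_A\,p_M = \varphi\,(1 - \varphi - \theta),\\
D_t &= D_0\,(1-c)^t,\\
\Pr(A \mid M, c, t, \varphi, \theta, w{=}1)
  &= \frac{p_A\,p_M + D_t}{p_M}
   = \varphi + \frac{\varphi\,(1-\varphi-\theta)}{\varphi+\theta}\,(1-c)^t .
\end{align*}
```

At t = 0 this reduces to φ/(φ+θ), the share of M-carrying haplotypes that came from the new parent, and as t grows it tends to φ, the overall frequency of A.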

Page 4: Recombination x initial frequency of M in the breeding pool

A novel marker allele at 10 cM distance can be more predictive of the QTL allele than a marker allele 1 cM away, if the closer allele was already present in the original population at a frequency of 0.05

Freq. new parent: φ=0.05

Relative fitness: w=1

Freq. M in original population: θ = 0, 0.05, 0.25

Freq. Recombination: c = 0.01, 0.05, 0.10

[Figure: Pr(A|M) (y-axis, 0.0 to 1.0) versus t generations (x-axis, 0 to 20), with curves for the listed values of c and θ; points at about 8 and about 18 generations are annotated.]
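The curves on this slide can be approximated with a few lines of Python. This is a minimal sketch assuming random mating, no selection (w = 1), and linkage disequilibrium decaying as (1 − c)^t, so that Pr(A|M) = φ + φ(1−φ−θ)(1−c)^t / (φ+θ). The function names are hypothetical and the model is an assumption, not necessarily the one used to draw the figure. Under these assumptions a pre-existing marker at c = 0.01 with θ = 0.05 overtakes a novel marker at c = 0.10 close to generation 8, and overtakes a novel marker at c = 0.05 shortly after generation 18, which appears consistent with the annotations on the plot.

```python
def pr_a_given_m(t, c, phi=0.05, theta=0.0):
    """Pr(A|M) after t generations of random mating without selection (w = 1).

    Assumes LD decays as D_t = D_0 * (1 - c)**t. Illustrative model only.
    """
    p_m = phi + theta                # frequency of marker allele M
    d0 = phi * (1.0 - phi - theta)   # initial disequilibrium: Pr(A,M) - Pr(A)Pr(M)
    return phi + d0 * (1.0 - c) ** t / p_m


def first_generation_overtaken(novel, preexisting, max_t=200):
    """First generation at which the pre-existing marker allele predicts A
    better than the novel one; None if that never happens within max_t."""
    for t in range(max_t + 1):
        if pr_a_given_m(t, **preexisting) > pr_a_given_m(t, **novel):
            return t
    return None


if __name__ == "__main__":
    novel_10cm = {"c": 0.10, "theta": 0.0}     # novel marker allele, 10 cM away
    existing_1cm = {"c": 0.01, "theta": 0.05}  # allele already in the pool, 1 cM away
    for t in (0, 5, 10, 20):
        print(t,
              round(pr_a_given_m(t, **novel_10cm), 3),
              round(pr_a_given_m(t, **existing_1cm), 3))
    print(first_generation_overtaken(novel_10cm, existing_1cm))
```

Treating 1 cM as a recombination fraction of 0.01 is a further simplification that is reasonable only for short map distances.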

Page 5: Recombination x selection for M

Freq. new parent: φ=0.05

Relative fitness: w = 4 (red), 2 (green), 1.25 (blue)

Freq. M in original pop: θ = 0

Freq. Recombination: c = 0.01, 0.05, 0.10

[Figure: Pr(A|M) and Pr(A) versus generations, one panel or curve set per value of c and w.]

• The generation at which the marker is depleted [Pr(A|M)=Pr(A)] depends on the selection intensity applied;

• The final frequency of A depends on selection and tightness of linkage between marker and gene.
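The two observations on this slide can be explored with a small simulation. The sketch below assumes a haploid (gametic) selection model in which every haplotype carrying M has relative fitness w, followed by recombination at rate c. This is a standard textbook recursion chosen for illustration, not necessarily the authors' model, and the function name is hypothetical.

```python
def simulate_selection_on_marker(w, c, phi=0.05, generations=200):
    """Return a list of (Pr(A|M), Pr(A)) per generation.

    Haplotypes: AM, Am, aM, am. Haplotypes with M have fitness w, others 1.
    Start: new parent haplotype AM at frequency phi, everything else am
    (theta = 0, as on the slide). Illustrative haploid model only.
    """
    f_AM, f_Am, f_aM, f_am = phi, 0.0, 0.0, 1.0 - phi
    history = []
    for _ in range(generations + 1):
        p_m = f_AM + f_aM
        p_a = f_AM + f_Am
        history.append((f_AM / p_m, p_a))
        # Selection: weight M-carrying haplotypes by w and renormalise.
        mean_w = w * (f_AM + f_aM) + (f_Am + f_am)
        f_AM, f_aM = w * f_AM / mean_w, w * f_aM / mean_w
        f_Am, f_am = f_Am / mean_w, f_am / mean_w
        # Recombination: disequilibrium D is eroded at rate c.
        d = f_AM * f_am - f_Am * f_aM
        f_AM, f_am = f_AM - c * d, f_am - c * d
        f_Am, f_aM = f_Am + c * d, f_aM + c * d
    return history


if __name__ == "__main__":
    for w in (4, 2, 1.25):
        for c in (0.01, 0.05, 0.10):
            pr_am, pr_a = simulate_selection_on_marker(w, c)[-1]
            print(f"w={w} c={c} final Pr(A)={pr_a:.3f}")
```

In this model Pr(A|M) and Pr(A) converge once M approaches fixation, and the final frequency of A is higher when selection is stronger or linkage is tighter, which matches the qualitative statements in the two bullets.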

Page 6: Summary

• In plant breeding populations, the locus most associated with the trait is not necessarily the closest locus;

• Loosely linked markers can still be useful for MAS if high intensity of selection is applied.

Page 7: MAS for Complex Traits: Issues

• Accurate detection and estimation of QTL effects

• Pre-existing marker alleles in a breeding population can be linked to non-target QTL alleles

• Multiple QTL alleles can have different relative values

• Gene x gene and gene x environment interactions

Page 8: Association Analysis as a Breeding Strategy

• Most association studies have focused on estimating linkage disequilibrium and fine mapping.

• Breeding programs are dynamic, complex genetic entities that require frequent evaluation of marker / phenotype relationships.

Breseghello, F., and M.E. Sorrells. 2006. Association mapping of kernel size and milling quality in wheat (Triticum aestivum L.) cultivars. Genetics 172:1165-1177.

Breseghello, F., and M.E. Sorrells. 2006. Association analysis as a strategy for improvement of quantitative traits in plants. Crop Sci. In press.

Page 9: Association Mapping versus QTL Mapping

• Association Mapping can be conducted directly on the breeding material, therefore:

  • Direct inference from data analysis to breeding is possible

  • Phenotypic variation is observed for most traits of interest

  • Marker polymorphism is higher than in biparental populations

  • Routine variety trial evaluations provide phenotypic data

• Association Mapping provides other useful information about:

  • Organization of genetic variation in relevant breeding populations

  • Novel alleles, which can be identified and their relative value assessed as often as necessary

Page 10: Association Mapping versus QTL Mapping

• Type I error (false positives) can be higher because of:

  • Unaccounted population structure

  • Simultaneous selection of combinations of alleles at different loci

  • High sampling variance of rare alleles

• Type II error can be higher (low power) because of:

  • Lower LD than in biparental mapping populations

  • Unbalanced design due to differences in allele frequencies

  • A larger multiple-testing problem because of lower LD

Page 11: Integration of Association Analysis in a Breeding Program

[Flowchart with the following nodes: Germplasm; Hybridization; New Populations; Selection (Intermating); New Synthetics, Lines, Varieties; Evaluation Trials; Elite Synthetics, Lines, Varieties; Genotypic & Phenotypic data; Novel & Validated QTL/Marker Associations; Parental Selection; Marker Assisted Selection. Annotation: Elite germplasm feeds back into hybridization nursery.]

Page 12: Types of Populations

• Germplasm Bank Collection

  • A collection of genetic resources including landraces, exotic material and wild relatives.

• Synthetic Populations

  • Outcrossing populations (either male-sterile or manually crossed) synthesized from inbred lines. May be used for recurrent selection.

• Elite Lines

  • Inbred lines (and checks) manipulated with the objective of releasing new varieties in the short term.

Page 13: Characteristics Related to Association Mapping: Practical aspects

Aspects of AM | Germplasm bank | Synthetic Populations | Elite Germplasm

Sample | Core-collection | Segregating progenies | Elite lines and checks

Sample turnover | Static | Ephemeral | Gradually substituted

Source of phenotypic data | Screenings | Progeny tests | Yield trials

Type of traits | High heritability traits; Domestication traits | Depends on the evaluation scheme | Low heritability traits: yield, resistance to abiotic stresses

Page 14: Characteristics Related to Association Mapping: Genetic Expectations

Aspects of AM | Germplasm bank | Synthetic Populations | Elite Germplasm

Linkage Disequilibrium | Low | Intermediate and fast-decaying | High

Population structure | Medium | Low | High

Allele diversity among samples | High | Intermediate | Low

Allele diversity within samples | Variable | 1 or 2 alleles (diploid species) | 1 allele (inbred lines)

Page 15: Characteristics Related to Association Mapping: Potential Applications

Aspects | Germplasm bank | Synthetic Populations | Elite Germplasm

Power | Low | Intermediate and decreasing | High; could allow genome scan

Resolution | High; could allow fine mapping | Intermediate and increasing | Low

Use of significant markers | Transfer of new alleles by marker-assisted backcross | Incorporation in selection index | Forward Breeding - MAS in progenies (requires validation)

Page 16: Previous QTL information

• Doubled-Haploid Population AC Reed x Grandin

  • QTL for kernel size (width) near Xwmc18-2D

• Recombinant Inbred Population Synthetic W7984 x Opata (ITMI population)

  • QTL for kernel size (length) on 5A and 5B

[Figure panels: Length (5B); Width (2D).]

Page 17: Association Analysis

Materials

• 95 of 149 soft winter wheat cultivars from the Northeastern US: mostly recent releases, representing 35 seed companies / institutions

• 93 SSR loci: 33 on 2D, 20 on 5A, 9 on 5B, 31 on 16 other chromosomes

• Rare alleles (freq < 5%): considered as missing for LD and population structure analysis; considered as alleles for AM analysis

Methods

• Population Structure: 36 “unlinked” SSR markers; Structure without admixture; SPAGeDi program (Hardy & Vekemans) for Kinship; Visualization: Factorial (Multiple) Correspondence Analysis (Benzecri, 1973. L'Analyse des correspondances. Dunod)

• Linkage Disequilibrium: Tassel (maizegenetics.net) used to compute r2, with p-values from 1000 permutations

• Association Analysis: R statistical package (lme) used to fit a linear mixed-effects model with markers as fixed effects (selected from previously identified QTL regions) and subpopulations or Kinship as random effects (no obvious differentiating characteristics); Two-marker models: tested by likelihood ratio test

• Jianming Yu, Gael Pressoir, et al. (2006) A Unified Mixed-Model Method for Association Mapping Accounting for Multiple Levels of Relatedness. Nature Genetics 38:203-208
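To make the linkage disequilibrium step concrete, the sketch below computes r2 between two loci and a permutation p-value. It is a simplified illustration, not Tassel's implementation: it assumes inbred lines scored as 0/1 for one allele at each locus, whereas the SSR markers in the study are multi-allelic. The function names and the example data are hypothetical.

```python
import random


def r_squared(x, y):
    """r2 between two loci given 0/1 allele indicators per inbred line."""
    n = len(x)
    px = sum(x) / n
    py = sum(y) / n
    pxy = sum(a * b for a, b in zip(x, y)) / n
    d = pxy - px * py                          # disequilibrium coefficient D
    denom = px * (1 - px) * py * (1 - py)
    return d * d / denom if denom > 0 else 0.0


def permutation_p_value(x, y, n_perm=1000, seed=0):
    """Share of shuffled data sets whose r2 is at least the observed r2."""
    rng = random.Random(seed)
    observed = r_squared(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)                  # break any association with x
        if r_squared(x, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)           # add-one rule avoids p = 0


if __name__ == "__main__":
    locus_a = [1] * 10 + [0] * 10              # hypothetical genotypes, 20 lines
    locus_b = [1] * 9 + [0] * 10 + [1]
    print(r_squared(locus_a, locus_b), permutation_p_value(locus_a, locus_b))
```

With 1000 permutations the smallest attainable p-value under this add-one convention is 1/1001.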

Page 18: Estimating Relatedness: The K Matrix

Relatedness (K): an n x n matrix with rows i and columns j and entries F11 ... Fnn, where Θij ≅ Fij

In cattle studies the analogous matrix is estimated from pedigrees, and it controls for the polygene effect

Jianming Yu, Gael Pressoir, et al. (2006) A Unified Mixed-Model Method for Association Mapping Accounting for Multiple Levels of Relatedness. Nature Genetics 38:203-208

Fij = (Qij-Qm)/(1-Qm) (Ritland, Loiselle)

If Fij is negative, then it is set to zero.
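The formula and truncation rule above translate directly into code. The sketch below assumes that Qij is the probability of identity by state between lines i and j and that Qm is the mean of Qij over all pairs of distinct lines. The slide does not define either quantity, so both readings are assumptions, and the function name and example matrix are hypothetical.

```python
def kinship_from_identity(q):
    """Build K from a symmetric matrix q of pairwise identity probabilities.

    Applies Fij = (Qij - Qm) / (1 - Qm) and sets negative values to zero.
    Qm is taken as the mean of the off-diagonal entries (an assumption).
    """
    n = len(q)
    off_diagonal = [q[i][j] for i in range(n) for j in range(n) if i != j]
    q_mean = sum(off_diagonal) / len(off_diagonal)
    return [
        [max(0.0, (q[i][j] - q_mean) / (1.0 - q_mean)) for j in range(n)]
        for i in range(n)
    ]


if __name__ == "__main__":
    # Hypothetical identity-by-state values for three inbred lines.
    q = [
        [1.0, 0.6, 0.2],
        [0.6, 1.0, 0.4],
        [0.2, 0.4, 1.0],
    ]
    for row in kinship_from_identity(q):
        print([round(value, 3) for value in row])
```

Setting negative estimates to zero treats pairs that are less similar than average as unrelated.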

Page 19: Population Structure: Sample Subdivisions

Subpopulation | No. of Varieties | Fst

1 | 19 | 0.337

2 | 32 | 0.111

3 | 13 | 0.295

4 | 31 | 0.064

Total | 95 | 0.188

[Figure: subpopulations S1, S2, S3 and S4.]

Moderate Population Subdivision

Page 20: Population Structure: Factorial Correspondence Analysis

[Figure: Orthogonal views of 4 soft winter wheat subpopulations (S1, S2, S3, S4).]

Page 21: Linkage Disequilibrium: Germplasm Sample Selection

• 149 lines genotyped with 18 unlinked SSR markers

• Most similar lines were excluded

• "Normalizing" the sample drastically reduced LD among unlinked markers

[Figure: r2 probability for unlinked SSR markers in the 149-line and 95-line samples, by significance class (p<.01, p<.001, p<.0001).]

Page 22: Definition of a baseline-LD specific for our sample

Defined as the 95th percentile of the distribution of r2 among unlinked loci

r2 estimates above this value are probably due to genetic linkage

Baseline LD for this sample: r2 = 0.0654

[Figure: density of the correlation coefficient r (x-axis, 0.0 to 0.4) with a normal curve; the 95th percentile of the normal distribution marks the LD baseline.]

[Figure: r2 values (y-axis, 0.00 to 0.12; x-axis, 0 to 600) with the r2 LD baseline.]
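The baseline definition above can be sketched as a percentile calculation. This example takes the empirical 95th percentile of r2 values among unlinked loci and flags linked pairs that exceed it. The figure labels suggest the authors may instead have read the percentile from a normal curve fitted to r, so the empirical percentile here is an assumption. The function names and input values are hypothetical, not data from the study.

```python
def percentile(values, q):
    """Percentile with linear interpolation between order statistics (0 <= q <= 100)."""
    data = sorted(values)
    if not data:
        raise ValueError("no values supplied")
    position = (len(data) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(data) - 1)
    fraction = position - lower
    return data[lower] + (data[upper] - data[lower]) * fraction


def ld_baseline(r2_unlinked, q=95):
    """Baseline LD: the q-th percentile of r2 among unlinked marker pairs."""
    return percentile(r2_unlinked, q)


def pairs_above_baseline(r2_by_pair, baseline):
    """Keep marker pairs whose r2 exceeds the baseline."""
    return {pair: r2 for pair, r2 in r2_by_pair.items() if r2 > baseline}


if __name__ == "__main__":
    unlinked = [i / 100 for i in range(21)]            # made-up r2 values
    baseline = ld_baseline(unlinked)
    linked = {("m1", "m2"): 0.35, ("m2", "m3"): 0.04}  # made-up linked pairs
    print(baseline, pairs_above_baseline(linked, baseline))
```

Applied to the study's own unlinked pairs, a calculation of this kind is what would produce a sample-specific threshold such as the reported r2 = 0.0654.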

Page 23: Linkage Disequilibrium: Chromosome 2D

Consistent LD was below 1 cM, localized LD 1-5 cM

[Figure: r2 (y-axis, 0.0 to 0.6) versus cM (x-axis, 0 to 100), with the baseline LD marked.]

Page 24: Linkage Disequilibrium: Chromosome 5A

Significant LD extended for 5 cM in pericentromeric region

[Figure: r2 (y-axis, 0.0 to 1.0) versus cM (x-axis, 0 to 50), with the baseline LD marked and a region of about 5 cM annotated.]

Page 25: Loci Associated with Kernel Size (p-values): Chromosome 2D

Kernel Size

cM | Locus | Weight NY | Weight OH | Area NY | Area OH | Length NY | Length OH | Width NY | Width OH

7 | Xcfd56 | 0.069 | 0.160 | 0.012 | 0.119 | 0.076 | 0.031 | 0.000* | 0.252

11 | Xwmc111 | 0.005 | 0.020 | 0.005 | 0.108 | 0.003’ | 0.107 | 0.000* | 0.000**

23 | Xgwm261 | 0.145 | 0.016 | 0.019 | 0.009 | 0.027 | 0.009 | 0.058 | 0.001*

28 | Xwmc112 | 0.012 | 0.057 | 0.047 | 0.120 | 0.480 | 0.367 | 0.001* | 0.024

64 | Xgwm30 | 0.081 | 0.862 | 0.053 | 0.848 | 0.312 | 0.820 | 0.000** | 0.212

91 | Xgwm539 | 0.042 | 0.038 | 0.030 | 0.039 | 0.001* | 0.005 | 0.290 | 0.334

Milling Quality

None of the loci on 2D were significant after multiple testing correction

[Annotation: ** Likelihood Ratio Test]

Agreed with QTL in Reed x Grandin

Page 26: Loci Associated with Kernel Size (p-values): Chromosome 5A

Kernel Size

cM | Locus | Weight NY | Weight OH | Area NY | Area OH | Length NY | Length OH | Width NY | Width OH

55 | Xcfa2250 | 0.021 | 0.007 | 0.044 | 0.014 | 0.014 | 0.002* | 0.637 | 0.649

55 | Xwmc150b | 0.002* | 0.003 | 0.003 | 0.005 | 0.009 | 0.002* | 0.093 | 0.429

56 | Xbarc117 | 0.009 | 0.002* | 0.021 | 0.005 | 0.118 | 0.022 | 0.044 | 0.039

60 | Xbarc141 | 0.631 | 0.037 | 0.232 | 0.024 | 0.038 | 0.002* | 0.852 | 0.863

Milling Quality

cM | Locus | Milling Score | Flour Yield | ESI | Friability | Break-Flour Yield

55 | Xcfa2250 | 0.010 | 0.029 | 0.047 | 0.002* | 0.081

[Annotations: n.s.; ** Likelihood Ratio Test]

Agreed with QTL in M6 x Opata

Page 27: B.L.U.E. of allele effects: Kernel Length

N. of Cultivars: 9 5 18 37 9 9 41 45 43 49

Page 28: B.L.U.E. of allele effects: Kernel Width

N. of Cultivars: 41 14 8 15 18 24 5 10 19

Page 29: B.L.U.E. of allele effects: Kernel Weight

N. of Cultivars: 41 45 43 49

Page 30: Conclusions

• Linkage Disequilibrium

  • Variation in LD across the genome can be characterized in relevant germplasm

  • Markers closely linked to QTL of interest can be identified and allelic effects quantified

• Association Mapping as a Breeding Strategy

  • For recurrent selection, markers could be used to carry information from a “good year” to a “bad year”

  • In pedigree breeding, markers could carry information about traits of interest from replicated field trials to single row or single plant selection

  • Allelic values of previously identified alleles can be updated annually based on advanced trial data combined with genotypic data

  • New alleles can be identified and characterized to determine their relative value

  • A selection index can be used to incorporate both phenotypic and molecular data

Page 31: Acknowledgements

• USDA Soft Wheat Quality Lab, Wooster, OH

• Embrapa

Technical Support:

• David Benscher

• James Tanaka

• Gretchen Salm

Page 32:

Kangaroo Island

Wayne Powell