
Integrated Breeding Platform - Plant breeding …


Module 10: Multi-Site (GxE) Analysis – Maize Example

Learning Objectives

• Based on the analysis performed for single site trials (Module 9), be able to select the data from the desired locations for a combined analysis

• Gain an understanding of the different summary outputs in order to interpret the results correctly and make the best advancement decisions

Module Map

• Introduction
• Select Data from Database
• Run Analysis
  o Description of Nodes
  o Exclude Traits
  o Analysis Reports and Graphs
• Descriptive Statistics
  o Trait Summary Statistics
  o Best Variance-Covariance Model for each trait
  o Genotype by Environment Interaction
  o Stability Superiority Measure
  o Static Stability Measure Coefficients
  o Wricke’s Ecovalence Stability Coefficients
  o Finlay and Wilkinson Modified Joint Regression Analysis
  o AMMI Analysis Model
  o GGE Biplot Model
  o Variance-Covariance Model and Correlation Matrix
  o Correlation Heat Map
  o Scatter Plot

Introduction

This module describes a genotype by environment (GxE) analysis for a four-location maize field trial. The outputs (adjusted means, BLUEs, and summary statistics) calculated for the individual locations in the single site analysis in Module 9 serve as inputs for the Multi-Site analysis. You can run the analysis one trait at a time to obtain an individual trait report: the whole analysis runs sequentially through all nodes and, when finished, generates an Output tab with the summary statistics for the trait within and between environments. Alternatively, you can select all traits to run the whole pipeline on multiple traits simultaneously, in which case the results are combined in a single report.

You will complete the steps below to initiate and run Multi-Site analysis.

i. Select data from database

o Define Environments and Groups

ii. Run analysis pipeline

Select Data from Database

i. Open Multi-Site Analysis from the Statistical Analysis menu of the Workbench. Select Browse (Fig. 1).

Figure 1

ii. Select the desired trial, e.g., CIMMYT 2012 Trial, to use the means uploaded to the BMS after the single site analysis (Fig. 2).

Figure 2


o Define environments and groups (Fig. 3)

i. For factor defining Environment select ‘LOCATION_NAME’

ii. For factor defining Genotype select ‘DESIGNATION’

iii. For factor defining Environment Group select ‘None’

Figure 3

Note that traits with adjusted means available from all trial locations are selected by default. However, traits that are not observed or traits that could not be fitted with a mixed model in more than one environment in the single site analysis are not selected.

iv. Review the factors and variates (traits) in the dataset. Leave the default selections and click ‘Next’ (Fig. 4) to open the details of the selected dataset (Fig. 5).

Figure 4.

Four traits (PH, GY, FW, & EH) are automatically selected for the GxE analysis, because the single site analysis was able to generate adjusted means (BLUEs) for these traits at all four test locations.
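The default selection rule described above can be sketched in a few lines of Python. This is only an illustration of the rule "a trait is selected when adjusted means exist at every location"; the trait codes, location labels, and the `default_traits` function are hypothetical and are not read from the BMS.

```python
# Hypothetical record of the locations at which each trait has adjusted means.
# Trait and location labels are placeholders, not values taken from the BMS.
means_available = {
    "GY": {"E1", "E2", "E3", "E4"},
    "PH": {"E1", "E2", "E3", "E4"},
    "ASI": {"E1", "E3"},  # adjusted means missing at two locations
}
locations = {"E1", "E2", "E3", "E4"}


def default_traits(available, all_locations):
    """Return the traits that have adjusted means at every location."""
    return sorted(
        trait for trait, locs in available.items() if all_locations <= locs
    )


selected = default_traits(means_available, locations)
print(selected)
```

In this sketch the trait with incomplete means is left out, which mirrors why only traits fitted at all four locations appear pre-selected in the dialog.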


v. Review the four environments and four traits to be included in the multi-site analysis

Run Analysis

Multi-Site analysis is database integrated.

Figure 5

i. Click ‘Launch Breeding View’ (Fig. 5) to launch the application. The analysis conditions and data are automatically loaded. All locations and traits (that were analyzed in single site analysis in Module 9) are selected by default.

ii. Right-click on any trait or location to exclude it from the analysis.

iii. When a project has been created or opened, a visual representation of the analytical pipeline is displayed in the Analysis Pipeline tab. The analysis pipeline includes a set of connected nodes, which can be used to run and configure pipelines.

Description of Nodes

i. Quality Control Phenotypes: Contains summary statistics within and between environments for the trait(s)

ii. Finlay-Wilkinson: Allows user to perform a Finlay-Wilkinson joint regression (Finlay and Wilkinson, 1963)

iii. AMMI Analysis: Fits an additive main effects and multiplicative interaction (AMMI) model and generates summaries and a biplot (Gauch, 1988)

iv. GGE Biplot: Fits a GGE model and generates a biplot (Yan et al., 2000)

v. Variance-Covariance Modeling: Fits different variance-covariance models to the GxE data and selects the best structure for the genetic correlations between environments

vi. Stability Coefficients: Estimates different stability coefficient parameters to assess genotype performance across locations

vii. Generate report: Generates an HTML report of the results


Exclude Traits

You can run analysis for all traits or one trait at a time.

i. Exclude all traits except grain yield (GY) from the analysis

ii. Right-click the ‘Quality Control Phenotypes’ node and choose ‘Run Pipeline’ (Fig. 6)

iii. Run the analysis using the default settings

Figure 6

iv. A popup notification indicates when the pipeline is complete (Fig. 7)

Figure 7

Analysis Report and Graphs – Interpretation of Outputs

The analysis output can be viewed in the Breeding View interface under the Results and Graphs tabs. The results can also be reviewed as individual files, which are automatically saved in a time-stamped folder on the C: drive of your computer, at the path shown below.

C:/Breeding Management System/Workspace/Maize Tutorial/breeding_view/output


Descriptive Statistics

Breeding View provides descriptive statistics that describe the variance and covariance of the entire dataset.

Trait Summary Statistics

The trait summary statistics describe each trait based on the means calculated for each environment in the single site analysis (Fig. 8).

Figure 8

The box plot of means provides a visual representation of the summary statistics (Fig. 9).

Figure 9. Boxplot of Grain Yield Means at Tlaltizapan


The Tlaltizapan location has the highest grain yield and the highest variance. The Sabana del Medio location has the lowest grain yield and the lowest variance (Fig. 9).
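To make the comparison of locations concrete, the sketch below computes the kind of per-environment summary (mean, range, variance) that a trait summary table and box plot convey. The values are invented for illustration and the environment labels `E1` and `E2` are placeholders; this is not the Breeding View calculation itself.

```python
from statistics import mean, variance

# Invented genotype means per environment; not the tutorial's data.
gy_means = {
    "E1": [6.1, 7.4, 5.2, 8.0],
    "E2": [3.0, 3.4, 2.9, 3.1],
}


def summarize(values):
    """Return basic summary statistics for one environment."""
    return {
        "n": len(values),
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
        "variance": variance(values),  # sample variance (n - 1 denominator)
    }


summary = {env: summarize(vals) for env, vals in gy_means.items()}
for env, stats in summary.items():
    print(env, stats)
```

Here `E1` plays the role of a high-yielding, high-variance location and `E2` that of a low-yielding, low-variance one.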

Best Variance-Covariance Model for Each Trait

The GxE analysis pipeline formally models the variance-covariance structure in the means data, and selects the best model for each trait. The main purpose is to establish a model for later testing of fixed effects, like determining marker effects in a quantitative trait loci by environment (QTLxE) analysis using BLUPs calculated in the single site analysis. 

Figure 11. Best Variance-Covariance Model

In this example, grain yield means are best described by an unstructured model, where each variance and covariance is estimated uniquely from the data (Fig. 11).

Genotype By Environment (GxE) Interactions

Stability, or lack of phenotypic plasticity, is calculated for each genotype considering all traits using the following analyses:

• Cultivar-Superiority Measure
• Static Stability Measures Coefficients
• Wricke’s Ecovalence Stability Coefficients

GxE interactions are also examined for each individual trait using the following analyses:

• Finlay and Wilkinson Modified Joint Regression
• AMMI Model
• GGE Model
• Best Variance-Covariance Model
• Correlation Matrix
• Scatter Plot Matrix

Stability Superiority Measure

The Stability Superiority Measure (Lin & Binns, 1988) is the sum of the squares of the differences between a genotype’s mean in each environment and the mean of the best genotype in that environment, divided by twice the number of environments. Genotypes with the smallest superiority values tend to be more stable and closer to the best genotype in each environment (Fig. 12).


Figure 12
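The superiority measure described above can be sketched as follows, assuming that the "best genotype" is the one with the highest mean in each environment and that the table is complete (no missing means). The data and the function name `superiority` are invented for illustration.

```python
# Invented genotype-by-environment means (one list of environment means per genotype).
means = {
    "G1": [5.0, 6.0, 4.0],
    "G2": [6.0, 5.0, 5.0],
    "G3": [4.0, 4.0, 3.0],
}


def superiority(table):
    """Superiority measure: sum over environments of the squared distance to
    the best mean in that environment, divided by twice the number of
    environments."""
    n_env = len(next(iter(table.values())))
    best = [max(row[j] for row in table.values()) for j in range(n_env)]
    return {
        g: sum((x - b) ** 2 for x, b in zip(row, best)) / (2 * n_env)
        for g, row in table.items()
    }


p = superiority(means)
for genotype, value in sorted(p.items(), key=lambda kv: kv[1]):
    print(genotype, round(value, 4))
```

Sorting by the measure lists the genotypes closest to the best performer first; in this invented table `G2` ranks first and `G3` last.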

Static Stability Measures Coefficients

The Static Stability Coefficient is defined as the variance around the germplasm’s phenotypic mean across all environments. This provides a measure of the consistency of the genotype, without accounting for performance (Fig. 13).
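As a minimal sketch of this definition, the static stability coefficient of a genotype can be computed as the sample variance of its means across environments. The data are invented, and the use of the sample variance (denominator n - 1) is an assumption about the exact estimator.

```python
from statistics import variance

# Invented genotype-by-environment means (one list of environment means per genotype).
means = {
    "G1": [5.0, 6.0, 4.0],
    "G2": [6.0, 5.0, 5.0],
    "G3": [4.0, 4.0, 3.0],
}

# Variance of each genotype's means around its own mean across environments.
static_stability = {g: variance(row) for g, row in means.items()}

for genotype, value in static_stability.items():
    print(genotype, round(value, 4))
```

Note that `G3` is as consistent as `G2` in this table even though it yields less everywhere, which illustrates why the coefficient says nothing about performance.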


Figure 13

Wricke’s Ecovalence Stability Coefficients

Wricke’s Ecovalence Stability Coefficient (Wricke, 1962) is the contribution of each genotype to the genotype-by-environment sum of squares, in an unweighted analysis of the genotype-by-environment means. A low value indicates that the genotype responds in a consistent manner to changes in environment, i.e. it is stable from a dynamic point of view. Like static stability, Wricke’s Ecovalence does not account for genotype performance (Fig. 14).
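The ecovalence can be illustrated with a short sketch: each cell of the means table is stripped of its genotype and environment main effects, and the squared interaction residuals are summed per genotype. The data are invented and a complete, unweighted table is assumed; the function name `ecovalence` is hypothetical.

```python
# Invented genotype-by-environment means (one list of environment means per genotype).
means = {
    "G1": [5.0, 6.0, 4.0],
    "G2": [6.0, 5.0, 5.0],
    "G3": [4.0, 4.0, 3.0],
}


def ecovalence(table):
    """Sum of squared interaction residuals per genotype (balanced table)."""
    n_gen = len(table)
    n_env = len(next(iter(table.values())))
    env_mean = [sum(row[j] for row in table.values()) / n_gen for j in range(n_env)]
    grand = sum(env_mean) / n_env
    result = {}
    for g, row in table.items():
        gen_mean = sum(row) / n_env
        result[g] = sum(
            (x - gen_mean - env_mean[j] + grand) ** 2 for j, x in enumerate(row)
        )
    return result


w = ecovalence(means)
gxe_ss = sum(w.values())  # the ecovalences add up to the GxE sum of squares
print(w, gxe_ss)
```

In this table `G3` follows the environment means exactly, so its ecovalence is zero, while `G1` and `G2` share the whole interaction sum of squares.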


Figure 14

Finlay and Wilkinson Modified Joint Regression Analysis

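The Finlay and Wilkinson Modified Joint Regression Analysis ranks germplasm based on phenotypic stability for each individual trait. Each genotype’s performance is regressed on an environmental index, and the slope of the regression (the sensitivity) measures how strongly the genotype responds to changes in environment (Fig. 15).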
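The idea behind joint regression can be sketched with the classic Finlay and Wilkinson formulation, in which each genotype's means are regressed on an environment index taken as the environment mean minus the grand mean. This is an assumption made for illustration: the modified procedure used by Breeding View may estimate the index differently. The data and the function name are invented.

```python
# Invented genotype-by-environment means (one list of environment means per genotype).
means = {
    "G1": [5.0, 6.0, 4.0],
    "G2": [6.0, 5.0, 5.0],
    "G3": [4.0, 4.0, 3.0],
}


def finlay_wilkinson_slopes(table):
    """Least-squares slope of each genotype's means on the environment index."""
    n_gen = len(table)
    n_env = len(next(iter(table.values())))
    env_mean = [sum(row[j] for row in table.values()) / n_gen for j in range(n_env)]
    grand = sum(env_mean) / n_env
    index = [m - grand for m in env_mean]  # environment index, sums to zero
    ss_index = sum(i * i for i in index)
    # Because the index sums to zero, this ratio is the ordinary regression slope.
    return {
        g: sum(x * i for x, i in zip(row, index)) / ss_index
        for g, row in table.items()
    }


slopes = finlay_wilkinson_slopes(means)
for genotype, b in slopes.items():
    print(genotype, round(b, 3))
```

A slope near 1 indicates average sensitivity, a slope above 1 a genotype that responds strongly to better environments, and a slope below 1 a genotype that changes little across environments. With this index the slopes average to 1 by construction.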


Figure 15

AMMI Model

In the Additive Main Effects and Multiplicative Interaction (AMMI) model, a two-way additive ANOVA model is fitted first (additive main effects), followed by a principal component analysis on the residuals (multiplicative interaction). As a result, the interaction is characterized by Interaction Principal Components (IPCA), and genotypes and environments can be plotted simultaneously in biplots (Fig. 16).


Figure 16
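The two steps of the AMMI model described above can be sketched with NumPy: remove the additive genotype and environment effects, then decompose the interaction residuals with a singular value decomposition. The data are invented, and splitting each singular value equally between genotype and environment scores is one common biplot convention assumed here, not necessarily the scaling Breeding View uses.

```python
import numpy as np

# Invented means table: rows are genotypes, columns are environments.
Y = np.array([
    [5.0, 6.0, 4.0],
    [6.0, 5.0, 5.0],
    [4.0, 4.0, 3.0],
    [7.0, 5.0, 6.0],
])

# Step 1: additive main effects (two-way model without interaction).
grand = Y.mean()
geno_effect = Y.mean(axis=1, keepdims=True) - grand
env_effect = Y.mean(axis=0, keepdims=True) - grand
resid = Y - grand - geno_effect - env_effect  # interaction residuals

# Step 2: multiplicative interaction via SVD of the residuals.
U, s, Vt = np.linalg.svd(resid, full_matrices=False)
explained = s**2 / np.sum(s**2)  # share of the interaction SS per IPCA axis

# Scores for the first two IPCA axes (symmetric scaling).
geno_scores = U[:, :2] * np.sqrt(s[:2])
env_scores = Vt[:2].T * np.sqrt(s[:2])

print(np.round(explained, 3))
print(np.round(geno_scores, 3))
print(np.round(env_scores, 3))
```

Plotting `geno_scores` and `env_scores` on the same axes gives the biplot. With three environments the residuals have at most two non-zero axes, so the first two IPCA axes reproduce the interaction exactly in this small example.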

GGE Model

In the Genotype Main Effects and Genotype × Environment Interaction Effects (GGE) model (Yan et al., 2000; Yan & Kang, 2003), a one-way ANOVA with environment as the main effect is run, followed by a principal component analysis on the residuals. Like AMMI, the principal component scores can be used to construct biplots. Unlike the AMMI model, in GGE the genotypic main effects are also represented in the plot. The GGE model is superior to AMMI analysis at differentiating mega-environments (Yan et al., 2007) (Fig. 17).


Figure 17

The Agua Fria and Tlaltizapan locations cluster together, indicating that these two locations have similar environmental effects on phenotype and small GxE interactions.
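A minimal NumPy sketch of the GGE decomposition follows. Only the environment means are removed before the singular value decomposition, so the genotype main effects stay in the matrix that is plotted. The data are invented and the symmetric scaling of the scores is an assumption.

```python
import numpy as np

# Invented means table: rows are genotypes, columns are environments.
Y = np.array([
    [5.0, 6.0, 4.0],
    [6.0, 5.0, 5.0],
    [4.0, 4.0, 3.0],
    [7.0, 5.0, 6.0],
])

# Remove environment main effects only (environment-centred data, G + GE).
gge = Y - Y.mean(axis=0, keepdims=True)

# Principal components of the environment-centred table.
U, s, Vt = np.linalg.svd(gge, full_matrices=False)
explained = s**2 / np.sum(s**2)

# Scores for the first two principal components (symmetric scaling).
geno_scores = U[:, :2] * np.sqrt(s[:2])
env_scores = Vt[:2].T * np.sqrt(s[:2])

print(np.round(explained, 3))
print(np.round(env_scores, 3))
```

Environments whose score vectors point in similar directions rank the genotypes similarly, which is the pattern read off the biplot when locations cluster together.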


Variance-Covariance Model and Correlation Matrix

Details of the variance-covariance model, including the pairwise correlation matrix derived from the covariance model, are presented in a table in the Report tab. In the correlation matrix, values close to 1 indicate higher correlation between environments. A value of 1 indicates a perfect correlation, such as when an environment is compared to itself (Fig. 18).

Figure 18. Correlation Matrix for Grain Yield (GY)

Agua Fria is most positively correlated with the Tlaltizapan environment (0.8659), suggesting that the Agua Fria and Tlaltizapan locations have similar environmental effects on phenotype.
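How a pairwise correlation matrix between environments is read can be illustrated with plain Pearson correlations of genotype means. This is a simplification: the matrix in the report is derived from the fitted variance-covariance model, not from raw correlations. The data, the labels `E1` to `E3`, and the `pearson` helper are invented for the sketch.

```python
from math import sqrt

# Invented genotype means per environment (same genotype order in each list).
env_means = {
    "E1": [5.0, 6.0, 4.0, 7.0],
    "E2": [6.0, 5.0, 4.0, 5.0],
    "E3": [4.0, 5.0, 3.0, 6.5],
}


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


corr = {
    a: {b: pearson(env_means[a], env_means[b]) for b in env_means}
    for a in env_means
}

for a, row in corr.items():
    print(a, {b: round(r, 3) for b, r in row.items()})
```

In this invented table `E1` and `E3` rank the genotypes almost identically (correlation close to 1), while `E1` and `E2` agree much less, which is how a high or low off-diagonal value in the report is interpreted.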


Correlation Heat Map 

The correlation heat map visualizes correlations with color: warm colors (red) indicate high positive correlation between environments, and cool colors (blue) indicate high negative correlation between environments (Fig. 19).

Figure 19. Correlation Heat Map of Grain Yield

Agua Fria is most positively correlated (red) with the Tlaltizapan environment, suggesting that the Agua Fria and Tlaltizapan locations have similar environmental effects on phenotype and small GxE interactions (Fig. 19).


Scatter Plot Matrix

The scatter plot matrix illustrates the association of genotypic performance between each pair of environments (Fig. 20).

Figure 20. Scatter Plot Matrix for Grain Yield

A positive correlation is observed between genotypic performance at Agua Fria and Tlaltizapan, indicating similar environmental effects on phenotype for these environments and small GxE interactions. However, little correlation is observed between genotypic performance at Agua Fria and Jutipapa, indicating large GxE interactions between these two environments (Fig. 20).

References

Gauch, H. G. (1988). Model selection and validation for yield trials with interaction. Biometrics, 44, 705–715.

Gauch, H.G. (1992). Statistical Analysis of Regional Yield Trials – AMMI analysis of factorial designs. Elsevier, Amsterdam.

Finlay, K.W. & Wilkinson, G.N. (1963). The analysis of adaptation in a plant-breeding programme. Australian Journal of Agricultural Research, 14, 742-754.

Murray, D., Payne, R., & Zhang, Z. (2014). Breeding View, a Visual Tool for Running Analytical Pipelines: User Guide. VSN International Ltd.


Lin, C.S. & Binns, M.R. (1988). A superiority measure of cultivar performance for cultivar x location data. Canadian Journal of Plant Science, 68, 193–198.

Lin, C. S., Binns, M. R., & Lefkovitch, L. P. (1986). Stability analysis: Where do we stand? Crop Science, 26, 894–900.

Oakey, H., Verbyla, A. P., Pitchford, W., Cullis, B., & Kuchel, H. (2006). Joint modeling of additive and non-additive genetic line effects in single field trials. Theoretical and Applied Genetics, 113, 809–819.

Wricke, G. (1962). On a method of understanding the biological diversity in field research. Zeitschrift für Pflanzenzüchtung, 47, 92–146.

Yan, W., Hunt, L. A., Sheng, Q., & Szlavnics, Z. (2000). Cultivar Evaluation and Mega-Environment Investigation Based on the GGE Biplot. Crop Science, 40, 597–605.

Yan, W. & Kang, M.S. (2003). GGE Biplot Analysis: a Graphical Tool for Breeders, Geneticists and Agronomists. CRC Press, Boca Raton.

Yan, W., Kang, M.S., Ma, B., Woods, S., & Cornelius, P.L. (2007). GGE Biplot vs. AMMI Analysis of Genotype-by-Environment Data. Crop Science, 47, 643–653.
