
Estimators for structural equation models of Likert scale data


Results
Population correlation = black dashed line. Exponentiation decreased correlations (blue). Binning continuous data attenuates correlations (red); gap widens with bigger exponents (dashed green). Within-sample median rs and skews used for independence. Polychoric correlations resist bias from binning and nonnormality.

Estimator legend: ML = green, WLS = purple, ULS = blue, DWLS = red, GLS = olive.

Figure panels: Loading SEs based on polychoric correlations; Loadings based on polychoric correlations; Loadings based on product-moment correlations; Item residuals based on polychoric correlations; χ² fit statistics based on polychoric correlations.

Summary
WLS loading estimates deviate most from target (dashed black line = square root of median correlation).

Parameter estimates roughly equal for ML, ULS, & DWLS.

WLS estimates are least accurate. For GLS, only error variances are inaccurate.

No reasons to recommend weighted or generalized least squares.

Fit statistics are biased by nonnormality for ML and GLS.

ML and GLS have worse χ² and RMSEA even in normal data.

Nonnormality widens standard errors of loadings for DWLS & WLS.

Liberal & robust (best?): ULS and DWLS.
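The χ² and RMSEA comparisons above depend on the model's degrees of freedom. A minimal sketch of that arithmetic follows, assuming a one-factor CFA fitted to a covariance matrix with the factor variance fixed to 1 and the common (n − 1) convention for RMSEA; some software divides by n instead. The function names and the χ² value of 10 are hypothetical, not results from the poster.

```python
import math

def one_factor_df(n_items):
    """Degrees of freedom for a one-factor CFA on a covariance matrix.

    Assumes the factor variance is fixed to 1, so the free parameters are
    one loading and one error variance per item.
    """
    unique_moments = n_items * (n_items + 1) // 2   # variances + covariances
    free_parameters = 2 * n_items                   # loadings + error variances
    return unique_moments - free_parameters

def rmsea(chi_square, df, n):
    """RMSEA point estimate using the (n - 1) convention (an assumption)."""
    return math.sqrt(max(chi_square - df, 0.0) / (df * (n - 1)))

df = one_factor_df(5)                 # 15 moments - 10 parameters = 5
example = rmsea(10.0, df, 500)        # hypothetical chi-square of 10
print(df, round(example, 4))
```

Because RMSEA is zero whenever χ² does not exceed its degrees of freedom, an estimator whose χ² is inflated by nonnormality will also show an inflated RMSEA.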

Method
Simulated 10,000 multivariate normal datasets for 5-item CFAs with n = 500. Correlations = .581 ± sampling error.
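One way to generate a dataset of this kind can be sketched as follows. The poster does not say how the data were drawn, so this assumes a single standard-normal factor with equal loadings of √.581, which yields multivariate normal items whose population correlations all equal .581. The names `simulate_items` and `pearson` and the seed are illustrative.

```python
import math
import random

def simulate_items(n=500, n_items=5, rho=0.581, seed=1):
    """Draw n cases on n_items unit-variance items sharing one normal factor."""
    rng = random.Random(seed)
    loading = math.sqrt(rho)          # about .762 when rho = .581
    resid_sd = math.sqrt(1.0 - rho)   # keeps each item's variance at 1
    data = []
    for _ in range(n):
        factor = rng.gauss(0.0, 1.0)
        data.append([loading * factor + resid_sd * rng.gauss(0.0, 1.0)
                     for _ in range(n_items)])
    return data

def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

sample = simulate_items()
item1 = [row[0] for row in sample]
item2 = [row[1] for row in sample]
print(round(pearson(item1, item2), 3))  # close to .581, up to sampling error
```

Under this setup the square root of the correlation is the population loading, which is why it can serve as a target for loading estimates.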

Continuous values plotted as vertically jittered triangles. Ordinal counts and bin thresholds in red.

Exponentiated continuous values to 1–6th powers to increase skew & kurtosis across samples:

Sample #10,000. Generated like Sample #1, but continuous values were raised to 6th power. Continuous/ordinal skewness = 4.3/6.5, kurtosis = 35/58.
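The effect of exponentiation on distribution shape can be illustrated with a short sketch. It assumes the continuous scores are centered on a positive value (a hypothetical mean of 3) so that every power skews the data to the right; the poster does not state how the values were scaled, and the figures this prints are not expected to match those reported for Sample #10,000. Kurtosis here is excess kurtosis (normal = 0), which is also an assumption.

```python
import random

def central_moment(values, order):
    """Population-style central moment of the given order."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** order for v in values) / len(values)

def skewness(values):
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3.0

rng = random.Random(2)
# Hypothetical scores: normal with mean 3 and SD 1 (assumed, not from the poster).
base = [rng.gauss(3.0, 1.0) for _ in range(500)]

shape_by_power = {}
for power in range(1, 7):
    powered = [v ** power for v in base]
    shape_by_power[power] = (skewness(powered), excess_kurtosis(powered))
    print(power, round(shape_by_power[power][0], 2),
          round(shape_by_power[power][1], 2))
```

The first power leaves the data approximately normal, while higher powers stretch the upper tail and raise both statistics.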

Compared results for continuous and binned (ordinal) data.
Compared loadings, error variances, fit statistics, etc.
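Binning can be sketched as cutting each continuous value at fixed thresholds. The poster does not give the number of bins or the cut points, so the four thresholds below (five categories) are purely hypothetical, as are the names `to_ordinal` and `pearson`. The sketch also computes the product-moment correlation before and after binning so the two can be compared.

```python
import math
import random
from bisect import bisect_right

THRESHOLDS = [-1.5, -0.5, 0.5, 1.5]   # hypothetical cut points -> 5 categories

def to_ordinal(values, thresholds):
    """Map each value to a category from 1 to len(thresholds) + 1."""
    return [bisect_right(thresholds, v) + 1 for v in values]

def pearson(x, y):
    """Product-moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

rng = random.Random(3)
rho = 0.581
x, y = [], []
for _ in range(500):
    shared = rng.gauss(0.0, 1.0)      # common factor shared by both items
    x.append(math.sqrt(rho) * shared + math.sqrt(1 - rho) * rng.gauss(0.0, 1.0))
    y.append(math.sqrt(rho) * shared + math.sqrt(1 - rho) * rng.gauss(0.0, 1.0))

x_ord = to_ordinal(x, THRESHOLDS)
y_ord = to_ordinal(y, THRESHOLDS)
counts = [x_ord.count(k) for k in range(1, 6)]   # ordinal counts per category
r_continuous = pearson(x, y)
r_binned = pearson(x_ord, y_ord)
print(counts, round(r_continuous, 3), round(r_binned, 3))
```

Coarser bins or more skewed data would be expected to pull the binned correlation further from the continuous one.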

Estimators for Structural Equation Modeling of Nonnormal Likert Scale Data

Nick Stauner
Case Western Reserve University

References
Jöreskog, K. G., & Sörbom, D. (1996). LISREL 8: User’s reference guide. Chicago: Scientific Software.

Mîndrilă, D. (2010). Maximum likelihood (ML) and diagonally weighted least squares (DWLS) estimation procedures: A comparison of estimation bias with ordinal and multivariate non-normal data. International Journal of Digital Society, 1(1), 60–66.

Muthén, B., du Toit, S. H. C., & Spisic, D. (1997). Robust inference using weighted least squares and quadratic estimating equations in latent variable modeling with categorical and continuous outcomes. Psychometrika, 75, 1–45.

Muthén, L. K., & Muthén, B. O. (1998). Mplus user’s guide. Los Angeles: Muthén & Muthén.

Olsson, U. H., Foss, T., Troye, S. V., & Howell, R. D. (2000). The performance of ML, GLS, and WLS estimation in structural equation modeling under conditions of misspecification and nonnormality. Structural Equation Modeling, 7(4), 557–595.

Quiroga, A. M. (1992). Studies of the polychoric correlation and other correlation measures for ordinal variables. Unpublished doctoral dissertation. Uppsala: Acta Universitatis Upsaliensis.

R Core Team (2012). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0. http://www.R-project.org/. Version 2.15.2.

Rosseel, Y. (2012). lavaan: An R package for structural equation modeling. Journal of Statistical Software, 48(2), 1–36. URL: http://www.jstatsoft.org/v48/i02/

Satorra, A., & Bentler, P. M. (1994). Corrections to test statistics and standard errors in covariance structure analysis. In A. von Eye & C. C. Clogg (Eds.), Latent variables analysis: Applications for developmental research. Thousand Oaks, CA: Sage Publications.

Contact: NickStauner@gmail.com

Limitations and Future Directions
Haven’t explored theoretical differences yet. Maximum likelihood might have more meaningfully defined standard errors.
Didn’t use scaled χ² or mean & variance adjustments to improve robustness of fit statistics or standard errors against nonnormality yet.
Future sensitivity analyses:
• Sample size (half-done), correlation strength, # of bins, model complexity
• SRMRs, WRMRs, and residual SEs for all estimators
• Interactions? Could test for them with generalized linear models…

Introduction
Maximum likelihood (ML) is a popular default estimator for SEM. Alternatives in R’s lavaan:
• Weighted least squares (WLS)
• Diagonally weighted least squares (DWLS)
• Unweighted least squares (ULS)
• Generalized least squares (GLS)
Which estimator is optimal for SEM of Likert scale data?

Polychoric correlations assume a normally distributed latent variable and estimate thresholds from ordinal data. How do estimators work with these?
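The threshold step can be sketched on its own: under the latent-normality assumption, each threshold is the standard-normal quantile of the cumulative proportion of responses at or below a category. This is only the first stage of the usual two-step approach; estimating the correlation itself from the bivariate table is not shown. The function name and the category counts are hypothetical.

```python
from statistics import NormalDist

def estimate_thresholds(counts):
    """Estimate latent-normal thresholds from ordinal category counts.

    Assumes the first and last categories are non-empty, so every cumulative
    proportion lies strictly between 0 and 1 (inv_cdf rejects 0 and 1).
    """
    total = sum(counts)
    std_normal = NormalDist()          # mean 0, SD 1
    thresholds = []
    cumulative = 0
    for count in counts[:-1]:          # k categories give k - 1 thresholds
        cumulative += count
        thresholds.append(std_normal.inv_cdf(cumulative / total))
    return thresholds

# Hypothetical counts for a 5-point item (not data from the poster).
example_counts = [50, 200, 500, 200, 50]
print([round(t, 3) for t in estimate_thresholds(example_counts)])
# -> [-1.645, -0.674, 0.674, 1.645]
```

Skewed items produce asymmetric thresholds, but the latent variable they are mapped onto is still assumed to be normal.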