Structural Equation Modeling Using Mplus Chongming Yang, Ph. D. 3-22-2012


“In the past twenty years we have witnessed a paradigm shift in the analysis of correlational data. Confirmatory factor analysis and structural equation modeling have replaced exploratory factor analysis and multiple regression as the standard methods.”

Kenny, D. A., Kashy, D. A., & Bolger, N. (1998). Data analysis in social psychology. In D. T. Gilbert, S. T. Fiske, & G. Lindzey (Eds.), The Handbook of Social Psychology, Vol. 1 (pp. 233-265). New York: McGraw-Hill.

New Paradigm in Data Analysis

Structural?

Structuralism
Components
Relations

Objectives

Introduction to SEM
  Model
  Source of the model
  Parameters
  Estimation
  Model evaluation
  Applications
Estimate simple models with Mplus

Session I: Continuous Dependent Variables

Four Moments/Information of a Variable

Mean
Variance
Skewness
Kurtosis

Variance & Covariance

$V = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}$

$Cov = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{n - 1}$

Covariance Matrix (S)

        x1      x2      x3
x1      V1
x2      Cov21   V2
x3      Cov31   Cov32   V3
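The entries of S can be computed directly from raw scores. The following is a minimal Python sketch, assuming the n - 1 (sample) divisor; the function names and the toy data are illustrative only and do not come from the slides.

```python
def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def covariance(x, y):
    """Sample covariance with the n - 1 divisor (assumed; some software divides by n).

    covariance(x, x) is the variance of x.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y need the same length, at least 2")
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)


def covariance_matrix(columns):
    """Full symmetric covariance matrix S for a list of variables."""
    return [[covariance(ci, cj) for cj in columns] for ci in columns]


# Toy data (hypothetical): three variables observed on five cases.
x1 = [1, 2, 3, 4, 5]
x2 = [5, 4, 3, 2, 1]
x3 = [2, 1, 4, 3, 5]
S = covariance_matrix([x1, x2, x3])
```

The diagonal of `S` holds the variances and the lower triangle holds the covariances shown in the table.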

Statistical Model

Probabilistic statement about relations of variables
Imperfect but useful representation of reality

Structural Equation Modeling

A system of regression equations for latent variables to estimate and test direct and indirect effects without the influence of measurement errors.
To estimate and test theories about interrelations among observed and latent variables.

Latent Variable / Construct / Factor

A hypothetical variable
  cannot be measured directly
  inferred from observable manifestations
Multiple manifestations (indicators)
Normally distributed interval dimension
No objective measurement unit

How Is Depression Distributed in ...?

College students
Patients for depression therapy

Normal Distributions

Levels of Analyses

Observed
Latent

Test Theories

Classical True Score Theory:
  Observed score = True score + Error
Item Response Theory
Generalizability Theory
(Raykov & Marcoulides, 2006)

Graphic Symbols of SEM

Rectangle -- observed variable
Oval -- latent variable or error
Single-headed arrow -- causal relation
Double-headed arrow -- correlation

Graphic Measurement Model of Latent Variable ξ

[Figure: path diagram of a latent variable ξ measured by indicators X1, X2, X3, with loadings λ1, λ2, λ3 and errors δ1, δ2, δ3]

Equations

Specific equations
  $X_1 = \lambda_1 \xi + \delta_1$
  $X_2 = \lambda_2 \xi + \delta_2$
  $X_3 = \lambda_3 \xi + \delta_3$
Matrix symbols
  $X = \Lambda \xi + \delta$

Relations of Variances

$V_{X_1} = \lambda_1^2 \mathrm{Var}(\xi) + \mathrm{Var}(\delta_1)$
$V_{X_2} = \lambda_2^2 \mathrm{Var}(\xi) + \mathrm{Var}(\delta_2)$
$V_{X_3} = \lambda_3^2 \mathrm{Var}(\xi) + \mathrm{Var}(\delta_3)$

δ = measurement error / uniqueness

Sample Covariance Matrix (S)

        x1      x2      x3
x1      V1
x2      Cov21   V2
x3      Cov31   Cov32   V3

Relation of Covariances

Variance of ξ = common covariance of X1, X2 and X3

[Figure: variance of ξ]
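The relations of variances and covariances can be illustrated with a small numeric sketch. The Python below builds the covariance matrix implied by a one-factor model, assuming uncorrelated errors; the parameter values are hypothetical and the function name is illustrative.

```python
def implied_covariance(loadings, factor_variance, error_variances):
    """Covariance matrix implied by a one-factor model with uncorrelated errors.

    Diagonal:     loading_i**2 * Var(factor) + Var(error_i)
    Off-diagonal: loading_i * loading_j * Var(factor)
    """
    k = len(loadings)
    sigma = [[loadings[i] * loadings[j] * factor_variance for j in range(k)]
             for i in range(k)]
    for i in range(k):
        sigma[i][i] += error_variances[i]
    return sigma


# Hypothetical values: X1 is the reference indicator (loading fixed at 1).
sigma = implied_covariance(loadings=[1.0, 0.8, 0.6],
                           factor_variance=2.0,
                           error_variances=[0.5, 0.4, 0.3])
```

Every covariance depends only on the loadings and the factor variance, which is why the factor variance captures what the indicators share, while each variance also carries its own error term.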

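Unknown Parameters

$V_{X_1} = \lambda_1^2 \mathrm{Var}(\xi) + \mathrm{Var}(\delta_1)$
$V_{X_2} = \lambda_2^2 \mathrm{Var}(\xi) + \mathrm{Var}(\delta_2)$
$V_{X_3} = \lambda_3^2 \mathrm{Var}(\xi) + \mathrm{Var}(\delta_3)$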

Unstandardized Parameterization (scaling)

$\lambda_1 = 1$ (fixes the loading of X1 at 1 so that ξ takes the metric of X1; X1 is called the reference indicator)
Variance of ξ = common variance of X1, X2 and X3
Squared λ = explained variance of X ($R^2$)
Variance of δ = unexplained variance in X
Mean of δ = 0

Standardized Parameterization (scaling)

Variance of ξ = 1 = common variance of X1, X2 and X3
Squared λ = explained variance of X ($R^2$)
Variance of δ = 1 - $\lambda^2$
Mean of ξ = 0
Mean of δ = 0

Two Kinds of Parameters

Fixed at 1 or 0
Freely estimated

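[Figure: path diagram of a full structural equation model. Latent variables and their indicators: General Intelligence (Analytic d1, Reasoning d2, Verbal d3); Emotional Intelligence (Self Control d4, Recognize/Assess d5); Personality (Agreeableness d6, Openness d7); Job Satisfaction (Being Appreciated e1, Social Relations e2); Marital Satisfaction (Perceived Benefit e3, Perceived Cost e4); disturbances z1 and z2]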

Structural Equation Model in Matrix Symbols

$X = \Lambda_x \xi + \delta$ (exogenous)
$Y = \Lambda_y \eta + \varepsilon$ (endogenous)
$\eta = B\eta + \Gamma\xi + \zeta$ (structural model)

Note: The measurement model reflects the true score theory.

Structural Equation Model in Matrix Symbols

$X = \tau_x + \Lambda_x \xi + \delta$ (measurement)
$Y = \tau_y + \Lambda_y \eta + \varepsilon$ (measurement)
$\eta = \alpha + B\eta + \Gamma\xi + \zeta$ (structural)

Note: SEM with mean structure.

Model Implied Covariance Matrix (Σ)

Note: This covariance matrix contains unknown parameters in the equations.

(I-B) = non-singular
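For reference, the model-implied covariance matrix can be sketched in standard LISREL notation. The symbols Φ (covariance matrix of ξ), Ψ (covariance matrix of ζ), and Θ_ε and Θ_δ (covariance matrices of the measurement errors) do not appear in the slides and are assumed here.

```latex
\Sigma(\theta) =
\begin{bmatrix}
\Lambda_y (I-B)^{-1} (\Gamma \Phi \Gamma' + \Psi) \left[(I-B)^{-1}\right]' \Lambda_y' + \Theta_\varepsilon
  & \Lambda_y (I-B)^{-1} \Gamma \Phi \Lambda_x' \\
\Lambda_x \Phi \Gamma' \left[(I-B)^{-1}\right]' \Lambda_y'
  & \Lambda_x \Phi \Lambda_x' + \Theta_\delta
\end{bmatrix}
```

The inverse of (I - B) appears in three of the four blocks, which is why (I - B) has to be non-singular.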

Sample Covariance Matrix (S)

        x1      x2      x3      x4      ...
x1      v1
x2      cov21   v2
x3      cov31   cov32   v3
x4      cov41   cov42   cov43   v4
...
Means   Mean1   Mean2   Mean3   Mean4   ...

Total info = P(P+1)/2 + Means (if included)

Estimations/Fit Functions

Hypothesis: Σ = S or Σ - S = 0
Maximum Likelihood:
$F = \log|\Sigma| + \mathrm{trace}(S\Sigma^{-1}) - \log|S| - (p+q)$
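To make the fit function concrete, here is a minimal pure-Python sketch that evaluates F for a given sample matrix S and a candidate implied matrix Σ. It assumes that (p+q) is the total number of observed variables; the helper names and the 2 x 2 example matrices are hypothetical, and a real program would use an optimized linear algebra library.

```python
import math


def det_and_inverse(matrix):
    """Determinant and inverse of a square matrix by Gauss-Jordan elimination."""
    n = len(matrix)
    # Augment the matrix with the identity matrix.
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    det = 1.0
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        if pivot != col:
            aug[col], aug[pivot] = aug[pivot], aug[col]
            det = -det
        pivot_value = aug[col][col]
        det *= pivot_value
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return det, [row[n:] for row in aug]


def ml_fit(S, Sigma):
    """F = log|Sigma| + trace(S Sigma^-1) - log|S| - k, with k observed variables."""
    k = len(S)
    det_S, _ = det_and_inverse(S)
    det_Sigma, Sigma_inv = det_and_inverse(Sigma)
    trace = sum(S[i][j] * Sigma_inv[j][i] for i in range(k) for j in range(k))
    return math.log(det_Sigma) + trace - math.log(det_S) - k


S = [[2.5, 1.6], [1.6, 1.68]]            # hypothetical sample matrix
perfect = ml_fit(S, S)                   # implied matrix reproduces S exactly
independence = ml_fit(S, [[2.5, 0.0], [0.0, 1.68]])  # covariances forced to 0
```

F is zero when the implied matrix reproduces S and grows as the two matrices diverge, which is what the iterative estimation minimizes.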

Convergence -- Reaching Limit

Minimize F while adjusting unknown parameters through an iterative process
Convergence value: F difference between the last two iterations
Default convergence = .0001
Increase to help convergence (0.001 or 0.01)
e.g. Analysis: convergence = .01;

No Convergence

No unique parameter estimates
Lack of degrees of freedom (under-identification)
Variance of reference indicator too small
Fixed parameters are left to be freely estimated
Misspecified model

Absolute Fit Index

$\chi^2 = F(N-1)$ (N = sample size)
df = p(p+1)/2 - q
p = number of observed variables, so p(p+1)/2 = number of variances and covariances (add the p means if a mean structure is included)
q = number of unknown parameters to be estimated
prob = ? (Nonsignificant $\chi^2$ indicates good fit. Why?)
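The arithmetic behind the absolute fit test can be sketched in Python. The example below is a minimal illustration using only the standard library; the function names, the two-factor CFA parameter count, and the values of F and N are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square value for integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over j < df/2 of (x/2)**j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: normal tail plus a finite series.
    term, total = 1.0, 0.0
    for j in range(1, (df - 1) // 2 + 1):
        if j > 1:
            term *= x / (2 * j - 1)
        total += term
    return (math.erfc(math.sqrt(half))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total)


def model_df(n_observed, n_free_parameters, mean_structure=False):
    """Pieces of sample information minus number of estimated parameters."""
    information = n_observed * (n_observed + 1) // 2
    if mean_structure:
        information += n_observed
    return information - n_free_parameters


# Hypothetical two-factor CFA with 6 indicators and 13 free parameters
# (4 loadings, 6 error variances, 2 factor variances, 1 factor covariance).
df = model_df(n_observed=6, n_free_parameters=13)
chi_square = 0.05 * (301 - 1)   # F = 0.05 and N = 301 are made-up values
p_value = chi2_sf(chi_square, df)
```

A p value above .05 here would mean that the implied matrix does not differ significantly from S, which is the sense in which a nonsignificant test indicates good fit.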

Relative Fit: Relative to Baseline (Null) Model

Fix all unknown parameters at 0
Variables not related (all relational parameters = 0)
Model-implied covariances (off-diagonal elements of Σ) = 0
Fit to sample covariance matrix S
Obtain $\chi^2$, df, prob < .0001

Relative Fit Indices

$CFI = 1 - (\chi^2 - df)/(\chi^2_b - df_b)$
  b = baseline model
  Comparative Fit Index, desirable >= .95; 95% better than b model
$TLI = (\chi^2_b/df_b - \chi^2/df) / (\chi^2_b/df_b - 1)$
  Tucker-Lewis Index, desirable >= .90
$RMSEA = \sqrt{(\chi^2 - df)/(N \cdot df)}$ (N = sample size)
  Root Mean Square Error of Approximation, desirable <= .06
  Penalizes a large model with more unknown parameters
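The three indices are simple functions of the chi-square values and degrees of freedom of the target and baseline models. A minimal Python sketch follows; the input values are hypothetical, and truncating negative values of chi-square minus df at zero is a common convention assumed here rather than something stated in the slides.

```python
import math


def cfi(chi2, df, chi2_b, df_b):
    """Comparative Fit Index: 1 - (chi2 - df) / (chi2_b - df_b)."""
    numerator = max(chi2 - df, 0.0)          # truncation at 0 is an assumption
    denominator = max(chi2_b - df_b, numerator)
    return 1.0 if denominator == 0 else 1.0 - numerator / denominator


def tli(chi2, df, chi2_b, df_b):
    """Tucker-Lewis Index: (chi2_b/df_b - chi2/df) / (chi2_b/df_b - 1)."""
    ratio_b = chi2_b / df_b
    return (ratio_b - chi2 / df) / (ratio_b - 1.0)


def rmsea(chi2, df, n):
    """Root Mean Square Error of Approximation: sqrt((chi2 - df) / (n * df))."""
    return math.sqrt(max(chi2 - df, 0.0) / (n * df))


# Hypothetical results: target model vs. baseline model, N = 301.
fit = {
    "CFI": cfi(15.0, 8, 500.0, 15),
    "TLI": tli(15.0, 8, 500.0, 15),
    "RMSEA": rmsea(15.0, 8, 301),
}
```

With these made-up numbers all three indices fall on the desirable side of the cutoffs listed above.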

Absolute Fit -- SRMR

Standardized Root Mean Square Residual
SRMR = difference between observed and implied covariances in standardized metric
Desirable when < .08 (a commonly cited cutoff), but no consensus
Does not penalize for number of model parameters, unlike RMSEA

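Special Case A

[Figure: path diagram in which Sex predicts two latent variables, Verbal Aggression (indicators t4a3, t4a93, t4a94) and Physical Aggression (indicators t4a37, t4a57, t4a90), with measurement errors e1 to e6 and disturbances d1 and d2]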

Special Case A

Assumption: $x = \xi$
$y = \tau_y + \Lambda_y \eta + \varepsilon$
$\eta = B\eta + \Gamma x + \zeta$

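Special Case B

[Figure: path diagram in which two latent variables, Verbal Aggression (indicators x1 to x3, errors e1 to e3) and Physical Aggression (indicators x4 to x6, errors e4 to e6), predict the observed variable Peer Status, with disturbance d]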

Special Case B

Assumption: $y = \eta$
$x = \tau_x + \Lambda_x \xi + \delta$
$y = B\eta + \Gamma\xi + \zeta$

Other Special Cases of SEM

Confirmatory Factor Analysis (measurement model only)
Multiple & Multivariate Regression
ANOVA / MANOVA (multigroup CFA)
ANCOVA
Path Analysis Model (no latent variables)
Simultaneous Econometric Equations
Growth Curve Modeling
...

EFA vs. CFA

[Figure: two panels, Exploratory Factor Analysis and Confirmatory Factor Analysis, each showing Factor 1 and Factor 2 with indicators x1 to x6 and errors e1 to e6]

Multiple Regression

[Figure: path diagram with predictors x1, x2, x3, outcome Y, and residual e1]

ANCOVA

[Figure: path diagram with Group, Pretest1, and Pretest2 as predictors of Posttest1 and Posttest2, with residuals e1 and e2]

Multivariate Normality Assumption

Observed data are summed up perfectly by the covariance matrix S (+ means M); S thus is an estimator of the population covariance matrix

Consequences of Violation

Inflated $\chi^2$ & deflated CFI and TLI -> reject plausible models
Inflated standard errors -> attenuate factor loadings and structural parameters
(Cause: Sample covariances were underestimated)

Accommodating Strategies

Correcting fit
  Satorra-Bentler scaled $\chi^2$ & standard errors (estimator = mlm; in Mplus)
Correcting standard errors
  Bootstrapping
Transforming nonnormal variables
  Transforming into new normal indicators (undesirable)
SEM with categorical variables

Satorra-Bentler Scaled $\chi^2$ & SE

S-B $\chi^2$ = $d^{-1}$ (ML-based $\chi^2$) (d = scaling factor that incorporates kurtosis)
Effect: performs well with continuous data in terms of $\chi^2$, CFI, TLI, RMSEA, parameter estimates and standard errors
Also works with certain categorical variables (see next slide)
Analysis: estimator = MLM;

Workable Categorical Data

[Figure: distribution of a five-category variable (categories 1 to 5)]

Nonworkable Categorical Data

[Figure: distribution of a three-category variable (categories 1 to 3)]

Bootstrapping

Original    btstrp1    btstrp2    ...
 x   y       x   y      x   y
 1   5       5   3      1   3
 2   4       1   1      5   4
 3   3       3   2      4   1
 4   2       4   5      2   2
 5   1       2   4      3   5
 .   .       .   .      .   .
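The resampling idea can be sketched in a few lines of Python. This is a minimal illustration that resamples whole cases with replacement and uses the spread of a statistic across resamples as its standard error; the function names, the seed, and the choice of the mean of x as the statistic are assumptions for the example, not the procedure Mplus runs internally.

```python
import random
import statistics


def bootstrap_samples(rows, n_samples, seed=0):
    """Draw n_samples resamples of the same size as rows, with replacement."""
    rng = random.Random(seed)
    n = len(rows)
    return [[rows[rng.randrange(n)] for _ in range(n)] for _ in range(n_samples)]


def bootstrap_se(rows, statistic, n_samples=500, seed=0):
    """Standard deviation of the statistic across bootstrap resamples."""
    estimates = [statistic(sample)
                 for sample in bootstrap_samples(rows, n_samples, seed)]
    return statistics.stdev(estimates)


def mean_x(rows):
    """Statistic of interest in this sketch: the mean of the x column."""
    return sum(x for x, _ in rows) / len(rows)


original = [(1, 5), (2, 4), (3, 3), (4, 2), (5, 1)]
samples = bootstrap_samples(original, n_samples=500)
se_mean_x = bootstrap_se(original, mean_x, n_samples=500)
```

Because every resample is drawn from the observed cases only, the method treats the sample as if it were the population.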

Limitation of Bootstrapping

Assumption: Sample = Population
Useful diagnostic tool
Does not compensate for
  small or unrepresentative samples
  severe non-normality
  absence of independent samples for cross-validation
Analysis: Bootstrap = 500 (standard/residual);
Output: stand cinterval;

Examining Group Differences in Latent Variables (MANOVA)

$X_{g1} = \tau_{g1} + \Lambda_{g1}\xi_{g1} + \delta_{g1}$
$X_{g2} = \tau_{g2} + \Lambda_{g2}\xi_{g2} + \delta_{g2}$
$X_{g1} - X_{g2} = (\tau_{g1} - \tau_{g2}) + (\Lambda_{g1}\xi_{g1} - \Lambda_{g2}\xi_{g2}) + (\delta_{g1} - \delta_{g2})$

Imposing equality constraints on Λ and using items with invariant loadings:
$X_{g1} - X_{g2} = (\tau_{g1} - \tau_{g2}) + \Lambda(\xi_{g1} - \xi_{g2}) + (\delta_{g1} - \delta_{g2})$

Given mean of δ = 0, and assigning $\xi_{g1} = 0$:
$X_{g1} - X_{g2} = (\tau_{g1} - \tau_{g2}) - \Lambda\xi_{g2}$

Measurement Invariance (hierarchical restrictions)

Configural invariance: same items
Metric factorial invariance
  Weak: additional invariant loadings (Λ)
  Strong: additional invariant intercepts (τ)
  Strict: additional invariant error variances (variance of δ)
(Steven & Reise, 1997)

Partial Invariance

Majority of factor loadings invariant
Variant factor loadings are allowed to be freely estimated across groups

Two Applications of the Invariance Test

Develop unbiased test
Examine group difference in latent variables

Advantages of Multigroup Analysis

Test all parameters across groups
Allow invariant variances across groups
Large sample sizes
How large is large enough? (Muthén & Muthén, 2002)

MIMIC Model

[Figure: path diagram in which covariates x1, x2, x3 predict a latent factor F measured by indicators y1 to y4, with errors e1 to e4]

MIMIC Model for Examining Group Difference

MIMIC = multiple indicators, multiple causes
Indicators = functions of latent variable
Controlling for the latent variable, a covariate should have no effects on indicators
Significant covariate effects = biases in the levels

Assumptions of MIMIC Model

Invariant factor loadings across subgroups
Invariant variances (latent & observed)
Small sample size

Mplus

www.statmodel.com

Multiple Programs Integrated

SEM of both continuous and categorical variables
Multilevel modeling
Mixture modeling (identify hidden groups)
Complex survey data modeling (stratification, clustering, weights)
Modern missing data treatment
Monte Carlo simulations

Types of Mplus Files

Data (*.dat, *.txt)
Input (specify a model, <= 80 columns/line)
Output (automatically produced)
Plot

Data File Format

Free
  Delimited by tab, space, or comma
  No blank missing values (use a numeric missing-value code)
  Default in Mplus
  Computationally slow with large data sets
Fixed
  Format = 3F3, 5F3.2, F5.1;

Mplus Input

DATA: File = ?
VARIABLE: Names = ?; Usevar = ?; Categ = ?;
ANALYSIS: Type = ?
MODEL: (BY, ON, WITH)
OUTPUT: Stand;

Model Specification in Mplus

BY      Measured by      (F by x1 x2 x3 x4)
ON      Regressed on     (y on x)
WITH    Correlated with  (x with y)
XWITH   Interact with    (inter | F1 xwith F2)
PON     Pair ON          (y1 y2 pon x1 x2 = y1 on x1; y2 on x2)
PWITH   Pair WITH        (x1 x2 pwith y1 y2 = x1 with y1; x2 with y2)

Default Specification

Error or residual (disturbance)
Covariance of exogenous variables in CFA
Certain covariances of residuals (z2)

[Figure: residuals z1 and z2]

Practice

Prepare two data files for Mplus
  Mediation.sav
  Aggress.sav
Model specification
Single group CFA
Examine mediation effects in a full SEM
Run a MIMIC model of aggressions
Multigroup CFA to examine measurement invariance

SPSS Data

Missing values?
  Leave as blank to use fixed format
  Recode into a special number to use free format
Save as & choose file type
  Fixed ASCII
  Free *.dat (with or without variable names?)
Copy & paste variable names into Mplus input file
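The recoding step for free format can also be scripted. The sketch below is a minimal Python illustration that writes cases as space-delimited text without a header row and replaces missing values with a numeric code; the function name, the code -999, and the toy records are assumptions, and whichever code is used would still have to be declared as missing in the Mplus input file.

```python
def to_mplus_free_format(rows, missing_code=-999):
    """Return space-delimited text: one case per line, no variable names.

    None marks a missing value and is replaced by missing_code, because
    free format cannot represent a missing value as a blank.
    """
    lines = []
    for row in rows:
        values = [missing_code if value is None else value for value in row]
        lines.append(" ".join(str(value) for value in values))
    return "\n".join(lines) + "\n"


# Hypothetical records: three cases, three variables, two missing values.
rows = [(1, 3.5, None), (2, None, 4), (3, 2.0, 5)]
text = to_mplus_free_format(rows)
```

Writing `text` to a file such as `mydata.dat` (a hypothetical name) gives a file that can be read in free format.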

Stata2mplus

Converting a Stata data file to *.dat
Find out: http://www.ats.ucla.edu/stat/stata/faq/stata2mplus.htm

Graphic Model

[Figure: path diagram with five latent variables: F1 (y1 to y3), F2 (y4 to y6), F3 (y7 to y9), F4 (y10 to y12), F5 (y13 to y15), and disturbances d3, d4, d5]

Model Specification

Model:
  f1 by y1-y3;
  f2 by y4-y6;
  f3 by y7-y9;
  f4 by y10-y12;
  f5 by y13-y15;
  f3 on f1 f2;
  f4 on f2;
  f5 on f2 f3 f4;

Measurement errors are au...

Modification Indices

Lower-bound estimate of the expected chi-square decrease when freely estimating a parameter fixed at 0
Mplus Output: stand Mod(10);
Start with least important parameters (covariance of errors)
Caution: justification?

Indirect (Mediation) Effect

A*B

Mplus specification:
Model Indirect: DV IND Mediator IV;
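The A*B product is easy to check by hand. The Python sketch below multiplies the two path estimates and, as an addition not taken from the slides, computes a first-order (Sobel) standard error for the product; all numbers are hypothetical.

```python
import math


def indirect_effect(a, b):
    """Indirect effect as the product of path A (IV -> mediator) and path B (mediator -> DV)."""
    return a * b


def sobel_se(a, se_a, b, se_b):
    """First-order (Sobel) standard error of a * b; an assumption, not from the slides."""
    return math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)


# Hypothetical estimates and standard errors.
a, se_a = 0.50, 0.10   # IV -> Mediator
b, se_b = 0.40, 0.08   # Mediator -> DV

ab = indirect_effect(a, b)
z = ab / sobel_se(a, se_a, b, se_b)
```

Because the product of two estimates is usually not normally distributed, bootstrapped confidence intervals are often preferred over this z test.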

Model Comparison

Model:
  Probabilistic statement about the relations of variables
  Imperfect but useful
Models differ:
  Different variables and different relations
  Same variables but different relations

Nested Model

A nested model (b) comes from the general model (a) by
  Removing a parameter (e.g. a path)
  Fixing a parameter at a value (e.g. 0)
  Constraining a parameter to be equal to another
Both models have the same variables

Equality Constraints in Mplus

Parameter labels:
  Numbers
  Letters
  Combination of numbers and letters
Constraint (B=A)
  F3 on F1 (A);
  F3 on F2 (A);

Test If A=B

[Figure: the five-factor path diagram (F1 to F5 with indicators y1 to y15 and disturbances d3, d4, d5), with two structural paths labeled A and B]

Model Comparison via $\chi^2$ Difference

  $\chi^2$ = ___   df = ___   (Nested model)
  $\chi^2$ = ___   df = ___   (Default model)
___________________________________
  $\chi^2_{dif}$ = ___   $df_{dif}$ = ___   p = ? (a single tail)

Find p value at the following website:
http://www.tutor-homework.com/statistics_tables/statistics_tables.html

Conclusion:
If p > .05, there is no significant difference in fit between the default model and the nested model; that is, the hypothesis that the constrained parameters are equal is not rejected.
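The p value for the difference test can also be computed directly instead of being looked up. The following Python sketch uses only the standard library; the chi-square values and degrees of freedom are hypothetical, and the function names are illustrative.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square value for integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over j < df/2 of (x/2)**j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= half / j
            total += term
        return math.exp(-half) * total
    # Odd df: normal tail plus a finite series.
    term, total = 1.0, 0.0
    for j in range(1, (df - 1) // 2 + 1):
        if j > 1:
            term *= x / (2 * j - 1)
        total += term
    return (math.erfc(math.sqrt(half))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-half) * total)


def chi2_difference(chi2_nested, df_nested, chi2_default, df_default):
    """Return (difference in chi-square, difference in df, p value)."""
    diff = chi2_nested - chi2_default
    df_diff = df_nested - df_default
    return diff, df_diff, chi2_sf(diff, df_diff)


# Hypothetical output: one equality constraint adds one degree of freedom.
diff, df_diff, p = chi2_difference(chi2_nested=20.5, df_nested=9,
                                   chi2_default=15.0, df_default=8)
```

With these made-up values p falls below .05, so the equality constraint would be rejected.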

Other Comparison Criteria

$\Delta AIC = \chi^2_1 - \chi^2_2 - 2(df_1 - df_2) = \Delta\chi^2 - 2(\Delta df)$ (as in the $\chi^2_{dif}$ test)
BIC
Smaller is better
Difference > 2

Practice

Test if effect A=B

Run CFA with Real Data

[Figure: two-factor CFA with Verbal Aggression (indicators a3, a93, a94; errors e1 to e3) and Physical Aggression (indicators a37, a57, a90; errors e4 to e6)]

Multigroup Analysis

VARIABLE:
  USEVAR = X1 X2 X3 X4;
  Grouping IS sex (0=F 1=M);
ANALYSIS:
  TYPE = MISSING H1;
MODEL:
  F1 BY X1-X4;
MODEL M:
  F1 BY X2-X4;

Note: sex is grouping variable and is not used in the model.

Test Measurement Invariance: Default Model

Model:
  F1 By a3
        a93 (1)
        a94 (2);
  F2 By a37
        a57 (3)
        a90 (4);
Model M:
  F1 By a93 ()
        a94 ();
  F2 By a57 ()
        a90 ();
Output: stand;

Note: Reference indicators in the second group are omitted.

Test Measurement Invariance: Constrained Model

Model:
  F1 By a3
        a93 (1)
        a94 (2);
  F2 By a37
        a57 (3)
        a90 (4);
Model M:
  F1 By a93 (1)
        a94 (2);
  F2 By a57 (3)
        a90 (4);
Output: stand;

Note: Reference indicators in the second group are omitted.

Estimate with Real Data

[Figure: MIMIC model in which Sex, Race1, and Race2 predict Verbal Aggression (indicators a3, a93, a94; errors e1 to e3) and Physical Aggression (indicators a37, a57, a90; errors e4 to e6), with disturbances d1 and d2]

Session II: SEM with Categorical Indicators

Problems of Ordinal Scales

Not truly interval measure of a latent dimension, having measurement errors
Limited range, biased against extreme scores
Items are equally weighted (implicitly by 1) when summed up or averaged, losing item sensitivity

Criticisms of Using Ordinal Scales as Measures of Latent Constructs

Stevens (1951): ...means should be avoided because its meaning could be easily interpreted beyond ranks.
Merbitz (1989): Ordinal scales and foundations of misinference
Muthén (1983): Pearson product moment correlations of ordinal scales will produce distorted results in structural equation modeling.
Wright (1998): “…misuses nonlinear raw scores or Likert scales as though they were linear measures will produce systematically distorted results. …It’s not only unfair, it is immoral.”

Assumption of Categorical Indicators

A categorical indicator is a coarse categorization of a normally distributed underlying dimension

Latent (Polychoric) Correlation

Categorization of Latent Dimension & Threshold

[Figure: a latent dimension Y cut by thresholds into response categories: No / Yes; Never / Sometimes / Often; 1 to 5; adjacent categories m-1 and m]

Threshold

The values of a latent dimension at which respondents have 50% probability of responding to two adjacent categories
Number of thresholds = response categories - 1, e.g. a binary variable has one threshold
Mplus specification: [x$1] [y$2];
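Under the assumption of a standard normal underlying dimension, sample thresholds follow from the cumulative category proportions. The Python sketch below is a minimal illustration for a single item with no model imposed; the function name is illustrative, and the counts are borrowed from the male row of the collapsed table shown later in this deck.

```python
from statistics import NormalDist


def thresholds_from_counts(counts):
    """Thresholds on a standard normal latent dimension for one ordinal item.

    Each threshold is the z value below which the cumulative proportion of
    responses falls; an item with k categories has k - 1 thresholds.
    Every cumulative proportion must lie strictly between 0 and 1.
    """
    total = sum(counts)
    cumulative = 0
    thresholds = []
    for count in counts[:-1]:
        cumulative += count
        thresholds.append(NormalDist().inv_cdf(cumulative / total))
    return thresholds


taus = thresholds_from_counts([60, 80, 43, 4])   # four categories
binary = thresholds_from_counts([50, 50])        # one threshold
```

Four categories give three thresholds, and an evenly split binary item has its single threshold at zero.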

Normal Cumulative Distributions

Measurement Models of Categorical Indicators (2P IRT)

Probit: $P(u = 1 \mid \eta) = \Phi[(-\tau + \lambda\eta)\,\theta^{-1/2}]$
  (Estimation = Weighted Least Squares, mean- and variance-adjusted: WLSMV)
Logistic: $P(u = 1 \mid \eta) = 1 / (1 + e^{-(-\tau + \lambda\eta)})$
  (Maximum Likelihood Estimation)
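Both response functions can be evaluated with the Python standard library. In the sketch below the parameter names follow the usual reading of the formulas (threshold `tau`, loading `lam`, residual variance `theta`, factor score `eta`); these names and the numeric values are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def probit_probability(eta, tau, lam, theta=1.0):
    """P(u = 1 | eta) under the probit link: Phi((-tau + lam * eta) / sqrt(theta))."""
    return NormalDist().cdf((-tau + lam * eta) / math.sqrt(theta))


def logistic_probability(eta, tau, lam):
    """P(u = 1 | eta) under the logit link: 1 / (1 + exp(-(-tau + lam * eta)))."""
    return 1.0 / (1.0 + math.exp(-(-tau + lam * eta)))


tau, lam = 0.4, 0.8                  # hypothetical item parameters
crossing = tau / lam                 # factor score where the item "flips"
p_probit = probit_probability(crossing, tau, lam)
p_logit = logistic_probability(crossing, tau, lam)
```

At a factor score of tau / lam both functions return .5, which matches the definition of a threshold as the point of 50% probability.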

Converting CFA to IRT Parameters

Probit conversion
  $a = \lambda\,\theta^{-1/2}$
  $b = \tau / \lambda$
Logit conversion
  $a = \lambda / D$ (D = 1.7)
  $b = \tau / \lambda$
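The conversions amount to two divisions per item. The Python sketch below assumes a factor with mean 0 and variance 1, reads the lost symbols as loading (`lam`), threshold (`tau`), and residual variance (`theta`), and uses hypothetical parameter values.

```python
import math


def probit_to_irt(lam, tau, theta):
    """Discrimination a = lam / sqrt(theta) and difficulty b = tau / lam (probit)."""
    return lam / math.sqrt(theta), tau / lam


def logit_to_irt(lam, tau, scaling=1.7):
    """Discrimination a = lam / D and difficulty b = tau / lam (logit), with D = 1.7."""
    return lam / scaling, tau / lam


a_probit, b_probit = probit_to_irt(lam=0.8, tau=0.4, theta=0.36)
a_logit, b_logit = logit_to_irt(lam=1.7, tau=0.85)
```

The difficulty b is the factor score at which the probability of endorsing the item is .5, so it has the same form under both links.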

Sample Information

Latent correlation matrix
  equivalent to covariance matrix of continuous indicators
Threshold matrix Δ
  equivalent to means of continuous indicators

One-Parameter Item Response Theory Model

Analysis: Estimator = ML;
Model:
  F by X1@1.7
       X2@1.7
       ...
       Xn@1.7;

Stages of Estimation

Sample information: correlations/thresholds/intercepts (Maximum Likelihood)
Correlation structure (Weighted Least Squares):
$F = \sum_{g=1}^{G} (s^{(g)} - \sigma^{(g)})' \, {W^{(g)}}^{-1} (s^{(g)} - \sigma^{(g)})$

$W^{-1}$ matrix

Elements:
  S1: intercepts or/and thresholds
  S2: slopes
  S3: residual variances and correlations
$W^{-1}$: divided by sample size

Estimation

WLSMV:
Weighted Least Squares estimation with a Mean- and Variance-adjusted chi-square test statistic

Baseline Model

Freely estimated thresholds of all the categorical indicators
df = $p^2$ - 3p (p = number of polychoric correlations)

Multigroup Analysis

VARIABLE:
  USEVAR = X1 X2 X3 X4;
  Grouping IS sex (0=F 1=M);
ANALYSIS:
  TYPE = MISSING H1;
MODEL:
  F1 BY X1-X4;
MODEL M:
  F1 BY X2-X4;

Data Preparation Tip

Categorical indicators are required to have consistent response categories across groups
Run Crosstab to identify zero cells
Recode variables to collapse certain categories to eliminate zero cells

Inconsistent Categories

          1     2     3     4     5
Male     60    80    43     4     0
Female   57    86    32    16     2

          1     2     3     4
Male     60    80    43     4
Female   57    86    32    18
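The check for zero cells and the collapsing step can be sketched in Python using the counts from the two tables above. The function names are illustrative; in practice the recoding would be applied to the raw data before it is saved for Mplus.

```python
def zero_cells(counts_by_group):
    """List (group, category) pairs whose count is zero; categories are 1-based."""
    return [(group, index + 1)
            for group, counts in counts_by_group.items()
            for index, count in enumerate(counts)
            if count == 0]


def collapse_categories(counts_by_group, merge_from):
    """Merge every category at position merge_from (0-based) or higher into one."""
    return {group: counts[:merge_from] + [sum(counts[merge_from:])]
            for group, counts in counts_by_group.items()}


table = {"Male": [60, 80, 43, 4, 0], "Female": [57, 86, 32, 16, 2]}
problems = zero_cells(table)                            # category 5 is empty for males
collapsed = collapse_categories(table, merge_from=3)    # fold category 5 into 4
```

After collapsing, both groups use the same four categories and no cell is empty.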

Test Measurement Invariance: Default Model

Model:
  F1 By a3
        a93 (1)
        a94 (2);
  F2 By a37
        a57 (3)
        a90 (4);
Model M:
  F1 By a93 ()
        a94 ();
  F2 By a57 ()
        a90 ();
Output: stand;
Savedata: difftest is agg.dat;

Specify Dependent Variables as Categorical

Variable:
  Categ = x1-x3;
  Categ = all;

Model Comparison with Categorical Dependent Variables

1. Run the less restrictive model with the following at the end of the input file:
   Savedata: difftest is test.dat;
2. Run the nested model, with equality constraint(s) on parameter(s), with the following in the input file:
   Analysis: difftest is test.dat;
3. Examine the chi-square difference test in the output of the nested model

Test Measurement Invariance: Nested Model

Analysis:
  type = missing h1;
  difftest is agg.dat;
Model:
  F1 By a3
        a93 (1)
        a94 (2);
  F2 By a37
        a57 (3)
        a90 (4);
Model M:
  F1 By a93 (1)
        a94 (2);
  F2 By a57 (3)
        a90 (4);
Output: stand;

Reporting Results

Guidelines:
Conceptual model
Software + version
Data (continuous or categorical?)
Treatment of missing values
Estimation method
Model fit indices ($\chi^2$(df), p, CFI, TLI, RMSEA)
Measurement properties (factor loadings + reliability)
Structural parameter estimates (estimate, significance, 95% confidence intervals), e.g. (β = .23*, CI = .18~.28)

Reliability of Categorical Indicators (variance approach)

$\omega = (\sum\lambda_i)^2 / [(\sum\lambda_i)^2 + \sum\psi_i^2]$, where
$(\sum\lambda_i)^2$ = square of the sum of standardized factor loadings
$\sum\psi_i^2$ = sum of residual variances
i = item or indicator
$\psi_i^2 = 1 - \lambda_i^2$

McDonald, R. P. (1999). Test theory: A unified treatment (p.89) Mahwah, New Jersey: Lawrence Erlbaum Associates.
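The reliability formula above needs only the standardized factor loadings, because each residual variance is one minus the squared loading. A minimal Python sketch with hypothetical loadings:

```python
def reliability_from_loadings(standardized_loadings):
    """Reliability = (sum of loadings)^2 / [(sum of loadings)^2 + sum of residual variances].

    Each residual variance is taken as 1 - loading**2, which holds only for
    standardized loadings.
    """
    sum_squared = sum(standardized_loadings) ** 2
    residual_sum = sum(1.0 - lam ** 2 for lam in standardized_loadings)
    return sum_squared / (sum_squared + residual_sum)


# Hypothetical standardized loadings for a three-item factor.
reliability = reliability_from_loadings([0.7, 0.8, 0.6])
```

Higher loadings raise the numerator and shrink the residual variances at the same time, so reliability increases quickly as loadings improve.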

Calculator of Reliability (Categorical Indicators)

SPSS reliability data
SPSS reliability syntax

Interactions in SEM

Observed or latent
Categorical or continuous
Nine possible combinations
Treatment: see User's Guide

Troubleshooting Strategy

Start with one part of a big model
Ensure every part works
Estimate all parts simultaneously

Important Resources

Mplus website: www.statmodel.com
Papers: http://www.statmodel.com/papers.shtml
Mplus discussions: http://www.statmodel.com/cgi-bin/discus/discus.cgi
