Structural equation modeling with R

R Users DC, Monday, February 11, 2013, 6:00 PM

Raj Kanungo, Decision Sciences Department, George Washington University

Outline

• What structural equation modeling (SEM) is
  – Example
  – Evolution
  – Need for SEM
• Commercial software for SEM
• R packages used for SEM
• Case study
  – Introduction and motivation
  – The modeling process and our model
  – Data
  – R commands
  – Output (including structure diagrams)
  – Interpretation
• Other models that can be analyzed using SEM

Measuring brand equity (Lieberman, 2010)

[Figure: path diagram of a skincare brand equity model. Constructs: Skincare benefits, Product bouquet, Personal indulgence, Quality and value, and Skincare brand equity. Indicators shown: Does not dry out skin; Cleans well; Leaves skin soft & smooth; Does not leave skin itchy; Is for everyday use; Skincare purchase intent; Skincare/brand rating; Recommend skincare; Fun to use; Long lasting fragrance; Makes you smell great; Color appears natural; Is relaxing; Special experience; Has calming effect; Keeps skin looking young; Makes a great gift; Proud to display in bathroom; Costs more, but worth it; Made from natural ingredients.]

SEM defined (Ullman, 2006)

• SEM is a collection of statistical techniques that allow a set of relations between one or more independent variables (IVs), either continuous or discrete, and one or more dependent variables (DVs), either continuous or discrete, to be examined.

• SEM is also referred to as causal modeling, causal analysis, simultaneous equation modeling, and analysis of covariance structures.

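Family tree of SEM (Henseler, 2010)

[Figure: two path diagrams, one with variables X1–X4 and Y1 and one with variables X1–X4, Y1, and Y2, shown alongside the factor-model equations below.]

\[
\begin{aligned}
\text{Measure}_1 &= \lambda_{11}F_1 + \lambda_{12}F_2 + \lambda_{13}F_3 + \lambda_{14}F_4 + e_1 \\
\text{Measure}_2 &= \lambda_{21}F_1 + \lambda_{22}F_2 + \lambda_{23}F_3 + \lambda_{24}F_4 + e_2 \\
\text{Measure}_3 &= \lambda_{31}F_1 + \lambda_{32}F_2 + \lambda_{33}F_3 + \lambda_{34}F_4 + e_3 \\
\text{Measure}_4 &= \lambda_{41}F_1 + \lambda_{42}F_2 + \lambda_{43}F_3 + \lambda_{44}F_4 + e_4
\end{aligned}
\]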
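The factor equations above can be read as a recipe for generating data: each measure is a weighted sum of the factors plus its own error. The base R sketch below simulates four measures from four factors under that reading. The loading values, the error standard deviation, the sample size, and the assumption of uncorrelated standard-normal factors are all hypothetical choices made for illustration, not values from the slides.

```r
set.seed(1)
n <- 2000                         # hypothetical sample size

# Hypothetical loading matrix: rows are Measure1..Measure4, columns are F1..F4
Lambda <- matrix(c(0.8, 0.1, 0.0, 0.0,
                   0.1, 0.7, 0.0, 0.1,
                   0.0, 0.0, 0.9, 0.1,
                   0.0, 0.1, 0.1, 0.6),
                 nrow = 4, byrow = TRUE)

scores <- matrix(rnorm(n * 4), n, 4)            # factor scores F1..F4 (assumed uncorrelated)
errors <- matrix(rnorm(n * 4, sd = 0.5), n, 4)  # unique errors e1..e4

# Measure_i = lambda_i1 * F1 + ... + lambda_i4 * F4 + e_i, for every row at once
measures <- scores %*% t(Lambda) + errors
colnames(measures) <- paste0("Measure", 1:4)

# Covariance matrix implied by the model: Lambda Lambda' + diag(error variances)
implied <- Lambda %*% t(Lambda) + diag(0.25, 4)

round(cov(measures) - implied, 2)  # sample minus implied covariance; small if the model holds
```

Comparing the sample covariance matrix with the model-implied one is the same idea that covariance-based SEM uses when it evaluates fit.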

Why SEM (Haenlein and Kaplan, 2004)

• Overcome limitations of “first generation” techniques
  – the postulation of a simple model structure (at least in the case of regression-based approaches);
  – the assumption that all variables can be considered as observable; and
  – the conjecture that all variables are measured without error, which may limit their applicability in some research situations.

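What does an SEM look like?

[Figure: path diagram. Exogenous latent variables ξ1 and ξ2 are measured by indicators X1–X4 with errors δ1–δ4; the endogenous latent variable η1 is measured by Y1 and Y2 with errors ε1 and ε2 and has disturbance ζ1. The paths among the latent variables form the structural model; the paths between latent variables and their indicators form the measurement model.]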
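The symbols in the diagram appear to follow the usual LISREL-style notation. Under that assumption, the two parts of the model can be written as the equations below. The matrices Λx, Λy, and Γ are not named on the slide and are introduced here only for illustration.

```latex
\[
\begin{aligned}
x    &= \Lambda_x \xi + \delta        && \text{(measurement model for } X_1,\dots,X_4\text{)} \\
y    &= \Lambda_y \eta + \varepsilon  && \text{(measurement model for } Y_1, Y_2\text{)} \\
\eta &= \Gamma \xi + \zeta            && \text{(structural model)}
\end{aligned}
\]
```

Here ξ = (ξ1, ξ2)′ and η = η1, so Γ is a 1 × 2 matrix of structural coefficients.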

Commercial software for SEM

• AMOS (SPSS)
• CALIS, TCALIS (SAS)
• EQS
• LISREL
• Mplus
• Stata

R packages related to SEM

• sem
  – John Fox; the original SEM package in R; well documented
  – http://socserv.socsci.mcmaster.ca/jfox/Misc/sem/index.html
• lavaan
  – Yves Rosseel; “simpler”; well documented
  – http://lavaan.ugent.be/
• OpenMx
  – Developed by a team at the Human Dynamics Lab at UVA
  – Functionally rich, comprehensive
  – http://openmx.psyc.virginia.edu/wiki/main-page
• semPLS and plspm

Case study (Hair et al., 2009)

• Consider an organization, HBAT, that employs thousands of workers in different operations around the world and has initiated a research project to study the employee turnover problem. Based on published literature and some preliminary interviews with employees, an employee turnover study was designed to focus on five constructs. The five constructs are:
  – Job Satisfaction (JS) – reactions resulting from an appraisal of one’s job situation.
  – Organizational Commitment (OC) – the extent to which an employee identifies with and feels part of HBAT.
  – Staying Intentions (SI) – the extent to which an employee intends to continue working for HBAT and is not participating in activities that make quitting more likely.
  – Environmental Perceptions (EP) – beliefs an employee has about their day-to-day, physical working conditions.
  – Employee Attitudes toward Coworkers (AC) – attitudes an employee has toward the coworkers he/she interacts with on a regular basis.

Constructs and their definitions

Organizational Commitment
OC1 = My work at HBAT gives me a sense of accomplishment.
OC2 = I am willing to put in a great deal of effort beyond that normally expected to help HBAT be successful.
OC3 = I have a sense of loyalty to HBAT.
OC4 = I am proud to tell others that I work for HBAT.

Attitudes Towards Co-Workers
AC1 = How happy are you with the work of your coworkers?
AC2 = How do you feel about your coworkers?
AC3 = How often do you do things with your coworkers on your days off?
AC4 = Generally, how similar are your coworkers to you?

Environmental Perceptions
EP1 = I am very comfortable with my physical work environment at HBAT.
EP2 = The place I work in is designed to help me do my job better.
EP3 = There are few obstacles to make me less productive in my workplace.
EP4 = What term best describes your work environment at HBAT?

Steps in the modeling process

Modeling strategies
1. Competing Models
2. Confirmatory Modeling
3. Exploratory Modeling

The purist’s suggestion
1. Model specification
2. Model identification
3. Model estimation
4. Model evaluation
5. Model re-specification

Our model

[Figure: path diagram of the hypothesized model. Latent variables AC, EP, and OC; indicators AC1–AC4 and EP1–EP4 with error terms δ, and indicators OC1–OC4 with error terms ε.]
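For the identification step, a quick counting check compares the number of unique variances and covariances among the 12 indicators with the number of free parameters. The sketch below assumes the parameterization that the lavaan output later in the deck suggests: one loading per factor fixed to 1, OC regressed on EP and AC, and EP and AC allowed to covary. The breakdown into parameter types is this sketch's own bookkeeping, not something printed on the slides.

```r
p <- 12                               # observed indicators: AC1-AC4, EP1-EP4, OC1-OC4
unique_moments <- p * (p + 1) / 2     # distinct variances and covariances in S

free_params <- c(
  loadings             = 9,   # 12 loadings minus 3 fixed to 1 to set the factor scales
  residual_variances   = 12,  # one per indicator
  exogenous_variances  = 2,   # Var(EP) and Var(AC)
  exogenous_covariance = 1,   # Cov(EP, AC)
  regressions          = 2,   # OC on EP, OC on AC
  disturbance_variance = 1    # residual variance of OC
)

df <- unique_moments - sum(free_params)
c(moments = unique_moments, free = sum(free_params), df = df)
```

A positive difference (here 78 moments minus 27 parameters) is necessary but not sufficient for identification; it matches the number of free parameters and degrees of freedom reported in the lavaan output.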

Data preparation (skip)

library(foreign)
m

Data excerpt and hypothesis

Hypotheses
– H0: Σ = Σ(θ), the hypothesized model (our model)
– HA: Σ = S, the saturated model (i.e., the 'model' that estimates a separate parameter for every unique element of the covariance matrix)

AC1  AC2  AC3  AC4  EP1  EP2  EP3  EP4  OC1  OC2  OC3  OC4
  2    9    8    6    5    5    5    1    1    3    1    5
  6    7    4    4    0    0    0    4    5    0    3    0
  8    0   10    9    8    5    1    7    1    5    2    6
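Under maximum likelihood estimation, the two hypotheses are usually compared through the discrepancy between the sample covariance matrix S and the model-implied matrix Σ(θ). The standard form is sketched below; the symbols p (number of observed variables, 12 here), n, and T are notation introduced for this note and do not appear on the slide.

```latex
\[
F_{ML}(\theta) = \ln\lvert\Sigma(\theta)\rvert
  + \operatorname{tr}\!\bigl(S\,\Sigma(\theta)^{-1}\bigr)
  - \ln\lvert S\rvert - p ,
\qquad
T = n \, F_{ML}(\hat{\theta})
\]
```

Under H0, T is approximately chi-square distributed with degrees of freedom equal to the number of unique elements of S minus the number of free parameters. Whether n is the sample size N or N − 1 depends on the software convention.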

Specifying and estimating SEM

# Ref: Multivariate Data Analysis by Hair et al. & www.mvstats.com
library(foreign)
mydata
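A minimal sketch of how a model like this could be specified and fitted with lavaan follows. The model syntax is reconstructed from the structure of the lavaan output in this deck (three factors, OC regressed on EP and AC) and is an assumption, not the presenter's code. The data frame `sim` and the helper `make_items` are hypothetical stand-ins, since the study data are not included; the object name `my.fit` is taken from the path-diagram slide.

```r
# Model syntax in lavaan notation (assumed reconstruction, see lead-in)
model <- '
  # measurement model: the first indicator of each factor is fixed to 1 by default
  EP =~ EP1 + EP2 + EP3 + EP4
  AC =~ AC4 + AC3 + AC2 + AC1
  OC =~ OC1 + OC2 + OC3 + OC4
  # structural model
  OC ~ EP + AC
'

# Hypothetical stand-in data with the same variable names (not the study data)
set.seed(2013)
n  <- 400
EP <- rnorm(n)
AC <- 0.3 * EP + rnorm(n)
OC <- 0.5 * EP + 0.2 * AC + rnorm(n)

# Build four noisy indicators of a latent score, named prefix1..prefix4
make_items <- function(f, prefix) {
  items <- sapply(1:4, function(j) 0.8 * f + rnorm(length(f), sd = 0.6))
  colnames(items) <- paste0(prefix, 1:4)
  items
}
sim <- data.frame(make_items(EP, "EP"), make_items(AC, "AC"), make_items(OC, "OC"))

# Fit only if lavaan is installed
if (requireNamespace("lavaan", quietly = TRUE)) {
  library(lavaan)
  my.fit <- sem(model, data = sim)
  summary(my.fit, fit.measures = TRUE, standardized = TRUE)
}
```

With the real data, `sim` would be replaced by the data frame read in during data preparation; the estimates from the simulated data will not match the output shown on the following slides.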

lavaan output

lavaan (0.5-10) converged normally after 45 iterations

  Number of observations                           400

  Estimator                                         ML
  Minimum Function Chi-square                   92.559
  Degrees of freedom                                51
  P-value                                        0.000

Chi-square test baseline model:

  Minimum Function Chi-square                 2475.929
  Degrees of freedom                                66
  P-value                                        0.000

Full model versus baseline model:

  Comparative Fit Index (CFI)                    0.983
  Tucker-Lewis Index (TLI)                       0.978

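Let d = χ² − df, where df is the degrees of freedom of the corresponding model (null or proposed).

\[
\mathrm{CFI} = \frac{d_{\text{Null}} - d_{\text{Proposed}}}{d_{\text{Null}}}
\]

\[
\mathrm{TLI} = \frac{\chi^2_{\text{Null}}/df_{\text{Null}} - \chi^2_{\text{Proposed}}/df_{\text{Proposed}}}{\chi^2_{\text{Null}}/df_{\text{Null}} - 1}
\]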
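As a worked check, the two indices can be recomputed in base R from the chi-square values printed in the lavaan output above. The TLI line uses the usual form whose denominator is the null-model ratio minus 1; the variable names are this sketch's own.

```r
# Values copied from the lavaan output
chisq_model <- 92.559;   df_model <- 51   # proposed (user) model
chisq_null  <- 2475.929; df_null  <- 66   # baseline (null) model

d_model <- chisq_model - df_model
d_null  <- chisq_null  - df_null

cfi <- (d_null - d_model) / d_null
tli <- (chisq_null / df_null - chisq_model / df_model) /
       (chisq_null / df_null - 1)

round(c(CFI = cfi, TLI = tli), 3)   # 0.983 and 0.978, matching the output
```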

lavaan output

Loglikelihood and Information Criteria:

  Loglikelihood user model (H0)              -8179.946
  Loglikelihood unrestricted model (H1)      -8133.667

  Number of free parameters                         27
  Akaike (AIC)                               16413.893
  Bayesian (BIC)                             16521.662
  Sample-size adjusted Bayesian (BIC)        16435.990

Root Mean Square Error of Approximation:

  RMSEA                                          0.045
  90 Percent Confidence Interval          0.030  0.060
  P-value RMSEA
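The quantities in this block are tied together by simple arithmetic, which the base R sketch below reproduces from the printed values. The formulas are the standard likelihood-ratio, AIC, and BIC definitions; the RMSEA line assumes the convention that divides by N rather than N − 1, which gives the same value to three decimals here. Variable names are this sketch's own.

```r
# Values copied from the lavaan output
ll_h0 <- -8179.946   # loglikelihood, user model
ll_h1 <- -8133.667   # loglikelihood, unrestricted model
k     <- 27          # free parameters
n     <- 400         # observations
df    <- 51          # model degrees of freedom

chisq <- 2 * (ll_h1 - ll_h0)                  # likelihood-ratio chi-square
aic   <- -2 * ll_h0 + 2 * k
bic   <- -2 * ll_h0 + k * log(n)
sabic <- -2 * ll_h0 + k * log((n + 2) / 24)   # sample-size adjusted BIC
rmsea <- sqrt(max(chisq - df, 0) / (df * n))  # assumes division by N

round(c(chisq = chisq, AIC = aic, BIC = bic, SABIC = sabic, RMSEA = rmsea), 3)
```

Each result agrees with the printed output up to rounding of the loglikelihoods.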

lavaan output

Parameter estimates:

  Information                                 Expected
  Standard Errors                             Standard

                 Estimate  Std.err  Z-value  P(>|z|)   Std.lv  Std.all
Latent variables:
  EP =~
    EP1             1.000                               1.251    0.684
    EP2             1.042    0.075   13.807    0.000    1.303    0.801
    EP3             0.837    0.062   13.588    0.000    1.046    0.785
    EP4             0.926    0.066   14.088    0.000    1.158    0.825
  AC =~
    AC4             1.000                               1.313    0.816
    AC3             0.903    0.048   18.653    0.000    1.186    0.836
    AC2             1.078    0.059   18.216    0.000    1.414    0.820
    AC1             0.872    0.048   18.262    0.000    1.144    0.822
  OC =~
    OC1             1.000                               1.493    0.592
    OC2             1.275    0.104   12.263    0.000    1.904    0.872
    OC3             0.785    0.075   10.528    0.000    1.171    0.668
    OC4             1.155    0.095   12.115    0.000    1.725    0.841

lavaan output

                 Estimate  Std.err  Z-value  P(>|z|)   Std.lv  Std.all
Regressions:
  OC ~
    EP              0.538    0.080    6.746    0.000    0.451    0.451
    AC              0.214    0.062    3.457    0.001    0.188    0.188

Covariances:
  EP ~~
    AC              0.422    0.099    4.241    0.000    0.257    0.257

Variances:
    EP1             1.780    0.145                      1.780    0.532
    EP2             0.946    0.093                      0.946    0.358
    EP3             0.682    0.064                      0.682    0.384
    EP4             0.631    0.067                      0.631    0.320
    AC4             0.864    0.081                      0.864    0.334
    AC3             0.604    0.060                      0.604    0.300
    AC2             0.971    0.092                      0.971    0.327
    AC1             0.628    0.060                      0.628    0.324
    OC1             4.131    0.316                      4.131    0.649
    OC2             1.142    0.156                      1.142    0.240
    OC3             1.699    0.136                      1.699    0.553
    OC4             1.232    0.141                      1.232    0.293
    EP              1.565    0.213                      1.000    1.000
    AC              1.723    0.180                      1.000    1.000
    OC              1.601    0.267                      0.718    0.718
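For interpretation, the standardized columns can be reproduced from the unstandardized estimates. The base R sketch below uses the printed regression coefficients, factor variances, and covariance to compute the model-implied variance of OC, the standardized paths, the EP–AC correlation, and the share of OC variance explained. Variable names are this sketch's own, and the R² figure is derived here rather than printed on the slide.

```r
# Unstandardized estimates copied from the lavaan output
b_ep <- 0.538; b_ac <- 0.214     # OC ~ EP, OC ~ AC
var_ep <- 1.565; var_ac <- 1.723 # factor variances
cov_ep_ac <- 0.422               # EP ~~ AC
psi_oc <- 1.601                  # residual (disturbance) variance of OC

# Model-implied total variance of OC
var_oc <- b_ep^2 * var_ep + b_ac^2 * var_ac +
          2 * b_ep * b_ac * cov_ep_ac + psi_oc

std_b_ep  <- b_ep * sqrt(var_ep / var_oc)       # standardized path EP -> OC
std_b_ac  <- b_ac * sqrt(var_ac / var_oc)       # standardized path AC -> OC
cor_ep_ac <- cov_ep_ac / sqrt(var_ep * var_ac)  # correlation between EP and AC
r2_oc     <- 1 - psi_oc / var_oc                # variance in OC explained by EP and AC

round(c(sd_OC = sqrt(var_oc), EP = std_b_ep, AC = std_b_ac,
        cor = cor_ep_ac, R2 = r2_oc), 3)
```

The standardized residual variance of 0.718 for OC corresponds to roughly 28% of its variance being explained by EP and AC, with EP the stronger predictor.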

Drawing the SEM path diagram

library(semPlot)
semPaths(my.fit, title = FALSE, curvePivot = TRUE)

[Figure: semPaths path diagram with indicators EP1–EP4, AC4–AC1, and OC1–OC4 and latent variables EP, AC, and OC.]

library(OpenMx)
observed

library(sem)
mydata.cov
AC -> AC2, lam2, NA
AC -> AC3, lam3, NA
AC -> AC4, lam4, NA
EP -> EP1, NA, 1
EP -> EP2, lam6, NA
EP -> EP3, lam7, NA
EP -> EP4, lam8, NA
OC -> OC1, NA, 1
OC -> OC2, lam10, NA
OC -> OC3, lam11, NA
OC -> OC4, lam12, NA
AC -> OC, gam1, NA
EP -> OC, gam2, NA
AC <-> AC, phi1, NA
EP <-> EP, phi2, NA
AC <-> EP, phi, NA
OC <-> OC, psi, NA
AC1 <-> AC1, d1, NA
AC2 <-> AC2, d2, NA
AC3 <-> AC3, d3, NA
AC4 <-> AC4, d4, NA
EP1 <-> EP1, d5, NA
EP2 <-> EP2, d6, NA
EP3 <-> EP3, d7, NA
EP4 <-> EP4, d8, NA
OC1 <-> OC1, e1, NA
OC2 <-> OC2, e2, NA
OC3 <-> OC3, e3, NA
OC4 <-> OC4, e4, NA
mydata.sem

SEM can be used for

• CFA
• MTMM
• Regression
• 2SLS
• Latent growth models
• And many other applications

References

• Henseler, J. (2010). Covariance-based Structural Equation Modeling: Foundations and Applications, http://www.sensometric.org/Resources/Documents/2010/Meeting/Presentations/013-000-Henseler_2010.pdf

• Ullman, J. B. (2006). Structural Equation Modeling: Reviewing the Basics and Moving Forward, Journal of Personality Assessment, 87(1), 35-50.

• Haenlein, M. and Kaplan, A. M. (2004). A Beginner’s Guide to Partial Least Squares Analysis, Understanding Statistics, 3(4), 283-297.

• Pearl, J. (2000). The causal interpretation of structural equations (or SEM survival kit), http://bayes.cs.ucla.edu/BOOK-2K/jw.html

• Hair, J. F.; Black, W. C.; Babin, B. J. and Anderson, R. E. (2010). Multivariate Data Analysis, 7/E, Prentice Hall.

• Lieberman, M. (2010). Measure brand equity with structural equations modeling, http://www.mvsolution.com/wp-content/uploads/Brand-Equity-Structural-Equations-Model-by-Michael-Lieberman.pdf
