Structural equation modeling with R R Users DC, Monday, February 11, 2013, 6:00 PM Raj Kanungo Decision Sciences Department George Washington University

  • Outline

    • What structural equation modeling (SEM) is
      – Example
      – Evolution
      – Need for SEM
    • Commercial software for SEM
    • R packages used for SEM
    • Case study
      – Introduction and motivation
      – The modeling process and our model
      – Data
      – R commands
      – Output (including structure diagrams)
      – Interpretation
    • Other models that can be analyzed using SEM

  • Measuring brand equity (Lieberman, 2010)

    [Path diagram. Latent constructs: Skincare benefits, Product bouquet, Personal indulgence, Quality and value, Skincare brand equity.]

    Indicator labels shown in the diagram:
    – Does not dry out skin; Cleans well; Leaves skin soft & smooth; Does not leave skin itchy; Is for everyday use
    – Skincare purchase intent; Skincare/brand rating; Recommend skincare
    – Fun to use; Long lasting fragrance; Makes you smell great; Color appears natural
    – Is relaxing; Special Experience; Has calming effect; Keeps skin looking young
    – Makes a great gift; Proud to display in bathroom; Costs more, but worth it; Made from natural ingredients

  • SEM defined (Ullman, 2006)

    • SEM is a collection of statistical techniques that allow a set of relations between one or more independent variables (IVs), either continuous or discrete, and one or more dependent variables (DVs), either continuous or discrete, to be examined.

    • SEM is also referred to as causal modeling, causal analysis, simultaneous equation modeling, or analysis of covariance structures.


  • Family Tree of SEM (Henseler, 2010)

    [Diagrams: one model relating X1 to X4 to Y1; one model relating X1 to X4 to Y1 and Y2; and a factor model given by the equations below.]

    Measure1 = λ11F1 + λ12F2 + λ13F3 + λ14F4 + e1
    Measure2 = λ21F1 + λ22F2 + λ23F3 + λ24F4 + e2
    Measure3 = λ31F1 + λ32F2 + λ33F3 + λ34F4 + e3
    Measure4 = λ41F1 + λ42F2 + λ43F3 + λ44F4 + e4

  • Why SEM (Haenlein and Kaplan, 2004)

    • Overcome limitations of “first generation” techniques:
      – the postulation of a simple model structure (at least in the case of regression-based approaches);
      – the assumption that all variables can be considered as observable; and
      – the conjecture that all variables are measured without error, which may limit their applicability in some research situations.

  • What does a SEM look like?

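    [Path diagram. Structural model: latent variables ξ1 and ξ2, latent variable η1, and disturbance ζ1. Measurement model: indicators X1 to X4 with errors δ1 to δ4, and indicators Y1 and Y2 with errors ε1 and ε2.]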
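    The symbols in this diagram follow the usual LISREL-style notation, so the two parts can be written compactly as below. The coefficient matrices Λx, Λy, Γ, and B are standard notation assumed here, not symbols taken from the slide; with a single η, as drawn, B is zero.

```latex
% Measurement model: indicators as functions of latent variables plus error
x = \Lambda_x \xi + \delta, \qquad y = \Lambda_y \eta + \varepsilon

% Structural model: relations among the latent variables
\eta = B\eta + \Gamma\xi + \zeta

% For the diagram above (two \xi, one \eta):
\eta_1 = \gamma_{11}\xi_1 + \gamma_{12}\xi_2 + \zeta_1
```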

  • Commercial software for SEM

    • AMOS (SPSS)
    • CALIS, TCALIS (SAS)
    • EQS
    • LISREL
    • Mplus
    • STATA

  • R packages related to SEM

    • sem
      – John Fox; the original SEM package in R; well documented
      – http://socserv.socsci.mcmaster.ca/jfox/Misc/sem/index.html
    • lavaan
      – Yves Rosseel; “simpler”; well documented
      – http://lavaan.ugent.be/
    • OpenMx
      – Developed by a team at the Human Dynamics Lab at UVA
      – Functionally rich, comprehensive
      – http://openmx.psyc.virginia.edu/wiki/main-page
    • semPLS and plspm

  • Case study (Hair et al., 2009)

    • Consider an organization that employs thousands of workers in different operations around the world and has initiated a research project to study the employee turnover problem. Based on published literature and some preliminary interviews with employees, an employee turnover study was designed to focus on five constructs. The five constructs are:
      – Job Satisfaction (JS): reactions resulting from an appraisal of one’s job situation.
      – Organizational Commitment (OC): the extent to which an employee identifies and feels part of HBAT.
      – Staying Intentions (SI): the extent to which an employee intends to continue working for HBAT and is not participating in activities that make quitting more likely.
      – Environmental Perceptions (EP): beliefs an employee has about their day-to-day, physical working conditions.
      – Employee Attitudes toward Coworkers (AC): attitudes an employee has toward the coworkers he/she interacts with on a regular basis.

  • Constructs and their definitions

    Organizational Commitment
    – OC1 = My work at HBAT gives me a sense of accomplishment.
    – OC2 = I am willing to put in a great deal of effort beyond that normally expected to help HBAT be successful.
    – OC3 = I have a sense of loyalty to HBAT.
    – OC4 = I am proud to tell others that I work for HBAT.

    Attitudes Towards Co-Workers
    – AC1 = How happy are you with the work of your coworkers?
    – AC2 = How do you feel about your coworkers?
    – AC3 = How often do you do things with your coworkers on your days off?
    – AC4 = Generally, how similar are your coworkers to you?

    Environmental Perceptions
    – EP1 = I am very comfortable with my physical work environment at HBAT.
    – EP2 = The place I work in is designed to help me do my job better.
    – EP3 = There are few obstacles to make me less productive in my workplace.
    – EP4 = What term best describes your work environment at HBAT?

  • Steps in the modeling process

    Modeling strategies
    1. Competing Models
    2. Confirmatory Modeling
    3. Exploratory Modeling

    The purist’s suggestion
    1. Model specification
    2. Model identification
    3. Model estimation
    4. Model evaluation
    5. Model re-specification

  • Our model

    [Path diagram: latent constructs AC and EP predict OC. AC is measured by AC1 to AC4 and EP by EP1 to EP4, each with an error term δ; OC is measured by OC1 to OC4, each with an error term ε.]

  • Data preparation (skip)

    library(foreign)
    m

  • Data excerpt and hypothesis

    Hypotheses
    – H0: Σ = Σ(θ), the hypothesized model (our model)
    – HA: Σ = S, the saturated model (i.e. the 'model' that estimates a separate parameter for every unique element of the covariance matrix)

    Data excerpt

    AC1  AC2  AC3  AC4  EP1  EP2  EP3  EP4  OC1  OC2  OC3  OC4
      2    9    8    6    5    5    5    1    1    3    1    5
      6    7    4    4    0    0    0    4    5    0    3    0
      8    0   10    9    8    5    1    7    1    5    2    6
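    The saturated model spends one parameter on every unique element of the covariance matrix, so the degrees of freedom of the hypothesized model are that count minus its free parameters. The base-R sketch below does the bookkeeping. It assumes one marker loading per factor is fixed to 1 and that AC and EP are allowed to covary; the variable names are illustrative.

```r
# Counting sketch (base R only): degrees of freedom of the hypothesized model.
p <- 12                        # observed indicators: AC1-AC4, EP1-EP4, OC1-OC4
n_moments <- p * (p + 1) / 2   # unique variances and covariances in S

# Free parameters, assuming marker-variable identification (one loading per factor fixed to 1)
free_params <- c(
  loadings      = 9,   # 12 loadings minus the 3 fixed to 1
  residual_vars = 12,  # one error variance per indicator
  factor_vars   = 2,   # variances of the exogenous factors AC and EP
  factor_cov    = 1,   # covariance between AC and EP
  regressions   = 2,   # OC ~ EP and OC ~ AC
  disturbance   = 1    # residual variance of OC
)

df <- n_moments - sum(free_params)
cat("moments:", n_moments, " free parameters:", sum(free_params), " df:", df, "\n")
```

    A positive df is a necessary (not sufficient) condition for the model to be identified and testable against the saturated model.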

  • Specifying and estimating SEM

    # Ref: Multivariate Data Analysis by Hair et al. & www.mvstats.com
    library(foreign)
    mydata
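    The slide's own commands are cut off in this transcript. What follows is a minimal lavaan sketch of how such a model might be specified and fitted, not the presenter's code. The model syntax is reconstructed from the printed lavaan output, the object name my.fit comes from the semPaths call on a later slide, and the names my.model and pop.model are hypothetical. The original data-loading step is not recoverable, so mydata is simulated here from made-up population values.

```r
library(lavaan)

# Measurement model (=~) and structural model (~) in lavaan syntax.
# The first indicator of each factor acts as the marker (loading fixed to 1).
my.model <- '
  EP =~ EP1 + EP2 + EP3 + EP4
  AC =~ AC4 + AC3 + AC2 + AC1
  OC =~ OC1 + OC2 + OC3 + OC4
  OC ~ EP + AC
'

# ASSUMPTION: stand-in data simulated from invented population values,
# used only so that the example runs; these are not the HBAT data.
pop.model <- '
  EP =~ 1*EP1 + 1*EP2 + 0.8*EP3 + 0.9*EP4
  AC =~ 1*AC4 + 0.9*AC3 + 1*AC2 + 0.9*AC1
  OC =~ 1*OC1 + 1.2*OC2 + 0.8*OC3 + 1.1*OC4
  OC ~ 0.5*EP + 0.2*AC
  EP ~~ 0.4*AC
'
set.seed(2013)
mydata <- simulateData(pop.model, sample.nobs = 400)

# sem() adds the covariance between the exogenous factors EP and AC by default
my.fit <- sem(my.model, data = mydata)
summary(my.fit, fit.measures = TRUE, standardized = TRUE)
```

    With the real data in mydata, the same sem() and summary() calls would produce output of the kind shown on the following slides; the numbers from the simulated data will differ.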

  • lavaan output

    lavaan (0.5-10) converged normally after 45 iterations

      Number of observations                           400

      Estimator                                         ML
      Minimum Function Chi-square                   92.559
      Degrees of freedom                                51
      P-value                                        0.000

    Chi-square test baseline model:

      Minimum Function Chi-square                 2475.929
      Degrees of freedom                                66
      P-value                                        0.000

    Full model versus baseline model:

      Comparative Fit Index (CFI)                    0.983
      Tucker-Lewis Index (TLI)                       0.978

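    Let d = χ² − df, where df is the degrees of freedom of the model in question (null or proposed).

    $$\mathrm{CFI} = \frac{d_{\mathrm{Null}} - d_{\mathrm{Proposed}}}{d_{\mathrm{Null}}}$$

    $$\mathrm{TLI} = \frac{\chi^2_{\mathrm{Null}}/df_{\mathrm{Null}} - \chi^2_{\mathrm{Proposed}}/df_{\mathrm{Proposed}}}{\chi^2_{\mathrm{Null}}/df_{\mathrm{Null}} - 1}$$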
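    As a check on these definitions, the two indices can be recomputed in base R from the chi-square values printed in the lavaan output. This is a sketch with illustrative variable names; the TLI is written in its usual form, whose denominator is the null model's χ²/df minus 1.

```r
# Chi-square statistics and degrees of freedom copied from the lavaan output
chisq_null <- 2475.929; df_null <- 66   # baseline (null) model
chisq_prop <- 92.559;   df_prop <- 51   # proposed model

# d = chi-square minus degrees of freedom, for each model
d_null <- chisq_null - df_null
d_prop <- chisq_prop - df_prop

# Comparative Fit Index (both d values are positive here, so no truncation at 0 is needed)
cfi <- (d_null - d_prop) / d_null

# Tucker-Lewis Index, based on the chi-square/df ratios
tli <- (chisq_null / df_null - chisq_prop / df_prop) / (chisq_null / df_null - 1)

print(round(c(CFI = cfi, TLI = tli), 3))
```

    Rounded to three decimals, both values agree with the CFI of 0.983 and TLI of 0.978 reported by lavaan.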

  • lavaan output

    Loglikelihood and Information Criteria:

      Loglikelihood user model (H0)              -8179.946
      Loglikelihood unrestricted model (H1)      -8133.667

      Number of free parameters                         27
      Akaike (AIC)                               16413.893
      Bayesian (BIC)                             16521.662
      Sample-size adjusted Bayesian (BIC)        16435.990

    Root Mean Square Error of Approximation:

      RMSEA                                          0.045
      90 Percent Confidence Interval          0.030  0.060
      P-value RMSEA
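    The quantities on this slide are tied together by a few standard formulas, which the base-R sketch below reproduces from the printed log-likelihoods. The variable names are illustrative. The RMSEA line uses N in the denominator, which is an assumption about lavaan's convention (some programs use N − 1); at three decimals the choice makes no difference here.

```r
# Values copied from the lavaan output
logl_h0 <- -8179.946   # user (hypothesized) model
logl_h1 <- -8133.667   # unrestricted (saturated) model
npar <- 27             # free parameters
n    <- 400            # observations
df   <- 51             # model degrees of freedom

# Likelihood-ratio chi-square: twice the log-likelihood difference
chisq <- 2 * (logl_h1 - logl_h0)

# Information criteria for the user model
aic     <- -2 * logl_h0 + 2 * npar
bic     <- -2 * logl_h0 + npar * log(n)
bic_adj <- -2 * logl_h0 + npar * log((n + 2) / 24)   # sample-size adjusted BIC

# RMSEA point estimate (ASSUMPTION: N, not N - 1, in the denominator)
rmsea <- sqrt(max(chisq - df, 0) / (df * n))

print(round(c(chisq = chisq, AIC = aic, BIC = bic, BIC.adj = bic_adj, RMSEA = rmsea), 3))
```

    The results match the printed output up to rounding of the log-likelihoods in the last decimal place.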

  • lavaan output

    Parameter estimates:

      Information                                 Expected
      Standard Errors                             Standard

                       Estimate  Std.err  Z-value  P(>|z|)   Std.lv  Std.all
    Latent variables:
      EP =~
        EP1               1.000                               1.251    0.684
        EP2               1.042    0.075   13.807    0.000    1.303    0.801
        EP3               0.837    0.062   13.588    0.000    1.046    0.785
        EP4               0.926    0.066   14.088    0.000    1.158    0.825
      AC =~
        AC4               1.000                               1.313    0.816
        AC3               0.903    0.048   18.653    0.000    1.186    0.836
        AC2               1.078    0.059   18.216    0.000    1.414    0.820
        AC1               0.872    0.048   18.262    0.000    1.144    0.822
      OC =~
        OC1               1.000                               1.493    0.592
        OC2               1.275    0.104   12.263    0.000    1.904    0.872
        OC3               0.785    0.075   10.528    0.000    1.171    0.668
        OC4               1.155    0.095   12.115    0.000    1.725    0.841

  • lavaan output

                       Estimate  Std.err  Z-value  P(>|z|)   Std.lv  Std.all
    Regressions:
      OC ~
        EP                0.538    0.080    6.746    0.000    0.451    0.451
        AC                0.214    0.062    3.457    0.001    0.188    0.188

    Covariances:
      EP ~~
        AC                0.422    0.099    4.241    0.000    0.257    0.257

    Variances:
        EP1               1.780    0.145                      1.780    0.532
        EP2               0.946    0.093                      0.946    0.358
        EP3               0.682    0.064                      0.682    0.384
        EP4               0.631    0.067                      0.631    0.320
        AC4               0.864    0.081                      0.864    0.334
        AC3               0.604    0.060                      0.604    0.300
        AC2               0.971    0.092                      0.971    0.327
        AC1               0.628    0.060                      0.628    0.324
        OC1               4.131    0.316                      4.131    0.649
        OC2               1.142    0.156                      1.142    0.240
        OC3               1.699    0.136                      1.699    0.553
        OC4               1.232    0.141                      1.232    0.293
        EP                1.565    0.213                      1.000    1.000
        AC                1.723    0.180                      1.000    1.000
        OC                1.601    0.267                      0.718    0.718
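    One way to read the standardized columns: the Std.all residual variance of OC is the share of its variance left unexplained, so 1 minus that value is the R² of the structural equation. The same figure should follow from the standardized paths and the EP–AC correlation. The base-R sketch below checks both routes using values copied from the output above; the variable names are illustrative, and small discrepancies come from rounding in the printed values.

```r
# Standardized values (Std.all column) copied from the lavaan output
beta_EP <- 0.451       # OC ~ EP
beta_AC <- 0.188       # OC ~ AC
r_EP_AC <- 0.257       # correlation between EP and AC
resid_OC_std <- 0.718  # standardized residual variance of OC

# Route 1: explained variance as 1 minus the standardized residual variance
r2_from_resid <- 1 - resid_OC_std

# Route 2: explained variance implied by the standardized paths and the factor correlation
r2_from_paths <- beta_EP^2 + beta_AC^2 + 2 * beta_EP * beta_AC * r_EP_AC

# The same logic applies to indicators: squared standardized loading
# equals 1 minus the standardized residual variance (shown here for EP1)
ep1_reliability_loading <- 0.684^2
ep1_reliability_resid   <- 1 - 0.532

print(round(c(r2_from_resid = r2_from_resid, r2_from_paths = r2_from_paths), 3))
print(round(c(EP1_from_loading = ep1_reliability_loading,
              EP1_from_resid = ep1_reliability_resid), 3))
```

    On this reading, EP and AC together account for roughly 28% of the variance in OC, with EP the stronger predictor.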

  • Drawing the SEM path diagram

    library(semPlot)
    semPaths(my.fit, title = FALSE, curvePivot = TRUE)

    [Path diagram produced by semPaths: indicators EP1 to EP4, AC4 to AC1, and OC1 to OC4, with latent variables EP, AC, and OC.]

  • library(OpenMx)
    observed

  • library(sem)
    mydata.cov
    AC2, lam2, NA
    AC -> AC3, lam3, NA
    AC -> AC4, lam4, NA
    EP -> EP1, NA, 1
    EP -> EP2, lam6, NA
    EP -> EP3, lam7, NA
    EP -> EP4, lam8, NA
    OC -> OC1, NA, 1
    OC -> OC2, lam10, NA
    OC -> OC3, lam11, NA
    OC -> OC4, lam12, NA
    AC -> OC, gam1, NA
    EP -> OC, gam2, NA
    AC <-> AC, phi1, NA
    EP <-> EP, phi2, NA
    AC <-> EP, phi, NA
    OC <-> OC, psi, NA
    AC1 <-> AC1, d1, NA
    AC2 <-> AC2, d2, NA
    AC3 <-> AC3, d3, NA
    AC4 <-> AC4, d4, NA
    EP1 <-> EP1, d5, NA
    EP2 <-> EP2, d6, NA
    EP3 <-> EP3, d7, NA
    EP4 <-> EP4, d8, NA
    OC1 <-> OC1, e1, NA
    OC2 <-> OC2, e2, NA
    OC3 <-> OC3, e3, NA
    OC4 <-> OC4, e4, NA
    mydata.sem

  • SEM can be used for

    • Confirmatory factor analysis (CFA)
    • Multitrait-multimethod (MTMM) analysis
    • Regression
    • Two-stage least squares (2SLS)
    • Latent growth models
    • And many other applications

  • References

    • Henseler, J. (2010). Covariance-based Structural Equation Modeling: Foundations and Applications, http://www.sensometric.org/Resources/Documents/2010/Meeting/Presentations/013-000-Henseler_2010.pdf

    • Ullman, J. B. (2006). Structural Equation Modeling: Reviewing the Basics and Moving Forward, Journal of Personality Assessment, 87(1), 35-50.

    • Haenlein, M. and Kaplan, A. M. (2004). A Beginner’s Guide to Partial Least Squares Analysis, Understanding Statistics, 3(4), 283-297.

    • Pearl, J. (2000). The causal interpretation of structural equations (or SEM survival kit), http://bayes.cs.ucla.edu/BOOK-2K/jw.html.

    • Hair, J. F.; Black, W. C.; Babin, B. J. and Anderson, R. E. (2010). Multivariate Data Analysis, 7/E, Prentice Hall.

    • Lieberman, M. (2010). Measure brand equity with structural equations modeling, http://www.mvsolution.com/wp-content/uploads/Brand-Equity-Structural-Equations-Model-by-Michael-Lieberman.pdf
