STA201 - An Introduction to Structural Equation Modeling
AN INTRODUCTION TO STRUCTURAL EQUATION MODELING
Giorgio Russolillo
CNAM, Paris
Structural Equation Modeling (SEM)
• Structural Equation Models (SEM) are complex models that allow us to study real-world complexity by taking into account a large number of causal relationships among latent concepts (the Latent Variables, LVs), each measured by several observed indicators, usually called Manifest Variables (MVs).
• Factor analysis, path analysis and regression are special cases of SEM.
• SEM is a largely confirmatory, rather than exploratory, technique. It is used more to determine whether a model is valid than to find a suitable model, although some exploratory elements are allowed.
Key concepts:
• Latent variables (not directly observable): abstract psychological variables like «intelligence», «attitude toward the brand», «satisfaction», «social status», «ability», «trust».
• Manifest variables are used to measure latent concepts and contain sizable measurement errors that must be taken into account; multiple measures can be associated with a single construct.
• Measurement is recognized as difficult and error-prone: the measurement error is explicitly modeled in order to derive unbiased estimates of the relationships between latent constructs.
Several fields played a role in developing Structural Equation Models:
• From Psychology comes the belief that the measurement of a valid construct cannot rely on a single measure.
• From Economics comes the conviction that strong theoretical specification is necessary for the estimation of parameters.
• From Sociology comes the notion of ordering theoretical variables and decomposing types of effects.
SEM: historical corner
Sewall Wright and Path Analysis
Sewall Wright (21 December 1889 – 3 March 1988)
American geneticist, son of the economist Philip Wright
Path Analysis aims to study cause-effect relations among several variables by looking at the correlation matrix among them.
Path Analysis was developed in the 1920s by S. Wright to investigate genetic problems and to help his father in economic studies.
Its main novelty is the introduction of a new tool for investigating cause-effect relations: the path diagram.
Factor Analysis and the idea of Latent Variable
Charles Edward Spearman (10 September 1863 – 17 September 1945)
English psychologist
C. Spearman proposed Factor Analysis (FA) at the beginning of the 1900s to measure intelligence in an “objective” way.
The most important input from Factor Analysis is the introduction of the concept of “factor”, in other words of the concept of Latent Variable.
The main idea is that intelligence is a MULTIDIMENSIONAL issue, thus the correlation observed among several variables should be explained by a single underlying “factor”.
Thurstone and Multiple Factor Analysis
Louis Thurstone (29 May 1887 – 30 September 1955), psychologist and psychometrician
Spearman's approach was modified over the following 40 years in order to consider more than one factor as the “cause” of the observed correlations among several sets of manifest variables.
Thurstone is the father of Multiple Factor Analysis. In the 1950s he met Herman Wold, and they decided to co-organize “the Uppsala Symposium on Psychological Factor Analysis”.
Herman Wold (25 December 1908 – 16 February 1992), econometrician and statistician
After their meeting, H. Wold started working on Latent Variable models.
Causal models rediscovered
Herbert Simon (15 June 1916 – 9 February 2001), economist, Nobel Prize in Economics in 1978
In 1954 he presented a paper proving that “under certain assumptions correlation is an index of causality”.
Hubert M. Blalock (23 August 1926 – 8 February 1991), sociologist
In 1964 he published the book “Causal Inferences in Nonexperimental Research”, in which he defines methods for making causal inferences starting from the observed covariance matrix. He addresses the problem of assessing relations among variables by means of the inferential method.
They developed the SIMON-BLALOCK technique.
Path analysis and Causal models
Otis D. Duncan (2 December 1921 – 16 November 2004), sociologist
He was one of the leading sociologists in the world. He introduced Wright's Path Analysis into Sociology.
In the mid-1960s he came to the conclusion that there is no difference between Wright's Path Analysis and the Simon-Blalock model.
With the economist Arthur Goldberger he came to the conclusion that there is no difference between what was known in sociology as Path Analysis and the simultaneous equations models commonly used in econometrics.
Along with Goldberger he organized a conference in 1970 in Madison (USA), to which he invited Karl Jöreskog.
Covariance Structure Analysis and K. Jöreskog
Karl Jöreskog, statistician, Professor at Uppsala University, Sweden
In the late 1950s he started working with Herman Wold. He defended a thesis on Factor Analysis.
In the second half of the 1960s he started collaborating with O.D. Duncan and A. Goldberger. This collaboration represents a meeting between Factor Analysis (and the concept of latent variable) and Path Analysis (i.e. the idea behind causal models).
In 1970, at a conference organized by Duncan and Goldberger, Jöreskog presented Covariance Structure Analysis (CSA) for estimating a linear structural equation system, later known as LISREL.
Soft Modeling and H. Wold
Herman Wold (25 December 1908 – 16 February 1992), econometrician and statistician
In 1975, H. Wold extended the basic principles of NIPALS, an iterative algorithm for estimating principal components (PCs), to a more general procedure (PLS) for estimating the relations among several blocks of variables linked by a network of relations specified by a path diagram.
He proposed the PLS approach to estimate the parameters of Structural Equation Models (SEM), as a Soft Modeling alternative to Jöreskog's Covariance Structure Analysis.
Path Analysis with manifest variables
Path analysis: drawing conventions
[Figure: drawing conventions, with alternative symbols for each element]
• Manifest Variables (MV), e.g. x
• Unidirectional Path (cause-effect)
• Bidirectional Path (correlation)
• Feedback relation or reciprocal causation
• Errors, e.g. ε
“Drawing” a regression model
The multiple regression model (on centred variables):
y = γ1x1 + γ2x2 + ζ
can be “drawn” by using a Path Diagram:
[Path diagram: x1 → y (coefficient γ1), x2 → y (coefficient γ2), ζ → y]
Example: the Value of a brand in terms of Quality and Cost
• Quality and Cost are Observed / Manifest Variables
• Value is an Observed / Manifest Variable
• The error is an Unobserved / Latent Variable
Path analysis: Wright’s Tracing rules
Tracing rules are a way to estimate the covariance between two variables by summing the appropriate connecting paths. They are:
1. Trace all paths between two variables (or a variable back to itself), multiplying all the coefficients along a given path
2. You can start by going backwards along a single-headed arrow, but once you start going forward along these arrows you can no longer go backwards
3. No loops. You cannot go through the same variable more than once for a given path
4. At most one double-headed arrow can be included in a path
5. After tracing all the paths for a given relationship, sum all the paths
Tracing rule #2: an example
[Path diagram: variables x1, x2 and y; path coefficients a, b, c; errors ζ1 and ζ2 with variances z1 and z2]
You want to estimate cov(y,y) from this model, so you have to trace all the paths and add them:
You can add the paths acb, bca and 1z1, but the path aca is not allowed.
Tracing rule #4: an example
[Path diagram: x1, x2, x3 → y with path coefficients a, b, c; double-headed arrows d (x1, x2), e (x1, x3), f (x2, x3); error ζ with variance z]
You want to estimate cov(y,x1) from this model:
Permissible paths are a, ec and db, but the path adfc is not allowed.
Obtaining a numerical solution for model parameters
The multiple regression model (standardized variables):
y = ax1 + bx2 + cx3 + ζ
can be “drawn” by using a Path Diagram:
[Path diagram: x1, x2, x3 → y with path coefficients a, b, c; double-headed arrows d (x1, x2), e (x1, x3), f (x2, x3); error ζ with variance z; the variances of x1, x2 and x3 are fixed to 1]
DATA: correlation matrix [table not recovered]
• a, b and c are the standardized partial regression coefficients (path coefficients) to estimate
• z is the residual variance to estimate
• d, e and f are the covariances (correlations) between the exogenous variables
Obtaining a numerical solution for path coefficients (1)
DATA: correlation matrix [table not recovered]
[Path diagram: same model, repeated once for each path connecting x1 and y]
Writing the set of paths for r(x1,y):
r1Y = a + ec + db → 0.70 = a + 0.24c + 0.2b
Obtaining a numerical solution for path coefficients (2)
DATA: correlation matrix [table not recovered]
[Path diagram: same model, repeated once for each path connecting x2 and y]
Writing the set of paths for r(x2,y):
r2Y = b + da + fc → 0.80 = b + 0.2a + 0.3c
Obtaining a numerical solution for path coefficients (3)
DATA: correlation matrix [table not recovered]
[Path diagram: same model, repeated once for each path connecting x3 and y]
Writing the set of paths for r(x3,y):
r3Y = c + fb + ea → 0.30 = c + 0.3b + 0.24a
Solving the three equations together:
\[
\begin{cases}
0.70 = a + 0.24c + 0.2b \\
0.80 = b + 0.2a + 0.3c \\
0.30 = c + 0.3b + 0.24a
\end{cases}
\;\Rightarrow\;
\begin{cases}
a = 0.57 \\
b = 0.70 \\
c = -0.05
\end{cases}
\]
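The three tracing-rule equations form a linear system in a, b and c. The following is a minimal sketch of how it can be solved numerically; the helper `solve_linear_system` is a hypothetical name introduced here for illustration, and the correlations d = 0.20, e = 0.24, f = 0.30 are read off the coefficients of the equations above.

```python
# Solve R_xx * [a, b, c]' = r_xy, i.e. the three tracing-rule equations.

def solve_linear_system(matrix, rhs):
    """Gauss-Jordan elimination with partial pivoting (pure Python)."""
    n = len(rhs)
    # Augmented matrix [matrix | rhs], built on copies of the inputs.
    aug = [list(row) + [value] for row, value in zip(matrix, rhs)]
    for col in range(n):
        # Use the row with the largest pivot to limit rounding error.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for row in range(n):
            if row != col:
                factor = aug[row][col]
                aug[row] = [v - factor * p for v, p in zip(aug[row], aug[col])]
    return [aug[i][n] for i in range(n)]

d, e, f = 0.20, 0.24, 0.30      # r(x1,x2), r(x1,x3), r(x2,x3)
r_xx = [[1.0, d, e],
        [d, 1.0, f],
        [e, f, 1.0]]
r_xy = [0.70, 0.80, 0.30]       # r(x1,y), r(x2,y), r(x3,y)

a, b, c = solve_linear_system(r_xx, r_xy)
print(f"a = {a:.4f}, b = {b:.4f}, c = {c:.4f}")
```

Rounded to two decimals, the solution agrees with the values reported on the slide.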
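Obtaining a numerical solution for the variance of the error
[Path diagram: same model as in the previous slides, with the fixed values 1 shown on the diagram]
Applying the tracing rules to the correlation of y with itself:
\[
r_{yy} = 1 = a^2 + b^2 + c^2 + 2(cea) + 2(cfb) + 2(bda) + 1 \cdot z \cdot 1
\]
\[
z = r_{yy} - \left[ a \cdot 1 \cdot a + b \cdot 1 \cdot b + c \cdot 1 \cdot c + 2(cea) + 2(cfb) + 2(bda) \right]
  = 1 - \left[ a^2 + b^2 + c^2 + 2(cea) + 2(cfb) + 2(bda) \right]
\]
\[
= 1 - \left[ 0.57^2 + 0.70^2 + (-0.05)^2 + 2(-0.05)(0.24)(0.57) + 2(-0.05)(0.30)(0.70) + 2(0.70)(0.20)(0.57) \right]
= 1 - 0.946 = 0.054
\]
(The value 0.946 corresponds to the unrounded coefficients; with the two-decimal values shown, the bracket evaluates to about 0.942.)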
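As a quick numerical check of the formula for z, the sketch below evaluates it twice: once with the two-decimal coefficients shown on the slide and once with four-decimal values. The four-decimal coefficients are an assumption of this example (they come from re-solving the three-equation system, not from the slides).

```python
def residual_variance(a, b, c, d, e, f):
    """z = 1 - [a^2 + b^2 + c^2 + 2cea + 2cfb + 2bda] for standardized variables."""
    explained = a**2 + b**2 + c**2 + 2*c*e*a + 2*c*f*b + 2*b*d*a
    return 1.0 - explained

d, e, f = 0.20, 0.24, 0.30   # correlations among x1, x2, x3

# Coefficients rounded to two decimals, as printed on the slide.
z_rounded = residual_variance(0.57, 0.70, -0.05, d, e, f)

# Four-decimal coefficients (assumed here; obtained by re-solving the system).
z_precise = residual_variance(0.5713, 0.6999, -0.0471, d, e, f)

print(f"z with rounded coefficients:      {z_rounded:.3f}")
print(f"z with four-decimal coefficients: {z_precise:.3f}")
```

The second value matches the 0.054 reported on the slide, which suggests the slide used unrounded estimates.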
Mediation Analysis
Single mediator model
[Diagram: INDEPENDENT VARIABLE → MEDIATOR (INTERVENING) M → DEPENDENT VARIABLE]
Mediator model: Total Effect
[Diagram: X → Y with coefficient a]
1. The independent variable causes the dependent variable:
Y = i1 + aX + e1
Mediator model: Direct effect of X on M
[Diagram: X → M with coefficient b]
2. The independent variable causes the potential mediator:
M = i2 + bX + e2
Mediator model: Direct Effects of M and X on Y
[Diagram: M → Y with coefficient c; X → Y with coefficient a']
3. The mediator causes the dependent variable, controlling for the independent variable:
Y = i3 + a'X + cM + e3
Single mediator model
[Diagram: X → M (b), M → Y (c) and X → Y (a', DIRECT EFFECT); the path X → M → Y is the INDIRECT EFFECT. A second diagram, X → Y with coefficient a, shows the TOTAL EFFECT]
• Mediated (indirect) effect: bc
• Direct effect: a'
• Total effect: a = bc + a'
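The decomposition a = bc + a' can be checked numerically by fitting the three regressions by ordinary least squares on the same sample. The sketch below uses simulated data; the data-generating coefficients (0.6, 0.3, 0.5), the sample size and the seed are arbitrary assumptions made only for this illustration.

```python
import random

def mean(values):
    return sum(values) / len(values)

def cov(u, v):
    """Sample covariance (divisor n - 1); cov(u, u) is the variance."""
    mu, mv = mean(u), mean(v)
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v)) / (len(u) - 1)

random.seed(0)
n = 500
# Hypothetical data: X affects Y directly and through the mediator M.
X = [random.gauss(0, 1) for _ in range(n)]
M = [0.6 * x + random.gauss(0, 1) for x in X]
Y = [0.3 * x + 0.5 * m + random.gauss(0, 1) for x, m in zip(X, M)]

# Equation 1: Y = i1 + a X + e1 (total effect)
a = cov(X, Y) / cov(X, X)
# Equation 2: M = i2 + b X + e2
b = cov(X, M) / cov(X, X)
# Equation 3: Y = i3 + a' X + c M + e3, solved from the 2x2 normal equations
sxx, smm, sxm = cov(X, X), cov(M, M), cov(X, M)
sxy, smy = cov(X, Y), cov(M, Y)
det = sxx * smm - sxm ** 2
a_prime = (smm * sxy - sxm * smy) / det
c = (sxx * smy - sxm * sxy) / det

print(f"total a = {a:.4f}")
print(f"direct a' = {a_prime:.4f}, indirect bc = {b * c:.4f}, sum = {a_prime + b * c:.4f}")
```

With OLS estimates computed on the same observations, the identity holds exactly in the sample (up to floating-point error), not only on average.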
Single mediator model: an example
[Path diagram: Cognitive ability (IQ), Years in school and Salary; path coefficients a, b, c; errors ζIQ and ζsalary with variances z1 and z2]
[Tables not recovered: DATA (covariance matrix), Structural Model, RESULTS (estimates)]
Path Models with Manifest Variables
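The multiple regression model can be generalized to path models in which endogenous variables are, in turn, causes of other endogenous variables.
[Path diagram: x1 → y1 (γ11), y1 → y2 (β21); structural errors ζ1 → y1 and ζ2 → y2]
y1 = γ11 x1 + ζ1
y2 = β21 y1 + ζ2
In matrix form, y = Γx + Βy + ζ, with:
\[
y = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}, \quad
x = \begin{bmatrix} x_1 \end{bmatrix}, \quad
\Gamma = \begin{bmatrix} \gamma_{11} \\ 0 \end{bmatrix}, \quad
B = \begin{bmatrix} 0 & 0 \\ \beta_{21} & 0 \end{bmatrix}, \quad
\zeta = \begin{bmatrix} \zeta_1 \\ \zeta_2 \end{bmatrix}
\]
Dimensions: y and ζ are (# endo × 1), x is (# exo × 1), Γ is (# endo × # exo), Β is (# endo × # endo).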
Path Models with Manifest Variables
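[Path diagram: x1 → y1 (γ11), x2 → y1 (γ12), y1 → y2 (β21); structural errors ζ1 → y1 and ζ2 → y2]
y1 = γ11 x1 + γ12 x2 + ζ1
y2 = β21 y1 + ζ2
In matrix form, y = Γx + Βy + ζ, with:
\[
y = \begin{bmatrix} y_1 \\ y_2 \end{bmatrix}, \quad
x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix}, \quad
\Gamma = \begin{bmatrix} \gamma_{11} & \gamma_{12} \\ 0 & 0 \end{bmatrix}, \quad
B = \begin{bmatrix} 0 & 0 \\ \beta_{21} & 0 \end{bmatrix}, \quad
\zeta = \begin{bmatrix} \zeta_1 \\ \zeta_2 \end{bmatrix}
\]
Dimensions: y and ζ are (# endo × 1), x is (# exo × 1), Γ is (# endo × # exo), Β is (# endo × # endo).
Solving for y gives the reduced form:
\[
y = (I - B)^{-1} (\Gamma x + \zeta)
\]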
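The reduced form can be checked against the two structural equations evaluated one after the other. The following sketch does this for the two-equation model above; all numerical values (coefficients, x and ζ) are hypothetical and chosen only for illustration.

```python
# Hypothetical values, for illustration only.
gamma11, gamma12, beta21 = 0.5, 0.3, 0.8
x = [1.0, 2.0]          # exogenous MVs x1, x2
zeta = [0.1, -0.2]      # structural errors zeta1, zeta2

Gamma = [[gamma11, gamma12],
         [0.0, 0.0]]
B = [[0.0, 0.0],
     [beta21, 0.0]]

# (I - B) and its inverse through the 2x2 adjugate formula.
m = [[1.0 - B[0][0], -B[0][1]],
     [-B[1][0], 1.0 - B[1][1]]]
det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
inv = [[m[1][1] / det, -m[0][1] / det],
       [-m[1][0] / det, m[0][0] / det]]

# Gamma x + zeta
rhs = [sum(g * xi for g, xi in zip(row, x)) + z for row, z in zip(Gamma, zeta)]
# Reduced form: y = (I - B)^{-1} (Gamma x + zeta)
y = [sum(coef * r for coef, r in zip(row, rhs)) for row in inv]

# Same quantities from the structural equations, solved recursively.
y1 = gamma11 * x[0] + gamma12 * x[1] + zeta[0]
y2 = beta21 * y1 + zeta[1]

print("reduced form:", y)
print("recursive   :", [y1, y2])
```

Both routes give the same y, which is the point of the reduced form: it expresses the endogenous variables using only exogenous variables and errors.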
Analysing covariance structures of Path models
[Path diagram: x1 → y1 (γ11), x2 → y1 (γ12), y1 → y2 (β21); structural errors ζ1, ζ2]
Population covariance matrix:
\[
\Sigma = \begin{bmatrix} \Sigma_{xx} & \\ \Sigma_{yx} & \Sigma_{yy} \end{bmatrix}
\]
Assuming that:
i) the MVs are centered
ii) the covariance between the structural errors and the exogenous MVs is equal to zero
we can write the covariance matrix among the MVs in terms of model parameters (implied covariance matrix):
\[
C = \Sigma(\Omega) = \Sigma(\Gamma, B, \Psi)
\]
where Γ and B contain the path coefficients and Ψ is the structural error covariance matrix.
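The sub-matrix ΣYY
Covariance matrix of the endogenous MVs. Starting from
\[
y = (I - B)^{-1} (\Gamma x + \zeta)
\]
we obtain
\[
\begin{aligned}
\Sigma_{yy} = E(yy')
&= E\left\{ \left[ (I-B)^{-1} (\Gamma x + \zeta) \right] \left[ (I-B)^{-1} (\Gamma x + \zeta) \right]' \right\} \\
&= E\left\{ (I-B)^{-1} (\Gamma x + \zeta)(x'\Gamma' + \zeta') \left[ (I-B)^{-1} \right]' \right\} \\
&= (I-B)^{-1} E\left( \Gamma x x' \Gamma' + \zeta x' \Gamma' + \Gamma x \zeta' + \zeta \zeta' \right) \left[ (I-B)^{-1} \right]' \\
&= (I-B)^{-1} \left[ \Gamma E(xx') \Gamma' + E(\zeta x') \Gamma' + \Gamma E(x \zeta') + E(\zeta \zeta') \right] \left[ (I-B)^{-1} \right]'
\end{aligned}
\]
Let E(ζζ') = Ψ. Since E(xx') = Σxx and the structural errors do not covary with the exogenous MVs, the two cross terms vanish and
\[
\Sigma_{yy}(\Omega) = (I-B)^{-1} \left( \Gamma \Sigma_{xx} \Gamma' + \Psi \right) \left[ (I-B)^{-1} \right]'
\]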
The sub-matrix ΣXY
Cross-covariance matrix between exogenous and endogenous MVs, expressed as a function of the model parameters:
\[
\begin{aligned}
y &= (I-B)^{-1} (\Gamma x + \zeta) \\
y' &= (x'\Gamma' + \zeta') \left[ (I-B)^{-1} \right]' \\
xy' &= (xx'\Gamma' + x\zeta') \left[ (I-B)^{-1} \right]' \\
E(xy') &= \Sigma_{xx} \Gamma' \left[ (I-B)^{-1} \right]'
\end{aligned}
\]
so that
\[
\Sigma_{xy}(\Omega) = \Sigma_{xx} \Gamma' \left[ (I-B)^{-1} \right]'
\]
Path model implied covariance matrix
Population covariance matrix:
\[
\Sigma = \begin{bmatrix} \Sigma_{xx} & \\ \Sigma_{yx} & \Sigma_{yy} \end{bmatrix}
\]
Empirical covariance matrix:
\[
S = \begin{bmatrix} s_{xx} & \\ s_{yx} & s_{yy} \end{bmatrix}
\]
“Implied” covariance matrix:
\[
C = \Sigma(\Omega) = \begin{bmatrix}
\Sigma_{xx} & \\
(I-B)^{-1} \Gamma \Sigma_{xx} & (I-B)^{-1} \left( \Gamma \Sigma_{xx} \Gamma' + \Psi \right) \left[ (I-B)^{-1} \right]'
\end{bmatrix}
\]
\[
C \approx S \approx \Sigma
\]
An example
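[Path diagram: x1 → y1 (γ11), y1 → y2 (β21); structural errors ζ1, ζ2]
y1 = γ11 x1 + ζ1
y2 = β21 y1 + ζ2
\[
\Gamma = \begin{bmatrix} \gamma_{11} \\ 0 \end{bmatrix} \; (\#\,endo \times \#\,exo), \quad
B = \begin{bmatrix} 0 & 0 \\ \beta_{21} & 0 \end{bmatrix} \; (\#\,endo \times \#\,endo), \quad
\Psi = \begin{bmatrix} \psi_{11} & 0 \\ 0 & \psi_{22} \end{bmatrix} \; (\#\,endo \times \#\,endo), \quad
\Sigma_{xx} = \begin{bmatrix} \sigma_{x_1,x_1} \end{bmatrix} \; (\#\,exo \times \#\,exo)
\]
Substituting into
\[
\Sigma = \begin{bmatrix}
\Sigma_{xx} & \\
(I-B)^{-1} \Gamma \Sigma_{xx} & (I-B)^{-1} \left( \Gamma \Sigma_{xx} \Gamma' + \Psi \right) \left[ (I-B)^{-1} \right]'
\end{bmatrix}
\]
gives
\[
\begin{bmatrix}
\sigma_{x_1,x_1} & & \\
\sigma_{y_1,x_1} & \sigma_{y_1,y_1} & \\
\sigma_{y_2,x_1} & \sigma_{y_2,y_1} & \sigma_{y_2,y_2}
\end{bmatrix}
=
\begin{bmatrix}
\sigma_{x_1,x_1} & & \\
\gamma_{11} \sigma_{x_1,x_1} & \gamma_{11}^2 \sigma_{x_1,x_1} + \psi_{11} & \\
\beta_{21} \gamma_{11} \sigma_{x_1,x_1} & \beta_{21} \left( \gamma_{11}^2 \sigma_{x_1,x_1} + \psi_{11} \right) & \beta_{21}^2 \left( \gamma_{11}^2 \sigma_{x_1,x_1} + \psi_{11} \right) + \psi_{22}
\end{bmatrix}
\]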
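The element-wise expressions of this example can be verified against the general matrix formulas. The sketch below does so in plain Python; the parameter values are hypothetical, and the helper names `matmul`, `transpose` and `add` are introduced here only for illustration.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

def add(a, b):
    return [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]

# Hypothetical parameter values, for illustration only.
gamma11, beta21 = 0.5, 0.8
sigma_x1 = 2.0
psi11, psi22 = 1.0, 0.5

Gamma = [[gamma11], [0.0]]
Sigma_xx = [[sigma_x1]]
Psi = [[psi11, 0.0], [0.0, psi22]]
# For B = [[0, 0], [beta21, 0]], the inverse of (I - B) is [[1, 0], [beta21, 1]].
inv_I_minus_B = [[1.0, 0.0], [beta21, 1.0]]

# General formulas for the implied sub-matrices.
Sigma_yx = matmul(matmul(inv_I_minus_B, Gamma), Sigma_xx)
inner = add(matmul(matmul(Gamma, Sigma_xx), transpose(Gamma)), Psi)
Sigma_yy = matmul(matmul(inv_I_minus_B, inner), transpose(inv_I_minus_B))

# Element-wise expressions from the example.
var_y1 = gamma11 ** 2 * sigma_x1 + psi11
closed_yx = [[gamma11 * sigma_x1], [beta21 * gamma11 * sigma_x1]]
closed_yy = [[var_y1, beta21 * var_y1],
             [beta21 * var_y1, beta21 ** 2 * var_y1 + psi22]]

print("Sigma_yx:", Sigma_yx)
print("Sigma_yy:", Sigma_yy)
```

The inverse of (I − B) is written by hand here because B is strictly lower triangular; a general model would need a proper matrix inversion routine.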
Model identification
Some considerations on model identification
A model is identifiable if its parameters are uniquely determined.
Degrees of freedom (DF):
DF = # equations (knowns) − # parameters to be estimated (unknowns)
Necessary model identification condition: the degrees of freedom (DF) must be greater than or equal to zero, i.e. DF ≥ 0.
This is a necessary (but not sufficient) condition.
Perfect Identification:
A statistical model is perfectly identified if the known information available implies that there is one best value for each parameter in the model whose value is not known.
→ ZERO degrees of freedom
A perfectly identified model yields a trivially perfect fit, making the test of fit uninteresting.
Overidentification:
A model is overidentified if there are more knowns than unknowns. Overidentified models may not fit well, and this is their interesting feature.
→ degrees of freedom > 0
Path Model identification
T-rule (a necessary but not sufficient condition): a Path Model (and more generally a SEM) is identified if the covariance matrix can be uniquely decomposed as a function of the model parameters. This requires DF ≥ 0, that is, the number of unique variances and covariances must be at least as large as the number of parameters to be estimated.
\[
DF = \frac{1}{2} (P+Q)(P+Q+1) - t
\]
where:
• t = # of parameters to be estimated
• P + Q = # of MVs in the model, with P = # of exogenous MVs and Q = # of endogenous MVs
• ½ (P+Q)(P+Q+1) = number of unique elements in the Σ matrix
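The DF formula is easy to wrap in a small helper. The sketch below applies it to the one-exogenous, two-endogenous path model used earlier (x1 → y1 → y2); the parameter count t = 5 (γ11, β21, ψ11, ψ22 and the variance of x1) is my own tally for that model, and `path_model_df` is a hypothetical function name.

```python
def path_model_df(p_exo, q_endo, t_params):
    """Degrees of freedom of a path model: unique elements of Sigma minus free parameters."""
    n_mvs = p_exo + q_endo
    unique_elements = n_mvs * (n_mvs + 1) // 2
    return unique_elements - t_params

# Model x1 -> y1 -> y2: P = 1, Q = 2, and (by my count) t = 5 free parameters.
df = path_model_df(p_exo=1, q_endo=2, t_params=5)
print("DF =", df)
print("t-rule satisfied:", df >= 0)
```

A non-negative DF only says that the t-rule does not rule the model out; it does not establish identification.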
Model identification in Path Analysis
1. t-rule (necessary condition): the number of non-redundant elements in the covariance matrix of the observed variables must be greater than or equal to the number of unknown parameters
2. B = 0 rule (sufficient condition): no endogenous variable affects any other endogenous variable
3. Recursive rule (sufficient condition): B is triangular and Ψ is diagonal
4. Order condition (necessary condition): the number of variables excluded from each equation must be at least M − 1, where M is the number of endogenous variables; all elements in Ψ are free (no constraints)
5. Rank condition (necessary and sufficient condition, with no constraints in Ψ):
   a. Consider the matrix obtained by joining the matrices (I − B) and −Γ.
   b. Delete the columns that have a non-zero element in the i-th row of this matrix. The i-th equation is identified if and only if the rank of the resulting matrix equals M − 1.
Confirmatory Factor Analysis
Factor Analysis: drawing conventions
[Figure: drawing conventions, with alternative symbols for each element]
• Latent Variables (LV), e.g. ξ
• Manifest Variables (MV), e.g. x
• Bidirectional Path (correlation)
• Errors, e.g. δ
Confirmatory Single Factor Model
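[Path diagram: common factor ξ1 → x1, x2, x3 with factor loadings λ11, λ21, λ31; unique factors δ1, δ2, δ3]
x1 = λ11 ξ1 + δ1
x2 = λ21 ξ1 + δ2
x3 = λ31 ξ1 + δ3
In matrix form, x = Λx ξ + δ, with:
\[
x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \quad
\Lambda_x = \begin{bmatrix} \lambda_{11} \\ \lambda_{21} \\ \lambda_{31} \end{bmatrix}, \quad
\xi = \begin{bmatrix} \xi_1 \end{bmatrix}, \quad
\delta = \begin{bmatrix} \delta_1 \\ \delta_2 \\ \delta_3 \end{bmatrix}
\]
x = manifest variables
ξ = latent variable (Common Factor)
λ = Factor Loadings
δ = measurement errors (Unique Factors)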
Multiple Factor Model
[Path diagram: ξ1 → x1, x2, x3 (λ11, λ21, λ31); ξ2 → x2, x4, x5, x6 (λ22, λ42, λ52, λ62); measurement errors δ1 to δ6]
x1 = λ11 ξ1 + δ1
x2 = λ21 ξ1 + λ22 ξ2 + δ2
x3 = λ31 ξ1 + δ3
x4 = λ42 ξ2 + δ4
x5 = λ52 ξ2 + δ5
x6 = λ62 ξ2 + δ6
In matrix form, x = Λx ξ + δ, with:
\[
x = \begin{bmatrix} x_1 \\ \vdots \\ x_6 \end{bmatrix}, \quad
\Lambda_x = \begin{bmatrix}
\lambda_{11} & 0 \\
\lambda_{21} & \lambda_{22} \\
\lambda_{31} & 0 \\
0 & \lambda_{42} \\
0 & \lambda_{52} \\
0 & \lambda_{62}
\end{bmatrix}, \quad
\xi = \begin{bmatrix} \xi_1 \\ \xi_2 \end{bmatrix}, \quad
\delta = \begin{bmatrix} \delta_1 \\ \vdots \\ \delta_6 \end{bmatrix}
\]
Analysing covariance structures in CF models
[Path diagram: ξ1 → x1, x2, x3 (λ11, λ21, λ31); ξ2 → x4, x5, x6 (λ42, λ52, λ62); measurement errors δ1 to δ6; covariance ϕ12 between ξ1 and ξ2; covariance θ12 between two measurement errors]
Assuming that:
i) the MVs, the LVs and the errors are centered
ii) measurement errors and LVs do not covary
we can write the covariance matrix among the MVs in terms of model parameters (implied covariance matrix):
\[
C = \Sigma(\Omega) = \Sigma(\Lambda, \Phi, \Theta)
\]
where Λ contains the loadings, Φ is the LV covariance matrix and Θ is the measurement error covariance matrix.
Analysing covariance structures in CF models
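Confirmatory factor model:
\[
x = \Lambda_x \xi + \delta
\]
Let E(ξξ') = Φ and E(δδ') = Θ. Then
\[
\begin{aligned}
\Sigma(\Omega) = E(xx')
&= E\left[ (\Lambda \xi + \delta)(\Lambda \xi + \delta)' \right] \\
&= E\left[ (\Lambda \xi + \delta)(\xi' \Lambda' + \delta') \right] \\
&= \Lambda E(\xi \xi') \Lambda' + \Lambda E(\xi \delta') + E(\delta \xi') \Lambda' + E(\delta \delta')
\end{aligned}
\]
Because measurement errors and LVs do not covary, the two cross terms vanish, and the covariance matrix of the MVs can be rewritten in terms of model parameters:
\[
\Sigma(\Omega) = \Lambda \Phi \Lambda' + \Theta
\]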
Latent or Emergent Constructs?
Latent Construct: Reflective (or Effects) Indicators
e.g. consumer's attitudes, feelings
• Constructs give rise to observed variables (unique cause → unidimensional)
• Aim at accounting for observed variances or covariances
• These indicators should covary: changes in one indicator imply changes in the others.
• Internal consistency is measured (e.g. Cronbach's alpha)
Emergent Construct: Formative (or Causal) Indicators
e.g. social status, perceptions
• Constructs are combinations of observed variables (multidimensional)
• Not designed to account for observed variables
• These indicators need not covary: changes in one indicator do not imply changes in the others.
• Measures of internal consistency do not apply.
[Diagrams: indicators x1j, x2j, x3j, x4j attached to each type of construct]
An Example of CFA in Psychology
A dataset by Long (J. Scott Long: Confirmatory Factor Analysis, SAGE Publications, 1986)
- 603 heads of families from the region of Hennepin, Illinois
- Observed (manifest) variables:
  PSY67 = Psychological disorders 1967
  PHY67 = Psychophysiological disorders 1967
  PSY71 = Psychological disorders 1971
  PHY71 = Psychophysiological disorders 1971
An Example of CFA in Psychology
Data:
              PSY67   PHY67   PSY71   PHY71
n             603     603     603     603
corr PSY67    1       .       .       .
corr PHY67    0.454   1       .       .
corr PSY71    0.526   0.247   1       .
corr PHY71    0.377   0.309   0.549   1
stddev        1.45    0.555   1.38    0.503
The data analysed are the covariances between manifest variables, obtained from the correlations and standard deviations:
\[
\mathrm{Cov}(X_p, X_{p'}) = \mathrm{Cor}(X_p, X_{p'}) \times \sqrt{\mathrm{Var}(X_p)} \times \sqrt{\mathrm{Var}(X_{p'})}
\]
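The conversion from correlations and standard deviations to covariances can be sketched as follows, using the values of the data table above. The variable names are chosen for this example only.

```python
names = ["PSY67", "PHY67", "PSY71", "PHY71"]

# Lower triangle of the correlation matrix, row by row.
corr_lower = [
    [1.0],
    [0.454, 1.0],
    [0.526, 0.247, 1.0],
    [0.377, 0.309, 0.549, 1.0],
]
stddev = [1.45, 0.555, 1.38, 0.503]

p = len(names)
cov = [[0.0] * p for _ in range(p)]
for i in range(p):
    for j in range(i + 1):
        # Cov = Cor * sd_i * sd_j; fill both triangles of the symmetric matrix.
        value = corr_lower[i][j] * stddev[i] * stddev[j]
        cov[i][j] = cov[j][i] = value

for name, row in zip(names, cov):
    print(f"{name}  " + "  ".join(f"{v:7.4f}" for v in row))
```

The diagonal of the result holds the variances (the squared standard deviations), since each variable has correlation 1 with itself.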
The 1st model specified by Long
[Path diagram: Common Factors XSI1 → PSY67 (L11), PHY67 (L21) and XSI2 → PSY71 (L32), PHY71 (L42); Unique Factors D1 to D4, each with path coefficient fixed to 1; covariance phi12 between the common factors; covariances theta13 and theta24 between unique factors]
Long’s model identification
The 1st model specified by Long: study of the specified model
[Path diagram: XSI1 → PSY67 (L11), PHY67 (L21); XSI2 → PSY71 (L32), PHY71 (L42); unique factors D1 to D4; phi12; theta13, theta24]
Model equations and model parameters
Factor equations:
PSY67 = λ11 ξ1 + δ1
PHY67 = λ21 ξ1 + δ2
PSY71 = λ12 ξ2 + δ3
PHY71 = λ22 ξ2 + δ4
Model parameters:
λ11, λ21, λ12, λ22
ϕ11 = Var(ξ1), ϕ22 = Var(ξ2), ϕ12 = Cov(ξ1, ξ2)
θ11 = Var(δ1), θ22 = Var(δ2), θ33 = Var(δ3), θ44 = Var(δ4)
θ13 = Cov(δ1, δ3), θ24 = Cov(δ2, δ4)
Long’s model identification (2)
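The 1st model specified by Long: study of the specified model
Model parameters:
λ11, λ21, λ12, λ22
ϕ11 = Var(ξ1), ϕ22 = Var(ξ2), ϕ12 = Cov(ξ1, ξ2)
θ11 = Var(δ1), θ22 = Var(δ2), θ33 = Var(δ3), θ44 = Var(δ4)
θ13 = Cov(δ1, δ3), θ24 = Cov(δ2, δ4)
Factor Equations:
\[
x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}
= \underbrace{\begin{bmatrix} \lambda_{11} & 0 \\ \lambda_{21} & 0 \\ 0 & \lambda_{12} \\ 0 & \lambda_{22} \end{bmatrix}}_{\Lambda}
\underbrace{\begin{bmatrix} \xi_1 \\ \xi_2 \end{bmatrix}}_{\xi}
+ \underbrace{\begin{bmatrix} \delta_1 \\ \delta_2 \\ \delta_3 \\ \delta_4 \end{bmatrix}}_{\delta}
= \Lambda \xi + \delta
\]
\[
\Phi = E(\xi \xi') = \begin{bmatrix} \phi_{11} & \phi_{12} \\ \phi_{21} & \phi_{22} \end{bmatrix}, \quad
\Theta = E(\delta \delta') = \begin{bmatrix}
\theta_{11} & 0 & \theta_{13} & 0 \\
0 & \theta_{22} & 0 & \theta_{24} \\
\theta_{13} & 0 & \theta_{33} & 0 \\
0 & \theta_{24} & 0 & \theta_{44}
\end{bmatrix}
\]
Covariance Decomposition:
\[
\Sigma = E(xx') = E\left[ (\Lambda \xi + \delta)(\Lambda \xi + \delta)' \right]
= \Lambda \underbrace{E(\xi \xi')}_{\Phi} \Lambda' + \underbrace{E(\delta \delta')}_{\Theta}
= \Lambda \Phi \Lambda' + \Theta
\]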
Model identification in CFA: Scaling a LV
• In CFA, the scale of an LV is not determined, so we need to give the LV a scale by imposing a constraint either on the LV variance or on one loading.
• Consequently, in an LV Path Model we need to scale each exogenous LV.
[Path diagram, Unstandardized Estimates: ξ1 → x1, x2, x3 with λx11 = 1 fixed and λx21, λx31 free; errors δ1, δ2, δ3]
[Path diagram, Standardized Estimates: ξ1 → x1, x2, x3 with var(ξ1) = 1 fixed and λx11, λx21, λx31 free; errors δ1, δ2, δ3]
Model identification in CFA: the t-rule
The t-rule requires that:
\[
t \le \frac{1}{2} P (P+1)
\]
where t is the number of parameters to be estimated and P is the number of MVs in the CFA model.
This is a necessary (but not sufficient) condition: the number of free parameters (parameters to be estimated) must be less than or equal to the number of unique elements in the covariance matrix of X.
Model identification: the Three-Indicator Rule
The Three-Indicator Rule is a sufficient condition: a CFA model is identified if
1. every latent construct is associated with at least 3 manifest variables;
2. each manifest variable is associated with only one latent construct, i.e. each row of Λ has only one nonzero element;
3. the unique factors are uncorrelated, i.e. Θ is a diagonal matrix.
Model identification in CFA: the Two-Indicator Rule
The Two-Indicator Rule is a sufficient condition: a CFA model is identified if
1. every latent construct is associated with at least 2 manifest variables;
2. each manifest variable is associated with only one latent construct, i.e. each row of Λ has only one nonzero element;
3. the unique factors are uncorrelated, i.e. Θ is a diagonal matrix;
4. there are at least two LVs with non-zero covariance.
Summing up identification rules in CFA
(Assuming that all LVs are given a scale)
Identification rule      Necessary Condition   Sufficient Condition
T-rule                   Yes                   No
Three-Indicator Rule     No                    Yes
Two-Indicator Rule       No                    Yes
Long’s model identification (4)
The 1st model specified by Long: study of the specified model
[Path diagram: XSI1 → PSY67 (L11), PHY67 (L21); XSI2 → PSY71 (L32), PHY71 (L42); unique factors D1 to D4; phi12; theta13, theta24]
Matrix Σ (covariances):
         PSY67   PHY67   PSY71   PHY71
PSY67    σ11
PHY67    σ12     σ22
PSY71    σ13     σ23     σ33
PHY71    σ14     σ24     σ34     σ44
Number of distinct variances and covariances: P(P+1)/2 = 10
Model parameters (13):
λ11, λ21, λ12, λ22
ϕ11 = Var(ξ1), ϕ22 = Var(ξ2), ϕ12 = Cov(ξ1, ξ2)
θ11 = Var(δ1), θ22 = Var(δ2), θ33 = Var(δ3), θ44 = Var(δ4)
θ13 = Cov(δ1, δ3), θ24 = Cov(δ2, δ4)
This model is not identifiable: the parameters of the model cannot be uniquely expressed as a function of the covariances between the manifest variables, because the N° of parameters (13) > N° of covariances (10). Some constraints are needed.
Long’s model identification (5)
The 1st model specified by Long is not identifiable, so we need to add some constraints:
[Path diagram of the 1st model: XSI1 → PSY67 (L11), PHY67 (L21); XSI2 → PSY71 (L32), PHY71 (L42); unique factors D1 to D4; phi12; theta13, theta24]
→ Normalisation of the common factors:
Var(ξ1) = ϕ11 = 1, Var(ξ2) = ϕ22 = 1
→ Stability of the loadings over time:
λPSY67,1 = λPSY71,2
λPHY67,1 = λPHY71,2
→ Independence between unique factors:
θ13 = Cov(δ1, δ3) = 0
θ24 = Cov(δ2, δ4) = 0
New model
[Path diagram: XSI1 (variance fixed to 1) → PSY67 (L11), PHY67 (L21); XSI2 (variance fixed to 1) → PSY71 (L11), PHY71 (L21); unique factors D1 to D4 with variances theta11, theta22, theta33, theta44; phi12]
Long’s model identification (6)
The 2nd model (identifiable) specified by Long: study of the specified model
[Path diagram: XSI1 (variance fixed to 1) → PSY67 (L11), PHY67 (L21); XSI2 (variance fixed to 1) → PSY71 (L11), PHY71 (L21); unique factors D1 to D4 with variances theta11, theta22, theta33, theta44; phi12]
Long’s model identification (7)
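The 2nd model specified by Long: study of the specified model
Model parameters:
λ11, λ21
ϕ12 = Cov(ξ1, ξ2)
θ11 = Var(δ1), θ22 = Var(δ2), θ33 = Var(δ3), θ44 = Var(δ4)
Factor Equations:
\[
x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}
= \underbrace{\begin{bmatrix} \lambda_{11} & 0 \\ \lambda_{21} & 0 \\ 0 & \lambda_{11} \\ 0 & \lambda_{21} \end{bmatrix}}_{\Lambda}
\underbrace{\begin{bmatrix} \xi_1 \\ \xi_2 \end{bmatrix}}_{\xi}
+ \underbrace{\begin{bmatrix} \delta_1 \\ \delta_2 \\ \delta_3 \\ \delta_4 \end{bmatrix}}_{\delta}
= \Lambda \xi + \delta
\]
\[
\Phi = E(\xi \xi') = \begin{bmatrix} 1 & \phi_{12} \\ \phi_{12} & 1 \end{bmatrix}, \quad
\Theta = E(\delta \delta') = \begin{bmatrix}
\theta_{11} & 0 & 0 & 0 \\
0 & \theta_{22} & 0 & 0 \\
0 & 0 & \theta_{33} & 0 \\
0 & 0 & 0 & \theta_{44}
\end{bmatrix}
\]
Covariance Decomposition:
\[
\Sigma = E(xx') = E\left[ (\Lambda \xi + \delta)(\Lambda \xi + \delta)' \right]
= \Lambda \underbrace{E(\xi \xi')}_{\Phi} \Lambda' + \underbrace{E(\delta \delta')}_{\Theta}
= \Lambda \Phi \Lambda' + \Theta
\]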
Long’s model identification (8)
The 2nd model specified by Long: study of the specified model
[Path diagram: XSI1 (variance fixed to 1) → PSY67 (L11), PHY67 (L21); XSI2 (variance fixed to 1) → PSY71 (L11), PHY71 (L21); unique factors D1 to D4 with variances theta11, theta22, theta33, theta44; phi12]
Matrix Σ (covariances):
         PSY67   PHY67   PSY71   PHY71
PSY67    σ11
PHY67    σ12     σ22
PSY71    σ13     σ23     σ33
PHY71    σ14     σ24     σ34     σ44
Number of distinct variances and covariances: P(P+1)/2 = 10
Model parameters (7):
λ11, λ21
ϕ12 = Cov(ξ1, ξ2)
θ11 = Var(δ1), θ22 = Var(δ2), θ33 = Var(δ3), θ44 = Var(δ4)
This model is identifiable: the number of parameters (7) is now less than the number of covariances (10).
Degrees of freedom (DF) = 10 − 7 = 3
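The parameter counting for Long's two models can be reproduced with a few lines of code. This is only a sketch of the t-rule check (a necessary condition), with `cfa_t_rule` a hypothetical helper name; the parameter counts 13 and 7 are those given on the slides.

```python
def cfa_t_rule(n_mvs, n_free_params):
    """Return (unique covariance elements, degrees of freedom) for a CFA model."""
    unique_elements = n_mvs * (n_mvs + 1) // 2
    return unique_elements, unique_elements - n_free_params

# Long's 1st model: 4 MVs, 13 free parameters.
unique1, df1 = cfa_t_rule(4, 13)
# Long's 2nd model: 4 MVs, 7 free parameters after adding the constraints.
unique2, df2 = cfa_t_rule(4, 7)

for label, unique, df in [("1st model", unique1, df1), ("2nd model", unique2, df2)]:
    verdict = "passes" if df >= 0 else "fails"
    print(f"{label}: {unique} unique elements, DF = {df}, {verdict} the t-rule")
```

Passing the t-rule does not by itself prove identification; for the 2nd model the slides rely on the added constraints as well.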
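Applying CFA to Long’s second model (1)
The 2nd model specified by Long: study of the specified model
[Path diagram: XSI1 (variance fixed to 1) → PSY67 (L11), PHY67 (L21); XSI2 (variance fixed to 1) → PSY71 (L11), PHY71 (L21); unique factors D1 to D4 with variances theta11, theta22, theta33, theta44; phi12]
\[
\Sigma_{xx} = \Lambda \Phi \Lambda' + \Theta
\]
\[
\begin{bmatrix}
\sigma_{11} & \sigma_{12} & \sigma_{13} & \sigma_{14} \\
 & \sigma_{22} & \sigma_{23} & \sigma_{24} \\
 & & \sigma_{33} & \sigma_{34} \\
 & & & \sigma_{44}
\end{bmatrix}
=
\begin{bmatrix}
\lambda_1^2 & \lambda_1 \lambda_2 & \phi_{12} \lambda_1^2 & \phi_{12} \lambda_1 \lambda_2 \\
 & \lambda_2^2 & \phi_{12} \lambda_1 \lambda_2 & \phi_{12} \lambda_2^2 \\
 & & \lambda_1^2 & \lambda_1 \lambda_2 \\
 & & & \lambda_2^2
\end{bmatrix}
+
\begin{bmatrix}
\theta_1 & 0 & 0 & 0 \\
 & \theta_2 & 0 & 0 \\
 & & \theta_3 & 0 \\
 & & & \theta_4
\end{bmatrix}
\]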
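Applying CFA to Long’s second model (2)
Visualising the estimates for model 2 by Long
[Path diagram with estimates: XSI1 (variance 1.00) → PSY67 (1.24), PHY67 (.30); XSI2 (variance 1.00) → PSY71 (1.24), PHY71 (.30); unique factor variances .52 (D1), .22 (D2), .40 (D3), .16 (D4); covariance between XSI1 and XSI2 = .67]
\[
C =
\begin{bmatrix}
\hat\lambda_1^2 & \hat\lambda_1 \hat\lambda_2 & \hat\phi_{12} \hat\lambda_1^2 & \hat\phi_{12} \hat\lambda_1 \hat\lambda_2 \\
 & \hat\lambda_2^2 & \hat\phi_{12} \hat\lambda_1 \hat\lambda_2 & \hat\phi_{12} \hat\lambda_2^2 \\
 & & \hat\lambda_1^2 & \hat\lambda_1 \hat\lambda_2 \\
 & & & \hat\lambda_2^2
\end{bmatrix}
+
\begin{bmatrix}
\hat\theta_1 & 0 & 0 & 0 \\
 & \hat\theta_2 & 0 & 0 \\
 & & \hat\theta_3 & 0 \\
 & & & \hat\theta_4
\end{bmatrix}
\]
\[
C =
\begin{bmatrix}
1.24^2 & 1.24 \times .30 & .67 \times 1.24^2 & .67 \times 1.24 \times .30 \\
 & .30^2 & .67 \times 1.24 \times .30 & .67 \times .30^2 \\
 & & 1.24^2 & 1.24 \times .30 \\
 & & & .30^2
\end{bmatrix}
+
\begin{bmatrix}
.52 & 0 & 0 & 0 \\
 & .22 & 0 & 0 \\
 & & .40 & 0 \\
 & & & .16
\end{bmatrix}
\]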
Applying CFA to Long’s second model (3)
Observed (sample) and implied covariances and correlations from model 2 by Long
[Table not recovered]
Example:
Var(PHY71) = Var(.30*XSI2 + D_PHY71) = .30² Var(XSI2) + Var(D_PHY71) = .30² + .16 = .25
In general:
\[
\mathrm{var}(PHY71) = \mathrm{var}\left( \lambda_{PHY71} \, \xi_{XSI2} + \delta_{PHY71} \right) = \lambda_{PHY71}^2 \, \mathrm{var}(\xi_{XSI2}) + \mathrm{var}(\delta_{PHY71})
\]
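The full implied covariance matrix of model 2 can be computed from the reported estimates with C = ΛΦΛ' + Θ. The sketch below uses plain Python lists; the helper names `matmul` and `transpose` are introduced here only for illustration, and the estimates are the two-decimal values shown on the slides.

```python
def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(col) for col in zip(*m)]

names = ["PSY67", "PHY67", "PSY71", "PHY71"]
lam1, lam2 = 1.24, 0.30           # loadings, constrained equal over time
phi12 = 0.67                      # covariance between XSI1 and XSI2
theta = [0.52, 0.22, 0.40, 0.16]  # unique factor variances

Lambda = [[lam1, 0.0],
          [lam2, 0.0],
          [0.0, lam1],
          [0.0, lam2]]
Phi = [[1.0, phi12],
       [phi12, 1.0]]

# C = Lambda Phi Lambda' + Theta, with Theta diagonal.
C = matmul(matmul(Lambda, Phi), transpose(Lambda))
for i, t in enumerate(theta):
    C[i][i] += t

for name, row in zip(names, C):
    print(f"{name}  " + "  ".join(f"{v:6.3f}" for v in row))
```

The last diagonal element reproduces the worked example, Var(PHY71) = .30² + .16 = .25.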
STA201 - An Introduction to Structural Equation Modeling G. Russolillo – slide 68
Path modeling with Latent Variables
STA201 - An Introduction to Structural Equation Modeling
Two families of methods
SEM methods fall into two families.

Covariance-based methods
The aim is to reproduce the sample covariance matrix of the manifest variables by means of the model parameters:
• the implied covariance matrix of the manifest variables is a function of the model parameters
• it is a confirmatory approach aimed at validating a model (theory building)

Component-based methods
The aim is to estimate the latent variable scores so that they are as correlated with one another as possible (according to the path diagram structure) and as representative as possible of their corresponding block of manifest variables:
• latent variable score estimation plays a main role
• it is more an exploratory approach than a confirmatory one (operational model strategy)
G. Russolillo – slide 69
Path model with LVs: drawing conventions
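[Figure: legend of the symbols used in path diagrams, with alternative drawings for each element]
• Latent Variables (LV), e.g. η
• Manifest Variables (MV), e.g. x
• Unidirectional path (cause-effect)
• Bidirectional path (correlation)
• Feedback relation or reciprocal causation
• Errors, e.g. ε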
G. Russolillo – slide 70
Path model with latent variables
[Path diagram]
Structural model or inner model: ξ1 → η1 (path coefficient γ11), ξ1 → η2 (γ21), η1 → η2 (β21), with structural errors ζ1 and ζ2.
Measurement model or outer model: x1, x2, x3 measure ξ1 (loadings λx11, λx21, λx31; errors δ1, δ2, δ3); y1, y2 measure η1 (loadings λy11, λy21; errors ε1, ε2); y3, ..., y7 measure η2 (loadings λy32, ..., λy72; errors ε3, ..., ε7).
G. Russolillo – slide 71
Notation:
x = manifest variable associated to exogenous latent variables
y = manifest variable associated to endogenous latent variables
ξ = exogenous latent variable
η = endogenous latent variable
δ = measurement error associated to exogenous latent variables
ε = measurement error associated to endogenous latent variables
ζ = structural error
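The measurement (outer) model
For the exogenous MVs:
$$ x_1 = \lambda^x_{11}\xi_1 + \delta_1, \qquad x_2 = \lambda^x_{21}\xi_1 + \delta_2, \qquad x_3 = \lambda^x_{31}\xi_1 + \delta_3 $$
In matrix form, with
$$ x = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}, \quad \Lambda_x = \begin{bmatrix} \lambda^x_{11} \\ \lambda^x_{21} \\ \lambda^x_{31} \end{bmatrix}, \quad \delta = \begin{bmatrix} \delta_1 \\ \delta_2 \\ \delta_3 \end{bmatrix}, \quad \xi = [\xi_1], $$
the model is
$$ x = \Lambda_x \xi + \delta $$
[Path diagram: ξ1 measured by x1, x2, x3 with loadings λx11, λx21, λx31 and errors δ1, δ2, δ3]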
G. Russolillo – slide 72
The measurement (outer) model
For the exogenous MVs (a reflective scheme must hold):
$$ x = \Lambda_x \xi + \delta $$
where
• $x$ is the vector of the exogenous MVs
• $\Lambda_x$ is a RECTANGULAR matrix with the loadings linking each exogenous MV to the corresponding exogenous LV
• $\xi$ is the vector of the exogenous LVs
• $\delta$ is the vector of the measurement errors associated to the exogenous MVs
G. Russolillo – slide 73
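The measurement (outer) model
For the endogenous MVs:
$$
\begin{aligned}
y_1 &= \lambda^y_{11}\eta_1 + \varepsilon_1 & y_2 &= \lambda^y_{21}\eta_1 + \varepsilon_2 \\
y_3 &= \lambda^y_{32}\eta_2 + \varepsilon_3 & y_4 &= \lambda^y_{42}\eta_2 + \varepsilon_4 \\
y_5 &= \lambda^y_{52}\eta_2 + \varepsilon_5 & y_6 &= \lambda^y_{62}\eta_2 + \varepsilon_6 \\
y_7 &= \lambda^y_{72}\eta_2 + \varepsilon_7
\end{aligned}
$$
In matrix form, with
$$
y = \begin{bmatrix} y_1 \\ y_2 \\ y_3 \\ y_4 \\ y_5 \\ y_6 \\ y_7 \end{bmatrix}, \quad
\Lambda_y = \begin{pmatrix} \lambda_{11} & 0 \\ \lambda_{21} & 0 \\ 0 & \lambda_{32} \\ 0 & \lambda_{42} \\ 0 & \lambda_{52} \\ 0 & \lambda_{62} \\ 0 & \lambda_{72} \end{pmatrix}, \quad
\eta = \begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix}, \quad
\varepsilon = \begin{bmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \varepsilon_3 \\ \varepsilon_4 \\ \varepsilon_5 \\ \varepsilon_6 \\ \varepsilon_7 \end{bmatrix},
$$
the model is
$$ y = \Lambda_y \eta + \varepsilon $$
[Path diagram: η1 measured by y1, y2 and η2 measured by y3, ..., y7, with errors ε1, ..., ε7]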
G. Russolillo – slide 74
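The structural (inner) model
[Path diagram: ξ1 → η1 (γ11), ξ1 → η2 (γ21), η1 → η2 (β21), with structural errors ζ1 and ζ2]
$$ \eta_1 = \gamma_{11}\xi_1 + \zeta_1, \qquad \eta_2 = \gamma_{21}\xi_1 + \beta_{21}\eta_1 + \zeta_2 $$
In matrix form, with
$$ \eta = \begin{bmatrix} \eta_1 \\ \eta_2 \end{bmatrix}, \quad \Gamma = \begin{bmatrix} \gamma_{11} \\ \gamma_{21} \end{bmatrix}, \quad B = \begin{bmatrix} 0 & 0 \\ \beta_{21} & 0 \end{bmatrix}, \quad \xi = [\xi_1], \quad \zeta = \begin{bmatrix} \zeta_1 \\ \zeta_2 \end{bmatrix}, $$
the model is
$$ \eta = \Gamma\xi + B\eta + \zeta $$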
G. Russolillo – slide 75
The structural (inner) model
$$ \eta = \Gamma\xi + B\eta + \zeta $$
where
• $\eta$ is the vector of the endogenous LVs
• $\xi$ is the vector of the exogenous LVs
• $\zeta$ is the vector of the errors associated to the endogenous LVs
• $B$ is a path-coefficient matrix: a SQUARE matrix with the path coefficients among the endogenous LVs, with only zeros on the diagonal
• $\Gamma$ is a path-coefficient matrix: a RECTANGULAR matrix with the path coefficients of the exogenous LVs
G. Russolillo – slide 76
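The Structural Equation Model
[Path diagram of the full model: ξ1 measured by x1, x2, x3; η1 measured by y1, y2; η2 measured by y3, ..., y7; errors δ1, ..., δ3, ε1, ..., ε7, ζ1, ζ2]
Structural Model:
$$ \eta = \Gamma\xi + B\eta + \zeta \quad\Longleftrightarrow\quad \eta = (I-B)^{-1}(\Gamma\xi + \zeta) $$
Measurement Models:
$$ y = \Lambda_y\eta + \varepsilon, \qquad x = \Lambda_x\xi + \delta $$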
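The passage from the structural equation to its reduced form can be spelled out in two steps. This is a worked sketch that assumes $I - B$ is invertible (not stated explicitly on the slide; it holds in the recursive example used so far, where $B$ is strictly lower triangular).

```latex
\begin{aligned}
\eta &= \Gamma\xi + B\eta + \zeta \\
(I - B)\,\eta &= \Gamma\xi + \zeta \\
\eta &= (I - B)^{-1}(\Gamma\xi + \zeta)
\end{aligned}
\qquad\text{with, in the example,}\qquad
(I - B)^{-1} =
\begin{bmatrix} 1 & 0 \\ -\beta_{21} & 1 \end{bmatrix}^{-1} =
\begin{bmatrix} 1 & 0 \\ \beta_{21} & 1 \end{bmatrix}
```

In the example this gives $\eta_1 = \gamma_{11}\xi_1 + \zeta_1$ and $\eta_2 = (\gamma_{21} + \beta_{21}\gamma_{11})\xi_1 + \beta_{21}\zeta_1 + \zeta_2$, so the total effect of $\xi_1$ on $\eta_2$ is the sum of a direct and an indirect effect.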
G. Russolillo – slide 77
To summarize
G. Russolillo – slide 78
[Figure: three path diagrams summarising the models covered so far]
• Path Analysis with Manifest Variables (x1 and x2 explaining y, with error ζ)
• Confirmatory Factor Model (ξ1 and ξ2 measured by x1, ..., x6, with errors δ1, ..., δ6)
• Path Analysis with Latent Variables (Structural Equation Model) (ξ1, η1, η2 measured by x1, ..., x3 and y1, ..., y7)
Model assumptions
Given the model
$$ y = \Lambda_y\eta + \varepsilon, \qquad x = \Lambda_x\xi + \delta, \qquad \eta = (I-B)^{-1}(\Gamma\xi + \zeta), $$
we assume that:
i) the MVs, the LVs and the errors (in both the structural and the measurement models) are centered
ii) errors of different types (structural, exogenous measurement and endogenous measurement) are uncorrelated with one another
iii) measurement errors and LVs are uncorrelated
iv) the covariance between the structural errors and the exogenous LVs is equal to zero
G. Russolillo – slide 79
Model assumptions
$$ y = \Lambda_y\eta + \varepsilon, \qquad x = \Lambda_x\xi + \delta, \qquad \eta = (I-B)^{-1}(\Gamma\xi + \zeta) $$
Hypotheses on Expectations:
$$ E(x) = 0,\; E(y) = 0,\; E(\xi) = 0,\; E(\eta) = 0,\; E(\delta) = 0,\; E(\varepsilon) = 0,\; E(\zeta) = 0 $$
Hypotheses on Correlations:
$$ E(\delta\varepsilon') = 0,\; E(\zeta\delta') = 0,\; E(\zeta\varepsilon') = 0 $$
$$ E(\delta\xi') = 0,\; E(\delta\eta') = 0,\; E(\varepsilon\xi') = 0,\; E(\varepsilon\eta') = 0 $$
$$ E(\zeta\xi') = 0 $$
G. Russolillo – slide 80
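Analysing the covariance in SEMs
We can write the covariance matrix among the MVs in terms of model parameters (implied covariance matrix).
Population covariance matrix (lower triangle shown):
$$ \Sigma = \begin{bmatrix} \Sigma_{xx} & \\ \Sigma_{yx} & \Sigma_{yy} \end{bmatrix} $$
Implied covariance matrix:
$$ C = \Sigma(\Omega) = \Sigma(\Gamma, B, \Lambda_x, \Lambda_y, \Phi, \Psi, \Theta_\delta, \Theta_\varepsilon) $$
where
• $\Gamma$, $B$: path coefficients
• $\Lambda_x$, $\Lambda_y$: loadings
• $\Phi$: exogenous LV covariance
• $\Psi$: structural error covariance
• $\Theta_\delta$, $\Theta_\varepsilon$: measurement error covariance
G. Russolillo – slide 81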
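Implied Covariance matrix Σ(Ω)
Covariance matrix of the population:
$$ \Sigma = \begin{bmatrix} \Sigma_{xx} & \\ \Sigma_{yx} & \Sigma_{yy} \end{bmatrix} $$
It can be rewritten as a function of model parameters, giving the covariance matrix of the population implied by the model:
$$
\Sigma(\Omega) =
\begin{bmatrix}
\Lambda_x\Phi\Lambda_x' + \Theta_\delta & \\
\Lambda_y (I-B)^{-1}\Gamma\Phi\Lambda_x' & \Lambda_y\left[(I-B)^{-1}(\Gamma\Phi\Gamma' + \Psi)(I-B)^{-1\prime}\right]\Lambda_y' + \Theta_\varepsilon
\end{bmatrix}
$$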
G. Russolillo – slide 82
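The sub-matrix ΣXX
Outer model for exogenous blocks:
$$ x = \Lambda_x\xi + \delta $$
The covariance matrix of the exogenous MVs can be rewritten in terms of model parameters:
$$
\begin{aligned}
\Sigma_{XX}(\Omega) = E(xx') &= E\left[(\Lambda_x\xi + \delta)(\Lambda_x\xi + \delta)'\right] \\
&= E\left[(\Lambda_x\xi + \delta)(\xi'\Lambda_x' + \delta')\right] \\
&= \Lambda_x E(\xi\xi')\Lambda_x' + \Lambda_x E(\xi\delta') + E(\delta\xi')\Lambda_x' + E(\delta\delta')
\end{aligned}
$$
Let $E(\xi\xi') = \Phi$ and $E(\delta\delta') = \Theta_\delta$. Since $E(\delta\xi') = 0$,
$$ \Sigma_{XX}(\Omega) = \Lambda_x\Phi\Lambda_x' + \Theta_\delta $$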
G. Russolillo – slide 83
The sub-matrix ΣXX
$$ \Sigma_{XX}(\Omega) = \Lambda_x\Phi\Lambda_x' + \Theta_\delta $$
Covariance matrix of the exogenous MVs expressed as a function of model parameters
G. Russolillo – slide 84
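The sub-matrix ΣYY
Outer model for the endogenous blocks:
$$ y = \Lambda_y\eta + \varepsilon $$
The covariance matrix of the endogenous MVs can be rewritten in terms of model parameters:
$$
\begin{aligned}
\Sigma_{YY}(\Omega) = E(yy') &= E\left[(\Lambda_y\eta + \varepsilon)(\Lambda_y\eta + \varepsilon)'\right] \\
&= E\left[(\Lambda_y\eta + \varepsilon)(\eta'\Lambda_y' + \varepsilon')\right] \\
&= E\left[\Lambda_y\eta\eta'\Lambda_y' + \varepsilon\eta'\Lambda_y' + \Lambda_y\eta\varepsilon' + \varepsilon\varepsilon'\right] \\
&= \Lambda_y E(\eta\eta')\Lambda_y' + E(\varepsilon\eta')\Lambda_y' + \Lambda_y E(\eta\varepsilon') + E(\varepsilon\varepsilon')
\end{aligned}
$$
Let $E(\varepsilon\varepsilon') = \Theta_\varepsilon$. Since $E(\varepsilon\eta') = 0$,
$$ \Sigma_{YY}(\Omega) = \Lambda_y E(\eta\eta')\Lambda_y' + \Theta_\varepsilon $$
The matrix $E(\eta\eta')$ is itself a function of other parameters.
G. Russolillo – slide 85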
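The sub-matrix ΣYY
Covariance matrix of the endogenous LVs, with $\eta = (I-B)^{-1}(\Gamma\xi + \zeta)$:
$$
\begin{aligned}
\Sigma_{\eta\eta} = E(\eta\eta') &= E\left\{\left[(I-B)^{-1}(\Gamma\xi + \zeta)\right]\left[(I-B)^{-1}(\Gamma\xi + \zeta)\right]'\right\} \\
&= E\left\{\left[(I-B)^{-1}(\Gamma\xi + \zeta)\right](\xi'\Gamma' + \zeta')(I-B)^{-1\prime}\right\} \\
&= (I-B)^{-1} E\left(\Gamma\xi\xi'\Gamma' + \zeta\xi'\Gamma' + \Gamma\xi\zeta' + \zeta\zeta'\right)(I-B)^{-1\prime} \\
&= (I-B)^{-1}\left[\Gamma E(\xi\xi')\Gamma' + E(\zeta\xi')\Gamma' + \Gamma E(\xi\zeta') + E(\zeta\zeta')\right](I-B)^{-1\prime}
\end{aligned}
$$
Let $E(\zeta\zeta') = \Psi$. Since $E(\zeta\xi') = 0$,
$$ \Sigma_{\eta\eta}(\Omega) = (I-B)^{-1}(\Gamma\Phi\Gamma' + \Psi)(I-B)^{-1\prime} $$
G. Russolillo – slide 86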
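The sub-matrix ΣYY
The covariance matrix of the endogenous MVs can now be obtained as a function of model parameters:
$$ \Sigma_{YY}(\Omega) = \Lambda_y E(\eta\eta')\Lambda_y' + \Theta_\varepsilon $$
$$ \Sigma_{YY}(\Omega) = \Lambda_y\left[(I-B)^{-1}(\Gamma\Phi\Gamma' + \Psi)(I-B)^{-1\prime}\right]\Lambda_y' + \Theta_\varepsilon $$
G. Russolillo – slide 87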
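The sub-matrix ΣXY
Measurement model:
$$ x = \Lambda_x\xi + \delta \quad\text{and}\quad y = \Lambda_y\eta + \varepsilon $$
The cross-covariance matrix between exogenous and endogenous MVs can be written as
$$
\begin{aligned}
\Sigma_{XY}(\Omega) = E(xy') &= E\left[(\Lambda_x\xi + \delta)(\Lambda_y\eta + \varepsilon)'\right] \\
&= E\left[(\Lambda_x\xi + \delta)(\eta'\Lambda_y' + \varepsilon')\right] \\
&= \Lambda_x E(\xi\eta')\Lambda_y' + \Lambda_x E(\xi\varepsilon') + E(\delta\eta')\Lambda_y' + E(\delta\varepsilon')
\end{aligned}
$$
The measurement errors are uncorrelated with each other and with the LVs, so the last three terms vanish:
$$ \Sigma_{XY}(\Omega) = \Lambda_x E(\xi\eta')\Lambda_y' $$
G. Russolillo – slide 88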
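The sub-matrix ΣXY
From the structural model:
$$
\begin{aligned}
\eta &= (I-B)^{-1}(\Gamma\xi + \zeta) \\
\eta' &= (\xi'\Gamma' + \zeta')(I-B)^{-1\prime} \\
\xi\eta' &= (\xi\xi'\Gamma' + \xi\zeta')(I-B)^{-1\prime} \\
E(\xi\eta') &= \Phi\Gamma'(I-B)^{-1\prime}
\end{aligned}
$$
Hence the cross-covariance matrix between exogenous and endogenous MVs, expressed as a function of model parameters, is
$$ \Sigma_{XY}(\Omega) = \Lambda_x\Phi\Gamma'(I-B)^{-1\prime}\Lambda_y' $$
G. Russolillo – slide 89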
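G. Russolillo – slide 90
Analysing the covariance in SEMs
Population covariance matrix:
$$ \Sigma = \begin{bmatrix} \Sigma_{xx} & \\ \Sigma_{yx} & \Sigma_{yy} \end{bmatrix} $$
Empirical covariance matrix:
$$ S = \begin{bmatrix} s_{xx} & \\ s_{yx} & s_{yy} \end{bmatrix} $$
“Implied” covariance matrix:
$$
C = \Sigma(\Omega) =
\begin{bmatrix}
\Lambda_x\Phi\Lambda_x' + \Theta_\delta & \\
\Lambda_y (I-B)^{-1}\Gamma\Phi\Lambda_x' & \Lambda_y\left[(I-B)^{-1}(\Gamma\Phi\Gamma' + \Psi)(I-B)^{-1\prime}\right]\Lambda_y' + \Theta_\varepsilon
\end{bmatrix}
$$
$$ C \approx S \approx \Sigma $$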
Model identification in SEM
Necessary (but not sufficient) condition
T-rule: for a SEM to be identified, i.e. for the covariance matrix to be uniquely decomposed as a function of the model parameters, its DF must satisfy DF ≥ 0; that is, the number of distinct variances and covariances must be at least as large as the number of parameters to be estimated.
$$ DF = \tfrac{1}{2}(P+Q)(P+Q+1) - t $$
where
• t = # of parameters to be estimated
• P + Q = # of MVs in the model (P = # of exogenous MVs, Q = # of endogenous MVs)
• ½(P+Q)(P+Q+1) = number of unique elements in the Σ matrix
G. Russolillo – slide 91
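The counting behind the t-rule is easy to automate. Here is a minimal sketch, with hypothetical function names, checked against Long's second model discussed earlier (4 MVs and 7 free parameters).

```python
def degrees_of_freedom(p, q, t):
    """DF = (P+Q)(P+Q+1)/2 - t, with P exogenous MVs, Q endogenous MVs, t parameters."""
    n_mv = p + q
    return n_mv * (n_mv + 1) // 2 - t


def passes_t_rule(p, q, t):
    """Necessary (not sufficient) condition for identification: DF >= 0."""
    return degrees_of_freedom(p, q, t) >= 0


print(degrees_of_freedom(4, 0, 7))  # Long's second model: 10 - 7 = 3
print(passes_t_rule(4, 0, 11))      # more parameters than covariances: False
```

A model that passes this check may still be unidentified, which is why sufficient conditions are also needed.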
Model Identification in SEM
Two-Step rule:
→ In the first step, the analyst treats the model as a confirmatory factor analysis (both x and y variables are viewed as exogenous, and relations between the LVs are expressed in terms of covariances).
→ In the second step, the analyst examines the latent variable equation of the original model and treats it as though it were a structural equation of observed variables.
N.B. These are sufficient (but not necessary) conditions.
G. Russolillo – slide 92
G. Russolillo – slide 93
Model estimation
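Discrepancy function
Estimation minimizes some discrepancy function between the implied covariance matrix and the observed one:
$$ F = f\left(S - \Sigma(\hat\Omega)\right) $$
Estimated covariance matrix:
$$
\Sigma(\hat\Omega) = C =
\begin{bmatrix}
\hat\Lambda_x\hat\Phi\hat\Lambda_x' + \hat\Theta_\delta & \\
\hat\Lambda_y (I-\hat B)^{-1}\hat\Gamma\hat\Phi\hat\Lambda_x' & \hat\Lambda_y\left[(I-\hat B)^{-1}(\hat\Gamma\hat\Phi\hat\Gamma' + \hat\Psi)(I-\hat B)^{-1\prime}\right]\hat\Lambda_y' + \hat\Theta_\varepsilon
\end{bmatrix}
$$
G. Russolillo – slide 94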
Main discrepancy function for estimation
G. Russolillo – slide 95
Maximum Likelihood
$$ F_{ML} = \log|C| + \operatorname{tr}(SC^{-1}) - \log|S| - (P+Q) $$
Assumptions:
→ data are multinormal
→ S follows a Wishart distribution
→ both S and C are positive-definite
Properties of the ML estimators:
- Asymptotically unbiased
- Consistent
- Asymptotically efficient
- The distribution of the ML estimators approximates a normal distribution as sample size increases
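To make the ML discrepancy concrete, here is a minimal pure-Python sketch that evaluates F_ML for given S and C. The helper `inverse_and_logdet` is an illustrative assumption (plain Gauss-Jordan elimination, adequate only for small positive-definite matrices); a real analysis would rely on a numerical library.

```python
import math


def inverse_and_logdet(a):
    """Gauss-Jordan inverse and log-determinant of a positive-definite matrix."""
    n = len(a)
    # Work on a copy of `a` augmented with the identity matrix.
    m = [list(map(float, row)) + [float(i == j) for j in range(n)]
         for i, row in enumerate(a)]
    logdet = 0.0
    for k in range(n):
        pivot = m[k][k]
        if pivot <= 0.0:
            raise ValueError("matrix is not positive definite")
        logdet += math.log(pivot)          # det = product of the pivots
        m[k] = [v / pivot for v in m[k]]
        for i in range(n):
            if i != k:
                f = m[i][k]
                m[i] = [vi - f * vk for vi, vk in zip(m[i], m[k])]
    return [row[n:] for row in m], logdet


def f_ml(s, c):
    """F_ML = log|C| + tr(S C^-1) - log|S| - (P+Q)."""
    n = len(s)
    c_inv, logdet_c = inverse_and_logdet(c)
    _, logdet_s = inverse_and_logdet(s)
    trace = sum(s[i][j] * c_inv[j][i] for i in range(n) for j in range(n))
    return logdet_c + trace - logdet_s - n


S = [[1.0, 0.5], [0.5, 1.0]]
I2 = [[1.0, 0.0], [0.0, 1.0]]
print(f_ml(S, S))   # perfect reproduction of S: discrepancy (numerically) zero
print(f_ml(S, I2))  # independence structure: -log(0.75), about 0.288
```

The discrepancy is zero when C reproduces S exactly and grows as the implied matrix moves away from the observed one.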
Alternative discrepancy functions
G. Russolillo – slide 96
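Generalised Least Squares
$$ F_{GLS} = \tfrac{1}{2}\operatorname{tr}\left\{\left[W^{-1}(S - C)\right]^2\right\} $$
Unweighted Least Squares
$$ F_{ULS} = \tfrac{1}{2}\operatorname{tr}\left[(S - C)^2\right] $$
Asymptotically Distribution Free
$$ F_{ADF/WLS} = (s - c)^T W^{-1} (s - c) $$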
Unweighted Least Squares - SEM
Discrepancy function:
$$ F_{ULS} = \tfrac{1}{2}\operatorname{tr}\left[(S - C)^2\right] $$
→ When sample size is large, ULS and ML estimates are very similar.
→ However, ULS estimators are not the asymptotically most efficient estimators.
→ The estimators are consistent.
→ Estimates differ when correlation matrices rather than covariance matrices are analyzed.
→ No assumption is made on the distribution of the observed variables.
However:
• ULS estimators implicitly weight all the residuals as if they had the same variance.
• A weight matrix is introduced in the discrepancy function in order to overcome this problem.
G. Russolillo – slide 97
Generalised Least Squares - SEM
Discrepancy function:
$$ F_{GLS} = \tfrac{1}{2}\operatorname{tr}\left\{\left[W^{-1}(S - C)\right]^2\right\} $$
→ Consistent estimators
→ Asymptotic normality (depending on the choice of W)
→ Common choices: W = S (scale-free least squares) or W = diag(S) (asymptotically-free least squares); if W = I, GLS = ULS
To ensure this property, two assumptions must be made:
→ the elements of S are unbiased estimators of the population covariances
→ the elements of S asymptotically follow a multinormal distribution with expectations equal to the population covariances, the covariance between two generic elements $s_{ij}$ and $s_{gh}$ being equal to
$$ N^{-1}\left(\sigma_{ig}\sigma_{jh} + \sigma_{ih}\sigma_{jg}\right) $$
G. Russolillo – slide 98
Estimation procedure
Fitting functions are generally minimized by iterative numerical procedures. The iterative process stops when the parameter values in two consecutive steps differ by less than some preset criterion.
Non-convergence may happen.
Convergence is affected by:
- the convergence criterion
- the number of iterations allowed
- the starting values
- the model specification
Estimates from nonconvergent runs should not be used or given substantive interpretations.
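The ingredients listed above (starting values, convergence criterion, maximum number of iterations) can be seen in a toy fitting loop. The sketch below is an illustration only: it minimises the ULS discrepancy for a one-factor model by plain gradient descent, which is a stand-in chosen for brevity and not the algorithm used by SEM software. The function name, the step size and the example matrix are assumptions.

```python
def fit_one_factor_uls(s, step=0.1, tol=1e-10, max_iter=100_000):
    """Minimise F_ULS = 0.5 * tr[(S - C)^2] with C = lam lam' + diag(theta)."""
    p = len(s)
    lam, theta = [0.5] * p, [0.5] * p            # starting values
    for iteration in range(1, max_iter + 1):
        # Residual matrix R = S - C at the current parameter values.
        r = [[s[i][j] - lam[i] * lam[j] - (theta[i] if i == j else 0.0)
              for j in range(p)] for i in range(p)]
        # Gradient of F_ULS: dF/dlam_k = -2 sum_j R_kj lam_j, dF/dtheta_k = -R_kk.
        grad_lam = [-2.0 * sum(r[k][j] * lam[j] for j in range(p)) for k in range(p)]
        grad_theta = [-r[k][k] for k in range(p)]
        new_lam = [l - step * g for l, g in zip(lam, grad_lam)]
        new_theta = [t - step * g for t, g in zip(theta, grad_theta)]
        change = max(abs(a - b) for a, b in zip(new_lam + new_theta, lam + theta))
        lam, theta = new_lam, new_theta
        if change < tol:                         # convergence criterion met
            return lam, theta, iteration, True
    return lam, theta, max_iter, False           # non-convergent run: do not interpret


# Hypothetical covariance matrix generated by loadings (.8, .7, .6), unit variances.
S = [[1.00, 0.56, 0.48],
     [0.56, 1.00, 0.42],
     [0.48, 0.42, 1.00]]
lam, theta, n_iter, converged = fit_one_factor_uls(S)
print(converged, n_iter, [round(v, 3) for v in lam])
```

Lowering `max_iter` or changing the starting values shows how a run can stop before the criterion is met, in which case the returned flag is `False` and the estimates should be discarded.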
G. Russolillo – slide 99
Model fit and validation
G. Russolillo – slide 100
Reliability
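The reliability rel(x_p) of a measure $x_p$ of a true score $\xi_j$, modeled as $x_p = \lambda_p\xi_j + \delta_p$, is defined as:
$$ \operatorname{rel}(x_p) = \frac{\lambda_p^2 \operatorname{var}(\xi_j)}{\operatorname{var}(x_p)} = \operatorname{cor}^2(x_p, \xi_j) $$
rel(x_p) can be interpreted as the proportion of the variance of $x_p$ that is explained by $\xi_j$.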
G. Russolillo – slide 101
Measuring the Reliability
Question:
How to measure the overall reliability of the measurement tool? In other words, how to measure the homogeneity level of a block Xj of positively correlated variables?
Answer:
The composite reliability (internal consistency) of manifest variables can be checked using:
→ Cronbach’s Alpha
→ Dillon Goldstein’s rho
G. Russolillo – slide 102
Composite reliability
G. Russolillo – slide 103
The measurement model assumes that each group of manifest variables is homogeneous and unidimensional (related to a single variable). The composite reliability (internal consistency or homogeneity of a block) of manifest variables is measured by either of the following indices:
→ Cronbach’s Alpha
$$ \alpha_j = \frac{\sum_{p \neq p'} \operatorname{cor}(x_{pj}, x_{p'j})}{P_j + \sum_{p \neq p'} \operatorname{cor}(x_{pj}, x_{p'j})} \times \frac{P_j}{P_j - 1} $$
→ Dillon Goldstein’s rho
$$ \rho_j = \frac{\left(\sum_{p=1}^{P_j} \lambda_{pj}\right)^2}{\left(\sum_{p=1}^{P_j} \lambda_{pj}\right)^2 + \sum_{p=1}^{P_j} \left(1 - \lambda_{pj}^2\right)} $$
The manifest variables are reliable if these indices are at least 0.7
Cronbach’s α rationale : the model
G. Russolillo – slide 104
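$x_{1q}, \ldots, x_{pq}, \ldots, x_{P_q q}$ are supposed to be tau-equivalent (or parallel) measures of a true score $\xi_q$:
$$ H_p: \; x_i = \xi + \delta_i, \quad i = 1, \ldots, P $$
with
$$ E(x_i) = 0 $$
$$ \operatorname{cov}(\xi, \delta_i) = 0 $$
$$ \operatorname{cov}(\delta_i, \delta_{i'}) = 0 \quad \forall i \neq i' $$
$$ \lambda_i = 1 \quad \text{(tau-equivalency condition)} $$
$$ \operatorname{var}(\delta_i) = \operatorname{var}(\delta_{i'}) \quad \forall i, i' \quad \text{(parallelism condition)} $$
Cronbach’s α measures the reliability of a simple sum of tau-equivalent or parallel measures:
$$ H = \sum_{i=1}^{P} x_i $$
$$ \text{Cronbach's } \alpha = \operatorname{rel}(H) = \operatorname{cor}^2(H, \xi) = \operatorname{cor}^2\left(\sum_i x_i, \xi\right) $$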
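Derivation of Cronbach’s α formula
The derivation uses three identities:
$$
\begin{aligned}
&1.\;\; \operatorname{var}\Big(\sum_i x_i\Big) = \operatorname{var}\Big(\sum_i (\xi + \delta_i)\Big) = \operatorname{var}\Big(P\xi + \sum_i \delta_i\Big) = P^2 \operatorname{var}(\xi) + \sum_i \operatorname{var}(\delta_i) \\
&2.\;\; \sum_i \operatorname{var}(x_i) = \sum_i \operatorname{var}(\xi + \delta_i) = P \operatorname{var}(\xi) + \sum_i \operatorname{var}(\delta_i) \\
&3.\;\; \operatorname{var}\Big(\sum_i x_i\Big) = \sum_i \operatorname{var}(x_i) + \sum_{i \neq i'} \operatorname{cov}(x_i, x_{i'})
\end{aligned}
$$
Then:
$$
\begin{aligned}
\operatorname{cor}^2\Big(\sum_i x_i, \xi\Big)
&= \frac{\operatorname{cov}^2\big(\sum_i (\xi + \delta_i), \xi\big)}{\operatorname{var}\big(\sum_i x_i\big)\operatorname{var}(\xi)}
 = \frac{\operatorname{cov}^2\big(P\xi + \sum_i \delta_i, \xi\big)}{\operatorname{var}\big(\sum_i x_i\big)\operatorname{var}(\xi)}
 = \frac{P^2 \operatorname{var}^2(\xi)}{\operatorname{var}\big(\sum_i x_i\big)\operatorname{var}(\xi)}
 = \frac{P^2 \operatorname{var}(\xi)}{\operatorname{var}\big(\sum_i x_i\big)} \\
&= \frac{P(P-1)\,P \operatorname{var}(\xi)}{(P-1)\operatorname{var}\big(\sum_i x_i\big)}
 = \frac{P}{P-1} \cdot \frac{P^2 \operatorname{var}(\xi) - P \operatorname{var}(\xi)}{\operatorname{var}\big(\sum_i x_i\big)} \\
&= \frac{P}{P-1} \cdot \frac{\big[P^2 \operatorname{var}(\xi) + \sum_i \operatorname{var}(\delta_i)\big] - \big[\sum_i \operatorname{var}(\delta_i) + P \operatorname{var}(\xi)\big]}{\operatorname{var}\big(\sum_i x_i\big)} \\
&= \frac{P}{P-1} \cdot \frac{\operatorname{var}\big(\sum_i x_i\big) - \sum_i \operatorname{var}(x_i)}{\operatorname{var}\big(\sum_i x_i\big)} \qquad \text{(by 1 and 2)} \\
&= \frac{P}{P-1} \cdot \frac{\sum_{i \neq i'} \operatorname{cov}(x_i, x_{i'})}{\sum_i \operatorname{var}(x_i) + \sum_{i \neq i'} \operatorname{cov}(x_i, x_{i'})} \qquad \text{(by 3)} \\
&= \frac{P}{P-1}\left[1 - \frac{\sum_i \operatorname{var}(x_i)}{\operatorname{var}\big(\sum_i x_i\big)}\right]
\end{aligned}
$$
α ≤ 1, and α = 1 when all correlations between the x_i are equal to 1 and all variances of the x_i are equal to each other.
G. Russolillo – slide 105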
Cronbach’s α for standardized MVs
G. Russolillo – slide 106
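If variables are standardised,
$$ \operatorname{var}(H) = \operatorname{var}\Big(\sum_i x_i\Big) = P + \sum_{i \neq i'} \operatorname{cor}(x_i, x_{i'}) $$
so that
$$
\alpha = \left(\frac{P}{P-1}\right) \frac{\sum_{i \neq i'} \operatorname{cor}(x_i, x_{i'})}{P + \sum_{i \neq i'} \operatorname{cor}(x_i, x_{i'})}
= \left(\frac{P}{P-1}\right) \frac{P\left[\dfrac{\sum_{i \neq i'} \operatorname{cor}(x_i, x_{i'})}{P}\right]}{P\left[1 + \dfrac{\sum_{i \neq i'} \operatorname{cor}(x_i, x_{i'})}{P}\right]}
= \frac{P\left[\dfrac{\sum_{i \neq i'} \operatorname{cor}(x_i, x_{i'})}{P(P-1)}\right]}{1 + \dfrac{\sum_{i \neq i'} \operatorname{cor}(x_i, x_{i'})}{P}}
$$
Hence, we can write
$$ \alpha = \frac{P\bar r}{1 + (P-1)\bar r}, \quad \text{where } \bar r = \frac{\sum_{i \neq i'} \operatorname{cor}(x_i, x_{i'})}{P(P-1)} $$
$$ \bar r = 1 \Rightarrow \alpha = 1, \qquad \bar r = 0 \Rightarrow \alpha = 0 $$
A block of positively correlated variables is essentially unidimensional if the variables show a high mean correlation.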
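The closed form for standardised variables translates directly into code. The following is a small sketch with a hypothetical function name and a made-up correlation matrix, not data from the slides.

```python
def cronbach_alpha_standardized(corr):
    """alpha = P * r_bar / (1 + (P - 1) * r_bar), r_bar = mean off-diagonal correlation."""
    p = len(corr)
    off_diag_sum = sum(corr[i][j] for i in range(p) for j in range(p) if i != j)
    r_bar = off_diag_sum / (p * (p - 1))
    return p * r_bar / (1 + (p - 1) * r_bar)


# Hypothetical block of 4 standardised MVs, all pairwise correlations equal to 0.5.
R = [[1.0 if i == j else 0.5 for j in range(4)] for i in range(4)]
print(cronbach_alpha_standardized(R))  # 4 * 0.5 / (1 + 3 * 0.5) = 0.8
```

With the same mean correlation, adding indicators raises α, which is why α should be read together with the block size.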
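G. Russolillo – slide 107
Dillon Goldstein’s ρ : the rationale
Given the model
$$ x_{pq} = \lambda_{pq}\xi_q + \varepsilon_{pq}, $$
the variance of the sum of the MVs of block q is
$$ \operatorname{var}\Big(\sum_p x_{pq}\Big) = \operatorname{var}\Big[\sum_p (\lambda_{pq}\xi_q + \varepsilon_{pq})\Big] = \operatorname{var}\Big(\sum_p \lambda_{pq}\xi_q\Big) + \sum_p \operatorname{var}(\varepsilon_{pq}) = \Big(\sum_p \lambda_{pq}\Big)^2 \operatorname{var}(\xi_q) + \sum_p \operatorname{var}(\varepsilon_{pq}) $$
Dillon Goldstein’s ρ is defined as
$$ \rho_q = \frac{\left(\sum_p \lambda_{pq}\right)^2 \times \operatorname{var}(\xi_q)}{\left(\sum_p \lambda_{pq}\right)^2 \times \operatorname{var}(\xi_q) + \sum_p \operatorname{var}(\varepsilon_{pq})} $$
If the MVs and the LV are standardised, ρDG can be estimated as
$$ \hat\rho_q = \frac{\left(\sum_p \hat\lambda_{pq}\right)^2}{\left(\sum_p \hat\lambda_{pq}\right)^2 + \sum_p \left(1 - \hat\lambda_{pq}^2\right)} $$
NB: $\lambda_{pq}$ must be estimated!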
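For standardised MVs and LV, the estimate depends only on the loadings. Here is a minimal sketch; the function name and the loadings are hypothetical, chosen only to illustrate the arithmetic.

```python
def dillon_goldstein_rho(loadings):
    """rho = (sum lam)^2 / ((sum lam)^2 + sum(1 - lam^2)), standardised MVs and LV."""
    squared_sum = sum(loadings) ** 2
    error_var = sum(1 - lam * lam for lam in loadings)
    return squared_sum / (squared_sum + error_var)


# Hypothetical standardised loadings of a three-indicator block.
rho = dillon_goldstein_rho([0.8, 0.7, 0.6])
print(round(rho, 3))  # 4.41 / (4.41 + 1.51), about 0.745
```

Unlike Cronbach's α, this index weights each indicator by its estimated loading, so it requires the model to be fitted first.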
Assessing the fit of a SEM
Overall model fit measures
These tests help to evaluate whether the model fit is good and, if it is not, to measure the departure of the observed covariance matrix from the implied covariance matrix.
→ They cannot be applied if the model is exactly identified (because in this situation S always equals C).
→ Overall fit may be good even though parameter estimates are not statistically significant!
G. Russolillo – slide 108
Overall model fit measures
G. Russolillo – slide 109
The overall model fit measures are « functions » of the fitting function:
$$ f(S, C) = F $$
We can show that:
$$ (N-1)F \approx \chi^2_{DF} $$
with
$$ DF = \tfrac{1}{2}(P+Q)(P+Q+1) - t $$
N: number of observations
F: discrepancy function
DF: degrees of freedom of the model
Overall model fit measures
Chi-square Test - Global Validation Tests
H0 : Σ = C → Good fit
H1 : Σ ≠ C
$$ (N-1)F \sim \chi^2_{DF} $$
Decision Rule: The model is accepted if p-value ≥ 0.05 (we cannot reject the hypothesis H0) or if Chi-square/DF ≤ 2 (or other thresholds such as 3 or 5).
N.B.: For a fixed level of differences in covariance matrices, the estimate of the Chi-square increases with N.
The power (i.e. the probability of rejecting a false H0) depends on the sample size. If the sample size is large, this test may lead to rejecting the model even if the data fit the model well! We cannot use this test to compare models estimated on different sample sizes.
G. Russolillo – slide 110
Overall model fit measures: Test of Close-Fit
G. Russolillo – slide 111
RMSEA (Root mean square error of approximation) (Steiger and Lind)
$$ RMSEA = \sqrt{\frac{F_0}{DF}} $$
where $F_0 = \log|C| + \operatorname{tr}(\Sigma C^{-1}) - \log|\Sigma| - (P+Q)$
Including the degrees of freedom in the index favours parsimonious models.
Overall model fit measures: Test of Close-Fit
G. Russolillo – slide 112
Use of RMSEA in practice
RMSEA is estimated as:
$$ RMSEA_{estimated} = \sqrt{\frac{F}{DF} - \frac{1}{N-1}} $$
where $F = \log|C| + \operatorname{tr}(SC^{-1}) - \log|S| - (P+Q)$
→ The model is accepted if RMSEA ≤ 0.05 or, at least, less than 0.08
→ A confidence interval is usually provided
→ It depends very little on N
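The estimate can be computed from the minimised discrepancy F, the degrees of freedom and the sample size. The sketch below uses hypothetical inputs; truncating a negative value under the square root at zero is an assumption added here so that the function is always defined, and is not stated on the slide.

```python
import math


def rmsea_estimate(f, df, n):
    """Estimated RMSEA = sqrt(F / DF - 1 / (N - 1)), truncated at 0 (assumption)."""
    value = f / df - 1.0 / (n - 1)
    return math.sqrt(max(value, 0.0))


# Hypothetical fit: chi-square = (N - 1) * F = 12 with N = 301 and DF = 3.
n, df = 301, 3
f = 12 / (n - 1)
print(round(rmsea_estimate(f, df, n), 3))  # sqrt(0.04 / 3 - 1 / 300) = 0.1
```

With these inputs the index is above both the 0.05 and the 0.08 thresholds, so the hypothetical model would not be accepted.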
An Example of CFA in Psychology
G. Russolillo – slide 113
The 2nd model specified by Long
[Path diagram with estimates: loadings 1.24 (PSY67, PSY71) and .30 (PHY67, PHY71); error variances .52 (D1), .22 (D2), .40 (D3), .16 (D4); variances of XSI1 and XSI2 equal to 1.00; covariance between XSI1 and XSI2 equal to .67]

Model 2 by LONG
Chi-square = 22.317, DF = 3, P-Value = 0.000
Chi-square/DF = 7.439
→ P-value smaller than α = 0.05: H0: Σ = C REJECTED

RMSEA    LO 90    HI 90    PVALUE
0.103    0.066    0.145    0.011
→ RMSEA is higher than 0.05
→ P-value smaller than α = 0.1: H0: RMSEA < 0.05 REJECTED

→ Model 2 is not accepted.
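The chi-square figures reported for Long's second model can be checked by hand. The sketch below relies on the closed-form upper-tail probability of a chi-square variable that is valid only for exactly 3 degrees of freedom; the function name is an assumption, and general-purpose software would use a library routine instead.

```python
import math


def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


chi_square, df = 22.317, 3          # values reported for Model 2 by Long
ratio = chi_square / df
p_value = chi2_sf_3df(chi_square)
print(f"Chi-square/DF = {ratio:.3f}, p-value = {p_value:.3f}")
# Chi-square/DF = 7.439, p-value = 0.000
```

Both decision rules agree here: the p-value is below 0.05 and the ratio is far above the usual thresholds of 2, 3 or 5.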
Indices based on a baseline model
The SATURATED Model: This model contains as many parameter estimates as there are available degrees of freedom or inputs into the analysis. Therefore, this model shows 0 degrees of freedom. [This is the least restricted model possible]

The INDEPENDENCE Model: This model contains estimates of the variance of the observed variables only. In other words, it assumes all relationships between the observed variables are zero (uncorrelated), with no theoretical relationships. Therefore, this model shows the maximum number of degrees of freedom. [This is the most restrictive model possible and ANY TEST SHALL ALWAYS LEAD TO ITS REJECTION]
G. Russolillo – slide 114
Model Comparison
Independence vs Saturated model
G. Russolillo – slide 115
SATURATED Model: FIT = max, DF = 0, N_PAR = max, N_CONSTR = min, CHI2 = 0, P-value = max
INDEPENDENCE Model: FIT = min, DF = max, N_PAR = min, N_CONSTR = max, CHI2 = max, P-value = min
Indices based on a baseline model
Goodness-of-Fit Index (GFI)
This index was initially devised by Jöreskog and Sörbom (1984) for ML and ULS estimation. It has since been generalised to other estimation criteria.
$$ GFI = 1 - \frac{F}{F_{IND}} $$
where $F_{IND}$ is the fit function that would result if all parameters were zero (fit function of the INDEPENDENCE Model).
If a model is able to explain all the true covariance between the observed variables, then F/F_IND is 0 and GFI = 1.
The model is accepted if GFI is at least 0.9
G. Russolillo – slide 116
Indices based on a baseline model
Adjusted Goodness-of-Fit Index (AGFI)
$$ AGFI = 1 - \frac{DF_{IND}}{DF}\,(1 - GFI) $$
where $DF_{IND}$ is the number of degrees of freedom of the INDEPENDENCE Model.
This index takes into account the DF of the models by adjusting the GFI by the ratio between the maximum DF available and the DF of the model.
If AGFI equals 1, the model shows a perfect fit. N.B.: It is not bounded below by zero.
The model is accepted if AGFI is at least 0.9
G. Russolillo – slide 117
Indices based on a baseline model
Bentler Comparative Fit Index (CFI)
$$ CFI = \frac{\left[(N-1)F_{IND} - DF_{IND}\right] - \left[(N-1)F - DF\right]}{(N-1)F_{IND} - DF_{IND}} $$
where $F_{IND}$ is the fit function of the INDEPENDENCE Model and $DF_{IND}$ its number of degrees of freedom.
It is bounded between 0 and 1, with higher values indicating better fit.
The model is accepted if CFI is at least 0.9
G. Russolillo – slide 118
Indices based on a baseline model
Bentler-Bonett Non-Normed Fit Index (NNFI), also known as Tucker-Lewis index (TLI or rho2)
$$ NNFI = \frac{\dfrac{F_{IND}}{DF_{IND}} - \dfrac{F}{DF}}{\dfrac{F_{IND}}{DF_{IND}} - \dfrac{1}{n-1}} $$
It is usually (not necessarily) between 0 and 1, with higher values indicating better fit.
The model is accepted if TLI is at least 0.9
G. Russolillo – slide 119
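The four baseline-model indices share the same inputs, so they can be sketched side by side. The functions below transcribe the formulas of these slides as written (no truncation to the [0, 1] range is applied); the function names and all numerical inputs are hypothetical.

```python
def gfi(f, f_ind):
    """GFI = 1 - F / F_IND."""
    return 1 - f / f_ind


def agfi(f, f_ind, df, df_ind):
    """AGFI = 1 - (DF_IND / DF) * (1 - GFI)."""
    return 1 - (df_ind / df) * (1 - gfi(f, f_ind))


def cfi(f, f_ind, df, df_ind, n):
    """CFI = {[(N-1)F_IND - DF_IND] - [(N-1)F - DF]} / [(N-1)F_IND - DF_IND]."""
    baseline = (n - 1) * f_ind - df_ind
    return (baseline - ((n - 1) * f - df)) / baseline


def nnfi(f, f_ind, df, df_ind, n):
    """NNFI (TLI) = (F_IND/DF_IND - F/DF) / (F_IND/DF_IND - 1/(N-1))."""
    return (f_ind / df_ind - f / df) / (f_ind / df_ind - 1 / (n - 1))


# Hypothetical fit: N = 101, model F = 0.05 with DF = 3, independence F = 2.0 with DF = 6.
n, f, df, f_ind, df_ind = 101, 0.05, 3, 2.0, 6
for name, value in [("GFI", gfi(f, f_ind)), ("AGFI", agfi(f, f_ind, df, df_ind)),
                    ("CFI", cfi(f, f_ind, df, df_ind, n)),
                    ("NNFI", nnfi(f, f_ind, df, df_ind, n))]:
    print(f"{name} = {value:.3f}")
```

In this made-up case all four indices exceed the 0.9 threshold, so the model would be accepted by each criterion.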
Root Mean Square Residual
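$$ RMR = \sqrt{\frac{2}{P(P+1)} \sum_{p=1}^{P} \sum_{p'=1}^{p} \left(s_{pp'} - \hat\sigma_{pp'}\right)^2} $$
Good fit if RMR < 0.05
$$ \text{Standardized } RMR = \sqrt{\frac{2}{P(P+1)} \sum_{p=1}^{P} \sum_{p'=1}^{p} \left(r_{pp'} - \hat r_{pp'}\right)^2} $$
Good fit if SRMR < 0.1
G. Russolillo – slide 120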
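The residual-based index is a root mean square over the distinct elements of the residual matrix. Here is a minimal sketch with a hypothetical function name and made-up 2 × 2 matrices; passing correlation matrices instead of covariance matrices gives the standardized version.

```python
import math


def rmr(s, c):
    """Root mean square of the residuals s_pp' - c_pp' over the lower triangle (p' <= p)."""
    p = len(s)
    total = sum((s[i][j] - c[i][j]) ** 2 for i in range(p) for j in range(i + 1))
    return math.sqrt(2 * total / (p * (p + 1)))


# Hypothetical observed and implied correlation matrices.
S = [[1.0, 0.5], [0.5, 1.0]]
C = [[1.0, 0.4], [0.4, 1.0]]
print(round(rmr(S, C), 4))  # sqrt(2 * 0.01 / 6), about 0.0577
```

Only one of the three distinct elements differs in this example, which is why the index is much smaller than the single residual of 0.1.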
Comparing model fit measures
G. Russolillo – slide 121
Information-theory based indexes
$$ AIC = \chi^2 + 2t $$
$$ CAIC = \chi^2 + \left[1 + \ln(n)\right] t $$
$$ ECVI = \frac{\chi^2}{n} + 2\left(\frac{t}{n}\right) $$
where t = # of parameters to be estimated.
These indexes are used to compare the fit of nested models.
Between two nested models, we choose the one with the smallest index.
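The comparison rule can be illustrated with a short sketch. The three functions transcribe the formulas above; the function names, the two candidate models and the sample size are hypothetical.

```python
import math


def aic(chi2, t):
    """AIC = chi-square + 2t."""
    return chi2 + 2 * t


def caic(chi2, t, n):
    """CAIC = chi-square + (1 + ln n) t."""
    return chi2 + (1 + math.log(n)) * t


def ecvi(chi2, t, n):
    """ECVI = chi-square / n + 2 t / n."""
    return chi2 / n + 2 * t / n


# Two hypothetical nested models fitted on n = 100 observations.
models = {"A": (22.317, 7), "B": (2.0, 9)}   # name: (chi-square, # of parameters)
best = min(models, key=lambda name: aic(*models[name]))
print(best, aic(*models[best]))  # B 20.0
```

Model B has two more parameters, but its much smaller chi-square more than offsets the penalty, so it is preferred by AIC.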
Model revision
G. Russolillo – slide 122
Testing path coefficients and loadings
Path links may be removed if the associated regression coefficients are not statistically significant at a given significance level.
SEM output traditionally consists of:
- the unstandardised estimate for each regression coefficient,
- its standard error (S.E.),
- the estimate divided by the standard error (C.R. for Critical Ratio),
- the probability value (p-value) associated with the null hypothesis that the coefficient is 0.
Decision Rule:
• Coefficients with an absolute value of C.R. above 2 are considered significantly different from 0, and the related link is kept.
• Coefficients with an absolute value of C.R. below 2 are considered not significantly different from 0, and the related link is not validated.
G. Russolillo – slide 123
Modification indices
Adding a new link to the model...
The Modification Index (MI) represents the expected drop in the overall Chi-square value when a specified fixed parameter is freely estimated in a subsequent run. It is the difference between the Chi-square of the model in which the parameter under investigation is fixed and the Chi-square of the model in which this parameter is included in the set of parameters to be estimated (free parameters).
• The MI follows a Chi-square distribution with 1 df.
• A fixed parameter is included in the model (and so estimated) if the value of the MI is higher than 4.
Associated with each MI is an Expected Parameter Change (EPC) value that represents the predicted estimated change, in either a positive or negative direction.
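The threshold of 4 can be related to the usual 5% significance level. The sketch below uses the closed-form upper-tail probability of a chi-square variable with 1 df; the function name is an assumption.

```python
import math


def mi_p_value(mi):
    """Upper-tail probability of a chi-square variable with 1 df, evaluated at the MI."""
    return math.erfc(math.sqrt(mi / 2))


print(round(mi_p_value(3.841), 3))  # 0.05: critical value of chi-square with 1 df
print(round(mi_p_value(4.0), 3))    # 0.046: the rule "MI > 4" is slightly stricter
```

An MI above 4 therefore corresponds to an expected Chi-square drop that is significant at roughly the 5% level.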
G. Russolillo – slide 124
Pay attention..
Any change to the model based on the decrease in Chi-square measured by a modification index should be undertaken carefully, by evaluating the impact of the following consequences:
• Respecification of the model (getting surprising results is always nice as long as they are not illogical...);
• Corresponding changes in the parameters (evaluated by means of the EPC - Expected Parameter Change);
• Possible failure to replicate empirically based decisions (capitalization on chance).
G. Russolillo – slide 125
Sentimental Engagement
G. Russolillo – slide 126
6 blocks of manifest variables observed on 240 individuals:
- Commitment
- Satisfaction
- Rewards
- Costs
- Size of investment
- Alternative Value
Developing a causality model with latent variables by means of the Maximum Likelihood Approach to Structural Equation Modeling (SEM-ML)
- C. E. Rusbult: Commitment and satisfaction in romantic associations: A test of the investment model. Journal of Experimental Social Psychology, 1980
- L. Hatcher: A step-by-step approach to using the SAS system for factor analysis and structural equation modeling. SAS Institute, 2013
Sentimental Engagement
G. Russolillo – slide 127
Example of the investment block indicators
Please rate each of the following items to indicate the extent to which you agree or disagree with each statement. Use a response scale in which 1 = Strongly Disagree and 7 = Strongly Agree.
1. I have invested a great deal of time in my current relationship.
2. I have invested a great deal of energy in my current relationship.
3. I have invested a lot of my personal resources (e.g., money) in developing my current relationship.
4. My partner and I have developed a lot of mutual friends which I might lose if we were to break up.
Sentimental Engagement
G. Russolillo – slide 128
Example of the satisfaction block indicators
1. I am satisfied with my current relationship.
2. My current relationship comes close to my ideal relationship.
3. I am more satisfied with my relationship than is the average person.
4. I feel good about my current relationship.
Sentimental Engagement
G. Russolillo – slide 129
Example of the alternative value block indicators
1. There are plenty of other attractive people around for me to date if I
were to break up with my current partner.
2. It would be attractive for me to break up with my current partner
and date someone else.
3. It would be attractive for me to break up with my partner and “play
the field” for a while.
Sentimental Engagement
G. Russolillo – slide 130
[Path diagram: Sentimental Engagement SEM model. Six latent variables F1, ..., F6 (Commitment, Satisfaction, Rewards, Costs, Investments, Alternatives) measured by the indicators v1, ..., v19 with errors e1, ..., e19; structural disturbances d1 and d2; one loading per block and all error paths fixed to 1.]
Steps in SEM Analysis
G. Russolillo – slide 131
Step 1: Model specification
• usually done by drawing pictures using SEM software or by “coding” SEM equations in R
Step 2: Model identification
• Model identification rules
Step 3: Parameter estimation
• Minimize a discrepancy function (iterative process)
Step 4: Model fit and validation
• Fit indexes
• Model testing
Step 5: Model revision
• Modification indexes
Topics NOT addressed in this introduction to SEM
• Analysis of residuals
• Missing data
• Multiple group analysis
• Higher order Latent Variables
• Formative measurements
• Non-normality
• Non-linearity
• Moderation analysis
• Equivalent models
G. Russolillo – slide 132
Main references and further reading
• K.A. Bollen: Structural Equations with Latent Variables, John Wiley & Sons, 1989
• A. Beaujean: Latent Variable Modeling Using R: A Step-by-Step Guide, Routledge, 2014
• R. Cudeck, S. Du Toit, D. Sörbom (Editors): Structural Equation Modeling: Present and Future - A Festschrift in honor of Karl G. Jöreskog, Scientific Software International Chicago, 2001
• R. Hoyle: Structural equation modeling: concepts, issues and applications, SAGE Publications, 1995
• K.G. Jöreskog & D. Sörbom Advances in Factor Analysis and Structural Equation Models, Abt Books, 1979
• Rex B. Kline : Principles and Practice of Structural Equation Modeling, The Guilford Press, New York, 2005
• J.S. Long : Confirmatory Factor Analysis: A Preface to LISREL, SAGE Publications, 1983
• N. O’Rourke, L. Hatcher : A step-by-step approach to using the SAS system for factor analysis and structural equation modeling. Cary, NC: SAS Institute, Inc 2013
• D. Kaplan : Structural Equation Modeling: Foundations and Extensions, SAGE Publications, 2000
• Meredith, W. (1993). "Measurement invariance, factor analysis and factorial invariance." Psychometrika 58(4): 525-543.
• T. Raykov and G. A. Marcoulides: First Course in Structural Equation Modeling, Taylor and Francis, 2006
• Y. Rosseel: lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software ,48(2), 2012
G. Russolillo – slide 133
G. Russolillo – slide 134
This presentation is made available through a Creative Commons Attribution-Noncommercial license. Details of the license and permitted uses are available at
http://creativecommons.org/licenses/by-nc/4.0/
© 2018 G. Russolillo Title: An introduction to Structural Equation Modeling
Attribution: Giorgio Russolillo and Laura Trinchera