Upload
others
View
13
Download
0
Embed Size (px)
Citation preview
1
1
An Overview of STRUCTURAL EQUATION MODELS
WITH LATENT VARIABLES
Kenneth A. Bollen Odum Institute for Research in Social Science
Department of Sociology University of North Carolina
at Chapel Hill
Presented at the Miami University Symposium on Computational Research - March 1-2, 2007, Miami University, Oxford, OH.
Bollen
2
I.What are Structural Equation Models? (SEM) II. Fundamental Hypothesis [E = E(2)] III. Illustrations IV. Three Common Types of SEM V. Overview of Modeling Steps VI. SEM with Means and Intercepts VII. Software VIII. Empirical Example
Bollen
3
I. What are SEM? General Statistical Model Special Cases: ANOVA & ANCOVA Multiple Regression Econometric Models Path Analysis Factor Analysis Etc.
Bollen
4
SEM Allow: 1. Latent & Observed Variables 2. Random & Nonrandom Errors 3. Errors-in-Variables Regressions 4. Multiple Indicators 5. Restrictions on Parameters 6. Test of Model Fit 7. Nonnormal Variables
Bollen
5
II. Fundamental Hypothesis Ho: E = E(2) E = Population Covariance Matrix 2 = Vector of Parameters E(2) = Model Implied Covariance Matrix
Bollen
6
III. Illustrations A. Simple Regression as a SEM
y x
x y
Bollen
7
III.A.
Bollen
8
III.A.
Bollen
9
III.B. Simple Factor Analysis
y1 = η + ε1 y2 = η + ε2
•
y1 y2
•1 •2
1 1
Bollen
10
III.B.
E=E(2)
⎥⎥⎦
⎤
⎢⎢⎣
⎡
+
+=
⎥⎥⎦
⎤
⎢⎢⎣
⎡)()()(
)()(
)(),()(
2
1
221
1
εηη
εη
VARVARVAR
VARVAR
yVARyyCOVyVAR
e.g., COV(y1,y2)=COV(η + ε1, η + ε2) =COV(η,η)
=VAR(η)
Bollen
11
Confirmatory Factor Analysis: Air Quality Example
y1 = η1 + ε1 y2 = λ21η1 + ε2 y3 = λ31η1 + ε3 y4 = λ41η1 + ε4
Bollen
12
Latent Variables
Variables of Interest But Not Directly Measured
Common in Sciences: Intelligence, Worker Productivity, Diseases, Happiness, Value of House, Carrying Capacity, “Free” Market, Disturbance Variables
Bollen
13
III.C. Simple General Model
Bollen
14
III.C.
[ ] E = VAR (y) COV (x1,y) VAR (x1)
COV(x2, y) COV(x2, x1) VAR(x2)
E(2) =
[ ](2VAR(>) + VAR(.) (VAR(>) VAR(>) + VAR(*1) (VAR(>) VAR(>) VAR(>) + VAR(*2)
Bollen
15
IV. Three Types of SEM A. Classical Econometric 1. Multiequation System y=βy + Γx + ζ 2. No Measurement Error y=η x=ξ
Bollen
16
Felson and Bohrnstedt (1979) Model of Perceived Attractiveness and Perceived Academic Performance (N = 209) x1 = Grade Point Average x2 = Deviation of height from mean by grade and sex x3 = Weight adjusted for height x4 = Physical attractiveness rated by children outside class y1 = Perceived academic ability, based on class-mates’ ratings y2 = Perceived attractiveness, classmates’ ratings y1 = $12y2 + (11x1 + .1 y2 = $21y1 + (22x2 + (23x3 + (24x4 + .2 E(.i) = 0 COV(.1, .2) … 0 COV(.i,xj) = 0 for i=1,2; j=1,2,3,4
x1
x2
x3
x4
y2
y1 •1
•2
Bollen
17
IV.B. Confirmatory Factor Analysis 1. Latent Variables 2. Measurement Errors y = Λy η + ε y = vector of observed indicators η = vector of latent variables (or “factors”) ε = vector of “measurement errors” E(ε) = 0 COV(η ,ε) = 0
Bollen
18
Bollen
19
IV.B. Confirmatory Factor Analysis 1. Latent Variable Model η= Βη + Γξ + ζ η= latent endogenous variables ξ = latent exogenous variables ζ = disturbance vector Β = coefficient matrix for η or η effects Γ = coefficient matrix for ξ or η effects E(ζ) = 0, COV(ξ,ζ)=0
Bollen
20
IV.C. General SEM 2. Measurement Model
y = Λy η + ε E(ε)=0 x = Λx ξ+ δ E(δ)=0 y = indicators of η Λy=factor loadings of η or y ε = errors of measurement for y x= indicators of ξ Λx= factor loadings of ξ or x δ = errors of measurement for x δ, ε, ζ,ξ are uncorrelated
Bollen
21
IV.C. General SEM
Bollen
22
IV.C.
ξ1 = Industrialization 1960 xi = Indicators of industrialization 01 = Democratic political structure 1960 02 = Democratic political structure 1965 yi = Indicators of political democracy
Bollen
23
V. Overview of Steps in Modeling A. Specification B. Implied Covariance Matrix C. Identification D. Estimation E. Testing and Diagnostics F. Respecification
Bollen
24
V.A. Specification 1. What latent variables? 2. Relation between latent variables?
3. What measures? 4. Relation between measures and latent variables?
Bollen
25
V.B. Implied Covariance Matrix Ho: G = G(2) Each Model => G(2)
Bollen
26
e.g.,
Bollen
27
V.C. Identification Unique values for parameters? If E(21) =E(22), then 21 = 22 Identification VAR(y) = 21 + 22
21 22 VAR(y) ____________________________________ 5 5 10 7 3 10 9 1 10
Bollen
28
Establishing Identification 1. Algebraic Means E =E(2) solve for 2 2. Identification Rules 3. Empirical Tests
Bollen
29
e.g.,
Identified? Yes, Three Indicator Rule
Bollen
30
V.D. Estimation Ho: E =E(2)
S sample estimator of E E(θ̂ ) sample estimator of E(2) Choose θ̂ so E(θ̂ ) close to S
Bollen
31
e.g., y1 = x1 + .1
E =E(2)
⎥⎥⎥
⎦
⎤
⎢⎢⎢
⎣
⎡ +=
⎥⎥⎦
⎤
⎢⎢⎣
⎡1111
1111
221
1
)()()(
φφ
ψφ
xVARyxCOVyVAR
Goal to find 2
Bollen
32
⎥⎥⎦
⎤
⎢⎢⎣
⎡=
46
610S
⎥⎥⎥⎥
⎦
⎤
⎢⎢⎢⎢
⎣
⎡ +
=Σ1111
111111
ˆˆ
ˆˆˆ
)ˆ( φφ
φψφ
θ
Find 11φ̂ and 11ψ̂
Make E(θ̂ ) close to S
Say 11φ̂ = 7 , 11ψ̂ = 3
⎥⎥⎦
⎤
⎢⎢⎣
⎡=Σ
77
710)ˆ(θ
⎥⎥⎦
⎤
⎢⎢⎣
⎡
−−
−=Σ−
31
10)ˆ(θS
Bollen
33
ESTIMATORS A. Full Information 1. Maximum Likelihood (ML) 2. Generalized Least Squares (GLS) 3. Unweighted Least Squares (ULS) 4. Weighted Least Squares (WLS) [Arbitrary Distribution Function (ADF)] B. Limited Information 1. Two-Stage Least Squares (2SLS)
Bollen
34
V.E. Testing and Diagnostics Ho: E =E(2)
P2 Test Tm = (N-1)* Fit Function Min. df = ½ (p+q) (p+q+1) -# of parameters
Bollen
35
2. Overall Model Fit
BIC= Tm – df ln(N)
Bollen
36
V.E. Testing and Diagnostics
3. Residuals (S-E(θ̂ ))
4. Component Fit
5. Statistical Power
Bollen
37
e.g.,
P2 = 16.0 df = 2 p = .0003
Bollen
38
VI.E. Respecification
1. Substantive-Based Revisions 2. Lagrangian Multiplier 3. WALD Ê 4. Residuals (S - G(2))
Bollen
39
e.g.,
P2 = 4.2 df = 1 p = .04
Bollen
40
Steps in Modeling
1. Specification
2. Implied Covariance Matrix
3. Identification
4. Estimation
5. Testing
6. Respecification
Bollen
41
VI. SEM with Means and Intercepts
A. Specification
1. Latent Variable Model
η= "0 + Βη + Γξ + ζ
intercept
e.g, η1= "01 + Β12η2 + (11ξ1 + ζ1
η2= "02 + Β23η3 + (22ξ2 + ζ2
Bollen
42
VI. SEM with Means and Intercepts
A. Specification
2. Measurement Model
y = "y + Λy η + ε x = "x + Λx ξ+ δ
intercept
e.g., y1 = "y1 + λy11 η1 + ε1 y2 = "y2 + λy21 η1 + ε2 x1 = "x1 + λx11 ξ1+ δ1
Bollen
43
VI. SEM with Means and Intercepts
B. Implied Moments
1. Implied Covariance Matrix E =E(2)
2. Implied Mean Vextor μ = μ(θ) Mean of observed variables= Model implied means
Bollen
44
VI. C. Identification Still have Σ, covariance matrix of observed variables. New parameters: intercepts ( αη,αy,αx) & means of latent ξ (k) New observed variable information: Means of observed variables (μy μx) Can we use Σ, μy, and μx to find unique values of all model parameters? Yes identified
Bollen
45
VI. D. Estimation Same as before (e.g., ML, WLS, 2SLS) E. Testing Same as before (e.g., chi-square, IFI, etc.) F. Respecification Same as before
Bollen
46
VII. Software LISREL [ PRELIS, SIMPLIS] AMOS EQS Mplus Proc Calis in SAS
Mx EzPath AUFIT LINCS
Bollen
47
VIII. Empirical Example Robins, P. K. and R. W. West. 1977. Measurement Errors in the Estimation of Home Value. Journal of the American Statistical Association 72, 290-94.
x1
y1
x3
ζ1
η1
ε3
Figure Robins and West (1977, JASA) Value of Home (η1) with Causal Indicators of lot size (x1), square footage (x2), number of rooms (x3), etc. (xq), and Effect Indicators of Appraised Value (y1), Owner Estimate (y2), & Assessed Value (y3), and Disturbances (ζ1, ε1 to ε3 )
y2
ε1
x2
xq
y3
ε2
Bollen
48
Bollen
49
VIII. Empirical Example Bollen, K.A. 1993. Liberal Democracy: Validity and Method Factors in
Cross-National Measures. American Journal of Political Science, 37:1207-1230.
Bollen
50
Bollen
51