Upload
others
View
9
Download
0
Embed Size (px)
Citation preview
1
Structural Equation Models with Latent Variables
Francisco Peñagaricano University of Florida
Decipher causal relationships is the ultimate goal in most studies involving complex traits
§ unravel causal relations among variables can be used to predict the behavior of complex systems
Inferring causal effects from observational data is difficult due to the presence of potential confounders
§ suitable methodologies, e.g. structural equation models are already available and have been used in other fields
Causal Effects
2
modeling of causal relationships between multiple traits
Wright Haavelmo
the founding fathers considered SEM as a mathematical tool for drawing causal conclusions from a combination
of observational data and theoretical assumptions
Causality and Structural Equation Models
Structural Equation Models
Structural Equation Modeling is an inference tool:
INPUTS:
§ qualitative causal assumptions
§ empirical data
OUTPUTS:
§ quantitative causal conclusions
§ statistical measures of the fit of the causal model
modeling of causal relationships between multiple traits
Causality and Structural Equation Models
Structural Equation Models
3
Path Analysis (SEM considering only observed variables)
CM
MY
SCS
Fert
qualitative causal assumptions
+ DATA
sample variance-covariance
matrix
empirical data
CM
MY
SCS
Fert
quantitative causal conclusions (path coefficients)
Latent variables: variables that are not observable or are not directly measurable, but are characterized in the model from several observed variables
q one of the most remarkable features of SEM is the ability to consider latent variables
Structural Equation Models
4
Latent variables variables that are not observable or are not directly measurable, but are characterized in the model from several observed variables
Latent variables variables that are not observable or are not directly measurable, but are characterized in the model from several observed variables
o Intelligence o Meat quality
o Fertility
5
Latent variables variables that are not observable or are not directly measurable, but are characterized in the model from several observed variables
latent variables allow modeling complex phenomena reducing at the same time the dimensionality of the data
§ many phenotypes can be combined in a model to represent an underlying concept of interest
latent variables
observed variables
error terms
Confirmatory Factor Analysis
§ Measure latent variables § Reduce the dimensionality of the data § Test the statistical significance of factor loadings § Precursor of a hybrid model
X1 X2 X3 X4 X5 X6 X7 X8
L3 L2 L1
Measurement Analysis
Structural Equation Modeling
6
latent variables
observed variables
error terms
disturbance terms
X1 X2 X3 X4 X5 X6 X7 X8
L3 L2 L1
Hybrid Model § Evaluate presumed causal relations among latent variables
Structural Analysis
Structural Equation Modeling
( ) ( ) ( )( ) ( ) ( ) ( ) ( ) ( )
0
, , , , , , 0
ζ δ εE E E
Cov Cov Cov Cov Cov Covξ ζ ξ δ ξ ε δ ε ζ δ ζ ε
= = =
= = = = = =
X1 X2 X3 Y1 Y2 Y3 Y4 Y5
2η1η1ξ
Structural Model
η Βη Γξ ζ= + +
x Λ ξ δx= +
y Λ η εy= +
structural model
measurement model
latent variables
observed variables
error terms
disturbance terms
Assumptions:
7
X1 X2 X3 Y1 Y2 Y3 Y4 Y5
2η1η1ξ
Structural Model
latent variables
observed variables
error terms
disturbance terms
4
1 511
2 621
3 7
8
00
0 0 = 0
0 000
Λ Λ Γ Βx y
λλ λ
γλ λ
βλ λ
λ
⎡ ⎤⎢ ⎥⎡ ⎤ ⎢ ⎥ ⎡ ⎤ ⎡ ⎤⎢ ⎥ ⎢ ⎥= = =⎢ ⎥ ⎢ ⎥⎢ ⎥ ⎢ ⎥ ⎣ ⎦ ⎣ ⎦⎢ ⎥⎣ ⎦ ⎢ ⎥⎢ ⎥⎣ ⎦
x Λ ξ δx= +y Λ η εy= +
measurement model
η Βη Γξ ζ= + +structural model
Parameters
X1 X2 X3 Y1 Y2 Y3 Y4 Y5
2η1η1ξ
Structural Model
latent variables
observed variables
error terms
disturbance terms
x Λ ξ δx= +y Λ η εy= +
measurement model
η Βη Γξ ζ= + +structural model
[ ]
( )11
( ) ( )11 22
11( ) ( )1122 33
22( ) ( )33 44
( )55
00 0 0
00 0 0 0 0
0 0 0 0
δ εΘ Θ Ψ Φ
ε
δ ε
δ ε
δ ε
ε
θθ θ
ψθ φθ
ψθ θ
θ
⎡ ⎤⎢ ⎥⎡ ⎤ ⎢ ⎥ ⎡ ⎤⎢ ⎥ ⎢ ⎥= = = =⎢ ⎥⎢ ⎥ ⎢ ⎥ ⎣ ⎦⎢ ⎥ ⎢ ⎥⎣ ⎦⎢ ⎥⎣ ⎦ VCOV matrices
8
X1 X2 X3 Y1 Y2 Y3 Y4 Y5
2η1η1ξ
Structural Model
latent variables
observed variables
error terms
disturbance terms
Model Estimation
§ VCOV matrices are the fundamental units of the analysis
Estimation: find parameters that reproduce as close as possible the observed VCOV matrix
Structural Equation Modeling
ΣSΣ̂
variance-covariance matrix for the entire population
variance-covariance matrix computed from a sample of the population
model-based (fitted) variance-covariance matrix
ˆS Σ− residual variance-covariance matrix
9
Structural Equation Modeling
( ) ( )1ˆ ˆlog log Σ S Σ SMLF tr p q−= + × − − +
if variables follow a multivariate normal distribution, the ML
estimates are those that minimize the following fitting function:
ΣSΣ̂
variance-covariance matrix for the entire population
variance-covariance matrix computed from a sample of the population
model-based (fitted) variance-covariance matrix
ˆS Σ− residual variance-covariance matrix
Structural Equation Modeling
( ) ( )1ˆ ˆlog log Σ S Σ SMLF tr p q−= + × − − +
( ) 21 ~ML dfN F χ− ⋅
df: difference between the number of unique elements in the VCOV matrix and the number of free parameters in the model
test for model fit
In SEM: we want high P-values
(we want to NO reject H0)
10
Structural Equation Modeling
( ) ( )1ˆ ˆlog log Σ S Σ SMLF tr p q−= + × − − +
( ) 21 ~ML dfN F χ− ⋅test for model fit
§ global test: it evaluates simultaneously all the restrictions imposed in the variance-covariance matrix
§ if the test is significant, i.e. we reject H0, the source of the lack of fit is unclear
§ depends on the sample size and the number of parameters
Structural Equation Modeling Alternatives fit indices o Standardized Root Mean Square Residual (SRMR)
o Tucker-Lewis index (TLI)
o Root mean square error of approximation (RMSEA)
o Comparative fit index (CFI)
11
Latent Variable Modeling in a quantitative genetic context
X1 X2 X3 X4 X5 X6 X7 X8
L3 L2 L1
X1 X2 X3 X4 X5 X6 X7 X8
L3 L2 L1
Associations between latent variables can be explained not only by causal links between them, but also by genetic reasons
Latent Variable Modeling in a quantitative genetic context
X1 X2 X3 X4 X5 X6 X7 X8
L3 L2 L1
12
Latent Variable Modeling in a quantitative genetic context
X1 X2 X3 X4 X5 X6 X7 X8
L3 L2 L1
causal links can be masked by genetic covariances
X1 X2 X3 X4 X5 X6 X7 X8
L3 L2 L1
Latent Variable Modeling in a quantitative genetic context
X1 X2 X3 X4 X5 X6 X7 X8
L3 L2 L1
adjust the data for possible (confounder) genetic effects
X1 X2 X3 X4 X5 X6 X7 X8
L3 L2 L1
13
Searching for causal networks involving latent variables in complex traits
The approach can be summarized in the following steps: 1. modeling different latent variables
2. evaluating measurement models using CFA
3. fitting a multi-trait animal model using factor scores
4. fitting SEM for assessing causal links between latent variables
§ relevant: evaluate model fit and estimate model parameters
Peñagaricano et al. (2015) J Anim Sci. 93: 4617-4623
Searching for causal networks involving latent variables in complex traits
§ Application: evaluate possible causal relationships between growth, carcass and meat quality traits in pigs
v phenotypic data: dataset with 413 F2 pigs for which several phenotypes were recorded over time
14
Modeling different latent variables
WTWK06
Growth
WTWK10 WTWK13 WTWK16
post-weaning growth (body weights)
HAM
Primal Cuts
LOIN BOSTON PICNIC
weights of primal cuts
FIRST.RIB
Fat Compos.
LAST.RIB LAST.LUM 10TH.RIB
midline backfat thickness
Modeling different latent variables
MARB
Quality
FIRM DRIPLOSS
Juiciness
Taste
Tenderness ConnTissue Off.flavor
trained sensory panel
15
Confirmatory Factor Analysis
þ Model Fit Evaluation
þ Factor Loadings
Confirmatory Factor Analysis
16
Confirmatory Factor Analysis
define a measurement model for the relationship between multivariate
observations and underlying factors
Confirmatory Factor Analysis
model implied variance-covariance matrix of the latent variables !?
17
Searching for causal networks involving latent variables in complex traits
The approach can be summarized in the following steps: 1. modeling different latent variables
2. evaluating measurement models using CFA
3. fitting a multi-trait animal model using factor scores
Searching for causal networks involving latent variables in complex traits
The approach can be summarized in the following steps: 1. modeling different latent variables
2. evaluating measurement models using CFA
3. fitting a multi-trait animal model using factor scores
18
Searching for causal networks involving latent variables in complex traits
The approach can be summarized in the following steps: 1. modeling different latent variables
2. evaluating measurement models using CFA
3. fitting a multi-trait animal model using factor scores
4. fitting SEM for assessing causal links between latent variables
Structural Analysis
þ Model Fit Evaluation
19
Structural Analysis
Statistically Equivalent Models causal models that can have the same statistical consequences; these models fit any set of data equally well
identical implied VCOV matrices
identical residuals
20
Statistically Equivalent Models
§ translation from a causal hypothesis into a statistical model is imperfect
the asymmetrical links of the causal process are replaced by symmetrical relationships involving probability distributions
this imperfection arises because:
§ statistically equivalent models illustrate the limit on the ability to infer causality from statistical data alone
Structural Analysis Alternative Causal Structures