20
1 Structural Equation Models with Latent Variables Francisco Peñagaricano University of Florida Decipher causal relationships is the ultimate goal in most studies involving complex traits §unravel causal relations among variables can be used to predict the behavior of complex systems Inferring causal effects from observational data is difficult due to the presence of potential confounders §suitable methodologies, e.g. structural equation models are already available and have been used in other fields Causal Effects

Structural Equation Models with Latent Variables

  • Upload
    others

  • View
    9

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Structural Equation Models with Latent Variables

1

Structural Equation Models with Latent Variables

Francisco Peñagaricano University of Florida

Decipher causal relationships is the ultimate goal in most studies involving complex traits

§ unravel causal relations among variables can be used to predict the behavior of complex systems

Inferring causal effects from observational data is difficult due to the presence of potential confounders

§ suitable methodologies, e.g. structural equation models are already available and have been used in other fields

Causal Effects

Page 2: Structural Equation Models with Latent Variables

2

modeling of causal relationships between multiple traits

Wright Haavelmo

the founding fathers considered SEM as a mathematical tool for drawing causal conclusions from a combination

of observational data and theoretical assumptions

Causality and Structural Equation Models

Structural Equation Models

Structural Equation Modeling is an inference tool:

INPUTS:

§  qualitative causal assumptions

§  empirical data

OUTPUTS:

§  quantitative causal conclusions

§  statistical measures of the fit of the causal model

modeling of causal relationships between multiple traits

Causality and Structural Equation Models

Structural Equation Models

Page 3: Structural Equation Models with Latent Variables

3

Path Analysis (SEM considering only observed variables)

CM

MY

SCS

Fert

qualitative causal assumptions

+ DATA

sample variance-covariance

matrix

empirical data

CM

MY

SCS

Fert

quantitative causal conclusions (path coefficients)

Latent variables: variables that are not observable or are not directly measurable, but are characterized in the model from several observed variables

q one of the most remarkable features of SEM is the ability to consider latent variables

Structural Equation Models

Page 4: Structural Equation Models with Latent Variables

4

Latent variables variables that are not observable or are not directly measurable, but are characterized in the model from several observed variables

Latent variables variables that are not observable or are not directly measurable, but are characterized in the model from several observed variables

o Intelligence o Meat quality

o Fertility

Page 5: Structural Equation Models with Latent Variables

5

Latent variables variables that are not observable or are not directly measurable, but are characterized in the model from several observed variables

latent variables allow modeling complex phenomena reducing at the same time the dimensionality of the data

§  many phenotypes can be combined in a model to represent an underlying concept of interest

latent variables

observed variables

error terms

Confirmatory Factor Analysis

§ Measure latent variables § Reduce the dimensionality of the data § Test the statistical significance of factor loadings § Precursor of a hybrid model

X1 X2 X3 X4 X5 X6 X7 X8

L3 L2 L1

Measurement Analysis

Structural Equation Modeling

Page 6: Structural Equation Models with Latent Variables

6

latent variables

observed variables

error terms

disturbance terms

X1 X2 X3 X4 X5 X6 X7 X8

L3 L2 L1

Hybrid Model §  Evaluate presumed causal relations among latent variables

Structural Analysis

Structural Equation Modeling

( ) ( ) ( )( ) ( ) ( ) ( ) ( ) ( )

0

, , , , , , 0

ζ δ εE E E

Cov Cov Cov Cov Cov Covξ ζ ξ δ ξ ε δ ε ζ δ ζ ε

= = =

= = = = = =

X1 X2 X3 Y1 Y2 Y3 Y4 Y5

2η1η1ξ

Structural Model

η Βη Γξ ζ= + +

x Λ ξ δx= +

y Λ η εy= +

structural model

measurement model

latent variables

observed variables

error terms

disturbance terms

Assumptions:

Page 7: Structural Equation Models with Latent Variables

7

X1 X2 X3 Y1 Y2 Y3 Y4 Y5

2η1η1ξ

Structural Model

latent variables

observed variables

error terms

disturbance terms

4

1 511

2 621

3 7

8

00

0 0 = 0

0 000

Λ Λ Γ Βx y

λλ λ

γλ λ

βλ λ

λ

⎡ ⎤⎢ ⎥⎡ ⎤ ⎢ ⎥ ⎡ ⎤ ⎡ ⎤⎢ ⎥ ⎢ ⎥= = =⎢ ⎥ ⎢ ⎥⎢ ⎥ ⎢ ⎥ ⎣ ⎦ ⎣ ⎦⎢ ⎥⎣ ⎦ ⎢ ⎥⎢ ⎥⎣ ⎦

x Λ ξ δx= +y Λ η εy= +

measurement model

η Βη Γξ ζ= + +structural model

Parameters

X1 X2 X3 Y1 Y2 Y3 Y4 Y5

2η1η1ξ

Structural Model

latent variables

observed variables

error terms

disturbance terms

x Λ ξ δx= +y Λ η εy= +

measurement model

η Βη Γξ ζ= + +structural model

[ ]

( )11

( ) ( )11 22

11( ) ( )1122 33

22( ) ( )33 44

( )55

00 0 0

00 0 0 0 0

0 0 0 0

δ εΘ Θ Ψ Φ

ε

δ ε

δ ε

δ ε

ε

θθ θ

ψθ φθ

ψθ θ

θ

⎡ ⎤⎢ ⎥⎡ ⎤ ⎢ ⎥ ⎡ ⎤⎢ ⎥ ⎢ ⎥= = = =⎢ ⎥⎢ ⎥ ⎢ ⎥ ⎣ ⎦⎢ ⎥ ⎢ ⎥⎣ ⎦⎢ ⎥⎣ ⎦ VCOV matrices

Page 8: Structural Equation Models with Latent Variables

8

X1 X2 X3 Y1 Y2 Y3 Y4 Y5

2η1η1ξ

Structural Model

latent variables

observed variables

error terms

disturbance terms

Model Estimation

§  VCOV matrices are the fundamental units of the analysis

Estimation: find parameters that reproduce as close as possible the observed VCOV matrix

Structural Equation Modeling

ΣSΣ̂

variance-covariance matrix for the entire population

variance-covariance matrix computed from a sample of the population

model-based (fitted) variance-covariance matrix

ˆS Σ− residual variance-covariance matrix

Page 9: Structural Equation Models with Latent Variables

9

Structural Equation Modeling

( ) ( )1ˆ ˆlog log Σ S Σ SMLF tr p q−= + × − − +

if variables follow a multivariate normal distribution, the ML

estimates are those that minimize the following fitting function:

ΣSΣ̂

variance-covariance matrix for the entire population

variance-covariance matrix computed from a sample of the population

model-based (fitted) variance-covariance matrix

ˆS Σ− residual variance-covariance matrix

Structural Equation Modeling

( ) ( )1ˆ ˆlog log Σ S Σ SMLF tr p q−= + × − − +

( ) 21 ~ML dfN F χ− ⋅

df: difference between the number of unique elements in the VCOV matrix and the number of free parameters in the model

test for model fit

In SEM: we want high P-values

(we want to NO reject H0)

Page 10: Structural Equation Models with Latent Variables

10

Structural Equation Modeling

( ) ( )1ˆ ˆlog log Σ S Σ SMLF tr p q−= + × − − +

( ) 21 ~ML dfN F χ− ⋅test for model fit

§  global test: it evaluates simultaneously all the restrictions imposed in the variance-covariance matrix

§  if the test is significant, i.e. we reject H0, the source of the lack of fit is unclear

§  depends on the sample size and the number of parameters

Structural Equation Modeling Alternatives fit indices o  Standardized Root Mean Square Residual (SRMR)

o  Tucker-Lewis index (TLI)

o  Root mean square error of approximation (RMSEA)

o  Comparative fit index (CFI)

Page 11: Structural Equation Models with Latent Variables

11

Latent Variable Modeling in a quantitative genetic context

X1 X2 X3 X4 X5 X6 X7 X8

L3 L2 L1

X1 X2 X3 X4 X5 X6 X7 X8

L3 L2 L1

Associations between latent variables can be explained not only by causal links between them, but also by genetic reasons

Latent Variable Modeling in a quantitative genetic context

X1 X2 X3 X4 X5 X6 X7 X8

L3 L2 L1

Page 12: Structural Equation Models with Latent Variables

12

Latent Variable Modeling in a quantitative genetic context

X1 X2 X3 X4 X5 X6 X7 X8

L3 L2 L1

causal links can be masked by genetic covariances

X1 X2 X3 X4 X5 X6 X7 X8

L3 L2 L1

Latent Variable Modeling in a quantitative genetic context

X1 X2 X3 X4 X5 X6 X7 X8

L3 L2 L1

adjust the data for possible (confounder) genetic effects

X1 X2 X3 X4 X5 X6 X7 X8

L3 L2 L1

Page 13: Structural Equation Models with Latent Variables

13

Searching for causal networks involving latent variables in complex traits

The approach can be summarized in the following steps: 1. modeling different latent variables

2. evaluating measurement models using CFA

3. fitting a multi-trait animal model using factor scores

4. fitting SEM for assessing causal links between latent variables

§  relevant: evaluate model fit and estimate model parameters

Peñagaricano et al. (2015) J Anim Sci. 93: 4617-4623

Searching for causal networks involving latent variables in complex traits

§  Application: evaluate possible causal relationships between growth, carcass and meat quality traits in pigs

v phenotypic data: dataset with 413 F2 pigs for which several phenotypes were recorded over time

Page 14: Structural Equation Models with Latent Variables

14

Modeling different latent variables

WTWK06

Growth

WTWK10 WTWK13 WTWK16

post-weaning growth (body weights)

HAM

Primal Cuts

LOIN BOSTON PICNIC

weights of primal cuts

FIRST.RIB

Fat Compos.

LAST.RIB LAST.LUM 10TH.RIB

midline backfat thickness

Modeling different latent variables

MARB

Quality

FIRM DRIPLOSS

Juiciness

Taste

Tenderness ConnTissue Off.flavor

trained sensory panel

Page 15: Structural Equation Models with Latent Variables

15

Confirmatory Factor Analysis

þ Model Fit Evaluation

þ Factor Loadings

Confirmatory Factor Analysis

Page 16: Structural Equation Models with Latent Variables

16

Confirmatory Factor Analysis

define a measurement model for the relationship between multivariate

observations and underlying factors

Confirmatory Factor Analysis

model implied variance-covariance matrix of the latent variables !?

Page 17: Structural Equation Models with Latent Variables

17

Searching for causal networks involving latent variables in complex traits

The approach can be summarized in the following steps: 1. modeling different latent variables

2. evaluating measurement models using CFA

3. fitting a multi-trait animal model using factor scores

Searching for causal networks involving latent variables in complex traits

The approach can be summarized in the following steps: 1. modeling different latent variables

2. evaluating measurement models using CFA

3. fitting a multi-trait animal model using factor scores

Page 18: Structural Equation Models with Latent Variables

18

Searching for causal networks involving latent variables in complex traits

The approach can be summarized in the following steps: 1. modeling different latent variables

2. evaluating measurement models using CFA

3. fitting a multi-trait animal model using factor scores

4. fitting SEM for assessing causal links between latent variables

Structural Analysis

þ Model Fit Evaluation

Page 19: Structural Equation Models with Latent Variables

19

Structural Analysis

Statistically Equivalent Models causal models that can have the same statistical consequences; these models fit any set of data equally well

identical implied VCOV matrices

identical residuals

Page 20: Structural Equation Models with Latent Variables

20

Statistically Equivalent Models

§  translation from a causal hypothesis into a statistical model is imperfect

the asymmetrical links of the causal process are replaced by symmetrical relationships involving probability distributions

this imperfection arises because:

§  statistically equivalent models illustrate the limit on the ability to infer causality from statistical data alone

Structural Analysis Alternative Causal Structures