Covariance structures in longitudinal analysis
Which one to choose?
Repeated Measures
Importance of Covariance Structures
Variability not explained by the fixed effects is modelled in the covariance structure
It represents the background variability that the fixed effects are tested against
An appropriate structure is needed for valid inferences about the fixed-effects parameters
Selecting the Appropriate Covariance Structure
Choice of covariance structure is a balance since:
Too simple: Type I error rate increases
Too complex: power and efficiency decrease
Example
How does the left atrial dimension change over time in patients newly diagnosed with atrial fibrillation?
Atrial fibrillation is an irregularity of the heart’s rhythm
Due to chaotic electrical activity in the upper chambers (atria), the atria quiver instead of contracting in an organized manner
Atrial enlargement may be related to how easily a subject can return to a normal rhythm and to the likelihood of a blood clot forming, which can lead to stroke
Heart Diagram
Example - Data
Data source: Canadian Registry of Atrial Fibrillation
Left atrial dimension measured at enrolment, Year 2, Year 4, Year 7 and Year 10
Fit a model with fixed effects only, adjusting for age at first diagnosis of atrial fibrillation (AF), gender, hypertension at enrolment and visit year
Example
Model specification
$Y = X\beta + Z\gamma + \varepsilon$, where:
Y = response over time
X = design matrix for fixed effects
$\beta$ = parameters for fixed effects
Z = vector of 1s for the random effects
$\gamma$ = parameters for random effects
$\varepsilon$ = within-subject variation
Fixed effects only: $Y = X\beta + \varepsilon$
SAS Code
PROC MIXED < options > ;
   CLASS variables ;
   MODEL dependent = < fixed-effects > < / options > ;
   RANDOM random-effects < / options > ;
   REPEATED < repeated-effect > / TYPE = covariance-structure ;
REPEATED vs. RANDOM statement
The RANDOM statement relates to random effects
The REPEATED statement relates to the structure of the within subject errors.
Each statement has a different role, but a model with a compound symmetry covariance structure can be specified with either statement
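The overlap between the two statements can be made explicit with a short derivation. This is a sketch under assumptions not stated on the slide: a single random intercept per subject with variance $\sigma_b^2$, and independent within-subject errors with variance $\sigma^2$ (both symbols are introduced here for illustration only). With Z a vector of 1s, the covariance matrix of one subject's responses is:

```latex
\operatorname{Var}(Y_i)
  = Z\,\sigma_b^2\,Z^\top + \sigma^2 I
  = \sigma_b^2\,\mathbf{1}\mathbf{1}^\top + \sigma^2 I
  = \begin{pmatrix}
      \sigma_b^2 + \sigma^2 & \sigma_b^2 & \cdots & \sigma_b^2 \\
      \sigma_b^2 & \sigma_b^2 + \sigma^2 & \cdots & \sigma_b^2 \\
      \vdots & \vdots & \ddots & \vdots \\
      \sigma_b^2 & \sigma_b^2 & \cdots & \sigma_b^2 + \sigma^2
    \end{pmatrix}
```

Every variance equals $\sigma_b^2 + \sigma^2$ and every covariance equals $\sigma_b^2$, which is the two-parameter compound symmetry pattern. One difference worth keeping in mind: written as a random intercept, the common covariance is a variance and so cannot be negative, whereas a compound symmetry structure specified directly on the errors has no such restriction.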
Models with REPEATED Statement only
No random effects specified in model
Assume the random-effects variation is small compared to the within-subject error
Covariance structure is based only on the within subject error.
General covariance structure
Assume homogeneity for practical reasons: it reduces the number of parameters estimated
The homogeneity assumption can be relaxed (and tested), but this needs a sufficient amount of data
Block Diagonal Covariance Matrix
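$$
r \sim N\left(0,\;
\begin{pmatrix}
\Sigma & 0 & \cdots & 0 \\
0 & \Sigma & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \Sigma
\end{pmatrix}\right)
$$
Each diagonal block $\Sigma$ is the within-subject covariance matrix of one subject; observations from different subjects are uncorrelated.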
Covariance structures
Number of parameters with 5 time points:
Simple (VC, Variance Components): 1 parameter
Unstructured (UN): 15 parameters
Compound Symmetry (CS): 2 parameters
First-order Autoregressive [AR(1)]: 2 parameters
Toeplitz (TOEP): 5 parameters
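The parameter counts above follow from the shape of each matrix. The sketch below builds the patterned structures for t time points and counts their free parameters. It is an illustration only: the function names are hypothetical, the numeric inputs are arbitrary, and the AR(1) form assumes equally spaced occasions.

```python
# Illustrative builders for common within-subject covariance structures.
# All function names and numeric inputs are hypothetical examples.

def simple(t, var):
    """VC: one common variance, zero covariances."""
    return [[var if i == j else 0.0 for j in range(t)] for i in range(t)]

def compound_symmetry(t, var, cov):
    """CS: one common variance and one common covariance."""
    return [[var if i == j else cov for j in range(t)] for i in range(t)]

def ar1(t, var, rho):
    """AR(1): covariance var * rho**lag, for equally spaced occasions."""
    return [[var * rho ** abs(i - j) for j in range(t)] for i in range(t)]

def toeplitz(bands):
    """TOEP: bands[k] is the covariance at lag k (bands[0] is the variance)."""
    t = len(bands)
    return [[bands[abs(i - j)] for j in range(t)] for i in range(t)]

def n_params(structure, t):
    """Number of free covariance parameters for t time points."""
    counts = {
        "VC": 1,
        "CS": 2,
        "AR(1)": 2,
        "TOEP": t,                 # one parameter per lag 0..t-1
        "UN": t * (t + 1) // 2,    # every variance and covariance is free
    }
    return counts[structure]

for name in ["VC", "UN", "CS", "AR(1)", "TOEP"]:
    print(name, n_params(name, 5))
```

With five visits, as in the example, the unstructured matrix has 5 variances and 10 covariances, which is where the 15 comes from.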
Draftsman’s plots
2D array of scatterplots for each pair of time lagged observations
For 3 time points (Y1, Y2 and Y3): Y1 vs. Y2, Y1 vs. Y3, Y2 vs. Y3
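The panels of a draftsman's plot are simply all unordered pairs of time points. A minimal sketch of that enumeration (the helper name `draftsman_pairs` is hypothetical):

```python
from itertools import combinations

def draftsman_pairs(labels):
    """Return every unordered pair of time points, one per scatterplot panel."""
    return list(combinations(labels, 2))

# Three time points give three panels.
print(draftsman_pairs(["Y1", "Y2", "Y3"]))

# Five visits, as in the left atrial dimension example, give 5 * 4 / 2 = 10 panels.
print(len(draftsman_pairs(["la0", "la2", "la4", "la7", "la10"])))
```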
Draftsman’s plot – Simulation examples
[Figures: draftsman’s plots of simulated data (Y1 to Y4) under independence, autoregressive and compound symmetry structures]
Example – Draftsman’s plot
[Figure: draftsman’s plot of left atrial dimension at enrolment and Years 2, 4, 7 and 10 (la0, la2, la4, la7, la10)]
Example - Correlation matrix
         LA_0    LA_2    LA_4    LA_7    LA_10
LA_0    1.000   0.703   0.702   0.674   0.589
LA_2            1.000   0.777   0.706   0.708
LA_4                    1.000   0.751   0.720
LA_7                            1.000   0.724
LA_10                                   1.000
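One way to read this matrix is to average the correlations at each lag and compare them with the geometric decay an AR(1) structure would imply. The sketch below does that with the values from the table. It is a rough check only: lags are counted in visits, not calendar years (the visits are unequally spaced), and the lag-1 average is used as a stand-in for the AR(1) correlation.

```python
# Upper triangle of the correlation matrix above, row by row.
upper = [
    [0.703, 0.702, 0.674, 0.589],  # LA_0 with LA_2, LA_4, LA_7, LA_10
    [0.777, 0.706, 0.708],         # LA_2 with LA_4, LA_7, LA_10
    [0.751, 0.720],                # LA_4 with LA_7, LA_10
    [0.724],                       # LA_7 with LA_10
]

def mean_corr_by_lag(rows):
    """Average the correlations that are the same number of visits apart."""
    by_lag = {}
    for row in rows:
        for lag, r in enumerate(row, start=1):
            by_lag.setdefault(lag, []).append(r)
    return {lag: sum(v) / len(v) for lag, v in sorted(by_lag.items())}

means = mean_corr_by_lag(upper)
rho = means[1]  # lag-1 average, used as a crude AR(1) correlation
for lag, m in means.items():
    print(f"lag {lag}: observed {m:.3f}, AR(1)-implied {rho ** lag:.3f}")
```

The observed averages fall off far more slowly than the geometric pattern, which is in line with AR(1) fitting worst in the fit statistics reported later.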
Variogram
graphical description of the time/spatial correlation between observations
summarises the relationship between differences in pairs of measurements and the distance of the corresponding points from each other
Equally or unequally spaced observation periods
Variogram
Calculate the sample variogram components:
$v_{ijk} = \tfrac{1}{2}(r_{ij} - r_{ik})^2$, where $r_{ij}$ = residual
$u_{ijk} = |t_{ij} - t_{ik}|$, where $t_{ij}$ = time
Plot $v_{ijk}$ vs. $u_{ijk}$
Process variance: estimated by the average of $\tfrac{1}{2}(r_{ij} - r_{lk})^2$ for $i \neq l$
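The calculation above can be sketched in a few lines. The function names are hypothetical and the residuals and times are made-up toy values, used only to show the mechanics: within-subject pairs give the variogram points, and between-subject pairs give the process variance estimate.

```python
from itertools import combinations

def sample_variogram(residuals, times):
    """Return (u, v) points from all within-subject pairs of occasions.

    residuals[i][j] and times[i][j] belong to subject i at occasion j.
    """
    points = []
    for r_i, t_i in zip(residuals, times):
        for j, k in combinations(range(len(r_i)), 2):
            v = 0.5 * (r_i[j] - r_i[k]) ** 2   # half squared difference
            u = abs(t_i[j] - t_i[k])           # time lag between the pair
            points.append((u, v))
    return points

def process_variance(residuals):
    """Average half squared difference over pairs from different subjects."""
    total, count = 0.0, 0
    for i, l in combinations(range(len(residuals)), 2):
        for r_ij in residuals[i]:
            for r_lk in residuals[l]:
                total += 0.5 * (r_ij - r_lk) ** 2
                count += 1
    return total / count

# Toy data (invented): two subjects, three occasions each.
toy_residuals = [[1.0, 0.0, -1.0], [0.5, -0.5, 0.0]]
toy_times = [[0, 2, 4], [0, 2, 4]]

points = sample_variogram(toy_residuals, toy_times)
sigma2 = process_variance(toy_residuals)
print(points)
print(sigma2)
```

In practice the points would be smoothed or averaged within each lag before plotting, and the plotted curve compared with the process variance line.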
Variogram - Theoretical
[Figure: theoretical variogram against time lag, with the process variance divided into measurement error, within-subject correlation and random effects]
Variogram – Sitka tree example
Example - Variogram
[Figure: sample variogram plotted against lag in months]
Which covariance structure?
Fit the model with different covariance structures
Compare goodness-of-fit statistics to choose the covariance structure
Goodness-of-fit statistics
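Bayesian information criterion (BIC): $\mathrm{BIC} = -2\,\mathrm{loglik} + d \log n$
Akaike information criterion (AIC): $\mathrm{AIC} = -2\,\mathrm{loglik} + 2d$
where d is the number of estimated parameters and n is the sample size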
Estimation method for the covariance parameters
Maximum Likelihood (ML) versus Restricted Maximum Likelihood (REML)
Both are based on likelihood principles and share the properties of consistency, asymptotic normality and efficiency
Differences between the two increase as the number of fixed effects in the model increases
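The difference between the two criteria can be written out for the model $Y = X\beta + \varepsilon$. The notation below is introduced here for illustration and is not from the slides: $V = V(\theta)$ is the covariance matrix of Y as a function of the covariance parameters $\theta$, n is the number of observations, p is the rank of X, and r is the generalized least squares residual.

```latex
\ell_{\mathrm{ML}}(\theta)
  = -\tfrac{1}{2}\log|V| - \tfrac{1}{2}\, r^\top V^{-1} r - \tfrac{n}{2}\log(2\pi)

\ell_{\mathrm{REML}}(\theta)
  = -\tfrac{1}{2}\log|V| - \tfrac{1}{2}\log\left|X^\top V^{-1} X\right|
    - \tfrac{1}{2}\, r^\top V^{-1} r - \tfrac{n-p}{2}\log(2\pi)

r = Y - X\left(X^\top V^{-1} X\right)^{-1} X^\top V^{-1} Y
```

The extra term $-\tfrac{1}{2}\log|X^\top V^{-1} X|$ depends on the fixed-effects design matrix, so restricted likelihoods are comparable only between models that share the same fixed effects.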
ML vs. REML
The goodness-of-fit statistics from the two methods differ in which part of the model they assess
ML: describes the fit of the whole model (fixed and random effects)
REML: describes the fit of the stochastic portion (random effects)
Which goodness-of-fit statistic?
BIC penalises additional parameters more heavily than AIC ($d \log n$ vs. $2d$, whenever $\log n > 2$), so it favours a simpler model
A model that is too simple has inflated Type I error rates
Typically, choose the model based on AIC
Example
Which covariance structure fits the best?
Fit statistics (number of covariance parameters in parentheses)

Fit statistic              UN (15)   CS (2)   TOEP (5)   AR(1) (2)
-2 Res Log Likelihood       3655.5   3670.6     3663.5      3729.5
AIC (smaller is better)     3685.5   3674.6     3673.5      3733.5
BIC (smaller is better)     3726.4   3680.0     3687.2      3739.0
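The AIC row can be reproduced from the -2 Res Log Likelihood row and the parameter counts. The sketch below does so with the values from the table; the function names are hypothetical. The `bic` helper takes the sample size n as an argument because n is not reported on the slides, so it is defined but not evaluated here.

```python
import math

# Values copied from the fit statistics table.
neg2_res_loglik = {"UN": 3655.5, "CS": 3670.6, "TOEP": 3663.5, "AR(1)": 3729.5}
n_cov_params = {"UN": 15, "CS": 2, "TOEP": 5, "AR(1)": 2}

def aic(neg2ll, d):
    """AIC = -2 loglik + 2d."""
    return neg2ll + 2 * d

def bic(neg2ll, d, n):
    """BIC = -2 loglik + d log n; n must be supplied by the analyst."""
    return neg2ll + d * math.log(n)

aic_values = {name: aic(neg2_res_loglik[name], d) for name, d in n_cov_params.items()}
best_by_aic = min(aic_values, key=aic_values.get)

for name, value in aic_values.items():
    print(f"{name}: AIC = {value:.1f}")
print("smallest AIC:", best_by_aic)
```

In the table, AIC is smallest for TOEP by a narrow margin over CS, while BIC is smallest for CS, which reflects BIC's heavier penalty on the three extra Toeplitz parameters.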
Fixed Effects Parameter Estimates
Effect      Covariance structure   Estimate   SE      t-statistic   p-value
Intercept   UN                     34.237     3.681    9.3          <.0001
            CS                     33.265     3.832    8.68         <.0001
            TOEP                   33.323     3.810    8.75         <.0001
            AR(1)                  33.361     3.412    9.78         <.0001
Age         UN                      0.048     0.064    0.74         0.4585
            CS                      0.060     0.066    0.9          0.3676
            TOEP                    0.059     0.066    0.9          0.3693
            AR(1)                   0.058     0.059    0.99         0.323
Female      UN                     -1.135     1.513   -0.75         0.455
            CS                     -1.213     1.574   -0.77         0.4425
            TOEP                   -1.141     1.563   -0.73         0.4672
            AR(1)                  -0.995     1.391   -0.72         0.4759
Fixed Effect Parameters – cont’d
Effect         Covariance structure   Estimate   SE      t-statistic   p-value
Hypertension   UN                     3.123      1.548    2.02         0.0461
               CS                     3.007      1.610    1.87         0.0645
               TOEP                   3.021      1.600    1.89         0.0616
               AR(1)                  3.044      1.423    2.14         0.0347
Time           UN                     0.626      0.064    9.76         <.0001
               CS                     0.629      0.057   11.02         <.0001
               TOEP                   0.632      0.065    9.72         <.0001
               AR(1)                  0.653      0.099    6.58         <.0001
Likelihood ratio test (LRT)
For nested models, can also test whether the additional parameters give a statistically significant improvement in the model
For the example, the LRT for TOEP (5 parameters) vs. CS (2 parameters) does not show a significant improvement, so choose the CS model
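The TOEP vs. CS comparison can be worked through from the -2 Res Log Likelihood values reported earlier (3670.6 for CS, 3663.5 for TOEP). The sketch below assumes both models were fitted by REML with the same fixed effects and that the difference is referred to a chi-square distribution with 5 - 2 = 3 degrees of freedom. The helper `chi2_sf_3df` is hypothetical and uses the closed form of the chi-square upper tail that holds for 3 degrees of freedom only.

```python
import math

def chi2_sf_3df(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

# -2 Res Log Likelihood values from the fit statistics table.
neg2ll_cs = 3670.6     # compound symmetry, 2 parameters
neg2ll_toep = 3663.5   # Toeplitz, 5 parameters

lrt = neg2ll_cs - neg2ll_toep   # likelihood ratio statistic
df = 5 - 2                      # extra parameters in the larger model
p_value = chi2_sf_3df(lrt)

print(f"LRT = {lrt:.1f} on {df} df, p = {p_value:.3f}")
```

The statistic is about 7.1, just under the 5% critical value of roughly 7.81 for 3 degrees of freedom, so the extra Toeplitz parameters are not supported at that level.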
Summary
Graphical plots to help identify covariance structure
AIC and BIC to choose between covariance structures
LRT to test if additional parameters are warranted