CBMS Lecture 6
Alan E. Gelfand
Duke University
Multivariate spatial modeling
I Point-referenced spatial data often come as multivariate measurements at each location
I Examples:
  I Environmental monitoring stations yield measurements on ozone, NO, CO, PM2.5, etc.
  I In atmospheric modeling, at a given site we observe surface temperature, precipitation, and wind speed
  I At a monitoring site we observe precipitation, wet sulfate deposition, and wet nitrate deposition
  I At locations in a forest, we observe tree growth, soil moisture, light availability, and climate variables
  I In real estate modeling, for a commercial property we observe selling price and total rental income
I We anticipate dependence between measurements
  I at a particular location
  I across locations
Basic issues
I Y(s) denotes a p × 1 vector of random variables at s
I We seek to model {Y(s) : s ∈ D}, again specifying finite-dimensional distributions, e.g., for Y = (Y(s1), . . . , Y(sn))
I Crucial object: the cross-covariance
  C(s, s′) = Cov(Y(s), Y(s′)),
  a p × p matrix that need not be symmetric, i.e., cov(Yj(s), Yj′(s′)) need not equal cov(Yj′(s), Yj(s′))
I C(s, s′) is not positive definite except in a limiting sense: C(s, s) is the covariance matrix associated with Y(s)
I Our primary focus: Gaussian processes and valid specification for C(s, s′) to overlay
Separable models
I A common specification is the separable model
C (s, s′) = ρ(s, s′) · T
where ρ is a valid (univariate) correlation function and T is a p × p positive definite matrix
I T is the non-spatial or “local” covariance matrix
I ρ controls spatial association based upon proximity
I Easy to verify that $\Sigma_Y = H \otimes T$, where $H_{ij} = \rho(s_i, s_j)$ and $\otimes$ is the Kronecker product
I $\Sigma_Y$ is positive definite since $H$ and $T$ are
I $\Sigma_Y$ is convenient since $|\Sigma_Y| = |H|^p\,|T|^n$ and $\Sigma_Y^{-1} = H^{-1} \otimes T^{-1}$
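The determinant and inverse identities for the separable covariance can be checked numerically. The sketch below is only an illustration: the site coordinates, the decay value `phi`, the matrix `T`, and the helper `exp_corr` are all made up for the check, and an exponential correlation function is assumed.

```python
import numpy as np

def exp_corr(coords, phi):
    """Exponential correlation matrix H_ij = exp(-phi * ||s_i - s_j||) (assumed form)."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return np.exp(-phi * d)

rng = np.random.default_rng(0)
n, p = 5, 2                              # illustrative sizes
coords = rng.uniform(size=(n, 2))        # hypothetical site locations
H = exp_corr(coords, phi=3.0)            # n x n spatial correlation
T = np.array([[2.0, 0.6],
              [0.6, 1.0]])               # hypothetical p x p local covariance

Sigma = np.kron(H, T)                    # separable covariance, np x np

# log|Sigma| computed directly and via |H|^p |T|^n
logdet_direct = np.linalg.slogdet(Sigma)[1]
logdet_kron = p * np.linalg.slogdet(H)[1] + n * np.linalg.slogdet(T)[1]

# inverse computed directly and via H^{-1} (x) T^{-1}
inv_direct = np.linalg.inv(Sigma)
inv_kron = np.kron(np.linalg.inv(H), np.linalg.inv(T))

print(np.isclose(logdet_direct, logdet_kron))
print(np.allclose(inv_direct, inv_kron, atol=1e-6))
```

The computational benefit is that only an n × n and a p × p matrix are ever factorized, never the full np × np matrix.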
Application: Bivariate spatial regression
I A single covariate X(s) and a univariate response Y(s)
I Treat this as a bivariate process. (WHY?)
\[ Z(s) = \begin{pmatrix} X(s) \\ Y(s) \end{pmatrix} \sim N(\mu(s), T) \]
I Simplifying assumptions:
  I Separable cross-covariance for Z(s)
  I µ(s) = (µ1, µ2), i.e., constant means
I Then $p(Y(s) \mid X(s)) = N(\beta_0 + \beta_1 X(s), \sigma^2)$, where
\[ \beta_0 = \mu_2 - \frac{T_{12}}{T_{11}}\,\mu_1, \qquad \beta_1 = \frac{T_{12}}{T_{11}}, \qquad \sigma^2 = T_{22} - \frac{T_{12}^2}{T_{11}} \]
I Regression model parameters are functions of process model parameters
Bivariate spatial regression (cont’d)
I Rearrangement of the components of Z to $\tilde{Z} = (X(s_1), X(s_2), \ldots, X(s_n), Y(s_1), Y(s_2), \ldots, Y(s_n))'$ yields
\[ \begin{pmatrix} X \\ Y \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_1 \mathbf{1} \\ \mu_2 \mathbf{1} \end{pmatrix},\; T \otimes H(\phi) \right) \]
I Priors: Wishart for $T^{-1}$, vague but proper normal for (µ1, µ2), discrete prior for φ
I Full conditionals for Gibbs sampler: again Wishart for $T^{-1}$, bivariate normal for (µ1, µ2); sampling from a discrete distribution for φ, or perhaps a uniform on (0, 0.5 max dist)
Dew-shrub data example
I 1129 locations with UTM coordinates
I Y (s) : shrub density at location s
I X (s) : Dew duration at location s
I Illustrative analysis assuming separability and an exponential correlation function, $\rho(h;\phi) = e^{-\phi h}$
I Conjugate priors for µ, T as above; prior for φ has infinite variance and suggests a range (3/φ) of 125 km, roughly half the maximum pairwise distance in the region
I (µ1, µ2, T11, T12, T22) updated directly; φ updated via Metropolis
I Posterior samples of (β0, β1, σ²) from posterior samples of process parameters
Parameter estimation, dew-shrub data
Parameter          2.5%      50%    97.5%
µ1                73.12    73.89    74.67
µ2                 5.20     5.38    5.572
T11               95.10   105.22   117.69
T12               -4.46    -2.42    -0.53
T22                5.56     6.19     6.91
φ                  0.01     0.03     0.21
β0                 5.72     7.08     8.46
β1                -0.04    -0.02    -0.01
σ²                 5.58     6.22     6.93
T12/√(T11 T22)    -0.17    -0.10    -0.02

⇒ Surprising: a significant negative association between dew duration and shrub density!
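As a rough check of how the regression parameters follow from the process parameters, the sketch below plugs the posterior medians of (µ1, µ2, T11, T12, T22) from the table into the formulas for β0, β1, and σ². The function name is hypothetical, and a plug-in of medians only approximates the tabulated summaries, which the slides obtain from posterior samples.

```python
import math

def regression_from_process(mu1, mu2, t11, t12, t22):
    """Map bivariate process parameters to conditional regression parameters."""
    beta1 = t12 / t11                    # slope
    beta0 = mu2 - beta1 * mu1            # intercept
    sigma2 = t22 - t12 ** 2 / t11        # conditional variance of Y given X
    corr = t12 / math.sqrt(t11 * t22)    # local correlation between X and Y
    return beta0, beta1, sigma2, corr

# Posterior medians from the dew-shrub table, used as plug-in values
beta0, beta1, sigma2, corr = regression_from_process(
    mu1=73.89, mu2=5.38, t11=105.22, t12=-2.42, t22=6.19
)
print(f"beta0={beta0:.3f}, beta1={beta1:.4f}, sigma2={sigma2:.3f}, corr={corr:.3f}")
```

The plug-in intercept, slope, and correlation land close to the tabulated medians; the plug-in σ² differs slightly from the tabulated median, as one would expect when summaries are computed sample by sample rather than from medians.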
Benefits and limitations of separability
I Benefits:
  I Easy interpretation (decomposition of variance structure)
  I Substantial computational benefits
I Limitations:
  I Symmetry in cross-covariance matrix (not so serious)
  I Imposes same spatial range for every component (more serious, only one correlation function)
I A proposed solution:
  I Coregionalization models
A simple nonseparable example
I The delay effect or pure offset model
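I Define Y(s) to be two-dimensional such that $Y_2(s) = Y_1(s + \lambda)$, where λ is a delay vector
I Cross covariance matrix is
\[ \begin{pmatrix} \sigma^2\rho(h) & \sigma^2\rho(h + \lambda) \\ \sigma^2\rho(-h + \lambda) & \sigma^2\rho(h) \end{pmatrix} \]
I Here ρ is a valid correlation function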
I Can add a nugget, i.e., define Y2(s) = Y1(s + λ) + ε(s)
I Potential application to exposures driven by wind direction
Linear Model of Coregionalization
I For point-referenced data, $Y(s) = A\,w(s)$ where $w(s) = (w_1(s), w_2(s), \ldots, w_p(s))$
I The $w_j$ are p independent spatial processes with stationary correlation functions $\rho_j(s - s')$, $j = 1, 2, \ldots, p$
I $\rho_j = \rho$ for all j ⇐⇒ separable case with AA′ = T
I In general, the cross covariance matrix is (with $a_j$ being the columns of A)
\[ C(s - s') = \sum_{j=1}^{p} \rho_j(s - s')\, a_j a_j' \]
I Approach is “constructive”, so C(s − s′) is immediately valid, still stationary, and provides a distinct covariance function for each component
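The sum-of-rank-one form of the LMC cross-covariance can be sketched in a few lines. Everything specific here is an assumption for illustration: the exponential form of each ρj, the matrix `A`, the decay values `phis`, and the function name `lmc_cross_cov`.

```python
import numpy as np

def lmc_cross_cov(h, A, phis):
    """C(h) = sum_j rho_j(h) a_j a_j', with rho_j(h) = exp(-phi_j ||h||) (assumed)."""
    dist = np.linalg.norm(h)
    p = A.shape[0]
    C = np.zeros((p, p))
    for j, phi in enumerate(phis):
        a_j = A[:, j]                               # j-th column of A
        C += np.exp(-phi * dist) * np.outer(a_j, a_j)
    return C

A = np.array([[1.0, 0.0],
              [0.5, 0.8]])                          # hypothetical lower triangular A
phis = [1.0, 4.0]                                   # hypothetical decay parameters

C0 = lmc_cross_cov(np.zeros(2), A, phis)            # lag zero
Ch = lmc_cross_cov(np.array([0.3, 0.4]), A, phis)   # lag with ||h|| = 0.5

print(np.allclose(C0, A @ A.T))   # lag-zero covariance equals T = AA'
print(np.allclose(Ch, Ch.T))      # cross-covariance matrix is symmetric
```

With a lower triangular `A`, the first component inherits only ρ1, while the second mixes ρ1 and ρ2, which is how each component ends up with its own covariance function.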
Linear Model of Coregionalization
I More general: Y(s) = A(s)w(s). A spatially varying LMC!
I Modeling A(s) ⇔ modeling T(s) = A(s)A′(s)
I Possibilities for T(s):
  I T(s) = g(X(s)) × T
  I T(s) is a spatial process (e.g., $T^{-1}(s)$ is a spatial Wishart process)
I Computationally demanding
cont.
I Specification of A
I p × p entries in A but, since A ⇔ T, we only require p(p+1)/2 parameters. For convenience, we often take A to be lower triangular.
I Given φ1, . . . , φp, the cross covariance matrix is symmetric, regardless of A.
I Number of parameters in the model: p(p+1)/2 + pm, where m is the dimension of φj, i.e., the number of parameters in each individual correlation function.
I With p = 2, we have 3 parameters in A and, using exponential correlation functions (m = 1), pm = 2 decay parameters
cont.
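I The one-to-one relationship between T and lower triangular A is standard (the Cholesky factorization T = AA′).
I When p = 2 we have
\[ a_{11} = \sqrt{T_{11}}, \qquad a_{21} = \frac{T_{12}}{\sqrt{T_{11}}}, \qquad a_{22} = \sqrt{T_{22} - \frac{T_{12}^2}{T_{11}}} \]
I When p = 3 we add
\[ a_{31} = \frac{T_{13}}{\sqrt{T_{11}}}, \qquad a_{32} = \frac{T_{11}T_{23} - T_{12}T_{13}}{\sqrt{T_{11}\,(T_{11}T_{22} - T_{12}^2)}}, \qquad a_{33} = \sqrt{T_{33} - \frac{T_{13}^2}{T_{11}} - \frac{(T_{11}T_{23} - T_{12}T_{13})^2}{T_{11}\,(T_{11}T_{22} - T_{12}^2)}} \]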
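The closed-form entries of the lower triangular A can be compared against a numerical Cholesky factor. This is a minimal sketch: the matrix `T` is a made-up positive definite example and `lower_triangular_A` is a hypothetical helper that spells out the p = 3 formulas.

```python
import numpy as np

def lower_triangular_A(T):
    """Closed-form lower triangular A with AA' = T for a 3 x 3 positive definite T."""
    T11, T12, T13 = T[0, 0], T[0, 1], T[0, 2]
    T22, T23, T33 = T[1, 1], T[1, 2], T[2, 2]
    d = T11 * T22 - T12 ** 2            # leading 2 x 2 determinant
    num = T11 * T23 - T12 * T13
    A = np.zeros((3, 3))
    A[0, 0] = np.sqrt(T11)
    A[1, 0] = T12 / np.sqrt(T11)
    A[1, 1] = np.sqrt(T22 - T12 ** 2 / T11)
    A[2, 0] = T13 / np.sqrt(T11)
    A[2, 1] = num / np.sqrt(T11 * d)
    A[2, 2] = np.sqrt(T33 - T13 ** 2 / T11 - num ** 2 / (T11 * d))
    return A

# Hypothetical positive definite covariance matrix
T = np.array([[4.0, 1.2, 0.8],
              [1.2, 3.0, 0.5],
              [0.8, 0.5, 2.0]])

A = lower_triangular_A(T)
print(np.allclose(A, np.linalg.cholesky(T)))  # matches NumPy's lower Cholesky factor
print(np.allclose(A @ A.T, T))                # reproduces T
```

In practice one would call `np.linalg.cholesky` directly; the explicit formulas matter mainly when priors are placed on T and A must be expressed as a function of its entries.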
cont.
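I More explicitly
\[ \begin{pmatrix} Y_1(s) \\ Y_2(s) \\ \vdots \\ Y_j(s) \\ \vdots \\ Y_p(s) \end{pmatrix} = \begin{pmatrix} a_{11} w_1(s) \\ a_{21} w_1(s) + a_{22} w_2(s) \\ \vdots \\ \sum_{l=1}^{j} a_{jl} w_l(s) \\ \vdots \\ \sum_{l=1}^{p} a_{pl} w_l(s) \end{pmatrix} \]
I Y(s) is stationary and has a symmetric cross-covariance matrix, with a different variance and, if the ρ(·; φj)'s are isotropic, a different range for each component of Y(s).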
General Multivariate Spatial Model
I So, we arrive at the model
\[ Y(s) = \mu(s) + v(s) + \epsilon(s) \]
with
I $\epsilon(s) \sim N(0, D_\epsilon)$, where $D_\epsilon$ is diagonal with $(D_\epsilon)_{jj} = \tau_j^2$
I v(s) = Aw(s) following the previous specification
I $w_j(s)$ are mean 0 Gaussian processes with individual correlation functions
I µ(s) arises from $\mu_j(s) = X_j^T(s)\beta_j$
A useful example
I Spatially varying coefficient models (Gelfand et al., 2003)
I Model $Y(s) = X(s)^T\beta(s) + \epsilon(s)$.
I Here Y(s) is univariate. The multivariate process is for β(s). Use coregionalization here.
I For p = 2, with X(s) having a column of “1”s, we obtain β0(s) + X(s)β1(s)
I Spatially varying intercept (like a spatial random effect) and a spatially varying slope.
I Analogous to longitudinal growth curve models
I A very rich class of nonlinear models
I We infer about the multivariate process β(s) while observing only the univariate Y(s) process
Hierarchical Model
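I 1st stage: $Y(s_i) \mid \{\beta_j\}, \{v(s_i)\}, D_\epsilon \sim N(\mu(s_i) + v(s_i), D_\epsilon)$.
I 2nd stage:
\[ v = \begin{pmatrix} v(s_1) \\ \vdots \\ v(s_n) \end{pmatrix} \sim N\Big(0,\; \sum_{j=1}^{p} H_j \otimes T_j\Big), \]
where $(H_j)_{ik} = \rho_j(s_i - s_k)$ and $T_j = a_j a_j'$.
I Concatenate the $Y(s_i)$ into $Y$ and the $\mu(s_i)$ into $\mu$, then marginalize over $v$:
\[ f(Y \mid \{\beta_j\}, D_\epsilon, \{\rho_j\}, T) = N\Big(\mu,\; \sum_{j=1}^{p} (H_j \otimes T_j) + I_{n \times n} \otimes D_\epsilon\Big). \]
I 3rd stage: Priors on $\{\beta_j\}$, $\{\tau_j^2\}$, $T$ and the parameters of the $\rho_j$.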
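The marginal covariance in the second stage can be assembled directly from Kronecker products. The sketch below assumes exponential correlation functions and uses made-up coordinates, coregionalization matrix, decay parameters, and nugget variances; `marginal_cov` is a hypothetical helper, not code from the lecture.

```python
import numpy as np

def marginal_cov(coords, A, phis, tau2):
    """Sum_j (H_j kron T_j) + I_n kron D_eps, with T_j = a_j a_j'."""
    n = coords.shape[0]
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    Sigma = np.kron(np.eye(n), np.diag(tau2))     # nugget part, I_n kron D_eps
    for j, phi in enumerate(phis):
        H_j = np.exp(-phi * d)                    # assumed exponential correlation
        T_j = np.outer(A[:, j], A[:, j])          # rank-one matrix a_j a_j'
        Sigma += np.kron(H_j, T_j)
    return Sigma

rng = np.random.default_rng(1)
coords = rng.uniform(size=(4, 2))                 # hypothetical sites, n = 4
A = np.array([[1.0, 0.0],
              [0.5, 0.8]])                        # hypothetical A, p = 2
tau2 = [0.1, 0.2]                                 # hypothetical nugget variances

Sigma = marginal_cov(coords, A, phis=[1.0, 4.0], tau2=tau2)
print(Sigma.shape)                                # (n*p, n*p)
print(np.all(np.linalg.eigvalsh(Sigma) > 0))      # positive definite
```

Each p × p diagonal block of `Sigma` equals AA′ + Dε, the within-site covariance; unlike the separable case, the sum of Kronecker products does not factor, which is the source of the extra computational cost.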
California Pollution Data Example
I From the California Air Resources Board. Available for download at http://www.arb.ca.gov/aqd/aqdcd/aqdcddld.htm.
I Daily averages of carbon monoxide (CO), nitric oxide (NO), and nitrogen dioxide (NO2), based on hourly measurements on July 6, 1999, at 68 sites.
I The observed correlations between these pollutants range from 0.46 (CO and NO) to 0.77 (NO and NO2).
I Use the logarithm of the daily average of each of these variables.
I No information on covariates, such as temperature or wind direction, at these gauged sites.
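A model specification
I Can specify coregionalization model sequentially
I We anticipate smooth exposure surfaces so only spatial random effects, no nuggets
I The model:
\[ \begin{aligned} \mathrm{CO}(s) &= \mu_1 + \sigma_1 \tilde{w}_1(s) \\ \mathrm{NO}(s) \mid \mathrm{CO}(s) &= \mu_2 + \alpha\,\mathrm{CO}(s) + \sigma_2 \tilde{w}_2(s) \\ \mathrm{NO_2}(s) \mid \mathrm{CO}(s), \mathrm{NO}(s) &= \mu_3 + \gamma\,\mathrm{CO}(s) + \beta\,\mathrm{NO}(s) + \sigma_3 \tilde{w}_3(s), \end{aligned} \]
with $\tilde{w}_j(s) \sim GP(0, \rho_j)$ and $\rho_j(s - s') = \exp\{-\psi_j \lVert s - s' \rVert\}$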
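Substituting each conditional equation into the next shows that the sequential specification is an LMC with a lower triangular A. The sketch below writes out that implied A and the resulting T = AA′; the helper name `implied_A` is hypothetical, and the numbers plugged in are the posterior medians of σ1², σ2², σ3², α, β, γ from the Posterior Summaries table, so the result only roughly tracks the tabulated coregionalization matrix.

```python
import numpy as np

def implied_A(sigma, alpha, beta, gamma):
    """Lower triangular A implied by the sequential CO -> NO -> NO2 model."""
    s1, s2, s3 = sigma
    return np.array([
        [s1,                           0.0,       0.0],   # CO  = mu1 + s1*w1
        [alpha * s1,                   s2,        0.0],   # NO  picks up alpha*CO
        [(gamma + beta * alpha) * s1,  beta * s2, s3],    # NO2 picks up gamma*CO + beta*NO
    ])

# Plug-in posterior medians from the slides' summary table
sigma = np.sqrt([0.381, 0.668, 0.199])
A = implied_A(sigma, alpha=0.292, beta=0.301, gamma=0.199)
T = A @ A.T                                   # implied local covariance matrix

print(np.round(T, 4))
```

The off-diagonal entries T12 and T13 obtained this way are close to the medians reported for the coregionalization matrix, which is a useful sanity check on the algebra.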
Prior specifications
µ1 ∼ N(0, 5), µ2 ∼ N(0, 5), µ3 ∼ N(0, 5)
α ∼ N(0, 5), γ ∼ N(0, 0.2), β ∼ N(0, 0.2)
$\sigma_1^2 \sim IG(5, 0.35 \times 4)$, $\sigma_2^2 \sim IG(5, 0.52 \times 4)$, $\sigma_3^2 \sim IG(5, 0.13 \times 4)$

ψ1 ∼ Ga(0.6, 1), ψ2 ∼ Ga(0.6, 1), ψ3 ∼ Ga(0.6, 1)

p(ψj) based on ψ = 3/range and range = 0.5 max dist
Posterior Summaries
Posterior summaries for CO (1), NO (2) and NO2 (3).

Parameter      Mean    2.50%   Median   97.50%
α             0.296    0.045    0.292    0.553
β             0.302    0.190    0.301    0.413
γ             0.198    0.082    0.199    0.314
µ1           -0.922   -1.135   -0.921   -0.710
µ2           -5.015   -5.504   -5.018   -4.538
µ3           -2.602   -3.281   -2.602   -1.943
φ1            4.995    2.789    4.882    7.952
φ2            2.186    1.064    2.081    3.854
φ3            1.209    0.525    1.157    2.201
σ1²           0.391    0.267    0.381    0.570
σ2²           0.698    0.438    0.668    1.163
σ3²           0.215    0.124    0.199    0.407
CO range      0.647    0.380    0.614    1.103
NO range      1.497    0.772    1.405    2.733
NO2 range     1.334    0.706    1.236    2.473
Coregionalization matrix
Posterior median with the associated 95% credible interval (in brackets) of the elements of the coregionalization matrix and the correlation matrix for each location s.

      Y1                    Y2                    Y3
Y1    0.3812 (0.27; 0.57)   0.1108 (0.01; 0.24)   0.1085 (0.06; 0.20)
Y2                          0.7110 (0.47; 1.22)   0.2354 (0.14; 0.42)
Y3                                                0.3000 (0.21; 0.50)

Correlations

      Y1    Y2                    Y3
Y1    1     0.2134 (0.03; 0.41)   0.3223 (0.17; 0.49)
Y2          1                     0.520 (0.31; 0.69)
Y3                                1
Prediction of NO2
Prediction of NO2 based on three different models.

(i) Independent model for NO2

Site     Mean    2.50%   Median   97.50%   Observed
1      -4.869   -5.986   -4.839   -3.802     -4.342
2      -4.624   -5.082   -4.632   -4.127     -4.585
3      -4.294   -4.679   -4.294   -3.896     -4.100

(ii) Model for NO2 conditioned on CO

Site     Mean    2.50%   Median   97.50%   Observed
1      -4.722   -5.712   -4.733   -3.73      -4.342
2      -4.7     -5.106   -4.702   -4.301     -4.585
3      -4.132   -4.471   -4.131   -3.794     -4.100

(iii) Model for NO2 conditioned on CO and NO

Site     Mean    2.50%   Median   97.50%   Observed
1      -4.5     -5.313   -4.508   -3.679     -4.342
2      -4.585   -4.964   -4.587   -4.22      -4.585
3      -3.966   -4.26    -3.964   -3.653     -4.100
Other Approaches
I Moving average or kernel convolution of a process:
\[ Y_j(s) = \int k_j(u)\, Z(s + u)\, du = \int k_j(s - s')\, Z(s')\, ds' \]
where Z(s) is a univariate spatial process and the $k_j$ are kernel functions, j = 1, 2, . . . , p. This yields the cross covariance
\[ C_{ij}(s - s') = \int\!\!\int k_i(s - s' + u)\, k_j(u')\, \rho(u - u')\, du\, du' \]
I Convolution of covariance functions: Suppose $C_1, C_2, \ldots, C_p$ are valid covariance functions. Define $C_{ij}(s) = \int C_i(s - t)\, C_j(t)\, dt$. Then the p × p matrix $C(s) = \{C_{ij}(s)\}$ is a valid cross covariance function
Multivariate Areal Data Examples
I Cancer counts for areal units for several different types of cancers
I Employment rates by sectors for a set of areal units
I Individual level bivariate data within units, e.g., height adjusted for age (HAZ) and weight adjusted for age (WAZ) with areal unit level spatial effects for each outcome
I Spatially varying coefficient models with coefficients at areal scale because covariates are at areal scale
Multivariate Areal Data Models
I Now areal units (e.g., counties) instead of points
I Need to model dependence within and across units
I As in the univariate case, use spatial random effects $\phi_{ji}$, where again i = 1, . . . , n indexes region but now j = 1, . . . , p indexes variables (e.g., cancer type) within region
I Suppose we observe $Y_i = (Y_{1i}, Y_{2i}, \ldots, Y_{pi})$. Then
\[ g(E(Y_{ji})) = x_{ji}^T \beta_j + \phi_{ji}, \]
with $\phi_i = (\phi_{1i}, \ldots, \phi_{pi})$ and $\phi = (\phi_1, \ldots, \phi_n)$.
I Link function g useful for modeling rates (e.g., Poisson disease mapping).
I Multivariate CAR (MCAR) model for the φji
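Some models
I Illustrate with p = 2
I A disease mapping example: $Y_{1i}, Y_{2i}$ are counts for diseases 1 and 2 in unit i
\[ Y_{ji} \sim Po(\lambda_{ji}), \quad j = 1, 2, \qquad \lambda_{ji} = E_{ji}\,\eta_{ji}, \qquad \log \eta_{ji} = X_{ji}^T \beta_j + \phi_{ji} \]
I Bivariate CAR model for $\{\phi_{1i}, \phi_{2i}\}$
I Height and weight example:
\[ Y_{ir} = \begin{pmatrix} HAZ_{ir} \\ WAZ_{ir} \end{pmatrix} = X_{ir} \begin{pmatrix} \beta^{(H)} \\ \beta^{(W)} \end{pmatrix} + \begin{pmatrix} \phi_i^{(H)} \\ \phi_i^{(W)} \end{pmatrix} + \begin{pmatrix} \epsilon_{ir}^{(H)} \\ \epsilon_{ir}^{(W)} \end{pmatrix} \]
I Bivariate CAR model for $\{\phi_i^{(H)}, \phi_i^{(W)}\}$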
Multivariate CAR (MCAR) models
I Again, local or neighbor idea, conditioning, CAR
I Approach 1: multivariate CAR (MCAR) in the form $p(\phi_i \mid \phi_j, j \neq i)$ with
\[ p(\phi_i \mid \phi_{j \neq i}, \Sigma_i) = N\Big(\sum_j B_{ij}\phi_j,\; \Sigma_i\Big), \quad i = 1, \ldots, n \]
I As earlier, Brook’s Lemma yields p(φ), improper, etc.
I Simplification: $B_{ij} = b_{ij} I$, $b_{ij} = w_{ij}/w_{i+}$, $\Sigma_i = \Sigma / w_{i+}$
I To make proper, add ρ or perhaps $\rho_j$, j = 1, . . . , p
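Under the simplification Bij = bij I with bij = wij/wi+ and Σi = Σ/wi+, the full conditional of φi is a normal centered at the neighbor average. The sketch below computes that conditional mean and covariance for one unit; the three-unit chain adjacency, the values of φ, the matrix Σ, and the helper name `mcar_conditional` are all made up for illustration.

```python
import numpy as np

def mcar_conditional(i, W, phi, Sigma):
    """Mean and covariance of phi_i given the rest, for the simplified MCAR.

    W is an n x n adjacency matrix with zero diagonal, phi is n x p.
    """
    w_i = W[i]
    w_plus = w_i.sum()                                # w_{i+}, number of neighbors
    mean = (w_i[:, None] * phi).sum(axis=0) / w_plus  # sum_j b_ij phi_j
    cov = Sigma / w_plus                              # Sigma_i = Sigma / w_{i+}
    return mean, cov

# Hypothetical chain of three areal units: 0 - 1 - 2
W = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)
phi = np.array([[1.0, 2.0],
                [0.0, 0.0],
                [3.0, 4.0]])                          # n = 3 units, p = 2 variables
Sigma = np.array([[1.0, 0.3],
                  [0.3, 2.0]])

mean, cov = mcar_conditional(1, W, phi, Sigma)
print(mean)   # neighbor average for the middle unit
print(cov)    # Sigma scaled by the number of neighbors
```

Units with more neighbors get a tighter conditional distribution, mirroring the univariate CAR model.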
cont.
I A coregionalization approach (straightforward)
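I With, say, p = 2, write
\[ \begin{pmatrix} \phi_{1i} \\ \phi_{2i} \end{pmatrix} = A \begin{pmatrix} \eta_{1i} \\ \eta_{2i} \end{pmatrix} \]
I $\eta_{1i} \sim CAR(\tau_1)$, $\eta_{2i} \sim CAR(\tau_2)$
I $\eta_{1i}$, $\eta_{2i}$ independent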