CBMS Lecture 6

Alan E. Gelfand
Duke University

Multivariate spatial modeling

I Point-referenced spatial data often come as multivariate measurements at each location

I Examples:
  I Environmental monitoring stations yield measurements on ozone, NO, CO, PM2.5, etc.
  I In atmospheric modeling, at a given site we observe surface temperature, precipitation, and wind speed
  I At a monitoring site we observe precipitation, wet sulfate deposition, and wet nitrate deposition
  I At locations in a forest, we observe tree growth, soil moisture, light availability, and climate variables
  I In real estate modeling, for a commercial property we observe selling price and total rental income

I We anticipate dependence between measurements
  I at a particular location
  I across locations

Basic issues

I Y(s) denotes a p × 1 vector of random variables at s

I We seek to model {Y(s) : s ∈ D}, again specifying finite-dimensional distributions, e.g., for Y = (Y(s1), . . . , Y(sn))

I Crucial object: the cross-covariance

C (s, s′) = Cov(Y(s),Y(s′))

a p × p matrix that need not be symmetric, i.e., cov(Yj(s), Yj′(s′)) need not equal cov(Yj′(s), Yj(s′))

I C(s, s′) is not itself positive definite except in a limiting sense: as s′ → s it becomes C(s, s), the covariance matrix associated with Y(s).

I Our primary focus: Gaussian processes and valid specification for C(s, s′) to overlay

Separable models

I A common specification is the separable model

C (s, s′) = ρ(s, s′) · T

where ρ is a valid (univariate) correlation function and T is a p × p positive definite matrix

I T is the non-spatial or “local” covariance matrix

I ρ controls spatial association based upon proximity

I Easy to verify that Σ_Y = H ⊗ T, where H_ij = ρ(s_i, s_j) and ⊗ is the Kronecker product.

I Σ_Y is positive definite since H and T are

I Σ_Y is convenient since |Σ_Y| = |H|^p |T|^n and Σ_Y^{-1} = H^{-1} ⊗ T^{-1}.
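The determinant identity can be checked numerically on a toy case. The following is a minimal sketch using only the Python standard library; the site coordinates, the decay value `phi`, the matrix `T`, and the helper names `kron` and `det` are all illustrative assumptions, not part of the lecture.

```python
import math

def kron(A, B):
    """Kronecker product of matrices stored as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def det(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    M = [row[:] for row in M]
    n, d = len(M), 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        if M[piv][c] == 0.0:
            return 0.0
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= f * M[c][k]
    return d

# Hypothetical inputs: n = 3 sites on a line, p = 2 variables.
sites = [0.0, 1.0, 3.0]
phi = 0.5                                  # assumed exponential decay
T = [[2.0, 0.6], [0.6, 1.0]]               # assumed local covariance matrix
H = [[math.exp(-phi * abs(si - sj)) for sj in sites] for si in sites]

Sigma = kron(H, T)                         # covariance of (Y(s1), Y(s2), Y(s3))
n, p = len(sites), len(T)
lhs = det(Sigma)                           # direct 6 x 6 determinant
rhs = det(H) ** p * det(T) ** n            # |H|^p |T|^n
```

The two quantities `lhs` and `rhs` should agree up to rounding, which is the point of the identity: only an n × n and a p × p determinant are ever needed.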

Application: Bivariate spatial regression

I A single covariate X(s) and a univariate response Y(s)

I Treat this as a bivariate process. (WHY?)

$$Z(s) = \begin{pmatrix} X(s) \\ Y(s) \end{pmatrix} \sim N(\mu(s), T)$$

I Simplifying assumptions:
  I Separable cross-covariance for Z(s)
  I µ(s) = (µ1, µ2), i.e., constant means.

I Then, p(Y(s) | X(s)) = N(β0 + β1 X(s), σ²) where:

$$\beta_0 = \mu_2 - \frac{T_{12}}{T_{11}}\,\mu_1, \qquad \beta_1 = \frac{T_{12}}{T_{11}}, \qquad \sigma^2 = T_{22} - \frac{T_{12}^2}{T_{11}}$$

I Regression model parameters are functions of process model parameters

Bivariate spatial regression (cont’d)

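I Rearrangement of the components of Z to Z̃ = (X(s1), X(s2), . . . , X(sn), Y(s1), Y(s2), . . . , Y(sn))′ yields

$$\begin{pmatrix} X \\ Y \end{pmatrix} \sim N\left( \begin{pmatrix} \mu_1 \mathbf{1} \\ \mu_2 \mathbf{1} \end{pmatrix},\; T \otimes H(\phi) \right)$$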

I Priors: Wishart for T^{-1}, vague but proper normal for (µ1, µ2), discrete prior for φ

I Full conditionals for Gibbs sampler: again Wishart for T^{-1}, bivariate normal for (µ1, µ2); sampling from a discrete distribution for φ or perhaps a uniform on (0, 0.5 × max dist)

Dew-shrub data example

I 1129 locations with UTM coordinates

I Y (s) : shrub density at location s

I X (s) : Dew duration at location s

I Illustrative analysis assuming separability and an exponential correlation function, ρ(h; φ) = e^{−φh}

I Conjugate priors for µ, T as above; prior for φ has infinite variance and suggests a range (3/φ) of 125 km, roughly half the maximum pairwise distance in the region

I (µ1, µ2, T11, T12, T22) updated directly; φ updated via Metropolis

I Posterior samples of (β0, β1, σ²) obtained from posterior samples of process parameters

Parameter estimation, dew-shrub data

Parameter          2.5%     50%      97.5%
µ1                 73.12    73.89    74.67
µ2                 5.20     5.38     5.572
T11                95.10    105.22   117.69
T12                -4.46    -2.42    -0.53
T22                5.56     6.19     6.91
φ                  0.01     0.03     0.21
β0                 5.72     7.08     8.46
β1                 -0.04    -0.02    -0.01
σ²                 5.58     6.22     6.93
T12/√(T11 T22)     -0.17    -0.10    -0.02

⇒ Surprising: a significant negative association between dew duration and shrub density!
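As a rough sanity check on the derived rows of the table, one can plug the posterior medians of the process parameters into the formulas β1 = T12/T11, β0 = µ2 − β1 µ1, σ² = T22 − T12²/T11. This is only a sketch: a function of medians is not the median of the function, so agreement with the tabled β0, β1, σ² is expected to be approximate. The function name `regression_from_process` is hypothetical.

```python
import math

def regression_from_process(mu1, mu2, T11, T12, T22):
    """Map bivariate process parameters to conditional regression parameters."""
    beta1 = T12 / T11
    beta0 = mu2 - beta1 * mu1
    sigma2 = T22 - T12 ** 2 / T11
    return beta0, beta1, sigma2

# Posterior medians (50% column) from the dew-shrub table.
mu1, mu2 = 73.89, 5.38
T11, T12, T22 = 105.22, -2.42, 6.19

beta0, beta1, sigma2 = regression_from_process(mu1, mu2, T11, T12, T22)
corr = T12 / math.sqrt(T11 * T22)   # plug-in correlation between X(s) and Y(s)
```

In an actual analysis the same function would be applied to each posterior draw of (µ1, µ2, T11, T12, T22), giving posterior samples of (β0, β1, σ²) rather than a single plug-in value.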

Benefits and limitations of separability

I Benefits:
  I Easy interpretation (decomposition of variance structure)
  I Substantial computational benefits

I Limitations:
  I Symmetry in cross-covariance matrix (not so serious)
  I Imposes same spatial range for every component (more serious, only one correlation function)

I A proposed solution
  I Coregionalization models

A simple nonseparable example

I The delay effect or pure offset model

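I Define Y(s) to be two-dimensional such that Y2(s) = Y1(s + λ), where λ is a delay vector

I Cross covariance matrix, with h = s′ − s, is

$$\begin{pmatrix} \sigma^2\rho(h) & \sigma^2\rho(h + \lambda) \\ \sigma^2\rho(-h + \lambda) & \sigma^2\rho(h) \end{pmatrix}$$

I Here ρ is valid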

I Can add a nugget, i.e., define Y2(s) = Y1(s + λ) + ε(s)

I Potential application to exposures driven by wind direction
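The asymmetry of the delay model is easy to see numerically. The sketch below works in one dimension with an exponential correlation function; both choices, along with the names `rho` and `cross_cov` and the numerical values, are assumptions made for illustration. The entries follow directly from Y2(s) = Y1(s + λ), taking h = s′ − s.

```python
import math

def rho(d, phi=1.0):
    """Assumed exponential correlation function in 1-D."""
    return math.exp(-phi * abs(d))

def cross_cov(s, s_prime, lam, sigma2=1.0, phi=1.0):
    """2 x 2 matrix Cov(Y(s), Y(s')) for the pure offset model Y2(s) = Y1(s + lam)."""
    h = s_prime - s
    return [[sigma2 * rho(h, phi),        sigma2 * rho(h + lam, phi)],
            [sigma2 * rho(-h + lam, phi), sigma2 * rho(h, phi)]]

C = cross_cov(0.0, 1.0, lam=0.5)          # off-diagonals differ: not symmetric
C_no_delay = cross_cov(0.0, 1.0, lam=0.0) # lam = 0 collapses to a symmetric matrix
```

With a nonzero delay the two off-diagonal entries differ, which a separable model cannot reproduce.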

Linear Model of Coregionalization

I For point referenced data, Y(s) = Aw(s) where w(s) = (w1(s), w2(s), . . . , wp(s))

I p independent spatial processes with stationary correlation functions ρj(s − s′), j = 1, 2, . . . , p

I ρj = ρ for all j ⇐⇒ separable case, with AA′ = T

I In general, the cross covariance matrix is (with aj being the columns of A)

$$C(s - s') = \sum_{j=1}^{p} \rho_j(s - s')\, a_j a_j'$$

I Approach is “constructive” so C(s − s′) is immediately valid, still stationary, and provides a distinct covariance function for each component
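The LMC cross covariance can be sketched in a few lines. The example below assumes unit-variance latent processes w_j with exponential correlations exp(−φ_j |h|) and a scalar separation h; the matrix `A`, the decay values `phis`, and the function name `lmc_cross_cov` are hypothetical.

```python
import math

def lmc_cross_cov(h, A, phis):
    """C(h) = sum_j rho_j(h) a_j a_j', with a_j the j-th column of A."""
    p = len(A)
    C = [[0.0] * p for _ in range(p)]
    for j, phi in enumerate(phis):
        r = math.exp(-phi * abs(h))        # assumed exponential rho_j
        for k in range(p):
            for l in range(p):
                C[k][l] += r * A[k][j] * A[l][j]
    return C

A = [[1.0, 0.0],
     [0.5, 2.0]]                           # hypothetical lower-triangular A
phis = [1.0, 3.0]                          # hypothetical decay parameters

C0 = lmc_cross_cov(0.0, A, phis)           # at h = 0 this is A A' = T
C1 = lmc_cross_cov(1.0, A, phis)

# Implied correlation at separation 1 for each component of Y(s).
corr_comp1 = C1[0][0] / C0[0][0]
corr_comp2 = C1[1][1] / C0[1][1]
```

Here `C1` is symmetric, yet `corr_comp1` and `corr_comp2` differ, illustrating that each component gets its own covariance function. Setting all entries of `phis` equal recovers the separable form ρ(h) · T.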

Linear Model of Coregionalization

I More general: Y(s) = A(s)w(s). A spatially varying LMC!

I Modeling A(s) ⇔ modeling T(s) = A(s)A′(s)

I Possibilities for T(s):
  I T(s) = g(X(s)) × T
  I T(s) is a spatial process (e.g., T^{-1}(s) is a spatial Wishart process)

I Computationally demanding

cont.

I Specification of A

I p × p entries in A but, since A ⇔ T, only p(p+1)/2 parameters are required. For convenience, we often take A to be lower triangular.

I Given φ1, . . . , φp, the cross covariance matrix is symmetric, regardless of A.

I Number of parameters in the model: p(p+1)/2 + pm, where m is the dimension of φj, i.e., the number of parameters in each individual correlation function.

I With p = 2, we have 3 parameters in A and, using an exponential covariance function (m = 1), pm = 2 decay parameters

cont.

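I The one-to-one relationship between T and lower triangular A is standard.

I When p = 2 we have

$$a_{11} = \sqrt{T_{11}}, \qquad a_{21} = \frac{T_{12}}{\sqrt{T_{11}}}, \qquad a_{22} = \sqrt{T_{22} - \frac{T_{12}^2}{T_{11}}}$$

I When p = 3 we add

$$a_{31} = \frac{T_{13}}{\sqrt{T_{11}}}, \qquad a_{32} = \frac{T_{11}T_{23} - T_{12}T_{13}}{\sqrt{T_{11}\,(T_{11}T_{22} - T_{12}^2)}}$$

and

$$a_{33} = \sqrt{T_{33} - \frac{T_{13}^2}{T_{11}} - \frac{(T_{11}T_{23} - T_{12}T_{13})^2}{T_{11}\,(T_{11}T_{22} - T_{12}^2)}}$$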
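The closed-form entries for p = 3 are those of the lower-triangular Cholesky factor, which can be confirmed by multiplying back. This is a small sketch; the test matrix `T` and the function names are hypothetical, and a32 is written with the square root in the denominator, as the Cholesky recursion a32 = (T23 − a31 a21)/a22 requires.

```python
import math

def chol3_closed_form(T):
    """Lower-triangular A with A A' = T for a 3 x 3 positive definite T."""
    (T11, T12, T13), (_, T22, T23), (_, _, T33) = T
    d = T11 * T22 - T12 ** 2               # leading 2 x 2 minor
    n = T11 * T23 - T12 * T13
    a11 = math.sqrt(T11)
    a21 = T12 / a11
    a22 = math.sqrt(T22 - T12 ** 2 / T11)
    a31 = T13 / a11
    a32 = n / math.sqrt(T11 * d)
    a33 = math.sqrt(T33 - T13 ** 2 / T11 - n ** 2 / (T11 * d))
    return [[a11, 0.0, 0.0], [a21, a22, 0.0], [a31, a32, a33]]

def times_transpose(A):
    """Return A A' for a square matrix stored as a list of rows."""
    p = len(A)
    return [[sum(A[i][k] * A[j][k] for k in range(p)) for j in range(p)]
            for i in range(p)]

T = [[4.0, 2.0, 1.0],
     [2.0, 3.0, 1.0],
     [1.0, 1.0, 2.0]]                      # hypothetical positive definite T
A = chol3_closed_form(T)
T_back = times_transpose(A)                # should reproduce T up to rounding
```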

cont.

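I More explicitly, with A lower triangular,

$$\begin{pmatrix} Y_1(s) \\ Y_2(s) \\ \vdots \\ Y_j(s) \\ \vdots \\ Y_p(s) \end{pmatrix} = \begin{pmatrix} a_{11} w_1(s) \\ a_{21} w_1(s) + a_{22} w_2(s) \\ \vdots \\ \sum_{l=1}^{j} a_{jl} w_l(s) \\ \vdots \\ \sum_{l=1}^{p} a_{pl} w_l(s) \end{pmatrix}$$

I Y(s) is stationary, has a symmetric cross-covariance matrix, with a different variance and, if the ρ(·; φj)’s are isotropic, a different range for each component of Y(s).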

General Multivariate Spatial Model

I So, we arrive at the model

Y(s) = µ(s) + v(s) + ε(s)

with

I ε(s) ∼ N(0, Dε), (Dε)jj = τj².

I v(s) = Aw(s) following previous specification

I wj(s) are mean 0 Gaussian processes with individual correlation functions.

I µ(s) arises from µj(s) = Xj(s)^T βj.

A useful example

I Spatially varying coefficient models (Gelfand et al., 2003)

I Model Y(s) = X(s)^T β(s) + ε(s).

I Here Y(s) is univariate. The multivariate process is for β(s). Use coregionalization here.

I For p = 2, with X(s) having a column of 1’s, we obtain β0(s) + X(s)β1(s)

I Spatially varying intercept (like a spatial random effect) and a spatially varying slope.

I Analogous to longitudinal growth curve models

I A very rich class of nonlinear models

I We infer about the multivariate process β(s) while only observing the univariate Y(s) process

Hierarchical Model

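I 1st stage: Y(si) | {βj}, {v(si)}, Dε ∼ N(µ(si) + v(si), Dε).

I 2nd stage:

$$v = \begin{pmatrix} v(s_1) \\ \vdots \\ v(s_n) \end{pmatrix} \sim N\Big(0,\; \sum_{j=1}^{p} H_j \otimes T_j\Big),$$

where (Hj)_{ii′} = ρj(si − si′) and Tj = aj aj′.

I Concatenate the Y(si) into Y and the µ(si) into µ, and marginalize over v:

$$f(Y \mid \{\beta_j\}, D_\varepsilon, \{\rho_j\}, T) = N\Big(\mu,\; \sum_{j=1}^{p} (H_j \otimes T_j) + I_{n \times n} \otimes D_\varepsilon\Big)$$

I 3rd stage: Priors on {βj}, {τj²}, T and the parameters of the ρj.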

California Pollution Data Example

I From the California Air Resources Board. Available for download at http://www.arb.ca.gov/aqd/aqdcd/aqdcddld.htm.

I Daily average of Carbon Monoxide (CO), Nitric Oxide (NO) and Nitrogen Dioxide (NO2) based on hourly measurements on July 6, 1999 → 68 sites.

I The observed correlations between these pollutants range from 0.46 (CO and NO) to 0.77 (NO and NO2).

I Use the logarithm of the daily average of each of these variables.

I No information on covariates, such as temperature or wind direction, at these gauged sites.

A model specification

I Can specify coregionalization model sequentially

I We anticipate smooth exposure surfaces so only spatial random effects, no nuggets

I The model:

$$\begin{aligned} CO(s) &= \mu_1 + \sigma_1 \tilde{w}_1(s) \\ NO(s) \mid CO(s) &= \mu_2 + \alpha\, CO(s) + \sigma_2 \tilde{w}_2(s) \\ NO_2(s) \mid CO(s), NO(s) &= \mu_3 + \gamma\, CO(s) + \beta\, NO(s) + \sigma_3 \tilde{w}_3(s), \end{aligned}$$

with w̃j(s) ∼ GP(0, ρj) and ρj(s − s′) = exp{−ψj ||s − s′||}

Prior specifications

µ1 ∼ N(0, 5), µ2 ∼ N(0, 5), µ3 ∼ N(0, 5)

α ∼ N(0, 5), γ ∼ N(0, 0.2), β ∼ N(0, 0.2)

σ1² ∼ IG(5, 0.35 × 4), σ2² ∼ IG(5, 0.52 × 4), σ3² ∼ IG(5, 0.13 × 4)

ψ1 ∼ Ga(0.6, 1), ψ2 ∼ Ga(0.6, 1), ψ3 ∼ Ga(0.6, 1)

p(ψj) based on ψ = 3/range and range = 0.5 × max dist

Posterior Summaries

Posterior summaries for CO (1), NO (2) and NO2 (3).

Parameter    Mean      2.50%     Median    97.50%
α            0.296     0.045     0.292     0.553
β            0.302     0.190     0.301     0.413
γ            0.198     0.082     0.199     0.314
µ1           -0.922    -1.135    -0.921    -0.710
µ2           -5.015    -5.504    -5.018    -4.538
µ3           -2.602    -3.281    -2.602    -1.943
φ1           4.995     2.789     4.882     7.952
φ2           2.186     1.064     2.081     3.854
φ3           1.209     0.525     1.157     2.201
σ1²          0.391     0.267     0.381     0.570
σ2²          0.698     0.438     0.668     1.163
σ3²          0.215     0.124     0.199     0.407
CO range     0.647     0.380     0.614     1.103
NO range     1.497     0.772     1.405     2.733
NO2 range    1.334     0.706     1.236     2.473

Coregionalization matrix

Posterior median with the associated 95% credible interval (in parentheses) of the elements of the coregionalization matrix and the correlation matrix for each location s.

      Y1             Y2             Y3
Y1    0.3812         0.1108         0.1085
      (0.27;0.57)    (0.01;0.24)    (0.06;0.20)
Y2                   0.7110         0.2354
                     (0.47;1.22)    (0.14;0.42)
Y3                                  0.3000
                                    (0.21;0.50)

Correlations

      Y1             Y2             Y3
Y1    1              0.2134         0.3223
                     (0.03;0.41)    (0.17;0.49)
Y2                   1              0.520
                                    (0.31;0.69)
Y3                                  1
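The sequential specification implies a lower-triangular A: substituting each conditional equation into the next gives Y(s) = µ* + A w̃(s) with rows (σ1, 0, 0), (ασ1, σ2, 0) and ((γ + αβ)σ1, βσ2, σ3). The sketch below plugs the posterior means from the Posterior Summaries table into this A and forms T = AA′. Two caveats: reading the tabled matrix as T = AA′ is an assumption made here, and plug-in values of posterior means will only roughly match tabled posterior medians. The function name is hypothetical.

```python
import math

def coreg_from_sequential(s1sq, s2sq, s3sq, alpha, beta, gamma):
    """Build lower-triangular A and T = A A' implied by the sequential model."""
    s1, s2, s3 = math.sqrt(s1sq), math.sqrt(s2sq), math.sqrt(s3sq)
    A = [[s1, 0.0, 0.0],
         [alpha * s1, s2, 0.0],
         [(gamma + alpha * beta) * s1, beta * s2, s3]]
    T = [[sum(A[i][k] * A[j][k] for k in range(3)) for j in range(3)]
         for i in range(3)]
    return A, T

# Posterior means of sigma_j^2, alpha, beta, gamma from the summary table.
A, T = coreg_from_sequential(0.391, 0.698, 0.215,
                             alpha=0.296, beta=0.302, gamma=0.198)

# Implied within-site correlation matrix.
corr = [[T[i][j] / math.sqrt(T[i][i] * T[j][j]) for j in range(3)]
        for i in range(3)]
```

The resulting entries of `T` and `corr` land close to the tabled medians, which is consistent with the conditional and joint forms of the model describing the same process.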

Prediction of NO2

Prediction of NO2 based on three different models.

(i) Independent model for NO2

Site    Mean      2.50%     Median    97.50%    Observed
1       -4.869    -5.986    -4.839    -3.802    -4.342
2       -4.624    -5.082    -4.632    -4.127    -4.585
3       -4.294    -4.679    -4.294    -3.896    -4.100

(ii) Model for NO2 conditioned on CO

Site    Mean      2.50%     Median    97.50%    Observed
1       -4.722    -5.712    -4.733    -3.73     -4.342
2       -4.7      -5.106    -4.702    -4.301    -4.585
3       -4.132    -4.471    -4.131    -3.794    -4.100

(iii) Model for NO2 conditioned on CO and NO

Site    Mean      2.50%     Median    97.50%    Observed
1       -4.5      -5.313    -4.508    -3.679    -4.342
2       -4.585    -4.964    -4.587    -4.22     -4.585
3       -3.966    -4.26     -3.964    -3.653    -4.100

Other Approaches

I Moving average or kernel convolution of a process:

$$Y_j(s) = \int k_j(u)\, Z(s + u)\,du = \int k_j(s - s')\, Z(s')\,ds'$$

where Z(s) is a univariate spatial process and the kj are kernel functions, j = 1, 2, . . . , p. This yields the cross covariance

$$C_{ij}(s - s') = \int\!\!\int k_i(s - s' + u)\, k_j(u')\, \rho(u - u')\,du\,du'$$

I Convolution of covariance functions: Suppose C1, C2, . . . , Cp are valid covariance functions. Define

$$C_{ij}(s) = \int C_i(s - t)\, C_j(t)\,dt.$$

Then the p × p matrix C(s) = {Cij(s)} is a valid cross covariance function

Multivariate Areal Data Examples

I Cancer counts for areal units for several different types of cancers

I Employment rates by sectors for a set of areal units

I Individual level bivariate data within units, e.g., height adjusted for age (HAZ) and weight adjusted for age (WAZ) with areal unit level spatial effects for each outcome

I Spatially varying coefficient models with coefficients at areal scale because covariates are at areal scale

Multivariate Areal Data Models

I Now areal units (e.g., counties) instead of points

I Need to model dependence within and across units

I As in univariate case, use spatial random effects φji, where again i = 1, . . . , n indexes region but now j = 1, . . . , p indexes variables (e.g., cancer type) within region

I Suppose we observe Yi = (Y1i, Y2i, . . . , Ypi). Then

$$g(E(Y_{ji})) = x_{ji}^T \beta_j + \phi_{ji},$$

with φi = (φ1i, . . . , φpi) and φ = (φ1, . . . , φn).

I Link function g useful for modeling rates (e.g., Poisson disease mapping).

I Multivariate CAR (MCAR) model for the φji

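Some models

I Illustrate with p = 2

I A disease mapping example: Y1i, Y2i are counts for diseases 1 and 2 in unit i

$$Y_{ji} \sim Po(\lambda_{ji}), \quad j = 1, 2, \qquad \lambda_{ji} = E_{ji}\,\eta_{ji}, \qquad \log \eta_{ji} = X_{ji}^T \beta_j + \phi_{ji}$$

I Bivariate CAR model for {φ1i, φ2i}

I Height and weight example:

$$Y_{ir} = \begin{pmatrix} HAZ_{ir} \\ WAZ_{ir} \end{pmatrix} = X_{ir} \begin{pmatrix} \beta^{(H)} \\ \beta^{(W)} \end{pmatrix} + \begin{pmatrix} \phi_i^{(H)} \\ \phi_i^{(W)} \end{pmatrix} + \begin{pmatrix} \varepsilon_{ir}^{(H)} \\ \varepsilon_{ir}^{(W)} \end{pmatrix}$$

I Bivariate CAR model for {φi^(H), φi^(W)}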
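For the disease mapping example, the first-stage contribution of a single count can be sketched as follows. E_ji is read here as an expected count, which is the usual convention in disease mapping but is an assumption about this slide; the function name and all numerical inputs are hypothetical.

```python
import math

def poisson_loglik(y, E, x, beta, phi):
    """Log-likelihood of one count y ~ Po(E * eta), log eta = x'beta + phi."""
    eta = math.exp(sum(xk * bk for xk, bk in zip(x, beta)) + phi)  # relative risk
    lam = E * eta                                                  # Poisson mean
    return y * math.log(lam) - lam - math.lgamma(y + 1)

# Hypothetical unit: 12 observed cases against 10 expected, intercept-only model.
ll_null = poisson_loglik(y=12, E=10.0, x=[1.0], beta=[0.0], phi=0.0)

# A spatial effect phi = log(1.2) raises the mean to 12 and improves the fit.
ll_spatial = poisson_loglik(y=12, E=10.0, x=[1.0], beta=[0.0], phi=math.log(1.2))
```

In the bivariate setting this term is evaluated for each disease j and unit i, with the pairs (φ1i, φ2i) tied together across units through the bivariate CAR prior.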

Multivariate CAR (MCAR) models

I Again, local or neighbor idea, conditioning, CAR

I Approach 1: multivariate CAR (MCAR) in the form p(φi | φj, j ≠ i) with

$$p(\phi_i \mid \phi_{j \ne i}, \Sigma_i) = N\Big(\sum_j B_{ij}\,\phi_j,\; \Sigma_i\Big), \quad i = 1, \ldots, n$$

I As earlier, Brook’s Lemma yields p(φ), improper, etc.

I Simplification: Bij = bij I, bij = wij / wi+, Σi = (1/wi+) Σ

I To make proper, add ρ or perhaps ρj, j = 1, . . . , p
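Under the simplification above, with φ stacked region by region, the joint precision takes the Kronecker form (D_w − ρW) ⊗ Σ^{-1}, where D_w is diagonal with entries w_i+. The sketch below, using a three-region path graph and a hypothetical Σ^{-1}, illustrates why ρ = 1 is improper (singular precision) while a value such as ρ = 0.9 is proper. The helper names and numerical values are illustrative assumptions.

```python
def kron(A, B):
    """Kronecker product of matrices stored as lists of rows."""
    return [[a * b for a in row_a for b in row_b]
            for row_a in A for row_b in B]

def det(M):
    """Determinant by Gaussian elimination with partial pivoting."""
    M = [row[:] for row in M]
    n, d = len(M), 1.0
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        if M[piv][c] == 0.0:
            return 0.0
        if piv != c:
            M[c], M[piv] = M[piv], M[c]
            d = -d
        d *= M[c][c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n):
                M[r][k] -= f * M[c][k]
    return d

def car_structure(W, rho):
    """D_w - rho * W, with D_w diagonal holding the row sums w_{i+} of W."""
    n = len(W)
    return [[(sum(W[i]) if i == j else 0.0) - rho * W[i][j]
             for j in range(n)] for i in range(n)]

def leading_minors(M):
    """Leading principal minors; all positive means positive definite."""
    return [det([row[:k] for row in M[:k]]) for k in range(1, len(M) + 1)]

W = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]  # path graph 1-2-3
Lam = [[2.0, -0.5], [-0.5, 1.0]]                          # hypothetical Sigma^{-1}

Q_intrinsic = car_structure(W, 1.0)        # rows sum to zero: singular
P_intrinsic = kron(Q_intrinsic, Lam)
P_proper = kron(car_structure(W, 0.9), Lam)

det_intrinsic = det(P_intrinsic)           # numerically zero
minors_proper = leading_minors(P_proper)   # all positive
```

Using a separate ρj for each variable breaks this single Kronecker form, which is part of what makes that extension less convenient.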

cont.

I A coregionalization approach (straightforward)

I With, say, p = 2, write

$$\begin{pmatrix} \phi_{1i} \\ \phi_{2i} \end{pmatrix} = A \begin{pmatrix} \eta_{1i} \\ \eta_{2i} \end{pmatrix}$$

I η1i ∼ CAR(τ1), η2i ∼ CAR(τ2)

I η1i , η2i independent
