CBMS Lecture 6 Alan E. Gelfand Duke University

Page 1: CBMS Lecture 6

CBMS Lecture 6

Alan E. Gelfand, Duke University

Page 2: CBMS Lecture 6

Multivariate spatial modeling

I Point-referenced spatial data often come as multivariate measurements at each location

I Examples:
    I Environmental monitoring stations yield measurements on ozone, NO, CO, PM2.5, etc.
    I In atmospheric modeling, at a given site we observe surface temperature, precipitation, and wind speed
    I At a monitoring site we observe precipitation, wet sulfate deposition, and wet nitrate deposition
    I At locations in a forest, we observe tree growth, soil moisture, light availability, and climate variables
    I In real estate modeling, for a commercial property we observe selling price and total rental income

I We anticipate dependence between measurements
    I at a particular location
    I across locations

Page 3: CBMS Lecture 6

Basic issues

I Y(s) denotes a p × 1 vector of random variables at s

I We seek to model {Y(s) : s ∈ D}, again by specifying finite-dimensional distributions, e.g., for Y = (Y(s1), . . . , Y(sn))

I Crucial object: the cross-covariance

C(s, s′) = Cov(Y(s), Y(s′)),

a p × p matrix that need not be symmetric, i.e., cov(Yj(s), Yj′(s′)) need not equal cov(Yj′(s), Yj(s′))

I C(s, s′) need not be positive definite, except in a limiting sense: C(s, s) is the covariance matrix associated with Y(s)

I Our primary focus: Gaussian processes and valid specifications for C(s, s′)

Page 4: CBMS Lecture 6

Separable models

I A common specification is the separable model

C(s, s′) = ρ(s, s′) · T

where ρ is a valid (univariate) correlation function and T is a p × p positive definite matrix

I T is the non-spatial or “local” covariance matrix

I ρ controls spatial association based upon proximity

I Easy to verify that ΣY = H ⊗ T, where Hij = ρ(si, sj) and ⊗ is the Kronecker product

I ΣY is positive definite since H and T are

I ΣY is convenient since $|\Sigma_Y| = |H|^p \, |T|^n$ and $\Sigma_Y^{-1} = H^{-1} \otimes T^{-1}$
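The determinant and inverse identities for the separable covariance can be checked numerically. The following is a minimal sketch, assuming an exponential correlation function and arbitrary illustrative values for the sites, the decay parameter, and T (none of these come from the lecture); `exp_corr` is a hypothetical helper name.

```python
import numpy as np

def exp_corr(coords, phi):
    """Exponential correlation matrix H with H_ij = exp(-phi * ||s_i - s_j||)."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return np.exp(-phi * d)

rng = np.random.default_rng(0)
n, p = 6, 2                                # illustrative sizes (assumption)
coords = rng.uniform(size=(n, 2))          # random sites in the unit square
H = exp_corr(coords, phi=3.0)              # n x n spatial correlation matrix
T = np.array([[2.0, 0.6], [0.6, 1.0]])     # p x p local covariance (assumption)

# Separable covariance of Y = (Y(s_1), ..., Y(s_n)), stacked site by site.
Sigma = np.kron(H, T)

# log|Sigma| = p log|H| + n log|T|, using only the small matrices.
logdet_full = np.linalg.slogdet(Sigma)[1]
logdet_kron = p * np.linalg.slogdet(H)[1] + n * np.linalg.slogdet(T)[1]

# Sigma^{-1} = H^{-1} (x) T^{-1}, again avoiding an np x np inversion.
inv_kron = np.kron(np.linalg.inv(H), np.linalg.inv(T))
```

The computational benefit is that only an n × n and a p × p matrix are ever factorized, rather than the full np × np matrix.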

Page 5: CBMS Lecture 6

Application: Bivariate spatial regression

I A single covariate X(s) and a univariate response Y(s)

I Treat this as a bivariate process. (WHY?)

\[
Z(s) = \begin{pmatrix} X(s) \\ Y(s) \end{pmatrix} \sim N(\mu(s), T)
\]

I Simplifying assumptions:
    I Separable cross-covariance for Z(s)
    I µ(s) = (µ1, µ2), i.e., constant means

I Then p(Y(s) | X(s)) = N(β0 + β1 X(s), σ²), where

\[
\beta_0 = \mu_2 - \frac{T_{12}}{T_{11}}\,\mu_1, \qquad
\beta_1 = \frac{T_{12}}{T_{11}}, \qquad
\sigma^2 = T_{22} - \frac{T_{12}^2}{T_{11}}
\]

I Regression model parameters are functions of process model parameters
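The map from process parameters to regression parameters is a one-line transformation, so it can be applied to each posterior draw. The sketch below uses a hypothetical helper `regression_from_process`; the plug-in values are the posterior medians reported in the dew-shrub table of this lecture and serve only as an illustration, since transforming medians does not in general reproduce the posterior medians of (β0, β1, σ²).

```python
import numpy as np

def regression_from_process(mu1, mu2, T11, T12, T22):
    """Map bivariate process parameters to the conditional regression of Y on X.

    Works on scalars or on NumPy arrays of posterior draws (elementwise).
    """
    beta1 = T12 / T11                   # slope
    beta0 = mu2 - beta1 * mu1           # intercept
    sigma2 = T22 - T12**2 / T11         # conditional variance
    return beta0, beta1, sigma2

# Plug-in illustration with the dew-shrub posterior medians.
beta0, beta1, sigma2 = regression_from_process(73.89, 5.38, 105.22, -2.42, 6.19)

# The same function applied to (made-up) arrays standing in for posterior draws.
draws = regression_from_process(
    np.array([73.5, 74.0]), np.array([5.3, 5.4]),
    np.array([100.0, 110.0]), np.array([-2.0, -3.0]), np.array([6.0, 6.3]),
)
```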

Page 6: CBMS Lecture 6

Bivariate spatial regression (cont’d)

I Rearranging the components of Z to Z̃ = (X(s1), X(s2), . . . , X(sn), Y(s1), Y(s2), . . . , Y(sn))′ yields

\[
\begin{pmatrix} \mathbf{X} \\ \mathbf{Y} \end{pmatrix} \sim
N\left( \begin{pmatrix} \mu_1 \mathbf{1} \\ \mu_2 \mathbf{1} \end{pmatrix},\; T \otimes H(\phi) \right)
\]

I Priors: Wishart for T⁻¹, vague but proper normal for (µ1, µ2), discrete prior for φ

I Full conditionals for the Gibbs sampler: again Wishart for T⁻¹, bivariate normal for (µ1, µ2); sampling from a discrete distribution for φ, or perhaps a uniform on (0, 0.5 max dist)

Page 7: CBMS Lecture 6

Dew-shrub data example

I 1129 locations with UTM coordinates

I Y (s) : shrub density at location s

I X (s) : Dew duration at location s

I Illustrative analysis assuming separability and an exponential correlation function, ρ(h; φ) = exp(−φh)

I Conjugate priors for µ, T as above; the prior for φ has infinite variance and suggests a range (3/φ) of 125 km, roughly half the maximum pairwise distance in the region

I (µ1, µ2, T11, T12, T22) updated directly; φ updated via Metropolis

I Posterior samples of (β0, β1, σ²) obtained from posterior samples of the process parameters

Page 8: CBMS Lecture 6
Page 9: CBMS Lecture 6

Parameter estimation, dew-shrub data

Parameter          2.5%      50%    97.5%
µ1                73.12    73.89    74.67
µ2                 5.20     5.38    5.572
T11               95.10   105.22   117.69
T12               -4.46    -2.42    -0.53
T22                5.56     6.19     6.91
φ                  0.01     0.03     0.21
β0                 5.72     7.08     8.46
β1                -0.04    -0.02    -0.01
σ²                 5.58     6.22     6.93
T12/√(T11 T22)    -0.17    -0.10    -0.02

⇒ Surprising: a significant negative association between dew duration and shrub density!

Page 10: CBMS Lecture 6

Benefits and limitations of separability

I Benefits:
    I Easy interpretation (decomposition of variance structure)
    I Substantial computational benefits

I Limitations:
    I Symmetry in the cross-covariance matrix (not so serious)
    I Imposes the same spatial range on every component (more serious; only one correlation function)

I A proposed solution:
    I Coregionalization models

Page 11: CBMS Lecture 6

A simple nonseparable example

I The delay effect or pure offset model

I Define Y(s) to be two-dimensional such that Y2(s) = Y1(s + λ), where λ is a delay vector

I Cross-covariance matrix is

\[
\begin{pmatrix}
\sigma^2 \rho(h) & \sigma^2 \rho(h + \lambda) \\
\sigma^2 \rho(-h + \lambda) & \sigma^2 \rho(h)
\end{pmatrix}
\]

I Here ρ is valid

I Can add a nugget, i.e., define Y2(s) = Y1(s + λ) + ε(s)

I Potential application to exposures driven by wind direction

Page 12: CBMS Lecture 6

Linear Model of Coregionalization

I For point-referenced data, Y(s) = Aw(s), where w(s) = (w1(s), w2(s), . . . , wp(s))

I The wj(s) are p independent spatial processes with stationary correlation functions ρj(s − s′), j = 1, 2, . . . , p

I ρj = ρ for all j ⇐⇒ separable case with AA′ = T

I In general, the cross-covariance matrix is (with aj being the columns of A)

\[
C(s - s') = \sum_{j=1}^{p} \rho_j(s - s')\, a_j a_j'
\]

I The approach is “constructive”, so C(s − s′) is immediately valid, still stationary, and provides a distinct covariance function for each component
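The coregionalization cross-covariance can be evaluated directly from the columns of A. The sketch below assumes exponential correlation functions ρj(h) = exp(−φj ||h||) and illustrative values for A and the decay parameters (assumptions, not values from the lecture); `lmc_cross_cov` is a hypothetical helper name.

```python
import numpy as np

def lmc_cross_cov(h, A, phis):
    """C(h) = sum_j rho_j(h) a_j a_j', with rho_j(h) = exp(-phi_j * ||h||)."""
    dist = np.linalg.norm(h)
    p = A.shape[0]
    C = np.zeros((p, p))
    for j, phi in enumerate(phis):
        a_j = A[:, j]                               # j-th column of A
        C += np.exp(-phi * dist) * np.outer(a_j, a_j)
    return C

A = np.array([[1.0, 0.0], [0.5, 0.8]])   # lower triangular, p = 2 (assumption)
h = np.array([0.3, 0.4])                 # separation vector with ||h|| = 0.5

C0 = lmc_cross_cov(np.zeros(2), A, [1.0, 4.0])   # lag zero: the local covariance AA'
Ch = lmc_cross_cov(h, A, [1.0, 4.0])             # distinct decay per latent process
Csep = lmc_cross_cov(h, A, [2.0, 2.0])           # common decay: separable case
```

With a common decay parameter the sum collapses to ρ(h) AA′, which is the separable model with T = AA′.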

Page 13: CBMS Lecture 6

Linear Model of Coregionalization

I More general: Y(s) = A(s)w(s). A spatially varying LMC!

I Modeling A(s) ⇔ modeling T(s) = A(s)A′(s)

I Possibilities for T(s):
    I T(s) = g(X(s)) × T
    I T(s) is a spatial process (e.g., T⁻¹(s) is a spatial Wishart process)

I Computationally demanding

Page 14: CBMS Lecture 6

cont.

I Specification of A

I p × p entries in A but, since A ⇔ T, only p(p+1)/2 parameters are required. For convenience, we often take A to be lower triangular.

I Given φ1, . . . , φp, the cross-covariance matrix is symmetric, regardless of A.

I Number of parameters in the model: p(p+1)/2 + pm, where m is the dimension of φj, i.e., the number of parameters in each individual correlation function.

I With p = 2, we have 3 parameters in A and, using an exponential correlation function (m = 1), pm = 2 decay parameters

Page 15: CBMS Lecture 6

cont.

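I The one-to-one relationship between T and lower triangular A is standard (the Cholesky factorization).

I When p = 2 we have

\[
a_{11} = \sqrt{T_{11}}, \qquad
a_{21} = \frac{T_{12}}{\sqrt{T_{11}}}, \qquad
a_{22} = \sqrt{T_{22} - \frac{T_{12}^2}{T_{11}}}
\]

I When p = 3 we add

\[
a_{31} = \frac{T_{13}}{\sqrt{T_{11}}}, \qquad
a_{32} = \frac{T_{11}T_{23} - T_{12}T_{13}}{\sqrt{T_{11}\,(T_{11}T_{22} - T_{12}^2)}}, \qquad
a_{33} = \sqrt{T_{33} - \frac{T_{13}^2}{T_{11}} - \frac{(T_{11}T_{23} - T_{12}T_{13})^2}{T_{11}\,(T_{11}T_{22} - T_{12}^2)}}
\]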
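The closed-form entries for p = 3 can be compared with a numerical Cholesky factorization. This is a minimal sketch with an arbitrary positive definite T chosen for illustration; `lower_tri_A` is a hypothetical helper, and a32 is written with the square root in its denominator, which is what the Cholesky recursion a32 = (T23 − a21 a31)/a22 gives.

```python
import numpy as np

def lower_tri_A(T):
    """Closed-form lower triangular A with A A' = T for a 3 x 3 covariance T."""
    T11, T12, T13 = T[0, 0], T[0, 1], T[0, 2]
    T22, T23, T33 = T[1, 1], T[1, 2], T[2, 2]
    det2 = T11 * T22 - T12**2            # leading 2 x 2 determinant
    num = T11 * T23 - T12 * T13
    A = np.zeros((3, 3))
    A[0, 0] = np.sqrt(T11)
    A[1, 0] = T12 / np.sqrt(T11)
    A[1, 1] = np.sqrt(T22 - T12**2 / T11)
    A[2, 0] = T13 / np.sqrt(T11)
    A[2, 1] = num / np.sqrt(T11 * det2)
    A[2, 2] = np.sqrt(T33 - T13**2 / T11 - num**2 / (T11 * det2))
    return A

# Illustrative positive definite matrix (assumption, not from the lecture).
T = np.array([[4.0, 1.2, 0.8],
              [1.2, 2.0, 0.5],
              [0.8, 0.5, 1.5]])
A = lower_tri_A(T)
A_chol = np.linalg.cholesky(T)           # NumPy returns the lower triangular factor
```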

Page 16: CBMS Lecture 6

cont.

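I More explicitly

\[
\begin{pmatrix} Y_1(s) \\ Y_2(s) \\ \vdots \\ Y_j(s) \\ \vdots \\ Y_p(s) \end{pmatrix}
=
\begin{pmatrix}
a_{11} w_1(s) \\
a_{21} w_1(s) + a_{22} w_2(s) \\
\vdots \\
\sum_{l=1}^{j} a_{jl} w_l(s) \\
\vdots \\
\sum_{l=1}^{p} a_{pl} w_l(s)
\end{pmatrix}
\]

I Y(s) is stationary and has a symmetric cross-covariance matrix, with a different variance and, if the ρ(·; φj) are isotropic, a different range for each component of Y(s).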

Page 17: CBMS Lecture 6

General Multivariate Spatial Model

I So, we arrive at the model

Y(s) = µ(s) + v(s) + ε(s)

with

I ε(s) ∼ N(0, Dε), with Dε diagonal and (Dε)jj = τj²

I v(s) = Aw(s), following the previous specification

I wj(s) are mean-0 Gaussian processes with individual correlation functions

I µ(s) arises from µj(s) = Xj(s)ᵀβj

Page 18: CBMS Lecture 6

A useful example

I Spatially varying coefficient models (Gelfand et al., 2003)

I Model Y(s) = X(s)ᵀβ(s) + ε(s).

I Here Y(s) is univariate. The multivariate process is for β(s). Use coregionalization here.

I For p = 2, with X(s) having a column of 1's, we obtain β0(s) + X(s)β1(s)

I Spatially varying intercept (like a spatial random effect) and a spatially varying slope.

I Analogous to longitudinal growth curve models

I A very rich class of nonlinear models

I We infer about the multivariate process β(s) while only observing the univariate Y(s) process

Page 19: CBMS Lecture 6

Hierarchical Model

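I 1st stage: Y(si) | {βj}, {v(si)}, Dε ∼ N(µ(si) + v(si), Dε)

I 2nd stage:

\[
v = \begin{pmatrix} v(s_1) \\ \vdots \\ v(s_n) \end{pmatrix}
\sim N\Big(0,\; \sum_{j=1}^{p} H_j \otimes T_j\Big),
\]

where Hj is the n × n correlation matrix induced by ρj and Tj = aj aj′

I Concatenating the Y(si) into Y and the µ(si) into µ, and marginalizing over v:

\[
f(Y \mid \{\beta_j\}, D_\epsilon, \{\rho_j\}, T)
= N\Big(\mu,\; \sum_{j=1}^{p} (H_j \otimes T_j) + I_{n \times n} \otimes D_\epsilon\Big)
\]

I 3rd stage: priors on {βj}, {τj²}, T, and the parameters of the ρj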
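The marginal covariance in the second stage can be assembled with Kronecker products and used in a Gaussian log-likelihood. The sketch below assumes exponential correlation functions, a diagonal Dε, Tj = aj aj′, and illustrative parameter values; `exp_corr`, `marginal_cov`, and `gaussian_loglik` are hypothetical helper names, not part of the lecture.

```python
import numpy as np

def exp_corr(coords, phi):
    """Exponential correlation matrix with entries exp(-phi * distance)."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    return np.exp(-phi * d)

def marginal_cov(coords, A, phis, tau2):
    """sum_j H_j (x) a_j a_j' + I_n (x) D, with Y stacked site by site."""
    n = coords.shape[0]
    Sigma = np.kron(np.eye(n), np.diag(tau2))        # nugget part I (x) D
    for j, phi in enumerate(phis):
        T_j = np.outer(A[:, j], A[:, j])             # rank-one T_j = a_j a_j'
        Sigma = Sigma + np.kron(exp_corr(coords, phi), T_j)
    return Sigma

def gaussian_loglik(y, mu, Sigma):
    """Multivariate normal log-density evaluated through a Cholesky factor."""
    L = np.linalg.cholesky(Sigma)
    z = np.linalg.solve(L, y - mu)
    logdet = 2.0 * np.sum(np.log(np.diag(L)))
    return -0.5 * (y.size * np.log(2.0 * np.pi) + logdet + z @ z)

rng = np.random.default_rng(1)
coords = rng.uniform(size=(5, 2))                    # n = 5 illustrative sites
A = np.array([[1.0, 0.0], [0.5, 0.8]])               # p = 2 (assumption)
tau2 = np.array([0.1, 0.2])                          # nugget variances (assumption)
Sigma = marginal_cov(coords, A, phis=[1.0, 4.0], tau2=tau2)
mu = np.zeros(Sigma.shape[0])
y = rng.multivariate_normal(mu, Sigma)               # one simulated data vector
loglik = gaussian_loglik(y, mu, Sigma)
```

Unlike the separable case, this sum of Kronecker products does not factor, so the full np × np matrix has to be factorized at each likelihood evaluation.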

Page 20: CBMS Lecture 6

California Pollution Data Example

I From the California Air Resources Board. Available for download at http://www.arb.ca.gov/aqd/aqdcd/aqdcddld.htm.

I Daily averages of carbon monoxide (CO), nitric oxide (NO), and nitrogen dioxide (NO2), based on hourly measurements on July 6, 1999, at 68 sites.

I The observed correlations between these pollutants range from 0.46 (CO and NO) to 0.77 (NO and NO2).

I Use the logarithm of the daily average of each of these variables.

I No information on covariates, such as temperature or wind direction, at these gauged sites.

Page 21: CBMS Lecture 6
Page 22: CBMS Lecture 6

A model specification

I Can specify the coregionalization model sequentially

I We anticipate smooth exposure surfaces, so only spatial random effects, no nuggets

I The model:

CO(s) = µ1 + σ1 w̃1(s)

NO(s) | CO(s) = µ2 + α CO(s) + σ2 w̃2(s)

NO2(s) | CO(s), NO(s) = µ3 + γ CO(s) + β NO(s) + σ3 w̃3(s),

with w̃j(s) ∼ GP(0, ρj) and ρj(s − s′) = exp{−ψj ||s − s′||}
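Substituting each conditional equation into the next shows that the sequential specification is a coregionalization model with a lower triangular A. Writing the three equations as Y = m + BY + Dw̃, with B strictly lower triangular and D = diag(σ1, σ2, σ3), gives A = (I − B)⁻¹D and the local covariance T = AA′. The sketch below plugs in the posterior means from the posterior summary table of this lecture purely as an illustration; a plug-in T is not the posterior median of T.

```python
import numpy as np

# Posterior means from the lecture's summary table (plug-in illustration only).
alpha, beta, gamma = 0.296, 0.302, 0.198
sigma2 = np.array([0.391, 0.698, 0.215])     # sigma_1^2, sigma_2^2, sigma_3^2

# Regression of each pollutant on the earlier ones: rows are CO, NO, NO2.
B = np.array([[0.0,   0.0,  0.0],
              [alpha, 0.0,  0.0],
              [gamma, beta, 0.0]])
D = np.diag(np.sqrt(sigma2))

# Solve (I - B) A = D, i.e. A = (I - B)^{-1} D, which is lower triangular.
A = np.linalg.solve(np.eye(3) - B, D)

# Local (lag-zero) covariance of (CO, NO, NO2) on the log scale.
T = A @ A.T
```

The plug-in entries land close to the medians reported in the lecture's coregionalization matrix table (for example, T12 = α σ1² ≈ 0.116 against a reported 0.1108).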

Page 23: CBMS Lecture 6

Prior specifications

µ1 ∼ N(0, 5), µ2 ∼ N(0, 5), µ3 ∼ N(0, 5)

α ∼ N(0, 5), γ ∼ N(0, 0.2), β ∼ N(0, 0.2)

σ1² ∼ IG(5, 0.35 × 4), σ2² ∼ IG(5, 0.52 × 4), σ3² ∼ IG(5, 0.13 × 4)

ψ1 ∼ Ga(0.6, 1), ψ2 ∼ Ga(0.6, 1), ψ3 ∼ Ga(0.6, 1)

p(ψj) based on ψ = 3/range and range = 0.5 × max dist

Page 24: CBMS Lecture 6

Posterior Summaries

Posterior summaries for CO (1), NO (2) and NO2 (3).

Parameter      Mean    2.50%   Median   97.50%
α             0.296    0.045    0.292    0.553
β             0.302    0.190    0.301    0.413
γ             0.198    0.082    0.199    0.314
µ1           -0.922   -1.135   -0.921   -0.710
µ2           -5.015   -5.504   -5.018   -4.538
µ3           -2.602   -3.281   -2.602   -1.943
φ1            4.995    2.789    4.882    7.952
φ2            2.186    1.064    2.081    3.854
φ3            1.209    0.525    1.157    2.201
σ1²           0.391    0.267    0.381    0.570
σ2²           0.698    0.438    0.668    1.163
σ3²           0.215    0.124    0.199    0.407
CO range      0.647    0.380    0.614    1.103
NO range      1.497    0.772    1.405    2.733
NO2 range     1.334    0.706    1.236    2.473

Page 25: CBMS Lecture 6

Coregionalization matrix

Posterior median with the associated 95% credible interval (in brackets) of the elements of the coregionalization matrix and the correlation matrix for each location s.

      Y1                    Y2                    Y3
Y1    0.3812 (0.27; 0.57)   0.1108 (0.01; 0.24)   0.1085 (0.06; 0.20)
Y2                          0.7110 (0.47; 1.22)   0.2354 (0.14; 0.42)
Y3                                                0.3000 (0.21; 0.50)

Page 26: CBMS Lecture 6

Correlations

      Y1    Y2                    Y3
Y1    1     0.2134 (0.03; 0.41)   0.3223 (0.17; 0.49)
Y2          1                     0.520 (0.31; 0.69)
Y3                                1

Page 27: CBMS Lecture 6

Prediction of NO2

Prediction of NO2 based on three different models.

(i) Independent model for NO2

Site     Mean    2.50%   Median   97.50%   Observed
1      -4.869   -5.986   -4.839   -3.802     -4.342
2      -4.624   -5.082   -4.632   -4.127     -4.585
3      -4.294   -4.679   -4.294   -3.896     -4.100

(ii) Model for NO2 conditioned on CO

Site     Mean    2.50%   Median   97.50%   Observed
1      -4.722   -5.712   -4.733   -3.73      -4.342
2      -4.7     -5.106   -4.702   -4.301     -4.585
3      -4.132   -4.471   -4.131   -3.794     -4.100

(iii) Model for NO2 conditioned on CO and NO

Site     Mean    2.50%   Median   97.50%   Observed
1      -4.5     -5.313   -4.508   -3.679     -4.342
2      -4.585   -4.964   -4.587   -4.22      -4.585
3      -3.966   -4.26    -3.964   -3.653     -4.100

Page 28: CBMS Lecture 6

Other Approaches

I Moving average or kernel convolution of a process:

\[
Y_j(s) = \int k_j(u)\, Z(s + u)\, du = \int k_j(s - s')\, Z(s')\, ds'
\]

where Z(s) is a univariate spatial process and the kj are kernel functions, j = 1, 2, . . . , p. This yields the cross-covariance

\[
C_{ij}(s - s') = \int\!\!\int k_i(s - s' + u)\, k_j(u')\, \rho(u - u')\, du\, du'
\]

I Convolution of covariance functions: suppose C1, C2, . . . , Cp are valid covariance functions. Define

\[
C_{ij}(s) = \int C_i(s - t)\, C_j(t)\, dt
\]

Then the p × p matrix C(s) = {Cij(s)} is a valid cross-covariance function

Page 29: CBMS Lecture 6

Multivariate Areal Data Examples

I Cancer counts for areal units for several different types of cancers

I Employment rates by sectors for a set of areal units

I Individual-level bivariate data within units, e.g., height adjusted for age (HAZ) and weight adjusted for age (WAZ), with areal-unit-level spatial effects for each outcome

I Spatially varying coefficient models with coefficients at areal scale because covariates are at areal scale

Page 30: CBMS Lecture 6

Multivariate Areal Data Models

I Now areal units (e.g., counties) instead of points

I Need to model dependence within and across units

I As in the univariate case, use spatial random effects φji, where again i = 1, . . . , n indexes region but now j = 1, . . . , p indexes variables (e.g., cancer type) within region

I Suppose we observe Yi = (Y1i, Y2i, . . . , Ypi). Then

g(E(Yji)) = xjiᵀβj + φji,

with φi = (φ1i, . . . , φpi) and φ = (φ1, . . . , φn).

I Link function g useful for modeling rates (e.g., Poisson disease mapping).

I Multivariate CAR (MCAR) model for the φji

Page 31: CBMS Lecture 6

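Some models

I Illustrate with p = 2

I A disease mapping example: Y1i, Y2i are counts for diseases 1 and 2 in unit i

Yji ∼ Po(λji), j = 1, 2,

λji = Eji ηji

log ηji = Xjiᵀβj + φji

I Bivariate CAR model for {φ1i, φ2i}

I Height and weight example:

\[
Y_{ir} = \begin{pmatrix} \mathrm{HAZ}_{ir} \\ \mathrm{WAZ}_{ir} \end{pmatrix}
= X_{ir} \begin{pmatrix} \beta^{(H)} \\ \beta^{(W)} \end{pmatrix}
+ \begin{pmatrix} \phi^{(H)}_i \\ \phi^{(W)}_i \end{pmatrix}
+ \begin{pmatrix} \epsilon^{(H)}_{ir} \\ \epsilon^{(W)}_{ir} \end{pmatrix}
\]

I Bivariate CAR model for {φi^(H), φi^(W)}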

Page 32: CBMS Lecture 6

Multivariate CAR (MCAR) models

I Again, local or neighbor idea, conditioning, CAR

I Approach 1: multivariate CAR (MCAR) in the form p(φi | φj, j ≠ i) with

\[
p(\phi_i \mid \phi_{j \neq i}, \Sigma_i) = N\Big( \sum_j B_{ij}\, \phi_j,\; \Sigma_i \Big), \quad i = 1, \ldots, n
\]

I As earlier, Brook’s Lemma yields p(φ), improper, etc.

I Simplification: Bij = bij I , bij = wij/wi+,Σi = ( 1wi+

I To make proper, add ρ or perhaps ρj , j = 1, . . . , p

Page 33: CBMS Lecture 6

cont.

I A coregionalization approach (straightforward)

I With, say, p = 2, write

\[
\begin{pmatrix} \phi_{1i} \\ \phi_{2i} \end{pmatrix} = A \begin{pmatrix} \eta_{1i} \\ \eta_{2i} \end{pmatrix}
\]

I η1i ∼ CAR(τ1), η2i ∼ CAR(τ2)

I η1i, η2i independent