A Model for Extremes on a Regular Spatial Lattice
Dan Cooley, Grant Weller
Department of Statistics, Colorado State University
Support obtained from NSF grant DMS-0905315
Outline
1. Motivating Data
2. Multivariate Extremes and the Angular Measure
3. A Big (i.e., High Dimensional) Extremes Model from Small Extremes Models
4. Inference via a Decomposition
Motivating Data
[Figure: map of daily precipitation amounts from a regional climate model for August 31, 2012. Axes: longitude (−130 to −105) and latitude (30 to 55); color scale 0 to 60.]
Motivating Data
Characteristics of this data:
• Data occurs on a regular lattice.
• Data is very high dimensional (∼ 2500 locations).
• We wish to work with daily data, and not use block (annual) maxima.
• Extremes occur on a small subregion of the spatial domain.
Goal: Construct a threshold exceedance model which captures the spatial tail dependence in this data.
Max-stable Processes
Spatial extremes work has centered around max-stable processes.
Q: Are max-stable processes the right answer?
A: Yes. Limiting distribution of pointwise maxima.
Max-stable process notes:
• process models (i.e., infinite dimension)
• have been used to model block-maximum data
• fitting: composite likelihoods (size limitations?)
Proposed Approach
• not a process model. Instead, a high-dimensional multivariate model.
• designed to handle threshold exceedances.
• constructed from the perspective of multivariate regular variation.
• describes tail dependence via the angular (spectral) measure.
• overall spatial model is composed of smaller models.
Multivariate Regular Variation
Idea: Joint tail behavior like a power function.
So What? Because it is defined in terms of tail behavior, it provides a framework for describing the joint tail.
One definition: Let $R = \|Z\|$ and $W = \|Z\|^{-1} Z$. $Z$ is regularly varying if there exists a normalizing sequence $\{b_n\}$, where $P(b_n^{-1}\|Z\| > r) \sim 1/n$, such that
\[ n P\left(b_n^{-1} R > r,\ W \in B\right) \xrightarrow{v} r^{-\alpha} H(B), \]
where $d$ is the dimension of $Z$, and where $H$ is some probability measure on the unit ‘ball’ $S_d = \{z \in \mathbb{R}^d \mid \|z\| = 1\}$.
• left says “as points get big”
• measure on right is a product measure.
• angular measure H describes distribution of directions; completely describes dependence.
• definition requires a common tail behavior.
Multivariate Regular Variation, in Practice
• it is standard practice to convert marginals to a common and convenient distribution, similar to copula methods.
• as points get large, it is natural to think in terms of pseudo-polar coordinates.
• goal is to model the angular (or spectral) measure H.
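The pseudo-polar transformation above can be sketched in a few lines. This is a minimal illustration, assuming the marginals have already been transformed to a common nonnegative scale and that the L1 norm is used; the function names `to_pseudo_polar` and `largest_angles` are hypothetical and not from the talk.

```python
import numpy as np

def to_pseudo_polar(z):
    """Split nonnegative vectors (rows of z) into an L1 radius and an angle.

    Assumes every row has a strictly positive sum.
    """
    z = np.atleast_2d(np.asarray(z, dtype=float))
    r = z.sum(axis=1)          # L1 norm, valid because entries are >= 0
    w = z / r[:, None]         # angular component, lies on the unit simplex
    return r, w

def largest_angles(z, n_keep):
    """Radii and angles of the n_keep observations with the largest radius."""
    r, w = to_pseudo_polar(z)
    idx = np.argsort(r)[::-1][:n_keep]
    return r[idx], w[idx]

# Four toy observations in d = 2; keep the two largest by radius.
z = np.array([[1.0, 3.0], [10.0, 0.5], [0.2, 0.2], [4.0, 4.0]])
r_big, w_big = largest_angles(z, n_keep=2)
```

The angles of the retained points are the raw material for estimating the angular measure H; how many points to keep (the threshold) is a modeling choice not fixed by this sketch.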
Center-of-Mass Condition
We assume α = 1 and common marginals.
If we choose to use the L1 norm, $\|Z\| = Z_1 + \ldots + Z_d$, these assumptions imply
\[ \int_{S_d} w_i \, H(dw) = 1/d; \]
the “center-of-mass” condition.
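As a quick numerical illustration of the center-of-mass condition, the sketch below draws angles from a symmetric Dirichlet distribution, used here only as an assumed stand-in for a valid H, and compares the sample mean of each component with 1/d. An asymmetric Dirichlet is included to show a probability measure on the simplex that does not satisfy the condition.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5

# Symmetric Dirichlet: every component has mean 1/d, so the
# center-of-mass condition holds.
w_sym = rng.dirichlet(np.ones(d), size=200_000)
center_sym = w_sym.mean(axis=0)

# Asymmetric Dirichlet: still a probability measure on the simplex,
# but its component means differ, so it is not a valid angular measure
# under the common-marginals assumption.
w_asym = rng.dirichlet(np.array([5.0, 1.0, 1.0, 1.0, 1.0]), size=200_000)
center_asym = w_asym.mean(axis=0)
```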
More about the Simplex Sd
$S_d = \{w \in [0,1]^d \mid \|w\| = 1\}$
• actually of dimension $(d-1)$, often denoted $S_{d-1}$.
One can speak of lower-dimensional simplices. Following Coles and Tawn (1991), let $l \in \{1, \ldots, d\}$ denote the (lower) dimension, and $c = \{v_1, \ldots, v_l\}$ be an index variable of subspaces of size $l$.
Then $S_{l,c} = \{w \in S_d \mid w_i = 0,\ i \notin c;\ w_j > 0,\ j \in c\}$.
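To make the face notation concrete, the following sketch maps a point of the simplex to the pair (l, c) of the lower-dimensional simplex that contains it. The helper name `face_index` and the optional tolerance are assumptions for illustration.

```python
def face_index(w, tol=0.0):
    """Return (l, c) for a point w of the simplex.

    c is the tuple of 1-based indices whose component exceeds tol,
    and l = len(c) is the dimension label of the face S_{l,c}.
    """
    c = tuple(i + 1 for i, wi in enumerate(w) if wi > tol)
    return len(c), c

# A point of S_4 with mass only on components 2 and 4.
l, c = face_index([0.0, 0.25, 0.0, 0.75])
```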
A Big Model from Small Models
Toy Example:
1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
16 17 18 19 20
Let i = 1, . . . , d denote the locations on a lattice.
Defining Subregions
Using a ‘template’, define overlapping subregions $R_j = \{i : \text{cell } i \in \text{subregion } j\}$ to cover the spatial domain. Denote by $\mathcal{R}$ the set of subregions $\{R_1, \ldots, R_J\}$.
Template: ‘cross’
[Figure: three copies of the toy lattice, each highlighting one of the subregions below.]
$R_1 = \{1,2,6\}$, $R_2 = \{1,2,3,7\}$, $R_{13} = \{8,12,13,14,18\}$
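The subregions listed above can be generated mechanically. The sketch below assumes the cross template consists of a cell plus its four horizontal and vertical neighbors, that cells are numbered row by row starting at 1, and that the template is clipped at the lattice boundary; `cross_subregions` is a hypothetical helper name.

```python
def cross_subregions(n_rows, n_cols):
    """Map each cell (1-based, row-major) to the cross subregion centered on it."""
    offsets = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]
    regions = {}
    for r in range(n_rows):
        for c in range(n_cols):
            cells = set()
            for dr, dc in offsets:
                rr, cc = r + dr, c + dc
                # Keep only the part of the template inside the lattice.
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    cells.add(rr * n_cols + cc + 1)
            regions[r * n_cols + c + 1] = cells
    return regions

# The 4 x 5 toy lattice.
R = cross_subregions(4, 5)
```

Under these assumptions, `R[1]`, `R[2]`, and `R[13]` reproduce the three subregions shown for the toy lattice.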
Building a Big EV Model from Small EV Models
Fundamental Model Assumption
We assume that $H(S_{l,c}) = 0$ if $c \notin \mathcal{R}$.
• That is, we assume that cells are asymptotically dependent only if there is a subregion which contains them.
• Made for modeling purposes.
• Corresponds to the idea that extreme events have a limited spatial extent and occur only on small subregions of the study region.
We call $S_j$ the simplex that corresponds to $R_j$.
Furthermore, let us assume that $H$ is differentiable on $S_j$, and denote by $h_j$ the angular density on $S_j$.
Idea: build models for $h_j$, combine to get an overall model.
Building a Big EV Model from Small EV Models
Assume that for each subregion $R_j$, we have $g_j(w)$, a valid $|R_j|$-dimensional angular measure; that is, it meets the ‘center-of-mass’ conditions.
Assume $h_j = a_j g_j$. Can we find weights $a_j$ which yield a valid (overall) angular measure?
We need
\[ \int_{S_d} w_i \, H(dw) = 1/d. \]
Building a Valid Overall Angular Measure
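\begin{align*}
\int_{S_d} w_i \, H(dw) &= \sum_{j=1}^{J} \int_{S_j} w_i h_j(w)\, dw \\
&= \sum_{j=1}^{J} a_j \int_{S_j} w_i g_j(w)\, dw \\
&= \sum_{j=1}^{J} a_j I_{\{i \in R_j\}} \int_{S_j} w_i g_j(w)\, dw \\
&= \sum_{j=1}^{J} a_j I_{\{i \in R_j\}} \frac{1}{|R_j|}.
\end{align*}
Setting this equation equal to $1/d$ for all $i$ yields a linear system
\[ \Gamma a = (1/d)\mathbf{1}, \]
where $\Gamma$ is the $d \times J$ matrix whose $\{i,j\}$th element is $I_{\{i \in R_j\}} \frac{1}{|R_j|}$.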
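The matrix Γ follows directly from the subregion index sets. A minimal sketch, assuming subregions are given as sets of 1-based cell indices (the function name `gamma_matrix` is hypothetical):

```python
import numpy as np

def gamma_matrix(regions, d):
    """Build the d x J matrix with entries I{i in R_j} / |R_j|.

    regions: list of sets of 1-based cell indices, one set per subregion.
    """
    G = np.zeros((d, len(regions)))
    for j, Rj in enumerate(regions):
        for i in Rj:
            G[i - 1, j] = 1.0 / len(Rj)
    return G

# Three of the cross subregions from the toy lattice (d = 20).
regions = [{1, 2, 6}, {1, 2, 3, 7}, {8, 12, 13, 14, 18}]
G = gamma_matrix(regions, d=20)
```

Each column of Γ sums to one, which reflects that every g_j spreads its unit mass evenly, on average, over the cells of R_j.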
Solving for Weights
Model: ‘cross’
[Figure: three copies of the toy lattice with cross-shaped subregions highlighted.]
Say we define subregions by centering the template at each cell, creating 20 subregions.
\[ \Gamma = \begin{pmatrix}
1/3 & 1/4 & 0 & 0 & 0 & 1/4 & 0 & \cdots \\
1/3 & 1/4 & 1/4 & 0 & 0 & 0 & 1/5 & \cdots \\
0 & 1/4 & 1/4 & 1/4 & 0 & 0 & 0 & \cdots \\
\vdots & & & & & & &
\end{pmatrix} \]
Solving Γa = (1/d)1 for a yields strange results.
Solving for Weights
Problem is due to edge effects. Specifically, cells at corners and edges appear in too few subregions.
Fix: Expand the study region and account for missing subregions, and add columns to $\Gamma$ which correspond to the new subregions (of dimension 1 in this model).
\[ \Gamma = \begin{pmatrix}
1/3 & 1/4 & 0 & 0 & 0 & 1/4 & 0 & \cdots & 1 & 1 & 0 & \cdots \\
1/3 & 1/4 & 1/4 & 0 & 0 & 0 & 1/5 & \cdots & 0 & 0 & 1 & \cdots \\
0 & 1/4 & 1/4 & 1/4 & 0 & 0 & 0 & \cdots & 0 & 0 & 0 & 1 \\
\vdots & & & & & & & & & & &
\end{pmatrix} \]
Now the system is underconstrained: $\Gamma$ is $d \times J$ with $J > d$.
A solution: Impose additional constraints such as: subregions with equal coverage get equal weights. Procedure: use SVD to find the null space of $\Gamma$ . . .
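One way the SVD step could look is sketched below: a particular solution plus a basis for the null space describes every solution of the underconstrained system, and additional constraints then pick one. The function name `general_solution` and the tiny 2 × 3 example matrix are assumptions for illustration, not the talk's Γ.

```python
import numpy as np

def general_solution(G, b, tol=1e-10):
    """Return a particular solution of G a = b and a null-space basis of G.

    Every solution has the form a0 + N @ c for some coefficient vector c.
    """
    a0, *_ = np.linalg.lstsq(G, b, rcond=None)       # minimum-norm solution
    _, s, Vt = np.linalg.svd(G, full_matrices=True)
    rank = int((s > tol * s.max()).sum())
    N = Vt[rank:].T                                   # columns span null(G)
    return a0, N

# Toy underconstrained system: 2 equations, 3 unknowns.
G = np.array([[1.0, 0.5, 0.0],
              [0.0, 0.5, 1.0]])
b = np.array([0.5, 0.5])
a0, N = general_solution(G, b)
```

Constraints such as "equal coverage gets equal weights" would then be imposed as linear conditions on the coefficients of the null-space basis.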
Solving for Weights
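SVD procedure not really necessary, as setting $a_j = \frac{|R_j|}{Dd}$, where $D$ is the dimension of the template, is a solution to the linear system $\Gamma a = (1/d)\mathbf{1}$, yielding equal weights for subregions of equal size.
Further interpretation of $a_j$:
Let $A_{R_j} = \{z \in \mathbb{R}^d \mid \|z\| > r,\ z\|z\|^{-1} \in S_j\}$, and $A_r = \{z \in \mathbb{R}^d \mid \|z\| > r\}$. Then,
\[ P\left(b_n^{-1} Z \in A_{R_j} \mid b_n^{-1} Z \in A_r\right) = \frac{n P\left(b_n^{-1} Z \in A_{R_j}\right)}{n P\left(b_n^{-1} Z \in A_r\right)} \to \frac{r^{-1} H(S_j)}{r^{-1}} = a_j. \]
That is, the probability that a large point occurs in region $R_j$ is $a_j$.
It is easily shown that $\sum_{j=1}^{J} a_j = 1$.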
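The closed-form weights can be checked numerically on the toy lattice. The sketch below assumes the study region is expanded by padding the lattice with one extra cell on every side, centering the cross template at every padded cell, and keeping the part of each cross that falls inside the study region; this padding scheme and the name `expanded_cross_regions` are assumptions consistent with the one-dimensional edge subregions described above.

```python
import numpy as np

def expanded_cross_regions(n_rows, n_cols):
    """Cross subregions centered on a lattice padded by one cell per side,
    intersected with the study region (cells numbered 1..d, row-major)."""
    offsets = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]
    regions = []
    for r in range(-1, n_rows + 1):
        for c in range(-1, n_cols + 1):
            cells = {(r + dr) * n_cols + (c + dc) + 1
                     for dr, dc in offsets
                     if 0 <= r + dr < n_rows and 0 <= c + dc < n_cols}
            if cells:                  # padded corners cover no study cell
                regions.append(cells)
    return regions

n_rows, n_cols, D = 4, 5, 5            # D = number of cells in the template
d = n_rows * n_cols
regions = expanded_cross_regions(n_rows, n_cols)

# Gamma with entries I{i in R_j} / |R_j|, and the closed-form weights.
G = np.zeros((d, len(regions)))
for j, Rj in enumerate(regions):
    for i in Rj:
        G[i - 1, j] = 1.0 / len(Rj)
a = np.array([len(Rj) / (D * d) for Rj in regions])
```

With this construction every cell belongs to exactly D subregions, which is why the weights solve the system and sum to one.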
So Far ....
We can take angular measure models on the lower-dimensional simplices which correspond to our pre-defined subregions and combine them to obtain a valid overall angular measure model.
How do you fit the model?
Tasks:
1. How do you define a threshold exceedance and extract them?
2. Estimating model parameters of gj. We will make a stationary assumption; assume dependence parameters do not change with location.
Model Fitting
Angular measure models have been fit via point-process based likelihood methods applied to threshold exceedances.
• Dirichlet of Coles and Tawn (1991)
• Pairwise Beta of Cooley et al. (2010)
• Models of Ballani and Schlather (2011)
Problem: Model assumes that limiting angular measure only has mass on the lower-dimensional simplices; that is, angular components have many elements that are exactly zero. Our observed data, when ‘angularized’ (i.e., ‖Z‖−1Z) will not have components that are exactly zero.
That is, for finite sample sizes, the higher-order terms haven’t completely disappeared.
Proposed Solution: Account/penalize for the higher orderterms.
Decomposition of Z
Theorem: (Weller and Cooley, 2012) Assume $Z$ is regularly varying on cone $C$ with limiting measure $\nu$ with corresponding angular measure $H$. Further assume $\exists\, C_0 \subset C$ s.t. $\nu(C_0) = 0$ and $Z$ exhibits hidden regular variation (Maulik and Resnick, 2004) on $C_0$. Then there exist independent random vectors $Y$ and $E$ s.t. $Z$ is tail equivalent to $Y + E$ on both $C$ and $C_0$.
Idea: Assume $Y$ has angular measure $H$, and that $E$ is the ‘error’ term which is regularly varying (with lighter tail) and which accounts for the difference between the large observations of a finite sample and the asymptotic limiting behavior.
Example in two dimensions
[Figure: three scatterplots on axes from 0 to 1500, arranged as a sum: Y (Y1 vs. Y2) + E (E1 vs. E2) = Y + E (Y1 + E1 vs. Y2 + E2).]
No point has component exactly equal to zero.
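A two-dimensional picture of this kind can be imitated with a small simulation. The sketch below is only an illustration under assumed distributions: Y puts all its mass on the two axes with a Pareto tail of index 1, and E has two strictly positive independent components with a lighter Pareto tail of index 2. None of these specific choices are taken from the talk.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000

# Y: mass only on the axes, heavy tail with P(radius > r) = 1/r for r >= 1.
axis = rng.integers(0, 2, size=n)
radius = 1.0 / (1.0 - rng.uniform(size=n))       # uniform on (0, 1] avoids 1/0
Y = np.zeros((n, 2))
Y[np.arange(n), axis] = radius

# E: both components positive, lighter tail with P(E_i > r) = r**-2 for r >= 1.
E = (1.0 - rng.uniform(size=(n, 2))) ** -0.5

# Observed vector: no component of Z is exactly zero, although the largest
# points still lie close to one of the axes.
Z = Y + E
```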
Inference for Z = Y +E
If Y and E were observed, we could use a point-process likelihood approach applied to the large values of each.
Of course, we only observe Z.
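Work in progress: A Monte Carlo EM for obtaining MLEs for a likelihood which is valid only for the large values.
\[
L(y, e; \theta) = \prod_{j=1}^{J} \left\{ \frac{1}{m_j!} \exp\left[-a_j \left(\frac{u_Y}{b_n}\right)^{-1}\right] \times \prod_{k=1}^{m_j} a_j g_j\left(\frac{y_{(j,k)}}{\|y_{(j,k)}\|}; \theta\right) \left(\frac{\|y_{(j,k)}\|}{b_n}\right)^{-2} \right\}
\times \prod_{i=1}^{d} \frac{\exp\left(-\frac{u_E}{c_n}\right)}{m_i!} \prod_{k=1}^{m_i} \left(\frac{\|e_{(i,k)}\|}{c_n}\right)^{-3}
\]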
Temporary solution: Assign Y to subregion with largest norm and set these elements of E to zero.
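The temporary assignment rule can be sketched as follows, assuming a nonnegative field stored as a vector, the L1 norm, and subregions given as sets of 1-based cell indices; ties are broken by taking the first subregion with the largest norm. The function name `split_by_largest_subregion` is hypothetical.

```python
import numpy as np

def split_by_largest_subregion(z, regions):
    """Split z into y + e, where y keeps z on the subregion with largest L1 norm.

    regions: list of sets of 1-based cell indices. Returns (j, y, e).
    """
    z = np.asarray(z, dtype=float)
    # For nonnegative z the sum over a subregion is its L1 norm there.
    norms = [z[[i - 1 for i in sorted(Rj)]].sum() for Rj in regions]
    j = int(np.argmax(norms))
    idx = [i - 1 for i in sorted(regions[j])]
    y = np.zeros_like(z)
    y[idx] = z[idx]        # Y carries the chosen subregion
    e = z - y              # E is zero on that subregion, z elsewhere
    return j, y, e

# Toy field on six cells with three candidate subregions.
z = np.array([1.0, 9.0, 2.0, 0.5, 7.0, 3.0])
regions = [{1, 2}, {2, 3, 5}, {4, 6}]
j, y, e = split_by_largest_subregion(z, regions)
```

Because y and e have disjoint supports, their L1 norms add up to the L1 norm of the original field.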
Example: Smith (1990) Fields
• 20× 20
• max-stable fields with unit Fréchet marginals
• regularly varying fields with α = 1 and common marginals
[Figure: example 20 × 20 field; axes x and y; color scale 10 to 70.]
Defining and Extracting Threshold Exceedances
[Figure: three panels on the 20 × 20 grid (axes x and y, color scale 1000 to 4000): Original Field, y, and e.]
‖z‖ = 46646, ‖y‖ = 17483, ‖e‖ = 29163
Note: Cross model likely doesn’t fit data well, as large values do not appear to be contained by template.
Model Fitting (Preliminary)
Cross Model (5 dimensional)
In theory, we can choose any 5-dimensional angular measure model:
• Dirichlet (Not spatial, Coles and Tawn (1991))
• Pairwise Beta (Not spatial, Cooley et al. (2010))
• Smith (fit via an ad-hoc pairwise likelihood approach)
Model Fitting (Preliminary)
Cross Model (5 dimensional)
[Figure: theta (1.0 to 2.0) versus distance (0 to 10) for the cross model. Legend: Dirichlet / Pairwise Beta / Smith.]
Model Fitting (Preliminary)
Star Model (13-dimensional)
1 2 3 4 5 6 7 8
9 10 11 12 13 14 15 16
17 18 19 20 21 22 23 24
25 26 27 28 29 30 31 32
33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48
49 50 51 52 53 54 55 56
• Dirichlet
• Pairwise Beta
• Smith
Model Fitting (Preliminary)
Star Model (13-dimensional)
[Figure: theta (1.0 to 2.0) versus distance (0 to 10) for the star model. Legend: Dirichlet / Pairwise Beta / Smith.]
Summary
• Proposed a finite-dimensional model for high-dimensional spatial data.
  – Large extremes model constructed from small models.
  – Results in a valid angular-density model.
  – Makes an assumption of asymptotic independence.
  – Different approach from max-stable processes.
  – Designed to model spatial threshold exceedances.
• Model fitting/model selection.
  – Threshold exceedances via Y’s radial component.
  – Models fit via a point-process likelihood.
    ∗ Requires a flexible finite-dimensional angular measure model at subregion level.
    ∗ Possible EM algorithm for Y and E components.
  – Model selection yet to be explored.
    ∗ Possible likelihood-based with E serving as penalty?
References
Ballani, F. and Schlather, M. (2011). A construction principle for multivariate extreme value distributions. Biometrika, 98:633–645.
Coles, S. and Tawn, J. (1991). Modeling multivariate extreme events. Journal of the Royal Statistical Society, Series B, 53:377–392.
Cooley, D., Davis, R. A., and Naveau, P. (2010). The pairwise beta: A flexible parametric multivariate model for extremes. Journal of Multivariate Analysis, 101:2103–2117.
Maulik, K. and Resnick, S. (2004). Characterizations and examples of hidden regular variation. Extremes, 7(1):31–67.
Smith, R. (1990). Max-stable processes and spatial extremes. Technical report, University of North Carolina, Chapel Hill.
Weller, G. and Cooley, D. (2012). An alternative characterization of hidden regular variation in joint tail modeling. Technical Report 2012/2, Colorado State University, Department of Statistics.