On the parametric and non-parametric estimation of the dependence function in multivariate extremes
Sabrina Vettori, Raphaël Huser and Marc G. Genton
November 17, 2014
Why study multivariate extremes?
• In a wide range of applications, including hydrology, finance, air pollution control, environmental research, climate change and seismic analysis, it is of interest to model and predict extreme events.
• The aim is to investigate the extremal behaviour of several variables at high levels, such as extremes of natural phenomena observed at distinct locations, and to understand whether extremes tend to occur simultaneously or not.
Sabrina Vettori, Raphaël Huser and Marc G. Genton | KAUST 2/28
Main Goal
Investigate the performance of various estimators of extremal dependence under different dependence scenarios, focusing on the comparison between non-parametric and parametric approaches.
Outline
1. Bivariate extreme-value theory
2. Extension to the multivariate framework
3. Inference
4. Simulation study
5. Results
6. Discussion
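1 Bivariate extreme-value theory

General framework

Let $(X_{i,1}, X_{i,2})$, $i = 1, 2, \ldots$, be an i.i.d. sequence of r.v. having common d.f. $F$. We define the bivariate vector of componentwise maxima as

$$M_n = (M_{n,1}, M_{n,2}) = \left(\max_{1 \le i \le n} X_{i,1},\ \max_{1 \le i \le n} X_{i,2}\right).$$

[Figure: scatter plot of $X_2$ against $X_1$, with the vector of componentwise maxima $(x_1, x_2)$ marked.]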
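As a small illustration of the definition above, the following sketch computes the vector of componentwise maxima for a simulated sample. The choice of $F$ (independent standard normal coordinates), the sample size and the function name `componentwise_maxima` are assumptions made only for this example.

```python
import random

def componentwise_maxima(pairs):
    """Return (max of first coordinates, max of second coordinates)."""
    xs, ys = zip(*pairs)
    return max(xs), max(ys)

rng = random.Random(0)  # fixed seed, purely for reproducibility
n = 100
# Illustrative choice of F: independent standard normal coordinates.
sample = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(n)]
m_n = componentwise_maxima(sample)

# The two maxima usually come from different observations, so the vector
# of maxima is generally not one of the observed pairs.
print(m_n, m_n in sample)
```

The point of the last line is that $M_n$ is built margin by margin, which is why its dependence structure has to be modelled separately from the margins.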
General framework
If for some vectors $a_n = (a_{n,1}, a_{n,2}) \in \mathbb{R}^2_+$ and $b_n = (b_{n,1}, b_{n,2}) \in \mathbb{R}^2$, the rescaled vector $M^*_n = a_n^{-1}(M_n - b_n)$ satisfies, as $n \to \infty$,

$$P\left(M^*_{n,1} \le x_1, M^*_{n,2} \le x_2\right) = F^n(a_{n,1}x_1 + b_{n,1},\ a_{n,2}x_2 + b_{n,2}) \to G(x_1, x_2),$$

where the margins of $G$ are non-degenerate, then $G$ is a bivariate extreme-value d.f. and $F$ is said to belong to the max-domain of attraction (MDA) of $G$.
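The margins $G_j$ of $G$, $j = 1, 2$, are generalized extreme-value (GEV) distributed:

$$G(x \mid \mu, \sigma, \xi) = \exp\left[-\left\{1 + \xi\left(\frac{x - \mu}{\sigma}\right)\right\}_+^{-1/\xi}\right], \quad x \in \mathbb{R},$$

where $\mu \in \mathbb{R}$, $\sigma > 0$, $\xi \in \mathbb{R}$ and $a_+ = \max(0, a)$.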
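The GEV distribution function can be evaluated directly from the formula. The sketch below is one possible implementation; the function name `gev_cdf` is hypothetical, and the case $\xi = 0$ is treated as the Gumbel limit $\exp\{-\exp(-(x-\mu)/\sigma)\}$, which the formula itself leaves implicit.

```python
import math

def gev_cdf(x, mu, sigma, xi):
    """GEV distribution function G(x | mu, sigma, xi), positive-part convention."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    if xi == 0.0:
        # Gumbel case, taken as the limit of the formula as xi -> 0 (assumption)
        return math.exp(-math.exp(-z))
    base = max(0.0, 1.0 + xi * z)
    if base == 0.0:
        # Outside the support: below the lower endpoint when xi > 0,
        # above the upper endpoint when xi < 0.
        return 0.0 if xi > 0 else 1.0
    try:
        return math.exp(-base ** (-1.0 / xi))
    except OverflowError:
        # base ** (-1/xi) is too large to represent, so G is numerically zero
        return 0.0

# With mu = sigma = xi = 1 the formula reduces to exp(-1/x) for x > 0.
print(gev_cdf(2.0, 1.0, 1.0, 1.0), math.exp(-0.5))
```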
Dependence function A(ω)
Suppose the margins have unit Fréchet d.f., i.e. $G_j(x) = \exp(-1/x)$ for $x > 0$, which corresponds to the case $\mu = \sigma = \xi = 1$ in the GEV d.f.

A possible representation for $G$ is via the dependence function $A$:

$$G(x_1, x_2) = \exp\left\{-\left(\frac{1}{x_1} + \frac{1}{x_2}\right) A\left(\frac{x_2}{x_1 + x_2}\right)\right\}, \quad x_1 > 0,\ x_2 > 0,$$

where $A$ is a function on $[0, 1]$ satisfying the conditions:
i) $A$ is convex;
ii) $\max(\omega, 1 - \omega) \le A(\omega) \le 1$ for $\omega = x_2/(x_1 + x_2) \in [0, 1]$.
Measure of dependence
A summary of the dependence strength is the extremal coefficient

$$\theta = 2A(1/2) \in [1, 2].$$
Independence and complete dependence occur if θ = 2 or θ = 1, respectively.
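Parametric models for A(ω)
⇒ Logistic model (Gumbel 1960)

$$A(\omega) = \left\{(1 - \omega)^{1/\alpha} + \omega^{1/\alpha}\right\}^{\alpha},$$

with $0 < \alpha \le 1$.
⇒ Asymmetric logistic model (Tawn 1988)

$$A(\omega) = (\psi - \phi)\omega + 1 - \psi + \left[\{\psi(1 - \omega)\}^{1/\alpha} + (\phi\omega)^{1/\alpha}\right]^{\alpha},$$

with $0 < \alpha \le 1$ and $0 \le \psi, \phi \le 1$.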
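The two parametric models and the extremal coefficient are easy to evaluate numerically. The sketch below writes both models with the exponent $1/\alpha$, the form for which the constraints $\max(\omega, 1-\omega) \le A(\omega) \le 1$ hold, and checks those constraints on a grid. All function names are hypothetical, and the parameter values are simply those quoted for the plotted curves.

```python
def a_logistic(w, alpha):
    """Logistic dependence function, 0 < alpha <= 1, 0 <= w <= 1."""
    return ((1.0 - w) ** (1.0 / alpha) + w ** (1.0 / alpha)) ** alpha

def a_asym_logistic(w, alpha, psi, phi):
    """Asymmetric logistic dependence function, 0 < alpha <= 1, 0 <= psi, phi <= 1."""
    inner = (psi * (1.0 - w)) ** (1.0 / alpha) + (phi * w) ** (1.0 / alpha)
    return (psi - phi) * w + 1.0 - psi + inner ** alpha

def extremal_coefficient(a_func):
    """theta = 2 A(1/2): 1 under complete dependence, 2 under independence."""
    return 2.0 * a_func(0.5)

# Check the validity constraints on a grid of omega values.
grid = [i / 100 for i in range(101)]
for w in grid:
    for a in (a_logistic(w, 0.6), a_asym_logistic(w, 0.4, 1.0, 0.6)):
        assert max(w, 1.0 - w) - 1e-12 <= a <= 1.0 + 1e-12

# For the logistic model, theta = 2 ** alpha.
print(extremal_coefficient(lambda w: a_logistic(w, 0.6)), 2 ** 0.6)
```

Setting $\psi = \phi = 1$ in the asymmetric model recovers the logistic model, which gives a quick consistency check.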
[Figure: A(ω) plotted against ω ∈ [0, 1] for the logistic model (α = 0.6) and the asymmetric logistic model (α = 0.4, ψ = 1, φ = 0.6).]
Data simulation
⇒ Data can be simulated in the MDA of the logistic model using the outer power Clayton Archimedean copula (Nelsen, 2006):

1. $(U_1, U_2) \sim C_1(u_1, u_2) = \varphi\left\{\varphi^{-1}(u_1) + \varphi^{-1}(u_2)\right\}$, with $\varphi(t) = (t^{\alpha} + 1)^{-1}$.

2. $(X_1, X_2) = (-1/\log U_1, -1/\log U_2)$.

⇒ This result can be extended to simulate data in the MDA of the asymmetric logistic model, using instead

$$C_2(u_1, u_2) = C_1(u_1^{\psi}, u_2^{\phi})\, u_1^{(1-\psi)} u_2^{(1-\phi)}.$$
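One possible way to carry out these two steps is sketched below. The slides do not say how $(U_1, U_2)$ is drawn, so the sampling scheme here is an assumption: a Marshall-Olkin frailty construction for $C_1$, with a positive stable variable generated through Kanter's representation, and a componentwise-maximum device for $C_2$. All function names are hypothetical.

```python
import math
import random

def positive_stable(alpha, rng):
    """Draw S >= 0 with Laplace transform exp(-t**alpha), 0 < alpha <= 1."""
    if alpha == 1.0:
        return 1.0
    theta = math.pi * (1.0 - rng.random())  # uniform on (0, pi]
    w = rng.expovariate(1.0)
    return (math.sin(alpha * theta) / math.sin(theta) ** (1.0 / alpha)
            * (math.sin((1.0 - alpha) * theta) / w) ** ((1.0 - alpha) / alpha))

def phi(t, alpha):
    """Generator phi(t) = (t**alpha + 1)**(-1)."""
    return 1.0 / (t ** alpha + 1.0)

def sample_c1(alpha, rng):
    """One draw (U1, U2) from C1 via a frailty variable V."""
    # V = S * W**(1/alpha), W ~ Exp(1), has Laplace transform phi.
    v = positive_stable(alpha, rng) * rng.expovariate(1.0) ** (1.0 / alpha)
    return tuple(phi(rng.expovariate(1.0) / v, alpha) for _ in range(2))

def sample_c2(alpha, psi, phi_par, rng):
    """One draw from C2(u1, u2) = C1(u1**psi, u2**phi) u1**(1-psi) u2**(1-phi)."""
    v1, v2 = sample_c1(alpha, rng)

    def mix(v, a):
        # max(V**(1/a), W**(1/(1-a))) with W uniform and independent of V;
        # the boundary cases a = 1 and a = 0 are handled separately.
        if a == 1.0:
            return v
        w = rng.random()
        if a == 0.0:
            return w
        return max(v ** (1.0 / a), w ** (1.0 / (1.0 - a)))

    return mix(v1, psi), mix(v2, phi_par)

def to_frechet(u):
    """Step 2: X = -1 / log U, which has a unit Frechet distribution."""
    return -1.0 / math.log(u)

rng = random.Random(2014)
u1, u2 = sample_c1(0.5, rng)
print(to_frechet(u1), to_frechet(u2))
```

Componentwise maxima over blocks of such draws can then serve as approximate samples from the limiting extreme-value model.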
2 Extension to the multivariate case
Multivariate dependence function A(ω), d > 2
The representation of $G$ via the dependence function $A$ becomes

$$G(x_1, \ldots, x_d) = \exp\left\{-\left(\sum_{j=1}^{d} x_j^{-1}\right) A\left(\frac{x_1}{x_1 + \cdots + x_d}, \ldots, \frac{x_{d-1}}{x_1 + \cdots + x_d}\right)\right\},$$

where $A(\omega)$, $\omega = (\omega_1, \ldots, \omega_{d-1})^{\top}$, is now a function on the unit simplex $S_d$, satisfying
i) $A$ is convex;
ii) $\max(\omega) \le A(\omega) \le 1$ for all $\omega \in S_d$.
3 Inference
Two main aspects
1. Study the distribution of the univariate margins and standardise them to a common standard Fréchet scale;

2. Estimate the dependence function A(ω).

The marginal and dependence parameters can be estimated either separately or jointly; the joint approach gives a better assessment of the overall uncertainty.
Parametric approach
A parametric model may be fitted by maximum likelihood to the standardised data $\tilde{m}_{i,j} = -1/\log\{G_j(m_{i,j})\}$.

The log-likelihood function may be expressed as

$$\ell(\delta) = \sum_{i=1}^{M} \log g\{(\tilde{m}_{i,1}, \tilde{m}_{i,2}); \delta\},$$

where $g$ is the joint probability density function of the bivariate extreme-value model and $(m_{i,1}, m_{i,2})$, $i = 1, \ldots, M$, are the observed maxima.
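As a sketch of this step for the logistic model, write $G = \exp(-V)$ with $V(x_1, x_2) = (x_1^{-1/\alpha} + x_2^{-1/\alpha})^{\alpha}$ and $S = x_1^{-1/\alpha} + x_2^{-1/\alpha}$. Differentiating gives $\log g = -(1/\alpha + 1)(\log x_1 + \log x_2) + (\alpha - 2)\log S + \log\{S^{\alpha} + (1-\alpha)/\alpha\} - S^{\alpha}$. The code below maximises the resulting log-likelihood by a plain grid search over $\alpha \in [0.1, 1]$. The grid, the function names and the illustrative data (independent unit Fréchet pairs, i.e. $\alpha = 1$) are assumptions; a numerical optimiser would normally replace the grid.

```python
import math
import random

def logistic_loglik(alpha, data):
    """Log-likelihood of the bivariate logistic model for unit Frechet data."""
    total = 0.0
    for x1, x2 in data:
        s = x1 ** (-1.0 / alpha) + x2 ** (-1.0 / alpha)
        v = s ** alpha  # exponent function V(x1, x2) = -log G(x1, x2)
        total += (-(1.0 / alpha + 1.0) * (math.log(x1) + math.log(x2))
                  + (alpha - 2.0) * math.log(s)
                  + math.log(v + (1.0 - alpha) / alpha)
                  - v)
    return total

def fit_logistic(data, grid_size=91):
    """Grid-search maximum likelihood estimate of alpha on [0.1, 1]."""
    grid = [0.1 + 0.9 * k / (grid_size - 1) for k in range(grid_size)]
    return max(grid, key=lambda a: logistic_loglik(a, data))

rng = random.Random(42)
# Illustrative data: independent unit Frechet pairs (1 / Exp(1) is unit Frechet),
# which corresponds to the logistic model with alpha = 1.
data = [(1.0 / rng.expovariate(1.0), 1.0 / rng.expovariate(1.0)) for _ in range(500)]
alpha_hat = fit_logistic(data)
print(alpha_hat)
```

At $\alpha = 1$ the log-density reduces to the sum of two unit Fréchet log-densities, which is a convenient check of the formula.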
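Non-parametric approach
For $m^*_{i,j} = -\log\{G_j(m_{i,j})\}$, the variable

$$\kappa(\omega) = \min\left(\frac{m^*_1}{1 - \omega}, \frac{m^*_2}{\omega}\right), \quad \omega \in [0, 1],$$

has an exponential distribution with rate $A(\omega)$, so that

$$E\{\kappa(\omega)\} = A(\omega)^{-1} \quad \text{and} \quad E\{\log \kappa(\omega)\} = -\log A(\omega) - \eta,$$

where $\eta = -\int_0^{\infty} \log(x) e^{-x}\,dx$ is Euler's constant (Genest and Segers 2009).

In practice, we estimate $G_j$ using the empirical d.f. $\hat{G}_j$ and set $\hat{m}^*_{i,j} = -\log\{\hat{G}_j(m_{i,j})\}$.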
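For $\omega \in [0, 1]$, standard non-parametric estimators of $A(\omega)$ are:

Pickands estimator (Pickands, 1981)

$$\hat{A}_P(\omega) = \left\{\frac{1}{M} \sum_{i=1}^{M} \min\left(\frac{m^*_{i,1}}{1 - \omega}, \frac{m^*_{i,2}}{\omega}\right)\right\}^{-1}.$$

HT estimator (Hall and Tajvidi, 2000)

$$\hat{A}_{HT}(\omega) = \left\{\frac{1}{M} \sum_{i=1}^{M} \min\left(\frac{m^*_{i,1}/\bar{m}^*_{M,1}}{1 - \omega}, \frac{m^*_{i,2}/\bar{m}^*_{M,2}}{\omega}\right)\right\}^{-1},$$

where $\bar{m}^*_{M,j} = M^{-1} \sum_{i=1}^{M} m^*_{i,j}$, $j = 1, 2$.

CFG estimator (Capéràa et al., 1997)

$$\hat{A}_{CFG}(\omega) = \exp\left[\log\{\hat{A}(\omega)\} - \omega \log\{\hat{A}(1)\} - (1 - \omega) \log\{\hat{A}(0)\}\right],$$

where

$$\log \hat{A}(\omega) = -\eta - \frac{1}{M} \sum_{i=1}^{M} \log\left\{\min\left(\frac{m^*_{i,1}}{1 - \omega}, \frac{m^*_{i,2}}{\omega}\right)\right\}.$$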
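The three estimators translate almost line by line into code. The sketch below is a minimal version under two assumptions that are not in the slides: the empirical d.f. is rescaled as rank/(M + 1), so that the largest observation does not map to $-\log 1 = 0$, and at $\omega = 0$ or $\omega = 1$ the minimum is taken to be the finite argument. Function names are hypothetical.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # eta = -int_0^inf log(x) exp(-x) dx

def pseudo_exponential(sample):
    """Map one margin to m* = -log G_hat(m), with G_hat(m) = rank / (M + 1)."""
    m = len(sample)
    order = sorted(range(m), key=lambda i: sample[i])
    out = [0.0] * m
    for rank, i in enumerate(order, start=1):
        out[i] = -math.log(rank / (m + 1.0))
    return out

def _kappa(e1, e2, w):
    """min(e1 / (1 - w), e2 / w), using the finite argument at w = 0 or w = 1."""
    if w == 0.0:
        return e1
    if w == 1.0:
        return e2
    return min(e1 / (1.0 - w), e2 / w)

def pickands(e1, e2, w):
    return len(e1) / sum(_kappa(a, b, w) for a, b in zip(e1, e2))

def hall_tajvidi(e1, e2, w):
    mean1, mean2 = sum(e1) / len(e1), sum(e2) / len(e2)
    return len(e1) / sum(_kappa(a / mean1, b / mean2, w) for a, b in zip(e1, e2))

def _log_a_raw(e1, e2, w):
    logs = sum(math.log(_kappa(a, b, w)) for a, b in zip(e1, e2))
    return -EULER_GAMMA - logs / len(e1)

def cfg(e1, e2, w):
    """CFG estimator with the endpoint correction at w = 0 and w = 1."""
    return math.exp(_log_a_raw(e1, e2, w)
                    - w * _log_a_raw(e1, e2, 1.0)
                    - (1.0 - w) * _log_a_raw(e1, e2, 0.0))

# Illustrative data: two independent margins, for which A(w) = 1.
rng = random.Random(7)
x1 = [rng.random() for _ in range(2000)]
x2 = [rng.random() for _ in range(2000)]
e1, e2 = pseudo_exponential(x1), pseudo_exponential(x2)
print(pickands(e1, e2, 0.5), hall_tajvidi(e1, e2, 0.5), cfg(e1, e2, 0.5))
```

By construction the HT and CFG versions return exactly 1 at the endpoints, whereas the Pickands estimator need not.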
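A non-parametric estimator of $A(\omega)$ can be constructed based on the madogram $\nu(\omega)$, defined as

$$\nu(\omega) = E\left\{\max\left(G_1^{1/\omega}, G_2^{1/(1-\omega)}\right) - \frac{1}{2}\left(G_1^{1/\omega} + G_2^{1/(1-\omega)}\right)\right\},$$

with $0 < \omega < 1$.

MD estimator (Naveau et al., 2009 and Marcon et al., 2014)

$$\hat{A}_{MD}(\omega) = \frac{\nu_M(\omega) + c(\omega)}{1 - \nu_M(\omega) - c(\omega)},$$

where $c(\omega) = \dfrac{1}{2}\left(\dfrac{\omega}{1 + \omega} + \dfrac{1 - \omega}{2 - \omega}\right)$ and

$$\nu_M(\omega) = \frac{1}{M} \sum_{i=1}^{M} \left\{\max\left(G_1^{1/\omega}(m^*_{i,1}), G_2^{1/(1-\omega)}(m^*_{i,2})\right) - \frac{1}{2}\left(G_1^{1/\omega}(m^*_{i,1}) + G_2^{1/(1-\omega)}(m^*_{i,2})\right)\right\}.$$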
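A minimal sketch of the madogram estimator follows. It takes $c(\omega) = \tfrac{1}{2}\{\omega/(1+\omega) + (1-\omega)/(2-\omega)\}$, which is half the sum of the means of $U^{1/\omega}$ and $U^{1/(1-\omega)}$ for a uniform $U$, and it evaluates the margins with a rank-based empirical d.f. rescaled by M + 1 (an assumption). Note that in this formula $\omega$ is paired with the first margin, so for asymmetric models the orientation should be checked against the convention used for the other estimators. Function names are hypothetical.

```python
import random

def empirical_cdf_values(sample):
    """G_hat evaluated at each observation, as rank / (M + 1)."""
    m = len(sample)
    order = sorted(range(m), key=lambda i: sample[i])
    out = [0.0] * m
    for rank, i in enumerate(order, start=1):
        out[i] = rank / (m + 1.0)
    return out

def c_const(w):
    """c(w) = (1/2) * (w / (1 + w) + (1 - w) / (2 - w))."""
    return 0.5 * (w / (1.0 + w) + (1.0 - w) / (2.0 - w))

def madogram(u1, u2, w):
    """Empirical madogram nu_M(w) for 0 < w < 1."""
    total = 0.0
    for a, b in zip(u1, u2):
        p, q = a ** (1.0 / w), b ** (1.0 / (1.0 - w))
        total += max(p, q) - 0.5 * (p + q)
    return total / len(u1)

def a_md(u1, u2, w):
    """Madogram-based estimate of A(w)."""
    s = madogram(u1, u2, w) + c_const(w)
    return s / (1.0 - s)

# Illustrative data: independent margins (A = 1) and identical margins (A = max(w, 1 - w)).
rng = random.Random(11)
x1 = [rng.random() for _ in range(2000)]
x2 = [rng.random() for _ in range(2000)]
u1, u2 = empirical_cdf_values(x1), empirical_cdf_values(x2)
print(a_md(u1, u2, 0.5), a_md(u1, u1, 0.3))
```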
Modifications to satisfy the constraints
• A common modification consists of considering

$$\tilde{A}(\omega) = \min\left\{1, \max\left(\hat{\hat{A}}(\omega), \omega, 1 - \omega\right)\right\},$$

where $\hat{\hat{A}}(\omega)$ is the greatest convex minorant of the non-parametric estimator $\hat{A}(\omega)$.

• The shape-preserving estimator proposed by Marcon et al. (2014) consists of the projection of $\hat{A}(\omega)$ onto the subspace of valid dependence functions that are expressible as linear combinations of $k$th order Bernstein polynomials.
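The first modification can be sketched on a grid of ω values. The code below computes the greatest convex minorant of the points $(\omega_i, \hat{A}(\omega_i))$ as their lower convex hull and then applies the min/max clipping in the order given above. The grid-based approximation and the function names are assumptions made for illustration.

```python
def greatest_convex_minorant(ws, values):
    """Greatest convex minorant of the points (ws[i], values[i]), ws increasing."""
    hull = []  # indices of the vertices of the lower convex hull
    for i in range(len(ws)):
        while len(hull) >= 2:
            i0, i1 = hull[-2], hull[-1]
            cross = ((ws[i1] - ws[i0]) * (values[i] - values[i0])
                     - (values[i1] - values[i0]) * (ws[i] - ws[i0]))
            if cross <= 0:
                hull.pop()  # vertex i1 lies on or above the chord from i0 to i
            else:
                break
        hull.append(i)
    out = list(values)
    # Linear interpolation between consecutive hull vertices.
    for i0, i1 in zip(hull, hull[1:]):
        for i in range(i0, i1 + 1):
            t = (ws[i] - ws[i0]) / (ws[i1] - ws[i0])
            out[i] = (1.0 - t) * values[i0] + t * values[i1]
    return out

def modified_estimate(ws, values):
    """min{1, max(gcm(w), w, 1 - w)} evaluated on the grid."""
    gcm = greatest_convex_minorant(ws, values)
    return [min(1.0, max(g, w, 1.0 - w)) for w, g in zip(ws, gcm)]

# Toy example: a raw estimate that is neither convex nor within the bounds.
ws = [0.0, 0.25, 0.5, 0.75, 1.0]
raw = [1.0, 0.9, 0.95, 0.7, 1.0]
print(modified_estimate(ws, raw))
```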
Marcon et al. approach 2014 (d = 2)
The $j$th Bernstein basis polynomial of degree $k$ is defined as a continuous function with values in $[0, 1]$ as

$$b_j(\omega; k) = \binom{k}{j} \omega^{j} (1 - \omega)^{k-j},$$

where $\binom{k}{j}$ is a binomial coefficient.

The Bernstein polynomial approximation of a function $A(\omega)$ is

$$B_A(\omega; k) = \sum_{j=0}^{k} \beta_j b_j(\omega; k), \quad \beta_j \in \mathbb{R}.$$

Additionally, with $\beta_j = A(j/k)$, $B_A(\omega; k)$ converges uniformly to $A$ as $k$ goes to infinity.
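The uniform convergence can be observed numerically. The sketch below builds the Bernstein polynomial with coefficients $\beta_j = A(j/k)$ for a logistic dependence function with $\alpha = 0.6$ (an illustrative choice) and reports the maximum error on a grid for increasing $k$. Function names are hypothetical.

```python
import math

def bernstein_basis(j, k, w):
    """j-th Bernstein basis polynomial of degree k evaluated at w."""
    return math.comb(k, j) * w ** j * (1.0 - w) ** (k - j)

def bernstein_poly(betas, w):
    """Evaluate sum_j beta_j b_j(w; k), with k = len(betas) - 1."""
    k = len(betas) - 1
    return sum(b * bernstein_basis(j, k, w) for j, b in enumerate(betas))

def bernstein_coefficients(func, k):
    """Coefficients beta_j = func(j / k), j = 0, ..., k."""
    return [func(j / k) for j in range(k + 1)]

def a_logistic(w, alpha=0.6):
    """Logistic dependence function used as the target (illustrative)."""
    return ((1.0 - w) ** (1.0 / alpha) + w ** (1.0 / alpha)) ** alpha

grid = [i / 200 for i in range(201)]

def max_error(k):
    betas = bernstein_coefficients(a_logistic, k)
    return max(abs(bernstein_poly(betas, w) - a_logistic(w)) for w in grid)

for k in (5, 10, 20, 40):
    print(k, max_error(k))
```

Since the basis polynomials are non-negative and sum to one, constraints on the coefficients $\beta_j$ carry over to the shape of the polynomial, which is what makes this basis convenient for shape-preserving estimation.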
The shape-preserving estimator based on a first guess, say $\hat{A}_M$, is obtained as the solution of the optimization problem

$$\tilde{A}_M = \arg\min_{A \in \mathcal{A}} \|\hat{A}_M - A\|_2,$$

where the minimum is taken among all possible measurable functions in the class $\mathcal{A}$ of valid dependence functions, which is a closed and convex subset of $L^2([0, 1])$.

To approximate $\mathcal{A}$, they consider a nested sequence of constrained multivariate Bernstein polynomial families $\mathcal{A}_k \subset \mathcal{A}$. Suitable restrictions on the parameters $\beta_j$ may be found so that each member of $\mathcal{A}_k$ satisfies the dependence function conditions.
Inference for d>2
• Log-pairwise likelihood based on block maxima

$$\ell_{pair}(\delta) = \sum_{i=1}^{M} \sum_{j_1 < j_2} \log g\{(\tilde{m}_{i,j_1}, \tilde{m}_{i,j_2}); \delta\},$$

where $j_1, j_2 \in \{1, \ldots, d\}$ and $g$ denotes the bivariate density (Padoan et al., 2010; Davison and Gholamrezaee, 2012).

• The non-parametric estimators of the dependence function presented above can easily be generalized to the $d > 2$ case; see e.g. Zhang et al. (2008) and Gudendorf and Segers (2012).
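The pairwise log-likelihood is a double sum over blocks and over pairs of margins, which the sketch below implements for an arbitrary bivariate log-density supplied by the user. The function names and the calling convention `log_g(x1, x2, delta)` are assumptions. As an illustration, the bivariate density is taken to be that of two independent unit Fréchet variables, for which each margin appears in $d - 1$ pairs.

```python
import math
from itertools import combinations

def pairwise_loglik(delta, maxima, log_g):
    """Sum of log g{(m_i,j1, m_i,j2); delta} over blocks i and pairs j1 < j2."""
    total = 0.0
    for row in maxima:
        for j1, j2 in combinations(range(len(row)), 2):
            total += log_g(row[j1], row[j2], delta)
    return total

def frechet_logpdf(x):
    """Log-density of the unit Frechet distribution, x > 0."""
    return -2.0 * math.log(x) - 1.0 / x

def log_g_independent(x1, x2, delta):
    """Illustrative bivariate log-density: independent unit Frechet margins."""
    return frechet_logpdf(x1) + frechet_logpdf(x2)

# Two blocks of d = 3 standardised maxima (made-up numbers).
maxima = [(0.8, 2.5, 1.1), (3.0, 0.6, 1.7)]
print(pairwise_loglik(None, maxima, log_g_independent))
```

In practice `log_g` would be the log-density of the chosen bivariate extreme-value model, and the function would be passed to a numerical optimiser over `delta`.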
4 Simulation study
Assessment of estimator performance
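We denote the discrepancy between the true curve and the estimated one as $\{\hat{A}(\omega) - A(\omega)\}$.

[Figure: true and estimated curves A(ω) plotted against ω ∈ [0, 1].]

We approximate the root integrated mean square error (RIMSE)

$$\left(E\left[\int_0^1 \{\hat{A}(\omega) - A(\omega)\}^2\,d\omega\right]\right)^{1/2},$$

the integrated bias (IBIAS)

$$\left(\int_0^1 \left[E\{\hat{A}(\omega)\} - A(\omega)\right]^2 d\omega\right)^{1/2},$$

and the integrated standard deviation (ISD)

$$E\left(\int_0^1 \left[\hat{A}(\omega) - E\{\hat{A}(\omega)\}\right]^2 d\omega\right)^{1/2}.$$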
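These three criteria can be approximated by Monte Carlo: expectations are replaced by averages over replicated estimates and integrals by a trapezoidal rule on a grid of ω values. In the sketch below, ISD is computed as the square root of the integrated pointwise variance, which is an assumption about how the expectation and the square root are ordered; with this reading, RIMSE² = IBIAS² + ISD². Function names are hypothetical.

```python
import math

def trapezoid(ws, ys):
    """Trapezoidal approximation of the integral of ys over the grid ws."""
    return sum(0.5 * (ys[i] + ys[i + 1]) * (ws[i + 1] - ws[i])
               for i in range(len(ws) - 1))

def performance(ws, true_curve, estimates):
    """Monte Carlo approximations of RIMSE, IBIAS and ISD.

    estimates is a list of replicated curves, each evaluated on the grid ws.
    """
    r, n = len(estimates), len(ws)
    mean_curve = [sum(est[i] for est in estimates) / r for i in range(n)]
    mse = [sum((est[i] - true_curve[i]) ** 2 for est in estimates) / r
           for i in range(n)]
    bias2 = [(mean_curve[i] - true_curve[i]) ** 2 for i in range(n)]
    var = [sum((est[i] - mean_curve[i]) ** 2 for est in estimates) / r
           for i in range(n)]
    return {
        "RIMSE": math.sqrt(trapezoid(ws, mse)),
        "IBIAS": math.sqrt(trapezoid(ws, bias2)),
        "ISD": math.sqrt(trapezoid(ws, var)),
    }

# Toy example: true curve A = 1 and two constant "estimates".
ws = [0.0, 0.5, 1.0]
truth = [1.0, 1.0, 1.0]
reps = [[1.2, 1.2, 1.2], [1.0, 1.0, 1.0]]
print(performance(ws, truth, reps))
```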
5 Results
Standard non-parametric estimators d = 2: data in the MDA of the logistic model

[Figure: RIMSE×1000, IBIAS×1000 and ISD×1000 plotted against θ (from 1.2 to 2.0) for the P, CFG, HT and MD estimators.]
Standard and shape-preserving estimators d = 2: data in the MDA of the logistic model

[Figure: RIMSE×1000 plotted against θ (from 1.2 to 2.0), one panel per estimator (CFG, HT, MD, P), comparing each estimator with its BP version (P-BP, CFG-BP, HT-BP, MD-BP).]
Parametric and non-parametric estimators d = 2: data in the MDA of the asymmetric logistic model

[Figure: ISD×1000, IBIAS×1000 and RIMSE×1000 plotted against φ (from 0.4 to 1.0), in panels labelled 1.3, 1.5 and 1.7, for the Log, Asy log and CFG estimators.]
Standard non-parametric estimators d = 5: data in the MDA of the logistic model

[Figure: RIMSE×1000, IBIAS×1000 and ISD×1000 plotted against θ (from 2.5 to 3.5) for the CFG, HT and MD estimators.]
Standard and shape-preserving estimators d = 5: data in the MDA of the logistic model

[Figure: RIMSE×1000 plotted against θ (from 2.5 to 3.5), one panel per estimator (CFG, HT, MD), comparing each estimator with its BP version (CFG-BP, HT-BP, MD-BP).]
Parametric and non-parametric estimators d = 5: data in the MDA of the asymmetric logistic model, θ = 3

[Figure: RIMSE×1000, IBIAS×1000 and ISD×1000 plotted against φ (from 0.80 to 1.00) for the Log, Asy log and CFG estimators.]
6 Discussion
• Overall, the bias increases as the strength of the dependence decreases or the asymmetry increases;

• the logistic estimator performs quite well even in the case of mild asymmetry;

• non-parametric methods work well in d = 2 but show high variability in the multivariate case compared to parametric ones.

Specific comments for non-parametric estimators:

• the CFG and HT estimators generally perform well, the CFG especially close to strong dependence;

• the BP modification of the estimators decreases the RIMSE, especially close to independence and in the d = 2 case.
Questions?
Thank you!