
Risk Aggregation

Paul Embrechts

Department of Mathematics, ETH Zurich

Senior SFI Professor

www.math.ethz.ch/~embrechts/

Joint work with P. Arbenz and G. Puccetti


The background

Query by practitioner (2005):

Calculate VaR for the sum of three random variables with given marginals (Pareto, gamma, lognormal) and across a variety of dependence structures (copulas)

Research project:

Numerical evaluation of (generalized) copula convolutions, leading to (G)AEP.


Risk aggregation is relevant for:

- portfolio analysis

- understanding diversification & concentration

- regulatory capital calculations
  - between risk categories
  - within risk categories
  within the Basel III, Solvency 2, SST frameworks

- better understanding of diversification

- we shall only touch upon some aspects


New publication by the Bank for International Settlements


A canonical set-up

- X1, . . . , Xd one-period risks

- ψ : R^d → R an aggregation function

- R a risk measure

Task: calculate R(ψ(X1, . . . ,Xd))

Example: ψ(X1, . . . ,Xd) = X1 + X2 + · · ·+ Xd , R = VaRα, α ∈ (0, 1)

VaRα(X1 + X2 + · · ·+ Xd)

At best:

R_L ≤ R(ψ(X)) ≤ R_U

depending on the underlying model assumptions!


Key issues

- Conditions:
  - Xi ∼ Fi, i = 1, . . . , d
  - known? / unknown? / unknowable?
  - risk versus uncertainty
  - statistical uncertainty
  - model uncertainty

- Dimensionality:
  - small: d ≤ 5, say, versus
  - large: d ∼ 100s

- Extremes matter:
  - in the tails: Extreme Value Theory (EVT)
  - in the interdependence: copulas (may) enter

P[X1 ≤ x1, . . . ,Xd ≤ xd ] = C (F1(x1), . . . ,Fd(xd))


Return to canonical example:

VaRα(X1 + X2 + · · ·+ Xd)

Issues:

- Relevance: sense or nonsense?

- Estimation, calculation

- VaR is additive (=) for comonotonic risks, subadditive (≤) for elliptical risks, and superadditive (>) for:
  - very heavy-tailed risks
  - very skewed risks
  - risks with a special interdependence
  does it matter?

- a measure of frequency (if), not severity (what if)


VaR in finance and insurance

Concerning VaR-calculations in finance and insurance:

- the VaR number itself is only the very last step

- getting the risk-factor mapping and a clean P&L right is far more important

- recall: VaR is a statistical estimate

- often upper (lower) bounds can be found

- find the (best) worst-case VaR given some side conditions


Example for an upper bound for VaR

Theorem (Embrechts-Puccetti)
Let (X1, . . . , Xd) be continuous with equal margins Fi = F, i = 1, . . . , d. Then for α ∈ (0, 1),

VaRα(X1 + · · · + Xd) ≤ D_d^{-1}(1 − α),

where

D_d(s) = inf_{r ∈ [0, s/d)} ( ∫_r^{s−(d−1)r} (1 − F(x)) dx ) / (s/d − r)


This talk (as an example):

Numerically calculate, for α close to 1,

VaRα(X1 + X2 + · · ·+ Xd) (1)

or equivalently, calculate, typically for s large:

P[X1 + X2 + · · ·+ Xd ≤ s] (2)

numerically in terms of F1, . . . , Fd and C, which are assumed to be known analytically

Remark: in order to calculate (1) for a given α, use a root-finding procedure based on (2)
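The root-finding step can be sketched with simple bisection. Here `cdf_sum` is a toy stand-in for a numerically evaluated (2): the sum of two independent U(0,1) risks, whose distribution function is known in closed form, so the answer can be checked.

```python
def cdf_sum(s):
    # P[X1 + X2 <= s] for two independent U(0,1) risks (triangular law);
    # in practice this would be an AEP/copula-based evaluation of (2)
    if s <= 0.0:
        return 0.0
    if s <= 1.0:
        return s * s / 2.0
    if s <= 2.0:
        return 1.0 - (2.0 - s) ** 2 / 2.0
    return 1.0

def var(alpha, cdf, lo=0.0, hi=2.0, tol=1e-10):
    # bisection: smallest s with cdf(s) >= alpha
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if cdf(mid) >= alpha:
            hi = mid
        else:
            lo = mid
    return hi

v = var(0.99, cdf_sum)   # closed-form answer: 2 - sqrt(2 * 0.01)
```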


Standard solution

Monte Carlo: simulate i.i.d. samples

(X^i_1, X^i_2, . . . , X^i_d), i = 1, . . . , n

and estimate

P[X1 + X2 + · · · + Xd ≤ s] ≈ (1/n) Σ_{i=1}^n 1{X^i_1 + X^i_2 + · · · + X^i_d ≤ s}

(Dis)advantages:

- A sampling algorithm must be available

- The convergence rate is relatively slow: O(1/√n)

- The convergence rate is independent of the dimension d
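The estimator above can be sketched as follows; independence and Pareto(2) margins are illustrative assumptions chosen only so that sampling by inverse transform is trivial (a dependence model would replace the sampler).

```python
import random

def pareto_inv(u, theta):
    # inverse cdf of Pareto(theta): F(x) = 1 - (1 + x)^(-theta), x > 0
    return (1.0 - u) ** (-1.0 / theta) - 1.0

def mc_estimate(s, d, n, theta=2.0, seed=0):
    # crude Monte Carlo for P[X1 + ... + Xd <= s] with independent
    # Pareto(theta) margins
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        total = sum(pareto_inv(rng.random(), theta) for _ in range(d))
        if total <= s:
            hits += 1
    return hits / n

p = mc_estimate(s=100.0, d=3, n=10_000)
```

The O(1/√n) rate means each extra decimal digit of accuracy costs a factor 100 in sample size.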


The AEP algorithm: First assumption

First assumption: the components of (X1, X2, . . . , Xd) are positive, P[Xi > 0] = 1 (or bounded from below)

Consequence: suppose d = 2. Since X1 > 0 and X2 > 0, we get

P[X1 + X2 ≤ s] = P[(X1, X2) ∈ S]

where (figure: the triangle S with legs of length s in the positive quadrant)

S = {(x1, x2) : x1 > 0, x2 > 0, x1 + x2 ≤ s}


The AEP algorithm: Second assumption

Second assumption: the joint distribution function (df)

H(x1, . . . , xd) = P [X1 ≤ x1,X2 ≤ x2, . . . ,Xd ≤ xd ]

is known analytically or can be numerically evaluated

Example: H is given by a copula model:

H(x1, . . . , xd) = C (F1(x1), . . . ,Fd(xd))


The probability mass of a rectangle is easy to calculate

For d = 2, consider the rectangle

Q = (a, b] × (c, d]

Then

P[(X1, X2) ∈ Q] = H(b, d) − H(a, d) − H(b, c) + H(a, c)

Idea behind the AEP algorithm: approximate the triangle S by rectangles!
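The inclusion-exclusion step above is one line of code. In this sketch `H` is an illustrative stand-in (two independent U(0,1) risks); any analytically known joint df, e.g. a copula model, plugs in the same way.

```python
def H(x1, x2):
    # illustrative joint df: two independent U(0,1) risks, H = F1 * F2
    F = lambda x: max(0.0, min(1.0, x))
    return F(x1) * F(x2)

def rect_mass(H, a, b, c, d):
    # P[(X1, X2) in (a, b] x (c, d]] by inclusion-exclusion on H
    return H(b, d) - H(a, d) - H(b, c) + H(a, c)
```

For independent uniforms, the mass of (0, 0.5] × (0, 0.5] is 0.25, which the function reproduces.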


First approximation (d = 2)

- Recall: S = {(x1, x2) ∈ R² : x1 > 0, x2 > 0, x1 + x2 ≤ s}
- Set: Q = (0, 2s/3] × (0, 2s/3] (later: why 2/3)

Use Q as a first approximation of S:

P[(X1, X2) ∈ S] ≈ P[(X1, X2) ∈ Q], which is easy to calculate

(figure: the square Q inside the triangle S)


Error of the first approximation

(figure: P[S] written as the mass of the square Q plus the signed masses of three triangles: two added, one subtracted)

The error of the first approximation P[(X1, X2) ∈ Q] can again be expressed in terms of triangles!
Idea: again approximate those triangles by squares!


Approximate new triangles by squares

(figure: each left-over triangle is in turn decomposed into a square and three smaller signed triangles)

With these geometric approximations of S, define a sequence Pn of approximations of P[X1 + X2 ≤ s] = P[(X1, X2) ∈ S]: P1 is the mass of the first square, P2 adds the signed masses of the squares approximating the left-over triangles, and so on.


Set representation of P1, P2 and P3

(figure: the regions covered by P1, P2 and P3)

Triangles are iteratively approximated by squares, and the left-over triangles are then passed on to the next iteration


AEP algorithm for d = 3

In higher dimensions, the AEP algorithm can also be used; for instance, for d = 3, the set representation of P1, P2 and P3 is analogous (figure).

An analogous decomposition is possible in any dimension d ∈ N, but the resulting simplexes are overlapping for d ≥ 4!


Choice of the sidelengths of the approximating hypercubes

How to choose the sidelengths of the approximating hypercubes?

Answer: for an optimal rate of convergence, take a hypercube with sidelength

h = (2 / (d + 1)) × (sidelength of the triangle)

Hence the earlier choice Q = (0, 2s/3] × (0, 2s/3] for d = 2, since 2/(2 + 1) = 2/3
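A minimal d = 2 sketch of the recursion, assuming the signed-simplex bookkeeping of the published algorithm (the authors' C++ code is the reference implementation, not this). A triangle at corner (b1, b2) with signed size h is replaced by a square of sidelength q = 2h/3 plus three triangles of size h/3, two added and one, with reversed orientation, subtracted; truncating the recursion at a fixed depth gives Pn. The joint df `H` here is an illustrative independent-uniform model.

```python
def H(x1, x2):
    # illustrative joint df: two independent U(0,1) risks
    F = lambda x: max(0.0, min(1.0, x))
    return F(x1) * F(x2)

def aep2(H, s, depth):
    # minimal AEP-style recursion for d = 2 (sketch, not the authors' code)
    def rec(b1, b2, h, n):
        if n == 0 or h == 0.0:
            return 0.0                 # drop remaining triangles
        q = 2.0 * h / 3.0              # sidelength rule h' = 2/(d+1) * h, d = 2
        # signed mass of the square between (b1, b2) and (b1 + q, b2 + q)
        m = H(b1 + q, b2 + q) - H(b1, b2 + q) - H(b1 + q, b2) + H(b1, b2)
        # three left-over triangles, a third of the size; the corner one
        # has reversed orientation (negative signed size) and is subtracted
        return (m
                + rec(b1 + q, b2, h / 3.0, n - 1)
                + rec(b1, b2 + q, h / 3.0, n - 1)
                - rec(b1 + q, b2 + q, -h / 3.0, n - 1))
    return rec(0.0, 0.0, s, depth)

approx = aep2(H, 1.0, 10)   # P[X1 + X2 <= 1] = 0.5 for this H
```

At depth 1 this reproduces the first approximation, the mass of (0, 2/3]², i.e. 4/9 for independent uniforms.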


Convergence

Theorem
Let d ≤ 5 and suppose (X1, . . . , Xd) has a density in a neighbourhood of {x ∈ R^d : Σ xi = s}. Then

lim_{n→∞} Pn = P[X1 + · · · + Xd ≤ s]

Remark: the reason for convergence problems in high dimensions is that the simplex decomposition is overlapping for d ≥ 4


Richardson extrapolation

Define the extrapolated estimator P*_n of P[X1 + · · · + Xd ≤ s] by

P*_n = Pn + a (Pn − Pn−1),

where a = 2^{−d} (d + 1)^d / d! − 1. The additional term cancels the dominant error term of Pn.

Theorem
Let d ≤ 8 and suppose (X1, . . . , Xd) has a twice continuously differentiable density in a neighbourhood of {x ∈ R^d : Σ xi = s}. Then

lim_{n→∞} P*_n = P[X1 + · · · + Xd ≤ s]

Remark: for d > 8, higher-order extrapolation may be useful for proving convergence
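The extrapolation step above is cheap to implement: one coefficient from the closed-form expression for a, then a single correction per iterate.

```python
from math import factorial

def richardson_coeff(d):
    # a = 2^(-d) (d + 1)^d / d! - 1, the coefficient in
    # P*_n = P_n + a (P_n - P_{n-1})
    return 2.0 ** (-d) * (d + 1) ** d / factorial(d) - 1.0

def extrapolate(p_prev, p_curr, d):
    # one Richardson step: cancels the dominant error term of P_n
    return p_curr + richardson_coeff(d) * (p_curr - p_prev)
```

For d = 2 the coefficient is 9/8 − 1 = 0.125; for d = 3 it is 64/48 − 1 = 1/3.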


Convergence rates

Theorem
- Let d ≤ 5 and suppose (X1, . . . , Xd) has a density in a neighbourhood of {x ∈ R^d : Σ xi = s}. Then
  |Pn − P[X1 + · · · + Xd ≤ s]| = O((A_d)^n)
- Let d ≤ 8 and suppose (X1, . . . , Xd) has a twice continuously differentiable density in a neighbourhood of {x ∈ R^d : Σ xi = s}. Then
  |P*_n − P[X1 + · · · + Xd ≤ s]| = O((A*_d)^n)

        d = 2   d = 3   d = 4   d = 5   d = 6   d = 7   d = 8
A_d     0.333   0.500   0.664   0.925   –       –       –
A*_d    0.037   0.125   0.234   0.358   0.498   0.656   0.8314


Convergence rates, cont.

The calculation of Pn and P*_n requires N(n) = O((B_d)^n) evaluations of the joint distribution function

        d = 2   d = 3   d = 4   d = 5   d = 6   d = 7   d = 8
B_d     3       4       15      21      63      92      255

Both the convergence rate and the numerical complexity of Pn and P*_n are exponential in n. Combining both, we get

|Pn − P[X1 + · · · + Xd ≤ s]| = O(N(n)^{−γ_d})

|P*_n − P[X1 + · · · + Xd ≤ s]| = O(N(n)^{−γ*_d})

where γ_d and γ*_d determine the rate of convergence.


Convergence rates, cont.

The following table shows γ_d and γ*_d:

        d = 2   d = 3   d = 4   d = 5   d = 6   d = 7
γ_d     1       0.5     0.15    0.05    –       –
γ*_d    3       1.5     0.54    0.34    0.17    0.09

- Convergence rate of Monte Carlo: O(N^{−0.5}), where N is the number of simulations.
  BUT: a (complex?) sampling algorithm must be available.

- Convergence rate of Quasi-Monte Carlo: O(N^{−1} (log N)^d).
  BUT: the algorithm must be tailored for each application.

- AEP needs no tailoring or simulation. The only requirement is being able to evaluate the joint distribution function of (X1, . . . , Xd).


Numerical example

- d = 2, 3, 4
- Xi are Pareto(i) distributed (P[Xi ≤ x] = 1 − (1 + x)^{−i})
- Clayton copula with θ = 2 (pairwise Kendall's tau = 0.5)
- s = 100
- the plots show the absolute error (difference between the estimate and a reference value) against execution time, both on log scales, for extrapolated AEP and Monte Carlo

(figure: two panels, "AEP" and "Monte Carlo", absolute error versus execution time for d = 2, 3, 4)
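The joint df of this example is straightforward to evaluate, which is all AEP needs. A sketch of H for the d = 3 case, combining Pareto(i) margins with the standard Clayton copula C(u) = (Σ u_i^{−θ} − d + 1)^{−1/θ}:

```python
def pareto_cdf(x, theta):
    # F(x) = 1 - (1 + x)^(-theta) for x > 0, the Pareto(theta) df
    return 1.0 - (1.0 + x) ** (-theta) if x > 0.0 else 0.0

def clayton(u, theta):
    # Clayton copula C(u) = (sum u_i^(-theta) - d + 1)^(-1/theta)
    if any(ui <= 0.0 for ui in u):
        return 0.0
    return (sum(ui ** -theta for ui in u) - len(u) + 1.0) ** (-1.0 / theta)

def H(x, theta=2.0):
    # joint df of the example: X_i ~ Pareto(i), i = 1, 2, 3,
    # coupled by a Clayton copula with parameter theta = 2
    u = [pareto_cdf(xi, i + 1) for i, xi in enumerate(x)]
    return clayton(u, theta)
```

For θ = 2 the pairwise Kendall's tau of the Clayton copula is θ/(θ + 2) = 0.5, matching the example.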


Numerical example: Conclusion

- In two and three dimensions, AEP is much faster than Monte Carlo

- For d ≥ 4, Monte Carlo beats AEP

- Memory requirements to calculate Pn with AEP grow exponentially in n and in the dimension d; hence only low dimensions are numerically feasible

AEP in general:

INPUT:
- marginal dfs Fi
- copula C
- threshold s

OUTPUT:
- sequence Pn of estimates of P[X1 + · · · + Xd ≤ s]

SOFTWARE: available in C++

Open problem

Recall: using Richardson extrapolation,

P∗n = Pn + a(Pn − Pn−1)

for some a ∈ R converges faster and in higher dimensions than Pn

Further work: extend Richardson extrapolation to cancel higher-order error terms! Possibly through estimators of the following form?

P**_n = Pn + b1 (Pn − Pn−1) + b2 (Pn−1 − Pn−2)

P***_n = Pn + c1 (Pn − Pn−1) + c2 (Pn−1 − Pn−2) + c3 (Pn−2 − Pn−3)

...


The GAEP algorithm

GAEP (Generalized AEP) concerns more general aggregation functionals, i.e. the estimation of

P[ψ(X1, . . . , Xd) ≤ s],

where ψ : R^d → R is a continuous function that is strictly increasing in each coordinate. This probability can be represented as the mass of a "generalized triangle" bounded by the curve

{x ∈ R²_+ : ψ(x) = s}

(figure: the generalized triangle in the positive quadrant, d = 2)


GAEP generalized triangle decomposition

Analogous to the AEP algorithm, we can decompose a generalized triangle into a rectangle and further generalized triangles.

(figure: the region below the curve {x ∈ R²_+ : ψ(x) = s} split into a rectangle and left-over generalized triangles)


GAEP: short summary

- Issue: how to choose the sidelengths of the approximating hypercubes (rectangles)? The paper proposes several possibilities

- Performance: similar to AEP; very good for d = 2, 3, acceptable for d = 4, not competitive for d ≥ 5

- Open problems:
  - a proof for an optimal choice of the hypercube sidelengths
  - extension of the extrapolation technique as used for AEP


References

- P. Arbenz, P. Embrechts, G. Puccetti: The AEP algorithm for the fast computation of the distribution of the sum of dependent random variables. Bernoulli 17(2), 2011, 562–591

- P. Arbenz, P. Embrechts, G. Puccetti: The GAEP algorithm for the fast computation of the distribution of a function of dependent random variables. Forthcoming in Stochastics, 2011

- P. Embrechts, G. Puccetti: Risk Aggregation. In: Copula Theory and its Applications, P. Jaworski, F. Durante, W. Haerdle, T. Rychlik (Eds.), Lecture Notes in Statistics – Proceedings 198, Springer, Berlin/Heidelberg, pp. 111–126

- Software (C++ code) to be obtained through the authors


Thank you

