
CORRELATION AND DEPENDENCY IN RISK MANAGEMENT:

PROPERTIES AND PITFALLS

PAUL EMBRECHTS, ALEXANDER MCNEIL, AND DANIEL STRAUMANN

Abstract. Modern risk management calls for an understanding of stochastic dependence going beyond simple linear correlation. This paper deals with the static (non-time-dependent) case and emphasizes the copula representation of dependence for a random vector. Linear correlation is a natural dependence measure for multivariate normally and, more generally, elliptically distributed risks, but other dependence concepts like comonotonicity and rank correlation should also be understood by the risk management practitioner. Using counterexamples the falsity of some commonly held views on correlation is demonstrated; in general, these fallacies arise from the naive assumption that dependence properties of the elliptical world also hold in the non-elliptical world. In particular, the problem of finding multivariate models which are consistent with prespecified marginal distributions and correlations is addressed. Pitfalls are highlighted and simulation algorithms avoiding these problems are constructed.

1. Introduction

1.1. Correlation in finance and insurance. In financial theory the notion of correlation is central. The Capital Asset Pricing Model (CAPM) and the Arbitrage Pricing Theory (APT) (Campbell, Lo, and MacKinlay 1997) use correlation as a measure of dependence between different financial instruments and employ an elegant theory, which is essentially founded on an assumption of multivariate normally distributed returns, in order to arrive at an optimal portfolio selection. Although insurance has traditionally been built on the assumption of independence and the law of large numbers has governed the determination of premiums, the increasing complexity of insurance and reinsurance products has led recently to increased actuarial interest in the modelling of dependent risks (Wang 1997); an example is the emergence of more intricate multi-line products.

The current quest for a sound methodological basis for integrated risk management also raises the issue of correlation and dependency. Although contemporary financial risk management revolves around the use of correlation to describe dependence between risks, the inclusion of non-linear derivative products invalidates many of the distributional assumptions underlying the use of correlation. In insurance these assumptions are even more problematic because of the typical skewness and heavy-tailedness of insurance claims data.

Recently, within the actuarial world, dynamic financial analysis (DFA) and dynamic solvency testing (DST) have been heralded as a way forward for integrated risk management of the investment and underwriting risks to which an insurer (or

Date: November 1998.
Key words and phrases. Risk management; correlation; elliptic distributions; rank correlation; dependency; copula; comonotonicity; simulation; Value-at-Risk; coherent risk measures.
The work of the third author was supported by RiskLab; the second author would like to thank Swiss Re for financial support.


bank) is exposed. DFA, for instance, is essentially a Monte Carlo or simulation-based approach to the joint modelling of risks (see e.g. Cas (1997) or Lowe and Stanard (1997)). This necessitates model assumptions that combine information on marginal distributions together with ideas on interdependencies. The correct implementation of a DFA-based risk management system certainly requires a proper understanding of the concepts of dependency and correlation.

1.2. Correlation as a source of confusion. But correlation, as well as being one of the most ubiquitous concepts in modern finance and insurance, is also one of the most misunderstood concepts. Some of the confusion may arise from the literary use of the word to cover any notion of dependency. To a mathematician correlation is only one particular measure of stochastic dependency among many. It is the canonical measure in the world of multivariate normal distributions, and more generally for spherical and elliptical distributions. However, empirical research in finance and insurance shows that the distributions of the real world are seldom in this class.

[Figure 1: two scatterplots of 1000 points each; left panel labelled "Gaussian" (axes $X_1$, $Y_1$), right panel labelled "Gumbel" (axes $X_2$, $Y_2$), both axes running from 0 to 12.]

Figure 1. 1000 random variates from two distributions with identical Gamma(3,1) marginal distributions and identical correlation $\rho = 0.7$, but different dependence structures.
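The left ("Gaussian") panel of Figure 1 is straightforward to reproduce: sample a Gaussian copula and impose Gamma(3,1) marginals via the quantile transform. The sketch below (not part of the original exposition; the Gumbel panel would additionally require a Gumbel copula sampler, e.g. via a positive stable mixture, and is omitted) uses the normal correlation parameter 0.7 directly, so the resulting gamma correlation is only approximately 0.7 rather than calibrated exactly.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
n, rho = 1000, 0.7

# Step 1: bivariate normal sample with correlation rho.
cov = np.array([[1.0, rho], [rho, 1.0]])
z = rng.multivariate_normal(np.zeros(2), cov, size=n)

# Step 2: map to uniforms through the normal distribution function;
# this keeps only the Gaussian copula (the dependence structure).
u = stats.norm.cdf(z)

# Step 3: impose Gamma(3, 1) marginals via the quantile transform.
x = stats.gamma.ppf(u[:, 0], a=3)
y = stats.gamma.ppf(u[:, 1], a=3)

print(np.corrcoef(x, y)[0, 1])  # roughly 0.7
```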

As motivation for the ideas of this paper we include Figure 1. This shows 1000 bivariate realisations from two different probability models for $(X, Y)$. In both models $X$ and $Y$ have identical gamma marginal distributions and the linear correlation between them is 0.7. However, it is clear that the dependence between $X$ and $Y$ in the two models is qualitatively quite different and, if we consider the random variables to represent insurance losses, the second model is the more dangerous model from the point of view of an insurer, since extreme losses have a tendency to occur together. We will return to this example later in the paper, see Section 5; for the


time-being we note that the dependence in the two models cannot be distinguished on the grounds of correlation alone.

The main aim of the paper is to collect and clarify the essential ideas of dependence, linear correlation and rank correlation that anyone wishing to model dependent phenomena should know. In particular, we highlight a number of important fallacies concerning correlation which arise when we work with models other than the multivariate normal. Some of the pitfalls which await the end-user are quite subtle and perhaps counter-intuitive.

We are particularly interested in the problem of constructing multivariate distributions which are consistent with given marginal distributions and correlations, since this is a question that anyone wanting to simulate dependent random vectors, perhaps with a view to DFA, is likely to encounter. We look at the existence and construction of solutions and the implementation of algorithms to generate random variates. Various other ideas recur throughout the paper. At several points we look at the effect of dependency structure on the Value-at-Risk or VaR under a particular probability model, i.e. we measure and compare risks by looking at quantiles. We also relate these considerations to the idea of a coherent measure of risk as introduced by Artzner, Delbaen, Eber, and Heath (1999).

We concentrate on the static problem of describing dependency between a pair or within a group of random variables. There are various other problems concerning the modelling and interpretation of serial correlation in stochastic processes and cross-correlation between processes; see Boyer, Gibson, and Loretan (1999) for problems related to this. We do not consider the statistical problem of estimating correlations and rank correlation, where a great deal could also be said about the available estimators, their properties and their robustness, or the lack of it.

1.3. Organization of paper. In Section 2 we begin by discussing joint distributions and the use of copulas as descriptions of dependency between random variables. Although copulas are a much more recent and less well known approach to describing dependency than correlation, we introduce them first for two reasons. First, they are the principal tool we will use to illustrate the pitfalls of correlation and second, they are the approach which in our opinion affords the best understanding of the general concept of dependency.

In Section 3 we examine linear correlation and define spherical and elliptical distributions, which constitute, in a sense, the natural environment of the linear correlation. We mention both some advantages and shortcomings of correlation. Section 4 is devoted to a brief discussion of some alternative dependency concepts and measures including comonotonicity and rank correlation. Three of the most common fallacies concerning linear correlation and dependence are presented in Section 5. In Section 6 we explain how vectors of dependent random variables may be simulated using correct methods.

2. Copulas

Probability-integral and quantile transforms play a fundamental role when working with copulas. In the following proposition we collect together some essential facts that we use repeatedly in this paper. The notation $X \sim F$ means that the random variable $X$ has distribution function $F$.


Proposition 1. Let $X$ be a random variable with distribution function $F$. Let $F^{-1}$ be the quantile function of $F$, i.e.

$$F^{-1}(\alpha) = \inf\{x \mid F(x) \ge \alpha\}, \quad \alpha \in (0, 1).$$

Then

1. For any standard-uniformly distributed $U \sim U(0,1)$ we have $F^{-1}(U) \sim F$. This gives a simple method for simulating random variates with distribution function $F$.
2. If $F$ is continuous then the random variable $F(X)$ is standard-uniformly distributed, i.e. $F(X) \sim U(0,1)$.

Proof. In most elementary texts on probability.
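Both parts of Proposition 1 are easy to check by simulation. A minimal sketch (ours, not the paper's) for the Exp(1) distribution, where $F(x) = 1 - e^{-x}$ and $F^{-1}(u) = -\log(1-u)$:

```python
import numpy as np

rng = np.random.default_rng(0)

# Part 1: F^{-1}(U) ~ F.  For Exp(1), F^{-1}(u) = -log(1 - u);
# the resulting sample mean should be close to 1.
u = rng.uniform(size=200_000)
x = -np.log(1.0 - u)
print(x.mean())  # close to 1

# Part 2: for continuous F, F(X) is standard-uniform; the mean of
# F(X) should be close to 1/2 and its variance close to 1/12.
x2 = rng.exponential(size=200_000)
v = 1.0 - np.exp(-x2)
print(v.mean(), v.var())  # close to 0.5 and 1/12
```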

2.1. What is a copula? The dependence between the real-valued random variables $X_1, \ldots, X_n$ is completely described by their joint distribution function

$$F(x_1, \ldots, x_n) = \mathbb{P}[X_1 \le x_1, \ldots, X_n \le x_n].$$

The idea of separating $F$ into a part which describes the dependence structure and parts which describe the marginal behaviour only, has led to the concept of a copula. Suppose we transform the random vector $X = (X_1, \ldots, X_n)^t$ component-wise to have standard-uniform marginal distributions, $U(0,1)$.¹ For simplicity we assume to begin with that $X_1, \ldots, X_n$ have continuous marginal distributions $F_1, \ldots, F_n$, so that this can be achieved by using the probability-integral transformation $T: \mathbb{R}^n \to \mathbb{R}^n$, $(x_1, \ldots, x_n)^t \mapsto (F_1(x_1), \ldots, F_n(x_n))^t$. The joint distribution function $C$ of $(F_1(X_1), \ldots, F_n(X_n))^t$ is then called the copula of the random vector $(X_1, \ldots, X_n)^t$ or the multivariate distribution $F$. It follows that

$$F(x_1, \ldots, x_n) = \mathbb{P}[F_1(X_1) \le F_1(x_1), \ldots, F_n(X_n) \le F_n(x_n)] = C(F_1(x_1), \ldots, F_n(x_n)). \quad (1)$$

Definition 1. A copula is the distribution function of a random vector in $\mathbb{R}^n$ with uniform-$(0,1)$ marginals. Alternatively a copula is any function $C: [0,1]^n \to [0,1]$ which has the three properties:

1. $C(x_1, \ldots, x_n)$ is increasing in each component $x_i$.
2. $C(1, \ldots, 1, x_i, 1, \ldots, 1) = x_i$ for all $i \in \{1, \ldots, n\}$, $x_i \in [0,1]$.
3. For all $(a_1, \ldots, a_n), (b_1, \ldots, b_n) \in [0,1]^n$ with $a_i \le b_i$ we have

$$\sum_{i_1=1}^{2} \cdots \sum_{i_n=1}^{2} (-1)^{i_1 + \cdots + i_n}\, C(x_{1 i_1}, \ldots, x_{n i_n}) \ge 0, \quad (2)$$

where $x_{j1} = a_j$ and $x_{j2} = b_j$ for all $j \in \{1, \ldots, n\}$.

These two alternative definitions can be shown to be equivalent. It is a particularly easy matter to verify that the first definition in terms of a multivariate distribution function with standard uniform marginals implies the three properties above: property 1 is clear; property 2 follows from the fact that the marginals are uniform-$(0,1)$; property 3 is true because the sum (2) can be interpreted as $\mathbb{P}[a_1 \le X_1 \le b_1, \ldots, a_n \le X_n \le b_n]$, which is non-negative.

For any continuous multivariate distribution the representation (1) holds for a unique copula $C$. If $F_1, \ldots, F_n$ are not all continuous it can still be shown (see Schweizer and Sklar (1983), Chapter 6) that the joint distribution function can

¹Alternatively one could transform to any other distribution, but $U(0,1)$ is particularly easy.


always be expressed as in (1), although in this case $C$ is no longer unique and we refer to it as a possible copula of $F$.

The representation (1), and some invariance properties which we will show shortly, suggest that we interpret a copula associated with $(X_1, \ldots, X_n)^t$ as being the dependence structure. This makes particular sense when all the $F_i$ are continuous and the copula is unique; in the discrete case there will be more than one way of writing the dependence structure. Pitfalls related to non-continuity of marginal distributions are presented in Marshall (1996). A recent, very readable introduction to copulas is Nelsen (1999).

2.2. Examples of copulas. For independent random variables the copula trivially takes the form

$$C^{\mathrm{ind}}(x_1, \ldots, x_n) = x_1 \cdots x_n. \quad (3)$$

We now consider some particular copulas for non-independent pairs of random variables $(X, Y)$ having continuous distributions. The Gaussian or normal copula is

$$C^{\mathrm{Ga}}_{\rho}(x, y) = \int_{-\infty}^{\Phi^{-1}(x)} \int_{-\infty}^{\Phi^{-1}(y)} \frac{1}{2\pi(1-\rho^2)^{1/2}} \exp\left\{ \frac{-(s^2 - 2\rho s t + t^2)}{2(1-\rho^2)} \right\} ds\, dt, \quad (4)$$

where $-1 < \rho < 1$ and $\Phi$ is the univariate standard normal distribution function. Variables with standard normal marginal distributions and this dependence structure, i.e. variables with distribution function $C^{\mathrm{Ga}}_{\rho}(\Phi(x), \Phi(y))$, are standard bivariate normal variables with correlation coefficient $\rho$. Another well-known copula is the Gumbel or logistic copula

$$C^{\mathrm{Gu}}_{\beta}(x, y) = \exp\left( -\left[ (-\log x)^{1/\beta} + (-\log y)^{1/\beta} \right]^{\beta} \right), \quad (5)$$

where $0 < \beta \le 1$ is a parameter which controls the amount of dependence between $X$ and $Y$; $\beta = 1$ gives independence and the limit of $C^{\mathrm{Gu}}_{\beta}$ for $\beta \to 0^{+}$ leads to perfect dependence, as will be discussed in Section 4. This copula, unlike the Gaussian, is a copula which is consistent with bivariate extreme value theory and could be used to model the limiting dependence structure of component-wise maxima of bivariate random samples (Joe (1997), Galambos (1987)).

The following is a simple method for generating a variety of copulas which will

be used later in the paper. Let $f, g: [0,1] \to \mathbb{R}$ with $\int_0^1 f(x)\,dx = \int_0^1 g(y)\,dy = 0$ and $f(x)g(y) \ge -1$ for all $x, y \in [0,1]$. Then $h(x,y) = 1 + f(x)g(y)$ is a bivariate density function on $[0,1]^2$. Consequently,

$$C(x, y) = \int_0^x \int_0^y h(u, v)\, du\, dv = xy + \left( \int_0^x f(u)\, du \right) \left( \int_0^y g(v)\, dv \right) \quad (6)$$

is a copula. If we choose $f(x) = \alpha(1 - 2x)$, $g(y) = (1 - 2y)$, $|\alpha| \le 1$, we obtain, for example, the Farlie-Gumbel-Morgenstern copula $C(x,y) = xy[1 + \alpha(1-x)(1-y)]$. Many copulas and methods to construct them can be found in the literature; see for example Hutchinson and Lai (1990) or Joe (1997).
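As a sanity check on the construction (6), one can sample from the Farlie-Gumbel-Morgenstern density $h(x,y) = 1 + \alpha(1-2x)(1-2y)$ by simple rejection and confirm that the margins are uniform and that the linear correlation of the pair is $\alpha/3$ (a standard FGM property; since the margins are uniform this equals Spearman's rank correlation). A sketch of ours, under these assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
alpha = 0.8  # FGM parameter, |alpha| <= 1

# Rejection sampling from h(x, y) = 1 + alpha*(1-2x)*(1-2y),
# which is bounded above by 1 + |alpha| on the unit square.
m = 400_000
u, v, w = rng.uniform(size=(3, m))
accept = w * (1.0 + abs(alpha)) <= 1.0 + alpha * (1 - 2 * u) * (1 - 2 * v)
u, v = u[accept], v[accept]

print(u.mean(), v.mean())       # ~0.5 each: uniform margins survive
print(np.corrcoef(u, v)[0, 1])  # ~ alpha / 3
```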

2.3. Invariance. The following proposition shows one attractive feature of the copula representation of dependence, namely that the dependence structure as summarized by a copula is invariant under increasing and continuous transformations of the marginals.


Proposition 2. If $(X_1, \ldots, X_n)^t$ has copula $C$ and $T_1, \ldots, T_n$ are increasing continuous functions, then $(T_1(X_1), \ldots, T_n(X_n))^t$ also has copula $C$.

Proof. Let $(U_1, \ldots, U_n)^t$ have distribution function $C$ (in the case of continuous marginals $F_{X_i}$ take $U_i = F_{X_i}(X_i)$). We may write

$$\begin{aligned}
C(F_{T_1(X_1)}(x_1), \ldots, F_{T_n(X_n)}(x_n)) &= \mathbb{P}[U_1 \le F_{T_1(X_1)}(x_1), \ldots, U_n \le F_{T_n(X_n)}(x_n)] \\
&= \mathbb{P}[F^{-1}_{T_1(X_1)}(U_1) \le x_1, \ldots, F^{-1}_{T_n(X_n)}(U_n) \le x_n] \\
&= \mathbb{P}[T_1 \circ F^{-1}_{X_1}(U_1) \le x_1, \ldots, T_n \circ F^{-1}_{X_n}(U_n) \le x_n] \\
&= \mathbb{P}[T_1(X_1) \le x_1, \ldots, T_n(X_n) \le x_n].
\end{aligned}$$

Remark 1. The continuity of the transformations $T_i$ is necessary for general random variables $(X_1, \ldots, X_n)^t$ since, in that case, $F^{-1}_{T_i(X_i)} = T_i \circ F^{-1}_{X_i}$. In the case where all marginal distributions of $X$ are continuous it suffices that the transformations are increasing (see also Chapter 6 of Schweizer and Sklar (1983)).

As a simple illustration of the relevance of this result, suppose we have a probability model (multivariate distribution) for dependent insurance losses of various kinds. If we decide that our interest now lies in modelling the logarithm of these losses, the copula will not change. Similarly if we change from a model of percentage returns on several financial assets to a model of logarithmic returns, the copula will not change, only the marginal distributions.
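A quick numerical illustration of Proposition 2 (our sketch, not the paper's): applying a strictly increasing transformation such as the logarithm leaves rank-based quantities, which depend only on the copula, unchanged, while the linear correlation moves.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(7)

# Dependent, positively skewed 'losses': exponentials of correlated
# normals, i.e. lognormal marginals coupled by a Gaussian copula.
z = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.6], [0.6, 1.0]], size=50_000)
x, y = np.exp(z[:, 0]), np.exp(z[:, 1])

# Spearman's rank correlation depends only on the copula, so the
# increasing log transform leaves it unchanged ...
rs_before, _ = stats.spearmanr(x, y)
rs_after, _ = stats.spearmanr(np.log(x), np.log(y))
print(rs_before, rs_after)  # identical

# ... whereas linear (Pearson) correlation is not invariant.
p_before = np.corrcoef(x, y)[0, 1]
p_after = np.corrcoef(np.log(x), np.log(y))[0, 1]
print(p_before, p_after)  # different values
```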

3. Linear Correlation

3.1. What is correlation? We begin by considering pairs of real-valued, non-degenerate random variables $X, Y$ with finite variances.

Definition 2. The linear correlation coefficient between $X$ and $Y$ is

$$\rho(X, Y) = \frac{\mathrm{Cov}[X, Y]}{\sqrt{\sigma^2[X]\,\sigma^2[Y]}},$$

where $\mathrm{Cov}[X, Y]$ is the covariance between $X$ and $Y$, $\mathrm{Cov}[X, Y] = E[XY] - E[X]E[Y]$, and $\sigma^2[X], \sigma^2[Y]$ denote the variances of $X$ and $Y$.

The linear correlation is a measure of linear dependence. In the case of independent random variables, $\rho(X, Y) = 0$ since $\mathrm{Cov}[X, Y] = 0$. In the case of perfect linear dependence, i.e. $Y = aX + b$ a.s. or $\mathbb{P}[Y = aX + b] = 1$ for $a \in \mathbb{R} \setminus \{0\}$, $b \in \mathbb{R}$, we have $\rho(X, Y) = \pm 1$. This is shown by considering the representation

$$\rho(X, Y)^2 = \frac{\sigma^2[Y] - \min_{a,b} E[(Y - (aX + b))^2]}{\sigma^2[Y]}. \quad (7)$$

In the case of imperfect linear dependence, $-1 < \rho(X, Y) < 1$, and this is the case when misinterpretations of correlation are possible, as will later be seen in Section 5. Equation (7) shows the connection between correlation and simple linear regression. The right hand side can be interpreted as the relative reduction in the variance of $Y$ by linear regression on $X$. The regression coefficients $a_R, b_R$, which minimise the


squared distance $E[(Y - (aX + b))^2]$ are given by

$$a_R = \frac{\mathrm{Cov}[X, Y]}{\sigma^2[X]}, \qquad b_R = E[Y] - a_R\, E[X].$$
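The connection (7) between correlation and least-squares regression can be verified numerically: with the fitted coefficients $a_R, b_R$, the relative reduction in the variance of $Y$ equals $\rho(X,Y)^2$. A sketch of ours, with an arbitrary illustrative model $Y = 2X + \varepsilon$:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(size=10_000)
y = 2.0 * x + rng.normal(size=10_000)  # imperfect linear dependence

rho = np.corrcoef(x, y)[0, 1]

# Least-squares coefficients a_R = Cov[X,Y]/Var[X], b_R = E[Y] - a_R E[X].
a_r = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
b_r = y.mean() - a_r * x.mean()

# Relative reduction in the variance of Y achieved by the regression;
# by (7) this equals rho^2 exactly (here: in the sample analogue).
resid = y - (a_r * x + b_r)
reduction = (np.var(y) - np.var(resid)) / np.var(y)

print(rho**2, reduction)  # equal up to floating point
```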

Correlation fulfills the linearity property

$$\rho(\alpha X + \beta,\, \gamma Y + \delta) = \mathrm{sgn}(\alpha \gamma)\, \rho(X, Y),$$

when $\alpha, \gamma \in \mathbb{R} \setminus \{0\}$, $\beta, \delta \in \mathbb{R}$. Correlation is thus invariant under positive affine transformations, i.e. strictly increasing linear transformations.

The generalisation of correlation to more than two random variables is straightforward. Consider vectors of random variables $X = (X_1, \ldots, X_n)^t$ and $Y = (Y_1, \ldots, Y_n)^t$ in $\mathbb{R}^n$. We can summarise all pairwise covariances and correlations in $n \times n$ matrices $\mathrm{Cov}[X, Y]$ and $\rho(X, Y)$. As long as the corresponding variances are finite we define

$$\mathrm{Cov}[X, Y]_{ij} := \mathrm{Cov}[X_i, Y_j], \qquad \rho(X, Y)_{ij} := \rho(X_i, Y_j), \quad 1 \le i, j \le n.$$

It is well known that these matrices are symmetric and positive semi-definite. Often one considers only pairwise correlations between components of a single random vector; in this case we set $Y = X$ and consider $\rho(X) := \rho(X, X)$ or $\mathrm{Cov}[X] := \mathrm{Cov}[X, X]$.

The popularity of linear correlation can be explained in several ways. Correlation is often straightforward to calculate. For many bivariate distributions it is a simple matter to calculate second moments (variances and covariances) and hence to derive the correlation coefficient. Alternative measures of dependence, which we will encounter in Section 4, may be more difficult to calculate.

Moreover, correlation and covariance are easy to manipulate under linear operations. Under affine linear transformations $A: \mathbb{R}^n \to \mathbb{R}^m$, $x \mapsto Ax + a$ and $B: \mathbb{R}^n \to \mathbb{R}^m$, $x \mapsto Bx + b$ for $A, B \in \mathbb{R}^{m \times n}$, $a, b \in \mathbb{R}^m$ we have

$$\mathrm{Cov}[AX + a,\, BY + b] = A\, \mathrm{Cov}[X, Y]\, B^t.$$

A special case is the following elegant relationship between variance and covariance for a random vector. For every linear combination of the components $\alpha^t X$ with $\alpha \in \mathbb{R}^n$,

$$\sigma^2[\alpha^t X] = \alpha^t\, \mathrm{Cov}[X]\, \alpha.$$

Thus, the variance of any linear combination is fully determined by the pairwise covariances between the components. This fact is commonly exploited in portfolio theory.

A third reason for the popularity of correlation is its naturalness as a measure of dependence in multivariate normal distributions and, more generally, in multivariate spherical and elliptical distributions, as will shortly be discussed. First, we mention a few disadvantages of correlation.

3.2. Shortcomings of correlation. We consider again the case of two real-valued random variables $X$ and $Y$.

• The variances of $X$ and $Y$ must be finite or the linear correlation is not defined. This is not ideal for a dependency measure and causes problems when we work with heavy-tailed distributions. For example, the covariance and the correlation between the two components of a bivariate $t_\nu$-distributed random


vector are not defined for $\nu \le 2$. Non-life actuaries who model losses in different business lines with infinite variance distributions must be aware of this.

• Independence of two random variables implies they are uncorrelated (linear correlation equal to zero) but zero correlation does not in general imply independence. A simple example where the covariance disappears despite strong dependence between random variables is obtained by taking $X \sim N(0,1)$, $Y = X^2$, since the third moment of the standard normal distribution is zero. Only in the case of the multivariate normal is it permissible to interpret uncorrelatedness as implying independence. This implication is no longer valid when only the marginal distributions are normal and the joint distribution is non-normal, which will also be demonstrated in Example 1. The class of spherical distributions model uncorrelated random variables but are not, except in the case of the multivariate normal, the distributions of independent random variables.

• Linear correlation has the serious deficiency that it is not invariant under non-linear strictly increasing transformations $T: \mathbb{R} \to \mathbb{R}$. For two real-valued random variables we have in general

$$\rho(T(X), T(Y)) \ne \rho(X, Y).$$

If we take the bivariate standard normal distribution with correlation $\rho$ and the transformation $T(x) = \Phi(x)$ (the standard normal distribution function) we have

$$\rho(T(X), T(Y)) = \frac{6}{\pi} \arcsin\left(\frac{\rho}{2}\right), \quad (8)$$

see Joag-dev (1984). In general one can also show (see Kendall and Stuart (1979), page 600) for bivariate normally-distributed vectors and arbitrary transformations $T, \widetilde{T}: \mathbb{R} \to \mathbb{R}$ that

$$|\rho(T(X), \widetilde{T}(Y))| \le |\rho(X, Y)|,$$

which is also true in (8).
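Two of these shortcomings are easy to reproduce by simulation (our sketch): $X \sim N(0,1)$ and $Y = X^2$ are uncorrelated yet functionally dependent, and the transformation $T = \Phi$ shrinks the correlation of a bivariate normal pair exactly as (8) predicts.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(11)

# Zero correlation without independence: Y = X^2 is a deterministic
# function of X, yet Cov[X, X^2] = E[X^3] = 0 for X ~ N(0, 1).
x = rng.normal(size=500_000)
corr_dependent = np.corrcoef(x, x**2)[0, 1]
print(corr_dependent)  # approximately 0

# Non-invariance under T(x) = Phi(x): formula (8) predicts
# rho(Phi(X), Phi(Y)) = (6/pi) * arcsin(rho/2), which is below rho.
rho = 0.7
z = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=500_000)
corr_transformed = np.corrcoef(stats.norm.cdf(z[:, 0]),
                               stats.norm.cdf(z[:, 1]))[0, 1]
print(corr_transformed, 6 / np.pi * np.arcsin(rho / 2))  # both about 0.68
```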

3.3. Spherical and elliptical distributions. The spherical distributions extend the standard multivariate normal distribution $N_n(0, I)$, i.e. the distribution of independent standard normal variables. They provide a family of symmetric distributions for uncorrelated random vectors with mean zero.

Definition 3. A random vector $X = (X_1, \ldots, X_n)^t$ has a spherical distribution if for every orthogonal map $U \in \mathbb{R}^{n \times n}$ (i.e. maps satisfying $UU^t = U^tU = I_{n \times n}$)

$$UX =_d X.^2$$

The characteristic function $\psi(t) = E[\exp(i t^t X)]$ of such distributions takes a particularly simple form. There exists a function $\phi: \mathbb{R}_{\ge 0} \to \mathbb{R}$ such that $\psi(t) = \phi(t^t t) = \phi(t_1^2 + \cdots + t_n^2)$. This function is the characteristic generator of the spherical distribution and we write

$$X \sim S_n(\phi).$$

If $X$ has a density $f(x) = f(x_1, \ldots, x_n)$ then this is equivalent to $f(x) = g(x^t x) = g(x_1^2 + \cdots + x_n^2)$ for some function $g: \mathbb{R}_{\ge 0} \to \mathbb{R}_{\ge 0}$, so that the spherical distributions are best interpreted as those distributions whose density is constant on spheres.

²We standardly use $=_d$ to denote equality in distribution.


Some other examples of densities in the spherical class are those of the multivariate $t$-distribution with $\nu$ degrees of freedom, $f(x) = c\,(1 + x^t x/\nu)^{-(n+\nu)/2}$, and the logistic distribution, $f(x) = c\,\exp(-x^t x)/[1 + \exp(-x^t x)]^2$, where $c$ is a generic normalizing constant. Note that these are the distributions of uncorrelated random variables but, contrary to the normal case, not the distributions of independent random variables. In the class of spherical distributions the multivariate normal is the only distribution of independent random variables; see Fang, Kotz, and Ng (1987), page 106.

The spherical distributions admit an alternative stochastic representation. $X \sim S_n(\phi)$ if and only if

$$X =_d R \cdot U, \quad (9)$$

where the random vector $U$ is uniformly distributed on the unit hypersphere $S^{n-1} = \{x \in \mathbb{R}^n \mid x^t x = 1\}$ in $\mathbb{R}^n$ and $R \ge 0$ is a positive random variable, independent of $U$ (Fang, Kotz, and Ng (1987), page 30). Spherical distributions can thus be interpreted as mixtures of uniform distributions on spheres of differing radius in $\mathbb{R}^n$. For example, in the case of the standard multivariate normal distribution the generating variate satisfies $R \sim \sqrt{\chi^2_n}$, and in the case of the multivariate $t$-distribution with $\nu$ degrees of freedom $R^2/n \sim F(n, \nu)$ holds, where $F(n, \nu)$ denotes an F-distribution with $n$ and $\nu$ degrees of freedom.

Elliptical distributions extend the multivariate normal $N_n(\mu, \Sigma)$, i.e. the distribution with mean $\mu$ and covariance matrix $\Sigma$. Mathematically they are the affine maps of spherical distributions in $\mathbb{R}^n$.
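The representation (9) can be used directly as a simulator. A sketch of ours: taking $R = \sqrt{\chi^2_n}$ independent of $U$ uniform on the unit sphere (obtained here by normalising a standard normal vector) recovers the standard multivariate normal, as the sample covariance confirms.

```python
import numpy as np

rng = np.random.default_rng(5)
n, m = 3, 200_000

# U: uniform on the unit hypersphere S^{n-1}, obtained by normalising
# a standard normal vector.
g = rng.normal(size=(m, n))
u = g / np.linalg.norm(g, axis=1, keepdims=True)

# R: for the standard multivariate normal the generating variate in (9)
# is sqrt(chi^2_n), drawn independently of U.
r = np.sqrt(rng.chisquare(df=n, size=m))

x = r[:, None] * u  # X = R * U should reproduce N_n(0, I)

print(np.round(np.cov(x, rowvar=False), 2))  # approximately the identity
```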

Definition 4. Let $T: \mathbb{R}^n \to \mathbb{R}^n$, $x \mapsto Ax + \mu$, $A \in \mathbb{R}^{n \times n}$, $\mu \in \mathbb{R}^n$ be an affine map. $X$ has an elliptical distribution if $X = T(Y)$ and $Y \sim S_n(\phi)$.

Since the characteristic function can be written as

$$\psi(t) = E[\exp(i t^t X)] = E[\exp(i t^t (AY + \mu))] = \exp(i t^t \mu)\, E[\exp(i (A^t t)^t Y)] = \exp(i t^t \mu)\, \phi(t^t \Sigma t),$$

where $\Sigma := AA^t$, we denote the elliptical distributions

$$X \sim E_n(\mu, \Sigma, \phi).$$

For example, $N_n(\mu, \Sigma) = E_n(\mu, \Sigma, \phi)$ with $\phi(t) = \exp(-t/2)$. If $Y$ has a density $f(y) = g(y^t y)$ and if $A$ is regular ($\det(A) \ne 0$, so that $\Sigma$ is strictly positive-definite), then $X = AY + \mu$ has density

$$h(x) = \frac{1}{\sqrt{\det(\Sigma)}}\, g\big((x - \mu)^t \Sigma^{-1} (x - \mu)\big),$$

and the contours of equal density are now ellipsoids.

Knowledge of the distribution of $X$ does not completely determine the elliptical representation $E_n(\mu, \Sigma, \phi)$; it uniquely determines $\mu$ but $\Sigma$ and $\phi$ are only determined up to a positive constant.³ In particular $\Sigma$ can be chosen so that it is directly interpretable as the covariance matrix of $X$, although this is not always standard. Let $X \sim E_n(\mu, \Sigma, \phi)$, so that $X =_d \mu + AY$ where $\Sigma = AA^t$ and $Y$ is a random vector satisfying $Y \sim S_n(\phi)$. Equivalently $Y =_d R \cdot U$, where $U$ is uniformly distributed on $S^{n-1}$ and $R$ is a positive random variable independent of $U$. If $E[R^2] < \infty$ it follows

³If $X$ is elliptical and non-degenerate there exist $\mu$, $A$ and $Y \sim S_n(\phi)$ so that $X =_d AY + \mu$, but for any $\lambda \in \mathbb{R} \setminus \{0\}$ we also have $X =_d (A/\lambda)\,\lambda Y + \mu$ where $\lambda Y \sim S_n(\widetilde{\phi})$ and $\widetilde{\phi}(u) := \phi(\lambda^2 u)$. In general, if $X \sim E_n(\mu, \Sigma, \phi) = E_n(\widetilde{\mu}, \widetilde{\Sigma}, \widetilde{\phi})$ then $\mu = \widetilde{\mu}$ and there exists $c > 0$ so that $\widetilde{\Sigma} = c\Sigma$ and $\widetilde{\phi}(u) = \phi(u/c)$ (see Fang, Kotz, and Ng (1987), page 43).


that $E[X] = \mu$ and $\mathrm{Cov}[X] = AA^t\, E[R^2]/n = \Sigma\, E[R^2]/n$, since $\mathrm{Cov}[U] = I_{n \times n}/n$. By starting with the characteristic generator $\widetilde{\phi}(u) := \phi(u/c)$ with $c = n/E[R^2]$ we ensure that $\mathrm{Cov}[X] = \Sigma$. An elliptical distribution is thus fully described by its mean, its covariance matrix and its characteristic generator.

We now consider some of the reasons why correlation and covariance are natural measures of dependence in the world of elliptical distributions. First, many of the properties of the multivariate normal distribution are shared by the elliptical distributions. Linear combinations, marginal distributions and conditional distributions of elliptical random variables can largely be determined by linear algebra using knowledge of covariance matrix, mean and generator. This is summarized in the following properties.

• Any linear combination of an elliptically distributed random vector is also elliptical with the same characteristic generator $\phi$. If $X \sim E_n(\mu, \Sigma, \phi)$ and $B \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, then

$$BX + b \sim E_m(B\mu + b,\; B\Sigma B^t,\; \phi).$$

It is immediately clear that the components $X_1, \ldots, X_n$ are all symmetrically distributed random variables of the same type.⁴

• The marginal distributions of elliptical distributions are also elliptical with the same generator. Let $X = \binom{X_1}{X_2} \sim E_n(\mu, \Sigma, \phi)$ with $X_1 \in \mathbb{R}^p$, $X_2 \in \mathbb{R}^q$, $p + q = n$. Let $E[X] = \mu = \binom{\mu_1}{\mu_2}$, $\mu_1 \in \mathbb{R}^p$, $\mu_2 \in \mathbb{R}^q$, and $\Sigma = \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}$, accordingly. Then

$$X_1 \sim E_p(\mu_1, \Sigma_{11}, \phi), \qquad X_2 \sim E_q(\mu_2, \Sigma_{22}, \phi).$$

• We assume that $\Sigma$ is strictly positive-definite. The conditional distribution of $X_1$ given $X_2$ is also elliptical, although in general with a different generator $\widetilde{\phi}$:

$$X_1 \mid X_2 \sim E_p(\mu_{1.2},\; \Sigma_{11.2},\; \widetilde{\phi}), \quad (10)$$

where $\mu_{1.2} = \mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(X_2 - \mu_2)$ and $\Sigma_{11.2} = \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}$. The distribution of the generating variable $\widetilde{R}$ in (9) corresponding to $\widetilde{\phi}$ is the conditional distribution

$$\sqrt{(X - \mu)^t \Sigma^{-1} (X - \mu) - (X_2 - \mu_2)^t \Sigma_{22}^{-1} (X_2 - \mu_2)} \;\Big|\; X_2.$$

Since in the case of multivariate normality uncorrelatedness is equivalent to independence we have $\widetilde{R} =_d \sqrt{\chi^2_p}$ and $\widetilde{\phi} = \phi$, so that the conditional distribution is of the same type as the unconditional; for general elliptical distributions this is not true. From (10) we see that

$$E[X_1 \mid X_2] = \mu_{1.2} = \mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(X_2 - \mu_2),$$

so that the best prediction of $X_1$ given $X_2$ is linear in $X_2$ and is simply the linear regression of $X_1$ on $X_2$. In the case of multivariate normality we have additionally

$$\mathrm{Cov}[X_1 \mid X_2] = \Sigma_{11.2} = \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21},$$

⁴Two random variables $X$ and $Y$ are of the same type if we can find $a > 0$ and $b \in \mathbb{R}$ so that $Y =_d aX + b$.


which is independent of $\mathbf{X}_2$. The independence of the conditional covariance from $\mathbf{X}_2$ is also a characterisation of the multivariate normal distribution in the class of elliptical distributions (Kelker 1970).
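The linearity of the conditional mean can be checked numerically in the multivariate normal case. The following sketch (with illustrative parameter values, not taken from the paper) fits an ordinary least-squares line of $X_1$ on $X_2$ and compares the slope with $\Sigma_{12}\Sigma_{22}^{-1}$:

```python
import numpy as np

# Illustration: for a bivariate normal vector the conditional mean
# E[X1 | X2] is the linear regression mu1 + (sigma12/sigma22)*(X2 - mu2).
# All parameter values below are illustrative.
rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0])
Sigma = np.array([[4.0, 1.8],
                  [1.8, 9.0]])
n = 200_000
X = rng.multivariate_normal(mu, Sigma, size=n)

theoretical_slope = Sigma[0, 1] / Sigma[1, 1]           # sigma12 / sigma22
slope, intercept = np.polyfit(X[:, 1], X[:, 0], deg=1)  # OLS fit of X1 on X2

print(theoretical_slope, slope)  # the two slopes agree closely
```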

Since the type of all marginal distributions is the same, we see that an elliptical distribution is uniquely determined by its mean, its covariance matrix and knowledge of this type. Alternatively, the dependence structure (copula) of a continuous elliptical distribution is uniquely determined by the correlation matrix and knowledge of this type. For example, the copula of the bivariate t-distribution with $\nu$ degrees of freedom and correlation $\rho$ is
\[ C^t_{\nu,\rho}(x, y) = \int_{-\infty}^{t_\nu^{-1}(x)} \int_{-\infty}^{t_\nu^{-1}(y)} \frac{1}{2\pi(1-\rho^2)^{1/2}} \left\{ 1 + \frac{s^2 - 2\rho s t + t^2}{\nu(1-\rho^2)} \right\}^{-(\nu+2)/2} ds\, dt, \tag{11} \]
where $t_\nu^{-1}(x)$ denotes the inverse of the distribution function of the standard univariate t-distribution with $\nu$ degrees of freedom. This copula is seen to depend only on $\rho$ and $\nu$.

An important question is: which univariate types are possible for the marginal distribution of an elliptical distribution in $\mathbb{R}^n$ for any $n \in \mathbb{N}$? Without loss of generality, it is sufficient to consider the spherical case (Fang, Kotz, and Ng (1987), pages 48-51). $F$ is the marginal distribution of a spherical distribution in $\mathbb{R}^n$ for any $n \in \mathbb{N}$ if and only if $F$ is a mixture of centred normal distributions. In other words, if $F$ has a density $f$, the latter is of the form
\[ f(x) = \frac{1}{\sqrt{2\pi}} \int_0^\infty \frac{1}{\sigma} \exp\left( -\frac{x^2}{2\sigma^2} \right) G(d\sigma), \]
where $G$ is a distribution function on $[0, \infty)$ with $G(0) = 0$. The corresponding spherical distribution has the alternative stochastic representation
\[ \mathbf{X} =_d S \cdot \mathbf{Z}, \]
where $S \sim G$, $\mathbf{Z} \sim N_n(0, I_{n \times n})$, and $S$ and $\mathbf{Z}$ are independent. For example, the multivariate t-distribution with $\nu$ degrees of freedom can be constructed by taking $S =_d \sqrt{\nu}/\sqrt{\chi^2_\nu}$.
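The scale-mixture representation above translates directly into a simulation recipe. The sketch below (sample size, degrees of freedom and dispersion matrix are illustrative choices) generates multivariate t variates as $S \cdot \mathbf{Z}$ with $S = \sqrt{\nu/\chi^2_\nu}$; the covariance is inflated by $\nu/(\nu-2)$, but the correlation matrix is preserved:

```python
import numpy as np

# Sketch: simulate a multivariate t distribution via the scale-mixture
# representation X = S * Z, with S = sqrt(nu / chi2_nu) and Z multivariate
# normal, as in the text. Parameter values are illustrative.
rng = np.random.default_rng(42)
nu = 6
Sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])
n = 300_000

Z = rng.multivariate_normal(np.zeros(2), Sigma, size=n)
S = np.sqrt(nu / rng.chisquare(nu, size=n))
X = S[:, None] * Z

# Cov[X] = nu/(nu-2) * Sigma, but the *correlation* matrix equals that of Z.
print(np.corrcoef(X, rowvar=False)[0, 1])  # close to 0.5
```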

3.4. Covariance and elliptical distributions in risk management. A further important feature of the elliptical distributions is that they are amenable to the standard approaches of risk management. They support both the use of Value-at-Risk as a measure of risk and the mean-variance (Markowitz) approach (see e.g. Campbell, Lo, and MacKinlay (1997)) to risk management and portfolio optimization.

Suppose that $\mathbf{X} = (X_1, \ldots, X_n)^t$ represents $n$ risks with an elliptical distribution and that we consider linear portfolios of such risks
\[ \left\{ Z = \sum_{i=1}^n \lambda_i X_i \;\Big|\; \lambda_i \in \mathbb{R} \right\} \]
with distribution $F_Z$. The Value-at-Risk (VaR) of portfolio $Z$ at probability level $\alpha$ is given by
\[ \operatorname{VaR}_\alpha(Z) = F_Z^{-1}(\alpha) = \inf\{ z \in \mathbb{R} : F_Z(z) \ge \alpha \}; \]
i.e. it is simply an alternative notation for the quantile function of $F_Z$ evaluated at $\alpha$, and we will often use $\operatorname{VaR}_\alpha(Z)$ and $F_Z^{-1}(\alpha)$ interchangeably.


In the elliptical world the use of VaR as a measure of the risk of a portfolio $Z$ makes sense because VaR is a coherent risk measure in this world. A coherent risk measure in the sense of Artzner, Delbaen, Eber, and Heath (1999) is a real-valued function $\varrho$ on the space of real-valued random variables[5] which fulfills the following (sensible) properties:

A1. (Positivity.) For any positive random variable $X \ge 0$: $\varrho(X) \ge 0$.
A2. (Subadditivity.) For any two random variables $X$ and $Y$ we have $\varrho(X + Y) \le \varrho(X) + \varrho(Y)$.
A3. (Positive homogeneity.) For $\lambda \ge 0$ we have that $\varrho(\lambda X) = \lambda \varrho(X)$.
A4. (Translation invariance.) For any $a \in \mathbb{R}$ we have that $\varrho(X + a) = \varrho(X) + a$.

In the elliptical world the use of any positive homogeneous, translation-invariant measure of risk to rank risks, or to determine optimal risk-minimizing portfolio weights under the condition that a certain return is attained, is equivalent to the Markowitz approach where the variance is used as risk measure. Alternative risk measures such as $\operatorname{VaR}_\alpha$ or expected shortfall, $\mathbb{E}[Z \mid Z > \operatorname{VaR}_\alpha(Z)]$, give different numerical values, but have no effect on the management of risk. We make these assertions more precise in Theorem 1.

Throughout this paper, for notational and pedagogical reasons, we use VaR in its most simplistic form, i.e. disregarding questions of appropriate horizon, estimation of the underlying profit-and-loss distribution, etc. However, the key messages stemming from this oversimplified view carry over to more concrete VaR calculations in practice.

Theorem 1. Suppose $\mathbf{X} \sim E_n(\mu, \Sigma, \phi)$ with $\sigma^2[X_i] < \infty$ for all $i$. Let
\[ \mathcal{P} = \left\{ Z = \sum_{i=1}^n \lambda_i X_i \;\Big|\; \lambda_i \in \mathbb{R} \right\} \]
be the set of all linear portfolios. Then the following are true.

1. (Subadditivity of VaR.) For any two portfolios $Z_1, Z_2 \in \mathcal{P}$ and $0.5 \le \alpha < 1$,
\[ \operatorname{VaR}_\alpha(Z_1 + Z_2) \le \operatorname{VaR}_\alpha(Z_1) + \operatorname{VaR}_\alpha(Z_2). \]
2. (Equivalence of variance and positive homogeneous risk measurement.) Let $\varrho$ be a real-valued risk measure on the space of real-valued random variables which depends only on the distribution of a random variable $X$. Suppose this measure satisfies A3. Then for $Z_1, Z_2 \in \mathcal{P}$
\[ \varrho(Z_1 - \mathbb{E}[Z_1]) \le \varrho(Z_2 - \mathbb{E}[Z_2]) \iff \sigma^2[Z_1] \le \sigma^2[Z_2]. \]
3. (Markowitz risk-minimizing portfolio.) Let $\varrho$ be as in 2 and assume that A4 is also satisfied. Let
\[ \mathcal{E} = \left\{ Z = \sum_{i=1}^n \lambda_i X_i \;\Big|\; \lambda_i \in \mathbb{R},\; \sum_{i=1}^n \lambda_i = 1,\; \mathbb{E}[Z] = r \right\} \]
be the subset of portfolios giving expected return $r$. Then
\[ \operatorname*{argmin}_{Z \in \mathcal{E}} \varrho(Z) = \operatorname*{argmin}_{Z \in \mathcal{E}} \sigma^2[Z]. \]

Proof. The main observation is that $(Z_1, Z_2)^t$ has an elliptical distribution, so $Z_1$, $Z_2$ and $Z_1 + Z_2$ all have distributions of the same type.

[5] Positive values of these random variables should be interpreted as losses; this is in contrast to Artzner, Delbaen, Eber, and Heath (1999), who interpret negative values as losses.


1. Let $q_\alpha$ be the $\alpha$-quantile of the standardised distribution of this type. Then
\[ \operatorname{VaR}_\alpha(Z_1) = \mathbb{E}[Z_1] + \sigma[Z_1]\, q_\alpha, \]
\[ \operatorname{VaR}_\alpha(Z_2) = \mathbb{E}[Z_2] + \sigma[Z_2]\, q_\alpha, \]
\[ \operatorname{VaR}_\alpha(Z_1 + Z_2) = \mathbb{E}[Z_1 + Z_2] + \sigma[Z_1 + Z_2]\, q_\alpha. \]
Since $\sigma[Z_1 + Z_2] \le \sigma[Z_1] + \sigma[Z_2]$ and $q_\alpha \ge 0$, the result follows.

2. Since $Z_1$ and $Z_2$ are random variables of the same type, there exists an $a > 0$ such that $Z_1 - \mathbb{E}[Z_1] =_d a(Z_2 - \mathbb{E}[Z_2])$. It follows that
\[ \varrho(Z_1 - \mathbb{E}[Z_1]) \le \varrho(Z_2 - \mathbb{E}[Z_2]) \iff a \le 1 \iff \sigma^2[Z_1] \le \sigma^2[Z_2]. \]
3. Follows from 2 and the fact that we optimize over portfolios with identical expectation.
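The quantile decomposition in part 1 of the proof is easy to verify numerically. The sketch below (means, standard deviations and correlation are illustrative) computes $\operatorname{VaR}_\alpha(Z) = \mathbb{E}[Z] + \sigma[Z] q_\alpha$ in the bivariate normal special case and checks subadditivity:

```python
from math import sqrt
from scipy.stats import norm

# Sketch: in the elliptical (here bivariate normal) world,
# VaR_alpha(Z) = E[Z] + sigma[Z] * q_alpha with q_alpha >= 0 for alpha >= 0.5,
# so VaR is subadditive as in part 1 of Theorem 1. Illustrative parameters.
alpha = 0.99
q = norm.ppf(alpha)  # alpha-quantile of the standardised (normal) type

mu1, mu2 = 1.0, -0.5
s1, s2, rho = 2.0, 3.0, 0.4
s12 = sqrt(s1**2 + s2**2 + 2 * rho * s1 * s2)  # sigma[Z1 + Z2]

var1 = mu1 + s1 * q
var2 = mu2 + s2 * q
var_sum = (mu1 + mu2) + s12 * q

print(var_sum <= var1 + var2)  # True: subadditivity of VaR
```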

While this theorem shows that in the elliptical world the Markowitz variance-minimizing portfolio minimizes popular risk measures like VaR and expected shortfall (both of which are coherent in this world), it can also be shown that the Markowitz portfolio minimizes some other risk measures which do not satisfy A3 and A4. The partial moment measures of downside risk provide an example. The $k$th (upper) partial moment of a random variable $X$ with respect to a threshold $\tau$ is defined to be
\[ \operatorname{LPM}_{k,\tau}(X) = \mathbb{E}\left[ \{(X - \tau)_+\}^k \right], \quad k \ge 0,\; \tau \in \mathbb{R}. \]
Suppose we have portfolios $Z_1, Z_2 \in \mathcal{E}$ and assume additionally that $r \le \tau$, so that the threshold is set above the expected return $r$. Using a similar approach to the preceding theorem it can be shown that
\[ \sigma^2[Z_1] \le \sigma^2[Z_2] \iff (Z_1 - \tau) =_d a(Z_2 - \tau) - (1 - a)(\tau - r), \]
with $0 < a \le 1$. It follows that
\[ \operatorname{LPM}_{k,\tau}(Z_1) \le \operatorname{LPM}_{k,\tau}(Z_2) \iff \sigma^2[Z_1] \le \sigma^2[Z_2], \]
from which the equivalence to Markowitz is clear. See Harlow (1991) for an empirical case study of the change in the optimal asset allocation when $\operatorname{LPM}_{1,\tau}$ (target shortfall) and $\operatorname{LPM}_{2,\tau}$ (target semi-variance) are used.
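Partial moments are straightforward to estimate empirically. A minimal sketch (the choice of marginal, threshold and moment order is illustrative, not from the paper):

```python
import numpy as np

# Sketch: empirical kth (upper) partial moment
# LPM_{k,tau}(X) = E[ ((X - tau)_+)^k ], estimated from a sample.
def partial_moment(x, k, tau):
    return np.mean(np.clip(x - tau, 0.0, None) ** k)

rng = np.random.default_rng(1)
x = rng.standard_normal(1_000_000)

# For X ~ N(0,1) and tau = 0, E[(X)_+] = 1/sqrt(2*pi) ~ 0.3989.
print(partial_moment(x, k=1, tau=0.0))
```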

4. Alternative dependence concepts

We begin by clarifying what we mean by the notion of perfect dependence. We go on to discuss other measures of dependence, in particular rank correlation. We concentrate on pairs of random variables.

4.1. Comonotonicity. For every copula the well-known Fréchet bounds apply (Fréchet (1957)):
\[ \underbrace{\max\{x_1 + \cdots + x_n + 1 - n,\, 0\}}_{C_\ell(x_1, \ldots, x_n)} \;\le\; C(x_1, \ldots, x_n) \;\le\; \underbrace{\min\{x_1, \ldots, x_n\}}_{C_u(x_1, \ldots, x_n)}; \tag{12} \]
these follow from the fact that every copula is the distribution function of a random vector $(U_1, \ldots, U_n)^t$ with $U_i \sim U(0, 1)$. In the case $n = 2$ the bounds $C_\ell$ and $C_u$ are themselves copulas since, if $U \sim U(0, 1)$, then
\[ C_\ell(x_1, x_2) = \mathbb{P}[U \le x_1,\, 1 - U \le x_2], \]
\[ C_u(x_1, x_2) = \mathbb{P}[U \le x_1,\, U \le x_2], \]

Page 14: corrams - ETH Zembrecht/ftp/pitfalls.pdfgan t theory, whic h is essen tially founded on an assumption of m ultiv ariate normally distributed returns, in order to arriv e at an optimal

14 PAUL EMBRECHTS, ALEXANDER MCNEIL, AND DANIEL STRAUMANN

so that $C_\ell$ and $C_u$ are the bivariate distribution functions of the vectors $(U, 1-U)^t$ and $(U, U)^t$ respectively.

The distribution of $(U, 1-U)^t$ has all its mass on the diagonal between $(0, 1)$ and $(1, 0)$, whereas that of $(U, U)^t$ has its mass on the diagonal between $(0, 0)$ and $(1, 1)$. In these cases we say that $C_\ell$ and $C_u$ describe perfect negative and perfect positive dependence respectively. This is formalized in the following theorem.

Theorem 2. Let $(X, Y)^t$ have one of the copulas $C_\ell$ or $C_u$.[6] (In the former case this means $F(x_1, x_2) = \max\{F_1(x_1) + F_2(x_2) - 1,\, 0\}$; in the latter $F(x_1, x_2) = \min\{F_1(x_1), F_2(x_2)\}$.) Then there exist two monotonic functions $u, v : \mathbb{R} \to \mathbb{R}$ and a real-valued random variable $Z$ so that
\[ (X, Y)^t =_d (u(Z), v(Z))^t, \]
with $u$ increasing and $v$ decreasing in the former case and with both increasing in the latter. The converse of this result is also true.

Proof. The proof for the second case is given essentially in Wang and Dhaene (1998). A geometrical interpretation of Fréchet copulas is given in Mikusinski, Sherwood, and Taylor (1992). We consider only the first case $C = C_\ell$, the proofs being similar. Let $U$ be a $U(0,1)$-distributed random variable. We have
\[ (X, Y)^t =_d (F_1^{-1}(U), F_2^{-1}(1-U))^t = (F_1^{-1}(U), F_2^{-1} \circ g\,(U))^t, \]
where $F_i^{-1}(q) = \inf_{x \in \mathbb{R}}\{F_i(x) \ge q\}$, $q \in (0, 1)$, is the quantile function of $F_i$, $i = 1, 2$, and $g(x) = 1 - x$. It follows that $u := F_1^{-1}$ is increasing and $v := F_2^{-1} \circ g$ is decreasing. For the converse assume
\[ (X, Y)^t =_d (u(Z), v(Z))^t, \]
with $u$ and $v$ increasing and decreasing respectively. We define $A := \{Z \in u^{-1}((-\infty, x])\}$ and $B := \{Z \in v^{-1}((-\infty, y])\}$. If $A \cap B \ne \emptyset$ then the monotonicity of $u$ and $v$ implies that
\[ \mathbb{P}[A \cup B] = \mathbb{P}[\Omega] = 1 = \mathbb{P}[A] + \mathbb{P}[B] - \mathbb{P}[A \cap B], \]
and hence $\mathbb{P}[A \cap B] = \mathbb{P}[u(Z) \le x,\, v(Z) \le y] = F_1(x) + F_2(y) - 1$. If $A \cap B = \emptyset$, then $F_1(x) + F_2(y) - 1 \le 0$. In all cases we have
\[ \mathbb{P}[u(Z) \le x,\, v(Z) \le y] = \max\{F_1(x) + F_2(y) - 1,\, 0\}. \]

We introduce the following terminology.

Definition 5. [Yaari (1987)] If $(X, Y)^t$ has the copula $C_u$ (see again footnote 6) then $X$ and $Y$ are said to be comonotonic; if it has copula $C_\ell$ they are said to be countermonotonic.

In the case of continuous distributions $F_1$ and $F_2$ a stronger version of the result can be stated:
\[ C = C_\ell \iff Y = T(X) \text{ a.s.}, \quad T = F_2^{-1} \circ (1 - F_1) \text{ decreasing}; \tag{13} \]
\[ C = C_u \iff Y = T(X) \text{ a.s.}, \quad T = F_2^{-1} \circ F_1 \text{ increasing}. \tag{14} \]

[6] If there are discontinuities in $F_1$ or $F_2$, so that the copula is not unique, then we interpret $C_\ell$ and $C_u$ as being possible copulas.
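The quantile constructions in (13) and (14) give a direct recipe for simulating perfectly dependent pairs with arbitrary marginals. A sketch, using exponential and lognormal marginals purely as illustrative choices:

```python
import numpy as np
from scipy.stats import expon, lognorm, spearmanr

# Sketch: build comonotonic and countermonotonic pairs with arbitrary
# marginals via the quantile transforms of (13)-(14); the exponential and
# lognormal marginals are illustrative choices.
rng = np.random.default_rng(7)
u = rng.uniform(size=50_000)

x = expon.ppf(u)                         # F1^{-1}(U)
y_co = lognorm.ppf(u, s=1.0)             # F2^{-1}(F1(x)): comonotonic with x
y_counter = lognorm.ppf(1.0 - u, s=1.0)  # F2^{-1}(1 - F1(x)): countermonotonic

rho_co, _ = spearmanr(x, y_co)
rho_counter, _ = spearmanr(x, y_counter)
print(rho_co, rho_counter)  # 1.0 and -1.0: rank correlation sees perfect dependence
```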


4.2. Desired properties of dependency measures. A measure of dependence, like linear correlation, summarises the dependency structure of two random variables in a single number. We consider the properties that we would like such a measure to have. Let $\delta(\cdot, \cdot)$ be a dependency measure which assigns a real number to any pair of real-valued random variables $X$ and $Y$. Ideally, we desire the following properties:

P1. $\delta(X, Y) = \delta(Y, X)$ (symmetry).
P2. $-1 \le \delta(X, Y) \le 1$ (normalisation).
P3. $\delta(X, Y) = 1 \iff X, Y$ comonotonic; $\delta(X, Y) = -1 \iff X, Y$ countermonotonic.
P4. For $T : \mathbb{R} \to \mathbb{R}$ strictly monotonic on the range of $X$:
\[ \delta(T(X), Y) = \begin{cases} \delta(X, Y), & T \text{ increasing}, \\ -\delta(X, Y), & T \text{ decreasing}. \end{cases} \]

Linear correlation fulfills properties P1 and P2 only. In the next Section we see that rank correlation also fulfills P3 and P4 if $X$ and $Y$ are continuous. These properties obviously represent a selection and the list could be altered or extended in various ways (see Hutchinson and Lai (1990), Chapter 11). For example, we might like to have the property

P5. $\delta(X, Y) = 0 \iff X, Y$ are independent.

Unfortunately, this contradicts property P4, as the following shows.

Proposition 3. There is no dependency measure satisfying P4 and P5.

Proof. Let $(X, Y)^t$ be uniformly distributed on the unit circle $S^1$ in $\mathbb{R}^2$, so that $(X, Y)^t = (\cos\varphi, \sin\varphi)^t$ with $\varphi \sim U(0, 2\pi)$. Since $(-X, Y)^t =_d (X, Y)^t$, we have
\[ \delta(-X, Y) = \delta(X, Y) = -\delta(X, Y), \]
which implies $\delta(X, Y) = 0$, although $X$ and $Y$ are dependent. With the same argumentation it can be shown that the measure is zero for any spherical distribution in $\mathbb{R}^2$.
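The circle counterexample is easy to reproduce by simulation: the pair is functionally dependent ($X^2 + Y^2 = 1$ exactly), yet its linear correlation is zero. A minimal sketch:

```python
import numpy as np

# Sketch: the counterexample of Proposition 3 - (X, Y) uniform on the unit
# circle is fully dependent (X^2 + Y^2 = 1) yet has zero linear correlation.
rng = np.random.default_rng(3)
phi = rng.uniform(0.0, 2.0 * np.pi, size=500_000)
x, y = np.cos(phi), np.sin(phi)

print(np.corrcoef(x, y)[0, 1])            # approximately 0
print(np.max(np.abs(x**2 + y**2 - 1.0)))  # 0 up to rounding: perfect dependence
```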

If we require P5, then we can consider dependency measures which only assign positive values to pairs of random variables. For example, we can consider the amended properties:

P2b. $0 \le \delta(X, Y) \le 1$.
P3b. $\delta(X, Y) = 1 \iff X, Y$ comonotonic or countermonotonic.
P4b. For $T : \mathbb{R} \to \mathbb{R}$ strictly monotonic, $\delta(T(X), Y) = \delta(X, Y)$.

If we restrict ourselves to the case of continuous random variables there are dependency measures which fulfill all of P1, P2b, P3b, P4b and P5, although they are in general measures of theoretical rather than practical interest. We introduce them briefly in the next Section. A further measure which satisfies all of P1, P2b, P3b, P4b and P5 (with the exception of the implication $\delta(X, Y) = 1 \implies X, Y$ comonotonic or countermonotonic) is monotone correlation,
\[ \delta(X, Y) = \sup_{f, g} \rho(f(X), g(Y)), \]
where $\rho$ represents linear correlation and the supremum is taken over all monotonic $f$ and $g$ such that $0 < \sigma^2(f(X)), \sigma^2(g(Y)) < \infty$ (Kimeldorf and Sampson 1978). The disadvantage of all of these measures is that they are constrained to give non-negative values, so that they cannot differentiate between positive and negative dependence, and that it is often not clear how to estimate them. An overview of dependency measures and their statistical estimation is given by Tjøstheim (1996).


4.3. Rank correlation.

Definition 6. Let $X$ and $Y$ be random variables with distribution functions $F_1$ and $F_2$ and joint distribution function $F$. Spearman's rank correlation is given by
\[ \rho_S(X, Y) = \rho(F_1(X), F_2(Y)), \tag{15} \]
where $\rho$ is the usual linear correlation. Let $(X_1, Y_1)$ and $(X_2, Y_2)$ be two independent pairs of random variables from $F$; then Kendall's rank correlation is given by
\[ \rho_\tau(X, Y) = \mathbb{P}[(X_1 - X_2)(Y_1 - Y_2) > 0] - \mathbb{P}[(X_1 - X_2)(Y_1 - Y_2) < 0]. \tag{16} \]

For the remainder of this Section we assume that $F_1$ and $F_2$ are continuous distributions, although some of the properties of rank correlation that we derive could partially be formulated for discrete distributions. Spearman's rank correlation is then seen to be the correlation of the copula $C$ associated with $(X, Y)^t$. Both $\rho_S$ and $\rho_\tau$ can be considered to be measures of the degree of monotonic dependence between $X$ and $Y$, whereas linear correlation measures the degree of linear dependence only. The generalisation of $\rho_S$ and $\rho_\tau$ to $n > 2$ dimensions can be done analogously to that of linear correlation: we write pairwise correlations in an $n \times n$ matrix.

We collect together the important facts about $\rho_S$ and $\rho_\tau$ in the following theorem.

Theorem 3. Let $X$ and $Y$ be random variables with continuous distributions $F_1$ and $F_2$, joint distribution $F$ and copula $C$. The following are true:

1. $\rho_S(X, Y) = \rho_S(Y, X)$, $\rho_\tau(X, Y) = \rho_\tau(Y, X)$.
2. If $X$ and $Y$ are independent then $\rho_S(X, Y) = \rho_\tau(X, Y) = 0$.
3. $-1 \le \rho_S(X, Y), \rho_\tau(X, Y) \le +1$.
4. $\rho_S(X, Y) = 12 \int_0^1 \int_0^1 \{C(x, y) - xy\}\, dx\, dy$.
5. $\rho_\tau(X, Y) = 4 \int_0^1 \int_0^1 C(u, v)\, dC(u, v) - 1$.
6. For $T : \mathbb{R} \to \mathbb{R}$ strictly monotonic on the range of $X$, both $\rho_S$ and $\rho_\tau$ satisfy P4.
7. $\rho_S(X, Y) = \rho_\tau(X, Y) = 1 \iff C = C_u \iff Y = T(X)$ a.s. with $T$ increasing.
8. $\rho_S(X, Y) = \rho_\tau(X, Y) = -1 \iff C = C_\ell \iff Y = T(X)$ a.s. with $T$ decreasing.

Proof. 1., 2. and 3. are easily verified.

4. Use the identity, due to Höffding (1940),
\[ \operatorname{Cov}[X, Y] = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \{F(x, y) - F_1(x) F_2(y)\}\, dx\, dy, \tag{17} \]
which is found, for example, in Dhaene and Goovaerts (1996). Recall that $(F_1(X), F_2(Y))^t$ has joint distribution $C$.

5. Calculate
\[ \rho_\tau(X, Y) = 2\, \mathbb{P}[(X_1 - X_2)(Y_1 - Y_2) > 0] - 1 \]
\[ = 2 \cdot 2 \idotsint_{\mathbb{R}^4} 1_{\{x_1 > x_2\}} 1_{\{y_1 > y_2\}}\, dF(x_2, y_2)\, dF(x_1, y_1) - 1 \]
\[ = 4 \iint_{\mathbb{R}^2} F(x_1, y_1)\, dF(x_1, y_1) - 1 \]
\[ = 4 \iint C(u, v)\, dC(u, v) - 1. \]


6. Follows since $\rho_\tau$ and $\rho_S$ can both be expressed in terms of the copula, which is invariant under strictly increasing transformations of the marginals.

7. From 4. it follows immediately that $\rho_S(X, Y) = +1$ iff $C(x, y)$ is maximized iff $C = C_u$ iff $Y = T(X)$ a.s. Suppose $Y = T(X)$ a.s. with $T$ increasing; then the continuity of $F_2$ ensures $\mathbb{P}[Y_1 = Y_2] = \mathbb{P}[T(X_1) = T(X_2)] = 0$, which implies $\rho_\tau(X, Y) = \mathbb{P}[(X_1 - X_2)(Y_1 - Y_2) > 0] = 1$. Conversely, $\rho_\tau(X, Y) = 1$ means $\mathbb{P} \otimes \mathbb{P}[\{(\omega_1, \omega_2) \in \Omega \times \Omega \mid (X(\omega_1) - X(\omega_2))(Y(\omega_1) - Y(\omega_2)) > 0\}] = 1$. Let us define the sets $A = \{\omega \in \Omega \mid X(\omega) \le x\}$ and $B = \{\omega \in \Omega \mid Y(\omega) \le y\}$. Assume $\mathbb{P}[A] \le \mathbb{P}[B]$. We have to show $\mathbb{P}[A \cap B] = \mathbb{P}[A]$. If $\mathbb{P}[A \setminus B] > 0$ then also $\mathbb{P}[B \setminus A] > 0$, and $(X(\omega_1) - X(\omega_2))(Y(\omega_1) - Y(\omega_2)) < 0$ on the set $(A \setminus B) \times (B \setminus A)$, which has measure $\mathbb{P}[A \setminus B] \cdot \mathbb{P}[B \setminus A] > 0$; this is a contradiction. Hence $\mathbb{P}[A \setminus B] = 0$, from which one concludes $\mathbb{P}[A \cap B] = \mathbb{P}[A]$.

8. We use a similar argument to 7.
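Property P4 / part 6 of the theorem — invariance of rank correlations under strictly increasing marginal transformations — can be illustrated empirically; linear correlation has no such invariance. A sketch with illustrative simulated data:

```python
import numpy as np
from scipy.stats import kendalltau, pearsonr, spearmanr

# Sketch: rank correlations are invariant under strictly increasing marginal
# transformations (Theorem 3, part 6), unlike linear correlation.
rng = np.random.default_rng(11)
x = rng.standard_normal(20_000)
y = x + 0.5 * rng.standard_normal(20_000)

x_t, y_t = np.exp(x), np.exp(y)  # strictly increasing transforms: ranks unchanged

rho_s, _ = spearmanr(x, y)
rho_s_t, _ = spearmanr(x_t, y_t)
tau, _ = kendalltau(x, y)
tau_t, _ = kendalltau(x_t, y_t)
r, _ = pearsonr(x, y)
r_t, _ = pearsonr(x_t, y_t)

print(rho_s - rho_s_t, tau - tau_t)  # both zero: invariance
print(r - r_t)                       # nonzero: linear correlation changes
```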

In this result we have verified that rank correlation does have the properties P1, P2, P3 and P4. As far as P5 is concerned, the spherical distributions again provide examples where pairwise rank correlations are zero, despite the presence of dependence.

Theorem 3 (part 4) shows that $\rho_S$ is a scaled version of the signed volume enclosed by the surfaces $S_1 : z = C(x, y)$ and $S_2 : z = xy$. The idea of measuring dependence by defining suitable distance measures between the surfaces $S_1$ and $S_2$ is further developed in Schweizer and Wolff (1981), where the three measures
\[ \delta_1(X, Y) = 12 \int_0^1 \int_0^1 |C(u, v) - uv|\, du\, dv, \]
\[ \delta_2(X, Y) = \left( 90 \int_0^1 \int_0^1 |C(u, v) - uv|^2\, du\, dv \right)^{1/2}, \]
\[ \delta_3(X, Y) = 4 \sup_{u, v \in [0, 1]} |C(u, v) - uv| \]
are proposed. These are measures that satisfy our amended set of properties including P5, but they are constrained to give non-negative measurements and as such cannot differentiate between positive and negative dependence. A further disadvantage of these measures is statistical. Whereas statistical estimation of $\rho_S$ and $\rho_\tau$ from data is straightforward (see Gibbons (1988) for the estimators and Tjøstheim (1996) for asymptotic estimation theory), it is much less clear how we estimate measures like $\delta_1$, $\delta_2$, $\delta_3$.

The main advantages of rank correlation over ordinary correlation are the invariance under monotonic transformations and the sensible handling of perfect dependence. The main disadvantage is that rank correlations do not lend themselves to the same elegant variance-covariance manipulations that were discussed for linear correlation; they are not moment-based correlations. As far as calculation is concerned, there are cases where rank correlations are easier to calculate and cases where linear correlations are easier to calculate. If we are working, for example, with multivariate normal or t-distributions then calculation of linear correlation is easier, since first and second moments are easily determined. If we are working with a multivariate distribution which possesses a simple closed-form copula, like the Gumbel or Farlie-Gumbel-Morgenstern, then moments may be difficult to determine and calculation of rank correlation using Theorem 3 (parts 4 and 5) may be easier.


4.4. Tail Dependence. If we are particularly concerned with extreme values, an asymptotic measure of tail dependence can be defined for pairs of random variables $X$ and $Y$. If the marginal distributions of these random variables are continuous then this dependency measure is also a function of their copula, and is thus invariant under strictly increasing transformations.

Definition 7. Let $X$ and $Y$ be random variables with distribution functions $F_1$ and $F_2$. The coefficient of (upper) tail dependence of $X$ and $Y$ is
\[ \lim_{\alpha \to 1^-} \mathbb{P}[Y > F_2^{-1}(\alpha) \mid X > F_1^{-1}(\alpha)] = \lambda, \]
provided a limit $\lambda \in [0, 1]$ exists. If $\lambda \in (0, 1]$, $X$ and $Y$ are said to be asymptotically dependent (in the upper tail); if $\lambda = 0$ they are asymptotically independent.

As for rank correlation, this definition makes most sense in the case that $F_1$ and $F_2$ are continuous distributions. In this case it can be verified, under the assumption that the limit exists, that
\[ \lim_{\alpha \to 1^-} \mathbb{P}[Y > F_2^{-1}(\alpha) \mid X > F_1^{-1}(\alpha)] = \lim_{\alpha \to 1^-} \mathbb{P}[Y > \operatorname{VaR}_\alpha(Y) \mid X > \operatorname{VaR}_\alpha(X)] = \lim_{\alpha \to 1^-} \frac{\overline{C}(\alpha, \alpha)}{1 - \alpha}, \]
where $\overline{C}(u, u) = 1 - 2u + C(u, u)$ denotes the survivor function of the unique copula $C$ associated with $(X, Y)^t$. Tail dependence is best understood as an asymptotic property of the copula.

Calculation of $\lambda$ for particular copulas is straightforward if the copula has a simple closed form. For example, for the Gumbel copula introduced in (5) it is easily verified that $\lambda = 2 - 2^\beta$, so that random variables with this copula are asymptotically dependent provided $\beta < 1$.

For copulas without a simple closed form, such as the Gaussian copula or the copula of the bivariate t-distribution, an alternative formula for $\lambda$ is more useful. Consider a pair of uniform random variables $(U_1, U_2)^t$ with distribution $C(x, y)$, which we assume is differentiable in both $x$ and $y$. Applying l'Hospital's rule we obtain
\[ \lambda = -\lim_{x \to 1^-} \frac{d\overline{C}(x, x)}{dx} = \lim_{x \to 1^-} \mathbb{P}[U_2 > x \mid U_1 = x] + \lim_{x \to 1^-} \mathbb{P}[U_1 > x \mid U_2 = x]. \]
Furthermore, if $C$ is an exchangeable copula, i.e. $(U_1, U_2)^t =_d (U_2, U_1)^t$, then
\[ \lambda = 2 \lim_{x \to 1^-} \mathbb{P}[U_2 > x \mid U_1 = x]. \]

It is often possible to evaluate this limit by applying the same quantile transform $F_1^{-1}$ to both marginals to obtain a bivariate distribution for which the conditional probability is known. If $F_1$ is a distribution function with infinite right endpoint, then
\[ \lambda = 2 \lim_{x \to 1^-} \mathbb{P}[U_2 > x \mid U_1 = x] = 2 \lim_{x \to \infty} \mathbb{P}[F_1^{-1}(U_2) > x \mid F_1^{-1}(U_1) = x] = 2 \lim_{x \to \infty} \mathbb{P}[Y > x \mid X = x], \]

where $(X, Y)^t \sim C(F_1(x), F_1(y))$.

For example, for the Gaussian copula $C^{Ga}_\rho$ we would take $F_1 = \Phi$, so that $(X, Y)^t$ has a standard bivariate normal distribution with correlation $\rho$. Using the fact that


$Y \mid X = x \sim N(\rho x, 1 - \rho^2)$, it can be calculated that
\[ \lambda = 2 \lim_{x \to \infty} \overline{\Phi}\!\left( x \sqrt{1 - \rho} / \sqrt{1 + \rho} \right) = 0. \]
Thus the Gaussian copula gives asymptotic independence, provided that $\rho < 1$. Regardless of how high a correlation we choose, if we go far enough into the tail, extreme events appear to occur independently in each margin. See Sibuya (1961) or Resnick (1987), Chapter 5, for alternative demonstrations of this fact.

The bivariate t-distribution provides an interesting contrast to the bivariate normal distribution. If $(X, Y)^t$ has a standard bivariate t-distribution with $\nu$ degrees of freedom and correlation $\rho$ then, conditional on $X = x$,
\[ \left( \frac{\nu + 1}{\nu + x^2} \right)^{1/2} \frac{Y - \rho x}{\sqrt{1 - \rho^2}} \sim t_{\nu + 1}. \]
This can be used to show that
\[ \lambda = 2\, \overline{t}_{\nu+1}\!\left( \sqrt{\nu + 1}\, \sqrt{1 - \rho} / \sqrt{1 + \rho} \right), \]
where $\overline{t}_{\nu+1}$ denotes the tail of a univariate t-distribution with $\nu + 1$ degrees of freedom. Provided $\rho > -1$, the copula of the bivariate t-distribution is asymptotically dependent. In Table 1 we tabulate the coefficient of tail dependence for various values of $\nu$ and $\rho$. Perhaps surprisingly, even for negative and zero correlations, the t-copula gives asymptotic dependence in the upper tail. The strength of this dependence increases as $\nu$ decreases and the marginal distributions become heavier-tailed.

    ν \ ρ   -0.5    0      0.5    0.9    1
    2       0.06   0.18   0.39   0.72   1
    4       0.01   0.08   0.25   0.63   1
    10      0.0    0.01   0.08   0.46   1
    ∞       0      0      0      0      1

Table 1. Values of λ for the copula of the bivariate t-distribution for various values of ν, the degrees of freedom, and ρ, the correlation. The last row (ν = ∞) represents the Gaussian copula.
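The closed-form expression for $\lambda$ makes the table easy to reproduce. A sketch using the formula above (the specific $(\nu, \rho)$ pairs are picked from Table 1 for checking):

```python
from math import sqrt
from scipy.stats import t

# Sketch: coefficient of tail dependence for the t copula,
# lambda = 2 * tbar_{nu+1}( sqrt(nu+1) * sqrt(1-rho) / sqrt(1+rho) ),
# which reproduces the entries of Table 1.
def t_tail_dependence(nu, rho):
    return 2.0 * t.sf(sqrt((nu + 1.0) * (1.0 - rho) / (1.0 + rho)), df=nu + 1)

print(round(t_tail_dependence(2, 0.5), 2))  # 0.39, as in Table 1
print(round(t_tail_dependence(2, 0.9), 2))  # 0.72
```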

In Figure 2 we plot exact values of the conditional probability $\mathbb{P}[Y > \operatorname{VaR}_\alpha(Y) \mid X = \operatorname{VaR}_\alpha(X)]$ for pairs of random variables $(X, Y)^t$ with the Gaussian and t-copulas, where the correlation parameter of both copulas is $\rho = 0.9$ and the degrees of freedom of the t-copula is $\nu = 4$. For large values of $\alpha$ the conditional probabilities for the t-copula dominate those for the Gaussian copula. Moreover the former tend towards a non-zero asymptotic limit, whereas the limit in the Gaussian case is zero.

4.5. Concordance. In some situations we may be less concerned with measuring the strength of stochastic dependence between two random variables $X$ and $Y$, and we may wish simply to say whether they are concordant or discordant, that is, whether the dependence between $X$ and $Y$ is positive or negative. While it might seem natural to define $X$ and $Y$ to be positively dependent when $\rho(X, Y) > 0$ (or when $\rho_S(X, Y) > 0$ or $\rho_\tau(X, Y) > 0$), stronger conditions are generally used and we discuss two of these concepts in this Section.


[Figure 2 appears here: the conditional exceedance probability plotted against 1 − α on a logarithmic scale, with curves for the t-copula, the Gaussian copula, and the asymptotic value for t.]

Figure 2. Exact values of the conditional probability $\mathbb{P}[Y > \operatorname{VaR}_\alpha(Y) \mid X = \operatorname{VaR}_\alpha(X)]$ for pairs of random variables $(X, Y)^t$ with the Gaussian and t-copulas, where the correlation parameter in both copulas is $\rho = 0.9$ and the degrees of freedom of the t-copula is $\nu = 4$.

Definition 8. Two random variables $X$ and $Y$ are positive quadrant dependent (PQD) if
\[ \mathbb{P}[X > x, Y > y] \ge \mathbb{P}[X > x]\, \mathbb{P}[Y > y] \quad \text{for all } x, y \in \mathbb{R}. \tag{18} \]

Since $\mathbb{P}[X > x, Y > y] = 1 - \mathbb{P}[X \le x] - \mathbb{P}[Y \le y] + \mathbb{P}[X \le x, Y \le y]$, it is obvious that (18) is equivalent to
\[ \mathbb{P}[X \le x, Y \le y] \ge \mathbb{P}[X \le x]\, \mathbb{P}[Y \le y] \quad \text{for all } x, y \in \mathbb{R}. \]

Definition 9. Two random variables $X$ and $Y$ are positively associated (PA) if
\[ \mathbb{E}[g_1(X, Y)\, g_2(X, Y)] \ge \mathbb{E}[g_1(X, Y)]\, \mathbb{E}[g_2(X, Y)] \tag{19} \]
for all real-valued, measurable functions $g_1$ and $g_2$ which are increasing in both components and for which the expectations above are defined.

For further concepts of positive dependence see Chapter 2 of Joe (1997), where the relationships between the various concepts are also systematically explored. We


note that PQD and PA are invariant under increasing transformations, and we verify that the following chain of implications holds:
\[ \text{Comonotonicity} \implies \text{PA} \implies \text{PQD} \implies \rho(X, Y) \ge 0,\; \rho_S(X, Y) \ge 0,\; \rho_\tau(X, Y) \ge 0. \tag{20} \]

If $X$ and $Y$ are comonotonic, then from Theorem 2 we can conclude that $(X, Y) =_d (F_1^{-1}(U), F_2^{-1}(U))$, where $U \sim U(0, 1)$. Thus the expectations in (19) can be written as
\[ \mathbb{E}[g_1(X, Y)\, g_2(X, Y)] = \mathbb{E}[\widetilde{g}_1(U)\, \widetilde{g}_2(U)] \]
and
\[ \mathbb{E}[g_1(X, Y)] = \mathbb{E}[\widetilde{g}_1(U)], \quad \mathbb{E}[g_2(X, Y)] = \mathbb{E}[\widetilde{g}_2(U)], \]
where $\widetilde{g}_1$ and $\widetilde{g}_2$ are increasing. Lemma 2.1 in Joe (1997) shows that
\[ \mathbb{E}[\widetilde{g}_1(U)\, \widetilde{g}_2(U)] \ge \mathbb{E}[\widetilde{g}_1(U)]\, \mathbb{E}[\widetilde{g}_2(U)], \]
so that $X$ and $Y$ are PA. The second implication follows immediately by taking
\[ g_1(u, v) = 1_{\{u > x\}}, \quad g_2(u, v) = 1_{\{v > y\}}. \]
The third implication $\text{PQD} \implies \rho(X, Y) \ge 0,\; \rho_S(X, Y) \ge 0$ follows from the identity (17) and the fact that PA and PQD are invariant under increasing transformations. $\text{PQD} \implies \rho_\tau(X, Y) \ge 0$ follows from Theorem 2.8 in Joe (1997).

In the sense of these implications (20), comonotonicity is the strongest type of concordance or positive dependence.

5. Fallacies

Where not otherwise stated, we consider bivariate distributions of the random vector $(X, Y)^t$.

Fallacy 1. Marginal distributions and correlation determine the joint distribution.

This is true if we restrict our attention to the multivariate normal distribution or the elliptical distributions. For example, if we know that $(X, Y)^t$ has a bivariate normal distribution, then the expectations and variances of $X$ and $Y$ and the correlation $\rho(X, Y)$ uniquely determine the joint distribution. However, if we only know the marginal distributions of $X$ and $Y$ and the correlation, then there are many possible bivariate distributions for $(X, Y)^t$. The distribution of $(X, Y)^t$ is not uniquely determined by $F_1$, $F_2$ and $\rho(X, Y)$. We illustrate this with examples, interesting in their own right.

Example 1. Let $X$ and $Y$ have standard normal distributions and assume $\rho(X, Y) = \rho$. If $(X, Y)^t$ is bivariate normally distributed, then the distribution function $F$ of $(X, Y)^t$ is given by
\[ F(x, y) = C^{Ga}_\rho(\Phi(x), \Phi(y)). \]
We have represented this copula earlier as a double integral in (4). Any other copula $C \ne C^{Ga}_\rho$ gives a bivariate distribution with standard normal marginals which is not bivariate normal with correlation $\rho$. We construct a copula $C$ of the type (6) by taking
\[ f(x) = 1_{(\gamma, 1-\gamma)}(x) + \frac{2\gamma - 1}{2\gamma}\, 1_{(\gamma, 1-\gamma)^c}(x), \]
\[ g(y) = -1_{(\gamma, 1-\gamma)}(y) - \frac{2\gamma - 1}{2\gamma}\, 1_{(\gamma, 1-\gamma)^c}(y), \]
where $\frac{1}{4} \le \gamma \le \frac{1}{2}$. Since $h(x, y)$ vanishes on the square $[\gamma, 1-\gamma]^2$, it is clear that $C \ne C^{Ga}_0$ for $\gamma < \frac{1}{2}$ and $F(x, y) = C(\Phi(x), \Phi(y))$ is never bivariate normal; from symmetry considerations ($C(u, v) = C(1 - u, v)$, $0 \le u, v \le 1$) the correlation, irrespective of $\gamma$, is zero. There are thus uncountably many bivariate distributions with standard normal marginals and correlation zero. In Figure 3 the density of $F$ is shown for $\gamma = 0.3$; this is clearly very different from the joint density of the standard bivariate normal distribution with zero correlation.

[Figure 3 appears here: a perspective plot of the density $z = f(x, y)$ over roughly $[-3, 3]^2$.]

Figure 3. Density of a non-bivariate-normal distribution which has standard normal marginals.

Example 2. A more realistic example for risk management is the motivating example of the Introduction. We consider two bivariate distributions with Gamma(3,1) marginals (denoted $G_{3,1}$) and the same correlation $\rho = 0.7$, but with different dependence structures, namely
\[ F_{Ga}(x, y) = C^{Ga}_{\widetilde\rho}(G(x), G(y)), \]
\[ F_{Gu}(x, y) = C^{Gu}_\beta(G(x), G(y)), \]
where $C^{Ga}_{\widetilde\rho}$ is the Gaussian dependence structure and $C^{Gu}_\beta$ is the Gumbel copula introduced in (5). To obtain the desired linear correlation the parameter values were set to be $\widetilde\rho = 0.71$ and $\beta = 0.54$.[7]

In Section 4.4 we showed that the two copulas have quite different tail dependence; the Gaussian copula is asymptotically independent if $\widetilde\rho < 1$ and the Gumbel copula is asymptotically dependent if $\beta < 1$. At finite levels the greater tail dependence of the Gumbel copula is apparent in Figure 1. We fix $u = \operatorname{VaR}_{0.99}(X) = \operatorname{VaR}_{0.99}(Y) =$

[7] These numerical values were determined by stochastic simulation.


G₃,₁⁻¹(0.99) and consider the conditional exceedance probability P[Y > u | X > u] under the two models. An easy empirical estimation based on Figure 1 yields

P̂[Y > u | X > u] = 3/9 under F_Ga,

P̂[Y > u | X > u] = 12/16 under F_Gu.

In the Gumbel model exceedances of the threshold u in one margin tend to be accompanied by exceedances in the other, whereas in the Gaussian dependence model joint exceedances in both margins are rare. There is less "diversification" of large risks in the Gumbel dependence model.

Analytically it is difficult to provide results for the Value-at-Risk of the sum X + Y under the two bivariate distributions,⁸ but simulation studies confirm that X + Y produces more large outcomes under the Gumbel dependence model than under the Gaussian model. The difference between the two dependence structures might be particularly important if we were interested in losses which are triggered only by joint extreme values of X and Y.
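The conditional exceedance probabilities above can be approximated without reference to the marginals, since only the copula matters. The sketch below (numpy/scipy assumed) estimates the Gaussian-copula value by Monte Carlo and uses the closed form C^Gu_β(q, q) = q^(2^β) that follows from the Gumbel parametrization in (5); ρ̃ = 0.71 and β = 0.54 are the parameter values quoted above.

```python
import numpy as np
from scipy.stats import norm

# Conditional exceedance P[Y > u | X > u] at the 99% level under the two
# copulas of Example 2.  The marginals cancel out, so we work with the
# copulas directly.
q, rho_tilde, beta = 0.99, 0.71, 0.54

# Gumbel copula: C(q, q) = q^(2^beta) in the parametrization of (5), so the
# joint exceedance probability is 1 - 2q + C(q, q).
gumbel = (1 - 2 * q + q ** (2 ** beta)) / (1 - q)

# Gaussian copula: estimate the joint exceedance by Monte Carlo.
rng = np.random.default_rng(0)
n = 2_000_000
z1 = rng.standard_normal(n)
z2 = rho_tilde * z1 + np.sqrt(1 - rho_tilde**2) * rng.standard_normal(n)
thresh = norm.ppf(q)
gauss = np.mean((z1 > thresh) & (z2 > thresh)) / (1 - q)

print(f"Gaussian: {gauss:.3f}, Gumbel: {gumbel:.3f}")
```

The Gumbel value is close to its tail dependence coefficient 2 − 2^β ≈ 0.55, while the Gaussian value is markedly smaller, consistent with the 12/16 versus 3/9 counts read off Figure 1.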

Example 3. The Value-at-Risk of linear portfolios is certainly not uniquely determined by the marginal distributions and correlation of the constituent risks. Suppose (X, Y)ᵗ has a bivariate normal distribution with standard normal marginals and correlation ρ and denote the bivariate distribution function by F_ρ. Any mixture F = λF_ρ₁ + (1 − λ)F_ρ₂, 0 ≤ λ ≤ 1, of bivariate normal distributions F_ρ₁ and F_ρ₂ also has standard normal marginals and correlation λρ₁ + (1 − λ)ρ₂. Suppose we fix −1 < ρ < 1 and choose 0 < λ < 1 and ρ₁ < ρ < ρ₂ such that ρ = λρ₁ + (1 − λ)ρ₂. The sum X + Y is longer tailed under F than under F_ρ. Since

P_F[X + Y > z] = λ Φ̄(z/√(2(1 + ρ₁))) + (1 − λ) Φ̄(z/√(2(1 + ρ₂)))

and

P_{F_ρ}[X + Y > z] = Φ̄(z/√(2(1 + ρ))),

we can use Mill's ratio

Φ̄(x) = 1 − Φ(x) = φ(x)(1/x + O(1/x²))

to show that

lim_{z→∞} P_F[X + Y > z] / P_{F_ρ}[X + Y > z] = ∞.

Clearly, as one goes further into the respective tails of the two distributions, the Value-at-Risk for the mixture distribution F is larger than that of the original distribution F_ρ. By using the same technique as Embrechts, Mikosch, and Klüppelberg (1997) (Example 3.3.29) we can show that, as α → 1−,

VaR_{α,F}(X + Y) ∼ √(2(1 + ρ₂)) (−2 log(1 − α))^{1/2},

VaR_{α,F_ρ}(X + Y) ∼ √(2(1 + ρ)) (−2 log(1 − α))^{1/2},

so that

lim_{α→1−} VaR_{α,F}(X + Y) / VaR_{α,F_ρ}(X + Y) = √((1 + ρ₂)/(1 + ρ)) > 1,

⁸See Müller and Bäuerle (1998) for related work on stop-loss risk measures applied to bivariate portfolios under various dependence models.


irrespective of the choice of λ.
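The divergence of the tail ratio can be illustrated numerically; a small sketch (scipy assumed), with the illustrative values ρ₁ = 0, ρ₂ = 0.8 and λ = 0.5, so that ρ = 0.4:

```python
import numpy as np
from scipy.stats import norm

# Tail of X+Y under the normal mixture F = lam*F_rho1 + (1-lam)*F_rho2
# versus the single bivariate normal F_rho with the same correlation
# rho = lam*rho1 + (1-lam)*rho2.
lam, rho1, rho2 = 0.5, 0.0, 0.8
rho = lam * rho1 + (1 - lam) * rho2          # 0.4

def tail_mixture(z):
    return (lam * norm.sf(z / np.sqrt(2 * (1 + rho1)))
            + (1 - lam) * norm.sf(z / np.sqrt(2 * (1 + rho2))))

def tail_normal(z):
    return norm.sf(z / np.sqrt(2 * (1 + rho)))

# The ratio grows without bound as z -> infinity.
ratios = [tail_mixture(z) / tail_normal(z) for z in (2, 5, 10, 15)]
print(ratios)
```

At moderate z the two models are barely distinguishable, but far enough into the tail the mixture dominates by orders of magnitude, exactly as the Mill's-ratio argument predicts.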

Fallacy 2. Given marginal distributions F₁ and F₂ for X and Y, all linear correlations between −1 and 1 can be attained through suitable specification of the joint distribution.

This statement is not true and it is simple to construct counterexamples.

Example 4. Let X and Y be random variables with support [0, ∞), so that F₁(x) = F₂(y) = 0 for all x, y < 0. Let the right endpoints of F₁ and F₂ be infinite: sup_x{x | F₁(x) < 1} = sup_y{y | F₂(y) < 1} = ∞. Assume that ρ(X, Y) = −1, which would imply Y = aX + b a.s. with a < 0 and b ∈ R. It follows that for all y < 0,

F₂(y) = P[Y ≤ y] = P[X ≥ (y − b)/a] ≥ P[X > (y − b)/a] = 1 − F₁((y − b)/a) > 0,

which contradicts the assumption F₂(y) = 0.

The following theorem shows which correlations are possible for given marginal distributions.

Theorem 4. [Höffding (1940) and Fréchet (1957)] Let (X, Y)ᵗ be a random vector with marginals F₁ and F₂ and unspecified dependence structure; assume 0 < σ²[X], σ²[Y] < ∞. Then

1. The set of all possible correlations is a closed interval [ρ_min, ρ_max] with ρ_min < 0 < ρ_max.

2. The extremal correlation ρ = ρ_min is attained if and only if X and Y are countermonotonic; ρ = ρ_max is attained if and only if X and Y are comonotonic.

3. ρ_min = −1 iff X and −Y are of the same type; ρ_max = 1 iff X and Y are of the same type.

Proof. We make use of the identity (17) and observe that the Fréchet inequalities (12) imply

max{F₁(x) + F₂(y) − 1, 0} ≤ F(x, y) ≤ min{F₁(x), F₂(y)}.

The integrand in (17) is minimized pointwise if X and Y are countermonotonic and maximized if X and Y are comonotonic. It is clear that ρ_max ≥ 0. However, if ρ_max = 0 this would imply that min{F₁(x), F₂(y)} = F₁(x)F₂(y) for all x, y. This can only occur if F₁ or F₂ is degenerate, i.e. of the form F₁(x) = 1_{x ≥ x₀} or F₂(y) = 1_{y ≥ y₀}, and this would imply σ²[X] = 0 or σ²[Y] = 0, so that the correlation between X and Y is undefined. Similarly we argue that ρ_min < 0. If F_ℓ(x, y) = max{F₁(x) + F₂(y) − 1, 0} and F_u(x, y) = min{F₁(x), F₂(y)}, then the mixture λF_ℓ + (1 − λ)F_u, 0 ≤ λ ≤ 1, has correlation λρ_min + (1 − λ)ρ_max. Using such mixtures we can construct joint distributions with marginals F₁ and F₂ and with arbitrary correlations ρ ∈ [ρ_min, ρ_max]. This will be used in Section 6.

Example 5. Let X ~ Lognormal(0, 1) and Y ~ Lognormal(0, σ²), σ > 0. We wish to calculate ρ_min and ρ_max for these marginals. Note that X and Y are not of the same type although log X and log Y are. It is clear that ρ_min = ρ(e^Z, e^{−σZ}) and ρ_max = ρ(e^Z, e^{σZ}), where Z ~ N(0, 1). This observation allows us to calculate ρ_min and ρ_max analytically:

ρ_min = (e^{−σ} − 1) / √((e − 1)(e^{σ²} − 1)),

ρ_max = (e^{σ} − 1) / √((e − 1)(e^{σ²} − 1)).

These maximal and minimal correlations are shown graphically in Figure 4. We observe that lim_{σ→∞} ρ_min = lim_{σ→∞} ρ_max = 0.

Figure 4. ρ_min and ρ_max graphed against σ.

This example shows that it is possible to have a random vector (X, Y)ᵗ where the correlation is almost zero even though X and Y are comonotonic or countermonotonic, and thus have the strongest kind of dependence possible. This seems to contradict our intuition about probability and shows that small correlations cannot be interpreted as implying weak dependence between random variables.
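The formulas for ρ_min and ρ_max can be cross-checked against a direct moment computation, using E[e^{aZ}] = e^{a²/2} for Z ~ N(0, 1); a minimal sketch assuming numpy:

```python
import numpy as np

# Attainable correlation bounds for X ~ Lognormal(0,1), Y ~ Lognormal(0, s^2).
def rho_bounds(s):
    denom = np.sqrt((np.e - 1) * (np.exp(s**2) - 1))
    return (np.exp(-s) - 1) / denom, (np.exp(s) - 1) / denom

def corr_exp(a, b):
    # corr(e^{aZ}, e^{bZ}) computed from lognormal moments E[e^{aZ}] = e^{a^2/2}
    cov = np.exp((a + b) ** 2 / 2) - np.exp(a**2 / 2 + b**2 / 2)
    var_a = np.exp(2 * a**2) - np.exp(a**2)
    var_b = np.exp(2 * b**2) - np.exp(b**2)
    return cov / np.sqrt(var_a * var_b)

# rho_max = corr(e^Z, e^{sZ}) (comonotonic), rho_min = corr(e^Z, e^{-sZ})
for s in (0.5, 1.0, 2.0, 5.0):
    lo, hi = rho_bounds(s)
    print(s, lo, hi, corr_exp(1, -s), corr_exp(1, s))
```

For σ = 1 the marginals are of the same type and ρ_max = 1, while already for σ = 5 both bounds are of the order 10⁻⁴ or smaller: the strongest possible dependence with almost zero correlation.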

Fallacy 3. The worst case VaR (quantile) for a linear portfolio X + Y occurs when ρ(X, Y) is maximal, i.e. X and Y are comonotonic.

As we discussed in Section 3.3, it is common to consider variance as a measure of risk in insurance and financial mathematics and, whilst it is true that the variance of a linear portfolio, σ²(X + Y) = σ²(X) + σ²(Y) + 2ρ(X, Y)σ(X)σ(Y), is maximal when the correlation is maximal, it is in general not correct to conclude that the Value-at-Risk is also maximal. For elliptical distributions this is true, but in general it is false.

Suppose two random variables X and Y have distribution functions F₁ and F₂ but that their dependence structure (copula) is unspecified. In the following theorem we give an upper bound for VaR_α(X + Y).

Theorem 5. [Makarov (1981) and Frank, Nelsen, and Schweizer (1987)]

1. For all z ∈ R,

P[X + Y ≤ z] ≥ sup_{x+y=z} C_ℓ(F₁(x), F₂(y)) =: ψ(z).

This bound is sharp in the following sense: set t = ψ(z−) = lim_{u→z−} ψ(u). Then there exists a copula, which we denote by C^(t), such that under the distribution with distribution function F(x, y) = C^(t)(F₁(x), F₂(y)) we have that P[X + Y < z] = ψ(z−).⁹

2. Let ψ⁻¹(α) := inf{z | ψ(z) ≥ α}, α ∈ (0, 1), be the generalized inverse of ψ. Then

ψ⁻¹(α) = inf_{C_ℓ(u,v)=α} {F₁⁻¹(u) + F₂⁻¹(v)}.

3. The following upper bound for Value-at-Risk holds:

VaR_α(X + Y) ≤ ψ⁻¹(α).

This bound is best-possible.

Proof. 1. For any x, y ∈ R with x + y = z, application of the lower Fréchet bound (12) yields

P[X + Y ≤ z] ≥ P[X ≤ x, Y ≤ y] ≥ C_ℓ(F₁(x), F₂(y)).

Taking the supremum over x + y = z on the right-hand side shows the first part of the claim.

The proof of the second part will be a sketch; we merely want to show how C^(t) is chosen. For full mathematical details we refer to Frank, Nelsen, and Schweizer (1987). We restrict ourselves to continuous distribution functions F₁ and F₂. Since copulas are distributions with uniform marginals, we transform the problem onto the unit square by defining A = {(F₁(x), F₂(y)) | x + y ≥ z}, the boundary of which is s = {(F₁(x), F₂(y)) | x + y = z}. We need to find a copula C^(t) such that ∫∫_A dC^(t) = 1 − t. Since F₁ and F₂ are continuous, we have that ψ(z−) = ψ(z) and therefore u + v − 1 ≤ t for all (u, v) ∈ s. Thus the line u + v − 1 = t can be considered as a tangent to s and it becomes clear how one can choose C^(t): C^(t) belongs to the distribution which is uniform on the line segments (0, 0)(t, t) and (t, 1)(1, t). Therefore

C^(t)(u, v) = max{u + v − 1, t} if (u, v) ∈ [t, 1] × [t, 1],
C^(t)(u, v) = min{u, v} otherwise.  (21)

Since the set (t, 1)(1, t) ⊂ A has probability mass 1 − t, we have under C^(t) that P[X + Y ≥ z] = ∫∫_A dC^(t) ≥ 1 − t and therefore P[X + Y < z] ≤ t. But since t is a lower bound for P[X + Y < z], it is necessary that P[X + Y < z] = t.

2. This follows from the duality theorems in Frank and Schweizer (1979).

3. Let ε > 0. Then we have

P[X + Y ≤ ψ⁻¹(α) + ε] ≥ ψ(ψ⁻¹(α) + ε) ≥ α.

Taking the limit ε → 0+, this yields P[X + Y ≤ ψ⁻¹(α)] ≥ α and therefore VaR_α(X + Y) ≤ ψ⁻¹(α). This upper bound cannot be improved. Again take ε > 0. Then if (X, Y)ᵗ has the copula C^(t) of part 1 with z = ψ⁻¹(α) − ε/2, one has

P[X + Y ≤ ψ⁻¹(α) − ε] ≤ P[X + Y < ψ⁻¹(α) − ε/2] = ψ((ψ⁻¹(α) − ε/2)−) ≤ ψ(ψ⁻¹(α) − ε/2) < α

and therefore VaR_α(X + Y) > ψ⁻¹(α) − ε.

⁹In general there is no copula such that P[X + Y ≤ z] = ψ(z), not even if F₁ and F₂ are both continuous; see Nelsen (1999).


Remark 2. The results in Frank, Nelsen, and Schweizer (1987) are more general than Theorem 5 in this paper: they give lower and upper bounds for P[L(X, Y) ≤ z], where L(·, ·) is continuous and increasing in each coordinate. Therefore a best-possible lower bound for VaR_α(X + Y) also exists. Numerical methods for evaluating ψ⁻¹ are described in Williamson and Downs (1990). These two authors also treat the case where we restrict attention to particular subsets of copulas. By considering the set of copulas D = {C | C(u, v) ≥ uv, 0 ≤ u, v ≤ 1}, which has minimal copula C_ind(u, v) = uv, we can derive bounds on P[X + Y ≤ z] under positive dependence (PQD as defined in Definition 8). Multivariate generalizations of Theorem 5 can be found in Li, Scarsini, and Shaked (1996).
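For identical Gamma(3, 1) marginals, the dual representation of ψ⁻¹ in Theorem 5 can be evaluated by a simple grid search and compared with the closed form 2F₁⁻¹((α + 1)/2) that holds for this symmetric, unimodal case; a sketch assuming scipy:

```python
import numpy as np
from scipy.stats import gamma

# psi^{-1}(alpha) = inf over u + v - 1 = alpha of F1^{-1}(u) + F2^{-1}(v),
# evaluated on a grid, for F1 = F2 = Gamma(3, 1).
F = gamma(3)

def psi_inv(alpha, n=20_000):
    u = np.linspace(alpha + 1e-9, 1 - 1e-9, n)   # u ranges over (alpha, 1)
    v = alpha + 1 - u                              # so that u + v - 1 = alpha
    return np.min(F.ppf(u) + F.ppf(v))

for a in (0.9, 0.95, 0.99):
    # for identical unimodal marginals the infimum is at u = v = (alpha+1)/2
    print(a, psi_inv(a), 2 * F.ppf((a + 1) / 2))
```

The grid minimum lands at u = v = (α + 1)/2 to within grid resolution, confirming the closed form used below for Figure 5.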

Figure 5. ψ⁻¹(α) (max. VaR) graphed against α.

In Figure 5 the upper bound ψ⁻¹(α) is shown for X ~ Gamma(3, 1) and Y ~ Gamma(3, 1), for various values of α. Notice that ψ⁻¹(α) can easily be computed analytically in this case, since for α sufficiently large

ψ⁻¹(α) = inf_{u+v−1=α} {F₁⁻¹(u) + F₂⁻¹(v)} = 2F₁⁻¹((α + 1)/2).

This is because F₁ = F₂ and the density of Gamma(3, 1) is unimodal; see also Example 6. For comparative purposes VaR_α(X + Y) is also shown for the case where X, Y are independent and the case where they are comonotonic. The latter is computed by addition of the univariate quantiles, since under comonotonicity VaR_α(X + Y) = VaR_α(X) + VaR_α(Y).¹⁰

The example shows that for a fixed α ∈ (0, 1) the maximal value of VaR_α(X + Y) is considerably larger than the value obtained in the case of comonotonicity. This is not surprising, since we know that VaR is not a subadditive risk measure (Artzner, Delbaen, Eber, and Heath 1999) and there are situations where VaR_α(X + Y) > VaR_α(X) + VaR_α(Y). In a sense, the difference ψ⁻¹(α) − (VaR_α(X) + VaR_α(Y)) quantifies the amount by which VaR fails to be subadditive for particular marginals and a particular α. For a coherent risk measure ϱ, we must have that ϱ(X + Y) attains its maximal value in the case of comonotonicity and that this value is ϱ(X) + ϱ(Y) (Delbaen 1999). The fact that there are situations which are worse than comonotonicity as far as VaR is concerned is another way of showing that VaR is not a coherent measure of risk.

Suppose we define a measure of diversification by

D = (VaR_α(X) + VaR_α(Y)) − VaR_α(X + Y),

the idea being that comonotonic risks are undiversifiable (D = 0) but that risks with weaker dependence should be diversifiable (D > 0). Unfortunately, Theorem 5 makes it clear that we can always find distributions with linear correlation strictly less than the (comonotonic) maximal correlation (see Theorem 4) that give negative diversification (D < 0). This weakens standard diversification arguments, which say that "low correlation means high diversification". As an example, Table 2 gives the numerical values of the correlations of the distributions yielding maximal VaR_α(X + Y) for X, Y ~ Gamma(3, 1).

¹⁰This is also true when X or Y do not have continuous distributions. Using Proposition 4.5 in Denneberg (1994) we deduce that for comonotonic random variables X + Y = (u + v)(Z), where u and v are continuous increasing functions and Z = X + Y. Remark 1 then shows that VaR_α(X + Y) = (u + v)(VaR_α(Z)) = u(VaR_α(Z)) + v(VaR_α(Z)) = VaR_α(X) + VaR_α(Y).

α    0.25    0.5     0.75    0.8     0.85    0.9     0.95    0.99
ρ    −0.09   0.38    0.734   0.795   0.852   0.901   0.956   0.992

Table 2. Correlations of the distributions giving maximal VaR_α(X + Y).
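The ordering seen in Figure 5 can be reproduced exactly for Gamma(3, 1) marginals, since the independent sum is Gamma(6, 1) and the worst case has the closed form 2F₁⁻¹((α + 1)/2); a sketch assuming scipy:

```python
from scipy.stats import gamma

# VaR of X + Y for X, Y ~ Gamma(3,1) at level alpha under three assumptions:
# independence (the sum is then Gamma(6,1)), comonotonicity (quantiles add),
# and the worst case psi^{-1}(alpha) of Theorem 5.
alpha = 0.99
indep = gamma(6).ppf(alpha)                 # X + Y ~ Gamma(6,1) when independent
comon = 2 * gamma(3).ppf(alpha)             # VaR_a(X) + VaR_a(Y)
worst = 2 * gamma(3).ppf((alpha + 1) / 2)   # upper bound psi^{-1}(alpha)

print(indep, comon, worst)
```

Here independence is diversifying (D > 0) and comonotonicity gives D = 0, but the worst-case dependence exceeds the comonotonic VaR by roughly ten percent: negative diversification.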

It might be supposed that VaR is in some sense asymptotically subadditive, so that negative diversification disappears as we let α tend to one, and comonotonicity becomes the worst case. The following two examples show that this is also wrong.

Example 6. The quotient VaR_α(X + Y)/(VaR_α(X) + VaR_α(Y)) can be made arbitrarily large. In general we do not have lim_{α→1−} ψ⁻¹(α)/(VaR_α(X) + VaR_α(Y)) = 1. To see this, consider Pareto marginals F₁(x) = F₂(x) = 1 − x^{−β}, x ≥ 1, where β > 0. We have to determine inf_{u+v−1=α} {F₁⁻¹(u) + F₂⁻¹(v)}. Since F₁ = F₂, the function

g: (α, 1) → R≥0,  u ↦ F₁⁻¹(u) + F₂⁻¹(α + 1 − u),

is symmetric with respect to (α + 1)/2. Since the Pareto density is decreasing, the function g is decreasing on (α, (α + 1)/2] and increasing on [(α + 1)/2, 1); hence g((α + 1)/2) = 2F₁⁻¹((α + 1)/2) is the minimum of g and ψ⁻¹(α) = 2F₁⁻¹((α + 1)/2). Therefore

VaR_α(X + Y)/(VaR_α(X) + VaR_α(Y)) ≤ ψ⁻¹(α)/(VaR_α(X) + VaR_α(Y)) = F₁⁻¹((α + 1)/2)/F₁⁻¹(α) = (1 − (α + 1)/2)^{−1/β}/(1 − α)^{−1/β} = 2^{1/β}.

The upper bound 2^{1/β}, which does not depend on α, can be reached.

Example 7. Let X and Y be independent random variables with identical distribution F₁(x) = 1 − x^{−1/2}, x ≥ 1. This distribution is extremely heavy-tailed with no finite mean. Consider the risks X + Y and 2X, the latter being the sum of comonotonic risks. We can calculate

P[X + Y ≤ z] = 1 − 2√(z − 1)/z < P[2X ≤ z],

for z > 2. It follows that

VaR_α(X + Y) > VaR_α(2X) = VaR_α(X) + VaR_α(Y)

for α ∈ (0, 1), so that, from the point of view of VaR, independence is worse than perfect dependence no matter how large we choose α. VaR is not subadditive for this rather extreme choice of distribution and diversification arguments do not hold; one is better off taking one risk and doubling it than taking two independent risks. Diversifiability of two risks depends not only on their dependence structure but also on the choice of marginal distribution. In fact, for distributions with F₁(x) = F₂(x) = 1 − x^{−β}, β > 0, we do have asymptotic subadditivity in the case β > 1: VaR_α(X + Y) < VaR_α(X) + VaR_α(Y) if α is large enough. To see this, use Lemma 1.3.1 of Embrechts, Mikosch, and Klüppelberg (1997) and the fact that 1 − F₁ is regularly varying of index −β (for an introduction to regular variation theory see the appendix of the same reference).
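The closed form for P[X + Y ≤ z] can be checked by simulation, using the quantile function F₁⁻¹(u) = (1 − u)⁻²; a sketch assuming numpy, with z = 10 an arbitrary illustrative threshold:

```python
import numpy as np

# F(x) = 1 - x^{-1/2}, x >= 1, so F^{-1}(u) = (1 - u)^{-2}.
rng = np.random.default_rng(1)
n = 1_000_000
x = (1 - rng.random(n)) ** -2      # two independent draws from F
y = (1 - rng.random(n)) ** -2

z = 10.0
mc = np.mean(x + y <= z)                       # Monte Carlo estimate
exact = 1 - 2 * np.sqrt(z - 1) / z             # closed form for P[X+Y <= z]
comono = 1 - np.sqrt(2 / z)                    # P[2X <= z] = F(z/2)
print(mc, exact, comono)
```

Since the distribution function of the independent sum lies below that of the comonotonic sum for every z > 2, every quantile of X + Y exceeds the corresponding quantile of 2X, which is the failure of subadditivity described above.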

6. Simulation of Random Vectors

There are various situations in practice where we might wish to simulate dependent random vectors (X₁, ..., Xₙ)ᵗ. In finance we might wish to simulate the future development of the values of assets in a portfolio, where we know these assets to be dependent in some way. In insurance we might be interested in multiline products, where payouts are triggered by the occurrence of losses in one or more dependent business lines, and wish to simulate typical losses. The latter is particularly important within DFA. It is very tempting to approach the problem in the following way:

1. Estimate marginal distributions F₁, ..., Fₙ.
2. Estimate the matrix of pairwise correlations ρᵢⱼ = ρ(Xᵢ, Xⱼ), i ≠ j.
3. Combine this information in some simulation procedure.

Unfortunately, we now know that step 3 represents an attempt to solve an ill-posed problem. There are two main dangers. Given the marginal distributions, the correlation matrix is subject to certain restrictions; for example, each ρᵢⱼ must lie in an interval [ρ_min(Fᵢ, Fⱼ), ρ_max(Fᵢ, Fⱼ)] bounded by the minimal and maximal attainable correlations for marginals Fᵢ and Fⱼ. It is possible that the estimated correlations are not consistent with the estimated marginals, so that no corresponding multivariate distribution for the random vector exists. And in the case where a multivariate distribution exists, it is often not unique.

The approach described above is highly questionable. Instead of considering marginals and correlations separately, it would be more satisfactory to attempt a direct estimation of the multivariate distribution. It might also be sensible to consider whether the question of interest permits the estimation problem to be reduced to a one-dimensional one. For example, if we are really interested in the behaviour of the sum X₁ + ··· + Xₙ, we might consider directly estimating the univariate distribution of the sum.


6.1. Given marginals and linear correlations. Suppose, however, we are required to construct a multivariate distribution F in Rⁿ which is consistent with given marginal distributions F₁, ..., Fₙ and a linear correlation matrix ρ. We assume that ρ is a proper linear correlation matrix, by which we mean in the remainder of the paper that it is a symmetric, positive semi-definite matrix with −1 ≤ ρᵢⱼ ≤ 1, i, j = 1, ..., n, and ρᵢᵢ = 1, i = 1, ..., n. Such a matrix will always be the linear correlation matrix of some random vector in Rⁿ, but we must check that it is compatible with the given marginals. Our problem is to find a multivariate distribution F so that if (X₁, ..., Xₙ)ᵗ has distribution F the following conditions are satisfied:

Xᵢ ~ Fᵢ, i = 1, ..., n;  (22)

ρ(Xᵢ, Xⱼ) = ρᵢⱼ, i, j = 1, ..., n.  (23)

In the bivariate case, provided the prespecified correlation is attainable, the construction is simple and relies on the following.

Theorem 6. Let F₁ and F₂ be two univariate distributions and ρ_min and ρ_max the corresponding minimal and maximal linear correlations. Let ρ ∈ [ρ_min, ρ_max]. Then the bivariate mixture distribution given by

F(x₁, x₂) = λF_ℓ(x₁, x₂) + (1 − λ)F_u(x₁, x₂),  (24)

where λ = (ρ_max − ρ)/(ρ_max − ρ_min), F_ℓ(x₁, x₂) = max{F₁(x₁) + F₂(x₂) − 1, 0} and F_u(x₁, x₂) = min{F₁(x₁), F₂(x₂)}, has marginals F₁ and F₂ and linear correlation ρ.

Proof. Follows easily from Theorem 4.

Remark 3. A similar result to the above holds for rank correlations when ρ_min and ρ_max are replaced by −1 and 1 respectively.

Remark 4. Also note that the mixture distribution is not the unique distribution satisfying our conditions. If ρ ≥ 0 the distribution

F(x₁, x₂) = λF₁(x₁)F₂(x₂) + (1 − λ)F_u(x₁, x₂),  (25)

with λ = (ρ_max − ρ)/ρ_max, also has marginals F₁ and F₂ and correlation ρ. Many other mixture distributions (e.g. mixtures of distributions with Gumbel copulas) are possible.

Simulation of one random variate from the mixture distribution in Theorem 6 is achieved with the following algorithm:

1. Simulate U₁, U₂ independently from the standard uniform distribution.
2. If U₁ ≤ λ take (X₁, X₂)ᵗ = (F₁⁻¹(U₂), F₂⁻¹(1 − U₂))ᵗ.
3. If U₁ > λ take (X₁, X₂)ᵗ = (F₁⁻¹(U₂), F₂⁻¹(U₂))ᵗ.
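The three steps above can be sketched in vectorized form. In the example below (numpy assumed) both marginals are Exp(1), an arbitrary illustrative choice; ρ_max = 1 since the marginals are identical, and ρ_min is obtained by numerical integration over the countermonotonic pair:

```python
import numpy as np

# Mixture simulation of Theorem 6 with F1 = F2 = Exp(1).
rng = np.random.default_rng(2)
Finv = lambda u: -np.log1p(-u)            # quantile function of Exp(1)

# rho_min = corr(F^{-1}(U), F^{-1}(1-U)) by midpoint-rule integration
u = (np.arange(1_000_000) + 0.5) / 1_000_000
a, b = Finv(u), Finv(1 - u)
rho_min = (np.mean(a * b) - np.mean(a) * np.mean(b)) / (np.std(a) * np.std(b))

rho, rho_max = 0.2, 1.0                    # target correlation (illustrative)
lam = (rho_max - rho) / (rho_max - rho_min)

n = 1_000_000
u1, u2 = rng.random(n), rng.random(n)
counter = u1 <= lam                        # with prob lam: countermonotonic pair
x1 = Finv(u2)
x2 = np.where(counter, Finv(1 - u2), Finv(u2))
corr_sample = np.corrcoef(x1, x2)[0, 1]
print(rho_min, corr_sample)
```

For Exp(1) marginals ρ_min = 1 − π²/6 ≈ −0.645, so the attainable interval is [−0.645, 1] and the target ρ = 0.2 is feasible; the sample correlation of the mixture recovers it.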

Constructing a multivariate distribution in the case n ≥ 3 is more difficult. For the existence of a solution it is certainly necessary that ρ_min(Fᵢ, Fⱼ) ≤ ρᵢⱼ ≤ ρ_max(Fᵢ, Fⱼ), i ≠ j, so that the pairwise constraints are satisfied. In the bivariate case this is sufficient for the existence of a solution to the problem described by (22) and (23), but in the general case it is not sufficient, as the following example shows.

Example 8. Let F₁, F₂ and F₃ be Lognormal(0, 1) distributions. Suppose that ρ is such that ρᵢⱼ is equal to the minimum attainable correlation for a pair of Lognormal(0, 1) random variables (≈ −0.368) if i ≠ j and ρᵢⱼ = 1 if i = j. This is both a proper correlation matrix and a correlation matrix satisfying the pairwise constraints for lognormal random variables. However, since ρ₁₂, ρ₁₃ and ρ₂₃ are all minimum attainable correlations, Theorem 4 implies that X₁, X₂ and X₃ are pairwise countermonotonic random variables. Such a situation is unfortunately impossible, as is clear from the following proposition.

Proposition 4. Let X, Y and Z be random variables with joint distribution F and continuous marginals F₁, F₂ and F₃.

1. If (X, Y) and (Y, Z) are comonotonic then (X, Z) is also comonotonic and F(x, y, z) = min{F₁(x), F₂(y), F₃(z)}.

2. If (X, Y) is comonotonic and (Y, Z) is countermonotonic then (X, Z) is countermonotonic and F(x, y, z) = max{0, min{F₁(x), F₂(y)} + F₃(z) − 1}.

3. If (X, Y) and (Y, Z) are countermonotonic then (X, Z) is comonotonic and F(x, y, z) = max{0, min{F₁(x), F₃(z)} + F₂(y) − 1}.

Proof. We show only the first part of the proposition, the proofs of the other parts being similar. Using (14) we know that Y = S(X) a.s. and Z = T(Y) a.s., where S, T: R → R are increasing functions. It is clear that Z = T ∘ S(X) a.s. with T ∘ S increasing, so that X and Z are comonotonic. Now let x, y, z ∈ R and, since (X, Z) is also comonotonic, we may without loss of generality assume that F₁(x) ≤ F₂(y) ≤ F₃(z). Assume for simplicity, but without loss of generality, that Y = S(X) and Z = T(Y) (i.e. ignore the "almost surely"). It follows that {X ≤ x} ⊆ {Y ≤ y} and {Y ≤ y} ⊆ {Z ≤ z}, so that

F(x, y, z) = P[X ≤ x] = F₁(x).

Example 9. Continuity of the marginals is an essential assumption in this proposition. It does not necessarily hold for discrete distributions, as the next counterexample shows. Consider the multivariate two-point distribution given by

P[(X, Y, Z)ᵗ = (0, 0, 1)ᵗ] = 0.5,

P[(X, Y, Z)ᵗ = (1, 0, 0)ᵗ] = 0.5.

(X, Y) and (Y, Z) are comonotonic but (X, Z) is countermonotonic.

The proposition now permits us to state a result concerning existence and uniqueness of solutions to the problem given by (22) and (23) in the special case where the random variables are either pairwise comonotonic or countermonotonic.

Theorem 7. [Tiit (1996)] Let F₁, ..., Fₙ, n ≥ 3, be continuous distributions and let ρ be a (proper) correlation matrix satisfying the following conditions for all i ≠ j, i ≠ k and j ≠ k:

• ρᵢⱼ ∈ {ρ_min(Fᵢ, Fⱼ), ρ_max(Fᵢ, Fⱼ)},
• If ρᵢⱼ = ρ_max(Fᵢ, Fⱼ) and ρᵢₖ = ρ_max(Fᵢ, Fₖ) then ρⱼₖ = ρ_max(Fⱼ, Fₖ),
• If ρᵢⱼ = ρ_max(Fᵢ, Fⱼ) and ρᵢₖ = ρ_min(Fᵢ, Fₖ) then ρⱼₖ = ρ_min(Fⱼ, Fₖ),
• If ρᵢⱼ = ρ_min(Fᵢ, Fⱼ) and ρᵢₖ = ρ_min(Fᵢ, Fₖ) then ρⱼₖ = ρ_max(Fⱼ, Fₖ).

Then there exists a unique distribution with marginals F₁, ..., Fₙ and correlation matrix ρ. This distribution is known as an extremal distribution. In Rⁿ there are 2ⁿ⁻¹ possible extremal distributions.

Proof. Without loss of generality suppose that

ρ₁ⱼ = ρ_max(F₁, Fⱼ) for 2 ≤ j ≤ m ≤ n,
ρ₁ⱼ = ρ_min(F₁, Fⱼ) for m < j ≤ n,

for some 2 ≤ m ≤ n. The conditions of the theorem ensure that the pairwise relationship of any two margins is determined by their pairwise relationship to the first margin. The margins for which ρ₁ⱼ takes a maximal value form an equivalence class, as do the margins for which ρ₁ⱼ takes a minimal value. The joint distribution must be such that (X₁, ..., Xₘ) are pairwise comonotonic, (Xₘ₊₁, ..., Xₙ) are pairwise comonotonic, but two random variables taken from different groups are countermonotonic. Let U ~ U(0, 1). Then the random vector

(F₁⁻¹(U), F₂⁻¹(U), ..., Fₘ⁻¹(U), Fₘ₊₁⁻¹(1 − U), ..., Fₙ⁻¹(1 − U))ᵗ

has the required joint distribution. We use a similar argument to Proposition 4 and assume, without loss of generality, that

min_{1≤i≤m} {Fᵢ(xᵢ)} = F₁(x₁),  min_{m<i≤n} {Fᵢ(xᵢ)} = Fₘ₊₁(xₘ₊₁).

It is clear that the distribution function is

F(x₁, ..., xₙ) = P[X₁ ≤ x₁, Xₘ₊₁ ≤ xₘ₊₁] = max{0, min_{1≤i≤m} {Fᵢ(xᵢ)} + min_{m<i≤n} {Fᵢ(xᵢ)} − 1},

which in addition shows uniqueness of distributions with pairwise extremal correlations.

Let Gⱼ, j = 1, ..., 2ⁿ⁻¹, be the extremal distributions with marginals F₁, ..., Fₙ and correlation matrix ρⱼ. Convex combinations

G = Σⱼ λⱼGⱼ, λⱼ ≥ 0, Σⱼ λⱼ = 1  (j = 1, ..., 2ⁿ⁻¹),

also have the same marginals and correlation matrix given by ρ = Σⱼ λⱼρⱼ. If we can decompose an arbitrary correlation matrix ρ in this way, then we can use a convex combination of extremal distributions to construct a distribution which solves our problem. In Tiit (1996) this idea is extended to quasi-extremal distributions. Quasi-extremal random vectors contain sub-vectors which are extremal as well as sub-vectors which are independent.

A disadvantage of the extremal (and quasi-extremal) distributions is the fact that they have no density, since they place all their mass on edges in Rⁿ. However, one can certainly think of practical examples where such distributions might still be highly relevant.
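The proof's construction can be simulated directly. The sketch below (numpy/scipy assumed) takes n = 3 with Lognormal(0, 1) marginals, the first two components comonotonic and the third countermonotonic to them; the extreme correlations ρ_max = 1 and ρ_min = (e⁻¹ − 1)/(e − 1) ≈ −0.368 follow from Example 5 with σ = 1:

```python
import numpy as np
from scipy.stats import norm

# Extremal random vector (F^{-1}(U), F^{-1}(U), F^{-1}(1-U)) with
# F = Lognormal(0,1): components 1, 2 comonotonic, component 3
# countermonotonic to both.
rng = np.random.default_rng(3)
u = rng.uniform(1e-12, 1 - 1e-12, 2_000_000)
x1 = np.exp(norm.ppf(u))
x2 = np.exp(norm.ppf(u))          # same increasing transform of U: comonotonic
x3 = np.exp(norm.ppf(1 - u))      # countermonotonic to x1 and x2

C = np.corrcoef([x1, x2, x3])
rho_min = (np.exp(-1) - 1) / (np.e - 1)   # formula of Example 5, sigma = 1
print(C.round(3), rho_min)
```

Note that a correlation matrix with all three off-diagonal entries equal to ρ_min, as in Example 8, can never be produced this way: some pair is always forced to be comonotonic.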

Example 10. Consider two portfolios of credit risks. In the first portfolio we have risks from country A, in the second risks from country B. Portfolio A has a profit-and-loss distribution F₁ and portfolio B a profit-and-loss distribution F₂. With probability p the results move in the same direction (comonotonicity); with probability 1 − p they move in opposite directions (countermonotonicity). This situation can be modelled with the distribution

F(x₁, x₂) = p·min{F₁(x₁), F₂(x₂)} + (1 − p)·max{F₁(x₁) + F₂(x₂) − 1, 0},

and can of course be generalized to more than two portfolios.


6.2. Given marginals and Spearman's rank correlations. This problem has been considered in Iman and Conover (1982), and their algorithm forms the basis of the @RISK computer program (Palisade 1997).

It is clear that a Spearman's rank correlation matrix is also a linear correlation matrix (Spearman's rank correlation being defined as the linear correlation of ranks). It is not known to us whether a linear correlation matrix is necessarily a Spearman's rank correlation matrix. That is, given an arbitrary symmetric, positive semi-definite matrix with unit elements on the diagonal and off-diagonal elements in the interval [−1, 1], can we necessarily find a random vector with continuous marginals for which this is the rank correlation matrix, or alternatively a multivariate distribution for which this is the linear correlation matrix of the copula? If we estimate a rank correlation matrix from data, is it guaranteed that the estimate is itself a rank correlation matrix? A necessary condition is certainly that the estimate is a linear correlation matrix, but we do not know if this is sufficient.

existence of a multivariate distribution with prescribed marginals is solved. Thechoice of marginals is in fact irrelevant and imposes no extra consistency conditionson the matrix.Iman and Conover (1982) do not attempt to �nd a multivariate distribution which

has exactly the given rank correlation matrix �. They simulate a standard multivari-ate normal variate (X1; : : : ; Xn)

t with linear correlation matrix � and then transformthe marginals to obtain (Y1; : : : ; Yn)

t = (F�11 (�(Xi)); : : : ; F

�1n (�(Xn)))

t.The rankcorrelation matrix of Y is identical to that of X. Now because of (8)

�S(Yi; Yj) = �S(Xi; Xj) =6

�arcsin

�(Xi; Xj)

2� �(Xi; Xj);

and, in view of the bounds for the absolute error,���� 6� arcsin�

2� �

���� � 0:0181; � 2 [�1; 1];

and for the relative error, �� 6�arcsin �

2� �

��j�j � � � 3

�;

the rank correlation matrix of Y is very close to that which we desire. In thecase when the given matrix belongs to an extremal distribution (i.e. comprises onlyelements 1 and �1) then the error disappears entirely and we have constructed theunique solution of our problem.This suggests how we can �nd a suÆcient condition for � to be a Spearman's rank

correlation matrix and how, when this condition holds, we can construct a distribu-tion that has the required marginals and exactly this rank correlation matrix. Wede�ne the matrix ~� by

~�ij = 2 sin��ij6; (26)

and check whether this is a proper linear correlation matrix. If so, then the vec-tor (Y1; : : : ; Yn)

t = (F�11 (�(Xi)); : : : ; F

�1n (�(Xn)))

t has rank correlation matrix �,where (X1; : : : ; Xn)

t is a standard multivariate normal variate with linear correla-tion matrix ~�.In summary, a necessary condition for � to be a rank correlation matrix is that it

is a linear correlation matrix and a suÆcient condition is that ~� given by (26) is a

Page 34: corrams - ETH Zembrecht/ftp/pitfalls.pdfgan t theory, whic h is essen tially founded on an assumption of m ultiv ariate normally distributed returns, in order to arriv e at an optimal

34 PAUL EMBRECHTS, ALEXANDER MCNEIL, AND DANIEL STRAUMANN

linear correlation matrix. We are not aware at present of a necessary and suÆcientcondition.A further problem with the approach described above is that we only ever con-

struct distributions which have the dependency structure of the multivariate normaldistribution. This dependency structure is limited as we observed in Example 2; itdoes not permit asymptotic dependency between random variables.
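The construction via (26) can be sketched as follows (numpy/scipy assumed); the target rank correlation ρ_S = 0.5 and the Pareto-type marginals are illustrative choices, rank correlation being invariant under the increasing marginal transforms:

```python
import numpy as np
from scipy.stats import norm, spearmanr

# Target Spearman rank correlation; (26) gives the linear correlation the
# underlying normal variates must have.
rho_s = 0.5
rho_tilde = 2 * np.sin(np.pi * rho_s / 6)

rng = np.random.default_rng(4)
n = 500_000
z1 = rng.standard_normal(n)
z2 = rho_tilde * z1 + np.sqrt(1 - rho_tilde**2) * rng.standard_normal(n)

# Transform to arbitrary continuous marginals (Pareto-type here); the ranks
# are unchanged by the increasing transforms.
y1 = norm.sf(z1) ** -0.5
y2 = norm.sf(z2) ** -2.0
est = spearmanr(y1, y2)[0]
print(rho_tilde, est)
```

The sample Spearman correlation of (Y₁, Y₂) recovers ρ_S = 0.5 exactly (up to Monte Carlo error), even though the linear correlation fed into the normal simulation was the adjusted value 2 sin(π/12) ≈ 0.518.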

6.3. Given marginals and copula. In the case where marginal distributions $F_1, \ldots, F_n$ and a copula $C(u_1, \ldots, u_n)$ are specified, a unique multivariate distribution with distribution function $F(x_1, \ldots, x_n) = C(F_1(x_1), \ldots, F_n(x_n))$ satisfying these specifications can be found. The problem of simulating from this distribution is no longer the theoretical one of whether a solution exists, but rather the technical one of how to perform the simulation. We assume the copula is given in the form of a parametric function which the modeller has chosen; we do not consider the problem of how copulas might be estimated from data, which is certainly more difficult than estimating linear or rank correlations.

Once we have simulated a random vector $(U_1, \ldots, U_n)^t$ from $C$, then the random vector $(F_1^{-1}(U_1), \ldots, F_n^{-1}(U_n))^t$ has distribution $F$. We assume that efficient univariate simulation presents no problem and refer to Ripley (1987), Gentle (1998) or Devroye (1986) for more on this subject. The major technical difficulty lies now in simulating realisations from the copula.

Where possible a transformation method can be applied; that is, we make use of multivariate distributions with the required copula for which a multivariate simulation method is already known. For example, to simulate from the bivariate Gaussian copula it is trivial to simulate $(Z_1, Z_2)^t$ from the standard bivariate normal distribution with correlation $\rho$ and then to transform the marginals with the univariate distribution function, so that $(\Phi(Z_1), \Phi(Z_2))^t$ is distributed according to the desired copula. For the bivariate Gumbel copula a similar approach can be taken.
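As a quick check of this transformation method, the following sketch (our illustration; $\rho = 0.7$ and the evaluation point are arbitrary) simulates from the bivariate Gaussian copula and compares an empirical copula probability with the theoretical value $C(u, v) = \Phi_\rho(\Phi^{-1}(u), \Phi^{-1}(v))$.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

rho = 0.7
cov = [[1.0, rho], [rho, 1.0]]
rng = np.random.default_rng(3)

Z = rng.multivariate_normal([0.0, 0.0], cov, size=100_000)
U = norm.cdf(Z)    # componentwise Phi; each margin of U is U(0,1)

# P[U1 <= u, U2 <= v] should equal the Gaussian copula value
# C(u, v) = Phi_rho(Phi^{-1}(u), Phi^{-1}(v)).
u, v = 0.3, 0.6
emp = np.mean((U[:, 0] <= u) & (U[:, 1] <= v))
theo = multivariate_normal([0.0, 0.0], cov).cdf([norm.ppf(u), norm.ppf(v)])
```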

Example 11. Consider the Weibull distribution having survivor function $\bar{F}_1(x) = 1 - F_1(x) = \exp(-x^\beta)$ for $\beta > 0$, $x \ge 0$. If we apply the Gumbel copula to this survivor function (not to the distribution function) we get a bivariate distribution with Weibull marginals and survivor function

$$\bar{F}(z_1, z_2) = P[Z_1 > z_1, Z_2 > z_2] = C(\bar{F}_1(z_1), \bar{F}_1(z_2)) = \exp(-(z_1 + z_2)^\beta).$$

Lee (1979) describes a method for simulating from this distribution. We take $(Z_1, Z_2)^t = (U S^{1/\beta}, (1 - U) S^{1/\beta})^t$, where $U$ is standard uniform and $S$ is a mixture of Gamma distributions with density $h(s) = (1 - \beta + \beta s) \exp(-s)$ for $s \ge 0$. Then $(\bar{F}_1(Z_1), \bar{F}_1(Z_2))^t$ will have the desired copula distribution.
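Lee's recipe in Example 11 can be sketched as follows (our implementation; $\beta = 0.5$ and the evaluation point are illustrative). Note that $h(s) = (1 - \beta + \beta s)e^{-s}$ is a two-component Gamma mixture, Gamma(1,1) with weight $1 - \beta$ and Gamma(2,1) with weight $\beta$, which presumes $\beta \le 1$ so that the mixture weights are nonnegative.

```python
import numpy as np

rng = np.random.default_rng(1)
beta, n = 0.5, 100_000

U = rng.uniform(size=n)
# Draw S from the Gamma mixture h(s) = (1 - beta + beta*s) * exp(-s):
# shape 2 (Gamma(2,1)) with probability beta, shape 1 (Exp(1)) otherwise.
shape = np.where(rng.uniform(size=n) < beta, 2.0, 1.0)
S = rng.gamma(shape, 1.0)

Z1 = U * S**(1 / beta)
Z2 = (1 - U) * S**(1 / beta)

# Empirical check of the joint survivor function exp(-(z1 + z2)^beta).
z1, z2 = 0.5, 0.5
emp = np.mean((Z1 > z1) & (Z2 > z2))
theo = np.exp(-(z1 + z2)**beta)
```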

Where the transformation method cannot easily be applied, another possible method involves recursive simulation using univariate conditional distributions. We consider the general case $n > 2$ and introduce the notation

$$C_i(u_1, \ldots, u_i) = C(u_1, \ldots, u_i, 1, \ldots, 1), \qquad i = 2, \ldots, n - 1,$$

to represent the $i$-dimensional marginal distributions of $C(u_1, \ldots, u_n)$. We write $C_1(u_1) = u_1$ and $C_n(u_1, \ldots, u_n) = C(u_1, \ldots, u_n)$. Let us suppose now that $(U_1, \ldots, U_n)^t \sim C$; the conditional distribution of $U_i$ given the values of the first $i - 1$ components of $(U_1, \ldots, U_n)^t$ can be written in terms of derivatives and densities


of the $i$-dimensional marginals:

$$C_i(u_i \mid u_1, \ldots, u_{i-1}) = P[U_i \le u_i \mid U_1 = u_1, \ldots, U_{i-1} = u_{i-1}] = \frac{\partial^{i-1} C_i(u_1, \ldots, u_i)}{\partial u_1 \cdots \partial u_{i-1}} \Big/ \frac{\partial^{i-1} C_{i-1}(u_1, \ldots, u_{i-1})}{\partial u_1 \cdots \partial u_{i-1}},$$

provided both numerator and denominator exist. This suggests that, in the case where we can calculate these conditional distributions, we use the algorithm:

• Simulate a value $u_1$ from $U(0, 1)$;
• Simulate a value $u_2$ from $C_2(u_2 \mid u_1)$;
• Continue in this way;
• Simulate a value $u_n$ from $C_n(u_n \mid u_1, \ldots, u_{n-1})$.

To simulate a value from $C_i(u_i \mid u_1, \ldots, u_{i-1})$ we would in general simulate $u$ from $U(0, 1)$ and then calculate $C_i^{-1}(u \mid u_1, \ldots, u_{i-1})$, if necessary by numerical root finding.
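This recursion becomes concrete for a copula whose conditional distributions invert in closed form. The bivariate Clayton copula $C(u_1, u_2) = (u_1^{-\theta} + u_2^{-\theta} - 1)^{-1/\theta}$ is our illustrative choice here (it is not a copula discussed in this paper); differentiating with respect to $u_1$ and solving gives $C_2^{-1}(q \mid u_1) = \bigl(u_1^{-\theta}(q^{-\theta/(1+\theta)} - 1) + 1\bigr)^{-1/\theta}$.

```python
import numpy as np

def clayton_sample(theta, n, rng):
    """Recursive conditional simulation from the bivariate Clayton copula."""
    u1 = rng.uniform(size=n)   # step 1: u1 ~ U(0, 1)
    q = rng.uniform(size=n)    # step 2: draw q and invert C_2(. | u1)
    u2 = (u1**(-theta) * (q**(-theta / (1 + theta)) - 1) + 1)**(-1 / theta)
    return u1, u2

rng = np.random.default_rng(2)
u1, u2 = clayton_sample(2.0, 100_000, rng)
```

For copulas without a closed-form conditional inverse, the same loop applies with the explicit inversion replaced by numerical root finding (e.g. `scipy.optimize.brentq` on $C_i(u_i \mid u_1, \ldots, u_{i-1}) - u$).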

7. Conclusions

In this paper we have shown some of the problems that can arise when the concept of linear correlation is used with non-elliptical multivariate distributions. In the world of elliptical distributions correlation is a natural and elegant summary of dependence, which lends itself to algebraic manipulation and the standard approaches of risk management dating back to Markowitz. In the non-elliptical world our intuition about correlation breaks down and leads to a number of fallacies. The first aim of this paper has been to suggest that practitioners of risk management must be aware of these pitfalls and must appreciate that a deeper understanding of dependency is needed to model the risks of the real world.

The second main aim of this paper has been to address the problem of simulating dependent data with given marginal distributions. This question arises naturally when one contemplates a Monte Carlo approach to determining the risk capital required to cover dependent risks. We have shown that the ideal situation is when the multivariate dependency structure (in the form of a copula) is fully specified by the modeller. Failing this, it is preferable to be given a matrix of rank correlations than a matrix of linear correlations, since rank correlations are defined at a copula level, and we need not worry about their consistency with the chosen marginals. Both correlations are, however, scalar-valued dependence measures, and if there is a multivariate distribution which solves the simulation problem, it will not be the unique solution. The example of the Introduction showed that two distributions with the same correlation can have qualitatively very different dependency structures and, ideally, we should consider the whole dependence structure which seems appropriate for the risks we wish to model.

Acknowledgements. We would like to thank Eduardo Vilela and Rüdiger Frey for fruitful discussions and Roger Kaufmann for careful reading.

References

Artzner, P., F. Delbaen, J.-M. Eber, and D. Heath (1999): "Coherent measures of risk," to appear in Mathematical Finance.
Boyer, B. H., M. S. Gibson, and M. Loretan (1999): "Pitfalls in tests for changes in correlations," International Finance Papers Number 597, Board of Governors of the Federal Reserve System.


Campbell, J. Y., A. W. Lo, and A. MacKinlay (1997): The Econometrics of Financial Markets. Princeton University Press, Princeton.
CAS (1997): CAS Forum Summer 1997: DFA Call Papers, Casualty Actuarial Society.
Delbaen, F. (1999): "Coherent risk measures on general probability spaces," Preprint, ETH Zürich.
Denneberg, D. (1994): Non-additive Measure and Integral. Kluwer Academic Publishers, Dordrecht.
Devroye, L. (1986): Non-uniform Random Variate Generation. Springer, New York.
Dhaene, J., and M. J. Goovaerts (1996): "Dependency of risks and stop-loss order," ASTIN Bulletin, 26(2), 201-212.
Embrechts, P., T. Mikosch, and C. Klüppelberg (1997): Modelling Extremal Events for Insurance and Finance. Springer, Berlin.
Fang, K.-T., S. Kotz, and K.-W. Ng (1987): Symmetric Multivariate and Related Distributions. Chapman & Hall, London.
Frank, M., R. B. Nelsen, and B. Schweizer (1987): "Best-possible bounds for the distribution of a sum: a problem of Kolmogorov," Probability Theory and Related Fields, 74, 199-211.
Frank, M., and B. Schweizer (1979): "On the duality of generalized infimal and supremal convolutions," Rendiconti di Matematica, 12(1), 1-23.
Fréchet, M. (1957): "Les tableaux de corrélation dont les marges sont données," Annales de l'Université de Lyon, Sciences Mathématiques et Astronomie, Série A, 4, 13-31.
Galambos, J. (1987): The Asymptotic Theory of Extreme Order Statistics. Kreiger Publishing Co., Melbourne, FL.
Gentle, J. E. (1998): Random Number Generation and Monte Carlo Methods. Springer, New York.
Gibbons, J. D. (1988): Nonparametric Statistical Inference. Dekker, New York.
Harlow, W. (1991): "Asset allocation in a downside-risk framework," Financial Analysts Journal, 47(5), 28-40.
Höffding, W. (1940): "Massstabinvariante Korrelationstheorie," Schriften des Mathematischen Seminars und des Instituts für Angewandte Mathematik der Universität Berlin, 5, 181-233.
Hutchinson, T. P., and C. D. Lai (1990): Continuous Bivariate Distributions, Emphasizing Applications. Rumsby Scientific Publishing, Adelaide.
Iman, R. L., and W. Conover (1982): "A distribution-free approach to inducing rank correlation among input variables," Communications in Statistics: Simulation and Computation, 11, 311-334.
Joag-dev, K. (1984): "Measures of dependence," in Handbook of Statistics, ed. by P. R. Krishnaiah, vol. 4, pp. 79-88. North-Holland/Elsevier, New York.
Joe, H. (1997): Multivariate Models and Dependence Concepts. Chapman & Hall, London.
Kelker, D. (1970): "Distribution theory of spherical distributions and a location-scale parameter generalization," Sankhyā A, 32, 419-430.
Kendall, M., and A. Stuart (1979): Handbook of Statistics. Griffin & Company, London.
Kimeldorf, G., and A. R. Sampson (1978): "Monotone dependence," Annals of Statistics, 6, 895-903.
Lee, L. (1979): "Multivariate distributions having Weibull properties," Journal of Multivariate Analysis, 9, 267-277.
Li, H., M. Scarsini, and M. Shaked (1996): "Bounds for the distribution of a multivariate sum," in Distributions with Fixed Marginals and Related Topics, ed. by L. Rüschendorf, B. Schweizer, and M. D. Taylor, pp. 198-212, Hayward, CA. Institute of Mathematical Statistics.
Lowe, S. P., and J. N. Stanard (1997): "An integrated dynamic financial analysis and decision system for a property catastrophe insurer," ASTIN Bulletin, 27(2), 339-371.
Makarov, G. (1981): "Estimates for the distribution function of a sum of two random variables when the marginal distributions are fixed," Theory of Probability and its Applications, 26, 803-806.
Marshall, A. W. (1996): "Copulas, marginals and joint distributions," in Distributions with Fixed Marginals and Related Topics, ed. by L. Rüschendorf, B. Schweizer, and M. D. Taylor, pp. 213-222, Hayward, CA. Institute of Mathematical Statistics.
Mikusinski, P., H. Sherwood, and M. Taylor (1992): "The Fréchet bounds revisited," Real Analysis Exchange, 17, 759-764.


Müller, A., and N. Bäuerle (1998): "Modelling and comparing dependencies in multivariate risk portfolios," ASTIN Bulletin, 28(1), 59-76.
Nelsen, R. B. (1999): An Introduction to Copulas. Springer, New York.
Palisade (1997): Manual for @RISK. Palisade Corporation, Newfield, NY.
Resnick, S. I. (1987): Extreme Values, Regular Variation and Point Processes. Springer, New York.
Ripley, B. D. (1987): Stochastic Simulation. Wiley, New York.
Schweizer, B., and A. Sklar (1983): Probabilistic Metric Spaces. North-Holland/Elsevier, New York.
Schweizer, B., and E. Wolff (1981): "On nonparametric measures of dependence for random variables," Annals of Statistics, 9, 879-885.
Sibuya, M. (1961): "Bivariate extreme statistics," Annals of Mathematical Statistics, 11, 195-210.
Tiit, E. (1996): "Mixtures of multivariate quasi-extremal distributions having given marginals," in Distributions with Fixed Marginals and Related Topics, ed. by L. Rüschendorf, B. Schweizer, and M. D. Taylor, pp. 337-357, Hayward, CA. Institute of Mathematical Statistics.
Tjøstheim, D. (1996): "Measures of dependence and tests of independence," Statistics, 28, 249-284.
Wang, S. (1997): "Aggregation of correlated risk portfolios: models and algorithms," Preprint, Casualty Actuarial Society (CAS).
Wang, S., and J. Dhaene (1998): "Comonotonicity, correlation order and premium principles," Insurance: Mathematics and Economics, 22, 235-242.
Williamson, R. C., and T. Downs (1990): "Probabilistic arithmetic: numerical methods for calculating convolutions and dependency bounds," International Journal of Approximate Reasoning, 4, 89-158.
Yaari, M. (1987): "The dual theory of choice under risk," Econometrica, 55, 95-115.

(Paul Embrechts) DEPARTEMENT MATHEMATIK, ETHZ, CH-8092 ZÜRICH, SWITZERLAND

E-mail address: [email protected]

(Alexander McNeil) DEPARTEMENT MATHEMATIK, ETHZ, CH-8092 ZÜRICH, SWITZERLAND

E-mail address: [email protected]

(Daniel Straumann) DEPARTEMENT MATHEMATIK, ETHZ, CH-8092 ZÜRICH, SWITZERLAND

E-mail address: [email protected]