
Improving Value at Risk Calculations by Using Copulas

and Non-Gaussian Margins

Dr J. R.

New College

University of Oxford

A thesis submitted in partial fulfillment of the requirements for the MSc in

Mathematical Finance

November 30, 2002

MSc in Mathematical Finance

New College

OXFORD UNIVERSITY

Authenticity Statement

I hereby declare that this dissertation is my own work and confirm its authenticity.

Name Dr J. R.

Address .........

Signed

Date

Abstract

Value at risk (VaR) is of central importance in modern financial risk management.

Of the various methods that exist to compute the VaR, the most popular are historical

simulation, the variance-covariance method and Monte Carlo (MC) simulation. While

historical simulation is not based on particular assumptions as to the behaviour of

the risk factors, the two other methods assume some kind of multinormal distribution

of the risk factors. Therefore the dependence structure between different risk factors

is described by the covariance or correlation between these factors.

It is shown in [1, 2] that the concept of correlation entails several pitfalls. As a

consequence, copulas are proposed to describe the dependence between n variables

with arbitrary marginal distributions. A copula is a function C : [0, 1]^n → [0, 1]

with certain special properties [3], so that the joint distribution can be written as

IP(R1 ≤ r1, . . . , Rn ≤ rn) = C (F1(r1), . . . , Fn(rn)). F1, . . . , Fn denote the cumulative

probability functions of the n variables. In general, a copula C depends on one

or more parameters p1, . . . , pk that determine the dependence between the variables

r1, . . . , rn. In this sense, these parameters assume the role of correlations.

The second pitfall that arises from the multinormal distribution ansatz is the fat

tail problem of the margins [4, 5]. One way of taking extreme values better into

account is to assume that the risk factors obey Student t-distributions instead of Gaussian distributions.

In this thesis we investigate two risk factors only. We model each risk factor

independently using a Student distribution and describe their dependence by both

the Frank copula [6] and the Gumbel-Hougaard copula [7]. We present algorithms

to estimate the parameters of the margins and the copulas and to generate pseudo

random numbers according to a copula dependence structure.

Making use of historical data spanning a period of nineteen years, we compute the

VaR using a copula-modified MC algorithm. To see the advantage of this method, we

compare our results with VaR results obtained from the three standard methods men-

tioned at the beginning. Based on backtesting results, we find that the copula method

is more reliable than both traditional MC simulation and the variance-covariance

method and about as good as historical simulation.

To Christiane and Maximilian

Acknowledgements

I would like to express my gratitude towards my current employer d-fine GmbH and my former employer Arthur Andersen Wirtschaftsprüfungsgesellschaft, Steuerberatungsgesellschaft mbH for financing this course

and giving me the necessary time off work. In particular I am very grateful

to Dr Hans-Peter Deutsch, head of d-fine GmbH and former head of the

Financial and Commodity Risk Consulting Group of Arthur Andersen in

Germany, who has given me the opportunity to participate in this course.

In addition, I would like to thank Dr Jeff Dewynne, Dr Sam Howison and

all other lecturers for plenty of interesting lectures during the modules.

Of course, many thanks also go to Ms Tiffany Fliss, Ms Anna Turner and

Ms Roz Sainty for organising the whole course so well and reliably.

Furthermore, I am very grateful to Dr Thomas Siegl who brought my

attention to the mathematical theory of copulas and who acted as a proofreader of this thesis.

My special word of thanks is dedicated to my beloved wife Christiane for

all her patience during the many weekends that I had to spend on writing

this thesis and on other MSc-related work.

Contents

1 Introduction 1

2 Traditional Monte Carlo Simulation of the Value at Risk 5

3 Basic Features of Copulas 9

3.1 Definition of a Copula . . . . . . . . . . . . . . . . . . . . . . . . . . 9

3.2 Examples of Copulas . . . . . . . . . . . . . . . . . . . . . . . . . . . 10

3.2.1 The Product Copula . . . . . . . . . . . . . . . . . . . . . . . 10

3.2.2 Bounding Copulas . . . . . . . . . . . . . . . . . . . . . . . . 11

3.2.3 The Frank Copula . . . . . . . . . . . . . . . . . . . . . . . . 11

3.2.4 The Gumbel-Hougaard Copula . . . . . . . . . . . . . . . . . 12

3.2.5 The Gaussian Copula . . . . . . . . . . . . . . . . . . . . . . . 13

3.3 Sklar’s Theorem and Further Important Properties of Copulas . . . . 14

3.4 Copulas and Random Variables . . . . . . . . . . . . . . . . . . . . . 17

4 Generation of Pseudo Random Numbers Using Copulas 19

4.1 The General Method . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

4.2 t-Distributed Pseudo Random Numbers with a Copula Dependence

Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20

5 Numerical Results 24

6 Summary and Conclusions 38

A Generation of Pseudo Random Numbers 41

A.1 Transformation of Variables . . . . . . . . . . . . . . . . . . . . . . . 41

A.2 The Box-Muller Method . . . . . . . . . . . . . . . . . . . . . . . . . 42

A.3 Pseudo Random Numbers According to the Multinormal Distribution 42


B The t-Distribution 45

B.1 The Density Function . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

B.2 The Distribution Function . . . . . . . . . . . . . . . . . . . . . . . . 47

C The Maximum Likelihood Method 48

C.1 Theoretical Background . . . . . . . . . . . . . . . . . . . . . . . . . 48

C.2 Estimation of the Degrees of Freedom of the t-Distribution Using the

ML Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50

C.3 ML Estimation of the Copula Parameter . . . . . . . . . . . . . . . . 52

Bibliography 54


Chapter 1

Introduction

In modern financial risk management, one fundamental quantity of interest is value

at risk (VaR). It denotes the maximal amount by which a portfolio of financial assets might fall, with a given probability, during a particular time horizon. For the

sake of simplification we will only consider time horizons of one day’s length in this

thesis. This covers portfolios that include assets that can be sold in one day. The

discussions would have to be modified slightly if we were interested in longer time

horizons.

Typical choices of the probabilities that are connected with VaRs are confidence

levels of 90%, 95% and 99%, respectively. If a risk manager is free to choose,1 the

particular value he might select will depend on weighing the reliability that he ex-

pects from the VaR prediction against the size of the resulting VaR. Using a high

confidence level, e.g., 99%, will result in a higher absolute VaR compared with a 90%

VaR. If the risk manager compares the predicted VaRs of the portfolio with the real

losses of the following business days over a sufficiently long period of time, he will

observe that about 1% of the real losses exceed the calculated 99% VaRs, assuming the method the VaR predictions are based on is reliable. However, if he had chosen

a confidence level of 90%, backtesting would show underestimated losses on about 10% of the days.

In statistics, the gap between the confidence level and 100% is called the quantile

and usually denoted by the Greek letter α. Once the risk manager has chosen a

confidence level 1 − α, it is vital that he can trust the predicted VaRs, i.e., that

he can be sure that the fraction of underestimated losses is in fact very close to α.

Otherwise, the larger the difference between the fraction of underestimated losses

1 In some countries the confidence level to choose is prescribed by law. For example, in Germany a confidence level of 99% is required for internal models [8].


and α, the less useful the VaR is for managing the risk of the portfolio. Therefore, a

reliable method is needed to compute the VaR.

In the literature, several methods to calculate the VaR are known. The most popular

are

• historical simulation,

• the variance-covariance method and

• Monte Carlo (MC) simulation.

As it is not within the scope of this thesis, we will not explain the first two methods.

Excellent descriptions of these and of what we will term “traditional” Monte Carlo

can be found in [4, 9].

We next present general information on how to compute the VaR of a portfolio

using Monte Carlo simulation. The value of the portfolio at present time t will be

denoted by Vt. Let us assume that Vt depends on n risk factors which might be

(relative) interest rates, foreign exchange (FX) rates, share prices, etc. Then a Monte

Carlo computation of the VaR would consist of the following steps:

1. Choose the level of confidence 1 − α to which the VaR refers.

2. Simulate the evolution of the risk factors from time t to time t+1 by generating

n-tuples of pseudo random numbers (PRNs) with appropriate marginal distri-

butions and an appropriate joint distribution that describe the behaviour of the

risk factors. The number m of these n-tuples has to be large enough (typically

m = O(1000)) to obtain sufficient statistics in step 5.

3. Calculate the m different values of the portfolio at time t + 1 using the values

of the simulated n-tuples of the risk factors. Let us denote these values by

Vt+1,1, Vt+1,2, . . . , Vt+1,m.

4. Calculate the simulated profits and losses, i.e., the differences between the sim-

ulated future portfolio values and the present portfolio value, ∆Vi = Vt+1,i − Vt

for i = 1, . . . ,m.

5. Discard the fraction α of the worst changes ∆V_i. The minimum of the remaining ∆V_i is then the VaR of the portfolio at time t. It will be denoted by VaR(α, t, t + 1).
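To make steps 4 and 5 concrete, the following is a minimal numpy sketch that computes VaR(α, t, t + 1) from m simulated future portfolio values; the function name and arguments are illustrative, not part of the thesis.

```python
import numpy as np

def var_from_simulation(v_t, v_t1_simulated, alpha):
    """Steps 4 and 5: VaR(alpha, t, t+1) from m simulated values V_{t+1,i}."""
    delta_v = np.sort(np.asarray(v_t1_simulated) - v_t)  # simulated P&L, worst changes first
    n_ignore = int(alpha * len(delta_v))                 # discard the fraction alpha of worst changes
    return delta_v[n_ignore]                             # minimum of the remaining changes

# e.g. a 99% VaR (alpha = 1%) from m = 1500 simulated portfolio values:
# var_99 = var_from_simulation(v_t, v_t1_simulated, 0.01)
```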


As time evolves from t to t + 1, the real value of the (unchanged) port-

folio changes from Vt to Vt+1. With this data at hand, one can backtest VaR(α, t, t+1)

by comparing it with ∆V = Vt+1 − Vt.

It is obvious that the central point in the above algorithm is step two, the genera-

tion of the PRNs according to the appropriate distributions. Therefore the following

questions have to be answered:

Question 1 What are the appropriate marginal distributions of the risk factors under

consideration?

Question 2 What is the appropriate joint distribution of the risk factors under con-

sideration?

Question 3 Is it possible to simulate PRNs using these marginal distributions and

this joint distribution?

Finding answers to the above questions for two risk factors is the main objective of

this work. For this purpose, the thesis is organised as follows:

In the following chapter, the traditional Monte Carlo method is presented. In Chap-

ter 3, copulas are defined. Examples of these functions are given and the basic prop-

erties are summarised. Copulas can be used to describe the dependence structure

between two or more variables. In this sense they replace the concept of covari-

ance or correlation that is usually used to describe the dependence between variables.

Chapter 4 is concerned with the simultaneous generation of pairs of PRNs whose

dependence structure is determined by a copula.

After the theoretical introduction we present in Chapter 5 numerical results of a

VaR Monte Carlo simulation based on copulas. The portfolio for which the VaR is

calculated is affected by two risk factors only. The first risk factor we have used is the

foreign exchange (FX) rate GBP/DEM.2 The second risk factor in our calculations

is one of the FX rates USD/DEM, JPY/DEM, CHF/DEM and FRF/DEM. The

FX rates that we have used are daily data ranging from 02/01/1980 to 30/12/1998

[10]. For the marginal distributions of the two risk factors we have used the Student

distribution. To test the quality of the copula MC simulation, we backtest our data

2 The abbreviations that we use for the currencies are the corresponding ISO codes: CHF (Swiss franc), DEM (German mark), FRF (French franc), GBP (British pound sterling), JPY (Japanese yen), USD (United States dollar)

3 With the start of European Monetary Union on 01/01/1999, the German mark and the French franc ceased to exist as independent currencies. Therefore we have chosen 30/12/1998 as the final day of our investigation.


and compare the results with numerical VaR predictions obtained from the traditional

MC simulation, from the variance-covariance method and from historical simulation.

Finally, we give a summary and our conclusions in Chapter 6.

The appendix at the end of this thesis contains some well known statistical tools

that we used to obtain our numerical results. It starts with a short overview of

the generation of pseudo random numbers, followed by a section on the Student

distribution and ends with a description of the maximum likelihood method.


Chapter 2

Traditional Monte Carlo Simulation of the Value at Risk

In this chapter we recall the joint simulation of risk factors which is used with most

Monte Carlo based methods for computing the VaR. In other words, we want

to explain the second step on page 2 in more detail.

As the algorithm we will present next is the one that is most often used in practice,

we will refer to it as the “traditional method”. This is to distinguish it from the new

method which will be presented in Chapter 4 and which we will call the “copula

method”.

Let us briefly recapitulate the content of the second step mentioned above. It

states that we have to generate n-tuples of PRNs with appropriate marginal distri-

butions and an appropriate joint distribution that describe the behaviour of the risk

factors. If we only take FX rates as risk factors into account, the traditional ansatz

to achieve these PRNs consists of the following steps:

1. Collect historical data of the n FX rates, i.e., n time series spanning N + 1

business days. We denote these data by xi,0, xi,1, . . . , xi,N for i = 1, . . . , n,

where today’s data are given by the xi,N ’s.

A typical choice for N + 1 is, for example, N + 1 = 250.

2. Assuming x_{i,j} ≠ 0, compute the relative changes in the data:

r_{i,j} = \frac{x_{i,j} - x_{i,j-1}}{x_{i,j-1}} \qquad \text{for } i = 1, \ldots, n \text{ and } j = 1, \ldots, N . \quad (2.1)

For each i = 1, . . . , n, the ri,1, . . . , ri,N will be considered as elements belonging

to a random variable ri.


3. The next step is to make an assumption as to the marginal distributions f1, . . . ,

fn of the random variables r1, . . . , rn.

As the r1, . . . , rn originally come from FX rates, the assumption usually made in

finance at this point is that the f_i's are given by normal distributions N(µ_i, σ_i^2),

f_i(r_i) = \frac{1}{\sqrt{2\pi\sigma_i^2}} \exp\left\{ -\frac{(r_i - \mu_i)^2}{2\sigma_i^2} \right\} \quad (2.2)

for i = 1, . . . , n. In (2.2), µ_i and σ_i^2 denote the mean and the variance of r_i,

\mu_i = \mathrm{IE}(r_i) , \qquad \sigma_i^2 = \mathrm{IE}\big((r_i - \mu_i)^2\big) . \quad (2.3)

4. In general, the parameters of the marginal distributions are unknown. We have

to determine estimators for these parameters from the historical data.

In the case of normal distributions, this task is easy4 because the parameters

of the normal distributions are the mean and the variance of the distribution.

Their estimators can be calculated as

\hat{\mu}_i = \frac{1}{N} \sum_{j=1}^{N} r_{i,j} \qquad \text{and} \quad (2.4)

\hat{\sigma}_i^2 = \frac{1}{N-1} \sum_{j=1}^{N} (r_{i,j} - \hat{\mu}_i)^2 \quad (2.5)

for i = 1, . . . , n.

Note that the mean and the variance of an arbitrary distribution, provided that

they are well defined, can also be estimated by (2.4) and (2.5). This is essentially

a consequence of the law of large numbers. In addition, these estimators are

unbiased and consistent [11].

5. As stated above, we want to generate n-tuples of PRNs according to a joint

distribution. Therefore we have to make a further assumption, i.e., an assump-

tion that describes the dependence structure of the random variables. As the

marginal distributions have already been chosen, we are, of course, not com-

pletely free to choose the joint distribution. If f(~r) denotes the joint distribu-

tion, we have to ensure that the following condition is true for each i = 1, . . . , n:

f_i(r_i) = \int_{-\infty}^{\infty} dr_1 \cdots \int_{-\infty}^{\infty} dr_{i-1} \int_{-\infty}^{\infty} dr_{i+1} \cdots \int_{-\infty}^{\infty} dr_n \; f(\vec{r}) . \quad (2.6)

4 “Easy” in the sense that we can calculate the estimators directly from the historical data without the use of any kind of numerical algorithm (see Appendix C).


Let us return to our traditional Monte Carlo simulation. Here we assume at

this point a multinormal distribution,

f(\vec{r}) = \frac{1}{\sqrt{(2\pi)^n \det C}} \exp\left\{ -\frac{1}{2} (\vec{r} - \vec{\mu})^t \, C^{-1} \, (\vec{r} - \vec{\mu}) \right\} , \quad (2.7)

where

\vec{r} = \begin{pmatrix} r_1 \\ \vdots \\ r_n \end{pmatrix} , \qquad \vec{\mu} = \begin{pmatrix} \mu_1 \\ \vdots \\ \mu_n \end{pmatrix} , \quad (2.8)

and C denotes the covariance matrix,

C = \begin{pmatrix} \sigma_1^2 & c_{1,2} & c_{1,3} & \cdots & c_{1,n} \\ c_{1,2} & \sigma_2^2 & c_{2,3} & \cdots & c_{2,n} \\ c_{1,3} & c_{2,3} & \sigma_3^2 & \ddots & \vdots \\ \vdots & \vdots & \ddots & \ddots & c_{n-1,n} \\ c_{1,n} & c_{2,n} & \cdots & c_{n-1,n} & \sigma_n^2 \end{pmatrix} , \quad (2.9)

where c_{i,j} is the covariance between r_i and r_j,

c_{i,j} = \mathrm{IE}\big((r_i - \mu_i)(r_j - \mu_j)\big) . \quad (2.10)

For the sake of completeness, we introduce the correlations at this point. They

are given by

\rho_{i,j} \equiv \frac{c_{i,j}}{\sigma_i \sigma_j} \qquad \text{if } \sigma_i, \sigma_j \neq 0 . \quad (2.11)

One can easily show that the function f given in (2.7) fulfils condition (2.6).

However, we would like to emphasise that the normal marginal distributions

do not imply a joint distribution of the multinormal kind. This point is a

direct consequence from Sklar’s theorem (see page 17). This is essential for

understanding the copula ansatz. Furthermore, it was explained in [1, 2] that a

dependence structure, which is, like the multinormal distribution, described by

correlation, entails several pitfalls.

6. This step of the algorithm consists of a determination of the covariances (2.10).

In a similar way as the means and variances in step 4, they can be estimated as

\hat{c}_{i,j} = \frac{1}{N-1} \sum_{k=1}^{N} (r_{i,k} - \hat{\mu}_i)(r_{j,k} - \hat{\mu}_j) . \quad (2.12)


7. As we have chosen the marginal distributions (step 3) and a joint distribution

(step 5) and have estimated its parameters (steps 4 and 6) we now have all tools

at hand to generate the desired n-tuples of PRNs. Because this is a standard

technical procedure, we present it in detail in Appendix A.3. The relevant steps

required are given on page 43, steps (a) and (b) for n dimensions, and steps (a)

and (b) for two dimensions only. Let us therefore assume at this point that we

have generated n-tuples of PRNs. We denote these numbers by

\vec{r}^{\,k} = \begin{pmatrix} r_1^k \\ \vdots \\ r_n^k \end{pmatrix} \quad (2.13)

where k = 1, . . . ,m, and m is the number of Monte Carlo iterations.

8. The previous step provided us with m independent n-tuples of PRNs. However,

these numbers are related to the relative changes in the data (see step 2).

Therefore we finally have to compute the absolute values of the simulated risk

factors. From (2.1) and (2.13) the m simulated values of the n risk factors at

time step N + 1 are given by

x_{i,N+1}^k = x_{i,N} \, (1 + r_i^k) \quad (2.14)

where i = 1, . . . , n and k = 1, . . . ,m.
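As a compact illustration of steps 1-8, here is a numpy sketch of the traditional scenario generation under the stated assumptions; numpy's built-in multinormal generator stands in for the explicit construction of Appendix A.3, and all names are illustrative.

```python
import numpy as np

def traditional_mc_scenarios(x, m, seed=0):
    """x: array of shape (n, N+1) holding n historical FX-rate time series.
    Returns m simulated n-tuples of the risk factors at time step N+1."""
    rng = np.random.default_rng(seed)
    r = (x[:, 1:] - x[:, :-1]) / x[:, :-1]         # relative changes (2.1), shape (n, N)
    mu = r.mean(axis=1)                            # mean estimators (2.4)
    cov = np.cov(r)                                # covariance estimators (2.5), (2.12)
    rk = rng.multivariate_normal(mu, cov, size=m)  # m multinormal n-tuples (2.13)
    return x[:, -1] * (1.0 + rk)                   # simulated rates via (2.14), shape (m, n)
```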

So far, we have presented the traditional method of generating PRNs, assuming

that the marginal distributions of the relative risk factors are given by normal dis-

tributions (see step 3, equation (2.2)) and the joint distribution of the relative risk

factors is given by a multinormal distribution (see step 5, equation (2.7)). As al-

ready outlined in the introduction, we want to replace step 3 by the assumption that

each margin obeys a Student distribution. Then we want to replace step 5 by the

assumption that a copula can be used to describe the dependence structure between

the margins. To be able to understand the copula method, we present the basic fea-

tures of copulas in the following chapter. Chapter 4 then shows how to replace the

steps 3–7 from the algorithm above.


Chapter 3

Basic Features of Copulas

In this chapter we summarise the basic definitions that are necessary to understand

the concept of copulas. We then present the most important properties of copulas

that are needed to understand the usage of copulas in finance.

We will follow the notation used in [3]. As it would exceed the scope of this thesis,

we do not present the proofs of the theorems cited. At this point we again refer to the

excellent textbook by Nelsen [3]. Furthermore, we will restrict ourselves to copulas

of two dimensions only. The generalisation to n dimensions can also be found in [3].

3.1 Definition of a Copula

Definition 1 A two-dimensional copula is a function C : [0, 1] × [0, 1] → [0, 1] with

the following properties:5

1. For every u, v ∈ [0, 1]:

C(u, 0) = C(0, v) = 0 . (3.1)

2. For every u, v ∈ [0, 1]:

C(u, 1) = u and C(1, v) = v . (3.2)

3. For every u1, u2, v1, v2 ∈ [0, 1] where u1 ≤ u2 and v1 ≤ v2:

C(u2, v2) − C(u2, v1) − C(u1, v2) + C(u1, v1) ≥ 0 . (3.3)

5 In mathematical literature, the letter C is commonly used in conjunction with continuous functions. As we will show in Theorem 1 (see page 14), a copula is even uniformly continuous on its domain.



Figure 3.1: The product copula.

Below we will use the shorthand notation copula for C. As the usage of the name

copula for the function C is not obvious from Definition 1, it will be explained at the

end of Section 3.3.

A function that fulfils property 1 is also said to be grounded. Property 3 is the

two-dimensional analogue of a non-decreasing one-dimensional function. A function

with this feature is therefore called 2-increasing.

3.2 Examples of Copulas

This section starts with three examples of copulas that have a very simple structure.

These copulas are very important as they represent some characteristic features. Then

we introduce three families of copulas. A copula family is defined by a copula function

that includes an additional free parameter.

3.2.1 The Product Copula

The first example we want to consider is the product copula. It is given by the function

Π(u, v) ≡ uv (3.4)

which is also shown in Figure 3.1. As stated in Theorem 4 (see page 18), the special importance of the product copula goes back to the description of independent random variables.

Figure 3.2: The Fréchet-Hoeffding upper (left) and lower (right) bounds.

3.2.2 Bounding Copulas

The copula defined by

M(u, v) ≡ min(u, v) (3.5)

is called the Fréchet-Hoeffding upper bound. It can be shown that for any given copula

C one has C(u, v) ≤ M(u, v) for all u, v ∈ [0, 1].

By contrast, the Fréchet-Hoeffding lower bound is defined by

W (u, v) ≡ max(u + v − 1, 0) . (3.6)

For W one finds W (u, v) ≤ C(u, v) for any given copula C and all u, v ∈ [0, 1]. Both

M and W are presented in Figure 3.2.

3.2.3 The Frank Copula

The Frank copula [6] is our first example of a 1-parameter copula family. It is given

by

C_{\mathrm{Frank}}(u, v) \equiv \begin{cases} -\frac{1}{\theta} \ln\left[ 1 + \frac{(e^{-\theta u} - 1)(e^{-\theta v} - 1)}{e^{-\theta} - 1} \right] & \forall\, \theta \in \left]-\infty, \infty\right[ \setminus \{0\} \\ uv & \text{for } \theta = 0 \end{cases} \quad (3.7)

Figure 3.3: The Frank copula for positive (left) and negative (right) values of the copula parameter θ.

The influence of the copula parameter θ on the dependence structure between the two

variables u and v is shown in Figure 3.3.6 As one might already assume by comparing

Figure 3.3 with Figures 3.1 and 3.2, the Frank copula obeys the following properties

C_{\mathrm{Frank}}(u, v) \xrightarrow{\theta \to \infty} M(u, v) , \quad (3.8)

C_{\mathrm{Frank}}(u, v) \xrightarrow{\theta \to -\infty} W(u, v) , \quad (3.9)

C_{\mathrm{Frank}}(u, v)\big|_{\theta = 0} = \Pi(u, v) . \quad (3.10)

A copula family that contains M , W and Π is called comprehensive.

3.2.4 The Gumbel-Hougaard Copula

The next copula we want to consider is the Gumbel-Hougaard copula [7]. It is given

by the function

C_{\mathrm{GH}}(u, v) \equiv \exp\left\{ -\left[ (-\ln u)^\theta + (-\ln v)^\theta \right]^{1/\theta} \right\} . \quad (3.11)

The parameter θ may take all values in the interval [1,∞[. In Figure 3.4 we present

the graph of the Gumbel-Hougaard copula for θ = 1.4.7

6 The value θ = 3.1 as used in Figure 3.3 was not chosen arbitrarily. It describes the dependence structure of the relative FX rates USD/DEM vs. GBP/DEM, obtained from historical data of the period 02/01/1980 to 30/12/1998. See Chapter 5 for details.

7 As in the case of the Frank copula (see footnote 6), the value θ = 1.4 as used in Figure 3.4 describes the dependence structure of the relative FX rates USD/DEM vs. GBP/DEM under the assumption of a Gumbel-Hougaard copula.


Figure 3.4: The Gumbel-Hougaard copula for θ = 1.4.

A discussion in [1, 2] shows that CGH is suitable to describe positive depen-

dence structures, i.e., the Gumbel-Hougaard copula “is consistent with bivariate ex-

treme value theory and could be used to model the limiting dependence structure of

component-wise maxima of bivariate random samples” [2]. We will come back to this

point later in this thesis (page 27).

Like relations (3.8) and (3.10), one finds for the Gumbel-Hougaard copula

C_{\mathrm{GH}}(u, v) \xrightarrow{\theta \to \infty} M(u, v) , \quad (3.12)

C_{\mathrm{GH}}(u, v)\big|_{\theta = 1} = \Pi(u, v) . \quad (3.13)

However, a relation equivalent to (3.9) does not exist for the Gumbel-Hougaard cop-

ula, i.e., the Gumbel-Hougaard copula is not comprehensive.

3.2.5 The Gaussian Copula

The last copula that we want to introduce is the Gaussian or normal copula [2],

C_{\mathrm{Gauss}}(u, v) \equiv \int_{-\infty}^{\Phi_1^{-1}(u)} dr_1 \int_{-\infty}^{\Phi_2^{-1}(v)} dr_2 \; f(r_1, r_2) . \quad (3.14)

In (3.14), f denotes the bivariate normal density function, i.e., the function defined by

(2.7) for n = 2. The function Φ1 in (3.14) refers to the one-dimensional, cumulative


normal density function,

\Phi_1(r_1) = \int_{-\infty}^{r_1} dr_1' \; f_1(r_1') , \quad (3.15)

where f1 is given by equation (2.2). For Φ2 an analogous expression holds.

If the covariance between r1 and r2 is zero, the Gaussian copula becomes

C_{\mathrm{Gauss}}(u, v) = \int_{-\infty}^{\Phi_1^{-1}(u)} dr_1 \, f_1(r_1) \int_{-\infty}^{\Phi_2^{-1}(v)} dr_2 \, f_2(r_2) = u\,v = \Pi(u, v) \qquad \text{if } c_{1,2} = 0 . \quad (3.16)

As we will see in Section 3.4, result (3.16) is a direct consequence of Theorem 4.8

As Φ1(r1), Φ2(r2) ∈ [0, 1], we can replace u, v in (3.14) by Φ1(r1), Φ2(r2). If we

consider r1, r2 in a probabilistic sense, i.e., r1 and r2 being values of two random

variables R1 and R2, we obtain from (3.14)

CGauss(Φ1(r1), Φ2(r2)) = IP(R1 ≤ r1, R2 ≤ r2) . (3.17)

In other words, CGauss(Φ1(r1), Φ2(r2)) is the binormal cumulative probability function.

3.3 Sklar’s Theorem and Further Important Properties of Copulas

In this section we focus on the properties of copulas. The theorem we will present

next establishes the continuity of copulas via a Lipschitz condition on [0, 1] × [0, 1]:

Theorem 1 Let C be a copula. Then for every u1, u2, v1, v2 ∈ [0, 1]:

|C(u2, v2) − C(u1, v1)| ≤ |u2 − u1| + |v2 − v1| . (3.18)

From (3.18) it follows that every copula C is uniformly continuous on its domain.

A further important property of copulas concerns the partial derivatives of a

copula with respect to its variables:

Theorem 2 Let C be a copula. For every u ∈ [0, 1], the partial derivative ∂ C/∂ v

exists for almost all9 v ∈ [0, 1]. For such u and v one has

0 \leq \frac{\partial}{\partial v} C(u, v) \leq 1 . \quad (3.19)

8 See also footnote 16 on page 27.
9 The expression “almost all” is used in the sense of the Lebesgue measure.


The analogous statement is true for the partial derivative ∂ C/∂ u. Consequently, the

functions

u → Cv(u) ≡ ∂ C(u, v)/∂ v and v → Cu(v) ≡ ∂ C(u, v)/∂ u (3.20)

are well defined on [0,1].

In addition, Cv and Cu are non-decreasing almost everywhere on [0,1].

To give some examples of this theorem, we consider the partial derivatives of the

Frank copula and the Gumbel-Hougaard copula with respect to u. Let us start with

the Frank copula (3.7). As the discussion would be trivial for θ = 0, we consider in

the following the case θ ∈ ]−∞,∞[\{0} only. From definition (3.7) we have

C_{\mathrm{Frank},u}(v) = \frac{e^{-\theta u} \left( e^{-\theta v} - 1 \right)}{e^{-\theta} - 1 + (e^{-\theta u} - 1)(e^{-\theta v} - 1)} \qquad \forall\, \theta \in \left]-\infty, \infty\right[ \setminus \{0\} . \quad (3.21)

Obviously, CFrank,u is defined for all v ∈ [0, 1]. CFrank,u in particular is a strictly

increasing function of v. The inverse function can be calculated analytically and is

given by

C^{-1}_{\mathrm{Frank},u}(y) = -\frac{1}{\theta} \ln\left[ \frac{1 + y\left(e^{\theta(u-1)} - 1\right)}{1 + y\left(e^{\theta u} - 1\right)} \right] \qquad \forall\, \theta \in \left]-\infty, \infty\right[ \setminus \{0\} . \quad (3.22)

The second example of Theorem 2 deals with the Gumbel-Hougaard copula (3.11).

Its partial derivative with respect to u is given by

C_{\mathrm{GH},u}(v) = \exp\left\{ -\left[ (-\ln u)^\theta + (-\ln v)^\theta \right]^{1/\theta} \right\} \times \left[ (-\ln u)^\theta + (-\ln v)^\theta \right]^{-\frac{\theta-1}{\theta}} \frac{(-\ln u)^{\theta-1}}{u} . \quad (3.23)

In Figure 3.5, CGH,u is shown for u = 0.1, 0.2, . . . , 0.9 and θ = 2.0. Note that for

u ∈ ]0, 1[ and for all θ ∈ IR where θ ≥ 1, CGH,u is a strictly increasing function of

v. Therefore the inverse function C^{-1}_{GH,u} is well defined. However, as one might guess from (3.23), C^{-1}_{GH,u} cannot be calculated analytically, so some kind of numerical algorithm has to be used for this task.

As CFrank and CGH are both symmetric in u and v, the partial derivatives of CFrank

and CGH with respect to v exhibit identical behaviours for the same set of parameters.

We now turn to the central theorem in the theory of copulas, Sklar’s theorem [12].

To be able to understand it, we recall some terms known from statistics.

Definition 2 A distribution function is a function F : IR → [0, 1] with the following

properties:


Figure 3.5: The partial derivative C_{GH,u} of the Gumbel-Hougaard copula for u = 0.1, 0.2, . . . , 0.9 and θ = 2.0.

1. F is non-decreasing.

2. F (−∞) = 0 and F (+∞) = 1.

For example, if f is a probability density function, then the cumulative probability

function

F(x) = \int_{-\infty}^{x} dt \; f(t) \quad (3.24)

is a distribution function.

Definition 3 A joint distribution function is a function H : IR× IR → [0, 1] with the

following properties:

1. H is 2-increasing.

2. For every x1, x2 ∈ IR: H(x1,−∞) = H(−∞, x2) = 0 and H(+∞, +∞) = 1.

It is shown in [3] that H has margins F1 and F2 that are given by F1(x1) ≡ H(x1, +∞)

and F2(x2) ≡ H(+∞, x2). Furthermore, F1 and F2 themselves are distribution func-

tions.

In Section 3.2.5 we have already given an example of a two-dimensional function

satisfying the properties of Definition 3. If one considers CGauss as given in (3.17) as


a function of r1 and r2, it can be seen as the joint distribution function belonging to

the binormal distribution.

We now have all the definitions that we need for Sklar’s theorem.

Theorem 3 (Sklar’s theorem) Let H be a joint distribution function with margins

F1 and F2. Then there exists a copula C where

H(x1, x2) = C(F1(x1), F2(x2)) (3.25)

for every x1, x2 ∈ IR. If F1 and F2 are continuous, then C is unique. Otherwise, C is

uniquely determined on Range F1 × Range F2. On the other hand, if C is a copula

and F1 and F2 are distribution functions, then the function H defined by (3.25) is a

joint distribution function with margins F1 and F2.

With Sklar’s theorem in mind the use of the name “copula” becomes obvious. It was

chosen by Sklar to describe [13] “a function that links a multidimensional distribution

to its one-dimensional margins” and appeared in mathematical literature for the first

time in [12]. Usually, the term “copula” is used in grammar to describe an expression

that links a subject and a predicate. The origin of the word is Latin.

3.4 Copulas and Random Variables

So far we have considered functions of one or two variables. We now turn our attention

to random variables. As already used at the end of Section 3.2.5, we will denote

random variables by capital letters and their values by lower case letters. In the

definition of a random variable and its distribution function we again follow [3]:

Definition 4 A random variable R is a real valued measurable function on a prob-

ability space whose distribution is described by a probability density function f(r),

where the domain of f is the range of R.

Note that it is not stated whether or not the probability function f is known.

Definition 5 The distribution function of a random variable R is a function F that

assigns all r ∈ IR a probability F (r) = IP(R ≤ r). In addition, the joint distribution

function of two random variables R1, R2 is a function H that assigns all r1, r2 ∈ IR a

probability H(r1, r2) = IP(R1 ≤ r1, R2 ≤ r2).


We want to emphasise that (joint) distribution functions in the sense of Defi-

nition 5 are also (joint) distribution functions in the sense of Definitions 2 and 3.

Therefore equation (3.25) from Sklar’s theorem can be written as

H(r1, r2) = C(F1(r1), F2(r2)) = IP(R1 ≤ r1, R2 ≤ r2) . (3.26)

Before we can present the next important property of copulas we first have to

introduce the term of independent variables:

Definition 6 Two random variables R1 and R2 are independent if and only if the

product of their distribution functions F1 and F2 equals their joint distribution func-

tion H,

H(r1, r2) = F1(r1) F2(r2) for all r1, r2 ∈ IR . (3.27)

In Section 3.2.1 we have already mentioned the product copula Π (see equation

(3.4)). The next theorem clarifies the importance of this copula.

Theorem 4 Let R1 and R2 be random variables with continuous distribution func-

tions F1 and F2 and joint distribution function H. From Sklar’s theorem we know

that there exists a unique copula CR1R2 where

IP(R1 ≤ r1, R2 ≤ r2) = H(r1, r2) = CR1R2(F1(r1), F2(r2)) . (3.28)

Then R1 and R2 are independent if and only if CR1R2 = Π.

We will end this section with a statement on the behaviour of copulas under

strictly monotone transformations of random variables.

Theorem 5 Let R1 and R2 be random variables with continuous distribution func-

tions and with copula CR1R2 . If α1 and α2 are strictly increasing functions on

Range R1 and Range R2, then Cα1(R1) α2(R2) = CR1R2 . In other words: CR1R2 is

invariant under strictly increasing transformations of R1 and R2.


Chapter 4

Generation of Pseudo Random Numbers Using Copulas

In the last chapter we presented the definition of copulas and their most important

properties, especially Sklar’s theorem. We are thus ready to return to the Monte

Carlo simulation of the value at risk. The strategy, as already outlined at the end

of Chapter 2, is to give a general guide on how to generate pairs of t-distributed

pseudo random numbers whose dependence structure is given by a copula.10 As the

numerical results presented in the following chapter assume either a Frank copula

or a Gumbel-Hougaard copula, we will focus our attention on these copulas where

necessary.

4.1 The General Method

Let us assume that the copula C we are interested in is known, i.e., all parameters of

the copula are known. The task is then to generate pairs (u, v) of observations of in

[0, 1]× [0, 1] uniformly distributed random variables U and V whose joint distribution

function is C. To reach this goal we will use the method of conditional distributions.

Let cu denote the conditional distribution function for the random variable V at a

given value u of U ,

c_u(v) \equiv \mathrm{IP}(V \leq v \mid U = u) . \quad (4.1)

From (3.26) and (4.1) and using the notation introduced in (3.20) we obtain11

c_u(v) = \lim_{\Delta u \to 0} \frac{C(u + \Delta u, v) - C(u, v)}{\Delta u} = \frac{\partial}{\partial u} C(u, v) = C_u(v) . \quad (4.2)

10 The incorporation of the copula dependence structure will follow very closely the instructions given in [3].
11 Note that if F_uniform denotes the distribution function of a random variable U which is uniformly distributed in [0, 1], we have F_uniform(u) = u.


From Theorem 2 we know that cu(v) is non-decreasing and exists for almost all

v ∈ [0, 1].

For the sake of simplicity, we assume from now on that cu is strictly increasing

and exists for all v ∈ [0, 1]. As discussed in Section 3.3, this is at least true for

the Frank and the Gumbel-Hougaard families of copulas. If these conditions are not

fulfilled, we have to replace the term “inverse” in the remaining part of this section

by “quasi-inverse” (see [3] again for details).

With result (4.2) at hand we can now use the method of variable transformation,

which is described in Appendix A.1, to generate the desired pair (u, v) of PRNs. The

algorithm consists of the following two steps:

• Generate two independent uniform PRNs u,w ∈ [0, 1]. Here u is already the

first number we are looking for.

• Compute the inverse function of cu. In general, it will depend on the parameters

of the copula and on u, which can be seen, in this context, as an additional

parameter of c_u. Set v = c_u^{-1}(w) to obtain the second PRN.

It may happen that the inverse function cannot be calculated analytically. In this

case one can try to use a numerical algorithm to determine v. For example, this

situation occurs when the Gumbel-Hougaard copula is used, as already mentioned in

Section 3.3.
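For the Frank copula, by contrast, the inverse (3.22) is available in closed form, so the two steps above can be written down directly. A minimal numpy sketch (function names illustrative):

```python
import numpy as np

def frank_pair(theta, rng):
    """One pair (u, v) with joint distribution C_Frank via the method of
    conditional distributions; theta != 0 is assumed."""
    u, w = rng.uniform(size=2)  # two independent uniform PRNs
    # v = c_u^{-1}(w), the closed-form inverse (3.22)
    v = -np.log((1.0 + w * (np.exp(theta * (u - 1.0)) - 1.0))
                / (1.0 + w * (np.exp(theta * u) - 1.0))) / theta
    return u, v

rng = np.random.default_rng(42)
pairs = [frank_pair(3.1, rng) for _ in range(1000)]  # theta = 3.1 as in footnote 6
```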

4.2 t-Distributed Pseudo Random Numbers with a Copula Dependence Structure

In this section we present a detailed example of the application of the copula method

in the calculation of the value at risk. We refer to the algorithm presented in Chapter 2

and will use the same notation.

As stated at the end of Chapter 2, we want to replace steps 3–7 of the traditional

algorithm. On a more abstract level, these steps have the following content:

Step 3: For each margin, choose a proper distribution function.

Step 4: Determine the parameters of the distribution functions of the margins from

historical data.

Step 5: Choose a joint distribution function that describes the dependence structure

of the problem under consideration.


Step 6: Determine the parameter(s) of the joint distribution function from historical

data.

Step 7: Generate n-tuples (or simply pairs) of PRNs according to the distributions

of the margins with respect to the joint distribution.

Let us first turn to step 3 as mentioned on page 6. We have chosen the normal

distribution function (2.2) to describe each margin. We replace this assumption by

the following more general12 one:

Step 3 (copula method): The probability density functions that describe the risk

factors are generalised t-distributions (see Appendix B.1 for details),

f_i(r_i) = \frac{\Gamma\left(\frac{k_i + 1}{2}\right)}{\Gamma\left(\frac{k_i}{2}\right) \Gamma\left(\frac{1}{2}\right) \sqrt{k_i - 2}\; \sigma_i} \left( 1 + \frac{(r_i - \mu_i)^2}{(k_i - 2)\, \sigma_i^2} \right)^{-\frac{k_i + 1}{2}} \qquad \text{for } i = 1, 2 . \quad (4.3)

The degrees of freedom k1 and k2 are both restricted to numbers that are greater than or equal to three. This is to ensure that the variances of f1 and f2 are well

defined. To keep the cost of numerical computations low, we restrict ourselves

to the values ki = 3, 4, 5, . . . for i = 1, 2.

Step 4 of the original algorithm deals with the estimation of the parameters of the

marginal distributions. It will be modified to

Step 4 (copula method): The parameters µi and σ2i are the mean and the variance

of the marginal distribution (4.3) for each i = 1, 2 (see Appendix B.1 for details).

Therefore µi and σi can be calculated by using equations (2.4) and (2.5), as

already mentioned in step 4 of the original algorithm.

Using µi and σi we can consider each fi as a function of one free parameter ki

for i = 1, 2. To estimate k1 and k2 we use the maximum likelihood method as

described in Appendix C.2.
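A sketch of this estimation step: one can check against (4.3) that the generalised t-distribution is a Student t with location µ_i and scale σ_i √((k_i − 2)/k_i), so scipy's t distribution supplies the density, and the search over the integer grid k = 3, 4, 5, . . . mirrors the restriction made in step 3 (the upper cut-off k_max is an assumption of this sketch, not of the thesis).

```python
import numpy as np
from scipy.stats import t as student_t

def fit_margin(r, k_max=30):
    """Step 4 (copula method): moment estimators (2.4), (2.5) for mu and
    sigma, then ML over k = 3, ..., k_max for the degrees of freedom."""
    mu, sigma = r.mean(), r.std(ddof=1)
    ks = np.arange(3, k_max + 1)
    loglik = [student_t.logpdf(r, df=k, loc=mu,
                               scale=sigma * np.sqrt((k - 2.0) / k)).sum()
              for k in ks]
    return mu, sigma, ks[np.argmax(loglik)]  # k with the largest likelihood
```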

We now come to step 5. Here we have chosen the multinormal distribution function

to describe the dependence structure. As the numerical results that we will present

in the next chapter are based on two risk factors only, we will only consider the

case n = 2 below. We thus have to replace a binormal distribution function by an

alternative dependence structure.

With Sklar’s theorem in mind (see equation (3.26)), we now make the following

assumption:

12As the t-distribution converges in the limit k → ∞ to the normal distribution (see Appendix B),one can consider the t-distribution as a generalisation of the normal distribution.


Step 5 (copula method): The joint distribution function of the two risk factors is

given by a 1-parameter copula family,

C(F1(r1), F2(r2)) = IP(R1 ≤ r1, R2 ≤ r2) . (4.4)

The Fi are the distribution functions of the t-distributions (4.3). They are given

in Appendix B.2, equations (B.16)-(B.18). The copula parameter will be called

θ below.

When comparing (4.4) with expression (3.17) it becomes obvious that what we call

the copula method actually involves two replacements. The Gaussian copula in the

traditional method is replaced by an alternative copula, for example, the Frank cop-

ula or the Gumbel-Hougaard copula. In addition, the Gaussian distributions of the

margins are replaced by t-distributions.

Next we consider step 6 of the original algorithm. For two risk factors only, it shows

how to estimate the covariance or the correlation between these factors (see equation

(2.12)). Note that this is the only parameter that results from the assumption that

the joint distribution is of the binormal kind. From (3.17) and (4.4) it is obvious that

the role of the correlation ρ ≡ ρ1,2 in the traditional method is assumed by the copula

parameter θ in the copula method. Therefore, step 6 is modified as follows:

Step 6 (copula method): To estimate θ we again use the maximum likelihood

method, see Appendix C.3. From the cumulative probability (4.4) we derive

a probability density by calculating the partial derivative of C with respect to

both parameters,

f(r1, r2) ≡ ∂2

∂r1 ∂r2

C(F1(r1), F2(r2))

=∂2

∂u ∂vC(u, v)

u=F1(r1),v=F2(r2)

f1(r1) f2(r2) . (4.5)

For the Frank copula, we can compute the partial derivative in (4.5) using

equation (3.21). It follows that

\frac{\partial^2}{\partial u \, \partial v} C_{\mathrm{Frank}}(u, v) = \frac{\theta\, e^{-\theta u} e^{-\theta v} \left( 1 - e^{-\theta} \right)}{\left[ e^{-\theta} - 1 + (e^{-\theta u} - 1)(e^{-\theta v} - 1) \right]^2} . \quad (4.6)

For the Gumbel-Hougaard copula, we obtain from (3.23)

\frac{\partial^2}{\partial u \, \partial v} C_{\mathrm{GH}}(u, v) = \frac{(-\ln u)^{\theta - 1}}{u} \, \frac{(-\ln v)^{\theta - 1}}{v} \, e^{-\Sigma^{1/\theta}} \left( \Sigma^{-2(\theta - 1)/\theta} + (\theta - 1)\, \Sigma^{-(2\theta - 1)/\theta} \right) \quad (4.7)


where \Sigma \equiv (-\ln u)^\theta + (-\ln v)^\theta. With equations (4.5)-(4.7) at hand the likelihood function can be computed,

L(\theta) = L\big((r_{1,1}, r_{2,1}), \ldots, (r_{1,N}, r_{2,N})\,;\, \theta\big) = \prod_{j=1}^{N} f(r_{1,j}, r_{2,j}) . \quad (4.8)

The N pairs (r1,j, r2,j) are again the relative changes of the historical data (see

equation (2.1) in Chapter 2).

From the numerical maximisation of L(θ) we finally obtain θ. A detailed dis-

cussion of this point is contained in Appendix C.3.
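As a sketch of this maximisation for the Frank copula: since the margin densities f_1 f_2 in (4.5) do not depend on θ, maximising (4.8) over θ only requires the copula density (4.6), evaluated at u = F_1(r_{1,j}), v = F_2(r_{2,j}). The search is restricted to θ > 0 here for brevity (the negative branch is handled analogously), and the upper bound of 50 is an arbitrary choice of this sketch.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def fit_frank_theta(u, v):
    """u, v: arrays F_1(r_{1,j}), F_2(r_{2,j}) of the transformed
    historical relative changes. Maximises the log of (4.8)."""
    def neg_loglik(theta):
        eu, ev, e1 = np.exp(-theta * u), np.exp(-theta * v), np.exp(-theta)
        denom = e1 - 1.0 + (eu - 1.0) * (ev - 1.0)
        # log of the Frank copula density (4.6)
        logc = (np.log(theta * (1.0 - e1)) - theta * (u + v)
                - 2.0 * np.log(np.abs(denom)))
        return -logc.sum()
    return minimize_scalar(neg_loglik, bounds=(1e-6, 50.0), method="bounded").x
```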

Let us now come to step 7 of the original algorithm. Instead of creating pairs of

correlated PRNs that belong to binormal distributed random variables, we now have

to generate PRNs that obey the copula dependence structure that was chosen in (4.4)

with the parameter θ and the marginal distributions that were selected in (4.3). Using

the results from Section 4.1, step 7 can now be reformulated:

Step 7 (copula method): First, generate two independent, in [0, 1] uniformly distributed PRNs u, w. Compute v = C_u^{-1}(w). For the Frank copula, C^{-1}_{Frank,u} is given by equation (3.22). In the case of the Gumbel-Hougaard copula, C^{-1}_{GH,u} has to be computed numerically from equation (3.23). Finally determine r_1 = F_1^{-1}(u) and r_2 = F_2^{-1}(v) to obtain one pair (r_1, r_2) of PRNs with the desired copula dependence structure.

Let us end this chapter with two remarks on the last step. As already discussed in Section 3.3, the determination of C^{-1}_{GH,u}(w) is generally possible from a numerical point of view. Secondly, we also computed F_1^{-1}(u) and F_2^{-1}(v) numerically to obtain the results which are presented in the following chapter.
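Putting step 7 together for the Frank copula, here is a sketch that draws one pair (r_1, r_2), using scipy's Student t percent-point function for the inverse margins F_i^{-1} with the scaling implied by (4.3); all names are illustrative.

```python
import numpy as np
from scipy.stats import t as student_t

def copula_mc_pair(theta, mu, sigma, k, rng):
    """One pair (r1, r2): Frank-dependent uniforms via (3.22), then
    r_i = F_i^{-1}(.) for the generalised t margins (4.3).
    mu, sigma, k: length-2 sequences of the margin parameters."""
    u, w = rng.uniform(size=2)
    v = -np.log((1.0 + w * (np.exp(theta * (u - 1.0)) - 1.0))
                / (1.0 + w * (np.exp(theta * u) - 1.0))) / theta
    return tuple(student_t.ppf(p, df=k[i], loc=mu[i],
                               scale=sigma[i] * np.sqrt((k[i] - 2.0) / k[i]))
                 for i, p in enumerate((u, v)))
```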


Chapter 5

Numerical Results

As already mentioned in the previous chapters, we considered just two risk factors in

our numerical simulations. These factors are the FX rate GBP/DEM and one out of

the FX rates USD/DEM, JPY/DEM, CHF/DEM and FRF/DEM.

For the copula method, we were using both the Frank copula (3.7) and the

Gumbel-Hougaard copula (3.11). Furthermore, for each copula we tested margins

that were assumed to be normally distributed (2.2) and t-distributed (4.3).

To see how copulas can improve the calculation of the value at risk and to demon-

strate the benefits it is not necessary to use a complicated portfolio. The VaR results

that we will discuss below are therefore based on a simple linear portfolio that contains

both a long position and a short position. Its value is given by

V = ∆Curr × Curr − 1000 × GBP , (5.1)

where the long position of the portfolio is given by Curr ∈ {USD, JPY, CHF, FRF}. As we are interested in the value of V given in DEM, one can say that the portfolio

contains the risk factors directly.

The data on which our investigation is based are the historical daily FX rates

ranging from 02/01/1980 to 30/12/1998 [10]. To make the results of the four portfolios

comparable, we have computed the means of the five FX rates under consideration

during this period. The results are listed in Table 5.1.13 With these results at hand,

we have chosen the quantities ∆Curr in a way that the average DEM value of the

long position equals the average DEM value of the short position for each of the

four portfolios. We therefore have ∆USD = 1640, ∆JPY = 253323, ∆CHF = 2708 and

∆FRF = 10024.

13 As we need the results that are shown in Table 5.1 for the determination of the portfolio weights only, we do not present any errors of the values.


currency    mean of DEM value
1 GBP       3.19395
1 USD       1.94745
100 JPY     1.26082
1 CHF       1.17945
10 FRF      3.18610

Table 5.1: Means of the DEM values of the currencies under investigation for the period 02/01/1980 to 30/12/1998.

Each time interval on which a VaR computation was based consisted of 250 busi-

ness days. Using the notation given in step 1 on page 5, we used N + 1 = 250 to

obtain our results. This corresponds to a time horizon of one banking year. Because

of this moving time window, the first date for which we were able to calculate a VaR

is 02/01/1981.

Let us now start our comparison of the traditional method vs. the copula method of

calculating the value at risk. First, we want to take a look at the different parameters

of the underlying joint distributions, i.e., the correlation ρ = ρ1,2 of the binormal

distribution, the parameter θFrank of the Frank copula and the parameter θGH of

the Gumbel-Hougaard copula. The behaviour of these parameters is presented in

Figure 5.1,14 where we show ρ, θFrank and θGH, obtained from the relative GBP/DEM

rates vs. the relative CHF/DEM rates, against time. As already mentioned in the

previous chapters, the theoretical values that these parameters may take are ρ ∈ [−1, +1], θFrank ∈ ]−∞, ∞[ and θGH ∈ [1, ∞[.

A comparison of ρ and θFrank shows that the directions of both curves behave in

a similar manner on a qualitative level. However, even though it is most often the

case, an increase or decrease in one of the parameters does not necessarily result in an

increase or decrease in the other. In particular, there is no one-to-one relation between ρ

and θFrank. This can also be seen from the right part of Figure 5.2. There we present

the same data as shown in Figure 5.1. The difference between both figures is that

θGH and θFrank are now plotted against ρ.

14 In Figures 5.1 and 5.2 we only show the copula parameters θFrank and θGH for t-distributed margins. The behaviour of the copula parameters does not change on a qualitative level if we are using normally distributed margins. Therefore we omit a discussion of the corresponding values of kGBP and kCHF at this point (see also Figure 5.5).

Figure 5.1: Linear correlation ρ (green), Gumbel-Hougaard copula parameter θGH (blue) and Frank copula parameter θFrank (red), obtained from the relative GBP/DEM rates vs. the relative CHF/DEM rates, plotted against time.

Figure 5.2: Gumbel-Hougaard copula parameter θGH (left) and Frank copula parameter θFrank (right) vs. linear correlation ρ, obtained from the relative GBP/DEM rates vs. the relative CHF/DEM rates.


If we consider positive dependence only, i.e., ρ > 0 and θFrank > 0, θGH also shows the same qualitative behaviour. However, time ranges that result in negative correlations ρ or negative values of θFrank mostly give a constant value of θGH = 1 for the Gumbel-Hougaard parameter. This behaviour can be seen

from Figure 5.1 and the left part of Figure 5.2. Therefore we conclude that the

Gumbel-Hougaard copula is only suited to describe positive dependence structures.15

This conclusion is in accordance with the results presented in [1, 2].

Our second point is that the curve of the correlation ρ and that of the Frank copula parameter θFrank tend to significant values, ρ → 0 and θFrank → 0, in the same time region. Significant in this context means that, as stated in Section 3.2.3 and

in Theorem 4, for the Frank copula, θFrank = 0 is equivalent to independence of the

two risk factors. On the other hand, for the traditional method, ρ = 0 is equivalent to

independence, too.16 This shows again that both methods interpret the data correctly

in qualitative terms.

However, in finance we are interested in quantitatively correct statements. Espe-

cially in the case of risk management, we want to compute a value at risk that best

reflects reality. The message of this thesis is that for the examples under considera-

tion, which are defined by the portfolios

VUSD = 1640 × USD − 1000 × GBP (5.2)

VJPY = 253323 × JPY − 1000 × GBP (5.3)

VCHF = 2708 × CHF − 1000 × GBP (5.4)

VFRF = 10024 × FRF − 1000 × GBP (5.5)

the copula method provides more reliable results than the results obtained from the

traditional method. To confirm this statement we calculated the VaR for each port-

folio in the time interval from 02/01/1981 to 29/12/1998.17 For each business day

in this interval we calculated three different VaRs, corresponding to confidence levels

(or quantiles) of 90% (or α1 = 10%), 95% (or α2 = 5%) and 99% (or α3 = 1%). Each

of these VaR(αi, t, t + 1) was compared with the change in the value of the portfolio

from time t to time t + 1, i.e., with ∆V = V_{t+1} − V_t. The numbers of underestimated losses in relation to the total number of backtested values gave three values α̂_1, α̂_2 and α̂_3 that could be compared with the three quantiles α_1, α_2 and α_3.

15 See page 13 for an explanation of what we mean with “positive dependence structures”.
16 In general, a correlation of zero does not imply independence. However, for the multinormal distribution ρ_{i,j} = 0 for all i, j implies independence of the variables, as the distribution function (2.7) factorises.

17 We use the final day (30/12/1998) of our investigation for backtesting only.
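The backtesting loop just described reduces to counting; a minimal sketch (names illustrative), where var_pred[t] holds VaR(α_i, t, t + 1) and pnl[t] holds ∆V = V_{t+1} − V_t:

```python
import numpy as np

def failure_rate(var_pred, pnl):
    """Fraction alpha-hat of days on which the realised loss was
    underestimated, i.e. fell below the predicted VaR."""
    return np.mean(np.asarray(pnl) < np.asarray(var_pred))
```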

Figure 5.3: Backtesting results of the USD portfolio (5.2), using the traditional method (left) and the copula method (right) for quantiles α1 = 10% (green squares), α2 = 5% (red circles) and α3 = 1% (blue triangles) and various numbers of MC steps.

To test the dependence of the numerical results on the number of Monte Carlo

steps, we did this investigation for various numbers of MC steps ranging from 100

to 1500. For the USD portfolio (5.2), we present our results of the underestimated

losses in Table 5.2. The data that are listed in this table were obtained from a tradi-

tional Monte Carlo simulation on the one hand (i.e., Gaussian copula and normally

distributed margins) and from MC simulations using both the Frank and the Gumbel-

Hougaard copula on the other hand. The margins used with the copula methods were

t-distributions. In Figure 5.3 we show the data of the traditional method and of the

copula method, using the Frank copula.

First, we want to emphasise that, once a number of MC steps of about 500 is

reached, the data do not change significantly under any additional increase of the

number of MC steps. This behaviour can be observed for both algorithms, each

copula and each confidence level, as one can see from Table 5.2. Therefore numerical

results for numbers of MC steps of O(1000) seem to be reliable and are suited as a

basis for a discussion and comparison of the two methods.

Secondly, and this is the most important point that we want to emphasise, the

backtesting results show that the copula method works better than the traditional

method for each confidence level. As mentioned in Chapter 1, “better” in this context


percentage of underestimated losses

no. of      traditional Monte Carlo       MC with Gumbel-Hougaard        MC with Frank copula
MC steps                                  copula and t-distr. margins    and t-distr. margins
            α1=10%  α2=5%   α3=1%         α1=10%  α2=5%   α3=1%          α1=10%  α2=5%   α3=1%
100         9.35%   5.34%   2.28%         10.46%  5.76%   1.86%          10.77%  5.99%   1.80%
200         8.82%   4.68%   1.84%         9.98%   5.03%   1.35%          10.31%  5.36%   1.44%
300         8.60%   4.61%   1.71%         9.82%   4.99%   1.22%          10.06%  5.14%   1.15%
400         8.62%   4.63%   1.64%         9.73%   4.90%   1.15%          9.91%   5.05%   1.00%
500         8.60%   4.63%   1.68%         9.73%   4.79%   1.11%          9.93%   4.94%   1.00%
600         8.67%   4.72%   1.62%         9.71%   4.74%   1.06%          9.82%   4.88%   0.95%
700         8.67%   4.68%   1.62%         9.71%   4.81%   1.09%          9.82%   4.94%   1.00%
800         8.69%   4.68%   1.62%         9.64%   4.74%   1.02%          9.84%   4.92%   1.00%
900         8.73%   4.66%   1.55%         9.67%   4.83%   1.06%          9.80%   4.99%   1.00%
1000        8.73%   4.70%   1.62%         9.62%   4.72%   1.06%          9.86%   4.92%   1.02%
1100        8.65%   4.63%   1.66%         9.69%   4.72%   1.06%          9.86%   4.90%   0.93%
1200        8.58%   4.61%   1.64%         9.75%   4.77%   1.02%          9.93%   5.01%   0.98%
1300        8.62%   4.61%   1.62%         9.67%   4.81%   0.98%          9.86%   4.99%   0.91%
1400        8.76%   4.61%   1.62%         9.64%   4.70%   0.95%          9.86%   4.85%   0.89%
1500        8.71%   4.61%   1.62%         9.62%   4.74%   0.98%          9.93%   4.92%   0.89%

Table 5.2: Backtesting results of the USD/GBP portfolio, using the traditional method and the copula method for the Frank and the Gumbel-Hougaard copula, for quantiles α1 = 10%, α2 = 5% and α3 = 1%, and various numbers of MC steps.


means that the deviation of the percentage of the backtesting failures α̂_i from the quantiles α_i is smaller in the case of the copula method than in the case of the traditional method for each i = 1, 2, 3.

To quantify the last statement, we consider below the results that we obtained

from simulations using 1500 Monte Carlo steps. Again, the quantiles under inves-

tigation were α1 = 10%, α2 = 5% and α3 = 1%. In Table 5.3 we summarise the

corresponding percentages of underestimated losses α̂_i for the four portfolios (5.2)-

(5.5) for i = 1, 2, 3. For the copula method, we used both the Frank copula and

the Gumbel-Hougaard copula. In addition, each copula was used with margins that

obey a normal distribution on the one hand (see below) and a t-distribution on the

other hand. Beside the results that we have obtained from the traditional and the

copula Monte Carlo methods, we also present in Table 5.3 backtesting results that

were based on the variance-covariance method and on historical simulation.18

We do the comparison of the backtesting results using the quantity

\chi^2 \equiv \sum_{i=1}^{3} \left( \frac{\hat{\alpha}_i - \alpha_i}{\alpha_i} \right)^2 . \quad (5.6)

Its values for the various VaR methods just mentioned and for each portfolio are also

listed in Table 5.3.
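For instance, the χ2 entry for the Frank copula with t-distributed margins on the USD portfolio can be approximately reproduced from the last row of Table 5.2:

```python
import numpy as np

alpha = np.array([0.10, 0.05, 0.01])             # quantiles alpha_1, alpha_2, alpha_3
alpha_hat = np.array([0.0993, 0.0492, 0.0089])   # backtesting failures, Table 5.2, 1500 MC steps
chi2 = np.sum(((alpha_hat - alpha) / alpha) ** 2)
print(chi2)  # ~0.0124; Table 5.3 reports 0.0131, the difference coming from
             # the rounding of the alpha-hats to two decimals
```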

The first statement we want to make is that the copula method yields more reliable results than the traditional method for each portfolio under consideration if

the copula method is used with t-distributed margins. Whereas the deviation χ2 lies

in the range χ2 ∼ 0.2−0.4 for the traditional method, it lies between 0.005 and 0.034

for the copula method with t-distributed margins.

Next we turn our attention to the copula method in combination with Gaussian

margins. Anticipating our result that the copula method proper (i.e., using a copula and t-distributed margins) works better, we have not mentioned this variant so far. It simply describes the use of a copula in combination with normally distributed

margins. In the language of Chapters 2 and 4, the copula method in combination with

Gaussian margins consists of steps 1-4 and 8 of the traditional method and steps 5-7

of the copula method. In this case, the deviation χ2 is approximately χ2 ∼ 0.1−0.45.

This is the same order of magnitude as the results obtained from the traditional

method. A one by one comparison of the eight values under consideration (four

portfolios times two copulas) shows that the copula method with Gaussian margins

gives more reliable results in five cases.

18 Neither the variance-covariance method nor historical simulation is explained in this thesis. An overview of both methods can be found in [4]; a detailed discussion, for example, in [9].


                         percentage of underestimated losses
portfolio    quantile  historical  variance-   traditional  MC GH,    MC GH,    MC Frank,  MC Frank,
                       simulation  covariance  MC           Gaussian  t-distr.  Gaussian   t-distr.
                                                            margins   margins   margins    margins
 1640 USD      10%      10.33%      8.60%       8.71%        8.60%     9.62%     9.75%      9.93%
-1000 GBP       5%       5.32%      4.63%       4.61%        4.34%     4.74%     5.03%      4.92%
                1%       1.22%      1.40%       1.62%        1.29%     0.98%     1.40%      0.89%
               χ²       0.0533     0.1822      0.4049       0.1184    0.0047    0.1579     0.0131

253323 JPY     10%      10.77%      8.22%       8.71%        8.71%    10.02%     9.33%     10.02%
-1000 GBP       5%       5.52%      4.34%       4.61%        4.43%     5.28%     5.10%      5.39%
                1%       1.33%      1.22%       1.44%        1.26%     1.17%     1.35%      1.13%
               χ²       0.1257     0.0968      0.2171       0.0989    0.0336    0.1289     0.0230

 2708 CHF      10%      10.04%      7.76%       7.74%        8.25%    10.33%     7.71%      9.64%
-1000 GBP       5%       5.03%      3.99%       4.08%        4.10%     4.83%     4.12%      4.50%
                1%       1.04%      1.37%       1.35%        1.57%     1.17%     1.35%      1.02%
               χ²       0.0018     0.2312      0.2092       0.3925    0.0328    0.2071     0.0117

10024 FRF      10%       9.98%      7.07%       6.94%        7.32%     9.29%     7.92%      9.40%
-1000 GBP       5%       4.99%      4.13%       4.15%        4.24%     5.08%     4.41%      4.95%
                1%       1.11%      1.35%       1.44%        1.60%     1.11%     1.55%      0.93%
               χ²       0.0119     0.2407      0.3176       0.4514    0.0171    0.3623     0.0084

Table 5.3: Comparison of the various VaR methods.


With these results at hand we come to the following conclusions:

• The bare change of the joint distribution function, leaving the distribution functions of the margins unchanged (i.e., the change from the traditional method to the copula method with Gaussian margins), may improve the VaR calculation a little if one is using the Frank copula or the Gumbel-Hougaard copula.

• The change of the distribution functions of the margins from Gaussian margins to t-distributed margins makes the VaR calculations much more reliable: ᾱi lies much closer to αi for the copula method with t-distributed margins than for the traditional method for each i = 1, 2, 3. Therefore χ² is between one and two orders of magnitude smaller for the copula method with t-distributed margins than for the traditional method. These statements hold for both the Frank and the Gumbel-Hougaard copula.

• The use of copulas is necessary if one wants to consider arbitrary distribution

functions of the margins.

• The backtesting failures that we obtain for the USD portfolio (5.2) for the whole

time period under consideration result in a smaller value of χ2 in the case of the

Gumbel-Hougaard copula than in the case of the Frank copula. For the other

three portfolios (5.3) - (5.5), the results of the Frank copula are more reliable.

The reason for this behaviour is the (strong) positive dependence between the

USD/DEM rate and the GBP/DEM rate during that time period. For most

dates within that period the correlation between these two risk factors, mea-

sured over 250 business days, is greater than zero.19 In contrast, the correlations

between the JPY/DEM rate, the CHF/DEM rate and the FRF/DEM rate on

the one side and the GBP/DEM rate on the other side, measured over 250 busi-

ness days, are less than zero for a significant number of dates within the whole

time period under consideration.20 For the correlation between the CHF/DEM

rate and the GBP/DEM rate, for example, this behaviour is presented by the

green line in Figure 5.1.

19 The strong positive dependence between the USD/DEM rate and the GBP/DEM rate can also be seen from the correlation between these risk factors, based on all data in the period 02/01/1980 to 30/12/1998: ρUSD,GBP = 0.415.

20 The correlations between the JPY/DEM rate, the CHF/DEM rate and the FRF/DEM rate on the one side and the GBP/DEM rate on the other side, based on all data in the period 02/01/1980 to 30/12/1998, are ρJPY,GBP = 0.179, ρCHF,GBP = 0.040 and ρFRF,GBP = 0.164.


[Figure 5.4: Histogram of relative GBP/DEM rates and approximations of the data using a t-distribution (red line) with k = 4 and a Gaussian distribution (blue line). The right picture shows a magnified detail of the left picture. Axes: relative rate r vs. density f(r).]

As the estimator θ̂GH of the Gumbel-Hougaard copula parameter becomes one for a data sample that shows a negative dependence structure, and because θGH = 1 corresponds to the product copula (see (3.13)), all data with a negative dependence structure will be treated by the Gumbel-Hougaard copula as if they were independent, due to Theorem 4 (page 18). Therefore the pseudo random numbers that have to be generated within a Monte Carlo calculation of the value at risk do not show the correct dependence structure.

The comparison of the two copula methods, using the Gumbel-Hougaard copula on the one hand and the Frank copula on the other hand, shows that the Frank copula should be preferred in general. In situations where the data under investigation show a (strong) positive dependence structure, the Gumbel-Hougaard copula may also be used. Again, this is in accordance with [1, 2].

As the main improvement of the VaR calculation comes from the replacement

of the Gaussian margins by t-distributed margins, we want to investigate this point

in some more detail. In Figure 5.4 we show a normalised histogram of the relative

GBP/DEM rates, based on 4759 data points that can be obtained from the period

02/01/1980 to 30/12/1998. In addition, we show in the same figure the density

functions of the corresponding t-distribution and of the corresponding Gaussian dis-

tribution. The estimated parameters for both distributions are µ̂GBP = −0.0000528


[Figure 5.5: Estimator of the index k of the t-distribution of the relative GBP/DEM rates, measured over 250 business days (blue points) and measured over the period 02/01/1980 to 30/12/1998 (red line). Axes: date vs. k, semilogarithmic scale.]

and σ̂GBP = 0.00500. For the t-distribution we have k̂GBP = 4. As one can see

from Figure 5.4, the t-distribution describes the data much better than the Gaussian

distribution. This is both true for the region around the mean of the distribution,

which is obvious from the left part of Figure 5.4, and for the tail behaviour, as can

be seen from the right part of Figure 5.4. As the tail behaviour is relevant for VaR

calculations, we will come back to it during the discussion of Figure 5.6.

As step number four of the copula method (see page 21) states that the estimator of

the parameter k of the t-distribution of each margin has to be calculated for each time

step, we present in Figure 5.5 the evolution of k of the relative GBP/DEM rates. The

blue points that are shown in Figure 5.5 are based on 250 business days, in agreement

with our VaR calculations. One can see from this semilogarithmic plot that k̂ lies between three and ten for most of the dates. As a Gaussian distribution corresponds

to k → ∞, we conclude that the relative GBP/DEM rates can be described much

better by a t-distribution with a small number k than by a Gaussian distribution

in the period 02/01/1980 to 30/12/1998. We observed the same behaviour for the relative FX rates USD/DEM, JPY/DEM, CHF/DEM and FRF/DEM.

We already mentioned that two additional methods exist to compute the VaR, the

variance-covariance method and historical simulation. We now compare our numerical


results with the analytical results (see Table 5.3 again) obtained using these methods.

We first take a look at the backtesting results of the variance-covariance method.

As expected, the values of ᾱi lie very close to the corresponding values that we

obtained from the traditional method for each i = 1, 2, 3. This can be seen as a

kind of consistency check of the algorithm of the traditional method as, in the case

of a portfolio which is a linear combination of the risk factors (see equation (5.1)),

the variance-covariance VaR is the limit of the traditional Monte Carlo VaR for an

infinite number of MC steps. Because of this fact, the VaR results obtained from

the variance-covariance method are also less reliable than the VaR results from the

copula method.

Let us now discuss the historical simulation results. As one can see from Ta-

ble 5.3, the backtesting results are much better than in the case of the traditional

method, slightly better than the results obtained from the copula method with Gaus-

sian margins and about as good as the results obtained from the copula method using

t-distributed margins. The high quality of the backtesting results obtained from the

historical simulation is well known [9]. The reason lies in the simplicity of

historical simulation. This method does not need any model to describe the risk

factors and the dependence between different risk factors, i.e., one can neglect any

systematic errors due to model assumptions. In addition, no approximations (e.g.,

linear approximation) have to be used within the VaR calculation.

To get a deeper understanding of our numerical results, we now come back to the

traditional method and the copula method with t-distributed margins. In the upper

left part of Figure 5.6 we present the relative FX rates USD/DEM vs. GBP/DEM,

measured during the whole period under investigation, i.e., in the period 02/01/1980

to 30/12/1998. Based on these data points we estimated the parameters of the

distributions. For the relative FX rates GBP/DEM they are given on page 33. For

the relative FX rates USD/DEM we obtained µ̂USD = 0.0000216 and σ̂USD = 0.00732 for both distributions and k̂USD = 6 for the t-distribution.

With the parameters of the margins at hand we generated three sets of pairs of

pseudo random numbers. Each of these sets contains 4759 pairs of PRNs, i.e., the

same number of data as provided by the historical data.

The first set of PRNs was generated according to the traditional method, i.e., the

pairs of PRNs were generated according to a binormal distribution. The correlation

that we used to generate the PRNs is ρUSD,GBP = 0.415, as already given on page 32.

We present these PRNs in the lower left part of Figure 5.6. A comparison of this

figure with the historical data points gives a good indication of the rather poor quality


[Figure 5.6: Historical data of relative FX rates USD/DEM vs. GBP/DEM and corresponding pseudo random numbers, according to the traditional method and the copula method. Panels: relative FX rates USD/DEM vs. GBP/DEM (02/01/1980 - 30/12/1998, 4759 data points); 4759 PRNs according to the Gaussian copula with Gaussian margins; 4759 PRNs according to the Gumbel-Hougaard copula with t-distributed margins; 4759 PRNs according to the Frank copula with t-distributed margins. Axes: r1 (rUSD) vs. r2 (rGBP). See text for more details.]

of the backtesting results, obtained with the traditional method. Even if the shape

of the PRNs21 around the mean is similar to the one of the historical data, extreme

data points are missing in the lower left part of Figure 5.6. This is a consequence of

the thin tails of Gaussian distributions. However, the tail dependence is crucial for a

VaR calculation. Especially in our case of a linear portfolio (5.1), the values of the

simulated pairs of PRNs are directly related to the profits and losses of the portfolio.

Let us now turn to the right part of Figure 5.6. There we present pairs of PRNs

that were generated according to t-distributions (with parameters as given above)

that were linked together by copulas. For the upper right part of Figure 5.6 we used

the Gumbel-Hougaard copula with θGH;USD,GBP = 1.401, for the lower right part we

used the Frank copula with θFrank;USD,GBP = 3.077. One can see immediately that

both copula pictures contain many more outliers than the traditional picture. The

reason for this is the fat tails of the t-distributions. Although the outliers in the

copula pictures are not in the same places as in the historical picture, they give

a better approximation of the historical picture than the traditional picture. As a

consequence, the copula method in conjunction with t-distributed margins provides

21 The PRNs that have been generated using the binormal distribution yield a figure that resembles an ellipse. This behaviour is well known in statistics and can therefore be used as additional evidence that our numerical Monte Carlo algorithm works correctly in the traditional case.


VaR results that are much more reliable than the VaR results obtained from the

traditional method.

We finally want to make a remark on the three sets of PRNs that are shown

in Figure 5.6. As described in step 7 of the copula method (see page 23) and in

Appendix A.3 (see equations (A.15) and (A.16)), the starting points for all these PRNs are independent PRNs that are uniformly distributed in [0, 1]. To ensure a fair comparison,

we used the same set of these PRNs to generate the lower left part, the upper right

part and the lower right part of Figure 5.6.
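To make the construction of the copula pictures concrete, here is a minimal Python sketch of how pairs of PRNs with a Frank copula dependence and t-distributed margins can be generated by conditional inversion. The function names are our own, the closed-form inversion of the conditional Frank distribution is a standard result, and scipy's t quantile function is used as a shortcut instead of inverting (B.16)-(B.18) directly; this is an illustration under these assumptions, not the exact algorithm of Chapter 4.

```python
import numpy as np
from scipy import stats

def frank_pairs(n, theta, seed=None):
    """Draw n pairs (u, v) from the Frank copula (theta > 0) by inverting
    the conditional distribution of V given U = u analytically."""
    rng = np.random.default_rng(seed)
    u = rng.uniform(size=n)
    t = rng.uniform(size=n)
    v = -np.log1p(t * np.expm1(-theta)
                  / (np.exp(-theta * u) - t * np.expm1(-theta * u))) / theta
    return u, v

def frank_t_margins(n, theta, k1, mu1, sig1, k2, mu2, sig2, seed=None):
    """Map uniform copula coordinates to t-distributed margins; the scale
    gamma = sigma * sqrt((k-2)/k) reproduces the variance sigma^2, cf. (B.12)."""
    u, v = frank_pairs(n, theta, seed)
    r1 = stats.t.ppf(u, df=k1, loc=mu1, scale=sig1 * np.sqrt((k1 - 2) / k1))
    r2 = stats.t.ppf(v, df=k2, loc=mu2, scale=sig2 * np.sqrt((k2 - 2) / k2))
    return r1, r2
```

With θ = 3.077 and the margin parameters quoted above, 4759 such pairs reproduce the qualitative features of the lower right part of Figure 5.6.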


Chapter 6

Summary and Conclusions

In this thesis we have demonstrated an alternative method to compute the value at

risk of a financial portfolio. For this purpose, we have introduced the concept of

copulas and presented the most important properties of these functions. For the sake

of simplicity, we have restricted ourselves to two risk factors and therefore to two-

dimensional copulas only. We have discussed two representatives of the copula family

in detail, the Frank copula and the Gumbel-Hougaard copula. Using this tool, we

have modified two basic assumptions underlying the common method of computing

the VaR using Monte Carlo techniques: usually, the dependence structure between

two risk factors is assumed to be described by the correlation between these factors.

In particular, the joint distribution is assumed to be of the binormal kind. We have

replaced this dependence structure by a dependence structure that is defined by one

of the two copulas mentioned above. In doing so, we have replaced the role of the

correlation ρ by the parameters θFrank and θGH that determine the behaviour of the

copulas. Secondly, we have replaced the assumption that each risk factor obeys a

normal distribution by the assumption of t-distributed margins.

To test whether the copula ansatz improves the computation of the VaR, we

have considered four simple linear portfolios which consist of the FX rate GBP/DEM

and one out of the FX rates USD/DEM, JPY/DEM, CHF/DEM and FRF/DEM.

Assuming confidence levels of 90%, 95% and 99%, we have calculated the VaRs for

these portfolios on the basis of 250 banking days, using the following methods:

• traditional Monte Carlo simulation;

• copula Monte Carlo simulation, using both normally distributed margins and t-

distributed margins and using both the Frank copula and the Gumbel-Hougaard

copula;


• the variance-covariance method;

• historical simulation.

The computations and the corresponding backtesting of the results have been per-

formed on the basis of historical FX rates ranging over nineteen years. This means

that our backtesting statistics are based on approximately 4750 measurements.

The results that we have obtained in this thesis show that, at least for the portfolios

under consideration, the copula method in conjunction with t-distributed margins

is more reliable than both the traditional Monte Carlo method and the variance-

covariance method. In other words, for each confidence level the absolute deviations

of the percentages of the backtesting failures from the quantiles are smaller in the

case of the copula method than in the case of the traditional method and in the case of the variance-covariance method.

A comparison of the copula method with t-distributed margins with the historical

simulation shows that both methods result in VaR values that are of the same quality.

The backtesting values of these methods lie very close to the predicted values and

very close to each other.

With these results at hand we now want to formulate the answers to the three

questions that were asked in the introduction. The first question asked for the ap-

propriate marginal distributions of the risk factors. Based on our numerical data the

answer has to be that, for most dates within the nineteen years of investigation, the five FX rates under consideration can be described very well by t-distributions with a parameter k that is less than or equal to ten. In particular, the margins do not

show a Gaussian behaviour, which corresponds to k → ∞.

The second question asked for the appropriate joint distribution of the risk factors.

As discussed in detail in the previous chapter, a binormal distribution describes the

data well in the sense that the shape of the distribution resembles the shape of

the historical data. However, extreme values are missing if one uses a binormal

distribution. Therefore we suggest the Frank copula to describe the joint distribution.

In addition, if one is dealing with margins that show a positive dependence structure,

the joint distribution can also be described by the Gumbel-Hougaard copula.

The content of the third question was of a more technical nature. It concerned a

possibility of simulating pseudo random numbers using these marginal distributions

and this joint distribution. Whereas the necessary tools to generate independent

PRNs according to a t-distribution are given in Appendix A.1 and Appendix B,

Chapter 4 describes in detail how to generate pairs of PRNs (with arbitrary margins)


using a copula dependence structure. Furthermore, we have given detailed infor-

mation on what an algorithm looks like in the case of the Frank copula and the

Gumbel-Hougaard copula.

We finally want to summarise the application of copulas and the t-distribution in

financial risk management. In this thesis it has been shown that the use of copulas

and the t-distribution may improve the reliability of the VaR calculation of a financial

portfolio. To be able to decide whether or not one should prefer the copula method

to the traditional method, a supplementary investigation of the speed of the proposed

copula algorithm would be useful. As we considered two risk factors only, a natural

extension of studies in this field would have to include three or more risk factors.

Furthermore, there is a large number of other copulas [3], some of which may also

be suited for use in a VaR computation. Finally one could test additional marginal

distributions. For example, this might be necessary if one is using risk factors that

are different from FX rates. As discussed in detail, due to the use of copulas one is

completely free in the choice of the marginal distribution. The last point might be

the biggest advantage when applying copulas to the computation of the value at risk.


Appendix A

Generation of Pseudo Random Numbers

In this appendix we deal with the generation of pseudo random numbers. First we present a method that is based on the transformation of variables. It can be used for the generation of PRNs that obey a univariate distribution for which one can compute the inverse distribution function either analytically or numerically. In the next section we summarise the Box-Muller method, which shows how to generate normally distributed PRNs. We end this appendix with a generalisation of the Box-Muller method for the simultaneous generation of tuples of PRNs that are distributed according to a multivariate normal distribution.

All methods that we discuss are based on the assumption that one is able to generate independent PRNs that are uniformly distributed in the interval [0, 1]. As this is not within the scope of this thesis, we do not describe how to generate these numbers. The interested reader may find several algorithms for this task in any good textbook on numerical statistics (see, for example, [11]).

A.1 Transformation of Variables

A general method to generate PRNs according to a given distribution uses a trans-

formation of variables to obtain these numbers from uniformly distributed PRNs.

Starting with a uniformly distributed PRN u ∈ [0, 1] we want to generate a PRN r = r(u) with a given density f(r),

\int_a^b dr\, f(r) = 1 .  (A.1)

First we equate the integrated probability density to u,

F(r) = \int_a^r dr'\, f(r') = u .  (A.2)

We next have to invert the distribution function F. Then

r(u) = F^{-1}(u)  (A.3)

is a PRN with the desired distribution. In cases where the computation of F^{-1} cannot be done analytically, one either has to compute (A.3) numerically or one has to use a different algorithm for the generation of the PRNs, e.g., the Box-Muller method for normally distributed PRNs.
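A minimal Python sketch of this transformation-of-variables method follows; the exponential density used as the example is our own choice, picked because its F⁻¹ is available in closed form.

```python
import numpy as np

def inverse_transform(u, inv_cdf):
    """Map uniform PRNs u in [0, 1] through F^{-1}, cf. (A.3)."""
    return inv_cdf(u)

rng = np.random.default_rng(0)
u = rng.uniform(size=100_000)

# Example: the exponential density f(r) = lam * exp(-lam * r) on [0, inf)
# has F(r) = 1 - exp(-lam * r), hence F^{-1}(u) = -ln(1 - u) / lam.
lam = 2.0
r = inverse_transform(u, lambda q: -np.log1p(-q) / lam)
print(r.mean())  # close to 1 / lam = 0.5
```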

A.2 The Box-Muller Method

We now briefly summarise what is known in the literature as the Box-Muller method. This refers to an algorithm which is used to generate simultaneously two independent normally distributed PRNs from two independent uniformly distributed PRNs.

Assume that we want to generate r1 from a N(µ1, σ1²) distribution and r2 from a N(µ2, σ2²) distribution.22 If we denote the two uniformly distributed PRNs by u1 and u2, the introduction of polar coordinates, followed by a variable transformation as in A.1, shows (see [11] for details) that r1 and r2 are given by

r_1 = \sqrt{-2\sigma_1^2 \ln u_1}\, \sin(2\pi u_2) + \mu_1 ,  (A.4)
r_2 = \sqrt{-2\sigma_2^2 \ln u_1}\, \cos(2\pi u_2) + \mu_2 .  (A.5)
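A direct Python transcription of (A.4) and (A.5); the function name is our own, and in practice one guards against u1 = 0 before taking the logarithm.

```python
import numpy as np

def box_muller(n, mu=0.0, sigma=1.0, seed=None):
    """Box-Muller method: two independent N(mu, sigma^2) PRNs from each
    pair of independent uniform PRNs (u1, u2), cf. (A.4) and (A.5)."""
    rng = np.random.default_rng(seed)
    u1 = rng.uniform(size=n)
    u2 = rng.uniform(size=n)
    radius = np.sqrt(-2.0 * sigma**2 * np.log(u1))
    r1 = radius * np.sin(2.0 * np.pi * u2) + mu
    r2 = radius * np.cos(2.0 * np.pi * u2) + mu
    return r1, r2
```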

A.3 Pseudo Random Numbers According to the

Multinormal Distribution

Below we will discuss how to generate a multivariate normal distribution of n corre-

lated PRNs represented by a vector ~r. The density function we have to consider is

presented in equation (2.7).

The basic idea behind generating the desired PRNs is to apply what is called

Cholesky decomposition [14] to the inverse covariance matrix, provided that the inverse

exists. The ansatz of this decomposition is

C^{-1} = A^t A ,  (A.6)

22 See equation (2.2) on page 6 for the density function N(µi, σi²) of a normal distribution.


where C is given by (2.9). The matrix A in (A.6) is a lower triangular matrix,

A = \begin{pmatrix}
\alpha_{1,1} & 0 & \cdots & 0 \\
\alpha_{2,1} & \alpha_{2,2} & \ddots & \vdots \\
\vdots & \vdots & \ddots & 0 \\
\alpha_{n,1} & \alpha_{n,2} & \cdots & \alpha_{n,n}
\end{pmatrix} .  (A.7)

Using (A.6) we now perform a linear variable transformation in (2.7),

\vec{r} \to \vec{s} = A\, (\vec{r} - \vec{\mu}) .  (A.8)

One can show that the corresponding determinant of the Jacobian matrix is given by

\frac{\partial(r_1, \ldots, r_n)}{\partial(s_1, \ldots, s_n)} = \sqrt{\det C} .  (A.9)

Therefore, the transformed density function takes the form

f(\vec{s}\,) = \frac{1}{\sqrt{(2\pi)^n}} \exp\left\{ -\frac{1}{2}\, \vec{s}^{\,t} \vec{s} \right\} = \prod_{i=1}^{n} \frac{1}{\sqrt{2\pi}} \exp\left\{ -\frac{1}{2} s_i^2 \right\} .  (A.10)

Equation (A.10) is the density function of n uncorrelated, standard normally dis-

tributed variables. Therefore the simultaneous generation of n correlated, normally

distributed PRNs is achieved in two steps:

(a) Generate n uncorrelated, standard normally distributed PRNs s1, . . . , sn using

the Box-Muller method presented in Section A.2.

(b) Compute the desired PRNs r1, . . . , rn by the reverse transformation of (A.8),

\vec{s} \to \vec{r} = A^{-1} \vec{s} + \vec{\mu} .  (A.11)

As we only need the two-dimensional case in this thesis, we will now focus on

the generation of bivariate normally distributed PRNs. The n-dimensional case is a

simple generalisation and can be found, for example, in [9].

The first step is (see equations (A.4) and (A.5)):

(a) Generate two uncorrelated, standard normally distributed PRNs

s_1 = \sqrt{-2 \ln u_1}\, \sin(2\pi u_2) \quad \text{and}  (A.12)
s_2 = \sqrt{-2 \ln u_1}\, \cos(2\pi u_2)  (A.13)

from two uncorrelated PRNs u1 and u2 that are uniformly distributed in [0, 1].


From (2.9), (A.7) and condition (A.6) one can show that A can be written as

A = \frac{1}{\sqrt{\det C}} \begin{pmatrix} \sigma_2 \sqrt{1-\rho^2} & 0 \\ -\rho\, \sigma_2 & \sigma_1 \end{pmatrix} ,  (A.14)

where \det C = \sigma_1^2 \sigma_2^2 (1 - \rho^2) and ρ denotes the correlation coefficient (see (2.11)), ρ = ρ1,2 = c1,2/(σ1σ2). The correlation is normalised so that it can take all values in the interval [−1, 1]. For ρ = ±1 expression (A.14) is not defined. However, in this section we can neglect these cases, as ρ = ±1 means that the variables r1 and r2 are completely positively or negatively correlated. In this case we only have to simulate one PRN using the method described in Section A.2; the second PRN will be deterministic.

From (A.11) and (A.14) one finds that the second step mentioned above becomes:

(b) Perform the variable transformation s1, s2 → r1, r2 as follows:

r_1 = \sigma_1 s_1 + \mu_1 \quad \text{and}  (A.15)
r_2 = \sigma_2 \left( \rho\, s_1 + \sqrt{1 - \rho^2}\, s_2 \right) + \mu_2 .  (A.16)

r1 and r2 are the two desired, correlated PRNs that obey a bivariate normal distribution.
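Putting steps (a) and (b) together, a minimal Python sketch follows; the function name is our own, and the numerical parameters are the estimates quoted in Chapter 5 for the USD/DEM and GBP/DEM rates.

```python
import numpy as np

def bivariate_normal_prns(n, mu1, sig1, mu2, sig2, rho, seed=None):
    """Correlated bivariate normal PRNs: Box-Muller for s1, s2 according
    to (A.12)/(A.13), then the linear transformation (A.15)/(A.16)."""
    rng = np.random.default_rng(seed)
    u1 = rng.uniform(size=n)
    u2 = rng.uniform(size=n)
    s1 = np.sqrt(-2.0 * np.log(u1)) * np.sin(2.0 * np.pi * u2)
    s2 = np.sqrt(-2.0 * np.log(u1)) * np.cos(2.0 * np.pi * u2)
    r1 = sig1 * s1 + mu1
    r2 = sig2 * (rho * s1 + np.sqrt(1.0 - rho**2) * s2) + mu2
    return r1, r2

r1, r2 = bivariate_normal_prns(4759, 0.0000216, 0.00732,
                               -0.0000528, 0.00500, 0.415, seed=1)
print(np.corrcoef(r1, r2)[0, 1])  # close to 0.415
```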


Appendix B

The t-Distribution

In this appendix we summarise some features of the so-called Student-distribution or

t-distribution [15].

B.1 The Density Function

Let us start directly with the density function of a t-distribution with k degrees of

freedom,

h_k(r) = \frac{\Gamma\!\left(\frac{k+1}{2}\right)}{\Gamma\!\left(\frac{k}{2}\right)\, \Gamma\!\left(\frac{1}{2}\right) \sqrt{k}} \left( 1 + \frac{r^2}{k} \right)^{-\frac{k+1}{2}} .  (B.1)

To keep things simple, we restrict ourselves to k = 1, 2, 3, . . .. Therefore we only need

the following features of the Γ-function:

\Gamma(1) = 1 , \qquad \Gamma\!\left(\tfrac{1}{2}\right) = \sqrt{\pi} , \qquad \Gamma(k+1) = k\, \Gamma(k) .  (B.2)

It is well known in statistics [15] that the t-distribution converges as k → ∞ to

the standard normal distribution N (0, 1),

\lim_{k \to \infty} h_k(r) = \frac{1}{\sqrt{2\pi}} \exp\left\{ -\frac{r^2}{2} \right\} .  (B.3)

Motivated by this property we perform the following variable transformation,

r \to \frac{r - \mu}{\gamma} .  (B.4)

Under this transformation, the t-distribution becomes

f_k(r) = \frac{1}{\gamma}\, h_k\!\left( \frac{r - \mu}{\gamma} \right) = \frac{\Gamma\!\left(\frac{k+1}{2}\right)}{\Gamma\!\left(\frac{k}{2}\right)\, \Gamma\!\left(\frac{1}{2}\right) \sqrt{k}\, \gamma} \left( 1 + \frac{(r - \mu)^2}{k\, \gamma^2} \right)^{-\frac{k+1}{2}} ,  (B.5)


and fk converges to the normal distribution N(µ, γ²) in the limit k → ∞.

At this point we want to discuss the role of the two parameters µ and γ. By

construction, µ equals the mean of the distribution. However, γ2 as given in the

generalised t-distribution (B.5) is not the variance of fk. In particular, the variance

of the (generalised) t-distribution is infinite [15] for k = 1 and k = 2. For k ≥ 3 the

variance σ2 of the t-distribution is well defined. In this case, we can find a relation

between γ2 and σ2. Let us start with the definition of σ2,

σ2 = IE((r − µ)2) =

−∞

dr (r − µ)2fk(r) . (B.6)

Using the abbreviation

A(k, \gamma) \equiv \frac{\Gamma\!\left(\frac{k+1}{2}\right)}{\Gamma\!\left(\frac{k}{2}\right)\, \Gamma\!\left(\frac{1}{2}\right) \sqrt{k}\, \gamma}  (B.7)

and the translation invariance of the integral we obtain

\sigma^2 = A(k, \gamma) \int_{-\infty}^{\infty} dr\, r^2 \left( 1 + \frac{r^2}{k\, \gamma^2} \right)^{-\frac{k+1}{2}} .  (B.8)

The integral in (B.8) can be evaluated by partial integration. By using

r^2 \left( 1 + \frac{r^2}{k \gamma^2} \right)^{-\frac{k+1}{2}} = \frac{k \gamma^2}{k-1} \left( \left( 1 + \frac{r^2}{k \gamma^2} \right)^{-\frac{k-1}{2}} - \frac{d}{dr} \left[ r \left( 1 + \frac{r^2}{k \gamma^2} \right)^{-\frac{k-1}{2}} \right] \right)  (B.9)

we find

\sigma^2 = A(k, \gamma)\, \frac{k \gamma^2}{k-1} \int_{-\infty}^{\infty} dr \left( 1 + \frac{r^2}{(k-2) \left( \sqrt{\frac{k}{k-2}}\, \gamma \right)^2} \right)^{-\frac{(k-2)+1}{2}} - A(k, \gamma)\, \frac{k \gamma^2}{k-1} \left[ r \left( 1 + \frac{r^2}{k \gamma^2} \right)^{-\frac{k-1}{2}} \right]_{-\infty}^{\infty} .  (B.10)

The second part of the sum in equation (B.10) becomes zero for k ≥ 3. By comparing

the integral in the first part of (B.10) with the density function (B.5) of the generalised

t-distribution we obtain

\sigma^2 = A(k, \gamma)\, \frac{k \gamma^2}{k-1}\, A^{-1}\!\left( k-2,\, \sqrt{\frac{k}{k-2}}\, \gamma \right) .  (B.11)

We finally make use of (B.2) and (B.7) to get the result

\sigma^2 = \frac{k}{k-2}\, \gamma^2 .  (B.12)


In other words, γ is a parameter which influences the shape of the generalised t-

distribution and which is related to the variance σ2 via equation (B.12).

From (B.12) we again obtain the expected behaviour of γ2 in the limit k → ∞,

i.e., γ2 → σ2, which is due to the fact that the t-distribution converges to the normal

distribution.

Inserting (B.12) into (B.5) results in the following equation for the density function

of the generalised t-distribution:

f_k(r) = \frac{\Gamma\!\left(\frac{k+1}{2}\right)}{\Gamma\!\left(\frac{k}{2}\right)\, \Gamma\!\left(\frac{1}{2}\right) \sqrt{k-2}\, \sigma} \left( 1 + \frac{(r - \mu)^2}{(k-2)\, \sigma^2} \right)^{-\frac{k+1}{2}} \quad \text{for } k \ge 3 .  (B.13)
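For reference, a minimal Python sketch of the density (B.13), using log-gammas for numerical stability; the function name is our own.

```python
import math

def generalized_t_pdf(r, mu, sigma, k):
    """Density (B.13) of the generalised t-distribution with mean mu,
    variance sigma^2 and k >= 3 degrees of freedom."""
    if k < 3:
        raise ValueError("variance is only finite for k >= 3")
    # log of Gamma((k+1)/2) / (Gamma(k/2) Gamma(1/2) sqrt(k-2) sigma)
    log_norm = (math.lgamma((k + 1) / 2) - math.lgamma(k / 2)
                - math.lgamma(0.5) - 0.5 * math.log(k - 2) - math.log(sigma))
    z = (r - mu) ** 2 / ((k - 2) * sigma ** 2)
    return math.exp(log_norm - 0.5 * (k + 1) * math.log1p(z))
```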

B.2 The Distribution Function

In this section we describe the computation of the cumulative distribution function of the generalised t-distribution. As the equations that we present in the following are valid for all k ≥ 1, we use equation (B.5) instead of (B.13) to describe the density function. With the abbreviation (B.7) the quantity we are interested in is given by

F_k(r) = \int_{-\infty}^{r} dr'\, f_k(r') = A(k, \gamma) \int_{-\infty}^{r} dr' \left( 1 + \frac{(r' - \mu)^2}{k\, \gamma^2} \right)^{-\frac{k+1}{2}} .  (B.14)

Let us define

I(r; a, b, k) \equiv \int_{-\infty}^{r} dr' \left( r'^{\,2} + a\, r' + b \right)^{-\frac{k+1}{2}} .  (B.15)

Then we have

F_k(r) = A(k, \gamma) \left( k\, \gamma^2 \right)^{\frac{k+1}{2}} I\!\left( r;\, -2\mu,\, k \gamma^2 + \mu^2,\, k \right) .  (B.16)

The remaining task is to compute the function I. This can be done recursively. The two initial conditions are given by [15]

I(r; a, b, 1) = \frac{2}{\sqrt{4b - a^2}} \left[ \arctan\!\left( \frac{2r + a}{\sqrt{4b - a^2}} \right) + \frac{\pi}{2} \right] \quad \text{and} \quad I(r; a, b, 2) = \frac{2}{4b - a^2} \left( \frac{2r + a}{\sqrt{r^2 + a r + b}} + 2 \right) .  (B.17)

Furthermore, the recursive formula for k ≥ 3 is [15]

I(r; a, b, k) = \frac{4r + 2a}{(k-1)(4b - a^2)} \left( r^2 + a r + b \right)^{-\frac{k-1}{2}} + \frac{4k - 8}{(k-1)(4b - a^2)}\, I(r; a, b, k-2) .  (B.18)
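A minimal Python sketch of (B.15)-(B.18) follows; the function names are our own. The recursion reduces k by two until one of the initial conditions (B.17) applies, and (B.16) then yields the distribution function.

```python
import math

def I(r, a, b, k):
    """Recursive evaluation of the integral (B.15) via the initial
    conditions (B.17) and the recursion (B.18); k a positive integer."""
    d = 4.0 * b - a * a  # positive in our application, cf. (B.16)
    if k == 1:
        return 2.0 / math.sqrt(d) * (math.atan((2*r + a) / math.sqrt(d))
                                     + math.pi / 2)
    if k == 2:
        return 2.0 / d * ((2*r + a) / math.sqrt(r*r + a*r + b) + 2.0)
    return ((4*r + 2*a) / ((k - 1) * d) * (r*r + a*r + b) ** (-(k - 1) / 2)
            + (4*k - 8) / ((k - 1) * d) * I(r, a, b, k - 2))

def t_cdf(r, mu, gamma, k):
    """Distribution function (B.16) of the generalised t-distribution (B.5)."""
    A = (math.exp(math.lgamma((k + 1) / 2) - math.lgamma(k / 2)
                  - math.lgamma(0.5)) / (math.sqrt(k) * gamma))
    return (A * (k * gamma * gamma) ** ((k + 1) / 2)
            * I(r, -2*mu, k*gamma*gamma + mu*mu, k))
```

A quick sanity check: t_cdf(mu, mu, gamma, k) returns 0.5 for any k, as it must by symmetry.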


Appendix C

The Maximum Likelihood Method

In statistical data analysis, a common problem is to determine a probability density

function f of which only the functional form is known, while its parameters are

unknown. Let us assume that f depends on n variables r1, . . . , rn and contains k

parameters p1, . . . , pk. We write f in the form f(r1, . . . , rn ; p1, . . . , pk). Let us further

assume that a set of N (historical) data, ~rj ≡ (r1,j, . . . , rn,j)t where j = 1, . . . , N , is

available that obeys the probability density function. The task is therefore to estimate

the parameters p1, . . . , pk on the basis of these N data tuples.

For some probability density functions, the estimators of the parameters are well

known. For example, the parameters of the multinormal distribution function (2.7)

are the means and the variances of the variables and the correlations between the

variables. On the basis of a set of historical data, unbiased and consistent estimators

for these parameters are given by expressions (2.4), (2.5) and (2.12). However, it

is not always possible to find an analytical form to calculate parameter estimators

directly from historical data. For these probability density functions, the maximum

likelihood method [11] (ML method) provides a solution.

C.1 Theoretical Background

The quantity of central importance in the maximum likelihood method is given by

the likelihood function

L(\vec{r}_1, \ldots, \vec{r}_N;\, p_1, \ldots, p_k) \equiv \prod_{j=1}^{N} f(\vec{r}_j;\, p_1, \ldots, p_k) .  (C.1)

The maximum likelihood method states that the particular k-tuple (p̂1, . . . , p̂k) at which L reaches its maximum is a good estimator of the "real" k-tuple (p1, . . . , pk). In other words, we have to find a k-tuple (p̂1, . . . , p̂k) so that L(~r1, . . . , ~rN ; p1, . . . , pk) < L(~r1, . . . , ~rN ; p̂1, . . . , p̂k) for all possible (p1, . . . , pk) with (p1, . . . , pk) ≠ (p̂1, . . . , p̂k).

Here we assume that the likelihood function contains one unique maximum only. As

this is, of course, not true in general, we will come back to this point later in this

section.

From the above discussion it is obvious that (p̂1, . . . , p̂k) is a function of ~r1, . . . , ~rN, i.e., (p̂1, . . . , p̂k) = S(~r1, . . . , ~rN). Under the maximum likelihood method, the sampling function S is called the maximum likelihood estimator [15].

Often one does not maximise L directly but the logarithm of L,

l(\vec{r}_1, \ldots, \vec{r}_N;\, p_1, \ldots, p_k) \equiv \ln L(\vec{r}_1, \ldots, \vec{r}_N;\, p_1, \ldots, p_k) = \sum_{j=1}^{N} \ln f(\vec{r}_j;\, p_1, \ldots, p_k) .  (C.2)

To simplify the notation, we will suppress the dependence of l on ~r1, . . . , ~rN and use the abbreviation ~p ≡ (p1, . . . , pk)t. At the maximum point p̂ ≡ (p̂1, . . . , p̂k)t we have

\left. \frac{\partial l}{\partial p_i} \right|_{\hat{p}} = 0 \quad \text{for } i = 1, \ldots, k .  (C.3)

Therefore a Taylor expansion gives

l(\vec{p}\,) = l(\hat{p}\,) + \frac{1}{2} \sum_{i,j=1}^{k} \left. \frac{\partial^2 l}{\partial p_i\, \partial p_j} \right|_{\hat{p}} (p_i - \hat{p}_i)(p_j - \hat{p}_j) + \ldots  (C.4)

It follows that

L(\vec{p}\,) \simeq L(\hat{p}\,) \exp\left\{ \frac{1}{2} \left( \vec{p} - \hat{p} \right)^{t} \left. G \right|_{\hat{p}} \left( \vec{p} - \hat{p} \right) \right\} .  (C.5)

Comparing (C.5) with equation (2.7) shows that the likelihood function can be de-

scribed by a multinormal distribution in the vicinity of p̂. The corresponding covariance matrix is given by C = -\left( \left. G \right|_{\hat{p}} \right)^{-1} with the matrix \left. G \right|_{\hat{p}} defined by

\left( \left. G \right|_{\hat{p}} \right)_{i,j} = \left. \frac{\partial^2 l}{\partial p_i\, \partial p_j} \right|_{\hat{p}} .  (C.6)

From (C.5) it follows that maximum likelihood estimators are asymptotically nor-

mally distributed. This is one of the reasons why the maximum likelihood method

works for a wide range of statistical problems.

Let us come back to the general problem of finding the maximum of the likelihood

function L. If the analytical form of the probability density function f is too compli-

cated, it is possible that one cannot find the maximum analytically. In this case one

has to maximise L(~p) or l(~p) numerically to obtain the parameter estimator p̂. The

following two sections provide examples of this situation.


C.2 Estimation of the Degrees of Freedom of the

t-Distribution Using the ML Method

First we want to discuss the calculation of the estimators of the parameters µ, σ and

k of the generalised t-distribution (B.13) for k = 3, 4, 5, . . .. In the notation of this appendix, the density is given by

f(r;\, \mu, \sigma, k) = \frac{\Gamma\!\left(\frac{k+1}{2}\right)}{\Gamma\!\left(\frac{k}{2}\right)\, \Gamma\!\left(\frac{1}{2}\right) \sqrt{k-2}\, \sigma} \left( 1 + \frac{(r - \mu)^2}{(k-2)\, \sigma^2} \right)^{-\frac{k+1}{2}} .  (C.7)

For a given data sample r1, . . . , rN , the logarithm of the likelihood function is

l(r_1, \ldots, r_N;\, \mu, \sigma, k) = \sum_{j=1}^{N} \ln f(r_j;\, \mu, \sigma, k) .  (C.8)

In principle the next step would consist in solving the following set of equations,

\left. \frac{\partial l}{\partial \mu} \right|_{\hat\mu, \hat\sigma, \hat k} = 0 ,  (C.9)

\left. \frac{\partial l}{\partial \sigma} \right|_{\hat\mu, \hat\sigma, \hat k} = 0 \quad \text{and}  (C.10)

\left. \frac{\partial l}{\partial k} \right|_{\hat\mu, \hat\sigma, \hat k} = 0 ,  (C.11)

in order to obtain µ̂, σ̂ and k̂. Instead of solving (C.9) and (C.10) we now make use of the results from Appendix B.1. There we have shown that µ and σ² are the mean and the variance of the generalised t-distribution. Therefore we can use the estimators (2.4) and (2.5) to calculate µ̂ and σ̂.23

The remaining task is the calculation of k̂. As this cannot be done analytically by solving (C.11), we compute k̂ by a numerical maximisation of the logarithm of the likelihood function (C.8) with respect to k, already using the results for µ̂ and σ̂.

To give an example, we consider the relative changes of the FX rate GBP/DEM for

the period 02/01/1980 to 30/12/1998 (see Chapter 5 for details). Using the estimators

(2.4) and (2.5) for µ̂ and σ̂², we show in Figure C.1 the logarithm of the likelihood function

l(k) = l\left( r_1, \ldots, r_N;\, \hat\mu, \hat\sigma, k \right)  (C.13)

23 One can easily show that solving equation (C.9) also gives result (2.4). However, equation (C.10) results in

\hat\sigma^2 \sum_{j=1}^{N} \frac{1}{(\hat k - 2)\, \hat\sigma^2 + (r_j - \hat\mu)^2} = \frac{N \hat k}{(\hat k + 1)(\hat k - 2)}  (C.12)

which cannot be solved analytically for N ≥ 3.


[Figure C.1: The logarithm of the likelihood function l(k) vs. k. The point for k = ∞ (normal distribution) is marked separately.]

as a function of k.24 For completeness, we also show the point for k = ∞, i.e., for

the assumption of a normal distribution. As one might guess from Figure C.1, which

results in k̂ = 4, the numerical computation of the estimator k̂ entails no pitfalls.

This statement is also valid if the data sample is much smaller than in the example

above, for example if one has N + 1 = 250.

In our numerical simulations, the procedure to find k̂ is the following: we start at

k = 3 and compute l(k = 3). Then we set k = 4 and compute l(k = 4) and so on.

We do these calculations until we obtain

l(k = i) > l(k = i − 1) and l(k = i) > l(k = i + 1) . (C.14)

Finally we set k̂ = i.

Because of limited numerical accuracy it might happen that condition (C.14) does not lead to an end of the algorithm just mentioned. In this case our algorithm to find k̂ will stop if

|l(k = i - 1) - l(k = i)| < \varepsilon  (C.15)

where ε = 10^{-6}.
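A minimal Python sketch of this search follows; the function names and the upper cut-off k_max are our own additions.

```python
import math

def log_likelihood(data, mu, sigma, k):
    """Log-likelihood (C.8) of the generalised t-distribution (C.7)."""
    log_norm = (math.lgamma((k + 1) / 2) - math.lgamma(k / 2)
                - math.lgamma(0.5) - 0.5 * math.log(k - 2) - math.log(sigma))
    return sum(log_norm - 0.5 * (k + 1)
               * math.log1p((r - mu) ** 2 / ((k - 2) * sigma ** 2))
               for r in data)

def estimate_k(data, mu, sigma, k_max=200, eps=1e-6):
    """Grid search for k_hat as described in the text: increase k from 3
    until l(k) decreases (C.14), or until successive values differ by
    less than eps (C.15); k_max is a safety guard of our own."""
    prev = log_likelihood(data, mu, sigma, 3)
    for k in range(4, k_max + 1):
        cur = log_likelihood(data, mu, sigma, k)
        if cur < prev or abs(cur - prev) < eps:
            return k - 1
        prev = cur
    return k_max
```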

24 For the period 02/01/1980 to 30/12/1998 we have N = 4759, µ̂ = −0.0000528 and σ̂ = 0.00500 (see page 33).


C.3 ML Estimation of the Copula Parameter

The second example of an application of the maximum likelihood method is the

calculation of the estimator of the copula parameter. As the focus in this thesis lies

on the Frank copula (3.7) and the Gumbel-Hougaard copula (3.11), we consider in

this section a two-dimensional copula C that can be parameterised by one copula

parameter θ.

Let us denote the distribution functions of the margins by F1 and F2, respectively.

The corresponding probability density functions are f1 = ∂F1/∂r1 and f2 = ∂F2/∂r2.

From (4.5) we find for the logarithm of the likelihood function

l(\theta) = l\big( (r_{1,1}, r_{2,1}), \ldots, (r_{1,N}, r_{2,N});\, \theta \big) = \sum_{j=1}^{N} \ln\left( \left. \frac{\partial^2}{\partial u\, \partial v}\, C(u, v) \right|_{u = F_1(r_{1,j}),\, v = F_2(r_{2,j})} f_1(r_{1,j})\, f_2(r_{2,j}) \right) .  (C.16)

Further simplification gives

l(\theta) = \sum_{j=1}^{N} \ln\left( \left. \frac{\partial^2}{\partial u\, \partial v}\, C(u, v) \right|_{u = F_1(r_{1,j}),\, v = F_2(r_{2,j})} \right) + \sum_{j=1}^{N} \big( \ln f_1(r_{1,j}) + \ln f_2(r_{2,j}) \big) .  (C.17)

The second sum in (C.17) does not depend on θ. Therefore it is sufficient to maximise the function

\tilde{l}(\theta) = \sum_{j=1}^{N} \ln\left( \left. \frac{\partial^2}{\partial u\, \partial v}\, C(u, v) \right|_{u = F_1(r_{1,j}),\, v = F_2(r_{2,j})} \right)  (C.18)

with respect to θ to find θ̂.

Let us make some remarks on result (C.18):

• For the two copulas of interest, the partial derivative in (C.18) is given by

equations (4.6) and (4.7).

• If we consider t-distributed margins, the calculation of the distribution functions

Fi(ri,j) is done using (B.16)-(B.18).

• If we consider normally distributed margins, the distribution functions Fi(ri,j)

are given by (3.15). As (3.15) cannot be evaluated analytically, we can use the approximation method presented in [16].


[Figure C.2: The modified likelihood function l̃(θ) vs. θ.]

In general, l̃(θ) has to be maximised numerically. For the period 02/01/1980 to 30/12/1998 we present in Figure C.2 a typical curve of l̃(θ). The margins under

consideration are the relative FX rates GBP/DEM, described by a t-distribution with

k = 4, and the relative FX rates USD/DEM, described by a t-distribution with k = 6.

As one can see from Figure C.2, only one global maximum occurs in that figure. In

particular, no local maxima can be seen. We observed this behaviour at all historical

dates that we tested. We therefore conclude that the maximisation algorithm that

we used to find θ̂ is numerically stable and reliable.
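As an illustration, a minimal Python sketch of the maximisation of (C.18) for the Frank copula; the closed form of the Frank copula density is a standard result (the thesis's equation (4.6) is not reproduced here), u and v stand for the transformed margins F1(r1,j) and F2(r2,j), and the restriction to θ > 0 with a bounded search interval is our own simplification.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def frank_log_density(u, v, theta):
    """Log of the Frank copula density c(u,v) = d^2 C / (du dv), theta > 0."""
    em1 = np.expm1(-theta)  # e^{-theta} - 1, negative for theta > 0
    num = np.log(theta) + np.log(-em1) - theta * (u + v)
    den = 2.0 * np.log(np.abs(em1 + np.expm1(-theta * u)
                              * np.expm1(-theta * v)))
    return num - den

def fit_frank_theta(u, v):
    """Maximise the modified likelihood (C.18) over theta numerically."""
    neg_l = lambda theta: -np.sum(frank_log_density(u, v, theta))
    res = minimize_scalar(neg_l, bounds=(1e-6, 50.0), method="bounded")
    return res.x
```

For data with negative dependence one would extend the search to θ < 0; the Frank copula, unlike the Gumbel-Hougaard copula, admits this case.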


Bibliography

[1] P. Embrechts, A. McNeil, D. Straumann, Correlation: Pitfalls and Alternatives,

RISK (May 1999), pages 69-71.

[2] P. Embrechts, A. McNeil, D. Straumann, Correlation and Dependence in Risk

Management: Properties and Pitfalls, published in Risk Management: Value at

Risk and Beyond, edited by M. Dempster, Cambridge University Press, Cam-

bridge (2002).

[3] R.B. Nelsen, An Introduction to Copulas, Springer, New York (1999).

[4] D. Duffie, J. Pan, An Overview of Value at Risk, The Journal of Derivatives, vol.

4, no. 3 (1997), pages 7-49.

[5] N.M. Zinchenko, Heavy-tailed Models in Finance and Insurance: A Survey, The-

ory of Stochastic Processes, vol. no. 1-2 (2001), pages 346-362.

[6] M.J. Frank, On the simultaneous associativity of F (x, y) and x + y − F (x, y),

Aequationes Math. 19 (1979), pages 194-226.

[7] T.P. Hutchinson, C.D. Lai, Continuous Bivariate Distributions, Emphasising

Applications, Rumsby Scientific Publishing, Adelaide (1990).

[8] Central Bank of the Federal Republic of Germany, Notes on Principle I, Part

VII, Section 34, January 2001.

[9] H.-P. Deutsch, Derivatives and Internal Models, 2nd edition, Palgrave (2002).

[10] Central Bank of the Federal Republic of Germany, Statistical Supplement “Ex-

change Rate Statistic” to the Monthly Report (January 1980 - December 1998).

[11] S. Brandt, Data Analysis, 3rd edition, Springer (1999).

[12] A. Sklar, Fonctions de répartition à n dimensions et leurs marges, Publ. Inst. Statist. Univ. Paris 8 (1959), pages 229-231.


[13] A. Sklar, Random Variables, Distribution Functions, and Copulas – a Personal

Look Backward and Forward, published in Distributions with Fixed Marginals

and Related Topics, edited by L. Ruschendorf, B. Schweizer and M.D. Taylor,

Institute of Mathematical Statistics, Hayward, CA (1996), pages 1-14.

[14] J. Stoer, R. Bulirsch, Introduction to Numerical Analysis, 3rd edition, Springer

(2002).

[15] I.N. Bronshtein, K.A. Semendyayev, Handbook of Mathematics, 3rd edition,

Springer (1997).

[16] J.C. Hull, Options, Futures, and Other Derivatives, 4th edition, Prentice Hall

(2000).
