Risk Analysis & Modelling Lecture 3: Underwriting Risk


Course Website: ram.edu-sys.net

Course Email: [email protected]

Underwriting Risk

Underwriting Risk is one of the primary risks faced by Insurance Companies

It is the Risk that the Insurer might experience an Underwriting Loss (Total Claims greater than Total Premiums)

In order to model Underwriting Risk it is necessary to model the two primary uncertainties faced by Insurance Companies: the number of Claims (Claim Frequency) and the size of those Claims (Claim Severity)

Actuarial Science uses Statistical Distributions to describe the range of possibilities for both the Frequency and Severity of Claims or Losses

The Frequency-Severity model is arguably the single most important model in Actuarial Science and is also the basis of the Reserve Risk and CAT Risk Models we will look at on the course

It is also the basis of many Pricing Models

The Frequency-Severity Model is simple to understand and use as long as you have an understanding of the Distributions it uses…

Underwriting Risk & The Frequency-Severity Model

[Figure: the Frequency-Severity Model. A distribution describing the Frequency of Losses (x-axis: Number of Claims Over Next Day, y-axis: Probability) is combined with a distribution describing the Severity of Losses to give the Distribution of Total Losses; comparing this with the Total Premium gives the Underwriting PROFIT or LOSS]

Distributions - Describing the Range of Outcomes

In the last lecture we introduced the idea of describing the possible range of outcomes and the probability of those outcomes as a useful tool in quantifying Risk

This essentially gave us a probability histogram showing the relationship between the outcome and the probability of the outcome

We were dealing with discrete random variables which could only assume a fixed (or countable) number of values

When the random variable is discrete this relationship between the probability and the outcome is known as the Probability Mass Function

Probability Histogram

[Figure: probability histogram. x-axis: Change In Portfolio Value (-10%, -5%, 0%, 5%, 10%); y-axis: Probability (0% to 50%)]

The probability histogram relates the range of outcomes to the probabilities of those outcomes

This random variable is discrete because there is a finite number of outcomes (5 in total)

Probability Mass Function

The Probability Mass Function (PMF) (also known as the Probability Function) specifies the relationship between the outcome and the probability of the outcome

For example, PMF(-10%) = 5%: a return of -10% has a probability of 5%

In the case of our portfolio the PMF relates the size of a loss or gain to the probability of that loss or gain

The PMF can be a mathematical function or a simple table of probabilities

The set of values that the random variable described by the Probability Mass Function can take is called the Support; in this case the Support is the set of values (-10%, -5%, 0%, 5%, 10%)

Properties of the Probability Mass Function

The first property of the Probability Mass Function (PMF) is that the probabilities relating to the various outcomes cannot be negative:

$$PMF(X) \ge 0$$

The second property of the PMF is that if we sum the probabilities for all the possible outcomes that can occur (N in total) we get a value of 1:

$$\sum_{i=1}^{N} PMF(X_i) = 1$$

Probability of Poor Performance

One important statistic frequently used in the measurement of Risk is the probability of the outcome being worse than some specified value

For example, we might want to know the probability of the return on our portfolio being less than or equal to, say, -5%

This probability is also known as the Quantile

It can be calculated by summing the probabilities of all the outcomes equal to or worse than -5% (20% + 5% = 25%)

We could do this for all the possible outcomes for our portfolio (-10%, -5%, 0%, 5%, 10%) to obtain a different way of representing the distribution…

Cumulative Distribution Graph

[Figure: cumulative distribution graph. x-axis: Portfolio Return (-10%, -5%, 0%, 5%, 10%); y-axis: Probability of Return Less Than Or Equal (0% to 100%)]

The probability that the return will be less than or equal to 0 is 45%

Cumulative Distribution Function

The Cumulative Distribution Function (CDF) is one of the most important statistical concepts on this course

It specifies the probability of observing a random variable being less than or equal to some value (A) and is related to the Probability Mass Function (PMF) by the following relationship:

$$CDF(A) = \sum_{X_i \le A} PMF(X_i)$$

Here CDF(A) gives the probability of observing the random variable at or below the level A (the Quantile); this is equal to the sum of the probabilities of all the possible outcomes (given by the PMF) less than or equal to A

Because the PMF is always positive or zero, the CDF is always a monotonic function: it never decreases as A increases

PMF & CDF Example

The following table gives values for the PMF of a random variable, evaluate the CDF at 2 and 4

X    PMF(X)
1    10%
2    20%
3    30%
4    30%
5    10%
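As a rough check on this example, the relationship between the PMF and the CDF can be sketched in a few lines of Python (Python is used here only as a convenient illustration; it is not part of the course workbook, and the names `pmf` and `cdf` are chosen for this sketch). Summing the table gives 30% at 2 and 90% at 4.

```python
# PMF from the example table: outcome -> probability
pmf = {1: 0.10, 2: 0.20, 3: 0.30, 4: 0.30, 5: 0.10}

def cdf(a, pmf):
    """Sum the PMF over every outcome less than or equal to a."""
    return sum(p for x, p in pmf.items() if x <= a)

print(round(cdf(2, pmf), 2))  # CDF at 2
print(round(cdf(4, pmf), 2))  # CDF at 4
```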

The Poisson Distribution & the Frequency of Losses

The most widely used distribution for the Frequency (number) of Losses in Actuarial Models is the Poisson Distribution

Like the distributions we have seen so far the Poisson Distribution is discrete, however its Probability Mass Function is calculated using a mathematical formula

Any random event whose occurrence across time is independent but occurs at a constant rate will follow a Poisson Distribution

Examples of Poisson Distributed random events include the number of calls received by a call center in an hour, the number of hurricanes in a year, the number of fires that will occur in a property insurance portfolio, the number of car accidents that an individual will have in a year…

All that is needed to describe the Poisson Distribution is the Average frequency of the event

The Poisson Distribution

The Probability Mass Function (PMF) for the Poisson Distribution is:

$$P(x) = \frac{\lambda^{x} e^{-\lambda}}{x!} \quad \text{for } x = 0, 1, 2, 3, \ldots$$

This gives us the probability of a Poisson Distributed random variable having a value of x for a given average $\lambda$

Where e is the natural number (2.71828…)

A Poisson Distributed random variable can take on any non-negative integer value (0, 1, 2, 3, …): the Support

In Excel the Poisson PMF can be calculated by typing =Poisson(x,λ,FALSE)

So the probability of a Poisson distributed random variable with an average of 5 having a value of 3 is =Poisson(3,5,FALSE)

The CDF for the Poisson Distribution is just the sum of the PMF for all values less than or equal to x:

$$CDF(x) = \sum_{y=0}^{x} \frac{\lambda^{y} e^{-\lambda}}{y!} \quad \text{for } x = 0, 1, 2, 3, \ldots$$

In Excel the CDF for the Poisson Distribution can be calculated as =Poisson(x,λ,TRUE)

So the probability of a Poisson distributed random variable with an average of 5 being LESS THAN OR EQUAL TO 3 can be calculated as =Poisson(3,5,TRUE)
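The two Poisson formulas can be sketched outside Excel as follows. This is a minimal Python illustration under the assumption that the average is written `lam`; the function names `poisson_pmf` and `poisson_cdf` are invented for this sketch and are not functions in the course workbook.

```python
import math

def poisson_pmf(x, lam):
    """Probability of exactly x events when the average is lam."""
    return lam ** x * math.exp(-lam) / math.factorial(x)

def poisson_cdf(x, lam):
    """Probability of x or fewer events: sum of the PMF from 0 to x."""
    return sum(poisson_pmf(y, lam) for y in range(x + 1))

print(poisson_pmf(3, 5))      # same quantity as =Poisson(3,5,FALSE)
print(poisson_cdf(3, 5))      # same quantity as =Poisson(3,5,TRUE)
print(1 - poisson_cdf(3, 5))  # probability of MORE than 3 events
```

The last line uses the fact that the probability of exceeding a value is one minus the CDF at that value.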

Poisson PMF Average = 15

[Figure: Poisson PMF with an average of 15. x-axis: number of events (0 to 24); y-axis: probability (0 to 0.1)]

The probability of a Poisson Distributed Random Variable with an Average of 15 being equal to 12 is 8.28%: =POISSON(12,15,FALSE) gives 0.0828

Poisson CDF Average = 15

[Figure: Poisson CDF with an average of 15. x-axis: number of events (0 to 24); y-axis: cumulative probability (0 to 1)]

The probability of a Poisson Distributed Random Variable with an Average of 15 being less than or equal to 12 is 26.76%: =POISSON(12,15,TRUE) gives 0.2676

Poisson Distribution Review Questions

A Portfolio of Policies will have an average number of losses over the next year of 100, and the number of losses is believed to follow a Poisson Distribution

What is the probability of exactly 100 losses occurring over the next year?

What is the probability of the Number of Losses being less than or equal to 120?

What is the probability of the Number of Losses being greater than 120?

Claim Frequency and the Poisson Distribution

One reason for the popularity of the Poisson Distribution to describe the frequency of claims is its simplicity

The Insurer only has to estimate the Average number of claims to describe the entire range of outcomes for the frequency

In addition, if the claims on a portfolio of policies are independent and each policy has approximately an equal probability of generating a claim at any point in time (constant intensity) then the claim frequency must be Poisson Distributed…

Poisson Distribution & Claim Frequency

[Figure: a row of Insurance Policies, each with Probability Claim Today = X, feeding into a Poisson PMF (x-axis: number of claims, 0 to 24; y-axis: probability, 0 to 0.1)]

If Claims are independent and have constant intensity, the distribution of Claim Frequency Over the Next Month must be Poisson

How well does the Poisson Distribution fit Actual Claim Frequencies

As with any assumption it is important to see if the Poisson Distribution fits actual claim frequencies

Claim Frequency Data is often presented as the number of policies in a portfolio having 0,1,2.. Claims over a period

From this data we can calculate the average number of claims per policy and use this to compare the actual distribution of claim frequency to the Poisson…..

Empirical Motor Insurance Claims Frequency Data

Number Claims    Number of Policies    % Empirical Frequency    Poisson Predicted
0                88585                 88.585%                  88.410%
1                10577                 10.577%                  10.890%
2                779                   0.779%                   0.671%
3                54                    0.054%                   0.028%
4                4                     0.004%                   0.001%
5                1                     0.001%                   0.000%
6                0                     0.000%                   0.000%
TOTAL            100000

Average Frequency Per Policy: 0.123

779 out of 100000 or 0.779% had a frequency of 2 Claims

The Poisson Distribution with an average of 0.123 predicts that 0.671% of policies will have a frequency of 2 Claims: =Poisson(2,0.123,FALSE) (the 0.671% shown is based on the unrounded average of 0.12318)
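The comparison between the empirical and the Poisson frequencies can be reproduced with a short Python sketch. It is an illustration only, not part of the course spreadsheet: the dictionary `policies` simply re-enters the motor claims data quoted on this slide, and the average is recomputed from that data rather than taken as the rounded 0.123.

```python
import math

# Number of claims -> number of policies (motor claims data from the slide)
policies = {0: 88585, 1: 10577, 2: 779, 3: 54, 4: 4, 5: 1, 6: 0}

total_policies = sum(policies.values())
total_claims = sum(k * n for k, n in policies.items())
average = total_claims / total_policies  # average claims per policy

for k, n in policies.items():
    empirical = n / total_policies
    predicted = average ** k * math.exp(-average) / math.factorial(k)
    print(f"{k}  {empirical:8.3%}  {predicted:8.3%}")
```

Each printed row holds the claim count, the empirical share of policies and the share predicted by the Poisson Distribution, so the two columns can be compared directly.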

Poisson vs Empirical Frequency

[Figure: bar chart comparing Empirical Frequency with Poisson Frequency. x-axis: Claim Frequency Per Policy (0 to 6); y-axis: Probability (0% to 100%)]

Discrete Vs Continuous Random Variables

Some random variables are naturally discrete, such as the roll of a die or the frequency of claims experienced by an insurance company

A larger class of random variables are not naturally discrete and have a limitless range of values

The return on the portfolio is a good example: we have artificially made it discrete but in reality it could take on any value such as 1.654% or 5.3421%

The cost of repairing a car after an accident could be $5672.55

If we make continuous random variables artificially discrete for the purpose of applying discrete distributions, we will not be modelling them accurately

However, when we attempt to describe or model continuous random variables we encounter an interesting problem…

Zero Probability

The problem with continuous random variables is that they can take on a limitless (or very large) range of values

What this means is that the probability of any given outcome is very small and effectively zero!

What is the probability of a person selected at random being 1.6744234 meters tall?

What is the probability of a random pebble picked up on a beach weighing 0.241312 kg?

What is the probability of your portfolio having a return of 1.23542%?

What is the probability that a uniform random number will be 0.154345435123445?

For continuous random variables we can only talk about the probability of the outcome being within a range of values, such as the probability of a uniform random number being between 0.1 and 0.3

Quantifying Continuous Random Variables using the CDF

The standard way to quantify the behaviour of a continuous random variable is by using the Cumulative Distribution Function (CDF)

The CDF of a continuous random variable gives the probability of it being below a given value (Quantile)

The CDF for a continuous random variable is nearly always a mathematical function which takes as its input a value and from this calculates the probability of observing the random variable below that value

CDF for a Uniform Random Variable

The simplest CDF is that of a Standard Uniform Random Variable (U), and is simply:

$$P(\tilde{U} \le X) = CDF(X) = X \quad \text{for } 0 \le X \le 1$$

The support of this CDF is all values between 0 and 1

So for example if we want to know the probability of a uniform random value being less than 0.65:

$$P(\tilde{U} \le 0.65) = CDF(0.65) = 0.65$$

0.65 or 65% of the time the uniform random variable will be below the value of 0.65

Cumulative Density for Uniformly Distributed Numbers

[Figure: CDF(X) = X plotted as a straight line. x-axis: X (0 to 1); y-axis: CDF(X) (0 to 1)]

Probability Density Function (PDF)

Although the probability of each possible outcome for a continuous random variable is effectively zero (the probability mass is zero), some of the outcomes are more clustered than others

How clustered or dense outcomes are about a given point is measured by the Probability Density

Probability Density is measured using the Probability Density Function (PDF) and can be thought of as the continuous random variable’s equivalent to the Probability Mass Function (PMF)

The higher the density of outcomes at a point the more likely an outcome is to appear in the region of that point

Mathematically, the PDF measures the slope of the CDF at a value

Even though its meaning is abstract, the PDF is one of the most common tools with which to represent or visualise the nature of a continuous random variable, mainly because it is a bit like Probability Mass

Probability Density for Uniformly Distributed Numbers

[Figure: PDF of the uniform distribution. x-axis: value (-0.5 to 1.5); y-axis: Probability Density (0 to 1.2)]

Uniformly distributed random numbers have a constant density of 1 over the range 0 to 1, and 0 elsewhere

There is a zero density outside the range 0 to 1 because there are no values there!

Uniform Density Graph

[Figure: scatter of uniform random points over the range 0 to 1 on both axes]

Points are spread out uniformly

Uniform Distribution PDF and CDF

[Figure: two panels. PDF panel: uniform density over -0.5 to 1.5 (y-axis 0 to 1.2). CDF panel: CDF(X) against X (both 0 to 1)]

Area under the PDF equals the CDF

Slope of the CDF (1) equals the PDF

Mathematical Relationship between PDF and CDF

Mathematically we can express the relationship between the PDF and CDF for continuous random variables using calculus

Using integration we can express the relationship between the area under the PDF and the CDF as follows (where LB is the lower bound on the random variable, 0 in the case of the uniform distribution):

$$cdf(a) = \int_{LB}^{a} pdf(x)\,dx$$

Using differentiation we can express the relationship between the slope of the CDF and the PDF as:

$$pdf(x) = \frac{d\,cdf(x)}{dx}$$

Mathematical Uses of the PDF

Traditionally, calculus was central to the manipulation of distributions for continuous random variables

Recall from lecture 2 we stated that the average of a discrete random variable was:

$$E(\tilde{x}) = \sum_{i=1}^{n} p_i \cdot x_i$$

For a continuous random variable this is calculated using the PDF and integration:

$$E(x) = \int_{LB}^{UB} pdf(x) \cdot x \, dx$$

In the case of a uniformly distributed random variable this would be:

$$E(x) = \int_{0}^{1} 1 \cdot x \, dx = \left[\frac{x^2}{2}\right]_{0}^{1} = \frac{1}{2}$$

We stated that the variance for a discrete random variable was:

$$Var(\tilde{r}) = E\left((\tilde{r} - E(\tilde{r}))^2\right) = \sum_{i=1}^{n} p_i \cdot (r_i - E(\tilde{r}))^2$$

We can calculate the variance of a uniformly distributed random variable as:

$$E\left((x - E(x))^2\right) = \int_{0}^{1} 1 \cdot \left(x - \frac{1}{2}\right)^2 dx = \left[\frac{x^3}{3} - \frac{x^2}{2} + \frac{x}{4}\right]_{0}^{1} = \frac{1}{3} - \frac{1}{2} + \frac{1}{4} = \frac{1}{12}$$

Let us verify these results with a Monte Carlo Simulation in Excel…
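The same Monte Carlo check can be sketched outside Excel. The Python below is only an illustration of the idea (the lecture itself does this in the spreadsheet); the seed and the sample size of 100,000 are arbitrary choices made for this sketch.

```python
import random

random.seed(1)  # arbitrary seed so the run is repeatable
n = 100_000
sample = [random.random() for _ in range(n)]  # plays the role of =RAND()

mean = sum(sample) / n
variance = sum((u - mean) ** 2 for u in sample) / n

print(mean)      # should be close to 1/2
print(variance)  # should be close to 1/12 = 0.0833...
```

With a sample this large the simulated mean and variance should agree with the calculus results to roughly two decimal places; a smaller sample will be noisier.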

Other Types of Continuous Random Variables

Most of the Distributions we will use on this course will be Continuous

The Continuous Distributions we will look at will include the Normal, Log-Normal, Exponential and Gamma Distributions

Claim Severities are best modeled using continuous distributions due to the large, almost limitless range of values they can take (for example the loss could be £312.54 or £5123.87)

One of the most widely used Severity Distributions in Actuarial Science is the Pareto Distribution…

Severity Distribution: The Pareto Distribution

The Pareto distribution is named after the Italian economist Vilfredo Pareto

The Pareto Distribution is a Continuous Distribution

It is probably the most commonly used Severity Distribution alongside the Gamma Distribution

It models a pattern in which most claims are small but there is the potential for very large losses (heavy tails)

It is particularly prevalent in modelling the claims severity experienced in Liability Insurance, Excess of Loss Reinsurance, Marine Insurance or any loss where the majority of losses are small but there is potential for large losses

The Pareto Distribution is also widely used in Catastrophe or CAT models to model losses due to extreme events like Earthquakes or Windstorms

Pareto Distribution Formula

The CDF (Cumulative Distribution Function) for a Pareto Distributed random variable is:

$$CDF(X) = 1 - \left(\frac{M}{X}\right)^{\alpha} \quad \text{for } X \ge M > 0$$

Where M is the minimum value of the Pareto random variable (greater than zero), and $\alpha$ is a positive number defining the shape (or alpha) of the distribution

The PDF (Probability Density Function) for a Pareto random variable is:

$$PDF(X) = \frac{d\,CDF(X)}{dX} = \frac{\alpha M^{\alpha}}{X^{\alpha + 1}} \quad \text{for } X \ge M > 0$$

Pareto CDF: $\alpha$ = 1.5 and M = 300

[Figure: Pareto CDF. x-axis: loss size (0 to 3000), with M = 300 and 2000 marked; y-axis: cumulative probability (0% to 100%)]

$$CDF(2000) = 1 - \left(\frac{300}{2000}\right)^{1.5} = 0.942$$

A loss with a minimum of 300 and a shape of 1.5 will be less than or equal to 2000 94.2% of the time
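The worked value above can be checked with a small Python sketch of the Pareto CDF. The function name `pareto_cdf` and its argument order are chosen for this illustration only; returning 0 below the minimum reflects the fact that a Pareto loss never falls below M.

```python
def pareto_cdf(x, minimum, shape):
    """P(loss <= x) for a Pareto loss; 0 below the minimum."""
    if x < minimum:
        return 0.0
    return 1.0 - (minimum / x) ** shape

p = pareto_cdf(2000, 300, 1.5)
print(round(p, 3))      # probability the loss is at most 2000
print(round(1 - p, 3))  # probability the loss exceeds 2000
```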

Pareto Review Question

An insured loss has a minimum size of 1000 and a Shape parameter of 2

Calculate the probability that the loss is less than (or equal to) 5000

*Calculate the probability the loss will be greater than 5000

Interpretation of the Minimum

One feature of Pareto distributed random variables is that they have a minimum value which they do not go below

This minimum has two interpretations as far as the modelling of Claim Severities is concerned

One interpretation is that the minimum represents the deductible on a policy, so claims are never made on losses smaller than this minimum

Another interpretation is that due to the nature of insured losses there is a natural lower boundary to their size (for example, you will never have £100 of damage caused by a fire on an offshore platform)

Estimating the Shape of the Pareto Distribution

The Shape of the Pareto Distribution can be based on a generally accepted actuarial “rule of thumb”

For example, Swiss Re estimate a Shape of 1.3 for losses due to European windstorms and 0.8 for losses due to Earthquakes

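The Shape ($\alpha$) can be calculated from the Minimum (M) and Average Loss ($\mu$) using the following formula:

$$\alpha = \frac{\mu}{\mu - M}$$

This follows from the average of a Pareto Distribution, $\mu = \alpha M / (\alpha - 1)$, which only exists when the Shape is greater than 1

This is especially useful when the shape is unknown and has to be estimated from sample data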

Pareto Review Question

An insured loss has a minimum size of 10000 and an average size of 40000

Calculate the Shape

Using the Shape calculate the probability of the loss being less than 100000

This is going to take too long, so let's learn how to make our own functions instead…

Custom Functions in Excel

Excel allows the user to create custom functions using VBA

These functions must be added to a Module, which is just a special page where code for an Excel workbook is written

To add a module to a workbook select the Visual Basic option under the Developer Tab (if you do not have the Developer Tab then check the Excel Options -> Popular -> Show Developer Tab in Ribbon) then select the Insert -> Module menu option in the Visual Basic editor

Let's start with a simple function that adds two numbers together

Add Numbers Function

    Public Function AddNumbers(NumberA, NumberB)
        AddNumbers = NumberA + NumberB
    End Function

Special VBA words (keywords) are displayed in blue

AddNumbers, NumberA and NumberB are just names we have chosen ourselves

AddNumbers is the name of the function and is used to call the function on the spreadsheet

NumberA and NumberB are used to reference the two parameters passed into the function

To call this function from the spreadsheet we would type =AddNumbers(4,7)

In this case NumberA would be 4 and NumberB would be 7

Our Pareto CDF Function!

To save you time I have put this function into your spreadsheet

    Public Function ParetoCDF(X, Average, Min)
        ' Shape implied by the Average and the Minimum loss
        Shape = Average / (Average - Min)
        ' Probability that the loss is less than or equal to X
        ParetoCDF = 1 - (Min / X) ^ Shape
    End Function

Notice it has two lines: the first line calculates the Shape from the Average and Minimum

The second line uses this Shape to calculate the CDF

So if we wanted to calculate the probability of a loss with a minimum of 1000 and an average of 5000 being less than or equal to 10000 we would type =ParetoCDF(10000,5000,1000)

Now go back and do the review question using it!

Pareto Distributions and Repeating Proportionalities

The Pareto distribution describes a phenomenon we frequently see in the world about us: repeating proportional patterns

One of the most famous examples of this was Pareto's observation that 80% of all the wealth was owned by the wealthiest 20% of the population

He also observed that within that wealthiest 20%, the top 20% (the top 4% of the population, 20% * 20%) owned 80% of that 80%, or 64% (80% * 80%) of all the wealth, and so on…

This pattern can lead to some very wealthy people!

These repeating proportional patterns are also exhibited by the Pareto Distribution

To illustrate this assume losses are Pareto Distributed and have a minimum of 100 and an average of 500

What proportion of losses are greater than 120 (100 * 120%, or 20% above the minimum)?

=1 - ParetoCDF(120,500,100)

What proportion of losses are greater than 144 (120 * 120%, or 20% above 120)?

=1 - ParetoCDF(144,500,100)

What proportion of the losses above 120 are also above 144? (Divide the probability of a loss being greater than 144 by the probability of it being greater than 120 to see that it is the same as the proportion of all losses above 120)

This pattern can lead to some very large losses, and these repeating proportional patterns are often observed in the losses caused by events such as fires or storms
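The repeating proportion can be seen numerically with a short Python sketch. It mirrors the spreadsheet calculation 1 minus ParetoCDF, but the function name `pareto_survival` is invented for this illustration and is not in the course workbook.

```python
def pareto_survival(x, average, minimum):
    """P(loss > x), with the shape derived from the average and minimum."""
    shape = average / (average - minimum)
    return (minimum / x) ** shape

above_120 = pareto_survival(120, 500, 100)
above_144 = pareto_survival(144, 500, 100)

print(above_120)              # share of losses more than 20% above 100
print(above_144 / above_120)  # share of those that are also above 144
```

The two printed values agree because (100/144) is the square of (100/120), so raising both to the same shape preserves the ratio. The same argument applies to any further 20% step.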

Probable Maximum Loss (PML)

Probable Maximum Loss (PML) gives a measure of the worst loss that is likely to occur at some confidence level

The severity of the loss given by the PML is expressed in terms of the likelihood or probability of observing smaller losses

For example the PML95% is the loss such that 95% of all losses will be less than or equal to that amount: the 95% Quantile

The PML99% is the loss such that 99% of all losses will be less than or equal to that amount: the 99% Quantile

If we look at a graph of the CDF of the Pareto Distribution it is easy to see how this can be calculated…

PML From Pareto CDF

[Figure: Pareto CDF. x-axis: loss size (0 to 3000); y-axis: cumulative probability (0% to 100%), with the 95% level traced across to a loss of about 2200]

By reading the values of the CDF the “other way round” we can estimate the PML at 95% is about 2200

Inverse Cumulative Distribution Function (ICDF)

The Cumulative Distribution Function (CDF) calculates the probability of a random variable being less than or equal to some value

The Inverse Cumulative Distribution Function (ICDF) calculates the outcome given the probability of it being less than or equal to that outcome (the Quantile)

The ICDF can be calculated by inverting the CDF or using it in reverse…

Inverting the CDF Function

The Pareto CDF gives us the Quantile (Q) for an outcome (X) and its formula is:

$$Q = 1 - \left(\frac{M}{X}\right)^{\alpha}$$

To obtain the outcome (X) given the Quantile (Q) we can simply invert this function:

$$X = \frac{M}{(1 - Q)^{1/\alpha}}$$

This is the Inverse CDF for the Pareto Distribution and can be used to calculate the PML…

ParetoICDF Function

Again this function is a bit complex so we will use a VBA function to calculate the Inverse of the Pareto CDF

    Public Function ParetoICDF(Q, Average, Min)
        ' Shape implied by the Average and the Minimum loss
        Shape = Average / (Average - Min)
        ' Loss such that a proportion Q of losses are less than or equal to it
        ParetoICDF = Min / (1 - Q) ^ (1 / Shape)
    End Function

The first line calculates the Shape

The second line applies the Pareto Inverse CDF formula

For example, if we have a loss with a minimum value of 1000 and an average of 5000, to find the loss that 95% of all losses will be less than or equal to (PML 95%) we would type =ParetoICDF(0.95,5000,1000)
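As a cross-check on the inversion, the sketch below applies the Inverse CDF to the earlier graph example (minimum 300, shape 1.5) and then feeds the result back through the CDF. It is written in Python purely for illustration, and it takes the shape directly as an argument rather than deriving it from the average as the VBA function does; the function names are hypothetical.

```python
def pareto_icdf(q, minimum, shape):
    """Loss x such that a fraction q of losses are less than or equal to x."""
    return minimum / (1.0 - q) ** (1.0 / shape)

def pareto_cdf(x, minimum, shape):
    """P(loss <= x) for x at or above the minimum."""
    return 1.0 - (minimum / x) ** shape

pml_95 = pareto_icdf(0.95, 300, 1.5)
print(pml_95)                        # compare with the value read off the CDF graph
print(pareto_cdf(pml_95, 300, 1.5))  # round trip back to the quantile
```

The first value should land close to the figure of about 2200 estimated from the graph, and the round trip should return 0.95 up to rounding error.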

Review Question

A loss is Pareto Distributed with a minimum value of 20000 and an average value of 150000

Using the Inverse CDF calculate the 95% PML

*Calculate the loss such that only 1% of losses will be greater than that loss (1% Exceedance Probability)

Combining the Frequency and Severity Model

To model the Underwriting Risk (or Total Loss) accurately it is necessary to Combine the statistical model for the Frequency and Severity of Losses

The total loss is determined by both the frequency and severity of losses, so to calculate the distribution of total losses we need to combine the frequency and severity distributions

This can be very complicated mathematically, but fortunately we learnt about a technique in lecture 1 which will make it a lot easier…

Generating Discrete Random Variables

In lecture 1 we generated random samples for one of the simplest discrete random variables: the flip of a coin

We did this using intuition, cutting up the uniform random variables such that 50% of the outcomes represented heads and 50% represented tails

We could have also done this using the Cumulative Distribution Function for the flip of the coin…

Coin Flip Distributions

[Figure: two panels for the coin flip. Probability Mass Function: Probability (0 to 0.6) for 0 (Tails) and 1 (Heads). Cumulative Distribution Function: Cumulative Probability (0 to 1) for 0 (Tails) and 1 (Heads)]

Notice that the y-axis of the cumulative probability function has values ranging between 0 and 1 like the uniform distribution

Generating Discrete Random Variables using the Inverse Cumulative Probability Function

[Figure: coin flip Cumulative Probability (0 to 1) for 0 (Tails) and 1 (Heads), with a Rand() value marked on the y-axis]

We read off the value from the y axis and see which outcome this relates to

If Rand() is greater than 0.0 and less than or equal to 0.5 then Tails

If Rand() is greater than 0.5 and less than or equal to 1.0 then Heads

Generating Pareto Distributed Losses

This trick will also work for the Pareto Distribution

If we use rand to select the Quantile on the Y axis and then read off the value that this relates to on the X-Axis we will get a Pareto Distributed Random Variable…

This method of converting a uniformly distributed random variable into a random variable from another distribution is called the Inverse Transform

See appendix for a mathematical proof of why this technique works…

Generating Pareto Distributed Losses: $\alpha$ = 1.5 and M = 300

[Figure: Pareto CDF. x-axis: loss size (0 to 3000); y-axis: cumulative probability (0% to 100%), with Rand() values on the y-axis mapped across to Pareto Distributed Random Variables with a Minimum of 300 and shape of 1.5 on the x-axis]

Review Question

On the “Random Pareto Question” sheet use the ParetoICDF method (via the Inverse Transform) to convert rand into a Pareto Distributed Loss with a Minimum of 100 and Average of 130

Repeat this to generate a sample of 1000 values and test that the inverse transform works by confirming that the Minimum (using =Min(range) ) and Average (using =Average(range) ) of your sample is close to 100 and 130
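The Inverse Transform in this question can be sketched as follows. This is a Python illustration rather than the spreadsheet solution: `random_pareto` is a hypothetical helper that mirrors feeding =RAND() into ParetoICDF, the seed is arbitrary, and the sample is 100,000 values (larger than the 1000 asked for above) simply so that the average settles down.

```python
import random

def random_pareto(average, minimum, rng):
    """Inverse transform: push a uniform random number through the Pareto ICDF."""
    shape = average / (average - minimum)
    u = rng.random()  # uniform on [0, 1), plays the role of =RAND()
    return minimum / (1.0 - u) ** (1.0 / shape)

rng = random.Random(42)  # arbitrary seed so the run is repeatable
sample = [random_pareto(130, 100, rng) for _ in range(100_000)]

print(min(sample))                # should be close to 100
print(sum(sample) / len(sample))  # should be close to 130
```

With only 1000 values the sample average will wander more, which is worth bearing in mind when judging whether a spreadsheet result is "close" to 130.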

Simulating the Aggregate Loss

Insurance Companies are often more concerned about the Total or Aggregate Loss across their Underwriting Portfolio than they are about individual losses

Mathematically it can be very complex to describe the distribution of Aggregate Losses even if we know the distribution of individual losses

However, if we use Monte Carlo Simulation it is very simple to simulate the Aggregate Loss: we just simulate the individual losses and then sum their values…

Review Question

An Insurer expects 20 losses to occur across a portfolio of 300 policies over the next year

The size of each of these losses follows a Pareto Distribution with a minimum of £1000 and an average of £12000

Simulate the Aggregate Loss for the portfolio assuming the expected number of losses (20) will occur

If the Total Premium across this portfolio of 300 policies is £250,000 simulate the Underwriting Profit or Loss

AggregateParetoLoss

So far we have seen how useful 2 lines of VBA can be with ParetoCDF and ParetoICDF!

In your workbook there is a slightly more complex function called AggregateParetoLoss that will add or sum a number of losses from a Pareto Distribution

For example, if you want to simulate 20 Pareto Losses with an average of 8000 and minimum of 1000 with this function you would type =AggregateParetoLoss(8000,1000,20)

See appendix for a description of this function…

Poisson Distributed Random Values

To simulate the total loss we are also going to need to simulate the frequency or number of losses

We can also apply the Inverse Transform to the Poisson distribution, using its Inverse CDF

In your spreadsheet there is a function called RandomPoisson which will generate a random frequency from the Poisson Distribution

For example, if you want to generate a Random Claim Frequency with an average of 15 you would type =RandomPoisson(15)

This function actually uses a slightly different technique, which is explained in the appendix*

Generating Poisson Distributed Random Values

[Figure: Poisson CDF. x-axis: number of events (0 to 24); y-axis: cumulative probability (0 to 1), with Rand() values on the y-axis mapped across to Poisson Distributed Random Values on the x-axis]

Combining The Frequency And Severity

To model Underwriting Risk we need to be able to simulate the combined effect of the random frequency and severity of losses

We can now simulate both the frequency of losses and their severity

To simulate the total loss we simply generate a random value for the frequency of losses from the Poisson Distribution and then for each of the losses that occur generate a random Severity from the Pareto

By summing these losses we then obtain the Aggregate Loss and from that we also Simulate the Underwriting Profit and Loss
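The whole procedure can be sketched end to end in Python. This is an illustration under stated assumptions, not the course workbook: the Poisson draw here uses the Inverse Transform on the Poisson CDF (the workbook's RandomPoisson is said to use a different technique), the function names are hypothetical, and the inputs (20 expected losses, minimum 1000, average 12000, premium 250,000) are borrowed from the earlier review question.

```python
import math
import random

def random_poisson(average, rng):
    """Inverse transform on the Poisson CDF: step up until it passes u."""
    u = rng.random()
    x = 0
    pmf = math.exp(-average)  # P(0 losses)
    cdf = pmf
    while cdf < u and pmf > 0.0:  # pmf > 0 guards against rounding stalls
        x += 1
        pmf *= average / x        # P(x) obtained from P(x - 1)
        cdf += pmf
    return x

def random_pareto(average, minimum, rng):
    """Inverse transform on the Pareto CDF."""
    shape = average / (average - minimum)
    return minimum / (1.0 - rng.random()) ** (1.0 / shape)

def simulate_year(freq_average, sev_average, sev_minimum, premium, rng):
    """One scenario: number of losses, aggregate loss, underwriting profit."""
    n_losses = random_poisson(freq_average, rng)
    aggregate = sum(random_pareto(sev_average, sev_minimum, rng)
                    for _ in range(n_losses))
    return n_losses, aggregate, premium - aggregate

rng = random.Random(2024)  # arbitrary seed so the run is repeatable
scenarios = [simulate_year(20, 12000, 1000, 250000, rng) for _ in range(10_000)]
prob_loss = sum(1 for _, _, profit in scenarios if profit < 0) / len(scenarios)
print(prob_loss)  # estimated probability of an Underwriting Loss
```

Because the severity here has a shape only slightly above 1, individual scenarios can contain very large losses, so estimates of the average profit will be far less stable than the estimated probability of a loss.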


[Figure: a Poisson Distribution (y-axis: Probability, 0 to 0.12) feeding into several Pareto severity CDFs titled "How Large Are Claims?" (x-axis: 0 to 40; y-axis: 0% to 100%)]

Simulate the Number of Losses from the Poisson Distribution

For each loss that occurs Simulate its Size from the Pareto and Sum the Values

The Next Step…

[Figure: Frequency Severity Model. A Poisson frequency distribution and several Pareto severity distributions are combined by MONTE CARLO SIMULATION to give the Underwriting Profit & Loss Distribution (x-axis: Underwriting Profit, -2000000 to 500000; y-axis: cumulative probability, 0% to 100%)]

Appendix : Cantelli’s Inequality

Cantelli’s Inequality is an important result from statistics that places an upper bound on the probability of a random variable being less than or equal to some value

The bound can be calculated just from the mean ($\mu$) and the variance ($\sigma^2$) of the random variable, for any value U below the mean:

$$P(\tilde{X} \le U) \le \frac{\sigma^2}{\sigma^2 + (\mu - U)^2}$$

The true probability is likely to be less than this estimate BUT the inequality has the advantage that we do not have to know the distribution of the random variable, just its Mean and Variance!

Cantelli’s Inequality Example

Imagine we have a portfolio whose average return is 8% (0.08) and whose variance is 0.003

We can use Cantelli’s inequality to estimate the probability of losing more than 5% of the portfolio's value (the random return on the portfolio being less than -5%)

$$P(\tilde{r} \le -0.05) \le \frac{0.003}{0.003 + (0.08 - (-0.05))^2} = 0.1507$$

So Cantelli’s inequality tells us that the probability of losing more than 5% of the portfolio is at most 15.07% (0.1507)

This is a worst case estimate: the true probability will be less than or equal to this
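The arithmetic in this example can be checked with a short Python sketch. The function name `cantelli_bound` is hypothetical, and the check that the level lies below the mean is an assumption added here because this one-sided form of the bound only applies in that case.

```python
def cantelli_bound(mean, variance, level):
    """Upper bound on P(X <= level), valid when level is below the mean."""
    if level >= mean:
        raise ValueError("level must be below the mean")
    return variance / (variance + (mean - level) ** 2)

bound = cantelli_bound(0.08, 0.003, -0.05)
print(bound)  # compare with the 0.1507 quoted in the example
```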

Appendix: Exponential Distribution

Exponentially distributed random variables can take on any value between 0 and positive infinity (the support)

The formula for the CDF (Cumulative Distribution Function) of an Exponentially Distributed random variable (o) is

$$cdf(x) = P(\tilde{o} \le x) = 1 - e^{-x/\mu}$$

Where e is the natural number (2.71828…)

Where $\mu$ is the average of the random variable

The formula for the PDF (Probability Density Function) of the Exponential Distribution is

$$pdf(x) = \frac{d\,cdf(x)}{dx} = \frac{1}{\mu} e^{-x/\mu}$$

Meaning behind the Exponential Distribution

The formula of exponentially distributed random variables describes a very specific type of behaviour; it has not just been plucked out of the air!

An exponentially distributed random variable exhibits what is known as constant intensity, or is memoryless

This means that the probability of the random variable exceeding a value by one more unit, given that it has already exceeded that value, is constant

For example, the probability of an exponentially distributed random variable being greater than 11 given that we know it is greater than 10 is the same as the probability of it being greater than 23 given that it is greater than 22

We can illustrate this with an example
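One way to see the memoryless property is to work through the conditional probability directly. The short derivation below is a sketch that assumes the CDF $1 - e^{-x/\beta}$, where the symbol $\beta$ (an assumed notation) is the average of the exponentially distributed random variable $\tilde{o}$:

```latex
P(\tilde{o} > s + t \mid \tilde{o} > s)
  = \frac{P(\tilde{o} > s + t)}{P(\tilde{o} > s)}
  = \frac{e^{-(s+t)/\beta}}{e^{-s/\beta}}
  = e^{-t/\beta}
```

The result does not depend on $s$. Setting $t = 1$ with $s = 10$ and then with $s = 22$ gives $e^{-1/\beta}$ in both cases, which is the equality described in the text.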

Appendix: Poisson Process

If we assume the waiting time between events is Exponentially Distributed we have a randomly spaced sequence (or process) across time called a Poisson Process

The Poisson Process is an example of a Stochastic Process

Stochastic Processes are phenomena that change randomly across time (such as the price of assets, the number of claims received, the spread of a virus in a population)

A Poisson Process Realisation

[Figure: Total Number of Events (Losses) against Time (Days). Waiting Time Between Claims is Exponentially Distributed with an Average of 0.1 days]

Simplifying the Poisson Process : The Poisson Distribution

The Poisson Process is too detailed and abstract for most of the Risk Analysis carried out by Insurance Companies

Normally we are interested in the number of losses or events that occur over a time period, not the exact timing of those events

Also we do not think in terms of the average time between events but the average number of events that occur

Converting average waiting time to average frequency is very simple

We can also use the Poisson Process to count the number of events that occur over a period to obtain the Poisson Distribution
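For reference, the distribution obtained by counting events in this way has a well-known closed form. It is stated here as a standard result without derivation; the symbols $\tilde{N}$ (the number of events in one period) and $\lambda$ (the average number of events per period) are notation assumed for this sketch:

```latex
P(\tilde{N} = k) = \frac{\lambda^{k} e^{-\lambda}}{k!}, \qquad k = 0, 1, 2, \dots
```

With an average of 10 events per day ($\lambda = 10$), for instance, the probability of exactly 10 events in a day is $10^{10} e^{-10} / 10! \approx 0.125$.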

Average Waiting Time and Average Frequency

Central to the Poisson Process was the average waiting time between events or claims

Average Frequency measures the average number of events per unit of time

The relationship between Average Frequency ($\lambda$) and Average Waiting Time ($\beta$) is:

$$\lambda = \frac{1}{\beta} \qquad \beta = \frac{1}{\lambda}$$

For example, if on average we observe 10 events per day ($\lambda = 10$) then the average waiting time is 1/10th of a day or 2.4 hours

The greater the frequency the smaller the average waiting time

The average waiting time between events is also known as the Return Period; this term is especially common in the modelling of Catastrophic Risks or CAT Risks, where the Return Period is often measured in years

Counting the Number of Events that Take Place over a day

[Figure: Total Number of Events (Losses) against Time (Days). 7 Events or Claims Take Place in the Day]


[Figure: a Poisson Process (Total Number of Events (Claims) against Time (Hours)) is converted into a Poisson Distribution (vertical axis: Probability)]

How Many Claims? How Large Are Claims?

[Figure: three distribution panels (vertical axis: 0% to 100%), labelled (RandomPoisson)]

Appendix: Random Poisson Function

There is an alternative to the for loop called the do while loop

It is slightly more complicated than the for loop but is more versatile

Sub PutNumbersLoop()
    i = 1                    ' Set i to an initial value of 1
    Do While i <= 100
        Cells(i, 1) = 1
        i = i + 1            ' Take the value of i, add one to it and put the result back into i (increase i by one)
    Loop
End Sub

Algorithm: RandomPoisson

Set a number called TotalEvents equal to zero

Generate a random number from an exponential distribution with the appropriate average waiting time and call this number TotalTime

Is TotalTime greater than 1?

Yes: The random number of claims in that day is TotalEvents

No: Add one to TotalEvents, generate a random number from an exponential distribution with the appropriate average waiting time, add this number to TotalTime and loop back to the question

Random Poisson Function

Implementing the algorithm discussed earlier using a do while loop:

Public Function RandomPoisson(AverageFrequency)
    Application.Volatile     ' This line tells the function to recalculate each time F9 is pressed
    TotalEvents = 0
    AverageWaitingTime = 1 / AverageFrequency
    TotalTime = EXPINV(Rnd(), AverageWaitingTime)
    Do While TotalTime <= 1
        TotalEvents = TotalEvents + 1
        ' On each loop the TotalTime is increased by an Exponentially Distributed
        ' random number representing the time to the next event
        TotalTime = TotalTime + EXPINV(Rnd(), AverageWaitingTime)
    Loop
    RandomPoisson = TotalEvents
End Function

Appendix: SumRandomPareto

This function sums (convolves) a number of random variables drawn from a Pareto distribution

Public Function SumRandomPareto(Average, Minimum, NoClaims)
    Sum = 0
    i = 1
    Do While i <= NoClaims
        ' Add a random number to the sum with the appropriate Average and Minimum
        Sum = Sum + ParetoInverseCDF(Rnd(), Average, Minimum)
        i = i + 1
    Loop
    SumRandomPareto = Sum
End Function
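The two functions from these appendices can be combined into a single draw from the Frequency-Severity model. The sketch below is an illustration, not code from the slides: the name SimulateUnderwritingProfit and its argument names are assumptions, and it relies on RandomPoisson and SumRandomPareto being defined in the same workbook. Underwriting Profit is taken to be Total Premium minus Total Claims.

```vba
Public Function SimulateUnderwritingProfit(TotalPremium, AverageFrequency, AverageSeverity, MinimumSeverity)
    Application.Volatile     ' Recalculate each time F9 is pressed
    ' Step 1: simulate the number of losses from the Poisson Distribution
    NoClaims = RandomPoisson(AverageFrequency)
    ' Step 2: simulate the size of each loss from the Pareto Distribution and sum them
    TotalClaims = SumRandomPareto(AverageSeverity, MinimumSeverity, NoClaims)
    ' Step 3: a negative result is an Underwriting Loss
    SimulateUnderwritingProfit = TotalPremium - TotalClaims
End Function
```

Filling a column with this formula and pressing F9 gives one simulated outcome per cell, from which the Underwriting Profit & Loss Distribution can be estimated.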

Appendix : Inverse Transform Proof

Let $F$ be the CDF of some random variable $x$ such that

$$F(Z) = \Pr(x \le Z)$$

Where Pr stands for probability and $0 < F(Z) < 1$

Let $U$ be a uniformly distributed random variable on the interval 0 to 1; what is the distribution of $F^{-1}(U)$?

We will show that the CDF of $F^{-1}(U)$ is the same as that of $x$:

$$\Pr(F^{-1}(U) \le Z) = F(Z)$$

Now since $F$ is a CDF and therefore monotonic (we assume it is continuous and strictly increasing so that $F^{-1}$ exists):

$$\Pr(F^{-1}(U) \le Z) = \Pr(F(F^{-1}(U)) \le F(Z)) = \Pr(U \le F(Z))$$

Since $U$ is a uniform random variable

$$\Pr(U \le Y) = Y \quad (\text{where } 0 < Y < 1)$$

So we conclude:

$$\Pr(F^{-1}(U) \le Z) = \Pr(U \le F(Z)) = F(Z)$$

Therefore the CDF of $F^{-1}(U)$ is the same as that of the random variable $x$
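As an illustration of the result, the Exponential Distribution has CDF 1 - Exp(-x / Average), which can be inverted by hand to turn a uniform random number into an exponential one. The function below is a sketch under that assumption; the name ExponentialInverseCDFSketch is hypothetical, and the EXPINV function used in the slides may or may not be implemented this way.

```vba
Public Function ExponentialInverseCDFSketch(U, Average)
    ' Solve U = 1 - Exp(-x / Average) for x.
    ' VBA's Log function is the natural logarithm.
    ' Rnd() returns a value of at least 0 and strictly less than 1,
    ' so 1 - U is never zero when U comes from Rnd().
    ExponentialInverseCDFSketch = -Average * Log(1 - U)
End Function
```

Calling ExponentialInverseCDFSketch(Rnd(), 0.1) would then give a simulated waiting time with an average of 0.1 days.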
