alexandra-sparks
Risk Analysis & Modelling
Lecture 3: Underwriting Risk
Underwriting Risk
Underwriting Risk is one of the primary risks faced by Insurance Companies
It is the Risk that the Insurer might experience an Underwriting Loss (Total Claims greater than Total Premiums)
In order to model Underwriting Risk it is necessary to model the two primary uncertainties faced by Insurance Companies: the number of Claims (Claim Frequency) and the size of those Claims (Claim Severity)
Actuarial Science uses Statistical Distributions to describe the range of possibilities for both the Frequency and Severity of Claims or Losses
The Frequency-Severity model is arguably the single most important model in Actuarial Science and is also the basis of the Reserve Risk and CAT Risk Models we will look at on the course
It is also the basis of many Pricing Models
The Frequency-Severity Model is simple to understand and use as long as you have an understanding of the Distributions it uses…
Underwriting Risk & The Frequency-Severity Model
[Figure: Frequency-Severity Model. A distribution describing the frequency of losses (x-axis: Number Claims Over Next Day; y-axis: Probability) is combined with a distribution describing the severity of losses to give the distribution of Total Losses, which is compared with the Total Premium to give the Underwriting PROFIT or LOSS]
Distributions - Describing the Range of Outcomes
In the last lecture we introduced the idea of describing the possible range of outcomes and the probability of those outcomes as a useful tool in quantifying Risk
This essentially gave us a probability histogram showing the relationship between the outcome and the probability of the outcome
We were dealing with discrete random variables which could only assume a fixed (or countable) number of values
When the random variable is discrete this relationship between the probability and the outcome is known as the Probability Mass Function
Probability Histogram
[Figure: probability histogram (x-axis: Change In Portfolio Value, with outcomes -10%, -5%, 0%, 5%, 10%; y-axis: Probability from 0% to 50%)]
The probability histogram relates the range of outcomes to the probabilities of those outcomes
This random variable is discrete because there are a finite number of outcomes (5 in total)
Probability Mass Function
The Probability Mass Function (PMF) (also known as the Probability Function) specifies the relationship between the outcome and the probability of the outcome
[Diagram: Loss of -10% goes into the Probability Mass Function, which returns 5% Probability, so PMF(-10%) = 5%]
In the case of our portfolio the PMF relates the size of a loss or gain to the probability of that loss or gain
The PMF can be a mathematical function or a simple table of probabilities
The set of values that the random variable described by the Probability Mass Function can take is called the Support; in this case the Support is the set of values (-10%, -5%, 0%, 5%, 10%)
Properties of the Probability Mass Function
The first property of the Probability Mass Function (PMF) is that the probabilities relating to the various outcomes cannot be negative:
$$\mathrm{PMF}(X) \ge 0$$
The second property of the PMF is if we sum the probabilities for all the possible outcomes that can occur (N in total) we get a value of 1:
$$\sum_{i=1}^{N} \mathrm{PMF}(X_i) = 1$$
Probability of Poor Performance
One important statistic frequently used in the measurement of Risk is the probability of the outcome being worse than some specified value
For example, we might want to know the probability of the return on our portfolio being less than or equal to, say, -5%
This probability is also known as the Quantile
It can be calculated by summing the probabilities of all the outcomes equal to or worse than -5% (20% + 5% = 25%)
We could do this for all the possible outcomes for our portfolio (-10%, -5%, 0%, 5%, 10%) to obtain a different way of representing the distribution…
Cumulative Distribution Graph
[Figure: cumulative distribution graph (x-axis: Portfolio Return, with outcomes -10%, -5%, 0%, 5%, 10%; y-axis: Probability of Return Less Than Or Equal, from 0% to 100%)]
Probability that the return will be less than or equal to 0 is 45%
Cumulative Distribution Function
The Cumulative Distribution Function (CDF) is one of the most important statistical concepts on this course
It specifies the probability of observing a random variable being less than or equal to some value (A) and is related to the Probability Mass Function (PMF) by the following relationship:
$$\mathrm{CDF}(A) = \sum_{X_i \le A} \mathrm{PMF}(X_i)$$
Where CDF(A) gives the probability of observing some random variable less than or equal to a level A (Quantile); this is equal to the sum of the probabilities of all the possible outcomes (given by the PMF) less than or equal to A
Because the PMF is always positive or zero, the CDF is always a monotonic (non-decreasing) function
PMF & CDF Example
The following table gives values for the PMF of a random variable, evaluate the CDF at 2 and 4
X PMF(X)
1 10%
2 20%
3 30%
4 30%
5 10%
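The summation that defines the CDF can be sketched in a few lines. The slides work in Excel, so this Python version is only an illustration, and the names `pmf` and `cdf` are chosen here rather than taken from the course workbook.

```python
# PMF table from the example above: outcome -> probability
pmf = {1: 0.10, 2: 0.20, 3: 0.30, 4: 0.30, 5: 0.10}


def cdf(a, pmf):
    """Sum the PMF over every outcome less than or equal to a."""
    return sum(p for x, p in pmf.items() if x <= a)


cdf_at_2 = cdf(2, pmf)  # 10% + 20%
cdf_at_4 = cdf(4, pmf)  # 10% + 20% + 30% + 30%
print(f"CDF(2) = {cdf_at_2:.0%}, CDF(4) = {cdf_at_4:.0%}")
```

Evaluating the CDF at the largest outcome (5) should return 1, which is a quick check that the table is a valid PMF.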
The Poisson Distribution & the Frequency of Losses
The most widely used distribution for the Frequency (number) of Losses in Actuarial Models is the Poisson Distribution
Like the distributions we have seen so far the Poisson Distribution is discrete; however, its Probability Mass Function is calculated using a mathematical formula
The number of times a random event occurs over a period will follow a Poisson Distribution if the occurrences are independent of each other and happen at a constant average rate
Examples of Poisson Distributed random events include the number of calls received by a call center in an hour, the number of hurricanes in a year, the number of fires that will occur in a property insurance portfolio, the number of car accidents that an individual will have in a year…
All that is needed to describe the Poisson Distribution is the Average frequency of the event
The Poisson Distribution
The Probability Mass Function (PMF) for the Poisson Distribution is:
$$P(x) = \frac{\lambda^{x} e^{-\lambda}}{x!} \qquad x = (0, 1, 2, 3, \ldots)$$
This gives us the probability of a Poisson Distributed random variable having a value of x for a given average λ
Where e is the natural number (2.71828…)
A Poisson Distributed random variable can take on any non-negative integer (0, 1, 2, 3, …), which is the Support
In Excel the Poisson PMF can be calculated by typing =POISSON(x,λ,FALSE)
So the probability of a Poisson distributed random variable with an average of 5 having a value of 3 is =POISSON(3,5,FALSE)
The CDF for the Poisson Distribution is just the sum of the PMF for all values less than or equal to x:
$$\mathrm{CDF}(x) = \sum_{y=0}^{x} \frac{\lambda^{y} e^{-\lambda}}{y!} \qquad x = (0, 1, 2, 3, \ldots)$$
In Excel the CDF for the Poisson Distribution can be calculated as =POISSON(x,λ,TRUE)
So the probability of a Poisson distributed random variable with an average of 5 being LESS THAN OR EQUAL TO 3 can be calculated as =POISSON(3,5,TRUE)
Poisson PMF Average = 15
[Figure: Poisson PMF with an average of 15 (x-axis: values 0 to 24; y-axis: probability from 0 to 0.1)]
Probability of a Poisson Distributed Random Variable with an Average of 15 being equal to 12 is 8.28%: =POISSON(12,15,FALSE) returns 0.0828
Poisson CDF Average = 15
[Figure: Poisson CDF with an average of 15 (x-axis: values 0 to 24; y-axis: cumulative probability from 0 to 1)]
Probability of a Poisson Distributed Random Variable with an Average of 15 being less than or equal to 12 is 26.76%: =POISSON(12,15,TRUE) returns 0.2676
Poisson Distribution Review Questions
A Portfolio of Policies will have an average number of losses over the next year of 100 and is believed to follow a Poisson Distribution
What is the probability of exactly 100 losses occurring over the next year?
What is the probability of the Number of Losses being less than or equal to 120?
What is the probability of the Number of Losses being greater than 120?
Claim Frequency and the Poisson Distribution
One reason for the popularity of the Poisson Distribution to describe the frequency of claims is its simplicity
The Insurer only has to estimate the Average number of claims to describe the entire range of outcomes for the frequency
In addition, if the claims on a portfolio of policies are independent and each policy has approximately the same small probability of generating a claim at any point in time (constant intensity), then the claim frequency will be Poisson Distributed…
Poisson Distribution & Claim Frequency
[Figure: a column of Insurance Policies, each with Probability Claim Today = X, feeding into a Poisson PMF (x-axis: number of claims from 0 to 24; y-axis: probability from 0 to 0.1)]
The distribution of Claim Frequency Over the Next Month must be Poisson If Claims are independent and have constant intensity
How well does the Poisson Distribution fit Actual Claim Frequencies
As with any assumption it is important to see if the Poisson Distribution fits actual claim frequencies
Claim Frequency Data is often presented as the number of policies in a portfolio having 0,1,2.. Claims over a period
From this data we can calculate the average number of claims per policy and use this to compare the actual distribution of claim frequency to the Poisson…..
Empirical Motor Insurance Claims Frequency Data
Number of Claims   Number of Policies   Empirical Frequency   Poisson Predicted
0                  88585                88.585%               88.410%
1                  10577                10.577%               10.890%
2                  779                  0.779%                0.671%
3                  54                   0.054%                0.028%
4                  4                    0.004%                0.001%
5                  1                    0.001%                0.000%
6                  0                    0.000%                0.000%
TOTAL              100000
Average Frequency Per Policy: 0.123
779 out of 100000 policies, or 0.779%, had a frequency of 2 Claims
The Poisson Distribution with an average of 0.123 predicts that 0.671% of policies will have a frequency of 2: =POISSON(2,0.123,FALSE)
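The comparison in the table can be reproduced from the policy counts alone. This is a Python sketch of the same calculation, offered as an illustration rather than the workbook's method; the variable names are chosen here.

```python
import math

# Number of claims -> number of policies, from the motor insurance table
policies = {0: 88585, 1: 10577, 2: 779, 3: 54, 4: 4, 5: 1, 6: 0}

total_policies = sum(policies.values())
total_claims = sum(k * n for k, n in policies.items())
avg_frequency = total_claims / total_policies  # average claims per policy


def poisson_pmf(x, lam):
    """P(X = x) for a Poisson variable with average lam."""
    return lam ** x * math.exp(-lam) / math.factorial(x)


for k, n in policies.items():
    empirical = n / total_policies
    predicted = poisson_pmf(k, avg_frequency)
    print(f"{k}  empirical {empirical:8.3%}  Poisson {predicted:8.3%}")
```

The average is computed from the unrounded claim count (12318 claims over 100000 policies), which is why the predicted column matches the table slightly better than typing the rounded 0.123 into Excel would.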
Poisson vs Empirical Frequency
[Figure: bar chart comparing Empirical Frequency with Poisson Frequency (x-axis: Claim Frequency Per Policy from 0 to 6; y-axis: Probability from 0% to 100%)]
Discrete Vs Continuous Random Variables
Some random variables are naturally discrete, such as the roll of a die or the frequency of claims experienced by an insurance company
A larger class of random variables are not naturally discrete and have a limitless range of values
The return on the portfolio is a good example: we have artificially made it discrete but in reality it could take on any value such as 1.654% or 5.3421%
The cost of repairing a car after an accident could be $5672.55
If we make continuous random variables artificially discrete for the purpose of applying discrete distributions, we will not be modelling them accurately
However, when we attempt to describe or model continuous random variables we encounter an interesting problem…
Zero Probability
The problem with continuous random variables is they can take on a limitless (or very large) range of values
What this means is that the probability of any given outcome is very small and effectively zero!
What is the probability of a person selected at random being 1.6744234 meters tall?
What is the probability of a random pebble picked up on a beach weighing 0.241312 kg?
What is the probability of your portfolio having a return of 1.23542%?
What is the probability that a uniform random number will be 0.154345435123445?
For continuous random variables we can only talk about the probability of the outcome being within a range of values, such as the probability of a uniform random number being between 0.1 and 0.3
Quantifying Continuous Random Variables using the CDF
The standard way to quantify the behaviour of a continuous random variable is by using the Cumulative Distribution Function (CDF)
The CDF of a continuous random variable gives the probability of it being below a given value (Quantile)
The CDF for a continuous random variable is nearly always a mathematical function which takes as its input a value and from this calculates the probability of observing the random variable below that value
CDF for a Uniform Random Variable
The simplest CDF is that of a Standard Uniform Random Variable (U), and is simply:
$$P(\tilde{U} \le X) = \mathrm{CDF}(X) = X \qquad \text{for } 0 \le X \le 1$$
The support of this CDF is all values between 0 and 1
So for example if we want to know the probability of a uniform random value being less than 0.65:
$$P(\tilde{U} \le 0.65) = \mathrm{CDF}(0.65) = 0.65$$
0.65 or 65% of the time the uniform random variable will be below the value of 0.65
Cumulative Density for Uniformly Distributed Numbers
[Figure: CDF of a standard uniform random variable, the straight line CDF(X) = X (x-axis: X from 0 to 1; y-axis: CDF(X) from 0 to 1)]
Probability Density Function (PDF)
Although the probability of each possible outcome for a continuous random variable is effectively zero (the probability mass is zero), some of the outcomes are more clustered than others
How clustered or dense outcomes are about a given point is measured by the Probability Density
Probability Density is measured using the Probability Density Function (PDF) and can be thought of as the continuous random variable's equivalent to the Probability Mass Function (PMF)
The higher the density of outcomes at a point the more likely an outcome is to appear in the region of that point
Mathematically, the PDF measures the slope of the CDF at a value
Even though its meaning is abstract, the PDF is one of the most common tools with which to represent or visualise the nature of a continuous random variable, mainly because it is a bit like Probability Mass.
Probability Density for Uniformly Distributed Numbers
[Figure: probability density of a standard uniform random variable (x-axis: value from -0.5 to 1.5; y-axis: Probability Density from 0 to 1.2)]
Uniformly distributed random numbers have a constant density of 1 over the range 0 to 1, 0 elsewhere
There is a zero density outside the range 0 to 1 because there are no values there!
Uniform Density Graph
[Figure: scatter of uniformly distributed points over the range 0 to 1 on both axes]
Points are spread out uniformly
Uniform Distribution PDF and CDF
[Figure: two panels. The PDF of the standard uniform (x-axis: value from -0.5 to 1.5; y-axis: density from 0 to 1.2) and its CDF (x-axis: X from 0 to 1; y-axis: CDF(X) from 0 to 1)]
Area under the PDF equals CDF
Slope of the CDF (1) equals the PDF
Mathematical Relationship between PDF and CDF
Mathematically we can express the relationship between the PDF and CDF for continuous random variables using calculus
Using integration we can express the relationship between the area under the PDF and the CDF as follows (where LB is the lower bound on the random variable, 0 in the case of the uniform distribution)
$$\mathrm{cdf}(a) = \int_{LB}^{a} \mathrm{pdf}(x) \, dx$$
Using differentiation we can express the relationship between the slope of the CDF and the PDF as
$$\mathrm{pdf}(x) = \frac{d \, \mathrm{cdf}(x)}{dx}$$
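Mathematical Uses of the PDF
Traditionally, calculus was central to the manipulation of distributions for continuous random variables
Recall from lecture 2 we stated that the average of a discrete random variable was:
$$E(\tilde{x}) = \sum_{i=1}^{n} p_{x_i} \, x_i$$
For a continuous random variable this is calculated using the PDF and integration:
$$E(x) = \int_{LB}^{UB} x \cdot \mathrm{pdf}(x) \, dx$$
In the case of a uniformly distributed random variable this would be:
$$E(x) = \int_{0}^{1} 1 \cdot x \, dx = \left[ \frac{x^{2}}{2} \right]_{0}^{1} = \frac{1}{2} - 0 = \frac{1}{2}$$
We stated that the variance for a discrete random variable was:
$$\mathrm{Var}(\tilde{r}) = E\big( (\tilde{r} - E(\tilde{r}))^{2} \big) = \sum_{i=1}^{n} p_{r_i} \, (r_i - E(\tilde{r}))^{2}$$
We can calculate the variance of a uniformly distributed random variable as:
$$E\big( (x - E(x))^{2} \big) = \int_{0}^{1} 1 \cdot \left( x - \frac{1}{2} \right)^{2} dx = \left[ \frac{x^{3}}{3} - \frac{x^{2}}{2} + \frac{x}{4} \right]_{0}^{1} = \frac{1}{3} - \frac{1}{2} + \frac{1}{4} = \frac{1}{12}$$
Let us verify these results with a Monte Carlo Simulation in Excel…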
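The lecture runs this check in Excel. As a rough equivalent, the Python sketch below draws standard uniform numbers and compares the sample mean and variance with the theoretical values 1/2 and 1/12. The seed and sample size are arbitrary choices made here.

```python
import random
import statistics

random.seed(42)  # fixed seed so the run is repeatable (arbitrary choice)

# Standard uniform draws, the counterpart of Excel's =RAND()
sample = [random.random() for _ in range(100_000)]

sample_mean = statistics.fmean(sample)
sample_var = statistics.pvariance(sample)

print(f"mean     {sample_mean:.4f}  (theory 0.5000)")
print(f"variance {sample_var:.4f}  (theory {1 / 12:.4f})")
```

The estimates will not match the theory exactly; the gap shrinks as the sample size grows, which is the usual Monte Carlo trade-off between run time and accuracy.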
Other Types of Continuous Random Variables
Most of the Distributions we will use on this course will be Continuous
The Continuous Distributions we will look at will include the Normal, Log-Normal, Exponential and Gamma Distributions
Claim Severities are best modeled using continuous distributions due to the large, almost limitless range of values they can take (for example the loss could be £312.54 or £5123.87)
One of the most widely used Severity Distributions in Actuarial Science is the Pareto Distribution…
Severity Distribution: The Pareto Distribution
The Pareto distribution is named after the Italian economist Vilfredo Pareto
The Pareto Distribution is a Continuous Distribution
It is probably the most commonly used Severity Distribution alongside the Gamma Distribution
It models a pattern in which most claims are small but there is the potential for very large losses (heavy tails)
It is particularly prevalent in modelling the claims severity experienced in Liability Insurance, Excess of Loss Reinsurance, Marine Insurance or any loss where the majority of losses are small but there is potential for large losses
The Pareto Distribution is also widely used in Catastrophe or CAT models to model losses due to extreme events like Earthquakes or Windstorms
Pareto Distribution Formula
The CDF (Cumulative Distribution Function) for a Pareto Distributed random variable is:
$$\mathrm{CDF}(X) = 1 - \left( \frac{M}{X} \right)^{\alpha} \qquad \text{for } X \ge M > 0$$
Where M is the minimum value of the Pareto random variable (greater than zero), and α is a positive number greater than zero defining the shape or alpha of the distribution
The PDF (Probability Density Function) for a Pareto random variable is:
$$\mathrm{PDF}(X) = \frac{d \, \mathrm{CDF}(X)}{dX} = \frac{\alpha M^{\alpha}}{X^{\alpha + 1}} \qquad \text{for } X \ge M > 0$$
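Pareto CDF: α = 1.5 and M = 300
[Figure: Pareto CDF with α = 1.5 and M = 300 (x-axis: loss size from 0 to 3000; y-axis: cumulative probability from 0% to 100%), with the minimum M = 300 and the point X = 2000 marked]
$$\mathrm{CDF}(2000) = 1 - \left( \frac{300}{2000} \right)^{1.5} = 0.942$$
A loss with a minimum of 300 and a shape of 1.5 will be less than or equal to 2000 94.2% of the time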
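The Pareto CDF and PDF are short enough to write out and check numerically. This Python sketch is an illustration only; `pareto_cdf` and `pareto_pdf` are names chosen here and take the shape directly, unlike the workbook function introduced later, which takes the average.

```python
def pareto_cdf(x, shape, minimum):
    """P(loss <= x) for a Pareto loss; zero below the minimum."""
    if x < minimum:
        return 0.0
    return 1.0 - (minimum / x) ** shape


def pareto_pdf(x, shape, minimum):
    """Density of a Pareto loss at x; zero below the minimum."""
    if x < minimum:
        return 0.0
    return shape * minimum ** shape / x ** (shape + 1)


# Worked example from the slide: minimum 300, shape 1.5, loss level 2000
print(f"{pareto_cdf(2000, 1.5, 300):.3f}")  # prints 0.942
```

Because the PDF is the slope of the CDF, a small finite difference of `pareto_cdf` around any point above the minimum should come out close to `pareto_pdf` at that point.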
Pareto Review Question
An insured loss has a minimum size of 1000 and a Shape parameter of 2
Calculate the probability that the loss is less than (or equal to) 5000
*Calculate the probability the loss will be greater than 5000
Interpretation of the Minimum
One feature of Pareto distributed random variables is that they have a minimum value which they do not go below
This minimum has two interpretations as far as the modelling of Claim Severities is concerned
One interpretation of this minimum is that it represents the deductible on a policy, so claims on losses in value less than this minimum are never made
Another interpretation is that due to the nature of insured losses there is a natural lower boundary to their size (for example, you will never have £100 of damage caused by a fire on an offshore platform)
Estimating the Shape of the Pareto Distribution
The Shape of the Pareto Distribution can be based on a generally accepted actuarial “rule of thumb”
For example, Swiss Re estimate a Shape of 1.3 for losses due to European windstorms and 0.8 for losses due to Earthquakes
The Shape (α) can be calculated from the Minimum (M) and Average Loss (μ) using the following formula:
$$\alpha = \frac{\mu}{\mu - M}$$
This is especially useful when the shape is unknown and has to be estimated from sample data
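The shape formula follows from the average of a Pareto loss, which is shape × minimum / (shape − 1) whenever the shape is greater than 1; rearranging for the shape gives average / (average − minimum). The Python sketch below shows the round trip. The function names are chosen here, and the figures (minimum 1000, average 5000) are the ones used in the ParetoCDF example later in the lecture.

```python
def pareto_shape(average, minimum):
    """Shape implied by the average and minimum loss."""
    if average <= minimum:
        raise ValueError("the average must exceed the minimum")
    return average / (average - minimum)


def pareto_mean(shape, minimum):
    """Average of a Pareto loss; only finite when shape > 1."""
    if shape <= 1:
        raise ValueError("the average is not finite for shape <= 1")
    return shape * minimum / (shape - 1)


shape = pareto_shape(5000, 1000)
print(shape)                     # 5000 / 4000
print(pareto_mean(shape, 1000))  # recovers the average
```

A shape at or below 1 corresponds to a distribution without a finite average, so a shape calculated this way will always be greater than 1.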
Pareto Review Question
An insured loss has a minimum size of 10000 and an average size of 40000
Calculate the Shape
Using the Shape calculate the probability of the loss being less than 100000
This is going to take too long, let's learn how to make our own functions instead…
Custom Functions in Excel
Excel allows the user to create custom functions using VBA
These functions must be added to a Module, which is just a special page where code for an Excel workbook is written
To add a module to a workbook select the Visual Basic option under the Developer Tab (if you do not have the Developer Tab then check the Excel Options -> Popular -> Show Developer Tab in Ribbon) then select the Insert -> Module menu option in the Visual Basic editor
Let's start with a simple function that adds two numbers together
Add Numbers Function
Public Function AddNumbers(NumberA, NumberB)
    ' Return the sum of the two values passed in from the spreadsheet
    AddNumbers = NumberA + NumberB
End Function
Special VBA words are displayed in blue (keywords)
AddNumbers, NumberA and NumberB are just arbitrary names we have chosen
AddNumbers is the name of the function that is used to call the function on the spreadsheet
NumberA and NumberB are used to reference the two parameters passed into the function
To call this function from the spreadsheet we would type =AddNumbers(4,7)
In this case NumberA would be 4 and NumberB would be 7
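Our Pareto CDF Function!
To save you time I have put this function into your spreadsheet
Public Function ParetoCDF(X, Average, Min)
    ' Shape (alpha) implied by the average and minimum loss
    Shape = Average / (Average - Min)
    ' Probability that the loss is less than or equal to X
    ParetoCDF = 1 - (Min / X) ^ Shape
End Function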
Notice it has two lines: the first line calculates the Shape from the Average and Minimum
The second line uses this Shape to calculate the CDF
So if we wanted to calculate the probability of a loss with a minimum of 1000 and an average of 5000 being less than or equal to 10000 we would type =ParetoCDF(10000,5000,1000)
Now go back to do the review question using it!
Pareto Distributions and Repeating Proportionalities
The Pareto distribution describes a phenomenon we frequently see in the world about us: repeating proportional patterns
One of the most famous examples of this was the observation by the Italian economist Vilfredo Pareto that 80% of all the wealth was owned by the wealthiest 20% of the population
He also observed that within that wealthiest 20% the top 20% or the top 4% (20% * 20%) owned 80% of that 80% or owned 64% (80% * 80%) of all the wealth and so on…
This pattern can lead to some very wealthy people!
These repeating proportional patterns are also exhibited by the Pareto Distribution
To illustrate this assume losses are Pareto Distributed and have a minimum of 100 and an average of 500
What proportion of losses are greater than 120 (100 * 120%, or 20% above the minimum of 100)?
=1 - ParetoCDF(120,500,100)
What proportion of losses are greater than 144 (120 * 120%, or 20% above 120)?
=1 - ParetoCDF(144,500,100)
What proportion of the losses above 120 are also above 144? (Divide the probability of a loss being greater than 144 by the probability of it being greater than 120 to see that it is the same as the proportion of all losses above 120)
This pattern can lead to some very large losses, and these repeating proportional patterns are often observed in the losses caused by events such as fires or storms
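The repeating proportion can be confirmed numerically. The Python sketch below mirrors the two spreadsheet formulas using a helper named `pareto_survival` (a name chosen here) that returns 1 − CDF for a loss described by its average and minimum.

```python
def pareto_survival(x, average, minimum):
    """P(loss > x) for a Pareto loss described by its average and minimum."""
    shape = average / (average - minimum)
    return (minimum / x) ** shape


above_120 = pareto_survival(120, 500, 100)  # share of all losses above 120
above_144 = pareto_survival(144, 500, 100)  # share of all losses above 144
ratio = above_144 / above_120               # share of losses above 120 that also exceed 144

print(f"above 120: {above_120:.4f}")
print(f"above 144 given above 120: {ratio:.4f}")
```

The two printed figures agree because (100/144)^shape divided by (100/120)^shape equals (120/144)^shape, and 120/144 is the same ratio as 100/120.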
Probable Maximum Loss (PML)
Probable Maximum Loss (PML) gives a measure of the worst loss that is likely to occur at some level of statistical significance
The severity of the loss measured by the PML is measured in terms of the likelihood or probability of observing smaller losses
For example the PML95% is the loss such that 95% of all losses will be less than or equal to that amount: the 95% Quantile
The PML99% is the loss such that 99% of all losses will be less than or equal to that amount: the 99% Quantile
If we look at a graph for the CDF of the Pareto Distribution it is easy to see how this can be calculated…
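PML From Pareto CDF
[Figure: Pareto CDF (x-axis: loss size from 0 to 3000; y-axis: cumulative probability from 0% to 100%), with the 95% level on the y-axis read across to a loss of about 2200 on the x-axis]
By reading the values of the CDF the “other way round” we can estimate the PML at 95% is about 2200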
Inverse Cumulative Distribution Function (ICDF)
The Cumulative Distribution Function (CDF) calculates the probability of a random variable being less than or equal to some value
The Inverse Cumulative Distribution Function (ICDF) calculates the outcome given the probability of it being less than or equal to that outcome, or the Quantile
The ICDF can be calculated by inverting the CDF or using it in reverse…
Inverting the CDF Function
The Pareto CDF gives us the Quantile (Q) for an outcome (X) and its formula is:
$$Q = 1 - \left( \frac{M}{X} \right)^{\alpha}$$
To obtain the outcome (X) given the Quantile (Q) we can simply invert this function
$$X = \frac{M}{(1 - Q)^{1/\alpha}}$$
This is the Inverse CDF for the Pareto Distribution and can be used to calculate the PML…
ParetoICDF Function
Again this function is a bit complex so we will use a VBA function to calculate the Inverse of the Pareto CDF
Public Function ParetoICDF(Q, Average, Min)
    ' Shape (alpha) implied by the average and minimum loss
    Shape = Average / (Average - Min)
    ' Loss that a proportion Q of all losses are less than or equal to
    ParetoICDF = Min / (1 - Q) ^ (1 / Shape)
End Function
The first line calculates the Shape
The second line applies the Pareto Inverse CDF formula
For example, if we have a loss with a minimum value of 1000 and an average of 5000, to find the loss that 95% of all losses will be less than or equal to (PML 95%) we would type =ParetoICDF(0.95,5000,1000)
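As a cross-check on the inversion, the Python sketch below applies the inverse CDF to the example used in the PML graph (minimum 300, shape 1.5) and confirms that feeding the result back into the CDF returns 95%. The functions take the shape directly and their names are chosen here; they are not the workbook functions.

```python
def pareto_cdf(x, shape, minimum):
    """P(loss <= x) for x at or above the minimum."""
    return 1.0 - (minimum / x) ** shape


def pareto_icdf(q, shape, minimum):
    """Loss that a proportion q of all losses are less than or equal to."""
    if not 0.0 <= q < 1.0:
        raise ValueError("q must satisfy 0 <= q < 1")
    return minimum / (1.0 - q) ** (1.0 / shape)


pml_95 = pareto_icdf(0.95, 1.5, 300)
print(f"PML 95%: {pml_95:.0f}")
print(f"CDF at the PML: {pareto_cdf(pml_95, 1.5, 300):.2%}")
```

The result lands close to the value of about 2200 read off the graph, which is the level of agreement to expect from reading a chart by eye.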
Review Question
A loss is Pareto Distributed with a minimum value of 20000 and an average value of 150000
Using the Inverse CDF calculate the 95% PML
*Calculate the loss such that only 1% of losses will be greater than that loss (1% Exceedance Probability)
Combining the Frequency and Severity Model
To model the Underwriting Risk (or Total Loss) accurately it is necessary to combine the statistical models for the Frequency and Severity of Losses
The total loss is determined by both the frequency and severity of losses, so to calculate the distribution of total losses we need to combine the frequency and severity distributions
This can be very complicated mathematically, but fortunately we learnt about a technique in lecture 1 which will make it a lot easier…
Generating Discrete Random Variables
In lecture 1 we generated random samples for one of the simplest discrete random variables: the flip of a coin
We did this using intuition, cutting up the uniform random variables such that 50% of the outcomes represented heads and 50% represented tails
We could have also done this using the Cumulative Distribution Function for the flip of the coin…
Coin Flip Distributions
[Figure: two panels for the coin flip. Probability Mass Function (x-axis: 0 (Tails), 1 (Heads); y-axis: Probability from 0 to 0.6) and Cumulative Distribution Function (x-axis: 0 (Tails), 1 (Heads); y-axis: Cumulative Probability from 0 to 1)]
Notice that the y-axis of the cumulative probability function has values ranging between 0 and 1 like the uniform distribution
Generating Discrete Random Variables using the Inverse Cumulative Probability Function
[Figure: Cumulative Distribution Function of the coin flip (x-axis: 0 (Tails), 1 (Heads); y-axis: Cumulative Probability from 0 to 1), with Rand() selecting a value on the y-axis]
We read off the value from the y axis and see which outcome this relates to
If Rand() is greater than 0.0 and less than or equal to 0.5 then Tails
If Rand() is greater than 0.5 and less than or equal to 1.0 then Heads
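The rule above translates directly into code. This Python sketch stands in for the spreadsheet's Rand(); `flip_coin` is a name chosen here, and the seed and number of flips are arbitrary.

```python
import random


def flip_coin(u):
    """Map a uniform draw u to a coin flip using the rule on the slide."""
    return "Tails" if u <= 0.5 else "Heads"


random.seed(1)  # fixed seed so the run is repeatable (arbitrary choice)
flips = [flip_coin(random.random()) for _ in range(10_000)]
share_heads = flips.count("Heads") / len(flips)

print(f"share of heads: {share_heads:.3f}")
```

The same idea extends to any discrete distribution: walk up the CDF until it first reaches or exceeds the uniform draw, and return the outcome at that step.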
Generating Pareto Distributed Losses
This trick will also work for the Pareto Distribution
If we use rand to select the Quantile on the Y axis and then read off the value that this relates to on the X-Axis we will get a Pareto Distributed Random Variable…
This method of converting a uniformly distributed random variable into a random variable from another distribution is called the Inverse Transform
See appendix for a mathematical proof of why this technique works…
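Generating Pareto Distributed Losses: α = 1.5 and M = 300
[Figure: Pareto CDF with α = 1.5 and M = 300 (x-axis: loss size from 0 to 3000; y-axis: cumulative probability from 0% to 100%). Rand() selects a value on the y-axis, which is read across to a loss on the x-axis]
Pareto Distributed Random Variables with a Minimum of 300 and shape of 1.5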
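The Inverse Transform for the Pareto case can be sketched as follows. This is an illustrative Python version rather than the workbook's ParetoICDF; `random_pareto` is a name chosen here, and the parameters are those of the figure (minimum 300, shape 1.5).

```python
import random


def random_pareto(shape, minimum, rng=random):
    """Draw one Pareto loss by pushing a uniform number through the inverse CDF."""
    u = rng.random()  # uniform on [0, 1), plays the role of Rand()
    return minimum / (1.0 - u) ** (1.0 / shape)


random.seed(7)  # fixed seed so the run is repeatable (arbitrary choice)
losses = [random_pareto(1.5, 300) for _ in range(50_000)]

# Compare the sample with the CDF value quoted earlier: 94.2% at or below 2000
share_below_2000 = sum(loss <= 2000 for loss in losses) / len(losses)
print(f"smallest loss: {min(losses):.2f}")
print(f"share at or below 2000: {share_below_2000:.3f}")
```

The check uses a proportion rather than the sample average because, with a shape of 1.5, the occasional very large loss makes the sample average slow to settle.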
Review Question
On the “Random Pareto Question” sheet use the ParetoICDF method (via the Inverse Transform) to convert rand into a Pareto Distributed Loss with a Minimum of 100 and Average of 130
Repeat this to generate a sample of 1000 values and test that the inverse transform works by confirming that the Minimum (using =Min(range) ) and Average (using =Average(range) ) of your sample are close to 100 and 130
Simulating the Aggregate Loss
Insurance Companies are often more concerned about the Total or Aggregate Loss across their Underwriting Portfolio than they are about individual losses
Mathematically it can be very complex to describe the distribution of Aggregate Losses even if we know the distribution of individual losses
However, if we use Monte Carlo Simulation it is very simple to simulate the Aggregate Loss: we just simulate the individual losses and then sum their values…
Review Question
An Insurer expects 20 losses to occur across a portfolio of 300 policies over the next year
The size of each of these losses follows a Pareto Distribution with a minimum of £1000 and an average of £12000
Simulate the Aggregate Loss for the portfolio assuming the expected number of losses (20) will occur
If the Total Premium across this portfolio of 300 policies is £250,000 simulate the Underwriting Profit or Loss
AggregateParetoLoss
So far we have seen how useful 2 lines of VBA can be with ParetoCDF and ParetoICDF!
In your workbook there is a slightly more complex function called AggregateParetoLoss that will add or sum a number of losses from a Pareto Distribution
For example, if you want to simulate 20 Pareto Losses with an average of 8000 and minimum of 1000 with this function you would type: =AggregateParetoLoss(8000,1000,20)
See appendix for a description of this function…
Poisson Distributed Random Values
To simulate the total loss we are also going to need to simulate the frequency or number of losses
We can also apply the Inverse Transform to the Poisson distribution by using its Inverse CDF
In your spreadsheet there is a function called RandomPoisson which will generate a random frequency from the Poisson Distribution
For example, if you want to generate a Random Claim Frequency with an average of 15 you would type =RandomPoisson(15)
This function actually uses a slightly different technique, which is explained in the appendix*
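The Inverse Transform idea for the Poisson distribution can be sketched by walking up the CDF until it reaches the uniform draw. This Python function is an illustration of that idea only: the workbook's RandomPoisson is described as using a different technique, and the name `random_poisson` is chosen here.

```python
import math
import random


def random_poisson(average, rng=random):
    """Draw a Poisson count by inverting the CDF with a uniform number."""
    u = rng.random()          # uniform on [0, 1), plays the role of Rand()
    x = 0
    pmf = math.exp(-average)  # P(X = 0)
    cdf = pmf
    # Step up the CDF until it reaches u; stop if the PMF underflows to zero
    while u > cdf and pmf > 0.0:
        x += 1
        pmf *= average / x    # P(X = x) from P(X = x - 1)
        cdf += pmf
    return x


random.seed(3)  # fixed seed so the run is repeatable (arbitrary choice)
counts = [random_poisson(15) for _ in range(20_000)]
mean_count = sum(counts) / len(counts)
var_count = sum((c - mean_count) ** 2 for c in counts) / len(counts)
print(f"mean {mean_count:.2f}, variance {var_count:.2f}")
```

For a Poisson distribution the average and the variance are equal, so both printed figures should sit near 15.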
Generating Poisson Distributed Random Values
[Figure: Poisson CDF (x-axis: values 0 to 24; y-axis: cumulative probability from 0 to 1). Rand() selects a value on the y-axis, which is read across to a Poisson Distributed Random Value on the x-axis]
Combining The Frequency And Severity
To model Underwriting Risk we need to be able to simulate the combined effect of the random frequency and severity of losses
We can now simulate both the frequency of losses and their severity
To simulate the total loss we simply generate a random value for the frequency of losses from the Poisson Distribution and then for each of the losses that occur generate a random Severity from the Pareto
By summing these losses we then obtain the Aggregate Loss and from that we also simulate the Underwriting Profit and Loss
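Putting the two pieces together gives the whole Frequency-Severity simulation. The Python sketch below is one possible arrangement under stated assumptions, not the workbook's implementation: the function names are chosen here, the Poisson sampler uses the inverse transform, and the inputs (average of 20 losses, Pareto minimum £1000 and average £12000, premium £250,000) are borrowed from the earlier review question.

```python
import math
import random


def random_poisson(average, rng):
    """Poisson count via the inverse transform on the CDF."""
    u, x = rng.random(), 0
    pmf = cdf = math.exp(-average)
    while u > cdf and pmf > 0.0:
        x += 1
        pmf *= average / x
        cdf += pmf
    return x


def random_pareto(average, minimum, rng):
    """Pareto loss via the inverse CDF, with the shape implied by the average."""
    shape = average / (average - minimum)
    return minimum / (1.0 - rng.random()) ** (1.0 / shape)


def simulate_underwriting_result(avg_frequency, avg_severity, min_severity, premium, rng):
    """One simulated year: premium minus the sum of the simulated losses."""
    n_losses = random_poisson(avg_frequency, rng)
    aggregate_loss = sum(
        random_pareto(avg_severity, min_severity, rng) for _ in range(n_losses)
    )
    return premium - aggregate_loss


rng = random.Random(2024)  # fixed seed so the run is repeatable (arbitrary choice)
results = [
    simulate_underwriting_result(20, 12000, 1000, 250000, rng) for _ in range(10_000)
]
prob_loss = sum(r < 0 for r in results) / len(results)
print(f"probability of an underwriting loss: {prob_loss:.1%}")
```

Sorting `results` and plotting the cumulative share of simulations gives the Underwriting Profit & Loss Distribution.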
Frequency-Severity Model
[Figure: Frequency-Severity Model. A Poisson Distribution of the number of claims (x-axis: Number Claims Over Next Day; y-axis: Probability) sits beside three severity CDF panels titled "How Large Are Claims?" (x-axis: 0 to 40; y-axis: 0% to 100%)]
Simulate the Number of Losses from the Poisson Distribution
For each loss that occurs Simulate their Size from the Pareto and Sum their Values
The Next Step…
[Figure: Frequency Severity Model. The Poisson frequency distribution (x-axis: Number Claims Over Next Day; y-axis: Probability) and the severity CDFs are combined by MONTE CARLO SIMULATION to produce the Underwriting Profit & Loss Distribution (x-axis: Underwriting Profit from -2000000 to 500000; y-axis: cumulative probability from 0% to 100%)]
Appendix : Cantelli’s Inequality
Cantelli’s Inequality is an important result from statistics that places an upper boundary on the probability of a random variable being less than or equal to some value
The estimate can be calculated just from the mean (μ) and the variance (σ²) of the random variable, for any value U below the mean:
$$P(\tilde{X} \le U) \le \frac{\sigma^{2}}{\sigma^{2} + (\mu - U)^{2}}$$
The true probability will be less than or equal to this estimate, BUT it has the advantage that we do not have to know the distribution of the random variable, just its Mean and Variance!
Cantelli’s Inequality Example
Imagine we have a portfolio whose average return is 8% (0.08) and variance is 0.003
We can use Cantelli’s inequality to estimate the probability of losing more than 5% of the portfolio's value (the random return on the portfolio being less than -5%)
$$P(\tilde{r} \le -0.05) \le \frac{0.003}{0.003 + (0.08 - (-0.05))^{2}} = \frac{0.003}{0.0199} \approx 0.1508$$
So Cantelli’s inequality tells us that the chance or probability of losing more than 5% of the portfolio is at most 15.08% (0.1508)
This is a worst-case estimate: the true probability will be less than or equal to this
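The bound is a one-line calculation. The Python sketch below repeats the portfolio example (mean 0.08, variance 0.003, threshold -0.05); `cantelli_bound` is a name chosen here.

```python
def cantelli_bound(mean, variance, threshold):
    """Upper bound on P(X <= threshold) for a threshold below the mean."""
    if threshold >= mean:
        raise ValueError("the bound applies to thresholds below the mean")
    return variance / (variance + (mean - threshold) ** 2)


bound = cantelli_bound(0.08, 0.003, -0.05)
print(f"probability of a return below -5% is at most {bound:.2%}")
```

Moving the threshold further below the mean, or reducing the variance, tightens the bound, which matches the intuition that extreme losses are less likely for a less volatile portfolio.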
Appendix: Exponential Distribution
Exponentially distributed random variables can take on any value between 0 and positive infinity (the support)
The formula for the CDF (Cumulative Distribution Function) of an Exponentially Distributed random variable (o) is
$$\mathrm{cdf}(x) = P(\tilde{o} \le x) = 1 - e^{-x/\beta}$$
Where e is the natural number (2.71828…)
Where β is the average of the random variable
The formula for the PDF (Probability Density Function) of the Exponential Distribution is
$$\mathrm{pdf}(x) = \frac{d \, \mathrm{cdf}(x)}{dx} = \frac{1}{\beta} e^{-x/\beta}$$
Meaning behind the Exponential Distribution
The formula of exponentially distributed random variables describes a very specific type of behaviour; it has not just been taken out of the air!
An exponentially distributed random variable exhibits what is known as constant intensity or is memoryless
This means that the probability of an exponentially distributed random variable being one unit larger is constant
For example, the probability of an exponentially distributed random variable being greater than 11 given that we know it is greater than 10 is the same as the probability of it being greater than 23 given that it is greater than 22
We can illustrate this with an example
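One way to illustrate the memoryless property is to compute the two conditional probabilities directly from the exponential CDF. In this Python sketch the average of 5 is a hypothetical value picked for illustration, and the function names are chosen here.

```python
import math


def exponential_survival(x, average):
    """P(X > x) for an exponential variable with the given average."""
    return math.exp(-x / average)


def conditional_survival(x, given, average):
    """P(X > x | X > given) for x >= given."""
    return exponential_survival(x, average) / exponential_survival(given, average)


average = 5.0  # hypothetical average, chosen only for illustration

p_11_given_10 = conditional_survival(11, 10, average)
p_23_given_22 = conditional_survival(23, 22, average)
p_above_1 = exponential_survival(1, average)

print(f"{p_11_given_10:.6f} {p_23_given_22:.6f} {p_above_1:.6f}")
```

All three figures agree: however long we have already waited, the chance of waiting at least one more unit is the same as it was at the start.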
Appendix: Poisson Process
If we assume the waiting time between events is Exponentially Distributed we have a randomly spaced sequence (or process) across time called a Poisson Process
The Poisson Process is an example of a Stochastic Process
Stochastic Processes are phenomena that change randomly across time (such as the price of assets, the number of claims received, the spread of a virus in a population)
A Poisson Process Realisation
[Figure: Total Number of Events (Losses) against Time (Days). Waiting Time Between Claims is Exponentially Distributed with an Average of 0.1 days]
Simplifying the Poisson Process : The Poisson Distribution
The Poisson Process is too detailed and abstract for most of the Risk Analysis carried out by Insurance Companies
Normally we are interested in the number of losses or events that occur over a time period, not the exact timing of those events
Also we do not think in terms of the average time between events but the average number of events that occur
Converting average waiting time to average frequency is very simple
We can also use the Poisson Process to count the number of events that occur over a period to obtain the Poisson Distribution
Average Waiting Time and Average Frequency
Central to the Poisson Process was the average waiting time between events or claims
Average Frequency measures the average number of events per unit of time
The relationship between Average Frequency ($\lambda$) and Average Waiting Time ($\tau$) is:
$$\lambda = \frac{1}{\tau} \qquad \tau = \frac{1}{\lambda}$$
For example, if on average we observe 10 events per day ($\lambda = 10$) then the average waiting time is 1/10th of a day or 2.4 hours
The greater the frequency the smaller the average waiting time
The average waiting time between events is also known as the Return Period; this term is especially common in the modelling of Catastrophic Risks or CAT Risks, where the Return Period is often measured in years
Counting the Number of Events that Take Place over a day
[Figure: Total Number of Events (Losses) against Time (Days). 7 Events or Claims take place in the day]
Frequency-Severity Model
[Figure: the Poisson Process (Total Number of Events (Claims) against Time (Hours)) leads to the Poisson Distribution (Probability against Number of Claims Over Next Day, labelled "(RandomPoisson)"), which answers "How Many Claims?". Three further charts (0% to 100% against 0 to 40) answer "How Large Are Claims?"]
Appendix: Random Poisson Function
There is an alternative to the for loop called the do while loop
It is slightly more complicated than the for loop but is more versatile
Sub PutNumbersLoop()
    i = 1                  ' Set i to an initial value of 1
    Do While i <= 100
        Cells(i, 1) = 1
        i = i + 1          ' Take the value of i, add one to it and put the result back into i (increase i by one)
    Loop
End Sub
Algorithm: RandomPoisson
1. Generate a random number from an exponential distribution with the appropriate average waiting time and call this number TotalTime
2. Set a number called TotalEvents equal to zero
3. Is TotalTime greater than 1?
   Yes: stop, the random number of claims in that day is TotalEvents
   No: add one to TotalEvents, generate a random number from an exponential distribution with the appropriate average waiting time, add this number to TotalTime, and loop back to step 3
Random Poisson Function
Implementing the algorithm discussed earlier using a do while loop:
Public Function RandomPoisson(AverageFrequency)
    ' This line tells the function to recalculate each time F9 is pressed
    Application.Volatile
    TotalEvents = 0
    AverageWaitingTime = 1 / AverageFrequency
    TotalTime = EXPINV(Rnd(), AverageWaitingTime)
    Do While TotalTime <= 1
        TotalEvents = TotalEvents + 1
        ' On each loop the TotalTime is increased by an Exponentially Distributed
        ' random number representing the time to the next event
        TotalTime = TotalTime + EXPINV(Rnd(), AverageWaitingTime)
    Loop
    RandomPoisson = TotalEvents
End Function
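For readers who want to test the algorithm outside Excel, the same logic can be sketched in Python. This is an illustrative translation, not the lecture's implementation. `exp_inv` is a hypothetical stand-in for EXPINV, assumed here to be the inverse CDF of an exponential distribution with the given average. The seed and sample size are arbitrary.

```python
import math
import random


def exp_inv(u, average):
    """Inverse CDF of an exponential distribution with the given average (assumed behaviour of EXPINV)."""
    return -average * math.log(1.0 - u)


def random_poisson(average_frequency, rng=random):
    """Count exponentially spaced events that fall inside one unit of time."""
    total_events = 0
    average_waiting_time = 1.0 / average_frequency
    total_time = exp_inv(rng.random(), average_waiting_time)
    while total_time <= 1.0:
        total_events += 1
        # Move forward by the random waiting time to the next event
        total_time += exp_inv(rng.random(), average_waiting_time)
    return total_events


rng = random.Random(42)  # fixed seed so the run is repeatable
draws = [random_poisson(10, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)
sample_variance = sum((d - sample_mean) ** 2 for d in draws) / len(draws)
print(sample_mean, sample_variance)
```

For a Poisson distribution the mean and the variance are both equal to the average frequency, so both printed values should be close to 10.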
Appendix: SumRandomPareto
This function sums (convolves) a number of random variables drawn from a Pareto distribution
Public Function SumRandomPareto(Average, Minimum, NoClaims)
    Sum = 0
    i = 1
    Do While i <= NoClaims
        ' Add a random number to the sum with the appropriate Average and Minimum
        Sum = Sum + ParetoInverseCDF(Rnd(), Average, Minimum)
        i = i + 1
    Loop
    SumRandomPareto = Sum
End Function
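A Python sketch of the same summation is given below. The definition of ParetoInverseCDF is not shown in this part of the lecture, so `pareto_inverse_cdf` here is an assumption: a Pareto Type I distribution whose scale is the Minimum and whose shape is chosen so that the mean equals the Average. The lecture's own function may be parametrised differently.

```python
import random


def pareto_inverse_cdf(u, average, minimum):
    """Inverse CDF of a Pareto Type I distribution (assumed parametrisation).

    The mean of a Pareto Type I is shape * minimum / (shape - 1), so
    shape = average / (average - minimum) makes the mean equal to average.
    """
    if average <= minimum:
        raise ValueError("average must be greater than minimum")
    shape = average / (average - minimum)
    return minimum / (1.0 - u) ** (1.0 / shape)


def sum_random_pareto(average, minimum, no_claims, rng=random):
    """Sum no_claims independent Pareto-distributed claim sizes."""
    total = 0.0
    for _ in range(no_claims):
        total += pareto_inverse_cdf(rng.random(), average, minimum)
    return total


rng = random.Random(1)  # fixed seed so the run is repeatable
total_loss = sum_random_pareto(average=20, minimum=10, no_claims=5, rng=rng)
print(total_loss)
```

Because every claim is at least the Minimum, the total for 5 claims can never fall below 50 in this example.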
Appendix : Inverse Transform Proof
Let F be the CDF of some random variable x such that
$F(Z) = \Pr(x \le Z)$
Where Pr stands for probability and $0 < F(Z) < 1$
Let U be a uniformly distributed random variable on (0, 1); what is the distribution of $F^{-1}(U)$?
We will show that the CDF of $F^{-1}(U)$ is the same as that of x:
$\Pr(F^{-1}(U) \le Z) = F(Z)$
Now since F is a CDF and therefore monotonic (assumed here to be strictly increasing, so that $F^{-1}$ exists):
$\Pr(F^{-1}(U) \le Z) = \Pr(F(F^{-1}(U)) \le F(Z)) = \Pr(U \le F(Z))$
Since U is a uniform random variable
$\Pr(U \le Y) = Y$ (where $0 < Y < 1$)
So we conclude:
$\Pr(F^{-1}(U) \le Z) = \Pr(U \le F(Z)) = F(Z)$
Therefore the CDF of $F^{-1}(U)$ is the same as the CDF of the random variable x
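The result can also be checked empirically. The Python sketch below pushes uniform random numbers through the inverse CDF of an exponential distribution and compares the fraction of samples below a point Z with F(Z). The average of 2, the point Z = 3, the seed and the sample size are arbitrary choices for illustration.

```python
import math
import random


def exponential_cdf(z, average):
    """F(z) = 1 - exp(-z / average)."""
    return 1.0 - math.exp(-z / average)


def exponential_inverse_cdf(u, average):
    """Inverse of F: solves u = 1 - exp(-z / average) for z."""
    return -average * math.log(1.0 - u)


rng = random.Random(7)  # fixed seed so the run is repeatable
average = 2.0
samples = [exponential_inverse_cdf(rng.random(), average) for _ in range(50000)]

z = 3.0
empirical = sum(1 for s in samples if s <= z) / len(samples)  # estimate of Pr(F^-1(U) <= Z)
theoretical = exponential_cdf(z, average)                     # F(Z)
print(empirical, theoretical)
```

The two printed values should agree to about two decimal places, and the agreement tightens as the number of samples grows.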