Nonparametric Model Construction

Chapters 4 and 12

Stat 477 - Loss Models

Chapters 4 and 12 (Stat 477) Nonparametric Model Construction Brian Hartman - BYU 1 / 28

Types of data

For non-life insurance, the data typically come in the form of the number of claims and the amount of loss or claims.

However, because the model estimation and selection procedures are no different, we will also consider another type of data typically found in life insurance: duration data.

Examples of this type of data include: (1) the time of death of a newborn, (2) the time until death of a policyholder from policy issue, (3) the survival time of a patient after a major surgery, (4) the length of time an individual stays ill or unable to perform work, and (5) the length of time a person remains unemployed.

Each of these may also be called failure time, age-at-death, or survival time data.


Types of distribution models

Data-dependent distribution: its complexity depends on the data (or knowledge) used to produce the distribution; the number of “parameters” increases with the number of data points.

Parametric distribution: consists of a family of distributions whose members are determined by specifying the parameters (usually finite in number and fixed).

Empirical distribution: obtained by assigning equal probability 1/n to each of the n observed data points.

Kernel smoothed distribution: obtained by replacing each data point with a continuous random variable and then assigning probability 1/n to each such random variable. The random variables are identical except for a location or scale change that is related to the associated data point.

REMARK: The empirical distribution is a special case of the kernel smoothed distribution in which each random variable is degenerate, taking the value of its data point with probability 1.


Illustrative data sets

Data Set A (Table 13.1): the distribution of the number of accidents by one driver in one year, from an automobile insurance portfolio.

Data Set B (Table 13.2): artificial data representing 20 amounts paid on workers compensation medical benefits (full loss).

Data Set C (Table 13.3): data representing payments on 227 claims from a portfolio of general liability insurance policies. The data are summarized by payment range.

Data Sets D (Tables 13.4 and 13.5): data on the duration to some event. Here, we have 5-year term insurance policies and we observe the time until death, surrender, or expiration (contract termination). The data are presented differently in each table.


Data Sets A and B



Data Sets C and D1



Data Sets D2

* These data sets are taken directly from Klugman, et al. (2008).


Complete, individual data


Suppose $X$ is the random variable of interest and denote by $X_1, \ldots, X_n$ the values of $X$ for $n$ observations.

Denote the observed values by $x_1, \ldots, x_n$.

The empirical distribution function, or ECDF, is

\[ F_n(x) = \frac{1}{n} \sum_{i=1}^{n} I(X_i \le x), \]

where $n$ is the total number of observations and $I$ is the indicator function.

Because of possible duplicate values, we redefine the sample by considering the $k$ distinct values arranged in the order $y_1 < y_2 < \cdots < y_k$, with $k \le n$. Define $s_j$ to be the number of times $y_j$ appears in the sample.

Clearly, $\sum_{j=1}^{k} s_j = n$.


Complete, individual data continued


Define $r_j$ to be the risk set corresponding to the observation $y_j$: the number of observations greater than or equal to $y_j$. That is,

\[ r_j = \sum_{i=j}^{k} s_i. \]

For example, $r_1 = n$.

Based on this notation, we can write the ECDF as follows:

\[ F_n(x) = \begin{cases} 0, & x < y_1, \\ 1 - \dfrac{r_j}{n}, & y_{j-1} \le x < y_j, \; j = 2, \ldots, k, \\ 1, & x \ge y_k. \end{cases} \]


Illustrations

For purposes of illustration, construct the empirical distribution function for each of the following cases:

Data Set B in Table 13.2

Example 13.2

Observed values: 2,3,5,5,5,6,6,8,8,8,12,14,18,18,24,24.
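As a minimal sketch of the construction, the following Python tabulates $y_j$, $s_j$, $r_j$ and $F_n(y_j)$ for the observed values listed above. The function names `ecdf_table` and `ecdf` are illustrative choices, not part of the course material.

```python
from collections import Counter

# Observed values listed above.
observations = [2, 3, 5, 5, 5, 6, 6, 8, 8, 8, 12, 14, 18, 18, 24, 24]


def ecdf_table(data):
    """Return one row (y_j, s_j, r_j, F_n(y_j)) per distinct value y_j."""
    n = len(data)
    counts = Counter(data)
    rows = []
    at_risk = n      # r_1 = n
    cumulative = 0   # number of observations <= y_j
    for y in sorted(counts):
        s = counts[y]
        cumulative += s
        rows.append((y, s, at_risk, cumulative / n))
        at_risk -= s  # r_{j+1} = r_j - s_j for complete data
    return rows


def ecdf(data, x):
    """F_n(x): fraction of observations less than or equal to x."""
    return sum(1 for value in data if value <= x) / len(data)


for y, s, r, F in ecdf_table(observations):
    print(f"y={y:>2}  s={s}  r={r:>2}  F_n(y)={F:.4f}")
```

For any x between two distinct values, `ecdf` agrees with the piecewise form $1 - r_j/n$: for example, at x = 7 the next distinct value is 8 with risk set 9, giving 1 - 9/16 = 7/16.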


Cumulative hazard rate function

The cumulative hazard rate function is defined to be

\[ H(x) = -\log S(x), \]

where log denotes the natural logarithm. Some properties:

\[ H(x) = \int_{-\infty}^{x} h(z)\, dz \]

\[ F(x) = 1 - S(x) = 1 - e^{-H(x)} \]

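The Nelson-Aalen estimate of the cumulative hazard rate function is given by

\[ \hat{H}(x) = \begin{cases} 0, & x < y_1, \\ \sum_{i=1}^{j-1} \dfrac{s_i}{r_i}, & y_{j-1} \le x < y_j, \; j = 2, \ldots, k, \\ \sum_{i=1}^{k} \dfrac{s_i}{r_i}, & x \ge y_k. \end{cases} \]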

Derive the Nelson-Aalen estimates for the previous illustrations.
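A possible sketch for complete individual data, using the observed values from the earlier illustration. The helper names `nelson_aalen_complete` and `evaluate_step` are hypothetical, introduced only for this example.

```python
import math
from collections import Counter

# Observed values from the earlier illustration (complete data, no censoring).
observations = [2, 3, 5, 5, 5, 6, 6, 8, 8, 8, 12, 14, 18, 18, 24, 24]


def nelson_aalen_complete(data):
    """Return (y_j, H_hat(y_j)) pairs: the running sum of s_i / r_i."""
    counts = Counter(data)
    at_risk = len(data)
    cumulative_hazard = 0.0
    steps = []
    for y in sorted(counts):
        s = counts[y]
        cumulative_hazard += s / at_risk
        steps.append((y, cumulative_hazard))
        at_risk -= s
    return steps


def evaluate_step(steps, x):
    """Value at x of the right-continuous step function given by steps."""
    value = 0.0
    for y, h in steps:
        if y <= x:
            value = h
    return value


hazard_steps = nelson_aalen_complete(observations)
for y, h in hazard_steps:
    # exp(-H) is the survival estimate implied by H(x) = -log S(x).
    print(f"y={y:>2}  H_hat={h:.4f}  exp(-H_hat)={math.exp(-h):.4f}")
```

Note that the last increment is $s_k / r_k = 2/2 = 1$, so the implied survival estimate does not reach zero at the largest observation, unlike the empirical survival function.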


Grouped data

When data are grouped, we approximate the distribution function by connecting the points at the group boundaries with straight lines.

Let the values of the data be divided into $k$ intervals $(c_0, c_1], (c_1, c_2], \ldots, (c_{k-1}, c_k]$, where $c_0 < c_1 < \cdots < c_k$ and, often, $c_0 = 0$ and $c_k = \infty$.

Denote by $n_j$ the number of observations in the interval $(c_{j-1}, c_j]$, so that $\sum_{j=1}^{k} n_j = n$. We can then evaluate the empirical distribution function at each group boundary:

\[ F_n(c_j) = \frac{1}{n} \sum_{i=1}^{j} n_i. \]

The distribution function obtained by connecting the values of the empirical distribution function at the group boundaries with straight lines is called the ogive.

The formula is given by

\[ F_n(x) = \frac{c_j - x}{c_j - c_{j-1}} F_n(c_{j-1}) + \frac{x - c_{j-1}}{c_j - c_{j-1}} F_n(c_j), \quad \text{for } x \in (c_{j-1}, c_j]. \]


Empirical density for grouped data

The ogive is differentiable everywhere except at the group boundaries. At the boundaries, the density function is defined by arbitrarily making it right-continuous.

The derivative of the ogive gives the empirical density for grouped data, and the result is called a histogram.

The formula is given by

\[ f_n(x) = \frac{F_n(c_j) - F_n(c_{j-1})}{c_j - c_{j-1}} = \frac{n_j}{n (c_j - c_{j-1})}, \quad \text{for } x \in [c_{j-1}, c_j). \]

For illustration:

Construct the ogive and histogram for Data Set C.
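The two formulas can be sketched in Python as follows. The grouped data here are hypothetical (they are not Data Set C), all boundaries are assumed finite, and the function names are illustrative.

```python
# Hypothetical grouped data: boundaries c_0 < c_1 < ... < c_k and counts n_1..n_k.
boundaries = [0, 10, 25, 50, 100]
counts = [20, 30, 40, 10]


def ogive(boundaries, counts, x):
    """Linear interpolation of F_n between group boundaries."""
    n = sum(counts)
    if x <= boundaries[0]:
        return 0.0
    if x >= boundaries[-1]:
        return 1.0
    cumulative = 0  # observations at or below the lower boundary
    for j in range(1, len(boundaries)):
        lower, upper = boundaries[j - 1], boundaries[j]
        if x <= upper:  # x lies in (c_{j-1}, c_j]
            F_lower = cumulative / n
            F_upper = (cumulative + counts[j - 1]) / n
            weight = (x - lower) / (upper - lower)
            return (1 - weight) * F_lower + weight * F_upper
        cumulative += counts[j - 1]
    return 1.0


def histogram_density(boundaries, counts, x):
    """Empirical density n_j / (n (c_j - c_{j-1})) for x in [c_{j-1}, c_j)."""
    n = sum(counts)
    for j in range(1, len(boundaries)):
        lower, upper = boundaries[j - 1], boundaries[j]
        if lower <= x < upper:
            return counts[j - 1] / (n * (upper - lower))
    return 0.0


print(ogive(boundaries, counts, 17.5))
print(histogram_density(boundaries, counts, 30))
```

When the last boundary is infinite, as the slides allow, the ogive and histogram are not defined on the final interval; this sketch sidesteps that case by requiring finite boundaries.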


SOA Exam Question

You are given:

A random sample of payments from a portfolio of policies resulted in the following:

Interval        Number of Policies
(0, 50]         36
(50, 150]       x
(150, 250]      y
(250, 500]      84
(500, 1000]     80
(1000, ∞)       0
Total           n

Two values of the ogive constructed from the data above are:

$F_n(90) = 0.21$ and $F_n(210) = 0.51$

Calculate x.
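One way to set up the calculation, offered as a sketch rather than an official solution: linear interpolation on the ogive turns the two given values into two linear equations in x and y, which can be solved exactly with rational arithmetic. The variable names are illustrative.

```python
from fractions import Fraction

# Linear interpolation on the ogive gives
#   F_n(90)  = (36 + 0.4 x) / n       (90 is 40% of the way through (50, 150])
#   F_n(210) = (36 + x + 0.6 y) / n   (210 is 60% of the way through (150, 250])
# with n = 36 + x + y + 84 + 80 = 200 + x + y.
F90, F210 = Fraction(21, 100), Fraction(51, 100)
known_total = 36 + 84 + 80

# Rearranged into a1*x + b1*y = c1 and a2*x + b2*y = c2.
a1, b1, c1 = Fraction(2, 5) - F90, -F90, F90 * known_total - 36
a2, b2, c2 = 1 - F210, Fraction(3, 5) - F210, F210 * known_total - 36

# Solve the 2x2 system by Cramer's rule.
det = a1 * b2 - a2 * b1
x = (c1 * b2 - c2 * b1) / det
y = (a1 * c2 - a2 * c1) / det
n = known_total + x + y

print(f"x = {x}, y = {y}, n = {n}")
```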

Incomplete data

Incomplete or modified data

Observations may be incomplete because of censoring and/or truncation.

An observation is:

left truncated at d if, when it is below d, it is not recorded, but when it is above d, it is recorded at its observed value.

right truncated at u if, when it is above u, it is not recorded, but when it is below u, it is recorded at its observed value.

left censored at d if, when it is below d, it is recorded as being equal to d, but when it is above d, it is recorded at its observed value.

right censored at u if, when it is above u, it is recorded as being equal to u, but when it is below u, it is recorded at its observed value.

Left truncated and right censored observations are the most common. Left truncation usually occurs when a policy has an ordinary deductible d. Right censoring occurs with a policy limit.
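To make the two most common modifications concrete, here is a small sketch of how a deductible and a limit turn ground-up losses into left-truncated, right-censored records. The function `record_loss`, the sample losses, and the convention that a loss exactly equal to the deductible goes unrecorded are all assumptions made for illustration.

```python
def record_loss(loss, deductible=0.0, limit=None):
    """Return (d, recorded_value, is_censored), or None if the loss is truncated.

    Assumptions for this sketch: losses at or below the deductible are never
    recorded (left truncation at d), and losses above `limit` are recorded as
    `limit` (right censoring at u). The loss itself, not the payment, is kept.
    """
    if loss <= deductible:
        return None  # truncated: the insurer never sees this loss
    if limit is not None and loss > limit:
        return (deductible, limit, True)  # censored at the limit
    return (deductible, loss, False)


# Hypothetical ground-up losses with deductible d = 500 and limit u = 10000.
losses = [300, 800, 1500, 12000]
records = [record_loss(x, deductible=500, limit=10000) for x in losses]
print(records)
```

The key difference shows up in the output: the truncated loss leaves no record at all, while the censored loss leaves a record whose value is only a lower bound.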


Interpreting the risk set

For each individual observation, denote the truncation point by $d_j$ (with $d_j = 0$ if there is no truncation). Denote the observation itself by $x_j$ if it is not censored; if it is censored, denote its recorded value by $u_j$.

Consider the uncensored observations, with $y_1 < y_2 < \cdots < y_k$ being the $k$ unique values of the observed $x_j$'s, where $k$ is less than or equal to the number of uncensored observations.

The risk set for the $j$-th ordered observation $y_j$ is given by

\[ \begin{aligned} r_j &= \sum_i I(x_i \ge y_j) + \sum_i I(u_i \ge y_j) - \sum_i I(d_i \ge y_j) \\ &= \sum_i I(d_i < y_j) - \sum_i I(x_i < y_j) - \sum_i I(u_i < y_j), \end{aligned} \]

with the second equality holding because the total number of $d_i$'s equals the total number of $x_i$'s and $u_i$'s combined.
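The two expressions for the risk set can be checked against each other numerically. The sketch below uses a small hypothetical data set in which each record is a triple (d, value, censored); the record layout and the function name `risk_sets` are assumptions for illustration.

```python
# Hypothetical records: (truncation point d, recorded value, censored?).
records = [
    (0, 3, False),
    (0, 5, True),
    (0, 5, False),
    (1, 4, False),
    (2, 7, False),
    (4, 8, True),
    (0, 7, False),
]


def risk_sets(records):
    """Return {y_j: r_j}, computing r_j with both formulas and comparing them."""
    xs = [v for d, v, censored in records if not censored]  # uncensored values
    us = [v for d, v, censored in records if censored]      # censored values
    ds = [d for d, v, censored in records]                  # truncation points
    result = {}
    for y in sorted(set(xs)):
        first = (sum(x >= y for x in xs) + sum(u >= y for u in us)
                 - sum(d >= y for d in ds))
        second = (sum(d < y for d in ds) - sum(x < y for x in xs)
                  - sum(u < y for u in us))
        assert first == second  # the two formulas must agree
        result[y] = first
    return result


print(risk_sets(records))
```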


Interpreting the risk set

Remember:

For survival/mortality data, the risk set is the number of people observed alive at age $y_j$.

For loss amount data, the risk set is the number of policies with observed loss amounts (either the actual amount or the maximum amount due to a policy limit) greater than or equal to $y_j$, less those with deductibles greater than or equal to $y_j$.

Recursive relationship:

\[ r_j = r_{j-1} + \sum_i I(y_{j-1} \le d_i < y_j) - \sum_i I(x_i = y_{j-1}) - \sum_i I(y_{j-1} \le u_i < y_j) \]

For illustration: do Example 14.1.


Kaplan-Meier (product-limit) estimator

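The Kaplan-Meier (product-limit) estimate of the survival function is given by

\[ S_n(t) = \begin{cases} 1, & 0 \le t < y_1, \\ \prod_{i=1}^{j-1} \left( \dfrac{r_i - s_i}{r_i} \right), & y_{j-1} \le t < y_j, \; j = 2, \ldots, k, \\ \prod_{i=1}^{k} \left( \dfrac{r_i - s_i}{r_i} \right) \text{ or } 0, & t \ge y_k. \end{cases} \]

Clearly, when $s_k = r_k$, we have $S_n(t) = 0$ for $t \ge y_k$.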
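A minimal sketch of the product-limit calculation on hypothetical left-truncated, right-censored data. Each record is a triple (d, value, censored) with d less than the value; this layout and the name `kaplan_meier` are illustrative assumptions.

```python
from fractions import Fraction

# Hypothetical records: (truncation point d, recorded value, censored?).
records = [
    (0, 3, False),
    (0, 5, True),
    (0, 5, False),
    (1, 4, False),
    (2, 7, False),
    (4, 8, True),
    (0, 7, False),
]


def kaplan_meier(records):
    """Return (y_j, S_n(y_j)) pairs, the estimate just after each event time."""
    event_times = sorted({v for d, v, censored in records if not censored})
    survival = Fraction(1)
    steps = []
    for y in event_times:
        # s_j: uncensored observations equal to y_j.
        s = sum(1 for d, v, censored in records if not censored and v == y)
        # r_j: records that entered before y_j and have not yet left.
        r = sum(1 for d, v, censored in records if d < y <= v)
        survival *= Fraction(r - s, r)
        steps.append((y, survival))
    return steps


for y, S in kaplan_meier(records):
    print(f"y={y}  S_n(y)={S}")
```

Exact fractions are used so the products can be compared with hand calculations. In this data set the largest recorded value is a censored 8, so the estimate stays at 8/45 after the last event time, which is the situation where the formula leaves the tail as "the product or 0".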



SOA Exam Question

You are given:

All members of a mortality study are observed from birth. Some leave the study by means other than death.

$s_3 = 1$, $s_4 = 3$

The following Kaplan-Meier product-limit estimates were obtained:

$S_n(y_3) = 0.65$, $S_n(y_4) = 0.50$, $S_n(y_5) = 0.25$

Between times $y_4$ and $y_5$, six observations were censored.

Assume no observations were censored at the times of deaths.

Calculate the value of $s_5$.
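One possible line of reasoning, sketched in Python with exact fractions. It assumes that each quoted value is the estimate immediately after the deaths at that time, and that there are no new entrants after birth, so the risk set between $y_4$ and $y_5$ shrinks only through the deaths at $y_4$ and the six censored observations.

```python
from fractions import Fraction

S3, S4, S5 = Fraction(65, 100), Fraction(50, 100), Fraction(25, 100)
s4 = 3
censored_between_y4_and_y5 = 6

# S_n(y4) / S_n(y3) = (r4 - s4) / r4, so r4 = s4 / (1 - S4/S3).
r4 = s4 / (1 - S4 / S3)

# Everyone is observed from birth, so nobody joins the risk set later.
r5 = r4 - s4 - censored_between_y4_and_y5

# S_n(y5) / S_n(y4) = (r5 - s5) / r5, so s5 = r5 * (1 - S5/S4).
s5 = r5 * (1 - S5 / S4)

print(f"r4 = {r4}, r5 = {r5}, s5 = {s5}")
```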


Nelson-Aalen estimator

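First, derive the Nelson-Aalen estimator of the cumulative hazard rate function, given by

\[ \hat{H}(t) = \begin{cases} 0, & 0 \le t < y_1, \\ \sum_{i=1}^{j-1} \dfrac{s_i}{r_i}, & y_{j-1} \le t < y_j, \; j = 2, \ldots, k, \\ \sum_{i=1}^{k} \dfrac{s_i}{r_i}, & t \ge y_k. \end{cases} \]

Then use $\hat{S}(t) = e^{-\hat{H}(t)}$ to estimate the survival function.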



(Modified) SOA Exam Question

You are studying the length of time attorneys are involved in settling bodily injury lawsuits. Let T represent the number of months from the time an attorney is assigned such a case to the time the case is settled.

Nine cases were observed during the study period, two of which were not settled at the conclusion of the study. For those two cases, the time spent up to the conclusion of the study, 4 months and 6 months, was recorded instead.

The observed values of T for the other seven cases are as follows:

1 3 3 5 8 8 9

Use the Nelson-Aalen estimator to estimate Pr(3 ≤ T ≤ 5). Compare the result with the estimate from the Kaplan-Meier estimator.
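A sketch of one way to carry out the comparison, not an official solution. It writes Pr(3 ≤ T ≤ 5) as S(3-) - S(5), and since both estimators only jump at observed settlement times, it evaluates S(3-) at t = 2. The function names are illustrative.

```python
import math
from fractions import Fraction

settled = [1, 3, 3, 5, 8, 8, 9]  # observed settlement times (months)
censored = [4, 6]                # cases still open when the study ended


def risk_and_events(y):
    """Return (r, s): cases still under observation at y, settlements at y."""
    r = sum(1 for v in settled + censored if v >= y)
    s = settled.count(y)
    return r, s


def km_survival(t):
    """Kaplan-Meier estimate: product of (r_i - s_i) / r_i over y_i <= t."""
    result = Fraction(1)
    for y in sorted(set(settled)):
        if y <= t:
            r, s = risk_and_events(y)
            result *= Fraction(r - s, r)
    return result


def na_cumulative_hazard(t):
    """Nelson-Aalen estimate: sum of s_i / r_i over y_i <= t."""
    total = Fraction(0)
    for y in sorted(set(settled)):
        if y <= t:
            r, s = risk_and_events(y)
            total += Fraction(s, r)
    return total


# No settlement occurs strictly between 1 and 3, so S(3-) equals S(2).
km_prob = km_survival(2) - km_survival(5)
na_prob = math.exp(-na_cumulative_hazard(2)) - math.exp(-na_cumulative_hazard(5))

print(f"Kaplan-Meier: {km_prob} = {float(km_prob):.4f}")
print(f"Nelson-Aalen: {na_prob:.4f}")
```

Under these conventions the Kaplan-Meier figure comes out as 16/45 and the Nelson-Aalen figure is somewhat smaller, because exp(-s/r) exceeds 1 - s/r at every event time.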


Kernel density models

Notation:

$p(y_j)$ = the probability assigned to the value $y_j$, for $j = 1, \ldots, k$, by the empirical distribution.

$K_y(x)$ = a (continuous) distribution function with mean $y$.

$k_y(x)$ = the density function corresponding to $K_y(x)$, called the kernel function.

A kernel density estimator of the distribution function is defined by

\[ \hat{F}(x) = \sum_{j=1}^{k} p(y_j) K_{y_j}(x), \]

with the corresponding estimator of the density function given by

\[ \hat{f}(x) = \sum_{j=1}^{k} p(y_j) k_{y_j}(x). \]

Examples of kernels: Uniform, Triangular, Gamma, and Normal.

Uniform kernel

In the case where the kernel is Uniform on $[y - b, y + b]$, we have:

\[ k_y(x) = \frac{1}{2b}, \quad \text{for } y - b \le x \le y + b. \]

The corresponding distribution function is given by

\[ K_y(x) = \frac{x - y + b}{2b}, \quad \text{for } y - b \le x \le y + b, \]

where $K_y(x) = 0$ for $x < y - b$ and $K_y(x) = 1$ for $x > y + b$.

The value $b$ is sometimes called the bandwidth.

For illustration: do Exercise 14.29.


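Triangular kernel

In the case where the kernel is Triangular, we have:

\[ k_y(x) = \begin{cases} 0, & x < y - b, \\ \dfrac{x - y + b}{b^2}, & y - b \le x < y, \\ \dfrac{y + b - x}{b^2}, & y \le x < y + b, \\ 0, & x \ge y + b. \end{cases} \]

The corresponding distribution function is

\[ K_y(x) = \begin{cases} 0, & x < y - b, \\ \dfrac{(x - y + b)^2}{2b^2}, & y - b \le x < y, \\ 1 - \dfrac{(y + b - x)^2}{2b^2}, & y \le x < y + b, \\ 1, & x \ge y + b. \end{cases} \]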


Gamma kernel

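In the case where the kernel is a Gamma distribution with shape parameter $\alpha$ and scale parameter $y/\alpha$, we have:

\[ k_y(x) = \frac{x^{\alpha - 1} e^{-x\alpha/y}}{(y/\alpha)^{\alpha}\, \Gamma(\alpha)}. \]

Note that in this case, we have:

Mean: $\alpha \cdot (y/\alpha) = y$

Variance: $\alpha \cdot (y/\alpha)^2 = y^2/\alpha$.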
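A short sketch of the gamma kernel in Python, using only the standard library. The sample losses and the choice of α are hypothetical, and the density is simply set to 0 for x ≤ 0 to avoid edge cases at the origin.

```python
import math


def gamma_kernel_density(x, y, alpha):
    """Gamma density at x with shape alpha and scale y/alpha, so the mean is y."""
    if x <= 0:
        return 0.0  # simplification for this sketch
    theta = y / alpha
    return (x ** (alpha - 1) * math.exp(-x / theta)
            / (theta ** alpha * math.gamma(alpha)))


def gamma_kernel_estimate(data, x, alpha):
    """Kernel density estimate at x, giving each data point probability 1/n."""
    return sum(gamma_kernel_density(x, y, alpha) for y in data) / len(data)


# Hypothetical losses; a larger alpha gives less smoothing (variance y^2/alpha).
losses = [10.0, 20.0, 40.0]
for alpha in (2.0, 20.0):
    print(alpha, gamma_kernel_estimate(losses, 20.0, alpha))
```

Unlike the uniform and triangular kernels, the gamma kernel puts no probability on negative values, and its spread grows with the data point y since the variance is $y^2/\alpha$.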


Kernel density estimates (Source: Y-K. Tse’s slides)

[Figure, three panels. Panel 1: kernel function versus x for the Gaussian, rectangular, and triangular kernels. Panel 2: density estimate with rectangular kernel versus loss variable x, for bandwidth = 3 and bandwidth = 8. Panel 3: density estimate with Gaussian kernel versus loss variable x, for bandwidth = 3 and bandwidth = 8.]


SOA Exam Question

Suppose you use a Uniform kernel density estimator with $b = 50$ to smooth the following workers compensation loss payments:

82 126 161 294 384

If $\hat{F}(x)$ denotes the estimated distribution function and $F_5(x)$ denotes the empirical distribution function, determine $|\hat{F}(150) - F_5(150)|$.
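A sketch of the calculation using the uniform kernel distribution function $K_y(x)$, with probability 1/5 assigned to each payment. The function names are illustrative, and this is one possible approach rather than an official solution.

```python
losses = [82, 126, 161, 294, 384]
bandwidth = 50


def uniform_kernel_cdf(x, y, b):
    """K_y(x) for a uniform kernel on [y - b, y + b]."""
    if x < y - b:
        return 0.0
    if x > y + b:
        return 1.0
    return (x - y + b) / (2 * b)


def kernel_cdf_estimate(data, x, b):
    """Kernel-smoothed distribution function with probability 1/n per point."""
    return sum(uniform_kernel_cdf(x, y, b) for y in data) / len(data)


def empirical_cdf(data, x):
    """Fraction of observations less than or equal to x."""
    return sum(1 for y in data if y <= x) / len(data)


smoothed = kernel_cdf_estimate(losses, 150, bandwidth)
empirical = empirical_cdf(losses, 150)
difference = abs(smoothed - empirical)
print(f"{smoothed:.3f} {empirical:.3f} {difference:.3f}")
```

Only the payments 126 and 161 have kernels that straddle 150; the kernel at 82 lies entirely below 150 and contributes 1, while those at 294 and 384 contribute 0.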


Additional problem on kernel estimation

This problem is adapted from an old SOA question:

The times to death in a study of 5 lives, measured from the onset of a disease, are:

2 3 3 3 7

Using a triangular kernel with a bandwidth of 2, estimate the density function at 2.5.
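A sketch of the calculation with the triangular kernel density $k_y(x)$ and probability 1/5 per observation. The function names are illustrative, and this is one possible approach rather than an official solution.

```python
times = [2, 3, 3, 3, 7]
bandwidth = 2


def triangular_kernel_density(x, y, b):
    """k_y(x) for a triangular kernel centered at y with bandwidth b."""
    if x < y - b or x >= y + b:
        return 0.0
    if x < y:
        return (x - y + b) / b ** 2  # rising side
    return (y + b - x) / b ** 2      # falling side


def kernel_density_estimate(data, x, b):
    """Kernel density estimate at x with probability 1/n per observation."""
    return sum(triangular_kernel_density(x, y, b) for y in data) / len(data)


estimate = kernel_density_estimate(times, 2.5, bandwidth)
print(f"{estimate:.4f}")
```

The observation at 7 is more than one bandwidth away from 2.5 and contributes nothing; the observation at 2 and the three observations at 3 each contribute a kernel height of 0.375.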

