Testing Models on Simulated Data Presented at the Casualty Loss Reserve Seminar September 19, 2008 Glenn Meyers, FCAS, PhD ISO Innovative Analytics


  • Testing Models on Simulated Data

    Presented at the Casualty Loss Reserve Seminar, September 19, 2008
    Glenn Meyers, FCAS, PhD
    ISO Innovative Analytics

  • The Application

    • Estimating loss reserves
    • Given a triangle of incremental paid losses
    • Ten years with 55 observations arranged by accident year and settlement lag
    • Estimate the distribution of the sum of the remaining 45 accident year/settlement lag cells
    • Loss reserve models typically have many parameters
    • Examples in this presentation: 9 parameters
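The 55/45 split follows from the shape of a ten-year triangle. A minimal sketch, assuming accident years and settlement lags are both numbered 1 through 10 and that a cell is observed when accident year plus lag is at most 11 (the numbering and the helper name `split_triangle` are illustrative, not from the slides):

```python
N_YEARS = 10  # ten accident years, ten settlement lags


def split_triangle(n=N_YEARS):
    """Split the n-by-n grid into observed cells and cells still to be paid."""
    observed, remaining = [], []
    for ay in range(1, n + 1):
        for lag in range(1, n + 1):
            # Assumed convention: a cell is known once ay + lag <= n + 1.
            if ay + lag <= n + 1:
                observed.append((ay, lag))
            else:
                remaining.append((ay, lag))
    return observed, remaining


observed, remaining = split_triangle()
print(len(observed), len(remaining))  # prints: 55 45
```

The reserve is the sum over the `remaining` cells, so the quantity of interest is the distribution of that sum, not just its mean.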

  • Danger: Possible Overfitting!

    • Model describes the sample, but not the population
    • Understates the range of results
    • The range is the goal!
    • Is overfitting a problem with loss reserve models?
    • If so, what do we do about it?

  • Outline

    • Illustrate overfitting with a simple example
      • Example: fit a normal distribution with three observations
      • Illustrate graphically the effects of overfitting
    • Illustrate overfitting with a loss reserve model
      • Reasonably good loss reserve model
      • Show similar graphical effects as the normal example

  • Normal Distribution: MLE for parameters μ and σ

    n = 3 in these examples
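The formula on this slide did not survive extraction. For reference, the standard maximum likelihood estimators for the mean μ and standard deviation σ of a normal distribution, given observations x_1, …, x_n, are shown below. This is presumably what the slide displayed, but that is an assumption.

```latex
\hat{\mu} = \frac{1}{n}\sum_{i=1}^{n} x_i ,
\qquad
\hat{\sigma} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} \left(x_i - \hat{\mu}\right)^2}
```

The divisor is n rather than n − 1, so with n = 3 the estimate of σ is biased low, which contributes to the overfitting discussed in the slides that follow.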

  • Simulation Testing Strategy

    • Select 3 observations at random
    • Population: μ = 1000, σ = 500
    • Predict a normal distribution using the maximum likelihood estimator
    • Select 1,000 additional observations at random from the same population
    • Compare the distribution of the additional observations with the predicted distribution
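The steps on this slide can be sketched with Python's standard library. The seed and the helper names `fit_normal_mle` and `predicted_percentiles` are illustrative assumptions. The population values (mean 1000, standard deviation 500) and the sample sizes (3 and 1,000) come from the slide.

```python
import random
from statistics import NormalDist, fmean, pstdev


def fit_normal_mle(sample):
    """Maximum likelihood fit: sample mean and divide-by-n standard deviation."""
    return NormalDist(fmean(sample), pstdev(sample))


def predicted_percentiles(fitted, test_sample):
    """Fitted CDF evaluated at each test observation, sorted ascending."""
    return sorted(fitted.cdf(x) for x in test_sample)


rng = random.Random(2008)  # arbitrary seed, chosen only for repeatability
population = NormalDist(mu=1000, sigma=500)

# Three observations for fitting, 1,000 more for testing.
training = [rng.gauss(population.mean, population.stdev) for _ in range(3)]
testing = [rng.gauss(population.mean, population.stdev) for _ in range(1000)]

fitted = fit_normal_mle(training)
percentiles = predicted_percentiles(fitted, testing)
```

If the fitted distribution matched the population, `percentiles` would look like a sorted uniform sample on [0, 1]. With only three training observations it usually does not.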

  • Simulated Fits

  • PP Plots

    • Plot predicted percentiles against uniform percentiles
    • If predicted percentiles are uniformly distributed, the plot should be a 45° line.
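A minimal sketch of the coordinates behind a PP plot. The plotting positions i/(n + 1) are one common convention and are an assumption here, since the slides do not say which positions were used. The function names are illustrative.

```python
def pp_points(predicted):
    """Pair each sorted predicted percentile with its uniform percentile."""
    ordered = sorted(predicted)
    n = len(ordered)
    # Assumed plotting positions: i / (n + 1) for i = 1..n.
    return [((i + 1) / (n + 1), p) for i, p in enumerate(ordered)]


def max_deviation(points):
    """Largest vertical distance between the PP plot and the 45-degree line."""
    return max(abs(u - p) for u, p in points)


# Evenly spread percentiles lie on the 45-degree line.
well_calibrated = pp_points([0.8, 0.2, 0.6, 0.4])

# Percentiles bunched near 0 and 1 (tails too light) bend away from it.
light_tails = pp_points([0.01, 0.02, 0.98, 0.99])
```

Here `max_deviation(well_calibrated)` is essentially zero, while `max_deviation(light_tails)` is large. On a plot, that second pattern is the S-shape.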

  • PP Plots

  • Simulated Fits

  • View Maximum Likelihood as an Estimation Strategy

    • If you estimate distributions by maximum likelihood repeatedly, how well do you do in the aggregate?
    • Consider a space of possible parameters for a model
      • Select parameters at random
      • Select a sample for estimation (training)
      • Select a sample for post-estimation (testing)

  • Continuing Prior Slide

    • Select parameters at random
    • Select a sample for estimation (training)
    • Select a sample for post-estimation (testing)
    • Fit a model for each training sample
    • Calculate the predicted percentiles of the testing sample.
    • Combine for all samples.
    • In the aggregate, the predicted percentiles should be uniformly distributed.
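The aggregate test can be sketched as below. The parameter space (mean uniform on 500 to 1500, standard deviation uniform on 250 to 750), the trial counts, and the seed are all hypothetical, since the slides do not state them. The share of pooled percentiles outside the 5% to 95% band is used as a simple numeric stand-in for reading the tails off a PP plot.

```python
import random
from statistics import NormalDist, fmean, pstdev


def pooled_mle_percentiles(n_trials=2000, n_train=3, n_test=20, seed=1):
    """Repeat the fit-and-test cycle and pool the predicted percentiles."""
    rng = random.Random(seed)
    pooled = []
    for _ in range(n_trials):
        # Hypothetical parameter space; not taken from the slides.
        mu = rng.uniform(500, 1500)
        sigma = rng.uniform(250, 750)
        training = [rng.gauss(mu, sigma) for _ in range(n_train)]
        testing = [rng.gauss(mu, sigma) for _ in range(n_test)]
        # Maximum likelihood fit on the training sample only.
        fitted = NormalDist(fmean(training), pstdev(training))
        pooled.extend(fitted.cdf(x) for x in testing)
    return pooled


pooled = pooled_mle_percentiles()
# Uniform percentiles would put about 10% of the mass in the two 5% tails.
tail_share = sum(p < 0.05 or p > 0.95 for p in pooled) / len(pooled)
```

Under these assumptions `tail_share` comes out well above 0.10: far more test observations land in the predicted tails than the fitted distributions allow for.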

  • PP Plots for Normal Distribution (n = 3)

    • S-shaped PP plot: tails are too light!

  • Problem: How to Fit the Distribution?

    • Proposed solution: Bayesian analysis
    • Likelihood = Pr{Data|Model}
    • We need Pr{Model|Data} for each model in the prior
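The step from Pr{Data|Model} to Pr{Model|Data} is Bayes' theorem. Written for a discrete set of candidate models (the index k and the notation F_k for the distribution function of model k are introduced here for illustration):

```latex
\Pr\{\text{Model}_k \mid \text{Data}\}
  = \frac{\Pr\{\text{Data} \mid \text{Model}_k\}\,\Pr\{\text{Model}_k\}}
         {\sum_{j} \Pr\{\text{Data} \mid \text{Model}_j\}\,\Pr\{\text{Model}_j\}}

F(x \mid \text{Data}) = \sum_{k} \Pr\{\text{Model}_k \mid \text{Data}\}\, F_k(x)
```

The second line is the predictive distribution: a posterior-weighted mixture of the candidate models rather than a single best-fitting one.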

  • Bayesian Fits

    • Predictive distributions are spread out more than the MLEs.
    • On individual fits, they do not always match the testing data.

  • Bayesian Fits as a Strategy

    • Parameters of the model were selected at random from the prior distribution
    • Near perfect uniform distribution of predicted percentiles
    • At least in this example, the Bayesian strategy does not overfit.
    • Figure: combined PP plot for the Bayesian fitting strategy
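A sketch of the Bayesian strategy for the normal example. The discrete prior below (a small grid of means and standard deviations, all equally likely) is hypothetical, as the slides do not give the prior that was used. The true parameters are drawn from that same prior, which mirrors the slide's setup.

```python
import math
import random
from statistics import NormalDist

# Hypothetical discrete prior: every (mu, sigma) pair is equally likely.
MODELS = [NormalDist(mu, sigma)
          for mu in (500, 750, 1000, 1250, 1500)
          for sigma in (250, 500, 750)]
PRIOR = [1.0 / len(MODELS)] * len(MODELS)


def log_likelihood(model, data):
    """log Pr{Data|Model} for independent normal observations."""
    return sum(
        -0.5 * ((x - model.mean) / model.stdev) ** 2
        - math.log(model.stdev) - 0.5 * math.log(2.0 * math.pi)
        for x in data
    )


def posterior_weights(data):
    """Pr{Model|Data}, proportional to Pr{Data|Model} * Pr{Model}."""
    log_post = [math.log(p) + log_likelihood(m, data)
                for m, p in zip(MODELS, PRIOR)]
    top = max(log_post)  # subtract the maximum to avoid underflow
    unnormalized = [math.exp(lp - top) for lp in log_post]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def predictive_cdf(weights, x):
    """Posterior-weighted mixture of the model CDFs."""
    return sum(w * m.cdf(x) for w, m in zip(weights, MODELS))


def pooled_bayes_percentiles(n_trials=1000, n_train=3, n_test=10, seed=1):
    rng = random.Random(seed)
    pooled = []
    for _ in range(n_trials):
        truth = rng.choices(MODELS, weights=PRIOR)[0]  # drawn from the prior
        training = [rng.gauss(truth.mean, truth.stdev) for _ in range(n_train)]
        testing = [rng.gauss(truth.mean, truth.stdev) for _ in range(n_test)]
        weights = posterior_weights(training)
        pooled.extend(predictive_cdf(weights, x) for x in testing)
    return pooled


pooled = pooled_bayes_percentiles()
tail_share = sum(p < 0.05 or p > 0.95 for p in pooled) / len(pooled)
```

Because the prior used for fitting is the one the parameters were drawn from, `tail_share` should land close to 0.10 up to simulation noise. That matches the near-uniform combined PP plot the slide reports.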

  • Analyze Overfitting in Loss Reserve Formulas

    • Many candidate formulas: pick a good one
    • Paid Loss_{AY, Lag} ~ Collective Risk Model
      • Claim count distribution is negative binomial
      • Claim severity distribution is Pareto
      • Claim severity increases with settlement lag
    • Calculate likelihood using FFT
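A minimal simulation sketch of one cell under a collective risk model of this kind. Every numeric value is hypothetical, and so is the rule in `severity_scale` that makes severity grow with lag. The Pareto is taken to be the two-parameter (type II) form, which is an assumption. The sketch draws random losses only and does not attempt the FFT-based likelihood calculation.

```python
import math
import random


def negative_binomial(rng, r, p):
    """Claim count: failures before the r-th success (integer r),
    built as a sum of r geometric draws."""
    log_q = math.log(1.0 - p)
    return sum(int(math.log(1.0 - rng.random()) / log_q) for _ in range(r))


def pareto_severity(rng, alpha, theta):
    """Pareto (type II) draw with survival (theta / (x + theta)) ** alpha."""
    u = 1.0 - rng.random()  # uniform on (0, 1]
    return theta * (u ** (-1.0 / alpha) - 1.0)


def severity_scale(lag, base=10_000.0, growth=0.10):
    """Hypothetical rule: the severity scale grows 10% per settlement lag."""
    return base * (1.0 + growth) ** (lag - 1)


def simulate_cell_loss(rng, lag, r=5, p=0.5, alpha=3.0):
    """Aggregate paid loss for one cell: a random number of random claims."""
    n_claims = negative_binomial(rng, r, p)
    theta = severity_scale(lag)
    return sum(pareto_severity(rng, alpha, theta) for _ in range(n_claims))


rng = random.Random(2008)  # arbitrary seed
losses_lag1 = [simulate_cell_loss(rng, lag=1) for _ in range(20_000)]
losses_lag5 = [simulate_cell_loss(rng, lag=5) for _ in range(20_000)]
```

With these illustrative parameters the expected claim count is 5 and the expected severity at lag 1 is 5,000, so the lag 1 losses average near 25,000. The lag 5 losses average higher because the severity scale is larger.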

  • Simulation Testing Strategy

    • Select triangles of data at random
      • Payment pattern at random
      • ELR at random
      • {Loss|Expected Loss} for each cell in the triangle from the collective risk model
    • Randomly select outcomes using the same payment pattern and ELR
    • Evaluate the maximum likelihood and Bayesian fitting methodologies with PP plots.

  • Background on Formula

    • Fit a Bayesian model to over 100 insurers and produced an acceptable combined PP plot on test data from six years later.
    • This paper tests the approach on simulated data, rather than real data.

  • Prior Payout Patterns

  • Prior Probabilities for ELR

  • Selected Individual Estimates

  • Maximum Likelihood Fitting Methodology: PP Plots for Combined Fits

    • PP plot reveals the S-shape that characterizes overfitting.
    • The tails are too light.

  • Bayesian Fitting Methodology: PP Plots for Combined Fits

    • Nailed the tails

  • Summary

    • Examples illustrate the effect of overfitting
    • Bayesian approach provides a solution
    • These examples are based on simulated data, with the advantage that the prior is known.
    • Previous paper extracted prior distributions from maximum likelihood estimates of similar claims of other insurers

  • Conclusion

    • It is not enough to know if assumptions are correct.
    • To avoid the light tails that arise from overfitting, one has to get information that is: Outside the Triangle