Upload
alexander-hood
View
215
Download
0
Tags:
Embed Size (px)
Citation preview
Dirty Data on Both Sides of the Dirty Data on Both Sides of the Pond: GIRO Data Quality Pond: GIRO Data Quality
Working Party ReportWorking Party Report2007 Ratemaking Seminar2007 Ratemaking Seminar
Atlanta GeorgiaAtlanta Georgia
Data Quality Working Party Data Quality Working Party MembersMembers
Robert CampbellRobert Campbell Louise Francis (chair) Louise Francis (chair) Virginia R. PrevostoVirginia R. Prevosto Mark RothwellMark Rothwell Simon SheafSimon Sheaf
AgendaAgenda
Literature ReviewLiterature Review Horror StoriesHorror Stories SurveySurvey ExperimentExperiment ActionsActions QuestionsQuestions Concluding RemarksConcluding Remarks
Literature ReviewLiterature Review
Data quality is maintained and improved by Data quality is maintained and improved by good data management practices. While the good data management practices. While the vast majority of the literature is directed vast majority of the literature is directed towards the I.T. industry, the paper highlights towards the I.T. industry, the paper highlights the following more actuary- or insurance-the following more actuary- or insurance-specific information:specific information:Actuarial Standard of Practice (ASOP) #23: Data Actuarial Standard of Practice (ASOP) #23: Data QualityQualityCasualty Actuarial Society White Paper on Data QualityCasualty Actuarial Society White Paper on Data QualityInsurance Data Management Association (IDMA)Insurance Data Management Association (IDMA)Data Management Educational Materials Working PartyData Management Educational Materials Working Party
Actuarial Standard of Practice Actuarial Standard of Practice (ASOP) #23(ASOP) #23
The American standard for all practice areas The American standard for all practice areas developed by the Actuarial Standards Board developed by the Actuarial Standards Board
Provides descriptive standards for:Provides descriptive standards for:• selecting data, selecting data, • relying on data supplied by others, relying on data supplied by others, • reviewing and using data, andreviewing and using data, and• making disclosures about data qualitymaking disclosures about data quality
http://www.actuarialstandardsboard.org/pdf/asops/http://www.actuarialstandardsboard.org/pdf/asops/asop023_097.pdfasop023_097.pdf
CAS White Paper on Data CAS White Paper on Data QualityQuality
Developed by tDeveloped by the Casualty Actuarial he Casualty Actuarial Society’s Committee on Management Data Society’s Committee on Management Data and Informationand Information
Provides guidelines to satisfy ASOP 23Provides guidelines to satisfy ASOP 23Describes Describes a system of standardised a system of standardised procedures to insure the integrity of procedures to insure the integrity of statistical data for personal automobilestatistical data for personal automobile
http://www.casact.org/pubs/forum/http://www.casact.org/pubs/forum/97wforum/97wf145.pdf97wforum/97wf145.pdf
Insurance Data Management Insurance Data Management AssociationAssociation
The IDMA is an American organization which The IDMA is an American organization which promotes professionalism in the Data Management promotes professionalism in the Data Management discipline through education, certification and discipline through education, certification and discussion forumsdiscussion forums
The IDMA web site:The IDMA web site:• Suggests publications on data quality,Suggests publications on data quality,• Describes a data certification model, andDescribes a data certification model, and• Contains Data Management Value Propositions Contains Data Management Value Propositions
which document the value to various insurance which document the value to various insurance industry stakeholders of investing in data qualityindustry stakeholders of investing in data quality
http://www.idma.orghttp://www.idma.org
CAS Data Management Educational CAS Data Management Educational Materials Working PartyMaterials Working Party
Reviewed a shortlist of texts recommended by Reviewed a shortlist of texts recommended by the IDMA for actuaries (9 in total)the IDMA for actuaries (9 in total)
Publishing a review of each text in the CAS Publishing a review of each text in the CAS Actuarial ReviewActuarial Review (starting with the August 2006 (starting with the August 2006 issue)issue)
Combined the reviews into an actuarial Combined the reviews into an actuarial introduction to data managementintroduction to data management
This was published in the Winter 2007 CAS This was published in the Winter 2007 CAS ForumForum
Both the reviews and the final paper are Both the reviews and the final paper are available through www.casact.orgavailable through www.casact.org
Literature Review SummaryLiterature Review Summary
Standards are generally prescriptive but Standards are generally prescriptive but descriptive information is availabledescriptive information is available
www.idma.orgwww.idma.org and and www.casact.orgwww.casact.org are are good sources for more information, good sources for more information, containing papers and other information in containing papers and other information in addition to those reviewed in the paperaddition to those reviewed in the paper
Look for an introductory overview paper to Look for an introductory overview paper to be published in the Winter 2008 CAS be published in the Winter 2008 CAS ForumForum
AgendaAgenda
Literature ReviewLiterature Review Horror StoriesHorror Stories SurveySurvey ExperimentExperiment ActionsActions Concluding RemarksConcluding Remarks QuestionsQuestions
Horror Stories – Non-InsuranceHorror Stories – Non-Insurance
Heart-and-Lung Transplant – wrong blood Heart-and-Lung Transplant – wrong blood typetype
Bombing of Chinese Embassy in BelgradeBombing of Chinese Embassy in Belgrade Mars Orbiter – confusion between Mars Orbiter – confusion between
imperial and metric unitsimperial and metric units Fidelity Mutual Fund – withdrawal of Fidelity Mutual Fund – withdrawal of
dividenddividend Porter County, Illinois – Tax Bill and Porter County, Illinois – Tax Bill and
Budget ShortfallBudget Shortfall
Horror Stories - ReservingHorror Stories - Reserving
NAIC concerns over non-US country NAIC concerns over non-US country datadata
Canadian federal regulator uncovered:Canadian federal regulator uncovered: Inaccurate accident year allocationInaccurate accident year allocation Double-counted IBNRDouble-counted IBNR Claims notified but not properly recordedClaims notified but not properly recorded
Former US regulator – requirement for Former US regulator – requirement for reconciliation exhibits in actuarial reconciliation exhibits in actuarial opinions motivated by belief that opinions motivated by belief that inaccurate data being usedinaccurate data being used
Horror Stories – Rating/PricingHorror Stories – Rating/Pricing
Examples faced by ISO:Examples faced by ISO: Exposure recorded in units of $10,000 Exposure recorded in units of $10,000
instead of $1,000instead of $1,000 Large insurer reporting personal auto Large insurer reporting personal auto
data as miscellaneous and hence missed data as miscellaneous and hence missed from ratemaking calculationsfrom ratemaking calculations
One company reporting all its Florida One company reporting all its Florida property losses as fire (including property losses as fire (including hurricane years)hurricane years)
Mismatched coding for policy and claims Mismatched coding for policy and claims datadata
Horror Stories - KatrinaHorror Stories - Katrina
US Weather models underestimated costs US Weather models underestimated costs Katrina by approx. 50% (Westfall, 2005)Katrina by approx. 50% (Westfall, 2005)
2004 RMS study highlighted exposure data 2004 RMS study highlighted exposure data that was:that was: Out-of-dateOut-of-date IncompleteIncomplete Mis-codedMis-coded
Many flood victims had no flood insurance after Many flood victims had no flood insurance after being told by agents that they were not in flood being told by agents that they were not in flood risk areas.risk areas.
AgendaAgenda
Literature ReviewLiterature Review Horror StoriesHorror Stories SurveySurvey ExperimentExperiment ActionsActions Concluding RemarksConcluding Remarks QuestionsQuestions
SurveySurvey
Purpose: Assess the impact of data Purpose: Assess the impact of data quality issues on the work of P&C quality issues on the work of P&C insurance actuariesinsurance actuaries
2 questions:2 questions: percentage of time spent on data quality percentage of time spent on data quality
issuesissues proportion of projects adversely affected proportion of projects adversely affected
by such issuesby such issues
TargetedTargeted Approach to Approach to DistributionDistribution
Members of the Working PartyMembers of the Working Party Members of CAS Committee on Members of CAS Committee on
Management Data and InformationManagement Data and Information Members of CAS Data Management and Members of CAS Data Management and
Information Educational Materials Working Information Educational Materials Working PartyParty
Members of the Working Party each Members of the Working Party each personally contacted a handful of personally contacted a handful of additional peopleadditional people
This resulted in 38 responsesThis resulted in 38 responses
Results - Percentage of TimeResults - Percentage of Time
EmployerEmployer NoNo..
MeanMean MediaMediann
MinMin MaxMax
Insurer/Insurer/ReinsurerReinsurer 1717 26.4%26.4% 25.0%25.0% 5.0%5.0% 50.0%50.0%
ConsultancyConsultancy 1313 27.1%27.1% 25.0%25.0% 7.5%7.5% 60.0%60.0%
OtherOther 88 23.4%23.4% 12.5%12.5% 2.0%2.0% 75.0%75.0%
AllAll 3838 26.0%26.0% 25.0%25.0% 2.0%2.0% 75.075.0%%
Results - Percentage of ProjectsResults - Percentage of Projects
EmployerEmployer No.No. MeanMean MediaMediann
MinMin MaxMax
Insurer/Insurer/ReinsurerReinsurer 1515 27.9%27.9% 20.0%20.0% 5.0%5.0% 60%60%
ConsultancyConsultancy 1313 43.3%43.3% 35.0%35.0% 10.010.0%%
100%100%
OtherOther 88 22.6%22.6% 20.0%20.0% 1.0%1.0% 50%50%
AllAll 3636 32.3%32.3% 30.0%30.0% 1.0%1.0% 100%100%
Survey ConclusionsSurvey Conclusions
Data quality issues have a significant impact on Data quality issues have a significant impact on the work of general insurance actuariesthe work of general insurance actuaries about a quarter of actuarial departments’ time about a quarter of actuarial departments’ time
is spent on such issuesis spent on such issues about a third of projects are adversely affectedabout a third of projects are adversely affected
The impact varies widely between different The impact varies widely between different actuaries, even those working in similar actuaries, even those working in similar organizationsorganizations
Limited evidence to suggest that the impact is Limited evidence to suggest that the impact is more significant for consultantsmore significant for consultants
AgendaAgenda
Literature ReviewLiterature Review Horror StoriesHorror Stories SurveySurvey ExperimentExperiment ActionsActions Concluding RemarksConcluding Remarks QuestionsQuestions
HypothesisHypothesis
Uncertainty of actuarial estimates of Uncertainty of actuarial estimates of ultimate incurred losses based on poor ultimate incurred losses based on poor data is significantly greater than that data is significantly greater than that of good dataof good data
Data Quality ExperimentData Quality Experiment
Examine the impact of incomplete Examine the impact of incomplete and/or erroneous data on actuarial and/or erroneous data on actuarial estimates of ultimate losses and the estimates of ultimate losses and the loss reservesloss reserves
Use real data with simulated Use real data with simulated limitations and/or errors and observe limitations and/or errors and observe the potential error in the actuarial the potential error in the actuarial estimatesestimates
Data Used in ExperimentData Used in Experiment
Real data for primary private Real data for primary private passenger bodily injury liability passenger bodily injury liability business for a single no-fault statebusiness for a single no-fault state
Eighteen (18) accident years of fully Eighteen (18) accident years of fully developed data; thus, true ultimate developed data; thus, true ultimate losses are knownlosses are known
Actuarial Methods UsedActuarial Methods Used
Paid chain ladder modelsPaid chain ladder models Incurred chain ladder modelsIncurred chain ladder models Frequency-severity modelsFrequency-severity models
Inverse power curve for tail factorsInverse power curve for tail factors
No judgment used in applying No judgment used in applying methodsmethods
Experiment: Three AspectsExperiment: Three Aspects
1.1. Vary size of the sample; that is, Vary size of the sample; that is, 1)1) All yearsAll years
2)2) Use only 7 accident yearsUse only 7 accident years
3)3) Use only last 3 diagonalsUse only last 3 diagonals
Experiment: Three AspectsExperiment: Three Aspects
2.2. Simulated data quality issuesSimulated data quality issues1.1. Misclassification of losses by accident yearMisclassification of losses by accident year
2.2. Early years not availableEarly years not available
3.3. Late processing of financial informationLate processing of financial information
4.4. Paid losses replaced by crude estimatesPaid losses replaced by crude estimates
5.5. Overstatements followed by corrections in Overstatements followed by corrections in following periodfollowing period
6.6. Definition of reported claims changedDefinition of reported claims changed
Experiment: Three AspectsExperiment: Three Aspects
3.3. BootstrappingBootstrapping
Results – Experiment 1Results – Experiment 1
More data generally reduces the More data generally reduces the volatility of the estimation errorsvolatility of the estimation errors
Results – Experiment 2Results – Experiment 2
Extreme volatility, especially those Extreme volatility, especially those based on paid databased on paid data
Actuaries ability to recognise and Actuaries ability to recognise and account for data quality issues is account for data quality issues is criticalcritical
Actuarial adjustments to the data Actuarial adjustments to the data may never fully correct for data may never fully correct for data quality issuesquality issues
Results – Experiment 3Results – Experiment 3
Less dispersion in results for error Less dispersion in results for error free datafree data
Standard deviation of estimated Standard deviation of estimated ultimate losses greater for the ultimate losses greater for the modified data (data with errors)modified data (data with errors)
Confirms original hypothesisConfirms original hypothesis
Conclusions Resulting from Conclusions Resulting from ExperimentExperiment
Greater accuracy and less variability in Greater accuracy and less variability in actuarial estimates when:actuarial estimates when: Quality data usedQuality data used Greater number of accident years usedGreater number of accident years used
Data quality issues can erode or even reverse Data quality issues can erode or even reverse the gains of increased volumes of data:the gains of increased volumes of data: If errors are significant, more data may worsen If errors are significant, more data may worsen
estimates due to the propagation of errors for certain estimates due to the propagation of errors for certain projection methodsprojection methods
Significant uncertainty in results when:Significant uncertainty in results when: Data is incompleteData is incomplete Data has errorsData has errors
AgendaAgenda
Literature ReviewLiterature Review Horror StoriesHorror Stories SurveySurvey ExperimentExperiment ActionsActions Concluding RemarksConcluding Remarks QuestionsQuestions
ActionsActions
Data Quality AdvocacyData Quality Advocacy Data Quality Data Quality
MeasurementMeasurement Management IssuesManagement Issues Screening DataScreening Data
Data Quality Advocacy - Data Quality Advocacy - ExamplesExamples
The Casualty Actuarial Society:The Casualty Actuarial Society: Data Management and Information Data Management and Information
Committee Committee Data Management and Information Data Management and Information
Education Materials Working Party Education Materials Working Party
Data Quality Measurement Data Quality Measurement IdeasIdeas
Quantify traditional aspects of quality data Quantify traditional aspects of quality data such as accuracy, consistency, uniqueness, such as accuracy, consistency, uniqueness, timeliness and completeness using a score timeliness and completeness using a score assigned by an expertassigned by an expert
Measure the consequences of data quality Measure the consequences of data quality problems problems measure the number of times in a sample that measure the number of times in a sample that
data quality errors cause errors in analyses, and data quality errors cause errors in analyses, and the severity of those errorsthe severity of those errors
Use measurement to motivate improvementUse measurement to motivate improvement
Management IssuesManagement Issues
Redman : Manage Information ChainRedman : Manage Information Chain establish management responsibilitiesestablish management responsibilities describe information chartdescribe information chart understand customer needsunderstand customer needs establish measurement systemestablish measurement system establish control and check performanceestablish control and check performance identify improvement opportunitiesidentify improvement opportunities make improvements make improvements
Management IssuesManagement Issues
Data supplier managementData supplier management Let suppliers know what you wantLet suppliers know what you want Provide feedback to suppliersProvide feedback to suppliers Balance the followingBalance the following
Known issues with supplierKnown issues with supplier Importance to the businessImportance to the business Supplier willingness to experiment togetherSupplier willingness to experiment together Ease of meeting face to faceEase of meeting face to face
Screening Data – Graphical Screening Data – Graphical DisplaysDisplays
-20
0
20
40
60
80
No
rma
lly D
istr
ibu
ted
Ag
e
+1.5*midspread
median
Outliers
75th Percentile
25th Percentile
Box and Whisker by CategoryBox and Whisker by CategoryAge by InjuryAge by Injury
Major Minor Serious Strain/Sprain
INJURY
0
25
50
75
100
Ag
e
Box Plot with OutlierBox Plot with Outlier
101
102
103
104
105
106
107
Pai
d Lo
ss
Screening Data – Graphical Screening Data – Graphical DisplaysDisplays
16
18
20
22
24
26
28
30
32
34
36
38
40
42
44
46
48
50
52
54
56
58
60
62
64
66
68
70
72
74
76
78
80
82
84
86
88
90
92
96
Age
0
200
400
600
800
1,000
Fre
qu
en
cy
Bar Plot for CategoriesBar Plot for Categories
Screening Data - Descriptive Screening Data - Descriptive StatisticsStatistics
N Minimum Maximum Mean Std.
Deviation License Year 30,250 490 2,049 1,990 16.3
Valid N 30,250
Multivariate MethodsMultivariate Methods
1( ) ' ( )
is a vector of variables
is a vector of means
is a variance-covariance matrix
MD is Mahalanobis Distance
MD x μ Σ x μ
x
μ
Σ
ConclusionsConclusions
Data quality issues significantly Data quality issues significantly impact the work of property and impact the work of property and casualty actuaries; andcasualty actuaries; and
Such issues could have a material Such issues could have a material impact on the results of property and impact on the results of property and casualty companies casualty companies
Concluding RemarksConcluding RemarksThe Working Party believes that insurers The Working Party believes that insurers should devote more time and resources should devote more time and resources to increasing the accuracy and to increasing the accuracy and completeness of their data by improving completeness of their data by improving their practices for collecting and handling their practices for collecting and handling data. In particular, insurers would benefit data. In particular, insurers would benefit from the investment of increased senior from the investment of increased senior management time in this area. By taking management time in this area. By taking such action, they could improve both such action, they could improve both their profitability and their efficiency. their profitability and their efficiency.
AgendaAgenda
Literature ReviewLiterature Review Horror StoriesHorror Stories SurveySurvey ExperimentExperiment ActionsActions Concluding RemarksConcluding Remarks QuestionsQuestions