13
SOA Predictive Analytics Seminar – Taiwan 31 Aug. 2018 | Taipei, Taiwan Session 3 Using Predictive Analytics to Solve Business Problems Yi-Ling Lin, FSA, MAAA

Using Predictive Analytics to Solve Business Problems · Using Predictive Analytics to Solve Business Problems . Yi-Ling Lin, FSA, MAAA . 9/12/2018 ... Predictive Analytics Data Analytics

  • Upload
    others

  • View
    12

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Using Predictive Analytics to Solve Business Problems · Using Predictive Analytics to Solve Business Problems . Yi-Ling Lin, FSA, MAAA . 9/12/2018 ... Predictive Analytics Data Analytics

SOA Predictive Analytics Seminar – Taiwan 31 Aug. 2018 | Taipei, Taiwan

Session 3

Using Predictive Analytics to Solve Business Problems

Yi-Ling Lin, FSA, MAAA

Page 2: Using Predictive Analytics to Solve Business Problems · Using Predictive Analytics to Solve Business Problems . Yi-Ling Lin, FSA, MAAA . 9/12/2018 ... Predictive Analytics Data Analytics

9/12/2018

2018 SOA Predictive Analytics TaiwanYI‐LING LIN, FSA, MAAA, FCA

THE TERRY GROUP

Using Predictive Analytics to Solve Business ProblemsAugust 31, 2018

What’s the Buzz is all About…

Many terms used interchangeably…confusing…exciting…and seem magical

Big Data

Machine Learning

Artificial Intelligence

Deep Learning

Wide Data

Tall Data

Predictive Analytics

Data Analytics

Big Data• Five “V”s…• Structured/unstructured data…• Traditional data……to provide value need…

Data Analytics• Spectrum by function, business value and 

sophistication… uses traditional and emerging algorithms…

Statistical methods and machine learning• ML is part of AI, deep learning is part of 

ML• Mathematically defined “learning” 

architecture plus optimization framework…

Page 3: Using Predictive Analytics to Solve Business Problems · Using Predictive Analytics to Solve Business Problems . Yi-Ling Lin, FSA, MAAA . 9/12/2018 ... Predictive Analytics Data Analytics

9/12/2018

Applying Data Analytics to Business Problems

3

Predictive

Prescriptive

Spectrum of data analytics: hindsight to insight to foresight

Value towardsbusiness solutions

What happened?

Why did it happen?

What will happen?

0

1

2

3

4

5

6

7

8

A B C D

Diagnostic

Descriptive

How do I influence what will happen?

0

1

2

3

4

5

6

7

8

A B C D

0

4

8

12

16

0 4 8 12 16 20

Y1

X1

0

4

8

12

16

0 4 8 12 16 20

Y1

X1

Adapted from Gartner’s Data Analytics Maturity Model

Analytical sophistication

…or decision‐informing

Application of Data Analytics in HealthCurrent and emerging applications of data analytics in US healthcare

• Dashboards, descriptive statistics, and reporting

• Trends exploration and forecasting 

• Pricing

• Claims reserving

• Risk scoring, risk stratification, and risk adjustment

• Plan design modeling 

• Stress testing

• Care management and decease prediction

• Fraud detection and outliers analysis

• Assumptions setting

… and many more

2009 2010 2011 2012 2013 2014 2015 2016Year

9.2%

9.9%

9.9%

11.4%

8.8%

7.8%7.9%7.4%

6.6%

8.7%

7.2%

8.2%

Annual Trend

Page 4: Using Predictive Analytics to Solve Business Problems · Using Predictive Analytics to Solve Business Problems . Yi-Ling Lin, FSA, MAAA . 9/12/2018 ... Predictive Analytics Data Analytics

9/12/2018

Models and Business ProblemsData analytics toolbox: spectrum of traditional to emerging methodologies

Generalized linear regression/

logit

Clustering/ classification

Survival/ Markov models

Time Series

Deep Leaning

Machine Learning

• Trend forecasting• Stress testing

• Fraud identification• Targeted marketing• High cost group stratification• Provider referral patterns

• Disease progression• Claims reserving

• Risk Scoring and Risk Adjustment

• Plan design and choice modeling

• Product conversion

• Signal processing• Text processing

New Programming Paradigm: Machine Learning• Humans Input data & answers• And how to “learn”… and what does it mean to 

be wrong…• Example: clustering algorithm or neural networks 

or decision tree/Random Forest

Traditional Modeling versus Machine LearningCould computer automatically learn the rules by looking at data?

Traditional

Classical programming model• Humans input data and set of 

rules/function on how to arrive at answers • Also how close they want data to fit to the 

“model”…• Example: linear regression or generalized 

linear regression

Data

Rules

AnswersHumans input:

Machine Learning

Data

Rules

Answers

Humans input:

Page 5: Using Predictive Analytics to Solve Business Problems · Using Predictive Analytics to Solve Business Problems . Yi-Ling Lin, FSA, MAAA . 9/12/2018 ... Predictive Analytics Data Analytics

9/12/2018

Model Evaluation and ValidationModel evaluation is an important part of any modeling project

• Relevance and importance of criteria 

• Appropriate and consistent with purpose

• On “unseen” or “test” sample of data

• Examples of criteria/metrics

Standard statistical measures (R squared, RMSE, MAE, etc.)

Predictive Ratios on groups of interest (e.g. Diagnostic groups or age groups)

Tolerance curves

Sensitivity and specificity (confusion matrix)

ROC curves

Comparison with naïve and standard models

…and many more

Cautionary tale!Famous Anscombe’s quartet: all four datasets have the same statistical properties, including R squared=0.67, means and variance of x and y, correlation and linear regression model: y=3+0.5x

Challenges Exciting things often come with challenges and potential pitfalls

• Messy, often high‐dimensional with missing values, data and data quality issues

• Potential bias in data

• Use of proxies

• Non‐discrimination, security and confidentiality

• Transparency vs. “black box”

• Spurious correlations: correlation vs. causality

• Interpretability and replicability

• Overfitting and overreliance

• Business purpose appropriateness and applicability

… and many more

Stories from the front lines…

o Quality of data. NYC taxi story: if it’s too good to be true… it’s probably isn’t!

o Bias inherited from humans through training data (training facial analytics on male white faces!)

o Spurious correlations…(divorce rate in Maine correlates with per capita margarine consumption at 99.3%)

Page 6: Using Predictive Analytics to Solve Business Problems · Using Predictive Analytics to Solve Business Problems . Yi-Ling Lin, FSA, MAAA . 9/12/2018 ... Predictive Analytics Data Analytics

9/12/2018

Case Study: Risk Scoring Modeling in Health

Risk Scoring Modeling in Healthcare Application of data analytics in healthcare for risk scoring

Risk scoring = using individuals’ data to “predict”/estimate outcome of an event

Used in many areas: cost predictions, mortality hazard ratios, probability of disease

Credit scores, healthcare risk scores, mortality risk scores, life underwriting…

In healthcare risk scoring (cost-based) in US moving from traditional to emerging

– Methodologies: from linear regression models to machine learning algorithms

– Data: traditional claims and enrollment data (Rx, Dx, demographic, prior year costs, etc.) to new and emerging data, e.g. socio-economic status factors

– Types of risk scoring models: concurrent vs. prospective; off-the-shelf vs. custom

Health risk scores are used for variety 

of purposes

Many sources of uncertainty 

Page 7: Using Predictive Analytics to Solve Business Problems · Using Predictive Analytics to Solve Business Problems . Yi-Ling Lin, FSA, MAAA . 9/12/2018 ... Predictive Analytics Data Analytics

9/12/2018

Case Study: Descriptive AnalyticsDashboards, distributions, descriptive statistics

In this case study babies under age of 2 were excluded, and population shown were enrolled at least for one month in both years

This is traditional analysis to inform what actually happened and the first step in any modeling project

160237

374

501605 593

710657

704780 805

964

395 348 331

405456

348422

482

744

884

1,221

1,541

0

200

400

600

800

1,000

1,200

1,400

1,600

1,800

baby child 18‐24 25‐29 30‐34 35‐39 40‐44 45‐49 50‐54 55‐59 60‐64 65+

Average PMPM (Medical and Rx) by year and gender 

F‐2016 PMPM F‐2017 PMPM M‐2016 PMPM M‐2017 PMPM

Case study for illustration purposes only

Case Study: Diagnostic AnalyticsInvestigating and identifying trends & relationship

Claim costs are lognormally distributed; prior and current somewhat linearly correlated 

(0.54)

Diagnostics focused on uncovering 

patterns,  relationships, trends, 

and potentially engineering 

predictive features

Case study for illustration purposes only

Case study for illustration purposes only

Page 8: Using Predictive Analytics to Solve Business Problems · Using Predictive Analytics to Solve Business Problems . Yi-Ling Lin, FSA, MAAA . 9/12/2018 ... Predictive Analytics Data Analytics

9/12/2018

Case Study: Comparing Custom and Off‐the‐shelfGreat improvement on accuracy and predictive ratios 

On Test Data:is 0.24, correlation 0.65, 

MAE=67% for off‐the‐shelf HCC

is 0.48, correlation 0.70, MAE = 73% for residual custom‐recalibrated HCC

Case study for illustration purposes only

0

1000

2000

3000

4000

5000

6000

0 1000 2000 3000 4000 5000 6000

Residual recalibrated HCC‐based Risk Scores

0

1000

2000

3000

4000

5000

6000

0 1000 2000 3000 4000 5000 6000

Off‐the‐shelf HCC Risk Scores

0.62

0.74

0.88

0.99

1.03

1.10

1.01

1.08

1.161.12

1.17

0.93

1.10

1.011.04

0.991.03

1.05

0.971.00 0.99 0.98

1.02 1.03

0.6

0.7

0.8

0.9

1

1.1

1.2

baby child 18‐24 25‐29 30‐34 35‐39 40‐44 45‐49 50‐54 55‐59 60‐64 65+

Predictive Ratios by Age Group

Perfect fit HCC Recalibrated HCC‐based

Actual cost 

Actual cost 

Predicted  cost Predicted

cost

Case Study: Decision‐informing AnalyticsVarious uses of risk scoring in health care: population health and care management

From low to high

Identifying best cases for care management

Acute illness, trauma, accidents

Chronic deceases, high cost and risk

HealthyRising cost?

Risk

Cost

Assess characteristics 

of high risk/low cost 

group: potential for 

care management

Prevention and wellness programs

Case study for illustration purposes only

Page 9: Using Predictive Analytics to Solve Business Problems · Using Predictive Analytics to Solve Business Problems . Yi-Ling Lin, FSA, MAAA . 9/12/2018 ... Predictive Analytics Data Analytics

9/12/2018

Case Study: Health Plan Choice Modeling

Case Study: Choice ModelingAn employer group wants to change the medical plans it offers to employees

Important to “predict” who will select various plans for financial and other modeling

Current Options

Actuarial Value

Percent

HMO 0.90 10%

High Value 0.86 42%

Medium Value 0.81 45%

CDHP 0.76 2%

New Options

Actuarial Value

Percent

Plan A 0.87

Plan B 0.81

Plan C 0.75

Plan D 0.68

Modeling

Participants data

Current choices /utility

Probability of selecting new options

Page 10: Using Predictive Analytics to Solve Business Problems · Using Predictive Analytics to Solve Business Problems . Yi-Ling Lin, FSA, MAAA . 9/12/2018 ... Predictive Analytics Data Analytics

9/12/2018

17

Subject Matter Expertise is Critical

ID Medical Coverage Coverage Tier Zip Code

2 Medium Value Option Employee + 1 Dependent 95066

6 Medium Value Option Employee Only 98053

7 Medium Value Option Employee Only 60630

8 HMO Plan  Employee Only 95121

9 Medium Value Option Employee Only 33472

14 HMO Plan  Employee + 1 Dependent 94610

15 Medium Value Option Employee Only 94109

16 Medium Value Option Employee Only 60478

17 HMO Plan  Employee + 1 Dependent 94607

21 Medium Value Option Employee + 1 Dependent 11377

24 Medium Value Option Employee Only 94587

25 HMO Plan  Employee Only 94109

26 Medium Value Option Employee Only 94606

27 Medium Value Option Employee Only 94133

28 Medium Value Option Employee Only 60532

31 Medium Value Option Employee + 2 or More Dependents 95635

32 HMO Plan  Employee + 1 Dependent 91206

33 Medium Value Option Employee Only 02458

34 Medium Value Option Employee Only 94103

36 Medium Value Option Employee Only 98121

39 Medium Value Option Employee Only 60614

45 Medium Value Option Employee + 1 Dependent 10604

48 HMO Plan  Employee Only 94121

53 HMO Plan  Employee Only 94122

54 HMO Plan  Employee Only 95356

55 Medium Value Option Employee Only 33026

58 Medium Value Option Employee Only 94107

59 Medium Value Option Employee Only 94123

61 HMO Plan  Employee Only 94501

63 HMO Plan  Employee + 2 or More Dependents 94595

69 Medium Value Option Employee + 1 Dependent 32703

72 Medium Value Option Employee Only 90046

73 HMO Plan  Employee + 2 or More Dependents 91367

77 Medium Value Option Employee Only 94587

81 Medium Value Option Employee Only 28173

82 HMO Plan  Employee Only 95691

85 Medium Value Option Employee Only 10573

89 Medium Value Option Employee Only 94115

Feature engineering is often informed by subject matter expert

Data Visualizations and Feature Engineering

18

$0K $100K $200K $300K $400K $500K

Salary

$0

$50

$100

$150

$200

$250

$300

$350

$400

$450

$500

$550

$600

$650

$700

$750

$800

To

talM

on

thly

De

du

ctio

ns

(no

Med

)

Current Medical Plan

CDHP Opt ion

High Value Opt ion

CDHP Opt ion High Value Opt ion

$0K $50K $100K $150K $200K $250K

Salary

$0K $50K $100K $150K $200K $250K

Salary

$0

$50

$100

$150

$200

$250

$300

$350

$400

$450

$500

$550

$600

$650

$700

$750

$800

To

talM

on

thly

Ded

uct

ion

s (n

o M

ed)

Data visualization helps with feature selection, model evaluation and much more…

Page 11: Using Predictive Analytics to Solve Business Problems · Using Predictive Analytics to Solve Business Problems . Yi-Ling Lin, FSA, MAAA . 9/12/2018 ... Predictive Analytics Data Analytics

9/12/2018

Process of Predicting and Evaluating Choice

What are predicted costs for individuals and plan?

Logit Model

Age / Stage in Life

Individual Plan 

Elections

Risk Tolerance

Premium / Contributions

Expected Claims

Plan Design

Individual Annual Cost

Total Employee 

Cost Sharing for the 

individual

Plan Cost

Case Study: Modeling ApproachEstimating Parameters

Heterogeneous Logit Model

• i – individuals

• j ‐ plan options

• k ‐ # of attributes with weights  ik

• ij – utility of plan option j to 

person i

ij =  j  i +  ij ik =  k +  k  ik

• Monte Carlo simulation and 

maximize log likelihood function

Logit Model

Age / Stage in Life

Individual Plan 

Elections

Risk Tolerance

Premium / Contributions

Expected Claims

Plan Design

Page 12: Using Predictive Analytics to Solve Business Problems · Using Predictive Analytics to Solve Business Problems . Yi-Ling Lin, FSA, MAAA . 9/12/2018 ... Predictive Analytics Data Analytics

9/12/2018

Case Study: Modeling New ChoicesProbability of participants’ elections of options A, B, C, or D

New choices

Know preferences ( and  ) 

and now changing the 

attributes ( )

Logit Model

Age / Stage in Life

Individual Plan 

Elections

Risk Tolerance

Premium / Contributions

Expected Claims

Plan Design

Case Study: Choice Modeling Results

22

Current Options

Actuarial Value

Percent

HMO 0.90 10%

High Value 0.86 42%

Medium Value 0.81 45%

CDHP 0.76 2%

New Options

Actuarial Value

Percent

Plan A 0.87 57%

Plan B 0.81 22%

Plan C 0.75 17%

Plan D 0.68 5%

Use results to inform business decisions and identify areas for further study

In this case study, people are more likely to choose a higher value plan option regardless of their current plan

Plan A (87% AV) Plan B (81% AV) Plan C (75% AV) Plan D (68% AV)

0.6

0.8

1.0

1.2

1.4

1.6

1.8

2.0

2.2

Loss

Ra

tio

Page 13: Using Predictive Analytics to Solve Business Problems · Using Predictive Analytics to Solve Business Problems . Yi-Ling Lin, FSA, MAAA . 9/12/2018 ... Predictive Analytics Data Analytics

9/12/2018

Questions? Thoughts… Comments?