Copyright © 2014 SAS Institute Inc. All rights reserved. #analytics2014 Maximizing a Churn Campaign’s Profitability With Cost-Sensitive Predictive Analytics Alejandro Correa Bahnsen, Luxembourg University Andres Felipe Gonzalez Montoya, DIRECTV

Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

DESCRIPTION

Presentation at SAS Analytics conference 2014. Predictive analytics has been applied to solve a wide range of real-world problems. Nevertheless, current state-of-the-art predictive analytics models are not well aligned with business needs, since they do not include the real financial costs and benefits during the training and evaluation phases. Churn modeling does not yield the best results when it is measured by the investment per subscriber in a loyalty campaign and by the financial impact of failing to detect a churner versus wrongly predicting a non-churner. This presentation shows how a cost-sensitive modeling approach leads to better results in terms of profitability and predictive power, and is applicable to many other business challenges.

Citation preview

Page 1: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Copyright © 2014 SAS Institute Inc. All rights reserved. #analytics2014

Maximizing a Churn Campaign’s Profitability With Cost-Sensitive

Predictive Analytics

Alejandro Correa Bahnsen, Luxembourg University Andres Felipe Gonzalez Montoya, DIRECTV

Page 2: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Agenda

• Churn modeling

• Evaluation Measures

• Offers

• Predictive modeling

• Cost-Sensitive Predictive Modeling

Cost Proportionate Sampling

Bayes Minimum Risk

CS – Decision Trees

• Conclusions

Page 3: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Churn Modeling

• Detect which customers are likely to abandon

Voluntary churn

Involuntary churn

Page 4: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Customer Churn Management Campaign

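[Figure: flow diagram of the churn management campaign. Inflow (new customers) feeds the customer base of active customers. The churn model prediction splits the base into predicted churners (TP: actual churners; FP: actual non-churners) and predicted non-churners (FN: actual churners; TN: actual non-churners). Of the TP group, a fraction 𝛾 accepts the offer and stays while 1 − 𝛾 leaves; all FP and TN customers stay (fraction 1); all FN customers leave (fraction 1). Outflow: effective churners.]

*Verbraken et al. (2013). A novel profit maximizing metric for measuring classification performance of customer churn prediction models.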

Page 5: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Evaluation of a Campaign

• Confusion Matrix

| | True class: Churner ($y_i=1$) | True class: Non-Churner ($y_i=0$) |
|---|---|---|
| Predicted class: Churner ($c_i=1$) | TP | FP |
| Predicted class: Non-Churner ($c_i=0$) | FN | TN |

• Accuracy = $\frac{TP+TN}{TP+TN+FP+FN}$

• Recall = $\frac{TP}{TP+FN}$

• Precision = $\frac{TP}{TP+FP}$

• F1-Score = $2\,\frac{Precision \cdot Recall}{Precision+Recall}$
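As a small illustration of the four measures on this slide, the sketch below computes them from raw confusion-matrix counts. The function name `classification_metrics` and the example counts are hypothetical and are not taken from the presentation.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, recall, precision and F1 from confusion-matrix counts.

    Empty denominators yield 0.0 instead of raising (a convention chosen
    for this sketch, not something stated on the slide).
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f1": f1,
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=30, fp=10, fn=20, tn=40)
```

Note that none of these measures looks at how much each error costs, which is the limitation the following slides address.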

Page 6: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Evaluation of a Campaign

• However, these measures assign the same weight to every type of error

• This is not the case in a churn model, since failing to predict a churner carries a different cost than wrongly predicting a non-churner

Churners also differ from one another in financial impact

Page 7: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Financial Evaluation of a Campaign

[Figure: the same campaign flow diagram, annotated with the cost of each path. TN: 0. FN: 𝐶𝐿𝑉. TP who leaves despite the offer: 𝐶𝐿𝑉 + 𝐶𝑎. TP who accepts the offer: 𝐶𝑜 + 𝐶𝑎. FP: 𝐶𝑜 + 𝐶𝑎.]

*Verbraken et al. (2013). A novel profit maximizing metric for measuring classification performance of customer churn prediction models.

Page 8: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Financial Evaluation of a Campaign

• Cost Matrix

| | True class: Churner ($y_i=1$) | True class: Non-Churner ($y_i=0$) |
|---|---|---|
| Predicted class: Churner ($c_i=1$) | $C_{TP_i} = \gamma_i C_{o_i} + (1-\gamma_i)\,CLV_i + C_a$ | $C_{FP_i} = C_{o_i} + C_a$ |
| Predicted class: Non-Churner ($c_i=0$) | $C_{FN_i} = CLV_i$ | $C_{TN_i} = 0$ |

where:

$C_a$ = Administrative cost

$CLV_i$ = Client Lifetime Value of customer $i$

$C_{o_i}$ = Cost of the offer made to customer $i$

$\gamma_i$ = Probability that customer $i$ accepts the offer
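A minimal sketch of the per-customer cost matrix follows. It reads the true-positive cost as γ·C_o + (1 − γ)·CLV + C_a, which is the reading consistent with the per-path costs shown on the campaign diagram; the function name `churn_costs` and the numeric inputs are assumptions made for illustration.

```python
def churn_costs(clv, offer_cost, gamma, admin_cost):
    """Example-dependent costs for one customer.

    clv        : customer lifetime value (CLV_i)
    offer_cost : cost of the offer made to the customer (C_o_i)
    gamma      : probability that the customer accepts the offer (gamma_i)
    admin_cost : administrative cost of contacting a customer (C_a)
    """
    return {
        # Contacted churner: accepts (pays the offer) or leaves (CLV is lost);
        # the administrative cost is paid either way.
        "TP": gamma * offer_cost + (1 - gamma) * clv + admin_cost,
        # Contacted non-churner: offer and administrative cost are spent.
        "FP": offer_cost + admin_cost,
        # Missed churner: the whole lifetime value is lost.
        "FN": clv,
        # Non-churner left alone: nothing is spent.
        "TN": 0.0,
    }


# Hypothetical values in euros, not taken from the presentation.
costs = churn_costs(clv=1000.0, offer_cost=100.0, gamma=0.5, admin_cost=10.0)
```

With these made-up numbers a missed churner (1000) costs far more than a wasted offer (110), which is the asymmetry that accuracy and F1 ignore.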

Page 9: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Financial Evaluation of a Campaign

• Using the cost matrix, the total cost is calculated as:

$C = \sum_i \left[ y_i \left( c_i \cdot C_{TP_i} + (1-c_i)\, C_{FN_i} \right) + (1-y_i) \left( c_i \cdot C_{FP_i} + (1-c_i)\, C_{TN_i} \right) \right]$

• Additionally, the savings are defined as:

$C_s = \frac{C_0 - C}{C_0}$

where $C_0$ is the cost when all the customers are predicted as non-churners
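The total cost and the savings can be sketched in a few lines. The helper names `total_cost` and `savings`, the tuple layout of the costs, and the toy data are assumptions for illustration; the baseline `C_0` is obtained by scoring the all-non-churner prediction, as the slide defines it.

```python
def total_cost(y_true, y_pred, costs):
    """Sum the example-dependent cost of each (true label, prediction) pair.

    costs is a list of (c_tp, c_fp, c_fn, c_tn) tuples, one per customer.
    """
    total = 0.0
    for y, c, (c_tp, c_fp, c_fn, c_tn) in zip(y_true, y_pred, costs):
        if y == 1:
            total += c_tp if c == 1 else c_fn
        else:
            total += c_fp if c == 1 else c_tn
    return total


def savings(y_true, y_pred, costs):
    """Relative cost reduction versus predicting every customer as a non-churner."""
    baseline = total_cost(y_true, [0] * len(y_true), costs)  # C_0
    if baseline == 0:
        return 0.0  # no churners, so there is nothing to save
    return (baseline - total_cost(y_true, y_pred, costs)) / baseline


# Toy data: four customers who share one hypothetical cost tuple.
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 1, 0]
costs = [(560.0, 110.0, 1000.0, 0.0)] * 4
campaign_cost = total_cost(y_true, y_pred, costs)
campaign_savings = savings(y_true, y_pred, costs)
```

A savings value of 0 means the model is no better than running no campaign at all, and a negative value means the campaign costs more than the churn it prevents.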

Page 10: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Financial Evaluation of a Campaign

• Customer Lifetime Value

*Glady et al. (2009). Modeling churn using customer lifetime value.

Page 11: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Agenda

• Churn modeling

• Evaluation Measures

• Offers

• Predictive modeling

• Cost-Sensitive Predictive Modeling

Cost Proportionate Sampling

Bayes Minimum Risk

CS – Decision Trees

• Conclusions

Page 12: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Offers

• The same offer may not apply to all customers (e.g., some already have premium channels)

• An offer should be chosen so that it maximizes both the probability of acceptance (𝛾) and the CLV

Page 13: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Offers clusters

Page 14: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Offers Analysis

Improve to HD DVR

Monthly Discount

Premium Channels

Evaluate Offers Performance

Page 15: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Offers Analysis

[Figure: churn rate and Gamma (right axis) for Cluster 1, Cluster 2, Cluster 3 and Cluster 4]

𝛾 = Probability that a customer accepts the offer

Page 16: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Predictive Modeling

• Use predictive analytics to detect the behavioral patterns of customers who defected in the past

Page 17: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Predictive Modeling

• Then check which of the current customers share the same patterns

Page 18: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Predictive Modeling

• Dataset

| Dataset | N | Churn | $C_0$ (Euros) |
|---|---|---|---|
| Total | 9410 | 4.83% | 580,884 |
| Training | 3758 | 5.05% | 244,542 |
| Validation | 2824 | 4.77% | 174,171 |
| Testing | 2825 | 4.42% | 162,171 |
| Under-Sampling | 374 | 50.80% | 244,542 |

Page 19: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Predictive Modeling

• Algorithms

Decision Trees

Logistic Regression

Random Forest

Page 20: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Predictive Modeling - Results

[Figure: two bar charts for Decision Trees, Logistic Regression and Random Forest, each trained on the Training and Under-Sampling sets. Left: F1-Score (axis 0 to 0.14). Right: Savings (axis 0% to 8%).]

Page 21: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Predictive Modeling - SMOTE

• Synthetic Minority Over-sampling Technique

[Figure: scatter plot on axes Dim 1 and Dim 2 showing the synthetic samples]
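The slide only names SMOTE, so the following is a simplified sketch of its core idea rather than the implementation used in the presentation: each synthetic point is placed at a random position on the segment between a minority example and one of its nearest minority neighbours. The function name `smote_sketch`, the random choice of base example, and the toy points are assumptions.

```python
import math
import random


def smote_sketch(minority, n_synthetic, k=5, seed=0):
    """Generate synthetic minority points by interpolating between neighbours.

    minority    : list of feature tuples (at least two points)
    n_synthetic : number of synthetic points to create
    k           : number of nearest minority neighbours to choose from
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        base = rng.choice(minority)
        # k nearest minority neighbours of the base point (excluding itself).
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Toy minority class in two dimensions (Dim 1, Dim 2).
churners = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sketch(churners, n_synthetic=10, k=2)
```

Because every synthetic point is an interpolation, it always lies between existing minority examples; the technique balances the classes but knows nothing about the cost of each customer.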

Page 22: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Predictive Modeling - SMOTE

• Dataset

| Dataset | N | Churn | $C_0$ (Euros) |
|---|---|---|---|
| Total | 9410 | 4.83% | 580,884 |
| Training | 3758 | 5.05% | 244,542 |
| Validation | 2824 | 4.77% | 174,171 |
| Testing | 2825 | 4.42% | 162,171 |
| Under-Sampling | 374 | 50.80% | 244,542 |
| SMOTE | 6988 | 48.94% | 4,273,083 |

Page 23: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Predictive Modeling - SMOTE

[Figure: two bar charts for Decision Trees, Logistic Regression and Random Forest, each trained on the Training, Under-Sampling and SMOTE sets. Left: F1-Score (axis 0 to 0.14). Right: Savings (axis 0% to 8%).]

Page 24: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Predictive Modeling - SMOTE

• Sampling techniques help to improve the models’ predictive power, but not necessarily the savings

• There is a need for methods that aim to increase savings

Page 25: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Agenda

• Churn modeling

• Evaluation Measures

• Offers

• Predictive modeling

• Cost-Sensitive Predictive Modeling

Cost Proportionate Sampling

Bayes Minimum Risk

CS – Decision Trees

• Conclusions

Page 26: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Cost-Sensitive Predictive Modeling

• Traditional methods assume the same cost for different errors

• Not the case in Churn modeling

• Some cost-sensitive methods assume a constant cost difference between errors

• Example-Dependent Cost-Sensitive Predictive Modeling

Page 27: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Cost-Sensitive Predictive Modeling

• Changing class distribution

Cost Proportionate Rejection Sampling

Cost Proportionate Over Sampling

• Direct Cost

Bayes Minimum Risk

• Modifying a learning algorithm

CS – Decision Tree

Page 28: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Cost Proportionate Sampling

• Normalized Cost weight

$w_i = \begin{cases} C_{FP_i} & \text{if } y_i = 0 \\ C_{FN_i} & \text{if } y_i = 1 \end{cases}$

$\bar{w}_i = \frac{w_i}{\max_j w_j}$

Page 29: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Cost Proportionate Sampling

• Cost Proportionate Over Sampling

| Example | $y_i$ | $w_i$ |
|---|---|---|
| 1 | 0 | 1 |
| 2 | 1 | 10 |
| 3 | 0 | 2 |
| 4 | 1 | 20 |
| 5 | 0 | 1 |

Initial Dataset: (1,0,1), (2,1,10), (3,0,2), (4,1,20), (5,0,1)

Cost Proportionate Dataset: (1,0,1); (2,1,1), (2,1,1), …, (2,1,1); (3,0,2), (3,0,2); (4,1,1), (4,1,1), (4,1,1), …, (4,1,1), (4,1,1); (5,0,1)

*Elkan, C. (2001). The Foundations of Cost-Sensitive Learning.
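The over-sampling example can be reproduced with a short sketch: each example is copied as many times as its cost weight, and every copy then carries unit weight. The function name `cost_proportionate_oversample` and the rounding of weights to integers are assumptions; the five (example, label, weight) triples are the ones on the slide.

```python
def cost_proportionate_oversample(examples):
    """Replicate each (example_id, label, weight) triple `weight` times.

    Weights are rounded to the nearest integer (an assumption of this
    sketch); each replica is stored with weight 1.
    """
    resampled = []
    for example_id, label, weight in examples:
        copies = int(round(weight))
        resampled.extend([(example_id, label, 1)] * copies)
    return resampled


initial = [(1, 0, 1), (2, 1, 10), (3, 0, 2), (4, 1, 20), (5, 0, 1)]
oversampled = cost_proportionate_oversample(initial)
```

The resulting dataset grows with the sum of the weights, so expensive churners dominate the training set.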

Page 30: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Cost Proportionate Sampling

• Cost Proportionate Rejection Sampling

| Example | $y_i$ | $w_i$ | $\bar{w}_i$ |
|---|---|---|---|
| 1 | 0 | 1 | 0.05 |
| 2 | 1 | 10 | 0.5 |
| 3 | 0 | 2 | 0.1 |
| 4 | 1 | 20 | 1 |
| 5 | 0 | 1 | 0.05 |

Initial Dataset: (1,0,1), (2,1,10), (3,0,2), (4,1,20), (5,0,1)

Cost Proportionate Dataset: (2,1,1), (4,1,1), (4,1,1), (5,0,1)

*Zadrozny et al. (2003). Cost-sensitive learning by cost-proportionate example weighting.
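A minimal sketch of rejection sampling with the normalized weights: each example is kept with probability equal to its weight divided by the largest weight. The function name `cost_proportionate_rejection_sample` and the seeding are assumptions, and this single-pass version keeps each example at most once, so it will not reproduce the exact dataset drawn on the slide.

```python
import random


def cost_proportionate_rejection_sample(examples, seed=0):
    """Keep each (example_id, label, weight) triple with probability w / max(w).

    Accepted examples are returned with unit weight.
    """
    rng = random.Random(seed)
    w_max = max(weight for _, _, weight in examples)
    return [
        (example_id, label, 1)
        for example_id, label, weight in examples
        if rng.random() < weight / w_max
    ]


initial = [(1, 0, 1), (2, 1, 10), (3, 0, 2), (4, 1, 20), (5, 0, 1)]
sampled = cost_proportionate_rejection_sample(initial, seed=42)
```

Unlike over-sampling, this shrinks the dataset, which is why the CS – Rejection-Sampling training set in the deck has only a few hundred rows.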

Page 31: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Cost Proportionate Sampling

• Dataset

| Dataset | N | Churn | $C_0$ (Euros) |
|---|---|---|---|
| Total | 9410 | 4.83% | 580,884 |
| Training | 3758 | 5.05% | 244,542 |
| Validation | 2824 | 4.77% | 174,171 |
| Testing | 2825 | 4.42% | 162,171 |
| Under-Sampling | 374 | 50.80% | 244,542 |
| SMOTE | 6988 | 48.94% | 4,273,083 |
| CS – Rejection-Sampling | 428 | 41.35% | 231,428 |
| CS – Over-Sampling | 5767 | 31.24% | 2,350,285 |

Page 32: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Cost Proportionate Sampling

[Figure: two bar charts for Decision Trees, Logistic Regression and Random Forest, each trained on the Training, Under, SMOTE, CS-Rejection and CS-Over sets. Left: Savings (axis 0% to 25%). Right: F1-Score (axis 0 to 0.14).]

Page 33: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Bayes Minimum Risk

• Decision model based on quantifying tradeoffs between various decisions using probabilities and the costs that accompany such decisions

• Risk of classification

$R(c_i = 0 \mid x_i) = C_{TN_i}(1 - \hat{p}_i) + C_{FN_i} \cdot \hat{p}_i$

$R(c_i = 1 \mid x_i) = C_{FP_i}(1 - \hat{p}_i) + C_{TP_i} \cdot \hat{p}_i$

Page 34: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

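Bayes Minimum Risk

• Using the different risks, the prediction is made based on the following condition:

$c_i = \begin{cases} 0 & \text{if } R(c_i = 0 \mid x_i) \le R(c_i = 1 \mid x_i) \\ 1 & \text{otherwise} \end{cases}$

• Example-dependent threshold

$t_{BMR_i} = \frac{C_{FP_i} - C_{TN_i}}{C_{FN_i} - C_{TN_i} - C_{TP_i} + C_{FP_i}}$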
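The decision rule and the example-dependent threshold can be sketched as follows. The function names `bmr_predict` and `bmr_threshold` and the cost values are hypothetical; `p` stands for the churn probability estimated by any underlying classifier.

```python
def bmr_threshold(c_tp, c_fp, c_fn, c_tn):
    """Probability above which predicting 'churner' has the lower risk.

    Assumes the denominator is positive, i.e. the combined cost of the two
    errors exceeds the combined cost of the two correct decisions.
    """
    return (c_fp - c_tn) / (c_fn - c_tn - c_tp + c_fp)


def bmr_predict(p, c_tp, c_fp, c_fn, c_tn):
    """Return 0 or 1, whichever prediction has the smaller expected cost."""
    risk_non_churner = c_tn * (1 - p) + c_fn * p  # R(c_i = 0 | x_i)
    risk_churner = c_fp * (1 - p) + c_tp * p      # R(c_i = 1 | x_i)
    return 0 if risk_non_churner <= risk_churner else 1


# Hypothetical costs for one customer (euros).
example_costs = dict(c_tp=560.0, c_fp=110.0, c_fn=1000.0, c_tn=0.0)
threshold = bmr_threshold(**example_costs)
decision = bmr_predict(0.5, **example_costs)
```

With these made-up costs the threshold is 0.2 rather than the usual 0.5, so a customer with a 30% churn probability would already be targeted because missing a churner is so much more expensive than a wasted offer.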

Page 35: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Bayes Minimum Risk

[Figure: bar chart of Savings (axis 0% to 35%) for Decision Trees, Logistic Regression and Random Forest, each shown without and with BMR, trained on the Training, Under-Sampling, SMOTE, CS-Rejection and CS-Over sets]

Page 36: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Bayes Minimum Risk

[Figure: bar chart of F1-Score (axis 0 to 0.14) for Decision Trees, Logistic Regression and Random Forest, each shown without and with BMR, trained on the Training, Under-Sampling, SMOTE, CS-Rejection and CS-Over sets]

Page 37: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Bayes Minimum Risk

• Bayes Minimum Risk increases the savings by using a cost-insensitive method and then introducing the costs

• Why not introduce the costs during the estimation of the methods?

Page 38: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

CS – Decision Trees

• Decision trees

Classification model that iteratively creates binary decision rules $(x^j, l^j_m)$ that maximize certain criteria

where $(x^j, l^j_m)$ refers to making a rule using feature $j$ on value $m$

Page 39: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

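CS – Decision Trees

• Decision trees – Construction

[Figure: node $S$ split by the rule $(x^j, l^j_m)$ into a left child $S^l$ and a right child $S^r$]

$S^l = \{X_i \mid X_i \in S \wedge x_i^j \le l^j_m\}$, $S^r = \{X_i \mid X_i \in S \wedge x_i^j > l^j_m\}$

• Then the impurity of each leaf is calculated using:

Misclassification: $I_m(\pi_1) = 1 - \max(\pi_1, 1 - \pi_1)$

Entropy: $I_e(\pi_1) = -\pi_1 \log \pi_1 - (1 - \pi_1) \log(1 - \pi_1)$

Gini: $I_g(\pi_1) = 2\pi_1(1 - \pi_1)$

$\pi_1$ is the percentage of positives.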
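The three impurity measures are simple functions of the fraction of positives in a leaf. In the sketch below the function names are hypothetical, and the base-2 logarithm for the entropy is an assumption, since the slide does not state a base.

```python
import math


def misclassification_impurity(pi1):
    """1 - max(pi1, 1 - pi1): error rate of predicting the majority class."""
    return 1 - max(pi1, 1 - pi1)


def entropy_impurity(pi1, base=2):
    """Binary entropy; a pure leaf (pi1 equal to 0 or 1) has zero entropy."""
    if pi1 <= 0 or pi1 >= 1:
        return 0.0
    return -pi1 * math.log(pi1, base) - (1 - pi1) * math.log(1 - pi1, base)


def gini_impurity(pi1):
    """2 * pi1 * (1 - pi1) for a two-class problem."""
    return 2 * pi1 * (1 - pi1)


# Impurity of a leaf where 5% of the customers are churners.
leaf = {
    "misclassification": misclassification_impurity(0.05),
    "entropy": entropy_impurity(0.05),
    "gini": gini_impurity(0.05),
}
```

All three reach their maximum at a 50/50 leaf and drop to zero for a pure leaf, and none of them takes the cost of an error into account.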

Page 40: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

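CS – Decision Trees

• Decision trees – Construction

[Figure: node $S$ split by the rule $(x^j, l^j_m)$ into a left child $S^l$ and a right child $S^r$]

$S^l = \{X_i \mid X_i \in S \wedge x_i^j \le l^j_m\}$, $S^r = \{X_i \mid X_i \in S \wedge x_i^j > l^j_m\}$

• Afterwards the gain of applying a given rule to the set $S$ is:

$Gain(x^j, l^j_m) = I(\pi_1) - \frac{|S^l|}{|S|} I(\pi_1^l) - \frac{|S^r|}{|S|} I(\pi_1^r)$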

Page 41: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

CS – Decision Trees

• Decision trees – Construction

• The rule that maximizes the gain is selected

$(best_x, best_l) = \arg\max_{(j,m)} Gain(x^j, l^j_m)$

• The process is repeated until a stopping criterion is met:

[Figure: tree grown by repeatedly splitting each node S into two child nodes]
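The split search described on the construction slides can be sketched as an exhaustive loop over features and candidate values, keeping the rule with the largest gain. The names `gain` and `best_split`, the use of Gini as the impurity, and the toy data are assumptions for illustration.

```python
def gini(labels):
    """Gini impurity 2 * pi1 * (1 - pi1) of a list of 0/1 labels."""
    if not labels:
        return 0.0
    pi1 = sum(labels) / len(labels)
    return 2 * pi1 * (1 - pi1)


def gain(parent, left, right, impurity=gini):
    """Impurity of the parent minus the size-weighted impurity of the children."""
    n = len(parent)
    return (
        impurity(parent)
        - len(left) / n * impurity(left)
        - len(right) / n * impurity(right)
    )


def best_split(X, y, impurity=gini):
    """Return (feature index j, value l, gain) of the best rule x^j <= l."""
    best = (None, None, 0.0)
    for j in range(len(X[0])):
        for value in sorted({row[j] for row in X}):
            left = [yi for row, yi in zip(X, y) if row[j] <= value]
            right = [yi for row, yi in zip(X, y) if row[j] > value]
            if not left or not right:
                continue  # a rule that sends everything one way is not a split
            g = gain(y, left, right, impurity)
            if g > best[2]:
                best = (j, value, g)
    return best


# Toy data: feature 0 separates the classes, feature 1 is constant.
X = [(1, 5), (2, 5), (3, 5), (4, 5)]
y = [0, 0, 1, 1]
rule = best_split(X, y)
```

Passing a different `impurity` function changes the selection criterion without touching the search loop.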

Page 42: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

CS – Decision Trees

• Decision trees - Pruning

• Calculation of the Tree error and pruned Tree error

[Figure: the full tree and successively pruned trees, annotated with the tree error $\epsilon(Tree)$ and the pruning criterion below]

$\frac{\epsilon(EB(Tree, branch)) - \epsilon(Tree)}{|Tree| - |EB(Tree, branch)|}$

• After calculating the pruning criterion for all possible pruned trees, the one with the maximum improvement is selected and the Tree is pruned.

• The process is then repeated until there is no further improvement.

Page 43: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

CS – Decision Trees

• Maximizing the accuracy is different from minimizing the cost

• To address this, some studies have proposed methods that aim to introduce cost-sensitivity into the algorithms

• However, research has focused on class-dependent methods. Instead, we used:

Example-dependent cost based impurity measure

Example-dependent cost based pruning criteria

Page 44: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

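CS – Decision Trees

• Cost based impurity measure

[Figure: node $S$ split by the rule $(x^j, l^j_m)$ into a left child $S^l$ and a right child $S^r$]

$S^l = \{X_i \mid X_i \in S \wedge x_i^j \le l^j_m\}$, $S^r = \{X_i \mid X_i \in S \wedge x_i^j > l^j_m\}$

• The impurity of each leaf is calculated using:

$I_c(S) = \min\{C_0, C_1\}$

$f(S) = \begin{cases} 0 & \text{if } C_0 \le C_1 \\ 1 & \text{otherwise} \end{cases}$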
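A minimal sketch of the cost-based impurity follows. The slide does not define `C_0` and `C_1` for a leaf, so this sketch assumes they are the total costs of labelling every example in the leaf as a non-churner and as a churner respectively, by analogy with the `C_0` used in the savings definition. The function name `cost_impurity` and the cost values are hypothetical.

```python
def cost_impurity(y, costs):
    """Return (impurity, leaf label) for the examples that fall in one leaf.

    y     : list of true labels (1 = churner, 0 = non-churner)
    costs : list of (c_tp, c_fp, c_fn, c_tn) tuples, one per example
    """
    # Assumed meaning of C_0: cost if the whole leaf is predicted as 0.
    cost_all_0 = sum(
        c_fn if yi == 1 else c_tn for yi, (_, _, c_fn, c_tn) in zip(y, costs)
    )
    # Assumed meaning of C_1: cost if the whole leaf is predicted as 1.
    cost_all_1 = sum(
        c_tp if yi == 1 else c_fp for yi, (c_tp, c_fp, _, _) in zip(y, costs)
    )
    label = 0 if cost_all_0 <= cost_all_1 else 1
    return min(cost_all_0, cost_all_1), label


# One churner among three customers, all with the same hypothetical costs.
leaf_costs = [(560.0, 110.0, 1000.0, 0.0)] * 3
impurity, label = cost_impurity([1, 0, 0], leaf_costs)
```

In this toy leaf the majority class is non-churner, yet the cheaper label is churner (780 versus 1000), which shows how a cost-based criterion can disagree with an accuracy-based one.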

Page 45: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

CS – Decision Trees

• Cost sensitive pruning

$PC_c = \frac{C(EB(Tree, branch)) - C(Tree)}{|Tree| - |EB(Tree, branch)|}$

• A new pruning criterion that evaluates the improvement in cost from eliminating a particular branch

Page 46: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

CS – Decision Trees

[Figure: bar chart of Savings (axis 0% to 50%) for Decision Trees with Error Pruning, Decision Trees with Cost Pruning, and Cost-Sensitive Decision Trees, each trained on the Training, Under-Sampling, SMOTE, CS-Rejection and CS-Over sets]

Page 47: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

CS – Decision Trees

[Figure: bar chart of F1-Score (axis 0 to 0.3) for the decision tree variants, each trained on the Training, Under-Sampling, SMOTE, CS-Rejection and CS-Over sets]

Page 48: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Comparison of Models

[Figure: bar chart of Savings and F1-Score (axis 0% to 50%) for five models: Random Forest (Train), Logistic Regression (CS-Rejection), Logistic Regression with BMR (Train), Decision Tree with Cost Pruning (CS-Rejection), and CS-Decision Tree (Train)]

Page 49: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Conclusions

• Selecting models based on traditional statistics does not give the best results when measured by savings

• Incorporating the costs into the modeling helps to achieve higher savings

Page 50: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Other Applications

• Fraud Detection

Correa Bahnsen et al. (2013). Cost Sensitive Credit Card Fraud Detection using Bayes Minimum Risk.

Correa Bahnsen et al. (2014). Improving Credit Card Fraud Detection with Calibrated Probabilities.

• Credit Scoring

Correa Bahnsen et al. (2014). Example-Dependent Cost-Sensitive Credit Scoring using Bayes Minimum Risk.

• Direct Marketing

Correa Bahnsen et al. (2014). Example-Dependent Cost-Sensitive Decision Trees.

Page 51: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Contact Information

Alejandro Correa Bahnsen

University of Luxembourg

Luxembourg

[email protected]

http://www.linkedin.com/in/albahnsen

http://www.slideshare.net/albahnsen

Andres Gonzalez Montoya

DIRECTV

Colombia

[email protected]

Page 52: Maximizing a churn campaign’s profitability with cost sensitive predictive analytics

Thank you!

Alejandro Correa Bahnsen, Luxembourg University Andres Felipe Gonzalez Montoya, DIRECTV