How to data model churn: real-life examples

Churn prediction data modeling


Page 1: Churn prediction data modeling

How to data model churn: real-life examples

Page 2: Churn prediction data modeling

Quick quiz

•  How many of you are familiar with the churn problem?

•  With machine learning? Logistic regression, random forests, gradient boosted trees? (Not the subject here)

•  With SQL? (We may see some code later)

•  What database tech do you use? What about EMC Greenplum or Vertica?

Page 3: Churn prediction data modeling

Who I am

•  Senior Data Scientist at Dataiku (worked on churn prediction, fraud detection, bot detection, recommender systems, graph analytics, smart cities, … )

•  Occasional Kaggle competitor

•  Mostly code in Python and SQL

•  Twitter @prrgutierrez

Page 4: Churn prediction data modeling

Churn definition

•  Wikipedia: “Churn rate (sometimes called attrition rate), in its broadest sense, is a measure of the number of individuals or items moving out of a collective group over a specific period of time” = customers leaving

Page 5: Churn prediction data modeling

Two types of churn

•  Subscription models:
    •  Telco
    •  E-gaming (WoW)
    •  Ex: Coyote -> 1-year subscription
    -> you know when someone leaves

•  Non-subscription models:
    •  E-business (Amazon, Price Minister, Vente Privée)
    •  E-gaming (Candy Crush, free MMORPGs)
    -> you approximate when someone leaves
    •  Candy Crush: days / weeks
    •  MMORPG: 2 months (holidays)
    •  Price Minister: months

Page 6: Churn prediction data modeling

Two types of churn

•  Blurred separation:
    •  Ex: T-Mobile: 1-month subscription -> paying for each call
    •  Ex: WoW: 1-month to 6-month subscriptions
    •  Banking?

•  Focus: no subscription:
    •  Can be seen as a generalization where you have to approximate the target

•  Bonus: seller churn
    •  Marketplaces
    •  Clients that participate in the product's life
        •  Forums (Reddit)
        •  E-gaming (Korean competitions, guilds, etc.)

Page 7: Churn prediction data modeling

Dealing with churn

•  Motivations:
    •  Saturated market -> cost of acquiring a new client >>> cost of keeping a client
    •  Ex: http://www.bain.com/publications/articles/breaking-the-back-of-customer-churn.aspx
        •  Wireline company: 2% to 2.5% churn rate per month
        •  With 5 M customers -> 1.32 M churners per year
        •  When reducing churn from 2.5% to 2%, lowest estimate: $240 M in 18 months
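The order of magnitude of the churn figure above can be checked with a few lines of arithmetic. This is a minimal sketch, not the method used in the Bain article: the function name `annual_churners` is hypothetical, and it assumes a constant monthly rate and no newly acquired customers during the year.

```python
def annual_churners(customers: int, monthly_rate: float, compound: bool = True) -> float:
    """Estimate how many customers churn over 12 months (assumed model)."""
    if compound:
        # Each month the rate applies only to the customers still remaining.
        return customers * (1 - (1 - monthly_rate) ** 12)
    # Linear approximation: the rate applies to the initial base every month.
    return customers * monthly_rate * 12


base = 5_000_000
low = annual_churners(base, 0.02)
high = annual_churners(base, 0.025)
print(f"{low:,.0f} to {high:,.0f} churners per year")
```

Under the compounding assumption, the 2% to 2.5% monthly range gives roughly 1.08 M to 1.31 M churners per year, which is consistent with the 1.32 M quoted on the slide. How that exact figure was derived is not stated in the source.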

Page 8: Churn prediction data modeling

Dealing with churn

•  Predict churn:
    •  One model for performance <- our focus, short term, more ML
    •  One model for understanding <- long term, more analytics

•  Act on it (short term):
    •  Special offer (telco call, free in-game money, discount coupon, …)
    •  Does it work? Feedback loop needed!
    •  Model the probability of leaving given the offer. A/B tests. Multi-armed bandits?
    •  Significant LTV (lifetime value) for activation?

•  Act on it (long term):
    •  Is there a problem in my purchasing funnel?
    •  Is the game too hard at some point?
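To make the "Does it work?" question concrete, one common way to read an A/B test on a retention offer is a two-proportion z-test on churn rates. This is a sketch under assumptions: the slides do not say which test was used, the function name `two_proportion_z_test` is hypothetical, and the counts in the demo are made-up illustration values, not real results.

```python
import math


def two_proportion_z_test(churned_a: int, n_a: int, churned_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference between two churn rates."""
    p_a = churned_a / n_a
    p_b = churned_b / n_b
    # Pooled churn rate under the null hypothesis "the offer changes nothing".
    pooled = (churned_a + churned_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: group A is the control, group B received the offer.
z, p = two_proportion_z_test(churned_a=250, n_a=1000, churned_b=200, n_b=1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value only says the two churn rates differ. Whether the offer pays for itself still depends on its cost and on the lifetime value of the retained customers.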

Page 9: Churn prediction data modeling

Dealing with churn

•  Candy Crush rumor:
    •  Change the probability distribution of candies / bombs
    •  Change the difficulty of the game
    •  Losing a lot makes the game easier

Page 10: Churn prediction data modeling

Modelling churn

•  Machine learning model (classification) -> target:
    •  Known in subscription models
    •  Unknown in general

•  Step 1: Maintain customer status
    •  Do you care only about your best customers?
    •  Either way, the churn action won't be the same
    •  Has a client churned?
        -> target = churner = hasn't bought / visited since time X
        -> best = has bought / visited more than y since time Y
    •  Can be refined (“new customer”, several classes of best or inactive, reactivated, …)
    •  Storage: maintain only the difference!
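Step 1 can be sketched as a small status function plus a diff between two snapshots. Everything named here is an assumption for illustration: the function names, the status labels, and the thresholds (`churn_days` plays the role of X, `best_count` of y, `best_days` of Y) are hypothetical, and "more than y" is read as a number of events.

```python
from datetime import date


def customer_status(event_dates, today, churn_days=120, best_count=5, best_days=365):
    """Classify a customer from the dates of their purchases or visits."""
    last_seen = max(event_dates)
    if (today - last_seen).days > churn_days:
        return "churner"  # no activity since time X
    recent = [d for d in event_dates if (today - d).days <= best_days]
    if len(recent) > best_count:
        return "best"  # more than y events since time Y
    return "active"


def status_changes(previous, current):
    """Keep only the customers whose status changed since the last snapshot."""
    return {cid: s for cid, s in current.items() if previous.get(cid) != s}


today = date(2015, 1, 1)
events = {
    "a": [date(2014, 3, 1)],
    "b": [date(2014, m, 15) for m in range(7, 13)],
    "c": [date(2014, 12, 1)],
}
current = {cid: customer_status(dates, today) for cid, dates in events.items()}
print(status_changes({"a": "active", "b": "best"}, current))
```

Storing only the output of `status_changes` for each run, rather than a full snapshot, is one way to read "maintain only the difference".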

Page 11: Churn prediction data modeling

Modelling churn

•  Machine learning model -> features:
    •  Explanatory factors to use as input for the model

•  Step 2: Maintain customer features
    •  Social (gender, age, etc.)
    •  Behavioral!
        •  Utilization / buying rate
        •  Trend in utilization / buying rate
    •  Ad hoc features:
        •  WoW / social game churn: take into account churn in the friend network
        •  Telco: calls to call centers

•  Beware of time dependence!
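The behavioral features and the warning about time dependence can be illustrated together: every feature is computed relative to a cutoff date, and events on or after that date are ignored so that no future information reaches the model. This is a sketch only; the function name `behavioral_features`, the feature names, and the 30-day window are hypothetical choices.

```python
from datetime import date, timedelta


def behavioral_features(event_dates, cutoff, window_days=30):
    """Buying rate and its trend, using only events strictly before the cutoff."""
    recent_start = cutoff - timedelta(days=window_days)
    previous_start = cutoff - timedelta(days=2 * window_days)
    # Events in the window just before the cutoff.
    recent = sum(1 for d in event_dates if recent_start <= d < cutoff)
    # Events in the window before that one.
    previous = sum(1 for d in event_dates if previous_start <= d < recent_start)
    return {
        "rate_recent": recent / window_days,
        "rate_previous": previous / window_days,
        "trend": (recent - previous) / window_days,
    }


purchases = [date(2014, 11, 10), date(2014, 12, 20), date(2014, 12, 25), date(2015, 1, 15)]
print(behavioral_features(purchases, cutoff=date(2015, 1, 1)))
```

The purchase dated after the cutoff does not influence the result, which is the property that matters when the same code builds both training and prediction features.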

Page 12: Churn prediction data modeling

Data Model

Page 13: Churn prediction data modeling

Computation Dependency diagram

Page 14: Churn prediction data modeling

Ex: Train and predict scheme

Time axis: T – 4 months, T (present time)

•  Data is used for feature generation.

•  Data is used for target creation: activity during the last 4 months.

•  Train the model using features and target.

•  Use the model to predict future churn.
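The train and predict scheme can be sketched as a function that turns raw event dates into training rows. This is one reading of the diagram, stated as an assumption: features come only from events before T – 4 months, and a customer is labeled as churned when they show no activity between T – 4 months and T. The function name `build_training_rows`, the two example features, and the 120-day approximation of 4 months are hypothetical.

```python
from datetime import date, timedelta


def build_training_rows(events_by_customer, present, horizon_days=120):
    """Build one row per customer: features before the split, target after it."""
    split = present - timedelta(days=horizon_days)  # roughly T - 4 months
    rows = []
    for customer_id, dates in events_by_customer.items():
        past = [d for d in dates if d < split]
        if not past:
            continue  # not yet a customer at the split date, so nothing to learn from
        active_after = any(split <= d < present for d in dates)
        rows.append({
            "customer_id": customer_id,
            "n_events": len(past),
            "days_since_last": (split - max(past)).days,
            "churned": not active_after,  # target: no activity in the last 4 months
        })
    return rows


events = {
    "a": [date(2014, 11, 1), date(2014, 12, 15), date(2015, 2, 10)],
    "b": [date(2014, 10, 1)],
    "c": [date(2015, 3, 1)],
}
for row in build_training_rows(events, present=date(2015, 5, 1)):
    print(row)
```

At prediction time the same feature code would be run with the split moved to T itself, so that the model scores customers on their most recent behavior.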

Page 15: Churn prediction data modeling

Ex: Train, evaluation and predict scheme

Time axis: T – 8 months, T – 4 months, T (present time)

•  Training:
    •  Data is used for feature generation.
    •  Data is used for target creation: activity during the last 4 months.

•  Validation set:
    •  Data is used for feature generation.
    •  Data is used for target creation: activity during the last 4 months.
    •  Evaluate on the target of the validation set.

•  Use the model to predict future churn.
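The three stages of this scheme differ only in where the time windows sit. The sketch below spells out one plausible reading of the diagram, which is an assumption rather than something the slides state explicitly: the training target covers T – 8 to T – 4 months, the validation target covers T – 4 months to T, and in each stage features use only data before the start of the target window. The function name `scheme_windows` and the 120-day approximation of 4 months are hypothetical.

```python
from datetime import date, timedelta


def scheme_windows(present, horizon_days=120):
    """Return the feature cutoff and target window for each stage."""
    t4 = present - timedelta(days=horizon_days)      # roughly T - 4 months
    t8 = present - timedelta(days=2 * horizon_days)  # roughly T - 8 months
    return {
        # Train: features before T-8, target = activity between T-8 and T-4.
        "train": {"features_before": t8, "target": (t8, t4)},
        # Validation: features before T-4, target = activity between T-4 and T.
        "validation": {"features_before": t4, "target": (t4, present)},
        # Predict: features up to now, target not yet observed.
        "predict": {"features_before": present, "target": None},
    }


for stage, window in scheme_windows(date(2015, 5, 1)).items():
    print(stage, window)
```

Because the validation target starts where the training target ends, the evaluation mimics the real situation of predicting a period the model has never seen.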

Page 16: Churn prediction data modeling

Thank you for your attention!