Data mining in advertising: air loyalty programs and coupon purchase prediction Dmitry Efimov December 15, 2015

Data mining in advertising: air loyalty programs and ...efimov-ml.com/pdfs/MCM353_2015.pdf · Coupon purchase prediction Data preprocessing Features Models for prediction. About me


Data mining in advertising: air loyalty programs and coupon purchase prediction

Dmitry Efimov

December 15, 2015


Outline

Predicting value customers in Air Loyalty programs

Coupon purchase prediction

Data preprocessing

Features

Models for prediction


About me

• PhD in Mathematics from Moscow State University
• The main specialization: machine learning and data mining
• TOP-10 in the global ranking on kaggle.com


Predicting value customers in Air Loyalty programs

• Data are provided by Etihad Air Company
• The problem is to predict passengers that change tier status during the next few weeks
• The suggested methodology is patented by Dmitry Efimov and Jose Berengueres


Example of an application that enhances the value of the miles program:


Coupon purchase prediction


Provided data

• Train: user-coupon purchases for 52 weeks
• Train: user-coupon visits for 52 weeks
• User list: gender, age, locations
  • ≈ 20 000 users
• Coupon list: price, discount, genre name, locations
  • train: ≈ 18 000 coupons
  • test: ≈ 400 coupons

Predict: user-coupon purchases for the 53rd week


Cross validation

Problem: the number of possible user-coupon pairs in train is ≈ 360 000 000 (≈ 20 000 users × ≈ 18 000 coupons)

• To decrease the size of train, for each week:
  • take coupons with at least one purchase
  • take users with at least one purchase
  • this gives ≈ 600 000 pairs for each week, or ≈ 30 000 000 pairs for the whole train
  • use the last few weeks to predict the test week (we used the last 5 weeks)
• Validation set: pairs for the last week
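The reduction described above can be sketched in plain Python. This is only an illustration, not the code used in the competition: the function names, the `(week, user_id, coupon_id)` tuple layout, and the choice to hold out the most recent of the retained weeks for validation are all assumptions made for the example.

```python
from collections import defaultdict


def weekly_candidate_pairs(purchases):
    """Build labelled user-coupon candidates week by week.

    purchases: iterable of (week, user_id, coupon_id) tuples (assumed layout).
    For each week, only users and coupons with at least one purchase in that
    week are crossed, instead of crossing all users with all coupons.
    Returns a list of (week, user_id, coupon_id, label) rows.
    """
    users_by_week = defaultdict(set)
    coupons_by_week = defaultdict(set)
    bought = set()
    for week, user, coupon in purchases:
        users_by_week[week].add(user)
        coupons_by_week[week].add(coupon)
        bought.add((week, user, coupon))

    rows = []
    for week in sorted(users_by_week):
        for user in sorted(users_by_week[week]):
            for coupon in sorted(coupons_by_week[week]):
                label = 1 if (week, user, coupon) in bought else 0
                rows.append((week, user, coupon, label))
    return rows


def split_train_validation(rows, n_weeks=5):
    """Keep the last n_weeks; hold out the most recent one for validation.

    Assumption: the validation week is the last of the retained weeks.
    """
    weeks = sorted({row[0] for row in rows})
    recent = weeks[-n_weeks:]
    train_weeks = set(recent[:-1])
    valid_week = recent[-1]
    train = [row for row in rows if row[0] in train_weeks]
    valid = [row for row in rows if row[0] == valid_week]
    return train, valid
```

With two users and two coupons active in a week, the week contributes four candidate rows, two of them positive; weeks with no purchases contribute nothing.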


Feature engineering

• Dummies: one-hot encoding
• Counts: number of samples for different feature values
• Counts unique: number of different values of one feature for fixed value of another feature
• Likelihoods
• Similarities
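The first three feature families can be illustrated with a small plain-Python sketch. The row format (a list of dicts) and the helper names `one_hot`, `count_feature`, and `count_unique_feature` are hypothetical, chosen only to make the definitions concrete.

```python
from collections import Counter, defaultdict


def one_hot(rows, key):
    """Dummies: one 0/1 column per distinct value of rows[i][key]."""
    categories = sorted({row[key] for row in rows})
    encoded = [[1 if row[key] == c else 0 for c in categories] for row in rows]
    return categories, encoded


def count_feature(rows, key):
    """Counts: for each row, how many rows share its value of `key`."""
    counts = Counter(row[key] for row in rows)
    return [counts[row[key]] for row in rows]


def count_unique_feature(rows, fixed_key, varying_key):
    """Counts unique: for each row, the number of distinct `varying_key`
    values seen among rows with the same `fixed_key` value."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[fixed_key]].add(row[varying_key])
    return [len(seen[row[fixed_key]]) for row in rows]
```

For example, `count_unique_feature(rows, "user", "genre")` would give, for every row, the number of different genres that row's user appears with.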


Likelihood features

• Type 1: using sliding window by weeks
  Example:
  • for each week calculate the rate of purchases by each GENRE NAME based on the previous 10 weeks
• Type 2: using multi class algorithms
  Example:
  • one purchase is one sample with target GENRE NAME
  • for the test week: predict what will be next GENRE NAME
  • ⇒ XGBOOST with 13 classes
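A minimal sketch of the Type 1 (sliding window) likelihood follows. The slides do not define the denominator of the "rate", so this example assumes it is the share of all purchases in the window that fall into each genre; the function name and the `(week, genre)` input layout are also assumptions.

```python
from collections import Counter


def sliding_genre_rates(purchases, window=10):
    """Per-week genre purchase rates from the preceding `window` weeks.

    purchases: list of (week, genre) tuples, one per purchase (assumed layout).
    The rates for week w use only weeks w - window .. w - 1, so the feature
    never looks at the week it is computed for.
    Returns {week: {genre: rate}}, including the week after the last one seen.
    """
    by_week = {}
    for week, genre in purchases:
        by_week.setdefault(week, Counter())[genre] += 1
    if not by_week:
        return {}

    rates = {}
    for week in range(min(by_week), max(by_week) + 2):
        window_counts = Counter()
        for past in range(week - window, week):
            window_counts.update(by_week.get(past, Counter()))
        total = sum(window_counts.values())
        if total:
            rates[week] = {g: c / total for g, c in window_counts.items()}
        else:
            rates[week] = {}
    return rates
```

Restricting the window to earlier weeks is what keeps the feature free of target leakage when the same procedure is applied to the test week.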


Similarities

• the idea: for each user find the similarity between test coupons and coupons purchased before
• use cosine distance with weights
• coordinates for each coupon: coupon features
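One way to read "cosine distance with weights" is a cosine similarity in which each coupon feature is scaled by a per-feature weight; the corresponding distance is one minus this value. The sketch below is written under that assumption. The slides do not say how similarities to several past purchases are combined, so taking the maximum in `user_coupon_similarity` is a hypothetical choice.

```python
import math


def weighted_cosine(a, b, weights):
    """Cosine similarity of feature vectors a and b with per-feature weights."""
    dot = sum(w * x * y for w, x, y in zip(weights, a, b))
    norm_a = math.sqrt(sum(w * x * x for w, x in zip(weights, a)))
    norm_b = math.sqrt(sum(w * y * y for w, y in zip(weights, b)))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def user_coupon_similarity(test_coupon, purchased_coupons, weights):
    """Similarity of one test coupon to a user's purchase history.

    Assumption: the best match among past purchases is used.
    """
    if not purchased_coupons:
        return 0.0
    return max(weighted_cosine(test_coupon, past, weights)
               for past in purchased_coupons)
```

Setting a weight to zero removes that feature from the comparison, which is how the weights let some coupon attributes count more than others.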


Final model and results

• XGBOOST with rank:map objective and map@10 evaluation metric

Place  Team                  Leaderboard score
1      Herra Huu             0.009973
2      Halla Yang            0.009848
3      threecourse           0.009484
...    ...                   ...
20     Dmitry and Leustagos  0.007642
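For reference, the leaderboard score is mean average precision at 10. The sketch below follows the commonly used definition of MAP@k (average precision truncated at k, normalised by min(number of relevant items, k)); it is not taken from the slides. The `XGB_RANKING_PARAMS` dict records only the two settings the slides name, and all other hyperparameters are unknown.

```python
# Only the objective and metric are stated on the slides.
XGB_RANKING_PARAMS = {"objective": "rank:map", "eval_metric": "map@10"}


def average_precision_at_k(actual, predicted, k=10):
    """Average precision of a ranked list, truncated at k.

    actual: coupons the user really bought; predicted: ranked recommendations.
    """
    actual_set = set(actual)
    if not actual_set:
        return 0.0
    hits = 0
    score = 0.0
    seen = set()
    for rank, item in enumerate(predicted[:k], start=1):
        if item in actual_set and item not in seen:
            hits += 1
            score += hits / rank  # precision at this rank
            seen.add(item)
    return score / min(len(actual_set), k)


def map_at_k(actual_lists, predicted_lists, k=10):
    """Mean of average_precision_at_k over users."""
    if not actual_lists:
        return 0.0
    total = sum(average_precision_at_k(a, p, k)
                for a, p in zip(actual_lists, predicted_lists))
    return total / len(actual_lists)
```

Because most users buy nothing in a given week and contribute zero, scores on this metric are small in absolute terms, which is consistent with leaderboard values near 0.01.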


Thank you! Questions???

Dmitry Efimov