De-Mystifying Predictive Analytics

Keynote address at 2012 ReTechCon.com (annual conference of the Retailers Association of India), Mumbai.

Galit Shmuéli

SRITNE Chaired Prof. of Data Analytics

Will the customer pay?

Today’s Talk

1. How predictive analytics differ from Reporting and other BI tools

2. The predictive analytics process

3. Examples of problems that can be tackled

4. Logic behind predictive analytics algorithms

5. Predictive Analytics for retail in India

Past Present Future

Case Studies

Overall Behaviour

“Personalized” Behaviour

The Predictive Analytics Process

1. Problem Identification

2. Measurement: determine outcome and predictors

3. Data: draw sample, split into training/holdout

4. Models: data mining algorithms & evaluation

5. Deployment, re-evaluation, more data

Example 1: Personalized Offer

1. Problem Identification. Who to target? Which coupon? What medium?

2. Measurement. Outcome: redemption. Predictors: customer, shop & product info.

3. Data. From a similar past campaign (redeemers and non-redeemers).

4. Models. Expected gain per offer sent?

5. Deployment (or not!), re-evaluation, more data.
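The "expected gain per offer sent" criterion can be sketched as below. This is a minimal illustration only: the redemption probabilities, the gain per redeemed coupon and the cost per offer are made-up assumptions, not figures from the talk.

```python
def expected_gain(p_redeem, gain_if_redeemed, cost_per_offer):
    """Expected gain from sending one offer to one customer."""
    return p_redeem * gain_if_redeemed - cost_per_offer

# Hypothetical model output: predicted redemption probability per customer.
predicted = {"cust_a": 0.30, "cust_b": 0.05, "cust_c": 0.12}

GAIN_IF_REDEEMED = 200.0  # assumed margin (rupees) when a coupon is redeemed
COST_PER_OFFER = 20.0     # assumed cost (rupees) of sending one offer

gains = {c: expected_gain(p, GAIN_IF_REDEEMED, COST_PER_OFFER)
         for c, p in predicted.items()}

# Send the offer only where the expected gain is positive.
targets = sorted(c for c, g in gains.items() if g > 0)
print(gains)
print(targets)
```

Under these assumed numbers, a customer with a 5% redemption probability is not worth an offer, which is the "deployment (or not!)" decision in miniature.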

Example 2: Employee Training

1. Problem Identification. Which employees to train?

2. Measurement. Outcome: performance. Predictors: employee & training info.

3. Data. From past training efforts (successes and failures).

4. Models. Expected gain per employee?

5. Deployment (or not!), re-evaluation, more data.

Example 3: Customer Churn (membership renewal)

1. Problem Identification. Which members are most likely not to renew?

2. Measurement. Outcome: renewal. Predictors: customer & membership info.

3. Data. Past renewal campaigns (successes and failures).

4. Models. Expected gain per customer?

5. Deployment (or not!), re-evaluation, more data.

Example 4: Product-level demand forecasting

1. Problem Identification. Weekly forecasts per clothing item.

2. Measurement. Outcome: month-ahead weekly forecasts of #units purchased per item. Predictors: past demand for this & related items, special events, economic outlook, social media.

3. Data. Historic info.

4. Models. Expected gain?

5. Deployment (or not!), re-evaluation, more data.
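To make the "month-ahead weekly forecasts" outcome concrete, here is a minimal sketch of a naive moving-average baseline evaluated on the last four weeks. It is not the forecasting method from the talk, it uses only past demand for one item, and the weekly unit counts are invented for illustration.

```python
def moving_average_forecast(history, window=4, horizon=4):
    """Forecast each of the next `horizon` weeks as the mean of the last `window` weeks."""
    level = sum(history[-window:]) / window
    return [level] * horizon

def mean_absolute_error(actual, forecast):
    """Average absolute gap between actual and forecast units."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

# Hypothetical weekly units sold for one clothing item.
weekly_units = [120, 135, 128, 140, 150, 145, 160, 155, 170, 165, 180, 175]

# Keep the final four weeks (one month) aside as the holdout period.
train, holdout = weekly_units[:-4], weekly_units[-4:]

forecast = moving_average_forecast(train)
mae = mean_absolute_error(holdout, forecast)
print(forecast, mae)
```

A baseline like this gives richer models (with related items, events or social media as predictors) something to beat on the holdout period.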

Example 5: COD Prediction

1. Problem Identification. Predict payment probability.

2. Measurement. Outcome: pay/not. Predictors: customer, product, transaction info.

3. Data. Past deliveries (payments and non-payments).

4. Models. Expected gain per transaction?

5. Deployment (or not!), re-evaluation, more data.

Predictive Analytics: It’s all about correlation, not causation

Algorithms search for correlation between the outcome and predictors.

Different algorithms search for different types of structure.

Every time they turn on the seatbelt sign it gets bumpy!

Example: Direct Marketing

Maharaja Bank wants to run a campaign for current customers to purchase a loan.

They want to identify the customers most likely to accept the offer.

They use data from a previous campaign on 5,000 customers, where 480 (9.6%) accepted.

Data sample

Data Partitioning

Training: 4,000 customers

Holdout: 1,000 customers
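A random 4,000 / 1,000 partition of this kind can be sketched as follows. The customer IDs and the fixed seed are assumptions for illustration; the talk does not say how the split was drawn.

```python
import random

def partition(records, n_train, seed=0):
    """Shuffle a copy of the records and split it into training and holdout sets."""
    shuffled = records[:]
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical IDs standing in for the 5,000 customers of the previous campaign.
customer_ids = list(range(1, 5001))

training, holdout = partition(customer_ids, n_train=4000)
print(len(training), len(holdout))
```

Models are fitted on the training set only; the holdout set is kept untouched until evaluation.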

Classification & Regression Trees

[Figure: classification tree with Yes/No branches and leaves]
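The core step of a classification tree is choosing a split on one predictor that separates acceptors from non-acceptors as cleanly as possible. The sketch below finds a single best split using Gini impurity; the income values and labels are invented, and a full tree would repeat this search recursively over all predictors.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)

def best_split(values, labels):
    """Return (threshold, weighted impurity) of the best 'value <= threshold' split."""
    best = (None, float("inf"))
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if impurity < best[1]:
            best = (threshold, impurity)
    return best

# Hypothetical training data: income and whether the customer accepted (1) or not (0).
income = [20, 35, 49, 60, 85, 110, 130, 150]
accepted = [0, 0, 0, 0, 0, 1, 1, 1]

threshold, impurity = best_split(income, accepted)
print(threshold, impurity)
```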

Regression Models

Probability (Accept Offer) = function of b0 + b1 Age + b2 Experience + b3 Income + b4 CCAvg +…
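The slide says only "function of"; a common choice for a yes/no outcome is the logistic function, which is what the sketch below assumes. The coefficient values here are rounded, illustrative numbers rather than the fitted model, and the customer values are taken from the Customer1 example used later in the talk.

```python
import math

def accept_probability(customer, coefficients, intercept):
    """Logistic model: probability = 1 / (1 + exp(-(b0 + b1*x1 + b2*x2 + ...)))."""
    score = intercept + sum(coefficients[name] * customer[name] for name in coefficients)
    return 1.0 / (1.0 + math.exp(-score))

# Illustrative (assumed) coefficients b0..b4, not the fitted values.
INTERCEPT = -6.0
COEFFICIENTS = {"Age": -0.02, "Experience": 0.03, "Income": 0.06, "CCAvg": 0.13}

customer = {"Age": 25, "Experience": 1, "Income": 49, "CCAvg": 1.6}
p = accept_probability(customer, COEFFICIENTS, INTERCEPT)

# With a positive Income coefficient, a higher income raises the probability.
richer = dict(customer, Income=150)
p_richer = accept_probability(richer, COEFFICIENTS, INTERCEPT)
print(round(p, 4), round(p_richer, 4))
```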

The Regression Model

Input variable        Coefficient

Constant term         -6.16805744

Age                   -0.0227915

Experience            0.03030424

Income                0.06047214

ZIP Code              -0.00006691

Family                0.61913204

CCAvg                 0.13191609

Mortgage              0.00016262

Securities Account    -0.51986736

CD Account            4.10482931

Online                -1.11415482

CreditCard            -1.02319455

EducGrad              3.93598175

EducProf              4.01372194

K-Nearest Neighbours

Customer1 = [age=25, exper=1, income=49, family=4, CCAvg=1.6, education=UG,…]

Customer2 = [age=49, exper=19, income=34, family=3, CCAvg=1.5, education=UG,…]
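K-nearest neighbours predicts a customer's action from the actions of the most similar past customers. The sketch below measures similarity with Euclidean distance over the five numeric fields shown for Customer1 and Customer2 (education is left out because it is categorical). The labelled training customers are invented, and in practice the predictors would usually be rescaled first so that income does not dominate the distance.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Straight-line distance between two customers' numeric profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(query, training, k=3):
    """Majority vote among the k training customers closest to `query`."""
    nearest = sorted(training, key=lambda row: euclidean(query, row[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# [age, exper, income, family, CCAvg] as given on the slide.
customer1 = [25, 1, 49, 4, 1.6]
customer2 = [49, 19, 34, 3, 1.5]
distance = euclidean(customer1, customer2)

# Hypothetical labelled customers from a past campaign.
training = [
    ([30, 5, 120, 2, 4.0], "accept"),
    ([45, 20, 150, 1, 5.5], "accept"),
    ([27, 2, 45, 4, 1.2], "decline"),
    ([50, 25, 30, 3, 1.0], "decline"),
    ([38, 12, 60, 2, 2.0], "decline"),
]
prediction = knn_predict(customer1, training, k=3)
print(round(distance, 2), prediction)
```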

Performance Evaluation: Holdout Data

Predict each customer’s action

Method                 Overall error   Missed acceptors   Targeted non-acceptors

Baseline: no offers    9.3%            9.3%               0.0%

Tree                   2.5%            12.9%              1.4%

Regression             4.3%            35.5%              1.1%

K-NN                   4.3%            41.9%              0.4%
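The three error measures can be computed from holdout predictions as sketched below. Two things are assumptions here: that "missed acceptors" and "targeted non-acceptors" are shares of the actual acceptors and actual non-acceptors respectively, and the underlying counts (93 acceptors, 907 non-acceptors, 12 missed, 13 wrongly targeted), which the talk does not give and which were chosen only because they reproduce the Tree row.

```python
def error_rates(actual, predicted):
    """Overall error plus the two class-specific error rates (1 = accept, 0 = not)."""
    missed = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 0)
    wrongly_targeted = sum(1 for a, p in zip(actual, predicted) if a == 0 and p == 1)
    n_acceptors = sum(actual)
    n_non_acceptors = len(actual) - n_acceptors
    return {
        "overall_error": (missed + wrongly_targeted) / len(actual),
        "missed_acceptors": missed / n_acceptors,
        "targeted_non_acceptors": wrongly_targeted / n_non_acceptors,
    }

# Assumed holdout of 1,000 customers: 93 acceptors followed by 907 non-acceptors.
actual = [1] * 93 + [0] * 907
# Assumed predictions: 81 acceptors found, 12 missed, 13 non-acceptors targeted.
predicted = [1] * 81 + [0] * 12 + [1] * 13 + [0] * 894

rates = error_rates(actual, predicted)
print({name: round(100 * value, 1) for name, value in rates.items()})
```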

A different goal: identify the 20% of customers most likely to accept

Holdout: 1,000 customers
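For this ranking goal, customers are sorted by predicted probability and only the top fifth is targeted. The sketch below uses ten invented customers and scores; the same logic would apply to the 1,000 holdout customers.

```python
def top_fraction(scores, fraction=0.2):
    """Return the ids of the highest-scoring `fraction` of customers."""
    n_select = max(1, round(len(scores) * fraction))
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:n_select]

# Hypothetical predicted acceptance probabilities for holdout customers.
scores = {
    "c01": 0.91, "c02": 0.05, "c03": 0.40, "c04": 0.72, "c05": 0.10,
    "c06": 0.33, "c07": 0.02, "c08": 0.15, "c09": 0.08, "c10": 0.55,
}
# Hypothetical customers who actually accepted.
accepted = {"c01", "c10"}

selected = top_fraction(scores, fraction=0.2)
# Share of all actual acceptors that fall inside the targeted 20%.
captured = len(accepted & set(selected)) / len(accepted)
print(selected, captured)
```

Comparing the captured share with the 20% that random targeting would reach on average shows how much the ranking helps.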

More predictive analytics methods: based on distance

Customer1 = [age=25, exper=1, income=49, family=4, CCAvg=1.6, education=UG,…]

Customer2 = [age=49, exper=19, income=34, family=3, CCAvg=1.5, education=UG,…]

Where do the buzzwords fit in?

Unstructured data

Mobile Data

Social Media

Real-time data

Cloud Computing

Big Data

Today’s Talk

1. How predictive analytics differ from Reporting and other BI tools

2. The predictive analytics process

3. Examples of problems that can be tackled

4. Logic behind predictive analytics algorithms

5. Predictive Analytics for retail in India

Step 1: Identify “classic” applications used by other companies

Step 2: Get Creative

In India:

Cash On Delivery

Counter service

Huge growth in ATMs

Multiple languages

Regional customer preferences

Informative names

Bargaining

What you’ll need

Top management commitment

Analytics team that:

has close ties to all departments (IT, Marketing,…)

understands the business and its goals

is creative and fearless

is allowed to experiment (and fail)

Data in a reachable place

Software

Last Thought: Mindful Predictive Analytics

“VIP syndrome”

Predictive analytics for scaling up white-glove treatment to the public

Predictive analytics for reducing the burden on consumers, employees etc. (fewer offers & less overload)

Asia Analytics Lab @ ISB facebook.com/groups/asiaanalytics
