IBM Research © 2006 IBM Corporation Customer Wallet and Opportunity Estimation: Analytical Approaches and Applications Saharon Rosset IBM T. J. Watson Research Center Collaborators: Claudia Perlich, Rick Lawrence, Srujana Merugu, et al.



Project evolution and modeler’s roles

– Business problem definition: targeting, sales force management

– Modeling problem definition: wallet / opportunity estimation

– Statistical problem definition: quantile estimation, latent variable estimation

– Modeling methodology design: quantile estimation, graphical model

– Model generation & validation: programming, simulation, IBM Wallets

– Implementation & application development: OnTarget, MAP

Legend: minor role, major role, leading contributor


Outline

Introduction

– Business motivation and different wallet definitions

Modeling approaches for conditional quantile estimation

– Local and global models

– Empirical evaluation

A graphical model approach to wallet estimation

– Generic algorithm for class of latent variable modeling problems

MAP (Market Alignment Program)

– Description of application and goals

– The interview process and the feedback loop

– Evaluation of Wallet models performance in MAP


What is Wallet (AKA Opportunity)?

Total amount of money a company can spend on a certain category of products.

[Figure: nested regions, IBM Sales inside IT Wallet inside Company Revenue]

IBM sales ≤ IT wallet ≤ Company revenue


Why Are We Interested in Wallet?

Customer targeting

– Focus on acquiring customers with high wallet

– Evaluate customers’ growth potential by combining wallet estimates and sales history

– For existing customers, focus on high wallet, low share-of-wallet customers

Sales force management

– Make resource assignment decisions

• Concentrate resources on untapped

– Evaluate success of sales personnel and sales channel by share-of-wallet they attain

[Side labels on slide: OnTarget, MAP]


Wallet Modeling Problem

Given:

– customer firmographics x (from D&B): industry, number of employees, company type, etc.

– customer revenue r

– IBM relationship variables z: historical sales by product

– IBM sales s

Goal: model customer wallet w, then use it to “predict” present/future wallets

No direct training data on w or information about its distribution!


Historical Approaches within IBM

Top down: this is the approach used by IBM Market Intelligence in North America (called ITEM)

– Use econometric models to assign total “opportunity” to segment (e.g., industry × geography)

– Assign to companies in segment proportional to their size (e.g., D&B employee counts)

Bottom up: learn a model for individual companies

– Get “true” wallet values through surveys or appropriate data repositories (exist e.g. for credit cards)

Many issues with both approaches (won’t go into detail)

– We would like a predictive approach from raw data


Relevant Work in the Literature

While wallet (or share of wallet) is widely recognized as important, not much work on estimating it:

Du, Kamakura and Mela (2006) developed “list augmentation” approach, using survey data to model spending with competitors

Epsilon Data Management, in a 2001 white paper, proposed a survey-based methodology

Zadrozny, Costa and Kamakura (2005) compared bottom-up and top-down approaches on IBM data. Evaluation is based on a survey.


Traditional Approaches to Model Evaluation

Evaluate models based on surveys

– Cost and reliability issues

Evaluate models based on high-level performance indicators:

– Do the wallet numbers sum up to numbers that “make sense” at segment level (e.g., compared to macro-economic models)?

– Does the distribution of differences between predicted Wallet and actual IBM Sales and/or Company Revenue make sense? In particular, are the % we expect bigger/smaller?

– Problem: no observation-level evaluation


Proposed Hierarchical IT Wallet Definitions

TOTAL: Total customer available IT budget

– Probably not quantity we want (IBM cannot sell it all)

SERVED: Total customer spending on IT products covered by IBM

– Share of wallet is portion of this number spent with IBM?

REALISTIC: IBM sales to “best similar customers”

– This can be concretely defined as a high percentile of: P(IBM revenue | customer attributes)

– Fits typical definition of opportunity?

REALISTIC ≤ SERVED ≤ TOTAL

[Figure: nested regions, REALISTIC inside SERVED inside TOTAL]


An Approach to Estimating SERVED Wallets

Wallet is unobserved, all other variables are observed

Two families of variables (firmographics and IBM relationship) are conditionally independent given wallet

We develop inference procedures and demonstrate them

Theoretically attractive, practically questionable

(We will come back to this later)

[Figure: graphical model with nodes Company firmographics, SERVED Wallet, Historical relationship with IBM, and IT spend with IBM]


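REALISTIC Wallet: Percentile of Conditional

Distribution of IBM sales to the customer given customer attributes: $s \mid r,x,z \sim f_{\theta,r,x,z}$

E.g., the standard linear regression assumption:

$$s \mid r,x,z \sim N(\alpha r + \beta^t x + \gamma^t z,\ \sigma^2)$$

What we are looking for is the pth percentile of this distribution

[Figure: conditional density of s, marking E(s|r,x,z) and the REALISTIC wallet at a high percentile]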


Estimating Conditional Distributions and Quantiles

Assume for now we know which percentile p we are looking for

First observe that modeling well the complete conditional distribution P(s|r,x,z) is sufficient

If we have a good parametric model and distributional assumptions, we can also use them to estimate quantiles

– E.g.: linear regression under linear model and homoskedastic iid gaussian errors assumptions

Practically, however, may not be good idea to count on such assumptions

– Especially not a gaussian model, because of statistical robustness considerations
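As a small illustration of the parametric route: under the linear model with homoskedastic Gaussian errors, the pth conditional quantile is simply the regression mean shifted by z_p standard deviations, where z_p is the standard normal quantile (about 1.28 for p = 0.9). The sketch below uses only Python's standard library; the function name `gaussian_quantile` and the example numbers are illustrative assumptions, not part of the talk.

```python
from statistics import NormalDist


def gaussian_quantile(mean: float, sigma: float, p: float) -> float:
    """p-th quantile of N(mean, sigma^2), computed as mean + z_p * sigma."""
    z_p = NormalDist().inv_cdf(p)  # standard normal quantile
    return mean + z_p * sigma


# Hypothetical customer: predicted mean spend 100, residual std. dev. 20
estimate = gaussian_quantile(100.0, 20.0, 0.9)
print(round(estimate, 2))
```

The estimate is only as good as the Gaussian assumption, which is exactly the robustness concern raised on this slide.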


Modeling REALISTIC Wallet Directly

REALISTIC defines wallet as pth percentile of conditional of spending given customer attributes

– Implies that a fraction (1-p) of customers are already spending their full wallet with IBM

Two obvious ways to get at the pth percentile:

– Estimate the conditional by integrating over a neighborhood of similar customers: take the pth percentile of spending in the neighborhood

– Create a global model for the pth percentile: build global regression models


Local Models: K-Nearest Neighbors

Design distance metric, e.g.:

– Same industry

– Similar employees/revenue

– Similar IBM relationship

Neighborhood sizes (k):

– Neighborhood size has significant effect on prediction quality

Prediction:

– Quantile of firms in the neighborhood
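The local approach can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the model used at IBM: the dictionary keys (`industry`, `employees`, `ibm_sales`), the distance (same industry, then absolute difference in log employee count), and the nearest-rank quantile are all hypothetical choices made for the example.

```python
import math


def quantile(values, p):
    """Nearest-rank p-th quantile of a non-empty list (0 < p <= 1)."""
    ordered = sorted(values)
    rank = max(1, math.ceil(p * len(ordered)))
    return ordered[rank - 1]


def qknn_wallet(target, customers, k, p):
    """Quantile k-NN: p-th quantile of IBM sales among the k nearest peers.

    Peers must share the target's industry; distance is the absolute
    difference in log employee count (an assumed metric).
    """
    peers = [c for c in customers if c["industry"] == target["industry"]]
    peers.sort(
        key=lambda c: abs(math.log(c["employees"]) - math.log(target["employees"]))
    )
    neighborhood = peers[:k]
    return quantile([c["ibm_sales"] for c in neighborhood], p)


# Made-up data for illustration only
customers = [
    {"industry": "Banking", "employees": 900, "ibm_sales": 10.0},
    {"industry": "Banking", "employees": 1000, "ibm_sales": 50.0},
    {"industry": "Banking", "employees": 1100, "ibm_sales": 20.0},
    {"industry": "Banking", "employees": 1200, "ibm_sales": 80.0},
    {"industry": "Banking", "employees": 50000, "ibm_sales": 900.0},
    {"industry": "Retail", "employees": 1000, "ibm_sales": 500.0},
]
target = {"industry": "Banking", "employees": 1050}
print(qknn_wallet(target, customers, k=4, p=0.8))
```

Changing `k` changes which peers enter the neighborhood, which is why neighborhood size matters so much for prediction quality.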

[Figure: target company i and its neighborhood within the universe of IBM customers with D&B information, plotted by industry, employees, and IBM spend; histogram of IBM sales (frequency) in the neighborhood with the wallet estimate marked]


Global Estimation: the Quantile Loss Function

Our REALISTIC wallet definition calls for estimating the pth quantile of P(s|x,z).

Can we devise a loss function which correctly estimates the quantile on average?

Answer: yes, the quantile loss function for quantile p:

$$L_p(y,\hat{y}) = \begin{cases} p\,(y-\hat{y}) & \text{if } y \ge \hat{y} \\ (1-p)\,(\hat{y}-y) & \text{if } y < \hat{y} \end{cases}$$

This loss function is optimized in expectation when we correctly predict REALISTIC:

$$\arg\min_{\hat{y}} E\big(L_p(y,\hat{y}) \mid x\big) = p\text{th quantile of } P(y \mid x)$$
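A quick numerical check of this property, as a sketch: the constant prediction that minimizes total quantile loss on a sample is an empirical pth quantile, and it is unaffected by how extreme the largest value is. The helper names and the toy data are assumptions made for the example.

```python
def quantile_loss(y, y_hat, p):
    """Quantile loss: p * (y - y_hat) when under-predicting, else (1 - p) * (y_hat - y)."""
    if y >= y_hat:
        return p * (y - y_hat)
    return (1 - p) * (y_hat - y)


def best_constant(ys, p):
    """Observed value that, used as a constant prediction, minimizes total quantile loss."""
    candidates = sorted(set(ys))
    return min(candidates, key=lambda c: sum(quantile_loss(y, c, p) for y in ys))


# Toy sample with one huge outlier (the mean is 14.5)
ys = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]
print(best_constant(ys, 0.75), best_constant(ys, 0.25))
```

Restricting the search to observed values is enough here because the total loss is piecewise linear in the prediction, with kinks only at the data points.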


Some Quantile Loss Functions

[Figure: quantile loss plotted against the residual (observed-predicted) over the range -3 to 3, for p=0.8 and p=0.5 (absolute loss)]


Quantile Regression

Squared loss regression:

– Estimation of conditional expected value by minimizing sum of squares:

$$\min_{\beta} \sum_{i=1}^{n} \big(s_i - f(x_i, z_i, \beta)\big)^2$$

Quantile regression:

– Minimize quantile loss:

$$\min_{\beta} \sum_{i=1}^{n} L_p\big(s_i, f(x_i, z_i, \beta)\big), \qquad L_p(y,\hat{y}) = \begin{cases} p\,(y-\hat{y}) & \text{if } y \ge \hat{y} \\ (1-p)\,(\hat{y}-y) & \text{if } y < \hat{y} \end{cases}$$

Implementation:

– assume linear function in some representation $y = \beta^t f(x,z)$, solution using linear programming

– Linear quantile regression package in R (Koenker, 2001)
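To make the objective concrete without a linear programming solver, the sketch below fits a one-feature quantile regression line by brute force. It relies on a property of the underlying linear program: with an intercept and one slope, some optimal line passes through two data points, so enumerating all such lines finds a minimizer. This enumeration is a stand-in chosen for the example (cubic time, small samples only), not the method used in the R package; the function names and data are hypothetical.

```python
from itertools import combinations


def quantile_loss(y, y_hat, p):
    """Quantile loss for a single observation."""
    return p * (y - y_hat) if y >= y_hat else (1 - p) * (y_hat - y)


def fit_quantile_line(xs, ys, p):
    """Exact 1-D linear quantile regression by enumerating candidate lines.

    Every line through two data points with distinct x is scored by its
    total quantile loss; the best (intercept, slope) pair is returned.
    """
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # vertical pair: no line y = a + b * x through both
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        intercept = ys[i] - slope * xs[i]
        total = sum(
            quantile_loss(y, intercept + slope * x, p) for x, y in zip(xs, ys)
        )
        if best is None or total < best[0]:
            best = (total, intercept, slope)
    return best[1], best[2]


# Made-up data for illustration only
xs = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1.0, 2.5, 2.0, 6.0, 4.5, 9.0, 6.5, 12.0, 8.0, 15.0]
a, b = fit_quantile_line(xs, ys, p=0.8)
at_or_below = sum(y <= a + b * x + 1e-9 for x, y in zip(xs, ys))
print(a, b, at_or_below)
```

With an intercept in the model, at least a fraction p of the training points lie on or below the fitted line, which is the sense in which the fit targets the pth quantile.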


Quantile Regression Tree – Local or Global?

Motivation:

– Identify a locally optimal definition of neighborhood

– Inherently nonlinear

Adjustments of M5/CART for Quantile prediction:

– Predict the quantile rather than the mean of the leaf

– Empirically, splitting/pruning criteria do not require adjustment
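The leaf-level adjustment can be illustrated with a fixed, hand-written tree: the splits stay as they are, and each leaf reports a quantile of its training targets instead of their mean. The tree in `leaf_of`, the field names, and the data are hypothetical; a real M5/CART implementation would learn the splits.

```python
import math
from collections import defaultdict


def quantile(values, p):
    """Nearest-rank p-th quantile of a non-empty list (0 < p <= 1)."""
    ordered = sorted(values)
    return ordered[max(1, math.ceil(p * len(ordered))) - 1]


def leaf_of(customer):
    """Hypothetical hand-written tree: industry first, then company sales."""
    if customer["industry"] == "Banking":
        return "banking"
    return "small non-banking" if customer["sales"] < 100_000 else "large non-banking"


def fit_leaves(customers, p):
    """Per-leaf (mean, p-th quantile) of IBM sales."""
    by_leaf = defaultdict(list)
    for c in customers:
        by_leaf[leaf_of(c)].append(c["ibm_sales"])
    return {leaf: (sum(v) / len(v), quantile(v, p)) for leaf, v in by_leaf.items()}


# Made-up data for illustration only
customers = [
    {"industry": "Banking", "sales": 5_000_000, "ibm_sales": s}
    for s in (0, 10, 20, 40, 200)
] + [
    {"industry": "Retail", "sales": 50_000, "ibm_sales": s}
    for s in (0, 1, 2, 3, 9)
]
leaves = fit_leaves(customers, p=0.75)
print(leaves)
```

In the banking leaf the mean is pulled up by a single large account, while the quantile is determined by rank alone.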

[Figure: example quantile regression tree with splits Industry = ‘Banking’, Sales<100K, and IBM Rev 2003>10K (yes/no branches); each leaf shows a histogram of IBM Sales (frequency) with the wallet estimate marked]


Aside: Log-Scale Modeling of Monetary Quantities

Due to the exponential, very long-tailed typical distribution of monetary quantities (like Sales and Wallet), it is typically impossible to model them on original scale, because e.g.:

– Biggest companies dominate modeling and evaluation

– Any implicit homoskedasticity assumption in using fixed loss function is invalid

Log scale is often statistically appropriate, for example if % change is likely to be “homoskedastic”

Major issue: models ultimately judged in dollars, not log-dollars…
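One point worth illustrating here, as an aside that is not stated on the slide: quantiles commute with monotone transformations such as the logarithm, while means do not. A quantile estimated on the log scale can therefore be exponentiated back to dollars directly, whereas exponentiating a log-scale mean gives the geometric mean rather than the dollar mean. The sketch below uses made-up sales figures and a nearest-rank quantile.

```python
import math


def quantile(values, p):
    """Nearest-rank p-th quantile of a non-empty list (0 < p <= 1)."""
    ordered = sorted(values)
    return ordered[max(1, math.ceil(p * len(ordered))) - 1]


# Made-up, long-tailed sales figures in dollars
sales = [1e3, 5e3, 2e4, 1e5, 3e6]
log_sales = [math.log(s) for s in sales]
p = 0.75

q_direct = quantile(sales, p)                # quantile in dollars
q_back = math.exp(quantile(log_sales, p))    # log-scale quantile mapped back

mean_direct = sum(sales) / len(sales)                   # arithmetic mean in dollars
mean_back = math.exp(sum(log_sales) / len(log_sales))   # geometric mean

print(q_direct, q_back)
print(mean_direct, mean_back)
```

The two quantiles agree up to floating-point rounding, while the back-transformed mean falls well below the dollar mean.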


Empirical Evaluation: Quantile Loss

Setup

– Four domains with relevant quantile modeling problems: direct mailing, housing prices, income data, IBM sales

– Performance on test set in terms of quantile loss at p = 0.9

– Approaches: Linear quantile regression, Q-kNN, Quantile trees, Bagged quantile trees, Quanting (Langford et al. 2006; reduces quantile estimation to averaged classification using trees)

Baselines

– Best constant model

– Traditional regression models for expected values, adjusted under Gaussian assumption (+1.28σ)


Performance on Quantile Loss

Conclusions

– Standard regression models are not competitive

– If there is a time-lagged variable, LinQuantReg is best

– Otherwise, bagged quantile trees (and quanting) perform best

– Q-kNN is not competitive


Residuals for Quantile Regression

Total positive holdout residuals: 90.05% (18009/20000)


Graphical Model for SERVED(?) Wallet Estimation

Customer’s Firmographics (X)

Customer’s IT Wallet (W)

Customer’s Spending with IBM (S)

Customer’s Relationship with IBM (Z)

View 1

View 2

Two conditionally independent views !


Generic Problem Setting

Unsupervised learning scenario:

Unobserved target variable

Observations on multiple predictor variables

Domain knowledge suggesting that the predictors form multiple conditionally independent views

Goal: To predict the target variable


Summary of Results on Generic Problem

Analysis of a relevant class of latent variable models

– Markov blanket can be split into conditionally independent views

– For exponential linear models, the maximum likelihood estimation reduces to convex optimization problem

Solution approaches for Gaussian likelihoods

– Reduction to single linear least squares regression

– ANOVA for testing conditional independence assumptions

Empirical evaluation

– Comparable to supervised learning with significant amount of training data

– Case study on wallet estimation


Discriminative Maximum Likelihood Inference

Given: Directed graphical model and parametric form of the conditional distributions of nodes given their parents

Goal: Predict the target W using the parameter estimates that are most likely given the observed data and the graphical model:

$$\theta^* = \arg\max_{\theta} \log P_{D,\theta}(S \mid X, Z) = \arg\max_{\theta} \log \int_w P_{\theta_0}(w \mid X)\, P_{\theta_1}(S \mid w, Z)\, dw$$

where $\theta = (\theta_0, \theta_1)$ is the parameter vector for the parametric conditional likelihoods, and D is our data

Solution: Expectation-Maximization (EM) algorithm

– Converges to a local optimum in general

Estimating W: Mean or mode of “posterior” $P_{\theta^*}(W \mid X, Z)$


General Theoretical Result: Exponential Models

Theorem: When the conditional distributions p(W|X) and p(S|W,Z) correspond to exponential linear models with matching link functions, the incomplete discriminative log-likelihood $L_D(\theta) = \log P_{D,\theta}(S \mid X,Z)$ is a concave function of the parameters $\theta$

Maximum likelihood estimation reduces to a convex optimization problem

EM algorithm converges to the globally optimal solution


Gaussian Likelihoods and Linear Regression

Assume both discriminative likelihoods P(W|X) and P(S|W,Z) are linear and Gaussian:

$$w_i - \beta^t x_i = \epsilon_i^w \sim N(0, \sigma_w^2) \ \text{i.i.d.}$$

$$s_i - w_i - \gamma^t z_i = \epsilon_i^s \sim N(0, \sigma_s^2) \ \text{i.i.d.}$$

Previous theorem says that EM would give ML solution $\theta_{MLE} = (\beta_{MLE}, \gamma_{MLE})$

But if we add equations up we eliminate W:

$$s_i - \beta^t x_i - \gamma^t z_i = (\epsilon_i^s + \epsilon_i^w) \sim N(0, \sigma_s^2 + \sigma_w^2) \ \text{i.i.d.}$$

Maximum likelihood solution of this problem is linear regression and gives solution $\theta_{LS} = (\beta_{LS}, \gamma_{LS})$

– Are the two solutions the same?
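The elimination step can be checked by simulation. The sketch below generates a latent wallet w from one firmographic variable x, generates sales s from w and one relationship variable z, and then regresses s on (x, z) without ever looking at w. The scalar coefficients, called beta and gamma here, and all numeric settings are assumptions chosen for the example, and there is no intercept in either equation.

```python
import random


def fit_no_intercept(xs, zs, ss):
    """Least squares fit of s = beta * x + gamma * z via the 2x2 normal equations."""
    sxx = sum(x * x for x in xs)
    szz = sum(z * z for z in zs)
    sxz = sum(x * z for x, z in zip(xs, zs))
    sxs = sum(x * s for x, s in zip(xs, ss))
    szs = sum(z * s for z, s in zip(zs, ss))
    det = sxx * szz - sxz * sxz
    beta = (sxs * szz - sxz * szs) / det
    gamma = (sxx * szs - sxz * sxs) / det
    return beta, gamma


rng = random.Random(0)
n = 5000
beta_true, gamma_true = 2.0, -1.0  # assumed values for the simulation
xs = [rng.gauss(0, 1) for _ in range(n)]
zs = [rng.gauss(0, 1) for _ in range(n)]
# Latent wallet: w = beta * x + noise (never shown to the fitting step)
ws = [beta_true * x + rng.gauss(0, 0.5) for x in xs]
# Observed sales: s = w + gamma * z + noise
ss = [w + gamma_true * z + rng.gauss(0, 0.5) for w, z in zip(ws, zs)]

beta_hat, gamma_hat = fit_no_intercept(xs, zs, ss)
w_hat = [beta_hat * x for x in xs]  # wallet estimate from firmographics alone
mean_error = sum(wh - w for wh, w in zip(w_hat, ws)) / n
print(beta_hat, gamma_hat, mean_error)
```

The fitted coefficients land close to the simulated ones, and the wallet estimates built from them show no systematic offset from the hidden wallets.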


Equivalence and Interpretation

Equivalence Theorem: When U=[X,Z] is a full column rank matrix, the two estimates are identical: $\theta_{MLE} = \theta_{LS}$

Consistency of LS and unbiasedness of resulting W estimates

Can make use of linear regression computation and inference tools

– In particular: ANOVA to test validity of assumptions

Some caveats we glossed over

– In particular, full rank requirement implies cannot have intercept in both gaussian likelihoods!


ANOVA for Testing Independence Assumptions

ANOVA: Variance-based analysis for determining the goodness of fit for nested linear models

Example of nested models:

– Model A: Linear model with only variables in X, Z and no interactions

– Model B: Allow interactions only within X and Z

– Model C: Allow interactions between variables in X and Z

Key Idea: if model C is statistically superior to model B, then the conditional independence and/or parametric assumptions are rejected
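A minimal sketch of the nested-model comparison, assuming one variable per view (so there are no within-view interactions and models A and B coincide). It computes the F statistic for adding the cross-view interaction x*z to an additive model. The helper names, the simulated data, and the choice of a single interaction term are assumptions made for the example; turning the statistic into a p-value would additionally need the F distribution from a statistics library.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def rss(rows, ys):
    """Residual sum of squares of the least squares fit of ys on the design rows."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    coef = solve(xtx, xty)
    return sum((y - sum(c * v for c, v in zip(coef, r))) ** 2 for r, y in zip(rows, ys))


def f_statistic(xs, zs, ss):
    """F statistic for adding the interaction x*z to the model s ~ 1 + x + z."""
    additive = [[1.0, x, z] for x, z in zip(xs, zs)]
    with_cross = [[1.0, x, z, x * z] for x, z in zip(xs, zs)]
    rss_small, rss_large = rss(additive, ss), rss(with_cross, ss)
    extra_terms, resid_df = 1, len(ss) - 4
    return ((rss_small - rss_large) / extra_terms) / (rss_large / resid_df)


rng = random.Random(1)
n = 500
xs = [rng.gauss(0, 1) for _ in range(n)]
zs = [rng.gauss(0, 1) for _ in range(n)]
noise = [rng.gauss(0, 1) for _ in range(n)]
s_additive = [1 + x + z + e for x, z, e in zip(xs, zs, noise)]
s_interact = [1 + x + z + 2 * x * z + e for x, z, e in zip(xs, zs, noise)]

f_additive = f_statistic(xs, zs, s_additive)
f_interact = f_statistic(xs, zs, s_interact)
print(f_additive, f_interact)
```

A large statistic on the second data set signals that the cross-view interaction carries real information, which is the situation in which the assumptions would be rejected.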


Some Simulation Results



Wallet Case Study Results

Modeling equations: (monetary values log scale)

$$\log(w_i) = f(x_i) + c_w + \epsilon_i^w, \quad \epsilon_i^w \sim N(0, \sigma^2)$$

$$\log(s_i) - \log(w_i) = g(z_i) + c_s + \epsilon_i^s, \quad \epsilon_i^s \sim N(0, \sigma^2)$$

($c_w$, $c_s$ are intercepts; f, g are parametric forms)

Data is 2000 IBM customers in finance sector

ANOVA results consistent with cond. independence:


Market Alignment Program (MAP): Background

MAP - Objective:

– Optimize the allocation of sales force

– Focus on customers with growth potential

– Set evaluation baselines for sales personnel

MAP – Components:

– Web-interface with customer information

– Analytical component: wallet estimates

– Workshops with sales personnel to review and correct the wallet predictions

– Shift of resources towards customers with lower wallet share


MAP Tool Captures Expert Feedback from the Client Facing Teams

[Figure: MAP workflow diagram. Inputs: Transaction Data, D&B Data. Stages: Data Integration; Analytics and Validation; Insight Delivery and Capture; Post-processing. Other elements: Wallet models (Predicted Opportunity), Web Interface, MAP Interview Team, Client Facing Unit (CFU) Team, Expert validated Opportunity, Resource Assignments]

MAP interview process – all Integrated and Aligned Coverages

The objective here is to use expert feedback (i.e. validated revenue opportunity) from last year’s workshops to evaluate our latest opportunity models


MAP Workshops Overview

Calculated 2005 opportunity using naive Q-kNN approach

2005 MAP workshops

– Displayed opportunity by brand

– Expert can accept or alter the opportunity

Select 3 brands for evaluation: DB2, Rational, Tivoli

Build ~100 models for each brand using different approaches

Compare expert opportunity to model predictions

– Error measures: absolute, squared

– Scale: original, log, root


Initial Q-kNN Model Used

Distance metric

– Identical Industry

– Euclidean distance on size (Revenue or employees)

Neighborhood size: 20

Prediction

– Median of the non-zero neighbors

– (Alternatives: max, percentile)

Post-Processing

– Floor prediction by max of last 3 years revenue
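The prediction and post-processing steps listed above can be sketched as follows. The function name and the example numbers are hypothetical, and the neighbor search itself is assumed to have happened already; treating a neighborhood with no non-zero sales as an estimate of zero is also an assumption.

```python
from statistics import median


def initial_opportunity(neighbor_sales, last_three_years_revenue):
    """Median of the non-zero neighbor sales, floored by the customer's own best recent year."""
    nonzero = [s for s in neighbor_sales if s > 0]
    knn_estimate = median(nonzero) if nonzero else 0.0
    return max(knn_estimate, max(last_three_years_revenue))


# Made-up numbers: sales of the neighbors, and the target's own revenue history
print(initial_opportunity([0, 0, 10, 30, 50], [40, 20, 35]))
print(initial_opportunity([0, 0, 10, 30, 50], [5, 8, 2]))
```

The floor guarantees that the displayed opportunity is never below what the customer has already spent in one of the last three years.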

[Figure: target company i and its neighborhood within the universe of IBM customers with D&B information, plotted by industry, employees, and revenue]


Expert Feedback (Log Scale) to Original Model (DB2)

[Figure: scatter plot of Expert Feedback against MODEL_OPPTY, both on a log scale from 0 to 20]

– Experts accept opportunity (45%)

– Experts change opportunity (40%): increase (17%), decrease (23%)

– Experts reduce opportunity to 0 (15%)


Observations

Many accounts are set for external reasons to zero

– Exclude from evaluation since no model can predict this

Exponential distribution of opportunities

– Evaluation on the original (non-log) scale suffers from huge outliers

Experts seem to make percentage adjustments

– Consider log scale evaluation in addition to original scale and root as intermediate

– Suspect strong “anchoring” bias, 45% of opportunities were not touched


Evaluation Measures

Different scales to avoid outlier artifacts

– Original: e = model - expert

– Root: e = root(model) - root(expert)

– Log: e = log(model) - log(expert)

Statistics on the distribution of the errors

– Mean of e^2

– Mean of |e|

Total of 6 criteria
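The six criteria (three scales times two error statistics) can be computed as in the sketch below. The function and key names are hypothetical, and the inputs are assumed to be strictly positive, which matches excluding accounts that experts set to zero.

```python
import math

# Scale transforms applied to both model and expert values before differencing
SCALES = {"original": lambda v: v, "root": math.sqrt, "log": math.log}


def six_criteria(model, expert):
    """Mean squared and mean absolute error on the original, root, and log scales."""
    results = {}
    for name, transform in SCALES.items():
        errors = [transform(m) - transform(e) for m, e in zip(model, expert)]
        results[(name, "mean_sq")] = sum(err * err for err in errors) / len(errors)
        results[(name, "mean_abs")] = sum(abs(err) for err in errors) / len(errors)
    return results


# Made-up example: the model agrees on one account and is 4x too high on another
scores = six_criteria(model=[100.0, 400.0], expert=[100.0, 100.0])
for key, value in scores.items():
    print(key, value)
```

On the original scale the single large disagreement dominates, while the root and log scales compress it, which is the motivation for reporting all three.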


Model Comparison Results

We count how often a model scores within the top 10 and top 20 for each of the 6 measures:

Model                     Rational          DB2               Tivoli
                          top 10   top 20   top 10   top 20   top 10   top 20
Displayed Model (kNN)     6        6        4        5        6        6
Max 03-05 Revenue         1        1        0        3        1        4
Linear Quantile 0.8       5        6        2        4        3        5
Regression Tree           1        3        2        4        1        2
Q-kNN 50 + flooring       2        3        6        6        4        6
Decomposition Center      0        0        3        5        0        4
Quantile Tree 0.8         0        1        2        4        1        4

Slide annotations: (Anchoring), (Best)


MAP Experiments Conclusions

Q-kNN performs very well after flooring but is typically inferior prior to flooring

80th percentile Linear quantile regression performs consistently well (flooring has a minor effect)

Experts are strongly influenced by displayed opportunity (and displayed revenue of previous years)

Models without last year’s revenue don’t perform well

Use Linear Quantile Regression with q=0.8 in MAP 06


MAP Business Impact

MAP launched in 2005

– In 2006, 420 workshops held worldwide, with teams responsible for most of IBM’s revenue

Most important use is segmentation of customer base

– Shift resources into “invest” segments with low wallet share

Extensive anecdotal evidence of the success of the process

– E.g., higher growth in “invest” accounts after resource shifts

MAP recognized as 2006 IBM Research Accomplishment

– Awarded based on “proven” business impact


Summary

Wallet estimation problem is practically important and under-researched

Our contributions:

– Propose Wallet definitions: SERVED and REALISTIC

– Offer corresponding modeling approaches:

• Quantile estimation methods

• Graphical latent variable model

– Evaluation on simulated, public and internal data

– Implementation within MAP project

We are interested in extending both theory and practice to domains other than IBM


Thank you!
