From Data to Revenue: Prescriptive Analytics with RapidMiner David Weisman, Ph.D. [email protected] LaTeX compile time: August 21, 2014, 14:27

RM World 2014: Prescriptive analytics with RapidMiner


Page 1: RM World 2014: Prescriptive analytics with RapidMiner

From Data to Revenue: Prescriptive Analytics with RapidMiner

David Weisman, Ph.D.

[email protected]

LaTeX compile time: August 21, 2014, 14:27

Page 2: RM World 2014: Prescriptive analytics with RapidMiner

© 2014 David Weisman. All rights reserved.

If you’d like to use this material for any purpose, please contact [email protected].

Page 3: RM World 2014: Prescriptive analytics with RapidMiner

Outline

Prescriptive Analytics Landscape

Sneak Peek: RapidMiner Prescriptive Analytics Extension

Page 5: RM World 2014: Prescriptive analytics with RapidMiner

Prescriptive Analytics delivers largest value

[Figure: three stages of analytics plotted on Difficulty and Value axes]

- Descriptive Analytics: What happened?
- Predictive Analytics: What will happen?
- Prescriptive Analytics: How can we make it happen?

Derived from: Gartner (December 2012)

Page 6: RM World 2014: Prescriptive analytics with RapidMiner

Prescriptive Analytics provides large gains

| Gain | Company | Optimization |
| --- | --- | --- |
| 20 million miles / year | UPS | Right-hand turns minimize fuel consumption |
| $20 million / year | American Airlines | Optimize crew pairing |
| $200 million | NBC | Improve productivity |
| 35% cost reduction | Volkswagen USA | Optimize supply chain |
| $294.8 million | Procter & Gamble | Optimize sourcing with suppliers |
| $3.5 million / year | Fingerhut | Choose most profitable sequence of catalogs |
| $53 million | Taco Bell | Optimize costs by scheduling and allocating work crews |

Chen, DS, et al. Applied Integer Programming: Modeling and Solution, 2011

Page 7: RM World 2014: Prescriptive analytics with RapidMiner

Gartner Hype Cycle describes technology adoption

[Figure: Gartner Hype Cycle, as of July 2013. Axes: time (horizontal) and expectations (vertical). Phases: Innovation Trigger, Peak of Inflated Expectations, Trough of Disillusionment, Slope of Enlightenment, Plateau of Productivity. A legend marks when each technology's plateau will be reached. The plotted technologies include Prescriptive Analytics, Big Data, In-Memory Analytics, Cloud Computing, and Predictive Analytics.]

Source: Gartner

Page 8: RM World 2014: Prescriptive analytics with RapidMiner

Predictive Analytics is mainstream


Location Intelligence

Speech Recognition

Predictive Analytics

Source: Gartner

Page 9: RM World 2014: Prescriptive analytics with RapidMiner

Prescriptive Analytics is coming fast

Electrovibration

Affective Computing

Prescriptive Analytics

Source: Gartner

Page 10: RM World 2014: Prescriptive analytics with RapidMiner

Example: Maximize hospital bed utilization

[Figure: Optimize hospital bed allocation. Two schedule panels, "Naïve allocation" and "Optimized allocation", for bed 1 and bed 2 over November 1 to 7. Patient stay labels: D1-3, A1-2, E2-4, B5-7, C4-7; one panel additionally shows F3-5.]

In practice:

- Many rooms
- Many beds
- Many constraints (medical equipment, patient requirements)
- Hugely inefficient with ad-hoc planning

Source: http://docs.jboss.org

Page 11: RM World 2014: Prescriptive analytics with RapidMiner

Predictive Analytics → Prescriptive Analytics

[Figure: Optimize hospital bed allocation. Two schedule panels, "Naïve allocation" and "Optimized allocation", for bed 1 and bed 2 over November 1 to 7. Patient stay labels: D1-3, A1-2, E2-4, B5-7, C4-7; one panel additionally shows F3-5.]

Boston Globe, Smart Scheduling

Page 12: RM World 2014: Prescriptive analytics with RapidMiner

Gartner describes Prescriptive Analytics

Usage: Descriptive 43%–68%, Predictive 13%, Prescriptive < 3%

Technologies: Optimization, Simulation

Analytical methods: Linear Programming, Integer Programming, Stochastic Programming, Monte-Carlo Approaches

Vendors: Matlab, Wolfram, iThink, Oracle, GAMS, CPLEX (IBM), Gurobi, Xpress (Fico), SAS-OR

Derived from Gartner materials

Page 13: RM World 2014: Prescriptive analytics with RapidMiner

Prescriptive Analytics market has an enormous gap

Limited tools Operations Researcher tools

Page 14: RM World 2014: Prescriptive analytics with RapidMiner

Prescriptive Analytics market has an enormous gap

Limited tools Operations Researcher tools

Microsoft

Page 15: RM World 2014: Prescriptive analytics with RapidMiner

Prescriptive Analytics market has an enormous gap

Limited tools Operations Researcher tools

    POSITIVE VARIABLES
        DISASSEMB   QUANTITY OF CHICKENS DISASSEMBLED

    PARAMETER SALEPRICE(COMPONENTS)  SALES PRICE OF COMPONENT PARTS
        /NECK 0.20, GIBLETS 0.70/

    EQUATIONS
        OBJT    OBJECTIVE FUNCTION (NETINCOME)
        LIMIT   CHICKEN AVAILABILITY;

    OBJT..  NETINCOME =E=
        + SUM(COMPONENTS, SALEPRICE(COMPONENTS) * SALES(COMPONENTS));

    LIMIT.. 1/3 * DISASSEMB =L= 1500;

GAMS sample from McCarl, BA, et al. Applied mathematical programming using algebraic systems, 1997

Page 16: RM World 2014: Prescriptive analytics with RapidMiner

Prescriptive Analytics market has an enormous gap

Limited tools Operations Researcher tools

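[Figure: RapidMiner "Optimize" operator with ports obj, con, and res, placed in the gap between the two tool categories]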

Page 17: RM World 2014: Prescriptive analytics with RapidMiner

Business optimization has 3 components

Objective: Business goal

- Maximize sales revenue
- Minimize manufacturing cost
- Maximize patient health

Decision variables: Elements we control to reach goal

- Which customers to up-sell
- Volume to purchase from each supplier
- Which patients get home treatment

Constraints: Limitations or requirements

- Boston sales must book > 2× NYC sales
- Each supplier gets < 20% of our business
- Hospital has 200 beds
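As a rough sketch, the three components map onto the standard form of a (mixed-integer) linear program. The symbols below ($x$, $c$, $A$, $b$, and the index set $J$ of integer variables) are notation introduced here for illustration and do not appear on the slides.

```latex
\begin{align*}
\text{Objective:}\quad          & \max_{x}\; c^{\top} x \\
\text{Constraints:}\quad        & A x \le b \\
\text{Decision variables:}\quad & x \ge 0, \qquad x_j \in \mathbb{Z}\ \text{ for } j \in J
\end{align*}
% Example constraint, assuming x_s is the volume purchased from supplier s:
% "each supplier gets < 20% of our business" becomes
\[
x_s \;\le\; 0.2 \sum_{k} x_k \qquad \text{for every supplier } s .
\]
```

Solvers generally work with non-strict inequalities, so the strict "<" of the slide is relaxed to "≤" in this sketch.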

Page 21: RM World 2014: Prescriptive analytics with RapidMiner

Outline

Prescriptive Analytics Landscape

Sneak Peek: RapidMiner Prescriptive Analytics Extension

Page 22: RM World 2014: Prescriptive analytics with RapidMiner

Take a sneak peek: Prescriptive Analytics Extension

[Figure: RapidMiner "Optimize" operator with ports obj, con, and res]

Today you’ll see:

- Mixed-integer linear solver
- Heuristic route optimizer
- Fun Boston demo
- Cloud-based optimization

Page 23: RM World 2014: Prescriptive analytics with RapidMiner

Here are some optimization algorithms

[Screenshot: RapidMiner Operators panel]

Prescriptive Analytics (5)
    Local Prescriptive Analytics (4)
        Linear And Integer Programming (3)
        Heuristic (1)
            Traveling-salesperson solver (simple 2d)
    Cloud Prescriptive Analytics (1)
        Linear And Integer Programming (1)
            Cloud-based NEOS MILP solver

Page 24: RM World 2014: Prescriptive analytics with RapidMiner

Prescriptive Analytics is easy in RapidMiner

[Screenshot: RapidMiner process in which Read CSV and Read CSV (2) are connected to the NEOS_MILP operator (ports Dec, Con, Exa, Exa), shown next to the operator's Help and Parameters panels.]

Help: Cloud-based NEOS MILP solver

Synopsis

Solve linear and integer programs in the cloud using NEOS.

Description

This cloud-based operator solves linear, integer, and mixed-integer linear programs using the NEOS server. The underlying solver engine is Gurobi.

Parameters of NEOS_MILP (Cloud-based NEOS MILP solver)

- type: decision variable datatype attribute
- name: constraint name attribute
- inequality: constraint inequality attribute
- resourceLimit: constraint limit attribute
- coefficient: decision variables coefficients
- maximize: criterion

Page 25: RM World 2014: Prescriptive analytics with RapidMiner

Let’s do Prescriptive Analytics Challenge #1

Objective

- Maximize fun during Boston visit

Decision variables

- 50 sites we could visit:
    - Fenway Park
    - Wally’s Cafe (best local jazz)
    - ...
    - History tour with quiz at the end

Constraints

- $500 budget
- Required minimum levels of:
    - Culture >= 12 sites
    - History >= 4 sites
    - Entertainment >= 9 sites
    - Sports >= 5 sites

Page 26: RM World 2014: Prescriptive analytics with RapidMiner

Define a fun level for each tourist activity

ExampleSet (Read Objective): 47 examples, 1 special attribute, 2 regular attributes

| Row No. | Item | FunLevel | Type |
| --- | --- | --- | --- |
| 1 | Back Bay | 6 | bin |
| 2 | Beacon Hill | 5 | bin |
| 3 | Bleacher Bar | 8 | bin |
| 4 | Boston Children's Museum | 2 | bin |
| 5 | Boston Harbor Islands National Recreation Area | 7 | bin |
| 6 | Boston Pops | 4 | bin |
| 7 | Boston Public Garden | 4 | bin |
| 8 | Boston Public Library | 1 | bin |
| 9 | Symphony Hall | 4 | bin |
| 10 | Bunker Hill Monument | 7 | bin |
| 11 | Café 939 | 8 | bin |
| 12 | Charles River Esplanade | 6 | bin |
| 13 | Duck Tours | 7 | bin |
| 14 | Dunkin Donuts | 9 | bin |
| 15 | Faneuil Hall Marketplace | 5 | bin |
| 16 | Fenway Park | 10 | bin |
| 17 | Harpoon Brewery Tour | 10 | bin |
| 18 | Harvard University | 7 | bin |
| 19 | Harvard Museum of Natural History | 8 | bin |
| 20 | House of Blues | 8 | bin |
| 21 | Huntington Theatre Company | 6 | bin |
| 22 | Institute of Contemporary Art 100 | 9 | bin |
| 23 | Isabella Stewart Gardner Museum | 4 | bin |
| 24 | John F. Kennedy Presidential Museum & Library | 5 | bin |
| 25 | Massachusetts Institute of Technology | 9 | bin |
| 26 | Mapparium | 5 | bin |
| 27 | Massachusetts State House | 4 | bin |
| 28 | Museum of Fine Arts | 8 | bin |
| 29 | Museum of Science | 7 | bin |
| 30 | New England Aquarium | 7 | bin |
| 31 | Newbury Street | 6 | bin |
| 32 | Newport Mansions | 2 | bin |
| 33 | North End | 8 | bin |
| 34 | Old North Church | 5 | bin |
| 35 | Old South Meeting House | 5 | bin |
| 36 | Paul Revere House | 5 | bin |
| 37 | Regattabar Jazz Club - Charles Hotel | 7 | bin |
| 38 | Ryles Jazz Club | 7 | bin |
| 39 | Scullers Jazz Club | 7 | bin |
| 40 | SoWa Galleries | 6 | bin |
| 41 | Spirit Cruises | 7 | bin |
| 42 | Sports Museum | 8 | bin |
| 43 | Top of the Hub Restaurant & Skywalk | 6 | bin |
| 44 | USS Constitution | 5 | bin |
| 45 | Trinity Church | 5 | bin |
| 46 | Wally’s Café | 10 | bin |
| 47 | Whale Watch | 8 | bin |

Page 27: RM World 2014: Prescriptive analytics with RapidMiner

Define costs and categories

ExampleSet (Join): 6 examples, 1 special attribute, 49 regular attributes (first columns shown)

| Row No. | Item | Back Bay | Beacon Hill | Duck Tours | Dunkin Donuts | Wally’s Café | Boston Children's Museum | Boston Harbor Islands ... |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Cost | 0 | 0 | 35 | 3 | 40 | 20 | 50 |
| 2 | FunLevel | 6 | 5 | 7 | 9 | 10 | 2 | 7 |
| 3 | Culture | 1 | 1 | 0 | 1 | 1 | 1 | 0 |
| 4 | History | 1 | 0 | 0 | 1 | 1 | 0 | 1 |
| 5 | Sports | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| 6 | Entertainment | 0 | 0 | 1 | 1 | 1 | 0 | 1 |

Page 28: RM World 2014: Prescriptive analytics with RapidMiner

Define business constraints

ExampleSet (Join): 6 examples, 1 special attribute, 49 regular attributes (first columns shown)

| Row No. | Item | Direction | Limit | Back Bay | Beacon Hill | Duck Tours | Dunkin Donuts | Wally’s Café | Boston Children's Museum |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Cost | <= | 500 | 0 | 0 | 35 | 3 | 40 | 20 |
| 2 | FunLevel | >= | 10 | 6 | 5 | 7 | 9 | 10 | 2 |
| 3 | Culture | >= | 12 | 1 | 1 | 0 | 1 | 1 | 1 |
| 4 | History | >= | 4 | 1 | 0 | 0 | 1 | 1 | 0 |
| 5 | Sports | >= | 5 | 0 | 0 | 0 | 1 | 0 | 0 |
| 6 | Entertainment | >= | 9 | 0 | 0 | 1 | 1 | 1 | 0 |
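To make the constraint table concrete, here is a minimal Python sketch of the same kind of binary program on a toy scale. It uses only seven sites with cost, fun, and category values read from the slides as best the extraction allows; the budget of 60 and the category minimums are hypothetical, scaled-down limits chosen so the toy problem stays interesting. Exhaustive enumeration stands in for the LP Solve operator here; it is not how a MILP solver works internally and is only practical for a handful of variables.

```python
from itertools import product

# (name, cost, fun, culture, history, sports, entertainment)
# Values transcribed from the slides for illustration.
SITES = [
    ("Back Bay",                  0,  6, 1, 1, 0, 0),
    ("Beacon Hill",               0,  5, 1, 0, 0, 0),
    ("Duck Tours",               35,  7, 0, 0, 0, 1),
    ("Dunkin Donuts",             3,  9, 1, 1, 1, 1),
    ("Wally's Café",             40, 10, 1, 1, 0, 1),
    ("Boston Children's Museum", 20,  2, 1, 0, 0, 0),
    ("Boston Harbor Islands",    50,  7, 0, 1, 0, 1),
]

# Hypothetical, scaled-down limits for the 7-site toy problem.
BUDGET = 60
MINIMUMS = {"culture": 3, "history": 2, "entertainment": 2}
COLUMN = {"culture": 3, "history": 4, "sports": 5, "entertainment": 6}


def total(choice, column):
    """Sum one column over the sites whose decision variable is 1."""
    return sum(site[column] * x for site, x in zip(SITES, choice))


def is_feasible(choice):
    """Check the budget (<=) and the category minimums (>=)."""
    if total(choice, 1) > BUDGET:
        return False
    return all(total(choice, COLUMN[k]) >= v for k, v in MINIMUMS.items())


def solve():
    """Enumerate all 2**n binary assignments; keep the most fun feasible one."""
    best = None
    for choice in product((0, 1), repeat=len(SITES)):
        if is_feasible(choice) and (best is None or total(choice, 2) > total(best, 2)):
            best = choice
    return best


best = solve()
print([site[0] for site, x in zip(SITES, best) if x])
print("cost:", total(best, 1), "fun:", total(best, 2))
```

Each row of the constraint table corresponds to one call of `total` compared against a limit, with the Direction column deciding between `<=` and `>=`.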

Page 29: RM World 2014: Prescriptive analytics with RapidMiner

LP Solve gives our optimized choices

ExampleSet (Read CSV): 54 examples, 0 special attributes, 2 regular attributes

| Row No. | decisionVar | value |
| --- | --- | --- |
| 1 | OBJECTIVE | 218 |
| 2 | Back Bay | 1 |
| 3 | Beacon Hill | 1 |
| 4 | Bleacher Bar | 1 |
| 5 | Boston Children's Museum | 0 |
| 6 | Boston Harbor Islands National Recreation Area | 0 |
| 7 | Boston Pops | 0 |
| 8 | Boston Public Garden | 1 |
| 9 | Boston Public Library | 1 |
| 10 | Symphony Hall | 0 |
| 11 | Bunker Hill Monument | 1 |
| 12 | Café 939 | 1 |
| 13 | Charles River Esplanade | 1 |
| 14 | Duck Tours | 1 |
| 15 | Dunkin Donuts | 1 |
| 16 | Faneuil Hall Marketplace | 0 |
| 17 | Fenway Park | 1 |
| 18 | Harpoon Brewery Tour | 1 |
| 19 | Harvard University | 1 |
| 20 | Harvard Museum of Natural History | 1 |
| 21 | House of Blues | 1 |
| 22 | Huntington Theatre Company | 0 |
| 23 | Institute of Contemporary Art 100 | 1 |
| 24 | Isabella Stewart Gardner Museum | 1 |
| ... | (rows 25 to 30 not shown) | |
| 31 | New England Aquarium | 0 |
| 32 | Newbury Street | 1 |
| 33 | Newport Mansions | 0 |
| 34 | North End | 1 |
| 35 | Old North Church | 1 |
| 36 | Old South Meeting House | 1 |
| 37 | Paul Revere House | 1 |
| 38 | Regattabar Jazz Club - Charles Hotel | 0 |
| 39 | Ryles Jazz Club | 0 |
| 40 | Scullers Jazz Club | 0 |
| 41 | SoWa Galleries | 1 |
| 42 | Spirit Cruises | 0 |
| 43 | Sports Museum | 1 |
| 44 | Top of the Hub Restaurant & Skywalk | 0 |
| 45 | USS Constitution | 1 |
| 46 | Trinity Church | 1 |
| 47 | Wally’s Café | 1 |
| 48 | Whale Watch | 0 |
| 49 | Cost | 497 |
| 50 | FunLevel | 218 |
| 51 | Culture | 16 |
| 52 | History | 19 |
| 53 | Sports | 5 |
| 54 | Entertainment | 11 |

Page 30: RM World 2014: Prescriptive analytics with RapidMiner

How well did we do?

ExampleSet (Read CSV), summary rows of the optimized solution:

| Row No. | decisionVar | value |
| --- | --- | --- |
| 49 | Cost | 497 |
| 50 | FunLevel | 218 |
| 51 | Culture | 16 |
| 52 | History | 19 |
| 53 | Sports | 5 |
| 54 | Entertainment | 11 |

Page 31: RM World 2014: Prescriptive analytics with RapidMiner

Let’s evaluate our solution

- 10^14 possible combinations of activities
- Far too complex to check by hand
- Plan:
    - Generate 1000 other solutions that meet all constraints
    - Compare against our optimal solution
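The plan above can be sketched in Python as rejection sampling: draw random 0/1 selections, keep those that satisfy the constraints, and compare their objective values with the optimum. Everything in this block is hypothetical illustration: 12 made-up sites with random costs and fun levels, a single budget constraint, and a reference optimum found by exhaustive enumeration. The slides do not say how the 1000 comparison solutions were generated.

```python
import random
from itertools import product

rng = random.Random(2014)          # fixed seed so the sketch is repeatable
N_SITES = 12                       # hypothetical; the slide's problem has 47 sites
cost = [rng.randint(0, 50) for _ in range(N_SITES)]   # made-up costs
fun = [rng.randint(1, 10) for _ in range(N_SITES)]    # made-up fun levels
BUDGET = 150                       # hypothetical budget


def cost_of(x):
    return sum(c * xi for c, xi in zip(cost, x))


def fun_of(x):
    return sum(f * xi for f, xi in zip(fun, x))


def feasible(x):
    return cost_of(x) <= BUDGET


# Reference optimum by exhaustive enumeration (only viable for small n).
optimum = max((x for x in product((0, 1), repeat=N_SITES) if feasible(x)), key=fun_of)


def sample_feasible(count, max_tries=1_000_000):
    """Rejection sampling: draw random 0/1 vectors, keep the feasible ones."""
    kept = []
    for _ in range(max_tries):
        x = tuple(rng.randint(0, 1) for _ in range(N_SITES))
        if feasible(x):
            kept.append(x)
            if len(kept) == count:
                break
    return kept


samples = sample_feasible(1000)
worse = sum(fun_of(x) < fun_of(optimum) for x in samples)
print(f"2**47 = {2**47:.2e} possible selections in the full 47-site problem")
print(f"optimum fun {fun_of(optimum)}; {worse} of {len(samples)} samples are strictly worse")
```

The first printed line shows where the 10^14 figure comes from: 47 yes/no decisions give 2^47 possible selections.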

Page 32: RM World 2014: Prescriptive analytics with RapidMiner

We did far better than 1000 other feasible solutions

[Figure: histogram of FunLevel (x-axis, 100 to 250) against experiment count (y-axis, 0 to 150) for the 1000 other feasible solutions. Annotation: "We maximized fun within our budget".]

Page 33: RM World 2014: Prescriptive analytics with RapidMiner

We used most of budget to maximize fun

[Figure: Fun (y-axis, 100 to 225) against Cost (x-axis, $425 to $500), with a legend for sub-optimal solution count (5 to 20). Annotation: "We maximized fun within $500 budget".]

Page 34: RM World 2014: Prescriptive analytics with RapidMiner

Category levels well above minimum constraints

[Figure: point plot with one column each for Cost, FunLevel, Culture, History, Sports, and Entertainment; 5% vertical jitter for clarity]

Page 35: RM World 2014: Prescriptive analytics with RapidMiner

Prescriptive Analytics Challenge #2: Optimize our tourist route

[Figure: map of the sites to visit (map data: Google)]

Page 36: RM World 2014: Prescriptive analytics with RapidMiner

Define Boston tourist optimization challenge

Objective

- Minimize transportation distance

Decision variables

- Ordered list of sites to visit

Constraints

- Start in Back Bay
- Visit each site exactly once
- End in Back Bay

Huge problem: 10^34 routes satisfy constraints

Page 38: RM World 2014: Prescriptive analytics with RapidMiner

RapidMiner + Google Maps provide geocoding

[Process diagram: Main Process with the operators Read CSV, preprocess, Encode URLs, Enrich Data.. ., and Write CSV]

Input ExampleSet (//Local Repository/tmp/data/places): Item, Street. Output ExampleSet (//Local Repository/tmp/data/placesGeocoded): Item, address, longitude, latitude. Rows 14 to 47 of 47 are shown; "?" marks a missing street.

| Row No. | Item | Street (input) | address (geocoded) | longitude | latitude |
| --- | --- | --- | --- | --- | --- |
| 14 | Dunkin Donuts | 127 Tremont | 127 Tremont Street, Boston, MA 02108, USA | -71.062 | 42.356 |
| 15 | Faneuil Hall Marketplace | ? | Faneuil Hall, Faneuil Hall Marketplace, 1 Faneuil Hall Square, Boston, MA 02109, USA | -71.056 | 42.360 |
| 16 | Fenway Park | 4 Yawkey Way | 4 Yawkey Way, Boston, MA 02215, USA | -71.098 | 42.346 |
| 17 | Harpoon Brewery Tour | 306 Northern Avenue | 306 Northern Avenue, Boston, MA 02210, USA | -71.034 | 42.347 |
| 18 | Harvard University | ? | Harvard University, Cambridge, MA 02138, USA | -71.117 | 42.377 |
| 19 | Harvard Museum of Natural History | 26 Oxford Street | 26 Oxford Street, Harvard University, Cambridge, MA 02138, USA | -71.115 | 42.379 |
| 20 | House of Blues | 15 Lansdowne St | 15 Lansdowne Street, Boston, MA 02215, USA | -71.096 | 42.347 |
| 21 | Huntington Theatre Company | 264 Huntington Ave | 264 Huntington Avenue, Boston, MA 02115, USA | -71.086 | 42.342 |
| 22 | Institute of Contemporary Art 100 | Northern Ave | Northern Avenue, Boston, MA, USA | -71.042 | 42.350 |
| 23 | Isabella Stewart Gardner Museum | 25 Evans Way | 25 Evans Way, Boston, MA 02115, USA | -71.098 | 42.338 |
| 24 | John F. Kennedy Presidential Museum & Library | Columbia Point | Columbia Point, University of Massachusetts Boston, Boston, MA 02125, USA | -71.035 | 42.315 |
| 25 | Massachusetts Institute of Technology | ? | Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, USA | -71.094 | 42.360 |
| 26 | Mapparium | 200 Massachusetts Ave | 200 Massachusetts Avenue, Boston, MA 02115, USA | -71.086 | 42.345 |
| 27 | Massachusetts State House | ? | Massachusetts State House, Boston, MA 01233, USA | -71.064 | 42.359 |
| 28 | Museum of Fine Arts | ? | Museum of Fine Arts, Boston, 465 Huntington Avenue, Boston, MA 02115, USA | -71.094 | 42.339 |
| 29 | Museum of Science | 1 Science Park | Museum Of Science Driveway, Boston, MA 02114, USA | -71.071 | 42.368 |
| 30 | New England Aquarium | Central Wharf | Central Wharf, Boston, MA 02110, USA | -71.050 | 42.359 |
| 31 | Newbury Street | 1 Newbury St | 1 Newbury Street, Boston, MA 02116, USA | -71.071 | 42.353 |
| 32 | Newport Mansions | 424 Bellevue Ave Newport RI | 424 Bellevue Avenue, Newport, RI 02840, USA | -71.309 | 41.482 |
| 33 | North End | 1 Prince St | 1 Prince Street, Boston, MA 02113, USA | -71.053 | 42.364 |
| 34 | Old North Church | 193 Salem Street | 193 Salem Street, Boston, MA 02113, USA | -71.055 | 42.366 |
| 35 | Old South Meeting House | 310 Washington Street | 310 Washington Street, Boston, MA 02108, USA | -71.059 | 42.357 |
| 36 | Paul Revere House | 19 North Square | 19 North Square, Boston, MA 02113, USA | -71.054 | 42.364 |
| 37 | Regattabar Jazz Club - Charles Hotel | 1 Bennett Street | 1 Bennett Street, Cambridge, MA 02138, USA | -71.123 | 42.372 |
| 38 | Ryles Jazz Club | 212 Hampshire Street | 212 Hampshire Street, Cambridge, MA 02139, USA | -71.100 | 42.373 |
| 39 | Scullers Jazz Club | 400 Soldiers Field Road | 400 Soldiers Field Road, Allston, MA 02134, USA | -71.118 | 42.360 |
| 40 | SoWa Galleries | 450 Harrison Ave | 450 Harrison Avenue, Boston, MA 02118, USA | -71.065 | 42.343 |
| 41 | Spirit Cruises | 200 Seaport Boulevard | 200 Seaport Boulevard, Boston, MA 02210, USA | -71.040 | 42.351 |
| 42 | Sports Museum | 100 Legends Way | 100 Legends Way, Boston, MA 02114, USA | -71.062 | 42.366 |
| 43 | Top of the Hub Restaurant & Skywalk | 800 Boylston Street | 800 Boylston Street, The Shops at the Prudential Center, Boston, MA 02199, USA | -71.083 | 42.348 |
| 44 | USS Constitution | ? | USS Constitution Museum, Charlestown, MA 02129, USA | -71.055 | 42.374 |
| 45 | Trinity Church | 206 Clarendon St | 206 Clarendon Street, Boston, MA 02116, USA | -71.075 | 42.350 |
| 46 | Wally’s Café | 427 Massachusetts Ave | 427 Massachusetts Avenue, Boston, MA 02118, USA | -71.082 | 42.341 |
| 47 | Whale Watch | Central Wharf | Central Wharf, Boston, MA 02110, USA | -71.050 | 42.359 |
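The "Encode URLs" step can be approximated outside RapidMiner as follows. This is a sketch under assumptions: it targets the JSON endpoint of Google's Geocoding web service, appends a hypothetical ", Boston, MA" suffix to each street, falls back to the item name when the street is "?", and uses a placeholder API key. The slides do not show the actual URL template of the process, and the block only builds the request URL without sending it.

```python
from urllib.parse import urlencode

# Assumed endpoint: Google's Geocoding web service (JSON variant).
GEOCODE_ENDPOINT = "https://maps.googleapis.com/maps/api/geocode/json"


def geocode_url(item, street, api_key="YOUR_API_KEY"):
    """Build a percent-encoded request URL for one row of the places table."""
    # Assumption: rows without a street ("?") are looked up by item name.
    query = item if street == "?" else f"{street}, Boston, MA"
    return f"{GEOCODE_ENDPOINT}?{urlencode({'address': query, 'key': api_key})}"


print(geocode_url("Fenway Park", "4 Yawkey Way"))
print(geocode_url("Harvard University", "?"))
```

`urlencode` takes care of spaces and commas, which is the part that is easy to get wrong when URLs are assembled by plain string concatenation.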

Page 39: RM World 2014: Prescriptive analytics with RapidMiner

Optimize route with Traveling-Salesperson solver

[Process diagram: Main Process with the operators Read Objec.. ., prep Constr. . ., LP Solve (ports Dec, Con, Exa, Exa), prep route, Write TSP, and Traveling-s.. . (ports Exa, Exa)]
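The slides do not say which heuristic the "Traveling-salesperson solver (simple 2d)" operator uses. As a stand-in, the sketch below combines two common techniques, nearest-neighbor construction followed by 2-opt improvement, on six of the geocoded sites. The choice of sites, the use of Trinity Church as start and end point, and the planar Euclidean distance on raw longitude/latitude are all assumptions made for illustration.

```python
import math
from itertools import permutations

# (longitude, latitude) from the geocoded table; subset chosen for illustration.
SITES = {
    "Trinity Church": (-71.075, 42.350),
    "Fenway Park": (-71.098, 42.346),
    "House of Blues": (-71.096, 42.347),
    "Massachusetts Institute of Technology": (-71.094, 42.360),
    "Harvard University": (-71.117, 42.377),
    "Old North Church": (-71.055, 42.366),
}
START = "Trinity Church"   # hypothetical start and end point


def dist(a, b):
    """Planar Euclidean distance in degrees (a simplification)."""
    (x1, y1), (x2, y2) = SITES[a], SITES[b]
    return math.hypot(x1 - x2, y1 - y2)


def tour_length(tour):
    return sum(dist(tour[i], tour[i + 1]) for i in range(len(tour) - 1))


def nearest_neighbor(start):
    """Greedy construction: always travel to the closest unvisited site."""
    tour = [start]
    left = [s for s in SITES if s != start]
    while left:
        nxt = min(left, key=lambda s: dist(tour[-1], s))
        tour.append(nxt)
        left.remove(nxt)
    return tour + [start]


def two_opt(tour):
    """Reverse inner segments for as long as that shortens the closed tour."""
    improved = True
    while improved:
        improved = False
        for i in range(1, len(tour) - 2):
            for j in range(i + 1, len(tour) - 1):
                candidate = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
                if tour_length(candidate) < tour_length(tour) - 1e-12:
                    tour, improved = candidate, True
    return tour


greedy = nearest_neighbor(START)
route = two_opt(greedy)

# With only six sites the true optimum can still be checked by brute force.
others = [s for s in SITES if s != START]
optimal = min(tour_length([START, *p, START]) for p in permutations(others))

print(" -> ".join(route))
print(f"2-opt length {tour_length(route):.4f}, brute-force optimum {optimal:.4f}")
```

The brute-force check is only possible because the example is tiny; for the full itinerary a heuristic like this returns a good route without any guarantee of optimality.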

Page 40: RM World 2014: Prescriptive analytics with RapidMiner

Traveling-salesperson solver optimized our itinerary

ExampleSet (Traveling-salesperson solver (simple 2d)): 33 examples, 0 special attributes, 3 regular attributes (first 24 rows shown)

| Row No. | from | to | distance |
| --- | --- | --- | --- |
| 1 | Back Bay | Wally’s Café | 88 |
| 2 | Wally’s Café | Mapparium | 60 |
| 3 | Mapparium | Museum of Fine Arts | 96 |
| 4 | Museum of Fine Arts | Isabella Stewart Gardner Museum | 40 |
| 5 | Isabella Stewart Gardner Museum | Fenway Park | 84 |
| 6 | Fenway Park | Bleacher Bar | 17 |
| 7 | Bleacher Bar | House of Blues | 10 |
| 8 | House of Blues | Harvard University | 363 |
| 9 | Harvard University | Harvard Museum of Natural History | 24 |
| 10 | Harvard Museum of Natural History | Massachusetts Institute of Technology | 283 |
| 11 | Massachusetts Institute of Technology | Charles River Esplanade | 96 |
| 12 | Charles River Esplanade | Café 939 | 54 |
| 13 | Café 939 | Duck Tours | 28 |
| 14 | Duck Tours | Boston Public Library | 55 |
| 15 | Boston Public Library | Trinity Church | 23 |
| 16 | Trinity Church | Newbury Street | 49 |
| 17 | Newbury Street | Boston Public Garden | 18 |
| 18 | Boston Public Garden | Massachusetts State House | 77 |
| 19 | Massachusetts State House | Dunkin Donuts | 32 |
| 20 | Dunkin Donuts | Old South Meeting House | 32 |
| 21 | Old South Meeting House | Beacon Hill | 24 |
| 22 | Beacon Hill | Sports Museum | 78 |
| 23 | Sports Museum | Museum of Science | 88 |
| 24 | Museum of Science | Bunker Hill Monument | 136 |

Page 41: RM World 2014: Prescriptive analytics with RapidMiner

Here’s our optimized route

[Figure: map of the optimized route. Marker legend: FunLevel (2.5 to 10.0) and Cost (0 to 80).]

Page 42: RM World 2014: Prescriptive analytics with RapidMiner

Our optimized route was shorter than 10,000 others

[Figure: histogram of route Distance (x-axis, 0.00 to 1.25) against Count (y-axis, 0 to 2000) for the 10,000 other routes, annotated "We did better". Inset: map of the optimized route with the FunLevel and Cost legend.]

Page 43: RM World 2014: Prescriptive analytics with RapidMiner

Prescriptive Analytics Challenge #3: Optimize churn customer retention campaign

[Screenshot: RapidMiner Application Wizard, Churn application]

Churn Data

To build a predictive model of churn, you start with a table of historical data about your customers. This table has one row per customer, and columns describing the customer. Typical columns for churn prediction include

- length of the business relationship
- frequency of customer transactions
- types of products purchased
- average purchase sizes
- survey results
- complaint frequency.

Customer demographics can also improve predictive accuracy, and you can include attributes such as

- age
- gender
- segment
- job type
- geographic or postal location.

In general, it is good to include columns based on your business expertise and intuition.

Customer relationships change over time and this evolution can help predict departures. To represent these changes, you can include columns that describe trends. For example, you can include year-over-year changes in sales for the past five years.

The data table must include a column indicating churn. In the demo data provided here, this column is named churn. For customers whose churn status you know, put a yes or no in the churn column.

Wizard step 3, Churn Column: Please select a column containing the information whether or not a customer churned. Select column: Churn. Positive class: yes.

Wizard step 4, Run Analysis: Take me to the results

Your Data (sample rows):

    ID      Churn  Gender  Age  Region code  Transaction count  Avg balance  Total accounts
    113704  no     f       52   1100         32                 145490       4
    622299  ?      f       57   8715         1                  242542       1
    609274  no     m       44   5145         28                 79100        5
    623378  ?      f       57   2857         1 6 1 1
    860912  yes    f       47   3368         44                 63939        1

Page 44: RM World 2014: Prescriptive analytics with RapidMiner

Traditional marketing churn prediction uses lift chart

[Screenshot: RapidMiner Application Wizard, step 6 "Analytical Results", Churn Results dashboard]

Churn Results

Congratulations, you have successfully built a churn prediction model. This dashboard shows the results of building and evaluating the quality of this model.

Lift Chart: The first result is a lift chart, which relates prediction confidence to the actual number of customers who churn. The leftmost bar represents the highest confidence predictions, and its height represents the number of churned customers predicted. In the ideal case, the high confidence predictions correctly predict the majority of churned customers.

Top Churn Candidates: The next panel shows the customers with the highest prediction of churning. You can directly target these customers for retention.

Decision Tree: RapidMiner built a decision tree to predict customer churn. You can examine this tree to understand why customers churn, and directly use this information in your marketing strategy. To see all parts of the tree, you can zoom the tree with the mouse wheel, and drag the tree by holding the left mouse button and moving the mouse.

[Figure: Lift chart. X-axis: Confidence for yes, with bins [0.8 - ], [0.6 - 0.8], [0.2 - 0.4], [ - 0.2]. Left y-axis: Count (0 to 18). Right y-axis: Percent of yes (0% to 100%). Series: Count for Churn = yes; Cumulative (Percent). Bar labels: 18 / 20, 3 / 4, 4 / 6, 10 / 42. Caption: For the model's strongest predictions, the lift chart examines how likely these customers are to churn.]

[Figure: Decision Tree. Split attributes: Transaction count, Region code, Total accounts, Avg balance. Split thresholds: 1.500, 4245, 6432.500, 127912.500, 1921.500. Leaves: yes / no. Caption: The decision tree models customer churn.]
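The lift chart's two series can be recomputed from its bar labels. The sketch below assumes that each label reads "churned customers / customers in the bin" and that the four labels belong to the four confidence bins in order from highest to lowest confidence; both are interpretations of the extracted chart text, not something the slide states explicitly.

```python
# (confidence bin, churned customers, customers in bin), highest confidence first.
# Assumed reading of the bar labels 18 / 20, 3 / 4, 4 / 6, 10 / 42.
BINS = [
    ("[0.8 - ]", 18, 20),
    ("[0.6 - 0.8]", 3, 4),
    ("[0.2 - 0.4]", 4, 6),
    ("[ - 0.2]", 10, 42),
]


def lift_rows(bins):
    """Return (label, churn rate in bin, cumulative share of all churners)."""
    total_churned = sum(churned for _, churned, _ in bins)
    running = 0
    rows = []
    for label, churned, size in bins:
        running += churned
        rows.append((label, churned / size, running / total_churned))
    return rows


rows = lift_rows(BINS)
for label, rate, cumulative in rows:
    print(f"{label:12s} churn rate {rate:6.1%}   cumulative share {cumulative:6.1%}")
```

Under this reading, the highest-confidence bin alone already contains about half of all churners, which is what makes it the natural first target for a retention campaign.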

Page 45: RM World 2014: Prescriptive analytics with RapidMiner

Real-world marketing has complex constraints

Objective

- Optimize campaign for largest total account balance

Decision variables

- RapidMiner predicted 50k churners
- Decide best customers to target
- ≈ 10^15,000 possible solutions

Constraints

- Budget to target 5k customers
- Target average > 30 transactions/year
- Max of 50 targets in regions 2 – 4
- Balance gender (1/3 – 2/3 female)
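One plausible way to write this campaign as a binary linear program is sketched below. The notation is introduced here for illustration and is not from the slides: $x_i \in \{0,1\}$ says whether predicted churner $i$ is targeted, $b_i$ is the account balance, $t_i$ the transactions per year, $r_i$ the region code, and $f_i = 1$ if the customer is female. The average-transactions requirement is not linear as stated, so it is multiplied through by the number of targets, and the strict ">" is relaxed to "≥".

```latex
\begin{align*}
\max_{x}\quad & \sum_{i=1}^{n} b_i x_i
  && \text{total account balance, } n = 50{,}000 \\
\text{s.t.}\quad
  & \sum_{i} x_i \le 5000
  && \text{budget} \\
  & \sum_{i} (t_i - 30)\, x_i \ge 0
  && \text{average} \ge 30 \text{ transactions/year} \\
  & \sum_{i \,:\, r_i \in \{2,3,4\}} x_i \le 50
  && \text{regions 2 to 4} \\
  & \tfrac{1}{3} \sum_{i} x_i \;\le\; \sum_{i} f_i x_i \;\le\; \tfrac{2}{3} \sum_{i} x_i
  && \text{gender balance} \\
  & x_i \in \{0, 1\}
  && i = 1, \dots, n
\end{align*}
```

The size of the search space follows from the binary variables: $2^{50{,}000} \approx 10^{15{,}051}$, in line with the slide's figure of roughly $10^{15{,}000}$ possible solutions.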

Page 46: RM World 2014: Prescriptive analytics with RapidMiner

Cloud computing solves big data optimization

[Figure: solver logos: SYMPHONY, MOSEK, XpressMP, MINTO, scip, Gurobi]

[Process diagram: Main Process with the operators prep Input, Multiply, prep Object..., prep Constr..., associate c..., and Cloud-based NEOS MILP solver (ports Dec, Con, Exa, Exa)]

Page 47: RM World 2014: Prescriptive analytics with RapidMiner

Success: Cloud optimized our retention campaign

Constraints:

- Budget ≤ 5000
- Avg trans/year ≥ 30
- ≤ 50 in regions 2 – 4

[Figure: plot of the optimized campaign; points jittered]

Page 48: RM World 2014: Prescriptive analytics with RapidMiner

Here’s the Prescriptive Analytics Roadmap

- Integrate multiple LP and MILP solvers
    - Open source solvers with appropriate licensing
    - Commercial solvers
- Integrate cloud-based solvers
- Integrate heuristic bin packing methods
- Ship to select customers in 2015 Q1
- Provide domain-specific RapidMiner Wizards
- Provide simulation

Page 49: RM World 2014: Prescriptive analytics with RapidMiner

Let’s summarize Prescriptive Analytics

- Prescriptive analytics produces large financial gains
- RapidMiner fills a huge product gap

  [Figure: gap between limited tools and Operations Researcher tools]

- Cloud computing solves large optimization challenges