Technical introduction to Business Analytics and optimization. This is part 2. Part 1 can be found here: http://www.slideshare.net/rfchong/business-analytics-and-optimization-introduction
Business Analytics and Optimization: A Technical Introduction (Part 2)
Oleksandr Romanko, Ph.D.
Senior Research Analyst, Risk Analytics – Business Analytics, IBM
Adjunct Professor, University of Toronto
Toronto SMAC Meetup, October 9, 2014
© 2014 IBM Corporation
Business Analytics
Predictive Analytics What will happen?
Descriptive Analytics What has happened?
Prescriptive Analytics What should we do?
What is analytics?
Data Insight Action
Decide Analyze
Business Value
Analytics is the scientific process of deriving insights from
data in order to make decisions
Business Analytics Education
IBM Academic Initiative program
Cognos SPSS ILOG
Business Analytics programs – curriculum
Applied Statistics and Probability
Fundamentals of Computational Mathematics
Data Mining and Knowledge Discovery
Simulation Modelling
Optimization
Financial Decision Making
Computational Methods for Business Data Analysis
Computational Finance and Risk Management
Visual Analytics and Knowledge Representation
Mathematical Modelling for Business
Machine Learning, Cognitive Computing and Artificial Intelligence
Marketing Analytics
Strategies for Managing Innovations
Analytics of Web, Social Networks and Business News
Applied Statistics
What kind of data are we dealing with?
Types of data
• Quantitative
• Categorical (ordered, unordered)
Data collection
• Independent observations (one observation per subject)
• Dependent observations (repeated observations of the same subject, relationships within groups, relationships over time or space)
Type of data drives the direction of your analysis
• How to plot
• How to summarize
• How to draw inferences and conclusions
• How to issue predictions
Quantitative data
Examples: temperature, age, income
Quick check: “Does it make sense to calculate an average?”
Appropriate summary statistics:
– Mean and Median
– Standard Deviation
– Percentiles
More advanced predictive methods: Regression, Time Series Analysis, …
Plot your data!
Summarizing quantitative data
One-number summaries
– Mean
Average, obtained by summing all observations and dividing by the number of obs.
– Median
The center value, below and above which you will find 50% of the observations.
Summarizing your data with one number may not tell the whole story:
Figure: three example data sets with Median = 19.8, Median = 19.8 and Median = 10.5
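As a small illustration of why a one-number summary may not tell the whole story, the sketch below compares two data sets that share the same mean and median but are spread out very differently. The values are hypothetical and chosen only for illustration; they are not the data behind the slide's charts.

```python
from statistics import mean, median, pstdev

# Hypothetical data sets (assumption, not from the slides):
# identical mean and median, very different spread.
tight = [18, 19, 20, 21, 22]
wide = [2, 11, 20, 29, 38]

for name, data in (("tight", tight), ("wide", wide)):
    # pstdev is the population standard deviation
    print(name, "mean =", mean(data), "median =", median(data),
          "std dev =", round(pstdev(data), 2))
```

Both sets report a centre of 20, so only a measure of spread (or a plot) reveals the difference.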
Standard deviation
“Most observations fall within ±2 standard deviations of the mean.”
If the data is normally distributed, about 95% of observations fall within ±2 standard deviations of the mean.
Standard Deviation = 4.2
~95% of observations between 11.4 and 28.2
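The interval on this slide can be reproduced with a few lines of arithmetic. The sketch below assumes a mean of 19.8, which is inferred here as the midpoint of 11.4 and 28.2 rather than stated on the slide, and uses Python's standard `statistics.NormalDist` to check the coverage of a ±2 standard deviation band.

```python
from statistics import NormalDist

mean_value = 19.8   # assumption: midpoint of the interval shown on the slide
std_dev = 4.2       # from the slide

lower = mean_value - 2 * std_dev
upper = mean_value + 2 * std_dev

# Share of a normal distribution that lies within +/- 2 standard deviations
dist = NormalDist(mu=mean_value, sigma=std_dev)
coverage = dist.cdf(upper) - dist.cdf(lower)

print(round(lower, 1), round(upper, 1), round(coverage, 4))
```

The exact coverage of ±2 standard deviations is slightly above 95%; the 95% figure is the usual rounded rule of thumb.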
Distributions: Normal distribution
Distributions
Estimate of the probability distribution of global mean temperature resulting
from a doubling of CO2 relative to its pre-industrial value, made from
100000 simulations
Modeling
Models
Simplified representation or abstraction of reality
Capture essence of system without unnecessary details
Models tailored for specific types of problems
Models help us understand the world
– Prediction (What if?)
– Optimization (What’s best?)
Models are often much easier, faster, and cheaper to experiment with than the real system
Models and reality
Problem
Decisions
Model
Interpretation
Calculations
From Monahan, G., “Management Decision Making”, Cambridge University Press, 2000
“Real” World
Analyst's World
Simplified abstraction of reality
Capture essence of problem
Environmental risk management
Predictive maintenance
Wind turbines are big and expensive machines, so keeping them running smoothly helps keep their operational cost down. The sensor data generated by the turbine can help achieve this: by analysing it, you can spot potential failures earlier. The longer the warning period before a part fails, the better you can prepare for it.
To do that, you need to be able to anticipate failures in heavy and expensive parts like the gearbox, generator and main shaft.
Preventive maintenance saves money:
• Shorter downtime and less lost production
• Better planning of people and materials
• Cheaper repairs
Source: Algoritmica, http://www.algoritmica.nl
Predictive maintenance – how it works
Wind turbines have an array of sensors that measure temperatures, pressures, voltages,
currents, and blade angles. This data is available for analysis, typically as 10-minute
averages of the sensor values.
The computer that controls the turbine uses these measurements for its operations. This
includes error thresholds like ‘the gearbox oil temperature should be below 120 degrees
Celsius’. However, by the time the threshold is exceeded it is usually too late: the damage
has already been done. To catch failures earlier we should look for anomalies, i.e.
measurements that are unexpected and therefore might indicate a problem – but are not
yet so severe that they exceed a threshold.
Source: Algoritmica, http://www.algoritmica.nl
Predictive maintenance – anomaly detection
Anomaly detection begins by defining what measurement values are expected and then
calculating the difference with the actual situation. Since sensor data is delivered as a time
series, we create a model that predicts the next value of a specific sensor given its
previous values as well as the previous values of any other sensors that may be relevant.
Based on these multiple inputs, the model then calculates its predicted value and
compares it with the actual sensor reading. The difference (or residual) is now a measure
of how much the turbine is deviating from its expected performance. If it is persistent or
grows too large (i.e. becomes an anomaly), an analyst can investigate the cause and
decide on a course of action together with the operations staff at the wind farm.
Source: Algoritmica, http://www.algoritmica.nl
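The residual idea described above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the predictor is a simple moving average of the sensor's own recent readings, standing in for the multi-sensor model described in the text, and the readings, `tolerance`, and `persistence` values are hypothetical.

```python
def moving_average_forecast(history, window=3):
    """Stand-in predictor (assumption): mean of the last `window` readings."""
    recent = history[-window:]
    return sum(recent) / len(recent)


def find_anomalies(readings, window=3, tolerance=5.0, persistence=2):
    """Return indices where the absolute residual has exceeded `tolerance`
    for at least `persistence` consecutive readings."""
    anomalies = []
    run = 0  # length of the current streak of large residuals
    for i in range(window, len(readings)):
        predicted = moving_average_forecast(readings[:i], window)
        residual = readings[i] - predicted
        run = run + 1 if abs(residual) > tolerance else 0
        if run >= persistence:
            anomalies.append(i)
    return anomalies


# Hypothetical 10-minute average gearbox oil temperatures (degrees Celsius)
temps = [60, 61, 60, 62, 61, 60, 70, 80, 90, 91]
print(find_anomalies(temps))  # [7, 8, 9]
```

Every reading here stays far below a 120 degree threshold, yet the persistent residuals flag the upward drift several readings in a row, which is the early warning the approach is after.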
Predictive maintenance – machine learning model
To create such a sensor model we apply machine learning or data mining, i.e. one or
more algorithms that use a set of examples (the ‘training set’) to learn a predictive model.
For a wind turbine, it is a natural fit to use a year of sensor data as the training set so that
all seasonal variations are included.
Source: Algoritmica, http://www.algoritmica.nl
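To make the notion of learning from a training set concrete, here is a minimal sketch that fits a straight line by ordinary least squares and then computes a residual for a new reading. Linear regression is only one possible learning algorithm and is used here as an assumption; the sensor pairing (power output vs. gearbox oil temperature) and all numbers are hypothetical, and a real training set would cover about a year of data.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical training set: power output (kW) and oil temperature (deg C)
train_power = [200, 400, 600, 800, 1000]
train_temp = [45, 50, 55, 60, 65]
intercept, slope = fit_line(train_power, train_temp)


def predict_temp(power):
    """Expected oil temperature for a given power output."""
    return intercept + slope * power


# A new reading that runs hotter than the learned relationship predicts
new_power, new_temp = 700, 66
residual = new_temp - predict_temp(new_power)
print(intercept, slope, residual)
```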
Predictive maintenance – driven by data
This is a data-driven approach: the model learns the relationship between the various
sensor readings purely based on the training data. This is in contrast to a so-called
physical model that explicitly describes the turbine design using detailed knowledge of its
physical characteristics.
The main advantage of a data-driven approach is that the model can be trained by someone who is not a turbine expert and matches the actual situation by definition, whereas a physical model has to be carefully calibrated by an expert.
Source: Algoritmica, http://www.algoritmica.nl
Simulation – Business Case Study
Study environmental impact of restaurant operations
Restaurant
order types and probabilities
processing times (fixed portion and variable portion)
design alternatives
Drive Through
number of service windows
queuing capacity
Parking Lot
parking capacity
customer prioritization
Goals:
maximize customer satisfaction (high customer service level)
minimize environmental impact (quantity of emissions)
Case study – optimal store design
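A heavily simplified sketch of how such a store could be simulated is shown below. It models only a single drive-through window with first-come-first-served service; the exponential arrival and service times, the parameter values, the 12-minute service target, and the use of time spent in the lane as a proxy for emissions are all assumptions made for illustration, not details of the study.

```python
import random


def simulate_drive_through(n_customers, mean_interarrival, mean_service, seed=0):
    """Single-window, first-come-first-served drive-through lane.
    Returns the minutes each car spends in the lane (waiting plus service)."""
    rng = random.Random(seed)
    arrival = 0.0         # arrival time of the current car
    window_free_at = 0.0  # time at which the window next becomes free
    times_in_lane = []
    for _ in range(n_customers):
        arrival += rng.expovariate(1.0 / mean_interarrival)
        start = max(arrival, window_free_at)
        window_free_at = start + rng.expovariate(1.0 / mean_service)
        times_in_lane.append(window_free_at - arrival)
    return times_in_lane


times = simulate_drive_through(1000, mean_interarrival=2.0, mean_service=1.5)

# Share of customers served within the (assumed) 12-minute target
service_level = sum(t < 12 for t in times) / len(times)
# Total minutes of engine idling in the lane, a rough proxy for emissions
idling_minutes = sum(times)
print(round(service_level, 3), round(idling_minutes, 1))
```

Re-running the function with different window counts, capacities, or order mixes would be the way to compare design alternatives on both goals.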
Problem description
Restaurant operations
Problems with the standard design
Most of the variable portion of the emissions is generated at the drive through lane
Customers should be encouraged to park their cars and enter the restaurant
Drive through customers should be served as fast as possible
Results – key indicators
less than 12 minutes
waiting for more than 1 minute to enter
Simulation results
Figure: Emissions vs. Customer Satisfaction, Data Points (x-axis: Emissions (kg/week), 35 to 95; y-axis: Customer satisfaction (%), 94 to 100)
Figure: Emissions vs. Customer Satisfaction, Data Points and Efficient Frontier (same axes)
Comparing 72 alternatives:
– Limiting drive through to coffee/bakery orders
– Pull-off space for large drive through orders
– 2 or 3 service windows in drive through
– Customer prioritization: inside, outside or equal
– Varying queuing/parking capacity
Figure: Customer Prioritization (same axes; series: Outside, Equal, Inside)
Figure: Drive Through 2- and 3-Window Design (same axes; series: 3-Window Design, 2-Window Design)
Figure: Pull-Off Space (same axes; series: Disabled, Enabled)
Figure: Parking Capacity (same axes; series: Capacity #1, Capacity #2, Capacity #3, Capacity #4)
Figure: Drive Through Food Variety (same axes; series: Drive through limited to coffee/baked goods, Drive through serving everything)
Chart annotations: yes; no; 3; outside; layout #4 (6/19)
Results - alternatives
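The efficient frontier named in the chart titles is the set of alternatives that no other alternative beats on both emissions and customer satisfaction at once. A minimal sketch of how such a frontier can be extracted from simulated data points follows; the design points are hypothetical and are not the 72 alternatives from the study.

```python
def efficient_frontier(alternatives):
    """Keep the alternatives that are not dominated by any other.
    Each alternative is (emissions_kg_per_week, satisfaction_pct):
    lower emissions and higher satisfaction are both better."""
    frontier = []
    for i, (e_i, s_i) in enumerate(alternatives):
        dominated = any(
            e_j <= e_i and s_j >= s_i and (e_j < e_i or s_j > s_i)
            for j, (e_j, s_j) in enumerate(alternatives)
            if j != i
        )
        if not dominated:
            frontier.append((e_i, s_i))
    return sorted(frontier)


# Hypothetical design alternatives (assumption, for illustration only)
designs = [(40, 95.0), (50, 97.5), (60, 97.0), (70, 99.0), (90, 99.0), (45, 94.5)]
print(efficient_frontier(designs))  # [(40, 95.0), (50, 97.5), (70, 99.0)]
```

Any design off the frontier can be replaced by a frontier design that is at least as good on both measures, so only frontier points need to be weighed against each other.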
Additional extensions and policies
Make orders more expensive for drive through customers
– equivalent to introducing an emission sales tax; can be justified from the environmental point of view
Provide customers with information about expected waiting times and greenhouse gas emissions per vehicle for the drive through lane and for using the parking lot
– this information can be displayed on an illuminated indicator board (lighting panel) outside the restaurant
The “green” policy of the restaurant: make the drive through more efficient or encourage customers to use the parking lot instead
Recommendations
We recommend implementing the following design:
• Drive through limited to coffee and baked goods
• No pull-off space
• Separate pay and pickup windows at the drive through (3 service windows)
• Priority given to drive through customers (or equal priority if any difficulties are expected with prioritizing the outside customers)
• Any reasonable parking lot/drive through design would work (it depends more on the physical restrictions on the available space for the newly planned locations than on the other factors)
Implement our additional recommendations about staffing patterns and waiting area size as well as “green” policies
Data Mining
Data mining
Data mining application classes of problems
– Classification
– Clustering
– Regression
– Forecasting
– Others
Hypothesis or discovery driven
Iterative
Scalable
What is the difference between descriptive (BI) and predictive analytics?
Descriptive
John: Lives in Seattle, zip: 98109; 21 years old; iPhone 5; Plan: $98 a month; Talk: 400 minutes; Data: 1.9 GB; SMS: 370; Complaints: 0; Customer care calls: 1; Dropped calls: low
Mike: Lives in Atlanta, zip: 30308; 38 years old; Samsung Galaxy S3; Plan: $78 a month; Talk: 1200 minutes; Data: 0.2 GB; SMS: 8; Customer care calls: 6; Dropped calls: high
Predictive
John: Low churn risk
Mike: High churn risk
Classification
Classification is a supervised learning technique, which maps data into predefined classes or groups
Training set contains a set of records, where one of the attributes indicates the class
Modeling objective is to assign a class label to every record, using the values of the other attributes to predict the class
Data is divided into test / train, where “train” is used to build the model and “test” is used to validate the accuracy of classification
Typical techniques: Decision Trees, Neural Networks
Gender Age Lipstick
Female 21 Yes
Male 30 No
Female 14 No
Female 35 Yes
Male 17 No
Female 16 Yes
Customers
– Female
  – >=15 years: Yes
  – <15 years: No
– Male: No
Classification: Creating Model
Gender Age Lipstick
Female 21 Yes
Male 30 No
Female 14 No
Female 35 Yes
Male 17 No
Female 16 Yes
Classification Algorithms
Training Data
Trained Classifier
Purchased lipstick if Gender = Female and Age >= 15
Works with both interval and categorical variables
Classification: Applying Rules
Gender Age Lipstick
Female 27 ?
Male 55 ?
Female 47 ?
Male 39 ?
Female 27 ?
Male 19 ?
Gender Age Lipstick
Female 27 P Yes
Male 55 P No
Female 47 P Yes
Male 39 P No
Female 27 P Yes
Male 19 P No
Apply Scoring
If Gender = Female and Age >= 15 then Purchase lipstick = YES
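The scoring step can be written out directly. The sketch below encodes the rule from the slide as a function, checks it against the six training records, and then scores the six new customers. The function and variable names are illustrative; with only six labelled records the check is done on the training data itself, whereas in practice accuracy would be measured on a held-out test set.

```python
def predicts_lipstick(gender, age):
    """Rule from the slide: Gender = Female and Age >= 15."""
    return gender == "Female" and age >= 15


# Training records from the slides: (Gender, Age, Lipstick)
training = [
    ("Female", 21, "Yes"), ("Male", 30, "No"), ("Female", 14, "No"),
    ("Female", 35, "Yes"), ("Male", 17, "No"), ("Female", 16, "Yes"),
]
correct = sum(predicts_lipstick(g, a) == (label == "Yes") for g, a, label in training)
accuracy = correct / len(training)

# New customers to score: (Gender, Age)
to_score = [("Female", 27), ("Male", 55), ("Female", 47),
            ("Male", 39), ("Female", 27), ("Male", 19)]
scored = [(g, a, "Yes" if predicts_lipstick(g, a) else "No") for g, a in to_score]

print(accuracy)
for row in scored:
    print(*row)
```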
Decision (classification) Trees
A tree can be "learned" by splitting the source set into subsets based on an attribute value test
Tree partitions samples into mutually exclusive groups by selecting the best splitting attribute, one group for each terminal node
The process is repeated recursively for each derived subset, until a stopping criterion is reached
Works with both interval and categorical variables
No need to normalize the data
Intuitive if-then rules are easy to extract and apply
Best applied to binary outcomes
Decision trees can be used to support multiple modeling objectives
o Customer segmentation
o Investment / portfolio decisions
o Issuing a credit card or loan
o Medical patient / disease classification
Customers
– Female
  – >=15 years: Yes
  – <15 years: No
– Male: No
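One way to pick the best splitting attribute is to compare how pure the resulting groups are. The sketch below uses Gini impurity, which is a common criterion but an assumption here, since the slides do not say which measure is used. Applied to the six lipstick records, it compares a split on gender with a split on age at 15.

```python
def gini(labels):
    """Gini impurity of a list of 'Yes'/'No' labels (0 means pure)."""
    if not labels:
        return 0.0
    p_yes = labels.count("Yes") / len(labels)
    return 1.0 - p_yes ** 2 - (1.0 - p_yes) ** 2


def weighted_gini(groups):
    """Impurity of a split: group impurities weighted by group size."""
    total = sum(len(g) for g in groups)
    return sum(len(g) / total * gini(g) for g in groups)


# Records from the slides: (Gender, Age, Lipstick)
records = [
    ("Female", 21, "Yes"), ("Male", 30, "No"), ("Female", 14, "No"),
    ("Female", 35, "Yes"), ("Male", 17, "No"), ("Female", 16, "Yes"),
]

by_gender = [
    [label for g, _, label in records if g == "Female"],
    [label for g, _, label in records if g == "Male"],
]
by_age = [
    [label for _, a, label in records if a >= 15],
    [label for _, a, label in records if a < 15],
]

print(round(weighted_gini(by_gender), 3), round(weighted_gini(by_age), 3))
```

Gender gives the lower impurity, so it is chosen first; repeating the comparison within the Female group then leads to the age split, matching the tree shown on the slide.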
Cluster Analysis (segmentation)
Unsupervised learning algorithm
o Unlabeled data and no “target” variable
Frequently used for segmentation (to identify natural groupings of customers)
o Market segmentation, customer segmentation
Most cluster analysis methods involve the use of a distance measure to calculate
the closeness between pairs of items
o Data points in one cluster are more similar to one another
o Data points in separate clusters are less similar to one another
Figure: scatter plot with axes Spend and Income, showing Cluster #1, Cluster #2 and Cluster #3
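A minimal sketch of distance-based clustering is given below, using a basic k-means loop with Euclidean distance. The customer data, the choice of three clusters, and the starting centroids are hypothetical; real implementations also handle random initialisation and convergence checks, which are left out here.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Basic k-means: assign each point to its nearest centroid, then move
    each centroid to the mean of its points. Returns the final centroids
    and the clusters from the last assignment step."""
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda k: math.dist(p, centroids[k]))
            clusters[nearest].append(p)
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[k]   # keep an empty cluster's centroid
            for k, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Hypothetical customers as (income, spend), both in thousands
customers = [(20, 5), (22, 6), (25, 7),
             (60, 20), (62, 22), (65, 21),
             (100, 60), (105, 65), (98, 58)]
start = [(20, 5), (60, 20), (100, 60)]   # assumed starting centroids

centroids, clusters = kmeans(customers, start)
for centre, members in zip(centroids, clusters):
    print(centre, members)
```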
K-means clustering
Clustering: LinkedIn
Optimization
Optimization
Optimization problem
Examples:
– Minimize cost
– Maximize profit
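A generic way to write an optimization problem is sketched below in textbook form, since the slide's own formula is not part of the extracted text. The symbols are assumptions: f is the objective (for example cost), the g_i are constraints, and X is the set of allowed decisions. Maximizing profit fits the same form by minimizing its negative.

```latex
\begin{aligned}
\min_{x} \quad & f(x) \\
\text{subject to} \quad & g_i(x) \le 0, \quad i = 1, \dots, m \\
& x \in X
\end{aligned}
```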
Shortest path or most beautiful path?
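Collateral optimization – problem setup
Figure: diagram linking collateral assets to requirements. Assets: 200 M1, 100 M2, 500 M3, 600 M4; asset types: Cash, USD; Debt, USD; Cash, EUR; Debt, EUR. Coefficients shown: .85, 1, .80, 1.05. Variable labels: x1, x8. Requirements: 200 R1, 550 R2, 300 R3; eligibility labels: Only cash, Any, Only EUR. Additional label: 200 +.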
Collateral optimization – optimal cost = 985
Figure: diagram of assets (200 M1, 100 M2, 500 M3, 600 M4) and requirements (200 R1, 550 R2, 300 R3) with coefficients .85, 1, .80, 1.05 and eligibility labels Only cash, Any, Only EUR. Allocation values shown on the diagram: 100, 0, 0, 100, 415, 600.
Collateral optimization – concentration constraints
Figure: diagram of assets (200 M1, 100 M2, 500 M3, 600 M4) and requirements (200 R1, 550 R2, 300 R3) with coefficients .85, 1, .80, 1.05, variable labels x1, x8, and eligibility labels Only cash, Any, Only EUR.
At most 50% EUR in total
Multi-objective optimization
Multi-objective optimization: simultaneously optimizing two or more conflicting objectives subject to certain constraints
Examples:
– Finance: Minimize risk & Maximize return
– Business: Minimize cost & Minimize environmental impact
– Health care: Maximize X-ray dose to tumor & Minimize X-ray dose to healthy tissues
Units of the objectives are typically not the same: dollars, probability, units of time, …
Multi-objective optimization
Solving multi-objective optimization problems:
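One common approach, sketched here as an assumption because the slide's own content is not in the extracted text, is weighted-sum scalarization: rescale each objective to a common 0 to 1 range so that different units become comparable, combine them with weights, and minimize the single resulting score. The alternatives and weights below are hypothetical.

```python
def weighted_sum_choice(alternatives, weight_cost):
    """Pick the alternative with the lowest weighted score.
    alternatives: list of (name, cost, impact), both to be minimized.
    weight_cost: importance of cost in [0, 1]; impact gets 1 - weight_cost."""
    costs = [c for _, c, _ in alternatives]
    impacts = [i for _, _, i in alternatives]

    def rescale(value, values):
        # Map to [0, 1] so objectives in different units can be combined
        return (value - min(values)) / (max(values) - min(values))

    def score(alt):
        _, cost, impact = alt
        return (weight_cost * rescale(cost, costs)
                + (1 - weight_cost) * rescale(impact, impacts))

    return min(alternatives, key=score)[0]


# Hypothetical options: (name, cost in dollars, environmental impact in kg)
options = [("A", 100, 90), ("B", 150, 50), ("C", 250, 40)]

for w in (0.9, 0.5, 0.1):
    print(w, weighted_sum_choice(options, w))
```

Sweeping the weight from 1 to 0 moves the choice from the cheapest option to the cleanest one, which is how the trade-off between the objectives becomes visible.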
Visual Analytics
Visual analytics
Visual statistics of the Napoleon Campaign: the Minard Map
Visual analytics – portfolio
Historical visualization
Activity Histogram: Distribution of events over time
Heat Map: How long objects spent in different places
Track Summary: Show tracks of all objects returned from search
Visual analytics
http://www.nytimes.com/2011/11/06/opinion/sunday/population-control-marauder-style.html
• cause (vertical location)
• historical time (horizontal location)
• duration (equator)
• number of deaths (circle size)
• continent (color)
• rank, cause, number of deaths (text)
Visualization types
Visualization formatting
Watson Analytics
Natural language dialogue
Cloud-based agility
Data discovery
Quick start intuitive interface
Mobile-ready
Watson Analytics
Unified analytics experience
Visual storytelling
Intelligent automation
Data access and refinement
Report and dashboard creation
Integrated social business
Guided analytic discovery
Analytics Software
Software for analytics
Lavastorm survey of analytics tools
Source: R. Muenchen "The Popularity of Data Analysis Software", http://r4stats.com/articles/popularity/
Software for analytics
Gartner “Magic Quadrant” plot of companies that sell advanced analytics software (2014)
Source: R. Muenchen "The Popularity of Data Analysis Software", http://r4stats.com/articles/popularity/
Questions?