A Presentation from Big Data
22 February 2013
Big Data Analytics: avoiding the pitfalls with robust analytics
All copyright owned by The Future Place and the presenters of the material. For more information about NewMR events visit NewMR.org
Steve Cohen In4mation insights
Steve Cohen, in4mation insights, Boston, MA USA Big Data, 22 February 2013
Big Data Analytics: avoiding the pitfalls
Steve Cohen Partner, in4mation insights
[email protected] www.in4ins.com
Agenda
• What Big Data is NOT
• The danger of Big Data
• New methods for Big Data
• Robust analytics for deep dives on Big Data
Harness Big Data Big Value
1. Cut time to market and improve quality
2. Quantify variability and improve performance
3. Segment to customize action
4. Improve decision making and minimize risk
5. Create new products and services
Source: McKinsey Global Institute Report (May 2011)
Big Data is driving the demand for skilled problem solvers
What is Big Data?
Three V’s of Big Data
Volume, Velocity, Variety
Source: Doug Laney, Gartner
Solving the Big Data Problem
Machines
Source: UC Berkeley AMP Lab & McKinsey
Where is all of the buzz?
Dominated by H & H?
The Long Tail
Figure: SALES (vertical axis) by PRODUCTS (horizontal axis)
Variability
The fourth V
Apophenia
Some hints
“Nothing is so alien to the human mind as the idea of randomness.”
John Cohen
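Apophenia, seeing patterns in what is really noise, becomes more likely as the number of variables grows. The sketch below is an illustration only and is not taken from the presentation: it correlates one random target with many random candidate predictors and reports the strongest correlation found. The function names, the sample sizes, and the seed are all assumptions chosen for the example.

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def strongest_spurious_correlation(n_obs=20, n_candidates=200, seed=0):
    """Largest |r| between a random target and many unrelated random predictors."""
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_obs)]
    best = 0.0
    for _ in range(n_candidates):
        # Each candidate is pure noise, generated independently of the target.
        noise = [rng.gauss(0, 1) for _ in range(n_obs)]
        best = max(best, abs(pearson(target, noise)))
    return best


best = strongest_spurious_correlation()
print(f"strongest |r| among 200 pure-noise predictors: {best:.2f}")
```

None of the candidates is related to the target, yet searching across enough of them reliably turns up one that looks like a real relationship.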
Statistics is sexy!
“The sexy job in the next ten years will be statisticians … The ability to take data—to be able to understand it, to process it, to extract value from it, to visualize it, to communicate it—that’s going to be a hugely important skill.”
Hal Varian, chief economist at Google
No more samples
“I’m talking about the notion of ‘whole-population analytics’ against the entire population of data, rather than just the traditional capacity-constrained samples/subsets.”
James Kobielus, IBM
What skills are needed for Big Data?
Bayesian statistical models facilitate micro-marketing.
Discover and quantify all sources of variability in market response or in customer behavior at the level of the individual SKU or the individual consumer.
Bayesian statistics ≠ Bayesian networks
Hierarchical Bayesian statistics
• Complex systems of linear or nonlinear equations
• Often no analytic solution
• Monte Carlo simulation
• Predict quantitative or qualitative outcomes
• Incorporate sensible prior beliefs or knowledge
• Different coefficient for each unit of analysis at the “lower” level
• “Upper” level = “why behind the what”
• “Borrow” information across units when data are sparse
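The idea of "borrowing" when data are sparse can be sketched with the simplest hierarchical case. The example below is not the presenter's model: it assumes a normal-normal setup with known variances, which does have a closed-form answer, whereas the models described above generally need Monte Carlo simulation. The function name, the variance values, and the elasticity figures are hypothetical, and `group_mean` stands in for whatever the "upper" level would predict for a unit.

```python
def shrunken_mean(obs, group_mean, noise_var, between_var):
    """Posterior mean of one unit's coefficient under a normal-normal model.

    obs         -- the unit's own noisy measurements (lower level)
    group_mean  -- what the upper level predicts for this unit (assumed known)
    noise_var   -- variance of a single measurement (assumed known)
    between_var -- variance of true coefficients across units (assumed known)
    """
    n = len(obs)
    if n == 0:
        return group_mean  # no data at all: rely entirely on the upper level
    unit_mean = sum(obs) / n
    data_precision = n / noise_var
    prior_precision = 1.0 / between_var
    weight = data_precision / (data_precision + prior_precision)
    # Precision-weighted average of the unit's own mean and the group mean.
    return weight * unit_mean + (1.0 - weight) * group_mean


# Hypothetical price elasticities: one SKU with 2 readings, one with 50.
group_mean = -2.0
sparse_est = shrunken_mean([-0.5, -0.7], group_mean, noise_var=1.0, between_var=0.25)
dense_est = shrunken_mean([-0.6] * 50, group_mean, noise_var=1.0, between_var=0.25)
print(f"sparse SKU: {sparse_est:.3f}, dense SKU: {dense_est:.3f}")
```

Both SKUs have the same raw average, but the sparse one is pulled much further toward the group mean, while the dense one stays close to its own data.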
What could affect sales of SKUs in a store?
Lower Model
National TV
Local TV
Radio
Outdoor
Magazines
Newspapers
Social media activity
Website & search
Upper Model
Channel
Geography
Ingredients
Location at point of sale
Store size
Store age
Store format
Company vs. franchise
Demos of trading area
Lower Model
Base Price
Discounted Price
Feature
Display
Form
Size
Coupons
Seasonality
Holidays
Weather
Big Data in. Big Data out.
Over 1,700 stores × 208 weeks of data × ~3,000 SKUs = 1.06 billion sales numbers
Lower variables × number of SKUs = lower coefficients: 50 × 3,000 = 150,000
Lower variables × upper variables = upper coefficients: 50 × 100 = 5,000
At every iteration, from 1 to 5,000 (or more)!
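The counts on this slide can be checked with a few lines of arithmetic. The sketch below treats the rounded figures from the slide (1,700 stores, 208 weeks, 3,000 SKUs, 50 lower-level variables, 100 upper-level variables, 5,000 iterations) as exact. The total number of coefficient values across all iterations is a derived figure that does not appear on the slide, and the variable names are illustrative.

```python
# Rounded inputs as stated on the slide, treated as exact for this check.
stores, weeks, skus = 1_700, 208, 3_000
lower_vars, upper_vars = 50, 100
iterations = 5_000

# Data going in: one sales number per store, week, and SKU.
sales_numbers = stores * weeks * skus

# Coefficients coming out at each iteration.
lower_coefs = lower_vars * skus        # one set of lower coefficients per SKU
upper_coefs = lower_vars * upper_vars  # upper level explains each lower coefficient

# Derived (not on the slide): coefficient values across all iterations.
total_draws = (lower_coefs + upper_coefs) * iterations

print(f"sales numbers:      {sales_numbers:,}")
print(f"lower coefficients: {lower_coefs:,}")
print(f"upper coefficients: {upper_coefs:,}")
print(f"values over {iterations:,} iterations: {total_draws:,}")
```

Under these assumptions the output side is of the same order of magnitude as the input side, which is the point of "Big Data in. Big Data out."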
Why doesn't everyone use hierarchical Bayesian statistics on Big Data?
Average & base price across sizes and channels over time
Price elasticity across sizes and channels over time
Avoid Big Data pitfalls
• Danger in Big Data is Variability
• Avoid apophenia
• Use theory & statistics & avoid mindless data mining
• Full dataset analytics, not samples
• Hierarchical Bayesian statistics quantify variability and permit very deep dives on marketing elasticities
• Move Big Data analytics beyond a hardware and software solution to a change in business philosophy where decisions are data-driven
Q & A
Steve Cohen In4mation insights
Ray Poynter Vision Critical University
in4mation insights & Steve Cohen
In4mation insights
• Marketing analytics, research, and technology consulting firm
• Marketing Mix Modeling, Price/Promotion Optimization, advanced Choice models, Assortment Optimization, Consumer and Market Segmentation, and Customer Lifetime Value modeling
• Hierarchical Bayesian statistical models, parallel code written in C++, & high performance computation cluster applied to Big Data
Steve Cohen
• Winner 2010 AMA Parlin Award for lifetime achievement in marketing research
• Winner 2012 NextGen MR Award as Individual Disruptive Innovator
• First to conduct Choice-based Conjoint Analysis in USA (1983)
• Introduced Menu-based Conjoint Analysis for BYO tasks (2001)
• Won 3 awards for introducing Maximum Difference Scaling (2002)
Steve Cohen
office: 781-444-1237 x104
mobile: 617-510-2144
web: www.in4ins.com
LinkedIn: www.linkedin.com/in/stevenhcohen