fr.linkedin.com/in/paperon/
pierre.paperon@gmail.com
Big data in Apple
These slides present a small part of the work carried out for Apple a few years ago. One business process has been chosen to share knowledge in a structured way.
It was part of a broader engagement aimed at a radical improvement of salesforce performance (leading to iSales) and at the EMEA re-organization driven by the weight of the iPhone.
Big data was available, and datamining brought new perspectives on far more than salesforce performance:
Assessment of country potential as well as region or suburb
Link between sales of different products
Drivers and levers of POS (point of sales) growth
Impact on sales of advertising campaigns
Estimate of POS potential and identification of ideal mix of format to exploit area potential
The purpose of this document is to show that:
Datamining and big data are part of a global innovation game plan for any company
Each company has enough data to define and begin its own digital or big data journey
Exec summary
Big picture and big questions
Define a path to datamine
Generate first findings and expand
Build models and practice darwinism between models
Kill misleading beliefs
Transform findings into actions
Define a vision or global move path
Big picture for data crunching + resource allocation: from HQ to markets
AppStore
iPhone/Carriers
POS / APR (Major Accounts)
Retail
iTunes U (media, writers, …)
OnLine Store
iTunes
Apple Care
Apple Retail Store
Always begin big data work with a few questions, which may evolve along the way (hypothesis-driven approach, McKinsey style)
1. Can I model store sales and benchmark store performance fairly by taking competitors and specific environments into account?
2. How can we forecast the number of CPU in a town or geographical area (close to its potential) in order to allocate resources efficiently among them?
3. What is the nature of the relationship between CPU sales and iPhone sales at a micro (zip code) or macro (country) level?
4. Can I forecast the PC market in a country with a very simple set of variables (that is to say 2-3 at most)?
5. What is the optimal channel mix to capture the potential of a specific geographical area or to maximize my ROCE or ROI?
6. What is the optimal resource allocation between existing POS, new town coverage and new emerging countries?
“Begin Big data with big picture and big questions”
Find a way to structure data and collectively define one path to follow
Whatever kind of big data work you are doing or will do, find an elegant way to structure information or data, especially when you face the complexity brought by multiple:
Products or categories
Channels
Partners
Geos
Time frames and granularities
For the job done at Apple, a structure was sought early on to communicate, share insights, and define a path to explore and classify information.
The pyramid, ranking information from macro at the top down to micro at the bottom, offered a neat image for combining sources and exploring paths that had never been walked before.
A basement has been added, for example to begin predictive modeling of individual behavior or patterns.
Pyramid structure to classify predictive models and first R2 calculations
Pyramid levels, from top to bottom:
EMEIA
France (among 130 countries)
600 POS (e.g. for France)
6 000 zip codes (e.g. for France)
Data available:
Registered PCs, Online Sales, registered mobiles, population, GDP, income, kids ...
Training points, population in X km radius around POS, distance to competitors, product registrations ...
GDP, Internet, literacy, mobile subscribers, broadband penetration, roads, >$35k income, life expectancy, general physicians ...
Online Sales, mobile registered, PCs + mobile sales …
Predictive model of / Correlation rate (R2) / Key variables:
PC total market per country / 91% / GDP, Mobile
PC available market per country / 89% / Broadband access
iPhone pull effect on PC sales per week / 50 to 90% / Mobile sales
Units/POS (180 retail) / 77% / Mobile, Population, GDP, Training
Sales per zip code / 89% / Mobile, Population, GDP
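The R2 figures on this slide summarise how well a model built on a handful of variables reproduces observed values. As a minimal sketch of that calculation for a one-variable linear model, the Python below fits a least-squares line and computes R2. The function names and the sample numbers are illustrative assumptions; they are not Apple data, and the slides do not say which tools were actually used.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(x, y, slope, intercept):
    """Share of the variance of y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Made-up figures for illustration only: a driver variable (e.g. mobile sales)
# and the quantity to explain (e.g. PC sales) for five hypothetical areas.
driver = [10.0, 20.0, 30.0, 40.0, 50.0]
target = [4.0, 7.0, 9.0, 14.0, 16.0]

slope, intercept = fit_line(driver, target)
r2 = r_squared(driver, target, slope, intercept)
print(f"slope={slope:.3f} intercept={intercept:.3f} R2={r2:.3f}")
```

An R2 close to 1 only says the line fits the sample well; it says nothing about whether the driver causes the target.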
Correlation of 83% at a macro level (countries) between PC and mobiles…
(Chart axes: mobile sales, PC sales)
Correlation of 92% at POS level between PC and mobiles…
But let’s be cautious: a correlation rate is not causality. What is the nature of the relationship between the two variables? More specifically:
• What is the best predictor of PC (or CPU) sales?
• What is the predictive model to forecast sales?
Following the initial findings came an exhaustive gathering of the information available for each of the 6000 geographical units: 250 exogenous (external to Apple) variables and 50 endogenous variables
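With this many candidate variables, one simple screening step is to rank them by their squared correlation with the quantity to predict and keep the top few. The sketch below shows that idea in plain Python. The variable names and all values are hypothetical, and ranking by squared Pearson correlation is an assumed screening rule, not a method confirmed by the slides.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def rank_predictors(candidates, target, top_k=5):
    """Return the top_k (name, r2) pairs, highest squared correlation first."""
    scored = [(name, pearson_r(values, target) ** 2)
              for name, values in candidates.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top_k]


# Hypothetical values for six geographical units (not real data).
cpu_sales = [12.0, 30.0, 18.0, 45.0, 25.0, 60.0]
candidates = {
    "iphone_sales": [35.0, 92.0, 50.0, 140.0, 80.0, 175.0],
    "population": [20.0, 25.0, 40.0, 38.0, 22.0, 55.0],
    "distance_to_competitor": [5.0, 1.0, 4.0, 2.0, 6.0, 3.0],
}

ranked = rank_predictors(candidates, cpu_sales, top_k=2)
for name, r2 in ranked:
    print(f"{name}: R2 = {r2:.2f}")
```

A screen like this looks at one variable at a time, so it can miss variables that only matter in combination; it is a first filter before proper modeling.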
Mapping of data needed to do appropriate business intelligence
List all the statistical tools and skills available within your team, to cover as much ground as possible or push the envelope to its limit.
What is the best predictor of CPU sales at micro level (zip code)?
CPU info at zip code level    POS sales        On Line sales    Total
Registered CPU                158 000          95 000           253 000
Non registered CPU            294 000 (2/3)                     294 000
Total                         452 000          95 000           547 000
R2 with iPhone registered     POS sales        On Line sales
CPU                           96%              94%
R2 with population            POS sales        On Line sales
Registered CPU                74%              72%
iPhone and OnLine sales are unbeatable predictors of sales at a micro level = potential
France ‘09 formula: 1 PC = 3 iPhone
This makes it possible to benchmark existing coverage and reveals holes in coverage
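As a minimal sketch of how such a rule of thumb can be turned into a coverage benchmark, the Python below reads "1 PC = 3 iPhone" as three iPhones sold per PC of potential, then compares that implied potential with actual PC sales per area. This reading of the formula, the zip code labels and all the sales figures are assumptions made for illustration.

```python
IPHONES_PER_PC = 3.0  # the "France '09" rule of thumb: 1 PC = 3 iPhone


def pc_potential(iphone_sales, iphones_per_pc=IPHONES_PER_PC):
    """Estimated PC potential implied by iPhone sales in an area."""
    return iphone_sales / iphones_per_pc


def coverage_gaps(areas):
    """Return (zip_code, gap) pairs, largest gap first.

    gap = estimated potential - actual PC sales; a positive value
    suggests potential that current coverage does not capture.
    """
    gaps = [(zip_code, pc_potential(iphones) - pcs)
            for zip_code, (iphones, pcs) in areas.items()]
    return sorted(gaps, key=lambda item: item[1], reverse=True)


# Hypothetical zip codes with (iPhone sales, PC sales); not real data.
areas = {
    "zip_A": (900, 300),   # in line with the rule
    "zip_B": (1500, 200),  # PC sales well below the implied potential
    "zip_C": (600, 250),   # PC sales above the implied potential
}

gaps = coverage_gaps(areas)
for zip_code, gap in gaps:
    print(f"{zip_code}: gap = {gap:+.0f} PCs")
```

Areas at the top of the list are candidate "holes" worth a closer look; the ratio itself would need to be re-estimated per country and per year.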
Kill misleading beliefs. Example: there is a halo effect for iPhone - 1
Example of a formula for PC sales per zip code (linear simplification of a genetic algorithm model, for the average value)
Example of POS sales based on the presence of other POS (linear visualisation of a sigmoid function derived from neural network modeling)
(Chart axis: number of POS per zip code)
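The slide names a sigmoid function but this transcript does not carry its parameters or the chart itself. Purely to illustrate the shape of such a curve, the sketch below evaluates a logistic function against the number of POS. The parameter values and the choice of a logistic form are assumptions, not the model fitted for Apple.

```python
from math import exp


def sigmoid(x, ceiling, steepness, midpoint):
    """Logistic curve: rises slowly, accelerates, then saturates at `ceiling`."""
    return ceiling / (1.0 + exp(-steepness * (x - midpoint)))


# Hypothetical parameters chosen only to show the shape of the curve.
CEILING = 100.0   # saturation level of the response
STEEPNESS = 1.2   # how sharply the curve rises around the midpoint
MIDPOINT = 3.0    # number of POS at which half the ceiling is reached

curve = [sigmoid(n_pos, CEILING, STEEPNESS, MIDPOINT) for n_pos in range(8)]
for n_pos, value in enumerate(curve):
    print(n_pos, round(value, 1))
```

The useful property for coverage decisions is the saturation: beyond a certain number of POS, each extra POS adds less and less.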
Kill misleading beliefs. Example: there is a halo effect for iPhone - 2
The halo effect was a static vision of the impact of iPhone on other products.
Obviously, something was going on between the two products.
Some specific modeling led to key findings:
Yes, iPhone had an impact, by opening the market and building brand equity and product value.
So, far more interesting than a halo effect, it was a towing effect: iPhone provided tremendous traction to all other products and increased sales potential.
Findings in actions: Identification of Zip Codes with highest untapped potential
Use of the ratio of X iPhone for 1 CPU to define the highest-potential zip codes in France
4500 zip codes plotted
The asymptote is 1.5; it is reached in many zip codes, more often with POS > 4
French potential for CPU is x millions, but it moves with iPhone penetration growth
A utility function of the cost of coverage has to be plugged in to find cut-off points
iPhone POS and sales per zip code are critical to balance resource allocation
(Chart: iPhone sold / CPU sold off-line per zip code, against number of POS per zip code, with the asymptote marked)
Labelled points on the chart:
Angers 2.6, OnLine = 15%
Strasbourg 1.5, OnLine = 13%
Toulouse 1.9, OnLine = 13%
Montpellier 1.7, OnLine = 6%
Gennevilliers 5.2, OnLine = 14%
Pau 5.3, OnLine = 40%
Concarneau 6, OnLine = 40%
Lannion 9, OnLine = 55%
Paris 17 2.3, OnLine = 16%
Nancy 2, OnLine = 13%
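One way to read this chart is that a zip code whose iPhone/CPU ratio sits above the 1.5 asymptote still has off-line CPU sales to gain. The sketch below ranks the labelled towns on that basis, using the values printed beside each town and taking them to be the iPhone/CPU ratio. The uplift formula is an assumption made for illustration (iPhone sales held constant, ratio brought down to the asymptote); it ignores the OnLine share and the cost of coverage mentioned on the slide.

```python
ASYMPTOTE = 1.5  # iPhone sold per off-line CPU sold, as quoted on the slide

# Values printed beside each labelled town on the chart.
ratios = {
    "Angers": 2.6, "Strasbourg": 1.5, "Toulouse": 1.9, "Montpellier": 1.7,
    "Gennevilliers": 5.2, "Pau": 5.3, "Concarneau": 6.0, "Lannion": 9.0,
    "Paris 17": 2.3, "Nancy": 2.0,
}


def cpu_uplift_factor(ratio, asymptote=ASYMPTOTE):
    """Factor by which off-line CPU sales would grow if the area's
    iPhone/CPU ratio came down to the asymptote with iPhone sales unchanged.
    This reading of the chart is an assumption, not stated in the source."""
    return max(ratio / asymptote, 1.0)


ranking = sorted(ratios, key=lambda city: cpu_uplift_factor(ratios[city]),
                 reverse=True)
for city in ranking:
    print(f"{city}: x{cpu_uplift_factor(ratios[city]):.1f}")
```

Towns with a high OnLine share, such as Lannion, may look under-served off-line partly because buyers already order on line, so the ranking is a starting point for investigation rather than a verdict.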
Ranking of most attractive zip codes for CPU and extra coverage opportunities
*red bar means no POS
How is big data connected with salesforce performance?
How to improve Account Manager performance?
Use few IT systems, leveraging Corp. tools and support
Assess account potentials for better coverage and the right allocation of resources
Be world-class in Account Management for clients (new + current)
Get better forecasts (weekly, M, Q, H and Y) to free up time
Client 3.0 initiative + incentives
Predictive analytics
Definition and making of a single SalesForce automation tool for the whole company
New predictive tools available
Which dynamic path to extract the most actionable and profitable levers from data?
(Chart: collective datamining skill against time, with the current EMEIA position marked)
Key activities by stage:
Data: data collection and warehousing; cleaning of data; security management; choice of updating frequency
Analysis and visualization: 2-3 dimensional cuts on charts; geo mapping; benchmarking amongst formats or geos; broaden set of users
Static modeling: simple statistics with correlation rates to identify 4-5 key variables; simple linear regression models; portfolio cinematic view; field testing with relevant managers
Predictive modeling: advanced stats with non-linear modeling; models in competition with each other; broad business perspective (town against country against levers); integration in Account management tools
Global and real time modeling: real time improvement of models; generation of new models; direct link with EDW; global/WW business perspective; apply to all services (e.g. HR for 6 month ahead needs)
Outputs: Main business levers; Link between products; Potential estimates; POS performance; SAMI impact; Training needs
Outputs: New coverage needs; Optimal mix of POS; Forecast weekly/quart.; Efficiency of allocations; KPI and reco in SFA; “What if” functions; Individual buyer model; Cross-selling directions
Outputs: Resource optimization between BUs; Identification of lead countries; Collective sharing of findings and best practices; Cross-selling opportunities; Better accuracy due to meta-models; Live and new models
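The deck speaks of models in competition and of "darwinism between models" without describing the selection rule. One common way to run such a competition, shown here as a sketch under that assumption, is to score every candidate on observations it was not built on and keep the one with the lowest error. The candidate models, their names and the held-out figures are all hypothetical.

```python
def mean_abs_error(model, rows):
    """Average absolute gap between predictions and observed values."""
    return sum(abs(model(x) - y) for x, y in rows) / len(rows)


def select_fittest(models, holdout_rows):
    """Score every candidate on held-out data and return (name, error)
    for the one with the lowest error."""
    scores = {name: mean_abs_error(fn, holdout_rows)
              for name, fn in models.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]


# Hypothetical candidates predicting PC sales from iPhone sales.
models = {
    "one_pc_per_3_iphones": lambda iphones: iphones / 3.0,
    "one_pc_per_2_iphones": lambda iphones: iphones / 2.0,
    "flat_average": lambda iphones: 250.0,
}

# Made-up held-out observations: (iPhone sales, PC sales).
holdout = [(600, 210), (900, 290), (1200, 410), (300, 95)]

name, error = select_fittest(models, holdout)
print(f"fittest model: {name} (mean absolute error {error:.1f})")
```

Re-running the selection whenever new data arrives lets a better model displace the incumbent, which is one plausible reading of the real time improvement of models listed in the last stage.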