Big data in Apple- initiation of a long long journey

Page 1: Big data in Apple- initiation of a long long journey

fr.linkedin.com/in/paperon/ [email protected]

Big data in Apple

Page 2: Big data in Apple- initiation of a long long journey

These slides present a tiny part of the work done for Apple a few years ago. A business process has been chosen to share knowledge in a structured way.

It was part of a broader engagement aimed at radically improving salesforce performance (leading to iSales) and at the EMEA re-org driven by the weight of the iPhone.

Obviously, big data was available, and datamining brought new perspectives on far more than salesforce performance:

Assessment of country potential as well as region or suburb

Link between sales of different products

Drivers and levers of POS (point of sales) growth

Impact on sales of advertising campaigns

Estimate of POS potential and identification of ideal mix of format to exploit area potential

The purpose of this document is to show that:

Datamining and big data are parts of a global innovation game plan for any company

Every company has enough data to define and begin its own digital or big data journey

Exec summary

Big picture and big questions

Define a path to datamine

Generate first findings and expand

Build models and practice darwinism between models

Kill misleading beliefs

Transform findings into actions

Define a vision or global move path

Page 3: Big data in Apple- initiation of a long long journey

Big picture for data crunching + resource allocation: from HQ to markets

[Diagram labels: AppStore; iPhone/Carriers; POS; APR; (Major Accounts); Retail; iTuneU (media, writers, …); OnLine Store; iTune; Apple Care; Apple Retail Store]

Page 4: Big data in Apple- initiation of a long long journey

Always begin big data work with a few questions that may evolve during the work (hypothesis-driven approach / McKinsey)

1. Can I model store sales and benchmark store performance fairly, taking into account competitors and specific environments?

2. How can we forecast the number of CPU in a town or geographical area (close to potential) in order to allocate resources efficiently among them?

3. What is the nature of the relationship between CPU sales and iPhone sales at a micro (zip code) or macro (country) level?

4. Can I forecast the PC market in a country using a very simple set of variables (that is to say, 2-3 max)?

5. What is the optimal channel mix to capture the potential of a specific geographical area or maximize my ROCE or ROI?

6. What is the optimal resource allocation between existing POS, new town coverage or new emerging countries?

“Begin Big data with big picture and big questions”

Page 5: Big data in Apple- initiation of a long long journey

Find a way to structure data and collectively define one path to follow

Whatever kind of big data work you are doing or will do, find an elegant way to structure information or data, especially when you face the complexity brought by multiple:

Products or categories

Channels

Partners

Geos

Time frames and granularities

For the work done at Apple, a structure was sought early on to communicate, share insights, and define a path to explore and classify information.

The pyramid, ranking information from macro at the top down to micro at the bottom, offered a neat image to combine sources and explore new paths which had never been walked before.

A basement has been added, for example, to begin predictive modeling of individual behavior or patterns.

Page 6: Big data in Apple- initiation of a long long journey

Pyramid structure to classify predictive models and first R2 calculations

Pyramid levels, from macro to micro:

EMEIA

France (among 130 countries)

600 POS (e.g. for France)

6 000 zip codes (e.g. for France)

Data available:

GDP, Internet, literacy, mobile subscribers, broadband penetration, roads, >$35k income, life expectancy, general physicians ...

Online sales, registered mobiles, PCs + mobile sales …

Training points, population in X km radius around POS, distance to competitors, product registrations ...

Registered PCs, online sales, registered mobiles, population, GDP, income, kids ...

Predictive model of | Correlation rate (R2) | Key variables
PC total market/country | 91% | GDP, mobile
PC available market/country | 89% | Broadband access
iPhone pull effect on PC sales per week | 50 to 90% | Mobile sales
Units/POS, 180 retail | 77% | Mobile, population, GDP, training
Sales per zip code | 89% | Mobile, population, GDP

Page 7: Big data in Apple- initiation of a long long journey

Correlation of 83% at a macro level (countries) between PC and mobiles…

[Chart axes: mobile sales, PC sales]

Page 8: Big data in Apple- initiation of a long long journey

Correlation of 92% at POS level between PC and mobiles…

But let’s be cautious with the correlation rate, which is not causality. What is the nature of the relationship between the two variables, and more specifically:

• what is the best predictor of PC (or CPU) sales?

• what is the predictive model to forecast sales?
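One way to approach the "best predictor" question is to compute R2 for each candidate variable against sales and rank the candidates. The sketch below does this in plain Python. All figures and variable names (`pc_sales`, `iphone_sales`, `population`) are invented for illustration and are not Apple data. As the slide warns, a high R2 only flags a candidate and says nothing about causality.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equally long series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return (sxy * sxy) / (sxx * syy)


def rank_predictors(target, candidates):
    """Return (name, R2) pairs sorted from strongest to weakest correlation."""
    scored = [(name, r_squared(values, target)) for name, values in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical figures for five zip codes (illustration only).
pc_sales = [10, 20, 30, 40, 50]
candidates = {
    "iphone_sales": [31, 59, 92, 118, 151],
    "population": [5000, 3000, 9000, 4000, 8000],
}

ranking = rank_predictors(pc_sales, candidates)
for name, score in ranking:
    print(f"{name}: R2 = {score:.2f}")
```

The ranking is only a screening step: the variables at the top are the ones worth feeding into an actual predictive model.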

Page 9: Big data in Apple- initiation of a long long journey

After the initial findings came an exhaustive gathering of the information available for each of the 6 000 geographical units: 250 exogenous variables (external to Apple) and 50 endogenous variables.

Page 10: Big data in Apple- initiation of a long long journey

Mapping of data needed to do appropriate business intelligence

Page 11: Big data in Apple- initiation of a long long journey

Page 12: Big data in Apple- initiation of a long long journey

List all the statistical tools and skills available within your team to cover as much ground as possible or push the envelope to its extreme.

Page 13: Big data in Apple- initiation of a long long journey

What is the best predictor of CPU sales at micro level (zip code)?

CPU info at zip code level | POS sales | On Line sales | Total
Registered CPU | 158 000 | 95 000 | 253 000
Non registered CPU | 294 000 (2/3) | - | 294 000
Total | 452 000 | 95 000 | 547 000

R2 with iPhone registered | POS sales | On Line sales
CPU | 96% | 94%

R2 with population | POS sales | On Line sales
Registered CPU | 74% | 72%

iPhone and OnLine sales are unbeatable predictors of sales at a micro level = potential

France ‘09 formula: 1 PC = 3 iPhone

This makes it possible to benchmark existing coverage and reveals holes in coverage
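The "1 PC = 3 iPhone" rule can be turned into a rough coverage benchmark. The sketch below assumes the rule means one PC sold for every three iPhones sold, estimates PC potential per zip code from iPhone sales, and flags zip codes whose actual PC sales capture less than a chosen share of that potential. The zip code labels, sales figures and the 50% threshold are hypothetical.

```python
IPHONES_PER_PC = 3  # France '09 ratio quoted on the slide


def pc_potential(iphone_sales, ratio=IPHONES_PER_PC):
    """PC potential implied by iPhone sales under the assumed ratio."""
    return iphone_sales / ratio


def coverage_holes(zip_codes, threshold=0.5):
    """Zip codes capturing less than `threshold` of their PC potential.

    `threshold` is an arbitrary illustrative cut-off, not a value from the slides.
    Results are sorted from the weakest capture rate upwards.
    """
    holes = []
    for record in zip_codes:
        potential = pc_potential(record["iphone"])
        capture = record["pc"] / potential if potential else 0.0
        if capture < threshold:
            holes.append((record["zip"], potential, capture))
    return sorted(holes, key=lambda hole: hole[2])


# Invented figures for three placeholder zip codes.
zip_codes = [
    {"zip": "zip_A", "iphone": 900, "pc": 300},
    {"zip": "zip_B", "iphone": 600, "pc": 50},
    {"zip": "zip_C", "iphone": 1500, "pc": 200},
]

holes = coverage_holes(zip_codes)
for zip_code, potential, capture in holes:
    print(f"{zip_code}: potential {potential:.0f} PCs, capture {capture:.0%}")
```

In practice the cut-off would come from the cost of covering a zip code rather than from a fixed percentage.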

Example of a formula for PC sales per zip code (linear simplification of a genetic algorithm model, for average value):

[Formula shown as an image on the slide]

Example of POS sales based on other POS presence (linear visualisation of a sigmoid function issued from neural network modeling)
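"Darwinism between models" can be illustrated by fitting several candidate models on the same training data and keeping the one with the lowest error on held-out points. The slides mention genetic algorithms and neural networks; the sketch below swaps in something much simpler, three least-squares fits on transformed predictors, purely to show the selection mechanism. The data is synthetic and the candidate names are made up.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rmse(predicted, actual):
    """Root mean squared error between two equally long series."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))


# Each candidate is a linear fit on a transformed predictor.
CANDIDATES = {
    "linear": lambda x: x,
    "log": lambda x: math.log1p(x),
    "saturating": lambda x: x / (1.0 + x),
}


def compete(xs, ys, holdout_every=3):
    """Fit every candidate on training points, score it on held-out points."""
    pairs = list(zip(xs, ys))
    train = [p for i, p in enumerate(pairs) if i % holdout_every != 0]
    held_out = [p for i, p in enumerate(pairs) if i % holdout_every == 0]
    scores = {}
    for name, transform in CANDIDATES.items():
        a, b = fit_line([transform(x) for x, _ in train], [y for _, y in train])
        predicted = [a + b * transform(x) for x, _ in held_out]
        scores[name] = rmse(predicted, [y for _, y in held_out])
    return min(scores, key=scores.get), scores


# Synthetic data: sales that saturate as the number of POS grows, plus noise.
rng = random.Random(0)
n_pos = [1 + i % 8 for i in range(48)]
sales = [50.0 * x / (1.0 + x) + rng.gauss(0.0, 0.5) for x in n_pos]

winner, scores = compete(n_pos, sales)
print(winner, {name: round(score, 2) for name, score in scores.items()})
```

Scoring on held-out points rather than on the training data is what keeps the competition fair between simple and flexible models.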

Page 14: Big data in Apple- initiation of a long long journey

Kill misleading beliefs. Example: there is a halo effect for iPhone - 1

[Chart axis: number of POS per zip code]

It was a static view of the impact of the iPhone on other products.

Obviously, something was going on between the two products.

Some specific modeling led to key findings.

Yes, the iPhone had an impact, by opening the market and building brand equity and product value.

So, far more interesting than a halo effect, it was a towing effect: the iPhone provided tremendous traction to all other products and increased their sales potential.

Page 15: Big data in Apple- initiation of a long long journey

Kill misleading beliefs. Example: there is a halo effect for iPhone - 2

Page 16: Big data in Apple- initiation of a long long journey

Findings in actions: Identification of Zip Codes with highest untapped potential

Use of the ratio of X iPhone for 1 CPU to define the highest-potential zip codes in France

4500 zip codes plotted

The asymptote is 1.5 and is reached in many zip codes, more often with POS>4

French potential for CPU is x millions but it moves with iPhone penetration growth

A utility function of cost of coverage has to be plugged in to find cut-off points

iPhone POS and sales per zip code are critical to balance resource allocation

[Chart: iPhone sold / CPU sold off-line per zip code, against number of POS per zip code, with the asymptote marked]

Town | iPhone sold / CPU sold off-line | OnLine
Angers | 2.6 | 15%
Strasbourg | 1.5 | 13%
Toulouse | 1.9 | 13%
Montpellier | 1.7 | 6%
Geneviliers | 5.2 | 14%
Pau | 5.3 | 40%
Concarneau | 6 | 40%
Lannion | 9 | 55%
Paris17 | 2.3 | 16%
Nancy | 2 | 13%
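The town figures above can be ranked against the 1.5 asymptote. The sketch below assumes each figure is the town's ratio of iPhone sold to CPU sold off-line, and uses a hypothetical measure, `uplift_factor`: the factor by which off-line CPU sales would grow if the ratio fell to the asymptote at constant iPhone sales. It ignores the cost-of-coverage utility function the slide calls for, and it does not adjust for the on-line share, which is high in the towns with the largest ratios.

```python
ASYMPTOTE = 1.5  # iPhone sold per CPU sold off-line, as quoted on the slide

# (town, iPhone sold / CPU sold off-line, on-line share) as listed on the slide
TOWNS = [
    ("Angers", 2.6, 0.15),
    ("Strasbourg", 1.5, 0.13),
    ("Toulouse", 1.9, 0.13),
    ("Montpellier", 1.7, 0.06),
    ("Geneviliers", 5.2, 0.14),
    ("Pau", 5.3, 0.40),
    ("Concarneau", 6.0, 0.40),
    ("Lannion", 9.0, 0.55),
    ("Paris17", 2.3, 0.16),
    ("Nancy", 2.0, 0.13),
]


def uplift_factor(ratio, asymptote=ASYMPTOTE):
    """Hypothetical measure: off-line CPU growth needed to reach the asymptote."""
    return ratio / asymptote


def rank_by_untapped_potential(towns):
    """Sort towns from the largest to the smallest uplift factor."""
    scored = [(name, uplift_factor(ratio), online) for name, ratio, online in towns]
    return sorted(scored, key=lambda row: row[1], reverse=True)


ranking = rank_by_untapped_potential(TOWNS)
for name, uplift, online in ranking:
    print(f"{name}: x{uplift:.2f} off-line CPU uplift, on-line share {online:.0%}")
```

A town already at the asymptote gets a factor of 1, meaning no uplift is implied by this measure alone.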

Page 17: Big data in Apple- initiation of a long long journey

Ranking of most attractive zip codes for CPU and extra coverage opportunities

*red bar means no POS

Page 18: Big data in Apple- initiation of a long long journey

How is big data connected with salesforce performance?

How to improve Account Manager performance?

Use few IT systems leveraging Corp. tool and support

Assess account potentials for better coverage and right allocation of resources

Be world-class in Account Management for clients (new + current)

Get better weekly, monthly, quarterly, half-yearly and yearly forecasts to free up time

Client 3.0 initiative + incentives

Predictive analytics

Definition and making of a single SalesForce automation tool for the whole company

New predictive tools available

Page 19: Big data in Apple- initiation of a long long journey

Which dynamic path to extract the most actionable and profitable levers from data?

[Chart: collective datamining skill against time, in five stages, with the current EMEIA position marked]

Key activities per stage:

Data: data collection and warehousing; cleaning of data; security management; choice of updating frequency

Analysis and visualization: 2-3 dimensional cuts on charts; geo mapping; benchmarking amongst formats or geos; broaden set of users

Static modeling: simple statistics with correlation rates to identify 4-5 key variables; simple linear regression models; portfolio cinematic view; field testing with relevant managers

Predictive modeling: advanced stats with non-linear modeling; models in competition with each other; broad business perspective (town against country against levers); integration in Account management tools

Global and real time modeling: real time improvement of models; generation of new models; direct link with EDW; global/WW business perspective; apply to all services (e.g. HR for 6 month ahead needs)

Outputs: main business levers; link between products; potential estimates; POS performance; SAMI impact; training needs

Outputs: new coverage needs; optimal mix of POS; forecast weekly/quart.; efficiency of allocations; KPI and reco in SFA; “what if” functions; individual buyer model; cross-selling directions

Outputs: resource optimization between BUs; identification of lead countries; collective sharing of findings and best practices; cross-selling opportunities; better accuracy due to meta-models; live and new models
