Three ways to Fail your Data Lab Implementation

Anne-Sophie Roessler, International Business Developer, Dataiku - "3 ways to Fail your Data Lab Implementation"


Page 1

Three ways to Fail your Data Lab Implementation

Page 2

Dataiku DSS

Page 3

Data Labs

10 M€ in 2014
121 499 M€ in 2014
3 029 M€ in 2015
5 454 M€ in 2014
816 M€ in 2014
10 M€ in 2008

Marketing / Web
✓ Behavioral segmentation
✓ Churn prediction
✓ Sales forecast
✓ Dynamic pricing

Industry & Infrastructure
✓ Predictive maintenance
✓ Logistic optimization
✓ Smart cities

Bank & Insurance
✓ Fraud detection
✓ Risk anticipation
✓ Lifetime moment detection

Page 4

Why a Data Lab?

• 1 single workflow: from a segmented workflow to a transversal one
• Several use cases: ability to address many different data-centric topics within a single unit
• Multiple competences: business-focused approach mixing many different skills
• End-to-end projects: combining data from different sources to handle several aspects of a single topic

Page 5

Deployment of the predictions

Dataiku DSS for fraud prediction

Client service

Sensor data

Garage data

Administration

• 1 Project Owner (IT)
• 1 Project Manager (Business)
• 1 in-house data scientist
• 3 data scientists from 3 different firms
• 3 consultants from 3 different firms
• 1 architect (external)

Accepted file

INVESTIGATE!

The transactions are blocked depending on their gap with the business rules and behavioral patterns
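The blocking logic on this slide can be illustrated with a minimal sketch. It is not Dataiku DSS code and not the project's actual model: the `Transaction` fields, the `MAX_AMOUNT` business rule, the `Z_THRESHOLD` cut-off and the choice of a standard-deviation distance as the "behavioral gap" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class Transaction:
    customer_id: str
    amount: float


# Hypothetical business rule: any amount above this limit counts as a gap.
MAX_AMOUNT = 10_000.0
# Hypothetical cut-off for the behavioral gap, in standard deviations.
Z_THRESHOLD = 3.0


def behavioral_gap(amount, history):
    """Distance between the amount and the customer's past amounts, in standard deviations."""
    if len(history) < 2:
        return 0.0  # not enough history to judge the behavior
    sigma = pstdev(history)
    if sigma == 0:
        # All past amounts are identical: any different amount is unusual.
        return 0.0 if amount == history[0] else float("inf")
    return abs(amount - mean(history)) / sigma


def decide(tx, history):
    """Return 'INVESTIGATE' when the transaction should be blocked, else 'ACCEPT'."""
    breaks_rule = tx.amount > MAX_AMOUNT
    unusual = behavioral_gap(tx.amount, history) > Z_THRESHOLD
    return "INVESTIGATE" if (breaks_rule or unusual) else "ACCEPT"


# Usage: one customer with a short history of past amounts.
past_amounts = [100.0, 110.0, 90.0, 105.0, 95.0]
decision_usual = decide(Transaction("c1", 102.0), past_amounts)
decision_unusual = decide(Transaction("c1", 500.0), past_amounts)
```

Keeping the business rule and the behavioral check as two separate tests makes it easy to report which of the two sent a transaction to investigation.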

Page 6

Welcome to Technoslavia!

Page 7

Focus on the framework, not on the input

Business Understanding

Dataset 1, Dataset 2, Dataset n

Data Acquisition & Understanding
✓ Read and import raw data
✓ Detect schemas and structure
✓ Analyze distributions
✓ Assess quality: outliers, missing values...

Data Preparation
✓ Create derived and aggregated variables
→ Analytical dataset

Model Creation
✓ Feature selection
✓ Compare algorithms

Evaluation
✓ Performance metrics
✓ Robustness & generalization (cross validation)
✓ Insights (e.g. variable importance)
→ Report

Deployment
✓ Scoring engine
✓ Publish predictions
✓ Monitor performance
✓ API

Scored dataset

Iteration 1, Iteration 2, Iteration n

Adapted from the CRISP-DM methodology
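The "compare algorithms" and "cross validation" items of this framework can be sketched in a few lines. This is a toy illustration using only the Python standard library, not the tooling used in the talk; the two models (`fit_majority`, `fit_threshold`), the synthetic one-feature dataset and the choice of accuracy as the performance metric are assumptions.

```python
import random
from statistics import mean


def k_fold_indices(n, k, seed=0):
    """Shuffle range(n) and split it into k folds of nearly equal size."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_val_accuracy(fit, X, y, k=5, seed=0):
    """Mean accuracy over k folds; fit(X, y) must return a predict function."""
    folds = k_fold_indices(len(X), k, seed)
    scores = []
    for i, test_idx in enumerate(folds):
        # Train on every fold except the one held out for testing.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        predict = fit([X[j] for j in train_idx], [y[j] for j in train_idx])
        hits = sum(predict(X[j]) == y[j] for j in test_idx)
        scores.append(hits / len(test_idx))
    return mean(scores)


def fit_majority(X, y):
    """Baseline model: always predict the most frequent training label."""
    label = max(set(y), key=y.count)
    return lambda x: label


def fit_threshold(X, y):
    """One-feature model: cut at the midpoint between the two class means."""
    m0 = mean(x for x, t in zip(X, y) if t == 0)
    m1 = mean(x for x, t in zip(X, y) if t == 1)
    cut = (m0 + m1) / 2
    high_label = 1 if m1 > m0 else 0
    return lambda x: high_label if x > cut else 1 - high_label


# Toy data (assumption): one numeric feature, label 1 when the feature is large.
X = [float(v) for v in range(20)]
y = [0] * 10 + [1] * 10

# Compare algorithms on the same folds and keep the best one.
results = {
    "majority": cross_val_accuracy(fit_majority, X, y),
    "threshold": cross_val_accuracy(fit_threshold, X, y),
}
best = max(results, key=results.get)
```

Scoring every candidate on the same folds (same `seed`) keeps the comparison fair, and each new iteration of the framework can rerun the same comparison on an updated analytical dataset.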

Page 8

People and Governance

Polyglot vs. dictator?

Problems:
• Collaboration between technical and non-technical profiles inside a single project
• Necessary collaboration between business and tech teams to address transversal projects accurately

Focus:
• Promote diversity
• …within a workflow-centric environment

Page 9

End to end, from prototyping into production

Do it your way…

Page 10

…and scale!

Page 11

Data Lab Organisation

Data Lab

Lab Environment

Multidisciplinary team:

Direction / Project Management

Business Analysts

Data Miners / Data Scientists

Production Environment

Business needs

Internal Data sources

External data sources

Missions:

Prioritisation of the business needs

Prototyping / Agile solution engineering

Support for Apps deployment

Business Applications

Marketing Campaign Automation

Web analytics reporting

Data as a Service Platform

Conception of “DATA PRODUCTS”

Integration of Data Products

Optimisation Engine

Real Time Scoring

Data Flow

Insights & Services

Processing chain

API Deployment

Page 12

Thank you!