Data Science Software Platform The Solitude Of The Data Team Manager Data Driven NYC 04/11/16

Dataiku - data driven nyc - april 2016 - the solitude of the data team manager

Page 1: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

Data Science Software Platform

The Solitude Of The Data Team Manager

Data Driven NYC 04/11/16

Page 2: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

My nerdy background

Type Systems, Automated Proving, Abstract Program Interpretation, Functional Programming, Garbage Collection and VMs

Graph Analytics, Chess AI, Natural Language Processing, 80% Emacs / 20% Vim

Page 3: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

So to sum it up …

I (USED TO) HATE GUIs

Page 4: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

…and our software is

A supercharged Visual IDE for Data Teams

Deployment

Page 5: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

Meet HAL

Hal Alowne, BI Manager, Dim’s Private Showroom

Page 6: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

Meet Hal’s Boss, DIM

“Hey Hal! We need a big data platform like the big guys. Just do what they’re doing!”

Big Data Copy Cat Project

Page 7: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

TECHNOLOGY Disconnect

“What technologies should I use?”

Page 8: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

Welcome to Technoslavia!

Page 9: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

TOY PLATFORM ANTI-PATTERN

Test and Invest in Infrastructure == Skilled People

or

Go For Cloud / Packaged Infrastructure

Your brand-new Hadoop cluster is perceived as slow, not much used, and not reliable

Page 10: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

TECHNO MISMATCH ANTI-PATTERN

Embrace Being Polyglot

or

Be a Dictator

The Python Clan VS The R Tribe

The Old Elephant Fraternity VS The New Elephant Club

Page 11: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

PREDICTIVE ANALYTICS DEPLOYMENT STRATEGY

Website: 2000s winners

Companies that were able to release fast

"Artificial Intelligence with Data for Internet of Things": 2010s winners

Companies able to put intelligence in production

?

Design a way to put “PREDICTIVE MODELS” IN PRODUCTION

Page 12: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

PEOPLE Disconnect

“Who should I hire?”

Page 13: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

Classic Business Intelligence Team Organization

Business Leader Data Consumer

Line-of-business Data Consumer

Business Project Sponsor

BI Solution Architect

Model Designer

ETL Developer

Dashboard / Report Designer

Specs

Dim, Big Boss

Page 14: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

Data Science Team Organization

Business Leader Data Consumer

Line-of-business Data Consumer

Business Project Sponsor

Data Engineer

Data Analyst

System Engineer / Data Architect

Business Needs

Data Scientist

IT Constraints

I.T.

Page 15: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

Manage Expectations

DREAM JOB: Data Engineer, Data Scientist, Data Analyst

REAL JOB: Data Plumber, Data Cleaner, Data Waiter

Page 16: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

Managing Extreme Personalities

Data Scientist

Highly Creative

Passionate

Hard to hire?

Hard to manage?

Ambitious

Want to take Hal’s job?

Hard to retain?

Page 17: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

Paired for Data

Data Analyst: Discover Patterns, Fight data entropy

Data Engineer: Make things work, Fight tech entropy

Page 18: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

What do you prefer?

One Analyst, One Engineer, One Data Scientist

OR

Four data scientists

Page 19: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

Two Mindsets Can Coexist

CLICKERS CODERS

Page 20: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

DATA Disconnect

“What about data?”

Page 21: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

What is the main reason data projects fail?

> DATA NOT AVAILABLE

Page 22: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

BUT FOR ONLY INCREMENTAL GAIN

Contribution to the overall project performance (0% to 100%): Business Goal Definition and Data 50, Feature Engineering 30, Algorithm 20

Page 23: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

How to Get Data if you don’t have it

THE CICADA: Be Optimistic!

Wait for Open Data initiatives and data available in the Enterprise Hub / Data Lake!

THE SPIDER: Create Network!

Create a set of trackers or addictive data collection internally to get data on your side!

THE FOX: Hunt for Big Problem!

Convince the CEO that you can solve a business-critical problem and use it as an excuse to get all the data you want!

Page 24: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

PRODUCT Disconnect

Page 25: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

What is Big Data about?

Page 26: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

The Age Of Distributed Intelligence

Global, Personalised and Real-Time Data-Driven Services

Page 27: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

Data to Visualize or Data to Automate?

Moving to a world of automated decision making

[Chart, 2013 to 2018: DATA FOR MORE INSIGHTS vs. DATA FOR AUTOMATED DECISIONS]

Page 28: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

Involve Product Team

Product Feature: Personalised Item Ranking

Product Feature: Notify User Only When Needed

Product Feature: Historical Data For Path Optimisation

Have Product Management Deeply Involved In the Data Team

Page 29: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

Focus on your added value

Built by the Data Team

Is the problem at the Core of my Business Process?

Is it a common problem / with shared data?

Can I solve it on my own?

Really?

Hire Consultant and Learn

Built by the Data Team

Go for Best-of-Breed SaaS Solution

Built by the Data Team?

Yes / No

Page 30: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

Create an API culture

Do not share
  • Random Piece of Code
  • Flat File
  • Email

Do share
  • Reproducible, documented workflows
  • Clean, documented APIs

Page 31: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

Did Hal find his solutions?

Technology: Polyglot on top of open source

Data: Hunt for Big Problems and Convince the CEO

People: Find a way to make clickers and coders work together

Product: Create an API culture and involve the product teams

“Is this the end?”

Hal Alowne, BI Manager, Dim’s Private Showroom

Page 32: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

That was the (fictionalized) story…

by the numbers

25 data scientists and engineers

2 locations

1 software

For simple software for clickers and coders

Page 33: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

One Mission: Never leave Hal ALONE

by the numbers

70 customers

2,500 lovely users

Car Sharing Worldwide Leader

Flash Sales Worldwide Leader

3,700 Hotels Worldwide

Page 34: Dataiku  -  data driven nyc  - april  2016 - the  solitude of the data team manager

Food for thought: www.dataiku.com/blog

THANK YOU!

FREE (as in Beer) Software: www.dataiku.com/dss