Page 1: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

1 © Copyright 2014 EMC Corporation. All rights reserved.

Data Science + Data Engineering

Annika Jimenez

Secret Weapon of the Strategic Enterprise

Page 2: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

Agenda

Data Science: What is it and why do we do it?

The Importance of Data Engineering

An Example: Kaiser Code-a-thon

Transforming Your Enterprise with Pivotal

Pivotal Data Labs: Data Engineering + Data Science

Closing Advice

Page 3: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

What Matters: Apps. Data. Analytics.

Apps power business, and those apps generate data

Analytic insights from that data drive new app functionality, which in turn drives new data

The faster you can move around the cycle, the faster you learn, innovate and pull away from the competition

Page 4: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

Primary Motions for Pivotal

Agile: data-driven apps and rapid time to value

Data Lake: store everything, analyze anything

Enterprise PaaS: revolutionary software development and speed; build the right thing

Page 5: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

DATA SCIENCE

The use of statistical and machine learning techniques on big, multi-structured data – in a distributed computing environment – to identify correlations and causal relationships, classify and predict events, identify patterns and anomalies, and infer probabilities, interest, and sentiment.

Page 6: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

But, why do we use Data Science?

Page 7: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

BI – show dashboard

Page 8: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

Is the Goal Any of These Things?

A. Cool Visualizations

B. Custom Querying

C. Decision Enablement

D. Insights

E. All of the above… NO

Page 9: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

DRIVE AUTOMATED LOW LATENCY ACTIONS

IN RESPONSE TO EVENTS OF INTEREST

Page 10: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

DATA SCIENCE + YOUR DATA = MODELS

Page 11: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

Drive Automated Low Latency Actions

Model Operationalization (“O16N”)

[Diagram: Production Data Feeds → Low Latency Model Scoring → API Availability or Push to Apps → Business Logic → Application Response → New Events (aka, Data)]
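
The operationalization flow named on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Pivotal's implementation: the `Event` and `Action` shapes, the weighted-sum `score` function (standing in for a trained model), the `"alert"` action name and the threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Iterator, Optional


@dataclass
class Event:
    """One record from a production data feed (hypothetical shape)."""
    entity_id: str
    features: Dict[str, float]


@dataclass
class Action:
    """An automated application response to an event of interest."""
    entity_id: str
    name: str
    score: float


def score(event: Event, weights: Dict[str, float]) -> float:
    """Low latency scoring; a weighted sum stands in for a trained model."""
    return sum(weights.get(key, 0.0) * value for key, value in event.features.items())


def business_logic(event: Event, model_score: float, threshold: float) -> Optional[Action]:
    """Act only when the score marks the event as one of interest."""
    if model_score >= threshold:
        return Action(event.entity_id, "alert", model_score)
    return None


def as_new_event(action: Action) -> Event:
    """Feed the application response back in as new data."""
    return Event(action.entity_id, {"last_score": action.score})


def run_loop(feed: Iterable[Event], weights: Dict[str, float], threshold: float) -> Iterator[Action]:
    """Score each incoming event and yield the actions it triggers."""
    for event in feed:
        action = business_logic(event, score(event, weights), threshold)
        if action is not None:
            yield action


if __name__ == "__main__":
    feed = [Event("a", {"x": 1.0, "y": 2.0}), Event("b", {"x": 0.1, "y": 0.1})]
    for action in run_loop(feed, {"x": 0.5, "y": 0.25}, threshold=0.9):
        print(action, as_new_event(action))
```

Keeping scoring and business logic as separate functions mirrors the separate boxes on the slide: the model can be retrained or swapped without touching the rules that decide which action to take.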

Page 12: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

Data Science Value Chain

Stages: Instrumentation → Logs → Capture → Store → Transform & Prepare → Access → Model Dev. → Deploy → Apps → Process Change

Roles (left to right): Product Engineer, Data Engineer, DBA, Data Engineer, Data Engineer, Data Scientist, Data Engineer, Application Developer, PMO

Page 14: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

Code-a-Thon Details – Logistics

24-Hour Data Science Code-a-Thon

5 resources per vendor

Vendors were asked to be prepared for any use in the domain

A 15-minute presentation to senior leaders, executives, doctors and pharmacists

Teams were required to use Tableau in their presentation

Page 15: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

Code-a-Thon – Pivotal Team

Hulya Emir-Farinas, Data Scientist

Noah Zimmerman, Data Scientist

Jacque Istok, Application Developer

Dillon Woods, Application Developer

Randy Williard, Big Data Engineer

Jemish Patel, Big Data Engineer

Adam Shook, Big Data Engineer

Roy Mims, Coordinator

Page 16: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

The Day of the Code-a-Thon…

Page 17: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

Key Insight

Page 18: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

Asthma Population Management Application

Page 19: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

Asthma Management Application

Page 20: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

What Did We Learn in 2013?

Pivotal has a world-class Data Science team, the best there is

Data Science alone is good, but Data Science + Expert Data Engineering and Architecture is great

Corollary: Data Science + Data Engineering + Apps trumps everything

– This is the path to rapid value creation and ROI

Page 21: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

DATA SCIENCE

DATA ENGINEERING

PIVOTAL LABS

Data Science + Data Engineering + Pivotal Labs = The Magic in the Middle

Page 22: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

What Is Pivotal Data Labs?

Data Science + Data Engineering

Page 23: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

What Is Pivotal Data Labs?

Data Science + Data Engineering

Pivotal Data Scientists are technical professionals with strong programming skills, anchored in vertical/horizontal domains or in specialized academic research, able to identify real-world problems requiring predictive analytics, formulate these mathematically, and solve them by applying machine learning and statistical algorithms, on Big Data, in Pivotal and third-party technologies.

Pivotal Data Engineers are Big Data experts and industry veterans with a passion to leverage these skills to drive business value for Pivotal customers. They possess expert knowledge of and skill with the Pivotal data products and excel at architecting enterprise-scale solutions to the most demanding data and analytic challenges.

Page 24: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

What is a “Data Scientist”?

[Chart axes: Programming Skills (vertical), Mathematical/Statistical Skills (horizontal)]

Page 25: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

What is a “Data Engineer”?

[Chart axes: Programming Skills (vertical), Architectural Skills (horizontal)]

Page 26: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

World’s Leading Experts

Pivotal Labs – Pivotal Data Labs

[Diagram labels: BATCH, NEAR TIME, REAL TIME; HAWQ, Greenplum DB, Pivotal HD, GemFire XD, GemFire]

Page 27: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

Pivotal Data Labs in Data Fabric: Building Towards Pivotal One

Pivotal One SOLUTIONS

Pivotal One SERVICES

Pivotal One MARKETPLACE

• Data Lake Solutions (Security Analytics, Corp Comm Analytics, Business)

• RTI for Telco (RTI4T)

Elastic Runtime Services: Java, Spring, Ruby, Node.JS

Value-adds: Installation, Management & Monitoring

(Core OSS)

Pivotal HD: Hadoop+Query

GPDB: MPP Analytics

GemFire: In-Memory Grid

Spring: App Framework

MySQL, RabbitMQ, Redis…

PIVOTAL GemFire XD

PIVOTAL Data Dispatch

Coming in 2014

Page 28: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

Introducing Pivotal Data Labs

Our Charter: Pivotal Data Labs is Pivotal’s differentiated and highly opinionated data-centric service delivery organization.

Our Goals: Expedite customer time-to-value and ROI, by driving business-aligned innovation and solutions assurance within Pivotal’s Data Fabric technologies.

Drive customer adoption and autonomy across the full spectrum of Pivotal Data technologies through best-in-class data science and data engineering services, with a deep emphasis on knowledge transfer.

Page 29: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

Highly-Opinionated & Differentiated

“Highly-Opinionated” – Highly prescriptive in our counsel of data best practices to customers and partners, drawing from best-in-class talent and deep experience operating on the Pivotal Data Fabric

“Differentiated” – An expert Data services business that is unique in its class and unlike the Data services available elsewhere

Page 30: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

What Will PDL Deliver For Customers?

Accelerated time-to-value and real ROI for customers

Page 31: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

How Do We Do This?

Best-in-class Data Science to drive value creation on Pivotal stack and customer data

Best-in-class Data Engineering to drive pragmatic, well-designed, customized architecture for end-to-end Pivotal stack

Assured solutions success in Pivotal data service delivery

Operationally-optimized predictive models

Collaboration with Pivotal Labs to deliver data-driven applications

Page 32: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

PIVOTAL ONBOARDING SERVICES

Getting Started with Pivotal Software

INSTALL: Our experts work with your staff to plan, install and fully configure the Pivotal software based on your environment and requirements.

VERIFY: We will verify the installation, making sure it’s fully operational in your environment and ready to give you the Pivotal advantage.

ENABLE: Lastly, we’ll train your people and conduct knowledge transfer to make sure you are comfortable using and supporting Pivotal software.

Engagement Management – site prep, project management, customer support overview

Hardware Validation / Installation

Software Installation / Verification

Training & Knowledge Transfer

Page 33: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

PIVOTAL SOLUTION ASSURANCE

Leverage Pivotal Data Experts for Success

INCEPTION: Getting off the ground with a well-formed plan, a solid team and realistic expectations is fundamental to overall success. Our experts help with design, guidance, oversight and lessons from other customers to get your initiative going in the best direction from the start.

ADVISE: Leveraging expert services in Pivotal Data Labs, Pivotal Labs, and Certified Pivotal Partners, we’ll work with your architects and developers to assure that your system design and development are aligned with Pivotal best practices.

SUPPORT: We’ll act as your conduit to Certified Professional Services, Customer Support and Pivotal R&D to make you successful, quickly determine answers and bring in specialized expertise where needed.

Engagement / Success Alignment – Regular meetings for status and guidance, Resource advice

Architecture Design

Implementation Design and Assistance

Resource Assistance, Expedited Response

Page 34: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

DATA SCIENCE LABS

Value Creation With Predictive Insights

DISCOVERY: Getting the most from your data requires understanding what you have today and discovering what your data can do for you. Combining our data scientists with your data starts the path to value creation.

INSIGHTS: Once the data is understood, we set ourselves apart by making optimal use of Pivotal’s Data Fabric, our analytical tools and our data science experts to build models creating deep actionable insights on key events of interest in any use case.

RESULTS: Driving insights into actionable results is enabled through data scoring and model code optimization, documentation and knowledge transfer.

Engagement / Business / Technical Alignment – Regular meetings for status and validation

Data exploration, readiness assessment, and prep

Combining domain knowledge and data science for predictive modeling

Code documentation and knowledge sharing

Page 35: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

DATA LAB

Deep Partnering to Maximize Value Creation

INCEPTION: Getting off the ground with a well-formed holistic architectural, data science, and application plan is fundamental to overall success. Our experts will drive this process, leveraging deep experience in these areas, to streamline the path to success for your initiative.

IMPLEMENT: Delivered by a dedicated team drawing from Pivotal Data Labs’ experts in architecture, data engineering, data science, and application development, implementations of Pivotal technologies will be targeted to maximize value creation against your specific technical and business goals.

EXCEL: Years of experience building successful data projects give us the know-how to quickly and efficiently work through the challenging phases of any project, including design, scaling, integration and production readiness.

Engagement / Business / Technical Alignment – Regular meetings for status and guidance

Data platform architecting and strategic analytic use-case roadmaps

Data and Application Fabric deployments, Data Science modeling & App development

Knowledge sharing, training and expert assistance

Page 36: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

Pivotal Data Labs + Pivotal Labs = The Magic in the Middle

RAPID VALUE!

PIVOTAL LABS

* PIVOTAL DATA LABS

Page 37: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise

My Advice to Enterprises

1. Know your data and its potential value

2. Get “vision” and question status quo

3. Understand the technical paradigm shift underway

4. Hire or grow your Data Dream Team: Data Scientists and Data Engineers

5. Clear the path to operationalization (aka, value)

6. Manage the disruption, don’t reject it

Page 38: Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise