2015 Healthcare Data Science Practical Data Science: The WPC Healthcare Strategy for Delivering Meaningful Data Science Projects Damian Mingle @OPENDATASCI

2015 Healthcare Data Science

Practical Data Science: The WPC Healthcare Strategy for Delivering Meaningful Data Science Projects

Damian Mingle @OPENDATASCI

What’s the Problem?
A Common Scenario: Johnny Data Scientist

• Does not like working with others
• Too much black magic – not enough explanation
• His process is always different
• Uses multiple languages
• Hates producing presentations
• Constant unclear project status
• Doesn’t capture business needs
• Models aren’t production quality

Why Jupyter?
Interactive Computing Environment

• Notebook Web Application: Writing and running code interactively
• Kernels: Over 40 programming languages
• Notebook Documents: Self-contained documents which include live code, interactive widgets, plots, narrative text, equations, images, and video

Why a Data Science Methodology?
Data Science Projects Involve Risk

• Strategically: Provides confidence to the business that Data Science projects can be delivered profitably
• Tactically: Management can understand status assessments
• Operationally: Empowers the Data Science team to do the right thing, the right way, the first time.

Business Understanding
Uncover Important Factors at the Start

• Determine business objectives
• Assess situation
• Determine data science goals
• Produce project plan

Understand the Data Science project objectives and requirements from a business perspective. Then convert this knowledge into a Data Science problem definition and preliminary plan designed to achieve the objective.

Data Understanding
Become Familiar with the Data

• Collect initial data
• Describe data
• Explore data
• Verify data quality

Identify data quality problems, discover first insights into the data, and/or detect interesting subsets to form hypotheses regarding hidden information.
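
As a minimal sketch of the describe and verify steps, the cell below summarizes one field and flags simple quality problems using only the Python standard library. The records, the field names `age` and `los_days`, and the validity rules are all invented for illustration; they are not WPC Healthcare data or rules.

```python
from statistics import mean, median

# Hypothetical records, invented for illustration only.
records = [
    {"patient_id": 1, "age": 54, "los_days": 3},
    {"patient_id": 2, "age": None, "los_days": 5},
    {"patient_id": 3, "age": 71, "los_days": -1},  # negative stay: a quality problem
    {"patient_id": 4, "age": 38, "los_days": 2},
]

def describe(rows, field):
    """Summarize one numeric field, counting missing values separately."""
    values = [r[field] for r in rows if r[field] is not None]
    return {
        "count": len(values),
        "missing": len(rows) - len(values),
        "min": min(values),
        "max": max(values),
        "mean": mean(values),
        "median": median(values),
    }

def quality_issues(rows):
    """Return (patient_id, problem) pairs for simple, assumed validity rules."""
    issues = []
    for r in rows:
        if r["age"] is None:
            issues.append((r["patient_id"], "missing age"))
        if r["los_days"] is not None and r["los_days"] < 0:
            issues.append((r["patient_id"], "negative length of stay"))
    return issues

age_summary = describe(records, "age")
issues = quality_issues(records)
print(age_summary)
print(issues)
```

Keeping the summary and the issue list in the same notebook as the narrative makes the first findings easy to review with the business.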

Data Preparation
Construct the Final Dataset

• Select data
• Clean data
• Construct data
• Integrate data
• Format data

Data Science tasks in this phase involve selecting tables, records, and attributes, as well as transforming and cleaning the data.
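
The select, clean, construct, and format steps can be sketched in a few lines of plain Python. Everything here is an assumption made for illustration: the raw records, the field names, and the 7-day cutoff for `long_stay`. Integrating data from several sources is left out to keep the sketch short.

```python
# Hypothetical raw records, invented for illustration only.
raw = [
    {"patient_id": 1, "age": "54", "los_days": 3, "notes": "n/a"},
    {"patient_id": 2, "age": "", "los_days": 5, "notes": "n/a"},
    {"patient_id": 3, "age": "71", "los_days": -1, "notes": "n/a"},
    {"patient_id": 4, "age": "38", "los_days": 9, "notes": "n/a"},
]

SELECTED = ("patient_id", "age", "los_days")  # attributes kept for modeling

def prepare(rows):
    """Select, clean, construct, and format records into a final dataset."""
    prepared = []
    for row in rows:
        rec = {k: row[k] for k in SELECTED}          # select attributes
        if rec["age"] == "" or rec["los_days"] < 0:  # clean: drop invalid records
            continue
        rec["age"] = int(rec["age"])                 # format: string to integer
        rec["long_stay"] = rec["los_days"] > 7       # construct: derived attribute (assumed cutoff)
        prepared.append(rec)
    return prepared

final_dataset = prepare(raw)
print(final_dataset)
```

Dropping invalid records is only one possible cleaning policy; imputing or correcting values may suit a given project better.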

Modeling
Various Modeling Techniques Are Selected

• Select modeling technique
• Generate test design
• Build model
• Assess model

In this phase, calibrating parameters is important. Some techniques may require the Data Scientist to go back to the data preparation phase.
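
To make the test design and parameter calibration steps concrete, here is a deliberately tiny sketch: a one-feature threshold classifier whose single parameter is calibrated on training rows and assessed on held-out rows. The data, the hold-out rule, and the choice of a threshold model are assumptions for illustration, not the technique used in the talk.

```python
# Hypothetical (feature, label) pairs, invented for illustration only.
data = [(1, 0), (2, 0), (2, 0), (4, 1), (5, 1),
        (6, 1), (3, 0), (5, 1), (1, 0), (6, 1)]

def split(rows, holdout_every=3):
    """Test design: hold out every n-th row for assessment."""
    train = [r for i, r in enumerate(rows) if (i + 1) % holdout_every != 0]
    test = [r for i, r in enumerate(rows) if (i + 1) % holdout_every == 0]
    return train, test

def accuracy(rows, threshold):
    """Share of rows where 'feature >= threshold' matches the label."""
    hits = sum(1 for x, y in rows if int(x >= threshold) == y)
    return hits / len(rows)

def calibrate(train, candidates):
    """Build model: pick the candidate threshold with the best training accuracy."""
    return max(candidates, key=lambda t: accuracy(train, t))

train, test = split(data)
threshold = calibrate(train, candidates=range(1, 8))
test_accuracy = accuracy(test, threshold)  # assess model on unseen rows
print(threshold, test_accuracy)
```

If the held-out accuracy were poor, that would be a signal to revisit the prepared features rather than only retune the parameter.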

Evaluation
Review Your Steps with Certainty

• Evaluate results
• Review process
• Determine next steps

At the end of this phase, a decision on the use of the data science results should be reached.
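
One way to make that decision explicit is to compare model metrics against success criteria agreed during Business Understanding. The sketch below assumes hypothetical criteria (minimum recall and precision) and invented labels; the `decide` helper and its "deploy" or "iterate" outcomes are illustrative names, not part of the methodology itself.

```python
# Hypothetical success criteria (assumed values, agreed with the business).
criteria = {"recall": 0.80, "precision": 0.60}

def evaluate(actual, predicted):
    """Compute precision and recall for binary labels (1 = positive)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return {"precision": precision, "recall": recall}

def decide(metrics, minimums):
    """Return ('deploy', []) if every criterion is met, else ('iterate', unmet)."""
    unmet = [name for name, floor in minimums.items() if metrics[name] < floor]
    return ("iterate", unmet) if unmet else ("deploy", unmet)

# Invented labels for illustration only.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
predicted = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]

metrics = evaluate(actual, predicted)
decision, unmet = decide(metrics, criteria)
print(metrics, decision, unmet)
```

Listing the unmet criteria alongside the decision gives management a status they can read without inspecting the model.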

Deployment
Make Use of the Model

• Plan deployment
• Plan monitoring and maintenance
• Produce final report
• Review project

This phase can be as simple as generating a report or as complex as implementing a repeatable data science process across the enterprise.

Data Scientist 2.0
Lead Your Organization Analytically

• Use Jupyter to document your process in real time, using whatever language you want!
• Establish a Data Science Methodology that is comprehensive
• Provide insights that help the organization make better decisions to solve its business problems

Have Questions?

E-mail: [email protected]
Twitter: @damianmingle
LinkedIn: DamianRMingle