Demystifying the Data Scientist

Dan McClary, Ph.D.
Big Data Product Management
Oracle

Copyright © 2014, Oracle and/or its affiliates. All rights reserved.

Data Scientists: By The Numbers

1. What’s a Data Scientist?

2. Do I need a Data Scientist?

3. How do I grow my own Data Scientist?

What’s a Data Scientist?

What’s Data Science?

Buzzword or Essential Discipline?

• The buzz around “Data Science” is growing

• But isn’t it a bit like saying “chai tea”?

• What is a functional definition of data science?

A Working Definition

• Data Science seeks to

– Extract meaning from data

– Create “data products”

– Use all available data to tell a valuable story to non-practitioners

• So what makes a Data Scientist?

Anatomy of a Data Scientist

Statistical analysis

Scientific training

PhD in Computer Science? Statistics? Physics? Biology?!

Production-grade programmer in Java? Python? SQL?

Business sensibility

Visualization

IT Operations

Databases

Design sensibility

Published researcher

BI Tools

Machine Learning

Pattern Recognition

Competitive Intelligence

Hadoop

Big Data

Excellent Communicator/Presenter

JavaScript

Anatomy of a Data Scientist

Does anyone like that even exist?

Anatomy of a Data Scientist: Revised

A person who has some degree of experience in each of:

• Business

– Value Proposition

– Goals

– Communicate Results

• Analytics

– Techniques

– Interpretation

– Model Requirements

• Data

– Integration

– Manipulation

– Quality Assurance

Do I Need a Data Scientist?

Do You Need A Data Scientist?

• Do you need an army of PhDs to solve machine learning problems?

– Probably not

• Could you find more value in the data you collect, or could collect?

– Undoubtedly

• Do you need people to find that value?

– Almost certainly

Fitting for Data Scientists

• Where?

– Kaggle.com – a community for Data Science

• 100,000+ members

– KDNuggets – forum for Data Mining and Data Science

• Who do I hire?

– Some call themselves “data scientists,” but most call themselves

• Mathematicians

• Scientists

• Researchers

• Physicists

Who? Where? How Many?

Fitting for Data Scientists

• Most organizations will benefit from a few seasoned data scientists

– Help transition to a more data-driven business

– Direct efforts to integrate analytics more tightly with lines of business (LoBs)

– Good understanding of how to tackle new problems

• Data scientists can be grown at home

– Leverage the existing workforce

– Provide growth opportunities for employees

How many do I need?

How do I grow Data Scientists?

Step #1: Find Motivated Individuals

• Developers who want to

– Become more statistically oriented

– Better understand business challenges

• Business Analysts who

– Have some programming ability

– Want to grow their technical capabilities

• All candidates should

– Possess tremendous curiosity

– Be able to self-manage

Sources for Good Candidates

Step #2: Find Low-Hanging Fruit

• Find a project that

– Has a high ROI

– Has a limited, defined scope

– Isn’t impossible

• Define

– The business value

– The time to invest

Analytically Important, Not Impossible

[Chart: Value to Business (vertical axis) vs. Time to Answer (horizontal axis)]

Step #3: Combine

• Add your data science team

• And the well-defined project

– Add a seasoned data scientist for best results

• Watch the team grow new skills

• Evaluate the outcome

– For the team members

– For the business

And Iterate

Step #4: Publish and Promote

Share Data Science Results as a Service

[Diagram labels: Data Scientist, Useful Derived Dataset, Anyone, Spark]

Summary

• What is a Data Scientist?

– Someone who can help drive value through data

• Do you need one?

– Possibly

• Can you grow a data scientist?

– Absolutely
