Software Engineering for Data Science

Page 1: Software Engineering in Data Science

Software Engineering for Data Science

Page 2: Software Engineering in Data Science

what is data science?

"A data scientist is a statistician who lives in San Francisco"

Page 3: Software Engineering in Data Science
Page 4: Software Engineering in Data Science

understanding data-mining process

CRISP-DM

Cross Industry Standard Process for Data Mining

Conceived in 1996

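Describes six high-level phases of the analytics process: business understanding, data understanding, data preparation, modeling, evaluation, and deployment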

Current version: CRISP-DM 1.0 (an announced 2.0 revision was never published)

Page 5: Software Engineering in Data Science
Page 6: Software Engineering in Data Science

agile software development

Supports incremental development

Uses iterative work cadences, known as sprints

Fits with the CRISP-DM methodology

Allows feedback loops during development

“software should not be developed like an automobile on an assembly line, in which each piece is added in sequential phases.” - Dr. Winston Royce

Page 7: Software Engineering in Data Science

language support

SCRIPTING VS COMPILED LANGUAGES

SCRIPTS

Interpreted, not compiled

Loosely typed

Can run with errors: a script executes until it reaches the faulty line, so errors surface at run time

Perfect for prototyping & incremental development

Examples: Python, JavaScript, PHP, Ruby, R
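
To illustrate "can run with errors", here is a minimal Python sketch (the function name run_steps is hypothetical). It shows an interpreted script completing its earlier steps and only failing when the faulty line is actually executed. A compiler for a statically typed language would typically reject a comparable type mismatch before running anything.

```python
def run_steps():
    """Run three steps in order and record which ones completed."""
    completed = []
    completed.append("load")        # runs fine
    completed.append("clean")       # runs fine
    try:
        total = "3" + 4             # type error, discovered only when this line runs
    except TypeError as exc:
        completed.append(f"failed: {type(exc).__name__}")
    return completed


print(run_steps())
```

This is what makes scripting languages convenient for prototyping: the working parts of a script can be exercised before the rest is finished.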

Page 8: Software Engineering in Data Science

language support

COMPILED LANGUAGES

Strict syntax

Compiled ahead of time, before the program runs

Use an underlying framework

Examples: C, C++, Java

Page 9: Software Engineering in Data Science

python and r?

Awesome data structures (data frames, vectors, matrices)

Incremental programming

Statistical packages

Web integration (databases, websites, APIs)

Good for quick and dirty work

Can be modularized

Easy-to-read syntax
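
A rough sketch of the points above in Python, using only the standard library. The sample records and the column helper are hypothetical, and real projects would more likely reach for a data-frame library. It shows a table held as a list of records, one column pulled out as a vector, and a statistic computed on it in a few readable lines.

```python
import statistics

# A tiny "data frame" as a list of records (hypothetical sample data).
rows = [
    {"name": "a", "height_cm": 170.0},
    {"name": "b", "height_cm": 180.0},
    {"name": "c", "height_cm": 175.0},
]


def column(records, key):
    """Extract one column from a list of records as a plain list (a 'vector')."""
    return [r[key] for r in records]


heights = column(rows, "height_cm")
mean_height = statistics.mean(heights)
print(mean_height)
```

Each line can be typed and checked on its own in an interactive session, which is the incremental style the slide refers to.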

Page 10: Software Engineering in Data Science

BEST PRACTICES

Write rough code (prototyping/proof of concept)

Abstract and separate code into functions

Group functions into library/package
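
The three steps above could look roughly like this in Python. This is a sketch under stated assumptions: the sample input, the function names parse_values and mean, and the module name cleaning.py are all hypothetical.

```python
# Step 1: rough prototype, written inline to prove the idea works.
raw = [" 12", "7 ", "", "30"]
cleaned = []
for item in raw:
    item = item.strip()
    if item:
        cleaned.append(int(item))
prototype_mean = sum(cleaned) / len(cleaned)


# Step 2: the same logic abstracted into small, reusable functions.
def parse_values(items):
    """Strip whitespace, drop blanks, and convert the rest to integers."""
    return [int(s.strip()) for s in items if s.strip()]


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    if not values:
        raise ValueError("mean() requires at least one value")
    return sum(values) / len(values)


# Step 3 (not shown): move parse_values and mean into a module,
# e.g. a hypothetical cleaning.py, and import them from other scripts.
assert mean(parse_values(raw)) == prototype_mean
```

Keeping the prototype result around while refactoring gives a cheap check that the functions still behave like the original rough code.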

Page 11: Software Engineering in Data Science

LET’S LOOK AT SOME CODE

Page 12: Software Engineering in Data Science