Data Science Orientation

Haritha Thilakarathne
Software Engineer – Data Science & Analytics
Tech One Global – Enadoc Dev Center
http://haritha.me

Data science is a multidisciplinary blend of data inference, algorithm development, and technology, used to solve analytically complex problems.

• Making decisions
• Confirming hypotheses
• Gaining insights
• Predicting the future

Big Data Manipulation & Analysis

Data Mining

Data Visualization

Detail of the distribution of artworks in the Tate collection by artists' birth dates, visualized by Florian Kräutli.

Data Collection & Preparation
• Extracting data from difficult sources
• Filling in missing values
• Removing suspicious data
• Making formats, encoding, and units consistent
• De-duplicating and matching
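
The preparation steps above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the records, the field names, the `parse_height_cm` and `prepare` helpers, and the 50 to 250 cm plausibility range are all hypothetical and not taken from the slides. De-duplication runs before the missing value is filled so that a repeated record does not skew the mean.

```python
# Hypothetical raw records: stray whitespace, inconsistent casing, mixed units,
# a missing value, an implausible value, and a duplicate.
raw_records = [
    {"name": "Alice ", "height": "170 cm"},
    {"name": "bob", "height": "1.82 m"},
    {"name": "Carol", "height": None},
    {"name": "Dave", "height": "900 cm"},
    {"name": "ALICE", "height": "170 cm"},
]


def parse_height_cm(text):
    """Convert '170 cm' or '1.82 m' to centimetres; return None if missing."""
    if text is None:
        return None
    value, unit = text.split()
    return float(value) * 100 if unit == "m" else float(value)


def prepare(records, min_cm=50.0, max_cm=250.0):
    # Make formats and units consistent.
    rows = [
        {"name": r["name"].strip().title(),
         "height_cm": parse_height_cm(r["height"])}
        for r in records
    ]
    # Remove suspicious data: values outside the assumed plausible range.
    rows = [
        r for r in rows
        if r["height_cm"] is None or min_cm <= r["height_cm"] <= max_cm
    ]
    # De-duplicate on the normalized name, keeping the first occurrence.
    seen, unique = set(), []
    for r in rows:
        if r["name"] not in seen:
            seen.add(r["name"])
            unique.append(r)
    # Fill in missing values with the mean of the known values.
    known = [r["height_cm"] for r in unique if r["height_cm"] is not None]
    mean = sum(known) / len(known)
    for r in unique:
        if r["height_cm"] is None:
            r["height_cm"] = mean
    return unique


cleaned = prepare(raw_records)
for row in cleaned:
    print(row)
```

Filling gaps with the mean is just one simple choice; the right strategy depends on the data and on how the values will be used afterwards.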

Correlation and Causation
• Correlation – Values track each other
  • Height and Shoe Size
  • Grades and Entrance Exam Scores
• Causation – One value directly influences another
  • Education Level -> Starting Salary
  • Temperature -> Cold Drink Sales
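
How strongly two values "track each other" is commonly measured with the Pearson correlation coefficient, which ranges from -1 to 1. The sketch below computes it by hand for the height and shoe size example; the numbers are made up for illustration, and `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Co-variation of x and y around their means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Variation of each variable around its own mean.
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Made-up illustrative data, not measurements from the slides.
heights_cm = [155, 160, 168, 175, 183, 190]
shoe_sizes_eu = [36, 37, 39, 41, 43, 45]

r = pearson_r(heights_cm, shoe_sizes_eu)
print(f"r = {r:.3f}")
```

A value of r close to 1 only says that the two columns rise together. It does not show that one causes the other; establishing causation takes a controlled experiment or an explicit causal argument.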

Overfitting & Underfitting

Languages, Systems, Platforms
• Spreadsheets
• Programming Languages (R/Python)
• Relational Database Management Systems
• NoSQL Systems (Cassandra/DocumentDB/MongoDB)
• Specialized Languages on scalable systems (MapReduce/Hadoop)
• Systems for data visualization (Power BI/Tableau)
• Data Processing on Cloud (Azure, Amazon Web Services)

Regression

Regression Goal: A function f applied to the training data should produce values that are, in aggregate, as close as possible to the actual outputs.
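
One common way to make "as close as possible in aggregate" concrete is to choose f so that the mean squared error over the training data is minimal. The sketch below fits a straight line by ordinary least squares. The temperature and sales figures are invented for illustration, and `fit_line` and `mean_squared_error` are hypothetical helper names; the slides do not prescribe this particular method.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept


def mean_squared_error(f, xs, ys):
    """Aggregate distance between predictions f(x) and actual outputs y."""
    return sum((f(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Made-up training data: daily temperature (°C) and cold drinks sold.
temps_c = [18, 22, 25, 28, 31, 35]
drinks_sold = [110, 135, 160, 170, 200, 225]

slope, intercept = fit_line(temps_c, drinks_sold)


def f(x):
    # The fitted regression function.
    return slope * x + intercept


# Baseline for comparison: always predict the average number of drinks sold.
average_sold = sum(drinks_sold) / len(drinks_sold)
fitted_mse = mean_squared_error(f, temps_c, drinks_sold)
baseline_mse = mean_squared_error(lambda x: average_sold, temps_c, drinks_sold)

print(f"f(x) = {slope:.2f} * x + {intercept:.2f}")
print(f"fitted MSE = {fitted_mse:.1f}, baseline MSE = {baseline_mse:.1f}")
```

The fitted line should have a much lower error than the constant baseline, which is what the regression goal asks for on the training data.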

Classification

Clustering

Neural Networks
