How to teach yourself data science
by Claudia Gold
Background: SlideRule Learning Path
SlideRule learning path: The minimum you need
to be an entry-level data scientist
1. What do I need to know to be a data scientist?
Statistics
Modeling and forecasting
Machine Learning
Data Visualization
Data Engineering
Areas of Data Science
Extract and manipulate data
Find trends and patterns among noise
Visualize and communicate results
General computer knowledge: Version control, the command line, cron, etc.
Designing sound experiments: Research methodology
Business Sense
• Ask the right questions
• Focus on actionable insights
• Gain subject expertise
Source: Adapted from Data Science Venn Diagram,
author Drew Conway, posted to Wikimedia on Sep 30, 2010.
Math & Statistics Knowledge
Hacking Skills
Subject Expertise
Traditional Research
Danger Zone!
Machine Learning
Data Science
Learning data science takes months, not hours.
(and that’s assuming you already know some math)
Data Scientist or
Data Analyst?
2. What tools do data scientists use?
The essentials: Python, R, and SQL
Data at scale: Hadoop, Hive, MapReduce
Alternatives: Julia, MongoDB, Java
Data visualization: D3, R/Shiny
How these tools might fit together
SQL database
Website data
Customer support data
Event tracking data
R or Python
Website
Analysis
Dashboards and reports
Predictive models
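One way to picture the flow above (data lands in a SQL database, Python pulls it out, and the result feeds a report) is the minimal sketch below. It is only an illustration under stated assumptions: an in-memory SQLite database stands in for the SQL database, and the `events` table, its columns, the sample rows, and the helper names `load_events`, `daily_counts`, and `conversion_rate` are all hypothetical, not taken from the slides.

```python
import sqlite3
from collections import defaultdict


def load_events(conn):
    """Create a hypothetical event-tracking table and fill it with made-up rows."""
    conn.execute("CREATE TABLE events (user_id INTEGER, event TEXT, day TEXT)")
    rows = [
        (1, "visit", "2015-01-01"),
        (1, "purchase", "2015-01-01"),
        (2, "visit", "2015-01-01"),
        (2, "visit", "2015-01-02"),
        (3, "visit", "2015-01-02"),
        (3, "purchase", "2015-01-02"),
    ]
    conn.executemany("INSERT INTO events VALUES (?, ?, ?)", rows)
    conn.commit()


def daily_counts(conn):
    """Extract step: let SQL do the grouping, then reshape the rows in Python."""
    query = (
        "SELECT day, event, COUNT(*) FROM events "
        "GROUP BY day, event ORDER BY day, event"
    )
    report = defaultdict(dict)
    for day, event, n in conn.execute(query):
        report[day][event] = n
    return dict(report)


def conversion_rate(counts):
    """Analysis step: purchases per visit for one day (0.0 when there were no visits)."""
    visits = counts.get("visit", 0)
    return counts.get("purchase", 0) / visits if visits else 0.0


conn = sqlite3.connect(":memory:")  # stand-in for the real SQL database
load_events(conn)
report = daily_counts(conn)

# Reporting step: a plain-text stand-in for a dashboard or report.
for day, counts in sorted(report.items()):
    print(day, counts, round(conversion_rate(counts), 2))
```

In practice the same shape holds with a production database and a richer analysis library; the point is only that SQL handles extraction and aggregation while Python (or R) handles the analysis and reporting.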
3. What is it really like to be a data scientist?
Source: Programmer writing code with unit tests, author Joonspoon, uploaded
to Wikipedia on November 4, 2014
Working with other teams
Data Science
Marketing
Product
Business decisions
Operations
Data products
Balancing requests
How you’ll spend your time
80%
20%
Analyzing data
Preparing to analyze data
Accuracy and completeness vs. time constraints
Communicating results to varied audiences
Advocating for data within an organization
Balancing data with other priorities
Unlimited applications
Source: Burtch Works Study, Salaries of Data Scientists, pg 14
Always more to learn
Play a vital role within an organization
We’re hiring! www.goodeggs.com/jobs