The Sexiest Job of the 21st Century-Harvard Business Review
Taken from Insight Data Fellows White Paper
So what is a Data Scientist?
Someone who knows more stats than a computer scientist and more programming than a statistician
And who can use a bunch of data to ask and answer the right questions
● Asking and answering the right questions
● Discovering what we don't know from data
● Obtaining Predictive actionable insights from data
● Creating Data products that have business impact now
● Communicating relevant business stories from data
● Building Confidence in decisions that drive business value
Some Points Taken from Carlos Somohano's Talk
What is Data Science?
Using Data to create actionable insights and products
Examples of Data Science
● Google's Search Algorithm & Advertisements
● Helping People find products, movies, news stories, music, … they want
– Amazon, ebay, Pandora, Netflix, Yahoo, ...
● Linkedin
– Helping people find a job
● Facebook
– Helping people interact with their friends
● Using past patient data to improve diagnosis or alert doctors to a patient's particular needs
– Helping people not die
Examples of Data Science
● Figuring out what your project should be about
– Is it doable? Will it have an impact?
● And actually doing it
– Implementing the code to perform the analysis that you need to do
● And communicating it
– Via presentations, plots, or text
Sound Familiar yet?
Examples of Data Science
● Separating interesting targets in an image from background stars
● Using observed luminosities in bands (or a spectrum) to determine a galaxy's age, mass, and/or metallicity
● Automated Galaxy morphological classification
● PCA of anything
● Fitting Curves
● Any application of machine learning
● Even if you don't know those words you have probably used these methods without realizing it
Day to Day Life of a Data Scientist
The Same as what you are doing now!
Coding
Learning New Techniques
Reading about the field
Working in a collaborative team
Working with Cool Data
Presenting your work, listening to talks
Group Meetings
Why I Love Astronomy
● Interesting Problems
● Collaborative Environment
● Working with Data
● Analysis
● Figuring out the Right Questions to Ask
● Creativity
● Sharing my work with others
Why I Love Data Science
● Interesting Problems
● Collaborative Environment
● Working with Data
● Analysis
● Figuring out the Right Questions to Ask
● Creativity
● Sharing my work with others
● Large Impact
● Staying in the Bay Area
● Security to Start a Family
● Money
What does a Data Science Job Look Like?
Salary: 80k – 120k + 5% stock + 5% bonus (chart labels: Without PhD, With PhD)
Location: San Francisco, SoMa, Bay Area, everywhere
Start-up
● Opportunity to grow
● Bigger Risk/Higher Reward
● More exciting
● More ownership of your contribution
● More/cross responsibilities
● Longer hours
Established
● Security
● Harder to move up to high levels
● Perks can include
– 5 months maternity/paternity leave
– Free lunch from Michelin Chef
– Bike/Car Maintenance
– Dry Cleaning
– As Many Vacation Days as You Want
Hours: 40 hr normally; can be 60 hr at crunch time
Key Differences From Astronomy
● Not working on Astronomical problems
● Astronomy is slow to progress
● Papers take 6-12 months
● Data Science projects will be days, weeks, or months
– 80/20 rule: 80% of results with 20% of the work
– Don't find the “right” answer, find a “good enough” answer
● Data Science projects can easily impact >100 million people compared to roughly 50
● More social than astronomy
● More collaborative within a team
● Work with people outside your team and your realm of expertise
– Designer, product manager, etc.
● Strong Emphasis on Machine Learning
What is Machine Learning?
A branch of artificial intelligence concerned with the construction of systems that can learn from data
If you ever fit a line you have “machine learned”!
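To make the line-fitting remark concrete, here is a minimal sketch of an ordinary least-squares fit in plain Python. The function name `fit_line` and the data points are made up for illustration; they are not from the talk.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations in x, and of cross deviations in x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up points that lie exactly on y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

slope, intercept = fit_line(xs, ys)

# "Learning" here means the fitted parameters can predict unseen x values.
prediction = slope * 4.0 + intercept
```

The program never contains the rule "y = 2x + 1"; it recovers the slope and intercept from the examples, which is the sense in which fitting a line already counts as learning from data.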
Being Lazy. Letting the computer write your programs for you
What is Machine Learning?
● Supervised Learning
● Regression
– Neural Networks
– Least Squares
● Classification
– Support Vector Machines
– Rule Induction
– Bayesian Classifiers
– Logistic Regression
● Useful Methods
– Dimensionality Reduction
– Greedy Search / Gradient Descent / Optimization
– Regularization (aka setting a prior)
– Model Ensembles / Bagging / Stacking
– Neural Networks
Given examples of (x, f(x)),
predict the function f(x) for new examples of x.
x and f(x) can be of any dimension.
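One of the simplest ways to turn examples of (x, f(x)) into predictions is nearest-neighbor lookup: answer with the f(x) of the closest known x. The sketch below assumes Euclidean distance and uses made-up two-dimensional points; the function name `nearest_neighbor_predict` is hypothetical.

```python
import math


def nearest_neighbor_predict(examples, x_new):
    """Return f(x) of the training example whose x is closest to x_new.

    examples is a list of (x, fx) pairs, where x is a tuple of numbers
    of any length and fx can be any value (a label or a number).
    """
    closest_x, closest_fx = min(
        examples, key=lambda pair: math.dist(pair[0], x_new)
    )
    return closest_fx


# Made-up training examples: two groups of 2-D points with labels.
examples = [
    ((0.0, 0.0), "A"),
    ((1.0, 0.0), "A"),
    ((5.0, 5.0), "B"),
    ((6.0, 5.0), "B"),
]

label = nearest_neighbor_predict(examples, (4.5, 4.0))
```

Because `math.dist` accepts points of any length, the same function works unchanged when x has more dimensions, which matches the point that x and f(x) can be of any dimension.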
Symbolic Regression with Genetic Algorithms
Machine Learning
Labeled examples:
Male, 6: Poor
Female, 24: Rich
Male, 26: Poor
Male, 36: Rich
Female, 66: Poor
Male, 6: Poor
Female, 12: Rich
Female, 36: Poor
Male, 36: Poor
New example:
Male, 18: Rich ???
Male, 18: Rich
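As a rough illustration of how a classifier could label the new example, here is a small decision tree grown on the rows from the slide. This is only a sketch: the slides do not say which method produced the answer, the numeric column is not named in the slide (it is called `number` here), and the Gini-based splitting rule is an assumption.

```python
from collections import Counter

# Rows transcribed from the slide: (gender, number, label).
# The slide does not name the numeric column.
ROWS = [
    ("Male", 6, "Poor"), ("Female", 24, "Rich"), ("Male", 26, "Poor"),
    ("Male", 36, "Rich"), ("Female", 66, "Poor"), ("Male", 6, "Poor"),
    ("Female", 12, "Rich"), ("Female", 36, "Poor"), ("Male", 36, "Poor"),
]


def gini(rows):
    """Gini impurity of the labels in rows (0 means all labels agree)."""
    counts = Counter(row[2] for row in rows)
    n = len(rows)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def candidate_splits(rows):
    """Yield (description, test) pairs; rows where test is True go left."""
    yield ("gender == Male", lambda r: r[0] == "Male")
    for t in sorted({r[1] for r in rows}):
        yield (f"number <= {t}", lambda r, t=t: r[1] <= t)


def build_tree(rows):
    labels = Counter(r[2] for r in rows)
    best = None
    for name, test in candidate_splits(rows):
        left = [r for r in rows if test(r)]
        right = [r for r in rows if not test(r)]
        if not left or not right:
            continue  # a split must actually separate the rows
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if best is None or score < best[0]:
            best = (score, name, test, left, right)
    if len(labels) == 1 or best is None:
        return labels.most_common(1)[0][0]  # leaf: majority label
    _, name, test, left, right = best
    return (name, test, build_tree(left), build_tree(right))


def predict(tree, row):
    """Walk from the root to a leaf; leaves are plain label strings."""
    while not isinstance(tree, str):
        _, test, left, right = tree
        tree = left if test(row) else right
    return tree


tree = build_tree(ROWS)
guess = predict(tree, ("Male", 18))
```

The tree is never told a rule such as "young and not a child means Rich"; it derives its thresholds from the nine labeled rows. With so few rows, and two conflicting ones (Male, 36), the result should be read as a toy, not as evidence about the data.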
We do the Same thing in Astronomy!
Machine Learning
Small, Red, 25
Small, Red, 24
Big, Red, 17
Small, Blue, 14
Big, Blue, 17
Small, Red, 22
Big, Red, 20
Small, Red, 21
Small, Red, 22
Small, Blue, 15
Small, Blue, 15
What is Machine Learning?
● Supervised Learning
● Regression
– Symbolic
– Least Squares
● Classification
– Support Vector Machines
– Rule Induction
– Decision Trees
– Bayesian Classifiers
– Logistic Regression
● Natural Language Processing
● Instance Based Learning
– K-nearest Neighbors
– Recommender Systems
● Unsupervised Learning
● Clustering
– K-means clustering
– Social Network Analysis
● Useful Methods
– Dimensionality Reduction; PCA
– Greedy Search / Gradient Descent / Optimization
– Regularization (aka setting a prior)
– Model Ensembles / Bagging / Stacking
– Neural Networks
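For the unsupervised side, K-means clustering can be sketched in a few lines. The version below is plain Lloyd's algorithm with caller-supplied starting centers; the points and starting centers are made up, and real uses would normally pick starting centers at random and run several restarts.

```python
import math


def kmeans(points, centers, n_iter=10):
    """Plain k-means (Lloyd's algorithm) starting from the given centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(
                range(len(centers)), key=lambda i: math.dist(p, centers[i])
            )
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its points.
        new_centers = []
        for center, members in zip(centers, clusters):
            if members:
                new_centers.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centers.append(center)  # empty cluster: leave center alone
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    return centers, clusters


# Made-up points forming two obvious groups, with no labels attached.
points = [
    (1.0, 1.0), (1.0, 2.0), (2.0, 1.0),
    (8.0, 8.0), (9.0, 8.0), (8.0, 9.0),
]

centers, clusters = kmeans(points, centers=[(0.0, 0.0), (10.0, 10.0)])
```

Unlike the supervised examples, no f(x) is given here: the algorithm only sees the x values and groups them by proximity.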
How to Prepare for Data Science
● Don't Use IDL or Fortran!!!
– Python, R, Java instead
● Slug's Guide to Getting a job
– Email me for access
● Coursera
– Machine Learning Courses
– Introduction to Data Science
● Data Science Meet-ups
● Kaggle
● AMS Designated Emphasis Requirements
● AMS 203: Introduction to Probability Theory
● AMS 206: Classical Bayesian Inference
● AMS 207: Intermediate Bayesian Statistics
● AMS 256: Linear Statistical Models
● One Elective From
● 205B: Intermediate Classical Inference
● 206B: Intermediate Bayesian Inference
● 221: Bayesian Decision Theory
● 223: Time Series Analysis
● 225: Multivariate Statistical Methods
● 241: Bayesian Nonparametric Methods
● 245: Spatial Statistics
● 274: Generalized Linear Models
Theory etc.
- 261: Probability Theory with Markov Chains
- 263: Stochastic Processes
- 291: Advanced Topics in Bayesian Statistics
What Do You Want Next?
● Resume Workshop
● Machine Learning Seminar
Email Me for Access to [email protected]
or for any questions