Upload
others
View
2
Download
0
Embed Size (px)
Citation preview
STATISTICAL MODELING: THETWO CULTURES
Based on Leo Breiman's paper
Christoph MolnarDepartment of Statistics,LMU Munich
..
STATISTICAL MODELING: THETWO CULTURES
Based on Leo Breiman's paper
Christoph MolnarDepartment of Statistics,LMU Munich
..20
14-01-20
Statistical Modeling: The Two Cultures
Abstract:This presentation compares two cultures of statisticalmodeling: the data modeling culture, which assumes astochastic process that produced the data. This culture isassociated with traditional statistics. The other culture is calledalgorithmic modeling culture, which can be reduced tooptimization of a loss function with an algorithm. This culture isassociated with Machine Learning. It is argued to usealgorithmic modeling more often in statistics.
Data Modeling Algorithmic Modeling Principles Experiences Conclusions
OUTLINE
1. Statistics: Data Modeling Culture2. Machine Learning: Algorithmic Modeling Culture3. Statistical Learning Principles4. Personal Experience5. Summary
Content heavily based on: ``Statistical Modeling: The two cultures''from Leo Breiman [1]
1 / 19
..
OUTLINE
1. Statistics: Data Modeling Culture2. Machine Learning: Algorithmic Modeling Culture3. Statistical Learning Principles4. Personal Experience5. Summary
Content heavily based on: ``Statistical Modeling: The two cultures''from Leo Breiman [1]
..20
14-01-20
Statistical Modeling: The Two Cultures
Outline
This presentation is based on ``Statistical Modeling: The twocultures'' from Leo Breiman [1].The first segment introduces the data modeling culture andanalogously the second segment explains the algorithmicmodeling culture together with the presentation of threealgorithms. The part about statistical learning principalspresents aspects which help to compare both of cultures.Personal experiences in both cultures are addressed. Theconclusion summarizes the message of the paper [1].
Data Modeling Algorithmic Modeling Principles Experiences Conclusions
WORK OF A DATA ANALYST
→ Predict→ Reveal associations→ Munge data, design experiments, visualize data, …
2 / 19
..
WORK OF A DATA ANALYST
→ Predict→ Reveal associations→ Munge data, design experiments, visualize data, …
..20
14-01-20
Statistical Modeling: The Two Cultures
Data Modeling
Work of a Data Analyst
The work of a data analyst is very diverse. This presentationfocuses on the modeling of data, which can be reduced to twotargets: Learn a model to predict the outcome for newcovariates and get a better understanding about therelationship between covariates and outcome.
Data Modeling Algorithmic Modeling Principles Experiences Conclusions
SIMPLIFIED WORLDVIEW
..nature. x.y ..
3 / 19
..
SIMPLIFIED WORLDVIEW
..nature. x.y ..
..20
14-01-20
Statistical Modeling: The Two Cultures
Data Modeling
Simplified worldview
In a strongly simplified world an arbitrary outcome y isproduced by the nature given the covariates x. The knowledgeabout the natures true mechanisms range between entirelyunknown and established (scientific) explanations of themechanism. One example: Outcome y is the rent forapartments and covariates x are size, number of bathroomsand location.
Data Modeling Algorithmic Modeling Principles Experiences Conclusions
DATA MODELING CULTURE
..
LogisticRegression,Cox Model,
GEE,…
. x.y ..
Find a stochastic model of the data-generating process:y = f(x, parameters, random error)
4 / 19
..
DATA MODELING CULTURE
..
LogisticRegression,Cox Model,
GEE,…
. x.y ..
Find a stochastic model of the data-generating process:y = f(x, parameters, random error)
..20
14-01-20
Statistical Modeling: The Two Cultures
Data Modeling
Data Modeling Culture
The direct modeling of the mechanism in the ``box'' is labeled»Data Modeling Culture« by Leo Breiman. In this culture astochastic model for the data- generating process is assumed.A common formulation of these model is: y is a function of xwith corresponding weights and a random error. For example:Given the covariates size, number of bathrooms and location,the rent of apartments is normal distributed.
Data Modeling Algorithmic Modeling Principles Experiences Conclusions
TYPICAL ASSUMPTIONS AND RESTRICTIONS
→ Specific stochastic model that generated the data→ Distribution of residuals→ Linearity (e.g. linear predictor)→ Manual specification of interactions
5 / 19
Data Modeling Algorithmic Modeling Principles Experiences Conclusions
PROBLEMS
→ Conclusions about model, not about nature→ Assumptions often violated→ Often no model evaluation→ ⇒ can lead to irrelevant theory and questionable
statistical conclusions→ Focus not on prediction→ Data models fail in areas like image and speech
recognition
6 / 19
Data Modeling Algorithmic Modeling Principles Experiences Conclusions
MACHINE LEARNING
..unknown. x.y ..
algorithm
....
Find a function f(X) that minimizes the loss: L(Y, f(X))
7 / 19
..
MACHINE LEARNING
..unknown. x.y ..
algorithm
....
Find a function f(X) that minimizes the loss: L(Y, f(X))
..20
14-01-20
Statistical Modeling: The Two Cultures
Algorithmic Modeling
Machine Learning
In the »Algorithmic Modeling Culture«, the true mechanism istreated as unknown. It is not the target to find the truedata-generating mechanism but to use an algorithm thatimitates the mechanism as good as possible. Modeling isreduced to a mere problem of function optimization: Given thecovariates x, outcome y and a loss function find a function f(x)which minimizes the loss for the prediction of the outcome. Thisculture is lived in the machine learning area.Summary: The data modeling culture tries to find the truedata-generating mechanism, the algorithmic modeling culturetries to imitate the true mechanism as good as possible.
Data Modeling Algorithmic Modeling Principles Experiences Conclusions
ALGORITHM IN MACHINE LEARNING
→ Boosting→ Support Vector Machines→ Artificial neural networks→ Random Forests→ Hidden Markov→ Bayes-Nets→ … 1
1Details and more algorithms in ``Elements of statistical learning''[2]8 / 19
..
ALGORITHM IN MACHINE LEARNING
→ Boosting→ Support Vector Machines→ Artificial neural networks→ Random Forests→ Hidden Markov→ Bayes-Nets→ … 1
1Details and more algorithms in ``Elements of statistical learning''[2]
..20
14-01-20
Statistical Modeling: The Two Cultures
Algorithmic Modeling
Algorithm in Machine Learning
The algorithms used in machine learning are motivateddifferently. Three algorithms are presented in short.
Data Modeling Algorithmic Modeling Principles Experiences Conclusions
ARTIFICIAL NEURAL NETWORKS
2
2http://commons.wikimedia.org/wiki/File:Mouse_cingulate_cortex_neurons.jpghttp://commons.wikimedia.org/wiki/File:Neural_network.svg
9 / 19
..
ARTIFICIAL NEURAL NETWORKS
2
2http://commons.wikimedia.org/wiki/File:Mouse_cingulate_cortex_neurons.jpghttp://commons.wikimedia.org/wiki/File:Neural_network.svg
..20
14-01-20
Statistical Modeling: The Two Cultures
Algorithmic Modeling
Artificial neural networks
Artificial neural networks are used in classification andregression. They are inspired by the brain, which consists of anetwork of brain cells (neurons). Mathematically artificialneural networks are a concatenation of weighted functions. Anexemplary application is image processing.
Data Modeling Algorithmic Modeling Principles Experiences Conclusions
SUPPORT VECTOR MACHINES
3
3http://commons.wikimedia.org/wiki/File:Svm_10_perceptron.JPG 10 / 19
..
SUPPORT VECTOR MACHINES
3
3http://commons.wikimedia.org/wiki/File:Svm_10_perceptron.JPG
..20
14-01-20
Statistical Modeling: The Two Cultures
Algorithmic Modeling
Support Vector Machines
Support Vector Machines (SVM) were originally a classificationmethod (regression is also possible). SVMs try to draw aborder between two classes in the covariate space. Thedistance between the border and the class points ismaximized. They use a mathematical trick to implicitly projectthe covariates in a space with higher dimensions (yes, it soundsa bit crazy) in order to achieve class separation. Textclassification is an exemplary usage.
Data Modeling Algorithmic Modeling Principles Experiences Conclusions
RANDOM FORESTS
4
4http://openclipart.org/detail/175304/forest-by-z-175304
11 / 19
..
RANDOM FORESTS
4
4http://openclipart.org/detail/175304/forest-by-z-175304
..20
14-01-20
Statistical Modeling: The Two Cultures
Algorithmic Modeling
Random Forests
Random Forests™(invented by Leo Breiman) are used forregression and classification. A Random Forest consists ofmany decision trees. Two random mechanisms are used totrain different trees on the data. Results from all trees areaveraged for the prediction.
Data Modeling Algorithmic Modeling Principles Experiences Conclusions
RASHOMON EFFECT
(Often) Many different models describe a situationequally accurate.
12 / 19
..
RASHOMON EFFECT
(Often) Many different models describe a situationequally accurate.
..20
14-01-20
Statistical Modeling: The Two Cultures
Principles
Rashomon Effect
Rashomon is a Japanese movie in which four witnesses telldifferent versions of a crime. All versions account for the factsbut they contradict each other. In terms of statistical learningthis means, that often different models (e.g. same y butdifferent covariates) can be equally accurate. Each model hasa different interpretation which makes it difficult to find thetrue mechanism in the data modeling cultures. In thealgorithmic modeling culture the Rashomon effect is exploitedby aggregating over many models. Random forests use thiseffect by aggregating over many trees. It is also common toaverage the predictions of different algorithms.
Data Modeling Algorithmic Modeling Principles Experiences Conclusions
DIMENSIONALITY OF THE DATA
→ The higher the dimensionality of the data (#covariates) the more difficult is the separation ofsignal and noise
→ Common practice in data modeling: variable selection(by expert selection or data driven) and reduction ofdimensionality (e.g. PCA)
→ Common practice in algorithmic modeling:Engineering of new features (covariates) to increasepredictive accuracy; algorithms robust for manycovariates
13 / 19
Data Modeling Algorithmic Modeling Principles Experiences Conclusions
PREDICTION VS. INTERPRETATION
.. Interpretability.
Predictive accuracy
.
Tree
.
Random Forest
.
Logist. Regression
.
Neuronal networks
.
…
.
…
14 / 19
..
PREDICTION VS. INTERPRETATION
.. Interpretability.
Predictive accuracy
.
Tree
.
Random Forest
.
Logist. Regression
.
Neuronal networks
.
…
.
…
..20
14-01-20
Statistical Modeling: The Two Cultures
Principles
Prediction vs. Interpretation
There is a trade-off between interpretability and predictiveaccuracy: the models that are good in prediction are oftencomplex and models that are easy to interpret are often badpredictors. See for example trees and Random Forests: Asingle decision tree is very intuitive and easy to read fornon-professionals, but they are unstable and give weakpredictions. A complex aggregation of decision trees (RandomForest) has an excellent prediction accuracy, but it isimpossible to interpret the model structure.
Data Modeling Algorithmic Modeling Principles Experiences Conclusions
GOODNESS OF MODEL
→ Data modeling culture: Goodness of fit often basedon model assumptions (e.g. AIC) and calculated ontraining data.
→ Algorithmic modeling culture: Evaluation of predictiveaccuracy with an extra test set or cross validation.
How good is a statistical model if the predictive accuracyis weak? Is it legit to interpret parameters and p-values?
15 / 19
Data Modeling Algorithmic Modeling Principles Experiences Conclusions
STATISTICAL CONSULTING
Stereotypical user ...
→ are e.g. veterinarians, linguists, biologists, …→ crave p-values→ want interpretability→ ignore model diagnosis
16 / 19
..
STATISTICAL CONSULTING
Stereotypical user ...
→ are e.g. veterinarians, linguists, biologists, …→ crave p-values→ want interpretability→ ignore model diagnosis
..20
14-01-20
Statistical Modeling: The Two Cultures
Experiences
Statistical Consulting
From my experience in the statistical consulting unit of ouruniversity, most user want to test their scientific hypothesiswith models. They want models which are easy to interpretregarding their questions. Thus it is more important to have amodel that gives parameters associated with covariates andp-values than to have a model that predicts the data well.
Data Modeling Algorithmic Modeling Principles Experiences Conclusions
KAGGLE
Algorithms of winners on kaggle, a platform forprediction challenges:
→ Job Salary Prediction: »I used deep neural networks«→ Observing Dark Worlds: »Bayesian analysis provided
the winning recipe for solving this problem«→ Give Me Some Credit: »In the end we only used five
supervised learning methods: a random forest ofclassification trees, a random forest of regressiontrees, a classification tree boosting algorithm, aregression tree boosting algorithm, and a neuralnetwork.«
17 / 19
..
KAGGLE
Algorithms of winners on kaggle, a platform forprediction challenges:
→ Job Salary Prediction: »I used deep neural networks«→ Observing Dark Worlds: »Bayesian analysis provided
the winning recipe for solving this problem«→ Give Me Some Credit: »In the end we only used five
supervised learning methods: a random forest ofclassification trees, a random forest of regressiontrees, a classification tree boosting algorithm, aregression tree boosting algorithm, and a neuralnetwork.«
..20
14-01-20
Statistical Modeling: The Two Cultures
Experiences
kaggle
Kaggle (http://www.kaggle.com) is a platform for predictionchallenges. Data are provided and the participant with the bestprediction on a test set wins the challenge. To generateinsights about the mechanisms in the data is secondarybecause the prediction is all that counts to win. That's why thealgorithmic modeling culture is superior in this field.
Data Modeling Algorithmic Modeling Principles Experiences Conclusions
CONCLUSION
Data analysts should:
→ Use predictive accuracy to evaluate models→ Seek the best model→ Add Machine Learning to their toolbox
18 / 19
Data Modeling Algorithmic Modeling Principles Experiences Conclusions
FURTHER LITERATURE
L. BreimanStatistical modeling: The two cultures (with commentsand a rejoinder by the author)Institute of Mathematical Statistics, 2001.
T. Hastie, R. Tibshirani and J. FriedmanThe elements of statistical learningSpringer New York, 2001
19 / 19