Common Problems in Hyperparameter Optimization Alexandra Johnson @alexandraj777


Page 1: Common Problems in Hyperparameter Optimization

Common Problems in Hyperparameter Optimization

Alexandra Johnson (@alexandraj777)

Page 2: Common Problems in Hyperparameter Optimization

What are Hyperparameters?

Page 3: Common Problems in Hyperparameter Optimization

Hyperparameter Optimization

● Also called hyperparameter tuning, model tuning, or model selection
● Finding "the best" values for the hyperparameters of your model
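To make the definition concrete, here is a minimal sketch (the objective function, search space, and values are all illustrative assumptions, not from the talk): hyperparameters are settings fixed before training, and optimization means searching over them for the best validation score.

```python
import itertools

# Hypothetical search space: hyperparameters are settings chosen before
# training (e.g. learning rate, tree depth), unlike parameters learned from data.
search_space = {
    "learning_rate": [0.01, 0.1, 0.3],
    "max_depth": [3, 5, 7],
}

def validation_score(config):
    # Stand-in for "train a model with this config, return validation accuracy".
    return 1.0 - abs(config["learning_rate"] - 0.1) - 0.01 * abs(config["max_depth"] - 5)

# Hyperparameter optimization: find the configuration with the best score.
configs = [dict(zip(search_space, values))
           for values in itertools.product(*search_space.values())]
best = max(configs, key=validation_score)
```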

Page 4: Common Problems in Hyperparameter Optimization

Better Performance

● +315% accuracy boost for TensorFlow
● +49% accuracy boost for xgboost
● -41% error reduction for recommender system

Page 5: Common Problems in Hyperparameter Optimization

#1 Trusting the Defaults

Page 6: Common Problems in Hyperparameter Optimization

Default Values

● Default values are an implicit choice
● Defaults are not always appropriate for your model
● You may build a classifier that looks like this:
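One way to make that implicit choice visible is to surface a function's defaults before trusting them. The trainer below is a toy (its names and default values are invented for illustration, not taken from any specific library), but the inspection pattern works on real APIs too.

```python
import inspect

# Toy trainer whose keyword defaults mirror how libraries ship implicit choices.
# (Names and values here are illustrative, not from any specific library.)
def train_classifier(data, learning_rate=0.1, n_estimators=100, max_depth=3):
    ...

# Defaults are a choice someone else made: list them rather than trusting them blindly.
defaults = {name: p.default
            for name, p in inspect.signature(train_classifier).parameters.items()
            if p.default is not inspect.Parameter.empty}
print(defaults)  # chosen by the library author, not for your data
```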

Page 7: Common Problems in Hyperparameter Optimization

#2 Using the Wrong Metric

Page 8: Common Problems in Hyperparameter Optimization

Choosing a Metric

● Balance long-term and short-term goals
● Question underlying assumptions
● Example from Microsoft

Page 9: Common Problems in Hyperparameter Optimization

Choose Multiple Metrics

● Composite metric
● Multi-metric optimization
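A composite metric folds several goals into one number to optimize. A minimal sketch, with an arbitrary, hand-picked weight (the tradeoff between accuracy and latency is an assumption for illustration):

```python
# Illustrative composite metric: reward accuracy, penalize latency.
# The weight is a hand-picked assumption; tune it to your priorities.
def composite_metric(accuracy, latency_ms, weight_latency=0.001):
    return accuracy - weight_latency * latency_ms

print(composite_metric(0.92, 50))  # approximately 0.87
```

Multi-metric optimization instead keeps the metrics separate and explores the tradeoff frontier, avoiding the need to commit to a single weighting up front.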

Page 10: Common Problems in Hyperparameter Optimization

#3 Overfitting

Page 11: Common Problems in Hyperparameter Optimization

Metric Generalization

● Cross validation
● Backtesting
● Regularization terms
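Of these, cross validation is the most mechanical to sketch. A minimal pure-Python k-fold loop over a toy "model" (the data and scoring function are invented for illustration): averaging the metric over held-out folds gives a less optimistic estimate than scoring on the training data itself.

```python
import random

# Minimal k-fold cross validation: fit on k-1 folds, score on the held-out fold,
# average the scores across folds.
def k_fold_scores(data, k, fit, score):
    random.Random(0).shuffle(data)
    folds = [data[i::k] for i in range(k)]
    scores = []
    for i in range(k):
        held_out = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        model = fit(train)
        scores.append(score(model, held_out))
    return sum(scores) / k

# Toy example: the "model" is just the training mean, scored by negative MSE.
data = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
fit = lambda train: sum(train) / len(train)
score = lambda m, held: -sum((x - m) ** 2 for x in held) / len(held)
cv_estimate = k_fold_scores(data, 3, fit, score)
```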


Page 14: Common Problems in Hyperparameter Optimization

#4 Too Few Hyperparameters

Page 15: Common Problems in Hyperparameter Optimization

Optimize all Parameters at Once

Page 16: Common Problems in Hyperparameter Optimization

Include Feature Parameters


Page 18: Common Problems in Hyperparameter Optimization

Example: xgboost

● The optimized model always performed better with tuned feature parameters
● This held no matter which optimization method was used

Page 19: Common Problems in Hyperparameter Optimization

#5 Hand Tuning

Page 20: Common Problems in Hyperparameter Optimization

What is an Optimization Method?

Page 21: Common Problems in Hyperparameter Optimization

You are not an Optimization Method

● Hand tuning is time-consuming and expensive
● Algorithms can quickly and cheaply beat expert tuning

Page 22: Common Problems in Hyperparameter Optimization

Use an Algorithm

● Grid search
● Random search
● Bayesian optimization

Page 23: Common Problems in Hyperparameter Optimization

#6 Grid Search

Page 24: Common Problems in Hyperparameter Optimization

No Grid Search

Hyperparameters    Model Evaluations
2                  100
3                  1,000
4                  10,000
5                  100,000

(Grid of 10 candidate values per hyperparameter.)
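The counts in the table follow directly from exhaustively crossing the candidate values: with 10 values per hyperparameter, the grid has 10^n points. A quick sketch:

```python
from itertools import product

# Grid search cost grows exponentially with the number of hyperparameters:
# 10 candidate values each means 10 ** n_params model evaluations.
VALUES_PER_PARAM = 10
for n_params in range(2, 6):
    grid = product(*([range(VALUES_PER_PARAM)] * n_params))
    n_evals = sum(1 for _ in grid)
    print(f"{n_params} hyperparameters -> {n_evals:,} evaluations")
```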

Page 25: Common Problems in Hyperparameter Optimization

#7 Random Search

Page 26: Common Problems in Hyperparameter Optimization

Random Search

● Theoretically more effective than grid search
● Large variance in results
● No intelligence
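The variance point is easy to demonstrate: because every sample is independent, re-running random search with a different seed can land on a noticeably different best value. A toy sketch (the objective and trial budget are invented for illustration):

```python
import random

# Toy objective with a known optimum at x = 0.3 (purely illustrative).
def objective(x):
    return -(x - 0.3) ** 2

# Random search: sample uniformly, keep the best. No information from past
# trials guides future ones ("no intelligence").
def random_search(n_trials, seed):
    rng = random.Random(seed)
    return max(objective(rng.uniform(0, 1)) for _ in range(n_trials))

results = [random_search(20, seed) for seed in range(5)]
spread = max(results) - min(results)  # nonzero spread = run-to-run variance
```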

Page 27: Common Problems in Hyperparameter Optimization

Use an Intelligent Method

● Genetic algorithms
● Bayesian optimization
● Particle-based methods
● Convex optimizers
● Simulated annealing

To name a few...

Page 28: Common Problems in Hyperparameter Optimization

SigOpt: Bayesian Optimization Service

Three API calls:

1. Define hyperparameters
2. Receive suggested hyperparameters
3. Report observed performance
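The three-call loop can be sketched with a stand-in optimizer. This is NOT the real SigOpt client or API; the class below is a hypothetical random-sampling stand-in that only mirrors the define / suggest / report loop shape described on the slide.

```python
import random

# Hypothetical stand-in optimizer mirroring the define/suggest/report loop.
class ToyOptimizer:
    def __init__(self, bounds, seed=0):       # 1. define hyperparameters
        self.bounds = bounds
        self.rng = random.Random(seed)
        self.history = []

    def suggest(self):                        # 2. receive suggested values
        return {name: self.rng.uniform(lo, hi)
                for name, (lo, hi) in self.bounds.items()}

    def report(self, suggestion, value):      # 3. report observed performance
        self.history.append((suggestion, value))

opt = ToyOptimizer({"learning_rate": (0.001, 0.1)})
for _ in range(10):
    params = opt.suggest()
    value = -(params["learning_rate"] - 0.05) ** 2  # stand-in for train + evaluate
    opt.report(params, value)

best_params, best_value = max(opt.history, key=lambda h: h[1])
```

A real Bayesian optimization service would use the reported history to make each suggestion more informed than the last, rather than sampling at random.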

Page 29: Common Problems in Hyperparameter Optimization

Thank You!

Page 30: Common Problems in Hyperparameter Optimization

References - by Section

Intro
● Ian Dewancker. SigOpt for ML: TensorFlow ConvNets on a Budget with Bayesian Optimization.
● Ian Dewancker. SigOpt for ML: Unsupervised Learning with Even Less Supervision Using Bayesian Optimization.
● Ian Dewancker. SigOpt for ML: Bayesian Optimization for Collaborative Filtering with MLlib.

#1 Trusting the Defaults
● Keras recurrent layers documentation

#2 Using the Wrong Metric
● Ron Kohavi et al. Trustworthy Online Controlled Experiments: Five Puzzling Outcomes Explained.
● Xavier Amatriain. 10 Lessons Learned from Building ML Systems [Video at 19:03].
● Image from PhD Comics.
● See also: SigOpt in Depth: Intro to Multicriteria Optimization.

#4 Too Few Hyperparameters
● Image from TensorFlow Playground.
● Ian Dewancker. SigOpt for ML: Unsupervised Learning with Even Less Supervision Using Bayesian Optimization.

#5 Hand Tuning
● On algorithms beating experts: Scott Clark, Ian Dewancker, and Sathish Nagappan. Deep Neural Network Optimization with SigOpt and Nervana Cloud.

#6 Grid Search
● NoGridSearch.com

Page 31: Common Problems in Hyperparameter Optimization

References - by Section (continued)

#7 Random Search
● James Bergstra and Yoshua Bengio. Random Search for Hyper-Parameter Optimization.
● Ian Dewancker, Michael McCourt, Scott Clark, Patrick Hayes, Alexandra Johnson, George Ke. A Stratified Analysis of Bayesian Optimization Methods.

Learn More
● blog.sigopt.com
● sigopt.com/research