
Empirical Hardness Models: Methods and Uses

Lars Kotthoff
University of British Columbia

larsko@cs.ubc.ca

1QBit, 18 October 2016

1 / 44

Big Picture

▷ advance the state of the art through meta-algorithmic techniques
▷ rather than inventing new things, use existing things better
▷ “supplement” our understanding of how algorithms work
▷ Algorithm Selection: choose the best algorithm to solve a given problem
▷ Algorithm Configuration: find the best parameterisation
▷ at the core of both: empirical performance models

2 / 44


Prominent Application

Fréchette, Alexandre, Neil Newman, and Kevin Leyton-Brown. “Solving the Station Repacking Problem.” In Association for the Advancement of Artificial Intelligence (AAAI), 2016.

3 / 44

Performance Differences

[Scatter plot: per-instance runtimes, 0.1–1000 s on log–log axes, Virtual Best SAT solver vs. Virtual Best CSP solver]

Hurley, Barry, Lars Kotthoff, Yuri Malitsky, and Barry O’Sullivan. “Proteus: A Hierarchical Portfolio of Solvers and Transformations.” In CPAIOR, 2014.

4 / 44

Leveraging the Differences

Xu, Lin, Frank Hutter, Holger H. Hoos, and Kevin Leyton-Brown. “SATzilla: Portfolio-Based Algorithm Selection for SAT.” J. Artif. Intell. Res. (JAIR) 32 (2008): 565–606.

5 / 44

Performance Improvements

Configuration of a SAT Solver for Verification [Hutter et al, 2007]

Ran FocusedILS, 2 days × 10 machines

– On a training set from each benchmark

Compared to manually-engineered default

– 1 week of performance tuning

– Competitive with the state of the art

– Comparison on unseen test instances

[Two scatter plots, runtimes 10⁻²–10⁴ s on log–log axes: SPEAR with original default vs. SPEAR optimized for IBM-BMC, and SPEAR with original default vs. SPEAR optimized for SWV]

4.5-fold speedup on hardware verification

500-fold speedup; won category QF BV in the 2007 SMT competition

Hutter, Frank, Domagoj Babic, Holger H. Hoos, and Alan J. Hu. “Boosting Verification by Automatic Tuning of Decision Procedures.” In FMCAD ’07: Proceedings of the Formal Methods in Computer Aided Design, 27–34. Washington, DC, USA: IEEE Computer Society, 2007.

6 / 44

Algorithm Selection

Given a problem, choose the best algorithm to solve it.

7 / 44

Original Model

x ∈ P: problem space
A ∈ A: algorithm space
p ∈ Rⁿ: performance measure space
S(x): selection mapping
p(A, x): performance mapping
‖p‖: norm mapping (algorithm performance)
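Stated compactly (a hedged restatement of the diagram above; Rice leaves the choice of norm open, and for runtime one would minimise instead of maximise):

```latex
% Rice's algorithm selection problem: for every problem instance x,
% choose the algorithm whose (norm of the) performance vector is best.
\[
  S^{*}(x) \in \operatorname*{arg\,max}_{A \in \mathcal{A}}
  \big\lVert p(A, x) \big\rVert
  \qquad \text{for all } x \in \mathcal{P}
\]
```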

Rice, John R. “The Algorithm Selection Problem.” Advances in Computers 15 (1976): 65–118.

8 / 44

Contemporary Model

x ∈ P: problem space
A ∈ A: algorithm space
S: selection model, with prediction S(x) = A
p ∈ Rⁿ: performance measure
p(A, x) for all x ∈ P′ ⊂ P, A ∈ A: training data
feature extraction: computed from the problem instance, input to the selection model
feedback: observed performance used to update the model

9 / 44

Portfolios

▷ instead of a single algorithm, use several complementary algorithms
▷ idea from Economics – minimise risk by spreading it out across several securities
▷ same for computational problems – minimise risk of an algorithm performing poorly
▷ in practice often constructed from competition winners (see the sketch below)
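A minimal sketch of why complementarity pays off, assuming a runtime table with one row per instance and one column per solver (solver names and numbers below are made up). The virtual best solver picks the fastest portfolio member on each instance; the gap to the best single solver is the potential of the portfolio.

```python
import pandas as pd

# Hypothetical runtimes in seconds: rows are instances, columns are solvers.
runtimes = pd.DataFrame(
    {"solver_a": [0.4, 120.0, 3.1], "solver_b": [50.0, 0.9, 2.8]},
    index=["inst1", "inst2", "inst3"],
)

single_best = runtimes.sum().idxmin()       # best single solver over all instances
vbs_total = runtimes.min(axis=1).sum()      # virtual best solver: per-instance minimum

print(f"single best solver: {single_best}, total {runtimes[single_best].sum():.1f}s")
print(f"virtual best solver, total {vbs_total:.1f}s")
```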

Huberman, Bernardo A., Rajan M. Lukose, and Tad Hogg. “An Economics Approach to Hard Computational Problems.” Science 275, no. 5296 (1997): 51–54. doi:10.1126/science.275.5296.51.

10 / 44

Key Components of an Algorithm Selection System

▷ feature extraction
▷ performance model
▷ prediction-based selector/scheduler

optional:
▷ presolver
▷ secondary/hierarchical models and predictors (e.g. for feature extraction time)

11 / 44

Features

▷ relate properties of problem instances to performance
▷ relatively cheap to compute
▷ specified by domain expert
▷ syntactic – analyse instance description
▷ probing – run algorithm for short time
▷ dynamic – instance changes while algorithm is running

12 / 44

Syntactic Features

▷ number of variables, number of clauses/constraints/…
▷ ratios
▷ order of variables/values
▷ clause/constraints–variable graph or variable graph:
  ▷ node degrees
  ▷ connectivity
  ▷ clustering coefficient
  ▷ …
▷ …
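As an illustration, a small sketch that computes a handful of syntactic features from a SAT instance in DIMACS CNF format (the file name is hypothetical, one clause per line is assumed, and real feature extractors such as SATzilla's compute many more features):

```python
from statistics import mean

def syntactic_features(path):
    """Cheap syntactic features from a DIMACS CNF file (one clause per line)."""
    num_vars = num_clauses = 0
    var_degree = {}                # variable -> number of clauses it occurs in
    clause_lengths = []
    with open(path) as f:
        for line in f:
            if not line.strip() or line.startswith(("c", "%")):
                continue                         # skip comments and blank lines
            if line.startswith("p cnf"):
                _, _, nv, nc = line.split()
                num_vars, num_clauses = int(nv), int(nc)
                continue
            lits = [int(tok) for tok in line.split() if tok != "0"]
            clause_lengths.append(len(lits))
            for lit in lits:
                var_degree[abs(lit)] = var_degree.get(abs(lit), 0) + 1
    return {
        "num_vars": num_vars,
        "num_clauses": num_clauses,
        "clause_var_ratio": num_clauses / max(num_vars, 1),
        "mean_clause_length": mean(clause_lengths) if clause_lengths else 0,
        "mean_var_degree": mean(var_degree.values()) if var_degree else 0,
    }

print(syntactic_features("example.cnf"))     # hypothetical instance file
```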

13 / 44

Probing Features

▷ number of nodes/propagations within time limit
▷ estimate of search space size
▷ tightness of problem/constraints
▷ …

14 / 44

Dynamic Features

▷ change of variable domains
▷ number of constraint propagations
▷ number of failures a clause participated in
▷ …

15 / 44

What Features do we need in Practice?

▷ trade-off between complex features and complex models
▷ in practice, very simple features (e.g. problem size) can perform well, depending on the model used

16 / 44

Types of Performance Models

▷ models for entire portfolios
▷ models for individual algorithms
▷ models that are somewhere in between (e.g. pairs of algorithms)

17 / 44

Models for Entire Portfolios

▷ predict the best algorithm in the portfolio
▷ alternatively: cluster and assign best algorithms to clusters

optional (but important):
▷ attach a “weight” during learning (e.g. the difference between best and worst solver) to bias the model towards the “important” instances (see the sketch below)
▷ special loss metric
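A hedged sketch of a single portfolio-wide model, assuming a feature matrix and a runtime matrix (instances × solvers); the data here is synthetic and the model is a random forest classifier. The sample weight is the gap between the worst and the best solver on each instance, so the model concentrates on instances where the choice matters.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
X = rng.random((200, 10))                    # hypothetical instance features
runtimes = rng.exponential(10, (200, 3))     # hypothetical runtimes of 3 solvers

best = runtimes.argmin(axis=1)               # label: index of the best solver
weight = runtimes.max(axis=1) - runtimes.min(axis=1)   # best/worst gap as "importance"

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, best, sample_weight=weight)

print(model.predict(X[:5]))                  # predicted best solver per instance
```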

Gent, Ian P., Christopher A. Jefferson, Lars Kotthoff, Ian Miguel, Neil Moore, Peter Nightingale, and Karen E. Petrie. “Learning When to Use Lazy Learning in Constraint Solving.” In 19th European Conference on Artificial Intelligence, 873–78, 2010.

Kadioglu, Serdar, Yuri Malitsky, Meinolf Sellmann, and Kevin Tierney. “ISAC – Instance-Specific Algorithm Configuration.” In 19th European Conference on Artificial Intelligence, 751–56, 2010.

18 / 44

Models for Individual Algorithms

▷ predict the performance for each algorithm separately
▷ combine the predictions to choose the best one
▷ for example: predict the runtime for each algorithm, choose the one with the lowest runtime (sketched below)
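A minimal per-algorithm sketch on the same kind of synthetic data: one regression model per solver predicts log runtime, and the selector takes the solver with the smallest prediction.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)
X = rng.random((200, 10))                    # hypothetical instance features
runtimes = rng.exponential(10, (200, 3))     # hypothetical runtimes of 3 solvers

# one regression model per solver, trained on log runtime
models = []
for s in range(runtimes.shape[1]):
    m = RandomForestRegressor(n_estimators=100, random_state=0)
    m.fit(X, np.log(runtimes[:, s]))
    models.append(m)

def select(x):
    """Choose the solver with the lowest predicted (log) runtime."""
    preds = [m.predict(x.reshape(1, -1))[0] for m in models]
    return int(np.argmin(preds))

print(select(X[0]))
```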

Xu, Lin, Frank Hutter, Holger H. Hoos, and Kevin Leyton-Brown. “SATzilla: Portfolio-Based Algorithm Selection for SAT.” J. Artif. Intell. Res. (JAIR) 32 (2008): 565–606.

19 / 44

Hybrid Models

▷ get the best of both worlds
▷ for example: consider pairs of algorithms to take relations into account
▷ for each pair of algorithms, learn a model that predicts which one is faster (sketched below)
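A sketch of the pairwise idea on the same kind of synthetic data: one classifier per pair of solvers predicts which of the two is faster, and each prediction counts as a vote. This follows the spirit of the cited approaches rather than reimplementing them.

```python
import numpy as np
from itertools import combinations
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(2)
X = rng.random((200, 10))                    # hypothetical instance features
runtimes = rng.exponential(10, (200, 3))     # hypothetical runtimes of 3 solvers
n_solvers = runtimes.shape[1]

pair_models = {}
for i, j in combinations(range(n_solvers), 2):
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(X, (runtimes[:, i] < runtimes[:, j]).astype(int))   # 1 if solver i beats j
    pair_models[(i, j)] = clf

def select(x):
    votes = np.zeros(n_solvers)
    for (i, j), clf in pair_models.items():
        winner = i if clf.predict(x.reshape(1, -1))[0] == 1 else j
        votes[winner] += 1                   # simple voting over pairwise predictions
    return int(votes.argmax())

print(select(X[0]))
```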

Xu, Lin, Frank Hutter, Holger H. Hoos, and Kevin Leyton-Brown. “Hydra-MIP: Automated Algorithm Configuration and Selection for Mixed Integer Programming.” In RCRA Workshop on Experimental Evaluation of Algorithms for Solving Problems with Combinatorial Explosion at the International Joint Conference on Artificial Intelligence (IJCAI), 16–30, 2011.

Kotthoff, Lars. “Hybrid Regression-Classification Models for Algorithm Selection.” In 20th European Conference on Artificial Intelligence, 480–85, 2012.

20 / 44

Types of Predictions/Algorithm Selectors

▷ best algorithm
▷ n best algorithms ranked
▷ allocation of resources to n algorithms
▷ change the currently running algorithm?

Kotthoff, Lars. “Ranking Algorithms by Performance.” In LION 8, 2014.

Kadioglu, Serdar, Yuri Malitsky, Ashish Sabharwal, Horst Samulowitz, and Meinolf Sellmann. “Algorithm Selection and Scheduling.” In 17th International Conference on Principles and Practice of Constraint Programming, 454–69, 2011.

Stergiou, Kostas. “Heuristics for Dynamically Adapting Propagation in Constraint Satisfaction Problems.” AI Commun. 22, no. 3 (2009): 125–41.

21 / 44

Time of Prediction

before the problem is being solved
▷ select algorithm(s) once
▷ no recourse if predictions are bad

while the problem is being solved
▷ continuously monitor problem features and/or performance
▷ can remedy a bad initial choice or react to a changing problem

22 / 44

Example System – SATzilla

▷ 7 SAT solvers, 4811 problem instances
▷ syntactic (33) and probing (15) features
▷ ridge regression to predict log runtime for each solver, choose the solver with the best predicted performance
▷ later version uses random forests to predict the better algorithm for each pair, aggregation through a simple voting scheme
▷ pre-solving, feature computation time prediction, hierarchical model
▷ won several competitions

Xu, Lin, Frank Hutter, Holger H. Hoos, and Kevin Leyton-Brown. “SATzilla: Portfolio-Based Algorithm Selection for SAT.” J. Artif. Intell. Res. (JAIR) 32 (2008): 565–606.

23 / 44

Algorithm Configuration

Given a (set of) problem(s), find the best parameter configuration.

24 / 44

Algorithm Configuration

Frank Hutter and Marius Lindauer, “Algorithm Configuration: A Hands-on Tutorial”, AAAI 2016

25 / 44

Local Search

▷ start with random configuration
▷ change a single parameter (local search step)
▷ if better, keep the change, else revert
▷ repeat
▷ optional (but important): restart with new random configurations (toy sketch below)
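A toy sketch of the procedure above (not ParamILS itself): a one-exchange local search over a small, hypothetical discrete configuration space with random restarts. `evaluate` stands in for running the target algorithm on training instances and measuring, e.g., mean runtime.

```python
import random

param_space = {                              # hypothetical parameter space
    "restarts": [10, 100, 1000],
    "var_heuristic": ["vsids", "random"],
    "phase": [0, 1],
}

def evaluate(config):
    """Placeholder for running the target algorithm with `config` and
    returning a cost such as mean runtime (lower is better)."""
    random.seed(str(sorted(config.items())))
    return random.random()

def local_search(n_restarts=5, n_steps=50):
    best_cfg, best_val = None, float("inf")
    for _ in range(n_restarts):                          # random restarts
        cfg = {p: random.choice(v) for p, v in param_space.items()}
        val = evaluate(cfg)
        for _ in range(n_steps):
            p = random.choice(list(param_space))         # change a single parameter
            cand = dict(cfg, **{p: random.choice(param_space[p])})
            cand_val = evaluate(cand)
            if cand_val < val:                           # keep if better, else revert
                cfg, val = cand, cand_val
        if val < best_val:
            best_cfg, best_val = cfg, val
    return best_cfg, best_val

print(local_search())
```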

Hutter, Frank, Holger H. Hoos, Kevin Leyton-Brown, and Thomas Stützle. “ParamILS: An Automatic Algorithm Configuration Framework.” J. Artif. Int. Res. 36, no. 1 (2009): 267–306.

26 / 44

Local Search Example
Going Beyond Local Optima: Iterated Local Search

[Animation, slides 27–35 / 44: initialisation, local search, perturbation, further local search, and selection using an acceptance criterion; graphics by Holger Hoos]

Model-Based Search

▷ build model of parameter-response surface
▷ allows targeted exploration of new configurations (sketch below)
▷ (can take instance features into account like algorithm selection)
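A compact sketch of model-based search on a single numeric parameter, using a Gaussian process surrogate and expected improvement; SMAC itself uses random forests and handles categorical parameters, so this only illustrates the idea. The objective function is made up.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor

def objective(x):                                    # hypothetical parameter-response surface
    return np.sin(3 * x) + 0.1 * x

rng = np.random.default_rng(0)
X = rng.uniform(0, 2 * np.pi, 5).reshape(-1, 1)      # initial random configurations
y = objective(X).ravel()
grid = np.linspace(0, 2 * np.pi, 500).reshape(-1, 1)

for _ in range(20):
    gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)   # surrogate model
    mu, sigma = gp.predict(grid, return_std=True)
    best = y.min()
    with np.errstate(divide="ignore", invalid="ignore"):
        z = (best - mu) / sigma
        ei = (best - mu) * norm.cdf(z) + sigma * norm.pdf(z)    # expected improvement
        ei[sigma == 0] = 0
    x_next = grid[np.argmax(ei)]                     # targeted exploration
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next)[0])

print("best configuration:", X[y.argmin()][0], "value:", y.min())
```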

Hutter, Frank, Holger H. Hoos, and Kevin Leyton-Brown. “Sequential Model-Based Optimization for General Algorithm Configuration.” In LION 5, 507–23, 2011.

36 / 44

Model-Based Search Example

[Plots, slides 37–40 / 44: a one-dimensional response over x ∈ [0, 1] at iterations 1, 6, 13 and 20, showing initial and proposed evaluation points, the true response y, the model prediction ŷ and the expected improvement (the reported gap to the optimum is 0 in all frames); graphics by Bernd Bischl with the mlrMBO R package]

Overtuning

▷ similar to overfitting in machine learning
▷ performance improves on training instances, but not on test instances (see the sketch below)
▷ configuration is too “tailored”, e.g. specific to satisfiable instances
▷ but can be combined with Algorithm Selection
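A minimal sketch of the usual safeguard (everything here is hypothetical): tune only on a training split of instances and judge the resulting configuration on held-out test instances, exactly as one would for a learned model.

```python
import random

def mean_runtime(config, insts):
    """Placeholder for running `config` on `insts` and averaging runtimes."""
    random.seed(str((config, tuple(insts))))
    return random.uniform(1, 10)

instances = [f"inst{i}" for i in range(100)]     # hypothetical instances
random.shuffle(instances)
train, test = instances[:70], instances[70:]     # configure only on the training split

configs = [f"config{i}" for i in range(50)]      # hypothetical candidate configurations
tuned = min(configs, key=lambda c: mean_runtime(c, train))

print("train performance:", mean_runtime(tuned, train))
print("test performance :", mean_runtime(tuned, test))   # this is what matters
```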

41 / 44

Summary

Algorithm Selection: choose the best algorithm for solving a problem

Algorithm Configuration: choose the best parameter configuration for solving a problem with an algorithm

▷ mature research areas
▷ can combine configuration and selection
▷ effective tools are available

42 / 44

Tools and Resources

Algorithm Selection
autofolio https://bitbucket.org/mlindauer/autofolio/
LLAMA https://bitbucket.org/lkotthoff/llama
SATzilla http://www.cs.ubc.ca/labs/beta/Projects/SATzilla/

Algorithm Configuration
iRace http://iridia.ulb.ac.be/irace/
mlrMBO https://github.com/mlr-org/mlrMBO
SMAC http://www.cs.ubc.ca/labs/beta/Projects/SMAC/
Spearmint https://github.com/HIPS/Spearmint
TPE https://jaberg.github.io/hyperopt/

Kotthoff, Lars. “Algorithm Selection for Combinatorial Search Problems: A Survey.” AI Magazine 35, no. 3 (2014): 48–60.

43 / 44

COSEAL

[COSEAL logo: Configuration and Selection of Algorithms]

Interested? Join the COSEAL group!

http://coseal.net

44 / 44
