AI-Augmented Algorithms
How I Learned to Stop Worrying and Love Choice

Lars Kotthoff
University of Wyoming

Boulder, 16 January 2019
Outline

▷ Big Picture
▷ Motivation
▷ Choosing Algorithms
▷ Tuning Algorithms
▷ (NCAR-relevant) Applications
▷ Outlook and Resources
Big Picture

▷ advance the state of the art through meta-algorithmic techniques
▷ rather than inventing new things, use existing things more intelligently – automatically
▷ invent new things through combinations of existing things

https://xkcd.com/720/
Motivation – What Difference Does It Make?
Prominent Application
Fréchette, Alexandre, Neil Newman, Kevin Leyton-Brown. “Solving theStation Packing Problem.” In Association for the Advancement of ArtificialIntelligence (AAAI), 2016.
5
Performance Differences

[Scatter plot: per-instance runtimes (0.1–1000 s, log scale), Virtual Best CSP (x-axis) vs. Virtual Best SAT (y-axis), showing large performance differences between the two representations.]

Hurley, Barry, Lars Kotthoff, Yuri Malitsky, and Barry O’Sullivan. “Proteus: A Hierarchical Portfolio of Solvers and Transformations.” In CPAIOR, 2014.
Leveraging the Differences

Xu, Lin, Frank Hutter, Holger H. Hoos, and Kevin Leyton-Brown. “SATzilla: Portfolio-Based Algorithm Selection for SAT.” J. Artif. Intell. Res. (JAIR) 32 (2008): 565–606.
Performance Improvements

Configuration of a SAT Solver for Verification [Hutter et al., 2007]

Ran FocusedILS, 2 days × 10 machines
– on a training set from each benchmark

Compared to manually-engineered default
– 1 week of performance tuning
– competitive with the state of the art
– comparison on unseen test instances

[Scatter plots: SPEAR runtime with the original default vs. optimized configurations (10⁻² to 10⁴ s, log scale). Left: optimized for IBM-BMC, 4.5-fold speedup on hardware verification. Right: optimized for SWV, 500-fold speedup, which won category QF_BV in the 2007 SMT competition. From Hutter & Lindauer, AC tutorial, AAAI 2016.]

Hutter, Frank, Domagoj Babic, Holger H. Hoos, and Alan J. Hu. “Boosting Verification by Automatic Tuning of Decision Procedures.” In FMCAD ’07: Proceedings of the Formal Methods in Computer Aided Design, 27–34. Washington, DC, USA: IEEE Computer Society, 2007.
Common Theme

Performance models of black-box processes
▷ also called surrogate models
▷ substitute expensive underlying process with cheap approximate model
▷ build approximate model using machine learning techniques based on results of evaluations of the underlying process
▷ no knowledge of the underlying process is required (but can be helpful)
▷ may facilitate better understanding of the underlying process through interrogation of the model
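As a concrete illustration of this idea, the sketch below substitutes a toy black-box function and a deliberately crude nearest-neighbour model for the expensive process and the learned surrogate. Both are assumptions for illustration, not any particular tool's approach:

```python
import math
import random

# Toy stand-in for an expensive black-box process (e.g. a solver run).
def expensive_process(x):
    return math.sin(3 * x) + 0.5 * x

# Evaluate the real process at a handful of points (the training data).
random.seed(0)
xs = [random.uniform(-2, 2) for _ in range(20)]
ys = [expensive_process(x) for x in xs]

# Cheap surrogate model: predict by the nearest observed evaluation.
# A real system would fit e.g. a random forest or Gaussian process here.
def surrogate(x):
    i = min(range(len(xs)), key=lambda i: abs(xs[i] - x))
    return ys[i]
```

Once built, the surrogate can be queried arbitrarily often without paying for the underlying process again; it reproduces observed evaluations exactly and interpolates between them.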
Choosing Algorithms
Algorithm Selection
Given a problem, choose the best algorithm to solve it.
Rice, John R. “The Algorithm Selection Problem.” Advances in Computers 15 (1976): 65–118.
Algorithm Selection

[Diagram: training instances (Instance 1–3, …) and a portfolio (Algorithm 1–3) feed, via feature extraction, into a performance model; for new instances (Instance 4–6, …), feature extraction and the model drive algorithm selection, e.g. Instance 4 → Algorithm 2, Instance 5 → Algorithm 3, Instance 6 → Algorithm 3.]
Algorithm Portfolios

▷ instead of a single algorithm, use several complementary algorithms
▷ idea from economics – minimise risk by spreading it out across several securities
▷ same for computational problems – minimise risk of an algorithm performing poorly
▷ in practice often constructed from competition winners or other algorithms known to have good performance

Huberman, Bernardo A., Rajan M. Lukose, and Tad Hogg. “An Economics Approach to Hard Computational Problems.” Science 275, no. 5296 (1997): 51–54. doi:10.1126/science.275.5296.51.
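How much a portfolio can help is usually measured against the virtual best solver, an oracle that picks the fastest algorithm for each instance. A minimal sketch with made-up runtimes (all numbers are assumptions for illustration):

```python
# Hypothetical per-instance runtimes (seconds) for a three-algorithm portfolio.
runtimes = {
    "instance1": {"A1": 3.2, "A2": 0.4, "A3": 9.1},
    "instance2": {"A1": 0.9, "A2": 7.3, "A3": 2.6},
    "instance3": {"A1": 55.0, "A2": 12.8, "A3": 1.1},
}

# Virtual best solver (VBS): the per-instance oracle choice.
vbs_choice = {inst: min(t, key=t.get) for inst, t in runtimes.items()}
vbs_time = sum(min(t.values()) for t in runtimes.values())

# Single best solver (SBS): the one algorithm with the lowest total runtime.
algos = ["A1", "A2", "A3"]
sbs = min(algos, key=lambda a: sum(t[a] for t in runtimes.values()))
```

The gap between the SBS (here A3, 12.8 s total on its own instances plus its losses elsewhere) and the VBS (2.4 s total) is exactly what algorithm selection tries to close.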
Algorithms

“algorithm” used in a very loose sense:
▷ algorithms
▷ heuristics
▷ machine learning models
▷ software systems
▷ machines
▷ …
Parallel Portfolios

Why not simply run all algorithms in parallel?
▷ not enough resources may be available / waste of resources
▷ algorithms may be parallelized themselves
▷ memory/cache contention
Building an Algorithm Selection System

▷ requires algorithms with complementary performance
▷ most approaches rely on machine learning
▷ train with representative data, i.e. performance of all algorithms in the portfolio on a number of instances
▷ evaluate performance on a separate set of instances
▷ potentially large amount of prep work
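A minimal selector along these lines, with hypothetical instance features and labels, and a deliberately simple nearest-neighbour rule standing in for the usual learned model:

```python
# Training data (assumed for illustration): instance feature vectors and the
# best algorithm for each, as measured by running the whole portfolio.
train = [
    ((10.0, 0.2), "A1"),
    ((12.0, 0.3), "A1"),
    ((3.0, 0.9), "A2"),
    ((2.5, 0.8), "A2"),
]

# 1-nearest-neighbour selector: pick the algorithm that was best on the
# most similar training instance (a stand-in for a real learned model).
def select(features):
    def dist(example):
        f, _ = example
        return sum((a - b) ** 2 for a, b in zip(f, features))
    return min(train, key=dist)[1]
```

For a new instance, only its features need to be computed; no algorithm has to be run to make the prediction, e.g. `select((11.0, 0.25))` picks the algorithm that dominated similar instances.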
Key Components of an Algorithm Selection System

▷ feature extraction
▷ performance model
▷ prediction-based selector/scheduler

optional:
▷ presolver
▷ secondary/hierarchical models and predictors (e.g. for feature extraction time)
Types of Performance Models

[Diagram: four ways to model performance for instances 1–3 and algorithms A1–A3. Regression models predict a performance value per algorithm (e.g. A1: 1.2, A2: 4.5, A3: 3.9). Classification models predict the best algorithm directly. Pairwise classification models predict the winner of each pair of algorithms and vote (e.g. A1: 1 vote, A2: 0 votes, A3: 2 votes). Pairwise regression models predict performance differences (e.g. A1 − A2) and aggregate them (e.g. A1: −1.3, A2: 0.4, A3: 1.7). All yield a selection such as Instance 1 → Algorithm 2, Instance 2 → Algorithm 1, Instance 3 → Algorithm 3.]
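The pairwise-classification scheme, for example, reduces selection to majority voting over the pairwise predictions. The sketch below reproduces the vote counts from the diagram with hard-coded predictions:

```python
from collections import Counter

# Predicted winners of each pairwise model for one instance; the values
# reproduce the diagram's example (A1 beats A2; A3 beats A1 and A2).
pairwise_winners = {
    ("A1", "A2"): "A1",
    ("A1", "A3"): "A3",
    ("A2", "A3"): "A3",
}

votes = Counter(pairwise_winners.values())  # A1: 1 vote, A3: 2 votes
selected = votes.most_common(1)[0][0]
```

With n algorithms this needs n(n−1)/2 models, but each model solves an easier binary problem than predicting the overall best algorithm directly.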
Tuning Algorithms
Algorithm Configuration
Given a (set of) problem(s), find the best parameter configuration.
Parameters?

▷ anything you can change that makes sense to change
▷ e.g. search heuristic, optimization level, computational resolution
▷ not random seed, whether to enable debugging, etc.
▷ some will affect performance, others will have no effect at all
Automated Algorithm Configuration

▷ no background knowledge on parameters or algorithm – black-box process
▷ as little manual intervention as possible
▷ failures are handled appropriately
▷ resources are not wasted
▷ can run unattended on large-scale compute infrastructure
Algorithm Configuration

Frank Hutter and Marius Lindauer, “Algorithm Configuration: A Hands-on Tutorial”, AAAI 2016
General Approach

▷ evaluate algorithm as a black-box function
▷ observe effect of parameters without knowing the inner workings; build surrogate model based on this data
▷ decide where to evaluate next, based on the surrogate model
▷ repeat
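The loop above can be sketched as follows. The toy objective and the nearest-neighbour surrogate are stand-ins for illustration; real tools fit random forests or Gaussian processes and use acquisition functions such as expected improvement to pick the next point:

```python
import random

# Toy black-box function to minimise (an assumption; in practice this
# would run the algorithm being configured and measure its performance).
def black_box(x):
    return (x - 0.3) ** 2

random.seed(1)
X = [random.uniform(-1, 1) for _ in range(3)]  # initial design
y = [black_box(x) for x in X]

for _ in range(10):
    # Surrogate: predict by the nearest observed point -- a crude stand-in
    # for the regression models that model-based configurators fit here.
    def predict(q):
        i = min(range(len(X)), key=lambda i: abs(X[i] - q))
        return y[i]

    # Propose candidate configurations, evaluate the most promising one
    # on the real (expensive) function, and add the result to the data.
    candidates = [random.uniform(-1, 1) for _ in range(50)]
    nxt = min(candidates, key=predict)
    X.append(nxt)
    y.append(black_box(nxt))

best = min(y)  # incumbent after the budget is spent
```

Each iteration spends exactly one expensive evaluation, guided by the cheap surrogate; the incumbent can only improve on the initial design.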
When are we done?

▷ most approaches are incomplete, i.e. do not exhaustively explore the parameter space
▷ cannot prove optimality; not guaranteed to find the optimal solution in finite time
▷ performance highly dependent on the configuration space
→ How do we know when to stop?
Time Budget

How much time / how many function evaluations?
▷ too much → wasted resources
▷ too little → suboptimal result
▷ use statistical tests
▷ evaluate on parts of the instance set
▷ for runtime: adaptive capping
▷ in general: whatever resources you can reasonably invest
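Adaptive capping, for instance, can be sketched as follows: abort a configuration's evaluation as soon as it can no longer beat the incumbent, since the remaining runs cannot change the outcome (function name and values are illustrative):

```python
# Adaptive capping (sketch): when minimising total runtime over a set of
# instances, stop evaluating a configuration once its cumulative runtime
# reaches the incumbent's total -- the saved budget goes to other
# configurations instead.
def capped_total(instance_runtimes, incumbent_total):
    total = 0.0
    for t in instance_runtimes:
        total += t
        if total >= incumbent_total:
            return None  # censored: cannot beat the incumbent
    return total
```

A configuration that finishes under the cap returns its true total (e.g. `capped_total([1.0, 2.0, 3.0], 10.0)` gives `6.0`); one that exceeds it is cut off early.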
Grid and Random Search

▷ evaluate certain points in parameter space

Bergstra, James, and Yoshua Bengio. “Random Search for Hyper-Parameter Optimization.” J. Mach. Learn. Res. 13, no. 1 (February 2012): 281–305.
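A side-by-side sketch on a made-up two-parameter response surface (the function and parameter ranges are assumptions for illustration):

```python
import itertools
import random

# Toy response surface over two parameters; lower is better.
def performance(a, b):
    return (a - 0.2) ** 2 + (b + 0.1) ** 2

# Grid search: evaluate a fixed lattice of configurations.
grid = list(itertools.product([i / 4 for i in range(-4, 5)], repeat=2))
best_grid = min(grid, key=lambda p: performance(*p))

# Random search: the same evaluation budget, points drawn uniformly.
# Bergstra & Bengio's observation: when only some parameters matter,
# random search covers each individual parameter's range more densely
# than a grid with the same number of points.
random.seed(0)
rand = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in grid]
best_rand = min(rand, key=lambda p: performance(*p))
```

Both searches are trivially parallel, since every evaluation is independent of the others.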
Model-Based Search

▷ evaluate small number of configurations
▷ build model of the parameter–performance surface based on the results
▷ use model to predict where to evaluate next
▷ repeat
▷ allows targeted exploration of new configurations
▷ can take instance features into account, like algorithm selection

Hutter, Frank, Holger H. Hoos, and Kevin Leyton-Brown. “Sequential Model-Based Optimization for General Algorithm Configuration.” In LION 5, 507–23, 2011.
Model-Based Search Example

[Sequence of plots, iterations 1–10: observed points (initial design and sequentially proposed points) on a 1-D function over x ∈ [−1, 1], with the surrogate’s prediction (yhat) and the expected improvement criterion (ei) plotted beneath. The reported gap goes from 1.9909e−01 at iteration 1 to 2.0000e−01 from iteration 7 on, while the expected improvement shrinks towards zero.]
Selected Applications
Compiler Parameter Tuning

▷ pre-defined optimization levels offer little flexibility
▷ improvements possible by tuning the full compiler parameter space
▷ tuned compute-intensive AI algorithms
▷ up to 40% runtime improvement over gcc -O2/-O3

Pérez Cáceres, Leslie, Federico Pagnozzi, Alberto Franzin, and Thomas Stützle. “Automatic Configuration of GCC Using Irace.” In Artificial Evolution, edited by Evelyne Lutton, Pierrick Legrand, Pierre Parrend, Nicolas Monmarché, and Marc Schoenauer, 202–16. Cham: Springer International Publishing, 2018.
Compiler Parameter Tuning

▷ not only for C/C++
▷ JavaScript (JavaScriptCore, V8) optimized for standard benchmarks
▷ up to 35% runtime improvement

[Scatter plot: default configuration PAR10 vs. tuned configuration PAR10 (CPU s, log scale, 1.0–2.0).]

Fawcett, Chris, Lars Kotthoff, and Holger H. Hoos. “Hot-Rodding the Browser Engine: Automatic Configuration of JavaScript Compilers.” CoRR abs/1707.04245 (2017). http://arxiv.org/abs/1707.04245.
“Deep” Parameter Tuning

▷ automatically identify non-exposed parameters and allow them to be tuned (e.g. magic constants)
▷ tuned dlmalloc library, specialized for e.g. awk, flex, sed
▷ runtime improvements of up to 12%, decrease in memory consumption of up to 21%

Wu, Fan, Westley Weimer, Mark Harman, Yue Jia, and Jens Krinke. “Deep Parameter Optimisation.” In Conference on Genetic and Evolutionary Computation, 1375–82. GECCO ’15. New York, NY, USA: ACM, 2015. https://doi.org/10.1145/2739480.2754648.
Beyond Software
Outlook
Quo Vadis, Software Engineering?

[Two versions of a software-development diagram around “Run”: the traditional process, then the same process augmented with AI (“+ AI”).]

Hoos, Holger H. “Programming by Optimization.” Communications of the Association for Computing Machinery (CACM) 55, no. 2 (February 2012): 70–80. https://doi.org/10.1145/2076450.2076469.
(Much) More Information

https://larskotthoff.github.io/assurvey/

Kotthoff, Lars. “Algorithm Selection for Combinatorial Search Problems: A Survey.” AI Magazine 35, no. 3 (2014): 48–60.
Tools and Resources

LLAMA https://bitbucket.org/lkotthoff/llama
SATzilla http://www.cs.ubc.ca/labs/beta/Projects/SATzilla/
iRace http://iridia.ulb.ac.be/irace/
mlrMBO https://github.com/mlr-org/mlrMBO
SMAC http://www.cs.ubc.ca/labs/beta/Projects/SMAC/
Spearmint https://github.com/HIPS/Spearmint
TPE https://jaberg.github.io/hyperopt/
autofolio https://bitbucket.org/mlindauer/autofolio/
Auto-WEKA http://www.cs.ubc.ca/labs/beta/Projects/autoweka/
Auto-sklearn https://github.com/automl/auto-sklearn
Summary

Algorithm Selection: choose the best algorithm for solving a problem
Algorithm Configuration: choose the best parameter configuration for solving a problem with an algorithm

▷ mature research areas
▷ can combine configuration and selection
▷ effective tools are available
▷ COnfiguration and SElection of ALgorithms group COSEAL: http://www.coseal.net

Don’t set parameters prematurely, embrace choice!
I’m hiring!
Several funded graduate positions available.