Performance Prediction andAutomated Tuning of
Randomized and Parametric Algorithms:An Initial Investigation
Frank Hutter1, Youssef Hamadi2, Holger Hoos1, and Kevin Leyton-Brown1
1University of British Columbia, Vancouver, Canada2Microsoft Research Cambridge, UK
July 16, 2006 Hutter, Hamadi, Hoos, Leyton-Brown: Automatic Parameter Tuning
Motivation: Why automatic parameter tuning? (1)
• Most approaches for solving hard combinatorial problems (e.g. SAT, CP) are highly parameterized
  Tree search:
  - Variable/value heuristic
  - Propagation
  - Whether and when to restart
  - How much learning
  Local search:
  - Noise parameter
  - Tabu length in tabu search
  - Strength of penalty increase and decrease in DLS
  - Perturbation, acceptance criterion, etc. in ILS
Motivation: Why automatic parameter tuning? (2)
• Algorithm design: for each new algorithm or application, a lot of time is spent on parameter tuning
• Algorithm analysis: comparability. Is algorithm A faster than algorithm B only because its developers spent more time tuning it?
• Algorithm use in practice: want to solve MY problems fast, not necessarily the ones the developers used for parameter tuning
Related work in automated parameter tuning
• Best fixed parameter setting for an instance set [Birattari et al. ’02, Hutter ’04, Adenso-Díaz & Laguna ’05]
• Algorithm selection/configuration per instance [Lobjois and Lemaître ’98, Leyton-Brown, Nudelman et al. ’02 & ’04, Patterson & Kautz ’02]
• Best sequence of operators / changing the search strategy during the search [Battiti et al. ’05, Lagoudakis & Littman ’01 & ’02]
Overview
• Previous work on empirical hardness models [Leyton-Brown, Nudelman et al. ’02 & ’04]
• EH models for randomized algorithms
• EH models and automated tuning for parametric algorithms
• Conclusions
Empirical hardness models: Basics (1 algorithm)
• Training: Given a set of t instances s1,...,st
  For each instance si:
  - Compute instance features xi = (xi1,...,xim)
  - Run the algorithm and record its runtime yi
  Learn a function f: features → runtime, such that yi ≈ f(xi) for i = 1,…,t
• Test: Given a new instance st+1
  Compute features xt+1
  Predict runtime yt+1 = f(xt+1)
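The training/test loop above can be sketched in a few lines. This is a minimal illustration, not the paper's code: `compute_features` and `run_algorithm` are hypothetical stand-ins for the real feature computation and solver runs, and a plain least-squares fit stands in for the learned function f.

```python
# Hypothetical sketch of the EH-model training/test loop.
# compute_features and run_algorithm are stand-ins, not from the paper.
import numpy as np

def compute_features(instance):
    # Stand-in: the paper uses polytime-computable SAT features.
    return np.array([instance["n_vars"], instance["n_clauses"],
                     instance["n_clauses"] / instance["n_vars"]])

def run_algorithm(instance):
    # Stand-in for running the solver and recording its runtime.
    return 0.01 * instance["n_clauses"]

# Training: features x_i and recorded runtimes y_i for t instances
instances = [{"n_vars": v, "n_clauses": 4 * v} for v in (100, 200, 300, 400)]
X = np.array([compute_features(s) for s in instances])
y = np.array([run_algorithm(s) for s in instances])

# Learn f via least squares so that y_i ≈ f(x_i)
w, *_ = np.linalg.lstsq(np.c_[np.ones(len(X)), X], y, rcond=None)

# Test: predict the runtime of a new instance s_{t+1}
x_new = compute_features({"n_vars": 500, "n_clauses": 2000})
y_pred = float(np.r_[1.0, x_new] @ w)
```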
Empirical hardness models: Which instance features?
• Features should be polytime computable (for us, within seconds)
  Basic properties, e.g. #vars, #clauses, ratio (13)
  Graph-based characteristics (10)
  Estimates of DPLL search space size (5)
  Local search probes (15)
• Combine features to form more expressive basis functions
  Basis functions Φ = (φ1,...,φq) can be arbitrary combinations of the features x1,...,xm
• Basis functions used for SAT in [Nudelman et al. ’04]
  91 original features: xi
  Pairwise products of features: xi * xj
  Only a subset of these (drop useless basis functions)
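The pairwise-product construction above is easy to sketch. A small illustrative helper (not from the paper) that appends all pairwise products, including squares, to the raw feature vector:

```python
# Sketch of the basis-function construction: raw features plus all
# pairwise products x_i * x_j (including squares).
import numpy as np
from itertools import combinations_with_replacement

def quadratic_basis(x):
    """Original features followed by all pairwise products."""
    x = np.asarray(x, dtype=float)
    products = [x[i] * x[j] for i, j in
                combinations_with_replacement(range(len(x)), 2)]
    return np.concatenate([x, products])

phi = quadratic_basis([2.0, 3.0, 5.0])
# 3 raw features + 6 pairwise products = 9 basis functions
```

For the 91 SAT features this yields several thousand candidate basis functions, which is why the useless ones are dropped before fitting.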
Empirical hardness models: How to learn the function f: features → runtime?
• Runtimes vary by orders of magnitude; we need a model that can deal with that
  Log-transform the output: e.g. a runtime of 10^3 sec gives yi = 3
• Learn a linear function of the basis functions: f(φi) = wT φi ≈ yi
  Learning reduces to fitting the weights w (ridge regression: w = (ΦT Φ + δI)-1 ΦT y)
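The log-transform and the closed-form ridge fit can be written directly from the formula above. A minimal sketch on synthetic, noise-free data (the matrix Phi, the weights, and the tiny penalty delta are made up for illustration):

```python
# Ridge regression on log-transformed runtimes:
# w = (Phi^T Phi + delta*I)^{-1} Phi^T y
import numpy as np

rng = np.random.default_rng(1)
Phi = rng.random((50, 4))                    # basis-function matrix (one row per instance)
true_w = np.array([1.0, -0.5, 2.0, 0.3])
runtimes = 10 ** (Phi @ true_w)              # runtimes spanning orders of magnitude
y = np.log10(runtimes)                       # log-transform the output

delta = 1e-6                                 # small ridge penalty
w = np.linalg.solve(Phi.T @ Phi + delta * np.eye(Phi.shape[1]), Phi.T @ y)

predicted_runtime = 10 ** (Phi @ w)          # map predictions back to seconds
```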
Algorithm selection based on empirical hardness models (e.g. SATzilla)
• Given a portfolio of n different algorithms A1,...,An
  Pick the best algorithm for each instance
• Training: Learn n separate functions fj: features → runtime of algorithm j
• Test (for each new instance st+1):
  Predict runtime yjt+1 = fj(xt+1) for each algorithm j
  Choose the algorithm Aj with minimal predicted yjt+1
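The per-instance selection rule is a one-liner once the n predictors exist. In this sketch the two linear predictors are illustrative stand-ins for learned models fj, not the actual SATzilla predictors:

```python
# Per-instance algorithm selection: one learned predictor per algorithm,
# run the algorithm with the smallest predicted (log) runtime.
# f1 and f2 are illustrative stand-ins for learned models.
import numpy as np

def f1(x):  # predicted log-runtime of algorithm A1
    return 0.5 * x[0] + 0.1 * x[1]

def f2(x):  # predicted log-runtime of algorithm A2
    return 0.1 * x[0] + 0.5 * x[1]

predictors = [f1, f2]

def select_algorithm(x):
    preds = [f(x) for f in predictors]
    return int(np.argmin(preds))       # index j of the algorithm A_j to run

choice_a = select_algorithm(np.array([1.0, 10.0]))   # A1 predicted cheaper
choice_b = select_algorithm(np.array([10.0, 1.0]))   # A2 predicted cheaper
```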
Overview
• Previous work on empirical hardness models [Leyton-Brown, Nudelman et al. ’02 & ’04]
• EH models for randomized algorithms
• EH models and automated tuning for parametric algorithms
• Conclusions
Empirical hardness models for randomized algorithms
• Can this same approach predict the run-time of incomplete, randomized local search algorithms? Yes!
• Incomplete → limitation to satisfiable instances (train & test)
• Local search → no changes needed
• Randomized → ultimately, want to predict the entire run-time distribution (RTD)
  For our algorithms, these RTDs are typically exponential and can thus be characterized by a single sufficient statistic, such as the median run-time
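The "single sufficient statistic" point is concrete for an exponential RTD: a median m fixes the whole distribution, P(T ≤ t) = 1 − 2^(−t/m). A small sketch (simulated runs, made-up median) checking this correspondence:

```python
# For an exponential RTD, the median m determines the full distribution:
# P(T <= t) = 1 - 2^(-t/m).
import numpy as np

m = 2.0                                   # median runtime in seconds (made up)
lam = np.log(2) / m                       # exponential rate with this median

rng = np.random.default_rng(2)
runs = rng.exponential(1.0 / lam, size=200_000)   # simulated solver runs

def rtd(t, median):
    """CDF of an exponential run-time distribution with the given median."""
    return 1.0 - 2.0 ** (-t / median)

empirical_median = float(np.median(runs))
p_at_5 = rtd(5.0, m)   # predicted probability of solving within 5 seconds
```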
Prediction of median run-time (only the red parts changed)
• Training: Given a set of t instances s1,...,st
  For each instance si:
  - Compute features xi = (xi1,...,xim)
  - Run the algorithm multiple times to get runtimes yi1, …, yik
  - Compute the median mi of yi1, …, yik
  Learn a function f: features → median run-time, such that mi ≈ f(xi)
• Test: Given a new instance st+1
  Compute features xt+1
  Predict median run-time mt+1 = f(xt+1)
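The changed training step amounts to replacing each recorded runtime with a median over k runs before fitting. A sketch with a made-up randomized solver (exponential runtimes around a single hypothetical hardness feature):

```python
# Run the randomized algorithm k times per instance, fit the model to the
# per-instance median runtime. run_algorithm is an illustrative stand-in.
import numpy as np

rng = np.random.default_rng(3)

def run_algorithm(instance_hardness):
    # Stand-in randomized solver: exponential runtime with median = hardness
    return rng.exponential(instance_hardness / np.log(2))

hardness = np.array([1.0, 2.0, 4.0, 8.0])     # made-up per-instance feature
k = 1001
medians = np.array([np.median([run_algorithm(h) for _ in range(k)])
                    for h in hardness])

# Fit f: feature -> median runtime (here a 1-D least-squares line)
A = np.c_[np.ones_like(hardness), hardness]
w, *_ = np.linalg.lstsq(A, medians, rcond=None)
predicted_median = float(np.r_[1.0, 16.0] @ w)   # prediction for a new instance
```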
Experimental setup: solvers
• Two SAT solvers
  Novelty+ (WalkSAT variant)
  SAPS (Scaling and Probabilistic Smoothing)
  The adaptive version of Novelty+ won the SAT04 competition for random instances; SAPS came second
• Runs cut off after 15 minutes
Experimental setup: benchmarks
• Three unstructured distributions:
  CV-fix: c/v ratio 4.26; 20,000 instances with 400 variables (10,011 satisfiable)
  CV-var: c/v ratio between 3.26 and 5.26; 20,000 instances with 400 variables (10,129 satisfiable)
  SAT04: 3,000 instances created with the generators for the SAT04 competition (random instances) with the same parameters (1,420 satisfiable)
• One structured distribution:
  QWH: quasigroups with holes, 25% to 75% holes; 7,498 instances, satisfiable by construction
• All data sets were split 50:25:25 into train/validation/test
Results for predicting (median) run-time for SAPS on CV-var
Prediction based on single runs
Prediction based on medians of 1000 runs
Results for predicting median run-time based on 10 runs
SAPS on QWH; Novelty+ on SAT04
Predicting complete run-time distribution for exponential RTDs (from extended CP version)
RTD of SAPS on q0.75 instance of QWH
RTD of SAPS on q0.25 instance of QWH
Overview
• Previous work on empirical hardness models [Leyton-Brown, Nudelman et al. ’02 & ’04]
• EH models for randomized algorithms
• EH models and automated tuning for parametric algorithms
• Conclusions
Empirical hardness models for parametric algorithms (only the red parts changed)
• Training: Given a set of instances s1,...,st
  For each instance si:
  - Compute features xi
  - Run the algorithm with settings pi1,...,pini to get runtimes yi1,...,yini
  - Construct basis functions φij = φ(xi, pij) from the features and parameter settings (quadratic expansion of the parameters, multiplied by the instance features)
  Learn a function g: basis functions → run-time, such that g(φij) ≈ yij
• Test: Given a new instance st+1
  Compute features xt+1
  For each parameter setting p of interest, construct φt+1,p = φ(xt+1, p) and predict run-time g(φt+1,p)
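The parametric extension can be sketched end to end: build a joint basis from instance features and a quadratic expansion of the parameter, fit one model over all (instance, setting) pairs, then score every candidate setting for a new instance. All data here is synthetic and the one-parameter basis is a simplified stand-in for the paper's construction:

```python
# Joint (instance, parameter) basis: instance features multiplied by a
# quadratic expansion (1, p, p^2) of the parameter. Synthetic data only.
import numpy as np

def basis(x, p):
    """Instance features times a quadratic expansion of the parameter."""
    expansion = np.array([1.0, p, p * p])
    return np.outer(x, expansion).ravel()

rng = np.random.default_rng(4)
settings = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]     # candidate noise values

# Made-up log-runtimes: setting 0.3 is best, quadratic penalty around it
def log_runtime(x, p):
    return x[0] * (1.0 + 10.0 * (p - 0.3) ** 2)

X_train = rng.random((30, 2)) + 1.0
Phi = np.array([basis(x, p) for x in X_train for p in settings])
y = np.array([log_runtime(x, p) for x in X_train for p in settings])

w, *_ = np.linalg.lstsq(Phi, y, rcond=None)   # fit g over all pairs

# Test: score every setting for a new instance, pick the predicted best
x_new = np.array([1.5, 1.2])
scores = [basis(x_new, p) @ w for p in settings]
best_setting = settings[int(np.argmin(scores))]
```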
Experimental setup
• Parameters
  Novelty+: noise ∈ {0.1, 0.2, 0.3, 0.4, 0.5, 0.6}
  SAPS: all 30 combinations of α ∈ {1.2, 1.3, 1.4} and ρ ∈ {0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9}
• One additional data set: Mixed = union of QWH and SAT04
Results for predicting SAPS runtime with 30 different parameter settings on QWH (from extended CP version)
  Left: 5 instances, one symbol per instance
  Right: 1 instance in detail (the blue diamonds in the left figure)
Results for automated parameter setting
Data Set | Algo     | Speedup over default params | Speedup over best fixed params
SAT04    | Nov+     | 0.89                        | 0.89
QWH      | Nov+     | 177                         | 0.91
Mixed    | Nov+     | 13                          | 10.72
SAT04    | SAPS     | 2                           | 1.07
QWH      | SAPS     | 2                           | 0.93
Mixed    | SAPS     | 1.91                        | 0.93

(Slide annotation: "Not the best algorithm to tune. Do you have a better one?")
Results for automated parameter setting, Novelty+ on Mixed
Compared to random parameters
Compared to best fixed parameters
Overview
• Previous work on empirical hardness models [Leyton-Brown, Nudelman et al. ’02 & ’04]
• EH models for randomized algorithms
• EH models and automated tuning for parametric algorithms
• Conclusions
Conclusions
• We can predict the run-time of randomized, incomplete, parameterized local search algorithms
• We can automatically find good parameter settings
  Better than the default settings
  Sometimes better than the best possible fixed setting
• There’s no free lunch
  Long initial training time
  Need domain knowledge to define features for a domain (only once per domain)
Future work
• Even better predictions
  More involved ML techniques, e.g. Gaussian processes
  Predictive uncertainty, to know when our predictions are not reliable
• Reduce training time
  Especially for high-dimensional parameter spaces, the current approach will not scale
  Use active learning to choose the best parameter configurations to train on
• We need compelling domains: please come talk to me!
  Complete parametric SAT solvers
  Parametric solvers for other domains where features can be defined (CP? Even planning?)
  Optimization algorithms
The End
• Thanks to
  Holger Hoos, Kevin Leyton-Brown, Youssef Hamadi
  Reviewers, for helpful comments
  You, for your attention
Experimental setup: solvers
• Two SAT solvers
  Novelty+ (WalkSAT variant)
  - Default noise setting 0.5 (= 50%) for unstructured instances
  - Noise setting 0.1 used for structured instances
  SAPS (Scaling and Probabilistic Smoothing)
  - Default setting (α, ρ) = (1.3, 0.8)
  The adaptive version of Novelty+ won the SAT04 competition for random instances; SAPS came second
• Runs cut off after 15 minutes
Which features are most important? (from extended CP version)
• Results consistent with those for deterministic tree-search algorithms
  Graph-based and DPLL-based features
  Local search probes are even more important here
• Only very few features are needed for good models
  Previously observed for all-sat data [Nudelman et al. ’04]
  A single quadratic basis function is often almost as good as the best feature subset
  Strong correlation between features: many choices yield comparable performance