Faster than Weighted A*: An Optimistic Approach to Bounded Suboptimal Search
Jordan Thayer and Wheeler Ruml
{jtd7, ruml} at cs.unh.edu
Jordan Thayer (UNH) Optimistic Search – 1 / 45
Motivation
■ Finding optimal solutions is prohibitively expensive.
■ Greedy solutions can be arbitrarily bad.
■ Weighted A∗ bounds suboptimality.
■ Optimistic Search: faster search within the same bound.
[Figures: Grid Pathfinding. Nodes generated vs. problem size (200 to 1,000), and solution cost relative to A* (1.0 to 1.6) vs. problem size, for A*, wA*, Optimistic, and Greedy.]
Algorithm Overview
■ Algorithm Overview
  Run weighted A∗ with a weight higher than the bound.
  Expand additional nodes to prove solution quality.
■ The Greedy Search Phase
■ The Cleanup Phase
■ Empirical Evaluation
■ Further Observations
Previous Algorithms: A∗
■ A best-first search expanding nodes in f order.
■ f(n) = g(n) + h(n)
If h(n) is admissible, A∗ returns an optimal solution.
Previous Algorithms: Weighted A∗
■ A best-first search expanding nodes in f′ order.
■ f′(n) = g(n) + w · h(n)
For admissible h(n), solution cost is bounded by w times optimal.
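The f′ ordering above amounts to a standard best-first search with an inflated heuristic. A minimal sketch, assuming a graph given by a `neighbors` function and an admissible heuristic `h` (all names here are mine, not from the talk):

```python
import heapq

def weighted_a_star(start, goal, neighbors, h, w=2.0):
    """Best-first search on f'(n) = g(n) + w * h(n).

    With admissible h, the returned cost is at most w times optimal.
    `neighbors(n)` yields (successor, edge_cost) pairs.
    """
    open_list = [(w * h(start), 0.0, start, [start])]  # (f', g, node, path)
    best_g = {start: 0.0}
    while open_list:
        f_prime, g, n, path = heapq.heappop(open_list)
        if n == goal:
            return g, path
        if g > best_g.get(n, float("inf")):
            continue  # stale queue entry for a node reached more cheaply
        for child, cost in neighbors(n):
            g2 = g + cost
            if g2 < best_g.get(child, float("inf")):
                best_g[child] = g2
                heapq.heappush(open_list,
                               (g2 + w * h(child), g2, child, path + [child]))
    return None
```

With w = 1 this is exactly A∗; larger w makes the search greedier on h.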
Optimistic Search: The Basic Idea
1. Run weighted A∗ with a high weight.
2. After a solution is found, expand the node with the lowest f value.
Continue until w · f(fmin) ≥ f(sol).
This 'cleanup' guarantees solution quality.
1: Greedy Phase
■ Algorithm Overview
■ The Greedy Search Phase
  Weighted A∗ becomes faster as the bound grows.
  Weighted A∗ is often better than the bound.
■ The Cleanup Phase
■ Empirical Evaluation
■ Further Observations
Large Bounds, Faster Solution
■ wA∗ returns solutions faster as the bound increases.
[Figure: Pearl and Kim Hard. Node generations relative to A* (0.0 to 0.9) vs. sub-optimality bound (1.0 to 1.2), for wA*.]
Weighted A∗ is often better than the bound
■ wA∗ returns solutions better than the bound.
[Figure: Four-way Grid Pathfinding (Unit cost). Solution cost relative to A* (1 to 3) vs. sub-optimality bound (1 to 3); wA* plotted against the line y = x.]
2: Cleanup Phase
■ Algorithm Overview
■ The Greedy Search Phase
■ The Cleanup Phase
  Expand additional nodes in f order.
  Quit when the solution is provably within the bound.
■ Empirical Evaluation
■ Further Observations
Proving w-Admissibility
■ p is the deepest node on an optimal path to opt.
■ fmin is the open node with the smallest f value.

f(fmin) ≤ f(p) ≤ f(opt)

fmin therefore provides a lower bound on the optimal solution cost; it is found with a priority queue sorted on f.

Optimistic Search: run a greedy search, then expand fmin until w · f(fmin) ≥ f(sol).
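The stopping rule these inequalities justify is a one-line predicate; a sketch (the function name is mine, not from the talk):

```python
def incumbent_is_w_admissible(w, f_min, incumbent_cost):
    """True once the cleanup phase may stop.

    For admissible h, f_min = min f(n) over open nodes is a lower bound
    on the optimal solution cost, so w * f_min >= incumbent_cost implies
    incumbent_cost <= w * f(opt).
    """
    return w * f_min >= incumbent_cost
```

During cleanup, every expansion of the lowest-f node can only raise f_min, driving this test toward true.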
Empirical Evaluation
■ Algorithm Overview
■ The Greedy Search Phase
■ Guaranteeing Solution Quality
■ Empirical Evaluation
  Results in several domains.
■ Further Observations
■ Sliding Tile Puzzles
  Korf’s 100 15-puzzle instances
■ Traveling Salesman
  Unit Square
  Pearl and Kim Hard
■ Grid-world Pathfinding
  Four-way and eight-way movement
  Unit and life cost models
  25%, 30%, 35%, 40%, 45% obstacles
■ Temporal Planning
  Blocksworld, Logistics, Rover, Satellite, Zenotravel

See the paper for additional plots.
Performance of Optimistic Search
Korf’s 15 Puzzles: h = Manhattan Distance
[Figure: Node generations relative to IDA* (0.0 to 0.09) vs. sub-optimality bound (1.2 to 2.0), for wA* and Optimistic.]
TSP: Pearl and Kim Hard
[Figure: Node generations relative to A* (0.0 to 0.9) vs. sub-optimality bound (1.0 to 1.2), for wA* and Optimistic.]
Four-way Grid Pathfinding (Unit cost)
[Figure: Nodes generated relative to A* (0.0 to 0.9) vs. sub-optimality bound (1 to 3), for wA* and Optimistic.]
Logistics (problem 3)
[Figure: Nodes generated relative to A* (0.0 to 1.2) vs. sub-optimality bound (1 to 3), for wA* and Optimistic Search.]
Further Observations
■ Algorithm Overview
■ The Greedy Search Phase
■ Guaranteeing Solution Quality
■ Empirical Evaluation
■ Further Observations
  Strict vs. loose expansion policy
  Bounded Anytime Weighted A∗
Expansion Policy
Strict Expansion Order:
■ Algorithms like wA∗, A∗ε, and Dynamically Weighted A∗.
■ Any expanded node can be shown to be within the bound at the time of its expansion.
■ The quality bound comes from this property.

Loose Expansion Order:
■ Algorithms like Optimistic Search.
■ No restriction on the nodes expanded initially.
■ The quality bound requires node expansion beyond the initial solution.
Bounded Anytime Weighted A∗
■ Anytime Heuristic Search:
  Run weighted A∗ with a high weight.
  Continue node expansions after a solution is found.
■ Bounded Anytime Weighted A∗:
  Run weighted A∗ with a high weight.
  Continue node expansions after a solution is found.
  A second priority queue allows convergence on a bound instead of on the optimal solution.
Optimistic Search expansions
1. Run weighted A∗ with a high weight.
2. After a solution is found, expand the node with the lowest f value.
Continue until w · f(fmin) ≥ f(sol).
This 'cleanup' guarantees solution quality.
Bounded Anytime Weighted A∗ Expansions
1. Run weighted A∗ with a high weight.
2. After a solution is found, expand the node with the lowest f′ value.
Continue until w · f(fmin) ≥ f(sol).
This 'cleanup' guarantees solution quality.
Bounded Anytime Weighted A*
Korf’s 15 Puzzles
[Figure: Node generations relative to IDA* (0.0 to 0.09) vs. sub-optimality bound (1.2 to 2.0), for BAwA*, wA*, and Optimistic.]
Pearl and Kim Hard
[Figure: Node generations relative to A* (0.0 to 0.9) vs. sub-optimality bound (1.0 to 1.2), for BAwA*, wA*, and Optimistic.]
Conclusion
Optimistic Search:
■ Simple to implement.
■ Performance is predictable.
■ Current results are good; tuning could help.
  Optimal greediness is still an open question.
■ Consistently better than Weighted A∗.

If you currently use wA∗, you should use Optimistic Search.
The University of New Hampshire
Tell your students to apply to grad school in CS at UNH!
■ friendly faculty
■ funding
■ individual attention
■ beautiful campus
■ low cost of living
■ easy access to Boston and the White Mountains
■ strong in AI, infoviz, networking, systems, bioinformatics
Additional Slides
Weighted A∗ Respects a Bound
f(n) = g(n) + h(n)
f′(n) = g(n) + w · h(n)

g(sol) ≤ f′(sol) ≤ f′(p)
       = g(p) + w · h(p)
       ≤ w · (g(p) + h(p))
       = w · f(p)
       ≤ w · f(opt) = w · g(opt)

Therefore, g(sol) ≤ w · g(opt).
Weighted A∗ Respects the Bound and Then Some
f(n) = g(n) + h(n)
f′(n) = g(n) + w · h(n)

g(sol) ≤ f′(sol) ≤ f′(p)
       = g(p) + w · h(p)
       ≤ w · (g(p) + h(p))
       = w · f(p)
       ≤ w · f(opt) = w · g(opt)

The step g(p) + w · h(p) ≤ w · g(p) + w · h(p) is loose whenever g(p) > 0, which is why wA∗ typically returns solutions well inside the bound.
Duplicate Dropping can be Important
Four-way Grid Pathfinding (Unit cost)
[Figure: Nodes generated relative to A* (0.0 to 0.9) vs. sub-optimality bound (1 to 3), for wA* and wA* with duplicate dropping (wA* dd).]
Sometimes it isn’t
Korf’s 15 Puzzles
[Figure: Node generations relative to IDA* (0.0 to 0.09) vs. sub-optimality bound (1.1 to 1.5), for wA* dd and wA*.]
Duplicates can be delayed during the greedy search phase.
Pseudo Code
Optimistic Search(initial, bound)
 1. open_f ← {initial}
 2. open_f̂ ← {initial}
 3. incumbent ← ∞
 4. repeat until bound · f(first on open_f) ≥ f(incumbent):
 5.   if f̂(first on open_f̂) < f̂(incumbent) then
 6.     n ← remove first on open_f̂
 7.     remove n from open_f
 8.   else
 9.     n ← remove first on open_f
10.     remove n from open_f̂
11.   add n to closed
12.   if n is a goal then
13.     incumbent ← n
14.   else for each child c of n:
15.     if c is duplicated in open_f then
16.       if c is better than the duplicate then
17.         replace copies in open_f and open_f̂
18.     else if c is duplicated in closed then
19.       if c is better than the duplicate then
20.         add c to open_f and open_f̂
21.     else add c to open_f and open_f̂
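The pseudocode above can be rendered as runnable Python. This is a sketch, not the authors' code: the duplicate handling is condensed into a best-g table with lazy queue deletion, and the names (`optimistic_search`, `weight`, `peek`) are mine. `weight` plays the role of the optimistic greedy weight, which should exceed `bound`.

```python
import heapq

def optimistic_search(initial, is_goal, neighbors, h, bound, weight):
    """Two-queue bounded-suboptimal search: f(n) = g(n) + h(n),
    fhat(n) = g(n) + weight * h(n). For admissible h, the returned
    solution cost is at most `bound` times optimal."""
    g = {initial: 0.0}
    open_f = [(h(initial), 0.0, initial)]              # sorted on f
    open_fhat = [(weight * h(initial), 0.0, initial)]  # sorted on fhat
    closed = {}                     # node -> g value at last expansion
    incumbent, incumbent_g = None, float("inf")

    def peek(heap):
        """Drop stale or already-expanded entries; return best or None."""
        while heap:
            key, gn, n = heap[0]
            if gn == g.get(n) and closed.get(n) != gn:
                return key, gn, n
            heapq.heappop(heap)
        return None

    while True:
        best_f = peek(open_f)
        if best_f is None or bound * best_f[0] >= incumbent_g:
            return incumbent, incumbent_g   # incumbent proven within bound
        best_fhat = peek(open_fhat)
        # Greedy phase: follow fhat while it still promises a better
        # incumbent; otherwise cleanup: raise the f lower bound.
        if best_fhat is not None and best_fhat[0] < incumbent_g:
            _, gn, n = best_fhat
        else:
            _, gn, n = best_f
        closed[n] = gn              # expanding n removes it from both queues
        if is_goal(n):
            if gn < incumbent_g:
                incumbent, incumbent_g = n, gn
            continue
        for child, cost in neighbors(n):
            g2 = gn + cost
            if g2 < g.get(child, float("inf")):
                g[child] = g2       # improving g also reopens a closed node
                heapq.heappush(open_f, (g2 + h(child), g2, child))
                heapq.heappush(open_fhat, (g2 + weight * h(child), g2, child))
```

Because fhat(incumbent) = g(incumbent) at a goal (h = 0), the `best_fhat[0] < incumbent_g` test matches line 5 of the pseudocode, and the loop's exit condition matches line 4.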