Sub-optimal Heuristic Search and its Application to Planning for Manipulation
Mike Phillips
Slides adapted from Maxim Likhachev
Planning for Mobile Manipulation
• What planning tasks are there?
– task planning (order of getting items)
– planning for navigation
– planning motions for an arm
– planning coordinated base+arm (full-body) motions
– planning how to grasp
(this class)

robotic bartender demo at the Search-based Planning Lab, April '12
Carnegie Mellon University
Planning for Mobile Manipulation
• Example of planning for a 20-DOF arm in 2D:
– each state is defined by 20 discretized joint angles {q1, q2, …, q20}
– each action changes one angle (or a set of angles) at a time

How do we search such a high-dimensional state space?
Typical Approaches to Motion Planning

• Planning with optimal heuristic search-based approaches (e.g., A*, D*):
+ completeness/optimality guarantees
+ excellent cost minimization
+ consistent solutions to similar motion queries
− slow and memory-intensive
− used mostly for 2D (x, y) path planning

• Planning with sampling-based approaches (e.g., RRT, PRM, …):
+ typically very fast and low-memory
+ used for high-D path/motion planning
− completeness only in the limit
− often poor cost minimization
− hard to provide solution consistency

Sub-optimal heuristic searches can be fast enough for high-D motion planning and return consistent, good-quality solutions.
• Heuristic search? Wouldn't you need to store m^20 states in memory (m = the number of discretized values per joint)?
• No: states are generated on the fly as the search expands states.
Sub-optimal Heuristic Search for High-D Motion Planning
• ARA* (an anytime version of A*) graph search
– effective use of solutions to relaxed (easier) motion planning problems
• Experience Graphs
– heuristic search that learns from its planning experiences
Heuristics in Heuristic Search
• A* search: expands states in the order of f(s) = g(s) + h(s) values
• In A*, when s is expanded, g(s) is optimal
• h(s) is obtained by solving a relaxed (simplified) version of the problem

[figure: a path from start through s to goal; g(s) is the cost from start to s, h(s) the estimated cost from s to goal]

• If h(s) were a perfect estimate (call it h*), what would f(s) represent upon expansion of s?
• Now, what if h(s) isn't perfect?
University of Pennsylvania
• A* search: expands states in the order of f = g + h values

ComputePath function:
while (sgoal is not expanded)
  remove s with the smallest [f(s) = g(s) + h(s)] from OPEN;
  insert s into CLOSED;
  for every successor s' of s such that s' is not in CLOSED
    if g(s') > g(s) + c(s, s')
      g(s') = g(s) + c(s, s');
      insert s' into OPEN;

[figure: expansion of s in a graph running from sstart through s1, s2, … to sgoal]
Here g(s) is the cost of the shortest path from sstart to s found so far, and h(s) is an (under)estimate of the cost of a shortest path from s to sgoal.
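The ComputePath loop above can be sketched directly in Python. This is a minimal illustration, not the lab's planner: the graph is implicit, i.e. `successors(s)` generates neighbor states and edge costs on the fly, which is exactly why a 20D state space never has to be stored in memory. The 5×5 grid and Manhattan heuristic in the usage example are invented for the demo.

```python
import heapq

def a_star(start, goal, successors, h):
    """A*: expand states in order of f(s) = g(s) + h(s).

    `successors(s)` yields (s', cost) pairs on the fly, so only the
    states the search actually touches are ever stored.
    """
    g = {start: 0}                       # cost of best path found so far
    parent = {start: None}
    OPEN = [(h(start), start)]           # priority queue ordered by f
    CLOSED = set()
    while OPEN:
        _, s = heapq.heappop(OPEN)
        if s in CLOSED:
            continue                     # stale duplicate entry
        if s == goal:                    # goal expanded: g(goal) is optimal
            path = []
            while s is not None:
                path.append(s)
                s = parent[s]
            return path[::-1], g[goal]
        CLOSED.add(s)
        for s2, c in successors(s):
            if s2 not in CLOSED and g[s] + c < g.get(s2, float("inf")):
                g[s2] = g[s] + c
                parent[s2] = s
                heapq.heappush(OPEN, (g[s2] + h(s2), s2))
    return None, float("inf")

# demo: 4-connected grid with a wall at x = 2 for y <= 2
def successors(s):
    x, y = s
    for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
        if 0 <= nx < 5 and 0 <= ny < 5 and (nx, ny) not in {(2, 0), (2, 1), (2, 2)}:
            yield (nx, ny), 1

path, cost = a_star((0, 0), (4, 0), successors,
                    h=lambda s: abs(s[0] - 4) + abs(s[1]))  # Manhattan to (4, 0)
```

Since the Manhattan heuristic is consistent here, the first expansion of the goal yields an optimal path (cost 10 around the wall).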
Heuristics in Heuristic Search
• Solutions to a relaxed version of the planning problem:
– 2D (x, y) planning: Euclidean-distance heuristic (2D planning without obstacles and no grid)
– 20D arm planning: 2D end-effector distance heuristic (2D planning for the end-effector with obstacles)
Heuristics in Heuristic Search
• Dijkstra's: expands states in the order of f = g values
• A* search: expands states in the order of f = g + h values
• Weighted A*: expands states in the order of f = g + εh values, ε > 1 (a bias toward states that are closer to the goal)

[figures: the states expanded between sstart and sgoal under each search]
• What are the states expanded under Dijkstra's? Under A*?
• For high-D problems, this results in A* being too slow and running out of memory.
Heuristics in Heuristic Search
• Weighted A* search:
– trades off optimality for speed
– ε-suboptimal: cost(solution) ≤ ε · cost(optimal solution)
– in many domains, it has been shown to be orders of magnitude faster than A*

Example: A* (ε = 1.0): 20 expansions, solution = 10 moves. Weighted A* (ε = 2.5): 13 expansions, solution = 11 moves.
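The ε-suboptimality bound can be checked empirically with a small sketch (the grid problem is invented for the demo, not taken from the slides): the same search is run with ε = 1.0 (plain A*) and ε = 2.5, and the weighted solution cost must not exceed ε times the optimum.

```python
import heapq

def weighted_a_star(start, goal, successors, h, eps):
    """Weighted A*: expand states in order of f(s) = g(s) + eps*h(s).
    Returns (solution cost, number of expansions)."""
    g = {start: 0}
    OPEN = [(eps * h(start), start)]
    CLOSED = set()
    while OPEN:
        _, s = heapq.heappop(OPEN)
        if s in CLOSED:
            continue
        if s == goal:
            return g[goal], len(CLOSED)
        CLOSED.add(s)
        for s2, c in successors(s):
            if s2 not in CLOSED and g[s] + c < g.get(s2, float("inf")):
                g[s2] = g[s] + c
                heapq.heappush(OPEN, (g[s2] + eps * h(s2), s2))
    return float("inf"), len(CLOSED)

# demo: 4-connected grid with a wall at x = 2 for y <= 2
WALL = {(2, 0), (2, 1), (2, 2)}

def successors(s):
    x, y = s
    for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
        if 0 <= nx < 5 and 0 <= ny < 5 and (nx, ny) not in WALL:
            yield (nx, ny), 1

h = lambda s: abs(s[0] - 4) + abs(s[1])          # Manhattan distance to (4, 0)
c_opt, n_opt = weighted_a_star((0, 0), (4, 0), successors, h, eps=1.0)
c_w, n_w = weighted_a_star((0, 0), (4, 0), successors, h, eps=2.5)
# bound: cost(solution) <= eps * cost(optimal solution)
assert c_opt <= c_w <= 2.5 * c_opt
```

Because CLOSED states are never re-expanded, this is weighted A* without re-expansions; the ε bound still holds for a consistent h, and the inflated heuristic typically (though not always) expands fewer states.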
Anytime Search Based on Weighted A*
• Constructing an anytime search based on weighted A*:
– find the best path possible given some amount of time for planning
– do it by running a series of weighted A* searches with decreasing ε:
  ε = 2.5: 13 expansions, solution = 11 moves
  ε = 1.5: 15 expansions, solution = 11 moves
  ε = 1.0: 20 expansions, solution = 10 moves
• This is inefficient because:
– many state values remain the same between search iterations
– we should be able to reuse the results of previous searches
• ARA* is an efficient version of the above that reuses state values between search iterations.
University of Pennsylvania
ARA*
• An efficient series of weighted A* searches with decreasing ε:

ComputePathwithReuse function:
while (f(sgoal) > minimum f-value in OPEN)
  remove s with the smallest [g(s) + εh(s)] from OPEN;
  insert s into CLOSED;
  for every successor s' of s
    if g(s') > g(s) + c(s, s')
      g(s') = g(s) + c(s, s');
      if s' not in CLOSED then insert s' into OPEN;
      otherwise insert s' into INCONS;

Main loop:
set ε to a large value; g(sstart) = 0; OPEN = {sstart};
while ε ≥ 1
  CLOSED = {}; INCONS = {};
  ComputePathwithReuse();
  publish current ε-suboptimal solution;
  decrease ε;
  initialize OPEN = OPEN ∪ INCONS;
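The two pseudocode fragments above (ComputePathwithReuse plus the outer ε-decreasing loop) can be sketched as follows. This is a simplified illustration of ARA*'s reuse idea, assuming a consistent heuristic; the grid problem at the bottom is invented for the demo.

```python
import heapq

INF = float("inf")

def improve_path(goal, successors, h, eps, g, OPEN, CLOSED, INCONS):
    # ComputePathwithReuse: a weighted-A* pass that keeps g-values from
    # earlier iterations.  States whose g improves AFTER they were
    # expanded (CLOSED) are not re-expanded now; they go into INCONS
    # and seed OPEN for the next, smaller-eps iteration.
    heap = [(g[s] + eps * h(s), s) for s in OPEN]
    heapq.heapify(heap)
    while heap and g.get(goal, INF) + eps * h(goal) > heap[0][0]:
        f, s = heapq.heappop(heap)
        if s in CLOSED or f > g[s] + eps * h(s):
            continue                        # already expanded / stale entry
        OPEN.discard(s)
        CLOSED.add(s)
        for s2, c in successors(s):
            if g[s] + c < g.get(s2, INF):
                g[s2] = g[s] + c
                if s2 in CLOSED:
                    INCONS.add(s2)          # locally inconsistent: defer
                else:
                    OPEN.add(s2)
                    heapq.heappush(heap, (g[s2] + eps * h(s2), s2))

def ara_star(start, goal, successors, h, eps0=2.5, step=0.5):
    g = {start: 0}
    OPEN = {start}
    eps, solutions = eps0, []
    while eps >= 1.0:
        CLOSED, INCONS = set(), set()
        improve_path(goal, successors, h, eps, g, OPEN, CLOSED, INCONS)
        solutions.append((eps, g.get(goal, INF)))  # publish eps-suboptimal cost
        OPEN |= INCONS                             # OPEN = OPEN U INCONS
        eps -= step                                # decrease eps
    return solutions

# demo: 4-connected grid with a wall at x = 2 for y <= 2
WALL = {(2, 0), (2, 1), (2, 2)}

def successors(s):
    x, y = s
    for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
        if 0 <= nx < 5 and 0 <= ny < 5 and (nx, ny) not in WALL:
            yield (nx, ny), 1

sols = ara_star((0, 0), (4, 0), successors,
                h=lambda s: abs(s[0] - 4) + abs(s[1]))
```

Each published solution is within its ε of optimal, and because g-values only decrease across iterations, later (smaller-ε) iterations typically terminate almost immediately, mirroring the "1 expansion / 9 expansions" behavior on the next slide.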
• A series of weighted A* searches:
  ε = 2.5: 13 expansions, solution = 11 moves
  ε = 1.5: 15 expansions, solution = 11 moves
  ε = 1.0: 20 expansions, solution = 10 moves
• ARA*:
  ε = 2.5: 13 expansions, solution = 11 moves
  ε = 1.5: 1 expansion, solution = 11 moves
  ε = 1.0: 9 expansions, solution = 10 moves
ARA*
• When using ARA*, the research lies in finding a graph representation G and a corresponding heuristic function h that lead to shallow local minima for the search.
• Key to finding a solution fast: shallow minima in the h(s) − h*(s) function.
ARA*-based Planning for Manipulation
• ARA*-based motion planning for a single 7-DOF arm [Cohen et al., ICRA'10, ICRA'11]
– the goal is given as a 6D (x, y, z, roll, pitch, yaw) pose of the end-effector
– each state is: 4 non-wrist joint angles {q1, …, q4} when far from the goal; all 7 joint angles {q1, …, q7} when close to the goal
– actions are: {moving one joint at a time by a fixed Δ; an IK-based motion to snap to the goal when close to it}
– heuristic: 3D BFS distances for the end-effector

[figures: expansions (shown as dots corresponding to end-effector poses) under the 3D BFS heuristic vs. under the Euclidean-distance heuristic (heuristic path shown as a curve)]
• Shown on the PR2 robot, but the planner has been run on other arms (e.g., KUKA).
ARA*-based Planning for Manipulation
• ARA*-based motion planning for dual-arm manipulation [Cohen et al., ICRA'12]
– the goal is given as a 4D (x, y, z, yaw) pose for the object
– each state is 6D: the 4D (x, y, z, yaw) pose of the object plus the elbow angles {q1, q2}
– actions are: {changing the 4D pose of the object, or changing q1, q2}
– heuristic: 3D BFS distances for the object

[videos: motions as generated by the planner, no smoothing; note the consistency of motions across similar start-goal trials]
ARA*-based Planning for Manipulation: Discussion

Pros:
• explicit cost minimization
• consistent plans (similar input generates similar output)
• can handle arbitrary goal sets (such as a set of grasps from a grasp planner)
• many path constraints are easy to implement (e.g., keeping the gripper upright)

Cons:
• requires good graph and heuristic design to plan fast
• the number of motions available from a state affects both speed and path quality
Sub-optimal Heuristic Search for High-D Motion Planning
• Experience Graphs
– heuristic search that learns from its planning experiences
Planning with Experience Graphs
• Many planning tasks are repetitive:
– loading a dishwasher
– opening doors
– moving objects around a warehouse
– …
• Can we re-use prior experience to accelerate planning, in the context of search-based planning?
• This would be especially useful for high-dimensional problems such as mobile manipulation!
Planning with Experience Graphs
• Given a set of previously computed paths (experiences), put them together into an E-graph (Experience Graph).
• E-graph [Phillips et al., RSS'12]:
– a collection of previously computed paths or demonstrations
– a sub-graph of the original graph
• Given a new planning query, re-use the E-graph: for repetitive tasks, planning becomes much faster.
Theorem 1: The algorithm is complete with respect to the original graph.
Theorem 2: The cost of the solution is within a given bound on sub-optimality.
Planning with Experience Graphs
• Re-use the E-graph by introducing a new heuristic function.
• The heuristic guides the search toward expanding states on the E-graph, within the sub-optimality bound ε.
Planning with Experience Graphs
• Focusing the search toward the E-graph within sub-optimality bound ε.

Paper excerpt [Phillips et al., RSS'12], as shown on the slide:

A consistent heuristic is one that satisfies the triangle inequality: hG(s, sgoal) ≤ c(s, s') + hG(s', sgoal).

C. Algorithm Detail
The planner maintains two graphs, G and GE. At the high level, every time the planner receives a new planning request the findPath function is called. It first updates GE to account for edge cost changes and performs some precomputations. Then it calls the computePath function, which produces a path π. This path is then added to GE. The updateEGraph function works by updating any edge costs in GE that have changed. If any edges are invalid (e.g., they are now blocked by obstacles) they are put into a disabled list. Conversely, if an edge in the disabled list now has finite cost it is re-enabled. At this point, the graph GE should only contain finite edges. A precomputeShortcuts function is then called, which can be used to compute shortcut edges before the search begins. Ways to compute shortcuts are discussed in Section V. Finally, our heuristic hE, which encourages path reuse, is computed.

findPath(sstart, sgoal)
1: updateEGraph(sgoal)
2: π = computePath(sstart, sgoal)
3: GE = GE ∪ π

updateEGraph(sgoal)
1: updateChangedCosts()
2: disable edges that are now invalid
3: re-enable disabled edges that are now valid
4: precomputeShortcuts()
5: compute heuristic hE according to Equation 1

Our algorithm's speed-up comes from being able to reuse parts of old paths and avoid searching large portions of graph G. To accomplish this we introduce a heuristic which intelligently guides the search toward GE when it looks like following parts of old paths will help the search get close to the goal. We define a new heuristic hE in terms of the given heuristic hG and edges in GE:

hE(s0) = min_π Σ_{i=0}^{N−2} min{ εE · hG(si, si+1), cE(si, si+1) }    (1)

where π is a path ⟨s0 … sN−1⟩ with sN−1 = sgoal, and εE is a scalar ≥ 1.

Equation 1 is a minimization over a sequence of segments, such that each segment is a pair of arbitrary states sa, sb ∈ G and the cost of a segment is the minimum of two things: either εE·hG(sa, sb), the original heuristic inflated by εE, or the cost of an actual least-cost path in GE, provided that sa, sb ∈ GE.

In Figure 2, the path π that minimizes the heuristic contains a sequence of alternating segments. In reality, π can alternate between hG and GE segments as many or as few times as needed to produce the minimal π. On a GE segment we can have many states si connecting two states, while on an hG segment a single pair of points suffices, since no real edge between two states is needed for hG to be defined.

Fig. 2. A visualization of Equation 1. Solid lines are composed of edges from GE, while dashed lines are distances according to hG. Note that hG segments are always only two points long, while GE segments can contain an arbitrary number of points.

Fig. 3. Shortest π according to hE as εE changes: (a) εE = 1, (b) εE = 2, (c) εE → ∞. The dark solid lines are paths in GE, while the dark dashed lines are the heuristic's path π. Note that as εE increases, the heuristic prefers to travel on GE. The light gray circles and lines show the graph G, and the filled-in gray circles represent the states expanded under the guidance of the heuristic.

The larger εE is, the more we avoid exploring G and focus on traveling on paths in GE. Figure 3 demonstrates how this works. As εE increases, it becomes more expensive to travel off of GE, causing the heuristic to guide the search along parts of GE. In Figure 3a, the heuristic ignores the graph GE because, without inflating hG at all, following real edge costs (like those in GE) will never be the cheaper option. In the other parts of Figure 3 we can see that as εE increases, the heuristic uses more and more of GE. The figure also shows how, by following old paths, the search can avoid obstacles and make far fewer expansions; the expanded states (filled-in gray circles) change based on how hE is biased by εE.

The computePath function runs weighted A* without re-expansions [13, 10]. Weighted A* uses a parameter εw > 1 to inflate the heuristic used by A*. The solution cost is guaranteed to be no worse than εw times the cost of the optimal solution, and in practice it runs dramatically faster than A*. The main modification to weighted A* is that, in addition to using the edges that G already provides (getSuccessors), we add two additional types of successors: shortcuts and snap motions. The only other change is that instead of using the …

In short:
• Traveling on the E-graph uses actual costs.
• Traveling off the E-graph uses an inflated original heuristic.
• The heuristic computation finds a min-cost path using these two kinds of "edges".
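Equation 1 above can be evaluated without enumerating paths π: since consecutive hG "jumps" collapse when hG is consistent, it suffices to run a Dijkstra from sgoal over the E-graph vertices, where any pair of vertices is connected at the inflated-heuristic cost εE·hG and E-graph edges keep their real cost cE. The sketch below is a simplified reading of that computation, not the paper's implementation; the 1D example states, costs, and the helper name `egraph_heuristic` are invented.

```python
import heapq

def egraph_heuristic(goal, egraph_nodes, egraph_edges, hG, epsE):
    """Precompute dist(v) = min cost from each E-graph vertex v to goal
    under Equation 1: travel between any two vertices costs epsE*hG
    (an inflated-heuristic 'jump'), while E-graph edges cost their
    real cost cE.  Returns hE as a function of the query state."""
    nodes = set(egraph_nodes) | {goal}
    dist = {v: epsE * hG(v, goal) for v in nodes}   # direct jump to goal
    heap = [(d, v) for v, d in dist.items()]
    heapq.heapify(heap)
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue                                # stale entry
        for u in nodes:
            cand = d + epsE * hG(v, u)              # jump edge
            if (v, u) in egraph_edges:              # real E-graph edge
                cand = min(cand, d + egraph_edges[(v, u)])
            if cand < dist[u]:
                dist[u] = cand
                heapq.heappush(heap, (cand, u))
    # hE(s): jump straight toward the goal, or jump onto the E-graph at
    # some vertex v and follow the precomputed dist(v) from there
    return lambda s: min(epsE * hG(s, v) + dist[v] for v in nodes)

# toy 1D example (invented): states are points on a line, hG = |a - b|,
# and the E-graph contains one previously traveled edge 0 <-> 10
hG = lambda a, b: abs(a - b)
hE = egraph_heuristic(goal=10, egraph_nodes=[0, 10],
                      egraph_edges={(0, 10): 10, (10, 0): 10},
                      hG=hG, epsE=2.0)
# on the E-graph the real cost (10) beats the inflated jump (2.0 * 10 = 20),
# so hE pulls the search toward the experienced edge
```

With εE = 1 the jump term always wins and hE degenerates to hG, matching Figure 3a; as εE grows, traveling off the E-graph becomes relatively more expensive, matching Figures 3b-3c.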
Maxim Likhachev
Planning with Experience Graphs • Focusing search towards E-graph within sub-optimality bound ε
hG
cε
consistent heuristic is one that satisfies the triangle inequality,hG(s, sgoal) ! c(s, s!) + hG(s!, sgoal).
C. Algorithm DetailThe planner maintains two graphs, G and GE . At the high-
level, every time the planner receives a new planning requestthe findPath function is called. It first updates GE to accountfor edge cost changes and perform some precomputations.Then it calls the computePath function, which produces apath !. This path is then added to GE . The updateEGraphfunction works by updating any edge costs in GE that havechanged. If any edges are invalid (e.g. they are now blockedby obstacles) they are put into a disabled list. Conversely, ifan edge in the disabled list now has finite cost it is re-enabled.At this point, the graph GE should only contain finite edges.A precomputeShortcuts function is then called which can beused to compute shortcut edges before the search begins. Waysto compute shortcuts are discussed in Section V. Finally, ourheuristic hE , which encourages path reuse, is computed.
findPath(sstart, sgoal)1: updateEGraph(sgoal)2: ! = computePath(sstart, sgoal)
3: GE = GE ! !
updateEGraph(sgoal)1: updateChangedCosts()2: disable edges that are now invalid3: re-enable disabled edges that are now valid4: precomputeShortcuts()
5: compute heuristic hE according to Equation 1
Our algorithm’s speed-up comes from being able to reuseparts of old paths and avoid searching large portions ofgraph G. To accomplish this we introduce a heuristic whichintelligently guides the search toward GE when it looks likefollowing parts of old paths will help the search get close tothe goal. We define a new heuristic hE in terms of the givenheuristic hG and edges in GE .
hE(s0) = min!
N"1!
i=0
min{"EhG(si, si+1), cE(si, si+1)} (1)
where ! is a path "s0 . . . sN"1# and sN"1 = sgoal and "E isa scalar $ 1.Equation 1 is a minimization over a sequence of segments
such that each segment is a pair of arbitrary states sa, sb % Gand the cost of this segment is given by the minimum of twothings: either "EhG(sa, sb), the original heuristic inflated by"E , or the cost of an actual least-cost path in GE , providedthat sa, sb % GE .In Figure 2, the path ! that minimizes the heuristic contains
a sequence of alternating segments. In reality ! can alternatebetween hG and GE segments as many or as few times asneeded to produce the minimal !. When there is a GE segmentwe can have many states si on the segment to connect two
Fig. 2. A visualization of Equation 1. Solid lines are composed of edgesfrom GE , while dashed lines are distances according to hG. Note that hG
segments are always only two points long, while GE segments can be anarbitrary number of points.
(a) "E = 1 (b) "E = 2
(c) "E " #
Fig. 3. Shortest ! according to hE as "E changes. The dark solid lines arepaths in GE while the dark dashed lines are the heuristic’s path !. Note as"E increases, the heuristic prefers to travel on GE . The light gray circles andlines show the graph G and the filled in gray circles represent the expandedstates under the guidance of the heuristic.
states, while when on a hG segment a single pair of pointssuffices since no real edge between two states is needed forhG to be defined.The larger "E is, the more we avoid exploring G and focus
on traveling on paths in GE . Figure 3 demonstrates how thisworks. As GE increases, it becomes more expensive to traveloff of GE causing the heuristic to guide the search along partsofGE . In Figure 3a, the heuristic ignores the graphGE becausewithout inflating hG at all, following real edges costs (likethose in GE ) will never be the cheaper option. In the otherparts of Figure 3 we can see as "E increase, the heuristic usesmore and more of GE . This figure also shows how during thesearch, by following old paths, we can avoid obstacles andhave far fewer expansions. The expanded states are shown asfilled in gray circles, which change based on how the hE isbiased by "E .The computePath function runs weighted A* without re-
expansions [13, 10]. Weighted A* uses a parameter "w >1 to inflate the heuristic used by A*. The solution cost isguaranteed to be no worse than "w times the cost of the optimalsolution and in practice it runs dramatically faster than A*.The main modification to Weighted A*, is that in addition tousing the edges that G already provides (getSuccessors), weadd two additional types of successors: shortcuts and snapmotions. The only other change is that instead of using the
Traveling on E-Graph uses actual costs
Traveling off the E-Graph uses an inflated original heuristic
Heuristic computation finds a min cost path using two kinds of “edges”
49 Carnegie Mellon University
goal
start
Maxim Likhachev
Planning with Experience Graphs • Focusing search towards E-graph within sub-optimality bound ε
goal
start
hG
cε
consistent heuristic is one that satisfies the triangle inequality,hG(s, sgoal) ! c(s, s!) + hG(s!, sgoal).
C. Algorithm DetailThe planner maintains two graphs, G and GE . At the high-
level, every time the planner receives a new planning requestthe findPath function is called. It first updates GE to accountfor edge cost changes and perform some precomputations.Then it calls the computePath function, which produces apath !. This path is then added to GE . The updateEGraphfunction works by updating any edge costs in GE that havechanged. If any edges are invalid (e.g. they are now blockedby obstacles) they are put into a disabled list. Conversely, ifan edge in the disabled list now has finite cost it is re-enabled.At this point, the graph GE should only contain finite edges.A precomputeShortcuts function is then called which can beused to compute shortcut edges before the search begins. Waysto compute shortcuts are discussed in Section V. Finally, ourheuristic hE , which encourages path reuse, is computed.
findPath(sstart, sgoal)1: updateEGraph(sgoal)2: ! = computePath(sstart, sgoal)
3: GE = GE ! !
updateEGraph(sgoal)1: updateChangedCosts()2: disable edges that are now invalid3: re-enable disabled edges that are now valid4: precomputeShortcuts()
5: compute heuristic hE according to Equation 1
Our algorithm’s speed-up comes from being able to reuseparts of old paths and avoid searching large portions ofgraph G. To accomplish this we introduce a heuristic whichintelligently guides the search toward GE when it looks likefollowing parts of old paths will help the search get close tothe goal. We define a new heuristic hE in terms of the givenheuristic hG and edges in GE .
hE(s0) = min!
N"1!
i=0
min{"EhG(si, si+1), cE(si, si+1)} (1)
where ! is a path "s0 . . . sN"1# and sN"1 = sgoal and "E isa scalar $ 1.Equation 1 is a minimization over a sequence of segments
such that each segment is a pair of arbitrary states sa, sb % Gand the cost of this segment is given by the minimum of twothings: either "EhG(sa, sb), the original heuristic inflated by"E , or the cost of an actual least-cost path in GE , providedthat sa, sb % GE .In Figure 2, the path ! that minimizes the heuristic contains
a sequence of alternating segments. In reality ! can alternatebetween hG and GE segments as many or as few times asneeded to produce the minimal !. When there is a GE segmentwe can have many states si on the segment to connect two
Fig. 2. A visualization of Equation 1. Solid lines are composed of edgesfrom GE , while dashed lines are distances according to hG. Note that hG
segments are always only two points long, while GE segments can be anarbitrary number of points.
(a) "E = 1 (b) "E = 2
(c) "E " #
Fig. 3. Shortest ! according to hE as "E changes. The dark solid lines arepaths in GE while the dark dashed lines are the heuristic’s path !. Note as"E increases, the heuristic prefers to travel on GE . The light gray circles andlines show the graph G and the filled in gray circles represent the expandedstates under the guidance of the heuristic.
states, while when on a hG segment a single pair of pointssuffices since no real edge between two states is needed forhG to be defined.The larger "E is, the more we avoid exploring G and focus
on traveling on paths in GE . Figure 3 demonstrates how thisworks. As GE increases, it becomes more expensive to traveloff of GE causing the heuristic to guide the search along partsofGE . In Figure 3a, the heuristic ignores the graphGE becausewithout inflating hG at all, following real edges costs (likethose in GE ) will never be the cheaper option. In the otherparts of Figure 3 we can see as "E increase, the heuristic usesmore and more of GE . This figure also shows how during thesearch, by following old paths, we can avoid obstacles andhave far fewer expansions. The expanded states are shown asfilled in gray circles, which change based on how the hE isbiased by "E .The computePath function runs weighted A* without re-
expansions [13, 10]. Weighted A* uses a parameter "w >1 to inflate the heuristic used by A*. The solution cost isguaranteed to be no worse than "w times the cost of the optimalsolution and in practice it runs dramatically faster than A*.The main modification to Weighted A*, is that in addition tousing the edges that G already provides (getSuccessors), weadd two additional types of successors: shortcuts and snapmotions. The only other change is that instead of using the
Traveling on E-Graph uses actual costs
Traveling off the E-Graph uses an inflated original heuristic
Heuristic computation finds a min cost path using two kinds of “edges”
50 Carnegie Mellon University
Maxim Likhachev
Planning with Experience Graphs • Focusing search towards E-graph within sub-optimality bound ε
consistent heuristic is one that satisfies the triangle inequality,hG(s, sgoal) ! c(s, s!) + hG(s!, sgoal).
C. Algorithm DetailThe planner maintains two graphs, G and GE . At the high-
level, every time the planner receives a new planning requestthe findPath function is called. It first updates GE to accountfor edge cost changes and perform some precomputations.Then it calls the computePath function, which produces apath !. This path is then added to GE . The updateEGraphfunction works by updating any edge costs in GE that havechanged. If any edges are invalid (e.g. they are now blockedby obstacles) they are put into a disabled list. Conversely, ifan edge in the disabled list now has finite cost it is re-enabled.At this point, the graph GE should only contain finite edges.A precomputeShortcuts function is then called which can beused to compute shortcut edges before the search begins. Waysto compute shortcuts are discussed in Section V. Finally, ourheuristic hE , which encourages path reuse, is computed.
findPath(sstart, sgoal)
1: updateEGraph(sgoal)
2: π = computePath(sstart, sgoal)
3: GE = GE ∪ π

updateEGraph(sgoal)
1: updateChangedCosts()
2: disable edges that are now invalid
3: re-enable disabled edges that are now valid
4: precomputeShortcuts()
5: compute heuristic hE according to Equation 1
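The edge bookkeeping in updateEGraph (moving now-invalid edges to a disabled list and re-enabling them when they become valid again) can be sketched as below; the class and method names are illustrative, not the authors' implementation:

```python
import math

class EGraph:
    def __init__(self):
        self.edges = {}      # (u, v) -> finite edge cost
        self.disabled = {}   # (u, v) -> last known finite cost

    def update_changed_costs(self, changed):
        # changed: {(u, v): new_cost}; math.inf marks a now-blocked edge
        for e, c in changed.items():
            if math.isinf(c):
                # edge is now invalid: park it in the disabled list
                old = self.edges.pop(e, None)
                if old is not None:
                    self.disabled[e] = old
            else:
                # edge is valid: re-enable it if it was disabled
                self.disabled.pop(e, None)
                self.edges[e] = c

g = EGraph()
g.update_changed_costs({('a', 'b'): 1.0, ('b', 'c'): 2.0})
g.update_changed_costs({('a', 'b'): math.inf})   # blocked by an obstacle
g.update_changed_costs({('a', 'b'): 1.5})        # valid again
print(sorted(g.edges))   # [('a', 'b'), ('b', 'c')]
```

After these updates the graph again contains only finite-cost edges, as the text requires.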
Our algorithm's speed-up comes from being able to reuse parts of old paths and avoid searching large portions of graph G. To accomplish this we introduce a heuristic which intelligently guides the search toward GE when it looks like following parts of old paths will help the search get close to the goal. We define a new heuristic hE in terms of the given heuristic hG and edges in GE.
hE(s0) = minπ Σi=0..N−1 min{ εE hG(si, si+1), cE(si, si+1) }    (1)
where π is a path ⟨s0 . . . sN−1⟩, sN−1 = sgoal, and εE is a scalar ≥ 1.

Equation 1 is a minimization over a sequence of segments such that each segment is a pair of arbitrary states sa, sb ∈ G and the cost of this segment is given by the minimum of two things: either εE hG(sa, sb), the original heuristic inflated by εE, or the cost of an actual least-cost path in GE, provided that sa, sb ∈ GE.

In Figure 2, the path π that minimizes the heuristic contains a sequence of alternating segments. In reality π can alternate between hG and GE segments as many or as few times as needed to produce the minimal π. When there is a GE segment we can have many states si on the segment to connect two
Fig. 2. A visualization of Equation 1. Solid lines are composed of edges from GE, while dashed lines are distances according to hG. Note that hG segments are always only two points long, while GE segments can be an arbitrary number of points.
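One way to evaluate Equation 1 is a single Dijkstra search from sgoal over the E-Graph vertices, where the cost of moving between any two vertices is min{εE·hG, cE}; hE of an arbitrary state is then a final minimization over the resulting table. This is only a sketch under assumed data structures (node lists, cost callables), not the paper's implementation, and it assumes hG itself satisfies the triangle inequality so that intermediate off-E-Graph states never help:

```python
import heapq
import math

def compute_hE_table(eg_nodes, eg_cost, h_G, goal, eps_E):
    # Dijkstra from the goal over E-Graph vertices; the cost between any
    # two vertices u, v is min(eps_E * h_G, c_E) as in Equation 1.
    # Dense O(|V|^2) relaxation -- fine for a sketch.
    nodes = list(eg_nodes) + [goal]
    dist = {n: math.inf for n in nodes}
    dist[goal] = 0.0
    pq = [(0.0, goal)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue
        for v in nodes:
            if v == u:
                continue
            w = min(eps_E * h_G(u, v), eg_cost(u, v))
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    return dist   # dist[v] is hE(v) for every E-Graph vertex and the goal

def hE(s, dist, h_G, eps_E):
    # an arbitrary state pays eps_E * h_G to reach some E-Graph vertex
    # (or the goal directly), then follows the precomputed table
    return min(eps_E * h_G(s, v) + d for v, d in dist.items())

# toy 1-D example: goal at 10, one E-Graph edge between 0 and 10 of cost 12
h_G = lambda a, b: abs(a - b)
eg_cost = lambda u, v: 12.0 if {u, v} == {0, 10} else math.inf
table = compute_hE_table([0], eg_cost, h_G, goal=10, eps_E=2.0)
print(table[0])   # 12.0: the E-Graph edge (cost 12) beats eps_E * h_G = 20
```

With εE = 1 the same query would return 10.0, i.e., the heuristic ignores the E-Graph edge, matching Figure 3a.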
(a) "E = 1 (b) "E = 2
(c) "E " #
Fig. 3. Shortest ! according to hE as "E changes. The dark solid lines arepaths in GE while the dark dashed lines are the heuristic’s path !. Note as"E increases, the heuristic prefers to travel on GE . The light gray circles andlines show the graph G and the filled in gray circles represent the expandedstates under the guidance of the heuristic.
states, while when on a hG segment a single pair of points suffices since no real edge between two states is needed for hG to be defined.

The larger εE is, the more we avoid exploring G and focus
on traveling on paths in GE. Figure 3 demonstrates how this works. As εE increases, it becomes more expensive to travel off of GE, causing the heuristic to guide the search along parts of GE. In Figure 3a, the heuristic ignores the graph GE because without inflating hG at all, following real edge costs (like those in GE) will never be the cheaper option. In the other parts of Figure 3 we can see that as εE increases, the heuristic uses more and more of GE. This figure also shows how during the search, by following old paths, we can avoid obstacles and have far fewer expansions. The expanded states are shown as filled-in gray circles, which change based on how hE is biased by εE.

The computePath function runs Weighted A* without re-expansions [13, 10]. Weighted A* uses a parameter εw > 1 to inflate the heuristic used by A*. The solution cost is guaranteed to be no worse than εw times the cost of the optimal solution, and in practice it runs dramatically faster than A*. The main modification to Weighted A* is that in addition to using the edges that G already provides (getSuccessors), we add two additional types of successors: shortcuts and snap motions. The only other change is that instead of using the original heuristic, the search is guided by hE.
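The modified search can be sketched as plain Weighted A* with a closed list (no re-expansions), whose successor set is the union of the graph's own edges and the two extra generators. Shortcut and snap generation are stubbed out here, and all names are illustrative:

```python
import heapq

def weighted_astar(start, goal, get_successors, shortcuts, snaps, h, eps_w):
    # Weighted A* without re-expansions: each state is expanded at most once
    g = {start: 0.0}
    parent = {start: None}
    closed = set()
    pq = [(eps_w * h(start), start)]
    while pq:
        _, s = heapq.heappop(pq)
        if s in closed:
            continue              # no re-expansions
        closed.add(s)
        if s == goal:             # reconstruct the path
            path = []
            while s is not None:
                path.append(s)
                s = parent[s]
            return path[::-1]
        # successors = original graph edges plus shortcut and snap motions
        for sp, c in get_successors(s) + shortcuts(s) + snaps(s):
            if sp not in closed and g[s] + c < g.get(sp, float('inf')):
                g[sp] = g[s] + c
                parent[sp] = s
                heapq.heappush(pq, (g[sp] + eps_w * h(sp), sp))
    return None

# tiny usage: three states, no shortcuts or snap motions
succ = {'a': [('b', 1.0), ('c', 5.0)], 'b': [('c', 1.0)], 'c': []}
none = lambda s: []
path = weighted_astar('a', 'c', lambda s: succ[s], none, none,
                      lambda s: 0.0, 2.0)
print(path)   # ['a', 'b', 'c']
```

In the actual planner, h would be the hE of Equation 1 and the shortcut/snap generators would propose E-Graph-specific successors.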
[Figure: expanded states from start to goal for ε = 1.5 vs. ε = ∞]
Planning with Experience Graphs

Theorem 5. Completeness w.r.t. the original graph G: Planning with E-Graphs is guaranteed to find a solution, if one exists in G.

Theorem 6. Bounds on sub-optimality: The cost of the solution found by planning with E-Graphs is guaranteed to be at most εE-suboptimal:

cost(solution) ≤ εE · cost(optimal solution in G)
These properties hold even if the quality or placement of the experiences is arbitrarily bad.
Planning with E-Graphs for Mobile Manipulation • Dual-arm mobile manipulation (10 DoF)
• 3 for base navigation • 1 for spine • 6 for arms (upright object constraint, roll and pitch are 0)
• Experiments in multiple environments
Planning with E-Graphs for Mobile Manipulation Kitchen environment:
- moving objects around a kitchen - bootstrap E-Graph with 10 representative goals - tested on 40 goals in natural kitchen locations
Planning with E-Graphs for Mobile Manipulation
Kitchen environment – Planning times
• Max planning time of 2 minutes
• Sub-optimality bound of 20 (for E-Graphs and Weighted A*)
Method        Success (of 40)   Mean Time (s)   Std. Dev. (s)   Max (s)
E-Graphs      40                0.33            0.25            1.00

Method        Success (of 40)   Mean Speed-up   Std. Dev.       Max
Weighted A*   37                34.62           87.74           506.78
Planning with E-Graphs for Mobile Manipulation
Kitchen environment – Path Quality
Method        Success (of 40)   Object XYZ Path Length Ratio   Std. Dev.
Weighted A*   37                0.91                           0.68
Planning with E-Graphs for Mobile Manipulation
Pros:
• Can lead the search around local minima (such as obstacles)
• Provided paths can have arbitrary quality
• Can reuse many parts of multiple experiences
• User can control the bound on solution quality

Cons:
• Local minima can be created getting on to the E-Graph
• The E-Graph can grow arbitrarily large
Discussion
Conclusions

• Heuristic search within sub-optimality bounds
  - can be much more suitable for high-dimensional planning for manipulation than optimal heuristic search
  - still provides:
    - good cost minimization
    - consistent behavior
    - rigorous theoretical guarantees

• Planning for manipulation solves similar tasks over and over again (unlike planning for navigation, flight, …)
  - great opportunity for combining planning with learning