Sub-optimal Heuristic Search and its Application to Planning for Manipulation
Mike Phillips
Slides adapted from Maxim Likhachev
Planning for Mobile Manipulation
• What planning tasks are there?
– task planning (order of getting items)
– planning for navigation
– planning motions for an arm
– planning coordinated base+arm (full-body) motions
– planning how to grasp
(this class)

robotic bartender demo at the Search-based Planning Lab, April '12
Carnegie Mellon University
Planning for Mobile Manipulation
• Example of planning for a 20-DOF arm in 2D:
– each state is defined by 20 discretized joint angles {q1, q2, …, q20}
– each action changes one angle (or a set of angles) at a time

How do we search such a high-dimensional state space?
Typical Approaches to Motion Planning

• Planning with optimal heuristic search-based approaches (e.g., A*, D*):
+ completeness/optimality guarantees
+ excellent cost minimization
+ consistent solutions to similar motion queries
− slow and memory-intensive
− used mostly for 2D (x, y) path planning

• Planning with sampling-based approaches (e.g., RRT, PRM, …):
+ typically very fast and low-memory
+ used for high-D path/motion planning
− completeness only in the limit
− often poor cost minimization
− hard to provide solution consistency

Sub-optimal heuristic searches can be fast enough for high-D motion planning and return consistent, good-quality solutions.
• Heuristic search? Wouldn't you need to store m^20 states in memory (m = the number of discretized values per joint)?
• No: states are generated on the fly as the search expands states.
Sub-optimal Heuristic Search for High-D Motion Planning
• ARA* (an anytime version of A*) graph search
– effective use of solutions to relaxed (easier) motion planning problems
• Experience Graphs
– heuristic search that learns from its planning experiences
Heuristics in Heuristic Search
• A* search: expands states in the order of f(s) = g(s) + h(s) values
• In A*, when s is expanded, g(s) is optimal
• h(s) is obtained by solving a relaxed (simplified) version of the problem

[figure: a path from start through s to goal; g(s) is the cost from start to s, h(s) the estimated cost from s to goal]

• If h(s) were a perfect estimate (call it h*), what would f(s) represent upon expansion of s?
• Now, what if h(s) isn't perfect?
University of Pennsylvania
• A* search: expands states in the order of f = g + h values

ComputePath function:
while (sgoal is not expanded)
  remove s with the smallest [f(s) = g(s) + h(s)] from OPEN;
  insert s into CLOSED;
  for every successor s' of s such that s' is not in CLOSED
    if g(s') > g(s) + c(s, s')
      g(s') = g(s) + c(s, s');
      insert s' into OPEN;

[figure: expansion of s in a graph running from sstart through s1, s2, … to sgoal]
Here g(s) is the cost of the shortest path from sstart to s found so far, and h(s) is an (under)estimate of the cost of a shortest path from s to sgoal.
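The ComputePath loop above can be sketched directly in Python. This is a minimal illustration, not the lab's planner: the graph is implicit, i.e. `successors(s)` generates neighbor states and edge costs on the fly, which is exactly why a 20D state space never has to be stored in memory. The 5×5 grid and Manhattan heuristic in the usage example are invented for the demo.

```python
import heapq

def a_star(start, goal, successors, h):
    """A*: expand states in order of f(s) = g(s) + h(s).

    `successors(s)` yields (s', cost) pairs on the fly, so only the
    states the search actually touches are ever stored.
    """
    g = {start: 0}                       # cost of best path found so far
    parent = {start: None}
    OPEN = [(h(start), start)]           # priority queue ordered by f
    CLOSED = set()
    while OPEN:
        _, s = heapq.heappop(OPEN)
        if s in CLOSED:
            continue                     # stale duplicate entry
        if s == goal:                    # goal expanded: g(goal) is optimal
            path = []
            while s is not None:
                path.append(s)
                s = parent[s]
            return path[::-1], g[goal]
        CLOSED.add(s)
        for s2, c in successors(s):
            if s2 not in CLOSED and g[s] + c < g.get(s2, float("inf")):
                g[s2] = g[s] + c
                parent[s2] = s
                heapq.heappush(OPEN, (g[s2] + h(s2), s2))
    return None, float("inf")

# demo: 4-connected grid with a wall at x = 2 for y <= 2
def successors(s):
    x, y = s
    for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
        if 0 <= nx < 5 and 0 <= ny < 5 and (nx, ny) not in {(2, 0), (2, 1), (2, 2)}:
            yield (nx, ny), 1

path, cost = a_star((0, 0), (4, 0), successors,
                    h=lambda s: abs(s[0] - 4) + abs(s[1]))  # Manhattan to (4, 0)
```

Since the Manhattan heuristic is consistent here, the first expansion of the goal yields an optimal path (cost 10 around the wall).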
Heuristics in Heuristic Search
• Solutions to a relaxed version of the planning problem:
– 2D (x, y) planning: Euclidean-distance heuristic (2D planning without obstacles and no grid)
– 20D arm planning: 2D end-effector distance heuristic (2D planning for the end-effector with obstacles)
Heuristics in Heuristic Search
• Dijkstra's: expands states in the order of f = g values
• A* search: expands states in the order of f = g + h values
• Weighted A*: expands states in the order of f = g + εh values, ε > 1 (a bias toward states that are closer to the goal)

[figures: the states expanded between sstart and sgoal under each search]
• What are the states expanded under Dijkstra's? Under A*?
• For high-D problems, this results in A* being too slow and running out of memory.
Heuristics in Heuristic Search
• Weighted A* search:
– trades off optimality for speed
– ε-suboptimal: cost(solution) ≤ ε · cost(optimal solution)
– in many domains, it has been shown to be orders of magnitude faster than A*

Example: A* (ε = 1.0): 20 expansions, solution = 10 moves. Weighted A* (ε = 2.5): 13 expansions, solution = 11 moves.
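The ε-suboptimality bound can be checked empirically with a small sketch (the grid problem is invented for the demo, not taken from the slides): the same search is run with ε = 1.0 (plain A*) and ε = 2.5, and the weighted solution cost must not exceed ε times the optimum.

```python
import heapq

def weighted_a_star(start, goal, successors, h, eps):
    """Weighted A*: expand states in order of f(s) = g(s) + eps*h(s).
    Returns (solution cost, number of expansions)."""
    g = {start: 0}
    OPEN = [(eps * h(start), start)]
    CLOSED = set()
    while OPEN:
        _, s = heapq.heappop(OPEN)
        if s in CLOSED:
            continue
        if s == goal:
            return g[goal], len(CLOSED)
        CLOSED.add(s)
        for s2, c in successors(s):
            if s2 not in CLOSED and g[s] + c < g.get(s2, float("inf")):
                g[s2] = g[s] + c
                heapq.heappush(OPEN, (g[s2] + eps * h(s2), s2))
    return float("inf"), len(CLOSED)

# demo: 4-connected grid with a wall at x = 2 for y <= 2
WALL = {(2, 0), (2, 1), (2, 2)}

def successors(s):
    x, y = s
    for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
        if 0 <= nx < 5 and 0 <= ny < 5 and (nx, ny) not in WALL:
            yield (nx, ny), 1

h = lambda s: abs(s[0] - 4) + abs(s[1])          # Manhattan distance to (4, 0)
c_opt, n_opt = weighted_a_star((0, 0), (4, 0), successors, h, eps=1.0)
c_w, n_w = weighted_a_star((0, 0), (4, 0), successors, h, eps=2.5)
# bound: cost(solution) <= eps * cost(optimal solution)
assert c_opt <= c_w <= 2.5 * c_opt
```

Because CLOSED states are never re-expanded, this is weighted A* without re-expansions; the ε bound still holds for a consistent h, and the inflated heuristic typically (though not always) expands fewer states.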
Anytime Search Based on Weighted A*
• Constructing an anytime search based on weighted A*:
– find the best path possible given some amount of time for planning
– do it by running a series of weighted A* searches with decreasing ε:
  ε = 2.5: 13 expansions, solution = 11 moves
  ε = 1.5: 15 expansions, solution = 11 moves
  ε = 1.0: 20 expansions, solution = 10 moves
• This is inefficient because:
– many state values remain the same between search iterations
– we should be able to reuse the results of previous searches
• ARA* is an efficient version of the above that reuses state values between search iterations.
University of Pennsylvania
ARA*
• An efficient series of weighted A* searches with decreasing ε:

ComputePathwithReuse function:
while (f(sgoal) > minimum f-value in OPEN)
  remove s with the smallest [g(s) + εh(s)] from OPEN;
  insert s into CLOSED;
  for every successor s' of s
    if g(s') > g(s) + c(s, s')
      g(s') = g(s) + c(s, s');
      if s' not in CLOSED then insert s' into OPEN;
      otherwise insert s' into INCONS;

Main loop:
set ε to a large value; g(sstart) = 0; OPEN = {sstart};
while ε ≥ 1
  CLOSED = {}; INCONS = {};
  ComputePathwithReuse();
  publish current ε-suboptimal solution;
  decrease ε;
  initialize OPEN = OPEN ∪ INCONS;
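The two pseudocode fragments above (ComputePathwithReuse plus the outer ε-decreasing loop) can be sketched as follows. This is a simplified illustration of ARA*'s reuse idea, assuming a consistent heuristic; the grid problem at the bottom is invented for the demo.

```python
import heapq

INF = float("inf")

def improve_path(goal, successors, h, eps, g, OPEN, CLOSED, INCONS):
    # ComputePathwithReuse: a weighted-A* pass that keeps g-values from
    # earlier iterations.  States whose g improves AFTER they were
    # expanded (CLOSED) are not re-expanded now; they go into INCONS
    # and seed OPEN for the next, smaller-eps iteration.
    heap = [(g[s] + eps * h(s), s) for s in OPEN]
    heapq.heapify(heap)
    while heap and g.get(goal, INF) + eps * h(goal) > heap[0][0]:
        f, s = heapq.heappop(heap)
        if s in CLOSED or f > g[s] + eps * h(s):
            continue                        # already expanded / stale entry
        OPEN.discard(s)
        CLOSED.add(s)
        for s2, c in successors(s):
            if g[s] + c < g.get(s2, INF):
                g[s2] = g[s] + c
                if s2 in CLOSED:
                    INCONS.add(s2)          # locally inconsistent: defer
                else:
                    OPEN.add(s2)
                    heapq.heappush(heap, (g[s2] + eps * h(s2), s2))

def ara_star(start, goal, successors, h, eps0=2.5, step=0.5):
    g = {start: 0}
    OPEN = {start}
    eps, solutions = eps0, []
    while eps >= 1.0:
        CLOSED, INCONS = set(), set()
        improve_path(goal, successors, h, eps, g, OPEN, CLOSED, INCONS)
        solutions.append((eps, g.get(goal, INF)))  # publish eps-suboptimal cost
        OPEN |= INCONS                             # OPEN = OPEN U INCONS
        eps -= step                                # decrease eps
    return solutions

# demo: 4-connected grid with a wall at x = 2 for y <= 2
WALL = {(2, 0), (2, 1), (2, 2)}

def successors(s):
    x, y = s
    for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
        if 0 <= nx < 5 and 0 <= ny < 5 and (nx, ny) not in WALL:
            yield (nx, ny), 1

sols = ara_star((0, 0), (4, 0), successors,
                h=lambda s: abs(s[0] - 4) + abs(s[1]))
```

Each published solution is within its ε of optimal, and because g-values only decrease across iterations, later (smaller-ε) iterations typically terminate almost immediately, mirroring the "1 expansion / 9 expansions" behavior on the next slide.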
• A series of weighted A* searches:
  ε = 2.5: 13 expansions, solution = 11 moves
  ε = 1.5: 15 expansions, solution = 11 moves
  ε = 1.0: 20 expansions, solution = 10 moves
• ARA*:
  ε = 2.5: 13 expansions, solution = 11 moves
  ε = 1.5: 1 expansion, solution = 11 moves
  ε = 1.0: 9 expansions, solution = 10 moves
ARA*
• When using ARA*, the research lies in finding a graph representation G and a corresponding heuristic function h that lead to shallow local minima for the search.
• Key to finding a solution fast: shallow minima in the h(s) − h*(s) function.
ARA*-based Planning for Manipulation
• ARA*-based motion planning for a single 7-DOF arm [Cohen et al., ICRA'10, ICRA'11]
– the goal is given as a 6D (x, y, z, roll, pitch, yaw) pose of the end-effector
– each state is: 4 non-wrist joint angles {q1, …, q4} when far from the goal; all 7 joint angles {q1, …, q7} when close to the goal
– actions are: {moving one joint at a time by a fixed Δ; an IK-based motion to snap to the goal when close to it}
– heuristic: 3D BFS distances for the end-effector

[figures: expansions (shown as dots corresponding to end-effector poses) under the 3D BFS heuristic vs. under the Euclidean-distance heuristic (heuristic path shown as a curve)]
• Shown on the PR2 robot, but the planner has been run on other arms (e.g., KUKA).
ARA*-based Planning for Manipulation
• ARA*-based motion planning for dual-arm manipulation [Cohen et al., ICRA'12]
– the goal is given as a 4D (x, y, z, yaw) pose for the object
– each state is 6D: the 4D (x, y, z, yaw) pose of the object plus the elbow angles {q1, q2}
– actions are: {changing the 4D pose of the object, or changing q1, q2}
– heuristic: 3D BFS distances for the object

[videos: motions as generated by the planner, no smoothing; note the consistency of motions across similar start-goal trials]
ARA*-based Planning for Manipulation: Discussion

Pros:
• explicit cost minimization
• consistent plans (similar input generates similar output)
• can handle arbitrary goal sets (such as a set of grasps from a grasp planner)
• many path constraints are easy to implement (e.g., keeping the gripper upright)

Cons:
• requires good graph and heuristic design to plan fast
• the number of motions available from a state affects both speed and path quality
Sub-optimal Heuristic Search for High-D Motion Planning
• Experience Graphs
– heuristic search that learns from its planning experiences
Planning with Experience Graphs
• Many planning tasks are repetitive:
– loading a dishwasher
– opening doors
– moving objects around a warehouse
– …
• Can we re-use prior experience to accelerate planning, in the context of search-based planning?
• This would be especially useful for high-dimensional problems such as mobile manipulation!
Planning with Experience Graphs
• Given a set of previously computed paths (experiences), put them together into an E-graph (Experience Graph).
• E-graph [Phillips et al., RSS'12]:
– a collection of previously computed paths or demonstrations
– a sub-graph of the original graph
• Given a new planning query, re-use the E-graph: for repetitive tasks, planning becomes much faster.
Theorem 1: The algorithm is complete with respect to the original graph.
Theorem 2: The cost of the solution is within a given bound on sub-optimality.
Planning with Experience Graphs
• Re-use the E-graph by introducing a new heuristic function.
• The heuristic guides the search toward expanding states on the E-graph, within the sub-optimality bound ε.
Planning with Experience Graphs
• Focusing the search toward the E-graph within sub-optimality bound ε.

Paper excerpt [Phillips et al., RSS'12], as shown on the slide:

A consistent heuristic is one that satisfies the triangle inequality: hG(s, sgoal) ≤ c(s, s') + hG(s', sgoal).

C. Algorithm Detail
The planner maintains two graphs, G and GE. At the high level, every time the planner receives a new planning request the findPath function is called. It first updates GE to account for edge cost changes and performs some precomputations. Then it calls the computePath function, which produces a path π. This path is then added to GE. The updateEGraph function works by updating any edge costs in GE that have changed. If any edges are invalid (e.g., they are now blocked by obstacles) they are put into a disabled list. Conversely, if an edge in the disabled list now has finite cost it is re-enabled. At this point, the graph GE should only contain finite edges. A precomputeShortcuts function is then called, which can be used to compute shortcut edges before the search begins. Ways to compute shortcuts are discussed in Section V. Finally, our heuristic hE, which encourages path reuse, is computed.

findPath(sstart, sgoal)
1: updateEGraph(sgoal)
2: π = computePath(sstart, sgoal)
3: GE = GE ∪ π

updateEGraph(sgoal)
1: updateChangedCosts()
2: disable edges that are now invalid
3: re-enable disabled edges that are now valid
4: precomputeShortcuts()
5: compute heuristic hE according to Equation 1

Our algorithm's speed-up comes from being able to reuse parts of old paths and avoid searching large portions of graph G. To accomplish this we introduce a heuristic which intelligently guides the search toward GE when it looks like following parts of old paths will help the search get close to the goal. We define a new heuristic hE in terms of the given heuristic hG and edges in GE:

hE(s0) = min_π Σ_{i=0}^{N−2} min{ εE · hG(si, si+1), cE(si, si+1) }    (1)

where π is a path ⟨s0 … sN−1⟩ with sN−1 = sgoal, and εE is a scalar ≥ 1.

Equation 1 is a minimization over a sequence of segments, such that each segment is a pair of arbitrary states sa, sb ∈ G and the cost of a segment is the minimum of two things: either εE·hG(sa, sb), the original heuristic inflated by εE, or the cost of an actual least-cost path in GE, provided that sa, sb ∈ GE.

In Figure 2, the path π that minimizes the heuristic contains a sequence of alternating segments. In reality, π can alternate between hG and GE segments as many or as few times as needed to produce the minimal π. On a GE segment we can have many states si connecting two states, while on an hG segment a single pair of points suffices, since no real edge between two states is needed for hG to be defined.

Fig. 2. A visualization of Equation 1. Solid lines are composed of edges from GE, while dashed lines are distances according to hG. Note that hG segments are always only two points long, while GE segments can contain an arbitrary number of points.

Fig. 3. Shortest π according to hE as εE changes: (a) εE = 1, (b) εE = 2, (c) εE → ∞. The dark solid lines are paths in GE, while the dark dashed lines are the heuristic's path π. Note that as εE increases, the heuristic prefers to travel on GE. The light gray circles and lines show the graph G, and the filled-in gray circles represent the states expanded under the guidance of the heuristic.

The larger εE is, the more we avoid exploring G and focus on traveling on paths in GE. Figure 3 demonstrates how this works. As εE increases, it becomes more expensive to travel off of GE, causing the heuristic to guide the search along parts of GE. In Figure 3a, the heuristic ignores the graph GE because, without inflating hG at all, following real edge costs (like those in GE) will never be the cheaper option. In the other parts of Figure 3 we can see that as εE increases, the heuristic uses more and more of GE. The figure also shows how, by following old paths, the search can avoid obstacles and make far fewer expansions; the expanded states (filled-in gray circles) change based on how hE is biased by εE.

The computePath function runs weighted A* without re-expansions [13, 10]. Weighted A* uses a parameter εw > 1 to inflate the heuristic used by A*. The solution cost is guaranteed to be no worse than εw times the cost of the optimal solution, and in practice it runs dramatically faster than A*. The main modification to weighted A* is that, in addition to using the edges that G already provides (getSuccessors), we add two additional types of successors: shortcuts and snap motions. The only other change is that instead of using the …

In short:
• Traveling on the E-graph uses actual costs.
• Traveling off the E-graph uses an inflated original heuristic.
• The heuristic computation finds a min-cost path using these two kinds of "edges".
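Equation 1 above can be evaluated without enumerating paths π: since consecutive hG "jumps" collapse when hG is consistent, it suffices to run a Dijkstra from sgoal over the E-graph vertices, where any pair of vertices is connected at the inflated-heuristic cost εE·hG and E-graph edges keep their real cost cE. The sketch below is a simplified reading of that computation, not the paper's implementation; the 1D example states, costs, and the helper name `egraph_heuristic` are invented.

```python
import heapq

def egraph_heuristic(goal, egraph_nodes, egraph_edges, hG, epsE):
    """Precompute dist(v) = min cost from each E-graph vertex v to goal
    under Equation 1: travel between any two vertices costs epsE*hG
    (an inflated-heuristic 'jump'), while E-graph edges cost their
    real cost cE.  Returns hE as a function of the query state."""
    nodes = set(egraph_nodes) | {goal}
    dist = {v: epsE * hG(v, goal) for v in nodes}   # direct jump to goal
    heap = [(d, v) for v, d in dist.items()]
    heapq.heapify(heap)
    while heap:
        d, v = heapq.heappop(heap)
        if d > dist[v]:
            continue                                # stale entry
        for u in nodes:
            cand = d + epsE * hG(v, u)              # jump edge
            if (v, u) in egraph_edges:              # real E-graph edge
                cand = min(cand, d + egraph_edges[(v, u)])
            if cand < dist[u]:
                dist[u] = cand
                heapq.heappush(heap, (cand, u))
    # hE(s): jump straight toward the goal, or jump onto the E-graph at
    # some vertex v and follow the precomputed dist(v) from there
    return lambda s: min(epsE * hG(s, v) + dist[v] for v in nodes)

# toy 1D example (invented): states are points on a line, hG = |a - b|,
# and the E-graph contains one previously traveled edge 0 <-> 10
hG = lambda a, b: abs(a - b)
hE = egraph_heuristic(goal=10, egraph_nodes=[0, 10],
                      egraph_edges={(0, 10): 10, (10, 0): 10},
                      hG=hG, epsE=2.0)
# on the E-graph the real cost (10) beats the inflated jump (2.0 * 10 = 20),
# so hE pulls the search toward the experienced edge
```

With εE = 1 the jump term always wins and hE degenerates to hG, matching Figure 3a; as εE grows, traveling off the E-graph becomes relatively more expensive, matching Figures 3b-3c.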
Maxim Likhachev
Planning with Experience Graphs • Focusing search towards E-graph within sub-optimality bound ε
hG
cε
consistent heuristic is one that satisfies the triangle inequality,hG(s, sgoal) ! c(s, s!) + hG(s!, sgoal).
C. Algorithm DetailThe planner maintains two graphs, G and GE . At the high-
level, every time the planner receives a new planning requestthe findPath function is called. It first updates GE to accountfor edge cost changes and perform some precomputations.Then it calls the computePath function, which produces apath !. This path is then added to GE . The updateEGraphfunction works by updating any edge costs in GE that havechanged. If any edges are invalid (e.g. they are now blockedby obstacles) they are put into a disabled list. Conversely, ifan edge in the disabled list now has finite cost it is re-enabled.At this point, the graph GE should only contain finite edges.A precomputeShortcuts function is then called which can beused to compute shortcut edges before the search begins. Waysto compute shortcuts are discussed in Section V. Finally, ourheuristic hE , which encourages path reuse, is computed.
findPath(sstart, sgoal)1: updateEGraph(sgoal)2: ! = computePath(sstart, sgoal)
3: GE = GE ! !
updateEGraph(sgoal)1: updateChangedCosts()2: disable edges that are now invalid3: re-enable disabled edges that are now valid4: precomputeShortcuts()
5: compute heuristic hE according to Equation 1
Our algorithm’s speed-up comes from being able to reuseparts of old paths and avoid searching large portions ofgraph G. To accomplish this we introduce a heuristic whichintelligently guides the search toward GE when it looks likefollowing parts of old paths will help the search get close tothe goal. We define a new heuristic hE in terms of the givenheuristic hG and edges in GE .
hE(s0) = min!
N"1!
i=0
min{"EhG(si, si+1), cE(si, si+1)} (1)
where ! is a path "s0 . . . sN"1# and sN"1 = sgoal and "E isa scalar $ 1.Equation 1 is a minimization over a sequence of segments
such that each segment is a pair of arbitrary states sa, sb % Gand the cost of this segment is given by the minimum of twothings: either "EhG(sa, sb), the original heuristic inflated by"E , or the cost of an actual least-cost path in GE , providedthat sa, sb % GE .In Figure 2, the path ! that minimizes the heuristic contains
a sequence of alternating segments. In reality ! can alternatebetween hG and GE segments as many or as few times asneeded to produce the minimal !. When there is a GE segmentwe can have many states si on the segment to connect two
Fig. 2. A visualization of Equation 1. Solid lines are composed of edgesfrom GE , while dashed lines are distances according to hG. Note that hG
segments are always only two points long, while GE segments can be anarbitrary number of points.
(a) "E = 1 (b) "E = 2
(c) "E " #
Fig. 3. Shortest ! according to hE as "E changes. The dark solid lines arepaths in GE while the dark dashed lines are the heuristic’s path !. Note as"E increases, the heuristic prefers to travel on GE . The light gray circles andlines show the graph G and the filled in gray circles represent the expandedstates under the guidance of the heuristic.
states, while when on a hG segment a single pair of pointssuffices since no real edge between two states is needed forhG to be defined.The larger "E is, the more we avoid exploring G and focus
on traveling on paths in GE . Figure 3 demonstrates how thisworks. As GE increases, it becomes more expensive to traveloff of GE causing the heuristic to guide the search along partsofGE . In Figure 3a, the heuristic ignores the graphGE becausewithout inflating hG at all, following real edges costs (likethose in GE ) will never be the cheaper option. In the otherparts of Figure 3 we can see as "E increase, the heuristic usesmore and more of GE . This figure also shows how during thesearch, by following old paths, we can avoid obstacles andhave far fewer expansions. The expanded states are shown asfilled in gray circles, which change based on how the hE isbiased by "E .The computePath function runs weighted A* without re-
expansions [13, 10]. Weighted A* uses a parameter "w >1 to inflate the heuristic used by A*. The solution cost isguaranteed to be no worse than "w times the cost of the optimalsolution and in practice it runs dramatically faster than A*.The main modification to Weighted A*, is that in addition tousing the edges that G already provides (getSuccessors), weadd two additional types of successors: shortcuts and snapmotions. The only other change is that instead of using the
Traveling on E-Graph uses actual costs
Traveling off the E-Graph uses an inflated original heuristic
Heuristic computation finds a min cost path using two kinds of “edges”
49 Carnegie Mellon University
goal
start
Maxim Likhachev
Planning with Experience Graphs • Focusing search towards E-graph within sub-optimality bound ε
goal
start
hG
cε
consistent heuristic is one that satisfies the triangle inequality,hG(s, sgoal) ! c(s, s!) + hG(s!, sgoal).
C. Algorithm DetailThe planner maintains two graphs, G and GE . At the high-
level, every time the planner receives a new planning requestthe findPath function is called. It first updates GE to accountfor edge cost changes and perform some precomputations.Then it calls the computePath function, which produces apath !. This path is then added to GE . The updateEGraphfunction works by updating any edge costs in GE that havechanged. If any edges are invalid (e.g. they are now blockedby obstacles) they are put into a disabled list. Conversely, ifan edge in the disabled list now has finite cost it is re-enabled.At this point, the graph GE should only contain finite edges.A precomputeShortcuts function is then called which can beused to compute shortcut edges before the search begins. Waysto compute shortcuts are discussed in Section V. Finally, ourheuristic hE , which encourages path reuse, is computed.
findPath(sstart, sgoal)1: updateEGraph(sgoal)2: ! = computePath(sstart, sgoal)
3: GE = GE ! !
updateEGraph(sgoal)1: updateChangedCosts()2: disable edges that are now invalid3: re-enable disabled edges that are now valid4: precomputeShortcuts()
5: compute heuristic hE according to Equation 1
Our algorithm’s speed-up comes from being able to reuseparts of old paths and avoid searching large portions ofgraph G. To accomplish this we introduce a heuristic whichintelligently guides the search toward GE when it looks likefollowing parts of old paths will help the search get close tothe goal. We define a new heuristic hE in terms of the givenheuristic hG and edges in GE .
hE(s0) = min!
N"1!
i=0
min{"EhG(si, si+1), cE(si, si+1)} (1)
where ! is a path "s0 . . . sN"1# and sN"1 = sgoal and "E isa scalar $ 1.Equation 1 is a minimization over a sequence of segments
such that each segment is a pair of arbitrary states sa, sb % Gand the cost of this segment is given by the minimum of twothings: either "EhG(sa, sb), the original heuristic inflated by"E , or the cost of an actual least-cost path in GE , providedthat sa, sb % GE .In Figure 2, the path ! that minimizes the heuristic contains
a sequence of alternating segments. In reality ! can alternatebetween hG and GE segments as many or as few times asneeded to produce the minimal !. When there is a GE segmentwe can have many states si on the segment to connect two
Fig. 2. A visualization of Equation 1. Solid lines are composed of edgesfrom GE , while dashed lines are distances according to hG. Note that hG
segments are always only two points long, while GE segments can be anarbitrary number of points.
(a) "E = 1 (b) "E = 2
(c) "E " #
Fig. 3. Shortest ! according to hE as "E changes. The dark solid lines arepaths in GE while the dark dashed lines are the heuristic’s path !. Note as"E increases, the heuristic prefers to travel on GE . The light gray circles andlines show the graph G and the filled in gray circles represent the expandedstates under the guidance of the heuristic.
states, while when on a hG segment a single pair of pointssuffices since no real edge between two states is needed forhG to be defined.The larger "E is, the more we avoid exploring G and focus
on traveling on paths in GE . Figure 3 demonstrates how thisworks. As GE increases, it becomes more expensive to traveloff of GE causing the heuristic to guide the search along partsofGE . In Figure 3a, the heuristic ignores the graphGE becausewithout inflating hG at all, following real edges costs (likethose in GE ) will never be the cheaper option. In the otherparts of Figure 3 we can see as "E increase, the heuristic usesmore and more of GE . This figure also shows how during thesearch, by following old paths, we can avoid obstacles andhave far fewer expansions. The expanded states are shown asfilled in gray circles, which change based on how the hE isbiased by "E .The computePath function runs weighted A* without re-
expansions [13, 10]. Weighted A* uses a parameter "w >1 to inflate the heuristic used by A*. The solution cost isguaranteed to be no worse than "w times the cost of the optimalsolution and in practice it runs dramatically faster than A*.The main modification to Weighted A*, is that in addition tousing the edges that G already provides (getSuccessors), weadd two additional types of successors: shortcuts and snapmotions. The only other change is that instead of using the
Traveling on E-Graph uses actual costs
Traveling off the E-Graph uses an inflated original heuristic
Heuristic computation finds a min cost path using two kinds of “edges”
50 Carnegie Mellon University
Maxim Likhachev
Planning with Experience Graphs • Focusing search towards E-graph within sub-optimality bound ε
consistent heuristic is one that satisfies the triangle inequality,hG(s, sgoal) ! c(s, s!) + hG(s!, sgoal).
C. Algorithm DetailThe planner maintains two graphs, G and GE . At the high-
level, every time the planner receives a new planning requestthe findPath function is called. It first updates GE to accountfor edge cost changes and perform some precomputations.Then it calls the computePath function, which produces apath !. This path is then added to GE . The updateEGraphfunction works by updating any edge costs in GE that havechanged. If any edges are invalid (e.g. they are now blockedby obstacles) they are put into a disabled list. Conversely, ifan edge in the disabled list now has finite cost it is re-enabled.At this point, the graph GE should only contain finite edges.A precomputeShortcuts function is then called which can beused to compute shortcut edges before the search begins. Waysto compute shortcuts are discussed in Section V. Finally, ourheuristic hE , which encourages path reuse, is computed.
findPath(sstart, sgoal)
1: updateEGraph(sgoal)
2: π = computePath(sstart, sgoal)
3: GE = GE ∪ π

updateEGraph(sgoal)
1: updateChangedCosts()
2: disable edges that are now invalid
3: re-enable disabled edges that are now valid
4: precomputeShortcuts()
5: compute heuristic hE according to Equation 1
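The edge bookkeeping in updateEGraph (moving now-invalid edges to a disabled list and re-enabling them when they become valid again) can be sketched as below; the class and method names are illustrative, not the authors' implementation:

```python
import math

class EGraph:
    def __init__(self):
        self.edges = {}      # (u, v) -> finite edge cost
        self.disabled = {}   # (u, v) -> last known finite cost

    def update_changed_costs(self, changed):
        # changed: {(u, v): new_cost}; math.inf marks a now-blocked edge
        for e, c in changed.items():
            if math.isinf(c):
                # edge is now invalid: park it in the disabled list
                old = self.edges.pop(e, None)
                if old is not None:
                    self.disabled[e] = old
            else:
                # edge is valid: re-enable it if it was disabled
                self.disabled.pop(e, None)
                self.edges[e] = c

g = EGraph()
g.update_changed_costs({('a', 'b'): 1.0, ('b', 'c'): 2.0})
g.update_changed_costs({('a', 'b'): math.inf})   # blocked by an obstacle
g.update_changed_costs({('a', 'b'): 1.5})        # valid again
print(sorted(g.edges))   # [('a', 'b'), ('b', 'c')]
```

After these updates the graph again contains only finite-cost edges, as the text requires.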
Our algorithm's speed-up comes from being able to reuse parts of old paths and avoid searching large portions of graph G. To accomplish this we introduce a heuristic which intelligently guides the search toward GE when it looks like following parts of old paths will help the search get close to the goal. We define a new heuristic hE in terms of the given heuristic hG and edges in GE.
hE(s0) = minπ Σi=0..N−1 min{ εE hG(si, si+1), cE(si, si+1) }    (1)
where π is a path ⟨s0 . . . sN−1⟩, sN−1 = sgoal, and εE is a scalar ≥ 1.

Equation 1 is a minimization over a sequence of segments such that each segment is a pair of arbitrary states sa, sb ∈ G and the cost of this segment is given by the minimum of two things: either εE hG(sa, sb), the original heuristic inflated by εE, or the cost of an actual least-cost path in GE, provided that sa, sb ∈ GE.

In Figure 2, the path π that minimizes the heuristic contains a sequence of alternating segments. In reality π can alternate between hG and GE segments as many or as few times as needed to produce the minimal π. When there is a GE segment we can have many states si on the segment to connect two
Fig. 2. A visualization of Equation 1. Solid lines are composed of edges from GE, while dashed lines are distances according to hG. Note that hG segments are always only two points long, while GE segments can be an arbitrary number of points.
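One way to evaluate Equation 1 is a single Dijkstra search from sgoal over the E-Graph vertices, where the cost of moving between any two vertices is min{εE·hG, cE}; hE of an arbitrary state is then a final minimization over the resulting table. This is only a sketch under assumed data structures (node lists, cost callables), not the paper's implementation, and it assumes hG itself satisfies the triangle inequality so that intermediate off-E-Graph states never help:

```python
import heapq
import math

def compute_hE_table(eg_nodes, eg_cost, h_G, goal, eps_E):
    # Dijkstra from the goal over E-Graph vertices; the cost between any
    # two vertices u, v is min(eps_E * h_G, c_E) as in Equation 1.
    # Dense O(|V|^2) relaxation -- fine for a sketch.
    nodes = list(eg_nodes) + [goal]
    dist = {n: math.inf for n in nodes}
    dist[goal] = 0.0
    pq = [(0.0, goal)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue
        for v in nodes:
            if v == u:
                continue
            w = min(eps_E * h_G(u, v), eg_cost(u, v))
            if d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(pq, (d + w, v))
    return dist   # dist[v] is hE(v) for every E-Graph vertex and the goal

def hE(s, dist, h_G, eps_E):
    # an arbitrary state pays eps_E * h_G to reach some E-Graph vertex
    # (or the goal directly), then follows the precomputed table
    return min(eps_E * h_G(s, v) + d for v, d in dist.items())

# toy 1-D example: goal at 10, one E-Graph edge between 0 and 10 of cost 12
h_G = lambda a, b: abs(a - b)
eg_cost = lambda u, v: 12.0 if {u, v} == {0, 10} else math.inf
table = compute_hE_table([0], eg_cost, h_G, goal=10, eps_E=2.0)
print(table[0])   # 12.0: the E-Graph edge (cost 12) beats eps_E * h_G = 20
```

With εE = 1 the same query would return 10.0, i.e., the heuristic ignores the E-Graph edge, matching Figure 3a.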
(a) "E = 1 (b) "E = 2
(c) "E " #
Fig. 3. Shortest ! according to hE as "E changes. The dark solid lines arepaths in GE while the dark dashed lines are the heuristic’s path !. Note as"E increases, the heuristic prefers to travel on GE . The light gray circles andlines show the graph G and the filled in gray circles represent the expandedstates under the guidance of the heuristic.
states, while when on a hG segment a single pair of points suffices since no real edge between two states is needed for hG to be defined.

The larger εE is, the more we avoid exploring G and focus
on traveling on paths in GE. Figure 3 demonstrates how this works. As εE increases, it becomes more expensive to travel off of GE, causing the heuristic to guide the search along parts of GE. In Figure 3a, the heuristic ignores the graph GE because without inflating hG at all, following real edge costs (like those in GE) will never be the cheaper option. In the other parts of Figure 3 we can see that as εE increases, the heuristic uses more and more of GE. This figure also shows how during the search, by following old paths, we can avoid obstacles and have far fewer expansions. The expanded states are shown as filled-in gray circles, which change based on how hE is biased by εE.

The computePath function runs Weighted A* without re-expansions [13, 10]. Weighted A* uses a parameter εw > 1 to inflate the heuristic used by A*. The solution cost is guaranteed to be no worse than εw times the cost of the optimal solution, and in practice it runs dramatically faster than A*. The main modification to Weighted A* is that in addition to using the edges that G already provides (getSuccessors), we add two additional types of successors: shortcuts and snap motions. The only other change is that instead of using the original heuristic, the search is guided by hE.
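The modified search can be sketched as plain Weighted A* with a closed list (no re-expansions), whose successor set is the union of the graph's own edges and the two extra generators. Shortcut and snap generation are stubbed out here, and all names are illustrative:

```python
import heapq

def weighted_astar(start, goal, get_successors, shortcuts, snaps, h, eps_w):
    # Weighted A* without re-expansions: each state is expanded at most once
    g = {start: 0.0}
    parent = {start: None}
    closed = set()
    pq = [(eps_w * h(start), start)]
    while pq:
        _, s = heapq.heappop(pq)
        if s in closed:
            continue              # no re-expansions
        closed.add(s)
        if s == goal:             # reconstruct the path
            path = []
            while s is not None:
                path.append(s)
                s = parent[s]
            return path[::-1]
        # successors = original graph edges plus shortcut and snap motions
        for sp, c in get_successors(s) + shortcuts(s) + snaps(s):
            if sp not in closed and g[s] + c < g.get(sp, float('inf')):
                g[sp] = g[s] + c
                parent[sp] = s
                heapq.heappush(pq, (g[sp] + eps_w * h(sp), sp))
    return None

# tiny usage: three states, no shortcuts or snap motions
succ = {'a': [('b', 1.0), ('c', 5.0)], 'b': [('c', 1.0)], 'c': []}
none = lambda s: []
path = weighted_astar('a', 'c', lambda s: succ[s], none, none,
                      lambda s: 0.0, 2.0)
print(path)   # ['a', 'b', 'c']
```

In the actual planner, h would be the hE of Equation 1 and the shortcut/snap generators would propose E-Graph-specific successors.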
[Figure: expanded states from start to goal for ε = 1.5 vs. ε = ∞]
Planning with Experience Graphs

Theorem 5. Completeness w.r.t. the original graph G: Planning with E-Graphs is guaranteed to find a solution, if one exists in G.

Theorem 6. Bounds on sub-optimality: The cost of the solution found by planning with E-Graphs is guaranteed to be at most εE-suboptimal:

cost(solution) ≤ εE · cost(optimal solution in G)
These properties hold even if the quality or placement of the experiences is arbitrarily bad.
Planning with E-Graphs for Mobile Manipulation • Dual-arm mobile manipulation (10 DoF)
• 3 for base navigation • 1 for spine • 6 for arms (upright object constraint, roll and pitch are 0)
• Experiments in multiple environments
Planning with E-Graphs for Mobile Manipulation Kitchen environment:
- moving objects around a kitchen - bootstrap E-Graph with 10 representative goals - tested on 40 goals in natural kitchen locations
Planning with E-Graphs for Mobile Manipulation
Kitchen environment – Planning times
• Max planning time of 2 minutes
• Sub-optimality bound of 20 (for E-Graphs and Weighted A*)
Method        Success (of 40)   Mean Time (s)   Std. Dev. (s)   Max (s)
E-Graphs      40                0.33            0.25            1.00

Method        Success (of 40)   Mean Speed-up   Std. Dev.       Max
Weighted A*   37                34.62           87.74           506.78
Planning with E-Graphs for Mobile Manipulation
Kitchen environment – Path Quality
Method        Success (of 40)   Object XYZ Path Length Ratio   Std. Dev.
Weighted A*   37                0.91                           0.68
Planning with E-Graphs for Mobile Manipulation
Pros:
• Can lead the search around local minima (such as obstacles)
• Provided paths can have arbitrary quality
• Can reuse many parts of multiple experiences
• User can control the bound on solution quality

Cons:
• Local minima can be created getting on to the E-Graph
• The E-Graph can grow arbitrarily large
Discussion
Conclusions

• Heuristic search within sub-optimality bounds
  - can be much more suitable for high-dimensional planning for manipulation than optimal heuristic search
  - still provides:
    - good cost minimization
    - consistent behavior
    - rigorous theoretical guarantees

• Planning for manipulation solves similar tasks over and over again (unlike planning for navigation, flight, …)
  - great opportunity for combining planning with learning