Hplan-P planning system


HPLAN-P

A Heuristic Search Planner for Planning with Temporally Extended Preferences

Luca Ceriani



HPLAN-P

• Heuristic planning with TEGs, SPs and TEPs

– Incremental search algorithm

• Extended version of TLPLAN

– TLPLAN DDL

– PDDL2TLPlan translator

• Awarded distinguished performance in the qualitative preference track (IPC5)



Heuristic Planning with Preferences

• Distinguishing between successful plans of different quality

– Qualitative vs. quantitative

• Actively guide the search towards the achievement of preferences

– Heuristics for planning with preferences



Outline

• PDDL3 problem/domain

– Planning problem with TEGs and TEPs

• Preprocessing Phase

• Adapting existing heuristic search techniques to achieve SPs and solve the compiled problem

• HPLAN-P algorithm

– Exploiting the adapted heuristics to incrementally find better plans



PDDL3 Overview

• TEGs/TEPs

• Simple Preferences (SPs)

• Precondition Preferences (PPs)

• Metric Function (M)



TEGs and TEPs

• Temporal constraints



Simple Preferences (SPs)

• Atemporal conditions over the final state of a plan



Precondition Preferences (PPs)




Metric Function

• Defines the (numeric) quality of a plan over:

– Preference violation weight/count

– Preference internal/external quantification



Preprocessing PDDL3

• Simpler planning problem containing only SPs

– Augmented planning domain/problem

• New metric function M

– refers to SPs



Preprocessing PPs

• (preference p φ)

– is-violated-p counter

– Initialized to 0

– (when (not φ) (increase (is-violated-p) 1))

• In the context of a single action
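A minimal sketch of this compilation, assuming states are sets of ground facts and each precondition preference is a Boolean test on the state in which the action is applied. The names `apply_with_pps`, `full-tank` and the toy facts are hypothetical and serve only as illustration.

```python
from typing import Callable, Dict, FrozenSet, List, Tuple

State = FrozenSet[str]
Condition = Callable[[State], bool]


def apply_with_pps(
    state: State,
    add: FrozenSet[str],
    delete: FrozenSet[str],
    pps: List[Tuple[str, Condition]],
    violated: Dict[str, int],
) -> Tuple[State, Dict[str, int]]:
    """Apply one action whose precondition preferences were compiled away.

    Each pair (p, phi) in `pps` plays the role of the conditional effect
    (when (not phi) (increase (is-violated-p) 1)).
    """
    counters = dict(violated)  # copy, so the caller's counters stay untouched
    for name, phi in pps:
        if not phi(state):  # phi is checked where the action is applied
            counters[name] = counters.get(name, 0) + 1
    return (state - delete) | add, counters


# Hypothetical action sequence: two moves, each preferring a full tank.
s0: State = frozenset({"at-depot"})
pps = [("full-tank", lambda s: "tank-full" in s)]
s1, c1 = apply_with_pps(s0, frozenset({"at-shop"}), frozenset({"at-depot"}), pps, {})
s2, c2 = apply_with_pps(s1, frozenset({"at-home"}), frozenset({"at-shop"}), pps, c1)
```

The counter grows every time the action is applied with φ false, which is what lets the metric charge for each violation separately.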



Preprocessing TEGs/TEPs

• TEGs and TEPs are reduced to SGs and SPs

– SPs are optional goal conditions

• For every TEG or TEP φ:

– A new domain predicate Pφ

– Pφ is true ⇔ φ is satisfied by the plan



Preprocessing TEGs/TEPs: Steps

• TEG/TEP φ ⇒ f-FOLTL formula fφ

– LTL over finite plans: some goals are not achievable

• fφ ⇔ automaton Aφ

– Not a Büchi automaton (BA)

– Transitions labeled with first-order (PDDL) predicates

– Aφ states monitor the satisfaction of φ

• Aφ ⇒ planning domain

– Only valid/preferred plans simulate the automata

– Acceptance predicate ⇔ accepting state
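As an illustration of how automaton states can monitor a formula over a finite plan, here is a small sketch for two PDDL3 operators, `sometime` and `always`. It assumes a propositional condition φ and deterministic automata; the state names (`wait`, `acc`, `ok`, `dead`) and the helper `accepts` are hypothetical, and the actual construction works with first-order, parameterized automata.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Set, Tuple

State = FrozenSet[str]
# (automaton state, "phi holds in the current world state") -> next automaton state
Delta = Dict[Tuple[str, bool], str]

# (sometime phi): wait until phi is seen once, then stay accepting.
SOMETIME: Delta = {
    ("wait", False): "wait", ("wait", True): "acc",
    ("acc", False): "acc", ("acc", True): "acc",
}

# (always phi): accepting until phi fails once, then dead forever.
ALWAYS: Delta = {
    ("ok", True): "ok", ("ok", False): "dead",
    ("dead", True): "dead", ("dead", False): "dead",
}


def accepts(delta: Delta, start: str, accepting: Set[str],
            phi: Callable[[State], bool], trace: Iterable[State]) -> bool:
    """Run the automaton over the finite state trace produced by a plan."""
    q = start
    for world in trace:
        q = delta[(q, phi(world))]
    return q in accepting  # plays the role of the acceptance predicate


delivered = lambda s: "delivered" in s
trace = [frozenset(), frozenset({"delivered"}), frozenset()]
sometime_ok = accepts(SOMETIME, "wait", {"acc"}, delivered, trace)
always_ok = accepts(ALWAYS, "ok", {"ok"}, delivered, trace)
```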



Sometime




Always





Preprocessing Automaton

• For every Aφ, two new predicates (possibly parameterized)

– (state-A ?s ?x) and (accepting-A ?x)

• Automaton state updated through conditional effects (CEs, possibly quantified)

– Updates run "one step behind"

– Augmenting each original action + finish action

– Adding start/finish actions ⇒ initialization/goal specification

– Mutex and exhaustive

– Multiple parallel updates of different automata
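The "one step behind" behaviour and the need for a finish action can be seen in a small sketch. It assumes that conditional effects are evaluated in the state where an action is applied, so the automaton consumes the current world state rather than the successor. The functions `apply_augmented` and `finish` and the toy facts are hypothetical.

```python
from typing import Dict, FrozenSet, Tuple

State = FrozenSet[str]

# Automaton for (sometime at-goal): "wait" until at-goal is seen, then "acc".
DELTA: Dict[Tuple[str, bool], str] = {
    ("wait", False): "wait", ("wait", True): "acc",
    ("acc", False): "acc", ("acc", True): "acc",
}


def phi(world: State) -> bool:
    return "at-goal" in world


def apply_augmented(world: State, auto: str,
                    add: FrozenSet[str], delete: FrozenSet[str]) -> Tuple[State, str]:
    """Original effects plus the compiled automaton update.

    The update reads `world` (the state before the effects), so after the
    action the automaton lags one world state behind.
    """
    return (world - delete) | add, DELTA[(auto, phi(world))]


def finish(world: State, auto: str) -> str:
    """Extra final action: lets the automaton consume the last world state."""
    return DELTA[(auto, phi(world))]


w0: State = frozenset({"at-start"})
w1, a1 = apply_augmented(w0, "wait", frozenset({"at-goal"}), frozenset({"at-start"}))
a_final = finish(w1, a1)
```

After the move, the world already satisfies `at-goal` but the automaton is still in `wait`; only the finish step brings it to the accepting state.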



PNFA





PNFA

• All different paths to the goal



PNFA

• Pseudo-action updates

– No augmenting action domain

• Belief state reasoning

• Exploited TLPLAN pruning ability
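One way to read "belief state reasoning" is that the planner tracks the set of automaton states that are currently possible, instead of a single state. The sketch below assumes a propositional NFA given as guarded transitions; `nfa_step` and both example automata are hypothetical.

```python
from typing import Callable, FrozenSet, List, Tuple

State = FrozenSet[str]
Transition = Tuple[str, Callable[[State], bool], str]  # (source, guard, target)


def nfa_step(current: FrozenSet[str], transitions: List[Transition],
             world: State) -> FrozenSet[str]:
    """Advance every automaton state in the belief set at once."""
    return frozenset(
        dst for src, guard, dst in transitions
        if src in current and guard(world)
    )


# Nondeterministic automaton for (sometime p).
SOMETIME_P: List[Transition] = [
    ("q0", lambda s: True, "q0"),
    ("q0", lambda s: "p" in s, "q1"),
    ("q1", lambda s: True, "q1"),
]

# Automaton for (always safe): a single state with a guarded self-loop.
ALWAYS_SAFE: List[Transition] = [("ok", lambda s: "safe" in s, "ok")]

belief = nfa_step(frozenset({"q0"}), SOMETIME_P, frozenset({"p"}))
dead = nfa_step(frozenset({"ok"}), ALWAYS_SAFE, frozenset())
```

An empty belief set means no run of the automaton can continue, so a search node carrying it for a hard constraint is a natural candidate for pruning.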



Non-Compilable TEGs/TEPs

• Constraints that require infinite plans

• State trajectory constraints and linearization



Temporal Domain

• CE added at both start/end points of each action

• TIL (exogenous events)

– within, hold-after, hold-during

• (always-within t φ ψ )

– Timed Automaton

– reset action



Heuristics Design

• Active search

• Priority to achieving HG

• Desirability vs. ease of achieving preferences



Heuristic for Planning with Preferences

• Relaxed planning graph based heuristics

– Graph expanded until all goal and preference facts appear in the relaxed state

– Accepting predicates

– Pseudo-actions
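A minimal sketch of a relaxed planning graph expansion, assuming propositional facts and the usual delete relaxation (only preconditions and add effects are kept). It records the first layer at which each fact appears and stops when all target facts are present or a fixpoint is reached. The function name `fact_depths` and the toy actions are hypothetical.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

Fact = str
RelaxedAction = Tuple[FrozenSet[Fact], FrozenSet[Fact]]  # (preconditions, add effects)


def fact_depths(init: Iterable[Fact], actions: List[RelaxedAction],
                targets: Iterable[Fact]) -> Dict[Fact, int]:
    """Map each reachable fact to the first relaxed layer containing it."""
    targets = list(targets)
    depth: Dict[Fact, int] = {f: 0 for f in init}
    layer = 0
    while not all(t in depth for t in targets):
        layer += 1
        # Facts added by actions applicable in the previous layer.
        new_facts = {
            f
            for pre, add in actions
            if all(p in depth for p in pre)
            for f in add
            if f not in depth
        }
        if not new_facts:
            break  # fixpoint: the remaining targets are unreachable
        for f in new_facts:
            depth[f] = layer
    return depth


actions = [
    (frozenset({"a"}), frozenset({"b"})),
    (frozenset({"b"}), frozenset({"c"})),
]
depths = fact_depths({"a"}, actions, {"c", "z"})
```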



Goal Distance Function G

• How hard it is to reach the goal

– non-admissible



Preference Distance Function P

• How hard it is to reach the preference facts

• Unreachable preference facts do not affect P's value
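A hedged sketch of how G and P could be computed from relaxed-graph depths. It assumes both are sums of the depths at which the relevant facts first appear, raised to an exponent `k`; the exponent, the function names, and the choice of returning infinity for an unreachable goal fact are assumptions made for illustration.

```python
import math
from typing import Dict, Iterable


def goal_distance(depths: Dict[str, int], goal_facts: Iterable[str], k: int = 1) -> float:
    """G: sum of depths of the goal facts; infinite if a goal fact is unreachable."""
    goal_facts = list(goal_facts)
    if any(f not in depths for f in goal_facts):
        return math.inf
    return float(sum(depths[f] ** k for f in goal_facts))


def preference_distance(depths: Dict[str, int], pref_facts: Iterable[str], k: int = 1) -> float:
    """P: same sum over preference facts; unreachable ones are simply skipped."""
    return float(sum(depths[f] ** k for f in pref_facts if f in depths))


# Depths as a relaxed planning graph might report them (hypothetical values).
depths = {"at-home": 2, "pref-visited-shop": 3, "pref-clean": 1}
g = goal_distance(depths, ["at-home"])
p = preference_distance(depths, ["pref-visited-shop", "pref-clean", "pref-unreachable"])
```

Skipping unreachable preference facts keeps P finite, whereas an unreachable goal fact marks the state as a dead end for G.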



Optimistic Metric Function O

• Estimates the metric value achievable by any plan extending the partial plan reaching s

• No RPG; evaluates M in s assuming:

– No PPs will be violated in the future

– Unachievable preferences are treated as false

– All preferences not yet violated will be achieved in the future

• If M is non-increasing in the number of achieved preferences, O is a lower bound (for M) on the best plan extending s
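A small sketch of the optimistic evaluation, assuming M is a weighted sum of violations (one common shape for a PDDL3 metric, but an assumption here). The names `optimistic_metric`, `pp_counts` and `unachievable`, and the example weights, are hypothetical.

```python
from typing import Dict, Iterable


def optimistic_metric(weights: Dict[str, float],
                      pp_counts: Dict[str, int],
                      unachievable: Iterable[str]) -> float:
    """O(s): evaluate M in s under the three optimistic assumptions.

    - PP violations already counted stay, and no new ones are added.
    - Preferences known to be unachievable are counted as violated.
    - Every other preference is assumed to be achieved, so it costs nothing.
    """
    total = sum(weights[p] * n for p, n in pp_counts.items())
    total += sum(weights[p] for p in unachievable)
    return float(total)


weights = {"pp-fuel": 1.0, "tep-visit": 5.0, "sp-home": 3.0}
o_value = optimistic_metric(weights, pp_counts={"pp-fuel": 2}, unachievable=["tep-visit"])
# Value of M if the still-open preference "sp-home" ended up violated as well.
pessimistic = o_value + weights["sp-home"]
```

Under this linear metric no extension of s can score below `o_value`, which is the property that makes O usable as a pruning bound.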





Discounted Metric Function D(r)

• Believes more in easier preferences

– M's weights have a higher impact on D (tradeoff)

• r ∈ [0, 1]: discount factor

– r → 0: heavily discounts deeper preferences
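A sketch of one discounted form consistent with the bullets above. It assumes D(r) = M(s_0) + Σ_i (M(s_{i+1}) − M(s_i)) · r^i, where M(s_i) is the metric evaluated on the i-th relaxed layer. This formula and the name `discounted_metric` are assumptions for illustration, not taken from the slides.

```python
from typing import Sequence


def discounted_metric(layer_metrics: Sequence[float], r: float) -> float:
    """Discount metric improvements by the relaxed depth at which they appear."""
    d = layer_metrics[0]
    for i in range(len(layer_metrics) - 1):
        improvement = layer_metrics[i + 1] - layer_metrics[i]
        d += improvement * r ** i  # deeper layers get a smaller weight when r < 1
    return d


# Hypothetical metric values on three successive relaxed layers.
layers = [10.0, 6.0, 1.0]
d_full = discounted_metric(layers, 1.0)
d_half = discounted_metric(layers, 0.5)
d_zero = discounted_metric(layers, 0.0)
```

With r = 1 every relaxed improvement is trusted in full. With r = 0 only the improvement visible in the first relaxed layer counts, since `0 ** 0` evaluates to 1 in Python and all higher powers vanish.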



HPLAN-P

• Forward search

– Best First Search

• Heuristic

– Different from TLPLAN

• Incremental (episodic)

– Each episode ends as soon as a better plan is found

• Optimal



Sequence of Planning Episodes

• G with Best First Search

– HG must be satisfied

– Other heuristics can conflict with HG

• Restart the search using some combination of the heuristic functions

– Any combination of heuristics

– Always G first

– Prioritized sequences to break ties

– GD(0.3)O

– GD(0.1)D(0.2)P

• Caching relaxed states and computed heuristic values
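A prioritized sequence such as G, D(0.3), O can be read as a lexicographic comparison: nodes are ordered by G, ties are broken by D(0.3), and remaining ties by O. The sketch below assumes that reading; the node names and heuristic values are hypothetical.

```python
from typing import Dict, Sequence, Tuple


def priority(values: Dict[str, float], strategy: Sequence[str]) -> Tuple[float, ...]:
    """Build a sort key: earlier heuristics dominate, later ones break ties."""
    return tuple(values[h] for h in strategy)


strategy = ["G", "D(0.3)", "O"]
nodes = {
    "n1": {"G": 2, "D(0.3)": 7.0, "O": 4},
    "n2": {"G": 2, "D(0.3)": 5.5, "O": 9},
    "n3": {"G": 1, "D(0.3)": 9.0, "O": 9},
}
# Lower is better for every component, so plain tuple ordering is enough.
ranked = sorted(nodes, key=lambda n: priority(nodes[n], strategy))
```

Here `n3` comes first because of its lower G, and `n2` beats `n1` on the D(0.3) tie-break even though its O value is worse.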



Increase Plan Quality

• Each subsequent episode yields a better plan

• Increasingly restricted pruning

– MetricBoundFN(s) estimates a lower bound on M of any plan extending s

– Either O or B can be used by MetricBoundFN(.)

• Pruning states that violate HC



HPLAN-P Algorithm
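The incremental scheme from the preceding slides can be sketched as a best-first search that records every strictly better plan and prunes any node whose metric bound cannot beat the best plan so far. This is a simplified illustration, not the original pseudocode: it keeps a single frontier instead of restarting between episodes, and all function and parameter names are hypothetical.

```python
import heapq
import itertools
import math


def incremental_search(init, successors, is_plan, metric, metric_bound, heuristic):
    """Return the sequence of increasingly better (plan, metric) pairs found."""
    best_metric = math.inf
    found = []
    counter = itertools.count()  # tie-breaker, so nodes themselves are never compared
    frontier = [(heuristic(init), next(counter), init)]
    while frontier:
        _, _, s = heapq.heappop(frontier)
        if metric_bound(s) >= best_metric:
            continue  # prune: nothing extending s can beat the best plan
        if is_plan(s) and metric(s) < best_metric:
            best_metric = metric(s)  # an episode ends: a better plan was found
            found.append((s, best_metric))
        for t in successors(s):
            heapq.heappush(frontier, (heuristic(t), next(counter), t))
    return found


# Toy problem: plans are sequences of two actions; each "b" violates a preference.
ACTIONS = ("b", "a")


def successors(p):
    return [p + (x,) for x in ACTIONS] if len(p) < 2 else []


def violations(p):
    return p.count("b")  # never decreases along a path, so it is a sound bound


plans = incremental_search(
    init=(),
    successors=successors,
    is_plan=lambda p: len(p) == 2,
    metric=violations,
    metric_bound=violations,
    heuristic=lambda p: 2 - len(p),
)
metrics = [m for _, m in plans]
```

On this toy problem the search reports plans of metric 2, then 1, then 0, and the partial plan extending ("a",) with "b" is pruned once a plan of metric 1 is known.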





Sound Pruning

• If MetricBoundFN(s) is a lower bound on M of any plan extending s ⇒ pruning is sound

• With sound pruning optimal plans are never pruned

1. MetricBoundFN(s) ≥ bestMetric

2. s is pruned

3. MetricBoundFN(s) ≤ M(ss)

4. ss never reached

5. M(ss) ≥ bestMetric

6. sound pruning



Optimality

• If HPLAN-P stops and sound pruning is used ⇒ the last plan returned is optimal

• Proof

• Each planning episode has returned a better plan

• It stops only when final episode has rejected all possible plans

• Sound pruning never prunes optimal plans

• No better plan than the last one returned exists

• UserHeuristic(.) can even be non-admissible

• k-optimality

• sound pruning

• (total-time) ≤ k as HC



Termination

• HPLAN-P termination conditions:

– bestMetric_initial finite

– MetricBoundFN(s) ≤ bestMetric_initial finite

– M cannot improve as the number of violated PPs increases

– ∀ m | m < bestMetric_initial and M = m

– The number of plans with M<m is finite



References

• Baier, J., Bacchus, F., and McIlraith, S. A Heuristic Search Approach to Planning with Temporally Extended Prefer. Proceedings of the Twentieth International Joint Conference on Artificial Intelligence (IJCAI-07), pp. 1808-1815, January 2007, Hyderabad, India. http://www.cs.toronto.edu/tlplan/IJCAI07-292.pdf

• Baier, J. A. and McIlraith, S. Planning with First-Order Temporally Extended Goals Using Heuristic Sear. Proceedings of the 21st National Conference on Artificial Intelligence (AAAI-06), pp. 788-795, July 2006, Boston, MA. http://www.cs.toronto.edu/~sheila/publications/papers/bai-mci-aaai06.pdf

• Gerevini, A. E., Long, D., Haslum, P., Saetti, A., and Dimopoulos, Y. Deterministic Planning in the Fifth International Planning Competition: PD. Artificial Intelligence, vol. 173 (2009), pp. 619-668. http://www.ing.unibs.it/~gerevini/papers/GereviniEtAl-IPC5.ps.gz

• Author pages: http://www.cs.toronto.edu/~jabaier, http://www.cs.toronto.edu/~fbacchus, http://www.cs.toronto.edu/~sheila

