
7/30/2019 Hplan-P planning system

http://slidepdf.com/reader/full/hplan-p-planning-system 1/36

HPLAN-P

A Heuristic Search Planner for Planning with Temporally Extended Preferences

Luca Ceriani


HPLAN-P

• Heuristic planning with temporally extended goals (TEGs), simple preferences (SPs) and temporally extended preferences (TEPs)

 – Incremental search algorithm

• Extended version of TLPLAN

 – TLPLAN DDL

 – PDDL2TLPlan translator

• Awarded distinguished performance in the qualitative preference track (IPC5)


Heuristic Planning with Preferences

• Distinguishing between successful plans of different quality

 – Qualitative vs. quantitative

• Actively guide the search towards the achievement of preferences

 – Heuristics for planning with preferences


Outline

• PDDL3 problem/domain

 – Planning problem with TEGs and TEPs

• Preprocessing Phase

• Adapting existing heuristic search techniques to achieve SPs and solve the compiled problem

• HPLAN-P algorithm

 – exploiting the adapted heuristics to incrementally find better plans


PDDL3 Overview

• TEGs/TEPs

• Simple Preferences (SPs)

• Precondition Preferences (PPs)

• Metric Function (M)


 TEGs and TEPs

•  Temporal constraints


Simple Preferences (SPs)

• Atemporal conditions over the final state of a plan


Precondition Preferences (PPs)


Metric Function

• Defines the (numeric) quality of a plan over:

 – Preference violation weight/count

 – Preference internal/external quantification


Preprocessing PDDL3

• Simpler planning problem containing only SPs

 – augmented planning domain/problem

• New metric function M

 – refers to SPs


Preprocessing PPs

• (preference p φ)

 – is-violated-p counter

 – Initialized to 0

 – (when (not φ) (increase (is-violated-p) 1))

• In the context of a single action
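
The counter update above can be illustrated with a minimal Python sketch. The `Action` class, the `apply` function and the `drive` example are hypothetical names introduced here for illustration only; they are not part of HPLAN-P or of the PDDL2TLPlan translator.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, FrozenSet, Tuple

State = FrozenSet[str]


@dataclass
class Action:
    name: str
    add: FrozenSet[str] = frozenset()
    delete: FrozenSet[str] = frozenset()
    # Precondition preferences: preference name -> test for φ on the current state.
    pref_preconds: Dict[str, Callable[[State], bool]] = field(default_factory=dict)


def apply(action: Action, state: State, counters: Dict[str, int]) -> Tuple[State, Dict[str, int]]:
    """Apply `action`, incrementing is-violated-p for each preference p whose φ is false."""
    updated = dict(counters)
    for pref, phi in action.pref_preconds.items():
        if not phi(state):  # mirrors (when (not φ) (increase (is-violated-p) 1))
            updated[pref] = updated.get(pref, 0) + 1
    return (state - action.delete) | action.add, updated


# Hypothetical action: driving is allowed without a full tank, but preferred with one.
drive = Action(
    name="drive",
    add=frozenset({"at-b"}),
    delete=frozenset({"at-a"}),
    pref_preconds={"p": lambda s: "fuel-full" in s},
)
state, counters = apply(drive, frozenset({"at-a"}), {"p": 0})
```

The counter is checked against the state in which the action is executed, which is why the violation is local to a single action.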


Preprocessing TEGs /TEPs

• TEGs and TEPs are reduced to SGs and SPs

 – SPs are optional goal conditions

• ∀ (TEG or TEP) φ:

 – a new domain predicate Pφ

 – Pφ is TRUE ⇔ φ is satisfied by the plan


Preprocessing TEGs /TEPs: Steps

• TEG/TEP φ ⇒ f-FOLTL formula fφ

 – finite LTL: not achievable goals

• fφ ⇔ automaton Aφ

 – No Büchi automaton (BA)

 – Transitions labeled with FO (PDDL) predicates

 – Aφ states monitor the satisfaction of φ

• Aφ ⇒ planning domain

 – Only valid/preferred plans simulate automata

 – acceptance predicate ⇔ acceptance state
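
To make the monitoring idea concrete, here is a small Python sketch of a nondeterministic automaton run over a finite state trace. It is only an illustration under assumptions: the `Automaton` class and the `sometime`/`always` constructors are hypothetical, and the actual compilation encodes the automaton inside the planning domain rather than running it separately.

```python
from typing import Callable, FrozenSet, Iterable, List, Set, Tuple

World = FrozenSet[str]
Guard = Callable[[World], bool]
Transition = Tuple[str, Guard, str]  # (source state, guard on the world, target state)


class Automaton:
    def __init__(self, initial: str, accepting: Iterable[str], transitions: List[Transition]):
        self.initial = initial
        self.accepting = set(accepting)
        self.transitions = transitions

    def step(self, current: Set[str], world: World) -> Set[str]:
        """Follow every enabled transition; the result is the set of possible states."""
        return {dst for src, guard, dst in self.transitions if src in current and guard(world)}

    def accepts(self, trace: Iterable[World]) -> bool:
        current = {self.initial}
        for world in trace:
            current = self.step(current, world)
        # The "acceptance predicate" holds iff some accepting state is reached.
        return bool(current & self.accepting)


def sometime(phi: Guard) -> Automaton:
    true = lambda w: True
    return Automaton("q0", {"q1"}, [("q0", true, "q0"), ("q0", phi, "q1"), ("q1", true, "q1")])


def always(phi: Guard) -> Automaton:
    return Automaton("q0", {"q0"}, [("q0", phi, "q0")])
```

Because the trace is finite, acceptance is decided by the states reached at the end of the plan, which is consistent with not needing Büchi acceptance.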


Sometime


Always


Preprocessing Automaton

• ∀ Aφ, two new predicates (possibly parameterized)

 – (state-A ?s ?x) and (accepting-A ?x)

• Automaton state updates via conditional effects (CE), possibly quantified

 – “one step behind”

 – Augmenting each original action + finish action

 – Adding start/finish actions ⇒ initialization/goal specification

 – Mutex and exhaustive

 – Multiple parallel updates of different automata


PNFA


PNFA

• All different paths to the goal


PNFA

• Pseudo-action updates

 – No augmentation of the action domain

• Belief state reasoning

• Exploited TLPLAN pruning ability


Non-Compilable TEGs/TEPs

• Constraints that require infinite plans

• State trajectory constraints and linearization


 Temporal Domain

• CE added at both start/end points of each action

•  TIL (exogenous events)

 – within, hold-after, hold-during

• (always-within t φ ψ)

 – Timed Automaton

 – reset action


Heuristics Design

• Active search

• Priority to achieving HG

• Desirability vs. ease of achieving preferences


Heuristic for Planning with Preferences

• Relaxed planning graph based heuristics

 – graph expanded until all goal and preference facts appear in the relaxed state

 – accepting predicates

 – pseudo-actions
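
The graph expansion described above can be sketched in Python as a delete-relaxed reachability loop. This is a generic illustration, not HPLAN-P's implementation: the `relaxed_levels` function and the representation of actions as `(preconditions, add effects)` pairs are assumptions made for the example.

```python
from typing import Dict, FrozenSet, Iterable, List, Set, Tuple

RelaxedAction = Tuple[FrozenSet[str], FrozenSet[str]]  # (preconditions, add effects)


def relaxed_levels(
    state: Iterable[str],
    actions: List[RelaxedAction],
    targets: Set[str],
) -> Dict[str, int]:
    """Return, for each reachable fact, the first relaxed layer in which it appears.

    Delete effects are ignored. Expansion stops once every target (goal or
    preference fact) has appeared, or at a fixpoint if some target is unreachable.
    """
    reached = set(state)
    level = {fact: 0 for fact in reached}
    depth = 0
    while not targets <= reached:
        new_facts: Set[str] = set()
        for pre, add in actions:
            if pre <= reached:
                new_facts |= add - reached
        if not new_facts:
            break  # fixpoint: the remaining targets never appear
        depth += 1
        for fact in new_facts:
            level[fact] = depth
        reached |= new_facts
    return level


levels = relaxed_levels(
    {"a"},
    [(frozenset({"a"}), frozenset({"b"})), (frozenset({"b"}), frozenset({"c"}))],
    targets={"c", "z"},
)
```

Facts missing from the returned dictionary (here the hypothetical fact `z`) are unreachable even in the relaxation.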


Goal Distance Function G

• How hard it is to reach the goal

 – non-admissible


Preference Distance Function P

• How hard it is to reach the preference facts

• Unreachable preference facts do not affect P’s value
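
A hedged Python sketch of how G and P might be computed from relaxed-graph layers follows. Summing the layer at which each fact first appears is an assumption made here for illustration; the slides state only that G is non-admissible and that unreachable preference facts do not affect P. The function names and the `levels` dictionary are hypothetical.

```python
import math
from typing import Dict, Iterable


def goal_distance(levels: Dict[str, int], goal_facts: Iterable[str]) -> float:
    """Sum of first-appearance layers of the goal facts (assumed form of G)."""
    total = 0
    for fact in goal_facts:
        if fact not in levels:
            return math.inf  # a hard goal that never appears cannot be reached
        total += levels[fact]
    return total


def preference_distance(levels: Dict[str, int], pref_facts: Iterable[str]) -> int:
    """Sum of layers of the reachable preference facts (assumed form of P).

    Unreachable preference facts are skipped, so they do not affect the value.
    """
    return sum(levels[fact] for fact in pref_facts if fact in levels)


# Hypothetical first-appearance layers for two goal facts and one preference fact.
levels = {"g1": 2, "g2": 3, "p1": 1}
```

A sum of layers can overestimate the true cost when one action achieves several facts, which is one way such a function ends up non-admissible.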


Optimistic Metric Function O

• Estimates the metric value achievable by any plan extending the partial plan reaching s

• Does not use the RPG; evaluates M in s assuming:

 – no PPs will be violated in the future

 – unachievable preferences are treated as false

 – all preferences not yet violated will be achieved in the future

• If M is non-increasing in the number of achieved preferences, O is a lower bound (for M) on the best plan extending s
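
The three assumptions can be sketched in Python as follows. The weighted-sum `metric` is an assumed example of M (real PDDL3 metrics may be arbitrary expressions), and all names, weights and preference labels are hypothetical.

```python
from typing import Dict, Iterable, Set


def metric(violated: Set[str], pp_counts: Dict[str, int], weights: Dict[str, float]) -> float:
    """Assumed example of M: weighted violated preferences plus weighted PP counters."""
    return sum(weights[p] for p in violated) + sum(weights[p] * n for p, n in pp_counts.items())


def optimistic_metric(
    all_prefs: Iterable[str],
    unachievable: Iterable[str],
    pp_counts: Dict[str, int],
    weights: Dict[str, float],
) -> float:
    """Evaluate M in s under the optimistic assumptions.

    - PP counters keep their current values (no future PP violations).
    - Preferences that can no longer be achieved count as violated.
    - Every other preference is assumed to be achieved eventually.
    """
    violated = set(all_prefs) & set(unachievable)
    return metric(violated, pp_counts, weights)


o_value = optimistic_metric(
    all_prefs={"p1", "p2"},
    unachievable={"p2"},
    pp_counts={"pp": 2},
    weights={"p1": 5, "p2": 3, "pp": 1},
)
```

With this example metric, achieving more preferences never increases M, so the value is a lower bound on the metric of any extension of s.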


Discounted Metric Function D(r)

• Believes more in easier preferences

 – M’s weight has a higher impact on D (tradeoff)

• r ∈ [0, 1] discount factor

 – r close to 0: heavily discounts deeper preferences
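
The slide gives no formula for D(r). As an illustration only, the Python sketch below assumes a discounted sum of metric improvements across relaxed-graph layers, D(r) = M(F_0) + Σ_i (M(F_{i+1}) − M(F_i)) · r^i, where M(F_i) is the metric evaluated on relaxed layer i. Both this form and the `discounted_metric` name are assumptions, not taken from the slides.

```python
from typing import Sequence


def discounted_metric(layer_metrics: Sequence[float], r: float) -> float:
    """Discounted metric over relaxed layers (assumed form, see lead-in).

    layer_metrics[i] is M evaluated on relaxed layer i. Improvements that only
    show up in deeper layers are multiplied by higher powers of r, so with a
    small r they contribute little.
    """
    value = layer_metrics[0]
    for i in range(len(layer_metrics) - 1):
        value += (layer_metrics[i + 1] - layer_metrics[i]) * r ** i
    return value


# Hypothetical metric values on four successive relaxed layers (lower is better).
layers = [10, 6, 4, 1]
```

Under this assumed form, r = 1 yields the metric of the deepest layer (no discounting), while a small r trusts only the improvements available in the first layers.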


HPLAN-P

• Forward search

 – Best-first search

 – Heuristic: different from TLPLAN

• Incremental (episodic)

 – Each episode ends as soon as a better plan is found

• Optimal


Sequence of Planning Episodes

• G with Best First Search

 – HG must be satisfied

 – Other heuristics can conflict with HG

• Restart the search using some combination of the heuristic functions

 – Any combination of heuristics

 – Always G first

 – Prioritized sequences to break ties

 – GD(0.3)O

 – GD(0.1)D(0.2)P

• Caching relaxed states and computed heuristic values
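
A prioritized sequence such as GD(0.3)O can be read as a lexicographic ordering: compare by G first and use the later functions only to break ties. The Python sketch below shows this with tuple keys. The `prioritized` helper and the precomputed heuristic values stored in each record are hypothetical.

```python
from typing import Callable, Dict, Tuple

Record = Dict[str, float]  # hypothetical search node with cached heuristic values


def prioritized(*heuristics: Callable[[Record], float]) -> Callable[[Record], Tuple[float, ...]]:
    """Build a sort key that compares states by each heuristic in turn."""
    return lambda node: tuple(h(node) for h in heuristics)


G = lambda node: node["G"]
D = lambda node: node["D"]  # stands in for D(0.3), assumed precomputed
O = lambda node: node["O"]

frontier = [
    {"name": "a", "G": 2, "D": 7.0, "O": 3},
    {"name": "b", "G": 1, "D": 9.0, "O": 4},
    {"name": "c", "G": 1, "D": 5.0, "O": 6},
]
ordered = sorted(frontier, key=prioritized(G, D, O))
order = [node["name"] for node in ordered]
```

Nodes `b` and `c` tie on G, so D decides between them; O would only matter if D tied as well.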


Increase Plan Quality

• Each subsequent episode yields a better plan

• Increasingly restrictive pruning

 – MetricBoundFN(s) estimates a lower bound on M of any plan extending s

 – Either O or B can be used as MetricBoundFN(.)

• Pruning states that violate HC


HPLAN-P Algorithm


Sound Pruning

• If MetricBoundFN(s) is a lower bound on M of any plan extending s ⇒ pruning is sound

• With sound pruning, optimal plans are never pruned

1. Suppose MetricBoundFN(s) ≥ bestMetric

2. Then s is pruned

3. For any plan ss extending s, MetricBoundFN(s) ≤ M(ss)

4. ss is never reached

5. But M(ss) ≥ bestMetric, so ss would not improve on the best plan found

6. Hence pruning is sound
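
The pruning test can be seen in a simplified Python sketch of an incremental best-first search. This is a sketch under assumptions, not the HPLAN-P algorithm itself: it keeps a single frontier instead of restarting the search after each plan, it has no cycle detection (the example graph is acyclic), and every name (`incremental_search`, `GRAPH`, `METRIC`, `BOUND`) is hypothetical.

```python
import heapq
import itertools
import math


def incremental_search(initial, successors, is_plan, metric, heuristic, metric_bound):
    """Return the sequence of successively better plans found as (metric, actions)."""
    best_metric = math.inf
    found = []
    tie = itertools.count()  # unique tie-breaker so states are never compared
    frontier = [(heuristic(initial), next(tie), initial, [])]
    while frontier:
        _, _, state, plan = heapq.heappop(frontier)
        # Pruning: no extension of this state can beat the best plan so far.
        if metric_bound(state) >= best_metric:
            continue
        if is_plan(state) and metric(state) < best_metric:
            best_metric = metric(state)
            found.append((best_metric, plan))
        for action, nxt in successors(state):
            heapq.heappush(frontier, (heuristic(nxt), next(tie), nxt, plan + [action]))
    return found


# Hypothetical toy problem: two goal states with metrics 5 and 2.
GRAPH = {"s0": [("a", "s1"), ("b", "s2")], "s1": [("c", "g1")], "s2": [("d", "g2")], "g1": [], "g2": []}
METRIC = {"g1": 5, "g2": 2}
BOUND = {"s0": 2, "s1": 5, "s2": 2, "g1": 5, "g2": 2}  # lower bounds on reachable metrics


def run(heuristic):
    return incremental_search(
        "s0", lambda s: GRAPH[s], lambda s: s in METRIC, lambda s: METRIC[s], heuristic, lambda s: BOUND[s]
    )
```

With an uninformative heuristic the search finds the plan of metric 5 and then the better plan of metric 2. When guided by the bound itself, it finds the plan of metric 2 first and prunes `s1`, whose bound of 5 cannot beat it.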


Optimality

• If HPLAN-P stops and sound pruning is used ⇒ the last plan returned is optimal

• Proof

 – Each planning episode has returned a better plan

 – It stops only when the final episode has rejected all possible plans

 – Sound pruning never prunes optimal plans

 – No better plan than the last one returned exists

• UserHeuristic(.) can even be non-admissible

• k-optimality

 – sound pruning

 – (total-time) ≤ k as HC


 Termination

• HPLAN-P termination conditions:

 – bestMetric_initial finite

 – MetricBoundFN(s) ≤ bestMetric_initial finite

 – M cannot improve as the number of violated PPs increases

 – ∀ m | m < bestMetric_initial and M = m

 – The number of plans with M < m is finite

