52
Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal) Håkan L. S. Younes Thesis Committee: Reid G. Simmons (chair) Geoffrey J. Gordon Jeff Schneider David J. Musliner, Honeywell

Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

  • Upload
    joshua

  • View
    26

  • Download
    0

Embed Size (px)

DESCRIPTION

Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal). Håkan L. S. Younes Thesis Committee: Reid G. Simmons (chair) Geoffrey J. Gordon Jeff Schneider David J. Musliner, Honeywell. Sampling-based Probabilistic Verification. Success. Aim of Thesis Research. - PowerPoint PPT Presentation

Citation preview

Page 1: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

Håkan L. S. Younes

Thesis Committee:Reid G. Simmons (chair)

Geoffrey J. GordonJeff Schneider

David J. Musliner, Honeywell

Page 2: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

2

Aim of Thesis Research

Policy generation for continuous-time stochastic systems with concurrency

Sampling-basedProbabilistic Verification

Success

Page 3: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

3

Motivating Example

Switch Backbone Switch

Left cluster Right cluster

L1

L2

L3 R3

R2

R1

Dependable workstation cluster Components may fail at any point in

time Single repair agent

Page 4: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

4

Motivating Example

Switch Backbone Switch

Left cluster Right cluster

L1

L2

L3 R3

R2

R1

Planning problem: Find repair policy that maintains

satisfactory quality of service

Page 5: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

5

Motivating Example

Switch Backbone Switch

Left cluster Right cluster

L1

L2

L3 R3

R2

R1

all working

Page 6: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

6

Motivating Example

Switch Backbone Switch

Left cluster Right cluster

L1

L2

L3 R3

R2

R1

all working L2 failed

L2 fails

3.6

Page 7: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

7

Motivating Example

Switch Backbone Switch

Left cluster Right cluster

L1

L2

L3 R3

R2

R1

all working L2 failed repairing L2

L2 fails repair L2

3.6 1.5

Page 8: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

8

Motivating Example

Switch Backbone Switch

Left cluster Right cluster

L1

L2

L3 R3

R2

R1

all working L2 failed repairing L2

repairing L2

switch failed

L2 fails repair L2 left switch fails

3.6 1.5 2.8

Page 9: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

9

Motivating Example

Switch Backbone Switch

Left cluster Right cluster

L1

L2

L3 R3

R2

R1

all working L2 failed repairing L2 switch failedrepairing L2

switch failed

L2 fails repair L2 left switch fails L2 repaired

3.6 1.5 2.8 4.2

Page 10: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

10

Why Challenging?

Continuous-time domain: State transitions can occur at any point

in time along a continuous time-axis Concurrent actions and events:

Left switch fails while repairing L2

Non-Markovian: Weibull distribution for time to

hardware failure

Page 11: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

11

Semi-Markovian?

Each component can be viewed as a semi-Markov process

working

failedrepairing

Weibull(2)

Uniform(1,2)

Uniform(3,8)

Page 12: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

12

ConcurrentSemi-Markov Processes

4.2

4.2

4.2

all working L2 failed repairing L2 switch failedrepairing L2

switch failed

L2 fails repair L2 left switch fails L2 repaired

3.6 1.5 2.8 4.2

generalized semi-Markov process (GSMP)

Page 13: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

13

Potential Application Domains

Multi-agent systems Process plant control

Startup/shutdown procedures Robotic control

Mars rover Large degree-of-freedom humanoid

Page 14: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

14

Planning Problem

Input: Domain model (GSMP) Initial state Goal condition

Output: Policy mapping states to actions

Page 15: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

15

Motivating Example

Maintain three interconnected workstations for 24 hours with probability at least 0.9

Switch Backbone Switch

Left cluster Right cluster

L1

L2

L3 R3

R2

R1

Page 16: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

16

Approach

Generate, Test, and Debug (GTD) (Simmons 1988)

Page 17: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

17

The GTD Paradigm

Generate initial plan making simplifying assumptions

Test if plan satisfies goal condition Debug plan if not satisfactory

Page 18: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

18

My Adaptation of GTD

Generate assume we can quickly generate plan

for simplified problem, or start with null-plan

Test Sampling-based probabilistic verification

Debug Analysis of sample execution paths

Page 19: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

19

Approach: Schematic View

Generate

Test Debugplan, sample paths

Test

Compare

new plan

sample paths

best plan

old plan, sample paths

initial plan

Page 20: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

20

Test Step

Verify goal condition in initial state Use simulation to generate sample

execution paths Use sequential acceptance sampling

(Wald 1945) to verify probabilistic properties

Page 21: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

21

Debug Step

Analyze sample execution paths to find reasons for failure Negative sample paths provide

information on how a plan can fail

Page 22: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

22

Generic Repair Procedure

1. Select some state along some negative sample path

2. Change the action planned for the selected state

Develop heuristics to make informed state/action choices

Page 23: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

23

Hill-climbing Search for Satisfactory Plan

1. Compare repaired plan to original plan

Fast sampling-based comparison

2. Select the better of the two plans

Page 24: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

24

Approach Summary

Plan verification using discrete event simulation and acceptance sampling

Plan repair using sample path analysis Heuristics to guide repair selection

Hill-climbing search for satisfactory plan Fast sampling-based plan comparison

Page 25: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

25

Related Work

Probabilistic planning Planning with concurrency Policy search for POMDPs Probabilistic verification

Page 26: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

26

Probabilistic Planning

Anytime synthetic projection (Dummond & Bresina 1990) Modal temporal logic for goal

specification Generate initial solution path “Robustify”: Find solution paths for

deviations to ideal path

Page 27: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

27

Planning with Concurrency

CIRCA (Musliner et al. 1995) Policy generation for timed automata Non-deterministic model

Concurrent temporally extended actions (Rohanimanesh & Mahadevan 2001) Decision-theoretic approach SMDP Q-learning

Page 28: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

28

Policy Search for POMDPs

Pegasus (Ng & Jordan 2000) Estimate value function by generating

sample execution paths GPOMDP (Baxter & Bartlett 2001)

Simulation-based method for estimating gradient of average reward

Page 29: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

29

Probabilistic Verification

PRISM (Kwiatkowska et al. 2002) Symbolic probabilistic model checking Requires Markov model

Page 30: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

30

Previous Work

Heuristic deterministic planning Sampling-based probabilistic

verification Initial work on sample path

analysis

Page 31: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

31

Heuristic Deterministic Planning

VHPOP: A heuristic partial order planner Distance-based heuristics for plan

ranking Novel flaw selection strategies “Best Newcomer” at 3rd International

Planning Competition

Page 32: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

32

Sampling-basedProbabilistic Verification Verifying CSL properties of discrete

event systems Sampling-based vs. symbolic methods:

Sampling-based approach… uses less memory scales well with size of state space adapts to difficulty of problem (sequential)

Symbolic methods… give exact results handle time-unbounded and steady-state

properties

Page 33: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

33

Tandem Queuing Network

M/Cox2/1 queue sequentially composed with M/M/1 queue

Each queue has capacity n State space of size O(n2)

1 2

a……

1−a

Page 34: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

34

Tandem Queuing Network (property)

When empty, probability is less than 0.5 that system becomes full within t time units: ¬Pr≥0.5(true U≤t full) in state “empty”

…… ……

Page 35: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

35

Tandem Queuing Network (results)

101 102 103 104 105 106 107 108 109 101010−2

10−1

100

101

102

103

104

105

106

Size of state space

Veri

fica

tion

tim

e (

seco

nds)

¬Pr≥0.5(true U≤t full)

==10−2

=10−2

T=500 (symbolic)T=50 ( " )T=5 ( " )T=500 (sampling)T=50 ( " )T=5 ( " )

Page 36: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

36

Tandem Queuing Network (results)

10 100 100010−2

10−1

100

101

102

103

104

105

Veri

fica

tion

tim

e (

seco

nds)

t

¬Pr≥0.5(true U≤t full)

==10−2

=10−2

n=255 (symbolic)n=31 ( " )n=3 ( " )n=255 (sampling)n=31 ( " )n=3 ( " )

Page 37: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

37

Symmetric Polling System

Single server, n polling stations Stations are attended in cyclic

order Each station can hold one message State space of size O(n·2n)

Server

…Polling stations

Page 38: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

38

Symmetric Polling System (property)

When full and serving station 1, probability is at least 0.5 that station 1 is polled within t time units: Pr≥0.5(true U≤t poll1) in state “full, srv

1” …

serving 1 polling 1

Page 39: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

39

Symmetric Polling System (results)V

eri

fica

tion

tim

e (

seco

nds)

Size of state space102 104 106 108 1010 1012 1014

10−2

10−1

100

101

102

103

104

105

Pr≥0.5(true U≤t poll1)

==10−2

=10−2

T=40 (symbolic)T=20 ( " )T=10 ( " )T=40 (sampling)T=20 ( " )T=10 ( " )

Page 40: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

40

Symmetric Polling System (results)V

eri

fica

tion

tim

e (

seco

nds)

10 100 1000

t

105

100

101

102

103

104

106

Pr≥0.5(true U≤t poll1)

==10−2

=10−2

n=18 (symbolic)n=15 ( " )n=10 ( " )n=18 (sampling)n=15 ( " )n=10 ( " )

Page 41: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

41

symbolic

Symmetric Polling System (results)

==10−10

==10−8

==10−6

==10−4

==10−2

100

101

102

Veri

fica

tion

tim

e (

seco

nds)

0.001 0.01

Pr≥0.5(true U≤t poll1)

n=10t=50

Page 42: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

42

Initial Work onSample Path Analysis

s9 ¬safe

Pr≥0.9(safe U≤100 goal)

s1 s5s2

s1 s5 s3 ¬safe

−1−−2−3−4

−1−−2−3

s5

s5

Page 43: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

43

Proposed Work

Heuristic plan repair Conjunctive probabilistic goals

Page 44: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

44

Heuristic Plan Repair Identify local repair operations for GSMP

domain models Develop distance-based heuristics for

stochastic domains to guide repair selection

Make use of positive sample execution paths to rank plan repairs

Evaluate different search strategies, such as hill-climbing and simulated annealing

Page 45: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

45

Heuristics to GuideAction Selection

Pr≥0.9(safe U≤100 goal)

safe ¬safe goal

Alternative path

Page 46: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

46

Utilizing Positive Sample Execution Paths

Positive sample paths show how a plan can succeed

Analyze positive sample paths to identify positive behavior

s9 goal

Pr≥0.9(safe U≤100 goal)

s1 s5s2

1234

Page 47: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

47

Evaluating AlternativeSearch Strategies

Simulated annealing Avoid getting stuck in local optimum Use confidence from plan comparison

to probabilistically select a plan

Page 48: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

48

Conjunctive Probabilistic Goals Add support for more expressive

goal conditions Pr≥() Pr≥’(’)

Can be used to express goal priorities

Challenges: How to compare two plans Multiple goals to account for when

repairing a plan

Page 49: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

49

Thesis

There is sufficient information in sample execution paths obtained during plan verification to enable efficient and effective repair of plans for continuous-time stochastic domains

Page 50: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

50

Validation Construct benchmark domains

Develop probabilistic PDDL Extend benchmarks for deterministic

temporal planning Gain insight into what makes a problem

difficult (empirical and theoretical analysis)

Make sure interesting plans can be generated in reasonable amount of time

Page 51: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

51

TimetableDate Activity

Spring 2003 Develop benchmark domains

Summer 2003

Identify repair operationsImplement initial version of planning system

Fall 2003 Develop distance-based heuristics for stochastic domains

Winter 2003/04

Experiment with different search strategies

Spring 2004 Add support for conjunctive probabilistic goals

Summer 2004

Theoretical analysis

Fall 2004 Final experimental validationWriting of thesis

Winter 2004/05

Finalizing thesisThesis defense

Page 52: Planning with Concurrency in Continuous-time Stochastic Domains (thesis proposal)

52

Possible Extensions

Generation of non-stationary policies

Partial observability Continuous-space models Decision-theoretic planning