Hierarchical Methods for Planning under Uncertainty Thesis Proposal Joelle Pineau Thesis Committee: Sebastian Thrun, Chair Matthew Mason Andrew Moore Craig Boutilier, U. of Toronto

Hierarchical Methods for Planning under Uncertainty

Thesis Proposal

Joelle Pineau

Thesis Committee:

Sebastian Thrun, Chair

Matthew Mason

Andrew Moore

Craig Boutilier, U. of Toronto

Integrating robots in living environments

The robot’s role:
- Social interaction
- Mobile manipulation
- Intelligent reminding
- Remote-operation
- Data collection / monitoring

A broad perspective

GOAL = Selecting appropriate actions

[Figure: control loop linking USER + WORLD + ROBOT (STATE), ACTIONS, OBSERVATIONS, and the belief state.]

Why is this a difficult problem?

UNCERTAINTY

Cause #1: Non-deterministic effects of actions

Cause #2: Partial and noisy sensor information

Cause #3: Inaccurate model of the world and the user

A solution: Partially Observable Markov Decision Processes (POMDPs)

[Figure: three-state example with states S1, S2, S3, each emitting observations o1, o2, connected by actions a1 and a2.]

The truth about POMDPs

• Bad news:

– Finding an optimal POMDP action selection policy is computationally intractable for complex problems.

• Good news:

– Many real-world decision-making problems exhibit structure inherent to the problem domain.

– I propose an algorithm that leverages structure in the problem domain to make POMDPs tractable, even for large domains.

How is it done?

• Use a “Divide-and-conquer” approach:

– We decompose a large monolithic problem into a collection of loosely-related smaller problems.

[Figure: example decomposition into a Dialogue manager, Health manager, Social manager, and Reminding manager.]

Thesis statement

Decision-making under uncertainty can be made tractable for complex problems by exploiting hierarchical structure in the problem domain.

Outline

• Problem motivation

Partially observable Markov decision processes

• The hierarchical POMDP algorithm

• Proposed research

POMDPs within the family of Markov models

Control problem?   Uncertainty in sensor input? no    Uncertainty in sensor input? yes
no                 Markov Chain                       Hidden Markov Model (HMM)
yes                Markov Decision Process (MDP)      Partially Observable MDP (POMDP)

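What are POMDPs?

Components:
• Set of states: s ∈ S
• Set of actions: a ∈ A
• Set of observations: o ∈ O

POMDP parameters:
• Initial belief: b0(s) = Pr(s0 = s)
• Observation probabilities: O(s,a,o) = Pr(o | s,a)
• Transition probabilities: T(s,a,s’) = Pr(s’ | s,a)
• Rewards: R(s,a)

[Figure: three-state example. S1: Pr(o1)=0.5, Pr(o2)=0.5. S2: Pr(o1)=0.9, Pr(o2)=0.1. S3: Pr(o1)=0.2, Pr(o2)=0.8. Transitions under actions a1 and a2 carry probabilities 0.5, 0.5, and 1. Labels "HMM" and "MDP" mark parts of the diagram.]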

A POMDP example: The tiger problem

S1 “tiger-left”: Pr(o=growl-left)=0.85, Pr(o=growl-right)=0.15

S2 “tiger-right”: Pr(o=growl-left)=0.15, Pr(o=growl-right)=0.85

Actions = {listen, open-left, open-right}

Reward function:
R(a=listen) = -1
R(a=open-right, s=tiger-left) = 10
R(a=open-left, s=tiger-left) = -100
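As a rough illustration of the model on this slide, the sketch below writes the tiger problem out as plain Python data and computes the expected immediate reward of each action at a given belief. The names `REWARD` and `expected_reward` are illustrative, not from the deck. The tiger-right reward entries are assumed to mirror the tiger-left ones, which agrees with the reward table on the later "Modeling abstract actions" slide.

```python
STATES = ["tiger-left", "tiger-right"]
ACTIONS = ["listen", "open-left", "open-right"]

# R(s, a). The tiger-left entries are from the slide; the tiger-right
# entries are the mirror image (an assumption of this sketch).
REWARD = {
    ("tiger-left", "listen"): -1,
    ("tiger-left", "open-right"): 10,
    ("tiger-left", "open-left"): -100,
    ("tiger-right", "listen"): -1,
    ("tiger-right", "open-left"): 10,
    ("tiger-right", "open-right"): -100,
}


def expected_reward(p_left, action):
    """Expected immediate reward when Pr(tiger-left) = p_left."""
    belief = {"tiger-left": p_left, "tiger-right": 1.0 - p_left}
    return sum(belief[s] * REWARD[(s, action)] for s in STATES)


def myopic_action(p_left):
    """Action with the best one-step expected reward (ignores the future)."""
    return max(ACTIONS, key=lambda a: expected_reward(p_left, a))


if __name__ == "__main__":
    for p in (0.5, 0.85, 0.95):
        a = myopic_action(p)
        print(p, a, expected_reward(p, a))
```

Under these rewards, opening the right door only beats listening in one-step terms once Pr(tiger-left) exceeds 0.9, since 10p - 100(1 - p) > -1 requires p > 0.9. A full POMDP policy also accounts for future rewards, so its thresholds can differ.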

What can we do with POMDPs?

1) State tracking: (Not so hard.)
– After an action, what is the state of the world, st?

2) Computing a policy: (Very hard!)
– Which action, aj, should the controller apply next?

[Figure: timeline with three rows. World: st-1, st. Robot: action at-1, observation ot, belief bt-1 followed by "??". Control layer: "??".]

The tiger problem: State tracking

[Figure: initial belief b0 shown on the belief vector between S1 “tiger-left” and S2 “tiger-right”.]

action=listen, obs=growl-left

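[Figure: belief b0 moves to b1 on the belief vector between S1 “tiger-left” and S2 “tiger-right” after action=listen, obs=growl-left.]

$$b_1(s_j) = \frac{\sum_{s_i \in S} P(o \mid s_j, a)\, P(s_j \mid s_i, a)\, b_0(s_i)}{P(o \mid a, b_0)}$$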
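The state-tracking step can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: `belief_update` and the nested-dictionary layout for T and O are assumptions, and the listen transition model is taken to leave the tiger where it is, as in the standard tiger problem (the slides do not spell it out).

```python
def belief_update(belief, action, obs, T, O):
    """Bayes filter: b1(sj) is proportional to
    Pr(obs | sj, action) * sum_i Pr(sj | si, action) * b0(si)."""
    states = list(belief)
    unnormalized = {}
    for sj in states:
        predicted = sum(T[si][action][sj] * belief[si] for si in states)
        unnormalized[sj] = O[sj][action][obs] * predicted
    norm = sum(unnormalized.values())  # equals Pr(obs | action, b0)
    if norm == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s: v / norm for s, v in unnormalized.items()}


# Tiger problem, listen action only.
# Assumption: listening does not move the tiger (identity transitions).
T = {
    "tiger-left": {"listen": {"tiger-left": 1.0, "tiger-right": 0.0}},
    "tiger-right": {"listen": {"tiger-left": 0.0, "tiger-right": 1.0}},
}
# Observation probabilities from the tiger slide.
O = {
    "tiger-left": {"listen": {"growl-left": 0.85, "growl-right": 0.15}},
    "tiger-right": {"listen": {"growl-left": 0.15, "growl-right": 0.85}},
}

b0 = {"tiger-left": 0.5, "tiger-right": 0.5}
b1 = belief_update(b0, "listen", "growl-left", T, O)
b2 = belief_update(b1, "listen", "growl-left", T, O)

if __name__ == "__main__":
    print(b1)
    print(b2)
```

Starting from a uniform b0, one growl-left observation moves Pr(tiger-left) to 0.85, and a second one moves it to 0.7225 / 0.745, roughly 0.97.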

Policy Optimization

• Which action, aj, should the controller apply next?

– In MDPs:

• Policy is a mapping from state to action, π: si → aj

– In POMDPs:

• Policy is a mapping from belief to action, π: b → aj

• Recursively calculate expected long-term reward for each state/belief:

$$V(s_i) = \max_{a} \Big[ R(s_i, a) + \sum_{j=1}^{N} \Pr(s_j \mid s_i, a)\, V(s_j) \Big]$$

• Find the action that maximizes the expected reward:

$$\pi(s_i) = \arg\max_{a} \Big[ R(s_i, a) + \sum_{j=1}^{N} \Pr(s_j \mid s_i, a)\, V(s_j) \Big]$$
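For the fully observable (MDP) case, the recursion above can be sketched as value iteration. The two-state MDP below is a made-up toy, and the discount factor `gamma` is an assumption added so the infinite-horizon iteration converges; setting `gamma=1` and running a fixed number of sweeps would match the undiscounted form of the recursion.

```python
def q_value(V, s, a, R, T, gamma):
    """R(s, a) plus the discounted expected value of the next state."""
    return R[s][a] + gamma * sum(p * V[s2] for s2, p in T[s][a].items())


def value_iteration(states, actions, R, T, gamma=0.9, tol=1e-8):
    """Repeat the max-backup until values change by less than tol."""
    V = {s: 0.0 for s in states}
    while True:
        new_V = {
            s: max(q_value(V, s, a, R, T, gamma) for a in actions)
            for s in states
        }
        change = max(abs(new_V[s] - V[s]) for s in states)
        V = new_V
        if change < tol:
            break
    # Greedy policy: the arg max of the same backup.
    policy = {
        s: max(actions, key=lambda a: q_value(V, s, a, R, T, gamma))
        for s in states
    }
    return V, policy


# Hypothetical toy MDP (not from the slides).
STATES = ["low", "high"]
ACTIONS = ["wait", "work"]
R = {"low": {"wait": 0.0, "work": -1.0}, "high": {"wait": 1.0, "work": 0.0}}
T = {
    "low": {"wait": {"low": 1.0}, "work": {"high": 1.0}},
    "high": {"wait": {"high": 1.0}, "work": {"low": 1.0}},
}

if __name__ == "__main__":
    V, policy = value_iteration(STATES, ACTIONS, R, T)
    print(V, policy)
```

In a POMDP the same backup is taken over beliefs rather than states, which is what makes the computation so much more expensive.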

The tiger problem: Optimal policy

[Figure: optimal policy over the belief vector between S1 “tiger-left” and S2 “tiger-right”, partitioned into regions labeled open-right, listen, and open-left.]

Complexity of policy optimization

• Finite-horizon POMDPs are in the worst case doubly exponential:

Model                                   Time                                Space
POMDP (per step of value iteration)     $|S|^2 |A| |\Gamma_{n-1}|^{|O|}$    $|A| |\Gamma_{n-1}|^{|O|}$
POMDP (recursive upper bound)           $|S|^2 |A|^{|O|^n}$                 $|A|^{|O|^n}$
MDP                                     $|S|^2 |A|$                         $|S|$

• Infinite-horizon undiscounted stochastic POMDPs are EXPTIME-hard, and may not be decidable ($|\Gamma_n| \to \infty$).
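To get a feel for the doubly exponential growth, the sketch below iterates the worst-case recursion |Γn| = |A| |Γn-1|^|O| and evaluates the per-step time bound |S|² |A| |Γn-1|^|O|. Reading the space entry as the number of value-function vectors, and starting from |Γ0| = 1, are assumptions of this sketch. Function names are illustrative.

```python
def max_vectors(num_actions, num_observations, horizon):
    """Worst-case vector counts |Γ1|..|Γn|, assuming |Γ0| = 1."""
    size = 1
    sizes = []
    for _ in range(horizon):
        size = num_actions * size ** num_observations
        sizes.append(size)
    return sizes


def pomdp_step_time(num_states, num_actions, num_observations, prev_vectors):
    """Per-step bound |S|^2 |A| |Γ_{n-1}|^|O| for exact value iteration."""
    return num_states ** 2 * num_actions * prev_vectors ** num_observations


def mdp_step_time(num_states, num_actions):
    """Per-step bound |S|^2 |A| for MDP value iteration."""
    return num_states ** 2 * num_actions


if __name__ == "__main__":
    # Tiger problem sizes: |S|=2, |A|=3, |O|=2.
    print(max_vectors(3, 2, 4))
    print(pomdp_step_time(2, 3, 2, 27), mdp_step_time(2, 3))
```

Even for the tiny tiger problem the worst-case count reaches 14,348,907 vectors after four steps, while the MDP bound stays at 12 operations per step. Exact solvers prune most of these vectors in practice, so the numbers are upper bounds rather than typical costs.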

The essence of the problem

• How can we find good policies for complex POMDPs?

• Is there a principled way to provide near-optimal policies in reasonable time?

Outline

• Problem motivation

• Partially observable Markov decision processes

The hierarchical POMDP algorithm

• Proposed research

A hierarchical approach to POMDP planning

• Key Idea: Exploit hierarchical structure in the problem domain to break a problem into many “related” POMDPs.

• What type of structure?

– Action set partitioning

[Figure: action hierarchy, with internal nodes labeled as subtasks / abstract actions.]
Act
    InvestigateHealth
        CheckPulse
        CheckMeds
    Move
        Navigate
            Left, Right, Forward, Backward
        AskWhere

Assumptions

• Each POMDP controller has a subset of A0.

• Each POMDP controller has full state set S0, observation set O0.

• Each controller includes discriminative reward information.

• We are given the action set partitioning graph.

• We are given a full POMDP model of the problem: {S0, A0, O0, M0}.

The tiger problem: An action hierarchy

Pinvestigate = {S0, Ainvestigate, O0, Minvestigate}
Ainvestigate = {listen, open-right}

[Figure: action hierarchy. act → {open-left, investigate}; investigate → {open-right, listen}.]

Optimizing the “investigate” controller

[Figure: locally optimal policy over the belief vector between S1 “tiger-left” and S2 “tiger-right”, with regions labeled open-right and listen.]

The tiger problem: An action hierarchy

Pact = {S0, Aact, O0, Mact}
Aact = {open-left, investigate}

[Figure: action hierarchy. act → {open-left, investigate}; investigate → {open-right, listen}.]

But... R(s, a=investigate) is not defined!

Modeling abstract actions

Insight: Use the local policy of the corresponding low-level controller.

General form: R(si, ak) = R(si, Policy(controllerk, si))

Reward table:

              open-right   listen   open-left
tiger-left        10         -1       -100
tiger-right     -100         -1         10

Example: Policy(investigate, s=tiger-left) = open-right, so R(s=tiger-left, ak=investigate) = R(s=tiger-left, open-right) = 10.
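The general form R(si, ak) = R(si, Policy(controllerk, si)) can be sketched as a table lookup. In the sketch below, the subtask's local policy is replaced by a greedy one-step choice over its own actions; that stand-in is an assumption made for illustration, whereas the deck uses the policy obtained by optimizing the low-level controller. All names are hypothetical.

```python
# Reward table from the slide, indexed as REWARD[state][action].
REWARD = {
    "tiger-left": {"open-right": 10, "listen": -1, "open-left": -100},
    "tiger-right": {"open-right": -100, "listen": -1, "open-left": 10},
}

# Action subsets per subtask, as in the tiger action hierarchy.
SUBTASK_ACTIONS = {"investigate": ["listen", "open-right"]}


def greedy_local_policy(subtask, state):
    """Stand-in for Policy(controller, s): best immediate reward
    among the subtask's own actions (an assumption of this sketch)."""
    return max(SUBTASK_ACTIONS[subtask], key=lambda a: REWARD[state][a])


def abstract_reward(state, subtask, policy=greedy_local_policy):
    """R(s, abstract action) = R(s, Policy(subtask, s))."""
    return REWARD[state][policy(subtask, state)]


if __name__ == "__main__":
    for s in REWARD:
        print(s, greedy_local_policy("investigate", s),
              abstract_reward(s, "investigate"))
```

For s=tiger-left this reproduces the slide's example (open-right, reward 10). For s=tiger-right the greedy stand-in picks listen, giving -1.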

Optimizing the “act” controller

[Figure: locally optimal policy for the “act” controller over the belief vector between S1 “tiger-left” and S2 “tiger-right”, with regions labeled investigate and open-left.]

The complete hierarchical policy

[Figure: hierarchical policy over the belief vector between S1 “tiger-left” and S2 “tiger-right”, with regions labeled open-right, listen, and open-left.]

[Figure: the hierarchical policy shown alongside the optimal policy over the same belief vector between S1 “tiger-left” and S2 “tiger-right”, each with regions labeled open-right, listen, and open-left.]

Results for larger simulation domains

                                          POMDP      H-POMDP    MDP
Navigation Problem: |S|=11, |A|=6, |O|=6
    CPU Time (secs)                       1119.93    2.84       0.000654
    Average Reward                        12.5       12.2       0.0
Dialogue Problem: |S|=20, |A|=30, |O|=27
    CPU Time (secs)                       >24hrs     77.99      6.46
    Average Reward                        -          64.43      53.33
    % Correct actions                     -          93.2       80.0

Related work on hierarchical methods

• Hierarchical HMMs
– Fine et al., 1998.

• Hierarchical MDPs
– Dayan & Hinton, 1993; Dietterich, 1998; McGovern et al., 1998; Parr & Russell, 1998; Singh, 1992.

• Loosely-coupled MDPs
– Boutilier et al., 1997; Dean & Lin, 1995; Meuleau et al., 1998; Singh & Cohn, 1998; Wang & Mahadevan, 1999.

• Factored state POMDPs
– Boutilier et al., 1999; Boutilier & Poole, 1996; Hansen & Feng, 2000.

• Hierarchical POMDPs
– Castanon, 1997; Hernandez-Gardiol & Mahadevan, 2001; Theocharous et al., 2001; Wiering & Schmidhuber, 1997.

Outline

• Problem motivation

• Partially observable Markov decision processes

• The hierarchical POMDP algorithm

Proposed research

Proposed research

1) Algorithmic design

2) Algorithmic analysis

3) Model learning

4) System development and application

Research block #1: Algorithmic design

• Goal 1.1: Developing/implementing hierarchical POMDP algorithm.

• Goal 1.2: Extending H-POMDP for factorized state representation.

• Goal 1.3: Using state/observation abstraction.

• Goal 1.4: Planning for controllers with no local reward information.

Goal 1.3: State/observation abstraction

• Assumption #2:

“Each POMDP controller has full state set S0, and observation set O0.”

• Can we reduce the number of states/observations, |S| and |O|?

Yes! Each controller only needs a subset of the state/observation features.

• What is the computational speed-up?

[Figure: two subtasks from the action hierarchy. Navigate: Left, Right, Forward, Backward. InvestigateHealth: CheckPulse, CheckMeds.]

Time complexity:
POMDP (per step of value iteration): $|S|^2 |A| |\Gamma_{n-1}|^{|O|}$
Recursive upper bound: $|S|^2 |A|^{|O|^n}$

Goal 1.4: Local controller reward information

• Assumption #3:

“Each controller includes some amount of discriminative reward information.”

• Can we relax this assumption?

Possibly. Use reward shaping to select policy-invariant reward function.

• What is the benefit?

– H-POMDP could solve problems with sparse reward functions.
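For reference, one well-known policy-invariant form of reward shaping in the MDP setting is potential-based shaping (Ng, Harada and Russell, 1999). The slide does not say which form is intended, so the expression below is only a plausible candidate; the potential function Φ and discount γ are symbols introduced here, not taken from the deck.

```latex
% Potential-based shaping: add a shaping term F to the original reward R.
% \Phi : S \to \mathbb{R} is an arbitrary potential function (assumed here),
% \gamma is the discount factor.
R'(s, a, s') = R(s, a, s') + F(s, a, s'),
\qquad
F(s, a, s') = \gamma\, \Phi(s') - \Phi(s)
```

In fully observable MDPs this transformation leaves the optimal policy unchanged for any choice of Φ. Whether an analogous construction gives each H-POMDP controller useful local reward information is the open question this goal raises.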

Research block #2: Algorithmic analysis

• Goal 2.1: Evaluating performance of the H-POMDP algorithm.

• Goal 2.2: Quantifying the loss due to the hierarchy.

• Goal 2.3: Comparing different possible decompositions of a problem.

Goal 2.1: Performance evaluation

• How does the hierarchical POMDP algorithm compare to:

– Exact value function methods
» Sondik, 1971; Monahan, 1982; Littman, 1996; Cassandra et al., 1997.

– Policy search methods
» Hansen, 1998; Kearns et al., 1999; Ng & Jordan, 2000; Baxter & Bartlett, 2000.

– Value approximation methods
» Parr & Russell, 1995; Thrun, 2000.

– Belief approximation methods
» Nourbakhsh, 1995; Koenig & Simmons, 1996; Hauskrecht, 2000; Roy & Thrun, 2000.

– Memory-based methods
» McCallum, 1996.

• Consider problems from POMDP literature and dialogue management domain.

Goal 2.2: Quantifying the loss

• The hierarchical POMDP planning algorithm provides an approximately-optimal policy.

• How “near-optimal” is the policy?

• Subject to some (very restrictive) conditions:

“The value function of the top-level controller is an upper bound on the value of the approximation.”

$$V_{top}(b) \ge V_{actual}(b)$$

[Figure: action hierarchy with top-level action set Atop and a lower-level action set A1.]

• Can we loosen the restrictions? Tighten the bound? Find a lower bound?

Goal 2.3: Comparing different decompositions

• Assumption #4:

“We are given an action set partitioning graph.”

• What makes a good hierarchical action decomposition?

• Comparing decompositions is the first step towards automatic decomposition.

[Figure: two candidate action hierarchies over the primitive actions Manufacture, Examine, Inspect, and Replace, one using abstract actions a1 and a2, the other using a1, a2, and a3.]

Research block #3: Model learning

• Goal 3.1: Automatically generating good action hierarchies.

– Assumption #4: “We are given an action set partitioning graph.”

– Can we automatically generate a good hierarchical decomposition?

– Maybe. It is being done for hierarchical MDPs.

• Goal 3.2: Including parameter learning.

– Assumption #5: “We are given a full POMDP model of the problem.”

– Can we introduce parameter learning?

– Yes! Maximum-likelihood parameter optimization (Baum-Welch) can be used for POMDPs.

Research block #4: System development and application

• Goal 4.1: Building an extensive dialogue manager

[Figure: the Dialogue Manager linked to the User, the Robot module, the Reminding module, and the Teleoperation module. Message labels: touchscreen input, speech utterance, touchscreen message, reminder message, robot sensor readings, motion command, status information, Facemail operations, remote-control command.]

An implemented scenario

[Figure: locations Physiotherapy, Patient room, and Robot home.]

Problem size: |S|=288, |A|=14, |O|=15

State features: {RobotLocation, UserLocation, UserStatus, ReminderGoal, UserMotionGoal, UserSpeechGoal}

Test subjects: 3 elderly residents in an assisted living facility

Contributions

• Algorithmic contribution: A novel POMDP algorithm based on hierarchical structure.

Enables use of POMDPs for much larger problems.

• Application contribution: Application of POMDPs to dialogue management is novel.

Allows design of robust robot behavioural managers.

Research schedule

1) Algorithmic design/implementation: fall 01

2) Algorithmic analysis: spring/summer 02

3) Model learning: spring/summer/fall 02

4) System development and application: ongoing

5) Thesis writing: fall 02 / spring 03

Questions?

A simulated robot navigation example

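Domain size: |S|=11, |A|=6, |O|=6

[Figure: action hierarchy rooted at Act, with subtasks GetReward(t) and Navigate(t) and primitive actions ReadMap, OpenDoor, GoLeft, GoRight, GoBack, GoForward. Two ($$) markers appear in the figure.]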

A dialogue management example

Domain size: |S|=20, |A|=30, |O|=27

[Figure: action hierarchy rooted at Act, which also contains the primitive action SayTime. Subtasks and their primitive actions:]
- CheckHealth: AskHealth, OfferHelp
- Phone: AskCallWho, Call911, CallNurse, CallRelative, Verify911, VerifyNurse, VerifyRelative
- DoMeds: StartMeds, NextMeds, ForceMeds, QuitMeds
- CheckWeather: AskWeatherTime, SayCurrent, SayToday, SayTomorrow
- Move: AskGoWhere, GoToRoom, GoToKitchen, GoToFollow, VerifyRoom, VerifyKitchen, VerifyFollow
- Greet: GreetGeneral, GreetMorning, GreetNight, RespondThanks

Action hierarchy for implemented scenario

[Figure: action hierarchy rooted at Act.
Subtasks: Remind, Assist, Rest, Move, Contact, Inform.
Primitive actions: BringtoPhysio, CheckUserPresent, DeliverUser, SayWeather, VerifyRequest, SayTime, RemindPhysio, PublishStatus, RingBell, GotoRoom, VerifyBring, VerifyRelease, Recharge, GotoHome.]

Sondik’s parts manufacturing problem

[Figure: Decomposition 1 and Decomposition 2, two action hierarchies over the primitive actions Manufacture, Examine, Inspect, and Replace, one using abstract actions a1, a2, a3 and the other using a1, a2.]

+5 more decompositions

Manufacturing task results

[Figure: bar chart of Avg. Reward (axis from 0 to 0.5) by Planning Method: POMDP, D1, D2, D3, D4, D5, D6, D7, MDP.]

Using state/observation abstraction

Subtask CheckHealth
    Action set: AskHealth, OfferHelp, Phone, DoMeds
    State set: ReminderGoal={none, medsX}, CommunicationGoal={none, personX}, UserHealth={good, poor, emergency}

Subtask Phone
    Action set: AskCallWho, CallHelp, CallNurse, CallRelative, VerifyHelp, VerifyNurse, VerifyRelative
    State set: CommunicationGoal={none, nurse, 911, relative}

Related work on robot planning and control

• Manually-scripted dialogue strategies
– Denecke & Waibel, 1997; Walker et al., 1997.

• Markov decision processes (MDPs) for dialogue management
– Levin et al., 1997; Fromer, 1998; Walker et al., 1998; Goddeau & Pineau, 2000; Singh et al., 2000; Walker, 2000.

• Robot interface
– Torrance, 1996; Asoh et al., 1999.

• Classical planning
– Fikes & Nilsson, 1971; Simmons, 1987; McAllester & Rosenblitt, 1991; Penberthy & Weld, 1992; Kushmerick, 1995; Veloso et al., 1995; Smith & Weld, 1998.

• Execution architectures
– Firby, 1987; Musliner, 1993; Simmons, 1994; Bonasso & Kortenkamp, 1996;

Decision-theoretic planning models

The tiger problem: Value function solution

[Figure: value function V (axis from -100 to 40) plotted against belief (axis from 0 to 1, labeled S=tiger-left and S=tiger-right), with segments labeled open-right, listen, and open-left.]

Optimizing the “investigate” controller

[Figure: value function V (axis from -120 to 0) plotted against belief (axis from 0 to 1, labeled S=tiger-left and S=tiger-right), with segments labeled open-right and listen.]

Optimizing the “act” controller

[Figure: value function V (axis from -60 to 80) plotted against belief (axis from 0 to 1, labeled S=tiger-left and S=tiger-right), with segments labeled open-left and investigate.]