CSE 591: Human-aware Robotics - Arizona State …

CSE 591: Human-aware Robotics

Instructor: Dr. Yu (“Tony”) Zhang

Location & Times: CAVC 359, Tue/Thu, 9:00--10:15 AM
Office Hours: BYENG 558, Tue/Thu, 10:30--11:30 AM

Oct 6/Nov 1, 2016

This set of slides borrows from various online sources; it is used for educational purposes only.

Slides adapted from Pieter Abbeel (UC Berkeley)


Modeling of Humans

Behavior model

•  Goal and intent selection
•  Plan selection (informed by capabilities and influenced by mental states, etc.)

[Figure: labels "Goal" (several) and "river"]

Ø  Goal/plan recognition should be informed by the behavior model

Ø  How should we learn a behavior model?

Outline

Behavior modeling
Ø  Capability model

Goal preference
Ø  Inverse RL

•  Why IRL
•  Inverse RL vs. behavioral cloning
•  Mathematical formulation of IRL
•  Applications

[Figure: system diagram with the components Human teammate, Observations, Human modeling, Human models, Robot models, Human-aware planner, and Plan generation]

Modeling of Humans

1.   No pre-specified goals/plans
2.   Incomplete observations

Learning Challenges

Behavior model

1.   No pre-specified goals/plans
2.   Incomplete & noisy observations

[Figure: complete observations compared with actual observations, which are partial with indefinite gaps]

Capability

->: denotes an atomic state change

{has_water(AG), has_coffee_beans(AG)} -> {has_boiling_water(AG), has_coffee_beans(AG)} -> {has_boiling_water(AG), has_ground_coffee_beans(AG)} -> {has_coffee(AG)}

We start with an incomplete representation.

§  DEFINITION (CAPABILITY). Given an agent, a capability is a mapping over a pair of partial states, which is an assertion about the probability of the existence of a plan, in at most T atomic state changes, that can connect the two states.

T: bound on the gaps between observations

When T = 2:
has_water(AG) => has_ground_coffee_beans(AG)
has_boiling_water(AG) => has_coffee(AG)
…

When T = 3 (including all capabilities when T = 2):
has_water(AG) => has_coffee(AG)
…
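The coffee example can be reproduced with a short sketch. It assumes a fully observed trace and, as a simplification that is not on the slide, treats a capability as a pair of single propositions: one from an earlier state and one from a state at most T atomic state changes later. Probabilities are ignored, and the function name `capabilities_within` is hypothetical.

```python
from itertools import product

# Complete trace from the coffee example; each state is a set of propositions.
TRACE = [
    {"has_water", "has_coffee_beans"},
    {"has_boiling_water", "has_coffee_beans"},
    {"has_boiling_water", "has_ground_coffee_beans"},
    {"has_coffee"},
]

def capabilities_within(trace, t_max):
    """Return pairs (p, q): p holds in some state and q holds in a state
    reached after 1..t_max atomic state changes (simplified, no probabilities)."""
    pairs = set()
    for i, start in enumerate(trace):
        for end in trace[i + 1 : i + 1 + t_max]:
            pairs.update(product(start, end))
    return pairs

caps2 = capabilities_within(TRACE, 2)
caps3 = capabilities_within(TRACE, 3)
print(("has_water", "has_ground_coffee_beans") in caps2)  # True
print(("has_water", "has_coffee") in caps2)               # False
print(("has_water", "has_coffee") in caps3)               # True
```

Every pair found for T = 2 is also found for T = 3, which matches the note that the T = 3 set includes all capabilities for T = 2.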

Capability Model

A capability model encodes all capabilities for a given T.

[Figure: T-gap capability model with synchronic links and diachronic links]

Capability Model & Encoded Capabilities

A capability: sI => sE
A conditional probability (specified by a partial initial and eventual state)

T-gap capability model:
Joint distribution over T

A capability model encodes the following distributions:

[Figure: equations for the encoded distributions]
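To make "a conditional probability specified by a partial initial and eventual state" concrete, the sketch below computes P(sE | sI) from a hand-made joint table over two binary variables. The numbers and variable meanings are invented for illustration, and the table only stands in for the T-gap capability model from the slides.

```python
# Hypothetical joint distribution over (has_water at the start,
# has_coffee within T steps). The probabilities are made up and sum to 1.
JOINT = {
    (True, True): 0.42,
    (True, False): 0.18,
    (False, True): 0.04,
    (False, False): 0.36,
}

def capability(joint, initial, eventual):
    """P(eventual | initial) = P(initial, eventual) / P(initial)."""
    p_initial = sum(p for (s_i, _), p in joint.items() if s_i == initial)
    if p_initial == 0:
        raise ValueError("initial state has zero probability")
    return joint[(initial, eventual)] / p_initial

p = capability(JOINT, initial=True, eventual=True)
print(round(p, 2))  # 0.7
```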

Learning Capability Models

§  Learning model structure
Causal relationships (diachronic links); variable correlations (synchronic links)

§  Learning model parameters
Conditional probabilities

Learning from (gap-bounded) plan traces

Parameter Learning

Learning from incomplete traces

Learning samples

Apply Bayesian learning (assuming beta distributions).

We assume that the maximum number of missing state observations between any two observations in the partial plan trace is upper bounded by T.

DEFINITION (T-GAP PARTIAL PLAN TRACE). A T-gap partial plan trace is a partial plan trace in which all k_i (i = 1, 2, …) <= T.
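A minimal sketch of the two ideas on this slide, under stated assumptions: a partial trace is a list in which `None` marks a missing state observation, and the probability of a single capability gets a Beta(alpha, beta) prior that is updated by counting traces in which the eventual state was or was not reached. The learning equations from the slides are not reproduced here, and all function names are hypothetical.

```python
def max_gap(trace):
    """Longest run of missing observations (None) between two observed states."""
    longest = run = 0
    seen_first = False
    for state in trace:
        if state is None:
            if seen_first:
                run += 1
        else:
            longest = max(longest, run)
            run = 0
            seen_first = True
    return longest

def is_t_gap(trace, t_max):
    """True if no gap between observed states exceeds t_max."""
    return max_gap(trace) <= t_max

def beta_update(alpha, beta, successes, failures):
    """Posterior Beta parameters after observing Bernoulli outcomes."""
    return alpha + successes, beta + failures

def beta_mean(alpha, beta):
    """Posterior mean estimate of the capability probability."""
    return alpha / (alpha + beta)

trace = [{"has_water"}, None, None, {"has_coffee"}]
print(is_t_gap(trace, 2))  # True
print(is_t_gap(trace, 1))  # False

# Uniform prior Beta(1, 1), then 7 traces reached the eventual state and 1 did not.
alpha, beta = beta_update(1, 1, successes=7, failures=1)
print(beta_mean(alpha, beta))  # 0.8
```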

Using Capability Models

§  Robot can predict the human’s next action outcomes

State prediction (goal recognition)

Proactive assistance (to increase goal success probability)

§  Robot can reason about how likely it is that a task can be achieved by the human
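One way to picture these uses: given the currently observed partial state, rank candidate eventual states by their capability probability, and flag the unlikely ones as candidates for assistance. The probability table, the extra goals `has_tea` and `has_soup`, and the 0.5 threshold are all invented for this sketch; a learned capability model would supply the real numbers.

```python
# Hypothetical capability probabilities P(eventual | initial) for one human.
CAPABILITY = {
    ("has_water", "has_coffee"): 0.7,
    ("has_water", "has_tea"): 0.2,
    ("has_water", "has_soup"): 0.1,
}

def rank_goals(observed, table):
    """Candidate eventual states reachable from `observed`, most likely first."""
    candidates = [(goal, p) for (start, goal), p in table.items() if start == observed]
    return sorted(candidates, key=lambda item: item[1], reverse=True)

def needs_assistance(probability, threshold=0.5):
    """Flag tasks the human is unlikely to achieve alone (threshold is arbitrary)."""
    return probability < threshold

ranking = rank_goals("has_water", CAPABILITY)
print(ranking[0])  # ('has_coffee', 0.7)
print([goal for goal, p in ranking if needs_assistance(p)])  # ['has_tea', 'has_soup']
```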


Modeling of Humans

[Figure: several locations labeled "Goal"]

Safety? Time? Comfort? Waiting time? Speed?


Markov Decision Process

Reward: R(s)
Decaying factor: γ
Policy: π
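To tie the reward, the decaying (discount) factor, and the policy together, here is a minimal value-iteration sketch on a made-up three-state chain with deterministic transitions. The states, actions, rewards, and the discount value 0.9 are assumptions for illustration; only the three named quantities come from the slide.

```python
GAMMA = 0.9                         # decaying (discount) factor, assumed value
STATES = [0, 1, 2]
ACTIONS = ["left", "right"]
REWARD = {0: 0.0, 1: 0.0, 2: 1.0}   # R(s), assumed values

def step(state, action):
    """Deterministic chain: move one cell, clipped at both ends."""
    move = 1 if action == "right" else -1
    return min(max(state + move, 0), len(STATES) - 1)

def value_iteration(tol=1e-8):
    """Iterate V(s) = R(s) + GAMMA * max_a V(step(s, a)) until it stops changing."""
    values = {s: 0.0 for s in STATES}
    while True:
        new = {s: REWARD[s] + GAMMA * max(values[step(s, a)] for a in ACTIONS)
               for s in STATES}
        if max(abs(new[s] - values[s]) for s in STATES) < tol:
            return new
        values = new

def greedy_policy(values):
    """pi(s): the action leading to the highest-valued next state."""
    return {s: max(ACTIONS, key=lambda a: values[step(s, a)]) for s in STATES}

V = value_iteration()
pi = greedy_policy(V)
print(pi)  # {0: 'right', 1: 'right', 2: 'right'}
```

In this forward direction the reward is given and the policy is computed from it. Inverse RL goes the other way: it starts from observed behavior and tries to recover a reward that explains it.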

Directly compute a policy!

[Abbeel & Ng, 2004]
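The cited paper, Abbeel & Ng (2004), assumes a reward that is linear in state features, R(s) = w · φ(s), and compares policies through their discounted feature expectations, μ(π) = E[Σ_t γ^t φ(s_t)]. As a hedged illustration, and not a reconstruction of the missing slide content, the sketch below estimates μ from demonstration trajectories. The one-hot features, the trajectories, the weights, and γ = 0.5 are invented.

```python
def feature_expectations(trajectories, phi, gamma):
    """Average over trajectories of sum_t gamma**t * phi(s_t)."""
    dim = len(phi(trajectories[0][0]))
    total = [0.0] * dim
    for trajectory in trajectories:
        for t, state in enumerate(trajectory):
            for k, value in enumerate(phi(state)):
                total[k] += (gamma ** t) * value
    return [x / len(trajectories) for x in total]

def phi(state):
    """Hypothetical one-hot features over three states."""
    return [1.0 if state == k else 0.0 for k in range(3)]

# Two identical made-up expert demonstrations visiting states 0, 1, 2.
demos = [[0, 1, 2], [0, 1, 2]]
mu_expert = feature_expectations(demos, phi, gamma=0.5)
print(mu_expert)  # [1.0, 0.5, 0.25]

# With a linear reward, the expected discounted return is w . mu.
w = [0.0, 0.0, 1.0]
print(sum(wk * mk for wk, mk in zip(w, mu_expert)))  # 0.25
```

Under the linear-reward assumption, a policy whose feature expectations are close to the expert's also has a return close to the expert's for any bounded w, so the search can target a policy directly.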
