Towards Collaboration in Human Robot Interaction
Alan C. Schultz
Navy Center for Applied Research in Artificial Intelligence
Naval Research Laboratory
Entering Golden Age of HRI?
• ACM/IEEE Second International Conference on Human-Robot Interaction
– Washington DC, March 9-11, 2007
– HRI 2007 Young Researchers Workshop, March 8
– Single track, highly multi-disciplinary
– 2008 will be in Europe
• HRI Journal being created
• New funding sources
– NSF
– ONR, ARO and AFOSR MURI Topics
Silver
Need for Dynamic Autonomy and Mixed Initiative
• State of the practice for most unmanned vehicles
– Tele-operation, fly to waypoints
– Currently many operators required per vehicle
– The human acts as operator
• Desire human as high-level supervisor
– Goal: human as supervisor, collaborator, peer
Levels of Autonomy (Sheridan, 1992)
1 Computer offers no assistance; human does it all
2 The computer offers a complete set of action alternatives
3 The computer narrows the selection down to a few choices
4 The computer suggests a single action
5 The computer executes that action if human approves
6 Allows the human limited time to veto before automatic execution
7 Executes automatically then necessarily informs the human
8 Informs human after automatic execution only if human asks
9 Informs human after automatic execution only if it decides to
10 Computer decides everything and acts autonomously, ignoring the human
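The ten levels above can be written down as an ordered enumeration, which also makes it easy to move along the scale at run time. This is only an illustrative sketch: the member names and the helpers `human_can_block_action` and `adjust` are assumptions made for this example and are not part of Sheridan's scale.

```python
from enum import IntEnum


class AutonomyLevel(IntEnum):
    """Sheridan's (1992) ten levels; the member names are illustrative."""
    NO_ASSISTANCE = 1
    OFFERS_ALL_ALTERNATIVES = 2
    NARROWS_ALTERNATIVES = 3
    SUGGESTS_ONE_ACTION = 4
    EXECUTES_IF_APPROVED = 5
    VETO_WINDOW = 6
    EXECUTES_THEN_INFORMS = 7
    INFORMS_IF_ASKED = 8
    INFORMS_IF_IT_DECIDES = 9
    FULLY_AUTONOMOUS = 10


def human_can_block_action(level: AutonomyLevel) -> bool:
    """True while the human still selects, approves, or can veto (levels 1-6)."""
    return level <= AutonomyLevel.VETO_WINDOW


def adjust(level: AutonomyLevel, delta: int) -> AutonomyLevel:
    """Move up or down the scale, clamped to the valid range 1..10."""
    return AutonomyLevel(max(1, min(10, level + delta)))
```

For example, `adjust(AutonomyLevel.SUGGESTS_ONE_ACTION, 2)` moves from level 4 to level 6, where the human still has a veto window.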
A Different Scale
• A human user or the autonomous system itself may adjust the system's "level of autonomy" as required by the current situation; dynamic autonomy
• Mixed initiative!
Thinking about the vehicle Thinking about the task
Consider User Roles (Scholtz, 2000)
• Need to consider the relationship between the human and the robot
– Operator: more traditional relationship between humans and robots
– User: No longer a “roboticist,” but trained to use robot
– Supervisor: Controlling a team; make requests of assets and performance of tasks
– Peer to peer: A fellow team member
– Bystanders: Other people in the environment
• Note the possible lack of proper expectations and common ground
Beyond Direct Interaction
Human-robot interaction actually goes beyond direct interaction
[Diagram: several humans and robots and the interaction links among them]
Direct interaction, the obvious one!
Requirements Beyond Direct Interaction
Robot needs to understand interactions of others in the environment for context
May be important for human’s situational awareness and context
Robot-to-robot links: these are not important. Robots can speak “bits.”
Issues of Trust
• Systems must be accepted
Slide from Army briefing on robots for field medicine
What are the killer apps?
• Homeland security
• Healthcare and medicine
– Interfaces for medical robots, patient care
• Logistics
• Entertainment
• Transportation
• …..
• Key point is that it is cross cutting technology
– If you believe robots will become ubiquitous…
– Avoid typical problem of the hardware coming first
The Scope of HRI
• Communication
– Speech, natural language
– Gestures
  • Hands, face, body
  • Natural, symbolic, beat…
  • Sketches, drawing
  • Speech acts, dialog
• Much more needed
– Common ground, ontology
– Cognitive skills, perceptual/motor skills
– Intentionality, anticipation…
– Social skills…
Dr. Nokemoff, can I offer you a ride home?
Peer-to-peer Collaboration in Human-Robot Teams
• Cognitive skills -- a critical component
• Approach: to be informed from cognitive psychology
– Study collaboration in teams of humans
– Determine important high-level cognitive skills
– Build computational cognitive models of these skills
– Use computational models as reasoning mechanism on robot for high-level cognition
Cognitive Science as Enabler: Cognitive Robotics
• Hypothesis:
– A system using human-like representations and processes will enable better collaboration
  • Similar representations and reasoning mechanisms make it easier for humans to work with the system; more compatible
– For close collaboration, systems should act “naturally”
  • i.e. not do something or say something in a way that detracts from the interaction with the human
• Robot should accommodate humans
Hide and Seek (Trafton & Schultz; 2004, 2006)
• Lots of knowledge about space required
• A “good” hider needs visual, spatial perspective taking to find the good hiding places (large amount of spatial knowledge needed)
Development of Perspective-Taking
• Children start developing (very very basic) perspective-taking ability around age 3-4
– Huttenlocher & Presson, 1979; Newcombe & Huttenlocher, 1992; Wallace, Alan, & Tribol, 2001
• In general, 3-4 year old children do not have a particularly well developed sense of perspective taking
Case Study: Hide and Seek Age 3½
Game Number   Hiding Location            Hiding Type
1             eyes-closed                can't see me if I can't see you
2             out-in-open                understanding rules of game
(suggestion: don't hide out in the open)
3             under piano                under
4             in laundry room            containment (room)
(break)
5             under piano                under
6             in laundry room            containment (room)
7             in bathroom                containment (room)
8             in her room                containment (room)
9             under chair                under
10            behind bedroom door        containment or behind
11            under chair                under
12            under covers               under or containment
13            under covers               under or containment
14            in bathroom                containment
15            under glass coffee table   under
Elena did not have perspective taking ability
– Left/right errors
– Play hide and seek by learning pertinent qualitative features of objects
– Construct knowledge about hiding that is object-specific
Case Study 2: Hide and Seek Age 5½
Game Number   Hiding Location                                        Hiding Type
1             Behind stuffed animals                                 Behind
2             Behind boxes                                           Behind
3             Inside her closet                                      Containment (room)
4             Behind a table (moving to keep away from It’s view)    Behind
5             Underneath a chair                                     Under
6             Behind a chair                                         Behind
7             Behind a bassinet                                      Behind
8             Under a table                                          Under
9             Behind a chair (moving to keep away from It’s view)    Behind
10            Behind bedroom door                                    Containment or behind
Elena has perspective taking ability
– Left/right question
Was able to take seeker’s perspective, more than at 3½: χ²(1) = 51.5, p < .001
Hide and Seek Cognitive Model
• Created cognitive model of Elena learning to play hide and seek using ACT-R (Anderson, et al 93, 95, 98, 05)
• Correctly models Elena’s behavior at 3½ years of age
• Learns and refines hiding behavior based on interactions with “teacher”
– Learns production strength based on success and failure of hiding behavior
– Learns ontological or schematic knowledge about hiding
• It’s bad to hide behind something that’s clear
• It’s good to hide behind something that is big enough
• Knows about the relative location of objects (behind, in front of); adds knowledge about relationships. Model only has a syntactic notion of spatial relationships
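The learning loop described above can be sketched in a few lines. This is a deliberately simplified illustration and not the ACT-R model itself: rule strength is estimated as a plain success ratio, and the names `HidingRule`, `is_good_cover`, and `choose_rule`, along with the object fields `transparent` and `big_enough`, are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class HidingRule:
    """One hiding strategy, e.g. 'under' or 'behind' (illustrative names)."""
    name: str
    successes: int = 1  # start at 1/1 so untried rules have a neutral strength
    failures: int = 1

    @property
    def strength(self) -> float:
        """Estimated chance that this rule yields a good hiding place."""
        return self.successes / (self.successes + self.failures)

    def update(self, succeeded: bool) -> None:
        """Record the teacher's feedback after one game."""
        if succeeded:
            self.successes += 1
        else:
            self.failures += 1


def is_good_cover(obj: Dict[str, bool]) -> bool:
    """Object-specific knowledge: cover must not be clear and must be big enough."""
    return not obj["transparent"] and obj["big_enough"]


def choose_rule(rules: List[HidingRule]) -> HidingRule:
    """Pick the currently strongest rule (ties go to the earliest in the list)."""
    return max(rules, key=lambda r: r.strength)
```

After one failed game "out in the open" and one successful game "under" something, `choose_rule` prefers the "under" rule, and `is_good_cover` rejects a glass coffee table because it is transparent.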
Hybrid Cognitive/Reactive Architecture: Robot Hide and Seek
Uses the cognitive model of hiding (after learning) to reason about what makes a good hiding place, in order to seek.
Computational cognitive model of hiding makes deliberative (high-level cognitive) decisions. Models learning.
• Reactive layer of hybrid model for mobility and sensor processing
Summary of Hide and Seek
• Children learn pertinent qualitative features of objects in order to play hide and seek
• Developed computational cognitive model of hide and seek, but without visual, spatial perspective taking ability
• Came to understand the importance of visual, spatial perspective taking.
How important is perspective taking? (Trafton et al., 2005)
• Analyzed a corpus of NASA training tapes
– Space Station Mission 9A
– Two astronauts working in full suits in neutral-buoyancy facility. Third, remote person participates.
– Standard protocol analysis techniques; transcribed 8 hours of utterances and gestures (~4000 instances)
  • Use of spatial language (up, down, forward, in between, my left, etc) and commands
– Research questions:
  • What frames of reference are used?
  • How often do people switch frames of reference?
  • How often do people take another person’s perspective?
Spatial language in space: Results
Frame of Reference   Example                      % Utterances
Exocentric           Go straight zenith (“up”)    7%
Egocentric           Turn to my left              15%
Addressee-Centered   Turn to your left            10%
Deictic              Put it over there [Points]   5%
Object-centered      Put it on top of the box     63%
• How frequently do people switch their frame of reference?
– 45% of the time (Consistent with Franklin, Tversky, & Coon, 1992)
• How often do people take other people’s perspective (or force others to take theirs)?
– 25% of the time
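Switching between an egocentric and an addressee-centered frame amounts to re-expressing a direction relative to a different heading. The sketch below assumes a flat 2D world with headings in degrees measured counterclockwise; the function names are hypothetical and only illustrate the idea.

```python
def left_of(heading_deg: float) -> float:
    """World-frame bearing (degrees, counterclockwise) of an agent's left side."""
    return (heading_deg + 90.0) % 360.0


def in_agent_frame(world_bearing_deg: float, heading_deg: float) -> str:
    """Name a world-frame bearing relative to an agent facing heading_deg."""
    rel = (world_bearing_deg - heading_deg) % 360.0
    names = ["ahead", "left", "behind", "right"]  # centred on 0, 90, 180, 270
    # Shift by 45 degrees so each name covers a 90-degree sector around its centre.
    return names[int(((rel + 45.0) % 360.0) // 90.0)]
```

If the speaker faces 0° and the addressee faces 180° (the two face each other), then "my left" has world bearing `left_of(0.0)`, and `in_agent_frame(left_of(0.0), 180.0)` names it "right" from the addressee's point of view.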
Perspective Taking and Changing Frames of Reference
• Notice the mixing of perspectives: exocentric (down), object-centered (down under the rail), addressee-centered (right hand), and exocentric again (nadir) all in one instruction!
• Notice the “new” term developed collaboratively: mystery hand rail
Bob, if you come straight down from where you are, uh, and uh kind of peek down under the rail on the nadir side, by your right hand, almost straight nadir, you should see the uh…
Perspective Taking
• Perspective taking is critical for collaboration.
• How do we model it? (ACT-R, Polyscheme…)
• I’ll show several demos that show our current progress on spatial perspective taking
• But first a scenario:
“Please hand me the wrench”
Perspective Taking in Human Interactions
• How do people usually resolve ambiguous references that involve different spatial perspectives? (Clark, 96)
– Principle of least effort (which implies least joint effort)
  • All things being equal, agents try to minimize their effort
– Principle of joint salience
  • The ideal solution to a coordination problem among two or more agents is the solution that is the most salient, prominent, or conspicuous with respect to their current common ground.
• In less simple contexts, agents may have to work harder to resolve ambiguous references
• Acceptance: if robot always asked it would be annoying
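For the “Please hand me the wrench” scenario, one way to apply these principles is to narrow the candidates to what the speaker can see and to ask only when that still leaves a choice. The sketch below is an assumption-laden illustration, not the ACT-R/S or Polyscheme implementation: visibility is given as data rather than computed, and `WorldObject` and `resolve` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class WorldObject:
    name: str
    kind: str
    visible_to: Set[str] = field(default_factory=set)  # agents that can see it


def resolve(kind: str, speaker: str, objects: List[WorldObject]) -> Optional[WorldObject]:
    """Return the referent of 'the <kind>', or None if the robot should ask."""
    candidates = [o for o in objects if o.kind == kind]
    if len(candidates) == 1:
        return candidates[0]
    # Take the speaker's perspective: they most likely mean something they can see.
    seen = [o for o in candidates if speaker in o.visible_to]
    if len(seen) == 1:
        return seen[0]
    return None  # still ambiguous, or nothing matches: ask for clarification
```

With two wrenches of which the human can see only one, `resolve("wrench", "human", objects)` returns that wrench without a question; if both are visible to the human, it returns `None` and the robot would have to ask.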
Configural: navigation
Focal: object identification
Manipulative: grasping & tracking
Perspective Taking: A tale of two systems
• ACT-R/S (Schunn & Harrison, 2001)
– Perspective-taking system using ACT-R/S is described in Hiatt et al. 2003
  • Three Integrated VisuoSpatial buffers
  • Focal: Object ID; non-metric geon parts
  • Manipulative: grasping/tracking; metric geons
  • Configural: navigation; bounding boxes
• Polyscheme (Cassimatis)
– Computational Cognitive Architecture where:
  • Mental Simulation is the primitive
  • Many AI methods are integrated
– Perspective-taking using Polyscheme is described in Trafton et al., 2005
Robot Perspective Taking
Human can see one cone
Robot can sense two cones
(Fong et al., 06)
Dereferencing “last”: Temporal Perspective
• Tracks object of dialog and tries to understand what “last” references
• Notice difference between what these reference:
– “tighten the last bolt” (4th bolt)
– “loosen the last bolt” (3rd bolt)
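One reading of this example is that the verb constrains which bolts “last” can refer to. The sketch below assumes four bolts of which the first three have already been tightened; that setup, and the class `BoltTracker`, are assumptions for illustration rather than details taken from the demo.

```python
from typing import List, Optional


class BoltTracker:
    """Tracks which bolts have been tightened, in order (hypothetical helper)."""

    def __init__(self, bolts: List[str]) -> None:
        self.bolts = list(bolts)
        self.tightened: List[str] = []  # most recently tightened at the end

    def tighten(self, bolt: str) -> None:
        self.tightened.append(bolt)

    def resolve_last(self, verb: str) -> Optional[str]:
        """Resolve 'the last bolt' for a given verb, or None if unclear."""
        if verb == "loosen":
            # Only tightened bolts can be loosened: take the most recent one.
            return self.tightened[-1] if self.tightened else None
        if verb == "tighten":
            # Only untightened bolts can be tightened: take the final one left.
            remaining = [b for b in self.bolts if b not in self.tightened]
            return remaining[-1] if remaining else None
        return None
```

Under that assumed setup, “tighten the last bolt” resolves to the fourth bolt and “loosen the last bolt” to the third.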
Acoustic Perspective: Adapting to the Acoustic Environment
• Adjust word usage depending on noise levels
– Use smaller words with higher recognition rates
– Ask questions to verify understanding
– Repeat yourself
• Change the quality of the speech sounds
– Adapt voice volume and pitch to overcome local noise levels (Lombard Speech)
– Emphasize difficult words
– Don’t talk during very loud noises
• Reposition oneself
– Vary the proximity to the listener
– Face the listener as much as possible
– Move to a different location
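These adaptations could be selected by a simple policy keyed on measured ambient noise. The sketch below is only a rough illustration: the decibel thresholds are placeholder values chosen for the example, and the action names are hypothetical labels for the strategies listed above.

```python
from typing import List


def speech_plan(noise_db: float, quiet_db: float = 50.0, loud_db: float = 85.0) -> List[str]:
    """Choose adaptations for the ambient noise level; thresholds are placeholders."""
    if noise_db >= loud_db:
        # Too loud to be understood: wait, or go somewhere quieter or closer.
        return ["pause_until_quieter", "move_closer_or_relocate"]
    if noise_db > quiet_db:
        # Moderate noise: change how things are said and verify they were heard.
        return ["raise_volume", "use_short_words", "confirm_understanding", "face_listener"]
    return []  # quiet enough: speak normally
```

A real system would likely tune the thresholds per environment and per listener instead of using fixed constants.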
Summary
• Having similar or compatible representation and reasoning as a human facilitates human-robot collaboration
• We’ve demonstrated computational cognitive models of high-level human cognitive skills as reasoning mechanisms for robots
• Open questions:
– Scale up; combining many such skills
– What are the other important skills?
– Which skills are built upon others?
Cognitive Skills
• Spatial reasoning
– What can teammates see from their locations?
– Communicate with teammates from their point of view
• Focus of Attention
– Dealing with the flood of information in its environment
  • Robot’s ability to attend to most appropriate events
  • Reasoning about focus of attention of fellow teammates
  • Joint attention
• Temporal reasoning
– Predicting how long actions of team members will take
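The question of what a teammate can see from their location can be approximated, very roughly, by a field-of-view test. The sketch below assumes a flat 2D world with no occlusion; the function `can_see` and its default field of view and range are illustrative assumptions, not values from the talk.

```python
import math
from typing import Tuple


def can_see(observer_xy: Tuple[float, float], heading_deg: float,
            target_xy: Tuple[float, float],
            fov_deg: float = 120.0, max_range: float = 10.0) -> bool:
    """True if the target is within the observer's field of view and range."""
    dx = target_xy[0] - observer_xy[0]
    dy = target_xy[1] - observer_xy[1]
    if math.hypot(dx, dy) > max_range:
        return False
    bearing = math.degrees(math.atan2(dy, dx))
    # Smallest signed angle between bearing and heading, in [-180, 180).
    off_axis = (bearing - heading_deg + 180.0) % 360.0 - 180.0
    return abs(off_axis) <= fov_deg / 2.0
```

A robot could call this with a teammate's estimated pose to decide whether an object needs to be described or can simply be pointed out.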
Cognitive Skills
• Intentionality
– Joint intention theory
– Working together to achieve goal
• Anticipation
– Modeling of others to understand their intent and to anticipate near future actions of teammates and adversary
– What does a person need and why?
– What will he expect of me?