43
Paul Fitzpatrick lbr-vision ± Face to Face Face to Face ± Robot vision in social settings

Social Settings

  • Upload
    vny

  • View
    219

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 1/43

Paul Fitzpatrick lbr-vision

±± Face to FaceFace to Face ±±

Robot vision in social settings

Page 2: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 2/43

Paul Fitzpatrick lbr-vision

R obot vision in social settings

Humans (and robots) recover infor mation aboutobjects from the light they reflect

Human head and eye movements give clues toattention and motivation

In a social context, people constantly read eachother¶s actions for these clues

Anthropomorphic robots can partak e in thisimplicit communication, giving smooth andintuitive inter action

Page 3: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 3/43

Paul Fitzpatrick lbr-vision

Humanoid robotics at MIT

Page 4: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 4/43

Paul Fitzpatrick lbr-vision

Eye tilt

Left eye panRight eye pan

Camera withwide field of 

view

Camera with

narrow field of 

view

Kism

et

Page 5: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 5/43

Paul Fitzpatrick lbr-vision

Kism

et

Built by Cynthia Breazeal to explore

expressive social exchange between humans

and robots

± Facial and vocal expression

± Vision-mediated inter action (collabor ation with

Brian Scassellati, Paul Fitzpatrick )± Auditory-mediated inter action (collabor ation

with Lijin Aryananda)

Page 6: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 6/43

Paul Fitzpatrick lbr-vision

Vision-m

ediated interaction

Visual attention

± Driven by need for high-resolution view of a particular 

object, for example, to find eyes on a f ace± Mar k s the object around which behavior is organized

± Manipulating attention is a powerf ul way to influence

behavior 

Pattern of eye/head movement± Gives insight into level of engagement

Page 7: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 7/43

Paul Fitzpatrick lbr-vision

Expressing visual attention

Attention can be deduced from behavior 

Or can be expressed more directly

Page 8: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 8/43

Paul Fitzpatrick lbr-vision

Building an attention syste

m

Current inputPre-attentive filters

Saliency map Tracker  

Eye movementBehavior system

Modulator y influence

Primar y information flow

Page 9: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 9/43

Paul Fitzpatrick lbr-vision

Skin tone Color Motion Habituation

Weighted

by behavioral

relevance

Pre-attentive filters

Initiating visual attention

Current input Saliency map

Page 10: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 10/43

Paul Fitzpatrick lbr-vision

Example filter ± skin tone

Page 11: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 11/43

Paul Fitzpatrick lbr-vision

Image pixel p(r,g,b) is NOT considered sk in tone if:

r < 1.1vg (red component f ails to dominate green sufficiently)

r < 0.9 v b (red component is excessively dominated by blue)

r > 2.0 v max(g,b) (red component completely dominates)

r < 20 (red component too low to give good estimate of r atios)

r > 250 (too satur ated to give good estimate of r atios)

Lots of things that are not sk in pass these tests

But lots and lots of things that are not sk in f ail

Skin tone filter ± details

Page 12: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 12/43

Paul Fitzpatrick lbr-vision

high sk in gain,

low color saliency gain

Look ing time ± 

86% f ace, 14% block 

Look ing time ± 

28% f ace, 72% block 

low sk in gain,

high color saliency gain

Modulating vis

ual attention

Page 13: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 13/43

Paul Fitzpatrick lbr-vision

Manipulating vis

ual attention

Page 14: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 14/43

Paul Fitzpatrick lbr-vision

slipped« «recovered

Maintaining visual attention

Page 15: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 15/43

Paul Fitzpatrick lbr-vision

Persistence of attention

Want attention to be responsive to changing

environment

Want attention to be persistent enough to

per mit coherent behavior 

Tr ade-off between persistence and

responsiveness needs to be dynamic

Page 16: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 16/43

Paul Fitzpatrick lbr-vision

Influences on attention

Current inputPre-attentive filters

Saliency map Tracker  

Eye movementBehavior system

Modulator y influence

Primar y information flow

Page 17: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 17/43

Paul Fitzpatrick lbr-vision

Vergence

angle

Left eye

Right eye

Ballistic saccade

to new target

Smooth pursuit andvergence co-operate

to track object

Eyem

ovem

ent

Page 18: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 18/43

Paul Fitzpatrick lbr-vision

Eye/neckm

otor control

Neck  movements combine :-

± Attention-driven orientation shifts

± Affect-driven postur al shifts

± Fixed action patterns

Eye movements combine :-

± Attention-driven orientation shifts

± Turn-tak ing cues

Page 19: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 19/43

Paul Fitzpatrick lbr-vision

Social a

mplification of 

m

otor acts Active vision involves choosing a robot¶s pose to

f acilitate visual perception.

Focus has been on immediate physicalconsequences of pose.

For anthropomorphic head, active vision str ategiescan be ³read´ by a human, assigned an intent

whichma

y then be com

pleted beyond the robot¶simmediate physical capabilities.

Robot¶s pose has communicative value, to whichhuman responds.

Page 20: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 20/43

Comfortable

interaction distance

Too close ±

withdrawal

response

Too far ±

calling

behavior 

Person draws

closer 

Person

backs off 

Comfortable interaction

speed 

Too fast ± irritation

responseToo fast,

Too close ±

threat response

Beyond 

sensor 

range

Page 21: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 21/43

Paul Fitzpatrick lbr-vision

Video: Withdrawal response

Page 22: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 22/43

Paul Fitzpatrick lbr-vision

Eye tilt

Left eye panRight eye pan

Camera withwide field of 

view

Camera with

narrow field of 

view

Kism

et¶s cam

eras

Page 23: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 23/43

Paul Fitzpatrick lbr-vision

Simplest camera configuration

Single camer a± Multiple camer a systems require caref ul calibr ation for cross-

camer a correspondence

Wide field of view± Don¶t k now where to look beforehand

Moving infrequently relative to the r ate of visualprocessing

± Ego-motion complicates visual processing

Page 24: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 24/43

Skin

Detector 

Color 

Detector 

Motion

Detector 

Face

Detector 

Wide

Frame Grabber 

MotionControl

Daemon

Right

Frame Grabber 

Left

Frame Grabber 

Right Foveal

Camera

Wide

CameraLeft Foveal

Camera

Eye-Neck

Motors

W W W W

Attention

Wide

Tracker 

Foveal

Disparity

Smooth Pursuit

& Vergence

w/ neck comp.

VOR

Saccade

w/ neck

comp.

Fixed Action

Pattern

Affective

Postural Shiftsw/ gaze comp.

Arbitor 

Eye-Head-Neck Control

DisparityBallistic movement Locus of attentionBehaviors

Motivations

5,  5,  5. ..

U, d U d U2

s s s

U, d U d U2

p p p

U,

 d U d U

2

f f f  U,

 d U d U

2

v v v

Wide

Camera 2

Tracked

target

Salient target

Eye

finder 

Left

Frame Grabber 

Distance

to target

Page 25: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 25/43

Paul Fitzpatrick lbr-vision

Missing components

High acuity vision ± for example, to find

eyes within a f ace

± Need camer as that sample a narrow field of 

view at high resolution

Binocular view, for stereoscopic vision

± Need paired camer as± May need wide or narrow fields of view,

depending on application

Page 26: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 26/43

Paul Fitzpatrick lbr-vision

Missing: high acuity vision

Typical visual task s require both high acuity and a 

wide field of view

High acuity is needed for recognition task s and for controlling precise visually guided motor 

movements

A wide field of view is needed for search task s,

for tr ack ing multiple objects, compensating for 

involuntary ego-motion, etc.

Page 27: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 27/43

Paul Fitzpatrick lbr-vision

Biological solution

A common tr ade-off found in biological systems

is to sample part of the visual field at a high

enough resolution to support the first set of task s,and to sample the rest of the field at an adequate

level to support the second set.

This is seen in animals with foveate vision, such

as humans, where the density of photoreceptors ishighest at the center and f alls off dr amatically

towards the periphery.

Page 28: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 28/43

Paul Fitzpatrick lbr-vision

Simulated example

Compare size of eyes and ears intr ansfor med image ± eyes are closer tocenter, and so are better represented

Page 29: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 29/43

Foveated vision

(From C. Graham, ³Vision and Visual Perception´)

Page 30: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 30/43

Paul Fitzpatrick lbr-vision

Mechanical approximations

Imaging surf ace with varying sensor density

(Sandini et al)

Distorting lens projecting onto conventionalimaging surf ace (K uniyoshi et al)

Multi-camer a arr angements (Scassellati et al)

Cam

er as with

zoom

control directly tr ade-off acuity with field of view (but can¶t have both)

Or do something completely different!

Page 31: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 31/43

Paul Fitzpatrick lbr-vision

Multi-camera arrangement

Wide view camer a gives context

used to select region at which to

point narrow view camer a

If target is close and moving,

must respond quick ly and accur ately,

or won¶t gain any infor mation at all

W ide field of view,

low acuity 

Narrow field of view,

high acuity 

Page 32: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 32/43

Paul Fitzpatrick lbr-vision

Mixing fields of view

Small distance between camer as with wide and narrow

field of views, simplifying mapping between the two

Centr al location of wide camer a allows head to be oriented

accur ately independently of the distance to an object

Allows coarse open-loop control of eye direction from 

wide camer a ± improves gaze stability

Eye tilt

Left eye panRight eye pan

Fixed with respect

to head

Page 33: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 33/43

Paul Fitzpatrick lbr-vision

Wide

View

camera

Narrow

view

camera

Objectof interest

Field of view

Rotate

camera

New field

of view

Tip-toeing around 3D

Page 34: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 34/43

Paul Fitzpatrick lbr-vision

Using 3D

Eye tilt

Left eye panRight eye pan

Fixed with respect

to head

Page 35: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 35/43

Skin

Detector 

Color 

Detector 

Motion

Detector 

Face

Detector 

Wide

Frame Grabber 

MotionControl

Daemon

Right

Frame Grabber 

Left

Frame Grabber 

Right Foveal

Camera

Wide

CameraLeft Foveal

Camera

Eye-Neck

Motors

W W W W

Attention

Wide

Tracker 

Foveal

Disparity

Smooth Pursuit

& Vergence

w/ neck comp.

VOR

Saccade

w/ neck

comp.

Fixed Action

Pattern

Affective

Postural Shiftsw/ gaze comp.

Arbitor 

Eye-Head-Neck Control

DisparityBallistic movement Locus of attentionBehaviors

Motivations

5,  5,  5. ..

U, d U d U2

s s s

U, d U d U2

p p p

U, d U d U2

f f f U, d U d U

2

v v v

Wide

Camera 2

Tracked

target

Salient target

Eye

finder 

Left

Frame Grabber 

Distance

to target

Page 36: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 36/43

Paul Fitzpatrick lbr-vision

Kismet¶s little secret

Page 37: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 37/43

NTspeech synthesis

affect recognition

LinuxSpeech

recognition

Face

Control

Emotion

Percept

& Motor 

Drives &

Behavior 

L

Tracker 

Attent.system

Dist.

totarget

Motionfilter 

Eyefinder 

Motor ctrl

audiospeechcomms

Skinfilter 

Color filter 

QNX

CORBA

sockets,

CORBA

CORBA

dual-port

RAM

Cameras

Eye, neck, jaw motorsEar, eyebrow, eyelid,

lip motors

Microphone

Speakers

Page 38: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 38/43

Paul Fitzpatrick lbr-vision

R obots looking at humans

Responsiveness to the human f ace is vital

for a robot to partak e in natur al social

exchange

Need to locate and tr ack  f acial features, and

recover their semantic content

Page 39: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 39/43

Paul Fitzpatrick lbr-vision

Hair Forehead

Eye/brow

Cheeks/nose

Mouth/jaw

TOP

BOTTOM

LEFT

Hair 

Skin Eye Bridge

Hair 

SkinEye RIGHT

Modeling the face

Match oriented regions on f ace againstvertical model to isolate eye/browregion

Match eye/brow region againsthorizontal model to find eyes, bridge

Each model scans one spatialdimension, so can for mulate as HMM,allowing f ast optimization of match

V ertical face model 

Horizontal 

eye/brow model 

Page 40: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 40/43

Paul Fitzpatrick lbr-vision

R obots looking at robots

It is usef ul to link  the robot¶s representation of its own f ace

with that of humans.

Bonu

s: Allows robot-robot inter action vi

ahuma

n protocol.

Page 41: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 41/43

Paul Fitzpatrick lbr-vision

Conclusion

Vision community wor k ing on improving

machine perception of human

But equally important to consider human

perception of machine

Robot¶s point of view must be clear to

human, so they can communicateeffectively ± and quick ly!

Page 42: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 42/43

Paul Fitzpatrick lbr-vision

Video: Turn taking

Page 43: Social Settings

8/7/2019 Social Settings

http://slidepdf.com/reader/full/social-settings 43/43

Paul Fitzpatrick lbr-vision

Video: Affective intent