eNTERFACE’08 Multimodal Communication with Robots and Virtual Agents mid-term presentation

Page 1: eNTERFACE’08 Multimodal Communication with Robots and Virtual Agents mid-term presentation

eNTERFACE’08 Multimodal Communication with Robots and Virtual Agents
mid-term presentation

Page 2: eNTERFACE’08 Multimodal Communication with Robots and Virtual Agents mid-term presentation

Overview

Context:

• Exploitation of multi-modal signals for the development of an active robot/agent listener

• Storytelling experience:
– Speakers told a story of an animated cartoon they had just seen

1- See the cartoon

2- Tell the story to a robot or an agent

Page 3: eNTERFACE’08 Multimodal Communication with Robots and Virtual Agents mid-term presentation

Overview

Active listening:

– During natural interaction, speakers check whether their statements have been correctly understood (or at least heard).

– Robots/agents should also have active listening skills…

• Characterization of multi-modal signals as inputs of the feedback model:
– Speech analysis: prosody, keyword recognition, pauses
– Partner analysis: face tracking, smile detection

• Robot/agent feedback (outputs):
– Lexical and non-verbal behaviours

• Feedback model:
– Exploitation of both input and output signals

• Evaluation:
– Storytelling experiences are usually evaluated by annotation

Page 4: eNTERFACE’08 Multimodal Communication with Robots and Virtual Agents mid-term presentation

• Audio-visual recordings of a storytelling session between a speaker and a listener.

• 22 storytelling sessions telling the “Tweety and Sylvester - Canary Row” cartoon story.

• Several conditions (speaker and listener): same language or different languages.

• Languages: Arabic, French, Turkish and Slovak

• Annotation oriented to interaction analysis:
– Smile, head nod, head shake, eyebrow movements, acoustic prominence

Page 5: eNTERFACE’08 Multimodal Communication with Robots and Virtual Agents mid-term presentation

[Embedded video clip]

Page 6: eNTERFACE’08 Multimodal Communication with Robots and Virtual Agents mid-term presentation

Architecture of an interaction feedback model

Multi-modal feature extraction → Feedback strategy → Multi-modal feedback
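
A minimal sketch of this three-stage pipeline in C (the language already used for the Pure Data objects); the struct fields and the function name below are illustrative assumptions, not the project’s actual interfaces.

/* Hypothetical interfaces for the feedback architecture sketched above.
 * Stage 1 output: multi-modal features extracted from the speaker. */
typedef struct {
    int    head_nod;     /* face processing: nod detected in this frame     */
    int    head_shake;   /* face processing: shake detected in this frame   */
    int    smile;        /* face processing: smile detected in this frame   */
    int    keyword_id;   /* keyword spotting: -1 if no keyword was spotted  */
    double prominence;   /* speech processing: acoustic prominence score    */
    int    pause;        /* speech processing: speaker is currently pausing */
} features_t;

/* Stage 3 output: multi-modal feedback to render on GRETA (via BML) or AIBO. */
typedef struct {
    int head_nod;        /* backchannel nod                                 */
    int smile;           /* mirrored smile                                  */
    int expression_id;   /* index into a predefined set of behaviours       */
} feedback_t;

/* Stage 2: the feedback strategy maps the current features to a feedback. */
feedback_t feedback_strategy(const features_t *features);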

Page 7: eNTERFACE’08 Multimodal Communication with Robots and Virtual Agents mid-term presentation

Multi-modal feature extraction

Key idea:

Extraction of the features annotated in the STEAD corpus:

• Face processing: Head nod, shake, smile, activity.

• Keyword spotting: keywords have been defined in order to switch the agent’s state.

• Speech Processing: Acoustic Prominence detection

Page 8: eNTERFACE’08 Multimodal Communication with Robots and Virtual Agents mid-term presentation

[Embedded video clip]

Page 9: eNTERFACE’08 Multimodal Communication with Robots and Virtual Agents mid-term presentation

Multi-modal feature extraction

Keyword spotting: keywords have been defined in order to switch the agent’s state.

ASR → Agent’s state manager (ASM)
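
A minimal sketch of this ASR-to-ASM link in C; the keywords, the agent states, and the function name are invented for illustration, since the slides do not list them.

#include <string.h>

/* Hypothetical agent states and keyword rules (placeholders). */
typedef enum { STATE_NEUTRAL, STATE_ATTENTIVE, STATE_SURPRISED } agent_state_t;

typedef struct {
    const char   *keyword;     /* word to be spotted in the ASR output */
    agent_state_t next_state;  /* state the agent switches to          */
} keyword_rule_t;

static const keyword_rule_t rules[] = {
    { "suddenly", STATE_SURPRISED },   /* placeholder keyword */
    { "then",     STATE_ATTENTIVE },   /* placeholder keyword */
};

/* Called by the agent's state manager for each word output by the ASR. */
agent_state_t asm_update(agent_state_t current, const char *asr_word)
{
    for (size_t i = 0; i < sizeof(rules) / sizeof(rules[0]); ++i)
        if (strcmp(asr_word, rules[i].keyword) == 0)
            return rules[i].next_state;
    return current;   /* no keyword spotted: keep the current state */
}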

Page 10: eNTERFACE’08 Multimodal Communication with Robots and Virtual Agents mid-term presentation

Multi-modal feature extraction

Acoustic Prominence Detection:

• Prosody analysis in real time using Pure Data:
– Development of different Pure Data objects (written in C):

• Voice Activity Detection

• Pitch and Energy extraction

• Detection:
– Statistical model (Gaussian assumption):

• Kullback-Leibler similarity
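
As a rough sketch of this detection step: one Gaussian can be fitted on the pitch/energy values of the current analysis window and one on the background speech, and the window is flagged as prominent when their Kullback-Leibler divergence is large. The symmetrised form and the C interface below are assumptions, not the project’s exact formulation.

#include <math.h>

/* Univariate Gaussian described by its mean and variance. */
typedef struct { double mean; double var; } gauss_t;

/* KL(p || q) for two univariate Gaussians (closed form). */
static double kl_gauss(gauss_t p, gauss_t q)
{
    double diff = p.mean - q.mean;
    return log(sqrt(q.var / p.var))
         + (p.var + diff * diff) / (2.0 * q.var)
         - 0.5;
}

/* Symmetrised KL similarity between the current window and the background:
 * larger values suggest an acoustic prominence. */
double prominence_score(gauss_t window, gauss_t background)
{
    return kl_gauss(window, background) + kl_gauss(background, window);
}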

Page 11: eNTERFACE’08 Multimodal Communication with Robots and Virtual Agents mid-term presentation

Feedback model

• Extraction of rules from the annotations (STEAD corpus):

– Rules are defined in the literature
– Application to our specific task

• When should a feedback be triggered?

• Feedback behaviours:
– ECA: several behaviours (head movements, facial expressions) are already defined for GRETA with BML (Behaviour Markup Language).

– ROBOT: we defined several basic behaviours for our AIBO robot (inspired by a dog’s reactions): mapping from BML to robot movements.
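
A minimal sketch, in the same C register as above, of one possible trigger rule and of the BML-to-AIBO mapping; the threshold, behaviour names, and robot actions are placeholders rather than the rules actually extracted from the literature and the STEAD annotations.

/* Hypothetical feedback behaviours on the ECA side (BML level) and on the
 * robot side (AIBO level). */
typedef enum { BML_NONE, BML_HEAD_NOD, BML_HEAD_SHAKE, BML_SMILE } bml_behaviour_t;
typedef enum { AIBO_IDLE, AIBO_HEAD_NOD, AIBO_HEAD_SHAKE, AIBO_TAIL_WAG } aibo_action_t;

/* Example trigger rule (placeholder): nod when the speaker pauses right
 * after an acoustic prominence, smile back when the speaker smiles. */
bml_behaviour_t feedback_rule(int speaker_pause, double prominence, int speaker_smile)
{
    if (speaker_smile)
        return BML_SMILE;
    if (speaker_pause && prominence > 1.0)   /* placeholder threshold */
        return BML_HEAD_NOD;
    return BML_NONE;
}

/* Same behaviour rendered on the robot instead of the ECA; the smile is
 * replaced by a dog-inspired substitute. */
aibo_action_t map_bml_to_aibo(bml_behaviour_t behaviour)
{
    switch (behaviour) {
    case BML_HEAD_NOD:   return AIBO_HEAD_NOD;
    case BML_HEAD_SHAKE: return AIBO_HEAD_SHAKE;
    case BML_SMILE:      return AIBO_TAIL_WAG;
    default:             return AIBO_IDLE;
    }
}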

Page 12: eNTERFACE’08 Multimodal Communication with Robots and Virtual Agents mid-term presentation

Future work

• Integration:

– Real-time multi-modal feature extraction:
• Prominence detection object (Pure Data)
• Communication between the modules via PsyClone

– Already done for video processing.

– Tests of feedback behaviours for AIBO
– Agent’s state modifications

• Recordings and annotations of storytelling experiences with both GRETA and AIBO.

Page 13: eNTERFACE’08 Multimodal Communication with Robots and Virtual Agents mid-term presentation

Thank you for your attention…