Conversational role assignment problem in multi-party dialogues
Natasa Jovanovic, Dennis Reidsma, Rutger Rienks
TKI group
University of Twente
Outline
  Research tasks at TKI
  Interpretation of multimodal human-human communication in meetings
  Conversational Role Assignment Problem (CRAP)
  Towards automatic addressee detection
A framework for multimodal interaction research
Layered annotation
  Unannotated corpus (video and audio recordings in a certain domain)
  Annotation of events in different modalities (e.g. gaze, posture, gesture, speech)
  Multimodal interpretation of events in terms of semantic models
  Tools (e.g. for retrieval, simulation, remote presence, generation of minutes for meetings)
  Research on human behaviour
  Models and theories of interaction; semantics of annotation schemes
(Diagram labels: I, II, IIA, IIB, III, IV, V)
Multimodal annotation tool
Who is talking to whom?
  CRAP as one of the main issues in multi-party conversation (Traum 2003)
  Taxonomy of conversational roles (Herbert H. Clark)
    All participants: speaker, addressee, side participant
    All listeners: the participants plus bystander and eavesdropper
  Our goal: automatic addressee identification in small group discussions
  Addressees in meeting conversations: single participant, group of people, whole audience
  Importance of the issue of addressing in multi-party dialogues
Addressing mechanisms
  What are the relevant sources of information for addressee identification in face-to-face meeting conversations?
  How does the speaker express who the addressee of his utterance is?
  How can we combine all this information in order to determine the addressee of the utterance?
Sources of information
  Speech
    Linguistic markers
      Word classes: personal pronouns, determiners in combination with personal pronouns, possessive pronouns and adjectives, indefinite pronouns, etc.
    Name detection (vocatives)
    Dialogue acts
  Gaze direction
  Pointing gestures
  Context categories (features)
Dialogue Acts and Addressee detection (I)
  How many addressees may an utterance have?
  According to dialogue act theory, an utterance or an utterance segment may have more than one conversational function.
  Each DA has an addressee ==> an utterance may have several addressees
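
The point that an utterance can carry several dialogue acts, each with its own addressee, can be pictured with a small data structure. This is only an illustrative sketch: the class names, the tag strings "answer" and "question", and the participant ids are assumptions, not part of the slides or of any annotation scheme.

```python
from dataclasses import dataclass, field


@dataclass
class DialogueAct:
    label: str             # hypothetical tag, e.g. "answer" or "question"
    addressees: frozenset  # participants this dialogue act is addressed to


@dataclass
class Utterance:
    speaker: str
    text: str
    acts: list = field(default_factory=list)

    def addressees(self):
        """Return the union of the addressees of all dialogue acts."""
        result = set()
        for act in self.acts:
            result |= act.addressees
        return result


# One utterance, two dialogue acts, two different addressees.
utt = Utterance(
    speaker="A",
    text="Yes, that works. And what do you think, C?",
    acts=[
        DialogueAct("answer", frozenset({"B"})),
        DialogueAct("question", frozenset({"C"})),
    ],
)
print(sorted(utt.addressees()))
```

Keeping the addressee on the dialogue act rather than on the utterance means a segment-level annotation is not lost when the utterance as a whole is addressed to more than one person.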
Dialogue Acts and Addressee detection (II)
  MRDA (Meeting Recorder Dialogue Acts): tag set for labeling multiparty face-to-face meetings (ICSI)
  We use a large subset of the MRDA set, which is organized on two levels:
    Forward looking functions (FLF)
    Backward looking functions (BLF)
Non-verbal features
  Gaze
    The contribution of gaze to addressee detection depends on: participants’ location (visible area), utterance length, current meeting action
    Turn-taking behavior and addressing behavior
  Gesture (pointing at a person)
    TALK_TO(X,Y) AND POINT_TO(X,Y)
    TALK_TO(X,Y) AND POINT_TO(X,Z): X talks to Y about Z
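
The two pointing patterns can be written as a small function. This is a sketch under assumptions: the function name and the returned keys are hypothetical, and the slide gives no interpretation for the first pattern, so reading it as "the pointing supports the addressee" is an assumption made here for illustration.

```python
def interpret_pointing(speaker, addressee, pointed_at=None):
    """Interpret TALK_TO(speaker, addressee) together with an optional
    POINT_TO(speaker, pointed_at), following the two patterns above."""
    if pointed_at is None:
        # No pointing gesture observed: nothing to add.
        return {"addressee": addressee, "topic": None, "supports_addressee": False}
    if pointed_at == addressee:
        # TALK_TO(X,Y) AND POINT_TO(X,Y).
        # Assumption: the gesture reinforces Y as the addressee.
        return {"addressee": addressee, "topic": None, "supports_addressee": True}
    # TALK_TO(X,Y) AND POINT_TO(X,Z): X talks to Y about Z.
    return {"addressee": addressee, "topic": pointed_at, "supports_addressee": False}


print(interpret_pointing("X", "Y", "Y"))
print(interpret_pointing("X", "Y", "Z"))
```

The second pattern is why a pointing gesture cannot be mapped directly to an addressee: the person pointed at may be the subject of the talk instead.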
Context categories
  Bunt: “totality of conditions that may influence understanding and generation of communicative behavior”
  Local context is an aspect of context that can be changed through communication
  Context categories:
    Interaction history (verbal and non-verbal)
    Meeting action history
    Spatial context (participants’ location, distance, visible area, etc.)
    User context (name, gender, roles, etc.)
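
One possible way to hold the four context categories in a program is sketched below. All class, field, and event names are hypothetical, and the event dictionaries are an assumed format rather than an annotation scheme from the slides. The interaction history is the part that changes through communication, so it is the only part given an update method here.

```python
from dataclasses import dataclass, field


@dataclass
class UserContext:
    name: str
    gender: str = ""
    roles: tuple = ()


@dataclass
class MeetingContext:
    users: dict      # participant id -> UserContext
    positions: dict  # participant id -> (x, y) seat position (spatial context)
    interaction_history: list = field(default_factory=list)
    meeting_action_history: list = field(default_factory=list)

    def record_event(self, event):
        """Append a verbal or non-verbal event to the interaction history."""
        self.interaction_history.append(event)

    def last_speaker(self):
        """Return the speaker of the most recent speech event, if any."""
        for event in reversed(self.interaction_history):
            if event.get("type") == "speech":
                return event["speaker"]
        return None


ctx = MeetingContext(
    users={"A": UserContext("Anna", roles=("chair",)), "B": UserContext("Ben")},
    positions={"A": (0.0, 0.0), "B": (1.0, 0.0)},
)
ctx.record_event({"type": "speech", "speaker": "A"})
ctx.record_event({"type": "gaze", "source": "B", "target": "A"})
print(ctx.last_speaker())
```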
Towards automatic addressee detection
  Manual or automatic feature annotation?
  An automatic target interpreter has to deal with uncertainty
  Methods:
    Rule-based method
    Statistical method (Bayesian networks)
Rule-based method
1. Processing information obtained from the utterance (linguistic markers, vocatives, DA). The result is a list of possible addressees with corresponding probabilities.
   1. Eliminate cases where the target is completely determined (for instance, a name in vocative form)
   2. Set of rules for BLF
   3. Set of rules for FLF
2. Processing gaze and gesture information, adding the additional probability values to the candidates
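
The two steps above can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the authors' rule set: the function names, the uniform starting scores, the `da_bonus` argument standing in for the BLF and FLF rules, and the weights 0.3 and 0.5 are all hypothetical. It also simplifies in two ways: it returns a single best candidate although an addressee may be a group, and it treats pointing purely as evidence for the addressee.

```python
def normalize(scores):
    """Scale scores so that they sum to 1 (left unchanged if all are zero)."""
    total = sum(scores.values())
    if total == 0:
        return dict(scores)
    return {p: s / total for p, s in scores.items()}


def utterance_candidates(speaker, participants, vocative=None, da_bonus=None):
    """Step 1: score candidates from utterance information.
    Returns (scores, determined)."""
    others = [p for p in participants if p != speaker]
    if not others:
        raise ValueError("need at least one participant besides the speaker")
    # Step 1.1: a name in vocative form fully determines the target.
    if vocative in others:
        return {p: (1.0 if p == vocative else 0.0) for p in others}, True
    scores = {p: 1.0 / len(others) for p in others}  # assumed uniform start
    # Steps 1.2 and 1.3: placeholder for the BLF and FLF rule sets.
    for p, bonus in (da_bonus or {}).items():
        if p in scores:
            scores[p] += bonus
    return normalize(scores), False


def add_nonverbal(scores, gaze_target=None, pointed_at=None,
                  gaze_weight=0.3, point_weight=0.5):
    """Step 2: add gaze and gesture evidence (weights are assumptions)."""
    updated = dict(scores)
    if gaze_target in updated:
        updated[gaze_target] += gaze_weight
    if pointed_at in updated:
        updated[pointed_at] += point_weight
    return normalize(updated)


def detect_addressee(speaker, participants, vocative=None, da_bonus=None,
                     gaze_target=None, pointed_at=None):
    scores, determined = utterance_candidates(speaker, participants, vocative, da_bonus)
    if not determined:
        scores = add_nonverbal(scores, gaze_target, pointed_at)
    best = max(scores, key=scores.get)
    return best, scores


best, scores = detect_addressee("A", ["A", "B", "C", "D"], gaze_target="C")
print(best, scores)
```

Skipping step 2 when a vocative is present mirrors the elimination of fully determined cases in step 1.1; everything else stays a ranked list, so uncertainty is kept until the end.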
Meeting actions and addressee detection
  The automatic addressee detection method can be applied to the whole meeting
  Knowledge about the current meeting action, as well as about the meeting action history, may help to better recognize the addressee of a dialogue act.
Future work
  Development of a multimodal annotation tool
  Data annotation for
    training and evaluating statistical models
    obtaining inputs for rule-based methods
  New meeting scenarios for research in addressing