Learning to Talk Through Listening

Alexander I. Rudnicky
with Ananlada Chotimongkol and Dan Bohus
Carnegie Mellon University

CATALOG 2004 – Barcelona
July 21, 2004

Outline

• Empirical approaches to understanding dialogue and building dialogue systems

• A task-based approach to dialogue

• Fundamental representations and observable events

• Learning through observation

Why Build Dialogue Systems?

• The devil is in the details

• Better understand the actual complexities of human-computer interaction

• Create specific artifacts that embody theories of dialogue and interaction (and thereby allow us to test them directly)

Domains, Tasks and Applications

[Diagram: a Domain comprising several Tasks, with an Application built on top of them]

Task representation/specification alternatives

• Code (unspecialized representations, procedural)
– Difficult to manage
• Forms (properly, F→A sets)
– Works for the simplest tasks, which can be easily cast as such
– Many examples
• Forms + graph-based dialogue structure
– Graph-based part essentially = code, same problems
– Examples: VXML, SALT
• Hierarchical, plan-based
– Task specified as a hierarchical plan (recipe) for the domain
– Examples: RavenClaw, Collagen
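As a rough illustration of the "forms (F→A sets)" alternative, a form can be modeled as a set of slots paired with an action that fires once the required slots are filled. This is a minimal sketch, not the representation used by any of the systems named above; all identifiers are invented for illustration.

```python
# Minimal sketch of a form -> action (F -> A) pair; all names are illustrative.
from typing import Callable, Dict, Optional

class Form:
    def __init__(self, name: str, slots: Dict[str, Optional[str]],
                 action: Callable[[Dict[str, Optional[str]]], None]):
        self.name = name
        self.slots = slots          # slot name -> value (None = unfilled)
        self.action = action        # fired once every slot has a value

    def fill(self, slot: str, value: str) -> None:
        self.slots[slot] = value
        if all(v is not None for v in self.slots.values()):
            self.action(self.slots)  # a complete form triggers its action

# Example: a flight-query form whose action just prints the query.
flight_query = Form(
    name="flight_query",
    slots={"destination": None, "date": None},
    action=lambda slots: print("lookup flights:", slots),
)
flight_query.fill("destination", "Boston")
flight_query.fill("date", "2003-08-22")     # now complete -> the action fires
```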

CMU dialogue approaches and systems

• Procedural
– Command and control [OM, etc.]
– Information access [MovieLine, etc.]
• Script-based and graph-based
– Travel planning; maintenance [SpeechWear]
• AGENDA-based
– Communicator: travel planning
– LARRI: task guidance [multi-modal]
– RoomLine, etc.: information access and transactions
– Madeleine: medical diagnosis
– TeamTalk: multi-participant dialogue
– Valerie: interviews

Graph-based systems

Welcome to Bank ABC! Please say one of the following:

Balance, Hours, Loan, ...

What type of loan are you interested in? Please say one of the following:
Mortgage, Car, Personal, ...

[Diagram: dialogue graph in which the caller's keyword (e.g. "Loan", then "Car") selects the next prompt]

Frame-based systems

• I would like to fly to Boston
• When would you like to fly?
• Friday

Destination_City: Boston
Departure_Date: 20030822
Departure_Time: ______
Preferred_Airline: ______
...

Frame-based systems

• I’d like to go to Boston on Friday, …
• What time would you like to leave?

Destination_City: Boston
Departure_Date: 20030822
Departure_Time: ______
Preferred_Airline: ______
...
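The frame-based control strategy sketched on these slides can be illustrated with a toy slot-filling loop: the user may fill any slots in any order, and the system prompts for the first slot that is still empty. This is a hedged sketch, not the actual CMU implementation; the keyword "parser" is a stand-in for real language understanding.

```python
# Toy frame-based control loop (illustrative only, not any particular system).
from typing import Dict, Optional

frame: Dict[str, Optional[str]] = {
    "Destination_City": None, "Departure_Date": None, "Departure_Time": None,
}
prompts = {
    "Destination_City": "Where would you like to fly?",
    "Departure_Date": "When would you like to fly?",
    "Departure_Time": "What time would you like to leave?",
}

def parse(utterance: str) -> Dict[str, str]:
    """Stand-in concept spotter: a real system would use a grammar or NLU model."""
    found = {}
    if "boston" in utterance.lower():
        found["Destination_City"] = "Boston"
    if "friday" in utterance.lower():
        found["Departure_Date"] = "20030822"   # date resolution is faked here
    return found

def next_prompt() -> Optional[str]:
    """Prompt for the first unfilled slot; the user may still fill any slot."""
    for slot, value in frame.items():
        if value is None:
            return prompts[slot]
    return None                                # frame complete: act on it

frame.update(parse("I'd like to go to Boston on Friday"))
print(next_prompt())    # -> "What time would you like to leave?"
```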

Frame-based systems

[Diagram: a network of many generic forms (placeholder slot names), with transitions between forms on a keyword or phrase]

Outline

• Empirical approaches to understanding dialogue and building dialogue systems

• A task-based approach to dialogue

• Learning through observation

• Fundamental representations and observable events

Task-oriented Interaction

• Implicit system goal is to create products
– Data structures that specify information for action
• Sessions can generate multiple products
– Immediate products, e.g., information requests
– Products that are built up incrementally over the course of a session, e.g., a plan such as an itinerary
• An Agenda to order (and re-order) topics for discussion

Products and Actions

• Products and Actions are domain-specific
– e.g., itineraries → bookings, queries → information display
• Products are represented as an ordered tree
– nodes in the tree correspond to schemas (handlers, agents, etc.) and are slots or forms
• Slot-specific computation is encapsulated in schema (handler objects)
• Agenda is generated from the current product tree
– defines the sequence of topics to take up with the user

Agenda Structure

• Ordered list of conversational topics
– current goal: focussed topic
– pending goals: schema yet to be filled
– persistent goals: handlers that are always active
• constructors
• generic help
• garble

[Diagram: agenda as a stack with the current focus at the top, pending goals below it, and persistent goals at the bottom]

Simple and Compound Schema

[Diagram: a simple schema has receptors, a value, a transform, and hooks (focus hook, prompt, invalidate value, self-promote, reorder tree) that interact with a domain agent; a compound schema combines several values through a transform into a report, e.g. via an SQL query to a domain agent]
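A minimal sketch of the schema idea, assuming (since the diagram is only partly recoverable) that a schema owns a set of receptors, accumulates values, and applies a transform such as an SQL query when complete. Method and field names are guesses for illustration, not RavenClaw's API.

```python
# Sketch of a schema/handler node; names are guesses based on the diagram.
class Schema:
    def __init__(self, name, receptors, transform=None):
        self.name = name
        self.receptors = set(receptors)   # concept names this node accepts
        self.values = {}                  # filled values
        self.transform = transform        # e.g. build an SQL query from the values

    def receive(self, concept, value):
        """Receptor: accept a matching concept; caller may then self-promote."""
        if concept in self.receptors:
            self.values[concept] = value
            return True                   # signal that this node captured the input
        return False

    def report(self):
        """Compound schema: combine the collected values into a result."""
        if self.transform and self.receptors <= set(self.values):
            return self.transform(self.values)
        return None

query = Schema(
    "flight_query", {"dest", "date"},
    transform=lambda v: f"SELECT * FROM flights WHERE dest='{v['dest']}' AND date='{v['date']}'",
)
query.receive("dest", "BOS")
query.receive("date", "2003-08-22")
print(query.report())
```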

Agendas from product tree traversal

• Default traversal of current product tree

– left-to-right, depth-first

– all nodes in the current product tree are always on the agenda

• Persistent goals sort to the bottom of the list

[Diagram: example product tree with nodes root, Leg_1, Flight_1, Dest_1, Date_1, Time_1, Hotel_1, Car_1, Leg_2 and profile, numbered in the depth-first, left-to-right order in which they appear on the agenda]
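A minimal sketch of agenda generation from the product tree, under the assumption that the example tree roughly matches the diagram: a left-to-right, depth-first traversal lists every node, and persistent goals are appended at the bottom. Illustrative only.

```python
# Sketch: generate the agenda by a left-to-right, depth-first traversal of the
# product tree, with persistent handlers sorted to the bottom. The tree layout
# below is a guess at the example in the diagram; the code is illustrative.
class Node:
    def __init__(self, name, children=()):
        self.name, self.children = name, list(children)

def agenda(root, persistent=()):
    order = []
    def visit(node):
        order.append(node.name)           # parent before children
        for child in node.children:
            visit(child)                  # left-to-right, depth-first
    visit(root)
    return order + list(persistent)       # persistent goals always at the bottom

tree = Node("root", [
    Node("Leg_1", [Node("Flight_1", [Node("Dest_1"), Node("Date_1"), Node("Time_1")])]),
    Node("Hotel_1"), Node("Car_1"), Node("Leg_2"), Node("profile"),
])
print(agenda(tree, persistent=["constructors", "generic_help", "garble"]))
# ['root', 'Leg_1', 'Flight_1', 'Dest_1', 'Date_1', 'Time_1', 'Hotel_1', 'Car_1',
#  'Leg_2', 'profile', 'constructors', 'generic_help', 'garble']
```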

Shifting focus

• Agenda has linear structure

– Derived from product tree

• Focus capture implies reordering sibling nodes

– Reordering propagates to root

• enclosing topic contexts get promoted

– Focus node is promoted to top of the agenda

[Diagram: when node i gets focus, it is moved ahead of its siblings and each enclosing ancestor is promoted in turn, reordering the agenda]
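A sketch of focus capture on a small tree labelled like the diagram: the focused node is moved ahead of its siblings and the reordering is propagated to the root, so a regenerated depth-first agenda brings the focus and its enclosing contexts forward. (The real generation step also places the focus itself at the top of the agenda; this sketch only shows the sibling reordering, and the code is illustrative rather than the actual implementation.)

```python
# Sketch of focus capture: the focused node moves ahead of its siblings, and the
# reordering propagates to the root before the agenda is regenerated.
class Node:
    def __init__(self, name, children=()):
        self.name, self.children = name, list(children)

def promote(root, target):
    """Find `target` by name and promote it (and each enclosing ancestor)."""
    def find_path(node, path):
        if node.name == target:
            return path + [node]
        for child in node.children:
            hit = find_path(child, path + [node])
            if hit:
                return hit
        return None

    path = find_path(root, []) or []
    for parent, child in zip(reversed(path[:-1]), reversed(path[1:])):
        parent.children.remove(child)     # move the child to the front ...
        parent.children.insert(0, child)  # ... at every level up to the root

def dfs(node):
    return [node.name] + [n for c in node.children for n in dfs(c)]

tree = Node("a", [Node("b", [Node("c"), Node("d")]),
                  Node("e", [Node("f"), Node("g", [Node("h"), Node("i")])])])
promote(tree, "i")      # "node i gets focus"
print(dfs(tree))        # ['a', 'e', 'g', 'i', 'h', 'f', 'b', 'c', 'd']
```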

Constructors

• Products are not fixed data structures but may expand through the course of a session

• Users can modify the product

– “I’d like to go on to Syracuse”

– [system adds a new leg sub-tree to the product]

[Diagram: the product tree before and after the constructor adds a new leg sub-tree]
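A sketch of the constructor mechanism, assuming a simple dictionary-based product tree (the actual Communicator data structures are richer): when the user extends the task, a new leg sub-tree is appended to the product.

```python
# Sketch of a constructor: the user extends the task ("I'd like to go on to
# Syracuse"), so a new leg sub-tree is appended to the product tree. The node
# layout and names are illustrative, not the actual Communicator structures.
def make_leg(index, destination=None):
    return {"name": f"Leg_{index}",
            "children": [{"name": f"Flight_{index}",
                          "children": [{"name": "Dest", "value": destination},
                                       {"name": "Date", "value": None},
                                       {"name": "Time", "value": None}]}]}

product = {"name": "root", "children": [make_leg(1, "Boston")]}

def constructor_add_leg(product, destination):
    """Persistent constructor goal: expand the product when a new leg is requested."""
    index = sum(1 for c in product["children"] if c["name"].startswith("Leg_")) + 1
    product["children"].append(make_leg(index, destination))

constructor_add_leg(product, "Syracuse")          # "I'd like to go on to Syracuse"
print([c["name"] for c in product["children"]])   # ['Leg_1', 'Leg_2']
```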

Hierarchical Plan-based Representation

[Diagram: Login task node with children AskRegistered, AskName, GreetUser, GetProfile and GreetGuest; child nodes carry preconditions such as PRE: registered=false and PRE: AVAILABLE(name), and the parent carries the success criterion GOAL: (registered = false) || AVAILABLE(profile)]
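A hedged sketch of how preconditions and a success criterion might drive execution over a simplified subset of the Login children shown above; the control loop and the state updates are invented for illustration and are not RavenClaw's engine.

```python
# Sketch of a hierarchical plan node with preconditions and a success criterion,
# loosely following the Login example (control loop and updates are invented).
state = {"registered": None, "name": None, "profile": None}

def available(key):
    return state[key] is not None

plan = {
    "name": "Login",
    "goal": lambda: state["registered"] is False or available("profile"),
    "children": [
        {"name": "AskRegistered", "pre": lambda: True,
         "do": lambda: state.update(registered=True)},
        {"name": "AskName", "pre": lambda: state["registered"] is True,
         "do": lambda: state.update(name="Alex")},
        {"name": "GetProfile", "pre": lambda: available("name"),
         "do": lambda: state.update(profile={"user": state["name"]})},
        {"name": "GreetGuest", "pre": lambda: state["registered"] is False,
         "do": lambda: print("Welcome, guest!")},
    ],
}

def execute(node):
    for child in node["children"]:
        if node["goal"]():
            break                 # success criterion already satisfied
        if child["pre"]():        # only run steps whose preconditions hold
            child["do"]()
    return node["goal"]()

print(execute(plan))   # True once a profile is available (or the user is a guest)
```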

Execution policy

• Dialog control:
– Task constraints (Declarative): define the boundaries of the space of possible dialogs
– Execution policy (Procedural/Workflow): actively defines dialogue control

Hierarchical Plan-based Representation

[Diagram: Communicator task tree — Communicator → Welcome, Login, Travel, Locals, Bye; Login → AskRegistered, AskName, GreetUser, GetProfile; Travel → Leg1 → GetQuery, ExecuteQuery, DiscussLeg1. Concept bindings accumulate as the dialogue proceeds (Registered: [yes]; Name: [user_name]; Departure: [City]; Arrival: [City]; …). The dialogue stack runs from the FOCUS (AskRegistered) through Login up to the MAIN TOPIC (Communicator).]

S: Are you a registered user?
U: Yes, this is Alex  [yes] [user_name]

Hierarchical Plan-based Representation

[Diagram: Leg1 → GetQuery, ExecuteQuery, DiscussLeg1; GetQuery is a FORM with slots DepartureLocation: TCity, ArrivalLocation: TCity, DepartureDate: TDate, DepartureTime: TTime]

Common task skills

Dialog Engine

• Controls the dialog by executing the hierarchical plan-based task specification

• In the process, automatically exhibits appropriate generic (task- and domain-independent) conversational skills:
– Global dialogue mechanisms
• repeat, suspend, start-over, help, where are we?
– Grounding
• Implicit and explicit confirmations, disambiguations, various non-understanding handling strategies
– Timing and turn-taking
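One way to picture these generic skills is as a task-independent layer that intercepts universal requests (repeat, help, start over) before input reaches the domain handlers. A minimal sketch with invented handlers, not the actual dialog engine:

```python
# Sketch of a generic, task-independent layer over the task specification:
# universal commands are handled before input reaches the domain handlers.
history = []

def handle_generic(utterance):
    u = utterance.lower()
    if "repeat" in u:
        return history[-1] if history else "I haven't said anything yet."
    if "help" in u:
        return "You can ask about flights, hotels, or cars."
    if "start over" in u:
        history.clear()
        return "OK, starting over."
    return None                    # not a generic request: fall through to the task

def respond(utterance, task_handler):
    reply = handle_generic(utterance) or task_handler(utterance)
    history.append(reply)
    return reply

print(respond("I need a flight", lambda u: "Where would you like to fly?"))
print(respond("Can you repeat that?", lambda u: "?"))   # handled generically
```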

Issues that remain

• Parallel activities and asynchronous events
– Understanding the scope of “dialogue”
• Knowledge engineering dialogue systems
– Building the interface between the dialogue engine and the world (“pragmatics”)
– Capturing human speech and language behavior within tasks and domains
– Reasoning about the world within applications
– Communicating meaningfully and efficiently with the user about the state of the world

Outline

• Empirical approaches to understanding dialogue and building dialogue systems

• A task-based approach to dialogue

• Learning through observation

• Fundamental representations and observable events

Learning by observation

• Many automatic systems are meant to substitute for current human-based operations (e.g., a travel agency or a call center)

• Can we use such existing working human systems to infer the structure of a corresponding automatic system?

• If so, what might be the requisite representations and learning heuristics?

Learning to dialogue

• Goal-directed conversation is regular
– Both participants can agree on the same goal, and both participants want to achieve this goal
– Correct transmission of information is at a premium
• Can we exploit this regularity to extract the (currently human-engineered) structure of the dialogue?

Learning structure from dialogue

• Concept identification

• Form (topic) segmentation

• Task graphs

• Multiple data streams

• Lightly supervised learning

Travel agent and client

[Diagram: topic segmentation of a travel agent–client conversation into segments: greeting, out leg, return, hotel, car, payment / close, confirm]

Outline

• Empirical approaches to understanding dialogue and building dialogue systems

• A task-based approach to dialogue

• Learning through observation

• Fundamental representations and observable events

Properties of a dialogue representation

1. Sufficiency
– Captures sufficient information for the creation of a dialogue system
– Describes the important (i.e., operative) phenomena in conversations
2. Generality
– Covers conversations in dissimilar domains
3. Learnability
– Can be populated through observation (e.g., from a corpus of human-human conversations)

Task-centric dialogue representation

• Components of task structure
– Procedures for completing task goal(s)
• Steps in the task and their dependencies (i.e., the workflow)
– Domain language
• Words, constructs and idioms that humans use to communicate about the task
– Domain reasoning
• The relationships between language and task, and the domain of the application

Dialogue primitives

Levels of representation
1. Task: a subset of conversational sequences that achieves a particular (human/system) goal
2. Sub-task: a step in a task that contributes toward the fulfillment of the task goal
– The smallest unit of a dialogue that contains information sufficient to execute a specific domain action
3. Concept: key domain entities (perhaps organized into a type hierarchy or ontology)

Mechanisms
1. Task-oriented: form-filling and result negotiation
2. Discourse-oriented: grounding, etc.

Task Structure Representation

• Task = collection of forms

• Sub-task = a form

• Concept = a slot in a form

F: Query_Departure_Time

Depart_Location: carnegie_mellon

Arrive_Location: the airport

Arrive_Time: Hour: four Minute: thirty

Bus_Number: 28X
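The task = forms, sub-task = form, concept = slot mapping can be written down directly; the sketch below encodes the Query_Departure_Time form shown above (the choice of Python dictionaries is illustrative, not part of the representation itself).

```python
# Sketch of task = collection of forms, sub-task = form, concept = slot, encoding
# the Query_Departure_Time form above (dictionaries here are just for illustration).
from typing import Any, Dict, List

FormType = Dict[str, Any]        # a sub-task: a named set of concept slots
Task = List[FormType]            # a task: a collection of forms

query_departure_time: FormType = {
    "form": "Query_Departure_Time",
    "Depart_Location": "carnegie_mellon",
    "Arrive_Location": "the airport",
    "Arrive_Time": {"Hour": "four", "Minute": "thirty"},   # a structured concept
    "Bus_Number": "28X",
}

bus_schedule_task: Task = [query_departure_time]

# The concepts are the slots of the form (other than the form's own name).
print([slot for slot in query_departure_time if slot != "form"])
```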

Example: Air travel planning

1. Task: create itinerary
2. Sub-tasks:
– Flight reservation
– Hotel reservation
– Car rental reservation
3. Concepts:
– Airline = { Continental, Iberia, … }
– Hotel = { Novotel, Hilton, … }

Example: Bus schedule enquiry

1. Task (multiple tasks):
– Find bus numbers that run between two locations
– Find a departure time given a bus number and stop location
2. Sub-tasks:
– No further decomposition needed
3. Concepts:
– Bus Number = { 61C, 28X, … }
– Location = { CMU, airport, … }

Dialogue mechanisms

• Operations invoked by participants:
– Correspond to an utterance or a part of an utterance
– Have a unique consequence on the state of the conversation
– init_form causes the system to create a new form
– The behavior of an operation is the same regardless of the domain (only its parameters differ)

Dialogue mechanisms (2)

• Dialogue procedure
– Requires more than one utterance to complete
– A confirmation mechanism = 2 operations (confirmation_request + respond)
• Non-verbal operation
– Activated by a state of the representation rather than a verbal expression
– access_database is activated by the completion of the query form
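A minimal sketch of domain-independent operations over a form, following the operation names on these slides (init_form, access_database) but with invented code: filling the last slot of the query form triggers the non-verbal database access.

```python
# Sketch of domain-independent operations over a form: the operation names follow
# the slides (init_form, access_database) but the code is invented for illustration.
def init_form(slots):
    """init_form: create a new, empty form."""
    return {slot: None for slot in slots}

def fill_form(form, slot, value, on_complete):
    """Fill one slot; the same operation works in any domain."""
    form[slot] = value
    if all(v is not None for v in form.values()):
        on_complete(form)          # a non-verbal operation fires on completion

def access_database(form):
    print("running query for:", form)   # stand-in for the real database access

query = init_form(["Depart_Location", "Arrive_Location"])
fill_form(query, "Depart_Location", "carnegie_mellon", access_database)
fill_form(query, "Arrive_Location", "the airport", access_database)   # query runs
```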

An example from the Map Task

• Forms
– Action forms (→ draw_line)
– Entity forms (landmark)
• Operations (various)
• Resolving a misunderstanding through grounding [session q8nc7]

[Figure: the Giver’s map and the Follower’s map]

Episode 11-1
GIVER87:  ask_landmark: have you got a TarLM:[golden beach (left)]?
FOLLOWER88:  respond: yes uh-huh.  add_landmark: (golden beach (right))
(Misunderstanding: the follower grounds the right-hand golden beach while the giver is asking about the left-hand one.)

Resulting landmark forms (both sides treat the landmark as grounded):

Giver’s
Landmark: golden beach (left)
Giver Map: yes
Follower Map: yes
Location: implicitly grounded

Follower’s
Landmark: golden beach (right)
Giver Map: yes
Follower Map: yes
Location: implicitly grounded

Episode 11-1 (2)
(Same exchange as above.)

Grounding Form
Landmark: golden beach (left)
Giver Map: yes
Follower Map: yes
Location: implicitly grounded
Origin:
Orientation:
Distance:
Path:
Destination:

Episode 11-2
GIVER89:  fill_form_info: well go Dir:[straight up] ... from Ori:[Loc:[the top of the white mountain]] 'til you're just Dest:[Loc:[beside the golden beach]] to Dest:[Loc:[the right of it (white mountain)]]
FOLLOWER90:  acknowledge: right,

Grounding Form
Landmark: golden beach (left)
Giver Map: yes
Follower Map: yes
Location: implicitly grounded
Origin: Ori:[Loc:[the top of the white mountain]]
Orientation: Dir:[straight up]
Distance:
Path:
Destination: Dest:[Loc:[beside the golden beach]] to Dest:[Loc:[the right of it (white mountain)]]

Episode 11-3
FOLLOWER:  ask_fill_form_info: you want me to go dilect-- ... Dir:[directly right]?
GIVER91:  respond: no,  fill_form_info: Dir:[directly up].

Grounding Form
Landmark: golden beach (left)
Giver Map: yes
Follower Map: yes
Location: implicitly grounded

Episode 11-4
FOLLOWER92:  fill_form_info: but golden beach (right) is away in Loc:[the far right].
(The follower explicitly fills in the location of the golden beach (right).)
GIVER93:  acknowledge: ah right.
(Agrees with the location of the golden beach (right).)

The landmark forms are revised (the earlier implicit grounding is withdrawn):

Follower’s
Landmark: golden beach (right)
Giver Map:
Follower Map: yes
Location: the far right

Giver’s
Landmark: golden beach (left)
Giver Map: yes
Follower Map:
Location:

Giver’s
Landmark: golden beach (right)
Giver Map: yes
Follower Map:
Location:

Episode 11-5
FOLLOWER94:  ask_landmark: have you got TarLM:[your (golden beach (right))]?
GIVER95:  inform_other_info: i've got two golden beaches.
FOLLOWER96:  acknowledge: ah.  add_landmark: (golden beach (right))

The golden beach (right) form is now fully grounded:

Landmark: golden beach (right)
Giver Map: yes
Follower Map: yes
Location: the far right

Episode 11-5 (2)
(Same exchange as above.)

Grounding Form
Landmark: golden beach (right)
Giver Map: yes
Follower Map: yes
Location: the far right

Episode 11-6
GIVER97:  fill_form_info: sorry ... so there's TarLM:[the one (golden beach (left))] Loc:[above the ... white mountain] as well, to Loc:[the left of it (white mountain)] for me.
FOLLOWER98:  fill_form_info: is there, yeah there's nothing nothing there.  add_landmark: golden beach (left)
GIVER99:  acknowledge: right okay,

Landmark: golden beach (left)
Giver Map: yes
Follower Map: no
Location: above the ... white mountain, to the left of it (white mountain)

Episode 11-6 (2)
(Same exchange as above.) There are now two grounding forms, one for each golden beach:

Grounding Form
Landmark: golden beach (left)
Giver Map: yes
Follower Map:
Location: above the ... white mountain, to the left of it (white mountain)

Grounding Form
Landmark: golden beach (right)
Giver Map: yes
Follower Map: yes
Location: the far right

Applying the representation

• Four different task-oriented domains
– Air travel planning
• Professional travel agent and volunteer clients (re)booking former trips
– HCRC map-reading task
• Hired subjects communicating path information
– Bus schedule information
• Professional agents helping customers
– UAV operation
• Trainees flying an unmanned aerial vehicle, in a simulation

Evaluation Corpora

• Annotated conversations from the four task-oriented domains

Domain           Available (#Dialogs)   Analyzed (#Dialogs)   Analyzed (#Utterances)
Bus schedule             12                      5                      90
Air travel               43                      4                     273
Map reading             128                      4                     498
UAV operation             2                      1                     224

Rejected utterances

• Utterances that could not be described by the proposed structure
– Out Of Domain (OOD)
– Out Of Scope (OOS): in-domain but outside the conversation goal
– Indirect: requires substantial reasoning or world knowledge to interpret
– Task Management (TM): manages the overall state of the dialogue, rather than a particular form

Rejected utterance percentage

Rejected utterances (%)
Domain           OOD    OOS    Indirect   TM     Total
Bus schedule     4.4    4.4      6.7      0.0    15.6
Air travel       1.8    4.4      0.4      2.6     9.2
Map reading      0.0    0.0      2.2      0.0     2.2
UAV simulation   1.0    0.0      1.0      4.0     5.9

Summary

• Human-computer dialogue is organized around specific tasks within domains

• The key level of representation is in fact the task; applications are particular embodiments of these tasks

• All applications necessarily include a large amount of detail
– Such detail is not knowable a priori (and much of it cannot be generated from principle)
– Either extensive knowledge engineering or (better) systems that learn are necessary to produce systems that function robustly
