Natural Language and Dialogue Systems Lab
Computational Models of Discourse and Dialogue 2011: Conversation in Social Media
NATURAL LANGUAGE AND DIALOGUE SYSTEMS LAB
UC SANTA CRUZ
Persuasion in Social Media
Persuasion and argumentation in social media websites and forums
NLDS Social Media Dialogue Data
Data collected in the last year in collaboration with FoxTree’s Lab & Anand’s SemLab
Convinceme.net, 4forums.org, Carm.org
Using Mechanical Turk to get labels
http://pcon.soe.ucsc.edu/mturk_external/123/123.php?pageId=1597&assignmentId=ASSIGNMENT_ID_NOT_AVAILABLE&hitId=1HNBWKACQBSEV0YDIOYSBWM1C0YNIP
http://pcon.soe.ucsc.edu/mturk_external/qr/qr.php?pageId=1398&assignmentId=ASSIGNMENT_ID_NOT_AVAILABLE&hitId=1CEJFP6T9BRSEF7QNPYEV9U37T7Y6W
Classic Models of Discourse and Dialogue Structure (Task-Oriented Dialog, Newspaper Texts)
Marilyn Walker. CS245. April 1st, 2010
Dialogue Processing (circa 1988)
Grosz & Sidner 1986: Planning, Grice
Mann & Thompson 1988: Rhetorical Relations, Text Structure
Polanyi 1984: Linguistic Discourse Model
Hobbs 1979: Coherence Relations
Dialogue Processing (circa 1988)
Me, 1989: starting my Ph.D. with Aravind Joshi and Ellen Prince
Science IS NOT a belief system
=> Empirical Methods in Discourse
Empirical/Statistical Approaches in NLP
Penn Treebank first available ~1990
Plenty of data for parsing and POS tagging
But what about language behavior above the sentence? What about interactive language?
1993: NSF Workshop on Centering in Naturally Occurring Discourse => Walker, Joshi & Prince 1997
1995: AAAI Workshop on Empirical Methods in Discourse => Walker & Moore CL special issue
1996: NSF Workshop on Discourse & Dialogue Tagging => DAMSL markup
NOW: there is virtually no work in NLP on discourse and dialogue that is not corpus based/empirical.
What is a dialogue model?
A model is an abstraction of a thing, simplified or dimensionally reduced.
A good model should be simpler but capture the essence of the real thing.
A good dialogue model should be testable. It should make predictions, and its claims should be stated so that one can show whether they are correct.
A good dialogue model should lead to results that are more generalizable.
Dialogue Structure
What makes a text coherent?
What are discourse structures?
Theories of discourse structures
Approaches to build discourse structures
Discourse Coherence Example:
(1) John hid Bill’s car keys. (2) He was drunk.
(1) John hid Bill’s car keys. (2) He likes junk food.
(1) George Bush supports big business. (2) He’s sure to veto House Bill 1711.
Hearers try to find connections between utterances in a discourse.
The possible connections between utterances can be specified as a set of coherence relations.
Coherence relations (Hobbs, 1979)
Result: S0 causes S1. John bought an Acura. His father went ballistic.
Explanation: S1 causes S0. John hid Bill’s car keys. He was drunk.
Parallel: S0 and S1 are parallel. John bought an Acura. Bill bought a BMW.
Elaboration: S1 is an elaboration of S0. John bought an Acura this weekend. He purchased it for $40 thousand. …
Discourse structure
S1: John took a train to Bill’s car dealership.
S2: He needed to buy a car.
S3: The company he works for now isn’t near any public transportation.
S4: He also wanted to talk to Bill about their softball leagues.
Explanation: S2, S3
Parallel: [S2-S3], S4
Explanation: S1, [S2-S4]
Discourse parsing
Explanation (e1)
  S1 (e1)
  Parallel (e2; e4)
    Explanation (e2)
      S2 (e2)
      S3 (e3)
    S4 (e4)
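The parse on this slide can be written down as a small nested structure. The following is a minimal sketch, assuming a hypothetical `Node` class that is not taken from any parser or library; only the relation labels and sentence ids come from the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """A discourse tree node: a relation over children, or a leaf sentence."""
    label: str  # relation name, or sentence id for a leaf
    children: List["Node"] = field(default_factory=list)

    def leaves(self) -> List[str]:
        """Sentence ids in left-to-right order."""
        if not self.children:
            return [self.label]
        out: List[str] = []
        for child in self.children:
            out.extend(child.leaves())
        return out

    def bracketed(self) -> str:
        """Render the tree as Relation(child, child)."""
        if not self.children:
            return self.label
        inner = ", ".join(child.bracketed() for child in self.children)
        return f"{self.label}({inner})"


# The tree for S1-S4 as drawn on the slide.
tree = Node("Explanation", [
    Node("S1"),
    Node("Parallel", [
        Node("Explanation", [Node("S2"), Node("S3")]),
        Node("S4"),
    ]),
])

print(tree.bracketed())
```

Reading the leaves back in order recovers the original text order, which is the property a discourse parser has to preserve.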
Why compute discourse structure?
Natural language understanding
Summarization
Information retrieval
Natural language generation
Reference resolution
Theories of discourse structure
Mann and Thompson’s Rhetorical Structure Theory (1988)
Grosz and Sidner’s Attention, Intentions, and the Structure of Discourse (1986)
Discourse TAG (DTAG), Penn Discourse Treebank (PDTB)
We will read a lot of papers using DTAG and PDTB, so I am just going to talk about these ‘classic theories’ today.
Rhetorical structure theory (RST)
Mann and Thompson (1988)
One theory of discourse structure, based on identifying relations between parts of the text:
Defined 20+ rhetorical relations
Presentational relations: intentional
Subject matter relations: informational
Nucleus: central segment of text
Satellite: more peripheral segment
Relation definitions and more.
Presentational (intentional) relations
Those whose intended effect is to increase some inclination in the hearer.
Relations: Antithesis, Background, Concession, Enablement, Evidence, Justify, Motivation, Preparation, Restatement, Summary
Subject matter (informational) relations
Those whose intended effect is that the hearer recognize the relation in question.
Relations: Circumstance, Condition, Elaboration, Evaluation, Interpretation, Means, Non-volitional cause, Non-volitional result, Otherwise, Purpose, Solutionhood, Unconditional, Unless, Volitional cause, Volitional result
Multinuclear relations
Contrast, Joint, List, Multinuclear restatement, Sequence
Some examples
Explanation: John went to the coffee shop. He was sleepy.
Elaboration: John likes coffee. He drinks it every day.
Contrast: John likes coffee. Mary hates it.
Discourse structure
John likes coffee.
He drinks it every day.
Mary hates coffee.
They argue a lot.
[Figure labels: elaboration, contrast, cause]
A relation: Evidence
(a) George Bush supports big business. (b) He’s sure to veto House Bill 1711.
Relation Name: Evidence
Constraints on Nucl: H might not believe Nucl to a degree satisfactory to S.
Constraints on Sat: H believes Sat or will find it credible.
Constraints on Nucl+Sat: H’s comprehending Sat increases H’s belief of Nucl.
Effect: H’s belief of Nucl is increased.
A relation: Volitional-Cause
(a) George Bush supports big business. (b) He’s sure to veto House Bill 1711.
Relation Name: Volitional-Cause
Constraints on Nucl: presents a volitional action
Constraints on Sat: none.
Constraints on Nucl+Sat: Sat presents a situation that could have caused the agent of the volitional action in Nucl to perform the action.
Effect: H recognizes the situation presented in Sat as a cause for the volitional action presented in Nucl.
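The Evidence and Volitional-Cause definitions share the same fields, so they fit one record type. The following is a rough sketch, assuming a hypothetical `RelationDefinition` class and helper; the field contents are copied from the slides, and the `kind` values follow the presentational and subject matter lists given earlier in the deck.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RelationDefinition:
    """One RST relation definition (hypothetical container for illustration)."""
    name: str
    kind: str  # "presentational" or "subject matter"
    nucleus_constraint: str
    satellite_constraint: str
    combined_constraint: str
    effect: str


EVIDENCE = RelationDefinition(
    name="Evidence",
    kind="presentational",
    nucleus_constraint="H might not believe Nucl to a degree satisfactory to S.",
    satellite_constraint="H believes Sat or will find it credible.",
    combined_constraint="H's comprehending Sat increases H's belief of Nucl.",
    effect="H's belief of Nucl is increased.",
)

VOLITIONAL_CAUSE = RelationDefinition(
    name="Volitional-Cause",
    kind="subject matter",
    nucleus_constraint="Presents a volitional action.",
    satellite_constraint="None.",
    combined_constraint=(
        "Sat presents a situation that could have caused the agent of the "
        "volitional action in Nucl to perform the action."
    ),
    effect=(
        "H recognizes the situation presented in Sat as a cause for the "
        "volitional action presented in Nucl."
    ),
)


def names_of_kind(definitions: List[RelationDefinition], kind: str) -> List[str]:
    """Names of the definitions belonging to one family of relations."""
    return [d.name for d in definitions if d.kind == kind]


print(names_of_kind([EVIDENCE, VOLITIONAL_CAUSE], "presentational"))
```

Both definitions are offered for the same pair of sentences (a) and (b), which is why the record carries the family in `kind`: the two analyses differ in level, not in the text they cover.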
Another example
S: (a) Come home by 5:00. (b) Then we can go to the hardware store before it closes. (c) That way we can finish the bookshelves tonight.
[Figure: RST trees over (a), (b), (c), with relations labeled motivation and condition]
A Problem with RST (Moore & Pollack, 1992)
How many rhetorical relations are there?
How can we use RST in dialogues?
How do we incorporate speaker intentions into RST?
RST does not allow for multiple relations between parts of a discourse: informational and intentional levels must coexist.
Grosz & Sidner (1986)
Grosz and Sidner (1986)
Three components: linguistic structure, intentional structure, attentional state
Linguistic structure
The structure of the sequence of utterances that comprises a discourse.
Utterances form Discourse Segments (DSs), and a discourse is made up of embedded DSs.
What exactly is a DS?
Any evidence that humans naturally recognize segment boundaries?
Do humans agree on segment boundaries?
How to find the boundaries automatically?
Intentional structure
Speakers in a discourse may have many intentions: public or private.
Discourse purpose (DP): the intention that underlies engaging in a discourse.
Discourse segment purpose (DSP): the purpose of a DS, i.e. how this segment contributes to achieving the overall DP.
Two relations between DSPs:
Dominance: if DSP1 contributes to DSP2, we say DSP2 dominates DSP1.
Satisfaction-precedence: DSP1 must be satisfied before DSP2.
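The two relations can be stored as plain pairs of DSP names. The following is a minimal sketch, assuming hypothetical DSP names and helper functions, and assuming that dominance is treated as transitive; it uses the standard-library `graphlib` module (Python 3.9+) to derive an order that respects satisfaction-precedence.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set, Tuple

# (dominating DSP, dominated DSP): the second contributes to the first.
DOMINANCE: Set[Tuple[str, str]] = {("DSP0", "DSP1"), ("DSP1", "DSP2")}

# (earlier DSP, later DSP): the first must be satisfied before the second.
SATISFACTION_PRECEDENCE: List[Tuple[str, str]] = [
    ("DSP1", "DSP2"),
    ("DSP2", "DSP3"),
]


def all_dominated(top: str, dominance: Set[Tuple[str, str]]) -> Set[str]:
    """Every DSP that `top` dominates, directly or through other DSPs."""
    found: Set[str] = set()
    frontier = [top]
    while frontier:
        current = frontier.pop()
        for parent, child in dominance:
            if parent == current and child not in found:
                found.add(child)
                frontier.append(child)
    return found


def satisfaction_order(pairs: List[Tuple[str, str]]) -> List[str]:
    """One order of DSPs in which every 'earlier' comes before its 'later'."""
    predecessors: Dict[str, Set[str]] = {}
    for earlier, later in pairs:
        predecessors.setdefault(later, set()).add(earlier)
        predecessors.setdefault(earlier, set())
    return list(TopologicalSorter(predecessors).static_order())


print(all_dominated("DSP0", DOMINANCE))
print(satisfaction_order(SATISFACTION_PRECEDENCE))
```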
Attentional State
The attentional state is an abstraction of the participants’ focus of attention as their discourse unfolds.
The state is a stack of focus spaces.
A focus space (FS) is associated with a DS, and it contains the DSP and the objects, properties, and relations salient in the DS.
When a DS ends, its FS is popped.
When a DS starts, its FS is pushed onto the stack.
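The push and pop behavior can be sketched directly as a stack. This is an illustration only, assuming hypothetical class and method names; the segment names and purposes are borrowed from the flight-booking example used elsewhere in these slides.

```python
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class FocusSpace:
    segment: str   # the DS this space belongs to
    purpose: str   # the DSP of that segment
    salient: Set[str] = field(default_factory=set)


class AttentionalState:
    """A stack of focus spaces, pushed and popped as segments open and close."""

    def __init__(self) -> None:
        self._stack: List[FocusSpace] = []

    def start_segment(self, segment: str, purpose: str) -> None:
        self._stack.append(FocusSpace(segment, purpose))

    def end_segment(self) -> FocusSpace:
        return self._stack.pop()

    def mention(self, entity: str) -> None:
        """Mark an entity as salient in the current (top) focus space."""
        self._stack[-1].salient.add(entity)

    def accessible(self) -> List[str]:
        """Salient entities, most recently opened focus space first."""
        seen: List[str] = []
        for space in reversed(self._stack):
            for entity in sorted(space.salient):
                if entity not in seen:
                    seen.append(entity)
        return seen


state = AttentionalState()
state.start_segment("DS0", "C wants A to find a flight for C")
state.mention("flight")
state.start_segment("DS3", "A wants to know the destination")
state.mention("Seattle")
inside = state.accessible()   # while DS3 is open
state.end_segment()
after = state.accessible()    # once DS3 has been popped
print(inside, after)
```

Once DS3 is popped, entities that were salient only in DS3 are no longer on the stack, which is the prediction the stack model makes about what can be referred to with a pronoun.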
An example
C1: I need to travel in May.
A1: And, what day in May do you want to travel?
C2: I need to be there for a meeting on the 15th.
A2: And you are flying into what city?
C3: Seattle.
A3: And what time would you like to leave Pittsburgh?
C4: Hmm. I don’t think there are many options for non-stop.
A4: There are three non-stops today.
C5: What are they?
….
[Figure: segments DS1 to DS5 nested inside DS0]
Discourse structure with intention info
I0: C wants A to find a flight for C
I1: C wants A to know that C is traveling in May.
I2: A wants to know the departure date etc.
I3: A wants to know the destination
I4: A wants to know the departure time
I5: C wants A to find a nonstop flight
DS0
  DS1: C1
  DS2: A1-C2
  DS3: A2-C3
  DS4: A3
  DS5: C4-C7
Problems with G&S 1986
Assume that discourses are task-oriented
Assume there is a single, hierarchical structure shared by speaker and hearer
Do people really build such structures when they speak?
Do they use them in interpreting what others say?
Walker 1996: Limited Attention & Discourse Structure
LIMITED ATTENTION CONSTRAINT (Walker 1993, 1996)
ellipsis interpretation
pronominal anaphora interpretation
inference of discourse relations between utterances A and B
B MOTIVATES A
B is EVIDENCE for A
How is attention modeled?
Linear Recency
Hierarchical Recency
Centering
Centering is formulated as a theory that relates focus of attention, choice of referring expression, and perceived coherence of utterances, within a discourse segment [Grosz et al., 1995].
Brennan, Walker & Pollard 1987: Centering theory of Anaphora Resolution
What about Processing & Centering?
Informationally Redundant Utterances
Centers cross segments
Centers are continued over discourse segment boundaries with pronominal referring expressions whose form is identical to those that occur within a discourse segment.
(29) and he's going to take a pear or two, and then.. go on his way
(30) um but the little boy comes,
(31) and uh he doesn't want just a pear,
(32) he wants a whole basket.
(33) So he puts the bicycle down,
(34) and he..
[Pear Stories, Chafe, 1980; Passonneau, 1995]
=> discourse segment boundary between (32) and (33) [Passonneau, 1995; Passonneau & Litman, 1997]
Following [Walker et al., 1998], (33) realizes a CONTINUE transition, indicating that utterance (33) is highly coherent in the context of utterance (32).
Why is centering only within a segment?
It is not plausible that a different process than centering would be required to explain the relationship between utterances (32) and (33), simply because these utterances span a discourse segment boundary.
Centering is a theory that relates focus of attention, choice of referring expression, and perceived coherence of utterances, within a discourse segment [Joshi & Weinstein, 1983; Grosz, Joshi & Weinstein, 1995].
Cache Model (Human Working Memory)
Building discourse structure
Tasks
Identify units, e.g. discourse segment boundaries
Determine relations between segments
Determine intentions of the segments
Determine the attentional state
Methods:
Inference-based approach: symbolic
Cue-based approach: statistical
Inference-based approach
Ex: John hid Bill’s car keys. He was drunk.
X is drunk => people do not want X to drive
People don’t want X to drive => people hide X’s car keys
Abduction:
AI-complete: requires and uses world knowledge.
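The backward reasoning in this example can be sketched as a walk from an observation to the premises that would explain it. This is a deliberately tiny illustration, assuming a hypothetical rule table with one antecedent per consequent; real abductive inference weighs competing explanations and draws on far more world knowledge.

```python
from typing import Dict, List

# consequent -> antecedent, read backwards from the two rules on the slide.
RULES: Dict[str, str] = {
    "people hide X's car keys": "people do not want X to drive",
    "people do not want X to drive": "X is drunk",
}


def explain(observation: str, rules: Dict[str, str]) -> List[str]:
    """Chain backwards from an observation to a candidate explanation."""
    chain = [observation]
    # Stop when no rule concludes the current fact, or when a rule would loop.
    while chain[-1] in rules and rules[chain[-1]] not in chain:
        chain.append(rules[chain[-1]])
    return chain


chain = explain("people hide X's car keys", RULES)
print(" <= ".join(chain))
```

The last element of the chain, "X is drunk", is the hypothesis that makes the second sentence an Explanation of the first.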
Cue-based approach
Attentional state:
Attentional changes: (push) now, next, but, …. (pop) anyway, in any case, now back to, ok, fine, ...
True interruption: excuse me, I must interrupt
Flashback: oops, I forgot
Intention:
Satisfaction-precedes: first, second, furthermore, ….
Dominance: for example, first, second, ….
Cues (cont)
Linguistic structure
Elaboration: for example, …
Concession: although
Condition: if
Sequence: and, first, second.
Contrast: and, …
…
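A cue table like this one is easy to turn into a lookup. The following is a rough sketch, assuming a hypothetical `relations_for` helper that only checks the start of an utterance; the cue-to-relation pairs are the ones listed on the slide, and the ambiguity of "and" is kept rather than resolved.

```python
import re
from typing import Dict, List

# Cue phrase -> candidate relations, as listed on the slide.
CUE_RELATIONS: Dict[str, List[str]] = {
    "for example": ["Elaboration"],
    "although": ["Concession"],
    "if": ["Condition"],
    "first": ["Sequence"],
    "second": ["Sequence"],
    "and": ["Sequence", "Contrast"],
}


def relations_for(utterance: str) -> List[str]:
    """Candidate relations signalled by a cue at the start of the utterance."""
    text = utterance.strip().lower()
    # Try longer cues first so multi-word cues are not shadowed by short ones.
    for cue in sorted(CUE_RELATIONS, key=len, reverse=True):
        if re.match(re.escape(cue) + r"\b", text):
            return CUE_RELATIONS[cue]
    return []


print(relations_for("Although it rained, we went out."))
print(relations_for("And Mary hates it."))
```

A statistical cue-based system would treat these matches as features for a classifier rather than as final answers, since a cue such as "and" is compatible with more than one relation.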
One example
(Marcu 1999): Train a parser on a discourse treebank.
90 trees, hand-annotated for rhetorical relations (RR)
Learn to identify elementary discourse units (EDUs)
Learn to identify N, S, and their relation.
Features: WordNet-based similarity, lexical, structural, …
Results
Identify units (Elementary DUs): 96%-98% accuracy
Identify hierarchical structures (2 EDUs are related): Recall=71%, Precision=84%
Identify nucleus/satellite labels: Rec=58%, Prec=69%
Identify rhetorical relation: Rec=38%, Prec=45%
Hierarchical structure is easier to identify than rhetorical relations.
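To compare the three tasks on a single number, precision and recall can be combined into an F1 score (their harmonic mean). F1 is not reported on the slide; the sketch below derives it from the slide's precision and recall figures, and the task labels are shorthand.

```python
from typing import Dict, Tuple


def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) as reported on the slide.
RESULTS: Dict[str, Tuple[float, float]] = {
    "hierarchical structure": (0.84, 0.71),
    "nucleus/satellite labels": (0.69, 0.58),
    "rhetorical relation": (0.45, 0.38),
}

for task, (p, r) in RESULTS.items():
    print(f"{task}: F1 = {f1(p, r):.2f}")
```

The derived scores fall from about 0.77 to 0.63 to 0.41, which restates the slide's conclusion: each added layer of labeling is harder than the bare structure.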
Discourse Representation Theory
Informational Components.
Data
Participants
Beliefs
Common ground
Intentions
Formal Representations
Formal representation of informational components
Typed feature structures
Lists
Sets
Propositions
First order logic
Dialog Moves
Trigger the update of the information state
Grammatical triggers
External events
Update Rules
Govern information state updates
Sometimes incorporate domain knowledge
Sometimes govern behavior of dialog moves
Control Strategy
Decide which update rule applies
Simple priority list
Game theory
Utility theory
Statistical methods
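The pieces named on the last few slides (information state, dialog moves, update rules, control strategy) fit together roughly as follows. This is a minimal sketch of the simplest option, a priority list; the rule names, the move format `inform:destination=...`, and the dictionary-based information state are all assumptions made for the illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

InfoState = Dict[str, str]


@dataclass
class UpdateRule:
    name: str
    applies: Callable[[InfoState, str], bool]   # precondition on state and move
    update: Callable[[InfoState, str], None]    # effect on the information state


def set_destination(state: InfoState, move: str) -> None:
    state["destination"] = move.split("=", 1)[1]


def note_unhandled(state: InfoState, move: str) -> None:
    state["last_unhandled"] = move


# Priority list: earlier rules win over later ones.
RULES: List[UpdateRule] = [
    UpdateRule("destination",
               lambda s, m: m.startswith("inform:destination="),
               set_destination),
    UpdateRule("fallback", lambda s, m: True, note_unhandled),
]


def apply_move(rules: List[UpdateRule], state: InfoState, move: str) -> Optional[str]:
    """Control strategy: fire the first rule whose precondition holds."""
    for rule in rules:
        if rule.applies(state, move):
            rule.update(state, move)
            return rule.name
    return None


state: InfoState = {}
fired = apply_move(RULES, state, "inform:destination=Seattle")
print(fired, state)
```

Swapping `apply_move` for a utility-based or statistical chooser changes the control strategy without touching the rules themselves.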
Also for Dialogue Systems…
Dialog Theories
Finite State Dialog Models
Plan-based Models
Finite State Dialog Models
Information state is a state in the FSM
Dialog moves are inputs matching transitions
Update Rules are FSM lookups and transitions
Control Strategy is static, the FSM itself
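A finite-state dialog model reduces to a transition table. The sketch below is illustrative only: the state names and move labels are hypothetical, loosely following the flight-booking questions used earlier in the slides.

```python
from typing import Dict, Iterable, Tuple

# (current state, dialog move) -> next state
TRANSITIONS: Dict[Tuple[str, str], str] = {
    ("ask_date", "date"): "ask_city",
    ("ask_city", "city"): "ask_time",
    ("ask_time", "time"): "done",
}


def step(state: str, move: str) -> str:
    """Update rule: look up the transition; stay put on an unexpected move."""
    return TRANSITIONS.get((state, move), state)


def run(moves: Iterable[str], start: str = "ask_date") -> str:
    """Feed a sequence of dialog moves through the machine."""
    state = start
    for move in moves:
        state = step(state, move)
    return state


print(run(["date", "city", "time"]))
print(run(["city"]))
```

Because the table is the whole control strategy, a user who answers out of order (giving the city first) makes no progress, which illustrates the rigidity of this class of models.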
Plan-based Models
Information state is the modeled beliefs, desires, and intentions of the participants
Dialog moves are speech acts, e.g. request and inform
Update rules are cognitive rules of evidence
Control Strategies are classic AI plan-based strategies
What is a discourse relation? (Joshi, Prasad, Webber, Coling/ACL Tutorial 1996)
The meaning and coherence of a discourse result partly from how its constituents relate to each other.
Reference relations
Discourse relations
[Figure: Discourse Coherence, with Reference Relations and Discourse Relations; Discourse Relations: Informational, Intentional]
Why Discourse Relations?
Informational discourse relations convey relations that hold in the subject matter.
Intentional discourse relations specify how intended discourse effects relate to each other.
[Moore & Pollack, 1992] argue that discourse analysis requires both types.
RST informational or semantic relations (e.g., CONTRAST, CAUSE, CONDITIONAL, TEMPORAL, etc.) hold between abstract entities of appropriate sorts (e.g., facts, beliefs, eventualities, etc.), commonly called Abstract Objects (AOs) [Asher, 1993].
Why Discourse Relations?
Discourse relations provide a level of description that is
theoretically interesting, linking sentences (clauses) and discourse;
identifiable more or less reliably on a sufficiently large scale;
capable of supporting a level of inference potentially relevant to many NLP applications.
How are Discourse Relations declared?
Broadly, there are two ways of specifying discourse relations:
Abstract specification
Relations between two given Abstract Objects are always inferred, and declared by choosing from a pre-defined set of abstract categories.
Lexical elements can serve as partial, ambiguous evidence for inference.
Lexically grounded
Relations can be grounded in lexical elements.
Where lexical elements are absent, relations may be inferred.
Rhetorical Structure Theory (RST)
RST [Mann & Thompson, 1988] associates discourse relations with discourse structure (TEXT).
Discourse structure reflects context-free rules called schemas.
Applied to a text, schemas define a tree structure in which:
• Each leaf is an elementary discourse unit (a continuous text span);
• Each non-terminal covers a contiguous, non-overlapping text span;
• The root projects to a complete, non-overlapping cover of the text;
• Discourse relations (aka rhetorical relations) hold only between daughters of the same non-terminal node.
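The tree conditions above can be checked mechanically. The following is a minimal sketch, assuming a hypothetical `Span` class that records which elementary discourse units (EDUs) each node covers by index; it checks contiguity, non-overlap, and full coverage, and leaves out the relation labels.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Span:
    start: int  # index of the first EDU covered (inclusive)
    end: int    # index of the last EDU covered (inclusive)
    children: List["Span"] = field(default_factory=list)


def well_formed(node: Span) -> bool:
    """True if every non-terminal is tiled exactly by its daughters."""
    if not node.children:
        return node.start == node.end  # a leaf is a single EDU
    expected = node.start
    for child in node.children:
        # Each daughter must begin where the previous one ended: no gap, no overlap.
        if child.start != expected or not well_formed(child):
            return False
        expected = child.end + 1
    return expected == node.end + 1


def covers_text(root: Span, n_edus: int) -> bool:
    """True if the root projects to a complete cover of a text of n_edus units."""
    return root.start == 0 and root.end == n_edus - 1 and well_formed(root)


good = Span(0, 2, [Span(0, 0), Span(1, 2, [Span(1, 1), Span(2, 2)])])
gap = Span(0, 2, [Span(0, 0), Span(2, 2)])
print(covers_text(good, 3), covers_text(gap, 3))
```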
Types of Schemas in RST
RST schemas differ with respect to:
what rhetorical relation, if any, holds between right-hand side (RHS) sisters;
whether or not the RHS has a head (called a nucleus);
whether the schema has binary, ternary, or arbitrary branching.
RST schema types in RST annotation
Moore & Pollack 1992
Example 1
(a) George Bush supports big business. SATELLITE
(b) He's sure to veto House Bill 1711. NUCLEUS
Relation name: EVIDENCE (MT 1987)
Evidence is a “presentational relation”
Constraints on Nucleus: H might not believe Nucleus to a degree satisfactory to S.
Constraints on Satellite: H believes Satellite or will find it credible.
Constraints on Nucleus + Satellite combination: H's comprehending Satellite increases H's belief of Nucleus.
Effect: H's belief of Nucleus is increased
Moore & Pollack 1992 Example 1
(a) George Bush supports big business. (b) He's sure to veto House Bill 1711.
Relation name: VOLITIONAL-CAUSE
Volitional Cause is a “subject matter” relation
Constraints on Nucleus: presents a volitional action or situation that could have arisen from a volitional action.
Constraints on Satellite: none.
Constraints on Nucleus + Satellite combination: Satellite presents a situation that could have caused the agent of the volitional action in Nucleus to perform that action; without the presentation of Satellite, H might not regard the action as motivated or know the particular motivation; Nucleus is more central to S's purposes in putting forth the Nucleus-Satellite combination than Satellite is.
Effect: H recognizes the situation presented in Satellite as a cause for the volitional action presented in Nucleus.
Moore & Pollack 1992
Presentational relations == Speaker intention
Speaker always has an INTENTION
But informational (subject matter) relations are also necessary to understand the discourse
Multiple levels of analysis are simultaneously available