Kees van Deemter Matthew Stone Formal Issues in Natural Language Generation Lecture 1 Overview + Reiter & Dale 1997


Page 1: Kees van Deemter Matthew Stone Formal Issues in Natural Language Generation Lecture 1 Overview + Reiter & Dale 1997

Kees van Deemter
Matthew Stone

Formal Issues in Natural Language Generation

Lecture 1: Overview + Reiter & Dale 1997

Page 2:

We are Kees van Deemter…

• Principal Research Fellow, ITRI, University of Brighton, UK

• PhD, University of Amsterdam, 1991
• Research interests:
  - Formal semantics of Natural Language (underspecification, anaphora, prosody)
  - Generation of text, documents, speech

Page 3:

…and Matthew Stone

• Assistant Professor, Comp. Sci. & Cog. Sci., Rutgers University, USA
• PhD, University of Pennsylvania, 1998
• Research interests:
  - Knowledge representation (action, knowledge, planning, inference)
  - Task-oriented spoken language

Page 4:

Our Question This Week

What are the possible ways of using knowledge (of the world and of language) in formulating an utterance?

Page 5:

Knowledge in utterances

Knowledge of the world:
Utterance says something useful and reliable.

Knowledge of language:
Utterance is natural and concise; in other words, it fits hearer and context.

Page 6:

A Concrete Example

Our partner is working with equipment that looks like:

[Diagram: a handle that rotates between an OPEN and a LOCKED position]

The instruction that we’d like to give them is:

Turn handle to locked position.

Page 7:

Knowledge in this utterance

Knowledge of the world:
Utterance says something useful and reliable.

[Diagram: handle turning from OPEN to LOCKED]

This is what has to happen next.

Page 8:

Knowledge in this utterance

Knowledge of language:
Utterance is natural and concise. Consider the alternatives…

[Diagram: handle with OPEN and LOCKED positions]

Move the thing around.

Page 9:

Knowledge in this utterance

Knowledge of language:
Utterance is natural and concise. Consider the alternatives…

[Diagram: handle with OPEN and LOCKED positions]

You ought to readjust the fuel-line access panel handle by pulling clockwise 48 degrees until the latch catches.

Page 10:

Our Question This Week

What are the possible ways of using knowledge (of the world and of language) in formulating an utterance?

This is a formal question; the answers will depend on the logics behind grammatical information and real-world inference.

Page 11:

The NLG problem depends on the input to the system

If the input looked like this:

INPUT: turn’(handle’, locked’)

Deriving the output would be easy:

OUTPUT: Turn handle to locked position.
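Under that optimistic assumption, realization could indeed be close to trivial. A minimal sketch (the template table and names here are invented for illustration, not any system's actual machinery):

```python
# Hypothetical sketch: if the input were already a logical form like
# turn'(handle', locked'), realization could be a simple template lookup.

TEMPLATES = {
    "turn": "Turn {arg1} to {arg2} position.",
}

def realize(predicate, *args):
    """Fill a canned template for the predicate with its arguments."""
    return TEMPLATES[predicate].format(arg1=args[0], arg2=args[1])

print(realize("turn", "handle", "locked"))  # Turn handle to locked position.
```

The point of the following slides is precisely that real inputs do not look like this.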

Page 12:

Real conceptual input is richer and organized differently.

It must support correct, useful domain reasoning:
e.g., characterizing the evident function of equipment
e.g., simulating/animating the intended action

[Diagram: handle with OPEN and LOCKED positions]

Page 13:

Difference in Content

Input: New info complete, separate from old

Output: New info cut down, mixed with old

Turn handle to locked position.

Page 14:

Difference in Organization

Input: deictic representation for objects, through atomic symbols that index a flat database

handle(object388).
number(object388, “16A46164-1”).
composedOf(object388, steel).
color(object388, black).
goal(object388, activity116).
partOf(object388, object486).

Output: descriptions for objects, through complex, hierarchical structures

[NP [DET the] [N’ [N handle]]]
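The contrast can be made concrete in code. In this sketch (all structures invented for illustration), the input side is a flat set of atomic facts keyed by object symbol, while the output side is a small hierarchical phrase structure:

```python
# Input side: flat, unordered atomic facts about object388.
facts = {
    ("handle", "object388"),
    ("composedOf", "object388", "steel"),
    ("color", "object388", "black"),
    ("partOf", "object388", "object486"),
}

# Output side: a nested (category, children) tree for "the handle".
np = ("NP", [("DET", "the"), ("N'", [("N", "handle")])])

def leaves(tree):
    """Read the surface string off the phrase-structure tree."""
    cat, kids = tree
    if isinstance(kids, str):
        return [kids]
    return [word for kid in kids for word in leaves(kid)]

print(" ".join(leaves(np)))  # the handle
```

Nothing in the flat fact set dictates the shape of the tree; bridging the two is the generator's job.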

Page 15:

Formal problem

NLG means applying input domain knowledge that looks quite different from output language!

Page 16:

Formal problem

How can we characterize these different sources of information in a common framework, as part of a coherent model of language use?

For example: how can we represent linguistic distinctions that make choices in NLG good or bad?

Page 17:

Our Question This Week

What are the possible ways of using knowledge (of the world and of language) in formulating an utterance?

This is not just a mathematical question –

This is a computational question; possible ways of using knowledge will be algorithms.

Page 18:

No Simple Strategy to Resolve Differences

Lots of variability in natural instructions

Lift assembly at hinge.
Disconnect cable from receptacle.
Rotate assembly downward.
Slide sleeve onto tube.
Push in on poppet.

Page 19:

Strategy Must Decide When to Say More

Using utterance interpretation as a whole

Turn handle by hand-grip from current open position for handle 48 degrees clockwise to locked position for handle.

[Diagram: handle with OPEN and LOCKED positions]

Page 20:

In particular, hearer matches shared initial state

So describe objects and places succinctly

Turn handle by hand-grip from current open position for handle 48 degrees clockwise to locked position for handle.

[Diagram: handle with OPEN and LOCKED positions]

Page 21:

In particular, hearer applies knowledge of the domain

So omit inevitable features of action

Turn handle by hand-grip from current open position for handle 48 degrees clockwise to locked position for handle.

[Diagram: handle with OPEN and LOCKED positions]

Page 22:

Computational problem

Because this process is so complex, it takes special effort to specify the process, to make it effective, and to ensure that it works appropriately.

For example: How much search is necessary? How can we control search when search is required? What will such a system be able to say?

Page 23:

OK, so generation is hard

Is this worth doing? Why does it matter?

NLG systems also have practical difficulties:
• Actually getting domain information.
• Successful software engineering.
• Knowing when you’ve completed them.

These are not to be underestimated – they’ve been the focus of most NLG research.

Page 24:

Why does this matter?

Formally precise work in NLG is starting, with motivations in computer science, cognitive science and linguistics.

Particularly challenging because generation has received much less attention than understanding.

Page 25:

Computer Science Motivations

Dialogue systems are coming.

Spoken language interfaces are standard for simple phone applications; they will soon be the norm for customer service, information, etc.

Generation is an important bottleneck.
After speech recognizer performance, the biggest impediment to user satisfaction is the lack of concise, natural, context-dependent responses. (More Wednesday.)

Page 26:

Computer Science Motivations

• On-line help can be enhanced when help messages are generated. (Assumption: there are too many to be stored separately.)

• Multilingual document authoring can be an efficient alternative to MT. (Different texts can be generated from the same input.)

Slightly less obvious:

• Sophisticated text summarization requires NLG.

• Some approaches to MT require NLG.

Page 27:

Computer Science Motivations

Better formal understanding of NLG promises:
• More flexible architectures
• More robust and natural behavior
• Less labor-intensive programming methods

In short, NLG systems that work better and are easier to build.

Page 28:

Motivations in Cognitive Science

Want to know how language structure supports language use.
– Different representations make different processing problems.

Page 29:

Motivations in Cognitive Science

Different representations make different processing problems.

E.g., incremental interpretation: how do you understand an incomplete sentence –

John cooked…

Page 30:

Motivations in Cognitive Science

E.g., incremental interpretation: how do you understand an incomplete sentence –

John cooked…

CFG: “compilation” – metalevel reasoning that abstracts meaning in common to many derivations.

CCG: just use the grammar itself – John cooked parses as S/NP

Page 31:

Motivations in Cognitive Science

Want to know how language structure supports language use.
– Different representations make different processing problems.

What do we learn when we think of speaking as well as understanding?

Page 32:

Linguistic Motivations

Generation brings an exacting standard for theoretical precision.

Grice says “be brief.” For NLG we must say precisely how. (More Tuesday.)

More generally, is a model of the meaning of forms enough to explain when people use them?

If not, we need to say more. (Or maybe this whole meaning stuff is wrongheaded!)

Page 33:

Our Ulterior Motive

• Explain why NLG is theoretically interesting (formally, computationally, and linguistically).

• Get more people working on NLG.

Page 34:

Our strategy this week:

• Zoom in on Generation of Referring Expressions (henceforth: GRE)

• Suggest that rest of NLG is equally interesting

Page 35:

What is GRE? (very briefly)

• Given: an object to refer to (the target object).
• Given: properties of all objects.
• (Given: a context.)
• Task: identify the target object.
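As a preview of the kind of algorithm this week builds toward, here is a much-simplified greedy sketch in the spirit of Dale & Reiter's incremental algorithm; the toy domain, property names, and preference order are all invented for illustration:

```python
def make_referring_expression(target, domain, properties, preference):
    """Greedily add properties that rule out distractors until only the
    target remains (a simplified incremental-algorithm sketch)."""
    distractors = {obj for obj in domain if obj != target}
    description = []
    for prop in preference:                      # fixed preference order
        value = properties[target].get(prop)
        if value is None:
            continue
        ruled_out = {obj for obj in distractors
                     if properties[obj].get(prop) != value}
        if ruled_out:                            # property does some work
            description.append((prop, value))
        distractors -= ruled_out
        if not distractors:                      # target uniquely identified
            return description
    return None  # no distinguishing description exists

# Toy domain: two handles and a lever.
props = {
    "h1": {"type": "handle", "color": "black"},
    "h2": {"type": "handle", "color": "red"},
    "l1": {"type": "lever", "color": "black"},
}
print(make_referring_expression("h1", props, props, ["type", "color"]))
# [('type', 'handle'), ('color', 'black')]  -> "the black handle"
```

Tuesday's lecture examines this family of algorithms, and their trade-offs, in detail.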

Page 36:

Our Plan

Today: Overview of generation.
Tuesday: Case study in generation: GRE.
Wednesday: Pragmatics in GRE.
Thursday: Semantics in GRE.
Friday: Syntax in GRE.
(NOTE: Relatively complementary lectures.)

Of course it’s backwards: it’s generation!

Page 37:

Rest of Today

A discussion of how generation systems usually work

(after Reiter & Dale, 1997)

Page 38:

In practice, NLG systems work the way we can build them.

They solve a specific, carefully-delineated task.

They can verbalize only specific knowledge.
They can verbalize it only in specific, often quite stereotyped ways.

Page 39:

In practice, NLG systems work the way we can build them.

That means starting with the available input and the desired output, and putting together something that maps from one to the other.

Any linguistics is a bonus.Any formal analysis of computation is a bonus.

Page 40:

Input can come from …

• An existing database (e.g., tables), whose format facilitates update, etc.

• An interface that allows a user to specify it (e.g., by selecting from menus)

• Language interpretation

Page 41:

For Example

Input: Rail schedule database. Current train status.

User query: When is the next train to Glasgow?

Output:There are 20 trains each day from Aberdeen to Glasgow. The next train is the Caledonian express; it leaves Aberdeen at 10am. It is due to arrive in Glasgow at 1pm, but arrival may be slightly delayed.

Page 42:

To get from input to output means selecting and organizing information

The selection and organization typically happen in a cascade of processes that use special data structures or representations.

Each makes explicit a degree of selection and organization that the system is committed to.
Indirectly, each indicates the degree of selection and organization the system has still to create.

Page 43:

The NLG Pipeline

Goals
  ↓  Text Planning
Text Plans
  ↓  Sentence Planning
Sentence Plans
  ↓  Linguistic Realization
Surface Text
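The pipeline can be sketched as three functions composed in sequence. All the intermediate structures below are invented stand-ins, not the representations of any particular system:

```python
def text_planning(goal):
    # goal -> text plan: which messages to convey, in what rhetorical structure
    return ("ELABORATION", ["IDENTITY", "DEPARTURE"])

def sentence_planning(text_plan):
    # text plan -> sentence plans: lexical elements and relations per sentence
    relation, messages = text_plan
    return [{"verb": "be", "args": msg} for msg in messages]

def linguistic_realization(sentence_plans):
    # sentence plans -> surface text (placeholder strings stand in for grammar)
    return " ".join(f"<sentence for {plan['args']}>" for plan in sentence_plans)

surface = linguistic_realization(sentence_planning(text_planning("inform")))
```

Each stage consumes exactly the representation the previous stage committed to, which is what makes the pipeline both modular and, as later slides note, sometimes restrictive.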

Page 44:

Overview of Processes and Representations, 1

Goals
  ↓  Content Planning
Messages
  ↓  Discourse Planning
Text Plans

(Content Planning and Discourse Planning together make up Text Planning.)

Page 45:

Message

A message represents a piece of information that the text should convey, in domain terms.

Page 46:

Example Messages

message-id: msg01
relation: IDENTITY
arguments:
  arg1: NEXT-TRAIN
  arg2: CALEDONIAN-EXPRESS

The next train is the Caledonian Express.

Page 47:

Example Messages

message-id: msg02
relation: DEPARTURE
arguments:
  entity: CALEDONIAN-EXPRESS
  location: ABERDEEN
  time: 1000

The Caledonian Express leaves Aberdeen at 10am.
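Transcribed into plain records (a hypothetical encoding, not the notation of any particular system), the two messages might look like:

```python
# msg01: the next train is identified with a particular service.
msg01 = {
    "message-id": "msg01",
    "relation": "IDENTITY",
    "arguments": {"arg1": "NEXT-TRAIN", "arg2": "CALEDONIAN-EXPRESS"},
}

# msg02: that service's departure details.
msg02 = {
    "message-id": "msg02",
    "relation": "DEPARTURE",
    "arguments": {"entity": "CALEDONIAN-EXPRESS",
                  "location": "ABERDEEN",
                  "time": "1000"},
}
```

Note that everything here is in domain terms (stations, services, 24-hour times); no linguistic decisions have been made yet.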

Page 48:

A close variant

• Q: When is the next train to New Brunswick?

• A: It’s the 7:38 Trenton express.

I know something about the domain in this case – and can highlight how nonlinguistic the domain representation will be.

Page 49:

Variant message

message-id: msg03
relation: NEXT-SERVICE
arguments:
  station-stop: STATION-144
  train: TRAIN-3821

The next train to New Brunswick is the Trenton Local.

Page 50:

Closer to home

message-id: msg04
relation: DEPARTURE
arguments:
  origin: STATION-000
  train: TRAIN-3821
  time: 0738

It leaves Penn Station at 7:38.

Page 51:

How I got domain knowledge

NY Penn Station really is NJT Station 000, and New Brunswick really is Station 144 (you have to key these into ticket machines!).
This really is train #3821 (it’s listed with this number on the schedule!).

Page 52:

Text Plan

A text plan represents the argument that the text should convey; it is a hierarchical structure of interrelated messages.

Page 53:

Example Text Plan

NextTrainInformation

ELABORATION
 ├─ [IDENTITY]
 └─ [DEPARTURE]

Page 54:

Overview of Processes and Representations, 2

Text Plans
  ↓  Sentence Planning (Lexical Choice, Aggregation, Referring Expression Generation, …?)
Sentence Plans

Page 55:

Sentence Plans

A sentence plan makes explicit the lexical elements and relations that have to be realized in a sentence of the output text.

Page 56:

Example Sentence Plan

(S1 / be
  :subject (NEXT-SERVICE / it)
  :object (TRAIN-3821 / express
    :modifier Trenton
    :modifier 7:38
    :status definite))

It’s the 7:38 Trenton express.
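A toy realizer for a plan of roughly this shape makes the step to surface text concrete; the plan encoding and realization rules below are invented for illustration, not the actual formalism:

```python
# A dict stand-in for the sentence plan shown above.
plan = {
    "head": "be",
    "subject": {"sem": "NEXT-SERVICE", "form": "it"},
    "object": {"sem": "TRAIN-3821", "head": "express",
               "modifiers": ["7:38", "Trenton"], "status": "definite"},
}

def realize_np(np):
    """Determiner + modifiers + head noun, in order."""
    det = "the " if np.get("status") == "definite" else ""
    return det + " ".join(np["modifiers"] + [np["head"]])

def realize(plan):
    """Contract subject + 'be' and append the object NP."""
    return f"{plan['subject']['form'].capitalize()}'s {realize_np(plan['object'])}."

print(realize(plan))  # It's the 7:38 Trenton express.
```

Even this toy shows how much is already fixed by the plan: word choice, modifier order, and definiteness all arrive as commitments made upstream.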

Page 57:

We know what’s happened

Aggregation: we have constructed a single sentence that realizes two messages.

Once we have the first message:
It’s the Trenton express.

We just add 7:38 to realize the second message:
It’s the 7:38 Trenton express.
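This aggregation step can be sketched as a merge over sentence plans; the data shapes are invented for illustration. Rather than realizing "It's the Trenton express." and "It leaves at 7:38." separately, we fold the departure time into the train's description as an extra modifier:

```python
# Plan fragment for the IDENTITY message, and the DEPARTURE message to fold in.
identity = {"object": {"head": "express", "modifiers": ["Trenton"]}}
departure = {"time": "7:38"}

def aggregate(plan, msg):
    """Merge a departure-time message into a plan as a leading modifier."""
    merged = {"object": dict(plan["object"])}
    merged["object"]["modifiers"] = [msg["time"]] + plan["object"]["modifiers"]
    return merged

merged = aggregate(identity, departure)
print(merged["object"]["modifiers"])  # ['7:38', 'Trenton']
```

One sentence now carries both messages, which is exactly the conciseness that a naive message-per-sentence strategy would miss.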

Page 58:

We know what’s happened

Referring expression generation: we have figured out how to realize the next-service as it, and how to identify the train by its destination and frequency of stops.

Page 59:

We know what’s happened

Lexical (and grammatical) choice: to use the verb be with it as the subject and a reference to the train second; to say express rather than express train; to say Trenton rather than Northeast Corridor.

Page 60:

But there’s no consensus method for how to do it.

• Reiter (1994, survey of 5 NLG systems): Most practical systems follow a pipeline, even though this makes some things difficult to do. Example: avoidance of ambiguity.

• Cahill et al. (1999, survey of 18 NLG systems): Tasks like Aggregation and GRE can happen almost anywhere in the system, e.g.,
  - as early as Content Planning
  - as late as Sentence Realization

Page 61:

But there’s no consensus method for how to do it.

And we’ll see that formal and computational questions raise important difficulties for:
- what representations you can have
- what processes and algorithms you can use
- how you bring knowledge of language into the loop

Page 62:

Overview of Processes and Representations, 3

Sentence Plans
  ↓  Linguistic Realization
Surface Text

Page 63:

This is easier to think about

We all know what a surface text looks like!

And we all know you have to have a grammar (of some kind or other) to get one!

Page 64:

Concluding remarks

• Our overview has followed the ‘standard model’ of Reiter & Dale (1997).

• Even though the paper has an applied motivation, the formal problems we described earlier really come up.
  – Need linguistic representations that correspond to the domain and enable choices
  – Need good algorithms to put the correspondence to work

Page 65:

Next Time

GRE in particular:

• A microcosm of NLG, requiring choices of content and form.

• A proving ground for formal questions: how to formalize knowledge of language, characterize good communication, and design effective algorithms.

Page 66:

What to do

If you’ve read Reiter & Dale 1997: great!
This lecture probably made more sense to you.

But we won’t touch general issues till Friday,
so otherwise there’s no hurry to read Reiter & Dale 1997.

Next reading is Dale & Reiter 1995.
We’ll be covering the whole paper rather closely.