CS544: Lecture 6: Reference and Other Problems + Some Applications

Jerry R. Hobbs
USC/ISI
Marina del Rey, CA

February 23, 2010

Logical Form

Pat asked Chris to leave early.

Pat(x) & Past(e1) & ask’(e1,x,y,e2) & Chris(y) & leave’(e2,y) & early(e2)

Shared variables tie the predications together:

The x in Pat(x) and ask’(e1,x,y,e2) are the same.

The y in Chris(y) and ask’(e1,x,y,e2) are the same.

The y and e2 in leave’(e2,y) and ask’(e1,x,y,e2) are the same.

The e2 in leave’(e2,y) and early(e2) are the same.

Now what?

Is There Systematicity?

The basic unit of information is the predication:

p(x,y)

What is p?
  predicate strengthening

What are x and y?
  coreference

What’s the relation between p and x, p and y? In what way is it appropriate for p to describe x? y?
  metonymy, metaphor, ...

p(x,y) & q(y,z)

What’s the relation between these two predications?
  intraclausal coherence, discourse coherence
  (predicate strengthening on sentence adjacency)

Reference and Coreference

Language: ......... x ................. y .............

World: A

x refers to A; y refers to A; x and y corefer; y is coreferential with x

The more general expressions (pronouns, definite NPs) are called anaphoric expressions, or anaphora

Varieties of coreference:
  Pronouns
  Definite NPs
  Other anaphora, e.g. “other” anaphora
  Implicit arguments
  Many syntactic/attachment ambiguities

An Algorithm for Pronoun Resolution

The network system divides data into small blocks called packets, which it sends individually.

[Parse tree of the sentence, with nodes S, NP, VP, PP, and SBAR over the phrases “The network system”, “divides”, “data”, “into”, “small blocks”, “called”, “packets”, “which”, “it”, and “sends”]

From pronoun:
1. Skip reflexive level
2. Go up to next NP or S
3. Breadth-first search for candidate NPs
4. Rule out if selectional, number or gender conflict
5. Pick the first candidate

80-90% accuracy
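The five steps can be sketched on a toy parse tree. This is a simplified illustration, not the full algorithm: the `Node` class, the tree shape for the example sentence, and the `number` feature are assumptions made here, and only the number check from step 4 is implemented (gender and selectional checks are omitted, as is the restriction to candidates left of the pronoun).

```python
from collections import deque
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare nodes by identity; parent links make the tree cyclic
class Node:
    label: str                      # "S", "NP", "VP", ...
    text: str = ""
    number: Optional[str] = None    # "sg" or "pl"; set only on candidate NPs
    children: List["Node"] = field(default_factory=list)
    parent: Optional["Node"] = field(default=None, repr=False)

    def add(self, *kids: "Node") -> "Node":
        for kid in kids:
            kid.parent = self
            self.children.append(kid)
        return self


def resolve(pronoun: Node) -> Optional[Node]:
    # Step 1: find the clause containing the pronoun (the reflexive level) and skip it.
    clause = pronoun
    while clause.label != "S" and clause.parent is not None:
        clause = clause.parent
    node = clause
    while node.parent is not None:
        # Step 2: go up to the next NP or S.
        node = node.parent
        while node.label not in ("NP", "S") and node.parent is not None:
            node = node.parent
        # Step 3: breadth-first search below that node.
        queue = deque([node])
        while queue:
            current = queue.popleft()
            if current is clause:
                continue  # never search inside the skipped clause
            # Step 4: rule out number conflicts (other checks omitted in this sketch).
            if current.label == "NP" and current.number == pronoun.number:
                return current  # Step 5: pick the first surviving candidate
            queue.extend(current.children)
    return None


# Assumed tree for: "The network system divides data into small blocks
# called packets, which it sends individually."
it = Node("NP", "it", "sg")
inner = Node("S").add(it, Node("VP", "sends individually"))
blocks = Node("NP", "small blocks called packets", "pl").add(
    Node("NP", "small blocks", "pl"),
    Node("VP", "called").add(Node("NP", "packets", "pl")),
    Node("SBAR", "which").add(inner),
)
sentence = Node("S").add(
    Node("NP", "The network system", "sg"),
    Node("VP", "divides").add(Node("NP", "data", "sg"), Node("PP", "into").add(blocks)),
)
antecedent = resolve(it)
```

In this toy tree the plural NPs near the pronoun are ruled out by number, so the search climbs to the top S and picks “The network system”.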

First and Second Person Pronouns

I, me, my: “.... I ....,” Person Verb[say] ...

OR the speaker/writer: “I would momentarily forget where I was”

we, us, our: “ .... we ....,” Verb[say] Person of Org

OR the reader and/or writer: “We will not cover it here”

OR the relevant everyone: “We had no idea what we had missed”

you, your: “ ... you ...,” Person said.
  Person(s) being addressed in a quote / some nearby Person not coreferential with speaker

OR the reader/listener

OR anyone / impersonal

“the” and “a”

Conventional notation:

A car arrives. ==> (E x)[car(x) & arrive(x)]

The car arrives. ==> arrive(ι x [car(x)])

iota operator: the x such that car(x)

But “the” and “a” convey information:

“the”: the entity referred to by the NP is mutually identifiable in context via the property conveyed by the rest of the NP.
  The car is in the driveway. (Known entity)

“a”: the entity referred to by the NP is not mutually identifiable in context via the property conveyed by the rest of the NP.
  A car is in the driveway. (New entity)
  Arnold Schwarzenegger is a short man. (New property)

My approach:
  the man ==> the(x,e) & man’(e,x)
  a man ==> a(x,e) & man’(e,x)

Highly idiosyncratic

Definite NPs: Anaphoric

Several cases:

Refers to something explicit in previous text:
  “I saw Bill Russell on a plane. The man is very tall.”
  “I bought a Prius. The car’s failures worry me.”
  Heuristic: Person resolves to last Person, etc.

Refers to something implied by something explicit in previous text:
  “The city was all quiet. The streets were covered in snow.”
  “ ... shaking my car across the lane dividers”
  No good heuristics; there are efforts to learn common associations, e.g., part-of relations

Definite NPs: Determinative

Definite description is self-contained (determinative definite NP), because:

It refers to something unique in the world: “the world”

It refers to something uniquely associated with a syntactically related entity:
  “the way Wall Street operates”, “the top of a table”
  “the student who scored the best on the test”
  Superlatives: “the most momentous thing ...”

It refers to something unique in the context: “the city turned monochrome”

Generic: refers to the representative element of the set of all things of that description: “the dollar fell yesterday” = dollars

Heuristic: treat the NP as determinative if there is a superlative or right modifier

Bad heuristic: treat the NP as determinative if there is no antecedent in the previous text

(determinatives far more frequent)

Resolving Definite NPs with Inference

... a car ... ... the car ...

... Prius ... ... the car ...
  Prius(x) --> car(x)

... a city ... ... the streets ...
  city(x) --> (E s,y) street(y) & in(y,x) & Plural(y,s)

To resolve a definite NP reference, find the most economical proof of the existence of an entity of that description (here, prove the existence of “the car” and “the streets”).

Problem: Requires very large knowledge base.
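A minimal sketch of this idea, assuming a toy knowledge base with just the two axioms from the slide and proofs at most one axiom deep. The names `AXIOMS` and `resolve_definite`, the entity labels, and the cost measure (number of axioms used) are illustrative assumptions, not part of the lecture.

```python
# Toy knowledge base (assumed encoding). "same" means the conclusion holds of
# the same entity; "related" means a new, related entity is implied.
AXIOMS = [
    ("Prius", "car", "same"),       # Prius(x) --> car(x)
    ("city", "street", "related"),  # city(x) --> (E s,y) street(y) & in(y,x) & Plural(y,s)
]


def resolve_definite(description, discourse):
    """Return (cost, antecedent, how) for the cheapest proof that an entity of
    the given description exists, or None. Cost = number of axioms used."""
    best = None
    for entity, prop in discourse:
        candidate = None
        if prop == description:
            candidate = (0, entity, "explicit mention")
        else:
            for premise, conclusion, kind in AXIOMS:
                if premise == prop and conclusion == description:
                    how = "same entity" if kind == "same" else "entity implied by " + entity
                    candidate = (1, entity, how)
        if candidate is not None and (best is None or candidate[0] < best[0]):
            best = candidate
    return best


# "... Prius ... the car ..." and "... a city ... the streets ..."
car_result = resolve_definite("car", [("p1", "Prius")])
street_result = resolve_definite("street", [("k1", "city")])
```

With a realistic knowledge base the proofs would be longer and the search far more expensive, which is the problem the slide points to.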

Demonstratives and Deictics

Demonstratives (this, that, these, those): It is not well understood how these function, other than being definites.

“Attendance at meetings became more sporadic. Those who did come looked damp and resentful.”

“Enceladus is very strange. Something happened to this body in the past.”

“What regularities are there in allowable expressions? This is the problem of grammar induction.”

Deictics (relative to some anchor in the world):

“.... a report showed Friday” (of what week?)

“a quarter of a century ago ....” (relative to when?)

“the last seven months” (relative to when?)

Implicit Arguments

Often the underlying predicate has more arguments than the text provides; how do we resolve the implicit arguments, and when do we need to?

tougher regulation by federal agencies. (of what?)

supporting air strikes (by whom and against whom?)

The work was tougher in such weather (than what?)

The interest generated by Voyager’s visit made a comprehensive examination of Enceladus a cardinal goal of the Cassini mission to Saturn. (in what and by whom?)

The practice of parsing can be considered .... (parsing what?)

Syntactic Ambiguity

A unit of American soldiers were engulfed in a fight with the Taliban.

unit(u1) & of(u1,s1) & American(y1) & soldier(y1) & Plural(y1,s1) & engulf’(e1,z1,y1) & in(e1,f1) & fight(f1,y2,t2) & with(x,t1) & Taliban(t1) & [x = f1 v x = e1]

Axioms:

The third argument of the predicate “fight” is realized with “with”: fight(f,y,t) --> with(f,t)

If y accompanies t in event e, then e is with t: accompany(y,t,e) & arg(y,e) --> with(e,t)

Constrained coreference problem

Is There Systematicity?

The basic unit of information is the predication:

p(x,y)

What is p?
  predicate strengthening

What are x and y?
  coreference

What’s the relation between p and x, p and y? In what way is it appropriate for p to describe x? y?
  metonymy, metaphor, ...

p(x,y) & q(y,z)

What’s the relation between these two predications?
  intraclausal coherence, discourse coherence
  (predicate strengthening on sentence adjacency)

Pragmatic Strengthening of Vague Predicates

Some words/predicates convey little information on their own, but we understand them much more specifically.

Compound nominals:

pension fund: fund that provides pensions

air strike: strike originating from the air

prairie storms: storms located on a prairie

Voyager 2 spacecraft: spacecraft named Voyager 2

lobster salad: salad containing meat of lobster

grammar induction: induction inducing a grammar

In general, the relation between the two nouns can be anything.

Heuristic: Predicate-argument if selectionally possible. Otherwise, one of the dozen most common (part-of, in, made-of, etc.) determined by semantic type of the two nouns

Resolving Compound Nominals with Inference

Prove the “nn” relation between the two nouns in the most economical way.

“pension fund”: fund(y1) & nn(x1,y1) & pension(x1)

fund(y1) --> provide(y1,x2) & payment(x2)

payment(x2) & for(x2,e3) & retire’(e3,z) --> pension(x2)

x1=x2

Other Vague Predicates

Possessive:
  CalPERS’ efforts: efforts by CalPERS
  Afghanistan’s Uruzgan Valley: Uruzgan Valley that is part of Afghanistan

“of” prepositional phrase:
  mounds of fine white powder: mounds consisting of fine white powder
  extensive plains of smooth terrain: plains that are smooth terrain
  a straightforward implementation of the idea: predicate-argument relation: implement(x, idea)

“have”:
  They had no dreams of global jihad: predicate-argument: dream(x,jihad), i.e., to have a dream is to dream
  Everyone had a cautionary tale: predicate-argument: tell(x,tale)

Lexical Ambiguity

The plane taxied to the terminal.

LF:

plane(x) & taxi(x,y) & terminal(y)

KB:

airplane(x) --> plane(x)

move-on-ground(x,y) & airplane(x) --> taxi(x,y)

airport-terminal(y) --> terminal(y)

airport(z) --> airplane(x) & airport-terminal(y)

wood-smoother(x) --> plane(x)

ride-in-cab(x,y) & person(x) --> taxi(x,y)

computer-terminal(y) --> terminal(y)

airplane(x) and wood-smoother(x) are specializations of the vague predicate “plane”.

Lexical Ambiguity

John wanted a loan. He went to the bank.

LF: . . . & loan(l) & . . . . . . & bank(y) & . . .

KB:

loan(x) --> financial-institution(y) & issue(y,x)

financial-institution(y) & etc4(y) --> bank1(y)

bank1(y) --> bank(y)

river(z) --> bank2(y) & borders(y,z)

bank2(y) --> bank(y)
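One way to see why the financial sense wins is to compare the cost of proving each sense from what the text already provides. The sketch below is propositional (arguments are ignored) and the assumption costs are invented for illustration. The slide does not give numeric costs, and the names `RULES`, `ASSUMPTION_COST`, and `proof_cost` are hypothetical.

```python
# Facts contributed by the text: "John wanted a loan."
FACTS = {"loan"}

# Axioms from the slide, read backward: goal <- list of alternative bodies.
RULES = {
    "bank": [["bank1"], ["bank2"]],
    "bank1": [["financial-institution", "etc4"]],
    "bank2": [["river"]],                  # river(z) --> bank2(y) & borders(y,z)
    "financial-institution": [["loan"]],   # loan(x) --> financial-institution(y) & issue(y,x)
}

# Invented costs: assuming an "etc" predicate is cheap, anything else is expensive.
ASSUMPTION_COST = {"etc4": 4}
DEFAULT_ASSUMPTION_COST = 10


def proof_cost(goal):
    """Cheapest way to establish goal: it is a fact (cost 0), it is assumed,
    or it is derived from an axiom body (sum of the body's costs)."""
    if goal in FACTS:
        return 0
    options = [ASSUMPTION_COST.get(goal, DEFAULT_ASSUMPTION_COST)]
    for body in RULES.get(goal, []):
        options.append(sum(proof_cost(sub) for sub in body))
    return min(options)


senses = {sense: proof_cost(sense) for sense in ("bank1", "bank2")}
best_sense = min(senses, key=senses.get)
```

Under these assumed costs, `bank1` is cheaper because its proof reuses the loan already mentioned, while `bank2` requires assuming a river that the text never mentions.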

Is There Systematicity?

The basic unit of information is the predication:

p(x,y)

What is p?
  predicate strengthening

What are x and y?
  coreference

What’s the relation between p and x, p and y? In what way is it appropriate for p to describe x? y?
  metonymy, metaphor, ...

p(x,y) & q(y,z)

What’s the relation between these two predications?
  intraclausal coherence, discourse coherence
  (predicate strengthening on sentence adjacency)

Are the Predicate and Argument “Congruent”?

p(x)

The predicate really means something else, e.g., metaphor

The argument really refers to something else: metonymy

John is an elephant ==> John is big / clumsy / has a good memory / ...

I like to read Shakespeare ==> I like to read the plays written by Shakespeare

This restaurant takes American Express ==> This restaurant takes credit cards issued by American Express

What about: America believes in democracy.

Metonymy

Metonymy: referring to something by referring to something related to it.

We have to coerce the apparent referent into the actual referent via some coercion function.

Common coercions:

Entity into part of entity: ... researchers excavating a cave ...

Organization into person: The White House said in its report that ....

Container into contained: She had consumed three glasses.

In a World Without Metonymy

Resolving Metonymy

For a particular domain, you can have a graph of the principal types of entities, where the links between nodes are the possible relations between them.

To resolve metonymy, find the shortest path from the node of the apparent referent to a node matching the required type.

[Graph: nodes Country, Government, Organization, Person; links labeled rules, isa, member-of]

More generally, prove there is a relation between the apparent referent and something satisfying the requirements, in the most economical way.

[Diagram for “read Shakespeare”: labels text, wrote, plays, coercion relation]

See Katja Markert and Udo Hahn, Artificial Intelligence Journal, 2003

e.g., France criticized American policy in Iraq.
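The shortest-path idea can be sketched with a breadth-first search over a small type graph. The edge list is an assumed reading of the slide's diagram (which entity each link connects is a guess), links are treated as traversable in both directions, and `LINKS`, `neighbors`, and `coerce` are hypothetical names.

```python
from collections import deque

# Assumed reading of the type graph: (source, relation, target).
LINKS = [
    ("Government", "rules", "Country"),
    ("Government", "isa", "Organization"),
    ("Person", "member-of", "Organization"),
]


def neighbors(node):
    """Yield (next_node, relation) pairs, following links in either direction."""
    for source, relation, target in LINKS:
        if source == node:
            yield target, relation
        elif target == node:
            yield source, relation + "-inverse"


def coerce(apparent_type, required_type):
    """Breadth-first search for the shortest chain of relations that turns the
    apparent referent's type into the required type; None if unreachable."""
    queue = deque([(apparent_type, [])])
    seen = {apparent_type}
    while queue:
        node, path = queue.popleft()
        if node == required_type:
            return path
        for nxt, relation in neighbors(node):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(relation, nxt)]))
    return None


# "France criticized ...": the apparent referent is a Country; here we assume
# that "criticize" requires a Person as its agent.
path = coerce("Country", "Person")
```

On this graph the path runs from Country through Government and Organization to Person, which reads as “a person who is a member of the government that rules France”.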

Metaphor

Metaphor: a predicate appropriate in one domain is used in another; abstract properties of that predicate are intended to be conveyed; sometimes large scale frameworks are enlisted (Lakoff & Johnson)

Holding/Having is Perceiving:
  returned a handful of images

Influence as Physical Force:
  CalPERS pushed companies to improve their governance.
  tougher regulation by federal agencies

Knowledge as Visibility/Seeing:
  greater openness in the way companies are run
  delve into some controversial investments

Metaphor

John is an elephant ==> John is heavy

A metaphor explicitly conveys one thing, but is intended to convey something implied by what is explicit.

elephant’(e1,x) --> heavy’(e2,x) & imply(e1,e2)

Make the implication relation explicit in the axiom, then use that as the coercion relation.

The assertion is coerced from John’s being an elephant to John’s being heavy.

LF: Assert(e2) & rel(e1,e2) & elephant’(e1,x)

Interp: heavy’(e2,x) & imply(e1,e2)

Some Applications

Anchoring Interpretations in an Underlying Theory or Schema

Learning by Reading

Textual Entailment

Knowledge-Based Question-Answering

Question-Answering from Multiple Sources

Show me the region 100 km north of the capital of Afghanistan.

Question decomposition via logical rules, with resources attached to the reasoning process:

  What is the capital of Afghanistan? (CIA Fact Book)
  What is the lat/long of Kabul? (Alexandria Digital Library Gazetteer)
  What is the lat/long 100 km north? (Geographical formula)
  Show that lat/long. (Google Earth)

Learning by Reading

Challenge 1: Read passages in a chemistry textbook and map the statements into an underlying formal language for chemistry.
  e.g., read a passage on acids and bases and write the corresponding chemical equations

Challenge 2: Read a chemistry textbook and solve the problems at the end of the chapter.

Challenge 3: Read a chapter of a textbook, compile it into a useful set of axioms, and use these axioms for understanding the next chapter.
  e.g., read the chapter on chemical equations and construct a usable theory of them; use them in the next chapter on acids and bases.

Learning by Reading Example

Every acid has a conjugate base, formed by removing a proton from the acid.

TRANSLATE THIS INTO

HX <--> X- + H+

e.g., HNO2 + H2O <--> NO2- + H3O+

We need an underlying theory of chemical equations:
  elements: H, O, ...
  compounds: set of <element,number> pairs, optionally with + or -, with constraints on when + or - appears
  constituent-of(element, compound)
  ingredients: set of compounds
  reaction: <ingredients-1, ingredients-2>
  constraints: number of atoms on each side equal; total charge on each side the same.
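A minimal sketch of such a theory, assuming compounds are represented as element counts plus an integer charge. The function names (`compound`, `totals`, `is_valid_reaction`) and the dictionary layout are illustrative choices, not part of the lecture. It checks only the two constraints listed: atom counts and total charge must match on both sides.

```python
from collections import Counter


def compound(elements, charge=0):
    """A compound: a set of <element, number> pairs, optionally charged."""
    return {"elements": Counter(elements), "charge": charge}


def totals(ingredients):
    """Total atom counts and total charge for one side of a reaction."""
    atoms = Counter()
    charge = 0
    for c in ingredients:
        atoms.update(c["elements"])  # adds the per-element counts
        charge += c["charge"]
    return atoms, charge


def is_valid_reaction(left, right):
    """Constraints: same number of atoms of each element and same total charge."""
    return totals(left) == totals(right)


# HNO2 + H2O <--> NO2- + H3O+
HNO2 = compound({"H": 1, "N": 1, "O": 2})
H2O = compound({"H": 2, "O": 1})
NO2_minus = compound({"N": 1, "O": 2}, charge=-1)
H3O_plus = compound({"H": 3, "O": 1}, charge=+1)

balanced = is_valid_reaction([HNO2, H2O], [NO2_minus, H3O_plus])
unbalanced = is_valid_reaction([HNO2], [NO2_minus])
```

The schematic form HX <--> X- + H+ passes the same check if X is treated as a placeholder element.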

Link words up with underlying theory:

form: reaction
acid: HX
proton: H+

remove: one side of reaction has it as a constituent; other side has it as a separate ingredient.

Textual Entailment

T: Claude Chabrol is a French movie director and has become well-known in the 40 years since his first film, Le Beau Serge, for his chilling tales of murder, including Le Boucher.

H1: Le Beau Serge was directed by Chabrol.

H2: Le Boucher was made by a French movie director.

Steps:
1. Interpret T.
2. Interpret H.
3. Assert the interpretation of T in the KB.
4. Prove the interpretation of H.

Extended WordNet (XWN) and the LCC Question-Answering System

Developed by Sanda Harabagiu, Dan Moldovan, and Roxana Girju of Language Computer Corp, Southern Methodist, U Texas Arlington

Ongoing work since the mid 1990s.

In the TREC-2002 QA evaluation, LCC scored 83%; the next two systems scored 58% and 54%; everyone else was under 40%.

Question --> Question Processing --> Document Retrieval --> Answer Reranking and Extraction --> Answer

Match with large set of common patterns + full sentence parse

State of the art

Answer Extraction

Document retrieval component returns a ranked list of subdocuments (passages within documents)
  If too many, conjoin new query terms (more key words from query)
  If too few, disjoin new query terms (e.g., synonyms)

Candidate answer reranking:
  Analyze the candidate subdocuments into their logical form.
  Try to prove the logical form of the question from the logical form of the answer, using a large knowledge base.
  The candidate answer with the best proof gets ranked highest.

They say reranking candidate answers in this way improved their score from ~65% to 83%, with similar improvements in other evaluations.

What Knowledge Base? Extended WordNet (XWN)

Disambiguate word senses in WordNet glosses:
  Automatic word sense disambiguation worked at 80% in Senseval
  Half of the glosses were checked by hand
  So a word sense accuracy in XWN of ~90%

Parse glosses and translate glosses into logical form; i.e., axioms

“Suicide is the act of killing yourself.”
  suicide’(e1,x1) <--> kill’(e1,x1,x1)

“To kill is to cause to die.”
  kill’(e1,x1,x2) <--> cause’(e2,x1,e3) & die’(e3,x2)

“Old is having lived for a relatively long time or attained a specific age.”
  old(x6) <--> live’(e2,x6,x2) & for(e2,x1) & relatively(x1) & long(x1) & time(x1) & or’(e5,e2,e3) & attain’(e3,x6,x2) & specific(x2) & age(x2)

The Search Space Problem

120,000 glosses --> 120,000 axioms

Theorem proving would take forever.

Lexical chains / marker passing:
  Try to find paths between Answer Logical Form and Question Logical Form.
  Ignore the arguments; look for links between predicates in XWN; it becomes a graph traversal problem (but this can, e.g., confuse “buy” and “sell”).
  Observation: All proofs use chains of inference no longer than 4 steps.
  Carry out this marker passing only 4 levels out.

Q: “What Spanish explorer discovered the Mississippi River?”

Candidate A: “Spanish explorer Hernando de Soto reached the Mississippi River in 1536.”

Lexical chain: discover-v#7 --GLOSS--> reach-v#1

Set of support strategy: Use only axioms that are on one of these paths. 120,000 axioms ==> several hundred axioms
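Marker passing with a depth limit can be sketched as a bounded breadth-first search over links between predicates. Only the discover/reach link comes from the slide; the other links in `GLOSS_LINKS`, and the function names, are made up for illustration.

```python
from collections import deque

# Predicate-to-predicate links extracted from glosses, arguments ignored.
GLOSS_LINKS = {
    "discover": ["reach"],   # from the slide: discover-v#7 --GLOSS--> reach-v#1
    "reach": ["arrive"],     # hypothetical link
    "suicide": ["kill"],     # hypothetical link
    "kill": ["cause", "die"],  # hypothetical links
}


def lexical_chain(source, target, links, max_depth=4):
    """Shortest chain of links from source to target using at most max_depth steps."""
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        if len(path) - 1 == max_depth:
            continue  # do not pass markers further than max_depth levels out
        for nxt in links.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


def set_of_support(question_preds, answer_preds, links):
    """Collect only the links (axioms) that lie on some question-to-answer chain."""
    support = set()
    for q in question_preds:
        for a in answer_preds:
            chain = lexical_chain(q, a, links)
            if chain:
                support.update(zip(chain, chain[1:]))
    return support


chain = lexical_chain("discover", "reach", GLOSS_LINKS)
support = set_of_support(["discover"], ["reach"], GLOSS_LINKS)
```

Only the axioms behind the links in `support` would then be handed to the theorem prover.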

Prove Question from Answer

Q: “How did Adolf Hitler die?”

QLF: manner(e4) & Adolf(x10) & Hitler(x11) & nn(x12,x10,x11) & die’(e4,x12)

A: “It was Zhukov’s soldiers who planted a Soviet flag atop the Reichstag on May 1, 1945, a day after Adolf Hitler committed suicide.”

ALF: it(x14) & be’(e1,x14,x2) & Zhukov(x1) & ’s(x2,x1) & soldier(x2) & plant’(e2,x2,x3) & Soviet(x3) & flag(x3) & atop(e2,x4) & Reichstag(x4) & on(e2,x8) & May(x5) & 1(x6) & 1945(x7) & nn(x8,x5,x6,x7) & day(x9) & Adolf(x10) & Hitler(x11) & nn(x12,x10,x11) & commit’(e3,x12,e5) & suicide’(e5,x12)

“suicide” is troponym of “kill”: suicide’(e5,x12) --> kill’(e5,x12,x12) & manner(e5)

Gloss of “kill”: kill’(e5,x12,x12) <--> cause’(e5,x12,e4) & die’(e4,x12)

Gloss of “suicide”: suicide’(e5,x12) <--> kill’(e5,x12,x12)

e4=e5?

Relaxation (Assumptions)

Rarely or never can the entire Question Logical Form be proved from the Answer Logical Form ==> We have to relax the Question Logical Form

“Do tall men succeed?”

Logical Form: tall’(e1,x1) & x1=x2 & man’(e2,x2) & x2=x3 & succeed’(e3,x3)

Remove these conjuncts from what has to be proved, one by one, in some order, and try to prove again.

E.g., we might find a mention of something tall and a statement that men succeed.

One limiting case: We find a mention of success.

Penalize proof for every relaxation, and pick the best proof.
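The relaxation loop can be sketched as follows. Real proving is replaced here by simple membership in a set of provable conjuncts, and the unit penalty per dropped conjunct is an assumption; `best_relaxed_proof` is a hypothetical name.

```python
from itertools import combinations


def best_relaxed_proof(question, provable, penalty=1):
    """Drop as few conjuncts as possible so that the remaining ones are all
    provable. Returns (remaining, dropped, total_penalty)."""
    for dropped_count in range(len(question) + 1):
        for dropped in combinations(question, dropped_count):
            remaining = [c for c in question if c not in dropped]
            if all(c in provable for c in remaining):
                return remaining, list(dropped), penalty * dropped_count


# "Do tall men succeed?"
question = ["tall(x1)", "x1=x2", "man(x2)", "x2=x3", "succeed(x3)"]

# A text that mentions something tall and states that men succeed:
# everything is provable except the identity x1=x2.
provable = {"tall(x1)", "man(x2)", "x2=x3", "succeed(x3)"}
remaining, dropped, cost = best_relaxed_proof(question, provable)

# Limiting case: the text only mentions success.
_, dropped_limit, cost_limit = best_relaxed_proof(question, {"succeed(x3)"})
```

Trying smaller sets of dropped conjuncts first means the first proof found is also the least penalized one under this uniform penalty.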