LING 2000 - 2006 NLP 1 Introduction to Computational Linguistics Martha Palmer April 19, 2006


LING 2000 - 2006 NLP1

Introduction to Computational Linguistics

Martha Palmer
April 19, 2006

Natural Language Processing

• Machine Translation
• Predicate argument structures
• Syntactic parses
• Producing semantic representations
• Ambiguities in sentence interpretation

Machine Translation

• One of the first applications for computers
  – bilingual dictionary > word-word translation
• Good translation requires understanding!
  – War and Peace, The Sound and The Fury?
• What can we do? Sublanguages.
  – technical domains, static vocabulary
  – Meteo in Canada, Caterpillar Tractor Manuals, Botanical descriptions, Military Messages


Example translation

Translation Issues: Korean to English

- Word order
- Dropped arguments
- Lexical ambiguities
- Structure vs morphology

Common Thread

• Predicate-argument structure
  – Basic constituents of the sentence and how they are related to each other
• Constituents
  – John, Mary, the dog, pleasure, the store.
• Relations
  – Loves, feeds, go, to, bring


Abstracting away from surface structure


Transfer lexicons

Machine Translation Lexical Choice - Word Sense Disambiguation

Iraq lost the battle.
Ilakuka centwey ciessta.
[Iraq] [battle] [lost].

John lost his computer.
John-i computer-lul ilepelyessta.
[John] [computer] [misplaced].
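The two examples above can be read as a lexical-choice rule keyed on the semantic class of the verb's object. A minimal Python sketch follows, assuming a hypothetical two-entry transfer lexicon and illustrative class labels (`competition`, `physical-object`). Only the romanized Korean verbs come from the slide; every other name is made up for illustration.

```python
# Hypothetical semantic classes for object nouns (illustrative labels only).
OBJECT_CLASS = {
    "battle": "competition",
    "computer": "physical-object",
}

# Toy transfer lexicon: (English verb, object class) -> romanized Korean verb.
TRANSFER_LEXICON = {
    ("lose", "competition"): "ciessta",           # lose a contest
    ("lose", "physical-object"): "ilepelyessta",  # misplace something
}


def choose_translation(verb, obj):
    """Pick a target verb from the object's semantic class; None if unknown."""
    obj_class = OBJECT_CLASS.get(obj)
    return TRANSFER_LEXICON.get((verb, obj_class))


print(choose_translation("lose", "battle"))
print(choose_translation("lose", "computer"))
```

A real transfer lexicon would need far broader coverage and a principled class inventory; the point here is only that the same source verb maps to different target verbs depending on its argument.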

Natural Language Processing

• Syntax
  – Grammars, parsers, parse trees, dependency structures
• Semantics
  – Subcategorization frames, semantic classes, ontologies, formal semantics
• Pragmatics
  – Pronouns, reference resolution, discourse models

Syntactic Categories

• Nouns, pronouns, proper nouns
• Verbs, intransitive verbs, transitive verbs, ditransitive verbs (subcategorization frames)
• Modifiers, adjectives, adverbs
• Prepositions
• Conjunctions

Syntactic Parsing

• The cat sat on the mat.
  Det Noun Verb Prep Det Noun
• Time flies like an arrow.
  Noun Verb Prep Det Noun
• Fruit flies like a banana.
  Noun Noun Verb Det Noun

Context Free Grammar

• S -> NP VP
• NP -> det (adj) N
• NP -> Proper N
• NP -> N
• VP -> V
• VP -> V PP
• VP -> V NP
• VP -> V NP PP
• VP -> V NP NP
• PP -> Prep NP
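To make the ambiguity concrete, here is a minimal Python sketch of a recursive-descent parser that enumerates every parse licensed by a grammar like the one above. It rests on several assumptions: the toy lexicon is invented for these three sentences, the `Proper N` rule is omitted, and an extra rule `NP -> N N` (not in the grammar above) is added so that the noun-noun reading of "Time flies" and "Fruit flies" is available.

```python
# Grammar from the slide, plus NP -> N N (an added assumption for compounds).
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"], ["Det", "Adj", "N"], ["N"], ["N", "N"]],
    "VP": [["V"], ["V", "PP"], ["V", "NP"], ["V", "NP", "PP"], ["V", "NP", "NP"]],
    "PP": [["Prep", "NP"]],
}

# Toy lexicon (assumed): each word lists the categories it may take.
LEXICON = {
    "the": {"Det"}, "a": {"Det"}, "an": {"Det"},
    "cat": {"N"}, "mat": {"N"}, "time": {"N"}, "fruit": {"N"},
    "arrow": {"N"}, "banana": {"N"},
    "flies": {"N", "V"}, "like": {"V", "Prep"},
    "sat": {"V"}, "on": {"Prep"},
}


def parse(symbol, words, start):
    """Yield (tree, end) for every way `symbol` can cover words[start:end]."""
    if symbol not in GRAMMAR:  # lexical category: match a single word
        if start < len(words) and symbol in LEXICON.get(words[start], set()):
            yield (symbol, words[start]), start + 1
        return
    for rhs in GRAMMAR[symbol]:
        yield from expand(symbol, rhs, words, start, [])


def expand(symbol, rhs, words, pos, children):
    """Match the remaining right-hand-side symbols left to right."""
    if not rhs:
        yield (symbol, *children), pos
        return
    for child, nxt in parse(rhs[0], words, pos):
        yield from expand(symbol, rhs[1:], words, nxt, children + [child])


def all_parses(sentence):
    """Return every S tree that covers the whole sentence."""
    words = sentence.lower().rstrip(".").split()
    return [tree for tree, end in parse("S", words, 0) if end == len(words)]


for sentence in ["The cat sat on the mat.", "Time flies like an arrow."]:
    print(sentence, len(all_parses(sentence)), "parse(s)")
    for tree in all_parses(sentence):
        print("  ", tree)
```

Plain recursive descent works here only because the grammar has no left-recursive rules; a grammar with rules such as `NP -> NP PP` would need a chart parser instead.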

Parses

The cat sat on the mat.

  [S [NP [Det The] [N cat]]
     [VP [V sat]
         [PP [Prep on] [NP [Det the] [N mat]]]]]

Parses

Time flies like an arrow.

  [S [NP [N Time]]
     [VP [V flies]
         [PP [Prep like] [NP [Det an] [N arrow]]]]]

Parses

Time flies like an arrow.

  [S [NP [N Time] [N flies]]
     [VP [V like]
         [NP [Det an] [N arrow]]]]

Features

• C for Case, Subjective/Objective
  – She visited her.
• P for Person agreement (1st, 2nd, 3rd)
  – I like him, You like him, He likes him
• N for Number agreement, Subject/Verb
  – He likes him, They like him.
• G for Gender agreement, Subject/Verb
  – English, reflexive pronouns: He washed himself.
  – Romance languages, det/noun
• T for Tense
  – auxiliaries, sentential complements, etc.
  – * will finished is bad
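The person and number features above can be checked mechanically. The following Python sketch assumes a tiny hand-written feature lexicon (all entries and names are illustrative) and tests whether a subject pronoun and a present-tense verb form agree.

```python
# Toy feature lexicon (assumed): person (P) and number (N) for subject pronouns.
SUBJECT_FEATURES = {
    "i": {"P": 1, "N": "sg"},
    "you": {"P": 2, "N": "sg"},
    "he": {"P": 3, "N": "sg"},
    "they": {"P": 3, "N": "pl"},
}

# Each verb form lists the (person, number) combinations it is compatible with.
VERB_FEATURES = {
    "likes": {(3, "sg")},
    "like": {(1, "sg"), (2, "sg"), (1, "pl"), (2, "pl"), (3, "pl")},
}


def agrees(subject, verb):
    """True if the subject's person/number is licensed by the verb form."""
    feats = SUBJECT_FEATURES[subject.lower()]
    return (feats["P"], feats["N"]) in VERB_FEATURES[verb]


for subject, verb in [("I", "like"), ("He", "likes"), ("They", "like"), ("They", "likes")]:
    mark = "" if agrees(subject, verb) else "* "
    print(f"{mark}{subject} {verb} him")
```

Case, gender and tense could be handled the same way by adding further feature keys; a full grammar would normally attach such features to the rules themselves rather than to a separate check.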

Probabilistic Context Free Grammars

• Adding probabilities
• Lexicalizing the probabilities

Simple Context Free Grammar in BNF

S  → NP VP
NP → Pronoun
   | Noun
   | Det Adj Noun
   | NP PP
PP → Prep NP
V  → Verb
   | Aux Verb
VP → V
   | V NP
   | V NP NP
   | V NP PP
   | VP PP

Simple Probabilistic CFG

S  → NP VP
NP → Pronoun        [0.10]
   | Noun           [0.20]
   | Det Adj Noun   [0.50]
   | NP PP          [0.20]
PP → Prep NP        [1.00]
V  → Verb           [0.33]
   | Aux Verb       [0.67]
VP → V              [0.10]
   | V NP           [0.40]
   | V NP NP        [0.10]
   | V NP PP        [0.20]
   | VP PP          [0.20]
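With these numbers, the probability of a parse is the product of the probabilities of the rules used to build it. The Python sketch below works through "She visited her" under two assumptions: `S → NP VP` is given probability 1.00 because it is the only S rule (the slide shows no number for it), and word-level (lexical) probabilities are ignored.

```python
import math

# Rule probabilities from the slide; the S rule's 1.00 is an assumption.
RULE_PROB = {
    ("S", ("NP", "VP")): 1.00,
    ("NP", ("Pronoun",)): 0.10,
    ("NP", ("Noun",)): 0.20,
    ("NP", ("Det", "Adj", "Noun")): 0.50,
    ("NP", ("NP", "PP")): 0.20,
    ("PP", ("Prep", "NP")): 1.00,
    ("V", ("Verb",)): 0.33,
    ("V", ("Aux", "Verb")): 0.67,
    ("VP", ("V",)): 0.10,
    ("VP", ("V", "NP")): 0.40,
    ("VP", ("V", "NP", "NP")): 0.10,
    ("VP", ("V", "NP", "PP")): 0.20,
    ("VP", ("VP", "PP")): 0.20,
}


def tree_probability(tree):
    """tree = (label, child, ...); a child that is a plain string is a word."""
    label, *children = tree
    if all(isinstance(child, str) for child in children):
        return 1.0  # category over a word: lexical probabilities are ignored
    rhs = tuple(child[0] for child in children)
    prob = RULE_PROB[(label, rhs)]
    for child in children:
        prob *= tree_probability(child)
    return prob


she_visited_her = (
    "S",
    ("NP", ("Pronoun", "she")),
    ("VP", ("V", ("Verb", "visited")), ("NP", ("Pronoun", "her"))),
)
print(tree_probability(she_visited_her))
```

The product is 1.00 × 0.10 × 0.40 × 0.33 × 0.10, roughly 0.00132. When a sentence has several parses, a probabilistic parser would prefer the one with the highest such product.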

Simple Probabilistic Lexicalized CFG

S  → NP VP
NP → Pronoun        [0.10]
   | Noun           [0.20]
   | Det Adj Noun   [0.50]
   | NP PP          [0.20]
PP → Prep NP        [1.00]
V  → Verb           [0.33]
   | Aux Verb       [0.67]
VP → V              [0.87]   {sleep, cry, laugh}
   | V NP           [0.03]
   | V NP NP        [0.00]
   | V NP PP        [0.00]
   | VP PP          [0.10]

Simple Probabilistic Lexicalized CFG

VP → V              [0.30]
   | V NP           [0.60]   {break, split, crack..}
   | V NP NP        [0.00]
   | V NP PP        [0.00]
   | VP PP          [0.10]

VP → V              [0.10]
   | V NP           [0.40]
   | V NP NP        [0.10]
   | V NP PP        [0.20]
   | VP PP          [0.20]

What about leave? leave1, leave2?
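Lexicalization amounts to conditioning the VP expansion probabilities on the head verb. The following Python sketch stores the three distributions shown on these slides and looks one up by verb. The class names (`sleep-class`, `break-class`) are hypothetical labels, and falling back to the unlexicalized distribution for an unlisted verb such as "leave" is an assumption, not something the slides prescribe.

```python
VP_RULES = ["V", "V NP", "V NP NP", "V NP PP", "VP PP"]

# Distributions copied from the slides, keyed by a hypothetical class name.
VP_PROB_BY_CLASS = {
    "sleep-class": dict(zip(VP_RULES, [0.87, 0.03, 0.00, 0.00, 0.10])),
    "break-class": dict(zip(VP_RULES, [0.30, 0.60, 0.00, 0.00, 0.10])),
}

# Unlexicalized distribution, used here as a fallback (an assumption).
UNLEXICALIZED = dict(zip(VP_RULES, [0.10, 0.40, 0.10, 0.20, 0.20]))

VERB_CLASS = {
    "sleep": "sleep-class", "cry": "sleep-class", "laugh": "sleep-class",
    "break": "break-class", "split": "break-class", "crack": "break-class",
}


def vp_rule_probability(verb, rule):
    """P(VP -> rule | head verb), falling back to the unlexicalized table."""
    table = VP_PROB_BY_CLASS.get(VERB_CLASS.get(verb), UNLEXICALIZED)
    return table[rule]


for verb in ["sleep", "break", "leave"]:
    print(verb, vp_rule_probability(verb, "V"), vp_rule_probability(verb, "V NP"))
```

A verb like "leave" has both intransitive and transitive uses, so a single class entry may not fit it; splitting it into separate senses, each with its own distribution, is one possible answer to the question the slide raises.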

Language to Logic

• John went to the book store.
  John, store1, go(John, store1)
• John bought a book.
  buy(John, book1)
• John gave the book to Mary.
  give(John, book1, Mary)
• Mary put the book on the table.
  put(Mary, book1, table1)
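The logical forms above are predicates applied to constants, and reusing a constant such as `book1` is what links the sentences together. A minimal Python sketch of that representation follows; the `Predicate` type and the `mentioning` helper are hypothetical names introduced for illustration.

```python
from typing import NamedTuple, Tuple


class Predicate(NamedTuple):
    """A predicate name applied to a tuple of constants."""
    name: str
    args: Tuple[str, ...]

    def __str__(self):
        return f"{self.name}({', '.join(self.args)})"


# The four logical forms from the slide.
FACTS = [
    Predicate("go", ("John", "store1")),
    Predicate("buy", ("John", "book1")),
    Predicate("give", ("John", "book1", "Mary")),
    Predicate("put", ("Mary", "book1", "table1")),
]


def mentioning(constant):
    """Return every fact in which the given constant appears."""
    return [fact for fact in FACTS if constant in fact.args]


for fact in mentioning("book1"):
    print(fact)
```

Querying for `book1` retrieves the buying, giving and putting events, which is the kind of cross-sentence link that the surface strings alone do not expose.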

Semantics

Same event - different sentences

  John broke the window with a hammer.

  John broke the window with the crack.

  The hammer broke the window.

  The window broke.

Same event - different syntactic frames

  John broke the window with a hammer.
    SUBJ VERB OBJ MODIFIER

  John broke the window with the crack.
    SUBJ VERB OBJ MODIFIER

  The hammer broke the window.
    SUBJ VERB OBJ

  The window broke.
    SUBJ VERB

Semantics - predicate arguments

  break(AGENT, INSTRUMENT, PATIENT)

  John broke the window with a hammer.
    AGENT PATIENT INSTRUMENT

  The hammer broke the window.
    INSTRUMENT PATIENT

  The window broke.
    PATIENT

  Fillmore (1968), The Case for Case

  John broke the window with a hammer.
    AGENT PATIENT INSTRUMENT
    SUBJ OBJ MODIFIER

  The hammer broke the window.
    INSTRUMENT PATIENT
    SUBJ OBJ

  The window broke.
    PATIENT
    SUBJ

Canonical Representation

  break (Agent: animate,
         Instrument: tool,
         Patient: physical-object)

  Agent <=> subj
  Instrument <=> subj, with-pp
  Patient <=> obj, subj
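The mapping above can be sketched as a small role-assignment function. This Python example assumes a toy table of semantic classes (the `SEMANTIC_CLASSES` entries and function names are illustrative) and applies the selectional restrictions for "break" to the grammatical functions of a clause.

```python
# Toy semantic classes (assumed); "crack" is deliberately not a tool.
SEMANTIC_CLASSES = {
    "John": {"animate"},
    "hammer": {"tool", "physical-object"},
    "window": {"physical-object"},
    "crack": set(),
}


def has_class(word, cls):
    return cls in SEMANTIC_CLASSES.get(word, set())


def break_roles(subj, obj=None, with_pp=None):
    """Map grammatical functions of a 'break' clause to thematic roles."""
    roles = {}
    if obj is None:
        # Intransitive: Patient <=> subj ("The window broke.")
        if has_class(subj, "physical-object"):
            roles["Patient"] = subj
    else:
        # Transitive: Patient <=> obj; the subject is Agent or Instrument.
        if has_class(obj, "physical-object"):
            roles["Patient"] = obj
        if has_class(subj, "animate"):
            roles["Agent"] = subj
        elif has_class(subj, "tool"):
            roles["Instrument"] = subj
    # Instrument <=> with-pp, but only if the noun is a tool.
    if with_pp is not None and has_class(with_pp, "tool"):
        roles["Instrument"] = with_pp
    return roles


print(break_roles("John", "window", "hammer"))
print(break_roles("John", "window", "crack"))
print(break_roles("hammer", "window"))
print(break_roles("window"))
```

For "John broke the window with the crack", the with-phrase receives no Instrument role because "crack" is not a tool, so it has to be read as modifying "the window" instead. This is one way lexical semantics can filter a syntactically valid attachment.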

 


Syntax/semantics interaction

• Parsers will produce syntactically valid parses for semantically anomalous sentences

• Lexical semantics can be used to rule them out


Headlines

• Police Begin Campaign To Run Down Jaywalkers

• Iraqi Head Seeks Arms

• Teacher Strikes Idle Kids

• Miners Refuse To Work After Death

• Juvenile Court To Try Shooting Defendant