Carnegie Mellon School of Computer Science, LTI Grammars and Lexicons

Copyright © 2007, Carnegie Mellon. All Rights Reserved.

Grammar Writing
Lecture 5

11-721 Grammars and Lexicons

Teruko Mitamura

teruko@cs.cmu.edu
www.cs.cmu.edu/~teruko

Schedule: November 19, 2007

• Review of “bird.gra”
• Review of “bird2.gra”
• Character-based Parsing vs. Word-based Parsing
• Morphology
• Start a new grammar exercise (4)

Bird.gra Review: General Problems

• Incomplete F-structure
• Incorrect F-structure
• Not enough constraints in the rule
• Unification problems

Incomplete F-structures

Determiner information is missing from the f-structure. “A bird flies” and “The bird flies” showed the same F-structure:

((subj ((agreement 3sg)
        (number sg)
        (root bird)))
 (form present)
 (agreement 3sg)
 (root fly))

Complete F-structure

• Contains all the necessary grammatical information
• Makes it possible to reconstruct the original sentence

“A bird flies”

((SUBJ ((NUMBER SG)
        (AGREEMENT 3SG)
        (ROOT BIRD)
        (DET ((NUMBER SG)
              (DEFINITENESS -)
              (ROOT A)))))
 (FORM PRESENT)
 (AGREEMENT 3SG)
 (ROOT FLY))

• Some feature structures are redundant

Incomplete F-structures (2)

Grammar problem:

(<NP> <==> (<DET> <N>)
      (((x1 number) = (x2 number))
       (x0 = x2)))
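The effect of this rule can be traced with a small Python sketch. F-structures are modeled as plain dicts, and the lexical values for "a" and "bird" are copied from the f-structures on the earlier slides. `np_rule_original` mirrors the two equations above, so only the noun's f-structure (x2) reaches x0. `np_rule_with_det` adds a hypothetical `((x0 det) = x1)` equation, inferred from the complete f-structure shown earlier and not taken from a reference grammar. The number test is simplified to a comparison and does not propagate values the way real unification would.

```python
import copy

# Lexical f-structures as plain dicts (values taken from the slides).
DET_A = {"root": "a", "number": "sg", "definiteness": "-"}
N_BIRD = {"root": "bird", "number": "sg", "agreement": "3sg"}


def np_rule_original(x1, x2):
    """Mirror of the slide's rule: number test, then (x0 = x2)."""
    n1, n2 = x1.get("number"), x2.get("number")
    if n1 is not None and n2 is not None and n1 != n2:
        return None  # number values clash, so the rule does not apply
    return copy.deepcopy(x2)  # x0 receives only the noun's features


def np_rule_with_det(x1, x2):
    """Same rule plus a hypothetical ((x0 det) = x1) equation."""
    x0 = np_rule_original(x1, x2)
    if x0 is None:
        return None
    x0["det"] = copy.deepcopy(x1)  # keep the determiner inside the NP
    return x0


incomplete = np_rule_original(DET_A, N_BIRD)
complete = np_rule_with_det(DET_A, N_BIRD)
print(incomplete)
print(complete)
```

With the original rule, swapping "a" for "the" leaves the NP unchanged, which is why both sentences produced the same f-structure.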

Not Enough Constraints

A singular noun without a determiner can become an NP, so “Bird flies” may parse.

(<NP> <==> (<N>)
      ((x0 = x1)))

Problem: No constraint on number. The rule needs a constraint equation:

((x1 number) =c pl)
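The difference between a unifying equation (`=`) and a constraint equation (`=c`) can be sketched in Python. This is an illustration under the assumption that `=c` succeeds only when the feature is already present with the required value, while `=` also succeeds on a missing feature by filling it in. The helper names and the `unmarked` noun are hypothetical.

```python
def unify_value(fs, feature, value):
    """'=': succeed if the feature is absent (and set it) or already equal."""
    if feature not in fs:
        return {**fs, feature: value}
    return fs if fs[feature] == value else None


def constrain_value(fs, feature, value):
    """'=c': succeed only if the feature is already present and equal."""
    return fs if fs.get(feature) == value else None


bird = {"root": "bird", "number": "sg"}
birds = {"root": "bird", "number": "pl"}
unmarked = {"root": "fish"}  # hypothetical noun with no number feature

# Which nouns may form a bare NP under ((x1 number) =c pl)?
bare_np_ok = {
    "bird": constrain_value(bird, "number", "pl") is not None,
    "birds": constrain_value(birds, "number", "pl") is not None,
    "unmarked": constrain_value(unmarked, "number", "pl") is not None,
}
print(bare_np_ok)
```

Under this reading, a plain `=` would let the unmarked noun through by assigning it `pl`, while `=c` rejects it.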

Be Aware of Unification

(<NP> <==> (<DET> <N>)
      ((x0 = x1)
       (x0 = x2)))

(<DET> <--> (t h e)
       (((x0 definiteness) = +)))

(<N> <--> (b i r d)
     (((x0 root) = bird)
      ((x0 number) = sg)
      ((x0 agreement) = 3sg)))

Be Aware of Unification (cont.)

(<NP> <==> (<DET> <N>)
      ((x0 = x1)
       (x0 = x2)))

(<DET> <--> (t h e)
       (((x0 definiteness) = +)
        ((x0 root) = the)))

(<N> <--> (b i r d)
     (((x0 root) = bird)
      ((x0 number) = sg)
      ((x0 agreement) = 3sg)))
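What changes once the determiner carries its own root can be seen in a minimal Python sketch of unification over flat feature dicts. `unify` here is a simplified stand-in for the parser's unification, not its actual implementation, and it handles atomic values only.

```python
def unify(a, b):
    """Unify two flat f-structures; return None when atomic values clash."""
    result = dict(a)
    for feature, value in b.items():
        # setdefault adds a missing feature, otherwise returns the old value.
        if result.setdefault(feature, value) != value:
            return None
    return result


def np_rule(x1, x2):
    """(x0 = x1) followed by (x0 = x2): x0 must unify with both daughters."""
    return unify(x1, x2)


n_bird = {"root": "bird", "number": "sg", "agreement": "3sg"}
det_plain = {"definiteness": "+"}
det_with_root = {"definiteness": "+", "root": "the"}

merged = np_rule(det_plain, n_bird)
clash = np_rule(det_with_root, n_bird)
print(merged)
print(clash)
```

In the first case the features simply merge into one flat structure. In the second, `root` would have to be both `the` and `bird`, so unification fails and the NP rule cannot apply.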

Frequently Seen Problems

• Test equations come before Action
    (x0 = x2) ;action
    ((x1 agreement) = (x2 agreement)) ;test
• No “root” info in f-structure
• When submitted:
  – Write your full name in the grammar
  – Write more comments in the grammar
  – Turn off (dmode 2) or trace
• Print out the grammar and results files.
  – lpr -P<printer name> <filename>
    e.g. lpr -Pshakthi bird.gra

Review: Bird 2 Grammar

• Goal: To learn more about unification
• Some Problems:
  – Semantic features that do not scale, e.g. ((x0 semclass) = Morris)
  – Incomplete f-structures
  – Incorrect f-structures

Grammar Exercise (3) Test Sentences

"A bird flies"
"Birds fly"
"The bird flies"
"The birds fly"
"The cat runs"
"The cats run"
"Morris runs"
"Morris meows"
"Cats meow"
"A cat meows"
"The cats meow"
"The penguins run"
"A penguin runs"

Grammar Exercise (3) Test Sentences (fail)

"A bird fly"
"A birds flies"
"Birds flies"
"Bird flies"
"The bird fly"
"The birds flies"

Test Sentences (fail)

"The cat flies"
"The cats fly"
"The cat run"
"A cat meow"
"Morris meow"
"Morris flies"
"The bird meows"
"A penguin meows"
"Penguins meow"
"The penguin flies"

Semantic Category

Bird     fly, run, *meow
Cat      *fly, run, meow
Penguin  *fly, run, *meow

Semantic Features (noun)

Bird     (sem-class bird)
Cat      (sem-class cat)
Penguin  (sem-class penguin)
---------------------------------
         (animate +)

Semantic Features (verb)

Fly   ((subj sem-class) = bird)
Meow  ((subj sem-class) = cat)
Run   ((subj animate) = +)
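As a quick check that these noun and verb features reproduce the Semantic Category table, the combinations can be enumerated in Python. This is a sketch only: the dict layout and the `compatible` helper are hypothetical, and the check is a plain comparison, which is enough here because every noun defines both features.

```python
# Noun features from the "Semantic Features (noun)" slide.
NOUNS = {
    "bird": {"sem-class": "bird", "animate": "+"},
    "cat": {"sem-class": "cat", "animate": "+"},
    "penguin": {"sem-class": "penguin", "animate": "+"},
}

# Features each verb requires of its subject.
VERB_SUBJ = {
    "fly": {"sem-class": "bird"},
    "meow": {"sem-class": "cat"},
    "run": {"animate": "+"},
}


def compatible(noun, verb):
    """True if the noun carries every feature value the verb asks for."""
    subj = NOUNS[noun]
    return all(subj.get(f) == v for f, v in VERB_SUBJ[verb].items())


table = {
    noun: [verb for verb in ("fly", "run", "meow") if compatible(noun, verb)]
    for noun in NOUNS
}
print(table)
```

Because `run` asks only for `(animate +)`, it accepts all three nouns without listing each class, unlike a per-word feature such as `((x0 semclass) = Morris)`.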

Unification

(<N> <--> (c a t s)
     (((x0 root) = cat)
      ((x0 number) = pl)
      ((x0 animate) = +)
      ((x0 sem-class) = cat)
      ((x0 agreement) = pl)))

(<V> <--> (m e o w)
     (((x0 root) = meow)
      ((x0 agreement) = pl)
      ((x0 subj animate) = +)
      ((x0 subj sem-class) = cat)
      ((x0 form) = present)))

Unification

(<S> <==> (<NP> <VP>)
     (((x1 agreement) = (x2 agreement))
      ((x0 subj) = x1)
      (x0 = x2)))
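How the subject restrictions stored on the verb meet the noun's features at the S level can be sketched with a recursive unifier over nested dicts. The verb's `(x0 subj ...)` paths are modeled as a nested `subj` dict, the agreement test is simplified to an equality check, and the `BIRDS` entry is hypothetical, modeled on the `cats` entry. This is an illustration of the idea, not the parser's own algorithm.

```python
def unify(a, b):
    """Recursively unify nested dicts; None signals failure."""
    if not isinstance(a, dict) or not isinstance(b, dict):
        return a if a == b else None
    out = dict(a)
    for key, val in b.items():
        if key in out:
            sub = unify(out[key], val)
            if sub is None:
                return None
            out[key] = sub
        else:
            out[key] = val
    return out


def s_rule(np, vp):
    """Sketch of the <S> rule: agreement test, (x0 subj) = x1, x0 = x2."""
    if np.get("agreement") != vp.get("agreement"):
        return None
    return unify({"subj": np}, vp)


CATS = {"root": "cat", "number": "pl", "animate": "+",
        "sem-class": "cat", "agreement": "pl"}
BIRDS = {"root": "bird", "number": "pl", "animate": "+",
         "sem-class": "bird", "agreement": "pl"}  # hypothetical entry
MEOW = {"root": "meow", "agreement": "pl", "form": "present",
        "subj": {"animate": "+", "sem-class": "cat"}}

cats_meow = s_rule(CATS, MEOW)
birds_meow = s_rule(BIRDS, MEOW)
print(cats_meow)
print(birds_meow)
```

"Cats meow" yields one f-structure whose `subj` holds the full noun entry, while "Birds meow" fails because `sem-class` cannot be both `bird` and `cat`.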

Character-based Parsing

Morphological rules can be parsed.

Input string:
  tabeta   eat-past
  taberu   eat-present

(<v-class-1> <--> (<v-class-1> r u)
             ((x0 = x1)
              ((x0 tense) = present)))

(<v-class-1> <--> (<v-class-1> t a)
             ((x0 = x1)
              ((x0 tense) = past)))

(<v-class-1> <--> (t a b e)
             (((x0 root) = taberu)))

Japanese morphology

tabe-sase-rare-ta
eat-caus-pass-past

(<v-class-1> <--> (t a b e)
             (((x0 root) = taberu)))

(<v-class-1> <--> (<v-class-1> s a s e)
             (((x1 pass) = *undefined*)
              ((x1 tense) = *undefined*)
              (x0 = x1)
              ((x0 caus) = +)))

(<v-class-1> <--> (<v-class-1> r a r e)
             (((x1 tense) = *undefined*)
              (x0 = x1)
              ((x0 pass) = +)))

(<v-class-1> <--> (<v-class-1> t a)
             ((x0 = x1)
              ((x0 tense) = past)))

Tabeta              eat-past
Tabe-sase-ta        eat-caus-past
Tabe-rare-ta        eat-pass-past
Tabe-sase-rare-ta   eat-caus-pass-past

*tabe-rare-sase-ta  eat-pass-caus-past
*tabe-ta-sase-rare  eat-past-caus-pass
*tabe-ta-rare-sase  eat-past-pass-caus
*tabe-rare-ta-sase  eat-pass-past-caus
*tabe-sase-ta-rare  eat-caus-past-pass
*tabe-rare-sase     eat-pass-caus
*tabe-ta-sase       eat-past-caus
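The way the `*undefined*` tests enforce suffix order can be traced with a short Python sketch. It walks the input left to right, character by character, instead of running a real GLR parser, and it folds in the `r u` rule from the previous slide as an assumption. The table layout and `parse_verb` are hypothetical names.

```python
STEMS = {"tabe": {"root": "taberu"}}

# Each suffix: (features that must be undefined on x1, features set on x0).
SUFFIXES = {
    "sase": (("pass", "tense"), {"caus": "+"}),
    "rare": (("tense",), {"pass": "+"}),
    "ta": ((), {"tense": "past"}),
    "ru": ((), {"tense": "present"}),
}


def parse_verb(chars):
    """Return the f-structure for a verb form, or None if no rules fit."""
    for stem, features in STEMS.items():
        if chars.startswith(stem):
            fs, rest = dict(features), chars[len(stem):]
            break
    else:
        return None  # no stem rule matches
    while rest:
        for suffix, (must_be_undefined, assigned) in SUFFIXES.items():
            if rest.startswith(suffix):
                break
        else:
            return None  # no suffix rule matches the remaining characters
        if any(f in fs for f in must_be_undefined):
            return None  # an *undefined* test fails: wrong suffix order
        for feature, value in assigned.items():
            if fs.setdefault(feature, value) != value:
                return None  # unification clash, e.g. past vs. present
        rest = rest[len(suffix):]
    return fs


good = ["tabeta", "tabesaseta", "taberareta", "tabesaserareta"]
bad = ["taberaresaseta", "tabetasaserare", "tabetararesase",
       "taberaretasase", "tabesasetarare", "taberaresase", "tabetasase"]
for form in good + bad:
    print(form, parse_verb(form))
```

Each starred form is rejected at the first suffix whose `*undefined*` test finds that `pass` or `tense` has already been set by an earlier suffix.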

Word-based Parsing

(<N> <--> (sushi)
     (((x0 root) = sushi)))

Instead of:

(<N> <--> (s u s h i)
     (((x0 root) = sushi)))

For parsing: (parse-list list of symbols $)
e.g. (parse-list '(a bird flies $))
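The contrast between the two lexicon styles comes down to what counts as a terminal symbol. The Python sketch below models the right-hand side of each lexical rule as a tuple of symbols and builds the symbol list that a call like `parse-list` expects. The dict layout and `symbols_for_parse_list` are hypothetical helpers for illustration, not part of the parser.

```python
# Character-based rule: the right-hand side is a sequence of letters.
CHAR_LEXICON = {tuple("sushi"): {"root": "sushi"}}  # (s u s h i)

# Word-based rule: the right-hand side is a single word symbol.
WORD_LEXICON = {("sushi",): {"root": "sushi"}}  # (sushi)


def symbols_for_parse_list(sentence):
    """Split a sentence into word symbols and append the end marker $."""
    return sentence.split() + ["$"]


char_key = next(iter(CHAR_LEXICON))
word_key = next(iter(WORD_LEXICON))
print(len(char_key), len(word_key))
print(symbols_for_parse_list("a bird flies"))
```

In the word-based style each word is matched as one symbol, so morphological rules like the `t a` suffix rule no longer apply inside a word.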

Grammar Exercise (4)

• Start grammar exercise (4): mlb.gra
• Files are in /afs/cs/project/cmt-55/lti/Lab/Modules/GNL-721/2007/
• Test file: mlb-test.lisp

Next Class: Nov 26

• Return bird2.gra
• Return Assignment #1
• Grammar Writing Project Evaluation Criteria
• Finish mlb.gra
• Start a new exercise