61
CS114 Lecture 10 Parsing March 5, 2014 Professor Meteer Thanks for Jurafsky & Mar?n & Prof. Pustejovksy for slides

CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

  • Upload
    others

  • View
    15

  • Download
    0

Embed Size (px)

Citation preview

Page 1: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

CS11413 Lecture13 1013 Parsing13

March13 513 201413 Professor13 Meteer13

Thanks13 for13 Jurafsky13 amp13 Marn13 amp13 Prof13 Pustejovksy13 for13 slides13

Announcements13

bull  Industry13 Meet13 and13 Greet13 13 ndash Tuesday13 March13 1113

bull  JBS13 13 Summer13 201413

PARSING13 13

bull  Parsing13 is13 the13 process13 of13 recognizing13 and13 assigning13 STRUCTURE13

bull  Parsing13 a13 string13 with13 a13 CFG13 13 ndash Finding13 a13 derivaon13 of13 the13 string13 consistent13 with13 the13 grammar13

ndash The13 derivaon13 gives13 us13 a13 PARSE13 TREE13

Grammar13 S13 13 NP13 VP13 S13 Aux13 NP13 VP13 S13 13 VP13 NP13 13 Det13 Nom13 Nom13 13 Noun13 Nom13 13 Noun13 Nom13

PARSING13 AS13 SEARCH13 bull  The13 main13 problem13 with13 parsing13 is13 the13 existence13 of13 CHOICE13 POINTS13

bull  Parsing13 Strategy ndash  Top down

bull  Expectation Driven bull  Start with ldquoSrdquo

ndash  Bottom up bull  Data Driven bull  Start with wordscategories13

bull  Search13 Strategy13 ndash Determining13 the13 order13 alternaves13 are13 considered

bull  Depth first bull  Breadth first

TOP-shy‐DOWN13 vs13 13 BOTTOM-shy‐UP13

bull  TOP-shy‐DOWN13 ndash Only13 search13 among13 grammacal13 answers13 ndash BUT13 suggests13 hypotheses13 that13 may13 not13 be13 consistent13 with13 data13

ndash Problem13 le[-shy‐recursion13

bull  BOTTOM-shy‐UP13 ndash Only13 forms13 hypotheses13 consistent13 with13 data13 ndash BUT13 may13 suggest13 hypotheses13 that13 make13 no13 sense13 globally13

NON-shy‐PARALLEL13 SEARCH13

bull  If13 itrsquos13 not13 possible13 to13 examine13 all13 alternaves13 in13 parallel13 itrsquos13 necessary13 to13 make13 further13 decisions13 ndash Which13 node13 in13 the13 current13 search13 space13 to13 expand13 first13 (breadth-shy‐first13 or13 depth-shy‐first)13

ndash Which13 of13 the13 applicable13 grammar13 rules13 to13 expand13 first13

ndash Which13 leaf13 node13 in13 a13 parse13 tree13 to13 expand13 next13 (eg13 le[most)13

TOP-shy‐DOWN13 DEPTH-shy‐FIRST13 13 LEFT-shy‐TO-shy‐RIGHT13

TOP-shy‐DOWN13 DEPTH-shy‐FIRST13 13 LEFT-shy‐TO-shy‐RIGHT13 (II)13

TOP-shy‐DOWN13 DEPTH-shy‐FIRST13 13 LEFT-shy‐TO-shy‐RIGHT13 (III)13

A13 T-shy‐D13 D-shy‐F13 L-shy‐R13 PARSER13

LEFT-shy‐RECURSION13

bull  A13 LEFT-shy‐RECURSIVE13 grammar13 may13 cause13 a13 T-shy‐D13 D-shy‐F13 L-shy‐R13 parser13 to13 never13 return13

bull  Examples13 of13 le[-shy‐recursive13 rules13 ndash NP13 13 NP13 PP13 ndash S13 13 S13 and13 S13

ndash But13 also13 bull  NP13 13 Det13 Nom13 bull  Det13 13 NPrsquos13

THE13 PROBLEM13 WITH13 13 LEFT-shy‐RECURSION13

Dynamic13 Programming13

bull  We13 need13 a13 method13 that13 fills13 a13 table13 with13 paral13 results13 that13 ndash Does13 not13 do13 (avoidable)13 repeated13 work13 ndash Does13 not13 fall13 prey13 to13 le[-shy‐recursion13 ndash Can13 find13 all13 the13 pieces13 of13 an13 exponenal13 number13 of13 trees13 in13 polynomial13 me13

bull  Two13 popular13 methods13 ndash CKY13 ndash Earley13

The13 CKY13 (Cocke-shy‐Kasami-shy‐Younger)13 Algorithm13

bull  Requires13 the13 grammar13 be13 in13 Chomsky13 Normal13 Form13 (CNF)13 ndash All13 rules13 must13 be13 in13 following13 form13

bull  A13 -shy‐gt13 B13 C13 bull  A13 -shy‐gt13 w13

bull  Any13 grammar13 can13 be13 converted13 automacally13 to13 Chomsky13 Normal13 Form13

Converng13 to13 CNF13

bull  Rules13 that13 mix13 terminals13 and13 non-shy‐terminals13 ndash  Introduce13 a13 new13 dummy13 non-shy‐terminal13 that13 covers13 the13 terminal13

bull  INFVP13 -shy‐gt13 to13 VP13 13 13 13 13 13 replaced13 by13

bull  INFVP13 -shy‐gt13 TO13 VP13 bull  TO13 -shy‐gt13 to13

bull  Rules13 that13 have13 a13 single13 non-shy‐terminal13 on13 right13 (ldquounit13 produconsrdquo)13 ndash  Rewrite13 each13 unit13 producon13 with13 the13 RHS13 of13 their13 expansions13

bull  Rules13 whose13 right13 hand13 side13 length13 gt213 ndash  Introduce13 dummy13 non-shy‐terminals13 that13 spread13 the13 right-shy‐hand13 side13

Sample13 Grammar13 Det13 13 |13 a13 |13 the13 Noun13 book13 |13 saw13 |13 mark13 Verb13 13 book13 |13 saw13 Proper-shy‐Noun13 13 Mark13 Aux13 Did13 |13 Has13 Prep13 to13 |13 on13 |13 near13

S13 13 NP13 VP13 S13 Aux13 NP13 VP13 S13 13 VP13 NP13 13 NP13 PP13 NP13 13 Det13 Noun13 NP13 13 PrN13 VP13 13 V13 VP13 13 V13 NP13 VP13 13 V13 NP13 PP13 PP13 13 Prep13 NP13

Automac13 Conversion13 to13 CNF13

Back13 to13 CKY13 Parsing13

bull  Given13 rules13 in13 CNF13 bull  Consider13 the13 rule13 A13 -shy‐gt13 BC13

ndash  If13 there13 is13 an13 A13 in13 the13 input13 then13 there13 must13 be13 a13 B13 followed13 by13 a13 C13 in13 the13 input13

ndash  If13 the13 A13 goes13 from13 i13 to13 j13 in13 the13 input13 then13 there13 must13 be13 some13 k13 st13 iltkltj13 bull  Ie13 The13 B13 splits13 from13 the13 C13 someplace13

CKY13

bull  So13 letrsquos13 build13 a13 table13 so13 that13 an13 A13 spanning13 from13 i13 to13 j13 in13 the13 input13 is13 placed13 in13 cell13 [ij]13 in13 the13 table13

bull  So13 a13 non-shy‐terminal13 spanning13 an13 enre13 string13 will13 sit13 in13 cell13 [013 n]13

bull  If13 we13 build13 the13 table13 bojom13 up13 wersquoll13 know13 that13 the13 parts13 of13 the13 A13 must13 go13 from13 i13 to13 k13 and13 from13 k13 to13 j13

CKY13

bull  Meaning13 that13 for13 a13 rule13 like13 A13 -shy‐gt13 B13 C13 we13 should13 look13 for13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13

bull  In13 other13 words13 if13 we13 think13 there13 might13 be13 an13 A13 spanning13 ij13 in13 the13 inputhellip13 AND13

bull  A13 -shy‐gt13 B13 C13 is13 a13 rule13 in13 the13 grammar13 THEN13 bull  There13 must13 be13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13 for13 some13 iltkltj13

bull  So13 just13 loop13 over13 the13 possible13 k13 values13

CKY13 Table13

bull Filling the [ij]th cell in the CKY table

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

CKY13 Algorithm13

Note13

bull  We13 arranged13 the13 loops13 to13 fill13 the13 table13 a13 column13 at13 a13 me13 from13 le[13 to13 right13 bojom13 to13 top13 13 ndash This13 assures13 us13 that13 whenever13 wersquore13 filling13 a13 cell13 the13 parts13 needed13 to13 fill13 it13 are13 already13 in13 the13 table13 (to13 the13 le[13 and13 below)13

ndash Are13 there13 other13 ways13 to13 fill13 the13 table13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

Example13

John called Sue from Denver

S

VP13 PP13

NP13 VP13

V13 NP13 NP13 P13

John called Sue from Denver

S13

PP13

NP13 VP13

NP13

NP13

V13

Example13 113

13 S(05) NP(45)

P(34) Denver

NP(23)13 V(23)

from

V(12) Sue

NP(01) called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 VP13 -shy‐gt13 VP13 PP13 NP13 -shy‐gt13 NP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

Example13 213

PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 NP13 -shy‐gt13 NP13 PPVP13 -shy‐gt13 V13 NPVP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 313

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NPNP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Mary13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 413

S(05) S(05)

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 NP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 2: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Announcements13

bull  Industry13 Meet13 and13 Greet13 13 ndash Tuesday13 March13 1113

bull  JBS13 13 Summer13 201413

PARSING13 13

bull  Parsing13 is13 the13 process13 of13 recognizing13 and13 assigning13 STRUCTURE13

bull  Parsing13 a13 string13 with13 a13 CFG13 13 ndash Finding13 a13 derivaon13 of13 the13 string13 consistent13 with13 the13 grammar13

ndash The13 derivaon13 gives13 us13 a13 PARSE13 TREE13

Grammar13 S13 13 NP13 VP13 S13 Aux13 NP13 VP13 S13 13 VP13 NP13 13 Det13 Nom13 Nom13 13 Noun13 Nom13 13 Noun13 Nom13

PARSING13 AS13 SEARCH13 bull  The13 main13 problem13 with13 parsing13 is13 the13 existence13 of13 CHOICE13 POINTS13

bull  Parsing13 Strategy ndash  Top down

bull  Expectation Driven bull  Start with ldquoSrdquo

ndash  Bottom up bull  Data Driven bull  Start with wordscategories13

bull  Search13 Strategy13 ndash Determining13 the13 order13 alternaves13 are13 considered

bull  Depth first bull  Breadth first

TOP-shy‐DOWN13 vs13 13 BOTTOM-shy‐UP13

bull  TOP-shy‐DOWN13 ndash Only13 search13 among13 grammacal13 answers13 ndash BUT13 suggests13 hypotheses13 that13 may13 not13 be13 consistent13 with13 data13

ndash Problem13 le[-shy‐recursion13

bull  BOTTOM-shy‐UP13 ndash Only13 forms13 hypotheses13 consistent13 with13 data13 ndash BUT13 may13 suggest13 hypotheses13 that13 make13 no13 sense13 globally13

NON-shy‐PARALLEL13 SEARCH13

bull  If13 itrsquos13 not13 possible13 to13 examine13 all13 alternaves13 in13 parallel13 itrsquos13 necessary13 to13 make13 further13 decisions13 ndash Which13 node13 in13 the13 current13 search13 space13 to13 expand13 first13 (breadth-shy‐first13 or13 depth-shy‐first)13

ndash Which13 of13 the13 applicable13 grammar13 rules13 to13 expand13 first13

ndash Which13 leaf13 node13 in13 a13 parse13 tree13 to13 expand13 next13 (eg13 le[most)13

TOP-shy‐DOWN13 DEPTH-shy‐FIRST13 13 LEFT-shy‐TO-shy‐RIGHT13

TOP-shy‐DOWN13 DEPTH-shy‐FIRST13 13 LEFT-shy‐TO-shy‐RIGHT13 (II)13

TOP-shy‐DOWN13 DEPTH-shy‐FIRST13 13 LEFT-shy‐TO-shy‐RIGHT13 (III)13

A13 T-shy‐D13 D-shy‐F13 L-shy‐R13 PARSER13

LEFT-shy‐RECURSION13

bull  A13 LEFT-shy‐RECURSIVE13 grammar13 may13 cause13 a13 T-shy‐D13 D-shy‐F13 L-shy‐R13 parser13 to13 never13 return13

bull  Examples13 of13 le[-shy‐recursive13 rules13 ndash NP13 13 NP13 PP13 ndash S13 13 S13 and13 S13

ndash But13 also13 bull  NP13 13 Det13 Nom13 bull  Det13 13 NPrsquos13

THE13 PROBLEM13 WITH13 13 LEFT-shy‐RECURSION13

Dynamic13 Programming13

bull  We13 need13 a13 method13 that13 fills13 a13 table13 with13 paral13 results13 that13 ndash Does13 not13 do13 (avoidable)13 repeated13 work13 ndash Does13 not13 fall13 prey13 to13 le[-shy‐recursion13 ndash Can13 find13 all13 the13 pieces13 of13 an13 exponenal13 number13 of13 trees13 in13 polynomial13 me13

bull  Two13 popular13 methods13 ndash CKY13 ndash Earley13

The13 CKY13 (Cocke-shy‐Kasami-shy‐Younger)13 Algorithm13

bull  Requires13 the13 grammar13 be13 in13 Chomsky13 Normal13 Form13 (CNF)13 ndash All13 rules13 must13 be13 in13 following13 form13

bull  A13 -shy‐gt13 B13 C13 bull  A13 -shy‐gt13 w13

bull  Any13 grammar13 can13 be13 converted13 automacally13 to13 Chomsky13 Normal13 Form13

Converng13 to13 CNF13

bull  Rules13 that13 mix13 terminals13 and13 non-shy‐terminals13 ndash  Introduce13 a13 new13 dummy13 non-shy‐terminal13 that13 covers13 the13 terminal13

bull  INFVP13 -shy‐gt13 to13 VP13 13 13 13 13 13 replaced13 by13

bull  INFVP13 -shy‐gt13 TO13 VP13 bull  TO13 -shy‐gt13 to13

bull  Rules13 that13 have13 a13 single13 non-shy‐terminal13 on13 right13 (ldquounit13 produconsrdquo)13 ndash  Rewrite13 each13 unit13 producon13 with13 the13 RHS13 of13 their13 expansions13

bull  Rules13 whose13 right13 hand13 side13 length13 gt213 ndash  Introduce13 dummy13 non-shy‐terminals13 that13 spread13 the13 right-shy‐hand13 side13

Sample13 Grammar13 Det13 13 |13 a13 |13 the13 Noun13 book13 |13 saw13 |13 mark13 Verb13 13 book13 |13 saw13 Proper-shy‐Noun13 13 Mark13 Aux13 Did13 |13 Has13 Prep13 to13 |13 on13 |13 near13

S13 13 NP13 VP13 S13 Aux13 NP13 VP13 S13 13 VP13 NP13 13 NP13 PP13 NP13 13 Det13 Noun13 NP13 13 PrN13 VP13 13 V13 VP13 13 V13 NP13 VP13 13 V13 NP13 PP13 PP13 13 Prep13 NP13

Automac13 Conversion13 to13 CNF13

Back13 to13 CKY13 Parsing13

bull  Given13 rules13 in13 CNF13 bull  Consider13 the13 rule13 A13 -shy‐gt13 BC13

ndash  If13 there13 is13 an13 A13 in13 the13 input13 then13 there13 must13 be13 a13 B13 followed13 by13 a13 C13 in13 the13 input13

ndash  If13 the13 A13 goes13 from13 i13 to13 j13 in13 the13 input13 then13 there13 must13 be13 some13 k13 st13 iltkltj13 bull  Ie13 The13 B13 splits13 from13 the13 C13 someplace13

CKY13

bull  So13 letrsquos13 build13 a13 table13 so13 that13 an13 A13 spanning13 from13 i13 to13 j13 in13 the13 input13 is13 placed13 in13 cell13 [ij]13 in13 the13 table13

bull  So13 a13 non-shy‐terminal13 spanning13 an13 enre13 string13 will13 sit13 in13 cell13 [013 n]13

bull  If13 we13 build13 the13 table13 bojom13 up13 wersquoll13 know13 that13 the13 parts13 of13 the13 A13 must13 go13 from13 i13 to13 k13 and13 from13 k13 to13 j13

CKY13

bull  Meaning13 that13 for13 a13 rule13 like13 A13 -shy‐gt13 B13 C13 we13 should13 look13 for13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13

bull  In13 other13 words13 if13 we13 think13 there13 might13 be13 an13 A13 spanning13 ij13 in13 the13 inputhellip13 AND13

bull  A13 -shy‐gt13 B13 C13 is13 a13 rule13 in13 the13 grammar13 THEN13 bull  There13 must13 be13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13 for13 some13 iltkltj13

bull  So13 just13 loop13 over13 the13 possible13 k13 values13

CKY13 Table13

bull Filling the [ij]th cell in the CKY table

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

CKY13 Algorithm13

Note13

bull  We13 arranged13 the13 loops13 to13 fill13 the13 table13 a13 column13 at13 a13 me13 from13 le[13 to13 right13 bojom13 to13 top13 13 ndash This13 assures13 us13 that13 whenever13 wersquore13 filling13 a13 cell13 the13 parts13 needed13 to13 fill13 it13 are13 already13 in13 the13 table13 (to13 the13 le[13 and13 below)13

ndash Are13 there13 other13 ways13 to13 fill13 the13 table13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

Example13

John called Sue from Denver

S

VP13 PP13

NP13 VP13

V13 NP13 NP13 P13

John called Sue from Denver

S13

PP13

NP13 VP13

NP13

NP13

V13

Example13 113

13 S(05) NP(45)

P(34) Denver

NP(23)13 V(23)

from

V(12) Sue

NP(01) called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 VP13 -shy‐gt13 VP13 PP13 NP13 -shy‐gt13 NP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

Example13 213

PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 NP13 -shy‐gt13 NP13 PPVP13 -shy‐gt13 V13 NPVP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 313

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NPNP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Mary13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 413

S(05) S(05)

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 NP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 3: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

PARSING13 13

bull  Parsing13 is13 the13 process13 of13 recognizing13 and13 assigning13 STRUCTURE13

bull  Parsing13 a13 string13 with13 a13 CFG13 13 ndash Finding13 a13 derivaon13 of13 the13 string13 consistent13 with13 the13 grammar13

ndash The13 derivaon13 gives13 us13 a13 PARSE13 TREE13

Grammar13 S13 13 NP13 VP13 S13 Aux13 NP13 VP13 S13 13 VP13 NP13 13 Det13 Nom13 Nom13 13 Noun13 Nom13 13 Noun13 Nom13

PARSING13 AS13 SEARCH13 bull  The13 main13 problem13 with13 parsing13 is13 the13 existence13 of13 CHOICE13 POINTS13

bull  Parsing13 Strategy ndash  Top down

bull  Expectation Driven bull  Start with ldquoSrdquo

ndash  Bottom up bull  Data Driven bull  Start with wordscategories13

bull  Search13 Strategy13 ndash Determining13 the13 order13 alternaves13 are13 considered

bull  Depth first bull  Breadth first

TOP-shy‐DOWN13 vs13 13 BOTTOM-shy‐UP13

bull  TOP-shy‐DOWN13 ndash Only13 search13 among13 grammacal13 answers13 ndash BUT13 suggests13 hypotheses13 that13 may13 not13 be13 consistent13 with13 data13

ndash Problem13 le[-shy‐recursion13

bull  BOTTOM-shy‐UP13 ndash Only13 forms13 hypotheses13 consistent13 with13 data13 ndash BUT13 may13 suggest13 hypotheses13 that13 make13 no13 sense13 globally13

NON-shy‐PARALLEL13 SEARCH13

bull  If13 itrsquos13 not13 possible13 to13 examine13 all13 alternaves13 in13 parallel13 itrsquos13 necessary13 to13 make13 further13 decisions13 ndash Which13 node13 in13 the13 current13 search13 space13 to13 expand13 first13 (breadth-shy‐first13 or13 depth-shy‐first)13

ndash Which13 of13 the13 applicable13 grammar13 rules13 to13 expand13 first13

ndash Which13 leaf13 node13 in13 a13 parse13 tree13 to13 expand13 next13 (eg13 le[most)13

TOP-shy‐DOWN13 DEPTH-shy‐FIRST13 13 LEFT-shy‐TO-shy‐RIGHT13

TOP-shy‐DOWN13 DEPTH-shy‐FIRST13 13 LEFT-shy‐TO-shy‐RIGHT13 (II)13

TOP-shy‐DOWN13 DEPTH-shy‐FIRST13 13 LEFT-shy‐TO-shy‐RIGHT13 (III)13

A13 T-shy‐D13 D-shy‐F13 L-shy‐R13 PARSER13

LEFT-shy‐RECURSION13

bull  A13 LEFT-shy‐RECURSIVE13 grammar13 may13 cause13 a13 T-shy‐D13 D-shy‐F13 L-shy‐R13 parser13 to13 never13 return13

bull  Examples13 of13 le[-shy‐recursive13 rules13 ndash NP13 13 NP13 PP13 ndash S13 13 S13 and13 S13

ndash But13 also13 bull  NP13 13 Det13 Nom13 bull  Det13 13 NPrsquos13

THE13 PROBLEM13 WITH13 13 LEFT-shy‐RECURSION13

Dynamic13 Programming13

bull  We13 need13 a13 method13 that13 fills13 a13 table13 with13 paral13 results13 that13 ndash Does13 not13 do13 (avoidable)13 repeated13 work13 ndash Does13 not13 fall13 prey13 to13 le[-shy‐recursion13 ndash Can13 find13 all13 the13 pieces13 of13 an13 exponenal13 number13 of13 trees13 in13 polynomial13 me13

bull  Two13 popular13 methods13 ndash CKY13 ndash Earley13

The13 CKY13 (Cocke-shy‐Kasami-shy‐Younger)13 Algorithm13

bull  Requires13 the13 grammar13 be13 in13 Chomsky13 Normal13 Form13 (CNF)13 ndash All13 rules13 must13 be13 in13 following13 form13

bull  A13 -shy‐gt13 B13 C13 bull  A13 -shy‐gt13 w13

bull  Any13 grammar13 can13 be13 converted13 automacally13 to13 Chomsky13 Normal13 Form13

Converng13 to13 CNF13

bull  Rules13 that13 mix13 terminals13 and13 non-shy‐terminals13 ndash  Introduce13 a13 new13 dummy13 non-shy‐terminal13 that13 covers13 the13 terminal13

bull  INFVP13 -shy‐gt13 to13 VP13 13 13 13 13 13 replaced13 by13

bull  INFVP13 -shy‐gt13 TO13 VP13 bull  TO13 -shy‐gt13 to13

bull  Rules13 that13 have13 a13 single13 non-shy‐terminal13 on13 right13 (ldquounit13 produconsrdquo)13 ndash  Rewrite13 each13 unit13 producon13 with13 the13 RHS13 of13 their13 expansions13

bull  Rules13 whose13 right13 hand13 side13 length13 gt213 ndash  Introduce13 dummy13 non-shy‐terminals13 that13 spread13 the13 right-shy‐hand13 side13

Sample13 Grammar13 Det13 13 |13 a13 |13 the13 Noun13 book13 |13 saw13 |13 mark13 Verb13 13 book13 |13 saw13 Proper-shy‐Noun13 13 Mark13 Aux13 Did13 |13 Has13 Prep13 to13 |13 on13 |13 near13

S13 13 NP13 VP13 S13 Aux13 NP13 VP13 S13 13 VP13 NP13 13 NP13 PP13 NP13 13 Det13 Noun13 NP13 13 PrN13 VP13 13 V13 VP13 13 V13 NP13 VP13 13 V13 NP13 PP13 PP13 13 Prep13 NP13

Automac13 Conversion13 to13 CNF13

Back13 to13 CKY13 Parsing13

bull  Given13 rules13 in13 CNF13 bull  Consider13 the13 rule13 A13 -shy‐gt13 BC13

ndash  If13 there13 is13 an13 A13 in13 the13 input13 then13 there13 must13 be13 a13 B13 followed13 by13 a13 C13 in13 the13 input13

ndash  If13 the13 A13 goes13 from13 i13 to13 j13 in13 the13 input13 then13 there13 must13 be13 some13 k13 st13 iltkltj13 bull  Ie13 The13 B13 splits13 from13 the13 C13 someplace13

CKY13

bull  So13 letrsquos13 build13 a13 table13 so13 that13 an13 A13 spanning13 from13 i13 to13 j13 in13 the13 input13 is13 placed13 in13 cell13 [ij]13 in13 the13 table13

bull  So13 a13 non-shy‐terminal13 spanning13 an13 enre13 string13 will13 sit13 in13 cell13 [013 n]13

bull  If13 we13 build13 the13 table13 bojom13 up13 wersquoll13 know13 that13 the13 parts13 of13 the13 A13 must13 go13 from13 i13 to13 k13 and13 from13 k13 to13 j13

CKY13

bull  Meaning13 that13 for13 a13 rule13 like13 A13 -shy‐gt13 B13 C13 we13 should13 look13 for13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13

bull  In13 other13 words13 if13 we13 think13 there13 might13 be13 an13 A13 spanning13 ij13 in13 the13 inputhellip13 AND13

bull  A13 -shy‐gt13 B13 C13 is13 a13 rule13 in13 the13 grammar13 THEN13 bull  There13 must13 be13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13 for13 some13 iltkltj13

bull  So13 just13 loop13 over13 the13 possible13 k13 values13

CKY13 Table13

bull Filling the [ij]th cell in the CKY table

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

CKY13 Algorithm13

Note13

bull  We13 arranged13 the13 loops13 to13 fill13 the13 table13 a13 column13 at13 a13 me13 from13 le[13 to13 right13 bojom13 to13 top13 13 ndash This13 assures13 us13 that13 whenever13 wersquore13 filling13 a13 cell13 the13 parts13 needed13 to13 fill13 it13 are13 already13 in13 the13 table13 (to13 the13 le[13 and13 below)13

ndash Are13 there13 other13 ways13 to13 fill13 the13 table13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

Example13

John called Sue from Denver

S

VP13 PP13

NP13 VP13

V13 NP13 NP13 P13

John called Sue from Denver

S13

PP13

NP13 VP13

NP13

NP13

V13

Example13 113

13 S(05) NP(45)

P(34) Denver

NP(23)13 V(23)

from

V(12) Sue

NP(01) called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 VP13 -shy‐gt13 VP13 PP13 NP13 -shy‐gt13 NP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

Example13 213

PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 NP13 -shy‐gt13 NP13 PPVP13 -shy‐gt13 V13 NPVP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 313

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NPNP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Mary13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 413

S(05) S(05)

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 NP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 4: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

PARSING13 AS13 SEARCH13 bull  The13 main13 problem13 with13 parsing13 is13 the13 existence13 of13 CHOICE13 POINTS13

bull  Parsing13 Strategy ndash  Top down

bull  Expectation Driven bull  Start with ldquoSrdquo

ndash  Bottom up bull  Data Driven bull  Start with wordscategories13

bull  Search13 Strategy13 ndash Determining13 the13 order13 alternaves13 are13 considered

bull  Depth first bull  Breadth first

TOP-shy‐DOWN13 vs13 13 BOTTOM-shy‐UP13

bull  TOP-shy‐DOWN13 ndash Only13 search13 among13 grammacal13 answers13 ndash BUT13 suggests13 hypotheses13 that13 may13 not13 be13 consistent13 with13 data13

ndash Problem13 le[-shy‐recursion13

bull  BOTTOM-shy‐UP13 ndash Only13 forms13 hypotheses13 consistent13 with13 data13 ndash BUT13 may13 suggest13 hypotheses13 that13 make13 no13 sense13 globally13

NON-shy‐PARALLEL13 SEARCH13

bull  If13 itrsquos13 not13 possible13 to13 examine13 all13 alternaves13 in13 parallel13 itrsquos13 necessary13 to13 make13 further13 decisions13 ndash Which13 node13 in13 the13 current13 search13 space13 to13 expand13 first13 (breadth-shy‐first13 or13 depth-shy‐first)13

ndash Which13 of13 the13 applicable13 grammar13 rules13 to13 expand13 first13

ndash Which13 leaf13 node13 in13 a13 parse13 tree13 to13 expand13 next13 (eg13 le[most)13

TOP-shy‐DOWN13 DEPTH-shy‐FIRST13 13 LEFT-shy‐TO-shy‐RIGHT13

TOP-shy‐DOWN13 DEPTH-shy‐FIRST13 13 LEFT-shy‐TO-shy‐RIGHT13 (II)13

TOP-shy‐DOWN13 DEPTH-shy‐FIRST13 13 LEFT-shy‐TO-shy‐RIGHT13 (III)13

A13 T-shy‐D13 D-shy‐F13 L-shy‐R13 PARSER13

LEFT-shy‐RECURSION13

bull  A13 LEFT-shy‐RECURSIVE13 grammar13 may13 cause13 a13 T-shy‐D13 D-shy‐F13 L-shy‐R13 parser13 to13 never13 return13

bull  Examples13 of13 le[-shy‐recursive13 rules13 ndash NP13 13 NP13 PP13 ndash S13 13 S13 and13 S13

ndash But13 also13 bull  NP13 13 Det13 Nom13 bull  Det13 13 NPrsquos13

THE13 PROBLEM13 WITH13 13 LEFT-shy‐RECURSION13

Dynamic13 Programming13

bull  We13 need13 a13 method13 that13 fills13 a13 table13 with13 paral13 results13 that13 ndash Does13 not13 do13 (avoidable)13 repeated13 work13 ndash Does13 not13 fall13 prey13 to13 le[-shy‐recursion13 ndash Can13 find13 all13 the13 pieces13 of13 an13 exponenal13 number13 of13 trees13 in13 polynomial13 me13

bull  Two13 popular13 methods13 ndash CKY13 ndash Earley13

The13 CKY13 (Cocke-shy‐Kasami-shy‐Younger)13 Algorithm13

bull  Requires13 the13 grammar13 be13 in13 Chomsky13 Normal13 Form13 (CNF)13 ndash All13 rules13 must13 be13 in13 following13 form13

bull  A13 -shy‐gt13 B13 C13 bull  A13 -shy‐gt13 w13

bull  Any13 grammar13 can13 be13 converted13 automacally13 to13 Chomsky13 Normal13 Form13

Converng13 to13 CNF13

bull  Rules13 that13 mix13 terminals13 and13 non-shy‐terminals13 ndash  Introduce13 a13 new13 dummy13 non-shy‐terminal13 that13 covers13 the13 terminal13

bull  INFVP13 -shy‐gt13 to13 VP13 13 13 13 13 13 replaced13 by13

bull  INFVP13 -shy‐gt13 TO13 VP13 bull  TO13 -shy‐gt13 to13

bull  Rules13 that13 have13 a13 single13 non-shy‐terminal13 on13 right13 (ldquounit13 produconsrdquo)13 ndash  Rewrite13 each13 unit13 producon13 with13 the13 RHS13 of13 their13 expansions13

bull  Rules13 whose13 right13 hand13 side13 length13 gt213 ndash  Introduce13 dummy13 non-shy‐terminals13 that13 spread13 the13 right-shy‐hand13 side13

Sample13 Grammar13 Det13 13 |13 a13 |13 the13 Noun13 book13 |13 saw13 |13 mark13 Verb13 13 book13 |13 saw13 Proper-shy‐Noun13 13 Mark13 Aux13 Did13 |13 Has13 Prep13 to13 |13 on13 |13 near13

S13 13 NP13 VP13 S13 Aux13 NP13 VP13 S13 13 VP13 NP13 13 NP13 PP13 NP13 13 Det13 Noun13 NP13 13 PrN13 VP13 13 V13 VP13 13 V13 NP13 VP13 13 V13 NP13 PP13 PP13 13 Prep13 NP13

Automac13 Conversion13 to13 CNF13

Back13 to13 CKY13 Parsing13

bull  Given13 rules13 in13 CNF13 bull  Consider13 the13 rule13 A13 -shy‐gt13 BC13

ndash  If13 there13 is13 an13 A13 in13 the13 input13 then13 there13 must13 be13 a13 B13 followed13 by13 a13 C13 in13 the13 input13

ndash  If13 the13 A13 goes13 from13 i13 to13 j13 in13 the13 input13 then13 there13 must13 be13 some13 k13 st13 iltkltj13 bull  Ie13 The13 B13 splits13 from13 the13 C13 someplace13

CKY13

bull  So13 letrsquos13 build13 a13 table13 so13 that13 an13 A13 spanning13 from13 i13 to13 j13 in13 the13 input13 is13 placed13 in13 cell13 [ij]13 in13 the13 table13

bull  So13 a13 non-shy‐terminal13 spanning13 an13 enre13 string13 will13 sit13 in13 cell13 [013 n]13

bull  If13 we13 build13 the13 table13 bojom13 up13 wersquoll13 know13 that13 the13 parts13 of13 the13 A13 must13 go13 from13 i13 to13 k13 and13 from13 k13 to13 j13

CKY13

bull  Meaning13 that13 for13 a13 rule13 like13 A13 -shy‐gt13 B13 C13 we13 should13 look13 for13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13

bull  In13 other13 words13 if13 we13 think13 there13 might13 be13 an13 A13 spanning13 ij13 in13 the13 inputhellip13 AND13

bull  A13 -shy‐gt13 B13 C13 is13 a13 rule13 in13 the13 grammar13 THEN13 bull  There13 must13 be13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13 for13 some13 iltkltj13

bull  So13 just13 loop13 over13 the13 possible13 k13 values13

CKY13 Table13

bull Filling the [ij]th cell in the CKY table

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

CKY13 Algorithm13

Note13

bull  We13 arranged13 the13 loops13 to13 fill13 the13 table13 a13 column13 at13 a13 me13 from13 le[13 to13 right13 bojom13 to13 top13 13 ndash This13 assures13 us13 that13 whenever13 wersquore13 filling13 a13 cell13 the13 parts13 needed13 to13 fill13 it13 are13 already13 in13 the13 table13 (to13 the13 le[13 and13 below)13

ndash Are13 there13 other13 ways13 to13 fill13 the13 table13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

Example13

John called Sue from Denver

S

VP13 PP13

NP13 VP13

V13 NP13 NP13 P13

John called Sue from Denver

S13

PP13

NP13 VP13

NP13

NP13

V13

Example13 113

13 S(05) NP(45)

P(34) Denver

NP(23)13 V(23)

from

V(12) Sue

NP(01) called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 VP13 -shy‐gt13 VP13 PP13 NP13 -shy‐gt13 NP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

Example13 213

PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 NP13 -shy‐gt13 NP13 PPVP13 -shy‐gt13 V13 NPVP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 313

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NPNP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Mary13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 413

S(05) S(05)

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 NP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 5: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

TOP-shy‐DOWN13 vs13 13 BOTTOM-shy‐UP13

bull  TOP-shy‐DOWN13 ndash Only13 search13 among13 grammacal13 answers13 ndash BUT13 suggests13 hypotheses13 that13 may13 not13 be13 consistent13 with13 data13

ndash Problem13 le[-shy‐recursion13

bull  BOTTOM-shy‐UP13 ndash Only13 forms13 hypotheses13 consistent13 with13 data13 ndash BUT13 may13 suggest13 hypotheses13 that13 make13 no13 sense13 globally13

NON-shy‐PARALLEL13 SEARCH13

bull  If13 itrsquos13 not13 possible13 to13 examine13 all13 alternaves13 in13 parallel13 itrsquos13 necessary13 to13 make13 further13 decisions13 ndash Which13 node13 in13 the13 current13 search13 space13 to13 expand13 first13 (breadth-shy‐first13 or13 depth-shy‐first)13

ndash Which13 of13 the13 applicable13 grammar13 rules13 to13 expand13 first13

ndash Which13 leaf13 node13 in13 a13 parse13 tree13 to13 expand13 next13 (eg13 le[most)13

TOP-shy‐DOWN13 DEPTH-shy‐FIRST13 13 LEFT-shy‐TO-shy‐RIGHT13

TOP-shy‐DOWN13 DEPTH-shy‐FIRST13 13 LEFT-shy‐TO-shy‐RIGHT13 (II)13

TOP-shy‐DOWN13 DEPTH-shy‐FIRST13 13 LEFT-shy‐TO-shy‐RIGHT13 (III)13

A13 T-shy‐D13 D-shy‐F13 L-shy‐R13 PARSER13

LEFT-shy‐RECURSION13

bull  A13 LEFT-shy‐RECURSIVE13 grammar13 may13 cause13 a13 T-shy‐D13 D-shy‐F13 L-shy‐R13 parser13 to13 never13 return13

bull  Examples13 of13 le[-shy‐recursive13 rules13 ndash NP13 13 NP13 PP13 ndash S13 13 S13 and13 S13

ndash But13 also13 bull  NP13 13 Det13 Nom13 bull  Det13 13 NPrsquos13

THE13 PROBLEM13 WITH13 13 LEFT-shy‐RECURSION13

Dynamic13 Programming13

bull  We13 need13 a13 method13 that13 fills13 a13 table13 with13 paral13 results13 that13 ndash Does13 not13 do13 (avoidable)13 repeated13 work13 ndash Does13 not13 fall13 prey13 to13 le[-shy‐recursion13 ndash Can13 find13 all13 the13 pieces13 of13 an13 exponenal13 number13 of13 trees13 in13 polynomial13 me13

bull  Two13 popular13 methods13 ndash CKY13 ndash Earley13

The13 CKY13 (Cocke-shy‐Kasami-shy‐Younger)13 Algorithm13

bull  Requires13 the13 grammar13 be13 in13 Chomsky13 Normal13 Form13 (CNF)13 ndash All13 rules13 must13 be13 in13 following13 form13

bull  A13 -shy‐gt13 B13 C13 bull  A13 -shy‐gt13 w13

bull  Any13 grammar13 can13 be13 converted13 automacally13 to13 Chomsky13 Normal13 Form13

Converng13 to13 CNF13

bull  Rules13 that13 mix13 terminals13 and13 non-shy‐terminals13 ndash  Introduce13 a13 new13 dummy13 non-shy‐terminal13 that13 covers13 the13 terminal13

bull  INFVP13 -shy‐gt13 to13 VP13 13 13 13 13 13 replaced13 by13

bull  INFVP13 -shy‐gt13 TO13 VP13 bull  TO13 -shy‐gt13 to13

bull  Rules13 that13 have13 a13 single13 non-shy‐terminal13 on13 right13 (ldquounit13 produconsrdquo)13 ndash  Rewrite13 each13 unit13 producon13 with13 the13 RHS13 of13 their13 expansions13

bull  Rules13 whose13 right13 hand13 side13 length13 gt213 ndash  Introduce13 dummy13 non-shy‐terminals13 that13 spread13 the13 right-shy‐hand13 side13

Sample13 Grammar13 Det13 13 |13 a13 |13 the13 Noun13 book13 |13 saw13 |13 mark13 Verb13 13 book13 |13 saw13 Proper-shy‐Noun13 13 Mark13 Aux13 Did13 |13 Has13 Prep13 to13 |13 on13 |13 near13

S13 13 NP13 VP13 S13 Aux13 NP13 VP13 S13 13 VP13 NP13 13 NP13 PP13 NP13 13 Det13 Noun13 NP13 13 PrN13 VP13 13 V13 VP13 13 V13 NP13 VP13 13 V13 NP13 PP13 PP13 13 Prep13 NP13

Automac13 Conversion13 to13 CNF13

Back13 to13 CKY13 Parsing13

bull  Given13 rules13 in13 CNF13 bull  Consider13 the13 rule13 A13 -shy‐gt13 BC13

ndash  If13 there13 is13 an13 A13 in13 the13 input13 then13 there13 must13 be13 a13 B13 followed13 by13 a13 C13 in13 the13 input13

ndash  If13 the13 A13 goes13 from13 i13 to13 j13 in13 the13 input13 then13 there13 must13 be13 some13 k13 st13 iltkltj13 bull  Ie13 The13 B13 splits13 from13 the13 C13 someplace13

CKY13

bull  So13 letrsquos13 build13 a13 table13 so13 that13 an13 A13 spanning13 from13 i13 to13 j13 in13 the13 input13 is13 placed13 in13 cell13 [ij]13 in13 the13 table13

bull  So13 a13 non-shy‐terminal13 spanning13 an13 enre13 string13 will13 sit13 in13 cell13 [013 n]13

bull  If13 we13 build13 the13 table13 bojom13 up13 wersquoll13 know13 that13 the13 parts13 of13 the13 A13 must13 go13 from13 i13 to13 k13 and13 from13 k13 to13 j13

CKY13

bull  Meaning13 that13 for13 a13 rule13 like13 A13 -shy‐gt13 B13 C13 we13 should13 look13 for13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13

bull  In13 other13 words13 if13 we13 think13 there13 might13 be13 an13 A13 spanning13 ij13 in13 the13 inputhellip13 AND13

bull  A13 -shy‐gt13 B13 C13 is13 a13 rule13 in13 the13 grammar13 THEN13 bull  There13 must13 be13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13 for13 some13 iltkltj13

bull  So13 just13 loop13 over13 the13 possible13 k13 values13

CKY13 Table13

bull Filling the [ij]th cell in the CKY table

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

CKY13 Algorithm13

Note13

bull  We13 arranged13 the13 loops13 to13 fill13 the13 table13 a13 column13 at13 a13 me13 from13 le[13 to13 right13 bojom13 to13 top13 13 ndash This13 assures13 us13 that13 whenever13 wersquore13 filling13 a13 cell13 the13 parts13 needed13 to13 fill13 it13 are13 already13 in13 the13 table13 (to13 the13 le[13 and13 below)13

ndash Are13 there13 other13 ways13 to13 fill13 the13 table13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

Example13

John called Sue from Denver

S

VP13 PP13

NP13 VP13

V13 NP13 NP13 P13

John called Sue from Denver

S13

PP13

NP13 VP13

NP13

NP13

V13

Example13 113

13 S(05) NP(45)

P(34) Denver

NP(23)13 V(23)

from

V(12) Sue

NP(01) called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 VP13 -shy‐gt13 VP13 PP13 NP13 -shy‐gt13 NP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

Example13 213

PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 NP13 -shy‐gt13 NP13 PPVP13 -shy‐gt13 V13 NPVP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 313

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NPNP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Mary13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 413

S(05) S(05)

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 NP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 6: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

NON-shy‐PARALLEL13 SEARCH13

bull  If13 itrsquos13 not13 possible13 to13 examine13 all13 alternaves13 in13 parallel13 itrsquos13 necessary13 to13 make13 further13 decisions13 ndash Which13 node13 in13 the13 current13 search13 space13 to13 expand13 first13 (breadth-shy‐first13 or13 depth-shy‐first)13

ndash Which13 of13 the13 applicable13 grammar13 rules13 to13 expand13 first13

ndash Which13 leaf13 node13 in13 a13 parse13 tree13 to13 expand13 next13 (eg13 le[most)13

TOP-shy‐DOWN13 DEPTH-shy‐FIRST13 13 LEFT-shy‐TO-shy‐RIGHT13

TOP-shy‐DOWN13 DEPTH-shy‐FIRST13 13 LEFT-shy‐TO-shy‐RIGHT13 (II)13

TOP-shy‐DOWN13 DEPTH-shy‐FIRST13 13 LEFT-shy‐TO-shy‐RIGHT13 (III)13

A13 T-shy‐D13 D-shy‐F13 L-shy‐R13 PARSER13

LEFT-shy‐RECURSION13

bull  A13 LEFT-shy‐RECURSIVE13 grammar13 may13 cause13 a13 T-shy‐D13 D-shy‐F13 L-shy‐R13 parser13 to13 never13 return13

bull  Examples13 of13 le[-shy‐recursive13 rules13 ndash NP13 13 NP13 PP13 ndash S13 13 S13 and13 S13

ndash But13 also13 bull  NP13 13 Det13 Nom13 bull  Det13 13 NPrsquos13

THE13 PROBLEM13 WITH13 13 LEFT-shy‐RECURSION13

Dynamic13 Programming13

bull  We13 need13 a13 method13 that13 fills13 a13 table13 with13 paral13 results13 that13 ndash Does13 not13 do13 (avoidable)13 repeated13 work13 ndash Does13 not13 fall13 prey13 to13 le[-shy‐recursion13 ndash Can13 find13 all13 the13 pieces13 of13 an13 exponenal13 number13 of13 trees13 in13 polynomial13 me13

bull  Two13 popular13 methods13 ndash CKY13 ndash Earley13

The13 CKY13 (Cocke-shy‐Kasami-shy‐Younger)13 Algorithm13

bull  Requires13 the13 grammar13 be13 in13 Chomsky13 Normal13 Form13 (CNF)13 ndash All13 rules13 must13 be13 in13 following13 form13

bull  A13 -shy‐gt13 B13 C13 bull  A13 -shy‐gt13 w13

bull  Any13 grammar13 can13 be13 converted13 automacally13 to13 Chomsky13 Normal13 Form13

Converng13 to13 CNF13

bull  Rules13 that13 mix13 terminals13 and13 non-shy‐terminals13 ndash  Introduce13 a13 new13 dummy13 non-shy‐terminal13 that13 covers13 the13 terminal13

bull  INFVP13 -shy‐gt13 to13 VP13 13 13 13 13 13 replaced13 by13

bull  INFVP13 -shy‐gt13 TO13 VP13 bull  TO13 -shy‐gt13 to13

bull  Rules13 that13 have13 a13 single13 non-shy‐terminal13 on13 right13 (ldquounit13 produconsrdquo)13 ndash  Rewrite13 each13 unit13 producon13 with13 the13 RHS13 of13 their13 expansions13

bull  Rules13 whose13 right13 hand13 side13 length13 gt213 ndash  Introduce13 dummy13 non-shy‐terminals13 that13 spread13 the13 right-shy‐hand13 side13

Sample13 Grammar13 Det13 13 |13 a13 |13 the13 Noun13 book13 |13 saw13 |13 mark13 Verb13 13 book13 |13 saw13 Proper-shy‐Noun13 13 Mark13 Aux13 Did13 |13 Has13 Prep13 to13 |13 on13 |13 near13

S13 13 NP13 VP13 S13 Aux13 NP13 VP13 S13 13 VP13 NP13 13 NP13 PP13 NP13 13 Det13 Noun13 NP13 13 PrN13 VP13 13 V13 VP13 13 V13 NP13 VP13 13 V13 NP13 PP13 PP13 13 Prep13 NP13

Automac13 Conversion13 to13 CNF13

Back13 to13 CKY13 Parsing13

bull  Given13 rules13 in13 CNF13 bull  Consider13 the13 rule13 A13 -shy‐gt13 BC13

ndash  If13 there13 is13 an13 A13 in13 the13 input13 then13 there13 must13 be13 a13 B13 followed13 by13 a13 C13 in13 the13 input13

ndash  If13 the13 A13 goes13 from13 i13 to13 j13 in13 the13 input13 then13 there13 must13 be13 some13 k13 st13 iltkltj13 bull  Ie13 The13 B13 splits13 from13 the13 C13 someplace13

CKY13

bull  So13 letrsquos13 build13 a13 table13 so13 that13 an13 A13 spanning13 from13 i13 to13 j13 in13 the13 input13 is13 placed13 in13 cell13 [ij]13 in13 the13 table13

bull  So13 a13 non-shy‐terminal13 spanning13 an13 enre13 string13 will13 sit13 in13 cell13 [013 n]13

bull  If13 we13 build13 the13 table13 bojom13 up13 wersquoll13 know13 that13 the13 parts13 of13 the13 A13 must13 go13 from13 i13 to13 k13 and13 from13 k13 to13 j13

CKY13

bull  Meaning13 that13 for13 a13 rule13 like13 A13 -shy‐gt13 B13 C13 we13 should13 look13 for13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13

bull  In13 other13 words13 if13 we13 think13 there13 might13 be13 an13 A13 spanning13 ij13 in13 the13 inputhellip13 AND13

bull  A13 -shy‐gt13 B13 C13 is13 a13 rule13 in13 the13 grammar13 THEN13 bull  There13 must13 be13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13 for13 some13 iltkltj13

bull  So13 just13 loop13 over13 the13 possible13 k13 values13

CKY13 Table13

bull Filling the [ij]th cell in the CKY table

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

CKY13 Algorithm13

Note13

bull  We13 arranged13 the13 loops13 to13 fill13 the13 table13 a13 column13 at13 a13 me13 from13 le[13 to13 right13 bojom13 to13 top13 13 ndash This13 assures13 us13 that13 whenever13 wersquore13 filling13 a13 cell13 the13 parts13 needed13 to13 fill13 it13 are13 already13 in13 the13 table13 (to13 the13 le[13 and13 below)13

ndash Are13 there13 other13 ways13 to13 fill13 the13 table13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

Example13

John called Sue from Denver

S

VP13 PP13

NP13 VP13

V13 NP13 NP13 P13

John called Sue from Denver

S13

PP13

NP13 VP13

NP13

NP13

V13

Example13 113

13 S(05) NP(45)

P(34) Denver

NP(23)13 V(23)

from

V(12) Sue

NP(01) called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 VP13 -shy‐gt13 VP13 PP13 NP13 -shy‐gt13 NP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

Example13 213

PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 NP13 -shy‐gt13 NP13 PPVP13 -shy‐gt13 V13 NPVP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 313

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NPNP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Mary13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 413

S(05) S(05)

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 NP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 7: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

TOP-shy‐DOWN13 DEPTH-shy‐FIRST13 13 LEFT-shy‐TO-shy‐RIGHT13

TOP-shy‐DOWN13 DEPTH-shy‐FIRST13 13 LEFT-shy‐TO-shy‐RIGHT13 (II)13

TOP-shy‐DOWN13 DEPTH-shy‐FIRST13 13 LEFT-shy‐TO-shy‐RIGHT13 (III)13

A13 T-shy‐D13 D-shy‐F13 L-shy‐R13 PARSER13

LEFT-shy‐RECURSION13

bull  A13 LEFT-shy‐RECURSIVE13 grammar13 may13 cause13 a13 T-shy‐D13 D-shy‐F13 L-shy‐R13 parser13 to13 never13 return13

bull  Examples13 of13 le[-shy‐recursive13 rules13 ndash NP13 13 NP13 PP13 ndash S13 13 S13 and13 S13

ndash But13 also13 bull  NP13 13 Det13 Nom13 bull  Det13 13 NPrsquos13

THE13 PROBLEM13 WITH13 13 LEFT-shy‐RECURSION13

Dynamic13 Programming13

bull  We13 need13 a13 method13 that13 fills13 a13 table13 with13 paral13 results13 that13 ndash Does13 not13 do13 (avoidable)13 repeated13 work13 ndash Does13 not13 fall13 prey13 to13 le[-shy‐recursion13 ndash Can13 find13 all13 the13 pieces13 of13 an13 exponenal13 number13 of13 trees13 in13 polynomial13 me13

bull  Two13 popular13 methods13 ndash CKY13 ndash Earley13

The13 CKY13 (Cocke-shy‐Kasami-shy‐Younger)13 Algorithm13

bull  Requires13 the13 grammar13 be13 in13 Chomsky13 Normal13 Form13 (CNF)13 ndash All13 rules13 must13 be13 in13 following13 form13

bull  A13 -shy‐gt13 B13 C13 bull  A13 -shy‐gt13 w13

bull  Any13 grammar13 can13 be13 converted13 automacally13 to13 Chomsky13 Normal13 Form13

Converng13 to13 CNF13

bull  Rules13 that13 mix13 terminals13 and13 non-shy‐terminals13 ndash  Introduce13 a13 new13 dummy13 non-shy‐terminal13 that13 covers13 the13 terminal13

bull  INFVP13 -shy‐gt13 to13 VP13 13 13 13 13 13 replaced13 by13

bull  INFVP13 -shy‐gt13 TO13 VP13 bull  TO13 -shy‐gt13 to13

bull  Rules13 that13 have13 a13 single13 non-shy‐terminal13 on13 right13 (ldquounit13 produconsrdquo)13 ndash  Rewrite13 each13 unit13 producon13 with13 the13 RHS13 of13 their13 expansions13

bull  Rules13 whose13 right13 hand13 side13 length13 gt213 ndash  Introduce13 dummy13 non-shy‐terminals13 that13 spread13 the13 right-shy‐hand13 side13

Sample13 Grammar13 Det13 13 |13 a13 |13 the13 Noun13 book13 |13 saw13 |13 mark13 Verb13 13 book13 |13 saw13 Proper-shy‐Noun13 13 Mark13 Aux13 Did13 |13 Has13 Prep13 to13 |13 on13 |13 near13

S13 13 NP13 VP13 S13 Aux13 NP13 VP13 S13 13 VP13 NP13 13 NP13 PP13 NP13 13 Det13 Noun13 NP13 13 PrN13 VP13 13 V13 VP13 13 V13 NP13 VP13 13 V13 NP13 PP13 PP13 13 Prep13 NP13

Automac13 Conversion13 to13 CNF13

Back13 to13 CKY13 Parsing13

bull  Given13 rules13 in13 CNF13 bull  Consider13 the13 rule13 A13 -shy‐gt13 BC13

ndash  If13 there13 is13 an13 A13 in13 the13 input13 then13 there13 must13 be13 a13 B13 followed13 by13 a13 C13 in13 the13 input13

ndash  If13 the13 A13 goes13 from13 i13 to13 j13 in13 the13 input13 then13 there13 must13 be13 some13 k13 st13 iltkltj13 bull  Ie13 The13 B13 splits13 from13 the13 C13 someplace13

CKY13

bull  So13 letrsquos13 build13 a13 table13 so13 that13 an13 A13 spanning13 from13 i13 to13 j13 in13 the13 input13 is13 placed13 in13 cell13 [ij]13 in13 the13 table13

bull  So13 a13 non-shy‐terminal13 spanning13 an13 enre13 string13 will13 sit13 in13 cell13 [013 n]13

bull  If13 we13 build13 the13 table13 bojom13 up13 wersquoll13 know13 that13 the13 parts13 of13 the13 A13 must13 go13 from13 i13 to13 k13 and13 from13 k13 to13 j13

CKY13

bull  Meaning13 that13 for13 a13 rule13 like13 A13 -shy‐gt13 B13 C13 we13 should13 look13 for13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13

bull  In13 other13 words13 if13 we13 think13 there13 might13 be13 an13 A13 spanning13 ij13 in13 the13 inputhellip13 AND13

bull  A13 -shy‐gt13 B13 C13 is13 a13 rule13 in13 the13 grammar13 THEN13 bull  There13 must13 be13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13 for13 some13 iltkltj13

bull  So13 just13 loop13 over13 the13 possible13 k13 values13

CKY13 Table13

bull Filling the [ij]th cell in the CKY table

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

CKY13 Algorithm13

Note13

bull  We13 arranged13 the13 loops13 to13 fill13 the13 table13 a13 column13 at13 a13 me13 from13 le[13 to13 right13 bojom13 to13 top13 13 ndash This13 assures13 us13 that13 whenever13 wersquore13 filling13 a13 cell13 the13 parts13 needed13 to13 fill13 it13 are13 already13 in13 the13 table13 (to13 the13 le[13 and13 below)13

ndash Are13 there13 other13 ways13 to13 fill13 the13 table13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

Example13

John called Sue from Denver

S

VP13 PP13

NP13 VP13

V13 NP13 NP13 P13

John called Sue from Denver

S13

PP13

NP13 VP13

NP13

NP13

V13

Example13 113

13 S(05) NP(45)

P(34) Denver

NP(23)13 V(23)

from

V(12) Sue

NP(01) called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 VP13 -shy‐gt13 VP13 PP13 NP13 -shy‐gt13 NP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

Example13 213

PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 NP13 -shy‐gt13 NP13 PPVP13 -shy‐gt13 V13 NPVP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 313

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NPNP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Mary13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 413

S(05) S(05)

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 NP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 8: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

TOP-shy‐DOWN13 DEPTH-shy‐FIRST13 13 LEFT-shy‐TO-shy‐RIGHT13 (II)13

TOP-shy‐DOWN13 DEPTH-shy‐FIRST13 13 LEFT-shy‐TO-shy‐RIGHT13 (III)13

A13 T-shy‐D13 D-shy‐F13 L-shy‐R13 PARSER13

LEFT-shy‐RECURSION13

bull  A13 LEFT-shy‐RECURSIVE13 grammar13 may13 cause13 a13 T-shy‐D13 D-shy‐F13 L-shy‐R13 parser13 to13 never13 return13

bull  Examples13 of13 le[-shy‐recursive13 rules13 ndash NP13 13 NP13 PP13 ndash S13 13 S13 and13 S13

ndash But13 also13 bull  NP13 13 Det13 Nom13 bull  Det13 13 NPrsquos13

THE13 PROBLEM13 WITH13 13 LEFT-shy‐RECURSION13

Dynamic13 Programming13

bull  We13 need13 a13 method13 that13 fills13 a13 table13 with13 paral13 results13 that13 ndash Does13 not13 do13 (avoidable)13 repeated13 work13 ndash Does13 not13 fall13 prey13 to13 le[-shy‐recursion13 ndash Can13 find13 all13 the13 pieces13 of13 an13 exponenal13 number13 of13 trees13 in13 polynomial13 me13

bull  Two13 popular13 methods13 ndash CKY13 ndash Earley13

The13 CKY13 (Cocke-shy‐Kasami-shy‐Younger)13 Algorithm13

bull  Requires13 the13 grammar13 be13 in13 Chomsky13 Normal13 Form13 (CNF)13 ndash All13 rules13 must13 be13 in13 following13 form13

bull  A13 -shy‐gt13 B13 C13 bull  A13 -shy‐gt13 w13

bull  Any13 grammar13 can13 be13 converted13 automacally13 to13 Chomsky13 Normal13 Form13

Converng13 to13 CNF13

bull  Rules13 that13 mix13 terminals13 and13 non-shy‐terminals13 ndash  Introduce13 a13 new13 dummy13 non-shy‐terminal13 that13 covers13 the13 terminal13

bull  INFVP13 -shy‐gt13 to13 VP13 13 13 13 13 13 replaced13 by13

bull  INFVP13 -shy‐gt13 TO13 VP13 bull  TO13 -shy‐gt13 to13

bull  Rules13 that13 have13 a13 single13 non-shy‐terminal13 on13 right13 (ldquounit13 produconsrdquo)13 ndash  Rewrite13 each13 unit13 producon13 with13 the13 RHS13 of13 their13 expansions13

bull  Rules13 whose13 right13 hand13 side13 length13 gt213 ndash  Introduce13 dummy13 non-shy‐terminals13 that13 spread13 the13 right-shy‐hand13 side13

Sample13 Grammar13 Det13 13 |13 a13 |13 the13 Noun13 book13 |13 saw13 |13 mark13 Verb13 13 book13 |13 saw13 Proper-shy‐Noun13 13 Mark13 Aux13 Did13 |13 Has13 Prep13 to13 |13 on13 |13 near13

S13 13 NP13 VP13 S13 Aux13 NP13 VP13 S13 13 VP13 NP13 13 NP13 PP13 NP13 13 Det13 Noun13 NP13 13 PrN13 VP13 13 V13 VP13 13 V13 NP13 VP13 13 V13 NP13 PP13 PP13 13 Prep13 NP13

Automac13 Conversion13 to13 CNF13

Back13 to13 CKY13 Parsing13

bull  Given13 rules13 in13 CNF13 bull  Consider13 the13 rule13 A13 -shy‐gt13 BC13

ndash  If13 there13 is13 an13 A13 in13 the13 input13 then13 there13 must13 be13 a13 B13 followed13 by13 a13 C13 in13 the13 input13

ndash  If13 the13 A13 goes13 from13 i13 to13 j13 in13 the13 input13 then13 there13 must13 be13 some13 k13 st13 iltkltj13 bull  Ie13 The13 B13 splits13 from13 the13 C13 someplace13

CKY13

bull  So13 letrsquos13 build13 a13 table13 so13 that13 an13 A13 spanning13 from13 i13 to13 j13 in13 the13 input13 is13 placed13 in13 cell13 [ij]13 in13 the13 table13

bull  So13 a13 non-shy‐terminal13 spanning13 an13 enre13 string13 will13 sit13 in13 cell13 [013 n]13

bull  If13 we13 build13 the13 table13 bojom13 up13 wersquoll13 know13 that13 the13 parts13 of13 the13 A13 must13 go13 from13 i13 to13 k13 and13 from13 k13 to13 j13

CKY13

bull  Meaning13 that13 for13 a13 rule13 like13 A13 -shy‐gt13 B13 C13 we13 should13 look13 for13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13

bull  In13 other13 words13 if13 we13 think13 there13 might13 be13 an13 A13 spanning13 ij13 in13 the13 inputhellip13 AND13

bull  A13 -shy‐gt13 B13 C13 is13 a13 rule13 in13 the13 grammar13 THEN13 bull  There13 must13 be13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13 for13 some13 iltkltj13

bull  So13 just13 loop13 over13 the13 possible13 k13 values13

CKY13 Table13

bull Filling the [ij]th cell in the CKY table

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

CKY13 Algorithm13

Note13

bull  We13 arranged13 the13 loops13 to13 fill13 the13 table13 a13 column13 at13 a13 me13 from13 le[13 to13 right13 bojom13 to13 top13 13 ndash This13 assures13 us13 that13 whenever13 wersquore13 filling13 a13 cell13 the13 parts13 needed13 to13 fill13 it13 are13 already13 in13 the13 table13 (to13 the13 le[13 and13 below)13

ndash Are13 there13 other13 ways13 to13 fill13 the13 table13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

Example13

John called Sue from Denver

S

VP13 PP13

NP13 VP13

V13 NP13 NP13 P13

John called Sue from Denver

S13

PP13

NP13 VP13

NP13

NP13

V13

Example13 113

13 S(05) NP(45)

P(34) Denver

NP(23)13 V(23)

from

V(12) Sue

NP(01) called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 VP13 -shy‐gt13 VP13 PP13 NP13 -shy‐gt13 NP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

Example13 213

PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 NP13 -shy‐gt13 NP13 PPVP13 -shy‐gt13 V13 NPVP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 313

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NPNP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Mary13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 413

S(05) S(05)

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 NP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 9: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

TOP-shy‐DOWN13 DEPTH-shy‐FIRST13 13 LEFT-shy‐TO-shy‐RIGHT13 (III)13

A13 T-shy‐D13 D-shy‐F13 L-shy‐R13 PARSER13

LEFT-shy‐RECURSION13

bull  A13 LEFT-shy‐RECURSIVE13 grammar13 may13 cause13 a13 T-shy‐D13 D-shy‐F13 L-shy‐R13 parser13 to13 never13 return13

bull  Examples13 of13 le[-shy‐recursive13 rules13 ndash NP13 13 NP13 PP13 ndash S13 13 S13 and13 S13

ndash But13 also13 bull  NP13 13 Det13 Nom13 bull  Det13 13 NPrsquos13

THE13 PROBLEM13 WITH13 13 LEFT-shy‐RECURSION13

Dynamic13 Programming13

bull  We13 need13 a13 method13 that13 fills13 a13 table13 with13 paral13 results13 that13 ndash Does13 not13 do13 (avoidable)13 repeated13 work13 ndash Does13 not13 fall13 prey13 to13 le[-shy‐recursion13 ndash Can13 find13 all13 the13 pieces13 of13 an13 exponenal13 number13 of13 trees13 in13 polynomial13 me13

bull  Two13 popular13 methods13 ndash CKY13 ndash Earley13

The13 CKY13 (Cocke-shy‐Kasami-shy‐Younger)13 Algorithm13

bull  Requires13 the13 grammar13 be13 in13 Chomsky13 Normal13 Form13 (CNF)13 ndash All13 rules13 must13 be13 in13 following13 form13

bull  A13 -shy‐gt13 B13 C13 bull  A13 -shy‐gt13 w13

bull  Any13 grammar13 can13 be13 converted13 automacally13 to13 Chomsky13 Normal13 Form13

Converng13 to13 CNF13

bull  Rules13 that13 mix13 terminals13 and13 non-shy‐terminals13 ndash  Introduce13 a13 new13 dummy13 non-shy‐terminal13 that13 covers13 the13 terminal13

bull  INFVP13 -shy‐gt13 to13 VP13 13 13 13 13 13 replaced13 by13

bull  INFVP13 -shy‐gt13 TO13 VP13 bull  TO13 -shy‐gt13 to13

bull  Rules13 that13 have13 a13 single13 non-shy‐terminal13 on13 right13 (ldquounit13 produconsrdquo)13 ndash  Rewrite13 each13 unit13 producon13 with13 the13 RHS13 of13 their13 expansions13

bull  Rules13 whose13 right13 hand13 side13 length13 gt213 ndash  Introduce13 dummy13 non-shy‐terminals13 that13 spread13 the13 right-shy‐hand13 side13

Sample13 Grammar13 Det13 13 |13 a13 |13 the13 Noun13 book13 |13 saw13 |13 mark13 Verb13 13 book13 |13 saw13 Proper-shy‐Noun13 13 Mark13 Aux13 Did13 |13 Has13 Prep13 to13 |13 on13 |13 near13

S13 13 NP13 VP13 S13 Aux13 NP13 VP13 S13 13 VP13 NP13 13 NP13 PP13 NP13 13 Det13 Noun13 NP13 13 PrN13 VP13 13 V13 VP13 13 V13 NP13 VP13 13 V13 NP13 PP13 PP13 13 Prep13 NP13

Automac13 Conversion13 to13 CNF13

Back13 to13 CKY13 Parsing13

bull  Given13 rules13 in13 CNF13 bull  Consider13 the13 rule13 A13 -shy‐gt13 BC13

ndash  If13 there13 is13 an13 A13 in13 the13 input13 then13 there13 must13 be13 a13 B13 followed13 by13 a13 C13 in13 the13 input13

ndash  If13 the13 A13 goes13 from13 i13 to13 j13 in13 the13 input13 then13 there13 must13 be13 some13 k13 st13 iltkltj13 bull  Ie13 The13 B13 splits13 from13 the13 C13 someplace13

CKY13

bull  So13 letrsquos13 build13 a13 table13 so13 that13 an13 A13 spanning13 from13 i13 to13 j13 in13 the13 input13 is13 placed13 in13 cell13 [ij]13 in13 the13 table13

bull  So13 a13 non-shy‐terminal13 spanning13 an13 enre13 string13 will13 sit13 in13 cell13 [013 n]13

bull  If13 we13 build13 the13 table13 bojom13 up13 wersquoll13 know13 that13 the13 parts13 of13 the13 A13 must13 go13 from13 i13 to13 k13 and13 from13 k13 to13 j13

CKY13

bull  Meaning13 that13 for13 a13 rule13 like13 A13 -shy‐gt13 B13 C13 we13 should13 look13 for13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13

bull  In13 other13 words13 if13 we13 think13 there13 might13 be13 an13 A13 spanning13 ij13 in13 the13 inputhellip13 AND13

bull  A13 -shy‐gt13 B13 C13 is13 a13 rule13 in13 the13 grammar13 THEN13 bull  There13 must13 be13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13 for13 some13 iltkltj13

bull  So13 just13 loop13 over13 the13 possible13 k13 values13

CKY13 Table13

bull Filling the [ij]th cell in the CKY table

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

CKY13 Algorithm13

Note13

bull  We13 arranged13 the13 loops13 to13 fill13 the13 table13 a13 column13 at13 a13 me13 from13 le[13 to13 right13 bojom13 to13 top13 13 ndash This13 assures13 us13 that13 whenever13 wersquore13 filling13 a13 cell13 the13 parts13 needed13 to13 fill13 it13 are13 already13 in13 the13 table13 (to13 the13 le[13 and13 below)13

ndash Are13 there13 other13 ways13 to13 fill13 the13 table13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

Example13

John called Sue from Denver

S

VP13 PP13

NP13 VP13

V13 NP13 NP13 P13

John called Sue from Denver

S13

PP13

NP13 VP13

NP13

NP13

V13

Example13 113

13 S(05) NP(45)

P(34) Denver

NP(23)13 V(23)

from

V(12) Sue

NP(01) called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 VP13 -shy‐gt13 VP13 PP13 NP13 -shy‐gt13 NP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

Example13 213

PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 NP13 -shy‐gt13 NP13 PPVP13 -shy‐gt13 V13 NPVP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 313

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NPNP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Mary13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 413

S(05) S(05)

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 NP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 10: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

A13 T-shy‐D13 D-shy‐F13 L-shy‐R13 PARSER13

LEFT-shy‐RECURSION13

bull  A13 LEFT-shy‐RECURSIVE13 grammar13 may13 cause13 a13 T-shy‐D13 D-shy‐F13 L-shy‐R13 parser13 to13 never13 return13

bull  Examples13 of13 le[-shy‐recursive13 rules13 ndash NP13 13 NP13 PP13 ndash S13 13 S13 and13 S13

ndash But13 also13 bull  NP13 13 Det13 Nom13 bull  Det13 13 NPrsquos13

THE13 PROBLEM13 WITH13 13 LEFT-shy‐RECURSION13

Dynamic13 Programming13

bull  We13 need13 a13 method13 that13 fills13 a13 table13 with13 paral13 results13 that13 ndash Does13 not13 do13 (avoidable)13 repeated13 work13 ndash Does13 not13 fall13 prey13 to13 le[-shy‐recursion13 ndash Can13 find13 all13 the13 pieces13 of13 an13 exponenal13 number13 of13 trees13 in13 polynomial13 me13

bull  Two13 popular13 methods13 ndash CKY13 ndash Earley13

The13 CKY13 (Cocke-shy‐Kasami-shy‐Younger)13 Algorithm13

bull  Requires13 the13 grammar13 be13 in13 Chomsky13 Normal13 Form13 (CNF)13 ndash All13 rules13 must13 be13 in13 following13 form13

bull  A13 -shy‐gt13 B13 C13 bull  A13 -shy‐gt13 w13

bull  Any13 grammar13 can13 be13 converted13 automacally13 to13 Chomsky13 Normal13 Form13

Converng13 to13 CNF13

bull  Rules13 that13 mix13 terminals13 and13 non-shy‐terminals13 ndash  Introduce13 a13 new13 dummy13 non-shy‐terminal13 that13 covers13 the13 terminal13

bull  INFVP13 -shy‐gt13 to13 VP13 13 13 13 13 13 replaced13 by13

bull  INFVP13 -shy‐gt13 TO13 VP13 bull  TO13 -shy‐gt13 to13

bull  Rules13 that13 have13 a13 single13 non-shy‐terminal13 on13 right13 (ldquounit13 produconsrdquo)13 ndash  Rewrite13 each13 unit13 producon13 with13 the13 RHS13 of13 their13 expansions13

bull  Rules13 whose13 right13 hand13 side13 length13 gt213 ndash  Introduce13 dummy13 non-shy‐terminals13 that13 spread13 the13 right-shy‐hand13 side13

Sample13 Grammar13 Det13 13 |13 a13 |13 the13 Noun13 book13 |13 saw13 |13 mark13 Verb13 13 book13 |13 saw13 Proper-shy‐Noun13 13 Mark13 Aux13 Did13 |13 Has13 Prep13 to13 |13 on13 |13 near13

S13 13 NP13 VP13 S13 Aux13 NP13 VP13 S13 13 VP13 NP13 13 NP13 PP13 NP13 13 Det13 Noun13 NP13 13 PrN13 VP13 13 V13 VP13 13 V13 NP13 VP13 13 V13 NP13 PP13 PP13 13 Prep13 NP13

Automac13 Conversion13 to13 CNF13

Back13 to13 CKY13 Parsing13

bull  Given13 rules13 in13 CNF13 bull  Consider13 the13 rule13 A13 -shy‐gt13 BC13

ndash  If13 there13 is13 an13 A13 in13 the13 input13 then13 there13 must13 be13 a13 B13 followed13 by13 a13 C13 in13 the13 input13

ndash  If13 the13 A13 goes13 from13 i13 to13 j13 in13 the13 input13 then13 there13 must13 be13 some13 k13 st13 iltkltj13 bull  Ie13 The13 B13 splits13 from13 the13 C13 someplace13

CKY13

bull  So13 letrsquos13 build13 a13 table13 so13 that13 an13 A13 spanning13 from13 i13 to13 j13 in13 the13 input13 is13 placed13 in13 cell13 [ij]13 in13 the13 table13

bull  So13 a13 non-shy‐terminal13 spanning13 an13 enre13 string13 will13 sit13 in13 cell13 [013 n]13

bull  If13 we13 build13 the13 table13 bojom13 up13 wersquoll13 know13 that13 the13 parts13 of13 the13 A13 must13 go13 from13 i13 to13 k13 and13 from13 k13 to13 j13

CKY13

bull  Meaning13 that13 for13 a13 rule13 like13 A13 -shy‐gt13 B13 C13 we13 should13 look13 for13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13

bull  In13 other13 words13 if13 we13 think13 there13 might13 be13 an13 A13 spanning13 ij13 in13 the13 inputhellip13 AND13

bull  A13 -shy‐gt13 B13 C13 is13 a13 rule13 in13 the13 grammar13 THEN13 bull  There13 must13 be13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13 for13 some13 iltkltj13

bull  So13 just13 loop13 over13 the13 possible13 k13 values13

CKY13 Table13

bull Filling the [ij]th cell in the CKY table

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

CKY13 Algorithm13

Note13

bull  We13 arranged13 the13 loops13 to13 fill13 the13 table13 a13 column13 at13 a13 me13 from13 le[13 to13 right13 bojom13 to13 top13 13 ndash This13 assures13 us13 that13 whenever13 wersquore13 filling13 a13 cell13 the13 parts13 needed13 to13 fill13 it13 are13 already13 in13 the13 table13 (to13 the13 le[13 and13 below)13

ndash Are13 there13 other13 ways13 to13 fill13 the13 table13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

Example13

John called Sue from Denver

S

VP13 PP13

NP13 VP13

V13 NP13 NP13 P13

John called Sue from Denver

S13

PP13

NP13 VP13

NP13

NP13

V13

Example13 113

13 S(05) NP(45)

P(34) Denver

NP(23)13 V(23)

from

V(12) Sue

NP(01) called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 VP13 -shy‐gt13 VP13 PP13 NP13 -shy‐gt13 NP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

Example13 213

PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 NP13 -shy‐gt13 NP13 PPVP13 -shy‐gt13 V13 NPVP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 313

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NPNP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Mary13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 413

S(05) S(05)

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 NP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 11: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

LEFT-shy‐RECURSION13

bull  A13 LEFT-shy‐RECURSIVE13 grammar13 may13 cause13 a13 T-shy‐D13 D-shy‐F13 L-shy‐R13 parser13 to13 never13 return13

bull  Examples13 of13 le[-shy‐recursive13 rules13 ndash NP13 13 NP13 PP13 ndash S13 13 S13 and13 S13

ndash But13 also13 bull  NP13 13 Det13 Nom13 bull  Det13 13 NPrsquos13

THE13 PROBLEM13 WITH13 13 LEFT-shy‐RECURSION13

Dynamic13 Programming13

bull  We13 need13 a13 method13 that13 fills13 a13 table13 with13 paral13 results13 that13 ndash Does13 not13 do13 (avoidable)13 repeated13 work13 ndash Does13 not13 fall13 prey13 to13 le[-shy‐recursion13 ndash Can13 find13 all13 the13 pieces13 of13 an13 exponenal13 number13 of13 trees13 in13 polynomial13 me13

bull  Two13 popular13 methods13 ndash CKY13 ndash Earley13

The13 CKY13 (Cocke-shy‐Kasami-shy‐Younger)13 Algorithm13

bull  Requires13 the13 grammar13 be13 in13 Chomsky13 Normal13 Form13 (CNF)13 ndash All13 rules13 must13 be13 in13 following13 form13

bull  A13 -shy‐gt13 B13 C13 bull  A13 -shy‐gt13 w13

bull  Any13 grammar13 can13 be13 converted13 automacally13 to13 Chomsky13 Normal13 Form13

Converng13 to13 CNF13

bull  Rules13 that13 mix13 terminals13 and13 non-shy‐terminals13 ndash  Introduce13 a13 new13 dummy13 non-shy‐terminal13 that13 covers13 the13 terminal13

bull  INFVP13 -shy‐gt13 to13 VP13 13 13 13 13 13 replaced13 by13

bull  INFVP13 -shy‐gt13 TO13 VP13 bull  TO13 -shy‐gt13 to13

bull  Rules13 that13 have13 a13 single13 non-shy‐terminal13 on13 right13 (ldquounit13 produconsrdquo)13 ndash  Rewrite13 each13 unit13 producon13 with13 the13 RHS13 of13 their13 expansions13

bull  Rules13 whose13 right13 hand13 side13 length13 gt213 ndash  Introduce13 dummy13 non-shy‐terminals13 that13 spread13 the13 right-shy‐hand13 side13

Sample13 Grammar13 Det13 13 |13 a13 |13 the13 Noun13 book13 |13 saw13 |13 mark13 Verb13 13 book13 |13 saw13 Proper-shy‐Noun13 13 Mark13 Aux13 Did13 |13 Has13 Prep13 to13 |13 on13 |13 near13

S13 13 NP13 VP13 S13 Aux13 NP13 VP13 S13 13 VP13 NP13 13 NP13 PP13 NP13 13 Det13 Noun13 NP13 13 PrN13 VP13 13 V13 VP13 13 V13 NP13 VP13 13 V13 NP13 PP13 PP13 13 Prep13 NP13

Automac13 Conversion13 to13 CNF13

Back13 to13 CKY13 Parsing13

bull  Given13 rules13 in13 CNF13 bull  Consider13 the13 rule13 A13 -shy‐gt13 BC13

ndash  If13 there13 is13 an13 A13 in13 the13 input13 then13 there13 must13 be13 a13 B13 followed13 by13 a13 C13 in13 the13 input13

ndash  If13 the13 A13 goes13 from13 i13 to13 j13 in13 the13 input13 then13 there13 must13 be13 some13 k13 st13 iltkltj13 bull  Ie13 The13 B13 splits13 from13 the13 C13 someplace13

CKY13

bull  So13 letrsquos13 build13 a13 table13 so13 that13 an13 A13 spanning13 from13 i13 to13 j13 in13 the13 input13 is13 placed13 in13 cell13 [ij]13 in13 the13 table13

bull  So13 a13 non-shy‐terminal13 spanning13 an13 enre13 string13 will13 sit13 in13 cell13 [013 n]13

bull  If13 we13 build13 the13 table13 bojom13 up13 wersquoll13 know13 that13 the13 parts13 of13 the13 A13 must13 go13 from13 i13 to13 k13 and13 from13 k13 to13 j13

CKY13

bull  Meaning13 that13 for13 a13 rule13 like13 A13 -shy‐gt13 B13 C13 we13 should13 look13 for13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13

bull  In13 other13 words13 if13 we13 think13 there13 might13 be13 an13 A13 spanning13 ij13 in13 the13 inputhellip13 AND13

bull  A13 -shy‐gt13 B13 C13 is13 a13 rule13 in13 the13 grammar13 THEN13 bull  There13 must13 be13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13 for13 some13 iltkltj13

bull  So13 just13 loop13 over13 the13 possible13 k13 values13

CKY13 Table13

bull Filling the [ij]th cell in the CKY table

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

CKY13 Algorithm13

Note13

bull  We13 arranged13 the13 loops13 to13 fill13 the13 table13 a13 column13 at13 a13 me13 from13 le[13 to13 right13 bojom13 to13 top13 13 ndash This13 assures13 us13 that13 whenever13 wersquore13 filling13 a13 cell13 the13 parts13 needed13 to13 fill13 it13 are13 already13 in13 the13 table13 (to13 the13 le[13 and13 below)13

ndash Are13 there13 other13 ways13 to13 fill13 the13 table13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

Example13

John called Sue from Denver

S

VP13 PP13

NP13 VP13

V13 NP13 NP13 P13

John called Sue from Denver

S13

PP13

NP13 VP13

NP13

NP13

V13

Example13 113

13 S(05) NP(45)

P(34) Denver

NP(23)13 V(23)

from

V(12) Sue

NP(01) called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 VP13 -shy‐gt13 VP13 PP13 NP13 -shy‐gt13 NP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

Example13 213

PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 NP13 -shy‐gt13 NP13 PPVP13 -shy‐gt13 V13 NPVP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 313

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NPNP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Mary13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 413

S(05) S(05)

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 NP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 12: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

THE13 PROBLEM13 WITH13 13 LEFT-shy‐RECURSION13

Dynamic13 Programming13

bull  We13 need13 a13 method13 that13 fills13 a13 table13 with13 paral13 results13 that13 ndash Does13 not13 do13 (avoidable)13 repeated13 work13 ndash Does13 not13 fall13 prey13 to13 le[-shy‐recursion13 ndash Can13 find13 all13 the13 pieces13 of13 an13 exponenal13 number13 of13 trees13 in13 polynomial13 me13

bull  Two13 popular13 methods13 ndash CKY13 ndash Earley13

The13 CKY13 (Cocke-shy‐Kasami-shy‐Younger)13 Algorithm13

bull  Requires13 the13 grammar13 be13 in13 Chomsky13 Normal13 Form13 (CNF)13 ndash All13 rules13 must13 be13 in13 following13 form13

bull  A13 -shy‐gt13 B13 C13 bull  A13 -shy‐gt13 w13

bull  Any13 grammar13 can13 be13 converted13 automacally13 to13 Chomsky13 Normal13 Form13

Converng13 to13 CNF13

bull  Rules13 that13 mix13 terminals13 and13 non-shy‐terminals13 ndash  Introduce13 a13 new13 dummy13 non-shy‐terminal13 that13 covers13 the13 terminal13

bull  INFVP13 -shy‐gt13 to13 VP13 13 13 13 13 13 replaced13 by13

bull  INFVP13 -shy‐gt13 TO13 VP13 bull  TO13 -shy‐gt13 to13

bull  Rules13 that13 have13 a13 single13 non-shy‐terminal13 on13 right13 (ldquounit13 produconsrdquo)13 ndash  Rewrite13 each13 unit13 producon13 with13 the13 RHS13 of13 their13 expansions13

bull  Rules13 whose13 right13 hand13 side13 length13 gt213 ndash  Introduce13 dummy13 non-shy‐terminals13 that13 spread13 the13 right-shy‐hand13 side13

Sample13 Grammar13 Det13 13 |13 a13 |13 the13 Noun13 book13 |13 saw13 |13 mark13 Verb13 13 book13 |13 saw13 Proper-shy‐Noun13 13 Mark13 Aux13 Did13 |13 Has13 Prep13 to13 |13 on13 |13 near13

S13 13 NP13 VP13 S13 Aux13 NP13 VP13 S13 13 VP13 NP13 13 NP13 PP13 NP13 13 Det13 Noun13 NP13 13 PrN13 VP13 13 V13 VP13 13 V13 NP13 VP13 13 V13 NP13 PP13 PP13 13 Prep13 NP13

Automac13 Conversion13 to13 CNF13

Back13 to13 CKY13 Parsing13

bull  Given13 rules13 in13 CNF13 bull  Consider13 the13 rule13 A13 -shy‐gt13 BC13

ndash  If13 there13 is13 an13 A13 in13 the13 input13 then13 there13 must13 be13 a13 B13 followed13 by13 a13 C13 in13 the13 input13

ndash  If13 the13 A13 goes13 from13 i13 to13 j13 in13 the13 input13 then13 there13 must13 be13 some13 k13 st13 iltkltj13 bull  Ie13 The13 B13 splits13 from13 the13 C13 someplace13

CKY13

bull  So13 letrsquos13 build13 a13 table13 so13 that13 an13 A13 spanning13 from13 i13 to13 j13 in13 the13 input13 is13 placed13 in13 cell13 [ij]13 in13 the13 table13

bull  So13 a13 non-shy‐terminal13 spanning13 an13 enre13 string13 will13 sit13 in13 cell13 [013 n]13

bull  If13 we13 build13 the13 table13 bojom13 up13 wersquoll13 know13 that13 the13 parts13 of13 the13 A13 must13 go13 from13 i13 to13 k13 and13 from13 k13 to13 j13

CKY13

bull  Meaning13 that13 for13 a13 rule13 like13 A13 -shy‐gt13 B13 C13 we13 should13 look13 for13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13

bull  In13 other13 words13 if13 we13 think13 there13 might13 be13 an13 A13 spanning13 ij13 in13 the13 inputhellip13 AND13

bull  A13 -shy‐gt13 B13 C13 is13 a13 rule13 in13 the13 grammar13 THEN13 bull  There13 must13 be13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13 for13 some13 iltkltj13

bull  So13 just13 loop13 over13 the13 possible13 k13 values13

CKY13 Table13

bull Filling the [ij]th cell in the CKY table

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

CKY13 Algorithm13

Note13

bull  We13 arranged13 the13 loops13 to13 fill13 the13 table13 a13 column13 at13 a13 me13 from13 le[13 to13 right13 bojom13 to13 top13 13 ndash This13 assures13 us13 that13 whenever13 wersquore13 filling13 a13 cell13 the13 parts13 needed13 to13 fill13 it13 are13 already13 in13 the13 table13 (to13 the13 le[13 and13 below)13

ndash Are13 there13 other13 ways13 to13 fill13 the13 table13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

Example13

John called Sue from Denver

S

VP13 PP13

NP13 VP13

V13 NP13 NP13 P13

John called Sue from Denver

S13

PP13

NP13 VP13

NP13

NP13

V13

Example13 113

13 S(05) NP(45)

P(34) Denver

NP(23)13 V(23)

from

V(12) Sue

NP(01) called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 VP13 -shy‐gt13 VP13 PP13 NP13 -shy‐gt13 NP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

Example13 213

PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 NP13 -shy‐gt13 NP13 PPVP13 -shy‐gt13 V13 NPVP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 313

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NPNP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Mary13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 413

S(05) S(05)

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 NP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 13: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Dynamic13 Programming13

bull  We13 need13 a13 method13 that13 fills13 a13 table13 with13 paral13 results13 that13 ndash Does13 not13 do13 (avoidable)13 repeated13 work13 ndash Does13 not13 fall13 prey13 to13 le[-shy‐recursion13 ndash Can13 find13 all13 the13 pieces13 of13 an13 exponenal13 number13 of13 trees13 in13 polynomial13 me13

bull  Two13 popular13 methods13 ndash CKY13 ndash Earley13

The13 CKY13 (Cocke-shy‐Kasami-shy‐Younger)13 Algorithm13

bull  Requires13 the13 grammar13 be13 in13 Chomsky13 Normal13 Form13 (CNF)13 ndash All13 rules13 must13 be13 in13 following13 form13

bull  A13 -shy‐gt13 B13 C13 bull  A13 -shy‐gt13 w13

bull  Any13 grammar13 can13 be13 converted13 automacally13 to13 Chomsky13 Normal13 Form13

Converng13 to13 CNF13

bull  Rules13 that13 mix13 terminals13 and13 non-shy‐terminals13 ndash  Introduce13 a13 new13 dummy13 non-shy‐terminal13 that13 covers13 the13 terminal13

bull  INFVP13 -shy‐gt13 to13 VP13 13 13 13 13 13 replaced13 by13

bull  INFVP13 -shy‐gt13 TO13 VP13 bull  TO13 -shy‐gt13 to13

bull  Rules13 that13 have13 a13 single13 non-shy‐terminal13 on13 right13 (ldquounit13 produconsrdquo)13 ndash  Rewrite13 each13 unit13 producon13 with13 the13 RHS13 of13 their13 expansions13

bull  Rules13 whose13 right13 hand13 side13 length13 gt213 ndash  Introduce13 dummy13 non-shy‐terminals13 that13 spread13 the13 right-shy‐hand13 side13

Sample13 Grammar13 Det13 13 |13 a13 |13 the13 Noun13 book13 |13 saw13 |13 mark13 Verb13 13 book13 |13 saw13 Proper-shy‐Noun13 13 Mark13 Aux13 Did13 |13 Has13 Prep13 to13 |13 on13 |13 near13

S13 13 NP13 VP13 S13 Aux13 NP13 VP13 S13 13 VP13 NP13 13 NP13 PP13 NP13 13 Det13 Noun13 NP13 13 PrN13 VP13 13 V13 VP13 13 V13 NP13 VP13 13 V13 NP13 PP13 PP13 13 Prep13 NP13

Automac13 Conversion13 to13 CNF13

Back13 to13 CKY13 Parsing13

bull  Given13 rules13 in13 CNF13 bull  Consider13 the13 rule13 A13 -shy‐gt13 BC13

ndash  If13 there13 is13 an13 A13 in13 the13 input13 then13 there13 must13 be13 a13 B13 followed13 by13 a13 C13 in13 the13 input13

ndash  If13 the13 A13 goes13 from13 i13 to13 j13 in13 the13 input13 then13 there13 must13 be13 some13 k13 st13 iltkltj13 bull  Ie13 The13 B13 splits13 from13 the13 C13 someplace13

CKY13

bull  So13 letrsquos13 build13 a13 table13 so13 that13 an13 A13 spanning13 from13 i13 to13 j13 in13 the13 input13 is13 placed13 in13 cell13 [ij]13 in13 the13 table13

bull  So13 a13 non-shy‐terminal13 spanning13 an13 enre13 string13 will13 sit13 in13 cell13 [013 n]13

bull  If13 we13 build13 the13 table13 bojom13 up13 wersquoll13 know13 that13 the13 parts13 of13 the13 A13 must13 go13 from13 i13 to13 k13 and13 from13 k13 to13 j13

CKY13

bull  Meaning13 that13 for13 a13 rule13 like13 A13 -shy‐gt13 B13 C13 we13 should13 look13 for13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13

bull  In13 other13 words13 if13 we13 think13 there13 might13 be13 an13 A13 spanning13 ij13 in13 the13 inputhellip13 AND13

bull  A13 -shy‐gt13 B13 C13 is13 a13 rule13 in13 the13 grammar13 THEN13 bull  There13 must13 be13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13 for13 some13 iltkltj13

bull  So13 just13 loop13 over13 the13 possible13 k13 values13

CKY13 Table13

bull Filling the [ij]th cell in the CKY table

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

CKY13 Algorithm13

Note13

bull  We13 arranged13 the13 loops13 to13 fill13 the13 table13 a13 column13 at13 a13 me13 from13 le[13 to13 right13 bojom13 to13 top13 13 ndash This13 assures13 us13 that13 whenever13 wersquore13 filling13 a13 cell13 the13 parts13 needed13 to13 fill13 it13 are13 already13 in13 the13 table13 (to13 the13 le[13 and13 below)13

ndash Are13 there13 other13 ways13 to13 fill13 the13 table13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

Example13

John called Sue from Denver

S

VP13 PP13

NP13 VP13

V13 NP13 NP13 P13

John called Sue from Denver

S13

PP13

NP13 VP13

NP13

NP13

V13

Example13 113

13 S(05) NP(45)

P(34) Denver

NP(23)13 V(23)

from

V(12) Sue

NP(01) called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 VP13 -shy‐gt13 VP13 PP13 NP13 -shy‐gt13 NP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

Example13 213

PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 NP13 -shy‐gt13 NP13 PPVP13 -shy‐gt13 V13 NPVP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 313

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NPNP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Mary13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 413

S(05) S(05)

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 NP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 14: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

The13 CKY13 (Cocke-shy‐Kasami-shy‐Younger)13 Algorithm13

bull  Requires13 the13 grammar13 be13 in13 Chomsky13 Normal13 Form13 (CNF)13 ndash All13 rules13 must13 be13 in13 following13 form13

bull  A13 -shy‐gt13 B13 C13 bull  A13 -shy‐gt13 w13

bull  Any13 grammar13 can13 be13 converted13 automacally13 to13 Chomsky13 Normal13 Form13

Converng13 to13 CNF13

bull  Rules13 that13 mix13 terminals13 and13 non-shy‐terminals13 ndash  Introduce13 a13 new13 dummy13 non-shy‐terminal13 that13 covers13 the13 terminal13

bull  INFVP13 -shy‐gt13 to13 VP13 13 13 13 13 13 replaced13 by13

bull  INFVP13 -shy‐gt13 TO13 VP13 bull  TO13 -shy‐gt13 to13

bull  Rules13 that13 have13 a13 single13 non-shy‐terminal13 on13 right13 (ldquounit13 produconsrdquo)13 ndash  Rewrite13 each13 unit13 producon13 with13 the13 RHS13 of13 their13 expansions13

bull  Rules13 whose13 right13 hand13 side13 length13 gt213 ndash  Introduce13 dummy13 non-shy‐terminals13 that13 spread13 the13 right-shy‐hand13 side13

Sample13 Grammar13 Det13 13 |13 a13 |13 the13 Noun13 book13 |13 saw13 |13 mark13 Verb13 13 book13 |13 saw13 Proper-shy‐Noun13 13 Mark13 Aux13 Did13 |13 Has13 Prep13 to13 |13 on13 |13 near13

S13 13 NP13 VP13 S13 Aux13 NP13 VP13 S13 13 VP13 NP13 13 NP13 PP13 NP13 13 Det13 Noun13 NP13 13 PrN13 VP13 13 V13 VP13 13 V13 NP13 VP13 13 V13 NP13 PP13 PP13 13 Prep13 NP13

Automac13 Conversion13 to13 CNF13

Back13 to13 CKY13 Parsing13

bull  Given13 rules13 in13 CNF13 bull  Consider13 the13 rule13 A13 -shy‐gt13 BC13

ndash  If13 there13 is13 an13 A13 in13 the13 input13 then13 there13 must13 be13 a13 B13 followed13 by13 a13 C13 in13 the13 input13

ndash  If13 the13 A13 goes13 from13 i13 to13 j13 in13 the13 input13 then13 there13 must13 be13 some13 k13 st13 iltkltj13 bull  Ie13 The13 B13 splits13 from13 the13 C13 someplace13

CKY13

bull  So13 letrsquos13 build13 a13 table13 so13 that13 an13 A13 spanning13 from13 i13 to13 j13 in13 the13 input13 is13 placed13 in13 cell13 [ij]13 in13 the13 table13

bull  So13 a13 non-shy‐terminal13 spanning13 an13 enre13 string13 will13 sit13 in13 cell13 [013 n]13

bull  If13 we13 build13 the13 table13 bojom13 up13 wersquoll13 know13 that13 the13 parts13 of13 the13 A13 must13 go13 from13 i13 to13 k13 and13 from13 k13 to13 j13

CKY13

bull  Meaning13 that13 for13 a13 rule13 like13 A13 -shy‐gt13 B13 C13 we13 should13 look13 for13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13

bull  In13 other13 words13 if13 we13 think13 there13 might13 be13 an13 A13 spanning13 ij13 in13 the13 inputhellip13 AND13

bull  A13 -shy‐gt13 B13 C13 is13 a13 rule13 in13 the13 grammar13 THEN13 bull  There13 must13 be13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13 for13 some13 iltkltj13

bull  So13 just13 loop13 over13 the13 possible13 k13 values13

CKY13 Table13

bull Filling the [ij]th cell in the CKY table

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

CKY13 Algorithm13

Note13

bull  We13 arranged13 the13 loops13 to13 fill13 the13 table13 a13 column13 at13 a13 me13 from13 le[13 to13 right13 bojom13 to13 top13 13 ndash This13 assures13 us13 that13 whenever13 wersquore13 filling13 a13 cell13 the13 parts13 needed13 to13 fill13 it13 are13 already13 in13 the13 table13 (to13 the13 le[13 and13 below)13

ndash Are13 there13 other13 ways13 to13 fill13 the13 table13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

Example13

John called Sue from Denver

S

VP13 PP13

NP13 VP13

V13 NP13 NP13 P13

John called Sue from Denver

S13

PP13

NP13 VP13

NP13

NP13

V13

Example13 113

13 S(05) NP(45)

P(34) Denver

NP(23)13 V(23)

from

V(12) Sue

NP(01) called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 VP13 -shy‐gt13 VP13 PP13 NP13 -shy‐gt13 NP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

Example13 213

PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 NP13 -shy‐gt13 NP13 PPVP13 -shy‐gt13 V13 NPVP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 313

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NPNP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Mary13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 413

S(05) S(05)

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 NP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 15: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Converng13 to13 CNF13

bull  Rules13 that13 mix13 terminals13 and13 non-shy‐terminals13 ndash  Introduce13 a13 new13 dummy13 non-shy‐terminal13 that13 covers13 the13 terminal13

bull  INFVP13 -shy‐gt13 to13 VP13 13 13 13 13 13 replaced13 by13

bull  INFVP13 -shy‐gt13 TO13 VP13 bull  TO13 -shy‐gt13 to13

bull  Rules13 that13 have13 a13 single13 non-shy‐terminal13 on13 right13 (ldquounit13 produconsrdquo)13 ndash  Rewrite13 each13 unit13 producon13 with13 the13 RHS13 of13 their13 expansions13

bull  Rules13 whose13 right13 hand13 side13 length13 gt213 ndash  Introduce13 dummy13 non-shy‐terminals13 that13 spread13 the13 right-shy‐hand13 side13

Sample13 Grammar13 Det13 13 |13 a13 |13 the13 Noun13 book13 |13 saw13 |13 mark13 Verb13 13 book13 |13 saw13 Proper-shy‐Noun13 13 Mark13 Aux13 Did13 |13 Has13 Prep13 to13 |13 on13 |13 near13

S13 13 NP13 VP13 S13 Aux13 NP13 VP13 S13 13 VP13 NP13 13 NP13 PP13 NP13 13 Det13 Noun13 NP13 13 PrN13 VP13 13 V13 VP13 13 V13 NP13 VP13 13 V13 NP13 PP13 PP13 13 Prep13 NP13

Automac13 Conversion13 to13 CNF13

Back13 to13 CKY13 Parsing13

bull  Given13 rules13 in13 CNF13 bull  Consider13 the13 rule13 A13 -shy‐gt13 BC13

ndash  If13 there13 is13 an13 A13 in13 the13 input13 then13 there13 must13 be13 a13 B13 followed13 by13 a13 C13 in13 the13 input13

ndash  If13 the13 A13 goes13 from13 i13 to13 j13 in13 the13 input13 then13 there13 must13 be13 some13 k13 st13 iltkltj13 bull  Ie13 The13 B13 splits13 from13 the13 C13 someplace13

CKY13

bull  So13 letrsquos13 build13 a13 table13 so13 that13 an13 A13 spanning13 from13 i13 to13 j13 in13 the13 input13 is13 placed13 in13 cell13 [ij]13 in13 the13 table13

bull  So13 a13 non-shy‐terminal13 spanning13 an13 enre13 string13 will13 sit13 in13 cell13 [013 n]13

bull  If13 we13 build13 the13 table13 bojom13 up13 wersquoll13 know13 that13 the13 parts13 of13 the13 A13 must13 go13 from13 i13 to13 k13 and13 from13 k13 to13 j13

CKY13

bull  Meaning13 that13 for13 a13 rule13 like13 A13 -shy‐gt13 B13 C13 we13 should13 look13 for13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13

bull  In13 other13 words13 if13 we13 think13 there13 might13 be13 an13 A13 spanning13 ij13 in13 the13 inputhellip13 AND13

bull  A13 -shy‐gt13 B13 C13 is13 a13 rule13 in13 the13 grammar13 THEN13 bull  There13 must13 be13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13 for13 some13 iltkltj13

bull  So13 just13 loop13 over13 the13 possible13 k13 values13

CKY13 Table13

bull Filling the [ij]th cell in the CKY table

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

CKY13 Algorithm13

Note13

bull  We13 arranged13 the13 loops13 to13 fill13 the13 table13 a13 column13 at13 a13 me13 from13 le[13 to13 right13 bojom13 to13 top13 13 ndash This13 assures13 us13 that13 whenever13 wersquore13 filling13 a13 cell13 the13 parts13 needed13 to13 fill13 it13 are13 already13 in13 the13 table13 (to13 the13 le[13 and13 below)13

ndash Are13 there13 other13 ways13 to13 fill13 the13 table13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

Example13

John called Sue from Denver

S

VP13 PP13

NP13 VP13

V13 NP13 NP13 P13

John called Sue from Denver

S13

PP13

NP13 VP13

NP13

NP13

V13

Example13 113

13 S(05) NP(45)

P(34) Denver

NP(23)13 V(23)

from

V(12) Sue

NP(01) called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 VP13 -shy‐gt13 VP13 PP13 NP13 -shy‐gt13 NP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

Example13 213

PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 NP13 -shy‐gt13 NP13 PPVP13 -shy‐gt13 V13 NPVP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 313

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NPNP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Mary13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 413

S(05) S(05)

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 NP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 16: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Sample13 Grammar13 Det13 13 |13 a13 |13 the13 Noun13 book13 |13 saw13 |13 mark13 Verb13 13 book13 |13 saw13 Proper-shy‐Noun13 13 Mark13 Aux13 Did13 |13 Has13 Prep13 to13 |13 on13 |13 near13

S13 13 NP13 VP13 S13 Aux13 NP13 VP13 S13 13 VP13 NP13 13 NP13 PP13 NP13 13 Det13 Noun13 NP13 13 PrN13 VP13 13 V13 VP13 13 V13 NP13 VP13 13 V13 NP13 PP13 PP13 13 Prep13 NP13

Automac13 Conversion13 to13 CNF13

Back13 to13 CKY13 Parsing13

bull  Given13 rules13 in13 CNF13 bull  Consider13 the13 rule13 A13 -shy‐gt13 BC13

ndash  If13 there13 is13 an13 A13 in13 the13 input13 then13 there13 must13 be13 a13 B13 followed13 by13 a13 C13 in13 the13 input13

ndash  If13 the13 A13 goes13 from13 i13 to13 j13 in13 the13 input13 then13 there13 must13 be13 some13 k13 st13 iltkltj13 bull  Ie13 The13 B13 splits13 from13 the13 C13 someplace13

CKY13

bull  So13 letrsquos13 build13 a13 table13 so13 that13 an13 A13 spanning13 from13 i13 to13 j13 in13 the13 input13 is13 placed13 in13 cell13 [ij]13 in13 the13 table13

bull  So13 a13 non-shy‐terminal13 spanning13 an13 enre13 string13 will13 sit13 in13 cell13 [013 n]13

bull  If13 we13 build13 the13 table13 bojom13 up13 wersquoll13 know13 that13 the13 parts13 of13 the13 A13 must13 go13 from13 i13 to13 k13 and13 from13 k13 to13 j13

CKY13

bull  Meaning13 that13 for13 a13 rule13 like13 A13 -shy‐gt13 B13 C13 we13 should13 look13 for13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13

bull  In13 other13 words13 if13 we13 think13 there13 might13 be13 an13 A13 spanning13 ij13 in13 the13 inputhellip13 AND13

bull  A13 -shy‐gt13 B13 C13 is13 a13 rule13 in13 the13 grammar13 THEN13 bull  There13 must13 be13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13 for13 some13 iltkltj13

bull  So13 just13 loop13 over13 the13 possible13 k13 values13

CKY13 Table13

bull Filling the [ij]th cell in the CKY table

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

CKY13 Algorithm13

Note13

bull  We13 arranged13 the13 loops13 to13 fill13 the13 table13 a13 column13 at13 a13 me13 from13 le[13 to13 right13 bojom13 to13 top13 13 ndash This13 assures13 us13 that13 whenever13 wersquore13 filling13 a13 cell13 the13 parts13 needed13 to13 fill13 it13 are13 already13 in13 the13 table13 (to13 the13 le[13 and13 below)13

ndash Are13 there13 other13 ways13 to13 fill13 the13 table13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

Example13

John called Sue from Denver

S

VP13 PP13

NP13 VP13

V13 NP13 NP13 P13

John called Sue from Denver

S13

PP13

NP13 VP13

NP13

NP13

V13

Example13 113

13 S(05) NP(45)

P(34) Denver

NP(23)13 V(23)

from

V(12) Sue

NP(01) called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 VP13 -shy‐gt13 VP13 PP13 NP13 -shy‐gt13 NP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

Example13 213

PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 NP13 -shy‐gt13 NP13 PPVP13 -shy‐gt13 V13 NPVP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 313

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NPNP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Mary13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 413

S(05) S(05)

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 NP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 17: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Automac13 Conversion13 to13 CNF13

Back13 to13 CKY13 Parsing13

bull  Given13 rules13 in13 CNF13 bull  Consider13 the13 rule13 A13 -shy‐gt13 BC13

ndash  If13 there13 is13 an13 A13 in13 the13 input13 then13 there13 must13 be13 a13 B13 followed13 by13 a13 C13 in13 the13 input13

ndash  If13 the13 A13 goes13 from13 i13 to13 j13 in13 the13 input13 then13 there13 must13 be13 some13 k13 st13 iltkltj13 bull  Ie13 The13 B13 splits13 from13 the13 C13 someplace13

CKY13

bull  So13 letrsquos13 build13 a13 table13 so13 that13 an13 A13 spanning13 from13 i13 to13 j13 in13 the13 input13 is13 placed13 in13 cell13 [ij]13 in13 the13 table13

bull  So13 a13 non-shy‐terminal13 spanning13 an13 enre13 string13 will13 sit13 in13 cell13 [013 n]13

bull  If13 we13 build13 the13 table13 bojom13 up13 wersquoll13 know13 that13 the13 parts13 of13 the13 A13 must13 go13 from13 i13 to13 k13 and13 from13 k13 to13 j13

CKY13

bull  Meaning13 that13 for13 a13 rule13 like13 A13 -shy‐gt13 B13 C13 we13 should13 look13 for13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13

bull  In13 other13 words13 if13 we13 think13 there13 might13 be13 an13 A13 spanning13 ij13 in13 the13 inputhellip13 AND13

bull  A13 -shy‐gt13 B13 C13 is13 a13 rule13 in13 the13 grammar13 THEN13 bull  There13 must13 be13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13 for13 some13 iltkltj13

bull  So13 just13 loop13 over13 the13 possible13 k13 values13

CKY13 Table13

bull Filling the [ij]th cell in the CKY table

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

CKY13 Algorithm13

Note13

bull  We13 arranged13 the13 loops13 to13 fill13 the13 table13 a13 column13 at13 a13 me13 from13 le[13 to13 right13 bojom13 to13 top13 13 ndash This13 assures13 us13 that13 whenever13 wersquore13 filling13 a13 cell13 the13 parts13 needed13 to13 fill13 it13 are13 already13 in13 the13 table13 (to13 the13 le[13 and13 below)13

ndash Are13 there13 other13 ways13 to13 fill13 the13 table13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

Example13

John called Sue from Denver

S

VP13 PP13

NP13 VP13

V13 NP13 NP13 P13

John called Sue from Denver

S13

PP13

NP13 VP13

NP13

NP13

V13

Example13 113

13 S(05) NP(45)

P(34) Denver

NP(23)13 V(23)

from

V(12) Sue

NP(01) called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 VP13 -shy‐gt13 VP13 PP13 NP13 -shy‐gt13 NP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

Example13 213

PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 NP13 -shy‐gt13 NP13 PPVP13 -shy‐gt13 V13 NPVP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 313

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NPNP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Mary13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 413

S(05) S(05)

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 NP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 18: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Back13 to13 CKY13 Parsing13

bull  Given13 rules13 in13 CNF13 bull  Consider13 the13 rule13 A13 -shy‐gt13 BC13

ndash  If13 there13 is13 an13 A13 in13 the13 input13 then13 there13 must13 be13 a13 B13 followed13 by13 a13 C13 in13 the13 input13

ndash  If13 the13 A13 goes13 from13 i13 to13 j13 in13 the13 input13 then13 there13 must13 be13 some13 k13 st13 iltkltj13 bull  Ie13 The13 B13 splits13 from13 the13 C13 someplace13

CKY13

bull  So13 letrsquos13 build13 a13 table13 so13 that13 an13 A13 spanning13 from13 i13 to13 j13 in13 the13 input13 is13 placed13 in13 cell13 [ij]13 in13 the13 table13

bull  So13 a13 non-shy‐terminal13 spanning13 an13 enre13 string13 will13 sit13 in13 cell13 [013 n]13

bull  If13 we13 build13 the13 table13 bojom13 up13 wersquoll13 know13 that13 the13 parts13 of13 the13 A13 must13 go13 from13 i13 to13 k13 and13 from13 k13 to13 j13

CKY13

bull  Meaning13 that13 for13 a13 rule13 like13 A13 -shy‐gt13 B13 C13 we13 should13 look13 for13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13

bull  In13 other13 words13 if13 we13 think13 there13 might13 be13 an13 A13 spanning13 ij13 in13 the13 inputhellip13 AND13

bull  A13 -shy‐gt13 B13 C13 is13 a13 rule13 in13 the13 grammar13 THEN13 bull  There13 must13 be13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13 for13 some13 iltkltj13

bull  So13 just13 loop13 over13 the13 possible13 k13 values13

CKY13 Table13

bull Filling the [ij]th cell in the CKY table

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

CKY13 Algorithm13

Note13

bull  We13 arranged13 the13 loops13 to13 fill13 the13 table13 a13 column13 at13 a13 me13 from13 le[13 to13 right13 bojom13 to13 top13 13 ndash This13 assures13 us13 that13 whenever13 wersquore13 filling13 a13 cell13 the13 parts13 needed13 to13 fill13 it13 are13 already13 in13 the13 table13 (to13 the13 le[13 and13 below)13

ndash Are13 there13 other13 ways13 to13 fill13 the13 table13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

Example13

John called Sue from Denver

S

VP13 PP13

NP13 VP13

V13 NP13 NP13 P13

John called Sue from Denver

S13

PP13

NP13 VP13

NP13

NP13

V13

Example13 113

13 S(05) NP(45)

P(34) Denver

NP(23)13 V(23)

from

V(12) Sue

NP(01) called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 VP13 -shy‐gt13 VP13 PP13 NP13 -shy‐gt13 NP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

Example13 213

PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 NP13 -shy‐gt13 NP13 PPVP13 -shy‐gt13 V13 NPVP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 313

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NPNP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Mary13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 413

S(05) S(05)

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 NP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 19: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

CKY13

bull  So13 letrsquos13 build13 a13 table13 so13 that13 an13 A13 spanning13 from13 i13 to13 j13 in13 the13 input13 is13 placed13 in13 cell13 [ij]13 in13 the13 table13

bull  So13 a13 non-shy‐terminal13 spanning13 an13 enre13 string13 will13 sit13 in13 cell13 [013 n]13

bull  If13 we13 build13 the13 table13 bojom13 up13 wersquoll13 know13 that13 the13 parts13 of13 the13 A13 must13 go13 from13 i13 to13 k13 and13 from13 k13 to13 j13

CKY13

bull  Meaning13 that13 for13 a13 rule13 like13 A13 -shy‐gt13 B13 C13 we13 should13 look13 for13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13

bull  In13 other13 words13 if13 we13 think13 there13 might13 be13 an13 A13 spanning13 ij13 in13 the13 inputhellip13 AND13

bull  A13 -shy‐gt13 B13 C13 is13 a13 rule13 in13 the13 grammar13 THEN13 bull  There13 must13 be13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13 for13 some13 iltkltj13

bull  So13 just13 loop13 over13 the13 possible13 k13 values13

CKY13 Table13

bull Filling the [ij]th cell in the CKY table

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

CKY13 Algorithm13

Note13

bull  We13 arranged13 the13 loops13 to13 fill13 the13 table13 a13 column13 at13 a13 me13 from13 le[13 to13 right13 bojom13 to13 top13 13 ndash This13 assures13 us13 that13 whenever13 wersquore13 filling13 a13 cell13 the13 parts13 needed13 to13 fill13 it13 are13 already13 in13 the13 table13 (to13 the13 le[13 and13 below)13

ndash Are13 there13 other13 ways13 to13 fill13 the13 table13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

Example13

John called Sue from Denver

S

VP13 PP13

NP13 VP13

V13 NP13 NP13 P13

John called Sue from Denver

S13

PP13

NP13 VP13

NP13

NP13

V13

Example13 113

13 S(05) NP(45)

P(34) Denver

NP(23)13 V(23)

from

V(12) Sue

NP(01) called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 VP13 -shy‐gt13 VP13 PP13 NP13 -shy‐gt13 NP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

Example13 213

PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 NP13 -shy‐gt13 NP13 PPVP13 -shy‐gt13 V13 NPVP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 313

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NPNP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Mary13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 413

S(05) S(05)

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 NP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 20: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

CKY13

bull  Meaning13 that13 for13 a13 rule13 like13 A13 -shy‐gt13 B13 C13 we13 should13 look13 for13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13

bull  In13 other13 words13 if13 we13 think13 there13 might13 be13 an13 A13 spanning13 ij13 in13 the13 inputhellip13 AND13

bull  A13 -shy‐gt13 B13 C13 is13 a13 rule13 in13 the13 grammar13 THEN13 bull  There13 must13 be13 a13 B13 in13 [ik]13 and13 a13 C13 in13 [kj]13 for13 some13 iltkltj13

bull  So13 just13 loop13 over13 the13 possible13 k13 values13

CKY13 Table13

bull Filling the [ij]th cell in the CKY table

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

CKY13 Algorithm13

Note13

bull  We13 arranged13 the13 loops13 to13 fill13 the13 table13 a13 column13 at13 a13 me13 from13 le[13 to13 right13 bojom13 to13 top13 13 ndash This13 assures13 us13 that13 whenever13 wersquore13 filling13 a13 cell13 the13 parts13 needed13 to13 fill13 it13 are13 already13 in13 the13 table13 (to13 the13 le[13 and13 below)13

ndash Are13 there13 other13 ways13 to13 fill13 the13 table13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

Example13

John called Sue from Denver

S

VP13 PP13

NP13 VP13

V13 NP13 NP13 P13

John called Sue from Denver

S13

PP13

NP13 VP13

NP13

NP13

V13

Example13 113

13 S(05) NP(45)

P(34) Denver

NP(23)13 V(23)

from

V(12) Sue

NP(01) called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 VP13 -shy‐gt13 VP13 PP13 NP13 -shy‐gt13 NP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

Example13 213

PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 NP13 -shy‐gt13 NP13 PPVP13 -shy‐gt13 V13 NPVP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 313

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NPNP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Mary13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 413

S(05) S(05)

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 NP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 21: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

CKY13 Table13

bull Filling the [ij]th cell in the CKY table

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

CKY13 Algorithm13

Note13

bull  We13 arranged13 the13 loops13 to13 fill13 the13 table13 a13 column13 at13 a13 me13 from13 le[13 to13 right13 bojom13 to13 top13 13 ndash This13 assures13 us13 that13 whenever13 wersquore13 filling13 a13 cell13 the13 parts13 needed13 to13 fill13 it13 are13 already13 in13 the13 table13 (to13 the13 le[13 and13 below)13

ndash Are13 there13 other13 ways13 to13 fill13 the13 table13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

Example13

John called Sue from Denver

S

VP13 PP13

NP13 VP13

V13 NP13 NP13 P13

John called Sue from Denver

S13

PP13

NP13 VP13

NP13

NP13

V13

Example13 113

13 S(05) NP(45)

P(34) Denver

NP(23)13 V(23)

from

V(12) Sue

NP(01) called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 VP13 -shy‐gt13 VP13 PP13 NP13 -shy‐gt13 NP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

Example13 213

PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 NP13 -shy‐gt13 NP13 PPVP13 -shy‐gt13 V13 NPVP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 313

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NPNP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Mary13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 413

S(05) S(05)

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 NP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 22: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

CKY13 Algorithm13

Note13

bull  We13 arranged13 the13 loops13 to13 fill13 the13 table13 a13 column13 at13 a13 me13 from13 le[13 to13 right13 bojom13 to13 top13 13 ndash This13 assures13 us13 that13 whenever13 wersquore13 filling13 a13 cell13 the13 parts13 needed13 to13 fill13 it13 are13 already13 in13 the13 table13 (to13 the13 le[13 and13 below)13

ndash Are13 there13 other13 ways13 to13 fill13 the13 table13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

Example13

John called Sue from Denver

S

VP13 PP13

NP13 VP13

V13 NP13 NP13 P13

John called Sue from Denver

S13

PP13

NP13 VP13

NP13

NP13

V13

Example13 113

13 S(05) NP(45)

P(34) Denver

NP(23)13 V(23)

from

V(12) Sue

NP(01) called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 VP13 -shy‐gt13 VP13 PP13 NP13 -shy‐gt13 NP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

Example13 213

PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 NP13 -shy‐gt13 NP13 PPVP13 -shy‐gt13 V13 NPVP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 313

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NPNP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Mary13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 413

S(05) S(05)

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 NP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 23: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

CKY13 Algorithm13

Note13

bull  We13 arranged13 the13 loops13 to13 fill13 the13 table13 a13 column13 at13 a13 me13 from13 le[13 to13 right13 bojom13 to13 top13 13 ndash This13 assures13 us13 that13 whenever13 wersquore13 filling13 a13 cell13 the13 parts13 needed13 to13 fill13 it13 are13 already13 in13 the13 table13 (to13 the13 le[13 and13 below)13

ndash Are13 there13 other13 ways13 to13 fill13 the13 table13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

Example13

John called Sue from Denver

S

VP13 PP13

NP13 VP13

V13 NP13 NP13 P13

John called Sue from Denver

S13

PP13

NP13 VP13

NP13

NP13

V13

Example13 113

13 S(05) NP(45)

P(34) Denver

NP(23)13 V(23)

from

V(12) Sue

NP(01) called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 VP13 -shy‐gt13 VP13 PP13 NP13 -shy‐gt13 NP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

Example13 213

PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 NP13 -shy‐gt13 NP13 PPVP13 -shy‐gt13 V13 NPVP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 313

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NPNP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Mary13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 413

S(05) S(05)

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 NP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 24: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Note13

bull  We13 arranged13 the13 loops13 to13 fill13 the13 table13 a13 column13 at13 a13 me13 from13 le[13 to13 right13 bojom13 to13 top13 13 ndash This13 assures13 us13 that13 whenever13 wersquore13 filling13 a13 cell13 the13 parts13 needed13 to13 fill13 it13 are13 already13 in13 the13 table13 (to13 the13 le[13 and13 below)13

ndash Are13 there13 other13 ways13 to13 fill13 the13 table13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

Example13

John called Sue from Denver

S

VP13 PP13

NP13 VP13

V13 NP13 NP13 P13

John called Sue from Denver

S13

PP13

NP13 VP13

NP13

NP13

V13

Example13 113

13 S(05) NP(45)

P(34) Denver

NP(23)13 V(23)

from

V(12) Sue

NP(01) called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 VP13 -shy‐gt13 VP13 PP13 NP13 -shy‐gt13 NP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

Example13 213

PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 NP13 -shy‐gt13 NP13 PPVP13 -shy‐gt13 V13 NPVP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 313

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NPNP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Mary13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 413

S(05) S(05)

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 NP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 25: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

Example13

John called Sue from Denver

S

VP13 PP13

NP13 VP13

V13 NP13 NP13 P13

John called Sue from Denver

S13

PP13

NP13 VP13

NP13

NP13

V13

Example13 113

13 S(05) NP(45)

P(34) Denver

NP(23)13 V(23)

from

V(12) Sue

NP(01) called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 VP13 -shy‐gt13 VP13 PP13 NP13 -shy‐gt13 NP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

Example13 213

PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 NP13 -shy‐gt13 NP13 PPVP13 -shy‐gt13 V13 NPVP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 313

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NPNP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Mary13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 413

S(05) S(05)

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 NP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 26: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

013 Book13 113 the13 213 flight13 313 through13 413 13 Houston13 513 13

S13 13 NP13 VP13 S13 13 X113 VP13 X113 13 AUX13 NP13 S13 13 Verb13 NP13 S13 13 VP13 PP13 Nom13 13 book13 |13 flight13 |13 meal13 Nom13 13 13 Nom13 PP13 Det13 13 the13 |13 a13 |13 this13 NP13 13 Det13 Nom13 NP13 13 twa13 houston13 PP13 13 Prep13 NP13 Prep13 13 through13 |13 in13 |13 at13 VP13 13 Verb13 VP13 13 Verb13 NP13 Verb13 13 book13 |13 fly13 |13 list13

Example13

John called Sue from Denver

S

VP13 PP13

NP13 VP13

V13 NP13 NP13 P13

John called Sue from Denver

S13

PP13

NP13 VP13

NP13

NP13

V13

Example13 113

13 S(05) NP(45)

P(34) Denver

NP(23)13 V(23)

from

V(12) Sue

NP(01) called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 VP13 -shy‐gt13 VP13 PP13 NP13 -shy‐gt13 NP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

Example13 213

PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 NP13 -shy‐gt13 NP13 PPVP13 -shy‐gt13 V13 NPVP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 313

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NPNP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Mary13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 413

S(05) S(05)

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 NP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 27: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Example13

John called Sue from Denver

S

VP13 PP13

NP13 VP13

V13 NP13 NP13 P13

John called Sue from Denver

S13

PP13

NP13 VP13

NP13

NP13

V13

Example13 113

13 S(05) NP(45)

P(34) Denver

NP(23)13 V(23)

from

V(12) Sue

NP(01) called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 VP13 -shy‐gt13 VP13 PP13 NP13 -shy‐gt13 NP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

Example13 213

PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 NP13 -shy‐gt13 NP13 PPVP13 -shy‐gt13 V13 NPVP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 313

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NPNP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Mary13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 413

S(05) S(05)

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 NP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 28: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Example13 113

13 S(05) NP(45)

P(34) Denver

NP(23)13 V(23)

from

V(12) Sue

NP(01) called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 VP13 -shy‐gt13 VP13 PP13 NP13 -shy‐gt13 NP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

Example13 213

PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 NP13 -shy‐gt13 NP13 PPVP13 -shy‐gt13 V13 NPVP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 313

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NPNP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Mary13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 413

S(05) S(05)

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 NP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 29: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Example13 213

PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 NP13 -shy‐gt13 NP13 PPVP13 -shy‐gt13 V13 NPVP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 313

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NPNP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Mary13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 413

S(05) S(05)

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 NP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 30: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Example13 313

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NPNP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Mary13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

Example13 413

S(05) S(05)

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 NP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 31: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Example13 413

S(05) S(05)

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V Sue

NP called

John

S13 -shy‐gt13 NP13 VP13 VP13 -shy‐gt13 V13 NP13 NP13 -shy‐gt13 NP13 PP13 VP13 -shy‐gt13 VP13 PP13 PP13 -shy‐gt13 P13 NP13 NP13 -shy‐gt13 John13 NP13 -shy‐gt13 Sue13 NP13 -shy‐gt13 Denver13 V13 -shy‐gt13 called13 V13 -shy‐gt13 sue13 P13 -shy‐gt13 from13

13 -shy‐gt13 NP13 V13

113 013 213 313 413 513

VP(15)13 VP(15)13

NP(25) PP(35) NP

13 -shy‐gt13 V13 P13 13 -shy‐gt13 NP13 P13

P Denver

S(03) VP(13) V13 NP from

V13 (12) Sue

NP called

John

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 32: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Back13 to13 Ambiguity13 bull  Did13 we13 solve13 it13 bull  Nohellip13

ndash  Both13 CKY13 and13 Earley13 will13 result13 in13 mulple13 S13 structures13 for13 the13 [0n]13 table13 entry13

ndash  They13 both13 efficiently13 store13 the13 sub-shy‐parts13 that13 are13 shared13 between13 mulple13 parses13

ndash  But13 neither13 can13 tell13 us13 which13 one13 is13 right13 ndash Not13 a13 parser13 ndash13 a13 recognizer13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 33: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Converng13 CKY13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 each13 new13 cell13 in13 chart13 to13 point13 to13 where13 we13 came13 from13

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 34: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Problem13 (minor)13

bull  We13 said13 CKY13 requires13 the13 grammar13 to13 be13 binary13 (ie13 In13 Chomsky-shy‐Normal13 Form)13

bull  We13 showed13 that13 any13 arbitrary13 CFG13 can13 be13 converted13 to13 Chomsky-shy‐Normal13 Form13 so13 thatrsquos13 not13 a13 huge13 deal13

bull  Except13 when13 you13 change13 the13 grammar13 the13 trees13 come13 out13 wrong13

bull  All13 things13 being13 equal13 wersquod13 prefer13 to13 leave13 the13 grammar13 alone13

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 35: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Earley13 Parsing13

bull  Allows13 arbitrary13 CFGs13 bull  Where13 CKY13 is13 bojom-shy‐up13 Earley13 is13 top-shy‐down13 bull  Fills13 a13 table13 in13 a13 single13 sweep13 over13 the13 input13 words13 ndash Table13 is13 length13 N+113 N13 is13 number13 of13 words13 ndash Table13 entries13 represent13

bull  Completed13 constuents13 and13 their13 locaons13 bull  In-shy‐progress13 constuents13 bull  Predicted13 constuents13

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 36: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

States13

bull  The13 table-shy‐entries13 are13 called13 states13 and13 are13 represented13 with13 dojed-shy‐rules13 S13 -shy‐gt13 13 VP 13 13 13 13 A13 VP13 is13 predicted13 NP13 -shy‐gt13 Det13 13 Nominal 13 13 An13 NP13 is13 in13 progress13 VP13 -shy‐gt13 V13 NP13 13 13 13 13 A13 VP13 has13 been13 found13

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 37: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

StatesLocaons13

bull  It13 would13 be13 nice13 to13 know13 where13 these13 things13 are13 in13 the13 input13 sohellip13 S13 -shy‐gt13 13 VP13 [00] 13 13 13 A13 VP13 is13 predicted13 at13 the13 13 13 13

13 13 13 start13 of13 the13 sentence13 NP13 -shy‐gt13 Det13 13 Nominal13 13 [12] 13 An13 NP13 is13 in13 progress13 the13

13 13 13 13 13 13 Det13 goes13 from13 113 to13 213 VP13 -shy‐gt13 V13 NP13 13 13 [03] 13 13 A13 VP13 has13 been13 found13 13 13 13 13

13 13 13 starng13 at13 013 and13 ending13 at13 313

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 38: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Graphically13

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 39: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Earley13

bull  As13 with13 most13 dynamic13 programming13 approaches13 the13 answer13 is13 found13 by13 looking13 in13 the13 table13 in13 the13 right13 place13

bull  In13 this13 case13 there13 should13 be13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 40: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Earley13 Algorithm13

bull  March13 through13 chart13 le[-shy‐to-shy‐right13 bull  At13 each13 step13 apply13 113 of13 313 operators13

ndash Predictor13 bull  Create13 new13 states13 represenng13 top-shy‐down13 expectaons13

ndash Scanner13 bull  Match13 word13 predicons13 (rule13 with13 word13 a[er13 dot)13 to13 words13

ndash Completer13 bull When13 a13 state13 is13 complete13 see13 what13 rules13 were13 looking13 for13 that13 completed13 constuent13

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 41: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Earleyrsquos13 example13 113 Predict13 -shy‐13 Scan-shy‐13 Complete13

NP13 -shy‐gt13 John13 13 S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PP

John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 13 NP13 VP13 NP13 -shy‐gt13 13 NP13 PP13 NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

P13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

NP13 -shy‐gt13 13 John13

NOTE13 TO13 SELF13 13 13 13 13 13 13 Put13 in13 spans13

SCAN13 PREDICT13 COMPLETE13

Move13 the13 dot13

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 42: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Earleyrsquos13 example13 213 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 sue13 P13 -shy‐gt13 13 from13

V13 -shy‐gt13 13 called13 V13 -shy‐gt13 13 called13 13 VP13 -shy‐gt13 13 V13 13 NP13

SCAN13 PREDICT13 COMPLETE13

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 43: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Earleyrsquos13 example13 313 John13 called13 Sue13 from13 Denver13

NP13 -shy‐gt13 13 Sue13 13 VP13 -shy‐gt13 13 V13 NP13 VP13 -shy‐gt13 13 VP13 13 13 PP13 S13 -shy‐gt13 NP13 13 VP13 13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 PP13 PP13 -shy‐gt13 13 P13 NP13 NP13 -shy‐gt13 13 John13 13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Sue13

SCAN13 PREDICT13 COMPLETE13

Am13 I13 done13

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 44: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Earleyrsquos13 example13 413 John13 called13 Sue13 from13 Denver13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 V13 13 NPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 NP13 P13 -shy‐gt13 13 from13

P13 -shy‐gt13 13 from13

S13 -shy‐gt13 NP13 13 VP13 NP13 -shy‐gt13 NP13 13 PPVP13 -shy‐gt13 13 VP13 13 PP13 PP13 -shy‐gt13 13 P13 13 NP13 P13 -shy‐gt13 13 from13 13

NP13 -shy‐gt13 13 John13 NP13 -shy‐gt13 13 Sue13 NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13

NP13 -shy‐gt13 13 Denver13 13 PP13 -shy‐gt13 13 P13 13 NP13 13 NP13 -shy‐gt13 NP13 13 PP13 VP13 -shy‐gt13 13 VP13 13 PP13 VP13 -shy‐gt13 13 V13 13 NP13 13 S13 -shy‐gt13 NP13 13 13 VP13 13

DONE13

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 45: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Predictor13

bull  Given13 a13 state13 ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 not13 a13 part-shy‐of-shy‐speech13 category13 ndash  Create13 a13 new13 state13 for13 each13 expansion13 of13 the13 non-shy‐terminal13 ndash  Place13 these13 new13 states13 into13 same13 chart13 entry13 as13 generated13 state13 beginning13 and13 ending13 where13 generang13 state13 ends13 13

ndash  So13 predictor13 looking13 at13 bull  S13 -shy‐gt13 13 VP13 [00]13 13 13

ndash  13 13 results13 in13 bull  VP13 -shy‐gt13 13 Verb13 [00]13 bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 46: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Scanner13 bull  Given13 a13 state13

ndash With13 a13 non-shy‐terminal13 to13 right13 of13 dot13 ndash  That13 is13 a13 part-shy‐of-shy‐speech13 category13 ndash  If13 the13 next13 word13 in13 the13 input13 matches13 this13 part-shy‐of-shy‐speech13 ndash  Create13 a13 new13 state13 with13 dot13 moved13 over13 the13 non-shy‐terminal13 ndash  So13 scanner13 looking13 at13

bull  VP13 -shy‐gt13 13 Verb13 NP13 [00]13 ndash  If13 the13 next13 word13 ldquobookrdquo13 can13 be13 a13 verb13 add13 new13 state13

bull  VP13 -shy‐gt13 Verb13 13 NP13 [01]13 ndash  Add13 this13 state13 to13 chart13 entry13 following13 current13 one13 ndash  Note13 Earley13 algorithm13 uses13 top-shy‐down13 input13 to13 disambiguate13 POS13 Only13 POS13 predicted13 by13 some13 state13 can13 get13 added13 to13 chart13

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 47: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Completer13 bull  Applied13 to13 a13 state13 when13 its13 dot13 has13 reached13 right13 end13 of13 role13

bull  Parser13 has13 discovered13 a13 category13 over13 some13 span13 of13 input13

bull  Find13 and13 advance13 all13 previous13 states13 that13 were13 looking13 for13 this13 category13 ndash  copy13 state13 move13 dot13 insert13 in13 current13 chart13 entry13

bull  Given13 ndash  NP13 -shy‐gt13 Det13 Nominal13 13 [13]13 ndash  VP13 -shy‐gt13 Verb13 NP13 [01]13

bull  Add13 ndash  VP13 -shy‐gt13 Verb13 NP13 13 [03]13

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 48: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Earley13 how13 do13 we13 know13 we13 are13 done13

bull  How13 do13 we13 know13 when13 we13 are13 done13 bull  Find13 an13 S13 state13 in13 the13 final13 column13 that13 spans13 from13 013 to13 n+113 and13 is13 complete13

bull  If13 thatrsquos13 the13 case13 yoursquore13 done13 ndash S13 ndashgt13 α13 13 [0n+1]13

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 49: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Earley13

bull  So13 sweep13 through13 the13 table13 from13 013 to13 n+1hellip13 ndash New13 predicted13 states13 are13 created13 by13 starng13 top-shy‐down13 from13 S13

ndash New13 incomplete13 states13 are13 created13 by13 advancing13 exisng13 states13 as13 new13 constuents13 are13 discovered13

ndash New13 complete13 states13 are13 created13 in13 the13 same13 way13 13

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 50: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Earley13

bull  More13 specificallyhellip13 1  Predict13 all13 the13 states13 you13 can13 upfront13 2  Read13 a13 word13

1  Extend13 states13 based13 on13 matches13

2  Add13 new13 predicons13 3  Go13 to13 213

3  Look13 at13 N+113 to13 see13 if13 you13 have13 a13 winner13

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 51: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Example13

bull  Book13 that13 flight13 bull  We13 should13 findhellip13 an13 S13 from13 013 to13 313 that13 is13 a13 completed13 statehellip13

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 52: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Example13

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 53: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Example13

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 54: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Example13

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 55: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Details13

bull  What13 kind13 of13 algorithms13 did13 we13 just13 describe13 (both13 Earley13 and13 CKY)13 ndash Not13 parsers13 ndash13 recognizers13

bull  The13 presence13 of13 an13 S13 state13 with13 the13 right13 ajributes13 in13 the13 right13 place13 indicates13 a13 successful13 recognion13

bull  But13 no13 parse13 treehellip13 no13 parser13 bull  Thatrsquos13 how13 we13 solve13 (not)13 an13 exponenal13 problem13 in13 polynomial13 me13

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 56: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Back13 to13 Ambiguity13

bull  Did13 we13 solve13 it13

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 57: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Ambiguity13

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 58: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Converng13 Earley13 from13 Recognizer13 to13 Parser13

bull  With13 the13 addion13 of13 a13 few13 pointers13 we13 have13 a13 parser13

bull  Augment13 the13 ldquoCompleterrdquo13 to13 point13 to13 where13 we13 came13 from13

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 59: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Augmenng13 the13 chart13 with13 structural13 informaon13

Step13 DoCed13 rule13 Span13 Step13 Backpointer13

S813 Verb13 13 book13 13 [01]13 Scanner13

S913 VP13 13 Verb13 13 [01]13 Completer13 S813

S1013 S13 13 VP13 13 [01]13 Completer13 S913

S1113 VP13 13 Verb13 13 13 NP13 [01]13 Completer13 S813

S1213 NP13 13 13 Det13 Nom13 [11]13 Predictor13 S1113

S1313 NP13 13 13 PropN13 [11]13 Predictor13 S1113

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 60: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

Retrieving13 Parse13 Trees13 from13 Chart13

bull  All13 the13 possible13 parses13 for13 an13 input13 are13 in13 the13 table13 bull  We13 just13 need13 to13 read13 off13 all13 the13 backpointers13 from13 every13 complete13 S13 in13 the13 last13 column13 of13 the13 table13

bull  Find13 all13 the13 S13 -shy‐gt13 X13 13 13 [0N+1]13 bull  Follow13 the13 structural13 traces13 from13 the13 Completer13 bull  Of13 course13 this13 wonrsquot13 be13 polynomial13 me13 since13 there13 could13 be13 an13 exponenal13 number13 of13 trees13

bull  So13 we13 can13 at13 least13 represent13 ambiguity13 efficiently13

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13

Page 61: CS114 Lect11 Parsing - cs.brandeis.educs114/CS114_slides/CS114_Lect11_Parsing.pdf · CS114%Lecture%10% Parsing% March%5,%2014% Professor%Meteer% Thanks%for%Jurafsky%&%Mar?n%&%Prof.%Pustejovksyforslides

How13 to13 do13 parse13 disambiguaon13

bull  Probabilisc13 methods13 bull  Augment13 the13 grammar13 with13 probabilies13

bull  Then13 modify13 the13 parser13 to13 keep13 only13 most13 probable13 parses13

bull  And13 at13 the13 end13 return13 the13 most13 probable13 parse13