
Page 1: Probabilistic CKY

Probabilistic CKY

Roger Levy

[thanks to Jason Eisner]

Page 2: Probabilistic CKY

Managing Ambiguity

• John saw Mary
• Typhoid Mary
• Phillips screwdriver Mary
  (note how rare rules interact)
• I see a bird
  (is this 4 nouns, parsed like “city park scavenger bird”? rare parts of speech, plus systematic ambiguity in noun sequences)
• Time flies like an arrow
• Fruit flies like a banana
• Time reactions like this one
• Time reactions like a chemist
• … or is it just an NP?

Page 3: Probabilistic CKY

Our bane: Ambiguity

• John saw Mary
• Typhoid Mary
• Phillips screwdriver Mary
  (note how rare rules interact)
• I see a bird
  (is this 4 nouns, parsed like “city park scavenger bird”? rare parts of speech, plus systematic ambiguity in noun sequences)
• Time | flies like an arrow   (NP VP)
• Fruit flies | like a banana   (NP VP)
• Time | reactions like this one   (V[stem] NP)
• Time reactions | like a chemist   (S PP)
• … or is it just an NP?

Page 4: Probabilistic CKY

How to solve this combinatorial explosion of ambiguity?

1. First try parsing without any weird rules, throwing them in only if needed.

2. Better: every rule has a weight. A tree’s weight is the total weight of all its rules. Pick the overall lightest parse of the sentence.

3. Can we pick the weights automatically? We’ll get to this later …

Page 5: Probabilistic CKY

The plan for the rest of parsing

• There’s probability (inference) and then there’s statistics (model estimation)

• These two problems are logically separate, yet interrelated in practice

• We’ll start with the problem of efficient inference (probabilistic bottom-up parsing)

• Then we’ll move to the estimation of good (generative) models that permit efficient bottom-up parsing

• Finally, we’ll mention state-of-the-art work that doesn’t permit exact inference (“reranking” approaches)

Page 6: Probabilistic CKY

The critical recursion

• We’ll do bottom-up parsing (weighted CKY). When combining constituents into a larger constituent:
  • The weight of the new constituent is the sum of the weights of the combined subconstituents …
  • … plus the weight of the rule used to combine the subconstituents.
(A code sketch of this recursion follows below.)
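Here is a minimal Python sketch of that recursion, assuming a grammar of binary rules plus a lexicon of tagged words (the dictionary encoding and the name `weighted_cky` are my own choices, not from the slides). Each cell keeps only the lightest entry per category, together with a backpointer; this is the “keep only best-in-class” pruning that the later slides argue is safe when all we want is the best parse.

```python
from collections import defaultdict

def weighted_cky(words, binary_rules, lexicon):
    """Fill a CKY chart of lightest weights, with backpointers."""
    n = len(words)
    chart = defaultdict(dict)   # (i, j) -> {label: (weight, backpointer)}

    # Seed the diagonal with lexical entries (weight of tagging word i).
    for i, word in enumerate(words):
        for (tag, w), wt in lexicon.items():
            if w == word:
                chart[(i, i + 1)][tag] = (wt, word)

    # Combine constituents bottom-up, shorter spans before longer ones.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):                       # split point
                for (parent, left, right), rule_wt in binary_rules.items():
                    if left in chart[(i, k)] and right in chart[(k, j)]:
                        # weight of the new constituent = weights of the two
                        # subconstituents + weight of the combining rule
                        wt = (rule_wt
                              + chart[(i, k)][left][0]
                              + chart[(k, j)][right][0])
                        best = chart[(i, j)].get(parent)
                        if best is None or wt < best[0]:
                            # keep only the best entry per category,
                            # plus a backpointer for recovering the parse
                            chart[(i, j)][parent] = (wt, (k, left, right))
    return chart
```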

Page 7: Probabilistic CKY

The CKY chart for “time flies like an arrow” (word boundaries numbered 0–5; each cell [i,j] lists its entries as “category weight”):

  [0,1] time:   NP 3, Vst 3
  [1,2] flies:  NP 4, VP 4
  [2,3] like:   P 2, V 5
  [3,4] an:     Det 1
  [4,5] arrow:  N 8

Grammar (weight, rule):

  1  S  → NP VP        1  VP → V NP        1  NP → Det N        0  PP → P NP
  6  S  → Vst NP       2  VP → VP PP       2  NP → NP PP
  2  S  → S PP                             3  NP → NP NP
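For concreteness, here is one possible encoding of this grammar and lexicon, fed to the `weighted_cky` sketch above; the weight it finds for an S spanning the whole sentence matches the best S entry (22) that the following slides derive step by step.

```python
binary_rules = {
    ("S", "NP", "VP"): 1, ("S", "Vst", "NP"): 6, ("S", "S", "PP"): 2,
    ("VP", "V", "NP"): 1, ("VP", "VP", "PP"): 2,
    ("NP", "Det", "N"): 1, ("NP", "NP", "PP"): 2, ("NP", "NP", "NP"): 3,
    ("PP", "P", "NP"): 0,
}
lexicon = {
    ("NP", "time"): 3, ("Vst", "time"): 3,
    ("NP", "flies"): 4, ("VP", "flies"): 4,
    ("P", "like"): 2, ("V", "like"): 5,
    ("Det", "an"): 1, ("N", "arrow"): 8,
}

chart = weighted_cky("time flies like an arrow".split(), binary_rules, lexicon)
print(chart[(0, 5)]["S"][0])   # -> 22, the lightest S over the whole sentence
```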

Page 8: Probabilistic CKY

(Chart and grammar repeated unchanged from the previous slide.)

Page 9: Probabilistic CKY

New entry in cell [0,2] “time flies”: NP 10 (= 3 for NP → NP NP, plus NP 3 and NP 4). Rest of chart and grammar as before.

Page 10: Probabilistic CKY

New entry in [0,2]: S 8 (= 1 for S → NP VP, plus NP 3 and VP 4). Rest of chart as before.

Page 11: Probabilistic CKY

New entry in [0,2]: S 13 (= 6 for S → Vst NP, plus Vst 3 and NP 4). Rest of chart as before.

Page 12: Probabilistic CKY

(Chart and grammar unchanged from the previous slide.)

Page 13: Probabilistic CKY

New entry in cell [3,5] “an arrow”: NP 10 (= 1 for NP → Det N, plus Det 1 and N 8). Rest of chart as before.

Page 14: Probabilistic CKY

(Chart unchanged from the previous slide.)

Page 15: Probabilistic CKY

New entry in cell [2,5] “like an arrow”: PP 12 (= 0 for PP → P NP, plus P 2 and NP 10). Rest of chart as before.

Page 16: Probabilistic CKY

New entry in [2,5]: VP 16 (= 1 for VP → V NP, plus V 5 and NP 10). Rest of chart as before.

Page 17: Probabilistic CKY

(Chart unchanged from the previous slide.)

Page 18: Probabilistic CKY

New entry in cell [1,5] “flies like an arrow”: NP 18 (= 2 for NP → NP PP, plus NP 4 and PP 12). Rest of chart as before.

Page 19: Probabilistic CKY

New entry in [1,5]: S 21 (= 1 for S → NP VP, plus NP 4 and VP 16). Rest of chart as before.

Page 20: Probabilistic CKY

New entry in [1,5]: VP 18 (= 2 for VP → VP PP, plus VP 4 and PP 12). Rest of chart as before.

Page 21: Probabilistic CKY

(Chart unchanged from the previous slide.)

Page 22: Probabilistic CKY

New entry in cell [0,5], the whole sentence: NP 24 (= 3 for NP → NP NP, plus NP 3 and NP 18). Rest of chart as before.

Page 23: Probabilistic CKY

New entry in [0,5]: S 22 (= 1 for S → NP VP, plus NP 3 and VP 18). Rest of chart as before.

Page 24: Probabilistic CKY

New entry in [0,5]: S 27 (= 6 for S → Vst NP, plus Vst 3 and NP 18). Rest of chart as before.

Page 25: Probabilistic CKY

(Chart unchanged from the previous slide.)

Page 26: Probabilistic CKY

New entry in [0,5]: NP 24 (= 2 for NP → NP PP, plus NP 10 and PP 12). Rest of chart as before.

Page 27: Probabilistic CKY

New entry in [0,5]: S 27 (= 1 for S → NP VP, plus NP 10 and VP 16). Rest of chart as before.

Page 28: Probabilistic CKY

New entry in [0,5]: S 22 (= 2 for S → S PP, plus S 8 and PP 12). Rest of chart as before.

Page 29: Probabilistic CKY

New entry in [0,5]: S 27 (= 2 for S → S PP, plus S 13 and PP 12). Cell [0,5] now holds NP 24, S 22, S 27, NP 24, S 27, S 22, S 27. Rest of chart as before.

Page 30: Probabilistic CKY

(Chart and grammar as before; the chart is now complete.) Starting from the best S over the whole sentence, follow backpointers …

Page 31: Probabilistic CKY

(Chart as before.) Following backpointers: the S was built by S → NP VP …

Page 32: Probabilistic CKY

(Chart as before.) … its VP was built by VP → VP PP …

Page 33: Probabilistic CKY

(Chart as before.) … the PP by PP → P NP …

Page 34: Probabilistic CKY

(Chart as before.) … and the NP by NP → Det N, recovering the parse (S (NP time) (VP (VP flies) (PP (P like) (NP (Det an) (N arrow))))).
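A small sketch of this backpointer-following step, written against the chart returned by the `weighted_cky` sketch earlier (lexical cells store the word itself in place of a backpointer):

```python
def best_tree(chart, i, j, label):
    """Recover the lightest parse rooted in `label` over span (i, j)."""
    _weight, back = chart[(i, j)][label]
    if isinstance(back, str):                # lexical cell: back is the word
        return (label, back)
    k, left, right = back                    # binary step: split point + child labels
    return (label,
            best_tree(chart, i, k, left),
            best_tree(chart, k, j, right))

print(best_tree(chart, 0, 5, "S"))
# With ties broken as in the sketch above, this yields S -> NP VP, VP -> VP PP,
# PP -> P NP, NP -> Det N, i.e. the same parse traced out on these slides.
```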

Page 35: Probabilistic CKY

(Chart as before.) Which entries do we need?

Page 36: Probabilistic CKY

(Chart as before.) Which entries do we need?

Page 37: Probabilistic CKY

(Chart as before.) Not worth keeping: an entry such as the S 13 in cell [0,2], which is already beaten by the S 8 in the same cell …

Page 38: Probabilistic CKY

(Chart as before.) … since it just breeds worse options: anything built on top of the S 13 weighs more than the same constituent built on top of the S 8.

Page 39: Probabilistic CKY

(Chart as before.) Keep only best-in-class! The “inferior stock” (entries beaten by a same-category entry in their own cell) can be pruned.

Page 40: Probabilistic CKY

The chart pruned to best-in-class entries (grammar as before):

  [0,1] time:   NP 3, Vst 3        [0,2]: NP 10, S 8        [0,5]: NP 24, S 22
  [1,2] flies:  NP 4, VP 4         [1,5]: NP 18, S 21, VP 18
  [2,3] like:   P 2, V 5           [2,5]: PP 12, VP 16
  [3,4] an:     Det 1              [3,5]: NP 10
  [4,5] arrow:  N 8

Keep only best-in-class! (and backpointers, so you can recover the parse)

Page 41: Probabilistic CKY

Probabilistic Trees

• Instead of the lightest-weight tree, take the highest-probability tree.
• Given any tree, your assignment 1 generator would have some probability of producing it!
• Just like using n-grams to choose among strings …
• What is the probability of this tree?

  (S (NP time) (VP (VP flies) (PP (P like) (NP (Det an) (N arrow)))))

Page 42: Probabilistic CKY

Probabilistic Trees

• Instead of the lightest-weight tree, take the highest-probability tree.
• Given any tree, your assignment 1 generator would have some probability of producing it!
• Just like using n-grams to choose among strings …
• What is the probability of this tree?
• You rolled a lot of independent dice …

  p( (S (NP time) (VP (VP flies) (PP (P like) (NP (Det an) (N arrow))))) | S ) = ?
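As an aside, the “generator” the slide alludes to can be sketched in a few lines: each nonterminal is expanded by rolling an independent die over its rules, so every complete tree comes out with a definite probability, the product of the rolls. The rule table below is a hypothetical toy, not the weighted grammar from the chart.

```python
import random

rules = {   # toy PCFG: nonterminal -> [(children, probability), ...]
    "S":   [(("NP", "VP"), 1.0)],
    "NP":  [(("time",), 0.5), (("Det", "N"), 0.5)],
    "VP":  [(("flies",), 0.5), (("VP", "PP"), 0.5)],
    "PP":  [(("P", "NP"), 1.0)],
    "P":   [(("like",), 1.0)],
    "Det": [(("an",), 1.0)],
    "N":   [(("arrow",), 1.0)],
}

def generate(symbol):
    if symbol not in rules:                  # terminal: just emit the word
        return symbol
    expansions, probs = zip(*rules[symbol])
    children = random.choices(expansions, weights=probs)[0]   # roll the die
    return (symbol,) + tuple(generate(child) for child in children)

print(generate("S"))   # e.g. ('S', ('NP', 'time'), ('VP', 'flies'))
```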

Page 43: Probabilistic CKY

Chain rule: One word at a time

p(time flies like an arrow) = p(time)
  * p(flies | time)
  * p(like | time flies)
  * p(an | time flies like)
  * p(arrow | time flies like an)

Page 44: Probabilistic CKY

Chain rule + backoff (to get trigram model)

p(time flies like an arrow) ≈ p(time)
  * p(flies | time)
  * p(like | time flies)
  * p(an | flies like)
  * p(arrow | like an)

(With backoff, each word is conditioned only on the two preceding words.)

Page 45: Probabilistic CKY

Chain rule – written differently

p(time flies like an arrow) = p(time)
  * p(time flies | time)
  * p(time flies like | time flies)
  * p(time flies like an | time flies like)
  * p(time flies like an arrow | time flies like an)

Proof: p(x,y | x) = p(x | x) * p(y | x, x) = 1 * p(y | x)

Page 46: Probabilistic CKY

Chain rule + backoff

p(time flies like an arrow) ≈ p(time)
  * p(time flies | time)
  * p(time flies like | time flies)
  * p(time flies like an | … flies like)
  * p(time flies like an arrow | … like an)

(Backoff again: each factor keeps only the last two words of its conditioning prefix.)

Proof: p(x,y | x) = p(x | x) * p(y | x, x) = 1 * p(y | x)

Page 47: Probabilistic CKY

Chain rule: One node at a time

(Tree as before: (S (NP time) (VP (VP flies) (PP (P like) (NP (Det an) (N arrow)))))).

p(tree | S) = p( S → NP VP | S )
            * p( partial tree with NP expanded to “time” | partial tree so far )
            * p( partial tree with VP expanded to VP PP | partial tree so far )
            * p( partial tree with the lower VP expanded to “flies” | partial tree so far )
            * …

Each factor adds one node’s expansion, conditioned on the entire partial tree built so far.

Page 48: Probabilistic CKY

Chain rule + backoff

(Same tree and same product as on the previous slide, with backoff: each factor is now conditioned only on the label of the node being expanded, not on the whole partial tree built so far. This is the PCFG independence assumption, written out in simplified notation on the next slide.)

Page 49: Probabilistic CKY

Simplified notation

(Tree as before.)

p(tree | S) = p(S → NP VP | S)
            * p(NP → time | NP)
            * p(VP → VP PP | VP)
            * p(VP → flies | VP)
            * …

Page 50: Probabilistic CKY

Already have a CKY alg for weights …

(Tree as before.)

w(tree) = w(S → NP VP)
        + w(NP → time)
        + w(VP → VP PP)
        + w(VP → flies)
        + …

Just let w(X → Y Z) = –log₂ p(X → Y Z | X). Then the lightest tree has the highest probability.
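A one-line check of that correspondence (a sketch; the rule probabilities are arbitrary illustration values, not estimates from data):

```python
import math

def weight(prob):
    # the slide's conversion: rule weight = -log2 of the rule probability
    return -math.log2(prob)

rule_probs = [2 ** -1, 2 ** -3, 2 ** -4]            # three hypothetical rules in a tree
total_weight = sum(weight(p) for p in rule_probs)   # 1 + 3 + 4 = 8
tree_prob = math.prod(rule_probs)                   # 2 ** -8
assert math.isclose(2 ** -total_weight, tree_prob)  # lightest tree <=> most probable
```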

Page 51: Probabilistic CKY

(Chart and grammar as before, now read probabilistically: a weight w corresponds to probability 2^-w.)

The S 22 entry in cell [0,5] has probability 2^-22: multiply 2^-2 (the S → S PP rule, weight 2) by 2^-8 (the S 8 in [0,2]) and 2^-12 (the PP 12 in [2,5]).

Page 52: Probabilistic CKY

(Chart as before.) The inferior S 13 in [0,2] corresponds to probability 2^-13; you need only the best-in-class entry in each cell (here the S 8, i.e. 2^-8) to get the best parse.

Page 53: Probabilistic CKY

Why probabilities not weights?

• We just saw that probabilities are really just a special case of weights …
• … but we can estimate them from training data by counting and smoothing!
• Warning: What kind of training data do we need for this type of estimation?

Page 54: Probabilistic CKY

A slightly different task

• One task: What is the probability of generating a given tree with a PCFG generator?
  • To pick the tree with highest prob: useful in parsing.
• But we could also ask: What is the probability of generating a given string with the generator?
  • To pick the string with highest prob: useful in speech recognition, as a substitute for an n-gram model.
  • (“Put the file in the folder” vs. “Put the file and the folder”)
• To get the prob of generating a string, we must add up the probabilities of all trees for the string …

Page 55: Probabilistic CKY

(Chart and grammar as before.)

Could just add up the parse probabilities: 2^-22 + 2^-27 + 2^-27 + 2^-22 + 2^-27, one term per S entry in cell [0,5] … oops, back to finding exponentially many parses.

Page 56: Probabilistic CKY

(Chart as before, with some entries rewritten as probabilities: the S entries in [0,2] as 2^-8 and 2^-13, the PP in [2,5] as 2^-12, the rule S → S PP as 2^-2, and S entries in [0,5] as 2^-22 and 2^-27.)

Any more efficient way?

Page 57: Probabilistic CKY

Add as we go … (the “inside algorithm”): the S entry for [0,2] becomes the sum 2^-8 + 2^-13, and the S entry for [0,5] starts accumulating 2^-22 + 2^-27 + … (Rest of chart as before.)

Page 58: Probabilistic CKY

Add as we go … (the “inside algorithm”): the S entry for [0,2] is 2^-8 + 2^-13, and the S entry for [0,5] accumulates the probabilities of all five S parses of the sentence (two of 2^-22 and three of 2^-27). (Rest of chart as before.)
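A sketch of the inside algorithm in the same style as the earlier `weighted_cky` sketch, reusing the `binary_rules` and `lexicon` dictionaries encoded there: each rule weight w is read as probability 2^-w, the chart stores total probability per category and span, and we add where Viterbi CKY took a minimum.

```python
from collections import defaultdict

def inside(words, binary_rules, lexicon):
    n = len(words)
    chart = defaultdict(lambda: defaultdict(float))  # (i, j) -> {label: total prob}
    for i, word in enumerate(words):
        for (tag, w), wt in lexicon.items():
            if w == word:
                chart[(i, i + 1)][tag] += 2.0 ** -wt
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length
            for k in range(i + 1, j):
                for (parent, left, right), rule_wt in binary_rules.items():
                    if left in chart[(i, k)] and right in chart[(k, j)]:
                        # add as we go: sum over split points and rules
                        chart[(i, j)][parent] += (2.0 ** -rule_wt
                                                  * chart[(i, k)][left]
                                                  * chart[(k, j)][right])
    return chart

p = inside("time flies like an arrow".split(), binary_rules, lexicon)
# p[(0, 2)]["S"] is 2**-8 + 2**-13, and p[(0, 5)]["S"] sums all five S parses.
```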

Page 59: Probabilistic CKY

PCFGs and HMMs

• There is a natural connection between:
  • the inside algorithm for filling out a probabilistic CKY chart
  • and the forward algorithm for filling out an HMM trellis
• Do you see the relationship? (A sketch of the forward algorithm, for comparison, follows below.)
• However, bottom-up for PCFGs and left-to-right for HMMs are not the only ways to go for DPs:
  • You can go outside-in for PCFGs
  • We’ll see left-to-right (the Earley algorithm) for PCFGs
  • And you can go right-to-left for HMMs
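For comparison, a compact sketch of the HMM forward algorithm, with hypothetical probability tables `init`, `trans`, and `emit`: like the inside algorithm, each table entry is a sum over all ways of building the analysis so far, rather than a max over the single best one.

```python
from collections import defaultdict

def forward(words, init, trans, emit):
    # alpha[(t, s)] = total probability of emitting words[0:t] and ending in state s
    alpha = defaultdict(float)
    for s, p0 in init.items():
        alpha[(1, s)] = p0 * emit.get((s, words[0]), 0.0)
    for t in range(2, len(words) + 1):
        for (prev, s), p_tr in trans.items():
            alpha[(t, s)] += (alpha[(t - 1, prev)] * p_tr
                              * emit.get((s, words[t - 1]), 0.0))
    states = set(init) | {s for (_, s) in trans}
    # total probability of the word string = sum over possible ending states
    return sum(alpha[(len(words), s)] for s in states)
```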

Page 60: Probabilistic CKY

Charts and lattices

• You can equivalently represent a parse chart as a lattice constructed over some initial arcs

• This will also set the stage for Earley parsing later

[Figure: the parse chart for “salt flies scratch” redrawn as a lattice over the word positions. Each word arc carries its categories (salt: N, NP; flies: N, NP, V, VP; scratch: N, NP, V, VP), and larger arcs span the NP, VP, and S constituents built over them. Grammar used: S → NP VP, VP → V NP, VP → V, NP → N, NP → N N.]

Page 61: Probabilistic CKY

(Speech) Lattices

• There was nothing magical about words spanning exactly one position.

• When working with speech, we generally don’t know how many words there are, or where they break.

• We can represent the possibilities as a lattice and parse these just as easily.

[Figure: a word lattice for a stretch of speech. Its arcs include I, Ivan, eyes, saw, ’ve, awe, a, of, an, and van; one path through the lattice is “I saw a van,” and competing hypotheses share states with it.]
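A sketch of that generalization, in the style of the earlier `weighted_cky`: chart cells are indexed by pairs of lattice states (assumed numbered in topological order) instead of string positions, and the diagonal seeding runs over the lattice’s arcs. The `(from_state, to_state, word)` arc encoding is a hypothetical choice of mine.

```python
from collections import defaultdict

def lattice_cky(arcs, n_states, binary_rules, lexicon):
    chart = defaultdict(dict)        # (state_i, state_j) -> {label: (weight, back)}
    # Seed from lattice arcs rather than from word positions.
    for i, j, word in arcs:
        for (tag, w), wt in lexicon.items():
            if w == word:
                best = chart[(i, j)].get(tag)
                if best is None or wt < best[0]:
                    chart[(i, j)][tag] = (wt, word)
    # Same combination step as before; k ranges over intermediate states.
    for j in range(n_states):
        for i in range(j - 2, -1, -1):       # fill (i, j) after every (k, j) with k > i
            for k in range(i + 1, j):
                for (parent, left, right), rule_wt in binary_rules.items():
                    if left in chart[(i, k)] and right in chart[(k, j)]:
                        wt = rule_wt + chart[(i, k)][left][0] + chart[(k, j)][right][0]
                        best = chart[(i, j)].get(parent)
                        if best is None or wt < best[0]:
                            chart[(i, j)][parent] = (wt, (k, left, right))
    return chart
```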