Dependency Tree-to-Dependency Tree Machine Translation


Slide 1: Dependency Tree-to-Dependency Tree Machine Translation

November 4, 2011
Presented by: Jeffrey Flanigan (CMU)
Lori Levin, Jaime Carbonell
In collaboration with: Chris Dyer, Noah Smith, Stephan Vogel

Slide 2: Problem

Swahili: Watoto ni kusoma vitabu.
Gloss: children aux-pres read books
English: Children are reading books.
MT (phrase-based): Children are reading books.

Swahili: Watoto ni kusoma vitabu tatu mpya.
Gloss: children aux-pres read books three new
English: Children are reading three new books.
MT (phrase-based): Children are three new books.

Why? The phrase table contains both Pr(reading books | kusoma vitabu) and Pr(books | kusoma vitabu), and the language model is left to choose among poor candidates such as "Children are three new reading books." and "Children are reading books three new."

Slide 3: Problem: Grammatical Encoding Missing

Swahili: Nimeona samaki waliokula mashua.
Gloss: I-found fish who-ate boat
English: I found the fish that ate the boat.

MT System: I found that eating fish boat.

Predicate-argument structure was corrupted.

Slide 4: Grammatical Relations

I found the fish that ate the boat.

[Figure: dependency tree over the sentence, with relations ROOT, SUBJ, OBJ, DOBJ, DET, RCMOD, and REF marking who did what to whom]

⇒ Dependency trees on source and target!

Slide 5: Approach

Source sentence → [undo grammatical encoding (parse)] → source dependency tree → [translate] → target dependency tree → [grammatical encoding (choose surface form, linearize)] → target sentence

All stages are statistical.
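As a reading aid, here is a minimal Python skeleton of that pipeline; the function names and signatures are hypothetical placeholders that mirror the diagram, not the authors' implementation.

```python
# Hypothetical skeleton of the three statistical stages described above.
# None of these names come from the talk; they only mirror the diagram.

def parse_source(sentence: str):
    """Undo grammatical encoding: statistically parse the source
    sentence into a source dependency tree."""
    ...

def translate_tree(source_tree, rules):
    """Translate: map the source dependency tree to a target dependency
    tree using learned tree-fragment rules."""
    ...

def linearize(target_tree, language_model):
    """Grammatical encoding: choose surface forms and a word order for
    the target dependency tree."""
    ...

def translate(sentence, rules, language_model):
    source_tree = parse_source(sentence)
    target_tree = translate_tree(source_tree, rules)
    return linearize(target_tree, language_model)
```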

Slide 6: Extracting the Rules

Extract all consistent tree fragment pairs.

[Figure: word-aligned dependency trees for English "Children are reading three new books" (relations NSUBJ, AUX, DOBJ, NUM, AMOD) and Kinyarwanda "Abaana barasoma ibitabo bitatu bishya" (relations NSUBJ, DOBJ, NUM, AMOD), followed by example extracted source/target fragment pairs — e.g. Children ↔ Abaana, and "are reading (NSUBJ [1], DOBJ [2])" ↔ "barasoma (NSUBJ [1], DOBJ [2])" — with numbered variables marking shared open slots]
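The extraction condition parallels consistent phrase-pair extraction, lifted to trees. Below is a minimal sketch of the consistency test under a simplified data layout (fragments as sets of node labels, alignments as node pairs); it illustrates the condition only and is not the authors' extraction algorithm.

```python
# Simplified consistency test for tree-fragment pair extraction, analogous
# to consistent phrase-pair extraction. Fragments are sets of node labels;
# 'align' is a set of (source_node, target_node) links.

def consistent(src_nodes, tgt_nodes, align):
    """True if every alignment link touching either fragment stays inside
    the pair, and the pair is aligned by at least one link."""
    touched = False
    for s, t in align:
        if s in src_nodes or t in tgt_nodes:
            if s not in src_nodes or t not in tgt_nodes:
                return False     # a link leaks out of the candidate pair
            touched = True
    return touched

# Toy alignments for 'Children are reading books' / 'Abaana barasoma
# ibitabo' (assumed for illustration).
align = {("Children", "Abaana"), ("are", "barasoma"),
         ("reading", "barasoma"), ("books", "ibitabo")}

print(consistent({"are", "reading"}, {"barasoma"}, align))  # True
print(consistent({"reading"}, {"barasoma"}, align))         # False ('are' leaks)
print(consistent({"books"}, {"ibitabo"}, align))            # True
```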

Slide 7: Translating

An extension of phrase-based SMT:
• Linear strings → dependency trees
• Phrase pairs → tree-fragment pairs
• Language model → dependency language model

Search is top-down on the target side, using a beam-search decoder (see the sketch below).
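A minimal sketch of such a top-down beam search follows. The state and rule representations (rules keyed by the source node they cover, hypotheses as score/open-slot/history triples) are hypothetical simplifications, not the talk's decoder.

```python
import math

# Hypothetical top-down beam-search decoder over the target tree. A
# hypothesis is (score, open_slots, applied_rules); each step expands the
# first open slot with every matching rule and keeps the best beam_size.

def beam_decode(root, rules, beam_size=10):
    beams = [(0.0, [root], [])]
    complete = []
    while beams:
        next_beams = []
        for score, open_slots, applied in beams:
            if not open_slots:
                complete.append((score, applied))
                continue
            slot, rest = open_slots[0], open_slots[1:]
            for rule in rules.get(slot, []):
                next_beams.append((score + math.log(rule["p"]),
                                   rule["covers"] + rest,
                                   applied + [rule["name"]]))
        next_beams.sort(key=lambda h: -h[0])
        beams = next_beams[:beam_size]
    return max(complete, key=lambda h: h[0], default=None)

# Toy rule inventory, made up to mirror the example on the next slides.
rules = {
    "arasoma": [{"name": "arasoma -> is reading", "p": 0.5,
                 "covers": ["umwaana", "ibitabo"]}],
    "umwaana": [{"name": "umwaana -> the child", "p": 0.8, "covers": []}],
    "ibitabo": [{"name": "ibitabo -> books", "p": 0.7, "covers": []}],
}
print(beam_decode("arasoma", rules))
```

In the talk's system the hypothesis score also folds in the dependency language model terms illustrated on the following slides.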

Slide 8: Translation Example

Input: Kinyarwanda dependency tree arasoma (NSUBJ umwaana, DOBJ ibitabo).

[Figure: the input tree beside the inventory of rules, each carrying a probability P(e|f) — a structural rule pairing arasoma (NSUBJ [1], DOBJ [2]) with "is reading" (AUX is, NSUBJ [1], DOBJ [2]), and lexical rules pairing umwaana with "the child" / "a child" and ibitabo with "books"; the listed probabilities are 0.5, 0.8, 0.1, and 0.7]

Target: The child is reading books

Slide 9: Translation Example

[Figure: the root rule (P(e|f) = 0.5) has been applied, yielding the partial target tree "is reading" with an AUX arc and open NSUBJ and DOBJ slots; the source nodes umwaana and ibitabo are not yet translated]

Score = w1 ln(0.5) + w2 ln(Pr(reading | ROOT)) + w2 ln(Pr(is | reading, AUX))

Language model on target dependency tree.
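The dependency language model scores each target word given its head word and dependency relation. A toy sketch of such a model as a lookup table; all probabilities below are invented for illustration.

```python
# Hypothetical dependency LM: Pr(word | head, relation) as a lookup table
# with a tiny back-off value for unseen events. Values are made up.
dep_lm = {
    ("reading", "ROOT", None): 0.2,     # Pr(reading | ROOT)
    ("is", "reading", "AUX"): 0.6,      # Pr(is | reading, AUX)
    ("books", "reading", "DOBJ"): 0.1,  # Pr(books | reading, DOBJ)
}

def pr(word, head, rel):
    return dep_lm.get((word, head, rel), 1e-6)

print(pr("is", "reading", "AUX"))   # 0.6
```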

Slide 10: Translation Example

[Figure: the rule ibitabo ↔ books (P(e|f) = 0.7) fills the DOBJ slot of the partial target tree]

Score = w1 ln(0.5) + w1 ln(0.7) + w2 ln(Pr(reading | ROOT)) + w2 ln(Pr(is | reading, AUX)) + w2 ln(Pr(books | reading, DOBJ))

Slide 11: Translation Example

[Figure: the rule umwaana ↔ "the child" (P(e|f) = 0.8) fills the NSUBJ slot, completing the target tree for "The child is reading books"]

Score(Translation) = w1 ln(0.5) + w1 ln(0.7) + w1 ln(0.8) + w2 ln(Pr(reading | ROOT)) + w2 ln(Pr(is | reading, AUX)) + w2 ln(Pr(books | reading, DOBJ)) + w2 ln(Pr(child | reading, NSUBJ)) + w2 ln(Pr(the | (child, DET), (reading, ROOT)))
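The final score is a weighted sum: w1 weights the log rule probabilities and w2 the log dependency-LM probabilities. A small sketch of that computation; the weights and LM values below are invented illustration numbers, only the rule probabilities come from the example.

```python
import math

# Sketch of the log-linear score above. w1, w2 and the dependency-LM
# probabilities are made up; rule_probs are the P(e|f) values from the
# example derivation.
w1, w2 = 1.0, 0.5

rule_probs = [0.5, 0.7, 0.8]
dep_lm_probs = [0.2, 0.6, 0.1, 0.15, 0.5]   # Pr(reading|ROOT), Pr(is|reading,AUX), ...

score = (w1 * sum(map(math.log, rule_probs))
         + w2 * sum(map(math.log, dep_lm_probs)))
print(f"Score(Translation) = {score:.3f}")
```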

Slide 12: Linearization

• Generate projective trees
• A* search
• Left to right with target LM
• Admissible heuristic: highest-scoring completion without LM

[Figure: target dependency tree over "He", "is", "strong", "enough" with relations NSUBJ, COP, ADVMOD]

Slide 13: Linearization

Candidate: He is enough strong

Score = Pr(He | START) · Pr(<NSUBJ, HEAD, COP> | is) · Pr(<HEAD, ADVMOD> | strong)

Slide 14: Linearization

Candidate: He is enough strong

Score = Pr(He | START) · Pr(is | He, START) · Pr(<NSUBJ, HEAD, COP> | is) · Pr(<HEAD, ADVMOD> | strong)

Slide 15: Linearization

Candidate: He is strong enough

Score = Pr(He | START) · Pr(is | He, START) · Pr(strong | He, is) · Pr(<NSUBJ, HEAD, COP> | is) · Pr(<ADVMOD, HEAD> | strong)

Slide 16: Linearization

Candidate: He is strong enough

Score = Pr(He) · Pr(is | He) · Pr(strong | He, is) · Pr(enough | strong, is) · Pr(<NSUBJ, HEAD, COP> | is) · Pr(<HEAD, ADVMOD> | strong)
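A toy A* sketch of left-to-right linearization with an admissible heuristic follows. It simplifies the talk's setup: candidates here are arbitrary permutations scored by a made-up bigram LM (the real search generates only projective orders of the dependency tree and adds the ordering-model terms), and the heuristic is the optimistic bound h = 0, admissible because every remaining log-probability term is at most 0.

```python
import heapq, math

# Toy A* linearization: states are partial left-to-right word orders,
# g is the log LM score so far, h = 0 is an admissible (optimistic) bound
# on the remaining score. Bigram values are invented for illustration.
words = ["He", "is", "strong", "enough"]
bigram = {("<s>", "He"): 0.4, ("He", "is"): 0.5, ("is", "strong"): 0.3,
          ("strong", "enough"): 0.4, ("is", "enough"): 0.05,
          ("enough", "strong"): 0.05}

def lm(prev, word):
    return math.log(bigram.get((prev, word), 1e-4))

def a_star(words):
    queue = [(0.0, 0.0, [])]          # (-(g + h), g, placed words)
    while queue:
        _, g, placed = heapq.heappop(queue)
        if len(placed) == len(words):
            return placed, g          # first goal popped is optimal
        prev = placed[-1] if placed else "<s>"
        for w in words:
            if w not in placed:
                g2 = g + lm(prev, w)
                heapq.heappush(queue, (-(g2 + 0.0), g2, placed + [w]))

print(a_star(words))   # -> (['He', 'is', 'strong', 'enough'], best log score)
```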

Slide 17: Comparison to Major Approaches

Approach | Similarities | Difference
Old-style analysis-transfer-generation | Separate analysis, transfer, and generation models | Statistical; rules are learned
Synchronous CFGs [Chiang 2005] [Zollmann et al. 2006] | Model of grammatical encoding | Allows adjunction and head switching
Tree transducers [Graehl & Knight 2004] | Model of grammatical encoding | Different decoding
Quasi-synchronous grammars [Gimpel & Smith 2009] | Dependency trees on source and target | Different rules, decoding
Synchronous tree insertion grammars [DeNeefe & Knight 2009] | Allows adjuncts | Allows head switching
Dependency treelets [Quirk et al. 2005] [Shen et al. 2008] | Dependency trees on source and target | Word order not in rules; linearization procedure
String-to-dependency MT [Shen et al. 2008] | Target dependency language model | Dependency trees on both source and target
Dependency tree to dependency tree (JHU Summer Workshop 2002) [Čmejrek et al. 2003] [Eisner 2003] | Dependency trees on source and target; linearization step | Different learning of rules; different decoding procedure

Slide 18: Conclusion

• Separate translation from reordering
• Dependency trees capture grammatical relations
• Can extend phrase-based MT to dependency trees
• Complements ISI's approach nicely

Work in progress!

Slide 19: Backup Slides

Slide 20: Allowable Rules

• Nodes consistent with alignments
• All variables aligned
• Nodes ∪ variables ∪ arcs ∪ alignments = connected graph

Optional constraints:
• Nodes on source connected
• Nodes on target connected
• Nodes on source and target connected

Decoding constraint:
• Target tree connected
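A small sketch of checking the connected-graph condition with union-find. The data layout — vertices for rule nodes and variables, edges for dependency arcs and alignment links — is an assumed simplification of the constraint stated above.

```python
# Sketch of the connectivity condition: treat the rule's nodes and
# variables as vertices, its arcs plus alignment links as edges, and
# require a single connected component.

def connected(vertices, edges):
    """Union-find over the rule's combined graph."""
    parent = {v: v for v in vertices}
    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path compression
            v = parent[v]
        return v
    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(v) for v in vertices}) == 1

# Toy rule: arasoma (NSUBJ [1], DOBJ [2]) paired with
# reading (AUX are, NSUBJ [1], DOBJ [2]).
vertices = {"arasoma", "[1]", "[2]", "reading", "are"}
edges = [("arasoma", "[1]"), ("arasoma", "[2]"),   # source arcs
         ("reading", "are"),                        # target arc
         ("reading", "[1]"), ("reading", "[2]"),    # target arcs
         ("arasoma", "reading")]                    # alignment link
print(connected(vertices, edges))   # True: the rule is allowable
```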

Slide 21: Head Switching Example

[Figure: aligned dependency trees for French "Le bébé vient de tomber" (root vient, NSUBJ bébé with DET Le, PREP de with POBJ tomber) and English "The child just fell" (root fell, NSUBJ child with DET The, ADVMOD just), together with an extracted rule pair mapping vient (NSUBJ [1], PREP de (POBJ [2])) to [2] (NSUBJ [1], ADVMOD just) — the head switches from vient to the translation of tomber]

Slide 22: Moving Up the Triangle

• Propositional semantic dependencies
• Deep syntactic dependencies
• Surface syntactic dependencies

Slide 23: Comparison to Synchronous Phrase Structure Rules

• Training dataset:
Kinyarwanda: Abaana baasoma igitabo gishya kyose cyaa Karooli.
English: The children are reading all of Charles 's new book.

• Test sentence:
Kinyarwanda: Abaana baasoma igitabo cyaa Karooli gishya kyose.

• Synchronous decoders (SAMT, Hiero, etc.) produce:
The children are reading book 's Charles new all of.
The children are reading book Charles 's all of new.

Problem: grammatical encoding is tied to word order.
