Automating Post-Editing To Improve MT Systems Ariadna Font Llitjós and Jaime Carbonell APE Workshop, AMTA – Boston August 12, 2006

Automating Post-Editing To Improve MT Systems

Ariadna Font Llitjós and Jaime Carbonell

APE Workshop, AMTA – Boston

August 12, 2006

Outline• Introduction

– Problem and Motivation– Goal and Solution– Related Work

• Theoretical Space

• Online Error Elicitation

• Rule Refinement Framework

• MT Evaluation Framework

• Conclusions

Problem and Motivation

MT output: Assassinated a diplomat Russian and kidnapped other four in Bagdad



• Could hire post-editors to correct machine translations



• Could hire post-editors to correct machine translations

• Expensive + time consuming



• Not feasible for large amounts of data (google, yahoo, etc.)

• Does not generalize to new sentences

Ultimate Goal

Automatically Improve MT Quality

Two Alternatives

• Automatic learning of post-editing rules (APE)

+ system independent

- several thousands of sentences might need to be corrected for the same error

• Automatic Refinement of MT System+ attacks the core of the problem- system dependent

Our Approach

Automatically Improve MT Quality

by Recycling Post-Editing Information

Back to MT Systems

Our Solution

• Get non-expert bilingual speakers to provide correction feedback online – Make correcting translations easy and fun – 5-10 minutes a day Large amounts of correction

data

SL: Asesinado un diplomático ruso y secuestrados otros cuatro en Bagdad

TL: Assassinated a diplomat Russian and kidnapped other four in Bagdad

Our Solution

• Get non-expert bilingual speakers to provide correction feedback online – Make correcting translations easy and fun – 5-10 minutes a day Large amounts of correction

data

• Feed corrections back into the MT system, so that they can be generalized

System will translate new sentences better

SL: Asesinado un diplomático ruso y secuestrados otros cuatro en Bagdad

TL: Assassinated a diplomat Russian and kidnapped other four in Bagdad

Bilingual Post-editing

• In addition to:– TL sentence, and– Context information, when available

• It also provides post-editors with:– Source Language sentence– Alignment information

Traditional post-editing

Allegedly a harder task, but we can now get much more data for free

MT Approaches

Interlingua

Syntactic Parsing

Semantic Analysis

Sentence Planning

Text Generation

Source (e.g. English)

Target(e.g. Spanish)

Transfer Rules

Direct: SMT, EBMT

[Nishida et al. 1988]

[Corston-Oliver & Gammon, 2003]

[Imamura et al. 2003]

[Menezes & Richardson, 2001]

Related Work

[Su et al. 1995]

[Brill, 1993]

[Gavaldà, 2000]

Post-editing

Rule AdaptationFixingMachine Translation

[Callison-Burch, 2004]

[Allen 2003]

[Allen & Hogan, 2000]

[Knight & Chander, 1994]

Our ApproachNon-expert user feedback

Provides relevant reference translations

Generalizes over unseen data

System ArchitectureINPUT TEXT

OUTPUT TEXT

Rule Learning

and other

Resources

Run-Time MT

System

Online

Translation

Correction

Tool

Rule Refinement

Main Technical Challenge

Mapping between

Simple user edits to MT output

Improved Translation Rules

Blame assignment

Rule Modifications

Lexical Expansions


• Theoretical Space– Error classification– Limitations




• Conclusions

Error Typology for Automatic Rule Refinement (simplified)

Missing word

Extra word

Wrong word order

Incorrect word

Local vs Long distance

Word vs. phrase

+ Word change

Sense

Form

Selectional restrictions

Idiom

Missing constraint

Extra constraint

Error Typology for Automatic Rule Refinement (simplified)

Missing word

Extra word

Wrong word order

Incorrect word

Local vs Long distance

Word vs. phrase

+ Word change

Sense

Form

Selectional restrictions

Idiom

Missing constraint

Extra constraint

Limitations of Approach

• The SL sentence needs to be fully parsed by the translation grammar.– current approach needs rules to refine

• Depend on bilingual speakers’ ability to detect MT errors.

Outline

• Introduction


• Online Error Elicitation– Translation Correction Tool



• Conclusions

TCTool (Demo)• Add a word• Delete a word• Modify a word• Change word order

Actions:

Interactive elicitation of error information

http://avenue.lti.cs.cmu.edu/aria/spanish/KenProject/dragshort.html?sl=Gaudi%20was%20a%20great%20artist&con=&tl=gaudi%20era%20un%20artista%20grande&al=((1,1),(2,2),(3,3),(5,4),(4,5))&id=2004-11-2-14:39:12-17393&count=0&senum=3

Expanding the lexicon

1. OOV words/idoms Hamas cabinet calls a truce to avoid Israeli retaliation

El gabinete de Hamas llama un *TRUCE* para evitar la venganza israelí …acuerda una tregua…

2. New Word FormThe children fell *los niños cayeron

los niños se cayeron

3. New Word SenseThe girl plays the guitar *la muchacha juega la guitarra

la muchacha toca la guitarra

Refining the grammar

4. Add Agreement constraints I see the red car *veo el coche roja veo el coche rojo

5. Add Word or ConstituentYou saw the woman viste la mujer viste a la mujer

6. Move Constituent Order I saw you yo vi te yo te vi

Eng2Spa User Study [LREC 2004]

• MT error classification 9 linguistically-motivated classes [Flanagan, 1994], [White et al. 1994]:

word order, sense, agreement error (number, person, gender, tense), form, incorrect word and no translation

Interactive elicitation of error information

precision recall

error detection 90% 89%

error classification 72% 71%




• Rule Refinement Framework– RR operations– Formalizing Error information– Refinement Steps (Example)


• Conclusions

1. Refine a translation rule:R0 R1 (change R0 to make it more

specific or more general)

Types of Refinement Operations

Automatic Rule Adaptation

R0:

R1:

NP

DET N ADJ

NP

DET ADJ N

a nice house

una casa bonito

NP

DET N ADJ

NP

DET ADJ N

a nice house

una casa bonita

N gender = ADJ gender

1. Refine a translation rule:R0 R1 (change R0 to make it more

specific or more general)

Types of Refinement Operations


R0:

R1:

NP

DET N ADJ

NP

DET ADJ N

a nice house

una casa bonito

NP

DET N ADJ

NP

DET ADJ N

a nice house

una casa bonita


2. Bifurcate a translation rule:R0 R0 (same, general rule)

Types of RefinementOperations (2)


R0: NP

DET N ADJ

NP

DET ADJ N

a nice house una casa bonita

NP

DET ADJ N

NP

DET ADJ N

R1:

a great artist un gran artista

ADJ type: pre-nominal

2. Bifurcate a translation rule:R0 R0 (same, general rule)

R1 (add a new more specific rule)

Types of RefinementOperations (2)


R0: NP

DET N ADJ

NP

DET ADJ N

a nice house una casa bonita

NP

DET ADJ N

NP

DET ADJ N

R1:


ADJ type: pre-nominal

Wi = error

Wi’ = correction

Wc = clue word


NP

DET N ADJ

NP

DET ADJ N

a nice house

una casa bonito

NP

DET N ADJ

NP

DET ADJ N

a nice house

una casa bonita


Wi = bonito

Wi’ = bonita

Wc = casa

Formalizing Error Information

Triggering Feature DetectionComparison at the feature level to detect

triggering feature(s)

Delta function: (Wi,Wi’)

Examples:(bonito,bonita) = {gender}

(comiamos,comia) = {person,number}(mujer,guitarra) = {}

If set is empty, need to postulate a new binary feature (feat_i = {+,−})


gen = masc gen = fem

Refinement Steps

Error Correction Elicitation

Finding Triggering Features

Blame Assignment

Rule Refinement

Refinement Steps



Blame Assignment

Rule Refinement

SL: Gaudí was a great artist

TL: Gaudí era un artista grande

1. Error Correction Elicitation

Wi = grande Edit Word

Change Word Order

Wi’ = gran

CTL: Gaudí era un gran artista

Refinement Steps


Finding Triggering Feature

Blame Assignment

Rule Refinement

1. Edit: Wi = grande Wi’ = gran

2. Change Word Order: artista gran gran artista

Refinement Steps



Blame Assignment

Rule Refinement



ADJ

[great] [grande]

agr num = sg

agr gen = masc

ADJ

[great] [gran]

agr num = sg

agr gen = masc

Delta function difference at the feature level?

(grande, gran) =

2. Finding Triggering Features

ADJ

[great] [grande]

agr num = sg

agr gen = masc

ADJ

[great] [gran]

agr num = sg

agr gen = masc


(grande, gran) =

need to postulate a new binary feature: feat1


ADJ

[great] [grande]

agr num = sg

agr gen = masc

ADJ

[great] [gran]

agr num = sg

agr gen = masc


(grande, gran) =

need to postulate a new binary feature: feat1

[feat1 = +] = [type = pre-nominal]

[feat1 = −] = [type = post-nominal]



ADJ

[great] [grande]

agr num = sg

agr gen = masc

ADJ

[great] [gran]

agr num = sg

agr gen = masc


(grande, gran) = new binary feature: feat1

ADJ

[great] [grande]

agr num = sg

agr gen = masc

feat1 = −

ADJ

[great] [gran]

agr num = sg

agr gen = masc

feat1 = +

REFINE

Refinement Steps



Blame Assignment

Rule Refinement



grande [feat1 = −]

gran [feat1 = +]

Refinement Steps



Blame Assignment

Rule Refinement




gran [feat1 = +]

(from transfer and generation tree)

tree: <( S,1 (NP,2 (N,5:1 "GAUDI") )

(VP,3 (VB,2 (AUX,17:2 "ERA") )

(NP,8 (DET,0:3 "UN")

(N,4:5 "ARTISTA")

(ADJ,5:4 "GRANDE"))) )>

3. Blame Assignment

S,1

…

NP,1

…

NP,8

…

Grammar

Refinement Steps



Blame Assignment

Rule Refinement




gran [feat1 = +]

NP,8

N ADJ ADJ N

Refinement Steps



Blame Assignment

Rule Refinement




gran [feat1 = +]

NP,8

N ADJ ADJ N

4. Rule Refinement

NP

DET ADJ N

NP

DET N ADJ

NP,8:

a great artist un artista gran

NP

DET ADJ N

NP

DET ADJ N

NP,8’:


ADJ feat1 =c +

BIFURCATE

REFINE

4. Rule Refinement

NP

DET ADJ N

NP

DET N ADJ

NP,8:

a great artist un artista gran

NP

DET ADJ N

NP

DET ADJ N

NP,8’:


ADJ type = pre-nominal

Refinement Steps



Blame Assignment

Rule Refinement




gran [feat1 = +]

NP,8

(N ADJ ADJ N)

NP,8 ADJ N N ADJ

NP,8’ ADJ N ADJ N [ADJ feat 1 =c +]

Correct Translation Output

NP,8 ADJ(great-grande) [feat1 = -]

NP,8’ ADJ(great-gran)

[ADJ feat1 =c +] [feat1 = +]

Gaudi era un artista grande

Gaudi era un gran artista

*Gaudi era un grande artista

Done? Not yetNP,8 (R0) ADJ(grande)

[feat1 = -]

NP,8’ (R1) ADJ(gran)

[feat1 =c +] [feat1 = +]

Need to restrict application of general rule (R0) to just post-nominal ADJ

un artista grande

un artista gran

un gran artista

*un grande artista

Add Blocking ConstraintNP,8 (R0) ADJ(grande) [feat1 = -] [feat1 = -]


[feat1 =c +] [feat1 = +]

Can we also eliminate incorrect translations automatically?

un artista grande

*un artista gran

un gran artista

*un grande artista

Making the grammar tighter

• If Wc = artista Add [feat1= +] to N(artista) Add agreement constraint to NP,8 (R0)

between N and ADJ ((N feat1) = (ADJ feat1))

*un artista grande

*un artista gran

un gran artista

*un grande artista

Generalization Power: abstract feature (feat_i)

Irina is a great friend Irina es una gran amiga (instead of *Irina es una amiga

grande) NP

DET ADJ N

NP

DET ADJ N

NP,8’:

a great friend una gran amiga

NP

DET ADJ N

NP

DET ADJ N

a great person una gran persona

Juan is a great person Juan es una gran persona

(instead of *Juan es una persona grande) ADJ feat1 = +

ADJ feat1 = +

Generalization Power++

When triggering feature already exists in the grammar/lexicon (POS, gender, number, etc.)

I see the red car *veo un auto roja ADJ gender = N gender

Refinements generalize to all lexical entries that have that feature (gender):

The yellow houses are his las casas amarillas son suyas (before: *las casas amarillos son suyas)

We need to go to a dark cave tenemos que ir a una cueva oscura

(before: *cueva oscuro)

gender = fem

gender = masc

gender = fem

gender = masc gender = fem

veo un auto rojoo





• MT Evaluation Framework– Relevant (and free) human reference translations– Not punished by BLEU, NIST and METEOR.

• Conclusions

MT Evaluation Framework

Multiple Human Reference

Translations

relevant to specific MT system errors

Bilingual speakers

- MT systems not be punished for picking a different synonym or morpho-syntactic variation by BLEU and METEOR.

- Similar to what [Snover et al. 2006] propose (HTER).

MT Evaluation continued

• Did the refined MT system generate translation as corrected by the user (CTL)? Simple “recall” {0=CTL not in output, 1=CTL in output}

• Did the number of bad translations (implicitly identified by users) generated by the system decrease?Precision at rank k (k=5 in TCTool)

• Did the system successfully managed to reduce ambiguity (number of alternative translations)? Reduction ratio






• Conclusions– Impact on MT systems– Work in Progress and Contributions – Future Work

Impact on Transfer-Based MT

Learning

Module

Transfer Rules

Lexical Resources

Transfer System

Translation Candidate

Lattice

Rule Learning and other Resources

Run-Time System

Handcrafted rules

Morpho-logical analyzer

INPUT TEXT

OUTPUT TEXT

Learning

Module

Transfer Rules

Lexical Resources

Transfer System


Lattice


Run-Time System

Handcrafted rules


INPUT TEXTOnline

Translation

Correction

Tool

Rule Refinement


Learning

Module

Transfer Rules

Lexical Resources

Transfer System


Lattice


Run-Time System

Handcrafted rules


INPUT TEXTOnline

Translation

Correction

Tool

Rule Refinement


Learning

Module

Transfer Rules

Lexical Resources

Transfer System


Lattice


Run-Time System

Handcrafted rules


INPUT TEXTOnline

Translation

Correction

Tool

Rule Refinement


Learning

Module

Transfer Rules

Lexical Resources

Transfer System


Lattice


Run-Time System

Handcrafted rules


INPUT TEXTOnline

Translation

Correction

Tool

Rule Refinement

OUTPUT TEXT


TCTool can help improve

• Rule-based MT (grammar, lexicon, LM)

• EBMT (examples, lexicon, alignments)

• Statistical MT (lexicon, and alignments)+ relevant annotated data

develop smarter training algorithmsPanel this afternoon 3:45-4:45pm

Work in Progress

• Finalizing regression and test sets to perform rigorous evaluation of approach

• Handling incorrect Correction Instances– Have multiple users correct the same set of

sentences

filter out noise (threshold: 90% users agree)

• User study with multiple users

evaluate improvement after refinements

Contributions so far

• An efficient online GUI to display translations and alignments and solicit pinpoint fixes from non-expert bilingual users.

• New Framework to improve MT quality: an expandable set of rule refinement operations

• MT Evaluation Framework, which provides relevant reference translations

Future work

• Explore other ways to make the interaction with users more fun

• Games with a purpose [Von Ahn and Blum, 2004 & 2006]

Future work

• Explore other ways to make the interaction with users more fun

• Games with a purpose [Von Ahn and Blum, 2004 & 2006]

• Second Language Learning

Questions?

Thanks!

MT + MT=

Backup slides• TCTool next step • What if users corrections are different (user n

oise)?• More than one correction per sentence?

• Wc example

• Data set• Where is appropriate to refine vs bifurcate?• Lexical bifurcate• Is the refinement process invariable?

Backup slides (ii)

• TCTool Demo Simulation

• RR operation patterns

• Automatic Evaluation feasibility study

• AMTA paper results

• User studies map

• Precision, recall, F1

• NIST, BLEU, METEOR

Backup slides (iii)

• Done? Not yet… (blocking constraints++)• Minimal pair example• Batch mode implementation• Interactive mode implementation• User studies• Evaluation of refined output• Another Generalization Example• Avenue Architecture• Technical Challenges• Rules Formalism

Player = The Teacher

• Goal: teach the MT system to translate correctly

• Different levels of expertise– Beginner: language learning and improving

(labeled data)– Intermediate-Advanced: provide labeled data

to improve the system and for beginner levels

MT

Two players: cooperation

• In English to Spanish MT:

– Player 1: Spanish native speaker + learning English

– Player 2: English native speaker + learning Spanish

MT

back to mainback to main

Translation rule example

{NP,8}

NP::NP : [DET ADJ N] -> [DET N ADJ]( (X1::Y1) (X2::Y3) (X3::Y2) ((x0 def) = (x1 def)) (x0 = x3) (y2 = x3) ((y1 agr) = (y2 agr)) ((y3 agr) = (y2 agr)))


{NP,8} SL side (English) TL side (Spanish) ;;X0 Y0 X1 X2 X3 Y1 Y2 Y3NP::NP : [DET ADJ N] -> [DET N ADJ]( (X1::Y1) (X2::Y3) (X3::Y2) ((x0 def) = (x1 def)) (x0 = x3) (y2 = x3) ((y1 agr) = (y2 agr)) ((y3 agr) = (y2 agr)))

Rule ID

;; analysis;; transfer;; transfer;; generation;; generation


{NP,8};;X0 Y0 X1 X2 X3 Y1 Y2 Y3 NP::NP : [DET ADJ N] -> [DET N ADJ]( (X1::Y1) (X2::Y3) (X3::Y2) ((x0 def) = (x1 def)) ; passing definiteness up (x0 = x3) ; X3 is the head of X0 (y2 = x3) ; pass features of head to

Y2 ((y1 agr) = (y2 agr)) ; det-noun agreement ((y3 agr) = (y2 agr)) ; adj-noun agreement) back to mainback to main

Done? Not yetNP,8 (R0) ADJ(grande)

[feat1 = -]


[feat1 =c +] [feat1 = +]

Need to restrict application of general rule (R0) to just post-nominal ADJ


un artista grande

un artista gran

un gran artista

*un grande artista

Add Blocking ConstraintNP,8 (R0) ADJ(grande) [feat1 = -] [feat1 = -]


[feat1 =c +] [feat1 = +]

Can we also eliminate incorrect translations automatically?


un artista grande

*un artista gran

un gran artista

*un grande artista

Making the grammar tighter

• If Wc = artista Add [feat1= +] to N(artista) Add agreement constraint to NP,8 (R0)

between N and ADJ ((N feat1) = (ADJ feat1))


*un artista grande

*un artista gran

un gran artista

*un grande artista


Example Requiring Minimal Pair


1. Run SL sentence through the transfer engine

I see them *veo los Correct TL: los veo

2. Wi = los but no Wi’ nor Wc

Need a minimal pair to determine appropriate refinement:

I see cars veo autos

3. Triggering feature(s): (veo los, veo autos)

(los,autos) = {pos}

PRON(los)[pos=pron] N(autos)[pos=n]

Proposed Work


Avenue Architecture

Learning

Module

Learned Transfer

Rules

Lexical Resources

Run Time Transfer System

Decoder

Translation

Correction

Tool

Word-Aligned Parallel Corpus

Elicitation Tool

Elicitation Corpus

Elicitation Rule Learning

Run-Time System

Rule Refinement

Rule

Refinement

Module

Morphology

Morphology Analyzer

Learning Module Handcrafted

rules

INPUT TEXT

OUTPUT TEXT


Technical Challenges

Elicit minimal MT information from non-expert users

Automatically Refine and Expand

Translation Rules minimally

Manually written Automatically Learned

Automatic Evaluation of Refinement process


Batch Mode Implementation• Given a set of user corrections, apply

refinement module.

• For Refinement Operations of errors that can be refined fully automatically using:

1. Correction information only

2. Correction and error information

Proposed Work Automatic Rule Adaptation

error type, clue word

Rule Refinement Operations

Modify Add Delete Change W Order

+Wc –Wc +Wc –Wc +al –al Wi Wc Wi(…) Wc –Wc

= +rule –rule +al –al =Wi WiWi’ RuleLearner

POSi=POSi’ POSiPOSi’






1. Correction info only

It is a nice house – *Es una casa bonito Es una casa bonita

Gaudi was a great artist – *Gaudi era un artista grande Gaudi era un gran artista






I am proud of you – *Estoy orgullosa de tu Estoy orgullosa de ti

PP PREP NP

2. Correction and Error info


Interactive Mode Implementation

• Extra error information is required to determine triggering context automatically

Need to give other relevant sentences to the user at run-time (minimal pairs)

• For Refinement Operations of errors that can be refined fully automatically but:

3. require a further user interaction

Proposed Work Automatic Rule Adaptation






3. Further user interaction

I see them – *Veo los Los veo


Refining and Adding ConstraintsVP,3: VP NP VP NP (veo los, veo autos)

VP,3’: VP NP NP VP + [NP pos =c pron](los veo, *autos veo)

• Percolate triggering features up to the constituent level:

NP: PRON PRON + [NP pos = PRON pos]

• Block application of general rule (VP,3):

VP,3: VP NP VP NP + [NP pos = (*NOT* pron)]

*veo los, veo autos (los veo, *autos veo)

Proposed Work

User Studies• TCTool: new MT classification (Eng2Spa)

• Different language pair Mapudungun or Quechua Spanish

• Batch vs Interactive mode

• Amount of information elicitedjust corrections vs + error information

Proposed Work


Evaluation of Refined MT Output

1. Evaluate best translation Automatic evaluation metrics (BLEU, NIST, METEOR)

2. Evaluate translation candidate list size precision (includes parsimony)

1. Evaluate Best translationHypothesis file (translations to be evaluated

automatically)

Raw MT output:– Best sentence (picked by user to be correct or

requiring the least amount of correction) Refined MT output:– Use METEOR score at sentence level to pick best

candidate from the list

Run all automatic metrics on the new hypothesis file using user corrections as reference translations.

2. Evaluate Translation Candidate List

• Precision: tp binary {0,1} (1 =user correction)

tp + fp total number of TLs

SLTL XXTL XXTL XX

SLTLTL XXTL XXTL XXTL XX

SLTLTL XXTL XXTL XXTL

SLTLTL XXTL XX

=

0/30/3 1/51/5 1/51/5 1/31/3

user correction

SLTLTL XX

= 1/21/2back to mainback to main

Generalization PowerWhen triggering feature already exists in

the feature language (pos, gender, number, etc.)

I see them *veo los los veo

- I love him lo amo (before: *amo lo)

- They called me yesterday me llamaron ayer (before: *llamaron me ayer)

- Mary helps her with her homework

Maria le ayuda con sus tareas (before: *Maria ayuda le con sus tareas)


Precision, Recall and F1

• Precision: tp

tp + fp (selected, incorrect)

• Recall: tp

tp + fn (correct, not selected)

• F1: 2 PR

(P + R)back to main

Automatic Evaluation Metrics

• BLEU: averages the precision for unigram, bigram and up to 4-grams and applies a length penalty [Papineni, 2001].

• NIST: instead of n-gram precision the information gain from each n-gram is taken into account [NIST 2002].

• METEOR: assigns most of the weight to recall, instead of precision and uses stemming [Lavie, 2004] back to main

Data Set

• Split development set (~400 sentence) into: – Dev set Run User Studies

Develop Refinement Module Validate functionality

– Test set Evaluate effect of Refinement operations

• + Wild test set (from naturally occurring text)

Requirement: need to be fully parsed by grammar

Proposed Work

back to questionsback to questions

Refine vs Bifurcate

• Batch mode bifurcate (no way to tell if the original rule should never apply)

• Interactive mode refine (change original rule) if can get enough evidence that original rule never applies.

• Corrections involving agreement constraints seem to hold for all cases refine (Open research question)


More than one correction/sentence

A B TL X X

A 1st X

B 2nd X

Tetris approach to Automatic Rule Refinement

Assumption: different corrections to different words different error back to questionsback to questions

Exception: structural divergencesHe danced her out of the room

* La bailó fuera de la habitación

her he-danced out of the room

La sacó de la habitación bailando

her he-take-out of the room dancing

Have no way of knowing that these corrections are related

Do one error at a time, if TQ decreases over the test set, hypothesize that it’s a divergence Feed to the Rule Learner as a new (manually corrected)

training exampleback to questions

Constituent order change

I gave him the tools *di a él las herramientas

I-gave to him the tools

le di las herramientas a él him I-gave the tools to him

desired refinement:VPVP PP(a PRON) NP VPVP NP PP(a PRON)

Can extract constituent information from MT output (parse tree) and treat as one error/correction

More than one correction/error

A+B

TL X

Example: edit and move same word (like: gran, bailó)

Occam’s razor

Assumption: both corrections are part of the same error

back to questions

Wc ExampleI am proud of you

*Estoy orgullosa de tu Wc=de tuti

I-am proud of you-nom

Estoy orgullosa de tiI-am proud of you-oblic

Without Wc information, would need to increase the ambiguity of the grammar significantly!

+[you ti]: I love you *ti quiero (te quiero,…)

you read *ti lees (tu lees,…)


Lexical bifurcate

• Should the system copy all the features to the new entry?– Good starting point– Might want to copy just a subset of features

(possibly POS-dependent)

Open Research Question



Automatic Rule AdaptationSL + best TL picked by user

Automatic Rule AdaptationChanging “grande” into “gran”


back to main

1

2

3


Input to RR module

sl: I see them

tl: VEO LOS

tree: <((S,0 (VP,3 (VP,1 (V,1:2 "VEO") )

(NP,0 (PRON,2:3 "LOS") ) ) ) )>

- User correction log file - Transfer engine output (+ parse tree):


sl: I see cars

tl: VEO AUTOS

tree: <((S,0 (VP,3 (VP,1 (V,1:2 "VEO") )

(NP,2 (N,1:3 “AUTOS") ) ) ) )>back to main

Types of RR Operations• Grammar:

– R0 R0 + R1 [=R0’ + constr] Cov[R0] Cov[R0,R1]– R0 R1[=R0 + constr= -]

R2[=R0’ + constr=c +] Cov[R0] Cov[R1,R2]

– R0 R1 [=R0 + constr] Cov[R0] Cov[R1]

• Lexicon– Lex0 Lex0 + Lex1[=Lex0 + constr] – Lex0 Lex1[=Lex0 + constr]– Lex0 Lex0 + Lex1[Lex0 + TLword] Lex1 (adding lexical item)

bifurcate

refine

Completed Work Automatic Rule Adaptation

back to main

Manual vs Learned Grammars[AMTA 2004]


NIST BLEU METEOR

Manual grammar 4.3 0.16 0.6

Learned grammar 3.7 0.14 0.55

• Manual inspection:

• Automatic MT Evaluation:

back to main

Human Oracle experiment• As a feasibility experiment, compared

raw output with manually corrected MT:

statistically significant (confidence interval test)

• These is an upper-bound on how much difference we should expect any refinement approach to make.


Completed Work

back to main

Invariable process?• Rule Refinement process is not invariable

Order of the corrected sentences input to the system matters

• Example:1st: gran artista bifurcate (2 rules)2nd: casa bonito add agr constraint to only 1 rule

(original, general rule)

the specific rule is still incorrect (missing agr constraint)

1st: casa bonito add agr constraint 2nd: gran artista bifurcate

both rules have agreement constraint (optimal order)


User noise?

Solution:

Have several users evaluate and correct the same test set

threshold, 90% agreement

- correction

- error information (type, clue word)

Only modify the grammar if enough evidence of incorrect rule.

back to questions

User Studies Map

RR moduleManual grammars

Learned grammars

Eng2spa

Only Corrections

Corrections+error info

Active learning

Batch mode

interactive mode

Proposed Work

XX

XX

Mapu2SpaXX

back to main

Recycle corrections of Machine Translation output back into the

system by refining and expanding

existing translation rules







It is a nice house – Es una casa bonito Es una casa bonita

John and Mary fell – Juan y Maria cayeron Juan y Maria se cayeron

Gaudi was a great artist – Gaudi era un artista grande Gaudi era un gran artista






Gaudi was a great artist – Gaudi era un artista grande Gaudi era un gran artista

Es una casa bonito Es una casa bonita

J y M cayeron J y M se cayeron

I will help him fix the car – Ayudaré a él a arreglar el auto

Le ayudare a arreglar el auto







I will help him fix the car – Ayudaré a él a arreglar el auto

Le ayudare a arreglar el auto

I would like to go – Me gustaria que ir

Me gustaria ir







I am proud of you – Estoy orgullosa tu Estoy orgullosa de ti

PP PREP NP

2. Correction and Error info






Focus 3

Wally plays the guitar – Wally juega la guitarra Wally toca la guitarra

I saw the woman – Vi la mujer Vi a la mujer

I see them – Veo los Los veo






Outside Scope of Thesis

John read the book – A Juan leyó el libro Juan leyó el libro

Where are you from? – Donde eres tu de? De donde eres tu?

Documents

Automating Post-Editing To Improve MT Systems Ariadna Font Llitjós and Jaime Carbonell APE Workshop, AMTA – Boston August 12, 2006