Advances in Computer Aided Translation: Beyond Post-Editing
Philipp Koehn, 31 October 2015
Source: phi/papers/mt-summit-2015-invited-talk.pdf


Advances in Computer Aided Translation

Beyond Post-Editing

Philipp Koehn

31 October 2015

Philipp Koehn Computer Aided Translation 31 October 2015


1 Overview

• Post-editing

• Richer information

– word alignment
– confidence scores
– translation option array
– bilingual concordancer
– paraphraser

• Interactive translation prediction

• Model adaptation

• Logging, eye tracking, and user studies

• CASMACAT Home Edition


2 Postediting Interface

• Screenshot from CASMACAT post-editing mode (same as Matecat)

• Source on left, translation on right / context above and below


3 Productivity Improvements

(source: Autodesk)


4 MT Quality and Productivity

• What is the relationship between MT quality and postediting speed?

• One study (English–German, news translation, non-professionals)

System          Speed                   Metric
                sec./wrd.   wrds./hr.   bleu    manual
online-b        5.46        659         20.7    0.637
uedin-syntax    5.38        669         19.4    0.614
uedin-phrase    5.45        661         20.1    0.571
uu              6.35        567         16.1    0.361
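
The two speed columns are two views of the same measurement: words per hour is 3600 divided by seconds per word. A minimal Python sketch that recomputes the wrds./hr. column from the sec./wrd. column; the function name is illustrative, and rounding to the nearest integer is an assumption that happens to match the reported figures.

```python
def words_per_hour(sec_per_word: float) -> int:
    """Convert post-editing speed from seconds per word to words per hour."""
    return round(3600 / sec_per_word)


# Seconds per word as reported in the table.
speeds = {
    "online-b": 5.46,
    "uedin-syntax": 5.38,
    "uedin-phrase": 5.45,
    "uu": 6.35,
}

for system, spw in speeds.items():
    print(f"{system:13s} {spw:.2f} sec/word = {words_per_hour(spw)} words/hour")
```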


5 Translator Variability

• Translators differ in

– ability to translate
– motivation to fix minor translation errors

• High variance in translation time (again: non-professionals)

Post-editor   Speed
              sec./wrd.   wrds./hr.
1             3.03        1,188
2             4.78        753
3             9.79        368
4             5.05        713


6 Overview

• Post-editing

• Richer information

– word alignment
– confidence scores
– translation option array
– bilingual concordancer
– paraphraser

• Interactive translation prediction

• Model adaptation

• Logging, eye tracking, and user studies

• CASMACAT Home Edition


7 Word Alignment

• Caret alignment (green)

• Mouse alignment (yellow)


8 Confidence Measures

• Sentence-level confidence measures
→ estimate usefulness of machine translation output

• Word-level confidence measures
→ point posteditor to words that need to be changed


9 Translation Option Array

• Visual aid: non-intrusive provision of cues to the translator

• Clickable: click on target phrase
→ added to edit area

• Automatic orientation

– most relevant is next word to be translated
– automatic centering on next word


10 Enabling Monolingual Translators

• Monolingual translator

– wants to understand a foreign document

– has no knowledge of foreign language

– uses a machine translation system

• Questions

– Is current MT output sufficient for understanding?

– What else could be provided by an MT system?


11 Example

• MT system output:

The study also found that one of the genes in the improvement in people with prostate cancer risk, it also reduces the risk of suffering from diabetes.

• What does this mean?

• Monolingual translator:

The research also found that one of the genes increased people’s risk of prostate cancer, but at the same time lowered people’s risk of diabetes.

• Document context helps


12 Example: Arabic

up to 10 translations for each word / phrase


13 Example: Arabic


14 Bilingual Concordancer


17 Verification of Terminology

• Translation of German Windkraft

• Context shows when each translation is used

• Indication of source supports trust in translations


18 Paraphrasing

• User marks part of translation

• Clicks on paraphrasing button

• Alternative translations appear


19 Overview

• Post-editing

• Richer information

– word alignment
– confidence scores
– translation option array
– bilingual concordancer
– paraphraser

• Interactive translation prediction

• Model adaptation

• Logging, eye tracking, and user studies

• CASMACAT Home Edition


20 Interactive Translation Prediction


21 Shade Off Translated

• Word alignment visualization for interactive translation prediction

• Shade off words that are already translated

• Highlight words aligned to first predicted translation word


22 Visualization

• Show n next words

• Show rest of sentence


23 Spence Green’s Lilt System

• Show alternate translation predictions

• Show alternate translation predictions with probabilities


24 Prediction from Search Graph

[Figure: search graph over the words he, it, has, planned, for, since, months]

Search for best translation creates a graph of possible translations


25 Prediction from Search Graph

[Figure: search graph over the words he, it, has, planned, for, since, months]

One path in the graph is the best (according to the model)

This path is suggested to the user


26 Prediction from Search Graph

[Figure: search graph over the words he, it, has, planned, for, since, months]

The user may enter a different translation for the first words

We have to find it in the graph


27 Prediction from Search Graph

[Figure: search graph over the words he, it, has, planned, for, since, months]

We can predict the optimal completion (according to the model)
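
The steps on the search graph slides can be sketched as a small graph search. The following Python sketch is an illustration under stated assumptions, not the CASMACAT implementation: the graph topology, the log-probability scores, and the names GRAPH, best_completion, and predict are all invented for this example. Only exact prefix matches are handled; a user prefix that is not in the graph would need some form of approximate matching, which is not shown.

```python
from functools import lru_cache

# Toy search graph: node -> list of (word, log-probability, next node).
# Topology and scores are hypothetical, chosen only for illustration.
GRAPH = {
    0: [("he", -0.5, 1), ("it", -1.2, 1)],
    1: [("has", -0.3, 2), ("planned", -1.5, 3)],
    2: [("planned", -0.4, 3)],
    3: [("for", -0.6, 4), ("since", -0.9, 4)],
    4: [("months", -0.2, 5)],
    5: [],  # final node
}
FINAL = 5


@lru_cache(maxsize=None)
def best_completion(node):
    """Return (score, words) of the best-scoring path from node to FINAL."""
    if node == FINAL:
        return 0.0, ()
    best = (float("-inf"), ())
    for word, score, nxt in GRAPH[node]:
        rest_score, rest_words = best_completion(nxt)
        candidate = (score + rest_score, (word,) + rest_words)
        if candidate[0] > best[0]:
            best = candidate
    return best


def predict(prefix):
    """Follow the user's prefix through the graph, then complete it optimally.

    Returns the predicted remaining words, or None if the prefix
    cannot be matched exactly in the graph.
    """
    nodes = {0}
    for word in prefix:
        nodes = {nxt for n in nodes for w, _, nxt in GRAPH[n] if w == word}
        if not nodes:
            return None
    return list(max(best_completion(n) for n in nodes)[1])


print(predict([]))      # best path overall, suggested to the user
print(predict(["it"]))  # user typed a different first word
```

Caching the best completion per node means each node is solved once, so re-predicting after every keystroke stays cheap even when the prefix changes.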


28 Overview

• Post-editing

• Richer information

– word alignment
– confidence scores
– translation option array
– bilingual concordancer
– paraphraser

• Interactive translation prediction

• Model adaptation

• Logging, eye tracking, and user studies

• CASMACAT Home Edition


29 Adaptation

• Machine translation works best if optimized for domain

• Typically, large amounts of out-of-domain data available

– European Parliament, United Nations
– unspecified data crawled from the web

• Little in-domain data (maybe 1% of total)

– information technology data
– more specific: IBM’s user manuals
– even more specific: IBM’s user manual for same product line from last year
– and even more specific: sentence pairs from current project

• Various domain adaptation techniques researched and used


30 Combining Data

[Figure: Combined Domain Model]

• Too biased towards out-of-domain data

• May flag translation options with indicator feature functions


31 Interpolate Models

[Figure: In-Domain Model, Out-of-Domain Model]

• $p_c(e \mid f) = \lambda_{\text{in}}\, p_{\text{in}}(e \mid f) + \lambda_{\text{out}}\, p_{\text{out}}(e \mid f)$

• Quite successful for language modelling
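
The interpolation formula can be written out directly. This is a minimal Python sketch: the function name, the example probabilities, and the weight 0.7 are hypothetical, and setting the out-of-domain weight to one minus the in-domain weight is the usual normalization constraint, assumed here rather than stated on the slide.

```python
def interpolate(p_in, p_out, lambda_in=0.7):
    """Linearly interpolate two distributions p(e|f) for one source phrase f.

    p_in, p_out: dicts mapping a target phrase e to its probability.
    lambda_in: weight of the in-domain model (hypothetical value);
    lambda_out = 1 - lambda_in keeps the result a probability distribution.
    """
    lambda_out = 1.0 - lambda_in
    targets = set(p_in) | set(p_out)
    return {
        e: lambda_in * p_in.get(e, 0.0) + lambda_out * p_out.get(e, 0.0)
        for e in targets
    }


# Invented translation probabilities for the German phrase "Windkraft".
p_in = {"wind power": 0.8, "wind energy": 0.2}
p_out = {"wind power": 0.5, "wind force": 0.5}
print(interpolate(p_in, p_out))
```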


32 Multiple Models

[Figure: In-Domain Model and Out-of-Domain Model, use both]

• Multiple models → multiple feature functions


33 Backoff

[Figure: In-Domain Model, Out-of-Domain Model]

Look up phrase in the in-domain model

If found, return

If not found, look up phrase in the out-of-domain model

If found, return
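
The backoff lookup can be sketched in a few lines of Python. The table layout (a dict from source phrase to a dict of target phrases with probabilities) and the function name are assumptions made for this illustration.

```python
def backoff_lookup(phrase, in_domain, out_of_domain):
    """Return translation options for phrase, preferring the in-domain table.

    The out-of-domain table is consulted only when the in-domain table
    has no entry at all for the phrase.
    """
    if phrase in in_domain:
        return in_domain[phrase]
    if phrase in out_of_domain:
        return out_of_domain[phrase]
    return {}


# Hypothetical phrase tables: source phrase -> {target phrase: probability}.
in_domain = {"Windkraft": {"wind power": 0.8, "wind energy": 0.2}}
out_of_domain = {
    "Windkraft": {"wind force": 1.0},
    "Haus": {"house": 0.9, "home": 0.1},
}

print(backoff_lookup("Windkraft", in_domain, out_of_domain))
print(backoff_lookup("Haus", in_domain, out_of_domain))
```

With backoff, an out-of-domain entry for a phrase is ignored entirely once the in-domain table knows the phrase.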


34 Fill-Up

[Figure: translations for phrase f from the In-Domain Model and the Out-of-Domain Model, merged into a Combined Domain Model]

• Use translation options from in-domain table

• Fill up with additional options from out-of-domain table
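
The two bullets above can be sketched as a merge of two option lists. This Python sketch assumes a dict-based table layout; the function name and the provenance flag attached to each option are additions made for illustration, not something the slide specifies.

```python
def fill_up(phrase, in_domain, out_of_domain):
    """Combine translation options for one source phrase.

    All in-domain options are kept; out-of-domain options are added only
    for target phrases the in-domain table does not already offer.
    Returns target phrase -> (probability, came_from_in_domain).
    """
    combined = {e: (p, True) for e, p in in_domain.get(phrase, {}).items()}
    for e, p in out_of_domain.get(phrase, {}).items():
        if e not in combined:
            combined[e] = (p, False)
    return combined


# Hypothetical tables: source phrase -> {target phrase: probability}.
in_table = {"Windkraft": {"wind power": 0.8}}
out_table = {
    "Windkraft": {"wind power": 0.5, "wind energy": 0.3},
    "Haus": {"house": 0.9},
}

print(fill_up("Windkraft", in_table, out_table))
print(fill_up("Haus", in_table, out_table))
```

The difference from backoff is at the level of individual translation options: a phrase known to the in-domain table can still gain extra options from the out-of-domain table.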


35 Sentence Selection

[Figure: Combined Domain Model]

• Select out-of-domain sentence pairs that are similar to in-domain data

• Score similarity with language model, other means
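
One common way to score similarity with a language model is the difference in cross-entropy under an in-domain and an out-of-domain model. The slide does not say which score is meant, so the Python sketch below is an assumption: it uses add-one smoothed unigram models, scores only one side of each sentence pair, and all function names and example sentences are invented. Candidates are assumed to be non-empty.

```python
import math
from collections import Counter


def train_unigram(sentences):
    """Return an add-one smoothed unigram log-probability function."""
    counts = Counter(w for s in sentences for w in s.split())
    total = sum(counts.values())
    vocab = len(counts) + 1  # one extra slot reserves mass for unseen words

    def logprob(word):
        return math.log((counts[word] + 1) / (total + vocab))

    return logprob


def cross_entropy(sentence, logprob):
    """Per-word negative log-probability of a sentence under a unigram model."""
    words = sentence.split()
    return -sum(logprob(w) for w in words) / len(words)


def select(candidates, in_domain, out_of_domain, keep):
    """Keep the `keep` candidates that look most like the in-domain data.

    A lower cross-entropy difference (in-domain minus out-of-domain)
    means the sentence is more in-domain-like.
    """
    lp_in = train_unigram(in_domain)
    lp_out = train_unigram(out_of_domain)
    ranked = sorted(
        candidates,
        key=lambda s: cross_entropy(s, lp_in) - cross_entropy(s, lp_out),
    )
    return ranked[:keep]


in_domain = ["install the printer driver", "restart the printer"]
out_of_domain = ["the parliament adopted the resolution",
                 "the council voted on the budget"]
candidates = ["the driver update failed", "the parliament voted"]
print(select(candidates, in_domain, out_of_domain, keep=1))
```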


36 Project Adaptation

• Method developed by the Matecat project

• Update model during translation project

• After each day

– collect translated sentences

– add to model

– optimize

• Main benefit after the first day


37 Incremental Updating

[Figure, built up over slides 37–39: Machine Translation, Postediting, Retraining]


40 Adaptable Translation Model

• Store in memory

– parallel corpus
– word alignment

• Adding new sentence pair

– word alignment of sentence pair
– add sentence pair
– update index (suffix array)

• Retrieve phrase translations on demand
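
The idea on this slide can be sketched as an in-memory corpus with a sorted suffix index. The Python below is a simplified illustration, not the actual implementation: the class and method names are invented, the index stores copies of token suffixes instead of positions, and the lookup returns the target words aligned to each match instead of performing full phrase extraction.

```python
import bisect


class AdaptableModel:
    """In-memory parallel corpus with a sorted suffix index over the source side."""

    def __init__(self):
        self.pairs = []     # (source tokens, target tokens, set of (i, j) links)
        self.suffixes = []  # sorted list of (source suffix, sentence id, position)

    def add(self, source, target, alignment):
        """Add a word-aligned sentence pair and update the index."""
        sid = len(self.pairs)
        src, tgt = source.split(), target.split()
        self.pairs.append((src, tgt, set(alignment)))
        for pos in range(len(src)):
            # insort keeps the list sorted, so the new pair is searchable at once.
            bisect.insort(self.suffixes, (src[pos:], sid, pos))

    def lookup(self, phrase):
        """Return the target words aligned to each occurrence of a source phrase."""
        query = phrase.split()
        start = bisect.bisect_left(self.suffixes, (query,))
        results = []
        for suffix, sid, pos in self.suffixes[start:]:
            if suffix[:len(query)] != query:
                break  # matching suffixes are contiguous in sorted order
            _, tgt, alignment = self.pairs[sid]
            span = range(pos, pos + len(query))
            aligned = sorted({j for i, j in alignment if i in span})
            results.append(" ".join(tgt[j] for j in aligned))
        return results


model = AdaptableModel()
model.add("das ist ein Haus", "this is a house", [(0, 0), (1, 1), (2, 2), (3, 3)])
model.add("ein Haus am See", "a house by the lake", [(0, 0), (1, 1), (2, 2), (3, 4)])
print(model.lookup("ein Haus"))

# A newly post-edited sentence pair becomes available without retraining.
model.add("das Haus ist alt", "the building is old", [(0, 0), (1, 1), (2, 2), (3, 3)])
print(model.lookup("Haus"))
```

Because translations are retrieved on demand from the corpus, adding a sentence pair only touches the index; no phrase table has to be rebuilt.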


41 Bias Towards User Translation

• Cache-based models

• Language model

→ give bonus to n-grams in previous user translation

• Translation model

→ give bonus to translation options in previous user translation

• Decaying score for bonus (less recent, less relevant)
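
A cache with a decaying bonus can be sketched as follows. The slide only says that the bonus decays; the exponential decay, the decay factor of 0.5, and the class and method names in this Python sketch are assumptions made for illustration.

```python
class DecayingCache:
    """Cache of items seen in previous user translations, with a decaying bonus."""

    def __init__(self, decay=0.5):
        self.decay = decay  # hypothetical decay factor per confirmed sentence
        self.age = {}       # item -> number of sentences since it was last seen

    def update(self, items):
        """Register the n-grams or translation options of a newly confirmed sentence."""
        for item in self.age:
            self.age[item] += 1
        for item in items:
            self.age[item] = 0

    def bonus(self, item):
        """Return a bonus in (0, 1] for cached items and 0.0 for everything else."""
        if item not in self.age:
            return 0.0
        return self.decay ** self.age[item]


cache = DecayingCache()
cache.update([("wind", "power")])
cache.update([("solar", "energy")])
print(cache.bonus(("solar", "energy")))  # most recent sentence
print(cache.bonus(("wind", "power")))    # one sentence older
print(cache.bonus(("wind", "force")))    # never used by the translator
```

The same cache can serve the language model (items are n-grams) and the translation model (items are source and target phrase pairs).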


42 Overview

• Post-editing

• Richer information

– word alignment
– confidence scores
– translation option array
– bilingual concordancer
– paraphraser

• Interactive translation prediction

• Model adaptation

• Logging, eye tracking, and user studies

• CASMACAT Home Edition


43 How do we Know it Works?

• Intrinsic Measures

– word level confidence: user does not change words generated with certainty
– interactive prediction: user accepts suggestions

• User Studies

– professional translators faster with post-editing
– ... but like interactive translation prediction better

• Cognitive studies with eye tracking

– where is the translator looking?
– what causes the translator to be slow?


44 Keystroke Log

Input: Au premier semestre, l’avionneur a livré 97 avions.
Output: The manufacturer has delivered 97 planes during the first half.

(37.5 sec, 3.4 sec/word)

black: keystroke, purple: deletion, grey: cursor move
height: length of sentence


45 Unassisted Novice Translators

L1 = native French, L2 = native English, average time per input word

only typing


46 Unassisted Novice Translators

L1 = native French, L2 = native English, average time per input word

typing, initial and final pauses


47 Unassisted Novice Translators

L1 = native French, L2 = native English, average time per input word

typing, initial and final pauses, short, medium, and long pauses
most time difference on intermediate pauses


48 Activities: Native French User L1b

User: L1b            total   init-p   end-p   short-p   mid-p   big-p   key    click   tab
Unassisted           7.7s    1.3s     0.1s    0.3s      1.8s    1.9s    2.3s   -       -
Postedit             4.5s    1.5s     0.4s    0.1s      1.0s    0.4s    1.1s   -       -
Options              4.5s    0.6s     0.1s    0.4s      0.9s    0.7s    1.5s   0.4s    -
Prediction           2.7s    0.3s     0.3s    0.2s      0.7s    0.1s    0.6s   -       0.4s
Prediction+Options   4.8s    0.6s     0.4s    0.4s      1.3s    0.5s    0.9s   0.5s    0.2s
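
A breakdown like the one in this table can be derived from a keystroke log by bucketing the gaps between consecutive user actions. The Python sketch below is only an illustration: the slides do not state where the boundaries between typing, short, medium, and big pauses lie, so the thresholds of 2, 6, and 60 seconds are assumptions, as are the function name and the example timestamps. Clicks and tab presses are not distinguished.

```python
def classify_pauses(timestamps, start, end, short=2.0, medium=6.0, big=60.0):
    """Split the time spent on one segment into typing and pause categories.

    timestamps: sorted times (in seconds) of user actions within the segment.
    start, end: times at which the segment was opened and closed.
    short, medium, big: assumed lower bounds of the three pause classes.
    """
    totals = {"init-p": 0.0, "end-p": 0.0, "short-p": 0.0,
              "mid-p": 0.0, "big-p": 0.0, "key": 0.0}
    if not timestamps:
        totals["init-p"] = end - start
        return totals
    totals["init-p"] = timestamps[0] - start
    totals["end-p"] = end - timestamps[-1]
    for prev, cur in zip(timestamps, timestamps[1:]):
        gap = cur - prev
        if gap >= big:
            totals["big-p"] += gap
        elif gap >= medium:
            totals["mid-p"] += gap
        elif gap >= short:
            totals["short-p"] += gap
        else:
            totals["key"] += gap  # gaps below the short threshold count as typing
    return totals


# Invented log: five actions in a segment that was open for 15 seconds.
print(classify_pauses([1.0, 1.5, 4.5, 12.5, 13.0], start=0.0, end=15.0))
```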


49 Activities: Native French User L1b

(table as on slide 48)

Callout: Slightly less time spent on typing


50 Activities: Native French User L1b

(table as on slide 48)

Callouts: Slightly less time spent on typing; Less pausing


51 Activities: Native French User L1b

(table as on slide 48)

Callouts: Slightly less time spent on typing; Less pausing; Especially less time in big pauses


52 Origin of Characters: Native French L1b

User: L1b            key    click   tab    mt
Postedit             18%    -       -      81%
Options              59%    40%     -      -
Prediction           14%    -       85%    -
Prediction+Options   21%    44%     33%    -


53 Origin of Characters: Native French L1b

(table as on slide 52)

Callout: Translation comes to a large degree from assistance


54 Eye Tracking

• Eye trackers extensively used in cognitive studies of, e.g., reading behavior

• Overcomes weakness of key logger: what happens during pauses

• Fixation: where is the focus of the gaze

• Pupil dilation: indicates degree of concentration


55 Eye Tracking Chart

focus on target word (green) or source word (blue) at position x


56 Cognitive Studies: User Styles

• User style 1: Verifies translation just based on the target text, reads source text to fix it


57 Cognitive Studies: User Styles

• User style 2: Reads source text first, then target text


58 Cognitive Studies: User Styles

• User style 3: Makes corrections based on target text only


59 Cognitive Studies: User Styles

• User style 4: As style 1, but also considers previous segment for corrections


60 Backtracking

• Local backtracking

– immediate repetition

– local alternation

– local orientation

• Long-distance backtracking

– long-distance alternation

– text final backtracking

– in-text long distance backtracking


61 Overview

• Post-editing

• Richer information

– word alignment
– confidence scores
– translation option array
– bilingual concordancer
– paraphraser

• Interactive translation prediction

• Model adaptation

• Logging, eye tracking, and user studies

• CASMACAT Home Edition


62 CASMACAT

[Figure: architecture with GUI (JavaScript), web server (PHP), CAT server (Python), and MT server (Python), connected via web socket and HTTP]

• European research project 2011-2014

• All described methods implemented in CASMACAT workbench

– builds on Matecat open source implementation
– typical web application: LAMP (Linux, Apache, MySQL, PHP)
– uses model, view, controller breakdown

• Workbench freely available at http://www.casmacat.eu/


63 Home Edition

• Running CASMACAT on your desktop or laptop

• Installation

– installation of software to run virtual machines (e.g., VirtualBox)

– installation of Linux distribution (e.g., Ubuntu)

– installation script sets up all the required software and dependencies


64 Administration through Web Browser


65 Training MT Engines

• Train MT engine on own or public data


66 Thank You

questions?
