CS460/626 : Natural Language Processing/Speech, NLP and the Web
(Lecture 10, 11 – MT approaches)
Pushpak Bhattacharyya
CSE Dept., IIT Bombay
25th Jan and 27th Jan, 2011
Acknowledgement: parts are from Hansraj's dual degree seminar presentation
Czech-English data
[nesu] "I carry"        [ponese] "He will carry"
[nese] "He carries"     [nesou] "They carry"
[yedu] "I drive"        [plavou] "They swim"

To translate ...
I will carry. They drive. He swims. They will drive.
Hindi-English data
[DhotA huM] "I carry"       [DhoegA] "He will carry"
[DhotA hAi] "He carries"    [Dhote hAi] "They carry"
[chalAtA huM] "I drive"     [tErte hEM] "They swim"
Bangla-English data
[bai] "I carry"         [baibe] "He will carry"
[bay] "He carries"      [bay] "They carry"
[chAlAi] "I drive"      [sAMtrAy] "They swim"
MT Approaches
interlingua
Example Based MT (EBMT)
Motivation
MT: NLP complete; NLP: AI complete; AI: CS complete
How will the world be different when the language barrier disappears?
Volume of text required to be translated currently exceeds translators' capacity (demand outstrips supply).
Solution: automation (the only solution)
Many machine translation techniques
Which approach is better for Hindi-English MT?
Interlingual representation: complete disambiguation
(interlingua graph, with nodes such as "@past action" and "<is-a> person")
Kinds of disambiguation needed for a complete and correct interlingua graph:
N: Name   P: POS   A: Attachment   S: Sense   C: Co-reference   R: Semantic Role
Target Sentence Generation from interlingua
Target Sentence Generation
Morphological Synthesis
(Word/Phrase Translation)
Washington(agent) ne Washington(object) ko sattaa(goal) ke liye chunaa
Vote -> chunna    Power -> sattaa
Statistical Machine Translation (SMT)
Data driven approach
Goal is to find out the English sentence e, given foreign language sentence f, whose p(e|f) is maximum.
Translations are generated on the basis of a statistical model
Parameters are estimated using bilingual parallel corpora
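The noisy-channel view behind this slide can be sketched numerically. This is a minimal sketch; the candidate sentences and all probability values below are invented for illustration:

```python
# Noisy-channel SMT: pick e maximizing p(e|f), proportional to p(f|e) * p(e).
# All numbers below are toy values, invented for illustration.

# p(f|e): translation model score for the foreign sentence f given e
p_f_given_e = {
    "he carries": 0.6,
    "carries he": 0.6,   # translation model alone may not penalize word order
    "he carry":   0.2,
}

# p(e): language model score -- prefers fluent English
p_e = {
    "he carries": 0.05,
    "carries he": 0.001,
    "he carry":   0.002,
}

def best_translation(candidates):
    # Combine the two models and take the argmax over candidate sentences e.
    return max(candidates, key=lambda e: p_f_given_e[e] * p_e[e])

print(best_translation(["he carries", "carries he", "he carry"]))
# -> "he carries"
```

Note how the language model breaks the tie that the translation model leaves between "he carries" and "carries he".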
SMT: Language Model
To detect good English sentences
Probability of an English sentence s1 s2 ... sn can be written as
Pr(s1 s2 ... sn) = Pr(s1) * Pr(s2|s1) * ... * Pr(sn|s1 s2 ... sn-1)
Here Pr(sn|s1 s2 ... sn-1) is the probability that word sn follows the word string s1 s2 ... sn-1.
N-gram model probability
Trigram model probability calculation
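The trigram calculation can be sketched with maximum-likelihood counts; the three-sentence corpus below is invented for illustration:

```python
from collections import defaultdict

# Trigram model: Pr(s_n | s_1..s_{n-1}) is approximated by Pr(s_n | s_{n-2}, s_{n-1}).
corpus = [
    "he carries water",
    "he carries wood",
    "they carry water",
]

trigram_counts = defaultdict(int)
bigram_counts = defaultdict(int)

for sentence in corpus:
    words = ["<s>", "<s>"] + sentence.split() + ["</s>"]
    for i in range(2, len(words)):
        trigram_counts[(words[i-2], words[i-1], words[i])] += 1
        bigram_counts[(words[i-2], words[i-1])] += 1

def trigram_prob(w1, w2, w3):
    # MLE estimate: count(w1 w2 w3) / count(w1 w2)
    if bigram_counts[(w1, w2)] == 0:
        return 0.0
    return trigram_counts[(w1, w2, w3)] / bigram_counts[(w1, w2)]

def sentence_prob(sentence):
    # Chain rule with the trigram approximation.
    words = ["<s>", "<s>"] + sentence.split() + ["</s>"]
    p = 1.0
    for i in range(2, len(words)):
        p *= trigram_prob(words[i-2], words[i-1], words[i])
    return p

print(trigram_prob("he", "carries", "water"))   # 1/2: "carries" is followed by "water" once out of twice
print(sentence_prob("he carries water"))
```

A real system would add smoothing so that unseen trigrams do not get zero probability.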
SMT: Translation Model
P(f|e): probability of some f given hypothesized English translation e
How to assign the values to P(f|e)?
Sentence level: sentences are infinite, so it is not possible to find the pair (e,f) for all sentences
Word level: introduce a hidden variable a that represents alignments between the individual words in the sentence pair
Alignment
If the string e = e1^l = e1 e2 ... el has l words, and the string f = f1^m = f1 f2 ... fm has m words, then the alignment a can be represented by a series a1^m = a1 a2 ... am of m values, each between 0 and l, such that if the word in position j of the f-string is connected to the word in position i of the e-string, then aj = i, and if it is not connected to any English word, then aj = 0.
Example of alignment
English: Ram went to school
Hindi: Raama paathashaalaa gayaa
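The alignment series a1 ... am from the definition above can be written out for this sentence pair; the index assignments below follow the word correspondences (note that "to" has no Hindi counterpart here):

```python
# Alignment a = a_1 ... a_m: a_j = i means f-word j is connected to e-word i;
# a_j = 0 would mean the f-word is connected to no English word.
e = ["Ram", "went", "to", "school"]          # l = 4, positions 1..4
f = ["Raama", "paathashaalaa", "gayaa"]      # m = 3, positions 1..3

# Raama -> Ram (i=1), paathashaalaa -> school (i=4), gayaa -> went (i=2)
a = [1, 4, 2]

for j, aj in enumerate(a, start=1):
    src = e[aj - 1] if aj > 0 else "NULL"
    print(f"f[{j}] {f[j-1]!r} -> e[{aj}] {src!r}")
```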
Translation Model: Exact expression
The generative decomposition: choose the length m of the foreign language string given e; choose the alignment of each position given e and m; choose the identity of each foreign word given e, m, a.
Five models for estimating parameters in the expression [2]
Model 1, Model 2, Model 3, Model 4, Model 5
Proof of Translation Model: Exact expression
Pr(f|e) = Σ_a Pr(f, a|e)
Pr(f, a|e) = Pr(m|e) ∏_{j=1..m} Pr(aj | a1^{j-1}, f1^{j-1}, m, e) Pr(fj | a1^{j}, f1^{j-1}, m, e)
m is fixed for a particular f
Simplest model (Model 1): Assumptions
Pr(m|e) is independent of m and e, and is equal to ε
Alignment of foreign language words (FLWs) depends only on the length of the English sentence:
Pr(aj | a1^{j-1}, f1^{j-1}, m, e) = (l+1)^-1, where l is the length of the English sentence
The likelihood function will be
Pr(f|e) = ε / (l+1)^m * ∏_{j=1..m} Σ_{i=0..l} t(fj | ei)
Maximize the likelihood function constrained to Σ_f t(f|e) = 1
Model-1: Parameter estimation
Using Lagrange multipliers for constrained maximization, the solution for the Model 1 parameters:
λe: normalization constant; c(f|e; f,e): expected count; δ(f, fj) is 1 if f and fj are the same, zero otherwise.
Estimate t(f|e) using the Expectation Maximization (EM) procedure
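The EM procedure for t(f|e) can be sketched end-to-end on a toy parallel corpus. This is a minimal sketch: the two-sentence corpus is invented, and the NULL word at position 0 is omitted for brevity:

```python
from collections import defaultdict

# Toy parallel corpus (English, foreign), invented for illustration.
corpus = [
    (["the", "house"], ["la", "maison"]),
    (["the"], ["la"]),
]

# Initialize t(f|e) uniformly.
e_vocab = {e for es, _ in corpus for e in es}
f_vocab = {f for _, fs in corpus for f in fs}
t = {(f, e): 1.0 / len(f_vocab) for f in f_vocab for e in e_vocab}

for _ in range(20):                         # EM iterations
    count = defaultdict(float)              # E-step: expected counts c(f|e)
    total = defaultdict(float)
    for es, fs in corpus:
        for f in fs:
            z = sum(t[(f, e)] for e in es)  # normalize over possible alignments
            for e in es:
                delta = t[(f, e)] / z       # expected alignment probability
                count[(f, e)] += delta
                total[e] += delta
    for (f, e) in list(t):                  # M-step: renormalize
        t[(f, e)] = count[(f, e)] / total[e]

# EM pushes mass toward the right word pairs.
print(round(t[("la", "the")], 3), round(t[("maison", "house")], 3))
```

Even though "la" and "maison" both co-occur with "the", the second sentence pair lets EM attribute "la" to "the", which in turn drives t(maison|house) upward.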
Model-2
Alignment of an FLW to an English word depends on its position
The likelihood function is
Model 1 & 2: Model 1 is a special case of Model 2 where the alignment probability is the uniform (l+1)^-1
To instantiate the Model 2 parameters, use the parameters estimated in Model 1
Model-3
Fertility: number of FLWs to which an English word is connected in a randomly selected alignment
Tablet: list of FLWs connected to an English word
Tableau: the collection of tablets
The alignment process, for each English word:
Begin
  Decide the fertility of the word
  Get a list of French words to connect to the word
End
Permute words in the tableau to generate f
Model-3: Example
English Sentence (e) = Annual inflation rises to 11.42%
Step 1: Deciding fertilities (F)
e = Annual inflation rises to 11.42%
F = Annual inflation inflation inflation rises rises rises to 11.42%
Step 2: Translation to FLWs (T)
T = 11.42%
Step 3: Reordering FLWs (R)
R = 11.42%
Values of F, T, R are calculated using the formulas obtained in Model 3 [2]
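The three steps of this example (fertility, translation, reordering) can be sketched as a generative pipeline. Since the target-language words did not survive in these slides, the fertility table and lexicon below are hypothetical stand-ins, not the lecture's actual translations:

```python
import random

# Toy Model-3 style generative story: fertility -> translation -> reordering.
# Fertilities and the foreign lexicon are invented for illustration.
fertility = {"Annual": 1, "inflation": 3, "rises": 3, "to": 0, "11.42%": 1}
lexicon = {
    "Annual":   ["varshik"],
    "inflation": ["mudra", "sfiti", "dar"],
    "rises":    ["badh", "rahi", "hai"],
    "11.42%":   ["11.42%"],
}

def generate(e_sentence, seed=0):
    random.seed(seed)
    # Step 1: fertility -- copy each English word phi(e) times.
    tableau = [[e] * fertility[e] for e in e_sentence]
    # Step 2: translation -- replace each word's copies by foreign words (its tablet).
    tablets = [lexicon[copies[0]][:len(copies)] for copies in tableau if copies]
    # Step 3: reordering -- permute the words of all tablets to produce f.
    f = [w for tablet in tablets for w in tablet]
    random.shuffle(f)
    return f

print(generate(["Annual", "inflation", "rises", "to", "11.42%"]))
```

Note that "to" has fertility 0 and contributes nothing to the foreign string, matching step 1 of the slide.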
Model-4 & 5
Model 3: every word is moved independently
Model 4: considers phrases (cepts) in a sentence; the distortion probability is replaced by
  a parameter for the head of each cept
  a parameter for the remaining part of the cept
Deficiency in Models 3 & 4: the distortion probability can place mass on unavailable positions
Model 5 removes the deficiency:
  avoids unavailable positions
  introduces a new variable for the positions
Example Based Machine Translation (EBMT)
Basic idea: translate a sentence by using the closest match in parallel data
Inspired by human analogical thinking
Issues Related to Examples in Corpora
Granularity of examples: parallel text should be aligned at the sub-sentence level
Number of examples
Suitability of examples:
(i) Columbus discovered America   (ii) America was discovered by Columbus
(a) Time flies like an arrow   (b) Time flies like an arrow
How should examples be stored?
Annotated tree structure
Generalized examples: "Rajesh will reach Mumbai by 10:00 pm" -> "P will reach D by T"
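Generalizing a stored example into a template such as "P will reach D by T" can be sketched with a toy entity dictionary; the gazetteer entries and class symbols below are hypothetical:

```python
# Replace known entities by class symbols to generalize an example (toy sketch).
entity_classes = {          # hypothetical gazetteer
    "Rajesh": "P",          # person
    "Mumbai": "D",          # destination
    "10:00 pm": "T",        # time
}

def generalize(sentence):
    # Substitute each known entity string by its class symbol.
    template = sentence
    for entity, cls in entity_classes.items():
        template = template.replace(entity, cls)
    return template

print(generalize("Rajesh will reach Mumbai by 10:00 pm"))
# -> "P will reach D by T"
```

A generalized example matches many more inputs than the literal sentence, easing the sparse-data problem of EBMT.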
Annotated Tree Structure: example
EBMT: Matching and Retrieval (1/2)
The system must be able to recognize the similarity and differences between the input and the stored examples
String based matching: longest common subsequence
Takes word similarity into account for sense disambiguation
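Longest common subsequence matching over word sequences can be sketched with the standard dynamic program; the query sentence is invented, and the stored examples follow the Insert/Edit menu sentences used later in this lecture:

```python
def lcs_length(x, y):
    # Classic O(len(x)*len(y)) dynamic program over word sequences.
    m, n = len(x), len(y)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i-1] == y[j-1]:
                dp[i][j] = dp[i-1][j-1] + 1
            else:
                dp[i][j] = max(dp[i-1][j], dp[i][j-1])
    return dp[m][n]

def similarity(input_sent, example):
    # Normalized LCS over words, in [0, 1].
    x, y = input_sent.split(), example.split()
    return lcs_length(x, y) / max(len(x), len(y))

examples = [
    "Select 'Symbol' in the Insert menu.",
    "Select 'Paste' in the Edit menu.",
]
query = "Select 'Paste' in the Edit menu to enter text."
print(max(examples, key=lambda ex: similarity(query, ex)))
```

A full system would also fold in word-level (thesaurus) similarity rather than exact token matches.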
EBMT: Matching and Retrieval (2/2)
Angle of similarity: trigonometric similarity measure based on relative length and relative contents
(x) Select 'Symbol' in the Insert menu.
(y) Select 'Symbol' in the Insert menu to enter a character from the symbol set.
(z) Select 'Paste' in the Edit menu.
(w) Select 'Paste' in the Edit menu to enter some text from the clip board.
θxy: the qualitative difference between sentences x and y
δ(x,y): the difference between the sizes of x and y
EBMT: Adaptation & Recombination
Adaptation: extracting appropriate fragments from the matched translation
The boy entered the house ->
I saw a tiger ->
The boy eats his breakfast ->
I saw the boy ->
Boundary friction: retrieved translations do not fit the syntactic context
I saw the boy -> *
Recombination: recombine fragments into target text
The SMT "language model" can be used
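Using an SMT language model to catch boundary friction between recombined fragments can be sketched with a smoothed bigram model; the corpus and candidate recombinations below are invented:

```python
from collections import defaultdict

# Score recombined candidates with a bigram LM to catch boundary friction.
corpus = [
    "i saw the boy",
    "the boy entered the house",
    "the boy eats his breakfast",
]

bigrams = defaultdict(int)
unigrams = defaultdict(int)
for sent in corpus:
    words = ["<s>"] + sent.split() + ["</s>"]
    for w1, w2 in zip(words, words[1:]):
        bigrams[(w1, w2)] += 1
        unigrams[w1] += 1

def score(sentence, alpha=0.001):
    # Add-alpha smoothed bigram probability of a candidate recombination.
    words = ["<s>"] + sentence.split() + ["</s>"]
    p = 1.0
    for w1, w2 in zip(words, words[1:]):
        p *= (bigrams[(w1, w2)] + alpha) / (unigrams[w1] + alpha * len(unigrams))
    return p

candidates = ["i saw the boy", "i saw boy the"]
print(max(candidates, key=score))
```

The badly joined candidate contains unseen bigrams ("saw boy", "boy the"), so the LM ranks it far below the fluent one.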
Interlingua Based MT
Interlingua: "between languages"
SL text is converted into a language-independent or 'universal' abstract representation, then transformed into several TLs
Universal Networking Language (UNL)
UNL is an example of an interlingua
Represents information sentence by sentence
UNL is composed of: universal words, relations
Issues related to interlingua
The interlingua must:
Capture the knowledge in text precisely and accurately
Handle cross language divergence
Divergence between Hindi-English:
Constituent order divergence
Null subject divergence: == * am going (I am going)
Conflational divergence: == Jim stabbed John
Promotional divergence: The play is on ==
Benefits & Shortcomings (1/3)
Statistical Machine Translation
"Every time I fire a linguist, my system's performance improves" (Brown et al. 1988)
Pros:
No linguistic knowledge is required
A great deal of natural language is available as machine readable text
Loose dependencies between languages can be modeled better
Cons:
Probability of rare words can't be trusted
Not good for idioms, jokes, compound words, text having hidden meaning
Selection of the correct morphological word form is difficult
Benefits & Shortcomings (2/3)
Example Based MT
Pros:
Perfect translation of a sentence if a very similar one is found in the example sentences
No need to bother about previously translated sentences
Cons:
Fails if no match is found in the corpora
Problems at points of example concatenation in the recombination step
Benefits & Shortcomings (3/3)
Interlingua based MT
Pros:
Add a new language and get translation both ways to all previously added languages
Monolingual development teams
Economical in situations where translation among multiple languages is needed
Cons:
"Meaning" is arbitrarily deep. At what level of detail do we stop?
Human development time
Translation is Ubiquitous
Between languages: Delhi is the capital of India
Between dialects: example on the next slide
Between registers: My "mom" not well. / My "mother" is unwell. (in a leave application)
Between dialects (1/3)
Lage Raho Munnabhai: an excellent example
Scene: Munnabhai (Sanjay Dutt) is Prof. Murli Prasad Sharma, being interviewed with some citizens asking questions in the presence of Jahnavi (Vidya Baalan)
Question by citizen:
Between dialects (2/3)
Bapu, from behind, invisible to others:
Munnabhai:
Bapu:
Munnabhai: full country
Bapu:
Munnabhai:
Between dialects (3/3)
Bapu:
Munnabhai:
Bapu:
Munnabhai: heart heart !
Comparison b/w SMT, EBMT, Interlingua

Property           Example Based MT   Statistical MT   Interlingua based MT
Parallel Corpora   Yes                Yes              No
Dictionary         Yes                No               Yes
Parser             Yes                No               Yes
References (1/2)
1. P. Brown, S. Della Pietra, V. Della Pietra, and R. Mercer. The mathematics of statistical machine translation: parameter estimation. Computational Linguistics, 19(2), 263-311. (1993)
2. Makoto Nagao. A framework of a mechanical translation between Japanese and English by analogy principle. In A. Elithorn and R. Banerji (eds.): Artificial and Human Intelligence. Elsevier Science Publishers. (1984)
3. Somers H. Review Article: Example-based Machine Translation. Machine Translation, Volume 14, Number 2, pp. 113-157. (June 1999)
4. D. Turcato, F. Popowich. What is Example-Based Machine Translation? In M. Carl and A. Way (eds.): Recent Advances of EBMT. Kluwer Academic Publishers, Dordrecht. Note: revised version of a workshop paper. (2003)
References (2/2)
5. Dave S., Parikh J. and Bhattacharyya P. Interlingua Based English-Hindi Machine Translation and Language Divergence. Journal of Machine Translation, Volume 17. (2002)
6. Adam L. Berger, Stephen A. Della Pietra, Vincent J. Della Pietra. A maximum entropy approach to natural language processing. Computational Linguistics, 22(1). (March 1996)
7. Jason Baldridge, Tom Morton, and Gann Bierner. The opennlp.maxent package: POS tagger, end of sentence detector, tokenizer, name finder. http://maxent.sourceforge.net/ version 2.4.0 (Oct. 2005)
8. Universal Networking Language (UNL) Specifications. UNL Center of UNDL Foundation. URL: http://www.undl.org/unlsys/unl/unl2005/. 7 June 2005.