CS460/626: Natural Language Processing/Speech, NLP and the Web

(Lecture 18 – Alignment in SMT and Tutorial on Giza++ and Moses)

Pushpak Bhattacharyya
CSE Dept., IIT Bombay

15th Feb, 2011

Going forward from word alignment

Word alignment -> Phrase alignment (going to bigger units of correspondence) -> Decoding (best possible translation)

Abstract Problem

Given: $e_0 e_1 e_2 e_3 \ldots e_n e_{n+1}$ (Entities)

Goal: $l_0 l_1 l_2 l_3 \ldots l_n l_{n+1}$ (Labels)

The Goal is to find the best possible label sequence

$$L^* = \arg\max_L P(L \mid E)$$

Generative Model

$$\arg\max_L P(L \mid E) = \arg\max_L P(L) \cdot P(E \mid L)$$

Simplification

Using the Markov assumption, the language model can be represented using bigrams:

$$P(L) = \prod_{i=0}^{n} P(l_{i+1} \mid l_i)$$

Similarly, the translation model can also be represented in the following way:

$$P(E \mid L) = \prod_{i=0}^{n+1} P(e_i \mid l_i)$$
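The two factorizations above can be combined into a single score for a candidate label sequence. The following is a minimal sketch under those assumptions; the function name and the toy probability tables are hypothetical and only illustrate the arithmetic.

```python
import math

def sequence_log_prob(entities, labels, transition, emission):
    """Return log[P(L) * P(E|L)] with a bigram P(L) and per-position P(e_i | l_i)."""
    assert len(entities) == len(labels)
    log_p = 0.0
    # Translation model: product over positions of P(e_i | l_i).
    for e, l in zip(entities, labels):
        log_p += math.log(emission[(e, l)])
    # Language model: product over adjacent labels of P(l_{i+1} | l_i).
    for prev, nxt in zip(labels, labels[1:]):
        log_p += math.log(transition[(prev, nxt)])
    return log_p

# Hypothetical toy tables; '^' and '.' play the role of e_0/l_0 and e_{n+1}/l_{n+1}.
transition = {("^", "A"): 0.5, ("A", "B"): 0.4, ("B", "."): 1.0}
emission = {("^", "^"): 1.0, ("x", "A"): 0.5, ("y", "B"): 0.25, (".", "."): 1.0}

score = sequence_log_prob(["^", "x", "y", "."], ["^", "A", "B", "."],
                          transition, emission)
print(math.exp(score))
```

Working in log space avoids underflow when the sequences get long.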

Statistical Machine Translation

Finding the best possible English sentence given the foreign sentence

$$E^* = \arg\max_E P(E \mid F) = \arg\max_E P(E) \cdot P(F \mid E)$$

P(E) = Language Model
P(F|E) = Translation Model
E: English, F: Foreign Language

Problems in the framework

Labels are words of the target language
Very large in number

Preposition stranding:
Who do you want to_go with?
With whom do you want to go?
आप किस के_साथ जाना चाहते_हो (Aap kis ke_sath jaana chahate_ho)

Column of words of target language on the source language words

^    Aap      kis      ke_sath   jaana    chahate_ho   .
     who      who      (and so on)
     do       do
     you      you
     want     want
     to_go    to_go
     with     with

Find the best possible path from ‘^’ to ‘.’ using transition and observation probabilities.

Viterbi can be used
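The path search over the trellis can be sketched as follows. This is an illustrative Viterbi implementation, not the lecture's own code: the function signature, the two-word vocabulary, and all probability values are made-up assumptions. It also assigns exactly one target word per source word in source order, which is the limitation the slide points out.

```python
import math

def viterbi(observations, states, start, end, trans, emit):
    """Return the most probable label path for `observations`.

    trans[(a, b)] = P(b | a) between adjacent labels,
    emit[(s, o)] = P(o | s). Missing entries count as probability 0.
    """
    def lp(table, key):
        p = table.get(key, 0.0)
        return math.log(p) if p > 0 else float("-inf")

    # best[i][s]: best log-probability of any path ending in state s at position i.
    best = [{s: lp(trans, (start, s)) + lp(emit, (s, observations[0]))
             for s in states}]
    back = [{}]
    for i in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[i - 1][p] + lp(trans, (p, s))) for p in states),
                key=lambda pair: pair[1],
            )
            best[i][s] = score + lp(emit, (s, observations[i]))
            back[i][s] = prev
    # Close the path with the transition into the end marker.
    last = max(states, key=lambda s: best[-1][s] + lp(trans, (s, end)))
    path = [last]
    for i in range(len(observations) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return list(reversed(path))

# Hypothetical toy model: two source words, two candidate target words.
states = ["you", "go"]
trans = {("^", "you"): 0.8, ("^", "go"): 0.2,
         ("you", "you"): 0.3, ("you", "go"): 0.7,
         ("go", "you"): 0.5, ("go", "go"): 0.5,
         ("you", "."): 0.5, ("go", "."): 0.5}
emit = {("you", "Aap"): 0.9, ("go", "Aap"): 0.1,
        ("you", "jaana"): 0.1, ("go", "jaana"): 0.9}

best_path = viterbi(["Aap", "jaana"], states, "^", ".", trans, emit)
print(best_path)
```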

TUTORIAL ON Giza++ and Moses tools
(delivered by Kushal Ladha)

Word-based alignment

For each word in the source language, align the words from the target language that this word possibly produces
Based on IBM models 1-5
Model 1 is the simplest
As we go from models 1 to 5, the models get more complex but more realistic
This is all that Giza++ does

Alignment

A function from target position to source position:

The alignment sequence is: 2,3,4,5,6,6,6
Alignment function A: A(1) = 2, A(2) = 3, ...
A different alignment function will give the sequence 1,2,1,2,3,4,3,4 for A(1), A(2), ...

To allow spurious insertion, allow alignment with word 0 (NULL)
No. of possible alignments: $(I+1)^J$, where I is the number of source words and J the number of target positions
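The count follows because each of the J target positions independently picks one of the I source positions or NULL. A small sketch that enumerates the alignments for a tiny case (the helper name is hypothetical):

```python
from itertools import product

def all_alignments(num_source_words, num_target_words):
    """Yield every alignment as a tuple (A(1), ..., A(J)); 0 stands for NULL."""
    source_positions = range(num_source_words + 1)  # 0..I, with 0 = NULL
    return product(source_positions, repeat=num_target_words)

# I = 2 source words, J = 3 target positions.
alignments = list(all_alignments(2, 3))
print(len(alignments), (2 + 1) ** 3)
```

Enumeration is only feasible for toy sizes, which is why training relies on expected counts rather than listing alignments.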

IBM Model 1: Generative Process

Training Alignment Models

Given a parallel corpus, for each (F,E) learn the best alignment A and the component probabilities.

How to compute these probabilities if all you have is a parallel corpus?

Intuition: Interdependence of Probabilities

If you knew which words are probable translations of each other, then you could guess which alignment is probable and which one is improbable
If you were given alignments with probabilities, then you could compute translation probabilities
Looks like a chicken and egg problem
EM algorithm comes to the rescue
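The chicken-and-egg loop can be made concrete with a short EM sketch in the style of IBM Model 1. This is an illustrative implementation, not Giza++ itself: the function name, the toy English-German corpus, and the iteration count are assumptions. The E-step spreads each target word's count over the source words in proportion to the current translation probabilities, and the M-step renormalizes those expected counts.

```python
from collections import defaultdict

def train_ibm_model1(corpus, iterations=10):
    """corpus: list of (source_words, target_words) pairs.

    Returns t with t[(f, e)] approximating P(target word f | source word e).
    """
    target_vocab = {f for _, tgt in corpus for f in tgt}
    # Start from a uniform guess: every translation is equally likely.
    t = defaultdict(lambda: 1.0 / len(target_vocab))
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        for src, tgt in corpus:
            src = ["NULL"] + list(src)  # position 0 allows spurious insertion
            for f in tgt:
                # E-step: distribute one count for f over all source words.
                norm = sum(t[(f, e)] for e in src)
                for e in src:
                    frac = t[(f, e)] / norm
                    count[(f, e)] += frac
                    total[e] += frac
        # M-step: re-estimate translation probabilities from expected counts.
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]
    return t

# Hypothetical toy parallel corpus.
corpus = [
    (["the", "house"], ["das", "Haus"]),
    (["the", "book"], ["das", "Buch"]),
    (["a", "book"], ["ein", "Buch"]),
]
t = train_ibm_model1(corpus)
print(round(t[("das", "the")], 3), round(t[("Haus", "the")], 3))
```

Even though no alignment is given, the co-occurrence of "the" with "das" in two sentence pairs is enough for the estimates to favor that pairing.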

Limitation: Only 1->Many alignments allowed

Phrase-based alignment

More natural

Many-to-one mappings allowed

Giza++ and Moses Package

http://cl.naist.jp/~eric-n/ubuntu-nlp/
Select your Ubuntu version
Browse the nlp folder
Download the Debian packages of giza++, moses, mkcls, srilm
Resolve all the dependencies and they get installed
For alternate installation, refer to http://www.statmt.org/moses_steps.html

Steps

Input: sentence-aligned parallel corpus
Output: target-side tagged data

Training
Tuning
Generate output on test corpus (decoding)

Training

Create a folder named corpus containing the test, train and tuning files
Giza++ is used to generate the alignments
The phrase table is generated after training
Before training, a language model needs to be built on the target side

mkdir lm; /usr/bin/ngram-count -order 3 -interpolate -kndiscount -text $PWD/corpus/train_surface.hi -lm lm/train.lm;
/usr/share/moses/scripts/training/train-factored-phrase-model.perl -scripts-root-dir /usr/share/moses/scripts -root-dir . -corpus train.clean -e hi -f en -lm 0:3:$PWD/lm/train.lm:0;

Example

train.en                      train.pr
h e l l o                     hh eh l ow
h e l l o                     hh ah l ow
w o r l d                     w er l d
c o m p o u n d w o r d       k aa m p aw n d w er d
h y p h e n a t e d           hh ay f ah n ey t ih d
o n e                         ow eh n iy
b o o m                       b uw m
k w e e z l e b o t t e r     k w iy z l ah b aa t ah r
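In this example each letter (or phoneme) is treated as a "word", so the corpus lines are space-separated symbols. A minimal sketch of producing the train.en side from ordinary words, assuming that whitespace inside an entry is simply dropped as in the "compound word" row; the helper name is hypothetical:

```python
def to_symbol_sequence(word):
    """Turn 'hello' into 'h e l l o'; whitespace inside the entry is dropped."""
    return " ".join(ch for ch in word.lower() if not ch.isspace())

words = ["hello", "world", "compound word"]
lines = [to_symbol_sequence(w) for w in words]
print("\n".join(lines))
```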

Sample from Phrase-table

b o ||| b aa ||| (0) (1) ||| (0) (1) ||| 1 0.666667 1 0.181818 2.718
b ||| b ||| (0) ||| (0) ||| 1 1 1 1 2.718
c o m p o ||| aa m p ||| (2) (0,1) (1) (0) (1) ||| (1,3) (1,2,4) (0) ||| 1 0.0486111 1 0.154959 2.718
c ||| p ||| (0) ||| (0) ||| 1 1 1 1 2.718
d w ||| d w ||| (0) (1) ||| (0) (1) ||| 1 0.75 1 1 2.718
d ||| d ||| (0) ||| (0) ||| 1 1 1 1 2.718
e b ||| ah b ||| (0) (1) ||| (0) (1) ||| 1 1 1 0.6 2.718
e l l ||| ah l ||| (0) (1) (1) ||| (0) (1,2) ||| 1 1 0.5 0.5 2.718
e l l ||| eh l ||| (0) (0) (1) ||| (0,1) (2) ||| 1 0.111111 0.5 0.111111 2.718
e l ||| eh ||| (0) (0) ||| (0,1) ||| 1 0.111111 1 0.133333 2.718
e ||| ah ||| (0) ||| (0) ||| 1 1 0.666667 0.6 2.718
h e ||| hh ah ||| (0) (1) ||| (0) (1) ||| 1 1 1 0.6 2.718
h ||| hh ||| (0) ||| (0) ||| 1 1 1 1 2.718
l e b ||| l ah b ||| (0) (1) (2) ||| (0) (1) (2) ||| 1 1 1 0.5 2.718
l e ||| l ah ||| (0) (1) ||| (0) (1) ||| 1 1 1 0.5 2.718
l l o ||| l ow ||| (0) (0) (1) ||| (0,1) (2) ||| 0.5 1 1 0.227273 2.718
l l ||| l ||| (0) (0) ||| (0,1) ||| 0.25 1 1 0.833333 2.718
l o ||| l ow ||| (0) (1) ||| (0) (1) ||| 0.5 1 1 0.227273 2.718
l ||| l ||| (0) ||| (0) ||| 0.75 1 1 0.833333 2.718
m ||| m ||| (0) ||| (0) ||| 1 0.5 1 1 2.718
n d ||| n d ||| (0) (1) ||| (0) (1) ||| 1 1 1 1 2.718
n e ||| eh n iy ||| (1) (2) ||| () (0) (1) ||| 1 1 0.5 0.3 2.718
n e ||| n iy ||| (0) (1) ||| (0) (1) ||| 1 1 0.5 0.3 2.718
n ||| eh n ||| (1) ||| () (0) ||| 1 1 0.25 1 2.718
o o m ||| uw m ||| (0) (0) (1) ||| (0,1) (2) ||| 1 0.5 1 0.181818 2.718
o o ||| uw ||| (0) (0) ||| (0,1) ||| 1 1 1 0.181818 2.718
o ||| aa ||| (0) ||| (0) ||| 1 0.666667 0.2 0.181818 2.718
o ||| ow eh ||| (0) ||| (0) () ||| 1 1 0.2 0.272727 2.718
o ||| ow ||| (0) ||| (0) ||| 1 1 0.6 0.272727 2.718
w o r ||| w er ||| (0) (1) (1) ||| (0) (1,2) ||| 1 0.1875 1 0.424242 2.718
w ||| w ||| (0) ||| (0) ||| 1 0.75 1 1 2.718
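Each phrase-table line has five fields separated by `|||`: source phrase, target phrase, source-to-target alignment, target-to-source alignment, and a list of scores. A small parser sketch for the format shown in this sample follows; the function name and dictionary keys are hypothetical, and the scores are kept as an uninterpreted list (the final 2.718 is approximately e).

```python
def parse_phrase_table_line(line):
    """Split one phrase-table line of the sample format into its five fields."""
    fields = [field.strip() for field in line.split("|||")]
    source, target, src_align, tgt_align, scores = fields

    def parse_alignment(text):
        # "(0,1) (2)" -> [[0, 1], [2]]; an empty group "()" -> [].
        groups = []
        for group in text.split():
            inner = group.strip("()")
            groups.append([int(x) for x in inner.split(",")] if inner else [])
        return groups

    return {
        "source": source.split(),
        "target": target.split(),
        "source_alignment": parse_alignment(src_align),
        "target_alignment": parse_alignment(tgt_align),
        "scores": [float(x) for x in scores.split()],
    }

entry = parse_phrase_table_line(
    "l l o ||| l ow ||| (0) (0) (1) ||| (0,1) (2) ||| 0.5 1 1 0.227273 2.718"
)
print(entry["source"], entry["target"], entry["source_alignment"])
```

Here both source letters "l l" point at target position 0 ("l"), and "o" points at position 1 ("ow").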

Tuning

Not a compulsory step, but it will improve the decoding by a small percentage

mkdir tuning; cp $WDIR/corpus/tun.en tuning/input; cp $WDIR/corpus/tun.hi tuning/reference;
/usr/share/moses/scripts/training/mert-moses.pl $PWD/tuning/input $PWD/tuning/reference /usr/bin/moses $PWD/model/moses.ini --working-dir $PWD/tuning --rootdir /usr/share/moses/scripts

It will take around 1 hour on a server with 32GB RAM

Testing

mkdir evaluation; /usr/bin/moses -config $WDIR/tuning/moses.ini -input-file $WDIR/corpus/test.en >evaluation/test.output;

The output will be in the evaluation/test.output file

Sample Output

Input          Output
h o t          hh aa t
p h o n e      p|UNK hh ow eh n iy
b o o k        b uw k
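In the sample, the symbol "p" never occurs in the training data, so it is passed through with a `|UNK` suffix. A small sketch for separating such tokens from decoder output, assuming the `|UNK` suffix convention seen in this sample; the helper name is hypothetical:

```python
def split_unknown_tokens(output_line, marker="|UNK"):
    """Return (translated_tokens, unknown_source_tokens) for one output line."""
    known, unknown = [], []
    for token in output_line.split():
        if token.endswith(marker):
            unknown.append(token[: -len(marker)])
        else:
            known.append(token)
    return known, unknown

known, unknown = split_unknown_tokens("p|UNK hh ow eh n iy")
print(known, unknown)
```

Counting the unknown tokens over a test set gives a quick indication of how much of the input the training corpus failed to cover.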
