HMM-based Alignment in Statistical Translation (1996) Lekha Muraleedharan [133050002] Sagar Ahire [133050073]

Paper Presentation: HMM-based Alignment


DESCRIPTION

This is the paper presentation I gave on HMM-based alignment at IIT Bombay as part of the Topics in NLP course. The paper treats word alignment as an HMM problem, in contrast to the predominantly used IBM models.


Page 1: Paper Presentation: HMM-based Alignment

HMM-based Alignment in Statistical Translation (1996)

Lekha Muraleedharan [133050002]
Sagar Ahire [133050073]

Page 2: Paper Presentation: HMM-based Alignment

Roadmap

● Review of Alignment
● HMM-based Alignment
● Results and Examples

Page 3: Paper Presentation: HMM-based Alignment

Roadmap: We Are Here

● Review of Alignment (we are here)
● HMM-based Alignment
● Results and Examples

Page 4: Paper Presentation: HMM-based Alignment

Review of Alignment

● To translate a French sentence F into an English sentence E, choose the E that maximizes P(E|F). By Bayes' rule, with P(F) constant for a given F:

    E* = argmax_E P(E|F)
       = argmax_E P(E) * P(F|E)

● To learn P(F|E), the concept of alignments is used.
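The argmax above can be illustrated with a minimal sketch. The candidate sentences and all probability values below are made-up illustration numbers (assumptions, not from the paper); `lm` stands in for P(E) and `tm` for P(F|E).

```python
# Toy noisy-channel decoder: pick the E that maximizes P(E) * P(F|E).
# All probabilities are hypothetical values chosen for illustration only.
candidates = {
    "Peter slept early": {"lm": 0.020, "tm": 0.30},
    "Peter early slept": {"lm": 0.001, "tm": 0.35},
    "Peter sleeps early": {"lm": 0.015, "tm": 0.05},
}


def decode(cands):
    """Return the candidate with the highest product lm * tm."""
    return max(cands, key=lambda e: cands[e]["lm"] * cands[e]["tm"])


best = decode(candidates)
print(best)
```

Note that the second candidate has the highest P(F|E) but loses because its P(E) is very low; the two factors are always weighed together.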

Page 5: Paper Presentation: HMM-based Alignment

Review of Alignment

● An alignment is a correspondence between E and F that indicates which word in F is translated to which word in E.

● For example:

Hin: पीटर जल्दी सोया
Eng: Peter slept early
A: 1 3 2

Page 6: Paper Presentation: HMM-based Alignment

Alignment Models

Depending on the assumptions made, there are several possible alignment models:
● IBM Models (1 to 5)
● HMM-based Alignment Models

Page 7: Paper Presentation: HMM-based Alignment

IBM Model 1, 2: The Math

[Slide shows the Model 1 and Model 2 equations side by side; the equations are not preserved in this text]
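For reference, the two models can be written in their standard form as below. The notation is an assumption, since the slide's own formulas are not reproduced here: f_1^J is the source sentence, e_1^I the target sentence, p(f_j | e_i) the translation probability, and no empty word is modelled.

```latex
% Assumed notation: f_1^J = source sentence, e_1^I = target sentence,
% p(f_j | e_i) = translation probability; no empty word.
\begin{align*}
\text{Model 1:}\quad
  \Pr(f_1^J \mid e_1^I) &= \frac{p(J \mid I)}{I^J}
      \prod_{j=1}^{J} \sum_{i=1}^{I} p(f_j \mid e_i) \\
\text{Model 2:}\quad
  \Pr(f_1^J \mid e_1^I) &= p(J \mid I)
      \prod_{j=1}^{J} \sum_{i=1}^{I} p(i \mid j, J, I)\, p(f_j \mid e_i)
\end{align*}
```

Model 1 is the special case of Model 2 in which the alignment probability p(i | j, J, I) is the uniform value 1/I.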

Page 8: Paper Presentation: HMM-based Alignment

IBM Model 1

● Assumes all alignments are equally likely
● Assumes source word depends only on target word

IBM Model 2

● Assumes alignments are more likely to “lie along the diagonal”

Page 9: Paper Presentation: HMM-based Alignment

Roadmap: We Are Here

● Review of Alignment
● HMM-based Alignment (we are here)
● Results and Examples

Page 10: Paper Presentation: HMM-based Alignment

HMM-based Alignment: The Math
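For reference, the HMM alignment model is usually written as below. The notation is an assumption (same symbols as commonly used for the IBM models): a_j is the target position aligned to source position j, and s(.) is a non-negative score that depends only on the jump width.

```latex
% Assumed notation: a_j = target position aligned to source word f_j,
% s(d) = non-negative score for a jump of width d.
\begin{align*}
\Pr(f_1^J \mid e_1^I) &= \sum_{a_1^J} \prod_{j=1}^{J}
    p(a_j \mid a_{j-1}, I)\, p(f_j \mid e_{a_j}) \\
p(i \mid i', I) &= \frac{s(i - i')}{\sum_{l=1}^{I} s(l - i')}
\end{align*}
```

The first factor is the HMM transition probability (alignment), the second the emission probability (translation).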

Page 11: Paper Presentation: HMM-based Alignment

HMM-based Alignment

● Assumes alignment depends only on
  ○ The previous alignment (not all previous)
  ○ The jump width

● Thus, in this model alignments are relative
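A minimal sketch of what "relative" means here: the transition probability is built from a score for the jump width i - i' alone, then normalized over the target positions 1..I. The function name `transition_probs` and the score table `s` are hypothetical; the score values are made up for illustration.

```python
def transition_probs(prev_i, I, s):
    """Return [p(i | prev_i, I) for i = 1..I] from jump-width scores s[d].

    Assumes at least one reachable jump width has a positive score.
    """
    weights = [s.get(i - prev_i, 0.0) for i in range(1, I + 1)]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical jump scores: a forward jump of +1 is preferred.
s = {-2: 1.0, -1: 2.0, 0: 2.0, 1: 6.0, 2: 2.0, 3: 1.0}

probs = transition_probs(prev_i=2, I=4, s=s)
print(probs)  # probabilities for target positions 1, 2, 3, 4
```

The same score table is reused whatever the absolute position of the previous alignment is; only the normalization changes with `prev_i` and `I`.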

Page 12: Paper Presentation: HMM-based Alignment

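A Comparison

[Slide shows the formulas for IBM Model 1, IBM Model 2, and the HMM-based model side by side; the formulas are not preserved in this text]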

Page 13: Paper Presentation: HMM-based Alignment

Roadmap: We Are Here

● Review of Alignment
● HMM-based Alignment
● Results and Examples (we are here)

Page 14: Paper Presentation: HMM-based Alignment

Statistical Results: Basic Framework

● Models compared:
  ○ IBM 1
  ○ IBM 2
  ○ HMM

● Corpora Used
  ○ Avalanche Bulletins Corpus (German-French; news)
  ○ Verbmobil Corpus (German-English; spoken dialog)
  ○ EuTrans Corpus (Spanish-English; travel & tourism)

Page 15: Paper Presentation: HMM-based Alignment

Statistical Results: Basic Framework

● Training Process:
  ○ IBM 1: 10 iterations of EM
  ○ IBM 2: 5 iterations of Maximum Approximation
  ○ HMM: 5 iterations of Maximum Approximation

● Metric Used
  ○ Perplexity (Wikipedia: “a measurement of how well a probability model predicts a sample”)
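As a rough illustration of the metric, perplexity can be computed as the exponential of the average negative log-probability the model assigns to each item of a sample. This is a generic sketch; the exact normalization used in the paper is not stated on the slide, and the per-item averaging here is an assumption.

```python
import math


def perplexity(probs):
    """Perplexity of a sample, given the model probability of each item."""
    avg_neg_log = -sum(math.log(p) for p in probs) / len(probs)
    return math.exp(avg_neg_log)


# A model that gives every item probability 1/4 is "as confused as"
# a uniform choice among 4 options.
uniform_pp = perplexity([0.25, 0.25, 0.25, 0.25])
print(uniform_pp)
```

Lower perplexity means the model assigns higher probability to the observed data.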

Page 16: Paper Presentation: HMM-based Alignment

Statistical Results

Perplexity per corpus and model:

Corpus      IBM 1    IBM 2    HMM
EuTrans     16.267   9.781    9.686
Verbmobil   46.672   30.706   26.495

Page 17: Paper Presentation: HMM-based Alignment

Intuitive Example: 1

Hin: पीटर जल्दी सोया
Eng: Peter slept early
A: 1 3 2
Jump: N/A 2 -1

Page 18: Paper Presentation: HMM-based Alignment

Intuitive Example: पीटर जल्दी सोया

● Relatively straightforward
● As there are no major jumps, translation probabilities take precedence

Page 19: Paper Presentation: HMM-based Alignment

Intuitive Example: 2

Hin: पीटर घर लौटने पर जल्दी सोया
Eng: Peter slept early on returning home
A: 1 6 5 4 3 2
Jump: N/A 5 -1 -1 -1 -1
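The Jump rows in these examples are the differences between consecutive entries of the alignment row A. A small sketch (the function name `jump_widths` is hypothetical; `None` stands in for the N/A of the first position):

```python
def jump_widths(alignment):
    """Jump width for each position: a[j] - a[j-1]; None for the first."""
    if not alignment:
        return []
    return [None] + [b - a for a, b in zip(alignment, alignment[1:])]


example_1 = jump_widths([1, 3, 2])
example_2 = jump_widths([1, 6, 5, 4, 3, 2])
print(example_1)  # [None, 2, -1]
print(example_2)  # [None, 5, -1, -1, -1, -1]
```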

Page 20: Paper Presentation: HMM-based Alignment

Intuitive Example: पीटर घर लौटने पर जल्दी सोया

● IBM 2 favors diagonal alignments, so it will struggle to find the correct alignment, as nearly all the links lie along the anti-diagonal

● HMM looks only at the previous alignment and the jump width; after one large jump of 5, every jump in this alignment is -1, so the total jump length stays small

Page 21: Paper Presentation: HMM-based Alignment

Intuitive Example: 3

Hin: पीटर बहुत ही जल्दी सोया
Eng: Peter slept very early
A: 1 3 ? 4 2

Page 22: Paper Presentation: HMM-based Alignment

Intuitive Example: पीटर बहुत ही जल्दी सोया

● The HMM model assumes that every source word has a corresponding target word

● Moreover, empty word alignments are not incorporated in the basic HMM model

● To model empty words an HMM of order 2 is required

Page 23: Paper Presentation: HMM-based Alignment

Intuitive Example: 4

Hin: पीटर आज कल जल्दी सोता है
Eng: Peter sleeps early these days
A: 1 2,3 3 2 2

Page 24: Paper Presentation: HMM-based Alignment

Intuitive Example: पीटर आज कल जल्दी सोता है

● सोता है ↔ sleeps can be handled by HMM
● आज कल ↔ these days requires multi-word handling to avoid a word-by-word translation like “today tomorrow”

Page 25: Paper Presentation: HMM-based Alignment

References

● HMM-Based Word Alignment in Statistical Translation (1996) by Stephan Vogel, Hermann Ney, Christoph Tillmann; COLING '96, Copenhagen

● The Mathematics of Statistical Machine Translation: Parameter Estimation (1993) by Peter Brown, Stephen Della Pietra, Vincent Della Pietra, Robert Mercer; Computational Linguistics