
[Paper Introduction] A Context-Aware Topic Model for Statistical Machine Translation


Page 1: [Paper Introduction] A Context-Aware Topic Model for Statistical Machine Translation

15/09/10 Copyright (C) 2015 by Yusuke Oda, AHC-Lab, IS, NAIST 1

A Context-Aware Topic Model for Statistical Machine Translation

Jinsong Su, Deyi Xiong, Yang Liu, Xianpei Han, Hongyu Lin, Junfeng Yao, Min Zhang

ACL 2015

Introduced by Yusuke Oda (@odashi_t)

2015/9/10 NAIST MT-Study Group

Page 2

Lexical Selection for SMT

● Lexical selection is important for SMT

● Two categories in previous studies for lexical selection:

– Incorporating sentence-level (local) contexts

– Integrating document-level (global) topics

● Considering the correlation between local and global information

– Has never been explored

– Yet the two are highly correlated

[Diagram: sentence-level contexts ↔ document-level topics]

Page 3

Proposed Model

● Context-aware topic model (CATM)

– Jointly model both local and global contexts for lexical selection

– Based on topic modeling

– Performing Gibbs sampling to learn parameters of the model

● Terms

– Topical words: related to the topics of the document

● In this study, we use content words (= noun, verb, adjective, adverb)

– Contextual words: affect the translation selection of topical words

● In this study, we use all words in the sentence

– Target-side topical items: translation candidates of the source topical words
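To illustrate these definitions, the split into topical and contextual words might be sketched as below. The POS tags, tag set, and example sentence are hypothetical illustrations, not from the paper (which works on Chinese source text):

```python
# Toy sketch (not the paper's code): split a POS-tagged sentence into
# topical words (content words only) and contextual words (all words),
# following the definitions on this slide.

CONTENT_TAGS = {"NOUN", "VERB", "ADJ", "ADV"}  # noun, verb, adjective, adverb

def split_words(tagged_sentence):
    """tagged_sentence: list of (word, pos_tag) pairs."""
    topical = [w for w, tag in tagged_sentence if tag in CONTENT_TAGS]
    contextual = [w for w, _ in tagged_sentence]  # all words in the sentence
    return topical, contextual

sent = [("the", "DET"), ("bank", "NOUN"), ("approved", "VERB"),
        ("the", "DET"), ("loan", "NOUN"), ("quickly", "ADV")]
topical, contextual = split_words(sent)
# topical    -> ["bank", "approved", "loan", "quickly"]
# contextual -> all six words
```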

Page 4

Assumption

● Topic consistency: all topical words and their target-side topical items should be consistent with the topic of the document

● Context compatibility: all target-side topical items should be compatible with their neighboring contextual words

Page 5

Graphical Representation of Proposed Model

[Figure: graphical model of CATM. Labeled components: the topic distribution of the document, the topic, the target-side topical item and its neighboring target-side topical items, and the associated topic and word distributions]

Page 6

Generation Steps

Page 7

Joint Probability

● Objective: fit the joint probability distribution below, given the training data:

● …OMG, too complex.

Page 8

Gibbs Sampling (1)

● Directly fitting the joint probability is computationally intractable

● Use Gibbs sampling instead

● Given the training data, the joint distribution of the latent variables is proportional to:

Page 9

Gibbs Sampling (2)

● Sampling each latent variable in turn from its full conditional

● The count notation indicates the number of occurrences of b within a range a

● The superscript (-i) indicates ignoring the i-th item
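The count-based sampling step can be illustrated with a minimal collapsed Gibbs sampler for a plain LDA-style topic model. This is only a sketch of the general technique: the full CATM sampler additionally samples target-side topical items and conditions on contextual words, which are omitted here.

```python
# Minimal collapsed Gibbs sampler for a plain LDA-style topic model.
# NOT the paper's CATM sampler; it only illustrates the count-based
# update on the slide, where counts are decremented before sampling
# (the "(-i)" exclusion) and restored afterwards.
import random

def gibbs_lda(docs, vocab_size, num_topics, alpha, beta, iters=50, seed=0):
    rng = random.Random(seed)
    # z[d][i]: topic assignment of the i-th token of document d
    z = [[rng.randrange(num_topics) for _ in doc] for doc in docs]
    ndk = [[0] * num_topics for _ in docs]               # count of topic k in doc d
    nkw = [[0] * vocab_size for _ in range(num_topics)]  # count of word w in topic k
    nk = [0] * num_topics                                # total tokens in topic k
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            ndk[d][k] += 1; nkw[k][w] += 1; nk[k] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = z[d][i]
                # exclude the current token from the counts: the "(-i)" step
                ndk[d][k] -= 1; nkw[k][w] -= 1; nk[k] -= 1
                # full conditional: P(k) ∝ (N_d(k)+α)(N_k(w)+β)/(N_k+Vβ)
                weights = [(ndk[d][t] + alpha) * (nkw[t][w] + beta)
                           / (nk[t] + vocab_size * beta)
                           for t in range(num_topics)]
                k = rng.choices(range(num_topics), weights=weights)[0]
                z[d][i] = k
                ndk[d][k] += 1; nkw[k][w] += 1; nk[k] += 1
    return z, ndk, nkw
```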

Page 10

Experiments

● Language pair: Chinese to English

● Corpus:

– Training: FBIS / Hansards (1M sent., 54.6k doc.)

– Dev: NIST MT05

– Test: NIST MT06 / 08

● Alignment: GIZA++ / grow-diag-final-and

● Hyperparameters:

– number of topics = 25

– α = 50 / number of topics

– β = 0.1

– γ = 1.0 / number of topical words

– δ = 2000 / number of contextual words
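Written out, the slide's settings correspond to the following. The two vocabulary sizes are hypothetical placeholders, since the paper derives them from the training corpus:

```python
# The slide's hyperparameter settings, written out explicitly.
# The topical/contextual vocabulary sizes are hypothetical placeholders.
num_topics = 25
num_topical_words = 30000      # hypothetical |topical vocabulary|
num_contextual_words = 50000   # hypothetical |contextual vocabulary|

alpha = 50 / num_topics              # 50 / 25 = 2.0
beta = 0.1
gamma = 1.0 / num_topical_words
delta = 2000 / num_contextual_words
```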

Page 11

Result: Impact of Window Size

● Best performance with a window size of 12

– Sufficient for predicting target-side translations for ambiguous source-side topical words

[Figure: the context window spans 12 words on each side of the source word]
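A ±12-word context window of the kind evaluated here might be extracted as follows (a sketch, not the paper's code):

```python
# Sketch: extract the context window of up to w words on each side of
# position i in a sentence, excluding the word at position i itself.
def context_window(sentence, i, w=12):
    left = sentence[max(0, i - w):i]
    right = sentence[i + 1:i + 1 + w]
    return left + right

sent = ["w%d" % k for k in range(30)]
ctx = context_window(sent, 15, w=12)
# 12 words on each side -> 24 context words in total
```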

Page 12

Result: Overall Performance

● Proposed method achieves the best performance with statistical significance

[Table: BLEU4 scores of the compared systems]

Page 13

Result: Effect of Correlation Modeling

● Comparison with separate models

– CATM (Content): substitutes a uniform distribution for the topic distribution

● Omitting effects from topics

– CATM (Topic): window size = 0

● Omitting effects from contexts

– CATM (Log-linear): combines the above two in a log-linear manner

● Proposed model achieves best performance

– Jointly learning both context and topic is effective for lexical selection.

Page 14

Topic Examples

Page 15

Summary

● Context-aware topic model (CATM)

– Jointly learning context and topic information

– The first such work, to the authors' knowledge

– Achieves higher translation performance than using only context or topic information, or than naively combining them in a log-linear manner

● Future work

– Extending the model from the word level to the phrase level

– Improving model with monolingual corpora

Page 16

Impressions

● Is a plain sequence-of-words window the right choice of context?

– Could syntactic information be used instead?

● This model uses word alignments (GIZA++) to select translation candidates

– What is the effect of alignment accuracy?