A non-contiguous Tree Sequence Alignment-based
Model for Statistical Machine Translation
Jun Sun┼, Min Zhang╪, Chew Lim Tan┼
Outline
Introduction
Non-contiguous Tree Sequence Modeling
Rule Extraction
Non-contiguous Decoding: the Pisces Decoder
Experiments
Conclusion
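Contiguous and Non-contiguous Bilingual Phrases
Contiguous translational equivalences
Non-contiguous translational equivalence
[Figure: word-aligned tree pair. Source (rooted at VP): 到 (at) 他 (he) 出场 (show up) 的 ('s) 时候 (time), with tags VV PN VV DEC NN under IP, CP and NP. Target (rooted at SBAR): "when he shows up", with tags WRB PRP VBZ RP under S and VP.]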
Previous Work on Non-contiguous phrases
(-) Zhang et al. (2008) acquire non-contiguous phrasal rules from contiguous tree sequence pairs and find them useless in real syntax-based translation systems.
(+) Wellington et al. (2006) report statistically that discontinuities are very useful for translational equivalence analysis using binary-branching structures under word alignment and parse tree constraints.
(+) Bod (2007) also finds that discontiguous phrasal rules bring significant improvement in a linguistically motivated STSG-based translation model.
[Figure: non-contiguous sub-pair of the example. Source: VV 到 (at) and NN 时候 (time) under VP, NP and CP; target: WRB "when" under SBAR and S.]
Previous Work on Non-contiguous phrases (cont.)
[Figure: the word-aligned tree pair for 到 他 出场 的 时候 / "when he shows up".]
Non-contiguous rule from a contiguous tree sequence pair:
VP(VV(到),NP(CP[0],NN(时候))) → SBAR(WRB(when),S[0])
Previous Work on Non-contiguous phrases (cont.)
No match in rule set
[Figure: the pair 到 他 出场 的 时候 / "when he shows up" and its sub-pair 到 … 时候 / "when", next to a source tree with an inserted aspect marker (AS): 到 (at) 了 (NULL) 他 (he) 出场 (show up) 的 ('s) 时候 (time), and its sub-part 到 (at) 了 (NULL) … 时候 (time).]
Proposed Non-contiguous phrases Modeling
[Figure: the same tree pairs as on the previous slide, plus the pair extracted from non-contiguous tree sequence pairs: VV 到 (at) … NN 时候 (time) paired with WRB "when".]
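The contrast on the last two slides can be sketched at the leaf level. This is a simplified illustration, not the model's tree matching: the helper `match_with_gaps` and the flat (tag, word) representation are assumptions made for the example. A pattern is a list of segments; any number of leaves may sit between two segments.

```python
from typing import List, Optional, Sequence, Tuple

Leaf = Tuple[str, str]  # (POS tag, word)


def match_with_gaps(segments: Sequence[Sequence[Leaf]],
                    leaves: Sequence[Leaf]) -> Optional[List[Tuple[int, int]]]:
    """Return one [start, end) span per segment, or None if nothing matches.

    Segments must occur in order; a gap of any length may separate two
    consecutive segments. Each segment takes its leftmost possible match.
    """
    spans: List[Tuple[int, int]] = []
    pos = 0
    for seg in segments:
        seg = list(seg)
        found = None
        for start in range(pos, len(leaves) - len(seg) + 1):
            if list(leaves[start:start + len(seg)]) == seg:
                found = (start, start + len(seg))
                break
        if found is None:
            return None
        spans.append(found)
        pos = found[1]
    return spans


# Phrase seen in training, and a new input with the aspect marker 了 inserted.
seen = [("VV", "到"), ("PN", "他"), ("VV", "出场"), ("DEC", "的"), ("NN", "时候")]
unseen = [("VV", "到"), ("AS", "了"), ("PN", "他"), ("VV", "出场"),
          ("DEC", "的"), ("NN", "时候")]

gapped = [[("VV", "到")], [("NN", "时候")]]  # 到 *** 时候

print(match_with_gaps([seen], unseen))   # the contiguous phrase no longer matches
print(match_with_gaps(gapped, unseen))   # the gapped pattern still matches
```

Under these assumptions the gapped pattern covers both inputs, while the fully contiguous phrase only covers the one it was extracted from.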
Contributions
The proposed model extracts translation rules not only from contiguous tree sequence pairs but also from non-contiguous tree sequence pairs (with gaps). With non-contiguous tree sequences, the model can capture non-contiguous phrases without requiring the large contexts that restrict a rule's applicability, and it improves the modeling of non-contiguous constituents.
A decoding algorithm for non-contiguous phrase modeling
SncTSSG
Synchronous Tree Substitution Grammar (STSG, Chiang, 2006)
Synchronous Tree Sequence Substitution Grammar (STSSG, Zhang et al. 2008)
Synchronous non-contiguous Tree Sequence Substitution Grammar (SncTSSG)
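Word-Aligned Parse Tree and Two Parse Tree Sequences
[Figure, three panels: 1. Word-aligned bi-parsed tree 2. Two structures 3. Two tree sequences]
[Panel 1: source tree Ts over 把 (NULL) 钢笔 (pen) 给 (give) 我 (me) 。 (.), tags P NG VG R WJ under VBA, VO and S; target tree Tt over "Give the pen to me .", tags VBP DT NN TO PRP PUNC. under NP, PP, VP and S; A is the word alignment.]
[Panel 2: a VBA subtree and the substructure abstracted from it.]
[Panel 3: source tree sequence 给 (give) 我 (me), tags VG R; target tree sequence "Give *** to me", tags VBP, then TO PRP under PP.]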
Contiguous Translation Rules
r1: [Figure: source VBA subtree over 把 (NULL) and 我 (me) with substitution sites 1 and 2; target VP subtree over "to me" (TO PRP under PP) with the same sites in the order 2, 1.]
r2: NG(钢笔), VG(给) → VBP(give), NP(DT(the),NN(pen))
r1. Contiguous Tree-to-Tree Rule r2. Contiguous Tree Sequence Rule
[Figure: word-aligned tree pair Ts / A / Tt for 把 钢笔 给 我 。 and "Give the pen to me ."]
Non-contiguous Translation Rules
ncTSr1: [Figure: source VBA subtree over 把 (NULL) and 钢笔 (pen), tags P NG, with a substitution site at VO; target VP subtree over "the pen", tags DT NN under NP, with substitution sites at VBP and PP. All sites carry index 1.]
ncTSr2: VO(VG(给),R(我)) → VBP(give) *** PP(TO(to),PRP(me))
ncTSr1. Non-contiguous Tree-to-Tree Rule ncTSr2. Non-contiguous Tree Sequence Rule
[Figure: word-aligned tree pair Ts / A / Tt for 把 钢笔 给 我 。 and "Give the pen to me ."]
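One way to hold such rules in memory is sketched below. The classes `Node` and `Rule`, the helper `pre`, and the use of the string `***` as a gap marker are assumptions for illustration; the slides do not describe the actual data structures of the system.

```python
from dataclasses import dataclass
from typing import Tuple, Union

GAP = "***"  # gap marker between two trees of a non-contiguous tree sequence


@dataclass(frozen=True)
class Node:
    label: str
    children: Tuple["Node", ...] = ()
    index: int = -1  # a value >= 0 marks a substitution site such as VO[1]

    def __str__(self) -> str:
        if self.index >= 0:
            return f"{self.label}[{self.index}]"
        if not self.children:
            return self.label
        return f"{self.label}({','.join(str(c) for c in self.children)})"


@dataclass(frozen=True)
class Rule:
    source: Tuple[Union[Node, str], ...]  # tree sequence, possibly with GAP items
    target: Tuple[Union[Node, str], ...]

    def is_non_contiguous(self) -> bool:
        return GAP in self.source or GAP in self.target

    def __str__(self) -> str:
        def side(items) -> str:
            return " ".join(str(item) for item in items)
        return f"{side(self.source)} → {side(self.target)}"


def pre(tag: str, word: str) -> Node:
    """Pre-terminal node: a POS tag over a single word."""
    return Node(tag, (Node(word),))


# The non-contiguous tree sequence rule with a gap on the target side.
nc_ts_r2 = Rule(
    source=(Node("VO", (pre("VG", "给"), pre("R", "我"))),),
    target=(pre("VBP", "give"), GAP,
            Node("PP", (pre("TO", "to"), pre("PRP", "me")))),
)

# The contiguous tree sequence rule r2, for comparison.
r2 = Rule(
    source=(pre("NG", "钢笔"), pre("VG", "给")),
    target=(pre("VBP", "give"),
            Node("NP", (pre("DT", "the"), pre("NN", "pen")))),
)

print(nc_ts_r2)
print(r2.is_non_contiguous())
```

Keeping both sides as sequences means tree-to-tree rules, contiguous tree sequence rules and gapped rules share one type; a rule is non-contiguous exactly when a gap marker occurs on either side.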
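A word-aligned parse tree pair
[Figure: source tree Ts over 把 (NULL) 钢笔 (pen) 给 (give) 我 (me) 。 (.), tags P NG VG R WJ under VBA, VO and S; target tree Tt over "Give the pen to me .", tags VBP DT NN TO PRP PUNC. under NP, PP, VP and S; A is the word alignment.]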
Example for contiguous rule extraction (1)
[Figure: word-aligned tree pair for 把 钢笔 给 我 。 and "Give the pen to me ."]
Extracted rule: NG(钢笔) → NN(pen)
Example for contiguous rule extraction (2)
[Figure: word-aligned tree pair for 把 钢笔 给 我 。 and "Give the pen to me ."]
Extracted rule: VG(给) → VBP(give)
Example for contiguous rule extraction (3)
[Figure: word-aligned tree pair for 把 钢笔 给 我 。 and "Give the pen to me ."]
Extracted rule: VBA(P(把),NG(钢笔),VO(VG(给),R(我))) → VP(VBP(give),NP(DT(the),NN(pen)),PP(TO(to),PRP(me)))
Example for contiguous rule extraction (4)
[Figure: word-aligned tree pair for 把 钢笔 给 我 。 and "Give the pen to me ."]
Abstract into substructures: VBA(P(把),NG[1],VO(VG[2],R(我))) → VP(VBP[2],NP(DT(the),NN[1]),PP(TO(to),PRP(me)))
Example for non-contiguous rule extraction (1)
[Figure: word-aligned tree pair for 把 钢笔 给 我 。 and "Give the pen to me ."]
Extracted from non-contiguous tree sequence pairs: VO(VG(给),R(我)) → VBP(give) *** PP(TO(to),PRP(me))
Example for non-contiguous rule extraction (2)
[Figure: word-aligned tree pair for 把 钢笔 给 我 。 and "Give the pen to me ."]
[Figure: source VBA subtree over 把 (NULL) and 钢笔 (pen), tags P NG, with a substitution site at VO; target VP subtree over "the pen", tags DT NN under NP, with substitution sites at VBP and PP. All sites carry index 1.]
Abstract into substructures from non-contiguous tree sequence pairs
The Pisces Decoder
Pisces conducts searching with the following two modules:
1. A CFG-based chart parser, used as a pre-processor that maps an input sentence to a parse tree Ts (for details of the chart parser, see Charniak (1997)).
2. A span-based tree decoder with three phases:
Contiguous decoding (the same as in Zhang et al. 2008)
Source side non-contiguous translation
Tree sequence reordering in Target side
Source side non-contiguous translation
Source gap insertion
[Figure: source subtree PP over 在 (in) 近期 (recent) 的 调查 (survey) 中, tags P NT DEG NN LC, under DNP, NP and LCP.]
NP(DNP(NT(近期),DEG(的)),NN(调查)) → NP(DT(the),JJ(recent),NNS(surveys))
P(在) … LC(中) → IN(in)
Right insertion: IN(in) NP(DT(the),JJ(recent),NNS(surveys))
Left insertion: NP(DT(the),JJ(recent),NNS(surveys)) IN(in)
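The two insertion options can be sketched as follows. The function name `insert_gap_translation` is hypothetical, and reading "right insertion" as placing the gap's translation to the right of the rule's target is an assumption based on the slide layout.

```python
from typing import Dict, List


def insert_gap_translation(rule_target: List[str],
                           gap_target: List[str]) -> Dict[str, List[str]]:
    """Build both candidates for a source-side gapped rule.

    rule_target: translation produced by the gapped rule, e.g. 在 … 中 → in
    gap_target:  translation of the material inside the source gap
    """
    return {
        "right": rule_target + gap_target,  # gap translation goes to the right
        "left": gap_target + rule_target,   # gap translation goes to the left
    }


candidates = insert_gap_translation(["in"], ["the", "recent", "surveys"])
for side, words in candidates.items():
    print(side, " ".join(words))
```

Both candidates would be kept as hypotheses and left to the model score to rank.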
Tree sequence reordering in Target side
Binarize each span into a left span and a right span.
Generate new translation hypotheses for the span by inserting the candidate translations of the right span into each gap in those of the left span.
Generate translation hypotheses for the span by inserting the candidate translations of the left span into each gap in those of the right span.
[Figure: a candidate hypothesis whose target span has gaps, built from a left span and a right span.]
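A minimal sketch of the two gap-filling steps for one binarization. Hypotheses are modeled as tuples of tokens with `***` marking a gap; the names `fill_each_gap` and `combine` are hypothetical, and scoring and pruning are left out.

```python
from typing import List, Sequence, Tuple

GAP = "***"
Hyp = Tuple[str, ...]  # target-side tokens; GAP marks an open gap


def fill_each_gap(host: Hyp, filler: Hyp) -> List[Hyp]:
    """One new hypothesis per gap in host, with filler spliced into that gap."""
    out: List[Hyp] = []
    for i, tok in enumerate(host):
        if tok == GAP:
            out.append(host[:i] + filler + host[i + 1:])
    return out


def combine(left_hyps: Sequence[Hyp], right_hyps: Sequence[Hyp]) -> List[Hyp]:
    """Apply both insertion directions to every left/right candidate pair."""
    results: List[Hyp] = []
    for left in left_hyps:
        for right in right_hyps:
            results.extend(fill_each_gap(left, right))  # right into gaps of left
            results.extend(fill_each_gap(right, left))  # left into gaps of right
    return results


left_span = [("the", "pen")]                 # candidates for 钢笔
right_span = [("give", GAP, "to", "me")]     # candidates for 给 我
print(combine(left_span, right_span))
```

Gaps inside the filler are carried over unchanged, so a later, wider span can still fill them.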
Modeling
: source/target sentence
: source/target parse tree
: a non-contiguous source/target tree sequence
: source/target spans
h_m: the feature function
Features
The bi-phrasal translation probabilities
The bi-lexical translation probabilities
The target language model
The number of words in the target sentence
The number of rules utilized
The average tree depth on the source side of the rules adopted
The number of non-contiguous rules utilized
The number of reorderings caused by the use of non-contiguous rules
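The features above are typically combined as a weighted sum. The sketch below assumes the standard log-linear form used with minimum error rate training; the formula on the Modeling slide is not recoverable from the text, and the feature names and weight values here are made up for illustration.

```python
import math
from typing import Dict


def loglinear_score(features: Dict[str, float],
                    weights: Dict[str, float]) -> float:
    """Weighted sum of feature values h_m with weights lambda_m."""
    return sum(weights[name] * value for name, value in features.items())


# Hypothetical weights; in practice they would be tuned on the development set.
weights = {
    "log_p_phrase": 1.0,
    "log_p_lex": 0.5,
    "log_lm": 1.0,
    "target_words": -0.1,
    "rules_used": -0.2,
    "avg_source_depth": 0.05,
    "nc_rules_used": -0.3,
    "nc_reorderings": -0.1,
}

hyp_a = {"log_p_phrase": math.log(0.20), "log_p_lex": math.log(0.10),
         "log_lm": math.log(0.01), "target_words": 5.0, "rules_used": 3.0,
         "avg_source_depth": 2.0, "nc_rules_used": 1.0, "nc_reorderings": 1.0}
hyp_b = dict(hyp_a, log_lm=math.log(0.001), nc_rules_used=0.0, nc_reorderings=0.0)

best = max([hyp_a, hyp_b], key=lambda h: loglinear_score(h, weights))
print(loglinear_score(hyp_a, weights), loglinear_score(hyp_b, weights))
```

The decoder would keep the hypothesis with the highest score; the counts of non-contiguous rules and reorderings act as soft penalties or rewards depending on the sign of their tuned weights.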
Experimental settings
Training Corpus: Chinese-English FBIS corpus
Development Set: NIST MT 2002 test set
Test Set: NIST MT 2005 test set
Evaluation Metrics: case-sensitive BLEU-4
Parser: Stanford Parser (Chinese/English)
Evaluation: mteval-v11b.pl
Language Model: SRILM 4-gram
Minimum error rate training: (Och, 2003)
Model Optimization: Only allow gaps on one side
Model comparison in BLEU
Table 1: Translation results of different models (cBP refers to contiguous bilingual phrases without syntactic structural information, as used in Moses)
System   Model     BLEU
Moses    cBP       23.86
Pisces   STSSG     25.92
Pisces   SncTSSG   26.53
Rule combination
Table 2: Performance of different rule combinations
ID   Rule Set                        BLEU
1    cR (STSSG)                      25.92
2    cR w/o ncPR                     25.87
3    cR w/o ncPR + tgtncR            26.14
4    cR w/o ncPR + srcncR            26.50
5    cR w/o ncPR + src&tgtncR        26.51
6    cR + tgtncR                     26.11
7    cR + srcncR                     26.56
8    cR + src&tgtncR (SncTSSG)       26.53
cR: rules derived from contiguous tree sequence pairs (i.e., all STSSG rules)
ncPR: non-contiguous rules derived from contiguous tree sequence pairs with at least one non-terminal leaf node between two lexicalized leaf nodes
srcncR: non-contiguous rules with gaps on the source side
tgtncR: non-contiguous rules with gaps on the target side
src&tgtncR: non-contiguous rules with gaps on either side
Bilingual Phrasal Rules
Table 3: Performance of bilingual phrasal rules
System   Rule Set              BLEU
Moses    cBP                   23.86
Pisces   cBP                   22.63
Pisces   cBP + tgtncBP         23.74
Pisces   cBP + srcncBP         23.93
Pisces   cBP + src&tgtncBP     24.24
cR: rules derived from contiguous tree sequence pairs (i.e., all STSSG rules)
ncPR: non-contiguous rules derived from contiguous tree sequence pairs with at least one non-terminal leaf node between two lexicalized leaf nodes
srcncBP: non-contiguous phrasal rules with gaps on the source side
tgtncBP: non-contiguous phrasal rules with gaps on the target side
src&tgtncBP: non-contiguous phrasal rules with gaps on either side
Maximal number of gaps
Table 4: Performance and rule set size with different maximal numbers of gaps
Max gaps allowed
source   target   Rule #       BLEU
0        0        1,661,045    25.92
1        1        +841,263     26.53
2        2        +447,161     26.55
3        3        +17,782      26.56
∞        ∞        +8,223       26.57
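The running size of the rule set can be read off Table 4 with a short computation. It assumes that each "+" entry is the number of rules added on top of the previous row, which the slide does not state explicitly.

```python
from itertools import accumulate

base_rules = 1_661_045                           # no gaps allowed (STSSG)
increments = [841_263, 447_161, 17_782, 8_223]   # max gaps 1, 2, 3, unlimited

# Cumulative rule set size after each relaxation of the gap limit.
totals = list(accumulate([base_rules] + increments))

for label, total in zip(["0", "1", "2", "3", "inf"], totals):
    print(f"max gaps {label}: {total:,} rules")
```

Under that reading, most of the additional rules and nearly all of the BLEU gain arrive with the first allowed gap.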
Sample translations
Example 1
Source: 才/only 过/pass 了/null 五年/five years , 两人/two people 就/null 对簿公堂/confront at court
Reference: after only five years the two confronted each other at court
STSSG: only in the five years , the two candidates would 对簿公堂
SncTSSG: the two people can confront other countries at court leisurely manner only in the five years
Key rules: VV(对簿公堂) → VB(confront) NP(JJ(other),NNS(countries)) IN(at) NN(court) *** JJ(leisurely) NN(manner)
Example 2
Source: 欧元/Euro 的/’s 大幅/substantial 升值/appreciation 将/will 在/in 近期/recent 的/’s 调查/survey 中/middle 持续/continue 对/for 经济/economy 信心/confidence 产生/produce 影响/impact
Reference: substantial appreciation of the euro will continue to impact the economic confidence in the recent surveys
STSSG: substantial appreciation of the euro has continued to have an impact on confidence in the economy , in the recent surveys will
SncTSSG: substantial appreciation of the euro will continue in the recent surveys have an impact on economic confidence
Key rules: AD(将/will) *** VV(持续/continue) → VP(MD(will),VB(continue)); P(在/in) *** LC(中/middle) → IN(in)
Conclusion
A non-contiguous tree sequence alignment model based on SncTSSG attains better modeling of non-contiguous phrases and of the reordering caused by non-contiguous constituents with large gaps.
Observations
In the Chinese-English translation task, gaps are more effective on the Chinese side than on the English side.
Allowing only one gap is effective.
Future Work
Redundant non-contiguous rules
Optimization of the large rule set
The End