
Presentation Transcript


  1. A non-contiguous Tree Sequence Alignment-based Model for Statistical Machine Translation. Jun Sun┼, Min Zhang╪, Chew Lim Tan┼

  2. Outline • Introduction • Non-contiguous Tree Sequence Modeling • Rule Extraction • Non-contiguous Decoding: the Pisces Decoder • Experiments • Conclusion

  3. Contiguous and Non-contiguous Bilingual Phrases • Non-contiguous translational equivalences • Contiguous translational equivalences

  4. Previous Work on Non-contiguous Phrases • (-) Zhang et al. (2008) acquire non-contiguous phrasal rules from contiguous tree sequence pairs, and find them unhelpful in real syntax-based translation systems. • (+) Wellington et al. (2006) report statistically that discontinuities are very useful for translational equivalence analysis using binary branching structures under word alignment and parse tree constraints. • (+) Bod (2007) also finds that discontinuous phrasal rules yield a significant improvement in a linguistically motivated STSG-based translation model.

  5. Previous Work on Non-contiguous Phrases (cont.) VP(VV(到), NP(CP[0], NN(时候))) → SBAR(WRB(when), S[0]) [Figure: a non-contiguous rule derived from contiguous tree sequence pairs]

  6. Previous Work on Non-contiguous phrases (cont.) No match in rule set

  7. Proposed Non-contiguous Phrase Modeling [Figure: rules extracted from non-contiguous tree sequence pairs]

  8. Contributions • The proposed model extracts translation rules not only from contiguous tree sequence pairs but also from non-contiguous tree sequence pairs (with gaps). With the help of non-contiguous tree sequences, the model captures non-contiguous phrases well, avoids the constraint of requiring large applicable context, and enhances non-contiguous constituent modeling. • A decoding algorithm for non-contiguous phrase modeling

  9. Outline • Introduction • Non-contiguous Tree Sequence Modeling • Rule Extraction • Non-contiguous Decoding: the Pisces Decoder • Experiments • Conclusion

  10. SncTSSG • Synchronous Tree Substitution Grammar (STSG; Chiang, 2006) • Synchronous Tree Sequence Substitution Grammar (STSSG; Zhang et al., 2008) • Synchronous non-contiguous Tree Sequence Substitution Grammar (SncTSSG)

  11. Word-Aligned Parse Tree and Two Parse Tree Sequences [Figure: (1) a word-aligned bi-parsed tree over 给 我 把 钢笔; (2) two abstract substructures; (3) two tree sequences]

  12. Contiguous Translation Rules r1. Contiguous Tree-to-Tree Rule r2. Contiguous Tree Sequence Rule

  13. Non-contiguous Translation Rules r1. Non-contiguous Tree-to-Tree Rule r2. Non-contiguous Tree Sequence Rule
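The distinction between the contiguous and non-contiguous rules on slides 12-13 can be sketched as a small data structure. This is a hedged illustration, not the authors' code: the `Rule` class, the `"*"` gap marker, and the `links` field are hypothetical, but the first rule follows the notation of the example on slide 5.

```python
# Minimal sketch (not the authors' code) of contiguous vs.
# non-contiguous translation rules. "*" marks a gap in a tree
# sequence; a rule is non-contiguous if either side has a gap.
from dataclasses import dataclass, field

@dataclass
class Rule:
    src: list                                   # source-side tree sequence
    tgt: list                                   # target-side tree sequence
    links: list = field(default_factory=list)   # aligned non-terminal slots

    @property
    def is_noncontiguous(self):
        return "*" in self.src or "*" in self.tgt

# Contiguous tree-to-tree rule: one tree per side, no gaps
# (notation follows slide 5's example).
r1 = Rule(src=["VP(VV(到), NP(CP[0], NN(时候)))"],
          tgt=["SBAR(WRB(when), S[0])"], links=[(0, 0)])

# Non-contiguous tree sequence rule (hypothetical example):
# the source side has a gap between two subtrees.
r2 = Rule(src=["VV(到)", "*", "NN(时候)"],
          tgt=["WRB(when)", "*"])

print(r1.is_noncontiguous)  # False
print(r2.is_noncontiguous)  # True
```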

  14. Outline • Introduction • Non-contiguous Tree Sequence Modeling • Rule Extraction • Non-contiguous Decoding: the Pisces Decoder • Experiments • Conclusion

  15. A word-aligned parse tree pair

  16. Example for contiguous rule extraction(1)

  17. Example for contiguous rule extraction(2)

  18. Example for contiguous rule extraction(3)

  19. Example for contiguous rule extraction(4) Abstract into substructures

  20. Example for non-contiguous rule extraction(1) Extracted from non-contiguous tree sequence pairs

  21. Example for non-contiguous rule extraction(2) Abstract into substructures from non-contiguous tree sequence pairs

  22. Outline • Introduction • Non-contiguous Tree Sequence Modeling • Rule Extraction • Non-contiguous Decoding: the Pisces Decoder • Experiments • Conclusion

  23. The Pisces Decoder • Pisces searches using two modules • The first is a CFG-based chart parser, used as a pre-processor to map an input sentence to a parse tree Ts (for details of the chart parser, see Charniak (1997)) • The second is a span-based tree decoder with 3 phases • Contiguous decoding (same as in Zhang et al., 2008) • Source-side non-contiguous translation • Tree sequence reordering on the target side
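The three-phase span-based decoding described on this slide can be sketched as a bottom-up chart loop. This is only a control-flow illustration under assumed names, not the Pisces implementation: the three phase functions here are trivial stand-ins (the contiguous phase just concatenates sub-span translations, and the other two phases are no-ops).

```python
# Control-flow sketch of span-based decoding with three phases per span.
# The phase bodies are trivial stand-ins, not the Pisces implementation.

def contiguous_decode(chart, i, j):
    # Phase 1 stand-in: length-1 spans get a dummy translation;
    # longer spans concatenate the best hypotheses of their sub-spans.
    if j - i == 1:
        return [[f"w{i}"]]
    return [chart[(i, k)][0] + chart[(k, j)][0] for k in range(i + 1, j)]

def insert_source_gaps(chart, i, j):
    return []      # Phase 2 stand-in: no non-contiguous hypotheses here.

def reorder_target(hyps):
    return hyps    # Phase 3 stand-in: keep hypotheses unchanged.

def decode(n):
    chart = {}
    for length in range(1, n + 1):          # bottom-up over span lengths
        for i in range(0, n - length + 1):
            j = i + length
            hyps = contiguous_decode(chart, i, j)
            hyps += insert_source_gaps(chart, i, j)
            chart[(i, j)] = reorder_target(hyps)
    return chart[(0, n)][0]                 # best hypothesis for full span

print(decode(3))  # ['w0', 'w1', 'w2']
```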

  24. Source-side non-contiguous translation • Source gap insertion: right insertion / left insertion [Figure example with NP(...) and IN(in)]
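The gap-insertion operation above reduces to attaching a neighbouring subtree sequence on one side of the current span's sequence. A minimal sketch, assuming a list-of-subtrees representation (the function name and signature are hypothetical):

```python
# Minimal sketch (assumed representation) of source-side gap insertion:
# attach a neighbouring subtree on the right or left of a tree sequence.

def insert_gap(span_seq, neighbour, side):
    """Return a new tree sequence with `neighbour` inserted on `side`."""
    if side == "right":
        return span_seq + [neighbour]
    if side == "left":
        return [neighbour] + span_seq
    raise ValueError("side must be 'left' or 'right'")

print(insert_gap(["NP(...)"], "IN(in)", "right"))  # ['NP(...)', 'IN(in)']
print(insert_gap(["NP(...)"], "IN(in)", "left"))   # ['IN(in)', 'NP(...)']
```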

  25. Tree sequence reordering on the target side • Binarize each span into a left sub-span and a right sub-span. • Generate new translation hypotheses for the span by inserting the candidate translations of the right sub-span into each gap of those of the left sub-span. • Generate translation hypotheses for the span by inserting the candidate translations of the left sub-span into each gap of those of the right sub-span. [Figure: a candidate hypothesis; target span with gaps; left span; right span]
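The gap-filling step above can be sketched as follows. This is a hedged illustration, not the authors' code: hypotheses are assumed to be token lists with `"*"` marking a gap, and the same function covers both directions (right-into-left and left-into-right) by swapping its arguments.

```python
# Minimal sketch (assumed representation) of target-side reordering:
# insert each filler hypothesis into each gap ("*") of each host hypothesis.

def fill_gaps(host_hyps, filler_hyps):
    out = []
    for host in host_hyps:
        gaps = [i for i, tok in enumerate(host) if tok == "*"]
        for g in gaps:
            for filler in filler_hyps:
                out.append(host[:g] + filler + host[g + 1:])
    return out

left = [["when", "*"]]        # left sub-span hypothesis with one gap
right = [["he", "arrives"]]   # right sub-span hypotheses
print(fill_gaps(left, right))  # [['when', 'he', 'arrives']]
```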

  26. Modeling • source/target sentence • source/target parse tree • a non-contiguous source/target tree sequence • source/target spans • hm: the m-th feature function
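The mathematical symbols on this slide were lost in transcription. As a hedged reconstruction, models of this kind typically score translations with the standard log-linear formulation over the feature functions hm (following Och and Ney, 2002); the paper's actual form, over tree sequence derivations, may differ in its conditioning variables:

```latex
\hat{t} = \arg\max_{t} \Pr(t \mid s)
        = \arg\max_{t} \sum_{m=1}^{M} \lambda_m \, h_m(s, t)
```

Here s and t are the source and target sentences and the λm are the feature weights tuned by minimum error rate training (slide 29).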

  27. Features • The bi-phrasal translation probabilities • The bi-lexical translation probabilities • The target language model • The # of words in the target sentence • The # of rules utilized • The average tree depth in the source side of the rules adopted • The # of non-contiguous rules utilized • The # of reordering times caused by the utilization of the non-contiguous rules

  28. Outline • Introduction • Non-contiguous Tree Sequence Modeling • Rule Extraction • Non-contiguous Decoding: the Pisces Decoder • Experiments • Conclusion

  29. Experimental settings • Training Corpus: • Chinese-English FBIS corpus • Development Set: • NIST MT 2002 test set • Test Set: • NIST MT 2005 test set • Evaluation Metrics: • case-sensitive BLEU-4 • Parser: • Stanford Parser (Chinese/English) • Evaluation: • mteval-v11b.pl • Language Model: • SRILM 4-gram • Minimum error rate training: • (Och, 2003) • Model Optimization: • Only allow gaps in one side

  30. Model comparison in BLEU Table 1: Translation results of different models (cBP refers to contiguous bilingual phrases without syntactic structural information, as used in Moses)

  31. Rule combination cR: rules derived from contiguous tree sequence pairs (i.e., all STSSG rules) ncPR: non-contiguous rules derived from contiguous tree sequence pairs with at least one non-terminal leaf node between two lexicalized leaf nodes srcncR: non-contiguous rules with gaps in the source side tgtncR: non-contiguous rules with gaps in the target side src&tgtncR : non-contiguous rules with gaps in either side Table 2: Performance of different rule combination

  32. Bilingual Phrasal Rules cR: rules derived from contiguous tree sequence pairs (i.e., all STSSG rules) ncPR: non-contiguous rules derived from contiguous tree sequence pairs with at least one non-terminal leaf node between two lexicalized leaf nodes srcncBP: non-contiguous phrasal rules with gaps in the source side tgtncBP: non-contiguous phrasal rules with gaps in the target side src&tgtncBP : non-contiguous phrasal rules with gaps in either side Table 3: Performance of bilingual phrasal rules

  33. Maximal number of gaps • Table 4: Performance and rule size changing with different maximal number of gaps

  34. Sample translations

  35. Conclusion • The model attains better non-contiguous phrase modeling, and better reordering for non-contiguous constituents with large gaps, through • A non-contiguous tree sequence alignment model based on SncTSSG • Observations • In the Chinese-English translation task, gaps are more effective on the Chinese side than on the English side. • Allowing only one gap is effective • Future Work • Redundant non-contiguous rules • Optimization of the large rule set

  36. The End
