
Arianna Bisazza Advisor: Marcello Federico

PhD Thesis: Linguistically Motivated Reordering Modeling for Phrase-Based Statistical Machine Translation. Arianna Bisazza. Advisor: Marcello Federico. Fondazione Bruno Kessler / Università di Trento.


Presentation Transcript


  1. PhD Thesis: Linguistically Motivated Reordering Modeling for Phrase-Based Statistical Machine Translation. Arianna Bisazza. Advisor: Marcello Federico. Fondazione Bruno Kessler / Università di Trento.

  2. PSMT (phrase-based statistical machine translation) decoding overview. Source sentence: "E' necessario incoraggiare tale mobilità garantendo la sicurezza dei percorsi professionali". Arianna Bisazza – PhD Thesis – 19 April 2013

  3. PSMT decoding overview (continued). Source sentence as on slide 2. Partial translation: "Freedom of movement must be encouraged". Each phrase added to the hypothesis receives reordering model (ReoM), translation model (TM), and language model (LM) scores.

  4. PSMT decoding overview (continued). Partial translation: "Freedom of movement must be encouraged". Further phrases shown: "career paths", "while ensuring that", "…". Every extension receives ReoM, TM, and LM scores.

  5. PSMT decoding overview (continued). The hypothesis grows to "Freedom of movement must be encouraged while ensuring that career paths …", with ReoM, TM, and LM scores applied at every step.

  6. Reordering Models. Many solutions have been proposed, with different reordering classes, features, training modes, etc.: Tillmann 04; Zens & Ney 06; Al-Onaizan & Papineni 06; Galley & Manning 08; Green & al. 10; Feng & al. 10; … All of them produce the ReoM scores used during decoding.

  7. Reordering Models. Many solutions have been proposed, with different reordering classes, features, training modes, etc.: Tillmann 04; Zens & Ney 06; Al-Onaizan & Papineni 06; Galley & Manning 08; Green & al. 10; Feng & al. 10; … No matter what reordering model is used, the permutation search space must be limited! The power of all reordering models is bound to the reordering constraints in use.

  8. ReoM scores. Source sentence: "E' necessario incoraggiare tale mobilità garantendo la sicurezza dei percorsi professionali".

  9. Reordering Constraints. Source sentence: "E' necessario incoraggiare tale mobilità garantendo la sicurezza dei percorsi professionali" (11 words). Without constraints: #perm = |w|! ≈ 40,000,000.

  10. Reordering Constraints. Without constraints: #perm = |w|! ≈ 40,000,000. Source-to-source distortion: $D(w_x, w_y) = |y - x - 1|$.

  11. Reordering Constraints. DL: distortion limit. Without constraints: #perm = |w|! ≈ 40,000,000. With source-to-source distortion $D(w_x, w_y) = |y - x - 1|$ and DL=3: #perm ≈ 7,000.
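
The effect of a distortion limit on the number of permutations can be explored with a short script. The sketch below is an illustration only: it treats every word as its own unit, starts from a virtual position 0, and allows a jump from w_x to w_y only when |y - x - 1| ≤ DL. The function names are hypothetical, and a real decoder applies its limit at the phrase level with additional checks, so these simplified counts are not expected to match the slide figures for DL=3 or DL=7.

```python
from functools import lru_cache
from math import factorial


def distortion(x: int, y: int) -> int:
    """Source-to-source distortion D(w_x, w_y) = |y - x - 1| (1-based positions)."""
    return abs(y - x - 1)


def count_permutations(n: int, dl: int) -> int:
    """Count word orders of an n-word sentence in which every jump costs <= dl.

    Simplifying assumption (not from the slides): only the jump between
    consecutively translated words is checked.
    """
    full = (1 << n) - 1  # bitmask with all n words covered

    @lru_cache(maxsize=None)
    def extend(last: int, covered: int) -> int:
        if covered == full:
            return 1
        total = 0
        for y in range(1, n + 1):
            bit = 1 << (y - 1)
            if not covered & bit and distortion(last, y) <= dl:
                total += extend(y, covered | bit)
        return total

    # Position 0 is a virtual sentence start, so translating word 1 first costs 0.
    return extend(0, 0)


print(factorial(11))               # 39916800, the "≈ 40,000,000" on the slide
print(count_permutations(11, 3))   # simplified count under DL=3
print(count_permutations(11, 7))   # simplified count under DL=7
```

With a limit at least as large as the sentence length, the function returns n!, which gives a quick sanity check on the counting logic.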

  12. The problem with DL… Arabic-English. [Figure: two Arabic (AR) / English (EN) sentence pairs]

  13. The problem with DL… German-English. [Figure: two German (DE) / English (EN) sentence pairs]

  14. Current solution: increasing the distortion limit! Without constraints: #perm = |w|! ≈ 40,000,000. With source-to-source distortion $D(w_x, w_y) = |y - x - 1|$ and DL=3: #perm ≈ 7,000.

  15. Current solution: increasing the distortion limit! Without constraints: #perm = |w|! ≈ 40,000,000. With source-to-source distortion $D(w_x, w_y) = |y - x - 1|$: DL=3 gives #perm ≈ 7,000; DL=7 gives #perm ≈ 7,000,000. Coarse reordering space definition: slower decoding, worse translations.

  16. Observations. Word reordering is difficult! The existing word reordering models are not perfect, but they are expected to guide search over huge search spaces. One way to go: design a perfect model (problem: many have already tried and failed). Our way: simplify the task for the existing reordering models.

  17. Working hypotheses. (1) A better definition of the reordering search space (i.e. constraints) can simplify the task of the reordering model. (2) (Shallow) linguistic knowledge can help us to refine the reordering search space for a given language pair.

  18. Outline. The problem. The solutions: verb reordering lattices; modified distortion matrices; dynamically pruning the reordering space. Comparative evaluation & conclusions.

  19. Outline. The problem. The solutions: verb reordering lattices; modified distortion matrices; dynamically pruning the reordering space. Comparative evaluation & conclusions. References: Bisazza and Federico, Chunk-based Verb Reordering in VSO Sentences for Arabic-English, WMT 2010; Bisazza, Pighin, Federico, Chunk-Lattices for Verb Reordering in Arabic-English Statistical Machine Translation, MT Journal 2012.

  20. Idea: keep a low distortion limit and modify the input to allow only specific long reorderings. Without constraints: #perm = |w|! ≈ 40,000,000. With source-to-source distortion $D(w_x, w_y) = |y - x - 1|$: DL=3 gives #perm ≈ 7,000; DL=7 gives #perm ≈ 7,000,000.

  21. Reordering patterns in Arabic-English. Example of VSO (verb-subject-object) sentences: the Arabic verb is anticipated with respect to the English order. Typical PSMT outputs: "*The Moroccan monarch King Mohamed VI __ his support to…" (verb dropped); "*He renewed the Moroccan monarch King Mohamed VI his support to…" (verb left in the Arabic position).

  22. Working hypothesis. Uneven distribution of long and short-range word movements. Few long: verb-subject-object sentences. We try to model them explicitly! Many short: adjective-noun; head-initial genitive constructions (idafa). We assume they are well handled in standard PSMT.

  23. Chunk-based fuzzy reordering rules. Shallow syntax chunking: cheaper and easier than deep parsing; constrains reorderings in a softer way. Fuzzy (non-deterministic) reordering rules: generate N permutations for each matching sequence; the final reordering decision is taken during translation, guided by all SMT models (ReoM, LM...). Few rules per language pair, capturing only long reorderings.

  24. Chunk-based fuzzy reordering rules. Rule 1: move the verb chunk CH(V) ahead by 1 to N chunks. Rule 2: move the verb chunk and the following chunk ahead by 1 to N chunks. [Figure: chunk sequences "… CH(*) CH(V) CH(*) CH(*) …" illustrating both rules]

  25. Chunk-based verb reordering in parallel data. The optimal reordering is the one that minimizes total distortion.
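
As a rough illustration of "minimizes total distortion", the sketch below scores candidate source reorderings at the chunk level and keeps the cheapest one. Several things here are assumptions rather than content of the slides: the chunk labels, the function names, and `target_order`, which stands in for the chunk order that word alignments to the English side would induce. The thesis works from word-aligned parallel data, which this simplified version does not model.

```python
def total_distortion(source_order, target_order):
    """Sum of |y - x - 1| jumps needed to visit `source_order` in `target_order`.

    Positions are 0-based indices into `source_order`; the walk starts from a
    virtual position -1, so beginning at index 0 costs nothing.
    """
    position = {chunk: i for i, chunk in enumerate(source_order)}
    last = -1
    total = 0
    for chunk in target_order:
        y = position[chunk]
        total += abs(y - last - 1)
        last = y
    return total


def best_reordering(candidates, target_order):
    """Return the candidate source order with the lowest total distortion."""
    return min(candidates, key=lambda cand: total_distortion(cand, target_order))


# Hypothetical example: a verb-initial clause whose English order puts the
# subject chunks (NC4 PC5) before the verb chunk (VC2) and its complement (PC3).
original = ["CC1", "VC2", "PC3", "NC4", "PC5", "Pct6"]
target_order = ["CC1", "NC4", "PC5", "VC2", "PC3", "Pct6"]  # assumed alignment order
candidates = [
    original,
    ["CC1", "PC3", "VC2", "NC4", "PC5", "Pct6"],
    ["CC1", "NC4", "VC2", "PC3", "PC5", "Pct6"],
    ["CC1", "NC4", "PC5", "VC2", "PC3", "Pct6"],
]

for cand in candidates:
    print(total_distortion(cand, target_order), " ".join(cand))
print("best:", " ".join(best_reordering(candidates, target_order)))
```

A candidate that already matches the target order has total distortion 0, so it is always selected when present among the candidates.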

  26. Chunk-based verb reordering in test data. [Figure: reordering options "move verb chunk" and "move verb chunk and following chunk"; legend: verb chunk, other chunks]

  27. Experiments. Task: NIST-MT09 (news translation). Systems based on Moses, including lexicalized phrase reordering models [Tillmann 04; Koehn & al 05]. Non-monotonic lattice decoding [Dyer & al 08]. Evaluation by BLEU [Papineni & al 01] for lexical match & local order, and by KRS [Birch & al 10] for global order.

  28. Arabic-English: Translation Quality. [Chart annotations: +0.5 BLEU, +0.4 KRS] Test set: eval09-nw. Lattices are always used with pre-ordered training data. Oracle: test set pre-ordered by looking at the reference. (More details on lattice pruning in the thesis.)

  29. Arabic-English: Translation Quality and Translation Time. [Chart annotations: -0.1 BLEU, -0.3 KRS; chart labels: Decoding, Pruning] Test set: eval09-nw. Lattices are always used with pre-ordered training data. Oracle: test set pre-ordered by looking at the reference. (More details on lattice pruning in the thesis.)

  30. Lessons learned. Limit long reordering to a few chunks only. Use a lattice to represent the extra reorderings. Drawback: decoding slows down. Can we do better? Observation: lattice topology basically distorts word-to-word distances, i.e. during decoding some distant positions become closer. Can we achieve the same effect more directly?

  31. Outline. The problem. The solutions: verb reordering lattices; modified distortion matrices; dynamically pruning the reordering space. Comparative evaluation & conclusions. Reference: Bisazza and Federico, Modified Distortion Matrices for Phrase-Based Statistical Machine Translation, ACL 2012.

  32. Without constraints: #perm = |w|! ≈ 40,000,000. With source-to-source distortion $D(w_x, w_y) = |y - x - 1|$: DL=3 gives #perm ≈ 7,000; DL=7 gives #perm ≈ 7,000,000.

  33. Idea: modify the distortion matrix for each test sentence! Without constraints: #perm = |w|! ≈ 40,000,000. With source-to-source distortion $D(w_x, w_y) = |y - x - 1|$: DL=3 gives #perm ≈ 7,000; DL=7 gives #perm ≈ 7,000,000; DL=3 & modif(D) gives #perm ≈ 20,000, a refined reordering search space.

  34. Chunk-based fuzzy reordering rules, Arabic-English: “Move verb chunk (and following chunk) to the right by 1 to N chunks”. Example: w- $Ark fy AltZAhrp E$rAt AlmslHyn mn AlktA}b . (word-by-word gloss: and took part in the march dozens of militants from the Brigades .) Chunks: CC1 VC2 PC3 NC4 PC5 Pct6.

  35. Chunk-based fuzzy reordering rules, Arabic-English: “Move verb chunk (and following chunk) to the right by 1 to N chunks”. Example: w- $Ark fy AltZAhrp E$rAt AlmslHyn mn AlktA}b . (gloss: and took part in the march dozens of militants from the Brigades .) Original chunks: CC1 VC2 PC3 NC4 PC5 Pct6. Moving the verb chunk VC2 right by 1, 2, or 3 chunks: CC1 PC3 VC2 NC4 PC5 Pct6; CC1 PC3 NC4 VC2 PC5 Pct6; CC1 PC3 NC4 PC5 VC2 Pct6.

  36. Chunk-based fuzzy reordering rules, Arabic-English: “Move verb chunk (and following chunk) to the right by 1 to N chunks”. Example: w- $Ark fy AltZAhrp E$rAt AlmslHyn mn AlktA}b . (gloss: and took part in the march dozens of militants from the Brigades .) Original chunks: CC1 VC2 PC3 NC4 PC5 Pct6. Moving VC2 alone: CC1 PC3 VC2 NC4 PC5 Pct6; CC1 PC3 NC4 VC2 PC5 Pct6; CC1 PC3 NC4 PC5 VC2 Pct6. Moving VC2 together with the following chunk PC3: CC1 NC4 VC2 PC3 PC5 Pct6; CC1 NC4 PC5 VC2 PC3 Pct6.
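
The rule “move verb chunk (and following chunk) to the right by 1 to N chunks” can be sketched as a small permutation generator. This is a minimal illustration, not the thesis implementation: the function names, the `max_shift` and `block_size` parameters, and the convention that a verb chunk label starts with "VC" are all assumptions made for the example. Any additional restrictions the real rules apply (for instance around punctuation) are not modelled.

```python
def is_verb_chunk(label: str) -> bool:
    """Assumed labelling convention: verb chunks are tagged 'VC<index>'."""
    return label.startswith("VC")


def verb_reorderings(chunks, max_shift, block_size=1):
    """Generate permutations that move each verb chunk to the right.

    block_size=1 moves the verb chunk alone; block_size=2 moves the verb
    chunk together with the chunk that follows it. Each block is shifted
    right by 1..max_shift chunks, stopping at the end of the sequence.
    """
    results = []
    for i, label in enumerate(chunks):
        if not is_verb_chunk(label):
            continue
        block = chunks[i:i + block_size]
        if len(block) < block_size:
            continue  # not enough chunks after the verb to form the block
        rest = chunks[:i] + chunks[i + block_size:]
        for shift in range(1, max_shift + 1):
            insert_at = i + shift
            if insert_at > len(rest):
                break
            results.append(rest[:insert_at] + block + rest[insert_at:])
    return results


chunks = ["CC1", "VC2", "PC3", "NC4", "PC5", "Pct6"]
for perm in verb_reorderings(chunks, max_shift=3):
    print(" ".join(perm))
for perm in verb_reorderings(chunks, max_shift=2, block_size=2):
    print(" ".join(perm))
```

Calling the function with `block_size=2` and `max_shift=2` on the example chunk sequence yields the two verb-plus-following-chunk permutations listed on the slide.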

  37. Chunk-based fuzzy reordering rules: reordering selection with a reordered-source LM. Example: w- $Ark fy AltZAhrp E$rAt AlmslHyn mn AlktA}b . (gloss: and took part in the march dozens of militants from the Brigades .) Original chunks: CC1 VC2 PC3 NC4 PC5 Pct6. Candidate reorderings: CC1 PC3 VC2 NC4 PC5 Pct6; CC1 PC3 NC4 VC2 PC5 Pct6; CC1 PC3 NC4 PC5 VC2 Pct6; CC1 NC4 VC2 PC3 PC5 Pct6; CC1 NC4 PC5 VC2 PC3 Pct6. LM scores shown: 0.7, 0.1, 0.1, 0.4, 0.9.

  38. Chunk-based fuzzy reordering rules: reordering selection with a reordered-source LM. Original chunks: CC1 VC2 PC3 NC4 PC5 Pct6. LM scores shown: 0.7, 0.1, 0.1, 0.4, 0.9. Reorderings to include in the distortion matrix: CC1 PC3 VC2 NC4 PC5 Pct6 (0.7) and CC1 NC4 PC5 VC2 PC3 Pct6 (0.9).

  39. Modifying the distortion matrix. Reorderings to include in the distortion matrix: CC1 PC3 VC2 NC4 PC5 Pct6 and CC1 NC4 PC5 VC2 PC3 Pct6. [Figure: distortion matrix, modified step by step]

  47. Modifying the distortion matrix. Decoder input: “w- $Ark fy AltZAhrp E$rAt AlmslHyn mn AlktA}b .”
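
The matrices on these slides were images and did not survive in the transcript, so the following is only one plausible reading of "modifying the distortion matrix". It assumes that, for every pair of chunks made adjacent by a selected reordering, the jump from the last word of the first chunk to the first word of the second chunk is given cost 0, which brings distant positions closer in the way slide 30 describes. The function names, the chunk spans, and the choice of 0 as the shortcut value are all assumptions.

```python
def distortion_matrix(n):
    """Standard matrix D[x][y] = |y - x - 1| over 0-based word positions."""
    return [[abs(y - x - 1) for y in range(n)] for x in range(n)]


def add_shortcuts(matrix, chunk_spans, reordering):
    """Return a copy of `matrix` with shortcuts for one selected reordering.

    chunk_spans maps a chunk label to (first_word, last_word), 0-based and
    inclusive. For each pair of chunks adjacent in `reordering`, the jump
    from the end of the first to the start of the second is set to 0
    (assumed shortcut value).
    """
    modified = [row[:] for row in matrix]
    for a, b in zip(reordering, reordering[1:]):
        last_of_a = chunk_spans[a][1]
        first_of_b = chunk_spans[b][0]
        modified[last_of_a][first_of_b] = 0
    return modified


# Example sentence: w- $Ark fy AltZAhrp E$rAt AlmslHyn mn AlktA}b .
# Word spans per chunk are an assumption based on the word-by-word gloss.
chunk_spans = {
    "CC1": (0, 0), "VC2": (1, 1), "PC3": (2, 3),
    "NC4": (4, 5), "PC5": (6, 7), "Pct6": (8, 8),
}
base = distortion_matrix(9)
selected = ["CC1", "NC4", "PC5", "VC2", "PC3", "Pct6"]
modified = add_shortcuts(base, chunk_spans, selected)

print(base[0][4], "->", modified[0][4])  # jump from CC1 to the start of NC4
print(base[7][1], "->", modified[7][1])  # jump from the end of PC5 back to VC2
```

Under a low distortion limit, the decoder can then follow the selected long reordering, while every other long jump keeps its original cost and stays out of reach.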

  48. Experiments. Tasks: NIST-MT09 for Ar-En, WMT10 for De-En. Systems based on Moses, including state-of-the-art hierarchical lexicalized reordering models [Tillmann 04; Koehn & al 05; Galley & Manning 08]. Baseline distortion limits: 5 in Ar-En, 10 in De-En. Evaluation by BLEU for lexical match & local order, and by KRS for global order.

  49. Arabic-English: Translation Quality and Translation Time. [Chart annotations: +0.9 BLEU, +0.6 KRS] Test set: eval09-nw. Distortion modified with 3-best reorderings per rule-matching sequence.

  50. German-English: Translation Quality and Translation Time. [Chart annotations: +0.5 BLEU, +0.7 KRS] Test set: newstest10. Distortion modified with 3-best reorderings per rule-matching sequence.
