
Dependency Tree-to-Dependency Tree Machine Translation






Presentation Transcript


  1. Dependency Tree-to-Dependency Tree Machine Translation November 4, 2011 Presented by: Jeffrey Flanigan (CMU) Lori Levin, Jaime Carbonell In collaboration with: Chris Dyer, Noah Smith, Stephan Vogel

  2. Problem
Swahili: Watoto ni kusoma vitabu.
Gloss: children aux-pres read books
English: Children are reading books.
MT (phrase-based): Children are reading books.
Swahili: Watoto ni kusoma vitabu tatu mpya.
Gloss: children aux-pres read books three new
English: Children are reading three new books.
MT (phrase-based): Children are three new books.
Why? The phrase table contains Pr(reading books | kusoma vitabu) and Pr(books | kusoma vitabu), so the language model is left to choose between "Children are three new reading books." and "Children are reading books three new."

  3. Problem: Grammatical Encoding Missing
Swahili: Nimeona samaki waliokula mashua.
Gloss: I-found fish who-ate boat
English: I found the fish that ate the boat.
MT system: I found that eating fish boat.
The predicate-argument structure was corrupted.

  4. Grammatical Relations
[Figure: dependency tree of "I found the fish that ate the boat." with arcs labeled ROOT, SUBJ, DOBJ, OBJ, RCMOD, REF, DET]
⇒ Dependency trees on source and target!

  5. Approach
Source Sentence → (undo grammatical encoding: parse) → Source Dependency Tree → (translate) → Target Dependency Tree → (grammatical encoding: choose surface form, linearize) → Target Sentence
All stages are statistical.
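The three-stage pipeline can be sketched end to end. Everything below (the toy parser, the lexicon, and the template linearizer) is an illustrative stand-in under assumed names, not the authors' system:

```python
# Illustrative stand-ins for the three statistical stages (assumed names,
# toy data -- not the authors' models).

def parse(sentence):
    # Stand-in for a statistical dependency parser.
    if sentence == "umwaana arasoma ibitabo":
        return {"root": "arasoma",
                "arcs": [("arasoma", "NSUBJ", "umwaana"),
                         ("arasoma", "DOBJ", "ibitabo")]}
    raise ValueError("toy parser knows only one sentence")

def translate(tree):
    # Stand-in for tree-to-tree transfer: lexical substitution plus an
    # inserted auxiliary, mimicking the rules on the later slides.
    lex = {"arasoma": "reading", "umwaana": "child", "ibitabo": "books"}
    arcs = [(lex[h], rel, lex[c]) for h, rel, c in tree["arcs"]]
    arcs.append(("reading", "AUX", "is"))
    return {"root": "reading", "arcs": arcs}

def linearize(tree):
    # Stand-in for statistical linearization: a fixed English template.
    mods = {rel: child for _, rel, child in tree["arcs"]}
    return " ".join(["the", mods["NSUBJ"], mods["AUX"],
                     tree["root"], mods["DOBJ"]])

print(linearize(translate(parse("umwaana arasoma ibitabo"))))
# -> the child is reading books
```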

  6. Extracting the Rules: Extract All Consistent Tree Fragment Pairs
[Figure: aligned dependency trees for Kinyarwanda "Abaana barasoma ibitabo bitatu bishya" and English "Children are reading three new books", with extracted fragment pairs, e.g. barasoma (NSUBJ: [1], DOBJ: [2]) ↔ reading (AUX: are, NSUBJ: [1], DOBJ: [2]); ibitabo ↔ books; bitatu ↔ three; bishya ↔ new; Abaana ↔ Children. Variables [1], [2] mark substitution sites.]
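The consistency requirement behind fragment extraction can be illustrated with the boundary test familiar from phrase-based extraction. This is a simplifying assumption on my part; the actual check operates over tree nodes, variables, and arcs:

```python
# Alignment-consistency test: a candidate (source node set, target node set)
# pair is extractable only if no alignment link crosses its boundary.

def consistent(src_nodes, tgt_nodes, alignments):
    return all((s in src_nodes) == (t in tgt_nodes) for s, t in alignments)

# Toy node alignment for "Abaana barasoma ibitabo" / "Children are reading books":
align = [("Abaana", "Children"), ("barasoma", "are"),
         ("barasoma", "reading"), ("ibitabo", "books")]

assert consistent({"barasoma"}, {"are", "reading"}, align)   # extractable pair
assert not consistent({"barasoma"}, {"reading"}, align)      # the "are" link escapes
assert consistent({"Abaana", "barasoma"}, {"Children", "are", "reading"}, align)
```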

  7. Translating
An extension of phrase-based SMT:
• Linear strings → dependency trees
• Phrase pairs → tree fragment pairs
• Language model → dependency language model
Search is top-down on the target side using a beam search decoder.
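The beam-search idea can be sketched with a toy rule inventory that offers scored target options per source node. The `Hyp` structure, `expand` function, and one-option-per-node setup are assumptions for illustration, not the actual decoder:

```python
# Toy decoding loop with a beam: hypotheses carry a score, the target words
# chosen so far, and the source nodes still to translate.
import heapq
import math
from collections import namedtuple

Hyp = namedtuple("Hyp", "score words remaining")

def beam_search(initial, expand, beam_size=4):
    beam = [initial]
    while any(h.remaining for h in beam):
        candidates = []
        for h in beam:
            candidates.extend(expand(h) if h.remaining else [h])
        # Prune to the beam_size best partial derivations.
        beam = heapq.nlargest(beam_size, candidates, key=lambda h: h.score)
    return max(beam, key=lambda h: h.score)

# Rule options per source node, scored by log P(e|f) (values from the slides).
options = {"arasoma": [("reading", math.log(0.5))],
           "ibitabo": [("books", math.log(0.7))],
           "umwaana": [("child", math.log(0.8)), ("a child", math.log(0.1))]}

def expand(h):
    node, rest = h.remaining[0], h.remaining[1:]
    return [Hyp(h.score + cost, h.words + [word], rest)
            for word, cost in options[node]]

best = beam_search(Hyp(0.0, [], ("arasoma", "umwaana", "ibitabo")), expand)
print(best.words)
# -> ['reading', 'child', 'books']
```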

  8. Translation Example
Input: arasoma (NSUBJ: umwaana [3], DOBJ: ibitabo [4]); target output: "The child is reading books".
Inventory of rules:
• arasoma (NSUBJ: [1], DOBJ: [2]) → reading (AUX: is, NSUBJ: [1], DOBJ: [2]), P(e|f) = .5
• ibitabo ([1]) → books ([1]), P(e|f) = .7
• umwaana (NSUBJ) → child (DET: a), P(e|f) = .1
• umwaana (NSUBJ) → child (DET: the), P(e|f) = .8

  9. Translation Example
Partial derivation: reading (AUX: is, NSUBJ: [3], DOBJ: [4]), built from the input arasoma (NSUBJ: umwaana [3], DOBJ: ibitabo [4]).
Score = w1 ln(.5) + w2 ln(Pr(reading|ROOT)) + w2 ln(Pr(is|(reading,AUX)))
The w2 terms come from a language model on the target dependency tree.
[Figure: same inventory of rules as slide 8]

  10. Translation Example
Partial derivation: reading (AUX: is, NSUBJ: [3], DOBJ: books), after applying the rule ibitabo → books.
Score = w1 ln(.5) + w1 ln(.7) + w2 ln(Pr(reading|ROOT)) + w2 ln(Pr(is|(reading,AUX))) + w2 ln(Pr(books|(reading,DOBJ)))
[Figure: same inventory of rules as slide 8]

  11. Translation Example
Complete derivation: reading (AUX: is, NSUBJ: child (DET: the), DOBJ: books).
Score(Translation) = w1 ln(.5) + w1 ln(.7) + w1 ln(.8) + w2 ln(Pr(reading|ROOT)) + w2 ln(Pr(is|(reading,AUX))) + w2 ln(Pr(books|(reading,DOBJ))) + w2 ln(Pr(child|(reading,NSUBJ))) + w2 ln(Pr(the|(child,DET),(reading,ROOT)))
[Figure: same inventory of rules as slide 8]
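The log-linear score is a weighted sum of log rule probabilities and log dependency-LM probabilities, and can be computed directly. The rule probabilities .5/.7/.8 are from the slides; the weights and the dependency-LM probabilities below are invented placeholders:

```python
# Direct computation of the slide's log-linear score: w1 weights the summed
# log rule probabilities, w2 weights the summed log dependency-LM terms.
import math

w1, w2 = 1.0, 0.5                      # assumed feature weights
rule_probs = [0.5, 0.7, 0.8]           # P(e|f) of the three applied rules
lm_probs = [0.4, 0.6, 0.5, 0.3, 0.9]   # assumed values for Pr(reading|ROOT),
                                       # Pr(is|(reading,AUX)), etc.

score = (w1 * sum(math.log(p) for p in rule_probs)
         + w2 * sum(math.log(p) for p in lm_probs))
print(round(score, 3))
# -> -2.988
```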

  12. Linearization
• Generate projective trees
• A* search
• Left to right with target LM
• Admissible heuristic: highest-scoring completion without LM
[Figure: dependency tree for "He is strong enough" — head strong, with COP: is, NSUBJ: He, ADVMOD: enough]

  13. Linearization (continued)
Candidate: "He is enough strong"
Score = Pr(He|START) · Pr(<NSUBJ,HEAD,COP>|is) · Pr(<ADVMOD,HEAD>|strong)

  14. Linearization (continued)
Candidate: "He is enough strong"
Score = Pr(He|START) · Pr(is|He,START) · Pr(<NSUBJ,HEAD,COP>|is) · Pr(<ADVMOD,HEAD>|strong)

  15. Linearization (continued)
Candidate: "He is strong enough"
Score = Pr(He|START) · Pr(is|He,START) · Pr(strong|He,is) · Pr(<NSUBJ,HEAD,COP>|is) · Pr(<HEAD,ADVMOD>|strong)

  16. Linearization (complete)
Candidate: "He is strong enough"
Score = Pr(He) · Pr(is|He) · Pr(strong|He,is) · Pr(enough|strong,is) · Pr(<NSUBJ,HEAD,COP>|is) · Pr(<HEAD,ADVMOD>|strong)
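The left-to-right search can be sketched as best-first (A*) search over word orders under a toy bigram LM. The bigram table is invented; the heuristic used here is simply zero, which is trivially admissible, standing in for the slide's no-LM completion score:

```python
# Best-first (A*) search over orderings, scored left to right by a toy
# bigram LM (negative log probabilities as path cost).
import heapq
import math

bigram = {("<s>", "He"): 0.5, ("He", "is"): 0.6,
          ("is", "strong"): 0.4, ("strong", "enough"): 0.7,
          ("is", "enough"): 0.05, ("enough", "strong"): 0.05}

def cost(prev, word):
    # Unseen bigrams get a large penalty.
    return -math.log(bigram.get((prev, word), 1e-6))

def astar(words):
    # State: (priority, cost-so-far, generated sequence, remaining words).
    heap = [(0.0, 0.0, ("<s>",), frozenset(words))]
    while heap:
        f, g, seq, remaining = heapq.heappop(heap)
        if not remaining:
            return seq[1:]
        for w in remaining:
            g2 = g + cost(seq[-1], w)
            heapq.heappush(heap, (g2, g2, seq + (w,), remaining - {w}))

print(" ".join(astar(["He", "is", "strong", "enough"])))
# -> He is strong enough
```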

  17. Comparison To Major Approaches

  18. Conclusion
• Separate translation from reordering
• Dependency trees capture grammatical relations
• Can extend phrase-based MT to dependency trees
• Complements ISI’s approach nicely
Work in progress!

  19. Backup Slides

  20. Allowable Rules
• Nodes consistent with alignments
• All variables aligned
• Nodes ∪ variables ∪ arcs ∪ alignments = connected graph
Optional constraints:
• Nodes on source connected
• Nodes on target connected
• Nodes on source and target connected
Decoding constraint:
• Target tree connected
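The connected-graph constraint is an ordinary graph-connectivity check. The encoding below (rule nodes and variables as vertices, arcs and alignment links as edges) is an assumed simplification of the paper's representation:

```python
# Connectivity check for candidate rules: breadth-first search must reach
# every vertex of the rule graph from any starting vertex.
from collections import defaultdict, deque

def is_connected(vertices, edges):
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    start = next(iter(vertices))
    seen, queue = {start}, deque([start])
    while queue:
        v = queue.popleft()
        for u in (adj[v] & vertices) - seen:
            seen.add(u)
            queue.append(u)
    return seen == vertices

# A fragment pair held together by a source arc, a target arc, and one
# alignment link is allowable; drop the alignment link and it is not.
vertices = {"barasoma", "[1]", "reading", "are"}
edges = [("barasoma", "[1]"),        # source arc to variable
         ("reading", "are"),         # target AUX arc
         ("barasoma", "reading")]    # alignment link
assert is_connected(vertices, edges)
assert not is_connected(vertices, edges[:2])
```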

  21. Head Switching Example
[Figure: French "Le bébé vient de tomber" aligned with English "The child just fell": the French head vient (PREP: de, POBJ: tomber) maps to the English adverb just, while the embedded verb tomber maps to the English head fell — the heads switch between source and target trees, with variables [1], [2] marking the shared NSUBJ and ADVMOD slots.]

  22. Moving Up the Triangle
• Propositional semantic dependencies
• Deep syntactic dependencies
• Surface syntactic dependencies

  23. Comparison to Synchronous Phrase Structure Rules
Training dataset:
Kinyarwanda: Abaana baasoma igitabo gishya kyose cyaa Karooli.
English: The children are reading all of Charles ’s new book.
Test sentence:
Kinyarwanda: Abaana baasoma igitabo cyaa Karooli gishya kyose.
Synchronous decoders (SAMT, Hiero, etc.) produce:
"The children are reading book ’s Charles new all of ."
"The children are reading book Charles ’s all of new ."
Problem: grammatical encoding is tied to word order.
