
Training a Parser for Machine Translation Reordering






Presentation Transcript


  1. Training a Parser for Machine Translation Reordering Jason Katz-Brown, Slav Petrov, Ryan McDonald, Franz Och, David Talbot, Hiroshi Ichikawa, Masakazu Seno, Hideto Kazawa

  2. Dependency Parsing • Given a sentence, label the dependencies • (from nltk.org) • Output is useful for downstream tasks like machine translation • Also of interest to NLP researchers

  3. Overview of Paper • Motivation • Targeted Self Training Algorithm • MT experiments • Domain adaptation

  4. Motivation - Evaluation • Intrinsic • How well does system replicate gold annotations? • Precision/recall/F1, accuracy, BLEU, ROUGE, etc. • Extrinsic • How useful is system for some downstream task? • High performance on one doesn’t necessarily mean high performance on the other • Can be hard to evaluate extrinsically

  5. Motivation • Parsing is not a stand-alone task • Useful as part of a larger system • High-fidelity replication of gold parses won’t necessarily yield the best downstream performance • Try to train a model that will yield better downstream performance than a model trained to replicate gold standard • Maximize extrinsic quality, rather than intrinsic

  6. Targeted Self Training Algorithm • For each sentence S in a corpus: • Parse S with a baseline parser to get a k-best list • Choose the parse of S that optimizes some function F and add it to the training data • Retrain the parser • F measures the extrinsic quality of the parse • Finding a good F can be challenging! • Standard self-training: just choose the 1-best parse
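The loop on this slide can be sketched in a few lines of Python. This is only an illustration of the idea: the arguments `parse_kbest`, `extrinsic_score`, and `retrain` are hypothetical stand-ins for the paper's parser, extrinsic function F, and training procedure, not real interfaces.

```python
def targeted_self_training(parse_kbest, extrinsic_score, corpus, seed_data, retrain):
    """One round of targeted self-training (all callables are hypothetical stand-ins).

    parse_kbest(sentence)            -> list of candidate parses (k-best)
    extrinsic_score(sentence, parse) -> downstream quality of the parse (the function F)
    retrain(data)                    -> a parser trained on the augmented data
    """
    augmented = list(seed_data)  # start from the original gold training trees
    for sentence in corpus:
        candidates = parse_kbest(sentence)
        # Standard self-training would take candidates[0] (the parser's 1-best);
        # targeted self-training instead picks the candidate that maximizes F.
        best = max(candidates, key=lambda parse: extrinsic_score(sentence, parse))
        augmented.append((sentence, best))
    return retrain(augmented)
```

With `extrinsic_score` returning a constant, this degenerates to adding the first candidate; the whole benefit comes from how well F reflects downstream (e.g. reordering) quality.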

  7. Reordering • Reordering changes source-language word order into target-language word order • Here: English (SVO) to Japanese (SOV) • Metrics that account for word order correlate better with human judgment than metrics based on word choice alone • Can use manually or automatically derived tree transforms to reorder • Reordering is useful as a preprocessing step for MT
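A toy example of a tree transform for SVO-to-SOV reordering is head finalization: emit every head after its dependents. This is a minimal sketch, not the paper's actual hand-crafted rules, and the `(word, dependents)` tuple encoding of a dependency tree is an assumption for illustration.

```python
def head_final_order(tree):
    """Reorder a dependency tree head-finally (toy SVO -> SOV transform).

    `tree` is assumed to be (word, [dependent_subtrees]); dependents keep
    their relative order, and the head word is emitted last.
    """
    word, dependents = tree
    out = []
    for dep in dependents:
        out.extend(head_final_order(dep))  # recurse into each dependent subtree
    out.append(word)                       # head comes after all its dependents
    return out
```

For "John saw Mary" with "saw" as root and "John"/"Mary" as dependents, this yields the SOV order "John Mary saw", which is why a wrong parse (e.g. attaching "Mary" elsewhere) directly produces a wrong reordering.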

  8. Reordering • Reordering is evaluated as its own step • Function to evaluate reordering quality, given a gold reordering: 1 – ((# chunks – 1) / (# words – 1)) • Chunks are maximal spans that are contiguous in both the predicted and gold orders • Example: prediction A B E C D; gold A B C D E • 3 chunks (A B | E | C D), so the score is 1 – ((3 – 1) / (5 – 1)) = 0.5
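The chunk-based score from this slide can be computed directly. A minimal sketch, assuming the predicted order is a permutation of the gold order with distinct tokens (a simplification for illustration):

```python
def reordering_score(predicted, gold):
    """Fuzzy reordering score: 1 - ((# chunks - 1) / (# words - 1)).

    A chunk is a maximal span that is contiguous in both orders; we count a
    new chunk each time adjacent predicted words are not adjacent in gold.
    Assumes `predicted` is a permutation of `gold` with unique tokens.
    """
    pos = {word: i for i, word in enumerate(gold)}  # gold position of each word
    chunks = 1
    for prev, cur in zip(predicted, predicted[1:]):
        if pos[cur] != pos[prev] + 1:               # chunk boundary
            chunks += 1
    return 1 - (chunks - 1) / (len(predicted) - 1)
```

On the slide's example (prediction A B E C D vs. gold A B C D E) the chunks are A B | E | C D, giving 1 - 2/4 = 0.5; a perfect reordering scores 1.0.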

  9. Parsing and Reordering • Different parses yield different reorderings • Systems tend to be sensitive to errors

  10. MT Experiment Setup • Train a baseline Nivre dependency parser on the WSJ (a Berkeley parser is also tested) • English/Japanese corpus with literal translations and manual word alignments • 6,268 training / 7,327 test sentences • Alignment annotators need very little training, which makes the data relatively cheap • Annotating dependency parses, by contrast, requires a lot of training

  11. MT Experiment Setup • Use hand-crafted rules for reordering • Phrase-based MT system • Train parser in 3 ways: • Baseline • Standard self-training • Targeted self-training • Look at: • Labeled attachment score (LAS; intrinsic) • Reordering score • MT quality (BLEU and human)

  12. Results

  13. Results • Evaluated MT quality with BLEU and human raters • Varied the training of the dependency parser that feeds the reordering component • Experiments in Korean, Japanese, and Turkish (all SOV languages) • In all cases BLEU and human opinion improve with targeted self-training (10x) compared to the baseline parser • Humans still rate translation quality in the “some meaning/grammar” range (~2.5/6), so the improvement is not drastic

  14. Domain Adaptation Experiment • Use the Question Treebank (QTB) to make the MT system translate questions better than the baseline system • Have 2k questions parsed • Have 2k questions translated and annotated for reordering • Compare translation output from systems that include parsers trained in different ways

  15. Results

  16. Results • BLEU score and human opinion of Japanese translations of QTB test sentences were higher with targeted self-training than with the baseline parser • Training on gold QTB parses yielded a better reordering score, but gold parses are more expensive to produce than word alignments • BLEU/human opinion on the resulting translations was not reported
