
Translation Divergence

Translation Divergence. LING 580MT, Fei Xia, 1/10/06. Paper: Bonnie Dorr (1994), Machine Translation Divergences: a Formal Description and Proposed Solution. Outline: formal definition of translation divergence; seven types of divergence; discussion; remaining questions.


Presentation Transcript


  1. Translation Divergence LING 580MT Fei Xia 1/10/06

  2. Papers • Bonnie Dorr (1994): Machine Translation Divergences: a Formal Description and Proposed Solution

  3. Outline • Formal definition of translation divergence • Seven types of divergence • Discussion • Remaining questions

  4. Formal definition of translation divergence

  5. Distinction between the source and target languages Two categories (Barnett et al., 1991): • Translation divergence: same information, different structures • Translation mismatches: different information → important, but outside the scope of the paper

  6. How to define translation divergence formally? Define the language-to-language divergence via language-to-interlingua divergence: • Interlingua: lexical conceptual structure (LCS) • Language-to-interlingua: mapping from syntactic form to LCS

  7. Lexical conceptual structure (LCS)
     [T(X’) X’ ([T(W’) W’], [T(Z1’) Z1’] … [T(Zn’) Zn’], [T(Q1’) Q1’] … [T(Qm’) Qm’])]
     • X’: logical head
     • W’: logical subject
     • Z1’…Zn’: logical arguments
     • Q1’…Qm’: logical modifiers
     • T(Φ’) is the logical type (Event, Path, …) of the primitive Φ’ (CAUSE, LET, GO, …)

  8. Root LCS (RLCS) • An RLCS is an un-instantiated LCS that is associated with a word definition in the lexicon (i.e., an LCS with unfilled variable positions) • LCSs are recursively defined.

  9. RLCS representation for go
     [Event GO_Loc ([Thing X], [Path TO_Loc ([Position AT_Loc ([Thing X], [Location Z])])])]
     → It is different from dependency structure

  10. Composed LCS (CLCS) • A CLCS is an instantiated LCS that is the result of combining two or more RLCSs by means of unification (roughly). • This is the interlingua form that serves as the pivot between the source and target languages.

  11. CLCS representation for “John went happily to school”
     [Event GO_Loc ([Thing JOHN], [Path TO_Loc ([Position AT_Loc ([Thing JOHN], [Location SCHOOL])])], [Manner HAPPILY])]
     The operations for combining RLCSs are not defined in this paper.
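
The RLCS and CLCS on slides 9 and 11 can be sketched as a small recursive data structure. This is only an illustration: the class names `LCS` and `Var` and the helpers `instantiate`, `add_modifier`, and `show` are hypothetical, and plain variable substitution stands in for the combining operation, which the paper leaves undefined.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union


@dataclass(frozen=True)
class Var:
    """An unfilled variable position in an RLCS (hypothetical name)."""
    name: str
    type: str


@dataclass(frozen=True)
class LCS:
    """A typed primitive with zero or more sub-structures (hypothetical name)."""
    type: str
    primitive: str
    args: Tuple[Union["LCS", Var], ...] = ()


def instantiate(node, bindings: Dict[str, LCS]):
    """Fill variables by substitution; an assumed stand-in for unification."""
    if isinstance(node, Var):
        return bindings.get(node.name, node)
    return LCS(node.type, node.primitive,
               tuple(instantiate(a, bindings) for a in node.args))


def add_modifier(node: LCS, modifier: LCS) -> LCS:
    """Attach a logical modifier after the existing arguments."""
    return LCS(node.type, node.primitive, node.args + (modifier,))


def show(node) -> str:
    """Render in the bracket notation used on the slides."""
    if isinstance(node, Var):
        return f"[{node.type} {node.name}]"
    if not node.args:
        return f"[{node.type} {node.primitive}]"
    inner = ", ".join(show(a) for a in node.args)
    return f"[{node.type} {node.primitive} ({inner})]"


# RLCS for "go": X and Z are unfilled variable positions.
RLCS_GO = LCS("Event", "GO_Loc", (
    Var("X", "Thing"),
    LCS("Path", "TO_Loc", (
        LCS("Position", "AT_Loc", (Var("X", "Thing"), Var("Z", "Location"))),
    )),
))

# CLCS for "John went happily to school".
clcs = add_modifier(
    instantiate(RLCS_GO, {"X": LCS("Thing", "JOHN"),
                          "Z": LCS("Location", "SCHOOL")}),
    LCS("Manner", "HAPPILY"),
)
print(show(clcs))
```

Note that X is filled twice by the same binding, which is what makes the structure differ from a dependency tree: JOHN appears both as the logical subject of GO_Loc and inside the AT_Loc position.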

  12. Syntactic phrase
     • X: syntactic head
     • W: external argument
     • Z-MAXi: internal arguments
     • Q-MAXi: syntactic adjuncts
     → Similar to X-bar theory, GB theory, etc.

  13. An example

  14. Mapping between LCS and syntactic form
     • Generalized linking routine (GLR):
       • X’ ⇔ X (logical head ⇔ syntactic head)
       • W’ ⇔ W (logical subject ⇔ external argument)
       • Z’ ⇔ Z (logical argument ⇔ internal argument)
       • Q’ ⇔ Q (logical modifier ⇔ syntactic adjunct)
     • Canonical syntactic realization (CSR):
       • Relate T(Φ’) to CAT(Φ) (logical type ⇔ syntactic category). Ex: THING ⇔ N, EVENT ⇔ V
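
The default GLR and CSR mappings can be written as two lookup tables. The sketch below is a minimal illustration, not the paper's implementation: the `link` function and the flat `(position, type, primitive)` triples are assumptions, and the CSR table holds only the two pairs given on the slide, so other logical types come back as `None`.

```python
# Generalized linking routine: default logical -> syntactic positions.
GLR = {"X'": "X", "W'": "W", "Z'": "Z", "Q'": "Q"}

# Canonical syntactic realization: only the pairs shown on the slide.
CSR = {"THING": "N", "EVENT": "V"}


def link(constituents):
    """Map (logical position, logical type, primitive) triples to
    (syntactic position, syntactic category, primitive) triples.

    Hypothetical helper; types missing from CSR yield None.
    """
    return [(GLR[pos], CSR.get(ltype), prim)
            for pos, ltype, prim in constituents]


# Flattened top level of the CLCS for "John went to school".
top_level = [
    ("X'", "EVENT", "GO_Loc"),
    ("W'", "THING", "JOHN"),
    ("Z'", "PATH", "TO_Loc"),
]
for row in link(top_level):
    print(row)
```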

  15. Divergence problem • Translation divergences occur when there is an exception either to the GLR or to the CSR (or to both) in one language, but not in the other.

  16. Outline • Formal definition of translation divergence • Seven types of divergence • Discussion • Remaining questions

  17. T1: Thematic divergence • The repositioning of arguments w.r.t. a head. • GLR: W’ ⇔ Z and Z’ ⇔ W • Example: I like Mary ⇔ María me gusta
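
A thematic divergence can be pictured as a per-word exception to the default GLR. In the sketch below, the `LEXICON` override table is a hypothetical stand-in for the lexical markers that the slides name (:INT and :EXT) but do not show, and `is_divergent` encodes the slide-15 condition that an exception exists in one language but not in the other.

```python
DEFAULT_GLR = {"X'": "X", "W'": "W", "Z'": "Z"}

# Hypothetical lexicon: each verb lists its exceptions to the default GLR.
LEXICON = {
    "like": {},                          # follows the default linking
    "gustar": {"W'": "Z", "Z'": "W"},    # subject and object swap
}


def link(verb, clcs):
    """Place each logical constituent in its syntactic position."""
    mapping = {**DEFAULT_GLR, **LEXICON[verb]}
    return {mapping[pos]: filler for pos, filler in clcs.items()}


def is_divergent(source_verb, target_verb):
    """True when exactly one of the two words carries an exception."""
    return bool(LEXICON[source_verb]) != bool(LEXICON[target_verb])


# Shared interlingua for "I like Mary" / "María me gusta".
clcs = {"X'": "LIKE", "W'": "I", "Z'": "MARY"}

english = link("like", clcs)    # external argument W is I
spanish = link("gustar", clcs)  # external argument W is MARY
print(english, spanish, is_divergent("like", "gustar"))
```

The same CLCS feeds both sides; only the lexical exception on the Spanish verb changes which constituent ends up as the external argument.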

  18. :INT and :EXT

  19. General Solution

  20. T2: Promotional Divergence • Promoting a logical modifier into a main verb position (or vice versa) • GLR: X’ ⇔ Z and Q’ ⇔ X • Ex: John usually goes home ⇔ Juan suele ir a casa

  21. :PROMOTE

  22. General Solution

  23. T3: Demotional Divergence • Demoting a logical head into an internal argument (adjunct?) position (or vice versa). • GLR: X’ ⇔ Q and Z’ ⇔ X • Ex: I like to eat ⇔ Ich esse gern

  24. :DEMOTE

  25. General Solution

  26. T4: Structural divergence • It does not alter the positions used in the GLR mapping • But it changes the nature of the relation between different positions (i.e., the “⇔” correspondence) • Ex: John entered the house ⇔ Juan entró en la casa

  27. * marker Marker forces logical constituents to be realized compositionally at different levels

  28. General solution

  29. T5: Conflational Divergence • The suppression of a CLCS constituent (or the inverse of the process) • GLR: the ⇔ correspondence of step (3) or (4) is changed.

  30. Example I stabbed John ⇔ Yo le di puñaladas a Juan

  31. :CONFLATED

  32. General solution

  33. T6: Categorical divergence • CAT(Φ) is different from CSR(T(Φ’)). • Ex: I am hungry ⇔ Ich habe Hunger

  34. :CAT

  35. General solution

  36. T7: Lexical divergence • As a side effect of other divergences. • Ex: John broke into the room ⇔ Juan forzó la entrada al cuarto

  37. Summary of seven types • Repositioning (GLR mappings): thematic, promotional, demotional divergences • Changing the ⇔ correspondence: structural, conflational divergences • Category: categorical divergence • ??: Lexical divergence

  38. Discussion

  39. Discussion • Limits on Repositioning Divergences • Promotional vs. Demotional Divergences • Lexical Selection: Full Coverage Constraint • Interacting Divergence Types

  40. Limits on Repositioning Divergences
     • Three types to cover all repositioning divergences:
       • Thematic: W’ ⇔ Z, Z’ ⇔ W
       • Promotional: X’ ⇔ Z, Q’ ⇔ X
       • Demotional: X’ ⇔ Q, Z’ ⇔ X
     • (X, W, Z, Q) ⇔ (X’, W’, Z’, Q’)
     • W has a special status: 4^4 = 256 → 3^3 = 27
     • A CLCS must contain exactly one head: 3^3 = 27 → 12

  41. Limits on Repositioning Divergences (cont)
     • Z can never be associated with Q’, and Q can never be associated with Z’: 12 → 5
     • Modifying relation cannot be reversed: 5 → 4 (excludes Q’ ⇔ X, X’ ⇔ Q, Z’ ⇔ Z)
     • Argument relation cannot be reversed: 4 → 3 (excludes Z’ ⇔ X, X’ ⇔ Z, Q’ ⇔ Q)
     • Canonical positions: 3 → 2
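
The counting argument on slides 40 and 41 can be checked by brute-force enumeration. The sketch below rests on one assumed reading of the slides: a candidate mapping assigns each logical position a syntactic position, and "exactly one head" means exactly one logical position lands on X. Under that reading the counts 256, 27, 12, 5, 4, 3, 2 all come out as stated.

```python
from itertools import product

# All assignments of (X', W', Z', Q') to syntactic positions (X, W, Z, Q).
all_maps = list(product("XWZQ", repeat=4))

# W has a special status, so set W'/W aside and keep three positions.
logical = ("X'", "Z'", "Q'")
maps = [dict(zip(logical, m)) for m in product("XZQ", repeat=3)]

# Exactly one logical position may be realized as the syntactic head X.
one_head = [m for m in maps if list(m.values()).count("X") == 1]

# Z never pairs with Q', and Q never pairs with Z'.
no_cross = [m for m in one_head if m["Q'"] != "Z" and m["Z'"] != "Q"]

# The modifying relation cannot be reversed.
reversed_mod = {"X'": "Q", "Z'": "Z", "Q'": "X"}
no_rev_mod = [m for m in no_cross if m != reversed_mod]

# The argument relation cannot be reversed.
reversed_arg = {"X'": "Z", "Z'": "X", "Q'": "Q"}
no_rev_arg = [m for m in no_rev_mod if m != reversed_arg]

# Drop the canonical (non-divergent) mapping.
canonical = {"X'": "X", "Z'": "Z", "Q'": "Q"}
divergent = [m for m in no_rev_arg if m != canonical]

counts = [len(x) for x in (all_maps, maps, one_head, no_cross,
                           no_rev_mod, no_rev_arg, divergent)]
print(counts)
print(divergent)
```

The two survivors are the promotional mapping (X’ ⇔ Z, Q’ ⇔ X) and the demotional mapping (X’ ⇔ Q, Z’ ⇔ X); thematic divergence is handled separately because it involves W.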

  42. Promotional vs. Demotional Divergences • Promotion is triggered by a main verb (e.g., soler in soler-usually) • Demotion is triggered by an adverb (e.g., gern in like-gern)

  43. Interacting Divergence Types • Promotional and thematic divergence: S: Leer libros le suele gustar a Juan ‘reading books (him) tends to please (to) John’ E: John usually likes reading books

  44. Remaining questions

  45. Remaining questions: Interlingua • How to build RLCS? • What are logical head, subject, arguments and modifiers? Ex: like ⇔ likingly • How to represent a verb: stab → CAUSE GO_Poss KNIFE-WOUND • How are RLCSs combined to form CLCSs? • Unification = substitution? • Are CLCSs really sufficient to handle all the languages?

  46. Remaining issues: divergences • Are the seven types really sufficient to cover all the divergences? • Is the “proof” for limits on repositioning divergences convincing? • “Translation divergences occur when there is an exception to GLR/CSR in one language, but not the other”: what if there are exceptions in both languages? • Can a dependent of X become a dependent of Y?

  47. Remaining issues: MT • How to build a real MT system with this approach?
