This paper presents novel approaches in machine translation for lesser-resourced languages, particularly highlighting the importance of linguistic structure and bilingual informants. We explore methods for unsupervised induction of paradigm morphology, allowing for the identification of morphological structures essential for effective translation. Additionally, we report on the efficacy of syntactic rule induction and refinement tools, showcasing successful applications from English to Spanish and Mapudungun to Spanish. Our findings reinforce the role of linguistic data in improving translation systems for underrepresented languages.
Linguistic Structure and Bilingual Informants Help Induce Machine Translation of Lesser-Resourced Languages

Christian Monson, Ariadna Font Llitjós, Vamshi Ambati, Lori Levin, Alon Lavie, Alison Alvarez, Roberto Aranovich, Jaime Carbonell, Robert Frederking, Erik Peterson, Katharina Probst

[Figure: aligned parse trees (S, NP, VP, V, N, Det nodes) for English "John ate an apple" and Hindi "John ne ek seb khaya", with the aligned sub-phrases "ate an apple" and "ek seb khaya"]
[Figure: Spanish parse tree fragment (S, NP, VP, PolP, Det, N, V nodes) over the words "los niños", "un", "jugaron", "juego"]

ParaMor: Unsupervised Induction of Paradigm Morphology

ParaMor Identifies Paradigms
Paradigms: The Structure of Inflectional Morphology

Each entry gives a set of candidate suffixes, the number of stems that occur with all of them, and sample stems:

e.er.erá.ido.ieron.ió 28: deb, escog, ofrec, reconoc, vend, ...
e.ido.ieron.ir.irá.ió 28: asist, dirig, exig, ocurr, sufr, ...
azar.e.ido.ieron.ir.ió 1: sal
e.er.erá.ieron.ió 32: deb, padec, romp, ...
e.erá.ido.ieron.ió 28: deb, escog, ...
e.er.ido.ieron.ió 46: deb, parec, recog, ...
e.ido.ieron.irá.ió 28: asist, dirig, ...
e.ido.ieron.ir.ió 39: asist, bat, sal, ...
e.ido.ieron.ió 86: asist, deb, hund, ...
e.erá.ieron.ió 32: deb, padec, ...
er.ido.ieron.ió 58: ascend, ejerc, recog, ...
ido.ieron.ir.ió 44: interrump, sal, ...

Results

Syntactic Rule Induction for Machine Translation

Elicitation Tool
Rule Induction Statistics
Automatic Syntax Induction

Syntactic Rule Refinement

Translation Correction Tool
Automatic Syntax Refinement
Refinement Results
Automatic metrics evaluate an English to Spanish MT system.
Rule Refinement has also been successfully applied to Mapudungun to Spanish MT.
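
The paradigm entries in the ParaMor section, such as "e.ido.ieron.ió 86: asist, deb, hund", pair a set of candidate suffixes with the stems that form a word with every one of them. The Python sketch below shows only that bookkeeping on a toy word list. It is not ParaMor's actual search procedure, which the poster does not spell out. The function names `candidate_splits` and `adherent_stems` and the ten-word vocabulary are assumptions made for illustration.

```python
from collections import defaultdict


def candidate_splits(words):
    """Map every candidate suffix to the set of candidate stems it follows.

    Assumption: every split point that leaves a non-empty stem and a
    non-empty suffix is a candidate; nothing is filtered or scored here.
    """
    stems_by_suffix = defaultdict(set)
    for word in words:
        for i in range(1, len(word)):
            stems_by_suffix[word[i:]].add(word[:i])
    return stems_by_suffix


def adherent_stems(stems_by_suffix, suffixes):
    """Return the stems that occur with *all* of the given suffixes."""
    stem_sets = [stems_by_suffix.get(s, set()) for s in suffixes]
    return set.intersection(*stem_sets) if stem_sets else set()


# Hypothetical toy vocabulary, not the corpus used on the poster.
words = [
    "debe", "debido", "debieron", "debió",
    "asiste", "asistido", "asistieron", "asistió",
    "salir", "salido",
]

table = candidate_splits(words)
suffixes = ["e", "ido", "ieron", "ió"]
stems = adherent_stems(table, suffixes)

# Print in the same "suffixes count: stems" layout as the poster entries.
print(f"{'.'.join(suffixes)} {len(stems)}: {', '.join(sorted(stems))}")
```

On this toy vocabulary, "sal" is excluded from the e.ido.ieron.ió set because only "salir" and "salido" are present, which mirrors how the poster's entries shrink or grow in stem count as suffixes are added or removed.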