This paper discusses the challenges of processing multi-word units in machine translation systems and the approaches used to address them. It covers the definition of multi-word units within the Lexicon-Grammar framework, lemmatization criteria, and the corpus-linguistic approach. The study presents typical ambiguities that machine translation systems face, compares different approaches, including the integration of semantico-syntactic knowledge, and considers qualitative metrics for evaluating machine translation. By addressing these elements, the authors aim to improve multi-word unit processing in machine translation.
Taking on new challenges in multi-word unit processing for machine translation
Johanna MONTI, Anabela BARREIRO, Annibale ELIA, Federica MARANO, Antonella NAPOLI
Taking on new challenges in Multi-word Unit processing for machine translation – FreeBMT 2011 Outline
Multi-word units in the Lexicon-Grammar: definition
Multi-word units in the Lexicon-Grammar
Multi-word units in the Lexicon-Grammar: part of a continuum
Multi-word units in the Lexicon-Grammar: lemmatization
Multi-word units in the Lexicon-Grammar: lemmatization criteria
The corpus-linguistic approach
Multi-word units in Machine Translation
Multi-word units in Machine Translation: main problems
Multi-word units in Machine Translation: different approaches
Lexical ambiguities handled by different systems
• Corpus:
  • non-specialized texts
  • approx. 300 sentences (10,000 words)
  • multi-word units
  • extracted from the Web
  • Webcorp LSE, Web as a Corpus
• MT systems:
  • Google Translate
  • OpenLogos
Typical ambiguities: examples
Integration of Semantico-Syntactic knowledge
Integration of Semantico-Syntactic knowledge: mix up
SemTab rules comment lines for the verb mix up
Qualitative MT Evaluation metrics
Qualitative MT Evaluation metrics: the «ideal» evaluation tool
Conclusions
Thank you for your attention!
Johanna MONTI, Anabela BARREIRO, Annibale ELIA, Federica MARANO, Antonella NAPOLI