A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation. Manuel Medina González and Hirosato Nomura, Kyushu Institute of Technology. Outline. Introduction Spanish Features and Considerations when translating. Parts of speech Voices System Summary
A Cross-Lingual Grammar Model and its Application to Japanese-Spanish Machine Translation Manuel Medina González and Hirosato Nomura Kyushu Institute of Technology
Outline • Introduction • Spanish Features and Considerations when translating. • Parts of speech • Voices • System • Summary • Conclusions
Introduction • Current Machine Translation Systems output incorrect sentences when translating from Japanese to Spanish. http://www.worldlingo.com/en/products_services/worldlingo_translator.html
Introduction • The reason is that English is used as an intermediate language, which leads to a loss of grammatical information due to the differences between the languages. http://www.worldlingo.com/en/products_services/worldlingo_translator.html
Introduction • The idea is simple: translate directly from Japanese to Spanish. 子供は公園で遊ぶ → El niño juega en el parque • To accomplish this, the way the analysis is performed must be adapted to support Spanish features.
Introduction • We base our model on ALT J/E Machine Translation System Model, with some modifications.
Model [Diagram labels: Source Language, Target Language, Direct Method, Corpus-based Translation, Transfer Method, Analysis, Conversion, Generation, Intermediate Language (PIVOT). Spanish-specific items noted on the diagram: noun features, determiner, subjunctive mood.]
Model Determining the missing information by predicting the result as we analyze the sentence.
Outline • Introduction • Spanish Features and Considerations when translating. • Parts of speech • Voices • System • Summary • Conclusions
Spanish Features: Nouns • Gender: Table: Feminine. Book: Masculine. • Number: Table: Singular. Tables: Plural. • A noun's features determine almost all the changes a Spanish sentence can undergo.
Spanish Features: Nouns あのテーブルは汚い。拭いておきましょう。 (That table is dirty. Let's wipe it.) • テーブル → feminine • あの → feminine form • 汚い → feminine form • Zero pronoun → feminine form
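The agreement shown on this slide can be sketched in Python. This is only a minimal illustration under stated assumptions: the lexicon entries, the Spanish word choices (mesa, libro, sucio, aquel) and the function names are hypothetical and are not taken from JEMS.

```python
# Hypothetical mini-lexicon: Japanese noun -> (Spanish noun, gender, number).
NOUNS = {
    "テーブル": ("mesa", "f", "sg"),
    "本": ("libro", "m", "sg"),
}

# Spanish forms of a demonstrative and an adjective, indexed by (gender, number).
FORMS = {
    "あの": {("m", "sg"): "aquel", ("f", "sg"): "aquella",
            ("m", "pl"): "aquellos", ("f", "pl"): "aquellas"},
    "汚い": {("m", "sg"): "sucio", ("f", "sg"): "sucia",
            ("m", "pl"): "sucios", ("f", "pl"): "sucias"},
}

# Direct-object pronoun that can stand in for a Japanese zero pronoun.
OBJECT_PRONOUNS = {("m", "sg"): "lo", ("f", "sg"): "la",
                   ("m", "pl"): "los", ("f", "pl"): "las"}


def agree(word, noun):
    """Return the Spanish form of `word` that agrees with `noun`."""
    _, gender, number = NOUNS[noun]
    return FORMS[word][(gender, number)]


def pronoun_for(noun):
    """Return the object pronoun matching the noun a zero pronoun refers to."""
    _, gender, number = NOUNS[noun]
    return OBJECT_PRONOUNS[(gender, number)]


print(agree("あの", "テーブル"), NOUNS["テーブル"][0], agree("汚い", "テーブル"))
print(pronoun_for("テーブル"))
```

The point of the sketch is that one lookup of the noun's gender and number drives the demonstrative, the adjective, and the pronoun that replaces the zero pronoun in the second sentence.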
Semantic Categorization • ALT J/E Semantic Categorization (2710 different, non-exclusive categories).
Spanish Features: Adjectives • As in English, only one category exists. • Two verbs are mainly used: “Ser” and “Estar”. Both correspond to the English verb “to be”. • The meaning differs depending on the verb used. 私は幸せだ → Yo soy feliz / Yo estoy feliz
Spanish Features: Adjectives • Creation of categories • Temporary state: Sad • Permanent feature: Boring, interesting • Weather • The weather category is necessary because two other verbs are used: “Tener” and “Hacer”. 暑い → Tengo calor (what you feel) / Hace calor (the weather)
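A minimal Python sketch of how such adjective categories could select the Spanish verb. The category names follow the slide; the function name and the `about_weather` flag are hypothetical, and the mapping is a simplification rather than the rule set actually used in the system.

```python
def choose_verb(category, about_weather=False):
    """Pick the Spanish verb that accompanies an adjective of the given category.

    `about_weather` only matters for the weather category: True when the
    sentence describes the weather itself, False when it describes what
    the speaker feels.
    """
    if category == "temporary_state":
        return "estar"
    if category == "permanent_feature":
        return "ser"
    if category == "weather":
        return "hacer" if about_weather else "tener"
    raise ValueError(f"unknown adjective category: {category}")


# 暑い as a feeling -> "Tengo calor"; as a statement about the weather -> "Hace calor".
print(choose_verb("weather"))
print(choose_verb("weather", about_weather=True))
```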
Spanish Features: Adverbs • Treating adverbs as they are classified in Spanish, we create the same categories as in that language: • Place • Time • Mode • Quantity • Order • Affirmation • Denial • Doubt • Addition • Exclusion
Spanish Features: Verbs • Tenses • Japanese: 3 (Present, past, future) • Spanish: 16 • Conjugations • Different conjugation for each person in each tense. • A conjugator system can be made for regular verbs, but there are too many rules to consider.
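As an illustration of a conjugator for regular verbs, here is a minimal Python sketch restricted to the present indicative. It covers one tense only and no irregular verbs, which is exactly where the many rules mentioned on the slide come in; the function and table names are hypothetical.

```python
# Present indicative endings for regular Spanish verbs, ordered as in PERSONS.
ENDINGS = {
    "ar": ["o", "as", "a", "amos", "áis", "an"],
    "er": ["o", "es", "e", "emos", "éis", "en"],
    "ir": ["o", "es", "e", "imos", "ís", "en"],
}
PERSONS = ["yo", "tú", "él", "nosotros", "vosotros", "ellos"]


def conjugate_present(infinitive, person):
    """Conjugate a regular verb in the present indicative."""
    stem, suffix = infinitive[:-2], infinitive[-2:]
    if suffix not in ENDINGS:
        raise ValueError(f"not a Spanish infinitive: {infinitive}")
    return stem + ENDINGS[suffix][PERSONS.index(person)]


print(conjugate_present("lavar", "yo"))
print(conjugate_present("comer", "nosotros"))
```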
Reflexive Verbs • A verb is reflexive if the action returns to its performer. • Some Japanese verbs, such as 洗う, can be either reflexive or non-reflexive depending on the object. • Non-reflexive: 私は車を洗う (I wash the car) • Reflexive: 私は顔を洗う (I wash my face)
Reflexive Verbs • Creation of “Has-a” relationships to determine whether a verb must be treated as reflexive. • Example: 人間 (human) has 目 (eye), 顔 (face), 鼻 (nose).
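A minimal Python sketch of the “Has-a” check, assuming the subject is first mapped to a semantic category. The `HAS_A` entries come from the slide; `CATEGORY_OF` and the function name are hypothetical.

```python
# "Has-a" relationships: category -> parts that belong to it (from the slide).
HAS_A = {"人間": {"目", "顔", "鼻"}}

# Hypothetical mapping from a subject word to its semantic category.
CATEGORY_OF = {"私": "人間"}


def is_reflexive(subject, obj):
    """Treat the verb as reflexive when the object is a part of the subject."""
    category = CATEGORY_OF.get(subject)
    return obj in HAS_A.get(category, set())


print(is_reflexive("私", "顔"))  # 私は顔を洗う: the face belongs to the subject
print(is_reflexive("私", "車"))  # 私は車を洗う: the car does not
```

With this check, 私は顔を洗う would take the reflexive form (Yo me lavo la cara), while 私は車を洗う would not (Yo lavo el coche).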
Translation Rules • Based on ALT J/E Translation Rules. • Verb • Particles used in special cases • Categories of the expected nouns accompanying the particles • Translation of the verb in each case • Indication of whether the verb must be treated as reflexive
「乗る」 • <動物>に乗る = Montar en <動物> • <交通機関 | 乗り物>に乗る = Subir(R) a <交通機関 | 乗り物> • ... (動物: animal; 交通機関: means of transportation; 乗り物: vehicle)
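The rule entries for 「乗る」 could be represented and looked up roughly as follows. This Python sketch assumes that “(R)” marks a reflexive verb; the data layout, the noun-to-category table, and the function name are hypothetical and do not reproduce the ALT J/E or JEMS rule format.

```python
# Rules for one verb: particle, expected noun categories, Spanish verb,
# preposition, and whether the Spanish verb is treated as reflexive.
RULES = {
    "乗る": [
        {"particle": "に", "categories": {"動物"},
         "verb": "montar", "preposition": "en", "reflexive": False},
        {"particle": "に", "categories": {"交通機関", "乗り物"},
         "verb": "subir", "preposition": "a", "reflexive": True},
    ],
}

# Hypothetical semantic categories for two nouns (categories are non-exclusive).
NOUN_CATEGORIES = {"馬": {"動物"}, "バス": {"交通機関", "乗り物"}}


def select_rule(verb, particle, noun):
    """Return the first rule whose particle and noun categories match, else None."""
    categories = NOUN_CATEGORIES.get(noun, set())
    for rule in RULES.get(verb, []):
        if rule["particle"] == particle and rule["categories"] & categories:
            return rule
    return None


print(select_rule("乗る", "に", "馬")["verb"])
print(select_rule("乗る", "に", "バス")["verb"])
```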
Outline • Introduction • Spanish Features and Considerations when translating. • Parts of speech • Voices • System • Summary • Conclusions
Voices • Passive: れる、られる • Normal passive • Indirect object reference • Passive Reflexive • Causative: せる、させる • Coercive • Permissive
Voices: Model [Diagram labels: Japanese sentence; Identify voice (passive, causative); Predict result (voice, structure); Add or change elements (pronouns, mood...); Analyze elements.] • Not explained in depth here due to time limitations.
Outline • Introduction • Spanish Features and Considerations when translating. • Parts of speech • Voices • System • Summary • Conclusions
System • Named “JEMS”: Japanese Español Machine translation System. [Diagram labels: JUMAN, KNP, JEMS Core, Semantic Categories, Translation Rules, Dictionary, Translated Sentence.]
Tests and Results • JEMS compared against Worldlingo. • Sentences taken from books such as “Momotaro”, “Megane usagi”, “3 nen netaro”, etc. • The sentences were translated by a human, then input into both systems, and the output was checked.
Tests and Results • Input: いよいよ春がやってきた • Possible expected outputs: Finalmente la primavera llegó. / Finalmente la primavera ha llegado. • Obtained output: Resorte usted cada vez más • 春 → Spring; Spring = 1. Primavera 2. Resorte • Analysis is not complete • Errors: • Lack of verb • Incorrect subject • Incorrect structure
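The word-sense problem in this example can be illustrated with a toy Python sketch. The three dictionaries are hypothetical one-entry stand-ins; the sketch only shows why passing through English leaves two Spanish candidates for 春, and makes no claim about how any real system is implemented.

```python
# Toy dictionaries with a single entry each (hypothetical).
JA_TO_EN = {"春": "spring"}
EN_TO_ES = {"spring": ["primavera", "resorte"]}  # season vs. mechanical spring
JA_TO_ES = {"春": "primavera"}


def via_english(word):
    """Translate through English: the English word may have several senses."""
    return EN_TO_ES[JA_TO_EN[word]]


def direct(word):
    """Translate directly from Japanese to Spanish."""
    return JA_TO_ES[word]


print(via_english("春"))
print(direct("春"))
```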
Outline • Introduction • Spanish Features and Considerations when translating. • Parts of speech • Voices • System • Summary • Conclusions
Summary • Indirect translation from Japanese to Spanish is not enough. • The model is based on considering the translated sentence from the start of the analysis. • Presented only a small part of the analysis necessary to translate into Spanish. • Developed a prototype system, “JEMS”, to test the model and compared it against an existing translation system.
Outline • Introduction • Spanish Features and Considerations when translating. • Parts of speech • Voices • System • Summary • Conclusions
Conclusions • Japanese-Spanish Machine Translation is just beginning. There are still many issues to be solved. • The model needs to be extended in order to analyze longer sentences. • Once this model is finished, it can become the basis for other research on Machine Translation between Japanese and Romance languages.
Conclusions 私は太郎です • Spanish: Me llamo Taro • Italian: Mi chiamo Taro • Portuguese: Me chamo Taro • French: Je m'appelle Taro
Voices: Passive • Normal Passive • Very much like the English passive voice. • Active: Subject + Verb + Object (子供はボールを蹴った) • Passive: Subject + verb “Ser” + past participle of the verb + preposition “por” + Agent (ボールは子供に蹴られた)
Voices: Passive ボールは子供に蹴られた • Identify the agent (子供に). • Identify the subject and its features (ボールは → feminine). • Use translation rules to get the appropriate verb translation. • Change the past participle according to the subject's features; the past participle must match these features.
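The steps for the normal passive can be sketched in Python as follows. This is a minimal illustration for regular verbs in the preterite only; the translations used in the example call (pelota, patear, niño) and all function names are assumptions, not output of JEMS.

```python
# Preterite forms of "ser" for third person, by number of the subject.
SER_PRETERITE = {"sg": "fue", "pl": "fueron"}


def past_participle(infinitive, gender, number):
    """Regular past participle agreeing with the subject's gender and number."""
    stem, suffix = infinitive[:-2], infinitive[-2:]
    base = stem + ("ad" if suffix == "ar" else "id")  # -ar -> -ado, -er/-ir -> -ido
    ending = {"m": "o", "f": "a"}[gender] + ("s" if number == "pl" else "")
    return base + ending


def passive(subject, gender, number, infinitive, agent):
    """Subject + ser + past participle + 'por' + agent."""
    participle = past_participle(infinitive, gender, number)
    return f"{subject} {SER_PRETERITE[number]} {participle} por {agent}"


# ボールは子供に蹴られた, assuming ボール -> "la pelota" (feminine) and 蹴る -> "patear".
print(passive("la pelota", "f", "sg", "patear", "el niño"))
```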
Voices: Passive • Indirect object reference: 私は財布を盗まれた • The subject of the Spanish sentence is neither “I” nor “Wallet”. • “I” is the indirect object in the translated Spanish sentence. • “Weird Japanese” rewrite: 私は財布を盗まれた → (ZERO)は私に財布を盗んだ。
Voices: Passive 私は財布を盗まれた • No agent in the sentence. Rewrite it to the “Weird Japanese” form (ZERO)は... • Get the correct verb translation from the translation rules. • Use the conjugation for “they”. • The past participle is not used.
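A minimal Python sketch of this case: the Japanese subject becomes an indirect object pronoun and the verb takes the “they” conjugation. The pronoun table, the choice of the preterite tense, and the translations in the example call (robar, cartera) are assumptions made for illustration.

```python
# Hypothetical mapping from the Japanese subject to a Spanish indirect object pronoun.
INDIRECT_OBJECT_PRONOUN = {"私": "me", "あなた": "te", "私たち": "nos"}


def preterite_they(infinitive):
    """Third person plural preterite of a regular verb."""
    stem, suffix = infinitive[:-2], infinitive[-2:]
    return stem + ("aron" if suffix == "ar" else "ieron")


def indirect_passive(japanese_subject, infinitive, direct_object):
    """Indirect object pronoun + verb conjugated for 'they' + direct object."""
    pronoun = INDIRECT_OBJECT_PRONOUN[japanese_subject]
    return f"{pronoun} {preterite_they(infinitive)} {direct_object}"


# 私は財布を盗まれた, assuming 盗む -> "robar" and 財布 -> "la cartera".
print(indirect_passive("私", "robar", "la cartera"))
```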
Voices: Passive • Passive Reflexive. Identified only in some patterns: • ~では・には・・・される • Sentences that, when translated into Spanish, have no subject, where the agent is present in the sentence and the verb is 「考える」、「思う」、「言う」 • Example: 日本では日本語が話される
Voices: Passive 日本では日本語が話される • Use the reflexive pronoun “se”. • Use the third person singular conjugation.
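The two steps above can be sketched in Python for regular verbs in the present tense. The function names and the translations in the example call (Japón, hablar, japonés) are assumptions for illustration.

```python
def third_singular_present(infinitive):
    """Third person singular present of a regular verb."""
    stem, suffix = infinitive[:-2], infinitive[-2:]
    return stem + ("a" if suffix == "ar" else "e")


def passive_reflexive(place, infinitive, noun):
    """'en' + place + 'se' + verb in third person singular + noun."""
    return f"en {place} se {third_singular_present(infinitive)} {noun}"


# 日本では日本語が話される, assuming 話す -> "hablar".
print(passive_reflexive("Japón", "hablar", "japonés"))
```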
Voices: Causative • 2 cases: Coercive Sentences and Permissive Sentences. • Sentences are translated differently depending on whether the verb is intransitive. • Possible use of the subjunctive mood in the translated sentence.