Searching Indexed Bilingual Knowledge Banks by Kee Tuan Hwa
Outline
• Introduction
• Proposed Method
• Summary

Introduction
• Types of machine translation
  - Rule-based machine translation
  - Knowledge-based machine translation
  - Statistical machine translation
  - Example-based machine translation (EBMT)

EBMT Architecture
• Source
• Find the best match in the bilingual corpus

EBMT Architecture (cont.)
• Recombination

Examples of EBMT
• Gaijin System
  - Uses a bilingual lexicon and transfer rules
• MSR-MT
  - Uses MindNet, logical form

Weaknesses
• “a finite means for generating the potential infinity of linguistic forms a speaker-hearer can produce or recognize” (Chomsky, 1928)

Weaknesses
• The BKB is large, so searching it takes time.
• The graph complexity is exponential: O(e^n).
[Figure: a BKB of 30k S-SSTCs]

Weaknesses
• Poor classification of S-SSTCs in the Bilingual Knowledge Bank (BKB)
[Figure: S-SSTCs in the BKB]

Goal and Objective
• Classification of STREE and SNODE correspondences in the BKB for effective retrieval and translation

Indexed BKB
[Figure: an Indexed BKB made up of Indexed S-SSTCs]

Pivot BKB
• Pivot BKB
  - cluster the S-SSTCs
  - categorize the SNODEs and STREEs by their POS patterns
  - use a modified form of inverted-file indexing
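
To make the inverted-file idea concrete, here is a minimal sketch of a plain inverted index keyed by POS category. It is not the author's modified scheme, whose details the slides do not give; the function name `build_index` and the tuple layout are assumptions made for illustration. The sample entries are SNODEs taken from the example slides in this deck.

```python
from collections import defaultdict


def build_index(entries):
    """Group (pos, word, location) entries into posting lists keyed by POS.

    The tuple layout is an assumption for this sketch, not the BKB format.
    """
    index = defaultdict(list)
    for pos, word, location in entries:
        index[pos].append((word, location))
    return index


# SNODE entries copied from the example slides (file a.xml).
entries = [
    ("Det", "the", "a.xml;1.5.3"),
    ("Adj", "big", "a.xml;1.5.4"),
    ("Det", "the", "a.xml;2.5.1"),
]
index = build_index(entries)

# One dictionary lookup returns every posting for a category,
# instead of a scan over every S-SSTC in the BKB.
print(index["Det"])
```

The point of the structure is that retrieval cost depends on the size of one posting list rather than on the size of the whole BKB.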

Pivot BKB (cont.)
Name of Indexed BKB = a.xml

Source: I[P] see[V] the[Det] big[Adj] bird[N]
Target: saya[P] lihat[V] burung[N] besar[Adj] itu[Det]

SNODE CORRESPONDENCE
  1.5.1   I [P]       saya [P]
  1.5.2   see [V]     lihat [V]
  1.5.3   the [Det]   itu [Det]
  1.5.4   big [Adj]   besar [Adj]
  1.5.5   bird [N]    burung [N]

STREE CORRESPONDENCE
  1.6.1   [P] [V] [Det] [Adj] [N]      [P] [V] [N] [Adj] [Det]
          I see the big bird           saya lihat burung besar itu
  1.6.2   [V] [Det] [Adj] [N]          [V] [N] [Adj] [Det]
          see the big bird             lihat burung besar itu
  1.6.3   [Det] [Adj] [N]              [N] [Adj] [Det]
          the big bird                 burung besar itu
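
The `word[TAG]` notation on this slide can be read mechanically. The sketch below, which assumes that every token is written as a word followed by a bracketed tag, splits a tagged sentence into (word, tag) pairs and derives the POS pattern string used as an STREE key. The helper names `parse_tagged` and `pos_pattern` are hypothetical.

```python
import re

# A word (no whitespace or brackets), optional spaces, then a bracketed tag.
TOKEN = re.compile(r"([^\s\[\]]+)\s*\[(\w+)\]")


def parse_tagged(sentence):
    """Return the (word, tag) pairs found in a 'word[TAG]' sentence."""
    return TOKEN.findall(sentence)


def pos_pattern(tokens):
    """Join the tags into a pattern such as '[Det] [Adj] [N]'."""
    return " ".join(f"[{tag}]" for _, tag in tokens)


tokens = parse_tagged("I[P] see[V] the[Det] big[Adj] bird [N]")
print(tokens)
print(pos_pattern(tokens))       # whole sentence, as in 1.6.1
print(pos_pattern(tokens[2:]))   # "the big bird", as in 1.6.3
```

The optional whitespace in the pattern tolerates spacing such as `bird [N]`, which appears on the slide.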

Pivot BKB (cont.)
[Figure labels: Name of Indexed BKB, Sense of SNODE, Index of SNODE, Index of STREE]

Index of SNODE
  <P>
    - I ,<a.xml;1.5.1|2>
  <V>
    - see ,<a.xml;1.5.2|2>
  <Det>
    - the ,<a.xml;1.5.3|1>
  <Adj>
    - big ,<a.xml;1.5.4|4>
  <N>
    - bird ,<a.xml;1.5.5|3>

Index of STREE
  <[P] [V] [Det] [Adj] [N]>
    - I see the big bird ,<a.xml;1.6.1>
  <[V] [Det] [Adj] [N]>
    - see the big bird ,<a.xml;1.6.2>
  <[Det] [Adj] [N]>
    - the big bird ,<a.xml;1.6.3>
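
The index entries on this slide follow a regular shape: `<name;id>` for an STREE and `<name;id|number>` for an SNODE. As a rough sketch, they could be parsed as below. The field names `bkb_name`, `corr_id` and `extra` are assumptions; in particular, the slide does not spell out what the number after `|` means, so the code keeps it as an opaque integer.

```python
import re
from typing import NamedTuple, Optional


class Posting(NamedTuple):
    bkb_name: str          # e.g. "a.xml"
    corr_id: str           # e.g. "1.5.1"
    extra: Optional[int]   # number after "|", None when absent (meaning assumed unknown)


POSTING = re.compile(r"<([^;>]+);([\d.]+)(?:\|(\d+))?>")


def parse_posting(text):
    """Parse one '<name;id|number>' or '<name;id>' entry into a Posting."""
    match = POSTING.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a posting entry: {text!r}")
    name, corr_id, extra = match.groups()
    return Posting(name, corr_id, int(extra) if extra is not None else None)


print(parse_posting("<a.xml;1.5.1|2>"))
print(parse_posting("<a.xml;1.6.1>"))
```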

Pivot BKB (cont.)
Name of Indexed BKB = a.xml

Source: the[Det] old[Adj] man[N] walked[V]
Target: orang[N] tua[Adj] ini[Det] berjalan[V]

SNODE CORRESPONDENCE
  2.5.1   the [Det]    ini [Det]
  2.5.2   old [Adj]    tua [Adj]
  2.5.3   man [N]      orang [N]
  2.5.4   walked [V]   berjalan [V]

STREE CORRESPONDENCE
  2.6.1   [Det] [Adj] [N] [V]      [N] [Adj] [Det] [V]
          the old man walked       orang tua ini berjalan
  2.6.2   [Det] [Adj] [N]          [N] [Adj] [Det]
          the old man              orang tua ini

Pivot BKB (cont.)

Index of SNODE
  <P>
    - I ,<a.xml;1.5.1|2>
  <V>
    - see ,<a.xml;1.5.2|2>
    - walked ,<a.xml;2.5.4|1>
  <Det>
    - the ,<a.xml;1.5.3|1>
    - the ,<a.xml;2.5.1|2>
  <Adj>
    - big ,<a.xml;1.5.4|4>
    - old ,<a.xml;2.5.2|1>
  <N>
    - bird ,<a.xml;1.5.5|3>
    - man ,<a.xml;2.5.3|2>

Index of STREE
  <[P] [V] [Det] [Adj] [N]>
    - I see the big bird ,<a.xml;1.6.1>
  <[V] [Det] [Adj] [N]>
    - see the big bird ,<a.xml;1.6.2>
  <[Det] [Adj] [N]>
    - the big bird ,<a.xml;1.6.3>
    - the old man ,<a.xml;2.6.2>
  <[Det] [Adj] [N] [V]>
    - the old man walked ,<a.xml;2.6.1>
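
The slides do not describe the search procedure itself, so the following is only one plausible way to query the merged STREE index: find the longest contiguous run of POS tags in the input whose pattern is an index key, and return its postings. The function `longest_match` and the longest-first strategy are assumptions for illustration; the index contents are copied from the slide.

```python
# STREE index from the slide: POS pattern -> list of (example, location).
STREE_INDEX = {
    "[P] [V] [Det] [Adj] [N]": [("I see the big bird", "a.xml;1.6.1")],
    "[V] [Det] [Adj] [N]": [("see the big bird", "a.xml;1.6.2")],
    "[Det] [Adj] [N]": [
        ("the big bird", "a.xml;1.6.3"),
        ("the old man", "a.xml;2.6.2"),
    ],
    "[Det] [Adj] [N] [V]": [("the old man walked", "a.xml;2.6.1")],
}


def longest_match(tags, index):
    """Return (start, length, postings) for the longest contiguous tag run
    whose pattern is a key of `index`, or None if no run matches.

    Longest-first search is an assumption of this sketch.
    """
    n = len(tags)
    for length in range(n, 0, -1):              # try longer spans first
        for start in range(n - length + 1):     # leftmost span wins ties
            key = " ".join(f"[{t}]" for t in tags[start:start + length])
            if key in index:
                return start, length, index[key]
    return None


# Tags for a hypothetical input such as "a small dog".
print(longest_match(["Det", "Adj", "N"], STREE_INDEX))
```

Each candidate span costs one dictionary lookup, so the work depends on the length of the input sentence rather than on the number of S-SSTCs stored in the BKB.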

Summary
• Search speed falls as the BKB grows: size(n) ∝ 1/speed(n)
• Classification and indexing enable effective retrieval and translation