This presentation was given at the US-Korea Joint Workshop on Digital Libraries, held in San Diego on August 10-11, 2000. In it, Stephen Helmreich of New Mexico State University's Computing Research Laboratory surveys the laboratory's approaches to multilingual processing of natural language texts. Topics include machine translation, information extraction, information retrieval, summarization, and knowledge acquisition. The talk describes three kinds of machine translation system (interlingual, transfer-based, and dictionary-based) and how each can improve multilingual access to digital libraries.
Removing the Language Barrier: Machine Translation and Digital Libraries
US-Korea Joint Workshop on Digital Libraries • August 10-11, 2000 • San Diego, California
Stephen Helmreich • Computing Research Laboratory • New Mexico State University • Las Cruces, New Mexico, USA • Shelmrei@crl.nmsu.edu • (505) 646-2141
Computing Research Laboratory
• Background
• Research efforts – all approaches to multilingual processing of natural language texts
Applications
• Machine translation **
• Information extraction
• Information retrieval **
• Summarization **
• Knowledge acquisition **
• Authoring systems
• Translator workstations
Machine Translation -- I
• Interlingual – text meaning representation (TMR)
• Knowledge-based – uses ontology
• Lexicon – connects lexical items in context to ontological concepts
• Disambiguation – ontological constraints select appropriate TMR
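The lexicon and disambiguation bullets above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the concept names, the toy ontology, the lexicon entries, and the `disambiguate` helper are hypothetical and are not taken from the laboratory's actual system. The idea shown is only that a word maps to several candidate concepts, and a constraint attached to the event concept filters out the candidates that do not fit.

```python
# Toy ontology: child concept -> parent concept (hypothetical, for illustration only).
IS_A = {
    "FINANCIAL-INSTITUTION": "ORGANIZATION",
    "RIVER-BANK": "LANDFORM",
    "ORGANIZATION": "SOCIAL-OBJECT",
    "LANDFORM": "PHYSICAL-OBJECT",
    "SOCIAL-OBJECT": "OBJECT",
    "PHYSICAL-OBJECT": "OBJECT",
}

# Toy lexicon: word -> candidate ontological concepts.
LEXICON = {
    "bank": ["RIVER-BANK", "FINANCIAL-INSTITUTION"],
    "lend": ["LEND"],
}

# Hypothetical selectional constraint: event concept -> required type of its agent.
AGENT_CONSTRAINT = {"LEND": "SOCIAL-OBJECT"}


def is_a(concept, ancestor):
    """Walk up the ontology and report whether `concept` falls under `ancestor`."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = IS_A.get(concept)
    return False


def disambiguate(agent_word, verb_word):
    """Build a very small TMR-like record, keeping only agent senses the event allows."""
    event = LEXICON[verb_word][0]
    required = AGENT_CONSTRAINT[event]
    candidates = [c for c in LEXICON[agent_word] if is_a(c, required)]
    return {"event": event, "agent": candidates}


tmr = disambiguate("bank", "lend")
print(tmr)
```

In this toy case the river-bank sense is rejected because it does not fall under the type the lending event requires of its agent. A real text meaning representation would carry far more structure than this dictionary.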
Uses of MT-I
• As a full system, for high-quality translation
• The ontology itself would be useful for providing multilingual access to, say, metadata
• TMRs could be used to represent the content of documents
Machine Translation -- II
• Transfer-based MT – focus on syntax and morphology
• Rapid deployment – standard engines
• Universal representation format – typed feature structures
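For readers unfamiliar with typed feature structures, the following is a minimal sketch of their core operation, unification. It assumes a simplified encoding (nested dictionaries with a `TYPE` key) and a hypothetical three-level type hierarchy, and it omits structure sharing (reentrancy), which full feature-structure formalisms support. It is not the representation format used by the laboratory's engines.

```python
# Hypothetical type hierarchy: subtype -> supertype.
SUPERTYPE = {"noun": "word", "verb": "word", "word": "top"}


def subsumes(general, specific):
    """True if `general` equals `specific` or is one of its supertypes."""
    while specific is not None:
        if specific == general:
            return True
        specific = SUPERTYPE.get(specific)
    return False


def unify_types(a, b):
    """Return the more specific of two compatible types, or None if they clash."""
    if subsumes(a, b):
        return b
    if subsumes(b, a):
        return a
    return None


def unify(fs1, fs2):
    """Unify two feature structures; return None on failure."""
    # Atomic values unify only when they are identical.
    if not isinstance(fs1, dict) or not isinstance(fs2, dict):
        return fs1 if fs1 == fs2 else None
    unified_type = unify_types(fs1.get("TYPE", "top"), fs2.get("TYPE", "top"))
    if unified_type is None:
        return None
    result = {"TYPE": unified_type}
    for feature in (set(fs1) | set(fs2)) - {"TYPE"}:
        if feature in fs1 and feature in fs2:
            sub = unify(fs1[feature], fs2[feature])
            if sub is None:
                return None
            result[feature] = sub
        else:
            result[feature] = fs1.get(feature, fs2.get(feature))
    return result


partial = {"TYPE": "word", "AGR": {"NUM": "sg"}}
entry = {"TYPE": "noun", "LEMMA": "library", "AGR": {"PER": "3"}}
merged = unify(partial, entry)
clash = unify({"AGR": {"NUM": "sg"}}, {"AGR": {"NUM": "pl"}})
```

Here `merged` combines the agreement information from both structures under the more specific type `noun`, while `clash` is `None` because the two number values conflict.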
Uses of MT-II
• General purpose translation for assimilation
• Within a retrieval system
• As an on-line tool
Machine Translation -- III
• Dictionary/glossary-based
• Korean morphological analyzer – from Pohang, using dictionary with 100,000 entries
• Korean-English bilingual dictionary with 85,000 items
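The two components listed above (a morphological analyzer feeding a bilingual dictionary) can be sketched as a simple glossing pipeline. The sketch below is a toy under stated assumptions: the particle list, the four-entry dictionary, and the `analyze` and `gloss` functions are hypothetical stand-ins, not the Pohang analyzer or the 85,000-item dictionary, and real Korean morphology is far richer than suffix stripping.

```python
# Hypothetical particle list, longest first so "에서" is tried before shorter suffixes.
PARTICLES = ["에서", "을", "를", "은", "는", "이", "가"]

# Toy Korean-English dictionary standing in for the real bilingual dictionary.
BILINGUAL = {
    "도서관": "library",
    "책": "book",
    "기계": "machine",
    "번역": "translation",
}


def analyze(token):
    """Split a token into (stem, particle); strip only when the stem is a known entry."""
    for particle in PARTICLES:
        if token.endswith(particle) and token[: -len(particle)] in BILINGUAL:
            return token[: -len(particle)], particle
    return token, ""


def gloss(sentence):
    """Gloss each whitespace-separated token; unknown stems are kept in angle brackets."""
    glosses = []
    for token in sentence.split():
        stem, _particle = analyze(token)
        glosses.append(BILINGUAL.get(stem, f"<{stem}>"))
    return glosses


result = gloss("도서관에서 책을 찾다")
print(result)
```

The output is a word-by-word gloss rather than a fluent translation, which is the trade-off of this approach: it needs only a dictionary and an analyzer, so it can be put to use quickly.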
Uses of MT-III
• Available immediately for use
• Currently embedded in a document retrieval system
• Keizai System Demo: http://crl.nmsu.edu -- click on “Research”, then on “URSA”, then look for the Keizai Demo
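One common way to embed dictionary-based translation in a retrieval system is to translate the query terms and match them against documents in the other language. The sketch below illustrates that general idea only. It does not describe how the Keizai system works, and the dictionary, the documents, and the `translate_query` and `search` functions are all hypothetical. Documents are assumed to be already segmented into stems.

```python
# Hypothetical English-Korean term dictionary.
EN_TO_KO = {
    "library": ["도서관"],
    "book": ["책"],
    "translation": ["번역"],
}

# Hypothetical document collection, already segmented into stems.
DOCS = {
    "doc1": "도서관 책 목록",
    "doc2": "기계 번역 연구",
}


def translate_query(query):
    """Replace each English query word with its Korean equivalents; drop unknown words."""
    terms = []
    for word in query.lower().split():
        terms.extend(EN_TO_KO.get(word, []))
    return terms


def search(query):
    """Rank documents by how many translated query terms they contain."""
    terms = translate_query(query)
    scores = {}
    for doc_id, text in DOCS.items():
        tokens = text.split()
        score = sum(tokens.count(term) for term in terms)
        if score:
            scores[doc_id] = score
    # Highest score first; ties are broken by document id.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


hits = search("library book")
print(hits)
```

A retrieved document could then be glossed back into the user's language with the same dictionary, so a single lexical resource serves both the search and the reading step.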