
An Overview of Machine Translation

A presentation by Mahsa Mohaghegh.


Presentation Transcript


  1. An Overview of Machine Translation • A presentation by Mahsa Mohaghegh

  2. Outline • Introduction • A brief introduction to Translation technology • Interest in MT • Problems Involved in Machine Translation • Translation Technology • Knowledge-based systems • Statistical machine translation systems • Rule-Based vs. Statistical MT • Current State of Machine Translation in Use • Personal Speech-to-Speech Translators Machine Translation

  3. Introduction • These factors have increased both the demand for translation services and interest in computerized translation technology. • Some industry observers say machine translation, a largely experimental technology that has been around since the late 1950s, is now ready to become commercially viable.

  4. Definition • NLP: The sub-domain of artificial intelligence concerned with the task of developing programs possessing some capability of ‘understanding’ a natural language in order to achieve some specific goal. • Understanding: A transformation from one representation (the input text) to another (an internal representation).

  5. Introduction • Machine Translation: The use of computers to translate from one language to another. • One of the oldest dreams of NLP, AI, and CS (first system in 1954).

  6. Why Machine Translation? • Cheap, universal access to the world’s online information, regardless of original language. • (That’s the goal.)

  7. Interest in MT

  8. Problems Involved in Machine Translation • Ambiguities in word meaning and sentence structure are the main problems faced by MT systems. A classic example is illustrated in the following pair of sentences: • Time flies like an arrow. • Fruit flies like an apple.

  9. How can a machine understand these differences? • Get the cat with the gloves.
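To make the ambiguity in the "Time flies" pair concrete, here is a minimal Python sketch that lists every part-of-speech sequence a toy lexicon permits for each sentence. The lexicon, the tag names, and the function `tag_sequences` are assumptions made for illustration; they are not part of the presentation.

```python
from itertools import product

# Toy lexicon (an assumption for illustration): each word maps to the
# parts of speech it can take.
LEXICON = {
    "time": ["NOUN", "VERB"],
    "fruit": ["NOUN"],
    "flies": ["NOUN", "VERB"],
    "like": ["VERB", "PREP"],
    "an": ["DET"],
    "arrow": ["NOUN"],
    "apple": ["NOUN"],
}


def tag_sequences(sentence):
    """Return every part-of-speech sequence the lexicon allows."""
    words = sentence.lower().rstrip(".").split()
    return [tuple(tags) for tags in product(*(LEXICON[w] for w in words))]


for sentence in ("Time flies like an arrow.", "Fruit flies like an apple."):
    candidates = tag_sequences(sentence)
    print(sentence, "->", len(candidates), "candidate taggings")
```

Under this lexicon the first sentence has 8 candidate taggings and the second has 4. The intended readings (NOUN VERB PREP DET NOUN for the first, NOUN NOUN VERB DET NOUN for the second) are each only one candidate among them, and a translation system has to choose between candidates somehow.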

  10. Outline • Introduction • A brief introduction to Translation technology • Interest in MT • Problems Involved in Machine Translation • Translation Technology • Knowledge-based systems • Statistical machine translation systems • Rule-Based vs. Statistical MT • Current State of Machine Translation in Use • Personal Speech-to-Speech Translators

  11. TRANSLATION TECHNOLOGY • There are two kinds of machine translation: knowledge-based systems and statistical machine translation. • Knowledge-based systems: Traditional translation technology takes a knowledge-based approach. These expert systems, used by vendors such as Fujitsu, Logos, and Systran, translate documents by converting words and grammar directly from one language into another.

  12. Knowledge-based systems • How they work: Knowledge-based systems rely on programmers to enter various languages’ vocabulary and syntax information into databases. The programmers then write lists of rules that describe the possible relationships between a language’s parts of speech. The software, which can run on a high-powered PC, analyzes a document and examines the rules for both the text’s language and the target language to translate the material. [Figure text, labeled “Translated documents”: “Hmm, every time he sees “banco”, he either types “bank” or “bench” … but if he sees “banco de…”, he always types “bank”, never “bench”… Man, this is so boring.”]
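The dictionary-plus-rules idea described on this slide can be sketched in a few lines of Python. This is a toy, not how any vendor's system works: the Spanish-to-English lexicon, the tag names, and the single reordering rule are all assumptions chosen for illustration.

```python
# Hypothetical hand-entered lexicon: source word -> (translation, part of speech).
LEXICON = {
    "el": ("the", "DET"),
    "la": ("the", "DET"),
    "gato": ("cat", "NOUN"),
    "casa": ("house", "NOUN"),
    "negro": ("black", "ADJ"),
    "blanca": ("white", "ADJ"),
    "duerme": ("sleeps", "VERB"),
}


def translate(sentence):
    """Translate word by word, then apply one hand-written reordering rule."""
    # Step 1: dictionary lookup gives each word a translation and a tag.
    tagged = [LEXICON[word] for word in sentence.lower().split()]

    # Step 2: structural rule: Spanish NOUN + ADJ becomes English ADJ + NOUN.
    output = []
    i = 0
    while i < len(tagged):
        is_noun_adj = (
            i + 1 < len(tagged)
            and tagged[i][1] == "NOUN"
            and tagged[i + 1][1] == "ADJ"
        )
        if is_noun_adj:
            output.extend([tagged[i + 1][0], tagged[i][0]])
            i += 2
        else:
            output.append(tagged[i][0])
            i += 1
    return " ".join(output)


print(translate("el gato negro duerme"))  # the black cat sleeps
```

Every word and every rule has to be entered by hand, and a word missing from the lexicon raises a `KeyError` here. That is a small-scale view of why such systems are labor-intensive to build for each language pair.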

  13. Statistical machine translation systems • Rather than using the knowledge-based system’s direct word-by-word translation techniques, statistical approaches translate documents by statistically analyzing entire phrases and, over time, “learning” how various languages work. • How it works: Statistical systems start with minimal dictionary and language resources. Users must then train the system before they can work with it on extensive translations. During training, researchers feed the system documents for which they already have accurate human translations. The system then uses its resources to guess at the documents’ meanings.
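As one illustration of training on documents that already have human translations, the sketch below estimates word-translation probabilities with the expectation-maximization procedure of IBM Model 1, a classic technique from the statistical MT literature. The slides do not name a specific training algorithm, so the choice of Model 1, the function name `train_model1`, and the three German-English sentence pairs are assumptions.

```python
from collections import defaultdict


def train_model1(pairs, iterations=10):
    """Estimate t(f | e) from (source_words, target_words) sentence pairs."""
    source_vocab = {f for source, _ in pairs for f in source}
    # Start with a uniform table: every source word is equally likely.
    t = defaultdict(lambda: 1.0 / len(source_vocab))

    for _ in range(iterations):
        count = defaultdict(float)  # expected count of f aligned to e
        total = defaultdict(float)  # expected count of e aligned to anything
        for source, target in pairs:
            for f in source:
                # Spread one unit of evidence for f over the target words,
                # in proportion to the current translation probabilities.
                norm = sum(t[(f, e)] for e in target)
                for e in target:
                    share = t[(f, e)] / norm
                    count[(f, e)] += share
                    total[e] += share
        # Re-estimate the table from the expected counts.
        t = defaultdict(
            float, {(f, e): c / total[e] for (f, e), c in count.items()}
        )
    return t


pairs = [
    (["das", "haus"], ["the", "house"]),
    (["das", "buch"], ["the", "book"]),
    (["ein", "buch"], ["a", "book"]),
]
table = train_model1(pairs)
print(round(table[("das", "the")], 3), round(table[("haus", "house")], 3))
```

No dictionary is supplied. The pairing of "das" with "the" emerges only because the two words keep appearing in the same sentence pairs, which is the sense in which the system "learns" from accurate human translations.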

  14. Statistical machine translation systems • Statistical systems generally work by dividing documents into N-grams, where N is the number of words in a phrase, usually three. • N-grams are statistical translation’s building blocks. Analyzing N-grams helps improve translation accuracy and performance because, while a word by itself may have many definitions, it has far fewer potential meanings when used as part of a phrase.
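Dividing text into N-grams is simple to show in code. The sketch below splits on whitespace, which is a simplifying assumption (real systems tokenize more carefully), and the function name `ngrams` is illustrative.

```python
def ngrams(text, n=3):
    """Return the list of n-word windows in the text, in order."""
    words = text.split()
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]


for gram in ngrams("time flies like an arrow"):
    print(gram)
# ('time', 'flies', 'like')
# ('flies', 'like', 'an')
# ('like', 'an', 'arrow')
```

A text shorter than N words yields an empty list. Within each window, "flies" is seen together with its neighbours, which is the extra context the slide credits for narrowing a word's possible meanings.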

  15. Statistical machine translation systems [Figure labels: Books in English; Same books, in Farsi; Machine Learning Magic; P(F|E) model] • Statistical machine translation (SMT) can be defined as the process of maximizing the probability of a sentence s in the source language matching a sentence t in the target language. • We call collections of texts stored in two languages parallel corpora or parallel texts.
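The maximization described on this slide is commonly written in the noisy-channel form below. The slide does not spell out the formula, so this is the standard textbook formulation rather than the presenter's own notation; s and t are the source and target sentences as defined above.

```latex
\hat{t} \;=\; \arg\max_{t} P(t \mid s)
        \;=\; \arg\max_{t} \frac{P(s \mid t)\, P(t)}{P(s)}
        \;=\; \arg\max_{t} P(s \mid t)\, P(t)
```

The second step is Bayes' rule, and P(s) can be dropped because it does not depend on t. P(s | t) is the translation model, estimated from parallel corpora; it plausibly corresponds to the P(F|E) model in the figure if Farsi is the source and English the target. P(t) is a language model of the target language, which needs only monolingual text.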

  16. Statistical machine translation systems • Statistical machine translation systems, which statistically analyze entire phrases and “learn” how various languages work, frequently work with other types of systems to improve output quality. • The lexicon system provides translated words and their variations. • The alignment system ensures that phrases from the source language are converted to the proper phrases and presented in the proper order in the target language. • The language system performs a morphological analysis of individual words or a syntactic analysis of sentences and thereby produces translations that read properly.
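A rough sketch of how a lexicon component and a language component might be combined to pick one translation: each candidate word is scored by a lexicon probability multiplied by how well it fits the surrounding target-language words. All probabilities below are invented for illustration, the names `TRANSLATION`, `BIGRAM`, `score`, and `best_translation` are hypothetical, and the alignment component is left out to keep the example short.

```python
import math

# Hypothetical lexicon probabilities P(source word | target word).
TRANSLATION = {
    ("banco", "bank"): 0.6,
    ("banco", "bench"): 0.4,
}

# Hypothetical bigram language model P(word | previous word).
BIGRAM = {
    ("the", "bank"): 0.02,
    ("the", "bench"): 0.01,
    ("bank", "of"): 0.30,
    ("bench", "of"): 0.01,
}


def score(source_word, candidate, prev_word, next_word):
    """Log of lexicon probability times language-model fit in context."""
    return (
        math.log(TRANSLATION[(source_word, candidate)])
        + math.log(BIGRAM[(prev_word, candidate)])
        + math.log(BIGRAM[(candidate, next_word)])
    )


def best_translation(source_word, candidates, prev_word, next_word):
    """Return the candidate with the highest combined score."""
    return max(
        candidates,
        key=lambda c: score(source_word, c, prev_word, next_word),
    )


print(best_translation("banco", ["bank", "bench"], "the", "of"))  # bank
```

With these made-up numbers, "bank" wins between "the" and "of" because the language component strongly prefers "bank of" to "bench of", even though the lexicon alone rates the two candidates fairly close.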

  17. Rule-Based vs. Statistical MT
      • Rule-based MT:
        • Very labor-intensive, time-consuming, and expensive
        • Rules can be based on lexical or structural transfer
        • Each program must be customized for each language pair it works with
        • Pro: firm grip on complex translation phenomena
        • Con: time-consuming, expensive, and often very labor-intensive, which leads to a lack of robustness
      • Statistical MT:
        • Mainly word-based or phrase-based translations
        • Translations are learned from actual data
        • In general, the more data is provided for learning, the higher the quality of translation
        • Pro: translations are learned automatically
        • Con: difficult to model complex translation phenomena

  18. Current State of Machine Translation in Use • Google Translate is a service provided by Google Inc. to translate a section of text, or a webpage, into another language, with limits on the number of paragraphs or the range of technical terms translated. • For some languages, users are asked for alternative translations, such as for technical terms, to be included in future updates to the translation process. • Google Translate is based on an approach called statistical machine translation.

  19. Current State of Machine Translation in Use cont. • SYSTRAN: SYSTRAN's methodology is a sentence-by-sentence approach, concentrating on individual words and their dictionary data, then on the parse of the sentence unit, followed by the translation of the parsed sentence. • AltaVista’s Babel Fish: Babel Fish is a web-based application developed by AltaVista (now part of Yahoo!) which automatically translates text or web pages from one of several languages into another. The translation technology for Babel Fish is provided by SYSTRAN, whose technology also powers a number of other sites and portals.

  20. Current State of Machine Translation in Use cont. • Language Weaver is a Los Angeles, California-based company that was founded in 2002 by the University of Southern California's Kevin Knight and Daniel Marcu to commercialize a statistical approach to automatic language translation and natural language processing, now known globally as statistical machine translation software (SMTS). Language Weaver’s statistically based translation software is an instance of a recent advance in automated translation. • Windows Live Translator is a service provided by Microsoft as part of its Windows Live services which allows users to translate texts or entire web pages into different languages. Computer-related texts are translated by Microsoft's own statistical machine translation technology for eight supported languages.

  21. Personal Speech-to-Speech Translators • One of the newest research areas in machine translation is the personal speech-to-speech translator. People on business or personal trips could use these devices to translate on the fly. • Speech-to-speech translation, which is still in the experimental stage, is a complex process requiring speech-recognition technology that converts speech to text, machine translation of the text, and then text-to-speech conversion. • IBM is working on the handheld multilingual automatic speech-to-speech translator (Mastor), which uses a hybrid statistical/knowledge-based engine to translate the content. Mastor tries to determine the general meaning of a phrase, rather than its exact translation. This approach requires less database capacity, which makes it more suitable for small devices.
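The three-stage process described on this slide (speech recognition, machine translation, text-to-speech) is a pipeline, which the sketch below expresses as function composition. The function `make_translator` and the placeholder stages are hypothetical stand-ins; no real speech or translation library is used, and this is not a description of Mastor's design.

```python
from typing import Callable


def make_translator(
    recognize: Callable[[bytes], str],
    translate: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> Callable[[bytes], bytes]:
    """Chain the three stages into one speech-to-speech function."""

    def speech_to_speech(audio: bytes) -> bytes:
        text = recognize(audio)         # speech recognition: audio -> text
        translated = translate(text)    # machine translation: text -> text
        return synthesize(translated)   # text-to-speech: text -> audio

    return speech_to_speech


# Placeholder stages so the sketch runs: "audio" here is just UTF-8 bytes.
pipeline = make_translator(
    recognize=lambda audio: audio.decode("utf-8"),
    translate=lambda text: {"hola": "hello"}.get(text, text),
    synthesize=lambda text: text.encode("utf-8"),
)
print(pipeline(b"hola"))  # b'hello'
```

Because each stage consumes the previous stage's output, an error made in recognition is passed straight into translation, which is one reason the overall process is complex.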

  22. LOOKING AHEAD • Because of ongoing demand for better translation systems, research money will continue to flow into the field. In addition, companies are likely to develop and release more commercial products.

  23. Questions?
