
CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 10, 11–MT approaches)






Presentation Transcript


  1. CS460/626 : Natural Language Processing/Speech, NLP and the Web (Lecture 10, 11 – MT approaches) Pushpak Bhattacharyya, CSE Dept., IIT Bombay, 25th Jan and 27th Jan, 2011. Acknowledgement: parts are from Hansraj’s dual degree seminar presentation

  2. Czech-English data • [nesu] “I carry” • [ponese] “He will carry” • [nese] “He carries” • [nesou] “They carry” • [yedu] “I drive” • [plavou] “They swim”

  3. To translate … • I will carry. • They drive. • He swims. • They will drive.

  4. Hindi-English data • [DhotAhuM] “I carry” • [DhoegA] “He will carry” • [DhotAhAi] “He carries” • [DhotehAi] “They carry” • [chalAtAhuM] “I drive” • [tErtehEM] “They swim”

  5. Bangla-English data • [bai] “I carry” • [baibe] “He will carry” • [bay] “He carries” • [bay] “They carry” • [chAlAi] “I drive” • [sAMtrAy] “They swim”

  6. MT Approaches • [Pyramid diagram: the SOURCE and TARGET sides rise through words, phrases, syntax and semantics, and meet at an interlingua at the top]

  7. Taxonomy • MT Approaches divide into: • Data driven; Machine Learning based: Statistical MT and Example Based MT (EBMT) • Knowledge Based; Rule Based MT: Interlingua Based and Transfer Based

  8. Motivation • MT: NLP Complete • NLP: AI complete • AI: CS complete • How will the world be different when the language barrier disappears? • Volume of text required to be translated currently exceeds translators’ capacity (demand outstrips supply). • Solution: automation (the only solution) • Many machine translation techniques exist • Which approach is better for Hindi-English MT?

  9. Interlingual representation: complete disambiguation • Example: “Washington voted Washington to power” • [Interlingua graph: vote @past (is-a: action) with agent, object and goal arcs; the two occurrences of Washington are disambiguated (is-a: place, is-a: person), power is marked is-a: capability, and @emphasis is attached to one of the nodes]

  10. Kinds of disambiguation needed for a complete and correct interlingua graph • N: Name • P: POS • A: Attachment • S: Sense • C: Co-reference • R: Semantic Role

  11. Target Sentence Generation from interlingua • Target Sentence Generation comprises: • Lexical Transfer (word/phrase translation) • Morphological Synthesis (word form generation) • Syntax Planning (sequencing)

  12. Role of function words • Washington voted Washington to power. • Washington(agent) ne Washington(object) ko sattaa(goal) ke liye chunaa • Vote: chunaa, Power: sattaa

  13. Statistical Machine Translation (SMT) • Data driven approach • The goal is to find the English sentence e, given a foreign language sentence f, for which p(e|f) is maximum • Translations are generated on the basis of a statistical model • Parameters are estimated from bilingual parallel corpora
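A compact statement of this goal, reconstructed here in the standard noisy-channel form (the transcript itself carries no formula):

```latex
% Noisy-channel formulation of SMT: choose the English sentence that
% maximizes p(e|f); Bayes' rule splits this into a translation model
% p(f|e) and a language model p(e), the two components discussed on
% the following slides.
\[
\hat{e} = \arg\max_{e} \Pr(e \mid f)
        = \arg\max_{e} \Pr(f \mid e)\,\Pr(e)
\]
```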

  14. SMT: Language Model • To detect good English sentences • The probability of an English sentence s1 s2 … sn can be written as Pr(s1 s2 … sn) = Pr(s1) * Pr(s2|s1) * … * Pr(sn|s1 s2 … sn-1) • Here Pr(sn|s1 s2 … sn-1) is the probability that word sn follows the word string s1 s2 … sn-1 • N-gram model probability • Trigram model probability calculation (see the reconstruction below)
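The last two bullets presumably referred to the usual Markov approximation; a reconstruction for the trigram case, with probabilities estimated from corpus counts:

```latex
% Trigram approximation: condition each word only on the previous
% two words, and estimate each trigram probability from counts.
\[
\Pr(s_n \mid s_1 \dots s_{n-1}) \approx \Pr(s_n \mid s_{n-2}\, s_{n-1}),
\qquad
\Pr(s_n \mid s_{n-2}\, s_{n-1}) =
  \frac{\operatorname{count}(s_{n-2}\, s_{n-1}\, s_n)}
       {\operatorname{count}(s_{n-2}\, s_{n-1})}
\]
```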

  15. SMT: Translation Model • P(f|e): probability of some f given a hypothesised English translation e • How to assign the values to P(f|e)? • The set of sentences is infinite, so it is not possible to observe every pair (e, f) • Introduce a hidden variable a that represents alignments between the individual words in the sentence pair • [Figure: a sentence-level pair decomposed into word-level alignments]

  16. Alignment • If the string e = e1^l = e1 e2 … el has l words, and the string f = f1^m = f1 f2 … fm has m words, • then the alignment a can be represented by a series a1^m = a1 a2 … am of m values, each between 0 and l, such that if the word in position j of the f-string is connected to the word in position i of the e-string, then • aj = i, and • if it is not connected to any English word, then aj = 0

  17. Example of alignment • English: Ram went to school • Hindi: Raama paathashaalaa gayaa • [Alignment figure: “Ram went to school” linked word by word to “<Null> Raama paathashaalaa gayaa”]
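In the notation of the previous slide, taking Hindi as the f-string and English as the e-string (so m = 3 and l = 4) is one plausible reading of the figure; that choice, and the resulting alignment vector below, are illustrative rather than taken from the slide.

```latex
% One plausible alignment vector for the example:
% f = (Raama, paathashaalaa, gayaa), e = (Ram, went, to, school)
\[
a_1^3 = (a_1, a_2, a_3) = (1, 4, 2)
\]
% i.e. Raama -> Ram (position 1), paathashaalaa -> school (position 4),
% gayaa -> went (position 2); no f-word connects to "to".
```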

  18. Translation Model: Exact expression • Five models for estimating the parameters in the expression [2] • Model-1, Model-2, Model-3, Model-4, Model-5 • The generative story: choose the length of the foreign language string given e; choose the alignment given e and m; choose the identity of each foreign word given e, m, a
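The exact expression itself did not survive the transcript; the decomposition below is reconstructed from [2] and matches the three choices listed above.

```latex
% Exact expression of the translation model (Brown et al. 1993, [2]):
% marginalize over alignments, then decompose each joint term into
% length, alignment and word-identity choices.
\[
\Pr(f \mid e) = \sum_{a} \Pr(f, a \mid e)
\]
\[
\Pr(f, a \mid e) = \Pr(m \mid e)
  \prod_{j=1}^{m} \Pr(a_j \mid a_1^{j-1}, f_1^{j-1}, m, e)\,
  \Pr(f_j \mid a_1^{j}, f_1^{j-1}, m, e)
\]
```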

  19. Proof of Translation Model: Exact expression • Pr(f|e) is obtained by marginalization: sum Pr(f, a|e) over all alignments a (and, in principle, over m) • Since m is fixed for a particular f, the sum over m collapses and only the sum over alignments remains

  20. Model-1 • Simplest model • Assumptions: • Pr(m|e) is independent of m and e and is equal to ε • The alignment of foreign language words (FLWs) depends only on the length of the English sentence and equals (l+1)^(-1) • l is the length of the English sentence • The likelihood function is reconstructed below • Maximize the likelihood function subject to the constraint that the t(f|e) sum to 1 over f for each e
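The Model-1 likelihood referred to above, reconstructed from [2]:

```latex
% IBM Model 1 likelihood (Brown et al. 1993): epsilon times a uniform
% alignment term, with word translation probabilities t(f|e).
\[
\Pr(f \mid e) = \frac{\epsilon}{(l+1)^{m}}
  \prod_{j=1}^{m} \sum_{i=0}^{l} t(f_j \mid e_i)
\]
% maximized subject to
\[
\sum_{f} t(f \mid e) = 1 \quad \text{for each English word } e.
\]
```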

  21. Model-1: Parameter estimation • Using a Lagrange multiplier for the constrained maximization, the solution for the model-1 parameters is t(f|e) = λe^(-1) c(f|e; f, e) • λe: normalization constant; c(f|e; f, e): expected count; δ(f, fj) is 1 if f and fj are the same, zero otherwise • Estimate t(f|e) using the Expectation Maximization (EM) procedure (a sketch follows)
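A minimal sketch of that EM procedure for Model-1, in Python; the corpus format, function and variable names, and the toy data are illustrative assumptions, not taken from the slides.

```python
# Minimal EM training of IBM Model 1 word translation probabilities t(f|e).
from collections import defaultdict

def train_ibm_model1(parallel_corpus, iterations=10):
    """parallel_corpus: list of (f_words, e_words) sentence pairs.
    A NULL token is prepended to every English sentence (position 0)."""
    corpus = [(f, ["<NULL>"] + e) for f, e in parallel_corpus]

    # Uniform initialization of t(f|e)
    f_vocab = {f for fs, _ in corpus for f in fs}
    t = defaultdict(lambda: 1.0 / len(f_vocab))

    for _ in range(iterations):
        count = defaultdict(float)   # expected counts c(f|e)
        total = defaultdict(float)   # normalizers, one per English word

        # E-step: distribute each f-word's count over the candidate e-words
        for fs, es in corpus:
            for f in fs:
                z = sum(t[(f, e)] for e in es)
                for e in es:
                    delta = t[(f, e)] / z
                    count[(f, e)] += delta
                    total[e] += delta

        # M-step: re-estimate t(f|e) = count(f, e) / total(e)
        for (f, e), c in count.items():
            t[(f, e)] = c / total[e]

    return t

# Toy usage on the alignment example from the earlier slide
corpus = [
    (["Raama", "paathashaalaa", "gayaa"], ["Ram", "went", "to", "school"]),
    (["Raama", "gayaa"], ["Ram", "went"]),
]
t = train_ibm_model1(corpus)
print(t[("gayaa", "went")], t[("Raama", "Ram")])
```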

  22. Model-2 • The alignment of an FLW to an English word depends on its position • The likelihood function is given below • Model-1 & 2 • Model-1 is a special case of model-2 in which the alignment probability is uniform, a(i|j, m, l) = (l+1)^(-1) • To initialize the model-2 parameters, use the parameters estimated in model-1
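The Model-2 likelihood, reconstructed from [2]; the uniform 1/(l+1) of Model-1 is replaced by an alignment distribution a(i|j, m, l):

```latex
% IBM Model 2 likelihood (Brown et al. 1993): word translation
% probabilities t(f|e) combined with position-dependent alignment
% probabilities a(i|j,m,l).
\[
\Pr(f \mid e) = \epsilon
  \prod_{j=1}^{m} \sum_{i=0}^{l} t(f_j \mid e_i)\, a(i \mid j, m, l)
\]
```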

  23. Model-3 • Fertility: Number of FLWs to which an Eng word is connected in a randomly selected alignment • Tablet: List of FLWs connected to an Eng word • Tableau: The collection of tablets • The alignment process • foreach English word • Begin • Decide the fertility of the word • Get a list of French words to connect to the word • End • Permute words in tableau to generate f

  24. Model-3: Example • English Sentence (e) = Annual inflation rises to 11.42% • Step-1: Deciding fertilities (F) • e = Annual inflation rises to 11.42% • F = Annual inflation inflation inflation rises rises rises to 11.42%

  25. Model-3: Example • English Sentence (e) = Annual inflation rises to 11.42% • Step-2: Translation to FLWs (T) • e = Annual inflation rises to 11.42% • F = Annual inflation inflation inflation rises rises rises to 11.42% • T = वार्षिक मुद्रास्फीति की दर बढ गई है तक 11.42%

  26. Model-3: Example • English Sentence (e) = Annual inflation rises to 11.42% • Step-3: Reordering FLWs (R) • e = Annual inflation rises to 11.42% • F = Annual inflation inflation inflation rises rises rises to 11.42% • T = वार्षिक मुद्रास्फीति की दर बढ गई है तक 11.42% • R = वार्षिक मुद्रास्फीति की दर 11.42% तक बढ गई है • The values of F, T and R are calculated using the formulas obtained in model-3 [2]

  27. Model-4 & 5 • Model-3: every word is moved independently • Model-4: considers phrases (cepts) in a sentence • The distortion probability is replaced by • a parameter for the head of each cept • a parameter for the remaining words of the cept • Deficiency in models 3 & 4: the distortion probabilities can place words in unavailable positions • Model-5 removes the deficiency • It avoids unavailable positions • It introduces a new variable for the positions

  28. Example Based Machine Translation (EBMT) • Basic idea: translate a sentence by using the closest match in parallel data • Inspired by human analogical thinking

  29. Issues Related to Examples in Corpora • Granularity of examples • Parallel text should be aligned at the sub-sentence level • Number of examples • Suitability of examples • (i) Columbus discovered America (ii) America was discovered by Columbus • (a) Time flies like an arrow (b) Time flies like an arrow (the same string with two different readings) • How should examples be stored? • Annotated tree structure • Generalized examples • “Rajesh will reach Mumbai by 10:00 pm” -> “P will reach D by T”

  30. Annotated Tree Structure: example • Fully annotated tree with explicit links

  31. EBMT: Matching and Retrieval (1/2) • The system must be able to recognize the similarities and differences b/w the input and stored examples • String based matching: • Longest common subsequence (see the sketch below) • Takes word similarity into account for word sense disambiguation
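A minimal sketch of LCS-based retrieval over a toy example base, in Python; the function names, scoring and data are illustrative assumptions, not part of the original material.

```python
# Retrieve the stored example closest to an input sentence using the
# length of the longest common subsequence (LCS) of their tokens.
def lcs_length(x, y):
    """Length of the longest common subsequence of token lists x and y."""
    m, n = len(x), len(y)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]

def retrieve_closest(input_sentence, example_base):
    """Return the (source, translation) pair whose source side shares the
    longest common subsequence with the input sentence."""
    inp = input_sentence.split()
    return max(example_base, key=lambda pair: lcs_length(inp, pair[0].split()))

# Toy example base of (source, translation) pairs
examples = [
    ("The boy entered the house", "लड़के ने कमरे में प्रवेश किया"),
    ("I saw a tiger", "मैंने एक चीता देखा"),
]
print(retrieve_closest("The boy saw a tiger", examples))
```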

  32. EBMT: Matching and Retrieval (2/2) • Angle of similarity: • A trigonometric similarity measure based on relative length & relative contents • (x) Select ‘Symbol’ in the Insert menu. • (y) Select ‘Symbol’ in the Insert menu to enter a character from the symbol set. • (z) Select ‘Paste’ in the Edit menu. • (w) Select ‘Paste’ in the Edit menu to enter some text from the clipboard. • θ(x, y): the qualitative difference between sentences x and y • δ(x, y): the difference between the sizes of x and y

  33. EBMT: Adaptation & Recombination • Adaptation • Extracting appropriate fragments from the matched translation • The boy entered the house -> लड़के ने कमरे में प्रवेश किया • I saw a tiger -> मैंने एक चीता देखा • The boy eats his breakfast -> लड़के ने उसका नास्ता खाया था • I saw the boy -> मैंने लड़के को देखा था • Boundary Friction • Retrieved translations do not fit the syntactic context • I saw the boy -> * मैंने लड़के ने देखा था • Recombine fragments into target text • The SMT “language model” can be used

  34. Interlingua Based MT • Interlingua • “between languages” • The SL text is converted into a language-independent or ‘universal’ abstract representation, which is then transformed into several TLs

  35. Universal Networking Language (UNL) • UNL is an example of an interlingua • Represents information sentence by sentence • UNL is composed of • Universal words • Relations • Example: “I gave him a book” {unl} agt ( give.@entry.@past, i ) obj ( give.@entry.@past, book.@indef ) gol ( give.@entry.@past, he ) {/unl}

  36. Issues related to interlingua • The interlingua must • capture the knowledge in the text precisely and accurately • handle cross-language divergence • Divergences between Hindi and English • Constituent order divergence • Null subject divergence • जा रहा हूँ == * am going (I am going) • Conflational divergence • जीम ने जोहन को छुरे से मारा == Jim stabbed John • Promotional divergence • The play is on == खेल चल रहा है

  37. Benefits & Shortcomings (1/3) • Statistical Machine Translation • “Every time I fire a linguist, my system’s performance improves” (Brown et al. 1988) • Pros • No linguistic knowledge is required • A great deal of natural language is available as machine-readable text • Loose dependencies b/w languages can be modeled better • Cons • Probabilities of rare words can’t be trusted • Not good for idioms, jokes, compound words, or text with hidden meaning • Selecting the correct morphological word form is difficult

  38. Benefits & Shortcomings (2/3) • Example Based MT • Pros • Perfect translation of a sentence if a very similar one is found among the example sentences • No need to bother about previously translated sentences • Cons • Fails if no match is found in the corpus • Problems at the points of example concatenation in the recombination step

  39. Benefits & Shortcomings (3/3) • Interlingua based MT • Pros • Add a new language and get all-ways translation to all previously added languages • Monolingual development teams • Economical in situations where translation among multiple languages is needed • Cons • “Meaning” is arbitrarily deep. At what level of detail do we stop? • Human development time

  40. Translation is Ubiquitous • Between Languages • Delhi is the capital of India • दिल्ली  भारत की राजधानी है • Between dialects • Example next slide • Between registers • My “mom” not well. • My “mother” is unwell (in a leave application)

  41. Between dialects (1/3) • Lage Raho Munnabhai: an excellent example • Scene: Munnabhai (Sanjay Dutt), posing as Prof. Murli Prasad Sharma, is being interviewed, with some citizens asking questions in the presence of Jahnavi (Vidya Balan) • Question by a citizen: • प्रोफेसर साब, पार्क में एक नौजवान पत्थर उठा के बापू के मूर्ति पर मारा और उसका एक हाथ  टूटा दिया. मेरे समझ में नही आया में उस नौजवान को केसे समझाऊ.

  42. Between dialects (2/3) • Bapu from behind invisible to others: • उस के हाथ में एक पत्थर देकर कहना चाहिए था बापू का इस पुतला गिरा दो  • Munnabhai • उस का हाथ में एक पत्थर देने का और कहनेका कीबापू का इस पुतला गिरा दो  • Bapu • इस देश में मेरे जितना भी पुतला है सब गिरा दो • Munnabhai • ई full country में मेरा जितना भी पुतला है सब गिरा दो • Bapu • हर इमारत हर चौराहे हर मार्ग से मेरा नाम मिटा दो • Munnabhai • हर बिल्डिंग हर नोट वोट रोड से मेरा नाम मिटा दो

  43. Between dialects (3/3) • Bapu • मेरे हर तसबीर को दीवार से हठा दो • Munnabhai • मेरे जितना भी तसबीर दीवार पे लटकेला है ना, उसको निकाल के फेक दो • Bapu • अगर कही रखना है तो अपने दिलो में रखो • Munnabhai • क्या है की कही रखना छे तो, अपने दिल में रखो ना, समझा क्या, इधर heart में heart में!

  44. Comparison b/w SMT, EBMT, Interlingua

  45. References (1/2) • P. Brown, S. Della Pietra, V. Della Pietra, and R. Mercer. The mathematics of statistical machine translation: parameter estimation. Computational Linguistics, 19(2), 263-311. (1993) • Makoto Nagao. A framework of a mechanical translation between Japanese and English by analogy principle, in A. Elithorn and R. Banerji: Artificial and Human Intelligence. Elsevier Science Publishers. (1984) • Somers H. Review Article: Example based Machine Translation. Machine Translation, Volume 14, Number 2, pp. 113-157. (June 1999) • D. Turcato, F. Popowich. What is Example-Based Machine Translation? In M. Carl and A. Way (eds). Recent Advances of EBMT. Kluwer Academic Publishers, Dordrecht. Revised version of a workshop paper. (2003)

  46. References (2/2) • Dave S., Parikh J. and Bhattacharyya P. Interlingua Based English Hindi Machine Translation and Language Divergence. Journal of Machine Translation, Volume 17. (2002) • Adam L. Berger, Stephen A. Della Pietra, Vincent J. Della Pietra. A maximum entropy approach to natural language processing. Computational Linguistics, 22(1). (March 1996) • Jason Baldridge, Tom Morton, and Gann Bierner. The opennlp.maxent package: POS tagger, end of sentence detector, tokenizer, name finder. http://maxent.sourceforge.net/ version 2.4.0 (Oct. 2005) • Universal Networking Language (UNL) Specifications. UNL Center of UNDL Foundation. URL: http://www.undl.org/unlsys/unl/unl2005/. 7 June 2005.
