Natural Language Generation An Introductory Tour

Natural Language Generation An Introductory Tour. Anupam Basu Dept. of Computer Science & Engineering IIT Kharagpur. Natural Language Understanding. Natural Language Generation. Speech Recognition. Speech Synthesis. Language Technology. Meaning. Text. Text. Speech. Speech.

Presentation Transcript


  1. Natural Language Generation: An Introductory Tour Anupam Basu Dept. of Computer Science & Engineering IIT Kharagpur Summer School on Natural Language Processing and Text Mining 2008

  2. Natural Language Understanding Natural Language Generation Speech Recognition Speech Synthesis Language Technology Meaning Text Text Speech Speech

  3. What is NLG? Thought / conceptualization of the world → Expression • The block c is on block a • The block a is under block c • The block b is by the side of a • The block b is on the right of a • The block b has its top free • The block b is alone • …

  4. Conceptualization • Some intermediate form of representation ON (C, A) ON (A, TABLE) ON (B, TABLE) RIGHT_OF (B,A) ……. What to say?
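As a rough illustration of how such an intermediate representation could be turned into the block-world sentences of the previous slide, here is a minimal Python sketch. The tuple encoding, the `PATTERNS` table and the function names are assumptions made for this example; they are not part of the original slides.

```python
# Hypothetical phrase table: relation name -> sentence pattern (an assumption).
PATTERNS = {
    "ON": "The block {0} is on {1}.",
    "RIGHT_OF": "The block {0} is on the right of {1}.",
}

def describe(entity):
    """Name the second argument: TABLE is a table, anything else is a block."""
    return "the table" if entity == "TABLE" else f"block {entity.lower()}"

def verbalize(facts):
    """Turn (relation, first, second) facts into one sentence each."""
    return [
        PATTERNS[relation].format(first.lower(), describe(second))
        for relation, first, second in facts
    ]

# The facts listed on the slide, encoded as tuples.
FACTS = [
    ("ON", "C", "A"),
    ("ON", "A", "TABLE"),
    ("ON", "B", "TABLE"),
    ("RIGHT_OF", "B", "A"),
]

for sentence in verbalize(FACTS):
    print(sentence)
```

The sketch says everything it knows, one fact per sentence. Deciding which facts are worth saying ("What to say?") is exactly the part it leaves out.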

  5. Conceptualization Is_a Block C On Is_a B A Right_of What to say?

  6. What to say? How to say? Natural language generation is the process of deliberately constructing a natural language text in order to meet specified communicative goals. [McDonald 1992]

  7. Some of the Applications • Machine Translation • Question Answering • Dialogue Systems • Text Summarization • Report Generation

  8. Thought / Concept → Expression • Objective: • produce understandable and appropriate texts in human languages • Input: • some underlying non-linguistic representation of information • Knowledge sources required: • Knowledge of language and of the domain

  9. Involved Expertise • Knowledge of Domain • What to say • Relevance • Knowledge of Language • Lexicon, Grammar, Semantics • Strategic Rhetorical Knowledge • How to achieve goals, text types, style • Sociolinguistic and Psychological Factors • Habits and Constraints of the end user as an information processor

  10. Asking for a pen • Situation: have(X, z), not have(Y, z) • Why? Goal: want(have(Y, z)) • What? Conceptualization: ask(give(X, z, Y)) • How? Expression: Could you please give me a pen?

  11. Some Examples

  12. Example System #1: FoG • Function: • Produces textual weather reports in English and French • Input: • Graphical/numerical weather depiction • User: • Environment Canada (Canadian Weather Service) • Developer: • CoGenTex • Status: • Fielded, in operational use since 1992

  13. FoG: Input

  14. FoG: Output

  15. Example System #2: STOP • Function: • Produces a personalised smoking-cessation leaflet • Input: • Questionnaire about smoking attitudes, beliefs, history • User: • NHS (British Health Service) • Developer: • University of Aberdeen • Status: • Undergoing clinical evaluation to determine its effectiveness

  16. STOP: Input

  17. STOP: Output Dear Ms Cameron Thank you for taking the trouble to return the smoking questionnaire that we sent you. It appears from your answers that although you're not planning to stop smoking in the near future, you would like to stop if it was easy. You think it would be difficult to stop because smoking helps you cope with stress, it is something to do when you are bored, and smoking stops you putting on weight. However, you have reasons to be confident of success if you did try to stop, and there are ways of coping with the difficulties.

  18. Approaches

  19. Template-based generation • Most common sort of NLG found in commercial systems • In simplest form, words fill in slots: • “The train from [Source] to [Destination] will leave platform [number] at [time] hours”
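The slot-filling idea can be sketched in a few lines of Python. The slot names follow the slide's example sentence; the station names, platform and time below are made-up values used only for illustration.

```python
# Template with named slots, following the train announcement on the slide.
TEMPLATE = (
    "The train from {source} to {destination} "
    "will leave platform {platform} at {time} hours."
)

def announce(source, destination, platform, time):
    """Fill every slot of the fixed template with the given values."""
    return TEMPLATE.format(
        source=source, destination=destination, platform=platform, time=time
    )

# Illustrative values (assumptions, not from the slides).
print(announce("Kharagpur", "Howrah", 3, "1430"))
```

Every announcement produced this way has the same wording, which is the monotony and lack of generality that the next slide lists among the drawbacks.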

  20. Pros and Cons • Pros • Conceptually simple • No specialized knowledge needed • Can be tailored to a domain with good performance • Cons • Not general • No variation in style – monotonous • Not scalable

  21. Modern Approaches • Rule Based approach • Machine Learning Approach

  22. Some Critical Issues

  23. Context Sensitivity in Connected Sentences • X-town was a blooming city. Yet, when the hooligans started to invade the place, __________ . The place was not livable any more. • the place was abandoned by its population • the place was abandoned by them • the city was abandoned by its population • it was abandoned by its population • its population abandoned it……..

  24. Referencing John is Jane’s friend. He loves to swim with his dog in the pool. It is really lovely. I am taking the Shatabdi Express tomorrow. It is a much better train than the Rajdhani Express. It has a nice restaurant car, while the other has nice seats.

  25. Referencing John stole the book from Mary, but he was caught. John stole the book from Mary, but the fool was caught.
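One simple way to think about the choice between a pronoun and a full description is sketched below in Python. The rule used here (pronominalize only when the entity has already been mentioned and no other mentioned entity shares its gender) is an assumption chosen for illustration, not a method given in the slides, and the `refer` function and gender labels are hypothetical.

```python
PRONOUNS = {"male": "he", "female": "she", "neuter": "it"}

def refer(entity, gender, context):
    """Pick a referring expression for `entity`.

    `context` is a list of (entity, gender) pairs already mentioned.
    A pronoun is used only when it cannot be confused with another
    previously mentioned entity of the same gender.
    """
    mentioned = any(name == entity for name, _ in context)
    competitors = {
        name for name, other in context if other == gender and name != entity
    }
    if mentioned and not competitors:
        return PRONOUNS[gender]
    return entity

# "John stole the book from Mary, but he was caught."
theft = [("John", "male"), ("Mary", "female"), ("the book", "neuter")]
print(refer("John", "male", theft))

# Two trains compete for "it", so the full name is kept.
trains = [("the Shatabdi Express", "neuter"), ("the Rajdhani Express", "neuter")]
print(refer("the Shatabdi Express", "neuter", trains))
```

A rule like this cannot produce "the fool": choosing such a description also conveys an attitude towards John, which goes beyond picking out the referent.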

  26. Aggregation • The dress was cheap. The dress was beautiful. • The dress was cheap and beautiful. • The dress was cheap yet beautiful. • I found the boy. The boy was lost. • I found the boy who was lost. • I found the lost boy. • Sita bought a story book. Geeta bought a story book. • ???? Sita and Geeta bought a story book. • ???? Sita bought a story book and Geeta also bought a story book.
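The simplest of these cases, two clauses that share a subject and a verb, can be sketched as follows. The `(subject, verb, complement)` encoding and the `aggregate` function are assumptions for this example. When the subjects differ, as in the Sita and Geeta case, the sketch deliberately keeps two sentences, since merging them may change the meaning (one shared book versus two books).

```python
def aggregate(first, second, connective="and"):
    """Merge two (subject, verb, complement) clauses when it is safe.

    Clauses with the same subject and verb are merged by coordinating
    their complements; anything else is left as two sentences.
    """
    subject1, verb1, complement1 = first
    subject2, verb2, complement2 = second
    if subject1 == subject2 and verb1 == verb2:
        return f"{subject1} {verb1} {complement1} {connective} {complement2}."
    return f"{subject1} {verb1} {complement1}. {subject2} {verb2} {complement2}."

cheap = ("The dress", "was", "cheap")
beautiful = ("The dress", "was", "beautiful")

print(aggregate(cheap, beautiful))
print(aggregate(cheap, beautiful, connective="yet"))
print(aggregate(("Sita", "bought", "a story book"),
                ("Geeta", "bought", "a story book")))
```

The choice between "and" and "yet" is left to the caller here; in a fuller system it would follow from the rhetorical relation between the two propositions.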

  27. Choice of words (Lexicalization) The bus was in time. The journey was fine. The seats were bad. The bus was in perfect time. The journey was fantastic. The seats were awful. The bus was in perfect time. The journey was fantastic. However, the seats were not that good.

  28. General Architecture

  29. Component Tasks in NLG • Content Planning === Macroplanner • Document Structuring • Sentence Planner === Microplanning • Aggregation ; Lexicalization; Referring Expression Generation • Surface Form Realization • Linguistic realization; Structure Realization

  30. A Pipelined Architecture: Document Planning → Document Plan → Microplanning → Text Specification → Surface Realization

  31. An Example Consider two assertions: has (Hotel_Bliss, food (bad)) and has (Hotel_Bliss, ambience (good)). Content Planning selects the information ordering: • Hotel Bliss has bad food but its ambience is good • Hotel Bliss has good ambience but its food is bad

  32. Sentence Planning • Input: has (Hotel_Bliss, food (bad)) • Choose syntactic templates • Choose lexicon: bad or awful; food or cuisine; good or excellent • Aggregate the two propositions • Generate referring expressions: it or this restaurant • Ordering: a big red ball OR a red big ball • Template diagram labels: Have, Entity, Feature, Modifier, Subj, Obj

  33. Realization • Correct verb inflection: Have → Has • May require noun inflection (not in this case) • Articles required? Where? • Conversion into final string • Capitalization and punctuation
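The three stages of the Hotel Bliss example can be put together in a toy Python pipeline. This is only a sketch under simplifying assumptions: the function names, the dictionary used as a sentence specification, the fixed "but" connective and the identity `LEXICON` are all invented for illustration and handle just this pair of assertions.

```python
# The two assertions from the example, as (entity, feature, value) triples.
ASSERTIONS = [("Hotel_Bliss", "food", "bad"), ("Hotel_Bliss", "ambience", "good")]

# Lexical choice table; swapping in "awful" or "excellent" would change style.
LEXICON = {"bad": "bad", "good": "good"}

def content_plan(assertions, first_feature):
    """Content planning: put the assertion about `first_feature` first."""
    return sorted(assertions, key=lambda a: a[1] != first_feature)

def sentence_plan(ordered):
    """Sentence planning: choose words, aggregate, and use 'its' to refer back."""
    (entity, feature1, value1), (_, feature2, value2) = ordered
    return {
        "subject": entity.replace("_", " "),
        "verb": "have",
        "object": f"{LEXICON[value1]} {feature1}",
        "contrast": f"its {feature2} is {LEXICON[value2]}",
    }

def realize(spec):
    """Realization: inflect the verb, then capitalize and punctuate."""
    verb = "has" if spec["verb"] == "have" else spec["verb"]
    text = f"{spec['subject']} {verb} {spec['object']} but {spec['contrast']}"
    return text[0].upper() + text[1:] + "."

def generate(assertions, first_feature):
    return realize(sentence_plan(content_plan(assertions, first_feature)))

print(generate(ASSERTIONS, "food"))
print(generate(ASSERTIONS, "ambience"))
```

Changing only the ordering decision in `content_plan` yields the two alternative sentences of the example, while the later stages stay untouched. That separation is the point of the pipelined architecture.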

  34. Content Planning • What to say • Data collection • Making domain specific inferences • Content selection • Proposition formulation • Each proposition → A clause • Text structuring • Sequential ordering of propositions • Specifying Rhetorical Relations

  35. Content Planning Approaches • Schema based (McKeown 1985) • Specify what information, in which order • The schema is traversed to generate discourse plan • Application of operators (similar to Rule Based approach) --- Hovy 93 • The discourse plan is generated dynamically • Output is Content Plan Tree

  36. Discourse Detailed view Group nodes Demograph Summary Name Age Care Blood Sugar

  37. Content Plan • Plan Tree Generation • Ordering – of Group nodes • Propositions • Rhetorical relations between leaf nodes • Paragraph and sentence boundaries

  38. Rhetorical Relations ENABLEMENT MOTIVATION MOTIVATION EVIDENCE You should ... I’m in ... The show ... It got a ... You can get ...

  39. Rhetorical Relations Three basic rhetorical relationships: • SEQUENCE • ELABORATION • CONTRAST Others like • Justification • Inference

  40. Nucleus and Satellites Contrast I drive my Maruti 800 Elaboration I love to collect classic cars My favourite car is Toyota Innova N

  41. Target Text The month was cooler and drier than average, with the average number of rain days, but the total rain for the year so far is well below average. Although there was rain on every day for 8 days from 11th to 18th, rainfall amounts were mostly small.

  42. Document Structuring in WeatherReporter The Message Set: MonthlyTempMsg ("cooler than average") MonthlyRainfallMsg ("drier than average") RainyDaysMsg ("average number of rain days") RainSoFarMsg ("well below average") RainSpellMsg ("8 days from 11th to 18th") RainAmountsMsg ("amounts mostly small")

  43. Document Structuring in WeatherReporter (tree diagram) Relations: SEQUENCE, ELABORATION, ELABORATION, CONTRAST, CONTRAST Leaves: MonthlyTempMsg, MonthlyRainfallMsg, RainyDaysMsg, RainSoFarMsg, RainSpellMsg, RainAmountsMsg
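A document plan of this kind can be represented as a tree whose internal nodes carry rhetorical relations and whose leaves are messages. The Python sketch below shows one possible data structure. The exact nesting of the relations is not recoverable from the flattened slide, so the tree built here is an assumed arrangement for illustration only; the class and function names are also hypothetical.

```python
from dataclasses import dataclass
from typing import List, Union

@dataclass
class Message:
    name: str
    content: str

@dataclass
class Relation:
    kind: str  # e.g. SEQUENCE, ELABORATION, CONTRAST
    children: List[Union["Relation", Message]]

def leaves(node):
    """Return the messages of a plan in left-to-right (text) order."""
    if isinstance(node, Message):
        return [node]
    ordered = []
    for child in node.children:
        ordered.extend(leaves(child))
    return ordered

# Assumed nesting; only the message names and contents come from the slides.
plan = Relation("SEQUENCE", [
    Message("MonthlyTempMsg", "cooler than average"),
    Relation("ELABORATION", [
        Message("MonthlyRainfallMsg", "drier than average"),
        Relation("CONTRAST", [
            Message("RainyDaysMsg", "average number of rain days"),
            Message("RainSoFarMsg", "well below average"),
        ]),
    ]),
    Relation("CONTRAST", [
        Message("RainSpellMsg", "8 days from 11th to 18th"),
        Message("RainAmountsMsg", "amounts mostly small"),
    ]),
])

for message in leaves(plan):
    print(message.name, "->", message.content)
```

Reading the leaves from left to right gives the order in which the messages appear in the target text; the relation labels on the internal nodes are what later stages would use to choose connectives such as "but" and "although".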

  44. Some Common RST Relationships • Elaboration: The satellite presents more details about the content of the nucleus • Contrast: The nuclei present things that are similar in some respects but different in some other relevant way • Multinuclear: no distinction between N and S • Purpose: S presents the goal of performing the activity presented in the nucleus • Condition: S presents something that must occur before the situation presented in N can occur • Result: N results from S

  45. Planning Approach Save Document The system saves the document Click Save Button Choose Save option Type Filename Select Folder A dialog box displayed Dialog box closed

  46. Planning Operator • Name: Expand Purpose • Effect: (COMPETENT hearer (DO-ACTION ?action)) • Constraints: (AND (get_all_substeps ?action ?subaction) (NOT (singular list ?subaction))) • Nucleus: (COMPETENT hearer (DO-SEQUENCE ?subaction)) • Satellite: (((RST-PURPOSE (INFORM hearer (DO ?action)))))

  47. Expand Subactions Effect: (COMPETENT hearer (DO-SEQUENCE ?actions)) Constraints: NIL Nucleus: (for each ?actions (RST-SEQUENCE (COMPETENT hearer (DO-ACTION ?actions)))) Satellites: NIL

  48. Purpose Sequence Choose Folder Choose Save Dialog Box Opens Result

  49. Discourse • To save a file: • 1. Choose save option from file menu. A dialog box will appear. • 2. Choose the folder. • 3. Type the file name. • 4. Click the Save button. The system will save the document.
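The way the planning operators lead to this discourse can be imitated with a very small Python sketch. It is a loose analogy to the Expand Purpose and Expand Subactions operators, not an implementation of them: the `SUBSTEPS` table, the `instruct` function and the single-step fallback are assumptions made for this example.

```python
# Hypothetical plan library: action -> list of (substep, visible result or None).
SUBSTEPS = {
    "save a file": [
        ("Choose save option from file menu", "A dialog box will appear"),
        ("Choose the folder", None),
        ("Type the file name", None),
        ("Click the Save button", "The system will save the document"),
    ],
}

def instruct(action):
    """Expand an action into a purpose line followed by numbered substeps.

    Mirrors the constraint that an action is expanded only when it has
    more than one substep; otherwise the action is stated on its own.
    """
    steps = SUBSTEPS.get(action, [])
    if len(steps) < 2:
        return [f"{action.capitalize()}."]
    lines = [f"To {action}:"]  # purpose satellite
    for number, (step, result) in enumerate(steps, start=1):
        line = f"{number}. {step}."
        if result:
            line += f" {result}."  # result attached to the step that causes it
        lines.append(line)
    return lines

print("\n".join(instruct("save a file")))
```

The purpose line plays the role of the satellite, and the numbered list plays the role of the nucleus expanded into a sequence.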

  50. Rhetorical Relations – Difficult to infer John abused the duck. The duck buzzed John. • John abused the duck that had buzzed him • The duck buzzed John who had abused it • The duck buzzed John and he abused it • John abused the duck and it buzzed him
