Human Language Technology

Presentation Transcript


  1. Human Language Technology Overview HLT

  2. Acknowledgement • Material for some of these slides taken from • J. Nivre, University of Gothenburg, Sweden • D. Jurafsky & J. Martin

  3. Human Language Technology • HLT is sometimes referred to as • Natural Language Processing • focus on linguistic processing • Computational Linguistics • focus on understanding language • Language Engineering • focus on practical tasks and results

  4. HLT – Engineering v. Science • Engineering • NLP is concerned with the design and implementation of effective NL input and output components for computational systems (Robert Dale 2000) • Science • The use of computers for linguistic research and applications

  5. HLT is Interdisciplinary • Linguistics • Theoretical • Applied • Computer Science • Algorithms • Compiling Techniques • Artificial Intelligence • Understanding, reasoning • Intelligent Action

  6. HLT is Commercial • Lots of exciting stuff going on… • Powerset

  7. Google Translate

  8. Google Translate

  9. Web Q/A

  10. Web Analytics • Data-mining of social media • weblogs, discussion forums, message boards, user groups, and other forms of user generated media • Sentiment analysis, social network analysis • Product marketing information • Opinion tracking over space and time • Social network analysis • Buzz analysis (what’s hot, what topics are people talking about right now).

  11. HLT can help with • Understanding how language works • by implementing complex theories directly • More Natural Communication • development of multimodal M/M communication: language, speech, gesture • Development of multilingual applications • Knowledge Management • Language is the fabric of the web

  12. Language Enabled Applications • What makes an application a language processing application (as opposed to any other piece of software)? • An application that requires the use of knowledge about human languages • Example: Is Unix wc (word count) an example of a language processing application?

  13. Language Enabled Applications • Word count? • When it counts words: Yes • To count words you need to know what a word is. That’s knowledge of language. • When it counts lines and bytes: No • Lines and bytes are computer artifacts, not linguistic entities
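
The wc point can be made concrete with a small sketch. The function and variable names below are illustrative assumptions, not part of the slides. Counting lines and bytes needs no linguistic knowledge, whereas counting words forces a choice about what a word is: splitting on whitespace (roughly what wc does) and splitting off punctuation give different answers for the same text.

```python
import re

def wc_counts(text: str) -> dict:
    """Rough analogue of Unix wc: lines, whitespace-delimited words, bytes."""
    return {
        "lines": text.count("\n"),            # newline characters: a computer artifact
        "words": len(text.split()),           # "word" = maximal run of non-whitespace
        "bytes": len(text.encode("utf-8")),   # encoding-dependent: a computer artifact
    }

def tokens_splitting_punctuation(text: str) -> list:
    """A different decision about what counts as a token:
    words (with an optional internal apostrophe) and single punctuation marks."""
    return re.findall(r"\w+(?:'\w+)?|[^\w\s]", text)

sample = "Open the pod bay doors, Hal.\nI'm sorry Dave, I'm afraid I can't do that.\n"
counts = wc_counts(sample)
tokens_with_punct = tokens_splitting_punctuation(sample)

print(counts["words"], len(tokens_with_punct))
```

The two word counts differ only because the two functions encode different assumptions about word boundaries, which is exactly the knowledge of language the slide refers to.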

  14. Topics: Applications • Small • Spelling correction • Hyphenation • Medium • Word-sense disambiguation • Named entity recognition • Information retrieval • Big • Question answering • Conversational agents • Automatic Summarisation • Machine translation • Stand-alone • Enabling applications • Funding/Business plans

  15. Big Applications • These kinds of applications require a tremendous amount of knowledge of language. • Consider the following interaction with HAL the computer from 2001: A Space Odyssey

  16. HAL from 2001 • Dave: Open the pod bay doors, Hal. • HAL: I’m sorry Dave, I’m afraid I can’t do that. • http://www.youtube.com/watch?v=kkyUMmNl4hk

  17. What’s needed? • Speech recognition and synthesis • Knowledge of the English words involved • What they mean • How words fit together into groups • What the groups mean • How the groups relate to each other.

  18. What’s needed? • Dialog • It is polite to respond, even if you’re planning to kill someone. • It is polite to pretend to want to be cooperative (I’m afraid, I can’t…)

  19. Summary of Application Areas • Document Processing • Classification • Summarisation • Information Extraction • Question Answering • Information Retrieval • Dialogue • Multilinguality • Machine Translation • Translation tools • Multimodality • speech • intonation • image

  20. Basic Problems • Analysis • Conversion of NL input to internal representations • Generation • Conversion of internal representations to NL output • Issues • What kind of input/output/representations? • Role of learning • Supervised v unsupervised • What training data is available? • System Evaluation

  21. Levels of Linguistic Knowledge • Phonetics/Phonology: sound structure • Morphology: word structure • Syntax: sentence structure • Semantics: meanings • Pragmatics: use of language in context • Discourse: paragraphs, texts, dialogues

  22. Processing Pipelines • Each level of knowledge is associated with an encapsulated set of processes. • Interfaces are defined that allow the various levels to communicate. • This often leads to a pipeline architecture.
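
A pipeline architecture of this kind might be sketched as follows. This is a minimal illustration under stated assumptions: the stage names, the toy lexicon, and the dictionary used as the interface between stages are all hypothetical and not taken from the slides. Each stage is encapsulated, reads only what earlier stages produced, and adds its own layer of annotation.

```python
# Hypothetical toy lexicon: each word gets exactly one category.
PIPELINE_LEXICON = {"i": "PRON", "made": "VERB", "her": "PRON", "duck": "NOUN"}

def tokenize_stage(doc: dict) -> dict:
    """Word level: split the raw text into lower-cased tokens."""
    tokens = doc["text"].lower().rstrip(".").split()
    return {**doc, "tokens": tokens}

def tag_stage(doc: dict) -> dict:
    """Category level: look each token up in the toy lexicon."""
    tags = [PIPELINE_LEXICON.get(tok, "UNK") for tok in doc["tokens"]]
    return {**doc, "tags": tags}

def run_pipeline(text: str, stages: list) -> dict:
    """Pass a document through the stages in order.
    The shared dict is the interface between levels."""
    doc = {"text": text}
    for stage in stages:
        doc = stage(doc)
    return doc

result = run_pipeline("I made her duck.", [tokenize_stage, tag_stage])
print(result["tokens"], result["tags"])
```

Note that the tagging stage commits to a single category for "duck"; a later stage has no way to recover the verb reading, which is one reason ambiguity is awkward for strict pipelines.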

  23. Ambiguity • Computational linguists are obsessed with ambiguity • Ambiguity is a fundamental problem of computational linguistics • Resolving ambiguity is a crucial goal • Ambiguity arises at different levels of analysis

  24. Ambiguity – different flavours • Lexical: I made her duck • Syntactic: Young men and women • Referential: She did it • Pragmatic: Can you pass the salt?

  25. Ambiguity • Find at least 5 meanings of this sentence: • I made her duck • I cooked waterfowl for her benefit (to eat) • I cooked waterfowl belonging to her • I created the (plaster?) duck she owns • I caused her to quickly lower her head or body • I waved my magic wand and turned her into undifferentiated waterfowl

  26. Ambiguity is Pervasive • I made her duck • I caused her to quickly lower her head or body • Lexical category: “duck” can be a N or V • I cooked waterfowl belonging to her. • Lexical category: “her” can be a possessive (“of her”) or dative (“for her”) pronoun • I made the (plaster) duck statue she owns • Lexical semantics: “make” can mean “create” or “cook”

  27. Ambiguity is Pervasive • Grammar: Make can be: • Transitive: (verb has a noun direct object) • I cooked [waterfowl belonging to her] • Ditransitive: (verb has 2 noun objects) • I made [her] (into) [undifferentiated waterfowl] • Action-transitive (verb has a direct object and another verb) • I caused [her] [to move her body]

  28. Ambiguity is Pervasive • Phonetics! • I mate or duck • I’m eight or duck • Eye maid; her duck • Aye mate, her duck • I maid her duck • I’m aid her duck • I mate her duck • I’m ate her duck • I’m ate or duck • I mate or duck

  29. Dealing with Ambiguity • Four possible approaches: • Tightly coupled interaction among processing levels; knowledge from other levels can help decide among choices at ambiguous levels. • Pipeline processing that ignores ambiguity as it occurs and hopes that other levels can eliminate incorrect structures.

  30. Dealing with Ambiguity • Probabilistic approaches based on making the most likely choices • Don’t do anything, maybe it won’t matter • We’ll leave when the duck is ready to eat. • The duck is ready to eat now. • Does the “duck” ambiguity matter with respect to whether we can leave?
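
The way lexical category ambiguity multiplies can be sketched in a few lines. The lexicon and category labels below are illustrative assumptions based on the categories named on the slides (N or V for "duck", possessive or dative for "her"); they are not a real tagset.

```python
from itertools import product

# Hypothetical lexicon: each word maps to every category it may take.
AMBIGUOUS_LEXICON = {
    "I": ["PRON"],
    "made": ["VERB"],
    "her": ["POSS", "DAT"],    # possessive ("of her") or dative ("for her")
    "duck": ["NOUN", "VERB"],  # waterfowl or the action of lowering the head
}

def category_sequences(tokens: list) -> list:
    """Enumerate every combination of lexical categories for the tokens."""
    options = [AMBIGUOUS_LEXICON[tok] for tok in tokens]
    return [list(zip(tokens, combo)) for combo in product(*options)]

readings = category_sequences("I made her duck".split())
for reading in readings:
    print(reading)
```

Two two-way choices already give four candidate sequences, before the different senses of "make" and its different argument structures are taken into account. Not every combination is grammatical, so a grammar or a later processing level would have to filter them.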
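
The probabilistic approach of "making the most likely choice" can be illustrated with a most-frequent-category baseline. This is only a sketch: the tiny tagged sample below is invented for illustration and is not real corpus data, and the function names are hypothetical.

```python
from collections import Counter, defaultdict

# Invented toy sample of (word, category) pairs; not real corpus data.
TOY_TAGGED_SAMPLE = [
    ("the", "DET"), ("duck", "NOUN"), ("is", "VERB"), ("ready", "ADJ"),
    ("a", "DET"), ("duck", "NOUN"), ("swam", "VERB"),
    ("they", "PRON"), ("duck", "VERB"), ("quickly", "ADV"),
    ("her", "POSS"), ("duck", "NOUN"), ("slept", "VERB"),
]

def count_categories(tagged: list) -> dict:
    """Count how often each word occurs with each category."""
    counts = defaultdict(Counter)
    for word, category in tagged:
        counts[word][category] += 1
    return counts

def most_likely_category(counts: dict, word: str):
    """Return the most frequent category for a word, or None if unseen."""
    if word not in counts:
        return None
    return counts[word].most_common(1)[0][0]

def category_probability(counts: dict, word: str, category: str) -> float:
    """Relative-frequency estimate of P(category | word)."""
    total = sum(counts[word].values())
    return counts[word][category] / total

category_counts = count_categories(TOY_TAGGED_SAMPLE)
best_for_duck = most_likely_category(category_counts, "duck")
p_noun = category_probability(category_counts, "duck", "NOUN")
print(best_for_duck, p_noun)
```

In this sample "duck" is a noun in three of its four occurrences, so the baseline always picks the noun reading. It ignores context entirely, which is why practical systems condition on neighbouring words as well.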

  31. Ways of Studying NLP • By Application: MT, IE, IR etc. • By Approach: rational vs. empirical • By Linguistic Level: morphology, syntax etc. • By Algorithm

  32. Algorithms • State Machines • automata and transducers • Rule Systems • regular and context-free grammars • Search • top-down/bottom-up parsing • Probabilistic algorithms
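
As an illustration of the state-machine family, here is a minimal sketch of a table-driven deterministic finite-state automaton. The toy language it accepts ("b", then two or more "a", then "!") and all names are assumptions chosen for illustration, not material from the slides.

```python
# Transition table: (current state, input symbol) -> next state.
# Any pair missing from the table means the machine rejects.
SHEEP_TRANSITIONS = {
    (0, "b"): 1,
    (1, "a"): 2,
    (2, "a"): 3,
    (3, "a"): 3,   # loop: any number of further "a"s
    (3, "!"): 4,
}
SHEEP_ACCEPTING = {4}

def dfa_accepts(string: str) -> bool:
    """Run the automaton over the string, one symbol at a time."""
    state = 0
    for symbol in string:
        state = SHEEP_TRANSITIONS.get((state, symbol))
        if state is None:          # no transition defined: reject
            return False
    return state in SHEEP_ACCEPTING

for candidate in ["baa!", "baaaa!", "ba!", "baa", "abaa!"]:
    print(candidate, dfa_accepts(candidate))
```

The same language is described by the regular expression `baa+!`, which reflects the correspondence between finite-state automata and regular grammars listed on the slide.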

  33. Organisation of Course • Module 1: Words • Linguistics: Morphological Structure • Morphological Processing • LAB + Assignment I • Module 2: Sentences • Linguistics: Syntactic Structure • NL Parsing Algorithms • LAB + Assignment II • Module 3: Texts • Statistics • Text Classification • LAB + Assignment III

  34. Course Information • Course Website: http://staff.um.edu.mt/mros1/hlt • Reference Texts • D. Jurafsky and J. Martin, Speech and Language Processing, 2nd Edition, Prentice-Hall • S. Bird, E. Klein and E. Loper, Natural Language Processing with Python, http://www.nltk.org • Thank you
