
Natural Language Processing

Natural Language Processing. Meeting 2, 9/4/2012. CSCE 5290. Rodney Nielsen. We’re going to study what goes into getting computers to perform useful and interesting tasks involving human language.



Presentation Transcript


  1. Natural Language Processing Meeting 2 — 9/4/2012 CSCE 5290 Rodney Nielsen

  2. Natural Language Processing We’re going to study what goes into getting computers to perform useful and interesting tasks involving human language.

  3. Natural Language Processing More specifically, it’s about the algorithms that we use to process human language, the formal basis for those algorithms, and the facts about human language that allow those algorithms to work.

  4. Major Topics • Morphology / Words • Syntax / Structure • Semantics / Meaning • Pragmatics & Dialog / Texts, Context & Implicatures • Applications

  5. How? • Exploiting regularities • Complex and trivial ways • Language structure • Formal models • Practical applications

  6. Topics: Techniques • Finite-state methods • Context-free methods • Probabilistic models • Supervised machine learning methods

  7. Categories of Knowledge • Phonology • Morphology • Syntax • Semantics • Pragmatics • Discourse [Figure: Morphological Processing, Syntactic Analysis, Semantic Interpretation, Context] • Typically mapped to separate processes • Interfaces • Leads to:

  8. Ambiguity • Ambiguity is a fundamental problem in computational linguistics • Hence, resolving, or managing, ambiguity is a recurrent theme

  9. Ambiguity • How many meanings can you find for this sentence: • I made her duck

  10. Ambiguity • Find at least 5 meanings of this sentence: • I made her duck • I cooked waterfowl for her benefit (to eat) • I cooked waterfowl belonging to her • I created the (ceramic?) duck she owns • I caused her to quickly lower her upper body • I waved my magic wand and turned her into undifferentiated waterfowl

  11. Ambiguity is Pervasive • I caused her to quickly lower her head or body • Lexical category: “duck” can be a noun or verb • I cooked waterfowl belonging to her. • Lexical category: “her” can be a possessive (“of her”) or dative (“for her”) pronoun • I made the (ceramic) duck statue she owns • Lexical Semantics: “make” can mean “create” or “cook”, and about 100 other things as well
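The lexical-category ambiguity on this slide can be made concrete with a small sketch. The lexicon below is a hypothetical, hand-written toy (the tag names are assumptions, not taken from the slides); the code only enumerates every category sequence that lexicon allows for "I made her duck".

```python
from itertools import product

# Hypothetical toy lexicon: each word maps to the categories it might take.
TOY_LEXICON = {
    "I": ["PRON"],
    "made": ["VERB"],
    "her": ["POSSESSIVE", "DATIVE"],
    "duck": ["NOUN", "VERB"],
}

def tag_sequences(sentence, lexicon=TOY_LEXICON):
    """Return every combination of categories the lexicon allows."""
    words = sentence.split()
    options = [lexicon[w] for w in words]
    return [list(zip(words, tags)) for tags in product(*options)]

readings = tag_sequences("I made her duck")
for reading in readings:
    print(reading)
```

Two ambiguous words already give 2 × 2 = 4 candidate sequences, before the different senses of "make" or its grammatical frames are considered.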

  12. Ambiguity is Pervasive • Grammar: Make can be: • Transitive: (verb has a noun direct object) • I cooked [waterfowl belonging to her] • Ditransitive: (verb has 2 noun objects) • I made [her] (into) [undifferentiated waterfowl] • Action-transitive (verb has a direct object and another verb) • I caused [her] [to move her body]

  13. Ambiguity is Pervasive • Phonetics! • I mate or duck • I’m eight or duck • Eye maid; her duck • Aye mate, her duck • I maid her duck • I’m aid her duck • I mate her duck • I’m ate her duck • I’m ate or duck • I mate or duck

  14. Problem • Remember our pipeline... [Figure: Morphological Processing, Syntactic Analysis, Semantic Interpretation, Context]

  15. Really it’s this [Figure: one Morphological Processing box with many repeated Syntactic Analysis and Semantic Interpretation boxes]

  16. Or is it? [Figure: one Morphological Processing box with many repeated Syntactic Analysis and Semantic Interpretation boxes]

  17. Dealing with Ambiguity • Four possible approaches: • Tightly coupled • Pipeline • Probabilistic • Or n-best • Don’t do anything, maybe it won’t matter • We’ll leave when the duck is ready to eat. • The duck is ready to eat now.

  18. Models and Algorithms • Models • linguistic knowledge • Algorithms

  19. Models • State machines • Rule-based approaches • Logical formalisms • Probabilistic models

  20. Algorithms • Transducers • But ambiguity…

  21. Paradigms • In particular.. • State-space search • To manage the problem of making choices during processing when we lack the information needed to make the right choice • Dynamic programming • To avoid having to redo work during the course of a state-space search • CKY, Earley, Minimum Edit Distance, Viterbi, Baum-Welch • Classifiers • Machine learning based classifiers that are trained to make decisions based on features extracted from the local context
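As an illustration of the dynamic programming idea on this slide, here is a minimal sketch of minimum edit distance, one of the algorithms the slide names. The function name and the `sub_cost` parameter are assumptions made for this example; insertions and deletions cost 1, and the substitution cost is left adjustable because presentations differ on it.

```python
def min_edit_distance(source, target, sub_cost=1):
    """Fill a table so each sub-problem is solved once and then reused."""
    n, m = len(source), len(target)
    # dist[i][j] = cheapest way to turn source[:i] into target[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i  # delete every character of source[:i]
    for j in range(1, m + 1):
        dist[0][j] = j  # insert every character of target[:j]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = source[i - 1] == target[j - 1]
            dist[i][j] = min(
                dist[i - 1][j] + 1,  # deletion
                dist[i][j - 1] + 1,  # insertion
                dist[i - 1][j - 1] + (0 if same else sub_cost),  # match or substitution
            )
    return dist[n][m]

print(min_edit_distance("intention", "execution"))
```

The table is what avoids redoing work: each cell is computed once from its three neighbours instead of re-exploring every edit sequence in the search space.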

  22. Administrivia • Course web page: • http://www.cse.unt.edu/~nielsen/csce5290/ • Syllabus, readings, slides, assignments, announcements, etc. • E-mail • Office hours – open door • TR 12:20-12:50 • W 2:00-…

  23. Readings • Readings: • Speech and Language Processing by Jurafsky and Martin, 2nd ed., Prentice-Hall, 2009 • A few conference or journal papers

  24. Grading • 5% Reading responses / questions • 30% Quiz / class participation • Question responses (20%) Bring laptops Thur 13th • Discussion (10%) • 45% Semester project • Project proposal (5%) • Project literature review (5%) • Intermediate progress (18%) • Final paper (10%) • Final presentation (7%): Tuesday Dec 11, 10:30-12:30 • 20% Significant constructive peer feedback

  25. Projects • Thesis related • Question Answering • Robotic CSE guide • Other

  26. Introductions • Area of specialization / primary interests

  27. Your Questions • Uncanny valley? • How do we detect sentence boundaries? • Questions about "grep"? • "grep -i" – case insensitive • "grep -v" – inverted search • Lazy regex
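The grep options and the lazy-regex item above can be imitated in a few lines of Python. This is only a sketch of the behaviour using Python's `re` module, not how grep itself is implemented; the helper `grep` and the sample lines are made up for the example.

```python
import re

def grep(pattern, lines, ignore_case=False, invert=False):
    """Keep lines that match (or, with invert=True, lines that do not)."""
    flags = re.IGNORECASE if ignore_case else 0
    rx = re.compile(pattern, flags)
    return [ln for ln in lines if bool(rx.search(ln)) != invert]

sample = ["The duck is ready", "I made her DUCK", "No waterfowl here"]

print(grep("duck", sample))                    # plain search, case sensitive
print(grep("duck", sample, ignore_case=True))  # like grep -i
print(grep("duck", sample, invert=True))       # like grep -v

# Greedy vs. lazy quantifiers: ".+" takes as much as it can, ".+?" as little.
greedy = re.search(r"<.+>", "<a><b>").group()
lazy = re.search(r"<.+?>", "<a><b>").group()
print(greedy, lazy)
```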

  28. Your Questions • Why does the author stress that the results of a Turing machine will not determine whether a computer will ever be intelligent or understand language? (Is he implying that computer learning is impossible, or pointing to the limitations of Turing machines?) • Would regular expressions have any issues handling non-Latin characters (e.g., Mandarin Chinese characters)? • Can a DFSA be converted into an NFSA, or not?
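On the last question: a DFSA is already an NFSA that never offers a choice, so that direction needs no work. The interesting direction is NFSA to DFSA, usually done with the subset construction. The sketch below assumes an NFSA without epsilon moves, stored as a dictionary; the function names and the example machine (strings over a and b that end in "ab") are invented for illustration.

```python
from collections import deque

def nfa_to_dfa(nfa, start, accepting):
    """Subset construction. nfa maps (state, symbol) -> set of next states."""
    alphabet = sorted({sym for (_, sym) in nfa})
    start_set = frozenset([start])
    dfa, dfa_accepting = {}, set()
    queue, seen = deque([start_set]), {start_set}
    while queue:
        current = queue.popleft()
        if current & accepting:  # any accepting NFSA state makes the set accepting
            dfa_accepting.add(current)
        for sym in alphabet:
            nxt = frozenset(t for s in current for t in nfa.get((s, sym), ()))
            if not nxt:
                continue  # no transition: the DFSA simply rejects here
            dfa[(current, sym)] = nxt
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return dfa, start_set, dfa_accepting

def dfa_accepts(dfa, start, accepting, text):
    state = start
    for ch in text:
        state = dfa.get((state, ch))
        if state is None:
            return False
    return state in accepting

# Example NFSA: in state 0, reading "a" may stay in 0 or guess that "ab" has begun.
NFA = {(0, "a"): {0, 1}, (0, "b"): {0}, (1, "b"): {2}}
dfa, dfa_start, dfa_final = nfa_to_dfa(NFA, 0, {2})
print(dfa_accepts(dfa, dfa_start, dfa_final, "aab"))
print(dfa_accepts(dfa, dfa_start, dfa_final, "aba"))
```

Each DFSA state is a set of NFSA states, which is why the deterministic machine can be exponentially larger in the worst case.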

  29. Data: She [the Borg Queen] brought me closer to humanity than I ever thought possible. And for a time, I was tempted by her offer. • Picard: How long a time? • Data: Zero point six-eight seconds, sir... For an android, that is nearly an eternity. • Star Trek: First Contact • http://www.youtube.com/watch?v=kSHytxvDDqU&feature=related 5:20

  30. Your Questions • Regular expressions: all these symbols (., ^, $, etc.) seem to be tied to the Perl language. Will other languages' compilers recognize and process them? • Why is the memory operation (\1 together with ()) considered part of regular expressions when it cannot be realized as a finite automaton?
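Both questions can be explored in a language other than Perl. The sketch below uses Python's standard `re` module, where `.`, `^`, `$` and the backreference `\1` are all supported; the sample sentences are made up. A backreference has to repeat whatever text the group happened to match, and remembering an arbitrary string is more than a fixed, finite set of states can do in general, so this feature goes beyond the regular languages even though regex libraries offer it.

```python
import re

# ., ^ and $ behave here much as they do in Perl-style patterns.
anchored = re.compile(r"^I .* duck$")
print(bool(anchored.search("I made her duck")))
print(bool(anchored.search("I made her duck soup")))

# \1 must repeat exactly the text captured by the first ( ) group.
repeated = re.compile(r"\b(\w+) \1\b")
match = repeated.search("I made her her duck")
print(match.group(1) if match else None)
```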

  31. Your Questions • How do lexical disambiguation and syntactic disambiguation techniques work? • What are probabilistic parsing and speech act interpretation? • What do the Hidden Markov Model, Maximum Entropy Markov Model, and Conditional Random Fields models do? In what respects do they differ from one another?
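As a partial answer to the Hidden Markov Model question, one common use of an HMM is sequence tagging: given words, find the most probable sequence of hidden tags. The sketch below decodes with the Viterbi algorithm. Every probability, tag name, and the function signature are invented toy values for illustration only; they are not estimated from data and are not from the course materials.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable state sequence for obs and its probability."""
    # best[t][s] = (probability of the best path ending in s at time t, previous state)
    best = [{s: (start_p[s] * emit_p[s].get(obs[0], 0.0), None) for s in states}]
    for t in range(1, len(obs)):
        column = {}
        for s in states:
            column[s] = max(
                (best[t - 1][p][0] * trans_p[p][s] * emit_p[s].get(obs[t], 0.0), p)
                for p in states
            )
        best.append(column)
    last = max(states, key=lambda s: best[-1][s][0])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):  # follow back-pointers to the start
        path.append(best[t][path[-1]][1])
    path.reverse()
    return path, best[-1][last][0]

# Toy model (all numbers are made up).
STATES = ["PRON", "VERB", "NOUN"]
START = {"PRON": 0.8, "VERB": 0.1, "NOUN": 0.1}
TRANS = {
    "PRON": {"PRON": 0.1, "VERB": 0.6, "NOUN": 0.3},
    "VERB": {"PRON": 0.6, "VERB": 0.1, "NOUN": 0.3},
    "NOUN": {"PRON": 0.2, "VERB": 0.5, "NOUN": 0.3},
}
EMIT = {
    "PRON": {"I": 0.5, "her": 0.5},
    "VERB": {"made": 0.7, "duck": 0.3},
    "NOUN": {"duck": 1.0},
}

tags, prob = viterbi("I made her duck".split(), STATES, START, TRANS, EMIT)
print(tags, prob)
```

With these toy numbers the model commits to a single reading of "duck"; a different set of probabilities could just as easily prefer the other one, which is the sense in which probabilistic models manage ambiguity rather than remove it.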

  32. Questions
