
Natural Language Processing



Presentation Transcript


  1. Natural Language Processing Lecture 17—10/29/2013 Jim Martin

  2. Today • Finish Statistical CFG Parsing • Dependency parsing • Dependency trees • Basic transition-based parsing • Machine learning Speech and Language Processing - Jurafsky and Martin

  3. Simple Probability Model • A derivation (tree) consists of the bag of grammar rules that are in the tree • The probability of a tree is the product of the probabilities of the rules in the derivation. Speech and Language Processing - Jurafsky and Martin
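In symbols (my notation, not on the slide), this is the standard PCFG assumption: the probability of a parse is the product, over every rule application in the derivation, of that rule's probability given its left-hand side:

$$P(T, S) = \prod_{A \rightarrow \beta \,\in\, T} P(A \rightarrow \beta \mid A)$$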

  4. Improved Approaches • The simple model's independence assumptions are too strong: a rule's probability ignores both the surrounding tree and the words involved • There are two approaches to overcoming these shortcomings • Rewrite the grammar to better capture the dependencies among rules • Integrate lexical dependencies into the model • And come up with the independence assumptions needed to make that work Speech and Language Processing - Jurafsky and Martin

  5. Solution 2: Lexicalized Grammars • Lexicalize the grammars with heads • Compute the rule probabilities on these lexicalized rules • Run Prob CKY as before Speech and Language Processing - Jurafsky and Martin

  6. Dumped Example Speech and Language Processing - Jurafsky and Martin

  7. Declare Independence • When stuck, exploit independence and collect the statistics you can… • There are a large number of ways to do this... • Let’s consider one generative story: given a rule we’ll • Generate the head • Generate the stuff to the left of the head • Generate the stuff to the right of the head Speech and Language Processing - Jurafsky and Martin

  8. Example • That is, the probability of a lexicalized rule is estimated using that generative story; a sketch of the decomposition follows Speech and Language Processing - Jurafsky and Martin
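The rule and formula on this slide were in an image that did not come through. A hedged reconstruction, following the generative story just described (generate the head child, then the material to its left, then the material to its right, each side terminated by STOP), for a rule like VP(dumped) → VBD(dumped) NP(sacks) PP(into); the exact conditioning used in the lecture may differ:

$$\begin{aligned}
P\bigl(VP(\textit{dumped}) \rightarrow VBD(\textit{dumped})\ NP(\textit{sacks})\ PP(\textit{into})\bigr) \approx{}& P_H(VBD \mid VP, \textit{dumped})\\
\times{}& P_L(\mathrm{STOP} \mid VP, VBD, \textit{dumped})\\
\times{}& P_R(NP(\textit{sacks}) \mid VP, VBD, \textit{dumped})\\
\times{}& P_R(PP(\textit{into}) \mid VP, VBD, \textit{dumped})\\
\times{}& P_R(\mathrm{STOP} \mid VP, VBD, \textit{dumped})
\end{aligned}$$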

  9. Dependency Parse • Sentence: ROOT I booked a morning flight • Relations (head, dependent): (booked, I), (booked, flight), (flight, a), (flight, morning)

  10. Tree Constraints • Words can only have one head • One incoming arc • Every word has to have a head • Result is a tree • There’s a path from the root to each word • There’s only one path from the root to any word • These are the formal constraints on dependency trees. For any given sentence there will be lots of such trees, most of which are nonsense. Speech and Language Processing - Jurafsky and Martin
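A minimal sketch (not from the slides) of checking these constraints over a set of (head, dependent) arcs; representing words as plain strings and adding an explicit (root, booked) arc are my assumptions:

```python
def is_valid_tree(words, arcs, root="root"):
    """Check the formal dependency-tree constraints: every word has exactly
    one head, and every word reaches the root by following head links
    (no cycles), so there is exactly one path from the root to each word."""
    heads = {}
    for head, dep in arcs:
        if dep in heads:              # a word with two heads is not allowed
            return False
        heads[dep] = head
    for w in words:
        if w not in heads:            # every word must have a head
            return False
        seen, node = set(), w
        while node != root:           # follow head links up toward the root
            if node in seen:          # a cycle means no path to the root
                return False
            seen.add(node)
            node = heads[node]
    return True

# The running example; the (root, booked) arc is added so the main verb has a head
words = ["I", "booked", "a", "morning", "flight"]
arcs = [("root", "booked"), ("booked", "I"), ("booked", "flight"),
        ("flight", "a"), ("flight", "morning")]
print(is_valid_tree(words, arcs))     # True
```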

  11. Dependency Grammar • The linguistic constraints underlying “correct trees” are usually called a dependency grammar • Which may or may not correspond to an explicit formal generative grammar of the kind we’ve been using • The parsing technique discussed today doesn’t use an explicitly represented grammar Speech and Language Processing - Jurafsky and Martin

  12. Transition-Based Parsing • Transition-based parsing is a greedy word-by-word approach to parsing • A single dependency tree is built up an arc at a time as we move left to right through a sentence • No backtracking • A classifier is used to make decisions as we move through the sentence Speech and Language Processing - Jurafsky and Martin

  13. Dependency Parse I booked a morning flight.

  14. Transition-Based Parsing • We can (again) view this as a search space through a set of states for a state that contains what we want • In the standard notation a state consists of three elements • A stack representing partially processed words • A list containing the remaining words to be processed • A set containing the relations discovered so far Speech and Language Processing - Jurafsky and Martin

  15. States • So the start state looks like • [[root], [sentence], ()] • A valid final state looks like • [[root], [], (R)] • Where R is the set of relations we’ve discovered; the empty [] represents the fact that all the words in the sentence have been accounted for Speech and Language Processing - Jurafsky and Martin

  16. Example • Here’s our example • Start • [[root], [I booked a morning flight], ()] • End • [[root], [], ((booked, I) (booked, flight) (flight, a) (flight, morning))] Speech and Language Processing - Jurafsky and Martin
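The same start and end states in a minimal Python encoding (the representation is my assumption; the top of the stack is the last list element):

```python
# A state is a triple (stack, words, relations):
#   stack     - partially processed words; the top of the stack is the last element
#   words     - the remaining input words, left to right
#   relations - the set of (head, dependent) arcs discovered so far
start = (["root"], ["I", "booked", "a", "morning", "flight"], set())

final = (["root"], [],
         {("booked", "I"), ("booked", "flight"),
          ("flight", "a"), ("flight", "morning")})
```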

  17. Parsing • The parsing problem is how to get from the start state to the final state • To begin, we’ll define a set of three basic operators that take a state and produce a new state • Left • Right • Shift Speech and Language Processing - Jurafsky and Martin

  18. Shift • Shift takes the next word to be processed, pushes it onto the stack, and removes it from the word list • So a shift for our example at the start looks like this: [[root], [I booked a morning flight], ()] → [[root, I], [booked a morning flight], ()] Speech and Language Processing - Jurafsky and Martin

  19. Left • The Left operator • Adds relation (a, b) to the set of relations where • a is the first word on the word list • b is the word at the top of the stack • Pops the stack • So for our current state: [[root, I], [booked a morning flight], ()] → [[root], [booked a morning flight], (booked, I)] Speech and Language Processing - Jurafsky and Martin

  20. Right • The Right operator • Adds (b, a) to the set of relations • Where b and a are the same as before: a is the first word in the remainder list, and b is the top of the stack • Removes the first word from the remainder list • Pops the stack and places the popped item back at the front of the remaining word list Speech and Language Processing - Jurafsky and Martin
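A sketch of the three operators over the (stack, words, relations) state, following the definitions on the last three slides; the in-place list manipulation is my choice, not the lecture's:

```python
def shift(state):
    """Push the next word from the word list onto the stack."""
    stack, words, relations = state
    stack.append(words.pop(0))

def left(state):
    """Add (a, b): a = first word on the word list (head),
    b = top of the stack (dependent). Then pop the stack."""
    stack, words, relations = state
    relations.add((words[0], stack.pop()))

def right(state):
    """Add (b, a): b = top of the stack (head), a = first word on the
    word list (dependent). Remove a from the list, pop b, and put b
    back at the front of the word list."""
    stack, words, relations = state
    a = words.pop(0)
    b = stack.pop()
    relations.add((b, a))
    words.insert(0, b)

# The first two steps of the running example
state = (["root"], ["I", "booked", "a", "morning", "flight"], set())
shift(state)   # [[root, I], [booked a morning flight], ()]
left(state)    # [[root], [booked a morning flight], {(booked, I)}]
```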

  21. Example Speech and Language Processing - Jurafsky and Martin

  22. Example Speech and Language Processing - Jurafsky and Martin

  23. Example Speech and Language Processing - Jurafsky and Martin

  24. Example Speech and Language Processing - Jurafsky and Martin

  25. Example Speech and Language Processing - Jurafsky and Martin

  26. Example Speech and Language Processing - Jurafsky and Martin

  27. Example Speech and Language Processing - Jurafsky and Martin

  28. Example Speech and Language Processing - Jurafsky and Martin

  29. Example Speech and Language Processing - Jurafsky and Martin

  30. Example Speech and Language Processing - Jurafsky and Martin

  31. Example Speech and Language Processing - Jurafsky and Martin

  32. Example Speech and Language Processing - Jurafsky and Martin

  33. Example Speech and Language Processing - Jurafsky and Martin

  34. Example Speech and Language Processing - Jurafsky and Martin

  35. Two Problems • First, we really want labeled relations • That is, we want things like subject, direct object, indirect object, etc. as relations • Second, how did we know which operator (L, R, S) to invoke at each step along the way? • Since we’re not backtracking, one wrong step and we won’t get the tree we want • How do we even know what tree we want? • Well, we could add backtracking... Speech and Language Processing - Jurafsky and Martin

  36. Grammatical Relations • Well, to handle this we can just add new transitions • Essentially replace Left and Right with {Left, Right} × {all the relations of interest}, as sketched below • Note that this isn’t going to make the second problem any easier to deal with Speech and Language Processing - Jurafsky and Martin
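A small sketch of that cross product (the label inventory here is illustrative, not the lecture's exact set):

```python
RELATIONS = ["subj", "obj", "iobj", "det", "nmod"]    # illustrative label set

TRANSITIONS = (["shift"]
               + ["left_" + r for r in RELATIONS]
               + ["right_" + r for r in RELATIONS])
# ['shift', 'left_subj', 'left_obj', ..., 'right_nmod']
```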

  37. Example Speech and Language Processing - Jurafsky and Martin

  38. Example Speech and Language Processing - Jurafsky and Martin

  39. Example Speech and Language Processing - Jurafsky and Martin

  40. Example Speech and Language Processing - Jurafsky and Martin

  41. Example Speech and Language Processing - Jurafsky and Martin

  42. Making Choices • Method 1 • Use a set of rules that choose an operator based on features of the current state • As in, if the word at the top of the stack is “I” and the rest of the stack is just “root” and the word at the front of the word list is “booked”, then invoke Left_Subj Speech and Language Processing - Jurafsky and Martin

  43. Making Choices • Method 1 • Use a set of rules that choose an operator based on features of the current state • As in, if there’s a pronoun at the top of the stack and the rest of the stack is just root and there’s a verb at the front of the word list, then invoke Left_Subj Speech and Language Processing - Jurafsky and Martin

  44. Making Choices • Method 2 • Use supervised machine learning (ML) to train a classifier to choose among the available operators • Based on features derived from the states • Then use that classifier to make the right choices Speech and Language Processing - Jurafsky and Martin
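A sketch of how such a classifier slots into the greedy parsing loop; the features function, the classifier object and its predict interface, and the transitions table are all assumptions for illustration:

```python
def parse(sentence, classifier, features, transitions):
    """Greedy transition-based parsing: at each state, ask the classifier
    for a transition name, apply it, and never backtrack."""
    state = (["root"], list(sentence), set())
    stack, words, relations = state
    while words:                                      # until the word list is empty
        name = classifier.predict(features(state))    # e.g. 'shift', 'left_subj', ...
        transitions[name](state)                      # apply the chosen operator
    return relations
```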

  45. Example Speech and Language Processing - Jurafsky and Martin

  46. Three Problems • To apply ML in situations like this we have three problems • Discovering features that are useful indicators of what to do in any situation • Characteristics of the state we’re in • Acquiring the necessary training data • Treebanks • Training Speech and Language Processing - Jurafsky and Martin

  47. Three Problems: Features • Features are typically described along two dimensions in this style of parsing • Position in the state (aka configuration) • Position in the stack, position in the word list, location in the partial tree • Attributes of particular locations or attributes of tuples of locations • Part of speech of the top of the stack, POS of the third word in the remainder list, lemmas, last three letters • Head word of a word, number of relations already attached to a word, does the word already have a SUBJ relation, etc. Speech and Language Processing - Jurafsky and Martin
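A minimal sketch of features along those two dimensions, assuming each word is a dict with 'form' and 'pos' fields (the feature set used in practice is much richer):

```python
def features(state):
    """Each feature names a position in the configuration (top of the stack,
    front of the word list, the partial tree) plus an attribute of it."""
    stack, words, relations = state
    feats = {}
    # assumes root is a dummy word such as {"form": "root", "pos": "ROOT"}
    if stack:
        feats["stack0.form"] = stack[-1]["form"]
        feats["stack0.pos"] = stack[-1]["pos"]
    if words:
        feats["words0.form"] = words[0]["form"]
        feats["words0.pos"] = words[0]["pos"]
    if stack and words:                               # a tuple-of-locations feature
        feats["stack0.pos+words0.pos"] = stack[-1]["pos"] + "+" + words[0]["pos"]
    feats["n.relations"] = len(relations)             # attribute of the partial tree
    return feats
```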

  48. Three Problems: Data • Training data • Get a treebank • Directly as a dependency treebank • Or derived from a phrase-structure treebank Speech and Language Processing - Jurafsky and Martin

  49. Three Problems: Training • This is tricky • Our treebanks associate sentences with their corresponding trees • We need parser states paired with their corresponding correct operators (never going to get this directly) • But we do know the correct trees • So.... Speech and Language Processing - Jurafsky and Martin

  50. Three Problems: Training • We’ll parse with our standard algorithm, asking an oracle which operator to use at any given time • The oracle has access to the correct tree for the sentence. At each stage it chooses, as in a case statement: • Left if the resulting relation is in the correct tree • Right if the resulting relation is in the correct tree AND all the other outgoing relations of the word being attached (the dependent) are already in the relation list • Otherwise, Shift Speech and Language Processing - Jurafsky and Martin
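That policy as a sketch, assuming gold is the set of (head, dependent) arcs of the correct tree (the names and the string transition labels are mine):

```python
def oracle(state, gold):
    """Training-time choice of operator, made by peeking at the correct tree."""
    stack, words, relations = state
    if stack and words:
        b, a = stack[-1], words[0]            # top of stack, front of word list
        if (a, b) in gold:                    # Left would add a correct arc
            return "left"
        if (b, a) in gold:                    # Right would add a correct arc...
            deps_of_a = {(h, d) for (h, d) in gold if h == a}
            if deps_of_a <= relations:        # ...and a already has all its dependents
                return "right"
    return "shift"
```

Pairing each state visited during this simulated parse with the oracle's choice yields exactly the state/operator training examples the classifier needs.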
