
Chapter 12 Lexicalized and Probabilistic Parsing

Presentation Transcript


  1. Chapter 12: Lexicalized and Probabilistic Parsing Guoqiang Shan, University of Arizona, November 30, 2006

  2. Outline • Probabilistic Context-Free Grammars • Probabilistic CYK Parsing • PCFG Problems

  3. Probabilistic Context-Free Grammars • Intuition • To find the "correct" parse for ambiguous sentences • e.g. can you book TWA flights? • e.g. the flights include a book • Definition of Context-Free Grammar • 4-tuple G = (N, Σ, P, S) • N: a finite set of non-terminal symbols • Σ: a finite set of terminal symbols, where N ∩ Σ = ∅ • P: a finite set of productions A → β, where A is in N and β is in (N ∪ Σ)* • S: the start symbol, a member of N • Definition of Probabilistic Context-Free Grammar • 5-tuple G = (N, Σ, P, S, D) • D: a function P → [0, 1] assigning a probability to each rule in P • Rules are written A → β [p], where p = D(A → β) • e.g. A → a B [0.6], B → C D [0.3]

  4. PCFG Example
  Lexical rules:
  Det → that [.05]  Det → the [.80]  Det → a [.15]
  Noun → book [.10]  Noun → flights [.50]  Noun → meal [.40]
  Verb → book [.30]  Verb → include [.30]  Verb → want [.40]
  Aux → can [.40]  Aux → does [.30]  Aux → do [.30]
  ProperN → TWA [.40]  ProperN → Denver [.60]
  Pronoun → you [.40]  Pronoun → I [.60]
  Syntactic rules:
  S → NP VP [.80]  S → Aux NP VP [.15]  S → VP [.05]
  NP → Det Nom [.20]  NP → ProperN [.35]  NP → Nom [.05]  NP → Pronoun [.40]
  Nom → Noun [.75]  Nom → Noun Nom [.20]  Nom → ProperN Nom [.05]
  VP → Verb [.55]  VP → Verb NP [.40]  VP → Verb NP NP [.05]
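A quick sanity check one can run on this grammar (a sketch of my own, not part of the slides; the tuple encoding is an arbitrary choice): since D must define a probability distribution over the expansions of each non-terminal, the rule probabilities for every left-hand side should sum to 1.

    from collections import defaultdict

    # (lhs, rhs, probability) triples for part of the grammar on slide 4;
    # the remaining rules can be added the same way.
    GRAMMAR = [
        ("Det", ("that",), 0.05), ("Det", ("the",), 0.80), ("Det", ("a",), 0.15),
        ("Noun", ("book",), 0.10), ("Noun", ("flights",), 0.50), ("Noun", ("meal",), 0.40),
        ("S", ("NP", "VP"), 0.80), ("S", ("Aux", "NP", "VP"), 0.15), ("S", ("VP",), 0.05),
        ("VP", ("Verb",), 0.55), ("VP", ("Verb", "NP"), 0.40), ("VP", ("Verb", "NP", "NP"), 0.05),
    ]

    totals = defaultdict(float)
    for lhs, rhs, prob in GRAMMAR:
        totals[lhs] += prob

    for lhs, total in totals.items():
        assert abs(total - 1.0) < 1e-9, f"rules for {lhs} sum to {total}, not 1"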

  5. Probability of a Sentence in a PCFG • Probability of a parse tree T of S • P(T, S) = Π_{n ∈ T} D(r(n)) • T is the parse tree and S is the sentence being parsed • n ranges over the internal nodes of T, and r(n) is the rule used to expand n • Relation between tree and sentence probability • P(T, S) = P(T) · P(S|T) • A parse tree T uniquely determines a sentence S, so P(S|T) = 1 • Hence P(T) = P(T, S) • Probability of a sentence • P(S) = Σ_{T ∈ τ(S)} P(T), where τ(S) is the set of all parse trees of S • In particular, for an unambiguous sentence, P(S) = P(T)

  6. Example • For the two parses of can you book TWA flights (T_l: book [TWA] [flights], two separate NPs; T_r: book [TWA flights], one NP) • P(T_l) = 0.15 × 0.40 × 0.05 × 0.05 × 0.35 × 0.75 × 0.40 × 0.40 × 0.30 × 0.40 × 0.50 = 3.78 × 10⁻⁷ • P(T_r) = 0.15 × 0.40 × 0.40 × 0.05 × 0.05 × 0.75 × 0.40 × 0.40 × 0.30 × 0.40 × 0.50 = 4.32 × 10⁻⁷
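These numbers can be reproduced mechanically. Below is a minimal sketch (my own; the nested-tuple tree encoding and the name tree_prob are not from the slides) that multiplies the rule probabilities down T_r, the parse in which TWA flights forms a single NP:

    # Rule probabilities from slide 4, keyed by (lhs, rhs).
    RULE_PROB = {
        ("S", ("Aux", "NP", "VP")): 0.15,
        ("Aux", ("can",)): 0.40,
        ("NP", ("Pronoun",)): 0.40,
        ("Pronoun", ("you",)): 0.40,
        ("VP", ("Verb", "NP")): 0.40,
        ("Verb", ("book",)): 0.30,
        ("NP", ("Nom",)): 0.05,
        ("Nom", ("ProperN", "Nom")): 0.05,
        ("ProperN", ("TWA",)): 0.40,
        ("Nom", ("Noun",)): 0.75,
        ("Noun", ("flights",)): 0.50,
    }

    def tree_prob(tree):
        """P(T): multiply the probability of the rule at each internal node."""
        label, children = tree[0], tree[1:]
        if len(children) == 1 and isinstance(children[0], str):
            return RULE_PROB[(label, (children[0],))]   # lexical rule A -> w
        p = RULE_PROB[(label, tuple(c[0] for c in children))]
        for c in children:
            p *= tree_prob(c)
        return p

    # T_r: "can you book TWA flights", with "TWA flights" as one NP
    t_r = ("S",
           ("Aux", "can"),
           ("NP", ("Pronoun", "you")),
           ("VP",
            ("Verb", "book"),
            ("NP",
             ("Nom",
              ("ProperN", "TWA"),
              ("Nom", ("Noun", "flights"))))))

    print(tree_prob(t_r))   # ≈ 4.32e-07, matching the slide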

  7. Probabilistic CYK Parsing of PCFG • Bottom-up approach • Dynamic programming: fill tables of partial solutions to the sub-problems until they contain all the solutions to the entire problem • Input • Grammar in CNF: ε-free, with each production of the form A → B C or A → w • n words w1, w2, …, wn • Data structures • Π[i, j, A]: the maximum probability of a constituent with non-terminal A spanning j words starting at wi • β[i, j, A] = {k, B, C}: the best split, where A → B C was used and B spans the first k words from wi (kept for rebuilding the parse tree) • Output • The probability of the maximum-probability parse is Π[1, n, 1], where index 1 names the start symbol S, whose constituent spans the entire string

  8. CYK Algorithm • Base case • Input spans of length one: Π[i, 1, A] = D(A → wi) for each rule A → wi • Recursive case • For a span of j > 1 words starting at wi, A can derive the span if • there exist a rule A → B C and a split point k, 0 < k < j, such that • B derives the first k words of the span (already known) • C derives the remaining j − k words, starting at w(i+k) (already known) • Compute the probability of the span as D(A → B C) × Π[i, k, B] × Π[i+k, j−k, C] • If more than one rule A → B C (or split point k) applies, pick the one that maximizes the probability, and record {k, B, C} in β[i, j, A] • A code sketch follows below. My implementation is on lectura under /home/shan/538share/pcyk.c
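The procedure above can be rendered directly in code. The following is a minimal Python sketch (an illustration of my own, not the pcyk.c implementation mentioned on the slide; dict-of-dict tables stand in for the Π and β arrays):

    def pcyk(words, lexical, binary, start="S"):
        """lexical: {(A, w): prob} for rules A -> w;
        binary: {(A, B, C): prob} for rules A -> B C."""
        n = len(words)
        # pi[i][j][A]: best probability of A spanning j words starting at
        # word i (i is 0-based here; the slides use 1-based indices).
        pi = [[{} for _ in range(n + 1)] for _ in range(n)]
        beta = [[{} for _ in range(n + 1)] for _ in range(n)]

        # Base case: spans of length one, filled by the rules A -> w_i.
        for i, w in enumerate(words):
            for (A, word), p in lexical.items():
                if word == w and p > pi[i][1].get(A, 0.0):
                    pi[i][1][A] = p

        # Recursive case: split a span of length j into B (k words) and C (j - k).
        for j in range(2, n + 1):                # span length
            for i in range(n - j + 1):           # span start
                for (A, B, C), p in binary.items():
                    for k in range(1, j):        # split point
                        if B in pi[i][k] and C in pi[i + k][j - k]:
                            cand = p * pi[i][k][B] * pi[i + k][j - k][C]
                            if cand > pi[i][j].get(A, 0.0):
                                pi[i][j][A] = cand
                                beta[i][j][A] = (k, B, C)  # for tree rebuilding

        # Probability of the best parse rooted at the start symbol, plus
        # the backpointers from which that parse can be rebuilt.
        return pi[0][n].get(start, 0.0), beta

With the CNF grammar of slide 10 loaded into lexical (rules A → w) and binary (rules A → B C), pcyk("can you book TWA flights".split(), lexical, binary) fills the same Π and β tables sketched on the following slides.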

  9. PCFG Example – Revisited (to be rewritten in CNF)
  Lexical rules:
  Det → that [.05]  Det → the [.80]  Det → a [.15]
  Noun → book [.10]  Noun → flights [.50]  Noun → meal [.40]
  Verb → book [.30]  Verb → include [.30]  Verb → want [.40]
  Aux → can [.40]  Aux → does [.30]  Aux → do [.30]
  ProperN → TWA [.40]  ProperN → Denver [.60]
  Pronoun → you [.40]  Pronoun → I [.60]
  Syntactic rules:
  S → NP VP [.80]  S → Aux NP VP [.15]  S → VP [.05]
  NP → Det Nom [.20]  NP → ProperN [.35]  NP → Nom [.05]  NP → Pronoun [.40]
  Nom → Noun [.75]  Nom → Noun Nom [.20]  Nom → ProperN Nom [.05]
  VP → Verb [.55]  VP → Verb NP [.40]  VP → Verb NP NP [.05]

  10. Example (CYK Parsing) – Grammar Rewritten in CNF
  S rules: S → NP VP [.8]; (from S → Aux NP VP [.15]) S → Aux NV [.15], NV → NP VP [1.0]; (from S → VP [.05]) S → book [.00825], S → include [.00825], S → want [.011], S → Verb NP [.02], S → Verb DNP [.0025]
  NP rules: NP → Det Nom [.2]; (from NP → ProperN [.35]) NP → TWA [.14], NP → Denver [.21]; (from NP → Nom [.05]) NP → book [.00375], NP → flights [.01875], NP → meal [.015], NP → Noun Nom [.01], NP → ProperN Nom [.0025]; (from NP → Pronoun [.4]) NP → you [.16], NP → I [.24]
  Nom rules: (from Nom → Noun [.75]) Nom → book [.075], Nom → flights [.375], Nom → meal [.3]; Nom → Noun Nom [.2]; Nom → ProperN Nom [.05]
  VP rules: (from VP → Verb [.55]) VP → book [.165], VP → include [.165], VP → want [.22]; VP → Verb NP [.4]; (from VP → Verb NP NP [.05]) VP → Verb DNP [.05], DNP → NP NP [1.0]
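To make the conversion concrete, here is how one of the parenthesized unary rules above folds into the derived rules (a small sketch of my own under the slide's numbers; the variable names are not from the slides):

    # NP -> Nom [.05] is eliminated by multiplying its probability into
    # every rule that expands Nom, yielding the derived NP rules above.
    p_unary = 0.05                                     # NP -> Nom
    nom_rules = {("book",): 0.075, ("flights",): 0.375, ("meal",): 0.3,
                 ("Noun", "Nom"): 0.2, ("ProperN", "Nom"): 0.05}

    derived = {rhs: p_unary * p for rhs, p in nom_rules.items()}
    print(derived)
    # {('book',): 0.00375, ('flights',): 0.01875, ('meal',): 0.015,
    #  ('Noun', 'Nom'): 0.01, ('ProperN', 'Nom'): 0.0025}
    # -- matching slide 10 (up to floating-point rounding)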

  11.–15. Example (CYK Parsing) – Π matrix (a sequence of figures showing the Π table being filled in step by step; the figures were not captured in this transcript)

  16. Example (CYK Parsing) – β matrix (figure not captured in this transcript)

  17. PCFG Problems • Independence assumption • Assumption: the expansion of one non-terminal is independent of the expansion of others • However, corpus examination shows that how a node expands depends on where the node sits in the tree • 91% of subjects are pronouns: • She's able to take her baby to work with her. (91%) • Uh, my wife worked until we had a family. (9%) • But only 34% of objects are pronouns: • Some laws absolutely prohibit it. (34%) • All the people signed confessions. (66%)

  18. PCFG Problems • Lack of sensitivity to words • Lexical information in a PCFG can only be represented via the probabilities of pre-terminal nodes (such as Verb, Noun, Det) • However, lexical information and dependencies turn out to be important in modeling syntactic probabilities • Example: Moscow sent more than 100,000 soldiers into Afghanistan. • In a PCFG, into Afghanistan may attach to the NP (more than 100,000 soldiers) or to the VP (sent) • Corpus statistics show NP attachment winning overall (67% in one count, 52% in another), so a PCFG will choose NP attachment here and produce an incorrect result • Why is it wrong? The verb send subcategorizes for a destination, which can be expressed with the preposition into • In fact, when the verb is send, into always attaches to the verb

  19. PCFG Problems • Coordination ambiguity • Consider the following case • Example: dogs in houses and cats • Semantically, dogs is a better conjunct for cats than houses is • Thus the parse [dogs in [NP houses and cats]] intuitively sounds unnatural and should be dispreferred • However, a PCFG assigns both parses the same probability, since the two structures use exactly the same rules
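The point can be made concrete with a toy fragment (the grammar below is invented for illustration; it is not the chapter's grammar): both bracketings use exactly the same multiset of rules, so their probabilities are necessarily equal no matter what values D assigns.

    from math import prod

    # Toy rule probabilities; any values would do, since both parses
    # use the same rules.
    RULES = {
        "NP -> NP Conj NP": 0.2,
        "NP -> NP PP": 0.2,
        "PP -> P NP": 1.0,
        # lexical rules omitted: they too are shared by both parses
    }

    # Parse 1: [[dogs in houses] and cats] -- coordination at the top
    parse1 = ["NP -> NP Conj NP", "NP -> NP PP", "PP -> P NP"]
    # Parse 2: [dogs in [houses and cats]] -- coordination under the PP
    parse2 = ["NP -> NP PP", "PP -> P NP", "NP -> NP Conj NP"]

    assert prod(RULES[r] for r in parse1) == prod(RULES[r] for r in parse2)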

  20. References • NLTK Tutorial: Probabilistic Parsing. http://nltk.sourceforge.net/tutorial/pcfg/index.html • Stanford Probabilistic Parsing Group. http://nlp.stanford.edu/projects/stat-parsing.shtml • General CYK algorithm. http://en.wikipedia.org/wiki/CYK_algorithm • General CYK algorithm, web demo. http://www2.informatik.hu-berlin.de/~pohl/cyk.php?action=example • Probabilistic CYK parsing. http://www.ifi.unizh.ch/cl/gschneid/ParserVorl/ParserVorl7.pdf and http://catarina.ai.uiuc.edu/ling306/slides/lecture23.pdf

  21. Questions? Thank You!
