
Parsing with PCFG


Presentation Transcript


  1. Parsing with PCFG Ling 571 Fei Xia Week 3: 10/11-10/13/05

  2. Outline • Misc • CYK algorithm • Converting CFG into CNF • PCFG • Lexicalized PCFG

  3. Misc • Quiz 1: 15 pts, due 10/13 • Hw2: 10 pts, due 10/13, ling580i_au05@u, ling580e_au05@u • Treehouse weekly meeting: • Time: every Wed 2:30-3:30pm, tomorrow is the 1st meeting • Location: EE1 025 (Campus map 12-N, South of MGH) • Mailing list: cl-announce@u • Others: • Pongo policies • Machines: LLC, Parrington, Treehouse • Linux commands: ssh, sftp, … • Catalyst tools: ESubmit, EPost, …

  4. CYK algorithm

  5. Parsing algorithms • Top-down • Bottom-up • Top-down with bottom-up filtering • Earley algorithm • CYK algorithm • ....

  6. CYK algorithm • Cocke-Younger-Kasami algorithm (a.k.a. CKY algorithm) • Requires the CFG to be in Chomsky Normal Form (CNF). • Bottom-up chart parsing algorithm using DP. • Fill in a two-dimensional array: C[i][j] contains all the possible syntactic interpretations of the substring w_i … w_j • Complexity: O(N³ · |G|), i.e., cubic in the sentence length N

  7. Chomsky normal form (CNF) • Definition of CNF: • A → B C • A → a • S → ε where A, B, C are non-terminals and a is a terminal. S is the start symbol; B and C are not. • For every CFG, there is a CFG in CNF that is weakly equivalent.

  8. CYK algorithm • For every rule A → w_i, add A to C[i][i] • For span=2 to N for begin=1 to N-span+1 end = begin + span – 1; for m=begin to end-1; for all non-terminals A, B, C: if C[begin][m] contains B, C[m+1][end] contains C, and A → B C is a rule, then add A to C[begin][end]

  9. CYK algorithm (another way) • For every rule A → w_i, add it to Cell[i][i] • For span=2 to N for begin=1 to N-span+1 end = begin + span – 1; for m=begin to end-1; for all non-terminals A, B, C: If Cell[begin][m] contains B, Cell[m+1][end] contains C, and A → B C is a rule in the grammar, then add A → B C to Cell[begin][end] and remember m
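
As a concrete companion to this pseudocode, here is a minimal Python sketch of the Cell-based CKY recognizer. The dictionary grammar representation (lexical, binary) is an assumption made for illustration, not the lecture's data structure; backpointers ("remember m") are left out to keep it short, and indices are 0-based rather than the slides' 1-based begin/end.

```python
from collections import defaultdict

def cky_recognize(words, lexical, binary, start="S"):
    """CKY recognizer for a grammar in CNF (0-based indices, unlike the slides).
    lexical: dict word -> set of non-terminals A with a rule A -> word
    binary:  dict (B, C) -> set of non-terminals A with a rule A -> B C
    Returns True iff `words` can be derived from `start`."""
    n = len(words)
    cell = defaultdict(set)                      # cell[(i, j)] covers words[i..j]
    for i, w in enumerate(words):                # "For every rule A -> w_i ..."
        cell[(i, i)] |= lexical.get(w, set())
    for span in range(2, n + 1):
        for begin in range(n - span + 1):
            end = begin + span - 1
            for m in range(begin, end):          # split point
                for B in cell[(begin, m)]:
                    for C in cell[(m + 1, end)]:
                        cell[(begin, end)] |= binary.get((B, C), set())
    return start in cell[(0, n - 1)]
```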

  10. An example Rules:
      VP → V NP        V → book
      VP → VP PP       N → book/flight/cards
      NP → Det N       Det → that/the
      NP → NP PP       P → with
      PP → P NP
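
Using the hypothetical dictionary representation from the recognizer sketch above, slide 10's rules look like this; the fragment has no S rules, so "book that flight" is recognized as a VP:

```python
lexical = {"book": {"V", "N"}, "flight": {"N"}, "cards": {"N"},
           "that": {"Det"}, "the": {"Det"}, "with": {"P"}}
binary = {("V", "NP"): {"VP"}, ("VP", "PP"): {"VP"}, ("Det", "N"): {"NP"},
          ("NP", "PP"): {"NP"}, ("P", "NP"): {"PP"}}

print(cky_recognize("book that flight".split(), lexical, binary, start="VP"))  # True
```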

  11. Parse “book that flight”: chart C1[begin][end], indexed by begin = 1..3 and end = 1..3

  12. Parse “book that flight”: chart C2[begin][span], indexed by begin = 1..3 and span = 1..3

  13. Data structures for the chart

  14. Summary of CYK algorithm • Bottom-up using DP • Requires the CFG to be in CNF • A very efficient algorithm • Easy to extend

  15. Converting CFG into CNF

  16. Chomsky normal form (CNF) • Definition of CNF: • A → B C, • A → a, • S → ε where A, B, C are non-terminals, a is a terminal, S is the start symbol, and B, C are not the start symbol. • For every CFG, there is a CFG in CNF that is weakly equivalent.

  17. Converting CFG to CNF • (1) Add a new symbol S0 and a rule S0 → S (so the start symbol will not appear on the rhs of any rule) • (2) Eliminate ε-rules: for each rule A → ε, remove it; for each rule B → α A β, add B → α β; for each rule B → A, add B → ε unless B → ε has been previously eliminated.

  18. Conversion (cont) • (3) Remove unit rules: remove each unit rule A → B and add A → α for each rule B → α, unless A → α is a unit rule that was previously removed. • (4) Replace a rule A → u1 u2 … uk, where k > 2, with A → u1 A1, A1 → u2 A2, …, A(k-2) → u(k-1) uk; replace any terminal ui in a rule of length ≥ 2 with a new symbol Ui and add a new rule Ui → ui
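
A small Python sketch of step (4) only (binarization plus terminal lifting); steps (1)-(3) are omitted, and the (lhs, rhs-tuple) rule format and fresh-symbol naming scheme are assumptions made for illustration.

```python
def binarize_and_lift(rules, terminals):
    """Step (4): make every right-hand side either one terminal or two non-terminals.
    rules: list of (lhs, rhs) pairs, rhs a tuple of symbols; terminals: set of terminals."""
    new_rules, counter = [], 0

    def fresh(base):
        nonlocal counter
        counter += 1
        return f"{base}_{counter}"               # made-up naming scheme

    for lhs, rhs in rules:
        if len(rhs) >= 2:                        # lift terminals out of long rhs:
            lifted = []                          # replace terminal a by U, add U -> a
            for sym in rhs:
                if sym in terminals:
                    u = fresh("T_" + sym)
                    new_rules.append((u, (sym,)))
                    lifted.append(u)
                else:
                    lifted.append(sym)
            rhs = tuple(lifted)
        cur = lhs                                # binarize: A -> u1 A1, A1 -> u2 A2, ...
        while len(rhs) > 2:
            nxt = fresh(lhs)
            new_rules.append((cur, (rhs[0], nxt)))
            cur, rhs = nxt, rhs[1:]
        new_rules.append((cur, rhs))
    return new_rules
```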

  19. An example

  20. Adding S0 → S

  21. Removing ε-rules • Remove B → ε • Remove A → ε

  22. Removing unit rules • Remove • Remove

  23. Removing unit rules (cont) • Remove • Remove

  24. Converting remaining rules

  25. Summary of CFG parsing • Simple top-down and bottom-up parsing generate useless trees. • Top-down with bottom-up filtering has three problems. • Solution: use DP: • Earley algorithm • CYK algorithm

  26. Probabilistic CFG (PCFG)

  27. PCFG • PCFG is an extension of CFG. • A PCFG is a 5-tuple (N, T, P, S, Pr), where Pr is a function assigning a probability to each rule in P: Pr(A → α), also written P(A → α | A). • Given a non-terminal A, its rule probabilities sum to one: Σ_α Pr(A → α) = 1

  28. A PCFG
      S → NP VP        0.8       N → Mary       0.01
      S → Aux NP VP    0.15      N → book       0.02
      S → VP           0.05      V → bought     0.02
      VP → V           0.35      Det → a        0.04
      VP → V NP        0.45      …
      VP → VP PP       0.20
      NP → N           0.8
      NP → Det N       0.2
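
The same grammar written as a Python dictionary, with a check of the slide-27 constraint that each non-terminal's rule probabilities sum to one. The lexical rules hidden behind "…" on the slide are not included, so only S, VP, and NP actually sum to 1 here.

```python
from collections import defaultdict

# The PCFG above as (lhs, rhs) -> probability; rules elided by "…" are missing.
pcfg = {
    ("S", ("NP", "VP")): 0.8,  ("S", ("Aux", "NP", "VP")): 0.15, ("S", ("VP",)): 0.05,
    ("VP", ("V",)): 0.35,      ("VP", ("V", "NP")): 0.45,        ("VP", ("VP", "PP")): 0.20,
    ("NP", ("N",)): 0.8,       ("NP", ("Det", "N")): 0.2,
    ("N", ("Mary",)): 0.01,    ("N", ("book",)): 0.02,
    ("V", ("bought",)): 0.02,  ("Det", ("a",)): 0.04,
}

totals = defaultdict(float)
for (lhs, _rhs), prob in pcfg.items():
    totals[lhs] += prob
for lhs, total in sorted(totals.items()):
    print(lhs, total)   # S, VP, NP sum to 1.0; N, V, Det do not (their "…" rules are elided)
```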

  29. Using probabilities • To estimate the prob of a sentence and its parse trees. • Useful in disambiguation. • The prob of a tree: P(T) = Π_{n ∈ T} Pr(r(n)), where n is a node in T and r(n) is the rule used to expand n in T.

  30. Computing P(T)
      S → NP VP        0.8       N → Mary       0.01
      S → Aux NP VP    0.15      N → book       0.02
      S → VP           0.05      V → bought     0.02
      VP → V           0.35      Det → a        0.04
      VP → V NP        0.45
      VP → VP PP       0.20
      NP → N           0.8
      NP → Det N       0.2
      The sentence is “Mary bought a book”.
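
A quick computation of P(T) for the natural parse of “Mary bought a book” under this grammar; the tree itself is not reproduced in the transcript, so the list of rules used below is an assumption about its shape.

```python
# P(T) = product of the probabilities of the rules used at each node of the tree.
rules_used = [
    ("S",   "NP VP",  0.8),
    ("NP",  "N",      0.8),
    ("N",   "Mary",   0.01),
    ("VP",  "V NP",   0.45),
    ("V",   "bought", 0.02),
    ("NP",  "Det N",  0.2),
    ("Det", "a",      0.04),
    ("N",   "book",   0.02),
]

p_t = 1.0
for lhs, rhs, prob in rules_used:
    p_t *= prob
print(p_t)   # ≈ 9.216e-09
```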

  31. The most likely tree • P(T, S) = P(T) · P(S | T) = P(T), where T is a parse tree and S is a sentence (P(S | T) = 1 because S is the yield of T) • The best parse tree for a sentence S: T_best(S) = argmax_T P(T, S) = argmax_T P(T), where T ranges over the parse trees of S

  32. Find the most likely tree • Given a PCFG and a sentence S, how do we find the best parse tree for S? • One algorithm: CYK

  33. CYK algorithm for CFG • For every rule A → w_i, add it to Cell[i][i] • For span=2 to N for begin=1 to N-span+1 end = begin + span – 1; for m=begin to end-1; for all non-terminals A, B, C: if Cell[begin][m] contains B, Cell[m+1][end] contains C, and A → B C is a rule, then add A to Cell[begin][end]

  34. CYK algorithm for CFG (another implementation) • For every rule A → w_i, set P[i][i][A] = true • For span=2 to N for begin=1 to N-span+1 end = begin + span – 1; for m=begin to end-1; for all non-terminals A, B, C: if P[begin][m][B] and P[m+1][end][C] are true and A → B C is a rule, then set P[begin][end][A] = true

  35. Variables for CFG and PCFG • CFG: P[i][j][A] says whether there is a parse tree whose root is A and which covers w_i … w_j • PCFG: P[i][j][A] is the prob of the most likely parse tree whose root is A and which covers w_i … w_j

  36. CYK algorithm for PCFG • For every rule A → w_i, set P[i][i][A] = Pr(A → w_i) • For span=2 to N for begin=1 to N-span+1 end = begin + span – 1; for m=begin to end-1; for all non-terminals A, B, C: if Pr(A → B C) × P[begin][m][B] × P[m+1][end][C] > P[begin][end][A] then P[begin][end][A] = Pr(A → B C) × P[begin][m][B] × P[m+1][end][C] and remember (m, B, C)
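
A minimal Python sketch of this probabilistic CKY, again under an assumed grammar representation (lexical rules keyed by word, binary rules as a flat list). It also stores the backpointers the pseudocode asks us to remember, so the best tree can be rebuilt.

```python
from collections import defaultdict

def cky_viterbi(words, lex_rules, bin_rules, start="S"):
    """Probabilistic CKY for a PCFG in CNF (0-based indices).
    lex_rules: dict word -> list of (A, prob) for rules A -> word
    bin_rules: list of (A, B, C, prob) for rules A -> B C
    Returns (probability of the best parse, best tree) or (0.0, None)."""
    n = len(words)
    best = defaultdict(float)                    # (i, j, A) -> prob of best subtree
    back = {}                                    # (i, j, A) -> remembered backpointer
    for i, w in enumerate(words):
        for A, p in lex_rules.get(w, []):
            if p > best[(i, i, A)]:
                best[(i, i, A)] = p
                back[(i, i, A)] = w
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for m in range(i, j):                # split point
                for A, B, C, p in bin_rules:
                    cand = p * best[(i, m, B)] * best[(m + 1, j, C)]
                    if cand > best[(i, j, A)]:
                        best[(i, j, A)] = cand
                        back[(i, j, A)] = (m, B, C)

    def build(i, j, A):                          # rebuild the tree from backpointers
        bp = back[(i, j, A)]
        if isinstance(bp, str):                  # lexical rule A -> word
            return (A, bp)
        m, B, C = bp
        return (A, build(i, m, B), build(m + 1, j, C))

    if best[(0, n - 1, start)] == 0.0:
        return 0.0, None
    return best[(0, n - 1, start)], build(0, n - 1, start)
```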

  37. A CFG Rules:
      VP → V NP        V → book
      VP → VP PP       N → book/flight/cards
      NP → Det N       Det → that/the
      NP → NP PP       P → with
      PP → P NP

  38. Parse “book that flight”: chart indexed by begin = 1..3 and end = 1..3

  39. A PCFG Rules:
      VP → V NP    0.4       V → book      0.001
      VP → VP PP   0.2       N → book      0.01
      NP → Det N   0.3       N → flight    0.02
      NP → NP PP   0.2       Det → that    0.1
      PP → P NP    1.0       P → with      0.2
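
Slide 39's grammar in the representation assumed by the Viterbi sketch above; since this fragment has no S rules, we ask for the best VP over “book that flight”, which is the V NP reading with probability 0.4 × 0.001 × (0.3 × 0.1 × 0.02) ≈ 2.4e-7.

```python
lex_rules = {"book": [("V", 0.001), ("N", 0.01)], "flight": [("N", 0.02)],
             "that": [("Det", 0.1)], "with": [("P", 0.2)]}
bin_rules = [("VP", "V", "NP", 0.4), ("VP", "VP", "PP", 0.2),
             ("NP", "Det", "N", 0.3), ("NP", "NP", "PP", 0.2),
             ("PP", "P", "NP", 1.0)]

prob, tree = cky_viterbi("book that flight".split(), lex_rules, bin_rules, start="VP")
print(prob)   # ≈ 2.4e-07
print(tree)   # ('VP', ('V', 'book'), ('NP', ('Det', 'that'), ('N', 'flight')))
```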

  40. Parse “book that flight”: chart indexed by begin = 1..3 and end = 1..3

  41. N-best parse trees • Best parse tree: T_best = argmax_T P(T) • N-best parse trees: the N parse trees of S with the highest P(T)

  42. CYK algorithm for N-best • For every rule A → w_i, add Pr(A → w_i) to the sorted list P[i][i][A] • For span=2 to N for begin=1 to N-span+1 end = begin + span – 1; for m=begin to end-1; for all non-terminals A, B, C: for each pair (i, j) of entries in P[begin][m][B] and P[m+1][end][C]: val = Pr(A → B C) × P[begin][m][B][i] × P[m+1][end][C][j]; if val > one of the probs in P[begin][end][A], then remove the last element in P[begin][end][A] and insert val into the array, and remove the last element in B[begin][end][A] and insert (m, B, C, i, j) into B[begin][end][A]
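
A small helper for the cell update described on this slide: it keeps the N best probabilities of one cell in decreasing order, with a parallel list of backpointers (the names and list layout are illustrative, not the lecture's).

```python
import bisect

def add_candidate(cell_probs, cell_backs, val, backpointer, n_best):
    """Insert (val, backpointer) into a chart cell that keeps only the n_best
    highest probabilities, sorted in decreasing order, plus parallel backpointers."""
    if len(cell_probs) == n_best and val <= cell_probs[-1]:
        return                                   # not better than the current worst
    pos = bisect.bisect_left([-p for p in cell_probs], -val)   # keep decreasing order
    cell_probs.insert(pos, val)
    cell_backs.insert(pos, backpointer)
    if len(cell_probs) > n_best:                 # drop the last (smallest) element
        cell_probs.pop()
        cell_backs.pop()
```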

  43. PCFG for Language Modeling (LM) • N-gram LM: P(S) = Π_i P(w_i | w_{i-N+1} … w_{i-1}) • Syntax-based LM: P(S) = Σ_T P(T, S), summing over all parse trees T of S

  44. Calculating Pr(S) • Parsing: the prob of the most likely parse tree: max_T P(T, S) • LM: the sum over all parse trees: Pr(S) = Σ_T P(T, S)

  45. CYK for finding the most likely parse tree • For every rule A → w_i, set P[i][i][A] = Pr(A → w_i) • For span=2 to N for begin=1 to N-span+1 end = begin + span – 1; for m=begin to end-1; for all non-terminals A, B, C: if Pr(A → B C) × P[begin][m][B] × P[m+1][end][C] > P[begin][end][A] then P[begin][end][A] = Pr(A → B C) × P[begin][m][B] × P[m+1][end][C]

  46. CYK for calculating LM • For every rule A → w_i, P[i][i][A] += Pr(A → w_i) • For span=2 to N for begin=1 to N-span+1 end = begin + span – 1; for m=begin to end-1; for all non-terminals A, B, C: P[begin][end][A] += Pr(A → B C) × P[begin][m][B] × P[m+1][end][C]
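
The only change from the Viterbi sketch above is that the max update becomes a sum; a minimal sketch of this inside-probability version under the same assumed grammar representation:

```python
from collections import defaultdict

def cky_inside(words, lex_rules, bin_rules, start="S"):
    """Sum over all parses instead of maximizing: returns Pr(S) = sum over T of P(T, S)."""
    n = len(words)
    inside = defaultdict(float)                  # (i, j, A) -> total prob of subtrees
    for i, w in enumerate(words):
        for A, p in lex_rules.get(w, []):
            inside[(i, i, A)] += p
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for m in range(i, j):
                for A, B, C, p in bin_rules:
                    inside[(i, j, A)] += p * inside[(i, m, B)] * inside[(m + 1, j, C)]
    return inside[(0, n - 1, start)]
```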

  47. CYK algorithm

  48. Learning PCFG Probabilities • Given a treebank (i.e., a set of trees), use MLE: Pr(A → α) = Count(A → α) / Count(A) • Without a treebank: use the inside-outside algorithm
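
A minimal sketch of the MLE estimate, assuming trees are represented as nested (label, child, …) tuples with word strings at the leaves (the same format the Viterbi sketch above returns):

```python
from collections import defaultdict

def mle_rule_probs(treebank):
    """Pr(A -> alpha) = Count(A -> alpha) / Count(A), counted over a list of trees."""
    rule_count = defaultdict(int)
    lhs_count = defaultdict(int)

    def visit(node):
        lhs = node[0]
        rhs = tuple(c if isinstance(c, str) else c[0] for c in node[1:])
        rule_count[(lhs, rhs)] += 1
        lhs_count[lhs] += 1
        for c in node[1:]:
            if not isinstance(c, str):
                visit(c)

    for tree in treebank:
        visit(tree)
    return {rule: count / lhs_count[rule[0]] for rule, count in rule_count.items()}

# e.g. mle_rule_probs([("S", ("NP", ("N", "Mary")),
#                            ("VP", ("V", "bought"), ("NP", ("Det", "a"), ("N", "book"))))])
```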

  49. Q&A • PCFG • CYK algorithm

  50. Problems of PCFG • Lack of sensitivity to structural dependency: e.g., NP → Pronoun is far more likely in subject position than in object position, but a PCFG assigns the rule the same probability everywhere. • Lack of sensitivity to lexical dependency: e.g., PP-attachment decisions depend on the particular verb and noun involved, which plain PCFG rules ignore.
