1 / 40

PARSING WITH CONTEXT-FREE GRAMMARS

PARSING WITH CONTEXT-FREE GRAMMARS. cc437. PARSING. Parsing is the process of recognizing and assigning STRUCTURE Parsing a string with a CFG: Finding a derivation of the string consistent with the grammar The derivation gives us a PARSE TREE. EXAMPLE (CFR LAST WEEK). PARSING AS SEARCH.

jeads
Télécharger la présentation

PARSING WITH CONTEXT-FREE GRAMMARS

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. PARSING WITH CONTEXT-FREE GRAMMARS cc437

  2. PARSING • Parsing is the process of recognizing and assigning STRUCTURE • Parsing a string with a CFG: • Finding a derivation of the string consistent with the grammar • The derivation gives us a PARSE TREE

  3. EXAMPLE (CFR LAST WEEK)

  4. PARSING AS SEARCH • Just as in the case of non-deterministic regular expressions, the main problem with parsing is the existence of CHOICE POINTS • There is a need for a SEARCH STRATEGY determining the order in which alternatives are considered

  5. TOP-DOWN AND BOTTOM-UP SEARCH STRATEGIES • The search has to be guided by the INPUT and the GRAMMAR • TOP-DOWN search: the parse tree has to be rooted in the start symbol S • EXPECTATION-DRIVEN parsing • BOTTOM-UP search: the parse tree must be an analysis of the input • DATA-DRIVEN parsing

  6. AN EXAMPLE OF TOP-DOWN SEARCH(IN PARALLEL)

  7. AN EXAMPLE OF BOTTOM-UP SEARCH

  8. NON-PARALLEL SEARCH • If it’s not possible to examine all alternatives in parallel, it’s necessary to make further decisions: • Which node in the current search space to expand first (breadth-first or depth-first) • Which of the applicable grammar rules to expand first • Which leaf node in a parse tree to expand next (e.g., leftmost)

  9. TOP-DOWN, DEPTH-FIRST, LEFT-TO-RIGHT

  10. TOP-DOWN, DEPTH-FIRST, LEFT-TO-RIGHT (II)

  11. TOP-DOWN, DEPTH-FIRST, LEFT-TO-RIGHT (III)

  12. TOP-DOWN, DEPTH-FIRST, LEFT-TO-RIGHT (IV)

  13. A T-D, D-F, L-R PARSER

  14. TOP-DOWN vs BOTTOM-UP • TOP-DOWN: • Only search among grammatical answers • BUT: suggests hypotheses that may not be consistent with data • Problem: left-recursion • BOTTOM-UP: • Only forms hypotheses consistent with data • BUT: may suggest hypotheses that make no sense globally

  15. LEFT-RECURSION • A LEFT-RECURSIVE grammar may cause a T-D, D-F, L-R parser to never return • Examples of left-recursive rules: • NP  NP PP • S  S and S • But also: • NP  Det Nom • Det  NP’s

  16. THE PROBLEM WITH LEFT-RECURSION

  17. LEFT-RECURSION: POOR SOLUTIONS • Rewrite the grammar to a weakly equivalent one • Problem: may not get correct parse tree • Limit the depth during search • Problem: limit is arbitrary

  18. LEFT-CORNER PARSING • A hybrid of top-down and bottom-up parsing • Strategy: don’t consider any expansion unless the current input can serve as the LEFT-CORNER of that expansion

  19. FURTHER PROBLEMS IN PARSING • Ambiguity • Church and Patel (1982): the number of attachment ambiguities grows like the Catalan numbers • C(2) = 2, C(3) = 5, C(4) = 14, C(5) = 132, C(6) = 469, C(7) = 1430, C(8) = 4867 • Avoiding reparsing

  20. COMMON STRUCTURAL AMBIGUITIES • COORDINATION ambiguity • OLD (MEN AND WOMEN) vs (OLD MEN) AND WOMEN • ATTACHMENT ambiguity: • Gerundive VP attachment ambiguity • I saw the Eiffel Tower flying to Paris • PP attachment ambiguity • I shot an elephant in my pajamas

  21. PP ATTACHMENT AMBIGUITY

  22. AMBIGUITY: SOLUTIONS • Use a PROBABILISTIC GRAMMAR (not covered in this module) • Use semantics

  23. AVOID RECOMPUTING INVARIANTS • Consider parsing with a top-down parser the NP: • A flight from Indianapolis to Houston on TWA • With the grammar rules: • NP  Det Nominal • NP  NP PP • NP  ProperNoun

  24. INVARIANTS AND TOP-DOWN PARSING

  25. THE EARLEY ALGORITHM

  26. DYNAMIC PROGRAMMING • A standard T-D parser would reanalyze A FLIGHT 4 times, always in the same way • A DYNAMIC PROGRAMMING algorithm uses a table (the CHART) to avoid repeating work • The Earley algorithm also • Does not suffer from the left-recursion problem • Solves an exponential problem in O(n3)

  27. THE CHART • The Earley algorithm uses a table (the CHART) of size N+1, where N is the length of the input • Table entries sit in the `gaps’ between words • Each entry in the chart is a list of • Completed constituents • In-progress constituents • Predicted constituents • All three types of objects are represented in the same way as STATES

  28. THE CHART: GRAPHICAL REPRESENTATION

  29. STATES • A state encodes two types of information: • How much of a certain rule has been encountered in the input • Which positions are covered • A  , [X,Y] • DOTTED RULES • VP  V NP  • NP  Det  Nominal • S   VP

  30. EXAMPLES

  31. SUCCESS • The parser has succeeded if entry N+1 of the chart contains the state • S   , [0,N]

  32. THE ALGORITHM • The algorithm loops through the input without backtracking, at each step performing three operations: • PREDICTOR: add predictions to the chart • COMPLETER: Move the dot to the right when looked-for constituent is found • SCANNER: read in the next input word

  33. THE ALGORITHM: CENTRAL LOOP

  34. EARLEY ALGORITHM: THE THREE OPERATORS

  35. EXAMPLE, AGAIN

  36. EXAMPLE: BOOK THAT FLIGHT

  37. EXAMPLE: BOOK THAT FLIGHT (II)

  38. EXAMPLE: BOOK THAT FLIGHT (III)

  39. EXAMPLE: BOOK THAT FLIGHT (IV)

  40. READINGS • Jurafsky and Martin, chapter 10.1-10.4

More Related