
Fall 2004 Lecture Notes #4

EECS 595 / LING 541 / SI 661. Natural Language Processing. Fall 2004 Lecture Notes #4. Parsing with Context-Free Grammars. Introduction. Parsing = associating a structure (parse tree) to an input string using a grammar


Presentation Transcript


  1. EECS 595 / LING 541 / SI 661 Natural Language Processing Fall 2004 Lecture Notes #4

  2. Parsing with Context-Free Grammars

  3. Introduction • Parsing = associating a structure (parse tree) with an input string using a grammar • CFGs are declarative: they do not specify how the parse tree is to be constructed • Parse trees are used in grammar checking, semantic analysis, machine translation, question answering, and information extraction • Example: “How many people in the Human Resources Department receive salaries above $30,000?”

  4. Parsing as search

  5. Parsing as search • Example: “Book that flight.” • Two types of constraints on the parses: (a) some that come from the input string, (b) others that come from the grammar • Parse tree: [S [VP [Verb Book] [NP [Det that] [Nom [Noun flight]]]]]

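  6. Top-down parsing [Figure: the top-down search space, expanded level by level. Level 1: S. Level 2: S → NP VP; S → Aux NP VP; S → VP. Level 3: the NP in each of the first two trees expands to Det Nom or PropN; the VP in the third tree expands to V or V NP.]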

  7. Bottom-up parsing [Figure: the bottom-up search space for “Book that flight”. The words are first assigned parts of speech (Book: Noun or Verb; that: Det; flight: Noun); successive levels then build Nom, NP and VP constituents over them.]
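
The bottom-up search on this slide can be sketched as repeated reduction of the word sequence until only S remains. This is a minimal illustration, not the procedure from the lecture: the rule set `RULES` and the function name `bottom_up_recognize` are assumptions made for the example, and the search is exhaustive rather than efficient.

```python
# Toy rules (assumed for illustration): (left-hand side, right-hand side).
RULES = [
    ("S", ("VP",)),
    ("VP", ("Verb", "NP")),
    ("VP", ("Verb",)),
    ("NP", ("Det", "Nom")),
    ("Nom", ("Noun",)),
    ("Verb", ("book",)),
    ("Noun", ("book",)),
    ("Det", ("that",)),
    ("Noun", ("flight",)),
]


def bottom_up_recognize(words, rules=RULES, start="S"):
    """Return True if the words can be reduced to the start symbol."""
    initial = tuple(words)
    agenda = [initial]   # sentential forms still to explore
    seen = {initial}     # avoid exploring the same form twice
    while agenda:
        form = agenda.pop()
        if form == (start,):
            return True
        # Try every rule at every position: replace a right-hand side
        # found in the form by the rule's left-hand side.
        for i in range(len(form)):
            for lhs, rhs in rules:
                if form[i:i + len(rhs)] == rhs:
                    reduced = form[:i] + (lhs,) + form[i + len(rhs):]
                    if reduced not in seen:
                        seen.add(reduced)
                        agenda.append(reduced)
    return False
```

Note that the search also builds the Noun reading of "book", which can never lead to an S here; that wasted work is the weakness of bottom-up parsing.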

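  8. Comparing TD and BU parsers • TD (top-down) never wastes time exploring trees that cannot result in an S, but it may build trees that are inconsistent with the input. • BU (bottom-up) never spends effort on trees that are inconsistent with the input, but it may build trees that can never lead to an S. • Needed: some middle ground.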

  9. Basic TD parser • Practically infeasible to generate all trees in parallel. • Use depth-first strategy. • When arriving at a tree that is inconsistent with the input, return to the most recently generated but still unexplored tree.

  10. A TD-DF-LR (top-down, depth-first, left-to-right) parser
      function TOP-DOWN-PARSE(input, grammar) returns a parse tree
        agenda ← (Initial S tree, Beginning of input)
        current-search-state ← POP(agenda)
        loop
          if SUCCESSFUL-PARSE?(current-search-state) then
            return TREE(current-search-state)
          else
            if CAT(NODE-TO-EXPAND(current-search-state)) is a POS then
              if CAT(node-to-expand) ∈ POS(CURRENT-INPUT(current-search-state)) then
                PUSH(APPLY-LEXICAL-RULE(current-search-state), agenda)
              else
                return reject
            else
              PUSH(APPLY-RULES(current-search-state, grammar), agenda)
          if agenda is empty then
            return reject
          else
            current-search-state ← NEXT(agenda)
        end
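
A runnable approximation of the agenda-based search above, given as a hedged sketch. It only recognizes a sentence and does not build the tree. The toy `GRAMMAR`, `LEXICON` and the name `top_down_recognize` are assumptions for the example, and a failed lexical match simply abandons that state so the search backtracks to the next one on the agenda.

```python
# Toy grammar and lexicon (assumed for illustration; no left-recursive rules).
GRAMMAR = {
    "S": [["NP", "VP"], ["Aux", "NP", "VP"], ["VP"]],
    "NP": [["Det", "Nom"], ["PropN"]],
    "Nom": [["Noun"], ["Noun", "Nom"]],
    "VP": [["Verb"], ["Verb", "NP"]],
}
LEXICON = {
    "book": {"Noun", "Verb"},
    "that": {"Det"},
    "flight": {"Noun"},
}


def top_down_recognize(words, grammar=GRAMMAR, lexicon=LEXICON):
    """Depth-first, left-to-right, top-down recognition with an agenda."""
    # A search state is (symbols still to be expanded, position in the input).
    agenda = [(("S",), 0)]
    while agenda:
        pending, pos = agenda.pop()          # most recently generated state
        if not pending:
            if pos == len(words):
                return True                  # all symbols and all words used
            continue                         # leftover input: backtrack
        first, rest = pending[0], pending[1:]
        if first in grammar:
            # Expand the leftmost symbol; push in reverse order so that
            # the first rule of the grammar is explored first.
            for rhs in reversed(grammar[first]):
                agenda.append((tuple(rhs) + rest, pos))
        elif pos < len(words) and first in lexicon.get(words[pos], ()):
            # The part of speech matches the current word: consume it.
            agenda.append((rest, pos + 1))
    return False
```

For "book that flight" the search first tries S → NP VP and S → Aux NP VP, fails on the first word each time, and only then succeeds through S → VP.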

  11. An example Does this flight include a meal?

  12. Problems with the basic parser • Left-recursion: rules of the type NP → NP PP. Solution: rewrite each rule of the form A → A β | α using a new symbol A′: A → α A′, A′ → β A′ | ε • Ambiguity: attachment ambiguity, coordination ambiguity, noun-phrase bracketing ambiguity • Attachment ambiguity: “I saw the Grand Canyon flying to New York” • Coordination ambiguity: “old men and women”
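
The left-recursion rewrite can be sketched in a few lines. This is an illustrative helper, not code from the course: the function name `remove_immediate_left_recursion` is an assumption, productions are lists of symbols, and the empty list stands for ε. It handles only immediate left recursion.

```python
def remove_immediate_left_recursion(symbol, productions):
    """Rewrite A -> A b | a as A -> a A', A' -> b A' | epsilon.

    `productions` is a list of right-hand sides (lists of symbols).
    Returns a dict mapping each left-hand side to its new productions.
    """
    # Split the rules into left-recursive ones (keeping only the part
    # after the leading A) and all the others.
    recursive = [rhs[1:] for rhs in productions if rhs and rhs[0] == symbol]
    others = [rhs for rhs in productions if not rhs or rhs[0] != symbol]
    if not recursive:
        return {symbol: productions}
    new_symbol = symbol + "'"
    return {
        symbol: [rhs + [new_symbol] for rhs in others],
        # The trailing empty list is the epsilon production.
        new_symbol: [beta + [new_symbol] for beta in recursive] + [[]],
    }


rewritten = remove_immediate_left_recursion(
    "NP", [["NP", "PP"], ["Det", "Nom"]]
)
```

Here `rewritten` maps NP to `[["Det", "Nom", "NP'"]]` and NP' to `[["PP", "NP'"], []]`. The rewritten grammar accepts the same strings but assigns right-branching trees, which is one reason the lecture moves on to the Earley algorithm instead.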

  13. Problems with the basic parser • Example: “President Kennedy today pushed aside other White House business to devote all his time and attention to working on the Berlin crisis address he will deliver tomorrow night to the American people over nationwide television and radio.” • Solutions: return all parses or include disambiguation in the parser. • Inefficient reparsing of subtrees: “a flight from Indianapolis to Houston on TWA”

  14. The Earley algorithm • Resolves three problems: left-recursive rules, ambiguity, and inefficient reparsing of subtrees • A chart with N+1 entries • Dotted rules • S → . VP, [0,0] • NP → Det . Nominal, [1,2] • VP → V NP ., [0,3]
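
A compact Earley recognizer, offered as a sketch under stated assumptions rather than the version presented in class. The toy `GRAMMAR` (which deliberately includes the left-recursive rule NP → NP PP), the `LEXICON`, the dummy start symbol `GAMMA` and the `State` tuple are all choices made for this example. Each state is a dotted rule plus its start position, and `chart[i]` holds the states ending at position i. The scanner advances directly over a part-of-speech symbol, and the grammar is assumed to have no ε rules.

```python
from collections import namedtuple

# A dotted rule: lhs -> rhs with the dot before rhs[dot], begun at `start`.
State = namedtuple("State", "lhs rhs dot start")

GRAMMAR = {
    "S": [("NP", "VP"), ("VP",)],
    "NP": [("Det", "Nominal"), ("NP", "PP")],   # left-recursive on purpose
    "Nominal": [("Noun",)],
    "VP": [("Verb", "NP"), ("Verb",)],
    "PP": [("Prep", "NP")],
}
LEXICON = {
    "book": {"Verb", "Noun"}, "that": {"Det"}, "flight": {"Noun"},
    "from": {"Prep"}, "the": {"Det"}, "airport": {"Noun"},
}


def earley_recognize(words, grammar=GRAMMAR, lexicon=LEXICON, start="S"):
    chart = [[] for _ in range(len(words) + 1)]   # N+1 chart entries

    def add(state, i):
        if state not in chart[i]:                 # never store a state twice
            chart[i].append(state)

    add(State("GAMMA", (start,), 0, 0), 0)
    for i in range(len(words) + 1):
        for state in chart[i]:                    # chart[i] grows as we go
            if state.dot < len(state.rhs):
                nxt = state.rhs[state.dot]
                if nxt in grammar:                # predictor
                    for rhs in grammar[nxt]:
                        add(State(nxt, rhs, 0, i), i)
                elif i < len(words) and nxt in lexicon.get(words[i], ()):
                    # scanner: move the dot over the part of speech
                    add(state._replace(dot=state.dot + 1), i + 1)
            else:                                 # completer
                for prev in chart[state.start]:
                    if (prev.dot < len(prev.rhs)
                            and prev.rhs[prev.dot] == state.lhs):
                        add(prev._replace(dot=prev.dot + 1), i)
    return State("GAMMA", (start,), 1, 0) in chart[len(words)]
```

Because a state is added to a chart entry only once, the left-recursive NP rule is predicted a single time per position instead of looping, and a completed subtree such as the NP is stored once and reused by every rule that needs it.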

  15. Parsing with FSAs • Shallow parsing • Useful for information extraction: noun phrases, verb phrases, locations, etc. • The FASTUS system (Appelt and Israel, 1997) • Sample rules for noun groups: NG → Pronoun | Time-NP | Date-NP; NG → (DETP) (Adjs) HdNns | DETP Ving HdNns; DETP → DETP-CP | DETP-INCP • Complete determiner-phrases: “the only five”, “another three”, “this”, “many”, “hers”, “all”, “the most”
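
To make the finite-state idea concrete, here is a much simplified noun-group chunker. It is not FASTUS and does not use its rule set: the tag names, the one-letter `TAG_CODES` and the function `noun_groups` are assumptions for the example. Each part-of-speech tag is mapped to a single character so that a regular expression, which compiles to a finite-state automaton, can scan the tag sequence for Pronoun | (Det) (Adjs) Nouns.

```python
import re

# One character per tag so that match offsets equal token offsets.
TAG_CODES = {"DET": "D", "ADJ": "A", "NOUN": "N", "PRON": "P"}

# Simplified noun group: a pronoun, or optional determiner,
# any number of adjectives, and one or more head nouns.
NOUN_GROUP = re.compile(r"P|D?A*N+")


def noun_groups(tagged):
    """Return the noun groups found in a list of (word, tag) pairs."""
    # Tags outside TAG_CODES become "x", which the pattern never matches.
    codes = "".join(TAG_CODES.get(tag, "x") for _, tag in tagged)
    return [
        " ".join(word for word, _ in tagged[m.start():m.end()])
        for m in NOUN_GROUP.finditer(codes)
    ]


sentence = [
    ("it", "PRON"), ("had", "VERB"), ("set", "VERB"), ("up", "PART"),
    ("a", "DET"), ("joint", "ADJ"), ("venture", "NOUN"),
    ("in", "PREP"), ("Taiwan", "NOUN"),
]
groups = noun_groups(sentence)  # ["it", "a joint venture", "Taiwan"]
```

A cascade like FASTUS would run further automata over this output, for example to recognize verb groups and then whole events.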

  16. Sample FASTUS output
      Company Name: Bridgestone Sports Co.
      Verb Group: said
      Noun Group: Friday
      Noun Group: it
      Verb Group: had set up
      Noun Group: a joint venture
      Preposition: in
      Location: Taiwan
      Preposition: with
      Noun Group: a local concern
      Conjunction: and
      Noun Group: a Japanese trading house
      Verb Group: to produce
      Noun Group: golf clubs
      Verb Group: to be shipped
      Preposition: to
      Location: Japan

  17. Features and unification

  18. Introduction • Grammatical categories have properties • Constraint-based formalisms • Example: “this flights”: agreement is difficult to handle at the level of grammatical categories • Example: “many water”: count/mass nouns • Sample rule that takes features into account: S → NP VP (but only if the number of the NP is equal to the number of the VP)

  19. Feature structures • Flat structure: [CAT NP, NUMBER SINGULAR, PERSON 3] • Nested structure: [CAT NP, AGREEMENT [NUMBER SG, PERSON 3]] • Feature paths: {x agreement number}
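
One simple way to picture these structures is as nested dictionaries, with a feature path being a sequence of keys. This is only an illustrative encoding; the names `np_flat`, `np_nested` and `follow_path` are assumptions made for the example.

```python
# The two feature structures from the slide, encoded as dictionaries.
np_flat = {"CAT": "NP", "NUMBER": "SINGULAR", "PERSON": 3}
np_nested = {"CAT": "NP", "AGREEMENT": {"NUMBER": "SG", "PERSON": 3}}


def follow_path(structure, path):
    """Follow a feature path; return None if the path does not exist."""
    value = structure
    for feature in path:
        if not isinstance(value, dict) or feature not in value:
            return None
        value = value[feature]
    return value


number = follow_path(np_nested, ["AGREEMENT", "NUMBER"])  # "SG"
```

Grouping NUMBER and PERSON under AGREEMENT lets a rule refer to both at once through the single path AGREEMENT.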

  20. Unification • [NUMBER SG] ⊔ [NUMBER SG] succeeds (+) • [NUMBER SG] ⊔ [NUMBER PL] fails (−) • [NUMBER SG] ⊔ [NUMBER []] = [NUMBER SG] • [NUMBER SG] ⊔ [PERSON 3] = ?
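
The cases on this slide can be checked with a small unifier over nested dictionaries. This is a sketch under simplifying assumptions: the function name `unify` is hypothetical, an empty dictionary stands for the unspecified value [], failure is signalled by `None`, and shared (reentrant) structures are not supported.

```python
def unify(f, g):
    """Unify two feature structures; return None if they are incompatible."""
    if f == {}:                      # [] carries no information
        return g
    if g == {}:
        return f
    if isinstance(f, dict) and isinstance(g, dict):
        result = dict(f)             # copy so the inputs are left untouched
        for feature, value in g.items():
            if feature in result:
                merged = unify(result[feature], value)
                if merged is None:
                    return None      # a clash anywhere fails the whole unification
                result[feature] = merged
            else:
                result[feature] = value
        return result
    return f if f == g else None     # atomic values must be identical


same = unify({"NUMBER": "SG"}, {"NUMBER": "SG"})
clash = unify({"NUMBER": "SG"}, {"NUMBER": "PL"})
filled = unify({"NUMBER": "SG"}, {"NUMBER": {}})
combined = unify({"NUMBER": "SG"}, {"PERSON": 3})
```

The last case answers the question on the slide: the two structures mention different features, so nothing clashes and the result carries both, [NUMBER SG, PERSON 3].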

  21. Agreement • S → NP VP, with {NP AGREEMENT} = {VP AGREEMENT} • “Does this flight serve breakfast?” • “Do these flights serve breakfast?” • S → Aux NP VP, with {Aux AGREEMENT} = {NP AGREEMENT}

  22. Agreement • “These flights” • “This flight” • NP → Det Nominal, with {Det AGREEMENT} = {Nominal AGREEMENT} • Verb → serve, with {Verb AGREEMENT NUMBER} = PL • Verb → serves, with {Verb AGREEMENT NUMBER} = SG
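
As an illustration of the NP rule with its agreement constraint, the sketch below builds an NP only when the determiner and the nominal agree. The lexicon entries and the names `LEXICON`, `agree` and `build_np` are assumptions for the example, and agreement values are kept flat (no nested features) to keep the check short.

```python
# Hypothetical lexicon with agreement features.
LEXICON = {
    "this": {"CAT": "Det", "AGREEMENT": {"NUMBER": "SG"}},
    "these": {"CAT": "Det", "AGREEMENT": {"NUMBER": "PL"}},
    "flight": {"CAT": "Nominal", "AGREEMENT": {"NUMBER": "SG"}},
    "flights": {"CAT": "Nominal", "AGREEMENT": {"NUMBER": "PL"}},
}


def agree(a, b):
    """True if the two flat agreement structures have no conflicting feature."""
    return all(b[feature] == value for feature, value in a.items() if feature in b)


def build_np(det_word, nominal_word, lexicon=LEXICON):
    """Apply NP -> Det Nominal; return None if the constraint fails."""
    det, nominal = lexicon[det_word], lexicon[nominal_word]
    if det["CAT"] != "Det" or nominal["CAT"] != "Nominal":
        return None
    if not agree(det["AGREEMENT"], nominal["AGREEMENT"]):
        return None                  # e.g. "this flights"
    # The NP inherits the combined agreement information.
    return {"CAT": "NP", "AGREEMENT": {**det["AGREEMENT"], **nominal["AGREEMENT"]}}


good = build_np("these", "flights")
bad = build_np("this", "flights")    # None: SG clashes with PL
```

Without the constraint, a plain CFG would need separate categories such as singular and plural determiners and nominals to rule out "this flights".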

  23. Subcategorization • VP → Verb, with {VP HEAD} = {Verb HEAD} and {VP HEAD SUBCAT} = INTRANS • VP → Verb NP, with {VP HEAD} = {Verb HEAD} and {VP HEAD SUBCAT} = TRANS • VP → Verb NP NP, with {VP HEAD} = {Verb HEAD} and {VP HEAD SUBCAT} = DITRANS
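
The effect of the SUBCAT feature can be illustrated with a small check that a verb appears with the number of NP complements its frame requires. The verb entries and the names `VERB_SUBCAT`, `FRAME_COMPLEMENTS` and `vp_is_licensed` are assumptions for the example, not data from the lecture.

```python
# Hypothetical subcategorization frames for a few verbs.
VERB_SUBCAT = {"sleep": "INTRANS", "book": "TRANS", "give": "DITRANS"}

# Number of NP complements each VP rule on the slide expects.
FRAME_COMPLEMENTS = {"INTRANS": 0, "TRANS": 1, "DITRANS": 2}


def vp_is_licensed(verb, np_complements):
    """True if the verb's SUBCAT value matches the VP rule being applied."""
    frame = VERB_SUBCAT.get(verb)
    if frame is None:
        return False                 # unknown verb: no frame to check
    return FRAME_COMPLEMENTS[frame] == len(np_complements)


ok = vp_is_licensed("book", ["that flight"])
blocked = vp_is_licensed("sleep", ["that flight"])
```

In the feature-based grammar the same effect comes from unification: the VP rule fixes SUBCAT on the shared HEAD, and a verb whose lexical entry carries a different SUBCAT value fails to unify.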

  24. Readings for next time • J&M Chapters 12, 13, 20 • Lecture notes #4 • FUF/CFUF documentation
