Computational lexicology, morphology and syntax Course 9 Diana Trandabăț 2013-2014
Dependency parsing The syntactic parsing of a sentence consists of finding the correct syntactic structure of that sentence in a given formalism/grammar. Dependency Grammar (DG) and Phrase Structure Grammar (PSG) are two such formalisms.
Phrase Structure Grammar (PSG) • Breaks a sentence into constituents (phrases) • Which are then broken into smaller constituents • Describes phrase structure, clause structure • E.g. NP, PP, VP, etc. • Structures are often recursive • The clever tall blue-eyed old … man
Constituents & Constituency Tests • Constituents = the natural groupings of a sentence • A constituent is a string of words such that there is one node that dominates those words and no other words.
Two tree diagrams, shown here as labeled bracketings: Tree 1 ("Sam climbed up the ladder"): [S [NP [N Sam]] [VP [V climbed] [PP [P up] [NP [Det the] [N ladder]]]]] • Tree 2 ("Sam picked up the ladder"): [S [NP [N Sam]] [VP [V [V picked] [P up]] [NP [Det the] [N ladder]]]]
Discussion • What are the constituents of Tree 1 and Tree 2? • Which strings of words are constituents in one tree but not the other?
Well-formed constituents • A group of words forms a constituent if… • 1) it can stand alone • Ex. "What did you find?" "A puppy" (not "found a") • 2) a pronoun can substitute for the natural group • Ex. "Where did you find a puppy?" "I found HIM in the park." • 3) it can be moved as a unit • Ex. It was [a puppy] that the child found. • [A puppy] was found by the child.
Sentences: Hierarchical Organization • The boy raced the girl. • Words grouped into natural units: [The boy] [raced the girl]. • Further division: [[The][boy]] [[raced][[the][girl]]]. • Tree diagram (as a labeled bracketing): [[The boy]noun phrase [raced [the girl]noun phrase]verb phrase]
Exercise • Goals of the exercise: • Relying on tests when your intuition fails • Adapting to inconsistent results • (e.g., find evidence for disqualifying some of the tests) • The five trees on the following slide have all been proposed by linguists, in published articles, for the sentence: He has been writing a letter.
[Figure: five tree diagrams (TREE 1 to TREE 5) for "He has been writing a letter", built from nodes such as S, VP, AUX, PERF, PROG, V and NP; the trees differ in how the auxiliaries has and been are attached (flat AUX nodes vs. stacked VPs).]
Dependency Grammar • Syntactic structure consists of lexical items, linked by binary asymmetric relations called dependencies • Interested in grammatical relations between individual words (governing & dependent words) • Does not propose a recursive structure • Rather a network of relations • These relations can also have labels
Dependency Tree Example • Phrasal nodes are missing in the dependency structure when compared to the constituency structure.
Comparison • Dependency structures explicitly represent • Head-dependent relations (directed arcs) • Functional categories (arc labels) • Possibly some structural categories (parts of speech) • Phrase structures explicitly represent • Phrases (non-terminal nodes) • Structural categories (non-terminal labels) • Possibly some functional categories (grammatical functions)
Comparison • Dependency parsing is more straightforward • Parsing can be reduced to labeling each token pair wi and wj • Direct encoding of predicate-argument structure • Fragments are directly interpretable • Dependency structure is independent of word order • Suitable for free word order languages (like Indian languages)
Predicate-argument structure • Predicates (predicational words) designate events, properties of, or relations between, entities. • Linguistic expressions can be dependent or independent. • hat - can be understood outside any circumstance, time, or person; it is an individual. • red - cannot be understood outside its association with an individual: red hat. • In linguistic terms, dependent phenomena are predicates, while individuals are arguments. • Predication - the way individuals instantiate properties, actions, attributes and states.
Predicate-argument structure • Who, what, where, when, why? • Predicates • Verbs: sell, buy, cost, etc. • Nouns: acquisition, etc. • Arguments • buyer • seller • goods • money • time • etc.
Examples of types of Arguments (Semantic Roles) Agent: [Bill]Agent ate his soup quietly. Experiencer: The smell of lilies filled [Jennifer's]Experiencer nostrils. Theme: I like [Kim]Theme. Patient: The falling rocks crushed [the car]Patient. Instrument: Jamie cut the ribbon [with a pair of scissors]Instrument. Location: Johnny and Linda played carelessly [in the park]Location. Recipient: I sent [John]Recipient the letter. Source: The rocket was launched [from Central Command]Source. Time: The rocket was launched [yesterday]Time. Beneficiary: I baked [Reggie]Beneficiary a cake. Manner: [With great urgency]Manner, Agatha phoned 911. Purpose: Agatha phoned 911 right away [in order to get some help]Purpose. Cause: [Since Clyde was hungry]Cause, he ate the cake.
Valence • A predicational word needs, in order to complete its sense, arguments (mandatory) and adjuncts (optional). • Arguments: • John reads a book. (valence 2) • John gives Mary a book. (valence 3) • Adjuncts (circumstantial complements): • John reads a book in the train. • John gave Mary a book for her birthday. • Arguments can be established for verbs, nouns or adjectives: • John presented the situation. • John's presentation of the situation was very clear. • The presented situation was simple.
Dependency Tree • Formal definition • An input word sequence w1…wn • Dependency graph D = (W, E) where • W is the set of nodes, i.e. word tokens in the input sequence • E is the set of unlabeled tree edges (wi, wj), with wi, wj ∈ W • (wi, wj) indicates an edge from wi (parent) to wj (child) • The task of mapping an input string to a dependency graph satisfying certain conditions is called dependency parsing
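This definition maps directly onto a minimal data structure. A sketch in Python (the class name and the analysis of "John reads a book" are illustrative, not taken from any particular library):

```python
# Minimal encoding of the definition above: D = (W, E), with W as the
# token list and E as a set of (parent, child) index pairs.
from dataclasses import dataclass, field

@dataclass
class DependencyGraph:
    words: list                              # w1 ... wn
    edges: set = field(default_factory=set)  # {(i, j)}: wi (parent) -> wj (child)

    def add_edge(self, head: int, child: int) -> None:
        self.edges.add((head, child))

# "John reads a book": reads is the root; John and book depend on reads,
# and a depends on book.
g = DependencyGraph(["John", "reads", "a", "book"])
g.add_edge(1, 0)   # reads -> John
g.add_edge(1, 3)   # reads -> book
g.add_edge(3, 2)   # book -> a
```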
Well-formedness • A dependency graph is well-formed iff • Single head: Each word has only one head. • Acyclic: The graph should be acyclic. • Connected: The graph should be a single tree with all the words in the sentence. • Projective: If word A depends on word B, then all words between A and B are also subordinate to B (i.e. dominated by B).
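These four conditions can be checked mechanically over the edge set. A hedged sketch, assuming the `DependencyGraph` class from the previous example:

```python
def is_well_formed(g: DependencyGraph) -> bool:
    """Check the four well-formedness conditions on an unlabeled graph."""
    n = len(g.words)
    heads = {}
    # Single head: no word may occur as the child of two edges.
    for h, c in g.edges:
        if c in heads:
            return False
        heads[c] = h
    # Connected + acyclic: exactly one word has no head (the root), and
    # following head links upward from any word must reach it.
    roots = [w for w in range(n) if w not in heads]
    if len(roots) != 1:
        return False
    for start in range(n):
        seen, w = set(), start
        while w in heads:
            if w in seen:              # a cycle was entered
                return False
            seen.add(w)
            w = heads[w]
        if w != roots[0]:              # disconnected fragment
            return False
    # Projective: every word strictly between a head h and its child c
    # must be dominated by h.
    def dominated_by(h, w):
        while w in heads:
            w = heads[w]
            if w == h:
                return True
        return False
    for h, c in g.edges:
        lo, hi = min(h, c), max(h, c)
        if any(not dominated_by(h, w) for w in range(lo + 1, hi)):
            return False
    return True

print(is_well_formed(g))   # True for the "John reads a book" graph above
```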
Non-projective dependency tree • Example: Ram saw a dog yesterday which was a Yorkshire Terrier • The arc from dog to its relative clause (which was a Yorkshire Terrier) crosses the arc to yesterday: crossing lines indicate non-projectivity • English has very few non-projective cases.
Dependency Parsing • Dependency-based parsers can be broadly categorized into • Grammar-driven approaches • Parsing done using grammars • Data-driven approaches • Parsing by training on annotated/unannotated data • These approaches are not mutually exclusive
Parsing Methods • Three main traditions • Dynamic programming • CKY, Eisner, McDonald • Constraint satisfaction • Maruyama, Foth et al., Duchier • Deterministic search • Covington, Yamada and Matsumoto, Nivre
Dynamic Programming • Basic idea: treat dependencies as constituents • Use, e.g., a CKY parser (with minor modifications)
Dependency Chart Parsing The grammar is regarded as context-free, in which each node is lexicalized Chart entries are subtrees, i.e., words with all their left and right dependents Problem: different entries for different subtrees spanning a sequence of words with different heads Time requirement: O(n⁵)
Dynamic Programming Approaches Original version [Hays 1964] (grammar driven) Link grammar [Sleator and Temperley 1991] (grammar driven) Bilexical grammar [Eisner 1996] (data driven) Maximum spanning tree [McDonald 2006] (data driven)
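As a concrete illustration of the dynamic-programming tradition, here is a compact sketch of first-order projective decoding in the style of Eisner's algorithm. The `score` matrix is assumed to come from some arc-scoring model; this is a didactic reconstruction, not the original implementation:

```python
def eisner(score):
    """First-order projective decoding in the style of [Eisner 1996].
    score[h][m] is the score of the arc h -> m; token 0 is an artificial ROOT.
    Returns heads, where heads[m] is the head of token m (heads[0] unused)."""
    n = len(score)
    NEG = float("-inf")
    # Items over spans [i, j]; direction d = 1 means the head is i,
    # d = 0 means the head is j. Incomplete items still await dependents.
    comp = [[[NEG, NEG] for _ in range(n)] for _ in range(n)]
    inco = [[[NEG, NEG] for _ in range(n)] for _ in range(n)]
    comp_bp = [[[0, 0] for _ in range(n)] for _ in range(n)]
    inco_bp = [[[0, 0] for _ in range(n)] for _ in range(n)]
    for i in range(n):
        comp[i][i][0] = comp[i][i][1] = 0.0
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            # Incomplete items: add the arc between i and j over a split r.
            r, best = max(((r, comp[i][r][1] + comp[r + 1][j][0])
                           for r in range(i, j)), key=lambda x: x[1])
            inco[i][j][0] = best + score[j][i]    # arc j -> i
            inco[i][j][1] = best + score[i][j]    # arc i -> j
            inco_bp[i][j][0] = inco_bp[i][j][1] = r
            # Complete items: combine a complete and an incomplete item.
            r0, b0 = max(((r, comp[i][r][0] + inco[r][j][0])
                          for r in range(i, j)), key=lambda x: x[1])
            comp[i][j][0], comp_bp[i][j][0] = b0, r0
            r1, b1 = max(((r, inco[i][r][1] + comp[r][j][1])
                          for r in range(i + 1, j + 1)), key=lambda x: x[1])
            comp[i][j][1], comp_bp[i][j][1] = b1, r1
    heads = [0] * n
    def backtrack(i, j, d, complete):
        if i == j:
            return
        if complete:
            r = comp_bp[i][j][d]
            if d == 0:
                backtrack(i, r, 0, True); backtrack(r, j, 0, False)
            else:
                backtrack(i, r, 1, False); backtrack(r, j, 1, True)
        else:
            heads[i if d == 0 else j] = j if d == 0 else i
            r = inco_bp[i][j][d]
            backtrack(i, r, 1, True); backtrack(r + 1, j, 0, True)
    backtrack(0, n - 1, 1, True)
    return heads
```

Because spans are combined only at a single split point, this runs in O(n³), compared with the O(n⁵) of naive lexicalized chart parsing mentioned above.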
Constraint Satisfaction Uses Constraint Dependency Grammar Grammar consists of a set of boolean constraints, i.e. logical formulas that describe well-formed trees A constraint is a logical formula with variables that range over a set of predefined values Parsing is defined as a constraint satisfaction problem Constraint satisfaction removes values that contradict constraints
Constraint Satisfaction • Parsing is an eliminative process rather than a constructive one such as CFG parsing • Constraint satisfaction in general is NP-complete • Parser design must ensure practical efficiency • Different approaches • Constraint propagation techniques which ensure local consistency [Maruyama 1990] • Weighted CDG [Foth et al. 2000, Menzel and Schroder 1998]
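A toy illustration of the eliminative idea: every word starts with all possible heads as candidates, and constraints remove values that contradict them. The three constraints below are invented for this example; real CDG grammars use far larger and richer constraint sets:

```python
# Eliminative sketch: each word m starts with every position (and ROOT = -1)
# as a candidate head; constraints prune the candidate sets.
words = ["John", "reads", "a", "book"]
tags  = ["NOUN", "VERB", "DET", "NOUN"]

candidates = {m: {-1} | {h for h in range(len(words)) if h != m}
              for m in range(len(words))}

constraints = [
    lambda h, m: not (h == -1 and tags[m] != "VERB"),  # only a verb attaches to ROOT
    lambda h, m: not (h != -1 and tags[h] == "DET"),   # a determiner governs nothing
    lambda h, m: not (tags[m] == "DET"
                      and (h == -1 or tags[h] != "NOUN")),  # DET modifies a noun
]

for m in candidates:
    candidates[m] = {h for h in candidates[m]
                     if all(c(h, m) for c in constraints)}

print(candidates)   # pruned candidate sets; e.g. 'a' may now only attach to a noun
```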
Weighted Constraint Parsing Robust parser, which uses soft constraints Each constraint is assigned a weight between 0.0 and 1.0 Weight 0.0: hard constraint, can only be violated when no other parse is possible Constraints assigned manually (or estimated from treebank) Efficiency: uses a heuristic transformation-based constraint resolution method
Transformation-Based Constraint Resolution Heuristic search Very efficient Idea: first construct an arbitrary dependency structure, then try to correct errors Error correction by transformations Selection of transformations based on constraints that cause conflicts Anytime property: the parser maintains a complete analysis at any time → it can be stopped at any time and return a complete analysis
Deterministic Parsing • Basic idea: • Derive a single syntactic representation (dependency graph) through a deterministic sequence of elementary parsing actions • Sometimes combined with backtracking or repair • Motivation: • Psycholinguistic modeling • Efficiency • Simplicity
Shift-Reduce Type Algorithms • Data structures: • Stack [. . . , wi]S of partially processed tokens • Queue [wj, . . .]Q of remaining input tokens • Parsing actions built from atomic actions: • Adding arcs (wi → wj, wi ← wj) • Stack and queue operations • Left-to-right parsing • Restricted to projective dependency graphs
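A minimal sketch of this transition system in Python, with three arc-standard style actions; `choose_action` stands in for the oracle or classifier discussed later:

```python
def parse(words, choose_action):
    """Transition-based parsing with a stack, a queue and three actions.
    choose_action(stack, queue, arcs) returns 'SHIFT', 'LEFT' or 'RIGHT'."""
    stack, queue, arcs = [], list(range(len(words))), []
    while queue or len(stack) > 1:
        action = choose_action(stack, queue, arcs)
        if action == "SHIFT" and queue:
            stack.append(queue.pop(0))            # move next token onto the stack
        elif action == "LEFT" and len(stack) >= 2:
            head, dep = stack[-1], stack.pop(-2)  # arc: second <- top (wi <- wj)
            arcs.append((head, dep))
        elif action == "RIGHT" and len(stack) >= 2:
            dep, head = stack.pop(), stack[-1]    # arc: second -> top (wi -> wj)
            arcs.append((head, dep))
        else:
            break   # no applicable action: stop and return a partial analysis
    return arcs
```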
Yamada and Matsumoto Parsing in several rounds: deterministic bottom-up, O(n²) Looks at pairs of words 3 actions: shift, left, right Shift: shifts focus to the next word pair
Yamada and Matsumoto • Left: decides that the left word depends on the right one • Right: decides that the right word depends on the left one
Parsing Algorithm • Go through each pair of words • Decide which action to take • If a relation was detected in a pass, do another pass • E.g. the little girl • First pass: relation between little and girl • Second pass: relation between the and girl • The decision on the action depends on the word pair and its context (see the sketch below)
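A sketch of this pass-based control loop on the slide's example; the `decide` function here is a hard-coded stand-in for the trained classifier:

```python
def passes(words, decide):
    """Yamada-Matsumoto style control loop: sweep left to right over
    adjacent word pairs; repeat while some pass adds a relation."""
    remaining = list(range(len(words)))   # positions not yet attached
    arcs = []                             # (head, dependent) pairs
    changed = True
    while changed and len(remaining) > 1:
        changed = False
        i = 0
        while i < len(remaining) - 1:
            l, r = remaining[i], remaining[i + 1]
            action = decide(words, l, r)
            if action == "LEFT":     # left word depends on the right one
                arcs.append((r, l)); remaining.pop(i); changed = True
            elif action == "RIGHT":  # right word depends on the left one
                arcs.append((l, r)); remaining.pop(i + 1); changed = True
            else:                    # SHIFT: move focus to the next pair
                i += 1
    return arcs

# Toy decision rule for "the little girl": determiners and adjectives
# attach to a noun on their right.
tags = {"the": "DET", "little": "ADJ", "girl": "NOUN"}
def decide(ws, l, r):
    return "LEFT" if tags[ws[r]] == "NOUN" and tags[ws[l]] in ("DET", "ADJ") else "SHIFT"

print(passes(["the", "little", "girl"], decide))
# [(2, 1), (2, 0)]: pass 1 attaches little to girl, pass 2 attaches the to girl
```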
Classifier-Based Parsing • Data-driven deterministic parsing: • Deterministic parsing requires an oracle • An oracle can be approximated by a classifier • A classifier can be trained using treebank data • Learning algorithms: • Support vector machines (SVM) [Kudo and Matsumoto 2002, Yamada and Matsumoto 2003, Isozaki et al. 2004, Cheng et al. 2004, Nivre et al. 2006] • Memory-based learning (MBL) [Nivre et al. 2004, Nivre and Scholz 2004] • Maximum entropy modeling (MaxEnt) [Cheng et al. 2005]
Feature Models • Learning problem: • Approximate a function from parser states, represented by feature vectors, to parser actions, given a training set of gold-standard derivations • Typical features: • Tokens and POS tags of: • Target words • Linear context (neighbors in S and Q) • Structural context (parents, children, siblings in G) • Cannot be used in dynamic programming algorithms
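A hedged sketch of such a feature model for a transition-based parser state; the feature names and templates are illustrative only, not a fixed standard:

```python
def extract_features(stack, queue, arcs, words, tags):
    """Map a parser state to a sparse feature dict for a classifier.
    Templates cover target words, linear context and structural context."""
    def tok(i): return words[i] if 0 <= i < len(words) else "<NULL>"
    def pos(i): return tags[i]  if 0 <= i < len(tags)  else "<NULL>"
    s1 = stack[-1] if stack else -1            # top of stack S
    q1 = queue[0]  if queue else -1            # front of queue Q
    q2 = queue[1]  if len(queue) > 1 else -1   # linear context in Q
    feats = {
        f"s1.w={tok(s1)}": 1.0, f"s1.t={pos(s1)}": 1.0,   # target words
        f"q1.w={tok(q1)}": 1.0, f"q1.t={pos(q1)}": 1.0,
        f"q2.t={pos(q2)}": 1.0,                           # linear context
        f"s1.t+q1.t={pos(s1)}+{pos(q1)}": 1.0,            # pair feature
    }
    # Structural context: leftmost/rightmost children of s1 in the graph G
    children = sorted(d for h, d in arcs if h == s1)
    if children:
        feats[f"s1.lc.t={pos(children[0])}"] = 1.0
        feats[f"s1.rc.t={pos(children[-1])}"] = 1.0
    return feats
```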
Dependency Parsers for download • MST parser by Ryan McDonald • Malt parser by Joakim Nivre • Stanford parser
Great! See you next time!