Generative Lexicon: Idea and Practicality
Debasri Chakrabarti (02408601)
Guide: Prof. Milind S. Malshe
Co-Guide: Prof. Pushpak Bhattacharyya
Overview
• Introduction
• Polysemy and the Logical Problem of Polysemy
• Generative Lexicon Theory
• Lexicon Building
• Applications and Limitations of GLT
• Conclusion
Introduction
• Lexicon: ideally, a collection of all the words of a language
• Information stored in a lexicon:
  • Phonetic information: pronunciation
  • Semantic information: meaning
  • Morphological information: transitivity and intransitivity (verbs), count vs. mass (nouns)
Lexicon (contd…)
Example of "eat" in the Oxford Advanced Learner's Dictionary:
eat /i:t/ v (pt ate /et/; pp eaten /i:tn/): 1. sth (up) to put food into the mouth, chew and swallow it: he was too ill to eat
• Lexical entry: eat
• Pronunciation: /i:t/
• Morphological information: v (pt ate /et/; pp eaten /i:tn/)
• Meaning: to put food into the mouth, chew and swallow it
Mental Lexicon
• Mental lexicon: the information stored in the mind of a native speaker
• Native speakers store:
  • Phonetic information: pronunciation
  • Semantic information: meaning
  • Morphological information: transitivity vs. intransitivity (verbs), count vs. mass (nouns)
  • Additional information: use of a word in a new context, syntactic environment of a word, word-formation rules
Example of Mental Lexicon
• Example of eat in a native speaker's mind
  • Pronunciation: long /i:/ is used in eat
  • Grammatical information: the past tense is ate /et/
  • Word-formation rules: /-s/ is the third person singular present tense marker, as in he eats
  • Meaning:
    1. Take in solid food: she ate a banana
    2. Take a meal: we did not eat until 10 P.M.
    3. Worry or cause anxiety in a persistent way: what's eating you up
  • Syntactic information: eat needs an agent to perform the action; the agent role is obligatory
Lexicon in Computational Linguistics
• A lexicon meant for Natural Language Processing (NLP) must have the following properties:
  • Morphological information
    • Part-of-speech information
    • Rules to deal with both regular and irregular forms, e.g. ate (past tense of eat), men (plural of man)
  • Semantic information
    • Ability to handle lexical ambiguity
  • Syntactic information
    • e.g. action verbs will always have an agent
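The requirement that rules cover both regular and irregular forms can be sketched as a lookup table of exceptions with a fallback rule. This is only a minimal illustration: the table contents, function names, and the simplified English spelling rules are assumptions, not part of the presentation.

```python
# Toy lexicon sketch. Table contents and function names are illustrative assumptions.
IRREGULAR_PAST = {"eat": "ate"}      # irregular verb forms are listed explicitly
IRREGULAR_PLURAL = {"man": "men"}    # irregular noun forms are listed explicitly


def past_tense(verb: str) -> str:
    """Return the irregular form if one is listed, else apply a regular rule."""
    if verb in IRREGULAR_PAST:
        return IRREGULAR_PAST[verb]
    # Simplified regular rule: add -d after a final e, otherwise -ed.
    return verb + "d" if verb.endswith("e") else verb + "ed"


def plural(noun: str) -> str:
    """Return the irregular plural if one is listed, else apply a regular rule."""
    if noun in IRREGULAR_PLURAL:
        return IRREGULAR_PLURAL[noun]
    # Simplified regular rule: -es after sibilant endings, otherwise -s.
    return noun + "es" if noun.endswith(("s", "x", "z", "ch", "sh")) else noun + "s"


print(past_tense("eat"), past_tense("walk"), plural("man"), plural("bank"))
```

Checking the exception table first keeps the regular rule small; a realistic morphological analyser would need many more spelling rules than this sketch has.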
Polysemy and the Logical Problem of Polysemy
Polysemy
• An individual word can have an indefinite number of subtly different meanings
• Natural languages are highly polysemous
• This creates ambiguity
• Weinreich distinguishes between two types of ambiguity:
  • Contrastive ambiguity
  • Complementary polysemy
Polysemy and the Logical Problem of Polysemy (contd…)
Contrastive Ambiguity
• A lexical item carries two distinct, unrelated meanings
• This is a case of homonymy: words that are spelled or pronounced in the same way but have different meanings
Example:
• bank: a financial institution
• bank: a place beside a body of water
Polysemy and the Logical Problem of Polysemy (contd…)
Complementary Polysemy
• The senses are manifestations of the same basic sense
• Denotes a relation among different senses
Example:
• John crawled through the window. (Sense 1: aperture)
• The window is closed. (Sense 2: physical object)
Sense Enumeration Lexicon (SEL)
• The simplest model of lexical design for capturing logical polysemy
• Widely accepted in both computational and theoretical linguistics
• The direct approach to handling polysemy: allow the lexicon to list a word multiple times, each listing annotated with a separate meaning or lexical sense
Sense Enumeration Lexicon (SEL)
• Example of Contrastive Senses
  bank1: CAT = count-noun, GENUS = financial-institution
  bank2: CAT = count-noun, GENUS = shore
Sense Enumeration Lexicon (SEL)
• Example of Complementary Polysemy
  window1: CAT = count-noun, GENUS = aperture
  window2: CAT = count-noun, GENUS = artifact
Sense Enumeration Lexicon (SEL)
• Possible Modification for Complementary Polysemy in SEL
  window
    sense1: CAT = count-noun, GENUS = aperture
    sense2: CAT = count-noun, GENUS = artifact
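A sense enumeration lexicon can be sketched as a table that lists every sense of a word separately. The class and function names below are hypothetical, and the first window genus is written as aperture to agree with the phys-obj.aperture paradigm discussed under the Lexical Conceptual Paradigm.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Sense:
    """One enumerated sense: a label plus the CAT and GENUS features."""
    label: str
    cat: str
    genus: str


# Each word maps to a flat list of senses (hypothetical container name).
SEL: Dict[str, List[Sense]] = {
    "bank": [
        Sense("bank1", "count-noun", "financial-institution"),
        Sense("bank2", "count-noun", "shore"),
    ],
    "window": [
        Sense("window1", "count-noun", "aperture"),
        Sense("window2", "count-noun", "artifact"),
    ],
}


def senses(word: str) -> List[Sense]:
    """Return every listed sense of a word; unknown words have none."""
    return SEL.get(word, [])


for sense in senses("window"):
    print(sense.label, sense.genus)
```

Note that nothing in this structure distinguishes the unrelated senses of bank from the related senses of window: both are just lists, which is the weakness the following slides address.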
Generative Lexicon Theory (GLT)
• Major problems for lexical semantics:
  • to explain the polymorphic nature of language
  • to characterize the semanticality of natural language utterances
  • to capture the creative use of words in novel contexts
  • to develop a richer, co-compositional semantic representation
• Generative Lexicon Theory
  • developed by James Pustejovsky
  • a crucial aspect of GLT is the representation and treatment of polysemy
  • it examines the meanings of words to see their range of polysemy
Methodology of Generative Lexicon Theory
The generative lexicon involves the following levels of representation:
• Argument Structure
  • True Arguments
  • Default Arguments
  • Shadow Arguments
  • True Adjuncts
• Event Structure
• Qualia Structure
  • Formal
  • Constitutive
  • Telic
  • Agentive
Argument Structure
• True Arguments: syntactically realized parameters of the lexical item
  Example: John arrived late
• Default Arguments: logically present in the expression but not necessarily expressed syntactically
  Example: John built the house out of bricks
• True Adjuncts:
  • modify the logical expression
  • are part of the situational interpretation
  Example: She drove down to New York on Tuesday.
Argument Structure (contd…)
• Shadow Arguments: semantically incorporated in the lexical item and expressed only through discourse specification and contextual factors
  Example: Mary buttered her toast
  • the hidden argument is the material being spread on the toast
  • these are not optional arguments; they are expressible only under specific conditions
  • they refer to semantic content that is not necessarily expressed in syntax
  Example: Mary buttered her toast with margarine
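The argument types above can be sketched as fields of one record per verb. The role labels ("agent", "material", and so on) and the rule used for shadow arguments are assumptions made for illustration; the slides only say that shadow arguments are expressible under specific conditions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ArgumentStructure:
    """Argument structure of one verb (role labels are hypothetical)."""
    true_args: List[str]                                   # must be realized syntactically
    default_args: List[str] = field(default_factory=list)  # logically present, optional in syntax
    shadow_args: List[str] = field(default_factory=list)   # incorporated in the verb itself

    def shadow_expressible(self, filler: str) -> bool:
        """Assumed rule: a shadow argument may surface only when the phrase
        names something other than the material the verb already incorporates."""
        return bool(self.shadow_args) and filler not in self.shadow_args


# "John built the house out of bricks": the material is a default argument.
build = ArgumentStructure(true_args=["agent", "artifact"], default_args=["material"])

# "Mary buttered her toast": the spread material is a shadow argument.
butter = ArgumentStructure(true_args=["agent", "surface"], shadow_args=["butter"])

print(butter.shadow_expressible("margarine"))  # "with margarine" adds information
print(butter.shadow_expressible("butter"))     # "with butter" adds nothing
```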
Event Structure
• specifies the event type of a lexical item or a phrase
• events can be sub-classified into at least three sorts: State, Process and Transition
Event structure of build, as found in the following expressions:
• They are building a new house
• The house was built by John
build
  EVENTSTR = E1 = process
             E2 = state
Qualia Structure
• gives the relational force of a lexical item
• composed of four qualia roles:
  • Formal: distinguishes a lexical item within a larger domain
  • Constitutive: the relation between an object and its constituent parts
  • Telic: specifies the purpose and function of a lexical item
  • Agentive: indicates the factors involved in the origin of a lexical item
Qualia Structure (contd…)
Qualia structure for novel:
novel
  QUALIA = CONST = narrative
           FORMAL = book
           TELIC = reading
           AGENT = writing
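The qualia structure for novel can be written down as a small record with one field per role. The class name and the helper function are hypothetical; only the four role values come from the slide.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Qualia:
    """The four qualia roles; a role may be left unspecified."""
    formal: Optional[str] = None        # what kind of thing it is
    constitutive: Optional[str] = None  # what it is made of
    telic: Optional[str] = None         # what it is for
    agentive: Optional[str] = None      # how it comes into being


novel = Qualia(
    formal="book",
    constitutive="narrative",
    telic="reading",
    agentive="writing",
)


def describe(word: str, q: Qualia) -> str:
    """Render the specified roles as a one-line description (illustrative helper)."""
    parts = [f"{role}={value}" for role, value in vars(q).items() if value is not None]
    return f"{word}: " + ", ".join(parts)


print(describe("novel", novel))
```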
Lexical Conceptual Paradigm (LCP)
• The term is used by Pustejovsky and Anick (1988)
• Refers to the ability of a lexical item to cluster multiple senses
Example:
• John crawled through the window.
• The window is closed.
• Resulting LCP: phys-obj.aperture-lcp = [phys-obj] [aperture]
Generative Device
• Type Coercion
  • a lexical item or phrase is coerced to a semantic interpretation by a governing item in the phrase, without changing its syntactic type
  Examples:
    Mary wants John to leave
    Mary wants to leave
    Mary wants the book
• Function Application with Coercion
  • the verb takes different complement types
  • different interpretations of the verb arise for the different complements
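A rough sketch of type coercion: the verb expects an event-denoting complement, and a noun phrase complement is reinterpreted as an event while staying a noun phrase in syntax. Two things here are assumptions rather than claims from the slides: that the event is recovered from the noun's telic role, and that "read the book" is the intended reading of "wants the book". The type labels and function names are also hypothetical.

```python
# Hypothetical fragment of telic roles: noun -> activity the object is for.
TELIC = {"book": "read"}


def coerce_to_event(noun: str) -> str:
    """Reinterpret an entity-denoting noun as an event, using its telic role."""
    activity = TELIC.get(noun)
    if activity is None:
        raise ValueError(f"no telic role listed for {noun!r}")
    return f"{activity} the {noun}"


def want(subject: str, complement: str, complement_type: str) -> str:
    """Interpret 'want' with an event complement, coercing entities when needed."""
    if complement_type == "entity":
        # The governing verb forces an event reading on the noun phrase.
        complement = coerce_to_event(complement)
    return f"{subject} wants to {complement}"


print(want("Mary", "leave", "event"))
print(want("Mary", "book", "entity"))
```

The verb keeps a single lexical entry; the variation in meaning comes from the operation applied to the complement, not from listing separate senses of want.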
Generative Device
• Selective Binding
  • a lexical item or a phrase operates specifically on the substructure of a phrase, without changing the overall type in the composition
  Example: a good knife: a knife that cuts well
• Co-composition
  • multiple elements within a phrase behave as functors, generating new non-lexicalized senses for the words in composition
  Examples:
    John baked the potato
    John baked the cake
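Both devices can be sketched with two small lookup tables. The table contents and function names are illustrative assumptions; the artifact versus natural-object split for bake follows the treatment of cake and potato discussed under the limitations of GLT.

```python
# Hypothetical lexical fragments.
TELIC = {"knife": "cuts"}                                 # what the object is for
FORMAL = {"cake": "artifact", "potato": "natural-object"}  # what kind of thing it is


def good(noun: str) -> str:
    """Selective binding: 'good' modifies only the telic role of the noun."""
    return f"a {noun} that {TELIC[noun]} well"


def bake_sense(obj: str) -> str:
    """Co-composition: the object's type helps decide which sense of 'bake' arises."""
    if FORMAL[obj] == "artifact":
        return "creation"          # baking brings the cake into being
    return "change-of-state"       # baking only changes the potato


print(good("knife"))
print(bake_sense("cake"), bake_sense("potato"))
```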
Lexicon Building
• Building of WordNet
  • a lexical database organised in terms of concepts
  • concepts are related to one another through various semantic relations
• Building of a Universal Word Dictionary
  • building a lexicon for the Universal Networking Language
  • Universal Networking Language (UNL) is an electronic language for computers to express and exchange all kinds of information
• Creation of a Verb Hierarchy Tree
  • creating a verb knowledge base for the UNL system
Building of WordNet
• Different semantic relations in WordNet
  • Synonymy
  • Antonymy
  • Hypernymy and Hyponymy
  • Meronymy and Holonymy
  • Entailment and Troponymy
• Multiple Hypernymy in EuroWordNet
  • Disjunctive Hypernym
  • Conjunctive Hypernym
  • Nonexclusive Hypernym
Building of WordNet
• Disjunctive Hypernym
  • these are incompatible types that never apply simultaneously
  • found among nouns that refer to the participant in an event but do not restrict the type of entity participating
  Example: threat
    - Role Agent: threaten
    - Has Hypernym: person; disjunctive
    - Has Hypernym: thing; disjunctive
    - Has Hypernym: idea; disjunctive
Building of WordNet
• Conjunctive Hypernym
  • these are compatible types that always apply simultaneously
  • found for verbs in which multiple aspects are combined
• Dutch example: doodschoppen (to kick to death)
    - Has Hypernym: doden (to kill); conjunctive
    - Has Hypernym: schoppen (to kick); conjunctive
• Similar Hindi example: huMkarnaa: Dranao ko ilae jaaor ka Sabd krnaa (to shout to scare somebody)
    - Has Hypernym: Dranaa (to scare); conjunctive
    - Has Hypernym: icallaanaa (to shout); conjunctive
Building of WordNet
• Non-exclusive Hypernym
  • either both aspects apply simultaneously or only one of them applies
  Example: knife
    - Has Hypernym: weapon
    - Has Hypernym: cutlery
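The three kinds of multiple hypernymy differ only in which combinations of hypernyms may hold at once, which can be sketched as a check over a chosen subset. The data layout and function name are assumptions for illustration and do not reflect the EuroWordNet database format.

```python
from enum import Enum
from typing import Dict, List, Set, Tuple


class HypernymKind(Enum):
    DISJUNCTIVE = "disjunctive"    # exactly one hypernym applies at a time
    CONJUNCTIVE = "conjunctive"    # all hypernyms always apply together
    NONEXCLUSIVE = "nonexclusive"  # one or several may apply


# Entries taken from the slide examples.
HYPERNYMS: Dict[str, Tuple[HypernymKind, List[str]]] = {
    "threat": (HypernymKind.DISJUNCTIVE, ["person", "thing", "idea"]),
    "doodschoppen": (HypernymKind.CONJUNCTIVE, ["doden", "schoppen"]),
    "knife": (HypernymKind.NONEXCLUSIVE, ["weapon", "cutlery"]),
}


def consistent(word: str, chosen: Set[str]) -> bool:
    """Check whether a set of simultaneously applying hypernyms is allowed."""
    kind, hypernyms = HYPERNYMS[word]
    if not chosen or not chosen <= set(hypernyms):
        return False
    if kind is HypernymKind.DISJUNCTIVE:
        return len(chosen) == 1
    if kind is HypernymKind.CONJUNCTIVE:
        return chosen == set(hypernyms)
    return True  # non-exclusive: any non-empty subset is acceptable


print(consistent("threat", {"person"}), consistent("threat", {"person", "thing"}))
```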
Building of a Universal Word Dictionary
• Construction of Universal Words (UWs) in the Universal Networking Language (UNL)
• UNL: an electronic language for computers to express and exchange all kinds of information
• UW: a character string representing a unique concept
  • eat(icl>consume) as in he is eating
  • eat(icl>damage) as in the house was eaten up by the heat
  • represented by an English word
  • captures all the meanings conveyed by that word
  • restrictions are attached to create a unique sense
• UNL Knowledge Base (KB): performs the task of defining all possible relationships between two UWs
How to create a UW
I. First, a category is decided
  a. Nominal concept: (icl>thing) is attached
     e.g. swallow(icl>thing)
  b. Verbal concept:
     • (icl>do): concept of an event caused by something or someone
       change(icl>do) as in I changed my mind.
     • (icl>occur): concept of an event that happens of its own accord
       change(icl>occur) as in The weather will change.
     • (icl>be): concept of a state verb
       know(icl>be) as in I know you.
How to create a UW (contd…)
• To handle the ambiguity of a UW:
  • For a nominal concept, a subordinate category from the UW hierarchy should be used rather than thing.
    Examples:
    • swallow(icl>bird) as in the swallow is singing.
    • swallow(icl>action) as in he took the drink at [in] one swallow.
    • swallow(icl>quantity) as in take a swallow of water.
  • For a verbal concept, possible case relations are attached.
    • case relations are of the form obj>thing, obj>person, gol>thing
    Examples:
    • spring(icl>occur(obj>liquid)): expresses gushing out, as in to spring out
    • spring(icl>do(gol>place)): expresses jumping up, as in to spring up
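The construction steps above amount to attaching a category restriction to a headword and, for verbs, nesting case relations inside it. The helper below only reproduces the string shapes shown on the slides; its name and signature are hypothetical, and it is not an implementation of the UNL specification.

```python
from typing import Sequence


def make_uw(headword: str, category: str, cases: Sequence[str] = ()) -> str:
    """Build a UW string: headword(icl>category) or headword(icl>category(cases))."""
    restriction = f"icl>{category}"
    if cases:
        # Case relations such as obj>liquid are nested inside the category.
        restriction += "(" + ",".join(cases) + ")"
    return f"{headword}({restriction})"


# Nominal concepts: pick a subordinate category to separate the senses.
print(make_uw("swallow", "bird"))
print(make_uw("swallow", "quantity"))

# Verbal concepts: attach case relations to separate the senses.
print(make_uw("spring", "occur", ["obj>liquid"]))
print(make_uw("spring", "do", ["gol>place"]))
```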
Creation of a verb hierarchical tree
Creation of the verb knowledge base follows:
1. Beth Levin's methodology of verb alternation
   Example:
   a. Bill sold a car.
   b. Bill sold Tom a car.
2. The hypernymy relation of the English WordNet
   • A hypernym denotes the superset of a concept
   Example: animal is the hypernym of cat
Creation of a verb hierarchical tree (contd…)
• Beth Levin's alternations give the syntactic information.
• Hypernymy gives the semantic information.
• The classification is in the following manner:
  • "do(agt>thing,obj>thing {,gol>thing,src>thing,icl>do})"
  • "argue({icl>do(}agt>thing,obj>thing,ptn>thing{)})"
Creation of a verb hierarchical tree (contd…)
Format of the entry:
• 1 tab: "attack({icl>do(}agt>thing,obj>thing{)})"; Most wild animals won't attack humans unless they are provoked. / Army forces have been attacking (the town) since dawn with mortar and shell fire. / Napoleon attacked Russia in 1812 and was defeated and forced to retreat. (to make an attack on sb/sth)
• 2 tabs: "assault(icl>attack(agt>thing,obj>thing,man>emotionally))"; Nightmares assaulted him regularly. (to attack sb emotionally)
• 2 tabs: "assault(icl>attack(agt>thing,obj>thing,man>physically))"; He got two years' imprisonment for assaulting a police officer. [Vn] (to attack sb physically and violently, esp when this is a crime)
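Since the entry format encodes depth in the hierarchy by the number of leading tabs, the tree can be recovered with a simple stack. This is a sketch under assumptions: the function name is hypothetical, the example sentences and glosses are left out of the sample, and placing the top-level "do" entry at zero tabs is a guess about the file layout.

```python
import re
from typing import List, Optional, Tuple


def parse_entries(text: str) -> List[Tuple[str, Optional[str]]]:
    """Return (headword, parent headword) pairs from tab-indented entries."""
    stack: List[str] = []   # headwords on the path from the root to the current entry
    pairs: List[Tuple[str, Optional[str]]] = []
    for line in text.splitlines():
        body = line.lstrip("\t")
        match = re.match(r'"(\w+)\(', body)  # headword is the text before the first "("
        if not match:
            continue                         # skip blank or non-entry lines
        depth = len(line) - len(body)        # number of leading tabs
        del stack[depth:]                    # drop entries at the same or deeper level
        parent = stack[-1] if stack else None
        pairs.append((match.group(1), parent))
        stack.append(match.group(1))
    return pairs


sample = (
    '"do(agt>thing,obj>thing {,gol>thing,src>thing,icl>do})"\n'
    '\t"attack({icl>do(}agt>thing,obj>thing{)})"\n'
    '\t\t"assault(icl>attack(agt>thing,obj>thing,man>emotionally))"\n'
    '\t\t"assault(icl>attack(agt>thing,obj>thing,man>physically))"\n'
)

for head, parent in parse_entries(sample):
    print(head, "<-", parent)
```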
Application of GLT
• The Formal role is similar to the hypernymy relation
• The Constitutive role is similar to the meronymy relation
• The Telic role is similar to the functional link given between a noun and a verb in the Hindi WordNet
• LCP is used in the multiple hypernymy process
• Event structure is specified by the ontology nodes in the Hindi WordNet
Application of GLT
• The English WordNet (1.7.1) gives 63 senses for the verb break, for example:
  • Sense 1: interrupt, break (end prematurely; break a lucky streak)
  • Sense 10: break, break off, discontinue, stop (prevent completion; stop the project; break the silence)
  • Sense 18: break, break away (interrupt a continued activity; She had broken with the traditional patterns)
  • Sense 31: break (stop or interrupt; He broke the engagement; We had to break our plans for a trip to China)
  • Sense 33: separate, part, split up, split, break, break up (discontinue an association or relation; go different ways; The business partners broke over a tax question; The couple separated after 25 years of marriage; My friend and I split up)
Application of GLT
• Merging of senses using GLT
  break
    EVENTSTR = E: event
    QUALIA = FORMAL: interruption
             AGENTIVE: break_act
Limitations of GLT
• Attempts to distinguish between polysemy and accidental homonymy
• Example of bake:
  • baked a cake (creativity)
  • baked a potato (change of state)
• Pustejovsky's suggestion:
  • cake: artifact
  • potato: natural object
• Problem: how to deal with artifacts like knife and car?
Conclusion
• Generative mechanisms alone fail to predict polysemy or to generate polysemous senses
• Generative mechanisms along with an ontology can be a powerful device
• This implies the building of a rich ontology