Fine-grained and Coarse-grained Word Sense Disambiguation

Fine-grained and Coarse-grained Word Sense Disambiguation. Jinying Chen, Hoa Trang Dang, Martha Palmer. August 22, 2003. Outline: Maxent Word Sense Disambiguator; Coarse-grained WSD by Decision Tree; Future Work.


Presentation Transcript


  1. Fine-grained and Coarse-grained Word Sense Disambiguation. Jinying Chen, Hoa Trang Dang, Martha Palmer. August 22, 2003

  2. Outline • Maxent Word Sense Disambiguator • Coarse-grained WSD by Decision Tree • Future Work

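  3. Maxent Word Sense Disambiguator (Martha, Hoa, Christiane, 2002) • A deterministic model that produces a probability for each sense given its context: $p(\text{sense} \mid \text{context}) = \frac{1}{Z(\text{context})} \exp\left(\sum_j \lambda_j f_j(\text{sense}, \text{context})\right)$, where $f_j(\text{sense}, \text{context})$ are binary features, $\lambda_j$ is the weight of feature $j$, and $Z(\text{context})$ is the normalizing factor • Can combine evidence from different knowledge sources • Feature weights are determined automatically by Generalized Iterative Scaling (GIS)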
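
The model on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: each binary feature is assumed to pair one contextual predicate with one sense, the sense labels, predicate names, and weights are hypothetical, and the weights are taken as given rather than trained with GIS.

```python
import math


def maxent_probs(context_predicates, senses, weights):
    """Return p(sense | context) for every candidate sense.

    Assumption: a binary feature f_j(sense, context) is identified by a
    (predicate, sense) pair and fires when the predicate holds in the
    context and the sense matches. `weights` maps such pairs to feature
    weights; unseen pairs get weight 0.
    """
    scores = {
        s: sum(weights.get((p, s), 0.0) for p in context_predicates)
        for s in senses
    }
    # Subtract the maximum score before exponentiating for numerical stability.
    top = max(scores.values())
    unnormalized = {s: math.exp(v - top) for s, v in scores.items()}
    z = sum(unnormalized.values())  # plays the role of Z(context)
    return {s: v / z for s, v in unnormalized.items()}


# Hypothetical senses, predicates, and weights, for illustration only.
weights = {("obj=meeting", "call.1"): 1.0, ("obj=friend", "call.2"): 2.0}
probs = maxent_probs({"obj=meeting"}, ["call.1", "call.2"], weights)
print(probs)
```

Because the scores are simple sums over active features, evidence from different knowledge sources (collocational, syntactic, semantic, topical) can be combined just by adding more predicates to the context set.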

  4. Maxent Word Sense Disambiguator: features used in the Maxent model for English WSD • Local contextual predicates: • Collocational features, e.g., the target verb w and the POS of w; the POS of words at positions -1 and +1 relative to w; the words at positions -2, -1, +1, +2 relative to w • Syntactic features, e.g., active vs. passive voice, whether there is a sentential complement, subject, object, or indirect object, etc. • Semantic features, e.g., Named Entity tags (PER, ORG, LOC) for proper nouns, and WordNet synsets and hypernyms for all nouns in the above syntactic relations to w • Topical contextual keywords: • Select the 200-300 words k with the lowest entropy of P(sense|k), i.e., the most informative words, from anywhere in the context
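
The topical keyword selection described above can be sketched as follows. The slide only states the criterion (lowest entropy of P(sense|k)); the input format, the `min_count` frequency cutoff, and the tie-breaking order are assumptions added here so that words seen only once do not win trivially with zero entropy.

```python
import math
from collections import Counter, defaultdict


def sense_entropy(sense_counts):
    """Entropy (in bits) of the empirical distribution P(sense | k)."""
    total = sum(sense_counts.values())
    return -sum((c / total) * math.log2(c / total) for c in sense_counts.values())


def select_keywords(instances, top_n=200, min_count=2):
    """Pick the top_n context words whose sense distribution has lowest entropy.

    instances: iterable of (sense, context_words) pairs (assumed format).
    min_count: hypothetical cutoff on how many instances a word must occur in.
    """
    counts = defaultdict(Counter)
    for sense, words in instances:
        for w in set(words):  # count each word once per instance
            counts[w][sense] += 1
    candidates = [
        (sense_entropy(c), -sum(c.values()), w)
        for w, c in counts.items()
        if sum(c.values()) >= min_count
    ]
    # Lowest entropy first; ties go to the more frequent word, then alphabetically.
    candidates.sort()
    return [w for _, _, w in candidates[:top_n]]


# Toy data with hypothetical sense labels.
instances = [
    ("s1", ["bank", "money", "the"]),
    ("s1", ["money", "the"]),
    ("s2", ["river", "the"]),
    ("s2", ["river", "bank", "the"]),
]
print(select_keywords(instances, top_n=2))
```

In this toy data, "money" and "river" each occur with a single sense, so they rank ahead of "the" and "bank", which are split evenly between the two senses.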

  5. Maxent Word Sense Disambiguator: fine-grained and coarse-grained WSD • Part of the results from (Martha, Hoa, Christiane, 2002) • Table 1: Performance of the Maxent Word Sense Disambiguator on five verbs

  6. Coarse-grained WSD by Decision Tree • A simpler model than the Maxent model • Uses semantic features from PropBank • PropBank: • Each verb is defined by several framesets • All verb instances belonging to the same frameset share a common set of roles • Roles can be ARGn (n = 0, 1, …) and ARGM-f • Framesets are consistent with Verb Sense Groups (VSG) • Frameset tags and roles serve as semantic features for VSG

  7. Automatic Verb Sense Grouping: Coarse-grained WSD by Decision Tree. Building the decision tree • Use the C5.0 decision tree (DT) learner • 3 feature sets; SF (Simple Feature set) works best: • VOICE: PAS, ACT • FRAMESET: 01, 02, … • ARGn (n = 0, 1, 2, …): 0 (does not occur), 1 (occurs) • CoreFrame: 01-ARG0-ARG1, 02-ARG0-ARG2, … • ARGM: 0 (no ARGM), 1 (has ARGM) • ARGM-f (f = DIS, ADV, …): i (occurs i times)
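
As a rough illustration of the SF feature set listed on this slide, the sketch below turns one annotated verb instance into a feature dictionary. The input format (a voice tag, a frameset id, and a flat list of role labels), the function name, and the `max_arg` bound on numbered arguments are assumptions made for the example; they are not taken from the slides or from PropBank tooling.

```python
from collections import Counter


def encode_simple_features(voice, frameset, roles, max_arg=5):
    """Encode one verb instance with the SF (Simple Feature) set.

    voice: "ACT" or "PAS"; frameset: e.g. "01";
    roles: role labels such as ["ARG0", "ARG1", "ARGM-TMP"] (assumed format).
    max_arg: hypothetical upper bound on n for the ARGn indicator features.
    """
    # Numbered (core) arguments, deduplicated and sorted by n.
    core = sorted(
        {r for r in roles if r.startswith("ARG") and not r.startswith("ARGM")},
        key=lambda r: int(r[3:]),
    )
    features = {"VOICE": voice, "FRAMESET": frameset}
    for n in range(max_arg + 1):
        features[f"ARG{n}"] = 1 if f"ARG{n}" in core else 0
    # CoreFrame joins the frameset id with the core arguments that occur.
    features["CoreFrame"] = "-".join([frameset] + core)
    modifiers = Counter(r for r in roles if r.startswith("ARGM-"))
    features["ARGM"] = 1 if modifiers else 0
    for label, count in modifiers.items():
        features[label] = count  # ARGM-f: number of occurrences
    return features


example = encode_simple_features(
    "ACT", "01", ["ARG0", "ARG1", "ARGM-TMP", "ARGM-TMP", "ARGM-DIS"]
)
print(example)
```

Dictionaries of this shape could then be passed, together with the verb sense group label, to any decision-tree learner; C5.0 itself is not shown here.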

  8. Coarse-grained WSD by Decision Tree: experimental results • Table 2: Error rate of the decision tree on five verbs

  9. Coarse-grained WSD by Decision Tree: discussion • A simple feature set and simple DT algorithms work well • Potential sparse data problem • More complicated DT algorithms (e.g., with boosting) tend to overfit the data • Complex features are not utilized by the model • Solution: use a large corpus, e.g., the parsed BNC corpus without frameset annotation

  10. Future Work • Train DT or other models for coarse-grained WSD on a large corpus without frameset annotation • Unsupervised frameset tagging by EM clustering • Cluster nouns automatically instead of using WordNet to group nouns
