Automatic Verb Sense Grouping --- Term Project Proposal for CIS630

Jinying Chen, 10/28/2002


Presentation Transcript


  1. Automatic Verb Sense Grouping --- Term Project Proposal for CIS630 Jinying Chen 10/28/2002

  2. Motivation
     • “Making fine-grained and coarse-grained distinction, both manually and automatically” (Martha, Hoa, Christiane, 2002)
     • It is difficult to find consistent criteria for making fine-grained sense distinctions, either manually or automatically
     • Well-defined sense groups can alleviate this problem
     • Potential application in Machine Translation

  3. Model
     • Unsupervised learning
     • EM algorithm (similar to Dan Gildea 2002, Walde 2000, Rooth 1999, Ted Pedersen 1997)

  4. EM clustering algorithm
     • Soft clustering: P(v|c)
     • Each verb $v_i$ is associated with a set of features $\{f_{i1}, f_{i2}, \ldots, f_{in}\}$; there are $m$ clusters $\{c_1, c_2, \ldots, c_m\}$
     • Estimate P(v|c) by maximizing the log-likelihood
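The slide does not spell out the exact probability model, so the following is only a minimal sketch under an assumption: a latent class model of the kind used in the cited Rooth-style work, where p(v, f) = Σ_c p(c) p(v|c) p(f|c) is fitted to verb-feature co-occurrence counts. The function name `em_latent_class`, the `counts` layout, and the feature labels in the usage note are hypothetical and not taken from the proposal.

```python
import math
import random


def em_latent_class(counts, num_clusters, iterations=50, seed=0):
    """Fit p(v, f) = sum_c p(c) p(v|c) p(f|c) to co-occurrence counts with EM.

    counts: dict mapping (verb, feature) -> observed count (assumed layout).
    Returns (p_c, p_v_given_c, p_f_given_c, log_likelihood_history).
    """
    rng = random.Random(seed)
    verbs = sorted({v for v, _ in counts})
    feats = sorted({f for _, f in counts})
    clusters = range(num_clusters)

    def random_dist(keys):
        # Random strictly positive start so that no cluster begins empty.
        weights = [rng.random() + 0.1 for _ in keys]
        total = sum(weights)
        return {k: w / total for k, w in zip(keys, weights)}

    p_c = [1.0 / num_clusters] * num_clusters
    p_v = [random_dist(verbs) for _ in clusters]
    p_f = [random_dist(feats) for _ in clusters]
    history = []

    for _ in range(iterations):
        # E-step: accumulate expected counts under the current parameters.
        n_c = [0.0] * num_clusters
        n_v = [dict.fromkeys(verbs, 0.0) for _ in clusters]
        n_f = [dict.fromkeys(feats, 0.0) for _ in clusters]
        log_lik = 0.0
        for (v, f), n in counts.items():
            joint = [p_c[c] * p_v[c][v] * p_f[c][f] for c in clusters]
            total = sum(joint)
            log_lik += n * math.log(total)
            for c in clusters:
                resp = n * joint[c] / total  # count-weighted P(c | v, f)
                n_c[c] += resp
                n_v[c][v] += resp
                n_f[c][f] += resp
        history.append(log_lik)

        # M-step: renormalise the expected counts into probabilities.
        grand_total = sum(n_c)
        for c in clusters:
            p_c[c] = n_c[c] / grand_total
            if n_c[c] > 0:  # leave an emptied cluster's distributions untouched
                p_v[c] = {v: n_v[c][v] / n_c[c] for v in verbs}
                p_f[c] = {f: n_f[c][f] / n_c[c] for f in feats}
    return p_c, p_v, p_f, history
```

As a usage example, calling `em_latent_class({("run", "pp:to"): 8, ("buy", "obj:goods"): 9}, num_clusters=2)` returns the cluster priors, the soft memberships P(v|c) per cluster, the feature distributions, and the log-likelihood after each iteration, which should never decrease. The result depends on the random start, so several seeds are worth trying in practice.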

  5. Two problems
     • How many clusters for a particular verb?
       • Human knowledge of the rough number of verb sense groups is instructive in unsupervised learning
       • Olga’s proposal
     • How many features for a particular verb?
       • May not be a problem: hopefully the EM algorithm can do feature selection to some degree
       • However, a well-restricted feature set can reduce the model complexity (O(nm)) and alleviate the effect of noisy data
       • Borrow ideas from “Automatic Verb Classification based on Statistical Distribution of Argument Structure” (Paola Merlo and Suzanne Stevenson, 2001)

  6. Plan
     • Phase I --- Corpus analysis
       • Automatically and manually
       • Determine the range of feature set for each verb
     • Phase II --- Automatic verb sense grouping
       • Implement EM clustering algorithm
       • Evaluate the performance
     • Phase III --- Compare with other clustering methods
       • Ward’s minimum-variance method (Ward, 1963)
       • McQuitty’s similarity analysis (McQuitty, 1966)
       • Spectral Clustering (Brew & Walde, 2002)
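To make the Phase III comparison concrete, here is a rough sketch of Ward's minimum-variance method as a hard-clustering baseline. It assumes each item is represented by a numeric feature vector, which the slides do not specify, and the function name `ward_clustering` is hypothetical. At each step it merges the two clusters whose union gives the smallest increase in within-cluster sum of squares, |A||B| / (|A| + |B|) · ‖μ_A − μ_B‖².

```python
def ward_clustering(points, num_clusters):
    """Agglomerative clustering with Ward's minimum-variance merge criterion.

    points: list of equal-length numeric vectors (assumed representation).
    Returns a list of clusters, each a sorted list of indices into points.
    """
    if num_clusters < 1:
        raise ValueError("num_clusters must be at least 1")

    # Each cluster is stored as (member indices, centroid).
    clusters = [([i], list(p)) for i, p in enumerate(points)]

    while len(clusters) > num_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                members_a, centroid_a = clusters[a]
                members_b, centroid_b = clusters[b]
                size_a, size_b = len(members_a), len(members_b)
                dist2 = sum((x - y) ** 2 for x, y in zip(centroid_a, centroid_b))
                # Increase in within-cluster sum of squares if a and b merge.
                cost = size_a * size_b / (size_a + size_b) * dist2
                if best is None or cost < best[0]:
                    best = (cost, a, b)

        _, a, b = best
        members_a, centroid_a = clusters[a]
        members_b, centroid_b = clusters[b]
        size_a, size_b = len(members_a), len(members_b)
        merged_centroid = [
            (size_a * x + size_b * y) / (size_a + size_b)
            for x, y in zip(centroid_a, centroid_b)
        ]
        merged = (members_a + members_b, merged_centroid)
        clusters = [c for i, c in enumerate(clusters) if i not in (a, b)]
        clusters.append(merged)

    return [sorted(members) for members, _ in clusters]
```

Unlike the EM model, this assigns every item to exactly one cluster, so a comparison between the two would need the soft memberships to be thresholded or the evaluation to accept both kinds of output. The pairwise search is cubic in the number of items overall, which is acceptable for small inputs but not for large ones.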
