Our group developed a keyphrase extraction system that leverages the logical structure of documents. By restricting the input to the sections that contain the most keyphrases, we improved precision without sacrificing recall and ranked 2nd among 19 teams. Features used in this task include TF-IDF, first occurrence of a phrase, and co-usage statistics. The invited talks covered factors that influence errors in addressing people by name and computational advertising. The IBM best student paper extracted social networks from literary fiction and presented evidence against two existing theories about dialogue and setting.
ACL 2010 Thuy Dung Nguyen
Outline • Our group work: keyphrase extraction system • Invited talks • Towards a Psycholinguistics of Social Interaction by Zenzi M Griffin, University of Texas at Austin • Computational Advertising by Andrei Broder, Yahoo! Research • IBM best student paper • Extracting Social Networks from Literary Fiction by David Elson, Nicholas Dames and Kathleen McKeown
SemEval Task 5: Keyphrase Extraction for Scientific Articles • Our approach: utilize the document’s logical structure (given by ParsCit) • identify which sections of the document contain the most keyphrases • shorten the input text to contain only those sections: title, abstract, introduction, related work, conclusion, and the 1st sentence of each paragraph of the other sections • increases precision without sacrificing recall • Final result: ranked 2nd out of 19 teams • Other approaches: • make use of document logical structure • similar features: TF-IDF, first occurrence, phrase length, the phrase’s occurrence in important sections, statistics of co-usage of keyphrases in large publication repositories (HAL, Europarl)
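The input-shortening step and two of the listed features (TF-IDF and first occurrence) can be sketched roughly as follows. This is a minimal illustration, not the system described on the slide: the section labels in `KEY_SECTIONS`, the `(section_name, paragraphs)` document layout, the naive sentence splitter, and the smoothed IDF formula are all assumptions made for the example, and ParsCit itself is not called.

```python
import math
import re

# Hypothetical section labels; the slide names the sections but not the label strings.
KEY_SECTIONS = {"title", "abstract", "introduction", "related work", "conclusion"}


def first_sentence(paragraph):
    """Return the first sentence of a paragraph (naive split after ., ! or ?)."""
    return re.split(r"(?<=[.!?])\s+", paragraph.strip(), maxsplit=1)[0]


def shorten(document):
    """Keep key sections whole; elsewhere keep only each paragraph's first sentence.

    `document` is a list of (section_name, [paragraphs]) pairs (assumed layout).
    """
    kept = []
    for name, paragraphs in document:
        if name.lower() in KEY_SECTIONS:
            kept.extend(paragraphs)
        else:
            kept.extend(first_sentence(p) for p in paragraphs)
    return " ".join(kept)


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf(term, tokens, background):
    """TF-IDF of a single-word term; `background` is a list of token lists."""
    tf = tokens.count(term) / len(tokens)
    df = sum(1 for doc in background if term in doc)
    idf = math.log((1 + len(background)) / (1 + df)) + 1  # smoothed IDF (assumed variant)
    return tf * idf


def first_occurrence(term, tokens):
    """Relative position of the term's first occurrence (0.0 = start of the text)."""
    return tokens.index(term) / len(tokens)


doc = [
    ("Title", ["Keyphrase extraction for articles."]),
    ("Methods", ["We parse the text. Then we tune parameters for days."]),
    ("Conclusion", ["Keyphrase extraction benefits from structure."]),
]
short_text = shorten(doc)
tokens = tokenize(short_text)
background = [["the", "text"], ["we", "parse"]]
```

Here only the first sentence of the "Methods" paragraph survives shortening, while the title and conclusion are kept whole. A real system would score multi-word candidate phrases rather than single tokens.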
Invited talk 1 Towards a Psycholinguistics of Social Interaction • Studies which factors influence errors in addressing people by name • Shared roles • Similar social relationships (boyfriend/girlfriend, family members, dependants) • Shared features • Gender, age, physical similarity • Same initial sound in the name (Cathy, Ken)
Invited talk 2 Computational Advertising • Challenge: find the “best match” between a given user in a given context and a suitable advertisement. • Previous approach: matching based on similar words/phrases in both the webpage and the ad • Yahoo! Research: matches ads not only on keywords but also on the general topic. • Classify webpages and ads into a large tree of topics • Map each ad and webpage to a specific node on the tree • Leverage the nodes for better matching
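One way to picture topic-based matching is to represent each tree node as a path from the root and score a page/ad pair by how much of that path they share. The sketch below assumes the classification step has already assigned a node to every page and ad. The scoring function, the function names, and the example taxonomy are hypothetical illustrations, not the method presented in the talk.

```python
def shared_depth(path_a, path_b):
    """Number of leading taxonomy levels that two node paths have in common."""
    depth = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        depth += 1
    return depth


def topic_score(page_path, ad_path):
    """Similarity in [0, 1]: shared prefix depth divided by the longer path length."""
    longest = max(len(page_path), len(ad_path))
    return shared_depth(page_path, ad_path) / longest if longest else 0.0


def best_ad(page_path, ads):
    """Return the name of the ad whose taxonomy node is closest to the page's node."""
    return max(ads, key=lambda name: topic_score(page_path, ads[name]))


# Hypothetical taxonomy nodes, written as root-to-node paths.
page = ("Sports", "Winter", "Skiing")
ads = {
    "ski_resort": ("Sports", "Winter", "Skiing"),
    "snowboard_shop": ("Sports", "Winter", "Snowboarding"),
    "car_loan": ("Finance", "Loans"),
}
chosen = best_ad(page, ads)
```

With this scoring, an ad in a sibling topic (snowboarding) still receives partial credit even when it shares no keywords with the page, which is the advantage of matching on topic nodes rather than on words alone.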
IBM best student paper Extracting Social Networks from Literary Fiction • Constructs social networks among characters in 19th-century British novels • Provides evidence that these networks do not fit 2 theories proposed by literary scholars: • the amount of dialogue is inversely correlated with the number of characters • the novel’s setting (urban or rural) affects the structure of the social network, with more interactions occurring in rural communities than in urban communities
IBM best student paper • What’s the application of the research? • Using statistical methods to test the validity of theories about social interaction in the real world and its representation in novels
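To make the idea of a character network concrete, the sketch below builds a weighted graph from lists of conversation participants and derives two simple per-novel features that could then be fed into a statistical test. It assumes conversations have already been detected and attributed to characters. The input format, the feature choice, and the placeholder character names are assumptions for illustration, not the paper's pipeline.

```python
from collections import defaultdict
from itertools import combinations


def build_network(conversations):
    """Build an undirected weighted graph; edge weight = number of shared conversations."""
    weights = defaultdict(int)
    for participants in conversations:
        # Sort so that each pair is stored under one canonical key.
        for a, b in combinations(sorted(set(participants)), 2):
            weights[(a, b)] += 1
    return dict(weights)


def network_features(weights):
    """Two simple features: number of connected characters and their average degree."""
    neighbours = defaultdict(set)
    for a, b in weights:
        neighbours[a].add(b)
        neighbours[b].add(a)
    n = len(neighbours)
    avg_degree = sum(len(v) for v in neighbours.values()) / n if n else 0.0
    return {"characters": n, "average_degree": avg_degree}


# Placeholder characters A-D; each inner list is one detected conversation.
conversations = [["A", "B"], ["A", "B", "C"], ["C", "D"]]
network = build_network(conversations)
features = network_features(network)
```

Characters who never take part in a conversation do not appear in this graph. Computing such features for many novels and correlating them with metadata such as setting is the kind of statistical check the slide refers to.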
Others • Best long paper Beyond NomBank: A Study of Implicit Arguments for Nominal Predicates by Matthew Gerber and Joyce Chai • Challenge paper The Human Language Project: Building a Universal Corpus of the World’s Languages, by Steven Abney and Steven Bird