Innovative Keyphrase Extraction System Using Document Logical Structure

ACL 2010 Thuy Dung Nguyen

Outline
• Our group's work: keyphrase extraction system
• Invited talks
  • Towards a Psycholinguistics of Social Interaction by Zenzi M Griffin, University of Texas at Austin
  • Computational Advertising by Andrei Broder, Yahoo! Research
• IBM best student paper
  • Extracting Social Networks from Literary Fiction by David Elson, Nicholas Dames and Kathleen McKeown

SemEval Task 5: Keyphrase Extraction for Scientific Articles
• Our approach: utilize document logical structure (given by ParsCit)
  • identify which sections of the document contain the most keyphrases
  • shorten the input text to only those sections: title, abstract, introduction, related work, conclusion, and the 1st sentence of each paragraph of other sections
  • increases precision without sacrificing recall
  • final result: ranked 2nd out of 19 teams
• Other approaches:
  • make use of document logical structure
  • similar features: TF-IDF, first occurrence, phrase length, phrase’s occurrence in important sections, statistics of co-usage of keyphrases in a large publication repository (HAL, Europarl)
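The input-shortening step described on this slide can be sketched as follows. This is only an illustrative sketch, not the system's actual code: the `(heading, paragraphs)` input format, the `KEEP_FULL` prefix matching, and the naive sentence splitter are all assumptions, and ParsCit's real output format is not modelled here.

```python
import re

# Sections kept in full, as listed on the slide. Matching headings by
# lowercase prefix is an assumption made for this sketch.
KEEP_FULL = ("title", "abstract", "introduction", "related work", "conclusion")


def first_sentence(paragraph):
    """Return the first sentence, using a naive split after '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", paragraph.strip(), maxsplit=1)
    return parts[0]


def shorten(sections):
    """Shorten a document given as a list of (heading, [paragraph, ...]) pairs.

    The input format is hypothetical. Key sections are kept whole; every
    other section contributes only the first sentence of each paragraph.
    """
    kept = []
    for heading, paragraphs in sections:
        name = heading.lower()
        if any(name.startswith(key) for key in KEEP_FULL):
            kept.extend(paragraphs)
        else:
            kept.extend(first_sentence(p) for p in paragraphs if p.strip())
    return "\n".join(kept)


example = [
    ("Title", ["Keyphrase extraction."]),
    ("Abstract", ["We extract keyphrases. We use structure."]),
    ("Method", ["First sentence here. Second sentence here.",
                "Another paragraph. More text."]),
    ("Conclusions", ["It works. Really."]),
]
shortened = shorten(example)
```

In this example the "Method" section is reduced to "First sentence here." and "Another paragraph.", while the other three sections pass through unchanged. Candidate phrases would then be extracted and ranked from the shortened text only.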

Invited talk 1
Studies which factors influence errors in addressing people by name
• Shared roles
  • Similar social relationship (boyfriend/girlfriend, family members, dependants)
• Shared features
  • Gender, age, physical similarity
  • Same initial sound in name (Cathy, Ken)

Invited talk 2
Computational Advertising
• Challenge: find the “best match” between a given user in a given context and a suitable advertisement.
• Previous approach: matching based on similar words/phrases in both the webpage and the ad
• Yahoo! Research: matches ads not only on keywords but also on the general topic.
  • Classify webpages and ads into a large tree of topics
  • Map each ad and webpage to a specific node on the tree
  • Leverage the nodes for better matching
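One way to picture topic-tree matching is the small sketch below. It is not Yahoo! Research's method: it assumes the classification step has already assigned each page and ad a path from the root of the tree to its node, and it uses the length of the shared path prefix as a hypothetical matching score. All names and topic labels are invented for illustration.

```python
def shared_depth(path_a, path_b):
    """Count how many leading topic nodes two root-to-node paths share."""
    depth = 0
    for a, b in zip(path_a, path_b):
        if a != b:
            break
        depth += 1
    return depth


def best_ad(page_path, ads):
    """Pick the ad whose topic node is closest to the page's node.

    ads maps a (hypothetical) ad id to its topic path. Ties are resolved
    by dictionary order, which is an arbitrary choice in this sketch.
    """
    return max(ads, key=lambda ad_id: shared_depth(page_path, ads[ad_id]))


page = ("sports", "cycling", "road bikes")
ads = {
    "bike_shop": ("sports", "cycling", "accessories"),
    "ski_resort": ("sports", "skiing"),
    "bank": ("finance", "loans"),
}
chosen = best_ad(page, ads)
```

Here `chosen` is `"bike_shop"`: it shares the "sports" and "cycling" nodes with the page even though the page and the ad need not share any keyword, which is the point of matching on topic rather than on words alone.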

IBM best student paper
Extracting Social Networks from Literary Fiction
• Construct social networks among characters in 19th century British novels
• Provide evidence that these networks do not fit 2 theories proposed by literary scholars
  • There is an inverse correlation between the amount of dialogue and the number of characters
  • The novel's setting (urban or rural) has an effect on the structure of the social network, with more interactions occurring in rural communities than in urban communities

IBM best student paper
• What’s the application of the research?
  • Using statistical methods to test the validity of theories about social interaction in the real world and their representation in novels

Others
• Best long paper: Beyond NomBank: A Study of Implicit Arguments for Nominal Predicates by Matthew Gerber and Joyce Chai
• Challenge paper: The Human Language Project: Building a Universal Corpus of the World’s Languages by Steven Abney and Steven Bird
