
UROP Research Update Citation Function Classification

Eric Yulianto A0069442B 22 February 2013. UROP Research Update Citation Function Classification. Motivation. To assist researchers during paper review process. Quick categorization with minimal amount of reading. Help prioritize more important papers. Problem. Given a citation on a paper.


Presentation Transcript


  1. Eric Yulianto, A0069442B, 22 February 2013. UROP Research Update: Citation Function Classification

  2. Motivation
     • Assist researchers during the paper review process.
     • Enable quick categorization with a minimal amount of reading.
     • Help prioritize the more important papers.

  3. Problem
     • Given a citation in a paper, what is the purpose of the citation?
     • Answering this requires repeatedly reading a section of the paper.
     • The intention may not be obvious from the citation sentence alone.

  4. Related Work
     • Teufel et al., 2006
     • Features used:
       • Cue phrases
       • Verb clusters
       • Verb tense
       • Modality
       • Self-citation indicator
     • Classifier: IBk (k-nearest neighbour) algorithm
     • Accuracy: 77%

  5. Related Work
     • Angrosh et al., 2010
     • Treats citation classification as sentence classification.
     • Covers the Related Work section only.
     • Features used:
       • Word category
       • Presence of a citation in the previous sentence
     • Classifier: conditional random field
     • Generally performs well (accuracy: 96.51%).
     • Did not perform well on citation sentences.

  6. Related Work
     • Dong and Schafer, 2011
     • Features used:
       • Cue words
       • Physical: Location, Popularity, Density, AvgDens
       • Sentence syntax
     • Ensemble-style self-training algorithm

  7. Current Progress (Analysis)
     • Citation scheme
       • Adopts and modifies the scheme of Teufel et al., 2006.
       • 12 classes => 4 classes:
         • Weakness
         • CompareContrast
         • Positive
         • Neutral

  8. Current Progress (Analysis)
     • Dataset
       • ANLP conference papers from the ACL Anthology.
       • Citation context extracted from ParsCit output.
       • Distribution: 609 citations
         • Weakness: 30
         • CompareContrast: 72
         • Positive: 236
         • Neutral: 271
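The class counts above are heavily skewed, so it helps to see each class's share and the accuracy of a trivial majority-class predictor. The following is a minimal sketch using only the counts reported on the slide; the variable names are illustrative and not part of the original work.

```python
# Class counts as reported on the slide (609 citations in total).
counts = {"Weakness": 30, "CompareContrast": 72, "Positive": 236, "Neutral": 271}

total = sum(counts.values())
shares = {label: n / total for label, n in counts.items()}

# Majority-class baseline: accuracy obtained by always predicting the largest class.
majority_label = max(counts, key=counts.get)
majority_baseline = counts[majority_label] / total

for label, share in shares.items():
    print(f"{label:16s}{counts[label]:4d}  {share:6.1%}")
print(f"Majority baseline ({majority_label}): {majority_baseline:.1%}")
```

Any classifier trained on this dataset should be judged against that baseline, and per-class measures are worth reporting alongside accuracy because Weakness has only 30 examples.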

  9. Current Progress (Analysis)
     • Classification algorithm
       • Weka implementations of Naive Bayes and SVM
       • Uses a chi-square attribute selection filter
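To illustrate what a chi-square attribute filter measures, the sketch below computes the chi-square statistic between a binary feature and the class label, then ranks features by it. This is a plain-Python illustration and not Weka's implementation; the function name and the toy data are made up for the example.

```python
from collections import Counter

def chi_square(feature_values, labels):
    """Chi-square statistic between a discrete feature and a class label."""
    n = len(labels)
    observed = Counter(zip(feature_values, labels))
    feature_totals = Counter(feature_values)
    label_totals = Counter(labels)
    stat = 0.0
    for f, f_count in feature_totals.items():
        for c, c_count in label_totals.items():
            # Expected count if the feature and the class were independent.
            expected = f_count * c_count / n
            stat += (observed[(f, c)] - expected) ** 2 / expected
    return stat

# Toy data (made up): does a given word appear in each citation sentence?
labels = ["Weakness", "Weakness", "Neutral", "Neutral"]
features = {
    "however": [True, True, False, False],  # perfectly aligned with the class
    "the": [True, False, True, False],      # independent of the class
}
scores = {name: chi_square(values, labels) for name, values in features.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```

A filter of this kind keeps the highest-scoring attributes and discards those whose presence says little about the class.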

  10. Current Progress (Analysis)
     • Features used and tested:
       • Cue words
       • Cue words + chi-square filter
       • Word categories (Angrosh et al., 2010)
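As a rough sketch of how cue-word features could be derived from a citation sentence, the code below builds one binary feature per cue word. The cue list, the feature naming, and the example sentence are hypothetical; the slides do not list the actual cue words used.

```python
import re

# Hypothetical cue words chosen only for illustration.
CUE_WORDS = ("however", "unlike", "following", "extend", "outperforms")

def cue_word_features(sentence, cue_words=CUE_WORDS):
    """Return one binary feature per cue word: is it present in the sentence?"""
    tokens = set(re.findall(r"[a-z]+", sentence.lower()))
    return {f"cue={word}": word in tokens for word in cue_words}

# Made-up citation sentence with a placeholder citation.
example = "However, the parser of (Author, 2000) fails on long sentences."
print(cue_word_features(example))
```

Feature vectors of this form could then be passed through an attribute selection step before training a classifier.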

  11. Current Progress (Analysis)

  12. Ongoing Process
     • Features extracted but not yet tested:
       • Physical features (Dong and Schafer, 2011)
         • Location
         • Density
         • Popularity
       • Author and title information
       • Publication year
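The physical features are not defined on the slide, so the sketch below rests on assumed definitions: location is the section containing the citation sentence, popularity is the number of references cited in that sentence, and density is the number of distinct references in the sentence and its neighbours. The function name, the input layout, and the `window` parameter are hypothetical.

```python
def physical_features(sentences, index, section, window=1):
    """Compute assumed physical features for the citation sentence at `index`.

    sentences: list of lists of citation keys, one inner list per sentence.
    section:   name of the section containing the sentence.
    window:    number of neighbouring sentences on each side (assumption).
    """
    lo = max(0, index - window)
    hi = min(len(sentences), index + window + 1)
    distinct = {key for sent in sentences[lo:hi] for key in sent}
    return {
        "location": section,
        "popularity": len(set(sentences[index])),
        "density": len(distinct),
    }

# Made-up citation keys for four consecutive sentences.
sentences = [["A"], ["A", "B"], ["C"], []]
print(physical_features(sentences, index=1, section="Related Work"))
```

The citation keys per sentence are assumed to come from an upstream extraction step such as the ParsCit output mentioned earlier.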

  13. Follow Up
     • Add more features that can help differentiate the citation functions.
     • Use a larger dataset.
     • Split the classification into two stages:
       • First, use the metadata (physical features, author information, title information, publication year).
       • Then, use the cue words to refine the classification.

  14. Thank You
