
Exploiting Relational Structure to Understand Publication Patterns in High-Energy Physics

This study explores the use of relational structure to understand publication patterns in high-energy physics, including data cleaning, extraction, and analysis. The authors identify research communities and predict journal publications using KDL's PROXIMITY software. They also analyze data dependencies and author influence.



Presentation Transcript


  1. Exploiting Relational Structure to Understand Publication Patterns in High-Energy Physics. Amy McGovern, Lisa Friedland, Michael Hay, Brian Gallagher, Andrew Fast, Jennifer Neville, David Jensen. Knowledge Discovery Laboratory, University of Massachusetts Amherst

  2. Knowledge Discovery Process
     • Data cleaning
     • Data extraction
     • Data analysis: citation analysis, identifying research communities, predicting journal publication, data dependencies, understanding author influence
     • Implemented using KDL's PROXIMITY software

  3. Data cleaning and extraction
     • Extracted abstracts
     • Consolidated authors, from 13,185 authors to 9,200:
       • Same name assumed
       • Co-authored with similar names
       • Authors of referenced papers with similar names
       • Authors with similar email domains and the same username
     • [Figure: relational schema]
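The author-consolidation heuristics on this slide can be sketched as a union-find merge over author records. This is a minimal illustration, not KDL's implementation: the record layout, the `name_key` rule (same last name and first initial) and the `email_key` rule (same username and same last two domain labels) are all assumptions, and the "co-authored with similar names" heuristic is read here as "similar names that share a co-author". The referenced-paper heuristic is omitted.

```python
from itertools import combinations


def name_key(name):
    """Hypothetical similarity key: (last name, first initial)."""
    parts = name.lower().replace(".", " ").split()
    return parts[-1], parts[0][0]


def email_key(email):
    """Username plus the last two domain labels (an assumed notion of 'similar domain')."""
    user, domain = email.lower().split("@")
    return user, ".".join(domain.split(".")[-2:])


def consolidate(records):
    """Return groups of record indices judged to be the same person."""
    parent = list(range(len(records)))

    def find(i):
        # Follow parent pointers to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(records)), 2):
        a, b = records[i], records[j]
        same_name = a["name"].lower() == b["name"].lower()
        similar = name_key(a["name"]) == name_key(b["name"])
        shared_coauthor = bool(a["coauthors"] & b["coauthors"])
        same_email = (
            a.get("email") and b.get("email")
            and email_key(a["email"]) == email_key(b["email"])
        )
        if same_name or (similar and shared_coauthor) or same_email:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(records)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Made-up records for illustration only.
records = [
    {"name": "Amy Smith", "coauthors": {"B. Jones"}, "email": "asmith@physics.example.edu"},
    {"name": "A. Smith", "coauthors": {"B. Jones"}, "email": None},
    {"name": "Alan Smith", "coauthors": {"C. Lee"}, "email": None},
    {"name": "Amy J. Smith", "coauthors": {"D. Kim"}, "email": "asmith@cs.example.edu"},
]
groups = consolidate(records)
```

Here records 0, 1 and 3 end up in one group (shared co-author, matching email key), while "Alan Smith" stays separate because a similar name alone is not treated as enough evidence.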

  4. Data dependencies
     • Examples of high correlations:
       • Number of downloads in first 60 days and number of citations
       • Is paper published and number of citations (binned)
     • Examples of high autocorrelation:
       • Journal name (through author)
       • Topic cluster of paper (through author)
       • Author's total co-authors (through paper)
       • Number of downloads in first 60 days (through journal)
     • [Figure: low autocorrelation vs. high autocorrelation]
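One way to read "autocorrelation through X" is as the correlation between an attribute's values on pairs of objects linked through a shared X. The sketch below computes a Pearson correlation over such pairs for a numeric attribute. It is an assumed formulation for illustration; the measure actually used in PROXIMITY may differ, and the download counts and journal groupings are made up.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def relational_autocorrelation(values, groups):
    """values: paper id -> numeric attribute.
    groups: linking object (e.g. a journal) -> list of paper ids linked through it."""
    xs, ys = [], []
    for members in groups.values():
        for a, b in combinations(members, 2):
            # Add each linked pair in both orders so the measure is symmetric.
            xs += [values[a], values[b]]
            ys += [values[b], values[a]]
    return pearson(xs, ys)


# Made-up data: downloads in the first 60 days, grouped by journal.
downloads = {"p1": 100, "p2": 110, "p3": 10, "p4": 12}
by_journal = {"J1": ["p1", "p2"], "J2": ["p3", "p4"]}
shuffled = {"J1": ["p1", "p3"], "J2": ["p2", "p4"]}

high = relational_autocorrelation(downloads, by_journal)
low = relational_autocorrelation(downloads, shuffled)
```

With papers grouped so that each journal holds papers with similar download counts, the value is close to 1; with the groupings shuffled, it turns negative.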

  5. Influential Authors

  6. 20% of physicists receive 80% of the citations
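The 20%/80% statement is a concentration measure: the share of all citations held by the most-cited fifth of authors. A minimal sketch of that arithmetic follows; the function name `top_share` and the citation counts are hypothetical and chosen only so the result is easy to check by hand.

```python
def top_share(citations, fraction=0.2):
    """Share of total citations received by the top `fraction` of authors."""
    ranked = sorted(citations, reverse=True)
    k = max(1, round(len(ranked) * fraction))
    return sum(ranked[:k]) / sum(ranked)


# Made-up per-author citation counts (10 authors, 1000 citations in total).
counts = [500, 300, 50, 40, 30, 25, 20, 15, 10, 10]
share = top_share(counts)
```

For these counts the top two authors hold 800 of 1000 citations, so `share` is 0.8.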

  7. Influential authors are more connected

  8. Will a paper be accepted by Physics Letters B? Papers from 1995-2000: 68% accuracy, 0.75 AUC
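The two figures on this slide are standard classifier metrics. As a reminder of what they measure, the sketch below computes accuracy at a fixed threshold and AUC as the probability that a randomly chosen positive example is scored above a randomly chosen negative one. The labels and scores are made up; this does not reproduce the authors' model or their numbers.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples whose thresholded score matches the label."""
    preds = [s >= threshold for s in scores]
    return sum(p == bool(y) for p, y in zip(preds, labels)) / len(labels)


def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (ties count as half)."""
    pos = [s for y, s in zip(labels, scores) if y]
    neg = [s for y, s in zip(labels, scores) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up example: 1 = published in the target journal, 0 = not.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.3, 0.2]
acc = accuracy(labels, scores)
area = auc(labels, scores)
```

Unlike accuracy, AUC does not depend on the choice of threshold, which is one reason the two are often reported together.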

  9. Identifying Research Communities
     • Spectral clustering on citation graph and abstracts
     • Papers from 1995 to 2000
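To give a feel for spectral clustering on a graph, the sketch below splits a small undirected graph in two using the sign of the Fiedler vector (the eigenvector of the graph Laplacian with the second-smallest eigenvalue), found by power iteration. It is a simplified stand-in: it uses only graph links, treats citations as undirected, produces just two clusters, and ignores abstracts, so it should not be read as the authors' method. The graph and the function name `fiedler_partition` are hypothetical.

```python
import random


def fiedler_partition(adj, iters=500, seed=0):
    """Two-way spectral split of an undirected graph given as node -> set of neighbours."""
    nodes = sorted(adj)
    idx = {v: i for i, v in enumerate(nodes)}
    n = len(nodes)
    deg = [len(adj[v]) for v in nodes]
    c = 2 * max(deg)  # upper bound on the largest Laplacian eigenvalue

    rng = random.Random(seed)
    x = [rng.uniform(-1, 1) for _ in range(n)]
    for _ in range(iters):
        # Remove the constant component (the trivial eigenvector of the Laplacian).
        mean = sum(x) / n
        x = [xi - mean for xi in x]
        # Multiply by (c*I - L) = c*I - D + A, whose dominant remaining
        # eigenvector is the Fiedler vector of L.
        y = [
            (c - deg[i]) * x[i] + sum(x[idx[u]] for u in adj[v])
            for i, v in enumerate(nodes)
        ]
        norm = sum(yi * yi for yi in y) ** 0.5
        x = [yi / norm for yi in y]
    return {v: int(x[idx[v]] > 0) for v in nodes}


# Made-up graph: two tightly linked groups of papers joined by one citation (c-d).
graph = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"},
}
clusters = fiedler_partition(graph)
```

The split separates papers a, b, c from d, e, f; which side is labelled 0 or 1 is arbitrary.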

  10. Example topic clusters
      • Cluster 2: Black hole approach to string theory: Sumit R. Das (251), Physical Review D
        • Absorption of Fixed scalars and the D-brane Approach to Black Holes
        • Universal Low-Energy Dynamics for Rotating Black Holes
        • Interactions involving D-branes
        • Black Hole Greybody Factors and D-Brane Spectroscopy
      • Cluster 10: Tachyon Condensation: Juan M. Maldacena (1924), Journal of High Energy Physics
        • Field theory models for tachyon and gauge field string dynamics
        • Super-Poincare Invariant Superstring Field Theory
        • Level Four Approximation to the Tachyon Potential in Superstring Field Theory
        • SO(32) Spinors of Type I and Other Solitons on Brane-Antibrane Pair

  11. KDD Cup 2003
      • Paper: kdl.cs.umass.edu/papers/kddcup2003.html
      • Proximity: kdl.cs.umass.edu/proximity/
      • Email: amy@cs.umass.edu
