eCuration: speed curating with PubTator

Zhiyong Lu, Earl Stadtman Investigator, National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH). eCuration: speed curating with PubTator. eCuration (computer-assisted biocuration) is necessary.

Presentation Transcript


  1. Zhiyong Lu, Earl Stadtman Investigator, National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH). eCuration: speed curating with PubTator

  2. eCuration (computer-assisted biocuration) is necessary

  3. Most searched topics in PubMed (bibliographic and non-bibliographic)

  4. Key biological entities

  5. Our NER Tools
     • Freely available & open source
     • High performance
        • DNorm: best in the 2013 ShARe/CLEF shared task on Disease Normalization
        • tmChem: best in the 2013 BioCreative IV Chemical Entity Mention task
        • GenNorm: best in the 2010 BioCreative III Gene Normalization task
     • BioC format compatible for improved interoperability
     All numbers are F1 scores.
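     Slide 5 reports tool performance as F1 scores. As a reminder of what that metric measures, here is a minimal sketch of how an F1 score could be computed for entity normalization. It assumes each annotation is a (document ID, concept ID) pair; the function name and all IDs are hypothetical placeholders, not data or code from the talk or from the shared tasks.

```python
def f1_score(gold: set, predicted: set) -> float:
    """Harmonic mean of precision and recall over two sets of annotations."""
    true_positives = len(gold & predicted)
    if true_positives == 0:
        # Covers empty inputs and the no-overlap case without dividing by zero.
        return 0.0
    precision = true_positives / len(predicted)  # share of predictions that are correct
    recall = true_positives / len(gold)          # share of gold annotations that were found
    return 2 * precision * recall / (precision + recall)


# Placeholder (document ID, concept ID) pairs, invented for illustration only.
gold = {("doc1", "concept_A"), ("doc1", "concept_B"),
        ("doc2", "concept_C"), ("doc3", "concept_D")}
predicted = {("doc1", "concept_A"), ("doc1", "concept_B"),
             ("doc2", "concept_C"), ("doc3", "concept_X")}

score = f1_score(gold, predicted)
print(f"F1 = {score:.2f}")
```

     With three of four predictions correct and three of four gold annotations found, precision and recall are both 0.75, so the F1 score is also 0.75.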

  6. Our tmTools are publicly available
     • DNorm: www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/Dnorm/
     • tmVar: www.ncbi.nlm.nih.gov/CBBresearch/Lu/pub/tmVar/
     • SR4GN: www.ncbi.nlm.nih.gov/CBBresearch/Lu/downloads/SR4GN/
     • GenNorm: http://ikmbio.csie.ncku.edu.tw/GN/
     • tmChem: http://www.ncbi.nlm.nih.gov/CBBresearch/Lu/Demo/tmChem/
     To make it easy for biocurators, we have already applied all these tools to PubMed abstracts and stored the results in our Web-based annotation tool: PubTator!

  7. PubTator Intro/Highlights
     • Web-based; no installation required; in sync with PubMed
     • One-stop curation service from literature search to annotation
     • Curator-friendly (PubMed-like) interface; easy to use
     • Integrates competition-winning text-mining tools for automatic pre-annotations
     • Easy to adapt and customize to different curation tasks
     Wei, Kao, & Lu. PubTator: a Web-based text-mining tool for assisting biocuration. To appear in Nucleic Acids Research, 2013

  8. PubTator’s Curation Interface: document triage, bio-concept annotation, bio-relation annotation

  9. BioCreative Challenge (2003 – ) www.biocreative.org

  10. PubTator evaluation
      • Task: manually annotating genes in 50 abstracts
      • Experimental settings (25 abstracts each)
         • PubMed + spreadsheet (baseline)
         • PubTator + computer-generated gene results
      • Results: 40% decrease in curation time and slightly higher accuracy
      Wei, Harris, … Lu. Accelerating literature curation with text mining tools: a case study of using PubTator to curate genes in PubMed abstracts. Database, 2012; bas041

  11. Top rated by biocurators
      Arighi, et al. An overview of the BioCreative 2012 Workshop Track III: interactive text mining task. Database, 2013; bas056

  12. Successful applications of PubTator “PubTator substantially reduces the manual data input involved, reflected in both time-savings and reduction in physical fatigue of keyboard typing.” – Mindy C.

  13. Discussions
      • eCuration: computer-assisted curation can improve productivity
      • Future directions
         • Working with ontologies
         • Working with full text
      • What would you do with PubTator?

  14. Acknowledgments
      • My Team
         • Rezarta Dogan
         • Bethany Harris
         • Ritu Khare
         • Aurelie Neveol
         • Yuqing Mao
         • Robert Leaman
         • Jiao Li
         • Chih-Hsuan Wei
      • BioCreative
         • Lynette Hirschman, MITRE
         • Kevin Cohen, U of Colorado
         • Alfonso Valencia; Martin Krallinger, CNIO
         • Cecilia Arighi, Cathy Wu, U of Delaware
         • Carolyn Mattingly; Tom Wiegers, NCSU
      Supported by NIH Intramural Research Program, National Library of Medicine.

  15. Pacific Symposium on Biocomputing (PSB) 2015, January 4 – 8, 2015, The Big Island of Hawaii
      Crowdsourcing and Mining Crowd Data
      Crowdsourcing techniques: microtask environments, games with a purpose, workflow sequestration
      Crowd data: human genomics sequence data, electronic health records, social media data
      Robert Leaman and Zhiyong Lu, NCBI/NLM/NIH; Ben Good and Andrew Su, Scripps Research Institute

  16. Questions? Thank you! zhiyong.lu@nih.gov
