This presentation describes work on replacing MetaMap as the Named Entity Recognition (NER) component. MetaMap is currently used to identify five entity types (experimental platform, condition, cell type, molecule, and drug/chemical compound/therapeutic modality), but its output contains errors, which motivates a trainable alternative. The replacement is a new model built on modified ABNER source code, with additional orthographic feature selection, updated MALLET usage and references, and work with JLex on ABNER's tokenization scanner. Remaining work includes finishing the code, testing the new model with stratified cross-validation, training it on modified input from MetaMap-processed abstracts, and formatting the output so that it is usable with stage 2.
Shock Group NER Replacement
Laura Christiansen
Overview
• Currently using MetaMap for NER
• Types: Experimental Platform, Condition, Cell Type, Molecule, Drug/Chemical Compound/Therapeutic Modality
• Errors present
• Trainable alternative
• Construct new model with ABNER (http://pages.cs.wisc.edu/~bsettles/abner/)
Current Work
• Modified ABNER source code
• Included additional orthographic feature selection
• Updated MALLET usage and references (http://mallet.cs.umass.edu/)
• Worked with JLex for ABNER tokenization scanner (http://www.cs.princeton.edu/~appel/modern/java/JLex/)
• Incorporating new functionality
• Stratified cross validation
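To make "orthographic features" concrete, here is a minimal sketch of how such features are typically computed: each token is checked against a set of surface-form patterns, and every pattern that matches becomes a binary feature for the tagger. The pattern names and regular expressions below are illustrative assumptions, not ABNER's actual feature set, and the sketch is written in Python for brevity even though ABNER and MALLET are Java.

```python
import re

# Hypothetical feature names and patterns, chosen for illustration only.
# They are NOT taken from the ABNER source code.
ORTHOGRAPHIC_PATTERNS = {
    "INITCAP": re.compile(r"[A-Z].*"),        # starts with a capital letter
    "ALLCAPS": re.compile(r"[A-Z]+"),         # consists only of capital letters
    "HASDIGIT": re.compile(r".*[0-9].*"),     # contains at least one digit
    "ALLDIGITS": re.compile(r"[0-9]+"),       # consists only of digits
    "HASDASH": re.compile(r".*-.*"),          # contains a hyphen
    # letters and digits only, with at least one of each
    "ALPHANUMERIC": re.compile(r"(?=.*[A-Za-z])(?=.*[0-9])[A-Za-z0-9]+"),
    # token is a spelled-out Greek letter (short, incomplete list)
    "GREEK": re.compile(r"(alpha|beta|gamma|delta|kappa)", re.IGNORECASE),
}


def orthographic_features(token):
    """Return the sorted names of all patterns that match the whole token."""
    return sorted(
        name
        for name, pattern in ORTHOGRAPHIC_PATTERNS.items()
        if pattern.fullmatch(token)
    )


if __name__ == "__main__":
    for token in ["IL-6", "TNF", "p53", "alpha", "sepsis"]:
        print(token, orthographic_features(token))
```

Matching the whole token (rather than searching inside it) keeps each feature a simple yes/no property of the token, which is the form a CRF-style tagger expects.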
Future Work
• Finish coding (and buy more coffee)
• Test new model with stratified cross validation
• Train with modified input from MetaMap-processed abstracts
• Format output to be usable with stage 2
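The stratified cross validation mentioned above can be sketched as follows. The idea is to split the data into k folds so that each label appears in roughly the same proportion in every fold. This sketch assumes each training unit (for example, an abstract or a sentence) has been given a single stratification label, such as its most frequent entity type; how units are labeled for stratification is an assumption here and is not specified in the slides. The function name `stratified_folds` is hypothetical.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k, seed=0):
    """Assign item indices to k folds, spreading each label evenly across folds.

    labels: one stratification label per item (assumed comparable, e.g. strings).
    Returns a list of k lists of item indices.
    """
    # Group item indices by their label.
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    folds = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        # Deal the items of this label out round-robin, continuing where the
        # previous label stopped so that overall fold sizes stay balanced.
        for position, index in enumerate(indices):
            folds[(offset + position) % k].append(index)
        offset += len(indices)
    return folds


if __name__ == "__main__":
    labels = ["Molecule"] * 6 + ["Cell Type"] * 3 + ["Condition"] * 3
    for fold_number, held_out in enumerate(stratified_folds(labels, k=3)):
        # Each fold serves once as the test set; the remaining folds are training data.
        print(fold_number, sorted(labels[i] for i in held_out))
```

In each round of cross validation, one fold is held out for testing and the model is trained on the other k - 1 folds, so rare entity types are represented in every test set rather than clustering in one.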