Explore the journey of developing Python tools for medical image analysis, with a focus on pulmonary embolism detection and text processing. Learn about the pyConText framework, its core components, and the author's motivation and objectives. Discover the applications, challenges, and future directions in this evolving field.
Development of ConText Tools in Python Brian E. Chapman, PhD, Glenn Dayton, Wendy W. Chapman, PhD Division of Biomedical Informatics
Caveats and Apologies • I’m not a linguist, computational or otherwise, nor am I a grammarian, or multilingual, or particularly well spoken in my native English • In fact I’m a medical physicist who has drifted into imaging informatics with an emphasis on the image part of imaging informatics • Which is a long way of saying that I got into this field because of a specific problem
Motivation • Received NIH funding for computer-aided detection of pulmonary embolism in CT pulmonary angiography (CTPA) • How to identify appropriate cases from clinical PACS?
Case Identification Approach #1 • Talk to an honest broker • Who was obviously overworked • Who used procedure codes from RIS to identify potential cases • Who then read the dictated report • Who then classified the case • Who nearly fainted when I told her I needed hundreds of positive cases • Who then quickly asked, “Do you have a lot of money?”
Case Identification Approach #2 • Honest broker’s task is perfect for NegEx • Use procedure codes to identify reports in MARS repository at University of Pittsburgh • Use NegEx to classify reports as +/- for PE • Within minutes find hundreds of cases • Very happy honest broker
What if you wanted to answer more questions? • Disease uncertainty • Disease temporality • Image quality • Can we a priori specify all of these?
peFinder • Application to characterize CTPA reports • Presence or absence of PE • Temporal state of positive PE • Uncertainty of disease state • Technical quality of the exam
For Review: NegEx • Clinical condition: cough • Negation: negated scope • “Patient denies cough but complains of headache.” • “No change in the patient’s chest pain.” • Legend: trigger term, termination term, pseudo-trigger term
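The review slide can be made concrete with a minimal sketch of NegEx-style scoping. It is not the NegEx implementation: the three term lists are tiny, hypothetical stand-ins, the function name `is_negated` is invented for illustration, and only forward-looking triggers are handled.

```python
import re

# Hypothetical, deliberately tiny term lists (assumptions for illustration).
PSEUDO_TRIGGERS = ["no change"]   # look like negation but are not
TRIGGERS = ["denies", "no"]       # open a forward negation scope
TERMINATIONS = ["but"]            # close an open negation scope


def _word(term):
    """Build a whole-word pattern for a literal term."""
    return r"\b" + re.escape(term) + r"\b"


def is_negated(sentence, condition):
    """Return True if `condition` lies inside a forward negation scope."""
    text = sentence.lower()
    # Mask pseudo-triggers first so the "no" in "no change" is never a trigger.
    for pseudo in PSEUDO_TRIGGERS:
        text = re.sub(_word(pseudo), lambda m: "_" * len(m.group()), text)
    target = re.search(_word(condition.lower()), text)
    if target is None:
        return False
    for trigger in TRIGGERS:
        for match in re.finditer(_word(trigger), text):
            if match.end() > target.start():
                continue  # a forward trigger must precede the condition
            scope = text[match.end():target.start()]
            # The condition is negated only if no termination term intervenes.
            if not any(re.search(_word(t), scope) for t in TERMINATIONS):
                return True
    return False
```

Calling `is_negated` on the two example sentences shows the three roles: "denies" negates "cough", "but" stops the scope before "headache", and "no change" is masked so "chest pain" is left unnegated.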
Python Implementations • What Drove My Organic Design • What existed in NegEx • GUI program written in Tcl/Tk • Lots of enumerated trigger terms • What I wanted • I wanted a package that could be used to build a variety of accurate applications • I wanted it to be easy for others to use • I am an engineer and so lazy • Generalize relationships • Replace exhaustive enumeration of trigger terms with regular expressions
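To illustrate the last design goal, here is a small sketch contrasting an enumerated term list with a single regular expression. The term list, the pattern, and the function names are assumptions chosen for the example, not the lists shipped with NegEx or pyConText.

```python
import re

# Enumerated approach: every surface form has to be listed by hand (hypothetical list).
ENUMERATED = ["embolus", "emboli", "embolism", "embolic"]

# Regular-expression approach: one pattern covers the whole word family.
PATTERN = re.compile(r"\bembol[a-z]+\b", re.IGNORECASE)


def matches_enumerated(text):
    """True if any hand-listed term occurs as a whole word."""
    lowered = text.lower()
    return any(re.search(r"\b" + re.escape(t) + r"\b", lowered) for t in ENUMERATED)


def matches_pattern(text):
    """True if the generalized pattern occurs anywhere in the text."""
    return PATTERN.search(text) is not None
```

The trade-off runs both ways: the pattern picks up forms nobody thought to list, but it can also match words that were never intended (for example "embolization"), so generalized patterns still need review.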
pyConText: Basic Framework • Item Objects: 4-tuple containing Lexical and Domain Knowledge • Literal (label): “pulmonary embolism” • Category/Concept • Regular expression • r'''(pulmonary )(artery )?(embol[a-z]+)''' • Rule • Directional influence of item in sentence • Category interaction?
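The 4-tuple described above might be sketched as follows. This is an illustration, not pyConText's actual API: the `Item` namedtuple, its field names, the category strings, the `"forward"` rule value, and the `find_item` helper are all assumptions.

```python
import re
from collections import namedtuple

# Hypothetical stand-in for an item: (literal, category, regex, rule).
Item = namedtuple("Item", ["literal", "category", "regex", "rule"])

# A target item: the regex from the slide generalizes the literal.
pe = Item(
    literal="pulmonary embolism",
    category="PULMONARY_EMBOLISM",   # assumed category name
    regex=r"(pulmonary )(artery )?(embol[a-z]+)",
    rule="",                         # targets carry no directional rule here
)

# A modifier item: no regex, so the literal itself is matched.
no = Item(literal="no", category="NEGATION", regex="", rule="forward")


def find_item(item, sentence):
    """Return (start, end) spans where the item occurs in the sentence.

    Falls back to a whole-word match on the literal when no regex is given.
    """
    pattern = item.regex or r"\b" + re.escape(item.literal) + r"\b"
    return [m.span() for m in re.finditer(pattern, sentence, re.IGNORECASE)]
```

Keeping the literal separate from the regular expression means the label stays human-readable while the pattern absorbs the spelling variants.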
pyConText: Basic Framework • Item Objects parse sentence to create Tag Objects within sentences • Tag Objects interact/modify each other • Targets • Modifiers • Conjunctions • Prune to eliminate subset tag objects • Directional Graph represents relationships
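The tagging, pruning, and graph steps can be sketched with plain Python data structures. The `Tag` fields, the `kind` and `rule` values, and both function names are assumptions for illustration; the edge list stands in for the directional graph and is not how pyConText stores it.

```python
from collections import namedtuple

# Hypothetical tag: a matched phrase plus its role, rule, and character span.
Tag = namedtuple("Tag", ["phrase", "kind", "rule", "start", "end"])


def prune(tags):
    """Drop every tag whose span lies inside a strictly longer tag's span."""
    kept = []
    for t in tags:
        covered = any(
            o.start <= t.start and t.end <= o.end
            and (o.end - o.start) > (t.end - t.start)
            for o in tags
        )
        if not covered:
            kept.append(t)
    return kept


def build_graph(tags):
    """Return directed edges (modifier phrase, target phrase).

    A forward modifier reaches targets to its right and a backward modifier
    reaches targets to its left, unless a conjunction tag sits between them.
    """
    targets = [t for t in tags if t.kind == "target"]
    conjunctions = [t for t in tags if t.kind == "conjunction"]
    edges = []
    for m in (t for t in tags if t.kind == "modifier"):
        for tgt in targets:
            if m.rule == "forward" and m.end <= tgt.start:
                left, right = m, tgt
            elif m.rule == "backward" and tgt.end <= m.start:
                left, right = tgt, m
            else:
                continue
            blocked = any(left.end <= c.start and c.end <= right.start
                          for c in conjunctions)
            if not blocked:
                edges.append((m.phrase, tgt.phrase))
    return edges


# "No evidence of pulmonary embolism." tagged by three hypothetical items.
tags = [
    Tag("no", "modifier", "forward", 0, 2),
    Tag("no evidence of", "modifier", "forward", 0, 14),
    Tag("pulmonary embolism", "target", "", 15, 33),
]
edges = build_graph(prune(tags))
```

Pruning removes "no" because it is a subset of "no evidence of", so the target ends up with a single incoming modifier edge rather than two overlapping ones.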
Did I Meet My Objectives? • Accurate • Yes: JBI 2011 • Modular • Yes: package on PyPI • Easy for others to use • Depends on your definition of others • Wilson, et al. Journal of Pathology Informatics • Gentili and Chapman RSNA
Did I Meet My Objectives? • Easy for others to use (continued) • Can any application relying on user to provide regular expressions be defined as easy?
Current and Future Work • Web and GUI applications • Django • Django with Twisted for desktop port
Current and Future Work • Improved Knowledge Representation • Separating linguistic and domain knowledge • Integration with external knowledge bases • Use graphs to further reduce enumeration of items • No/definite/evidence of/pulmonary embolism
Thanks for the invitation • Looking forward to • Learning and • Working and • Skateboarding • For the next three weeks