SemanticFind: Locating What You Want in a Patient Record, Not Just What You Ask For • John M. Prager, Jennifer J. Liang, Murthy V. Devarakonda • IBM T.J. Watson Research Center • AMIA Joint Summits, San Francisco CA, March 31 2017
Overview • What is SemanticFind? • What it is not • How is SemanticFind different from Ctrl/F? • 13 match types • 4 technologies • Prototype User Interface • Evaluation • Conclusions
What is SemanticFind? • An application that helps the physician search a patient’s medical record for matches to terms of interest • Extension of the familiar Ctrl/F “find” capability in document creation and reading applications • Not limited to matching solely the entered search term • Can find matching content along a variety of different dimensions
What it is Not • Not an application designed for finding matching records (patients) amongst a large collection, e.g. clinical trial matching • Can be done, just not optimized for this • Not a presentation of a new user interface • Prototype GUI is used in subsequent slides for demonstration of functionality • Not a Question-Answering System
So how is it different from Ctrl/F? • Search terms represent information needs, but • Information needs cannot usually be answered fully just by locating instances of them in text • Ambiguous intent behind the search term • E.g. if the search term is a disease, user might be wondering • When/how it was first diagnosed • What indicative labs for it were over time • Are there contraindications • Are there complications • What treatments were prescribed for it • Approach is to perform a variety of different searches simultaneously, and present the results organized by search type • EHR systems have both structured information (e.g. tables of orders, diagnoses, lab results) and unstructured (e.g. progress and other notes, free text) • Both kinds are searched • Search terms are unrestricted: symptoms, diseases, medications, anatomical structures, any sequence of characters • Search is mediated by UMLS, so search terms that correspond to concepts in UMLS can be more fully explored
SemanticFind Search Types (1) • Traditional Search • Conceptual Search • Associative Search
SemanticFind Search Types (2) • Inferential Search
Technologies Used • Literal Match • As traditional search, but case-insensitive and disregards singular/plural • Conceptual Match • UMLS concepts and relations for Semantic, More General/Specific • Our own Medical Concept Annotator, conceptually similar to cTAKES and MetaMap, but higher accuracy (paper in preparation) • Lab values and vitals mapped to indicated conditions • K 6.1 gets annotated as hyperkalemia • Parse- and Linguistic-principle-based transformations to catch semantically matching concepts/variants in UMLS • Pain in the abdomen not a variant of abdominal pain • NLP for Negation and Hypothetical • Patient denies discomfort with the rash • Ordered urine test to rule out arsenic poisoning • Associative Match • Uses Latent Semantic Analysis • Finds terms in the record that occur in the same contexts in the literature as the search terms • Useful for finding terms correlated with the search term but with no “named” relation, e.g. sob and wheezing • Inferential Match • Finds terms in the record that are related through curated relation chains to the search term • Most useful for <treats>, <prevents>, <causes> relations, e.g. Infection <includes> Lower respiratory tract infection <treated by> Amoxicillin <is ingredient of> Augmentin 875 mg-125 mg tablet
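The Inferential Match idea above can be illustrated with a small sketch. This is not the SemanticFind implementation: the RELATIONS table, the find_chains function and the max_hops parameter are hypothetical names introduced here, and the only relations included are those from the example chain on the slide. The sketch assumes a breadth-first walk over curated relations that reports a chain whenever it reaches a term present in the record.

```python
from collections import deque

# Toy curated relations, copied from the example chain on the slide.
# A real system would draw these from a curated knowledge source (assumption).
RELATIONS = {
    "infection": [("includes", "lower respiratory tract infection")],
    "lower respiratory tract infection": [("treated by", "amoxicillin")],
    "amoxicillin": [("is ingredient of", "augmentin 875 mg-125 mg tablet")],
}


def find_chains(search_term, record_terms, max_hops=3):
    """Breadth-first search over RELATIONS; return chains ending at a record term."""
    start = search_term.lower()
    targets = {t.lower() for t in record_terms}
    results = []
    queue = deque([(start, [start])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        hops = len(path) // 2  # path alternates term, <relation>, term, ...
        if node != start and node in targets:
            results.append(path)
        if hops >= max_hops:
            continue  # do not expand beyond the hop limit
        for relation, neighbor in RELATIONS.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [f"<{relation}>", neighbor]))
    return results


# Hypothetical record contents, for illustration only.
record = ["Augmentin 875 mg-125 mg tablet", "aspirin"]
for chain in find_chains("Infection", record):
    print(" ".join(chain))
```

Returning the whole chain rather than just the matched term lets an interface show why a medication was surfaced for a disease search. The hop limit is one simple way to keep long, weakly related chains out of the results.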
Evaluation • 3 types suitable for evaluation • Semantic Match • More Specific • Contradicted • 10 records selected at random • Average of 250 clinical notes per record • MD developed list of (13-32) search terms for each • Total of 169 terms, 134 unique • 4th-yr medical students used as assessors • Assessors generated a list of paraphrases for each search term • 0-13 per term. Total of 652. • Based on medical knowledge and/or lookup, without seeing the medical records.
Assessment task, per search term • SemanticFind used interactively to locate matches • Precision: • GUI enhanced with evaluation widgets for assessors to enter judgments of GOOD or BAD • #GOOD = True Positives (TP) • #BAD = False Positives (FP) • Precision = TP/(TP+FP) • Recall: • System automatically searched for user-generated paraphrases (via Literal Match), and counted how many of these did not correspond to GOOD in Precision task. • This count = False Negatives (FN) • Recall = TP/(TP+FN) • F-Measure • F = 2PR/(P+R)
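The three formulas above can be checked with a short calculation. This is a minimal sketch: the function name precision_recall_f is hypothetical, and the counts are made-up illustration values, not figures from the study.

```python
def precision_recall_f(tp, fp, fn):
    """Return (precision, recall, F) from raw counts; 0.0 when a denominator is zero."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f = 2 * precision * recall / (precision + recall)
    else:
        f = 0.0
    return precision, recall, f


# Hypothetical counts for one search term:
# 18 GOOD judgments, 2 BAD judgments, 6 paraphrase hits not marked GOOD.
p, r, f = precision_recall_f(tp=18, fp=2, fn=6)
print(f"P={p:.2f} R={r:.2f} F={f:.2f}")  # prints "P=0.90 R=0.75 F=0.82"
```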
Results (1) • Precision • Error analysis shows most FPs due to • ambiguity of abbreviations • negation detection error • Recall: 2 modes evaluated • Unconstrained = all supplied paraphrases • Constrained = only those paraphrases that matched UMLS concepts • F-Measure Unconstrained/Constrained = 0.87/0.92
Results (2) • Progressive analysis of GOOD matches: • Relative to Literal Match as a baseline • Semantic Match • Corresponds (very roughly) to Ctrl/F + Synonym Expansion • Semantic Match + More Specific • Semantic Match + More Specific + Contradicted • [Chart annotation: 103% “Dark Matter”]
Interesting Negation Detection Error • Due to somewhat informal formatting/writing of clinical notes, e.g.: “… alcohol use : no smoking : yes …” • Implicit sentence-end clear to humans, but not to computer, giving rise to recognition of “no smoking” • On fixing problem, reduced false positives by 30%
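The error can be reproduced with a deliberately naive negation rule. The slides do not say how the problem was fixed, so the second function is only one possible approach, assumed here for illustration: read "label : yes/no" pairs first, so that a "no" following a colon closes its own field instead of negating the next word. Both function names are hypothetical.

```python
import re

# The fragment shown on the slide, with the ellipses dropped.
NOTE = "alcohol use : no smoking : yes"


def naive_negated_terms(text):
    """Treat the word after every 'no' as negated, ignoring implicit sentence ends."""
    return re.findall(r"\bno\s+(\w+)", text, flags=re.IGNORECASE)


def field_aware_negated_terms(text):
    """Read 'label : yes/no' pairs; a 'no' after a colon answers its own label."""
    pairs = re.findall(r"([A-Za-z ]+?)\s*:\s*(yes|no)\b", text, flags=re.IGNORECASE)
    return [label.strip() for label, value in pairs if value.lower() == "no"]


print(naive_negated_terms(NOTE))        # ['smoking'], the false positive
print(field_aware_negated_terms(NOTE))  # ['alcohol use']
```

The field-aware version only handles this one note layout. It is meant to show why the implicit sentence end matters, not to serve as a general negation detector.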
Conclusions • SemanticFind is an application to search within a patient record • 13 searches performed simultaneously • using a variety of NLP technologies • Organised in a tabbed interface • High accuracy • F = 0.87 or 0.92 • Est. 2 points higher when sentence-end problem fixed • “Dark Matter” calculation shows that Ctrl/F misses as many desirable matches as it finds