Machine Learning and NLP in Healthcare Data Analytics

Health Care Data Analytics Machine Learning and Natural Language Processing Lecture c This material (Comp 24 Unit 6) was developed by Oregon Health & Science University, funded by the Department of Health and Human Services, Office of the National Coordinator for Health Information Technology under Award Number 90WT0001. This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. To view a copy of this license, visit http://creativecommons.org/licenses/by-nc-sa/4.0/.

Machine Learning and Natural Language Processing: Learning Objectives • Describe the major tasks for which machine learning is used (Lecture a) • Compare and contrast the major approaches for machine learning (Lecture a) • Describe the major tasks for which natural language processing is used (Lectures b-c) • Discuss the major approaches and challenges for processing clinical narratives (Lectures b-c)

NLP of Clinical Text • Basic definitions and approaches to NLP • Challenges in processing the clinical narrative • Clinical NLP approaches and projects • Alternatives and future directions

Clinical NLP Approaches and Projects - 1 • Linguistic String Project • Medical Language Extraction and Encoding (MedLEE) system • Other NLP systems

Clinical NLP Approaches and Projects - 2 • Electronic Medical Records and Genomics (eMERGE) Network • i2b2 challenge evaluations • Other NLP uses and results • A growing number of commercial systems have become available.

Linguistic String Project (LSP) - 1 (Sager, 1987) • Presumptions • Technical documents in a single field use only a subset of English grammar and vocabulary • Most clinical narrative statements can be reduced to six information formats: • Medication • Test and result • Patient state • Patient behavior • General medical management • Treatment other than medication

Linguistic String Project (LSP) - 2 (Sager, 1987) • Steps • Parsing: Words labeled with syntactic category • Sublanguage selection: Disambiguation • Regularization: Words normalized into equivalent forms • Information formatting: Selection of one of six information formats • If a sentence unambiguously maps into a structure, it can be entered into a database.

Medication Information Format

MedLEE (Friedman, 1994) - 1 • Core approach is a “semantic grammar” that recognizes terms and attributes but not syntax. • Initial focus on radiology reports (chest x-ray [CXR], mammogram) but has been extended to discharge summaries, literature, and other areas.

MedLEE (Friedman, 1994) - 2 • Four steps • Preprocessor • Parser • Phrase regularizer • Encoder • After processing, output is sent to clinical information system.

Evaluation of MedLEE - 1 • Ability to detect presence of four conditions in CXR reports (Friedman, 1994) • 230 reports coded by three physicians • Basic system, recall = 70% and precision = 87% • When system modified for specific queries, recall improved to 85% while precision remained unchanged
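
As a reminder of how recall and precision figures like those above are computed, here is a minimal sketch. The counts are hypothetical, chosen only to give values in the same range as the reported ones; they are not taken from Friedman (1994).

```python
def recall(tp: int, fn: int) -> float:
    """Fraction of truly present conditions that the system detected."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision(tp: int, fp: int) -> float:
    """Fraction of system detections that were truly present."""
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical counts (assumption, for illustration only):
# tp = conditions the system found that the physicians also coded,
# fp = conditions the system found that the physicians did not code,
# fn = conditions the physicians coded that the system missed.
tp, fp, fn = 70, 10, 30

r = recall(tp, fn)
p = precision(tp, fp)
print(f"recall={r:.3f} precision={p:.3f}")
```

Tuning a system for specific queries, as described on the slide, typically reduces the missed cases (fn), which raises recall without necessarily changing precision.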

Evaluation of MedLEE - 2 • Comparison with human coders (Hripcsak, 1995) • Measured “distance” (average number of conditions per report where physicians disagreed) across internists, radiologists, lay persons, and computer systems (including MedLEE) for six conditions in 200 CXR reports • Distance across physicians (0.24) within confidence interval of MedLEE (0.26), larger for lay persons and other computer systems
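
The "distance" measure described above can be sketched as follows for a single pair of coders. This is a simplified illustration that assumes each coder marks each condition as present (1) or absent (0) for every report; the exact procedure in Hripcsak (1995), such as how pairs of raters are averaged, may differ. The function name and the sample data are hypothetical.

```python
from typing import Sequence


def distance(coder_a: Sequence[Sequence[int]],
             coder_b: Sequence[Sequence[int]]) -> float:
    """Average number of conditions per report on which two coders disagree."""
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must code the same reports")
    if not coder_a:
        raise ValueError("at least one report is required")
    disagreements = sum(
        sum(x != y for x, y in zip(report_a, report_b))
        for report_a, report_b in zip(coder_a, coder_b)
    )
    return disagreements / len(coder_a)


# Hypothetical codings: 4 reports, 6 conditions each (1 = present).
physician = [
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0],
]
system = [
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

d = distance(physician, system)
print(d)
```

A distance of 0 means perfect agreement; larger values mean the two coders disagree on more conditions per report.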

Extension of MedLEE - 1 • Subsequently extended to • Parsing of notational text (Barrows, 2000) • Terse, highly abbreviated text from ophthalmologists • MedLEE performed better than specialized parser for glaucoma • For six findings, had recall > 80% and precision = 100%

Extension of MedLEE - 2 • Subsequently extended to • Coding locations of strokes (Elkins, 2000) • MedLEE performed comparably to manual coding (based on ROC area) • Clinical documents generally (Friedman, 2004; Hripcsak, 2007) • Temporal data (Zhou, 2007; Zhou, 2008) • Combined with machine learning (Yadav, 2013) • Used operationally at New York Presbyterian Hospital

Other Clinical NLP Systems • HITEX: Part of i2b2 software (Zeng, 2006) https://www.i2b2.org/software/projects/hitex/hitex_manual.html • KnowledgeMap: Part of eMERGE Network (Denny, 2009) https://medschool.vanderbilt.edu/cpm/blog-categories/nlp • MetaMap: From NLM, makes use of UMLS Metathesaurus (Aronson, 2010) http://metamap.nlm.nih.gov • cTAKES: From Mayo Clinic (Savova, 2010) https://ctakes.apache.org • TIES: From University of Pittsburgh (Liu, 2011) http://ties.upmc.com

Electronic Medical Records and Genomics (eMERGE) Network - 1 • Looking for associations between genotype (genes in DNA) and phenotype (characteristics expressed in living organism) http://emerge.mc.vanderbilt.edu • Trying to link DNA biorepositories with EHR systems • Goal: “large-scale, high throughput genetic research” (McCarty, 2011; Wilke, 2011)

Electronic Medical Records and Genomics (eMERGE) Network - 2 • For most phenotypes, ICD-9 codes inadequate; NLP of text notes and reports as well as medication data provides higher accuracy in identification (Ritchie, 2010; Denny, 2012)
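
One way to picture why combining sources helps is a rule that requires agreement between at least two kinds of evidence. The sketch below is purely illustrative and is not an eMERGE phenotype algorithm: the class, the function, the threshold, and the example code, concept, and medication lists are all assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class PatientRecord:
    icd9_codes: set[str] = field(default_factory=set)
    nlp_concepts: set[str] = field(default_factory=set)  # non-negated concepts from notes
    medications: set[str] = field(default_factory=set)


def is_case(rec: PatientRecord,
            codes: set[str],
            concepts: set[str],
            meds: set[str],
            min_sources: int = 2) -> bool:
    """Call a patient a case if enough independent sources support the phenotype."""
    evidence = [
        bool(rec.icd9_codes & codes),       # billing codes
        bool(rec.nlp_concepts & concepts),  # NLP of notes and reports
        bool(rec.medications & meds),       # medication data
    ]
    return sum(evidence) >= min_sources


# Hypothetical definition lists (assumptions, for illustration only).
T2DM_CODES = {"250.00"}
T2DM_CONCEPTS = {"type 2 diabetes mellitus"}
T2DM_MEDS = {"metformin"}

code_only = PatientRecord(icd9_codes={"250.00"})
code_and_med = PatientRecord(icd9_codes={"250.00"}, medications={"metformin"})

print(is_case(code_only, T2DM_CODES, T2DM_CONCEPTS, T2DM_MEDS))
print(is_case(code_and_med, T2DM_CODES, T2DM_CONCEPTS, T2DM_MEDS))
```

Requiring more than one source trades some recall for precision, since a single billing code on its own no longer qualifies a patient.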

Results from eMERGE - 1 • Initial work used EHR data to replicate known gene-disease associations previously found in research data (Denny, 2010) • Additional work has led to discovery of new associations (Denny, 2013; Crawford, 2014)

Results from eMERGE - 2 • NLP algorithms easily transportable across institutions (Kullo, 2011; Liu, 2012) • Given rise to phenome-wide association studies (PheWAS), where many aspects of the patient phenome are tested for association with a genome variant (Denny, 2010; Bush, 2016)

i2b2 Challenge Evaluations • Annual challenges with overview and system papers • Relationships between concepts (entities) in clinical text (Uzuner, 2011) • Coreference resolution and sentiment classification (Uzuner, 2012) • Temporal relations (Uzuner, 2013) • De-identification and risk factor detection (Uzuner, 2015) • Automated de-identification of records (Uzuner, 2007) • Identification of smoking status from medical discharge summaries (Uzuner, 2008) • Identification of obesity and its co-morbidities (Uzuner, 2009) • Extracting medication information (Uzuner, 2010)

Other Research in NLP of Clinical Text - 1 • Negation detection (Chapman, 2001) – NegEx system to detect negation in clinical charts • Syndromic surveillance of emergency department chief complaints (Chapman, 2005) • Detection of healthcare quality measures (Hazlehurst, 2005; Hazlehurst, 2005; Yetisgen, 2015)
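
The idea behind trigger-based negation detection can be sketched as follows. This is a toy example in the spirit of NegEx, not the published algorithm: the trigger list, the five-token window, and the function name are assumptions, and it only looks for triggers that precede the finding.

```python
import re

# Hypothetical trigger phrases and window size (assumptions).
NEGATION_TRIGGERS = ("no evidence of", "negative for", "denies", "without", "no")
WINDOW = 5  # number of tokens before the finding to inspect


def _tokenize(text: str) -> list:
    return re.findall(r"[a-z0-9']+", text.lower())


def is_negated(sentence: str, finding: str) -> bool:
    """Return True if a trigger phrase occurs shortly before the finding."""
    tokens = _tokenize(sentence)
    target = _tokenize(finding)
    if not target:
        return False
    n = len(target)
    for i in range(len(tokens) - n + 1):
        if tokens[i:i + n] == target:
            preceding = " ".join(tokens[max(0, i - WINDOW):i])
            if any(re.search(rf"\b{re.escape(t)}\b", preceding)
                   for t in NEGATION_TRIGGERS):
                return True
    return False


print(is_negated("The patient denies chest pain.", "chest pain"))
print(is_negated("Patient reports chest pain.", "chest pain"))
print(is_negated("Chest pain, no fever.", "fever"))
```

A sketch like this ignores scope terminators (for example "but") and triggers that follow the finding, which is part of why real clinical text needs a more careful rule set.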

Other Research in NLP of Clinical Text - 2 • Clinical research – finding patients with congestive heart failure (Pakhomov, 2007) and classifying foot examination results in patients with diabetes (Pakhomov, 2008) • Identification of follow-up recommendations from radiology reports (Yetisgen-Yildiz, 2013) • Handling abbreviation (Wu, 2015) and other ambiguity in clinical text (Blair, 2014)

Evaluation Results – Is Clinical NLP Ready for “Prime Time?” • Figure: Recall of coding and classification studies over time (Stanfill, 2010) • Figure: Precision of coding and classification evaluations over time (Stanfill, 2010)

Alternatives and Future Directions - 1 • Clinical NLP systems are limited by • Difficulty generalizing across domains: new rules, data, etc. must be developed for each new area of use • Uncertainty about how good system performance must be for reliable clinical use: is even 95-98% good enough?

Alternatives and Future Directions - 2 • Alternative: can we enter structured clinical information by other means? • Menu-driven systems tried for years (Greenes, 1982; Cimino, 1987; Bell, 1994) but probably best for limited domains • Need tools and shared tasks to define optimal role (Chapman, 2011)

Machine Learning and Natural Language Processing: Summary – Lecture c • There have been many clinical NLP systems, but only a small number are used operationally for clinical care or research • The performance of clinical NLP systems is imperfect, and the adequate level of performance for clinical use is not known • Further research is required to determine the optimal use of NLP in health care

Machine Learning and Natural Language Processing: Summary • Being able to learn from data and process data within text are important aspects of applying data analytics to health care • Machine learning is the field focused on learning from data, and can occur in a supervised or unsupervised manner • Natural language processing is the area that aims to understand text in natural languages, and has many challenges in the clinical domain

Machine Learning and Natural Language Processing: References – 1 – Lecture c • Aronson, A., & Lang, F. (2010). An overview of MetaMap: historical perspective and recent advances. Journal of the American Medical Informatics Association, 17, 229-236. • Barrows, R., Busuioc, M., & Friedman, C. (2000). Limited parsing of notational text visit notes: ad-hoc vs. NLP approaches. Paper presented at the Proceedings of the AMIA 2000 Annual Symposium, Los Angeles, CA. • Bell, D., & Greenes, R. (1994). Evaluation of UltraSTAR: performance of a collaborative structured data entry system. Paper presented at the Proceedings of the 18th Annual Symposium on Computer Applications in Medical Care, Washington, DC. • Blair, D., Wang, K., Nestorov, S., Evans, J., & Rzhetsky, A. (2014). Quantifying the impact and extent of undocumented biomedical synonymy. PLoS Computational Biology, 10, e1003799. • Bush, W., Oetjens, M., & Crawford, D. (2016). Unravelling the human genome-phenome relationship using phenome-wide association studies. Nature Reviews Genetics, 17, 129-145.

Machine Learning and Natural Language Processing: References – 2 – Lecture c • Chapman, W., Bridewell, W., Hanbury, P., Cooper, G., & Buchanan, B. (2001). A simple algorithm for identifying negated findings and diseases in discharge summaries. Journal of Biomedical Informatics, 34, 301-310. • Chapman, W., Dowling, J., & Wagner, M. (2005). Classification of emergency department chief complaints into 7 syndromes: a retrospective analysis of 527,228 patients. Annals of Emergency Medicine, 46, 445-455. • Chapman, W., Nadkarni, P., Hirschman, L., D'Avolio, L., Savova, G., & Uzuner, O. (2011). Overcoming barriers to NLP for clinical text: the role of shared tasks and the need for additional creative solutions. Journal of the American Medical Informatics Association, 18, 540-543. • Cimino, J., & Barnett, G. (1987). The physician's workstation: recording a physical examination using a controlled vocabulary. Paper presented at the Proceedings of the 11th Annual Symposium on Computer Applications in Medical Care, Washington, DC. • Crawford, D., Crosslin, D., Tromp, G., Kullo, I., Kuivaniemi, H., Hayes, M., . . . Ritchie, M. (2014). eMERGEing progress in genomics-the first seven years. Frontiers in Genetics, 5, 184.

Machine Learning and Natural Language Processing: References – 3 – Lecture c • Denny, J., Bastarache, L., Ritchie, M., Carroll, R., Zink, R., Mosley, J., . . . Roden, D. (2013). Systematic comparison of phenome-wide association study of electronic medical record data and genome-wide association study data. Nature Biotechnology, 31, 1102-1111. • Denny, J., Choma, N., Peterson, J., Miller, R., Bastarache, L., Li, M., & Peterson, N. (2012). Natural language processing improves identification of colorectal cancer testing in the electronic medical record. Medical Decision Making, 32, 188-197. • Denny, J., Miller, R., Waitman, L., Arrieta, M., & Peterson, J. (2009). Identifying QT prolongation from ECG impressions using a general-purpose natural language processor. International Journal of Medical Informatics, 78(Suppl 1), S34-42. • Denny, J., Ritchie, M., Basford, M., Pulley, J., Bastarache, L., Brown-Gentry, K., . . . Crawford, D. (2010). PheWAS: Demonstrating the feasibility of a phenome-wide scan to discover gene-disease associations. Bioinformatics, 26, 1205-1210. • Denny, J., Ritchie, M., Crawford, D., Schildcrout, J., Ramirez, A., Pulley, J., . . . Roden, D. (2010). Identification of genomic predictors of atrioventricular conduction: using electronic medical records as a tool for genome science. Circulation, 122, 2016-2021.

Machine Learning and Natural Language Processing: References – 4 – Lecture c • Elkins, J., Friedman, C., Boden-Albala, B., Sacco, R., & Hripcsak, G. (2000). Coding neuroradiology reports for the Northern Manhattan Stroke Study: a comparison of natural language processing and manual review. Computers and Biomedical Research, 33, 1-10. • Friedman, C., Alderson, P., Austin, J., Cimino, J., & Johnson, S. (1994). A general natural-language text processor for clinical radiology. Journal of the American Medical Informatics Association, 1, 161-174. • Friedman, C., Shagina, L., Lussier, Y., & Hripcsak, G. (2004). Automated encoding of clinical documents based on natural language processing. Journal of the American Medical Informatics Association, 11, 392-402. • Greenes, R. (1982). OBUS: a microcomputer system for measurement, calculation, reporting, and retrieval of obstetric ultrasound examinations. Radiology, 144, 879-883. • Hazlehurst, B., Frost, H., Sittig, D., & Stevens, V. (2005). MediClass: a system for detecting and classifying encounter-based clinical events in any electronic medical record. Journal of the American Medical Informatics Association, 12, 517-529.

Machine Learning and Natural Language Processing: References – 5 – Lecture c • Hazlehurst, B., Sittig, D., Stevens, V., Smith, K., Hollis, J., Vogt, T., . . . Rigotti, N. (2005). Natural language processing in the electronic medical record: assessing clinician adherence to tobacco treatment guidelines. American Journal of Preventive Medicine, 29, 434-439. • Hripcsak, G., Friedman, C., Alderson, P., DuMouchel, W., Johnson, S., & Clayton, P. (1995). Unlocking clinical data from narrative reports: a study of natural language processing. Annals of Internal Medicine, 122, 681-688. • Hripcsak, G., Knirsch, C., Zhou, L., Wilcox, A., & Melton, G. (2007). Using discordance to improve classification in narrative clinical databases: an application to community-acquired pneumonia. Computers in Biology and Medicine, 37, 296-304. • Kullo, I., Ding, K., Shameer, K., McCarty, C., Jarvik, G., Denny, J., . . . Chute, C. (2011). Complement receptor 1 gene variants are associated with erythrocyte sedimentation rate. American Journal of Human Genetics, 89, 131-138. • Liu, K., Hogan, W., & Crowley, R. (2011). Natural language processing methods and systems for biomedical ontology learning. Journal of Biomedical Informatics, 44, 163-179.

Machine Learning and Natural Language Processing: References – 6 – Lecture c • Liu, M., Shah, A., Jiang, M., Peterson, N., Dai, Q., Aldrich, M., . . . Xu, H. (2012). A study of transportability of an existing smoking status detection module across institutions. Paper presented at the AMIA Annual Symposium Proceedings 2012, Chicago, IL. • McCarty, C., Chisholm, R., Chute, C., Kullo, I., Jarvik, G., Larson, E., . . . Wolf, W. (2011). The eMERGE Network: a consortium of biorepositories linked to electronic medical records data for conducting genomic studies. BMC Medical Genomics, 4(1), 13. • Pakhomov, S., Hanson, P., Bjornsen, S., & Smith, S. (2008). Automatic classification of foot examination findings using statistical natural language processing and machine learning. Journal of the American Medical Informatics Association, 15, 198-202. • Pakhomov, S., Weston, S., Jacobsen, S., Chute, C., Meverden, R., & Roger, V. (2007). Electronic medical records for clinical research: application to the identification of heart failure. American Journal of Managed Care, 13, 281-288. • Ritchie, M., Denny, J., Crawford, D., Ramirez, A., Weiner, J., Pulley, J., . . . Roden, D. (2010). Robust replication of genotype-phenotype associations across multiple diseases in an electronic medical record. American Journal of Human Genetics, 86, 560-572.

Machine Learning and Natural Language Processing: References – 7 – Lecture c • Sager, N., Friedman, C., & Lyman, M. (1987). Medical Language Processing: Computer Management of Narrative Data. Reading, MA: Addison-Wesley. • Savova, G., Masanz, J., Ogren, P., Zheng, J., Sohn, S., Kipper-Schuler, K., & Chute, C. (2010). Mayo clinical Text Analysis and Knowledge Extraction System (cTAKES): architecture, component evaluation and applications. Journal of the American Medical Informatics Association, 17, 507-513. • Stanfill, M., Williams, M., Fenton, S., Jenders, R., & Hersh, W. (2010). A systematic literature review of automated clinical coding and classification systems. Journal of the American Medical Informatics Association, 17, 646-651. • Uzuner, O. (2009). Recognizing obesity and comorbidities in sparse data. Journal of the American Medical Informatics Association, 16, 561-570. • Uzuner, O. (2013). Evaluating temporal relations in clinical text: 2012 i2b2 Challenge. Journal of the American Medical Informatics Association, 20, 806-813. • Uzuner, O., Bodnari, A., Shen, S., Forbush, T., Pestian, J., & South, B. (2012). Evaluating the state of the art in coreference resolution for electronic medical records. Journal of the American Medical Informatics Association, 19, 786-791.

Machine Learning and Natural Language Processing: References – 8 – Lecture c • Uzuner, O., Goldstein, I., Luo, Y., & Kohane, I. (2008). Identifying patient smoking status from medical discharge records. Journal of the American Medical Informatics Association, 15, 14-24. • Uzuner, O., Luo, Y., & Szolovits, P. (2007). Evaluating the state-of-the-art in automatic de-identification. Journal of the American Medical Informatics Association, 14, 550-563. • Uzuner, O., Solti, I., & Cadag, E. (2010). Extracting medication information from clinical text. Journal of the American Medical Informatics Association, 17, 514-518. • Uzuner, Ö., South, B., Shen, S., & DuVall, S. (2011). 2010 i2b2/VA challenge on concepts, assertions, and relations in clinical text. Journal of the American Medical Informatics Association, 18, 552-556. • Uzuner, O., & Stubbs, A. (2015). Practical applications for natural language processing in clinical research: The 2014 i2b2/UTHealth shared tasks. Journal of Biomedical Informatics, 58(Suppl), S1-S5.

Machine Learning and Natural Language Processing: References – 9 – Lecture c • Wilke, R., Xu, H., Denny, J., Roden, D., Krauss, R., McCarty, C., . . . Savova, G. (2011). The emerging role of electronic medical records in pharmacogenomics. Clinical Pharmacology and Therapeutics, 89, 379-386. • Wu, Y., Denny, J., Rosenbloom, S., Miller, R., Giuse, D., Song, M., & Xu, H. (2015). A preliminary study of clinical abbreviation disambiguation in real time. Applied Clinical Informatics, 6, 364-374. • Yadav, K., Sarioglu, E., Smith, M., & Choi, H. (2013). Automated outcome classification of emergency department computed tomography imaging reports. Academic Emergency Medicine, 20(8), 848-854. • Yetisgen, M., Klassen, P., & Tarczy-Hornoch, P. (2015). Automating data abstraction in a quality improvement platform for surgical and interventional procedures. eGEMS, 2, 1114.

