
The UMLS Semantic Network for Natural Language Processing. Thomas C. Rindflesch, Ph.D., Lister Hill National Center for Biomedical Communications. Workshop on the Future of the UMLS Semantic Network.


Presentation Transcript


  1. The UMLS Semantic Network for Natural Language Processing. Thomas C. Rindflesch, Ph.D., Lister Hill National Center for Biomedical Communications. Workshop on the Future of the UMLS Semantic Network

  2. Goal • Sophisticated access to online information • Supplement document retrieval with: • Information extraction • Automatic summarization • Question answering • Literature-based discovery • Central concern of informatics research

  3. Challenge: Language Complexity The average age of participants (approximately 63 years), the predominance of women, and the high prevalence of comorbid conditions (for example, hypertension and cardiovascular disease) reflect typical characteristics of patients with osteoarthritis.

  4. Challenge: Language Complexity The average age of participants (approximately 63 years), the predominance of women, and the high prevalence of comorbid conditions (for example, hypertension and cardiovascular disease) reflect typical characteristics of patients with osteoarthritis. • Language encodes a lot of information

  5. Natural Language Processing • Various approaches • Correspond to levels of linguistic expression • Words • Phrases • Relations

  6. Words The average age of participants (approximately 63 years), the predominance of women, and the high prevalence of comorbid conditions (for example, hypertension and cardiovascular disease) reflect typical characteristics of patients with osteoarthritis.

  7. Words • age • approximately • average • cardiovascular • characteristics • comorbid • conditions • disease • example • high • hypertension • osteoarthritis • participants • patients • predominance • prevalence • reflect • typical • women • years
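A word-level view like the one on this slide can be sketched in a few lines. This is only an illustration, assuming lowercasing, a tiny hand-picked stopword list, and dropping numbers; the slides do not say how the word list was actually produced.

```python
import re

SENTENCE = (
    "The average age of participants (approximately 63 years), the "
    "predominance of women, and the high prevalence of comorbid conditions "
    "(for example, hypertension and cardiovascular disease) reflect typical "
    "characteristics of patients with osteoarthritis."
)

# Assumption: a minimal stopword list, just enough for this one sentence.
STOPWORDS = {"the", "of", "and", "for", "with"}


def content_words(text: str) -> list[str]:
    """Return the distinct alphabetic, non-stopword tokens in sorted order."""
    tokens = re.findall(r"[a-z]+", text.lower())  # digits such as "63" are skipped
    return sorted(set(tokens) - STOPWORDS)


WORDS = content_words(SENTENCE)
print(WORDS)
```

The result is an alphabetical bag of words: all word order, grouping, and relations from the sentence are lost, which is the point the slide makes.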

  8. Phrases The average age of participants (approximately 63 years), the predominance of women, and the high prevalence of comorbid conditions (for example, hypertension and cardiovascular disease) reflect typical characteristics of patients with osteoarthritis.

  9. Phrases • average age • participants • approximately 63 years • predominance • women • high prevalence • comorbid conditions • example • hypertension • cardiovascular disease • typical characteristics • patients • osteoarthritis

  10. Semantic Predications The average age of participants (approximately 63 years), the predominance of women, and the high prevalence of comorbid conditions (for example, hypertension and cardiovascular disease) reflect typical characteristics of patients with osteoarthritis.

  11. Semantic Predications The average age of participants (approximately 63 years), the predominance of women, and the high prevalence of comorbid conditions (for example, hypertension and cardiovascular disease) reflect typical characteristics of patients with osteoarthritis.

  12. Semantic Predications • Cardiovascular Diseases CO-OCCURS_WITH Degenerative polyarthritis • Hypertension CO-OCCURS_WITH Degenerative polyarthritis

  13. Semantic Interpretation • Map syntactic structures to structured domain knowledge • Concepts • Relations • Output is semantic predication • Arguments and a predicate in a relationship • Supports enhanced access to online information
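A semantic predication (two arguments and a predicate) can be held in a small record type. The sketch below is one possible representation, not SemRep's own data format; the class and field names are hypothetical, and the two instances are the predications shown for the osteoarthritis sentence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Predication:
    """Hypothetical container: subject concept, predicate, object concept."""
    subject: str
    predicate: str
    obj: str

    def __str__(self) -> str:
        return f"{self.subject} {self.predicate} {self.obj}"


PREDICATIONS = [
    Predication("Cardiovascular Diseases", "CO-OCCURS_WITH", "Degenerative polyarthritis"),
    Predication("Hypertension", "CO-OCCURS_WITH", "Degenerative polyarthritis"),
]

for predication in PREDICATIONS:
    print(predication)
```

Because the arguments are concepts rather than surface words, "osteoarthritis" in the text appears here under its Metathesaurus name, "Degenerative polyarthritis".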

  14. Related Research in Biomedicine • MedLEE, GENIES [Friedman, et al.] • Semantic grammar • AQUA [Johnson, Campbell] • Definite clause grammar • MPLUS [Haug, et al.] • Chart parser • MEDSYNDIKATE [Hahn, et al.] • Dependency grammar

  15. SemRep • Interpret semantic predications in Medline • Exploit the UMLS • Concepts: Metathesaurus • Relations: Semantic Network • Syntax: SPECIALIST Lexicon • Use other resources at NLM • MetaMap • UMLS Knowledge Source Server

  16. Minimal Commitment Approach • Focused processing • Syntax • Semantics • Incremental development • Useful results

  17. SemRep: System Overview [diagram] • Medical Text • MedPost Tagger • Lexical Look-up • Resolve Ambiguity • Parser • MetaMap • Construct Relation • Semantic Predication • Resources: SPECIALIST Lexicon, Metathesaurus, Semantic Network

  18. Input The aim of this study was the characterization of the specific effects of alprazolam versus imipramine in the treatment of panic disorder with agoraphobia and the delineation of dose-response and possible plasma level-response relationships.

  19. Syntactic Processing [diagram] • Text • MedPost Tagger • Lexical Look-up • Resolve Ambiguity • Parser • Resource: SPECIALIST Lexicon

  20. Syntactic Processing The aim of this study was the characterization of the specific effects NP[of alprazolam] [versus] NP[imipramine] NP[in the treatment] (Nominalization) NP[of panic disorder] NP[with Agoraphobia] and the delineation of dose-response and possible plasma level-response relationships.

  21. MetaMap: Metathesaurus Concepts [diagram] • Text • MedPost Tagger • Lexical Look-up • Resolve Ambiguity • Parser • MetaMap • Resources: SPECIALIST Lexicon, Metathesaurus

  22. MetaMap: Metathesaurus Concepts The aim of this study was the characterization of the specific effects NP[of Alprazolam] [versus] NP[Imipramine] NP[in treatment] (Nominalization) NP[of Panic Disorder] NP[with Agoraphobia] and the delineation of dose-response and possible plasma level-response relationships.

  23. Semantic Types The aim of this study was the characterization of the specific effects NP[of phsu] [versus] NP[phsu] NP[in treatment] (Nominalization) NP[of dsyn] NP[with dsyn] and the delineation of dose-response and possible plasma level response relationships. • phsu: Pharmacologic Substance • dsyn: Disease or Syndrome
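The step from phrases to concepts to semantic types can be pictured as two lookups. The sketch below uses small hand-written dictionaries as stand-ins for MetaMap and the Metathesaurus; the table contents come from the example sentence, while the function and variable names are hypothetical.

```python
from typing import Optional

# Assumption: toy lookup tables standing in for MetaMap / the Metathesaurus.
CONCEPTS = {
    "alprazolam": "Alprazolam",
    "imipramine": "Imipramine",
    "panic disorder": "Panic Disorder",
    "agoraphobia": "Agoraphobia",
}
SEMANTIC_TYPES = {
    "Alprazolam": "phsu",      # Pharmacologic Substance
    "Imipramine": "phsu",
    "Panic Disorder": "dsyn",  # Disease or Syndrome
    "Agoraphobia": "dsyn",
}


def semantic_type(phrase_head: str) -> Optional[str]:
    """Map a phrase head to its semantic type abbreviation, if a concept is found."""
    concept = CONCEPTS.get(phrase_head.lower())
    return SEMANTIC_TYPES.get(concept) if concept else None


HEADS = ["alprazolam", "imipramine", "treatment", "panic disorder", "Agoraphobia"]
TYPED = [(head, semantic_type(head)) for head in HEADS]
print(TYPED)
```

In this toy version "treatment" gets no semantic type; in the example it serves as the nominalization that signals a relation rather than as an argument.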

  24. Construct Relation [diagram] • Medical Text • MedPost Tagger • Lexical Look-up • Resolve Ambiguity • Parser • MetaMap • Construct Relation • Semantic Predication • Resources: SPECIALIST Lexicon, Metathesaurus, Semantic Network

  25. Semantic Interpretation • Indicator rules • Establish a link between words and predicates in the Semantic Network • Argument identification rules • Syntactic constraints • Validation of semantic predications • Semantic Network

  26. Semantic Network Predicates [hierarchy diagram] • associated_with • physically • spatially • temporally • conceptually • functionally_related_to • occurs_in • affects • brings_about

  27. Core SemRep Predicates [hierarchy diagram] • associated_with • physically • spatially • temporally • conceptually • functionally_related_to • affects • brings_about • SemRep predicates: LOCATION_OF, CO-OCCURS_WITH, OCCURS_IN, TREATS, PREVENTS, CAUSES

  28. Semantic Network Predication [diagram] • Semantic types: Occupational Activity, Biologic Function, Health Care Activity, Pathologic Function, Therapeutic or Preventive Procedure, Disease or Syndrome • Predicates: associated_with, physically, spatially, temporally, conceptually, functionally_related_to, occurs_in, affects, brings_about, treats

  29. Indicator Rules: Overview • Establish a correspondence between a syntactic item and a Semantic Network predicate • Columns: Item | Semantic Network | Structure • nominalization "treatment" | TREATS | Drugs for the treatment of schizophrenia • preposition "in" | TREATS | Hemofiltration in digoxin overdose • preposition "in" | HAS_LOCATION | Severe infections in both feet
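The indicator rules on this slide amount to a table from syntactic items to candidate predicates. A minimal sketch, assuming a plain dictionary keyed by (category, item); the three rules are the ones listed on the slide, and the names `INDICATOR_RULES` and `candidate_predicates` are hypothetical.

```python
# Keys: (syntactic category, lexical item). Values: candidate predicates.
INDICATOR_RULES = {
    ("nominalization", "treatment"): ["TREATS"],
    ("preposition", "in"): ["TREATS", "HAS_LOCATION"],
}


def candidate_predicates(category: str, item: str) -> list[str]:
    """Return the predicates an indicator may signal (empty list if none)."""
    return INDICATOR_RULES.get((category, item.lower()), [])


print(candidate_predicates("nominalization", "treatment"))
print(candidate_predicates("preposition", "in"))
```

An ambiguous indicator such as the preposition "in" yields more than one candidate; choosing between them is left to the later checks on argument semantic types.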

  30. Semantic Types The aim of this study was the characterization of the specific effects NP[of phsu] [versus] NP[phsu] NP[in treatment] (Nominalization) NP[of dsyn] NP[with dsyn] and the delineation of dose-response and possible plasma level response relationships. • phsu: Pharmacologic Substance • dsyn: Disease or Syndrome

  31. Apply Indicator Rule The aim of this study was the characterization of the specific effects NP[of phsu] [versus] NP[phsu] NP[in treatment] (Nominalization) NP[of dsyn] NP[with dsyn] and the delineation of dose-response and possible plasma level response relationships. • TREATS

  32. Argument Constraints The aim of this study was the characterization of the specific effects NP[of phsu] [versus] NP[phsu] NP[in treatment] (Nominalization) NP[of dsyn] NP[with dsyn] and the delineation of dose-response and possible plasma level response relationships. • TREATS

  33. Semantic Network Predication • phsu-TREATS-dsyn • medd-TREATS-dsyn • topp-TREATS-dsyn • topp-TREATS-inpo • The aim of this study was the characterization of the specific effects NP[of phsu] [versus] NP[phsu] NP[in treatment] (Nominalization) NP[of dsyn] NP[with dsyn] and the delineation of dose-response and possible plasma level response relationships.

  34. Match Semantic Types • phsu-TREATS-dsyn • medd-TREATS-dsyn • topp-TREATS-dsyn • topp-TREATS-inpo • The aim of this study was the characterization of the specific effects NP[of phsu] [versus] NP[phsu] NP[in treatment] (Nominalization) NP[of dsyn] NP[with dsyn] and the delineation of dose-response and possible plasma level response relationships.

  35. Substitute Concepts The aim of this study was the characterization of the specific effects NP[of phsu] [versus] NP[Alprazolam] NP[in treatment] (Nominalization) NP[of Panic Disorder] NP[with dsyn] and the delineation of dose-response and possible plasma level response relationships. • Alprazolam-TREATS-Panic Disorder
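The last two steps (match semantic types against the Semantic Network, then substitute concepts) can be sketched as a set-membership check followed by string formatting. The four allowed TREATS triples are the ones listed on the slides; the `Concept` type and `build_predication` function are hypothetical, and the sketch leaves out the syntactic argument identification rules.

```python
from typing import NamedTuple, Optional


class Concept(NamedTuple):
    name: str     # Metathesaurus concept name
    semtype: str  # semantic type abbreviation, e.g. "phsu"


# Semantic Network predications for TREATS, as listed on the slides.
ALLOWED = {
    ("phsu", "TREATS", "dsyn"),
    ("medd", "TREATS", "dsyn"),
    ("topp", "TREATS", "dsyn"),
    ("topp", "TREATS", "inpo"),
}


def build_predication(subject: Concept, predicate: str, obj: Concept) -> Optional[str]:
    """Return 'Subject-PREDICATE-Object' if the type triple is licensed, else None."""
    if (subject.semtype, predicate, obj.semtype) not in ALLOWED:
        return None
    return f"{subject.name}-{predicate}-{obj.name}"


ALPRAZOLAM = Concept("Alprazolam", "phsu")
PANIC_DISORDER = Concept("Panic Disorder", "dsyn")

ACCEPTED = build_predication(ALPRAZOLAM, "TREATS", PANIC_DISORDER)
REJECTED = build_predication(PANIC_DISORDER, "TREATS", ALPRAZOLAM)
print(ACCEPTED, REJECTED)
```

Reversing the arguments gives dsyn-TREATS-phsu, which is not among the licensed triples, so the candidate is rejected.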

  36. Evaluation • Developing a test collection • 2,000 sentences from MEDLINE • Mainly drug therapies • TREATS, OCCURS_IN, LOCATION_OF, ISA • Preliminary results • TREATS: 49% recall, 78% precision • ISA: 83% precision
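For reference, recall and precision are computed as below. The counts in the example are made up purely to show the arithmetic and are not from the SemRep test collection; the F1 value is just the harmonic mean of the reported TREATS figures, which the slides do not themselves state.

```python
def precision(true_pos: int, false_pos: int) -> float:
    """Fraction of extracted predications that are correct."""
    return true_pos / (true_pos + false_pos)


def recall(true_pos: int, false_neg: int) -> float:
    """Fraction of gold-standard predications that were extracted."""
    return true_pos / (true_pos + false_neg)


def f1(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)


# Hypothetical counts: 8 correct, 2 spurious, 4 missed.
print(precision(8, 2), recall(8, 4))

# Harmonic mean of the reported TREATS precision (0.78) and recall (0.49).
print(round(f1(0.78, 0.49), 2))
```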

  37. Applications • Automatic summarization • Marcelo Fiszman • Machine translation • Halil Kilicoglu • Discovery • Information extraction in genomics • Bisharah Libbus • Question answering • Dina Demner-Fushman

  38. Semantic Medline: Enhanced Information Management [diagram] • UMLS • Medline • Semantic Processing • PubMed

  39. Automatic Summarization • PubMed search with query “migraine” • Retain 500 most recent citations • Process with SemRep • Summarize SemRep output • Condense list of predications • Visualize results (Halil Kilicoglu) • Translate summarized results (using MeSH)
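The "condense list of predications" step could look roughly like the following. This is a sketch under strong assumptions: it treats condensing as keeping predications that mention the topic, merging duplicates, and ranking by frequency, which may differ from the actual summarization method. The sample predications use placeholder drug names and are not real SemRep output.

```python
from collections import Counter

Triple = tuple[str, str, str]  # (subject, predicate, object)


def condense(predications: list[Triple], topic: str, top_n: int = 3) -> list[tuple[Triple, int]]:
    """Keep predications that mention the topic, merge duplicates, rank by count."""
    about_topic = [p for p in predications if topic in (p[0], p[2])]
    return Counter(about_topic).most_common(top_n)


# Hypothetical SemRep-style output for a "migraine" search.
SAMPLE: list[Triple] = [
    ("Drug A", "TREATS", "Migraine"),
    ("Drug B", "TREATS", "Migraine"),
    ("Drug A", "TREATS", "Migraine"),
    ("Nausea", "CO-OCCURS_WITH", "Migraine"),
    ("Drug B", "TREATS", "Migraine"),
    ("Drug A", "TREATS", "Migraine"),
    ("Drug B", "TREATS", "Hypertension"),
]

SUMMARY = condense(SAMPLE, "Migraine", top_n=2)
print(SUMMARY)
```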

  40. Summarization for Discovery • Investigate “unexpected” connections • PubMed search with • sleep AND gastrointestinal • Run SemRep and summarize on • Gastroesophageal reflux disease

  41. Summarization for Discovery • Investigate “unexpected” connections • PubMed search with • sleep AND gastrointestinal • Run SemRep and summarize on • Gastroesophageal reflux disease • New PubMed search on “cpap AND gerd”

  42. cpap AND gerd

  43. Marked improvement in nocturnal gastroesophageal reflux in a large cohort of patients with obstructive sleep apnea treated with continuous positive airway pressure

  44. Semantic Network for NLP • Very useful as is • Issues noted while developing SemRep • Missing relations • Infelicitous relations • Semantic type hierarchy • Recommendations for development • Theory and practice • Incremental development • Maintenance

  45. Missing Relations • Genomics • “Genes” CAUSE “Disease” • “Genes” INTERACT_WITH “Genes” • Treatment • “Intervention” TREAT “Patients” / “Organism” • Example: Donepezil for patients with Alzheimer’s

  46. Semantic Type Hierarchy • Groups and group members (organisms) • dsyn OCCURS_IN podg • Adults with acidosis • dsyn PROCESS_OF mamm • Dogs with acidosis
