
The UMLS* Metathesaurus*: Lessons for Metadata Registries

Betsy L. Humphreys, blh@nlm.nih.gov, http://www.nlm.nih.gov. * UMLS and Metathesaurus are registered trademarks of the National Library of Medicine.


Presentation Transcript


  1. The UMLS* Metathesaurus*: Lessons for Metadata Registries Betsy L. Humphreys blh@nlm.nih.gov http://www.nlm.nih.gov * UMLS and Metathesaurus are registered trademarks of the National Library of Medicine

  2. Outline of Presentation • Brief overview -- NLM’s Unified Medical Language System (UMLS) Project and its products • Description of the UMLS Metathesaurus: content, construction methods, characteristics • Interspersed Metadata Questions/Issues

  3. UMLS Purpose • Make it easy for health professionals and researchers to retrieve and integrate relevant information from disparate automated sources, e.g. • computer-based patient records • factual databanks • bibliographic databases and full-text • expert systems

  4. UMLS Focus -- Conceptual Connections • Build knowledge sources that can be used by intelligent programs to overcome: • disparities in language used by different users and in different information sources; • difficulties in identifying which of many information sources is relevant

  5. UMLS Knowledge Sources Multi-purpose tools or “intellectual middleware” for System Developers • Metathesaurus • SPECIALIST lexicon and lexical programs • Semantic Network

  6. UMLS Knowledge Sources Distribution • Annual updates since 1990 • Free under license agreement with NLM • Need separate license agreements with vocabulary producers for some uses of some vocabularies in the Metathesaurus • Available to licensed users (~900) via Internet server and on CDs • Relational format (ASN.1 retired due to lack of use, XML being developed)

  7. 1999 UMLS Metathesaurus • 626,313 concepts • 1,134,413 “terms” (Eye, Eyes, eye = 1) • 1,358,891 “strings”/concept names (Eye, Eyes, eye = 3) • ~50 source vocabularies
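The three counting levels on this slide (concepts, terms, strings) can be illustrated with a minimal sketch. The class names, the `count_levels` helper, and the identifier `EXAMPLE-0001` are assumptions made up for this illustration; they are not the Metathesaurus file format.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Term:
    # A "term" groups strings that differ only in case or inflection.
    strings: List[str] = field(default_factory=list)


@dataclass
class Concept:
    concept_id: str  # hypothetical identifier, invented for this sketch
    terms: List[Term] = field(default_factory=list)


def count_levels(concepts: List[Concept]) -> Tuple[int, int, int]:
    """Return (number of concepts, number of terms, number of strings)."""
    n_terms = sum(len(c.terms) for c in concepts)
    n_strings = sum(len(t.strings) for c in concepts for t in c.terms)
    return len(concepts), n_terms, n_strings


# "Eye", "Eyes", and "eye" count as one term but three strings.
eye = Concept("EXAMPLE-0001", [Term(["Eye", "Eyes", "eye"])])
print(count_levels([eye]))  # (1, 1, 3)
```

Counting this way explains why the string total on the slide is larger than the term total, which in turn is larger than the concept total.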

  8. UMLS Metathesaurus • Concepts, terms, and attributes from many controlled “vocabularies” • New inter-source relationships, definitional information, use information • Scope determined by combined scope of source vocabularies

  9. UMLS Source “Vocabularies” • Widely varying purposes, structures, properties, but all are in essence “sets of valid values” for data elements: • Thesauri, e.g., MeSH • Statistical Classifications, e.g., ICD • Billing Codes, e.g., CPT • Clinical coding systems, e.g., SNOMED • Lists of controlled terms, e.g., COSTAR, HL7 value sets

  10. Metathesaurus Construction • Convert machine-readable vocabulary sources to UMLS “normal” form, making source semantics explicit • Merge, using source semantics and lexical processing techniques • Edit results, adding additional relationships and semantic information
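A rough sketch of the convert-then-merge steps, under stated assumptions: the record layout `(source, code, string)`, the source names `SOURCE_A`/`SOURCE_B`, the codes, and the `merge_key` function are all invented for illustration. The key function is only a stand-in for the lexical processing the slide mentions, and the human editing step is not modeled.

```python
from collections import defaultdict

# Hypothetical records already converted to one common form.
records = [
    ("SOURCE_A", "A-1", "Esophageal Motility Disorders"),
    ("SOURCE_B", "B-7", "esophageal  motility disorders"),
    ("SOURCE_B", "B-9", "Eye"),
]


def merge_key(string):
    # Stand-in for lexical processing: case-fold and collapse whitespace.
    return " ".join(string.casefold().split())


def propose_merges(records):
    """Group records whose strings share a key, keeping each source tag."""
    groups = defaultdict(list)
    for source, code, string in records:
        groups[merge_key(string)].append(
            {"source": source, "code": code, "string": string}
        )
    return dict(groups)


candidates = propose_merges(records)
for key, members in candidates.items():
    print(key, [m["source"] for m in members])
```

Each merged group keeps the source and code of every contributing record, so the origin of each piece of information stays explicit. The groups are only candidates; in the process described on the slide, editors review the results and add relationships and semantic information.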

  11. $100,000 Metadata Questions • What constitutes “explicit semantics” for Metadata? • At a minimum interpretable by humans • Preferably interpretable by machines • How will the significant human effort required to create useful Metadata registries be organized and funded?

  12. Metathesaurus Characteristics (1) • Concept organization • Many sources in a common database format • Representation of the meaning in each source vocabulary • Explicit tagging of each source vocabulary’s information

  13. Current MeSH -- Organized by Preferred Term

  14. UMLS Metathesaurus -- Organized by Concept

  15. Metadata Question • What is the operational definition of synonymy in the realm of Metadata element names? • OR, When does a distinction make a difference in Metadata?

  16. Metadata Question • Will the Metathesaurus approach to “multiple meanings” work for data element names? • E.g., Country • Country of Birth • Country of Residence • Country of Publication • REMINDER: different data elements can have the SAME set of valid values
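The reminder on this slide, that different data elements can share one set of valid values, can be sketched as follows. The `DataElement` class, the identifiers `DE-1` to `DE-3`, and the three-code country subset are assumptions made up for the example, not part of any registry standard.

```python
from dataclasses import dataclass

# Illustrative subset of country codes, shared by several data elements.
COUNTRY_CODES = frozenset({"CA", "FR", "US"})


@dataclass(frozen=True)
class DataElement:
    element_id: str          # hypothetical registry identifier
    name: str
    valid_values: frozenset  # the element's set of valid values


elements = [
    DataElement("DE-1", "Country of Birth", COUNTRY_CODES),
    DataElement("DE-2", "Country of Residence", COUNTRY_CODES),
    DataElement("DE-3", "Country of Publication", COUNTRY_CODES),
]


def elements_sharing_values(elements):
    """Return lists of element names that use an identical value set."""
    by_values = {}
    for element in elements:
        by_values.setdefault(element.valid_values, []).append(element.name)
    return [names for names in by_values.values() if len(names) > 1]


print(elements_sharing_values(elements))
```

The three elements remain distinct (each has its own identifier and name) even though their value sets are identical, which is the distinction the slide asks a registry to preserve.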

  17. Metadata Question • What level of explicit tagging is needed in Metadata Registries?

  18. Metathesaurus Characteristics (2) • Added relationships between concepts and terms from different vocabularies • Added definitional and use information • “Context-free” unique identifiers: the concept “names” that never change • Normalized word and string indexes produced using UMLS lexical tools

  19. Metadata Question • In the realm of Metadata, what requires unique, permanent, context-free identifiers?

  20. Normalization -- example • disorder esophageal motility = normalized form of: • Esophageal Motility Disorders • Esophageal Motility Disorder • Motility Disorder, Esophageal • Disorder, Esophageal Motility
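The example above can be approximated with a crude sketch. This is not the UMLS lexical tools: `crude_normalize` is a hypothetical function that lowercases, drops punctuation, strips a trailing "s" as a naive substitute for real uninflection, and sorts the words alphabetically. It happens to work for these four variants but would mishandle many other words.

```python
import string


def crude_normalize(name):
    """Lowercase, drop punctuation, naively singularize, sort the words."""
    cleaned = name.lower().translate(str.maketrans("", "", string.punctuation))
    words = []
    for word in cleaned.split():
        # Naive plural stripping; a real tool would consult a lexicon.
        if len(word) > 3 and word.endswith("s") and not word.endswith("ss"):
            word = word[:-1]
        words.append(word)
    return " ".join(sorted(words))


variants = [
    "Esophageal Motility Disorders",
    "Esophageal Motility Disorder",
    "Motility Disorder, Esophageal",
    "Disorder, Esophageal Motility",
]
for variant in variants:
    print(crude_normalize(variant))  # disorder esophageal motility
```

Because word order, case, punctuation, and the plural ending are all removed, the four variants collapse to a single index key.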

  21. Metadata Questions • Are similar lexical resources needed as adjuncts to Metadata Registries? • Are the UMLS lexical tools directly useful for Metadata efforts?
