
Terminology and Metadata Whys and hows

Terminology and Metadata Whys and hows. Harold Solbrig Apelon, Inc. Outline. “Terminology” – Why does it matter? Metadata and its relationship to terminology Creating and managing terminological resources Description of Apelon and its role in all of this. Terminology – why does it matter?.


Presentation Transcript


  1. Terminology and Metadata: Whys and hows. Harold Solbrig, Apelon, Inc.

  2. Outline • “Terminology” – Why does it matter? • Metadata and its relationship to terminology • Creating and managing terminological resources • Description of Apelon and its role in all of this

  3. Terminology – why does it matter? • Information technology (IT) is about _____? • Depending on your perspective, information: • Reduces uncertainty on the part of the receiver • IS the reduction of uncertainty on the part of the receiver • The transfer of information between a sender and a receiver is known as “communication” • The business of IT is accurate, timely and relevant communication.

  4. Communication and Language • Language - a “specification” that enables communication • Semantics - the association between signs or symbols and their intended “meaning” • Syntax - the rules for ordering and structuring the signs into phrases and sentences • Pragmatics - the relationship between signs and symbols and the recipient. Broadly, the shared context.

  5. The Semiotic Triangle (diagram): Thought or Reference at the apex, Symbol and Referent at the base. The Symbol symbolises the Thought, the Thought refers to the Referent, and the Symbol stands for the Referent. Source: C. K. Ogden and I. A. Richards, The Meaning of Meaning.

  6. The Semiotic Triangle (same diagram, with the example symbols “Rose” and “ClipArt”). Source: C. K. Ogden and I. A. Richards, The Meaning of Meaning.

  7. The Communication Process (diagram): two semiotic triangles, each with a CONCEPT at the apex, sharing one Referent. Each side has the Symbols “Rose” and “ClipArt”; the edges are labelled Symbolises, Refers To and Stands For. Message: “I see a ClipArt image of a rose”.

  8. The Communication Process (same diagram, with the label “Semantics” added).

  9. The Communication Process (same diagram, with the labels “Semantics” and “Syntax”).

  10. The Communication Process (same diagram, with the labels “Semantics”, “Syntax”, “Context” and “Shared Context”).

  11. Shared Context: impacts how much information can be contained in a symbol. (Figure: information per symbol against degree of shared context. Labels: No Shared Context, Shared Universe, Shared Sun, Shared Planet, Shared Species, Common Language, Common Culture, Similar Education, Common Profession, Common Specialty.)

  12. Shared Universe Pioneer 10 & 11 Voyager “Golden Record”

  13. Common Specialty “Interferons are a family of cytokines that exerts antiviral, antitumor and immunomodulatory actions by inducing a complex set of proteins. One of the best known IFN-induced protein is the dsRNA-dependent protein kinase (PKR), that mediates both antiviral and anticellular activities. PKR inhibits translation initiation through the phosphorylation of the alpha subunit of the initiation factor eIF-2 (eIF-2α) and also controls the activation of several transcription factors such as NF-κB, p53, or STATs. …” Marino Estiban. Induction of apoptosis by the dsRNA-dependent protein kinase (PKR): Mechanism of action. Apoptosis, Springer, Volume 5, Number 2, April 2000

  14. The impact of context on communication Shared context: • Allows information to be communicated in larger, more succinct “chunks”. • Drug, analgesic and NSAID are all “chunks”, yet differ markedly in conceptual complexity. • Enables specialized symbol sets: • Contrast the amount of information contained in the formula E = mc² versus that contained in this presentation...

  15. Contextual Formalism The degree of formality in a shared context can vary across a wide spectrum: • Tacit context which is simply presumed • Contextual negotiation preceding the actual message • Rigorous and formal rules and documents describing the form and possible meanings behind every message and phrase.

  16. Factors Affecting the Degree of Contextual Formalism • Number of participating parties • Formalism needs to increase as the number of participants increases • Geographic, cultural and temporal proximity of communicators • The further apart communicators are, the less they can assume • Amount of shared context • The more you have, the more important it becomes to be organized

  17. Factors Affecting the Degree of Contextual Formalism (continued) • The cost of imprecise communication • Poetry and literature - low cost (some may argue actual gain) • Technical and professional - high to very high cost • What is the cost of assuming the units of a thrust specification? • What is the cost of assuming the dose of a prescription? • What is the cost of assuming the century in which the communication originated?
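
The thrust-units question on this slide can be made concrete with a small sketch. It is only an illustration, not anything from the presentation: the `Thrust` class and the `_TO_NEWTONS` table are hypothetical names, and the only external fact used is that one pound-force is 4.4482216152605 newtons. The point is that the unit travels with the value, so the receiver never has to assume it.

```python
from dataclasses import dataclass

# Conversion factors to newtons (hypothetical table for this sketch).
_TO_NEWTONS = {"N": 1.0, "lbf": 4.4482216152605}


@dataclass(frozen=True)
class Thrust:
    """A thrust value whose unit is stated explicitly, never assumed."""
    value: float
    unit: str

    def __post_init__(self):
        # Reject symbols that are not part of the agreed vocabulary.
        if self.unit not in _TO_NEWTONS:
            raise ValueError(f"unknown thrust unit: {self.unit!r}")

    def in_newtons(self) -> float:
        # Decode the value into one common unit for comparison.
        return self.value * _TO_NEWTONS[self.unit]


reading = Thrust(100.0, "lbf")
print(round(reading.in_newtons(), 1))  # 444.8
```

A bare `100.0` would force the receiver to guess; `Thrust(100.0, "lbf")` records the contextual assumption as part of the message.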

  18. Terminology • Symbols • Their encoding and decoding • Vocabularies, Dictionaries, Enumerations, Codes, ... • Context • Recording and sharing • Glossaries, textbooks, college courses, operations manuals, information models

  19. Terminology in the Digital Era • Multi-layered • We’ll ignore the lower layers – polarity of diodes representing bits, bits representing numbers, characters, …

  20. Terminology in the Digital Era • Focus is on metadata • What is a particular data collection about? • What information can be found in it? • How is that information recorded? • What are the contextual assumptions?

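  21. The Communication Process (diagram): the two-triangle diagram again, with CONCEPT at each apex and a shared Referent; the edges are labelled Symbolises, Refers To and Stands For. The symbols are replaced by the labels Form, Encode, Transform, Decode and Display.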

  22. Metadata and the Communication Process • Metadata describes the forms, databases, encoding processes, etc. • Terminology is the component of metadata that: • Manages symbols and their “meanings” • For users (e.g. what are the possible choices for field ‘x’, and what does each of them mean) • For IT professionals (the Information Model) • Maintains context • What else does a given specialty, department, company, etc. assume is known beyond the simple definition of symbols

  23. Terminology and Metadata • Standard modeling tools (UML, XML Schema, …) have provided a way to communicate the structure and content of data stores and messages. • Models, however, have to include information about their intended context and meaning to allow data sharing across domains. Terminology provides (or is, in some senses) this component.

  24. Terminology and Metadata (continued) • Amongst other things, ISO 11179 provides a model of how terminology and metadata go together • It has the advantage of being (or being in the process of becoming) a standard • ISO 11179 also provides a standard model of terminology content, which would provide a vehicle for interchange in the appropriate contexts. • There are other models of interest as well…

  25. Terminology and Metadata in 11179
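
As a rough illustration of how terminology and metadata fit together in ISO 11179, the sketch below uses the standard's core notions: a data element pairs a data element concept (object class plus property) with a value domain of permissible values, each tied to a value meaning. The Python class layout and the patient-sex example are assumptions made for this sketch, not the standard's normative model and not taken from the slides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class PermissibleValue:
    code: str     # the symbol actually recorded in the data
    meaning: str  # the value meaning that the symbol stands for


@dataclass
class ValueDomain:
    name: str
    permissible_values: List[PermissibleValue] = field(default_factory=list)

    def meaning_of(self, code: str) -> str:
        # Decode a stored symbol back into its agreed meaning.
        for pv in self.permissible_values:
            if pv.code == code:
                return pv.meaning
        raise KeyError(f"{code!r} is not permitted in {self.name}")


@dataclass
class DataElementConcept:
    object_class: str  # what the data is about
    property: str      # which characteristic is recorded


@dataclass
class DataElement:
    concept: DataElementConcept  # the "what"
    domain: ValueDomain          # the "how it is recorded"


# Hypothetical example content.
sex_codes = ValueDomain(
    "Sex codes",
    [PermissibleValue("F", "Female"),
     PermissibleValue("M", "Male"),
     PermissibleValue("U", "Unknown")],
)
sex_element = DataElement(DataElementConcept("Patient", "Sex"), sex_codes)
print(sex_element.domain.meaning_of("F"))  # Female
```

Here the metadata (the data element) says what a field is about, while the terminology (the value domain) says which symbols may appear and what each one means.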

  26. Terminology Sounds easy enough – why not just put together a set of tables and get going? Because… • Terminology has to be shared across multiple domains. This, after all, is its raison d'être • The model of the terminology itself has to be shareable. • The semantics of the terminology have to be shareable. • Terminology and knowledge management are inextricably intertwined • Fractal in nature – you can never stop adding • Boundaries are imprecise and expand • This means that there is no such thing as a “small terminology” • The components of terminology can also be viewed as declarative programs. • This means that the rigor of software development is applicable as well.

  27. Terminology (continued) 3) The knowledge behind terminology needs to be shared • Terminology resources depend on specialists (e.g. doctors, physicists, biologists, geneticists, etc…) • Development is expensive • Maintenance is often very expensive.

  28. Prerequisites to Terminology Creation • Know the standards • General standards (SKOS, RDF, OWL, 11179, SBVR, XML, UML, XMI, …) • Domain specific. • Example: Medical – HL7, LQS, CTS, CTS-2, UMLS, SNOMED, … • Know the tools • Development: TDE, Protégé, OBO-Edit, FaCT++, Racer, Jena, EVS, LexGrid… • Distribution: DTS, RDF, OWL, SKOS, … • Know the content • General (Dublin Core, CYC, SUMO, …) • Domain specific (Medical: NCIt, UMLS, ICD’s, SNOMED-CT, Gene Ontology, …)

  29. Terminology and Workflow • Terminology management includes: • Discovery • Federation • Authoring • Review • Distribution • Adoption

  30. Process (Example Sequencing) (diagram labels): Import, Report, Review, Transform, Author, Translate, Approve, Extract, Load, Post-coordinate, Plan, Federate, Incorporate, Map, Version, Review in Context, Access, Customize, Maintain, Submit, Publish, Subscribe, Process Submissions, Migrate, Reevaluate, Replace

  31. Content Update Applications VOSER Semantic MediaWiki (++) Annotations and Change Requests Status Report Core SME Submission Work Flow

  32. Key Points • Terminology is a critical component for cross-discipline, cross-enterprise information sharing. • Terminology development is a non-trivial task – it needs to be done correctly. • Terminology resources need to be federated, shared and reused. • But… there’s help!

  33. Apelon • Largest provider of terminology products and services • Unique expertise

  34. Employees • Internationally known terminology experts • Regular contributors to industry standards, publications and conferences

  35. Mission • Apelon software and services support the development, maintenance, and practical deployment of structured terminologies • Put another way, we help our customers create, maintain, and leverage standard and enterprise terminologies • It’s all about speaking the same language

  36. Facts Most of the world’s standard healthcare terminology resources have been built and/or are maintained with Apelon tools, including • SNOMED • CPT • ICD-9-CM • NDF-RT • UMLS

  37. Software Products • Terminology Development Environment (TDE) • Distributed Terminology System (DTS) • TermWorks

  38. 1 – Terminology Authoring (TDE) • Tools to create and maintain structured terminologies • Improve productivity, data quality and scalability • Enhance the value of enterprise assets • Commercial product – CPT • Internal infrastructure – Kaiser Permanente CMT • Public benefit – SNOMED CT, NDF-RT, NCI Thesaurus (Diagram labels: Author; ICD, CPT, SNOMED, NDF-RT, …)

  39. 1 - TDE • Based on Description Logic (DL) • Automated classification • Identifies redundancy • Provably consistent terminology • Collaborative features • Distributed authoring • Workflow • Conflict identification / resolution • Version control • Customizable interface and constraints

  40. 1 – Automatic Classification (diagram): nodes Body, Heart, Mitral Valve, Disease, Cardiac Disease and Mitral Stenosis, connected by is-a, part-of and affects links.
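
The idea behind automatic classification can be sketched in a few lines. This is a toy stand-in, not TDE's algorithm or a real description-logic reasoner. The exact links in the slide's diagram are not recoverable from the transcript, so the definitions below are assumptions: Cardiac Disease is a Disease that affects the Heart, Mitral Stenosis is a Disease that affects the Mitral Valve, the Mitral Valve is part of the Heart, and a disease affecting a part is taken to affect the whole.

```python
# Assumed part-of links (child -> parent).
PART_OF = {"Mitral Valve": "Heart", "Heart": "Body"}

# Assumed definitions: concept -> (stated parent, anatomical site it affects).
DEFINITIONS = {
    "Cardiac Disease": ("Disease", "Heart"),
    "Mitral Stenosis": ("Disease", "Mitral Valve"),
}


def wholes(part):
    """Return the part itself plus everything it is transitively part of."""
    chain = [part]
    while chain[-1] in PART_OF:
        chain.append(PART_OF[chain[-1]])
    return chain


def subsumes(general, specific):
    """True if every instance of `specific` must also be a `general`."""
    g_parent, g_site = DEFINITIONS[general]
    s_parent, s_site = DEFINITIONS[specific]
    # Same kind of thing, and the specific site lies within the general site.
    return g_parent == s_parent and g_site in wholes(s_site)


def classify():
    """Infer is-a links that were never stated explicitly."""
    inferred = []
    for general in DEFINITIONS:
        for specific in DEFINITIONS:
            if general != specific and subsumes(general, specific):
                inferred.append((specific, "is-a", general))
    return inferred


print(classify())  # [('Mitral Stenosis', 'is-a', 'Cardiac Disease')]
```

Nobody asserted that Mitral Stenosis is a Cardiac Disease; the link falls out of the definitions, which is what lets a classifier identify redundancy and keep a large terminology consistent.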

  41. 2 – Terminology Deployment • Terminology servers reduce costs of terminology acquisition, integration and management • Applications • EMRs and CDRs: NextGen, VA • Knowledge repositories: CDC, NCI • Healthcare information portals: HKHA (Diagram labels: Deploy, Customize, Applications)

  42. 2 – What is a Terminology Server? A terminology server is • a networked software component • that centralizes terminology content and reasoning • to provide (complete, consistent and effective) terminology services for other network applications

  43. 2 – How is a Terminology Server Used? • By informaticists to create, maintain, localize and map terminologies • By clinical applications and their users to select and record standardized data • By integration engines to map data elements between applications

  44. 2 – Examples of Terminology Services • Term/name normalization: What is the SNOMED CT name for heart attack? → Myocardial Infarction • Code translation: What is the ICD-9 code for Myocardial Infarction? → 410.9 • Grouping and aggregation: Is Myocardial Infarction a Cardiac Disease? → Yes • Clinical knowledge: What drug treats Myocardial Infarction? → Streptokinase • Local information: Add L227 as the local code for Serum Calcium. → OK
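
The five services on this slide can be mimicked with a minimal in-memory sketch. It is not the DTS, CTS or LQS API: the class `TinyTerminologyServer` and its method names are hypothetical, and the only content loaded is the example data from the slide itself.

```python
class TinyTerminologyServer:
    """Toy in-memory terminology service (hypothetical, for illustration)."""

    def __init__(self):
        self.synonyms = {"heart attack": "Myocardial Infarction"}
        self.icd9 = {"Myocardial Infarction": "410.9"}
        self.parents = {"Myocardial Infarction": "Cardiac Disease"}
        self.treated_by = {"Myocardial Infarction": ["Streptokinase"]}
        self.local_codes = {}

    def normalize(self, term):
        # Term/name normalization: map a lay term to the preferred name.
        return self.synonyms.get(term.lower(), term)

    def translate(self, concept):
        # Code translation: look up the code in another code system.
        return self.icd9[concept]

    def is_a(self, concept, ancestor):
        # Grouping and aggregation: walk up the is-a hierarchy.
        while concept in self.parents:
            concept = self.parents[concept]
            if concept == ancestor:
                return True
        return False

    def treatments(self, concept):
        # Clinical knowledge: follow a "treated by" relationship.
        return list(self.treated_by.get(concept, []))

    def add_local_code(self, code, concept):
        # Local information: extend the content with a site-specific code.
        self.local_codes[code] = concept
        return "OK"


server = TinyTerminologyServer()
name = server.normalize("heart attack")
print(name)                                    # Myocardial Infarction
print(server.translate(name))                  # 410.9
print(server.is_a(name, "Cardiac Disease"))    # True
print(server.treatments(name))                 # ['Streptokinase']
print(server.add_local_code("L227", "Serum Calcium"))  # OK
```

Centralizing these lookups in one component, rather than copying tables into every application, is the design choice a terminology server embodies.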

  45. 2 – Apelon’s DTS Product • Integrated repository for all terminologies • Varying release cycles → regular releases • Inconsistent data models → common object model • Independent views → integrated view with mappings • Current snapshot → version management • Extensible with local terminology and maps • Subsets • Easy subscription updates (with exception reports) • Desktop editor and webtop browser • Workflow support • Flexible import, export and integration • Open source

  46. Terminology Server Standards • OMG’s Lexicon Query Services (LQS) • AKA TQS • Health Level Seven (and ANSI) Common Terminology Services (CTS) • In ISO Standardization as well • CTS-II • In process • Led by Apelon

  47. DTS and Standards • CTS wrapper for DTS is available • INTEL Healthcare SOA using DTS for CTS extensions • Currently ahead of CTS-II • Will be fed back into CTS-II

  48. 2 – Knowledge Base (KB) • Clinical (SNOMED CT) • Reimbursement (ICD, CPT, HCPCS) • Pharmaceuticals (Multum, NDF-RT) • Labs (LOINC) • Nursing (NIC, NOC, and NANDA) • Adverse events (MedDRA, COSTART, WHOART) • Extensive crosswalks • Mappings to MeSH and UMLS CUIs • Local additions

  49. 2 – Software Architecture (diagram labels): DTS Database, DTS Server, Tomcat (DTS Client), DTS Editor, DTS Browser, DTS Client Application
