
Who is NCI Center for Bioinformatics? Part of US Government National Institutes of Health (NIH)


elroy

Presentation Transcript


  1. Who is NCI Center for Bioinformatics? Part of the US Government's National Institutes of Health (NIH). The Center for Bioinformatics is the National Cancer Institute's strategic and tactical arm for research information management. We collaborate with both intramural and extramural groups. Mission: to integrate and harmonize disparate research data. Production, service-oriented data management.

  2. NCI Goal: Relieve pain, suffering and death due to cancer by the year 2015. Enable investigators and research teams nationwide, or worldwide, to combine and leverage their findings and expertise in order to meet this goal. Our advantage: a homogeneous community of interest with a common focus, willing to adopt standards and recommended best practices.

  3. Critical Success Factor: Semantic Interoperability. Interoperability: the ability of a system to access and use the parts or equipment of another system. Two kinds: syntactic interoperability and semantic interoperability. Emphasis on machine interoperability over a cancer bioinformatics Grid.

  4. Bruce Bargmeyer: We have come to join … Terminology & Metadata ….. & Information Models

  5. The NCI's Cancer Data Standards Repository (caDSR) • An XMDR approach to harmonization of terminology, information models and metadata • Achieved through: • 3 layers of semantics • Infrastructure and compatibility guidelines • Model Driven Architecture • Open standards • Tools • Community governance

  6. caBIG Compatibility Guidelines [figure labels: SEMANTIC, SEMANTIC, SEMANTIC, SYNTACTIC]

  7. MDA Approach. Analyze the problem space and develop the artifacts for each scenario: Use Cases. Use Unified Modeling Language (UML) to standardize model representations and artifacts. Design the system by developing artifacts based on the use cases: Class Diagram (Information Model) and Sequence Diagram (Temporal Behavior). Use meta-model tools to generate the code.
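
The last step of the MDA approach, generating code from a model, can be illustrated with a toy generator. This is only a sketch under stated assumptions: the `MODEL` dictionary format, the `generate_class` function, and the sample attribute values are hypothetical and are not part of caCORE or of any UML tool.

```python
# Toy model-driven code generation: a class model (plain data) goes in,
# Python source code comes out. Everything here is illustrative only.

# Hypothetical model description: a class name plus (attribute, type) pairs.
MODEL = {
    "name": "Agent",
    "attributes": [("name", "str"), ("nsc_number", "str")],
}


def generate_class(model: dict) -> str:
    """Emit Python source for a class with one constructor argument per attribute.

    Assumes the model declares at least one attribute.
    """
    params = ", ".join(f"{attr}: {typ}" for attr, typ in model["attributes"])
    lines = [
        f"class {model['name']}:",
        f"    def __init__(self, {params}):",
    ]
    for attr, _ in model["attributes"]:
        lines.append(f"        self.{attr} = {attr}")
    return "\n".join(lines) + "\n"


# Generate the source, load it into a fresh namespace, and instantiate the class.
source = generate_class(MODEL)
namespace = {}
exec(source, namespace)
agent = namespace["Agent"](name="Ursodiol", nsc_number="PLACEHOLDER")
```

A real MDA toolchain would start from a UML model (for example an XMI export) rather than a dictionary, and would generate far more than a constructor; the point here is only that the model, not hand-written code, is the source of truth.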

  8. Limitations of MDA: Limited expressivity for semantics. No facility for runtime semantic metadata management.

  9. caCORE: MDA plus a whole lot more! Open software. Open development. ISO/IEC 11179.

  10. caCORE Tools: Access, Develop, Manage, Consume. NCI Terminology Browser to search and navigate NCI Thesaurus and other terminologies curated by EVS. caDSR: repository and administration tool. CDE Browser to search for, view and download CDEs, with side-by-side compare. Form Builder to create user-specified collections of CDEs. Sentinel Tool to generate end-user "alerts" triggered by metadata changes. caCORE SDK is a toolkit to create "semantically integrated" applications: all exposed API elements have runtime-accessible metadata that defines the meaning of the elements using controlled vocabularies.

  11. caCORE Toolkit SDK Components: UML Modeling Tool (any with XMI export); Semantic Connector (concept binding utility); UML Loader (model registration in caDSR); Codegen (middleware code generator); Security Adaptor (Common Security Module). caCORE SDK generates a caBIG Silver-compliant system.

  12. 3 Layers of Semantics. caCORE Semantic Components: Application Domain Model (UML); Common Data Elements (11179); Enterprise Vocabulary (OWL-DL). Security.

  13. Application Objects (UML)

  14. Common Data Elements (11179). What do all those data classes and attributes actually mean, anyway? Data descriptors or "semantic metadata" are required. Computable, commonly structured, reusable units of metadata are "Common Data Elements" or CDEs. NCI uses the ISO/IEC 11179 standard for metadata structure and registration. Semantics are drawn from Enterprise Vocabulary Service resources.

  15. Description Logic Enterprise Vocabulary Concept Code Relationships Preferred Name Definition Synonyms

  16. Description Logic Enterprise Vocabulary: Relationships. "Carcinoma" Disease_Associated_with_Disease "Lytic Bone Lesions"; "TP53" Gene_associated_with_Disease "Breast Carcinoma" • Semantic Types: • Gene_associated_with_Disease (C43780): Molecular abnormalities in the gene may be associated with the manifestation of disease. The role is used to assert a link between gene and disease and is considered to have clinical relevance. The domain and range kind for this role are Gene_Kind and Findings_and_Disorders_Kind, respectively.
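
The domain and range constraint on a role can be made concrete with a small check. This is a minimal sketch, not the EVS implementation: the `Role` class, the `KIND_OF` table and the `is_valid_assertion` function are hypothetical, and the kind assigned to each concept is an assumption made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Role:
    """A named relationship with the kinds allowed on each side."""
    name: str
    domain_kind: str
    range_kind: str


# Role taken from the slide text; the kinds are the ones the slide states.
GENE_ASSOCIATED_WITH_DISEASE = Role(
    name="Gene_associated_with_Disease",
    domain_kind="Gene_Kind",
    range_kind="Findings_and_Disorders_Kind",
)

# Assumed kind for each concept (illustrative, not looked up in a terminology).
KIND_OF = {
    "TP53": "Gene_Kind",
    "Breast Carcinoma": "Findings_and_Disorders_Kind",
}


def is_valid_assertion(subject: str, role: Role, obj: str, kind_of=KIND_OF) -> bool:
    """True when the subject matches the role's domain and the object its range."""
    return (
        kind_of.get(subject) == role.domain_kind
        and kind_of.get(obj) == role.range_kind
    )


forward = is_valid_assertion("TP53", GENE_ASSOCIATED_WITH_DISEASE, "Breast Carcinoma")
reversed_ = is_valid_assertion("Breast Carcinoma", GENE_ASSOCIATED_WITH_DISEASE, "TP53")
```

Under these assumptions the assertion from the slide passes the check, while the same assertion with subject and object swapped does not, which is the kind of error a machine can catch once roles carry explicit domain and range kinds.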

  17. XMDR How? Cancer Ontologic Research Environment (caCORE) • MDA: UML domain models (Blue) • - Model Driven Architecture (MDA) • - Simplify application development • - Embed semantic integration: annotate the model with cancer concepts • caDSR ISO/IEC 11179 Metadata Registry (Gold) • - Common Data Elements • - Provides the semantics for data elements • - NCI expanded 11179 to register models: UML, forms, protocols, analytic services, analytic tools, data services • EVS Shared Terminology (Red) • - Enterprise Vocabulary Services (EVS) • - Standard terms and definitions • - Cancer-specific ontology

  18. Created a UML to caDSR Mapping. ValueDomain: Enumeration

  19. caCORE SDK - Common Methodology Workflow

  20. Application Domain Models (UML) caCORE XMDR

  21. Computable Interoperability. [Figure: "My model" and "Your model", with classes Agent and Drug annotated with concept codes C1708 and C1708:C41243. Attribute labels shown: name, id, nSCNumber, NDCCode, CTEPName, approvalDate, FDAIndID, approver, IUPACName, fdaCode.]
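
The idea behind computable interoperability is that two independently designed models can be lined up by their concept annotations rather than by their element names. A minimal sketch follows, assuming each model is a mapping from element name to concept-code string. Which attribute carries which code is not recoverable from the slide, so the pairings below (and the `HYPOTHETICAL-1` code) are invented for illustration, and `match_by_concept` is a hypothetical helper.

```python
# Hypothetical annotations: element name -> concept code(s) in the
# "C1708:C41243" style shown on the slide. Pairings are assumptions.
MY_MODEL = {
    "Agent": "C1708",
    "Agent.nSCNumber": "C1708:C41243",
    "Agent.name": "C1708:C42614",
}
YOUR_MODEL = {
    "Drug": "C1708",
    "Drug.id": "C1708:C41243",
    "Drug.approvalDate": "HYPOTHETICAL-1",  # placeholder code, not a real concept
}


def match_by_concept(model_a: dict, model_b: dict) -> list:
    """Pair elements of two models that carry identical concept annotations."""
    # Index the second model by annotation so each lookup is a dictionary hit.
    by_code = {}
    for name, code in model_b.items():
        by_code.setdefault(code, []).append(name)
    return [
        (a_name, b_name, code)
        for a_name, code in model_a.items()
        for b_name in by_code.get(code, [])
    ]


matches = match_by_concept(MY_MODEL, YOUR_MODEL)
```

With these assumed annotations, `Agent` pairs with `Drug` and `Agent.nSCNumber` pairs with `Drug.id` even though none of the names agree, while `Agent.name` finds no counterpart because no element of the other model shares its annotation.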

  22. Tying it all together: The caCORE semantic management framework. Three layers: Application Objects, Common Data Elements, Enterprise Vocabulary (Desc. Logic). CDEs and their concept codes: 2223333 = C1708; 2223866 = C1708:C41243; 2223869 = C1708:C25393; 2223870 = C1708:C25683; 2223871 = C1708:C42614.

  23. Public APIs Domain object metadata Harmonized Common data elements (CDEs) Data Elements Vocabulary for CDE specification Dictionary, thesaurus services caCORE Infrastructure user application

  24. Kevin Keck: XMDR 11179 Edition 3 Concepts and Relationships

  25. Concept Use and Integration with 11179 Part 3, Edition 2*. Context: caCORE. Conceptual Domain: Agent. Object Class: Chemopreventive Agent. Property: NSCNumber. Representation: Code. Data Element Concept: Chemopreventive Agent NSC Number. Value Domain: NSC Code. Valid Values: Cyclooxygenase Inhibitor, Doxercalciferol, Eflornithine, … Ursodiol. Classification Schemes: caDSR Training. Data Element: Chemopreventive Agent Name.
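
The parts named on this slide fit together roughly as follows: a data element pairs a data element concept (object class plus property) with a value domain that constrains the permitted values. The sketch below is a simplified reading of that structure, not the ISO/IEC 11179 metamodel itself; the class and field names are assumptions, and only the valid values actually legible on the slide are included (the slide elides others with "…").

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElementConcept:
    """What is being described: an object class and one of its properties."""
    object_class: str
    prop: str


@dataclass(frozen=True)
class ValueDomain:
    """How it is represented: a named set of permitted values."""
    name: str
    valid_values: frozenset

    def accepts(self, value: str) -> bool:
        return value in self.valid_values


@dataclass(frozen=True)
class DataElement:
    """A data element concept bound to a value domain."""
    name: str
    concept: DataElementConcept
    value_domain: ValueDomain

    def validate(self, value: str) -> bool:
        return self.value_domain.accepts(value)


# Labels copied from the slide; the valid-value list is truncated there too.
element = DataElement(
    name="Chemopreventive Agent Name",
    concept=DataElementConcept("Chemopreventive Agent", "NSCNumber"),
    value_domain=ValueDomain(
        "NSC Code",
        frozenset({"Cyclooxygenase Inhibitor", "Doxercalciferol",
                   "Eflornithine", "Ursodiol"}),
    ),
)

ok = element.validate("Ursodiol")
rejected = element.validate("Aspirin")
```

Separating the concept from the value domain is what lets the same meaning be reused with different representations, for example an enumerated list in one system and free text in another.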

  26. Concept Use and Integration*. Everything in red in the caDSR is directly associated with a CONCEPT. UML elements shown: UML Model/Package, UML Class, UML attribute, UML datatype or Enumeration. Conceptual Domain: Drugs and Chemicals (C1913). Object Class: Chemopreventive Agent (C1892). Property: Name (C42614). Representation: Name (C42614). Value Domain: Drug Name Text. Valid Values: Anethole Trithione (C246), Cyclooxygenase Inhibitor (C1323), Ginger (C2691), Green Tea (C2694), Iloprost (C48397), … Ursodiol (C1818). Data Element Concept: Chemopreventive Agent Name. Classification Schemes: caDSR Training. Data Element: Chemopreventive Agent Drug Name. caDSR 11179 ID: 2008765v1.0. Semantic Signature: C1913.C1892.C42614.C42614.C246.C1323…. (ConceptualDomain.ObjectClass.Property.Representation.Values). * Based on ISO/IEC 11179 Part 3 Metamodel.
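
The semantic signature on this slide is the dot-joined sequence of concept codes in the order ConceptualDomain.ObjectClass.Property.Representation.Values. A minimal sketch of building one is below; the function name `semantic_signature` is an assumption, and only the first two value codes are used because the slide truncates the rest.

```python
def semantic_signature(conceptual_domain, object_class, prop, representation, value_codes):
    """Join concept codes in the order given on the slide:
    ConceptualDomain.ObjectClass.Property.Representation.Values
    """
    return ".".join(
        [conceptual_domain, object_class, prop, representation, *value_codes]
    )


# Codes copied from the slide; the value list is cut to the two codes
# that are visible in the slide's own (truncated) signature.
signature = semantic_signature(
    conceptual_domain="C1913",   # Drugs and Chemicals
    object_class="C1892",        # Chemopreventive Agent
    prop="C42614",               # Name
    representation="C42614",     # Name
    value_codes=["C246", "C1323"],
)
```

Because the signature is built only from concept codes, two data elements registered under different names would produce the same string whenever they are bound to the same concepts in the same positions.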

  27. Where have we been? Where are we now? … & where are we planning to go? [Figure labels: caGrid; System manuals; Semantic grids; caCORE; Data dictionaries; caDSR; Semantics services (SSOA); EVS/caDSR; 11179 E1; XMDR Project; 11179 E2; XML & related standards; 11179 E3; EVS; Terminologies, ontologies, etc.; Complex semantics management; Data engineering/XML Data; Semantics management for data; Data Standards/Data Administration.]

  28. Acknowledgements • NCICB: Peter Covitz, Denise Warzel, George Komatsoulis, Frank Hartel, Sherri De Coronado, Gilberto Fragoso • Oracle: Steve Alred, Prerna Aggarwal, Christophe Ludet, Shaji Kakkodi, Jane Jiang, Anwar Anhad, Jennifer Dong • ScenPro: Bill McCurry, Tom Phillips, Robert Harding, Jennifer Brush, Larry Hebel, Smita Hastak • Semantic Bits: Ram Chilukuri • MSD: Nicole Thomas • XMDR: Bruce Bargmeyer (LBNL), Kevin Keck (LBNL), Frank Olken (LBNL), John McCarthy (LBNL), Karlo Berket (LBNL), Harold Solbrig (Mayo), Gayle Hodge (USGS), Denise Warzel (NCI), Larry Fitzwater (EPA), Nancy Lawler (DOD), Sam Chance (DOD) • NCICB URL: http://ncicb.nci.nih.gov/infrastructure/cacore_overview • ISO: ISO/IEC 11179 Information Technology - Metadata Registries (MDR) Parts 1-6 2002(E) +
