Pragati’s Expozé Tool Suite for Harmonization. Mala Mehrotra, Dmitri Bobrovnikoff. Pragati Synergetic Research, Inc. mm@pragati-inc.com. Expedition Workshop/Designing the DRM, NSF, Arlington, VA. August 16, 2005.
Outline • Motivation • Ontology and Ontological Issues • Expozé Tool Suite • MVP-CA Technology Core • Expozé Contribution Areas for DRM • Knowledge Entry • Quality Assurance • Mapping • Candidate Examples from Preliminary Analysis of FEA • Conclusion
What is an Ontology? An ontology is an explicit formal specification of the terms in a domain and the relations among them • Concepts arise from the objects of interest in the environment and the purposes they serve • Interrelationships between the concepts depend on the behavioral characteristics of the objects and the operational characteristics of the environment Why do we need an Ontology? • Formalized semantics of concepts allows automated reasoning with the concepts • This enables enhanced functionality in a system • A common lingua franca supports interoperability, collaboration and sharing across systems
Ontological Design Principles Ontological engineers try to optimize the ontological design • Parsimonious design of concept classes • Crispness in the distinctions across concepts • Richness in the associations across concepts
Ontological Concerns • Information overload is occurring in the creation of ontologies • Every organization “thinks” their “core ontology” will be the Holy Grail for ontologies • Reality #1: The notion of a canonical ontology is, at least at present, a myth • Reality #2: We currently have to live with a cloud of candidate ontologies which model a “real” concept from different perspectives Ontology Developer’s Dilemma: How can I effectively find and reuse concepts from that “cloud”?
Ontological Issues: Conceptual/Modeling Differences • Level of Abstraction • Concepts are too specialized. Example: Ford Taurus, Toyota Camry, Honda Accord => Automobiles • Concept is too general. Example: Move => Move-Into, Move-To, Move-Out-Of, Move-Through • Placement in the ontological hierarchy: different choices in ordering ontological distinctions for orthogonal characteristics. Example: An ontology for organizing a clothing line differs between (a) a department store layout for customers, which splits first by Gender (men's, women's), and (b) ordering clothes from a manufacturer, which splits first by Clothes-type (pants, shirts)
Ontological Issues: Term Relationships • Vicinity Terms – Terms related via common usage patterns. Example: Pour, Immerse, Permeate • Complementary/Inverse Terms. Example: Move-From & Move-To, Exit & Enter • Homonym Terms – Context determines the semantics. Example: Contract -> physical change vs. legal document; Culture -> societal issues vs. biological experiment • Overloaded Terms – Same semantics for very different contexts. Example: ObjectFoundInLocation
Ontological Issues • Lexically and semantically close terms. Example: Move & Move-Into; Touches & TouchesDirectly; Prevent & Prevents • Lexically distant but semantically close terms. Example: providesCoverInCOA & providesConcealmentInCOA; TaskTypeRequiresAgentType & opTypeRequiresAgentType • Lexically reversed but semantically close terms. Example: ForwardPassageOfLines-MilitaryOperation & PassageOfLines-Forward-MilitaryTask
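The lexical relationships above can be illustrated with a small token-overlap check. This is only a sketch under stated assumptions, not the Expozé implementation (which the slides do not describe): the helper names `tokens` and `jaccard` are hypothetical, terms are split on hyphens and CamelCase boundaries, and no stemming is done, so a pair like Prevent/Prevents would need an extra normalization step.

```python
import re

def tokens(term: str) -> list[str]:
    """Split a term on hyphens/underscores and CamelCase boundaries; lowercase the pieces."""
    out = []
    for part in re.split(r"[-_\s]+", term):
        # Acronym runs (e.g. "COA"), capitalized or lowercase words, and digit runs.
        out.extend(t.lower() for t in
                   re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part))
    return out

def jaccard(a: str, b: str) -> float:
    """Token-set overlap between two terms, ignoring token order."""
    ta, tb = set(tokens(a)), set(tokens(b))
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

if __name__ == "__main__":
    pairs = [
        ("Move", "Move-Into"),
        ("providesCoverInCOA", "providesConcealmentInCOA"),
        ("ForwardPassageOfLines-MilitaryOperation",
         "PassageOfLines-Forward-MilitaryTask"),
    ]
    for a, b in pairs:
        print(f"{a} ~ {b}: {jaccard(a, b):.2f}")
```

Because the comparison uses token sets, the reordered pair (Forward... vs. ...-Forward-...) still scores high even though the strings differ character by character. Purely lexical measures like this can only propose candidates; deciding whether two terms are semantically close still needs their usage context.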
Pragati’s Vision Provide tools for the Ontology developers: • Development – Knowledge Entry aids • Reuse – Knowledge Discovery aids • Interoperability – Mapping/Merging aids • Maintenance – Quality Assurance aids Representations supported: • Axiomatized ontologies • Knowledge Bases • Loosely structured text (reports, manuals, etc.) Basic Tenet: Clustering the information system into semantically-related groups facilitates a variety of software engineering tasks.
Expozé: Pragati’s Tool Suite [Architecture diagram] Components: Clustering Engine, Analysis Engine, Vicinity Concepts Generator, QA Engine, Clustered Artifacts Repository, Artifact Adaptor, Relationship Extractor, Mapping Engine, Query Engine, Template Extractor, Repository Manager • Import/Export Plugins: S-S-Text, OWL, MELD/CycL, XMDR, SCL, CLIPS, XMI, ….. • Inputs: Semantic Web Ontologies / KBs / Semi-Structured Systems • Tools: COE with MVP-CA backend; MVP-CA: Cluster Analysis Tool; IOD: Iterative Ontology Development Tool; OSRT: Ontology Search and Reuse Tool
Core-Technology: Multi-ViewPoint Clustering Analysis Approach: Cluster a knowledge base from multiple perspectives • Clustering of knowledge bases into groups of semantically-related rules/axioms reveals • Relationship of terms in the context of their usage • Prototypical patterns of usage for the terms in the axioms • Multiple ways of clustering (based on different objective criteria) aid in understanding and analyzing KBs from different perspectives
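The idea of clustering the same axioms under different criteria can be sketched as follows. This is an illustrative toy, not the MVP-CA algorithm (the slides do not specify it): the functions `cluster`, `by_parent`, and `by_base`, the Jaccard measure, and the 0.5 threshold are all assumptions made for the example. Each "viewpoint" is simply a different feature extractor applied to the same axioms.

```python
import re
from itertools import combinations

def terms(axiom: str) -> list[str]:
    """Return the #$ constants of a CycL-style axiom, in order of appearance."""
    return re.findall(r"#\$([\w-]+)", axiom)

def jaccard(a: set, b: set) -> float:
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster(axioms, features, threshold=0.5):
    """Single-link grouping: axioms whose feature sets overlap enough share a cluster."""
    parent = list(range(len(axioms)))          # union-find forest

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]      # path halving
            i = parent[i]
        return i

    feats = [features(a) for a in axioms]
    for i, j in combinations(range(len(axioms)), 2):
        if jaccard(feats[i], feats[j]) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i, axiom in enumerate(axioms):
        groups.setdefault(find(i), []).append(axiom)
    return list(groups.values())

def by_parent(axiom):
    """Viewpoint 1 (assumed): the generalization, i.e. the second argument."""
    return {terms(axiom)[2]}

def by_base(axiom):
    """Viewpoint 2 (assumed): the leading token of the specialized term, e.g. 'Adenine'."""
    return {terms(axiom)[1].split("-")[0]}

AXIOMS = [
    "(#$genls #$Adenine-Deoxyribonucleotide #$Deoxyribonucleotide)",
    "(#$genls #$Cytosine-Deoxyribonucleotide #$Deoxyribonucleotide)",
    "(#$genls #$Adenine-Ribonucleotide #$Ribonucleotide)",
    "(#$genls #$Cytosine-Ribonucleotide #$Ribonucleotide)",
]

if __name__ == "__main__":
    for name, view in [("by_parent", by_parent), ("by_base", by_base)]:
        print(name)
        for group in cluster(AXIOMS, view):
            print("  ", group)
```

With these four axioms, the first viewpoint groups by sugar type and the second by base, which mirrors the point that different clustering criteria expose different, equally legitimate organizations of the same knowledge base.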
Utilizing Expozé Tool Suite for DRM • Expozé Tool Suite Contribution Areas • Knowledge Discovery & Entry • Quality Assurance • Mapping & Harmonizing Data Models [Figure: Screening COI diagram from M. Daconta’s presentation; labels: Registry, DOS, DHS; numbered markers 1, 2, 3]
MVP-CA Interface: Navigating FEA Ontology* Understanding the ontology: MVP-CA tool supports exploration and browsing of concept clusters (dendrogram representing cluster formation shown here) * FEA Ontology developed by TopQuadrant
Some Concept Clusters in FEA Ontology [Figure: cluster labels shown: component composition; value transfer; measurement area siblings; reference models; measurement indicators; business / service] Some high-level concepts extracted from a preliminary analysis of the FEA ontology (BRM, PRM, SRM & TRM)
COE – Expozé Interface: Searching for a concept Two complementary viewpoints: definitional view of FEA OWL axioms (left) and vicinity concepts view across ontologies given the search term “component” (right)
FEA Ontology Template Possible deviation from naming convention? Template: common patterns abstracted for knowledge entry
Structural Harmonization (Cyc’s BioChemistry Mt.) • A nucleotide molecule can be represented by • holding the sugar constant at the first level and varying the base (left figure), or • holding the base constant at the first level and varying the sugar (right figure) • The left representation is good for chain-type reasoning about the molecule at the nucleotide level • The right representation is good for reasoning about matching base pairs • Clustering brought both representations to attention [Figure: two tree diagrams; left: sugar-dependent representation, right: base-dependent representation]
Axiom Clusters for Nucleotides (Cyc’s BioChemistry Mt.)
Clusters showing multiple legitimate representations of nucleotides, shown graphically on the previous slide.
(#$genls #$Thymine-Deoxyribonucleotide #$Deoxyribonucleotide)
(#$genls #$Adenine-Deoxyribonucleotide #$Deoxyribonucleotide)
(#$genls #$Cytosine-Deoxyribonucleotide #$Deoxyribonucleotide)
(#$genls #$Guanine-Deoxyribonucleotide #$Deoxyribonucleotide)
(#$genls #$Uracil-Ribonucleotide #$Ribonucleotide)
(#$genls #$Adenine-Ribonucleotide #$Ribonucleotide)
(#$genls #$Cytosine-Ribonucleotide #$Ribonucleotide)
(#$genls #$Guanine-Ribonucleotide #$Ribonucleotide)
Sugar-dependent representation
(#$genls #$Deoxyribonucleotide #$Nucleotide)
(#$genls #$Ribonucleotide #$Nucleotide)
(#$genls #$Nucleotide #$Molecule)
(#$genls #$AdenineNucleotide #$Nucleotide)
(#$genls #$CytosineNucleotide #$Nucleotide)
(#$genls #$GuanineNucleotide #$Nucleotide)
(#$genls #$Adenine-Ribonucleotide #$AdenineNucleotide)
(#$genls #$Adenine-Deoxyribonucleotide #$AdenineNucleotide)
(#$genls #$Cytosine-Deoxyribonucleotide #$CytosineNucleotide)
(#$genls #$Cytosine-Ribonucleotide #$CytosineNucleotide)
(#$genls #$Guanine-Deoxyribonucleotide #$GuanineNucleotide)
(#$genls #$Guanine-Ribonucleotide #$GuanineNucleotide)
Base-dependent representation
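To see how the two representations coexist in one knowledge base, the `#$genls` axioms can be loaded into a parent map and queried. The sketch below is an assumption-laden illustration rather than anything from the Expozé tools: the regular expression handles only the flat two-argument `genls` form shown on this slide, and `parents_map` and `ancestors` are hypothetical helper names. The five axioms used are a subset of those listed above.

```python
import re
from collections import defaultdict

# A subset of the genls axioms from the slide.
AXIOMS = """
(#$genls #$Adenine-Ribonucleotide #$Ribonucleotide)
(#$genls #$Adenine-Ribonucleotide #$AdenineNucleotide)
(#$genls #$Ribonucleotide #$Nucleotide)
(#$genls #$AdenineNucleotide #$Nucleotide)
(#$genls #$Nucleotide #$Molecule)
"""

# Matches only the simple form (#$genls #$Child #$Parent).
GENLS = re.compile(r"\(#\$genls\s+#\$([\w-]+)\s+#\$([\w-]+)\)")

def parents_map(text: str) -> dict:
    """Map each collection to the set of its direct generalizations."""
    parents = defaultdict(set)
    for child, parent in GENLS.findall(text):
        parents[child].add(parent)
    return parents

def ancestors(term: str, parents: dict) -> set:
    """All direct and indirect generalizations of a term."""
    seen, stack = set(), [term]
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

if __name__ == "__main__":
    pm = parents_map(AXIOMS)
    print(sorted(pm["Adenine-Ribonucleotide"]))
    print(sorted(ancestors("Adenine-Ribonucleotide", pm)))
```

Here `Adenine-Ribonucleotide` has two direct parents, one from the sugar-dependent hierarchy and one from the base-dependent hierarchy, and both paths converge on `Nucleotide`. Terms with multiple parents like this are a simple signal that a knowledge base contains more than one organizing viewpoint.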
Mapping Opportunity in FEA: ServiceComponent (SRM) & ServiceCategory (TRM)
Future Work • In-Depth FEA analysis • Extend the FEA OWL axioms analysis to DRM Instances Analysis across COIs • Semi-Automatic Extraction of Ontologies from XML Data Model Instances • Build Mapping Aids for Interoperating across various DRM Models
ROI for DRM Effort • Cost-Effective Solution for Building and Organizing Data Models & Ontologies • Less time needed • Fewer personnel needed • Effective reuse of existing systems • Quality Solution enabling high-end analysis for • Development • Maintenance • Interoperability • Adaptive Solution to Changing Demands • In time, as data models & ontologies evolve across applications • In perspective, for different types of needs from various COIs