Chapter 9: Ontology Management. Service-Oriented Computing: Semantics, Processes, Agents – Munindar P. Singh and Michael N. Huhns, Wiley, 2005. Highlights of this Chapter. Motivation Standard Ontologies Consensus Ontologies. Motivation.
Chapter 9: Ontology Management
Service-Oriented Computing: Semantics, Processes, Agents
Munindar P. Singh and Michael N. Huhns, Wiley, 2005
Highlights of this Chapter
• Motivation
• Standard Ontologies
• Consensus Ontologies
Motivation
• Descriptions of services are improved through the use of ontologies
• But how do we ensure the parties involved agree upon the ontologies?
• Traditional approach: standardize the ontologies via a formal process
• Emerging approach:
  • Be more like the Web
  • Figure out the “correct” ontology via consensus
Standard Ontologies
Standardization is more a sociopolitical than a technical process
• IEEE Standard Upper Ontology
• Common Logic (language and upper-level ontology)
• Process Specification Language
• Space and time ontologies
• Domain-specific ontologies, such as health care, taxation, shipping, …
An Example Upper Ontology
OASIS Universal Business Language (UBL)
Standardization Pros
• Where standards exist and are agreed upon, they (even if imperfect)
  • Save time and improve effectiveness
  • Enable specialized tools where appropriate
  • Improve longevity of solution over time and space
  • Suggest directions for improvement
Standardization Cons
• Standardization of domain-specific ontologies is
  • Cumbersome
  • Often out of date by the time completed
  • Difficult to maintain
  • Often violated for competitive reasons
Standardization: Proposed Approach
• Always use standard languages (XML, RDF, OWL, …)
• Take high-level concepts from standard models:
  • Domain experts are not good at knowledge representation (KR)
  • Modeling is a lot of work even in the best of cases
• Work toward consensus in the chosen domain
Inducing Common Ontologies
• Instead of beginning with a standard, develop consensus to induce common ontologies
• Assumptions:
  • No global ontology
  • Individual sources have local ontologies
    • Which are heterogeneous and inconsistent
• Motivation: Exploit richness of variety in ontologies
  • To see where they reinforce each other
  • To make indirect connections (next page)
Relating Ontologies
[Figure: ontology fragments over the concepts Truck, APC, Wheel, and Tire, with links labeled partOf, equivalence, and “possibly equivalent”]
Relating Ontologies
• A concept in one ontology can have one of seven mutually exclusive relationships with a concept in another:
  • subclassOf
  • superclassOf
  • partOf
  • hasPart
  • siblingOf
  • equivalentTo
  • other
• Each ontology adds constraints that can help to determine the most likely relationship
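The seven relationships lend themselves to an enumeration. The following is a minimal Python sketch, not the authors' method: it assumes that each source ontology contributes one "vote" for a pair of concepts and that the most frequent vote is taken as the most likely relationship. The names `Relation`, `inverse`, and `most_likely` are hypothetical.

```python
from collections import Counter
from enum import Enum


class Relation(Enum):
    """The seven mutually exclusive relationships between two concepts."""
    SUBCLASS_OF = "subclassOf"
    SUPERCLASS_OF = "superclassOf"
    PART_OF = "partOf"
    HAS_PART = "hasPart"
    SIBLING_OF = "siblingOf"
    EQUIVALENT_TO = "equivalentTo"
    OTHER = "other"


# Reading a relationship from the other concept's side. Assumption:
# siblingOf, equivalentTo, and other are treated as symmetric.
_INVERSE = {
    Relation.SUBCLASS_OF: Relation.SUPERCLASS_OF,
    Relation.SUPERCLASS_OF: Relation.SUBCLASS_OF,
    Relation.PART_OF: Relation.HAS_PART,
    Relation.HAS_PART: Relation.PART_OF,
    Relation.SIBLING_OF: Relation.SIBLING_OF,
    Relation.EQUIVALENT_TO: Relation.EQUIVALENT_TO,
    Relation.OTHER: Relation.OTHER,
}


def inverse(rel):
    """Return the relationship as seen from the second concept."""
    return _INVERSE[rel]


def most_likely(votes):
    """Pick the relationship proposed most often (a simple tally, assumed here)."""
    return Counter(votes).most_common(1)[0][0]


# Hypothetical votes from three ontologies about the pair (Wheel, Tire).
votes = [Relation.EQUIVALENT_TO, Relation.PART_OF, Relation.EQUIVALENT_TO]
print(most_likely(votes).value)
```

A plain tally ignores the structural constraints each ontology contributes; a fuller treatment would use those constraints to rule out candidate relationships before counting.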
Initial Experiment: 55 Individual Simple Ontologies about Life
55 Merged Ontologies
Methodology for Merging and Reinforcement
• Merging used smart substring matching and subsumption
  • For example, living was matched with livingThing
  • However, living was not matched with livingRoom, because they have disjoint subclasses
• 864 classes with more than 1500 subclass links were merged into 281 classes related by 554 subclass links
• We retained the classes and subclass links that appeared in more than 5% of the ontologies
  • 281 classes were reduced to 38 classes with 71 subclass links
• We merged concepts that had the same superclass and subclass links
  • Result has 36 classes related by 62 subclass links
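The frequency filter and the final merge step might be sketched as follows. This is an illustration under stated assumptions, not the experiment's actual code: each ontology is represented as a collection of (subclass, superclass) name pairs, a merged group is named after its alphabetically first member, and leaf classes that share a parent count as having "the same superclass and subclass links". The substring-matching step is not shown, and the function names are hypothetical.

```python
from collections import Counter, defaultdict


def frequent_links(ontologies, threshold=0.05):
    """Keep subclass links that appear in more than `threshold` of the ontologies."""
    counts = Counter()
    for links in ontologies:
        counts.update(set(links))  # count each link at most once per ontology
    cutoff = threshold * len(ontologies)
    return {link for link, n in counts.items() if n > cutoff}


def merge_same_neighbors(links):
    """Merge classes whose superclass sets and subclass sets are identical."""
    supers, subs = defaultdict(set), defaultdict(set)
    for child, parent in links:
        supers[child].add(parent)
        subs[parent].add(child)
    classes = sorted(set(supers) | set(subs))
    groups = defaultdict(list)
    for cls in classes:
        key = (frozenset(supers[cls]), frozenset(subs[cls]))
        groups[key].append(cls)
    # Assumption: the alphabetically first name represents each merged group.
    rep = {cls: members[0] for members in groups.values() for cls in members}
    return {(rep[child], rep[parent]) for child, parent in links}


# Hypothetical input: three tiny ontologies.
ontologies = [
    [("dog", "animal"), ("animal", "thing")],
    [("dog", "animal"), ("canine", "animal"), ("animal", "thing")],
    [("canine", "animal"), ("plant", "thing")],
]
kept = frequent_links(ontologies, threshold=0.5)  # links in more than half
merged = merge_same_neighbors(kept)
```

With 55 ontologies and the 5% threshold from the slide, the cutoff is 2.75, so a link would have to appear in at least three ontologies to be retained.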
Consensus Ontology for Mutual Understanding
Consensus Directions
• The above approach considered lexical and syntactic bases for similarity
• Other approaches can include
  • Richer dictionaries
  • Richer voting mechanisms
  • Richer forms of structure within ontologies, not just taxonomic structure
  • Models of authority as in the WWW
Alternative Approaches
We may construct large ontologies by
• Inducing classes from large numbers of instances using data-mining techniques
• Building small specialized ontologies and merging them (Ontolingua)
• Top-down construction from first principles (Cyc and IEEE SUO)
Aside: Categorizing Information
Consensus is driven by practical considerations
• Should service providers classify information
  • Where it belongs in the “correct” scientific sense?
  • Where users will look for it?
• Case in point: If most people think a whale is a kind of fish, then should you put information about whales in the fish category or in the mammal category?
Chapter 9 Summary
• For large-scale systems development, agreeing upon acceptable ontologies is nontrivial
• Standardization helps, but suffers from key limitations
• Consensus approaches seek to figure out acceptable ontologies based on available small ontologies
• Always use standards for representation languages
Ontology Building across Heterogeneous Databases
Michael N. Huhns
Center for Information Technology
University of South Carolina
The Fundamental Problem
• We would like to arrange for effective and efficient interactions among large numbers of heterogeneous information components: databases, applications, and interfaces
• Difficulties are
  • Components are incomprehensible, inconsistent, and often unknown in advance
  • We need to enable updates as well as retrievals
  • The information environment is open
  • We need to consider process and policy, as well as structure
Needs and Applications
• Heterogeneous database access and management
• Information search, retrieval, and fusion
• Workflow automation
• Agent communication
• Information management: consistency
• Distributed collaboration
• Distance education
Emerging Solution: A Cooperative Information System
[Figure labels: Agent, Application, E-Mail System, Workflow System, Database System, Web System]
Another View of CIS
[Figure labels: User Agent, Resource Agent]
Middleware: Mediators, Brokers, Facilitators, Ontologies, and Registries
(de facto) Standard Agent Types and Architectures
• MCC InfoSleuth
• CMU RETSINA
• SRI OAA
• USC-ISI SIMS & TeamCore
• Global InfoTek Grid
[Figure labels: Application Program, User Interface Agent, Ontology Agent, Broker Agent, Mediator Agent, Registry Agent, 11179 Registry, Database Resource Agent; message labels: Reg/Unreg (KQML), Query or Update (SQL), Reply, Ontology (OKBC), Mediated Query (SQL), Schemas (CLIPS), SQL (JDBC)]
Implementing the Agent Architecture
• How to build an agent
• How to construct an ontology
Models for Database #1
[Figure: entity-relationship diagram with entities Person (SSN, Name, Phone) and Document (Title), relationship coAuthors (Per_cent), cardinalities (1,N) and (1,N)]
Relational Model
Person (SSN, Name, Phone)
CoAuthors (SSN, Title, Per_cent)
Document (Title)
Models for Database #2
[Figure: entity-relationship diagram with entities Employee and ComplianceForm, relationship fillsOut, cardinalities (1,1) and (1,N); attribute labels Title, EID, Name, SSN, Phone]
Relational Model
Employee (EID, Name, SSN)
ComplianceForm (Title, EID)
Domain Ontology
[Figure labels: Thing, Entity, Class of All Relations, Class of All Attributes, Person, Employee, Full-Time Employee, Part-Time Employee, Document, ComplianceForm, Coauthors, FillsOut, Person Attributes (Person Name, Person SSN), Employee Attributes (Employee ID), Document Attributes (Document Title)]
Semantic Mappings
Mappings are sentences in some logical language, e.g., KIF, Loom, CLIPS
[Figure labels: Common Ontology (Entity, Document, Person, Boat, Homemaker, Employee, Minor); Application 1; Interface 1; Articulation Axioms 1 through 4; DB1 Person (SSN, Name); DB2 Employee (EID, Name)]
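The slide treats mappings as sentences in a logical language. The sketch below swaps in something much weaker, plain Python dictionaries that rename fields, only to show the idea of translating local records into common-ontology terms. The column names come from DB1 Person and DB2 Employee as shown on the slides; the common term names (`person_ssn`, `person_name`, `employee_id`), the sample rows, and the function `to_common` are hypothetical.

```python
# Assumed stand-in for articulation axioms: per-table renaming of columns
# into terms of a common ontology. Real axioms would be logical sentences.
ARTICULATION = {
    ("DB1", "Person"): {"SSN": "person_ssn", "Name": "person_name"},
    ("DB2", "Employee"): {"EID": "employee_id", "Name": "person_name"},
}


def to_common(db, table, row):
    """Translate one local row into common-ontology terms.

    Columns without a mapping are dropped rather than guessed at.
    """
    mapping = ARTICULATION[(db, table)]
    return {mapping[col]: value for col, value in row.items() if col in mapping}


# Hypothetical sample rows.
row1 = to_common("DB1", "Person", {"SSN": "123-45-6789", "Name": "Ann", "Phone": "555-0100"})
row2 = to_common("DB2", "Employee", {"EID": 17, "Name": "Ann"})
```

Because both tables map `Name` to the same common term, a query phrased against the common ontology can be answered from either database without the asker knowing the local schemas.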
Ontologies and DBs
• An ontology specifies the intended meaning of concepts in a database:
DB Schema:
  Table: PartsPrice
    *stockNo: integer
    cost: float
Ontology:
  price(x,y) => ∃(x’,y’)[automobile_part(x’) & stock_no(x’) = x & retail_price(x’,y’) & magnitude(y’,US_dollars) = y]
Semantic Translation
[Figure labels: User, Application 1, Application n, Agent for Application, Common Enterprise-Wide View, Agent for Resource, DB1; links labeled “Semantic Translation by Mappings”]
Workflow Automation of Telecommunication Service Provisioning
[Figure labels: User + Application, User Interface Agent, Transaction Scheduling Agent, Schedule Repairing Agent, Schedule Processing Agent, ESS, Switch DB, LFACS DB, TIRKS DB]
Example Workflow in Telecommunications
[Figure labels: Service Request, Span in Place?, Service Order, Create Bill; systems LFACS, TIRKS, FEPS, Switch, NSDB, WFA]
Semantic Model for Interface Agent
[Figure labels: Service Order, Ordered by, Customer, Orders, Circuit; attributes id*, date, name*, phone, quantity, type, aLocation, zLocation]
Dimensions of Heterogeneity: Structure
• Schemas and views, e.g., securities are stocks
• Specializations and generalizations of domain concepts, e.g., stocks are a kind of liquid asset
• Value maps, e.g., S&P A+ rating corresponds to Moody’s A rating
• Semantic data properties, sufficient to characterize the value maps, e.g., prices on the Madrid Exchange are daily averages rather than closing prices
• Cardinality constraints
• Integrity constraints, e.g., each stock must have a unique SEC identifier
• Data value ranges, e.g., Price > 0
• Allow or disallow “maybe values” for data
Dimensions of Heterogeneity: Process
• Procedures, i.e., how to process information (e.g., how to decide what stock to recommend)
• Preferences for accesses and updates in case of data replication (based on recency or accuracy of data)
• Preferences to capture view update semantics
• Contingency strategies, e.g., whether to ignore, redo, or compensate
• Contingency procedures, i.e., how to compensate transactions
• Flow, e.g., where to forward requests or results
• Temporal constraints, e.g., report tax each quarter
Dimensions of Heterogeneity: Policy
• Security, i.e., who has rights to access or update what information? (e.g., customers can access all of their accounts, except blind trusts)
• Authentication, i.e., a sufficient test to establish identity (e.g., passwords, retinal scans, or smart cards)
• Bookkeeping (e.g., logging all accesses)
Definition
• Ontology: a representation of knowledge specific to some universe(s) of discourse
• Ontology: an agreement about a shared conceptualization, which includes conceptual frameworks for modeling domain knowledge and agreements about the representation of particular domain theories
Key Words
• Each document is characterized by a set of key words
• The union of the sets is the domain of discourse for the documents
• Advantages:
  • simple
  • domain-independent methods exist (can be automated)
  • good for organizing heterogeneous text
• Disadvantages:
  • not appropriate for data
  • “this is about X” vs. “this is not about X”
  • key words are not organized
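The key-word model is simple enough to state in a few lines. A minimal Python sketch, with hypothetical document names and key words, showing that the domain of discourse is just the union of the per-document key-word sets:

```python
def domain_of_discourse(doc_keywords):
    """Union of every document's key-word set."""
    domain = set()
    for keywords in doc_keywords.values():
        domain |= set(keywords)
    return domain


# Hypothetical documents and key words, for illustration only.
doc_keywords = {
    "doc1": {"ontology", "database"},
    "doc2": {"agent", "ontology"},
    "doc3": {"workflow"},
}
domain = domain_of_discourse(doc_keywords)
```

The result is a flat set, which illustrates the last disadvantage on the slide: nothing in it records how the key words relate to one another.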
Alta-Vista “Way-Cool Topic Graph”
Thesaurus
• Organizes key words based on synonyms and antonyms
• WordNet (http://www.cogsci.princeton.edu/~wn/) groups words into synonym sets, and relates the sets via hypernymy/hyponymy, antonymy, entailment, and meronymy/holonymy
Taxonomies
• A hierarchical organization of concepts, based on set-subset relationships. Biologists organize the plant and animal kingdoms using taxonomies
Topic Trees, Ontologies, and Database Schemas
[Figure labels: MiG29, Weapon, price, designer, Number, Person, People, Terms, Air, Sea, expertIn, Mikoyan, r73, mig29, sirena, Fighter, Bomber, speed, weight, ivan, artem, mikoyan, DOB, Specialty, Speed, Weight, Price]
Ontologies
• A semantic net (a generalization of a taxonomy, allowing other relationships than subset) consisting of types of entities, attributes and properties, relations and functions, and constraints
[Figure: Car hasPart Wheel, with constraint (= #wheels 4); Convertible subclass Car]
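The Car example can be held in a small triple store. The sketch below is one possible encoding, not a standard ontology API: the class `SemanticNet` and its methods are hypothetical, and it assumes that a subclass inherits the relations of its superclasses, which the slide implies but does not state.

```python
class SemanticNet:
    """A tiny semantic net: labeled edges between concepts, plus constraints."""

    def __init__(self):
        self.triples = set()    # (subject, relation, object)
        self.constraints = {}   # (concept, property) -> required value

    def add(self, subject, relation, obj):
        self.triples.add((subject, relation, obj))

    def superclasses(self, cls):
        """All classes reachable from `cls` by following subclass edges."""
        found, frontier = set(), [cls]
        while frontier:
            current = frontier.pop()
            for s, r, o in self.triples:
                if s == current and r == "subclass" and o not in found:
                    found.add(o)
                    frontier.append(o)
        return found

    def related(self, cls, relation):
        """Objects related to `cls`, including those inherited from superclasses."""
        owners = {cls} | self.superclasses(cls)
        return {o for s, r, o in self.triples if s in owners and r == relation}


net = SemanticNet()
net.add("Car", "hasPart", "Wheel")
net.add("Convertible", "subclass", "Car")
net.constraints[("Car", "#wheels")] = 4  # the (= #wheels 4) constraint

parts = net.related("Convertible", "hasPart")
```

Here a taxonomy would store only the subclass edge; the hasPart edge and the wheel-count constraint are what make this a semantic net in the sense of the slide.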
Ontology Development
• Bottom-Up from Schemas and Key Words
  • identify databases
  • identify names for all tables, fields, and enumerated values (e.g., if value is limited to a primary color “red”, “green”, or “blue”)
  • form groups of common concepts and assign name to covering concept for each group
  • iterate
• or Extensional View: form classes from instances
Ontology Development
• Top-Down from First Principles (intensional view): a class is defined by a set of membership conditions or properties
• Restrictions on Class Formation:
  • a class must have instances
  • a class must contain all properties common to the instances in its extension
  • classification should obey cognitive economy--instances of a class must share some, but not all, properties
  • classification should enable inference of properties based on class membership
Ontology Development (cont.)
• Restrictions on Class Structures:
  • Completeness--every property must be used in the definition of at least one class
  • Nonredundancy--a subclass must be defined by at least one property not in any of its superclasses (the result is that a subclass is always a specialization of any of its superclasses, i.e., it has more properties or restrictions, and has fewer instances)
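Both restrictions can be checked mechanically once classes are written down as property sets. A minimal Python sketch, assuming each class is given by the full set of properties in its definition and each subclass lists its superclasses explicitly; the function names and the sample classes are hypothetical.

```python
def unused_properties(definitions, properties):
    """Completeness check: properties that no class definition uses."""
    used = set()
    for props in definitions.values():
        used |= props
    return set(properties) - used


def redundant_subclasses(definitions, superclasses):
    """Nonredundancy check: subclasses that add no property of their own."""
    redundant = []
    for cls, supers in superclasses.items():
        inherited = set()
        for sup in supers:
            inherited |= definitions[sup]
        if supers and not (definitions[cls] - inherited):
            redundant.append(cls)
    return sorted(redundant)


# Hypothetical class definitions, for illustration only.
definitions = {
    "Person": {"name", "ssn"},
    "Employee": {"name", "ssn", "eid"},
    "Contractor": {"name", "ssn"},  # adds nothing beyond Person
}
superclasses = {"Employee": {"Person"}, "Contractor": {"Person"}}

missing = unused_properties(definitions, {"name", "ssn", "eid", "phone"})
redundant = redundant_subclasses(definitions, superclasses)
```

In this example the structure is incomplete because `phone` defines no class, and `Contractor` violates nonredundancy because it is indistinguishable from `Person`.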
Classification Is Difficult!
From the ancient Chinese encyclopedia Celestial Emporium of Benevolent Knowledge, “It is written that animals are divided into
• belonging to the emperor
• embalmed
• tame
• sucking pigs
• sirens
• fabulous
• stray dogs
• included in the present classification
• frenzied
• innumerable
• drawn with a very fine camel-hair brush
• et cetera
• having just broken the water pitcher, and
• that from a long way off look like flies.”