The Basics of Ontologies
Nordic Agricultural Ontology Service (AOS) Workshop
Royal Veterinary and Agricultural University, Copenhagen, Denmark
February 28, 2003
Frehiwot Fisseha, Frehiwot.Fisseha@fao.org

What this talk is all about
• The Origin of Ontology
• The Definitions of Ontology
• Motivation for Developing Ontology
• Some Examples
• Benefits of Ontology
• Application Areas of Ontology
• Types of Ontology
• Similarities and Differences of Ontologies and Thesauri
• Things to keep in mind
• Conclusion

The Origin of Ontology
The term “ontology” has been used for a number of years by the artificial intelligence and knowledge representation community, but it is now becoming part of the standard terminology of a much wider community, including information systems modelling.
The term is borrowed from philosophy, where ontology means “a systematic account of existence”. (Not a very useful definition for our purpose!)

What is Ontology? (1)
An ontology is “the specification of conceptualizations, used to help programs and humans share knowledge.”
An ontology is a set of concepts, such as things, events, and relations, that are specified in some way in order to create an agreed-upon vocabulary for exchanging information. (Tom Gruber, an AI specialist at Stanford University)
Ontologies establish a joint terminology between members of a community of interest. These members can be human or automated agents.

What is Ontology? (2)
In the information management and knowledge-sharing arena, ontology can be defined as follows:
• An ontology is a vocabulary of concepts and relations rich enough to enable us to express knowledge and intention without semantic ambiguity.
• An ontology describes domain knowledge and provides an agreed-upon understanding of a domain.
• Ontologies are collections of statements, written in a language such as RDF, that define the relations between concepts and specify logical rules for reasoning about them.
• Computers will “understand” the meaning of semantic data on a web page by following links to specified ontologies.

What is Ontology? (3)
A more formal definition is: “An ontology is a formal, explicit specification of a shared conceptualization” (Tom Gruber)
• “explicit” means that the type of concepts used and the constraints on their use are explicitly defined;
• “formal” refers to the fact that it should be machine readable;
• “shared” refers to the fact that the knowledge represented in an ontology is agreed upon and accepted by a group;
• “conceptualization” refers to an abstract model that consists of the relevant concepts and the relationships that exist in a certain situation.
The basis of ontology is CONCEPTUALIZATION. Consider the following: the conceptualization consists of
• the identified concepts (objects, events, beliefs, etc.)
  • E.g. concepts: disease, symptoms, therapy
• the conceptual relationships that are assumed to exist and to be relevant
  • E.g. relationships: “disease causes symptoms”, “therapy treats disease”

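The disease example can be written down as data. The following is only a minimal sketch, assuming a conceptualization is stored as a set of concept names plus (subject, relation, object) triples; the helper names `unknown_terms` and `objects_of` are illustrative and do not come from the talk.

```python
# Concepts and relationships taken from the slide's disease example.
concepts = {"disease", "symptoms", "therapy"}
relationships = [
    ("disease", "causes", "symptoms"),
    ("therapy", "treats", "disease"),
]


def unknown_terms(concepts, relationships):
    """Return triples that mention a term outside the agreed vocabulary."""
    return [
        triple
        for triple in relationships
        if triple[0] not in concepts or triple[2] not in concepts
    ]


def objects_of(subject, relation, relationships):
    """Return everything that `subject` is linked to by `relation`."""
    return [o for s, r, o in relationships if s == subject and r == relation]


# Every relationship uses only shared concepts, so nothing is reported.
print(unknown_terms(concepts, relationships))
# What does therapy treat?
print(objects_of("therapy", "treats", relationships))
```

Keeping the vocabulary separate from the statements makes the “shared” part checkable: a statement that uses a term nobody agreed on is flagged instead of being silently accepted.
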
World without ontology = Ambiguity: Example (1)
Ambiguity for computers
Rice?
• International Rice Research Institute
• Rice Research Program
• Rice Carrier Service Center
• Africa Rice Center
• Rice University
Cook? Do you mean
• a chef,
• information about how to cook something,
• or simply a place, person, business or some other entity with “cook” in its name?
The problem is that the word “rice” or “cook” has no meaning, or semantic content, to the computer.

World without ontology = Ambiguity: Example (2)
Ambiguity for humans
• Cat
• The vet and Grandma associate different views with the concept “cat”.

Motivation (1)
“The reason for ontologies becoming so important is that currently we lack standards (shared knowledge) which are rich in semantics and represented in machine-understandable form.” (Ying Ding, OntoWeb)
“Ontologies have been proposed to solve the problems that arise from using different terminology to refer to the same concept or using the same term to refer to different concepts.” (Howard Beck and Helena Sofia Pinto)

Motivation (2)
• Inability to use the abundant information resources on the Web
  The Web holds a tremendous collection of useful information; however, getting information from the Web is difficult. Search engines are restricted to simple keyword-based techniques. Interpretation of the information contained in web documents is left to the human user.
• Difficulty in information integration
  The integration of data from various sources is a challenging task because of synonyms and homonyms.
• Problems in knowledge management
  Distributed information production and management involves a multi-actor scenario. “People as well as machines can't share knowledge if they do not speak a common language.” [T. Davenport]
Ontologies provide the required conceptualizations and knowledge representation to meet these challenges.

Motivation (3)
• Database-style queries are effective
  • Find red cars, 1993 or newer, < $5,000
  • SELECT * FROM Car WHERE Color = 'red' AND Year >= 1993 AND Price < 5000
• The Web is not a database
  • Uses keyword search
  • Retrieves documents, not records
Ontologies provide the knowledge and representation required to search the Web in a database fashion through implicit Boolean search.

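The contrast between the two kinds of search can be tried out in a few lines. This is a rough sketch, assuming a toy `Car` table and three made-up ads; the sample rows, the `ads` list and the `keyword_search` helper are illustrative and not part of the talk.

```python
import re
import sqlite3

# Structured records: the query can test each field separately.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Car (Make TEXT, Color TEXT, Year INTEGER, Price INTEGER)")
conn.executemany(
    "INSERT INTO Car VALUES (?, ?, ?, ?)",
    [
        ("Chevy", "red", 1994, 4500),
        ("Ford", "red", 1990, 3000),
        ("Honda", "blue", 1995, 4000),
    ],
)
rows = conn.execute(
    "SELECT Make, Year, Price FROM Car "
    "WHERE Color = ? AND Year >= ? AND Price < ?",
    ("red", 1993, 5000),
).fetchall()

# The same cars as free-text documents, searched by keyword only.
ads = [
    "1994 Chevy, red, $4,500",
    "1990 Ford, red, $3,000",
    "1995 Honda, blue, $4,000, Red Oak Motors",
]


def keyword_search(documents, keyword):
    """Return documents that contain `keyword` as a whole word."""
    return [
        doc for doc in documents
        if keyword in re.findall(r"[a-z0-9]+", doc.lower())
    ]


keyword_hits = keyword_search(ads, "red")

print(rows)          # only the record that satisfies all three conditions
print(keyword_hits)  # every ad mentioning "red", including the dealer name
```

The keyword search cannot tell a colour from a dealer name and cannot compare years or prices, which is the gap the slide says ontologies are meant to close.
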
Example: Car-Ad Ontology
Textual:
Car [0:1] has Year [1:*];
Year {regexp[2]: “\d{2} : \b’\d{2}\b, … };
Car [0:1] has Make [1:*];
Make {regexp[10]: “\bchev\b”, “\bchevy\b”, … };
Car [0:1] has Model [1:*];
Model {…};
Car [0:1] has Mileage [1:*];
Mileage {regexp[8] “\b[1-9]\d{1,2}k”, “[1-9]\d?,\d{3} : [^\$\d][1-9]\d?,\d{3}[^\d]” } {context: “\bmiles\b”, “\bmi\.”, “\bmi\b”};
Car [0:*] has Feature [1:*];
Feature {regexp[20]:
  -- Colors “\baqua\s+metallic\b”, “\bbeige\b”, …
  -- Transmission “(5|6)\s*spd\b”, “auto : \bauto(\.|,)”,
  -- Accessories “\broof\s+rack\b”, “\bspoiler\b”, …
...
Graphical:
[Figure: graph of the Car-Ad ontology. Nodes: Car, Year, Make, Model, Mileage, Price, PhoneNr, Extension, Feature. Edge labels: “has”, “is for”. Edges carry cardinalities such as 0..1, 0..* and 1..*.]

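The idea behind the textual form is that each concept carries recognisers that pick its values out of free-text ads. The sketch below only illustrates that idea in Python: the patterns are simplified stand-ins rather than the slide's exact expressions, and the `Year` pattern, the `EXTRACTORS` table and the `extract_car_record` function are assumptions made for the example.

```python
import re

# Hypothetical, simplified recognisers, one per concept of the car-ad ontology.
EXTRACTORS = {
    "Year": re.compile(r"\b(19\d{2}|20\d{2})\b"),           # assumed four-digit years
    "Make": re.compile(r"\b(chevy|chev)\b", re.IGNORECASE),
    "Mileage": re.compile(r"\b([1-9]\d{1,2}k)\b", re.IGNORECASE),
    "Feature": re.compile(
        r"\b(aqua\s+metallic|beige|roof\s+rack|spoiler)\b", re.IGNORECASE
    ),
}


def extract_car_record(ad_text):
    """Turn one free-text ad into a record keyed by ontology concept."""
    record = {}
    for concept, pattern in EXTRACTORS.items():
        values = [match.group(1).lower() for match in pattern.finditer(ad_text)]
        if values:
            record[concept] = values
    return record


ad = "1994 Chevy, beige, 85k miles, roof rack. Call today."
record = extract_car_record(ad)
print(record)
```

Once ads are turned into records like this, they can be loaded into a table and queried in the database style shown on the previous slide.
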
Example: People Ontology http://www.sciam.com/article.cfm?articleid=0005DE0B-2C93-1CBF-B4A8809EC588EEDF
Benefits of Ontology
• To facilitate communication among people and organisations
  • an aid to human communication and shared understanding by specifying meaning
• To facilitate communication among systems without semantic ambiguity, i.e. to achieve interoperability
• To provide foundations on which to build other ontologies (reuse)
• To save time and effort in building similar knowledge systems (sharing)
• To make domain assumptions explicit
  • Ontological analysis
    • clarifies the structure of knowledge
    • allows domain knowledge to be explicitly defined and described

Application Areas of Ontologies
• Information Retrieval
  • As a tool for intelligent search through inference mechanisms instead of keyword matching
  • Easy retrieval of information without using complicated Boolean logic
  • Cross-language information retrieval
  • Improved recall by query expansion through synonymy relations
  • Improved precision through word sense disambiguation (identification of the relevant meaning of a word in a given context among all its possible meanings)
• Digital Libraries
  • Building dynamic catalogues from machine-readable metadata
  • Automatic indexing and annotation of web pages or documents with meaning
  • Context-based organisation (semantic clustering) of information resources
  • Site organization and navigational support
• Information Integration
  • Seamless integration of information from different websites and databases
• Knowledge Engineering and Management
  • As a knowledge management tool for selective semantic access (meaning-oriented access)
  • Guided discovery of knowledge
• Natural Language Processing
  • Better machine translation
  • Queries using natural language

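Query expansion through synonymy relations, mentioned under Information Retrieval, can be sketched as follows. This assumes a hand-made synonym table and plain substring matching; the `SYNONYMS` entries, the sample documents and the function names are illustrative only.

```python
# Hypothetical synonymy relations, as an ontology or thesaurus might supply them.
SYNONYMS = {
    "maize": {"corn", "zea mays"},
}

documents = [
    "Corn starch production in Kenya",
    "Maize yields rose in 2002",
    "Rice carrier service schedule",
]


def expand_query(terms, synonyms):
    """Add every known synonym of each query term to the query."""
    expanded = set()
    for term in terms:
        expanded.add(term)
        expanded |= synonyms.get(term, set())
    return expanded


def search(docs, terms):
    """Return documents containing at least one of the terms."""
    return [doc for doc in docs if any(term in doc.lower() for term in terms)]


plain_hits = search(documents, {"maize"})
expanded_hits = search(documents, expand_query({"maize"}, SYNONYMS))

print(plain_hits)     # only the document that literally says "maize"
print(expanded_hits)  # also the document that says "corn"
```

Recall improves because the document about corn starch is now found, while the unrelated rice document is still excluded.
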
Types of Ontologies
Ontologies can be classified according to their degree of conceptualization:
• Top-level ontologies
  • describe very general notions which are independent of a particular problem or domain
  • are applicable across domains and include vocabulary related to things, events, time, space, etc.
• Domain ontologies
  • the knowledge represented in this kind of ontology is specific to a particular domain such as forestry, fishery, etc.
  • they provide vocabularies about the concepts in a domain and their relationships, or about the theories governing the domain
• Application or task ontologies
  • describe pieces of knowledge that depend on both a particular domain and a particular task
  • are therefore related to problem-solving methods

Complexity of Ontologies
Ontologies are put to a wide range of tasks, and their complexity varies accordingly. They range from simple taxonomies to highly tangled networks that include constraints associated with concepts and relations.
• Light-weight ontology
  • concepts
  • “is-a” hierarchy among concepts
  • relations between concepts
• Heavy-weight ontology
  • cardinality constraints
  • taxonomy of relations
  • axioms (restrictions)

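The difference between the two levels can be made concrete with a small sketch. It assumes a single-parent “is-a” hierarchy for the light-weight part and one cardinality constraint for the heavy-weight part; the concept names, the `has_year` property and both helper functions are hypothetical.

```python
# Light-weight: concepts arranged in an "is-a" hierarchy (child -> parent).
IS_A = {"sweet corn": "maize", "maize": "cereal", "cereal": "plant"}


def is_a(concept, ancestor, hierarchy):
    """Follow "is-a" links upward to decide whether concept is a kind of ancestor."""
    current = concept
    while current in hierarchy:
        current = hierarchy[current]
        if current == ancestor:
            return True
    return False


# Heavy-weight: the same kind of model plus constraints that can be checked.
# Assumed rule: a car has at most one year.
MAX_CARDINALITY = {("car", "has_year"): 1}


def violations(concept, facts, constraints):
    """Return the properties that occur more often than the constraint allows."""
    counts = {}
    for prop, _value in facts:
        counts[prop] = counts.get(prop, 0) + 1
    return [
        prop
        for (constrained_concept, prop), limit in constraints.items()
        if constrained_concept == concept and counts.get(prop, 0) > limit
    ]


print(is_a("sweet corn", "plant", IS_A))
print(violations("car", [("has_year", 1993), ("has_year", 1994)], MAX_CARDINALITY))
```

The hierarchy alone supports inference by inheritance; the constraint adds the ability to reject data that contradicts the model.
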
Thesauri and Ontology: Similarities
• Both serve the same purpose, namely to provide a shared conceptualisation of a specific part of the world to different users in order to facilitate efficient communication of complex knowledge.
• Both disciplines are based on concept systems representing highly complex knowledge independent of any language.
• Both are concerned with covering a broad range of the terminology used in a particular domain, and with understanding the relationships among these terms.
• Both use a hierarchical organization to group terms into categories and subcategories.
• Both can be applied to cataloguing and organizing information.

Thesauri and Ontology: Differences
• Formality of the definition:
  • Thesauri use natural-language text to define the meaning of terms. The correct interpretation of the intended meaning depends on the user.
  • Ontologies specify conceptual knowledge explicitly using a formal language with clear semantics, which allows an unambiguous interpretation of terms.
• Computational support:
  • The available tools are quite different for thesauri and ontologies.
  • Most thesaurus maintenance tools provide limited or no means for an explicit representation of knowledge.
  • Ontology maintenance tools provide systems with powerful knowledge representation languages and inference mechanisms that allow formal consistency checks, inference of new knowledge, and a more user-friendly interaction.
• Users:
  • Thesauri are intended for human users, where domain experts constitute the major user group.
  • Ontologies are mainly developed for knowledge sharing between agents (both human and artificial).

Reasons to evolve thesauri to ontologies
• Little possibility of reuse, due to inherent semantic ambiguity and semantics that are not explicit.
• Difficulties arising from the diversity of their representational forms (no common representational language).
• Developed for human use: they lack expressive mechanisms to represent, maintain, and reason about complex knowledge in an explicit form, so interpretation is left to humans.
(Source: http://www.xmluk.org/slides/magic-circle_2002/wilson/XML_UK_SW_Thes/all.htm)

Problems with Thesaurus Modelling: BT/NT relations (AGROVOC)
Thesauri have not been constructed with clearly defined semantics. It is common for BT/NT relations within a thesaurus to include at least:
• subtype of (e.g. soil / subsoil)
• instance (e.g. Development Agency / IDRC)
• part of (e.g. soil / top soil)
• role (e.g. Development Agency / Voluntary agency)
• property of (e.g. maize / sweet corn)
Examples:
MAIZE
  NT dent maize
  NT flint maize
  NT popcorn
  NT soft maize
  NT sweet corn
  NT waxy maize
SOIL
  NT top soil
  NT subsoil
Development Agencies
  NT development banks
  NT voluntary agencies
  NT IDRC

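One way to picture the problem is to replace each untyped NT link with an explicitly named relation. The sketch below assumes a hand-curated lookup table that uses the relation labels listed above; the `REFINED` table and the `refine` function are hypothetical, and links without a curated entry are reported rather than guessed.

```python
# Untyped broader/narrower pairs, as they appear in the thesaurus.
NT_LINKS = [
    ("SOIL", "subsoil"),
    ("SOIL", "top soil"),
    ("Development Agencies", "IDRC"),
    ("Development Agencies", "voluntary agencies"),
    ("MAIZE", "sweet corn"),
    ("MAIZE", "popcorn"),
]

# Hand-curated meaning of each link, following the slide's own labels.
REFINED = {
    ("SOIL", "subsoil"): "subtype_of",
    ("SOIL", "top soil"): "part_of",
    ("Development Agencies", "IDRC"): "instance_of",
    ("Development Agencies", "voluntary agencies"): "role_of",
    ("MAIZE", "sweet corn"): "property_of",
}


def refine(links, refined):
    """Split NT links into typed triples and links still awaiting analysis."""
    triples, unresolved = [], []
    for broader, narrower in links:
        relation = refined.get((broader, narrower))
        if relation is None:
            unresolved.append((broader, narrower))
        else:
            triples.append((narrower, relation, broader))
    return triples, unresolved


triples, unresolved = refine(NT_LINKS, REFINED)
print(triples)
print(unresolved)
```

The unresolved list is the work queue for a human analyst: the thesaurus itself carries no information from which the intended relation could be derived automatically.
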
Problems with Thesaurus Modelling: Equivalence relations (UF, USE)
UF/USE holds between the descriptor and the non-descriptor(s). The equivalence relationship can represent:
• genuine synonymy, or identical meanings;
• near-synonymy, or similar meanings;
• in some thesauri (e.g. Eurovoc), antonymy, or opposite meanings.
Example:
DEVELOPMENT AGENCIES
  UF aid institutions (1)
(1) Similar but not necessarily identical concept

Problems with Thesaurus Modelling: Associative relations (RT)
The RT associative relation is even more open to interpretation than the hierarchical relation. In some thesauri, it can represent:
• cause and effect
• agency or instrument
• hierarchy (where polyhierarchy has not been allowed, the missing hierarchical relationships are replaced by associative relationships)
• sequence in time or space
• constituent elements
• characteristic feature
• object of an action, process or discipline
• location
• similarity (in cases where two near-synonyms have been included as descriptors)
• antonymy

RT in AGROVOC
Degradation
  RT chemical reactions (1)
  RT discoloration
  RT hydrolysis
  RT shrinkage
MAIZE
  RT corn flour
  RT corn starch (2)
  RT zea mays
IDRC
  BT development agencies
  RT canada (3)
(1) cause and effect; (2) characteristic feature; (3) location

Thesauri and Ontology: how to migrate
• Analyze the existing relations and establish semantically meaningful relations:
  • BT/NT => “is-a” relation
  • RT => analyzed into roles/properties/attributes (like “produces”, “used by”, “made for”)
• Allow for machine-processable definitions:
  • Fencing sword = sword used for: fencing
  • Weapon = object used for: fighting or hunting
  • Mother = human & female & which has borne: human

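A machine-processable definition is one a program can test against data. As a minimal sketch, the “Mother” definition can be read as a membership test, assuming individuals are stored as small records with a set of types and a list of children; the sample individuals and the `is_mother` function are invented for the illustration.

```python
# Hypothetical individuals described with types and a "has_borne" relation.
individuals = {
    "anna": {"types": {"human", "female"}, "has_borne": ["ben"]},
    "ben": {"types": {"human", "male"}, "has_borne": []},
    "carla": {"types": {"human", "female"}, "has_borne": []},
}


def is_mother(name, population):
    """Mother = human & female & which has borne: human."""
    person = population[name]
    return (
        {"human", "female"} <= person["types"]
        and any(
            "human" in population[child]["types"]
            for child in person["has_borne"]
        )
    )


mothers = sorted(name for name in individuals if is_mother(name, individuals))
print(mothers)
```

Nobody labelled anna as a mother; the classification follows from the definition, which is the kind of inference a natural-language scope note in a thesaurus cannot support.
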
Things to keep in mind...
• There is no one correct way to model a domain
  • How the required knowledge is modelled depends heavily on the purpose for which the ontology will be used.
• Ontology development is a collaborative process
  • The knowledge captured in the ontology should be derived from consensus. This will ensure reuse and shareability.
• Ontology works in a network fashion
  • No single ontology, but networks of ontologies
• Ontology development is necessarily a dynamic and iterative process
  • Ontologies should evolve over time

Conclusion
• Ontology provides a better semantic and machine-understandable representation of knowledge.
• Ontologies are natural successors of thesauri, particularly for information retrieval and knowledge management.
• Developing thesauri into ontologies requires increased precision in the semantics of the existing thesaurus relations.
• Ontology repositories will be distributed on the Web
  • methods and tools for accessing, reusing and aligning ontologies are needed.