Computational Intelligence in Biomedical and Health Care Informatics HCA 590 (Topics in Health Sciences). Rohit Kate. Ontologies. Reading. For biomedical ontologies: Chapter 8, Text 6 Chapter 8, Main Text A few slides have been adapted from
Computational Intelligence in Biomedical and Health Care Informatics • HCA 590 (Topics in Health Sciences) • Rohit Kate • Ontologies
Reading For biomedical ontologies: • Chapter 8, Text 6 • Chapter 8, Main Text A few slides have been adapted from http://protege.stanford.edu/publications/ontology_development/OntologyEngineering.zip
Ontologies • Many domains, including the biomedical domain, involve complex concepts (hundreds of thousands) with several levels of detail • It is desirable to keep their representation human readable as well as computer processable • Creating ontologies is a way to organize the concepts in a domain • An ontology formalizes the knowledge and maintains its consistency • Ontologies often have computable semantics associated with them
What is an Ontology? • An ontology is an explicit description of a domain: • concepts • properties and attributes of concepts • constraints on properties and attributes • individuals (often, but not always) • An ontology defines • a common vocabulary • a shared understanding
Ontology Examples • Taxonomies on the Web • Yahoo! categories • Catalogs for on-line shopping • Amazon.com product catalog • Medicine-specific • Unified Medical Language System (UMLS) • SNOMED-CT
Why Develop an Ontology? • To share a common understanding of the structure of information • among people • among software agents • To enable reuse of domain knowledge • If you know certain things about viral infections of the nervous system, and you know that viral meningitis is a viral infection of the nervous system, then you immediately know those things about viral meningitis • To introduce standards that allow interoperability
What is “Ontology Engineering”? Ontology Engineering: Defining terms in the domain and relations among them • Defining concepts in the domain (classes) • Arranging the concepts in a hierarchy (subclass-superclass hierarchy) • Defining which attributes and properties (slots) classes can have and constraints on their values • Defining relations between classes • Defining individuals and instances and filling in slot values
Classes and the Class Hierarchy • A class is a concept in the domain • a class of wines • a class of wineries • a class of red wines • A class is a collection of elements with similar properties • Instances of classes • a glass of California wine you’ll have
Class Inheritance • Classes usually constitute a taxonomic hierarchy (a subclass-superclass hierarchy) • A class hierarchy is usually an IS-A hierarchy: an instance of a subclass is an instance of a superclass • If you think of a class as a set of elements, a subclass is a subset
Class Inheritance - Example • Apple is a subclass of Fruit: every apple is a fruit • Red wine is a subclass of Wine: every red wine is a wine • Chianti wine is a subclass of Red wine: every Chianti wine is a red wine
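The IS-A examples above can be sketched as a small lookup, assuming each class has at most one superclass. The dictionary and the function name `is_a` are illustrative and not part of any ontology tool.

```python
# Toy single-inheritance hierarchy built from the slide's examples.
# Each class maps to its direct superclass (illustrative names).
SUPERCLASS = {
    "Chianti": "RedWine",
    "RedWine": "Wine",
    "Apple": "Fruit",
}

def is_a(cls, ancestor):
    """Return True if cls is ancestor or a direct/indirect subclass of it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUPERCLASS.get(cls)  # None once we walk past a root class
    return False

print(is_a("Chianti", "Wine"))   # follows Chianti -> RedWine -> Wine
print(is_a("Apple", "Wine"))     # Apple's chain never reaches Wine
```

Walking up the chain mirrors the subset reading of subclasses: every Chianti is a red wine, and every red wine is a wine, so every Chianti is a wine.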
Multiple Inheritance • A class can have more than one superclass • It’s a decision of the ontology creator whether to allow this or not
Avoiding Class Cycles • Danger of multiple inheritance: cycles in the class hierarchy, which can make the ontology inconsistent • Good ontology development software can detect this
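One possible way such a check could work is a depth-first search over the subclass-superclass links. This is a minimal sketch under the assumption that the hierarchy is stored as a dictionary from each class to its list of superclasses; it is not how any particular ontology editor implements the check.

```python
def find_cycle(superclasses):
    """Return a list of classes forming a cycle, or None if there is none.

    superclasses maps a class name to a list of its direct superclasses
    (an assumed, illustrative representation).
    """
    UNVISITED, IN_PROGRESS, DONE = 0, 1, 2
    state = {}
    path = []  # classes on the current depth-first path

    def visit(cls):
        state[cls] = IN_PROGRESS
        path.append(cls)
        for parent in superclasses.get(cls, []):
            parent_state = state.get(parent, UNVISITED)
            if parent_state == IN_PROGRESS:
                # parent is already on the current path: we found a cycle
                return path[path.index(parent):] + [parent]
            if parent_state == UNVISITED:
                found = visit(parent)
                if found:
                    return found
        path.pop()
        state[cls] = DONE
        return None

    for cls in list(superclasses):
        if state.get(cls, UNVISITED) == UNVISITED:
            found = visit(cls)
            if found:
                return found
    return None

acyclic = {"Port": ["DessertWine", "RedWine"],
           "DessertWine": ["Wine"], "RedWine": ["Wine"]}
cyclic = {"A": ["B"], "B": ["C"], "C": ["A"]}
print(find_cycle(acyclic))
print(find_cycle(cyclic))
```

Multiple inheritance on its own (Port with two superclasses) is fine; only a chain of superclass links that leads back to its starting class is reported.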
Levels in the Hierarchy [figure: class hierarchy divided into top level, middle level, and bottom level]
Modes of Development • top-down – define the most general concepts first and then specialize them • bottom-up – define the most specific concepts and then organize them in more general classes • combination – define the more salient concepts first and then generalize and specialize them
Documentation • Classes usually have documentation • Describing the class in natural language • Listing domain assumptions relevant to the class definition • Listing synonyms • Documenting classes is as important as documenting a computer program
Relations • Relations: Ways in which classes and individuals can be related to one another • For example, “wine” is a class and “winery” is a class and the two could be related by “produces” relation • Other relations could be “part-of”, “made-of” or any user-defined relation
Properties of Classes – Slots • Slots in a class definition describe attributes of instances of the class and relations to other instances Each wine will have color, sugar content, producer, etc.
Slot and Class Inheritance • A subclass inherits all the slots from the superclass If a wine has a name and flavor, a red wine also has a name and flavor • If a class has multiple superclasses, it inherits slots from all of them Port is both a dessert wine and a red wine. It inherits “sugar content: high” from the former and “color:red” from the latter
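The Port example can be sketched in a few lines. The dictionaries below are an assumed toy representation, and the rule that a class's own slots override inherited ones is a design choice of this sketch, not something the slides specify.

```python
# Direct superclasses of each class (illustrative toy hierarchy).
SUPERCLASSES = {
    "Wine": [],
    "DessertWine": ["Wine"],
    "RedWine": ["Wine"],
    "Port": ["DessertWine", "RedWine"],
}

# Slots declared directly on each class; None means "no value fixed yet".
OWN_SLOTS = {
    "Wine": {"name": None, "flavor": None},
    "DessertWine": {"sugar content": "high"},
    "RedWine": {"color": "red"},
    "Port": {},
}

def all_slots(cls):
    """Collect slots inherited from every superclass, then the class's own."""
    slots = {}
    for parent in SUPERCLASSES[cls]:
        slots.update(all_slots(parent))
    slots.update(OWN_SLOTS[cls])  # own slots override inherited ones (assumption)
    return slots

print(all_slots("Port"))
```

Port declares no slots of its own, yet it ends up with name and flavor (from Wine), sugar content (from DessertWine), and color (from RedWine).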
Instances • An instance is an actual individual of a class • The class becomes a direct type of the instance • Any superclass of the direct type is a type of the instance • Slot values are then assigned to the instance
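As a rough sketch of these points, an instance can be stored with its direct type and slot values, and its remaining types derived from the class hierarchy. The instance name `my_chianti` and its slot values are hypothetical, chosen only for illustration.

```python
# Direct superclasses of each class (illustrative toy hierarchy).
SUPERCLASSES = {"Wine": [], "RedWine": ["Wine"], "Chianti": ["RedWine"]}

def types_of(direct_type):
    """Return the direct type followed by every superclass reachable from it."""
    types = []
    pending = [direct_type]
    while pending:
        cls = pending.pop()
        if cls not in types:
            types.append(cls)
            pending.extend(SUPERCLASSES[cls])
    return types

# A hypothetical individual: its direct type plus assigned slot values.
my_chianti = {
    "direct_type": "Chianti",
    "slots": {"name": "my_chianti", "color": "red"},
}

print(types_of(my_chianti["direct_type"]))
```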
Types of Ontologies • General Ontologies: Knowledge of an intermediate level independent of the task, e.g. diseases • Upper Ontology: General knowledge, e.g. concepts of time and space • Domain Ontologies: Knowledge about a particular field, e.g. drugs [figure: layers labeled Upper ontology, General ontology, Domain ontology]
Comparison with Taxonomy, Terminologies etc. • Taxonomy: Contains classes and subclasses and has only the “is-a” relation • Vocabularies and Terminologies: List of terms with no relations explicitly identified • Could be “controlled” for a domain (e.g. use only these terms and avoid their variants)
Ontology Languages • An ontology can be expressed in Description Logic • There are various languages and syntaxes for writing ontologies, e.g. RDF, DAML+OIL, CycL, KL-ONE • OWL (Web Ontology Language) has recently emerged as a very popular ontology language, with applications in the Semantic Web • Software such as Protégé and OBO-Edit can be used to build an ontology and export it as OWL or in some other format
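As a small illustration of Description Logic notation, the wine examples used earlier might be written as below. The concept names (RedWine, Wine, Chianti, Winery) and the role name (produces) are illustrative choices, and the third axiom, that every winery produces some wine, is an assumption added only to show an existential restriction.

```latex
\begin{align*}
\mathit{RedWine} &\sqsubseteq \mathit{Wine} \\
\mathit{Chianti} &\sqsubseteq \mathit{RedWine} \\
\mathit{Winery}  &\sqsubseteq \exists\, \mathit{produces}.\mathit{Wine}
\end{align*}
```

Here $\sqsubseteq$ is the subsumption operator. Because subsumption is transitive, the first two axioms together entail $\mathit{Chianti} \sqsubseteq \mathit{Wine}$.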
General Ontology: OpenCyc • Cyc is a general ontology of common-sense knowledge developed by Cycorp Inc., e.g. tree is_a plant, animals die, cancer is_a disease • More than 1 million such assertions have been hand-coded • OpenCyc is a smaller, publicly available subset with about 6,000 concepts and 60,000 assertions
General Ontology: WordNet • A computational resource for English words, their senses and relations • Available for free, browse or download: http://wordnet.princeton.edu/ • Developed by the famous cognitive psychologist George Miller and a team at Princeton University • Database of word senses and their relations
WordNet • Synset (synonym set): Set of near synonyms in WordNet • Basic primitive of WordNet • Each synset expresses a semantic concept • Example synset: {drive, thrust, driving force} • Entry for each word shows all the synsets (senses) the word appears in, some description and sometimes example usage • About 140,000 words and 109,000 synsets • Synsets (not individual words) are connected by various sense relations • Emphasis is on general English usage, only partially useful for biomedical domains
Some WordNet Synset Relationships • Antonym: front → back • Similar: unquestioning → absolute • Cause: kill → die • Entailment: breathe → inhale • Holonym: chapter → text (part-of) • Meronym: computer → CPU (whole-of) • Hyponym: tree → plant (specialization) • Hypernym: fruit → apple (generalization)
A WordNet Snapshot [figure: the synset {car, auto, automobile, machine, motorcar} linked by hypernym to {motor vehicle, automotive vehicle}, by meronym to {accelerator, gas pedal, gas}, and by hyponym to {cab, taxi, taxicab, hack} and {ambulance}]
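The snapshot can be sketched as a small hand-built graph in which relations connect synsets rather than individual words. This is a toy model and not the WordNet API; the arrangement of the links is one reading of the slide's figure, and the helper names `related` and `synsets_for` are hypothetical.

```python
# Each synset is a set of near-synonyms (taken from the slide).
CAR = frozenset({"car", "auto", "automobile", "machine", "motorcar"})
MOTOR_VEHICLE = frozenset({"motor vehicle", "automotive vehicle"})
ACCELERATOR = frozenset({"accelerator", "gas pedal", "gas"})
CAB = frozenset({"cab", "taxi", "taxicab", "hack"})
AMBULANCE = frozenset({"ambulance"})

ALL_SYNSETS = [CAR, MOTOR_VEHICLE, ACCELERATOR, CAB, AMBULANCE]

# Sense relations link synsets to synsets (assumed reading of the figure).
RELATIONS = {
    (CAR, "hypernym"): [MOTOR_VEHICLE],
    (CAR, "meronym"): [ACCELERATOR],
    (CAR, "hyponym"): [CAB, AMBULANCE],
}

def related(synset, relation):
    """Return the synsets linked to synset by the given relation."""
    return RELATIONS.get((synset, relation), [])

def synsets_for(word):
    """Return every synset (sense) in this toy graph that contains word."""
    return [s for s in ALL_SYNSETS if word in s]

for hyponym in related(CAR, "hyponym"):
    print(sorted(hyponym))
```

Looking up a word first returns its synsets, and the relations are then followed from those synsets, which matches the point that synsets are the basic primitive.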
WordNets for Other Languages • EuroWordNet: Individual WordNets for some European languages (Dutch, Italian, Spanish, German, French, Czech, and Estonian), which are also interconnected by interlingual links: http://www.illc.uva.nl/EuroWordNet/ • WordNets for some Asian languages: • Hindi: http://www.cfilt.iitb.ac.in/wordnet/webhwn/ • Marathi: http://www.cfilt.iitb.ac.in/wordnet/webmwn/ • Japanese: http://nlpwww.nict.go.jp/wn-ja/index.en.html
Homework 5 • Due by 2 pm, next class, Tuesday 10/22 • Submit .txt, .doc, .pdf, .ppt or a scanned image through D2L • Decide suitable concepts and roles to encode the following in description logic and also draw a graph • Pain in lower limb is a type of pain whose site is lower limb structure • Pain in calf is a type of pain whose site is calf structure • Calf structure is a type of lower limb structure (use the subsumption operator) • What can be said about the relation between pain in calf and pain in lower limb?