
Advanced Topics in Biomedical Ontology PHI 637 SEM / BMI 708 SEM

Advanced Topics in Biomedical Ontology PHI 637 SEM / BMI 708 SEM. Werner Ceusters and Barry Smith. Lecture 12 Werner Ceusters & Barry Smith. Ontology evaluation. Lecture 12 – part2 Werner Ceusters. Evolutionary ontology evaluation. Lecture overview.


Presentation Transcript


  1. Advanced Topics in Biomedical Ontology, PHI 637 SEM / BMI 708 SEM. Werner Ceusters and Barry Smith

  2. Lecture 12. Werner Ceusters & Barry Smith. Ontology evaluation

  3. Lecture 12 – part 2. Werner Ceusters. Evolutionary ontology evaluation

  4. Lecture overview • Recapitulation of Realism-based Ontology Change Management. • Evolutionary Quality Assessment (EQA): theory; EQA applied to SNOMED CT and the Gene Ontology. • Using EQA to decide when to upgrade to a new version of an ontology.

  5. 1. Realism-based ontology change management

  6. An “optimal” ontology (1) • Because ontologies, as conceived on realist terms, • are artifacts created for some purpose (e.g. to serve as controlled vocabulary, or to provide domain knowledge to a software application), • are at the same time intended to mirror reality, • should allow reasoning which is efficient from a computational point of view, • we argue that an optimal ontology should constitute a representation of all and only those portions of reality that are relevant for its purpose.

  7. An “optimal” ontology (2) • Each term in such an ontology would designate: • (1) a single portion of reality (POR), which is • (2) relevant to the purposes of the ontology and such that • (3) the authors of the ontology intended to use this term to designate this POR, and • (4) there would be no PORs objectively relevant to these purposes that are not referred to in the ontology.

  8. But things may go wrong … • Root cause errors: • assertion errors: ontology developers may be in error as to what is the case in their target domain; • relevance errors: they may be in error as to what is objectively relevant to a given purpose; • encoding errors: they may not successfully encode their underlying cognitive representations, so that particular representational units (RUs) fail to point to the intended PORs. • Mismatches: • Unjustified presence or absence of RUs, • Redundancies: >1 RU → 1 POR, • Ambiguities: 1 RU → >1 POR.

  9. RU – reality correspondence types • 1st version (2006): • Ceusters W, Smith B. A Realism-Based Approach to the Evolution of Biomedical Ontologies. Proceedings of AMIA 2006, Washington DC, 2006:121-125. • 1st revision (2009): • Ceusters W. Applying Evolutionary Terminology Auditing to the Gene Ontology. Journal of Biomedical Informatics 2009;42:518–529. • 2nd revision (2014): • Seppälä S, Smith B, Ceusters W. Applying the realism-based ontology versioning method for tracking changes in the Basic Formal Ontology. Formal Ontology in Information Systems. Proceedings of the Eighth International Conference (FOIS 2014), Amsterdam: IOS Press, 2014:227-240.

  10. RU – reality correspondence types • Reality: • OE: Objective existence • OR: Objective relevance • Representation: • BE: Author’s belief in existence • BR: Author’s belief in relevance • IE: Author’s intended encoding • TR: Type of reference

  11. Configuration types • P: present in the ontology • P+: justifiably present • P–: unjustifiably present • A: absent from the ontology • A+: justifiably absent • A–: unjustifiably absent

  12. Configuration types • 1+4+12+5=22 possible configurations based on (mis)matches between reality, beliefs, and encodings

  13. OE/BE value pairs • Y/Y: correct assertion of the existence of a POR; • Y/N: lack of awareness of a POR, reflecting an assertion error; • N/N: correct assertion that some putative POR does not exist; • N/Y: the false belief that some putative POR exists; • Y/NC: not considering that some POR exists; • N/NC: not considering that some putative POR does not exist.

  14. ‘na’ = not applicable • (1) If there is no POR of a specific sort, relevance is not applicable. • (2) If an author does not believe in some POR, believed relevance is not applicable. • (3) If an author did not consider (‘nc’) existence of some POR, believed relevance is not applicable. • (4) If believed relevance is either negative or not applicable, encoding is not applicable.
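The four ‘na’ rules on this slide can be read as consistency checks on a single configuration. The following is a minimal sketch of that reading, not the authors' implementation. The function name, the argument names, and the value sets ("Y", "N", "nc", "na") are assumptions made for illustration; the slide itself only states the four conditions.

```python
NA = "na"  # 'not applicable', as on the slide


def applicability_violations(oe, obj_rel, be, br, ie):
    """Return the 'na' rules that a configuration violates.

    Assumed (hypothetical) value sets:
      oe      objective existence      "Y" | "N"
      obj_rel objective relevance      "Y" | "N" | "na"
      be      belief in existence      "Y" | "N" | "nc" (not considered)
      br      belief in relevance      "Y" | "N" | "na"
      ie      intended encoding        "Y" | "N" | "na"
    """
    problems = []
    # Rule 1: no POR of that sort, so relevance is not applicable.
    if oe == "N" and obj_rel != NA:
        problems.append("rule 1: objective relevance must be 'na'")
    # Rules 2 and 3: existence not believed or not considered.
    if be in ("N", "nc") and br != NA:
        problems.append("rules 2/3: believed relevance must be 'na'")
    # Rule 4: believed relevance negative or not applicable.
    if br in ("N", NA) and ie != NA:
        problems.append("rule 4: encoding must be 'na'")
    return problems
```

For example, `applicability_violations("N", "Y", "N", "na", "na")` reports only a rule 1 violation, because relevance was given a value for a POR that does not exist.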

  15. Modes of reference:

  16. 2. Evolutionary quality assessment Theory

  17. Magnitude of error

  18. Error basis

  19. Error basis

  20. Error basis

  21. Error basis

  22. ‘Simple snapshot’ changes • a change in reality will not immediately lead to a change in the ontology authors’ understanding thereof and, • if an encoding change is introduced, e.g. by making some syntactic correction to an existing term, then this does not result in a term which wrongly refers. • What happens from t1 → t2?

  23. Effects of various sorts of ‘snapshot’ changes

  24. Effects of various sorts of ‘snapshot’ changes [Table not recovered. Surviving labels: change in error magnitude; no change in ontology; addition; deletion; change in encoding; no snapshot transition possible.] …

  25. Effects of various sorts of changes • When something faithfully represented at t ceases to be faithful at t+1, leaving the ontology unchanged causes a P+1 to become a P-1. • When something faithfully represented at t is not believed to be faithful anymore at t+1 while in fact it still is, removing the representational element causes a P+1 to become an A-2. …

  26. Updating is an active process • authors assume in good faith that: • all included representational units are of the P+1 type, and, • all they are aware of, but not included, of A+1 or A+2. • If they become aware of a mistake, they make a change under the assumption that their changes are also towards the P+1, A+1, or A+2 cases. • Thus at that time, they know: • of what configuration type the previous entry must have been under the belief what the current configuration is, and, • the reason for the change.

  27. Evolution example: The Higgs boson was discovered and added to the ontology: A-5 → P+1, a quality improvement of +1

  28. This leads to a calculus … • NOT: • to demonstrate how good an individual version of an ontology is, • But rather • to measure how much it improved (hopefully) as compared to its predecessors. • Principle: recursive belief revision. Ceusters W. Applying Evolutionary Terminology Auditing to SNOMED CT. In American Medical Informatics Association 2010 Annual Symposium (AMIA 2010) Proceedings, Washington DC, November 13-17, 2010:96-100.

  29. However: unnoticed changes in reality do not lead to updates! No change in ontology …

  30. Quality of a representation w.r.t. reality • n: number of representational elements in the ontology • m: number of unjustified absences • ei: magnitude of the error, if any, for the ith representational element Ceusters W. Applying Evolutionary Terminology Auditing to the Gene Ontology. Journal of Biomedical Informatics 2009;42:518–529.

  31. Quality of a representation w.r.t. reality • n: number of representational elements in the ontology • m: number of unjustified absences • ei: magnitude of the error, if any, for the ith representational element • Quality of a representational unit: 5 – ei • 5 is the maximal magnitude of error, thus 0 ≤ (5 – ei) ≤ 5. Ceusters W. Applying Evolutionary Terminology Auditing to the Gene Ontology. Journal of Biomedical Informatics 2009;42:518–529.

  32. Quality of a representation w.r.t. reality • n: number of representational elements in the ontology • m: number of unjustified absences • ei: magnitude of the error, if any, for the ith representational element • All unjustified absences have an error magnitude of 1. Ceusters W. Applying Evolutionary Terminology Auditing to the Gene Ontology. Journal of Biomedical Informatics 2009;42:518–529.

  33. Quality of a representation w.r.t. reality • n: number of representational elements in the ontology • m: number of unjustified absences • ei: magnitude of the error, if any, for the ith representational element • The sum of the qualities of the RUs, each quality being: • 5 for faithful RUs, • (5 – error magnitude) for deviant RUs. Ceusters W. Applying Evolutionary Terminology Auditing to the Gene Ontology. Journal of Biomedical Informatics 2009;42:518–529.

  34. Quality of a representation w.r.t. reality • n: number of representational elements in the ontology • m: number of unjustified absences • ei: magnitude of the error, if any, for the ith representational element • The quality of the ontology decreases through: • Error magnitude of unjustified presences, • Number of unjustified absences. Ceusters W. Applying Evolutionary Terminology Auditing to the Gene Ontology. Journal of Biomedical Informatics 2009;42:518–529.

  35. Quality of a representation w.r.t. reality • n: number of representational elements in the ontology • m: number of unjustified absences • ei: magnitude of the error, if any, for the ith representational element • Ideal case: • ei is 0 for all RUs, thus the sum of the RU qualities = 5n • Number of unjustified absences = 0, thus 4m = 0. • Quality = 5n / 5n = 1. Ceusters W. Applying Evolutionary Terminology Auditing to the Gene Ontology. Journal of Biomedical Informatics 2009;42:518–529.
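The formula shown on slides 30 to 35 did not survive in this transcript. A reconstruction that agrees with the surviving fragments (RU quality 5 – ei, the term 4m, the ideal case 5n / 5n = 1, and the worked fractions on the later "Comparing consecutive versions" slides) would be the following; it should be treated as an inference from those fragments and checked against the cited 2009 paper. Here e_i is the quantity written "ei" in the slide text, and Q is a symbol introduced only for this sketch.

```latex
\[
  Q \;=\; \frac{\sum_{i=1}^{n} \left(5 - e_i\right)}{5n + 4m}
\]
% Ideal case: e_i = 0 for all i and m = 0, so Q = 5n / 5n = 1.
```

Under this reading the numerator falls as the error magnitudes of unjustified presences grow, and the denominator grows with each unjustified absence, which matches the two ways quality is said to decrease on slide 34.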

  36. Comparing quality of ontologies • n: number of RUs in the ontology • m: number of unjustified absences • ei: magnitude of the error, if any, for the ith RU

  37. Comparing quality of ontologies • n: number of RUs in the ontology • m: number of unjustified absences • ei: magnitude of the error, if any, for the ith RU

  38. Comparing quality of ontologies • n: number of RUs in the ontology • m: number of unjustified absences • ei: magnitude of the error, if any, for the ith RU

  39. Comparing quality of ontologies • n: number of RUs in the ontology • m: number of unjustified absences • ei: magnitude of the error, if any, for the ith RU

  40. Comparing consecutive versions: t1 • n: number of RUs in the ontology • m: number of unjustified absences • ei: magnitude of the error, if any, for the ith RU • Quality: (8*5) / (8*5)

  41. Comparing consecutive versions: t2 • What must you believe at t2 about ontology version V1, in light of what you believe to be the case in reality at t2? • n: number of RUs in the ontology • m: number of unjustified absences • ei: magnitude of the error, if any, for the ith RU • Quality: ? ; (7*5) / (7*5)

  42. Comparing consecutive versions: t2 • n: number of RUs in the ontology • m: number of unjustified absences • ei: magnitude of the error, if any, for the ith RU • Quality: ((7*5) + (1*2)) / (8*5)

  43. Comparing consecutive versions: t3 • n: number of RUs in the ontology • m: number of unjustified absences • ei: magnitude of the error, if any, for the ith RU • Quality: ? ; (8*5) / (8*5)

  44. Comparing consecutive versions: t3 • n: number of RUs in the ontology • m: number of unjustified absences • ei: magnitude of the error, if any, for the ith RU • Quality: ? ; (7*5) / ((7*5) + (1*4))

  45. Comparing consecutive versions: t3 • n: number of RUs in the ontology • m: number of unjustified absences • ei: magnitude of the error, if any, for the ith RU • Quality: ((7*5) + (1*2)) / ((8*5) + (1*4))
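The fractions on slides 40 to 45 can be reproduced with a few lines of Python. This is a sketch under stated assumptions, not the authors' code: the function name `quality` is hypothetical, the formula (sum of 5 – ei over the RUs, divided by 5n + 4m) is inferred from the slide fragments, and the reading that the term (1*2) stands for one RU with error magnitude 3 is likewise an inference. Which version each fraction belongs to is also an interpretation of the slide order.

```python
from fractions import Fraction


def quality(error_magnitudes, unjustified_absences=0):
    """Quality of an ontology version, as inferred from the slides.

    error_magnitudes: one value e_i (0..5) per RU present; 0 means faithful.
    unjustified_absences: m, the number of PORs that should have been
    represented but were not.
    """
    n = len(error_magnitudes)
    if n == 0 and unjustified_absences == 0:
        raise ValueError("empty ontology with no known absences")
    numerator = sum(5 - e for e in error_magnitudes)
    denominator = 5 * n + 4 * unjustified_absences
    return Fraction(numerator, denominator)


# V1 as believed at t1: 8 RUs, all assumed faithful.
v1_at_t1 = quality([0] * 8)
# V1 re-assessed at t2: one of its 8 RUs turns out to have error magnitude 3.
v1_at_t2 = quality([0] * 7 + [3])
# V2 re-assessed at t3: 7 faithful RUs, one unjustified absence discovered.
v2_at_t3 = quality([0] * 7, unjustified_absences=1)
# V1 re-assessed at t3: the same absence also counts against V1.
v1_at_t3 = quality([0] * 7 + [3], unjustified_absences=1)
```

Using `Fraction` keeps the results in the same form as the slides, for instance (7*5 + 1*2) / (8*5) stays 37/40 rather than a rounded decimal. The sketch also illustrates the recursive belief revision idea: the score of an earlier version is recomputed each time later knowledge reveals an error or absence that was already there.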

  46. Change management in SNOMED CT

  47. 2. Evolutionary quality assessment 2. Application to SNOMED-CT and the Gene Ontology

  48. SNOMED CT structure IHTSDO. SNOMED CT Starter Guide July 2014

  49. SNOMED CT concepts’ status (July 2011)
      ST | Concept Status | N | %
      0 | active in current use | 292,073 | 74.67%
      6 | active with limited clinical value (classification concept or an administrative definition) | 20,930 | 5.35%
      1 | inactive: ‘retired’ without a specified reason | 7,525 | 1.92%
      10 | inactive because moved elsewhere | 14,451 | 3.69%
      2 | inactive: withdrawn because duplication | 37,752 | 9.65%
      3 | inactive because no longer recognized as a valid clinical concept (outdated) | 1,439 | 0.37%
      4 | inactive because inherently ambiguous | 15,858 | 4.05%
      5 | inactive because found to contain a mistake | 1,142 | 0.29%
      TOTAL | | 391,170 | 100%

  50. Some principles used for determining Ax/Px type from SNOMED CT’s ‘reasons for change’ • all new introductions are treated as unjustifiably missing in earlier versions; • this is adequate for most types of concepts, except for pharmaceutical products and certain information artifacts such as newly constructed rating scales or named guidelines and protocols; • ‘duplicate’ translates into P-9; • a sample of 1000 changes was used to find common principles. Ceusters W. Applying Evolutionary Terminology Auditing to SNOMED CT. In American Medical Informatics Association 2010 Annual Symposium (AMIA 2010) Proceedings, Washington DC, November 13-17, 2010:96-100.
