
Terminology Curation with the Semantic MediaWiki

Terminology Curation with the Semantic MediaWiki. Harold Solbrig Informatics Architect Apelon, Inc. The Primary Task. Evaluate the roles, categories and organization of the National Cancer Institute (NCI)’s Cancer Thesaurus with respect to: Upper Level Ontological Principles


Presentation Transcript


  1. Terminology Curation with the Semantic MediaWiki. Harold Solbrig, Informatics Architect, Apelon, Inc. Terminology and the Semantic MediaWiki, Ecoterm IV, Vienna, 17–18 April 2007

  2. The Primary Task Evaluate the roles, categories and organization of the National Cancer Institute (NCI)’s Cancer Thesaurus with respect to: • Upper Level Ontological Principles • ISO TC37 & Related principles As with Ontology construction, it was understood by all parties that this was a process – not a goal.

  3. Approach • Gather appropriate upper level ontologies (BFO, Dolce, Top Bio, UMLS Semantic Net and OBO Relations Ontology) into a single, readily referenced format • Load NCI Thesaurus into same format • Multiple parties review, annotate, recommend and categorize • Publish, analyze and evaluate results

  4. Solution By using the Semantic MediaWiki (SMW), we were able to accomplish all of the goals in a (very) reasonable period of time

  5. Discussion We also discovered that, with some extensions, the SMW could be useful for publishing, annotating and cross-referencing other terminological (and other…) resources.

  6. Questions? … just kidding.

  7. Wikis • Community developed • Collaborative • “Organic” – to the very core… • Primary focus (to date) is human consumption • Traceable: provenance automatically recorded, differences, undo and redo.

  8. MediaWiki • http://en.wikipedia.org/wiki/Wiki • Base for Wikipedia and many others… • Key characteristics • Web based editing • Page links • Categories • Templates

  9. MediaWiki • Fully documented using (surprise!) MediaWiki • Rich mechanisms for discussion, curation, export, etc.

  10. Common constructs • [[Train Transport]] – hyperlink to page named “Train_Transport” • ''Italic'', '''Bold''' • * Bullet point • [http://www.w3c.org/ The W3C] – hyperlink • … and much more
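The constructs on this slide can be illustrated with a small Python sketch that turns them into HTML. This is only a toy, not MediaWiki's parser: the function names `render` and `_page_link` and the `/wiki/` URL prefix are assumptions made for the example, and it covers just the four constructs listed above (MediaWiki writes italic and bold with straight apostrophes, two and three respectively).

```python
import html
import re


def _page_link(match):
    # [[Train Transport]] links to the page "Train_Transport".
    # The "/wiki/" prefix is an assumption for this sketch.
    title = match.group(1).strip()
    target = title.replace(" ", "_")
    return '<a href="/wiki/' + target + '">' + title + "</a>"


def render(wikitext):
    """Convert a handful of wiki constructs to HTML, one line at a time."""
    out = []
    for line in wikitext.splitlines():
        bullet = line.startswith("* ")
        body = line[2:] if bullet else line
        text = html.escape(body, quote=False)
        # Bold (three apostrophes) must be handled before italic (two).
        text = re.sub(r"'''(.+?)'''", r"<b>\1</b>", text)
        text = re.sub(r"''(.+?)''", r"<i>\1</i>", text)
        text = re.sub(r"\[\[([^\]|]+)\]\]", _page_link, text)
        text = re.sub(r"\[(https?://\S+) ([^\]]+)\]", r'<a href="\1">\2</a>', text)
        out.append("<li>" + text + "</li>" if bullet else text)
    return "\n".join(out)


print(render("* See [[Train Transport]] and [http://www.w3c.org/ The W3C]"))
```

The sketch does not group list items into a list element or handle nesting; it is meant only to show how little markup an editor has to learn.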

  11. Templates

  12. Templates

  13. Sample Template [Screenshot; callouts: Extension call, Parameter]

  14. Semantic MediaWiki

  15. Semantic MediaWiki Three key extensions to MediaWiki • Categories == Class • PageA … [[Category:X]] → PageA rdf:type category:X • Category:Y … [[Category:X]] → category:Y rdfs:subClassOf category:X • Links == Role • PageA … [[PageB]] → PageA … [[hasPart::PageB]] • Attributes == DataProperty • [[population:=32,154,773]] • Includes datatypes
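The three mappings on this slide can be sketched in a few lines of Python. This is a minimal illustration, not SMW's actual parser: the function name `annotations_to_triples` and the regular expressions are assumptions, and it handles only the annotation syntax shown on the slide (`[[Category:X]]`, `[[relation::Target]]`, `[[attribute:=value]]`).

```python
import re

# Hypothetical patterns for the three annotation forms on the slide.
CATEGORY = re.compile(r"\[\[Category:([^\]|]+)\]\]")
RELATION = re.compile(r"\[\[([^\]:|]+)::([^\]|]+)(?:\|[^\]]*)?\]\]")
ATTRIBUTE = re.compile(r"\[\[([^\]:|]+):=([^\]|]+)\]\]")


def annotations_to_triples(page, wikitext):
    """Return (subject, predicate, object) tuples for one page's wikitext."""
    # A category tag on a category page means subclass; elsewhere it means type.
    is_category = page.startswith("Category:")
    triples = []
    for name in CATEGORY.findall(wikitext):
        predicate = "rdfs:subClassOf" if is_category else "rdf:type"
        triples.append((page, predicate, "Category:" + name.strip()))
    for relation, target in RELATION.findall(wikitext):
        triples.append((page, relation.strip(), target.strip()))
    for attribute, value in ATTRIBUTE.findall(wikitext):
        triples.append((page, attribute.strip(), value.strip()))
    return triples


text = "[[Category:X]] [[hasPart::PageB]] [[population:=32,154,773]]"
for triple in annotations_to_triples("PageA", text):
    print(triple)
```

Attribute values are kept as raw strings here; the datatype handling mentioned on the slide is left out of the sketch.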

  16. Categories and Relations

  17. Attributes

  18. Semantic Rendering [Screenshot; callouts: RDF (!), Relation, Attribute, Value, Type (or superClass)]

  19. Thesaurus Content

  20. Templates? ; Gene_Product_Is_Biomarker_Type : The role is used to designate the type of … Kind: [[:Category:NCI_Kind]] '''Semantic Type:''' [[NCI_Semantic_Type::Category:SN_Conceptual_Entity|Conceptual Entity]] Brittle, not readily changed…

  21. Templates? {{OntylogDescription|ns=NCI|text=“The role is used to designate…”}} {{Kind|ns=NCI|target=Kind}} {{ResourceRef|name=Semantic_Type|ns=NCI|target=Conceptual_Entity|targetns=SN}} Can readily be updated via the template…
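To illustrate why the template form is less brittle than hand-written markup, here is a minimal Python sketch of template expansion. It is not MediaWiki's parser: the `TEMPLATES` bodies are hypothetical stand-ins for whatever the wiki's `Kind` and `ResourceRef` templates actually contain, the `label` parameter is invented for the example, and `expand` handles only flat `{{Name|key=value}}` calls with no nesting.

```python
import re

# Hypothetical template bodies; the real wiki templates are not shown in the slides.
TEMPLATES = {
    "Kind": "Kind: [[:Category:{ns}_{target}]]",
    "ResourceRef": "'''{name}:''' [[{ns}_{name}::Category:{targetns}_{target}|{label}]]",
}

# Matches a flat call such as {{Kind|ns=NCI|target=Kind}}.
CALL = re.compile(r"\{\{([^|{}]+)\|([^{}]*)\}\}")


def expand(wikitext, templates=TEMPLATES):
    """Replace each flat {{Name|key=value|...}} call with its template body."""
    def substitute(match):
        name = match.group(1).strip()
        # Every parameter is assumed to be written as key=value.
        params = dict(part.split("=", 1) for part in match.group(2).split("|"))
        # Assumed default: display label is the target with underscores removed.
        params.setdefault("label", params.get("target", "").replace("_", " "))
        return templates[name].format(**params)
    return CALL.sub(substitute, wikitext)


print(expand("{{Kind|ns=NCI|target=Kind}}"))
print(expand("{{ResourceRef|name=Semantic_Type|ns=NCI|target=Conceptual_Entity|targetns=SN}}"))
```

Because every page calls the template rather than repeating the markup, changing one entry in `TEMPLATES` changes the rendering of all pages at once, which is the point the slide makes.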

  22. [Screenshot; callouts: Link to another NCI comment, Link to external Ontology, Categorization in external Ontology, Commentary]

  23. Computed

  24. How is it Working? Very well!

  25. What can we do to improve it…

  26. Terminology • Centrally curated • Central to the practice of medicine • Insurance and reporting • Regulatory • Research • Clinical Practice • Information Sharing • ICD-9, CPT-4, SNOMED, …

  27. Clinical Terminology • Quality and content are important • Needs central vetting, integration, QA • Central model doesn’t scale • Need input from (many) experts • Need visible, active feedback loop

  28. Terminology Workflow 1995 [Diagram with steps numbered (1) to (4); labels: Curation, Controlled Terminology, Lists and Tables, Distribution, Books, PDF]

  29. Terminology Workflow 1995 [Diagram with steps numbered (1) to (3); labels: Curation, Lists and Tables, Controlled Terminology ‘B’, Distribution, Books, PDF]

  30. Terminology Workflow 2008 [Diagram with steps numbered (1) to (5); labels: Curation, Controlled Terminology, Common Distribution Model, Distribution, Online Services]

  31. Terminology Workflow 2008 [Diagram with steps numbered (1) to (5); labels: Curation, Controlled Terminology, Controlled Terminology B, Common Distribution Model, Distribution, Online Services]

  32. Common Distribution Model • LexGrid • (a little bit of…) OWL • NCI Thesaurus & SNOMED CT • Still requires LexGrid-like additions • “Pushing the envelope” • UMLS RRF • Although underspecified as a ‘model’

  33. Online Services • OMG Terminology Query Services • Not heavily used • Perceived (incorrectly) as CORBA specific • Perceived as too complex • Object oriented and stateful • ANSI Common Terminology Services • Being adopted • Necessary but not sufficient • Stateless • CTS-2 • Co-development beginning w/ HL7 & OMG

  34. Online Services • LexBIG • LexGrid for the Bio Informatics Grid • Robust query specification • Meets many end-user (developer) requirements • Not simple to implement – it actually adds value • Not a standard, but will be used to guide CTS-2

  35. Workflow and Feedback [Diagram with steps numbered (1) to (5); labels: Curation, Controlled Terminology, Common Distribution Model, Distribution, Online Services]

  36. The Feedback Component [Diagram; label: Curation]

  37. The Feedback Component [Diagram; labels: Common Distribution Model, Semantic MediaWiki (++), Distribution, Online Services, Annotations and Change Requests, Community Review, Version Staging, Curation]

  38. Issues and Next Steps (1) SHARED Semantics • {{Definition|…}} • {{Synonym|…}} • {{References|…}} • {{DLSome|…}} • {{DLAll|…}} • … ISO 12620 anyone?

  39. Issues and Next Steps (2) Figure out namespaces • NCI:Activity, AgroVoc:Fish, … • NCI_Activity, AgroVoc_Fish • ??? (2a) Identifiers (Activity vs. C12345) (2b) Versions (2c) URIs (vs. URLs) • Internal • External

  40. Certification and Sanctioning • Who can edit? • Who can validate? • Who selects updates? • … (see: http://en.citizendium.org/wiki/Main_Page)

  41. Automatic Export • Selecting sets of updates • Formatting update recommendations for target curators, etc…

  42. Synchronization • Changes implemented in terminology • Update wiki pages • Say what changed • What changes are incorporated by value? By reference?

  43. Approach and Responsible Parties Shared Semantics • Core set based on LexGrid & OWL • Post on WIKI and link on SMW site • Assigned to Apelon, Mayo, NCI, ??? • Extend to OBO, SKOS (?), XMDR… • Connections to 12620

  44. Time Frame and Assignments URIs, namespaces, naming • UK NCR (CancerGrid) – looking at unAPI and servers • (Hopefully) can provide URI resolver svc. • Short term – use templates / extensions

  45. Content • SNOMED-CT, ICD-9-CM, many, many others are already available via Apelon DTS Services • Available soon • FMA, HL7 Version 3 Terminology, OBO Foundry (GO, PATO, etc.) as time permits • Others as needed (and funded…)

  46. What we’ve got to date • Apelon DTS Server Extension • Includes both defined and classified view (!) • Export in RESTful XML (currently Apelon, soon to be LexGrid) • XMDR Export Format • Protégé (Native and OWL 3.2) prototype • Done by Mayo • Both import and export • Still needs templates

  47. Questions? • This time for real. Note: SMW will be made externally available (w/ simple password) once we get contract-specific info cleaned up (NCI will probably publish shortly)… Contact hsolbrig@apelon.com for access.
