Taxonomy and Ontology. Ian Bailey, ian@modelfutures.com. Overview: an attempt to compare the disciplines of taxonomy and ontology. What do they have in common? Where do they differ? How are they used? Case study: UK Defence Taxonomy.
Taxonomy and Ontology Ian Bailey ian@modelfutures.com
Overview • An attempt to compare the disciplines of Taxonomy and Ontology • What do they have in common? • Where do they differ? • How are they used? • Case Study: UK Defence Taxonomy (UKDT) • In March 2009, MOD ran a small research project to investigate how master reference data is best provided to enterprise architects • We took the UKDT and re-engineered large parts of it into a formal ontology (based on the IDEAS ontology) • I assume the audience knows far more about Taxonomy than I do
Taxonomy and Ontology • There are several definitions of both, and not all of them are consistent • The types of taxonomy developed in UK Gov seem to be about terminology • They provide consistent terms to enable better discovery of information and consistency of communication • They are usually implemented in software systems, but their goal is to help humans find stuff and be more consistent • Again, there are different flavours of ontology around • They all seem to share the common trait of being models of a domain of interest • Unlike a taxonomy, an ontology models the things of interest and their relationships. The names of those things are of secondary concern to the structure of the things • Ontologies tend not to be for human consumption – not only are they “computer-interpretable”, they are generally able to configure a system to do certain things
T&O – Quick Example to Compare • Barracks and garrisons taxonomy • Descending by “narrower term” • Aldershot Garrison is narrower than Barracks and Garrisons • Arnhem Barracks Aldershot is narrower than Aldershot Garrison • An ontology cares more about the nature of these things • Barracks and Garrisons is a type • Aldershot Garrison is an individual • Their relationship is type-instance • Arnhem Barracks is also an individual • Its relationship to Aldershot Garrison is whole-part • Making these distinctions allows computer systems to interpret reality in a way that is closer to human understanding
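The contrast on this slide can be sketched in a few lines of Python. This is only an illustration: the `Relation` enum, the `Triple` tuple and the `relations_between` helper are hypothetical names invented here, not part of the UKDT or IDEAS.

```python
from enum import Enum
from typing import List, NamedTuple


class Relation(Enum):
    """The three relationship kinds the slides distinguish."""
    SUPER_SUBTYPE = "super-subtype"
    TYPE_INSTANCE = "type-instance"
    WHOLE_PART = "whole-part"


class Triple(NamedTuple):
    subject: str        # the type or the whole
    relation: Relation
    obj: str            # the instance or the part


# Taxonomy view: a single undifferentiated link (broader term, narrower term).
narrower_term = [
    ("Barracks and Garrisons", "Aldershot Garrison"),
    ("Aldershot Garrison", "Arnhem Barracks Aldershot"),
]

# Ontology view: the same two links, each labelled with its nature.
ontology = [
    Triple("Barracks and Garrisons", Relation.TYPE_INSTANCE, "Aldershot Garrison"),
    Triple("Aldershot Garrison", Relation.WHOLE_PART, "Arnhem Barracks Aldershot"),
]


def relations_between(a: str, b: str) -> List[Relation]:
    """Return every relationship the ontology records from a to b."""
    return [t.relation for t in ontology if t.subject == a and t.obj == b]
```

Both structures hold the same two links; only the ontology view lets a program tell that the first is type-instance and the second is whole-part.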
Looking at it Another Way • Venn Diagrams & Physical Structures • Types (ovals) and Individuals (rectangles) • Individuals and their parts • Relationships are important • What was simply narrower term in the taxonomy breaks down into super-subtype (between types), type-instance (between types and things of that type) and whole-part (between individuals)
[Diagram: ovals for the types built estate and barracks and garrisons (super-subtype); rectangles for the individuals Aldershot garrison (type-instance) and its parts Arnhem Barracks, Browning Barracks, Bruneval Barracks, etc. (whole-part).]
Why Bother? • This may seem like a lot of fuss… • However, you can build systems on this stuff • Super-Subtype Inheritance • If we know Built Estate has a lat-long location, then we know Barracks and Garrisons also have lat-long • Type-Instance • …and we also know that Aldershot has a specific lat-long value • Whole-Part • If we know Aldershot Garrison is in Hampshire then we know Bruneval Barracks is also in Hampshire • The point is that a certain degree of sophistication is required before systems can make inferences that support the business • This allows automation of a number of processes that would otherwise have been manual
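The three inferences on this slide can be sketched as follows. This is a minimal sketch under stated assumptions: the `Ontology` class, its field names and its methods are hypothetical, and the whole-part rule assumes that a part lies in the same region as its whole. Arnhem Barracks stands in for the barracks used on the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Set


@dataclass
class Ontology:
    supertype_of: Dict[str, str] = field(default_factory=dict)   # subtype -> supertype
    type_of: Dict[str, str] = field(default_factory=dict)        # individual -> type
    whole_of: Dict[str, str] = field(default_factory=dict)       # part -> whole
    type_attributes: Dict[str, Set[str]] = field(default_factory=dict)
    located_in: Dict[str, str] = field(default_factory=dict)     # individual -> region

    def attributes_of_type(self, type_name: str) -> Set[str]:
        """Super-subtype inheritance: collect attributes up the type chain."""
        attrs: Set[str] = set()
        current: Optional[str] = type_name
        while current is not None:
            attrs |= self.type_attributes.get(current, set())
            current = self.supertype_of.get(current)
        return attrs

    def attributes_of_individual(self, individual: str) -> Set[str]:
        """Type-instance: an individual has the attributes of its type."""
        return self.attributes_of_type(self.type_of[individual])

    def region_of(self, individual: str) -> Optional[str]:
        """Whole-part: a part with no recorded region takes its whole's region."""
        current: Optional[str] = individual
        while current is not None:
            if current in self.located_in:
                return self.located_in[current]
            current = self.whole_of.get(current)
        return None


onto = Ontology()
onto.type_attributes["Built Estate"] = {"lat-long"}
onto.supertype_of["Barracks and Garrisons"] = "Built Estate"
onto.type_of["Aldershot Garrison"] = "Barracks and Garrisons"
onto.whole_of["Arnhem Barracks, Aldershot"] = "Aldershot Garrison"
onto.located_in["Aldershot Garrison"] = "Hampshire"
```

Only one location fact and one attribute fact are stored; everything else is derived by walking the typed relationships, which a single "narrower term" link could not support safely.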
Names & Objects • There are things in the real world (individuals, types, relationships) and there are the names we give them • Ontologies tend to become quite “webby”, and this is a good thing. It better reflects reality, is extensible, and can cope with very complex concepts
[Diagram: a Name Space holding the terms Built Estate, Barracks and Garrisons, Aldershot garrison and Arnhem Barracks, Aldershot, chained by narrower-term; an Object Space holding the corresponding objects, linked by super-subtype, type-instance and whole-part; each object is connected to its term by named-by.]
Synonyms and Homonyms • The next level of sophistication for an ontology is to allow more than one namespace • Each object in the real world may have more than one name, each belonging to a different namespace • e.g. German, French and English names • Homonyms are simply the same text being used to describe two different objects, but in two different namespaces
[Diagram: one object named-by “Hund” in the German namespace, “dog” in the English namespace and “chien” in the French namespace; three different objects each named-by “tank” in the Army, Navy and RAF namespaces respectively.]
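A sketch of how names and namespaces might be kept apart from the objects they name. The `NameRegistry` class and the object identifiers (such as `animal:dog`) are hypothetical; the slide does not say what each service means by “tank”, so the three tank objects are simply kept distinct.

```python
from typing import Dict, List, Optional


class NameRegistry:
    """Maps each object to at most one name per namespace (named-by)."""

    def __init__(self) -> None:
        self._names: Dict[str, Dict[str, str]] = {}  # object id -> {namespace: name}

    def add_name(self, object_id: str, namespace: str, text: str) -> None:
        self._names.setdefault(object_id, {})[namespace] = text

    def name_in(self, object_id: str, namespace: str) -> Optional[str]:
        """Synonyms: the same object, asked for in a particular namespace."""
        return self._names.get(object_id, {}).get(namespace)

    def objects_named(self, namespace: str, text: str) -> List[str]:
        """Homonyms: the same text resolves differently per namespace."""
        return [oid for oid, names in self._names.items()
                if names.get(namespace) == text]


registry = NameRegistry()

# One object, three names in three namespaces.
registry.add_name("animal:dog", "German", "Hund")
registry.add_name("animal:dog", "English", "dog")
registry.add_name("animal:dog", "French", "chien")

# Three different objects (hypothetical ids), one shared text.
registry.add_name("thing:army-tank", "Army", "tank")
registry.add_name("thing:navy-tank", "Navy", "tank")
registry.add_name("thing:raf-tank", "RAF", "tank")
```

Because a lookup always takes a namespace, “tank” is never ambiguous inside the system, even though the text is identical in all three namespaces.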
Take Care with Synonyms • Some taxonomies can be quite loose with their “Alternative Terms” • Prime Minister <> Tony Blair • Recycling <> Black Bin Bag • Sometimes, what appear to be synonyms are actually names applying to different states of something • In the same way that we use whole-part to break individuals into their physical parts, we can also break them into temporal parts • This is called 4D Ontology • Each temporal part has a name
[Diagram: a Person shown along a Time axis, divided into successive temporal parts named “Miss A Smith”, “Mrs A Jones” and “Mrs A Evans”.]
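The temporal-part idea can be sketched as below. The three names come from the slide; the dates are invented purely for illustration, and `TemporalPart` and `name_at` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class TemporalPart:
    """A time slice of an individual, with the name that applies to it."""
    name: str
    start: date            # inclusive
    end: Optional[date]    # exclusive; None means the part is still current


# One person, three temporal parts. All dates are made up for this sketch.
person: List[TemporalPart] = [
    TemporalPart("Miss A Smith", date(1970, 1, 1), date(1995, 6, 1)),
    TemporalPart("Mrs A Jones", date(1995, 6, 1), date(2005, 3, 1)),
    TemporalPart("Mrs A Evans", date(2005, 3, 1), None),
]


def name_at(parts: List[TemporalPart], when: date) -> Optional[str]:
    """Return the name of the temporal part that covers the given date."""
    for part in parts:
        if part.start <= when and (part.end is None or when < part.end):
            return part.name
    return None
```

The three names are not treated as interchangeable synonyms: each belongs to a different temporal part of the same person, so the correct name depends on the date being asked about.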
Methodology • There aren’t many formal methods for developing ontologies • They are either done by navel-gazing academics agonising for weeks over the essence of a concept • …or hacked together by programmers • Neither is ideal • There is one methodology, designed for re-engineering existing data into an ontology • The BORO Method (Business Object Re-engineering Ontology) • Developed by Chris Partridge – ex-KPMG legacy data practice lead • The IDEAS upper ontology is developed using BORO
BORO Flowchart • START HERE: select a concept for analysis • Does it have spatial and temporal extent? Yes: it is an individual; add to model. No (not an individual): go to the next question • Does it have members? Yes: it is a type; ask what the members are, select some typical members and analyse these; add to model. No: go to the next question • Does it relate things? Yes: it is a tuple; ask what it relates and add these things to the analysis; add to model. No: if you’ve got to this stage, the concept needs to be broken down further
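The order of questions in the flowchart can be written down as a small decision function. This is a sketch of the question order only, not an implementation of the BORO Method: in practice each answer comes from an analyst's judgement, represented here by three booleans. `Category` and `classify` are hypothetical names.

```python
from enum import Enum


class Category(Enum):
    INDIVIDUAL = "individual"
    TYPE = "type"
    TUPLE = "tuple"
    NEEDS_BREAKDOWN = "needs to be broken down further"


def classify(has_spatiotemporal_extent: bool,
             has_members: bool,
             relates_things: bool) -> Category:
    """Apply the flowchart's three questions in order; the first 'yes' wins."""
    if has_spatiotemporal_extent:
        return Category.INDIVIDUAL   # add to model
    if has_members:
        return Category.TYPE         # next: analyse some typical members
    if relates_things:
        return Category.TUPLE        # next: analyse the things it relates
    return Category.NEEDS_BREAKDOWN  # the concept is not yet analysable
```

For example, Aldershot Garrison has spatial and temporal extent, so `classify(True, False, False)` gives `Category.INDIVIDUAL`, while Barracks and Garrisons has members but no extent of its own, so `classify(False, True, False)` gives `Category.TYPE`.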
Ontology in MOD – Country Codes • Starting with the SCOPE geo taxonomy, we built an ontology for locations • Using the namespace concept, we allowed for multiple names and identifiers for each geo-political entity • e.g. ISO country codes, NATO country codes, US FIPS 10-4 country codes, names in English, German, etc. • Also added border information
[Diagram labels: whole-part, whole-part, type-instance, named-by.]
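One practical payoff of giving each geo-political entity an identifier in several namespaces is code translation. The sketch below is illustrative only and does not reflect the structure of the MOD ontology; the `CODES` table, the `ISO` and `FIPS` constants and `translate` are hypothetical, and only a few ISO 3166-1 alpha-2 and FIPS 10-4 codes are included.

```python
from typing import Dict, Optional

ISO = "ISO 3166-1 alpha-2"
FIPS = "FIPS 10-4"

# entity -> {namespace: identifier or name}; the keys are internal ids only.
CODES: Dict[str, Dict[str, str]] = {
    "germany": {ISO: "DE", FIPS: "GM", "English": "Germany", "German": "Deutschland"},
    "gambia": {ISO: "GM", FIPS: "GA", "English": "The Gambia"},
    "united-kingdom": {ISO: "GB", FIPS: "UK", "English": "United Kingdom"},
}


def translate(code: str, from_ns: str, to_ns: str) -> Optional[str]:
    """Find the entity that carries `code` in from_ns and return its name in to_ns."""
    for names in CODES.values():
        if names.get(from_ns) == code:
            return names.get(to_ns)
    return None
```

The code “GM” is a homonym: it identifies Germany in FIPS 10-4 but The Gambia in ISO 3166-1, so `translate("GM", FIPS, ISO)` gives `"DE"` while `translate("GM", ISO, FIPS)` gives `"GA"`. Keeping the namespace with every identifier is what makes the translation safe.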
Ontology in MOD – EA Master Data • Enterprise Architecture is multidisciplinary • Business processes, org structures, systems modelling, etc. • Need to encourage consistent terminology and structures in EA • Maximise re-use of existing architecture • Used UK Defence Taxonomy as basis and produced an ontology for MODAF users • Defence Estates – bases, garrisons, barracks • Equipment – types of platform, weapon, comms system, etc. • Organisation structure – brigades, squadrons, etc. • Also pulled in data from other sources • Defence Framework (org structure of MOD) • MOD website (military org structures)
Where We Are, Where We’re Going • IDEAS • International upper ontology developed by the defence ministries of UK, US, Canada, Sweden and Australia • Adopted by the US DoD as the basis for DoD Architecture Framework v2.0 (DoDAF DM2) • Foundation released in April 2009 • UK MOD • Continued involvement with ontology and IDEAS • Michael Warner keeps tabs on projects • Currently investigating use of IDEAS in MODAF (as the US did with DoDAF) • Other ontology projects around – esp. around intelligence and counter-terror • Ordnance Survey • John Goodwin at OS • Developing natural language notations for ontologies • Will present at a future TIPS event
Further Reading • BORO & Ontology • Cutter Paper • http://www.cutter.com/offers/forensicIS.html • Chris Partridge’s book • “Business Objects: Re-Engineering for Re-Use” • ISBN 978-0955060304 • 4D Ontology • “How Things Persist”; Katherine Hawley • ISBN 978-0199275434
Contact Ian Bailey ian@modelfutures.com www.modelfutures.com