Ontologies and Thesauri - Tools for Effective Information Access. Martin Doerr. Institute of Computer Science, Foundation for Research and Technology - Hellas. Workshop of the Human Network for Cultural Informatics, Heraklion, Crete.
Ontologies and Thesauri: Problem Statement
• Explanation of a term:
  • What is an ushebti, what a shawabty?
  • What did it mean, and when?
  • What was it made for?
  • How was it made?
  • Where was it used?
• Ideas, concepts, rather than words
• Multiple aspects of interest!
Ontologies and Thesauri: Problem Statement
• Searching for comparative studies
  • How do I spell it? Ushabti, ushabty, ushebti, shawabty? Will it be written the same everywhere?
  • Should I call it: “grave goods” (AAT), “burial figurines”, “dolls”, “afterlife helpers”, “personality surrogate”, “burial ritual”?
  • And what about “χαρώνειο, δανάκη”?
  • Should I call it: “toll”, “cheap coin”, “afterlife helper”, “corpse equipment”, “burial gift”, “burial rites”?
  • Would “grave goods” be distinctive enough?
Ontologies and Thesauri: Problem Statement
• How to find the characteristic term itself?
• How to discover related literature?
  • Relevant abstractions are not standardized
• How to make statistics even about the same item?
  • The same items can be referred to in a thousand ways
• How to do comparative studies by features?
  • Implicit features are not declared, explicit features need systematic documentation
Ontologies and Thesauri: Problem Statement
• Find well-defined concepts
  • uniquely identifiable without dialogue
  • with wide agreement
  • for reproducible agreement between classification and retrieval
• Co-operative work on shared knowledge bases (Knowledge Organisation Systems, KOS):
  • knowledge elicitation from experts
  • many small agreements and data integration
  • structural evolution
  • publication - incorporation at user sites
Ontologies and Thesauri: Usage Environment
[Figure: distributed retrieval. Surviving labels: User’s Authority, Target Authorities, CMS, Collections, foreign language, old version, specialised, Agreed-on Term, Local Term.]
Ontologies and Thesauri: About Thesauri
• Thesauri: find good terms by associations
  • Peter Mark Roget, 1852, “Thesaurus of English Words and Phrases”
• Linguistic thesauri
  • TEI, FDIS ISO 12620, MARTIF, VHG
  • Dictionary editing, term based, presentation oriented
• Conceptual thesauri
  • From library science, subject classification
  • Ranganathan 1925-1965: priority of concept. Confusion of Idea plane - Verbal plane - Notational plane hinders analysis and problem solution
  • ISO 2788, ISO 5964, ISO 2709, e.g. AAT
Ontologies and Thesauri: About Thesauri
• Intrathesaurus relations (ISO 2788)
  • Hierarchical Relations (from Descriptor, to Descriptor)
    • BT (Broader Term)
    • BTP (Broader Term Partitive)
    • BTG (Broader Term Generic) = actual BT
  • Associative Relations (from Descriptor, to Descriptor)
    • RT (Related Term)
  • Equivalence Relations (from Descriptor, to Term)
    • ALT (Alternative Term)
    • UF (Used For Term)
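The relation types on this slide map naturally onto a small data structure. The following is only a minimal sketch: the class names `Descriptor` and `Thesaurus`, the method names, and the sample entries are hypothetical, and the "ushabti BT grave goods" link is an illustrative assumption, not a statement about what AAT records.

```python
from dataclasses import dataclass, field


@dataclass
class Descriptor:
    """A preferred term with ISO 2788-style links (illustrative only)."""
    label: str
    broader: set[str] = field(default_factory=set)   # BT: descriptor -> descriptor
    related: set[str] = field(default_factory=set)   # RT: descriptor -> descriptor
    used_for: set[str] = field(default_factory=set)  # UF: descriptor -> non-preferred term


class Thesaurus:
    def __init__(self):
        self.descriptors = {}
        self._entry = {}  # any term, lowercased -> descriptor label

    def add(self, label, broader=(), related=(), used_for=()):
        d = Descriptor(label, set(broader), set(related), set(used_for))
        self.descriptors[label] = d
        # Both the descriptor and its UF terms are valid entry points.
        for term in (label, *used_for):
            self._entry[term.lower()] = label
        return d

    def resolve(self, term):
        """Map any spelling or synonym to its descriptor label, or None."""
        return self._entry.get(term.lower())

    def narrower(self, label):
        """NT is the inverse of BT, so it is derived rather than stored."""
        return {d.label for d in self.descriptors.values() if label in d.broader}


# Hypothetical sample data, assumed for illustration.
t = Thesaurus()
t.add("grave goods")
t.add("ushabti", broader={"grave goods"},
      used_for={"ushebti", "shawabty", "ushabty"})
```

With this sketch, `t.resolve("Shawabty")` leads a searcher from a variant spelling to the agreed-on descriptor, and `t.narrower("grave goods")` walks the hierarchy downwards without storing NT links separately.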
Ontologies and Thesauri: About Thesauri
• Concepts identify sets of real world objects
• Concepts are identified by scope notes, literature references, examples, images – NOT by terms!
• Terms (noun phrases) are used by social groups to refer to concepts
• Links express opinions and differences
  • about set relations between concepts: subsumption, disjointness etc.
  • about term usage
Ontologies and Thesauri: Concepts are organized in Facets
• Fundamental category, major facet, basic facet:
  • Ranganathan: Personality, Matter, Energy, Space, Time
  • CIDOC CRM: Period, Physical Entity, Conceptual Object, Actor, Place, Time-Span, Type, Material, Language
  • AAT: Objects, Agents, Activities, Styles and Periods, Materials, Physical Attributes, Associated Concepts
• Syntactic element of an indexing expression: e.g. subdivision by period, geography, genre (MARC): “history of painting in 19th century Greece”, or AAT: “fencing + swords”
Ontologies and Thesauri: About Minor Facets
• “Minor facets” provide explicit context criteria:
  • E.g. MDA Archeological Thesaurus:
    • armour by construction: scale armour
    • armour by form: cuirass
    • armour by function: parade armour
• A striking example of explicit use of aspect: SHIC
  • Social, Historical and Industrial Classification
  • a “pure”, homogeneous thesaurus of human activities
  • used by British museums to classify artifacts!
Ontologies and Thesauri: Polydeykes
• The Directorate of Monuments Record and Publications of the Greek Ministry of Culture develops the “Polydeykes”, in collaboration with ICS-FORTH
• Basic Facets:
  • Kosmos, the world as subject
  • Living Nature, as historical subject
  • Culture and Civilization
  • Space
  • Time
  • Creations, the man-made world
    • immobile objects
    • mobile objects
    • conceptual works
  • Associative concepts: stylistic, physical and technical characteristics
Thesauri in Archeology: Polydeykes
• Example: Aspects of Immobile Objects:
  • “Είδος”, the “design models” of the past (form dominated)
  • “Ενότητα”, units with respect to social or functional role
  • “Στοιχεία”, constructive and morphological characteristics:
    • “τμήματα”, segments/sections
    • composition: dependent and independent parts
    • styles
    • shapes
• Pre-combined in the upper abstraction levels to a complete grid for the classification of characteristic terms and for object classification – consistent but heavy.
Ontologies and Thesauri: Polyhierarchies instead of Minor Facets
[Figure: two hierarchy diagrams contrasting term specialization with criteria assignment. Surviving labels: objects, sword-like objects, fighting and hunting weapons, cutting and thrusting weapons, fencing, fencing swords, wooden swords, foils (swords), Term specialization, Criteria assignment.]
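The polyhierarchy idea named in this slide's title, where a term may sit under more than one broader term, can be sketched in a few lines. The original diagram did not survive, so the links in `BROADER` below are an assumed reconstruction from the surviving labels, and the function names are hypothetical.

```python
# Each term maps to the set of its broader terms; more than one parent is allowed.
# The specific links are assumptions for illustration.
BROADER = {
    "foils (swords)": {"fencing swords"},
    "fencing swords": {"sword-like objects", "cutting and thrusting weapons"},
    "wooden swords": {"sword-like objects"},
    "cutting and thrusting weapons": {"fighting and hunting weapons"},
    "sword-like objects": {"objects"},
    "fighting and hunting weapons": {"objects"},
}


def ancestors(term, broader=BROADER):
    """All broader terms reachable from `term`, following every parent."""
    seen = set()
    stack = [term]
    while stack:
        for parent in broader.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def expand_query(term, broader=BROADER):
    """The term itself plus every term that has it as an ancestor."""
    return {term} | {t for t in broader if term in ancestors(t, broader)}
```

Under these assumed links, a search for "sword-like objects" also retrieves records indexed as "wooden swords", while a search for "cutting and thrusting weapons" does not, because each term is reached only through the hierarchies it actually belongs to.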
Ontologies and Thesauri: Ontologies
• Formal ontologies: mathematical models for thesaurus relationships
  • Concepts are correlated with sets of objects
  • BT/NT => IsA/subsumption
  • RT => open number of “roles”/properties/attributes (like “produces”, “used by”, “made for”)
• Allow for machine-processable definitions:
  • Fencing sword = sword used for: fencing
  • Weapon = object used for: fighting or hunting
  • Mother = human & female & which has borne: human
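A machine-processable definition of the kind listed above can be read as a test that an item either passes or fails. The sketch below is a toy illustration under stated assumptions, not a description logic reasoner: the names `IS_A`, `Item`, `Definition` and `classify`, the role name `used_for`, and the sample hierarchy are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical single-parent IsA hierarchy, kept small for illustration.
IS_A = {"sword": "object", "foil": "sword"}


def is_a(cls, ancestor):
    """True if `cls` equals `ancestor` or is subsumed by it."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = IS_A.get(cls)
    return False


@dataclass
class Item:
    name: str
    cls: str
    roles: dict = field(default_factory=dict)  # role name -> set of values


@dataclass(frozen=True)
class Definition:
    """Concept = base class & (role has at least one of the listed values)."""
    name: str
    base: str
    role: str
    values: frozenset

    def matches(self, item):
        has_role = bool(self.values & item.roles.get(self.role, set()))
        return is_a(item.cls, self.base) and has_role


DEFINITIONS = [
    # Fencing sword = sword used for: fencing
    Definition("fencing sword", "sword", "used_for", frozenset({"fencing"})),
    # Weapon = object used for: fighting or hunting
    Definition("weapon", "object", "used_for", frozenset({"fighting", "hunting"})),
]


def classify(item):
    """Names of all defined concepts whose conditions the item satisfies."""
    return {d.name for d in DEFINITIONS if d.matches(item)}


foil = Item("practice foil", "foil", {"used_for": {"fencing"}})
sabre = Item("cavalry sabre", "sword", {"used_for": {"fighting"}})
```

Here `classify(foil)` derives "fencing sword" from the item's class and role values rather than from a label an indexer assigned by hand, which is the sense in which such definitions are machine-processable.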
Ontologies and Thesauri: Ontologies
• Formal ontologies are the natural extension of thesauri
• Allow for dynamic unambiguous concept formation => multiplication of available vocabulary (in contrast to post-coordination like “grinding + factory”)
• Allow for machine-based inferencing => multiplication of manageable amounts
• Allow for interpretation of data structures (tables, fields, tags, classes, attributes etc.) and terms => help data interoperability
Ontologies and Thesauri: Conclusions
• Thesauri and ontologies for information systems are retrieval tools, not terminology dictionaries (concepts often differ from expert terminology).
• Thesaurus structure must be functional, polyhierarchical.
• Thesaurus concepts are a matter of agreement.
• Indexing data records is different from scholarly classification.
• Try to correlate different (foreign) thesauri!
• Formal ontologies are the next step. Thesaurus editors: preserve as much knowledge as possible!