
MESMUSES methodology

MESMUSES methodology: lessons learned and open issues. Alain Michard, Florence, June 2003.




Presentation Transcript


  1. MESMUSES methodology: lessons learned and open issues • Alain Michard • Florence, June 2003

  2. MESMUSES broad vision • Just like several other projects • The Semantic Web (SW) is all about semantic interoperability • Sharing machine-readable terminologies and classification schemes • Science and culture are collective and international • Semantic Web methodology should be highly relevant for managing and sharing scientific and cultural information

  3. Some key S&T issues in the Project • Model: is RDFS / OWL-Lite adequate? • Schema authoring: method and tools needed! • Metadata: where does it come from? • Automatic indexing: experiments with a categorizer

  4. The basic SW model • [Diagram] Layers labelled Schema, Surrogates and Real-world entities • Schema labels: Person, Owner, Artist, Dwelling, House, Artefact, Artwork; properties: Lives-in, Produces, Creates • Example catalogue record (translated from French): Type: printed text, monograph; Author(s): Zola, Émile (1840-1902); Title(s): L'assommoir [Texte imprimé] / par Emile Zola; Edition: 50th ed.; Publication: Paris: G. Charpentier, 1878; Physical description: 111-569 p.; Record no.: FRBNF35963044
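
As a hedged illustration of the schema and surrogate layers named on this slide, the sketch below stores both as subject-predicate-object triples in plain Python. The class and property names come from the slide. The subclass links are only a guess at the diagram's structure, and the `ex:` identifiers and the helpers `superclasses` and `instances_of` are hypothetical names introduced for illustration.

```python
# Minimal sketch: schema and surrogates as (subject, predicate, object) triples.
# ASSUMPTION: the subClassOf links below guess at the slide's diagram.
SCHEMA = {
    ("Artist", "subClassOf", "Person"),
    ("Owner", "subClassOf", "Person"),
    ("House", "subClassOf", "Dwelling"),
    ("Artwork", "subClassOf", "Artefact"),
}

# ASSUMPTION: the "ex:" identifiers are hypothetical surrogate names.
SURROGATES = {
    ("ex:zola", "type", "Artist"),
    ("ex:assommoir", "type", "Artwork"),
    ("ex:zola", "Creates", "ex:assommoir"),
}


def superclasses(cls, schema):
    """Return cls plus every class reachable through subClassOf links."""
    found = {cls}
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in schema:
            if subject == current and predicate == "subClassOf" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def instances_of(cls, schema, surrogates):
    """Surrogates whose declared type is cls or one of its subclasses."""
    return {
        subject
        for subject, predicate, obj in surrogates
        if predicate == "type" and cls in superclasses(obj, schema)
    }
```

Asking for `instances_of("Person", SCHEMA, SURROGATES)` finds the surrogate typed as `Artist`, which is the kind of query that sharing a schema is meant to enable.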

  5. Model and Schema Language • Typed attributes are needed • XML Schema types • Derived types (e.g., Celsius temperature, Gregorian date) • Enumerated types, thesauri • Time-stamping • Cardinality constraints • Explicit transitivity of properties (e.g., geographic inclusion)
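
To make the last bullet concrete, here is a small sketch of what declaring a property transitive would let a system infer. It is not taken from the project: the property name `LOCATED_IN`, the helper `transitive_closure`, and the sample places are assumptions chosen to illustrate geographic inclusion.

```python
def transitive_closure(pairs):
    """Return every (a, c) implied by chaining (a, b) and (b, c) pairs."""
    closure = set(pairs)
    changed = True
    while changed:
        changed = False
        # Iterate over snapshots so the set can grow inside the loops.
        for a, b in list(closure):
            for c, d in list(closure):
                if b == c and (a, d) not in closure:
                    closure.add((a, d))
                    changed = True
    return closure


# ASSUMPTION: a hypothetical "located in" property with three stated facts.
LOCATED_IN = {
    ("Florence", "Tuscany"),
    ("Tuscany", "Italy"),
    ("Italy", "Europe"),
}

inferred = transitive_closure(LOCATED_IN)
```

Without explicit transitivity in the schema language, a query for resources about Europe would miss resources indexed only under Florence; with it, `("Florence", "Europe")` is part of `inferred`.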

  6. Schema authoring issues (1) • Find the right level of abstraction • Is "Glucid" a class or an instance? • Or is it sometimes a class and sometimes an instance? • Avoid the knowledge-representation ("KR") attitude and practices! • It's all about indexing resources with shared terminologies, not about representing human knowledge!

  7. Schema authoring issues (2) • [Diagram of an example schema; labels translated from French] • Node labels: Process, Complex process, Elementary process, System, Structure, Organism, Apparatus (Appareil), Organ, Tissues, Cell, Molecule, Major theme (Grande Thématique), GTANS • Relation labels: ISA, is-composed-of, consumes, transforms, produces, eliminates, involves, triggers, requires, is-regulated-by, is-carried-out-by, is-documented-by, is-explained-by

  8. Schema authoring issues (3)

  9. Schema authoring issues (4) • Authoring tools are badly needed • Graphical representation of the schema • Zooming on sub-graphs (hierarchies) • Versioning • Consider using a UML authoring environment? • Established methodology and tutorials are needed

  10. Creating Surrogates • Data extraction and fusion from structured sources • R-DB, XML-DB, LDAP • Updating • When? • Should not create duplicates! • Detect cross-references • Authority lists • Thesauri • Lexical distance • ???
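
One way to read the "lexical distance" bullet is as an edit-distance check between candidate names before a new surrogate is created. The sketch below uses Levenshtein distance as a stand-in; the slides do not say which measure the project considered, and the function names, the threshold `max_distance`, and the sample names are assumptions.

```python
def levenshtein(a, b):
    """Edit distance: fewest insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def likely_duplicates(names, max_distance=1):
    """Pairs of names within max_distance edits, compared case-insensitively."""
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if levenshtein(first.lower(), second.lower()) <= max_distance:
                pairs.append((first, second))
    return pairs


# ASSUMPTION: hypothetical author headings drawn from two sources.
candidates = ["Zola, Émile", "Zola, Emile", "Hugo, Victor"]
suspects = likely_duplicates(candidates)
```

A low threshold keeps false matches rare but misses variants such as inverted name order, which is one reason authority lists and thesauri appear alongside lexical distance on the slide.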

  11. Automatic Categorization • Automatic indexing • By extracting metadata from resources • By automatic categorization • Define hierarchies of "concepts" inside the schema • Seeding with representative documents • Machine learning to create categorizers • Pros: enriched search functionality • Cons: hierarchies of categories are static • Adding a category may change the categorizers of the others
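
The seeding step can be sketched with a deliberately simple technique: one word-count vector per category, built from seed documents, with new text assigned to the category of highest cosine similarity. This is a stand-in, not the project's categorizer, whose learning method the slides do not describe; the category names, seed sentences, and function names are all assumptions.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and keep purely alphabetic whitespace-separated words."""
    return [word for word in text.lower().split() if word.isalpha()]


def train(seeds):
    """Map each category to the word counts of its seed documents."""
    return {
        category: Counter(word for doc in docs for word in tokenize(doc))
        for category, docs in seeds.items()
    }


def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items() if word in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def categorize(text, model):
    """Return the category whose seed vector is most similar to text."""
    vector = Counter(tokenize(text))
    return max(model, key=lambda category: cosine(vector, model[category]))


# ASSUMPTION: hypothetical categories and seed documents.
SEEDS = {
    "Cell": ["the cell membrane surrounds the cell",
             "cell division produces two cells"],
    "Molecule": ["a glucose molecule stores energy",
                 "the molecule binds to a protein"],
}
model = train(SEEDS)
```

In this sketch each category's vector is built independently, so it does not reproduce the coupling noted in the last bullet; classifiers trained jointly over all categories would need retraining when a category is added.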

  12. Bottom line • RDFS schema authoring may be more difficult than E-R modelling • Debates on syntactic features are irrelevant • Should be grounded on real-world implementations and testbeds • A new query language (e.g., RQL) is not high priority • We have not addressed the "logical rules" layer • Semantic Web vs. Community Webs
