Dublin Core and metadata: a tutorial

Presentation Transcript


  1. Dublin Core and metadata: a tutorial • Lorcan Dempsey, Andy Powell • UKOLN, University of Bath (with a little help from our friends) • http://www.ukoln.ac.uk/metadata

  2. Questions for you ... • Metadata • EAD, CIMI, TEI • PICS, XML, RDF • MARC • 856 • Dublin Core • you are • geeks/people with sensible shoes • goers/doers

  3. Overview • UKOLN and metadata • Metadata landscape • Dublin Core • Metadata management • Interoperability • Harvesting • Future

  4. UKOLN and metadata: initiatives • ROADS subject gateways WHOIS++ templates • BIBLINK CIP for electronic data Dublin Core (+ MARC) • Desire WHOIS++, GILS, Dublin Core Z39.50/WHOIS++ • NewsAgent current awareness, Ariadne Dublin Core, DC-dot • MODELS collection description?? • Agora • PRIDE

  5. Metadata landscape

  6. What is metadata …? • It’s just cataloguing, isn’t it … ? • Yes and no … • Data which supports operations carried out on information objects … • discover, buy, ... • In the company of strangers (Brody) • Relieve user of having to have full advance knowledge of characteristics of resources … … variety

  7. Metadata model: the library example • Libraries • Semantics, syntax, content • MARC, ISO 2709, AACR2 • Picture by Stu Weibel

  8. Variety of formal and informal metadata models • Commerce • Home Pages • Libraries • Geospatial • Internet Commons • Scientific Data • Museums • Whatever... • Picture by Stu Weibel

  9. Variety of operations ... • Discovery • Location • Selection: fit for use • Acquire: terms • Manipulate • Exploit: IPR • Document • Contextualise • Preserve • Manage: dates, people, structures, … • Agent/client access ….

  10. Variety of sectors ... • Curatorial traditions • ‘cataloguing’/documentation • libraries, archives, text archives, museums, geospatial data, etc • Network resource discovery • directory services, search engines, etc • influence from computer science • Network information management • web developments, W3C, database • sitemap, time to live, ... • pragmatic - market needs, vendor push

  11. Variety of creation models ... • Author/creator • web pages? • Repository/site manager • effective disclosure • better management • Third party creator • e.g. eLib subject gateways • Library

  12. Metadata ... • Variety of metadata models • syntax, semantics, content • scope • sectors/domains • Variety of operations supported • Variety of creation models • Variety of architectures for disclosure/discovery • Search and retrieve • Disclosure/distribution • Management … complex

  13. Some formats richer… semantics, structure, domain-specific, ...

  14. Dublin core in the metadata landscape

  15. Dublin Core • FGDC, MARC, Museum ... Dublin Core • Metadata model • Simple element set • focus on semantics - several target syntaxes • Operations • resource discovery on the web • Explicitly cross sector/domain • No constraint on creation model or application architecture … simple and intuitive

  16. Dublin Core - why success? • Simple • Coincides with strategic needs in each of the sectors we identified • Curatorial: semantic interoperability between richer metadata models • Resource discovery: a simple format for descriptive metadata (DLOs) • Web management: associate metadata with Web resources • Inclusive (countries/domains/traditions) • Stu Weibel

  17. Introduction to Dublin Core

  18. Dublin Core - elements • 15 element core metadata set • Title • Subject • Description • Creator • Publisher • Contributor • Date • Type • Format • Identifier • Source • Language • Relation • Coverage • Rights

  19. Dublin Core - HTML Example <HTML><HEAD> <TITLE>UKOLN Home Page</TITLE> <META NAME="DC.Title" CONTENT="UKOLN: UK Office for Library and Information Networking"> <META NAME="DC.Subject" CONTENT="national centre, network information support, library community, awareness, research, information services, public library networking, bibliographic management, distributed library systems, metadata, resource discovery, conferences, lectures, workshops"> <META NAME="DC.Description" CONTENT="UKOLN is a national centre for support in network information management in the library and information communities. It provides awareness, research and information services"> <META NAME="DC.Creator" CONTENT="Isobel Stark"> </HEAD> ...
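The embedded tags in the example above follow one pattern: the element name prefixed with "DC." in NAME, the value in CONTENT. A minimal Python sketch of producing such tags from a simple record follows. The function name `dc_meta_tags` and the record layout are assumptions made for illustration; they are not part of the tutorial or of any UKOLN tool.

```python
from html import escape


def dc_meta_tags(record):
    """Render a mapping of Dublin Core element names to values as <META> tags.

    `record` is a hypothetical structure: keys are element names such as
    "Title" or "Creator", values are plain strings.
    """
    lines = []
    for element, value in record.items():
        name = escape("DC." + element, quote=True)
        # Escape quotes and angle brackets so the value cannot break the tag.
        content = escape(value, quote=True)
        lines.append(f'<META NAME="{name}" CONTENT="{content}">')
    return "\n".join(lines)


record = {
    "Title": "UKOLN: UK Office for Library and Information Networking",
    "Creator": "Isobel Stark",
}
tags = dc_meta_tags(record)
print(tags)
```

Escaping the values matters in practice: a description containing a double quote would otherwise end the CONTENT attribute early, which is one of the easy-to-make errors of embedding by hand.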

  20. Management

  21. Data creation Practical issues of using Dublin Core for Internet resource description... • UKOLN metadata system • Requirements • 3 models for metadata management • Implementation at UKOLN

  22. UKOLN metadata system requirements • Easy to use • Work with a variety of methods of creating HTML • Simple migration to future metadata formats • Separate metadata from resource

  23. Managing Dublin Core (1): HTML authoring tool • Embed by hand using HTML or text editor • Pros… • Simple • May be useful for training and familiarisation • Cons… • May not be possible with all editors • Maintenance problems • Easy to make errors

  24. DC-dot • A Web based tool for creating Dublin Core <meta> tags • Automatic generation of some tags based on content of the resource • Forms based editing of tags • Cut-and-paste output into HTML • Conversion to other formats… • SOIF, ROADS/WHOIS++, USMARC, GILS... Run demo http://www.ukoln.ac.uk/metadata/dcdot/
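The "automatic generation of some tags based on content of the resource" step can be pictured with a short Python sketch. This is not DC-dot's actual implementation, which the slides do not show. It assumes a page already carries a <title> and generic description/keywords/author <meta> tags, and the `GENERIC_TO_DC` mapping is an illustrative assumption.

```python
from html.parser import HTMLParser


class HeadScanner(HTMLParser):
    """Collect the <title> text and any named <meta> tags from a page."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)  # HTMLParser lowercases tag and attribute names
        if tag == "title":
            self.in_title = True
        elif tag == "meta" and attrs.get("name") and attrs.get("content") is not None:
            self.meta[attrs["name"].lower()] = attrs["content"]

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


# Assumed mapping from generic meta names to Dublin Core names.
GENERIC_TO_DC = {
    "description": "DC.Description",
    "keywords": "DC.Subject",
    "author": "DC.Creator",
}


def suggest_dc(html_text):
    """Return suggested Dublin Core name/value pairs for a page."""
    scanner = HeadScanner()
    scanner.feed(html_text)
    suggestions = {}
    if scanner.title.strip():
        suggestions["DC.Title"] = scanner.title.strip()
    for generic, dc_name in GENERIC_TO_DC.items():
        if generic in scanner.meta:
            suggestions[dc_name] = scanner.meta[generic]
    return suggestions


page = """<html><head><title>UKOLN Home Page</title>
<META NAME="Description" CONTENT="blah, blah">
<META NAME="Keywords" CONTENT="xxx, yyy, zzz">
</head><body></body></html>"""
suggested = suggest_dc(page)
print(suggested)
```

The suggestions are only a starting point; in the workflow the slide describes they would then be corrected through forms-based editing before being pasted into the page.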

  25. Managing Dublin Core (2): Web-site management tool • Use Web-site management tool, for example NetObjects Fusion • Pros… • Use of Web-site management tools likely to increase • Object-oriented database approach • Cons… • Proprietary formats • Early days - too early to evaluate use for metadata yet?

  26. Managing Dublin Core (3): On the fly generation • Hold Dublin Core separately and embed on-the-fly using server-side include (SSI) • Pros… • Separates metadata from resource • Future migration fairly simple • Cons… • Performance • Lack of integration with HTML tools • Server specific

  27. UKOLN metadata system (1) • Embed on-the-fly • Apache SSI script • Store metadata using SOIF records • Use MS-Access as tool to create the records • Associate metadata with resource by co-locating them in the Web server filestore

  28. UKOLN metadata system (2) • Apache syntax for calling server-side script: <!--#exec cmd="getmeta" --> • intro.html (HTML editor): <html> <head> <title>…</title> <!--#exec cmd="getmeta" --> </head> ... • intro.html.soif (MS-Access Database): @FILE { http://www.ukoln.ac. ... keywords{13}: xxx, yyy, zzz description{14}: blah blah b author{13}: Stark, Isobel ... }
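What a server-side script such as getmeta might do with the co-located SOIF record can be sketched in Python. The slides do not show the real script, so everything here is an assumption: the function names, the sample URL on example.org, and the `SOIF_TO_DC` mapping are hypothetical. The parser relies only on the SOIF convention visible in the slide, where the number in braces gives the byte length of the value.

```python
import re
from html import escape

# Matches an attribute header such as b"keywords{13}:\t"
ATTR_HEADER = re.compile(rb"([A-Za-z0-9_-]+)\{(\d+)\}:[ \t]")


def parse_soif(data):
    """Parse one SOIF object from bytes; return (template, url, attrs)."""
    head = re.match(rb"@(\w+)[ \t]*\{[ \t]*(\S+)[ \t]*\n", data)
    if head is None:
        raise ValueError("not a SOIF object")
    template, url = head.group(1).decode(), head.group(2).decode()
    pos, attrs = head.end(), {}
    while True:
        m = ATTR_HEADER.match(data, pos)
        if m is None:
            break  # reached the closing brace or malformed input
        size = int(m.group(2))
        # The value is exactly `size` bytes long, whatever it contains.
        value = data[m.end():m.end() + size]
        attrs[m.group(1).decode().lower()] = value.decode("utf-8")
        pos = m.end() + size
        while data[pos:pos + 1] in (b"\n", b"\r"):
            pos += 1
    return template, url, attrs


# Assumed mapping from SOIF attribute names to Dublin Core names.
SOIF_TO_DC = {
    "keywords": "DC.Subject",
    "description": "DC.Description",
    "author": "DC.Creator",
}


def render_meta(attrs):
    """Render the mapped attributes as <META> tags for inclusion in <head>."""
    return "\n".join(
        f'<META NAME="{dc}" CONTENT="{escape(attrs[key], quote=True)}">'
        for key, dc in SOIF_TO_DC.items()
        if key in attrs
    )


# Hypothetical record; the URL is a placeholder, not the one in the slide.
sample = (
    b"@FILE { http://www.example.org/intro.html\n"
    b"keywords{13}:\txxx, yyy, zzz\n"
    b"description{14}:\tblah blah blah\n"
    b"author{13}:\tStark, Isobel\n"
    b"}\n"
)
template, url, attrs = parse_soif(sample)
meta = render_meta(attrs)
print(meta)
```

Because the metadata lives in its own record, moving to a different output syntax later would mean changing only the rendering function, which is the migration benefit claimed for on-the-fly generation.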

  29. UKOLN metadata system (3) MS-Access front end... Filename browser Text boxes Name choosers UKOLN specific metadata

  30. UKOLN metadata system (4) • Diagram with numbered steps 1-6 linking Web robot, UKOLN Web server and SSI script • intro.html: <html> <head> <title>…</title> <!--#exec cmd="getmeta" --> </head> ... • intro.html.soif: @FILE { http://www.ukoln.ac. ... keywords{13}: xxx, yyy, zzz description{14}: blah blah b author{13}: Stark, Isobel ... }

  31. Issues • Performance • Interaction with Web caches • Dublin Core vs Alta Vista style metadata <META NAME="Description" CONTENT="blah, blah"> <META NAME="Keywords" CONTENT="xxx, yyy, zzz"> • Granularity • Which pages should have metadata?

  32. A short history: Dublin to Helsinki • We have borrowed some of this material from Stu Weibel, with permission

  33. Dublin Core Workshop Series .. • DC-1: OCLC/NCSA Metadata Workshop Mar, 1995 • Limited Scope: Discovery of document-like objects • 13 element Dublin Core • Interdisciplinary consensus • DC-2: OCLC/UKOLN Warwick Workshop April, 1996 • Warwick Framework - modularity • Syntax issues

  34. .. Dublin Core Workshop Series • DC-3: CNI/OCLC Image Metadata Workshop, Sep, 1996 • Images are in scope • 15 element core; some element name changes • DC-4: Canberra Metadata Workshop Mar, 1997 • Minimalists and Structuralists • Canberra Qualifiers (additional information useful for interpretation of metadata)

  35. Dublin Core - qualifiers • Language of element value • Scheme • specifies a context for interpretation <META NAME="DC.Subject" SCHEME="ddc.21" CONTENT="170.42"> • Sub-element • specifies a facet - narrows <META NAME="DC.Creator.Address" CONTENT="l.dempsey@ukoln.ac.uk">

  36. DC-5 • DC-5: National Library of Finland/OCLC Workshop, October 1997 • Formal Data Model (expressed in RDF) • many other problems are hereby made simpler • Resource Description Framework • The return of modularity • Finnish finish (of unqualified DC) • minimalist DC is done and will not be changed • Semantics for additional sub-structure • a small number of sub-elements will be established • Closer DC-W3C collaboration

  37. Working groups • Data Model • date, relationship, source • what is a resource? 1:1 • RDF • Relationships • Typology • Sub-elements • Date

  38. RFCs in preparation • Simple DC semantics (the minimalist position) • Simple DC syntax for embedded HTML • DC semantics with qualifiers • DC syntax with qualifiers • HTML 2.0 • HTML 4.0 • RDF

  39. Dublin Core implementation

  40. Projects • 30 projects; 10 countries http://purl.org/metadata/dublin_core/projects.html • “Interdisciplinary and international recognition as the lingua franca for resource discovery metadata for electronic resources” Stu Weibel • Support for use for non-digital objects

  41. The HTML 2.0 “kludge” • Convention for simple embedded metadata • Bootstrapping early Dublin Core deployments • META tags and standard HTML syntax • Useful for simple metadata without qualifiers • Can support Dublin Core qualifiers, but with risks for interoperability and indexing purity <META NAME="DC.Subject" CONTENT="(SCHEME=LCSH) Information technology -- higher education">

  42. HTML 4.0 - DC influences the web • Richer <META> tag attributes • LANG (language of the metadata) • SCHEME (formal qualifier) • SUB-ELEMENTS (dot syntax extensions) • Allows syntactically “clean” implementation of metadata with qualifiers <META NAME="DC.Subject" SCHEME="LCSH" CONTENT="Information technology -- higher education">
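The difference between the two conventions can be made concrete with a small Python sketch that lifts a scheme out of an HTML 2.0 style CONTENT value and emits the HTML 4.0 style tag with a separate SCHEME attribute. The helper names `split_scheme` and `to_html4` are assumptions for illustration, and the pattern handles only the "(SCHEME=...)" prefix form shown in the slides.

```python
import re
from html import escape

# HTML 2.0 convention: the scheme is packed into the start of CONTENT.
KLUDGE = re.compile(r"^\(SCHEME=([^)]+)\)\s*(.*)$", re.DOTALL)


def split_scheme(content):
    """Return (scheme, value); scheme is None when no prefix is present."""
    m = KLUDGE.match(content.strip())
    if m is None:
        return None, content.strip()
    return m.group(1).strip(), m.group(2).strip()


def to_html4(name, content):
    """Rewrite an HTML 2.0 style name/content pair as an HTML 4.0 style tag."""
    scheme, value = split_scheme(content)
    scheme_attr = f' SCHEME="{escape(scheme, quote=True)}"' if scheme else ""
    return (
        f'<META NAME="{escape(name, quote=True)}"{scheme_attr} '
        f'CONTENT="{escape(value, quote=True)}">'
    )


converted = to_html4(
    "DC.Subject", "(SCHEME=LCSH) Information technology -- higher education"
)
print(converted)
```

An indexer that does not know the HTML 2.0 convention would treat "(SCHEME=LCSH)" as part of the subject string, which is the indexing-purity risk; the separate attribute avoids that.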

  43. Some quick statistics • UK (academic sites only) • Total pages: ~1.5M (a guess!) • Embedded DC: ‘a few hundred’ • Information provided by Dave Beckett http://www.cs.ukc.ac.uk/people/staff/djb1/ • Sweden • Total pages: 1.4M • Embedded DC: ‘a few dozen’ • Information provided by Sigfrid Lundberg http://www.lub.lu.se/nwiPaper/

  44. Interoperability

  45. Interoperability • What do we mean by interoperability? • Issues • Z39.50 and Dublin Core • Metadata registries

  46. Interoperability? • Unify access to data in different domains - Web, library, museums, archives, ... • Issues (in real life these can all get mixed up) • Protocols - Z39.50, WHOIS++, … • gateways • Attribute names - author/creator/... • Semantic interoperability - mapping tables • Format of results • format converters

  47. Protocol Gateways - an example • ZEXI - a Z39.50 to WHOIS++ gateway • Based on CNIDR's Isite • Accepts Z39.50 searches • Converts them to WHOIS++ • Returns SUTRS records http://roads.ukoln.ac.uk/cgi-bin/egwcgi/egwirtcl/targets.egw

  48. Attribute names • Different databases may use different ‘names’ for the same thing • ‘creator’ vs ‘author’ • Need to be able to construct searches that ‘work’ against different databases irrespective of the ‘names’ in use • Dublin Core provides a minimal set of agreed ‘names’ with which we can construct searches
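The idea of searching different databases through one agreed set of names can be sketched as a mapping table in Python. The database names, their local field names, and the function `translate_query` are all hypothetical; the point is only that the search is phrased once in Dublin Core terms and translated per target.

```python
# Hypothetical mapping tables: Dublin Core element -> local field name.
LOCAL_NAMES = {
    "catalogue_a": {"creator": "author", "title": "title", "subject": "keywords"},
    "gateway_b": {"creator": "creator", "title": "name", "subject": "subject"},
}


def translate_query(query, database):
    """Rewrite a query keyed by Dublin Core elements into local field names."""
    mapping = LOCAL_NAMES[database]
    unknown = [element for element in query if element not in mapping]
    if unknown:
        raise KeyError(f"{database} has no mapping for: {', '.join(unknown)}")
    return {mapping[element]: term for element, term in query.items()}


dc_query = {"creator": "Stark", "subject": "metadata"}
for_a = translate_query(dc_query, "catalogue_a")
for_b = translate_query(dc_query, "gateway_b")
print(for_a)
print(for_b)
```

Raising an error for an unmapped element, rather than silently dropping it, keeps a cross-database search from returning broader results than the user asked for.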

  49. Format of results • Different databases may return results in different formats • USMARC, GRS-1, SUTRS, IAFA, ... • Early stages of searching ideally need results to be returned in single ‘simple’ format • Dublin Core provides a minimal set of agreed data elements with which we can construct results

  50. Z39.50 and DC - searching • Version 2 • Searches phrased in terms of single attribute set only • Either need to • add DC attributes to Bib-1 • map DC to Bib-1 • Version 3 • Multiple attribute sets allowed for searching • New simple DC attribute set to be proposed • Other attributes taken from Bib-1 http://cypress.dev.oclc.org:12345/~rrl/docs/dublincoreandz3950.html
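The "map DC to Bib-1" option for Version 2 can be illustrated with a short Python sketch. It assumes the commonly cited Bib-1 use attribute numbers for title (4), author (1003) and subject heading (21), and it writes the result in prefix query notation as used by Z39.50 toolkits such as YAZ; neither the notation nor the function name `to_pqf` comes from the slides, and the mapping covers only three elements.

```python
# Assumed subset of a DC-to-Bib-1 mapping (Bib-1 use attribute numbers).
DC_TO_BIB1_USE = {
    "title": 4,       # Bib-1 Title
    "creator": 1003,  # Bib-1 Author
    "subject": 21,    # Bib-1 Subject heading
}


def to_pqf(query):
    """Build a prefix-notation query from Dublin Core element/term pairs."""
    clauses = []
    for element, term in query.items():
        use = DC_TO_BIB1_USE[element.lower()]
        if '"' in term:
            # Quoting rules are not covered by this sketch, so refuse.
            raise ValueError("terms containing double quotes are not supported")
        clauses.append(f'@attr 1={use} "{term}"')
    if not clauses:
        raise ValueError("empty query")
    # Prefix notation: n clauses need n - 1 leading @and operators.
    return "@and " * (len(clauses) - 1) + " ".join(clauses)


pqf = to_pqf({"creator": "Stark", "title": "metadata"})
print(pqf)
```

Elements without a Bib-1 counterpart are the reason the slide lists adding DC attributes to Bib-1, or a separate DC attribute set in Version 3, as alternatives to a pure mapping.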
