Open Archives Initiative

Presentation Transcript


  1. Open Archives Initiative Protocol for Metadata Harvesting

  2. Collections in isolation • Some thoughts • A wonderful collection is of limited use if it is not well known. • Very redundant collections are often wasteful

  3. Virtual collections • Some collections do not contain actual materials, only information about materials and links to the home site. • How do these virtual collections get the information about other collections? How do they stay up to date? • --> The Open Archives Initiative Protocol for Metadata Harvesting (OAI-PMH)

  4. OAI-PMH • Protocol: an agreement to exchange messages and interpret them according to strict rules. • Metadata: data about the data, that is, information about the material in the collection. • Harvesting: gathering the desired part of a collection's metadata for further use.

  5. The protocol • See http://www.openarchives.org/OAI/openarchivesprotocol.html • Two sides: the repository and the harvester • The repository (data provider) • Prepares the required metadata • Responds to the harvester's queries • Acts like a server, responding to queries when they arrive • The harvester (data gatherer) • Gathers the metadata from the collections • Organizes the harvested metadata in a way that serves its purpose • Acts like a client, requesting service when it needs it

  6. Resource, item, record • Resource: the actual content of the collection; the point of the digital library • Item: the part of the repository from which metadata about a resource is disseminated • Record: metadata in a specific format available for dissemination • Encoded in XML • Unique identifier • Datestamp • setSpec • Optional status (the value "deleted" marks a record that has been withdrawn)
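The header fields listed on this slide can be read out of a record with a few lines of Python. This is a minimal sketch, not a reference implementation: the sample XML is hand-written for illustration, and the identifier `oai:example.org:item-42`, the set name, and the function name `parse_header` are all assumptions.

```python
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hand-written sample; identifier and set are hypothetical.
SAMPLE_RECORD = """
<record xmlns="http://www.openarchives.org/OAI/2.0/">
  <header status="deleted">
    <identifier>oai:example.org:item-42</identifier>
    <datestamp>2004-03-15</datestamp>
    <setSpec>physics:optics</setSpec>
  </header>
</record>
"""

def parse_header(record_xml):
    """Return the header fields of one record as a plain dict."""
    record = ET.fromstring(record_xml.strip())
    header = record.find("oai:header", OAI_NS)
    return {
        "identifier": header.findtext("oai:identifier", namespaces=OAI_NS),
        "datestamp": header.findtext("oai:datestamp", namespaces=OAI_NS),
        # A record may belong to zero or more sets.
        "set_specs": [s.text for s in header.findall("oai:setSpec", OAI_NS)],
        # None when the optional status attribute is absent.
        "status": header.get("status"),
    }

header = parse_header(SAMPLE_RECORD)
```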

  7. Sets • Repositories may organize items into sets • This allows selective harvesting • Each node in a set organization has • setSpec (sets may be hierarchical; if so, the levels are separated by colons) • setName • setDescription (optional)
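The colon-separated hierarchy can be illustrated with two small helpers. This is a sketch under assumptions: the function names and the sample setSpec values are made up for the example, and a real repository decides for itself how set membership is stored.

```python
def set_ancestors(set_spec):
    """List every level of a hierarchical setSpec, from the top down."""
    parts = set_spec.split(":")
    return [":".join(parts[:i]) for i in range(1, len(parts) + 1)]

def in_set(item_set_spec, requested):
    """True if an item's setSpec equals the requested set or lies below it."""
    return item_set_spec == requested or item_set_spec.startswith(requested + ":")

levels = set_ancestors("institution:department:lab")
```

Under this sketch, a harvester asking for the set `institution` would also receive items filed under `institution:department`.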

  8. Requests • Each request is embedded in an HTTP request • Valid OAI-PMH requests: • GetRecord • Identify • ListIdentifiers • ListMetadataFormats • ListRecords • ListSets
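As a rough illustration of how a request is embedded in an HTTP GET, the sketch below builds the query string from a verb and its arguments. The base URL `http://example.org/oai` and the function name `build_request_url` are assumptions; only the six verb names come from the slide.

```python
from urllib.parse import urlencode

VERBS = {
    "GetRecord", "Identify", "ListIdentifiers",
    "ListMetadataFormats", "ListRecords", "ListSets",
}

def build_request_url(base_url, verb, **arguments):
    """Build the GET URL for one OAI-PMH request."""
    if verb not in VERBS:
        raise ValueError("not an OAI-PMH verb: " + verb)
    params = {"verb": verb}
    params.update(arguments)
    # urlencode escapes reserved characters such as ':' in identifiers.
    return base_url + "?" + urlencode(params)

identify_url = build_request_url("http://example.org/oai", "Identify")
```

Because `from` is a reserved word in Python, that argument has to be passed as `**{"from": "2004-01-01"}` rather than as a keyword.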

  9. GetRecord • Required arguments • identifier = unique identifier of the item whose record is requested • metadataPrefix = the prefix that names the metadata format in which the record should be returned. Example = oai_dc (the OAI version of the Dublin Core: the standard 15 elements, no extensions) • Errors: badArgument, cannotDisseminateFormat, idDoesNotExist
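A harvester has to check each response for the error codes listed on this slide before it looks for a record. The following is a minimal sketch: the sample response is hand-written, and the function name `find_errors` and the URL in the sample are assumptions.

```python
import xml.etree.ElementTree as ET

OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hand-written sample of a failed GetRecord response (hypothetical repository).
SAMPLE_ERROR_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2004-03-15T12:00:00Z</responseDate>
  <request verb="GetRecord">http://example.org/oai</request>
  <error code="idDoesNotExist">No matching identifier</error>
</OAI-PMH>"""

def find_errors(response_xml):
    """Return (code, message) pairs for every error element in a response."""
    root = ET.fromstring(response_xml)
    return [
        (error.get("code"), (error.text or "").strip())
        for error in root.findall("oai:error", OAI_NS)
    ]

errors = find_errors(SAMPLE_ERROR_RESPONSE)
```

An empty list means the response carried no error and the harvester can go on to read the record.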

  10. Identify • No arguments • Requests information about the repository • Response includes • repositoryName • baseURL • protocolVersion • earliestDatestamp • deletedRecord (how the repository handles deletions: no, transient, or persistent) • granularity (how finely can the datestamp be specified?) • adminEmail • compression (which schemes are supported; optional) • description (optional)
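The granularity value matters in practice because it determines how a harvester should write the dates it sends back to the repository. The sketch below assumes the two granularity strings defined by OAI-PMH 2.0 and a `datetime` that is already in UTC; the function name `format_datestamp` is hypothetical.

```python
from datetime import datetime

# The two granularities a repository can announce in its Identify response.
DAY = "YYYY-MM-DD"
SECOND = "YYYY-MM-DDThh:mm:ssZ"

def format_datestamp(moment, granularity):
    """Format a UTC datetime to match the repository's granularity."""
    if granularity == DAY:
        return moment.strftime("%Y-%m-%d")
    if granularity == SECOND:
        return moment.strftime("%Y-%m-%dT%H:%M:%SZ")
    raise ValueError("unknown granularity: " + granularity)

moment = datetime(2004, 3, 15, 12, 30, 0)
day_stamp = format_datestamp(moment, DAY)
second_stamp = format_datestamp(moment, SECOND)
```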

  11. ListIdentifiers • Required argument • metadataPrefix • Optional arguments • from • until • set • Exclusive argument • resumptionToken (flow control token for resuming a previous incomplete ListIdentifiers request; it cannot be combined with any other argument) • Errors: badArgument, badResumptionToken, cannotDisseminateFormat, noRecordsMatch, noSetHierarchy
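The flow control described here can be sketched as a loop that keeps asking until no resumptionToken comes back. To stay runnable without a network, the example uses a stand-in `fake_fetch` function in place of a real HTTP call; that function, the token value, and the identifiers are all invented for illustration.

```python
def harvest_identifiers(fetch_page, metadata_prefix, **selective):
    """Collect identifiers across pages, following resumption tokens.

    fetch_page takes a dict of request arguments and returns
    (identifiers, resumption_token); an empty token ends the list.
    """
    params = {"verb": "ListIdentifiers", "metadataPrefix": metadata_prefix}
    params.update(selective)
    identifiers = []
    while True:
        page, token = fetch_page(params)
        identifiers.extend(page)
        if not token:
            return identifiers
        # resumptionToken is exclusive: send it with the verb and nothing else.
        params = {"verb": "ListIdentifiers", "resumptionToken": token}

seen_requests = []

def fake_fetch(params):
    """Stand-in for an HTTP request to a repository (hypothetical data)."""
    seen_requests.append(dict(params))
    if "resumptionToken" not in params:
        return ["oai:example.org:1", "oai:example.org:2"], "token-2"
    return ["oai:example.org:3"], ""

ids = harvest_identifiers(fake_fetch, "oai_dc", set="physics")
```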

  12. ListMetadataFormats • Optional argument • identifier (if the available metadata formats are needed only for one particular item) • Errors: badArgument, idDoesNotExist, noMetadataFormats • Response includes both the metadataPrefix and the associated schema

  13. ListRecords • Required argument • metadataPrefix (only records available in the specified metadata format should be returned) • Optional arguments • from • until • set • Exclusive argument • resumptionToken
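On the repository side, the required, optional, and exclusive arguments translate into a simple validity check. This is a sketch of one way to write it, with the function name `check_list_records_arguments` as an assumption; it covers only which arguments are present and ignores whether their values are well formed.

```python
SELECTIVE = {"from", "until", "set"}

def check_list_records_arguments(arguments):
    """Return "badArgument" if the argument names are invalid, else None.

    arguments holds the request arguments without the verb itself.
    """
    names = set(arguments)
    if "resumptionToken" in names:
        # Exclusive: no other argument may accompany the token.
        return None if names == {"resumptionToken"} else "badArgument"
    if "metadataPrefix" not in names:
        return "badArgument"
    if names - SELECTIVE - {"metadataPrefix"}:
        # Something other than from, until, or set was supplied.
        return "badArgument"
    return None

result = check_list_records_arguments({"metadataPrefix": "oai_dc", "from": "2004-01-01"})
```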

  14. ListSets • Exclusive argument • resumptionToken (used to continue a previous incomplete response to ListSets) • Errors: badArgument, badResumptionToken, noSetHierarchy

  15. Resources • Compliance testing - www.dlib.vt.edu/projects/OAI/repexp/repexp.html • OAI-PMH - www.openarchives.org/OAI/openarchivesprotocol.html • Implementation Guidelines - www.openarchives.org/OAI/2.0/guidelines.htm
