1 / 95

Metadata for your Digital Collections

Metadata for your Digital Collections. Jenn Riley Metadata Librarian IU Digital Library Program. Many definitions of metadata. “Data about data” “Structured information about an information resource of any media type or format.” (Caplan)

ktheodore
Télécharger la présentation

Metadata for your Digital Collections

An Image/Link below is provided (as is) to download presentation Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author. Content is provided to you AS IS for your information and personal use only. Download presentation by click this link. While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server. During download, if you can't get a presentation, the file might be deleted by the publisher.

E N D

Presentation Transcript


  1. Metadata for your Digital Collections Jenn Riley Metadata Librarian IU Digital Library Program

  2. Many definitions of metadata • “Data about data” • “Structured information about an information resource of any media type or format.” (Caplan) • “Any data used to aid the identification, description and location of networked electronic resources.” (IFLA) • … INCOLSA Workshop

  3. Refining a definition • Other characteristics • Structure • Control • Origin • Machine-generated • Human-generated • In practice, the term often covers data and meta-metadata INCOLSA Workshop

  4. Some uses of metadata • By information specialists • Describing non-traditional materials • Cataloging Web sites • Navigating digital objects • Managing digital objects over the long term • Managing corporate assets • By novices • Preparing Web sites for search engines • Describing Eprints • Managing personal CD collections INCOLSA Workshop

  5. Metadata and cataloging • Depends on what you mean by: • metadata, and • cataloging! • But, in general: • Metadata is broader in scope than cataloging • Much metadata creation takes place outside of libraries • Good metadata practitioners use fundamental cataloging principles in non-MARC environments • Metadata created for many different types of materials • Metadata is NOT only for Internet resources! INCOLSA Workshop

  6. Metadata in digital library projects • Searching • Browsing • Display for users • Interoperability • Management of digital objects • Preservation • Navigation INCOLSA Workshop

  7. Some types of metadata INCOLSA Workshop

  8. How metadata is used INCOLSA Workshop

  9. Creating descriptive metadata • Digital library content management systems • ContentDM • ExLibris Digitool • Greenstone • Library catalogs • Spreadsheets & databases • XML INCOLSA Workshop

  10. Creating other types of metadata • Technical • Stored in content management system • Stored in separate Excel spreadsheet • Structural • Created and stored in content management system • METS XML • GIS • Using specialized software • Content markup • In XML INCOLSA Workshop

  11. Descriptive metadata • Purpose • Description • Discovery • Some common general schemas • Dublin Core (unqualified and qualified) • MARC • MARCXML • MODS • LOTS of domain-specific schemas INCOLSA Workshop

  12. Simple Dublin Core (DC) • 15-element set • National and international standard • 2001: Released as ANSI/NISO Z39.85 • 2003: Released as ISO 15836 • Maintained by the Dublin Core Metadata Initiative (DCMI) • Other players • DC Usage Board • DCMI Communities • DCMI Task Groups INCOLSA Workshop

  13. DCMI mission • The mission of DCMI is to make it easier to find resources using the Internet through the following activities: • Developing metadata standards for discovery across domains, • Defining frameworks for the interoperation of metadata sets, and, • Facilitating the development of community- or disciplinary-specific metadata sets that are consistent with items 1 and 2 INCOLSA Workshop

  14. DC Principles • Original principles • “Core” across all knowledge domains • No element required • All elements repeatable • 1:1 principle • DC Abstract Model • “A reference against which particular DC encoding guidelines can be compared” model • Two schools of thought on its development • Clarifies model underlying the metadata standard • Overly complicates a standard intended to be simple INCOLSA Workshop

  15. None required Some elements recommend a content or value standard as a best practice Relation Source Subject Type Content/value standards for DC • Coverage • Date • Format • Language • Identifier INCOLSA Workshop

  16. Some limitations of DC • Can’t indicate a main title vs. other subordinate titles • No method for specifying creator roles • W3CDTF format can’t indicate date ranges or uncertainty • Can’t by itself provide robust record relationships INCOLSA Workshop

  17. Good times to use DC • Cross-collection searching • Cross-domain discovery • Metadata sharing • Describing some types of simple resources • Metadata creation by novices INCOLSA Workshop

  18. Qualified Dublin Core (QDC) • Adds some increased specificity to Unqualified Dublin Core • Same governance structure as DC • Same encodings as DC • Same content/value standards as DC • Listed in DMCI Terms • Additional principles • Extensibility • Dumb-down principle INCOLSA Workshop

  19. Types of DC qualifiers • Additional elements • Element refinements • Encoding schemes • Vocabulary encoding schemes • Syntax encoding schemes INCOLSA Workshop

  20. DC qualifier status • Recommended • Conforming • Obsolete • Registered INCOLSA Workshop

  21. Limitations of QDC • Widely misunderstood • No method for specifying creator roles • W3CDTF format can’t indicate date ranges or uncertainty • Split across 3 XML schemas • No encoding in XML (yet) officially endorsed by DCMI INCOLSA Workshop

  22. Best times to use QDC • More specificity needed than simple DC, but not a fundamentally different approach to description • Want to share DC with others, but need a few extensions for your local environment • Describing some types of simple resources • Metadata creation by novices INCOLSA Workshop

  23. MAchine Readable Cataloging (MARC) • Format for the records in library catalogs • Used for library metadata since 1960s • Adopted as national standard in 1971 • Adopted as international standard in 1973 • Maintained by: • Network Development and MARC Standards Office at the Library of Congress • Standards and the Support Office at the National Library of Canada INCOLSA Workshop

  24. More about MARC • Actually a family of MARC standards throughout the world • U.S. & Canada use MARC21 • Structured as a binary interchange format • ANSI/NISO Z39.2 • ISO 2709 • Field names • Numeric fields • Alphabetic subfields INCOLSA Workshop

  25. Content/value standards for MARC • None required by the format itself • But US record creation practice relies heavily on: • AACR2r • ISBD • LCNAF • LCSH INCOLSA Workshop

  26. Limitations of MARC • Use of all its potential is time-consuming • OPACs don’t make full use of all possible data • OPACs virtually the only systems to use MARC data • Requires highly-trained staff to create • Local practice differs greatly INCOLSA Workshop

  27. Good times to use MARC • Integration with other records in OPAC • Resources are like those traditionally found in library catalogs • Maximum compatibility with other libraries is needed • Have expert catalogers for metadata creation INCOLSA Workshop

  28. MARC in XML (MARCXML) • Copies the exact structure of MARC21 in an XML syntax • Numeric fields • Alphabetic subfields • Implicit assumption that content/value standards are the same as in MARC INCOLSA Workshop

  29. Limitations of MARCXML • Not appropriate for direct data entry • Extremely verbose syntax • Full content validation requires tools external to XML Schema conformance INCOLSA Workshop

  30. Best times to use MARCXML • As a transition format between a MARC record and another XML-encoded metadata format • Materials lend themselves to library-type description • Need more robustness than DC offers • Want XML representation to store within larger digital object but need lossless conversion to MARC INCOLSA Workshop

  31. Metadata Object Description Schema (MODS) • Developed and managed by the Library of Congress Network Development and MARC Standards Office • For encoding bibliographic information • Influenced by MARC, but not equivalent • Usable for any format of materials • First released for trial use June 2002 • MODS 3.2 released late 2006 INCOLSA Workshop

  32. MODS differences from MARC • MODS is “MARC-like” but intended to be simpler • Textual tag names • Encoded in XML • Some specific changes • Some regrouping of elements • Removes some elements • Adds some elements INCOLSA Workshop

  33. Content/value standards for MODS • Many elements indicate a given content/value standard should be used • Generally follows MARC/AACR2/ISBD conventions • But not all enforced by the MODS XML schema • Authority attribute available on many elements INCOLSA Workshop

  34. Limitations of MODS • No lossless round-trip conversion from and to MARC • Still largely implemented by library community only • Some semantics of MARC lost INCOLSA Workshop

  35. Good times to use MODS • Materials lend themselves to library-type description • Want to reach both library and non-library audiences • Need more robustness than DC offers • Want XML representation to store within larger digital object INCOLSA Workshop

  36. Visual Resources Association (VRA) Core • From Visual Resources Association • Separates Work from Image • Library focus • Inspiration from Dublin Core • Version 3.0 released on 2002 • Version 4.0 currently in Beta INCOLSA Workshop

  37. Categories for the Description of Works of Art (CDWA) Lite • Reduced version of the Categories for the Description of Works of Art (512 categories) • From J. Paul Getty Trust • Museum focus • Conceived for record sharing INCOLSA Workshop

  38. Structure standards for learning materials • Gateway to Educational Materials (GEM) • From the U.S. Department of Education • Based on Qualified Dublin Core • Adds elements for instructional level, instructional method, etc. • “GEM's goal is to improve the organization and accessibility of the substantial collections of materials that are already available on various federal, state, university, non-profit, and commercial Internet sites.”* • IEEE Learning Object Metadata (LOM) • Elements for technical and descriptive metadata about learning resources * From <http://www.thegateway.org/about/documentation/schemas> INCOLSA Workshop

  39. Text Encoding Initiative (TEI) • TEI in Libraries • For encoding full texts of documents • Literary texts • Letters • …etc. • Requires specialized search engine • Delivery requires specialized software or offline conversion to HTML INCOLSA Workshop

  40. Encoded Archival Description (EAD) • Maintained by the Society for American Archivists EAD Working Group • Markup language for archival finding aids • Designed to accommodate multi-level description • Requires specialized search engine • Delivery requires specialized software or offline conversion to HTML • EAD 1.0 released in 1998 • EAD2002 finalized in December 2002 INCOLSA Workshop

  41. Levels of control • Data structure standards (e.g., MARC) • Data content standards (e.g., AACR2r) • Encoding schemes • Vocabulary • Syntax • High-level models (e.g., FRBR) • Very few metadata standards include a counterpart to the AACR “chief source of information” INCOLSA Workshop

  42. Some data content standards • Anglo-American Cataloging Rules, 2nd edition (AACR2) • Scheduled to be replaced by RDA in 2009 • Describing Archives: A Content Standard (DACS) • Replaces APPM • Cataloging Cultural Objects (CCO) • First content standard explicitly designed for these materials INCOLSA Workshop

  43. When there’s no data content standard… INCOLSA Workshop

  44. TGM I TGM II TGN GeoNet AAT LCSH LCNAF DCMI Type MIME Types …etc. Vocabulary encoding schemes INCOLSA Workshop

  45. Syntax encoding schemes • ISO8601 • W3CDTF • URI • AACR2r • …etc. INCOLSA Workshop

More Related