
The Open Archives Initiative Protocol for Metadata Harvesting and the IMLS Digital Collections & Content Project at the University of Illinois. Timothy W. Cole (t-cole3@uiuc.edu), Mathematics Librarian & Professor of Library Administration, University of Illinois at Urbana-Champaign

Presentation Transcript


  1. The Open Archives Initiative Protocol for Metadata Harvesting and the IMLS Digital Collections & Content Project at the University of Illinois
     Timothy W. Cole (t-cole3@uiuc.edu)
     Mathematics Librarian & Professor of Library Administration
     University of Illinois at Urbana-Champaign
     Friday 12 November 2004
     MCN 2004, Minneapolis, MN
     http://imlsdcc.grainger.uiuc.edu/Cole_MCN2004_OAI.ppt

  2. The Digital Information Landscape
     • “The information landscape can be seen as a contour map in which there are mountains, hillocks, valleys, plains and plateaus…. A specialized collection of particular importance is like a sharp peak. Upon a plateau there might be undulations representing strengths and weaknesses…. The landscape is, however, multidimensional. Where one scholar may see a peak another may see a trough. The task is to devise mapping conventions which enable scholars to read the map of the landscape fruitfully, at the appropriate level of generality or specificity.”
       Michael Heaney (2000), “An Analytical Model of Collections and their Catalogues.”
     t-cole3@uiuc.edu, University of Illinois at UC

  3. Users & Uses of Digital Libraries
     • From the Bibusages study (French National Library):
       • Digital libraries are used in conjunction with Web search engines, generalist portals, and commercial sites
       • Mix of intensive & casual users
       • DL users skew somewhat older and hold higher degrees than the average French Internet user
       • DL users are seeking an answer to a specific information need; most time is spent discovering, viewing, & downloading documents
     “Digital Libraries … are now attracting a new type of public, bringing about new, unique and original ways for reading and understanding texts.”
     Houssem Assadi, et al. “Users & Uses of Online Digital Libraries in France,” ECDL 2003

  4. Managing Digital Collections & Content
     • How do mandates translate & change in digital world?
       • Content & collections as virtual ‘information landscapes’
       • New users, uses, & metrics
       • Increased emphasis on interoperability & sharing
     • New models for sharing & resource discovery
       • Harvesting – e.g., OAI-PMH
       • Federated searching – e.g., Z39.50 / ZNG, DiGIR, ...
     • New emphasis on ‘shareable’ metadata
       • Reconciling different descriptive metadata practices
       • New metrics for metadata quality (for interoperability)

  5. IMLS Digital Library Forum (2001)
     • Framework of Guidance for Building Good Digital Collections: http://www.niso.org/framework/forumframework.html
       • Stresses reusability, persistence, interoperability, verification, and documentation of digital collections & content
     • Accompanying report included recommendations encouraging:
       • Creation of an IMLS Collection Registry
       • Implementation of the Open Archives Initiative Protocol for Metadata Harvesting by IMLS projects creating digital content
       • Development of infrastructure to facilitate interoperability between IMLS projects and initiatives like NSDL

  6. IMLS DCC Project Overview
     • Collection description & prototype registry for IMLS National Leadership Grant projects with associated digital content
       • Enhance discoverability of collections & content
       • Provide alternative view of one output of IMLS NLG program
     • Prototype item-level metadata repository via OAI-PMH
       • Demonstrate potential of metadata for interoperability
       • Serve as testbed for IMLS projects interested in OAI-PMH
       • Facilitate reuse of information resources paid for by IMLS
     • Research question: How can resource developers best represent collections and items to meet the needs of service providers and end users?

  7. IMLS Grantees – A Diverse Community
     • Mix of library, museum, and archive traditions
     • Wide variation in technical skills, technology infrastructure & information management policy
     • Diverse perspectives on intellectual property; use and presentation of metadata & primary resources
     • Diverse embedded knowledge structures
     • Results in wide variability in:
       • Metadata formats
       • Content resource types
       • Controlled vocabularies
       • Descriptive metadata practices

  8. Broad Categories of Institutions Represented in Collection Registry

  9. Detailed Institution Types Represented in Collection Registry

  10. Broad Categories of Institutions Represented in Metadata Repository

  11. Detailed Institution Types Represented in Metadata Repository

  12. Metadata Formats

  13. Types of Resources

  14. Controlled Vocabularies

  15. Descriptive Practice
     • Different traditions regarding
       • Inclusion of interpretive information
       • Granularity of description
       • Presentation of information resources
     • Shared problems / issues
       • How to provide context & collection description
       • What exactly to describe
       • Which metadata scheme(s) to use

  16. Illustration – Coverlets (1 of 2)
     Description: Digital image of a single-sized cotton coverlet for a bed with embroidered butterfly design. Handmade by Anna F. Ginsberg Hayutin.
     Source: Materials: cotton and embroidery floss. Dimensions: 71 in. x 86 in. Markings: top right hand corner has 1 1/2 in. x 1/2 in. label cut outs at upper left and right hand side for head board; fabric is woven in a variation of a rib weave; color each of yellow and gray; hand-embroidered cotton butterflies and flowers from two shades of each color of embroidery floss - blue, pink, green and purple and single top 20 in. bordered with blue and black cotton embroidery thread; stitches used for embroidery: running stitch, chain stitch, French knot and back stitches; selvage edges left unfinished; lower edges turned under and finished with large gray running stitches made with embroidery floss.
     Format: Epson Expression 836 XL Scanner with Adobe Photoshop version 5.5; 300 dpi; 21-53K bytes. Available via the World Wide Web.
     Coverage: (none)
     Date Created: 2001-09-19 09:45:18; Updated: 20011107162451; Created: 2001-04-05; Created: 1912-1920?
     Type: Image

  17. Illustration – Coverlets (2 of 2)
     Description: Materials: Textile--Multi, Pigment--Dye; Manufacturing Process: Weaving--Hand, Spinning, Dyeing, Hand-loomed blue wool and white linen coverlet, worked in overshot weave in plain geometric variant of a checkerboard pattern. Coverlet is constructed from finely spun, indigo-dyed wool and undyed linen, woven with considerable skill. Although the pattern is simpler, the overall craftsmanship is higher than 1934.01.0094A. - D. Schrishuhn, 11/19/99 This coverlet is an example of early "overshot" weaving construction, probably dating to the 1820's and is not attributable to any particular weaver. -- Georgette Meredith, 10/9/1973
     Source: (none)
     Format: 228 x 169 x 1.2 cm (1,629 g)
     Coverage: Euro-American; America, North; United States; Indiana? Illinois?
     Date: Early 19th c. CE
     Type: cultural; physical object; original

  18. OAI Protocol for Metadata Harvesting
     • ‘Harvesting’ approach to interoperability at metadata level
     • Divides world into Metadata Providers & Service Providers
     • Builds on HTTP, XML, & community metadata standards
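
To make "builds on HTTP, XML, & community metadata standards" concrete, the sketch below shows how a service provider might read one ListRecords response carrying simple Dublin Core (oai_dc). The three namespace URIs are the ones defined for OAI-PMH 2.0 and Dublin Core; the sample response, the identifier `oai:example.org:coverlet-1`, its datestamp, and the function name `parse_records` are illustrative assumptions, not taken from the IMLS DCC repository.

```python
import xml.etree.ElementTree as ET

# Namespace URIs used by OAI-PMH 2.0 responses carrying oai_dc metadata.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical, hand-written response; a real one arrives over HTTP.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2004-11-12T00:00:00Z</responseDate>
  <request verb="ListRecords" metadataPrefix="oai_dc">http://example.org/oai</request>
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example.org:coverlet-1</identifier>
        <datestamp>2001-11-07</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Coverlet</dc:title>
          <dc:type>Image</dc:type>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def parse_records(xml_text):
    """Return one dict per <record>: identifier, datestamp, and DC fields."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.findall("oai:ListRecords/oai:record", NS):
        header = rec.find("oai:header", NS)
        fields = {}
        # Dublin Core elements are repeatable, so collect values in lists.
        for el in rec.findall("oai:metadata/oai_dc:dc/*", NS):
            name = el.tag.split("}", 1)[1]  # drop the "{namespace}" prefix
            fields.setdefault(name, []).append((el.text or "").strip())
        records.append({
            "identifier": header.findtext("oai:identifier", namespaces=NS),
            "datestamp": header.findtext("oai:datestamp", namespaces=NS),
            "metadata": fields,
        })
    return records


if __name__ == "__main__":
    for record in parse_records(SAMPLE_RESPONSE):
        print(record["identifier"], record["metadata"])
```

Deleted records and error responses are not handled here; a production harvester would need to check for both.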

  19. Metadata Harvesting Model

  20. How OAI-PMH Works
     • OAI “VERBS”
       • Identify
       • ListMetadataFormats
       • ListSets
       • ListIdentifiers
       • ListRecords
       • GetRecord
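
Each of the six verbs is sent as an ordinary HTTP request whose `verb` parameter names the operation. The sketch below only builds request URLs; it does not contact any server. The base URL `http://example.org/oai` and the helper name `build_request` are assumptions made for illustration, while the argument names (`metadataPrefix`, `from`, `identifier`) are those defined by OAI-PMH 2.0.

```python
from urllib.parse import urlencode

# Hypothetical base URL; every repository publishes its own.
BASE_URL = "http://example.org/oai"

OAI_VERBS = (
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
)


def build_request(verb, **args):
    """Build the URL for one OAI-PMH request.

    'from' is a reserved word in Python, so callers pass it as 'from_';
    the trailing underscore is stripped before the URL is encoded.
    """
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    params = {"verb": verb}
    for key, value in args.items():
        params[key.rstrip("_")] = value
    return f"{BASE_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(build_request("Identify"))
    print(build_request("ListRecords", metadataPrefix="oai_dc", from_="2004-11-01"))
    print(build_request("GetRecord", metadataPrefix="oai_dc",
                        identifier="oai:example.org:coverlet-1"))
```

A harvester would typically call Identify and ListMetadataFormats once, then use ListRecords (or ListIdentifiers followed by GetRecord) to collect the metadata itself.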

  21. Why OAI-PMH for IMLS DCC Project
     • Offers low technical barrier options; primary cost is metadata
       • e.g., OAI-PMH itself, OAI Static Repository, mod_oai
     • Is a cross-domain, non-proprietary approach to interoperability
       • Already used by NSDL, OAIster, etc.
     • Seen as a way to bring content to attention of wider audience
       • 37% of visits to State Library of New South Wales image collection via PictureAustralia (an OAI-PMH based portal)
     • Facilitates metadata & metadata services research
       • What makes for good ‘shareable’ metadata?
       • Contrast & compare metadata designs & workflows
       • Explore normalization, enhancement, aggregated searching issues

  22. OAI-PMH Issues
     • Harvesting vs. federated
       • Harvested metadata aggregation is only as current as the last harvest, but federated real-time performance depends on the weakest link
       • Sorting, ranking, & de-duping easier with harvesting model
     • Potential scale issues
       • Largest OAI-PMH provider serves 4 million records
       • Largest OAI-PMH service provider < 10 million records
     • Integration into existing metadata workflow requires some investment – cost-to-benefit ratio still unclear
     • Practical metadata sharing issues:
       • Persistent identifiers, date stamps, proper application of protocol
       • Metadata quality, consistency, context, cross-walking, ...
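
Because harvested records sit in one local store, de-duplication and sorting reduce to ordinary in-memory operations, and they rely on exactly the persistent identifiers and date stamps listed above. The following is a minimal sketch under stated assumptions: records are plain dicts with `identifier` and `datestamp` keys, all datestamps share one ISO 8601 granularity (so string comparison orders them correctly), and the function name `merge_harvests` and the sample data are hypothetical.

```python
def merge_harvests(*batches):
    """Merge harvested batches, keeping the newest copy of each record.

    Records are matched on their persistent identifier; the copy with the
    latest datestamp wins. The result is sorted newest first.
    """
    latest = {}
    for batch in batches:
        for rec in batch:
            seen = latest.get(rec["identifier"])
            # ISO 8601 strings of equal granularity compare chronologically.
            if seen is None or rec["datestamp"] > seen["datestamp"]:
                latest[rec["identifier"]] = rec
    return sorted(latest.values(), key=lambda r: r["datestamp"], reverse=True)


if __name__ == "__main__":
    first_harvest = [
        {"identifier": "oai:a:1", "datestamp": "2004-01-10", "title": "Old"},
        {"identifier": "oai:a:2", "datestamp": "2004-03-01", "title": "Two"},
    ]
    later_harvest = [
        {"identifier": "oai:a:1", "datestamp": "2004-06-15", "title": "New"},
    ]
    for rec in merge_harvests(first_harvest, later_harvest):
        print(rec["identifier"], rec["datestamp"], rec["title"])
```

If identifiers are not persistent or date stamps are unreliable, this matching step breaks down, which is one reason those items appear among the practical sharing issues.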

  23. Federated Searching Model

  24. Alternative Approaches for Interoperability
     • Federated search models
       • Library: NISO Z39.50
       • Specimen / Natural History: DiGIR
       • More homogeneous metadata schemes, query rules
     • Collaborative, sometimes proprietary project portals
       • RLG Cultural Materials
       • ArtStor
       • GBIF, MaNIS, ...
     • Generally higher technical threshold; rely on higher level of metadata homogeneity & compliance

  25. OAI-PMH as Complement to Other Approaches
     • OAI-PMH provides a lowest-common-denominator approach to sharing & interoperability
       • Insufficient for some high-level, domain-specific applications,
       • But useful for sharing across more heterogeneous communities & allowing participation with less technology
     • Portals can exploit combination of approaches
       • OAI-PMH metadata harvesters can normalize & augment metadata before passing it on to domain-specific federated search portals
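
As one hedged illustration of what "normalize" could mean for a harvester, the sketch below maps free-text Type values, such as the two seen in the coverlet records ("Image" and "cultural; physical object; original"), onto the DCMI Type Vocabulary. The matching rule (split on semicolons, ignore case and spaces) and the function name `normalize_type` are assumptions for illustration; the presentation does not describe the project's actual normalization rules.

```python
# Terms of the DCMI Type Vocabulary.
DCMI_TYPES = (
    "Collection", "Dataset", "Event", "Image", "InteractiveResource",
    "MovingImage", "PhysicalObject", "Service", "Software", "Sound",
    "StillImage", "Text",
)
_LOOKUP = {term.lower(): term for term in DCMI_TYPES}


def normalize_type(raw):
    """Split a Type value on ';' and map each part to a DCMI Type term.

    Returns (matched, unmatched): matched holds DCMI terms, unmatched holds
    the original strings that had no counterpart, so nothing is discarded.
    """
    matched, unmatched = [], []
    for part in raw.split(";"):
        value = part.strip()
        if not value:
            continue
        key = value.replace(" ", "").lower()  # "physical object" -> "physicalobject"
        if key in _LOOKUP:
            matched.append(_LOOKUP[key])
        else:
            unmatched.append(value)
    return matched, unmatched


if __name__ == "__main__":
    print(normalize_type("Image"))
    print(normalize_type("cultural; physical object; original"))
```

Keeping the unmatched values alongside the normalized ones preserves the provider's local vocabulary, which a domain-specific portal may still want to display.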

  26. IMLS DCC Collection Registry (alpha)
     Features:
     • Searchable
     • Browseable
     • An entry point for item-level searching

  27. IMLS DCC Metadata Repository (alpha)
     • Currently harvesting:
       • 27 collections
       • 193,677 records
     • Ongoing analysis of metadata
       • Documenting practices
       • Potential for normalization
       • Implications for interface & search engine design

  28. More Information
     • This presentation: http://imlsdcc.grainger.uiuc.edu/Cole_MCN2004_OAI.ppt
     • Project Website: http://imlsdcc.grainger.uiuc.edu/
       • Project PI: Tim Cole, t-cole3@uiuc.edu
       • Project Coordinator: Sarah Shreeves, sshreeve@uiuc.edu
     • OAI-PMH resources:
       • http://www.openarchives.org/
       • Online OAI-PMH tutorial: http://www.oaforum.org/tutorial/
       • DLF OAI-PMH & shareable metadata best practices (under development): http://oai-best.comm.nsdl.org/
