
A Complex Standard and Its Use: Results from an empirical analysis of MARC

2004 Texas Library Association Annual Conference, March 18, 2004, San Antonio, TX. A Complex Standard and Its Use: Results from an empirical analysis of MARC.


Presentation Transcript


  1. A Complex Standard and Its Use: Results from an empirical analysis of MARC
     2004 Texas Library Association Annual Conference, March 18, 2004, San Antonio, TX
     William E. Moen <wemoen@unt.edu>
     School of Library and Information Sciences
     Texas Center for Digital Knowledge
     University of North Texas
     Denton, TX 76203

  2. Overview
     • Context for the analysis -- interoperability
     • Findings from the analysis
     • Indexing and MARC
     • More questions …
     TLA Annual Conference -- March 18, 2004 -- San Antonio, TX

  3. Context for the analysis
     • Interoperability across library online catalogs
     • Indexing of MARC records to support searching
     • Richness of MARC content designation available
     • Indexing guidelines prepared for the Z39.50 Interoperability Testbed (Z-Interop)
     • Implications for indexing guidelines and policies

  4. Interoperability testbed project
     Realizing the Vision of Networked Access to Library Resources: An Applied Research and Demonstration Project to Establish and Operate a Z39.50 Interoperability Testbed
     • An Institute of Museum and Library Services National Leadership Grant
     • Goal: Improve Z39.50 semantic interoperability among libraries for information access and resource sharing
     For more information, visit the project website: http://www.unt.edu/zinterop/

  5. Components of the testbed
     • Test dataset
       • 400,000+ MARC 21 records from OCLC’s WorldCat
     • Z39.50 reference implementations
       • Z-client (Bookwhere), Z-server & information retrieval system (Sirsi Unicorn)
     • Test scenarios & searches
       • Searches with known result records from dataset
     • Benchmarks
       • Results of test searches using reference implementations

  6. Z-Interop test dataset
     • Approximately 1% sample of MARC records from OCLC’s WorldCat database
     • Weighted sampling based on number of libraries “holding” the object represented by the record
     • 419,657 total MARC records
     • 89% of records “full level” cataloging
     • Formats represented in test dataset:
       • Books: 91%
       • Sound recordings: 4%
       • Serials: 3%
       • Visual materials: 1%
       • Cartographic materials: < 1%
       • Electronic resources: < 1%
       • Archival/mixed materials: < 1%
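The slide says only that sampling was weighted by the number of holding libraries; it does not describe the procedure. As a rough sketch of one way such a sample could be drawn, the Python below uses the Efraimidis-Spirakis weighted sampling scheme. That technique is an assumption chosen for illustration, not the project's documented method, and the `catalog` data and the `weighted_sample` helper are hypothetical.

```python
import heapq
import random


def weighted_sample(records, k, seed=0):
    """Pick k records without replacement, favoring widely held ones.

    records: iterable of (record_id, holdings) pairs with holdings > 0.
    """
    rng = random.Random(seed)
    # Efraimidis-Spirakis: give each record the key u ** (1 / weight),
    # with u uniform in [0, 1); the k largest keys form the sample.
    keyed = ((rng.random() ** (1.0 / holdings), rec_id)
             for rec_id, holdings in records)
    return [rec_id for _, rec_id in heapq.nlargest(k, keyed)]


# Hypothetical (record id, number of holding libraries) pairs.
catalog = [(f"rec{i}", 1 + (i % 50)) for i in range(1000)]
sample = weighted_sample(catalog, 10)  # a 1% sample of the toy catalog
print(sample)
```

Records held by more libraries receive larger keys on average, so they are more likely to be selected, which matches the intent stated on the slide.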

  7. MARC 21 content designation

  8. Content designation in dataset

  9. Summary frequency results
     • Total number of field/subfield occurrences in dataset: 13,849,499
     • Only 4% of all fields/subfields account for 80% of all occurrences
     • Equivalently, the remaining 96% of all fields/subfields account for 20% of all occurrences
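The 4%/80% finding is a cumulative-frequency result: rank fields/subfields by occurrence count and find how few of them are needed to reach a given share of all occurrences. A minimal sketch of that calculation follows. The counts in `toy` are invented for illustration and are not the Z-Interop figures, and `smallest_covering_set` is a hypothetical helper name.

```python
from collections import Counter


def smallest_covering_set(counts, share_pct=80):
    """Return the most frequent tags whose combined count reaches share_pct
    percent of all occurrences."""
    total = sum(counts.values())
    running = 0
    chosen = []
    for tag, n in counts.most_common():
        chosen.append(tag)
        running += n
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= share_pct * total:
            break
    return chosen


# Toy occurrence counts, invented for illustration only.
toy = Counter({
    "650$a": 500, "040$d": 300, "260$a": 100,
    "245$a": 60, "700$e": 25, "511$a": 15,
})
core = smallest_covering_set(toy)
print(core, f"{len(core) / len(toy):.0%} of distinct fields/subfields")
```

Applied to real tallies, the length of `core` divided by the number of distinct fields/subfields gives the kind of percentage reported on the slide.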

  10. Characteristics of top 36
      • Most frequently occurring: 650 $a [Subject data]
      • 2nd most frequently occurring: 040 $d [Cataloging source]
      • 3rd & 4th most frequently occurring: 260 $a & $b [Publication information]
      • 5th most frequently occurring: 245 $a [Title]
      • Contain data useful to end users: 28
      • Contain control numbers, etc.: 5
      • Contain data useful to catalogers: 3

  11. Indexing & MARC
      • Indexing Guidelines to Support Z39.50 Profile Searches
        • Identified all MARC 21 fields/subfields that may contain author, title, or subject data
        • Author-related fields/subfields: 119
        • AuthorTitle-related fields/subfields: 21
        • Title-related fields/subfields: 253
        • Subject-related fields/subfields: 144
      • 537 fields/subfields contain author, title, subject data
      • Usefulness of indexing all possible fields?
      • How often are these fields/subfields used?

  12. Occurrences in test dataset
      • 381 occur one or more times in Z-Interop dataset
      • Author, title, or subject fields/subfields in Z-Interop dataset
        • Author-related fields/subfields: 86
        • AuthorTitle-related fields/subfields: 16
        • Title-related fields/subfields: 178
        • Subject-related fields/subfields: 101
      • 19 of the 381 (5%) account for 80% of all occurrences
        • 9 of 19 are subject-related
        • 5 of 19 are author-related
        • 5 of 19 are title-related
      • The 19 fields/subfields
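The figures on slides 11 and 12 can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers given on those two slides; the variable names are illustrative.

```python
# Fields/subfields identified in the indexing guidelines (slide 11).
available = {"author": 119, "author-title": 21, "title": 253, "subject": 144}
# Fields/subfields occurring at least once in the test dataset (slide 12).
observed = {"author": 86, "author-title": 16, "title": 178, "subject": 101}

total_available = sum(available.values())
total_observed = sum(observed.values())

# Share of the identified fields/subfields that actually occur, per category.
for category in available:
    rate = observed[category] / available[category]
    print(f"{category}: {observed[category]} of {available[category]} ({rate:.1%})")

print(f"overall: {total_observed} of {total_available} "
      f"({total_observed / total_available:.1%})")

# The 19 fields/subfields that account for 80% of occurrences.
core_share = 19 / total_observed
print(f"core set: 19 of {total_observed} ({core_share:.1%})")
```

The category counts sum to the stated totals of 537 and 381, so roughly 71% of the identified fields/subfields occur at least once in the dataset, and 19 of 381 rounds to the 5% shown on the slide.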

  13. Implications for indexing
      • What difference do indexing decisions make?
      • Preliminary testing using the 19 fields/subfields:
        • 95% - 100% of correct records retrieved!
      • Is there a systematic method to identify the “best” fields/subfields to index?
        • Per format of materials?
        • Per user (librarians and end users) needs?
        • Good enough search results?
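The "95% - 100% of correct records retrieved" figure is a recall measure: the share of known result records that a search still finds when only some fields/subfields are indexed. The sketch below shows the calculation on a toy word index. The records, the choice of indexed tags, and the helper names are all hypothetical; the actual testbed used a Z39.50 client and server rather than anything like this.

```python
def build_index(records, indexed_tags):
    """Map each lowercase word to the ids of records containing it,
    looking only at the fields/subfields listed in indexed_tags."""
    index = {}
    for rec_id, fields in records.items():
        for tag, value in fields:
            if tag in indexed_tags:
                for word in value.lower().split():
                    index.setdefault(word, set()).add(rec_id)
    return index


def recall(retrieved, expected):
    """Fraction of the expected records that were retrieved."""
    return len(retrieved & expected) / len(expected)


# Hypothetical records as lists of (field/subfield, value) pairs.
records = {
    "r1": [("245$a", "Texas libraries"), ("650$a", "Libraries")],
    "r2": [("245$a", "Cataloging rules"), ("505$a", "Libraries and standards")],
    "r3": [("650$a", "Libraries"), ("700$a", "Example, Author")],
}
# Records that an index over every field would return for "libraries".
expected = {"r1", "r2", "r3"}

reduced = build_index(records, {"245$a", "650$a"})
hits = reduced.get("libraries", set())
print(sorted(hits), recall(hits, expected))
```

In this toy case the reduced index misses `r2`, whose only match sits in a field that was not indexed. That is the kind of loss the preliminary testing was measuring.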

  14. Inquiring minds want to know…
      • What is the extent of catalogers’ use of MARC 21 content designation as indicated by analyses of large random samples of MARC records?
      • What does the empirical evidence of MARC 21 content designation use suggest about a set of common or core elements in bibliographic records per format or type of material?
      • What is the relationship between the availability of new MARC content designation and its subsequent adoption and use?
      • What methodology is appropriate to identify and understand factors contributing to catalogers’ utilization of available content designation and the interplay between MARC and the entire cataloging enterprise?

  15. To the future and beyond
      • Given solid empirical data on use of MARC content designation…
        • The records are artifacts of the cataloging enterprise: what can we learn about cataloger practices?
        • Are records complete enough to support FRBR applications?
        • What are the implications for standards developers for the evolution of metadata and encoding schemes?
        • Will we XML’ize MARC content designation whether it is used or not?

  16. References
      • Assessing Metadata Utilization: An Analysis of MARC Content Designation Use
        http://www.unt.edu/wmoen/publications/MARCPaper_Final2003pdf.pdf
      • Z39.50 Interoperability Testbed
        http://www.unt.edu/zinterop/
      • Indexing Guidelines to Support Z39.50 Profile Searches
        http://www.unt.edu/zinterop/Documents/IndexingGuidelines1Feb2002.pdf
