
Reconciling OCLC and Orbis

Reconciling OCLC and Orbis: Managing a Bibliographic and Holdings Synchronization Between Yale University Library and WorldCat. Melissa A. Wisner. Purpose of presentation: describing the process, what is involved, staffing required, timeframe, programming required. Are we done yet? No!


Presentation Transcript


  1. Reconciling OCLC and Orbis Managing a Bibliographic and Holdings Synchronization Between Yale University Library and WorldCat Melissa A. Wisner

  2. Purpose of Presentation • Describing the process • What is involved? • Staffing required • Timeframe • Programming required • Are We Done Yet? • No!

  3. Why do you want to come to this talk? • For a collection of any size, a reconciliation is a detail-oriented project: planning, pre-processing, OCLC processing, dealing with returned data, and maintaining the data • Why do this? • Living with your own standards, good or bad • What is your database of record?

  4. YUL Background • Voyager ILS since 2002 • Approximately 8.5 million bibliographic records • Member of (former) RLIN • OCLC participant: add PCC records, create IR records in WorldCat, weekly holdings update, some cataloging directly in Connexion, ILL lender • In the early 2000s YUL did a retrospective conversion with OCLC

  5. Standard Workflow between Voyager and OCLC • Weekly export to OCLC (staff flag records to send as needed) • Sporadic OCLC Batch Matches over the years • Local program to identify “candidate” records by encoding level and “UNCAT” status; send out to OCLC as a separate project; filter and reload any returned records with encoding level 1, 4, or 7 and overlay the original • Run LC Match once a month: a similar process against a local copy of LCDB

  6. Arcadia Grant • Cultural Knowledge grant • March 2009-March 2013 • $5 million/$1 million per year • Cambodian Newspapers, Khmer Rouge Genocide documentation, African language materials and more… • Layoffs and re-staffing

  7. What Records to Send or Exclude? • Divided up by locations for staff review • Uncovered some data problems we knew about and didn’t know about…e.g. locations with no holdings in them; locations that still had holdings we thought had been migrated to new locations • Most significant…outdated MARC tags, outdated format codes, practice different from OCLC, dual script records

  8. What Records to Send or Exclude? • Sending approximately 6.7 million out of 8.5 million bibliographic records as UTF-8 • Excluding: • MARCIVE • E-resource records • Suppressed bibs • Unsuppressed bibs with suppressed holdings records • In Process/On Order records • UNCAT records**

  9. Tracking our Records • MySQL database created • Bib IDs • Exclude Project ID (local tracking) • OCLC Project IDs • Reload Dates

  10. Tracking our Records • Used this to QA the results of the queries run to identify all potential records • Used this to push out files of bib ids by OCLC project ID to be used later to extract correct records to send to OCLC • Tracking was/is a big effort of reconciliation!

  11. Tracking our Records • As records are prepared for loading back into Voyager this MySQL database will be updated with those date(s) • OCLC will produce crossref reports and other processing reports per each file, but these are not concatenated into any form of a relational database
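The tracking database described in slides 9 to 11 might be sketched as follows. This is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for MySQL, and the table name, column names, and helper functions are all hypothetical, chosen to mirror the fields listed on the slides.

```python
import sqlite3

def create_tracking_db(conn):
    # One row per bib record; column names are hypothetical and mirror slide 9.
    conn.execute("""
        CREATE TABLE IF NOT EXISTS bib_tracking (
            bib_id             INTEGER PRIMARY KEY,
            exclude_project_id TEXT,   -- local tracking of excluded records
            oclc_project_id    TEXT,   -- project the record was sent under
            reload_date        TEXT    -- set when loaded back into Voyager
        )""")

def assign_project(conn, bib_ids, oclc_project_id):
    # Record which bib IDs were sent to OCLC under a given project ID.
    conn.executemany(
        "INSERT INTO bib_tracking (bib_id, oclc_project_id) VALUES (?, ?)",
        [(bib_id, oclc_project_id) for bib_id in bib_ids])

def bib_ids_for_project(conn, oclc_project_id):
    # Push out the list of bib IDs for one OCLC project, e.g. to drive an extract.
    rows = conn.execute(
        "SELECT bib_id FROM bib_tracking WHERE oclc_project_id = ? ORDER BY bib_id",
        (oclc_project_id,))
    return [row[0] for row in rows]

def mark_reloaded(conn, bib_ids, reload_date):
    # Update the reload date once records are prepared for loading back.
    conn.executemany(
        "UPDATE bib_tracking SET reload_date = ? WHERE bib_id = ?",
        [(reload_date, bib_id) for bib_id in bib_ids])

def not_yet_reloaded(conn, oclc_project_id):
    # QA helper: records sent under a project that have no reload date yet.
    rows = conn.execute(
        "SELECT bib_id FROM bib_tracking "
        "WHERE oclc_project_id = ? AND reload_date IS NULL ORDER BY bib_id",
        (oclc_project_id,))
    return [row[0] for row in rows]
```

Keeping the project ID and reload date on the same row is one simple way to answer the QA questions on slide 10 with a single query; the actual YUL schema is not shown in the slides.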

  12. Building an 079 Index in Voyager • Ex Libris contracted to generate and update Voyager indexes • Created in both Production and Test environments; took less than a day each time; downtime required; $ for service • Added 079|a and 079|z left-anchored indexes

  13. Building an 079 Index in Voyager • Updated SYSN Composite Index to include new 079 indexes: • 019|a • 035|a |z • 079|a |z • Indexes were mostly to assist staff in searching, but also for bulk import profiles for ongoing loads • Exploring how to use the new indexes in ongoing EOD or e-resource loads from vendors

  14. OCLC Pre-Processing • OCLC IBM Mainframe limitations • Sending records in 100MB limit/90,000 records per file AND only 15 files per day • Separating records with 880s from those without • Additionally, OCLC is splitting out PCC records from the YUS files
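The file limits on slide 14 can be illustrated with a short batching sketch. It assumes each record is a plain dict with hypothetical keys "bib_id", "tags" (the set of MARC tags present), and "size" (length in bytes), and it assumes "100MB" means 100 * 1024 * 1024 bytes; neither detail comes from the slides.

```python
MAX_RECORDS_PER_FILE = 90_000
MAX_BYTES_PER_FILE = 100 * 1024 * 1024  # assumption: binary megabytes
MAX_FILES_PER_DAY = 15

def partition_by_880(records):
    # Separate records that carry 880 (alternate script) fields from the rest.
    with_880 = [r for r in records if "880" in r["tags"]]
    without_880 = [r for r in records if "880" not in r["tags"]]
    return with_880, without_880

def split_into_files(records, max_records=MAX_RECORDS_PER_FILE,
                     max_bytes=MAX_BYTES_PER_FILE):
    # Start a new file whenever adding a record would break either limit.
    files, current, current_bytes = [], [], 0
    for record in records:
        too_many = len(current) >= max_records
        too_big = current_bytes + record["size"] > max_bytes
        if current and (too_many or too_big):
            files.append(current)
            current, current_bytes = [], 0
        current.append(record)
        current_bytes += record["size"]
    if current:
        files.append(current)
    return files

def schedule_by_day(files, files_per_day=MAX_FILES_PER_DAY):
    # Group files into daily submissions of at most files_per_day each.
    return [files[i:i + files_per_day] for i in range(0, len(files), files_per_day)]
```

At 90,000 records per file and 15 files per day, the ceiling is 1.35 million records submitted per day, so 6.7 million records need at least 75 files spread over at least 5 days.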

  15. OCLC Pre-Processing • Each set of files sent as a “project” with unique ID • Creating label files, tracking via spreadsheet • Suspended weekly exports to OCLC (9/5/2010-12/20/2010**)

  16. OCLC Pre-Processing • Deleting YUL IR records in WorldCat • Why? Easier matching? • 5.7 million removed total • EBScan software process • Match routines set: • Example: match on this field and that or ….

  17. Cross Ref Reports and Stats • Sample • Adding in prefixes of ocm and ocn • Other statistical reports
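Adding the ocm and ocn prefixes might look like the sketch below. It assumes the commonly used convention that numbers of up to 8 digits take "ocm" and are zero-padded to 8 digits, while 9-digit numbers take "ocn"; the slide names only the two prefixes, so the padding rule and the function name are assumptions.

```python
def prefixed_oclc_number(number):
    # Normalize to bare digits (drops leading zeros and surrounding whitespace).
    digits = str(int(str(number).strip()))
    if len(digits) <= 8:
        # Assumed convention: "ocm" plus the number zero-padded to 8 digits.
        return "ocm" + digits.zfill(8)
    if len(digits) == 9:
        # Assumed convention: "ocn" plus the 9-digit number.
        return "ocn" + digits
    # Longer numbers are outside what the slide describes.
    raise ValueError("number too long for ocm/ocn prefixes: " + digits)
```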

  18. Loading OCLC Numbers back into Orbis • Basic Process: • Retrieve crossref report to be used as input • Script to de-dupe crossref reports by name* • Extract MARC record using Voyager API BIB_RETRIEVE and crossref as input • MARC4Java Open Source to parse and update the MARC record* • Remove any version of *OCoLC* *ocm* *ocn* in 035|a • Insert new IR number from crossref report with a prefix of (OCoLC)

  19. Loading OCLC Numbers back into Orbis • Basic Process: • Comparing 079|a to crossref report—if same, move on, if new, just add and move on, if different, update with new one and report out old one • Remove any 079|z and report out • Prepare new file of MARC records for bulk import • Report out log summary of process, errors encountered, discrepancies in 079|a • See handouts!
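The 035 and 079 handling from slides 18 and 19 can be sketched in a few lines. The YUL program used MARC4Java and the Voyager API; the version below is a simplified Python illustration that models a record as a list of (tag, subfields) tuples, where subfields is a list of (code, value) pairs. It assumes that an 035 field is dropped whole when its |a contains OCoLC, ocm, or ocn, and that 079|a is compared against the crossref number in the same string form. Function and key names are hypothetical.

```python
import re

# Matches any flavor of OCLC number in an 035|a, per slide 18.
OCLC_MARKER = re.compile(r"OCoLC|ocm|ocn")

def update_record(fields, oclc_number):
    report = {"old_079a": [], "removed_079z": []}
    updated, seen_079 = [], False
    for tag, subfields in fields:
        if tag == "035" and any(code == "a" and OCLC_MARKER.search(value)
                                for code, value in subfields):
            continue  # drop old OCLC-style 035 (assumption: whole field)
        if tag == "079":
            seen_079 = True
            kept, has_a = [], False
            for code, value in subfields:
                if code == "z":
                    report["removed_079z"].append(value)  # remove and report out
                elif code == "a":
                    has_a = True
                    if value != oclc_number:
                        report["old_079a"].append(value)  # report the old number
                    kept.append(("a", oclc_number))
                else:
                    kept.append((code, value))
            if not has_a:
                kept.insert(0, ("a", oclc_number))
            updated.append((tag, kept))
            continue
        updated.append((tag, subfields))
    if not seen_079:
        updated.append(("079", [("a", oclc_number)]))  # new: just add
    # Insert the new number with the (OCoLC) prefix.
    updated.append(("035", [("a", "(OCoLC)" + oclc_number)]))
    updated.sort(key=lambda field: field[0])  # stable sort keeps tag order
    return updated, report
```

Returning the report alongside the record keeps the "report out" steps separate from the update itself, so the caller can write the log summary and discrepancy list once per file.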

  20. Loading OCLC Numbers back into Orbis • Will also be our new permanent workflow post-reconciliation: maintenance of these control numbers! • Cornell, Columbia and Stanford all used similar processes… • Original hope was to load 250,000 records per day, 4 days a week, for an estimated 6 weeks to reload everything back into Orbis…

  21. Loading OCLC Numbers back into Orbis • All depends on timing… OCLC processes 80K records in 1-2 days; for 6.7 million bibs that is 1.2 to 2.4 million records per month, or 3 to 6 months total to process our data! • We can keep pace with loading updated MARC data, but waiting 6 months is a big deal • Need to keep 1 day a week for all other load activity in Orbis
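The timing estimates on slides 20 and 21 can be checked with a little arithmetic. The sketch below assumes a 30-day month, which the slides do not state; the function names are hypothetical.

```python
TOTAL_BIBS = 6_700_000
OCLC_BATCH = 80_000

def oclc_processing(total, batch, days_per_batch, days_per_month=30):
    # Records OCLC gets through per month, and months needed for the full set.
    per_month = batch / days_per_batch * days_per_month
    return per_month, total / per_month

def weeks_to_reload(total, per_day=250_000, days_per_week=4):
    # Weeks needed to load everything back into Orbis at the hoped-for rate.
    return total / (per_day * days_per_week)

fast_rate, fast_months = oclc_processing(TOTAL_BIBS, OCLC_BATCH, days_per_batch=1)
slow_rate, slow_months = oclc_processing(TOTAL_BIBS, OCLC_BATCH, days_per_batch=2)
print(f"{slow_rate:,.0f} to {fast_rate:,.0f} records per month")
print(f"{fast_months:.1f} to {slow_months:.1f} months at OCLC")
print(f"{weeks_to_reload(TOTAL_BIBS):.1f} weeks to reload")
```

At one batch per day OCLC would finish in just under 3 months, and at one batch every two days in about 5.6 months, which matches the 3 to 6 month range. The reload itself works out to about 6.7 weeks, in line with the original 6-week estimate, so OCLC processing rather than local loading is the bottleneck.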

  22. Loading OCLC Numbers back into Orbis • Run a keyword regen once a week—even though keyword index not being updated • Program to extract and update MARC records can process 80K records in 15 minutes • Bulk import run no-key takes 2 hours to load 80K records • Minimize the loss of any staff changes

  23. Handling Errors • Reports from OCLC with no match records (validation errors) • Correcting anything in OCLC? • Correcting records in Voyager then re-submitting post-reclamation? • See handouts!

  24. Processing a “Gap” File • Suspended weekly exports to OCLC 9/5/2010 • Extracted a version of the bib record between 9/8/10 and 9/10/2010 • Identify and extract all changes and new records from 9/8/10, that have an 079|a and the last operator in History is not OCLCRECON • Send to OCLC as another one-off project
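The selection rule for the gap file can be expressed as a simple filter. This sketch assumes each record has already been summarized as a dict with hypothetical keys "bib_id", "modified" (a date), "has_079a", and "last_operator"; how those values are pulled from Voyager's history is not covered here.

```python
from datetime import date

GAP_START = date(2010, 9, 8)
RECON_OPERATOR = "OCLCRECON"

def select_gap_records(records, since=GAP_START, recon_operator=RECON_OPERATOR):
    # Keep records changed or created on or after the cutoff that have an
    # 079|a and whose last operator in History is not the reconciliation load.
    return [
        r["bib_id"]
        for r in records
        if r["modified"] >= since
        and r["has_079a"]
        and r["last_operator"] != recon_operator
    ]
```

Excluding records whose last operator is OCLCRECON avoids sending back records that changed only because the reconciliation load itself touched them.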

  25. What Staff Will Do During Reconciliation • No processing of holdings in OCLC • ILL OK • Will not create IR records so as not to affect matching • Work in Orbis as normal otherwise

  26. Modifications Needed to Resume Weekly Exports to OCLC • Two file streams needed: one for archival materials and one for everything else • PCC records will be split off once at OCLC • YUM records split off once at OCLC • New process/program created

  27. Lessons Learned So Far • Consistent application of standards across cataloging units (Suppressed, Suppressed!!!, In Process records, etc.) • What is your database of record? • How much time to spend on fixing records so they can be sent? • Maintenance of the control numbers long term

  28. Questions? • Thank you! • melissa.wisner@yale.edu
