
CONTENTdm Interoperability -- Leveraging resources; repurposing collections

CONTENTdm Interoperability -- Leveraging resources; repurposing collections. ALA Annual, New Orleans, LA, June 23rd, Friday, 9 am to noon. Claire Cocco, Product Manager; Geri Ingram, Customer Service Specialist; DiMeMa, Inc.


Presentation Transcript


  1. CONTENTdm Interoperability -- Leveraging resources; repurposing collections. ALA Annual, New Orleans, LA, June 23rd, Friday, 9 am to noon • Claire Cocco, Product Manager • Geri Ingram, Customer Service Specialist • DiMeMa, Inc.

  2. Agenda Part 1, 9:00 to 10:15 • Mainstream digital objects into existing workflows • Importing from legacy systems • Exporting • Example of collaborative development for interoperability: METS transform (courtesy of CDL) [BREAK 10:15 TO 10:30]

  3. Agenda Part 2 10:30 to 11:30 • Customizing and integrating your CONTENTdm site • Web templates • Custom Queries and Results • Configuration files

  4. Agenda Part 3 11:30 to Noon • Handling Finding Aids • Importing EAD files into CONTENTdm

  5. Setting the context: fully engaged in digital library transformation • Library services and collections expanding to encompass all • Traditional to digital • Licensed • Reformatted • Sharing • Preserving

  6. Leveraging resources • Staff time and skills throughout the organization and/or consortium • Existing metadata in some form • Existing digital collections (images and transcripts)

  7. Why? For better customer service • In order to mainstream your processing and amplify your efforts. • Your digital collections should ultimately be mainstreamed into regular workflows, similar to the ones used for other materials (whether that’s done centrally or in a distributed fashion). • This includes selection, technical processing (cataloging, organizing, importing), integration with site vis-à-vis presentation and archiving.

  8. Mainstreaming processing of digital formats (Part 1 of 3) • Importing from other systems to CONTENTdm • Exporting from CONTENTdm • Example of collaborative development for interoperability • CONTENTdm Standard Export • METS transform for import

  9. I. Importing from other systems to CONTENTdm • Metadata only • When records describe items that are not yet scanned • Replace “null” files at later time • Metadata AND their digital files

  10. From an OPAC or other database system. When you have… • Individual image files cataloged already • And can export from an OPAC or other DBMS. Or where you have compound digital objects ready for migration

  11. Migration steps: • Prepare the collection and the import files • Cross-walk metadata to Dublin Core • Configure the CONTENTdm collection fields • Export and prep data in a tab-delimited ASCII file • Import the file to CONTENTdm
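The cross-walk and export steps on this slide can be sketched as follows. This is a minimal illustration, not CONTENTdm's own tooling: the source field names in `CROSSWALK` and the sample record are hypothetical, and a real collection would need its own mapping to the Dublin Core fields it has configured.

```python
# Hypothetical crosswalk from local database field names to the Dublin Core
# field names configured in the collection. The file name is kept as the
# last column, as the data-prep slides recommend.
CROSSWALK = {
    "item_title": "Title",
    "photographer": "Creator",
    "year": "Date",
    "scan_file": "Filename",
}


def to_tab_delimited(records, crosswalk=CROSSWALK):
    """Return tab-delimited text: one header row, then one row per record."""
    lines = ["\t".join(crosswalk.values())]
    for rec in records:
        # Collapse tabs and line breaks inside a value to single spaces,
        # since either would break the row layout.
        cells = [" ".join(str(rec.get(src, "")).split()) for src in crosswalk]
        lines.append("\t".join(cells))
    # Exactly one line break at the end of the file.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = [{"item_title": "Main Street\tlooking north",
               "photographer": "Unknown", "year": "1910",
               "scan_file": "img001.tif"}]
    print(to_tab_delimited(sample), end="")
```

Normalizing whitespace at export time avoids the stray tabs and carriage returns that otherwise have to be hunted down after a failed import.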

  12. Data prep: Common problems in tab-delimited data files • Extra data in columns or rows • Extra tabs at end of line • Extra CRs at end of file (should only be 1 CR) • Carriage return in metadata, tab in metadata • Files named in the data must exist • Digit 0 confused with letter O • The error may occur in a previous record; check a few rows before and after the reported error • File names are required, not full pathnames
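Several of the problems listed on this slide can be detected mechanically before an import is attempted. The checker below is a rough sketch under stated assumptions: the first row is a header, the last column holds the file name, and the function name `check_tab_file` is made up for this example. It does not check that the files exist or distinguish 0 from O.

```python
def check_tab_file(text):
    """Return a list of (line_number, problem) found in tab-delimited text.

    Line number 0 is used for problems that concern the file as a whole.
    """
    problems = []
    if text.endswith(("\n\n", "\r\n\r\n")):
        problems.append((0, "more than one line break at end of file"))

    rows = [row.rstrip("\r") for row in text.rstrip("\r\n").split("\n")]
    expected = len(rows[0].split("\t"))  # column count taken from the header

    for number, row in enumerate(rows[1:], start=2):
        cells = row.split("\t")
        if row.endswith("\t"):
            problems.append((number, "extra tab at end of line"))
        elif len(cells) != expected:
            # A tab or line break inside a metadata value also shows up here,
            # sometimes on the row after the one that is actually wrong.
            problems.append(
                (number, f"expected {expected} columns, found {len(cells)}"))
        filename = row.rstrip("\t").split("\t")[-1]
        if "/" in filename or "\\" in filename:
            problems.append(
                (number, "full path given; only the file name is needed"))
    return problems


if __name__ == "__main__":
    sample = "Title\tFilename\nA\ta.tif\t\nB\tC:\\scans\\b.tif\n"
    for line_number, problem in check_tab_file(sample):
        print(line_number, problem)
```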

  13. Data prep: Troubleshooting with Excel • Use Microsoft Excel to open the file and view data • Each row should be an item with last column as filename • Work with small batches to find errors – keep adding items until record with error is found • Use Excel’s “CLEAN” function to remove invisible characters • Import images from directory without using tab delimited file • Checks for any type of imaging errors
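For files too large to inspect comfortably in Excel, the effect of the CLEAN function (which removes the nonprinting ASCII characters with codes 0 to 31) can be approximated in a script. This is a sketch only; the function names are invented for the example. Because a tab is itself one of those characters, the line is split into cells first and each cell is cleaned separately.

```python
def clean(value):
    """Drop ASCII control characters (codes 0-31) from one cell."""
    return "".join(ch for ch in value if ord(ch) >= 32)


def clean_cells(line):
    """Split a tab-delimited line into cells and clean each cell."""
    return [clean(cell) for cell in line.rstrip("\r\n").split("\t")]


if __name__ == "__main__":
    print(clean_cells("Main\x0b Street\timg001.tif\r\n"))
```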

  14. Demo: MARC to DC • Export MARC records to tab-delimited text file (using ILS or MarcEdit) • Format and clean up the text file to conform to your CONTENTdm Collection schema • Import the file (with or without images) to the Collection
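The "format and clean up" step amounts to relabeling MARC-tagged columns with Dublin Core field names. The sketch below assumes each exported row is available as a dictionary keyed by MARC tag, which is an assumption about the export layout and not something the slide specifies. The four tag mappings are a simplified subset of the usual MARC to Dublin Core crosswalk, and joining repeated values with "; " is likewise an assumption.

```python
# Simplified subset of a MARC-to-Dublin-Core crosswalk (illustrative only).
MARC_TO_DC = {
    "100": "Creator",      # main entry, personal name
    "245": "Title",        # title statement
    "520": "Description",  # summary note
    "650": "Subject",      # topical subject heading
}


def crosswalk_row(marc_row, mapping=MARC_TO_DC):
    """Map one exported row (MARC tag -> value) to Dublin Core fields.

    Only the first three characters of each key are treated as the tag, so
    repeated columns such as "650" and "650_2" land in the same DC field.
    Unmapped tags are dropped.
    """
    collected = {}
    for tag, value in marc_row.items():
        name = mapping.get(tag[:3])
        if name and value:
            collected.setdefault(name, []).append(value.strip())
    return {name: "; ".join(values) for name, values in collected.items()}


if __name__ == "__main__":
    row = {"245": "View of Canal Street ", "100": "Smith, John",
           "650": "Streets", "650_2": "Louisiana", "999": "local use"}
    print(crosswalk_row(row))
```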

  15. Importing compound objects • For documents, postcards, monographs and picture cubes • Can do singly or in batch • Much easier to start with singles, then set up for batch when process is smooth

  16. Migrate compound objects from another database system Where you have many compound digital objects to migrate • Prepare the collection and the import files • Cross-walk metadata to Dublin Core • Configure the CONTENTdm collection fields • Configure folders for scans and transcripts (if appropriate) • Choose an import method based on your data structure • Create tab-delimited ASCII file(s) appropriate to the method • Import the files to CONTENTdm in batches

  17. Multiple compound object wizard • Documented in online tutorial • Today’s demo described in handout • Four import methods for multiple object loading • Compound object (same as single, but upload batched) • Directory Structure (most flexible and efficient) • Object List (useful when NO page-level metadata) • Job List • Time allowing, demonstrate three different object types using 3 of 4 methods

  18. Choose a multiple compound import method based on your data

  19. [Flowchart: choosing a multiple compound object import method] • Are your scan files separated into compound object directories? If no: create compound object directories for EACH compound object. • Are they all the same type of compound object? If no: break up into batches by type. • Do you have one tab-delimited text file containing ALL the objects? • Do you have page-level metadata for the compound objects? • Do you have tab-delimited text files for EACH compound object? If no: create a text file listing all compound objects and object metadata, or create a text file for each compound object. • Outcomes shown: DIRECTORY STRUCTURE, DIRECTORY STRUCTURE, OBJECT LIST

  20. Every one of the four CONTENTdm compound object importing methods • Requires object-level metadata • Requires preparation • File–naming, keeping sort order in mind • Each object has own directory for scans • May use tab-delimited text file(s) • Accommodates transcripts
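The point about file naming and sort order is easiest to see with page numbers: sorted as text, `page10.tif` comes before `page2.tif`. One common remedy is to zero-pad the numbers before import. The helper below is a sketch with an invented name and a sample naming pattern; it pads the last run of digits in the name and leaves the extension alone.

```python
import os
import re


def padded_name(filename, width=4):
    """Zero-pad the last run of digits before the extension."""
    stem, ext = os.path.splitext(filename)
    # (\d+)(?!.*\d) matches a run of digits with no digit anywhere after it.
    stem = re.sub(r"(\d+)(?!.*\d)",
                  lambda m: m.group(1).zfill(width), stem)
    return stem + ext


if __name__ == "__main__":
    names = ["page10.tif", "page2.tif", "page1.tif"]
    print(sorted(names))                          # text order, pages misplaced
    print(sorted(padded_name(n) for n in names))  # page order
```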

  21. A word about descriptive page-level metadata • Supported by some, but not all, of the four import methods • NOT supported by Object List • At page level, Title is the only required field • Technical metadata can be generated by the Template Creator

  22. More on transcripts • Typescripts and transcripts • Requires a field designated as the data type “Full Text Search” • Inserted into the metadata field of the scanned page • During import • Through use of .txt file found, or • By Template Creator • If OCR Extension in use • Or by “Directory Import” as with early versions of CONTENTdm • Transcripts and typescripts are supported by all four methods (i.e., not considered “metadata” for purposes of this discussion)

  23. Demo: Import Multiple Compound Objects • Monograph using Compound Object method • Postcards using Object List method • Documents using Directory Structure method

  24. II. Exporting from CONTENTdm • To ascii tab-delimited with field headers • To xml: • Standard Dublin Core —only DC • Custom—all fields, including local but not structure • CDM Standard—all fields, including structure
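A tab-delimited export with field headers is straightforward to reuse in other tools. The sketch below reads such a file into one dictionary per record using Python's standard `csv` module. The field names in the sample are hypothetical, and the XML export formats are not covered because their structure is not described on the slide.

```python
import csv
import io


def read_export(text):
    """Parse tab-delimited text with a header row into a list of dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)  # keep quotes literal
    return [dict(row) for row in reader]


if __name__ == "__main__":
    sample = "Title\tDate\nCanal Street\t1910\n"
    for record in read_export(sample):
        print(record["Title"], record["Date"])
```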

  25. III. Examples of collaboration for interoperability • Web integration through search engines, RSS • OAI harvesting • Enable at collection or server level • Choose to suppress <pagedata> or not • WorldCat registration • Open WorldCat integration
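From a harvester's side, OAI harvesting begins with an OAI-PMH request such as `ListRecords` asking for unqualified Dublin Core (`oai_dc`). The sketch below only builds the request URL. The base URL and set name are placeholders, since the slide does not give the endpoint of any particular server.

```python
from urllib.parse import urlencode


def list_records_url(base_url, set_spec=None, metadata_prefix="oai_dc"):
    """Build an OAI-PMH ListRecords request URL.

    base_url is the repository's OAI endpoint; set_spec optionally limits
    the harvest to one set (for example, a single collection).
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Placeholder endpoint and set name, for illustration only.
    print(list_records_url("http://example.org/oai", "postcards"))
```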

  26. CONTENTdm and a new METS transform • Info available on USC in July • Code at SourceForge • Windows-oriented

  27. The CONTENTdm to METS conversion tool

  28. What is/are METS? Why is/are METS good? What is 7train? How do I use 7train? What do I get from 7train? How do I get 7train?

  29. What is/are METS? METS (Metadata Encoding and Transmission Standard) is an XML-based standard for encoding metadata to describe objects (digital or otherwise) within a digital library. See http://www.loc.gov/standards/mets/ for more information

  30. What is/are METS? Sections of a METS document: • metsHdr: Metadata about this particular METS - encoder, contact info, etc. • dmdSec: Descriptive metadata - title, author, subjects, etc. • amdSec: Metadata for the management of the object: technical details, object history, etc. • fileSec: A list of files that make up the object • structMap: Description of the structure of the object, i.e. how the files fit together • behaviorSec: What to do with the object: machine actionable instructions • Yellow elements/tags are required; all others are optional
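To make the section names concrete, the sketch below builds an empty outline of a METS document with Python's standard XML library. It shows only the ordering and nesting of the top-level sections. It is not a valid METS record, since the real sections need attributes and content, and it is not how 7train produces its output.

```python
import xml.etree.ElementTree as ET

METS_NS = "http://www.loc.gov/METS/"
SECTIONS = ["metsHdr", "dmdSec", "amdSec", "fileSec",
            "structMap", "behaviorSec"]


def mets_skeleton():
    """Return a <mets> element with one empty child per top-level section."""
    ET.register_namespace("mets", METS_NS)
    root = ET.Element(f"{{{METS_NS}}}mets")
    for name in SECTIONS:
        ET.SubElement(root, f"{{{METS_NS}}}{name}")
    return root


if __name__ == "__main__":
    print(ET.tostring(mets_skeleton(), encoding="unicode"))
```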

  31. Why METS? To be able to add your objects to other collections and increase the visibility of your institution's assets.

  32. What is 7train? 7train is an XSL-based tool for converting XML documents - in this case CONTENTdm exports describing objects managed in the CONTENTdm system - into METS objects suitable for submission to a digital library system, such as the California Digital Library's Online Archive of California. 7train is a platform-independent, standalone tool that was designed to work on any system and to be simple to use.

  33. How does 7train work? It is as easy as dragging your CONTENTdm XML export file onto an executable file.

  34. How does 7train work?

  35. How does 7train work? What do you get?

  36. Output: A Sample METS document

  37. References & Links 7train Home: http://seventrain.sourceforge.net 7train Download: http://seventrain.sourceforge.net/7train_download.html CONTENTdm: http://www.dimema.com METS: http://www.loc.gov/standards/mets/ XSL: http://www.w3.org/Style/XSL/ The California Digital Library: http://www.cdlib.org The Online Archive of California: http://www.oac.cdlib.org

  38. [Diagram: CONTENTdm interoperability] • CONTENTdm: existing libraries and new libraries (10K / 50K / unlimited objects) • Connected systems: other CONTENTdm sites, CONTENTdm Multi-Site Server, OPACs, regional union catalog, other digital archives, WorldCat and Open WorldCat • Exchange formats and protocols: DC, XML, OAI, MARC records • Audiences: librarians, archivists…, and library users on the Web

  39. BREAK: 15 minutes • This concludes Part 1 • To come after the break: • Part 2: Customization • Part 3: Finding Aids

  40. Customizing and integrating your CONTENTdm site (Part 2 of 3) • Web templates • Custom Queries and Results • Configuration files

  41. CONTENTdm Web Templates • Customizable for integration • Designed to support broad range of users • Small to large organizations • Beginners to experts • Use out of the box with minimal customization • Basic customization requires minimal HTML skills • Fully customize including advanced extensions • Based on a PHP API (Hypertext Preprocessor and Application Program Interface)

  42. Basic Customizations • Minimal skills needed • Easy to make changes • Global include files • Variables • Recommend all organizations do basic customizations • Header (name/logo), contact e-mail address, colors, about page, home page http://www.contentdm.com/help4/custom/templates.html

  43. Getting Started • Access to Web server docs directory • HTML editor or text editor • Design plan • Logo or other graphics • Backup copy of original files

  44. Customization Demo • http://sr.contentdmdemo.com • Files located in /cdm4 directory • /includes/global_header.php • /client/LOC_global.php • /client/STY_global_style.php • about.php • browse.php • results.php • New logo saved in /cdm4/images/

  45. Advanced Customizations • Experience with HTML, PHP, and JavaScript needed • Customize looks for each collection • University of Nevada, Reno • Web Template extensions • E-commerce (University of Utah, Oregon State University) • Comment forms (SENYLRC, Enoch Pratt Free Library, OSU) • Custom metadata display (University of Oregon) • QuickTime video (Williams College) • http://www.contentdm.com/customers/index.html

  46. Examples of Advanced Customizations • University of Nevada, Reno http://imageserver.library.unr.edu/ • University of Utah http://www.lib.utah.edu/digital/bodmer/ • Oregon State University http://digitalcollections.library.oregonstate.edu/cdm4/client/bracero/ • SENYLRC http://www.hrvh.org/ • Enoch Pratt Free Library http://www.mdch.org/ • Williams College http://contentdm.williams.edu/

  47. Customization Tips • Always make a backup! • Be aware of encoding (UTF-8 vs. ASCII) • See what other users are doing • Share, borrow, and copy ideas and code • http://www.contentdm.com/customers/index.html • Listserv • Document changes • Document which files are edited and what code changes are made to ease upgrading to newer versions
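On the encoding tip: before and after editing a template file, it can help to confirm whether its bytes are plain ASCII, valid UTF-8, or something else (for example, a legacy Windows code page). The check below is a generic sketch with an invented function name. It says nothing about which encoding a given CONTENTdm installation expects.

```python
def describe_encoding(data: bytes) -> str:
    """Classify raw file bytes as 'ascii', 'utf-8', or 'unknown'."""
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        return "unknown"


if __name__ == "__main__":
    print(describe_encoding(b"About this collection"))
    print(describe_encoding("Café".encode("utf-8")))
    print(describe_encoding("Café".encode("latin-1")))
```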

  48. Custom Queries and Results (CQR) • Create predefined, custom queries • Virtual collections • Guide users to specific results • Integrate with other sites • Multiple options • Simple hyperlink, drop-down list, index box, text box, browse • Easy to use • Wizard generates code to copy and paste into Web pages • Documentation • http://www.contentdm.com/help4/custom/cqr.html • http://www.contentdm.com/USC/tutorials/cqr.pdf

  49. CQR DEMO • Generate code using CQR • Copy and paste into Web pages • May need to change path • Customize as desired

  50. Configuration Files • Customizable files that reside on the server • Stop words • Full text field stop words – fullstop.txt • Automatic hyperlink stop words – stopwords.txt • http://www.contentdm.com/help4/custom/stopwords.html • Image viewer • Customize how images are displayed – imageconf.txt • For all collections or per collection • http://www.contentdm.com/help4/custom/zoompan.html
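As a general illustration of what a stop-word list does, the sketch below drops listed words from a field value before the remaining terms are indexed or hyperlinked. It assumes one word per line in the list, which is an assumption about the file layout, and it does not reproduce CONTENTdm's actual handling of `fullstop.txt` or `stopwords.txt`.

```python
def load_stop_words(text):
    """Parse stop-word text, assuming one word per line (blank lines ignored)."""
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


def indexable_terms(value, stop_words):
    """Return the lowercased words of a field value that are not stop words."""
    return [word for word in value.lower().split() if word not in stop_words]


if __name__ == "__main__":
    stop_words = load_stop_words("the\nof\n\nand\n")
    print(indexable_terms("The Port of New Orleans", stop_words))
```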
