Digital Library Collections & Services: Landscape & Strategies. Roy Tennant, California Digital Library. Questions, Questions, Questions. You will leave with more questions than answers. If I do my job right, they will be the right questions. Feel free to ask questions as we go along.
Digital Library Collections & Services: Landscape & Strategies • Roy Tennant • California Digital Library
Questions, Questions, Questions • You will leave with more questions than answers • If I do my job right, they will be the right questions • Feel free to ask questions as we go along
“It was the best of times, it was the worst of times…” — A Tale of Two Cities, Charles Dickens
The Reality • Too many information sources with which to cope • A lack of human assistance at the time and place needed • Not enough ways to filter, sort, and otherwise narrow in on what is needed • Access is limited to what is free, or what has been purchased or rented on behalf of a clientele • Many useful resources are only available in print
The Most Commonly Proposed Solution The Digital Library
Digital Library Myths • Having everything in digital form will solve our information access problems • Soon (or eventually) everything will be digital • Any collection of digital objects can be a digital library • Everyone agrees about what comprises a “digital library” and how to build one — all we must do is do the work
A Digital Library Is… • A collection of digital objects and/or information that is: • Selected • Organized • Made Accessible • Preserved • A set of services that help you to find and use those objects and information • Often supported by a physical collection and always by professional staff
Outline • Themes: • Library Catalogs • Digital Library Collections • Digital Reference • Libraries as Publishers • Cross-Database Searching • Technologies: • XML • Metadata • Interoperability
Library Catalogs • We seem to be unable to provide an easy and effective information locating tool • Keep in mind that only librarians like to search, everyone else likes to find • We are even failing at things we have explicitly tried to do • Let’s take a look at the evidence…
Typical Searches • Known Item • “A Few Good Things” • Comprehensive
Typical Searches: Known Item • The good: searches can be limited to a particular field: author, title, etc. • The bad: limiting to a particular field doesn’t always act the way you expect
Typical Searches: “A Few Good Things” • The one type of search we have so far ignored in library system design • A type of search that we can do something about today • Bring Google-style relevance to library catalogs (e.g., for union catalogs, sort by number of holding libraries)
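The holdings-based ranking suggested above can be sketched in a few lines of Python. This is a minimal illustration only: the `Record` class, its `holding_libraries` field, and the sample titles are hypothetical and do not come from any real catalog system.

```python
from dataclasses import dataclass


@dataclass
class Record:
    title: str
    holding_libraries: int  # hypothetical field: number of member libraries owning the item


def rank_by_holdings(records):
    """Return records with the most widely held items first."""
    return sorted(records, key=lambda r: r.holding_libraries, reverse=True)


# Made-up search results from an imagined union catalog.
results = [
    Record("Obscure Pamphlet", 1),
    Record("Standard Textbook", 42),
    Record("Regional History", 7),
]

ranked = rank_by_holdings(results)
for record in ranked:
    print(record.holding_libraries, record.title)
```

The assumption behind this ordering is that an item held by many libraries is more likely to be one of the "few good things" a user wants; a production system would probably combine it with other signals.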
Typical Searches: Comprehensive • Most library catalogs hide many things available via regional cooperative or ILL • It is difficult, if not impossible, to search all appropriate journal databases • Most libraries do not provide good access to gray literature and web sites • Subject headings are often unintuitive, and catalogs give no guidance • Catalogs give no chapter-level access to book content
The Rescue of Print • Many library users want only that which is convenient (read: digital) • Print resources are, therefore, increasingly overlooked (I call this the “convenience catastrophe”) • We must fight this trend by enriching our catalog records with tables of contents, indexes, book covers, etc. to entice users to print books
Digital Library Collections • Applying the digital versions of traditional library activities (selecting, acquiring, etc.) to put library collections online • One of the best-understood digital library activities • Many examples, resources, consultants, and vendors are available to help a library digitize its collections
Digital Reference • Putting the human help where it’s needed — online • Software is now available that provides for: • Queuing of patrons with audible alerts • Chat between librarian and user • Push web pages • Form sharing • Highlighting • Saved and/or emailed transcripts • Statistics
XML • A method of creating and using tags to identify the structure and contents of a document — not how it should be displayed • The tags used can be arbitrary or can come from a specification • There are two types: well formed and valid…
Well-Formed XML • Follows general tagging rules: • All tags begin and end • But can be minimized if empty: <br/> • All tags are properly nested: • <author><name>Mark Twain</name></author> • All attribute values are quoted: • <subject scheme="LCSH">Music</subject> • Software can check to make sure a given document follows these basic rules
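As a rough illustration of software checking these rules, the sketch below feeds a few snippets to the XML parser in Python's standard library, which rejects anything that is not well formed. The helper name `is_well_formed` and the sample snippets are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text):
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


samples = {
    "properly nested": "<author><name>Mark Twain</name></author>",
    "minimized empty tag": "<p>line one<br/>line two</p>",
    "quoted attribute": '<subject scheme="LCSH">Music</subject>',
    "badly nested": "<author><name>Mark Twain</author></name>",
    "unquoted attribute": "<subject scheme=LCSH>Music</subject>",
    "tag never ends": "<author><name>Mark Twain</name>",
}

for label, snippet in samples.items():
    print(f"{label}: {is_well_formed(snippet)}")
```

The first three snippets follow the rules listed above and are accepted; the last three each break one rule and are rejected.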
Valid XML • Uses only specific tags and rules as codified by one of: • A document type definition (DTD) • A schema definition • Only the tags listed by the schema or DTD can be used • Software can take a DTD or schema and verify that a given document adheres to the rules • Editing software can prevent an author from using anything except allowed tags
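Real validation is done by a validating parser working from a DTD or schema. Python's standard library does not include one, so the sketch below only imitates the idea: a hand-written table of allowed child tags stands in for the DTD. The table, the tag names, and the function `find_violations` are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical rules standing in for a DTD: each tag maps to the child tags it may contain.
ALLOWED_CHILDREN = {
    "book": {"title", "author"},
    "author": {"name"},
    "title": set(),
    "name": set(),
}


def find_violations(element, path=""):
    """Return a list of messages for tags that the rules do not allow."""
    here = f"{path}/{element.tag}"
    if element.tag not in ALLOWED_CHILDREN:
        return [f"{here}: tag not allowed"]
    problems = []
    for child in element:
        if child.tag not in ALLOWED_CHILDREN[element.tag]:
            problems.append(f"{here}/{child.tag}: not allowed inside <{element.tag}>")
        else:
            problems.extend(find_violations(child, here))
    return problems


good = ET.fromstring(
    "<book><title>Roughing It</title><author><name>Mark Twain</name></author></book>"
)
bad = ET.fromstring(
    "<book><title>Roughing It</title><publisher>Unknown</publisher></book>"
)

print(find_violations(good))
print(find_violations(bad))
```

Both documents are well formed, but only the first uses nothing except the listed tags. A real DTD or schema can also constrain order, repetition, and attributes, which this sketch ignores.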
Transforming XML • Extensible Stylesheet Language Transformations (XSLT) • A markup language and programming syntax for processing XML • Used to transform XML to another format (e.g., to HTML for delivery to standard web clients) or from one set of tags to another • An XML parser • A method to bring all the pieces together if serving to the web (e.g., a CGI program or Java servlet)
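To make "from one set of tags to another" concrete, here is a small Python sketch that maps descriptive tags to HTML tags by hand. It is not XSLT: Python's standard library has no XSLT processor, so a plain function plays the role a stylesheet would normally play. The tag names and the `TAG_MAP` table are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from descriptive tags to HTML tags.
TAG_MAP = {"book": "div", "title": "h1", "para": "p"}


def transform(source):
    """Copy an element tree, renaming each tag according to TAG_MAP."""
    target = ET.Element(TAG_MAP.get(source.tag, "span"))  # unknown tags become <span>
    target.text = source.text
    target.tail = source.tail
    for child in source:
        target.append(transform(child))
    return target


book = ET.fromstring(
    "<book><title>Roughing It</title><para>First paragraph.</para></book>"
)
html = ET.tostring(transform(book), encoding="unicode")
print(html)  # <div><h1>Roughing It</h1><p>First paragraph.</p></div>
```

An XSLT stylesheet expresses the same mapping declaratively as template rules, which keeps the transformation in a file that can change without touching program code.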
Metadata • Structured information about an object or collection of objects • Types: • Descriptive • Administrative • Structural • Preservation
Metadata Standards • Dublin Core: a set of basic fields primarily for systems interoperability • MODS: a MARC-like bibliographic format • METS: a structural standard for encapsulating a digital object or set of digital objects, including one or more segments of descriptive and/or administrative metadata
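As a small illustration of the "basic fields" idea, the sketch below builds a record with three Dublin Core elements using Python's standard library. The namespace URI is the published Dublin Core elements namespace; the wrapper element `record` and the helper `dublin_core_record` are assumptions for this example, not part of the standard.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core elements namespace
ET.register_namespace("dc", DC)


def dublin_core_record(fields):
    """Build a simple record from (element name, value) pairs."""
    record = ET.Element("record")  # hypothetical wrapper element
    for name, value in fields:
        ET.SubElement(record, f"{{{DC}}}{name}").text = value
    return record


record = dublin_core_record([
    ("title", "A Tale of Two Cities"),
    ("creator", "Charles Dickens"),
    ("date", "1859"),
])

print(ET.tostring(record, encoding="unicode"))
```

Because every system that understands Dublin Core agrees on what `title`, `creator`, and `date` mean, a record like this can be exchanged even between systems that store much richer metadata internally.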
Libraries as Publishers • Libraries are increasingly becoming involved with publishing activities • University libraries are capturing scholarship before it leaves campus, and making it freely available to all • Two examples: • Repositories • Book publishing
Repositories • Two flavors: • Institutional • Topic or type focused • Characteristics: • Often author-maintained; therefore metadata may be of uneven quality/quantity • Often compliant with the Open Archives Initiative harvesting protocol • Benefits: • Captures gray literature not always collected by libraries • If OAI-compliant, can be “crawled” and indexed
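The "crawling" of an OAI-compliant repository can be sketched as follows. The `ListRecords` verb, the `oai_dc` metadata prefix, and the XML namespaces belong to the harvesting protocol; the endpoint URL, the function names, and the trimmed sample response are made up for this example, and no network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

BASE_URL = "https://repository.example.edu/oai"  # hypothetical repository endpoint
DC = "http://purl.org/dc/elements/1.1/"


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the request a harvester would send to ask for all records."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


# A made-up, heavily trimmed response of the kind a repository might return.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Sample Working Paper</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def harvested_titles(response_text):
    """Pull every Dublin Core title out of a ListRecords response."""
    root = ET.fromstring(response_text)
    return [element.text for element in root.iter(f"{{{DC}}}title")]


print(list_records_url(BASE_URL))
print(harvested_titles(SAMPLE_RESPONSE))
```

An aggregator would send such requests to many repositories and index the returned metadata together, which is how uneven, author-maintained records still become findable.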
The OAI Model • [Diagram labels: MIT; Stanford; UC; Aggregation; Portal; Subject Portal]
Book Publishing • Academic libraries and university presses are teaming up: • Libraries provide technical expertise, online access, persistence, professional collection management • University presses provide editing, print publication, imprimatur, marketing • Case Study: University of California Press and the California Digital Library
[Diagram labels: Transformation; XSLT Stylesheet; Information; Presentation; Book encoded in XML; XHTML Document (no display markup)*; Web Server; HTML Stylesheet (CSS). * Dynamic document]