Building a Digital Library with Fedora. International Conference on Developing Digital Institutional Repositories, Hong Kong, December 9, 2004
UVA Digital Library Assumptions: • All media, all content types integrated into one collection • A network that is built to be a part of a global network • The global network will be built by libraries, governments and corporations • Searching and browsing are equally important
UVA Digital Library Assumptions (cont.): • We will provide tools for accessing and making use of our collections • Any given resource can be presented in any number of contexts • Increasingly, we will be faced with born-digital materials • This is going to take a very long time …
The Flexible Extensible Digital Repository Architecture • Developed at Cornell under an NSF grant • UVA Library re-interpreted the architecture and created the first practical implementation • Three-year project funded in 2001 by the Andrew W. Mellon Foundation to create an open-source system • Another three years of development funded by Mellon in 2004
Fedora is a set of web services that can provide a foundation for a variety of information management strategies. • Supports client applications through SOAP or HTTP connections • Provides back-end web services for content through behavior objects • Provides both management and access APIs • Provides a search index aimed at repository management • Fine-grained policy enforcement
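As a rough illustration of the HTTP access route mentioned on this slide, the sketch below builds a GET URL that names an object, a behavior definition, and a method. The path layout (`/get/PID/behaviorDefinitionPID/method`), the base URL, and the function name `dissemination_url` are assumptions made for illustration, not a documented interface; check the documentation of the Fedora release in use for the real syntax.

```python
from urllib.parse import quote, urlencode


def dissemination_url(base_url, pid, bdef_pid, method, params=None):
    """Build a hypothetical HTTP GET URL for one dissemination request.

    The path layout used here is an assumption for illustration only.
    """
    # Keep ':' readable inside PIDs, but escape anything else that is unsafe.
    path = "/".join(quote(part, safe=":") for part in (pid, bdef_pid, method))
    url = f"{base_url.rstrip('/')}/get/{path}"
    if params:
        # Sort parameters so the same request always yields the same URL.
        url = f"{url}?{urlencode(sorted(params.items()))}"
    return url


# Hypothetical PIDs and host, chosen only for the example.
print(dissemination_url("http://localhost:8080/fedora", "demo:1",
                        "demo:bdef-image", "getPreview", {"width": 200}))
```

A SOAP client would carry the same three pieces of information (object, behavior definition, method) in a message body rather than in the URL.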
A data object is one unit of content. [Diagram: parts of a data object] • Persistent ID (PID): digital object identifier • System Metadata: metadata about history and policies • Datastreams (items): set of content or metadata items • Disseminators (Default, plus your extensions): methods for disseminating “views” of content
[Diagram: the three object types and how they are linked] • Behavior Definition Object: Persistent ID (PID), System Metadata, Datastreams (Behavior Definition Metadata) • Behavior Mechanism Object: Persistent ID (PID), System Metadata, Datastreams (Service Binding Metadata (WSDL)), bound to a Web Service • Data Object: Persistent ID (PID), System Metadata, Datastreams, Disseminators • Labeled links: behavior subscription, behavior contract, data contract
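The object model on the two preceding slides can be sketched as a few plain data classes. This is a minimal illustration, not Fedora's actual storage format: the class and field names (`Datastream`, `Disseminator`, `DataObject`, `bdef_pid`, `bmech_pid`) and the sample PIDs are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Datastream:
    ds_id: str        # identifier of the item within its object
    mime_type: str
    content: bytes


@dataclass
class Disseminator:
    bdef_pid: str     # behavior definition object: the contract (method names)
    bmech_pid: str    # behavior mechanism object: the service binding


@dataclass
class DataObject:
    pid: str                                            # persistent ID
    system_metadata: dict = field(default_factory=dict)
    datastreams: dict = field(default_factory=dict)     # ds_id -> Datastream
    disseminators: list = field(default_factory=list)

    def add_datastream(self, ds):
        self.datastreams[ds.ds_id] = ds

    def subscribes_to(self, bdef_pid):
        """True if the object carries a disseminator for this behavior definition."""
        return any(d.bdef_pid == bdef_pid for d in self.disseminators)


# Hypothetical page-image object with one datastream and one disseminator.
page = DataObject(pid="demo:page1", system_metadata={"label": "Page 1"})
page.add_datastream(Datastream("IMAGE", "image/jpeg", b"placeholder bytes"))
page.disseminators.append(Disseminator("demo:bdef-image", "demo:bmech-jpeg"))
print(page.subscribes_to("demo:bdef-image"))
```

Keeping the behavior definition and the behavior mechanism as separate references mirrors the slide's split between the contract an object subscribes to and the web service that fulfils it.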
Disseminators for Data Normalization • Can deliver datastream content directly • Can transform content into other sizes or formats for delivery • Can be used to hide differences among objects of a given type
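A small sketch of the normalization idea: two objects store the same kind of content differently, and a single behavior hides that difference from the client. Text encodings stand in here for image sizes or file formats; the names `get_text`, `MECHANISMS`, and the `storage` key are hypothetical and chosen only for the example.

```python
def deliver_directly(datastream: bytes) -> str:
    """The datastream is already in the delivery format (UTF-8 here)."""
    return datastream.decode("utf-8")


def transcode_latin1(datastream: bytes) -> str:
    """The datastream is stored in a legacy encoding and converted on delivery."""
    return datastream.decode("latin-1")


# One mechanism per storage variant; clients never pick one themselves.
MECHANISMS = {
    "utf8-text": deliver_directly,
    "latin1-text": transcode_latin1,
}


def get_text(obj: dict) -> str:
    """The single behavior a client calls, whatever the stored form is."""
    return MECHANISMS[obj["storage"]](obj["datastream"])


modern = {"pid": "demo:1", "storage": "utf8-text",
          "datastream": "café".encode("utf-8")}
legacy = {"pid": "demo:2", "storage": "latin1-text",
          "datastream": "café".encode("latin-1")}

# Both objects answer the same call with the same kind of result.
print(get_text(modern) == get_text(legacy))
```

Supporting a further storage variant then means registering one more mechanism, with no change on the client side.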
Disseminators as User Interface • Can deliver a “module” of user interface appropriate for the object • Different user interfaces for different purposes or audiences • Easy to add new types of collections by adding new modules of code • The set of all behavior objects can be used as a database of code modules • Can provide a way to collect the “look and feel” of scholarly projects in a formal way
Relationships Among Objects • Relationship metadata datastream in the data object • Describes adjacency relationships among objects • RDF data of the form: PID – typeOfRelationship – relatedObjectPID • Can be used to assemble collections for such things as creating full-text search indexes • Can build graphs of relationships to feed into a variety of user interfaces
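To make the triple form concrete, the sketch below stores relationships as (PID, typeOfRelationship, relatedObjectPID) tuples and walks them backwards to gather everything that belongs, directly or indirectly, to one collection, for example as input to a full-text indexer. The relationship names (`isMemberOf`, `isPartOf`), the PIDs, and the function `assemble_collection` are illustrative assumptions.

```python
from collections import defaultdict, deque


def assemble_collection(triples, root_pid):
    """Return every PID whose chain of relationships leads to root_pid."""
    # Reverse adjacency: related object -> objects that point at it.
    incoming = defaultdict(list)
    for subject, _relation, related in triples:
        incoming[related].append(subject)

    members, queue = set(), deque([root_pid])
    while queue:
        current = queue.popleft()
        for subject in incoming[current]:
            if subject != root_pid and subject not in members:
                members.add(subject)
                queue.append(subject)
    return members


# Hypothetical relationship data for two collections.
triples = [
    ("demo:book1", "isMemberOf", "demo:coll"),
    ("demo:page1", "isPartOf", "demo:book1"),
    ("demo:page2", "isPartOf", "demo:book1"),
    ("demo:other", "isMemberOf", "demo:coll2"),
]
print(sorted(assemble_collection(triples, "demo:coll")))
```

The same reverse-adjacency map could also feed a browsing interface, since it is already a graph of the objects and their neighbors.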
The Resource Index • Uses Resource Description Framework (RDF) • The repository can be configured to index any combination of the following aspects of a digital object: • System metadata properties • Dublin Core metadata • Metadata about datastreams and disseminations • Relationship metadata • Internal dependencies (e.g., between datastreams and disseminators)
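The configurable indexing described on this slide can be pictured as a function that emits triples only for the aspects switched on. This is a simplified sketch covering three of the listed aspects; the aspect keys (`system`, `dc`, `relationships`), the predicate prefixes, and the object layout are assumptions, not Fedora's real configuration options or vocabulary.

```python
def index_triples(obj, aspects):
    """Emit (subject, predicate, object) triples for the enabled aspects only."""
    pid = obj["pid"]
    triples = []
    # Property-style aspects: each key/value pair becomes one triple.
    for aspect in ("system", "dc"):
        if aspect in aspects:
            for name, value in obj.get(aspect, {}).items():
                triples.append((pid, f"{aspect}:{name}", value))
    # Relationship metadata is already in subject-relation-object form.
    if "relationships" in aspects:
        for relation, related in obj.get("relationships", []):
            triples.append((pid, relation, related))
    return triples


# Hypothetical object record used only for the example.
obj = {
    "pid": "demo:1",
    "system": {"label": "Sample"},
    "dc": {"title": "Sample title"},
    "relationships": [("isMemberOf", "demo:coll")],
}
print(index_triples(obj, {"dc"}))
print(index_triples(obj, {"system", "relationships"}))
```

Indexing fewer aspects keeps the index smaller, at the cost of queries that can no longer see the omitted properties.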
UVA First Implementation Arch Demo
Disseminators • Two default disseminators on every object • Default access behaviors, i.e. getPreview, getFullView, getLabel, getDefaultContent • Administrative and descriptive metadata behaviors • Class-specific disseminators, i.e. image and text disseminators • Search services to be provided using collection object disseminators
Text Collections: three models • TEI transcriptions of texts plus page images • TEI transcriptions only • Page images only, but the text represented by a minimal TEI file
Future Fedora Development • Improved infrastructure for workflow • Support for building indexes for searching • Infrastructure for building federations of Fedora repositories • Enhance performance • Support for preservation • Begin organizing a Fedora development consortium
Fedora Project web site: http://www.fedora.info • UVA Digital Initiatives: http://www.lib.virginia.edu/digital