
Architecting an Extensible Digital Repository

Anoop Kumar, Ranjani Saigal, Rob Chavez, Nikolai Schwertner. Tufts University, Medford, MA. Overview: background information on the evolution of TDL; design requirements; TDL architecture; applications that interface with TDL (Tufts DL search, VUE).


Presentation Transcript


  1. Architecting an Extensible Digital Repository • Anoop Kumar, Ranjani Saigal, Rob Chavez, Nikolai Schwertner • Tufts University, Medford, MA

  2. Overview • Background Information on the evolution of TDL • Design Requirements • TDL Architecture • Applications that interface with TDL • Tufts DL search • VUE

  3. History of Digital Collections at Tufts • About Tufts • Interdisciplinary • Focus on teaching and learning • Digital Collections at Tufts • Perseus (Classics) • Tufts University Science Knowledgebase (TUSK-Medicine) • Artifact (Art History) • Digital Collections and Archives (DCA) • Bolles, etc • Other (Crime and Punishment)

  4. Why TDL? (Tufts Digital Library) • The collections were continuously expanding, adding content in a variety of formats. The architecture of these libraries was not built to accommodate such expansion. • Tufts needed a university-wide digital repository that could manage the ever-increasing content while continuing to serve discipline-specific needs and leveraging existing and new tools and services.

  5. Designing TDL • Digital Collections and Archives partnered with Academic Technology to create a digital library that can manage the content while supporting teaching and learning. • Commitment to comply with standards in the library and the open source community. • Ensure Scalability, Flexibility, Reusability, Extensibility and Interoperability

  6. Design Requirements • Ingest: • Ability to enforce archival standards • Management: • Use of information packages to facilitate storage and dissemination • Ability to incorporate content models • Persistence: • Use of persistent identifiers • mapped URNs

  7. Tufts DL Architecture • [Architecture diagram. Components shown: Drop Box, Fedora Ingestion Service, Naming Service, Fedora, FedoraClient, Search Indexing Service, Search Index, Search Interface, Fedora Application Interface, Application Creation Service, Application Data. Legend: U = Users, M = Manager, A = Administrators.]

  8. Components of TDL

  9. TDL Architecture • Drop Box and Ingestion Service • Naming Service • Fedora Repository Service at Tufts • Indexing Service and Search Engine • Application Creation Service

  10. Drop Box and Ingestion Service


  12. Naming Service • Assigns, reserves and resolves URNs • URN format: tufts:school name:owner:[collection:]item name • Example: tufts:dca:central:MS102.33.1345 • URN properties • Provides a unique ID for objects deposited into the repository • The service assures resolution to a unique resource.
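The URN format on this slide can be illustrated with a small parser and an in-memory registry. This is only a sketch under stated assumptions: the class and function names (`TuftsUrn`, `parse_urn`, `NamingService`) and the reserve/assign/resolve semantics are hypothetical, chosen to mirror the three verbs on the slide, and are not the TDL implementation. The field labels follow the order given in the slide's format line.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class TuftsUrn:
    """Fields follow the slide: tufts:school name:owner:[collection:]item name."""
    school: str
    owner: str
    item: str
    collection: Optional[str] = None

    def __str__(self) -> str:
        parts = ["tufts", self.school, self.owner]
        if self.collection is not None:
            parts.append(self.collection)
        parts.append(self.item)
        return ":".join(parts)


def parse_urn(text: str) -> TuftsUrn:
    """Split a URN into its fields; the collection segment is optional."""
    parts = text.split(":")
    if parts[0] != "tufts" or len(parts) not in (4, 5):
        raise ValueError(f"not a TDL-style URN: {text!r}")
    if len(parts) == 4:
        _, school, owner, item = parts
        return TuftsUrn(school, owner, item)
    _, school, owner, collection, item = parts
    return TuftsUrn(school, owner, item, collection)


class NamingService:
    """Hypothetical in-memory registry: reserve a URN, assign it, resolve it."""

    def __init__(self) -> None:
        # URN string -> location, or None while the URN is only reserved.
        self._locations: Dict[str, Optional[str]] = {}

    def reserve(self, urn: TuftsUrn) -> None:
        key = str(urn)
        if key in self._locations:
            raise ValueError(f"URN already taken: {key}")
        self._locations[key] = None

    def assign(self, urn: TuftsUrn, location: str) -> None:
        key = str(urn)
        if self._locations.get(key) is not None:
            raise ValueError(f"URN already assigned: {key}")
        self._locations[key] = location

    def resolve(self, urn: TuftsUrn) -> str:
        location = self._locations.get(str(urn))
        if location is None:
            raise KeyError(f"URN not assigned: {urn}")
        return location
```

Rejecting a second reservation or assignment is one simple way to keep each URN mapped to a single resource, which is the uniqueness property the slide describes.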


  14. Fedora Repository Service@Tufts • Fedora: Key Features • Repository at Tufts • Content Models at Tufts • Objects, Behaviors and Disseminators • Implementation Challenges

  15. Flexible Extensible Data Object Repository Architecture (Fedora) • Support for heterogeneous data types • Accommodation of new types as they emerge • Aggregation of mixed, possibly distributed, data into complex objects • The ability to specify multiple content disseminations of these objects • The ability to associate rights management schemes with these disseminations.

  16. Repository Model • [Diagram. Components shown: Storage Device, HTTP Server, Processing Service, Caching Service, Fedora (stores URLs), Applications, User. Link labels: High Bandwidth (20Mb TIFF), Medium Bandwidth (20Mb TIFF), Internet Bandwidth (200Kb JPEG), HTTP Request.]
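The repository model on this slide includes a caching service alongside a processing service that serves a small JPEG derived from a large TIFF. One plausible reading is that derived disseminations are computed once and reused. The sketch below illustrates that idea only; `DisseminationCache` and its `produce` callback are hypothetical names, and nothing here reflects how Fedora or TDL actually cache content.

```python
from typing import Callable, Dict, Tuple


class DisseminationCache:
    """Memoize an expensive derivation keyed by (object id, method name)."""

    def __init__(self, produce: Callable[[str, str], bytes]) -> None:
        self._produce = produce  # the expensive step, e.g. TIFF -> JPEG
        self._store: Dict[Tuple[str, str], bytes] = {}
        self.misses = 0  # counts how often the expensive step actually ran

    def get(self, object_id: str, method: str) -> bytes:
        key = (object_id, method)
        if key not in self._store:
            self.misses += 1
            self._store[key] = self._produce(object_id, method)
        return self._store[key]


def fake_processing_service(object_id: str, method: str) -> bytes:
    # Stand-in for a real image transformation; returns placeholder bytes.
    return f"{method} of {object_id}".encode("utf-8")


cache = DisseminationCache(fake_processing_service)
first = cache.get("tufts:dca:central:MS102.33.1345", "getThumbnail")
second = cache.get("tufts:dca:central:MS102.33.1345", "getThumbnail")
```

The second call is answered from the cache, so the processing step runs only once per object and method. A real service would also need eviction and invalidation, which this sketch omits.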

  17. Content Model (CM) Hierarchy • Repository-Level Disseminators: getArchivalCopy, getPreview, getClass, etc. • Indexing Disseminators: getIndexTerms, getForIndexing, etc. • Text CM: getTOC, getChunksList, getChunk, etc. • Image CM: getThumbnail, getAccessHigh, getImageStats, etc. • Binary CM: getObject, getMIME, etc. • VUE CM: getConceptMap, getResource, etc. • Collection CM: getObjects, getInfo, etc. • Specific Implementations (TEI text, EAD text, Encyclopedia, Directory, TIFF image, etc.)
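The hierarchy on this slide maps naturally onto a class hierarchy: every content model shares the repository-level and indexing disseminators, and each model adds its own. The sketch below is an illustration under that assumption. The method names are Python-style renderings of the disseminator names on the slide, while the constructor arguments and method bodies are invented for the example.

```python
from abc import ABC, abstractmethod
from typing import Dict, List


class ContentModel(ABC):
    """Base class carrying the disseminators shared by every content model."""

    def __init__(self, pid: str, title: str) -> None:
        self.pid = pid
        self.title = title

    def get_class(self) -> str:  # repository-level: getClass
        return type(self).__name__

    def get_preview(self) -> str:  # repository-level: getPreview
        return f"{self.title} ({self.get_class()})"

    @abstractmethod
    def get_index_terms(self) -> List[str]:  # indexing: getIndexTerms
        """Each content model decides which terms it exposes for indexing."""


class TextCM(ContentModel):
    def __init__(self, pid: str, title: str, chunks: List[str]) -> None:
        super().__init__(pid, title)
        self.chunks = chunks

    def get_chunks_list(self) -> List[int]:  # getChunksList
        return list(range(len(self.chunks)))

    def get_chunk(self, index: int) -> str:  # getChunk
        return self.chunks[index]

    def get_index_terms(self) -> List[str]:
        return [word.lower() for chunk in self.chunks for word in chunk.split()]


class ImageCM(ContentModel):
    def __init__(self, pid: str, title: str, width: int, height: int) -> None:
        super().__init__(pid, title)
        self.width = width
        self.height = height

    def get_image_stats(self) -> Dict[str, int]:  # getImageStats
        return {"width": self.width, "height": self.height}

    def get_index_terms(self) -> List[str]:
        return self.title.lower().split()


def index_all(objects: List[ContentModel]) -> Dict[str, List[str]]:
    """An indexer can call the same method on any content model."""
    return {obj.pid: obj.get_index_terms() for obj in objects}
```

Because `index_all` depends only on the shared interface, a new content model can be indexed without changing the indexer.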

  18. Implementation Challenges • Processing Large XML Documents • Transforming Large Images • Modeling Collections • Advanced Search • Customized Search • Caching Disseminations


  20. Indexing Service and Search Engine • Indexing • Specialized Polymorphic Disseminators • Implementation • Lucene • Supported Types of Search • Basic Keyword • Advanced metadata based • Accessing the service • HTTP GET/POST • SOAP
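The two supported search types, basic keyword and advanced metadata-based, can be contrasted with a toy inverted index. The slide names Lucene as the actual implementation; the sketch below does not use Lucene and is not its API. `ToyIndex` and the field names in the usage lines are hypothetical, chosen only to show how a keyword query differs from a field-restricted one.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, Set, Tuple


class ToyIndex:
    """Minimal inverted index: keyword search across all fields,
    metadata search restricted to named fields."""

    def __init__(self) -> None:
        self._keyword: DefaultDict[str, Set[str]] = defaultdict(set)
        self._fields: DefaultDict[Tuple[str, str], Set[str]] = defaultdict(set)

    def add(self, doc_id: str, metadata: Dict[str, str]) -> None:
        for field, value in metadata.items():
            for term in value.lower().split():
                self._keyword[term].add(doc_id)
                self._fields[(field, term)].add(doc_id)

    def keyword_search(self, query: str) -> Set[str]:
        """Return ids of documents containing every query term in any field."""
        terms = query.lower().split()
        if not terms:
            return set()
        hits = [self._keyword.get(term, set()) for term in terms]
        return set.intersection(*hits)

    def metadata_search(self, **criteria: str) -> Set[str]:
        """Return ids of documents matching every term in the given fields."""
        hits = []
        for field, value in criteria.items():
            for term in value.lower().split():
                hits.append(self._fields.get((field, term), set()))
        return set.intersection(*hits) if hits else set()


index = ToyIndex()
index.add("a", {"title": "Roman coins", "creator": "Smith"})
index.add("b", {"title": "Greek vases", "creator": "Roman"})
everywhere = index.keyword_search("roman")
title_only = index.metadata_search(title="roman")
```

Here the keyword query matches both records, while the metadata query restricted to `title` matches only the first.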


  22. Application Creation Service • An important design requirement for TDL was to let current digital library applications interface easily with TDL and provide seamless access to its content within their own environments. • Current applications such as Perseus can interface with this service so that their tools can disseminate the content that resides in TDL. • The service has been designed not only to support current applications but also to accommodate the needs of future, yet-to-be-defined applications such as course management systems, learning tools and portals.

  23. Applications Accessing TDL Content • Tufts DL Search • Visual Understanding Environment (VUE)

  24. Visual Understanding Environment (VUE)

  25. Why TDL? (Tufts Digital Library) • The collections are continuously expanding, adding content in a variety of formats. The current architecture of these libraries is not built to accommodate such expansion. • Tufts needs a university-wide digital repository that can manage the ever-increasing content while continuing to serve discipline-specific needs and leveraging existing and new tools and services.

  26. Future Direction • Authentication and authorization service • Customization and enhancement of Fedora@Tufts to address a wide variety of needs • An automated browsing service for the repository
