
Developing PANDORA Mark Corbould Director, IT Business Systems

Explore the development strategy and future enhancements for PANDORA, a national and potentially distributed infrastructure for capturing, storing, and accessing digital content. Expect exponential growth, improved workflow efficiency, and better resource discovery.



Presentation Transcript


  1. Developing PANDORA • Mark Corbould • Director, IT Business Systems

  2. Context • Perceived Wisdom Accessing information from the Internet is like trying to drink from a fire hose • “It can’t be done” • “It will not scale” • “It is too expensive” • The goal posts keep moving as authors use the browser feature du jour • And the technical challenges are … • Systems/tools for capture, creation, storage, display and access • Metadata support • Access control and rights management • Preservation and ongoing access

  3. Development Strategy • Expect the archive to grow exponentially (at least a factor of two each year) • Develop PANDORA as a national and potentially distributed infrastructure • Develop PANDORA in the context of other collecting strategies, e.g. electronic deposit and whole-of-domain web capture • Buy, not build

  4. What is PANDORA Today? • PANDAS • The ILMS of PANDORA • Systems/tools for capture, creation, storage • Metadata support • Access control • PANDORA’s Box • The Stacks of PANDORA • Large scale storage supporting ongoing access and long term preservation • PANDORA’s Lid • The Reading Room for PANDORA • Controlled public access to the archive using contemporary browsers • Appropriate resource discovery tools

  5. PANDAS • Improve workflow efficiency • Provide more effective quality assurance tools • Develop the ability to allow publishers to push material into the archive • Keep pace with web publishing technology • Database-driven services • Streaming delivery

  6. PANDORA’s Box • The archive is currently approximately 1.5 million objects requiring 150 GB of storage • … and growing fast • The Digital Object Storage System (DOSS) • Large scale storage system for Digital Collections • Initial system configuration provides 5 TB of storage • System can be scaled to 25 TB • PANDORA will migrate to DOSS by the end of July
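
The capacity figures on this slide can be combined with the growth assumption from the Development Strategy slide (at least a factor of two each year) into a rough projection. The sketch below assumes exact annual doubling from 150 GB and decimal units (1 TB = 1000 GB); neither assumption is stated on the slides, and the function name is illustrative only.

```python
def years_until_exceeded(start_gb: float, capacity_gb: float, factor: float = 2.0) -> int:
    """Return the number of whole years of growth after which the archive
    first exceeds capacity_gb, starting from start_gb."""
    if start_gb <= 0 or factor <= 1:
        raise ValueError("start_gb must be positive and factor greater than 1")
    years = 0
    size = start_gb
    while size <= capacity_gb:
        size *= factor  # one year of growth (assumed: exact doubling by default)
        years += 1
    return years


START_GB = 150.0                     # current archive size from the slide
INITIAL_CAPACITY_GB = 5 * 1000.0     # initial DOSS configuration (assumed 1 TB = 1000 GB)
MAX_CAPACITY_GB = 25 * 1000.0        # stated scaling limit of DOSS

if __name__ == "__main__":
    print(years_until_exceeded(START_GB, INITIAL_CAPACITY_GB))  # 6
    print(years_until_exceeded(START_GB, MAX_CAPACITY_GB))      # 8
```

Under these assumptions the initial 5 TB configuration would be outgrown in the sixth year and the 25 TB ceiling in the eighth; growth faster than doubling would shorten both horizons.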

  7. DOSS Architecture • [Diagram. Components: DB Server, Web Server, DOMS Server, SAN Switch, Tape Library, Disk Arrays. Link labels: Ethernet 100 Mbs, SCSI 80 MBs, Fibre Channel 100 MBs]
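
As a rough illustration of what the link speeds labelled on this slide imply, the sketch below estimates how long it would take to move the current 150 GB archive over each link at its nominal rate. It assumes that "Mbs" means megabits per second and "MBs" means megabytes per second, uses decimal units, and ignores protocol overhead and contention, so the results are theoretical lower bounds rather than measurements.

```python
ARCHIVE_GB = 150.0  # current archive size from the PANDORA's Box slide

# Nominal link rates in megabytes per second.
# Assumption: "Mbs" on the slide is megabits/s, "MBs" is megabytes/s.
LINK_RATES_MB_PER_S = {
    "Ethernet": 100 / 8,      # 100 megabits/s = 12.5 MB/s
    "SCSI": 80.0,
    "Fibre Channel": 100.0,
}


def transfer_hours(size_gb: float, rate_mb_per_s: float) -> float:
    """Best-case hours needed to move size_gb at a sustained rate_mb_per_s."""
    seconds = size_gb * 1000 / rate_mb_per_s  # assumed 1 GB = 1000 MB
    return seconds / 3600


if __name__ == "__main__":
    for name, rate in LINK_RATES_MB_PER_S.items():
        print(f"{name}: {transfer_hours(ARCHIVE_GB, rate):.2f} h")
```

Even in this best case the Ethernet link is the slow path by nearly an order of magnitude, which is consistent with keeping bulk storage traffic on the Fibre Channel side of the diagram.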

  8. PANDORA’s Lid • Initial release will go into production by the end of July, and will support • Automatically generated title entry pages • Access Controls • Improved resource discovery • Browse by title • Browse by subject • Full text search • Metadata search • And it will look better too!

  9. PANDORA’s Lid futures • Better integration with the Library Catalogue • Full metadata support • Facilitate the research use of the archive through the development of appropriate navigation tools • Support more sophisticated rights management • Better browser support

  10. Towards a Distributed National Archive • PANDORA currently supports distributed collection management and access through a central system • The Library, in partnership with other agencies, will explore “more” distributed models • Currently the model being discussed is that of agencies having the choice to maintain local archives and access, with a central metadata repository and access portal • Two possible architectures have been proposed

  11. Distributed Storage • Enhance existing system to allow agencies to have local copies of PANDORA’s Box and their own public access system • Can be done in the short term • Management is central • Gathering may be local or central • Archiving is to a local system • PANDORA’s Lid provides normal functionality

  12. Distributed PANDORA • Each agency would provide local management, gathering, storage and access • National metadata repository and access portal may be real or virtual • Difficulties • Technology • Cost • A packaged hardware and software solution providing a “PANDORA Appliance”
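
The model discussed on slides 10 to 12, in which agencies keep local archives while a central metadata repository and access portal tie them together, can be sketched as a small lookup structure. Everything named here (`MetadataRecord`, `MetadataRepository`, the agency names, and the example URLs) is hypothetical and chosen for illustration; the slides do not specify a schema or an interface.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class MetadataRecord:
    """Central description of a title whose content is held by one agency."""
    title: str
    subject: str
    agency: str      # hypothetical agency identifier
    local_url: str   # hypothetical address of the agency's local archive copy


@dataclass
class MetadataRepository:
    """Hypothetical national repository: stores metadata only, not content."""
    records: List[MetadataRecord] = field(default_factory=list)

    def register(self, record: MetadataRecord) -> None:
        self.records.append(record)

    def browse_by_subject(self, subject: str) -> List[MetadataRecord]:
        wanted = subject.lower()
        return [r for r in self.records if r.subject.lower() == wanted]

    def resolve(self, title: str) -> Optional[str]:
        """Return the local archive URL for a title, or None if unknown."""
        wanted = title.lower()
        for r in self.records:
            if r.title.lower() == wanted:
                return r.local_url
        return None


if __name__ == "__main__":
    repo = MetadataRepository()
    repo.register(MetadataRecord("Example Gazette", "Government",
                                 "agency-a", "https://archive.agency-a.example.org/gazette"))
    repo.register(MetadataRecord("Example Journal", "Science",
                                 "agency-b", "https://archive.agency-b.example.org/journal"))
    print(repo.resolve("example gazette"))
    print([r.title for r in repo.browse_by_subject("science")])
```

In this sketch the portal never stores content: it only answers discovery queries and hands the reader to whichever agency holds the material, which is one way to read "real or virtual" on the final slide.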
