DSpace System Architecture

imaran

Presentation Transcript


  1. DSpace System Architecture

  2. Logical Data Model

  3. Metadata I
     • Descriptive:
        • Qualified Dublin Core in RDBMS; registry of elements and qualifiers
        • Other schema (e.g. MARC) stored as serialised bitstreams
     • Administrative:
        • Some stored as Dublin Core (e.g. provenance)
        • Mostly stored in legacy DB schema
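The qualified Dublin Core storage on this slide can be pictured as (element, qualifier, value) entries checked against a registry of permitted element/qualifier pairs. The following is a minimal Java sketch of that idea only; the class names `DcValue`, `DcRegistry` and `ItemMetadata` are hypothetical and do not reflect DSpace's actual schema or API.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: names are illustrative, not DSpace classes.
final class DcValue {
    final String element;    // e.g. "date"
    final String qualifier;  // e.g. "issued"; null for an unqualified element
    final String value;

    DcValue(String element, String qualifier, String value) {
        this.element = element;
        this.qualifier = qualifier;
        this.value = value;
    }
}

// Registry of the element/qualifier pairs that items may use.
final class DcRegistry {
    private final Set<String> allowed = new HashSet<>();

    static String key(String element, String qualifier) {
        return qualifier == null ? element : element + "." + qualifier;
    }

    void register(String element, String qualifier) {
        allowed.add(key(element, qualifier));
    }

    boolean isRegistered(String element, String qualifier) {
        return allowed.contains(key(element, qualifier));
    }
}

// Descriptive metadata of one item, validated against the registry.
final class ItemMetadata {
    private final DcRegistry registry;
    private final List<DcValue> values = new ArrayList<>();

    ItemMetadata(DcRegistry registry) {
        this.registry = registry;
    }

    void add(String element, String qualifier, String value) {
        if (!registry.isRegistered(element, qualifier)) {
            throw new IllegalArgumentException(
                "Not in registry: " + DcRegistry.key(element, qualifier));
        }
        values.add(new DcValue(element, qualifier, value));
    }

    // Returns all values stored for the given element/qualifier pair.
    List<String> get(String element, String qualifier) {
        String wanted = DcRegistry.key(element, qualifier);
        List<String> result = new ArrayList<>();
        for (DcValue v : values) {
            if (DcRegistry.key(v.element, v.qualifier).equals(wanted)) {
                result.add(v.value);
            }
        }
        return result;
    }
}
```

In this sketch, an item can hold several values for the same pair (for example two `contributor.author` entries), and an unregistered pair is rejected at insertion time.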

  4. Metadata II
     • Structural:
        • Currently, just structure of data model
        • Proposed use of METS to describe relationship between bitstreams in bundles
     • Preservation:
        • Registry of bitstream formats with preservation metadata

  5. Metadata III • Much of the metadata proprietary, embedded in RDBMS schema • Defining Archival Information Package based on METS to store this in a standard way

  6. Persistent Naming • Handle system from CNRI • Currently Handles assigned to items – soon to cover collections and communities • CNRI Handle resolving software with DSpace “plug-in” • Individual Handles managed within DSpace system

  7. Implementation • Implemented in Java • UNIX-type operating systems: Linux, HP-UX, etc. • Open source – BSD-style license • Uses only free, open source libraries

  8. Storage Layer • RDBMS: PostgreSQL via JDBC, and a simpler “homegrown” API on top • Simple bitstream storage API – currently just stores bitstreams in the file system
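A bitstream storage API that "just stores bitstreams in the file system" might look roughly like the sketch below. It is an illustration under stated assumptions: the class name `FileBitstreamStore`, its method names, and the use of random UUIDs as identifiers are all hypothetical, and the slides do not describe the real API.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

// Hypothetical sketch of a file-system-backed bitstream store.
final class FileBitstreamStore {
    private final Path root;

    FileBitstreamStore(Path root) {
        try {
            this.root = Files.createDirectories(root);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Convenience factory for experiments: stores files under a temp directory.
    static FileBitstreamStore inTempDirectory() {
        try {
            return new FileBitstreamStore(Files.createTempDirectory("bitstreams"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Copies the stream into a new file and returns its identifier.
    String store(InputStream in) {
        String id = UUID.randomUUID().toString(); // assumption: opaque random IDs
        try {
            Files.copy(in, root.resolve(id));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return id;
    }

    // Reads the whole bitstream back into memory.
    byte[] retrieve(String id) {
        try {
            return Files.readAllBytes(root.resolve(id));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Removes the bitstream; returns false if it did not exist.
    boolean delete(String id) {
        try {
            return Files.deleteIfExists(root.resolve(id));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Keeping callers behind a small interface like this is what would let a different backend replace the file system later without touching the code above it.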

  9. Content Management API • Java classes for manipulating content and metadata • Fully transaction-safe • Invokes authorisation system • Records object history

  10. Workflow System • Submissions to a collection may go through editorial or review process • Implemented simple submission workflow with three optional stages; meets initial requirements
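A workflow with three optional stages can be sketched as a small state machine in which each collection enables any subset of the stages and a submission skips the disabled ones. The stage names `STEP_1` to `STEP_3` and the class `SimpleWorkflow` are placeholders, since the slide does not name the stages or describe the implementation.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical stage names; the slide only says there are three optional stages.
enum Stage { SUBMITTED, STEP_1, STEP_2, STEP_3, ARCHIVED }

final class SimpleWorkflow {
    private final Set<Stage> enabled = EnumSet.noneOf(Stage.class);

    // enabledSteps: the workflow steps a collection has switched on (may be empty).
    SimpleWorkflow(Set<Stage> enabledSteps) {
        enabled.addAll(enabledSteps);
    }

    // Returns the next stage after 'current', skipping disabled steps.
    Stage next(Stage current) {
        Stage[] all = Stage.values();
        for (int i = current.ordinal() + 1; i < all.length; i++) {
            if (all[i] == Stage.ARCHIVED || enabled.contains(all[i])) {
                return all[i];
            }
        }
        return Stage.ARCHIVED; // already at the end
    }
}
```

With no steps enabled, a submission moves straight from `SUBMITTED` to `ARCHIVED`, which corresponds to bypassing the workflow entirely.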

  11. Search • Simple wrapper around Jakarta “Lucene” search engine • Lucene maintains indices, implements stemming, stop-word removal and wild-card searching • Wrapper allows bounded search within specific community/collection

  12. Browse • “Homegrown” • Builds indices in RDBMS • Browse by author, issue date, title • Bounded browse within specific community/collection

  13. “E-people” and Group Management • Currently represent only users of the system; not a naming authority • Stores user’s e-mail address, real name, password, contact information • Simple group management – organise e-people into named groups

  14. Authorisation • Policies are triples: (object, action, group) • Any object (community, collection, item, bitstream) can have access policies, and policies can be inherited from containers
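The (object, action, group) triples and container inheritance can be illustrated with the Java sketch below. It assumes that inheritance is resolved at check time by walking up the container chain; the slide does not say whether DSpace does this or copies policies when an object is created. All names (`Node`, `Action`, `PolicyStore`) are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

enum Action { READ, WRITE }

// A community, collection, item or bitstream; 'container' is null at the top level.
final class Node {
    final String name;
    final Node container;

    Node(String name, Node container) {
        this.name = name;
        this.container = container;
    }
}

// Hypothetical policy store holding (object, action, group) triples.
final class PolicyStore {
    // For each object, the set of "ACTION:group" grants attached directly to it.
    private final Map<Node, Set<String>> policies = new HashMap<>();

    void grant(Node object, Action action, String group) {
        policies.computeIfAbsent(object, k -> new HashSet<>()).add(action + ":" + group);
    }

    // Assumption: a policy on a container also applies to everything inside it.
    boolean isAuthorised(Node object, Action action, Set<String> userGroups) {
        for (Node n = object; n != null; n = n.container) {
            Set<String> grants = policies.get(n);
            if (grants == null) {
                continue;
            }
            for (String group : userGroups) {
                if (grants.contains(action + ":" + group)) {
                    return true;
                }
            }
        }
        return false;
    }
}
```

For example, granting `READ` to a group on a collection would, under this assumption, let members of that group read every item and bitstream inside it unless a stricter model is layered on top.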

  15. History • Mechanical history of objects stored in RDF, using the Harmony ABC model • Stored in file system • Simple index held in RDBMS

  16. Ingest • Legacy import format (SIP) defined – soon to be based on METS • To ingest, content + metadata transferred to temporary filespace on the server • Ingest application is invoked, which trawls and ingests data • Ingested content can be sent through the workflow system or bypass it

  17. Administration Toolkit • Ubiquitous “Context” object is used for authorisation and transaction-safety • Configuration system – includes handling of configuration files for external tools (e.g. Apache, Tomcat) • Standard text-based logging • Classes for manipulating Dublin Core and bitstream format registries

  18. Web User Interface I • Implemented as Servlets (for processing HTTP requests) and JSPs (for displaying HTML) – allows easy modification of display without affecting functionality • Tomcat 4.0 Servlet engine running with Apache (for SSL features) • Caucho Resin also supported

  19. Web User Interface II • Authentication with e-mail/password or X509 certificates • Other sites can provide a site authenticator class with custom code, e.g. interfacing with a departmental database
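The idea of a site-supplied authenticator class can be sketched as an interface with a default e-mail/password implementation and a site-specific implementation that consults a departmental source first. This is a hedged illustration: the interface `Authenticator`, both classes, and the in-memory maps standing in for real credential stores are hypothetical and are not DSpace's actual extension point. X509 handling is omitted.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical extension point; not DSpace's actual interface.
interface Authenticator {
    // Returns true if the credentials identify a known e-person.
    boolean authenticate(String email, String password);
}

// Default behaviour: check the password held for the e-person.
// Plain-text comparison keeps the sketch short; a real system would store hashes.
final class PasswordAuthenticator implements Authenticator {
    private final Map<String, String> passwords = new HashMap<>();

    void addUser(String email, String password) {
        passwords.put(email, password);
    }

    @Override
    public boolean authenticate(String email, String password) {
        String stored = passwords.get(email);
        return stored != null && stored.equals(password);
    }
}

// Site-specific behaviour: consult a departmental directory first,
// then fall back to the default authenticator for everyone else.
final class DepartmentalAuthenticator implements Authenticator {
    private final Map<String, String> directory; // stands in for a departmental database
    private final Authenticator fallback;

    DepartmentalAuthenticator(Map<String, String> directory, Authenticator fallback) {
        this.directory = new HashMap<>(directory);
        this.fallback = fallback;
    }

    @Override
    public boolean authenticate(String email, String password) {
        String stored = directory.get(email);
        if (stored != null) {
            return stored.equals(password);
        }
        return fallback.authenticate(email, password);
    }
}
```

The design choice illustrated here is that the web UI depends only on the interface, so a site can substitute its own class without changing the servlets.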

  20. Web User Interface III
     • Pages for:
        • Search
        • Browse
        • Configurable community and collection home pages
        • “My DSpace” – managing in-progress submissions and workflow tasks
        • Central administration UI – currently little support for self-administration by communities
        • Item display
        • Item submission
