
Natasa Bulatovic Max Planck Digital Library Research and Development

eSciDoc, VIRR and Digitization Lifecycle - insights into an infrastructure for management of digitized resources. Natasa Bulatovic Max Planck Digital Library Research and Development. The Max Planck Digital Library (MPDL) in a Nutshell.


Presentation Transcript


  1. eSciDoc, VIRR and Digitization Lifecycle - insights into an infrastructure for management of digitized resources Natasa Bulatovic Max Planck Digital Library Research and Development

  2. The Max Planck Digital Library (MPDL) in a Nutshell • Max Planck Digital Library (MPDL) is a service unit within the Max Planck Society (MPG) • MPG consists of about 80 institutes in three scientific sections • the Chemistry, Physics and Technology Section • the Biology and Medicine Section • the Human Sciences Section • The core activities of the MPDL lie in building up service infrastructure and tools for publications and research data • MPDL develops software solutions in close cooperation with scientists, librarians and technicians • In the Human Sciences Section several institutes have digitized cultural artefacts and want to make them available as open access

  3. eSciDoc SOA Landscape

  4. Which data are managed?

  5. How? • PubMan – Publication Management • VIRR – Textual digitized resources management • IMEJI – Image management

  6. PubMan: Management of publications

  7. VIRR is about • Collaboration of the MPDL with the Max Planck Institute for European Legal History • Motivation: The period of the Holy Roman Empire produced an enormous corpus of legislative sources. Until now, no complete collection of these works exists.

  8. ViRR Key features • Web-based collaborative application • Editor (bibliographic metadata, table of contents and structural metadata) • Viewer (online representation) • Browser

  9. ViRR Editor • Combines a set of tools • Paginator • Table of Contents Editor • Metadata Editor • One complex, but flexible workspace • No default order for the usage of the tools

  10. ViRR Editor - Paginator • Assign the logical page numbers to the physical ones • Choose between different formats (Arabic, Latin, custom) • Paginate manually or automatically
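Automatic pagination as described on this slide can be pictured as a mapping from physical scans to logical page labels. The following is a minimal sketch, not ViRR's actual implementation: the function names `to_roman` and `paginate`, the scan identifiers, and the reading of "Latin" as Roman numerals are all assumptions made for illustration.

```python
def to_roman(n: int) -> str:
    """Convert a positive integer to an upper-case Roman numeral."""
    if n <= 0:
        raise ValueError("Roman numerals need a positive integer")
    numerals = [
        (1000, "M"), (900, "CM"), (500, "D"), (400, "CD"),
        (100, "C"), (90, "XC"), (50, "L"), (40, "XL"),
        (10, "X"), (9, "IX"), (5, "V"), (4, "IV"), (1, "I"),
    ]
    parts = []
    for value, symbol in numerals:
        count, n = divmod(n, value)
        parts.append(symbol * count)
    return "".join(parts)


def paginate(scans, start=1, fmt="arabic"):
    """Assign consecutive logical page labels to physical scans.

    `scans` is an ordered list of hypothetical scan identifiers;
    `fmt` is "arabic" or "roman" (lower-case Roman numerals).
    """
    labels = {}
    for offset, scan in enumerate(scans):
        number = start + offset
        labels[scan] = to_roman(number).lower() if fmt == "roman" else str(number)
    return labels


# Front matter numbered in Roman numerals, main text in Arabic numerals.
front = paginate(["scan_001", "scan_002"], start=1, fmt="roman")
body = paginate(["scan_003", "scan_004"], start=1, fmt="arabic")
```

Manual pagination would then amount to overriding individual entries of the returned mapping.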

  11. ViRR Editor - ToC Editor • Capture the logical structure of a work by breaking it down into structural elements • Arrange the hierarchical order of structural elements in the tree • Assign scans to structural elements • Choose from fine-grained structural element types (over sixty)
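The table of contents described here is essentially a tree whose nodes carry a type, a title, and the scans assigned to them. A small sketch of such a structure follows; the class name `StructuralElement`, the element kinds, and the scan identifiers are hypothetical and do not come from ViRR itself.

```python
from dataclasses import dataclass, field


@dataclass
class StructuralElement:
    """One node of the logical structure of a digitized work."""
    kind: str                                   # e.g. "volume", "chapter" (illustrative types)
    title: str
    scans: list = field(default_factory=list)   # scans assigned directly to this element
    children: list = field(default_factory=list)

    def add(self, child):
        """Attach a child element and return it, so calls can be chained."""
        self.children.append(child)
        return child

    def all_scans(self):
        """Collect the scans of this element and all of its descendants."""
        result = list(self.scans)
        for child in self.children:
            result.extend(child.all_scans())
        return result


def outline(element, depth=0):
    """Render the tree as indented lines, one per structural element."""
    lines = ["  " * depth + f"{element.kind}: {element.title}"]
    for child in element.children:
        lines.extend(outline(child, depth + 1))
    return lines


work = StructuralElement("volume", "Collected ordinances")
chapter = work.add(StructuralElement("chapter", "Part I", scans=["scan_003"]))
chapter.add(StructuralElement("section", "Preamble", scans=["scan_004", "scan_005"]))
print("\n".join(outline(work)))
```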

  12. ViRR Editor – Metadata Editor • Assign descriptive metadata to structural elements • Detailed description of every structural element • Systematic browsing • Dedicated search will be possible

  13. ViRR Viewer • Browse by ToC • Navigate to page • View metadata of structural element • Browse by scan • Page (web resolution) • Page (full resolution) on click

  14. ViRR: Sharing and reuse http://virr.mpdl.mpg.de

  15. From ViRR to Digitization Lifecycle Project • Goal • support the complete Digitization Lifecycle with guidelines, standards, tools and a publishing platform • Partners: • MPI for European Legal History, Frankfurt • Kunsthistorisches Institut, Florence (KHI) • Bibliotheca Hertziana, Rome • MPI for Human Development, Berlin • Related projects: • ViRR (see http://colab.mpdl.mpg.de/mediawiki/ViRR:_Virtueller_Raum_Reichsrecht) • XML-Workflow (see http://colab.mpdl.mpg.de/mediawiki/MPDL_Project_XML_Workflow)

  16. Imeji: Management of image collections

  17. Imeji: repository of Digital Images • Organized into • Collections: created and defined by the institution, project, or working group • Albums: created and defined by the researcher

  18. Imeji: what is so different about it? • Imeji is neither Flickr nor Facebook... • Freely definable metadata profiles at collection level • Controlled vocabularies may be integrated • Smart search for dates, ranges (based on the metadata type) • Helps gather the metadata more effectively • Focuses on collaboration and metadata quality • Repository: data can be exported at any time

  19. eSciDoc and other services

  20. eSciDoc SOA Landscape

  21. eSciDoc core infrastructure (areas: Statistics, Security, Resources & Data) • Report Handler • Report Definition Handler • Aggregation Definition Handler • Statistics Data Handler • Scope Handler • Admin Handler • Set Handler (OAI-PMH) • Item Handler • Container Handler • Content Relation Handler • Context Handler • Organizational Unit Handler • Content Model Manager • User Account Handler • Role Handler • Group Handler

  22. CoNE Service • Manages named entities • Journals • Persons • Dewey Decimal Classification (3 public levels) • Creative Commons Licenses (CC licenses) • ISO 639-3 Languages • MIME Types • PACS classification • Custom classifications • Reuse • Data delivered in multiple formats (JSON, HTML, RDF/XML, Options list) • Motivation • Metadata quality: autosuggest components in solutions during metadata editing • Disambiguation: each entity is a named graph • Data linking: CoNE identifiers in publication metadata • Technical facilitation: all lists in one place • Persons: Researcher Portfolio • Extensions • Refresh data from external sources

  23. CoNE – Control of Named Entities • http://cone.mpdl.mpg.de/ • http://pubman.mpdl.mpg.de/cone/persons/resource/persons2450 • Content negotiation supported
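Content negotiation means that a client asks for a representation of the same resource URL by sending an HTTP `Accept` header. The sketch below only builds such a request with Python's standard library. The media types in `FORMATS` are common choices for the formats named on the previous slide and are an assumption here, since the slides do not say which types the CoNE service actually honours.

```python
import urllib.request

# Assumed media types for the formats mentioned in the slides (JSON, RDF/XML, HTML).
FORMATS = {
    "json": "application/json",
    "rdf": "application/rdf+xml",
    "html": "text/html",
}


def build_request(resource_url, fmt):
    """Build a GET request that asks for one representation via the Accept header."""
    return urllib.request.Request(resource_url, headers={"Accept": FORMATS[fmt]})


def fetch(resource_url, fmt, timeout=10):
    """Send the request and return the response body as text (needs network access)."""
    with urllib.request.urlopen(build_request(resource_url, fmt), timeout=timeout) as response:
        return response.read().decode("utf-8")


person_url = "http://pubman.mpdl.mpg.de/cone/persons/resource/persons2450"
request = build_request(person_url, "rdf")
print(request.get_header("Accept"))
```

Calling `fetch(person_url, "json")` would perform the actual retrieval, provided the service is still reachable at that address.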

  24. Transformation Service • Transforms textual data formats • Metadata • Resources • Standard formats • Specific formats (e.g. EndNote custom fields) • Motivation • Migration of data from MPI • Exports and dissemination • Imports • Continuous interoperability enhancement • Implement once, use wherever needed

  25. Search & Export Service, Citation style manager • Searches and exports results • Citation styles (Citation style manager) • EndNote • BibTeX • … • Reuse • Data delivered in multiple formats (PDF, HTML, XML, ODT) • By external systems (content management, WordPress) • Motivation • Search results should be available in various outputs • One service – many presentations (e.g. WordPress plug-in) • One interface – easy inclusion of various export formats

  26. Syndication Service • Provides the latest data updates • RSS • Atom • Reuse • Subscription to feeds and data reuse • By any external clients • Extensions • Media RSS • Feeds: <feed> <!--The title of the feed --> <title>Recent releases in repository</title> <!--Feed's description --> <description>Recent releases in repository (item versions)</description> … </feed> • [Diagram: Syndication Service (steps 1 and 4) • 2: Get feed definition • 3: Search/retrieve items (eSciDoc Repository)]

  27. Validation service • Semantic validation • Contextual validation • Validation rule editor (upcoming)

  28. Data acquisition service • Fetches data from known sources via identifier (unAPI interface) • Transforms data to other format
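unAPI defines three request forms against a single endpoint: no parameters (list all formats), `id` only (list formats for one object), and `id` plus `format` (return the object in that format). The sketch below builds these URLs. The endpoint `http://example.org/unapi` and the sample identifier are placeholders, not addresses of the eSciDoc service.

```python
from urllib.parse import urlencode


def unapi_url(endpoint, identifier=None, fmt=None):
    """Build one of the three unAPI request URLs for the given endpoint."""
    if fmt is not None and identifier is None:
        raise ValueError("unAPI needs an id whenever a format is given")
    params = {}
    if identifier is not None:
        params["id"] = identifier
    if fmt is not None:
        params["format"] = fmt
    return endpoint + ("?" + urlencode(params) if params else "")


ENDPOINT = "http://example.org/unapi"  # placeholder endpoint
print(unapi_url(ENDPOINT))                                  # all supported formats
print(unapi_url(ENDPOINT, "arxiv:0904.3933"))               # formats for one object
print(unapi_url(ENDPOINT, "arxiv:0904.3933", "bibtex"))     # the object itself
```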

  29. PubMan SWORD Server • Deposit of data packages (metadata and full texts) • Logic implements a PubMan-specific workflow

  30. PID Cache manager • Fetches Handles from the GWDG Handle System (dummy resolution) • Assigns a pre-fetched handle to the resource • Synchronizes the assigned handle with the resolution to a resource in the Handle system EPIC – European Persistent Identifier Consortium (GWDG Germany, SARA Netherlands, CSC Finland, http://www.pidconsortium.eu/ )
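The three steps on this slide (pre-fetch handles, assign one to a resource, synchronize its resolution) suggest a small pool of identifiers that is filled ahead of time. The sketch below illustrates that idea only: the class `PidCache` and the two callables `mint_handle` and `update_handle` are hypothetical stand-ins for calls to a Handle service and do not reflect the real GWDG or EPIC API.

```python
from collections import deque


class PidCache:
    """Keep a pool of pre-fetched handles and hand them out on demand."""

    def __init__(self, mint_handle, update_handle, size=3):
        self._mint = mint_handle        # hypothetical: creates a handle with a dummy target
        self._update = update_handle    # hypothetical: points a handle at its final URL
        self._size = size
        self._pool = deque()

    def refill(self):
        """Pre-fetch handles until the pool holds `size` entries."""
        while len(self._pool) < self._size:
            self._pool.append(self._mint())

    def available(self):
        """Return how many pre-fetched handles are waiting in the pool."""
        return len(self._pool)

    def assign(self, resource_url):
        """Take a pre-fetched handle and synchronize it with the resource URL."""
        if not self._pool:
            self.refill()
        handle = self._pool.popleft()
        self._update(handle, resource_url)
        return handle
```

Pre-fetching keeps the comparatively slow call to the external Handle system out of the path that creates a resource; only the later synchronization has to reach the remote service.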

  31. A note on the metadata profiles • DCAP based (Dublin Core Application Profile) • DC terms (identified by URIs) • eSciDoc solution specific terms (identified by URIs) • METS/MODS • Publicly available • Functional description http://colab.mpdl.mpg.de/mediawiki/ESciDoc_Application_Profiles • Schemas http://metadata.mpdl.mpg.de/escidoc/metadata/schemas/0.1/ • Interoperability levels • Shared term definitions (done) • Semantic interoperability (done) • Description set syntactic interoperability (prepared) • Description set profile interoperability (prepared)

  32. Premises • Applications • Web-based • Internationalized • Integrated Help system • Easy to use • Easy to install • Services and infrastructure • Reusable, interoperable, composed, technology-independent • Extensible, scalable and performant • Data • Persistently identified, versioned, discoverable, provenance and authenticity information, fine-grained authorization • Described with published metadata profiles • Interoperable and enabled for reuse and repurposing

  33. Related projects and new developments • DARIAH Digital Research Infrastructure for Arts and Humanities (see http://dariah.eu) • Imeji • AWOB • Astronomers Workbench • Resource Registries • ECHO – European Cultural Heritage Online (see http://echo.mpiwg-berlin.mpg.de/home)

  34. Thank you! • bulatovic@mpdl.mpg.de http://colab.mpdl.mpg.de http://escidoc.org
