Europeana: Update on Metadata Mapping and Normalisation, Content Ingestion and Aggregation Activities. Robina Clayphan Interoperability Manager, EDLF. ECDL Workshop – Harvesting Metadata: Practices and Challenges September 30 2009. Introduction.
Introduction • A look at the metadata schema we use and the elements that must be in a standard form • The whole ingestion process • Summary of the aspects of and approach to aggregation
Europeana Europeana brings together and makes available digital content from: • Four cultural heritage sectors • Museums, Archives, Libraries, Audio-visual archives • Twenty-nine countries • EU plus Norway and Switzerland • Twenty-six languages • Four types of material • Image, sound, video, text ….need for a metadata lingua franca…
ESE V3.2 Europeana Semantic Elements (ESE) V3.2 developed for the prototype • A Dublin Core-based application profile • Cross-domain schema for heterogeneous data • Not intended to capture the full semantics of providers’ data • 37 Dublin Core terms – used principally to describe the objects • 12 Europeana-coined terms – used to support portal functionality • Consistent data is needed for the portal to work
Normalised elements • Language • ISO 639-1 standard two-character code • Country • ISO 3166 standard • Year • Four-digit year from the Gregorian calendar (YYYY) • Generated where possible from the date supplied in <dc:date> • Provider • Controlled list of names, in the language of the provider • Type • Controlled list (in English) of four types: Text, Image, Sound, Video • Mapped by the provider from the diverse types used in the source data
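The normalisation rules above can be illustrated with a minimal Python sketch. It is not the Europeana normalisation code: the function names, the regular expression and the `TYPE_MAP` entries are assumptions made for this example, and the language check only tests the two-letter shape of a code, not whether it is actually listed in ISO 639-1.

```python
import re

# Hypothetical mapping from provider-specific type labels to the four
# controlled values named on the slide. A real provider would build its
# own table from the types found in its source data.
TYPE_MAP = {
    "photograph": "Image",
    "painting": "Image",
    "manuscript": "Text",
    "book": "Text",
    "recording": "Sound",
    "film": "Video",
}

# A run of exactly four digits that is not part of a longer digit run.
YEAR_PATTERN = re.compile(r"(?<!\d)(\d{4})(?!\d)")


def extract_year(dc_date):
    """Return the first standalone four-digit run in a dc:date value, or None."""
    match = YEAR_PATTERN.search(dc_date)
    return match.group(1) if match else None


def normalise_type(source_type):
    """Map a provider type label to a controlled value, or None if unmapped."""
    return TYPE_MAP.get(source_type.strip().lower())


def normalise_language(code):
    """Lower-case a language code; return None unless it is two letters long."""
    code = code.strip().lower()
    return code if re.fullmatch(r"[a-z]{2}", code) else None
```

Returning `None` rather than guessing mirrors the "generated where possible" wording: a value that cannot be normalised is left for the provider to resolve.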
Mapping and Normalisation Three key reference documents for providers: • ESE Specification V3.2 • Normalisation Guidelines V1.2 • ESE V3.2 XML schema + explanatory text All available from the “Provide Content” section of the Europeana Group pages: http://group.europeana.eu/web/guest/provide_content
Content Ingestion ……starting right from the beginning
Content Ingestion • Europeana has provided a Content Checker tool which has two parts: • The Content Ingestor • Allows uploading of a data set • Validation against the ESE V3.2 XML schema • Importing the data into the database • Indexing of data • Caching of thumbnails • The Test Portal • Separate from the operational portal • Allows provider to search for uploaded data
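Before uploading a data set, a provider might run a simple local pre-check of its own. The sketch below is not the Content Checker and does not perform XML Schema validation; it only reports which of the normalised elements are absent or empty in a single record. The namespace URI, the `europeana:`-prefixed element names, the `record` wrapper and the sample data are all assumptions made for this example and should be checked against the ESE V3.2 XML schema.

```python
import xml.etree.ElementTree as ET

# Assumed namespace URI for the Europeana-coined elements; confirm it
# against the published ESE V3.2 XML schema before relying on it.
ESE_NS = "http://www.europeana.eu/schemas/ese/"

# The normalised elements listed earlier, treated here (hypothetically)
# as the fields worth checking before upload.
CHECKED_FIELDS = ["language", "country", "year", "provider", "type"]


def missing_fields(record_xml):
    """Return the checked element names that are absent or empty in a record."""
    record = ET.fromstring(record_xml)
    missing = []
    for name in CHECKED_FIELDS:
        element = record.find(f"{{{ESE_NS}}}{name}")
        if element is None or not (element.text or "").strip():
            missing.append(name)
    return missing


# Illustrative record only; the values are invented for the example.
SAMPLE = """<record xmlns:europeana="http://www.europeana.eu/schemas/ese/">
  <europeana:language>nl</europeana:language>
  <europeana:country>netherlands</europeana:country>
  <europeana:provider>Example Provider</europeana:provider>
  <europeana:type>Image</europeana:type>
</record>"""

print(missing_fields(SAMPLE))
```

A check like this catches gaps early, but only validation against the ESE V3.2 XML schema in the Content Ingestor determines whether a data set is accepted.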
Content Ingestor Select “new data set” - the ingestor automatically creates a new ID – “null05” in this example
Aggregation and the Content Strategy We now turn to various aspects of aggregation in Europeana: the need for it and the approach to it.
Aggregation - terminology • A Content Provider • an organisation that provides metadata that enables access to its digital objects • An Aggregator • collects metadata from a group of content providers • transmits it to Europeana • helps content providers with guidance on conformance with Europeana norms • transforms metadata if necessary • supports the content providers with administration, operations and training
Roles and benefits • Content providers • Know their content and data best – fewer mapping errors • Can review the results before they are ingested into the operational system • Aggregators • Know the needs of the providers (domain, level) • Play a bridging role between providers and Europeana – single point of contact, conduit for information in both directions • Europeana • Supporting role for consultation, co-ordination, standardisation • Management of the 10 million objects • Offers the cross-domain and multi-lingual service
Types of aggregator Matrix of aggregators: • cross-domain, single domain, thematic • level of operation – regional, national, European, global
Why aggregation? • November 2008 – 5 million items in Europeana • July 2009 - content from over 1000 providers • July 2010 – target of 10 million items • Many individual organisations asking to contribute • Currently there are six projects that aggregate content for Europeana (amongst other objectives) • another three projects starting later this year • Europeana Group site at: http://group.europeana.eu/web/guest/home
Why aggregation? • Labour-intensive administration and ingestion processes • Not due to the amount of data – but the number of organisations • Europeana is a small organisation! • Aggregation provides economies of scale allowing Europeana Office to remain relatively small Promoting aggregation and providing services and expertise to aggregators will be key to Europeana’s Content Strategy
Aggregation activities • Aggregators survey • Establish shared issues and need for support • Formation of Aggregators group • Council of Content Providers and Aggregators is now part of Europeana Governance structure • Training for aggregators • Generic and bespoke training days as the need arises • Identifying potential aggregators • “EuropeanaLabs” for Aggregators • Test environment for content delivery and/or software development
Aggregation activities • Handbook for aggregators. Content to be decided as part of the survey but likely to cover: • Europeana source code, APIs, content checker etc. • Technical documentation for participating in Europeana • Templates and documentation for budget planning, fundraising, revenue generation, sustainability • Templates and documentation for administrative and organisational aspects of running an aggregator • Templates and documentation on IPR and the European Licensing framework • Documentation for establishing political and network support • Templates and documentation for dissemination activities • Wiki for aggregator issues
Thank you! robinaclayphan@kb.nl
1 isShownBy
2 isShownAt