The paper discusses the development and implementation of digitometric services for Open Archives environments, presented by Tim Brody, Simon Kampa, Stevan Harnad, Les Carr, and Steve Hitchcock at ECDL 2003 in Trondheim, Norway. It emphasizes the Open Archives Initiative (OAI) and its protocols, promoting interoperability through exposed metadata. The research explores caching techniques, efficient record retrieval, and visualization tools for managing research metadata, contributing to enhanced knowledge mapping and access to open scholarly resources.
Digitometric Services for Open Archives Environments
Tim Brody, Simon Kampa, Stevan Harnad, Les Carr, Steve Hitchcock
{tdb01r,srk,harnad,lac,sh94r}@ecs.soton.ac.uk
University of Southampton, Intelligence, Agents, Multimedia Group
ECDL 2003, Trondheim, Norway
Open Archives Initiative: Promoting interoperability
• The protocol is openly documented, and metadata is “exposed” to at least some peer group (note: rights management can still apply!)
• “Archive” is defined as a “collection of stuff”, not the archivist’s definition of “archive”; “repository” is used in most OAI documents
OAI Data Model: Resources/Items/Records
• resource
• item = OAI identifier; all available (meta)data about the resource
• records (XML): Dublin Core metadata, MARC metadata, ???
• record = metadata + identifier + datestamp
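The item/record relationship above can be sketched as two small data classes. This is only an illustration under stated assumptions: the class names, field names, and the `oai:example.org:1` identifier are hypothetical and are not taken from any OAI software.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Record:
    """record = metadata + identifier + datestamp (in one metadata format)."""
    identifier: str        # OAI identifier of the item this record belongs to
    datestamp: str         # date the record was created or last modified
    metadata_prefix: str   # which format the metadata is in, e.g. "oai_dc"
    metadata: str          # the XML metadata payload


@dataclass
class Item:
    """item = everything known about one resource, under one OAI identifier."""
    identifier: str
    records: Dict[str, Record] = field(default_factory=dict)

    def add(self, record: Record) -> None:
        # A record can only be attached to the item that shares its identifier.
        if record.identifier != self.identifier:
            raise ValueError("record belongs to a different item")
        self.records[record.metadata_prefix] = record

    def formats(self) -> List[str]:
        # The metadata formats in which this item can be disseminated.
        return sorted(self.records)


# Hypothetical example: one item exposed in two metadata formats.
item = Item("oai:example.org:1")
item.add(Record("oai:example.org:1", "2003-04-02", "oai_dc", "<dc/>"))
item.add(Record("oai:example.org:1", "2003-04-02", "marc", "<marc/>"))
```

Keying the records by metadata prefix reflects the slide's point that one item can be disseminated as several records, one per format.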
Protocol Responses
Protocol
• The Service Provider sends HTTP URL requests; the Data Provider returns XML responses
• 1. Identify → collection-level description
• 2. ListRecords?metadataPrefix=xyz → all repository xyz records
• 3. ListRecords?from=2003-04-02&… → all repository xyz records since 2003-04-02
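The three requests above could be built as follows. This is a minimal sketch: the base URL and the helper name `oai_request_url` are hypothetical. In OAI-PMH the command is passed as a `verb` query argument, so the slide's `ListRecords?…` is read here as shorthand for `?verb=ListRecords&…`.

```python
from urllib.parse import urlencode

# Hypothetical data provider base URL (not a real repository).
BASE_URL = "http://repository.example.org/oai"


def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH request URL for the given verb and arguments."""
    params = {"verb": verb}
    for name, value in arguments.items():
        if value is not None:
            # 'from' is a Python keyword, so callers pass it as 'from_'.
            params[name.rstrip("_")] = value
    return base_url + "?" + urlencode(params)


# 1. Collection-level description
identify_url = oai_request_url(BASE_URL, "Identify")
# 2. All repository records in format "xyz"
all_records_url = oai_request_url(BASE_URL, "ListRecords", metadataPrefix="xyz")
# 3. Only records added or changed since 2003-04-02
recent_url = oai_request_url(
    BASE_URL, "ListRecords", metadataPrefix="xyz", from_="2003-04-02"
)
```

The sketch only constructs URLs; fetching them and handling partial responses is left out.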
Other Commands
• ListIdentifiers: return only the identifier/datestamp/set membership
• ListMetadataFormats: return the available data formats
• ListSets: return the set structure (if there is one)
• GetRecord: return a record given by OAI identifier
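As an illustration of what ListIdentifiers returns, the sketch below pulls the identifier, datestamp, and set membership out of a response. The sample XML is hand-written for this example (the repository, identifiers, and set name are made up), and it assumes the OAI-PMH 2.0 XML namespace.

```python
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

# Hand-written sample response; all values are hypothetical.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2003-08-18T12:00:00Z</responseDate>
  <request verb="ListIdentifiers" metadataPrefix="oai_dc">http://repository.example.org/oai</request>
  <ListIdentifiers>
    <header>
      <identifier>oai:example.org:1</identifier>
      <datestamp>2003-04-02</datestamp>
      <setSpec>physics</setSpec>
    </header>
    <header>
      <identifier>oai:example.org:2</identifier>
      <datestamp>2003-05-10</datestamp>
    </header>
  </ListIdentifiers>
</OAI-PMH>"""


def parse_headers(xml_text):
    """Return one dict per record header found in the response."""
    root = ET.fromstring(xml_text)
    headers = []
    for header in root.iterfind(".//oai:header", OAI_NS):
        headers.append({
            "identifier": header.findtext("oai:identifier", namespaces=OAI_NS),
            "datestamp": header.findtext("oai:datestamp", namespaces=OAI_NS),
            # A header may carry zero or more setSpec elements.
            "sets": [s.text for s in header.findall("oai:setSpec", OAI_NS)],
        })
    return headers


headers = parse_headers(SAMPLE_RESPONSE)
```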
Interest in OAI
• 111 registered OAI repositories
• Many unregistered (e.g. all GNU EPrints.org and DSpace archives)
• 4,500,000 public records
• http://arc.cs.odu.edu/
• NSDL project, UK’s JISC Information Environment
• OLAC (language community built on OAI)
Why OAI?
1. Mandated Dublin Core allows the quick establishment of basic services and tools
2. A simple, metadata-neutral protocol allows more interesting possibilities and extensions (without breaking 1.) …
Adding Caching to OAI-PMH
Celestial (OAI Cache)
• Developed to maintain a local metadata copy
• Avoids repeated, large harvests during development
• Provides an abstraction over multiple OAI versions (hence acts as a gateway to older implementations)
• Useful for testing OAI implementations & improving performance
• Using XSLT provides a Web interface to OAI
• Provides redundancy
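The idea of keeping a local metadata copy and avoiding repeated large harvests can be sketched as a small in-memory cache that remembers the latest datestamp it has seen and uses it as the `from` argument of the next harvest. This is not Celestial's implementation; the class and method names are hypothetical, and the sketch assumes all datestamps share one granularity so that string comparison orders them correctly.

```python
class MetadataCache:
    """Toy local copy of harvested records, keyed by OAI identifier."""

    def __init__(self):
        self.records = {}         # identifier -> (datestamp, metadata)
        self.last_harvest = None  # latest datestamp stored so far

    def harvest_arguments(self, metadata_prefix):
        """Arguments for the next ListRecords request (incremental if possible)."""
        args = {"metadataPrefix": metadata_prefix}
        if self.last_harvest is not None:
            args["from"] = self.last_harvest
        return args

    def store(self, harvested):
        """Merge (identifier, datestamp, metadata) tuples; return how many changed."""
        updated = 0
        for identifier, datestamp, metadata in harvested:
            cached = self.records.get(identifier)
            # Only new records or strictly newer versions replace the cached copy.
            if cached is None or datestamp > cached[0]:
                self.records[identifier] = (datestamp, metadata)
                updated += 1
            if self.last_harvest is None or datestamp > self.last_harvest:
                self.last_harvest = datestamp
        return updated


# Hypothetical harvests: a full one, then an incremental one.
cache = MetadataCache()
first_args = cache.harvest_arguments("oai_dc")
first = cache.store([
    ("oai:example.org:1", "2003-04-02", "<dc/>"),
    ("oai:example.org:2", "2003-05-10", "<dc/>"),
])
second_args = cache.harvest_arguments("oai_dc")
second = cache.store([
    ("oai:example.org:2", "2003-05-10", "<dc/>"),     # unchanged, seen again
    ("oai:example.org:1", "2003-06-01", "<dc v2/>"),  # modified record
])
```

After the first harvest, only records changed since the stored datestamp need to be requested, which is what keeps later harvests small.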
Citebase Search – Data Model e-Services
Content
• 250,000 full-text resources
• 240,000 of which are from arXiv.org
• 6 million references
• 29 mean refs/paper (therefore failed to extract references for 18% of papers)
• (n.b. modal refs is 19)
• 1 million references linked internally to the full-text (15%)
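The reasoning behind the extraction-failure figure can be reproduced from the rounded numbers on the slide. This sketch assumes that the mean of 29 references is taken over papers whose references were successfully extracted; the slide does not state this. Because the inputs are rounded, the results land near, not exactly on, the slide's percentages.

```python
# Rounded figures taken from the slide.
papers = 250_000
references = 6_000_000
mean_refs_per_paper = 29          # assumed: mean over papers with extracted references
linked_references = 1_000_000

# If every paper with extracted references has 29 on average,
# this many papers account for all 6 million references.
papers_with_references = references / mean_refs_per_paper

# The remaining papers are those where reference extraction failed.
failed_fraction = 1 - papers_with_references / papers

# Share of references that resolve to a full text held internally.
linked_fraction = linked_references / references

print(f"papers with references: {papers_with_references:,.0f}")
print(f"extraction failed for:  {failed_fraction:.1%}")
print(f"internally linked refs: {linked_fraction:.1%}")
```

With these inputs the failure rate comes out at roughly 17% and the linked share at roughly 17%, compared with the slide's 18% and 15%; the difference is presumably due to the slide's percentages being computed from exact counts that are not given here.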
Citebase Search
Citebase Search: Navigation by Citation Links
• Current article, with its reference list and reference links
• Past: articles the current article cites
• Future: articles that cite the current article
• Related: articles co-cited with the current article
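The three navigation directions can be derived from a single "cites" relation. The sketch below is an illustration only: the function name and the toy citation data are hypothetical, and it does not describe how Citebase stores or queries its links.

```python
def citation_links(cites, article):
    """Return the past, future, and co-cited neighbours of one article.

    cites maps each article to the set of articles in its reference list.
    """
    # Past: what this article cites.
    past = set(cites.get(article, ()))
    # Future: articles whose reference lists include this article.
    future = {a for a, refs in cites.items() if article in refs}
    # Related: articles that appear alongside this one in some reference list.
    co_cited = set()
    for refs in cites.values():
        if article in refs:
            co_cited.update(refs)
    co_cited.discard(article)
    return {"past": past, "future": future, "co_cited": co_cited}


# Toy citation data (hypothetical article labels).
cites = {
    "A": {"B", "C"},
    "B": {"C"},
    "D": {"B", "E"},
}
links_for_b = citation_links(cites, "B")
```

For article B in this toy data, the past is C, the future is A and D, and the co-cited articles are C and E.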
Citebase Search: “cites” links
Citebase Search: “Co-cited”
Read/Cite Cycle
Digitometric Services for OAI
• Tools for visualising research metadata
• Builds an analysis service on Citebase
• Knowledge mapping (co-authors, co-citation, etc.)
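Co-author and co-citation maps can both be viewed as co-occurrence networks: two nodes are linked whenever they appear in the same author list or the same reference list, and the link weight counts how often that happens. The following is a minimal sketch of that counting step; the function name and the sample reference lists are hypothetical, and it makes no claim about how the presented tools build their maps.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_network(groups):
    """Count how often each pair of members appears together in a group.

    groups is an iterable of author lists (co-authorship) or
    reference lists (co-citation). Returns {(a, b): weight} with a < b.
    """
    weights = Counter()
    for group in groups:
        # De-duplicate and sort so each pair has one canonical key.
        for a, b in combinations(sorted(set(group)), 2):
            weights[(a, b)] += 1
    return weights


# Hypothetical reference lists of three citing papers.
reference_lists = [
    ["p1", "p2", "p3"],
    ["p2", "p3"],
    ["p1", "p4"],
]
co_citation = cooccurrence_network(reference_lists)
```

Here `p2` and `p3` are co-cited twice, so their edge carries weight 2; passing author lists instead of reference lists yields a co-authorship network with the same code.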
Co-Citation Network
Full Co-Citation Map
Digitometric Services for Open Archives Environments
• http://www.openarchives.org/
• http://opcit.eprints.org/
• http://citebase.eprints.org/
• http://www.eprints.org/
• http://www.hyphen.info/
• AKT Project (knowledge)
Thank you for listening!
Tim Brody