PSIgate Knowledge Exchange: Using OAI to Share Information Paul Meehan, PSIgate Technical Manager UKSG Meeting. May 14, 2003
Subjects to Cover
• Google, PSIgate and the RDN
• Database Services and Partners
• Why OAI?
• PSIgate Information Exchange via OAI
• OAI Pros and Cons
• Future Developments
The Trouble with Google
• It’s too big! e.g.
  copper – 5.15 million hits
  copper + element – 518,000
  copper + element + properties – 163,000
• Relevancy and ranking: only 7 out of the first 25 “copper” hits refer to the metal
But as a starting point it’s still number 1!
This is what happens …
• Lost in (hyper)space!
• You lose focus
• You waste time
• You miss information
• FRUSTRATION!
The RDN
• The Resource Discovery Network (RDN) is a collaboration of over seventy educational and research organisations, including the Natural History Museum and the British Library
• It is a free service, funded by JISC
• It is divided into a number of subject gateways or “hubs”, each of which selects and catalogues Internet resources for learning, teaching and research. Each hub provides search and browse interfaces to its records
• The RDN additionally provides a central database of records
• Subject gateways may also provide additional services
The RDN hubs
• EEVL: Engineering
• PSIgate: Physical sciences
• BIOME: Biomedical sciences
• Humbul: Humanities
• SOSIG: Social sciences
PSIgate
• PSIgate is the physical sciences gateway of the RDN
• Based in Manchester; team of 4
• We offer a number of database services including:
  • Internet Resource Catalogue (astronomy, chemistry, earth sciences, materials, physics, history and policy of science)
  • Expanded “Web Catalogue”
  • “additional services”
• We work with a number of professional bodies and organisations e.g. RSChem, IoP, BGS
PSIgate vs Google
• Catalogued by subject specialists
• Should all be relevant!
* IRC contains 7,600 resources, Web Catalogue ca. 80,000, as of May 12, 2003
PSIgate – Additional Services
• News channels
• Spotlight science magazine
• Reference data
• Science timelines
• Plug-ins
• Training materials
• Featured sites and topics
• Databases provided by external sources
• Partnership with LTSNs
… all of these require different technologies … including OAI!
Why OAI?
• The Open Archives Initiative (OAI) “develops and promotes interoperability standards that aim to facilitate the efficient dissemination of content”
• We should aim to use OAI because:
  • it promotes interoperability
  • it provides standard tools
  • it is free!
  • it is supported by an increasing number of data providers
PSIgate Information Exchange
• We currently use OAI for three distinct purposes:
  • Exposure of our records for gathering by the RDN
  • Harvesting of records from the IoP
  • Exposure of records to our LTSN partners
Each of these services has its own quirks and methods!
Using OAI to Share Records with the RDN
The RDN gathers resources from all the hubs to provide a central “ResourceFinder” service.
We store our records in a MySQL database (1), and convert them to “OAI format” on a regular basis (2). The RDN gathers the records (3,4) and stores them in a Cheshire database (5), which is then cross-searched via Z39.50.
[Diagram: (1) PSIgate Database (MySQL) → (2) weekly conversion → PSIgate Records (OAI) → (3) request, (4) retrieval of resources → RDN repository (OAI) → (5) stored in Cheshire database, searched using Z39.50]
Step 1 – Record Conversion
First we need to create a series of OAI-compliant records from our MySQL database. This is done by running a simple script that writes the records out in XML format. The process runs automatically once per week. A typical record contains the following core fields …
IDENTIFIER - 1: <identifier>oai:psigate.ac.uk:1033748964-226</identifier>
TITLE: <dc:title>Aroma Chemistry</dc:title>
DESCRIPTION: <dc:description>A small set of overheads to accompany a lecture on aroma chemistry, by Terry Acree of Cornell University. The overheads provide theoretical summaries relating to: volatility, activity (Raoult's law), potency, odour units, complexity and mixtures.</dc:description>
RESOURCE TYPE: <dc:type>Lecture notes and courses</dc:type>
KEYWORD(S): <dc:subject>smells</dc:subject> <dc:subject>odours</dc:subject>
IDENTIFIER - 2: <dc:identifier>http://www.nysaes.cornell.edu/fst/faculty/acree/fs430/lectures/tea18olfaction.html</dc:identifier>
PUBLISHER: <dc:publisher>www.psigate.ac.uk</dc:publisher>
LANGUAGE: <dc:language>en</dc:language>
RIGHTS: <dc:rights>This metadata record is for use by RDN partners only.</dc:rights>
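As a rough illustration of the conversion step, the sketch below builds the Dublin Core part of one record, like the one above, from a single database row. The function name row_to_oai_dc and the column names (title, description, resource_type, keywords, url, rights) are assumptions made for this example; the actual PSIgate script and MySQL schema are not shown in the slides.

```python
import xml.etree.ElementTree as ET

# Standard namespaces for unqualified Dublin Core in OAI-PMH 2.0.
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("oai_dc", OAI_DC_NS)
ET.register_namespace("dc", DC_NS)


def row_to_oai_dc(row):
    """Turn one catalogue row (a dict with hypothetical keys) into an oai_dc element."""
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")

    def add(tag, value):
        ET.SubElement(root, f"{{{DC_NS}}}{tag}").text = value

    add("title", row["title"])
    add("description", row["description"])
    add("type", row["resource_type"])
    for keyword in row["keywords"]:  # one dc:subject per keyword
        add("subject", keyword)
    add("identifier", row["url"])
    add("publisher", "www.psigate.ac.uk")
    add("language", row.get("language", "en"))
    add("rights", row["rights"])
    return root


# Example row based on the record shown on the slide (description abbreviated).
example_row = {
    "title": "Aroma Chemistry",
    "description": "A small set of overheads to accompany a lecture on aroma chemistry.",
    "resource_type": "Lecture notes and courses",
    "keywords": ["smells", "odours"],
    "url": "http://www.nysaes.cornell.edu/fst/faculty/acree/fs430/lectures/tea18olfaction.html",
    "rights": "This metadata record is for use by RDN partners only.",
}

record_xml = ET.tostring(row_to_oai_dc(example_row), encoding="unicode")
print(record_xml)
```

In a full OAI-PMH response this element would sit inside a record's metadata section, with the oai:psigate.ac.uk:… identifier carried in the record header.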
Key points
• we create a bulk OAI repository once per week
• the RDN is granted access to this repository
• the RDN harvests this metadata once per week
• we use the latest version (2.0) of the OAI-PMH
OAI is efficient for this because:
• the records are easy to generate
• the process is automated
• the RDN handles the retrieval and subsequent storage
The PSIgate database currently stands at around 7,600 resources. The central ResourceFinder service stores more than 50,000 records.
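A harvester asks an OAI-PMH 2.0 repository for records with a ListRecords request. The sketch below only builds the request URL and does not contact any server. The base URL http://example.org/oai is a placeholder, not PSIgate's real endpoint, and the one-week "from" window is an assumption based on the weekly schedule described above.

```python
from datetime import date, timedelta
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", since=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL.

    In OAI-PMH 2.0 a resumptionToken is an exclusive argument: when it is
    supplied, only the verb and the token are sent.
    """
    if resumption_token is not None:
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if since is not None:
            params["from"] = since.isoformat()  # day granularity, YYYY-MM-DD
    return f"{base_url}?{urlencode(params)}"


BASE_URL = "http://example.org/oai"  # placeholder endpoint (assumption)
harvest_date = date(2003, 5, 12)     # illustrative harvest date
url = list_records_url(BASE_URL, since=harvest_date - timedelta(days=7))
print(url)
```

Passing a "from" date limits the harvest to records changed since the previous run; leaving it out requests the whole repository.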
The IoP Select Service
• The Institute of Physics Publishing offers a service known as “IOP Select”.
• “A FREE service from IoP Journals comprising articles chosen by our Editors for their novelty, significance and potential impact on future research”
  • free
  • updated weekly
  • normally requires username and password; direct access via PSIgate
Using OAI with IoP Select
• The IoP allow us to gather their “select” resources via OAI
• we use a harvesting tool to gather records on a weekly basis (1,2)
• we extract relevant fields from each record e.g. title, abstract, URL (3)
• we create search and browse interfaces to the records (4,5)
[Diagram: IoP Repository (OAI) → (1) request, (2) retrieval → PSIgate repository (OAI) → (3) convert → Key fields (text file) → (4) Search interface, (5) Browse interface]
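The field-extraction step (3) could look roughly like the sketch below, which pulls title, abstract and URL out of a ListRecords response and writes them as tab-separated lines. The sample response is invented placeholder data, not a real IoP record, and mapping the abstract to dc:description and the URL to dc:identifier is an assumption about how the harvested records are encoded.

```python
import xml.etree.ElementTree as ET

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented placeholder response, trimmed to the parts the parser reads.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A placeholder paper title</dc:title>
          <dc:description>A placeholder abstract.</dc:description>
          <dc:identifier>http://example.org/paper/1</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def first_text(record, tag):
    """Return the stripped text of the first dc:<tag> in a record, or ''."""
    element = record.find(f".//dc:{tag}", NS)
    return element.text.strip() if element is not None and element.text else ""


def extract_key_fields(response_xml):
    """Return a list of (title, abstract, url) tuples, one per harvested record."""
    root = ET.fromstring(response_xml)
    return [
        (first_text(r, "title"), first_text(r, "description"), first_text(r, "identifier"))
        for r in root.iterfind(".//oai:record", NS)
    ]


rows = extract_key_fields(SAMPLE_RESPONSE)
lines = ["\t".join(row) for row in rows]  # one line per record for the text file
print("\n".join(lines))
```

Keeping only a few fields in a flat text file keeps storage small, which suits simple search and browse pages built on top of it.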
IoP Select via PSIgate: browse all papers, or search by keyword
Option 1 – View all papers
Option 2 – Keyword search
Our experience of using OAI with IOP Select
Pros:
• gathering is simple, using standard OAI-PMH conventions
• fast (c. 5 minutes to search and retrieve records from large database)
• we store records so we can adapt and create search interfaces easily
• maintenance of records occurs at IoP end
• building of search interface
Cons:
• record storage uses disk space (not significant at present, < 1MB)
• we need to convert and store records in searchable format
• building of search interface (!)
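To give a sense of what a searchable format involves, here is a minimal keyword search over stored records. It is a sketch only: the record fields and sample data are hypothetical, and the slides do not say how the PSIgate search interface is implemented.

```python
def keyword_search(records, query):
    """Return records whose title or abstract contains every query term.

    Matching is case-insensitive substring matching. An empty query matches
    everything, which doubles as a simple "view all papers" listing.
    """
    terms = query.lower().split()
    hits = []
    for record in records:
        haystack = f"{record['title']} {record['abstract']}".lower()
        if all(term in haystack for term in terms):
            hits.append(record)
    return hits


# Hypothetical stored records (title, abstract, URL), as extracted at harvest time.
stored_records = [
    {"title": "Quantum dots in silicon", "abstract": "Transport measurements.", "url": "http://example.org/1"},
    {"title": "Plasma turbulence", "abstract": "A study of silicon etching plasmas.", "url": "http://example.org/2"},
    {"title": "Neutron stars", "abstract": "Cooling curves.", "url": "http://example.org/3"},
]

silicon_hits = keyword_search(stored_records, "silicon")
all_papers = keyword_search(stored_records, "")
print([hit["title"] for hit in silicon_hits])
```

A linear scan like this is adequate for a store of under 1 MB; a larger collection would call for an index or a database query instead.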
Using OAI to Share Records with the LTSNs
We are engaged in projects with four LTSN partners:
• LTSN Physical Sciences
• UK Centre for Materials Education
• LTSN Geography, Earth and Environmental Science (GEES)
• LTSN Built Environment
OAI allows us to expose our metadata records to these partners … work to be undertaken Summer 2003
Future Work
• LTSN partnerships
• Other OAI data providers may offer useful data
• Additional work with professional bodies and the IoP
• Sharing records with RDN partners and hubs
• GEsource
Conclusions
• The Open Archives Initiative is already offering PSIgate a significant means of collaborating with external sources:
  • fast
  • fairly simple methodology
  • gaining increasingly widespread usage
  • harvesting processes can easily be automated
  • some work to be done with storage and searching