
The Natural History Museum http://www.nhm.ac.uk

Speaker: Charles Hussey, Science Data Co-ordinator, Department of Information and Library Systems, c.hussey@nhm.ac.uk. The Trustees of The Natural History Museum, 2002. Data Access - challenges and opportunities.


Presentation Transcript


  1. The Natural History Museum http://www.nhm.ac.uk
     Speaker: Charles Hussey, Science Data Co-ordinator, Department of Information and Library Systems, c.hussey@nhm.ac.uk
     The Trustees of The Natural History Museum, 2002

  2. Data Access - challenges and opportunities
     Move towards networks connecting distributed sources
     Two components to this presentation:
     Start by drawing upon work for the European Natural History Specimen Information Network (ENHSIN) (personal view of what is achievable)
     Then look at some of the approaches we have taken within The NHM

  3. Acknowledgements
     Nicolas Bailly, MNHN Paris, ENHSIN
     David Gee, originator of DSML
     Dilshat Hewzullah, NHM, DSML & Querying distributed databases
     Anne Hume, NHM, Online databases and DSML
     Andrew Jones, University of Cardiff, SPICE for Species 2000
     Mike Lowndes, NHM, Museum Information Locator System
     Rachel Perkins, NHM, Collections Level Descriptions
     Mike Sadka, NHM, Fast-track programme
     Darrell Siebert, NHM, Fish Collection Database
     Chris Sleep, NHM, DSML
     Neil Thomson, NHM, BioCASE

  4. Nature of Data
     What do we have to deal with?
     First Challenge: Integrating disparate sources
     NHM Survey in 2000: 87 institutions responded; 33 different products; 40% using bespoke solutions; 5 using spreadsheets
     BioCISE Survey in 1998/99: 292 institutions responded; 60 different products; 75% using bespoke solutions; only 8% providing web access to unit level data

  5. Nature of Data
     First Challenge: Integrating disparate sources
     Do data providers have the means to:
     • Implement and maintain a local Internet server providing 24-hour-a-day access?
     • Compile metadata (collections level or unit level)?
     • Supply additional data (such as resolving localities or providing elements of higher taxonomy)?
     • Maintain quality of datasets?
     • Construct views of their data or implement wrappers?
     • Handle version control?

  6. Nature of Data
     Second Challenge: Comparing like with like
     • Authorities for names
     • Personal names
     • Geographic co-ordinates
     • Place names
     • Language and spelling
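To illustrate why personal names and spelling are hard to compare across sources, here is a minimal sketch of a crude matching key. It is not taken from the presentation: the function name `match_key` and the normalisation rules (drop accents, case, spacing and punctuation) are assumptions chosen for illustration, and a real system would need authority files as well.

```python
import re
import unicodedata


def match_key(name: str) -> str:
    """Build a crude comparison key for a personal or place name.

    Hypothetical helper: strips diacritics, case, whitespace and
    punctuation so that differently formatted spellings can be compared.
    """
    # Decompose accented characters into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop the combining marks (accents, umlauts, cedillas, ...).
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # Case-insensitive comparison.
    folded = without_marks.casefold()
    # Remove everything that is not a basic letter or digit.
    return re.sub(r"[^a-z0-9]+", "", folded)


# Two renderings of the same collector name produce the same key.
same = match_key("Müller,  O.F.") == match_key("MULLER, O. F.")
```

A key like this only catches formatting differences. It would not reconcile alternative transliterations (for example "Mueller"), synonyms, or historical place names, which is where curated authority lists come in.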

  7. Architectures
     1. Single client/server database used by all providers and users
     2. Central summary system
     3. Central Gateway to distributed databases
     4. Peer-to-peer databases
     5. Web directory pointing to data sources

  8. Architectures
     1. Single client/server database used by all providers and users
     Single database, subscribers have local client
     Allows detailed and complex interaction with data
     Example: NHM Palaeontology Collections Management System
     Example: Packages for Observers – Recorder 2000, MapMate

  9. Architectures
     2. Central summary system
     Contributors maintain their own systems and post copies of data to a centrally maintained database
     Example: NBN Species Dictionary

  10. Architectures
     3. Central Gateway to distributed databases
     No central database
     …but “Common Access System” may store metadata
     Example: Species 2000
     Example: Biodiversity on the Web
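The gateway architecture can be sketched roughly as below. This is an illustrative sketch only, not the design of Species 2000 or Biodiversity on the Web: the `Source` and `Gateway` classes, the `covers` metadata field and the in-memory `search` callables are all hypothetical stand-ins for remote database wrappers.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set

Record = Dict[str, str]


@dataclass
class Source:
    """One distributed database, as seen by the gateway (hypothetical)."""
    name: str
    covers: Set[str]                       # metadata held centrally
    search: Callable[[str], List[Record]]  # wrapper around the remote query


@dataclass
class Gateway:
    """Holds only metadata about sources; specimen data stays remote."""
    sources: List[Source] = field(default_factory=list)

    def register(self, source: Source) -> None:
        self.sources.append(source)

    def query(self, group: str, name: str) -> List[Record]:
        results: List[Record] = []
        for source in self.sources:
            # Use the stored metadata to skip irrelevant sources.
            if group not in source.covers:
                continue
            try:
                hits = source.search(name)
            except Exception:
                # One unavailable provider should not fail the whole query.
                continue
            # Tag each record with its origin before merging.
            results.extend({**hit, "source": source.name} for hit in hits)
        return results
```

The point of the design is that providers keep control of their own data, while the gateway needs only enough metadata to decide which sources to ask.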

  11. Biodiversity on the Web Selection of Searchable Databases

  12. Architectures
     4. Peer-to-peer databases
     Multiple Z39.50 servers and clients
     Example: Species Analyst
     Example: AHDS

  13. Architectures
     5. Web directory pointing to data sources
     Essentially, a portal
     Example: BIODIV

  14. Other Issues
     • Scalability
     • Sustainability
     • Access
     • Quality Control
     Terminology Control
     “Gaps” in data:
     Still parts of collection not yet databased
     Collection not suitable for databasing at unit level
     Inadequate data dictionary
     Data not available for a specimen
     Data needs interpretation
     Indicators for Quality

  15. A Case in Point: Wrapping a dataset for ENHSIN Pilot
     • Copy table from Access to SQL Server
     • Restructure table to add “new” fields
     • Perform conversions:
       • Place = Waterbody + Locality (verbatim) + Site Ref.
       • Split Collection date to DAY, MONTH, YEAR
       • Convert Lat & Long to decimal degrees
       • Convert Altitude to metres and deal with altitude ranges
       • Shape = Material + "(" + Preservation Method + ")"
       • Collector = Collector Surname + Initials + Title
       • Determiner = Determiner Surname + Initials + Title
     • Populate blank fields with static data by creating view (e.g. for Kingdom, Collection Name, Contact Info.)
     • Delete fields not required after conversion
     • Rename fields to match ENHSIN element names
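The conversions listed on this slide could be sketched along the following lines. This is a hedged illustration, not the code used for the ENHSIN pilot (which the slide describes as done in SQL Server): the function names, the input formats (degrees/minutes/seconds with a hemisphere letter, altitude strings such as "100-200 ft") and the separators are all assumptions.

```python
import re
from datetime import date
from typing import Dict, Tuple

FEET_TO_METRES = 0.3048


def split_date(collected: date) -> Dict[str, int]:
    """Split a collection date into DAY, MONTH and YEAR fields."""
    return {"DAY": collected.day, "MONTH": collected.month, "YEAR": collected.year}


def dms_to_decimal(degrees: float, minutes: float, seconds: float,
                   hemisphere: str) -> float:
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South and West are negative by convention.
    return -value if hemisphere.upper() in ("S", "W") else value


def altitude_to_metres(text: str) -> Tuple[float, float]:
    """Parse '500 ft', '100-200 ft' or '150 m' into (min, max) metres.

    The accepted input format is an assumption for this sketch.
    """
    match = re.fullmatch(
        r"\s*(\d+(?:\.\d+)?)\s*(?:-\s*(\d+(?:\.\d+)?))?\s*(ft|m)\s*", text
    )
    if match is None:
        raise ValueError(f"unrecognised altitude: {text!r}")
    low = float(match.group(1))
    # A single value is treated as a range with equal bounds.
    high = float(match.group(2)) if match.group(2) else low
    factor = FEET_TO_METRES if match.group(3) == "ft" else 1.0
    return (round(low * factor, 1), round(high * factor, 1))


def join_nonblank(*parts: str, sep: str = ", ") -> str:
    """Concatenate fields, skipping blanks (e.g. Waterbody + Locality + Site Ref.)."""
    return sep.join(p.strip() for p in parts if p and p.strip())


def shape(material: str, preservation: str) -> str:
    """Shape = Material + "(" + Preservation Method + ")"."""
    return f"{material} ({preservation})"
```

For example, `join_nonblank(waterbody, locality, site_ref)` would build the Place field and `join_nonblank(surname, initials, title)` the Collector and Determiner fields; the comma separator is a choice made here, as the slide does not specify one.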

  16. NHM Initiatives
     • Imaging of Primary Sources
       • Zoology Accession Ledgers
       • Entomology Card indexes (VIADOCS project)
     • Rapid Data Entry
       • Fish Collection
       • Botany Pilot
     • Collections Level Description
       • Darwin Centre
       • Entomology Index to Collections
     • Integrated Access
       • Data Locator

  17. Links
     ENHSIN: http://www.nhm.ac.uk/science/rco/enhsin/index.html
     SPICE Project: http://www.systematics.reading.ac.uk/spice
     Biodiversity on the Web: http://www.biodiversity.org.uk/ibs/
     Species Analyst: http://habanero.nhm.ukans.edu
     NBN Species Dictionary: http://yaw.nhm.ac.uk/nhm/
     AHDS Gateway: http://prospero.ahds.ac.uk:8080/ahds_live/
     BIODIV: http://www.br.fgov.be/biodiv/
     NHM Collection Level Descriptions: http://www.nhm.ac.uk/cld/index.shtml
     NHM Data Locator: http://internt.nhm.ac.uk/cgi-bin/locator/
     Online databases at NHM: http://www.nhm.ac.uk/science/projects.html
