GLOBAL BIODIVERSITY INFORMATION FACILITY

GBIF efforts in digitizing and mobilising primary biodiversity data. Vishwas Chavan and Nicholas King, February 12, 2008. vchavan@gbif.org. WWW.GBIF.ORG.

Presentation Transcript


  1. GLOBAL BIODIVERSITY INFORMATION FACILITY. GBIF efforts in digitizing and mobilising primary biodiversity data. Vishwas Chavan and Nicholas King, February 12, 2008. vchavan@gbif.org. WWW.GBIF.ORG

  2. GBIF’s Mission …to make the world’s biodiversity data freely and universally available via the Internet What is biodiversity? • GBIF follows the broadly outlined CBD recognition of levels of biological diversity: • Molecules / genes • Species • Ecosystems / ecology

  3. Who needs primary biodiversity data? • Scientists, experts, consultants • Government officials at all levels • Farmers, foresters, indigenous communities • Education at all levels • NGOs and the general public. These needs are highly varied, but can be met by open access to the same datasets. The same data can be analysed differently for different uses.

  4. But this needs easy access to (digitised) data. As of the end of 2007, GBIF facilitates access to 142 million primary data records. Screen shot: 26 Oct 2007

  5. GBIF Data Portal: Dispelling Myths! • Searches • Taxonomic • Geographic (by country or bounding-box) • By dataset • Taxonomic browse navigation using choice of classification • Integration of data: DiGIR-Darwin Core & BioCASe-ABCD (new versions), TAPIR, tab-delimited, TCS, SDD • Search and download by one to many species, geography, dataset (or combination) • Web services. Distributed, Decentralised, Data Discovery and Access through a network of heterogeneous and multicultural partners is possible!
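The geographic search by bounding box listed above can be illustrated with a minimal sketch. This is not the portal's implementation: the `Occurrence` record type, its field names, and the function names are assumptions made for illustration, and the simple comparison does not handle boxes that cross the 180° meridian.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Occurrence:
    # Hypothetical record layout; field names are illustrative only.
    scientific_name: str
    country: str
    latitude: float
    longitude: float


def in_bounding_box(rec: Occurrence, min_lat: float, min_lon: float,
                    max_lat: float, max_lon: float) -> bool:
    """Return True if the record's coordinates fall inside the box (inclusive)."""
    return (min_lat <= rec.latitude <= max_lat
            and min_lon <= rec.longitude <= max_lon)


def search_by_bounding_box(records: List[Occurrence], min_lat: float,
                           min_lon: float, max_lat: float,
                           max_lon: float) -> List[Occurrence]:
    """Keep only the records located inside the bounding box."""
    return [r for r in records
            if in_bounding_box(r, min_lat, min_lon, max_lat, max_lon)]


# Made-up sample records for demonstration.
sample_records = [
    Occurrence("Protea cynaroides", "South Africa", -33.9, 18.4),
    Occurrence("Panthera leo", "Kenya", -1.3, 36.8),
]
hits = search_by_bounding_box(sample_records, min_lat=-35.0, min_lon=16.0,
                              max_lat=-22.0, max_lon=33.0)
```

A search by country or by dataset would follow the same pattern, replacing the coordinate test with an equality test on the relevant field.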

  6. Countries are listed alphabetically on the left-hand side, with the number of national records shown on the right-hand side. Here we can see that there are more than 3.2 million records available for South Africa (2.8 million with coordinates), referring to nearly 41,500 species.

  7. Example of a country summary page. This map provides an overview of the density of records currently available.

  8. Sample of records available for South Africa as of September 2007. The GBIF portal offers a range of options for further use of the data…

  9. It is also possible to get the full list of organisations providing data collected in a specific country or region. In this case, 68 collections from all over the world are making data for South Africa available through GBIF, a good example of the data repatriation activities promoted and facilitated by GBIF.

  10. South African institutions are also providing data relevant to other countries and regions in the world, as demonstrated in this example from the Shark Collection at the Iziko South African Museum

  11. The GBIF data portal also allows for more detailed views of regions, datasets, taxonomic groups, etc. Here it is possible to see nearly 100 000 records from the Linefish dataset collected in 1989 by the Marine and Coastal Management (MCM) at the Department of Environmental Affairs and Tourism in South Africa

  12. Exporting data from the GBIF data portal to other applications such as Google Earth is a matter of a click!
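To give a sense of what such an export involves, here is a minimal sketch that writes occurrence points as KML, the format Google Earth reads. It is an illustration under stated assumptions, not the portal's export code: the input is assumed to be `(name, latitude, longitude)` tuples, and `records_to_kml` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

KML_NS = "http://www.opengis.net/kml/2.2"


def records_to_kml(records):
    """Build a KML document with one Placemark per (name, lat, lon) tuple."""
    ET.register_namespace("", KML_NS)  # serialise without a namespace prefix
    kml = ET.Element(f"{{{KML_NS}}}kml")
    doc = ET.SubElement(kml, f"{{{KML_NS}}}Document")
    for name, lat, lon in records:
        placemark = ET.SubElement(doc, f"{{{KML_NS}}}Placemark")
        ET.SubElement(placemark, f"{{{KML_NS}}}name").text = name
        point = ET.SubElement(placemark, f"{{{KML_NS}}}Point")
        # KML expects longitude first, then latitude.
        ET.SubElement(point, f"{{{KML_NS}}}coordinates").text = f"{lon},{lat}"
    return ET.tostring(kml, encoding="unicode")


# Made-up sample point for demonstration.
kml_text = records_to_kml([("Protea cynaroides", -33.92, 18.42)])
```

Saving `kml_text` to a file with a `.kml` extension should produce something Google Earth can open; note the longitude-before-latitude order, which is a common source of misplaced points.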

  13. Coverage for Africa • >5m records currently for Africa • > 1m from EU country institutions • Estimated >100m not yet digitised

  14. Within Google Earth overlays it is also possible to go down to the level of individual primary records, getting back to the original data provider

  15. With the filter functionality it is possible to perform complex queries on the data. In this example we are looking for all records on Lepidoptera (butterflies and moths) collected or observed in South Africa from 1950 to 2000.
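The combined filter in this example (taxon, country, and year range) can be sketched as follows. This is only an illustration of the query logic: the dictionary keys (`order`, `country`, `year`) and the function name `filter_records` are assumptions, not the portal's actual field names or API.

```python
def filter_records(records, order=None, country=None,
                   year_from=None, year_to=None):
    """Return records matching every criterion that is not None."""
    matches = []
    for rec in records:
        if order is not None and rec.get("order") != order:
            continue
        if country is not None and rec.get("country") != country:
            continue
        year = rec.get("year")
        if year_from is not None or year_to is not None:
            # Records without a year cannot satisfy a date filter.
            if year is None:
                continue
            if year_from is not None and year < year_from:
                continue
            if year_to is not None and year > year_to:
                continue
        matches.append(rec)
    return matches


# Made-up sample records for demonstration.
occurrences = [
    {"name": "Danaus chrysippus", "order": "Lepidoptera",
     "country": "South Africa", "year": 1975},
    {"name": "Papilio demodocus", "order": "Lepidoptera",
     "country": "South Africa", "year": 1932},
    {"name": "Apis mellifera", "order": "Hymenoptera",
     "country": "South Africa", "year": 1980},
    {"name": "Vanessa cardui", "order": "Lepidoptera",
     "country": "Kenya", "year": 1990},
]
lepidoptera_za = filter_records(occurrences, order="Lepidoptera",
                                country="South Africa",
                                year_from=1950, year_to=2000)
```

Leaving a criterion as `None` skips it, so the same function covers single-criterion searches as well as combinations.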

  16. Range changes due to Climate Change Proteaceae in the Cape Floral Kingdom

  17. Leucospermum tomentosum: range centres in 10-year time slices
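The slide does not say how the range centres were computed. One simple way to approximate the idea is to group records by decade and average their coordinates, as in the sketch below. Treat this as an assumption: a plain mean of latitude and longitude is only a rough centre, reasonable for a small region, and the function name and input layout are hypothetical.

```python
from collections import defaultdict


def range_centres_by_decade(records):
    """Map each decade start year to the mean (lat, lon) of its records.

    `records` is an iterable of (year, latitude, longitude) tuples.
    """
    grouped = defaultdict(list)
    for year, lat, lon in records:
        decade = (year // 10) * 10  # e.g. 1957 falls in the 1950 slice
        grouped[decade].append((lat, lon))

    centres = {}
    for decade, points in sorted(grouped.items()):
        mean_lat = sum(p[0] for p in points) / len(points)
        mean_lon = sum(p[1] for p in points) / len(points)
        centres[decade] = (mean_lat, mean_lon)
    return centres


# Made-up coordinates for demonstration, not real Leucospermum records.
centres = range_centres_by_decade([
    (1952, -33.0, 18.0),
    (1958, -34.0, 19.0),
    (1961, -32.0, 18.5),
])
```

Comparing the centres of successive decades gives a crude indication of whether, and in which direction, the recorded range has shifted.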

  18. But this is just a beginning. We need to cover far more than we imagine, and much faster than we think.

  19. Biological Data Domain - challenges (columns: Sub-domain | Digital Status | Data Status | Greatest Informatics Problems) • Molecular Sequence & Gene/Genome Data | 95% digital | Persistent digital, universally accessible data stores | Data migration, cleansing, vouchering, taxonomy (gene & species) • Species- & Specimen Data | <5% digital | Persistent physical data stores, accessible with difficulty | Digitisation, migration of legacy data, indexing • Ecological & Ecosystem Data | 80% ? digital | Persistent digital and physical data stores, moderately accessible | Migration of legacy data, metadata generation, taxonomy (species)

  20. Primary Biodiversity Data

  21. Primary Biodiversity Data • Observations / Monitoring • Biological Collections • Names • Multimedia Resources

  22. Growth rate of GBIF data sharing

  23. 1 Billion Records by 2008: we need to expedite! • Many specimens remain to have their data digitised • Many records are already digital... • … but are not yet being shared * data useful in analyses that contribute to sustainable management of biodiversity

  24. GBIF is all about our shared vision and partnership • 28 Voting Country Participants • 15 Associate Country Participants • 35 International Organisations and Economies

  25. GBIF Working Principles • Collaboration and sharing — not compilation • Ownership of data (specimens or names) remains entirely with providers • Standardised schemata for data sharing — software free to providers • Worldwide network of collaborating institutions that share data (data providers) • GBIF’s Participants’ Nodes promote and coordinate activities of data providers

  26. GBIF Working Principles • Procedures for interoperability and data integration • Web services (mostly for machines, but for people too) • Global registry for advertisement of shared data • Vision and coordination • GBIF has a unique global mandate in both Informatics and Content • GBIF is a multi-purpose, open-ended cyber-infrastructure that facilitates biologists serving biodiversity and society in new ways

  27. GBIF Strategic Areas 2007 – 2011 • Informatics • Data portal powerful and friendly • Consolidated infrastructure and standards • Tools and support for Nodes and providers • Content • Data quantity and richness in priority areas • Data integration and discovery • Documented data quality • Participation • Nodes' expertise shared across the network • Guidance on setting up and maintaining Nodes

  28. Data: Fitness for Use • In a database, the data have no actual quality or value; they only have potential value. That value is realized only when someone uses the data to do something useful (English 1999). • The quality of data cannot be assessed independently of the uses of that data (Strong et al. 1997). • Data are of high quality if they are fit for their intended use in operations, decision-making, and planning (Juran 1964).

  29. Data standards / protocols used by GBIF • Darwin Core (TDWG data standard) • Simple XML data model to represent taxon occurrence records (only core attributes) • Extensions to handle e.g. curation details, geospatial data, microbial specimens • ABCD - Access to Biological Collection Data (TDWG data standard) • More complex XML data model to represent collection or observation data • Detailed document structure including features for different communities • Taxon Concept Schema (TDWG data standard) • XML data model for exchange of nomenclatural/taxonomic data • Will be supported in new GBIF data portal • Tab-delimited links to species information • Lists of scientific names, URLs and key words • Will be supported in order to establish links to external resources from the new GBIF data portal
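As an illustration of the tab-delimited option above, the following sketch reads a list of scientific names, URLs and key words. The slide does not define the file layout, so the column order (name, URL, key words) and the use of semicolons between key words are assumptions, as is the function name `parse_species_links`.

```python
import csv
import io


def parse_species_links(text):
    """Parse tab-delimited lines of: scientific name, URL, optional key words."""
    reader = csv.reader(io.StringIO(text), delimiter="\t",
                        quoting=csv.QUOTE_NONE)
    links = []
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or incomplete lines
        keywords = []
        if len(row) > 2:
            # Assumed convention: key words separated by semicolons.
            keywords = [k.strip() for k in row[2].split(";") if k.strip()]
        links.append({
            "scientific_name": row[0].strip(),
            "url": row[1].strip(),
            "keywords": keywords,
        })
    return links


# Made-up sample content with placeholder URLs.
sample_text = (
    "Protea cynaroides\thttp://example.org/protea\tflora; fynbos\n"
    "Panthera leo\thttp://example.org/lion\n"
)
species_links = parse_species_links(sample_text)
```

A flat format like this is easy for small providers to produce from a spreadsheet, which is presumably part of its appeal alongside the XML standards.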

  30. Data standards / protocols used by GBIF • DiGIR / BioCASe / TAPIR (TDWG access protocols) • XML protocols for searching remote data resources • Suitable for use with a wide range of different data models • TAPIR (latest version) supports flexible views and simple URLs • SPICE protocol (Species 2000 access protocol) • Web service interfaces for exploring taxonomic data (hierarchies, synonymy, common names) • Will be supported for connecting data resources to new GBIF data portal • LSIDs – Life Science Identifiers (TDWG-adopted GUID mechanism) • Globally unique identifiers to simplify tracking data records • Include protocol for resolving data for any LSID
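An LSID is a URN of the form `urn:lsid:<authority>:<namespace>:<object>` with an optional trailing `:<revision>`. The sketch below only splits an identifier into those parts; it does not implement the resolution protocol mentioned above. The `LSID` tuple, the `parse_lsid` function, and the sample identifier are illustrative assumptions.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]


def parse_lsid(text: str) -> LSID:
    """Split an LSID string into its components, or raise ValueError."""
    parts = text.strip().split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"not an LSID: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    if any(part == "" for part in parts[2:5]):
        raise ValueError(f"LSID has an empty component: {text!r}")
    revision = parts[5] if len(parts) == 6 else None
    return LSID(parts[2], parts[3], parts[4], revision)


# Hypothetical identifier for demonstration.
parsed = parse_lsid("urn:lsid:example.org:names:12345")
```

Keeping the authority, namespace and object separate is what allows a record to be traced back to the provider that issued the identifier.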

  31. Examples of resources provided free by GBIF

  32. GBIF Training Manual 1: Digitisation of Natural History Collections • CONTENTS • Introduction • The Uses of Primary Species Occurrence Data • Initiating a Natural History Collection Digitisation Project • Principles of Data Quality • Principles and Methods of Data Cleaning • BioGeomancer Guide to Best Practices for Georeferencing • Guide to Best Practices for Generalizing • Glossary and Acronym Expansion • To be released by end February 2007.

  33. Observational Data Task Force • Quantum of observational data is unprecedented • Over 60% of GBIF mediated data is observational • Observational Data Task Group • Make recommendations to GBIF on mobilisation of observational data • Criteria for Observational Data Sharing Infrastructure • Metadata Schema for Observational Data • Protocols / Standards for observational data exchange / sharing • Best Practices Guide for observational data management • Encourage participation of potential data providers • Report by September 2008

  34. Enhanced support for data providers • Broader range of supported import formats and protocols • Occurrence data • Darwin Core (original v1.2, MaNIS, OBIS, new v2.0 with extensions) • ABCD (v1.20, v2.06) • Taxonomic data • Catalogue of Life CD-ROM (moving to dynamic checklist) • Nomenclators via tab-delimited lists of LSIDs (work under way) • Data from ECAT projects (models and tools under way) • Other resources • Discussions under way with other resources (GenBank, BOLD, ARKive) • General support for handling XML and tab-delimited formats

  35. Enhanced support for data providers • Validation and annotation of data during indexing • Presence of required fields • Consistency between country name and coordinates • Reports for data providers • Clear separation between “raw” and “processed” index data • Scientific name string versus interpreted taxon • Country name string versus interpreted country • “Home page” for each data resource
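The two validation checks named above (presence of required fields, and consistency between country name and coordinates) can be sketched as follows. This is not GBIF's indexing code: the field names, the `validate_record` function, and the rough bounding box used for South Africa are all assumptions for illustration. A real check would need proper country boundaries.

```python
REQUIRED_FIELDS = ("scientific_name", "country", "latitude", "longitude")

# Rough, illustrative box (min_lat, min_lon, max_lat, max_lon); not authoritative.
COUNTRY_BOXES = {
    "South Africa": (-35.0, 16.0, -22.0, 33.0),
}


def validate_record(record):
    """Return a list of issue strings; the raw record is left unchanged."""
    issues = []
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            issues.append(f"missing required field: {field}")

    box = COUNTRY_BOXES.get(record.get("country"))
    lat = record.get("latitude")
    lon = record.get("longitude")
    has_coords = (isinstance(lat, (int, float))
                  and isinstance(lon, (int, float)))
    if box is not None and has_coords:
        min_lat, min_lon, max_lat, max_lon = box
        if not (min_lat <= lat <= max_lat and min_lon <= lon <= max_lon):
            issues.append("coordinates fall outside the stated country")
    return issues


# Made-up records for demonstration.
good = {"scientific_name": "Protea cynaroides", "country": "South Africa",
        "latitude": -33.9, "longitude": 18.4}
suspect = {"scientific_name": "Protea cynaroides", "country": "South Africa",
           "latitude": 51.5, "longitude": -0.1}
report = {"good": validate_record(good), "suspect": validate_record(suspect)}
```

Returning annotations instead of altering the record mirrors the separation between "raw" and "processed" data described above: the provider's original values stay intact and the issues can be reported back.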

  36. Training, Capacity Building, Mentoring • Training programs on how to share data • Training on Ecological Niche Modeling • Mentoring to developing countries • Help Desk services

  37. Call for Action! With GBIF’s decentralised approach of NBIFs, RBIFs, and ThBIFs, Africa has a lot to contribute at the individual, institutional, national, regional and global levels!

  38. How to contact GBIF: • Web site: www.gbif.org • Data portal: www.gbif.net • GBIF Secretariat • Universitetsparken 15, 2100 Copenhagen, Denmark • E-mail: info@gbif.org • Phone: +45 3532 1470 • Fax: +45 3532 1480 • GBIF Secretariat building, supported by a grant from the Aage V. Jensens Fonde

  39. Merci beaucoup / Thank you. Questions?
