
Ensuring Quality in Data Exchanges: A Tri-Part Approach in the French Information System on Nature and Landscapes

Learn about the importance of data validation in biodiversity and environmental conservation decisions and how the French Information System on Nature and Landscapes ensures data quality.

arielk


Presentation Transcript


  1. Data Quality in Data Exchanges: a Tri-Part Approach in the French Information System on Nature and Landscapes • Rémy Jomier (UMS Patrinat; National Natural History Museum, MNHN; French Agency for Biodiversity, AFB; National Center for Scientific Research, CNRS), nature data standardization manager • Solène Robert (UMS Patrinat; National Natural History Museum, MNHN; French Agency for Biodiversity, AFB; National Center for Scientific Research, CNRS), nature data and geographical data cell coordinator

  2. Why validate taxon data? • "Whoever is careless with the truth in small matters cannot be trusted with important matters" (A. Einstein) • A datum is small… but it has to be validated (hence, true) in order to be trusted with the important matters: biodiversity, our environment, and related research and political decisions. • It gets even more important when you know that "some outside the museum community see the quality of museum data as being generally unacceptable for use in making environmental conservation decisions" (A. Chapman) • Some of you may think: "Hey, I know THAT name!"

  3. What's the "SINP"? • SINP: the French national Information System on Nature and Landscapes (Système d'Information Nature et Paysages) • Encompasses any biodiversity and geological data in France. Includes both taxon occurrences and habitat occurrences for biodiversity. • Uses well-defined rules, often more constraining than the Darwin Core (DwC) • The part dealing with taxon occurrences: OccTax

  4. What is a taxon occurrence? Observation or non-observation of a taxon, at a given time and place, by observers. • Example: on 10 May 2014, Patrick Haffner (MNHN) observed badger traces at the point 8 050, 67 523 (Lambert 93 projection) • Indirect observation of a mammal • Direct observation of a butterfly

  5. Prerequisite to data exchange: • Data conformity and consistency • Conformity: ensures that a datum can be exchanged • Presence of compulsory elements • Type of the attribute (text, number, date…) • Consistency: ensures that there is no blatant error • Checking consistency requires comparing elements with each other • Example: end date / start date • Both are ensured by a national protocol, common to all
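
The two checks on this slide can be sketched in a few lines of Python. This is only an illustration: the attribute names in `COMPULSORY` and the two function names are hypothetical and are not taken from the OccTax standard or the national protocol.

```python
from datetime import date

# Hypothetical compulsory attributes and their expected types; the real
# standard defines its own list, which is not reproduced here.
COMPULSORY = {
    "taxon_name": str,
    "start_date": date,
    "end_date": date,
    "observer": str,
}


def check_conformity(datum: dict) -> list[str]:
    """Return conformity problems: missing compulsory elements or wrong types."""
    problems = []
    for name, expected_type in COMPULSORY.items():
        if name not in datum or datum[name] is None:
            problems.append(f"missing compulsory element: {name}")
        elif not isinstance(datum[name], expected_type):
            problems.append(f"wrong type for {name}")
    return problems


def check_consistency(datum: dict) -> list[str]:
    """Return consistency problems found by comparing elements with each other."""
    problems = []
    start, end = datum.get("start_date"), datum.get("end_date")
    # The slide's example: an end date must not come before the start date.
    if isinstance(start, date) and isinstance(end, date) and end < start:
        problems.append("end date is earlier than start date")
    return problems
```

Conformity looks at each element on its own, while consistency needs at least two elements of the same datum, which is why the two checks are kept as separate functions.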

  6. Scientific validation of data: What happens • Datum • Validation processes: manual (experts); automated (compare with knowledge bases); combined arms (both for maximum damage! Er… sorry, validation) • Validation levels: producer validation; regional validation; national validation • SCOPE! • Looks like overkill? Nuh-uh! That's the bare minimum!

  7. Scientific validation of data: the needs within SINP • 3 levels: • Producer's level, with a self-evaluation • Regional level, coordinated at a regional level • National level, coordinated at a national level (can be equivalent, at times, to the regional level; taking care not to duplicate efforts is key) • National validation is done globally, by using national expert networks and feedback from users, or knowledge databases • Validation should NEVER slow down data movement… but it should also be exchanged when it exists.

  8. Scope? • Minimal scope: quick and relatively easy to check, it covers the taxon/date/location triplet. • Enlarged scope: any other information, which is not always easy to check. The process to which it has been submitted, and the elements that have been checked, have to be described.

  9. Scientific validation of data: Processes • Automated process: compares information with reference databases (presence maps, for example). Very quick, but dependent upon existing databases. • 1.5 hours per 1.5 million data for conformity, consistency, and minimal-scope scientific validation • Manual process: experts have to intervene and check each and every datum. Time-consuming, but very reliable; it can work outside the bounds of automation and without databases. • Combined process: combines both, with the automated process flagging the data that the experts should check.
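
The combined process can be pictured as a simple triage: the automated step accepts what the reference base confirms and flags everything else for the experts. The sketch below assumes a toy knowledge base; `KNOWN_PRESENCE`, the grid cell codes, and the status strings are all made up for illustration.

```python
# Hypothetical knowledge base: taxon name -> grid cells where the taxon is
# known to be present. The cell codes are invented, not real presence data.
KNOWN_PRESENCE = {
    "Meles meles": {"cell_A", "cell_B"},
}


def automated_check(datum: dict) -> str:
    """Compare a datum with the reference base and return a status."""
    known_cells = KNOWN_PRESENCE.get(datum["taxon_name"])
    if known_cells is None:
        return "needs_expert"  # no reference data: automation cannot decide
    if datum["area_code"] in known_cells:
        return "auto_accepted"
    return "needs_expert"  # outside the known range: flag for the experts


def combined_process(data: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split data into automatically accepted and flagged-for-expert lists."""
    accepted, flagged = [], []
    for datum in data:
        if automated_check(datum) == "auto_accepted":
            accepted.append(datum)
        else:
            flagged.append(datum)
    return accepted, flagged
```

For scale, the figure quoted on the slide (1.5 hours per 1.5 million data) works out to one million data per hour, or roughly 278 per second.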

  10. Scientific validation of data: Results • Each datum is tagged with a trust level, and all relevant information: • Level (producer, regional, national) • Scope (minimal, enlarged) • Validator (this ensures trust) • Date (in case of a further revision and update) • Type of validation (manual, automatic, combined) • Reasonable Fallout: validated data
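
One way to picture the tag attached to each datum is as a small record. The sketch below is an assumption about shape only: the class and field names are hypothetical, and the trust level is shown as an integer because the slide does not give its scale.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Level(Enum):
    PRODUCER = "producer"
    REGIONAL = "regional"
    NATIONAL = "national"


class Scope(Enum):
    MINIMAL = "minimal"
    ENLARGED = "enlarged"


class ValidationType(Enum):
    MANUAL = "manual"
    AUTOMATIC = "automatic"
    COMBINED = "combined"


@dataclass(frozen=True)
class ValidationTag:
    """Validation information carried alongside one datum."""
    trust_level: int  # hypothetical numeric code; the slide gives no scale
    level: Level
    scope: Scope
    validator: str  # who validated, so that the result can be trusted
    validation_date: date  # kept in case of a later revision and update
    validation_type: ValidationType
```

The enumerations only list the values named on the slide; freezing the record means a later revision produces a new tag with a new date instead of silently changing the old one.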

  11. How does all that affect data exchange? • Each element needs to be attached to the datum • Aim of a standard: exchanging information, hence the need for concise information (numerical or 1-2 letter codes) • No need to carry useless information: "not checked" doesn't need data • Levels don't require the same information (producer vs regional/national) • Checking for duplicate data: interesting • Data may have been validated on compulsory attributes, but rejected on optional ones: the optional information is kept in a different place, and the validated information is exchanged • Data flow: when do we update, and how? Through a modification date on the datum
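
The last bullet, updating through a modification date, can be sketched as follows. This is a guess at the mechanics, not the SINP implementation: the `id` and `modified` keys and the `merge_incoming` function are hypothetical names.

```python
from datetime import date


def merge_incoming(store: dict, incoming: list[dict]) -> int:
    """Insert new data and overwrite older stored data; return how many changed."""
    changed = 0
    for datum in incoming:
        current = store.get(datum["id"])
        # Update only when the datum is unknown or was modified more recently.
        if current is None or datum["modified"] > current["modified"]:
            store[datum["id"]] = datum
            changed += 1
    return changed
```

Keying the store on a stable identifier also means that a datum received twice with the same modification date is ignored the second time.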

  12. Data exchange schema

  13. Tēnā koutou • Thank you for your attention • E-mail: rjomier[at]mnhn.fr
