
Data integration and Linked Data

Data integration and Linked Data. Tatiana Tarasova University of Amsterdam T.Tarasova@uva.nl. 03/09/12. 1. Outline. 1. Task 4.2 objectives 2. Use Case of the ENVRI data integration 3. Linked Data 4. Linked Data for ENVRI 5. Benefits of Linked Data 6. UvA needs. Task 4.2. objectives.


Presentation Transcript


  1. Data integration and Linked Data. Tatiana Tarasova, University of Amsterdam, T.Tarasova@uva.nl. 03/09/12

  2. Outline 1. Task 4.2 objectives 2. Use Case of the ENVRI data integration 3. Linked Data 4. Linked Data for ENVRI 5. Benefits of Linked Data 6. UvA needs

  3. Task 4.2. objectives Harmonise, integrate and publish data from the ENVRI Research Infrastructures to facilitate multidisciplinary scientific research.

  4. Use Case Study the correlation between the concentration of CO2 in the air and the ocean temperature during the Iceland Volcano eruption in 2010.

  5. Challenges [Figure: heterogeneity between the "CO2 concentration" and "Ocean temperature" data sources. Labels shown: platform, observatory; good quality, level 2; year, month, day, date; flask; TSV, CSV; authorized IP access, FTP, catalogues.]

  6. ENVRI data integration: requirements. Find a solution that:
      • addresses both structural and semantic data heterogeneity and is universal for all the RIs;
      • re-uses the existing RIs' technological solutions;
      • re-uses existing data resources such as code lists, thesauri and ontologies;
      • ensures data provenance traceability.

  7. Linked Data: URIs, HTTP, RDF [15]
      "Mace Head is an observatory." Subject - predicate - object:
      icos:MHD rdf:type vsto:Observatory .
      prefix icos: <http://example.envri.org/icos/>
      prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
      prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
      prefix vsto: <http://escience.rpi.edu/ontology/vsto/2/0/vsto.owl#>
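
The relationship between prefixed names and full URIs can be sketched in a few lines of Python. This is only an illustration: the helper names `expand` and `to_ntriple` are hypothetical, and the `icos` namespace is the example namespace used in the presentation, not a published one.

```python
# Prefix table taken from the slide; "icos" is the presentation's example namespace.
PREFIXES = {
    "icos": "http://example.envri.org/icos/",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "vsto": "http://escience.rpi.edu/ontology/vsto/2/0/vsto.owl#",
}


def expand(curie: str) -> str:
    """Expand a prefixed name such as 'icos:MHD' into a full URI (hypothetical helper)."""
    prefix, _, local = curie.partition(":")
    if prefix not in PREFIXES:
        raise KeyError(f"unknown prefix: {prefix}")
    return PREFIXES[prefix] + local


def to_ntriple(subject: str, predicate: str, obj: str) -> str:
    """Render one subject-predicate-object statement as an N-Triples line."""
    return f"<{expand(subject)}> <{expand(predicate)}> <{expand(obj)}> ."


# "Mace Head is an observatory" as a single triple of three URIs.
print(to_ntriple("icos:MHD", "rdf:type", "vsto:Observatory"))
```

Every part of the statement becomes an HTTP URI, which is what lets independent datasets refer to the same observatory or the same class.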

  8. Linked Data publishing workflow Analyze Model Publish Use

  9. Analyze ENVRI data: structure [Diagram: a DATASET is made up of METADATA (parameter, unit of measure, instrument, provider, ...), DIMENSIONS (time, lat/long, elevation) and OBSERVATIONS.]

  10. Analyze ENVRI data: data
      DATASET: "CO2 concentration measured by Mace Head"
      OBSERVATION: "392.049"
      Parameter: CO2
      Unit of measure: ppm
      Observatory: Mace Head
      Provider: ICOS
      Observed value: 392.049
      Time: 2010.01.01
      Lat/Long: 53.3261/-9.9836
      Elevation: 25 m

  11. Model ENVRI data: the Data Cube vocabulary. The Data Cube vocabulary [1] provides a generic framework for encoding collections of observations. Its core classes include DataSet, Observation and DataStructureDefinition. Component properties are typed as DimensionProperty, AttributeProperty or MeasureProperty. The vocabulary was developed for the statistical domain and is based on the SDMX standard [2]. Examples: the UK local government payments [3,16], the UK Environment Agency water sampling and monitoring data [4].

  12. Publish ENVRI data: structure
      @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
      @prefix qb: <http://purl.org/linked-data/cube#> .
      @prefix icos: <http://example.envri.org/icos/> .
      @prefix vsto: <http://escience.rpi.edu/ontology/vsto/2/0/vsto.owl#> .
      @prefix time: <http://www.w3.org/2006/time#> .
      @prefix geo: <http://www.w3.org/2003/01/geo/wgs84_pos#> .
      _:structure rdf:type qb:DataStructureDefinition ;
          qb:component _:cs1 , _:cs2 , _:cs3 , _:cs4 , _:cs5 , _:cs6 , _:cs7 .
      _:cs1 qb:measure vsto:hasContainedParameter .
      _:cs2 qb:attribute muo:measuredIn .
      _:cs3 qb:attribute vsto:isFromInstrument .
      _:cs4 qb:dimension geo:lat .
      _:cs5 qb:dimension geo:long .
      _:cs6 qb:dimension time:inXSDDateTime .
      _:cs7 qb:dimension sweet_spaceExtent:hasHeight .
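
One practical use of such a structure definition is to check that each observation supplies a value for every declared dimension. The following is a minimal Python sketch, assuming observations are held as plain dictionaries keyed by prefixed property names. The function name and the dictionary layout are illustrative and not part of the presentation.

```python
# Dimension properties declared by the structure on the slide (_:cs4 to _:cs7).
DECLARED_DIMENSIONS = (
    "geo:lat",
    "geo:long",
    "time:inXSDDateTime",
    "sweet_spaceExtent:hasHeight",
)


def missing_dimensions(observation: dict) -> list:
    """Return the declared dimensions for which the observation has no value."""
    return [d for d in DECLARED_DIMENSIONS if observation.get(d) in (None, "")]


# An observation carrying all four dimensions plus the measured value.
complete = {
    "geo:lat": "53.3261",
    "geo:long": "-9.9036",
    "time:inXSDDateTime": "2010-01-01T00:00:00Z",
    "sweet_spaceExtent:hasHeight": "25",
    "muo:numericalValue": "391.318",
}
# An observation that only carries the latitude and the measured value.
incomplete = {"geo:lat": "53.3261", "muo:numericalValue": "391.318"}

print(missing_dimensions(complete))
print(missing_dimensions(incomplete))
```

A check of this kind could run before publishing, so that incomplete rows from a source file are reported instead of silently becoming partial observations.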

  13. Publish ENVRI data: dataset
      icos:co2-mhd rdf:type qb:DataSet ;
          rdfs:comment "Dataset with CO2 concentration measured by the Mace Head observatory" ;
          qb:structure _:structure ;
          vsto:hasContainedParameter sweet_chemCompound:CO2 ;
          muo:measuredIn muo:ppm ;
          vsto:isFromInstrument icos:flask ;
          dcterms:publisher <http://www.icos-infrastructure.eu/> ;
          dcterms:source …
          dcterms:contributor ...

  14. Publish ENVRI data: observations
      icos:observation0 rdf:type qb:Observation ;
          rdfs:label "observation" ;
          rdfs:comment "Observation of the dataset 'CO2 concentration measured by Mace Head'" ;
          qb:dataSet icos:co2-mhd ;
          geo:lat "53.3261" ;
          geo:long "-9.9036" ;
          time:inXSDDateTime "2010-01-01T00:00:00Z" ;
          sweet_spaceExtent:hasHeight "25" ;
          muo:numericalValue "391.318" .
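
Observations like the one above could be generated from tabular source files. The sketch below converts CSV rows into Turtle using only the Python standard library. The column names in `SAMPLE_CSV` are hypothetical (the real ICOS files may be laid out differently), prefix declarations are omitted for brevity, and the `^^xsd:dateTime` datatype on the time literal is an addition not shown on the slide, made so that date comparisons in queries have a typed value to work with.

```python
import csv
import io

# Hypothetical CSV layout; real source files may use different column names.
SAMPLE_CSV = """time,lat,long,height,value
2010-01-01T00:00:00Z,53.3261,-9.9036,25,391.318
"""


def row_to_turtle(index: int, row: dict, dataset: str = "icos:co2-mhd") -> str:
    """Serialise one CSV row as a qb:Observation using the slide's properties."""
    lines = [
        f"icos:observation{index} rdf:type qb:Observation ;",
        f"    qb:dataSet {dataset} ;",
        f'    geo:lat "{row["lat"]}" ;',
        f'    geo:long "{row["long"]}" ;',
        # Assumption: the time literal is typed, unlike the plain literal on the slide.
        f'    time:inXSDDateTime "{row["time"]}"^^xsd:dateTime ;',
        f'    sweet_spaceExtent:hasHeight "{row["height"]}" ;',
        f'    muo:numericalValue "{row["value"]}" .',
    ]
    return "\n".join(lines)


def csv_to_turtle(text: str) -> str:
    """Convert every data row of the CSV text into a Turtle observation."""
    reader = csv.DictReader(io.StringIO(text))
    return "\n\n".join(row_to_turtle(i, row) for i, row in enumerate(reader))


print(csv_to_turtle(SAMPLE_CSV))
```

The same pattern applies to TSV input by passing `delimiter="\t"` to `csv.DictReader`.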

  15. Use ENVRI data: get all parameters
      ## Cross-dataset query based on the common ontology.
      SELECT ?dataset ?parameter ?parameterName
      WHERE {
          ?dataset vsto:hasContainedParameter ?parameter .
          ?parameter rdfs:label ?parameterName .
      }
      ## The query returns all the parameters independently of the dataset for which they were defined.

  16. Use ENVRI data: answering the use case I
      ## The query returns the observations' values, the measured parameters and
      ## the datasets that contain the observations, and restricts the time of the
      ## measurements to April 2010.
      SELECT ?parameterName ?value ?time ?dataset
      WHERE {
          ?obs muo:numericalValue ?value ;
              qb:dataSet ?dataset ;
              time:inXSDDateTime ?time .
          ?dataset vsto:hasContainedParameter ?parameter .
          ?parameter rdfs:label ?parameterName .
          FILTER (?time >= "2010-04-01T00:00:00Z"^^xsd:dateTime && ?time <= "2010-05-01T00:00:00Z"^^xsd:dateTime)
      }
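
To make the logic of this query concrete, the join and the time filter can be mirrored in plain Python over a small in-memory list of triples. This is a sketch only, not how a triple store evaluates SPARQL. The `ex:` names, the ocean-temperature dataset and its value are made up for illustration; only `icos:observation0` comes from the presentation.

```python
from datetime import datetime, timezone

# Toy triples. Everything under the "ex:" prefix is invented for this example.
TRIPLES = [
    ("icos:observation0", "qb:dataSet", "icos:co2-mhd"),
    ("icos:observation0", "muo:numericalValue", "391.318"),
    ("icos:observation0", "time:inXSDDateTime", "2010-01-01T00:00:00Z"),
    ("ex:observation1", "qb:dataSet", "ex:ocean-temp"),
    ("ex:observation1", "muo:numericalValue", "7.9"),
    ("ex:observation1", "time:inXSDDateTime", "2010-04-15T00:00:00Z"),
    ("icos:co2-mhd", "vsto:hasContainedParameter", "sweet_chemCompound:CO2"),
    ("ex:ocean-temp", "vsto:hasContainedParameter", "ex:OceanTemperature"),
    ("sweet_chemCompound:CO2", "rdfs:label", "CO2"),
    ("ex:OceanTemperature", "rdfs:label", "Ocean temperature"),
]


def parse_time(text: str) -> datetime:
    """Parse a UTC timestamp of the form used on the slides."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def value_of(subject: str, predicate: str):
    """Return the first object for (subject, predicate), or None."""
    for s, p, o in TRIPLES:
        if s == subject and p == predicate:
            return o
    return None


def observations_between(start: datetime, end: datetime) -> list:
    """Join observation -> dataset -> parameter -> label, keeping times in [start, end]."""
    rows = []
    for obs, predicate, dataset in TRIPLES:
        if predicate != "qb:dataSet":
            continue
        time = parse_time(value_of(obs, "time:inXSDDateTime"))
        if not (start <= time <= end):
            continue
        parameter = value_of(dataset, "vsto:hasContainedParameter")
        rows.append((value_of(parameter, "rdfs:label"),
                     value_of(obs, "muo:numericalValue"),
                     time.isoformat(),
                     dataset))
    return rows


april = observations_between(parse_time("2010-04-01T00:00:00Z"),
                             parse_time("2010-05-01T00:00:00Z"))
print(april)
```

Because both toy datasets describe their parameter with the same `vsto:hasContainedParameter` property, a single loop covers them without dataset-specific code, which is the point the cross-dataset query makes.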

  17. Use ENVRI data: answering the use case II ## The query returns all the measurements with their parameters, independently of the dataset for which they were defined.

  18. Linked Data benefits
      • Data Cube provides structural data interoperability.
      • Semantic interoperability can be addressed by extending Data Cube with existing ontologies:
          • The Time Ontology [5]
          • The Geo WGS84 based Vocabulary [6]
          • The Measurement Units Ontology (MUO) [7]
          • The Open Provenance Model (OPM) [8]
          • The Semantic Web for Earth and Environmental Terminology (SWEET) [9]
          • The Virtual Solar-Terrestrial Observatory Ontology (VSTO) [10], which extends SWEET
          • ...

  19. Linked Data benefits (contd.)
      • Linked Data is universal: different data formats, e.g. CSV, TSV, relational data and XML, can be transformed into Linked Data.
      • Linked Data complements the existing technological solutions.
      • Linked Data allows data provenance to be described.

  20. What do we need?
      • A description of the Iceland Volcano Use Case, including the workflow and the datasets involved.
      • All data and metadata!
      • Domain ontologies to encode ENVRI-specific terms, e.g. units of measurement, geospatial dimensions, and different realms: atmosphere, volcanoes, plate tectonics, etc.

  21. Thank you! Questions?

  22. References I
      [1] The RDF Data Cube Vocabulary. http://www.w3.org/TR/vocab-data-cube/
      [2] Statistical Data and Metadata Exchange. http://sdmx.org/
      [3] The Combined Online Information System. http://data.gov.uk/resources/coins
      [4] The Environment Agency water sampling and monitoring site. http://environment.data.gov.uk/lab/bwq-web.html
      [5] The OWL Time Ontology. http://www.w3.org/TR/owl-time/
      [6] The Basic GEO Vocabulary. http://www.w3.org/2003/01/geo/
      [7] The Measurement Units Ontology. http://idi.fundacionctic.org/muo/muo-vocab.html

  23. References II
      [8] The Open Provenance Model Vocabulary. http://openprovenance.org/
      [9] The SWEET ontologies. http://sweet.jpl.nasa.gov/ontology/
      [10] The Virtual Solar-Terrestrial Observatory Ontology. http://www.vsto.org/
      [11] The Argo Data Management site. http://www.argodatamgt.org/
      [12] The Google Refine plug-in for Data Cube. http://refine.deri.ie/qbExport
      [13] The Digital Enterprise Research Institute's site. http://www.deri.ie/

  24. References III
      [14] The Virtuoso Open Source Edition's site. http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/
      [15] The Linked Data principles. http://www.w3.org/DesignIssues/LinkedData.html
      [16] The mashup application of the UK Linked Data COINS dataset. http://wheredoesmymoneygo.org/
