
Metrics-Driven Approach for LOD Quality Assessment

Metrics-Driven Approach for LOD Quality Assessment. 2014-May-07. Outline. What is the problem? What have others done? What is our solution? Does it work? What is the problem? Linked Open Data (LOD): Realizing Semantic Web by interlinking existing but dispersed data

zack


Presentation Transcript


  1. Metrics-Driven Approach for LOD Quality Assessment 2014-May-07

  2. Outline • What is the problem? What have others done? What is our solution? Does it work?

  3. What is the problem? • Linked Open Data (LOD): • Realizing the Semantic Web by interlinking existing but dispersed data • Main components of LOD: • URIs to identify things • RDF to describe data • HTTP to access data

  4. What is the problem? • Datasets: 295 • Triples: over 30,000,000,000 (30 B) • Links: over 500,000,000 (500 M)

  5. What is the problem? Inclusion criteria for publishing and interlinking datasets into the LOD cloud: • Resolvable http/https URIs • Presented in one of the standard formats of the Semantic Web (RDF, RDFa, RDF/XML, Turtle, N-Triples) • Contains at least 1000 triples • Connected via at least 50 RDF links to the existing datasets of LOD • Accessible via RDF crawling, RDF dump, or SPARQL endpoint. Is the dataset ready to publish?
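
The inclusion criteria above can be turned into a simple pre-publication checklist. The following is a minimal sketch, not an official LOD cloud validator: the `DatasetSummary` record, its field names, and `check_inclusion` are hypothetical, and the summary values are assumed to have been collected beforehand.

```python
from dataclasses import dataclass, field

# Thresholds and options taken from the inclusion criteria on the slide.
MIN_TRIPLES = 1000
MIN_LINKS = 50
ACCESS_METHODS = {"rdf_crawling", "rdf_dump", "sparql_endpoint"}
STANDARD_FORMATS = {"RDFa", "RDF/XML", "Turtle", "N-Triples"}


@dataclass
class DatasetSummary:
    """Hypothetical summary of a dataset, gathered before publishing."""
    uris_resolvable: bool
    serialization: str
    triple_count: int
    outgoing_link_count: int
    access_methods: set = field(default_factory=set)


def check_inclusion(ds):
    """Return the list of unmet criteria (an empty list means all are met)."""
    problems = []
    if not ds.uris_resolvable:
        problems.append("URIs are not resolvable over http/https")
    if ds.serialization not in STANDARD_FORMATS:
        problems.append("not in a standard Semantic Web format")
    if ds.triple_count < MIN_TRIPLES:
        problems.append("fewer than 1000 triples")
    if ds.outgoing_link_count < MIN_LINKS:
        problems.append("fewer than 50 RDF links to existing LOD datasets")
    if not (ds.access_methods & ACCESS_METHODS):
        problems.append("no RDF crawling, RDF dump, or SPARQL endpoint access")
    return problems
```

A dataset that passes every check here has only met the entry conditions; none of them says anything about the quality of the data itself.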

  6. What is the problem? Idea of LOD: publish first, improve later. Results in: quality problems in the published datasets. Missing link: data quality evaluation before release.

  7. What have others done? Data quality in the context of LOD: Validators; Quality assessment of published data • General validators • Parsing and syntax • Accessibility / dereferenceability • Classifying quality problems of LOD • Using metadata for quality assessment • Filtering poor-quality data (WIQA) • Semantic annotation using ontologies

  8. What have others done? Limitations of related work: • Syntax validation, not quality evaluation • Not scalable • Not fully automated • Evaluation after publishing

  9. What is our solution? Proposing a set of metrics for inherent quality assessment of datasets before interlinking them into the LOD cloud

  10. What is our solution?

  11. 1. Selecting Inherent Quality Dimensions

  12. 1. Selecting Inherent Quality Dimensions

  13. 2. Proposing Metrics Example: • Goal: Assessment of the consistency of a dataset in the context of LOD • Question: What is the degree of conflict in the context of data value? • Metric: The number of functional properties with inconsistent values
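
As a rough illustration of the example metric: a functional property may have at most one value per subject, so a subject with two distinct values for such a property signals a conflict. The sketch below assumes triples are plain `(subject, predicate, object)` tuples and that the set of functional properties is already known (for instance, those declared as `owl:FunctionalProperty`). The function name and sample data are made up, and counting each conflicting property once is one reading of the metric, not necessarily the exact definition used in LODQM.

```python
from collections import defaultdict


def count_inconsistent_functional_properties(triples, functional_properties):
    """Count functional properties that have more than one distinct value
    for at least one subject."""
    values = defaultdict(set)  # (subject, predicate) -> distinct objects
    for s, p, o in triples:
        if p in functional_properties:
            values[(s, p)].add(o)
    # A property is inconsistent if any subject carries several values for it.
    inconsistent = {p for (s, p), objs in values.items() if len(objs) > 1}
    return len(inconsistent)


# Made-up sample data for illustration only.
triples = [
    ("ex:alice", "ex:birthDate", "1980-01-01"),
    ("ex:alice", "ex:birthDate", "1981-05-12"),  # conflicting value
    ("ex:bob", "ex:birthDate", "1975-03-30"),
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:knows", "ex:carol"),  # not functional, so no conflict
]
functional = {"ex:birthDate"}
print(count_inconsistent_functional_properties(triples, functional))
```

Distinct object values are treated as conflicting here; a fuller check would also consider that two different URIs may denote the same thing.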

  14. 3. Developing LODQM • LODQM: Linked Open Data Quality Model • 6 Quality dimensions • 32 Metrics

  15. 4. Theoretical Validation

  16. 5. Empirical Evaluation: steps 5.1 to 5.7 (diagram not included in the transcript)

  17. 5. Empirical Evaluation

  18. 5. Empirical Evaluation √ √

  19. 5. Empirical Evaluation √ √ √ • Result: three pairs of metrics are correlated: • {IFP, Im_DT} • {Im_DT, Sml_Cls} • {Inc_Prp_Vlu, IF} • The others are independent
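
A pairwise correlation check of this kind can be sketched as follows. The slides do not say which correlation coefficient or cut-off was used, so the Pearson coefficient, the `threshold` of 0.8, the function names, and the sample values are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long value lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # constant column: treated as uncorrelated in this sketch
    return cov / sqrt(vx * vy)


def correlated_pairs(metric_values, threshold=0.8):
    """Return metric pairs whose absolute correlation reaches the threshold.

    metric_values maps a metric name to its values, one per dataset.
    """
    pairs = []
    for a, b in combinations(sorted(metric_values), 2):
        if abs(pearson(metric_values[a], metric_values[b])) >= threshold:
            pairs.append((a, b))
    return pairs


# Made-up metric values over five datasets, for illustration only.
data = {
    "m1": [1, 2, 3, 4, 5],
    "m2": [2, 4, 6, 8, 10],
    "m3": [5, 1, 4, 2, 3],
}
print(correlated_pairs(data))
```

Metrics that end up in a correlated pair carry overlapping information, which is one reason to drop or merge some of them in a later selection step.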

  20. 5. Empirical Evaluation √ √ √ √

  21. 5. Empirical Evaluation √ √ √ √ √ √

  22. 5. Empirical Evaluation √ √ √ √ √ √ √ • Result: only one pair of quality dimensions is correlated: • {Interlinking, Syntactic accuracy} • The others are independent

  23. 6. Quality Prediction • Result: 20 out of 32 metrics are selected • Using a neural network method: MultiLayerPerceptron

  24. 6. Quality Prediction

  25. Conclusion on Metrics

  26. Thank you for your attention and comments
