This document explores critical quality issues in spatial databases, highlighting the challenges of data fusion from multiple sources, including vector and raster data. It emphasizes the importance of various data quality dimensions such as logical consistency, completeness, and accuracy. Through real-world case studies, it presents the complexities of integrating diverse datasets and the need for enhanced methodologies to ensure better decision-making. The discussion includes specific examples of inconsistencies, mapping issues, and outlines proposed solutions for improving data quality in spatial contexts.
Quality issues in Spatial Databases • M. Mostafavi, G. Edwards, R. Jeansoulin • CRG & GEOIDE & REVIGIS • Victoria, May 2003
Contents • Introduction • Problems • Objective • Methodology • Results • Discussion • Conclusions and perspectives
Introduction • Data fusion and data quality • Multi-source spatial data • Vector data: BNDT, BDTQ, … • Raster data: satellite images, aerial images, … • Need for better quality • Logical consistency • Completeness • Semantic accuracy • Temporal accuracy • Positional accuracy • and more … • Decision making (effective crisis management (MSPQ))
A real case problem • BNDT: good geometry • Statistics Canada database, Canada election database: rich descriptive information but weak geometry • How to reconcile these two data sets? BNDT SC, EC
Context SDB1 Information of greater quality SDB2 Fusion SDB3 User vision (fitness for use) Producer vision (Product ontology)
Logical consistency • Logical consistency is an important element of data quality. It defines the degree of consistency of the data with respect to its specifications. • Integrity constraints • Explicit rules stated in the data specifications (e.g. connectivity between two objects) • Implicit rules (e.g. a river always flows downstream) • Ontology vs. specifications
Project definition (workflow diagram, Steps 1 to 5) • Ontology consistency: NTDB ontology, BDTQ ontology • Data consistency: NTDB data, BDTQ data • Mapping the ontologies • Ontology fusion: integrated ontology • Data fusion: new data set • Issue noted: lack of explicit rules • Open question: data consistency vs. BNDT, does this help? (Yes / No)
Consistency in NTDB Step 1 Step 2 NTDB Ontologies Dataset Delphi Interface Delphi Interface Prolog Studying the Logical consistency of the dataset
Formalizing the ontology BNDT Ontology Knowledge base Rules Facts Queries
Spatial relations in NTDB • Spatial relations in NTDB are: • Connection relations • Sharing relations • Adjacency relations • Superposition relations
Logical approach- facts • For NTDB the facts consist of • Taxonomy of NTDB • Themes • Entities • Allowed Combinations • Code (NTDB identity code) • Geometric representations • Spatial relations • Connection • Sharing • Superposition/ adjacency • Minimal values (e.g. distance constraints between objects)
Logical approach- facts • There are about 350,000 facts describing the NTDB • Remark: regrouping of objects for programming purposes has created some inconsistencies
Logical approach- rules Several rules are defined to analyze the ontological consistency of the NTDB. Inconsistency rules
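The facts-and-rules approach above can be illustrated with a minimal sketch. The deck's knowledge base is in Prolog; Python is used here only for illustration. The theme names, the sample facts, the reading of the geometry codes (P, L, A) and the rule `undeclared_operands` are all assumptions, not taken from the NTDB data dictionary.

```python
from typing import NamedTuple

class Entity(NamedTuple):
    theme: str
    name: str
    geometry: str   # assumed codes: "P" point, "L" line, "A" area

class Relation(NamedTuple):
    kind: str       # "connection", "sharing", "superposition" or "adjacency"
    source: tuple   # (entity name, geometry)
    target: tuple   # (entity name, geometry)

# Illustrative facts only; not copied from the NTDB specifications.
ENTITIES = {
    Entity("Transport", "Road", "L"),
    Entity("Transport", "Railway", "L"),
    Entity("Hydrography", "Water body", "A"),
}
RELATIONS = [
    Relation("connection", ("Road", "L"), ("Railway", "L")),
    Relation("sharing", ("Road", "L"), ("Water body", "P")),  # geometry never declared
]

def undeclared_operands(entities, relations):
    """Hypothetical inconsistency rule: every operand of a relation
    must be declared in the taxonomy with that geometry."""
    declared = {(e.name, e.geometry) for e in entities}
    return [(r.kind, op) for r in relations
            for op in (r.source, r.target) if op not in declared]

violations = undeclared_operands(ENTITIES, RELATIONS)
```

Here `violations` holds the single relation operand that has no matching taxonomy fact. A real rule set would contain many such checks, each expressed as a query over the fact base.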
Results (1/2) Inconsistency (inverse connection) Data dictionary (generic relation): • between themes: Railway (L) Connected to Road (L) • between themes: Road (L) Connected to Railway (L) Table of connection and cardinalities ?
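One possible reading of the inverse-connection inconsistency is that the connection table lists a pair in one direction only, while the data dictionary states the relation in both directions. A minimal sketch of such a check follows, assuming that connection is symmetric; the table contents and the function name `missing_inverses` are hypothetical.

```python
def missing_inverses(connections):
    """Return pairs (a, b) listed as connected with no matching (b, a)."""
    listed = set(connections)
    return sorted((a, b) for (a, b) in listed if (b, a) not in listed)

# Hypothetical extract of a connection table.
table = [
    ("Railway (L)", "Road (L)"),    # inverse entry is absent
    ("Road (L)", "Building (P)"),
    ("Building (P)", "Road (L)"),
]
result = missing_inverses(table)
```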
Results (2/2) Inconsistency (different values for the cardinality one) Data dictionary (generic relation): Gas and oil facilities (P) is Connected to Building (P) Table of connection and cardinalities ?
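A cardinality conflict of this kind can be detected by grouping the table rows by entity pair and flagging any pair with more than one distinct cardinality. The sketch below assumes a simple row layout (source, target, cardinality); the cardinality values shown are invented for illustration.

```python
from collections import defaultdict

def conflicting_cardinalities(rows):
    """Map each (source, target) pair to its cardinalities when they disagree."""
    seen = defaultdict(set)
    for source, target, cardinality in rows:
        seen[(source, target)].add(cardinality)
    return {pair: sorted(vals) for pair, vals in seen.items() if len(vals) > 1}

# Hypothetical rows; the real values come from the table of connections.
rows = [
    ("Gas and oil facilities (P)", "Building (P)", "0..1"),
    ("Gas and oil facilities (P)", "Building (P)", "0..N"),
    ("Road (L)", "Railway (L)", "0..N"),
]
conflicts = conflicting_cardinalities(rows)
```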
Consistency in Data Step 1 Step 2 NTDB Ontologies Dataset Delphi Interface VB Interface Prolog Studying the logical consistency of the dataset
GeoMedia Professional spatial operations • Meet • Entirely contains • Entirely contained by • Contains • Contained by • Spatially equal • Touch (figure labels: Meet, Overlap)
Mapping Polygon – Polygon Relations
Mapping problems • Several problems • Confusions in spatial relations • Unique mapping is not possible • Cardinalities cannot be considered
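The non-unique mapping problem can be made concrete with a small lookup from GIS operators to the relation types used in the specifications. The mapping below is purely hypothetical (the actual correspondence table is not reproduced in this transcript); it only shows how ambiguous operators could be listed.

```python
# Hypothetical mapping from GIS operators to NTDB relation types.
OPERATOR_TO_RELATIONS = {
    "meet": {"adjacency", "sharing"},
    "overlap": {"superposition"},
    "touch": {"connection", "adjacency", "sharing"},
    "contains": {"superposition"},
}

def ambiguous_operators(mapping):
    """Operators that map to more than one relation type."""
    return sorted(op for op, rels in mapping.items() if len(rels) > 1)

ambiguous = ambiguous_operators(OPERATOR_TO_RELATIONS)
```

Any operator returned here cannot be translated into a single binary relation without extra information, such as the geometry or the semantics of the objects involved.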
Data vs. ontology • File 21E05 • Region: Sherbrooke • 68 entities • 23,283 objects • Analyzed binary relations: • Contours vs. water bodies • Buildings vs. roads • Water bodies vs. buildings • Liquid depot vs. liquid depot • Roads vs. water bodies • …
Results • Liquid depot vs. Liquid depot • Spatial representations (Point, Area) • Spatial relations • Ontology/ specification (superposition is illegal) • Data (superposition case is found)
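A check of the data against a forbidden superposition could look like the sketch below. It is a simplification under stated assumptions: objects are reduced to axis-aligned bounding boxes, the object identifiers and coordinates are invented, and the deck itself used GeoMedia spatial operators rather than this test.

```python
from itertools import combinations

# Forbidden superpositions, keyed by unordered pair of entity classes.
FORBIDDEN_SUPERPOSITION = {frozenset(["Liquid depot"])}

def boxes_overlap(a, b):
    """True if two boxes (xmin, ymin, xmax, ymax) share interior area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

def illegal_superpositions(objects):
    """Return id pairs whose classes may not overlap but whose boxes do."""
    hits = []
    for (id1, cls1, box1), (id2, cls2, box2) in combinations(objects, 2):
        forbidden = frozenset([cls1, cls2]) in FORBIDDEN_SUPERPOSITION
        if forbidden and boxes_overlap(box1, box2):
            hits.append((id1, id2))
    return hits

# Hypothetical objects: (identifier, entity class, bounding box).
objects = [
    ("d1", "Liquid depot", (0, 0, 4, 4)),
    ("d2", "Liquid depot", (3, 3, 6, 6)),
    ("d3", "Liquid depot", (10, 10, 12, 12)),
]
found = illegal_superpositions(objects)
```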
Results • Problem: Road crosses a water body • Illegal relation with respect to semantics of the objects • Incomplete ontology
Results • Problem: Cut line crosses a water body • Illegal relation with respect to semantic definition of the objects • Incomplete ontology
Results • Problem: Contour crosses water body • Illegal relation with respect to the ontology • Inconsistent data
Results • Problem: Road crosses water body • Illegal relation with respect to the ontology • Inconsistent data
Results • Problem: Road crosses Building • Illegal relation with respect to the semantics of objects • Incomplete ontology
Results • Problem: Water body (L) superposed on Vegetation (A) • Illegal relation with respect to the ontology • Inconsistent data • Control system problem
Results • Problem: Buildings (S) superposed on water body (A) • Illegal relation with respect to the semantics of objects • Inconsistent data
Results • Problem: Building (A) overlaps Vegetation (A) • Illegal relation with respect to the semantics of objects • Inconsistent data
Suggestions, solutions • Adding new rules • Building (a) and vegetation (a) (illegal superposition) • Road (l) and building (conditional superposition) • A better control system is needed • Find exceptions
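The suggested new rules could be stored as a small table with three statuses, so that a detected superposition is either rejected, sent for review as a possible exception, or reported as not covered by any rule. This is a sketch only: the `Status` values, the `diagnose` helper and its messages are assumptions, and only the two rule entries come from the slide.

```python
from enum import Enum

class Status(Enum):
    ALLOWED = "allowed"
    ILLEGAL = "illegal"
    CONDITIONAL = "conditional"   # needs a further check, e.g. an exception list

# The two rules proposed on the slide.
RULES = {
    ("Building (A)", "Vegetation (A)"): Status.ILLEGAL,
    ("Road (L)", "Building"): Status.CONDITIONAL,
}

def diagnose(pair, rules):
    """Classify a superposition found in the data; pair order is ignored."""
    status = rules.get(pair, rules.get(pair[::-1]))
    if status is None:
        return "no rule: ontology may be incomplete"
    if status is Status.ILLEGAL:
        return "illegal: data inconsistent with ontology"
    if status is Status.CONDITIONAL:
        return "conditional: review as possible exception"
    return "allowed"

verdict = diagnose(("Vegetation (A)", "Building (A)"), RULES)
```

A pair with no rule at all is reported separately, which mirrors the distinction the results slides draw between inconsistent data and an incomplete ontology.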
Current situation • Product ontology is analyzed • Mapping of topological relations to binary relations • Ontology translation into Prolog (Delphi program) • Consistency study of spatial relations • Connection (table C) • Sharing (table D) • Superposition and adjacency (table E) • Consistency between different relations (fusion of facts) • Connection and sharing, connection and superposition / adjacency, sharing and superposition / adjacency • Consistency of data vs. specifications is studied
Future work • Logical consistency of other available datasets • Mapping of ontologies • Fusion of ontologies • Fusion of data • Consistency of the newly created data set