
GIS Data Quality

Producing better data quality through robust business processes. BrightStar TRAINING. Kim Ollivier.


Presentation Transcript


  1. GIS Data Quality • Producing better data quality through robust business processes • BrightStar TRAINING • Kim Ollivier

  2. Schedule Day 2 • Suggested breaks for the following times • Start: 9:00 • Session 1 (90 min) • Morning tea: 10:30 to 10:45 • Session 2 (105 min) • Lunch: 12:30 to 1:30 • Session 3 (90 min) • Afternoon tea: 3:00 to 3:15 • Session 4 (105 min) • Finish: 5:00 • Each session will have an exercise or interactive discussion

  3. Topics • Metadata • Designing rules • Data warehouse and ETL • Feature maintenance

  4. Metadata • Data model • Business rules, relations, state • Subclasses (lookup tables) • GIS Metadata NZGLS and ISO XML • Readme.txt or readme.html

  5. Metadata • Which standard? • ISO 19115, NZGMS • Australia: asdd.ga.gov.au

  6. Examine Metadata • Geospatial metadata • Benefit to users or producer? • How do we collect it? • Standardisation or not? • metadata\topo250k_metadata.html • metadata\DCW_DQ_Project.htm • metadata\meta.html • Morning Tea

  7. Data Quality Rules • Attribute domain constraints • Relational integrity rules • Rules for historical data • Rules for state-dependent objects • General dependency rules • Spatial feature rules

  8. A GIS Data Quality System (diagram labels) • Assess • Data Quality Assessment • Data Profiling • Improve • Recognise • Prevent • Data Cleaning • Monitoring Data Integration Interfaces • Ensuring Quality of Data Conversion and Consolidation • Building Data Quality Metadata Warehouse • Monitor • Recurrent Data Quality Assessment

  9. Assessing Quality • Project steps • Required roles • Defining the objectives • Designing rules • Scorecard and Metadata • Frequency of assessment

  10. Building Rules • Data profiling • Interview users • Examine data model • Data Gazing • Application v data matrix
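
Data profiling, the first bullet on this slide, can be illustrated with a minimal sketch. The function name `profile_column` and the land-use codes are hypothetical and not taken from the course material; the idea is only to show the kind of summary that helps a reviewer spot candidate rules (here, a lower-case `res` that hints at a lookup-table rule).

```python
from collections import Counter

def _is_blank(value):
    """True for strings that are empty or contain only whitespace."""
    return isinstance(value, str) and value.strip() == ""

def profile_column(values):
    """Summarise one column: row count, nulls, blanks, distinct values."""
    non_null = [v for v in values if v is not None]
    filled = [v for v in non_null if not _is_blank(v)]
    counts = Counter(filled)
    return {
        "rows": len(values),
        "nulls": len(values) - len(non_null),
        "blanks": len(non_null) - len(filled),
        "distinct": len(counts),
        "most_common": counts.most_common(3),
    }

# Hypothetical land-use codes from a parcel table.
land_use = ["RES", "RES", "COM", None, "", "res", "IND", "RES"]
profile = profile_column(land_use)
print(profile)
```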

  11. Attribute Domain Constraints • Lookup tables • Numeric ranges • Null values • Blank values • Format constraints • Precision • Complex domain constraints
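
As a rough illustration of the constraint types listed on this slide, the sketch below checks one record against a lookup table, a numeric range, null and blank tests, a format pattern, and a precision limit. The field names, codes, limits, and ID pattern are all assumptions made up for the example.

```python
import re

# Hypothetical rules for a parcel record; names and limits are illustrative.
LAND_USE_CODES = {"RES", "COM", "IND", "RUR"}   # lookup table
AREA_RANGE_M2 = (1.0, 5_000_000.0)              # numeric range
PARCEL_ID_FORMAT = re.compile(r"[A-Z]{2}\d{6}")  # format constraint
AREA_DECIMALS = 1                               # precision

def check_record(rec):
    """Return the list of domain rules that the record violates."""
    errors = []
    pid = rec.get("parcel_id")
    if pid is None:
        errors.append("parcel_id: null")
    elif pid.strip() == "":
        errors.append("parcel_id: blank")
    elif not PARCEL_ID_FORMAT.fullmatch(pid):
        errors.append("parcel_id: format")
    if rec.get("land_use") not in LAND_USE_CODES:
        errors.append("land_use: not in lookup")
    area = rec.get("area_m2")
    if area is None:
        errors.append("area_m2: null")
    else:
        if not (AREA_RANGE_M2[0] <= area <= AREA_RANGE_M2[1]):
            errors.append("area_m2: out of range")
        if round(area, AREA_DECIMALS) != area:
            errors.append("area_m2: precision")
    return errors

good = {"parcel_id": "AB123456", "land_use": "RES", "area_m2": 450.5}
bad = {"parcel_id": "ab-1", "land_use": "XX", "area_m2": -3.0}
blank = {"parcel_id": "  ", "land_use": "COM", "area_m2": None}
for rec in (good, bad, blank):
    print(check_record(rec))
```

Returning rule names rather than a single pass/fail flag makes it easy to count violations per rule later, for example in a scorecard.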

  12. Relational Integrity Rules • Identity rule • Reference rules • Cardinal rules • Inheritance rules
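
The identity, reference, and cardinal rules on this slide might be checked along the following lines. This is a sketch under assumptions: the parcel and building tables, their key names, and the "at most one building per parcel" limit are invented for illustration, and inheritance rules are not covered.

```python
from collections import Counter

def identity_violations(rows, key):
    """Identity rule: every row has a key value and no key value repeats."""
    keys = [r.get(key) for r in rows]
    missing = sum(1 for k in keys if k is None)
    counts = Counter(k for k in keys if k is not None)
    duplicates = sorted(k for k, n in counts.items() if n > 1)
    return missing, duplicates

def reference_violations(children, fk, parents, pk):
    """Reference rule: each child's foreign key matches some parent key."""
    parent_keys = {p[pk] for p in parents}
    return [c for c in children if c[fk] not in parent_keys]

def cardinality_violations(children, fk, parents, pk, lo, hi):
    """Cardinal rule: each parent has between lo and hi children."""
    per_parent = Counter(c[fk] for c in children)
    return sorted(p[pk] for p in parents if not lo <= per_parent[p[pk]] <= hi)

# Hypothetical tables.
parcels = [{"parcel_id": "P1"}, {"parcel_id": "P2"}, {"parcel_id": "P3"}]
buildings = [
    {"building_id": "B1", "parcel_id": "P1"},
    {"building_id": "B2", "parcel_id": "P1"},
    {"building_id": "B2", "parcel_id": "P9"},  # duplicate ID, orphan parcel
]

print(identity_violations(buildings, "building_id"))
print(reference_violations(buildings, "parcel_id", parcels, "parcel_id"))
print(cardinality_violations(buildings, "parcel_id", parcels, "parcel_id", 0, 1))
```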

  13. Historical Data • Time dependent attribute • Value constraints • Rates of change • Volatility • Continuity • Granularity
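
Two of the historical-data checks named on this slide, continuity and rate of change, can be sketched for a yearly series. The yearly granularity, the 50% change threshold, and the sample values are assumptions chosen for the example.

```python
def history_violations(history, max_change_ratio):
    """Check a yearly series of (year, value) pairs, sorted by year.

    Reports a "gap" where a year is missing (continuity) and a
    "rate_of_change" where the value moves by more than the allowed
    ratio between consecutive years.
    """
    problems = []
    for (y0, v0), (y1, v1) in zip(history, history[1:]):
        if y1 - y0 != 1:
            problems.append(("gap", y0, y1))
        elif v0 > 0 and abs(v1 - v0) / v0 > max_change_ratio:
            problems.append(("rate_of_change", y0, y1))
    return problems

# Hypothetical capital values for one property.
capital_value = [(2001, 200000), (2002, 210000), (2004, 230000), (2005, 460000)]
print(history_violations(capital_value, max_change_ratio=0.5))
```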

  14. State-dependent Objects • State-transition models • States, terminators • Actions • Diagram states: start, Active (A), On Leave (L), Retired (R), Terminated (T), Deceased (D)
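
A state-transition model can be expressed as a table of allowed moves and used to validate a recorded state history. The slide names the states but the transcript does not preserve the arrows of the diagram, so the allowed transitions below, including the choice of R, T, and D as terminators, are assumptions for illustration only.

```python
# Allowed transitions: ASSUMED for illustration, not taken from the slide.
# None stands for the "start" node; empty sets mark terminator states.
ALLOWED = {
    None: {"A"},
    "A": {"L", "R", "T", "D"},
    "L": {"A", "R", "T", "D"},
    "R": set(),
    "T": set(),
    "D": set(),
}

def invalid_transitions(states):
    """Return (position, from_state, to_state) for each disallowed move."""
    bad = []
    previous = None
    for i, state in enumerate(states):
        if state not in ALLOWED.get(previous, set()):
            bad.append((i, previous, state))
        previous = state
    return bad

print(invalid_transitions(["A", "L", "A", "R"]))  # valid history
print(invalid_transitions(["A", "T", "A"]))       # leaves a terminator
print(invalid_transitions(["L", "A"]))            # does not start at Active
```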

  15. Event Histories • An object may have many events • Event Overlaps • Event Frequencies • Event Conditions
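
Event overlaps and event frequencies for a single object could be checked as in the sketch below. It assumes each event is a `(start, end)` pair of dates with an exclusive end, so an event that starts on the day another ends does not count as an overlap; the sample events are hypothetical.

```python
from collections import Counter
from datetime import date

def overlapping_events(events):
    """Return events that start before an earlier-starting event has ended."""
    overlaps = []
    latest_end = None
    for start, end in sorted(events):
        if latest_end is not None and start < latest_end:
            overlaps.append((start, end))
        if latest_end is None or end > latest_end:
            latest_end = end
    return overlaps

def events_per_year(events):
    """Count events by the year in which they start."""
    return Counter(start.year for start, _ in events)

# Hypothetical events for one object.
events = [
    (date(2020, 1, 1), date(2020, 6, 1)),
    (date(2020, 5, 1), date(2020, 7, 1)),  # starts before the first ends
    (date(2020, 7, 1), date(2020, 9, 1)),
]
print(overlapping_events(events))
print(events_per_year(events))
```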

  16. Spatial Rules • Projection, units • Dimensions: 2D, 3D, M, Z • Point, line, polygon • Precision • Topology
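
Some of these spatial rules can be tested without a GIS library. The sketch below checks one polygon ring for 2D vertices, closure (a very basic topology test), an expected extent in projected units, and coordinate precision. The extent, the number of decimals, and the sample rings are assumptions; a production check would normally rely on the GIS or database's own validity functions.

```python
def ring_problems(ring, extent, decimals):
    """Check one polygon ring given as a list of coordinate tuples.

    extent is (xmin, ymin, xmax, ymax) in the layer's projected units.
    """
    if any(len(pt) != 2 for pt in ring):
        return ["dimension"]  # expected 2D vertices only
    problems = []
    if len(ring) < 4 or ring[0] != ring[-1]:
        problems.append("not closed")
    xmin, ymin, xmax, ymax = extent
    if any(not (xmin <= x <= xmax and ymin <= y <= ymax) for x, y in ring):
        problems.append("outside extent")
    if any(round(x, decimals) != x or round(y, decimals) != y for x, y in ring):
        problems.append("precision")
    return problems

EXTENT = (0.0, 0.0, 1000.0, 1000.0)  # hypothetical layer extent
square = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0), (0.0, 10.0), (0.0, 0.0)]
far = [(0.0, 0.0), (2000.0, 0.0), (10.0, 10.123), (0.0, 0.0)]

print(ring_problems(square, EXTENT, 2))
print(ring_problems(square[:-1], EXTENT, 2))
print(ring_problems(far, EXTENT, 2))
```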

  17. Valuation Roll • Legacy structure, 50 years old • Variable maintenance standard • Valuer General audit (DQ spec)

  18. Rules Exercise • Split into pairs • Examine sample DVR dataset • Devise some rules for each category • Verbal discussion with class • Lunch

  19. Data Warehouse & ETL • Why not direct access to online DB? • Staging Area • Scripting tools • Trade-offs • KPI for project • better quality than source • better quality than target

  20. ETL Extract • Extract

  21. ETL Transform • The importance of primary keys
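
One reason primary keys matter in the transform step is that they let a staging process decide which rows are new, changed, or gone. The sketch below is a minimal illustration of that idea; the function name `diff_by_key` and the sample rows are hypothetical, and it assumes the key is unique in both tables.

```python
def diff_by_key(source, target, key):
    """Classify rows as inserts, updates, or deletes using a primary key."""
    src = {r[key]: r for r in source}
    tgt = {r[key]: r for r in target}
    inserts = sorted(src.keys() - tgt.keys())   # in source only
    deletes = sorted(tgt.keys() - src.keys())   # in target only
    updates = sorted(k for k in src.keys() & tgt.keys() if src[k] != tgt[k])
    return inserts, updates, deletes

# Hypothetical source extract and current warehouse table.
source = [{"id": 1, "name": "a"}, {"id": 2, "name": "b2"}, {"id": 4, "name": "d"}]
target = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 3, "name": "c"}]

inserts, updates, deletes = diff_by_key(source, target, "id")
print(inserts, updates, deletes)
```

Without a stable key, the same comparison would have to match rows on attribute values, which breaks as soon as an attribute is edited.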

  22. ETL Load • Batch offline most common • Daily status usually enough

  23. Safe Software FME • Examples • Afternoon Tea

  24. Data Quality Team • IT • DQ Team • Users

  25. Maintenance of features • Time series important • Line/polygon features are not atomic • Splitting loses inheritance • Calculating depreciation • Direct editing bypasses business rules

  26. Maintenance of the Quality • Gardening, not mountain climbing • Discussion of course topics

  27. References • Data Quality: The Accuracy Dimension, Jack E. Olson • The Data Warehouse ETL Toolkit, Ralph Kimball • Please fill in evaluation forms • Finish
