
GIS Data Preparation and Integration


annhunter

Presentation Transcript


  1. GIS Data Preparation and Integration: Digesting the Food

  2. Data Preparation and Integration: the necessary steps
     • Geocoding: assigning geographic coordinates to points
        -- perhaps the most basic form of spatial data entry
     • Data media conversion
        -- scanning
        -- digitizing
     • Data format conversion
        -- raster & vector
     • Data reduction
     • Topology, error detection and topological editing
     • Rectification and registration (one on top of the other)
        -- overlaying sheets and referencing to the real world
     • Edge matching & image adjustment (side by side)
        -- linking & balancing adjacent sheets
     • Interpolation
     • Conflation

  3. Geocoding: assigning spatial coordinates to point data
     Address Matching assigns spatial coordinates (explicit location) to addresses (implicit location).
     Address matching requires a street network file with street attribute information (street name and number range) for all street segments (block sides)
     • “Zone” variable required if data spans multiple cities (to handle duplicated street names)
     • precise matching of street names can be problematic
     • completeness (esp. for ‘new’ streets) important
     • PO boxes, building names, and apartment complex names cause problems
     Implementation in ArcGIS is a 3-step process
     • In ArcToolbox (9.2), process the street network file to create a Geocoding Service
     • In ArcMap, load the appropriate geocoding service via Tools/Geocoding/Services Manager
     • In ArcMap, geocode a table of addresses using Tools/Geocoding/Geocode Addresses
     Point Location Files contain lat/long or x,y coordinates (e.g. derived via GPS)
     • bring the table (e.g. in .csv or .dbf format) into ArcGIS using the Add Data icon
     • right-click the table name in the table of contents and select Display X,Y Data
     • displays as an “event layer”; export to a shapefile or gdb feature class to get a spatial data set
     • input table must contain 3 variables at minimum: feature ID, x, y
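A minimal sketch of how address matching can turn a house number into coordinates, assuming the common approach of linear interpolation along a street segment's number range. The slide names the number range but not the interpolation method, so that part is an assumption. The dictionary layout, field names and the functions interpolate_address and geocode are hypothetical and are not ArcGIS calls. Odd/even block sides are ignored for brevity.

```python
def interpolate_address(house_number, segment):
    """Estimate (x, y) for a house number along one street segment (assumed linear)."""
    lo, hi = segment["from_number"], segment["to_number"]
    if not min(lo, hi) <= house_number <= max(lo, hi):
        raise ValueError("house number outside the segment's range")
    # Fraction of the way along the segment; a one-number range maps to the midpoint.
    t = 0.5 if hi == lo else (house_number - lo) / (hi - lo)
    (x0, y0), (x1, y1) = segment["start"], segment["end"]
    return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))


def geocode(street, house_number, zone, network):
    """Look up a street by (name, zone) and interpolate; return None if unmatched."""
    # The zone is part of the key so duplicated street names in different cities stay apart.
    key = (street.strip().upper(), zone.strip().upper())
    for segment in network.get(key, []):
        lo, hi = sorted((segment["from_number"], segment["to_number"]))
        if lo <= house_number <= hi:
            return interpolate_address(house_number, segment)
    return None


# Hypothetical one-segment street network.
network = {
    ("MAIN ST", "DALLAS"): [
        {"from_number": 100, "to_number": 198, "start": (0.0, 0.0), "end": (98.0, 0.0)},
    ]
}
print(geocode("Main St", 149, "Dallas", network))   # halfway along the block
print(geocode("Main St", 300, "Dallas", network))   # no segment covers 300
```

The exact-match lookup on the street name also shows why precise matching of street names is problematic: "Main Street" would not match "Main St" here without extra normalization.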

  4. Data Media Conversion--Scanning: automated recording of map or aerial photo
     • Produces “dumb” raster data
        -- vectorize using conversion software
        -- create “smart” image using digital image processing techniques
     • Electromechanical instruments, $100-$50,000
        -- drum or flatbed
        -- scan resolution depends on price! down to 20 microns (millionths of a meter)
     • Scanners v. sensors
        -- sensors collect data directly in digital form (e.g. digital cameras)
        -- sensor resolution now (2005>) matches that of photos, so scanning photos is becoming old technology
        -- still lots of paper maps around, e.g. property ownership records
     • Great if only a raster representation is needed
     • Automated creation of vector data from scanning is very problematic:
        -- docs must be clean
        -- complex line work adds error
        -- lines shouldn’t be broken with text
        -- text may be interpreted as lines
        -- automatic feature detection (road versus railroad) is difficult
     • ESRI’s ArcScan for ArcGIS (included with ArcEditor) provides interactive, semi-automated raster to vector conversion. Other vendors offer specialized conversion software
     • Digital image processing techniques are used to create “smart raster”
        -- identify feature type within each raster

  5. Data Media Conversion--Digitizing: manually tracing a map or aerial photo
     • Applied to map or aerial photo
     • Use hard copy map/photo on table/tablet, or scanned image on screen (heads-up digitizing)
     • pen or cursor detects x, y coords
        -- coordinates are in inches/cms from lower left (0,0)
        -- control points (tic marks) relate digitized coordinates to real world lat/long coordinates
     • coordinates captured in stream or point mode
     • accuracy of table (but not user!) usually better than 0.1 mm
     • all nodes and polygons should be marked and numbered first
     • essentially a vector approach
     Problems:
     • paper maps unstable
        -- crease and fold
        -- stretch with humidity (up to 3%)
        -- photos more stable (0.2%)
     • map errors transferred to GIS
        -- maps often prepared for display, not accuracy
     • human hand very shaky
        -- often generates undershoots, overshoots, & double lines
        -- editing and clean-up essential

  6. Data Format Conversion: 4 possibilities
     Transitive: the ability to reproduce the original data after conversion.
     • Vector to vector
        -- e.g. whole polygon (e.g. SAS map data) to point/arc/polygon
        -- computationally intense
        -- no accuracy loss providing data is ‘clean’
        -- perfectly transitive
     • Raster to raster
        -- may involve resampling (see under data reduction)
        -- may involve conversion between different vendors’ raster formats (e.g. GRID to BIL)
     • Vector to raster: point
        -- node x,y assigned to closest raster cell
        -- locational shift almost inevitable; error depends on raster size
        -- two points in one cell indistinguishable
        -- not transitive; cannot retrieve original data without error
     • Vector to raster: line
        -- cells assigned if touched by line
        -- stair step appearance of diagonal lines (called aliasing)
        -- can be visually improved through anti-aliasing: brightness of cells varied based on fraction of cell covered by the line
     • Raster to vector
        -- by far the most difficult
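The point case of vector to raster conversion can be sketched in a few lines. This is an illustration only, assuming a grid whose origin is its lower-left corner and square cells; point_to_cell and cell_center are hypothetical helper names. It shows both the locational shift and why the conversion is not transitive.

```python
import math


def point_to_cell(x, y, origin_x, origin_y, cell_size):
    """Return (row, col) of the cell containing (x, y); rows count up from the lower-left origin."""
    col = math.floor((x - origin_x) / cell_size)
    row = math.floor((y - origin_y) / cell_size)
    return row, col


def cell_center(row, col, origin_x, origin_y, cell_size):
    """Best available guess at the original point once only the cell is known."""
    return (origin_x + (col + 0.5) * cell_size, origin_y + (row + 0.5) * cell_size)


# Two distinct points fall in the same 10-unit cell and become indistinguishable.
a = point_to_cell(12.0, 7.0, 0.0, 0.0, 10.0)
b = point_to_cell(18.0, 3.0, 0.0, 0.0, 10.0)
print(a, b)

# Converting back gives the cell center, not either original point.
print(cell_center(*a, 0.0, 0.0, 10.0))
```

In this layout the worst-case shift is half the cell diagonal, which is why the error depends on raster size.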

  7. Vector to Raster Conversion
     [Figure: vector and raster versions of a point, an orthogonal line, and a diagonal line (more problematic). Note the use of anti-aliasing to improve the line’s visual appearance.]

  8. Raster to Vector Data Conversion: 3-step process
     • skeletonizing (or thinning): to reduce rasters to unit width
        -- peeling approach successively removes outer edges
        -- medial axis approach determines set of interior pixels farthest from outer edges
     • vector extraction: to identify lines
        -- 4-connected reconstruction
           joins center points of 4-connected neighbors if present
           particularly bad for diagonal line reproduction
        -- 8-connected reconstruction
           joins center points of 8-connected neighbors if present
           diagonal lines reproduced but adds extra lines
        -- 8-connected reconstruction with redundancy elimination
           if 4-connected neighbor line exists, don’t draw diagonal
           reduces redundant lines
     • topological reconstruction: recreates topological structure
        -- create nodes at line junctions
        -- construct arcs
        -- define polygons (manual designation required)
     Available via the ArcScan extension for ArcGIS, as well as via several specialized packages from other vendors

  9. Raster to Vector Conversion: Skeletonizing
     For example, go to: http://www.cosc.canterbury.ac.nz/people/mukundan/covn/Thin.html

  10. Raster to Vector Conversion: Vector Extraction, 4-connect reconstruction
      [Figure: raster and resulting vector]
      4-connect reconstruction: search the 4 surrounding cells and join center points if present

  11. Raster to Vector Conversion: Vector Extraction, 8-connect reconstruction
      [Figure: raster and resulting vector]
      8-connect reconstruction: search the 8 surrounding cells and join center points if present

  12. Raster to Vector Conversion: Vector Extraction, 8-connect reconstruction with redundancy elimination
      [Figure: raster and resulting vector]
      8-connect with redundancy elimination: draw diagonal from 8-cell search only if not already connected by orthogonal from 4-cell search
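The three reconstruction rules on slides 10 to 12 can be sketched as one small routine. This is one possible reading of the redundancy rule, not ArcScan's implementation: a diagonal is skipped when the two cells are already linked through a shared orthogonal neighbor. The function name extract_segments and the representation of the skeleton as a set of (row, col) cells are assumptions made for illustration.

```python
def extract_segments(cells):
    """Join the centers of neighboring skeleton cells.

    cells: iterable of (row, col) cells left after skeletonizing.
    Returns a set of ((row, col), (row, col)) segments, each pair listed once.
    """
    cells = set(cells)
    segments = set()
    for (r, c) in cells:
        # 4-connected search: look right and down only, so each pair is visited once.
        for dr, dc in ((0, 1), (1, 0)):
            if (r + dr, c + dc) in cells:
                segments.add(((r, c), (r + dr, c + dc)))
        # Diagonal part of the 8-connected search: down-right and down-left.
        for dr, dc in ((1, 1), (1, -1)):
            neighbor = (r + dr, c + dc)
            if neighbor not in cells:
                continue
            # Redundancy elimination: skip the diagonal if an orthogonal
            # path through a shared neighbor already connects the two cells.
            if (r + dr, c) in cells or (r, c + dc) in cells:
                continue
            segments.add(((r, c), neighbor))
    return segments


# A pure diagonal keeps its diagonal segments.
print(sorted(extract_segments({(0, 0), (1, 1), (2, 2)})))

# An L-shaped corner keeps only the two orthogonal segments.
print(sorted(extract_segments({(0, 0), (0, 1), (1, 1)})))
```

Dropping the second loop gives plain 4-connected reconstruction, and dropping the redundancy check gives plain 8-connected reconstruction with its extra lines.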

  13. Data Format Conversion: Implementation in ArcGIS 9
      [Figure: ArcGIS 9 conversion tools, labeled To Vector, To Raster, From Raster, From Vector]

  14. Data Reduction
      Why?
      • conserve space
         -- disk in the past
         -- comm. bandwidth today
      • conserve time
         -- reduce processing time (batch)
         -- speed response time (interactive)
      Resampling (raster data)
      • ‘average’ the 4 values in a 2-by-2 neighborhood
      • use this 1 value in a single cell occupying the location of the 4 original cells
      • use mean for interval data; rules required for ordinal or nominal data
      • not transitive!
      [Figure: the four cell values 3, 7, 2, 4 averaged to the single value 4; labels: 16 bytes, 4 bytes, 1 byte]
      Thinning (vector data)
      • often applied to data digitized in stream mode
      • tolerance elimination: remove nearest-neighbor points which are ‘too close’ (e.g. output device resolution insufficient to distinguish)
      • topological elimination*: remove points unnecessary for topo structure
      • model-based elimination: fit polynomial by least squares and record fewer points along its path
      *Normally uses the Douglas/Poiker (or Peucker) algorithm: David H. Douglas & Thomas K. Peucker, “Algorithms for the reduction of the number of points required to represent a digitized line or its caricature,” Canadian Cartographer, 1973
      Implement in ArcGIS via Advanced Editing toolbar, Generalize tool
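Both reduction techniques are short enough to sketch. The following is an illustration under stated assumptions, not the ArcGIS Generalize tool: resample_2x2_mean assumes interval data and a grid with even dimensions, and douglas_peucker is a plain recursive version of the Douglas/Peucker algorithm cited on the slide. The function names are hypothetical.

```python
import math


def resample_2x2_mean(grid):
    """Halve resolution: each output cell is the mean of a 2x2 block (interval data only)."""
    return [
        [(grid[r][c] + grid[r][c + 1] + grid[r + 1][c] + grid[r + 1][c + 1]) / 4
         for c in range(0, len(grid[0]) - 1, 2)]
        for r in range(0, len(grid) - 1, 2)
    ]


def _perpendicular_distance(p, a, b):
    """Distance from point p to the line through a and b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length = math.hypot(dx, dy)
    if length == 0:
        return math.hypot(px - ax, py - ay)
    return abs(dx * (ay - py) - dy * (ax - px)) / length


def douglas_peucker(points, tolerance):
    """Drop vertices that lie within tolerance of the simplified line."""
    if len(points) < 3:
        return list(points)
    # Find the vertex farthest from the chord joining the two end points.
    index, farthest = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _perpendicular_distance(points[i], points[0], points[-1])
        if d > farthest:
            index, farthest = i, d
    if farthest <= tolerance:
        return [points[0], points[-1]]
    # Keep that vertex and simplify each side separately.
    left = douglas_peucker(points[: index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    return left[:-1] + right


print(resample_2x2_mean([[3, 7], [2, 4]]))   # the slide's four values -> [[4.0]]
line = [(0, 0), (1, 0.05), (2, 0), (3, 2), (4, 0)]
print(douglas_peucker(line, 0.5))
```

The resampled value 4.0 cannot be turned back into 3, 7, 2, 4, which is the sense in which resampling is not transitive.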

  15. Topology & Errors
      Topology
      -- knowledge about relative spatial positioning
      -- spatial relationships between features and rules about these relationships
      -- managing data cognizant of shared geometry
      Implies knowledge of the three Cs:
      • connectivity (linked)
      • congruency (coincident/same as/on top of)
      • contiguity (adjacent)
      It is critical that spatial data be created and managed so that it is topologically clean--free from topological errors
      -- editing must always aim to maintain topological structure
      In topological editing, changes made to one feature (line, polygon, etc.) are also reflected in all other features to which it is connected, coincident, or adjacent
      In the classic GIS data structure model (as discussed in GIS Data Structures lecture) this implies that, for example
      -- all arcs have nodes at end points
      -- there is a node wherever arcs intersect or connect
      -- a single arc forms the border between contiguous polygons (e.g. Dallas and Tarrant county)
      -- a single arc represents a common boundary (e.g. state and county boundary)
      [Figure: adjacent Tarrant and Dallas counties]

  16. Errors: detection and removal
      • GIS packages commonly use topological structure checking to detect errors
      • Editing based on node snapping used to correct errors: moving a feature so its coordinates correspond exactly with another’s
      • snapping conducted based on tolerances -- snap if within 1 foot, for example
      • Care must always be taken to assure that topological “cleaning” does not itself introduce errors (e.g. snapping nodes and lines together which shouldn’t be)
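Tolerance-based node snapping can be sketched as follows. This is a simplified illustration, assuming nodes and snap targets are plain (x, y) tuples in the same units as the tolerance; snap_nodes is a hypothetical helper, not an ArcGIS function.

```python
import math


def snap_nodes(nodes, targets, tolerance):
    """Move each node onto its nearest target if that target lies within the tolerance."""
    snapped = []
    for node in nodes:
        nearest = min(targets, key=lambda t: math.dist(node, t), default=None)
        if nearest is not None and math.dist(node, nearest) <= tolerance:
            snapped.append(nearest)   # coordinates now correspond exactly
        else:
            snapped.append(node)      # too far away: leave untouched
    return snapped


# With a 1-foot tolerance the first node (0.4 ft off) snaps; the second does not.
print(snap_nodes([(10.0, 10.4), (20.0, 25.0)], [(10.0, 10.0)], 1.0))
```

A tolerance that is too generous would also pull together nodes that are separate in the real world, which is how "cleaning" can introduce errors.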

  17. Topological errors or real world occurrences? Common problems
      • dangling arc (node missing at one end)
      • no node at arc intersection (overpass?)
      • overshoot (or missing node)?
      • undershoot?
      • pseudo node (but perhaps road surface changes)
      • pseudo arc (connects to itself)
      • open polygon
      • sliver polygon
      • gap

  18. How ArcGIS Handles Topology
      • The original Coverage data model, introduced with ArcInfo in 1981, incorporated topology as a part of the data
         -- the CLEAN command checked for, and automatically “fixed”, topological errors based on a set tolerance
         -- it could introduce errors into the data
         -- the BUILD command then rebuilt polygon structures
      • ArcGIS 8.3 introduced the concept of topological rules for geodatabases, in which the topological relationships are stored as a topology feature class separate from the data itself
         -- the user can generate an error report, review each error, and then fix it in the data if desired, or mark it as an “exception”

  19. Georeferencing: Rectification and Registration (providing true earth location/overlaying layers)
      • rectification: rearrangement of location of objects to correspond to a specific reference system (usually geodetic)
      • registration: rearrangement of location of objects of one set so they correspond with those of another, without reference to a specific reference system
      • despite formal difference, often used interchangeably
      Two methods
      • homogeneous transformation via rotation, translation, scaling, skewing
         -- used for map projection and similar conversions
      • differential transformation via rubber sheeting
         -- used to correctly position distorted images or scanned maps or documents
      • Most commonly used to relate images (e.g. scanned photo) to a vector layer, but can also be used to “fix” incorrect positioning of features in a vector layer
      • Implemented in ArcMap:
         -- via the Georeferencing toolbar for images
         -- via the Spatial Adjustment toolbar for vector layers

  20. Transformation (homogeneous conversion)
      • translation of origin
         -- from digitizer origin for sheet to ‘true’ origin of GIS file
      • rotation of axis
         -- e.g. to true north
      • scaling of axis
         -- homogeneous
         -- differential (ovals to circles)
      • skewing of axis
      Changing map projections may involve all 4
      [Figure: translation, differential scaling, rotation, skewing]
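The four operations can be combined into a single coordinate transformation. The sketch below is illustrative only: the function name transform, its parameters, and the order in which the operations are applied (scale, skew, rotate, translate) are assumptions, since the slide does not fix an order.

```python
import math


def transform(point, tx=0.0, ty=0.0, angle_deg=0.0, sx=1.0, sy=1.0, skew_deg=0.0):
    """Apply scaling, skewing, rotation and translation to one (x, y) point."""
    x, y = point
    # 1. Scaling of axes; sx != sy is differential scaling (ovals to circles).
    x, y = x * sx, y * sy
    # 2. Skewing: shift x in proportion to y.
    x = x + y * math.tan(math.radians(skew_deg))
    # 3. Rotation of axes, counter-clockwise about the origin.
    a = math.radians(angle_deg)
    x, y = x * math.cos(a) - y * math.sin(a), x * math.sin(a) + y * math.cos(a)
    # 4. Translation of origin, e.g. from digitizer origin to the 'true' origin.
    return (x + tx, y + ty)


# Digitizer coordinates (2, 3) scaled differentially, then shifted to a new origin.
print(transform((2.0, 3.0), tx=100.0, ty=200.0, sx=2.0, sy=0.5))
```

Because the same parameters are applied to every point in the file, the conversion is homogeneous, in contrast to rubber sheeting.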

  21. Rubber Sheeting (differential conversion)
      • GIS file is differentially ‘stretched’ so that tic points in file overlay corresponding ground control (tie) points on earth’s surface (or tic points in a second file)
      • polynomial fitted by least squares between known ground control coords and tic point coords in GIS
         -- “least squares” minimizes the sum of the squared distances between tic/tie pairs
      • derived parameters then applied to all coordinates in file
      • after conversion, tic points are on average closer to ground control points, but not identical
      • can’t do this with a paper map!
      Control points
         -- the more the better
         -- well distributed
         -- known lat/long of ground control tie points (usually obtained from GPS) needed for rectification
         -- common identifiable points in each file needed for registration
      [Figure: GIS file map locations (tic) matched to ground control (tie)]
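The least-squares step can be sketched for the simplest case, a first-order polynomial x' = a0 + a1*x + a2*y and y' = b0 + b1*x + b2*y. This is an assumption made to keep the example short: real rubber sheeting typically uses higher-order or piecewise fits, which add terms but follow the same idea. The names solve3, fit_first_order and apply_fit are hypothetical, and the control coordinates are made up.

```python
def solve3(a, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        if abs(m[col][col]) < 1e-12:
            raise ValueError("control points are collinear or too few")
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def fit_first_order(tics, ties):
    """Least-squares coefficients mapping tic (map) points onto tie (ground control) points."""
    rows = [(1.0, x, y) for x, y in tics]
    # Normal equations: (A^T A) coefficients = A^T target, solved once per output axis.
    normal = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    coeffs = []
    for axis in (0, 1):
        rhs = [sum(r[i] * tie[axis] for r, tie in zip(rows, ties)) for i in range(3)]
        coeffs.append(solve3(normal, rhs))
    return coeffs


def apply_fit(coeffs, point):
    """Apply the derived parameters to any coordinate in the file."""
    x, y = point
    (a0, a1, a2), (b0, b1, b2) = coeffs
    return (a0 + a1 * x + a2 * y, b0 + b1 * x + b2 * y)


# Four well-distributed tic points and their (made-up) ground control coordinates.
tics = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0)]
ties = [(1000.0, 500.0), (1020.0, 500.0), (1000.0, 520.0), (1020.0, 520.0)]
coeffs = fit_first_order(tics, ties)
print(apply_fit(coeffs, (5.0, 5.0)))
```

With more than three control points the fit is overdetermined, so transformed tic points generally land near, but not exactly on, their ground control points.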

  22. Edge Matching: joining map sheets to create a seamless GIS
      Corresponding features fail to match on two sheets:
      • process required for topo. consistency even if features line up visually
      • snapping used to connect features
      Issues
      • acceptable tolerance before ‘further investigation’ of mismatch
      • ‘how far back’ to go on sheet(s) with adjustments for mismatch
      Causes of mismatch
      • paper map shrinkage/expansion
      • errors from digitizing/scanning
      • georeferencing errors
      • accuracy of equipment
      • extrapolation or round-off errors
      • overlapping map coverage
      Implement in ArcGIS 9 by:
      • ArcToolbox>Data Management>General>Append (replaces Geoprocessing Tools>Merge in AG 8)
         -- combines two (or more) files, but does not link features
      • Spatial Adjustment toolbar, edge match tool
         -- links features (after links have been manually identified)
      [Figure caption: Edge matching in this example would likely require ‘further research’]

  23. Image Adjustments: raster/image data issues
      Raster data is made from separate images (photos) or tiles which are mosaicked to produce a “seamless image”
      Collars: must be removed for seamless image
      • overlap between adjacent images
      • borders of scanned maps
      Image Balancing and Feathering: adjusting radiometry for consistent and/or desired image color, brightness, contrast
      • checker board appearance
      • abrupt line between adjacent images
      • brightness levels wash out detail in highly reflective areas, but enhance detail in low reflectance areas
      • inconsistent signature for same features, especially water as function of wind or sun relative to camera (and is it blue?)
      Digital Ortho adjustments:
      • ground control (usually with GPS for visible points) to obtain ‘real world’ location
      • ground control for camera’s angle relative to ground
      • camera calibration data to remove lens distortion
      • digital terrain model (DTM) to remove elevation “distance” (5 mi. on map to mountain top, but 6 mi. walking or on photo if mountain is 5,280 feet high!)

  24. Collar removal required.

  25. Image balancing/feathering required.

  26. 2005 NCTCOG Digital Orthos
      [Figure: tiles, before and after]

  27. Interpolation: to create regular spacings from irregular data (e.g. creating raster elevation surface from set of point height measurements)
      • estimating values for locations with no data based on:
         -- known values, and
         -- understanding of spatial behavior of phenomena
      • generally, should assign more importance to closer known values than those further away
      • weighting functions
         -- average closest n (2?) points: ignores distance
         -- fit line between closest 2
         -- fit surface between closest 3
      • trend surface approaches
         -- one high order polynomial: oscillation a problem
         -- finite element approach: fit separate polynomials for each local area
      • kriging: uses correlations of values with distance
      [Figure: estimated values]
      Implemented in ArcGIS 9 via ArcToolbox>Spatial Analyst Tools>Interpolation
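One simple way to give closer known values more importance is inverse distance weighting (IDW). The slide does not name IDW, so treat this as an assumed example of a weighting function rather than the method the lecture prescribes; the function name idw, the power parameter and the sample values are illustrative.

```python
import math


def idw(x, y, samples, power=2.0, n_nearest=None):
    """Estimate a value at (x, y) from samples given as (x, y, value) tuples."""
    ranked = sorted(samples, key=lambda s: math.hypot(s[0] - x, s[1] - y))
    if n_nearest is not None:
        ranked = ranked[:n_nearest]   # use only the closest n known points
    weighted_sum = weight_total = 0.0
    for sx, sy, value in ranked:
        d = math.hypot(sx - x, sy - y)
        if d == 0:
            return float(value)       # exactly on a known point
        w = 1.0 / d ** power          # weight falls off with distance
        weighted_sum += w * value
        weight_total += w
    return weighted_sum / weight_total


# Two height measurements 10 units apart.
heights = [(0.0, 0.0, 10.0), (10.0, 0.0, 20.0)]
print(idw(5.0, 0.0, heights))   # midway between the two points
print(idw(2.0, 0.0, heights))   # pulled toward the closer value of 10
```

Calling idw once per cell center of a regular grid turns the irregular point set into a raster surface.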

  28. Conflation
      • create new master coverage from the best spatial and attribute qualities of two or more source coverages
         -- combine multiple coverages into one to simplify support
         -- updated data obtained (e.g. new TIGER file) but need to preserve enhancements made to earlier version
         -- two groups modify a single file, then need to recreate single version which preserves mods
      • create new master coverage from quality spatial data in one source and quality attribute data in another
         -- somewhat narrower definition
      • Depending on the situation, can require application of a variety of processing tools and can be labor intensive
      • Approaches available within ArcGIS 9 include
         -- Spatial Adjustment toolbar, specifically attribute transfer tool
         -- ArcToolbox>Analysis Tools>Overlay>Update
         -- other add-ins available, such as
            MapMerge from ESEA, Mountain View CA, for ArcGIS
            GIS/T-Conflate for transportation applications

  29. NAVSTAR Global Positioning System (GPS)
      • use to collect ground control for imagery/orthos
      • or for point/line data (manholes, roads, etc.)
      NAVSTAR Satellite Program
      • 24 NAVSTAR (NAVigation Satellite Time and Ranging) satellites in 11,000 mile orbit provide 24 hour coverage worldwide
      • first launched 1978; full system operational December 1993
      • GPS receiver computes locations/elevations via signals from simultaneously visible satellites (minimum 3 for 2-D, 4 for 3-D)
      • Selective Availability (SA) security system
         -- 100m accuracy with single receiver, if active
         -- 10-15m accuracy if inactive
         -- SA turned off May 1st, 2000
            multiple ways to counteract SA
            even USCG broadcast a correction signal!
            Europeans threatened to compete
            regional denial of signal possible
      • Russia’s 21-satellite GLONASS (Global Navigation Satellite System) also available
      Types of Ground Collection and Correction
      • Autonomous
         -- hand-held unit provides 10m accuracy (with SA off)
         -- $150-$1,500 per unit
      • WAAS (wide area augmentation system)
         -- <3 meter accuracy in practice (spec. is 7m vert/horiz)
         -- base stations (25 across US) monitor satellites
         -- 2 master stations (E & W coast) calculate corrections
         -- upload to two geosynchronous satellites over equator
         -- correction signal broadcast to GPS receivers (no special extra equipment needed, unlike DGPS)
         -- began operation June, 1998
         -- to be expanded to cover Canada, Mexico, Panama
         -- European EGNOS, Asian MSAS under development
      • Differential (DGPS, predecessor to WAAS)
         -- accuracy 1-5m depending on equipment/exact method
         -- equipment $1,500-$15,000 per receiver
         -- correct for SA and other errors via either
            real time correction signals over FM radio, or
            post processing with data from Internet
      • Kinematic: high accuracy engineering (within cms)
         -- two receivers (base station and rover) must lock on to satellites
         -- equipment $15-30K per station

  30. Factors Affecting GPS Accuracy
      • ionosphere
         -- worst in evening at low altitudes (but ephemeris best there)
      • troposphere
         -- especially water vapor, which slows signal
      • multipath
         -- reflected signals from buildings, cliffs, etc.
      • ephemeris
         -- position and number of satellites in sky
         -- 4 required for 3D (horiz. and vertical), 3 for 2D (no elevation)
         -- ideally, 3 every 120° around the horizon at 20° elevation, and 1 directly above
      • blockage (of satellite signal)
         -- by foliage, buildings, cliffs, etc.
         -- WAAS signal especially subject to blocking by terrain & buildings because it comes from a geostationary equatorial satellite
      Overall, accuracy better at night than during day.

  31. Conclusion
      Most of the effort in most GIS projects involves data preparation and integration!
