
The Nature of Geographic Data

edda

Presentation Transcript


  1. The Nature of Geographic Data

  2. The Paper Map • A long and rich history • Has a scale or representative fraction • The ratio of distance on the map to distance on the ground • Is a major source of data for GIS • Obtained by digitizing or scanning the map and registering it to the Earth’s surface • Digital representations are much more powerful than their paper equivalents
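
The representative fraction on slide 2 can be turned into a one-line conversion. The sketch below is illustrative only; the function name and the 1:24,000 example scale are assumptions, not taken from the slides.

```python
def ground_distance_m(map_distance_cm: float, scale_denominator: float) -> float:
    """Convert a distance measured on the map (cm) to a ground distance (m).

    A representative fraction of 1:24,000 means 1 unit on the map
    corresponds to 24,000 of the same units on the ground.
    """
    ground_cm = map_distance_cm * scale_denominator
    return ground_cm / 100.0  # centimetres to metres


# 4 cm measured on a hypothetical 1:24,000 map sheet
print(ground_distance_m(4.0, 24_000))  # 960.0
```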

  3. Representations • Are needed to convey information • Fit information into a standard form or model • Almost always simplify the truth that is being represented • There is no information in the representation about daily journeys to work and shop, or vacation trips out of town

  4. Digital Representation • Uses only two symbols, 0 and 1, to represent information • N symbols (bits) → 2^N distinct values • Many standards allow various types of information to be expressed in digital form • MP3 for music • JPEG for images • ASCII for text • GIS relies on standards for geographic data
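
The link between the number of bits and the number of distinct values on slide 4 can be checked directly. This is a small sketch; the helper name is hypothetical.

```python
def distinct_values(n_bits: int) -> int:
    """Number of distinct values that n_bits binary symbols can encode."""
    return 2 ** n_bits


for n in (1, 8, 16):
    print(n, "bits ->", distinct_values(n), "values")

# ASCII stores each character as a number; here is 'G' written as 8 bits.
print(format(ord("G"), "08b"))  # 01000111
```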

  5. Why Digital? • Economies of scale • One type of information technology for all types of information • Simplicity • 0, 1 → on, off • Reliability • Systems can be designed to correct errors • Easily copied and transmitted • Perfect copies • At close to the speed of light

  6. Accuracy of Representations • Representations can rarely be perfect • Details can be irrelevant, or too expensive and voluminous to record • It’s important to know what is missing in a representation • Representations can leave us uncertain about the real world

  7. The Fundamental Problem • Geographic information links a place, and often a time, with some property of that place (and time) • “The temperature at 34 N, 120 W at noon local time on 12/2/99 was 18 Celsius” • The potential number of properties is vast • In GIS we term them attributes • Attributes can be physical, social, economic, demographic, environmental, etc.

  8. Types of Attributes • Nominal, e.g. land cover class • Distinction (“a” is/is not “b”) • Ordinal, e.g. a ranking • Significance (“a” is X-er than “b”) • Interval, e.g. Celsius temperature • Relative magnitude (“a” is N units X-er than “b”) • interpolable • Ratio, e.g. Kelvin temperature • Absolute magnitude (“a” is N times X-er than “b”) • scalable

  9. Cyclic Attributes • Do not behave as other attributes • What is the average of two compass bearings, e.g. 350 and 10? • Occur commonly in GIS • Wind direction • Slope aspect • Flow direction • Special methods are needed to handle and analyze
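
For the compass question on slide 9, the arithmetic mean of 350 and 10 is 180, which points the opposite way. One common special method, assumed here rather than taken from the slides, is to average the unit vectors of the bearings and convert the result back to an angle.

```python
import math


def circular_mean_deg(bearings):
    """Mean direction of compass bearings (degrees), via unit-vector averaging."""
    s = sum(math.sin(math.radians(b)) for b in bearings)
    c = sum(math.cos(math.radians(b)) for b in bearings)
    # atan2 recovers the angle of the summed vector; % 360 maps it to [0, 360).
    return math.degrees(math.atan2(s, c)) % 360.0


def angular_difference_deg(a, b):
    """Smallest absolute difference between two bearings, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)


mean_bearing = circular_mean_deg([350, 10])
# The result is north, up to floating-point rounding (it may print as a value
# a hair below 360 rather than exactly 0).
print(mean_bearing, angular_difference_deg(mean_bearing, 0.0))
```

The mean direction is undefined when the vectors cancel out, for example for bearings 0 and 180, so real code should check the length of the summed vector first.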

  10. The Fundamental Problem • The number of places and times is also vast • Potentially infinite • The more closely we look at the world, the more detail it reveals • Potentially ad infinitum • The geographic world is infinitely complex • Humans have found ingenious ways of dealing with this problem • Many methods are used in GIS to create representations or data models

  11. Types of Spatial Data • Discrete: definitive, with concrete, observable boundaries • Continuous: no easily discernible boundaries; “fuzziness” depends on scale

  12. Types of Spatial Data • Continuous spatial data: geostatistics • Samples may be taken at intervals, but the spatial process is continuous • e.g. soil quality • Discrete data • Irregular: zonal data, regions, states, districts, postcodes, zipcodes • Regular lattice data: constructed grid, ‘raster’ representation

  13. Discrete Objects and Fields • Two ways of conceptualizing geographic variation • The most fundamental distinction in geographic representation • Discrete objects • The world as a table-top • Objects with well-defined boundaries

  14. Discrete Objects • Points, lines, and areas • Countable • Persistent through time, perhaps mobile • Biological organisms • Animals, trees • Human-made objects • Vehicles, houses, fire hydrants

  15. Fields • Properties that vary continuously over space • Value is a function of location • Property can be of any attribute type, including direction • Elevation as the archetype • A single value at every point on the Earth’s surface • The source of metaphor and language • Any field can have slope, gradient, peaks, pits

  16. Examples of Fields • Soil properties, e.g. pH, soil moisture • Population density • But at fine enough scale the concept breaks down • Name of county or state or nation • Atmospheric temperature, pressure • Pollution level • Groundwater quality information

  17. Difficult Cases • Lakes and other natural phenomena • Often conceived as objects, but difficult to define or count precisely • “When is a heap of sand no longer a heap?” • Weather forecasting • Forecasts originate in models of fields, but are presented in terms of discrete objects • Highs, lows, fronts

  18. Rasters and Vectors • How to represent phenomena conceived as fields or discrete objects? • Raster • Divide the world into square cells • Register the corners to the Earth • Represent discrete objects as collections of one or more cells • Represent fields by assigning attribute values to cells • More commonly used to represent fields than discrete objects

  19. Raster representation. Each color represents a different value of a nominal-scale field denoting land cover class. Legend: mixed conifer, Douglas fir, oak savannah, grassland.

  20. Characteristics of Rasters • Pixel size • The size of the cell or picture element, defining the level of spatial detail • All variation within pixels is lost • Assignment scheme • The value of a cell may be an average over the cell, or a total within the cell, or the commonest value in the cell • It may also be the value found at the cell’s central point
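
The assignment schemes on slide 20 can be compared on a single cell. The sketch below assumes the cell is described by a small square block of finer-resolution values; the function name, scheme labels, and sample numbers are all hypothetical.

```python
from collections import Counter
from statistics import mean


def aggregate_cell(block, scheme):
    """Reduce a square block of fine-resolution values to one cell value."""
    flat = [v for row in block for v in row]
    if scheme == "mean":        # average over the cell
        return mean(flat)
    if scheme == "total":       # total within the cell
        return sum(flat)
    if scheme == "commonest":   # most frequent value in the cell
        return Counter(flat).most_common(1)[0][0]
    if scheme == "center":      # value at the cell's central point
        mid = len(block) // 2
        return block[mid][mid]
    raise ValueError(f"unknown scheme: {scheme}")


block = [
    [1, 1, 2],
    [1, 5, 2],
    [3, 3, 1],
]
for scheme in ("mean", "total", "commonest", "center"):
    print(scheme, aggregate_cell(block, scheme))
```

The four schemes give four different answers for the same cell, which is why the scheme needs to be known before raster values are interpreted. The commonest-value scheme is the one that makes sense for nominal attributes such as land cover class.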

  21. Vector Data • Used to represent points, lines, and areas • All are represented using coordinates • Points • One coordinate pair per point • Lines as polylines • Straight lines between points • Areas as polygons • Straight lines between points, connecting back to the start • May have “holes” and “islands”

  22. Raster vs Vector • Volume of data • Raster becomes more voluminous as cell size decreases • Source of data • Remote sensing, elevation data come in raster form • Vector favored for administrative or discrete data • Software • Some GIS better suited to raster, some to vector

  23. Generalization • GIS data may preserve data beyond what you need or want • ArcGIS can differentiate between incredibly small values • State Plane (feet) default is 0.003937 inches • Software may have difficulties displaying overly detailed data at smaller scales

  24. Spatial Autocorrelation • First law of geography: “everything is related to everything else, but near things are more related than distant things” (Waldo Tobler) • Many new geographers would say “I don’t understand spatial autocorrelation” • In fact, they do understand the concept; it is the mechanics they don’t understand

  25. Spatial Autocorrelation • Spatial Autocorrelation – correlation of a variable with itself through space. • If there is any systematic pattern in the spatial distribution of a variable, it is said to be spatially autocorrelated • If nearby or neighboring areas are more alike, this is positive spatial autocorrelation • Negative autocorrelation describes patterns in which neighboring areas are unlike • Random patterns exhibit no spatial autocorrelation

  26. Positive spatial autocorrelation

  27. Overly dispersed - negatively autocorrelated

  28. Random - no spatial autocorrelation

  29. Importance of Spatial Autocorrelation • Most statistics are based on the assumption that the values of observations in each sample are independent of one another • Positive spatial autocorrelation may violate this, if the samples were taken from nearby areas • Goals of spatial autocorrelation analysis • Measure the strength of spatial autocorrelation in a map • Test the assumption of independence or randomness
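
The slides do not name a specific measure of spatial autocorrelation. One widely used statistic is Moran's I, and the sketch below assumes it, together with rook adjacency (cells sharing an edge are neighbors, each with weight 1) on a small grid. The grids are made-up examples.

```python
def morans_i(grid):
    """Moran's I for a rectangular grid using rook (edge-sharing) neighbors."""
    rows, cols = len(grid), len(grid[0])
    values = [v for row in grid for v in row]
    n = len(values)
    mean = sum(values) / n
    # Total squared deviation; zero for a constant grid, where I is undefined.
    denom = sum((v - mean) ** 2 for v in values)

    num = 0.0      # sum of cross-products over neighboring pairs
    w_total = 0    # sum of all weights (each neighbor link has weight 1)
    for r in range(rows):
        for c in range(cols):
            for dr, dc in ((0, 1), (1, 0), (0, -1), (-1, 0)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    num += (grid[r][c] - mean) * (grid[rr][cc] - mean)
                    w_total += 1
    return (n / w_total) * (num / denom)


clustered = [[0, 0, 1, 1]] * 4                       # like values side by side
checkerboard = [[(r + c) % 2 for c in range(4)] for r in range(4)]

print("clustered:   ", morans_i(clustered))      # positive
print("checkerboard:", morans_i(checkerboard))   # close to -1
```

Values near zero indicate no spatial autocorrelation. Judging whether a given value differs from a random pattern needs a significance test, which this sketch omits.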

  30. Why does spatial autocorrelation occur? • Reaction functions? • Spillovers, externalities? • Unobserved similarities between places? • Diffusion? (disease spread) • Common activity in neighboring areas? (crime) • Common policy across neighboring areas? (zoning)

  31. Sampling • The sampling density determines the resolution of the data • Samples taken at 1 km intervals will miss variation smaller than 1 km • Standard approaches to sampling: • Random • Systematic • Stratified

  32. Random samples • Every location is equally likely to be chosen

  33. Systematic samples • Sample points are spaced at regular intervals

  34. Stratified samples • Requires knowledge about distinct, spatially defined sub-populations (spatial subsets such as ecological zones) • More sample points are chosen in areas where higher variability is expected

  35. Stratified samples
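
The three sampling approaches can be sketched for a rectangular study area. Everything below is an assumption made for illustration: the function names, the 100 x 100 extent, and the choice to describe each stratum as a rectangle with its own number of points.

```python
import random


def random_sample(n, width, height, rng):
    """n points, every location equally likely."""
    return [(rng.uniform(0, width), rng.uniform(0, height)) for _ in range(n)]


def systematic_sample(spacing, width, height):
    """Points on a regular grid, offset by half the spacing from the edges."""
    points = []
    y = spacing / 2
    while y < height:
        x = spacing / 2
        while x < width:
            points.append((x, y))
            x += spacing
        y += spacing
    return points


def stratified_sample(strata, rng):
    """Random points within each stratum: (xmin, ymin, xmax, ymax, n_points)."""
    points = []
    for xmin, ymin, xmax, ymax, n in strata:
        points.extend(
            (rng.uniform(xmin, xmax), rng.uniform(ymin, ymax)) for _ in range(n)
        )
    return points


rng = random.Random(42)  # fixed seed so the sketch is repeatable
print(len(random_sample(16, 100, 100, rng)))
print(len(systematic_sample(25, 100, 100)))
# The right-hand zone is assumed to be more variable, so it gets more points.
zones = [(0, 0, 50, 100, 5), (50, 0, 100, 100, 20)]
print(len(stratified_sample(zones, rng)))
```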

  36. Using (Geospatial) Statistics • As always, error propagates and grows through subsequent analyses • Correlation does not mean causation • Sampling method may introduce bias • Models and measurements must be appropriate for your dataset • With GIS data, the model must be geo-aware

  37. Pearson’s r & r² • r is the correlation coefficient between two sets of values • Ranging from -1 to +1, r gives the direction and strength of the linear relationship • Squaring r gives r², the proportion of the variance in one set that is explained by a linear relationship with the other • The relationship can be plotted as a “best-fit” or trend line
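
Pearson's r can be computed by hand from its textbook definition: the covariance of the two sets divided by the product of their standard deviations. The sketch below uses made-up numbers.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    # Sum of cross-products of deviations (n times the covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations.
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
print(round(r, 4), round(r ** 2, 4))  # r is about 0.77, r squared is 0.6
```

This calculation treats the observations as independent, so with spatially autocorrelated data the resulting r should be interpreted with care.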

  38. Plotting Correlation

  39. Gravity Model • The gravity model applies concepts from physics to the social sciences • The “masses” of two urban places and the distance between them influence the migratory bond between them • Population (people, employment) and distance decay affect the degree to which two places are “bonded”
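
The slide gives no formula, so the sketch below assumes the commonly taught form of the gravity model: interaction is proportional to the product of the two populations divided by distance raised to a decay exponent. The constant `k`, the exponent `beta`, and the sample populations are hypothetical.

```python
def gravity_interaction(pop_a, pop_b, distance, k=1.0, beta=2.0):
    """Expected interaction between two places under a simple gravity model.

    k    : scaling constant (assumed, normally calibrated from data)
    beta : distance-decay exponent (assumed; 2 mirrors Newtonian gravity)
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return k * pop_a * pop_b / distance ** beta


near = gravity_interaction(1_000_000, 500_000, distance=100)
far = gravity_interaction(1_000_000, 500_000, distance=200)
print(near / far)  # doubling the distance cuts the interaction to a quarter
```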

  40. Self-similarity and fractals

  41. The Koch Snowflake • First iteration • After 2 iterations

  42. After 3 iterations

  43. After n iterations

  44. After ∞ iterations (work with me here, people)

  45. The Koch snowflake is six of these put together to form . . . . . . well, a snowflake.

  46. Notice that the perimeter of the Koch snowflake is infinite . . . . . . but that the area it bounds is finite (indeed, it is contained in the white square).
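
The claim that the perimeter grows without bound while the area stays finite can be checked numerically. The sketch assumes the usual construction that starts from an equilateral triangle and, at each iteration, replaces the middle third of every side with two sides of a smaller triangle. The function name is hypothetical.

```python
import math


def koch_snowflake(side, iterations):
    """Return (perimeter, area) of the Koch snowflake after some iterations."""
    n_sides = 3
    length = side
    area = math.sqrt(3) / 4 * side ** 2  # starting equilateral triangle
    for _ in range(iterations):
        length /= 3
        # Every existing side gains one new small triangle of side `length`.
        area += n_sides * math.sqrt(3) / 4 * length ** 2
        # Each side is replaced by four shorter sides.
        n_sides *= 4
    return n_sides * length, area


for n in (0, 1, 2, 5, 10, 50):
    perimeter, area = koch_snowflake(1.0, n)
    print(n, round(perimeter, 3), round(area, 6))
```

Each iteration multiplies the perimeter by 4/3, so it has no upper limit, while the area approaches 8/5 of the starting triangle's area.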

  47. Importance of Fractals • The precision at which you measure linear features influences the total length • What measurement is “right”? • Self-similarity of features • A craggy shoreline will have a similar pattern at small and large scales • An agglomeration of urban neighborhoods into a city mirrors the pattern of cities creating a region

  48. Coastline Paradox • Just like the fractal snowflake, the coastline of an island does not have a well-defined length.
