GIS BOOTCAMP Todd Bacastow
Geography matters! • ‘Geographic Information’ is information which can be related to specific locations. • Most human activity depends on geographic information.
Topic 1: What is GIS? • Dozens of possible definitions • Some emphasise the technology • The Hardware • The Software • Others focus on applications • Other terms often encountered: LIS, AM/FM, Geo-information systems, etc. • May emphasise different roles for the system, e.g. spatial decision support system, spatial database system, etc.
One definition of GIS (Dueker and Kjerne, 1989) • “Geographic Information Systems - A system of hardware, software, data, people, organizations and institutional arrangements for collecting, storing, analysing, and disseminating information about areas of the Earth”
Geographic Information System • Concepts such as location, direction, distance, proximity, adjacency provide links between different data • Geographic information is usually broken down into three linked components: • Space • Time • Attribute
Geographic Information System • An Information System is a set of processes, executed on raw data, to produce information which will be useful in decision-making
Geographic Information System • In a system the whole is greater than the sum of its parts (Aristotle, C4th BC) • GIS is a convergence of technological fields and traditional disciplines • Not just technology: the data, people and institutional context are as much part of GIS as are the computers and software
GIS is the convergence of many disciplines: • Geography • Cartography • Remote Sensing • Photogrammetry • Surveying • Geodesy • Statistics • Operations Research • Computer Science • Mathematics • Civil Engineering • Business management • Behavioural science • Etc.
GIS as a tool • Majority view of GIS • Focus is on hardware, software and routines • A technocentric perspective • The favoured viewpoint of the system vendors
GIS as science • Emphasis is on data, human uses, contexts • A more academic perspective • Geographic information science is the “science behind the systems” • Includes concepts of spatial reasoning, cognition, human-machine communication, visualisation, data modelling, etc.
GIS is a product of a particular culture • Most GIS developed in Europe/N. America • USA: Arc/Info, ArcView, Intergraph, Bentley, Autodesk, MAP, GRASS... • Canada: Caris, Spans, GeoVision... • France: GeoConcept, Carto 2-D... • UK: Smallworld, GIMMS, Laserscan... • Netherlands: ILWIS, PC Raster...
GIS is a commercial product • Developments often driven by commercial considerations, less by scientific ones • Vendors’ decisions usually based on questions of profitability • Critical evaluation of proprietary GIS is rare
Boundaries of GIS are being pushed back • GIS techniques and concepts increasingly seen in other areas and applications: • “Office” type software • In-car navigation and other route-finding systems • Multimedia presentations • The Internet • WAP, SMS, & MMS phone technology
What GIS is not • GIS is not simply the technology: it also has a (growing and important) conceptual base • GIS cannot produce good results from bad data or poor conceptual frameworks • GIS is not simply a program to produce maps • GIS is not a substitute for thinking! • GIS is not the universal answer to all problems!
Data input - a major bottleneck • Costs of input often >80% of project costs • Labor intensive, tedious, error-prone • Construction of the database may become an end in itself • the project may not move on to analysis of the data collected • Essential to find ways to reduce costs, maximise accuracy
Manual data conversion involves three stages • Stage 1: Geocoding • The conversion of analogue maps to digital form • Stage 2: Entering attribute values • e.g. the heights to associate with digitised contour lines • Stage 3: Linking attribute data to their own geocoded features
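The three stages can be sketched in a few lines of Python. This is only an illustration: the feature IDs, coordinates, the `height_m` field and the `link_attributes` helper are all hypothetical, and a real GIS would hold this in its own database structures.

```python
# Stage 1 (assumed result): digitised contour lines keyed by a feature ID.
features = {
    101: [(0.0, 0.0), (5.0, 1.0), (9.0, 0.5)],
    102: [(0.0, 2.0), (5.0, 3.0), (9.0, 2.5)],
}

# Stage 2 (assumed result): attribute values entered separately, same IDs.
attributes = {101: {"height_m": 50}, 102: {"height_m": 60}}


def link_attributes(features, attributes):
    """Stage 3: attach each attribute record to the feature with the same ID."""
    linked = {}
    for fid, geometry in features.items():
        if fid not in attributes:
            # A feature without attributes is a data-entry error worth catching.
            raise KeyError(f"feature {fid} has no attribute record")
        linked[fid] = {"geometry": geometry, **attributes[fid]}
    return linked


contours = link_attributes(features, attributes)
print(contours[101]["height_m"])
```

The shared ID is the only thing tying geometry to attributes, which is why mistakes in stage 3 are easy to make and hard to spot.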
Digital map data: three possible situations. • The data we want already exist • Hopefully we can find and buy them (or they may even be free!) • Data exist but not in digital form • Will require conversion from analogue format • Data do not exist at all • Will need to collect the data ourselves by remote sensing, field data collection, etc.
Data exist in digital form • To be useful, have to be in right format, resolution, etc. • Metadata can inform us as to fitness for purpose • unfortunately such information not always available • may lead to misinterpretation, false expectations about accuracy
Sources of digital map data • National Mapping Organization • Other government agencies • Commercial data vendors
Standards • standards may be set to • assure uniformity within a single data set or across several data sets • ensure the data can be shared across different hardware and software platforms
Some popular standards for digital map data include • For Vector data • DXF and DWG • NTF • DLG • TIGER • SDTF • DIGEST • .E00 (Arc Export) format • Shapefiles • For Raster data • BIL • BSQ • DEM • TIFF • JPEG • BMP
Data exist but not in digital form • Need tools to convert analogue maps or other source documents to digital format • Digitizing may be performed manually or through automation • Manual methods tedious & error prone • Automated techniques may create bigger editing problems later
What if the Data do not exist at all? • Field data capture • May be done manually (e.g. direct survey), automatically (e.g. automatic data loggers, etc.) or a combination of the two • Remote sensing • Includes satellite imagery, geophysical survey, air photos • May be used as alternative source of data
Criteria for choosing modes of input • Type of data source • images favour scanning • maps can be scanned or digitised • Database model of the GIS • scanning easier for raster, digitising for vector • Density of data • dense linework makes for difficult digitizing • Expected applications of the GIS implementation
Integrating different data sources: issues • Formats • many different format standards exist • a good GIS can accept and generate datasets in a wide range of standard formats
Integrating different data sources: issues • Projections • Many ways exist to represent curved surface of the earth on a flat map • Some projections are very common • A good GIS can convert data from one projection to another, or to latitude/longitude • Input derived from maps by scanning or digitizing retains the original map's projection • With data from different sources, a GIS database often contains information in more than one projection, and must use conversion routines if data are to be integrated or compared
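As an illustration of what a projection conversion routine does, the sketch below converts between latitude/longitude and one specific projection, the spherical Mercator. The choice of projection and the sphere radius are assumptions made for the example; the slides do not name a projection, and a production GIS would use an established projection library rather than hand-written formulas.

```python
import math

R = 6378137.0  # assumed sphere radius in metres (the WGS 84 semi-major axis)


def lonlat_to_mercator(lon_deg, lat_deg):
    """Forward spherical Mercator: degrees to projected metres."""
    x = R * math.radians(lon_deg)
    y = R * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y


def mercator_to_lonlat(x, y):
    """Inverse spherical Mercator: projected metres back to degrees."""
    lon = math.degrees(x / R)
    lat = math.degrees(2 * math.atan(math.exp(y / R)) - math.pi / 2)
    return lon, lat


# Round trip for an arbitrary example point.
x, y = lonlat_to_mercator(-77.86, 40.79)
lon, lat = mercator_to_lonlat(x, y)
print(round(lon, 6), round(lat, 6))
```

Two datasets in different projections can be compared only after both are brought to a common system, for example by inverting each to latitude/longitude as above.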
Integrating different data sources: issues • Scale • data may be input at a variety of scales • scale is an important indicator of accuracy • maps of the same area at different scales will often show the same features • variation in scales can be a major problem in integrating data
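One way to see why scale indicates accuracy is to work out how much ground a drawn line covers. The sketch below assumes a 0.5 mm line width, which is an illustrative figure rather than one taken from the slides.

```python
def ground_distance_m(map_distance_mm, scale_denominator):
    """Ground distance in metres for a distance measured on the map in mm."""
    return map_distance_mm * scale_denominator / 1000.0


LINE_WIDTH_MM = 0.5  # assumed width of a drawn line on the map

for denominator in (10_000, 50_000, 250_000):
    metres = ground_distance_m(LINE_WIDTH_MM, denominator)
    print(f"1:{denominator:,} -> a {LINE_WIDTH_MM} mm line covers {metres:g} m")
```

Under this assumption the same feature is located to within about 5 m at 1:10,000 but only to within about 125 m at 1:250,000, so data digitised from the two maps will not line up exactly.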
Integrating different data sources: issues • Resampling rasters • Raster data from different sources may use different pixel sizes, orientations, positions, projections • Resampling is the process of interpolating information from one set of pixels to another • Resampling to larger pixels is comparatively safe, resampling to smaller pixels is very dangerous
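The asymmetry between the two directions of resampling can be shown with a small grid. The sketch below uses block averaging to move to larger pixels and simple value replication to move back to smaller ones. Both methods, the function names and the elevation values are assumptions chosen for the example, since the slides do not prescribe an interpolation method.

```python
def aggregate_mean(grid, factor):
    """Resample to pixels `factor` times larger by averaging each block."""
    rows, cols = len(grid), len(grid[0])
    if rows % factor or cols % factor:
        raise ValueError("grid size must be a multiple of factor")
    out = []
    for r in range(0, rows, factor):
        out_row = []
        for c in range(0, cols, factor):
            block = [grid[r + i][c + j]
                     for i in range(factor) for j in range(factor)]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out


def replicate(grid, factor):
    """Resample to pixels `factor` times smaller by repeating each value."""
    out = []
    for row in grid:
        fine_row = [v for v in row for _ in range(factor)]
        out.extend(list(fine_row) for _ in range(factor))
    return out


elevation = [
    [10, 12, 20, 22],
    [14, 16, 24, 26],
    [30, 32, 40, 42],
    [34, 36, 44, 46],
]
coarse = aggregate_mean(elevation, 2)
fine_again = replicate(coarse, 2)
print(coarse)
print(fine_again == elevation)
```

Going back to small pixels produces a grid of the original size, but the within-block variation is gone. The finer grid suggests a level of detail the data no longer contain, which is the danger the slide refers to.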
Representing Spatial Entities • The object-focused approach • Based on recognition of discrete objects or entities • May be layer-based or object-oriented • Usually represented by Vector GIS
Two ways of representing space in a GIS • The Tesseral (field-oriented) approach • Typically seen in Raster GIS • Also in some other models
Vector data models • Based on the recognition of discrete objects or entities • The location/boundaries of these objects defined with respect to some coordinate system • Emphasis is on boundaries, space within and between boundaries implied • Objects are usually defined in terms of points, lines and areas • Complex graphic objects are seen as amalgamations of simpler ones • Typical Vector GIS include ARC/INFO, MapInfo, Intergraph MGE
Separation of Locational and Attribute data • In vector GIS, geographic information is represented in terms of • Locational / geometric data (“where?”) • Attribute information (“what?”) • Relationships between objects and attributes
The vector data model • Fundamental spatial primitive is a point • Defined by a single x,y coordinate pair • Points can be used to • locate spatial objects • represent Vertices (singular = “vertex”) defining a line • represent Nodes defining start- or end-points on lines, junctions where lines meet, etc.
The vector data model • Sequences of points can be used to define lines • Lines themselves can be aggregated to represent • Networks • Boundaries of polygons and regions • Topographic features (contours, breaks of slope, etc.).
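A minimal sketch of how lines and polygon boundaries are built from point sequences, assuming planar x,y coordinates. The function names and sample coordinates are illustrative only.

```python
import math


def line_length(points):
    """Length of a line defined by a sequence of (x, y) vertices."""
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))


def polygon_area(ring):
    """Area enclosed by a boundary ring, using the shoelace formula.

    `ring` lists each vertex once; the closing segment back to the
    first vertex is implied.
    """
    n = len(ring)
    twice_area = sum(
        ring[i][0] * ring[(i + 1) % n][1] - ring[(i + 1) % n][0] * ring[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2


road = [(0, 0), (3, 4), (3, 10)]          # a line: three vertices
field = [(0, 0), (4, 0), (4, 3), (0, 3)]  # a polygon boundary: four vertices

print(line_length(road))    # 11.0
print(polygon_area(field))  # 12.0
```

Only the boundary coordinates are stored. The length and the enclosed area are derived from them, which is what the slides mean by space within boundaries being implied.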
Topology • An essential element of vector GIS • A distinct branch of mathematics • Defines spatial relationships between objects • Adjacency, connectivity, containment, etc. • Essential for most vector GIS operations
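Adjacency is one relationship that can be derived from geometry and then stored explicitly. The sketch below treats two polygons as adjacent when they share a boundary segment with identical end points. That rule, the helper names and the sample parcels are assumptions for illustration; real topological data structures are considerably richer.

```python
from collections import defaultdict
from itertools import combinations


def edges(ring):
    """Undirected boundary segments of a ring (closing segment implied)."""
    n = len(ring)
    return {frozenset((ring[i], ring[(i + 1) % n])) for i in range(n)}


def adjacency(polygons):
    """Map each polygon name to the names of polygons sharing an edge."""
    owners = defaultdict(set)
    for name, ring in polygons.items():
        for edge in edges(ring):
            owners[edge].add(name)
    neighbours = {name: set() for name in polygons}
    for names in owners.values():
        for a, b in combinations(sorted(names), 2):
            neighbours[a].add(b)
            neighbours[b].add(a)
    return neighbours


parcels = {
    "A": [(0, 0), (1, 0), (1, 1), (0, 1)],
    "B": [(1, 0), (2, 0), (2, 1), (1, 1)],  # shares the edge x = 1 with A
    "C": [(3, 0), (4, 0), (4, 1), (3, 1)],  # isolated
}
print(adjacency(parcels))
```

Once stored, the answer to "which parcels border A?" is a lookup rather than a geometric computation.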
Advantages and disadvantages of the vector approach • Advantages: • Lower data volumes • More adaptable to variations in scale/resolution of phenomena • Tends to be more suited to social and economic applications • Disadvantages: • Less adaptable to uncertainty, fuzziness • Often no “lowest common denominator” of areal unit
Objects versus layers • Major point of discussion in GIS since mid-1980s • Alternative strategies for vector representation of geographic space • a “stacked” sequence of layers • a collection of discrete objects • Difference in how contents of the database represent the real world • Echoes wider developments in Computer Science
The Object view • More closely mirrors natural ways of seeing the world • Objects usually used in speaking, writing, thinking about the world • Objects are fundamental to our understanding of geography • Object-oriented approaches may offer data storage and processing advantages
What are these objects? • Graphics objects can be points, lines, areas • Geographic objects can be roads, houses, hills, etc. • A space can be occupied by many, or no, objects • A river is an object (has an identity, name, coordinates, properties, etc.) • A line is an object (also has an identity, name, coordinates, properties, etc.)
Applications of object view: • Utilities and facilities management • Concept of empty space littered with objects fits many needs of managing infrastructure • Two or more objects may occupy same horizontal position, separated vertically • Smaller objects may be part of larger ones (e.g. pipes as part of networks) and vice versa • Idea of a variable measured everywhere on Earth has little relevance
The Layer view • Locations specified by a system of coordinates • Geography of real world conceptualised as a series of variables (soils, land use, elevation, etc.) • Each layer in the database represents a particular variable
The Layer view • Layer view often more compatible with theories of atmospheric, ocean processes • Object view is less compatible with concept of continuous change • Good for resource management applications • Much data for environmental modelling derived from remote sensing • Implies a layer view
Disadvantages • The layer approach usually requires many different files to represent each layer • Some files contain the actual data • Some contain registration information • Some contain topological information to construct complex geometries from more primitive ones
Applications of layer view • Resource management • geographic variation can be described by relatively small number of variables • conceptualisation reasonably constant between scales • movement of individuals can lead to difficulties of representation and tracking across layers
Tesseral geometries • From the Greek, tetara or Latin tessella = a tile • Tessellations are “sets of connected discrete two-dimensional units” • thus mosaics or tilings of space • May be regular or irregular • Focus is on space occupancy • Emphasis is on areas, boundaries are implied
Conceptual basis: creating a tessellation • Define a geographic area of interest • Undertake sampling of the entire area • Each point in space is assigned a value • The data are separated into a set of vertical thematic layers • One item of information stored for each location within a single layer
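The steps above can be sketched as a regular square tessellation with two thematic layers. The `RasterLayer` class, its field names, the 10-unit cell size and the sample values are all hypothetical, and the layout assumes row 0 is the northernmost row.

```python
from dataclasses import dataclass


@dataclass
class RasterLayer:
    """One thematic layer: a single value per cell of a regular grid."""
    name: str
    left_x: float      # x coordinate of the grid's left edge
    top_y: float       # y coordinate of the grid's top edge
    cell_size: float   # side length of each square cell
    values: list       # rows of cell values, row 0 at the top

    def value_at(self, x, y):
        """Return the value of the cell containing the point (x, y)."""
        col = int((x - self.left_x) // self.cell_size)
        row = int((self.top_y - y) // self.cell_size)
        if not (0 <= row < len(self.values) and 0 <= col < len(self.values[0])):
            raise ValueError("point lies outside the area of interest")
        return self.values[row][col]


# Two layers covering the same 30 x 30 area of interest.
elevation = RasterLayer("elevation", 0, 30, 10, [
    [120, 125, 130],
    [110, 115, 121],
    [100, 104, 109],
])
landuse = RasterLayer("landuse", 0, 30, 10, [
    ["forest", "forest", "urban"],
    ["forest", "farm", "urban"],
    ["farm", "farm", "water"],
])

point = (25, 5)
print({layer.name: layer.value_at(*point) for layer in (elevation, landuse)})
```

Every location in the area has exactly one value in each layer, and cell boundaries are never stored. They follow from the origin and cell size.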