Introduction to Geographic Information Systems
Fall 2013 (INF 385T-28620)
Geodatabases
Dr. David Arctur, Research Fellow, Adjunct Faculty
University of Texas at Austin
Lecture 4, September 19, 2013
Outline
• Tables
• Geocodes
• Data table joins
• Spatial joins
• Spatial data formats
• Geodatabases
• Calculating geometry
INF385T(28620) – Fall 2013 – Lecture 4
Lecture 4: Tables
Two kinds of tables in ArcGIS
• Feature attribute table of a map layer
  • Attribute data is part of the map layer
• Data table with geocodes (such as census IDs)
  • Can be added as a table to ArcMap
  • Can be joined to a map layer to add more attributes to the layer
  • Join via the same geocode values in both the data table and the map layer's attribute table
  • Census data example: there are too many census variables to supply them all in the feature attribute table, so download a custom table and join it to the appropriate polygon layer
Data table format
• Rectangular table with one value per cell
• Columns (fields) are attributes
• Rows are observations (records)
Data table format
• First row must have column names that are self-documenting labels
  • E.g., Shape, POP2000
• First character of an attribute name must be a letter
• Remaining characters can be any letter, digit, or the underscore character (but no blanks)
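The naming rule above can be checked before a table is imported. The following is a minimal sketch in plain Python, not an ArcGIS function; the name is_valid_field_name is hypothetical, and the rule is read literally (ASCII letters only), which is an assumption.

```python
import re

# Rule from the slide: the first character is a letter; the remaining
# characters are letters, digits, or underscores (no blanks).
# Restricting "letter" to ASCII A-Z/a-z is an assumption of this sketch.
FIELD_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]*")


def is_valid_field_name(name: str) -> bool:
    """Return True if the whole name satisfies the naming rule."""
    return FIELD_NAME_PATTERN.fullmatch(name) is not None


for candidate in ["POP2000", "Shape", "2000POP", "Pop 2000", "_id"]:
    print(candidate, is_valid_field_name(candidate))
```

Running a check like this over the header row of a .csv file catches names that would otherwise be altered or rejected on import.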
Data table format
• All additional rows of a data table must contain only attribute values (raw data)
• None of the rows can be sums, averages, or other statistics for raw data rows
Primary keys
• Each table has a primary key attribute with two properties
  • Each value is unique
  • There are no null values
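The two primary key properties can be tested directly on a table held as a list of records. This is a sketch under assumptions: the function name is_primary_key, the field name GEOID, and the sample rows are hypothetical, and empty strings are treated as null.

```python
def is_primary_key(rows, key):
    """Return True if `key` is unique and never null across all rows."""
    values = [row.get(key) for row in rows]
    # Property 2: no null values (None or empty string count as null here).
    if any(value is None or value == "" for value in values):
        return False
    # Property 1: each value is unique.
    return len(set(values)) == len(values)


# Hypothetical sample records.
tracts = [
    {"GEOID": "1917", "POP2000": 2100},
    {"GEOID": "1918", "POP2000": 1850},
]
duplicated = tracts + [{"GEOID": "1917", "POP2000": 40}]

print(is_primary_key(tracts, "GEOID"))
print(is_primary_key(duplicated, "GEOID"))
```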
Field calculator
• Adds computed columns in ArcGIS
• ArcGIS does not have the query capacity of relational database packages to compute new columns on the fly
  • So, you must create permanent new columns
• Full range of computation
  • Can add, multiply, etc.
  • Has numeric and text functions
  • Can concatenate text values
Field calculator (numeric)
Field calculator (text)
• Concatenate house number and street fields
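The text concatenation above amounts to building one string per record. The sketch below shows the logic in plain Python outside ArcGIS; the field names HOUSE_NUM, STREET, and ADDRESS and the function full_address are hypothetical.

```python
def full_address(house_number, street):
    """Concatenate a house number and a street name with one space."""
    # str() lets the house number be stored as either a number or text.
    return f"{str(house_number).strip()} {street.strip()}"


# Hypothetical records; a new permanent column is filled from two others.
records = [
    {"HOUSE_NUM": 1690, "STREET": "Seaton St"},
    {"HOUSE_NUM": "12", "STREET": " Main St "},
]
for record in records:
    record["ADDRESS"] = full_address(record["HOUSE_NUM"], record["STREET"])
    print(record["ADDRESS"])
```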
External table file formats for import into ArcGIS
• Plain ASCII text with comma-separated values (.csv)
  • Very transportable format, very large files
  • Each table record is a row terminated with a line-break character (invisible, nonprinting value)
  • Values are separated by a delimiter, usually a comma
  • For data values that contain the delimiter, enclose the value in double quotes
  • Sometimes columns get the wrong data type on import (use double quotes to force the text data type for digits, say for house numbers)
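The quoting rule for values that contain the delimiter can be illustrated with Python's standard csv module. This is a sketch with hypothetical sample data; quoting every value with csv.QUOTE_ALL is one possible choice, and how a given importer interprets quoted digits is not shown here.

```python
import csv
import io

# Hypothetical table: the CITY value contains the delimiter (a comma).
records = [
    ["HOUSE_NUM", "STREET", "CITY"],
    ["1690", "Seaton St", "Pittsburgh, PA"],
]

# Write every value enclosed in double quotes.
buffer = io.StringIO()
csv.writer(buffer, quoting=csv.QUOTE_ALL).writerows(records)
text = buffer.getvalue()
print(text)

# Reading it back keeps "Pittsburgh, PA" as a single value.
rows = list(csv.reader(io.StringIO(text)))
print(rows[1])
```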
External table file formats for import into ArcGIS
• Excel (.xls, .xlsx)
  • Excel 2003 (.xls): up to 65,536 rows and 256 columns
  • Excel 2007 (.xlsx): up to 1,048,576 rows and 16,384 columns
• dBase database table (.dbf)
  • Legacy format
  • ArcMap truncates field names to the first 10 characters
  • dBase IV has a maximum of 255 columns
  • Can open a dBase file in Excel but cannot save dBase from Excel
• Microsoft Access database (.mdb)
  • Up to 2 GB file size
  • See the following for other limits: http://www.databasedev.co.uk/access_specifications.html
Lecture 4: Geocodes
Geocodes (2000)
• Federal Information Processing Standards (FIPS)
  • Developed by the National Institute of Standards and Technology
  • Codes for place-names throughout the world
    • Countries
    • States/provinces
    • Counties
    • Metropolitan statistical areas (MSAs)
    • Cities
    • Places: Indian reservations, airports, and post offices in the US
See http://www.genesys-sampling.com/pages/Template2/site2/61/default.aspx for additional geocodes.
Geocodes: hierarchical
FIPS codes (political boundaries)
• Country: US
• State: 42 (Pennsylvania)
• County: 003 (Allegheny)
• Minor civil division: 4200361000 (Pittsburgh)
Census codes (statistical boundaries)
• Tract: 1917
• Block group: 003
• Block: 005 (US420031917003005)
Local government cadastral data (legal boundaries)
• Parcel block & lot number: 0096-P-00210000000 (1690 Seaton St, Pittsburgh, PA 15226)
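The hierarchy above means the full block code is the concatenation of its parent codes, so it can be split back into levels by position. The sketch below uses the field widths read off this one example (US420031917003005); those widths are an assumption of the sketch, and official census identifiers may use different widths. The names FIELDS and parse_block_code are hypothetical.

```python
# Field widths taken from the example on the slide (an assumption):
# country 2, state 2, county 3, tract 4, block group 3, block 3.
FIELDS = [
    ("country", 2),
    ("state", 2),
    ("county", 3),
    ("tract", 4),
    ("block_group", 3),
    ("block", 3),
]


def parse_block_code(code):
    """Split a concatenated block code into its hierarchical parts."""
    expected = sum(width for _, width in FIELDS)
    if len(code) != expected:
        raise ValueError(f"expected {expected} characters, got {len(code)}")
    parts = {}
    position = 0
    for name, width in FIELDS:
        # Codes stay as text so that leading zeros are preserved.
        parts[name] = code[position:position + width]
        position += width
    return parts


parts = parse_block_code("US420031917003005")
print(parts)
```

Keeping each level as text rather than a number matters for joins: county 003 and the number 3 would not match as keys.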
World and US
US and state 42
State 42 and county 003
County 003 and municipality 61000
Municipality 61000 and tract 1917
Tract 1917 and block group 003
Block group 003 and block 005
Geocodes (2010)
• ANSI codes
  • American National Standards Institute codes
  • Replace the Federal Information Processing Standards (FIPS)
• The entities covered include:
  • States and statistically equivalent entities
  • Counties and statistically equivalent entities
  • Named populated and related location entities (such as places and county subdivisions)
  • American Indian and Alaska Native areas
See http://www.census.gov/geo/www/ansi/ansi.html
Lecture 4: Data table joins
Review: Table joins
• A join puts two tables together, on the fly, to make one table
• One-to-one join (e.g., join state attribute data to a state shapefile by StateName)
• One-to-many join (e.g., join a code table to a feature attribute table to add the code description; many records can use the same code value)
• Each table in a join must have a key attribute for matching
• The key must have the same values and data type in both tables
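The one-to-many case above can be sketched in plain Python as a lookup on the shared key. The function join_tables, the field names PARCEL, LU_CODE, and LU_DESC, and the sample rows are hypothetical; keeping unmatched feature rows unchanged is a design choice of this sketch.

```python
def join_tables(feature_rows, code_rows, key):
    """Attach the matching code-table row to each feature row."""
    # The code table has one row per key value.
    lookup = {row[key]: row for row in code_rows}
    joined = []
    for row in feature_rows:
        match = lookup.get(row[key], {})  # unmatched rows get no new fields
        joined.append({**match, **row})
    return joined


# Hypothetical feature attribute table: many parcels share a code.
parcels = [
    {"PARCEL": "A1", "LU_CODE": "R"},
    {"PARCEL": "A2", "LU_CODE": "C"},
    {"PARCEL": "A3", "LU_CODE": "R"},
    {"PARCEL": "A4", "LU_CODE": "X"},
]
# Hypothetical code table: one description per code.
codes = [
    {"LU_CODE": "R", "LU_DESC": "Residential"},
    {"LU_CODE": "C", "LU_DESC": "Commercial"},
]

joined = join_tables(parcels, codes, "LU_CODE")
for row in joined:
    print(row["PARCEL"], row["LU_CODE"], row.get("LU_DESC"))
```

The lookup only works when the key values are identical in both tables, including their data type, which is why the last bullet above matters.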
Example join [figure: first table + second table = joined table]
Problems with joins
• Field types are different (e.g., one is numeric and one is text)
• Text values left-align while numeric values right-align
Solution
• Create a new field of the same type and use Field Calculator
Solution
• Both tables now have the same field type
Problems with joins
• Data format varies
• Must remove dashes
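Both join problems above come down to putting the key into one common text form before matching. The following is a sketch in plain Python; the function normalize_key and its optional width parameter for zero-padding are assumptions introduced for illustration.

```python
def normalize_key(value, width=None):
    """Return a join key as text, without dashes, optionally zero-padded."""
    # A whole-number float such as 42003.0 should become "42003".
    if isinstance(value, float) and value.is_integer():
        value = int(value)
    text = str(value).replace("-", "").strip()
    if width is not None:
        text = text.zfill(width)  # restore leading zeros lost in numeric fields
    return text


print(normalize_key(42003))                 # numeric key to text
print(normalize_key("0096-P-00210000000"))  # dashes removed
print(normalize_key(3, width=3))            # zero-padded to three characters
```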
Lecture 4: Spatial joins
Spatial joins
• Joins using shape (not an attribute field)
• Enables data aggregation (counting or summing points by polygon)
• Common spatial joins
  • Points to polygons (counts)
  • Polygons to points (adds text)
  • Points to points (distances)
Points to polygons
• How many businesses are in each neighborhood?
• Start with:
  • Business points
  • Neighborhood polygons
Points to polygons
• Right-click neighborhoods > Joins and Relates > Join
Spatial join result
• New polygon layer with count of points (number of architects and engineers)
Spatial join result
• Show as a choropleth map with labels, or as a table
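The counting behind a points-to-polygons join can be sketched without GIS software. The example below uses a standard ray-casting point-in-polygon test, which is a swapped-in technique and not necessarily what ArcGIS does internally; the neighborhood names, coordinates, and function names are hypothetical, and points exactly on a boundary are not handled specially.

```python
def point_in_polygon(x, y, ring):
    """Ray casting: count edge crossings of a ray going right from (x, y)."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can cross.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def count_points_by_polygon(points, polygons):
    """Return {polygon name: number of points falling inside it}."""
    counts = {name: 0 for name in polygons}
    for x, y in points:
        for name, ring in polygons.items():
            if point_in_polygon(x, y, ring):
                counts[name] += 1
                break  # assume polygons do not overlap
    return counts


# Hypothetical neighborhoods (square rings) and business points.
neighborhoods = {
    "North": [(0, 0), (4, 0), (4, 4), (0, 4)],
    "South": [(0, -4), (4, -4), (4, 0), (0, 0)],
}
businesses = [(1, 1), (3, 2), (2, -1), (10, 10)]

counts = count_points_by_polygon(businesses, neighborhoods)
print(counts)
```

The resulting count per polygon is the value that would be symbolized in a choropleth map.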
Polygons to points
• What neighborhood is a business in?
• Start with:
  • Business points
  • Neighborhood polygons
Polygons to points
• Right-click business points > Joins and Relates > Join
Spatial join result
• Point shapefile with neighborhood data on each business
Points to points
• How close is the nearest bus stop to a business?
• Start with:
  • Business points
  • Bus stop points
Points to points
• Right-click business points > Joins and Relates > Join
Result
• Distance field added to a new layer of businesses and stops joined
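The distance field produced by a points-to-points join can be sketched as a nearest-neighbor search. This sketch assumes planar (projected) coordinates so that straight-line distance is meaningful; the names nearest_stop, the stop labels, and the coordinates are hypothetical.

```python
import math


def nearest_stop(business_xy, stops):
    """Return (stop name, distance) for the stop closest to one business."""
    bx, by = business_xy
    name = min(
        stops,
        key=lambda s: math.hypot(stops[s][0] - bx, stops[s][1] - by),
    )
    sx, sy = stops[name]
    return name, math.hypot(sx - bx, sy - by)


# Hypothetical coordinates in the same projected units (e.g., feet).
bus_stops = {"Stop A": (3.0, 4.0), "Stop B": (10.0, 0.0)}
businesses = {"Cafe": (0.0, 0.0), "Garage": (9.0, 0.0)}

for business, xy in businesses.items():
    stop, distance = nearest_stop(xy, bus_stops)
    print(business, stop, distance)
```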
Lecture 4: Spatial data formats
Esri legacy format: Coverage
• Folder with multiple files
• Can have points, lines, and/or polygons
• Has several intermediate data products (topology) to speed up processing (now calculated on the fly)
Esri legacy format: Shapefile
• Multiple files, all with the same name but different file extensions
• No intermediate data products, but has indices to speed up data processing
• Widely used to share spatial data files
Shapefiles
• ArcView native format
• Minimum files
  • .shp: stores feature geometry
  • .shx: stores index of features
  • .dbf: stores attribute data
• Additional files
  • .prj: projection data
  • .xml: metadata
  • .sbn and .sbx: store additional indices
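Because a shapefile is several files sharing one name, a common failure when sharing data is leaving one of the minimum files behind. The sketch below checks for the three minimum files listed above using only the standard library; the function missing_components and the file name roads.shp are hypothetical.

```python
from pathlib import Path

# Minimum files from the list above.
REQUIRED_EXTENSIONS = (".shp", ".shx", ".dbf")


def missing_components(shp_path):
    """Return the required extensions that have no file next to shp_path."""
    path = Path(shp_path)
    return [
        ext
        for ext in REQUIRED_EXTENSIONS
        # with_suffix swaps only the final extension, keeping the base name.
        if not path.with_suffix(ext).exists()
    ]


# Hypothetical file name; the result depends on what is on disk.
print(missing_components("roads.shp"))
```

An empty list means all three minimum files are present; optional files such as .prj are not checked in this sketch.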
CAD drawings
• CAD software
  • Autodesk AutoCAD (.dwg, .dxf)
  • Bentley MicroStation (.dgn)
• Often used by engineering companies
• Better digitizing precision
CAD drawings
Lecture 4: Geodatabases
Geodatabases
• A geodatabase is a container used to hold a collection of datasets (GIS features, tables, raster images, and other objects)
• Example: World.gdb, holding a Country layer and a Graticule layer
Enterprise geodatabases
• Practically unlimited size and multiple simultaneous users
• Use enterprise data management systems
• Store spatial datasets in a number of DBMSs: IBM DB2, Microsoft SQL Server, Oracle, or Postgres
Personal geodatabase
• Parallels the enterprise geodatabase but on a PC
• Stores datasets in a Microsoft Access .mdb file
• Limited to 2 GB
• Much overhead in space and extra structure
• Tempting to apply one's own Access skills, but needs the ArcGIS Catalog utility for manipulation