U.S. Census Data & TIGER/Line Files • Census Bureau: • Charged with the Constitutional responsibility of carrying out the decennial census • Census of Population and Housing • Very large mapping component involved in undertaking a national census! • Census demographic/socioeconomic data: • Demographic, economic, & social data about persons & households • Aggregated by census enumeration units: e.g. block, block group, tract, county, metropolitan area, etc. • TIGER/Line files: • The “geography” of the census • Topologically Integrated Geographic Encoding & Referencing • e.g., polygons for enumeration units, streets & landmarks
TIGER/Line files - background • 1967 - New Haven Census Use Study • test digital data structures for storing census data by geographic areas • test processes for creating computerized Census maps • had topology!! • 1970s - Census DIME files • expansion of New Haven study into production version • data coverage: U.S. urban areas • important component of 1980 decennial Census • 1980s - development of TIGER/Line files • incorporated DIME files for urban areas (DIME updated in 1981 & 1985) • incorporated nationwide 1:100,000 USGS DLG data • additional information from local officials & Census fieldwork • 1990s – TIGER in use • used for 1990 Census • TIGER updated nearly yearly after 1990 from variety of sources • 1998-1999: major update & prep for 2000 Census • 2000 - latest Census • 2nd use of TIGER for Census • data being released now
TIGER/Line Files • Nominal scale: 1:100,000 • Data "layers": • Enumeration units • blocks, block groups, tracts/block numbering areas, counties, cities/MA, etc. • multiple hierarchies • Voting districts • used for Congressional redistricting • Supporting geography • roads/streets/highways • basic hydrography • point & area landmarks • etc... • TIGER designed to: • support pre-census functions in preparation for Census of Population and Housing • support census-taking efforts • evaluate success of the Census • provide geographic framework for analysis
TIGER Area (polygon) & Landmark Data • Point and poly landmarks • Census geography (tracts, blocks, etc.) used for reporting Census data • ID linkage from polygons in TIGER/Line data to Census attribute data
TIGER Line and Address Data • Linear features... • Form polygon boundaries • Roads • attributes include basic road type, address ranges • also hydro features, etc.
Link to Census Data • Census attribute data • Summary Tape File (STF) data files • Link to Census geographic entities in TIGER/Line files using unique Census geography IDs • Lets us merge a tremendously rich source of detailed socioeconomic data (Census) with a comprehensive geography for the entire country • Figure: Orange County, NC block groups w/ median income data (darker green = higher income)
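The ID linkage described on this slide can be sketched as a simple keyed lookup. This is a minimal illustration only: the field names (`geoid`, `median_income`), the ID strings, the income values, and the placeholder geometries are all invented for the example and are not taken from any actual STF or TIGER/Line file.

```python
# Hypothetical attribute records, as might be read from a Census summary file.
# The IDs and income values are invented for illustration.
income_by_geoid = {
    "371350101001": 41250,
    "371350101002": 58300,
}

# Hypothetical polygon records from a TIGER/Line-derived layer.
# The geometry strings are placeholders, not real coordinates.
block_groups = [
    {"geoid": "371350101001", "geometry": "POLYGON(...)"},
    {"geoid": "371350101002", "geometry": "POLYGON(...)"},
    {"geoid": "371350102001", "geometry": "POLYGON(...)"},
]


def attach_income(polygons, attributes):
    """Return copies of the polygon records with a median_income field added.

    Polygons with no matching attribute record get None.
    """
    return [
        {**poly, "median_income": attributes.get(poly["geoid"])}
        for poly in polygons
    ]


joined = attach_income(block_groups, income_by_geoid)
for rec in joined:
    print(rec["geoid"], rec["median_income"])
```

Once the attribute is attached to each polygon, a choropleth such as the median income map is a matter of classing `median_income` into color ranges.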
Census Geographic Hierarchy • hierarchical tabulation systems, e.g.: USA > Region > Division > State > County > Tract > Block Group > Block • 2000 Census tallies for entire US: 65,443 tracts; 208,790 block groups; 8,205,582 blocks • for NC: 1,563 tracts; 5,271 block groups; 232,403 blocks
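The nesting of county, tract, block group, and block is commonly reflected in the identifiers themselves. The sketch below assumes the widely used 15-digit block identifier layout (2-digit state, 3-digit county, 6-digit tract, 4-digit block, with the block group given by the first digit of the block number). The function name and the sample ID are hypothetical, and the layout should be checked against the documentation for the specific data release in use.

```python
def split_block_id(block_id):
    """Derive the parent-area IDs from a 15-digit census block ID.

    Assumed layout: SS CCC TTTTTT BBBB (state, county, tract, block);
    the block group is the first digit of the 4-digit block number.
    """
    if len(block_id) != 15 or not block_id.isdigit():
        raise ValueError("expected a 15-digit block identifier")
    state = block_id[:2]
    county = block_id[2:5]
    tract = block_id[5:11]
    block = block_id[11:15]
    return {
        "state": state,
        "county": state + county,
        "tract": state + county + tract,
        "block_group": state + county + tract + block[0],
        "block": block_id,
    }


# Invented ID for illustration; not a real block.
parts = split_block_id("371350101001000")
for level, code in parts.items():
    print(level, code)
```

Because each level's ID is a prefix of the level below it, block-level counts can be rolled up to block groups, tracts, or counties by grouping on the appropriate prefix.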
TIGER Address Data • address ranges: street address numbers at the beginning and end of each arc/line in the database • allows address geocoding: matching data that carry an address to a spatial location using an interpolated estimate • data use implication: • explosion of analysis and data integration capabilities! • extremely large (and growing) amount of data tied to addresses • problem: incomplete address range data, esp. in rural areas. Why? • some areas simply have incomplete data (very large data collection task) • PO rural routes (though this is changing due to E-911 systems) • Census Bureau steadily improving rural address data • private street/address data providers enhance address range data
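The interpolated estimate used in address geocoding can be sketched as linear interpolation along a street segment. This is a simplified illustration under stated assumptions: the segment is a single straight line, one address range covers the whole segment, and coordinates are planar. Real address-range data typically carry separate left- and right-side ranges and multi-vertex lines, which this sketch ignores. The function and parameter names are hypothetical.

```python
def geocode_on_segment(house_number, from_addr, to_addr, start_xy, end_xy):
    """Estimate the location of a house number along a straight segment.

    from_addr is the address at start_xy and to_addr the address at end_xy.
    The range may run in either direction (ascending or descending).
    """
    low, high = sorted((from_addr, to_addr))
    if not low <= house_number <= high:
        raise ValueError("house number is outside the segment's address range")
    if from_addr == to_addr:
        fraction = 0.5  # single-address range: place the point mid-segment
    else:
        fraction = (house_number - from_addr) / (to_addr - from_addr)
    x = start_xy[0] + fraction * (end_xy[0] - start_xy[0])
    y = start_xy[1] + fraction * (end_xy[1] - start_xy[1])
    return (x, y)


# 150 is halfway through the range 100-200, so it lands mid-segment.
print(geocode_on_segment(150, 100, 200, (0.0, 0.0), (10.0, 0.0)))
```

The result is only an estimate: houses are rarely spaced evenly along a block, which is one reason incomplete or coarse address ranges degrade geocoding accuracy.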
Relational DBMS • Data stored as tuples (tup-el), conceptualized as tables • Table – data about a class of objects • Two-dimensional list (array) • Rows = objects • Columns = object states (properties, attributes)
Table = object class • Row = object • Column = property • Object classes with geometry are called feature classes
Relation Rules • Only one value in each cell (intersection of row and column) • All values in a column are about the same subject • Each row is unique • No significance in column sequence • No significance in row sequence
Relational Join • Fundamental query operation • Needed because of: • normalization • data created/maintained by different users, but integration needed for queries • Table joins use common keys (column values) • Table (attribute) join concept has been extended to the geographic case
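A join on a common key can be illustrated with Python's built-in `sqlite3` module. The table names, column names, and values below are invented for the example; the point is only that two tables maintained separately are combined at query time through a shared key column.

```python
import sqlite3

# In-memory database with two hypothetical tables sharing the key tract_id.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE tract (tract_id TEXT PRIMARY KEY, county_name TEXT);
    CREATE TABLE tract_stats (tract_id TEXT, population INTEGER);
    INSERT INTO tract VALUES ('T1', 'Orange'), ('T2', 'Orange'), ('T3', 'Durham');
    INSERT INTO tract_stats VALUES ('T1', 4200), ('T2', 3100), ('T3', 5600);
    """
)

# The join matches rows whose tract_id values are equal in both tables.
rows = conn.execute(
    """
    SELECT t.tract_id, t.county_name, s.population
    FROM tract AS t
    JOIN tract_stats AS s ON s.tract_id = t.tract_id
    ORDER BY t.tract_id
    """
).fetchall()
conn.close()

for row in rows:
    print(row)
```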
Normalization • Process of converting tables to conform to relational rules • Split tables into new tables that can be joined at query time • The relational join • Several levels of normalization • Forms: 1NF, 2NF, 3NF, etc. • Normalization creates many expensive joins • De-normalization is OK for performance optimization
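The split-then-join idea can be sketched in plain Python. The records and field names are hypothetical. The flat table repeats the county name on every tract row; normalizing stores each county name once, and a join at query time rebuilds the original rows.

```python
# Hypothetical flat table: county_name is repeated for every tract.
flat = [
    {"tract_id": "T1", "county_id": "C1", "county_name": "Orange", "population": 4200},
    {"tract_id": "T2", "county_id": "C1", "county_name": "Orange", "population": 3100},
    {"tract_id": "T3", "county_id": "C2", "county_name": "Durham", "population": 5600},
]


def normalize(rows):
    """Split the flat rows into a county table and a tract table."""
    counties = {}
    tracts = []
    for row in rows:
        # Each county is stored once, keyed by county_id.
        counties[row["county_id"]] = {"county_name": row["county_name"]}
        tracts.append(
            {
                "tract_id": row["tract_id"],
                "county_id": row["county_id"],
                "population": row["population"],
            }
        )
    return counties, tracts


def denormalize(counties, tracts):
    """Join the two tables back together on county_id."""
    return [{**tract, **counties[tract["county_id"]]} for tract in tracts]


counties, tracts = normalize(flat)
print(len(counties), "county rows;", len(tracts), "tract rows")
print(denormalize(counties, tracts) == flat)
```

The trade-off on the slide shows up directly: the normalized form avoids repeating county names, but every query that needs them pays for the lookup in `denormalize`.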
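Spatial Relations • Equals – geometries are the same • Disjoint – geometries share no common point • Intersects – geometries share at least one point (the opposite of disjoint) • Touches – geometries share boundary points, but their interiors do not intersect • Crosses – geometries share some, but not all, interior points (e.g., a line crossing a polygon) • Within – geometry lies completely within another • Contains – geometry completely contains another (the inverse of within) • Overlaps – geometries of the same dimension partly overlap • Relate – tests the intersections between the interiors, boundaries, and exteriors of two geometries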
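Several of these predicates can be illustrated for the simplest possible case. The sketch below assumes every geometry is a non-degenerate axis-aligned rectangle, which is far more restrictive than what a GIS supports, and the `Box` type and function names are hypothetical. Crosses and Relate are omitted because they need lines and a full interior/boundary/exterior test.

```python
from collections import namedtuple

# Axis-aligned rectangle; assumes xmin < xmax and ymin < ymax.
Box = namedtuple("Box", "xmin ymin xmax ymax")


def disjoint(a, b):
    """True if the boxes share no point at all."""
    return a.xmax < b.xmin or b.xmax < a.xmin or a.ymax < b.ymin or b.ymax < a.ymin


def intersects(a, b):
    """True if the boxes share at least one point."""
    return not disjoint(a, b)


def interiors_intersect(a, b):
    """True if the open interiors of the boxes share a point."""
    return a.xmin < b.xmax and b.xmin < a.xmax and a.ymin < b.ymax and b.ymin < a.ymax


def touches(a, b):
    """True if the boxes meet only along their boundaries."""
    return intersects(a, b) and not interiors_intersect(a, b)


def within(a, b):
    """True if box a lies completely inside box b."""
    return a.xmin >= b.xmin and a.xmax <= b.xmax and a.ymin >= b.ymin and a.ymax <= b.ymax


def contains(a, b):
    """True if box a completely contains box b."""
    return within(b, a)


def overlaps(a, b):
    """True if the interiors intersect but neither box is within the other."""
    return interiors_intersect(a, b) and not within(a, b) and not within(b, a)


base = Box(0, 0, 2, 2)
print(touches(base, Box(2, 0, 4, 2)))      # shares only an edge
print(overlaps(base, Box(1, 1, 3, 3)))     # partial overlap
print(within(Box(0.5, 0.5, 1, 1), base))   # small box inside base
print(disjoint(base, Box(5, 5, 6, 6)))     # far apart
```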