Advanced Spatial Analysis: Spatial Regression Modeling
Paul R. Voss and Katherine J. Curtis
Day 4
GISPopSci
Review of yesterday
• Spatial processes
  • spatial heterogeneity
  • spatial dependence
• Spatial regression models
• Various specifications for spatial dependence
  • spatial lag model
  • spatial error model
  • higher-order models
• Afternoon lab
  • spatial regression modeling in GeoDa & R
Questions?
Plan for today
• Understanding spatial heterogeneity in relationships
• Local multivariate methods for spatial data analysis
• Introduction to Geographically Weighted Regression (GWR)
  • theory & concept
  • uses of GWR
  • cautions regarding GWR
• Discrete spatial heterogeneity in relationships
• Lab: GWR in R; spatial regime analysis in R
Review: Spatial Dependence & Spatial Heterogeneity
Spatial dependence… the existence of a functional relationship between what happens at one point in space & what happens elsewhere
Spatial heterogeneity… exists when the mean, and/or variance, and/or covariance structure “drifts” over a mapped process
Spatial heterogeneity…
• Typified by regional differentiation
• Reflects “spatial continuities” of social processes which, “taken together help bind social space into recognizable structures”
• a “mosaic of homogeneous (or nearly homogeneous)” areas in which each is different from its neighbors (Haining 1990:22)
Suppose we observe the following map…
[Map: % Child Poverty, US South, Census 2000]
The question for us is…
[Map: % Child Poverty, US South, Census 2000]
Is this observed spatial distribution of child poverty generated by spatial heterogeneity or spatial dependence (or nuisance)?
It’s not always easy to know…
Iterate as needed…
1. EDA & ESDA on variables: global & local patterns of spatial autocorrelation under different neighborhood specifications
2. OLS baseline model & accompanying diagnostics
3. Correct for spatial heterogeneity if indicated
4. (With possible controls for spatial heterogeneity) estimate & contrast spatial error & spatial lag model results
The question for us is…
[Map: % Child Poverty, US South, Census 2000]
Is the process generating poverty in the Mississippi Delta the same as the process generating poverty in Appalachia, or are there different processes?
In other words, is there spatial heterogeneity in the relationships?
Spatial Heterogeneity in Relationships
“The term spatial heterogeneity refers to variation in relationships over space.”
James P. LeSage, Spatial Econometrics, December 1998, p. 6 (book manuscript online at http://www.spatial-econometrics.com/html/wbook.pdf)
Constancy Assumption…
• Slope of a regression line (or average association among all units) applies to separate units that comprise the whole (Freedman et al. 1991:678)
• Unemployment has same association with child poverty in all counties
Spatial Heterogeneity…
• Regionally-specific circumstances influence structural relationships (O’Loughlin et al. 1994)
• Unemployment has different associations with child poverty in different counties
Aspatial Context…
Individual wage returns (y) to achieved education (x) by gender (in some hypothetical advanced society)
A0 = male; A1 = female
Spatial Context…
Median wage returns (y) to HS+ education (x) by region (hypothetical values)
A0 = South; A1 = non-South
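The plotted lines for the two groups are not preserved in this transcript. One common way to write two group-specific regression lines is a dummy-variable interaction model; the sketch below assumes that form. The symbols A_i, the betas, and the error term are illustrative notation, not taken from the slides.

```latex
% Assumed specification: A_i = 0 for group A0 (male / South),
% A_i = 1 for group A1 (female / non-South).
\begin{align*}
y_i &= \beta_0 + \beta_1 x_i + \beta_2 A_i + \beta_3 (A_i \, x_i) + \varepsilon_i \\
\text{Group } A0\ (A_i = 0):\quad y_i &= \beta_0 + \beta_1 x_i + \varepsilon_i \\
\text{Group } A1\ (A_i = 1):\quad y_i &= (\beta_0 + \beta_2) + (\beta_1 + \beta_3)\, x_i + \varepsilon_i
\end{align*}
```

Under this form, heterogeneity in the relationship corresponds to a nonzero interaction coefficient: the return to education differs between the two groups (or regions) by that amount.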
Spatial Context…
• Differentiation in the magnitude & nature of relationships across the spatial region
• Geographic space represents a physically bounded area that holds social characteristics
• which intersect to create divergent social, economic, & political outcomes
• across sub-areas within the larger spatial region
A Brief Digression: “Neighborhood Effects” or “Contextual Effects”
Contextual effects…
“[T]he essential feature of all contextual-effects models is an allowance for macro processes that are presumed to have an impact on the individual actor over and above the effects of any individual-level variables that may be operating.”
Hubert M. Blalock, Annual Review of Sociology (1984:354)
“Putting people into place means explaining behavior and outcomes in relation to a potentially changing local context.”
Barbara Entwisle, Demography (2007:687)
Contextual layers…
• Broad Social, Economic, Cultural, Health, & Environmental Conditions & Policies at the Global, National, & Local Levels
• Living & Working Conditions
• Social, Family, & Community Networks
• Individual Behavior
• Individual Characteristics
Adapted from Luke (2004:5)
Conceptual motivations…
• Ecological & atomistic fallacies
  • Misattribution of relationships discovered at one level to relationships at another level
• Collectives and their members:
  • Both have properties that can be dis/aggregated across levels
  • But the relationships between the properties may differ between the levels
Statistical motivations…
• Non-independence in error structure
  • Correlated errors
  • Inaccurate standard errors
• Coefficients apply equally to all contexts
  • Relationship assumed stable across contexts (constancy assumption!)
[Diagram, two panels. “Variation in outcomes”: community inequality (Communities i and j) and individual health. “Variation in the effects”: community inequality (Community j), an individual characteristic (Individual i), and individual health (Individual i)]
“Place” versus “Space”…
[Diagram: individuals within Place i (labor market structure, political climate, population composition, population health), shown alongside Places j through q]
“Multilevel models do not incorporate any notion of space and, as such, may be described as nonspatial: they consider the neighborhood affiliation of individuals but neglect spatial connections between neighborhoods.”
Basile Chaix et al., American Journal of Epidemiology (2005:177)
Introducing “space” into “place” framework…
“[The multilevel approach] fragments space into administrative neighborhoods and ignores spatial associations between them.”
Basile Chaix et al., American Journal of Epidemiology (2005:171)
“A more dynamic conceptualization is needed that…integrates multiple dimensions of local social and spatial context…”
Barbara Entwisle, Demography (2007:687)
Multiple membership model…
[Diagram: levels Household, Individual, Occasion]
From Goldstein et al. (2000)
Extended multiple membership model…
[Diagram: Households j and k; Individual i; Occasions t and t+1]
From Goldstein et al. (2000)
Spatially lagged predictors…
[Diagram: individuals within Place i (labor market structure, political climate, population composition, population health), shown alongside Places j through q]
Standard multilevel model…
X_1ij: individual characteristic
Z_1j: contextual characteristic
Multilevel model with spatially lagged predictors…
X_1ij: individual characteristic
Z_1j: contextual characteristic
WZ_1j: spatially-weighted contextual characteristic
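The model equations on these two slides are not preserved in the transcript. As a rough guide, a two-level random-intercept form using the slides' variable names might look like the sketch below. The random-intercept structure, the coefficient symbols, and the error terms are assumptions for illustration; only the individual, contextual, and spatially weighted terms come from the slides.

```latex
% Assumed random-intercept form; i indexes individuals, j indexes places.
\begin{align*}
\text{Standard multilevel model:}\quad
y_{ij} &= \beta_0 + \beta_1 X_{1ij} + \beta_2 Z_{1j} + u_j + e_{ij} \\
\text{With a spatially lagged predictor:}\quad
y_{ij} &= \beta_0 + \beta_1 X_{1ij} + \beta_2 Z_{1j} + \beta_3 (WZ_1)_j + u_j + e_{ij} \\
\text{where}\quad (WZ_1)_j &= \sum_{k} w_{jk}\, Z_{1k}
\end{align*}
```

Here the weight w_jk links place j to place k (for example, a row-standardized neighbor weight), so the lagged term summarizes the contextual characteristic in the places surrounding j. This is how “space” enters a framework otherwise defined only by “place.”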
The point being…
• Multilevel modeling framework consistent with concept of heterogeneity in relationships
  • relationships & outcomes might be conditioned by place
• But not necessarily “spatial” heterogeneity
  • though framework can be modified to explicitly incorporate “space,” not just “place”
Geographically Weighted Regression (continuous spatial heterogeneity)
This week we’ve looked at the results of a simple OLS multivariate regression model
Dependent variable: sqrt(PPOV)
Independent variables: sqrt(UNEM), sqrt(PFHH), log(HSPLUS)
What about all those diagnostics at the end of the GeoDa regression output? They revealed a specification with lots of problems.
Recall, the lower half of the GeoDa output from the OLS regression run looked like this
Let’s remind ourselves…
What’s the take-home message?
• Perhaps continue explorations in R (richer diagnostic environment)
• But, unless we’re very fortunate, the OLS model diagnostics almost always leave us with a bad taste. Why?
• BECAUSE WE WANT TO MOVE ON! We want to do something about the spatial autocorrelation in the residuals
• BUT neither the residual Moran statistic nor the Lagrange multiplier statistics are trustworthy in the presence of non-normality & heteroskedasticity
• Furthermore, econometric simulations have shown that residual spatial autocorrelation itself induces heteroskedasticity
• We’ve got problems!!
• This is where we begin to look for ways of reducing the unresolved heterogeneity that appears to be plaguing our OLS model
What to do?
• Our model was very simple. Surely we could add important covariates that might improve the statistical qualities of our residuals
  • additional covariates?
  • corrections for large 1st-order trends in the data?
• Alternative functional forms?
• Other specifications?
  • interactions?
  • spatial regime approaches?
• Here’s where we may get some additional guidance from GWR
GWR
Background
• Social processes are non-stationary
• We have come to accept the reality that phenomena vary depending on where they are measured
  • Certainly for single variables
  • But multivariate relationships?
• It is not at all uncommon to see published research studies that specify and estimate multivariate models (based, say, on census data) that report only “global” regression results
  • e.g., the relationship of median HH income to home ownership rates for counties across the U.S.
Local methods for spatial data analysis: Lots of them!
• Local univariate spatial data analysis
  • point pattern clustering; scan statistics
  • local graphical analysis; dynamically linked windows
  • local filtering
  • local measures of spatial dependence
• Local multivariate spatial data analysis
  • spatial expansion models
  • multilevel modeling
  • random coefficient models; spatial regime models
  • geographically weighted regression
What is GWR?
• GWR 3.x
  • Software developed by: Stewart Fotheringham, Martin Charlton, Chris Brunsdon, University of Newcastle upon Tyne (at the time)
• Method of exploratory spatial data analysis
• Software (GWR 3.x & spgwr in R)
• Book
• Website
• Very much identified with three people… A. Stewart Fotheringham, Martin Charlton, Chris Brunsdon
Specifically, GWR is a tool for exploring and identifying variation in statistical relationships over space.
It’s a way of exploring “spatial heterogeneity” (“spatial non-stationarity”); i.e., where the same stimulus provokes a different response in different parts of the study region.
Linear regression model: [equation on slide]
OLS estimator: [equation on slide]
Assumptions:
• Gauss-Markov assumptions
• Parameters constant over space
• If there’s spatial heterogeneity, we only see it through the residuals
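The slide's equations are not preserved in this transcript. For reference, the textbook matrix form of the linear model and its OLS estimator is sketched below; the notation is the conventional one and is assumed rather than copied from the slide.

```latex
% Conventional notation (assumed): y is n x 1, X is n x p, beta is p x 1.
\begin{align*}
y &= X\beta + \varepsilon \\
\hat{\beta} &= (X^{\top} X)^{-1} X^{\top} y \\
\mathrm{E}[\varepsilon] &= 0, \qquad \mathrm{Var}(\varepsilon) = \sigma^{2} I
\end{align*}
```

A single coefficient vector applies to every observation regardless of location. This is the "parameters constant over space" assumption that GWR relaxes.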
So… what to do?
• We’ve looked at several approaches this week:
  • Map residuals; look for spatial patterning
  • Compute an autocorrelation statistic for the residuals
  • “Model” the error dependency using spatial regression model
• But… why not address the issue of spatial nonstationarity directly, and allow the relationships to vary over space?
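As a reminder of what “compute an autocorrelation statistic for the residuals” involves, here is a minimal sketch of global Moran's I in plain Python. The residual values and the five-area chain of neighbors are hypothetical, made up for illustration. The sketch computes only the statistic; proper inference for regression residuals, as reported by GeoDa or R, needs more than this.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]                 # deviations from the mean
    s0 = sum(sum(row) for row in weights)            # sum of all weights
    cross = sum(weights[i][j] * dev[i] * dev[j]      # weighted cross-products
                for i in range(n) for j in range(n))
    return (n / s0) * (cross / sum(d * d for d in dev))

# Hypothetical setup: five areas in a row, each neighboring the adjacent areas.
W = [
    [0, 1, 0, 0, 0],
    [1, 0, 1, 0, 0],
    [0, 1, 0, 1, 0],
    [0, 0, 1, 0, 1],
    [0, 0, 0, 1, 0],
]

# Hypothetical residuals that trend from positive to negative across the row.
residuals = [2.0, 1.5, 0.2, -1.4, -2.3]
clustered = morans_i(residuals, W)

# Hypothetical residuals that alternate in sign between neighbors.
alternating = morans_i([1.0, -1.0, 1.0, -1.0, 1.0], W)

print("clustered residuals:  ", round(clustered, 3))
print("alternating residuals:", round(alternating, 3))
```

Similar values sitting next to each other give a positive statistic, while alternating values give a negative one.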
Geographically Weighted Regression model: [equation on slide]
GWR estimator: [equation on slide]
Where W_i is a matrix of weights specific to location i such that observations nearer to i are given greater weight than observations further away
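The slide's equations are not preserved in this transcript. The form usually given in the GWR literature is sketched below. The coordinate notation (u_i, v_i) and the covariate index p are assumptions for illustration; the weights matrix follows the slide's description.

```latex
% (u_i, v_i): coordinates of location i; p indexes covariates (assumed notation).
\begin{align*}
y_i &= \beta_0(u_i, v_i) + \sum_{p} \beta_p(u_i, v_i)\, x_{ip} + \varepsilon_i \\
\hat{\beta}(u_i, v_i) &= (X^{\top} W_i X)^{-1} X^{\top} W_i\, y \\
W_i &= \mathrm{diag}(w_{i1}, w_{i2}, \ldots, w_{in})
\end{align*}
```

Each location i gets its own weighted least squares fit, so the estimated coefficients form a surface over the study region rather than a single global vector.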
Where w_ik is the weight given to data at location k for the estimate of the local parameters at location i
Note: if W_i = I (identity matrix), each observation in the data has a weight of unity, and the GWR model reduces to the OLS model
Optimizing the weights matrix is a computational task of O(n^2), so for large data sets it takes some time. Good news: we need only derive this once
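To make the weighting concrete, here is a minimal sketch of GWR for one predictor plus an intercept, in plain Python. The Gaussian distance-decay kernel, the fixed bandwidth, and the data are assumptions for illustration. The data are hypothetical, built so that the true slope rises from west to east. This is not the GWR 3.x or spgwr implementation, and it does no bandwidth optimization.

```python
import math

def wls_line(x, y, w):
    """Weighted least squares fit of y = a + b*x; returns (a, b)."""
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    b = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
    a = (swy - b * swx) / sw
    return a, b

def gwr_line(coords, x, y, bandwidth):
    """Local (a, b) at each observation, using an assumed Gaussian kernel."""
    estimates = []
    for (ui, vi) in coords:
        # w_ik: weight of observation k when estimating parameters at location i
        w = [math.exp(-0.5 * (math.hypot(ui - uk, vi - vk) / bandwidth) ** 2)
             for (uk, vk) in coords]
        estimates.append(wls_line(x, y, w))
    return estimates

# Hypothetical data: ten locations on a west-east line; true slope is 1 + 0.2*u.
coords = [(float(u), 0.0) for u in range(10)]
x = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 1.0]
y = [(1.0 + 0.2 * u) * xi for (u, _), xi in zip(coords, x)]

ols = wls_line(x, y, [1.0] * len(x))               # unit weights: global OLS
local = gwr_line(coords, x, y, bandwidth=2.0)      # local fits
flat = gwr_line(coords, x, y, bandwidth=1e9)       # weights ~ 1 everywhere

print("global OLS slope:", round(ols[1], 3))
print("local slopes:", [round(b, 3) for _, b in local])
```

With a very large bandwidth every weight is effectively one, so each local fit matches the global OLS fit. This mirrors the note above that the GWR model reduces to OLS when W_i = I. With a small bandwidth, the local slopes pick up the west-to-east drift that the single global slope averages away.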