190 likes | 341 Vues
This session will delve into the concept of spatial autocorrelation using SAR models in R. We will cover the importance of defining spatial weight matrices, evaluating error terms in SAR models, and utilizing functions from the spdep package, such as dnearneigh() and knearneigh(). Participants will learn how to construct different neighborhood definitions, transform neighborhood data into weight matrices, and assess model fit. The discussion will also explore spatial eigenvector maps for enhancing regression analysis in macroecology.
E N D
R for Macroecology Spatial models
Next week • Any topics that we haven’t talked about? • Group projects
SAR Models • Augment standard OLS with an additional term to model the spatial autocorrelation • We’ll focus on error SAR models, which focuses on spatial pattern in the error part of the model OLS Y = βX + ε SARlag Y = ρWY + βX + ε SARerror Y = βX + λWu+ ε Defining the spatial weights matrix, W, is crucial
Neighborhoods in R • spdep • dnearneigh() • knearneigh() Coordinates (matrix or SpatialPoints) dnearneigh(x, d1, d2, row.names = NULL, longlat = NULL) Minimum and maximum distances (in km if longlat = T) Returns a list of vectors giving the neighbors for each point
Neighborhoods in R • spdep • dnearneigh() • knearneigh() > x = c(1,3,2,5) > y = c(3,2,4,4) > n = dnearneigh(cbind(x,y),d1 = 0,d2 = 3) > n Neighbour list object: Number of regions: 4 Number of nonzero links: 10 Percentage nonzero weights: 62.5 Average number of links: 2.5 > str(n) List of 4 $ : int [1:2] 2 3 $ : int [1:3] 1 3 4 $ : int [1:3] 1 2 4 $ : int [1:2] 2 3 - attr(*, "class")= chr "nb" - attr(*, "nbtype")= chr "distance” ...
Converting a neighborhood to weights neighbors list what to do with neighborless points nb2listw(neighbours, style="W", zero.policy=NULL) W = row standardized (rows sum to 1) B = binary (0/1) C = global standardized (all links sum to n) U = C/n S = variance stabilization (Tiefelsdorf et al. 1999)
Converting a neighborhood to weights > nb2listw(n,style = "W")$weights [[1]] [1] 0.5 0.5 [[2]] [1] 0.3333333 0.33333330.3333333 [[3]] [1] 0.3333333 0.33333330.3333333 [[4]] [1] 0.5 0.5 > nb2listw(n,style = "B")$weights [[1]] [1] 1 1 [[2]] [1] 1 1 1 [[3]] [1] 1 1 1 [[4]] [1] 1 1 > nb2listw(n,style = "C")$weights [[1]] [1] 0.4 0.4 [[2]] [1] 0.4 0.40.4 [[3]] [1] 0.4 0.40.4 [[4]] [1] 0.4 0.4> > nb2listw(n,style = "S")$weights [[1]] [1] 0.4494897 0.4494897 [[2]] [1] 0.3670068 0.3670068 0.3670068 [[3]] [1] 0.3670068 0.3670068 0.3670068 [[4]] [1] 0.4494897 0.4494897
Converting a neighborhood to weights > nb2listw(n,style = "W")$weights [[1]] [1] 0.5 0.5 [[2]] [1] 0.3333333 0.33333330.3333333 [[3]] [1] 0.3333333 0.33333330.3333333 [[4]] [1] 0.5 0.5 > nb2listw(n,style = "B")$weights [[1]] [1] 1 1 [[2]] [1] 1 1 1 [[3]] [1] 1 1 1 [[4]] [1] 1 1 > nb2listw(n,style = "C")$weights [[1]] [1] 0.4 0.4 [[2]] [1] 0.4 0.40.4 [[3]] [1] 0.4 0.40.4 [[4]] [1] 0.4 0.4 > nb2listw(n,style = "S")$weights [[1]] [1] 0.4494897 0.4494897 [[2]] [1] 0.3670068 0.3670068 0.3670068 [[3]] [1] 0.3670068 0.3670068 0.3670068 [[4]] [1] 0.4494897 0.4494897 Emphasizes weakly connected points Emphasizes strongly connected points Emphasizes strongly connected points Tries to balance
Lots of options – how to choose? • Define the neighborhood • Define the spatial weights matrix • Try things out! • Look for stability in model estimates • Look for residual autocorrelation
Defining the neighborhood - d #1. Small distance n = dnearneigh(cbind(x,y),d1 = 0, d2 = 0.1) w1 = nb2listw(n,zero.policy = T) #2. Medium distance n = dnearneigh(cbind(x,y),d1 = 0, d2 = 0.3) w2 = nb2listw(n,zero.policy = T) #2. Large distance n = dnearneigh(cbind(x,y),d1 = 0, d2 = 0.5) w3 = nb2listw(n,zero.policy = T) par(mfrow = c(1,4)) plot(x,y,axes = F,xlab = "",ylab = "") plot(w1,cbind(x,y)) plot(w2,cbind(x,y)) plot(w3,cbind(x,y))
Defining the neighborhood - K #4. 2 neighbors n = knn2nb(knearneigh(cbind(x,y),k=2,RANN = F)) w4 = nb2listw(n,zero.policy = T) #5. 4 neighbors n = knn2nb(knearneigh(cbind(x,y),k=4,RANN = F)) w5 = nb2listw(n,zero.policy = T) #6. 8 neighbors n = knn2nb(knearneigh(cbind(x,y),k=8,RANN = F)) w6 = nb2listw(n,zero.policy = T) par(mfrow = c(1,4)) plot(x,y,axes = F,xlab = "",ylab = "") plot(w4,cbind(x,y)) plot(w5,cbind(x,y)) plot(w6,cbind(x,y))
Neighborhoods on grids x = rep(1:20,20) y = rep(1:20,each = 20) plot(x,y) n = dnearneigh(cbind(x,y),d1=0,d2 = 1) w = nb2listw(n) plot(w,cbind(x,y)) n = dnearneigh(cbind(x,y),d1=0,d2 = sqrt(2)) w = nb2listw(n) plot(w,cbind(x,y)) Rook’s case Queen’s case
Data size • SAR models can take a very long time to fit • 2000 points is the maximum I have used • sample() is useful again
Fitting the SAR model • errorsarlm() just like lm() what to do with neighborless points errorsarlm(formula, listw, zero.policy=NULL) The neighborhood weights
Try it out • Build several SAR models with different W • Which one works best?
Spatial eigenvector maps • Generate new predictors that represent the spatial structure of the data • Three steps • Calculate a pairwise distance matrix • Do a principal components analysis on this matrix • Select some of these PCA axes to add to an OLS model
Spatial eigenvector maps Diniz-Filho and Bini 2005
Filter 1 Filter 2 Filter 3 Filter 4
Filter 20 Filter 10 Filter 30 Filter 40