
SPATIAL MODELS FOR DATA REPORTED AS COUNTS OVER GEOGRAPHIC AREAS

Gary Simon, 28 April 2006. With special thanks to Frank LoPresti (Academic Computing Services, GIS Group) and Kevin Tun (Stern I.T. Group).


Presentation Transcript


  1. SPATIAL MODELS FOR DATA REPORTED AS COUNTS OVER GEOGRAPHIC AREAS Gary Simon, 28 APRIL 2006

  2. With special thanks to Frank LoPresti (Academic Computing Services, GIS Group) and Kevin Tun (Stern I.T. Group).

  3. Here’s an interesting obscure formula. Consider a set of points: Point 1: $(x_1, y_1)$, Point 2: $(x_2, y_2)$, …, Point n: $(x_n, y_n)$.

  4. Connect the points in order: draw a line from point 1 to point 2, then from point 2 to point 3, …, and from point n−1 to point n. Finally, draw a line from point n back to point 1. Assume that none of the segments cross, so that the figure is a simple polygon.

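  5. The area of the resulting polygon is given by $\text{Area} = \pm \frac{1}{2} \sum_{i=1}^{n} (x_i + x_{i+1})(y_{i+1} - y_i)$, where $(x_{n+1}, y_{n+1}) = (x_1, y_1)$. The + occurs when the perimeter is drawn counter-clockwise, the − when drawn clockwise.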
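A minimal Python sketch of this area calculation, assuming the trapezoid form of the formula, in which each segment from $(x_i, y_i)$ to $(x_{i+1}, y_{i+1})$ contributes $\frac{1}{2}(x_i + x_{i+1})(y_{i+1} - y_i)$. The function names are illustrative and do not come from the presentation.

```python
def signed_polygon_area(points):
    """Signed area of a simple polygon given as [(x, y), ...] in drawing order.

    The result is positive for counter-clockwise order and negative for
    clockwise order. The first point should not be repeated at the end.
    """
    n = len(points)
    total = 0.0
    for i in range(n):
        x_i, y_i = points[i]
        x_next, y_next = points[(i + 1) % n]  # the last segment returns to point 1
        total += (x_i + x_next) * (y_next - y_i)
    return 0.5 * total


def polygon_area(points):
    """Unsigned area: choose the + or - sign that makes the result non-negative."""
    return abs(signed_polygon_area(points))


unit_square_ccw = [(0, 0), (1, 0), (1, 1), (0, 1)]
print(signed_polygon_area(unit_square_ccw))        # 1.0
print(signed_polygon_area(unit_square_ccw[::-1]))  # -1.0
```

The sign of the raw sum carries the orientation, which is why the formula needs the ± in front.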

  6. The data: K regions; counts $z_1, z_2, \dots, z_K$ with total count $z_+$; populations $P_1, P_2, \dots, P_K$ with total population $P_+$.

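  7. The obvious null hypothesis of uniformity is tested by $G^2 = 2 \sum_{k=1}^{K} z_k \log\left( \frac{z_k}{e_k} \right)$, where $e_k = z_+ P_k / P_+$ is the expected count for region k under uniformity.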
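As a sketch of how this test could be computed, the Python function below assumes the usual likelihood-ratio form $G^2 = 2 \sum_k z_k \log(z_k / e_k)$ with uniform expected counts $e_k = z_+ P_k / P_+$. The function name and the toy numbers are made up for illustration; they are not the Florida data.

```python
import math


def g2_uniformity(counts, populations):
    """Likelihood-ratio statistic G^2 for uniform rates, with its degrees of freedom.

    counts:      observed counts z_1, ..., z_K
    populations: populations P_1, ..., P_K
    """
    z_total = sum(counts)
    p_total = sum(populations)
    total = 0.0
    for z_k, p_k in zip(counts, populations):
        e_k = z_total * p_k / p_total  # expected count under uniformity
        if z_k > 0:                    # a zero count contributes nothing
            total += z_k * math.log(z_k / e_k)
    return 2.0 * total, len(counts) - 1


# Toy data: counts exactly proportional to population give G^2 = 0.
print(g2_uniformity([10, 20, 30], [100, 200, 300]))  # (0.0, 2)
```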

  8. Uniformity is often rejected. What should be the alternative to uniformity? Techniques like kriging assess covariance structure and not the structure of the expected counts.

  9. There are also techniques that measure spatial association (Cliff and Ord, 1973, 1981) with Moran’s I and with Geary’s c, and these also relate to covariance notions. Cliff, A.D. and Ord, J.K. (1973) Spatial Autocorrelation, London: Pion. Cliff, A.D. and Ord, J.K. (1981) Spatial Processes: Models and Applications, London: Pion. Spatial association can also be given angular interpretations (Simon, 1997). Simon, Gary (1997) An Angular Version of Spatial Correlations, with Exact Significance Tests, Geographical Analysis, vol. 29, no. 3, pp. 267-278.

  10. Let’s form a model for the “spatial force” and give this model a central location or hot spot. Denote this location as $s = (s_x, s_y)$. Here $s_x$ and $s_y$ are parameters to be estimated.

  11. Let f(z) be the spatial force at location $z = (z_x, z_y)$. Then let f(z) = [formula missing from the transcript].

  12. Since f(z) = [formula missing from the transcript], f(s) = c. At any z with $\|z - s\| = \alpha$, $f(z) = c/2$. Thus α is a “half-strength” distance.
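The formula for f is not legible in this transcript, so the sketch below uses an assumed form, $f(z) = c / (1 + \|z - s\|^2 / \alpha^2)$, chosen only because it has the two stated properties: f(s) = c, and f(z) = c/2 at distance α from s. It may not be the form used in the presentation. The example inputs reuse the fitted $s_x$, $s_y$ and α reported later in the deck purely as sample values.

```python
import math


def hot_spot_force(z, s, alpha, c=1.0):
    """ASSUMED hot-spot force: c / (1 + (distance from s / alpha)^2).

    This functional form is an illustration, not the presentation's formula.
    """
    distance = math.hypot(z[0] - s[0], z[1] - s[1])
    return c / (1.0 + (distance / alpha) ** 2)


s = (375.8877, 300.6793)   # sample hot-spot location
alpha = 13.4375            # sample half-strength distance

print(hot_spot_force(s, s, alpha))                     # full strength at the hot spot
print(hot_spot_force((s[0] + alpha, s[1]), s, alpha))  # about half strength at distance alpha
```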

  13. In this form, the only role of c is to assure the condition [condition missing from the transcript].

  14. This can be generalized to mix uniform and hot-spot features: f(z) = [formula missing from the transcript]. The parameter ω assesses the strength of the hot spot relative to uniformity. A negative ω indicates a protective effect.

  15. The maximum likelihood expected counts $\{ e_k \}$ will be used in the test statistic $G^2 = 2 \sum_{k=1}^{K} z_k \log\left( \frac{z_k}{e_k} \right)$.

  16. The value of $e_k$ will be computed as $P_k$ × “average” force on county k, scaled so that $\sum_{k=1}^{K} e_k = z_+$.
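A small Python sketch of this scaling step. It assumes the scaling condition is that the expected counts add up to the observed total $z_+$, and it takes the per-county average forces as already computed. The function name and the toy inputs are hypothetical.

```python
def expected_counts(populations, average_forces, total_count):
    """Expected counts e_k proportional to P_k * (average force on county k).

    The counts are rescaled so that they sum to total_count (assumed to be z_+).
    """
    weights = [p_k * f_k for p_k, f_k in zip(populations, average_forces)]
    scale = total_count / sum(weights)
    return [scale * w for w in weights]


# With equal forces everywhere, the model reduces to uniformity.
print(expected_counts([100, 200, 300], [1.0, 1.0, 1.0], 60))  # [10.0, 20.0, 30.0]
```

Because of the rescaling, any constant multiplier in the force function cancels out of the expected counts.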

  17. Consider cancer rates in Florida. “Age-Adjusted Death Rates for Florida, 1998 – 2002.” http://www.stateofflorida.com

  18. Florida has 67 counties. There were 38,814 cases in a population of 15,982,378. The rate is 2.43 per 1,000. The $G^2$ statistic is 2,816.27 on 66 degrees of freedom. The cancer rates are not uniform.

  19. The maximum likelihood fit occurred at parameter values $s_x = 375.8877$, $s_y = 300.6793$, $\alpha = 13.4375$, $\omega = 2.325$.

  20. This fit has $G^2$ = 2,246.93 on 67 − 1 − 4 = 62 degrees of freedom. This is still an inadequate fit, but the reduction in $G^2$ is 569.34 with four degrees of freedom.
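The arithmetic behind these figures can be checked in a few lines of Python. All numbers come from the slides. The degrees-of-freedom accounting (one degree lost to the fixed total, four to the fitted parameters) is an assumption, chosen because it reproduces the reported 66 and 62.

```python
cases = 38_814
population = 15_982_378
rate_per_1000 = 1000 * cases / population

counties = 67
fitted_parameters = 4                             # s_x, s_y, alpha, omega
df_uniform = counties - 1                         # one constraint: the total count
df_hot_spot = counties - 1 - fitted_parameters

g2_uniform = 2816.27
g2_hot_spot = 2246.93
g2_reduction = g2_uniform - g2_hot_spot
df_reduction = df_uniform - df_hot_spot

print(round(rate_per_1000, 2), df_uniform, df_hot_spot,
      round(g2_reduction, 2), df_reduction)
# 2.43 66 62 569.34 4
```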

  21. The fitted values are these: [figure not reproduced in the transcript]. The hot spot is at (82.56° W longitude, 28.80° N latitude), in Citrus County.

  22. Map information comes in (longitude, latitude) form that needs to be converted to (x, y) form in (say) miles.

  23. Each degree of latitude has the same mile equivalent: one degree of latitude cuts off the same arc length at all latitudes. [Figure labels: North Pole, equatorial plane.]

  24. However, a degree of longitude represents a small distance near the poles and a large distance near the equator. [Figure labels: 30° N latitude, equator.]

  25. Problem: Find the length of one degree of longitude at latitude θ. Solution: Form a spherical triangle with one corner at the north pole, an angle of one degree at that corner, and two sides of 90° − θ each.

  26. In a spherical triangle, the sides also have angle measure. [Figure labels: 30° N latitude, equator.]

  27. We can use the law of sines for spherical triangles: $\frac{\sin A}{\sin a} = \frac{\sin B}{\sin b} = \frac{\sin C}{\sin c}$, where A, B, C are the angles and a, b, c are the opposite sides.
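One way to carry out this calculation, sketched in Python. It bisects the one-degree angle at the pole, which splits the isosceles triangle into two right spherical triangles, and applies the law of sines to one of them: $\sin(c/2) = \sin(90° - \theta) \sin(0.5°)$. The Earth radius of 3958.8 miles and the function names are assumptions, not values from the slides, and the presentation may have applied the law of sines differently.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # assumed mean Earth radius; not given in the slides


def latitude_degree_miles(radius_miles=EARTH_RADIUS_MILES):
    """Arc length of one degree of latitude, the same at every latitude."""
    return radius_miles * math.radians(1.0)


def longitude_degree_miles(latitude_deg, radius_miles=EARTH_RADIUS_MILES):
    """Great-circle distance between two points one degree of longitude apart,
    both at the given latitude."""
    side = math.radians(90.0 - latitude_deg)  # pole-to-point side, as an angle
    half_polar_angle = math.radians(0.5)      # half of the one-degree angle at the pole
    # Law of sines in the right triangle formed by the bisector:
    # sin(c / 2) / sin(half_polar_angle) = sin(side) / sin(90 degrees)
    half_c = math.asin(math.sin(side) * math.sin(half_polar_angle))
    return radius_miles * 2.0 * half_c


for lat in (0.0, 28.80, 60.0):
    print(lat, longitude_degree_miles(lat))
```

For an angle as small as one degree, the result is very close to the familiar approximation of cos θ times the length of a degree of latitude.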

  28. The expected count $E(z_k) = e_k$ is computed as $P_k$ × “average” force on county k. This average force could be $f(c_k)$, where $c_k$ is the center of the county.

  29. Instead we will use $\frac{1}{\text{Area}(\mathcal{R})} \int_{\mathcal{R}} f(h)\,dh$, where $\mathcal{R}$ denotes the county and h is the two-dimensional variable of integration.

  30. The value of $\text{Area}(\mathcal{R})$ can be obtained from outside sources. The challenge comes in finding $\int_{\mathcal{R}} f(h)\,dh$. This can be difficult even for simple figures, and the county region $\mathcal{R}$ is not simple.

  31. Finding $\int_{\mathcal{R}} f(h)\,dh$ requires some organized description of $\partial\mathcal{R}$, the boundary of the county region $\mathcal{R}$. Fortunately, such descriptions are available from mapping programs.

  32. Consider this geographical region: [figure not reproduced in the transcript]

  33. The mapping program MapInfo will export an MIF file giving the coordinates of points on the boundary, in (longitude, latitude) order. The file has this layout, starting with the number of points:
      26
      -75 40.1288
      -75.0154 40.1378
      -75.1094 40.0454
      . . .
      -75 40.0294
      -74.9755 40.0485
      -74.9893 40.1259
      -75 40.1288
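A minimal Python sketch for reading a block with this layout: a point count followed by that many coordinate pairs. It handles only the layout shown on the slide, and a real MIF export is likely to carry additional header lines that would need to be skipped first. The function name is hypothetical, and the four-point sample is an abbreviated stand-in built from a few of the coordinates above, not the full 26-point boundary.

```python
def parse_boundary(text):
    """Parse 'count, then longitude/latitude pairs' into a list of (lon, lat) tuples."""
    tokens = text.split()
    n_points = int(tokens[0])
    values = [float(token) for token in tokens[1:]]
    if len(values) != 2 * n_points:
        raise ValueError(f"expected {n_points} points, found {len(values) / 2:g}")
    return [(values[i], values[i + 1]) for i in range(0, len(values), 2)]


# Abbreviated, illustrative sample: the first point is repeated as the last
# point to close the boundary, as in the slide's listing.
sample = """4
-75 40.1288
-75.1094 40.0454
-74.9755 40.0485
-75 40.1288
"""

boundary = parse_boundary(sample)
print(boundary[0] == boundary[-1])  # True: the ring is closed
```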

  34. A graph of these points: [figure not reproduced in the transcript]

  35. With the boundary so identified, the county region $\mathcal{R}$ is a polygon, so the task of finding $\int_{\mathcal{R}} f(h)\,dh$ is equivalent to integrating over that polygon. The mathematics can be done with Green’s theorem.

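  36. Green’s theorem for a connected region $\mathcal{R}$ with boundary $\partial\mathcal{R}$, and for scalar functions P and Q of two variables, is $\oint_{\partial\mathcal{R}} (P\,dx + Q\,dy) = \iint_{\mathcal{R}} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dx\,dy$, with the boundary traversed counter-clockwise.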

  37. The boundary $\partial\mathcal{R}$ needs to be parameterized as a function of a single variable, say t. This is possible when the boundary is made up of simple curves or, as in the MapInfo story, straight lines.

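  38. The line connecting $(x_k, y_k)$ to $(x_{k+1}, y_{k+1})$ is parameterized as $x(t) = x_k + t\,(x_{k+1} - x_k)$, $y(t) = y_k + t\,(y_{k+1} - y_k)$, for $0 \le t \le 1$. Note that dy means $(y_{k+1} - y_k)\,dt$.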

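  39. In the statement of Green’s theorem, $\oint_{\partial\mathcal{R}} (P\,dx + Q\,dy) = \iint_{\mathcal{R}} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dx\,dy$, let’s use $\frac{\partial Q}{\partial x} = 1$ and $\frac{\partial P}{\partial y} = 0$, so that $\frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} = 1$.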

  40. Green’s theorem is now $\oint_{\partial\mathcal{R}} (P\,dx + Q\,dy) = \iint_{\mathcal{R}} 1\,dx\,dy = \text{Area}(\mathcal{R})$.

  41. This solves as P(x, y) = 0 and Q(x, y) = x, and then $\text{Area}(\mathcal{R}) = \oint_{\partial\mathcal{R}} x\,dy$.

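  42. With the boundary $\partial\mathcal{R}$ given as a polygon, the calculation is routine. The consequence is $\text{Area}(\mathcal{R}) = \pm \frac{1}{2} \sum_{k=1}^{m} (x_k + x_{k+1})(y_{k+1} - y_k)$, where m is the number of boundary points of region $\mathcal{R}$ and $(x_{m+1}, y_{m+1}) = (x_1, y_1)$.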
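The routine calculation can be written out segment by segment. The sketch below assumes boundary points $(x_k, y_k)$ for $k = 1, \dots, m$ with $(x_{m+1}, y_{m+1}) = (x_1, y_1)$, and a linear parameterization of each segment in $t \in [0, 1]$. This notation is an assumption and may differ from the original slides.

```latex
\begin{aligned}
\int_{\text{segment } k} x \, dy
  &= \int_0^1 \bigl[ x_k + t\,(x_{k+1} - x_k) \bigr] \, (y_{k+1} - y_k) \, dt \\
  &= (y_{k+1} - y_k) \Bigl[ x_k + \tfrac{1}{2}\,(x_{k+1} - x_k) \Bigr] \\
  &= \tfrac{1}{2}\,(x_k + x_{k+1})\,(y_{k+1} - y_k), \\[4pt]
\oint_{\partial\mathcal{R}} x \, dy
  &= \tfrac{1}{2} \sum_{k=1}^{m} (x_k + x_{k+1})\,(y_{k+1} - y_k).
\end{aligned}
```

The sum is positive when the points run counter-clockwise and negative when they run clockwise, so its absolute value is the area.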

  43. This calculation finds the area of region $\mathcal{R}$ and, as a side benefit, discovers whether the point ordering was clockwise or counter-clockwise.

  44. We also need the integrated force function $\int_{\mathcal{R}} f(h)\,dh$.

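  45. Match to Green’s theorem $\oint_{\partial\mathcal{R}} (P\,dx + Q\,dy) = \iint_{\mathcal{R}} \left( \frac{\partial Q}{\partial x} - \frac{\partial P}{\partial y} \right) dx\,dy$ with P(x, y) ≡ 0 and $\frac{\partial Q}{\partial x} = f(x, y)$.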

  46. This means that we need to be able to find $Q(x, y) = \int f(x, y)\,dx$. The solution is Q(x, y) = [formula missing from the transcript].

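  47. Then $\int_{\mathcal{R}} f(h)\,dh = \iint_{\mathcal{R}} f(x, y)\,dx\,dy = \oint_{\partial\mathcal{R}} Q(x, y)\,dy$.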

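  48. Let $(x_1, y_1), (x_2, y_2), \dots, (x_m, y_m)$ be the boundary points of $\mathcal{R}$. Then $\oint_{\partial\mathcal{R}} Q(x, y)\,dy = \sum_{k=1}^{m} \int_{\text{segment } k} Q(x, y)\,dy$. Segment k connects point k to point k + 1. (The last segment goes back to point 1.)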
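The segment-by-segment integration can be sketched in Python. Since the formulas for f and Q are not legible in this transcript, the code uses an assumed force $f(x, y) = \alpha^2 / (\alpha^2 + (x - s_x)^2 + (y - s_y)^2)$ (that is, c = 1) and its antiderivative in x, an arctangent expression. Both are illustrative stand-ins, not necessarily the presentation's functions. Each segment integral is approximated with a midpoint rule in t, and all function names are hypothetical.

```python
import math


def assumed_force(x, y, s, alpha):
    """ASSUMED hot-spot force with c = 1 (illustrative, not from the slides)."""
    return alpha**2 / (alpha**2 + (x - s[0])**2 + (y - s[1])**2)


def q_antiderivative(x, y, s, alpha):
    """Antiderivative of assumed_force with respect to x, so that dQ/dx = f."""
    w = math.sqrt(alpha**2 + (y - s[1])**2)
    return (alpha**2 / w) * math.atan((x - s[0]) / w)


def integrated_force(points, s, alpha, steps=200):
    """Integral of assumed_force over a polygon, via the boundary integral of Q dy.

    points: polygon vertices in order, first point not repeated at the end.
    """
    m = len(points)
    total = 0.0
    signed_area = 0.0
    for k in range(m):
        x0, y0 = points[k]
        x1, y1 = points[(k + 1) % m]  # the last segment goes back to point 1
        signed_area += 0.5 * (x0 + x1) * (y1 - y0)
        # Midpoint rule in t on [0, 1]; along the segment, dy = (y1 - y0) dt.
        segment_sum = 0.0
        for j in range(steps):
            t = (j + 0.5) / steps
            segment_sum += q_antiderivative(x0 + t * (x1 - x0),
                                            y0 + t * (y1 - y0), s, alpha)
        total += (segment_sum / steps) * (y1 - y0)
    # Green's theorem assumes counter-clockwise order; flip the sign otherwise.
    return total if signed_area >= 0 else -total


square = [(-1.0, -1.0), (1.0, -1.0), (1.0, 1.0), (-1.0, 1.0)]
print(integrated_force(square, (0.0, 0.0), 1.0))
```

Dividing this integral by the polygon's area gives the average force on the region, the quantity that multiplies the population in the expected count.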
