
Geocoding Public Health Data

Geocoding Public Health Data. Lecture 5 Locating Street Addresses and Global Positioning System GIS and RS in Public Health Edmund Seto, Ph.D. School of Public Health University of California, Berkeley. Spatial Data.


Presentation Transcript


  1. Geocoding Public Health Data. Lecture 5: Locating Street Addresses and Global Positioning System. GIS and RS in Public Health. Edmund Seto, Ph.D., School of Public Health, University of California, Berkeley

  2. Spatial Data In previous lectures we talked about the wide availability of spatial data. Public health data are often inherently spatial: • Vital statistics records include residential street addresses • A cohort study of exposure to air pollution might consider residence and work addresses. The problem is how to get these locations onto a map (i.e., into a format that is readily usable within a GIS). The process of placing such data onto a map or into a GIS is known as geocoding.

  3. Types of Geocoding • Relational Joins for Spatially Aggregated Data • Address Matching • Global Positioning System • Other Alternatives

  4. Aggregated Data For example: A table of data that is grouped at the county level… How do we match this up with a GIS map of counties?

  5. Relational Join A GIS is based on the concept of relational databases, which allow us to match geographic features with the corresponding attribute data. In exercises 1 and 2, we saw that a table of attributes can be “joined” with a table of geographic features based on a common identifier in the GIS. Common identifiers might be: country name, county name, postal code, etc.
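The join described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular GIS implements it: the feature list, the attribute table, the field names, and the rate values are all made-up placeholders, and the county FIPS code is assumed to be the common identifier.

```python
# Hypothetical county features (geometry omitted), keyed by FIPS code.
county_features = [
    {"fips": "06001", "name": "Alameda"},
    {"fips": "06013", "name": "Contra Costa"},
    {"fips": "06075", "name": "San Francisco"},
]

# Hypothetical attribute table; the rates are placeholders, not real data.
rate_table = [
    {"fips": "06001", "rate": 101.0},
    {"fips": "06075", "rate": 95.5},
]


def join_by_key(features, rows, key):
    """Attach the matching attribute row to each feature (a left join)."""
    lookup = {row[key]: row for row in rows}
    joined = []
    for feature in features:
        match = lookup.get(feature[key])
        merged = {**feature, **(match or {})}
        # Flag features with no attribute row so they can be reviewed.
        merged["matched"] = match is not None
        joined.append(merged)
    return joined


joined = join_by_key(county_features, rate_table, "fips")
for row in joined:
    print(row["name"], row.get("rate"), row["matched"])
```

Keeping the unmatched features (rather than dropping them) makes it easy to spot identifiers that are spelled or coded differently in the two tables.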

  6. (Map) CDC WONDER, 1999: diseases of the circulatory system, age-adjusted to the year 2000 population, ICD codes I00-I99

  7. Geocoding Limitations Beware! The scale or area-based measure you choose, or are forced to use (individual address vs. census tract vs. block vs. ZIP code, etc.), can affect the results of your study. This is the Modifiable Areal Unit Problem. Openshaw, S., and P. Taylor, 1979: A million or so correlation coefficients: Three experiments on the modifiable areal unit problem, in Statistical Applications in the Spatial Sciences, ed. N. Wrigley, (London: Pion), 127-144.

  8. Nancy Krieger, Jarvis T. Chen, Pamela D. Waterman, Mah-Jabeen Soobader, S. V. Subramanian, and Rosa Carson. Geocoding and Monitoring of US Socioeconomic Inequalities in Mortality and Cancer Incidence: Does the Choice of Area-based Measure and Geographic Level Matter? The Public Health Disparities Geocoding Project. Am J Epidemiol 2002; 156:471-482

  9. Street Addresses For example: A table of individual street addresses… How do we match this up with a GIS map of streets?

  10. Address Matching in GIS This is known as Address Matching. Street geography layer: each street has a name and starting & ending addresses. Input address: 1234 University Ave. Steps: 1. matching 2. interpolation. Output: coordinates for the address.

  11. Geocoding TIGER The US Census Bureau’s TIGER files include street address information.

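  12. (Figure) TIGER address range fields on a street segment of University Ave: FRADDL and TOADDL (from and to addresses, left side), FRADDR and TOADDR (from and to addresses, right side)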
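The matching and interpolation steps can be sketched as follows. This is a simplified illustration under stated assumptions, not the algorithm any specific geocoder uses: the segment is treated as a straight line, the address ranges are assumed to be stored as integers, odd and even house numbers are assumed to sit on opposite sides, and the keys from_xy and to_xy and all coordinate values are hypothetical.

```python
def interpolate_address(house_number, segment):
    """Estimate (x, y, side) for a house number on one street segment.

    Returns None when the number falls outside both address ranges.
    """
    for side in ("L", "R"):
        start = segment["FRADD" + side]
        end = segment["TOADD" + side]
        low, high = min(start, end), max(start, end)
        # Assumption: a range holds only odd or only even numbers.
        same_parity = house_number % 2 == start % 2
        if same_parity and low <= house_number <= high:
            if end == start:
                fraction = 0.5  # single-address range: use the midpoint
            else:
                fraction = (house_number - start) / (end - start)
            (x0, y0), (x1, y1) = segment["from_xy"], segment["to_xy"]
            return (x0 + fraction * (x1 - x0), y0 + fraction * (y1 - y0), side)
    return None


# Hypothetical segment; coordinates are in arbitrary projected units.
segment = {
    "name": "University Ave",
    "FRADDL": 1201, "TOADDL": 1299,
    "FRADDR": 1200, "TOADDR": 1298,
    "from_xy": (0.0, 0.0), "to_xy": (98.0, 0.0),
}

print(interpolate_address(1234, segment))  # lands on the even ("R") side
print(interpolate_address(1400, segment))  # outside both ranges
```

Because the position is interpolated rather than surveyed, the result is only as good as the assumption that house numbers are spread evenly along the segment.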

  13. Geocoding Services in ArcGIS ArcView provides a tool known as Geocoding Services that allows us to geocode data, in particular street addresses. For address matching, Geocoding Services works on the same principle we have just discussed: it relies on street geography and interpolates the address numbers. ArcView comes with a license for StreetMap USA. For the following example, however, we will rely on TIGER files for our Geocoding Service.

  14. Geocoding Berkeley Clinics From the Yellow Pages, I created a table of Berkeley clinics and their addresses. We will create a Geocoding Service in ArcView for geocoding these addresses. The Geocoding Service will be based on the Berkeley streets file that we clipped out from TIGER data in exercise 2.

  15. 1. Start up ArcCatalog. Under Geocoding Services, select “Create New Geocoding Service”.

  16. Addresses can be formatted in a number of different ways, and here you can choose the style that fits the data that you’re using. For TIGER data we will use: US Streets (File-based)

  17. 5. In the Geocoding Services Manager “Add” the service we just created.

  18. 5. Add the service we just created.

  19. Address Matching Difficulties Address matching isn’t as easy as it seems. Even in our little example, we only had good matches for around 50% of our addresses, and we only tried 18 addresses in Berkeley! Problems: • Not all mailing addresses correspond to street addresses (e.g., PO Box, 140 Warren Hall, trailer parks) • Newly developed areas lack street maps for geocoding • Data quality: poorly formatted address data and/or errors in the street geography data
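One way to anticipate some of these problems is to screen the address table before geocoding. The sketch below is a rough heuristic under stated assumptions, not a complete address parser: the two regular expressions, the category names, and the sample addresses are illustrative only.

```python
import re
from collections import Counter

# Assumed patterns: "PO Box", "P.O. Box", etc., and "number + street name".
PO_BOX = re.compile(r"^\s*p\.?\s*o\.?\s*box\b", re.IGNORECASE)
STREET = re.compile(r"^\s*\d+\s+\S+")


def classify(address):
    """Roughly sort an address string into po_box, street, or other."""
    if PO_BOX.match(address):
        return "po_box"
    if STREET.match(address):
        return "street"
    return "other"


# Illustrative sample, not the clinic table from the lecture.
addresses = [
    "1234 University Ave",
    "PO Box 140",
    "P.O. Box 12",
    "Warren Hall",
    "140 Warren Hall",
]

counts = Counter(classify(a) for a in addresses)
print(counts)
```

Note that "140 Warren Hall" passes the street test even though it is a room in a building, which shows why a pattern check can only flag the obvious cases; the rest still need manual review.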

  20. Address Matching Difficulties • Texas DOH Guideline for Geocoding: http://www.tdh.state.tx.us/gis/Images/Docs/GUIDELINE_FOR_GEOCODING.pdf • New Jersey geocoding problems: http://www.state.nj.us/health/chs/releasable.htm • Jane McElroy’s talk (Univ. of Wisc.), Geocoding addresses from a large population-based study: Lessons learned and applied: http://www.pophealth.wisc.edu/lecture/pm803-02/pm803-25slides.ppt

  21. No Geographic Data For example: mapping data that cannot be easily located on existing maps. • Residential locations in rural villages • Environmental sampling sites • Infectious disease vector breeding & control sites

  22. Global Positioning System • What is the Global Positioning System (GPS)? • A global navigation system • Answers the questions: • Where am I now? • How far is my destination? • How do I reach my destination?

  23. GPS Background • A satellite-based navigation system • 24 very high-altitude orbiting satellites • Launched by U.S. Department of Defense • 24-hour, worldwide coverage • Free and reliable • Capable of very high accuracy location measurements

  24. How Does GPS Work? • Uses radio signals transmitted from satellites to triangulate a position on the earth • 4 unknowns: x, y, z, time • Hence 4 satellites are required for triangulation

  25. Triangulating Position (figure: satellites 1, 2, 3 at distances d1, d2, d3) • 3 satellites narrow the position down to one of 2 points. • One of those 2 points is off in space or is changing very rapidly, so it can be rejected. So theoretically, if we could calculate the range to each satellite exactly, only 3 satellites would be necessary.

  26. Distance from each satellite? (figure: satellites 1, 2, 3 at distances d1, d2, d3) • Each satellite transmits a known pseudo-random code on a synchronized schedule • The receiver in the field produces the same pseudo-random code and determines the delay or offset in the code due to transmission time from each satellite. The farther away a satellite is, the larger the delay in its signal.
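The code-offset idea can be illustrated with a toy example. This is a sketch only: the 127-chip random code, the circular shift, and the chip duration are assumptions chosen for illustration and do not describe the actual GPS signal structure. The receiver slides its own copy of the code against the received one and keeps the shift with the most agreeing chips.

```python
import random

SPEED_OF_LIGHT = 299_792_458.0  # metres per second
CHIP_SECONDS = 1e-6             # hypothetical chip duration for this toy


def best_offset(received, replica):
    """Return the circular shift (in chips) where replica best fits received."""
    n = len(replica)

    def score(shift):
        # Count chips that agree when the replica is delayed by `shift`.
        return sum(received[(i + shift) % n] == replica[i] for i in range(n))

    return max(range(n), key=score)


rng = random.Random(42)
code = [rng.randint(0, 1) for _ in range(127)]

true_delay = 37
# Simulate the received signal as the code circularly delayed by 37 chips.
received = code[-true_delay:] + code[:-true_delay]

offset = best_offset(received, code)
distance_m = offset * CHIP_SECONDS * SPEED_OF_LIGHT
print(offset, distance_m)
```

The measured offset times the chip duration gives the travel time, and the travel time times the speed of light gives the range to that satellite.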

  27. The fourth satellite (figure: satellites 1 to 4 at distances d1, d2, d3, d4) • A fourth satellite signal is needed to triangulate the position because the clocks on field receivers are not as accurate as those onboard the satellites. Hence, the fourth satellite is used to solve for the position even when there is imperfect timing.
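The role of the fourth satellite can be seen by writing the problem as four equations in four unknowns: x, y, z, and the receiver clock error expressed as a distance. The sketch below solves them by repeated linearization (Newton's method) starting from the Earth's centre. It is a teaching example under stated assumptions: the satellite coordinates, the receiver position, and the clock bias are invented values in kilometres, and real receivers handle many more error terms.

```python
import math


def solve_linear(a, b):
    """Solve a small square system by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def solve_position(sats, pseudoranges, iterations=10):
    """Estimate [x, y, z, clock_bias] from exactly four pseudoranges."""
    est = [0.0, 0.0, 0.0, 0.0]  # start at the Earth's centre, zero bias
    for _ in range(iterations):
        jac, resid = [], []
        for (sx, sy, sz), rho in zip(sats, pseudoranges):
            dx, dy, dz = est[0] - sx, est[1] - sy, est[2] - sz
            d = math.sqrt(dx * dx + dy * dy + dz * dz)
            jac.append([dx / d, dy / d, dz / d, 1.0])
            resid.append(rho - (d + est[3]))  # measured minus predicted
        delta = solve_linear(jac, resid)
        est = [e + step for e, step in zip(est, delta)]
    return est


def distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Invented geometry, in kilometres.
sats = [
    (15000.0, 10000.0, 20000.0),
    (-12000.0, 18000.0, 15000.0),
    (10000.0, -16000.0, 18000.0),
    (-8000.0, -9000.0, 23000.0),
]
true_pos = (1200.0, -800.0, 6200.0)
clock_bias = 100.0  # receiver clock error expressed as a distance

pseudoranges = [distance(s, true_pos) + clock_bias for s in sats]
estimate = solve_position(sats, pseudoranges)
print(estimate)
```

With only three pseudoranges the clock error could not be separated from the ranges, which is the point the slide makes about imperfect receiver timing.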

  28. Sources of Inaccuracy • Multipath reception • Timing offset • Signal delays due to Earth’s ionosphere and atmosphere • Poor satellite geometry • Selective Availability (turned off May 1, 2000)

  29. Differential Correction • Eliminates systematic errors: • S/A, receiver clock, satellite clocks, satellite position, ionosphere and atmosphere delays • Uses GPS receiver at a static known reference point to determine error in the signal • This error is similar for nearby GPS receivers at unknown positions • Error correction signal from the reference receiver can correct positions for the nearby receivers

  30. Differential Correction (figure) • BASE: reference station at a known location • ROVERs: moving receivers at unknown locations • Correction information is sent over a radio link, or applied by post-processing in the office
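The base and rover arrangement can be sketched with simple arithmetic. This is a deliberately simplified, position-domain illustration: it assumes the rover sees the same error as the base at the same moment, and the coordinates are made-up projected values in metres. Practical systems usually correct the individual satellite range measurements instead.

```python
def differential_correct(base_known, base_measured, rover_measured):
    """Shift a rover fix by the error observed at the base station."""
    # Error = what the base receiver reported minus where it really is.
    error = tuple(m - k for m, k in zip(base_measured, base_known))
    # Assumption: the nearby rover shares this error, so subtract it.
    return tuple(r - e for r, e in zip(rover_measured, error))


base_known = (1000.0, 2000.0)      # surveyed reference position
base_measured = (1003.0, 1998.5)   # GPS fix at the base at the same time
rover_measured = (1503.0, 2498.5)  # uncorrected rover fix

corrected = differential_correct(base_known, base_measured, rover_measured)
print(corrected)
```

The same function works whether the correction arrives over a radio link in the field or is applied to logged fixes afterwards, as long as base and rover measurements are matched by time.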

  31. GPS Accuracy

  32. GPS Accuracy • GPS accuracy depends on other variables too: • Time spent on measurements • Averaging multiple measurements • Design of the receiver and antenna
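The benefit of averaging can be sketched as below. This is an illustration under stated assumptions: the fixes are invented projected coordinates in metres, and the standard error formula assumes the errors in successive fixes are independent and random, so it says nothing about systematic errors shared by all the fixes.

```python
import math
import statistics


def average_fix(fixes):
    """Return the mean position and the standard error in x and y."""
    xs = [x for x, _ in fixes]
    ys = [y for _, y in fixes]
    n = len(fixes)
    mean_xy = (statistics.mean(xs), statistics.mean(ys))
    # Standard error of the mean shrinks with the square root of n.
    std_err = (
        statistics.stdev(xs) / math.sqrt(n),
        statistics.stdev(ys) / math.sqrt(n),
    )
    return mean_xy, std_err


# Invented repeated fixes at one site.
fixes = [(10.0, 20.0), (12.0, 18.0), (11.0, 22.0), (9.0, 20.0)]

mean_xy, std_err = average_fix(fixes)
print(mean_xy, std_err)
```

Staying longer at a site gives more fixes to average, which is the trade-off against the fieldwork time needed for each feature.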

  33. GPS for Public Health • Disease case or incident sites • Sites of major exposure • Hazardous sites • Vector breeding sites • Intervention or control sites

  34. Creating an Appropriate GPS/GIS Database • What spatial/temporal factors are relevant? • Spatial component: • Point Features • Line Features • Area Features • Attribute Data component: • What sorts of data are relevant for each particular type of spatial feature? • Spatial Resolution

  35. Creating an Appropriate GPS/GIS Database • Fieldwork logistics are a real issue because you have to physically be at the site you want to map! • Cost of receivers vs efficiency • Battery power • Time needed for each feature • Difficulty getting to the sites

  36. Schistosomiasis Case Study

  37. Schistosome Lifecycle (figure labels) Humans, worms, eggs, irrigation ditch exposure, fertilization, miracidia, cercaria, irrigation ditch habitat, snails

  38. Geocoding Snail Density • Intermediate host for the disease is a snail that lives in irrigation ditches • Preexisting methods for estimating snail density based on sampling frames • How can we geocode these frames? • Rural area • No maps available • Roughly 500 frames within a village • Money and time are limited

  39. One solution • Map the ditches with GPS • Line feature • Attributes • Ditch ID • Ditch properties: width, flow, construction
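A ditch mapped this way could be stored as a line feature with its attributes. The sketch below uses a GeoJSON-like dictionary purely as an assumed convention; the function names, the attribute values, and the track coordinates (treated as projected metres) are hypothetical.

```python
import math


def make_ditch_feature(ditch_id, track_points, width_m, flow, construction):
    """Bundle a GPS track and ditch attributes into one line feature."""
    return {
        "type": "Feature",
        "geometry": {
            "type": "LineString",
            "coordinates": [list(p) for p in track_points],
        },
        "properties": {
            "ditch_id": ditch_id,
            "width_m": width_m,
            "flow": flow,
            "construction": construction,
        },
    }


def line_length(feature):
    """Sum the straight-line lengths between consecutive track points."""
    coords = feature["geometry"]["coordinates"]
    return sum(
        math.hypot(x1 - x0, y1 - y0)
        for (x0, y0), (x1, y1) in zip(coords, coords[1:])
    )


# Hypothetical GPS track walked along one ditch.
track = [(0.0, 0.0), (30.0, 40.0), (30.0, 100.0)]
ditch = make_ditch_feature("D-01", track, width_m=1.5, flow="slow", construction="earthen")

print(ditch["properties"]["ditch_id"], line_length(ditch))
```

Keeping the attributes with the geometry means each ditch can later be joined to other tables by its ditch ID, in the same way as the relational joins discussed earlier.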
