Spatio-Temporal Cluster Detection Using AMOEBA • Jimmy Kroon, Pennsylvania State University • Advisor: Dr. Frank Hardisty
This is a parody – Original Art: http://projectswordtoys.blogspot.com/2009/05/project-sword-annual-1967.html
Outline • Introduction – Clustering and Project Direction • The Spatial Scan Statistic and SatScan • AMOEBA • Proposed Spatio-Temporal AMOEBA Method • Software, Data, and Progress
Cluster Detection • Cluster: “a geographically and/or temporally bounded group of occurrences of sufficient size and concentration to be unlikely to have occurred by chance” (Knox, 1989) • Two Typical Uses • Disease Surveillance (example map: week of 2/7/2010; data: Google Flu Trends; analysis: GeoDa) • Epidemiological Studies (example: brain cancer in New Mexico, Kulldorff et al. 1998) Clusters : SaTScan : AMOEBA : ST AMOEBA : Progress
Time in Spatial Analysis • Time Matters: • Many geographic phenomena are dynamic. • Spatial patterns we see probably change over time • The American Association of Geographers describes temporal geography as a ‘frontier’ of GIScience. • Spatio-temporal clusters may exhibit behaviors not seen in purely spatial clusters. • Growth • Movement • Splits / Joins
Research Problem • Primary: No method exists for determining the true extent of irregularly shaped clusters in spatio-temporal datasets. • Secondary: Spatial AMOEBA has not been implemented in R. Project Goals • A demonstration of spatio-temporal cluster detection based on the AMOEBA procedure. • R scripts for running spatial and spatio-temporal AMOEBA will be contributed to the R community.
The Spatial Scan Statistic • Scan the data with a moving ‘window’, calculating a test statistic for the spatial units that fall within the window (in SaTScan, a likelihood ratio comparing the rate inside the window with the rate outside). • Select the window(s) with the highest value as possible cluster(s). • The spatial scan statistic is by far the most popular cluster detection technique, largely due to the availability of the SaTScan software by Martin Kulldorff.
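The moving-window idea can be sketched in a few lines. This is a minimal illustration, not SaTScan's implementation: it assumes a Poisson model scored with a log likelihood ratio, circular windows centred on each unit's centroid, and a cap of 50% of the total population per window. The function names `poisson_llr` and `scan`, and the toy data, are hypothetical.

```python
import math

def poisson_llr(c, e, total):
    """Log likelihood ratio for a window with c observed and e expected cases.

    Only windows with more cases than expected (c > e) score above zero.
    """
    if c <= e:
        return 0.0
    inside = c * math.log(c / e)
    outside = (total - c) * math.log((total - c) / (total - e)) if total > c else 0.0
    return inside + outside

def scan(points, cases, pop, max_pop_share=0.5):
    """Return (best_llr, unit_indices) over circular windows centred on each point.

    Assumes every unit has a positive population.
    """
    total_cases = sum(cases)
    total_pop = sum(pop)
    best = (0.0, [])
    for center in points:
        # Grow the window by adding units in order of distance from the centre.
        order = sorted(range(len(points)), key=lambda j: math.dist(center, points[j]))
        window, c, p = [], 0, 0
        for j in order:
            if p + pop[j] > max_pop_share * total_pop:
                break
            window.append(j)
            c += cases[j]
            p += pop[j]
            expected = total_cases * p / total_pop
            llr = poisson_llr(c, expected, total_cases)
            if llr > best[0]:
                best = (llr, sorted(window))
    return best

# Toy data: six units on a line, equal populations, excess cases in units 3 and 4.
points = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0), (5, 0)]
cases = [1, 1, 1, 9, 10, 1]
pop = [100] * 6
best_llr, best_window = scan(points, cases, pop)
print(best_window, round(best_llr, 2))
```

Because every candidate window is a circle, the result can only ever be a circular set of units, which is the limitation the next slides discuss.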
Drawbacks of the Spatial Scan Statistic • Clusters that are not similar in shape to the scanning window can produce errors. • False inclusions • False exclusions • Thin clusters identified as multiple small clusters • Holes in clusters cannot be detected
The Elliptical Spatial Scan Statistic • Must choose shapes a priori to avoid pre-selection bias. • See Kulldorff et al. 2006.
AMOEBA (A Multidirectional Optimum Ecotope-Based Algorithm) • Ecotope-Based – Regions of contiguous spatial units that are related in terms of z-value. • Multidirectional – Search in all directions. • Optimum – The procedure takes place at the finest spatial scale possible and is capable of revealing all spatial association present in the dataset (Aldstadt and Getis, 2006). • Figure: AMOEBA clusters
AMOEBA • Defining an Ecotope • Add a seed location (one polygon) to the ecotope • Calculate Gi* (Getis-Ord local autocorrelation statistic) • Search in all directions for contiguous polygons • Those that increase Gi* are added to the growing ecotope for that seed location • Keep searching for more neighbors, growing the ecotope until Gi* no longer increases • Repeat – creating ecotopes for each polygon in the dataset
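The steps above can be sketched as follows. This is a simplified illustration under stated assumptions, not the published AMOEBA code: it uses binary weights (1 inside the ecotope, 0 outside), searches only for high-value clusters, and at each step keeps the highest-valued group of new neighbors that raises Gi* while rejecting the rest. The names `gi_star` and `grow_ecotope` and the toy data are hypothetical.

```python
import math

def gi_star(values, members):
    """Getis-Ord Gi* for a set of units, using binary weights over that set."""
    n = len(values)
    k = len(members)
    mean = sum(values) / n
    var = sum(v * v for v in values) / n - mean * mean
    if k == n or var <= 0:
        # Statistic is undefined when the set covers everything or values are constant.
        return 0.0
    numer = sum(values[j] for j in members) - k * mean
    denom = math.sqrt(var) * math.sqrt((n * k - k * k) / (n - 1))
    return numer / denom

def grow_ecotope(values, neighbors, seed):
    """Grow an ecotope from `seed` until no set of new neighbors raises Gi*."""
    ecotope = {seed}
    rejected = set()
    best = gi_star(values, ecotope)
    while True:
        # Contiguous units not yet included and not previously rejected.
        frontier = {j for i in ecotope for j in neighbors[i]} - ecotope - rejected
        if not frontier:
            break
        # Try neighbors from highest to lowest value, adding them cumulatively.
        ranked = sorted(frontier, key=lambda j: (-values[j], j))
        best_k, best_g = 0, best
        trial = set(ecotope)
        for k, j in enumerate(ranked, start=1):
            trial.add(j)
            g = gi_star(values, trial)
            if g > best_g:
                best_k, best_g = k, g
        if best_k == 0:
            break  # Gi* no longer increases
        ecotope.update(ranked[:best_k])
        rejected.update(ranked[best_k:])
        best = best_g
    return ecotope, best

# Toy data: a chain of six polygons with high values in the middle.
values = [1, 1, 9, 10, 8, 1]
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
ecotope, g = grow_ecotope(values, neighbors, seed=3)
print(sorted(ecotope), round(g, 3))
```

Running `grow_ecotope` once per polygon yields one ecotope per seed, which is the input to the cluster-selection step.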
The R Neighbor Object
Finding an Ecotope with AMOEBA
AMOEBA • From Ecotopes to Clusters • Rank ecotopes by final Gi* • Select the ecotope with the highest Gi* as a cluster • Eliminate the ecotopes that intersect it • Select the remaining ecotope with the next highest Gi* as a second cluster • Repeat • The statistical significance of clusters can be tested using Monte Carlo simulation
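A minimal sketch of the rank-and-eliminate step, assuming each ecotope is already available as a pair of (set of polygon IDs, final Gi*). The function name `select_clusters` and the example values are hypothetical, and significance testing is left out.

```python
def select_clusters(ecotopes):
    """Pick non-overlapping ecotopes in descending order of Gi*.

    `ecotopes` is a list of (members, gi) pairs, where members is a set of IDs.
    """
    ranked = sorted(ecotopes, key=lambda e: e[1], reverse=True)
    clusters = []
    claimed = set()  # polygons already assigned to a selected cluster
    for members, gi in ranked:
        if claimed.isdisjoint(members):
            clusters.append((members, gi))
            claimed |= members
        # Otherwise the ecotope intersects a stronger cluster and is dropped.
    return clusters

# Made-up ecotopes for illustration only.
ecotopes = [
    ({2, 3}, 1.76),
    ({2, 3, 4}, 2.21),
    ({4, 5}, 0.80),
    ({7, 8}, 1.90),
    ({0}, 0.30),
]
clusters = select_clusters(ecotopes)
print([sorted(m) for m, _ in clusters])
```

In practice, weak survivors such as the single-polygon ecotope here would be screened out by the Monte Carlo test rather than reported as clusters.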
Incorporating Time into AMOEBA • Remember - Spatio-temporal clusters may exhibit behaviors not seen in purely spatial clusters. • Growth • Movement • Splits / Joins • Visualize temporal data as layers of data with time extending vertically through the layers. • Each spatio-temporal unit has spatial neighbors and temporal neighbors
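One way to picture the layered neighbor structure is the sketch below. It assumes, as an illustration only, that a unit's temporal neighbors are the same polygon in the layers immediately before and after it; the slides do not state the exact rule, and the function name `st_neighbors` is hypothetical.

```python
def st_neighbors(spatial_neighbors, n_times):
    """Build a neighbor list keyed by (unit, time) from a purely spatial one.

    Spatial neighbors stay within a time layer; temporal neighbors are the
    same unit one layer down and one layer up (an assumed definition).
    """
    st = {}
    for t in range(n_times):
        for unit, nbrs in spatial_neighbors.items():
            linked = [(j, t) for j in nbrs]
            if t > 0:
                linked.append((unit, t - 1))
            if t < n_times - 1:
                linked.append((unit, t + 1))
            st[(unit, t)] = linked
    return st

# Three polygons in a row, observed over three time steps.
spatial = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
st = st_neighbors(spatial, n_times=3)
print(st[("B", 1)])
```

With a neighbor list in this form, an ecotope can grow across space and time with the same contiguity search used in the purely spatial case.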
The Spatio-Temporal Scan Statistic • See Kulldorff et al. 1998.
Spatio-Temporal AMOEBA
Software Environment and Test Data • The R Project • Free, open source statistical software • Extendable with user-contributed packages • www.r-project.org • Google Flu Trends • Estimates flu incidence levels using aggregated data about user searches for certain keywords • 90% accurate compared to CDC data • State-level data, updated daily • www.google.org/googleflu • SEER (Surveillance, Epidemiology, and End Results) • National Cancer Institute incidence, survival, and mortality data
AMOEBA ArcToolbox for ArcGIS • Python scripts by Jared Aldstadt and Yeming Fan (Aldstadt, 2010) • Example: Google Flu Trends – Feb 1, 2009
Spatio-Temporal AMOEBA in Python: 2009 Flu Epidemic
Hmmm…
R Programming Progress • Complete … • Geoprocessing tasks • Create spatio-temporal neighbor list • Delineate ecotopes • Sort and eliminate intersecting ecotopes • Returns primary cluster PolyIDs that match the Python results • To Do … • Monte Carlo simulation • Process results and add to the output shapefile • Test, test, test
References
Aldstadt, Jared, and Arthur Getis. 2006. Using AMOEBA to Create a Spatial Weights Matrix and Identify Spatial Clusters. Geographical Analysis 38: 327-343.
Aldstadt, Jared. 2010. Spatial Analysis Tools (ArcGIS). Spatial Analysis Tools. http://www.acsu.buffalo.edu/~geojared/tools.htm.
Bellec, S, D Hémon, J Rudant, A Goubin, and J Clavel. 2006. Spatial and space–time clustering of childhood acute leukaemia in France from 1990 to 2000: a nationwide study. British Journal of Cancer.
Duczmal, Luiz, Martin Kulldorff, and Lan Huang. 2006. Evaluation of Spatial Scan Statistics for Irregularly Shaped Clusters. Journal of Computational and Graphical Statistics 15(2): 428-442.
Knox, G. 1989. Detection of Clusters. In Methodology of Enquiries into Disease Clustering, ed. P Elliott, 17-22. London: Small Area Health Statistics Unit.
Kulldorff, Martin, William Athas, Eric Feuer, Barry Miller, and Charles Key. 1998. Evaluating cluster alarms: A space-time scan statistic and brain cancer in Los Alamos, New Mexico. American Journal of Public Health 88(9): 1377-1380.
Kulldorff, Martin, Lan Huang, Linda Pickle, and Luiz Duczmal. 2006. An elliptic spatial scan statistic. Statistics in Medicine 25(22): 3929.
Kulldorff, Martin. 1999. Geographic Information Systems (GIS) and community health: Some statistical issues. Journal of Public Health Management and Practice 5(2): 100-106.
Original artwork for parody title slide: http://projectswordtoys.blogspot.com/2009/05/project-sword-annual-1967.html