Local Enhancement of Global Estimation

Molly Leecaster, Ph.D., and Kerry Ritter, Ph.D. DAMARS and STARMAP 2nd Annual Conference, Oregon State University, Corvallis, OR, August 11, 2003.


Presentation Transcript


  1. Local Enhancement of Global Estimation • Molly Leecaster, Ph.D. • Kerry Ritter, Ph.D. • DAMARS and STARMAP 2nd Annual Conference, Oregon State University, Corvallis, OR, August 11, 2003

  2. Acknowledgement PROJECT FUNDING • The work reported here was developed under the STAR Research Assistance Agreement CR-829095 awarded by the U.S. Environmental Protection Agency (EPA) to Colorado State University. This presentation has not been formally reviewed by EPA.  The views expressed here are solely those of the presenter and STARMAP, the Program they represent. EPA does not endorse any products or commercial services mentioned in this presentation.

  3. Outline of Presentation • Introduction • Two-stage sample design • Spatial modeling of binary EMAP data • Indicator kriging • Conditional autoregressive model • Simulation Example • Future work

  4. Introduction • EMAP developed for estimation of areal extent of resources • Sample locations are spatially separated • EMAP participants are interested in global estimation but also have local concerns • Spatial modeling • EMAP data do not provide information on the local spatial structure required for good spatial models • Therefore: augment EMAP design to improve spatial modeling

  5. Goals • Present enhancement to EMAP design • Use of enhanced sample in spatial models of indicator data • Indicator kriging • Conditional autoregressive model

  6. Outline of Presentation • Introduction • Two-stage sample design • Spatial modeling of EMAP data • Simulation Example • Future work

  7. Two-stage: Systematic Grid Plus Star Cluster Sample Design • Two-stage because two goals • Systematic (EMAP) grid for global structure • Star cluster sample for variogram estimation • Enhance EMAP design with additional sample locations • Ideal for areal extent and prediction • Ideal for variogram estimation

  8. Two-Stage Design • Figure legend: pink = absence, blue = presence, black = systematic, green = star clusters 1, orange = star clusters 2

  9. Stage One: Systematic Component (EMAP) • Based on global estimation requirements • e.g. 30 spatially separated locations per stratum

  10. Stage Two: Star Cluster Component • Star clusters of sample sites around stage-one locations • Star clusters provide estimate of small-scale pair-wise variance • Star clusters also provide many added pairs of samples at various distance lags • Star clusters provide directional information at small scale • How to specify star clusters?

  11. Stage Two: Star Cluster Component • Location of star clusters • Adaptive, locate at specified observed response • Does this bias the variogram estimation? • Random stage-one locations • Systematic subset of stage-one locations • Size of star clusters • Diameter of star = variogram range • Diameter of star > variogram range • Number of star clusters • At least two, but how many more?
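
The star cluster idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `star_cluster`, the six arms, and the two sites per arm are hypothetical choices, not taken from the slides. They were picked because six directions suit a hexagon grid and because 12 added sites per cluster is consistent with the sample sizes reported in the simulation (n = 30, 54, 78).

```python
import math

def star_cluster(center, spacing=1.0, arms=6, steps=2):
    """Return sample sites radiating from one stage-one location.

    arms and steps are assumed values: 6 arms x 2 steps = 12 added sites.
    """
    cx, cy = center
    sites = []
    for a in range(arms):
        angle = 2.0 * math.pi * a / arms  # evenly spaced directions
        for s in range(1, steps + 1):
            r = s * spacing  # distance lag along this arm
            sites.append((cx + r * math.cos(angle), cy + r * math.sin(angle)))
    return sites

cluster = star_cluster((10.0, 10.0), spacing=1.0)
print(len(cluster))
```

Each arm contributes pairs at several short lags and in a distinct direction, which is what gives the variogram its small-scale and directional information.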

  12. Outline of Presentation • Introduction • Two-stage sample design • Spatial modeling of EMAP data • Simulation Example • Future work

  13. Spatial Models for Binary Data • Indicator kriging for geo-referenced data • Conditional autoregressive model for binary lattice data

  14. Indicator Kriging • Binary geo-referenced data • Spatial correlation structure modeled from data • Precision of predictions depends on sample spacing and variogram parameters
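
As a rough illustration of "spatial correlation structure modeled from data", the sketch below computes an empirical indicator semivariogram: for each distance bin, half the mean squared difference between pairs of 0/1 indicators. The function name, the equal-width binning, and the toy data are assumptions made for this example.

```python
import math
from collections import defaultdict

def indicator_semivariogram(sites, indicators, lag_width):
    """Map bin index -> (mean pair distance, semivariance, number of pairs)."""
    sq_diff = defaultdict(float)
    dist_sum = defaultdict(float)
    pairs = defaultdict(int)
    for i in range(len(sites)):
        for j in range(i + 1, len(sites)):
            h = math.dist(sites[i], sites[j])
            b = int(h // lag_width)  # equal-width distance bins
            sq_diff[b] += (indicators[i] - indicators[j]) ** 2
            dist_sum[b] += h
            pairs[b] += 1
    return {
        b: (dist_sum[b] / pairs[b], sq_diff[b] / (2.0 * pairs[b]), pairs[b])
        for b in sorted(pairs)
    }

# Toy transect: presence at the first two sites, absence at the last two.
sites = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
indicators = [1, 1, 0, 0]
for b, (mean_h, gamma, n_pairs) in indicator_semivariogram(sites, indicators, 1.0).items():
    print(b, mean_h, round(gamma, 3), n_pairs)
```

Bins with few pairs give unstable estimates, which is one reason extra short-lag pairs matter.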

  15. Ordinary Indicator Kriging • Estimate local indicator mean at each location • Apply simple IK estimator using estimated mean
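
One standard way to write the two steps above is given below. The notation is an assumption introduced here, not taken from the slides: $I(u_i)$ is the observed indicator at sample location $u_i$, $\hat{m}(u)$ is the locally estimated indicator mean, $\lambda_i(u)$ are the kriging weights, and $C(\cdot)$ is the indicator covariance implied by the fitted variogram.

```latex
I^{*}(u) = \sum_{i=1}^{n} \lambda_i(u)\, I(u_i)
         + \Bigl[\, 1 - \sum_{i=1}^{n} \lambda_i(u) \Bigr] \hat{m}(u),
\qquad
\sum_{j=1}^{n} \lambda_j(u)\, C(u_i - u_j) = C(u_i - u), \quad i = 1, \dots, n
```

When the samples are far from $u$, the weights shrink toward zero and the prediction falls back on the local mean.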

  16. Conditional Autoregressive Model for Binary Data • Binary lattice data • Spatial correlation structure assumed: locally (neighborhood) dependent Markov random field • Neighborhood defined as fixed pattern of surrounding grid points • Precision of predictions depends on neighborhood structure, grid size, and variance of response

  17. Conditional Autoregressive Model for Binary Data
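
For orientation, one common formulation of a conditional autoregressive model for binary lattice data is sketched below. It is an assumption, not necessarily the exact model used by the presenters. Here $Y_i$ is the presence indicator at grid point $i$, $\partial i$ is its neighborhood, $n_i$ is the number of neighbors, and $\alpha$ and $\sigma^2$ are unknown parameters that would receive prior distributions in a Bayesian fit.

```latex
Y_i \mid p_i \sim \mathrm{Bernoulli}(p_i), \qquad
\operatorname{logit}(p_i) = \alpha + b_i, \qquad
b_i \mid b_{j \ne i} \sim \mathcal{N}\Bigl( \frac{1}{n_i} \sum_{j \in \partial i} b_j,\ \frac{\sigma^2}{n_i} \Bigr)
```

Each spatial effect $b_i$ depends only on its neighbors, which is the Markov random field property.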

  18. Comparison of Models • Ordinary Indicator Kriging • Advantages • Knowledge of spatial relationship improves prediction • Assumed spatial relationship based on data • Disadvantages • Not robust to variogram mis-specification • Requires strong stationarity assumption • Conditional autoregressive • Advantages • No need to estimate or model variogram • Can be used without geo-referenced data • Disadvantages • Assumed spatial relationship based on a grid size that could be inaccurate

  19. Outline of Presentation • From last year to now … progress & new directions • Two-stage sample design • Spatial modeling of EMAP data • Simulation Example • Future work

  20. Simulation Example • Used simulation so spatial structure was known • Simulated response from specific variogram model onto a 50x50 hexagon grid of points • Specified presence/absence cutoff • Applied two-stage sample design (2 realizations) • Estimated and modeled variogram from sample data • For some, did two manual and one automatic fit • Predicted probability of presence using indicator kriging and conditional autoregressive model
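
A 50x50 hexagon grid of points can be generated as in the sketch below. The function name `hexagon_grid`, the unit spacing, and the offset-row layout are assumptions for illustration; the slides do not say how the grid was built.

```python
import math

def hexagon_grid(n_rows=50, n_cols=50, spacing=1.0):
    """Return (x, y) points on a hexagonal lattice, shifting odd rows by half a step."""
    row_height = spacing * math.sqrt(3.0) / 2.0
    points = []
    for r in range(n_rows):
        offset = 0.5 * spacing if r % 2 else 0.0
        for c in range(n_cols):
            points.append((c * spacing + offset, r * row_height))
    return points

grid = hexagon_grid()
print(len(grid))
```

With this layout every interior point has six nearest neighbors at distance `spacing`.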

  21. Simulation Methods • Simulated data from Gaussian random field (S-Plus) • Spherical variogram, range = 22, sill = 0.4, nugget = 0 • Simulated value > 2 => presence • Sample Designs • Systematic sample (n=30) • Systematic sample plus 2 star clusters (n=54) • Systematic sample plus 4 star clusters (n=78) • Models • Indicator kriging • Conditional autoregressive model
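
The spherical variogram and presence cutoff listed above can be written out as follows. This is a minimal sketch rather than the S-Plus code used for the simulation. The function names are hypothetical, and `sill` is treated as the partial sill added on top of the nugget, which makes no difference here because the nugget is 0.

```python
def spherical_variogram(h, range_=22.0, sill=0.4, nugget=0.0):
    """Spherical semivariogram: rises from the nugget and levels off at the range."""
    if h <= 0.0:
        return 0.0
    if h >= range_:
        return nugget + sill
    r = h / range_
    return nugget + sill * (1.5 * r - 0.5 * r ** 3)

def to_indicator(value, cutoff=2.0):
    """Presence (1) if the simulated value exceeds the cutoff, else absence (0)."""
    return 1 if value > cutoff else 0

for h in (0.0, 11.0, 22.0, 40.0):
    print(h, spherical_variogram(h))
```

Beyond the range of 22 grid units the semivariance stays at the sill, so sites that far apart are treated as spatially uncorrelated.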

  22. Data Simulation with Sample Sites • Figure legend: pink = absence, blue = presence, black = systematic, green = star clusters 1, orange = star clusters 2

  23. Variogram for Sample Designs • Panels: Systematic; Systematic + 2 Stars; Systematic + 4 Stars

  24. Systematic Sample Results

  25. Systematic Sample with 2 Stars

  26. Systematic Sample with 4 Stars

  27. Three Fits: Systematic + 2 Stars • Panels: Automatic Fit; Manual Fit #1; Manual Fit #2 • Fitted parameters listed as Range, Sill, Nugget • 17, 0.3, 0 • [range missing], 0.4, 0 • [range missing], 0.27, 0 • All use correct model

  28. Predictions from 3 Variogram Fits • Panels: Automatic Fit; Manual Fit #1; Manual Fit #2

  29. Comparison of Prediction Errors • Sensitivity • Proportion of presence sites predicted to be present • Specificity • Proportion of absence sites predicted to be absent • True Positive Rate (often called positive predictive value) • Proportion of predicted presence sites that truly are present • True Negative Rate (often called negative predictive value) • Proportion of predicted absence sites that truly are absent
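
The four summaries above can be computed from predicted probabilities and a presence cutoff, as in this sketch. The function name and the toy data are assumptions. The dictionary keys follow the slide's labels, and each quantity is expressed as a proportion.

```python
def prediction_summary(truth, prob, cutoff=0.5):
    """Classify each site as present when prob > cutoff and summarize the errors."""
    tp = fp = tn = fn = 0
    for t, p in zip(truth, prob):
        pred = 1 if p > cutoff else 0
        if pred == 1 and t == 1:
            tp += 1
        elif pred == 1 and t == 0:
            fp += 1
        elif pred == 0 and t == 0:
            tn += 1
        else:
            fn += 1

    def ratio(num, den):
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),         # presence sites predicted present
        "specificity": ratio(tn, tn + fp),         # absence sites predicted absent
        "true_positive_rate": ratio(tp, tp + fp),  # slide's label; predicted presences that are real
        "true_negative_rate": ratio(tn, tn + fn),  # slide's label; predicted absences that are real
    }

truth = [1, 1, 0, 0, 0]
prob = [0.6, 0.4, 0.35, 0.1, 0.2]
print(prediction_summary(truth, prob, cutoff=0.5))
print(prediction_summary(truth, prob, cutoff=0.3))
```

Lowering the cutoff from 0.5 to 0.3 generally trades specificity for sensitivity.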

  30. Comparison of Predictions (Data1F) (positive if probability > 0.5) (Auto, Manual #2)

  31. Comparison of Predictions (Data1F) (positive if probability > 0.3) (Auto, Manual #2)

  32. Data Simulation with Sample Sites • Figure legend: pink = absence, blue = presence, black = systematic, green = star clusters 1, orange = star clusters 2

  33. Variograms for Sample Designs • Panels: Systematic; Systematic + 2 Stars; Systematic + 4 Stars

  34. Systematic Sample Results

  35. Systematic Sample with 2 Stars

  36. Systematic Sample with 4 Stars

  37. Three Fits: Systematic • Panels: Automatic Fit; Manual Fit #1; Manual Fit #2 • Fitted parameters listed as Range, Sill, Nugget • 30, 0.25, 0.21 • 15, 0.27, 0 • [range missing], 0.22, 0 • All use correct model

  38. Predictions from 3 Variogram Fits • Panels: Automatic Fit; Manual Fit #1; Manual Fit #2

  39. Comparison of Predictions (Data3F) (positive if probability > 0.5) (Auto, Manual #2)

  40. Comparison of Predictions (Data3F) (positive if probability > 0.3) (Auto, Manual #2)

  41. Simulation Conclusions - Design • Two star clusters improved small-scale features of variogram • Two star clusters improved prediction accuracy • Four star clusters offered little improvement over two stars

  42. Simulation Conclusions - Models • Variogram model affects predictions • Kriging tends toward overall mean probability of presence, i.e. it smooths • Kriging builds patches whose diameter is approximately the range of the variogram • Conditional autoregressive model attempts to connect observed presence • Neither model had consistently higher sensitivity or specificity

  43. Outline of Presentation • From last year to now … progress & new directions • Two-stage sample design • Spatial modeling of EMAP data • Simulation Example • Future work

  44. Future Work • Further simulation studies on two stage design • Effect of sample size • Number of star clusters necessary to improve variogram estimation • Effect of size of star clusters • Bias from adaptive second-stage sampling • Advantages of indicator kriging and conditional autoregressive model • Sensitivity of conditional autoregressive model to initial values, prior distributions, and grid size • Sensitivity of kriging to variogram model specification

  45. Future Work • Apply two-stage sample design to real data • DDT data from Santa Monica Bay, CA • EMAP data and local monitoring data • Freely distribute functions for applying the conditional autoregressive model on a hexagon lattice • Functions in R to produce hexagon lattice input for WinBUGS • File in WinBUGS to apply model • Investigate optimal grid size to achieve EMAP and spatial modeling goals

  46. Systematic (EMAP) Grid Based on Variogram Model • Kriging variance • Analog for conditional autoregressive model

  47. Systematic (EMAP) Grid Based on Variogram Model • Prediction variance is minimized by large covariance between prediction location and sample locations • For kriging, grid refers to sample locations • For conditional autoregressive, grid refers to sample locations and prediction locations • Want sample locations “close” together • Samples too far apart => • Kriging -> correctly uses no spatial relationship • Conditional autoregressive -> incorrectly uses assumed spatial relationship • Samples too close together => waste of resources
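
The first bullet can be read off the simple kriging variance, shown here in assumed notation: $C(\cdot)$ is the covariance function, $u_i$ are the sample locations, and $\lambda_i(u)$ are the kriging weights at prediction location $u$.

```latex
\sigma^2_{SK}(u) = C(0) - \sum_{i=1}^{n} \lambda_i(u)\, C(u_i - u)
```

The subtracted term grows with the covariances $C(u_i - u)$, so samples within the variogram range of $u$ reduce the variance. Samples beyond the range contribute nothing, and the variance returns to $C(0)$.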
