
Next time


zahur


Presentation Transcript


  1. Next time • Student presentations • Rationale • Format

  2. Presentations • Please limit to 5 min • If you have a PowerPoint file, bring it on a USB flash drive (please preload it before class) • Overview of your project (Why, Where) • Approach or method (How) • Results, output to discuss (if any at this time) • You will have time to answer one or two questions max

  3. Audience • Please ask at least one question – I will keep track of this • You will also be evaluating each presenter based on a set of criteria • The evaluations will be returned to each speaker

  4. Presentation evaluations • Rate the clarity and organization of the presentation. • How well did the project incorporate available GIS/spatial analysis techniques? • Suggestion to improve presentation or project: ______________________

  5. Peer evaluations

  6. Menu • 4:15 pm • 7 pies • 3 pepperoni, 1 plain, 2 veg mix, 1 all-meat • Diet Coke, Coke, Sprite • You may bring a dessert to share if you wish

  7. CART and GIS RESM 575 Spring 2009 Lecture 15

  8. CART “Tree modeling” • A local nonparametric model • Decision tree • Data driven • (Slide images: "This type of tree" points to a decision-tree diagram; "Not this one" points to a picture of a tree)

  9. CART approach • Identify thresholds in predictor variables that partition response data optimally • Results in decision rules associated with “suitability or probability of presence”
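
The threshold search described on this slide can be sketched for a single predictor as follows. This is a minimal illustration, not the procedure used by any particular CART package: the elevation and presence values are hypothetical, the function names are made up for the example, and Gini impurity is assumed as the splitting criterion.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty or pure list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(x, y):
    """Return (threshold, weighted_impurity) for the best split 'x <= threshold'."""
    pairs = list(zip(x, y))
    n = len(pairs)
    values = sorted(set(x))
    best = (None, float("inf"))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for low, high in zip(values, values[1:]):
        t = (low + high) / 2.0
        left = [label for value, label in pairs if value <= t]
        right = [label for value, label in pairs if value > t]
        score = len(left) / n * gini(left) + len(right) / n * gini(right)
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical sample: elevation (m) at six points, species presence (1) or absence (0).
elevation = [200, 350, 400, 800, 950, 1100]
presence = [0, 0, 0, 1, 1, 1]

threshold, impurity = best_threshold(elevation, presence)
print(f"rule: predict presence if elevation > {threshold} (weighted Gini = {impurity})")
```

A full tree repeats this search over every predictor and then recurses into each resulting group, which is how the decision rules accumulate.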

  10. CART • Useful when the response variable is made up of several distinct, unordered classes • Not significantly affected by outliers (Verbyla, 1987) • Regression trees • Used to predict numeric values • Classification trees • Used to predict classifications

  11. Classification trees vs Regression trees • Classification trees • Should be used to predict habitat types • “Categorical data” • Regression trees • Could be used to predict abundances • “Numeric data”
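
One way to see the difference is in what a terminal node (leaf) returns. In the sketch below, a classification leaf predicts the most common class among its training observations and a regression leaf predicts their mean. The habitat labels and abundance counts are hypothetical, and real packages may break ties or summarize leaves differently.

```python
from collections import Counter
from statistics import mean


def classification_leaf(labels):
    """Predict the most frequent class among the observations in the leaf."""
    return Counter(labels).most_common(1)[0][0]


def regression_leaf(values):
    """Predict the mean response of the observations in the leaf."""
    return mean(values)


# Hypothetical observations that ended up in the same leaf.
habitat = ["sagebrush", "sagebrush", "grassland"]  # categorical response
abundance = [4, 7, 10]                             # numeric response

print(classification_leaf(habitat))
print(regression_leaf(abundance))
```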

  12. CART (some general notes) • Trees are first grown large (overfitted), then pruned back • Various methods to reduce trees • 10-fold cross-validation • Gini method
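
The 10-fold idea is usually implemented as cross-validation: the observations are split into ten groups, and each group in turn is held out while a tree is fit to the other nine. The sketch below only builds the folds; the sample size of 53 and the function name are arbitrary choices for illustration, and the tree fitting itself is left out.

```python
import random


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and deal them into k folds of near-equal size."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


folds = k_fold_indices(53, k=10)

for fold_number, held_out in enumerate(folds):
    # Everything not in the held-out fold is used to grow the tree.
    training = [i for fold in folds if fold is not held_out for i in fold]
    print(fold_number, len(training), len(held_out))
```

The held-out error averaged over the ten folds is then compared across candidate tree sizes to choose how far to prune.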

  13. Methods for determining tree size • One-standard-error rule (with or without cross-validation). • Akaike's Information Criterion (classification trees only). • Mallows' Cp (regression trees only). • The Gini index • is a measure of node impurity: 0 represents total purity (all observations in the node are in one class), and the maximum, reached when all K classes are equally represented in the node, is 1 − 1/K (0.5 for two classes, approaching 1 as the number of classes grows). • The reduction in the Gini index is measured as the impurity of a group before the split, minus the weighted sum of the impurities of the two groups resulting from the split, with each group weighted by its share of the observations.
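
A short sketch of the Gini calculation may help. Two points are worth noting: with K equally represented classes the index equals 1 − 1/K, so a two-class node tops out at 0.5, and the usual CART convention weights each child group by its share of the parent's observations. The labels "a" and "b" are placeholders.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_reduction(parent, left, right):
    """Parent impurity minus the size-weighted impurity of the two child groups."""
    n = len(parent)
    weighted = len(left) / n * gini(left) + len(right) / n * gini(right)
    return gini(parent) - weighted


parent = ["a"] * 4 + ["b"] * 4          # two classes, equally represented
left = ["a", "a", "a", "b"]             # mostly "a"
right = ["a", "b", "b", "b"]            # mostly "b"

print(gini(["a", "a", "a"]))            # pure node
print(gini(parent))                     # maximum for two classes
print(gini_reduction(parent, left, right))
```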

  14. Evaluation in S-PLUS • The tree may be pruned using deviance or misclassification rate to evaluate the subtrees, or the tree may be shrunken.
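
The two pruning criteria can be compared on a single node. The sketch assumes the common multinomial definition of deviance for a classification node, −2 Σ n_k log(p_k), where n_k is the count of class k and p_k its proportion in the node; whether a given S-PLUS routine uses exactly this form should be checked against its documentation. The presence and absence counts are hypothetical.

```python
import math
from collections import Counter


def node_deviance(labels):
    """Multinomial deviance of a node: -2 * sum over classes of n_k * log(p_k)."""
    n = len(labels)
    return -2.0 * sum(count * math.log(count / n) for count in Counter(labels).values())


def node_misclassification(labels):
    """Share of observations that do not belong to the node's majority class."""
    n = len(labels)
    return 1.0 - max(Counter(labels).values()) / n


# Hypothetical node: 8 presences and 2 absences.
node = ["present"] * 8 + ["absent"] * 2

print(node_deviance(node))
print(node_misclassification(node))
```

Summing either quantity over the terminal nodes of a subtree gives a single number for comparing candidate subtrees. Deviance responds to changes in class proportions even when the majority class stays the same, whereas the misclassification rate does not.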

  15. Evaluation

  16. Result of probability

  17. Model Assessment – Kappa Analysis • Kappa can be used as a measure of agreement between model predictions and reality (Congalton 1991) or • to determine if the values contained in an error matrix represent a result significantly better than random (Jensen 1996). • Kappa is considered to be a reliable measure of accuracy if the z-score is significant (Fleiss 1973).
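
As a rough illustration, Cohen's kappa can be computed from an error matrix as (observed agreement − chance agreement) / (1 − chance agreement). The 2×2 matrix below is hypothetical, and the z-score significance test mentioned on the slide is not included in this sketch.

```python
def cohen_kappa(matrix):
    """Cohen's kappa for a square error matrix (rows: predicted, columns: observed)."""
    total = sum(sum(row) for row in matrix)
    # Observed agreement: share of cases on the diagonal.
    observed = sum(matrix[i][i] for i in range(len(matrix))) / total
    # Chance agreement: sum over classes of (row total * column total) / total^2.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / total ** 2
    return (observed - expected) / (1.0 - expected)


# Hypothetical error matrix for presence/absence predictions at 100 sample points.
error_matrix = [
    [40, 10],  # predicted present: 40 truly present, 10 truly absent
    [5, 45],   # predicted absent:   5 truly present, 45 truly absent
]

kappa = cohen_kappa(error_matrix)
print(round(kappa, 3))
```

Here the observed agreement is 0.85 and the chance agreement is 0.50, so kappa is 0.70: the model closes 70% of the gap between chance and perfect agreement.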

  18. Sampling in Hawth’s tools

  19. Computing Kappa

  20. CART Advantages • Easy to interpret • Useful for non-linear, non-additive and hierarchical relationships • Possible to investigate variable importance • Pruning measures available for variable reduction • Accepts categorical data

  21. CART Disadvantages • Somewhat computer intensive • Less powerful than parametric methods when response functions are simple • May require multiple software packages

  22. CART Software • CART by Salford Systems • Random forests • Strengths include spotting outliers and anomalies in data, displaying proximity clusters, predicting future outcomes, identifying important predictors, discovering data patterns, replacing missing values with imputations, and providing insightful graphics. It can also provide clustering and density estimations. • S-PLUS, R, JMP, SAS, etc.

  23. StatMod http://www.gis.usu.edu/~chrisg/avext/ • Goal: developed to help ecologists with statistical modeling • Helps perform logistic regression with SAS and GIS, or • Performs CART analysis with GIS • Provides a graphical user interface and more…

  24. Basics of StatMod • Spatial data in GIS → Sample GIS themes → Set up analysis options → Run statistical analysis → Helps read output back into GIS

  25. Additional functions of StatMod • Helps user convert and resample data • Select random points • Perform Kappa analysis

  26. StatMod installation • Simply copy statmod.avx into the c:/esri/av_gis30/arcview/ext32 directory

  27. ArcRstat

  28. References
      • Andersen, M. C., J. M. Watts, J. E. Freilich, S. R. Yool, G. I. Wakefield, J. F. McCauley, and P. B. Fahnestock. 2000. Regression-tree modeling of desert tortoise habitat in the central Mojave Desert. Ecological Applications 10: 890-900.
      • Breiman, L., J. H. Friedman, R. A. Olshen, and C. J. Stone. 1984. Classification and Regression Trees. Chapman & Hall, New York, New York, USA.
      • Carroll, C., W. J. Zielinski, and R. F. Noss. 1999. Using presence-absence data to build and test spatial habitat models for the fisher in the Klamath region, U.S.A. Conservation Biology 13: 1344-1359.
      • Congalton, R. G. 1991. A review of assessing the accuracy of classifications of remotely sensed data. Remote Sensing of Environment 37: 35-46.
      • Fleiss, J. L. 1973. Statistical Methods for Rates and Proportions. John Wiley & Sons, Inc., New York, New York, USA.
      • Hosmer, D. W., and S. Lemeshow. 1989. Applied Logistic Regression. John Wiley & Sons, Inc., New York, New York, USA.
      • Jensen, J. R. 1996. Introductory Digital Image Processing: A Remote Sensing Perspective (Second edition). Prentice Hall, Inc., Upper Saddle River, New Jersey, USA.
      • Miller, J. M. 2005. Incorporating Spatial Dependence in Predictive Vegetation Models: Residual Interpolation Methods. The Professional Geographer 57(2): 169-184.
      • Rejwan, C., N. C. Collins, L. J. Brunner, B. J. Shuter, and M. S. Ridgway. 1999. Tree regression analysis on the nesting habitat of smallmouth bass. Ecology 80: 341-348.
      • RS/GIS Laboratories, Utah State University. 2003. Sagebrush Ecosystem Mapping using Landsat ETM, Final Report. Unpublished manuscript.
      • Schadt, S., E. Revilla, T. Wiegand, F. Knauer, P. Kaczensky, U. Breitenmoser, L. Bufka, J. Červený, P. Koubek, T. Huber, C. Staniša, and L. Trepl. 2002. Assessing the suitability of central European landscapes for the reintroduction of Eurasian lynx. Journal of Applied Ecology 39: 189-203.
      • Verbyla, D. L. 1987. Classification trees: a new discrimination tool. Canadian Journal of Forest Research 17: 1150-1152.

  29. (Shumar, 2009)
