Who will you trust? • Field technicians? • Software programmers? • Statisticians? • Instructors? • GIS technicians? • Other researchers? • Yourself?
Regression (Correlation) Modeling • Creates a model in N-Dimensional “Hyper-Space” • Defined by: • Covariates • Response variables • Mathematics used to create the model • Statistics used to optimize parameters • Options for model evaluation • Predictor variables
Linear Regression: 2 Predictors [Figure; source: Mathworks.com]
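To make the two-predictor case concrete, here is a minimal sketch of an ordinary least squares fit of y = b0 + b1*x1 + b2*x2, solved through the normal equations. The function name and the sample values are hypothetical and chosen only for illustration; the slide itself does not specify a fitting method.

```python
def fit_two_predictor_regression(x1, x2, y):
    """Ordinary least squares for y = b0 + b1*x1 + b2*x2 via the normal equations."""
    rows = [(1.0, a, b) for a, b in zip(x1, x2)]
    # Augmented matrix [X'X | X'y]: three rows, four columns.
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(3)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(3)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        if abs(aug[col][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique fit")
        for r in range(3):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][3] / aug[i][i] for i in range(3)]


# Hypothetical data generated from a known plane, so the fit can be checked.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [2.0 + 0.5 * a - 1.5 * b for a, b in zip(x1, x2)]

coefficients = fit_two_predictor_regression(x1, x2, y)
print(coefficients)  # intercept, slope for x1, slope for x2
```

With two predictors the fitted model is a plane in three-dimensional space; each added predictor adds one dimension to the "hyper-space".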
Regression Methods • Continuous Regression: • Linear Regression • Generalized Linear Models (GLM) • Generalized Additive Models (GAMs) • Categorical Regression (trees): • Regression Trees • Classification and regression trees (CART) • Machine Learning: • Maximum Entropy (Maxent) • NPMR, HEMI, BRTs, etc.
Brown Shrimp Size • Add graph from work
Terminology • Plant uses: • Measured value and response variable • Explanatory variable • I prefer: • Response variable • I’ll use “measured value” to identify measured values in field data • Covariate: Explanatory variable used to build the model • Predictor: Explanatory variable used to predict
Douglas Fir Habitat Model [Figure: habitat quality (0 to 1) plotted against precipitation (0 to 1000 mm)]
[Diagram: Predictor, Model, Prediction]
Model Selection and Parameter Estimation [Diagram: Field Data, Covariate, Predictor, Model, Prediction]
Model Selection and Parameter Estimation [Diagram: Field or Sample Data, Covariate, Predictor, Model, Model Validation, Prediction]
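The workflow in these diagrams can be sketched in a few lines: estimate parameters from covariates sampled with the field data, validate against held-out samples, then predict from new predictor values. This is only an illustration under stated assumptions: the data are invented, the model is a single-covariate straight line, and the alternating train/validation split is one arbitrary choice among many.

```python
import math


def fit_line(x, y):
    """Least squares intercept and slope for a single covariate."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    return math.sqrt(
        sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)
    )


# Hypothetical field samples: (precipitation covariate in mm, measured habitat quality).
samples = [
    (200, 0.18), (350, 0.36), (400, 0.42), (550, 0.55),
    (600, 0.63), (750, 0.80), (800, 0.83), (950, 0.97),
]
training = samples[::2]      # used for parameter estimation
validation = samples[1::2]   # held out for model validation

# Parameter estimation from the training covariates and measured values.
intercept, slope = fit_line([c for c, _ in training], [m for _, m in training])

# Model validation against samples the model has not seen.
validation_predictions = [intercept + slope * c for c, _ in validation]
validation_rmse = rmse([m for _, m in validation], validation_predictions)

# Prediction from predictor values (for example, cells of a precipitation raster).
new_predictors = [300, 700]
predictions = [intercept + slope * p for p in new_predictors]
print(validation_rmse, predictions)
```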
[Workflow diagram labels: Douglas-Fir sample data; Create the Model; Model “Parameters”; Precip; Extract; Prediction; To Points; Text File; Attributes; To Raster]
Data • Response Variable • From the field data (sample data) • Covariates • From the field or remotely sensed • Predictors • Typically remotely sensed • Sample as covariates for training • Can be different for predicting to new scenarios
Response Variable • What is the: • Spatial uncertainty? • Temporal uncertainty? • Measurement uncertainty? • Will it answer your question?
Covariate Variables • What is the: • Spatial uncertainty? • Temporal uncertainty? • Measurement uncertainty? • How well does the collection time of the covariates match the field data? • Do they co-vary with the phenomena? • Do the covariates “correlate”?
Types of uncertainty • Accuracy (bias) • Precision (repeatability) • Reliability (consistency of a set of measurements) • Resolution (fineness of detail) • Logical consistency • Adherence to structural rules, attributes, and relationships • Completeness
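Accuracy and precision can be estimated separately when repeated measurements of a known reference are available. The sketch below assumes such a reference exists; the benchmark elevation and the readings are hypothetical.

```python
import statistics


def accuracy_and_precision(measurements, reference_value):
    """Return (bias, precision) for repeated measurements of a known value.

    bias: mean measurement minus the reference (accuracy).
    precision: sample standard deviation of the measurements (repeatability).
    """
    bias = statistics.mean(measurements) - reference_value
    precision = statistics.stdev(measurements)
    return bias, precision


# Hypothetical: five GPS elevation readings (m) at a benchmark surveyed at 1500.0 m.
readings = [1502.1, 1501.8, 1502.4, 1501.9, 1502.3]
bias, precision = accuracy_and_precision(readings, 1500.0)
print(bias, precision)
```

A set of readings can be precise (small spread) and still inaccurate (large bias), which is why the two are listed separately.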
Types of Errors • Gross errors • Transcription • Sinks in DEMs • Random errors • Estimated using probability theory • Systematic errors • “Drift” in instruments • Dropped lines in Landsat
Gross Errors • Lat/Lon: • Reversed • 0, names, dates, etc. • Dates: • Extended in databases • Measurements: • Inconsistent units • Inconsistent protocols • What can you expect from a field team?
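Several of the coordinate problems listed above can be screened automatically before modeling. The following is a minimal sketch; the function name, the flag wording, and the sample records are hypothetical.

```python
def flag_coordinate_errors(lat, lon):
    """Return a list of gross-error flags for one latitude/longitude pair."""
    if not isinstance(lat, (int, float)) or not isinstance(lon, (int, float)):
        return ["non-numeric value"]
    flags = []
    if lat == 0 and lon == 0:
        flags.append("zero coordinates")
    if abs(lat) > 90:
        if abs(lon) <= 90:
            # The values fit valid ranges only when swapped.
            flags.append("latitude out of range; lat/lon may be reversed")
        else:
            flags.append("latitude out of range")
    if abs(lon) > 180:
        flags.append("longitude out of range")
    return flags


# Hypothetical records as they might arrive from a field team.
records = [
    (40.57, -105.08),           # plausible
    (-105.08, 40.57),           # reversed
    (0, 0),                     # placeholder zeros
    ("Fort Collins", -105.08),  # a name in a coordinate column
]
results = [flag_coordinate_errors(lat, lon) for lat, lon in records]
for record, flags in zip(records, results):
    print(record, flags)
```

A range check cannot detect a reversal when both values fall within ±90 degrees, so mapping the points remains necessary.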
Occurrences of Polar Bears From The Global Biodiversity Information Facility (www.gbif.org, 2011)
Systematic Errors • Landsat scan line error
Response Variable Qualification Tools • Maps (various resolutions) • Examine the data values: • How many digits? • Repeating patterns, gross errors? • “Documentation” • Measurements: • Occurrences? • Binary: Histogram • Categorical: Histogram • Continuous: Histogram
Significant Digits • How many digits to represent 1 meter? • Geographic: Lat/Lon? • UTM: Eastings/Northings?
Significant Digits • Geographic: • 1 digit = 1 degree • 1 degree ~ 110 km • 0.00001 degree ~ 1.1 meters • UTM: • 1 digit = 1 meter
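The arithmetic behind these figures can be sketched as follows. The constant is an assumed round value of about 111 km per degree of latitude (the slide rounds it to 110 km), and the function name is hypothetical. East-west distance per degree of longitude shrinks with the cosine of the latitude.

```python
import math

# Assumption: approximate length of one degree of latitude, in meters.
METERS_PER_DEGREE_LAT = 111_320.0


def ground_resolution_m(decimal_places, latitude_deg=0.0):
    """Approximate ground distance represented by the last decimal place.

    Returns (north_south_m, east_west_m) for coordinates in decimal degrees.
    """
    step_deg = 10.0 ** -decimal_places
    north_south = step_deg * METERS_PER_DEGREE_LAT
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


for places in range(0, 7):
    ns, ew = ground_resolution_m(places, latitude_deg=40.0)
    print(places, round(ns, 3), round(ew, 3))

five_places_ns, five_places_ew_60 = ground_resolution_m(5, latitude_deg=60.0)
```

Five decimal places are therefore enough to represent about 1 meter in latitude; more digits than the instrument can justify only suggest false precision.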
Covariate Qualification • Maps • Documentation • Examine the data: • How many digits? • Integer or floating point? • Repeating patterns? • Histograms
Histograms hist(Temp,breaks=400)
Covariate Correlation • Correlation plots • Pearson product-moment correlation coefficient • Spearman’s rho: nonparametric (rank-based) correlation coefficient
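The difference between the two coefficients shows up when covariates are related monotonically but not linearly. Below is a minimal pure-Python sketch; the helper names and sample values are hypothetical, and Spearman's rho is computed as the Pearson coefficient of average ranks.

```python
import math


def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical covariates with a monotonic but nonlinear relationship.
covariate_a = [1.0, 2.0, 3.0, 4.0, 5.0]
covariate_b = [a ** 3 for a in covariate_a]

pearson_r = pearson(covariate_a, covariate_b)
spearman_rho = spearman(covariate_a, covariate_b)
print(pearson_r, spearman_rho)
```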
Response vs. Covariates • For Occurrences: • Histogram covariates at occurrences vs. overall covariates • For Binary Data: • Histogram covariates for each value • For Categorical Data: • Histogram covariates for each value • Or scatter plots • For Continuous Data: • Scatter plots
Covariate Occurrence Histograms Precipitation with Douglas-Fir Occurrences
Douglas Fir Model In HEMI 2 Green: Histogram of all of California Red: Histogram of Douglas-Fir Occurrences
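The comparison of an occurrence histogram with the histogram of the whole study area can be sketched with shared bins and fractions in place of raw counts. The precipitation values, the bin edges, and the function name below are all hypothetical.

```python
def binned_fractions(values, bin_edges):
    """Fraction of values in each bin; the last bin includes its upper edge."""
    counts = [0] * (len(bin_edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            if bin_edges[i] <= v < bin_edges[i + 1] or (is_last and v == bin_edges[-1]):
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0 for _ in counts]


# Hypothetical precipitation (mm): the whole study area vs. occurrence locations.
bin_edges = [0, 500, 1000, 1500, 2000]
background = [100, 200, 300, 400, 600, 700, 900, 1100, 1300, 1800]
occurrences = [900, 1100, 1200, 1400, 1700]

background_fractions = binned_fractions(background, bin_edges)
occurrence_fractions = binned_fractions(occurrences, bin_edges)

for low, high, bg, occ in zip(bin_edges, bin_edges[1:], background_fractions, occurrence_fractions):
    print(f"{low}-{high} mm: background {bg:.2f}, occurrences {occ:.2f}")
```

Bins where the occurrence fraction exceeds the background fraction suggest conditions the species favors relative to what is available.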
Terrestrial Predictors • Elevation: • Slope • Aspect • Absolute Aspect • Distance to: • Roads • Streams (streamline) • Climate • Precip • Temp • Soil Type • RS: • Landsat • MODIS • NDVI, etc.
Marine Predictors • Temp • DO2 • Salinity • Depth • Rugosity (roughness) • Current (at depths) • Wind
More Complicated • Associated species • Trophic levels • Temporal • Cyclical
Predictor Layers • Means, mins, maxes • Range of values • Heterogeneity • Spatial layers: • Distance to… • Topography: elevation, slope, aspect
Field Data and Predictors • As close to field measurements as possible • Clean and aggregate data as needed • Document as you go • Estimate overall uncertainty • Answer the question: • What spatial, temporal, and measurement scales are appropriate for modeling, given the data?
Temporal Issues • Divide data into months, seasons, years, decades. • Consistent between predictors and response • Extract predictors as close to sample location and dates as possible • Use the “best” predictor layers
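Dividing samples into seasons might look like the sketch below. The season boundaries are an assumption (meteorological seasons for the Northern Hemisphere, with December assigned to the following year's winter), and the records are hypothetical. The same grouping key would then be applied to predictor layers so that predictors and response stay consistent.

```python
from collections import defaultdict
from datetime import date

# Assumption: Northern Hemisphere meteorological seasons.
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "fall", 10: "fall", 11: "fall",
}


def season_key(sample_date):
    """Return (year, season); December counts toward the next year's winter."""
    season = SEASON_BY_MONTH[sample_date.month]
    year = sample_date.year + 1 if sample_date.month == 12 else sample_date.year
    return year, season


def group_by_season(records):
    """Group (date, value) records by their (year, season) key."""
    groups = defaultdict(list)
    for sample_date, value in records:
        groups[season_key(sample_date)].append(value)
    return dict(groups)


# Hypothetical field measurements: (sample date, measured value).
records = [
    (date(2010, 1, 15), 12.0),
    (date(2010, 7, 4), 31.5),
    (date(2010, 12, 20), 9.0),
    (date(2011, 1, 5), 8.0),
]
seasonal_groups = group_by_season(records)
print(seasonal_groups)
```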
Dimensions of uncertainty • Space • Time • Attribute • Scale • Relationships
Basic Tools • Histograms: What is the distribution of values (range and shape)? • Scattergrams: What is the relationship between response and predictor variables, and between predictor variables? • Q-Q plots: Are the residuals normally distributed?
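The points of a normal Q-Q plot can be computed without a plotting library: sort the residuals and pair each with the quantile a normal distribution would give at the same position. This sketch assumes plotting positions of (i + 0.5) / n, which is one common convention among several; the residuals are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def normal_qq_points(residuals):
    """Return (theoretical quantile, ordered residual) pairs for a normal Q-Q plot."""
    n = len(residuals)
    reference = NormalDist(mean(residuals), stdev(residuals))
    ordered = sorted(residuals)
    # Plotting positions (i + 0.5) / n stay strictly between 0 and 1.
    theoretical = [reference.inv_cdf((i + 0.5) / n) for i in range(n)]
    return list(zip(theoretical, ordered))


# Hypothetical model residuals.
residuals = [-1.2, 0.4, 0.1, -0.3, 0.9, -0.5, 0.6]
qq_points = normal_qq_points(residuals)
for theoretical, observed in qq_points:
    print(round(theoretical, 3), observed)
```

If the residuals are close to normally distributed, the pairs fall near the line where the observed value equals the theoretical quantile.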
Types of Data • “God does not play dice” • Einstein • “the end of certainty” • Prigogine, 1977 Nobel Prize • What remains is: • Quantifiable probability with uncertainty