40 likes | 157 Vues
This exercise involves creating a presence map of lodgepole pine within the Uinta Wilderness area using a Random Forest model. The process includes loading necessary datasets, generating a binary response variable for lodgepole presence, and running the model with specified predictors. Key steps include creating tables for species counts, merging data, and checking results. The model's importance is evaluated using the Gini index, along with calculating the percentage area occupied by lodgepole pine compared to aspen. Visualizations like scatterplots are also generated for data analysis.
E N D
Exercise Answers • ## 1. Create a map of presence of lodgepole within the Uinta Wilderness area, using a Random Forests model and the predictors from modeldat. Hint: Load the plt and tree table again and use code from slide #20 to create a binary response variable for lodgepole presence. • Also, set a seed to 66 so you can recreate the model. • plt<- read.csv("PlotData/plt.csv", stringsAsFactors=FALSE) • tree <- read.csv("PlotData/tree.csv", stringsAsFactors=FALSE) • ref <- read.csv("PlotData/ref_SPCD.csv", stringsAsFactors=FALSE) • tree <- merge(tree, ref, by="SPCD") • head(tree) • # First, create a table of counts by species and plot & change to binary 1/0. • spcnt <- table(tree[,c("PLT_CN", "SPNM")]) • spcnt[spcnt> 1] <- 1 • head(spcnt) • # Join the lodgepolecolumn to the modeldat table. • load("Outfolder/modeldat.Rda") • modeldat2 <- merge(modeldat, spcnt[,"lodgepole pine"], by.x="CN", by.y="row.names", all.x=TRUE) • head(modeldat2) • # Change name from y to LODGEPOLE and change NA values to 0 values (Absence). • names(modeldat2)[names(modeldat2) == "y"] <- "LODGEPOLE" • modeldat2[is.na(modeldat2[,"LODGEPOLE"]), "LODGEPOLE"] <- 0 • modeldat2$LODGEPOLE <- factor(modeldat2$LODGEPOLE, levels=c(0,1)) • str(modeldat2)
Exercise Answers cont. • # Check data • modeldat2[modeldat2$CN == 2377876010690,] • tree[tree$PLT_CN == 2377876010690,] • # Generate Random Forest model • set.seed(66) • lp.mod <- randomForest(LODGEPOLE ~ TMB5 + TMndvi + fnfrcl + elevm + slp + aspval, data=modeldat2, importance = TRUE) • lp.predict<- predict(clipstack, lp.mod) • lp.predict • plot(lp.predict) • # Plot with color breaks • cols <- c("dark grey", "green") • par(omi=c(0,0,0,0.5)) • colclasses(lp.predict, cols, labs=c("Absence", "Presence"))
Exercise Answers cont. • ## 2. How many plots in model data set have presence of lodgepole pine? • table(modeldat2$LODGEPOLE) • ## 3. Display 1 scatterplot of the relationship of elevation and NDVI values with lodgepole pine presence and absence as different colors. Hint: see slide #49. Make sure to label plots and add a legend. • attach(modeldat2) • par(mfrow=c(1,1)) • plot(elevm, TMndvi, col=c("black", "red"), pch=20, xlab="Elevation", ylab="TM NDVI", main="Lodgepole pine presence/absence") • legend(x=2000, y=120, legend=levels(ASPEN), col=c("black", "red"), pch=20) • ## 4. What is the variable with the highest importance value based on the Giniindex? • imp <- lp.mod$importance • imp[order(imp[,"MeanDecreaseGini"], decreasing=TRUE),] • elevm • ## 5. What percentage of the total area is lodgepole pine? How does this compare with the percentage of aspen? • lp.freq <- freq(lp.predict) • lp.freq <- na.omit(lp.freq) • round((lp.freq[lp.freq[,"value"] == 1, "count"] / sum(lp.freq[,"count"])) * 100, 2) • asp.freq<- freq(asp.predict) • asp.freq<- na.omit(asp.freq) • round((asp.freq[asp.freq[,"value"] == 1, "count"] / sum(asp.freq[,"count"])) * 100, 2)