Delve into the complexities of statistical analysis with concepts such as linear regression, ANOVA, and generalized linear models. Understand the assumptions, coding in R, and likelihood functions for various models like Poisson and logistic regression. Explore nonlinear least squares methods, variance structures, and mixed models for comprehensive data analysis.
General linear models
• Predictions are a linear function of a set of parameters.
• Includes:
  • Linear models
  • ANOVA
  • ANCOVA
• Assumptions:
  • Normally distributed, independent errors
  • Constant variance
• Not to be confused with generalized linear models!
• Distinction between factors (categorical predictors) and covariates (continuous predictors); see the sketch below.
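A minimal sketch of how these cases map onto R's lm(), assuming a continuous response Y, a continuous covariate X, and a factor f1 (all variable names hypothetical):
>lm(Y ~ X)        # linear regression: covariate only
>lm(Y ~ f1)       # one-way ANOVA: factor only
>lm(Y ~ f1 + X)   # ANCOVA: factor plus covariate, common slope
>lm(Y ~ f1 * X)   # ANCOVA with a separate slope for each factor level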
Linear regression
Standard R code:
>lm.reg<-lm(Y~X)
>summary(lm.reg)
>anova(lm.reg)
Likelihood R code:
>lmfun<-function(a, b, sigma) {
  Y.pred<-a+b*X
  -sum(dnorm(Y, mean=Y.pred, sd=sigma, log=TRUE))
}
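A minimal sketch of fitting this negative log-likelihood, assuming mle() from the stats4 package (the same function used in the spatial example later) and hypothetical starting values:
>library(stats4)
>fit<-mle(lmfun, start=list(a=0, b=1, sigma=1),
          method="L-BFGS-B", lower=c(-Inf, -Inf, 0.001))  # keep sigma positive
>summary(fit)   # estimates of a and b should match coef(lm.reg)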
Analysis of variance (ANOVA)
Standard R code:
>lm.onewayaov<-lm(Y~f1)
>summary(lm.onewayaov)
>anova(lm.onewayaov) # will give you an ANOVA table
Likelihood R code:
>aovfun<-function(a11, a12, sigma) {
  Y.pred<-c(a11,a12)[f1]
  -sum(dnorm(Y, mean=Y.pred, sd=sigma, log=TRUE))
}
Analysis of covariance (ANCOVA)
Standard R code:
>lm.anc<-lm(Y~f*X)
>summary(lm.anc)
>str(summary(lm.anc))
Likelihood R code:
>ancfun<-function(a11, a12, slope1, slope2, sigma) {
  Y.pred<-c(a11,a12)[f] + c(slope1,slope2)[f]*X
  -sum(dnorm(Y, mean=Y.pred, sd=sigma, log=TRUE))
}
Nonlinearity: Non-linear least squares
Uses numerical methods similar to those used in likelihood
Standard R code:
>nls1<-nls(y~a*x^b, start=list(a=1, b=1))
>summary(nls1)
>str(summary(nls1))
Likelihood R code:
>nlsfun<-function(a, b, sigma) {
  y.pred<-a*x^b
  -sum(dnorm(y, mean=y.pred, sd=sigma, log=TRUE))
}
Generalized linear models
• Assumptions:
  • Non-normally distributed errors (but still independent, and only certain kinds of non-normality: the exponential family).
  • Non-linear relationships are allowed, but only if they have a linearizing transformation (the link function).
• Linearizing transformations (link functions): each exponential-family distribution is typically paired with a specific link (see the sketch below).
  • Poisson: log link
  • Binomial: logit link
  • Gamma: inverse link
• Fit by iteratively reweighted least squares: estimate the variance associated with each point for each estimate of the parameter(s).
• Not to be confused with general linear models!
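A minimal sketch of what the link functions do, using R's built-in family objects (nothing here is specific to this course's data):
>poisson()$linkfun(10)      # log link: log(10)
>binomial()$linkfun(0.8)    # logit link: log(0.8/0.2)
>Gamma()$linkfun(2)         # inverse link: 1/2
# On the link scale the model is linear, e.g. for Poisson regression:
# log(E[Y]) = a + b*X, so E[Y] = exp(a + b*X)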
GLM: Poisson regression
Standard R code:
>glm.pois<-glm(Y~X, family=poisson)
>summary(glm.pois)
Likelihood R code:
>poisregfun<-function(a, b) {
  Y.pred<-exp(a+b*X)
  -sum(dpois(Y, lambda=Y.pred, log=TRUE))
}
GLM: Logistic regression
Standard R code:
>glm2<-glm(y~x, family=binomial)
>summary(glm2)
Likelihood R code:
>logregfun<-function(a, b, N) {
  p.pred<-exp(a+b*x)/(1+exp(a+b*x))
  -sum(dbinom(y, size=N, prob=p.pred, log=TRUE))
}
Generalized (non)linear least-squares models: Variance changes with a covariate or among groups
Standard R code:
>gls1<-gls(y~1, weights=varIdent(form=~1|f))   # gls() and varIdent() are in the nlme package
>summary(gls1)
Likelihood R code:
>vardifffun<-function(a, sd1, sd2) {
  sdval<-c(sd1,sd2)[f]
  -sum(dnorm(y, mean=a, sd=sdval, log=TRUE))
}
Complex error structures
• Error structures are not independent
• Complex likelihood functions
• Includes:
  • Time series analysis
  • Spatial correlation
  • Repeated measures analysis
[Diagram: the multivariate normal likelihood combines a vector of data, a vector of means (predictions), and a variance-covariance matrix, as sketched below.]
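A minimal sketch of that structure, using dmvnorm() from the mvtnorm package (the same function used in the spatial example below) with a tiny hypothetical data set:
>library(mvtnorm)
>y<-c(2.1, 3.8, 6.2)                                       # vector of data (hypothetical)
>pred<-c(2, 4, 6)                                          # vector of means (predictions)
>V<-matrix(c(1,0.5,0.25, 0.5,1,0.5, 0.25,0.5,1), nrow=3)   # variance-covariance matrix
>dmvnorm(y, mean=pred, sigma=V, log=TRUE)                  # one joint log-likelihood for all points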
Complex error structures
[Diagram: example variance-covariance matrices for the independent case (diagonal), increasing variance (unequal diagonal), and the general case (non-zero off-diagonal elements).]
Complex error structures
• The variance/covariance matrix is symmetric, so we need to specify at most n(n+1)/2 distinct parameters (n variances plus n(n-1)/2 covariances).
• The V/C matrix must also be positive definite; in practice this means all eigenvalues are positive (and, in particular, the diagonal values are positive).
• Select the elements of the matrix that define the error structure and ensure it remains positive definite.
• In the example that follows, correlation drops off with the number of steps between sites.
Complex error structures: An example (spatially-correlated errors)
R code:
>library(mvtnorm)   # for dmvnorm()
>library(stats4)    # for mle()
>rho<-0.5
>m<-matrix(nrow=5, ncol=5)
>m<-rho^(abs(row(m)-col(m)))           # correlation decays with distance between sites
#OR#
>m<-diag(5)
>m[abs(row(m)-col(m))==1]<-rho         # correlation only between adjacent sites
>mvlik<-function(a, b, rho) {
  pred.rad<-a+b*dbh
  n<-length(radius)
  m<-diag(n)                           # identity matrix with n rows, n columns
  m[abs(row(m)-col(m))==1]<-rho        # adjacent-site correlation
  -dmvnorm(radius, mean=pred.rad, sigma=m, log=TRUE)
}
>mle(mvlik, start=list(a=0.5, b=3, rho=0.5), method="L-BFGS-B", lower=0.001)
Mixed models & Generalized linear mixed models (GLMM)
• Samples within a group (block, site) are equally correlated with each other.
• Fixed effects: effects of covariates.
• Random effects: block, site, etc.
• GLMMs are generalized linear models with random effects (see the sketch below).
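One way to fit such a model in standard R is lme() from the nlme package (the same package that provides gls()); a minimal sketch with hypothetical variable names:
>library(nlme)
>lmm1<-lme(Y ~ X, random = ~1 | block, data=dat)   # random intercept for each block
>summary(lmm1)
# For GLMMs (non-normal errors plus random effects), glmer() in the lme4 package
# is one commonly used option, e.g. glmer(Y ~ X + (1|block), family=poisson, data=dat)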
Complex variance structures
• So how do you incorporate all potential sources of variance?
  • Block effects
  • Individual effects (repeated measures include both individual and temporal correlation)
  • Measurement vs. process error
  • ...