Non-Linear and Smooth Regression: An Introductory Example (V&R 8.1)
Rommel Vives, Lin Zhang
Statistics 6601 Project, Fall 2004
Objective
To present a primer on non-linear regression, starting with a comparison of non-linear and linear regression.
Similarities and Differences of Methodology
Similarities
• Each fits a mathematical model to data.
• Both are often called least squares methods, since each uses the sum of squared residuals as its measure of goodness of fit.
Differences
• In a non-linear regression model, the expected response depends on at least one parameter in a non-linear fashion.
• There is usually no closed-form expression for the least squares parameter estimates in non-linear regression.
• The least squares estimates are therefore found by an iterative numerical procedure.
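As a sketch of the last two differences (the notation below is assumed for illustration and is not taken from the slides): both methods minimise the same kind of criterion, but only the linear model has a closed-form minimiser. The Gauss-Newton step is shown as one common iterative scheme; the slides do not say which algorithm is used.

```latex
% Least squares criterion for a model with mean function f(x, \beta)
S(\beta) = \sum_{i=1}^{n} \bigl( y_i - f(x_i, \beta) \bigr)^2

% Linear model, f(x_i, \beta) = x_i^\top \beta: closed form (X of full column rank)
\hat{\beta} = (X^\top X)^{-1} X^\top y

% Non-linear model: iterate from a starting value \beta^{(0)} (Gauss-Newton step)
\beta^{(k+1)} = \beta^{(k)} + \bigl( J_k^\top J_k \bigr)^{-1} J_k^\top r_k,
\qquad r_k = y - f(x, \beta^{(k)}),
\quad (J_k)_{ij} = \left. \frac{\partial f(x_i, \beta)}{\partial \beta_j} \right|_{\beta = \beta^{(k)}}
```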
Advantages and Disadvantages
Linear Model
• The linear least squares method yields the best-fitting model in the least squares sense.
• The solution has a closed form and needs only moderate computing.
• The estimated parameters generally have a clear interpretation.
• May be too idealized to be realistic.
Non-Linear Model
• More flexible, and possibly more realistic.
• No closed-form solution; requires a better understanding of what the parameters mean.
• Computationally demanding; not feasible in the era of the “Classical Model”.
Fitting Data with Non-Linear Regression
• Choose a model
• Estimate initial values
• Constrain a parameter
• Decide on a weighting scheme
• Handle replicate values, if any
Choose a Model
Functions that fit the data well by the least squares criterion may not be meaningful in the context of the experiment; that is, their parameters may not be interpretable. Choose a model whose parameters and estimates are scientifically meaningful, so that the results are of use in the data analysis.
Estimate Initial Values
Initial values are ballpark estimates of the least squares values of the parameters; they are needed to start the iterative process that derives those values. To choose them, understand the model and the meaning of each parameter, and look at a graph of the data.
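One way to obtain ballpark values, sketched here on simulated data (the exponential-decay model, the variable names and the true values 10 and 0.3 are all assumptions made for illustration), is to linearise the model and read starting values off a straight-line fit.

```r
set.seed(1)                                   # simulated data, for illustration only
x <- seq(0, 8, by = 0.5)
y <- 10 * exp(-0.3 * x) + rnorm(length(x), sd = 0.1)

# Linearise: log(y) is roughly log(a) - k * x, so a straight-line fit
# gives ballpark values for a and k.
lin <- lm(log(y) ~ x)
start <- list(a = exp(coef(lin)[[1]]), k = -coef(lin)[[2]])

# Use the ballpark values to start the iterative least squares fit.
fit <- nls(y ~ a * exp(-k * x), start = start)
coef(fit)
```

Linearisation only works when the model can be transformed to a straight line; otherwise the starting values have to be read off a plot of the data.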
[Figure: A Least Squares Surface (Two Parameters) with a Local Minimum]
Constrain a Parameter
In some cases it is not necessary to fit every parameter in the model. A parameter that is not fitted is held at an assigned constant value.
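A minimal sketch of holding one parameter constant, again on simulated data (the model, the name c_fixed and the value 2 are assumptions for illustration): the constrained parameter simply appears in the formula as a known constant and is left out of the starting values, so nls() estimates only the remaining parameters.

```r
set.seed(2)                                   # simulated data, for illustration only
x <- seq(0, 8, by = 0.5)
y <- 2 + 10 * exp(-0.3 * x) + rnorm(length(x), sd = 0.1)

c_fixed <- 2                                  # baseline assumed known, so it is not fitted

# Only a and k are listed in start, so only they are estimated;
# c_fixed is treated as a constant.
fit_fixed <- nls(y ~ c_fixed + a * exp(-k * x), start = list(a = 9, k = 0.25))
coef(fit_fixed)
```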
Decide on a Weighting Scheme
• Weight all data points equally when the scatter about the curve is roughly constant.
• Apply unequal weights if, for example, the average distance of the points from the curve varies with the response variable but the relative distance remains constant.
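The two schemes can be compared with the weights argument of nls(). The sketch below uses simulated data whose scatter is proportional to the response; the model, the 5% relative noise and the choice of weights 1/y^2 are assumptions made for illustration.

```r
set.seed(3)                                   # simulated data, for illustration only
x <- seq(0, 8, by = 0.5)
mu <- 10 * exp(-0.3 * x)
y <- mu * (1 + rnorm(length(x), sd = 0.05))   # scatter proportional to the response
d <- data.frame(x = x, y = y, w = 1 / y^2)    # relative weighting: weight = 1 / y^2

start <- list(a = 9, k = 0.25)
fit_equal <- nls(y ~ a * exp(-k * x), data = d, start = start)               # equal weights
fit_rel   <- nls(y ~ a * exp(-k * x), data = d, start = start, weights = w)  # relative weights
rbind(equal = coef(fit_equal), relative = coef(fit_rel))
```

With equal weights the largest responses dominate the sum of squares; the relative weights let small and large responses contribute on the same proportional scale.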
Handle Replicate Values
If replicate values of the response variable are collected for one set of values of the independent variables, treat the replicates as separate points if they are independent. Otherwise, use the average of the replicates.
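Both options can be sketched as follows, using simulated triplicates (the variables dose and resp, the model and all numbers are assumptions for illustration).

```r
set.seed(4)                                   # simulated data, for illustration only
dose <- rep(c(1, 2, 4, 8), each = 3)          # three replicates per dose
resp <- 10 * exp(-0.3 * dose) + rnorm(length(dose), sd = 0.2)
d <- data.frame(dose = dose, resp = resp)
start <- list(a = 9, k = 0.25)

# Independent replicates: fit all 12 points separately.
fit_all <- nls(resp ~ a * exp(-k * dose), data = d, start = start)

# Replicates that are not independent: average within each dose, fit the 4 means.
d_mean <- aggregate(resp ~ dose, data = d, FUN = mean)
fit_mean <- nls(resp ~ a * exp(-k * dose), data = d_mean, start = start)

rbind(all_points = coef(fit_all), means = coef(fit_mean))
```

With the same number of replicates at every dose, the two fits give essentially the same point estimates. They differ in the number of points counted as independent, and therefore in the standard errors.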
An Introductory Example
Data on Weight Loss
• Pertains to an obese male patient, age 48, height 193 cm (6’4”), with a large body frame.
• Consists of the variables “Weight” (in kilograms, as measured under standard conditions) and “Days” (time since the start of the weight reduction program).
R Code
    library(MASS)                                  # provides the wtloss data
    attach(wtloss)
    oldpar <- par(mar = c(5.1, 4.1, 4.1, 4.1))     # widen the right margin for a second axis
    plot(Days, Weight, type = "p", ylab = "Weight (kg)")
    Wt.lbs <- pretty(range(Weight * 2.205))        # tick positions in pounds
    axis(side = 4, at = Wt.lbs / 2.205, labels = Wt.lbs, srt = 90)
    mtext("Weight (lb)", side = 4, line = 3)
    fm <- lm(Weight ~ Days)                        # simple linear regression of Weight on Days
    abline(fm, col = "red")
    lrf <- loess(Weight ~ Days)                    # local regression (loess) smooth of Weight on Days
    lines(Days, fitted(lrf), col = "blue")
    par(oldpar)                                    # restore the graphics settings
    detach(wtloss)
Model
$Y = \beta_0 + \beta_1 2^{-t/\theta} + \varepsilon$, where only $\varepsilon$ is considered random
• $\beta_0$ is the ultimate lean weight, or asymptote
• $\beta_1$ is the total amount to be lost
• $\theta$ is the time taken to lose half the amount remaining to be lost
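A sketch of fitting this model to the wtloss data with nls(). In the code, b0, b1 and th stand for the asymptote, the total amount to be lost and the half-life, and t is the Days variable. The starting values are rough guesses of the kind one might read off the plot; they are assumptions, not values given in the slides.

```r
library(MASS)                                 # wtloss: Days and Weight for the patient above

# Ballpark starting values (assumed): asymptote near 90 kg, about 95 kg
# still to lose at day 0, and a half-life of roughly 120 days.
wtloss.st <- c(b0 = 90, b1 = 95, th = 120)

wtloss.fm <- nls(Weight ~ b0 + b1 * 2^(-Days / th),
                 data = wtloss, start = wtloss.st)
summary(wtloss.fm)                            # estimates and standard errors

# Fitted weights (kg) at a few chosen days
predict(wtloss.fm, newdata = data.frame(Days = c(0, 100, 200)))
```

The estimate of b0 is an extrapolation beyond the observed weights, so it is worth reading its standard error alongside the point estimate.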