
Regression using lm (lmRegression.R)


yonah

Presentation Transcript


  1. Regression using lm (lmRegression.R) • Basics • Prediction • World Bank CO2 Data

  2. Simple linear regression • Simple linear model: y = b1 + x b2 + error • y: the dependent variable • x: the independent variable • b1, b2: intercept and slope coefficients • error: random departures between the model and the response • Coefficients are estimated by least squares
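
The least-squares estimates on this slide have a closed form, and it can help to see that lm reproduces them. The sketch below uses simulated data; the true intercept (2), slope (0.5), and noise level are arbitrary assumptions and are not taken from the presentation.

```r
# Simulated data (an assumption for illustration, not the presentation's data).
set.seed(1)
x <- seq(1, 20, length.out = 50)
y <- 2 + 0.5 * x + rnorm(50, sd = 1)

# Closed-form least-squares estimates for y = b1 + x b2 + error.
b2 <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)  # slope
b1 <- mean(y) - b2 * mean(x)                                      # intercept

# lm() fits the same model; its coefficients should match b1 and b2.
fit <- lm(y ~ x)
print(c(intercept = b1, slope = b2))
print(coef(fit))
```

The two printed vectors should agree up to rounding, since lm minimizes the same sum of squared errors.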

  3. Multiple regression • y = b0 + x1 b1 + x2 b2 + x3 b3 + … + error
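
With several independent variables, least squares can be written in matrix form: stack a column of ones and the x's into a design matrix X and solve the normal equations (X'X) b = X'y. The sketch below does this on simulated data and compares the result with lm. The variable names, coefficients, and sample size are assumptions chosen for illustration.

```r
# Simulated data with three independent variables (illustrative assumption).
set.seed(2)
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
x3 <- rnorm(n)
y  <- 1 + 2 * x1 - 1 * x2 + 0.5 * x3 + rnorm(n, sd = 0.3)

# Design matrix: a column of ones for the intercept b0, then x1, x2, x3.
X <- cbind(1, x1, x2, x3)

# Solve the normal equations (X'X) b = X'y for the coefficient vector.
bHat <- solve(crossprod(X), crossprod(X, y))

# lm() estimates the same coefficients from the formula interface.
fit <- lm(y ~ x1 + x2 + x3)
print(cbind(normalEquations = as.numeric(bHat), lm = unname(coef(fit))))
```

In practice lm uses a QR decomposition rather than forming X'X directly, which is numerically more stable; the normal equations are shown here only because they mirror the formula on the slide.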

  4. Annual Boulder Temperatures • Temperature is the dependent variable; Year is the independent variable • Errors? Linear?

  5. CO2 Emissions by Country • Independent: GDP per capita • Dependent: CO2 emissions • Linear? Errors?

  6. The R lm function • Takes a formula to describe the regression, where ~ plays the role of the equals sign (read y ~ x1 + x2 as "y is modeled by x1 and x2") • Works best when the data set is a data frame • Returns a complicated list (the fitted model object) that can be used in summary, predict, print, and plot • Example: lmFit <- lm(y ~ x1 + x2)

  7. Or, more generally, using a data frame • lmFit <- lm(y ~ x1 + x2, data = dataset) • The formula variables are looked up in the data frame: dataset$y, dataset$x1, dataset$x2
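
A short end-to-end sketch of the data frame form and of the functions that accept the fitted object. The data frame contents are simulated and the new x values passed to predict are hypothetical; only lm, summary, print, and predict come from the slides.

```r
# Simulated data frame with the column names used on the slide.
set.seed(3)
dataset <- data.frame(x1 = runif(40, 0, 10), x2 = runif(40, 0, 5))
dataset$y <- 1 + 0.8 * dataset$x1 - 0.4 * dataset$x2 + rnorm(40, sd = 0.5)

# Fit the regression; y, x1, and x2 are looked up inside dataset.
lmFit <- lm(y ~ x1 + x2, data = dataset)

# print() shows the call and coefficients; summary() adds standard errors.
print(lmFit)
print(summary(lmFit)$coefficients)

# predict() needs a data frame whose columns match the formula variables.
newData <- data.frame(x1 = c(2, 5), x2 = c(1, 3))  # hypothetical new points
pred <- predict(lmFit, newdata = newData, interval = "prediction", level = 0.95)
print(pred)
```

The matrix returned by predict has columns fit, lwr, and upr: the predicted value and the lower and upper ends of the 95% prediction interval for each row of newData.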

  8. Analysis of the World Bank data set • It is best to work on a log scale, where GDP has the strongest linear relationship with CO2 emissions • Some additional pattern is left over in the residuals • Try other variables • Try a more complex curve • Check the predictions using cross-validation
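
The workflow on this slide can be sketched as follows. The World Bank data are not reproduced here, so the code generates a synthetic stand-in; the column names gdp and co2, the data frame name wb, and every number in the simulation are assumptions. The quadratic term is one possible "more complex curve", not necessarily the one used in the presentation.

```r
# Synthetic stand-in for the World Bank data (NOT the real data).
set.seed(4)
n   <- 120
gdp <- exp(runif(n, log(500), log(60000)))              # GDP per capita
co2 <- exp(-6 + 0.9 * log(gdp) + rnorm(n, sd = 0.5))    # CO2 emissions
wb  <- data.frame(gdp = gdp, co2 = co2)

# Straight-line fit on the log scale.
fitLinear <- lm(log(co2) ~ log(gdp), data = wb)

# A more complex curve: add a quadratic term in log(gdp).
fitCurve <- lm(log(co2) ~ log(gdp) + I(log(gdp)^2), data = wb)

# Residuals against fitted values; leftover pattern suggests a missing term.
if (interactive()) {
  plot(fitted(fitLinear), resid(fitLinear),
       xlab = "Fitted log(CO2)", ylab = "Residual")
  abline(h = 0, lty = 2)
}

# F-test comparing the straight line with the curve.
print(anova(fitLinear, fitCurve))
```

Because the synthetic data are generated from a straight line on the log scale, the quadratic term is not expected to help here; with the real data the comparison may come out differently.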

  9. Leave-one-out cross-validation • A robust way to check a model's predictions and its uncertainty measure • Four steps: (1) sequentially leave out each observation; (2) refit the model with the remaining data; (3) predict the omitted observation; (4) compare the prediction and confidence interval to the actual observation • This is a check on the consistency of the statistical model, because the omitted observation is not used to make the prediction
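
The four steps above can be sketched as a loop. The data are simulated, and the sketch uses predict with interval = "prediction", on the assumption that the interval meant on the slide is the one for a single new observation; the names dat, cvPred, rmse, and coverage are hypothetical.

```r
# Simulated data (illustrative assumption, not the presentation's data).
set.seed(5)
n   <- 60
dat <- data.frame(x = runif(n, 0, 10))
dat$y <- 3 + 1.5 * dat$x + rnorm(n, sd = 1)

# One row per observation: prediction plus 95% interval bounds.
cvPred <- matrix(NA_real_, nrow = n, ncol = 3,
                 dimnames = list(NULL, c("fit", "lwr", "upr")))

for (i in seq_len(n)) {
  # Steps 1 and 2: leave out observation i and refit on the rest.
  fit_i <- lm(y ~ x, data = dat[-i, ])
  # Step 3: predict the omitted observation, with a prediction interval.
  cvPred[i, ] <- predict(fit_i, newdata = dat[i, , drop = FALSE],
                         interval = "prediction", level = 0.95)
}

# Step 4: compare predictions and intervals with the actual observations.
cvError  <- dat$y - cvPred[, "fit"]
rmse     <- sqrt(mean(cvError^2))
coverage <- mean(dat$y >= cvPred[, "lwr"] & dat$y <= cvPred[, "upr"])
print(c(rmse = rmse, coverage = coverage))
```

If the statistical model is consistent with the data, the coverage should be close to the nominal 0.95, and the cross-validated RMSE should be close to the residual standard error reported by summary for the full fit.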
