Understanding Residuals in Regression Analysis: Diagnostic and Remedial Measures
This document explores the role of residuals in regression analysis, focusing on diagnostic and remedial measures. Residuals, defined as the difference between observed and fitted values, help assess the quality of a regression model. Key properties such as the mean and variance of the residuals are discussed, alongside methods for evaluating normality and constancy of variance. Visualization techniques such as residual plots, together with tests for outliers, are highlighted as essential tools for model evaluation. Understanding these concepts supports more reliable statistical analyses and better-specified models.
Presentation Transcript
DIAGNOSTIC AND REMEDIAL MEASURES • Residuals • The main purpose of examining residuals • Diagnostics for residuals • Tests involving residuals Regression Analysis Week 8
The observed error: • ei = Yi – Ŷi • For the regression model, the true errors εi are assumed to be independent normal random variables with mean 0 and variance σ². If the model is appropriate for the data, the residuals ei should then reflect the properties assumed for the εi. Residuals
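As a minimal sketch of the definition above, the following Python snippet fits a simple linear regression by least squares and computes the residuals as observed minus fitted values. The helper name `fit_and_residuals` and the five data points are made up for illustration; they are not from the slides.

```python
import numpy as np

def fit_and_residuals(x, y):
    """Fit y = b0 + b1*x by least squares; return (b, fitted, residuals)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])   # design matrix with intercept
    b, *_ = np.linalg.lstsq(X, y, rcond=None)   # least-squares estimates b0, b1
    fitted = X @ b                              # fitted values (Y-hat)
    residuals = y - fitted                      # e_i = Y_i - Yhat_i
    return b, fitted, residuals

# Hypothetical illustrative data
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
b, fitted, e = fit_and_residuals(x, y)
print("b0, b1:", b)
print("residuals:", e)
```

If the straight-line model is appropriate, these residuals should look like a sample of independent, roughly normal values scattered around zero.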
Properties of Residuals • Mean: ē = Σei / n = 0 (for a model with an intercept) • Variance: s² = Σ(ei – ē)² / (n – p) = SSE / (n – p) = MSE • The residuals ei are not independent random variables, because they involve the fitted values Ŷi, which are based on the sample estimates b0, b1, b2, ..., bp-1. • X’e = 0 and Ŷ’e = 0 Residuals (2)
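These properties can be checked numerically. The sketch below simulates a regression with p = 3 parameters (the sample size, coefficients, and random seed are arbitrary assumptions) and confirms that the residuals average to zero and are orthogonal to both the columns of X and the fitted values, up to floating-point error.

```python
import numpy as np

rng = np.random.default_rng(0)                 # arbitrary seed
n, p = 30, 3                                   # hypothetical sample size and parameter count
X = np.column_stack([np.ones(n), rng.normal(size=(n, p - 1))])
beta = np.array([1.0, 2.0, -0.5])              # made-up true coefficients
y = X @ beta + rng.normal(scale=0.5, size=n)

b, *_ = np.linalg.lstsq(X, y, rcond=None)      # least-squares estimates
y_hat = X @ b
e = y - y_hat

mean_e = e.mean()                              # should be ~0 (model has an intercept)
sse = float(e @ e)                             # error sum of squares
mse = sse / (n - p)                            # variance estimate SSE / (n - p)
orth_X = X.T @ e                               # X'e, should be the zero vector
orth_fit = float(y_hat @ e)                    # Yhat'e, should be ~0

print(mean_e, mse, orth_X, orth_fit)
```

The p linear constraints X'e = 0 are the reason the residuals are not independent and the reason SSE carries n - p degrees of freedom.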
Standardized Residuals: These residuals are useful in identifying outlying observations. There are still other measures based on residuals (see Ch. 11) Residuals (3)
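The slide does not show the formula, so the sketch below assumes one common definition: each residual divided by √MSE. Other texts also divide by √(1 – hii), which this sketch does not do. The planted outlier, the cutoff of 3, and the function name are illustrative assumptions rather than part of the original material.

```python
import numpy as np

def standardized_residuals(x, y):
    """Return e_i / sqrt(MSE) for a simple linear regression of y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    X = np.column_stack([np.ones_like(x), x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    e = y - X @ b
    n, p = X.shape
    mse = float(e @ e) / (n - p)               # SSE / (n - p)
    return e / np.sqrt(mse)

rng = np.random.default_rng(1)                 # arbitrary seed
x = np.arange(20, dtype=float)
y = 2.0 + 3.0 * x + rng.normal(scale=1.0, size=20)
y[10] += 30.0                                  # planted outlier (hypothetical)

e_star = standardized_residuals(x, y)
cutoff = 3.0                                   # rule-of-thumb threshold, an assumption
flagged = np.flatnonzero(np.abs(e_star) > cutoff)
print("flagged observations:", flagged)
```

Because a large outlier also inflates MSE, this simple scaling can understate how unusual a point is, which is one motivation for the other residual-based measures mentioned on the slide.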
To identify whether • The regression function is not linear • The error terms do not have constant variance • The error terms are not independent • The error terms are not normally distributed • The model fits all but one or a few outlier observations • One or several important independent variables have been omitted from the model The main purpose in examining residuals
Look at the distribution of each variable • Look at the relationship between pairs of variables • Plot the residuals versus • Each explanatory variable • Time • Fitted values • Omitted variables Diagnostics
Are the residuals approximately normal? • Look at a histogram, box plots, stem and leaf plots or dot plots • Normal quantile plot • Is the variance constant? • Plot the squared residuals vs anything that might be related to the variance Diagnostics (2)
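A normal quantile plot pairs the ordered residuals with the values expected under normality. The sketch below computes those pairs and their correlation, which should be close to 1 when the residuals are approximately normal. It assumes Blom-type plotting positions (k – 0.375)/(n + 0.25), which the slides do not specify. The simulated data and helper names are hypothetical.

```python
import numpy as np
from statistics import NormalDist

def normal_quantile_points(e, p):
    """Return (expected, ordered) coordinates for a normal quantile plot."""
    e = np.asarray(e, dtype=float)
    n = e.size
    mse = float(e @ e) / (n - p)
    # Expected standard normal quantiles at the assumed plotting positions
    z = np.array([NormalDist().inv_cdf((k - 0.375) / (n + 0.25))
                  for k in range(1, n + 1)])
    return np.sqrt(mse) * z, np.sort(e)

def quantile_plot_correlation(e, p):
    """Correlation between ordered residuals and their expected values."""
    expected, ordered = normal_quantile_points(e, p)
    return float(np.corrcoef(expected, ordered)[0, 1])

def ols_residuals(x, y):
    X = np.column_stack([np.ones_like(x), x])
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ b

rng = np.random.default_rng(42)                # arbitrary seed
x = rng.uniform(0.0, 10.0, size=200)
e_normal = ols_residuals(x, 1.0 + 2.0 * x + rng.normal(size=200))
e_skewed = ols_residuals(x, 1.0 + 2.0 * x + rng.exponential(size=200))

r_normal = quantile_plot_correlation(e_normal, p=2)
r_skewed = quantile_plot_correlation(e_skewed, p=2)
print("normal errors:", r_normal, "skewed errors:", r_skewed)
```

Plotting `ordered` against `expected` gives the usual picture: points near a straight line for normal errors and a curved pattern for skewed errors.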
Transformations such as Box-Cox • Analyze without outliers • More in NKNW Ch 11 Remedial measures
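One way to choose a Box-Cox power λ is a grid search: transform Y for each candidate λ, refit the regression, and keep the λ with the smallest error sum of squares. The sketch below assumes the standardized form of the transformation, scaled by the geometric mean of Y so that SSE values are comparable across λ. The grid, the simulated data, and the function names are illustrative assumptions.

```python
import numpy as np

def box_cox_sse(x, y, lam):
    """SSE from regressing the standardized Box-Cox transform of y on x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)             # must be strictly positive
    k2 = np.exp(np.mean(np.log(y)))            # geometric mean of Y
    if np.isclose(lam, 0.0):
        w = k2 * np.log(y)                     # limiting case lambda = 0
    else:
        k1 = 1.0 / (lam * k2 ** (lam - 1.0))
        w = k1 * (y ** lam - 1.0)
    X = np.column_stack([np.ones_like(x), x])
    b, *_ = np.linalg.lstsq(X, w, rcond=None)
    r = w - X @ b
    return float(r @ r)

def best_box_cox_lambda(x, y, grid=None):
    """Return the grid value of lambda that minimizes SSE."""
    if grid is None:
        grid = np.round(np.linspace(-2.0, 2.0, 41), 1)   # -2.0, -1.9, ..., 2.0
    sse = [box_cox_sse(x, y, lam) for lam in grid]
    return float(grid[int(np.argmin(sse))])

rng = np.random.default_rng(7)                 # arbitrary seed
x = rng.uniform(0.0, 5.0, size=100)
y = np.exp(1.0 + 0.5 * x + rng.normal(scale=0.1, size=100))  # log-linear data

best_lambda = best_box_cox_lambda(x, y)
print("selected lambda:", best_lambda)
```

For data generated on the log scale, as here, the selected λ should land near 0, pointing to a log transformation of Y.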
Tests for Randomness • Tests for Constancy of Variance • Tests for Outliers • Tests for Normality • More in NKNW Ch 11 Tests Involving Residuals
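As one illustration of the tests for randomness, the sketch below computes two statistics for residuals listed in time order: the Durbin-Watson statistic and the z-score of a runs test on the residual signs, using the usual large-sample normal approximation. The slides do not name specific tests, so the choice of these two, along with the function names and toy residuals, is an assumption.

```python
import math
import numpy as np

def durbin_watson(e):
    """Durbin-Watson statistic: sum of squared successive differences / SSE."""
    e = np.asarray(e, dtype=float)
    return float(np.sum(np.diff(e) ** 2) / np.sum(e ** 2))

def runs_test_z(e):
    """Return (number of runs, z-score) for the signs of the residuals."""
    signs = np.asarray(e, dtype=float) > 0     # zeros are treated as negative
    n1 = int(signs.sum())
    n2 = int(signs.size - n1)
    if n1 == 0 or n2 == 0:
        raise ValueError("runs test needs both positive and negative residuals")
    runs = 1 + int(np.sum(signs[1:] != signs[:-1]))
    n = n1 + n2
    mu = 2.0 * n1 * n2 / n + 1.0               # expected runs under randomness
    var = 2.0 * n1 * n2 * (2.0 * n1 * n2 - n) / (n ** 2 * (n - 1.0))
    return runs, (runs - mu) / math.sqrt(var)

# Toy residuals that alternate in sign (strong negative autocorrelation)
e = np.array([1.0, -1.0] * 5)
dw = durbin_watson(e)
runs, z = runs_test_z(e)
print("Durbin-Watson:", dw, "runs:", runs, "z:", z)
```

A Durbin-Watson value near 2 and a z-score near 0 are consistent with random ordering. The alternating toy series gives a Durbin-Watson value well above 2 and more runs than expected.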