## Regression and Correlation


**Regression and Correlation**
BUSA 2100, Sect. 12.0 - 12.2, 3.5

**Introduction to Regression**
• Forecasts (predictions) are often based on the relationship between 2 or more variables.
• Example 1: Advertising expenditures and sales.
• Example 2: Daily high temperature and demand for electricity.
• X = independent variable, the variable being used to make a forecast; Y = dependent variable, the variable being forecasted.
• Identify X and Y in Examples 1 and 2.
• Y depends on X.

**Straight Lines**
• A regression line can be used to show mathematically how variables are related.

**Regression Example**
• To determine the equation of a line, all we need are the slope and Y-intercept.
• Example: Pizza House builds restaurants near college campuses.
• Before building another one, it plans to use X = student enrollment (1000s) to estimate Y = quarterly sales ($1000s).
• A sample of 6 existing restaurants is chosen.

**Pizza Restaurant Problem**
• Resulting data pairs are shown below.

| X (enrollment, 1000s) | Y (quarterly sales, $1000s) |
|---|---|
| 4 | 95 |
| 6 | 155 |
| 9 | 140 |
| 11 | 210 |
| 12 | 250 |
| 15 | 260 |

**Scatter Diagram & Line of Best Fit**
• Draw a scatter diagram on the board.
• Use a hiatus (axis break) so that the X and Y axes don’t have to begin at zero. Within each axis, all units must be the same size.
• By trial and error, draw some lines through the data. The regression line is the one line that fits the data best. (Also called the line of best fit.)

**Line of Best Fit (Continued)**
• As indicated earlier, YF is a forecasted value (on the regression line). Y is an actual value (one of the dots).

**Regression Formulas**
• Based on calculus, the equation of a regression line (line of best fit) can be found using these formulas.

**Regression Formulas, Page 2**
• Carry the numerical coefficients (b1 and b0) to 3 or 4 decimal places; then round to 2 or fewer places at the end.
• Substitute the numbers into the regression equation: YF = b0 + b1X.
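The formulas referred to on the Regression Formulas slide are not preserved in this transcript. As a hedged sketch, the standard least-squares formulas in their computational (column-sum) form are given below; that the slide used exactly this form is an assumption. Here n is the number of data pairs, the sums run over all pairs, and b_1, b_0, Y_F correspond to b1, b0, YF in the text.

```latex
b_1 = \frac{n\sum XY - \left(\sum X\right)\left(\sum Y\right)}
           {n\sum X^2 - \left(\sum X\right)^2},
\qquad
b_0 = \bar{Y} - b_1\bar{X} = \frac{\sum Y - b_1\sum X}{n},
\qquad
Y_F = b_0 + b_1 X
```

This form needs only the totals of the X, Y, XY, and X² columns, which is why a table of column sums is a convenient way to organize the data.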
• We will complete the restaurant problem, using a table to organize the data.

**Restaurant Problem, Page 2**

| | X | Y | XY | X² | Y² |
|---|---|---|---|---|---|
| | 4 | 95 | 380 | 16 | 9025 |
| | 6 | 155 | 930 | 36 | 24025 |
| | 9 | 140 | 1260 | 81 | 19600 |
| | 11 | 210 | 2310 | 121 | 44100 |
| | 12 | 250 | 3000 | 144 | 62500 |
| | 15 | 260 | 3900 | 225 | 67600 |
| SUM | 57 | 1110 | 11780 | 623 | 226850 |

**Meaning and Uses of the Regression Equation**
• Example: Vidalia State University has an enrollment of 9,800. Forecast pizza sales for a restaurant near the campus.

**Accuracy of Forecasts Using Regression**
• The accuracy of forecasts depends on how closely the points in a scatter diagram fit the regression line.
• If the linear relationship is too weak (the deviations are too large), there are large forecast errors and there may be no need to pursue use of a regression line.

**Evaluating Accuracy of Regression Forecasts**
• It is best to have an estimate of forecast accuracy before using a regression line.
• 3 ways to estimate forecast accuracy:

**Introduction to Correlation**
• Def.: The coefficient of correlation (r) is a numerical measure of the strength of the linear relationship between 2 variables.
• Values of r are always between -1 and 1; i.e., between 0 and 1 in absolute value.
• r = 0 means no correlation; r = ±1 means perfect correlation; both are rare.

**Positive and Negative Correlation**
• Definition: Two variables X, Y have a positive correlation if large values of X tend to be associated with large values of Y; similarly, for small values.
• X, Y must be measurable quantitatively.
• Example of positive correlation:

**Positive and Negative Correlation, Page 2**
• Definition: Two variables X, Y have a negative correlation if large values of X tend to be associated with small values of Y, and vice versa.
• Example of negative correlation:
• Graph positive and negative correlation.

**High and Low Correlation**
• General guidelines:

| Degree of Correlation | Forecast Accuracy |
|---|---|
| very high | very good |
| high | good |
| moderate | medium |
| low | fair |
| very low | poor |

**Formula for Correlation**
• Use regression for forecasts only if r is .70 or larger, in absolute value.

**Regression Analysis Summary**
• Steps in regression analysis:
• (1) Collect data pairs, using 2 related variables.
• (2) Calculate the correlation, r.
• (3) (a) If r >= .70, in absolute value, find the regression equation and use it for forecasting.
• (b) If r < .70, in absolute value, don’t use regression.

**Multiple Regression**
• Regression analysis with one independent variable (X) is called simple regression.
• Regression analysis with 2 or more independent variables (X1, X2, etc.) is called multiple regression.

**Multiple Regression and Line of Average Relationship**
• State the multiple regression equation.
• A regression equation is also called the line of average relationship. Explain in terms of GPA example.
• Correlation does not necessarily imply cause and effect. Illustrate with example.
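The steps in the Regression Analysis Summary can be sketched in Python for the Pizza House data. This is a minimal sketch, not the course's own method: the function names (`column_sums`, `correlation`, `fit_line`) are hypothetical, and the formulas are the standard Pearson correlation and least-squares formulas in column-sum form, which are assumed to match the slides' formulas (those are not preserved in the transcript).

```python
import math

# Step 1: data pairs from the Pizza Restaurant Problem.
# X = student enrollment (1000s), Y = quarterly sales ($1000s)
X = [4, 6, 9, 11, 12, 15]
Y = [95, 155, 140, 210, 250, 260]


def column_sums(xs, ys):
    """Return n and the totals of the X, Y, XY, X^2, Y^2 columns."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    return n, sum_x, sum_y, sum_xy, sum_x2, sum_y2


def correlation(xs, ys):
    """Pearson coefficient of correlation r (assumed formula, column-sum form)."""
    n, sx, sy, sxy, sx2, sy2 = column_sums(xs, ys)
    numerator = n * sxy - sx * sy
    denominator = math.sqrt((n * sx2 - sx ** 2) * (n * sy2 - sy ** 2))
    return numerator / denominator


def fit_line(xs, ys):
    """Least-squares intercept b0 and slope b1 for YF = b0 + b1*X."""
    n, sx, sy, sxy, sx2, _ = column_sums(xs, ys)
    b1 = (n * sxy - sx * sy) / (n * sx2 - sx ** 2)
    b0 = (sy - b1 * sx) / n
    return b0, b1


# Step 2: calculate the correlation.
r = correlation(X, Y)
print(f"r = {r:.3f}")

# Step 3: use regression only if |r| is at least .70.
if abs(r) >= 0.70:
    b0, b1 = fit_line(X, Y)
    print(f"YF = {b0:.2f} + {b1:.2f}X")
    # Vidalia State University: enrollment of 9,800 means X = 9.8
    forecast = b0 + b1 * 9.8
    print(f"Forecast quarterly sales: ${forecast:.1f} thousand")
else:
    print("Correlation too weak; don't use regression.")
```

With the sample data this gives r of about 0.933, so the regression is used; the fitted line is roughly YF = 41.04 + 15.15X, and the forecast for an enrollment of 9,800 is about $189.5 thousand in quarterly sales. Coefficients are kept at full precision until the final printout, in line with the advice to round only at the end.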