
Lecture 5 Regression


edan




Presentation Transcript


  1. Lecture 5 Regression

  2. Homework Issues…past • Bad Objective: Conduct an experiment because I have to for this class • Commas – ugh • Do not write out symbols (‘pi’), use the symbol (‘π’) • Summarize results (don’t give me everything and then some) • Report: mean ± std. dev.

  3. Homework Issues…present • A confidence interval should be reported as an interval, e.g., 1.2 – 1.5 • Define abbreviations when first used, e.g., CI • However, there were too many conjunctive adverbs at the start of sentences! • Equation formatting

  4. Homework Issues…present • Do not show 27 digits of accuracy • UNITS!!! UNITS!!! INCLUDE UNITS!!! • Every table and figure should have a caption and be referred to in the text. • A section (e.g., results) should be more than just a table.

  5. On to the lecture…

  6. In Excel… • Three ways to perform a linear regression: • Built-in functions SLOPE() and INTERCEPT() (no details) • Adding a trendline to a chart and showing the regression equation on the chart (simplest) • Regression analysis using the Data Analysis tools in the Analysis ToolPak add-in (best option – more information)
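As a rough illustration of what SLOPE() and INTERCEPT() compute, here is a minimal Python sketch of an ordinary least-squares line. The function name `slope_intercept` and the data are made up for illustration; they are not from the lecture.

```python
# Least-squares slope and intercept, analogous to Excel's SLOPE() and INTERCEPT().
# The data below are invented for illustration only.

def slope_intercept(xs, ys):
    """Return (slope, intercept) of the least-squares line y = intercept + slope*x."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept

xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]  # lies exactly on y = 1 + 2x
slope, intercept = slope_intercept(xs, ys)
print(slope, intercept)
```

Like the two Excel functions, this returns only the coefficients, with no standard errors or p-values.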

  7. Option 3 in Excel

  8. Excel Results • Recall that we forced the intercept = 0
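Forcing the intercept to zero changes the fit: the slope becomes Σxy / Σx², and one degree of freedom is returned to the residuals (N − 1 instead of N − 2). The sketch below shows this under those standard formulas. The helper name `fit_through_origin` and the data are hypothetical and are not the lecture's data.

```python
import math

def fit_through_origin(xs, ys):
    """Least-squares fit of y = slope*x (intercept forced to 0).

    Returns (slope, standard_error_of_slope).
    """
    n = len(xs)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    slope = sxy / sxx
    # Residual sum of squares about the fitted line
    sse = sum((y - slope * x) ** 2 for x, y in zip(xs, ys))
    # Only one parameter is estimated, so the residual DOF is n - 1
    se = math.sqrt(sse / (n - 1)) / math.sqrt(sxx)
    return slope, se

# Hypothetical measurements (not from the lecture)
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.3, 9.6, 12.4]
slope0, se0 = fit_through_origin(xs, ys)
print(slope0, se0)
```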

  9. Interpretation of results… • Excel reports the standard error, not the standard deviation. They are not equal. See next slide. • The P-value is the probability of obtaining a result at least as extreme as the one observed if only random chance were at work (that is, if the true slope were zero). The tiny P-value for the slope (1.91 × 10⁻²⁵) indicates that there is a minuscule probability that the observed result can be explained by random chance. That is, you REALLY NEED the slope term to explain the data.

  10. Interpretation of results… • The 95% confidence interval for the true value of the slope (the true value of π in this example) is presented in the output table. In this example, with 95% confidence, the true value of π is somewhere between 3.138 and 3.307. • The 90% confidence interval is 3.15233 to 3.292408, which does not contain the true value!! Does this point to measurement bias rather than small, random, additive error?
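The claim about the two intervals is easy to check numerically. This short sketch uses the interval endpoints quoted on the slide and tests whether π falls inside each one. The variable names are illustrative.

```python
import math

# Interval endpoints as quoted on the slide
intervals = {
    "95%": (3.138, 3.307),
    "90%": (3.15233, 3.292408),
}

# True if the interval contains pi, False otherwise
contains_pi = {
    level: low <= math.pi <= high
    for level, (low, high) in intervals.items()
}

for level, inside in contains_pi.items():
    print(level, "interval contains pi:", inside)
```

The 95% interval contains π, while the lower bound of the 90% interval (3.15233) sits just above π ≈ 3.14159.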

  11. Calculating std. dev. • Slope standard error: se = 0.0405 • Slope standard deviation: sd = se·√N = 0.0405·√20 = 0.181 • Our experimental results are: • “The experimental value of π was found to be 3.22 ± 0.181.” • “The 95% confidence interval for the true value of π ranges from 3.138 to 3.307.”
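The conversion on this slide is a one-line calculation. The sketch below reproduces it with the slide's numbers, taking N = 20 from the √20 shown on the slide.

```python
import math

se = 0.0405   # standard error of the slope, from the Excel output
n = 20        # number of data points

# Standard deviation recovered from the standard error: sd = se * sqrt(N)
sd = se * math.sqrt(n)
print(round(sd, 3))  # 0.181
```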

  12. Multivariable Regression • Fit this data to an equation of the form:

  13. Plot

  14. Multivariable Regression • y is the response variable. • Order of the other columns does not matter.
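The equation for this example is not reproduced in the transcript. As a sketch only, assume a model that is linear in two predictors, y = b0 + b1·x1 + b2·x2 (an assumption, chosen because three coefficients are reported later). The function names and the data below are hypothetical. The fit solves the normal equations with plain Gaussian elimination so that no external library is needed.

```python
def solve_linear_system(a, b):
    """Solve a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are not modified
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Move the row with the largest pivot into place
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x

def fit_two_predictors(x1, x2, y):
    """Least-squares [b0, b1, b2] for the ASSUMED model y = b0 + b1*x1 + b2*x2."""
    rows = [[1.0, u, v] for u, v in zip(x1, x2)]
    # Normal equations: (X^T X) b = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    return solve_linear_system(xtx, xty)

# Hypothetical noise-free data generated from y = 1 + 2*x1 + 3*x2
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0]
y = [1.0 + 2.0 * u + 3.0 * v for u, v in zip(x1, x2)]

b0, b1, b2 = fit_two_predictors(x1, x2, y)
print(b0, b1, b2)
```

Swapping the order of the x1 and x2 columns only swaps b1 and b2, which is consistent with the slide's remark that the order of the predictor columns does not matter.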

  15. In Excel…

  16. Results… (bug?)

  17. Interpretation… • The coefficients ± s are: • b0 = 5.53 ± 20.45 • b1 = 2.12 ± 8.54 • b2 = 3.98 ± 0.78 • For b0 and b1, the standard deviations are roughly four times larger than the coefficient values themselves. • The p-values for these two coefficients are 0.42 and 0.45. • These p-values are well over 0.05, so these terms are statistically insignificant at the 5% level. We can regress this data nearly as well with:
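The screening logic on this slide can be written out directly. The sketch below uses only the numbers quoted on the slide. No p-value is quoted for b2, so it is left out of the p-value test. The variable names are illustrative.

```python
alpha = 0.05  # significance level used on the slide

# (coefficient value, s) as quoted on the slide
coefficients = {
    "b0": (5.53, 20.45),
    "b1": (2.12, 8.54),
    "b2": (3.98, 0.78),
}

# p-values quoted on the slide (none is given for b2)
p_values = {"b0": 0.42, "b1": 0.45}

# Ratio |value| / s: a ratio well below 1 means the scatter swamps the estimate
ratios = {name: abs(value) / s for name, (value, s) in coefficients.items()}

# Terms whose p-value exceeds alpha are statistically insignificant
insignificant = [name for name, p in p_values.items() if p > alpha]

for name, ratio in ratios.items():
    print(name, "value/s =", round(ratio, 2))
print("insignificant at 5%:", insignificant)
```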

  18. p-value? • Recall: The lower the p-value, the less likely the result, assuming the null hypothesis, so the more "significant" the result, in the sense of statistical significance. • The null hypothesis here is, simplistically, that the coefficient is zero.

  19. t-Test on a Regression Slope • Comparison of b1 from regression with another value, β. • The t-test is a hypothesis test. Here are the hypotheses for this t-test. • H0 (null hypothesis) – The slope, b1, is equal to the known value, β. • H1 (test hypothesis) – The slope, b1, is not equal to the known value, β.

  20. t-Statistic • The appropriate t-statistic for this case is calculated as • where • The t-statistic is taken as positive; use the absolute value |b1 − β| (equivalently, use (β − b1) when b1 < β).
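The formulas from this slide are not reproduced in the transcript. A common textbook form for testing a regression slope against a known value, assumed here rather than taken from the slide, is shown below. It is consistent with the SSE and the N − 2 degrees of freedom used in the example that follows.

```latex
t_{\mathrm{stat}} = \frac{\lvert b_1 - \beta \rvert}{s_{b_1}},
\qquad
s_{b_1} = \sqrt{\frac{SSE / (N - 2)}{\sum_{i=1}^{N} (x_i - \bar{x})^2}}
```

Here SSE is the sum of squared residuals from the regression, N is the number of data points, and x̄ is the mean of the x values.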

  21. Critical t Value • If tstat > tcrit – Reject the null hypothesis that the slope, b1, is equal to the known value, β. • If tstat ≤ tcrit – Fail to reject the null hypothesis. • Get tcrit from a t-Table or Excel (see example). • degrees of freedom, DOF = N-2

  22. Example • We are comparing b1 = 3.22 (first example in lecture) to β = π. • Get SSE = 85.954 from regression output. • Calculate: tstat = 0.952 • Choose α = 0.05. • DOF = 20 – 2 = 18. • In Excel, calculate TINV(α,DOF), which returns the value tcrit = 2.101 when α = 0.05 and DOF = 18 • Since tstat ≤ tcrit (0.952 < 2.101) we fail to reject the null hypothesis. • Conclusion? We cannot say with 95% confidence that b1 is not equal to β.

  23. Example • Choose α = 0.40. • DOF = 20 – 2 = 18. • In Excel, calculate TINV(α,DOF), which returns the value tcrit = 0.86 when α = 0.40 and DOF = 18 • Since tcrit ≤ tstat (0.86 < 0.952) we reject the null hypothesis. • Conclusion? We can say with 60% confidence that b1 is not equal to β. • Hmmm…that’s a coin flip.
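The decision rule from the two examples can be sketched in a few lines. The t values below are the ones quoted on the slides (the critical values came from Excel's TINV). The function name `decide` is made up for illustration.

```python
def decide(t_stat, t_crit):
    """Apply the rule from the lecture: reject H0 only if t_stat > t_crit."""
    return "reject H0" if t_stat > t_crit else "fail to reject H0"

t_stat = 0.952          # from the example
dof = 20 - 2            # N - 2 = 18

# Critical values quoted on the slides for DOF = 18
t_crit_by_alpha = {0.05: 2.101, 0.40: 0.86}

decisions = {alpha: decide(t_stat, t_crit)
             for alpha, t_crit in t_crit_by_alpha.items()}

for alpha, decision in decisions.items():
    confidence = round((1 - alpha) * 100)
    print(f"alpha = {alpha} ({confidence}% confidence): {decision}")
```

The same t-statistic leads to opposite conclusions depending on α, so α should be chosen before looking at the result.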
