
Reexpressing Data


martina

Presentation Transcript


  1. Reexpressing Data

  2. Re-express data: is that cheating? Not at all. Data that look linear at first are sometimes not linear at all. Straight Enough Condition: Does the scatterplot look straight? Randomization Condition: Are the individuals a representative sample from the population? Does the Plot Thicken? Condition: Does a scatterplot of the residuals against the predicted values have ANY pattern? It shouldn't. Clusters indicate that the relationship probably isn't linear. Boring is good.

  3. Huh? The picture you see is a scatterplot of fuel efficiency (mpg) vs. weight (lbs) for late-model cars. It looks OK, and r² is 0.816 (sometimes written as 81.6%), so maybe a line is fine. The second graph, which extrapolates the fitted line, suggests that a 6000 lb car would get 0 mpg. The H2 weighs 6400 lbs. It doesn't get good gas mileage, but it does better than 0. The third graph is the residual plot for fuel efficiency. See how it has a "bend" in it? That bend indicates that the original relationship is not well described by a linear expression.
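
The situation on this slide can be sketched in a few lines of Python. The slide's actual car data are not in the transcript, so the numbers below are hypothetical: mpg is generated from an assumed model in which fuel use in gal/100 mi rises linearly with weight. The point is only to show how a high r² can coexist with a bent residual pattern and an absurd extrapolation.

```python
import numpy as np

# Hypothetical data (not from the slides): 13 car weights in lbs, with mpg
# generated from an assumed model where gal/100 mi rises linearly with weight.
weight = np.linspace(2000, 5000, 13)
mpg = 100.0 / (1.0 + 0.001 * weight)

# Least-squares line: mpg = intercept + slope * weight
slope, intercept = np.polyfit(weight, mpg, 1)
predicted = intercept + slope * weight
residuals = mpg - predicted

# Fraction of the variation in mpg accounted for by the line
r_squared = 1 - np.sum(residuals**2) / np.sum((mpg - mpg.mean()) ** 2)

# Weight at which the fitted line predicts 0 mpg (an extrapolation)
zero_mpg_weight = -intercept / slope

print(f"r^2 = {r_squared:.3f}")
print(f"line predicts 0 mpg at about {zero_mpg_weight:.0f} lbs")
# Positive at both ends and negative in the middle is the "bend"
print("residual signs:", np.sign(residuals).astype(int))
```

With these made-up numbers the residuals are positive for the lightest and heaviest cars and negative in between, which is the kind of curved pattern the slide describes.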

  4. So what do we do? Plotting weight against fuel consumption (gal/100 miles) instead of mpg may solve the problem. Where else do we re-express? If I ran 9 miles per hour on a mile run, is that fast? What if I did that in a 100 m dash?
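
As a rough illustration of both questions on this slide, the sketch below converts mpg to gal/100 miles and converts a running speed into a finishing time. The function names and the data are hypothetical (the same assumed model as a stand-in for the car data, which the transcript does not include).

```python
import numpy as np

METERS_PER_MILE = 1609.344

def mpg_to_gal_per_100mi(mpg):
    """Re-express fuel efficiency (mpg) as fuel consumption (gal/100 mi)."""
    return 100.0 / np.asarray(mpg, dtype=float)

def seconds_to_cover(meters, mph):
    """Time in seconds to cover a distance at a constant speed in mph."""
    meters_per_second = mph * METERS_PER_MILE / 3600.0
    return meters / meters_per_second

# Hypothetical data (not from the slides), built so that gal/100 mi is
# exactly linear in weight.
weight = np.linspace(2000, 5000, 13)
mpg = 100.0 / (1.0 + 0.001 * weight)

# Fit a line to the re-expressed response and inspect the residuals
consumption = mpg_to_gal_per_100mi(mpg)
slope, intercept = np.polyfit(weight, consumption, 1)
residuals = consumption - (intercept + slope * weight)
max_abs_residual = np.max(np.abs(residuals))
print("largest residual after re-expression:", max_abs_residual)

# The same 9 mph reads very differently as a time over each distance
print("mile at 9 mph:", seconds_to_cover(METERS_PER_MILE, 9), "seconds")
print("100 m at 9 mph:", seconds_to_cover(100, 9), "seconds")
```

At 9 mph a mile takes 400 seconds (6 min 40 s), while 100 m takes nearly 25 seconds. Whether the speed counts as fast depends on how it is expressed for the event in question. Real car data would not straighten out perfectly the way this constructed example does.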

  5. Why re-express? 1. To make the distribution of a variable (as seen in a histogram) more symmetric. 2. To make the spread of several groups (as seen in side-by-side boxplots) more alike. 3. To make the form of a scatterplot more nearly linear. 4. To make the scatter in a scatterplot spread out more evenly.

  6. Ladder of Powers This is a list of ways to re-express data
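
The list itself did not survive in the transcript. As a sketch only, the code below uses the rungs commonly taught for the Ladder of Powers (2, 1, 1/2, "0" meaning the logarithm, -1/2, -1), which is an assumption about what the slide showed. The helper names `reexpress` and `straightest_power` are hypothetical.

```python
import numpy as np

# Rungs commonly listed on the Ladder of Powers (assumed, see lead-in)
LADDER = (2, 1, 0.5, 0, -0.5, -1)

def reexpress(y, power):
    """Apply one rung of the ladder to y (y must be positive for power <= 0.5)."""
    y = np.asarray(y, dtype=float)
    if power == 0:
        return np.log10(y)          # the "0" rung stands for the logarithm
    if power < 0:
        return -(y ** power)        # negate so the order of the values is kept
    return y ** power

def straightest_power(x, y, ladder=LADDER):
    """Return the rung whose re-expressed y has the highest r^2 against x."""
    def r_squared(power):
        return np.corrcoef(x, reexpress(y, power))[0, 1] ** 2
    return max(ladder, key=r_squared)

print(reexpress([1, 4, 9], 0.5))
print(reexpress([1, 10, 100], 0))
print(reexpress([1, 2, 4], -1))
```

Picking a rung by r² alone is a crude shortcut. The residual plot for the re-expressed data still needs to look boring before the choice is trusted.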

  7. Ladder of Powers Part 2 If nothing feels good, you can try one of these three ideas (as long as none of the data is negative or zero)
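
The three ideas are not listed in the transcript. Given the restriction to positive data, a reasonable guess is the usual trio of logarithm-based attempts: log of y only, log of x only, or logs of both. The sketch below assumes that reading. The function name `log_reexpressions` and the test data are hypothetical.

```python
import numpy as np

def log_reexpressions(x, y):
    """Return r^2 for three log-based re-expressions of (x, y).

    Raises ValueError if any value is zero or negative, because the
    logarithm is undefined there.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if np.any(x <= 0) or np.any(y <= 0):
        raise ValueError("logarithms need strictly positive data")
    candidates = {
        "exponential (x, log y)": (x, np.log10(y)),
        "logarithmic (log x, y)": (np.log10(x), y),
        "power (log x, log y)": (np.log10(x), np.log10(y)),
    }
    return {
        name: np.corrcoef(u, v)[0, 1] ** 2
        for name, (u, v) in candidates.items()
    }

# Made-up data following y = 3 * x^2, so logging both variables should
# give a straight line.
x = np.arange(1, 11)
y = 3.0 * x**2
for name, r2 in log_reexpressions(x, y).items():
    print(f"{name}: r^2 = {r2:.4f}")
```

As with the ladder, r² only narrows down the candidates. The scatterplot and residual plot of the re-expressed data decide whether the result is straight enough.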

  8. This is not a cure-all! Some data just won't benefit from re-expression. Don't worry. Yes, some data fit curved models, but the calculations are pretty intense.

  9. Example

  10. Homework Take a worksheet with you and do the circled problems.
