
Ch. 10 – Scatterplots, Association and Correlation (Day 1)

karlson

Presentation Transcript


  1. Ch. 10 – Scatterplots, Association and Correlation (Day 1)

  2. Scatterplots • So far, all of our analysis has looked at one variable at a time • In this chapter, we will look at the relationship between two variables • If the variables are quantitative, we can do this by starting with a graph called a scatterplot

  3. Scatterplots • Ex: Use the following data to examine the relationship between the amount of fertilizer (lbs per acre) used on plots of land in a particular farming region and the number of bushels of grain produced per acre.

  4. THINK: How will we draw the graph? • To decide which variable will go on which axis, think about what you are trying to learn • Do the variables have an explanatory/response relationship? • In this case, we are wondering how the amount of fertilizer used affects the amount of grain produced • Fertilizer is the explanatory variable • Bushels produced is the response variable • In a scatterplot, the explanatory variable goes on the x-axis and the response variable goes on the y-axis • If the variables do not have an explanatory/response relationship, either variable can go on either axis

  5. SHOW: Draw the scatterplot • Don’t forget about labels and scale! [Scatterplot: Bushels (y-axis, scale 40 to 60) vs. Lbs of Fertilizer (x-axis, scale 30 to 80)]

  6. TELL: What does a scatterplot show us? • In most of our previous graphs, we were looking for center, shape, and spread of a single quantitative variable • This time we are looking at the relationship between two quantitative variables • If the two variables seem related, this is referred to as an association • Specifically, we are looking at the form, direction and strength of the association

  7. Form: Is it linear? • Our eventual goal is to create a model for the data • In order to decide which calculations to use, we need to first look at the form (shape) the pattern follows • A scatterplot has a linear form if a straight line could be used to describe it reasonably well • For now, we will simply describe form as linear or nonlinear [Figure panels: Linear, Nonlinear]

  8. Direction: Positive, Negative or No Association? • Once we decide that the form is linear, we turn to direction • If y tends to increase as x increases, this is a positive association • If y tends to decrease as x increases, this is a negative association • If y shows no overall tendency to increase or decrease as x increases, there is no association [Figure panels: Positive association, Negative association, No association]

  9. Strength: Strong, Moderate, Weak? • The last thing we should address is the strength of the relationship • The conclusions we draw about strength are highly subjective, especially if they are based strictly on looking at the scatterplot [Figure panels: Strong association, Moderate association, Weak association]

  10. Correlation Coefficient • r = correlation coefficient for linear relationships • Measures the strength and direction of a linear relationship between two quantitative variables
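The slides do not print a formula for r. One standard way to write it, consistent with the later remark that r is based on z-scores, is shown here; it assumes sample standard deviations s_x and s_y computed with an n - 1 divisor:

```latex
r = \frac{1}{n-1}\sum_{i=1}^{n} z_{x_i}\, z_{y_i},
\qquad
z_{x_i} = \frac{x_i - \bar{x}}{s_x},
\quad
z_{y_i} = \frac{y_i - \bar{y}}{s_y}
```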

  11. Calculating r • r = .9782 [Scatterplot: Bushels (y-axis) vs. Lbs of Fertilizer (x-axis)]
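The slide reports r = .9782 for the fertilizer data, but the data table did not survive in this transcript. The following is a rough sketch of the calculation, assuming the usual z-score definition of r. The function name `correlation` and the data values are made up for illustration; they are not the original data, so the result will not match .9782.

```python
from math import sqrt

def correlation(xs, ys):
    """Sample correlation r: sum of paired z-score products divided by n - 1."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations (n - 1 divisor).
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    # Multiply each point's z-scores together and add up the products.
    z_products = sum(
        ((x - mean_x) / sd_x) * ((y - mean_y) / sd_y)
        for x, y in zip(xs, ys)
    )
    return z_products / (n - 1)

# Hypothetical values for illustration only; NOT the data from the slides.
fertilizer_lbs = [30, 40, 50, 60, 70, 80]
bushels = [41, 46, 48, 53, 55, 59]

r = correlation(fertilizer_lbs, bushels)
print(f"r = {r:.4f}")
```

In practice a calculator or statistics package would do this step; the sketch only makes the arithmetic behind r visible.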

  12. What does r tell us? • Close to +1 = strong, positive linear association • Close to -1 = strong, negative linear association • Close to 0 = weak or no linear association • r = 1 or r = -1 means a perfect linear correlation

  13. Properties of r • r is a number between -1 and 1, inclusive • Since r is based on z-scores, it is not affected by shifting the data or re-scaling it by a positive factor (for example, changing units), and it has no units • The correlation of x with y is the same as the correlation of y with x (it doesn’t matter which variable is used as x or y; the correlation stays the same) • Remember that r only works for linear associations of quantitative variables • r is very sensitive to outliers, so be careful! • Even though we have this numerical calculation, strength is still subjective: a value such as 0.68 that is considered strong for one set of data might be considered weak for another
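Two of the properties above, that r is unchanged by shifting or re-scaling and that swapping x and y gives the same value, can be checked numerically. This is a minimal sketch using made-up data (not the values from the slides) and a hypothetical helper `correlation`, written in a form that is algebraically equivalent to the z-score definition:

```python
from math import sqrt

def correlation(xs, ys):
    """Correlation r computed from sums of squared and cross deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical values for illustration only.
x = [30, 40, 50, 60, 70, 80]
y = [41, 46, 48, 53, 55, 59]

r_original = correlation(x, y)

# Shift x by a constant and re-scale y by a positive factor (like a unit change).
x_shifted = [value + 100 for value in x]
y_rescaled = [value * 2.5 for value in y]
r_transformed = correlation(x_shifted, y_rescaled)

# Swap the roles of the two variables.
r_swapped = correlation(y, x)

print(f"original:    {r_original:.6f}")
print(f"transformed: {r_transformed:.6f}")
print(f"swapped:     {r_swapped:.6f}")
```

All three printed values should agree up to floating-point rounding. Multiplying one variable by a negative factor is a different case: it keeps the size of r but flips its sign.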

  14. Outliers • A scatterplot can also show us outliers • In this context, an outlier is a point which doesn’t seem to fit within the pattern formed by the rest of the data
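To see how much a single point that does not fit the pattern can move r, here is a small sketch. The data, the added outlier at (90, 40), and the helper `correlation` are all hypothetical and chosen only for illustration:

```python
from math import sqrt

def correlation(xs, ys):
    """Correlation r computed from sums of squared and cross deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical values that follow a clear upward linear pattern.
x = [30, 40, 50, 60, 70, 80]
y = [41, 46, 48, 53, 55, 59]
r_without = correlation(x, y)

# Add one made-up point far below the pattern formed by the rest of the data.
x_with_outlier = x + [90]
y_with_outlier = y + [40]
r_with = correlation(x_with_outlier, y_with_outlier)

print(f"r without the outlier: {r_without:.4f}")
print(f"r with the outlier:    {r_with:.4f}")
```

With these made-up numbers the single extra point pulls r down substantially, which is why it helps to look at the scatterplot before trusting the number.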

  15. Homework Pg. 542 # 12, 14, 16 Directions: Make a scatterplot of the data. Calculate the correlation coefficient and interpret what this means.
