Learn how to draw scatterplots, interpret association, calculate correlation, and analyze relationships between variables. Explore examples and understand the strength of linear relationships in data sets.
 
                
Two Quantitative Variables • Scatterplots • examples • how to draw them • Association • what to look for in a scatterplot • Correlation • strength of a linear relationship • how to calculate • good news and bad news
Scatterplot • BOATS vs. CARS (figure)
Made-up Examples • STATE AVE SCORE vs. PERCENT TAKING SAT
Made-up Examples • IQ vs. SHOE SIZE
Made-up Examples • JUDGE’S IMPRESSION vs. BAKING TEMP
Made-up Examples • LIFE EXPECTANCY vs. GDP PER CAPITA
Scatterplots: Which variable goes where? • RESPONSE VARIABLE goes on Y axis • (“Y”) (“dependent variable”) • EXPLANATORY VARIABLE goes on X axis • (“X”) (“independent variable”) • If neither is really a response variable, it doesn’t matter which variable goes where.
Scatterplots: Drawing Considerations • Don’t show the axes without a good reason • Don’t show gridlines without a good reason • Scales should cover the ranges of the variables: • outliers? • no need to include 0 • what if same units?
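The point about scales can be sketched in a few lines of Python. This is only an illustration under assumptions: the helper names `axis_limits` and `shared_limits`, the 5% margin, and the baking temperatures are all made up here, not taken from the slides. The idea is that the axis range follows the data (outliers included) rather than being forced to start at 0, and that two variables in the same units can share one range.

```python
def axis_limits(values, pad_fraction=0.05):
    """Return (low, high) axis limits that cover the data with a small margin.

    pad_fraction is a hypothetical tuning knob chosen for this sketch.
    The limits follow the data, so 0 is included only if the data reach it.
    """
    lo, hi = min(values), max(values)
    span = hi - lo
    # If every value is identical, use a margin of 1 unit so the axis has width.
    pad = 1.0 if span == 0 else span * pad_fraction
    return lo - pad, hi + pad


def shared_limits(xs, ys, pad_fraction=0.05):
    """For variables in the same units: one common range for both axes."""
    return axis_limits(list(xs) + list(ys), pad_fraction)


baking_temps = [250, 300, 350, 400, 450]  # made-up data
print(axis_limits(baking_temps))
```

Because the limits come from the minimum and maximum, an outlier stretches the range automatically; whether that is desirable is a judgment call for each plot.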
What to look for in a scatterplot… • Do the cases break up into separate clusters? • Are there outliers? • Is there an ASSOCIATION between the variables? OR are they INDEPENDENT? • ALWAYS DRAW THE PICTURE !!!!
Kinds of Association… • Positive vs. Negative • Strong vs. Weak • Linear vs. Non-linear
CORRELATION • CORRELATION (or the CORRELATION COEFFICIENT) measures the strength of a linear relationship. • If the relationship is non-linear, it measures the strength of the linear part of the relationship. But then it doesn’t tell the whole story. • Correlation can be positive or negative.
Computing correlation… • Replace each variable with its standardized version. • Take an “average” of ( xi’ times yi’ ):
Computing correlation • $r = \frac{1}{n-1} \sum_{i=1}^{n} x_i' \, y_i'$ • the sum is the sum of all the products • divide by n-1, not n • written r, or R, or Greek ρ (rho)
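The two steps (standardize, then "average" the products with n-1) can be sketched in Python as follows. This is a minimal illustration, assuming the standardized version means subtracting the mean and dividing by the sample standard deviation (itself computed with n-1). The function names and the cars/boats numbers are made up for this sketch.

```python
import math


def standardize(values):
    """Return the standardized version: (value - mean) / sample SD."""
    n = len(values)
    mean = sum(values) / n
    # Sample standard deviation, dividing by n-1 (assumed convention).
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    # Note: a constant variable has sd == 0, so this raises ZeroDivisionError.
    return [(v - mean) / sd for v in values]


def correlation(xs, ys):
    """Sum of the products of standardized values, divided by n-1."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 cases")
    zx, zy = standardize(xs), standardize(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)


cars = [20, 25, 30, 35, 40]   # made-up data
boats = [20, 35, 50, 60, 80]  # made-up data
r = correlation(cars, boats)
print(round(r, 3))
```

With these made-up numbers the points lie close to a rising line, so r comes out just below +1.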
Good things about correlation • It’s symmetric (the correlation of x and y is the same as the correlation of y and x) • It doesn’t depend on scale or units: adding a constant to either variable, or multiplying either by a positive constant, doesn’t change r (multiplying by a negative constant only flips its sign) • of course not; r depends only on the standardized versions • r is always in the range from -1 to +1 • +1 means perfect positive correlation; dots on line • -1 means perfect negative correlation; dots on line • 0 means no relationship, OR no linear relationship
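These properties are easy to check numerically. The sketch below uses a compact version of the standardize-and-average calculation (assuming sample standard deviations with n-1) and made-up data. The variable names are illustrative only. It also shows one caveat: multiplying a variable by a negative constant keeps the size of r but flips its sign.

```python
import math


def correlation(xs, ys):
    """Average (with n-1) of the products of standardized values."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / (n - 1))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / (n - 1))
    return sum(((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)) / (n - 1)


x = [1, 2, 3, 4, 5]  # made-up data
y = [2, 1, 4, 3, 6]  # made-up data

r = correlation(x, y)
symmetric = correlation(y, x)                     # swap the roles of x and y
shifted = correlation([v + 100 for v in x], y)    # add a constant
rescaled = correlation([v * 2.54 for v in x], y)  # change units (positive factor)
flipped = correlation([-v for v in x], y)         # negative factor flips the sign

print(r, symmetric, shifted, rescaled, flipped)
```

Up to floating-point rounding, the first four printed values agree and the last is their negative.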
Bad things about correlation • Sensitive to outliers • Misses non-linear relationships • Doesn’t imply causality
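The first two weaknesses can be seen with small made-up data sets (the third, causality, is not something a calculation can show). The sketch below assumes the same standardize-and-average calculation with n-1. All names and numbers are invented for illustration.

```python
import math


def correlation(xs, ys):
    """Average (with n-1) of the products of standardized values."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs) / (n - 1))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys) / (n - 1))
    return sum(((x - mx) / sx) * ((y - my) / sy) for x, y in zip(xs, ys)) / (n - 1)


# Sensitive to outliers: a weakly related cloud, then the same cloud
# with one extreme case added.
x = [1, 2, 3, 4, 5]
y = [3, 1, 4, 2, 3]
r_without = correlation(x, y)
r_with = correlation(x + [20], y + [20])

# Misses non-linear relationships: y is completely determined by x,
# but the relationship is a symmetric curve, not a line.
curve_x = [-3, -2, -1, 0, 1, 2, 3]
curve_y = [v * v for v in curve_x]
r_curve = correlation(curve_x, curve_y)

print(r_without, r_with, r_curve)
```

A single added point moves r from weak to nearly +1, and the perfect quadratic relationship gives r of essentially 0, which is why drawing the picture comes first.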