In this introduction to two-variable statistics, we explore techniques to visualize and analyze relationships between two sets of data. Building on one-variable statistics, we utilize scatter plots to illustrate potential correlations, determining whether they are positive, negative, or nonexistent. We also calculate correlation coefficients to quantify these relationships. Understanding dependent and independent variables is essential for proper analysis. This guide includes methods for creating scatter plots and interpreting correlation coefficients, enhancing your data analysis skills.
Two-Variable Statistics: An Introduction
We have studied one-variable statistics. We developed techniques to visualize one-variable data (frequency diagrams, box-and-whisker plots) and to summarize one-variable data (measures of central tendency and measures of spread).
Sometimes we are interested in seeing if a relationship exists between two sets of data. We can create visual displays of these relationships (scatter plots) and summarize this relationship (correlation coefficients and mathematical models).
Scatter Plots: When creating scatter plots:
• Make sure you can identify the dependent and independent variables.
• If drawing the plot by hand, pick an appropriate scale.
Describing the Scatter Plot: Once a scatter plot is created, we describe the relationship by how closely the points follow a linear pattern.
Positive Correlation: In a positive linear correlation, the dependent variable increases as the independent variable increases.
Negative Correlation: In a negative linear correlation, the dependent variable decreases as the independent variable increases.
Words we use to describe linear correlation:
• Weak
• Moderate
• Strong
A line of best fit is a straight line drawn so that it passes as close as possible to all of the points on a scatter plot.
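The slides do not say how the line of best fit is found. One common choice, assumed here, is the least-squares line, which minimizes the sum of squared vertical distances from the points to the line. The sketch below uses made-up data (hours studied and test marks) and a hypothetical helper name, `line_of_best_fit`; it is an illustration, not the method prescribed by the course.

```python
def line_of_best_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept.

    Assumption: "as close as possible" is taken to mean least squares.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations in x, and sum of cross-deviations of x and y
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: hours studied (independent) vs. test mark (dependent)
hours = [1, 2, 3, 4, 5]
marks = [52, 60, 61, 70, 77]
slope, intercept = line_of_best_fit(hours, marks)
print(f"mark = {slope:.2f} * hours + {intercept:.2f}")
```

A positive slope corresponds to a positive correlation and a negative slope to a negative one; the slope alone does not say whether the correlation is weak, moderate, or strong.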
http://www.regentsprep.org/regents/math/algebra/AD4/PracPlot.htm
Analyzing Two-Variable Data: The Correlation Coefficient
Which situation describes an example of negative correlation?
• Time worked and earnings
• Mark on a math test and student height
• Per capita income and crime rate
• None of the above
What is the dependent variable in a study of the amount of math retained and amount of TV watched?
• Type of TV watched
• Amount of TV
• Number of Facebook friends
• Amount of math retained
A scatter plot provides visual evidence of correlation, but classifying the correlation as weak, moderate, or strong is a subjective decision. A correlation coefficient puts a number on the strength of the correlation.
Formula for the correlation coefficient, r:
$$r = \frac{s_{xy}}{s_x \, s_y}$$
where $s_x$ is the standard deviation of the variable X, $s_y$ is the standard deviation of the variable Y, and $s_{xy}$ is the covariance of the variables X and Y. This formula is unwieldy for manual calculations.
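To make the definition concrete, here is a minimal Python sketch that computes r as the covariance divided by the product of the two standard deviations. It assumes sample statistics (dividing by n - 1); dividing by n throughout gives the same r, since the factor cancels. The function name `correlation_from_definition` and the data are invented for illustration.

```python
import math


def correlation_from_definition(xs, ys):
    """Compute r = s_xy / (s_x * s_y) using sample statistics (n - 1 divisor)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sample standard deviations of X and Y
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    # Sample covariance of X and Y
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)
    if s_x == 0 or s_y == 0:
        raise ValueError("r is undefined when one variable has no spread")
    return s_xy / (s_x * s_y)


# Hypothetical data: y falls by exactly 2 for every increase of 1 in x,
# so the points lie on a straight line with negative slope.
x_values = [1, 2, 3, 4, 5]
y_values = [10, 8, 6, 4, 2]
r_negative = correlation_from_definition(x_values, y_values)
print(round(r_negative, 4))
```

Because these made-up points lie exactly on a downward-sloping line, r comes out at -1 (up to floating-point rounding), the most negative value it can take.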
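Computational Formula for the Correlation Coefficient, r:
$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$
where n is the number of data pairs. This form is algebraically equivalent to the definition in terms of $s_{xy}$, $s_x$, and $s_y$. Calculate the correlation coefficient for the cancer and smoking data.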
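A widely used computational form of r needs only five running sums: the sums of x, y, xy, x squared, and y squared, plus the number of pairs n. The sketch below assumes that form. The cancer and smoking data are not reproduced in these notes, so the example uses made-up values (hours studied and test marks); the function name `correlation_from_sums` is likewise hypothetical.

```python
import math


def correlation_from_sums(xs, ys):
    """Compute r from the sums of x, y, xy, x^2 and y^2 (computational form)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)
    )
    if denominator == 0:
        raise ValueError("r is undefined when one variable has no spread")
    return numerator / denominator


# Hypothetical data, NOT the cancer and smoking data from the lesson
hours = [1, 2, 3, 4, 5]
marks = [52, 60, 61, 70, 77]
r = correlation_from_sums(hours, marks)
print(f"r = {r:.3f}")
```

For these values the sums are n = 5, Σx = 15, Σy = 320, Σxy = 1020, Σx² = 55, and Σy² = 20854, so r = 300 / √93500, roughly 0.98. The same table-of-sums layout works for a calculation by hand.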
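Interpreting the Correlation Coefficient: The value of r ranges from -1 to 1. Values near 1 or -1 indicate a strong linear correlation (positive or negative, respectively), and values near 0 indicate little or no linear correlation. Classify the value of the correlation coefficient for the cancer and smoking data.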
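The classification can be sketched as a small function. The cutoffs used here (1/3 and 2/3 on the absolute value of r) are an assumption: they are one convention among several and are not stated in these notes, so substitute the boundaries your course or textbook uses. The function name `describe_correlation` is hypothetical.

```python
def describe_correlation(r, weak_cutoff=1 / 3, strong_cutoff=2 / 3):
    """Describe r in words, e.g. 'strong positive'.

    The default cutoffs are an assumed convention, not a universal rule.
    """
    if not -1 <= r <= 1:
        raise ValueError("r must lie between -1 and 1")
    if r == 0:
        return "no linear correlation"
    direction = "positive" if r > 0 else "negative"
    size = abs(r)
    if size == 1:
        strength = "perfect"
    elif size >= strong_cutoff:
        strength = "strong"
    elif size >= weak_cutoff:
        strength = "moderate"
    else:
        strength = "weak"
    return f"{strength} {direction}"


print(describe_correlation(0.98))
print(describe_correlation(-0.5))
print(describe_correlation(0.1))
```

The sign of r gives the direction of the relationship and its absolute value gives the strength, which is why the function treats the two separately.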
For each scatter plot, estimate the value of the correlation coefficient, r.
Homework: p. 169 #5b-e, 6bc