PASW-SPSS STATISTICS • David P. Yens, Ph.D. • New York College of Osteopathic Medicine, NYIT dyens@nyit.edu • PRESENTATION 5 • REVIEW OF ANOVA • CORRELATION AND REGRESSION 2010 David Yens, Ph.D. NYCOM
ANALYSIS OF VARIANCE • Simple (one-way) • Used to determine whether there are differences in means among more than two groups. • Factorial • Used when the groups differ on more than one dimension (independent variable). • Examples: • 1. Simple: compare blood pressures resulting from the use of three treatments. • 2. Factorial: compare blood pressures resulting from the use of three treatments and between males and females.
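The F test behind the first example can be sketched in a few lines of Python. This is a minimal illustration, not the SPSS procedure itself: the blood-pressure readings are invented for the example, and the function name `one_way_anova_f` is an assumption.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of scores."""
    all_scores = [score for group in groups for score in group]
    grand_mean = sum(all_scores) / len(all_scores)
    group_means = [sum(group) / len(group) for group in groups]

    # Between-groups sum of squares: distance of each group mean from the grand mean.
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, group_means)
    )
    # Within-groups sum of squares: spread of scores around their own group mean.
    ss_within = sum(
        (score - mean) ** 2
        for group, mean in zip(groups, group_means)
        for score in group
    )

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical systolic blood pressures under three treatments (invented data).
treatment_a = [120, 125, 130]
treatment_b = [130, 135, 140]
treatment_c = [140, 145, 150]

f_stat, df_between, df_within = one_way_anova_f([treatment_a, treatment_b, treatment_c])
print(f"F({df_between}, {df_within}) = {f_stat:.2f}")
```

A large F means the group means differ by more than the within-group spread would suggest; the p-value is then read from the F distribution with these degrees of freedom, which SPSS reports directly.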
ANOVA • Determining where the differences lie after ANOVA • Planned contrasts • Post-hoc analyses
hsbdataB • Effect of father's education on: • Grades • Visualization test • Math achievement
ANOVA • ANALYZE • COMPARE MEANS • ONE-WAY ANOVA • OR • GENERAL LINEAR MODEL • UNIVARIATE • Several options are available
OTHER ANALYSIS OF VARIANCE METHODS • Repeated measures • Analysis of covariance • Test statistic: F
STATISTICAL ANALYSES • ANALYSIS OF VARIANCE (Repeated measures) • Used to compare before-and-after measures on the same individuals exposed to two or more treatments. • Example: Assess the increase in blood pressure for two groups exposed to different treatments.
REPEATED MEASURES ANOVA
CORRELATION AND REGRESSION (Morgan, Chapt. 8) • CORRELATION – Expresses relationship only • REGRESSION – Prediction of one variable from another. Implies a direction of influence but does NOT prove causality • MULTIPLE REGRESSION – Prediction of a target variable from 2 or more predictors (independent variables)
CORRELATION • The correlation coefficient (r) is a number between -1 and +1 whose sign is the same as the slope of the fitted line and whose magnitude reflects the degree of linear association between two variables • r², the coefficient of determination, expresses the proportion of variance in the dependent variable explained by the independent variable • r² is on a ratio scale: an r² of .50 explains twice as much variance as an r² of .25 • Interpretation of values
ASSUMPTIONS FOR PEARSON CORRELATION & SIMPLE REGRESSION • Linear relationship • Scores normally distributed • Outliers can have a major impact
VARIABLES FOR CORRELATION
Grades   MathAchievement
4         9.00
5        10.33
6         7.67
3         5.00
3        -1.67
5         1.00
6        12.00
4         8.00
ETC.
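As a rough illustration of what the correlation procedure computes, the sketch below calculates Pearson's r and r² by hand for the eight rows listed above. Because the full data set continues past these rows ("ETC."), the result will not match the SPSS output for the whole file; the function name `pearson_r` is an assumption.

```python
import math

# The eight (grades, math achievement) pairs shown on the slide; the real file has more rows.
grades = [4, 5, 6, 3, 3, 5, 6, 4]
math_achievement = [9.00, 10.33, 7.67, 5.00, -1.67, 1.00, 12.00, 8.00]


def pearson_r(x, y):
    """Pearson correlation: covariation of x and y scaled by their separate spreads."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


r = pearson_r(grades, math_achievement)
r_squared = r ** 2  # proportion of variance in math achievement explained by grades
print(f"r = {r:.3f}, r squared = {r_squared:.3f}")
```

For these eight rows r comes out at roughly .55, so grades account for about 30% of the variance in math achievement in this small subset.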
EXAMPLE FROM TEXT • Check assumptions
OBTAINING A SCATTERPLOT • GRAPHS • LEGACY DIALOGS • SCATTER/DOT
ADDING REGRESSION LINE • Now double-click the output chart
USING CHART BUILDER • GRAPHS • CHART BUILDER • OK • SELECT “Gallery” • SELECT “Scatter/Dot” • With mouse, move “Simple Scatter” to Chart Preview • Find/move “math achievement test” to vertical axis box • Find/move “grades in h.s.” to horizontal axis box • Click OK
TO OBTAIN A FIT LINE • Double-click on chart • SELECT “Elements” • SELECT “Interpolation line”
TO GET A CORRELATION BETWEEN THE 2 VARIABLES • ANALYZE • CORRELATE • BIVARIATE
CORRELATION EXAMPLE • Dayya (2005) looked at predictors of obesity. In one example, he plotted percent of calories from carbs against BMI to see whether there was a relationship, with the following result: • Dayya, D. Analysis of the CDC-NHANES Database to Identify Predictors Of Obesity in a Multiple Linear and Logistic Regression Model. New York Medical Journal, online, Dec. 2005.
REGRESSION • The simplest regression is y = a + bx, where y is the dependent variable (plotted on the vertical axis), x is the independent variable (plotted on the horizontal axis), a is the y-intercept, and b is the slope. • Refers to a mathematical equation that allows one variable (the target variable) to be predicted from another (the independent variable). • Implies a direction of influence; it does not prove causality. From Greenhalgh, T. How to read a paper: statistics for the non-statistician. II. BMJ, 315 (7105)
Simple Regression • The regression line is the straight line passing through the data that minimizes the sum of squared differences between the original data and the fitted points • This is least-squares analysis • Least squares is also the basis for ANOVA procedures • In ANOVA, the intercept term is equivalent to the grand mean
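The least-squares idea can be sketched directly. The example below fits y = a + bx to the eight grades and math achievement pairs from the "Variables for correlation" slide (only a subset of the full file), then checks that moving the slope away from the fitted value makes the sum of squared residuals larger. The function names are assumptions chosen for illustration.

```python
# The eight (grades, math achievement) pairs shown earlier; the real file has more rows.
grades = [4, 5, 6, 3, 3, 5, 6, 4]
math_achievement = [9.00, 10.33, 7.67, 5.00, -1.67, 1.00, 12.00, 8.00]


def least_squares_line(x, y):
    """Return (a, b) for the line y = a + b*x that minimizes squared residuals."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x  # the line passes through (mean_x, mean_y)
    return intercept, slope


def sum_squared_residuals(x, y, intercept, slope):
    """Sum of squared differences between observed y and the fitted points."""
    return sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))


a, b = least_squares_line(grades, math_achievement)
best = sum_squared_residuals(grades, math_achievement, a, b)
# Nudging the slope away from the fitted value increases the squared error.
worse = sum_squared_residuals(grades, math_achievement, a, b + 0.5)

print(f"math achievement = {a:.4f} + {b:.4f} * grades")
print(f"squared error at the fit: {best:.2f}; with a nudged slope: {worse:.2f}")
```

The same intercept and slope appear in the Coefficients table that ANALYZE, REGRESSION, LINEAR produces, there computed from the full data set.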
QUESTION • Can we predict math achievement from grades in high school? • Using the same variables as before: • ANALYZE • REGRESSION • LINEAR
REGRESSION EXAMPLE • We could look at the Dayya data again to predict BMI from percent of calories from carbs. Do you think we could obtain an accurate prediction? • Another use of regression might be to predict the number of fillings required during a 5-year period from the number of times teeth were brushed per week.
Multiple Regression • A more complex mathematical equation that allows the target variable to be predicted from two or more independent variables (often known as co-variables). • EXAMPLE: predicting blood pressure from age, height, weight, and drug dosage.
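A minimal sketch of how such an equation can be estimated, assuming ordinary least squares solved through the normal equations. It uses only two of the predictors named above (age and weight) to stay short, and every number in it is invented: the blood pressures were generated without noise from 50 + 0.5*age + 0.4*weight so the recovered coefficients are easy to check. The function name `fit_multiple_regression` is hypothetical.

```python
def fit_multiple_regression(rows, targets):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    # Prepend a column of ones so the first coefficient is the intercept.
    design = [[1.0] + [float(value) for value in row] for row in rows]
    k = len(design[0])

    # Augmented system (X'X | X'y).
    system = [
        [sum(r[i] * r[j] for r in design) for j in range(k)]
        + [sum(r[i] * y for r, y in zip(design, targets))]
        for i in range(k)
    ]

    # Gauss-Jordan elimination with partial pivoting.
    # Perfectly collinear predictors would give a zero pivot and a ZeroDivisionError.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(system[r][col]))
        system[col], system[pivot] = system[pivot], system[col]
        for row in range(k):
            if row != col:
                factor = system[row][col] / system[col][col]
                system[row] = [v - factor * p for v, p in zip(system[row], system[col])]

    return [system[i][k] / system[i][i] for i in range(k)]


# Invented subjects: (age in years, weight in kg) and blood pressure.
predictors = [(30, 70), (40, 85), (50, 65), (60, 90), (70, 75)]
blood_pressure = [93.0, 104.0, 101.0, 116.0, 115.0]

intercept, b_age, b_weight = fit_multiple_regression(predictors, blood_pressure)
print(f"BP = {intercept:.2f} + {b_age:.2f}*age + {b_weight:.2f}*weight")
```

With real data the fit is never exact, which is why the multiple R and the change in R² as predictors are added are needed to judge it.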
MULTIPLE REGRESSION • FINAL POINTS • Sample size – the number of subjects should be at least 5 (preferably 10) times the number of variables • The multiple R should be at least .7 • The change in R² should be at least a few percent • A gradual fall-off should be seen in the predictive contribution of each successive variable • Fewer predictor variables are better than many; too many make interpretation difficult • Analyze the influence of outliers