Biomedical Statistics. Department: Department of Electrical Engineering, National Central University (NCUEE). Advisor: Jang-Zern Tsai (蔡章仁). Name: Jacky Tu (凃建宇).
Outline • Two Sample Hypothesis Testing for Correlation • Multiple Correlation • Spearman’s Rank Correlation
Two Sample Hypothesis Testing for Correlation • Case 1: Independent samples • Case 2: Dependent samples
Two Sample Hypothesis Testing for Correlation with independent samples Example: A sample of 40 couples from London is taken comparing the husband’s IQ with his wife’s. The correlation coefficient for the sample is .77. Is this significantly different from the correlation coefficient of .68 for a sample of 30 couples from Paris?
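We transform each sample correlation with the Fisher transformation, r' = 0.5 ln((1 + r)/(1 - r)), and compute z = (r1' - r2') / sqrt(1/(n1 - 3) + 1/(n2 - 3)). Then we can perform either one of the following tests: compare |z| with the critical value of the standard normal distribution, or compare the two-tailed p-value with α. Some Excel functions: FISHER(r) = 0.5 ln((1 + r)/(1 - r)), the Fisher transformation; SQRT(number) = square root of number; NORMSDIST(z) = standard normal cumulative distribution function.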
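The London/Paris example can be worked through in a few lines. This is a minimal sketch, assuming the usual Fisher z-test for two independent correlations; the function name fisher_z_test is illustrative and not from the slides. math.atanh plays the role of Excel's FISHER and NormalDist().cdf the role of NORMSDIST.

```python
import math
from statistics import NormalDist


def fisher_z_test(r1, n1, r2, n2):
    """Two-sided z-test for the difference of two independent correlations."""
    z1 = math.atanh(r1)  # Fisher transformation, same as Excel's FISHER(r1)
    z2 = math.atanh(r2)
    se = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))  # standard error of z1 - z2
    z = (z1 - z2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # NORMSDIST equivalent
    return z, p_value


# Values from the example: 40 London couples (r = .77), 30 Paris couples (r = .68)
z, p_value = fisher_z_test(0.77, 40, 0.68, 30)
z_crit = NormalDist().inv_cdf(1 - 0.05 / 2)  # two-tailed critical value, alpha = .05
print(f"z = {z:.3f}, p-value = {p_value:.3f}, z_crit = {z_crit:.3f}")
```

With these inputs z is about 0.76, which is below the critical value of about 1.96, and the p-value is about 0.45, so under this sketch the two correlations are not significantly different at α = .05.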
Two Sample Hypothesis Testing for Correlation with dependent samples. What is the difference from Case 1? The samples are dependent either because the two correlations have one variable in common, or because the same two variables are correlated at one moment in time and again at another moment in time. Example: IQ tests are given to 20 couples. The oldest son of each couple is also given the IQ test, with the scores displayed in Figure 1. We would like to know whether the correlation between son and mother is significantly different from the correlation between son and father.
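We use the following test statistic: t = (r_xz - r_yz) * sqrt((n - 3)(1 + r_xy) / (2|S|)), which has a t distribution with n - 3 degrees of freedom, where S is the 3 × 3 sample correlation matrix and |S| = 1 - r_xy^2 - r_xz^2 - r_yz^2 + 2 r_xy r_xz r_yz. Here z is the variable common to both correlations (the son's score). Since p-value = .042 < .05 = α, we reject the null hypothesis and conclude that the correlation between mother and son is significantly different from the correlation between father and son.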
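A minimal sketch of a test statistic for two dependent correlations that share one variable, assuming Hotelling's t with n - 3 degrees of freedom. The function name and the input correlations are hypothetical; the scores in Figure 1 are not reproduced here, so this does not recompute the p-value of .042.

```python
import math


def dependent_corr_t(r_xz, r_yz, r_xy, n):
    """Hotelling-style t statistic for H0: rho_xz = rho_yz, with z the shared variable."""
    # Determinant of the 3 x 3 correlation matrix of x, y and z
    det_s = 1 - r_xy**2 - r_xz**2 - r_yz**2 + 2 * r_xy * r_xz * r_yz
    t = (r_xz - r_yz) * math.sqrt((n - 3) * (1 + r_xy) / (2 * det_s))
    return t, n - 3  # statistic and its degrees of freedom


# Hypothetical correlations for illustration only (not the Figure 1 data)
t_stat, df = dependent_corr_t(r_xz=0.60, r_yz=0.30, r_xy=0.40, n=20)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

The resulting t can be converted to a two-tailed p-value with a t distribution function such as Excel's TDIST(ABS(t), df, 2).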
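Multiple Correlation. We can also calculate the correlation between more than two variables. Definition 1: Given variables x, y and z, where x and y are the independent variables and z is the dependent variable, the multiple correlation coefficient is R_z,xy = sqrt((r_xz^2 + r_yz^2 - 2 r_xz r_yz r_xy) / (1 - r_xy^2)). Its square, R_z,xy^2 (also written R^2), is the multiple coefficient of determination.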
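Multiple Correlation (Cont.) Definition 2: The adjusted multiple coefficient of determination is R_adj^2 = 1 - (1 - R^2)(n - 1)/(n - k - 1), where R^2 is the multiple coefficient of determination, n = the sample size and k = the number of independent variables.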
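The two definitions can be sketched as follows, assuming the standard two-predictor formula for R^2 and the usual adjustment for sample size n and k predictors. The function names and the input correlations are hypothetical, chosen only for illustration.

```python
def multiple_r2(r_xz, r_yz, r_xy):
    """Multiple coefficient of determination of z on x and y, from pairwise correlations."""
    return (r_xz**2 + r_yz**2 - 2 * r_xz * r_yz * r_xy) / (1 - r_xy**2)


def adjusted_r2(r2, n, k):
    """Adjust R^2 for sample size n and number of independent variables k."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Hypothetical pairwise correlations and sample size
r2 = multiple_r2(r_xz=0.5, r_yz=0.4, r_xy=0.3)
r2_adj = adjusted_r2(r2, n=50, k=2)
print(f"R^2 = {r2:.4f}, adjusted R^2 = {r2_adj:.4f}")
```

The adjusted value is always a little smaller than R^2 for k ≥ 1, since it penalizes each additional independent variable.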
Example: By using Excel's Correlation data analysis tool, we can get the correlation coefficients for the data in the example.
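We use the data above to obtain the values rPW, rPI, and rWI. Definition 3: The partial correlation of x and z holding y constant is r_xz,y = (r_xz - r_xy r_yz) / sqrt((1 - r_xy^2)(1 - r_yz^2)). In the semi-partial correlation, the correlation between x and y is eliminated, but the correlations between x and z and between y and z are not: r_z(x,y) = (r_xz - r_xy r_yz) / sqrt(1 - r_xy^2).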
Example: Suppose we want to know the relationship between GPA (grade point average) and salary, but IQ may correlate well with both GPA and salary. To test this, we need to determine the correlation between GPA and salary after eliminating the influence of IQ, that is, the partial correlation r(GS,I).
If we go on to calculate the partial correlation r(PW,I) and the semi-partial correlation rP(W,I), we can then check the property numerically.
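A minimal sketch of such a numerical check, assuming the standard first-order formulas for the partial and semi-partial correlations. One identity consistent with these definitions is R_z,xy^2 = r_yz^2 + r_z(x,y)^2; whether this is the exact form shown on the original slide is an assumption. The function names and input correlations are hypothetical and are not the poverty data.

```python
import math


def partial_corr(r_xz, r_xy, r_yz):
    """Correlation of x and z holding y constant."""
    return (r_xz - r_xy * r_yz) / math.sqrt((1 - r_xy**2) * (1 - r_yz**2))


def semipartial_corr(r_xz, r_xy, r_yz):
    """Correlation of z with the part of x that is not explained by y."""
    return (r_xz - r_xy * r_yz) / math.sqrt(1 - r_xy**2)


def multiple_r2(r_xz, r_yz, r_xy):
    """Multiple coefficient of determination of z on x and y."""
    return (r_xz**2 + r_yz**2 - 2 * r_xz * r_yz * r_xy) / (1 - r_xy**2)


# Hypothetical pairwise correlations
r_xz, r_yz, r_xy = 0.5, 0.4, 0.3
lhs = multiple_r2(r_xz, r_yz, r_xy)
rhs = r_yz**2 + semipartial_corr(r_xz, r_xy, r_yz) ** 2
print(f"partial = {partial_corr(r_xz, r_xy, r_yz):.4f}")
print(f"R^2 = {lhs:.6f}, r_yz^2 + semi-partial^2 = {rhs:.6f}")
```

The two printed quantities on the last line agree, which is the sense in which the semi-partial correlation measures the extra variance explained by x once y is already in the model.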
Since the coefficient of determination is a measure of the portion of variance attributable to the variables involved, we can look at the meaning of the concepts defined above using the following Venn diagram, where the rectangle represents the total variance of the poverty variable. From the diagram we can calculate the breakdown of the variance for poverty.
We can calculate B in a number of ways: (A + B) – A, (B + C) – C, or (A + B + C) – (A + C), where D = 1 – (A + B + C).
This follows from Property 1: if the independent variables are mutually independent (r_xy = 0), then R_z,xy^2 = r_xz^2 + r_yz^2.
Spearman's Rank Correlation. Definition: Spearman's rho is the correlation coefficient r computed on the ranks of the data rather than on the raw values; like r, it ranges from -1 to 1. Example: Is IQ associated with the number of hours spent listening to rap music per month? The ranks can be obtained with Excel's function RANK.AVG(A4,A$4:A$13,1). Pearson's correlation = CORREL(A4:A13,B4:B13) = -0.036. Spearman's rho = CORREL(C4:C13,D4:D13) = -0.115.
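The rank-then-correlate procedure can be sketched in plain Python. rank_avg is meant to mimic Excel's RANK.AVG in ascending order (ties share their average rank) and pearson mimics CORREL; both function names and the data are hypothetical, since the spreadsheet values are not reproduced here.

```python
import math


def rank_avg(values):
    """Ascending ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient, as computed by CORREL."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def spearman(xs, ys):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(rank_avg(xs), rank_avg(ys))


# Hypothetical IQ scores and monthly listening hours
iq = [100, 110, 95, 120, 110]
hours = [10, 4, 12, 1, 6]
print(rank_avg(iq), spearman(iq, hours))
```

Because only the ranks matter, any strictly increasing relationship gives rho = 1 even when it is far from linear.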
This shows that there isn't much of a correlation between IQ and listening to rap music, although Pearson's correlation (-0.036) is closer to zero than Spearman's rho (-0.115). When there are no ties in the ranking, there is an alternative way of calculating Spearman's rho, using the following property: rho = 1 - 6 Σ d_i^2 / (n(n^2 - 1)), where d_i = rank x_i – rank y_i.
If we use the property above to redo the example, we get the same result as CORREL(C4:C13,D4:D13) = -0.115.
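A minimal sketch of the shortcut based on the rank differences d_i, which applies only when there are no ties. The function name and the ranks below are hypothetical, not the ranks from the spreadsheet.

```python
def spearman_from_ranks(rank_x, rank_y):
    """Spearman's rho via 1 - 6 * sum(d_i^2) / (n (n^2 - 1)); valid only without ties."""
    n = len(rank_x)
    d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * d2 / (n * (n**2 - 1))


# Hypothetical tie-free ranks for five observations
rank_x = [1, 2, 3, 4, 5]
rank_y = [2, 1, 4, 3, 5]
rho = spearman_from_ranks(rank_x, rank_y)
print(f"rho = {rho:.3f}")
```

Here the sum of squared differences is 4, so rho = 1 - 24/120 = 0.8, the same value the Pearson correlation of these two rank lists gives.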
Example: Repeat the analysis for the example from One Sample Hypothesis Testing for Correlation using Spearman's rho. A study is designed to check the relationship between smoking and longevity. A sample of 15 men 50 years and older was taken, and the average number of cigarettes smoked per day and the age at death were recorded, as summarized in the table in Figure 1. Can we conclude from the sample that longevity is independent of smoking? Spearman's rho is the correlation coefficient on the ranked data, namely CORREL(C5:C19,D5:D19) = -.674.
We now use the Spearman's Rho Table to find the critical value for the two-tail test where n = 15 and α = .05. Interpolating between the values for n = 14 and 16, we get a critical value of .525. Since the absolute value of rho is larger than the critical value, we reject the null hypothesis that there is no correlation between cigarette smoking and longevity. Since n = 15 ≥ 10, we can use a t-test instead of the table, with t = rho * sqrt((n - 2)/(1 - rho^2)) and n - 2 = 13 degrees of freedom. Since |t| = 3.29 > 2.16 = tcrit = TINV(.05,13), we again conclude that there is a significant negative correlation between the number of cigarettes smoked and longevity.
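The t-test above can be reproduced from the two numbers quoted in the example. This is a minimal sketch, assuming the usual t approximation for Spearman's rho with n - 2 degrees of freedom; the function name is illustrative, and the critical value 2.16 is taken from the text rather than computed.

```python
import math


def spearman_t(rho, n):
    """t statistic for H0: no rank correlation, with n - 2 degrees of freedom."""
    return rho * math.sqrt((n - 2) / (1 - rho**2))


rho, n = -0.674, 15        # values from the smoking and longevity example
t_stat = spearman_t(rho, n)
t_crit = 2.16              # TINV(.05, 13), as quoted in the example
reject = abs(t_stat) > t_crit
print(f"|t| = {abs(t_stat):.2f}, reject H0: {reject}")
```

This gives |t| of about 3.29, matching the value in the example, so the null hypothesis of no correlation is rejected.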