
Hypothesis Tests II

Hypothesis Tests II. The normal distribution. Normally distributed data. Normally distributed means. First, let's consider a simpler problem… We are testing whether the mean of a population (Y) equals a particular value.


Presentation Transcript


  1. Hypothesis Tests II

  2. The normal distribution

  3. Normally distributed data

  4. Normally distributed means

  5. First, let's consider a simpler problem… We are testing the equality of the mean of a population (Y) to a particular value. Now, if the null hypothesis H0 is assumed, what do we know? We have some idea about the distribution of the sample mean. We need a measuring device that is sensitive to variations in H0, or in other words to deviations from the statement therein…
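One possible "measuring device" of this kind is the standardized distance between the sample mean and the hypothesized value, which the later slides call the t statistic. The sketch below is a minimal illustration, not the presenter's own computation: the function name `one_sample_t`, the hypothesized values, and the measurements are all invented for the example.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Standardized distance of the sample mean from the hypothesized mean mu0."""
    n = len(sample)
    mean = statistics.fmean(sample)
    sd = statistics.stdev(sample)        # sample s.d., uses the (n-1) divisor
    standard_error = sd / math.sqrt(n)   # estimated s.d. of the sample mean
    return (mean - mu0) / standard_error


# Hypothetical measurements, invented for illustration only.
sample = [4.8, 5.3, 5.1, 4.9, 5.4, 5.2, 5.0, 5.3]

t_at_5 = one_sample_t(sample, mu0=5.0)  # hypothesized value close to the data
t_at_6 = one_sample_t(sample, mu0=6.0)  # hypothesized value far from the data

print(f"t for H0: mean = 5.0 -> {t_at_5:.3f}")
print(f"t for H0: mean = 6.0 -> {t_at_6:.3f}")
```

The statistic grows in magnitude as the hypothesized value moves away from the sample mean, which is the sensitivity the slide asks for.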

  6. (z1, z2) has a bivariate normal distribution.

  7. If (z1, z2) has a bivariate normal distribution, then the pdf of the points (z1 - z̄, z2 - z̄), where z̄ is the sample mean, is a one-dimensional normal distribution. This distribution is over the line x + y = 0, because all points of the form (z1 - z̄, z2 - z̄) are situated on this line. If (z1, z2, z3) has a multivariate normal distribution, then the pdf of the points (z1 - z̄, z2 - z̄, z3 - z̄) is a two-dimensional normal distribution. This distribution is over the plane x + y + z = 0, because all points of this form are situated on this plane. (Hard to draw.) This is how one dimension (degree of freedom) is lost!

  8. That means that even though the points lie in a two-dimensional space, the probability distribution function defined over them is essentially one-dimensional. The situation resembles the following: assume we have two standard normally distributed random variables, X1 and X2. Then the sum of their squares, X1² + X2², does not necessarily have a chi-squared distribution with two degrees of freedom. Why?

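  9. Consider the case where X1 = X2. Then X1² + X2² = 2·X1², where X1² has a chi-square distribution with one degree of freedom, so the sum is a scaled chi-square variable with one degree of freedom rather than two. Hence, unless X1 and X2 are independent, X1² + X2² does not necessarily have a chi-square distribution with two degrees of freedom. What is the distribution in that case?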
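A small simulation can make the role of independence concrete. The sketch below assumes two standard normal variables (the names `x1` and `x2`, the seed, and the sample size are choices made for this illustration) and compares the sum of squares when the two are independent with the extreme dependent case where the second variable equals the first. A chi-square distribution with two degrees of freedom has mean 2 and variance 4; twice a chi-square with one degree of freedom has mean 2 but variance 8.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible
N = 100_000

independent = []  # X1^2 + X2^2 with X1, X2 independent
identical = []    # X1^2 + X2^2 with X2 = X1, i.e. 2 * X1^2

for _ in range(N):
    x1 = random.gauss(0.0, 1.0)
    x2 = random.gauss(0.0, 1.0)
    independent.append(x1 * x1 + x2 * x2)
    identical.append(2.0 * x1 * x1)

mean_ind = statistics.fmean(independent)
var_ind = statistics.pvariance(independent)
mean_same = statistics.fmean(identical)
var_same = statistics.pvariance(identical)

print(f"independent: mean={mean_ind:.2f}, variance={var_ind:.2f}")
print(f"X2 = X1:     mean={mean_same:.2f}, variance={var_same:.2f}")
```

Both sums have nearly the same mean, but the dependent case has roughly twice the variance, so it cannot follow a chi-square distribution with two degrees of freedom.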

  10. The t-distribution

  11. That is why we divide by (n-1) in calculating the sample s.d.
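The lost degree of freedom can be checked numerically: deviations from the sample mean always sum to zero, so only n-1 of them are free to vary. The following sketch uses an invented five-value sample (the numbers are hypothetical) and compares the n and n-1 divisors.

```python
import statistics

# Hypothetical sample, invented for illustration only.
sample = [2.0, 4.0, 4.0, 5.0, 10.0]
n = len(sample)

mean = sum(sample) / n
deviations = [x - mean for x in sample]

# The deviations are constrained to sum to zero: one degree of freedom is used up.
deviation_sum = sum(deviations)

sum_of_squares = sum(d * d for d in deviations)
var_divide_by_n = sum_of_squares / n               # divides by n
var_divide_by_n_minus_1 = sum_of_squares / (n - 1) # divides by n-1
sample_sd = var_divide_by_n_minus_1 ** 0.5

print("sum of deviations:", deviation_sum)
print("variance with n:", var_divide_by_n)
print("variance with n-1:", var_divide_by_n_minus_1)
print("sample s.d.:", sample_sd, "| statistics.stdev:", statistics.stdev(sample))
```

The standard library's `statistics.stdev` uses the same n-1 divisor, so the last two printed values agree.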

  12. One sample t-test

  13. Two-sample tests: B and A are types of seeds.

  14. Numerical Example (wheat again)

  15. Summary: We have so far seen what a good test statistic (null distribution) looks like. The distribution that we have selected is a textbook distribution. Could we pick others?

  16. Choosing Test Statistic

  17. The t statistic

  18. The Kolmogorov-Smirnov statistic
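As a rough illustration of this statistic, the two-sample Kolmogorov-Smirnov statistic is the largest vertical gap between the two empirical cumulative distribution functions. The sketch below is a plain-Python version written for this note; the helper name `ks_two_sample` and the data are hypothetical, and it returns only the statistic, not a p-value.

```python
def ks_two_sample(a, b):
    """Largest absolute difference between the empirical CDFs of samples a and b."""

    def ecdf(sample, x):
        # Fraction of the sample that is less than or equal to x.
        return sum(1 for v in sample if v <= x) / len(sample)

    # The gap can only change at observed values, so checking those is enough.
    points = sorted(set(a) | set(b))
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in points)


# Hypothetical samples, invented for illustration only.
a = [1.0, 2.0, 3.0, 4.0]
b = [3.5, 4.5, 5.5, 6.5]

d = ks_two_sample(a, b)
print("KS statistic:", d)
```

Unlike the t statistic, this measure reacts to any difference in the shape of the two distributions, not only to a shift in their means.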

  19. Comparing the test statistics

  20. Sensitivity to specific alternatives

  21. Discussion

  22. Or… • We need to add in additional assumptions, such as equality of the standard deviations of the samples.
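With the equal-standard-deviation assumption, the two samples can share one pooled variance estimate. The sketch below shows the pooled two-sample t statistic under that assumption; the function name `pooled_t` and the yields for seed types A and B are invented for the example and are not the presentation's data.

```python
import math
import statistics


def pooled_t(a, b):
    """Two-sample t statistic assuming both populations share one standard deviation."""
    na, nb = len(a), len(b)
    var_a = statistics.variance(a)  # (n-1) divisor
    var_b = statistics.variance(b)
    # Pool the two variance estimates, weighting by their degrees of freedom.
    pooled_var = ((na - 1) * var_a + (nb - 1) * var_b) / (na + nb - 2)
    standard_error = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    t = (statistics.fmean(a) - statistics.fmean(b)) / standard_error
    degrees_of_freedom = na + nb - 2
    return t, degrees_of_freedom


# Hypothetical yields for seed types A and B, invented for illustration only.
yield_a = [10.0, 12.0, 11.0, 13.0]
yield_b = [14.0, 15.0, 13.0, 16.0]

t, df = pooled_t(yield_a, yield_b)
print(f"t = {t:.4f} with {df} degrees of freedom")
```

Under the null hypothesis of equal means, this statistic is compared with a t distribution on na + nb - 2 degrees of freedom.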

  23. Two-sample tests: B and A are types of seeds. Remember?

  24. Contingency Tables (Cross-Tabs)

  25. We use cross-tabulation when: • We want to look at relationships among two or three variables. • We want a descriptive statistical measure to tell us whether differences among groups are large enough to indicate some sort of relationship among variables.

  26. Cross-tabs are not sufficient to: • Tell us the strength or actual size of the relationships among two or three variables. • Test a hypothesis about the relationship between two or three variables. • Tell us the direction of the relationship among two or more variables. • Look at relationships between one nominal or ordinal variable and one ratio or interval variable unless the range of possible values for the ratio or interval variable is small. What do you think a table with a large number of ratio values would look like?

  27. Because we use tables in these ways, we can set up some decision rules about how to use tables • Independent variables should be column variables. • If you are not looking at independent and dependent variable relationships, use the variable that can logically be said to influence the other as your column variable. • Using this rule, always calculate column percentages rather than row percentages. • Use the column percentages to interpret your results.

  28. For example, • If we were looking at the relationship between gender and income, gender would be the column variable and income would be the row variable. Logically gender can determine income. Income does not determine your gender. • If we were looking at the relationship between ethnicity and location of a person’s home, ethnicity would be the column variable. • However, if we were looking at the relationship between gender and ethnicity, one does not influence the other. Either variable could be the column variable.
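The column-percentage rule can be sketched in a few lines. The example below assumes gender is the column (independent) variable and income band is the row variable; the records and the band labels "low" and "high" are hypothetical.

```python
from collections import Counter

# Hypothetical (gender, income band) records, invented for illustration only.
records = [
    ("female", "low"), ("female", "low"), ("female", "low"), ("female", "high"),
    ("male", "low"), ("male", "high"), ("male", "high"), ("male", "high"),
]

counts = Counter(records)
columns = sorted({gender for gender, _ in records})  # column variable
rows = sorted({income for _, income in records})     # row variable

# Each column is percentaged on its own total, so every column sums to 100.
column_totals = {c: sum(counts[(c, r)] for r in rows) for c in columns}
column_pct = {
    (c, r): 100.0 * counts[(c, r)] / column_totals[c]
    for c in columns
    for r in rows
}

for r in rows:
    cells = "  ".join(f"{c}: {column_pct[(c, r)]:5.1f}%" for c in columns)
    print(f"{r:>5}  {cells}")
```

Reading across a row then compares the groups directly, for instance the share of each gender in the low income band.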

  29. Contingency Tables (Cross-Tabs): How do we measure the relationship?

  30. What do we EXPECT if there is no relationship?

  31. 3.18

  32. RESULT • This test statistic has a χ2 distribution with (2-1)(2-1) = 1 degree of freedom. • The critical value at α = .01 of the χ2 distribution with 1 degree of freedom is 6.63. • Thus we do not reject the null hypothesis that the two proportions are equal, i.e., that the drug is equally effective for female and male patients.
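The whole procedure, from expected counts under "no relationship" to the comparison with the critical value, can be sketched as follows. The 2x2 counts are hypothetical and are not the table behind the slides, so the statistic here differs from the value reported in the presentation; only the critical value 6.63 is taken from the text above.

```python
def chi_square_statistic(observed):
    """Pearson chi-square statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Expected count if rows and columns are unrelated.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (obs - expected) ** 2 / expected

    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return chi2, df


# Hypothetical counts: rows = female, male; columns = improved, not improved.
observed = [
    [30, 20],
    [20, 30],
]

chi2, df = chi_square_statistic(observed)
critical_value = 6.63  # chi-square, 1 degree of freedom, alpha = .01 (from the slide)
reject = chi2 > critical_value

print(f"chi-square = {chi2:.2f}, df = {df}, reject H0: {reject}")
```

With these invented counts the statistic falls below the critical value, so the null hypothesis of equal proportions is not rejected, mirroring the conclusion on the slide.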

  33. INTRODUCTION TO ANOVA • The easiest way to understand ANOVA is to generate a tiny data set (using GLM): As a first step, set the mean to 5 for the data set with 10 cases. In the table below, all 10 cases have a score of 5 at this point.

  34. The next step is to add the effects of the IV. Suppose that the effect of the treatment at level a1 is to raise scores by 2 units and the effect of the treatment at level a2 is to lower scores by 2 units.

  35. The changes produced by treatment are the deviations of the scores from the mean. Summed over all of these cases, the squared deviations give the sum of the (squared) effects of treatment if all cases are influenced identically by the various levels of A and there is no error.

  36. The third step is to complete the GLM with the addition of error.
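The three steps can be replayed in code. The mean of 5, the 10 cases, and the effects of +2 and -2 come from the slides; everything else in this sketch is an assumption made for illustration: the level labels `a1` and `a2`, the split into 5 cases per level, and the error values, which are chosen to sum to zero within each level.

```python
grand_mean = 5.0                        # step 1: every case starts at the mean
effects = {"a1": 2.0, "a2": -2.0}       # step 2: treatment effects of the IV

# Step 3: hypothetical errors (assumed: 5 cases per level, zero sum within each level).
errors = {
    "a1": [2.0, -1.0, 0.0, -2.0, 1.0],
    "a2": [0.0, 1.0, -1.0, 2.0, -2.0],
}

# GLM: score = grand mean + treatment effect + error
scores = {
    level: [grand_mean + effects[level] + e for e in errors[level]]
    for level in effects
}

# Sum of squared treatment effects over all cases (no error involved).
ss_treatment = sum(len(scores[level]) * effects[level] ** 2 for level in effects)
# Sum of squared errors over all cases.
ss_error = sum(e * e for level in errors for e in errors[level])
# Total sum of squared deviations of the scores from the grand mean.
ss_total = sum((y - grand_mean) ** 2 for level in scores for y in scores[level])

print("scores:", scores)
print("SS treatment:", ss_treatment, "SS error:", ss_error, "SS total:", ss_total)
```

Because the assumed errors sum to zero within each level, the total sum of squares splits exactly into the treatment part and the error part.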
