Tests

Tests. Jean-Yves Le Boudec. Contents. The Neyman Pearson framework Likelihood Ratio Tests ANOVA Asymptotic Results Other Tests. Tests. Tests are used to give a binary answer to hypotheses of a statistical nature Ex: is A better than B?

Presentation Transcript


  1. Tests Jean-Yves Le Boudec

  2. Contents • The Neyman-Pearson framework • Likelihood Ratio Tests • ANOVA • Asymptotic Results • Other Tests

  3. Tests • Tests are used to give a binary answer to hypotheses of a statistical nature • Ex: is A better than B? • Ex: does this data come from a normal distribution? • Ex: does factor n influence the result?

  4. Example: Non Paired Data • Is red better than blue? • For data set (a) the answer is clear (by inspection of the confidence interval); no test is required

  5. Is this data normal?

  6. 5.1 The Neyman-Pearson Framework • Given: a data set and a model with parameter θ (that, we believe, explains the data) • Two hypotheses on θ: H0 (null hypothesis) and H1 (alternative hypothesis) • Nested model: the set of parameter values allowed under H0 is a set of smaller dimension than the full parameter set

  7. Example: Non Paired Data; Is Red better than Blue? • Model: … and … are two independent iid samples and …

  8. Example: Non Paired Data; Is Red better than Blue? ANOVA Model • Model: … and … are two independent iid samples and …

  9. Critical Region, Size and Power • Critical region: a set of possible data values such that if the data falls in it, we reject H0 • Type 1 error: reject H0 when H0 is true. Size of a test = maximum probability of a type 1 error; the size should be small • Type 2 error: accept H0 when H1 is true. Power function = 1 - probability of a type 2 error; the power should be large • Neyman-Pearson framework: design a test that maximizes the power subject to a bound on the size
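
As a rough illustration of these definitions, the sketch below computes the size and the power function of a one-sided test on the mean of n iid normal observations. The setting is an assumption chosen for the example and is not taken from the slides: the standard deviation `sigma` is known, H0 is "mean <= 0", and H0 is rejected when the sample mean exceeds `threshold`.

```python
from statistics import NormalDist

def rejection_probability(mu, threshold, sigma, n):
    """P(sample mean > threshold) when the data is iid N(mu, sigma^2)."""
    # The sample mean of n iid N(mu, sigma^2) values is N(mu, sigma^2 / n).
    sample_mean = NormalDist(mu, sigma / n ** 0.5)
    return 1.0 - sample_mean.cdf(threshold)

sigma, n = 2.0, 25  # hypothetical values

# H0: mu <= 0, H1: mu > 0. The largest type 1 error probability over H0
# is reached at mu = 0, so the size is the rejection probability at mu = 0.
threshold = NormalDist(0.0, sigma / n ** 0.5).inv_cdf(0.95)
size = rejection_probability(0.0, threshold, sigma, n)

# Power function evaluated at a few values of mu inside H1.
power_curve = {mu: rejection_probability(mu, threshold, sigma, n)
               for mu in (0.25, 0.5, 1.0, 2.0)}

print(f"threshold = {threshold:.3f}, size = {size:.3f}")
for mu, power in power_curve.items():
    print(f"mu = {mu}: power = {power:.3f}")
```

The power is close to the size for values of the mean just above 0 and approaches 1 as the mean moves away from H0.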

  10. Example: Paired Data; Is A better than B? • Reduction in execution time • Model: … • First attempt: let us take as rejection region … Size = ? • Problem: the sup is 0 because of the term …

  11. Example: Paired Data; Is A better than B? • Reduction in execution time • Model: … • First attempt: let us take as rejection region … Size = ? • Problem: the sup is 0 because of the term … • Second attempt: … where … = estimator of variance • Now we can compute … such that the size is 5% (confidence level 95%): • We find …; we reject H0
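
A minimal sketch of the second attempt, under stated assumptions: the `reductions` values are made up for illustration, and the threshold uses the standard normal quantile as a large-sample stand-in for the Student quantile, since the Python standard library has no Student distribution.

```python
from statistics import NormalDist, mean, stdev

# Hypothetical reductions in execution time (A minus B), one per paired run.
reductions = [12.1, 8.4, -1.3, 15.0, 9.7, 4.2, 11.8, 7.5, 3.9, 10.6,
              6.1, 13.4, -0.8, 9.0, 5.3, 12.7, 8.8, 2.4, 14.2, 7.1]

n = len(reductions)
# Normalized statistic: sample mean divided by its estimated standard deviation.
statistic = mean(reductions) / (stdev(reductions) / n ** 0.5)

# Threshold giving a size of about 5% for the one-sided test. The normal
# quantile is an approximation; the exact threshold comes from Student's t
# distribution with n - 1 degrees of freedom and is slightly larger.
threshold = NormalDist().inv_cdf(0.95)
reject_h0 = statistic > threshold

print(f"statistic = {statistic:.2f}, threshold = {threshold:.2f}, reject H0: {reject_h0}")
```

Dividing by the estimated standard deviation is what removes the dependence on the unknown variance that made the first attempt fail.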

  12. Power

  13. Power: Grey Zone • … is approximated by … on the plot • Power … (bad but unavoidable) • Grey zone: … for power … • If the true … is in the grey zone, the test will often declare … • For the data at hand: power = 0.9997, probability of type 2 error = 0.0003

  14. p-value of a test • For the previous example, with … • The test consists in computing … and seeing if … • Consider … where … is a hypothetical replay. It is independent of … and we can plot it: saying … is the same as saying … • p-value of the test = … • We reject H0 if the p-value is small
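
The idea can be sketched for a one-sided test as follows. The sketch assumes that the test statistic is approximately standard normal under H0 and that large values are evidence against H0; the observed values are hypothetical.

```python
from statistics import NormalDist

def one_sided_p_value(observed_statistic):
    """P(T >= observed) for a statistic T that is ~N(0, 1) under H0 (assumed)."""
    # Probability that a hypothetical replay under H0 is at least as extreme.
    return 1.0 - NormalDist().cdf(observed_statistic)

alpha = 0.05  # size of the test
for observed in (0.5, 1.645, 3.0):  # hypothetical observed statistics
    p = one_sided_p_value(observed)
    # Rejecting when p < alpha is the same decision as rejecting when the
    # observed statistic exceeds the 95% quantile of N(0, 1).
    print(f"statistic = {observed}: p-value = {p:.4f}, reject H0: {p < alpha}")
```

Reporting the p-value rather than the bare accept/reject decision lets the reader apply a different size without redoing the test.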

  15. p-value of a test

  17. Tests are just tests

  18. Power: Grey Zone • Assume we want to match statistical significance and practical relevance • … is the size of reduction we consider practically significant • With … we have … The type 2 and type 1 errors are matched • Ideally, the size of the test should be matched to the desired resolution • In practice, it is not done

  20. 2. Likelihood Ratio Test • A special case of Neyman-Pearson • A systematic method to define tests, of general applicability

  22. Example: Paired Data; Is A better than B? • Reduction in execution time • Model: … • Let us compute the likelihood ratio test.

  24. A Classical Test: Student Test • The model: … • The hypotheses: …

  26. Example: Paired Data; Is A better than B? • Reduction in execution time • Model: … • The likelihood ratio test is the Student test • Compare to the one-sided test: … matters!

  27. Here it is the same as a confidence interval

  28. Test versus Confidence Intervals • If you can have a confidence interval, use it instead of a test

  29. The “Simple Goodness of Fit” Test • Model • Hypotheses

  30. 1. Compute the likelihood ratio statistic

  31. 2. Compute the p-value

  32. Mendel’s Peas • p-value = 0.92 ± 0.05 => accept H0
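
The two steps of the simple goodness of fit test can be sketched as below. Several things here are assumptions, not content of the slides: the counts are the figures commonly quoted for Mendel's experiment (the slide does not list them), H0 is the 9:3:3:1 ratio, and the p-value uses the asymptotic chi-square approximation, which may differ from the method that produced the ± 0.05 on the slide.

```python
import math

# Counts commonly quoted for Mendel's pea experiment (assumed, not on the slide).
counts = [315, 101, 108, 32]
# H0: the four classes occur with probabilities 9/16, 3/16, 3/16, 1/16.
h0_probabilities = [9 / 16, 3 / 16, 3 / 16, 1 / 16]

n = sum(counts)

# Step 1: likelihood ratio statistic, sum of n_i * ln(n_i / (n * q_i)).
lrs = sum(c * math.log(c / (n * q)) for c, q in zip(counts, h0_probabilities))

# Step 2: asymptotic p-value. Under H0, 2 * lrs is approximately chi-square
# with (number of classes - 1) = 3 degrees of freedom. The closed form below
# is the chi-square survival function for exactly 3 degrees of freedom.
x = 2 * lrs
p_value = math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)

print(f"2 * lrs = {x:.3f}, asymptotic p-value = {p_value:.2f}")
```

With these counts the asymptotic p-value comes out close to 0.92, so H0 is accepted at any usual size.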

  33. 3 ANOVA • Often used as a “magic tool” • Important to understand the underlying assumptions • Model • Data comes from iid normal samples with unknown means and the same variance • Hypotheses

  36. The ANOVA Theorem • We build a likelihood ratio test • The assumption that the data is normal and the variance is the same allows an explicit computation • It becomes a least squares problem = a geometrical problem • We need to compute orthogonal projections on M and M0

  37. The ANOVA Theorem

  38. Geometrical Interpretation • Accept H0 if SS2 is small • The theorem tells us what “small” means in a statistical sense
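
For a one-way layout, the projections reduce to group means and a grand mean, which the sketch below uses. The data is hypothetical, and the names `ss1` and `ss0` are labels chosen for this example; only SS2 is named on the slides.

```python
from statistics import mean

# Hypothetical measurements for three groups (e.g. three configurations).
groups = {
    "a": [10.2, 9.8, 10.5, 10.1],
    "b": [11.0, 11.4, 10.9, 11.3],
    "c": [10.0, 10.3, 9.9, 10.2],
}

all_values = [x for values in groups.values() for x in values]
n, num_groups = len(all_values), len(groups)
grand_mean = mean(all_values)

# ss1: squared distance from the data to its projection on M (one mean per group).
ss1 = sum((x - mean(values)) ** 2 for values in groups.values() for x in values)
# ss2: squared distance between the projections on M and on M0 (one common mean).
ss2 = sum(len(values) * (mean(values) - grand_mean) ** 2 for values in groups.values())
# ss0: squared distance from the data to its projection on M0.
ss0 = sum((x - grand_mean) ** 2 for x in all_values)

# F statistic: under H0 it follows a Fisher distribution with
# (num_groups - 1, n - num_groups) degrees of freedom.
f_statistic = (ss2 / (num_groups - 1)) / (ss1 / (n - num_groups))

print(f"SS1 = {ss1:.3f}, SS2 = {ss2:.3f}, SS0 = {ss0:.3f}, F = {f_statistic:.1f}")
```

The identity SS0 = SS1 + SS2 is Pythagoras' theorem for the two orthogonal projections, and "SS2 is small" is judged relative to SS1 through the F statistic.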

  40. ANOVA Output: Network Monitoring

  41. The Fisher-F distribution

  43. Compare Test to Confidence Intervals • For non paired data, we cannot simply compute the difference • However, a confidence interval is sufficient for parameter set 1 • Tests disambiguate parameter sets 2 and 3

  44. Test the assumptions of the test… • Need to test the assumptions • Normal • In each group: qqplot… • Same variance

  46. 4 Asymptotic Results • 2 × likelihood ratio statistic

  48. The chi-square distribution

  49. Asymptotic Result • Applicable when the central limit theorem holds • If applicable, radically simple • Compute the likelihood ratio statistic • Inspect and find the order p (number of dimensions that H1 adds to H0) • This is equivalent to 2 optimization subproblems: lrs = max log-likelihood under H1 - max log-likelihood under H0 • The p-value is 1 - χ²_p(2 lrs), where χ²_p is the CDF of the chi-square distribution with p degrees of freedom
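
The recipe can be sketched on a small example. The setting is an assumption made for illustration: Bernoulli data with hypothetical counts, H0 fixing the success probability at 0.5 and H1 leaving it free, so that p = 1. The helper `chi2_sf` is written here from the series expansion of the incomplete gamma function so that the block needs only the standard library.

```python
import math

def chi2_sf(x, dof):
    """1 - CDF of the chi-square distribution with dof degrees of freedom."""
    if x <= 0:
        return 1.0
    a, z = dof / 2.0, x / 2.0
    # Series for the regularized lower incomplete gamma function P(a, z).
    term = 1.0 / a
    total = term
    for k in range(1, 1000):
        term *= z / (a + k)
        total += term
        if term < 1e-16 * total:
            break
    return 1.0 - total * math.exp(a * math.log(z) - z - math.lgamma(a))

def bernoulli_log_likelihood(successes, trials, prob):
    return successes * math.log(prob) + (trials - successes) * math.log(1.0 - prob)

successes, trials = 60, 100  # hypothetical data

# Two optimization subproblems: maximize the log-likelihood under H0 and H1.
max_loglik_h0 = bernoulli_log_likelihood(successes, trials, 0.5)
max_loglik_h1 = bernoulli_log_likelihood(successes, trials, successes / trials)

lrs = max_loglik_h1 - max_loglik_h0
p_value = chi2_sf(2 * lrs, dof=1)  # H1 adds one dimension to H0

print(f"lrs = {lrs:.3f}, p-value = {p_value:.3f}")
```

Only the two maximized log-likelihoods and the number of extra dimensions are needed; no distribution specific to the model has to be derived.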

  50. Composite Goodness of Fit Test • We want to test the hypothesis that an iid sample has a distribution that comes from a given parametric family