Tests

Tests Jean-Yves Le Boudec

Contents • The Neyman-Pearson framework • Likelihood Ratio Tests • ANOVA • Asymptotic Results • Other Tests

Tests • Tests are used to give a binary answer to hypotheses of a statistical nature • Ex: is A better than B? • Ex: does this data come from a normal distribution? • Ex: does factor n influence the result?

Example: Non Paired Data • Is red better than blue? • For data set (a) the answer is clear (by inspection of the confidence intervals); no test is required

Is this data normal?

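1. The Neyman-Pearson Framework • Given: a data set $x$ and a model with parameter $\theta$ (that, we believe, explains the data) • Two hypotheses on $\theta$: $H_0: \theta \in \Theta_0$ (null hypothesis) and $H_1: \theta \in \Theta \setminus \Theta_0$ (alternative hypothesis) • Nested model: $\Theta_0$ is a set of smaller dimension than $\Theta$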

Example: Non Paired Data; Is Red better than Blue? • Model: the red and the blue data sets are two independent iid samples

Example: Non Paired Data; Is Red better than Blue? ANOVA Model • Model: the red and the blue data sets are two independent iid normal samples, with unknown means and the same variance

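Critical Region, Size and Power • Critical region: a set $C$ of possible data values such that if the data $x \in C$ then we reject $H_0$ • Type 1 error: reject $H_0$ when $H_0$ is true • Size of a test = maximum probability of a type 1 error: size $= \sup_{\theta \in \Theta_0} P_\theta(X \in C)$, should be small • Type 2 error: accept $H_0$ when $H_1$ is true • Power function: $\beta(\theta) = P_\theta(X \in C)$ for $\theta \in \Theta \setminus \Theta_0$, should be large • Neyman-Pearson framework: design a test that maximizes power subject to size $\le \alpha$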
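To make size and power concrete, here is a minimal sketch for a deliberately simple case that is not taken from the slides: the data is assumed iid N(mu, 1) with known unit variance, H0 is mu = 0, H1 is mu > 0, and the critical region is {sample mean > c}. The function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal N(0, 1)


def threshold_for_size(alpha, n):
    """Threshold c with P(sample mean > c) = alpha when mu = 0 (assumed unit variance)."""
    return STD_NORMAL.inv_cdf(1.0 - alpha) / sqrt(n)


def rejection_probability(mu, c, n):
    """P(sample mean > c) for X_i iid N(mu, 1): the size at mu = 0, the power at mu > 0."""
    return 1.0 - STD_NORMAL.cdf(sqrt(n) * (c - mu))


n = 25
c = threshold_for_size(0.05, n)
size = rejection_probability(0.0, c, n)           # probability of a type 1 error
power_at_half = rejection_probability(0.5, c, n)  # 1 - probability of a type 2 error at mu = 0.5
print(f"c = {c:.3f}, size = {size:.3f}, power at mu = 0.5: {power_at_half:.3f}")
```

Because H0 is a single point in this sketch, the sup in the definition of the size is trivial; with a composite H0 the sup has to be taken over all parameter values allowed by H0.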

Example: Paired Data; Is A better than B? • Reduction in execution time • Model: • First attempt: let us take as rejection region • Size = ? • Problem: the sup is 0 because of the term • Second attempt: where = estimator of variance • Now we can compute such that the size is 5% (confidence level 95%): • We find that we reject $H_0$
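The point of normalizing by an estimator of the variance can be checked numerically. The sketch below assumes the second attempt uses the Student statistic sqrt(n) * mean / s (the formulas did not survive in the transcript, so this is an assumption), and calibrates the threshold by simulation under H0 taken as iid N(0, sigma^2). The data values and function names are hypothetical.

```python
import random
from math import sqrt


def student_statistic(sample):
    """sqrt(n) * mean / s, where s is the sample standard deviation (n - 1 denominator)."""
    n = len(sample)
    m = sum(sample) / n
    s = sqrt(sum((x - m) ** 2 for x in sample) / (n - 1))
    return sqrt(n) * m / s


def simulated_threshold(n, sigma, alpha=0.05, replicates=20000, seed=1):
    """Monte Carlo estimate of c such that P(T > c) = alpha for iid N(0, sigma^2) data."""
    rng = random.Random(seed)
    values = sorted(
        student_statistic([rng.gauss(0.0, sigma) for _ in range(n)])
        for _ in range(replicates)
    )
    return values[int((1.0 - alpha) * replicates)]


# The threshold should not depend on the unknown sigma (up to simulation noise).
c_sigma_1 = simulated_threshold(n=10, sigma=1.0, seed=1)
c_sigma_50 = simulated_threshold(n=10, sigma=50.0, seed=2)

# Hypothetical reductions in execution time (not the data set of the slides).
reductions = [12.0, -3.5, 8.1, 15.2, 4.4, 9.9, -1.2, 7.3, 11.0, 6.6]
t_obs = student_statistic(reductions)
reject_h0 = t_obs > c_sigma_1
print(c_sigma_1, c_sigma_50, t_obs, reject_h0)
```

The two estimated thresholds agree up to Monte Carlo error, which is what makes it possible to fix the size of the test without knowing the variance.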

Power

Power: Grey Zone • is approximated by on the plot • power (bad but unavoidable) • Grey zone: for power • If the true parameter is in the grey zone, the test will often declare $H_0$ • For data at hand: power = 0.9997, probability of type 2 error = 0.0003

p-value of a test • For the previous example, with • The test consists in computing and see if • Consider where is a hypothetical replay. It is independent of and we can plot it: saying is the same as saying • P-value of test = • We reject if the p-value is small
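The "hypothetical replay" idea translates directly into a simulation: replay the experiment many times under H0 and count how often the replayed statistic exceeds the observed one. The sketch below assumes the test statistic is the Student statistic and that H0 means iid N(0, sigma^2) data; both are assumptions, since the formulas are missing from the transcript. The data and function names are hypothetical.

```python
import random
from math import sqrt


def student_statistic(sample):
    """sqrt(n) * mean / s, where s is the sample standard deviation (n - 1 denominator)."""
    n = len(sample)
    m = sum(sample) / n
    s = sqrt(sum((x - m) ** 2 for x in sample) / (n - 1))
    return sqrt(n) * m / s


def monte_carlo_p_value(data, replicates=10000, seed=0):
    """Estimate P(T(X0) > T(x)), where X0 is a replay of the experiment under H0.

    The replay is drawn with sigma = 1 because the Student statistic does not
    depend on the scale of the data.
    """
    rng = random.Random(seed)
    t_obs = student_statistic(data)
    n = len(data)
    exceed = sum(
        student_statistic([rng.gauss(0.0, 1.0) for _ in range(n)]) > t_obs
        for _ in range(replicates)
    )
    return exceed / replicates


data = [1.2, 0.8, 1.5, 0.3, 0.9, 1.1, 0.7, 1.4]  # hypothetical measurements
p_value = monte_carlo_p_value(data)
reject_h0 = p_value < 0.05  # reject when the p-value is small
print(p_value, reject_h0)
```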

Tests are just tests

Power: Grey Zone • Assume we want to match statistical significance and practical relevance • is the size of reduction we consider practically significant • With we have • The type 2 and type 1 errors are matched • Ideally, the size of the test should be matched to the desired resolution • In practice, it is not done

2. Likelihood Ratio Test • A special case of Neyman-Pearson • A systematic method to define tests, of general applicability

Example: Paired Data; Is A better than B? • Reduction in execution time • Model: • Let us compute the likelihood ratio test.

A Classical Test: Student Test • The model: • The hypotheses:

Example: Paired Data; Is A better than B? • Reduction in execution time • Model: • The likelihood ratio test is the Student test • Compare to one sided test: matters!

Here it is the same as a confidence interval

Test versus Confidence Intervals • If you can have a confidence interval, use it instead of a test

The “Simple Goodness of Fit” Test • Model • Hypotheses

1. Compute the likelihood ratio statistic

2. Compute the p-value

Mendel’s Peas • P = 0.92 ± 0.05 => Accept H0
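The two steps of the simple goodness of fit test can be sketched as follows. The counts and the 9:3:3:1 ratios are the classic figures usually quoted for Mendel's pea experiment; they are an assumption here, because the transcript does not reproduce the data. The p-value is estimated by Monte Carlo simulation under H0, and the function names are hypothetical.

```python
import random
from math import log

# Classic counts for Mendel's peas and the 9:3:3:1 ratios of H0.
# Assumed values: the transcript itself does not list the data.
counts = [315, 108, 101, 32]
h0_probs = [9 / 16, 3 / 16, 3 / 16, 1 / 16]


def likelihood_ratio_statistic(counts, probs):
    """lrs = sum_i n_i * ln(n_i / (n * q_i)); empty categories contribute 0."""
    n = sum(counts)
    return sum(c * log(c / (n * q)) for c, q in zip(counts, probs) if c > 0)


def monte_carlo_p_value(counts, probs, replicates=1000, seed=0):
    """Fraction of data sets simulated under H0 whose lrs exceeds the observed one."""
    rng = random.Random(seed)
    n = sum(counts)
    categories = range(len(probs))
    lrs_obs = likelihood_ratio_statistic(counts, probs)
    exceed = 0
    for _ in range(replicates):
        draws = rng.choices(categories, weights=probs, k=n)
        simulated = [draws.count(i) for i in categories]
        if likelihood_ratio_statistic(simulated, probs) > lrs_obs:
            exceed += 1
    return exceed / replicates


lrs = likelihood_ratio_statistic(counts, h0_probs)    # step 1
p_value = monte_carlo_p_value(counts, h0_probs)       # step 2
print(f"lrs = {lrs:.4f}, estimated p-value = {p_value:.3f}")
```

A large p-value means the observed counts are entirely compatible with the hypothesized ratios, so H0 is accepted.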

3. ANOVA • Often used as a “Magic Tool” • Important to understand the underlying assumptions • Model • Data comes from iid normal samples with unknown means and the same variance • Hypotheses

The ANOVA Theorem • We build a likelihood ratio statistic test • The assumption that data is normal and variance is the same allows an explicit computation • It becomes a least squares problem = a geometrical problem • We need to compute orthogonal projections on M and M0

Geometrical Interpretation • Accept H0 if SS2 is small • The theorem tells us what “small” means in a statistical sense
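For a one-way layout the two orthogonal projections are simply the group means (projection on M) and the grand mean (projection on M0). The sketch below uses hypothetical data and assumes that SS2 denotes the squared distance between the two projections and SS1 the residual sum of squares; the transcript only confirms that H0 is accepted when SS2 is small.

```python
# Hypothetical measurements for three groups (not data from the slides).
groups = [
    [10.1, 9.8, 10.4, 10.0],
    [11.2, 11.0, 10.7, 11.5],
    [9.9, 10.3, 10.1, 9.7],
]


def mean(values):
    return sum(values) / len(values)


all_values = [y for g in groups for y in g]
grand_mean = mean(all_values)            # projection of the data on M0
group_means = [mean(g) for g in groups]  # projection of the data on M

# SS1: squared distance from the data to its projection on M (residual).
ss1 = sum((y - m) ** 2 for g, m in zip(groups, group_means) for y in g)
# SS2: squared distance between the projections on M and on M0.
ss2 = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
# Pythagoras: the total sum of squares splits into SS1 + SS2.
ss_total = sum((y - grand_mean) ** 2 for y in all_values)

k, n = len(groups), len(all_values)
f_statistic = (ss2 / (k - 1)) / (ss1 / (n - k))  # compared to a Fisher-F(k - 1, n - k)
print(ss1, ss2, ss_total, f_statistic)
```

The F statistic rescales SS2 by the residual SS1, which is how "small" acquires a statistical meaning: under H0 and the normal, equal-variance assumptions it follows a Fisher-F distribution.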

ANOVA Output: Network Monitoring

The Fisher-F distribution

Compare Test to Confidence Intervals • For non paired data, we cannot simply compute the difference • However CI is sufficient for parameter set 1 • Tests disambiguate parameter sets 2 and 3

Test the assumptions of the test… • Need to test the assumptions • Normal • In each group: qqplot… • Same variance
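A per-group check can be sketched as follows: compute the points of a normal qq-plot (which should lie close to the diagonal if the group is normal) and the sample standard deviation (to compare across groups). The plotting positions (i + 0.5) / n are one common convention and an assumption here; the data and function name are hypothetical.

```python
from statistics import NormalDist, mean, stdev


def normal_qq_points(sample):
    """Pairs (theoretical N(0, 1) quantile, standardized order statistic)."""
    n = len(sample)
    m, s = mean(sample), stdev(sample)
    std_normal = NormalDist()
    return [
        (std_normal.inv_cdf((i + 0.5) / n), (x - m) / s)
        for i, x in enumerate(sorted(sample))
    ]


group = [10.1, 9.8, 10.4, 10.0, 10.6, 9.5, 10.2, 9.9]  # hypothetical group
points = normal_qq_points(group)
group_std = stdev(group)  # to be compared with the other groups

for theoretical, observed in points:
    print(f"{theoretical:+.3f}  {observed:+.3f}")
```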

4. Asymptotic Results • 2 × likelihood ratio statistic

The chi-square distribution

Asymptotic Result • Applicable when the central limit theorem holds • If applicable, radically simple • Compute the likelihood ratio statistic • Inspect and find the order p (number of dimensions that H1 adds to H0) • This is equivalent to 2 optimization subproblems: lrs = max log-likelihood under H1 - max log-likelihood under H0 • The p-value is $1 - \chi^2_p(2\,\mathrm{lrs})$, where $\chi^2_p$ is the CDF of the chi-square distribution with $p$ degrees of freedom
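Once lrs and p are known, the asymptotic p-value is the probability that a chi-square variable with p degrees of freedom exceeds 2 × lrs. The sketch below evaluates that tail probability with the standard library only, using the closed forms for 1 and 2 degrees of freedom and the usual recursion that adds two degrees of freedom at a time. The function names and the example values of lrs and p are hypothetical.

```python
from math import erfc, exp, gamma, sqrt


def chi2_sf(x, dof):
    """P(chi-square with `dof` degrees of freedom > x), for a positive integer dof."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if dof % 2 == 1:
        k, q = 1, erfc(sqrt(half))  # closed form for 1 degree of freedom
    else:
        k, q = 2, exp(-half)        # closed form for 2 degrees of freedom
    while k < dof:
        # Q(k + 2, x) = Q(k, x) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1)
        q += half ** (k / 2.0) * exp(-half) / gamma(k / 2.0 + 1.0)
        k += 2
    return q


def asymptotic_p_value(lrs, p):
    """p-value of the likelihood ratio test when 2 * lrs is approximately chi-square(p)."""
    return chi2_sf(2.0 * lrs, p)


p_value = asymptotic_p_value(lrs=3.2, p=2)  # hypothetical lrs and order p
print(f"p-value = {p_value:.4f}")
```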

Composite Goodness of Fit Test • We want to test the hypothesis that an iid sample has a distribution that comes from a given parametric family
