Hypothesis Testing
To define a statistical test we
• Choose a statistic (called the test statistic).
• Divide the range of possible values for the test statistic into two parts:
  • The Acceptance Region
  • The Critical Region
To perform a statistical test we
• Collect the data.
• Compute the value of the test statistic.
• Make the decision:
  • If the value of the test statistic is in the Acceptance Region, we decide to accept H0.
  • If the value of the test statistic is in the Critical Region, we decide to reject H0.
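The decision step above can be sketched in a few lines of Python. This is only an illustration: the function name `decide` and the interval-shaped acceptance region are assumptions made for the sketch, not something taken from the slides.

```python
def decide(test_statistic, acceptance_low, acceptance_high):
    """Return the decision for H0, assuming the Acceptance Region is the
    closed interval [acceptance_low, acceptance_high]."""
    if acceptance_low <= test_statistic <= acceptance_high:
        return "accept H0"
    # Anything outside the Acceptance Region lies in the Critical Region.
    return "reject H0"


# Hypothetical values of a test statistic, checked against [-1.96, 1.96].
print(decide(0.8, -1.96, 1.96))
print(decide(-2.75, -1.96, 1.96))
```

A one-sided test fits the same sketch by passing `float("-inf")` or `float("inf")` as one end of the interval.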
The z-test for Proportions
Testing the probability of success in a binomial experiment
Situation
• A success-failure experiment has been repeated n times.
• The probability of success p is unknown.
We want to test
• H0: p = p0 (some specified value of p)
against
• HA: p ≠ p0, p > p0, or p < p0, depending on the research question.
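The Test Statistic
$z = \dfrac{\hat{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$, where $\hat{p} = x/n$ is the observed proportion of successes.
The Acceptance and Critical Region: two-tailed critical region (HA: p ≠ p0)
• Accept H0 if: $-z_{\alpha/2} \le z \le z_{\alpha/2}$
• Reject H0 if: $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$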
One-tailed critical regions
These are used when the alternative hypothesis (HA) is one-sided. Here z is the test statistic.
• If HA: p > p0: accept H0 if $z \le z_{\alpha}$; reject H0 if $z > z_{\alpha}$.
• If HA: p < p0: accept H0 if $z \ge -z_{\alpha}$; reject H0 if $z < -z_{\alpha}$.
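A minimal Python sketch of the z-test for a proportion, covering the two-tailed and both one-tailed critical regions. It assumes the usual normal-approximation statistic z = (p̂ − p0)/√(p0(1 − p0)/n); the function name, the `alternative` labels, and the coin data are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_test_proportion(successes, n, p0, alpha=0.05, alternative="two-sided"):
    """Return (z, decision) for H0: p = p0 using the normal approximation."""
    p_hat = successes / n
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    if alternative == "two-sided":      # HA: p != p0
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)
    elif alternative == "greater":      # HA: p > p0
        reject = z > std_normal.inv_cdf(1 - alpha)
    elif alternative == "less":         # HA: p < p0
        reject = z < -std_normal.inv_cdf(1 - alpha)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater' or 'less'")
    return z, ("reject H0" if reject else "accept H0")


# Hypothetical data: 60 heads in 100 tosses, testing whether the coin is fair.
z, decision = z_test_proportion(60, 100, 0.5)
print(round(z, 2), decision)
```

The choice of `alternative` mirrors the choice of HA: the same data can lead to different decisions under a one-tailed and a two-tailed test.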
Comments
• Whether you use a one-tailed or a two-tailed test is determined by the choice of the alternative hypothesis HA.
• The alternative hypothesis, HA, is usually the research hypothesis: the hypothesis that the researcher is trying to “prove”.
Examples
• A person wants to determine if a coin should be accepted as being fair. Let p be the probability that a head is tossed. One is trying to determine if there is a difference (positive or negative) from the fair value of p (i.e. p ≠ 1/2).
• A researcher is interested in determining if a new procedure is an improvement over the old procedure. The probability of success for the old procedure is p0 (known). The probability of success for the new procedure is p (unknown). One is trying to determine if the new procedure is better (i.e. p > p0).
• A researcher is interested in determining if a new procedure is no longer worth considering. The probability of success for the old procedure is p0 (known). The probability of success for the new procedure is p (unknown). One is trying to determine if the new procedure is definitely worse than the one presently being used (i.e. p < p0).
The z-test for the Mean of a Normal Population
We want to test μ, the mean of a normal population.
The Situation
• Let x1, x2, x3, … , xn denote a sample from a normal population with mean μ and standard deviation σ.
• Let $\bar{x}$ denote the sample mean.
• We want to test if the mean, μ, is equal to some given value μ0.
• If the sample mean is close to μ0, the null hypothesis should be accepted; otherwise the null hypothesis should be rejected.
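The Acceptance and Critical Region
• This depends on H0 and HA. The test statistic is $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$.
Two-tailed critical region (HA: μ ≠ μ0)
• Accept H0 if: $-z_{\alpha/2} \le z \le z_{\alpha/2}$
• Reject H0 if: $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$
One-tailed critical regions
• HA: μ > μ0: accept H0 if $z \le z_{\alpha}$; reject H0 if $z > z_{\alpha}$.
• HA: μ < μ0: accept H0 if $z \ge -z_{\alpha}$; reject H0 if $z < -z_{\alpha}$.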
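As a rough illustration, the z statistic for a mean and the two-tailed decision rule might be coded as follows. The helper names and the numbers in the usage lines are hypothetical, and σ is assumed known (or n large enough to use s in its place).

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(sample_mean, mu0, sigma, n):
    """z = (sample mean - mu0) / (sigma / sqrt(n))."""
    return (sample_mean - mu0) / (sigma / sqrt(n))


def two_tailed_decision(z, alpha=0.05):
    """Accept H0 when -z_{alpha/2} <= z <= z_{alpha/2}, otherwise reject."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return "accept H0" if -z_crit <= z <= z_crit else "reject H0"


# Hypothetical summary values, for illustration only.
z = z_statistic(sample_mean=498.0, mu0=500.0, sigma=4.0, n=16)
print(z, two_tailed_decision(z))
```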
Example
A manufacturer of glucosamine capsules claims that each capsule contains, on average,
• 500 mg of glucosamine.
To test this claim, n = 40 capsules were selected and the amount of glucosamine (X) was measured in each capsule. Summary statistics:
We want to test:
H0: μ = 500 (the manufacturer's claim is correct)
against
HA: μ ≠ 500 (the manufacturer's claim is not correct)
The Critical Region and Acceptance Region
Using α = 0.05, $z_{\alpha/2} = z_{0.025} = 1.960$.
We accept H0 if −1.960 ≤ z ≤ 1.960 and reject H0 if z < −1.960 or z > 1.960.
The Decision
Since z = −2.75 < −1.960, we reject H0.
Conclusion: the manufacturer's claim is incorrect.
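The critical value and decision in this example can be checked with Python's standard library. The sketch below takes the reported value z = −2.75 as given and does not recompute it from the summary statistics.

```python
from statistics import NormalDist

alpha = 0.05
# Upper alpha/2 point of the standard normal distribution (about 1.960).
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

z_observed = -2.75  # value of the test statistic reported in the example

# Two-tailed rule: reject H0 when z falls outside [-z_crit, z_crit].
reject = z_observed < -z_crit or z_observed > z_crit
print(round(z_crit, 3), "reject H0" if reject else "accept H0")
```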
Recall: The z-test for means
The Test Statistic: $z = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} \approx \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$
Comments
• The sampling distribution of this statistic is the standard Normal distribution.
• Replacing σ by s leaves this distribution unchanged only if the sample size n is large.
For small sample sizes:
The sampling distribution of $t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$ is called Student's t distribution with n − 1 degrees of freedom.
Properties of Student's t distribution
• Similar to the standard normal distribution:
  • Symmetric
  • Unimodal
  • Centred at zero
• Larger spread about zero.
  • The reason for this is the increased variability introduced by replacing σ by s.
• As the sample size increases (degrees of freedom increase), the t distribution approaches the standard normal distribution.
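These properties can be checked numerically. The sketch below implements the standard density formula for Student's t with the standard library (the helper name `t_pdf` is an assumption of this sketch, since Python's `statistics` module has no t distribution) and compares it with the standard normal density.

```python
from math import exp, lgamma, pi, sqrt
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    # lgamma avoids overflow of the gamma function for large df.
    coef = exp(lgamma((df + 1) / 2) - lgamma(df / 2)) / sqrt(df * pi)
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


normal = NormalDist()
for df in (5, 30, 1000):
    # Density in the tail (x = 3) and at the centre (x = 0).
    print(df, t_pdf(3.0, df), t_pdf(0.0, df))
print("normal", normal.pdf(3.0), normal.pdf(0.0))
```

With few degrees of freedom the t density is higher in the tails and lower at the centre than the normal density; as df grows the two sets of values move together.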
Figure: the t distribution compared with the standard normal distribution.
The Situation
• Let x1, x2, x3, … , xn denote a sample from a normal population with mean μ and standard deviation σ. Both μ and σ are unknown.
• Let $\bar{x}$ and $s$ denote the sample mean and the sample standard deviation.
• We want to test if the mean, μ, is equal to some given value μ0.
The Test Statistic
$t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}$
The sampling distribution of the test statistic is the t distribution with n − 1 degrees of freedom.
$t_{\alpha}$ and $t_{\alpha/2}$ are critical values under the t distribution with n − 1 degrees of freedom.
Critical values for the t-distribution (figure: tail area α or α/2 beyond the critical value)
Critical values for the t-distribution are provided in tables. A link to these tables is given with today's lecture.
To read the table: look up α and look up df.
Note: the values tabled for df = ∞ are the same as the values for the standard normal distribution.
Example
• Let x1, x2, x3, x4, x5, x6 denote weight loss from a new diet for n = 6 cases.
• Assume that x1, x2, x3, x4, x5, x6 is a sample from a normal population with mean μ and standard deviation σ. Both μ and σ are unknown.
• We want to test: H0: μ = 0 (new diet is not effective) versus HA: μ > 0 (new diet is effective).
The Test Statistic: $t = \dfrac{\bar{x} - 0}{s/\sqrt{n}}$
The Critical Region: reject H0 if $t > t_{\alpha}$ (n − 1 = 5 degrees of freedom)
The Data The summary statistics:
The Test Statistic:
The Critical Region (using α = 0.05): reject H0 if $t > t_{0.05} = 2.015$ (5 degrees of freedom)
Conclusion: accept H0.
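A small Python sketch of the one-sample, one-tailed t-test used in this example. The weight-loss values below are hypothetical placeholders, not the data from the slide, and the critical value 2.015 is the tabled upper 5% point of the t distribution with 5 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(data, mu0):
    """t = (sample mean - mu0) / (s / sqrt(n)), with s the sample sd."""
    n = len(data)
    return (mean(data) - mu0) / (stdev(data) / sqrt(n))


T_CRIT_05_DF5 = 2.015  # t table: upper-tail area 0.05, df = 5

# Hypothetical weight losses for n = 6 cases (NOT the slide's data).
weight_loss = [2.0, 1.0, 1.4, -1.8, 0.9, 2.3]

t = t_statistic(weight_loss, mu0=0.0)
# One-tailed rule for HA: mu > 0.
decision = "reject H0" if t > T_CRIT_05_DF5 else "accept H0"
print(round(t, 3), decision)
```

`statistics.stdev` divides by n − 1, which matches the sample standard deviation s used in the test statistic.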
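Confidence Intervals for the mean of a Normal Population, μ, using the Standard Normal distribution: $\bar{x} \pm z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$
Confidence Intervals for the mean of a Normal Population, μ, using the t distribution: $\bar{x} \pm t_{\alpha/2}\,\dfrac{s}{\sqrt{n}}$ (n − 1 degrees of freedom)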
Example • Let x1, x2, x3 , x4, x5, x6 denote weight loss from a new diet for n = 6 cases. The Data: The summary statistics:
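A minimal sketch of how such an interval could be computed, assuming the usual form of sample mean ± critical value × (standard deviation / √n). The summary values are hypothetical stand-ins, not the slide's data, and 2.571 is the tabled t value for upper-tail area 0.025 with 5 degrees of freedom.

```python
from math import sqrt
from statistics import NormalDist


def confidence_interval(sample_mean, sd, n, critical_value):
    """Return (lower, upper) = sample_mean -/+ critical_value * sd / sqrt(n)."""
    margin = critical_value * sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin


Z_CRIT_025 = NormalDist().inv_cdf(0.975)  # about 1.960
T_CRIT_025_DF5 = 2.571                    # t table: area 0.025, df = 5

# Hypothetical summary statistics for n = 6 cases.
x_bar, s, n = 0.97, 1.46, 6

t_interval = confidence_interval(x_bar, s, n, T_CRIT_025_DF5)
z_interval = confidence_interval(x_bar, s, n, Z_CRIT_025)
print("95% CI using t:", t_interval)
print("95% CI using z:", z_interval)
```

For a small sample the t interval is wider than the z interval, reflecting the extra uncertainty from estimating σ by s.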
Comparing Populations: Proportions and Means
Sums, Differences, Combinations of R.V.'s
A linear combination of random variables X, Y, . . . is a combination of the form
L = aX + bY + …
where a, b, etc. are numbers, positive or negative.
Most common:
• Sum = X + Y
• Difference = X − Y
• Simple linear combination of X: bX + a
Means of Linear Combinations
If L = aX + bY + …, the mean of L is:
Mean(L) = a Mean(X) + b Mean(Y) + …
Most common:
• Mean(X + Y) = Mean(X) + Mean(Y)
• Mean(X − Y) = Mean(X) − Mean(Y)
• Mean(bX + a) = b Mean(X) + a
Variances of Linear Combinations
If X, Y, . . . are independent random variables and L = aX + bY + …, then
Variance(L) = a² Variance(X) + b² Variance(Y) + …
Most common:
• Variance(X + Y) = Variance(X) + Variance(Y)
• Variance(X − Y) = Variance(X) + Variance(Y)
• Variance(bX + a) = b² Variance(X)
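The mean and variance rules for linear combinations can be illustrated with a short Python sketch. The function name and the example means and variances are hypothetical, and the variance rule assumes the variables are independent.

```python
def linear_combination_moments(coeffs, means, variances):
    """Mean and variance of L = a*X + b*Y + ... for independent variables."""
    mean_l = sum(a * m for a, m in zip(coeffs, means))
    # Coefficients are squared, so a minus sign does not reduce the variance.
    var_l = sum(a * a * v for a, v in zip(coeffs, variances))
    return mean_l, var_l


# Hypothetical X with mean 10, variance 4 and Y with mean 3, variance 9.
print(linear_combination_moments([1, 1], [10, 3], [4, 9]))   # X + Y
print(linear_combination_moments([1, -1], [10, 3], [4, 9]))  # X - Y
```

The sum and the difference have different means but the same variance, as in the "most common" cases listed above.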
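Combining Independent Normal Random Variables
If X, Y, . . . are independent normal random variables, then L = aX + bY + … is normally distributed.
In particular:
• X + Y is normal with mean $\mu_X + \mu_Y$ and variance $\sigma_X^2 + \sigma_Y^2$.
• X − Y is normal with mean $\mu_X - \mu_Y$ and variance $\sigma_X^2 + \sigma_Y^2$.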
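Python's `statistics.NormalDist` supports adding and subtracting independent normal variables directly, which gives a quick way to check these rules. The means and standard deviations below are hypothetical.

```python
from statistics import NormalDist

# Hypothetical independent normal random variables.
X = NormalDist(mu=10, sigma=3)
Y = NormalDist(mu=4, sigma=4)

total = X + Y  # mean 10 + 4, variance 3**2 + 4**2
diff = X - Y   # mean 10 - 4, variance 3**2 + 4**2

print(total.mean, total.stdev)
print(diff.mean, diff.stdev)

# Probability that X exceeds Y, i.e. P(X - Y > 0).
print(1 - diff.cdf(0))
```

Both combinations have standard deviation √(9 + 16) = 5, since variances add for a sum and for a difference.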