Statistical Inference and Regression Analysis: Stat-GB.3302.30, Stat-UB.0015.01. Professor William Greene, Stern School of Business, IOMS Department, Department of Economics. Part 5 – Hypothesis Testing.
Objectives of Statistical Analysis • Estimation • How long do hard drives last? • What is the median income among the 99%ers? • Inference – hypothesis testing • Did minorities pay higher mortgage rates during the housing boom? • Is there a link between environmental factors and breast cancer on eastern Long Island?
General Frameworks • Parametric Tests: features of specific distributions such as the mean of a Bernoulli or normal distribution. • Specification Tests (Semiparametric) • Do the data arrive from a Poisson process? • Are the data normally distributed? • Nonparametric Tests: Are two discrete processes independent?
Hypotheses • Hypotheses - labels • State 0 of Nature – Null Hypothesis • State 1 – Alternative Hypothesis • Exclusive: Prob(H0 ∩ H1) = 0 • Exhaustive: Prob(H0) + Prob(H1) = 1 • Symmetric: Neither is intrinsically “preferred” – the objective of the study is only to support one or the other. (Rare?)
Does the New Drug Work? • Hypotheses: H0: θ = .50, H1: θ = .75 • Priors: P0 = .40, P1 = .60 • Clinical Trial: N = 50, 31 patients “respond,” p = .62 • Likelihoods: • L0(31 | θ = .50) = Binomial(50, 31, .50) = .0270059 • L1(31 | θ = .75) = Binomial(50, 31, .75) = .0148156 • Posterior odds in favor of H0 = (.4/.6)(.0270059/.0148156) = 1.2152 > 1 • Priors favored H1 1.5 to 1, but the posterior odds favor H0, 1.2152 to 1. The evidence discredits H1 even though the ‘data’ seem more consistent with prior P1.
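The arithmetic on this slide can be checked with a short sketch. It assumes that "Binomial(50, 31, θ)" means the binomial probability of 31 responders in 50 trials; the function name `binomial_pmf` is illustrative, not from the slides.

```python
from math import comb

def binomial_pmf(n, k, theta):
    """Probability of k successes in n Bernoulli(theta) trials."""
    return comb(n, k) * theta**k * (1 - theta)**(n - k)

n, k = 50, 31                      # clinical trial: 31 of 50 patients respond
prior_h0, prior_h1 = 0.40, 0.60    # prior probabilities from the slide

like_h0 = binomial_pmf(n, k, 0.50)  # likelihood under H0: theta = .50
like_h1 = binomial_pmf(n, k, 0.75)  # likelihood under H1: theta = .75

# Posterior odds = prior odds times likelihood ratio
posterior_odds_h0 = (prior_h0 / prior_h1) * (like_h0 / like_h1)

print(f"L0 = {like_h0:.7f}, L1 = {like_h1:.7f}")
print(f"Posterior odds in favor of H0 = {posterior_odds_h0:.4f}")
```

The likelihood ratio (about 1.82 in favor of H0) is large enough to overturn prior odds of 1.5 to 1 in favor of H1.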
Decision Strategy • Prefer the hypothesis with the higher posterior odds • A gap in the theory: How does the investigator do the cost benefit test? • Starting a new business venture or entering a new market: Priors and market research • FDA approving a new drug or medical device. Priors and clinical trials • Statistical Decision Theory adds the costs and benefits of decisions and errors.
An Alternative Strategy • Recognize the asymmetry of null and alternative hypotheses. • Eliminate the prior odds (which are rarely formed or available).
http://query.nytimes.com/gst/fullpage.html?res=9C00E4DF113BF935A3575BC0A9649C8B63
Classical Hypothesis Testing • The scientific method applied to statistical hypothesis testing • Hypothesis: The world works according to my hypothesis • Testing or supporting the hypothesis • Data gathering • Rejection of the hypothesis if the data are inconsistent with it • Retention and exposure to further investigation if the data are consistent with the hypothesis • Failure to reject is not equivalent to acceptance.
Asymmetric Hypotheses • Null Hypothesis: The proposed state of nature • Alternative hypothesis: The state of nature that is believed to prevail if the null is rejected.
Hypothesis Testing Strategy • Formulate the null hypothesis • Gather the evidence • Question: If my null hypothesis were true, how likely is it that I would have observed this evidence? • Very unlikely: Reject the hypothesis • Not unlikely: Do not reject. (Retain the hypothesis for continued scrutiny.)
Some Terms of Art • Type I error: Incorrectly rejecting a true null • Type II error: Failure to reject a false null • Power of a test: Probability a test will correctly reject a false null • Alpha level: Probability that a test will incorrectly reject a true null. This is sometimes called the size of the test. • Significance level: Usually synonymous with the alpha level; the probability that a test will retain a true null is 1 – alpha. • Rejection Region: Evidence that will lead to rejection of the null • Test statistic: Specific sample evidence used to test the hypothesis • Distribution of the test statistic under the null hypothesis: Probability model used to compute probability of rejecting the null. (Crucial to the testing strategy – how does the analyst assess the evidence?)
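Possible Errors in Testing • Hypothesis is true, I do not reject the hypothesis: correct decision • Hypothesis is true, I reject the hypothesis: Type I error • Hypothesis is false, I do not reject the hypothesis: Type II error • Hypothesis is false, I reject the hypothesis: correct decision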
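A Legal Analogy: The Null Hypothesis is INNOCENT • Null hypothesis: Not guilty • Alternative hypothesis: Guilty • Finding: Verdict not guilty. Correct if the defendant is not guilty; a Type II error if the defendant is guilty. • Finding: Verdict guilty. A Type I error if the defendant is not guilty; correct if the defendant is guilty. • The errors are not symmetric. Most thinkers consider Type I errors to be more serious than Type II in this setting.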
(Jerzy) Neyman – (Egon) Pearson Methodology • “Statistical” testing • Methodology • Formulate the “null” hypothesis • Decide (in advance) what kinds of “evidence” (data) will lead to rejection of the null hypothesis, i.e., define the rejection region • Gather the data • Mechanically carry out the test.
Formulating the Null Hypothesis • Stating the hypothesis: A belief about the “state of nature” • A parameter takes a particular value • There is a relationship between variables • And so on… • The null vs. the alternative • By induction: If we wish to find evidence of something, first assume it is not true. • Look for evidence that leads to rejection of the assumed hypothesis. • Evidence that rejects the null hypothesis is significant
Example: Credit Scoring Rule • Investigation: I believe that Fair Isaac relies on home ownership in deciding whether to “accept” an application. • Null hypothesis: There is no relationship. • Alternative hypothesis: They do use home ownership data. • What decision rule should I use?
Some Evidence • Renters: 5469 accepted, 1845 rejected • Homeowners: 5030 accepted, 1100 rejected
Hypothesis Test • Acceptance rate for homeowners = 5030/(5030+1100) = .82055 • Acceptance rate for renters is .74774 • H0: Acceptance rate for renters is not less than for owners. • H0: p(renters) ≥ .82055 • H1: p(renters) < .82055
The Rejection Region What is the “rejection region?” • Data (evidence) that are inconsistent with my hypothesis • Evidence is divided into two types: • Data that are inconsistent with my hypothesis (the rejection region) • Everything else
My Testing Procedure • I will reject H0 if p(renters) < .815 (chosen arbitrarily) • Rejection region is sample values of p(renters) < 0.815
Distribution of the Test Statistic Under the Null Hypothesis • Test statistic p(renters) = (1/N) Σi Accept_i, where Accept_i = 1 or 0 • Use the central limit theorem: • Assumed mean = .82055 • Implied standard deviation = sqrt(.82055*.17945/7314) = .00459 • Using CLT, normally distributed. (N is very large.) • Use z = (p(renters) - .82055) / .00459
Alpha Level and Rejection Region • Prob(Reject H0 | H0 true) = Prob(p < .815 | H0 is true) = Prob[(p - .82055)/.00459 < (.815 - .82055)/.00459] = Prob[z < -1.209] = .11333 • Probability of a Type I error • Alpha level for this test
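The alpha level for this rejection region can be reproduced with a minimal sketch. It takes the standard deviation .00459 as given on the slides rather than recomputing it, and uses `statistics.NormalDist` from the Python standard library for the standard normal CDF; the variable names are illustrative.

```python
from statistics import NormalDist

p0 = 0.82055      # acceptance rate assumed under H0
se = 0.00459      # standard deviation of the sample proportion, as given on the slide
cutoff = 0.815    # reject H0 if the sample proportion falls below this value

# Standardize the cutoff and evaluate the standard normal CDF there
z_cut = (cutoff - p0) / se
alpha = NormalDist().cdf(z_cut)

print(f"z cutoff = {z_cut:.3f}")
print(f"alpha = Prob(reject H0 | H0 true) = {alpha:.5f}")
```

Moving the cutoff farther below .82055 would shrink alpha, at the cost of making the test less likely to reject when the null is false.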
Distribution of the Test Statistic and the Rejection Region • [Figure: distribution of the test statistic with the rejection region shaded; area = .11333]
The Test • The observed proportion is 5469/(5469+1845) = 5469/7314 = .74774 • The null hypothesis is rejected at the 11.333% significance level (by the design of the test)
Power Function for the Test (Power = size when alternative = the null.)
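A power function of this kind can be sketched as follows. This is an illustration under stated assumptions, not the computation behind the original figure: it uses the rule "reject if p(renters) < .815", N = 7314 renters, and a normal approximation whose standard deviation is recomputed at each candidate true proportion. Because of that recomputation, the value at the null is close to, but not identical with, the .11333 reported on the slides.

```python
from math import sqrt
from statistics import NormalDist

N = 7314         # number of renter applications (5469 + 1845)
CUTOFF = 0.815   # reject H0 when the sample proportion is below this

def power(p_true, n=N, cutoff=CUTOFF):
    """Probability of rejecting H0 when the true acceptance rate is p_true."""
    se = sqrt(p_true * (1 - p_true) / n)   # CLT standard deviation at p_true
    return NormalDist().cdf((cutoff - p_true) / se)

# Power rises toward 1 as the true proportion moves below the null value
for p in (0.82055, 0.815, 0.81, 0.80, 0.78, 0.75):
    print(f"true p = {p:.5f}  power = {power(p):.4f}")
```

At the null value the power equals the size of the test, which is the point made in the slide title.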
Application: Breast Cancer On Long Island • Null Hypothesis: There is no link between the high cancer rate on LI and the use of pesticides and toxic chemicals in dry cleaning, farming, etc. • Neyman-Pearson Procedure • Examine the physical and statistical evidence • If there is convincing covariation, reject the null hypothesis • What is the rejection region? • The NCI study: • Working null hypothesis: There is a link: We will find the evidence. • How do you reject this hypothesis?
Formulating the Testing Procedure • Usually: What kind of data will lead me to reject the hypothesis? • Thinking scientifically: If you want to “prove” a hypothesis is true (or you want to support one) begin by assuming your hypothesis is not true, and look for evidence that contradicts the assumption.
Hypothesis About a Mean • I believe that the average income of individuals in a population is $30,000. • H0 : μ = $30,000 (The null) • H1: μ ≠ $30,000 (The alternative) • I will draw the sample and examine the data. • The rejection region is data for which the sample mean is far from $30,000. • How far is far????? That is the test.
Application • The mean of a population takes a specific value: • Null hypothesis: H0: μ = $30,000 • Alternative hypothesis: H1: μ ≠ $30,000 • Test: Is the sample mean close to the hypothesized population mean? • Rejection region: Sample means that are far from $30,000
Deciding on the Rejection Region • If the sample mean is far from $30,000, reject the hypothesis. • Choose the region, for example, sample means below 29,500 or above 30,500. • The probability that the mean falls in the rejection region even though the hypothesis is true (should not be rejected) is the probability of a type I error. • Even if the true mean really is $30,000, the sample mean could fall in the rejection region. • [Figure: rejection regions below 29,500 and above 30,500, centered on 30,000]
Reduce the Probability of a Type I Error by Making the (non)Rejection Region Wider • Reduce the probability of a type I error by moving the boundaries of the rejection region farther out. • Probability outside the interval 29,500 to 30,500 is large. • Probability outside the interval 28,500 to 31,500 is much smaller. • You can make a type I error impossible by making the rejection region very far from the null. Then you would never make a type I error because you would never reject H0.
Setting the α Level • “α” is the probability of a type I error • Choose the width of the interval by choosing the desired probability of a type I error, based on the t or normal distribution. (How confident do I want to be?) • Multiply the z or t value by the standard error of the mean.
Testing Procedure • The rejection region will be the range of values greater than μ0 + zσ/√N or less than μ0 - zσ/√N • Use z = 1.96 for 1 - α = 95% • Use z = 2.576 for 1 - α = 99% • Use the t table if the sample is small, the variance is estimated, and sampling is from a normal distribution.
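The boundaries of the rejection region can be computed with a small sketch. The values σ = 5,000 and N = 400 are hypothetical, chosen only for illustration (the slides do not give them), and the function name `rejection_bounds` is likewise an assumption. The z value comes from the standard normal quantile function, so the sketch covers the large-sample, known-variance case rather than the t-table case.

```python
from math import sqrt
from statistics import NormalDist

def rejection_bounds(mu0, sigma, n, alpha=0.05):
    """Return (lower, upper); reject H0: mu = mu0 if the sample mean is outside."""
    z = NormalDist().inv_cdf(1 - alpha / 2)   # 1.96 for alpha = .05, 2.576 for .01
    half_width = z * sigma / sqrt(n)          # z times the standard error of the mean
    return mu0 - half_width, mu0 + half_width

# Hypothetical inputs: sigma = 5,000 and N = 400 give a standard error of 250
lower, upper = rejection_bounds(30_000, sigma=5_000, n=400)
print(f"95%: reject if mean < {lower:,.0f} or mean > {upper:,.0f}")

lower99, upper99 = rejection_bounds(30_000, sigma=5_000, n=400, alpha=0.01)
print(f"99%: reject if mean < {lower99:,.0f} or mean > {upper99:,.0f}")
```

Lowering α from .05 to .01 pushes both boundaries farther from $30,000, which is the widening of the nonrejection region described earlier.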
Deciding on the Rejection Region • If the sample mean is far from $30,000, reject the hypothesis. • Choose the region, say, a rejection region in each tail. • I am 95% certain that I will not commit a type I error (reject the hypothesis in error). (I cannot be 100% certain.) • [Figure: distribution of the sample mean with a rejection region in each tail]
The Test Procedure • Choosing z = 1.96 makes the probability of a Type I error 0.05. • Choosing z = 2.576 would reduce the probability of a Type I error to 0.01. • Reducing the probability of a Type I error reduces the power of the test because it reduces the probability that the null hypothesis will be rejected.
P Value • Probability of observing the sample evidence, or evidence even less consistent with the null, assuming the null hypothesis is true. • Null hypothesis is rejected if P value < α
P value = Prob[p(renters) < .74774] = Prob[z < (.74774 - .82055)/.00459] = Φ(-15.86) = .5995 × 10^-56 • Impossible under the null • P value < α = .11333
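The P value for the credit scoring example can be checked with a short sketch. It again takes the standard deviation .00459 from the slides as given. The helper `normal_cdf` is written with `math.erfc` because that keeps precision in the extreme lower tail; this is an implementation choice of the sketch, not something stated in the slides.

```python
from math import erfc, sqrt

def normal_cdf(z):
    """Standard normal CDF; erfc keeps precision far out in the lower tail."""
    return 0.5 * erfc(-z / sqrt(2))

p_hat = 5469 / (5469 + 1845)   # observed acceptance rate for renters
p0, se = 0.82055, 0.00459      # null value and standard deviation from the slides
alpha = 0.11333                # alpha level implied by the .815 cutoff

z = (p_hat - p0) / se
p_value = normal_cdf(z)        # probability of a proportion this low or lower under H0
reject = p_value < alpha

print(f"z = {z:.2f}, P value = {p_value:.3e}, reject H0: {reject}")
```

The P value is on the order of 10^-56 to 10^-57, so the null would be rejected at any conventional alpha level, not only at .11333.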
Confidence Intervals • For a two sided test about a parameter, a confidence interval is the complement of the rejection region. (Proof in text, p. 338)
Confidence Interval • If the sample mean is far from $30,000, reject the hypothesis. • Choose the region, say, a rejection region in each tail with the confidence interval between them. • I am 95% certain that the confidence interval contains the true mean of the distribution of incomes. (I cannot be 100% certain.) • [Figure: rejection region, confidence interval, rejection region]
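The link between the two sided test and the confidence interval can be illustrated numerically. In this sketch σ = 5,000, N = 400, and the three sample means are hypothetical values chosen for illustration; the function names are also assumptions. The test rejects H0: μ = $30,000 exactly when $30,000 lies outside the interval built around the sample mean.

```python
from math import sqrt
from statistics import NormalDist

def half_width(sigma, n, level=0.95):
    """z times the standard error of the mean for the given confidence level."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return z * sigma / sqrt(n)

def confidence_interval(xbar, sigma, n, level=0.95):
    h = half_width(sigma, n, level)
    return xbar - h, xbar + h

def rejects(mu0, xbar, sigma, n, level=0.95):
    """Two sided test: reject when the sample mean is far from mu0."""
    return abs(xbar - mu0) > half_width(sigma, n, level)

mu0, sigma, n = 30_000, 5_000, 400        # sigma and n are hypothetical
results = []
for xbar in (29_400, 29_800, 30_600):     # hypothetical sample means
    low, high = confidence_interval(xbar, sigma, n)
    inside = low <= mu0 <= high
    results.append((rejects(mu0, xbar, sigma, n), inside))
    print(f"mean {xbar:,}: CI = ({low:,.0f}, {high:,.0f}), "
          f"contains mu0: {inside}, test rejects: {results[-1][0]}")
```

In every case the test rejects if and only if the hypothesized mean falls outside the confidence interval.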
One Sided Tests • H0: μ = 0, H1: μ ≠ 0. Rejection region is sample mean far from 0 in either direction. • H0: μ = 0, H1: μ > 0. Sample means less than 0 cannot be in the rejection region. • Entire rejection region is above 0. • Reformulate: H0: μ ≤ 0, H1: μ > 0.
Carrying Out the LR Test • In most cases, the exact distribution of the statistic is unknown • Use -2 log λ, where λ is the likelihood ratio; in large samples it is approximately chi squared [1] • For a test about 1 parameter, the threshold value is 3.84 (5%) or 6.63 (1%)
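As a hedged illustration of the procedure, the sketch below applies a likelihood ratio test to the drug trial data from earlier in the deck (31 responders in 50 patients), testing H0: θ = .50 against an unrestricted θ. This pairing of example and test is an assumption made here for illustration; the slides do not carry out this calculation. The chi squared [1] tail probability is obtained from `math.erfc`, using the fact that a chi squared [1] variable is the square of a standard normal.

```python
from math import erfc, log, sqrt

def binom_loglike(theta, n, k):
    """Binomial log likelihood without log C(n, k), which cancels in the ratio."""
    return k * log(theta) + (n - k) * log(1 - theta)

n, k = 50, 31
theta0 = 0.50          # value under the null hypothesis
theta_hat = k / n      # unrestricted maximum likelihood estimate, .62

# -2 log(lambda), where lambda = L(theta0) / L(theta_hat)
lr_stat = -2 * (binom_loglike(theta0, n, k) - binom_loglike(theta_hat, n, k))

# Upper tail of chi squared [1]: Prob(Z^2 > x) = erfc(sqrt(x / 2))
p_value = erfc(sqrt(lr_stat / 2))

critical_5pct = 3.84
reject = lr_stat > critical_5pct
print(f"-2 log lambda = {lr_stat:.3f}, P value = {p_value:.4f}, reject at 5%: {reject}")
```

With these data the statistic falls below 3.84, so H0: θ = .50 is not rejected at the 5% level.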