30 Hypothesis Testing

30 Hypothesis Testing John Man 6s

Statistical Testing Objectives
At the end of this module, the student will be able to:
• Understand the concept of a null hypothesis and an alternate hypothesis.
• Understand the concept of Decision Risk
  • Alpha risk (Type I error)
  • Beta risk (Type II error)
• Understand why hypothesis testing is used in Analyze Phase

Why Learn Hypothesis Testing?
1. Many problems require a decision to accept or reject a statement about a parameter.
2. That statement is a Hypothesis. It represents the translation of a practical question into a statistical question.
3. The decision-making procedure is known as Hypothesis Testing.
4. Statistical testing provides an objective solution, with known risks, to questions which are traditionally answered subjectively.

Hypotheses
• A hypothesis is “a tentative assumption made in order to draw out or test its logical or empirical consequences.” (a.k.a. practical question)
• The purpose of the hypothesis is to establish a basis, so that one can gather evidence to either disprove the statement or accept it as true.
Examples
• This man is not guilty
• The water from my well is safe to drink
• This product is safe to use when the manufacturer’s instructions are followed

Statistical Hypotheses
A statistical hypothesis is a statement about the value of one of the characteristics of one or more populations (e.g. averages, standard deviations).
Examples
• The average commute time using Interstate 35 is shorter than using Portland Ave.
• This policy change will decrease the median cycle time on downstream processes
• The variation in cycle time using Method A is 20% greater than using Method B
• The abandonment rate this month is the same as last month’s

Statistical Hypothesis Test in Six Sigma
[Figure: before/after distributions in four panels: No Shift in Mean or Variation, Mean Shifts, Variation Shift (variation reduction), Mean and Variation Shifts]

Null Hypotheses
• The Null Hypothesis (H0)
  • Statement generally assumed to be true unless sufficient evidence is found to the contrary
  • Often assumed to be the status quo, or the preferred outcome. However, it sometimes represents a state you strongly want to disprove.
  • Designated as H0

Alternative Hypotheses
• The Alternative Hypothesis (Ha)
  • Statement generally held to be true if the null hypothesis is rejected
  • Can be based on a specific difference in a characteristic value that one desires to detect
  • Designated as Ha

Hypothesis Testing Errors
With Hypothesis Testing there is a risk of erroneous conclusion(s)!
Truth | Decision: Don’t Reject Ho | Decision: REJECT Ho {Accept Ha}
Ho TRUE | CORRECT DECISION | TYPE I ERROR (α Risk), typically set @ 0.05
Ho FALSE | TYPE II ERROR (β Risk), typically set @ 0.10 | CORRECT DECISION
THE PROBABILITY OF A TYPE I ERROR IS α.
THE PROBABILITY OF A TYPE II ERROR IS β.

Hypothesis Testing Risks
• The Risks Involved in Hypothesis Testing
  • Type I Error: Rejecting the null hypothesis when it is true. Probability of this error equals α (by convention).
  • Type II Error: Failing to reject the null hypothesis when it is false. Probability of this error equals β (by convention). When Ha has been quantified, the Type II error is better defined as rejecting the alternative hypothesis when it is true.
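The two risks can be estimated by simulation. The sketch below is illustrative only: it assumes a normally distributed process with a known standard deviation and a two-sided z-test, none of which is specified in the slides, and the function names and parameter values (mu0, sigma, n, the shift of 1.0) are made up for the example.

```python
import math
import random
from statistics import NormalDist, fmean

def z_test_p_value(sample, mu0, sigma):
    """Two-sided p-value for Ho: mu = mu0, with sigma assumed known."""
    z = (fmean(sample) - mu0) / (sigma / math.sqrt(len(sample)))
    return 2 * (1 - NormalDist().cdf(abs(z)))

def rejection_rate(true_mu, mu0=10.0, sigma=2.0, n=30, alpha=0.05,
                   trials=5000, seed=1):
    """Fraction of simulated samples for which Ho is rejected."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mu, sigma) for _ in range(n)]
        if z_test_p_value(sample, mu0, sigma) <= alpha:
            rejections += 1
    return rejections / trials

# Ho true: every rejection is a Type I error, so the rate should sit near alpha.
type_1_rate = rejection_rate(true_mu=10.0)
# Ho false (true mean shifted by 1.0): every non-rejection is a Type II error.
type_2_rate = 1 - rejection_rate(true_mu=11.0)
print(f"Type I rate ~ {type_1_rate:.3f}, Type II rate ~ {type_2_rate:.3f}")
```

The Type I rate is controlled by the chosen α alone, while the Type II rate also depends on the size of the shift, the process variation, and the sample size.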

Hypothesis Testing Examples
U.S. Legal System
Truth | Jury’s Decision: Not Guilty | Jury’s Decision: Guilty
Actually Innocent (Ho TRUE) | CORRECT DECISION | TYPE I ERROR (α Risk): Innocent Goes to Jail
Actually Guilty | TYPE II ERROR (β Risk): Criminal Goes Free | CORRECT DECISION

Hypothesis Testing Examples
France Legal System
Truth | Jury’s Decision: Guilty | Jury’s Decision: Not Guilty
Actually Guilty (Ho TRUE) | CORRECT DECISION | TYPE I ERROR (α Risk): Criminal Goes Free
Actually Innocent | TYPE II ERROR (β Risk): Innocent Goes to Jail | CORRECT DECISION

Hypothesis Testing Examples
Black Belt: Paul Lavery
Location: Limavady, Northern Ireland
PROJECT: Reduce Late Invoice Payments
Truth | Decision: Don’t Reject Ho | Decision: REJECT Ho
Ho true: The payment performance is independent of the category into which the non-matched invoice goes | CORRECT DECISION | TYPE I ERROR (α Risk)
Ho false: The payment performance is dependent on the category into which the non-matched invoice goes | TYPE II ERROR (β Risk) | CORRECT DECISION

Hypothesis Testing Examples
Black Belt: Paul Lavery
Location: Limavady, Northern Ireland
PROJECT: Reduce Late Invoice Payments
Truth | Decision: Don’t Reject Ho | Decision: REJECT Ho
Ho true: The payment performance is independent of the terms on the invoice | CORRECT DECISION | TYPE I ERROR (α Risk)
Ho false: The payment performance is dependent on the terms of the invoice | TYPE II ERROR (β Risk) | CORRECT DECISION

Hypothesis Testing Results
Black Belt: Paul Lavery
Location: Limavady, Northern Ireland
Project: Reduce Late Invoice Payments
• Invoices which fail to match first time are less likely to be paid on time.
• Payment terms on the invoice could be important.
• The reason for not matching first time is not important.
Other Hypothesis Tests from Paul’s Project:
• The currency of the invoice is not important.
• Each AP staff member performs the same.

Decision Error Family Feud - Scenario #1
Real World Hypothesis: I’m wondering if there is a difference in the safety rating for USAIR and Delta.
Statistical Hypothesis
Ho: AverageUSAIR = AverageDelta
Ha: AverageUSAIR ≠ AverageDelta
Contestant/Team Questions
Type I Family: When does Type I Error occur?
Type II Family: When does Type II Error occur?
TYPE I ERROR (α Risk): Occurs when you decide there is a difference in the safety ratings when the truth is there is not a difference.
TYPE II ERROR (β Risk): Occurs when you decide there is not a difference in the safety ratings when the truth is there is a difference.

Decision Error Family Feud - Scenario #2
Real World Hypothesis: I’m wondering if there is a different amount of chemicals in city water than in well water.
Statistical Hypothesis
Ho: AverageCity = AverageWell
Ha: AverageCity ≠ AverageWell
Contestant/Team Questions
Type I Family: When does Type I Error occur?
Type II Family: When does Type II Error occur?
TYPE I ERROR (α Risk): Occurs when you decide there is a difference in the chemical content when the truth is there is not a difference.
TYPE II ERROR (β Risk): Occurs when you decide there is not a difference in the chemical content when the truth is there is a difference.

Decision Error Family Feud - Scenario #3
Real World Hypothesis: I’m wondering if the proposal hit rate is different if proposals are delivered to the customer within five days versus more than five days.
Statistical Hypothesis
Ho: Proportion(≤5 days) = Proportion(>5 days)
Ha: Proportion(≤5 days) ≠ Proportion(>5 days)
Contestant/Team Questions
Type I Family: When does Type I Error occur?
Type II Family: When does Type II Error occur?
TYPE I ERROR (α Risk): Occurs when you decide there is a difference in the hit rate when the truth is there is not a difference.
TYPE II ERROR (β Risk): Occurs when you decide there is not a difference in the hit rate when the truth is there is a difference.
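A hypothesis about two proportions, as in Scenario #3, is commonly checked with a two-proportion z-test. The sketch below is one way to do that, not the method used in the module; the proposal counts are invented purely for illustration, and the function name is hypothetical.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(hits_a, n_a, hits_b, n_b):
    """Two-sided z-test of Ho: proportion A = proportion B (pooled estimate)."""
    p_a, p_b = hits_a / n_a, hits_b / n_b
    pooled = (hits_a + hits_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Hypothetical counts, invented for illustration only:
# 40 of 100 proposals delivered within 5 days were won,
# 25 of 100 proposals delivered later were won.
z, p_value = two_proportion_z_test(40, 100, 25, 100)
print(f"z = {z:.2f}, p = {p_value:.4f}")
```

With these made-up counts the p-value falls below 0.05, so Ho would be rejected at α = .05. If the hit rates were in truth equal, that rejection would be a Type I error.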

Hypothesis Testing “Dirty Harry”
Truth | Decision: Don’t Give Up | Decision: Give Up
1 bullet (5 Shots) | CORRECT DECISION | TYPE I ERROR (α Risk)
No bullets (6 Shots) | TYPE II ERROR (β Risk) | CORRECT DECISION

Why is this Hypothesis Testing Stuff Necessary?
Analyze Phase: Is X related to Y, or did I observe this by chance?
• When we’re searching for important X’s, we don’t know if they are important or not.
• We just take a sample and make a decision.
• Since we don’t know the Truth, there is risk in this decision.
The Truth | Your Decision: “Don’t Reject Ho” | Your Decision: Reject Ho
Ho True | Correct | Type I Error (α-Risk)
Ho False | Type II Error (β-Risk) | Correct

How to use Hypothesis Testing in Analyze Phase?
Did this relationship of X and Y I observed occur by chance?
α is the critical “P-Value”!!!
The Truth | Your Decision: “Don’t Reject Ho” | Your Decision: Reject Ho
Ho True | Correct | Type I Error (α-Risk)
Ho False | Type II Error (β-Risk) | Correct
P-value: if the value is as small as or smaller than α, we say that the data are statistically significant at the α level of significance.

How to use Hypothesis Testing in Analyze Phase?
α is the critical “P-Value”!!!
If p is low, we declare war on the Null, Ho!
OR
If p is low, X is a go
If p is high, don’t say goodbye
The Truth | Your Decision: “Don’t Reject Ho” | Your Decision: Reject Ho
Ho True | Correct | Type I Error (α-Risk)
Ho False | Type II Error (β-Risk) | Correct
What is Low?
• Generally, if p is below:
  • REQUIRED: Ten percent (α = .10), this is uncomfortable, but may be appropriate for certain situations.
  • DESIRED: Five percent is comfortable (α = .05).
  • INSPIRED: One percent feels very good (α = .01).
But your choice should depend on interests and consequences!!
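The decision rule on this slide comes down to comparing the p-value with the chosen α. A minimal sketch, with a hypothetical function name and an example p-value of 0.03 picked only for illustration:

```python
def decide(p_value, alpha=0.05):
    """Apply the rule 'if p is low, reject Ho' at significance level alpha."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    return "Reject Ho" if p_value <= alpha else "Don't reject Ho"

# The same evidence can lead to different decisions at different alpha levels.
for alpha in (0.10, 0.05, 0.01):
    print(f"alpha = {alpha:.2f}: {decide(0.03, alpha)}")
```

The same p-value is significant at α = .10 and α = .05 but not at α = .01, which is why α should be chosen from the consequences of each error before the data are examined.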

Hypothesis Testing Key Knowledge Points

Appendix (Hypothesis Testing Methods)

Hypothesis Testing Definitions
• Null Hypothesis (Ho) - Statement of no change or difference. This statement is assumed true until sufficient evidence is presented to reject it.
• Alternative Hypothesis (Ha) - Statement of change or difference. This statement is considered true if Ho is rejected.
• Type I Error - The error of rejecting Ho when it is in fact true, or of saying there is a difference when, in fact, there is no difference.
• Alpha Risk - The maximum risk or probability of making a Type I Error. This probability is always greater than zero, and is usually established at 5%. The researcher decides the greatest level of risk that is acceptable for a rejection of Ho.
• Significance Level - Same as Alpha Risk.
• Type II Error - The error of failing to reject Ho when it is in fact false, or of saying there is no difference when there really is a difference.
• Beta Risk - The risk or probability of making a Type II Error, or of overlooking an effective treatment or solution to the problem.

Hypothesis Testing Definitions
• Significant Difference - The term used to describe the results of a statistical hypothesis test where a difference is too large to be reasonably attributed to chance. Often referred to as  or .
• Power - The ability of a statistical test to detect a real difference when there really is one, or the probability of being correct in rejecting Ho. Commonly used to determine if sample sizes are sufficient to detect a difference in treatments if one exists. Power = 1 - β
• Test Statistic - A standardized value (Z, t, F, etc.) which represents the feasibility of Ho, and is distributed in a known manner such that a probability for this observed value can be determined. Usually, the more feasible Ho is, the smaller the absolute value of the test statistic, and the greater the probability of observing this value within its distribution.
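The relation Power = 1 - β can be made concrete for one simple case. The sketch below assumes a two-sided one-sample z-test with known standard deviation, which the slides do not specify; `delta` (the true shift to detect), `sigma`, and both function names are hypothetical.

```python
import math
from statistics import NormalDist

def z_test_power(delta, sigma, n, alpha=0.05):
    """Power of a two-sided one-sample z-test to detect a true shift delta."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = delta * math.sqrt(n) / sigma  # shift measured in standard errors
    # Probability the test statistic lands in either rejection tail.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)

def sample_size_for_power(delta, sigma, target_power=0.90, alpha=0.05):
    """Smallest n whose power reaches target_power (simple linear search)."""
    n = 2
    while z_test_power(delta, sigma, n, alpha) < target_power:
        n += 1
    return n

power = z_test_power(delta=0.5, sigma=1.0, n=30)
beta = 1 - power
n_needed = sample_size_for_power(delta=0.5, sigma=1.0)
print(f"power = {power:.3f}, beta = {beta:.3f}, n for 90% power = {n_needed}")
```

When `delta` is zero the function returns α itself, since with Ho true the only way to reject is a Type I error. This is the kind of calculation used to judge whether a sample size is sufficient to detect a difference of a given size.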

Hypothesis Testing Definitions, P-value
• P-value: if the value is as small as or smaller than α, we say that the data are statistically significant at the α level of significance.

Statistics
“There are three kinds of lies: Lies, damned lies, and statistics.” Mark Twain
Statistics are tools. Like any other tool they can be misused, which may result in misleading, distorted, or incorrect conclusions. It is not sufficient to be able to do the computations. One must also be able to make the correct interpretations.

Statistical Terms
• Definition of Statistics: Statistics are facts and figures. Statistics consist of a set of methods and rules for organizing and interpreting observations from populations and samples.
• Populations and Samples: Population is the entire group or set of all possible events of interest in the particular study. Sample is a subset of the population.
• A Statistic: A numerical value that describes a sample.
• Data Type: Dictates the statistical tool that is applicable.
[Figure: ENTIRE POPULATION with the SAMPLE WITHIN (subset)]

Inferential Statistics
Inferential Statistics - A branch of statistics that enables us to draw conclusions about a population based on a sample.
[Figure labels: facts, figures, organizing, interpreting]

Inferential Statistics and Hypothesis Testing
A newspaper article claims that the average height of males in the USA is not the same as it was 50 years ago. It is now 5’11’’ (or about 5.9’). To investigate the article’s claim, you randomly select 75 males and measure their heights. The results are:
Mean (x̄) = 5.76 ft
Standard deviation (s) = 0.435 ft
The data can be found in: Analyze Data.MTW
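The summary statistics above are enough for a rough check of the article's figure. The sketch below assumes the hypothesis under test is Ho: μ = 5.9 ft (the slide's rounded value) against a two-sided alternative, and it uses a large-sample normal approximation rather than the t-test a statistics package would normally run on the raw data. A t-test with 74 degrees of freedom would give a slightly larger p-value.

```python
import math
from statistics import NormalDist

# Values taken from the slide.
n = 75
sample_mean = 5.76   # ft
sample_sd = 0.435    # ft
mu_0 = 5.9           # ft, hypothesized mean (assumed Ho: mu = 5.9)
alpha = 0.05         # assumed significance level

std_err = sample_sd / math.sqrt(n)        # standard error of the mean
z = (sample_mean - mu_0) / std_err        # standardized test statistic
p_value = 2 * NormalDist().cdf(-abs(z))   # two-sided p-value (normal approx.)
decision = "Reject Ho" if p_value <= alpha else "Don't reject Ho"

print(f"z = {z:.2f}, p = {p_value:.4f}: {decision}")
```

Under these assumptions the sample mean lies almost three standard errors below 5.9 ft, so Ho is rejected at α = .05.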

Hypothesis Testing Description
• Statistics communicate information from data
• Statistics are not a substitute for professional judgment
• Hypothesis Testing answers the practical question: “Is there a real difference between _____ and _____ ?”
• A practical process problem is translated into a statistical hypothesis in order to answer this question
• In hypothesis testing, we use relatively small samples to answer questions about population parameters
• There is always a chance that we selected a sample that is not representative of the population. Therefore, there is always a chance that the conclusion obtained is wrong.
• With some assumptions, inferential statistics allows us to estimate the probability of getting an unlikely sample. This lets us quantify, as the P-Value, the probability of seeing a sample at least this extreme when Ho is true.

We Never “Accept” the Null Hypothesis
• The null hypothesis is the hypothesis being tested. It is either rejected or not rejected on the basis of sample information.
• The alternative hypothesis is specified as another choice if the null is rejected.
• It is never possible to prove beyond a doubt when sampling that a null is correct.

Risks Exercise
Determine H0 and HA, and comment on the Type I and Type II errors
1. An American trial
2. The safety of the water from your well
3. A process change which should not cause any effect on downstream results
4. Comparison of a new vendor’s parts (which are slightly more expensive) to the present vendor’s, when variation is a major issue
5. The yield on Tester ECTZ21 is the same as the yield on Tester ECTZ33

Process Hypotheses
• Process Situations
  • Comparison of one population to a standard
  • Comparison of two populations
• Each comparison is either
  • Single sided: comparison considers a difference only if it is greater or only if it is less, but not both
  • Two sided: comparison considers any difference (inequality) important

Two Sided (Two “Tailed”) Test Hypotheses
(μH is some hypothesized value)
H0: μ = μH
HA: μ ≠ μH
[Figure: distribution with a Reject Region in each tail and the Do Not Reject Region between them]

One Sided (One “Tailed”) Test Hypotheses
(μH is some hypothesized value)
H0: μ ≥ μH, HA: μ < μH
or
H0: μ ≤ μH, HA: μ > μH
[Figure: distribution with a single Reject Region in one tail and the Do Not Reject Region covering the rest]
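The reject regions for one-sided and two-sided tests differ only in how α is split across the tails. The sketch below assumes a standard normal (z) test statistic, which the slides do not specify; the function names and the `tail` labels are hypothetical.

```python
from statistics import NormalDist

def critical_values(alpha=0.05, tail="two-sided"):
    """Return (lower, upper) critical z-values; None means no limit on that side."""
    std_normal = NormalDist()
    if tail == "two-sided":            # HA: mu != mu_H, alpha split across both tails
        c = std_normal.inv_cdf(1 - alpha / 2)
        return (-c, c)
    if tail == "lower":                # HA: mu < mu_H, all of alpha in the lower tail
        return (std_normal.inv_cdf(alpha), None)
    if tail == "upper":                # HA: mu > mu_H, all of alpha in the upper tail
        return (None, std_normal.inv_cdf(1 - alpha))
    raise ValueError("tail must be 'two-sided', 'lower', or 'upper'")

def in_reject_region(z_stat, alpha=0.05, tail="two-sided"):
    """True when the test statistic falls in the reject region."""
    lower, upper = critical_values(alpha, tail)
    below = lower is not None and z_stat <= lower
    above = upper is not None and z_stat >= upper
    return below or above

for tail in ("two-sided", "lower", "upper"):
    print(tail, critical_values(0.05, tail), in_reject_region(1.8, 0.05, tail))
```

A statistic of z = 1.8 is rejected by the upper one-sided test but not by the two-sided test at the same α, because the one-sided test puts the whole 5% in one tail.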

Process Hypotheses: Null and Alternate
• NULL HYPOTHESIS: Nothing has changed:
  • For Tests Of Process Mean: H0: μ = μ0
  • For Tests Of Process Variance: H0: σ² = σ0²
• ALTERNATE HYPOTHESIS: Change has occurred:
Comparison | MEAN | VARIANCE
INEQUALITY | Ha: μ ≠ μ0 | Ha: σ² ≠ σ0²
NEW > OLD | Ha: μ > μ0 | Ha: σ² > σ0²
NEW < OLD | Ha: μ < μ0 | Ha: σ² < σ0²

Parameters vs Statistics
Characteristic | Population Parameters | Sample Statistics
Mean | μ | x̄
Standard Deviation | σ | s
Proportion (percent) | p | p̂
1. Population parameters (values) are fixed, but unknown
2. Sample statistics are used to estimate population values
Hypotheses are statements about population parameters, not sample statistics.
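The distinction between a fixed parameter and a varying statistic can be seen in a small simulation. The population below is synthetic and its values (mean 50, standard deviation 5, sample size 40) are arbitrary choices for illustration.

```python
import random
from statistics import mean, pstdev, stdev

rng = random.Random(0)

# Synthetic population: in practice its parameters would be unknown.
population = [rng.gauss(50.0, 5.0) for _ in range(10_000)]
mu = mean(population)        # population mean (parameter, fixed)
sigma = pstdev(population)   # population standard deviation (parameter, fixed)

# Three independent samples give three different sets of statistics.
samples = [rng.sample(population, 40) for _ in range(3)]
estimates = [(mean(s), stdev(s)) for s in samples]

print(f"parameters: mu = {mu:.2f}, sigma = {sigma:.2f}")
for x_bar, s in estimates:
    print(f"statistics: x-bar = {x_bar:.2f}, s = {s:.2f}")
```

Each sample yields a slightly different x̄ and s, while μ and σ never change. This is why hypotheses are stated about the parameters and the sample statistics serve only as evidence.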
