## Inference for Categorical Variables


**Inference for Categorical Variables**

Probability & Statistics, L. Weinstein, May 2014

**Testing a Claim with Categorical Data**

• Three tests:
  • Goodness of Fit Test: Does the distribution of the categorical variable fit an expected model?
  • Test for Homogeneity of Populations: Does each population have the same distribution for this variable?
  • Test for Association / Independence: Are two categorical variables associated?

**Goodness of Fit Test**

State: Is the distribution of <your variable here> different from the expected distribution of <be specific here>?

H0: The distribution is the same as expected for all categories.

Ha: The distribution is different from expected for at least one category.

Test at significance level <choose a level>.

**Goodness of Fit Test**

Plan: Use a Goodness of Fit test.

Conditions:
• Sample is randomly selected from the population.
• All expected counts are at least 5.
• Sample observations are independent; that is, if sampling without replacement, the sample size is not more than 10% of the population size.

**Goodness of Fit Test**

To conduct the test in Minitab, summarize the data by category and put the counts in one column. If equal counts are expected, this is enough. If something other than equal counts is expected, make a second column of expected counts. Then run Stat > Tables > Chi-Square Goodness of Fit Test in Minitab.

**Goodness of Fit Test**

Enter the column names for Observed Counts, Category names, and Proportions specified by historical counts (this is your expected counts list).

**Goodness of Fit Test**

Do: <Include Minitab results of chi-square test here> <Indicate the value of the test statistic, χ², and the P-value of the test.>

**Goodness of Fit Test**

Conclude: <Compare your P-value to your significance level. Based on this comparison, either reject or fail to reject the null hypothesis. Conclude, or do NOT conclude, the alternative hypothesis in words.>

**Test for Homogeneity**

State: Is the distribution of <your variable here> different for the populations <be specific here>?

H0: The distribution is the same for all populations.

Ha: The distribution is different for at least one population.

Test at significance level <choose a level>.

**Test for Homogeneity**

Plan: Use a Test for Homogeneity.

Conditions:
• Samples are randomly selected from each population.
• All expected counts are at least 5.
• Sample observations are independent; that is, if sampling without replacement, each sample size is not more than 10% of that population's size.

**Test for Homogeneity**

To conduct the test in Minitab, make one column per population, each holding the summarized distribution (counts by category) of the variable for that population. Then run Stat > Tables > Chi-Square Test (2-way table) in Minitab.

**Test for Homogeneity**

Enter the column names for each population.

**Test for Homogeneity**

Do: <Include Minitab results of chi-square test here> <Indicate the value of the test statistic, χ², and the P-value of the test.>

**Test for Homogeneity**

Conclude: <Compare your P-value to your significance level. Based on this comparison, either reject or fail to reject the null hypothesis. Conclude, or do NOT conclude, the alternative hypothesis in words.>

**Test for Independence**

State: Is there an association between <categorical variable one> and <categorical variable two>?

H0: There is no association between the variables (they are independent).

Ha: There is an association between the variables (they are NOT independent).

Test at significance level <choose a level>.

**Test for Independence**

Plan: Use a Test for Independence / Association.

Conditions:
• Sample is randomly selected from the population.
• All expected counts are at least 5.
• Sample observations are independent; that is, if sampling without replacement, the sample size is not more than 10% of the population size.

**Test for Independence**

To conduct the test in Minitab, make a two-way table summarizing the observed counts for each combination of categories of the two variables. Then run Stat > Tables > Chi-Square Test (2-way table) in Minitab.

**Test for Independence**

Enter the column names that contain the summarized data.

**Test for Independence**

Do: <Include Minitab results of chi-square test here> <Indicate the value of the test statistic, χ², and the P-value of the test.>

**Test for Independence**

Conclude: <Compare your P-value to your significance level. Based on this comparison, either reject or fail to reject the null hypothesis. Conclude, or do NOT conclude, the alternative hypothesis in words.>
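
**Checking the Do step outside Minitab**

The slides rely on Minitab for the computation. As a cross-check, the chi-square statistic, degrees of freedom, and P-value for all three tests can be sketched in plain Python. This is a minimal illustration under stated assumptions, not Minitab's implementation: the function names `chi2_sf`, `goodness_of_fit`, and `two_way_test` are hypothetical, the counts are invented for the example, and no continuity correction is applied. The homogeneity and independence tests share one function because both use the same two-way table calculation.

```python
import math


def chi2_sf(stat, df):
    """Upper-tail probability P(X >= stat) for a chi-square variable with df degrees of freedom.

    Uses the closed forms for integer df; intended for classroom-sized df only.
    """
    x = stat / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-x)               # Q(1, x) = e^(-x)
    else:
        a, q = 0.5, math.erfc(math.sqrt(x))    # Q(1/2, x) = erfc(sqrt(x))
    # Recurrence Q(a + 1, x) = Q(a, x) + x^a e^(-x) / Gamma(a + 1), applied until a = df / 2.
    while a < df / 2.0:
        q += x ** a * math.exp(-x) / math.gamma(a + 1.0)
        a += 1.0
    return q


def goodness_of_fit(observed, proportions=None):
    """Chi-square goodness-of-fit test; equal proportions are assumed if none are given."""
    n = sum(observed)
    k = len(observed)
    if proportions is None:
        proportions = [1.0 / k] * k
    expected = [n * p for p in proportions]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = k - 1
    return stat, df, chi2_sf(stat, df), expected


def two_way_test(table):
    """Chi-square test on a two-way table of counts (homogeneity or independence)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count for each cell = (row total * column total) / grand total.
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df, chi2_sf(stat, df), expected


# Hypothetical data, for illustration only.
alpha = 0.05

gof_stat, gof_df, gof_p, gof_expected = goodness_of_fit([18, 22, 20, 25, 15])
assert min(gof_expected) >= 5  # "all expected counts are at least 5" condition
print(f"Goodness of fit: chi-square = {gof_stat:.3f}, df = {gof_df}, P-value = {gof_p:.4f}")
print("Reject H0" if gof_p < alpha else "Fail to reject H0")

tw_stat, tw_df, tw_p, tw_expected = two_way_test([[30, 20], [20, 30]])
assert min(min(row) for row in tw_expected) >= 5
print(f"Two-way table: chi-square = {tw_stat:.3f}, df = {tw_df}, P-value = {tw_p:.4f}")
print("Reject H0" if tw_p < alpha else "Fail to reject H0")
```

The returned expected counts make it easy to verify the "at least 5" condition before interpreting the P-value. The randomness and 10% conditions still have to be checked from how the data were collected.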