BCOR 1020 Business Statistics

BCOR 1020 Business Statistics. Lecture 27 – April 29, 2008. Overview: Chapter 14 – Chi-Square Tests: Chi-Square Distribution; Chi-Square Test for Independence; Chi-Square Test for Goodness of Fit.


Presentation Transcript


  1. BCOR 1020 Business Statistics. Lecture 27 – April 29, 2008

  2. Overview • Chapter 14 – Chi-Square Tests • Chi-Square Distribution • Chi-Square Test for Independence • Chi-Square Test for Goodness of Fit

  3. Chapter 14 – Chi-Square Distribution Chi-Square Distribution: • For tests whose test statistic involves a sum of squared differences, we will often use a chi-square distribution. • Critical values come from the chi-square probability distribution with $\nu$ degrees of freedom ($\nu$ varies with the application). • Appendix E contains critical values for right-tail areas of the chi-square distribution. • The mean of a chi-square distribution is $\nu$ and its variance is $2\nu$.
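
The mean and variance quoted on this slide can be checked with a small simulation. The sketch below relies on a standard fact that the slide does not state: a chi-square variable with a given number of degrees of freedom is distributed as the sum of that many squared independent standard normal draws. The function name, seed, and sample size are illustrative choices, not part of the lecture.

```python
import random
import statistics

def simulate_chi_square(df, draws, seed=0):
    """Draw chi-square values as sums of df squared standard normals."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(df))
            for _ in range(draws)]

df = 5  # illustrative degrees of freedom
samples = simulate_chi_square(df, draws=20000)
sample_mean = statistics.mean(samples)
sample_var = statistics.variance(samples)

# The sample mean should land near df and the sample variance near 2 * df.
print(f"mean = {sample_mean:.3f} (theory {df})")
print(f"variance = {sample_var:.3f} (theory {2 * df})")
```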

  4. Chapter 14 – Chi-Square Distribution Chi-Square Distribution: • Consider the shape of the chi-square distribution. • Example: Find the upper 10% critical point for each of these distributions: $\chi^2_{.10} = 6.251$ ($\nu = 3$), $\chi^2_{.10} = 18.55$ ($\nu = 12$), $\chi^2_{.10} = 4.605$ ($\nu = 2$).
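
The lecture reads critical values from Appendix E or Excel. As a rough alternative, the sketch below computes the right-tail area for integer degrees of freedom from closed-form series and then finds the critical value by bisection. The helper names `chi2_sf` and `chi2_critical` are hypothetical, and the approach is only one of several ways to obtain these numbers.

```python
import math

def chi2_sf(x, df):
    """Right-tail area P(X > x) for a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{i=0}^{df/2 - 1} (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: erfc(sqrt(x/2)) + exp(-x/2) * sum_{i=1}^{(df-1)/2} (x/2)^(i-1/2) / Gamma(i+1/2)
    total = 0.0
    for i in range(1, (df - 1) // 2 + 1):
        total += half ** (i - 0.5) / math.gamma(i + 0.5)
    return math.erfc(math.sqrt(half)) + math.exp(-half) * total

def chi2_critical(alpha, df, tol=1e-9):
    """Value whose right-tail area equals alpha, found by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_sf(hi, df) > alpha:  # widen the bracket until the tail is small enough
        hi *= 2.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if chi2_sf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Upper 10% critical points for a few degrees of freedom
for df in (2, 3, 12):
    print(df, round(chi2_critical(0.10, df), 3))
```

The printed values can be compared against the corresponding rows of the chi-square table.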

  5. Clicker: Using the chi-square table, find the upper 5% critical point for a chi-square distribution with $\nu = 5$ degrees of freedom. (A) 1.610 (B) 9.488 (C) 9.236 (D) 11.07

  6. Chapter 14 – Chi-Square Test for Independence Contingency Tables: • A contingency table is a cross-tabulation of n paired observations into categories. • Each cell shows the count of observations that fall into the category defined by its row (r) and column (c) heading.

  7. Chapter 14 – Chi-Square Test for Independence Contingency Tables: • For example: (overhead)

  8. Chapter 14 – Chi-Square Test for Independence Chi-Square Test: • In a test of independence for an r × c contingency table, the hypotheses are H0: Variable A is independent of variable B; H1: Variable A is not independent of variable B. • Use the chi-square test for independence to test these hypotheses. • This non-parametric test is based on frequencies. • The n data pairs are classified into c columns and r rows, and then the observed frequency $f_{jk}$ is compared with the expected frequency $e_{jk}$.

  9. Chapter 14 – Chi-Square Test for Independence Chi-Square Distribution: • The critical value comes from the chi-square probability distribution with $\nu$ degrees of freedom. • $\nu$ = degrees of freedom = $(r - 1)(c - 1)$, where r = number of rows in the table and c = number of columns in the table.

  10. Chapter 14 – Chi-Square Test for Independence Expected Frequencies: • Assuming that H0 is true, the expected frequency of row j and column k is $e_{jk} = R_j C_k / n$, where $R_j$ = total for row j (j = 1, 2, …, r), $C_k$ = total for column k (k = 1, 2, …, c), and n = sample size.

  11. Chapter 14 – Chi-Square Test for Independence Expected Frequencies: • The table of expected frequencies is: (table on slide) • The $e_{jk}$ always sum to the same row and column totals as the observed frequencies.
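
The expected-frequency rule (row total times column total, divided by the sample size) can be sketched in a few lines. The 2 × 3 table of counts below is hypothetical and is not the lecture's overhead data; the function name is likewise an illustrative choice.

```python
def expected_frequencies(observed):
    """Expected cell counts under independence: e_jk = R_j * C_k / n."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Hypothetical 2 x 3 table of observed counts (not the lecture's data)
observed = [[20, 30, 50],
            [30, 20, 50]]
expected = expected_frequencies(observed)

for row in expected:
    print(row)

# The expected table reproduces the observed row and column totals.
print([sum(row) for row in expected], [sum(col) for col in zip(*expected)])
```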

  12. Chapter 14 – Chi-Square Test for Independence Steps in Testing the Hypotheses: • Step 1: State the Hypotheses. H0: Variable A is independent of variable B; H1: Variable A is not independent of variable B. • Step 2: State the Decision Rule. Calculate $\nu = (r - 1)(c - 1)$. For a given $\alpha$, look up the right-tail critical value $\chi^2_{\alpha,\nu}$ from Appendix E or by using Excel. Reject H0 if the test statistic exceeds $\chi^2_{\alpha,\nu}$ (or if the p-value < $\alpha$).

  13. Chapter 14 – Chi-Square Test for Independence Steps in Testing the Hypotheses: • Step 3: Calculate the Expected Frequencies, $e_{jk} = R_j C_k / n$. For example, (on slide)

  14. Chapter 14 – Chi-Square Test for Independence Steps in Testing the Hypotheses: • Step 4: Calculate the Test Statistic. The chi-square test statistic is $\chi^2 = \sum_{j=1}^{r} \sum_{k=1}^{c} (f_{jk} - e_{jk})^2 / e_{jk}$. • Step 5: Make the Decision. Reject H0 if the test statistic exceeds $\chi^2_{\alpha,\nu}$ or if the p-value < $\alpha$.
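
The five steps can be strung together as in the sketch below. It assumes the usual statistic, the sum over all cells of (observed minus expected) squared divided by expected, and uses a hypothetical 4 × 3 table of counts rather than the lecture's overhead data. The 5% critical value 12.59 for 6 degrees of freedom is the one quoted in the lecture's privacy-disclaimer example.

```python
def chi_square_independence(observed):
    """Return (test statistic, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for j, row in enumerate(observed):
        for k, f in enumerate(row):
            e = row_totals[j] * col_totals[k] / n  # expected count under H0
            stat += (f - e) ** 2 / e
    df = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, df

# Hypothetical 4 x 3 table of observed counts (not the lecture's data)
observed = [[30, 20, 10],
            [25, 25, 10],
            [15, 25, 20],
            [10, 20, 30]]
stat, df = chi_square_independence(observed)

critical_value = 12.59  # right-tail 5% point for 6 d.f., as quoted in the lecture
reject_h0 = stat > critical_value
print(f"chi-square = {stat:.3f}, d.f. = {df}, reject H0: {reject_h0}")
```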

  15. Chapter 14 – Chi-Square Test for Independence Example: Privacy Disclaimer Location and Web Site Nationality (on overhead) • The actual frequencies are on the overhead (and slide #4). • Our hypotheses are: H0: Privacy disclaimer location is independent of Web site nationality. H1: Privacy disclaimer location is dependent on Web site nationality. • Decision Rule (at $\alpha$ = 5%): Degrees of freedom: $\nu = (r - 1)(c - 1) = (4 - 1)(3 - 1) = 6$. Reject H0 if $\chi^2 > \chi^2_{\alpha,\nu} = \chi^2_{.05,6} = 12.59$.

  16. Chapter 14 – Chi-Square Test for Independence Example: Privacy Disclaimer Location and Web Site Nationality (on overhead) • We computed the expected frequencies… • We can use these and the actual frequencies to calculate our test statistic…

  17. Clickers Example (continued)… Decision: Based on our test statistic and decision criteria, we should… (A) Fail to reject H0. (B) Reject H0. (C) Start Laughing. (D) Abandon Hope.

  18. Chapter 14 – Chi-Square Test for Independence Another Example… • Fill in the missing elements for the contingency table below (problem 14.2 on page 541)… • Our chi-square test will have $\nu = (r - 1)(c - 1) = 3$ d.f.

  19. Chapter 14 – Chi-Square Test for Independence

  20. Chapter 14 – Chi-Square Test for Independence Example (continued)… • To conduct the test for independence, state the hypotheses: H0: Running shoe ownership by age group is independent of world region. H1: Running shoe ownership by age group is dependent on world region. • Decision Rule (at $\alpha$ = 5%): Reject H0 if $\chi^2 > \chi^2_{\alpha,\nu} = \chi^2_{.05,3} = 7.815$.

  21. Clickers Example (continued)… Our chi-square statistic is computed as $\chi^2 = 19.312$. What should our decision be? (A) Fail to reject H0. (B) Reject H0. (C) Too close to call.

  22. Clickers Example (continued)… For our computed chi-square statistic, $\chi^2 = 19.312$, which has $\nu = 3$ d.f. under H0, what is the best bound for the p-value for this test using the chi-square table? (A) p-value > 0.05 (B) 0.01 < p-value < 0.05 (C) 0.005 < p-value < 0.01 (D) p-value < 0.005
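
Bounding a p-value from a printed table amounts to finding the two neighboring critical values that bracket the test statistic. The sketch below mimics that lookup for 3 degrees of freedom. The row of critical values is taken from a standard chi-square table and is assumed to match Appendix E; the function name is hypothetical.

```python
# Right-tail areas and critical values for 3 d.f. from a standard chi-square
# table (assumed to agree with Appendix E), ordered from largest area to smallest.
TABLE_DF3 = [(0.10, 6.251), (0.05, 7.815), (0.025, 9.348),
             (0.01, 11.345), (0.005, 12.838)]

def bound_p_value(stat, table_row):
    """Return (lower, upper) bounds on the p-value implied by one table row."""
    lower, upper = 0.0, 1.0
    for alpha, critical in table_row:
        if stat > critical:
            upper = alpha  # statistic is beyond this point, so p-value < alpha
        else:
            lower = alpha  # statistic falls short, so p-value > alpha
            break
    return lower, upper

low, high = bound_p_value(19.312, TABLE_DF3)
print(f"{low} < p-value < {high}")
```

A statistic that lies beyond the last tabulated value can only be bounded from above by the smallest area in the table.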

  23. Chapter 14 – Chi-Square Test for Goodness-of-Fit Purpose of the Test: • The goodness-of-fit (GOF) test helps you decide whether your sample resembles a particular kind of population. • The chi-square test will be used because it is versatile and easy to understand. • The test statistic is intuitive: it involves differences between the observed frequencies in the data and the expected frequencies (assuming the hypothesized distribution is correct).

  24. Chapter 14 – Chi-Square Test for Goodness-of-Fit Hypotheses for GOF: • The hypotheses are: H0: The population follows a _______ distribution; H1: The population does not follow a _______ distribution. • The blank may contain the name of any theoretical distribution (e.g., uniform, Poisson, normal).

  25. Chapter 14 – Chi-Square Test for Goodness-of-Fit Test Statistic and Degrees of Freedom for GOF: • Assuming n observations, the observations are grouped into c classes and then the chi-square test statistic is found using $\chi^2 = \sum_{j=1}^{c} (f_j - e_j)^2 / e_j$, where $f_j$ = the observed frequency of observations in class j and $e_j$ = the expected frequency in class j if H0 were true.

  26. Chapter 14 – Chi-Square Test for Goodness-of-Fit Test Statistic and Degrees of Freedom for GOF: • If the proposed distribution gives a good fit to the sample, the test statistic will be near zero. • The test statistic follows the chi-square distribution with degrees of freedom $\nu = c - m - 1$, where c is the number of classes used in the test and m is the number of parameters estimated.

  27. Chapter 14 – Chi-Square Test for Goodness-of-Fit Eyeball Tests: • A simple “eyeball” inspection of the histogram or dot plot may suffice to rule out a hypothesized population. Small Expected Frequencies: • Goodness-of-fit tests may lack power in small samples. As a guideline, a chi-square goodness-of-fit test should be avoided if the sample size n < 25.

  28. Chapter 14 – Chi-Square Test for Goodness-of-Fit Multinomial Distribution: • A multinomial distribution is defined by any k probabilities $p_1, p_2, \dots, p_k$ that sum to unity. • For example, consider the following “official” proportions of M&M colors. (table on slide)

  29. Chapter 14 – Chi-Square Test for Goodness-of-Fit Multinomial Distribution: • The hypotheses are H0: $p_1 = .30$, $p_2 = .20$, $p_3 = .10$, $p_4 = .10$, $p_5 = .10$, $p_6 = .20$; H1: At least one of the $p_j$ differs from the hypothesized value. • No parameters are estimated (m = 0) and there are c = 6 classes, so the degrees of freedom are $\nu = c - m - 1 = 6 - 0 - 1 = 5$. • Our test statistic (from the table on the previous slide) is $\chi^2 = 12.2424$. We will compare this to the appropriate critical point of the chi-square distribution with $\nu = 5$ d.f.
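
A multinomial goodness-of-fit statistic can be computed as in the sketch below. The hypothesized proportions are the ones listed in H0 on this slide, but the observed counts are hypothetical (the lecture's M&M counts are not in the transcript), so the result will not match 12.2424. The 5% critical value 11.07 for 5 degrees of freedom is taken from a standard chi-square table.

```python
def gof_statistic(observed, proportions, params_estimated=0):
    """Chi-square GOF statistic and d.f. (c - m - 1) for hypothesized proportions."""
    n = sum(observed)
    expected = [n * p for p in proportions]
    stat = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
    df = len(observed) - params_estimated - 1
    return stat, df

# Hypothesized proportions from H0 on the slide
proportions = [0.30, 0.20, 0.10, 0.10, 0.10, 0.20]
# Hypothetical observed color counts for a sample of 200 (not the lecture's data)
observed = [52, 45, 22, 18, 25, 38]

stat, df = gof_statistic(observed, proportions)
critical_value = 11.07  # right-tail 5% point for 5 d.f. from a standard table
print(f"chi-square = {stat:.4f}, d.f. = {df}, reject H0: {stat > critical_value}")
```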

  30. Clicker: Our test statistic for the M&Ms example was $\chi^2 = 12.2424$. Under H0, this statistic has a chi-square distribution with $\nu = 5$ d.f. Use the chi-square table to bound the p-value for this hypothesis test. (A) 0.005 < p-value < 0.01 (B) 0.01 < p-value < 0.025 (C) 0.025 < p-value < 0.05 (D) 0.05 < p-value < 0.10

  31. Chapter 14 – Chi-Square Test for Goodness-of-Fit Uniform Distribution: • The uniform goodness-of-fit test is a special case of the multinomial in which every value has the same chance of occurrence. • The chi-square test for a uniform distribution compares all c groups simultaneously. • The hypotheses are: H0: $p_1 = p_2 = \dots = p_c = 1/c$; H1: Not all $p_j$ are equal.

  32. Chapter 14 – Chi-Square Test for Goodness-of-Fit Uniform GOF Test: Grouped Data • The test can be performed on data that are already tabulated into groups. • Calculate the expected frequency $e_j$ for each cell. • The degrees of freedom are $\nu = c - 1$ since there are no parameters for the uniform distribution. • Obtain the critical value $\chi^2_{\alpha}$ from Appendix E for the desired level of significance $\alpha$. • The p-value can be obtained from Excel. Reject H0 if p-value < $\alpha$.

  33. Chapter 14 – Chi-Square Test for Goodness-of-Fit Uniform GOF Test: Raw Data • First form c bins of equal width and create a frequency distribution. • Calculate the observed frequency $f_j$ for each bin. • Define $e_j = n/c$, where n is the sample size. • Perform the chi-square calculations. • The degrees of freedom are $\nu = c - 1$ since there are no parameters for the uniform distribution. • Obtain the critical value from Appendix E for a given significance level $\alpha$ and make the decision.
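
The raw-data procedure can be sketched as follows: bin the values into equal-width bins, set every expected count to the sample size divided by the number of bins, and sum the squared deviations. The 40 data values, the range 0 to 100, and the choice of 4 bins are all hypothetical; 7.815 is the 5% critical value for 3 degrees of freedom quoted earlier in the lecture.

```python
def uniform_gof(data, low, high, bins):
    """Bin raw data into equal-width bins and compute the uniform GOF statistic."""
    width = (high - low) / bins
    observed = [0] * bins
    for x in data:
        index = min(int((x - low) / width), bins - 1)  # x == high goes in the last bin
        observed[index] += 1
    expected = len(data) / bins  # e_j = n / c for every bin
    stat = sum((f - expected) ** 2 / expected for f in observed)
    return observed, expected, stat, bins - 1

# Hypothetical raw data on the range 0 to 100 (not from the lecture)
data = [3, 7, 12, 18, 21, 24, 9, 15, 2, 11,
        27, 31, 36, 42, 48, 29, 33, 45,
        52, 55, 61, 67, 70, 74, 58, 63, 69, 51, 73, 66,
        77, 82, 88, 91, 95, 99, 79, 85, 93, 80]

observed, expected, stat, df = uniform_gof(data, low=0, high=100, bins=4)
critical_value = 7.815  # right-tail 5% point for 3 d.f., as quoted in the lecture
print(observed, expected, round(stat, 3), df, stat > critical_value)
```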

  34. Chapter 14 – Chi-Square Test for Goodness-of-Fit Uniform GOF Test: Raw Data • Maximize the test’s power by defining bin width as (shown on slide). As a result, the expected frequencies will be as large as possible. • Calculate the mean and standard deviation of the uniform distribution as $\mu = (a + b)/2$ and $\sigma = \sqrt{[(b - a + 1)^2 - 1]/12}$. • If the data are not skewed and the sample size is large (n > 30), then the sample mean is approximately normally distributed. • So, test the hypothesized uniform mean using $z = (\bar{x} - \mu)/(\sigma/\sqrt{n})$.
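
The mean-based check can be sketched as a z test. The sketch assumes the standard deviation is the square root of [(b − a + 1)² − 1]/12 (the discrete uniform on the integers a through b), that the statistic is the usual one-sample z, and that the test is two-sided; the slide's own formula for the statistic is not in the transcript. The 36 die rolls are hypothetical data.

```python
import math
from statistics import NormalDist, mean

def uniform_mean_z_test(data, a, b):
    """z test of the sample mean against the mean of a discrete uniform on a..b."""
    n = len(data)
    mu = (a + b) / 2
    sigma = math.sqrt(((b - a + 1) ** 2 - 1) / 12)
    z = (mean(data) - mu) / (sigma / math.sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided, an assumption here
    return mu, sigma, z, p_value

# Hypothetical sample of 36 die rolls (faces 1 to 6), not from the lecture
rolls = [1] * 4 + [2] * 5 + [3] * 6 + [4] * 6 + [5] * 7 + [6] * 8

mu, sigma, z, p_value = uniform_mean_z_test(rolls, a=1, b=6)
print(f"mu = {mu}, sigma = {sigma:.4f}, z = {z:.4f}, p-value = {p_value:.4f}")
```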

  35. Chapter 14 – Chi-Square Test for Goodness-of-Fit General GOF Tests: • Goodness-of-Fit tests can be constructed in a similar manner for other distributions (Poisson, Normal, etc.) • We will generally conduct these tests using a software package.
