
CHAPTER 5 5.1 INTRODUCTORY CHI-SQUARE TEST


Presentation Transcript


  1. CHAPTER 5 5.1 INTRODUCTORY CHI-SQUARE TEST Objectives: This chapter is concerned with methods of analyzing categorical data. Three chi-square tests are covered: • Goodness-of-fit test: tests the assumption that a variable follows a certain distribution. • Independence test: tests whether two variables are dependent on one another. • Homogeneity test: tests whether several populations are homogeneous with respect to a variable.

  2. Definition of Chi Square: A measure of the differences between the observed and expected frequencies is supplied by the chi-square statistic, $\chi^2$. Characteristics of the Chi-Square Distribution: • It is not symmetric. • Its shape depends on the degrees of freedom. • As the number of degrees of freedom increases, the distribution becomes more symmetric. • The values of $\chi^2$ are nonnegative.
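The link between degrees of freedom and symmetry can be made concrete with the standard moments of the chi-square distribution. The symbol $\nu$ for the degrees of freedom is a notational assumption here, not taken from the slides.

```latex
\[
  \mathrm{E}\left[\chi^2_\nu\right] = \nu, \qquad
  \mathrm{Var}\left[\chi^2_\nu\right] = 2\nu, \qquad
  \text{skewness} = \sqrt{\frac{8}{\nu}}
\]
\[
  \nu = 2:\ \sqrt{8/2} = 2, \qquad
  \nu = 8:\ \sqrt{8/8} = 1, \qquad
  \nu = 32:\ \sqrt{8/32} = 0.5
\]
```

The skewness is always positive (the right tail is longer) but shrinks toward zero as $\nu$ grows, which is the sense in which the distribution becomes more symmetric.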

  3. Goodness-of-fit test: • The goodness-of-fit test is an inferential procedure used to determine whether a frequency distribution follows a claimed distribution. • In the goodness-of-fit test, chi-square analysis is applied i. to examine whether sample data could have been drawn from a population having a specific probability distribution ii. to compare an observed distribution with an expected distribution

  4. The goodness-of-fit test procedure is appropriate when the following conditions are met: i. The sampling method is simple random sampling ii. The population is at least 10 times as large as the sample iii. The variable under study is categorical iv. The expected frequency for each level of the variable is at least 5

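  5. Test procedure for the goodness-of-fit test: • State the null hypothesis $H_0$ and the alternative hypothesis $H_1$ • Determine: i. the level of significance, $\alpha$ ii. the degrees of freedom, $\nu = k - 1$, where $k$ is the number of categories (assuming no parameters are estimated from the data)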

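  6. • Find the critical value $\chi^2_{\alpha,\nu}$ (significance level $\alpha$, $\nu$ degrees of freedom) from the table of the chi-square distribution • Calculate the test statistic $\chi^2 = \sum_{i=1}^{k} \frac{(O_i - E_i)^2}{E_i}$, where $O_i$ is the observed frequency of category $i$, $E_i = n p_i$ is its expected frequency, $n$ is the sample size, $p_i$ is the hypothesized probability of category $i$ and $k$ is the number of categories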

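  7. • Determine the rejection region: i. critical value approach: reject $H_0$ if the calculated $\chi^2$ exceeds the critical value $\chi^2_{\alpha,\nu}$ ii. p-value approach: reject $H_0$ if the p-value is less than $\alpha$ • Make a decision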
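The goodness-of-fit procedure above can be sketched in a few lines of Python. This is a minimal illustration rather than the slides' own worked solution: the function name `chi_square_gof`, the three claimed proportions, and the observed counts are all invented for the example, and $\alpha = 0.05$ is an assumed significance level. The critical value 5.991 is the standard table entry for 2 degrees of freedom at $\alpha = 0.05$.

```python
def chi_square_gof(observed, proportions):
    """Return (statistic, degrees of freedom, expected counts) for a goodness-of-fit test."""
    if len(observed) != len(proportions):
        raise ValueError("observed and proportions must have the same length")
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("claimed proportions must sum to 1")
    n = sum(observed)
    # Expected frequency of each category under H0: E_i = n * p_i
    expected = [n * p for p in proportions]
    # Test statistic: sum over categories of (O_i - E_i)^2 / E_i
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # No parameters are estimated from the data, so df = k - 1
    df = len(observed) - 1
    return statistic, df, expected


# Hypothetical data (not from the slides): three categories, n = 100
observed = [45, 35, 20]
proportions = [0.5, 0.3, 0.2]

statistic, df, expected = chi_square_gof(observed, proportions)

# Table value of chi-square for alpha = 0.05 (assumed) and 2 degrees of freedom
CRITICAL_005_DF2 = 5.991
reject_h0 = statistic > CRITICAL_005_DF2

print(f"chi-square = {statistic:.4f}, df = {df}, reject H0: {reject_h0}")
```

With these invented counts the statistic is well below the critical value, so $H_0$ is not rejected. Every expected frequency is at least 5, so the condition on expected values is satisfied.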

  8. Example 5.1: The authority claims that the proportions of road accidents occurring in this country according to the categories User Attitude (A), Mechanical Fault (M), Insufficient Sign Board (I) and Fate (F) are 60%, 20%, 15% and 5% respectively. A study by an independent body shows the following data. Can we accept the claim at the given significance level? Solution: 1.

  9. 2. 3. Test statistic: 4. From the chi-square distribution table:

  10. 5. Rejection region: 6. Since the test statistic does not fall in the rejection region, we fail to reject $H_0$ and conclude that there is no evidence to reject the claim.

  11. Exercise 5.1: The number of students playing truant in a school over 200 school days is shown below. If X is a random variable representing the number of students playing truant per day, test the hypothesis that X follows a Poisson distribution with mean 3 per day at the given significance level.
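For a Poisson goodness-of-fit test such as this one, the expected frequencies come from the Poisson probabilities multiplied by the number of days. The sketch below computes them for mean 3 over 200 days. The helper names and the choice of classes (0 to 6, plus an open-ended "7 or more" class) are assumptions made for illustration, since the exercise's data table is not reproduced here.

```python
import math


def poisson_pmf(k, mean):
    """P(X = k) for a Poisson random variable with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def poisson_expected(n_days, mean, last_class):
    """Expected frequencies for classes 0, 1, ..., last_class - 1 and 'last_class or more'."""
    probs = [poisson_pmf(k, mean) for k in range(last_class)]
    # The open-ended final class takes the remaining tail probability
    probs.append(1.0 - sum(probs))
    return [n_days * p for p in probs]


# Assumed classes: 0, 1, ..., 6 and "7 or more"
expected = poisson_expected(n_days=200, mean=3.0, last_class=7)

for k, e in enumerate(expected):
    label = f"{k}" if k < len(expected) - 1 else f"{k} or more"
    print(f"X = {label:>9}: expected frequency {e:.2f}")

# Condition check: every expected frequency should be at least 5
print("all expected >= 5:", all(e >= 5 for e in expected))
```

Pooling the upper tail into one class is what keeps every expected frequency at or above 5. With a separate class for exactly 7, that class and the remaining tail would each fall below 5. Because the mean is specified in the hypothesis rather than estimated from the data, the degrees of freedom are the number of classes minus 1.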

  12. Exercise 5.2: The probabilities of blood phenotypes A, B, AB and O in the population of all Caucasians in the US are 0.41, 0.10, 0.04 and 0.45 respectively. To determine whether or not the actual population proportions fit this set of reported probabilities, a random sample of 200 Americans was selected and their phenotypes were recorded. The observed cell counts were then calculated. Test the goodness of fit of these blood phenotype probabilities at the given significance level.
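As a starting point for this exercise, the expected counts follow directly from the reported probabilities and the sample size of 200. The subscripted symbols $E_A$, $E_B$, $E_{AB}$ and $E_O$ are notation introduced here for illustration.

```latex
\[
  E_A = 200(0.41) = 82, \qquad
  E_B = 200(0.10) = 20, \qquad
  E_{AB} = 200(0.04) = 8, \qquad
  E_O = 200(0.45) = 90
\]
\[
  \text{degrees of freedom} = 4 - 1 = 3
\]
```

All four expected counts are at least 5, so the condition on expected values is met without pooling any categories.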

  13. The Chi-Square Test for Homogeneity • The homogeneity test is used to determine whether several populations are similar (homogeneous) with respect to some characteristic. • This test is applied to a single categorical variable measured in two or more different populations

  14. The test procedure is appropriate when the following conditions are satisfied: i. For each population, the sampling method is simple random sampling ii. Each population is at least 10 times as large as its sample iii. The variable under study is categorical iv. If the sample data are displayed in a contingency table (populations x category levels), the expected frequency for each cell of the table is at least 5.

  15. Two-dimensional contingency table layout: • The contingency table has dimensions (r x c), where r denotes the number of categories of the row variable and c denotes the number of categories of the column variable • $O_{ij}$ is the observed frequency in cell (i, j) • $n_{i\cdot}$ is the total frequency for row category i • $n_{\cdot j}$ is the total frequency for column category j • $n$ is the grand total frequency over all cells (i, j), where $n = \sum_{i=1}^{r} \sum_{j=1}^{c} O_{ij}$

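  16. Test procedure for the chi-square test for homogeneity: • State the null hypothesis and the alternative hypothesis. Eg: $H_0$: the proportions in each category are the same for all populations; $H_1$: the proportions differ for at least one population • Determine: i. the level of significance, $\alpha$ ii. the degrees of freedom, $\nu = (r - 1)(c - 1)$, where $r$ is the number of rows and $c$ is the number of columns • Find the critical value $\chi^2_{\alpha,\nu}$ from the table of the chi-square distribution • Determine the rejection region: i. critical value approach: reject $H_0$ if the calculated $\chi^2$ exceeds $\chi^2_{\alpha,\nu}$ ii. p-value approach: reject $H_0$ if the p-value is less than $\alpha$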

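  17. • Calculate the value of $\chi^2$ using the formula below: $\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$, where $O_{ij}$ is the observed frequency in cell (i, j) and $E_{ij} = \frac{(\text{total of row } i)(\text{total of column } j)}{\text{grand total}}$ is its expected frequency • Make a decision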
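The calculation for a contingency table can be sketched as follows. This is an illustrative sketch under stated assumptions: the function names are invented, the 2 x 3 table of counts is hypothetical (two populations as rows, three category levels as columns), and $\alpha = 0.05$ is assumed. The critical value 5.991 is the standard table entry for 2 degrees of freedom at that level.

```python
def expected_frequencies(table):
    """Expected count for each cell: (row total * column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_table(table):
    """Return (statistic, degrees of freedom, expected table) for an r x c table."""
    expected = expected_frequencies(table)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df, expected


# Hypothetical counts: rows are two populations, columns are three category levels
table = [
    [30, 50, 20],
    [20, 50, 30],
]

statistic, df, expected = chi_square_table(table)

# Table value of chi-square for alpha = 0.05 (assumed) and 2 degrees of freedom
CRITICAL_005_DF2 = 5.991
reject_h0 = statistic > CRITICAL_005_DF2

print(f"expected = {expected}")
print(f"chi-square = {statistic:.4f}, df = {df}, reject H0: {reject_h0}")
```

The same two functions apply unchanged to the test for independence, since only the hypotheses differ between the two tests.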

  18. Example 5.2: Four machines manufacture cylindrical steel pins. The pins are subject to a diameter specification. A pin may meet the specification, or it may be too thin or too thick. Pins are sampled from each machine and the number of pins in each category is counted. The table below presents the results. Test, at the given significance level, whether the distribution of pins across the categories is the same for all machines.

  19. Solution: Construct a contingency table: Calculation of the expected frequency:

  20. Testing procedure: 1. 2. 3. From table of chi-square:

  21. 4. Using the observed and expected frequencies in the contingency table, we calculate $\chi^2$ using the formula given:

  22. Exercise 5.3: 200 female owners and 200 male owners of Proton cars are selected at random and the colors of their cars are noted. The following data show the results. Use a 1% significance level to test whether the proportions of color preference are the same for females and males.

  23. Chi-Square Test for Independence • This test is applied to a single population that is classified by two categorical variables • It determines whether there is a significant association between the two variables. • Eg: In an election survey, voters might be classified by gender (female or male) and voting preference (Democrat, Republican or Independent). This test is used to determine whether gender is related to voting preference.

  24. The test is appropriate if the following conditions are met: i. The sampling method is simple random sampling ii. The population is at least 10 times as large as the sample iii. The variables under study are categorical iv. If the sample data are displayed in a contingency table, the expected frequency for each cell of the table is at least 5.

  25. Note: The procedure for the chi-square test for independence is the same as that for the chi-square test for homogeneity. The only difference between the two tests is in the statement of the null and alternative hypotheses; the rest of the procedure is the same for both tests. The test for independence is used to test the following hypotheses: $H_0$: the two variables are independent; $H_1$: the two variables are dependent.

  26. Example 5.3: Insomnia is a disorder in which a person finds it hard to sleep at night. A study is conducted to determine whether the two attributes, smoking habit and insomnia, are dependent. The following data set was obtained. Use a 5% significance level to conduct the study.

  27. Solution: 1. 2. 3. From the table of chi-square:

  28. 4. Using the observed and expected frequencies in the contingency table, we calculate $\chi^2$ using the formula given:

  29. 5. Since
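A test for independence on a 2 x 2 table can also be decided with the p-value approach instead of a table lookup. The sketch below is a hedged illustration: the counts are hypothetical (they are not the data of Example 5.3), the row and column labels are assumed, and `chi2_sf` is a helper written here from the known finite-sum expressions for the chi-square upper-tail probability with integer degrees of freedom. No continuity correction is applied, in keeping with the formula used in this chapter.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X > x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum_{j=0}^{df/2 - 1} (x/2)^j / j!
        term, total = 1.0, 1.0
        for j in range(1, df // 2):
            term *= (x / 2) / j
            total += term
        return math.exp(-x / 2) * total
    # Odd df: two-sided normal tail plus a finite correction sum
    root = math.sqrt(x)
    tail = math.erfc(root / math.sqrt(2))
    pdf = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return tail + 2 * pdf * total


def chi_square_independence(table):
    """Return (statistic, df, p-value) for an r x c table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df, chi2_sf(statistic, df)


# Hypothetical counts: rows = smoker / non-smoker, columns = insomnia / no insomnia
table = [
    [20, 30],
    [10, 40],
]

ALPHA = 0.05
statistic, df, p_value = chi_square_independence(table)
reject_h0 = p_value < ALPHA

print(f"chi-square = {statistic:.4f}, df = {df}, p-value = {p_value:.4f}")
print("reject H0 (variables are dependent):", reject_h0)
```

For these invented counts the p-value falls below 0.05, so $H_0$ would be rejected. The critical value approach gives the same decision because the statistic exceeds 3.841, the table value for 1 degree of freedom at $\alpha = 0.05$.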

  30. Exercise 5.4: A study is conducted to determine whether students' academic performance is independent of their participation in co-curricular activities. The following data set was obtained. Use a 5% significance level to conduct the study.

  31. Exercise 5.5: A total of n = 309 furniture defects were recorded and the defects were classified into four types: A, B, C and D. At the same time, each piece of furniture was identified by the production shift in which it was manufactured. These counts are presented in the table below. Test at the 5% significance level whether the type of defect and the production shift are independent.
