Chi Square Test Dr. Asif Rehman
Outline • Types of Variables • Quantitative Data Assessment (parametric) • Descriptive assessment • T-test • Qualitative Data Assessment (non-parametric) • Descriptive assessment • Chi Square test (Fisher Exact test)
Types of Data • Quantitative or numerical data • Qualitative or categorical data • Nominal data (unordered; do not represent any amount) • Sex (male, female) • Marital status (married, unmarried) • Blood group (O, A, AB, B) • Color of eyes (blue, green, brown, black) • Nationality of a person (Pakistani, American, Turkish) • Ordinal data (ordered) • Height (tall, medium, short) • Degree of pain (mild, moderate, severe) • Size of garment (large, medium, small)
The chi square test is used when: • Both variables are measured on a nominal scale • It can also be applied to interval or ratio data that have been categorized into a small number of groups • The observations are randomly sampled from the population • All observations are independent (an individual can appear only once in a table, and the categories do not overlap)
Categorical data assessment • The chi square test (χ²) compares observed and expected frequencies. • It is applied to compare two or more proportions, to test whether or not there is a significant association between two variables. • It is a non-parametric test, although it is traditionally grouped with the parametric methods.
Chi Square test • The chi-square test always tests what scientists call the null hypothesis, which states that there is no significant difference between the expected and observed results. • It is used for estimating how closely an observed distribution matches an expected distribution. • It is used for estimating whether two random variables are independent.
Conducting Chi-Square Analysis • Make a hypothesis based on your basic research question • Determine the expected frequencies • Create a table with observed frequencies, expected frequencies, and chi-square values using the formula $\chi^2 = \sum \frac{(O - E)^2}{E}$ • Find the degrees of freedom: (C - 1)(R - 1) • Look up the critical chi-square value for those degrees of freedom in the Chi-Square Distribution table • If the critical (table) value > your calculated chi-square value, you do not reject your null hypothesis; if your calculated value is larger than the critical value, you reject it.
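The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `chi_square_statistic` and the counts in `table` are hypothetical, chosen so the arithmetic is easy to follow by hand, and no continuity correction is applied.

```python
def chi_square_statistic(observed):
    """Return (statistic, expected, df) for a table of observed counts.

    `observed` is a list of rows, each row a list of cell counts.
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # Expected count under independence: row total x column total / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Sum (O - E)^2 / E over every cell of the table
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )

    # Degrees of freedom: (R - 1)(C - 1)
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, expected, df


# Hypothetical counts, for illustration only
table = [[10, 20], [20, 10]]
stat, expected, df = chi_square_statistic(table)
print(f"chi-square = {stat:.2f}, df = {df}")
```

The calculated statistic is then compared with the critical value from the chi-square table for the same degrees of freedom.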
Chi Square Test steps The 5 Steps in a Chi-Square Test: • Step 1: Write the null and alternative hypotheses. H0: There is no relationship between the variables. Ha: There is a relationship between the variables. • Step 2: Check conditions. A) All expected counts should be > 1. B) At least 80% of expected counts should be > 5.
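The two conditions in Step 2 can be checked mechanically once the expected counts are known. The sketch below assumes the expected counts are already available as a list of rows; the helper name `expected_counts_ok` and the example tables are hypothetical.

```python
def expected_counts_ok(expected):
    """Check the Step 2 conditions on a table of expected counts."""
    cells = [e for row in expected for e in row]

    # Condition A: every expected count is greater than 1
    all_above_one = all(e > 1 for e in cells)

    # Condition B: at least 80% of expected counts are greater than 5
    share_above_five = sum(e > 5 for e in cells) / len(cells)

    return all_above_one and share_above_five >= 0.80


# Hypothetical expected counts, for illustration only
large_sample = [[12.0, 8.0], [18.0, 12.0]]
sparse_sample = [[0.8, 9.2], [7.2, 82.8]]

print(expected_counts_ok(large_sample))   # both conditions hold
print(expected_counts_ok(sparse_sample))  # one cell is below 1
```

When the conditions fail, the outline's alternative for small counts is the Fisher exact test.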
Chi Square Test steps • Step 3: Calculate the test statistic and p-value. The test statistic measures the difference between the observed counts and the counts expected under independence: $\chi^2 = \sum \frac{(O - E)^2}{E}$, summed over all cells of the table.
Chi Square Test steps • Step 3 Cont. Find the p-value. • If the χ² statistic is large, the observed counts are not close to the counts we would expect to see if the two variables were independent. Thus, a "large" χ² gives evidence against the null hypothesis and supports the alternative. • The p-value of the chi-square test is the probability, if H0 is true, that the χ² statistic is as large as or larger than the value we obtained. If H0 is true, the χ² statistic has a chi-square distribution with (r-1)x(c-1) df. • Thus, the p-value for the chi-square test is ALWAYS the area to the right of the test statistic under the curve, i.e. p-value = P(X > χ²), where X has a chi-square distribution with (r-1)x(c-1) df. • To get this probability we need a chi-square distribution table for (r-1)x(c-1) df. Using Minitab, or any other statistical software, you can obtain the p-value from the output. Otherwise, you can report a range for the p-value using the table, since usually you will not be able to find the exact p-value there.
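For the special case of 1 degree of freedom (a 2x2 table), the right-tail area can be computed without a table, because a chi-square variable with 1 df is the square of a standard normal variable. The sketch below relies on that identity and uses only Python's standard library; the function name `chi_square_p_value_df1` is hypothetical, and the formula does not apply to other degrees of freedom.

```python
import math


def chi_square_p_value_df1(statistic):
    """Right-tail area P(X > statistic) for 1 degree of freedom only.

    Uses P(Z^2 > x) = P(|Z| > sqrt(x)) = erfc(sqrt(x / 2)) for standard normal Z.
    """
    return math.erfc(math.sqrt(statistic / 2.0))


# At the usual table value for 1 df, the tail area is close to 0.05
p_at_critical = chi_square_p_value_df1(3.84)
print(f"p-value at 3.84: {p_at_critical:.3f}")
```

For tables larger than 2x2, statistical software or a printed chi-square table is still needed.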
Chi Square Test steps • Step 4: Decide whether or not the result is statistically significant. • The results are statistically significant if the p-value is less than alpha, where alpha is the significance level (usually α = 0.05). • Step 5: Report the conclusion in the context of the situation. • The p-value is ______, which is < α, so this result is statistically significant. Reject H0 and conclude that (the two variables) are related. • The p-value is ______, which is > α, so this result is NOT statistically significant. We cannot reject H0 and cannot conclude that (the two variables) are related.
Example To assess the prophylactic value of Chloroquine, a study was conducted on 3540 persons. Out of 606 persons who were given Chloroquine prophylactically, only 19 contracted malaria. Among those who were not given prophylactic treatment, 193 contracted malaria. Comment on the prophylactic value of Chloroquine.
Descriptive frequencies Total study population (n) = 3540 Chloroquine given = 606 • Developed malaria = 19 • Did not develop malaria = 587 Chloroquine not given = 2934 • Developed malaria = 193 • Did not develop malaria = 2741
Null Hypothesis (H0) • Chloroquine has no role in the prevention of malaria. At the end we have to either reject or fail to reject this hypothesis.
Calculation of expected values • E = Row total x Column total / Grand total • E1 = 606 x 212 / 3540 ≈ 36 • E2 = 606 x 3328 / 3540 ≈ 570 • E3 = 2934 x 212 / 3540 ≈ 176 • E4 = 2934 x 3328 / 3540 ≈ 2758 • The greater the difference between the observed and expected values, the larger the value of χ² and the less likely it is that the difference is due to chance.
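The expected values above can be reproduced with a short script. It uses the row, column, and grand totals from the example; the dictionary layout and labels are just one convenient, assumed way of organizing them.

```python
# Totals taken from the chloroquine example
row_totals = {"chloroquine given": 606, "chloroquine not given": 2934}
col_totals = {"malaria": 212, "no malaria": 3328}
grand_total = 3540

# E = row total x column total / grand total, for every cell
expected = {
    (row, col): r_total * c_total / grand_total
    for row, r_total in row_totals.items()
    for col, c_total in col_totals.items()
}

for (row, col), value in expected.items():
    print(f"{row:>22} / {col:<10} E = {value:.2f}")
```

Rounding each result to the nearest whole number gives the values 36, 570, 176, and 2758 used in the hand calculation.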
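2x2 contingency table Observed counts, with expected counts in parentheses: • Chloroquine given: malaria 19 (36), no malaria 587 (570), row total 606 • Chloroquine not given: malaria 193 (176), no malaria 2741 (2758), row total 2934 • Column totals: malaria 212, no malaria 3328, grand total 3540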
Calculation of degrees of freedom Degrees of freedom = (R - 1) x (C - 1) = (Rows - 1)(Columns - 1) = (2 - 1)(2 - 1) = 1
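Decision rule • If the critical (table) value > your calculated chi-square value, then you do not reject your null hypothesis. • If your calculated value is greater than the critical value, you reject the null hypothesis: there is a significant difference that is unlikely to be due to chance.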
Interpretation of results by consulting the χ² table • The table value of χ² with 1 degree of freedom at the significance level of 0.05 is 3.84. • Our calculated value of χ² is about 10.3 using the rounded expected values (about 10.57 without rounding), which is more than the table value of 3.84. • So we reject the null hypothesis and say that chloroquine does have a prophylactic role in malaria, with P < 0.05. • If the null hypothesis were true, the probability of seeing a difference this large between the two groups by chance alone would be less than 0.05 (5%).
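The whole worked example can be checked end to end with the sketch below. It takes the observed counts from the example, keeps the expected counts unrounded, and applies no continuity correction (an assumption; some statistical packages apply one by default for 2x2 tables). Computed this way, the statistic comes out near 10.57, a little above the hand-calculated figure, and the conclusion is unchanged. The p-value line uses an identity that holds for 1 degree of freedom only.

```python
import math

# Observed counts from the example:
# rows are chloroquine given / not given, columns are malaria / no malaria
observed = [[19, 587], [193, 2741]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Accumulate (O - E)^2 / E over the four cells, with unrounded expected counts
chi_square = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand_total
        chi_square += (o - e) ** 2 / e

critical_value = 3.84  # table value for 1 df at alpha = 0.05
p_value = math.erfc(math.sqrt(chi_square / 2.0))  # right-tail area, 1 df only

print(f"chi-square = {chi_square:.2f}")
print(f"exceeds critical value: {chi_square > critical_value}")
print(f"p-value = {p_value:.4f}")
```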