## Chi-Square Test

Chapter 7

**Content**

- χ² test of fourfold data
- χ² test of paired fourfold data
- Fisher exact probabilities in fourfold data
- χ² test of R×C table
- Multiple comparison of sample rates
- χ² test of goodness of fit

**Objectives:**

- to infer whether the rates or constituent ratios of two populations, or of more than two populations, differ
- multiple comparison of the rates of several samples
- to infer whether two categorical variables are associated
- test of goodness of fit

Test statistic: χ²

Suitable for: qualitative data

**Objective:** to judge whether the rates or constituent ratios of two populations differ (equivalent to the u-test).

Requirement: the individuals of the two samples, each classified into two categories, should be arranged as fourfold table data.

**1. The χ² distribution**

(1) The χ² distribution is a continuous distribution.

(2) One of its basic properties is additivity: the sum of independent χ² variables again follows a χ² distribution, with degrees of freedom equal to the sum of the individual degrees of freedom.

**2. The basic idea of the χ² test**

Example 7-1. A hospital wants to compare the curative effect of drug A (experimental group) and drug B (control group) in lowering intracranial pressure. 200 patients with raised intracranial pressure were divided into two groups at random; the results are shown in Table 7-1. Do the effective rates of the two drugs differ?

**Table 7-1** Comparison of the effective rates of the two groups in lowering intracranial pressure

These data can be arranged in the form of chart 7-2: two treatment groups, with the count in each group made up of two parts, events that occurred and events that did not occur. The table contains four basic counts (a, b, c, d), and all other figures can be derived from them; this is why it is called fourfold table data.
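As a small illustration of how the remaining entries of a fourfold table follow from its four basic counts, the sketch below derives the marginal totals and the two sample rates. The function name `fourfold_margins` and the counts in the usage lines are hypothetical; they are not the data of Table 7-1.

```python
def fourfold_margins(a, b, c, d):
    """Derive totals and rates of a fourfold table from its four cell counts.

    Layout assumed here: rows are the two groups, columns are
    "occurred" (a, c) and "not occurred" (b, d).
    """
    return {
        "row_totals": (a + b, c + d),             # size of each group
        "col_totals": (a + c, b + d),             # occurred / not occurred overall
        "n": a + b + c + d,                       # overall total
        "rates": (a / (a + b), c / (c + d)),      # occurrence rate in each group
        "pooled_rate": (a + c) / (a + b + c + d), # rate after merging both groups
    }


# Hypothetical counts, for illustration only.
summary = fourfold_margins(30, 10, 20, 20)
print(summary)
```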
**Basic idea:** it can be understood through the basic formula of the χ² test

$$\chi^2 = \sum \frac{(A - T)^2}{T} \tag{7-1}$$

where A is the actual (observed) frequency and T is the theoretical (expected) frequency.

**The expected frequencies** are calculated by the following formula:

$$T_{RC} = \frac{n_R \, n_C}{n}$$

where $T_{RC}$ is the expected frequency in row R and column C, $n_R$ is the total of the corresponding row, $n_C$ is the total of the corresponding column, and $n$ is the overall total.

The expected frequency is determined by the hypothesis $H_0: \pi_1 = \pi_2$, using the pooled rate obtained after merging the two samples.

**The test statistic:** the value of χ² reflects how closely the actual frequencies agree with the expected frequencies.

From formula 7-1 we can see that the value of χ² also depends on the number of terms $(A - T)^2 / T$ (strictly speaking, on the degrees of freedom ν). ν is determined by the number of cells that can take values freely, not by the sample size n: for a table with R rows and C columns, $\nu = (R - 1)(C - 1)$, so ν = 1 for a fourfold table.

**3. The process of the hypothesis test**

(1) Establish the hypotheses and set the test level.

H0: π1 = π2, the effective rates of the experimental and control populations in lowering intracranial pressure are equal

H1: π1 ≠ π2, the effective rates are not equal

α = 0.05.

The χ² distribution is continuous, whereas fourfold table data are discrete, so the χ² value calculated from them is also discrete. To improve the approximation to the continuous distribution, a continuity correction is needed.

**Conditions for choosing the test formula for fourfold table data:**

- n ≥ 40 and T ≥ 5: the special (uncorrected) formula;
- n ≥ 40 and 1 ≤ T < 5: the corrected formula;
- n < 40 or T < 1: Fisher exact probabilities method.

The continuity correction of the χ² test is suitable only for fourfold table data, where ν = 1; when ν is greater than 1, no correction should be applied.

Example 7-2. A doctor wants to compare the effect of drug A and drug B in treating cerebrovascular diseases. 78 patients with such diseases were divided into two groups at random; the results are shown in Table 7-2. Is the curative effect of the two drugs the same?
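The expected-frequency formula and the rule for choosing between the uncorrected formula, the corrected formula and Fisher's method can be sketched as follows. This is a minimal illustration, not the textbook's worked solution: the function name `fourfold_chi_square` is hypothetical, the thresholds (n ≥ 40, T ≥ 5, T ≥ 1) are the commonly taught ones, the correction subtracts 0.5 from each |A − T| (Yates' form), and the counts in the usage lines are invented rather than taken from Tables 7-1 or 7-2. The P value uses the fact that for ν = 1 the upper tail of χ² equals `erfc(sqrt(chi2 / 2))`.

```python
import math


def fourfold_chi_square(a, b, c, d):
    """Chi-square test for a fourfold table [[a, b], [c, d]] (nu = 1)."""
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    # Expected frequency of each cell: T_RC = n_R * n_C / n
    expected = [[r * k / n for k in col_totals] for r in row_totals]
    t_min = min(min(row) for row in expected)

    # Selection rule (assumed thresholds): small n or very small T -> Fisher.
    if n < 40 or t_min < 1:
        return {"method": "fisher", "t_min": t_min}

    observed = [[a, b], [c, d]]
    correction = 0.5 if t_min < 5 else 0.0  # continuity correction when 1 <= T < 5
    chi2 = sum(
        (abs(observed[i][j] - expected[i][j]) - correction) ** 2 / expected[i][j]
        for i in range(2)
        for j in range(2)
    )

    # Upper-tail probability of chi-square with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    method = "corrected" if correction else "uncorrected"
    return {"method": method, "chi2": chi2, "p": p, "t_min": t_min}


# Hypothetical counts, for illustration only.
print(fourfold_chi_square(30, 10, 20, 20))  # all T >= 5: uncorrected formula
print(fourfold_chi_square(20, 2, 15, 5))    # smallest T between 1 and 5: corrected
print(fourfold_chi_square(3, 5, 6, 4))      # n < 40: refer to Fisher's method
```

Summing over the four cells keeps the code close to formula 7-1; for a fourfold table it gives the same value as the shortcut formula written in terms of ad − bc.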
**Table 7-2** Comparison of the effective rates of two drugs in treating cerebrovascular diseases

In this case the conditions for the corrected formula are met (n = 78 ≥ 40, with an expected frequency between 1 and 5), so the corrected formula is used. From the critical value table of χ², P > 0.05. At the test level α = 0.05, H0 cannot be rejected, so we cannot conclude that the effective rates of the two drugs in treating cerebrovascular diseases differ.

**If not corrected,** P < 0.05 would be obtained, and the conclusion would be the opposite.

**Section 2 χ² test of paired fourfold table**

As with measurement data, inference about the difference between two population rates (proportions) in enumeration data may come from a group design or from a paired design. These give fourfold table data and paired fourfold table data respectively.

Example 7-3. A laboratory measured serum antinuclear antibodies in 58 patients with suspected systemic lupus erythematosus by both latex agglutination and immunofluorescence; see Table 7-3. Do the results of the two methods differ?

**In a paired design** there are four possible results of the two treatments for each pair:

① both methods positive (a);
② both methods negative (d);
③ immunofluorescence positive, latex agglutination negative (b);
④ latex agglutination positive, immunofluorescence negative (c).

a and d count the pairs on which the two methods agree; b and c count the pairs on which they disagree.

Statistic:

$$\chi^2 = \frac{(b - c)^2}{b + c}, \quad \nu = 1 \qquad (b + c \ge 40)$$

$$\chi_c^2 = \frac{(|b - c| - 1)^2}{b + c}, \quad \nu = 1 \qquad (b + c < 40)$$

**Cautions:** the method is generally used for samples that are not too large. Reasons:

1. it considers only the discordant results (b, c);
2. it does not consider the sample size n or the concordant results (a, d).

When n, a and d are large and b and c are relatively small, a statistically significant result may have no practical significance.
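The paired test uses only the discordant counts b and c, which a short sketch makes concrete. The function name `paired_chi_square` and the counts in the usage lines are hypothetical (they are not the data of Table 7-3), and the switch to the corrected statistic when b + c < 40 is the commonly taught rule, assumed here.

```python
import math


def paired_chi_square(b, c):
    """Chi-square test for paired fourfold data, based on discordant pairs only.

    b and c are the numbers of pairs on which the two methods disagree;
    the concordant counts a and d do not enter the statistic.
    """
    if b + c == 0:
        raise ValueError("no discordant pairs: the statistic is undefined")

    if b + c >= 40:
        chi2 = (b - c) ** 2 / (b + c)           # uncorrected form
    else:
        chi2 = (abs(b - c) - 1) ** 2 / (b + c)  # continuity-corrected form

    # Upper-tail probability of chi-square with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p


# Hypothetical discordant counts, for illustration only.
chi2, p = paired_chi_square(15, 5)
print(f"chi2 = {chi2:.2f}, P = {p:.4f}")
```

Because a and d never appear in the function, two tables with very different agreement rates give the same statistic, which is the point of the caution above.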
**Section 3 Fisher exact probabilities method in 2×2 table**

**Conditions:** n < 40 or T < 1.

Theoretical basis: the hypergeometric distribution; it is not a χ² test.

Example 7-4. A doctor studies the preventive effect of hepatitis B immunoglobulin against intrauterine infection of the fetus. 33 HBsAg-positive patients were randomized into two groups, a prevention group and a non-prevention group; see Table 7-4. Do the fetal infection rates of the two groups differ?

**Table 7-4** Comparison of the fetal HBV infection rates of the two groups

**Basic idea:** with the marginal totals of the fourfold table fixed, calculate the probabilities of all possible combinations of the four actual frequencies, then draw the inference from the cumulative probability and the test level α.

**1. Calculate P_i**

Number of combinations: smallest marginal total + 1. In Example 7-4 the number of combinations is 9 + 1 = 10.

The P_i sum to 1. Calculation formula:

$$P_i = \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{a!\,b!\,c!\,d!\,n!}$$

**2. Calculate the cumulative probabilities**

Let the cross-product difference of the observed fourfold table be $a^* d^* - b^* c^* = D^*$, with probability $P^*$. $D_i$ denotes the cross-product difference of each other possible fourfold table, with probability $P_i$.

**(1) One-sided test**

- If $D^* > 0$ in the observed fourfold table, cumulate the probabilities of all tables with $D_i \ge D^*$ and $P_i \le P^*$.
- If $D^* < 0$, cumulate the probabilities of all tables with $D_i \le D^*$ and $P_i \le P^*$.

**(2) Two-sided test**

Cumulate the probabilities of all possible fourfold tables with $|D_i| \ge |D^*|$ and $P_i \le P^*$. If $a + b = c + d$ or $a + c = b + d$, the sequence of combinations is symmetric, and the two-sided cumulative probability is simply the one-sided cumulative probability × 2.

**Checking procedure (in this example n = 33 < 40)**

1. Calculate $D^*$ and $P^*$ of the observed fourfold table, and $D_i$ of every possible fourfold table; see Table 7-5.
2. Calculate the $P_i$ of every fourfold table satisfying $|D_i| \ge |D^*|$.
3. Calculate the cumulative probability of the fourfold tables satisfying $|D_i| \ge |D^*|$ and $P_i \le P^*$. In this example, the $P_i$ of the tables that meet both conditions are summed to give the cumulative probability P. Since P exceeds the test level, H0 is not rejected: we cannot conclude that the HBV infection rate of infants in the prevention group differs from that of infants in the non-prevention group.

**Table 7-5** Fisher exact probability calculation table for Example 7-4

Example 7-5. A study examined P53 expression in gallbladder adenocarcinoma and gallbladder adenoma. Ten specimens of each disease, surgically excised during the same period, were tested for P53 expression by immunohistochemistry; the data are shown in Table 7-6. Is there a significant difference in the P53 positive rate between gallbladder adenocarcinoma and gallbladder adenoma?

**Table 7-6** P53 positive expression rates in gallbladder adenocarcinoma and gallbladder adenoma
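The checking procedure for Fisher's method (enumerate every fourfold table with the observed marginal totals, compute each D_i and P_i, then cumulate) can be sketched as below. This is an illustrative sketch under stated assumptions: the function name `fisher_fourfold` is hypothetical, the counts in the usage lines are invented (they are not the data of Tables 7-4 to 7-6), and the case D* = 0, which the text does not cover, is grouped with D* > 0. Each P_i is computed from binomial coefficients, which is algebraically the same as the factorial formula; `Fraction` keeps the comparison P_i ≤ P* exact.

```python
from fractions import Fraction
from math import comb


def fisher_fourfold(a, b, c, d):
    """Fisher exact probabilities for the fourfold table [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2, col1 = a + b, c + d, a + c

    def prob(x):
        # Hypergeometric probability of the table whose top-left cell is x,
        # with all marginal totals held fixed.
        return Fraction(comb(row1, x) * comb(row2, col1 - x), comb(n, col1))

    d_star = a * d - b * c  # cross-product difference of the observed table
    p_star = prob(a)        # probability of the observed table

    # Every admissible table: (D_i, P_i). Their number is min(margins) + 1.
    tables = []
    for x in range(max(0, col1 - row2), min(row1, col1) + 1):
        bi, ci = row1 - x, col1 - x
        di = row2 - ci
        tables.append((x * di - bi * ci, prob(x)))

    two_sided = sum(p for dd, p in tables
                    if abs(dd) >= abs(d_star) and p <= p_star)
    if d_star >= 0:  # assumption: D* = 0 treated like D* > 0
        one_sided = sum(p for dd, p in tables if dd >= d_star and p <= p_star)
    else:
        one_sided = sum(p for dd, p in tables if dd <= d_star and p <= p_star)

    return {
        "n_tables": len(tables),
        "d_star": d_star,
        "p_star": float(p_star),
        "one_sided": float(one_sided),
        "two_sided": float(two_sided),
    }


# Hypothetical counts with equal row totals, for illustration only.
result = fisher_fourfold(1, 9, 5, 5)
print(result)
```

With equal row totals, as in the invented counts here, the sequence of tables is symmetric, so the two-sided cumulative probability comes out as twice the one-sided one.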