Chi-Square Distributions: Explanation and Testing Independence

Understandable Statistics, Seventh Edition, by Brase and Brase. Prepared by: Lynn Smith, Gloucester County College. Chapter Eleven Part 1 (Section 11.1): Chi-Square and F Distributions

The Chi-Square Distribution: the χ² distribution is not symmetrical and depends on the number of degrees of freedom.

χ is the Greek letter chi.

The χ² distribution for d.f. = 3, d.f. = 5, and d.f. = 10 (figure: three density curves drawn over a horizontal axis running from 1 to 10).

The mode or high point occurs over n – 2 for n ≥ 3, where n is the number of degrees of freedom.

As the degrees of freedom increase, the graph looks more bell-like and symmetric.
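The claim that the high point sits over n – 2 can be checked numerically. The sketch below evaluates the standard χ² density on a grid and reports where it peaks for d.f. = 3, 5, and 10. The function names `chi2_pdf` and `grid_mode`, the grid step, and the upper limit are illustrative choices, not part of the original slides.

```python
import math

def chi2_pdf(x, df):
    """Density of the chi-square distribution with df degrees of freedom, for x > 0."""
    half = df / 2.0
    return x ** (half - 1) * math.exp(-x / 2.0) / (2.0 ** half * math.gamma(half))

def grid_mode(df, step=0.01, upper=30.0):
    """Return the grid point (spacing `step`) at which the density is highest."""
    xs = [i * step for i in range(1, int(upper / step) + 1)]
    return max(xs, key=lambda x: chi2_pdf(x, df))

for df in (3, 5, 10):
    # The located peak should be close to df - 2.
    print(f"d.f. = {df}: peak near {grid_mode(df):.2f}, n - 2 = {df - 2}")
```

Because the search is on a grid, the located peak is only accurate to within the grid spacing.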

Use Table 7 in Appendix II to find critical values of χ² distributions.

Area in the Right Tail of the Distribution = α, the area to the right of the critical value χ²

Use Table 7 (with d.f. = 8) to find the area to the right of χ² = 2.73. α = 0.950
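The table lookup for d.f. = 8 and χ² = 2.73 can be cross-checked without a table. For an even number of degrees of freedom, the right-tail area of the χ² distribution has a closed form: exp(–x/2) times the sum of (x/2)^i / i! for i from 0 to d.f./2 – 1. The sketch below uses that identity; the function name `chi2_right_tail_even_df` is hypothetical, and the function deliberately refuses odd d.f., where this formula does not apply.

```python
import math

def chi2_right_tail_even_df(x, df):
    """Area to the right of x under the chi-square curve, for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form only holds for positive even d.f.")
    half_x = x / 2.0
    partial_sum = sum(half_x ** i / math.factorial(i) for i in range(df // 2))
    return math.exp(-half_x) * partial_sum

area = chi2_right_tail_even_df(2.73, 8)
print(f"Area to the right of 2.73 with d.f. = 8: {area:.3f}")
```

The result agrees with the right-tail area of about 0.950 that the table gives for this entry.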

Chi Square: Tests of Independence To test the independence of two factors, use a contingency table.

Contingency Table

Shaded boxes (called “cells”) will contain frequencies.

Horizontal lines of cells are called rows.

Vertical lines of cells are called columns.

The size of a table is given as rows × columns.

This is a 3 × 3 contingency table.

When giving the size of a contingency table, always give the number of rows first.

Suppose we wish to determine (at the 5% level of significance) whether the time it takes to complete a given task is independent of gender.

Number and gender of individuals who completed a task in the times indicated.

To test the null hypothesis that gender and the time it takes to complete the task are independent: H0: Variables are independent. H1: Variables are not independent. Use the null hypothesis to determine the expected frequency of each cell.

Expected Frequency

Finding the Expected Frequencies: E = (Row total)(Column total) / (Sample size)
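As a minimal sketch of the expected-frequency formula, the code below applies E = (row total)(column total) / (sample size) to every cell of a 2 × 3 table. The counts are hypothetical, chosen only for illustration, because the observed table from the slides did not survive in this transcript. The function name `expected_frequencies` is also an assumption.

```python
def expected_frequencies(observed):
    """Expected count for each cell: (row total)(column total) / (sample size)."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    sample_size = sum(row_totals)
    return [[r * c / sample_size for c in column_totals] for r in row_totals]

# Hypothetical 2 x 3 table of observed counts (NOT the data from the slides).
observed = [
    [10, 20, 20],
    [20, 20, 10],
]
expected = expected_frequencies(observed)
for row in expected:
    print(row)
```

With these hypothetical counts, both row totals are 50, the column totals are 30, 40, and 30, and the sample size is 100, so each expected frequency is 15 or 20.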

The actual frequency which occurred is called the observed frequency, O.

The Sample Statistic χ²: Chi-square is a measure of the sum of the differences between observed frequency O and expected frequency E in each cell.

Difference Between Observed and Expected Frequencies

The Sum of the (O – E) Column Will Equal Zero.

To calculate chi-square, we use the values (O – E)²/E:
• To reflect the magnitude of the differences between the observed and expected frequencies.
• To reflect the fact that a small difference between the observed and expected frequencies is more important when the expected frequency is small.

Computing χ²: χ² = Σ (O – E)²/E, where the sum is taken over all cells in the contingency table.
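The computation can be sketched cell by cell. The observed and expected values below are hypothetical (they are not the slide data, which is missing from this transcript), and the variable names are illustrative. The sketch also shows why the squared, scaled differences are used: the raw (O – E) differences cancel to zero.

```python
# Hypothetical observed (O) and expected (E) frequencies, one entry per cell.
observed = [10, 20, 20, 20, 20, 10]
expected = [15.0, 20.0, 15.0, 15.0, 20.0, 15.0]

# The raw differences cancel out, so their sum carries no information.
differences = [o - e for o, e in zip(observed, expected)]
sum_of_differences = sum(differences)

# Chi-square adds up (O - E)^2 / E over all cells.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(f"Sum of (O - E): {sum_of_differences}")
print(f"Chi-square: {chi_square:.2f}")
```

Dividing each squared difference by E is what makes a given gap count for more in a cell where few observations were expected.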

Degrees of Freedom d.f. = (R – 1)(C – 1) R = number of cell rows C = number of cell columns

For our example: R = 2, C = 3 d.f. = (2 – 1)(3 – 1) = 2

Using d.f. = 2 and α = 0.05, find the critical value of χ² from Table 7.

If the sample statistic is larger than the critical value, reject the null hypothesis of independence. In our example, the sample statistic χ² = 10.36. The critical value = 5.99.
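For the special case d.f. = 2, the critical value can be reproduced without the table, because the right-tail area beyond x is exactly exp(–x/2); solving for x gives –2 ln(α). The sketch below relies on that special case only and would not be valid for other degrees of freedom. The function name `chi2_critical_value_df2` is hypothetical.

```python
import math

def chi2_critical_value_df2(alpha):
    """Critical value of chi-square for d.f. = 2 only: solves exp(-x/2) = alpha."""
    return -2.0 * math.log(alpha)

critical_value = chi2_critical_value_df2(0.05)
sample_statistic = 10.36  # value quoted in the example

reject_h0 = sample_statistic > critical_value
print(f"Critical value: {critical_value:.2f}")
print(f"Reject H0: {reject_h0}")
```

The computed critical value rounds to the 5.99 read from Table 7, and the sample statistic exceeds it.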

Conclusion Reject the null hypothesis of independence. We conclude that the time it takes to complete the task is not independent of gender.

P Value Approach
• In our example, the sample statistic χ² = 10.36.
• For d.f. = 2, the sample statistic χ² = 10.36 falls between 9.21 and 10.60 (the critical values for α = 0.010 and 0.005, respectively).
• We conclude that 0.005 < P < 0.010.
• We would reject H0 for any α ≥ P.
• We therefore reject H0 for α = 0.05.
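The bracketing of P between two table entries can be compared with a direct calculation. With d.f. = 2, the area to the right of a value x is exp(–x/2), so the P value for the example statistic follows in one line. This shortcut is an assumption-laden sketch: it holds for d.f. = 2 only, and the variable names are illustrative.

```python
import math

sample_statistic = 10.36  # value quoted in the example
alpha = 0.05

# For d.f. = 2 only, the right-tail area beyond x is exp(-x / 2).
p_value = math.exp(-sample_statistic / 2.0)

print(f"P value: {p_value:.4f}")
print(f"Reject H0 at alpha = {alpha}: {alpha >= p_value}")
```

The computed value falls inside the interval 0.005 < P < 0.010 obtained from the table.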

In order to safely use the critical values of χ² from Table 7, we must ensure that all expected frequencies are greater than or equal to five. If this condition is not met, the sample size should be increased.
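This condition is easy to check mechanically before looking up a critical value. The sketch below tests a table of expected frequencies against the minimum of five; both tables are hypothetical and the function name `all_expected_at_least_five` is an assumption.

```python
def all_expected_at_least_five(expected):
    """True when every expected frequency in the table is 5 or more."""
    return all(e >= 5 for row in expected for e in row)

# Hypothetical expected-frequency tables, for illustration only.
large_sample = [[15.0, 20.0, 15.0], [15.0, 20.0, 15.0]]
small_sample = [[3.0, 4.0, 3.0], [3.0, 4.0, 3.0]]

print(all_expected_at_least_five(large_sample))
print(all_expected_at_least_five(small_sample))
```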

Using Chi-Square Distribution to Test the Independence of Two Variables
• Set up the hypotheses H0: The variables are independent. H1: The variables are not independent.
• Compute the expected frequency for each cell in the contingency table.
• Compute the statistic χ² for the sample.
• Find the critical value χ² in Table 7. Use the level of significance α and degrees of freedom: d.f. = (R – 1)(C – 1), where R and C are the numbers of rows and columns of cells.
• The critical region = all values of χ² to the right of the critical value χ².
• Compare the sample statistic χ² with the critical value χ².
• If the sample statistic is larger than the critical value, reject the null hypothesis of independence.
• Otherwise, do not reject the null hypothesis.
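The steps of the procedure can be strung together in one short function. This is a sketch under stated assumptions: the critical value is passed in by the caller (for example, read from Table 7 for the d.f. the function reports), the counts are hypothetical rather than the slide data, and the name `independence_test` is illustrative.

```python
def independence_test(observed, critical_value):
    """Return (chi_square, degrees_of_freedom, reject_h0) for a table of counts."""
    row_totals = [sum(row) for row in observed]
    column_totals = [sum(column) for column in zip(*observed)]
    sample_size = sum(row_totals)

    chi_square = 0.0
    for row, row_total in zip(observed, row_totals):
        for o, column_total in zip(row, column_totals):
            e = row_total * column_total / sample_size  # expected frequency
            chi_square += (o - e) ** 2 / e

    # d.f. = (R - 1)(C - 1)
    degrees_of_freedom = (len(observed) - 1) * (len(column_totals) - 1)
    return chi_square, degrees_of_freedom, chi_square > critical_value

# Hypothetical 2 x 3 counts; 5.99 is the Table 7 value for d.f. = 2, alpha = 0.05.
hypothetical_counts = [[10, 20, 20], [20, 20, 10]]
chi_square, df, reject_h0 = independence_test(hypothetical_counts, critical_value=5.99)
print(f"chi-square = {chi_square:.2f}, d.f. = {df}, reject H0: {reject_h0}")
```

The caller should confirm that the reported d.f. matches the row of the table used for the critical value, and that all expected frequencies are at least five.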
