“Chi-Square Statistics” By Namrata Khemka
Table of Contents • What is Chi-Square? • When and why is Chi-Square used? • Limitations/Restrictions of Chi-Square • Examples • References
What is “Chi Square” • Invented by Pearson • Test for “Goodness of fit” • Tests for independence of variables • Non parametric test
Parametric vs. Non Parametric Data • Parametric data: numerical scores; the scores are manipulated. Example: average height of people in 10 cities • Non parametric data: nominal data; the scores are not manipulated. Example: how many people are over 6 ft and how many are below in 2 cities
What is “Chi Square” • Invented by Pearson • Test for “Goodness of fit” • Tests for independence of variables • Non parametric test • Analyze categorical or measurement data • SPSS or Excel
Goodness of the Fit • Null Hypothesis • Observed frequencies • Expected frequencies • Good Fit • Poor Fit • Sum of observed frequencies = sum of expected frequencies
Computational Steps • Scenario
Scenario: • A movie theater owner would like to know the factors involved in movie selection by people. • A sample of 50 people were asked which of the following was important to them. • They may choose one of the following: • Actors • Directors • Time the movie is playing • Genre
Question • Do any of these factors play a greater role than the others?
Computational Steps • Scenario • Threshold Value = 0.05 • Null Hypothesis
Null Hypothesis • There is no difference in the importance of these 4 factors in determining which movie is selected
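Under this null hypothesis, each of the 4 factors is expected to be chosen equally often. Assuming every one of the 50 respondents picks exactly one factor, the expected frequency per factor works out as follows (the symbols $N$ for sample size and $k$ for number of categories are introduced here for illustration):

$$f_e = \frac{N}{k} = \frac{50}{4} = 12.5$$

The four expected frequencies sum to 50, matching the sum of the observed frequencies.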
Computational Steps • Scenario • Threshold Value = 0.05 • Null Hypothesis • Observed Frequencies • Expected Frequencies • p-value
Interpret the Results • Since p < 0.05, we reject the null hypothesis. • Therefore, some of the factors are mentioned more than others in response to movie selection.
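The goodness-of-fit steps above can be sketched in Python. This is a minimal illustration, not the presenter's calculation: the slides do not list the observed counts, so the counts below are HYPOTHETICAL, and the function names are made up for this sketch. The p-value uses the closed-form upper tail of the chi-square distribution for 3 degrees of freedom (4 categories minus 1).

```python
import math


def chi_square_gof(observed, expected):
    """Chi-square statistic: sum of (fo - fe)^2 / fe over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if not math.isclose(sum(observed), sum(expected)):
        raise ValueError("sum of observed must equal sum of expected")
    return sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))


def p_value_df3(chi_sq):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    half = chi_sq / 2
    return math.erfc(math.sqrt(half)) + math.sqrt(2 * chi_sq / math.pi) * math.exp(-half)


# HYPOTHETICAL observed counts (not from the slides); they sum to 50.
observed = {"Actors": 22, "Directors": 5, "Time": 8, "Genre": 15}

n = sum(observed.values())
# Null hypothesis: all 4 factors are equally important, so fe = 50 / 4 = 12.5 each.
expected = [n / len(observed)] * len(observed)

chi_sq = chi_square_gof(list(observed.values()), expected)
p = p_value_df3(chi_sq)
reject_null = p < 0.05  # threshold value from the slides

print(f"chi-square = {chi_sq:.2f}, p = {p:.4f}, reject null: {reject_null}")
```

With a statistics package available, a library routine would normally replace the hand-written p-value function; the closed form is used here only to keep the sketch dependency-free.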
Test of Independence • Examines the extent to which two variables are related • Example
Scenario: • University of Calgary is interested in determining whether or not there is a relationship between educational level and the number of flights taken each year. • 150 travelers in the airport were interviewed and the results are:
Computational Steps • Scenario • Threshold Value = 0.05 • Null Hypothesis
Null Hypothesis • The educational level of the travelers and the number of flights are independent of one another.
Computational Steps • Scenario • Threshold Value = 0.05 • Null Hypothesis • Observed Frequencies • Expected Frequencies • p-value
Interpret the Results • Since p < 0.05, we reject the null hypothesis. • These 2 variables are not independent of one another. • Thus, the educational level of travelers and the number of flights they take are related.
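A test of independence follows the same steps, except that each expected frequency comes from the table margins: row total times column total, divided by the grand total. The sketch below assumes a 3 x 3 table; the slides do not reproduce the interview results, so the counts, the row and column meanings, and the function names are all HYPOTHETICAL. The p-value helper uses the closed form that holds for an even number of degrees of freedom.

```python
import math


def expected_counts(table):
    """Expected frequency per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    expected = expected_counts(table)
    stat = sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(table, expected)
        for fo, fe in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)  # (R-1)(C-1)
    return stat, df


def p_value_even_df(chi_sq, df):
    """Upper-tail chi-square probability; this closed form is valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive, even df")
    half = chi_sq / 2
    return math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))


# HYPOTHETICAL counts for 150 travelers (not from the slides).
# Rows: three educational levels; columns: few, some, many flights per year.
table = [
    [30, 15, 5],
    [20, 20, 10],
    [10, 15, 25],
]

chi_sq, df = chi_square_independence(table)
p = p_value_even_df(chi_sq, df)
reject_null = p < 0.05  # threshold value from the slides

print(f"chi-square = {chi_sq:.2f}, df = {df}, p = {p:.6f}, reject null: {reject_null}")
```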
Requirements and Limitations • Random sampling • Data must be in raw frequencies • Independence of observations • Size of the expected frequencies • Collapsing values
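The last two requirements can be checked mechanically. The sketch below flags cells whose expected frequency is too small and collapses two columns into one. The cutoff of 5 is a common rule of thumb and is an assumption here, since the slides do not state a number; the table and the function names are HYPOTHETICAL.

```python
def expected_counts(table):
    """Expected frequency per cell: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    return [[r * c / total for c in col_totals] for r in row_totals]


def small_expected_cells(expected, minimum=5.0):
    """Return (row, column) positions whose expected frequency is below `minimum`."""
    return [
        (i, j)
        for i, row in enumerate(expected)
        for j, fe in enumerate(row)
        if fe < minimum
    ]


def collapse_columns(table, keep, merge):
    """Add column `merge` into column `keep` and drop column `merge`."""
    collapsed = []
    for row in table:
        new_row = [value for j, value in enumerate(row) if j != merge]
        # The kept column shifts left by one if the merged column came before it.
        index = keep - 1 if merge < keep else keep
        new_row[index] += row[merge]
        collapsed.append(new_row)
    return collapsed


# HYPOTHETICAL raw frequencies; the third column is sparse.
table = [
    [12, 9, 2],
    [14, 11, 2],
]

too_small = small_expected_cells(expected_counts(table))
collapsed = collapse_columns(table, keep=1, merge=2)
still_too_small = small_expected_cells(expected_counts(collapsed))

print("cells below cutoff before collapsing:", too_small)
print("table after collapsing:", collapsed)
print("cells below cutoff after collapsing:", still_too_small)
```

Collapsing only makes sense when the merged categories still form a meaningful group, and it reduces the degrees of freedom of the test.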
Calculation - Details • $f_o - f_e$ • $(f_o - f_e)^2$ • $(f_o - f_e)^2 / f_e$ • Chi-square: $\chi^2 = \sum (f_o - f_e)^2 / f_e$ • Degrees of freedom = $(R-1)(C-1)$
Calculation – Continued • $\chi^2 = \sum (f_o - f_e)^2 / f_e = 7.1111$ • Degrees of freedom = $(R-1)(C-1) = (2-1)(2-1) = 1$
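To connect this result to the 0.05 threshold used in the earlier examples, the statistic can be converted to a p-value. A minimal sketch: for 1 degree of freedom the upper-tail probability has a closed form in terms of the complementary error function, so only the standard library is needed. The function names are made up for this illustration.

```python
import math


def degrees_of_freedom(rows, cols):
    """Degrees of freedom for an R x C table: (R-1)(C-1)."""
    return (rows - 1) * (cols - 1)


def p_value_df1(chi_sq):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi_sq / 2))


chi_sq = 7.1111                 # statistic from the slide
df = degrees_of_freedom(2, 2)   # 2 x 2 table
p = p_value_df1(chi_sq)
reject_null = p < 0.05

print(f"df = {df}, p = {p:.4f}, reject null: {reject_null}")
```

Under these assumptions p comes out well below 0.05, so the null hypothesis would be rejected at that threshold.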
Conclusion • What is chi-square • When should chi-square be used • Limitations of Chi-square • Examples • Resources
References • www.ling.upenn.edu/courses/Summer_2002/ling102/chisq.html • Lind, Marchal and Mason, Statistical Techniques in Business and Economics • Frederick J. Gravetter and Larry B. Wallnau, Statistics for the Behavioral Sciences