 Download Download Presentation Supplement 12: Finite-Population Correction Factor

# Supplement 12: Finite-Population Correction Factor

Télécharger la présentation ## Supplement 12: Finite-Population Correction Factor

- - - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - - -
##### Presentation Transcript

1. Supplement 12:Finite-Population Correction Factor Ka-fu WONG & Yi KE *The ppt is a joint effort: Mr KE Yi discussed whether we should make the finite population correction with Dr. Ka-fu Wong on 19 April 2007; Ka-fu explained the problem; Yi drafted the ppt; Ka-fu revised it. Use it at your own risks. Comments, if any, should be sent to kafuwong@econ.hku.hk.

2. Binomial distribution • The binomial distribution has the following characteristics: • An outcome of an experiment is classified into one of two mutually exclusive categories, such as a success or failure. • The data collected are the results of counts in a series of trials. • The probability of success stays the same for each trial. • The trials are independent. • For example, tossing an unfair coin three times. • H is labeled success and T is labeled failure. • The data collected are number of H in the three tosses. • The probability of H stays the same for each toss. • The results of the tosses are independent.

3. Hypergeometric Distribution • The hypergeometric distribution has the following characteristics: • There are only 2 possible outcomes. • The probability of a success is not the same on each trial. • It results from a count of the number of successes in a fixed number of trials.

4. Example: Hypergeometric Distribution In a bag containing 7 red chips and 5 blue chips you select 2 chips one after the other without replacement. 6/11 R2 R1 7/12 B2 5/11 R2 7/11 B1 5/12 4/11 B2 The probability of a success (red chip) is not the same on each trial.

5. Binomial vs. hypergeometric • The formula for the binomial probability distribution is: • P(x) = C(n,x) px(1- p)n-x • The formula for the hypergeometric probability distribution is: • P(x) = C(S,x)C(N-S,n-x)/C(N,n)

6. Example: Hypergeometric Distribution In a bag containing 700 red chips and 500 blue chips you select 2 chips one after the other without replacement. R2 699/1199 R1 700/1200 500/1199 B2 R2 700/1199 B1 500/1200 499/1199 B2 The probability of a success (red chip) is not the same on each trial but the difference is very small. So small that our conclusion will not be affected much even if we ignore the difference in probability in the two trials.

7. Example: Hypergeometric Distribution In a bag containing 700 red chips and 500 blue chips you select 2 chips one after the other without replacement. The probability of a success (red chip) is not the same on each trial but the difference is very small. So small that our conclusion will not be affected even if we ignore the difference in probability in the two trials. Knowing when we can ignore the difference is important when 1. we have limited computation power. 2. N is unknown but we know that N is somewhat large relative to sample size.

8. Finite? Infinite? • Actually, most population we are working with is finite: • Number of goods produced in a workshop per hour • The number of times you visit a club per month • Even the population of a big city, say Shanghai ALL FINITE!!

9. Cost and benefits • Would it be harmful to make correction every time you came across a finite population? • No harm! • But the correction can be costly: • Need to collect / know the population size. • Need additional computational power. • Need to remember one additional formula • Whether to recognize the finite population (and hence whether to use a different formula) depends on the benefits and costs of doing so.

10. Cost and benefits • Whether to recognize the finite population (and hence whether to use a different formula) depends on the benefits and costs of doing so. • Benefits increases with the proportion of sample size to population size. • Cost is higher when • We do not have a computer with us. • When we need to make extra effort to find out N. Use an approximation instead of the exact formula if Cost of using exact formula > benefit

11. Finite? Infinite? • Actually, most population we are working with is finite: • Number of goods produced in a workshop per hour • The number of times you visit a club per month • Even the population of a big city, say Shanghai ALL FINITE!! But… do we know N ?

12. An Example: benefits decreases when n/N decreases • A population of size N=500 (finite),  is known • When the size of sample n=300 • With finite population correction the standard error of the sample is: • Why? =0.633 make a big difference. The benefit of recognizing the finite population (in terms of the conclusion of testing hypothesis and constructing CI) appears big.

13. An Example: benefits decreases when n/N decreases • A population of size N=500 (finite),  is known • When the size of sample n=5 With finite population correction the standard error of the sample is: • Why? =0.996 make a small difference The benefit of recognizing the finite population (in terms of the conclusion of testing hypothesis and constructing CI) appears small.

14. N relative to n • 1st case, 500 is relative small when the sample size is 300, correction is needed • 2nd case, 500 is relative large when the sample size is 5, correction can be ignored (a finite population can be treated as infinite) Notes: • Usually, If n/N < .05, the finite-population correction factor is ignored • This applies to sample proportions as well