
Sampling & Large Sample Test




Presentation Transcript


  1. Sampling & Large Sample Test. Compiled by Dr. Kunal Pathak

  2. Introduction
     • Sampling is a part of our day-to-day life, which we use knowingly or unknowingly. For example, a housewife takes one or two grains of rice from the cooking pan and decides whether the rice is cooked; a quality controller takes a few items and decides whether the lot meets the desired specification; a pathologist takes a few drops of blood and tests whether the blood of the whole body differs from normal. In all these situations sampling is unavoidable and gives satisfactory results.
     • Complete enumeration (which is time-consuming and expensive, requires more skilled and technical personnel, and is prone to more error) is used only in the case of a small population.

  3. Population
     • A population is the group of items, units or subjects under study.
     • Types of population: a population can be classified into four categories.
     • Finite population: the number of units constituting the population is fixed and limited (e.g. workers in a factory, students in a college).
     • Infinite population: the population contains an infinite number of items (e.g. all real numbers lying between 20 and 50, the population of stars in the sky).

  4. • Real population: a population consisting of items that are all physically present.
     • Hypothetical population: a population consisting of the results of repeated trials (e.g. tossing a coin repeatedly yields a hypothetical population of heads and tails).

  5. Need of Sampling. Situations in which sampling is compulsory:
     • When the population is infinite
     • When the items or units are destroyed under investigation
     • When the results are required in a short time
     • When the resources for the survey are limited, particularly money and trained personnel
     • When the area of the survey is wide

  6. Sample and Sampling methods
     • Sample: A sample is a part or fraction of a population selected on some basis.
     • Sampling methods: The manner or scheme through which the required number of units is selected from the population.
     • Simple random sampling
     • Stratified random sampling
     • Systematic sampling
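The three sampling methods listed above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names, the population of 100 numbered units and the two strata are assumptions made for the example, not part of the original slides.

```python
import random

def simple_random_sample(population, n, seed=0):
    """Draw n units without replacement; every unit is equally likely."""
    rng = random.Random(seed)
    return rng.sample(population, n)

def systematic_sample(population, n, seed=0):
    """Pick a random start, then take every k-th unit, with k = N // n."""
    k = len(population) // n
    rng = random.Random(seed)
    start = rng.randrange(k)
    return [population[start + i * k] for i in range(n)]

def stratified_sample(strata, fraction, seed=0):
    """Draw a simple random sample of the same fraction from each stratum."""
    rng = random.Random(seed)
    sample = []
    for units in strata.values():
        size = max(1, round(len(units) * fraction))
        sample.extend(rng.sample(units, size))
    return sample

# Hypothetical population of 100 numbered units, split into two strata.
population = list(range(1, 101))
strata = {"low": population[:60], "high": population[60:]}

srs = simple_random_sample(population, 10)
sys_s = systematic_sample(population, 10)
strat = stratified_sample(strata, 0.1)  # 6 units from "low", 4 from "high"
```

Stratified sampling here uses proportional allocation (the same fraction from every stratum); other allocation rules are possible.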

  7. Terminology
     • Parameter: Any population constant is called a parameter, e.g. the population mean and variance.
     • Statistic: A statistic is a function of observable random variables that does not involve any unknown parameter (i.e. sample constants are known as statistics).

  8. • Estimator: A function of the sample variates used for estimating a population parameter. An estimator is itself a random variable and can take any value within its domain.
     • Estimate: A particular value of an estimator, computed from a fixed set of values of the random variable, is known as an estimate.
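The distinction between an estimator and an estimate can be illustrated with a short sketch. The sample mean is used here as the estimator; the simulated population (mean 66, s.d. 3) and all names are assumptions chosen for illustration only.

```python
import random

def sample_mean(values):
    """Estimator: a rule (function of the sample) for estimating the population mean."""
    return sum(values) / len(values)

rng = random.Random(1)
# Hypothetical population of 10,000 heights with mean 66 and s.d. 3.
population = [rng.gauss(66, 3) for _ in range(10_000)]

# Each sample gives a different estimate: one particular value of the estimator.
estimates = [sample_mean(rng.sample(population, 50)) for _ in range(5)]
for e in estimates:
    print(round(e, 2))
```

The function `sample_mean` is the estimator (a random variable before the sample is drawn); each printed number is an estimate.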

  9. • Statistical Hypothesis: A statement or assertion about a population (equivalently, about the probability distribution characterizing the population) that we want to verify on the basis of information available from a sample.
     • Test of Statistical Hypothesis: A two-action decision problem, taken after the sample values have been obtained: the acceptance or rejection of the hypothesis under consideration.

  10. Null Hypothesis (H0): In hypothesis testing, a statistician or decision-maker should be completely impartial and should not allow personal views to influence the decision. Much, therefore, depends upon how the hypothesis is framed. For example, suppose that bulbs manufactured under some standard process have an average life of μ hours, and it is proposed to test a new process for manufacturing light bulbs. We then have two populations of bulbs: those manufactured by the standard process and those manufactured by the new process. In this problem the following three hypotheses may be set up: i) The new process is better than the standard process. ii) The new process is inferior to the standard process. iii) There is no difference between the two processes. Hypothesis (iii), which takes a neutral view, is the one adopted as the null hypothesis H0.

  11. Alternative Hypothesis: Rejecting the null hypothesis H0 implies that it is rejected in favor of some other hypothesis, which is then accepted. The hypothesis that is accepted in the event of H0 being rejected is called the alternative hypothesis and is denoted by HA.

  12. Critical region: Let x1, x2, …, xn be the values of the random sample observations, denoted by O. The aggregate of all such sample points constitutes a space called the sample space, denoted by S. Since the sample values x1, x2, …, xn can be taken as a point in n-dimensional space, we specify some region of the n-dimensional space and see whether this point lies within that region. We divide the whole sample space S into two disjoint parts: W (the critical or rejection region) and S-W (the acceptance region). The null hypothesis H0 is rejected if the observed sample point falls in W; otherwise it is accepted.

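  13. Errors in statistical inference: A conclusion drawn on the basis of a particular sample may not always be true of the population. The four possible situations that arise in any test procedure are:
     • H0 is true and H0 is accepted: correct decision
     • H0 is true and H0 is rejected: Type I error
     • H0 is false and H0 is accepted: Type II error
     • H0 is false and H0 is rejected: correct decision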

  14. • Level of significance: The probability of a Type I error is called the level of significance and is denoted by α.
     • One-tailed and two-tailed tests: In any test, the critical region is represented by a portion of the area under the probability curve of the sampling distribution of the statistic.
     [Figures: critical region of a one-tailed test; critical regions of a two-tailed test]

  15. Procedure for testing of hypothesis
     i) Null Hypothesis: Set up the null hypothesis H0.
     ii) Alternative Hypothesis: Set up the alternative hypothesis HA. This enables us to decide whether to use a single-tailed (left or right) test or a two-tailed test.
     iii) Level of Significance: Choose an appropriate level of significance (α), depending on the reliability of the estimates and the permissible risk. This is decided before the sample is drawn, i.e. α is fixed in advance.
     iv) Perform the test: Compute the test statistic.
     v) Conclusion: Compare the value of the test statistic computed in step (iv) with the significant (critical) value at the given level of significance. If it falls in the critical region, reject the null hypothesis; if it falls in the acceptance region, accept it.
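The decision step of this procedure can be sketched as a small Python function. This is a minimal sketch, assuming the test statistic is a standard normal Z value that has already been computed; the function name `z_test` and its `tail` argument are hypothetical names introduced for illustration.

```python
from statistics import NormalDist

def z_test(z, alpha=0.05, tail="two"):
    """Return (critical value, reject H0?) for a computed Z statistic.

    tail: "two" for HA: not equal, "right" for HA: greater, "left" for HA: less.
    """
    std = NormalDist()  # standard normal distribution N(0, 1)
    if tail == "two":
        critical = std.inv_cdf(1 - alpha / 2)
        reject = abs(z) > critical
    elif tail == "right":
        critical = std.inv_cdf(1 - alpha)
        reject = z > critical
    elif tail == "left":
        critical = std.inv_cdf(alpha)  # negative critical value
        reject = z < critical
    else:
        raise ValueError("tail must be 'two', 'right' or 'left'")
    return critical, reject

critical, reject = z_test(2.1, alpha=0.05, tail="two")
print(round(critical, 3), reject)
```

Fixing `alpha` as an argument before any data are examined mirrors step (iii): the level of significance is chosen in advance, not after seeing Z.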

  16. Large Sample Tests • We have seen that for large values of n, the number of trials, almost all the distributions (e.g. Binomial and Poisson) are very closely approximated by the normal distribution. Thus in this case we apply the normal test, which is based on the following property of the normal probability curve: if X ~ N(μ, σ^2), then Z = (X - μ)/σ ~ N(0, 1).

  17. Two Tailed Test • P(-3<=Z<=3)=0.9973 => P(|Z|>3)=1-P(|Z|<3)=0.0027 i.e. in all probability we should expect a Z variate to lie in (-3,3). • P(-1.96<=Z<=1.96)=0.95 => P(|Z|>1.96)=1-P(|Z|<1.96)= 0.05 and P(-2.58<=Z<=2.58)=0.99 => P(|Z|>2.58)=1-P(|Z|<2.58)= 0.01 i.e. the significant values of Z at 5% and 1% level of significance for a two tailed test are 1.96 and 2.58 respectively.
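The two-tailed probabilities quoted above can be checked numerically. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `two_tail_prob` is an assumption introduced for illustration.

```python
from statistics import NormalDist

std = NormalDist()  # standard normal: mean 0, standard deviation 1

def two_tail_prob(c):
    """P(|Z| > c) for a standard normal variate Z."""
    return 2 * (1 - std.cdf(c))

for c in (1.96, 2.58, 3.0):
    print(c, round(two_tail_prob(c), 4))
```

The values come out close to 0.05, 0.01 and 0.0027 respectively, which is why 1.96 and 2.58 serve as the 5% and 1% significant values.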

  18. Notes: For a two-tailed test
     • If |Z| > 3: H0 is always rejected
     • If |Z| <= 3: H0 is accepted at a certain level of significance
     • If |Z| <= 3 and |Z| > 1.96: H0 is rejected at the 5% level of significance
     • If |Z| <= 3 and |Z| <= 1.96: H0 is accepted at the 5% level of significance
     • If |Z| <= 3 and |Z| > 2.58: H0 is rejected at the 1% level of significance
     • If |Z| <= 3 and |Z| <= 2.58: H0 is accepted at the 1% level of significance

  19. One Tailed Test • P(-3<=Z<=3)=0.9973 => P(|Z|>3)=1-P(|Z|<3)=0.0027 i.e. in all probability we should expect a Z variate to lie in (-3,3). • P(Z>1.645)=0.5-P(0<=Z<=1.645)=0.5-0.45= 0.05 and P(Z>2.33)=0.5-P(0<=Z<=2.33)=0.5-0.49= 0.01 i.e. the significant values of Z at 5% and 1% level of significance for a one tailed test are 1.645 and 2.33 respectively.
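The one-tailed areas can be checked in the same way. This is a minimal sketch; `right_tail_prob` is a hypothetical helper name, and the left-tail line relies on the symmetry of the normal curve.

```python
from statistics import NormalDist

std = NormalDist()  # standard normal: mean 0, standard deviation 1

def right_tail_prob(c):
    """P(Z > c) for a standard normal variate Z."""
    return 1 - std.cdf(c)

p_5 = right_tail_prob(1.645)   # close to 0.05
p_1 = right_tail_prob(2.33)    # close to 0.01
left_5 = std.cdf(-1.645)       # P(Z < -1.645), equal to p_5 by symmetry
print(round(p_5, 4), round(p_1, 4), round(left_5, 4))
```

For a left-tailed test the same critical values are used with a negative sign, i.e. -1.645 and -2.33.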

  20. Notes: For a one-tailed test
     • If |Z| > 3: H0 is always rejected
     • If |Z| <= 3: H0 is accepted at a certain level of significance
     • If |Z| <= 3 and |Z| > 1.645: H0 is rejected at the 5% level of significance
     • If |Z| <= 3 and |Z| <= 1.645: H0 is accepted at the 5% level of significance
     • If |Z| <= 3 and |Z| > 2.33: H0 is rejected at the 1% level of significance
     • If |Z| <= 3 and |Z| <= 2.33: H0 is accepted at the 1% level of significance

  21. Important Remark: • Theoretically: n > 100 for large sample • Practically: n > 30 for large sample

  22. Statistics and normal Z- variate: • Single Mean: • Difference of Means:

  23. Single S.D.: • Difference of two S.D.:
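For reference, the standard large-sample forms of these four statistics can be written as below. This is a sketch of the usual textbook expressions, not a transcription of the slides: the notation ($\bar{x}$ for a sample mean, $s$ for a sample standard deviation, $\mu$ and $\sigma$ for the population values, $n$ for the sample size) is an assumption, and each difference statistic is stated under the null hypothesis that the two population values are equal. The mean formulas reproduce the Z values in the worked examples that follow.

```latex
\begin{align*}
\text{Single mean:} \quad
  & Z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} \\[4pt]
\text{Difference of means:} \quad
  & Z = \frac{\bar{x}_1 - \bar{x}_2}
             {\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}} \\[4pt]
\text{Single S.D.:} \quad
  & Z = \frac{s - \sigma}{\sigma / \sqrt{2n}} \\[4pt]
\text{Difference of two S.D.:} \quad
  & Z = \frac{s_1 - s_2}
             {\sqrt{\dfrac{\sigma_1^2}{2 n_1} + \dfrac{\sigma_2^2}{2 n_2}}}
\end{align*}
```

When the population standard deviations are unknown and the samples are large, the sample values $s$, $s_1$, $s_2$ are commonly substituted for $\sigma$, $\sigma_1$, $\sigma_2$.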

  24. Examples: 1. Ten individuals are chosen at random from a normal population and their heights are found to be 63, 63, 66, 67, 68, 69, 70, 70, 71, 71 inches. Test, at the 5% level of significance, whether the sample belongs to a population whose mean height is 66 inches. Solution: n = 10, μ = 66 (population mean). H0: μ1 (sample mean) = μ. HA: μ1 != μ (two-tailed).

  25. μ1 = 67.8, σ = 2.86. Z = (67.8 - 66)/(2.86/√10) = 1.99 > 1.96 (critical value at the 5% level of significance). Hence, H0 is rejected.
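The arithmetic of Example 1 can be reproduced with a short script. This sketch assumes, as the quoted value σ = 2.86 suggests, that the standard deviation is computed with divisor n; the variable names are illustrative.

```python
import math

heights = [63, 63, 66, 67, 68, 69, 70, 70, 71, 71]
mu = 66                      # hypothesised population mean
n = len(heights)

mean = sum(heights) / n
# Standard deviation with divisor n, matching the value quoted in the example.
sd = math.sqrt(sum((x - mean) ** 2 for x in heights) / n)
z = (mean - mu) / (sd / math.sqrt(n))

print(round(mean, 2), round(sd, 2), round(z, 2))
print("reject H0" if abs(z) > 1.96 else "accept H0")
```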

  26. 2. An insurance agent has claimed that the average age of policy-holders who insure through him is less than the average for all agents, which is 30.5 years. A random sample of 100 policy-holders who had insured through him gave the following age distribution: [Table: age distribution of the 100 policy-holders] Test his claim at the 5% level of significance. Solution:

  27. • H0: μ1 (sample mean) = μ (population mean)
     • HA: μ1 (sample mean) < μ (population mean) {one-tailed: left}
     • μ1 = 28.8, s = 6.35, μ = 30.5
     • Z = -2.681 < -1.645 => Reject H0 => The agent's claim is accepted.
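Example 2 can be checked from its summary values alone. The sketch below assumes the single-mean statistic Z = (sample mean - μ)/(s/√n). Using the rounded figures 28.8 and 6.35 it gives about -2.677, marginally different from the -2.681 quoted above, presumably because the original calculation used unrounded values; the conclusion is the same.

```python
import math

n = 100
sample_mean = 28.8   # mean age in the agent's sample
s = 6.35             # sample standard deviation
mu = 30.5            # average age for all agents

z = (sample_mean - mu) / (s / math.sqrt(n))
critical = -1.645    # left-tailed critical value at the 5% level

print(round(z, 3))
print("reject H0" if z < critical else "accept H0")
```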

  28. 3. The mean height of 50 male students who showed above-average participation in college athletics was 68.2 inches with a standard deviation of 2.5 inches, while 50 male students who showed no interest in such participation had a mean height of 67.5 inches with a standard deviation of 2.8 inches. Test, at the 5% level of significance, the hypothesis that male students who participate in college athletics are taller than other male students. Solution:

  29. X1: height of athletic participants, X2: height of non-participants. n1 = 50, μ1 = 68.2, s1 = 2.5; n2 = 50, μ2 = 67.5, s2 = 2.8. H0: μ1 = μ2. HA: μ1 > μ2 (one-tailed: right). Z = (68.2 - 67.5)/√(2.5^2/50 + 2.8^2/50) = 1.32 < 1.645 (critical value at the 5% level of significance). Hence, H0 is accepted.
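The Z value in Example 3 follows from the difference-of-means statistic with the sample standard deviations used in place of the population values. A minimal sketch of the computation, with illustrative variable names:

```python
import math

n1, mean1, s1 = 50, 68.2, 2.5   # athletic participants
n2, mean2, s2 = 50, 67.5, 2.8   # non-participants

# Standard error of the difference of two independent sample means.
se = math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
z = (mean1 - mean2) / se
critical = 1.645                # right-tailed critical value at the 5% level

print(round(z, 2))
print("reject H0" if z > critical else "accept H0")
```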

  30. Thank You
