Comparing Samples

Learn about sampling distributions, critical cut points, hypothesis testing, T-distribution, ANOVA, and more in statistical analysis.

danuta

Presentation Transcript


  1. Comparing Samples

  2. Last Time • I talked about what could go wrong in an experiment where you compare a sample mean against a population with a known mean and standard deviation. • You build a sampling (null) distribution and set an alpha level using the population values. The population has a fixed amount of variability (SD), but the variability in the sample statistics depends on the sample size: the smaller the sample, the more the sample statistics vary.

  3. Example with SE of Means • SAS EG example of simulating a population • Draw a single sample, get the mean • Get many means • Calculate the mean and SD from these means • Compare vs. theoretical distribution
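The SAS EG project is not part of this transcript, so here is a rough Python sketch of the same steps. The population (50,000 roughly normal values with mean 100 and SD 15), the sample size of 25, and the 5,000 repetitions are all made-up values for illustration.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Hypothetical population: 50,000 values, roughly normal, mean 100, SD 15.
population = [random.gauss(100, 15) for _ in range(50_000)]
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

n = 25             # size of each sample (assumed)
n_samples = 5_000  # how many sample means to collect (assumed)

# Draw a single sample and get its mean.
one_mean = statistics.fmean(random.sample(population, n))

# Get many means.
means = [statistics.fmean(random.sample(population, n)) for _ in range(n_samples)]

# Calculate the mean and SD of these means.
mean_of_means = statistics.fmean(means)
sd_of_means = statistics.stdev(means)

# Compare vs. the theoretical standard error: population SD / sqrt(n).
theoretical_se = pop_sd / math.sqrt(n)

print(f"one sample mean:       {one_mean:.2f}")
print(f"mean of sample means:  {mean_of_means:.2f} (population mean {pop_mean:.2f})")
print(f"SD of sample means:    {sd_of_means:.2f}")
print(f"theoretical SE:        {theoretical_se:.2f}")
```

The SD of the simulated means should land close to the theoretical standard error, and rerunning with a smaller `n` should make both larger.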

  4. Critical Cut Points • Given the hypothetical mean, standard deviation and sample size, you build the null distribution and then determine how unusual a sample has to be before you reject your null hypothesis.
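As a minimal sketch of finding the cut points when the population SD is known, the snippet below uses a normal null distribution for the sample mean. The numbers (mean 100, SD 15, n of 25, two-sided alpha of 0.05) are assumed for illustration and do not come from the slides.

```python
import math
from statistics import NormalDist

# Assumed values for illustration only.
mu0 = 100.0    # hypothetical population mean
sigma = 15.0   # known population SD
n = 25         # sample size
alpha = 0.05   # two-sided alpha level

# The null distribution of the sample mean has SD = sigma / sqrt(n).
se = sigma / math.sqrt(n)
null_dist = NormalDist(mu0, se)

# Put alpha/2 in each tail to get the two critical cut points.
lower_cut = null_dist.inv_cdf(alpha / 2)
upper_cut = null_dist.inv_cdf(1 - alpha / 2)

print(f"reject the null if the sample mean is below {lower_cut:.2f} "
      f"or above {upper_cut:.2f}")
```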

  5. Alpha and Beta Again • If your sample data came from a different population, your best guess is that the data for this (sub)population are centered around your sample mean, so its sampling distribution (the alternative distribution) will not completely overlap the null distribution. The part of the alternative distribution (area under the curve) that falls beyond the critical cut points, where you would reject the null, is the power. The rest is beta.
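One way to put a number on power is to center an alternative distribution on the sample mean and measure how much of it falls past the null distribution's cut points. This is only a sketch: the null values (mean 100, SD 15, n of 25, alpha of 0.05) and the sample mean of 106 are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist

# Assumed values for illustration only.
mu0, sigma, n, alpha = 100.0, 15.0, 25, 0.05
sample_mean = 106.0  # hypothetical observed mean, used as the alternative's center

se = sigma / math.sqrt(n)

# Cut points come from the null distribution.
null_dist = NormalDist(mu0, se)
lower_cut = null_dist.inv_cdf(alpha / 2)
upper_cut = null_dist.inv_cdf(1 - alpha / 2)

# The alternative distribution has the same spread but a different center.
alt_dist = NormalDist(sample_mean, se)

# Power: area of the alternative distribution in the rejection region.
power = alt_dist.cdf(lower_cut) + (1 - alt_dist.cdf(upper_cut))
beta = 1 - power  # chance of missing a real difference

print(f"power = {power:.3f}, beta = {beta:.3f}")
```

Moving `sample_mean` further from `mu0`, or raising `n`, pushes more of the alternative distribution past the cut points and raises the power.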

  6. Graphical Example • Here is an R example of cut points in the theoretical distribution and how the alternative distribution overlaps with the null distribution. [R plot not reproduced in this transcript]

  7. Comparing Means • In reality, you will almost never have a known population mean and standard deviation to compare your sample against. You will likely have a hypothetical population mean, and you will want to see if your sample mean is likely to have come from the set of sample means distributed around that hypothetical population mean. Conceptually it is the same task, but the shape of the sampling distribution is different when you don’t know the population SD.

  8. Gosset worked out the function that describes the distribution you get when you are comparing means and estimating the population SD from the sample. • He figured it out while working at a brewery that would not let him publish under his own name, so he published it under the name "Student" and called the distribution T. (Was he thinking tea?) • The T distribution describes the sampling distribution when you don’t know the population standard deviation. There is extra uncertainty, and that shows up as a wider (and fatter-tailed) distribution.

  9. Student’s T • [Figure: T distribution with 5 df]

  10. Asymptotic T • As your sample size gets bigger, the T distribution looks more and more like a Z distribution. By an N of about 30, it is nearly indistinguishable from a Z.
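A rough way to see this without a T table is to simulate it. The sketch below draws many samples from a standard normal population (an assumption), computes a t statistic for each using the sample SD, and counts how often it lands beyond the Z cut point of 1.96. The sample sizes 6 and 30 and the 20,000 repetitions are illustrative choices.

```python
import math
import random

random.seed(2)  # fixed seed so the run is repeatable


def t_statistic(sample, mu0):
    """One-sample t statistic using the sample SD (n - 1 in the denominator)."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / (n - 1)
    return (mean - mu0) / math.sqrt(var / n)


def share_beyond_z_cut(n, reps=20_000, z_cut=1.96):
    """Share of null-true samples whose |t| exceeds the Z cut point."""
    hits = 0
    for _ in range(reps):
        sample = [random.gauss(0.0, 1.0) for _ in range(n)]
        if abs(t_statistic(sample, 0.0)) > z_cut:
            hits += 1
    return hits / reps


share_n6 = share_beyond_z_cut(6)    # 5 df: fat tails, well over 5% beyond 1.96
share_n30 = share_beyond_z_cut(30)  # 29 df: much closer to the Z's 5%

print(f"n = 6:  {share_n6:.3f} of t statistics beyond +/-1.96")
print(f"n = 30: {share_n30:.3f} of t statistics beyond +/-1.96")
```

If T and Z were identical, both shares would sit near 0.05. The small-sample share is noticeably larger because of the fatter tails.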

  11. Calculate It • Doing the t-test is trivially easy. First load the data into an analysis package. Graph it, and then do the one-sample t-test. • See the example SAS Enterprise Guide project. • The formula for the statistic sure looks familiar…
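The slide's formula did not survive in this transcript. The standard one-sample t statistic has the same shape as a Z score: the sample mean minus the hypothetical mean, divided by the standard error, except that the standard error uses the sample SD. A minimal sketch, with a hypothetical helper name and made-up data:

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return (t, df) for a one-sample t-test against hypothetical mean mu0."""
    n = len(sample)
    sample_sd = statistics.stdev(sample)   # estimates the population SD (n - 1)
    se = sample_sd / math.sqrt(n)          # estimated standard error of the mean
    t = (statistics.fmean(sample) - mu0) / se
    return t, n - 1


# Made-up data for illustration.
t_value, df = one_sample_t([2.0, 4.0, 6.0], mu0=2.0)
print(f"t = {t_value:.3f} with {df} df")
```

An analysis package would also turn the statistic and its degrees of freedom into a p-value. That step is left out here because it needs the T distribution's cumulative function.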

  12. Two Samples • If you have two samples, the formula gets a bit more complicated. Instead of using a single sample to estimate the population variability, you have two, and if the samples are not the same size, you want to put more trust (weight) in the larger sample.

  13. Estimated Variance • Basically you take the weighted average of the two sample variances, with a tweak to the denominator (degrees of freedom instead of raw sample sizes) to account for the fact that you are estimating population parameters in the formula.
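The slide's formula is not in this transcript. The usual pooled variance estimate, which is assumed to be what the slide showed, weights each sample variance by its degrees of freedom. Here $s_1^2$ and $s_2^2$ are the two sample variances and $n_1$ and $n_2$ are the sample sizes:

```latex
s_p^2 = \frac{(n_1 - 1)\, s_1^2 + (n_2 - 1)\, s_2^2}{n_1 + n_2 - 2}
```

The larger sample gets the larger weight, and the denominator loses one degree of freedom for each sample mean that had to be estimated.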

  14. The T-Statistic
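The formula on this slide is missing from the transcript. As a sketch, the standard pooled two-sample t statistic (assuming equal population variances) can be computed like this. The function name and the data are hypothetical.

```python
import math
import statistics


def pooled_two_sample_t(a, b):
    """Return (t, pooled_variance, df) for two independent samples.

    Assumes both samples come from populations with the same variance.
    """
    n1, n2 = len(a), len(b)
    var1, var2 = statistics.variance(a), statistics.variance(b)

    # Weighted average of the variances, weighted by degrees of freedom.
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df

    # Standard error of the difference between the two means.
    se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (statistics.fmean(a) - statistics.fmean(b)) / se_diff
    return t, pooled_var, df


# Made-up data for illustration; the second sample is larger, so it gets more weight.
t_value, pooled_var, df = pooled_two_sample_t([1.0, 2.0, 3.0], [2.0, 4.0, 6.0, 8.0])
print(f"pooled variance = {pooled_var:.2f}, t = {t_value:.3f} with {df} df")
```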

  15. Paired samples? • How does your variance compare if you sample the same person before and after a treatment rather than sampling two different people? • Smaller
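To see why the variance is smaller, here is a small sketch with made-up before and after scores for four people. People differ a lot from each other, but each person's change is similar, so the variance of the within-person differences is tiny compared with the variance between people.

```python
import math
import statistics

# Made-up scores for the same four people, before and after a treatment.
before = [10.0, 20.0, 30.0, 40.0]
after = [12.0, 23.0, 31.0, 44.0]
n = len(before)

# Paired view: work with each person's own change.
diffs = [a - b for a, b in zip(after, before)]
var_diff = statistics.variance(diffs)
paired_t = statistics.fmean(diffs) / math.sqrt(var_diff / n)

# Unpaired view: pretend the two columns came from different people.
var_before = statistics.variance(before)
var_after = statistics.variance(after)
pooled_var = ((n - 1) * var_before + (n - 1) * var_after) / (2 * n - 2)
mean_gap = statistics.fmean(after) - statistics.fmean(before)
unpaired_t = mean_gap / math.sqrt(pooled_var * (1 / n + 1 / n))

print(f"variance of differences: {var_diff:.2f}")
print(f"variance between people: {var_before:.2f} (before), {var_after:.2f} (after)")
print(f"paired t = {paired_t:.2f}, unpaired t = {unpaired_t:.2f}")
```

The mean change is the same in both views. Only the yardstick it is measured against changes.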

  16. ANOVA • To compare three or more groups you will want to use a method called ANOVA. Analysis of variance is baffling when you first see the algebra because you are looking for differences in group means by comparing variances.

  17. How ANOVA Works • Begin by looking at the overall variability in your data vs. the overall mean. Then look at the variability in your data relative to the subgroup means. If there is no meaningful effect of the treatments, the overall variability will look like the variability relative to the subgroups.
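The two kinds of variability can be sketched as sums of squared deviations. The group names and values below are made up. The "between" piece is the standard name for whatever variability is left over once you measure relative to the subgroup means.

```python
# Made-up data: three treatment groups with three subjects each.
groups = {
    "control": [1.0, 2.0, 3.0],
    "dose_a": [2.0, 3.0, 4.0],
    "dose_b": [5.0, 6.0, 7.0],
}


def mean(values):
    return sum(values) / len(values)


all_values = [x for values in groups.values() for x in values]
grand_mean = mean(all_values)

# Overall variability: every value vs. the overall mean.
ss_total = sum((x - grand_mean) ** 2 for x in all_values)

# Variability relative to the subgroups: every value vs. its own group mean.
ss_within = sum(
    (x - mean(values)) ** 2 for values in groups.values() for x in values
)

# What the grouping explains: group means vs. the overall mean.
ss_between = sum(
    len(values) * (mean(values) - grand_mean) ** 2 for values in groups.values()
)

print(f"total = {ss_total:.1f}, within = {ss_within:.1f}, between = {ss_between:.1f}")
```

With no treatment effect, the within-group piece would be nearly as large as the total. Here the groups differ a lot, so most of the total sits in the between-group piece.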

  18. Reduced Variance • With the T or Z distributions you get excited if your sample mean is far from the proposed population mean. Here, you get excited if the ratio of the two variances is far from 1. You need a distribution that can describe the ratio of two variances. That distribution is the F. It has two degrees-of-freedom parameters, one for each half of the fraction: the numerator's depends on the number of groups and the denominator's on the number of subjects.
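In the usual one-way ANOVA notation (assumed here, since the slides do not show the formula), with $k$ groups, $N$ subjects in total, and the between-group and within-group sums of squares $SS_{between}$ and $SS_{within}$, the ratio is:

```latex
F = \frac{MS_{between}}{MS_{within}}
  = \frac{SS_{between} / (k - 1)}{SS_{within} / (N - k)},
\qquad
df_{numerator} = k - 1, \quad df_{denominator} = N - k
```

When the treatments do nothing, both halves of the fraction estimate the same population variance, so $F$ should sit near 1.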
