Significance Tests

…and their significance



Remember how a sampling distribution of means is created? Take a sample of size 500 from the US and record the mean self-esteem, then repeat with new samples. If the mean should be 25, the sample means would stack up in a normal curve centered on 25: a normal sampling distribution, with 2.5% of the sample means in each outer tail.
[Figure: normal sampling distribution of mean self-esteem; self-esteem axis from 15 to 40, z axis from -3 to 3, 2.5% marked in each tail]

The sample size affects the sampling distribution:
Standard error = population standard deviation / square root of sample size
s.e. = σ_Y-bar = σ/√n
But in fact we use our sample’s standard deviation as an estimate of the population’s:
s = √[ Σ(Y - Y-bar)² / (n - 1) ]
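The two formulas above can be sketched in a few lines of Python. This is only an illustration: the function name `standard_error` and the list of scores are made up for the example and are not part of the slides.

```python
import math
import statistics

def standard_error(sample):
    """Estimate the standard error of the mean from one sample.

    Uses the sample standard deviation s (n - 1 in the denominator)
    as a stand-in for the unknown population standard deviation.
    """
    n = len(sample)
    s = statistics.stdev(sample)  # sqrt( sum((Y - Y-bar)^2) / (n - 1) )
    return s / math.sqrt(n)

# Hypothetical self-esteem scores, invented for illustration only.
scores = [22, 27, 31, 19, 25, 28, 24, 26, 23, 25]
print(round(standard_error(scores), 3))
```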

And if we increase our sample size (n), our repeated sample means will be closer to the true mean, and the standard error of the sampling distribution is smaller. The range that holds a particular middle percentage of sample means (say, the middle 95%) gets narrower.
[Figure: two sampling distributions of mean self-esteem on the same axis (15 to 40), each with its own z axis from -3 to 3; the larger-sample curve is narrower and has a shorter 95% range]
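To see the narrowing in numbers, here is a small sketch. It assumes a population mean of 25 (from the self-esteem example) and a standard deviation of 10, which is an invented value; the helper name `middle_95_range` is also hypothetical.

```python
import math

def middle_95_range(center, sd, n):
    """Range holding the middle 95% of sample means for samples of size n."""
    se = sd / math.sqrt(n)          # standard error shrinks as n grows
    return center - 1.96 * se, center + 1.96 * se

# Assumed values: mean 25, standard deviation 10 (illustrative only).
for n in (25, 100, 500):
    low, high = middle_95_range(25, 10, n)
    print(n, round(low, 2), round(high, 2))
```

Each larger n gives a tighter range around 25, which is the "measuring stick" getting shorter.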

• We use that measuring stick to say two things:
• If my sample is in the middle specified percent, the population’s mean is within this range. (Confidence Interval)
• If the population mean is the same as a guess of mine, then my sample’s mean would have to fall within this range to have been drawn from the middle specified percent. (Significance Test)
• So besides constructing a confidence interval, we can also do a significance test.
[Figure: sampling distribution centered on μ, marking ±1z (middle 68%), ±1.96z (middle 95%), and ±3z (middle 99.7%)]

• We know that if you have your sampling distribution centered on the population mean, μ:
• 16% of samples’ means would be larger than μ + 1z and 16% would be smaller than μ - 1z, for a total of 32% outside that range (68% inside).
• 2.5% of samples’ means would be larger than μ + 1.96z and 2.5% would be smaller than μ - 1.96z, for a total of 5% outside that range (95% inside).
• About 0.135% of samples’ means would be larger than μ + 3z and about 0.135% would be smaller than μ - 3z, for a total of about 0.27% outside that range (99.73% inside).
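These tail percentages can be checked against the standard normal curve. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `area_outside` is made up for this example.

```python
from statistics import NormalDist

def area_outside(z):
    """Share of sample means farther than z standard errors from the center."""
    one_tail = NormalDist().cdf(-abs(z))   # area in a single tail
    return 2 * one_tail                    # both tails together

for z in (1, 1.96, 3):
    print(z, round(100 * area_outside(z), 2), "% outside")
```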

But remember that we don’t normally know the actual mean for the population. What if we guessed? What if we specified a value that might be the population mean?

If we guessed a mean… If our guess is correct, our sample’s mean should be among the common samples that would have been drawn from a population with that guessed mean. If it is not, it is likely that the sample did not come from such a population.
[Figure: sampling distribution centered on the guess, z axis from -3 to 3, with a sample mean marked out in one tail: “What if my sample’s mean were here?”]

One way to tell whether our sample’s mean was generated by such a population is to place our sampling distribution over the guessed mean to see if the sample mean is among the middle 99% or 95% of samples that would be generated by such a mean.
Essentially, a significance test for a mean tells you how likely it would be to draw a sample mean as far from your guess as yours if the population mean really equaled your guess.
What you do is figure out what your sample’s z-score is relative to your guessed mean. If z is larger than 1.96 or smaller than -1.96, a sample mean that extreme would turn up less than 5% of the time in samples from such a “guess population”: reject the guess!
[Figure: sampling distribution centered on the guess with the middle 95% between -1.96z and 1.96z; a sample mean marked beyond 1.96z is among the rare 5% of possible means]

For example: If our guess was that self-doubt scores in the population averaged 20 on a scale from 1 to 50, we’d place the guess at 20 on the self-doubt axis. We guess 20, but our sample of size 100 has a mean of 25 and a standard deviation of 10.
[Figure: self-doubt axis from 16 to 28, with the guess, μ, at 20 and the sample mean, Y-bar, at 25]

Let’s build a sampling distribution around our guess, 20, using our sample of size 100 with s.d. = 10:
s.e. = 10/√100 = 10/10 = 1
Our sample mean appears to lie beyond the critical value of 1.96 (outer 5% of samples) and even 2.58 (outer 1% of samples). How many z’s is our sample mean away from our guess?
z = (Y-bar - μ) / s.e. = (25 - 20) / 1 = 5
[Figure: sampling distribution centered on the guess of 20 on the self-doubt axis, with a z axis from -3 to 5 and the sample mean, Y-bar, at z = 5]

Indeed, our sample z-score is 5, well above 1.96 or 2.58. Reject the guess! Looking in Appendix B… a sample mean this far above the guess has only a .0000287% chance of turning up if the population mean is 20!
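The self-doubt example can be reproduced in a short sketch. The function name `z_for_mean` is hypothetical, and the upper-tail area comes from Python’s `statistics.NormalDist` rather than from Appendix B.

```python
import math
from statistics import NormalDist

def z_for_mean(sample_mean, guess, sd, n):
    """How many standard errors the sample mean lies from the guessed mean."""
    se = sd / math.sqrt(n)
    return (sample_mean - guess) / se

# Values from the slides: guess 20, sample mean 25, s.d. 10, n = 100.
z = z_for_mean(25, 20, 10, 100)
upper_tail = NormalDist().cdf(-z)   # area beyond z in the upper tail

print(z)                             # number of standard errors from the guess
print(f"{100 * upper_tail:.7f} %")   # upper-tail area as a percentage
```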

Conducting a Test of Significance for the Mean
By slapping the sampling distribution for the mean over a guess of the mean, Ho, we can find out whether our sample could have been drawn from a population where the mean is equal to our guess.
1. Decide α-level (α = .05) and nature of test (two-tailed vs. one-tailed)
2. Set critical z (z = +/- 1.96) or t
3. Make guess or null hypothesis, Ho: μ = 0; Ha: μ ≠ 0
4. Collect and analyze data
5. Calculate z or t: z = (Y-bar - μo) / s.e.
6. Make a decision about the null hypothesis (reject or fail to reject)
7. Find the p-value

1. Decide α-level (α = .05) and nature of test (two-tailed vs. one-tailed).
α-level refers to how unlikely a sample’s mean would have to be before you’d reject your guess. The scientific standard is typically .05: if a sample mean as extreme as yours would occur less than 5% of the time in samples from a population whose mean is what you guessed, you’d reject the guess (the null hypothesis). α-level could also be set at .10, .01, etc.
[Figure: sampling distribution of sample means centered on the guess, µo; s.e. calculated by s/√n; “What if my sample mean were here?”]

1. Decide α-level (α = .05) and nature of test (two-tailed vs. one-tailed).
One- or two-tailed test refers to the rejection region in your sampling distribution. If your α-level were .05, in a two-tailed test your rejection region would be the outer 2.5% of each tail. A two-tailed test implies a directionless null hypothesis such as Ho: µ = 0.
[Figure: sampling distribution of sample means centered on the guess, µo; s.e. calculated by s/√n; 2.5% of the sampling distribution marked in each tail]

1. Decide α-level (α = .05) and nature of test (two-tailed vs. one-tailed).
One- or two-tailed test refers to the rejection region in your sampling distribution. If your α-level were .05, in a one-tailed test your rejection region would be the outer 5% of one of the tails. A one-tailed test implies a directional null hypothesis such as Ho: µ ≤ 0 or Ho: µ ≥ 0.
The idea: If I have good reason to think a parameter would be above a particular value, then I only need to set the guess at that value or less (Ho: µ ≤ 0) and look to see if the sample statistic is in the rare 5% of possible samples above the null. If it is in the extreme low end, I won’t reject the null!
[Figure: sampling distribution of sample means centered on the guess, µo; s.e. calculated by s/√n; 5% of the sampling distribution marked in one tail; “What if my sample mean were here or there?”]

2. Set critical z (z = +/- 1.96) or t
α-level refers to how unlikely a sample’s mean would have to be before you’d reject your guess. There is a z- or t-score that corresponds with that proportion of the area in the tails of the curve (area in the tails of the sampling distribution). For example:

Area in the right tail    Corresponding z
.10                       1.28
.05                       1.65
.025                      1.96
.01                       2.33
.005                      2.58
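The table of critical values can be recovered from the normal curve itself. The sketch below uses `NormalDist.inv_cdf` from the Python standard library; the helper `critical_z` and its `two_tailed` flag are names chosen for this example.

```python
from statistics import NormalDist

def critical_z(alpha, two_tailed=True):
    """z-score that cuts off the rejection region for a given alpha-level."""
    # A two-tailed test splits alpha evenly between the two tails.
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)

print(round(critical_z(0.05), 2))                    # two-tailed, alpha = .05
print(round(critical_z(0.05, two_tailed=False), 2))  # one-tailed, alpha = .05
print(round(critical_z(0.01), 2))                    # two-tailed, alpha = .01
```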

• We use t instead of z to be more accurate:
• t curves are symmetric and bell-shaped like the normal distribution. However, the spread is more than that of the standard normal distribution: the tails are fatter.
• There is a different t curve for each number of degrees of freedom (df = 1, 2, 3, and so on), and the curves approach the normal curve as df exceeds 120.

• The reason for using t is that we use the sample standard deviation (s) rather than the population standard deviation (σ) to calculate the standard error. Since s will vary from sample to sample, the variability in the sampling distribution ought to be greater than in the normal curve. t has a larger spread, more accurately reflecting the likelihood of extreme samples, especially when sample size is small.
• The larger the degrees of freedom (n - 1), the closer the t curve is to the normal curve. This reflects the fact that the standard deviation s approaches σ for large sample size n.
• Even though z-scores based on the normal curve will work for larger samples (n > 120), SPSS uses t for all tests because it works for small samples and large samples alike.
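A quick simulation can illustrate why the fatter tails matter. This sketch is not from the slides: it draws many samples from a normal population whose true mean is 0, standardizes each sample mean using the sample’s own s, and counts how often the result lands beyond ±1.96. The function name, the seed, and the number of trials are arbitrary choices.

```python
import math
import random

def share_beyond(cutoff, n, trials=10000, seed=1):
    """Share of simulated t statistics falling beyond +/- cutoff."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(0, 1) for _ in range(n)]   # true mean is 0
        mean = sum(sample) / n
        s = math.sqrt(sum((y - mean) ** 2 for y in sample) / (n - 1))
        t = mean / (s / math.sqrt(n))                  # uses s, not sigma
        if abs(t) > cutoff:
            hits += 1
    return hits / trials

small = share_beyond(1.96, n=5)     # small samples: well above 5%
large = share_beyond(1.96, n=100)   # large samples: close to 5%
print(small, large)
```

With small samples, far more than 5% of the statistics fall beyond ±1.96, so the normal-curve cutoff would reject a true null too often. The t curve’s wider critical values correct for that.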

3. Make guess or null hypothesis: Ho: μ = 0; Ha: μ ≠ 0
The guess refers to the value that you will feel comfortable with declaring is true for the population unless your sample evidence suggests otherwise. In science, we wouldn’t want to assert something based on a sample unless we had extremely good evidence. The null is a default assumption, such as saying previous research says the mean is μo. In more advanced statistics, we will typically use null hypotheses that declare “no difference between groups” or “no relationship between variables.” The alternative is typically consistent with your research hypothesis or expectations.

3. Make guess or null hypothesis: Ho: μ = 0 (or some other value); Ha: μ ≠ 0
The hypotheses above refer to a two-tailed test. Hypotheses for one-tailed tests would be like this:
Ho: μ ≤ 0 (or some other value); Ha: μ > 0
Ho: μ ≥ 0; Ha: μ < 0

4. Collect and analyze data.
Once you’ve established your assumptions and what you are testing for, you can get into data analysis. Note that this ordering of steps helps prevent you from “peeking into” the data to establish your assumptions and tests. Basing tests on sample information sets up predetermined outcomes, which is bad! If calculating inferential statistics by hand, you would need to find your mean and standard deviation for each variable.

5. Calculate z or t:
z or t = (Y-bar - μo) / s.e.
s.e. for z = σ/√n
s.e. for t = s/√n
Calculating the test statistic will tell you how many standard errors away from the null hypothesis your sample statistic is. Corresponding with the z or t value is an area under the curve that tells what proportion or percentage of sample means would have been that far away if your null hypothesis were correct.

6. Make a decision about the null hypothesis.
• Is your sample statistic more standard errors away from your guess or null hypothesized value than your critical z or t?
• If it is farther out:
  • It meets the criteria for implausibly rare that you established from the outset.
  • You would reject the null, saying it is unlikely your sample could have come from a population where that null value is true.
• If it is not more extreme than your critical z or t:
  • It is not an unlikely occurrence as established from the outset.
  • You would fail to reject the null, saying that your sample is consistent with having come from a population with that null value. (This does not prove the guess is true; the sample simply gives no good evidence against it.)

7. Find the p-value.
The p-value tells you the actual likelihood that you’d get a sample with a statistic at least as far away from the null value as yours if your null or guess were true for your population. To find p, look in a z or t table to find the proportion of the area in the tails of the curve that corresponds with the z or t that you calculated for your sample statistic. Be sure to keep track of whether you are doing a two-tailed test (double the one-tail area, p * 2) or a one-tailed test (the one-tail area, p).
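As a rough sketch of the table lookup, the function below returns the tail area for a z statistic using `statistics.NormalDist`. The name `p_value` and the `two_tailed` flag are illustrative. For a one-tailed test it assumes the sample fell on the side named in Ha, and it covers z only (t would need a t table or a statistics package).

```python
from statistics import NormalDist

def p_value(z, two_tailed=True):
    """Tail area beyond z under the standard normal curve."""
    one_tail = NormalDist().cdf(-abs(z))
    # Two-tailed: count both tails; one-tailed: just the tail beyond z.
    return 2 * one_tail if two_tailed else one_tail

print(round(p_value(1.96), 3))                    # two-tailed
print(round(p_value(1.65, two_tailed=False), 3))  # one-tailed
```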

• Another Example of a Significance Test of the mean or proportion.
• An administrator read that “snob universities” have over 50% of student GPAs over 3.5. He wants to determine whether SJSU is a “snob university.”
• He decides:
  • To use an alpha-level of .05 with a one-tailed test.
  • That the critical z or t will be 1.65.
  • Thinking SJSU is a “snob university,” he sets his null as Ho: Π ≤ .5; Ha: Π > .5
• He collects data from 500 randomly selected SJSU students and finds that .40 have GPAs above 3.5.
• He calculates z:
  s.e. = √[p(1 - p)/N] = √[.4(.6)/500] = √(.24/500) = .022
  z = (p - Πo) / s.e. = (.4 - .5) / .022 = -4.55
[Figure: sampling distribution centered on the guess, .50, with the upper 5% marked as the rejection region and our sample (.40) far out in the lower tail]

• Making a decision about the null is easy. With z = -4.55, his sample proportion (.40) is lower than the null value of .5, so it falls within the null’s range of .5 or less rather than in the upper-tail rejection region (z > 1.65). He fails to reject the null.
• In finding the p-value, he sees that if the population value were .5, he’d have less than a .0001 chance of getting a sample proportion that low. He has good evidence that SJSU is not a “snob university.”
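The administrator’s arithmetic can be retraced in a short sketch. It follows the slides in computing the standard error from the sample proportion p; the function name `z_for_proportion` is made up. Because the slides round the standard error to .022 before dividing, the unrounded z here comes out near -4.56 rather than exactly -4.55.

```python
import math
from statistics import NormalDist

def z_for_proportion(p, pi0, n):
    """z statistic for a sample proportion p against a null value pi0."""
    se = math.sqrt(p * (1 - p) / n)   # s.e. from the sample proportion, as in the slides
    return (p - pi0) / se

z = z_for_proportion(0.40, 0.50, 500)
reject = z > 1.65                      # one-tailed test with Ha: proportion > .5
lower_tail = NormalDist().cdf(z)       # chance of a proportion this low if the null value is .5

print(round(z, 2), reject, lower_tail < 0.0001)
```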

• One final note:
• The tests we typically use in sociology have assumptions of large sample sizes.
• When conducting tests with small sample sizes, some restrictions apply:
  • When working with means, we typically have to assume the population values are normally distributed.
  • When working with proportions, we must use a binomial probability distribution.
