Understanding Confidence Intervals and Hypothesis Testing in Educational Nonprofits

Confidence Intervals, Hypothesis Testing

Example 1 You manage a large educational nonprofit and, in order to comment to the media, are trying to estimate the deductions your teachers claim (you can write off $250 for supplies annually on your federal tax return). Your assistant randomly samples 50 employees. The mean write-off was $150 with an SD of $55. What is the probability that the mean write-off is between $140 and $160?

What can we say about the question before we even start calculations? • We know we can use the z distribution because of the sample size • In asking for a range of values, (between 140 and 160) we can take the normalized scores (number of SD from the mean i.e. z scores) and take that area under the curve • This is an example of how one would apply the confidence interval concept

Step one • Calculate what you can: • Standard error = s/sqrt(n) = 55/sqrt(50) • = 7.78 • Z score = (160-150)/7.78 = 1.29

Visualized

Find z score

Put it in plain English • Multiply the area under the curve between the mean and your z score (about 0.40) by 2 to get about 0.80 • The probability that the mean write-off is within $10 of $150 is about 80% • Comment on these results • Think of the alpha we commonly use in class: 0.05 and 0.1
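
The arithmetic in Example 1 can be checked with a short script. This is a minimal sketch using only the Python standard library; the variable names are illustrative and not part of the original slides.

```python
from math import sqrt
from statistics import NormalDist

sample_mean = 150.0  # mean write-off in dollars
sample_sd = 55.0     # sample standard deviation
n = 50               # sample size

std_error = sample_sd / sqrt(n)            # s / sqrt(n)
z_upper = (160 - sample_mean) / std_error  # standard errors above the mean
z_lower = (140 - sample_mean) / std_error  # same distance below the mean

# Area under the standard normal curve between the two z scores
std_normal = NormalDist()
prob_within_10 = std_normal.cdf(z_upper) - std_normal.cdf(z_lower)

print(f"s.e. = {std_error:.2f}, z = {z_upper:.2f}, probability = {prob_within_10:.3f}")
```

The computed area comes out near 0.80. A table lookup with a rounded z score can differ slightly in the third decimal.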

What is a confidence interval? • Definition: the best estimate for a range of a population value (parameter) that we can come up with given a sample (sample statistic) • The general formula for n>30: X bar plus or minus (critical value * s.e.) • Here is a list of critical values at the most common confidence levels
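
If the table of critical values is not at hand, the z critical values can be recomputed. The sketch below assumes a two-tailed interval and uses the standard library's NormalDist; the helper name z_critical is hypothetical.

```python
from statistics import NormalDist

def z_critical(confidence: float) -> float:
    """Two-tailed critical z value for a confidence level such as 0.95."""
    tail = (1 - confidence) / 2           # area left in each tail
    return NormalDist().inv_cdf(1 - tail)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")
```

This gives roughly 1.645, 1.96 and 2.576 for 90%, 95% and 99% confidence.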

T versus z • The general formula for n<30: X bar plus or minus (t critical value * s.e.) • If your n is less than 30, you need to look up the critical value in the t table, at the intersection of the degrees of freedom (df = n - 1) and the significance level, which depends on whether it is a one-tailed or two-tailed test • Thus the critical value changes depending on your sample size n and the confidence level that you desire

Example 2 We know the test scores for 20 people out of a class of 300. The mean score is 82. The sample standard deviation is 15. At what confidence level can we say the mean score is between 75 and 89?

What can we say about the question before we even start calculations? • We know we can use the t distribution because of the sample size (n = 20 is less than 30) • In asking for a range of values (between 75 and 89), we can take the t scores (number of standard errors from the mean) and take that area under the curve • This is an example of how one would apply the confidence interval concept

Calculate what you can: • Standard error = s/sqrt(n) = 15/sqrt(20) • = 3.35 • t score = (89-82)/3.35 = 2.0896 • Don’t want to use the t table? • http://stattrek.com/online-calculator/t-distribution.aspx • Be careful, it is a cumulative probability here • OR want an easy t table? • http://www.medcalc.org/manual/t-distribution.php

T table

Visualize your answer

Put your answer in words • We are 94.96% confident that the population mean of exam scores is between 75 and 89
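
The confidence level in Example 2 can also be approximated without a t table. The sketch below is one possible approach, not the method used in the slides: it integrates the t density numerically with Simpson's rule, using only the standard library. The helper names t_pdf and area_between are hypothetical.

```python
from math import gamma, pi, sqrt

def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def area_between(lo: float, hi: float, df: int, steps: int = 2000) -> float:
    """Area under the t curve from lo to hi by Simpson's rule (steps must be even)."""
    h = (hi - lo) / steps
    total = t_pdf(lo, df) + t_pdf(hi, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(lo + i * h, df)
    return total * h / 3

n, sample_mean, s = 20, 82.0, 15.0
std_error = s / sqrt(n)                # about 3.35
t_score = (89 - sample_mean) / std_error
confidence = area_between(-t_score, t_score, df=n - 1)

print(f"t = {t_score:.3f}, confidence = {confidence:.4f}")
```

The result is close to 0.95. The last digits depend on whether the standard error is rounded to 3.35 before dividing.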

Hypothesis Testing • A null hypothesis: nothing has changed or happened • Change has not occurred; the effect has not been realized • A statement of no difference • Always refers to the population, so it cannot be tested directly; it is an implied hypothesis • The null hypothesis is a statement of equality • The purpose of the null: acts as a starting point or benchmark against which the actual outcomes of a study can be measured • Until you show there is a difference, you assume there is no difference

Research hypothesis Definition: a definite statement that there is a relationship between variables • They posit a relationship between variables, not an equality • They always refer to the sample, not the population

One tailed v. two tailed • Non-directional: says two variables are different • Directional: specifies if one is more than or less than the other • One tailed tests: reflect a directional hypothesis • Greater use than a two tailed test • Two tailed tests: reflect a non-directional hypothesis • There is a difference but in no particular direction • Example one tailed test: The arrest rate is higher after a crackdown on prostitution • Example two tailed test: The arrest rate after the crackdown does not equal the arrest rate before

Steps to work through a CI
General steps to take to test a null hypothesis
1. State the null hypothesis
2. Set the level of risk (alpha) associated with the null hypothesis
3. Select the appropriate test statistic (z or t score, depending on n)
4. Compute the test statistic
5. Determine the value needed for rejection of the null based on a table of critical values for that particular statistic. Each test statistic has a critical value: the cutoff that results would rarely exceed by chance if the null were true
6. If the obtained value is more extreme than the critical value, the null cannot be accepted; that is, chance alone is not the best explanation of the events
7. If the obtained value doesn’t exceed the critical value, you do not reject the null

Example A series of complaints about prostitution has been made to the local police department. Before the crackdown, there were 3.4 arrests per day. The chief wants to show that the crackdown has worked. What are the null and research hypotheses?

Hypotheses • H_0: Following the crackdown arrests after=arrests before; arrests after=3.4 • H_A: Following the crackdown, arrests after > 3.4

Here is the random sample of arrests per day
Day   Prostitution Arrests
1     3
2     5
3     7
4     2
5     3
6     6
7     4
8     3
9     6
10    1
Step 1: Compute the sample mean and sample standard deviation. There are a lot of sites out there that do this: http://www.miniwebtool.com/sample-standard-deviation-calculator/

Use these estimates to calculate the standard error • Sample mean: 4 • Sample SD (hint divide by n-1)=1.94 • S.e.= s/sqrt(n)= 1.94/sqrt(10)=0.61
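
These estimates can be reproduced from the ten daily counts. The snippet below is a small sketch using the standard library's statistics module, whose stdev function uses the n-1 divisor mentioned in the hint.

```python
from math import sqrt
from statistics import mean, stdev

# Arrests per day for the 10 sampled days
arrests = [3, 5, 7, 2, 3, 6, 4, 3, 6, 1]

sample_mean = mean(arrests)                  # sum of counts / n
sample_sd = stdev(arrests)                   # sample SD, divides by n - 1
std_error = sample_sd / sqrt(len(arrests))   # s / sqrt(n)

print(f"mean = {sample_mean}, SD = {sample_sd:.2f}, s.e. = {std_error:.2f}")
```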

Test the hypothesis with these numbers • When you’re told to test a hypothesis, this is asking you to get the probability of taking a random sample of 10 with the mean at 4.0 if the population mean is actually 3.4 • Get the t score for 4.0

T score • (4.0 - 3.4) / 0.61 = 0.98 • Look up the t score

Interpret the t score • In the t table (df = 9, one tail), 0.98 falls between the columns for 0.15 and 0.2 • (the computer shows it is 0.176) • This means that the probability of drawing a sample of 10 with a mean of 4 or more if the population mean is really 3.4 is between 0.15 and 0.2; should we reject the null?

Why is the t score not enough? • Typically we’d reject the null with 95% confidence; the critical value there (one-tailed, df = 9) is 1.833, and we only got 0.98
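
The comparison against the critical value can be written out as a short check. This is a sketch under stated assumptions: the critical value 1.833 is taken from the slide above rather than computed, and the one-tailed probability is approximated by integrating the t density with Simpson's rule. The helper names t_pdf and upper_tail are hypothetical.

```python
from math import gamma, pi, sqrt

def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def upper_tail(t: float, df: int, steps: int = 2000) -> float:
    """P(T >= t) for t >= 0: one half minus the Simpson's-rule area from 0 to t."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

sample_mean, null_mean = 4.0, 3.4
std_error = 0.61      # from the previous slide
df = 9                # n - 1 for a sample of 10
critical_t = 1.833    # one-tailed, alpha = 0.05, df = 9 (from the t table)

t_score = (sample_mean - null_mean) / std_error
p_value = upper_tail(t_score, df)
reject_null = t_score > critical_t

print(f"t = {t_score:.2f}, one-tailed p = {p_value:.3f}, reject null: {reject_null}")
```

The t score falls short of the critical value, so the null is not rejected, and the one-tailed probability lands between 0.15 and 0.2 as read from the table.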

Significance levels (alpha) • The risk that what you observe is not due to the treatment • More precisely, the risk you’re willing to take that you’ll reject a null hypothesis when it is actually true • Example: the increase in test scores is by chance, not due to the after school program • If an article reports significance at the 0.05 level, this means that if the treatment had no effect, there would be at most a 1 in 20 chance of seeing a result this extreme by chance alone • The researcher picks this value (the risk they’re willing to accept)

How sure must you be? • If the t score you generate EXCEEDS the t score that is associated with the alpha, we can reject the null hypothesis • And accept the research hypothesis • The alpha is the probability that you SELECT in order to reject the null • When our alpha is 0.05, this is the threshold it takes to reject the null: if our t score exceeds the t score associated with 0.05 (at the df) then we reject the null, but there is still a 5% chance of rejecting a null that is actually true

In the previous example • Returning to the problem above, the t score is 0.98 • With alpha = 0.05 (one-tailed, df = 9) the critical t score is 1.833; 0.98 does not exceed 1.833, so we cannot reject the null • If the null is true, there is a ~17% chance of drawing a sample mean this high, and that’s too high to rule out chance

Language in psets • “Evaluate your hypothesis at alpha=0.1 and at alpha=0.05” • This is asking you to see if the t/z score that you calculated exceeds the t/z score at the chosen level of confidence (alpha = 0.05 is the same as 95% confidence) • If you’re using the t dist., make sure you determine the correct value at the proper degrees of freedom

Handy graphic for errors

Interpreting the previous graphic • The null can either be true or false; you’ll never know because you’re not testing the whole population • You can either accept it or reject it • Type I: the value associated with a type I error is the risk you’re willing to take, and it is conventionally between 0.01 and 0.05 • If it is at 0.05, there is a 5% chance you’ll reject the null when it is actually true • Reduce the chance of committing a type I error by using smaller and smaller alphas • But lowering the alpha increases the chance you commit a type II error!

Interpreting the previous graphic • Type II: you accepted a null by mistake, and concluded there are no differences when there actually are • Reduce your likelihood of committing a type II error by increasing the sample size
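
One way to see what the Type I risk means is to simulate many studies in which the null really is true and count how often it gets rejected anyway. The sketch below is illustrative only: it assumes a normal population with mean 0 and known SD 1, a two-tailed z test, and a hypothetical helper named false_rejection_rate.

```python
import random
from math import sqrt

def false_rejection_rate(critical_z: float, n: int = 30,
                         trials: int = 5000, seed: int = 0) -> float:
    """Share of simulated samples that reject a null hypothesis that is true."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # The null (population mean = 0, SD = 1) is true by construction
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z = (sum(sample) / n) / (1.0 / sqrt(n))
        if abs(z) > critical_z:
            rejections += 1
    return rejections / trials

rate_05 = false_rejection_rate(1.96)   # two-tailed alpha = 0.05
rate_01 = false_rejection_rate(2.576)  # two-tailed alpha = 0.01

print(f"alpha 0.05: {rate_05:.3f}, alpha 0.01: {rate_01:.3f}")
```

With enough trials the first rate settles near 0.05 and the second near 0.01, which is the sense in which alpha is the chance of a Type I error.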

SAMPLE SIZE • When you test a hypothesis with a small sample, the t scores with the associated alpha values will be higher than those for larger samples • This is because when you estimate a population with a small sample, it contains more error • As the number of df goes up, the t values for rejecting the null go down • If n is bigger than 30, use the normal distribution

FORMULA TO DETERMINE SAMPLE SIZE How to determine the sample size: n = [(t (e.g. 1.96) * s) / error we can tolerate]^2

Example We need to determine for the Welfare office the average income for all residents that receive welfare. They want to be 95% confident that the estimate of average income is within $100 of the actual average. How large of a sample do we need in order to reduce the error to 100 (the SD is 442)?

Solve by plugging in Step 1: we know that to build a 95% confidence interval we take 1.96 (that is the t score/critical value that we want). Step 2: n = [(1.96 * 442)/100]^2, so n = 75.05, which we round up to 76. In English: we need a sample of at least 76 respondents
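
The sample size formula is easy to wrap in a small function. The sketch below assumes the result is always rounded up to the next whole respondent; the function name required_sample_size is hypothetical.

```python
from math import ceil

def required_sample_size(critical_value: float, sd: float,
                         tolerable_error: float) -> int:
    """n = [(critical value * s) / tolerable error]^2, rounded up."""
    raw_n = (critical_value * sd / tolerable_error) ** 2
    return ceil(raw_n)

# Welfare office example: 95% confidence, SD of 442, error within 100
n_needed = required_sample_size(1.96, 442, 100)
print(n_needed)  # 76
```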

Example We are testing the effect of a drug by injecting 100 people with it and recording their response time. The mean response time for those who did not get the drug was 1.2 seconds, and the mean response time for those who were injected with the drug is 1.05 seconds. The sample standard deviation is 0.5 seconds. Do you think the drug affects the response time?

Step one • Set the hypotheses: Null: the response time is equal between those injected and those not injected (the drug has no effect) Research hypothesis: The response time for those injected is less than those not injected (mu _(with drug) < 1.2 seconds)

Step 2 • If the null were true, what is the probability we would have gotten this sample mean? (If that probability is really small then we can reject the null.) • We know that n>30, so we can use the critical value from the z distribution

Next steps • Step 3: Estimate the s.e. = s/sqrt(n) = 0.5/10 = 0.05 • Step 4: get the test statistic • Conceptualize the problem by drawing it out: 1.2 is the mean, how many SD is 1.05s away from 1.2s. Then get the z score for 1.05 to find how many SD it is away from the mean of 1.2.

Get the z score • Get the z score using this formula: • z = [(1.2-1.05)/0.05] = 3 • In English this means that 1.05 seconds is 3 standard errors away from the mean • So in setting this up, you’re asking what the odds are of getting a score 3 standard errors from the mean (1.05 s) completely by chance. Since it is far out there in the tails, intuition says they are low. • Given we set our hypothesis up this way, we are only testing to see if the drug lowers response time • This calls for a one tailed test

Draw it out to help

Look at the z table You look at the z table and see that 3.0 has .4986 between mu and the score. Thus if we add .5 to .4986 we get .9986, and the odds of getting this score by chance are 1-.9986 or .0014. How to put this into plain English?
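
The table lookup can be cross-checked in code. This is a minimal sketch with the standard library's NormalDist; variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 100
mean_without_drug = 1.2   # seconds, the value under the null
mean_with_drug = 1.05     # seconds, observed in the sample
s = 0.5                   # sample standard deviation in seconds

std_error = s / sqrt(n)                               # 0.05
z = (mean_without_drug - mean_with_drug) / std_error  # about 3

# One-tailed probability of a result at least this far from the mean
p_one_tailed = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}, one-tailed p = {p_one_tailed:.4f}")
```

The computed tail area is a little over one in a thousand, in line with the table value.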

Estimating population proportions

Proportions • You can set up confidence intervals around them just like we did with means • Here are the steps:
1. Estimate the proportion p
2. Take the SD with this formula: s = sqrt(p * (1-p))
3. Find the s.e. with this formula: s / sqrt(n)
4. Set up the confidence interval with this formula: proportion plus or minus t * s.e.

Example The warden wants to estimate how many re-admits he is getting because of a new job training program taking place in the jail. He takes a sample of 100 inmates who went through the program, and found that 68 became inmates again. Give a 95% confidence interval around this population proportion.

Calculate what we can Step 1: estimate the population proportion: p = 68/100 = 0.68 are re-admitted. Step 2: get the sample standard deviation using this formula: s = sqrt(p * (1-p)) = sqrt(0.68 * 0.32) = 0.47

Next steps Step 3: Use this in order to find the standard error: = s / sqrt(n) = 0.47 / sqrt(100) = 0.047. Step 4: What are the 95% confidence limits of the proportion? Since n is bigger than 30, the normal curve can be used. Set up a confidence interval using this formula: proportion plus or minus t * s.e. = 0.68 + or - 1.96 * 0.047 = 0.68 + or - 0.092 = 0.59 to 0.77
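
The four steps for a proportion can be collected into one function. The sketch below assumes the normal critical value 1.96 for 95% confidence; the function name proportion_ci is hypothetical.

```python
from math import sqrt

def proportion_ci(count: int, n: int, critical_value: float = 1.96):
    """Confidence interval for a proportion: p +/- critical value * s.e."""
    p = count / n               # step 1: estimate the proportion
    s = sqrt(p * (1 - p))       # step 2: standard deviation
    std_error = s / sqrt(n)     # step 3: standard error
    margin = critical_value * std_error
    return p - margin, p + margin  # step 4: lower and upper limits

# Warden example: 68 of 100 sampled inmates were re-admitted
low, high = proportion_ci(68, 100)
print(f"{low:.2f} to {high:.2f}")
```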
