
Calculating Statistical Significance and Confidence Intervals





Presentation Transcript


    1. Calculating Statistical Significance and Confidence Intervals Bioterrorism Epidemiology Module 12 Missouri Department of Health and Senior Services In module 8, Evaluating Risk of Disease, relative risk and odds ratio were presented. In this module we will learn how to determine the statistical significance associated with these risk measures.

    2. Tests of Significance (p values) Used to determine how likely it is that the observed results could have occurred by chance Study population is a sample from a population In the source population the amount of disease is the same for exposed and unexposed people A test of significance, also referred to as a p value, indicates the likelihood that a risk measure occurred by chance. In other words, if we repeated the measurement in another sample of the population, we might get a similar or a different risk value. Since we cannot test the whole population, we are attempting to make a judgment about the effects of exposure on disease from a sample of that population.

    3. Tests of Significance (p values) This hypothesis is known as the null hypothesis The minimum accepted p value for significance is 0.05 The most frequently used significance test in epidemiology is the chi-square test The test of significance tests the hypothesis that there is no difference in the amount of disease between people exposed to a toxin and people not exposed. That is, the disease is not related to exposure. This hypothesis is known as the null hypothesis. The significance test results in a p value, which is the probability that disease is not related to exposure. Because of years of convention, if the p value is greater than .05, we conclude that there is no relation between disease and exposure. If the p value is less than .05, you reject the hypothesis and conclude that there is a statistically significant difference between mortality or morbidity in people who were exposed to a toxin and those who were not. The chi-square test is the most frequently used test to determine this p value; however, confidence intervals can also be used to test the hypothesis of no difference.

    4. Chi-Square (χ²) This slide shows the formula for calculating chi-square. Chi-square calculations can be used for data collected from any study design (cohort, case-control, cross-sectional, or experimental) where the data can be displayed in a two-by-two table like the one in this slide. The next slide provides data to practice this calculation.

    5. Chi-Square (χ²) Calculate the chi-square for this data and then use this value in Table 8 to determine the p value associated with the calculated chi-square. This data was obtained from a case-control study design. Remember that cases are people with the disease and controls are people without the disease.

    6. Chi-Square (χ²) The calculation results in a chi-square value of 99.83. Using this value in Table 8 results in a p value that is less than .001. Because this value is less than .05, you reject the hypothesis that there is no difference between exposure in cases and controls. By rejecting this null hypothesis, you accept the alternative hypothesis that cases have significantly more exposure to the toxin of interest than do controls. Another way of interpreting this data is that the odds ratio, (100/50)/(100/350) = 7.0, is significantly different from 1.0. The significance of the odds ratio can be better understood by calculating a confidence interval around the odds ratio.
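The slide image with the chi-square formula is not reproduced in this transcript, so as a sketch we assume the standard shortcut formula for a two-by-two table, χ² = N(ad − bc)² / [(a+b)(c+d)(a+c)(b+d)], applied to the case-control counts above:

```python
# Check of the slide-6 chi-square using the standard 2x2 shortcut formula
# (an assumption here; the slide's own formula image is not in the transcript).
a, b = 100, 100   # exposed:   cases, controls
c, d = 50, 350    # unexposed: cases, controls
n = a + b + c + d

chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
print(chi2)  # ~100; the slide's 99.83 presumably reflects intermediate rounding
```

Either way the statistic is far above the critical value for p = .001, so the conclusion on the slide stands.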

    7. 95% Confidence Interval of an Odds Ratio and a Relative Risk We use the 95% confidence interval because, by convention, a p value of 5% is the cutoff for rejecting the null hypothesis in a significance test. 100% minus 5% equals 95%. This can be better understood by looking at the calculation.

    8. 95% CI for Relative Risk (RR) This slide shows the calculation for the 95% CI for the relative risk. As you can see, the formula for calculating confidence intervals is relatively complicated compared to the formula for calculating chi-square. In the next slides we will go through this step by step.

    9. Relative Risk Cohort Study Iexp = 10 / 1,000 X 100,000 = 1,000 per 100,000 Inexp = 5 / 3,000 X 100,000 = 167 per 100,000 RR = Iexp / Inexp = 1,000 / 167 = 6.00 This slide shows the calculation for relative risk. Use this data to calculate confidence intervals around the relative risk.
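The incidence and relative-risk arithmetic on this slide can be replayed directly from the stated counts (10 cases among 1,000 exposed; 5 cases among 3,000 unexposed):

```python
# Relative risk from the slide-9 cohort data.
i_exp = 10 / 1_000 * 100_000    # incidence in exposed: 1,000 per 100,000
i_nexp = 5 / 3_000 * 100_000    # incidence in unexposed: ~167 per 100,000
rr = i_exp / i_nexp
print(rr)  # 6.0
```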

    10. 95% CI for Relative Risk (RR) This slide shows the calculation for the 95% CI for the relative risk. From the previous slide we calculated the RR to be 6.00. The natural log of 6.00 is 1.79. The variance of the log of the RR is 990 divided by 10, which equals 99; this is divided by 1,000, which equals .099. To this we add .200, which is 599 divided by 3,000, giving a variance for the log of the RR of .299. The square root of this value, .547, is the standard error of the log of the RR. We are now ready to use these values to calculate the upper and lower 95% limits for the log of the RR. The standard error is multiplied by 1.96, the z value corresponding to the central 95% of the normal curve. .547 times 1.96 equals 1.07, which is added to and subtracted from the log of the RR, 1.79. These values, 2.86 and .72, are converted back with the antilog to arrive at the 95% confidence limits, 2.05 to 17.46. We can conclude that we are 95% certain that the true RR lies between 2.05 and 17.46. Since this interval does not include an RR of 1.00, we conclude that there is a statistically significant difference between the exposed and nonexposed groups. This is the same conclusion we arrived at using the chi-square value earlier in this module. Now we will calculate the 95% confidence interval for an OR.
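The step-by-step arithmetic above can be sketched in a few lines, using the same variance terms the slide names (990/10/1,000 for the exposed group, 599/3,000 for the unexposed):

```python
import math

# 95% CI for the RR from slide 10, replaying the slide's arithmetic.
# Cohort cells: a = 10 cases among n1 = 1,000 exposed;
#               c = 5 cases among n0 = 3,000 unexposed.
a, n1 = 10, 1000
c, n0 = 5, 3000

rr = (a / n1) / (c / n0)                            # 6.0
var_ln_rr = (n1 - a) / (a * n1) + (n0 - c) / (c * n0)  # .099 + .200 = .299
se = math.sqrt(var_ln_rr)                           # .547
lo = math.exp(math.log(rr) - 1.96 * se)             # ~2.05
hi = math.exp(math.log(rr) + 1.96 * se)             # ~17.5
print(lo, hi)  # the slide's 17.46 upper limit reflects rounded intermediates
```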

    11. 95% CI for Odds Ratio (OR) This slide shows the calculation, which is similar to the formula for calculating the confidence interval for the RR. The only difference is the calculation of the variance of the log of the OR.

    12. Odds Ratios Case Control Study ODDS OF EXPOSUREcases = 100 / 50 = 2.0 ODDS OF EXPOSUREcontrols = 100 / 350 = 0.29 OR = ODDScases / ODDScontrols = 2 / 0.29 = 6.90 Use the data in this table to calculate the 95% confidence interval for this odds ratio.
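The odds-ratio arithmetic can be checked directly from the table counts; note that without the intermediate rounding of 100/350 to 0.29, the ratio comes out to exactly 7.0 (the value slide 6 reports), not 6.90:

```python
# Odds ratio from the slide-12 case-control data.
odds_cases = 100 / 50       # 2.0
odds_controls = 100 / 350   # ~0.286 (0.29 on the slide)
odds_ratio = odds_cases / odds_controls
print(odds_ratio)  # 7.0 exactly; the slide's 6.90 comes from rounding
```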

    13. 95% CI for Odds Ratio (OR) First we calculate the variance of the log of the OR, which is the sum of the inverses of the four cells: 1/100 + 1/100 + 1/50 + 1/350. We then take the square root of this value, .21, which is the standard error of the log of the OR. The standard error is now used in the formula in the same manner as for the RR. The resulting 95% confidence interval is 4.57 to 10.38. We are 95% certain that the true OR lies between these two values, and the interval does not include the null value of 1. This concludes this module. In the next module, we will discuss recognizing and controlling bias and confounding. We will also use the information that has been presented to discuss criteria for causality.
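The OR interval above can be sketched the same way as the RR interval, with the cell-inverse variance the slide describes (this is the Woolf log-interval method; the limits below start from the unrounded OR of 7.0, so they differ slightly from the slide's 4.57 to 10.38, which starts from 6.90):

```python
import math

# 95% CI for the OR from slide 13: var(ln OR) = 1/a + 1/b + 1/c + 1/d.
a, b = 100, 100   # exposed:   cases, controls
c, d = 50, 350    # unexposed: cases, controls

odds_ratio = (a / c) / (b / d)                      # 7.0
se = math.sqrt(1/a + 1/b + 1/c + 1/d)               # ~0.207 (.21 on the slide)
lo = math.exp(math.log(odds_ratio) - 1.96 * se)     # ~4.66
hi = math.exp(math.log(odds_ratio) + 1.96 * se)     # ~10.5
print(lo, hi)
```

Since the whole interval lies above 1, the association is statistically significant, matching the chi-square result earlier in the module.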
