Confidence Intervals

Confidence Intervals Elizabeth Garrett-Mayer garrettm@musc.edu

What is a “confidence interval”? • It is an interval that tells the precision with which we have estimated a population parameter from a sample. • Examples: • Parameter of interest: median progression-free survival time in the cetuximab arm. “The estimated median progression-free survival is 10.1 weeks and the 95% confidence interval for median progression-free survival is 8.6 to 11.2 weeks.” • Parameter of interest: response rate in the cetuximab arm. “The estimated response rate is 36% with a 95% confidence interval of 29% to 42%.” • Parameter of interest: HR comparing overall survival (OS) in the cetuximab versus chemotherapy-alone arms. “The estimated HR for OS is 0.80 with a 95% confidence interval of 0.64 to 0.99.” Vermorken et al, NEJM, 359:11.

Different Interpretations of the 95% confidence interval • “We are 95% sure that the TRUE parameter value is in the 95% confidence interval” • “If we repeated the experiment many many times, 95% of the time the TRUE parameter value would be in the interval” • “Before performing the experiment, the probability that the interval would contain the true parameter value was 0.95.”
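The repeated-experiment interpretation can be checked by simulation. The sketch below is only an illustration: the true mean, standard deviation, sample size and number of repetitions are made-up values, and the interval uses the normal approximation, so the observed coverage should land near, not exactly at, 0.95.

```python
import random
from statistics import NormalDist, mean, stdev

def coverage(true_mean=10.0, sd=2.0, n=50, n_experiments=2000, level=0.95, seed=1):
    """Fraction of simulated experiments whose interval contains the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    hits = 0
    for _ in range(n_experiments):
        # One hypothetical experiment: n observations from a normal population.
        sample = [rng.gauss(true_mean, sd) for _ in range(n)]
        estimate = mean(sample)
        half_width = z * stdev(sample) / n ** 0.5
        if estimate - half_width <= true_mean <= estimate + half_width:
            hits += 1
    return hits / n_experiments

print(coverage())  # close to 0.95
```

Any single interval either contains the true value or does not; the 95% describes how often the procedure succeeds over many repetitions.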

Where does the interval come from? • Based on several quantities: • Estimate of the parameter helps determine “center” of the confidence interval • Width of the CI based on three things • The level of confidence you desire (e.g., 95%) • The variability in the patient population • The sample size

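General formula: $\hat{\theta} \pm 1.96 \times \dfrac{\sigma}{\sqrt{n}}$ • $\hat{\theta}$: parameter estimate • 1.96: width multiplier for 95% confidence • $\sigma$: measure of population variability • $n$: sample size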
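As a small sketch of how the pieces labelled on this slide fit together (estimate, plus or minus a confidence multiplier times variability over the square root of the sample size), the function below computes a normal-approximation interval. The function name and the input numbers are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

def normal_ci(estimate, sd, n, level=0.95):
    """Normal-approximation CI: estimate +/- z * sd / sqrt(n)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)  # 1.96 when level is 0.95
    half_width = z * sd / n ** 0.5
    return estimate - half_width, estimate + half_width

# Hypothetical inputs: estimated mean 50, standard deviation 10, 100 patients.
low, high = normal_ci(estimate=50.0, sd=10.0, n=100)
print(round(low, 2), round(high, 2))  # 48.04 51.96
```

Quadrupling the sample size halves the width, while greater variability or a higher confidence level widens the interval.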

Caveats • This assumes a “normal” approximation • Not appropriate for all situations (e.g., response rate with small N) • General principles are the same, but formula is not the same.
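To illustrate the small-N caveat, the sketch below compares the normal-approximation (Wald) interval for a response rate with the Wilson score interval, which is one commonly used alternative. The choice of Wilson is an assumption made here for illustration and is not named in the slides; the data (1 response in 10 patients) are made up.

```python
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # about 1.96

def wald_ci(responses, n, z=Z95):
    """Normal-approximation interval for a proportion."""
    p = responses / n
    half_width = z * (p * (1 - p) / n) ** 0.5
    return p - half_width, p + half_width

def wilson_ci(responses, n, z=Z95):
    """Wilson score interval for a proportion."""
    p = responses / n
    denom = 1 + z ** 2 / n
    center = (p + z ** 2 / (2 * n)) / denom
    half_width = z * (p * (1 - p) / n + z ** 2 / (4 * n ** 2)) ** 0.5 / denom
    return center - half_width, center + half_width

# Hypothetical small study: 1 response among 10 patients.
print(wald_ci(1, 10))    # lower limit falls below 0, an impossible response rate
print(wilson_ci(1, 10))  # stays inside [0, 1]
```

The principle is the same in both cases (estimate, confidence level, variability, sample size), but the formula differs.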

Not only 95%…. • 90% confidence interval: NARROWER than 95% • 99% confidence interval: WIDER than 95%
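Under the normal approximation, the confidence level only changes the multiplier applied to the standard error. A minimal sketch of those multipliers for the three levels mentioned above:

```python
from statistics import NormalDist

def z_multiplier(level):
    """Two-sided normal multiplier for a given confidence level."""
    return NormalDist().inv_cdf(0.5 + level / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: {z_multiplier(level):.3f}")
# 90%: 1.645
# 95%: 1.960
# 99%: 2.576
```

For the same data, a larger multiplier means a wider interval, which is why the 99% interval is wider than the 95% interval and the 90% interval is narrower.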

But why do we always see 95% CIs? • There is a “duality” between confidence intervals and p-values: a 95% confidence interval corresponds to a test at the conventional alpha = 0.05 level. • Example: Consider the HR for overall survival comparing cetuximab vs. chemo alone. • 95% confidence interval: (0.64, 0.99) • p-value = 0.04 • If the 95% confidence interval does not include 1, then the test that the HR is 1 will be significant at the alpha = 0.05 level. • If the 95% confidence interval does include 1, then the test that the HR is 1 will not be significant at the alpha = 0.05 level.
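The duality can be made concrete by backing an approximate p-value out of the reported interval. This is a rough sketch that assumes the interval was built symmetrically on the log hazard ratio scale with a normal approximation; the trial's own analysis may have used a different test, so the result is expected to agree only approximately with the published p-value.

```python
from math import log
from statistics import NormalDist

def p_value_from_ratio_ci(estimate, lower, upper, level=0.95):
    """Approximate two-sided p-value for H0: ratio = 1, from a ratio and its CI."""
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    # Standard error of log(ratio), recovered from the CI width on the log scale.
    se = (log(upper) - log(lower)) / (2 * z_crit)
    z = log(estimate) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# HR for overall survival quoted in the example above.
p = p_value_from_ratio_ci(0.80, 0.64, 0.99)
print(round(p, 3))  # below 0.05, in line with the reported p-value of 0.04
```

Because the upper limit of 0.99 sits just below 1, the approximate p-value sits just below 0.05.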

Other Confidence Intervals • Differences in means (e.g. QoL, CTCs) • Response rates • Differences in response rates • Odds ratios • median survival • difference in median survival • ……..

Recap • 95% confidence intervals are used to quantify the uncertainty (precision) in our estimates of parameters of interest. • Confidence intervals can be constructed for any parameter of interest (we have just looked at some common ones). • The general formulas shown here rely on the central limit theorem. • You can choose the level of confidence (it does not have to be 95%). • Confidence intervals are often preferable to p-values because they give a “reasonable range” of values for a parameter.

Some Confidence Intervals in a Survival Analysis
Example: Urba et al. Randomized Trial of Preoperative Chemoradiation Versus Surgery Alone in Patients with Locoregional Esophageal Carcinoma, JCO, Jan 15, 2001.

                    Hazard Ratio   95% CI      p-value
Chemo v. surgery    0.69           0.46-1.06   0.09

                    Arm I              Arm II
                    %     95% CI       %     95% CI
1 year survival     58    46-73        72    58-84
3 year survival     16    8-30         30    20-46

What about the confidence interval for the 1 year and 3 year difference?
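One rough way to approach that question from the published numbers alone is sketched below. It is not taken from the paper: it assumes the two arms are independent, recovers each arm's standard error from the width of its reported interval, and treats the intervals as symmetric normal-approximation intervals. The reported intervals are visibly asymmetric, so the output is only a ballpark figure.

```python
from statistics import NormalDist

Z95 = NormalDist().inv_cdf(0.975)  # about 1.96

def se_from_ci(lower, upper):
    """Approximate standard error implied by a 95% CI (symmetry assumed)."""
    return (upper - lower) / (2 * Z95)

def difference_ci(est_a, ci_a, est_b, ci_b):
    """Approximate 95% CI for (est_b - est_a), assuming independent arms."""
    diff = est_b - est_a
    se = (se_from_ci(*ci_a) ** 2 + se_from_ci(*ci_b) ** 2) ** 0.5
    return diff, diff - Z95 * se, diff + Z95 * se

# Survival percentages from the table above (Arm II minus Arm I).
one_year = difference_ci(58, (46, 73), 72, (58, 84))
three_year = difference_ci(16, (8, 30), 30, (20, 46))
print([round(x, 1) for x in one_year])
print([round(x, 1) for x in three_year])
```

Under these assumptions both approximate intervals include 0, which is consistent with the non-significant hazard ratio, while also showing how large a benefit remains compatible with the data.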

Why not provide confidence intervals for... • Difference in median survival • Difference in 1 year survival • Difference in 3 year survival • These would give readers an intuitive “reasonable range” of values to consider for the treatment effect. • What is remembered? • P = 0.09, which is read as a non-significant result • But, can anyone remember the treatment effect?

Confidence Intervals for Reporting Results of Clinical Trials, Simon • “[Hypothesis tests] are sometimes overused and their results misinterpreted.” • “Confidence intervals are of more than philosophical interest, because their broader use would help eliminate misinterpretations of published results.” • “Frequently, a significance level or pvalue is reduced to a ‘significance test’ by saying that if the level is greater than 0.05, then the difference is ‘not significant’ and the null hypothesis is ‘not rejected’….The distinction between statistical significance and clinical significance should not be confused.”

Caveats “They should not be interpreted as reflecting the absence of a clinically important difference in true response probabilities.”

Excellent References on Use of Confidence Intervals in Clinical Trials • Richard Simon, “Confidence Intervals for Reporting Results of Clinical Trials”, Annals of Internal Medicine, v.105, 1986, 429-435. • Leonard Braitman, “Confidence Intervals Extract Clinically Useful Information from the Data”, Annals of Internal Medicine, v. 108, 1988, 296-298. • Leonard Braitman, “Confidence Intervals Assess Both Clinical and Statistical Significance”, Annals of Internal Medicine, v. 114, 1991, 515-517.
