Clinical Trial Writing II Sample Size Calculation and Randomization

Clinical Trial Writing IISample Size Calculation and Randomization Liying XU (Tel: 22528716) CCTER CUHK 31st July 2002

1Sample Size Planning

1.1 Introduction • Fundamental Points • Clinical trials should have sufficient statistical power to detect difference between groups considered to be of clinical interest. Therefore calculation of sample size with provision for adequate levels of significance and power is a essential part of planning.

Five Key Questions Regarding the Sample Size • What is the main purpose of the trial? • What is the principal measure of patients outcome? • How will the data be analyzed to detect a treatment difference? (The test statistic: t-test , X2 or CI.) • What type of results does one anticipate with standard treatment? • Ho and HA, How small a treatment difference is it important to detect and with what degree of certainty? ( ,  and .) • How to deal with treatment withdraws and protocol violations. (Data set used.)

SSC: Only an Estimate • Parameters used in calculation are estimates with uncertaintyand often base on very small prior studies • Population may be different • Publication bias--overly optimistic • Different inclusion and exclusion criteria • Mathematical models approximation

What should be in the protocol? • Sample size justification • Methods of calculation • Quantities used in calculation: • Variances • mean values • response rates • difference to be detected

Realistic and Conservative • Overestimated size: • unfeasible • early termination • Underestimated size • justify an increase • extension in follow-up • incorrect conclusion (WORSE)

What is  (Type I error)? • The probability of erroneously rejecting the null hypothesis • (Put an useless medicine into the market!)

What is  (Type II error)? • The probability of erroneously failing to reject the null hypothesis. • (keep a good medicine away from patients!)

What is Power ? • Power quantifies the ability of the study to find true differences of various values of . • Power = 1- =P (accept H1|H1 is true) • ----the chance of correctly identify H1 (correctly identify a better medicine)

What is ? •  is the minimum difference between groups that is judged to be clinically important • Minimal effect which has clinical relevance in the management of patients • The anticipated effect of the new treatment (larger)

The Choice of  and  depend on: • the medical and practical consequences of the two kinds of errors • prior plausibility of the hypothesis • the desired impact of the results

The Choice of  and  • =0.10 and =0.2 for preliminary trials that are likely to be replicated. • =0.01 and =0.05 for the trial that are unlikely replicated. • = if both test and control treatments are new, about equal in cost, and there are good reasons to consider them both relatively safe.

The Choice of  and  • > if there is no established control treatment and test treatment is relatively inexpensive, easy to apply and is not known to have any serious side effects. • < (the mostcommon approach 0.05 and 0,2)if the control treatment is already widely used and is known to be reasonably safe and effective, whereas the test treatment is new,costly, and produces serious side effects.

1.2 SSC for Continuous Outcome Variables • H0: =C-I=0 • HA: =C-I0 • If the variance in known • If • If H0 will be rejected at the  level of significance.

A total sample 2N would be needed to detect a true difference  between I and C with power (1-) and significant level  by formula:

Example 1 • An investigator wish to estimate the sample size necessary to detect a 10 mg/dl difference in cholesterol level in a diet intervention group compared to the control group. The variance from other data is estimated to be (50 mg/dl). For a two sided 5% significance level, Z=1.96, and for 90% power, Z=1.282. • 2N=4(1.96+1.282)2(50)2/102=1050

Example1a Baseline Adjustment • An investigator interested in the mean levels of change might want to test whether diet intervention lowers serum cholesterol from baseline levels when compare with a control. • H0: =0 • HA: 0 • =20mg/dl, =10mg/dl • 2N=4(1.96+1.282)2(20)2/102=170

A Professional Statement • A sample size of 85 in each group will have 90% power to detect a difference in means of 10.0 assuming that the common standard deviation is 20.0 using a two group t-test with a 0.05 two-sided significant level.

Values of f(,) to be used in formula for sample size calculation

1.3 SSC for a Binary Outcome • Two independent samples

Example 2 • Suppose the annual event rate in the control group is anticipated to be 20%. The investigator hopes that the intervention will reduce the annual rate to 15%. The study is planned so that each participant will be followed for 2 years. Therefore, if the assumption are accurate, approximately 40% of the participants in the control group and 30% of the participants in the intervention group will develop an event.

A Professional Statement • A two group x2 test with a 0.05 two-sided significant level will have 90% power to detect the difference between a Group 1 proportion, P1,of 0.40 and a Group 2 proportion P2 of 0.30 (odds ratio of 0.643) when the sample size in each group is 480.

Table 1.3 Approximate total sample size for comparing various proportions in two groups with significance level () of 0.05 and power(1-) of 0.8 and 0.9

From Table 1.3 You can see: • N • The power 1- N  • The N 

Paired Binary Outcome • McNemar’s test • d=difference in the proportion of successes (d=pI-pC) • f=the portion of participants whose response is discordant (the pair of outcome are not the same)

Example 3 • Consider an eye study where one eye is treated for loss in visual acuity by a new laser procedure and the other eye is treated by standard therapy. The failure rate on the control, pC, is estimated to be 0.4, and the new procedure is projected to reduce the failure rate to 0.20. The discordant rate f is assumed to be 0.50.

=0.05 • The power 1- =0.90 • f=0.5 • PC=0.4 PI=0.2

1.4 Adjusting for Non-adherence • Ro =drop out rate • RI=drop in rate • N=N • If RO=0.20, RI=0.05 • N =1.78N

1.5 Adjusting the Multiple Comparison • ’= /k • k= the number of multiple comparison variables

Table 1.4 Adjusting for Randomization Ratio

1.6 Adjusting for loss of follow up • If p is the proportion of subjects lost to follow-up, the number of subjects must be increased by a factor of 1/(1-p).

1.7 Other Factors: • the rate of attrition of subjects during a trial • intermediate analyses

Sample size re-estimation • Events rates are lower than anticipate • Variability of larger than expected • Without unbinding data and • Making treatment comparisons

1.8 Power Calculation(assuming we compare two medicines) • Power Depends on 4 Elements: • The real difference between the two medicines,  • Big big power • The variation among individuals, • Small big power • The sample size, n • Large nbig power • Type I error, • Large  big power

Sensitivity of the sample size estimate • to a variety of deviations from these assumptions • a power table

Table 1 Statistical Power of the Tanzania Vitamin and HIV Infection Trial (N=960)

Example 4 Regret for Low Power Due to Small Sample? • I have a set of data that the mean change between the 2 groups is significantly different (p<0.05). But when I put calculate the power it gives only 50%. How should I interpret this?Also, can someone kindly advise as whether it is meaningful (or pointless) to calculate the power when the result is statistically significant?

Books and Software • Sample size tables for clinical studies (second edition) • By David Machin, Michael Campbell Peter Fayers and Alain Pinol • Blackwell Science 1997 • PASS 2000 available in CCTER • nQuery 4.0 available in CCTER

2. Randomization

Randomization • Definition: • randomization is a process by which each participant has the same chance of being assigned to either intervention or control.

Fundamental Point • Randomization trends to produce study groups comparable with respect to known and unknown risk factors, removes investigator bias in the allocation of participants, and guarantees that statistical tests will have valid significance levels.

Two Types of Bias in Randomization • Selection bias • occurs if the allocation process is predictable. If any bias exists as to what treatment particular types of participants should receive, then a selection bias might occur. • Accidental bias • can arise if the randomization procedure does not achieve balance on risk factors or prognostic covariates especially in small studies.

Fixed Allocation Randomization • Fixed allocation randomization procedures assign the intervention to participants with a pre-specified probability, usually equal, and that allocation probability is not altered as the study processes • Simple randomization • Blocked randomization • Stratified randomization

Randomization Types • Simple randomization

Simple Randomization • Option 1: to toss an unbiased coin for a randomized trial with two treatment (call them A and B) • Option 2: to use a random digit table. A randomization list may be generated by using the digits, one per treatment assignment, starting with the top row and working downwards: • Option 3: to use a random number-producing algorithm, available on most digital computer systems.

Advantages • Each treatment assignment is completely unpredictable, and probability theory guarantees that in the long run the numbers of patients on each treatment will not be radically different and easy to implement

Disadvantages • Unequal groups • one treatment is assigned more often than another • Time imbalance or chronological bias • One treatment is given with greater frequency at the beginning of a trial and another with greater frequency at the end of the trial. • Simple randomization is not often used, even for large studies.

Clinical Trial Writing II Sample Size Calculation and Randomization