Statistical Issues in Clinical Studies Using Laboratory Endpoints. ACOSOG June 5, 2003 Montreal. Elizabeth S. Garrett, PhD Sidney Kimmel Comprehensive Cancer Center Johns Hopkins University. Outline. Motivating example: Endostatin trial Surrogate markers
Statistical Issues in Clinical Studies Using Laboratory Endpoints. ACOSOG, June 5, 2003, Montreal. Elizabeth S. Garrett, PhD, Sidney Kimmel Comprehensive Cancer Center, Johns Hopkins University.
Outline • Motivating example: Endostatin trial • Surrogate markers • Design issues in laboratory studies • Defining outcomes • Issues relating to measurement
Motivating Example: Dose-Finding Endostatin Trial* • Endostatin is an angiogenesis inhibitor. • Complex biology: finding the best dose will require extensive study. • Cytostatic versus cytotoxic agent. • Outcomes of interest are biomarkers: • Tumor blood flow • Tumor metabolism • Outcomes were measured at baseline, 28 days, and 56 days. * Herbst, Abbruzzese et al. Development of Biologic Markers of Response and Assessment of Antiangiogenic Activity in Clinical Trial of Human Recombinant Endostatin. JCO v. 20, 2002.
Motivating Example: Dose-Finding Endostatin Trial* • Why is this study different from other dose-finding studies? • Standard Phase I trials find the MTD (maximum tolerated dose) with safety as the primary outcome. • Here the primary outcome is ‘efficacy-related’. • Outcomes are ‘surrogate’ outcomes. • Measuring outcomes is more invasive and more costly than in standard safety or efficacy trials. • Measurement of outcomes can be complicated.
Surrogate Markers • If tumor blood flow and tumor metabolism change as we expect, what can we conclude? • Is it true, then, that endostatin is efficacious? • Not necessarily: how closely tied are these markers to clinical response? • Surrogate outcomes: outcomes in the causal pathway of the true outcome.
Surrogate Markers • Replace a distal endpoint (response) with a proxy endpoint (tumor metabolism and/or blood flow). • Benefits of using surrogate markers • Reduction in sample size • Reduction in trial duration • Reduction in cost • Reduction in time to evaluate new therapies • Their use is NOT AS EASY AS IT SOUNDS… • Use of a marker as a surrogate for an outcome requires that you first identify one...
What is a surrogate marker? • Defining Characteristic: • a marker must predict clinical outcome, in addition to predicting the effect of treatment on clinical outcome • Operational Definition • establish an association between marker & clinical outcome • establish an association between marker, treatment & clinical outcome, in which marker mediates relationship between clinical outcome and treatment
Surrogate Markers • [Clinical Outcome | Treatment & Marker] = [Clinical Outcome | Marker] • The effect of the treatment on the clinical outcome is completely explained by the effect of the marker. • Knowing the status of the marker, the treatment adds nothing more to explaining the clinical outcome.
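The bracket notation on this slide can be read as a statement about conditional distributions. A minimal way to write it, assuming the symbols T for the clinical outcome, Z for treatment, and S for the marker (these letters are not in the original slides):

```latex
% Assumed notation: T = clinical outcome, Z = treatment, S = marker.
% Given the marker, the outcome distribution does not depend on treatment:
f(T \mid Z, S) = f(T \mid S)
```

Under this reading, two patients with the same marker value have the same outcome distribution regardless of which treatment they received.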
Surrogate Markers 1) Establish an association between marker & clinical outcome. [Diagram: marker → clinical outcome] 2) Establish an association between marker, treatment & clinical outcome, in which the marker completely mediates the relationship between clinical outcome and treatment. [Diagram: treatment → marker → clinical outcome]
NOT Surrogate Markers [Figure: two diagrams relating treatment, marker, and clinical outcome; arrows not recoverable from the transcript]
Correlative Studies • It is still valid to look at markers that you would expect to be correlated with the clinical outcome. • But we do not want to be overconfident by saying that they are true ‘surrogates.’ • Correlative studies might include: • Pharmacokinetics • Pharmacodynamics • Other biologic markers that can be measured in serum, biopsy samples, etc. [Figure: diagram relating treatment, marker, and clinical outcome]
Design Issues • Novel laboratory studies may not be suited for standard phase I and phase II clinical trials.
Standard Phase I Clinical Trial Designs • Historically, the MTD is of interest • ‘3+3’: aims for a dose with <33% toxicity • Treat 3 patients at a dose. • If no toxicities, escalate. • If 1 toxicity, treat 3 more at that level. • If 2 or more, de-escalate to a lower dose. • ‘3+3’ or variants are most commonly seen. • Not relevant to cytostatic and other agents (e.g. vaccine trials). • Need a different approach.
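The ‘3+3’ rules listed on this slide can be sketched as a small decision function. This is an illustrative sketch only: the function name is hypothetical, and the rule applied after a cohort has been expanded to 6 patients is not stated on the slide, so the commonly used convention (escalate if at most 1 of 6 has a toxicity, otherwise de-escalate) is assumed here.

```python
def three_plus_three_decision(n_treated, n_toxicities):
    """Return the next action for the current dose level.

    n_treated: patients treated at this dose so far (3 or 6).
    n_toxicities: how many of them had a dose-limiting toxicity.
    """
    if not 0 <= n_toxicities <= n_treated:
        raise ValueError("toxicities must be between 0 and n_treated")
    if n_treated == 3:
        if n_toxicities == 0:
            return "escalate"        # no toxicities: go to next dose
        if n_toxicities == 1:
            return "expand"          # 1 toxicity: treat 3 more here
        return "de-escalate"         # 2 or more: drop to lower dose
    if n_treated == 6:
        # ASSUMPTION: rule after expansion is not on the slide.
        return "escalate" if n_toxicities <= 1 else "de-escalate"
    raise ValueError("this sketch only handles cohorts of 3 or 6")


if __name__ == "__main__":
    for treated, tox in [(3, 0), (3, 1), (3, 2), (6, 1), (6, 2)]:
        print(treated, tox, three_plus_three_decision(treated, tox))
```

Note that the rule looks only at toxicity counts, which is why it says nothing about a laboratory or efficacy-related endpoint.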
Alternative Designs • Continual Reassessment Method (CRM) • ‘Response’ can be defined as toxicity or efficacy endpoint. • Target level of response is chosen. • Mathematical model for dose-response is assumed. • Dose levels are not fixed in advance: later doses are determined by responses to previous doses. • Many variants, but generally need to pre-specify a minimum and maximum dose. • Problem: we might not know what our ‘target’ response is.
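To make the CRM bullets above concrete, here is a rough sketch of one common variant. Everything specific in it is an assumption for illustration and not taken from the slides: a one-parameter power model p_i = skeleton_i ** exp(a), a normal prior on a, a simple grid approximation to the posterior, and a fixed set of candidate dose levels indexed from 0. The function name and the skeleton values are hypothetical.

```python
import math


def crm_next_dose(skeleton, doses_given, responses, target, prior_sd=1.34):
    """Recommend the dose whose estimated response rate is closest to target.

    skeleton: prior guesses of response probability at each dose (in (0, 1)).
    doses_given: dose index for each patient treated so far.
    responses: 1 if that patient had the 'response' (e.g. toxicity), else 0.
    """
    grid = [-4 + 8 * i / 400 for i in range(401)]  # grid over parameter a
    weights = []
    for a in grid:
        log_w = -0.5 * (a / prior_sd) ** 2         # normal prior (unnormalized)
        for d, y in zip(doses_given, responses):
            p = skeleton[d] ** math.exp(a)          # assumed dose-response model
            log_w += math.log(p) if y else math.log(1 - p)
        weights.append(math.exp(log_w))
    total = sum(weights)
    # Posterior mean response probability at each dose level.
    post = [
        sum(w * s ** math.exp(a) for a, w in zip(grid, weights)) / total
        for s in skeleton
    ]
    best = min(range(len(skeleton)), key=lambda i: abs(post[i] - target))
    return best, post


if __name__ == "__main__":
    skeleton = [0.05, 0.10, 0.20, 0.35, 0.50]       # hypothetical prior guesses
    dose, post = crm_next_dose(skeleton, [0, 0, 0], [0, 0, 1], target=0.20)
    print("next dose index:", dose)
    print("posterior means:", [round(p, 3) for p in post])
```

The sketch also shows where the slide's last bullet bites: the `target` argument has to be supplied, and for a novel laboratory endpoint it may not be known.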
Alternative Designs • Dose-ranging study • Choose fixed doses and treat a fixed number of patients at each of the doses. • Allows estimation of the dose-response curve. • The design chosen by Herbst et al. for the endostatin trial. • Can be used when safety is not of great concern (i.e. probability of toxicities is small). • Both CRM and dose-ranging designs tend to give better estimates of the dose-response curve than standard designs. • Toxicity almost always gets included as (at least) a secondary outcome.
Selecting a Phase I Design • Don’t assume the standard Phase I trial designs are appropriate for assessing laboratory endpoints. • Dose-ranging studies are more likely candidates. • Treat enough patients at each dose level to get a ‘reasonable’ estimate of response (i.e. at least 3). • Think carefully about dose levels… • Want enough dose levels to understand/estimate the curve. • Don’t want so many that it will take you a long time to finish the study. • Patient population • Endostatin trial: broad patient population • 3 NSCLC, 8 melanoma, 2 HNSCC, 3 breast, 2 colon, 2 renal cell, 4 sarcoma, 1 thyroid. • Generalizability versus accrual versus specificity
Choosing dose levels • It can be hard to know exactly how to select doses. • If there is too much ‘space’ between doses, it can be difficult to estimate the true curve. • If you expect ‘monotonicity’, then you can infer the response at intermediate doses. • But what if the curve is not monotonic? Endostatin example.
Choosing the timing • Standard Phase I • Patients are under surveillance for short-term toxicities. • Patients contact the study team when a toxicity occurs after discharge. • Not an issue of ‘when to look.’ • In laboratory studies • Usually a pre-post setting: how does the measure after treatment compare to the measure before treatment? • Baseline (before treatment) measure is needed. • Post-treatment measures: • When is treatment at its most potent? • How often should response be measured? • Expensive? Invasive? • Is it sufficient to look at clinic visit times?
Timing in Endostatin Trial [Figure: timeline with measurements marked at 28 days and 56 days]
Phase II Designs • Phase II designs: efficacy • Standard efficacy outcomes • Complete response • Partial response • Overall survival • For cytostatic agents • Progression • Progression-free survival • Laboratory outcomes…
Defining outcomes in laboratory studies • They are usually messy • More common binary outcomes have nice properties: • “Looking for 40% response vs. 20% response” • Laboratory outcomes are not so nice: • Often skewed. • Often have ‘undetectable’ range. • Often do not know what to expect. • This makes it hard to plan (i.e. sample size, power). • Novel assays: Not obvious what expected changes would be without treatment. • How much fluctuation would we expect to see?
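The “40% response vs. 20% response” example is easy to plan for precisely because the outcome is binary. As a rough illustration, the sketch below computes the rejection threshold, exact type I error, and power of a single-arm, one-sided exact binomial test. The single-arm design, the one-sided 5% level, the function names, and the sample size used in the demo are all assumptions for illustration; the slide gives only the two response rates.

```python
from math import comb


def binom_tail(n, p, r):
    """P(X >= r) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(r, n + 1))


def single_arm_design(n, p0=0.20, p1=0.40, alpha=0.05):
    """Smallest number of responders r that rejects p0 at level alpha.

    Returns (r, exact type I error, power under p1).
    """
    for r in range(n + 2):
        if binom_tail(n, p0, r) <= alpha:
            return r, binom_tail(n, p0, r), binom_tail(n, p1, r)


if __name__ == "__main__":
    r, type1, power = single_arm_design(35)  # n = 35 is an arbitrary example
    print(f"reject if responders >= {r}; alpha = {type1:.3f}; power = {power:.3f}")
```

No comparably simple calculation exists for a skewed laboratory measure with an undetectable range and an unknown expected change, which is the planning difficulty this slide describes.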
Expected Fluctuations [Figure: measurements plotted at 0, 28, and 56 days] • But, with multiple patients, these changes would tend to average to zero. • However, if half of the patients have an increase and half have a decrease, we might conclude that treatment is 50% effective!
Expected Fluctuations • Follow-up times are compared to baseline. • That puts a lot of stock in the baseline measure. • Why not consider a ‘burn-in’ period? • If baseline is inaccurately measured, all comparisons will be incorrect.
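The point about fluctuations can be illustrated with a small simulation in which the treatment has no effect at all. The numbers here (a true level around 10, measurement noise with standard deviation 1) and the function name are made up for illustration and are not from the endostatin trial.

```python
import random


def simulate_null_changes(n_patients, noise_sd=1.0, seed=0):
    """Simulate baseline-to-follow-up changes when treatment does nothing.

    Returns (mean change, fraction of patients whose marker decreased).
    """
    rng = random.Random(seed)
    changes = []
    for _ in range(n_patients):
        true_level = rng.gauss(10.0, 2.0)               # stable true value
        baseline = true_level + rng.gauss(0, noise_sd)   # noisy measurement
        follow_up = true_level + rng.gauss(0, noise_sd)  # noisy measurement
        changes.append(follow_up - baseline)
    mean_change = sum(changes) / n_patients
    frac_decrease = sum(c < 0 for c in changes) / n_patients
    return mean_change, frac_decrease


if __name__ == "__main__":
    mean_change, frac_decrease = simulate_null_changes(1000)
    print(f"mean change: {mean_change:.3f}")
    print(f"fraction with a decrease: {frac_decrease:.2f}")
```

The mean change sits near zero while roughly half of the simulated patients show a decrease, so counting patients whose marker moved in the hoped-for direction would be misleading without knowing the expected fluctuation.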
Measurement Issues: Accurate? • Regardless of natural fluctuations… • How accurately are we measuring our outcome? • How accurately can we measure blood flow? • Are we on target?
Measurement Issues • Why would it be measured inaccurately? • Often LONG protocol to get ‘final’ measure. • Lots of room for errors! • Although the classic efficacy outcome of ‘response’ is relatively soft, it has been objectively defined. • Laboratory endpoints require assays and other measurements, and also assumptions. • Often, assay is being developed along with the trial.
Assay Properties • IS THE ASSAY SENSITIVE? • Sensitivity: does the assay detect abnormalities in cases where abnormalities exist? • Specificity: does the assay find normal levels in normal cases? • “It has been shown that at high flow rates, measuring blood flow by PET underestimates blood flow.” → bias in results.
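Sensitivity and specificity as defined on this slide can be computed from a 2×2 table of assay results against the true status. A minimal sketch, with a hypothetical function name and made-up counts:

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Compute (sensitivity, specificity) from 2x2 counts.

    true_pos / false_neg: abnormal cases the assay flagged / missed.
    true_neg / false_pos: normal cases the assay cleared / wrongly flagged.
    """
    abnormal = true_pos + false_neg
    normal = true_neg + false_pos
    if abnormal == 0 or normal == 0:
        raise ValueError("need at least one abnormal and one normal case")
    sensitivity = true_pos / abnormal   # detected among truly abnormal
    specificity = true_neg / normal     # cleared among truly normal
    return sensitivity, specificity


if __name__ == "__main__":
    # Made-up counts for illustration only.
    sens, spec = sensitivity_specificity(
        true_pos=18, false_neg=2, true_neg=27, false_pos=3
    )
    print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Both quantities require knowing the true status from some reference method, which can be hard to obtain when the assay is being developed alongside the trial.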
More Measurement Issues • Reliability of assay • How reproducible are the results? • Two samples taken from the same patient on the same day from different lesions? • Two samples taken from the same patient on the same day from the same lesion? • One sample analyzed twice using the same method? • Subjectivity • Inter-rater agreement • Intra-rater agreement • In what ways can ‘error’ come into the procedure? Great study + bad assay = bad study
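One common way to quantify inter-rater or intra-rater agreement on a categorical reading is Cohen's kappa, which corrects the observed agreement for the agreement expected by chance. The slides do not name a specific statistic, so its use here is an assumption; the ratings in the demo are invented.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Cohen's kappa for two raters scoring the same samples."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("need two non-empty rating lists of equal length")
    n = len(ratings_a)
    # Proportion of samples on which the two raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Agreement expected by chance from each rater's category frequencies.
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / n**2
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    rater_1 = ["high", "high", "low", "low", "low", "high"]
    rater_2 = ["high", "low", "low", "low", "low", "high"]
    print(f"kappa = {cohens_kappa(rater_1, rater_2):.2f}")
```

For intra-rater agreement the same function applies, with the two lists holding one rater's first and second readings of the same samples.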
Measurement Issues • How can we know the reliability of the measures? • Preliminary studies (pre-clinical). • Build it into the study design! • Reliability substudy: • Inter-rater agreement • Intra-rater agreement • Incorporate burn-in period • Take multiple measures • Run the assay (test) more than once
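Why taking multiple measures helps can be seen from a simple model. Assume, for illustration only, that each of k repeated measurements on the same patient at the same time point equals the true value plus an independent error with variance σ² (the symbols below are not from the slides):

```latex
% Assumed model: Y_j = \mu + \varepsilon_j, with independent errors,
% E[\varepsilon_j] = 0 and Var(\varepsilon_j) = \sigma^2, for j = 1, ..., k.
\bar{Y} = \frac{1}{k}\sum_{j=1}^{k} Y_j,
\qquad
\operatorname{Var}(\bar{Y}) = \frac{\sigma^2}{k},
\qquad
\operatorname{SD}(\bar{Y}) = \frac{\sigma}{\sqrt{k}}
```

Under this assumption, averaging four replicates halves the measurement-error standard deviation. The gain is smaller if the errors are correlated, for example when replicates share the same sample preparation.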
Final Comments • Novel trials often need creative/novel study designs. • Understanding the properties (e.g. reliability, sensitivity) of your assay is crucial. • Try to build reliability testing into the study design. • Think carefully about: • Are your markers truly surrogates? • Which doses should you choose (minimum, maximum, and increments)? • How often should you measure the outcome? Statisticians usually can provide guidance, but not necessarily answers, to these questions! Answers need to be based on experience with the novel agent and the underlying biology.