
Gene Pennello, Ph.D. Team Leader, Diagnostics Devices Branch Division of Biostatistics

Clinical Validation of Prognostic Biomarkers of Risk and Predictive Biomarkers of Drug Efficacy or Safety. Gene Pennello, Ph.D. Team Leader, Diagnostics Devices Branch Division of Biostatistics Office of Surveillance and Biometrics Center for Devices and Radiological Health, FDA.


Presentation Transcript


  1. Clinical Validation of Prognostic Biomarkers of Risk and Predictive Biomarkers of Drug Efficacy or Safety Gene Pennello, Ph.D. Team Leader, Diagnostics Devices Branch Division of Biostatistics Office of Surveillance and Biometrics Center for Devices and Radiological Health, FDA SAMSI Risk Perception Policy Practice Workshop October 3, 2007

  2. Outline • FDA and Device Regulation • Types of Biomarkers • Validation of Diagnostics • Predictive and Prognostic Biomarkers • Definitions, Endpoints • Study Designs for Predictive Biomarkers • Prospective Designs – efficiency comparison • Prospective-Retrospective Designs • Summary

  3. FDA • CDER (Drugs) • CDRH (Devices) • CBER (Biologics) • CVM (Veterinary) • CFSAN (Food) • NCTR

  4. What are Medical Devices? An item for treating or diagnosing a health condition whose intended use is not achieved primarily by chemical or biological action within the body (Section 201(h) of the Federal Food Drug & Cosmetic (FD&C) Act). Definition by exclusion: Simply put, a medical device is any medical item for use in humans that is not a drug nor a biological product.

  5. Example of Medical Devices • Cardiovascular Devices: pacemakers, defibrillators, heart valves, coronary stents, artificial hearts • Monitoring Devices: glucometers, bone densitometers • Diagnostic Devices: diagnostic test kits for HIV, prostate-specific antigen (PSA) test, human papillomavirus (HPV) test • Relatively Simple Devices: tongue depressors, thermometers, latex gloves, simple surgical instruments • Ophthalmic Devices: intraocular lenses, PRK lasers • Radiological Devices: MRI machines, CT scanners, digital mammography, computer aided detection

  6. Example of Medical Devices • Dental, Ear, Nose, and Throat Devices: hearing aids, bronchoscopy system • General, Surgical, and Restorative Devices: breast implants, artificial hips, spinal fixation devices, artificial skin • Emerging technologies: multiplex genetic tests (e.g., for multiple mutations or microbes), genomic and proteomic Dx tests, nanotechnological devices, microspheres for molecular treatment of cancer, robotics, theranostics (predictive biomarkers of response or adverse reaction to therapy), artificial pancreas

  7. Example of Medical Devices Due to the wide variety in technology, complexity, and intended use, medical devices can present novel statistical design and analysis challenges.

  8. Device Regulation Decision to approve a PMA application must “rely upon valid scientific evidence to determine whether there is reasonable assurance that the device is safe and effective”. “Valid scientific evidence is evidence from well controlled studies, partially controlled studies and objective trials without matched controls, well documented case histories conducted by qualified experts that there is a reasonable assurance of safety and effectiveness . . .” U.S. Code of Federal Regulations, Title 21 (Food and Drugs), U.S. Government Printing Office, Washington DC, 2001, Part 860.7 Web address http://www.access.gpo.gov/nara/cfr/waisidx_01/21cfr860_01.html (Accessed February, 2002)

  9. Device Regulation Least Burdensome Provisions of FDA Modernization Act (1997) “Secretary shall only request information that is necessary to making substantial equivalence determinations.” “Secretary shall consider, …, the least burdensome appropriate means of evaluating device effectiveness that would have a reasonable likelihood of resulting in approval.” U.S. Code of Federal Regulations, Title 21 (Food and Drugs), U.S. Government Printing Office, Washington DC, 2001, Part 513(i)(1)(D) and 513(a)(3)(D)(ii). Web address http://www.access.gpo.gov/nara/cfr/waisidx_01/21cfr860_01.html

  10. FDA Least Burdensome Guidance FDA Guidance: The Least Burdensome Provisions of the FDA Modernization Act of 1997: Concept and Principles (2002) “Modern statistical methods may also play an important role in achieving a least burdensome path to market. For example, through the use of Baysian [sic] analyses, studies can be combined in order to help reduce the sample size needed for the experimental and/or control device.”

  11. Examples of Less Burdensome • Non-U.S. data • Surrogate endpoints (e.g., acute follow-up) • Interim analysis, adaptive design • Bayesian methods (e.g., to reduce sample size)† • Propensity scores for historical controls • Sensitivity analysis for missing data. Note: could trade clinical for statistical burden. †FDA Draft Guidance for the Use of Bayesian Statistics in Medical Device (released May 23, 2006) www.fda.gov/cdrh/osb/guidance/1601.html

  12. Least Burdensome Provision • Least burdensome provision in FDAMA of 1997 is directed to both medical devices and diagnostics (including biomarkers).

  13. Device Risk Classification Class I: Devices for which “general controls” provide reasonable assurance of safety and effectiveness. Class II: “General controls” are insufficient; can establish “special controls” (performance standards [CLIA, ISO], FDA guidance). May require clinical data in a 510(k). Class III: General and special controls are insufficient. Life-sustaining/supporting, of substantial importance in preventing impairment of human health, or presenting a potential unreasonable risk of illness or injury. Needs pre-market approval (PMA).

  14. Post-Market Transformation • “Make postmarket data more widely available to Center staff and supplement search and reporting tools” • “Investigate the use of data and text mining techniques to identify the ‘needles in the haystack’ by identifying patterns in the incoming data that equate to public health signals.” • Example is WebVDME Bayesian data-mining • Design a pilot project to test the usefulness of quantitative decision-making methods for medical device regulation across the total product life cycle http://www.fda.gov/cdrh/postmarket/mdpi-report-1106.html

  15. Types of Biomarkers • Diagnostic • Early detection (screening), enabling intervention at an earlier and potentially more curable stage than under usual clinical diagnostic conditions • Monitoring of disease response during therapy, with potential for adjusting level of intervention (e.g., dose) on a dynamic and personal basis • Risk assessment leading to preventive interventions for those at sufficient risk • Prognosis, allowing for more aggressive therapy for patients with poorer prognosis • Prediction of safety or efficacy (response) of a therapy, thereby providing guidance in choice of therapy

  16. Types of Biomarkers • Diagnostic • Early Detection (screening) • Monitoring • Risk Assessment • Prognostic • Predictive of Safety or Efficacy The first three are considered together, where the focus is on identifying the disease or condition.

  17. Types of Biomarkers • Diagnostic • Early Detection (screening) • Monitoring • Risk Assessment • Prognostic • Predictive of Safety or Efficacy The last three are attempting to predict the future.

  18. Analytical Validation How well are you measuring the measurand? • Precision / Reproducibility • Method Comparison • LoB, LoD, LoQ • Linearity • Stability Clinical Laboratory Standards Institute (CLSI) (http://www.nccls.org/)

  19. Clinical Validation (“Qualification”) • Does the test have clinical utility? • Does it have added value over standard tests (e.g., clinical covariates like age, tumor size, stage)? • May or may not require a clinical study • EX. Roche AmpliChip • CDRH guidance document: “Statistical Guidance on Reporting Results from Studies Evaluating Diagnostic Tests,” issued in final form in March 2007, concerns reporting agreement when there is no perfect standard, and also discrepancy resolution. http://www.fda.gov/cdrh/osb/guidance/1620.html

  20. Roche AmpliChip CYP450 Test (CDRH de novo 510(k) K042259) Genotypes two cytochrome P450 genes (29 polymorphisms in CYP2D6 gene, 2 in CYP2C19) to provide the predictive phenotype of the metabolic rate for a class of therapeutics metabolized primarily by CYP2D6 or CYP2C19 gene products. The phenotypes are (1) Poor metabolizers, (2) Intermediate metabolizers, (3) Extensive metabolizers, (4) Ultrarapid metabolizers. Cytochrome P450s are a large multi-gene family of enzymes found in the liver, and are linked to the metabolism of approximately 70-80% of all drugs. Among them, the polymorphic CYP2D6 and CYP2C19 genes are responsible for approximately 25% of all CYP450-mediated drug metabolism. A polymorphism in these enzymes can lead to an excessive or prolonged therapeutic effect or drug-related toxicity after a typical dose by failing to clear a drug from the blood or by changing the pattern of metabolism to produce toxic metabolites. http://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfPMN/pmn.cfm

  21. Adding Value to Standard Clinical Predictors • Head to Head: Marker superior to clinical predictors at predicting outcome. • Incremental Improvement: Combination superior to clinical predictors alone. • Marker Predictive within Clinical Strata: e.g., HR(+, –) significant within age, tumor grade, tumor size groups.

  22. Multivariate Index Assays • An IVDMIA is a device that: • Combines the values of multiple variables using an interpretation function to yield a single, patient-specific result (e.g., a “classification,” “score,” “index,” etc.), that is intended for use in the diagnosis of disease or other conditions, or in the cure, mitigation, treatment or prevention of disease, and • Provides a result whose derivation is non-transparent and cannot be independently derived or verified by the end user. An MIA result could be binary (dichotomous, such as yes or no), categorical (such as disease type), ordinal (such as low, medium, high), or on a continuous scale. • Source: FDA MIA Draft Guidance http://www.fda.gov/cdrh/oivd/guidance/1610.html

  23. Typical Endpoints for Prognostic or Predictive Biomarkers • Time to Event • Event by Time t

  24. Relative Risk vs. Diagnostic Accuracy† [Table: Event by Time t, by Marker] Relative risk looks good, but Dx accuracy is not great → limited clinical utility? †Example taken from Emir, Wieand, Su, Cha, Analysis of repeated markers used to predict progression of cancer, Statist. Med., 17, 2563-78, 1998.

  25. Hazard Ratio vs. Diagnostic Accuracy† • NCCTG Mayo Clinic Study. CA15-3 ratio as diagnostic for progression of breast cancer (as determined by physical exam). †Example taken from Emir, Wieand, Su, Cha, Analysis of repeated markers used to predict progression of cancer Statist. Med., 17, 2563-78, 1998.

  26. Diagnostic Performance • Sensitivity (TP rate): fraction of responders who test + • Specificity (TN rate): fraction of non-responders who test – • FP rate: fraction of non-responders who test + Test is useful if TP rate > FP rate, i.e., sensitivity + specificity > 1. EX. Useless test: sensitivity 0.80, specificity 0.20
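The usefulness rule on this slide can be checked numerically. The sketch below is only an illustration: the function name `is_informative` and the small tolerance are assumptions introduced here, and the second set of values is a hypothetical contrast case, not data from the talk.

```python
def is_informative(sensitivity: float, specificity: float, tol: float = 1e-12) -> bool:
    """Return True when TP rate exceeds FP rate, i.e. sensitivity + specificity > 1.

    The quantity sensitivity + specificity - 1 is often called Youden's J.
    The tolerance `tol` is an assumption added to guard against rounding noise.
    """
    youden_j = sensitivity + specificity - 1.0
    return youden_j > tol


# The slide's "useless test": TP rate 0.80 equals FP rate 1 - 0.20 = 0.80.
useless = is_informative(0.80, 0.20)

# Hypothetical contrast case (illustrative values only).
useful = is_informative(0.80, 0.70)

print(useless, useful)
```

A test with sensitivity + specificity equal to 1 calls responders and non-responders positive at the same rate, so the result carries no information about response.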

  27. Diagnostic Performance • Positive predictive value (PPV): fraction of test +’s who respond • Negative predictive value (NPV): fraction of test –’s who don’t respond • 1 – NPV: fraction of test –’s who respond Test is useful if PPV + NPV > 1. EX. Useless test: PPV 0.60, NPV 0.40
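Predictive values follow from sensitivity, specificity, and the response rate by Bayes' theorem. The sketch below assumes a response prevalence of 0.60, a value that is not stated on the slides; it was chosen here because it makes the useless test from the previous slide (sensitivity 0.80, specificity 0.20) reproduce this slide's PPV 0.60 and NPV 0.40. The function name is hypothetical.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Compute (PPV, NPV) from sensitivity, specificity and prevalence via Bayes' theorem."""
    true_pos = sensitivity * prevalence                   # responders who test +
    false_pos = (1.0 - specificity) * (1.0 - prevalence)  # non-responders who test +
    true_neg = specificity * (1.0 - prevalence)           # non-responders who test -
    false_neg = (1.0 - sensitivity) * prevalence          # responders who test -
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Assumed prevalence of response: 0.60 (not given on the slides).
ppv, npv = predictive_values(0.80, 0.20, 0.60)
print(round(ppv, 2), round(npv, 2), round(ppv + npv, 2))
```

Under this assumption the PPV equals the prevalence, so a positive result leaves the probability of response unchanged, which is what PPV + NPV = 1 expresses.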

  28. An ROC curve is a plot of sensitivity (true positive rate) vs. 1 – specificity (false positive rate) over all possible cutoff points for the test. The test is informative if the area under the curve is greater than 0.5.
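As a minimal sketch of how such a curve and its area can be computed, the code below sweeps the cutoff over every observed marker value and integrates with the trapezoid rule. The marker values and outcomes are made up for illustration, and the function names are hypothetical.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs from sweeping the cutoff over all observed scores.

    A subject tests positive when score >= cutoff; labels are 1 (event) or 0 (no event).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # cutoff above every score: nobody tests positive
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auc_trapezoid(points):
    """Area under the piecewise-linear ROC curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical marker values and outcomes (not from the talk).
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

points = roc_points(scores, labels)
auc = auc_trapezoid(points)
print(auc)
```

With no tied scores, this area equals the fraction of (event, non-event) pairs in which the event subject has the higher marker value.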

  29. Prognostic Biomarker (Strong Def’n) Prognostic factor. Informs about an outcome independent of specific treatment (ability of tumor to proliferate, invade, and/or spread). Prognostic biomarker is associated with likelihood of an outcome (e.g., survival, response, recurrence) such that magnitude of association is independent of treatment. On some scale, treatment and biomarker effects are additive, that is, do not interact.

  30. [Figure: two panels] HR(A,B)=0.67; HR(A,B)=0.67

  31. Prognostic Biomarker (Weak Def’n) Prognostic factor. Informs about an outcome independent of specific treatment (ability of tumor to proliferate, invade, and/or spread). Prognostic biomarker is associated with likelihood of an outcome (e.g., survival, response, recurrence) in a population that is untreated or on a “standard” (non-targeted) treatment. If the population is clearly defined, then it can be used to choose more or less aggressive therapy, but not specific therapies, per se.

  32. [Figure: two panels] HR(A,B)=0.67; HR(A,B)=0.67

  33. Prognostic Biomarker • Her2-neu for node-negative women with breast cancer – prognostic for recurrence • Breast cancer prognostic test based on microarray gene expression of RNAs extracted from breast tumor tissue, to assess the risk of distant metastasis in women younger than 61 with Stage I or II disease, tumor size less than or equal to 5.0 cm, and negative lymph nodes. (Ref.: Buyse et al. JNCI 98, 1183-1192)

  34. Agendia Mammaprint Gene Signature for Time to Distant Metastasis (N=302) • 5-year: Low risk group: 0.95 (0.91-0.99); High risk group: 0.78 (0.72-0.84) • 10-year: Low risk group: 0.90 (0.85-0.96); High risk group: 0.71 (0.65-0.78) Buyse et al JNCI (2006), 98, 1183-1192

  35. Proportion alive at 10 years
      Clinical   | Gene Signature | N   | Proportion*         | Note
      Low Risk   | Low Risk       | 52  | 0.88 (0.74 to 0.95) | Sp
      Low Risk   | High Risk      | 28  | 0.69 (0.45 to 0.84) | 1–Se
      High Risk  | Low Risk       | 59  | 0.89 (0.77 to 0.95) | Sp
      High Risk  | High Risk      | 163 | 0.69 (0.61 to 0.76) | 1–Se
      *Buyse et al JNCI 2006

  36. Predictive Biomarker Predictive factor. Implies relative sensitivity or resistance to specific treatments or agents. Predictive biomarker predicts differential effect of treatment on outcome. Treatment and biomarker interact. Predictive biomarker can be useful for selecting specific therapy.

  37. [Figure: two panels] HR(A,B)=0.5; HR(A,B)=1.0

  38. Predictive Biomarker of Efficacy Marker: HER2/neu. Treatment: Trastuzumab (Herceptin). Objective response rate:
             | Herceptin+Chemo | Chemo
      FISH+  | 95/176 (54%)    | 51/168 (30%)
      FISH-  | 19/50 (38%)     | 22/57 (39%)
      Arch. Pathol. Lab Med Jan 2007 (ASCO/CAP Guidelines)
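The response counts on this slide can be used to illustrate what a treatment-by-marker interaction looks like numerically. The sketch below compares the treatment effect (difference in response rates) between FISH+ and FISH- patients with a Wald z-test. The choice of the risk-difference scale and of a Wald test is an assumption made here for illustration; the slides do not say how the interaction was assessed, and the function name is hypothetical.

```python
import math


def risk_difference(resp_trt, n_trt, resp_ctl, n_ctl):
    """Return (difference in response rates, its estimated variance)."""
    p_trt = resp_trt / n_trt
    p_ctl = resp_ctl / n_ctl
    diff = p_trt - p_ctl
    var = p_trt * (1 - p_trt) / n_trt + p_ctl * (1 - p_ctl) / n_ctl
    return diff, var


# Counts from the slide: Herceptin+Chemo vs. Chemo, by FISH status.
diff_pos, var_pos = risk_difference(95, 176, 51, 168)  # FISH+
diff_neg, var_neg = risk_difference(19, 50, 22, 57)    # FISH-

# Interaction on the risk-difference scale: effect in FISH+ minus effect in FISH-.
interaction = diff_pos - diff_neg
z = interaction / math.sqrt(var_pos + var_neg)

# Two-sided p-value under a standard normal reference distribution.
p_value = math.erfc(abs(z) / math.sqrt(2))

print(f"effect FISH+ = {diff_pos:.3f}, effect FISH- = {diff_neg:.3f}")
print(f"interaction = {interaction:.3f}, z = {z:.2f}, p = {p_value:.3f}")
```

The treatment effect is concentrated in the FISH+ stratum and close to zero in the FISH- stratum, which is the pattern the slide describes as a predictive marker. Whether an interaction appears can depend on the scale chosen (risk difference, odds ratio, hazard ratio).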

  39. Predictive Biomarkers for Safety • Predict risk of an adverse event dependent on the biomarker • Example • UGT1A1, cleared by FDA, to predict the risk of neutropenia in patients taking irinotecan for colorectal cancer

  40. Prospective Study Designs for Predictive Markers • Untargeted Design (Reference) Validate Treatment, Marker Simultaneously • Marker by Treatment Design • Targeted Design (Marker + Subset Only) • Marker Strategy Design • Historical Control

  41. Untargeted Design (Reference) • Test if drug works in entire population. • Mixture of marker + and – drug effects. • Can store samples if test is not ready.

  42. Marker by Treatment (Interaction) Design • A Randomized Block Design • Can test for biomarker by treatment interaction (predictive biomarker) • Test needs to be available before trial ensues.

  43. Marker by Treatment Design Questions • Test Drug Overall and within Marker + Subset • 0.04, 0.01 tests suggested to control Type I error rate at 0.05 (Simon), but subset could drive overall result. • Frequentist multiplicity penalty may preclude subset testing as good business strategy. • Statement about drug, not biomarker • Test Marker Overall and within Drug Subset • Statement about marker, not drug. • Test for Treatment by Marker Interaction • Simultaneously validates drug and marker.
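The 0.04 / 0.01 split mentioned above can be read as a Bonferroni-type allocation of the overall 0.05 Type I error between the overall test and the marker + subset test. The sketch below only illustrates that decision rule; the function name and the example p-values are hypothetical.

```python
def split_alpha_claims(p_overall: float, p_subset: float,
                       alpha_overall: float = 0.04, alpha_subset: float = 0.01):
    """Return the list of claims supported under a split-alpha rule.

    Because alpha_overall + alpha_subset = 0.05, the chance of at least one
    false claim is at most 0.05 by the Bonferroni inequality.
    """
    claims = []
    if p_overall <= alpha_overall:
        claims.append("overall population")
    if p_subset <= alpha_subset:
        claims.append("marker + subset")
    return claims


# Hypothetical p-values for illustration only.
print(split_alpha_claims(p_overall=0.03, p_subset=0.02))
print(split_alpha_claims(p_overall=0.06, p_subset=0.004))
```

In the second call the subset claim succeeds although the overall test fails, while in the first call a subset p-value of 0.02 is not small enough, which illustrates the multiplicity penalty the slide mentions.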

  44. Targeted Design Test if drug works in subset. Cannot test if marker discriminates. Only PPV available.

  45. Efficiency of Designs Efficiency gain depends on marker prevalence, relative efficacy, and difference tested. * Marker – to Marker + Patients †Simon & Maitournam, CCR 2004 †† Marker by Treatment Design: Test for Interaction approx. efficiency when enriching with half +’s, half –’s.

  46. Efficiency of Designs Efficiency gain depends on marker prevalence, relative efficacy, and difference tested. * Marker – to Marker + Patients †Simon & Maitournam, CCR 2004 †† Marker by Treatment Design: Test for Interaction approx. efficiency when enriching with half +’s, half –’s.

  47. Efficiency of Designs Efficiency gain depends on marker prevalence, relative efficacy, and difference tested. * Marker – to Marker + Patients †Simon & Maitournam, CCR 2004 †† Marker by Treatment Design: Test for Interaction approx. efficiency when enriching with half +’s, half –’s.
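The dependence of efficiency on marker prevalence and relative efficacy can be sketched with a simple dilution argument. Assuming equal outcome variance in marker + and marker – patients, a normal approximation, and the same significance level and power in both designs, the required sample size scales with 1 / (effect size)^2, and the untargeted design tests a prevalence-weighted average effect. The function below is an approximation under those assumptions, not a reproduction of the tables on these slides; its name and the input values are hypothetical.

```python
def untargeted_to_targeted_ratio(prevalence_pos: float,
                                 effect_pos: float,
                                 effect_neg: float) -> float:
    """Approximate ratio of patients randomized: untargeted design / targeted design.

    prevalence_pos: Pr(marker +) in the untargeted population.
    effect_pos, effect_neg: treatment effects in marker + and marker - patients.
    """
    # Overall effect in an untargeted trial is a prevalence-weighted mixture.
    diluted_effect = prevalence_pos * effect_pos + (1.0 - prevalence_pos) * effect_neg
    return (effect_pos / diluted_effect) ** 2


# Drug works only in marker + patients, who make up 25% of the population.
ratio_no_effect_in_neg = untargeted_to_targeted_ratio(0.25, 1.0, 0.0)

# Drug is half as effective in marker - patients.
ratio_half_effect_in_neg = untargeted_to_targeted_ratio(0.25, 1.0, 0.5)

print(ratio_no_effect_in_neg, ratio_half_effect_in_neg)
```

The ratio counts patients randomized. The targeted design must also screen roughly 1 / Pr(+) patients for each one randomized, so the gain in patients screened is smaller.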

  48. Improving Efficiency of Interaction Design • Enrich with Test Positives if Pr(+) is low • Find scale such that marker and treatment effects are additive • Adaptive Randomization • Bayesian subset analysis • If reader variability (e.g., IHC), then use multiple readers. • Prior Information

  49. Possibilities for Increasing Efficiency of Interaction Design • Enrich with Test Positives if Pr(+) is low • Estimates of Sensitivity and Specificity are biased because they depend on Pr(+). • Use inverse probability weighting (Horvitz, Thompson, 1952) or Bayes Theorem (Begg, Greenes, 1983) to obtain unbiased estimates.
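The bias and its correction can be illustrated with a small sketch of the Bayes-theorem approach. PPV and NPV are estimated within test-result groups and are not distorted by enrichment, so they can be recombined with the population value of Pr(+) to recover sensitivity and specificity. All counts and the population Pr(+) below are hypothetical, and the function names are assumptions made for illustration.

```python
def naive_se_sp(n_pos, resp_pos, n_neg, resp_neg):
    """Sensitivity and specificity computed directly from the enriched sample."""
    se = resp_pos / (resp_pos + resp_neg)
    sp = (n_neg - resp_neg) / ((n_neg - resp_neg) + (n_pos - resp_pos))
    return se, sp


def corrected_se_sp(n_pos, resp_pos, n_neg, resp_neg, pop_prob_pos):
    """Recover sensitivity and specificity with Bayes' theorem.

    pop_prob_pos is Pr(test +) in the intended-use population, which must be
    known or estimated from a source other than the enriched sample.
    """
    ppv = resp_pos / n_pos                 # Pr(respond | test +)
    one_minus_npv = resp_neg / n_neg       # Pr(respond | test -)
    p, q = pop_prob_pos, 1.0 - pop_prob_pos
    se = ppv * p / (ppv * p + one_minus_npv * q)
    sp = (1.0 - one_minus_npv) * q / ((1.0 - one_minus_npv) * q + (1.0 - ppv) * p)
    return se, sp


# Hypothetical enriched study: 100 test + (60 respond), 100 test - (20 respond),
# drawn from a population in which only 10% test positive.
naive = naive_se_sp(100, 60, 100, 20)
corrected = corrected_se_sp(100, 60, 100, 20, pop_prob_pos=0.10)

print("naive     Se, Sp:", naive)
print("corrected Se, Sp:", corrected)
```

In this example enrichment inflates the naive sensitivity (0.75 versus a corrected 0.25) and deflates the naive specificity. Weighting each subject by the inverse of the sampling probability for their test-result group gives the same point estimates.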
