This presentation addresses the statistical complexities faced in analyzing the BCG vaccine efficacy from a cluster randomized trial perspective. It emphasizes the need for unbiased measurement of vaccine efficacy while considering potential confounding variables, intra-cluster correlation, and low incidence of tuberculosis. The objectives include estimating vaccine efficacy with a 95% confidence interval and identifying effect modifiers. Key analytical challenges, methodological solutions, and subgroup analyses pertaining to trial design and statistical adjustments are thoroughly discussed, providing valuable insights for epidemiologists and biostatisticians.
BCG REVAC - Cluster Randomization Trial Data Analysis – Statistical Issues
Bernd Genser, PhD
Instituto de Saúde Coletiva, Universidade Federal da Bahia, Salvador
Email: bernd.genser@bgstats.com
Slides available at: www.bgstats.com/port/links/downloads
Seminário ABRASCO - Métodos em Epidemiologia: ESTUDOS DE COORTE, Rio de Janeiro, 01-AUG-2005
The BCG trial from a statistician's point of view
Main objective: Estimate an unbiased, consistent measure of vaccine efficacy (VE), including a 95% CI, for a BCG dose given to school children in a population with high coverage of neonatal BCG vaccination
Secondary objective: Identify effect modifiers (city, BCG scar, …)
Issues to be addressed in the statistical analysis
1) Potential confounding and effect modification
- Trial design: complex multi-level covariate structure
- Adjusting/controlling for confounding by fixed and time-varying (e.g. age) TB predictors
- Heterogeneity of VE across covariate strata expected
2) Cluster randomization: adjusting the estimates for potential intra-cluster correlation
3) Expected low incidence of TB: more clusters than cases expected => traditional statistical methods for CRTs could not be applied
Analytical Solutions for the BCG trial
• Issue 1: Dealing with potential confounding variables
• Controlled by study design (stratification/randomization):
- Allocation groups were highly balanced in confounding variables => no statistical adjustment required for these covariates
- Matching by size of school additionally accounts for the effect of "cluster size"
• Adjusted in the statistical analysis:
- TB incidence is well known to depend strongly on age => age modeled as a time-varying variable
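One common way to treat age as a time-varying variable is to split each child's follow-up into age bands, so that every piece of person-time carries the age at which it was accrued. The sketch below is only an illustration of that idea: the function name, the band boundaries and the example child are assumptions, not taken from the trial's analysis code.

```python
def split_follow_up_by_age(age_at_entry, follow_up_years, band_edges):
    """Split one child's follow-up into person-years per age band.

    band_edges are ascending ages; band i covers [band_edges[i], band_edges[i + 1]).
    Returns a list of ((band_start, band_end), person_years) tuples.
    """
    age_at_exit = age_at_entry + follow_up_years
    pieces = []
    for start, end in zip(band_edges[:-1], band_edges[1:]):
        # Overlap between the follow-up interval and this age band
        overlap = min(age_at_exit, end) - max(age_at_entry, start)
        if overlap > 0:
            pieces.append(((start, end), overlap))
    return pieces


# Hypothetical child: enters at age 8.5 and is followed for 3 years
for band, years in split_follow_up_by_age(8.5, 3.0, [7, 9, 11, 13, 15]):
    print(band, years)
```

Each piece would then enter a Poisson regression as its own record, with the age band as a covariate and the log of the person-years as the offset.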
Dealing with covariates in the BCG trial
[Diagram: each covariate is labelled with how it is handled in the design (stratification, randomization or matching) and in the analysis (subgroup analysis or statistical adjustment)]
Evaluation of the random allocation procedure

                                total population                             recruited study children
covariate                       intervention   control        total          intervention   control        total
                                n=180655       n=173751       n=354406       n=124340       n=115594       n=239934
individual level
city (% children in Salvador)   58.6%          53.0%          55.9%          58.8%          49.0%          54.1%
age (mean, sd)                  11.53 (2.16)   11.46 (2.17)   11.51 (2.16)   11.53 (2.08)   11.44 (2.10)   11.48 (2.09)
age group
  < 7                           0.0%           0.3%           0.2%           0.0%           0.3%           0.1%
  7-8                           15.6%          16.3%          16.0%          14.3%          15.6%          14.9%
  9-10                          24.9%          24.6%          24.7%          25.5%          25.3%          25.4%
  11-12                         29.2%          29.0%          29.1%          31.0%          30.6%          30.8%
  13-14                         28.0%          28.3%          28.1%          28.1%          28.1%          28.1%
  > 14                          2.3%           1.5%           1.9%           1.1%           0.1%           0.6%
gender (% males)                49.5%          49.5%          49.5%          48.2%          48.5%          48.4%
BCG scar reading
  total                         76.0%          72.5%          74.3%          100.0%         100.0%         100%
  after excl. because of age    76.3%          72.5%          74.2%
BCG scar count
  no scar                       11.7%          10.8%          11.3%          16.6%          16.0%          16.3%
  one scar                      58.4%          56.6%          57.5%          83.4%          84.0%          83.7%
  two scars                     4.3%           3.9%           4.1%           0%             0%             0%
  no data                       25.6%          28.7%          27.1%          0%             0%             0%
vaccination (% vaccinated)      66.7%          0.5%           34.2%          94.6%          0.7%           49%
Evaluation of the random allocation procedure

                                total population                             recruited study children
covariate                       intervention   control        total          intervention   control        total
                                n=180655       n=173751       n=354406       n=124340       n=115594       n=239934
cluster level
schools (count)                 388 (50.8%)    375 (49.2%)    763 (100%)     386 (51.3%)    365 (48.7%)    751 (100%)
cluster size
  mean (sd)                     465 (325)      463 (290)      464 (308)      322 (233)      317 (215)      319 (224)
  min, max                      26; 2368       36; 1764       26; 2368       11; 1430       10; 1334       10; 1430
gender (% males)
  mean (sd)                     49.7% (7.0%)   50.1% (5.4%)   49.9% (6.3%)   48.4% (7.4%)   49.3% (5.8%)   48.8% (6.7%)
  min, max                      0%; 84.7%      35.2%; 99.1%   0%; 99.1%      0%; 83.3%      35%; 98.8%     0%; 98.8%
Scar Read. (% yes)
  mean (sd)                     75.1% (12.2%)  71.0% (17.0%)  73.1% (14.9%)  100%           100%           100%
  min, max                      0%; 95.7%      0%; 95.1%      0%; 95.7%
Scar Count (% 0 or 1)
  mean (sd)                     69.2% (12.2%)  65.8% (16.3%)  67.5% (14.4%)  100%           100%           100%
  min, max                      0%; 90.3%      0%; 91.3%      0%; 91.3%
data available for Salvador only:
soc. econ. cond.
  0-25                          2.5%           3.1%           2.8%
  26-50                         10.0%          11.1%          10.5%
  51-75                         30.3%          29.6%          30.0%
  76-HI                         57.3%          56.2%          56.7%
data available for Manaus only:
inc. of TB
  mean (sd)                     121.6 (91.3)   126.3 (74.8)   123.5 (84.8)
  min, max                      14.5; 618.0    14.5; 618      14.5; 618
inc. of leprosy
  mean (sd)                     8.8 (9.8)      7.8 (7.2)      8.4 (8.8)
  min, max                      0.3; 66.9      0; 66.9        0; 66.9
Analytical Solutions for the BCG trial (2)
• Issue 2: Dealing with effect modification
• Subgroup analyses conducted by:
- No. of BCG scars (first or second dose)
- City (Salvador and Manaus)
- Clinical form/certainty level
• Strong evidence of effect heterogeneity found:
- We decided to analyze children with 1 and 0 scars separately: first-dose and second-dose effects are completely different scientific questions => no interaction model fitted!
- All analyses were presented overall and by city and clinical form
Analytical Solutions for the BCG trial (3)
• Issue 3: Adjusting the estimates for the "design effect"
• Statistical problem: between-cluster variation (= intra-cluster correlation), induced by an unexplained dependence structure between children from the same school, usually caused by common unknown/unobserved risk factors
=> Consequence: standard statistical approaches can substantially underestimate the true variance of the effect estimators (overdispersion), so confidence intervals are too narrow!
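For clusters of roughly equal size m, the variance inflation is usually summarized by the design effect 1 + (m - 1) × ICC. The sketch below shows how quickly even a tiny ICC matters when clusters are as large as schools. The function names and the ICC value are illustrative assumptions; 319 is the mean cluster size among recruited children reported in the allocation table.

```python
import math


def design_effect(mean_cluster_size, icc):
    """Design effect 1 + (m - 1) * ICC for clusters of (roughly) equal size m."""
    return 1.0 + (mean_cluster_size - 1.0) * icc


def inflate_se(naive_se, mean_cluster_size, icc):
    """Standard error after multiplying the naive variance by the design effect."""
    return naive_se * math.sqrt(design_effect(mean_cluster_size, icc))


# Hypothetical ICC of 0.001 with 319 children per school
print(design_effect(319, 0.001))
print(inflate_se(0.12, 319, 0.001))
```

With an ICC of zero the design effect is 1 and the naive and adjusted standard errors coincide.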
Analytical Solutions for the BCG trial (4)
• Statistical approaches to deal with ICC:
• For binary or quantitative outcomes: direct adjustment of confidence intervals is possible by estimating the intracluster (intraclass) correlation (ICC)
• For count outcomes (Poisson-distributed data):
- Explicit estimation of the ICC is not possible!
- Examine the magnitude of the design effect by comparing unadjusted and adjusted CIs
- Novel univariate approaches directly adjust the CIs and P-values for the clustering
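For a binary outcome, one standard choice is the ANOVA estimator of the ICC, built from the between-cluster and within-cluster mean squares. The sketch below assumes that estimator; the slides do not say which ICC estimator was intended, and the function name and input layout are illustrative.

```python
def anova_icc_binary(clusters):
    """ANOVA estimator of the ICC for a binary outcome.

    clusters: list of (events, size) pairs, one per cluster.
    """
    k = len(clusters)
    n_total = sum(size for _, size in clusters)
    p_overall = sum(events for events, _ in clusters) / n_total

    # Between-cluster mean square: cluster proportions around the overall proportion
    msb = sum(size * (events / size - p_overall) ** 2
              for events, size in clusters) / (k - 1)
    # Within-cluster mean square: sum of m_i * p_i * (1 - p_i) over clusters
    msw = sum(events * (1 - events / size)
              for events, size in clusters) / (n_total - k)

    # "Average" cluster size, adjusted for unequal cluster sizes
    m0 = (n_total - sum(size ** 2 for _, size in clusters) / n_total) / (k - 1)
    return (msb - msw) / (msb + (m0 - 1) * msw)


# Hypothetical clusters: (events, cluster size)
print(anova_icc_binary([(3, 40), (1, 35), (6, 50), (0, 25)]))
```

The estimate can be slightly negative when clusters are more alike than chance would predict; in practice such values are often truncated at zero.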
Analytical Solutions for the BCG trial (5)
Two basic approaches for CRTs with Poisson data:
A) Analyses at the cluster level: "cluster summary statistic", meta-analysis techniques: not recommended in our trial because of the very low cluster-specific incidence, i.e. more clusters than cases!
B) Analyses at the individual level: new approach for univariate analysis: ratio estimator approach for overdispersed Poisson data (Rao & Scott, Stat Med 1999, implemented in the software ACLUSTER): direct adjustment of confidence intervals using a robust variance estimator
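The idea behind a ratio estimator can be sketched as follows: each arm's rate is total cases over total person-time, and its variance is taken from the between-school scatter of the residuals d_i - r × t_i rather than from the Poisson assumption. This is a generic linearization sketch in that spirit, not the Rao & Scott procedure or the ACLUSTER implementation; all names and the example data are hypothetical.

```python
import math


def rate_and_log_variance(clusters):
    """Ratio estimate of an arm's rate and the robust variance of its log.

    clusters: list of (cases, person_years) pairs, one per school.
    """
    k = len(clusters)
    total_cases = sum(d for d, _ in clusters)
    total_time = sum(t for _, t in clusters)
    rate = total_cases / total_time
    # Between-cluster variability of the residuals d_i - rate * t_i
    resid_ss = sum((d - rate * t) ** 2 for d, t in clusters)
    var_rate = k / (k - 1) * resid_ss / total_time ** 2
    # Delta method: Var(log r) is approximately Var(r) / r^2
    return rate, var_rate / rate ** 2


def rate_ratio_ci(intervention, control, z=1.96):
    """Rate ratio with a cluster-robust confidence interval."""
    r1, v1 = rate_and_log_variance(intervention)
    r0, v0 = rate_and_log_variance(control)
    log_rr = math.log(r1 / r0)
    se = math.sqrt(v1 + v0)
    return math.exp(log_rr), math.exp(log_rr - z * se), math.exp(log_rr + z * se)


# Hypothetical schools: (cases, person-years)
intervention = [(0, 900.0), (1, 1200.0), (0, 700.0), (2, 1500.0)]
control = [(1, 800.0), (0, 1000.0), (2, 1300.0), (1, 900.0)]
rr, lower, upper = rate_ratio_ci(intervention, control)
print(f"RR = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f}), VE = {100 * (1 - rr):.0f}%")
```

Because the calculation only needs school-level totals, it stays usable when most schools have zero cases, which is the situation that rules out cluster-summary methods here.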
Analytical Solutions for the BCG trial (6)
• Multivariate modeling: Poisson regression
- Basic assumption: constant rate over the follow-up time
- Can be relaxed by including time-varying variables (e.g. age)
• Extensions for clustered data:
• Parametric random effects or multi-level modelling:
- Intra-cluster correlation modeled by a cluster-specific random effect
- Disadvantage: strong distributional assumptions!
=> Random effects models not recommended for this trial:
- violation of distributional assumptions
- convergence problems
- large bias in the variance estimation of the random effect
• Better: semi-parametric approach based on Generalized Estimating Equations (GEE):
- calculates an adjusted variance estimator by an iterative algorithm
- assumes a "working correlation structure"
- Advantage: no distributional assumptions!
- Disadvantage: very computer-intensive for large datasets because of the computational complexity: time for the BCG data: 1 hour! (1000
Results of the Poisson regression models

                              VE (%)  95% CI       RR    95% CI        beta   95% CI
Model                         VE      lb     ub    RR    lb     ub     beta   ln_lb     ln_ub     SE(beta)  Wald    P-value
All cases, second dose
  Standard Poisson            9       -15    28    0.91  0.72   1.15   -0.09  -0.32850  0.13976   0.1195    -0.789  0.430
  GEE Poisson                 9       -16    29    0.91  0.71   1.16   -0.09  -0.33624  0.14762   0.1234    -0.764  0.445
Non-pulmonary, second dose
  Standard Poisson            37      -4     61    0.63  0.39   1.04   -0.46  -0.94161  0.03922   0.2502    -1.847  0.065
  GEE Poisson                 37      -3     61    0.63  0.39   1.03   -0.46  -0.94896  0.02489   0.2484    -1.860  0.063
Pulmonary, second dose
  Standard Poisson            -1      -32    13    1.01  0.87   1.32   0.01   -0.13926  0.27763   0.1064    0.094   1.000
  GEE Poisson                 -1      -24    18    1.01  0.82   1.24   0.01   -0.19657  0.21647   0.1054    0.094   1.000

Naive and robust variance estimates were very similar: no "design effect" observed
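The columns of the results table are linked by VE = 100 × (1 - RR), beta = ln(RR) and Wald = beta / SE(beta). The sketch below recomputes these quantities for the "all cases, second dose" standard Poisson row from its rate ratio and confidence limits. Recovering SE(beta) from the width of the log-scale interval and using a normal approximation for the P-value are assumptions of this sketch, and the function name is illustrative.

```python
import math


def summarize_effect(rr, rr_lb, rr_ub, z_crit=1.96):
    """Derive VE, beta, SE(beta), the Wald statistic and its two-sided P-value
    from a rate ratio and its 95% confidence limits."""
    beta = math.log(rr)
    # Half-width of the log-scale interval divided by the critical value
    se = (math.log(rr_ub) - math.log(rr_lb)) / (2 * z_crit)
    wald = beta / se
    p_value = math.erfc(abs(wald) / math.sqrt(2))  # two-sided normal P-value
    # VE and its limits in percent; the upper RR limit gives the lower VE limit
    ve = tuple(round(100 * (1 - x)) for x in (rr, rr_ub, rr_lb))
    return {"VE": ve, "beta": beta, "SE": se, "Wald": wald, "P": p_value}


# "All cases, second dose", standard Poisson row: RR 0.91 (0.72 to 1.15)
print(summarize_effect(0.91, 0.72, 1.15))
```

Running the same function on the GEE row gives a slightly wider interval on the log scale, which is how the table shows that clustering barely changed the variance.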
Statistical software for analysing/planning CRTs
• STATA 7/8/9, general-purpose statistical package, Stata Corporation, www.stata.com
- GLM with GEE, random effects or robust variance estimation to adjust for clustering
• STATA 9, MLwiN: multi-level models, www.multilevel.ioe.ac.uk
• ACLUSTER: software for the design and analysis of cluster randomized trials, www.update-software.com/acluster
- Easy computation of the intraclass correlation coefficient
- Direct adjustment approaches for univariate analysis
- Power analysis for the three types of cluster randomized study design
Literature
• Statistics in Medicine (2001); 20 (Special Issue): Design and Analysis of Cluster Randomized Trials
• Donner A, Klar N. Design and Analysis of Cluster Randomisation Trials (2000). Arnold Publications, London.