Phase II cancer studies are undertaken to assess the activity of a new drug or a new treatment regimen. Activity is sometimes defined in terms of a survival probability, a binary outcome such as one-year survival that is derived from a time-to-event variable. Phase II studies are usually designed with an interim analysis so they can be stopped if early results are disappointing. Most designs that allow for an interim look are not appropriate for monitoring survival probabilities since many patients will not have enough follow-up by the time of the interim analysis, thus necessitating an inconvenient suspension of accrual while patients are being followed.
Two-stage phase II clinical trial designs are developed for evaluating survival probabilities. These designs are compared to fixed sample designs and to existing designs developed to monitor binomial probabilities to illustrate the expected reduction in sample size or study length possible with the use of the proposed designs.
Savings can be realized in both the duration of accrual and the total study length, with the expected savings increasing as the accrual rate decreases. Misspecifying the underlying survival distribution and the accrual rate during the planning phase can adversely influence the operating characteristics of the designs.
Two-stage phase II trials for assessing survival probabilities can be designed that do not require prolonged suspension of patient accrual. These designs are more efficient than single stage designs and more practical than existing two-stage designs developed for binomial outcomes, particularly in trials with slow accrual.
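For orientation, the operating characteristics of a conventional two-stage design for a binomial outcome, the kind of design the abstract above uses as a comparator, can be computed exactly. The sketch below is not the survival-probability design proposed in the paper. The function names and the example cutoffs (r1 = 1 of n1 = 10, r = 5 of n = 29) are illustrative assumptions chosen for this example.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n Bernoulli(p) trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_stage_characteristics(r1, n1, r, n, p):
    """Exact operating characteristics of a two-stage binomial design.

    Assumed (hypothetical) decision rule for this sketch:
      - stop after stage 1 if successes among the first n1 patients <= r1;
      - otherwise enrol n - n1 more patients and declare the treatment
        inactive if total successes <= r.
    """
    n2 = n - n1
    # Probability of early termination after stage 1.
    pet = sum(binom_pmf(x1, n1, p) for x1 in range(r1 + 1))
    # Probability of continuing to stage 2 but ending with <= r successes.
    p_fail_at_end = 0.0
    for x1 in range(r1 + 1, min(r, n1) + 1):
        p_stage2 = sum(binom_pmf(x2, n2, p) for x2 in range(min(r - x1, n2) + 1))
        p_fail_at_end += binom_pmf(x1, n1, p) * p_stage2
    return {
        "pet": pet,
        "expected_n": n1 + (1 - pet) * n2,
        "p_declare_active": 1 - pet - p_fail_at_end,
    }


if __name__ == "__main__":
    for true_p in (0.10, 0.30):
        print(true_p, two_stage_characteristics(1, 10, 5, 29, true_p))
```

With these illustrative cutoffs and a true success probability of 0.10, the probability of early termination is about 0.74 and the expected sample size is about 15 patients. Note that the calculation assumes every stage 1 outcome is known at the interim look, which is the follow-up requirement the abstract identifies as the practical obstacle for survival endpoints.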
A clinical trial conducted according to a schedule of interim analyses written into the protocol, and stopped according to a predetermined rule, is known to statisticians as a sequential clinical trial. This methodology is becoming more widely used in trials concerning life-threatening diseases because of its ability to adjust the sample size to the emerging information on treatment efficacy. When treatments under comparison differ appreciably, small samples will be sufficient; for more subtle differences larger numbers of patients need to be recruited. Sequential methods have already been used in certain cancer clinical trials, and they are especially appropriate for such studies. In this paper the principles of sample size determination are reviewed, and the essential aspects of designing sequential trials are described. The necessity for a special form of statistical analysis following a sequential trial is explained, and the consequences of early or late stopping on the analysis are investigated. Compromises which have to be made between the formal requirements of theory and the practical realities of trial conduct are discussed.
The primary objective of the present study was to show the long-lasting cardioprotective activity of telmisartan, at different time points up to 18-month follow-up, in preserving systolic function (assessed as strain rate, SR) in cancer patients treated with epirubicin (EPI) in both the adjuvant and metastatic settings; the secondary objective was to confirm the correlation of the cardioprotective activity of telmisartan with a reduction of the inflammation and oxidative stress induced by EPI.
Phase II single-blind, placebo-controlled, randomized trial. The planned sample size was 50 patients per arm; following a pre-planned interim analysis with early stopping rules, the study was discontinued for ethical reasons at 49 patients. Cardiovascular disease-free patients with cancer at different sites who were eligible for EPI-based treatment were randomized to telmisartan (n = 25) or placebo (n = 24). Echocardiography with tissue Doppler imaging (TDI) strain and strain rate was performed, and serum levels of proinflammatory cytokines (IL-6, TNF-α) and oxidative stress (reactive oxygen species, ROS) were assessed at baseline, after every 100 mg/m2 EPI dose, and at 6-, 12- and 18-month follow-up (FU).
A significant reduction in peak SR was observed in both arms at t2 (cumulative EPI dose 200 mg/m2) vs t0. Conversely, at t3, t4 and 6-, 12- and 18-month FU, SR increased towards the normal range in the telmisartan arm, while in the placebo arm SR remained significantly lower. Differences between SR changes in the placebo and telmisartan arms were significant from t3 up to 18-month FU. IL-6 and ROS increased significantly in the placebo arm at t2 but did not change in the telmisartan arm. A significant (p < 0.05) correlation between changes in SR and changes in IL-6 and ROS was observed.
Our results suggest that the protective effect of telmisartan is long lasting, probably by ensuring a permanent (at least up to 18-month FU) defense against chronic or late-onset types of anthracycline-induced cardiotoxicity.
Epirubicin-induced cardiotoxicity; Cytokines; Oxidative stress; RAS; Telmisartan
In 2011, Royston et al. described technical details of a two-arm, multi-stage (TAMS) design. The design enables a trial to be stopped part-way through recruitment if the accumulating data suggests a lack of benefit of the experimental arm. Such interim decisions can be made using data on an available ‘intermediate’ outcome. At the conclusion of the trial, the definitive outcome is analyzed. Typical intermediate and definitive outcomes in cancer might be progression-free and overall survival, respectively. In TAMS designs, the stopping rule applied at the interim stage(s) affects the sampling distribution of the treatment effect estimator, potentially inducing bias that needs addressing.
We quantified the bias in the treatment effect estimator in TAMS trials according to the size of the treatment effect and for different designs. We also retrospectively ‘redesigned’ completed cancer trials as TAMS trials and used the bootstrap to quantify bias.
In trials in which the experimental treatment is better than the control and which continue to their planned end, the bias in the estimate of treatment effect is small and of no practical importance. In trials stopped for lack of benefit at an interim stage, the treatment effect estimate is biased at the time of interim assessment. This bias is markedly reduced by further patient follow-up and reanalysis at the planned ‘end’ of the trial.
Provided that all patients in a TAMS trial are followed up to the planned end of the trial, the bias in the estimated treatment effect is of no practical importance. Bias correction is then unnecessary.
Discovery of biomarkers that are correlated with therapy response and thus with survival is an important goal of medical research on severe diseases, e.g. cancer. Frequently, microarray studies are performed to identify genes of which the expression levels in pretherapeutic tissue samples are correlated to survival times of patients. Typically, such a study can take several years until the full planned sample size is available.
Therefore, interim analyses are desirable, offering the possibility of stopping the study earlier, or of performing additional laboratory experiments to validate the role of the detected genes. While many methods correcting the multiple testing bias introduced by interim analyses have been proposed for studies of one single feature, there are still open questions about interim analyses of multiple features, particularly of high-dimensional microarray data, where the number of features clearly exceeds the number of samples. We therefore examine false discovery rates and power rates in microarray experiments performed during interim analyses of survival studies. In addition, early stopping based on the interim results of such studies is evaluated. As the stopping criterion we employ the achieved average power rate, i.e. the proportion of detected true positives, for which a new estimator is derived and compared to existing estimators.
In a simulation study, pre-specified levels of the false discovery rate are maintained in each interim analysis, where reduced levels as used in classical group sequential designs of one single feature are not necessary. Average power rates increase with each interim analysis, and many studies can be stopped prior to their planned end when a certain pre-specified power rate is achieved. The new estimator for the power rate slightly deviates from the true power rate but is comparable to other estimators.
Interim analyses of microarray experiments can provide evidence for early stopping of long-term survival studies. The developed simulation framework, which we also offer as a new R package 'SurvGenesInterim' available at http://survgenesinter.R-Forge.R-Project.org, can be used for sample size planning of the evaluated study design.
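To make the notion of holding a pre-specified false discovery rate at each interim analysis concrete, here is a minimal sketch of the Benjamini-Hochberg step-up rule applied to one set of per-gene p-values. The abstract does not state which FDR procedure the authors used, so Benjamini-Hochberg is an assumption made for illustration, and the function name is hypothetical rather than part of the 'SurvGenesInterim' package.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return a list of booleans marking which hypotheses are rejected.

    Step-up rule: sort the m p-values, find the largest rank k with
    p_(k) <= k * q / m, and reject the k hypotheses with the smallest
    p-values. Under independence this controls the FDR at level q.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank  # keep the largest rank that passes
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


if __name__ == "__main__":
    # Hypothetical p-values for ten genes at a single interim analysis.
    interim_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
    print(benjamini_hochberg(interim_p, q=0.05))
```

In a design like the one described, the same call would simply be repeated at the same level q on the data available at each interim look, since the abstract reports that reduced levels were not needed.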
Decisions about interim analysis and early stopping of clinical trials, as based on recommendations of Data Monitoring Committees (DMCs), have far reaching consequences for the scientific validity and clinical impact of a trial. Our aim was to evaluate the frequency and quality of the reporting on DMC composition and roles, interim analysis and early termination in pediatric trials.
We conducted a systematic review of randomized controlled clinical trials published from 2005 to 2007 in a sample of four general and four pediatric journals. We used full-text databases to identify trials which reported on DMCs, interim analysis or early termination, and included children or adolescents. Information was extracted on general trial characteristics, risk of bias, and a set of parameters regarding DMC composition and roles, interim analysis and early termination.
110 of the 648 pediatric trials in this sample (17%) reported on DMC or interim analysis or early stopping, and were included; 68 from general and 42 from pediatric journals. The presence of DMCs was reported in 89 of the 110 included trials (81%); 62 papers, including 46 of the 89 that reported on DMCs (52%), also presented information about interim analysis. No paper adequately reported all DMC parameters, and nine (15%) reported all interim analysis details. Of 32 trials which terminated early, 22 (69%) did not report predefined stopping guidelines and 15 (47%) did not provide information on statistical monitoring methods.
Reporting on DMC composition and roles, on interim analysis results and on early termination of pediatric trials is incomplete and heterogeneous. We propose a minimal set of reporting parameters that will allow the reader to assess the validity of trial results.
An internal pilot with interim analysis (IPIA) design combines interim power analysis (an internal pilot) with interim data analysis (two stage group sequential). We provide IPIA methods for single df hypotheses within the Gaussian general linear model, including one and two group t tests. The design allows early stopping for efficacy and futility while also re-estimating sample size based on an interim variance estimate. Study planning in small samples requires the exact and computable forms reported here. The formulation gives fast and accurate calculations of power, type I error rate, and expected sample size.
Adaptive designs; Power; Sample size re-estimation
Randomized clinical trials (RCTs) stopped early for benefit often receive great attention and affect clinical practice, but pose interpretational challenges for clinicians, researchers, and policy makers. Because the decision to stop the trial may arise from catching the treatment effect at a random high, truncated RCTs (tRCTs) may overestimate the true treatment effect. The Study Of Trial Policy Of Interim Truncation (STOPIT-1), which systematically reviewed the epidemiology and reporting quality of tRCTs, found that such trials are becoming more common, but that reporting of stopping rules and decisions was often deficient. Most importantly, treatment effects were often implausibly large and inversely related to the number of events accrued. The aim of STOPIT-2 is to determine the magnitude and determinants of possible bias introduced by stopping RCTs early for benefit.
We will use sensitive strategies to search for systematic reviews addressing the same clinical question as each of the tRCTs identified in STOPIT-1 and in a subsequent literature search. We will check all RCTs included in each systematic review to determine their similarity to the index tRCT in terms of participants, interventions, and outcome definition, and conduct new meta-analyses addressing the outcome that led to early termination of the tRCT. For each pair of tRCT and systematic review of corresponding non-tRCTs we will estimate the ratio of relative risks, and hence estimate the degree of bias. We will use hierarchical multivariable regression to determine the factors associated with the magnitude of this ratio. Factors explored will include the presence and quality of a stopping rule, the methodological quality of the trials, and the number of total events that had occurred at the time of truncation.
Finally, we will evaluate whether Bayesian methods using conservative informative priors to "regress to the mean" overoptimistic tRCTs can correct observed biases.
A better understanding of the extent to which tRCTs exaggerate treatment effects and of the factors associated with the magnitude of this bias can optimize trial design and data monitoring charters, and may aid in the interpretation of the results from trials stopped early for benefit.
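The ratio of relative risks described in the protocol above can be sketched as follows. This is a simplified illustration, not the planned analysis: the pooled estimate from the non-truncated trials is represented here by a single 2x2 table instead of a meta-analysis, the two estimates are assumed independent, a normal approximation on the log scale is used for the interval, and all function names and counts are hypothetical.

```python
from math import exp, log, sqrt


def log_relative_risk(events_trt, n_trt, events_ctl, n_ctl):
    """Log relative risk and its large-sample standard error from a 2x2 table."""
    rr = (events_trt / n_trt) / (events_ctl / n_ctl)
    se = sqrt(1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl)
    return log(rr), se


def ratio_of_relative_risks(log_rr_a, se_a, log_rr_b, se_b, z=1.96):
    """Ratio RR_a / RR_b with an approximate 95% interval (independent estimates)."""
    diff = log_rr_a - log_rr_b
    se = sqrt(se_a**2 + se_b**2)
    return exp(diff), exp(diff - z * se), exp(diff + z * se)


if __name__ == "__main__":
    # Hypothetical counts: a truncated trial and the matching non-truncated evidence.
    truncated = log_relative_risk(10, 100, 20, 100)
    comparator = log_relative_risk(40, 200, 50, 200)
    print(ratio_of_relative_risks(*truncated, *comparator))
```

When the outcome is an adverse event, a ratio below 1 means the truncated trial shows a larger apparent benefit than the comparable non-truncated trials, which is the direction of bias the study sets out to quantify.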
Ventilator-associated tracheobronchitis (VAT) is associated with increased duration of mechanical ventilation. We hypothesized that, in patients with VAT, antibiotic treatment would be associated with reduced duration of mechanical ventilation.
We conducted a prospective, randomized, controlled, unblinded, multicenter study. Patients were randomly assigned (1:1) to receive or not receive intravenous antibiotics for 8 days. Patients with ventilator-associated pneumonia (VAP) prior to VAT and those with severe immunosuppression were not eligible. The trial was stopped early because a planned interim analysis found a significant difference in intensive care unit (ICU) mortality.
Fifty-eight patients were randomly assigned. Patient characteristics were similar in the antibiotic (n = 22) and no antibiotic (n = 36) groups. Pseudomonas aeruginosa was identified in 32% of VAT episodes. Although no difference was found in mechanical ventilation duration and length of ICU stay, mechanical ventilation-free days were significantly higher (median [interquartile range], 12 [8 to 24] versus 2 [0 to 6] days, P < 0.001) in the antibiotic group than in the no antibiotic group. In addition, subsequent VAP (13% versus 47%, P = 0.011, odds ratio [OR] 0.17, 95% confidence interval [CI] 0.04 to 0.70) and ICU mortality (18% versus 47%, P = 0.047, OR 0.24, 95% CI 0.07 to 0.88) rates were significantly lower in the antibiotic group than in the no antibiotic group. Similar results were found after exclusion of patients with do-not-resuscitate orders and those randomly assigned to the no antibiotic group but who received antibiotics for infections other than VAT or subsequent VAP.
In patients with VAT, antimicrobial treatment is associated with a greater number of days free of mechanical ventilation and lower rates of VAP and ICU mortality. However, antibiotic treatment has no significant impact on total duration of mechanical ventilation.
ClinicalTrials.gov, number NCT00122057.
Consider a comparative, randomized clinical study with a specific event time as the primary endpoint. In the presence of censoring, standard methods of summarizing the treatment difference are based on Kaplan-Meier curves, the logrank test and the point and interval estimates via Cox’s procedure. Moreover, for designing and monitoring the study, one usually utilizes an event-driven scheme to determine the sample sizes and interim analysis time points.
When the proportional hazards assumption is violated, the logrank test may not have sufficient power to detect the difference between two event time distributions. The resulting hazard ratio estimate is difficult, if not impossible, to interpret as a treatment contrast. When the event rates are low, the corresponding interval estimate for the “hazard ratio” can be quite large due to the fact that the interval length depends on the observed numbers of events. This may indicate that there is not enough information for making inferences about the treatment comparison even when there is no difference between two groups. This situation is quite common for a post marketing safety study. We need an alternative way to quantify the group difference.
Instead of quantifying the treatment group difference using the hazard ratio, we consider an easily interpretable and model-free parameter, the integrated survival rate difference over a pre-specified time interval, as an alternative. We present the inference procedures for such a treatment contrast. This approach is purely nonparametric and does not need any model assumption such as the proportional hazards. Moreover, when we deal with equivalence or non-inferiority studies and the event rates are low, our procedure would provide more information about the treatment difference. We used a cardiovascular trial data set to illustrate our approach.
The results using the integrated event rate differences have a heuristic interpretation for the treatment difference even when the proportional hazards assumption is not valid. When the event rates are low, for example, for the cardiovascular study discussed in the paper, the procedure for the integrated event rate difference provides tight interval estimates in contrast to those based on the event-driven inference method.
The design of a trial with the integrated event rate difference may be more complicated than that using the event-driven procedure. One may use simulation to determine the sample size and the estimated duration of the study.
The procedure discussed in the paper can be a useful alternative to the standard proportional hazards method in survival analysis.
Equivalence study; Event-driven study; Kaplan-Meier curve; Non-inferiority trial; Post-market study; Proportional hazards estimate
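The point estimate behind the integrated survival rate difference discussed in the abstract above can be sketched as the difference between the areas under two Kaplan-Meier curves. The sketch assumes the pre-specified interval is [0, tau]. It covers only the point estimate, not the paper's inference procedures (interval estimates and tests), and the function names are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (event_time, survival) steps.

    `events[i]` is 1 if the event was observed at `times[i]`, 0 if censored.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    steps = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects sharing this time (events and censorings).
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            surv *= 1 - deaths / n_at_risk
            steps.append((t, surv))
        n_at_risk -= leaving
    return steps


def integrated_survival(times, events, tau):
    """Area under the Kaplan-Meier step function on [0, tau]."""
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in kaplan_meier(times, events):
        if t >= tau:
            break
        area += prev_s * (t - prev_t)
        prev_t, prev_s = t, s
    return area + prev_s * (tau - prev_t)


def integrated_survival_difference(times_a, events_a, times_b, events_b, tau):
    """Group A minus group B, in the same time units as `times`."""
    return integrated_survival(times_a, events_a, tau) - integrated_survival(
        times_b, events_b, tau
    )


if __name__ == "__main__":
    # Tiny hypothetical data set: follow-up times with event indicators.
    print(integrated_survival_difference([1, 2, 3, 4], [1, 1, 1, 1], [1, 2, 3], [1, 0, 1], tau=3))
```

Because the result is an average amount of event-free time gained or lost over the interval, it can be read directly as a treatment contrast without assuming proportional hazards.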
Although interim analysis approaches in clinical trials are widely known, information on current practice of planned monitoring is still scarce. Reports of studies rarely include details on the strategies for both data monitoring and interim analysis. The aim of this project is to investigate the forms of monitoring used in cancer clinical trials and in particular to gather information on the role of interim analyses in the data monitoring process of a clinical trial. This study focused on the prevalence of different types of interim analyses and data monitoring in cancer clinical trials.
The sources of investigation were the protocols of cancer clinical trials included in the Italian registry of clinical trials from 2000 to 2005. Evaluation was restricted to protocols of randomised studies with a time-to-event endpoint, such as overall survival (OS) or progression-free survival (PFS). A template data extraction form was developed and tested in a pilot phase. Selection of relevant protocols and data extraction were performed independently by two evaluators, with differences in the data assessment resolved by consensus with a third reviewer, referring back to the original protocol. Information was obtained on a) general characteristics of the protocol; b) disease localization and patient setting; c) study design; d) interim analyses; e) DSMC.
The analysis of the collected protocols reveals that 70.7% of the protocols incorporate statistical interim analysis plans, but only 56% also have a DSMC and can be considered adequately planned. The most concerning cases are the lack of any form of monitoring (20.0% of the protocols) and the planning of interim analysis without a DSMC (14.7%).
The results indicate that there is still insufficient attention paid to the implementation of interim analysis.
In this paper, we derive sequential conditional probability ratio tests to compare diagnostic tests without distributional assumptions on test results. The test statistics in our method are nonparametric weighted areas under the receiver-operating characteristic curves. With the new method, a decision to stop the diagnostic trial early is unlikely to be reversed should the trial continue to its planned end. The more conservative stopping boundaries that this approach applies during the course of the trial are especially appealing for diagnostic trials, since the end point is not death. In addition, the maximum sample size of our method is not greater than that of a fixed sample test with a similar power function. Simulation studies are performed to evaluate the properties of the proposed sequential procedure. We illustrate the method using data from a thoracic aorta imaging study.
diagnostic accuracy; ROC; AUC; weighted AUC; SCPRT
The applicability of water method colonoscopy in trainee education is not known.
To compare the water method vs. usual air method in teaching novice trainee colonoscopy.
An IRB approved prospective randomized cross-over study (NCT01482546) in a university setting with diverse patient population.
Three first-year GI fellows consented to participate in the study. Trainees were randomized to learn colonoscopy with either the usual air method or the water method, working with a dedicated endoscopy attending during their weekly outpatient endoscopy clinics for the initial six months of training, and then crossed over to the other method for the remaining six months.
Patients undergoing screening, surveillance or diagnostic colonoscopy.
The interim data revealed no significant difference in age, gender, and body mass index (BMI). Trainees rated the water method colonoscopy as significantly easier to learn compared to the air method (p=0.007).
The interim data demonstrate positive effects of using the water method in training novice endoscopists who reported a significant ease of learning colonoscopy using this method. Training programs could consider joining us in evaluating the use of warm water infusion in colonoscopy education.
water method; trainee; novice; education
Two-stage designs to develop and validate a panel of biomarkers present a natural setting for the inclusion of stopping rules for futility in the event of poor preliminary estimates of performance. We consider the design of a two-stage study to develop and validate a panel of biomarkers where a predictive model is developed using a subset of the samples in stage 1 and the model is validated using the remainder of the samples in stage 2. First, we illustrate how a stopping rule for futility can be implemented in a standard, two-stage study for developing and validating a predictive model where samples are separated into a training and validation sample. Simulation results indicate that our design has similar type-I error rate and power to the fixed-sample design but with a substantially reduced sample size under the null hypothesis. We then illustrate how additional interim analyses can be included in stage 2 by applying existing group sequential methodology, which results in even greater savings in the number of samples required under both the null and alternative. Our simulation results also illustrate that the operating characteristics of our design are robust to changes in the underlying marker distribution.
Group Sequential Design; Biomarker Panel; ROC Curve
The smallest difference to be detected in superiority trials or the largest difference to be ruled out in noninferiority trials is a key determinant of sample size, but little guidance exists to help researchers in their choice. The objectives were to examine the distribution of differences that researchers aim to detect in clinical trials and to verify that those differences are smaller in noninferiority compared to superiority trials.
Cross-sectional study based on a random sample of two hundred two-arm, parallel group superiority (100) and noninferiority (100) randomized clinical trials published between 2004 and 2009 in 27 leading medical journals. The main outcome measure was the smallest difference in favor of the new treatment to be detected (superiority trials) or largest unfavorable difference to be ruled out (noninferiority trials) used for sample size computation, expressed as standardized difference in proportions, or standardized difference in means. Student t test and analysis of variance were used.
The differences to be detected or ruled out varied considerably from one study to the next; e.g., for superiority trials, the standardized difference in means ranged from 0.007 to 0.87, and the standardized difference in proportions from 0.04 to 1.56. On average, superiority trials were designed to detect larger differences than noninferiority trials (standardized difference in proportions: mean 0.37 versus 0.27, P = 0.001; standardized difference in means: 0.56 versus 0.40, P = 0.006). Standardized differences were lower for mortality than for other outcomes, and lower in cardiovascular trials than in other research areas.
Superiority trials are designed to detect larger differences than noninferiority trials are designed to rule out. The variability between studies is considerable and is partly explained by the type of outcome and the medical context. A more explicit and rational approach to choosing the difference to be detected or to be ruled out in clinical trials may be desirable.
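To show why the target difference is such a strong determinant of sample size, the sketch below uses the standard normal-approximation formula for comparing two means, n per arm = 2(z_{1-α/2} + z_{1-β})² / d², where d is the standardized difference. The two-sided α of 0.05 and power of 0.80 are assumptions made for illustration and are not values reported by the study. The function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(std_diff, alpha=0.05, power=0.80):
    """Approximate per-arm sample size for a two-arm comparison of means.

    Normal approximation with equal allocation and a two-sided test:
        n = 2 * (z_{1 - alpha/2} + z_{power})^2 / std_diff^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * (z_alpha + z_beta) ** 2 / std_diff**2)


if __name__ == "__main__":
    # Mean standardized differences in means reported in the abstract.
    for d in (0.56, 0.40):
        print(d, n_per_arm(d))
```

Under these assumptions, the mean standardized differences in means reported above (0.56 for superiority and 0.40 for noninferiority trials) correspond to roughly 51 and 99 participants per arm, so a modest change in the chosen difference nearly doubles the required size.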
We propose an efficient group sequential monitoring rule for clinical trials. At each interim analysis both efficacy and futility are evaluated through a specified loss structure together with the predicted power. The proposed design is robust to a wide range of priors, and achieves the specified power with a saving of sample size compared to existing adaptive designs. A method is also proposed to obtain a reduced-bias estimator of treatment difference for the proposed design. The new approaches hold great potential for efficiently selecting a more effective treatment in comparative trials. Operating characteristics are evaluated and compared with other group sequential designs in empirical studies. An example is provided to illustrate the application of the method.
Decision theory; Group sequential clinical trial design; Loss function; Predicted power; Reduced-bias estimator
Lower-than-expected incidence of HIV undermines sample size calculations and compromises the power of an HIV prevention trial. We evaluated the effectiveness of interim monitoring of HIV infection rates and on-going modification of recruitment strategies to enroll women at higher risk of HIV in the Cellulose Sulfate Phase III study in Nigeria.
We analyzed prevalence and incidence of HIV and other sexually transmitted infections, demographic and sexual behavior characteristics aggregated over the treatment groups on a quarterly basis. The site investigators were advised on their recruitment strategies based on the findings of the interim analyses.
A total of 3619 women were screened and 1644 enrolled at the Ikeja and Apapa clinics in Lagos, and at the Central and Peripheral clinics in Port Harcourt. Twelve months after study initiation, the overall incidence of HIV was less than one-third of the pre-study assumption, with rates of HIV that varied substantially between clinics. Due to the low prevalence and incidence rates of HIV, it was decided to close the Ikeja clinic in Lagos and to find new catchment areas in Port Harcourt. This strategy was associated with an almost two-fold increase in observed HIV incidence during the second year of the study.
Given the difficulties in estimating HIV incidence, a close monitoring of HIV prevalence and incidence rates during a trial is warranted. The on-going modification of recruitment strategies based on the regular analysis of HIV rates appeared to be an efficient method for targeting populations at greatest risk of HIV infection and increasing study power in the Nigeria trial.
The trial was registered with the ClinicalTrials.gov registry under #NCT00120770 http://clinicaltrials.gov/ct2/show/NCT00120770
The role of cetuximab for the treatment of advanced non-small-cell lung cancer (NSCLC) is currently unclear. The molecular target of cetuximab, epidermal growth factor receptor (EGFR), as measured by fluorescent in situ hybridization (FISH), has shown potential to be a predictive biomarker for cetuximab efficacy in NSCLC. SWOG S0819 is a Phase III trial evaluating both the value of cetuximab in this setting and EGFR FISH as a predictive biomarker. This paper describes the decision process for determining the design and interim monitoring plan for S0819. Six possible designs were evaluated in terms of their properties and the hypotheses that can be addressed within the design constraints. A subgroup-focused multiple-hypothesis design was selected for S0819, incorporating co-primary endpoints to assess cetuximab in both the overall study population and among EGFR FISH positive patients, with the sample size determined based on evaluation in the EGFR FISH positive group. The interim monitoring plan chosen specifies interim evaluations of both efficacy and futility in the EGFR FISH positive group alone. The futility monitoring plan to determine early stopping in the EGFR FISH non-positive group is based on evaluation within the positive group, the entire study population, and the non-positive group. SWOG S0819 employs a design which addresses both the biomarker-driven and general efficacy objectives of this study.
Clinical trial design; Molecular targeted therapies; cetuximab; interim monitoring; non-small cell lung cancer
It is a challenge to evaluate experimental treatments where it is suspected that the treatment effect may only be strong for certain subpopulations, such as those having a high initial severity of disease, or those having a particular gene variant. Standard randomized controlled trials can have low power in such situations. They also are not optimized to distinguish which subpopulations benefit from a treatment. With the goal of overcoming these limitations, we consider randomized trial designs in which the criteria for patient enrollment may be changed, in a preplanned manner, based on interim analyses. Since such designs allow data-dependent changes to the population enrolled, care must be taken to ensure strong control of the familywise Type I error rate. Our main contribution is a general method for constructing randomized trial designs that allow changes to the population enrolled based on interim data using a prespecified decision rule, for which the asymptotic, familywise Type I error rate is strongly controlled at a specified level α. As a demonstration of our method, we prove new, sharp results for a simple, two-stage enrichment design. We then compare this design to fixed designs, focusing on each design’s ability to determine the overall and subpopulation-specific treatment effects.
Adaptive design; Enrichment design; Group sequential design; Optimization; Patient-oriented research; Randomized trial; Subpopulation
An adaptive design allows the modifications of various features such as sample size and treatment assignments in a clinical study based on the analysis of interim data. The goal is to enhance statistical efficiency by maximizing relevant information obtained from the clinical data. The promise of efficiency, however, comes with a “cost” that is seldom made explicit in the literature. This article reviews some commonly used adaptive strategies in early phase stroke trials and discusses their associated costs. Specifically, we illustrate the tradeoffs in several clinical contexts, including dose finding in the Neuroprotection with Statin Therapy for Acute Recovery Trial, futility analyses and internal pilot in phase 2 proof-of-concept trials, and sample size considerations in an imaging-based dose selection trial. Through these illustrations, we demonstrate the potential tension between the perspectives of an individual investigator and the broader community of stakeholders. This understanding is critical to appreciate the limitations, as well as full promise, of adaptive designs, so that investigators can deploy an appropriate statistical design—be it adaptive or not—in a clinical study.
Continual reassessment method; Futility interim analysis; Internal pilot; Prospective planning
Metastatic renal cell carcinoma (RCC) has a poor prognosis. Conventional treatment strategies, including chemotherapy and hormonal therapy, have limited value. Although encouraging results have been achieved in terms of objective response using immunological manipulations, no conclusive studies yet exist with a controlled comparative evaluation of survival. The present study was therefore undertaken to compare one of the current (and presumed best) treatments, interleukin 2/interferon-alpha (IL-2/IFN-alpha) plus tamoxifen, with a control arm of tamoxifen only. Tamoxifen has been shown to potentiate the in vivo anti-tumour activity of IL-2, and because of its non-toxic behaviour it was included in both groups. The study was open and randomized, and included seven institutions in Sweden. The patients were stratified by centre. An interim analysis was planned for when a minimum of 100 patients were evaluable. The 128 patients finally included had histologically documented metastatic RCC, a life expectancy of more than 3 months, a WHO performance status of 0-2 and no prior chemo- or immunotherapy. Informed consent was obtained from each patient. The patients randomized to the control arm (n = 63) received only tamoxifen 40 mg p.o. daily for at least 1 year or until progression. The patients randomized to biotherapy (n = 65) received subcutaneous recombinant IL-2 and leucocyte IFN-alpha in a treatment cycle of 42 days, as well as tamoxifen p.o. In the absence of undue toxicity or disease progression, these patients received one additional treatment cycle of 42 days followed by maintenance treatment, consisting of 5 days of therapy every 4 weeks, for 1 year or until proven progression. Only two patients in the tamoxifen-only group received immunotherapy when the disease progressed, but without any beneficial effect. All patients received appropriate local treatment when indicated.
The interim analysis demonstrated no survival advantage for either group, and therefore further inclusion of patients was stopped. The median follow-up was 11 months (range 0.4-48 months). The final survival analysis showed no significant differences between the two treatment arms, whether survival was measured from the day of diagnosis of primary disease, from the day of first evidence of metastatic spread, or from the onset of treatment. This held both for the intention-to-treat analysis and when the analysis was restricted to patients who completed at least one treatment cycle (42 days) of IL-2/IFN-alpha. The adverse effects were more pronounced in the IL-2/IFN-alpha group. Although the number of patients is limited, the results raise doubt concerning immunotherapy with IL-2 and IFN-alpha as a routine treatment in the management of advanced RCC. The difference in the cost of drugs and health care (drug costs per patient: IL-2/IFN-alpha $27000 vs tamoxifen $360) and the adverse effects caused by IL-2/IFN-alpha are also factors of importance. The study emphasizes the need for more effort to find the 'optimal schedule' of immunotherapy, as well as the need for randomized controlled studies before approval of a new treatment in the routine setting.
The motivation for proposing sequential methods for cancer clinical trials is presented, and the methodology is examined by re-analysing two completed phase III cancer trials of the Lung Cancer Working Party of the British Medical Research Council. The reanalysis proceeds as if the trials had been designed with a planned series of interim analyses governing stopping. Specifically, the triangular and double-triangular tests were applied. Compared with the completed studies, the sequential reanalysis gave a substantial reduction in the number of patients required, and deaths observed, for conclusions to be reached. In each case, the sequential analysis was stratified for baseline prognostic factors which were seen to be important at the first interim analysis.
With the need for healthcare cost-containment, increased scrutiny will be placed on new medical therapeutic or diagnostic technologies. Several challenges exist for a new diagnostic test to demonstrate cost-effectiveness. New diagnostic tests differ from therapeutic procedures in that diagnostic tests do not generally affect long-term patient outcomes directly. Instead, the results of diagnostic tests can influence management decisions for patients, and by this route diagnostic tests indirectly affect long-term outcomes. The benefits from a specific diagnostic technology therefore depend not only on its performance characteristics, but also on other factors such as the prevalence of disease and the effectiveness of existing treatments for the disease of interest. We review the concepts and theories of cost-effectiveness analyses (CEA) as they apply to diagnostic tests in general. The limitations of CEA across different study designs and geographic regions are discussed, and we also examine the strengths and weaknesses of the existing publications where CMR was the focus of CEA compared to other diagnostic options.
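The dependence of a diagnostic test's value on disease prevalence can be illustrated with a short calculation. The sketch below is not taken from the review; it simply applies Bayes' rule to hypothetical sensitivity, specificity and prevalence values to show how the predictive values of the same test change with the population tested. The function name and all numbers are illustrative assumptions.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence.

    All inputs are proportions strictly between 0 and 1.
    The values used in the example below are hypothetical.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)

    ppv = true_pos / (true_pos + false_pos)  # P(disease | positive test)
    npv = true_neg / (true_neg + false_neg)  # P(no disease | negative test)
    return ppv, npv


if __name__ == "__main__":
    # Same hypothetical test (90% sensitive, 90% specific) at three prevalences.
    for prev in (0.01, 0.10, 0.50):
        ppv, npv = predictive_values(0.90, 0.90, prev)
        print(f"prevalence={prev:.2f}  PPV={ppv:.3f}  NPV={npv:.3f}")
```

With identical performance characteristics, the positive predictive value falls as prevalence falls, which is one reason a test that looks cost-effective in a high-risk population may not be so in a low-risk one.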
To explicate differences between early and recent meta-analytic estimates of the effects of cognitive-behavioral therapy (CBT) for adolescent depression.
Meta-analytic procedures were used to investigate whether methodological characteristics moderated mean effect sizes among 11 randomized, controlled trials of CBT focusing on adolescents meeting diagnostic criteria for unipolar depression.
Cumulative meta-analyses indicated that effects of CBT have decreased from large effects in early trials, and confidence intervals have become narrower. Effect sizes were significantly smaller among studies that used intent-to-treat analytic strategies, compared CBT to active treatments, were conducted in clinical settings, and featured greater methodological rigor based on CONSORT (Consolidated Standards of Reporting Trials) criteria. The mean posttreatment effect size of 0.53 was statistically significant.
Differences in estimates of the efficacy of CBT for depressed adolescents may stem from methodological differences between early and more recent investigations. Overall, results support the effectiveness of CBT for the treatment of adolescent depression.
depression; cognitive-behavioral therapy; meta-analysis
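A cumulative meta-analysis of the kind described above re-pools the evidence each time a trial is added in chronological order. The sketch below is a minimal illustration under stated assumptions: it uses a fixed-effect, inverse-variance model (the abstract does not say which model the authors used), a normal-approximation 95% confidence interval, and entirely hypothetical effect sizes and variances. The function name is likewise an assumption.

```python
import math


def cumulative_fixed_effect(effects, variances):
    """Cumulative inverse-variance (fixed-effect) pooling.

    effects   -- per-study effect sizes, in chronological order
    variances -- matching sampling variances (all > 0)
    Returns a list of (pooled_effect, ci_lower, ci_upper), one entry per
    study added, using a normal-approximation 95% interval.
    """
    results = []
    sum_w = 0.0   # running sum of weights (1 / variance)
    sum_wy = 0.0  # running sum of weight * effect
    for y, v in zip(effects, variances):
        w = 1.0 / v
        sum_w += w
        sum_wy += w * y
        pooled = sum_wy / sum_w
        se = math.sqrt(1.0 / sum_w)
        results.append((pooled, pooled - 1.96 * se, pooled + 1.96 * se))
    return results


if __name__ == "__main__":
    # Hypothetical data: large early effects, smaller and more precise later ones.
    effects = [1.2, 0.9, 0.6, 0.4, 0.3]
    variances = [0.20, 0.15, 0.08, 0.05, 0.04]
    for k, (est, lo, hi) in enumerate(cumulative_fixed_effect(effects, variances), 1):
        print(f"after {k} studies: {est:.2f} [{lo:.2f}, {hi:.2f}]")
```

Under a fixed-effect model the interval can only narrow as studies are added, because the total weight only grows; a random-effects model would not guarantee this.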
Purpose of review
Two phase IIb test of concept studies evaluated the ability of the replication-defective adenovirus type 5 (Ad5) MRK gag/pol/nef HIV vaccine to prevent infection or decrease early plasma viral load in disparate populations. The Step study enrolled men and women in the Americas, Caribbean and Australia; the Phambili trial enrolled men and women in South Africa, where the modes of sexual transmission and HIV-1 risk, sub-types of HIV-1, and background Ad5 seroprevalence differed.
Vaccination in both studies was stopped after the first interim efficacy analysis of the Step study crossed pre-determined non-efficacy boundaries. Neither trial demonstrated a decrease in HIV acquisition or in early plasma viral load in vaccinees compared to placebo recipients. Post-hoc analyses of men enrolled in the Step study showed a larger number of HIV infections in the sub-group of vaccinated men who were Ad5 seropositive and uncircumcised than in a comparable placebo group. This was not demonstrated in the Phambili study, where most men were heterosexual, while most in Step were homosexual/bisexual. Further analysis of the Step study has yet to explain the effect of Ad5 seroprevalence on increased HIV-1 susceptibility in men receiving the vaccine. However, promising vaccine effects on early viral control were seen, as was a possible effect on early viral load setpoint in women in Phambili.
These trials have provided a number of lessons about the importance of clinical trials in the HIV vaccine discovery process, and insight into the type and level of immune response that will be required for control of viral replication.
Phase IIb test of concept trials; HIV vaccines; ad5 vaccines