With the accelerated implementation of genomic medicine, health-care providers will depend heavily on professional guidelines and recommendations. Because genomics affects many diseases across the life span, no single professional group covers the entirety of this rapidly developing field.
To pursue a discussion of the minimal elements needed to develop evidence-based guidelines in genomics, the Centers for Disease Control and Prevention and the National Cancer Institute jointly held a workshop to engage representatives from 35 organizations with interest in genomics (13 of which make recommendations). The workshop explored methods used in evidence synthesis and guideline development and initiated a dialogue to compare these methods and to assess whether they are consistent with the Institute of Medicine report “Clinical Practice Guidelines We Can Trust.”
The participating organizations that develop guidelines or recommendations all had policies to manage guideline development and group membership, and processes to address conflicts of interest. However, there was wide variation in the reliance on external reviews, regular updating of recommendations, and use of systematic reviews to assess the strength of scientific evidence.
Ongoing efforts are required to establish criteria for guideline development in genomic medicine as proposed by the Institute of Medicine.
evidence synthesis; genomic medicine; guideline development
To understand how often ‘breakthroughs,’ that is, treatments that significantly improve health outcomes, can be developed.
We applied weighted adaptive kernel density estimation to construct the probability density function for observed treatment effects from five publicly funded cohorts and one privately funded group.
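The technique named above can be illustrated with a minimal sketch of a weighted adaptive Gaussian kernel density estimate. Everything specific here is an assumption, since the abstract does not give these details: the Silverman-style pilot bandwidth, the sensitivity parameter `alpha = 0.5` (Abramson's square-root rule), the toy log hazard ratios and weights, and the cutoff used for the tail probability are all hypothetical.

```python
import math

def gaussian(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def fixed_kde(x, data, weights, h):
    """Weighted fixed-bandwidth KDE at x (used as the pilot estimate)."""
    total = sum(weights)
    return sum(w * gaussian((x - d) / h) for d, w in zip(data, weights)) / (total * h)

def pilot_bandwidth(data, weights):
    """Silverman-style rule of thumb on the weighted standard deviation (assumed)."""
    total = sum(weights)
    mean = sum(w * d for d, w in zip(data, weights)) / total
    var = sum(w * (d - mean) ** 2 for d, w in zip(data, weights)) / total
    return 1.06 * math.sqrt(var) * len(data) ** (-0.2)

def adaptive_kde(data, weights, alpha=0.5):
    """Return a density function whose bandwidth widens where data are sparse."""
    h0 = pilot_bandwidth(data, weights)
    pilot = [fixed_kde(d, data, weights, h0) for d in data]
    geo_mean = math.exp(sum(math.log(p) for p in pilot) / len(pilot))
    local_h = [h0 * (p / geo_mean) ** (-alpha) for p in pilot]
    total = sum(weights)

    def density(x):
        return sum(w * gaussian((x - d) / h) / h
                   for d, w, h in zip(data, weights, local_h)) / total
    return density

def integrate(f, lo, hi, steps=2000):
    """Trapezoidal rule."""
    dx = (hi - lo) / steps
    s = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        s += f(lo + i * dx)
    return s * dx

# Hypothetical treatment effects (log hazard ratios) and per-comparison weights.
effects = [-0.6, -0.3, -0.1, 0.0, 0.05, 0.2, 0.4]
weights = [1, 2, 3, 3, 2, 2, 1]
density = adaptive_kde(effects, weights)

# Probability mass below a hypothetical "large effect" cutoff of HR = 0.5.
cutoff = math.log(0.5)
prob_large = integrate(density, -5.0, cutoff)
```

The tail probability is read off the estimated density by numerical integration; which cutoff defines a "large" or "very large" effect is a choice the sketch leaves open.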
820 trials involving 1064 comparisons and enrolling 331 004 patients were conducted by five publicly funded cooperative groups. 40 cancer trials involving 50 comparisons and enrolling a total of 19 889 patients were conducted by GlaxoSmithKline.
We calculated that the probability of detecting treatment with large effects is 10% (5–25%), and that the probability of detecting treatment with very large treatment effects is 2% (0.3–10%). Researchers themselves judged that they discovered a new, breakthrough intervention in 16% of trials.
We propose these figures as the benchmarks against which future development of ‘breakthrough’ treatments should be measured.
STATISTICS & RESEARCH METHODS; BIOTECHNOLOGY & BIOINFORMATICS; EPIDEMIOLOGY
Symptomatic hip osteoarthritis (OA) is a disabling condition with up to a 25% cumulative lifetime risk. Total hip arthroplasty (THA) is effective in relieving patients’ symptoms and improving function. It is, however, associated with substantial risk of complications, pain and major functional limitation before patients can return to full function. In contrast, hip arthroscopy (HA) is less invasive and can postpone THA. However, there is no evidence regarding the delay in the need for THA that patients would consider sufficient to justify undergoing HA. Knowing patients’ values and preferences (VP) on this expected delay is critical when making recommendations regarding the advisability of HA. Furthermore, little is known about how much information on interventions and outcomes must be presented to elicit patients’ VP optimally.
Methods and analysis
We will perform a multinational, structured interview-based survey of preference in delay time for THA among patients with non-advanced OA who failed to respond to conservative therapy. We will combine these interviews with a randomised trial addressing the optimal amount of information regarding the interventions and outcomes required to elicit preferences. Eligible patients will be randomly assigned (1:1) to either a short or a long format of health scenarios of THA and HA. We will determine each patient's VP using trade-off and anticipated regret exercises. Our primary outcomes for the combined surveys will be: (1) the minimal delay time in the need for THA surgery that patients would consider sufficient to justify undertaking HA, and (2) patients’ satisfaction with the amount of information provided in the health scenarios used to elicit their VP.
Ethics and dissemination
The protocol has been approved by the Hamilton Integrated Research Ethics Board (HIREB13-506). We will disseminate our study findings through peer-reviewed publications and conference presentations, and make them available to guideline makers issuing recommendations addressing HA and THA.
Patients' values and preference; Total Hip Arthroplasty; Hip Arthroscopy; Patient Written Information; Decision Making
Randomized controlled trials (RCTs) are considered the gold standard for assessing the efficacy of new treatments compared to standard treatments. However, the reasoning behind treatment selection in RCTs is often unclear. Here, we focus on a cohort of RCTs in multiple myeloma (MM) to understand the patterns of competing treatment selections.
We used social network analysis (SNA) to study relationships between treatment regimens in MM RCTs and to examine the topology of RCT treatment networks. All trials considering induction or autologous stem cell transplant among patients with MM were eligible for our analysis. We searched Medline, abstracts from the annual proceedings of the American Society of Hematology and the American Society of Clinical Oncology, and all references from relevant publications. We extracted data on treatment regimens, year of publication, funding type, and number of patients enrolled. The SNA metrics used relate to node-level and network-level centrality and to the characterization of node positioning.
135 RCTs enrolling a total of 36,869 patients were included. The density of the RCT network was low indicating little cohesion among treatments. Network Betweenness was also low signifying that the network does not facilitate exchange of information. The maximum geodesic distance was equal to 4, indicating that all connected treatments could reach each other in four “steps” within the same pathway of development. The distance between many important treatment regimens was greater than 1, indicating that no RCTs have compared these regimens.
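Two of the network measures reported above, density and geodesic distance, can be sketched in a few lines. The toy network below is hypothetical (regimens are labelled A to G and the edges are invented); an edge is assumed to mean that at least one RCT compared the two regimens head to head. This is an illustration of the metrics, not the authors' software or data.

```python
from collections import deque
from itertools import combinations

# Hypothetical treatment network: each edge is a head-to-head RCT comparison.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("F", "G")]

def build_graph(edge_list):
    """Undirected adjacency sets."""
    graph = {}
    for u, v in edge_list:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph

def density(graph):
    """Observed edges divided by the number of possible edges."""
    n = len(graph)
    m = sum(len(nbrs) for nbrs in graph.values()) // 2
    return 2 * m / (n * (n - 1))

def geodesics_from(graph, source):
    """Shortest path length from source to every reachable node (BFS)."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist

graph = build_graph(edges)
all_dist = {node: geodesics_from(graph, node) for node in graph}

# Longest shortest path among connected pairs.
max_geodesic = max(d for dists in all_dist.values() for d in dists.values())

# Pairs at distance > 1 (or unreachable) have never been compared directly.
never_compared = [
    (u, v) for u, v in combinations(sorted(graph), 2)
    if all_dist[u].get(v, float("inf")) > 1
]
```

In this toy network the density is 5 of 21 possible edges, and any pair in `never_compared` corresponds to a comparison that no trial in the network has made.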
Our findings show that research programs in myeloma, which is a relatively small field, are surprisingly decentralized with a lack of connectivity among various research pathways. As a result there is much crucial research left unexplored. Using SNA to visually and analytically examine treatment networks prior to designing a clinical trial can lead to better designed studies.
In the USA, there is little systematic evidence about the real-world trajectories of patient medical care after hospice enrolment. The objective of this study was to analyse predictors of the length of stay for hospice patients who were admitted to hospital in a retrospective analysis of the mandatorily reported hospital discharge data.
All acute-care hospitals in Florida during 1 January 2010 to 30 June 2012.
All patients with source of admission coded as ‘hospice’ (n=2674).
Primary outcome measures
The length of stay and discharge status: (1) died in hospital; (2) discharged back to hospice; (3) discharged to another healthcare facility; and (4) discharged home.
Patients were elderly (median age=81) with a high burden of disease. Almost half died (46%), while the majority of survivors were discharged to hospice (80% of survivors, 44% of total). A minority went to a healthcare facility (5.6%) or to home (5.2%). Only 9.2% received any procedure. Respiratory services were received by 29.4% and 16.8% were admitted to the intensive care unit. The median length of stay was 1 day for those who died. In an adjusted survival model, discharge to a healthcare facility resulted in a 74% longer hospital stay compared with discharge to hospice (event time ratio (ETR)=1.74, 95% CI 1.54 to 1.97, p<0.0001), with 61% longer hospital stays among patients discharged home (ETR=1.61, 95% CI 1.39 to 1.86, p<0.0001). Total financial charges for all patients exceeded $25 million; 10% of patients who appeared to exit hospice incurred 32% of the charges.
Our results raise significant questions about the ethics and pragmatics of end-of-life medical care, and the intentions and scope of hospices in the USA. Future studies should incorporate prospective linkage of subjective patient-centred data and objective healthcare encounter data.
Epidemiology; General Medicine (see Internal Medicine)
Systematic review (SR) of randomized controlled trials (RCT) is the gold standard for informing treatment choice. Decision analyses (DA) also play an important role in informing health care decisions. It is unknown how often the results of DA and matching SR of RCTs are in concordance. We assessed whether the results of DA are in concordance with SR of RCTs matched on patient population, intervention, control, and outcomes.
We searched PubMed up to 2008 for DAs comparing at least two interventions followed by matching SRs of RCTs. Data were extracted on patient population, intervention, control, and outcomes from DAs and matching SRs of RCTs. Data extraction from DAs was done by one reviewer and from SR of RCTs by two independent reviewers.
We identified 28 DAs representing 37 comparisons for which we found matching SRs of RCTs. Results of the DAs and SRs of RCTs were in concordance in 73% (27/37) of cases. The sensitivity analyses conducted in either DAs or SRs of RCTs did not affect the concordance. Use of a single data source (4/37) versus multiple data sources (33/37) in the design of the DA model was statistically significantly associated with concordance between DAs and SRs of RCTs.
Our findings illustrate the high concordance of current DA models with SRs of RCTs. It has been shown previously that there is 50% concordance between a DA and a matching single RCT. Our study, showing 73% concordance between DAs and matching SRs of RCTs, underlines the importance of the totality of evidence (i.e. SRs of RCTs) in the design of DA models and in medical decision-making in general.
According to the threshold model, when faced with a decision under diagnostic uncertainty, physicians should administer treatment if the probability of disease is above a specified threshold and withhold treatment otherwise. The objectives of the present study are to a) evaluate if physicians act according to a threshold model, b) examine which of the existing threshold models [expected utility theory model (EUT), regret-based threshold model, or dual-processing theory] explains the physicians’ decision-making best.
A survey employing realistic clinical treatment vignettes for patients with pulmonary embolism and acute myeloid leukemia was administered to forty-one practicing physicians across different medical specialties. Participants were randomly assigned to the order of presentation of the case vignettes and re-randomized to the order of “high” versus “low” threshold case. The main outcome measure was the proportion of physicians who would or would not prescribe treatment in relation to perceived changes in threshold probability.
Fewer physicians chose to treat as the benefit/harms ratio decreased (i.e. the threshold increased) and more physicians administered treatment as the benefit/harms ratio increased (and the threshold decreased). When compared to the actual treatment recommendations, we found that the regret model was marginally superior to the EUT model [Odds ratio (OR) = 1.49; 95% confidence interval (CI) 1.00 to 2.23; p = 0.056]. The dual-processing model was statistically significantly superior to both EUT model [OR = 1.75, 95% CI 1.67 to 4.08; p < 0.001] and regret model [OR = 2.61, 95% CI 1.11 to 2.77; p = 0.018].
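The inverse relation between the benefit/harms ratio and the threshold can be made concrete with the classic expected-utility form of the threshold, p_t = 1 / (1 + B/H), where B is the net benefit of treating a patient with the disease and H is the net harm of treating a patient without it. The sketch below covers only this EUT threshold; the regret-based and dual-processing variants are not reproduced here, and the function names and numbers are illustrative assumptions.

```python
def eut_threshold(benefit, harm):
    """Expected-utility treatment threshold: probability above which to treat."""
    return 1.0 / (1.0 + benefit / harm)

def should_treat(p_disease, benefit, harm):
    """Treat when the probability of disease is above the threshold."""
    return p_disease > eut_threshold(benefit, harm)

# Hypothetical vignettes: a high benefit/harms ratio gives a low threshold,
# and a low benefit/harms ratio gives a high threshold.
low_threshold = eut_threshold(benefit=9.0, harm=1.0)   # B/H = 9
high_threshold = eut_threshold(benefit=1.0, harm=3.0)  # B/H = 1/3
```

With a disease probability of 0.3, for example, this rule recommends treatment in the first vignette and not in the second.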
We provide the first empirical evidence that physicians’ decision-making can be explained by the threshold model. Of the threshold models tested, the dual-processing theory of decision-making provides the best explanation for the observed empirical results.
Medical decision-making; Threshold model; Dual-processing theory; Regret; Expected utility theory
It is estimated that about half of currently published research cannot be reproduced. Many reasons have been offered as explanations for failure to reproduce scientific research findings, from fraud to issues related to the design, conduct, analysis, or publishing of scientific research. We also postulate a sensitive dependency on initial conditions, by which small changes can result in large differences in research findings when reproduction is attempted at later times.
We employed a simple logistic regression equation to model the effect of covariates on the initial study findings. We then fed the input from the logistic equation into a logistic map function to model stability of the results in repeated experiments over time. We illustrate the approach by modeling effects of different factors on the choice of correct treatment.
We found that reproducibility of the study findings depended both on the initial values of all independent variables and the rate of change in the baseline conditions, the latter being more important. When the changes in the baseline conditions vary by about 3.5 to about 4 in between experiments, no research findings could be reproduced. However, when the rate of change between the experiments is ≤2.5 the results become highly predictable between the experiments.
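The two-stage approach can be sketched as follows: a logistic regression produces an initial probability, which is then iterated through the logistic map x_{n+1} = r·x_n·(1 − x_n). Reading the "rate of change" in baseline conditions as the map parameter r is an interpretation, and the coefficients and covariate values below are hypothetical.

```python
import math

def logistic_regression(intercept, coefs, covariates):
    """Probability from a logistic model: 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = intercept + sum(b * x for b, x in zip(coefs, covariates))
    return 1.0 / (1.0 + math.exp(-z))

def logistic_map(x0, r, n_steps):
    """Trajectory of x_{n+1} = r * x_n * (1 - x_n), starting from x0."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

def max_divergence(x0, r, n_steps, eps=1e-6):
    """Largest gap between two runs whose initial values differ by eps."""
    a = logistic_map(x0, r, n_steps)
    b = logistic_map(x0 + eps, r, n_steps)
    return max(abs(u - v) for u, v in zip(a, b))

# Hypothetical initial finding, e.g. probability of choosing the correct treatment.
x0 = logistic_regression(-0.5, [0.8, 0.3], [1.0, 0.5])

stable = logistic_map(x0, r=2.5, n_steps=100)       # settles at 1 - 1/r = 0.6
gap_stable = max_divergence(x0, r=2.5, n_steps=100)
gap_chaotic = max_divergence(x0, r=3.9, n_steps=100)
```

At r = 2.5 a perturbation of one part in a million stays negligible, whereas at r = 3.9 the same perturbation grows until the two runs bear no resemblance to each other, which is the sense in which results become irreproducible.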
Many results cannot be reproduced because of the changes in the initial conditions between the experiments. Better control of the baseline conditions in-between the experiments may help improve reproducibility of scientific findings.
scientific research; initial conditions; reproducibility
Assessing impact of poor accrual on premature trial closure requires a relevant metric. We propose defining accrual sufficiency on apparent ability to address primary endpoints (PE) rather than attaining accrual targets.
All phase III trials opened from January 1, 1993, to December 31, 2002, by five U.S. oncology Clinical Trials Cooperative Groups (CTCG) were evaluated for accrual sufficiency and scientific results. Sufficient accrual included meeting the accrual target, CTCG documentation attesting adequate accrual, or conclusive results at interim analysis; insufficient accrual included poor accrual as the cited closure reason or other reasons rendering a trial unable to address its primary endpoints. Closure rates based on our accrual sufficiency definition were compared with rates of meeting accrual targets and addressing the primary endpoints. A percentage of target accrual above which trials commonly answer the intended scientific question was identified to serve as an alternative to meeting full target accrual in designating accrual success.
Of 238 eligible trials, 158 (66%) closed with sufficient accrual. Among 80 trials with insufficient accrual, 70 (29%) closed specifically because of poor accrual. Inadequate accrual rates are overemphasized when defining accrual success solely by meeting accrual targets. Nearly 75% of trials conclusively addressed the primary endpoints with positive results in 39% of trials. Exceeding 80% of target accrual serves as a reliable proxy for answering the intended scientific question.
Approximately one third of phase III trials closed with insufficient accrual to address the primary endpoints, primarily due to poor accrual. Defining accrual sufficiency broader than meeting accrual targets represents a fairer account of trial closures.
A major challenge for randomized phase III oncology trials is the frequent low rates of patient enrollment, resulting in high rates of premature closure due to insufficient accrual.
We conducted a pilot study to determine the extent of trial closure due to poor accrual, feasibility of identifying trial factors associated with sufficient accrual, impact of redesign strategies on trial accrual, and accrual benchmarks designating high failure risk in the clinical trials cooperative group (CTCG) setting.
A subset of phase III trials opened by five CTCGs between August 1991 and March 2004 was evaluated. Design elements, experimental agents, redesign strategies, and pretrial accrual assessment supporting accrual predictions were abstracted from CTCG documents. Percent actual/predicted accrual rate averaged per month was calculated. Trials were categorized as having sufficient or insufficient accrual based on reason for trial termination. Analyses included univariate and bivariate summaries to identify potential trial factors associated with accrual sufficiency.
Among 40 trials from one CTCG, 21 (52.5%) trials closed due to insufficient accrual. In 82 trials from five CTCGs, therapeutic trials accrued sufficiently more often than nontherapeutic trials (59% vs 27%, p = 0.05). Trials including pretrial accrual assessment more often achieved sufficient accrual than those without (67% vs 47%, p = 0.08). Fewer exclusion criteria, shorter consent forms, other CTCG participation, and trial design simplicity were not associated with achieving sufficient accrual. Trials accruing at a rate much lower than predicted (<35% actual/predicted accrual rate) were consistently closed due to insufficient accrual.
This trial subset under-represents certain experimental modalities. Data sources do not allow accounting for all factors potentially related to accrual success.
Trial closure due to insufficient accrual is common. Certain trial design factors appear associated with attaining sufficient accrual. Defining accrual benchmarks for early trial termination or redesign is feasible, but better accrual prediction methods are critically needed. Future studies should focus on identifying trial factors that allow more accurate accrual predictions and strategies that can salvage open trials experiencing slow accrual.
A 55-year-old, previously healthy woman received a diagnosis of diffuse large-B-cell lymphoma after the evaluation of an enlarged left axillary lymph node obtained on biopsy. She had been asymptomatic except for the presence of enlarged axillary lymph nodes, which she had found while bathing. She was referred to an oncologist, who performed a staging evaluation. A complete blood count and test results for liver and renal function and serum lactate dehydrogenase were normal. Positron-emission tomography and computed tomography (PET–CT) identified enlarged lymph nodes with abnormal uptake in the left axilla, mediastinum, and retroperitoneum. Results on bone marrow biopsy were normal. The patient’s oncologist recommends treatment with six cycles of cyclophosphamide, doxorubicin, vincristine, and prednisone with rituximab (CHOP-R) at 21-day intervals. Is the administration of prophylactic granulocyte colony-stimulating factor (G-CSF) with the first cycle of chemotherapy indicated?
To provide an update on recent revisions to Evaluation of Genomic Applications in Practice and Prevention (EGAPP) methods designed to improve efficiency, and an assessment of the implications of whole genome sequencing for evidence-based recommendation development. Improvements to the EGAPP approach include automated searches for horizon scanning, a quantitative ranking process for topic prioritization, and the development of a staged evidence review and evaluation process. The staged process entails (i) triaging tests with minimal evidence of clinical validity, (ii) using and updating existing reviews, (iii) evaluating clinical validity prior to analytic validity or clinical utility, (iv) using decision modeling to assess potential clinical utility when direct evidence is not available. EGAPP experience to date suggests the following approaches will be critical for the development of evidence based recommendations in the whole genome sequencing era: (i) use of triage approaches and frameworks to improve efficiency, (ii) development of evidence thresholds that consider the value of further research, (iii) incorporation of patient preferences, and (iv) engagement of diverse stakeholders. The rapid advances in genomics present a significant challenge to traditional evidence based medicine, but also an opportunity for innovative approaches to recommendation development.
evidence-based medicine/methods; evidence-based medicine/standards; genetics; genomics/methods; genomics/standards; medical/methods
In decades of clinical-trial data, new treatments are better than standard ones just over half the time. That's as it should be, say Benjamin Djulbegovic and colleagues.
Rituximab is an anti-CD-20 monoclonal antibody used in the management of lymphoproliferative disorders. The use of maintenance rituximab has improved progression free survival and overall survival in follicular lymphomas. Although rapid rituximab infusions have been studied extensively, there is little data on the use of rapid infusions during maintenance therapy for low grade lymphomas. The primary objective of this retrospective analysis was to evaluate the incidence of Grade 3 and 4 toxicities with maintenance rapid infusion rituximab according to the Common Terminology Criteria for Adverse Events version 4 (CTC v. 4). Secondary objectives included evaluating all grade infusion related adverse events and correlation of adverse events with varying schedules of rituximab maintenance therapy. All patients who received rapid infusion rituximab as maintenance therapy for low grade lymphoma between December 2007 and November 2011 were included. Rapid rituximab infusions were administered over 90 minutes. Demographic, laboratory and clinical data were collected. A total of 109 patients received 647 rapid rituximab infusions. Three patients experienced an adverse reaction which resulted in one grade 1 infusion reaction and three grade 3 infusion reactions. No patients required hospitalization. All 3 patients received pharmacological and/or supportive care to relieve symptoms associated with the reaction.
The proportion of proposed new treatments that are ‘successful’ is of ethical, scientific, and public importance. We investigated how often new, experimental treatments evaluated in randomized controlled trials (RCTs) are superior to established treatments.
Our main question was: “On average how often are new treatments more effective, equally effective or less effective than established treatments?” Additionally, we wanted to explain the observed results, i.e. whether the observed distribution of outcomes is consistent with the ‘uncertainty requirement’ for enrollment in RCTs. We also investigated the effect of choice of comparator (active versus no treatment/placebo) on the observed results.
We searched the Cochrane Methodology Register (CMR) 2010, Issue 1 in The Cochrane Library (searched 31 March 2010); MEDLINE Ovid 1950 to March Week 2 2010 (searched 24 March 2010); and EMBASE Ovid 1980 to 2010 Week 11 (searched 24 March 2010).
Cohorts of studies were eligible for the analysis if they met all of the following criteria: (i) consecutive series of RCTs, (ii) registered at or before study onset, and (iii) compared new against established treatments in humans.
Data collection and analysis
Four cohorts met all inclusion criteria and provided data from 743 RCTs involving 297,744 patients. All four cohorts consisted of publicly funded trials. Two cohorts involved evaluations of new treatments in cancer, one in neurological disorders, and one for mixed types of diseases. We employed kernel density estimation, meta-analysis and meta-regression to assess the probability of new treatments being superior to established treatments in their effect on primary outcomes and overall survival.
The distribution of effects seen was generally symmetrical in the size of difference between new versus established treatments. Meta-analytic pooling indicated that, on average, new treatments were slightly more favorable both in terms of their effect on reducing the primary outcomes (hazard ratio (HR)/odds ratio (OR) 0.91, 99% confidence interval (CI) 0.88 to 0.95) and improving overall survival (HR 0.95, 99% CI 0.92 to 0.98). No heterogeneity was observed in the analysis based on primary outcomes or overall survival (I² = 0%). Kernel density analysis was consistent with the meta-analysis, but showed a fairly symmetrical distribution of new versus established treatments indicating unpredictability in the results. This was consistent with the interpretation that new treatments are only slightly superior to established treatments when tested in RCTs. Additionally, meta-regression demonstrated that results have remained stable over time and that the success rate of new treatments has not changed over the last half century of clinical trials. The results were not significantly affected by the choice of comparator (active versus placebo/no therapy).
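Meta-analytic pooling of ratio measures with a 99% confidence interval and an I² statistic can be sketched as below. The abstract does not state which pooling model was used, so the fixed-effect inverse-variance method shown is an assumption, and the three trials are hypothetical. Standard errors are recovered from reported 95% CIs on the log scale.

```python
import math

Z95 = 1.96   # normal quantile for a two-sided 95% CI
Z99 = 2.576  # normal quantile for a two-sided 99% CI

def se_from_ci(lower, upper, z=Z95):
    """Standard error of log(ratio), recovered from a reported CI."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)

def pool_inverse_variance(trials, z=Z99):
    """trials: list of (ratio, se_of_log_ratio).

    Returns (pooled ratio, CI lower, CI upper, I^2 in percent).
    """
    logs = [math.log(ratio) for ratio, _ in trials]
    weights = [1.0 / se ** 2 for _, se in trials]
    w_sum = sum(weights)
    pooled_log = sum(w * y for w, y in zip(weights, logs)) / w_sum
    pooled_se = math.sqrt(1.0 / w_sum)
    # Cochran's Q and the I^2 heterogeneity statistic.
    q = sum(w * (y - pooled_log) ** 2 for w, y in zip(weights, logs))
    df = len(trials) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - z * pooled_se),
        math.exp(pooled_log + z * pooled_se),
        i2,
    )

# Hypothetical trials: (hazard ratio, 95% CI lower, 95% CI upper).
reported = [(0.85, 0.70, 1.03), (0.95, 0.80, 1.13), (0.92, 0.75, 1.13)]
trials = [(hr, se_from_ci(lo, hi)) for hr, lo, hi in reported]
pooled, ci_low, ci_high, i2 = pool_inverse_variance(trials)
```

A pooled ratio below 1 with a CI that excludes 1 would correspond to the "slightly more favorable" pattern described above; an I² of 0% means the observed spread is no larger than chance alone would produce.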
Society can expect that slightly more than half of new experimental treatments will prove to be better than established treatments when tested in RCTs, but few will be substantially better. This is an important finding for patients (as they contemplate participation in RCTs), researchers (as they plan design of the new trials), and funders (as they assess the ‘return on investment’). Although we provide the current best evidence on the question of expected ‘success rate’ of new versus established treatments consistent with a priori theoretical predictions reflective of ‘uncertainty or equipoise hypothesis’, it should be noted that our sample represents less than 1% of all available randomized trials; therefore, one should exercise the appropriate caution in interpretation of our findings. In addition, our conclusion applies to publicly funded trials only, as we did not include studies funded by commercial sponsors in our analysis.
To assess whether the reported methodological quality of randomized controlled trials (RCTs) reflects their actual methodological quality, and to evaluate the association of effect size (ES) and sample size with methodological quality.
Retrospective analysis of all consecutive phase III RCTs published by 8 National Cancer Institute Cooperative Groups until year 2006. Data were extracted from protocols (actual quality) and publications (reported quality) for each study.
429 RCTs met the inclusion criteria. Overall reporting of methodological quality was poor and did not reflect the actual high methodological quality of RCTs. The results showed no association between sample size and actual methodological quality of a trial. Poor reporting of allocation concealment and blinding exaggerated the ES by 6% (ratio of hazard ratio [RHR]: 0.94, 95%CI: 0.88, 0.99) and 24% (RHR: 1.24, 95%CI: 1.05, 1.43), respectively. However, actual quality assessment showed no association between ES and methodological quality.
The largest study to date shows that poor quality of reporting does not reflect the actual high methodological quality. Assessment of the impact of quality on the ES based on reported quality can produce misleading results.
To assess the adherence to antiretroviral therapy (ART) in the human immunodeficiency virus (HIV)-infected population in India.
Systematic review and meta-analysis.
Materials and Methods:
The Medline and Cochrane Library databases were searched. Any prospective or retrospective study enrolling a minimum of 10 subjects with a primary objective of assessing ART adherence in the HIV population in India was included. Data were extracted on adherence definition, adherence estimates, study design, study population characteristics, recall period and assessment method. For meta-analysis, the pooled proportion was calculated as a back-transform of the weighted mean of the transformed proportions (calculated according to the Freeman-Tukey variant of the arcsine square root) using the random effects model.
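The pooling step described above can be sketched as follows. The sketch makes several assumptions that the abstract does not specify: the Freeman-Tukey double arcsine form t = asin(√(x/(n+1))) + asin(√((x+1)/(n+1))) with variance 1/(n + 0.5), a DerSimonian-Laird estimate of the between-study variance, and the simple back-transform sin²(t/2) in place of more exact inversions. The study counts are hypothetical.

```python
import math

def freeman_tukey(events, n):
    """Double arcsine transform of a proportion and its approximate variance."""
    t = (math.asin(math.sqrt(events / (n + 1)))
         + math.asin(math.sqrt((events + 1) / (n + 1))))
    return t, 1.0 / (n + 0.5)

def back_transform(t):
    """Approximate inverse of the double arcsine transform."""
    return math.sin(t / 2.0) ** 2

def pool_proportions(studies, z=1.96):
    """studies: list of (events, n). Returns (pooled, CI lower, CI upper, tau^2)."""
    ts, vs = zip(*(freeman_tukey(x, n) for x, n in studies))
    w = [1.0 / v for v in vs]
    fixed = sum(wi * ti for wi, ti in zip(w, ts)) / sum(w)
    # DerSimonian-Laird between-study variance.
    q = sum(wi * (ti - fixed) ** 2 for wi, ti in zip(w, ts))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(studies) - 1)) / c) if c > 0 else 0.0
    # Random-effects weights and weighted mean on the transformed scale.
    w_re = [1.0 / (v + tau2) for v in vs]
    pooled_t = sum(wi * ti for wi, ti in zip(w_re, ts)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return (
        back_transform(pooled_t),
        back_transform(pooled_t - z * se),
        back_transform(pooled_t + z * se),
        tau2,
    )

# Hypothetical studies: (number adherent, number enrolled).
studies = [(70, 100), (45, 80), (160, 200), (30, 60)]
pooled, ci_low, ci_high, tau2 = pool_proportions(studies)
```

The transform stabilizes the variance of proportions near 0 or 1, which is why the weighted mean is taken on the transformed scale and only the final estimate and its interval are converted back to proportions.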
There were seven cross-sectional studies and one retrospective study enrolling 1666 participants. Publication bias was significant (P = 0.003). Pooled results showed an ART adherence rate of 70% (95% confidence interval: 59–81%, I² = 96.3%). Sensitivity analyses based on study design, adherence assessment method and study region did not influence adherence estimates. Fifty percent (4/8) of the studies reported cost of medication as the most common obstacle for ART adherence. Twenty-five percent (2/8) reported lack of access to medication as the reason for non-adherence and 12% (1/8) cited adverse events as the most prevalent reason for non-adherence. The overall methodological quality of the included studies was poor.
Pooled results show that overall ART adherence in India is below the required levels to have an optimal treatment effect. The quality of studies is poor and cannot be used to guide policies to improve ART adherence.
Anti-retroviral therapy; India; systematic review; treatment compliance
To assess if commercially sponsored trials are associated with higher success rates than publicly-sponsored trials.
Study Design and Settings
We undertook a systematic review of all consecutive, published and unpublished phase III cancer randomized controlled trials (RCTs) conducted by GlaxoSmithKline (GSK) and the NCIC Clinical Trials Group (CTG). We included all phase III cancer RCTs assessing treatment superiority from 1980 to 2010. Three metrics were assessed to determine treatment successes: (1) the proportion of statistically significant trials favouring the experimental treatment, (2) the proportion of the trials in which new treatments were considered superior according to the investigators, and (3) quantitative synthesis of data for primary outcomes as defined in each trial.
GSK conducted 40 cancer RCTs accruing 19,889 patients and CTG conducted 77 trials enrolling 33,260 patients. 42% (99%CI 24 to 60) of the results were statistically significant favouring experimental treatments in GSK compared to 25% (99%CI 13 to 37) in the CTG cohort (RR = 1.68; p = 0.04). Investigators concluded that new treatments were superior to standard treatments in 80% of GSK compared to 44% of CTG trials (RR = 1.81; p<0.001). Meta-analysis of the primary outcome indicated larger effects in GSK trials (odds ratio = 0.61 [99%CI 0.47–0.78] compared to 0.86 [0.74–1.00]; p = 0.003). However, testing for the effect of treatment over time indicated that treatment success has become comparable in the last decade.
While overall industry sponsorship is associated with higher success rates than publicly-sponsored trials, the difference seems to have disappeared over time.
Lung cancer is considered a terminal illness with a five-year survival rate of about 16%. Informed decision-making related to the management of a disease requires accurate prognosis of the disease with or without treatment. Despite the significance of disease prognosis in clinical decision-making, systematic assessment of prognosis in patients with lung cancer without treatment has not been performed. We conducted a systematic review and meta-analysis of the natural history of patients with confirmed diagnosis of lung cancer without active treatment, to provide evidence-based recommendations for practitioners on management decisions related to the disease. Specifically, we estimated overall survival when no anticancer therapy is provided.
Relevant studies were identified by search of electronic databases and abstract proceedings, review of bibliographies of included articles, and contacting experts in the field. All prospective or retrospective studies assessing prognosis of lung cancer patients without treatment were eligible for inclusion. Data on mortality was extracted from all included studies. Pooled proportion of mortality was calculated as a back-transform of the weighted mean of the transformed proportions using the random-effects model. To perform meta-analysis of median survival, published methods were used to pool the estimates as mean and standard error under the random-effects model. Methodological quality of the studies was examined.
Seven cohort studies (4,418 patients) and 15 randomized controlled trials (1,031 patients) were included in the meta-analysis. All studies assessed mortality without treatment in patients with non-small cell lung cancer (NSCLC). The pooled proportion of mortality without treatment in cohort studies was 0.97 (95% CI: 0.96 to 0.99) and 0.96 in randomized controlled trials (95% CI: 0.94 to 0.98) over median study periods of eight and three years, respectively. When data from cohort and randomized controlled trials were combined, the pooled proportion of mortality was 0.97 (95% CI: 0.96 to 0.98). Test of interaction showed a statistically non-significant difference between subgroups of cohort and randomized controlled trials. The pooled mean survival for patients without anticancer treatment in cohort studies was 11.94 months (95% CI: 10.07 to 13.8) and 5.03 months (95% CI: 4.17 to 5.89) in RCTs. For the combined data (cohort studies and RCTs), the pooled mean survival was 7.15 months (95% CI: 5.87 to 8.42), with a statistically significant difference between the two designs. Overall, the studies were of moderate methodological quality.
Systematic evaluation of evidence on prognosis of NSCLC without treatment shows that mortality is very high. Untreated lung cancer patients live on average for 7.15 months. Although limited by study design, these findings provide the basis for future trials to determine optimal expected improvement in mortality with innovative treatments.
Best supportive care; Natural history; Meta-analysis; Palliative care; Placebo
Despite advances in the understanding of clinical, genetic, and molecular aspects of multiple myeloma (MM) and the availability of more effective therapies, MM remains incurable. The autologous-allogeneic (auto-allo) hematopoietic cell transplantation (HCT) strategy is based on combining cytoreduction from high-dose (chemo- or chemoradio)-therapy with adoptive immunotherapy. However, conflicting results have been reported when an auto-allo HCT approach is compared to tandem autologous (auto-auto) HCT. A previously published meta-analysis suffers from serious methodological flaws.
A systematic search identified 152 publications, of which five studies (enrolling 1538 patients) met inclusion criteria. All studies eligible for inclusion utilized biologic randomization.
Response rates, assessed as achievement of at least a very good partial response, did not differ between the treatment arms [risk ratio (RR) (95% CI) = 0.97 (0.87-1.09), p = 0.66], but complete remission was higher in the auto-allo HCT arm [RR = 1.65 (1.25-2.19), p = 0.0005]. Event-free survival did not differ between the auto-allo HCT group and the auto-auto HCT group using per-protocol analysis [hazard ratio (HR) = 0.78 (0.58-1.05), p = 0.11] or intention-to-treat analysis [HR = 0.83 (0.60-1.15), p = 0.26]. Overall survival (OS) did not differ between the treatment arms whether analyzed per-protocol [HR = 0.88 (0.33-2.35), p = 0.79] or by intention-to-treat [HR = 0.80 (0.48-1.32), p = 0.39]. Non-relapse mortality (NRM) was significantly worse with auto-allo HCT [RR (95% CI) = 3.55 (2.17-5.80), p < 0.00001].
Despite higher complete remission rates, auto-allo HCT does not improve OS and results in higher NRM in patients with newly diagnosed MM. At present, the totality of evidence suggests that an auto-allo HCT approach for patients with newly diagnosed myeloma should not be offered outside the setting of a clinical trial.
Autologous hematopoietic stem cell transplantation; Allogeneic hematopoietic stem cell transplantation; Multiple myeloma; Systematic review
The conduct of a randomized controlled trial (RCT) is deemed ethical only if we are in a state of “equipoise” as to which treatment would be most beneficial for the patients. Individual equipoise applies to an individual clinician or a member of an ethics committee or institutional review board (IRB), whilst collective equipoise refers to the profession as a whole. It is argued that physicians are not bound by equipoise but that their actions are directed by the confines of expert opinion. Experts can agree or disagree in various proportions on the merit of a given treatment. Hence, collective equipoise will often be incomplete. In turn, the opinions of content experts in the field of the proposed trial influence the IRB members’ decision regarding trial approval.
We conducted a survey of IRB members at the University of South Florida and of IRB members attending a bioethics conference held in Clearwater, Florida, USA. The survey was made available as a hard copy (paper based) and included six hypothetical scenarios outlining clinical trials, designed to measure collective equipoise. We defined collective equipoise as the situation in which survey participants were equally split (50:50) in their decision regarding whether a proposed clinical trial would be ethical to conduct. The opinion of 100 experts in the field, expressed as the proportion of experts favoring treatment A vs. B in each of the five scenarios, was made available to the participants.
The response rate of our survey was 33% (71/218). Fifty percent of the IRB members would approve an RCT addressing the efficacy of two drugs for the management of headache even if 80% of experts favored one treatment over another (median: 80%; third quartile: 80%). Similarly, half of participating IRB members would approve the study when the median distribution of equipoise among experts was 70% (70 in favor of treatment A vs. 30 in favor of treatment B) for treatment of leukemia, 60% for treatment of geriatric patients and 70% for treatment of newborns. Half of IRB members would approve the study when the median distribution of equipoise among experts was 70% for treatment of leukemia in dogs and 85% for leukemia in rats (and 25% of IRB members would approve such a study even if 100% of experts favored one treatment over another). None of the demographic features of respondents affected collective equipoise.
This is the first study assessing collective equipoise among ethical committee/IRB members. Our study findings show that IRB members perceived that conduct of a trial enrolling humans is unethical when the equipoise level is beyond 80% (80:20 distribution of uncertainty). IRB members require a higher level of equipoise when it comes to testing a new drug in humans than in animals. A relatively high level of equipoise is needed for IRB members to be comfortable to approve trials involving life-threatening situations, children and elderly patients.
Randomized controlled trial; Ethics; Ethical committees
Prognostic models are often used to estimate the length of patient survival. The Cox proportional hazards model has traditionally been applied to assess the accuracy of prognostic models. However, it may be suboptimal because it cannot flexibly model the baseline survival function and because the proportional hazards assumption may be violated. The aim of this study was to use internal validation to compare the predictive power of a flexible Royston-Parmar family of survival functions with the Cox proportional hazards model. We applied the Palliative Performance Scale to a dataset of 590 hospice patients at the time of hospice admission. The retrospective data were obtained from the Lifepath Hospice and Palliative Care center in Hillsborough County, Florida, USA. The criteria used to evaluate and compare the models' predictive performance were the explained variation statistic R², the scaled Brier score, and the discrimination slope. The explained variation statistic demonstrated that overall the Royston-Parmar family of survival functions provided a better fit (R² = 0.298; 95% CI: 0.236–0.358) than the Cox model (R² = 0.156; 95% CI: 0.111–0.203). The scaled Brier scores and discrimination slopes were consistently higher under the Royston-Parmar model. Researchers involved in prognosticating patient survival are encouraged to consider the Royston-Parmar model as an alternative to Cox.
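Two of the performance measures named above can be illustrated with a short Python sketch. It uses the common definitions for a binary outcome at a fixed time horizon (for example, death within a given number of days) and ignores censoring, which is a simplifying assumption; the study's survival setting would call for censoring-aware versions. The function names and the example values are hypothetical.

```python
def scaled_brier_score(outcomes, predictions):
    """Scaled Brier score: 1 - Brier / Brier_max.

    Brier is the mean squared difference between predicted risk and the
    observed 0/1 outcome. Brier_max is the score of a non-informative
    model that predicts the observed event rate for everyone.
    """
    n = len(outcomes)
    brier = sum((p - y) ** 2 for y, p in zip(outcomes, predictions)) / n
    event_rate = sum(outcomes) / n
    brier_max = event_rate * (1.0 - event_rate)
    if brier_max == 0.0:
        raise ValueError("outcomes must contain both events and non-events")
    return 1.0 - brier / brier_max


def discrimination_slope(outcomes, predictions):
    """Mean predicted risk among events minus mean among non-events."""
    with_event = [p for y, p in zip(outcomes, predictions) if y == 1]
    without_event = [p for y, p in zip(outcomes, predictions) if y == 0]
    if not with_event or not without_event:
        raise ValueError("outcomes must contain both events and non-events")
    return (sum(with_event) / len(with_event)
            - sum(without_event) / len(without_event))


if __name__ == "__main__":
    # Hypothetical outcomes (1 = died within the horizon) and predicted risks.
    died = [1, 0, 1, 1, 0]
    risk = [0.8, 0.3, 0.6, 0.9, 0.4]
    print(scaled_brier_score(died, risk), discrimination_slope(died, risk))
```

For both measures, higher values indicate better predictive performance, which is the sense in which the abstract compares the two models.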
In recent years, various authors have proposed that the concept of equipoise be abandoned since it conflates the practice of clinical care with clinical research. At the same time, the equipoise opponents acknowledge the necessity of clinical research if there are unresolved uncertainties about the effects of proposed healthcare interventions. Since equipoise represents just one measure of uncertainty, proposals to abandon equipoise while maintaining a requirement for addressing uncertainties are contradictory and ultimately not valid. As acknowledgment and articulation of uncertainties represent key scientific and moral requirements for human experimentation, the concept of equipoise remains the most useful framework to link the theory of human experimentation with the theory of rational choice. In this paper, I show how uncertainty (equipoise) is at the intersection between epistemology, decision-making and ethics of clinical research. In particular, I show how our formulation of responses to uncertainties of hoped-for benefits and unknown harms of testing is a function of the way humans cognitively process information. This approach is based on the view that considerations of ethics and rationality cannot be separated. I analyze the response to uncertainties as it relates to the dual-processing theory, which postulates that a rational approach to (clinical research) decision-making depends both on analytical, deliberative processes embodied in the scientific method (system II) and on “good” human intuition (system I). Ultimately, our choices can only become wiser if we understand the close and intertwined relationship between irreducible uncertainty, inevitable errors, and unavoidable injustice.
Clinical Equipoise; Informed Consent; Clinical Research; Research Ethics
Dual processing theory of human cognition postulates that reasoning and decision-making can be described as a function of both an intuitive, experiential, affective system (system I) and an analytical, deliberative system (system II). To date, no formal descriptive model of medical decision-making based on dual processing theory has been developed. Here we postulate such a model and apply it to a common clinical situation: whether treatment should be administered to a patient who may or may not have a disease.
We developed a mathematical model in which we linked a recently proposed descriptive psychological model of cognition with the threshold model of medical decision-making and show how this approach can be used to better understand decision-making at the bedside and explain the widespread variation in treatments observed in clinical practice.
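For orientation, the classic expected-utility form of the threshold model can be sketched in Python. This shows only the system II (normative) threshold, p_t = 1 / (1 + B/H), where B is the net benefit of treating a patient who has the disease and H is the net harm of treating a patient who does not; treatment is favored when the probability of disease exceeds p_t. It is not the dual-processing model itself, whose formula is not given in this abstract. The function names and the example numbers are hypothetical.

```python
def treatment_threshold(benefit, harm):
    """Expected-utility treatment threshold p_t = 1 / (1 + B / H).

    benefit: net benefit of treating a patient who has the disease (B > 0).
    harm:    net harm of treating a patient who does not (H > 0).
    Both must be expressed in the same outcome units.
    """
    if benefit <= 0 or harm <= 0:
        raise ValueError("benefit and harm must be positive")
    return 1.0 / (1.0 + benefit / harm)


def should_treat(p_disease, benefit, harm):
    """Treat when the probability of disease exceeds the threshold."""
    return p_disease > treatment_threshold(benefit, harm)


if __name__ == "__main__":
    # Hypothetical values: benefit nine times larger than harm.
    print(treatment_threshold(9.0, 1.0))
    print(should_treat(0.25, 9.0, 1.0))
```

The larger the benefit relative to the harm, the lower the probability of disease at which treatment is justified; a descriptive model would then shift this threshold up or down depending on how system I weighs the same benefits and harms.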
We show that physicians’ beliefs about whether to treat at higher (lower) probability levels compared to the prescriptive therapeutic thresholds obtained via system II processing are moderated by system I and by the ratio of benefits and harms as evaluated by both system I and II. Under some conditions, the system I decision maker’s threshold may drop dramatically below the expected utility threshold derived by system II. This can explain the overtreatment often seen in contemporary practice. The opposite can also occur, as in situations where empirical evidence is considered unreliable or when the cognitive processes of decision-makers are biased by recent experience: the threshold will then increase relative to the normative expected utility threshold derived via system II. This inclination toward higher diagnostic certainty may, in turn, explain the undertreatment that is also documented in current medical practice.
We have developed the first dual processing model of medical decision-making, which has the potential to enrich the current medical decision-making field, still dominated to a large extent by expected utility theory. The model also provides a platform for reconciling two groups of competing dual processing theories (parallel competitive with default-interventionalist theories).