Allogeneic stem cell transplantation (allo-SCT) outcomes in patients with Hodgkin lymphoma (HL) remain poorly defined. We performed a meta-analysis of allo-SCT studies in HL patients. The primary endpoints were 6-month, 1-year, 2-year, and 3-year relapse-free survival (RFS) and overall survival (OS). A total of 42 series from 38 reports (1,850 patients) were included. The pooled estimates (95%CI) for 6-month, 1-year, 2-year, and 3-year RFS were 77 (59–91)%, 50 (42–57)%, 37 (31–43)%, and 31 (25–37)%, respectively. The corresponding numbers for OS were 83 (75–91)%, 68 (62–74)%, 58 (52–64)%, and 50 (41–58)%, respectively. There was statistical heterogeneity among studies in all outcomes. In meta-regression, accrual initiation in 2000 or later was associated with higher 6-month (P = 0.012) and 1-year OS (P = 0.046), and pre-SCT remission with higher 2-year OS (P = 0.047) and 1-year RFS (P = 0.016). In conclusion, outcomes of allo-SCT in HL have improved over time, with 5–10% lower non-relapse mortality and relapse rates and 15–20% higher RFS and OS in studies that initiated accrual in 2000 or later compared to earlier studies. However, there is no apparent survival plateau, demonstrating the need to improve on current allo-SCT strategies in relapsed/refractory HL.
Although frontline combination chemotherapy cures most patients with Hodgkin lymphoma (HL), relapse remains a major problem. High-dose chemotherapy followed by autologous stem cell transplantation (auto-SCT) cures over 50% of patients with relapsed disease1, with significant superiority in progression-free survival over chemotherapy alone2. Relapse after auto-SCT is a challenge in the treatment of patients with HL because they are unlikely to be cured with conventional chemotherapy alone. Allogeneic stem cell transplantation (allo-SCT) has been used in this setting since the early 1980s with varying degrees of success3. Relapse and transplant-related mortality (TRM), most commonly due to graft-versus-host disease (GvHD) and infections, are the two main obstacles to allo-SCT in HL. With the advent of reduced-intensity (RI) conditioning regimens and mini-allografts, the rates of TRM have declined and the graft-versus-disease effect, rather than high-dose therapy, is now recognized as the primary mechanism for long-term remissions after allo-SCT4.
The recently published guidelines of the American Society of Blood and Marrow Transplantation recommend allo-SCT as a preferred strategy over standard chemotherapy for relapse following auto-SCT5. This was a grade B recommendation due to the lack of adequate data and the sporadic nature of the available reports. Similarly, the European Society for Blood and Marrow Transplantation currently recommends that allo-SCT can be considered as the standard treatment option in patients with chemosensitive relapse after auto-SCT6. The purpose of the present study was to determine the outcomes of allo-SCT in HL patients using a systematic review and meta-analysis.
This study was conducted in accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement7. We performed electronic searches of Medline and Embase from inception until 1 June 2015. The following terms were used for Medline search: allogeneic AND (Hodgkin OR Hodgkin's) NOT non-Hodgkin NOT non-Hodgkin's. The following were used for Embase search: 'allogeneic':ti AND 'hodgkin':ti.
Studies were included if patients had HL and underwent allo-SCT. Only full-text articles published in English were considered. Studies were included in data extraction if they reported at least one of the two primary endpoints (see below). Duplicates were first removed from the search results. The remaining reports were then screened by scanning titles and abstracts for the following exclusion criteria: reviews or meta-analyses not reporting primary data, abstracts, conference proceedings, commentaries, editorials, and no primary endpoints reported. References cited in the included articles were manually searched to find any additional reports. The corresponding authors of the retrieved studies with missing information were contacted for additional data. A.R. and M.E. independently reviewed the included studies, collected the data, and resolved discrepancies by consensus.
Study quality was assessed using a 5-factor scoring system. The factors were (i) conditioning regimen(s), (ii) stem cell source, (iii) donor, (iv) GvHD prophylaxis regimen, and (v) disease status before allo-SCT. For each scoring factor, studies received a score of 1 if the corresponding information was provided in the report and zero otherwise. The total quality score (range 0–5) was calculated by adding the scores for individual factors. A higher total score indicated higher study quality. These scores were not a basis for inclusion or exclusion of studies.
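The scoring rule above amounts to counting how many of the five items a report provides. A minimal Python sketch follows; the factor labels and the dictionary layout are hypothetical names chosen for illustration, not part of the original data extraction form.

```python
# Hypothetical labels for the five reporting factors listed in the text.
FACTORS = (
    "conditioning_regimen",
    "stem_cell_source",
    "donor",
    "gvhd_prophylaxis",
    "pre_sct_disease_status",
)

def quality_score(reported):
    """Return the total quality score (0-5): one point per factor the report provides."""
    return sum(1 for factor in FACTORS if reported.get(factor))

# Hypothetical report that describes only the conditioning regimen and the donor.
example_report = {"conditioning_regimen": True, "donor": True}
print(quality_score(example_report))
```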
The primary endpoints were overall survival (OS) and relapse-free survival (RFS) at 6 months, 1 year, 2 years, and 3 years, measured from the time of allo-SCT. Secondary endpoints were the cumulative incidence of relapse (CIR) and non-relapse mortality (NRM). The proportions and standard errors (SE) were calculated by the following formulas: p = r/n and SE(p) = √[p(1 − p)/n], where r is the number of patients who experienced the event of interest at the time point of interest and n is the sample size. Proportions were transformed to logit event estimates by lp = ln[p/(1 − p)]. Variances were calculated by Var(lp) = 1/[np(1 − p)]. Study heterogeneity was assessed using the Cochran Q test and quantified using I² = (Q − degrees of freedom) × 100/Q, where degrees of freedom equals k − 1 and k is the number of studies. A random-effects model was used to calculate pooled proportions with 95% confidence intervals (95%CI).8 All estimates are presented as proportions with two-sided 95%CI. Publication bias was assessed using funnel plots and the Egger test. Sensitivity analysis for each outcome was performed by removing, one at a time, the studies that fell outside the 95% confidence interval of the estimated pooled proportion and re-analyzing the reduced dataset. If the new pooled proportion differed by more than 10% from the estimate made on the full dataset, the removed study was considered overly influential. Details of such studies were further investigated.
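The pooling steps described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the STATA code used for the analysis: it assumes the DerSimonian-Laird estimator for the between-study variance (the text specifies only a random-effects model), a normal-approximation 95% interval on the logit scale, and every study having 0 < r < n. The function names and the example counts are hypothetical.

```python
import math

def logit_and_variance(r, n):
    """Logit of the proportion r/n and its approximate variance 1/[np(1 - p)]."""
    p = r / n
    return math.log(p / (1 - p)), 1 / (n * p * (1 - p))

def inverse_logit(x):
    """Back-transform a logit value to a proportion."""
    return 1 / (1 + math.exp(-x))

def pool_random_effects(studies):
    """Pool (r, n) pairs on the logit scale.

    Assumes DerSimonian-Laird weights and at least two studies.
    Returns (pooled proportion, lower 95% limit, upper 95% limit, I^2 in %).
    """
    effects, variances = zip(*(logit_and_variance(r, n) for r, n in studies))
    w = [1 / v for v in variances]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran Q: weighted squared deviations from the fixed-effect estimate.
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(studies) - 1
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c)  # between-study variance, floored at zero
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    i2 = max(0.0, (q - df) * 100 / q) if q > 0 else 0.0
    return (
        inverse_logit(pooled),
        inverse_logit(pooled - 1.96 * se),
        inverse_logit(pooled + 1.96 * se),
        i2,
    )

# Hypothetical (events, sample size) pairs, not data from the included studies.
example = [(12, 20), (30, 45), (8, 25), (50, 80)]
estimate, lower, upper, i2 = pool_random_effects(example)
print(f"{estimate:.2f} ({lower:.2f}-{upper:.2f}), I^2 = {i2:.0f}%")
```

Pooling on the logit scale keeps the back-transformed confidence limits inside the 0–1 range, which matters for outcomes close to 0% or 100%.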
Meta-regression was used to determine the independent effect of the following variables on primary and secondary outcomes: median age, accrual initiation year (before 2000 vs. later), gender (male percentage), study type (prospective vs. retrospective), study scale (single- vs. multicenter), use of total body irradiation (TBI) in conditioning (TBI- vs. non-TBI-based), stem cell source (peripheral blood vs. bone marrow), previous radiation (%), previous auto-SCT (%), conditioning intensity (reduced-intensity vs. myeloablative), HLA match degree (matched vs. haploidentical/cord), and disease status before allo-SCT (percentage of partial/complete remission [CR/PR] vs. stable/progressive disease). Variables were entered into the regression model if they were significantly correlated with the outcome of interest in univariate analysis. Logarithmically transformed values of the dependent variable were used in these analyses. For regression analysis for the effect of TBI, stem cell source, or HLA match degree, only studies with uniformity with respect to the variable of interest (e.g. all transplants using peripheral blood stem cells) were included in the model. STATA 13 (Stata, College Station, TX) was used for analysis. P < 0.05 was considered statistically significant.
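As a simplified illustration of the regression idea, the sketch below fits a weighted least-squares line of the log-transformed outcome on a single binary study-level covariate. It is a stand-in, not the STATA meta-regression used here: weighting by sample size is an assumption, no P values are computed, and the series values and the function name are hypothetical.

```python
import math

def weighted_line(x, y, w):
    """Weighted least-squares intercept and slope of y on x."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

# Hypothetical series: (1-year OS proportion, sample size, accrual started in 2000 or later).
series = [(0.50, 40, 0), (0.55, 60, 0), (0.78, 30, 1), (0.82, 50, 1)]
y = [math.log(p) for p, _, _ in series]   # log-transformed dependent variable
w = [n for _, n, _ in series]             # assumed weights: sample size
x = [late for _, _, late in series]       # 0 = before 2000, 1 = 2000 or later

intercept, slope = weighted_line(x, y, w)
# exp(slope) is the ratio of the weighted geometric-mean proportions between the two groups.
print(f"slope = {slope:.3f}, ratio = {math.exp(slope):.2f}")
```

With a binary covariate the slope equals the difference between the two groups' weighted mean log proportions, so a positive slope here corresponds to a higher outcome in the later-accrual group.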
The study flow diagram is shown in Figure 1. A total of 38 reports were eligible for analysis9–46. The studies by Chen et al.18, Burroughs et al.15, and Sureda et al.42 were split into 2, 3, and 2 series, respectively, because they contained detailed information on important variables of interest in our study. Therefore, the total number of series included in the meta-analysis was 42 (Tables 1 and 2). The sample size of the included studies ranged between 5 and 285, with a total of 1,850 patients included in the study-level, summary-based meta-analysis. Twenty-eight studies were retrospective and 14 were prospective. Twenty-one studies were single-center and 21 were multicenter. Two studies scored 1, two scored 2, one scored 3, nine scored 4, and 28 scored 5 in quality assessment. Conditioning was myeloablative in 7, RI in 30, and mixed or unknown in the remaining studies. Conditioning was TBI-based in 6, non-TBI-based in 15, and mixed or unknown in the remaining studies. Uniform use of ATG, alemtuzumab, and post-transplant cyclophosphamide was present in two, one, and two studies, respectively. The source of stem cells was marrow in 8, peripheral blood in 9, and mixed or unknown in the remaining studies. Transplants used an HLA-matched donor in 20 studies, a mismatched donor (haploidentical or cord blood) in 4, and a mixed or unknown donor type in the remaining studies. The median follow-up of patients ranged between 11 and 104 months.
The pooled estimates (95%CI) for RFS at 6 months, 1 year, 2 years, and 3 years were 77 (59–91)%, 50 (42–57)%, 37 (31–43)%, and 31 (25–37)%, respectively (Figure 2). The corresponding numbers for OS were 83 (75–91)%, 68 (62–74)%, 58 (52–64)%, and 50 (41–58)%, respectively (Figure 3). The pooled estimates (95%CI) for CIR at 6 months, 1 year, 2 years, and 3 years were 15 (6–27)%, 34 (30–39)%, 42 (35–49)%, and 46 (40–51)%, respectively (Figure S1). The corresponding numbers for NRM were 13 (6–22)%, 19 (14–24)%, 19 (14–25)%, and 19 (14–24)%, respectively (Figure S2).
There was significant statistical heterogeneity across studies in all studied outcomes at all measured times (Appendix). Figures S3–S6 show the funnel plots for the estimated outcomes. The plots and the Egger’s bias coefficients argued against the presence of publication bias with the exception of 2-year NRM and 2-year OS, where there was a lack of small studies with high NRM (P = 0.04) and low OS (P = 0.02) rates. In sensitivity analysis, studies with a significant influence on outcome were Phillips et al.32 (6-month CIR change by −13% after study removal and 6-month NRM change by −20% after study removal) and Garcias et al.22 (3-year RFS change by −10% after study removal). No single study was found to be overly influential with regard to other outcomes.
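The one-at-a-time removal check used in the sensitivity analysis can be sketched as follows. This is a minimal illustration with several assumptions: the 10% threshold is read as an absolute difference in the pooled proportion, `simple_pool` is a placeholder (a sample-size-weighted proportion with a normal-approximation interval) rather than the random-effects model, and all names and counts are hypothetical.

```python
import math

def simple_pool(studies):
    """Placeholder pooling: overall proportion with a normal-approximation 95% CI."""
    total_r = sum(r for r, _ in studies)
    total_n = sum(n for _, n in studies)
    p = total_r / total_n
    half = 1.96 * math.sqrt(p * (1 - p) / total_n)
    return p, max(0.0, p - half), min(1.0, p + half)

def influential_studies(studies, pool=simple_pool, threshold=0.10):
    """Return (index, change) for outlying studies whose removal shifts the pooled estimate by more than threshold."""
    full, lower, upper = pool(studies)
    flagged = []
    for i, (r, n) in enumerate(studies):
        if lower <= r / n <= upper:
            continue  # only studies outside the pooled 95% CI are removed and re-tested
        remaining = studies[:i] + studies[i + 1:]
        change = pool(remaining)[0] - full
        if abs(change) > threshold:
            flagged.append((i, change))
    return flagged

# Hypothetical (events, sample size) pairs; the last study is a deliberate outlier.
example = [(10, 20), (11, 20), (9, 20), (58, 60)]
for index, change in influential_studies(example):
    print(f"study {index}: pooled estimate changes by {change:+.1%} after removal")
```

Passing the pooling function as an argument lets the same loop be reused with a random-effects estimator in place of the placeholder.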
In univariate analysis, accrual initiation in 2000 or later was associated with higher 6-month (P < 0.01) and 1-year (P < 0.01) OS and lower 1-year NRM (P = 0.021). Previous auto-SCT was associated with higher 6-month (P < 0.01), 1-year (P < 0.01), 2-year (P = 0.01), and 3-year (P < 0.01) OS, higher 6-month (P < 0.01) and 1-year (P = 0.002) RFS, and lower 6-month (P = 0.006), 1-year (P < 0.001), 2-year (P = 0.002), and 3-year NRM (P = 0.007). Chemosensitive disease (CR/PR before allo-SCT) was associated with higher 1-year (P = 0.010), 2-year (P = 0.032) and 3-year (P = 0.022) RFS, and higher 1-year (P < 0.01) and 2-year (P = 0.01) OS. Prior radiation therapy (P = 0.04) was significantly associated with higher 3-year OS. Single-center scale (P < 0.01) was associated with higher 1-year OS. Male gender was associated with higher 6-month (P = 0.017) and 2-year (P = 0.010) NRM. None of the studied variables were associated with CIR at any of the studied time points.
In multivariate analysis, accrual initiation in year 2000 or later was independently associated with higher 6-month (P = 0.012) and 1-year (P = 0.046) OS. Previous auto-SCT was independently associated with higher 1-year (P = 0.012) and 2-year (P = 0.040) OS, higher 1-year RFS (P = 0.005), and lower 1-year (P < 0.001) and 2-year (P = 0.037) NRM. Pre-SCT remission (i.e. chemosensitive relapse) was independently associated with higher 2-year OS (P = 0.047) and 1-year RFS (P = 0.016).
Considering that the study accrual initiation had a significant impact on survival, we re-performed the analysis separately for studies with accrual initiation before 2000 (23 series, 1,293 patients) vs. 2000 or later (18 series, 547 patients). Figure 4 shows reconstructed curves for outcomes at different time points derived from different combinations of studies (depending on the data reported in each study). The pooled estimates (95%CI) for RFS at 6 months, 1 year, 2 years, and 3 years for studies that initiated accrual before 2000 were 50 (42–59)%, 39 (30–48)%, 29 (22–37)%, and 25 (19–31)%, respectively (Figure 4A). These numbers improved to 85 (72–99)%, 60 (49–71)%, 45 (35–56)%, and 40 (28–53)%, respectively, for studies with accrual initiation in 2000 or later (Figure 4A). The pooled estimates (95%CI) for OS at 6 months, 1 year, 2 years, and 3 years for studies with accrual initiation before 2000 were 66 (53–77)%, 54 (46–63)%, 47 (40–54)%, and 39 (28–51)%, respectively (Figure 4A). These numbers improved to 92 (82–99)%, 80 (73–86)%, 67 (59–74)%, and 60 (49–70)%, respectively, for studies with accrual initiation in 2000 or later (Figure 4A). CIR and NRM curves are shown in Figure 4B. In summary, improvements in NRM and CIR over time have led to improvements in RFS and OS.
Next, we limited the analysis to patients with chemosensitive disease at the time of allo-SCT. A total of 7 studies with accrual initiation in 2000 or later reported outcomes for their chemosensitive patients (n = 83) separately. The pooled estimates (95%CI) for RFS at 6 months, 1 year, 2 years, and 3 years for patients with chemosensitive disease were 86 (30–100)%, 86 (30–100)%, 66 (19–37)%, and 59 (4–100)%, respectively. The pooled estimates (95%CI) for OS at 6 months, 1 year, 2 years, and 3 years for these patients were 97 (89–100)%, 96 (88–100)%, 89 (75–99)%, and 78 (49–98)%, respectively. The pooled estimates (95%CI) for NRM at 6 months, 1 year, 2 years, and 3 years for patients with chemosensitive disease were 3 (0–11)%, 4 (0–12)%, 7 (1–17)%, and 8 (1–19)%, respectively. The pooled estimates (95%CI) for CIR at 6 months, 1 year, 2 years, and 3 years for these patients were 14 (0–70)%, 21 (0–63)%, 21 (0–63)%, and 32 (0–82)%, respectively. In summary, outcomes in patients with chemosensitive disease appear superior to those in patients with chemoresistant relapse, but the small numbers limit the comparison.
Allo-SCT has been used for several decades as a salvage regimen for patients with multiply relapsed or refractory HL, with some patients achieving durable disease control with this approach. Although there is a plateau of NRM at approximately 15–20%, relapses continue to occur with no apparent plateau until 3 years post-SCT. We demonstrate a high CIR at 3 years of more than 40%, which parallels continuously declining survival rates. Only about 40% of patients in the more recent studies are alive and relapse-free at 3 years following allo-SCT.
The quality score of 37/42 (88%) studies included in our analysis was high (4 or 5 out of 5), making the possibility of poor outcomes arising from suboptimal study quality unlikely. Similarly, although studies varied in methodology (resulting in significant study heterogeneity), most of these differences did not contribute to outcome differences in multivariate regression. This result argues against a hypothesis that specific study characteristics were a major cause for suboptimal outcomes. Chemosensitive relapse as defined by remission (partial or complete) at the time of allo-SCT was associated with improved outcomes. According to our analysis, outcomes of allo-SCT in HL have improved over time, with 5–10% less NRM and CIR and 15–20% higher RFS and OS in studies that initiated accrual in 2000 or later compared to earlier studies. Finally, previous auto-SCT was another independent predictor of improved survival and lower NRM, probably because patients who had a previous auto-SCT were physically fitter and more likely to have chemosensitive disease than those without prior auto-SCT.
The “small study effect” (i.e. publication bias), a potential source of bias in systematic reviews and meta-analyses, is caused by the preferential publication of positive studies when they are small. Due to their limited sample size, small studies which show positive findings such as significant treatment effects have a higher chance of being published than those showing no difference in outcome, hence introducing bias to meta-analyses. Most of our results were not affected by publication bias. The only exceptions were 2-year NRM and 2-year OS, where there was an excess of small studies with low NRM and high OS rates. If anything, this source of bias would have resulted in an overestimation of favorable outcomes. Similarly, sensitivity analysis did not reveal an overly influential study resulting in overestimation of favorable outcomes, with the single exception being one outlier study that contributed to about 20% of the pooled estimate for 3-year RFS22. In this small study (n = 12), all patients had received the anti-CD30 antibody-drug conjugate, brentuximab vedotin, prior to allo-SCT.
In conclusion, outcomes of allo-SCT in HL have improved over time, likely due to improved supportive care and the availability of more effective salvage therapies. Patients with chemosensitive disease appear to have improved outcomes post allo-SCT, but our ability to identify reliably other subgroups of patients who benefit from allo-SCT is limited. This large meta-analysis demonstrates that non-durable remissions are a major shortcoming of allo-SCT in HL, questioning the benefit of allo-SCT in its current form in the face of alternative therapies that are less toxic and more effective. The role of allo-SCT in patients with HL may change as drugs with novel mechanisms of action continue to be developed.
ES: effect size (odds ratio or relative risk, depending on the study)
There is no evidence for publication bias according to Egger’s test. See text for details.
There is potential publication bias in the estimates of 2-year NRM, due to lack of small studies with high NRM (P = 0.04). See text for details.
There is no evidence for publication bias according to Egger’s test. See text for details.
There is potential publication bias in the estimates of 2-year OS, due to lack of small studies with low OS (P = 0.02). See text for details.
AR was supported by the Washington University Institute of Clinical and Translational Sciences grant UL1 TR000448 from the National Center for Advancing Translational Sciences. The funding sources had no role in study design, data collection, analysis, or interpretation of results. The authors thank Drs. R. Chen, A. Claviez, S. Kako, and M.P. Devetten for providing additional data.
The analysis of study heterogeneity revealed the following results: 6-month CIR (n = 8, χ² = 28, I² = 75%), 1-year CIR (n = 21, χ² = 43, I² = 53%), 2-year CIR (n = 19, χ² = 57, I² = 68%), 3-year CIR (n = 21, χ² = 65, I² = 69%), 6-month NRM (n = 15, χ² = 68, I² = 79%), 1-year NRM (n = 28, χ² = 111, I² = 76%), 2-year NRM (n = 24, χ² = 83, I² = 72%), 3-year NRM (n = 22, χ² = 73, I² = 72%), 6-month RFS (n = 9, χ² = 62, I² = 87%), 1-year RFS (n = 22, χ² = 22, I² = 78%), 2-year RFS (n = 26, χ² = 99, I² = 75%), 3-year RFS (n = 22, χ² = 99, I² = 79%), 6-month OS (n = 16, χ² = 59, I² = 75%), 1-year OS (n = 28, χ² = 116, I² = 77%), 2-year OS (n = 24, χ² = 83, I² = 72%), and 3-year OS (n = 20, χ² = 138, I² = 86%).
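The link between the χ² (Cochran Q) values and the I² percentages listed above follows the formula given in the statistical methods, with k − 1 degrees of freedom. A short Python sketch is shown here; the function name is hypothetical, and flooring negative values at zero is a common convention assumed for the sketch rather than something stated in the text.

```python
def i_squared(q, k):
    """I^2 (%) from Cochran's Q and the number of studies k, floored at zero."""
    df = k - 1
    return max(0.0, (q - df) * 100 / q)

# 6-month CIR entry from the list above: n = 8 studies, chi-square = 28.
print(round(i_squared(28, 8)))
```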
Conflicts of interest
Authors’ contributions
AR and ME reviewed the studies and extracted the data. AR analyzed the data. AR, ME, and AFC interpreted the results and wrote the manuscript.