Response Evaluation Criteria in Solid Tumors [RECIST (unidimensional)], World Health Organization [WHO (bi-dimensional)] and European Association for Study of the Liver [EASL (necrosis)] guidelines are commonly used to assess response following therapy for hepatocellular carcinoma (HCC). No universally accepted standard exists.
To evaluate intermethod agreement between these 3 imaging guidelines and to introduce the concept of the “primary index lesion” as a biomarker for response.
Single-center comprehensive imaging analysis.
245 consecutive patients with HCC who were treated with chemoembolization or radioembolization between January 2000 and December 2008. Computed tomography and magnetic resonance imaging scans (N=1065) were reviewed to assess response in the “primary index lesion,” defined as the largest tumor targeted during first treatment.
Intermethod agreement (κ statistics) between RECIST, WHO, and EASL guidelines response; correlation of WHO and EASL response in the primary index lesion with time to progression and survival.
κ coefficients were 0.86 (95% confidence interval [CI], 0.80–0.92) between the WHO and RECIST guidelines, 0.24 (95% CI, 0.16–0.33) between RECIST and EASL, and 0.28 (95% CI, 0.19–0.36) between WHO and EASL. Disease progressed in 96 patients; 113 died. The hazard ratio for time to progression in responders compared with nonresponders was 0.36 (95% CI, 0.23–0.57) for WHO, 0.38 (95% CI, 0.24–0.58) for RECIST, and 0.38 (95% CI, 0.22–0.64) for EASL. Hazard ratios for survival in responders compared with nonresponders in univariate and multivariate analyses were 0.46 (95% CI, 0.32–0.67) and 0.55 (95% CI, 0.35–0.84) for WHO; they were 0.36 (95% CI, 0.22–0.57) and 0.54 (95% CI, 0.34–0.85) for EASL. Hazard ratios for survival in responders vs nonresponders in patients with solitary and multifocal HCC were 0.39 (95% CI, 0.19–0.77) and 0.51 (95% CI, 0.32–0.82) for WHO and 0.26 (95% CI, 0.10–0.67) and 0.47 (95% CI, 0.28–0.79) for EASL.
Among a group of patients with HCC, agreement for classification of therapeutic response was high between RECIST and WHO, but low between each of these and EASL. Application of these methods to measure response in a primary index lesion resulted in statistically significant correlations with disease progression and survival.
The incidence of hepatocellular carcinoma (HCC) is increasing worldwide1. Curative surgical therapies have demonstrated the best long-term survival rates; however, most patients do not meet selection criteria2. Sorafenib is the sole systemic agent that has shown a survival benefit for advanced HCC3. Locoregional therapies (LRTs) deliver toxic thermal, chemical, and/or radioactive doses to tumors, with acceptable toxicity to normal tissue. Chemoembolization and radioembolization using yttrium 90 (90Y) are transarterial LRTs that have a palliative role in HCC4–7.
Given the lack of standardization of functional imaging in HCC, anatomical methods are still considered the gold-standard for response assessment. In 1979, the World Health Organization (WHO) (bi-dimensional perpendicular measurements) published guidance on the anatomical assessment of tumor response to therapy8. In 2000, the Response Evaluation Criteria in Solid Tumors (RECIST) guidelines (uni-dimensional measurements) were published, updating the WHO document9. While the original intent of the WHO and RECIST was to describe methods for assessing response following systemic chemotherapies in which all tumors are theoretically equally exposed to systemic agents, this approach does not directly translate to LRTs. These therapies are usually staged procedures and do not target all disease in one treatment session; they often cause tumor necrosis without change in tumor size10, 11. In response to these limitations, the European Association for Study of the Liver (EASL) guidelines were published in 2001 and were based on percent change in amount of enhancing tumoral tissue post-treatment9, 10, 12. Most recently, RECIST guidelines (version 1.1) advocated assessing response using fewer lesions (≤2 per organ) than the original RECIST guidelines (≤5 per organ), suggesting that the optimal number of lesions that should be measured remains uncertain13, 14.
We performed 2 comprehensive analyses in 245 patients treated with transarterial LRT. Because the WHO and RECIST guidelines have been shown to be similar in their ability to capture response and investigators have demonstrated minimal agreement between RECIST and EASL10, we sought to validate these concepts following transarterial LRTs15. Our first analysis addressed this question: Is there agreement in response between the RECIST (uni-dimensional), WHO (bi-dimensional) and EASL (necrosis) guidelines?
Given that the single common factor for all patients undergoing transarterial therapy is that they have at least 1 dominant first-treated lesion, we hypothesized that this lesion may be prospectively identified as the “primary index lesion” for that patient and that response measurement using that lesion alone may be considered a predictor for time to progression (TTP) and survival. Investigators have shown that imaging response may predict survival benefit; the ability of response in the primary index lesion to capture a TTP benefit would further strengthen this concept16. Therefore, our second analysis sought to assess whether RECIST, WHO or EASL response was predictive of a therapeutic (TTP and survival) benefit when compared with patients not exhibiting response by using the primary index lesion alone, irrespective of multifocality: ie, Does an imaging response in the primary index lesion correlate with improved TTP and survival in solitary or multifocal disease? This would result in a simple, reproducible and standardizable methodology for assessing response in HCC.
Consecutive patients with HCC (without vascular invasion or extrahepatic metastases) who were treated with transarterial LRTs (chemoembolization and radioembolization) at our institution between January 2000 and December 2008 were included. This study was approved by the Northwestern University Institutional Review Board and complied with the Health Insurance Portability and Accountability Act. All patients provided written informed consent for treatment.
Diagnosis of HCC was confirmed by biopsy or radiographic findings using accepted guidelines17. Baseline characteristics were obtained; patients were staged using Child-Pugh, United Network for Organ Sharing (UNOS) and Barcelona Clinic Liver Cancer (BCLC) classification systems18, 19. Patients were classified as BCLC class C if they exhibited HCC-related symptoms (e.g. capsular pain). The decision to treat with chemoembolization or radioembolization was based on consensus of a multidisciplinary team at a weekly HCC conference of hepatologists, medical oncologists, transplant surgeons and interventional radiologists.
Chemoembolization is a transarterial therapy delivering high doses of chemotherapeutic agents to a tumor via the hepatic artery. It was performed using the standard triple-drug mixture of mitomycin (30 mg), adriamycin (30 mg) and cisplatinum (100 mg) mixed with lipiodol; this was followed by delivery of permanent embolic particles. Technical details of chemoembolization have been discussed elsewhere20.
Radioembolization using 90Y is a transarterial therapy in which high doses of radioactivity are delivered to the tumor via the hepatic artery. The device used was glass-based (TheraSphere®, MDS Nordion, Ottawa, Canada), with 90Y as an integral constituent of the 20- to 30-μm microspheres. Technical details for radioembolization have been discussed elsewhere21.
Contrast-enhanced computed tomography or gadolinium-enhanced magnetic resonance imaging was used for response assessment. The primary index lesion was defined as the lesion targeted during the first treatment session. It was usually the largest and was considered to be the most appropriate target for first transarterial therapy. For this analysis, even if several tumors were targeted during the first or subsequent treatments with chemoembolization or radioembolization, only the primary index lesion was used to assess response and followed longitudinally, even if it was not the lesion most recently treated. Lesions other than the primary index lesion are therefore analogous to nontarget lesions as defined by RECIST and hence were not used for response assessment.
Response rates were assessed using size (WHO, RECIST) and necrosis (EASL) guidelines9, 10, 12. For purposes of WHO and RECIST measurements, the entire lesion was measured, irrespective of the amount of necrosis seen. This was deemed most conservative.
All calculations for imaging analyses were performed from the date of first LRT. Imaging follow-up (and hence measurement of the primary index lesion) was performed at 1 month following each treatment; subsequent scans were performed at scheduled 2- to 3-month intervals as per standard of care. WHO, RECIST, and EASL guidelines used for this analysis are outlined in eTable 1 (references 8, 9, 12).
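As an illustration of how the three guidelines can classify the same index lesion differently, the following is a minimal Python sketch. The cutoffs are assumptions taken from the commonly published versions of the guidelines (WHO: at least 50% decrease in cross-product for partial response, more than 25% increase for progression; RECIST 1.0: at least 30% decrease in longest diameter, at least 20% increase; EASL: at least 50% decrease in enhancing tissue, more than 25% increase); eTable 1 is not reproduced here, so the cutoffs actually applied in the study may differ. All function names are hypothetical, and changes are measured against baseline for simplicity.

```python
def percent_change(baseline, current):
    """Percent change from baseline (negative values mean shrinkage)."""
    return 100.0 * (current - baseline) / baseline


def categorize(baseline, current, pr_decrease, pd_increase, pd_inclusive=False):
    """Map one measurement pair onto CR / PR / SD / PD.

    pr_decrease and pd_increase are percent cutoffs (assumed, see lead-in).
    """
    if current == 0:
        return "CR"  # lesion (or enhancing tissue) no longer measurable
    change = percent_change(baseline, current)
    if change <= -pr_decrease:
        return "PR"
    progressed = change >= pd_increase if pd_inclusive else change > pd_increase
    return "PD" if progressed else "SD"


def who_response(baseline_dims, current_dims):
    """WHO (bidimensional): longest diameter x its perpendicular diameter."""
    baseline = baseline_dims[0] * baseline_dims[1]
    current = current_dims[0] * current_dims[1]
    return categorize(baseline, current, pr_decrease=50, pd_increase=25)


def recist_response(baseline_diameter, current_diameter):
    """RECIST (unidimensional): longest diameter only."""
    return categorize(baseline_diameter, current_diameter,
                      pr_decrease=30, pd_increase=20, pd_inclusive=True)


def easl_response(baseline_enhancing, current_enhancing):
    """EASL (necrosis): amount of enhancing tumor tissue."""
    return categorize(baseline_enhancing, current_enhancing,
                      pr_decrease=50, pd_increase=25)


# A lesion that shrinks modestly but loses all enhancement:
print(who_response((5.0, 4.0), (4.0, 3.0)))   # size criteria, cross-product
print(recist_response(5.0, 4.0))              # size criteria, longest diameter
print(easl_response(20.0, 0.0))               # necrosis criteria
```

In this hypothetical example the size-based guidelines both report stable disease while the necrosis-based guideline reports complete response, which is the kind of discordance the intermethod comparison is meant to quantify.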
Intermethod agreement was assessed using kappa statistics; a kappa coefficient (κ) value of 0.75–0.8 represents excellent inter-method agreement22, 23. There are 2 rationales for correlating RECIST and WHO. First, lesions are never completely spherical and often change shape following treatment; capturing only the largest single dimension (ie, using RECIST guidelines) may underestimate response (and conversely progression) when compared with the cross-product of the largest dimension and its perpendicular dimension (ie, using WHO guidelines). Second, the identification of a sample size at which RECIST (uni-dimensional) and WHO (bidimensional) become equivalent is of clinical relevance, because it may identify a minimum number of patients at which the 2 guidelines become interchangeable. To search for a sample size at which there would be excellent intermethod agreement between RECIST and WHO guidelines, we calculated κ at increments of 25 random patients per treatment group, with final κ coefficients based on all 245 patients.
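The agreement statistic described above can be sketched in Python as follows. This is a minimal illustration, not the SAS procedure used in the study: it assumes an unweighted Cohen κ computed on the four response categories, and it draws random patients from the whole cohort rather than 25 per treatment group. The function names and the `seed` parameter are hypothetical.

```python
import random
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen kappa for two paired lists of category labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be paired and non-empty")
    n = len(ratings_a)
    # Observed agreement: fraction of patients given the same category.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement from the marginal category frequencies.
    count_a, count_b = Counter(ratings_a), Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when only one category occurs
    return (observed - expected) / (1.0 - expected)


def kappa_by_sample_size(ratings_a, ratings_b, step=25, seed=0):
    """Kappa on nested random subsamples of growing size (step, 2*step, ...)."""
    order = list(range(len(ratings_a)))
    random.Random(seed).shuffle(order)
    results = []
    for size in range(step, len(order) + 1, step):
        subset = order[:size]
        kappa = cohen_kappa([ratings_a[i] for i in subset],
                            [ratings_b[i] for i in subset])
        results.append((size, kappa))
    return results


# Hypothetical responses for four patients under two guidelines:
who = ["PR", "PR", "SD", "PD"]
recist = ["PR", "SD", "SD", "PD"]
print(round(cohen_kappa(who, recist), 3))
```

Plotting the output of `kappa_by_sample_size` against sample size is one way to look for the point at which κ stabilizes above a chosen threshold.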
Times to response (defined as complete or partial response) were calculated from the date of first treatment session using the Kaplan-Meier method and were compared using the log-rank test24. For purposes of calculating time to response, an endpoint was defined as imaging response in the primary index lesion using either WHO, RECIST, or EASL guidelines.
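For readers unfamiliar with the Kaplan-Meier method, a minimal Python sketch of the product-limit estimate is given below. It is an illustration only; the study used SAS, and the function names and example data are hypothetical. Each patient contributes a time from first treatment and a flag indicating whether the endpoint was observed (1) or the observation was censored (0).

```python
def kaplan_meier(times, events):
    """Return [(time, estimated probability of remaining event-free)].

    One entry is produced per distinct time at which an event occurred.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same time point.
        while i < len(data) and data[i][0] == t:
            n_events += int(data[i][1])
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        # Both events and censored patients leave the risk set after time t.
        at_risk -= n_leaving
    return curve


def median_time(curve):
    """First time at which the estimate drops to 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical follow-up in months; 0 marks a censored patient.
curve = kaplan_meier([2, 3, 3, 5, 8], [1, 1, 0, 1, 0])
print(curve, median_time(curve))
```

Censored patients reduce the number at risk without lowering the curve, which is why the estimate differs from a simple proportion of patients who reached the endpoint.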
Time to progression and survival were calculated from the date of first treatment session using the Kaplan-Meier method and were compared using the log-rank test24. Importantly, for purposes of calculating TTP, an endpoint was defined as any of the following: progression by WHO guidelines (>25% increase in the cross-product of the index lesion), EASL guidelines (>25% increase in the amount of enhancing tissue in the index lesion), UNOS stage (progressing from a less advanced to a more advanced UNOS stage, eg, UNOS T4a to T4b) or appearance of new lesions or extrahepatic metastases. Survival calculations were censored to transplantation or resection25.
In case of solitary HCC, the solitary, measurable tumor is by definition the primary index lesion. Therefore, the concept of assessing response using the primary index lesion alone is only confounded in cases of multifocal disease. The ability to validate the index lesion concept is based on the fact that nearly half of our cohort (112/245 patients [46%]) had solitary HCC. Therefore, we were able to perform sub-stratification analyses by tumor distribution (solitary vs multifocal HCC) to assess if the hazard ratios (HRs) for survival using only the primary index lesion for response assessment remained significant; HRs indicating a significant survival benefit in responders vs nonresponders in solitary and multifocal disease would suggest the ability to capture therapeutic benefit using the primary index lesion, despite the presence of multifocal disease.
Univariate and multivariate analyses were conducted to identify factors associated with survival. Race/ethnicity (white, Asian, Hispanic, African American) was assessed by the physician and was studied for univariate analysis but not included as a variable in the multivariate analysis. Univariate analysis was performed using the Kaplan-Meier method with the log-rank test and multivariate analyses were performed using the Cox proportional hazards model26. Only variables having P<.05 on univariate analysis were included in the multivariate model, and the HR estimates were based on simultaneous analysis of all predictor variables. Assumption of proportionality was tested using the log-minus-log plot and was met. We used Child-Pugh class, UNOS stage, and Eastern Cooperative Oncology Group (ECOG) performance status to determine the effect of liver function, tumor stage and performance status on survival.
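The composite TTP endpoint described above can be expressed as a short Python sketch. The 25% cutoffs, the UNOS stage criterion, and the new-lesion and metastasis criteria come from the text; everything else is an assumption made for illustration: the `Scan` record, the ordinal encoding of UNOS stage in `unos_rank`, and the choice of the baseline scan (rather than the best response) as the reference for percent increase.

```python
from dataclasses import dataclass


@dataclass
class Scan:
    months: float            # months since first treatment session
    cross_product: float     # WHO: longest diameter x perpendicular, index lesion
    enhancing_tissue: float  # EASL: amount of enhancing tissue, index lesion
    unos_rank: int           # hypothetical ordinal code, higher = more advanced
    new_lesions: bool = False
    extrahepatic_metastases: bool = False


def percent_increase(reference, current):
    """Percent increase over the reference, guarding against a zero reference."""
    if reference == 0:
        return float("inf") if current > 0 else 0.0
    return 100.0 * (current - reference) / reference


def is_progression(baseline, scan):
    """True if any component of the composite endpoint is met."""
    return (
        percent_increase(baseline.cross_product, scan.cross_product) > 25
        or percent_increase(baseline.enhancing_tissue, scan.enhancing_tissue) > 25
        or scan.unos_rank > baseline.unos_rank
        or scan.new_lesions
        or scan.extrahepatic_metastases
    )


def time_to_progression(baseline, scans):
    """Return (months, progressed); progressed=False means censored at last scan."""
    ordered = sorted(scans, key=lambda s: s.months)
    for scan in ordered:
        if is_progression(baseline, scan):
            return scan.months, True
    last = ordered[-1].months if ordered else baseline.months
    return last, False


baseline = Scan(0.0, 20.0, 10.0, 1)
follow_up = [Scan(1.0, 18.0, 4.0, 1), Scan(4.0, 26.0, 4.0, 1)]
print(time_to_progression(baseline, follow_up))
```

The returned pair has the shape a time-to-event analysis expects: a time from first treatment and a flag separating observed progression from censoring.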
Our power analysis indicated that using the Cox regression of the log HR with an anticipated event rate of 0.45, a sample of 245 patients will achieve 81% power at a 2-tailed .05 significance level to detect a minimum HR of 1.33 for any given covariate27. Thus, our study is powered to detect any HRs greater than 1.33 or less than 0.75 (1/1.33).
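To show how a power figure of this kind can arise, the sketch below applies one widely used approximation for Cox regression, in which power depends on the expected number of events, the log HR, the standard deviation of the covariate, and its correlation (R²) with the other covariates. Whether this is the formula behind the reported 81% is an assumption; the covariate standard deviation and R² are not reported, so the defaults here are placeholders and the result need not match the published value. The function name and parameters are hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist


def cox_power(n_patients, event_rate, hazard_ratio,
              alpha=0.05, covariate_sd=1.0, r_squared=0.0):
    """Approximate 2-sided power to detect a given HR for one covariate.

    covariate_sd and r_squared are assumed inputs, not values from the study.
    """
    expected_events = n_patients * event_rate
    z_alpha = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # Standardized effect: grows with events and with the size of the log HR.
    effect = sqrt(expected_events * covariate_sd ** 2 * (1.0 - r_squared))
    effect *= abs(log(hazard_ratio))
    return NormalDist().cdf(effect - z_alpha)


# The calculation is symmetric on the log scale, so an HR of 1.33 and its
# reciprocal (about 0.75) are equally detectable.
print(cox_power(245, 0.45, 1.33))
print(cox_power(245, 0.45, 1 / 1.33))
```

The symmetry on the log scale is the reason a study powered for an HR of 1.33 is also powered for an HR of 1/1.33.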
All analyses were conducted using SAS 9.2 (SAS Inc., Cary, North Carolina). All P values were 2-sided, and a P value <.05 was considered statistically significant.
Table 1 summarizes the baseline patient demographics for the 245-patient cohort. Of these, 122 (49%) were treated with chemoembolization and 123 (51%) with radioembolization. One hundred thirty-nine patients (57%) were diagnosed by imaging; 106 (43%) patients required biopsy. The median (interquartile range) number of treatments was 2 (1–3) for chemoembolization and 1 (1–2) for radioembolization (P=.09). Otherwise, the groups were similar. The median follow-up time was 13.8 months (95% confidence interval [CI], 12.1–17.0).
All 245 patients underwent imaging, and none were lost to follow-up; 1065 scans were reviewed (mean, 4.3 scans/patient). These data served as the source data for all imaging analyses described in this study.
In the WHO (bi-dimensional) analysis, complete response was seen in 4 patients (1.6%), partial response in 100 (40.8%), stable disease in 108 (44.1%), and progressive disease in 33 (13.5%). The median time to WHO response was 7.7 months (95% CI, 6.1–9.5).
In the RECIST (uni-dimensional) analysis, complete response was seen in 4 patients (1.6%), partial response in 97 (39.6%), stable disease in 114 (46.5%), and progressive disease in 30 (12.3%). The median time to RECIST response was 7.7 months (95% CI, 6.2–10.3). In the EASL (necrosis) analysis, complete response was seen in 79 patients (32.2%), partial response in 93 (38%), stable disease in 54 (22%), and progressive disease in 19 (7.8%). The median time to EASL response was 1.6 months (95% CI, 1.3–2.2).
Table 2 represents the intermethod agreement of the WHO and RECIST guidelines. The κ coefficient increased with the number of patients. The number of patients at which a κ of 0.8 was reached was 50 (25 per group). The κ between WHO and RECIST for all 245 patients was 0.86 (95% CI, 0.80–0.92).
Table 3 presents the κ between EASL and WHO and RECIST. The κ for EASL and RECIST was 0.24 (95% CI, 0.16–0.33). The κ for EASL and WHO was 0.28 (95% CI, 0.19–0.36).
Time to Progression. Disease progressed in 96 patients. The HRs for TTP in responders compared with non-responders by WHO, RECIST and EASL guidelines were 0.36 (95% CI, 0.23–0.57), 0.38 (95% CI, 0.24–0.58) and 0.38 (95% CI, 0.22–0.64) respectively.
One hundred thirteen patients died. The HRs for survival in responders compared with non-responders by WHO, RECIST and EASL guidelines were 0.46 (95% CI, 0.32–0.67), 0.46 (95% CI, 0.31–0.66), and 0.36 (95% CI, 0.22–0.57) respectively.
Table 4 presents TTP and survival comparisons between responders and non-responders when WHO and EASL guidelines were applied, substratified by solitary and multifocal disease. Significant HRs between responders and nonresponders were seen with both response guidelines in both solitary and multifocal disease. This suggests that measurement of the primary index lesion alone was able to capture the therapeutic benefit on TTP and survival of an imaging response in that lesion.
The difference in patient survival between chemoembolization (median, 17.4 months) and radioembolization (median, 20.5 months) using the Kaplan-Meier method was not significant (P=.23). The eFigure presents a survival comparison between treatment groups adjusted for covariates using the Cox proportional hazards model (P=.12).
eTable 2 presents the univariate and multivariate analyses. Variables entered into the multivariate Cox model included age, baseline alpha-fetoprotein, Child-Pugh class, UNOS stage, ECOG performance status, and WHO and EASL guidelines response. Multivariate analysis confirmed the following as significant prognosticators of survival: baseline ECOG performance status 0 (HR, 0.65; 95% CI, 0.43–0.97), UNOS stage less than T4a (HR, 0.62; 95% CI, 0.41–0.95), Child-Pugh Class less than C (HR, 0.18; 95% CI, 0.05–0.59), WHO response (HR, 0.55; 95% CI, 0.35–0.84), EASL response (HR, 0.54; 95% CI, 0.34–0.85) and pretreatment alpha-fetoprotein level of 200 or less (HR, 0.59; 95% CI, 0.37–0.94). The FIGURE illustrates the survival distribution function by WHO and EASL response, respectively, adjusted for covariates.
Novel LRTs are establishing their role in the management of HCC, necessitating the ability to accurately determine tumor response4, 5, 28. For LRTs, standardization in methodology and evidence are lacking13. In this analysis, we sought to test agreement between RECIST (uni-dimensional), WHO (bi-dimensional) and EASL (necrosis) guidelines, as well as to validate the clinical benefit imparted by observing response by correlating imaging response to TTP and survival. Lastly, the concept of the primary index lesion was introduced, potentially leading to a novel methodology of standardization of response and analyses of time to endpoint in patients with HCC who are receiving transarterial LRTs.
Although the pretreatment and post-treatment determinations of tumor volume (3-dimensional) are intuitively most representative of actual treatment effect, limitations in available technology prevent their routine use29. Our data validate that the WHO (bidimensional) and RECIST (uni-dimensional) are similar in assessing change in size when applied to transarterial LRTs15. The intermethod agreement was high (κ=0.86). Thus, given that tumors usually are not spherical and have irregular borders, we prefer using WHO over RECIST guidelines for transarterial LRTs, particularly when the patient sample size is small. On the other hand, there was minimal intermethod agreement between necrosis and size guidelines (EASL vs WHO, κ=0.28; EASL vs RECIST, κ=0.24). These findings are intuitive; a complete response by EASL could be classified as a complete response, partial response, stable disease, or progressive disease by WHO or RECIST guidelines. Keppke et al confirmed this, reporting response rates of 23%, 26% and 57% with application of RECIST, WHO, and EASL guidelines, respectively11. These findings were further confirmed by Forner et al (κ=0.19 between EASL and RECIST guidelines)10.
The development of systemic biologic therapies in the management of HCCs, particularly those that are cytostatic rather than cytotoxic, necessitates the ability to measure response despite no change in tumor size3. The imaging characteristics and response rates following LRTs can be heterogeneous at the lesional level; this is potentially related to anomalous blood supply to HCCs30. Ablative techniques have been shown to cause necrosis without affecting tumor size10. Complete response by EASL at 1 month following percutaneous therapies correlates with improved survival16. The presence of residual tumor at 1 month following ablation may indicate incomplete targeting of tumor. Thus, unlike complete response, a partial response by EASL guidelines (potentially representing treatment failure) may not necessarily indicate improved outcomes following thermal ablation16. As seen in this study, transarterial LRTs have been shown to affect both size and necrosis, with both translating into favorable long-term outcomes31.
Since EASL response is usually manifest at 1.6 months, one could postulate that EASL response may serve as an earlier surrogate for therapeutic benefit when compared with WHO response. Response by EASL also may have an important role in patients with HCC who are listed for liver transplantation, because time from treatment to transplantation is variable and EASL response is achieved earlier. The median time to WHO response was 7.7 months. A lesion that has decreased in size (WHO response) has stood the test of time, suggesting favorable tumor biology and the ability of the surrounding hepatic parenchyma to regenerate normally. The HRs for survival of responders vs nonresponders by WHO and EASL were similar on multivariate analysis (0.55 and 0.54, respectively). These data suggest that EASL response (achieved early) and WHO response (achieved later and therefore time-tested) are both important parameters and are independent predictors of survival.
In patients with solitary HCC, the primary index lesion is clearly represented by the single tumor nodule that undergoes treatment. However, with multifocal disease, the ability to capture response and time-to-event endpoints becomes less evident given the multiplicity of tumors and staged treatment points.
Given these realities, can the concept of the primary index lesion be expanded to multifocal disease? There are 3 rationales for hypothesizing the potential clinical utility of the index lesion in multifocal disease. First, LRTs are performed as staged procedures and hence, as opposed to systemic therapies, not all lesions are treated at the same time, resulting in variable starting points for response and time-to-progression analyses. Second, not all patients being treated with LRTs undergo a complete treatment cycle (patients may develop progressive disease, experience adverse events, or be intolerant to further treatment). This would result in a confounding mathematical effect of overall response being dependent on the magnitude of the size changes of treated and untreated tumors, potentially erroneously leading to the reporting of stable or even progressive disease rather than response to therapy. Third, response assessment, TTP and survival are inherently flawed if they are measured from the end of the treatment cycle, because it may take 2 to 6 months to treat all disease. This would make it difficult to compare LRTs with systemic agents (eg, TTP using sorafenib, 5.5 months) and to generate hypotheses for future clinical trials, in which analyses of time to end point begin at the time of protocol enrollment or randomization3.
Our analysis suggests that response seen in the primary index lesion following treatment, even in the presence of multifocal disease, showed a prognostic benefit following LRTs. In some sense, the primary index lesion was able to serve as a biomarker of long-term outcomes. The HRs using WHO and EASL were able to capture the significant TTP and survival benefit in responders vs non-responders in patients with solitary and multifocal HCC. Furthermore, responses by WHO (bidimensional) and EASL (necrosis) guidelines had independent effects on survival on multivariate analysis.
This study has limitations. The study includes patients treated with chemoembolization or with radioembolization using 90Y. However, the baseline characteristics of the 2 treatment groups by Child-Pugh class, BCLC class, and UNOS stage were identical. In contradistinction to the data used to formulate the original RECIST guidelines (different malignancies, various systemic therapies), our analysis is more standardized, because it is based only on HCC. Few studies have compared size and enhancement guidelines for assessing tumor response10. Although pathologic evaluation of the treated lesion represents the gold-standard for assessing response to treatment, this is available only in select cases following resection, transplantation, or autopsy31, 32. The therapeutic benefit of imaging and other biomarkers must be studied33. It would be interesting to investigate if the same concept of primary index lesion holds for ablative LRTs or systemic therapy. Survival between the 2 treatment groups was not significantly different.
In conclusion, WHO and RECIST guidelines had minimal agreement with EASL guidelines10. The WHO and EASL responses were favorable and independent prognostic factors of survival. No imaging guidelines are currently considered the gold-standard for LRTs. The combined findings of response at the primary index lesion level being predictive of TTP and survival, the significant HR for TTP and survival in solitary and multifocal disease, and the significant HRs for survival maintained in the multivariate analyses all support the use of the primary index lesion as a biomarker to assess imaging response following transarterial LRTs. This may potentially lead to simplification, reproducibility and standardization of imaging assessment guidelines in LRTs. Measuring the primary index lesion and then starting analyses of time to endpoint at the time of first treatment (irrespective of completion of the treatment plan) is in keeping with the principles of intention-to-treat. It should be stressed that patients should continue to receive the planned treatment to target all disease, even in the presence of response in the primary index lesion. The findings presented herein will require further validation.
Funding: This study was not funded.
Access to Data Statement:
Riad Salem had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Potential Conflicts of Interest:
Riad Salem is a consultant to and receives grant support from MDS Nordion.
Riad Salem and Reed Omary are supported in part by NIH R01 CA126809.
Al Benson III is an advisor to and receives grant support from MDS Nordion.
None of the other authors report any potential conflicts of interest.