In our study, using rigorous quantitative analysis and 6 different response criteria, we found that the classification of HCC response to 90Y radioembolization depended heavily on the criteria chosen to assess tumor response. Three months after 90Y treatment, bi-dimensional viable tumor analysis that incorporated necrosis (2D EASL) was the only anatomic imaging criterion showing a statistically significant reduction in tumor size. The remaining 4 anatomic imaging criteria did not show statistically significant changes in tumor size. Although 3D EASL did not show a statistically significant decrease in size after 90Y, the decrease could be clinically significant because the average patient had a >50% decrease in viable tumor volume. With more patients in this study, it is possible that 3D EASL would have reached statistical significance.
Over the same time period, the functional imaging measure of tumor response (an increase in ADC value on DW imaging) demonstrated statistically significant changes. Unlike a previous study which found an inverse relationship between changes in ADC values and anatomical size changes after 90Y, our study shows that the ADC value increased after treatment for the majority of patients and that there was no correlation between a decrease in tumor size and an increase in the ADC value. The most likely explanation for this discrepancy is that the functional changes are not mirrored by the anatomical changes. However, correlation with histopathology is needed to confirm this hypothesis. From these results we believe that a functional imaging measurement of tumor burden, based upon increases in water mobility from the breakdown of necrotic cellular membranes, can serve as a separate but complementary assessment of tumor response.
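The comparison between ADC change and size change described above can be illustrated with a simple correlation coefficient. The following is a minimal sketch, assuming a Pearson coefficient computed over per-patient percentage changes; the statistic is an assumption for illustration, and the function name `pearson_r` and all input values are hypothetical rather than study data.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length numeric lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical per-patient values (percent change in size, percent change in ADC).
size_change = [-10.0, -5.0, 0.0, -20.0]
adc_change = [15.0, 30.0, 12.0, 25.0]
print(round(pearson_r(size_change, adc_change), 3))
```

A coefficient near zero would be consistent with functional and anatomic changes varying independently; a rank-based coefficient could be substituted if the changes are not normally distributed.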
We also showed that the traditional size criteria (RECIST, WHO, volumetric) had substantial agreement (κ = 0.76) when classifying tumors into treatment response categories. This outcome supports previous studies showing that volumetric tumor measurements do not add significant value to those obtained by RECIST and WHO (18). However, our results diverge from other studies showing that volumetric measurements yield discordant and possibly more accurate results compared to those obtained by RECIST (uni-dimensional) and WHO (bi-dimensional) measurements (7). Given the time-consuming effort required to measure tumor volume and the similar results compared to RECIST and WHO, we do not think that 3D volumetric analysis is applicable to daily clinical practice, unless software is designed to make the process more automated.
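Agreement between two sets of response classifications can be summarized with a kappa statistic. The sketch below assumes the pairwise (two-criteria) Cohen's kappa; the agreement reported across three criteria may have been obtained with a multi-rater variant, which is not shown here. The function name `cohen_kappa` and the example labels are hypothetical and are not study data.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length lists of category labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of tumors assigned the same category.
    p_observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement expected from each criterion's marginal frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    p_chance = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_chance == 1.0:
        return 1.0  # both criteria placed every tumor in one identical category
    return (p_observed - p_chance) / (1.0 - p_chance)


# Hypothetical response categories for six tumors under two criteria.
criterion_1 = ["PR", "SD", "SD", "SD", "PD", "SD"]
criterion_2 = ["PR", "PR", "PR", "SD", "PD", "PR"]
print(round(cohen_kappa(criterion_1, criterion_2), 2))
```

Kappa corrects raw agreement for the agreement expected by chance, which is why two criteria that mostly assign "stable disease" can still show only modest kappa values.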
Additionally, we demonstrated that the three traditional anatomic imaging criteria (RECIST, WHO, and volumetric) underestimate response compared to the EASL anatomic criteria which incorporate necrosis. Specifically, the traditional size criteria had only slight to fair agreement (κ = 0.06 to 0.42) with 2D and 3D EASL and showed a less favorable treatment response. We found that the 2D EASL criteria produced the most favorable response to 90Y. Employing 3D EASL criteria proved extremely time consuming and yielded similar results to the simpler, faster 2D EASL criteria. The fact that 3D EASL had a lower response rate than 2D EASL (20% vs. 55%) can be attributed to necrosis typically occurring at the center of the tumor before the edges. In other words, most 2D EASL measurements were made on the central slice of the tumor, where necrosis was greatest. However, once we moved away from the central slices to obtain a volume, there was progressively more viable tumor noted towards the periphery of the lesions. Furthermore, there are no universally agreed upon criteria to classify response using the 3D EASL method because EASL never specified how necrosis should be defined or measured. Previously, Keppke et al. and Miller et al. used a 30% increase in necrosis to define partial response (11). They chose this value because it was consistent with the 30% decrease in size required by the RECIST method to obtain partial response. In this manuscript we used a more rigorous 65% decrease in viable tumor in order to define response. This value is extrapolated from the 65% decrease in tumor volume required by the volume criteria (3). Had we used a 30% decrease in viable tumor volume, our response by 3D EASL would have approached response as measured by 2D EASL.
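The relationship between the 30% and 65% thresholds discussed above follows from geometry: if a lesion shrinks uniformly, a 30% decrease in diameter corresponds to roughly a 51% decrease in cross-sectional area and a 65.7% decrease in volume. The sketch below illustrates this under the assumption of uniform, sphere-like shrinkage, and shows how the choice of threshold changes whether a given decrease in viable tumor counts as a response. The function names and the example volumes are hypothetical.

```python
def equivalent_thresholds(diameter_decrease):
    """Convert a fractional diameter decrease to area and volume decreases.

    Assumes uniform (sphere-like) shrinkage, so area scales with the
    square of the diameter and volume with its cube.
    """
    if not 0.0 <= diameter_decrease <= 1.0:
        raise ValueError("diameter_decrease must be a fraction between 0 and 1")
    remaining = 1.0 - diameter_decrease
    return {"area": 1.0 - remaining ** 2, "volume": 1.0 - remaining ** 3}


def meets_response_threshold(baseline_viable, followup_viable, threshold):
    """True if the fractional decrease in viable tumor is at least `threshold`."""
    if baseline_viable <= 0:
        raise ValueError("baseline viable tumor must be positive")
    decrease = (baseline_viable - followup_viable) / baseline_viable
    return decrease >= threshold


thresholds = equivalent_thresholds(0.30)
print(round(thresholds["area"], 3), round(thresholds["volume"], 3))

# Hypothetical lesion whose viable volume falls from 100 mL to 50 mL.
print(meets_response_threshold(100.0, 50.0, 0.65))  # stricter volume-derived cutoff
print(meets_response_threshold(100.0, 50.0, 0.30))  # cutoff matching the 30% value
```

Under these assumptions, a lesion losing half of its viable volume would be a responder at the 30% cutoff but not at the 65% cutoff, which is the sensitivity to threshold choice described in the text.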
Our findings are similar to the results of Keppke et al. and Miller et al. (11), who showed that WHO and RECIST underestimated response rates compared to criteria which take tumor necrosis into account. However, unlike these two previous studies, we used MRI as our imaging modality instead of CT. This study further differed from Keppke et al. (11) because we correlated tumor size with a functional measurement of tumor burden and from Miller et al. (13) because we focused on HCC rather than liver metastases.
Our study has several important limitations. First, while necrosis was estimated as non-enhancing tissue for the anatomical imaging criteria, non-enhancement may not accurately differentiate viable from necrotic tumor on histopathology. Imaging response was not correlated with histopathology because tissue sampling was not clinically warranted. Second, results were assessed only for single dominant tumors with well-defined margins. Whether the results can be generalized to multiple tumors or those without well-defined borders remains to be shown by future studies. Third, we assessed response to radioembolization and not other liver-directed therapies, such as chemoembolization or radiofrequency ablation. However, because the methodology is the same, we see no reason why the results would not be applicable to all liver-directed therapies, or even systemic chemotherapies. Fourth, there was variability in the number of days after 90Y at which the follow-up MRI was obtained. In order to eliminate this variability while maintaining 20 patients, we would have needed additional patients treated prior to July 2005, when DW MR images were not routinely available at our institution. Finally, we used a time-consuming, rigorous method to outline and quantify necrotic volume. In clinical practice, necrosis is typically estimated as regions lacking enhancement. Although it remains to be shown how these results would translate to clinical practice, the use of bi-dimensional viable tumor criteria appears most clinically applicable.
We acknowledge the potential for bias in parts of the study design, such as lesion selection, the use of consensus readings, and the choice of readers who are familiar with the disease process and 90Y therapy. However, the goal of this study was not to establish the amount of necrosis caused by 90Y or to evaluate intra- or inter-observer variability. Instead, it was to show how different imaging criteria lead to different response rates. We selected patients with single or dominant lesions with well-defined margins because measuring these lesions would offer the highest likelihood of reproducibility between readers. Also, by definition, lesions without measurable margins cannot be assessed using RECIST or WHO imaging criteria.
In conclusion, of the 5 competing anatomic imaging analysis approaches, bi-dimensional viable tumor size (2D EASL) criteria yielded the highest anatomic response rate to radioembolization. However, in the absence of correlative histopathology or longitudinal studies, we can only speculate that the 2D EASL approach is the preferred anatomic approach, while traditional size criteria underestimate response. For functional assessment of water mobility changes, DW MR imaging can serve as a complementary secondary method to assess response. However, it is not a substitute for primary anatomic measurements because we did not find a correlation between these methods. Correlation with clinical outcome is the subject of future study.