Visual semi-quantitative assessment of liver tumour burden in patients with neuroendocrine tumour liver metastases is often used to guide patient management and to predict outcome. However, published data on the reproducibility of these evaluations are lacking.
The aim of this study was to evaluate the interobserver and intraobserver agreement of a visual semi-quantitative assessment of liver tumour burden using CT scan.
Fifty consecutive patients (24 men and 26 women, mean age 54 years) were retrospectively reviewed by four readers (two senior radiologists, one junior radiologist and one gastroenterologist) who assessed the liver tumour burden based on a visual semi-quantitative method with four classes (0–10, 11–25, 26–50 and >50%). Interobserver and intraobserver agreement were assessed by weighted kappa coefficient and percentage of agreement. The intraclass correlation coefficient was also calculated.
Agreement among the four observers for the evaluation of liver tumour burden was substantial, with weighted kappa values ranging from 0.62 to 0.73 (P<0.0001). The intraclass correlation coefficient (ICC) was 0.977 (P<0.0001). Intraobserver agreement was substantial (kappa=0.78), and the ICC was 0.97.
Reproducibility of the visual semi-quantitative evaluation of liver tumour burden is good and is independent of the level of experience of the readers. We therefore suggest that clinical studies in patients with neuroendocrine liver metastases use this method to categorise liver tumour burden.
Neuroendocrine neoplasms are a heterogeneous group of tumours arising from endocrine and nervous system cells. Two of the most common anatomical sites of origin are the gastrointestinal tract and the pancreatic islet cells, both grouped as gastroenteropancreatic neuroendocrine tumours, or GEP-NETs. These tumours have frequently metastasised by the time of diagnosis, and the liver is the most common site of metastases (1). The presence of liver metastases is a significant negative prognostic factor that depends on the site of the primary tumour, its histological differentiation and proliferative activity. At present, the latter is assessed by the number of mitoses per unit area of tumour or as the percentage of neoplastic cells immunolabelling for the proliferation marker Ki67. The grading system that was recently proposed for GEP-NETs by the European Neuroendocrine Tumor Society (ENETS) and recommended by WHO uses either mitotic rate or the Ki67 labelling index (2).
Except for neuroendocrine carcinomas (G3 tumours), curative surgical resection is the reference treatment for neuroendocrine liver metastases (NELM), with 5-year overall survival rates ranging from 60 to 80%, low mortality and acceptable morbidity (2). If more than 90% of the liver tumour burden can be resected, debulking resection can also be considered to reduce symptoms (2).
When surgical resection cannot be considered, locoregional liver-directed therapies and/or medical therapies may be discussed. In these cases, the therapeutic choice is based on the previously mentioned tumour characteristics (primary tumour site, histological differentiation and tumour activity) as well as tumour progression in the liver, the presence of extrahepatic lesions and the liver tumour burden (2).
Several recently published studies have shown that liver tumour burden is an important factor for patient management and long-term outcome, especially for somatostatin analogue treatment, and also for transarterial chemoembolisation, bland embolisation or selective intraarterial radiation therapy (SIRT) (3, 4, 5, 6, 7, 8). For example, liver tumour burden has been shown to be an independent prognostic factor for survival and, in the PROMID study, an important factor for the antiproliferative effect of octreotide LAR (4, 6, 9).
In most of these studies, the liver tumour burden was evaluated by a visual semi-quantitative assessment of the total tumour volume in the liver on CT scan and/or MRI, categorised into three to five classes by percentage. To our knowledge, there are no published data on the interobserver agreement for this assessment.
Because it enables the exploration of the most common metastatic sites, CT scan is the reference technique for initial evaluation and follow-up of NET-associated metastases. When appropriate imaging techniques are used (triphasic CT scan acquisition including a late arterial phase), its mean sensitivity for detection of NET liver metastases is 82% (range 78–100%) and its mean specificity is 92% (range 83–100%) (10, 11, 12). MRI is generally more sensitive at detecting liver metastases because of the improved tissue contrast and is a non-irradiating technique; nonetheless, MRI is less available and more expensive than CT and is generally not a standard NET imaging method (10, 11, 12).
Thus, the aim of our study was to evaluate the intraobserver and interobserver agreement of a visual semi-quantitative assessment of liver tumour burden, based on CT scan, because this is the most widely used imaging modality in oncology.
This was a retrospective study and a waiver was obtained from the institutional review board.
From 2013 to 2014, all patients referred to our institution with a diagnosis of GEP-NETs were retrospectively reviewed. All patients in whom at least one synchronous liver metastasis was identified on baseline CT scan were included. Patients with previous systemic or locoregional liver therapies were excluded.
The final diagnosis of a neuroendocrine tumour was histologically confirmed by conventional and immunohistochemical techniques using chromogranin A and synaptophysin stain performed on the liver biopsy or the surgical specimen of the primary tumour. In patients with no pathologic proof of neuroendocrine liver metastases, the diagnosis was based on typical imaging features (13).
Demographic, clinical and biological data were collected in a retrospective review of the medical records.
All patients underwent CT scan, performed with a 64-section scanner (VCT LightSpeed; GE Healthcare). The same MDCT protocol optimised for neuroendocrine tumours was performed in all patients. This included an unenhanced acquisition and two contrast-enhanced acquisitions (late arterial and portal venous phase) after intravenous administration of 2mL/kg of non-ionic contrast medium (Xenetix, Guerbet, France).
Anonymous CT data were analysed by 4 readers blinded to the clinical and biological data: two senior abdominal radiologists (MZ and MPV) with 11 and 20 years of experience, respectively, one junior radiologist (ML) and one gastroenterologist (OH) with 10 years of experience in neuroendocrine tumour management.
Each reader was asked to assess the liver tumour burden based on a visual semi-quantitative scale as follows: 0–10, 11–25, 26–50 and more than 50% of tumour involvement of the liver, using arterial or portal venous phase acquisitions depending on the tumour conspicuity (Fig. 1). Intraobserver variability was assessed for one reader (MZ) by repeating the assessment of tumour burden in all patients six months later.
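The four-class scale described above can be written down as a small lookup, shown here as a minimal Python sketch. Note that the readers assigned a class by visual estimation and did not compute a numeric percentage; the function name `burden_class`, the class labels and the handling of non-integer values between class boundaries (for example 10.5%) are illustrative assumptions, not part of the study protocol.

```python
def burden_class(percent_involved: float) -> str:
    """Map an estimated percentage of liver involvement to one of the four classes.

    Assumption: boundaries are treated as <=10, <=25, <=50 and >50, so a
    non-integer estimate such as 10.5 falls in the 11-25% class.
    """
    if not 0 <= percent_involved <= 100:
        raise ValueError("percentage must lie between 0 and 100")
    if percent_involved <= 10:
        return "0-10%"
    if percent_involved <= 25:
        return "11-25%"
    if percent_involved <= 50:
        return "26-50%"
    return ">50%"


if __name__ == "__main__":
    # Hypothetical estimates, for illustration only
    for estimate in (5, 18, 40, 70):
        print(estimate, "->", burden_class(estimate))
```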
Quantitative variables were described by mean and s.d. Categorical variables were described by counts and percentages. Frequencies were compared using a chi-square test. Interobserver agreement was assessed using weighted kappa values. A coefficient between 0.00 and 0.20 indicated slight agreement, 0.21 and 0.40 fair agreement, 0.41 and 0.60 moderate agreement, 0.61 and 0.80 substantial agreement and 0.81 and 1.00 almost perfect agreement (14). Furthermore, percentage of agreement and intraclass correlation coefficient (ICC) were calculated. A P value of less than 0.05 was considered to be significant. All analyses were performed using the Statistical Package for the Social Sciences software (SPSS Inc., version 20.0).
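For readers who wish to reproduce this type of analysis outside SPSS, the following Python sketch computes a weighted kappa, the percentage of agreement and the verbal interpretation listed above for one pair of readers. It is an illustration under stated assumptions and not the code used in the study: the weighting scheme is not specified in the text, so linear weights are assumed by default; classes are coded 0 to 3; and the function names are hypothetical.

```python
from collections import Counter


def weighted_kappa(r1, r2, n_classes=4, weights="linear"):
    """Weighted kappa for two raters whose ratings are class indices 0..n_classes-1.

    Uses disagreement weights w_ij = (|i - j| / (n_classes - 1)) ** p, with
    p = 1 for "linear" (assumed here) and p = 2 for "quadratic".
    """
    if len(r1) != len(r2) or len(r1) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(r1)
    observed = Counter(zip(r1, r2))      # joint counts per (class_i, class_j)
    marg1, marg2 = Counter(r1), Counter(r2)  # marginal counts per rater
    power = 1 if weights == "linear" else 2
    obs_disagreement = 0.0
    exp_disagreement = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            w = (abs(i - j) / (n_classes - 1)) ** power
            obs_disagreement += w * observed[(i, j)] / n
            exp_disagreement += w * (marg1[i] / n) * (marg2[j] / n)
    if exp_disagreement == 0:
        return float("nan")  # undefined when both raters use one identical class
    return 1.0 - obs_disagreement / exp_disagreement


def percent_agreement(r1, r2):
    """Percentage of cases in which both raters chose the same class."""
    return 100.0 * sum(a == b for a, b in zip(r1, r2)) / len(r1)


def interpret_kappa(kappa):
    """Verbal scale quoted in the text; the label for negative values is an addition."""
    if kappa < 0:
        return "poor"
    for upper, label in ((0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")):
        if kappa <= upper:
            return label
    return "almost perfect"


if __name__ == "__main__":
    # Hypothetical ratings for eight patients, for illustration only
    reader_a = [0, 0, 1, 1, 2, 2, 3, 3]
    reader_b = [0, 1, 1, 1, 2, 3, 3, 3]
    k = weighted_kappa(reader_a, reader_b)
    print(round(k, 3), interpret_kappa(k), percent_agreement(reader_a, reader_b))
```

Because the classes are ordered, a weighted kappa penalises a disagreement between distant classes (for example 0–10% versus >50%) more heavily than one between adjacent classes, which an unweighted kappa would treat identically.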
A total of 50 patients were included in the study, 24 men and 26 women, mean age 54 years (30–75).
The primary sites of the neuroendocrine tumours were the foregut (n=28), the midgut (n=15), the hindgut (n=2) and indeterminate (n=5). Forty patients had well-differentiated tumours (80%) and 10 had poorly differentiated tumours (20%). Thirteen (26%) tumours were G1, 33 (66%) were G2 and the remaining 4 (8%) were G3.
The details of the radiological assessment of liver tumour burden by the four readers are summarised in Table 1. There was no significant difference among the liver tumour burden distributions of the four readers (P=0.87).
In a patient-based analysis, all four observers agreed on the liver tumour burden assessment in 29/50 patients (58%). All cases of disagreement concerned two contiguous classes.
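The patient-based summary above (full agreement in 29/50 patients, i.e. 58%, with disagreements limited to contiguous classes) corresponds to a simple tally, sketched below in Python. The function name `patient_level_summary` and the toy ratings are hypothetical; classes are assumed to be coded 0 to 3, and the study data are not reproduced here.

```python
def patient_level_summary(ratings):
    """Summarise agreement per patient.

    ratings: list of tuples, one tuple per patient, holding the class index
    (0..3) assigned by each reader.
    Returns (number of patients with full agreement,
             percentage of patients with full agreement,
             largest gap between classes assigned to any single patient).
    """
    if not ratings:
        raise ValueError("ratings must not be empty")
    full = sum(1 for r in ratings if len(set(r)) == 1)
    max_spread = max(max(r) - min(r) for r in ratings)
    return full, 100.0 * full / len(ratings), max_spread


if __name__ == "__main__":
    # Hypothetical ratings by four readers for four patients
    toy = [(0, 0, 0, 0), (1, 1, 2, 1), (3, 3, 3, 3), (2, 2, 2, 3)]
    print(patient_level_summary(toy))
```

A maximum spread of 1 means that, whenever readers disagreed, they chose adjacent classes, which is the situation reported in the text.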
The interobserver agreement was substantial for all observer pairs (kappa ranging from 0.62 to 0.73) (Table 2). Percentage of agreement ranged from 72 to 80%. Overall, ICC was 0.977 (P<0.0001).
The intraobserver agreement was substantial (kappa=0.78), with a percentage of agreement of 84%; ICC was 0.97 (Table 2).
Although there is no standardised imaging method to reliably measure liver tumour burden, the ENETS Consensus Guidelines state that the estimation of the percentage of liver tumour involvement by an experienced radiologist with a visual semi-quantitative method is the best option (2). However, to our knowledge, the reproducibility of this method has never been studied.
Our study demonstrates that there was substantial agreement in evaluating liver tumour burden on CT examinations among the four observers with different levels of experience. As expected, the strongest interobserver correlation was between the two senior radiologists and the weakest was between the junior radiologist and the gastroenterologist. However, the differences between pairs of observers were not significant. Intraobserver correlation was slightly better than all interobserver correlations.
In our study, we included consecutive patients with neuroendocrine liver metastases regardless of the primary site, tumour grading or differentiation and the imaging features of the liver metastases. Thus, there were different patterns of liver metastases (well defined or infiltrative) in our patient population. Indeed, the evaluation of liver tumour burden is expected to be more difficult when metastases are infiltrative. Moreover, the distribution of patients among the different classes of liver tumour burden was reasonably balanced, which provides an assessment of the reproducibility in patients with various liver tumour burdens.
The thresholds for the classes (≤10, 11–25, 26–50 and >50% of tumour involvement) were chosen because they have already been used in other studies. One of the first studies to use the same percentages for the assessment of liver tumour burden showed a good correlation between plasma levels of chromogranin A and liver tumour burden (9). The clinical relevance of these classes has been confirmed in many studies. Hentic and coworkers have shown that a significant liver tumour burden defined as more than 25% was an independent predictor of poorer survival, whereas Bertani and coworkers showed that a liver tumour burden of less than 25% was an important factor for improved survival (4, 15). Rinke and coworkers showed that the antiproliferative effect of somatostatin analogues was greater in patients with a low (≤10%) liver tumour burden, and Palazzo and coworkers found that a low-to-moderate (≤25%) liver tumour burden was predictive of tumour stability under somatostatin analogue therapy, whereas Caplin and coworkers found an antiproliferative effect of lanreotide in patients with larger hepatic tumour volumes (6, 16). Kress and coworkers have shown that the best morphological response to transarterial chemoembolisation or bland embolisation was obtained with limited liver involvement (<50%) (5). More recently, in a series of 48 patients undergoing SIRT for unresectable NELM, Saxena and coworkers showed that a low liver tumour burden (<25%) was associated with a partial or complete tumour response and thus improved survival (8).
We chose to evaluate liver tumour burden using a visual semi-quantitative assessment, and we showed that this easy method was highly reproducible. In several other studies using the same approach, liver tumour burden was calculated using the four to six most diseased CT slices (6, 9). We feel that our method using the whole liver was more reliable and thorough. Other authors have evaluated liver tumour burden by counting the number of liver metastases (more or less than five lesions) (17, 18). Although a correlation was found between the number of liver metastases and the prognosis, the clinical relevance of this method is probably limited because the number of liver metastases does not necessarily reflect tumour load. Other researchers have determined the percentage of the liver tumour burden using liver volumetry with three-dimensional image-reconstruction commercial software (8, 19). However, this method is complicated and often difficult in patients with a major liver tumour burden and when lesions are poorly defined and/or infiltrative.
Our study has certain limitations. We did not use a reference method in our liver tumour burden assessment. For example, it would have been very difficult to use a pathological examination of the entire liver, because very few patients underwent liver transplantation or hepatectomy. Moreover, comparisons between imaging findings and histological count of NELM on thin serial slices have already been performed by Elias and coworkers, who concluded that half of the NELM were undetectable on preoperative imaging, with an accuracy of 38% for CT scan (20). We could have compared the visual semi-quantitative evaluation to a quantitative volume evaluation using dedicated software. However, as stated previously, obtaining a gold standard by an accurate quantitative evaluation is impossible when the liver tumour burden is significant or in cases of infiltrative disease. Moreover, our primary goal was to assess the reproducibility of visual semi-quantitative assessment and not to assess its accuracy.
In conclusion, we showed that a visual semi-quantitative evaluation of liver tumour burden is very reproducible, regardless of the level of experience of the readers. Therefore, we think that clinical studies in patients with neuroendocrine liver metastases should use this method to categorise liver tumour burden.
The authors declare that there is no conflict of interest that could be perceived as prejudicing the impartiality of the research reported.
This research did not receive any specific grant from any funding agency in the public, commercial or not-for-profit sector.