To obtain diagnostic performance values of CT, MRI, ultrasound and 18-fludeoxyglucose positron emission tomography (PET)/CT for staging of hilar cholangiocarcinoma.
A comprehensive systematic search was performed for articles published up to March 2011 that fulfilled the inclusion criteria. Study quality was assessed with the quality assessment of diagnostic accuracy studies tool.
16 articles (448 patients) were included that evaluated CT (n=11), MRI (n=3), ultrasound (n=3), or PET/CT (n=1). Overall, their quality was moderate. The accuracy estimate of CT for evaluation of ductal extent of the tumour was 86%. The sensitivity and specificity estimates of CT were 89% and 92% for evaluation of portal vein involvement, 84% and 93% for hepatic artery involvement, and 61% and 88% for lymph node involvement, respectively. Data were too limited for adequate comparisons of the different techniques.
Diagnostic accuracy studies of CT, MRI, ultrasound or PET/CT for staging of hilar cholangiocarcinoma are sparse and have moderate methodological quality. Data primarily concern CT, which has an acceptable accuracy for assessment of ductal extent, portal vein and hepatic artery involvement, but low sensitivity for nodal status.
Different imaging investigations are currently used for staging of hilar cholangiocarcinoma patients, yet most evidence is available from CT. Although the quality of imaging has improved in recent years, a substantial proportion of tumours are still found to be unresectable during laparotomy, despite extensive pre-operative work-up. Only about half of the tumours that are surgically explored are ultimately resectable. Hence, correct staging of hilar cholangiocarcinoma remains a challenge.
Criteria for unresectable disease include locally advanced tumour, distant metastases and lymph node metastases beyond the hepatoduodenal ligament. Locally advanced disease is based on extent of infiltration proximally in the biliary ductal tree, the portal vein and the hepatic artery or its branches. Therefore an accurate assessment of resectability of patients with hilar cholangiocarcinoma includes correct evaluation of local, vascular and nodal status, as well as assessment of distant metastases. Accurate staging of vascular status and proximal ductal extent of hilar cholangiocarcinoma is challenging owing to the anatomic complexity of the hilar region, in conjunction with the small tumour size in most cases.
MRI, CT, ultrasound and 18-fludeoxyglucose (FDG) positron emission tomography (PET)/CT are currently used to assess resectability of hilar cholangiocarcinoma. The aim of this study was to systematically review the literature on these four imaging investigations for assessment of ductal extent of the tumour, portal vein involvement, hepatic artery involvement, lymph node status and presence of metastases in patients with hilar cholangiocarcinoma.
A comprehensive search for studies in human subjects was performed by one observer to identify those studies dealing with the diagnostic performance of CT, ultrasound, MRI and FDG-PET/CT for staging of hilar cholangiocarcinoma. The databases used for the literature search included MEDLINE (January 1966 to March 2011), Embase (January 1980 to March 2011) and the Cochrane Database of Systematic Reviews. Details of the search strategy are shown in Table 1. All review articles, letters, comments and case reports (<10 patients) were eliminated. Articles found to be eligible on the basis of their title and abstract were subsequently selected for full manuscript review. We augmented our literature search by manually reviewing the reference lists of identified studies.
Two observers independently reviewed all eligible articles for the following inclusion criteria: the study population (hilar cholangiocarcinoma patients with reference standard) included at least 10 patients; presented data were about primary staging; the reference standard included surgical or pathological confirmation; and 2×2 contingency tables or data allowing their reconstruction were presented for portal vein involvement, hepatic artery involvement, lymph node involvement or presence of distant metastasis. Articles giving sufficient data to calculate the accuracy for ductal extent of the tumour were also included, because these data are not presented in 2×2 tables. Details are described in the following section.
Studies were excluded when no specific data of hilar cholangiocarcinoma patients could be retrieved (when patients were pooled with patients with other bile duct carcinomas). When duplicate studies presenting data about the same population were found, only the most recent one was included. If an article presented data on more than one staging element, inclusion and exclusion criteria were checked for each discrete data set. Disagreements on inclusion suitability were resolved by consensus.
Both observers independently extracted relevant data from each article by using a standardised form. The readers were not blinded to the authors, journal or year of publication. Disagreements were resolved by consensus.
Methodological quality of included studies was assessed independently by the two readers using the quality assessment of diagnostic accuracy studies (QUADAS) tool, which is a quality assessment tool specifically developed for systematic reviews of diagnostic accuracy studies. Nine items were extracted; five items were deemed irrelevant, given the inclusion and exclusion criteria (e.g. whether the reference standard was likely to correctly classify the target condition). In addition to the nine QUADAS items, the design (retrospective or prospective) of the study was scored. Criteria for different QUADAS items are shown in Table 2.
The following data were recorded for each article: sample size, year of publication, male:female ratio, age, stage of disease, imaging modality (CT, ultrasound, MRI, PET/CT) and technique.
For studies in which CT was used, the following imaging features were recorded: system type, contrast material use, amount of iodine administered, imaging phases, section thickness and number of detector rows. For MRI, we recorded the magnetic field strength, contrast material use, pulse sequences and section thickness. For PET/CT, we recorded the system type, tracer specifics, scanning time and type of analysis. For ultrasound, we recorded the system type, transducer frequency (MHz) and experience of the ultrasonographer.
For evaluation of the portal vein, hepatic artery and lymph node involvement, and distant metastasis, we constructed a 2×2 contingency table on a per-patient basis for the imaging modality used as compared with the reference standard. For ductal extent of the tumour we calculated the accuracy for each study as follows: correct diagnosis of the ductal extent of the tumour (as classified by the Bismuth–Corlette classification) divided by the total number of patients.
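The per-patient tabulation described above can be sketched as follows. This is a minimal illustration in Python, not the code used in the review; the names `TwoByTwo`, `build_table` and `accuracy` are hypothetical. The only figures taken from the text are the ultrasound counts for ductal extent (13 correct out of 22 patients).

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class TwoByTwo:
    """Per-patient 2x2 table: imaging (index test) versus reference standard."""
    tp: int  # imaging positive, reference positive
    fp: int  # imaging positive, reference negative
    fn: int  # imaging negative, reference positive
    tn: int  # imaging negative, reference negative

    @property
    def total(self) -> int:
        return self.tp + self.fp + self.fn + self.tn


def build_table(imaging: Sequence[bool], reference: Sequence[bool]) -> TwoByTwo:
    """Count one cell per patient; both sequences are ordered by patient."""
    if len(imaging) != len(reference):
        raise ValueError("need one imaging and one reference result per patient")
    pairs = [(bool(i), bool(r)) for i, r in zip(imaging, reference)]
    return TwoByTwo(
        tp=sum(i and r for i, r in pairs),
        fp=sum(i and not r for i, r in pairs),
        fn=sum((not i) and r for i, r in pairs),
        tn=sum((not i) and (not r) for i, r in pairs),
    )


def accuracy(correct: int, total: int) -> float:
    """Accuracy for ductal extent: correctly classified patients / all patients."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("require 0 <= correct <= total and total > 0")
    return correct / total


# Ductal extent by ultrasound in one included study: 13 of 22 patients correct.
ultrasound_accuracy = accuracy(13, 22)  # about 0.59
```

Ductal extent is handled by `accuracy` alone because the Bismuth–Corlette classification has more than two categories, so a study's result reduces to "correctly classified or not" rather than to a 2×2 table.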
We calculated the accuracy for each study by dividing the number of correctly staged patients by the total number of patients. We used random-effects models to obtain summary estimates of accuracy. All analyses were performed on logit-transformed accuracy because logit-transformed values are assumed to follow a normal distribution across studies, and therefore the mean logit accuracy with corresponding standard errors can be calculated. After antilogit transformation, summary estimates of accuracy with their 95% confidence intervals (CIs) were obtained.
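To make the logit and antilogit steps concrete, the sketch below pools per-study accuracies on the logit scale. The text does not state which random-effects estimator was used, so the DerSimonian–Laird method of moments here is an assumption chosen for brevity, as are the 0.5 continuity correction and the function name `pool_logit_accuracy`. The two input studies are the CT counts from the head-to-head comparisons reported later (9/14 and 16/20); they serve only as an illustration and do not reproduce the published summary estimate.

```python
import math
from typing import List, Tuple


def antilogit(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def pool_logit_accuracy(
    studies: List[Tuple[int, int]], z: float = 1.96
) -> Tuple[float, float, float]:
    """Random-effects summary of accuracy from (correct, total) pairs.

    Returns (estimate, lower, upper) on the proportion scale.
    Assumption: DerSimonian-Laird between-study variance.
    """
    if not studies:
        raise ValueError("at least one study is required")
    logits, variances = [], []
    for correct, total in studies:
        right, wrong = float(correct), float(total - correct)
        if right == 0 or wrong == 0:  # avoid log(0); assumed correction
            right, wrong = right + 0.5, wrong + 0.5
        logits.append(math.log(right / wrong))
        variances.append(1.0 / right + 1.0 / wrong)

    # Fixed-effect weights and Cochran's Q, needed for the tau^2 estimate.
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed_mean = sum(wi * yi for wi, yi in zip(w, logits)) / sw
    q = sum(wi * (yi - fixed_mean) ** 2 for wi, yi in zip(w, logits))
    k = len(logits)
    c = sw - sum(wi * wi for wi in w) / sw
    tau2 = max(0.0, (q - (k - 1)) / c) if k > 1 and c > 0 else 0.0

    # Random-effects weights add the between-study variance to each study.
    w_re = [1.0 / (v + tau2) for v in variances]
    mean = sum(wi * yi for wi, yi in zip(w_re, logits)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))

    # The antilogit transform returns the estimate and CI to the 0-1 scale.
    return antilogit(mean), antilogit(mean - z * se), antilogit(mean + z * se)


estimate, lower, upper = pool_logit_accuracy([(9, 14), (16, 20)])
```

Working on the logit scale keeps the back-transformed confidence limits inside the 0–100% range, which a symmetric interval on the raw proportion would not guarantee for small studies.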
For each study, we calculated the sensitivity and specificity on a per-patient basis: sensitivity as true-positive findings divided by (false-negative findings+true-positive findings) and specificity as true-negative findings divided by (false-positive findings+true-negative findings). We used bivariate random-effects models to obtain summary estimates of sensitivity and specificity with 95% CIs. All analyses were performed on logit-transformed sensitivity and specificity, because these are assumed to follow a normal distribution across studies, and therefore the mean logit sensitivity and specificity with corresponding standard errors were obtained. After antilogit transformation, we obtained summary estimates of sensitivity and specificity along with their 95% CIs. These analyses were executed by using PROC NLMIXED in SAS® v. 9.2 software (SAS Institute, Cary, NC).
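The per-study part of this calculation is simple enough to sketch directly. The example below covers only the single-study step (sensitivity, specificity and a confidence interval formed on the logit scale); it does not implement the bivariate model fitted in SAS. The function names and the 0.5 continuity correction are assumptions, and the counts are those reported in the Results for CT and distant metastasis (4/6 and 15/16).

```python
import math
from typing import Tuple


def sensitivity(tp: int, fn: int) -> float:
    """True positives / (false negatives + true positives)."""
    return tp / (fn + tp)


def specificity(tn: int, fp: int) -> float:
    """True negatives / (false positives + true negatives)."""
    return tn / (fp + tn)


def antilogit(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def logit_ci(events: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Approximate 95% CI for one proportion, built on the logit scale."""
    yes, no = float(events), float(total - events)
    if yes == 0 or no == 0:  # avoid log(0); assumed continuity correction
        yes, no = yes + 0.5, no + 0.5
    centre = math.log(yes / no)
    se = math.sqrt(1.0 / yes + 1.0 / no)
    return antilogit(centre - z * se), antilogit(centre + z * se)


# CT for distant metastasis in one included study: TP=4, FN=2, TN=15, FP=1.
sens = sensitivity(tp=4, fn=2)    # 4/6
spec = specificity(tn=15, fp=1)   # 15/16
sens_low, sens_high = logit_ci(4, 6)
```

With only six reference-positive patients the interval around the sensitivity is very wide, which illustrates why single small studies were not considered sufficient for firm conclusions.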
The initial search yielded 766 articles (Figure 1). The Cochrane Database of Systematic Reviews yielded no additional studies. A total of 135 were duplicate studies, and were excluded. An additional 576 articles were excluded based on review of article titles, abstracts or both. The most frequent reason for exclusion was that the study concerned a review, comment or case report. We excluded another four studies, since these were written in Chinese and we could not read the manuscript.
The manual search of cross-references of the 51 remaining studies yielded an additional 19 articles based on title, and subsequently the abstract. Of the 70 full-text publications, 54 were excluded. 20 publications presented the data in a format that precluded construction of 2×2 tables, and 22 publications included a wider spectrum of bile duct cancers, and specific data about hilar cholangiocarcinoma could not be retrieved. Other reasons for exclusion were: data not about primary staging (n=3), <10 hilar cholangiocarcinoma patients (n=1), no surgical or pathological reference standard (n=6) and duplicate studies (n=2). The remaining 16 articles met the inclusion criteria.
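The selection counts reported above can be checked with a few lines of arithmetic. The sketch below simply re-adds the numbers given in the text; the function name `study_flow` and the dictionary keys are hypothetical labels.

```python
def study_flow() -> dict:
    """Recompute the study selection flow from the counts stated in the text."""
    initial_hits = 766
    duplicates = 135
    excluded_on_title_or_abstract = 576
    excluded_unreadable_language = 4
    remaining = (
        initial_hits
        - duplicates
        - excluded_on_title_or_abstract
        - excluded_unreadable_language
    )

    added_from_reference_lists = 19
    full_text_reviewed = remaining + added_from_reference_lists

    # Reasons for exclusion after full-text review.
    full_text_exclusions = {
        "no 2x2 table could be constructed": 20,
        "hilar data not separable from other bile duct cancers": 22,
        "not primary staging": 3,
        "fewer than 10 hilar cholangiocarcinoma patients": 1,
        "no surgical or pathological reference standard": 6,
        "duplicate study": 2,
    }
    excluded_full_text = sum(full_text_exclusions.values())

    return {
        "remaining_after_screening": remaining,
        "full_text_reviewed": full_text_reviewed,
        "excluded_full_text": excluded_full_text,
        "included": full_text_reviewed - excluded_full_text,
    }


flow = study_flow()
```

The result matches the narrative: 51 studies after screening, 70 full-text publications, 54 exclusions and 16 included articles.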
Table 3 shows methodological assessment of included studies using the QUADAS checklist. There was a large variation in methodological quality, and many QUADAS items were not adequately described in the publication. Hence, many QUADAS items were scored as unclear. A consecutive series of patients with a description of age, sex and classification was presented in six studies. Inclusion and exclusion criteria were clearly mentioned in seven studies. The time period between imaging and the reference standard was mentioned in only 9 studies, and was 31 days or less in 7 studies. In none of the studies was the reference standard established by blinded pathological assessment, and therefore no reference standard was obtained without knowledge of the index test. In most studies, it was unclear whether the same clinical information typically available in clinical practice was available at the time of the index test interpretation. Only one study reported a prospective design. Eight studies were retrospectively performed, and in seven studies the design was unclear.
The characteristics of the included studies are presented in Table 4. The total number of patients in a study ranged from 11 to 83 (median 22 patients). Reported age ranged from 29 to 89 years, and the proportion of male patients ranged from 46% to 91%. Most studies evaluated the performance of CT (n=11); other studies evaluated the use of MRI (n=3), ultrasound (n=3) and PET/CT (n=1). Two small studies compared two investigations (CT and MRI) head to head [5,6]. The included studies evaluated only some of the staging items (ductal extent, portal vein, hepatic artery, nodal status and metastasis), as is shown in Table 5.
CT was performed in 11 studies (no system type was specified in one study, a single-slice helical scanner was used in another study, the remaining 9 studies used a multislice helical scanner). Iodinated contrast material was intravenously administered in all studies, with the possible exception of one study that did not report any technical details of the CT. 8 of the 11 studies used a 16-slice CT. The type of contrast material was reported in four studies, with variation in the administered amount (100–150 ml). All studies used three or four phases, including a portal and an arterial phase. The delay after injection of contrast agent before image acquisition varied in the ranges 5–25 s for the arterial phase and 20–70 s for the portal phase. The section thickness was described in eight studies (range 1.25–6.00 mm).
MRI was performed in three studies, including one that did not report details on the MRI. The other two studies used 1.5 T MRI equipment. One study used intravenous gadolinium contrast enhancement; no specific contrast media were used.
PET/CT was evaluated in one study that used a combination of a single-slice spiral CT and full-ring PET. PET imaging data were acquired 60 min after injection of 350 MBq of FDG. PET images were corrected for attenuation based on the CT data.
Ultrasound was performed in three studies using 2.5–5.0 MHz systems. Only one study reported the experience of the sonographers performing the ultrasound, which was 5 and 20 years in this study.
Most CT studies provided data about the assessment of ductal extent of the tumour. Table 6 shows accuracy for ductal involvement of the various studies. The summary estimate for accuracy for CT was 86% (95% CI, 77–92%).
Three included studies reported on the performance of MRI. Accuracy for ductal extent for MRI varied from 71% to 80%.
Two studies were found reporting on the performance of ultrasound for the assessment of ductal extent of the tumour, one with an accuracy of 59% (13/22) and the other with an accuracy of 82% (32/39).
No data on PET/CT were available.
7 studies, including 292 patients, reported sensitivity and specificity of CT for portal vein involvement, as shown in Table 7. Summary estimates were 89% for sensitivity (95% CI, 80–94%) and 92% for specificity (95% CI, 85–96%).
Only one study evaluated the use of MRI for assessment of portal vein involvement and found a sensitivity of 79% (11/14) and specificity of 0% (0/1).
Two studies evaluated the use of ultrasound for assessing portal vein involvement, and reported a sensitivity and specificity of 75% (6/8) and 93% (12/13), and 83% (15/18) and 100% (5/5), respectively.
No data on PET/CT were available.
6 studies, including 191 patients, reported sensitivity and specificity on hepatic artery involvement evaluated by CT (Table 7). Summary estimates were 84% for sensitivity (95% CI, 63–94%) and 93% for specificity (95% CI, 69–99%).
Two studies reported on the use of ultrasound for assessment of involvement of the hepatic artery, and reported a sensitivity and specificity of 0% (0/1) and 100% (21/21), and 43% (3/7) and 100% (4/4), respectively.
No studies were identified that evaluated the use of MRI or PET/CT for assessing hepatic artery involvement.
5 studies, including 136 patients, reported on sensitivity and specificity of CT for lymph node status (Table 7). Summary estimates were 61% for sensitivity (95% CI, 28–86%) and 88% for specificity (95% CI, 74–95%).
One study, including only 17 patients, reported on the performance of PET/CT in evaluating nodal status. The authors found a sensitivity and specificity for nodal status of 42% and 80%, respectively.
No studies were identified that evaluated the use of MRI or ultrasound for assessment of lymph node metastasis.
Only one study reported on the sensitivity and specificity of CT for distant metastasis, which were 67% (4/6) and 94% (15/16), respectively. Yet these results must be interpreted with caution, since this study was published in 1989 and it has several serious methodological limitations, as demonstrated by the methodological quality assessment (Table 3).
One study reported on the performance of PET/CT in identifying distant metastasis, and found a sensitivity and specificity of 56% and 88%, respectively.
Two studies compared MRI with CT head to head [5,6]. In one study, accuracies for ductal extent of the tumour for CT and MRI were 64% (9/14) and 71% (10/14), respectively. In the second study, accuracies for ductal extent of the tumour were 80% (16/20) and 75% (15/20) for CT and MRI, respectively. No statistical tests were executed owing to the very small patient numbers. Both studies only reported the accuracy for ductal extent of the tumour, and unfortunately did not report on the other staging parameters (vascular involvement or lymph node metastases).
In this review of all imaging studies concerning the staging of hilar cholangiocarcinoma, we found only 16 studies (primarily concerning CT only), including in total 448 patients. Although our aim was to compare the four individual investigations, this was not feasible because of the insufficient availability of adequate studies on MRI, ultrasound and PET/CT. Only data on CT were sufficient for pooling the findings. Pooled accuracy of CT for assessment of ductal extent of the tumour was 86%. Pooled sensitivity and specificity of CT were 89% and 92% for assessment of portal vein involvement, 84% and 93% for assessment of hepatic artery involvement, and 61% and 88% for assessment of nodal status, respectively.
The results of this systematic review should be interpreted with caution because of several limitations. Firstly, the included studies have limited methodological quality, as was detected by using the QUADAS tool (Table 3). Moreover, probably most importantly, the time between the index test and reference standard was mentioned in only 9 studies, and was 31 days or less in 7 studies. Therefore, misclassification due to progression of disease may have occurred. Results of studies reporting on diagnostic performance are hard to interpret without details on methodology, and consequently many QUADAS items were scored as unclear. New diagnostic studies should be reported in adherence to the Standards for Reporting of Diagnostic Accuracy (STARD). The STARD initiative provides a checklist of items that should be included in the report of a study on diagnostic accuracy.
Secondly, a meta-analysis was not feasible for ultrasound, MRI and PET/CT owing to the low number of data sets and low number of patients in the data sets. Consequently, no comparisons could be made. Thirdly, heterogeneity of the studies was substantial, owing to differences in imaging technique (e.g. for CT varying between 4- and 64-slice CT); study quality; small sample sizes in the studies (and consequent variability as a result of chance); and patient population (e.g. differences in disease stage). To compensate for this problem, we used random-effects models to account for the heterogeneity. Fourthly, we found only two studies, including only 34 patients in total, that compared two imaging investigations in the same population [5,6]. Ideally the diagnostic accuracy of competing imaging investigations is assessed in the same patient population. This enables a more accurate assessment of differences between investigations, and should also make their pros and cons easier to identify. Fifthly, another limitation could be publication bias. Because of the small number of studies, as well as the small number of patients within the studies, we believe a funnel plot to investigate publication bias would not be meaningful.
20 articles were excluded because no 2×2 contingency table could be extracted. An additional 22 of the 70 articles selected for full manuscript review were excluded because no specific hilar cholangiocarcinoma data could be retrieved. Often data regarding hilar cholangiocarcinoma patients are reported in combination with the findings in other bile duct cancers. Presumably, this is a result of the rareness of the disease, and consequent small patient numbers. From a molecular and cell biological point of view, the discussion is still ongoing as to whether bile duct cancers can be seen as one entity or should be separated according to their location (intrahepatic, hilar and distal) [14,15]. Yet, from a radiological point of view, interpretation of vascular involvement is clearly different when assessing an intrahepatic tumour from when assessing a hilar tumour that grows directly adjacent to the vessels. Moreover, gallbladder cancer, which is known for its high likelihood of metastatic disease, is also frequently included in studies on bile duct cancer, which can significantly impede correct interpretation of results. Therefore, we believe that future diagnostic studies on bile duct cancer should separately report the specific hilar cholangiocarcinoma data in the publication.
In addition to the imaging investigations evaluated in this study, more invasive techniques (such as endoscopic retrograde cholangiopancreatography, percutaneous transhepatic cholangiography, staging laparoscopy, endoscopic ultrasound, intraductal ultrasound and choledochoscopy) are used in some centres to improve staging of hilar cholangiocarcinoma. Yet, since the staging process starts with non-invasive imaging, we intentionally have not evaluated these techniques in this systematic review, and have focused on non-invasive imaging. Nonetheless, the staging accuracy for hilar cholangiocarcinoma patients could be improved, and exploratory laparotomies for unresectable patients could potentially be decreased by using these techniques.
Finally, ultrasound, CT, MRI and PET/CT are used nowadays alone or in various combinations with each other for staging of hilar cholangiocarcinoma, and for the choice between surgical resection with curative intent and palliative treatment. As Table 4 clearly shows, evidence on the staging accuracy of these investigations is limited, especially for MRI, ultrasound and PET/CT. Moreover, no adequate head-to-head comparative studies exist. As a result, an accurate comparison of the investigations and an evidence-based guideline cannot be made yet. Thus, it is of vital importance that future studies provide this evidence. These studies ideally should have a prospective design, although because of the time required for prospective studies on this rare disease, well-designed retrospective studies could also be of importance. Furthermore, most evidence is available about the use of CT for staging of hilar cholangiocarcinoma, yet these studies have serious methodological flaws and small patient numbers. However, and probably even more importantly, no criteria for involvement of adjacent structures (e.g. of the portal vein) exist for hilar cholangiocarcinoma. For pancreatic cancer, by contrast, several criteria for portal vein involvement, including the presence of tumour involvement exceeding 180° of the circumference of the portal vein, have been established [16,17]. Hence, future CT studies on hilar cholangiocarcinoma should focus on radiological criteria that can accurately predict vessel involvement using contemporary scanners.
In conclusion, diagnostic accuracy studies of CT, MRI, ultrasound or PET/CT for staging of hilar cholangiocarcinoma patients are sparse, often include a low number of patients and have moderate methodological quality. Therefore, there is a need for new methodologically solid studies. Owing to the lack of evidence, an adequate comparison of the various investigations was not feasible. Most evidence is available regarding staging with CT, which seems to have acceptable accuracy for assessment of ductal extent, portal vein and hepatic artery involvement (accuracy, sensitivity and specificity ranging between 84% and 93%). The sensitivity of CT for nodal status, however, seems to be low (61%).