The 30 PET scans comprised a total of 89 locations of suspected malignancy, according to the gold standard (expert reading). Thirty-four represented tumor locations, 55 were lymph nodes (10 hilar, 39 mediastinal, and six supraclavicular). According to expert readers, there was a mean of three sites (primary lesion and lymph nodes) per patient (range 1–13). The experts classified (according to Table ) 82 lesions as “definitely malignant,” five as “probably malignant,” and two as “equivocal.” In the final analysis, these “probably” and “definitely” malignant locations were classified as malignant. The expert N-stage classifications included nine “N0,” three “N1,” one “N0–N1,” three “N0–N2,” nine “N2,” and five “N3,” according to the definitions mentioned earlier.
Management recommendations were correct in 80% of cases (86 errors out of 420 recommendations: 42 in the experienced group and 44 in the inexperienced group). Agreement with the expert reading was moderate (kappa 0.59) at both levels of experience (Table ). The level of agreement among inexperienced observers tended to be lower, but the difference did not reach significance. Four scans accounted for 38 of the errors (44%), whereas no observer made a single mistake in eight scans.
Interobserver agreement and accuracy as a function of experience with respect to the classification of “N-stage” and “management recommendation”
In the group of inexperienced readers, 29 (of 44; 66%) of the incorrect management recommendations were protocol violations (type “P”), vs. 17 (of 42; 40%) in the experienced readers group (p = 0.12). In contrast, errors that flow directly from reading errors (type “M”) were significantly more prevalent in the group of experienced readers (25 out of 42; 59%), vs. 15 out of 44 (34%) in the inexperienced readers group (p
A common type “P” error (protocol violation) was, for example, to recommend “expectative policy” or “directly to thoracotomy” in a patient without enhanced PET uptake in the primary tumor and mediastinal lymph nodes, even though the provided clinical information stated that bronchoalveolar cell carcinoma had been proven histologically. In such a case, “mediastinal lymph node evaluation” should have been recommended: when the primary adenocarcinoma shows no FDG uptake, PET cannot reliably evaluate the mediastinum, so histological confirmation of the mediastinal status is required.
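The split of management errors by type can be recomputed from the counts reported above. The following is only a minimal sketch: the variable and function names are illustrative and not part of the study, and no significance test is attempted.

```python
# Error counts as reported in the text; all names here are illustrative only.
errors = {
    "inexperienced": {"P": 29, "M": 15},
    "experienced": {"P": 17, "M": 25},
}


def shares(counts):
    """Return each error type's share of the group's total errors, in percent."""
    total = sum(counts.values())
    return {kind: 100 * n / total for kind, n in counts.items()}


# Total number of incorrect recommendations per group.
group_totals = {group: sum(counts.values()) for group, counts in errors.items()}

# Percentage of type "P" and type "M" errors within each group.
group_shares = {group: shares(counts) for group, counts in errors.items()}

for group in errors:
    rounded = {kind: round(pct, 1) for kind, pct in group_shares[group].items()}
    print(group, group_totals[group], rounded)
```

The group totals come out at 44 and 42, which sum to the 86 errors reported overall. The type “M” share in the experienced group is 25/42, or about 59.5%, which the text gives as 59%.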
N-stage classifications were correct in 68% of cases (286 out of 420 assigned N-stages: 138 in the inexperienced group and 148 in the experienced group). Experienced observers tended to agree better with the expert reading than inexperienced ones (weighted kappa values of 0.72 and 0.58, respectively). N-stages were overestimated in 17.4% of cases (16.7% by the experienced and 18.1% by the inexperienced observers) and underestimated in 14.5% of cases (12.9 and 16.2%, respectively). The individual scores of the observers (Table ) reveal that most of them made errors in both directions.
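As a consistency check, the reported percentages can be converted back into whole numbers of readings. This sketch assumes that the 420 readings were split evenly between the two groups (210 each). That split is not stated in this passage, although the reported figures are consistent with it. All names are illustrative.

```python
# Counts reported in the text.
total_readings = 420
correct = {"inexperienced": 138, "experienced": 148}

# Assumption: the 420 readings split evenly between the two groups (210 each).
per_group = total_readings // 2

# Over- and underestimation rates reported in the text, in percent.
over_pct = {"experienced": 16.7, "inexperienced": 18.1}
under_pct = {"experienced": 12.9, "inexperienced": 16.2}


def implied_count(pct, n):
    """Nearest whole number of readings corresponding to pct percent of n."""
    return round(pct * n / 100)


over = {group: implied_count(pct, per_group) for group, pct in over_pct.items()}
under = {group: implied_count(pct, per_group) for group, pct in under_pct.items()}

# Correct + overestimated + underestimated readings per group.
accounted = {group: correct[group] + over[group] + under[group] for group in correct}

for group in correct:
    print(group, correct[group], over[group], under[group], accounted[group])

# Pooled rates across both groups, in percent.
overall_correct_pct = 100 * sum(correct.values()) / total_readings
overall_over_pct = 100 * sum(over.values()) / total_readings
overall_under_pct = 100 * sum(under.values()) / total_readings
print(round(overall_correct_pct, 1), round(overall_over_pct, 1), round(overall_under_pct, 1))
```

Under the even-split assumption, each group's correct, overestimated, and underestimated readings add up to 210, and the pooled rates reproduce the 68%, 17.4%, and 14.5% given in the text.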
Details on N-stage (using the classification system described in the methods section) in 30 scans for each observer
Because three scans were used to practice localizing mediastinal lymph nodes, 27 scans remained, with 26 separate lymph node localizations. The detection rate of individual mediastinal lymph node stations was similar for inexperienced and experienced observers (71 and 74%, respectively, Table ), and the variation within the groups was also comparable. However, experienced readers were better than inexperienced readers at localizing the stations (correct in 68 vs. 51%, respectively). The most common mislocalizations (Table ) were classifying right tracheobronchial stations (4R) as upper-right paratracheal (2R), subcarinal (7) as right tracheobronchial (4R), and left para-esophageal (8/9L) as left tracheobronchial (4L).
Accuracy of inexperienced and experienced observers to detect and localize the 26 mediastinal lymph node stations present according to the expert reading
Mediastinal lymph node stations by experienced and inexperienced observers, according to Mountain and Dresler