Over time, the histology of papillary thyroid cancers detected in a population exposed to radiation at Chornobyl has shifted from a more aggressive toward less aggressive subtypes. This change may reflect biologic behavior, but could also be influenced by the detectability of different subtypes. The study objective was to identify whether there is any relationship between the conspicuity of ultrasound-detected papillary cancers and histologic subtype.
Ultrasound images of 84 papillary cancers occurring in young people exposed to radiation at Chornobyl were each given a conspicuity score using a subjective 1 to 5 scale by four independent expert readers blinded to histologic subtype. The effects of tumor subtype, tumor encapsulation, reader, machine type, and size on conspicuity were determined using ANOVA and Spearman correlations.
Cancer subtype was related (p<0.01) to conspicuity. The relatively aggressive solid subtype of papillary carcinoma was more conspicuous than the papillary, follicular, and mixed subtypes (p<0.05). The other subtypes did not differ significantly from each other in conspicuity. Conspicuity was not significantly related to nodule size, degree of encapsulation, age and gender of the subject, or machine type. Although the mean conspicuity score for each reader differed significantly, reliability of conspicuity judgments across readers was fair.
In subjects exposed to the Chornobyl accident, the solid subtype of papillary carcinoma appears to be more conspicuous on ultrasound than other subtypes. It is therefore possible that the observed change in subtype over time in this repeatedly screened population is influenced by differences in nodule conspicuity.
The most significant health consequence to date of the April 26, 1986 Chornobyl nuclear power plant accident has been papillary thyroid cancer, principally among those who were exposed during childhood or adolescence (REF 1). Since 1998, the Ukrainian-American Cohort Study of Thyroid Cancer and Other Thyroid Diseases Following the Chornobyl Accident (UkrAm Project) has followed a cohort of approximately 13,000 people who were living in areas affected by the fallout from Chornobyl (note 1) and who received direct dosimetric measurements of thyroid gland activity within a short time of the accident. All the participants were under 18 at the time of the catastrophe. Those in the study cohort received comprehensive evaluations at least every two years, including an endocrinologic assessment and an ultrasound evaluation of the thyroid (REF 2). Most of these assessments were done by mobile teams who traveled among villages in the affected territory, often working in difficult conditions. Patients with suspicious findings were referred to the central endocrinologic hospital in Kyiv for further evaluation, often including biopsy. Patients with suspicious clinical or cytological findings were referred for surgery.
Over time, there has been a change in the relative percentages of the various subtypes of papillary cancer found among the UkrAm cohort (REF 3,4). The system for histologic subtyping used in the project categorizes papillary cancers into papillary, follicular, solid, and mixed variants. Of these, the solid variants are considered less well differentiated than the papillary and follicular variants. The solid subtypes have shown a greater propensity for aggressive biologic behavior, such as extrathyroidal extension and regional and distant metastases. Fortunately, however, this more aggressive appearance has not been associated with any difference in mortality: there have been no papillary-cancer-related deaths in the study cohort to date (REF 4). With increasing time since the accident, there has been a trend toward more well-differentiated subtypes of papillary cancer. This finding has been described both in our cohort and in an earlier study by Williams and colleagues of children exposed at Chornobyl (REF 3–5). One obvious possibility is that this apparent change in subtype ratios reflects a true biologic change over time. Another possibility in a repeatedly screened population is that the change, at least in part, simply reflects a difference in the ease with which the various subtypes can be detected by ultrasound. That is, if one subtype is easier to detect on ultrasound than the others, a higher proportion of such tumors may be found in each screening cycle, thus overstating the relative incidence of this subtype. Preferential culling of the more easily detectable subtypes in early screenings could cause a relative increase in the less conspicuous cancers on later screenings. It is unknown, however, whether in fact there is any difference in the ultrasound conspicuity of the different subtypes. This study was designed to answer that question.
In this study, we asked readers to judge the overall conspicuity of the papillary tumor subtypes in the Ukraine sample. A failure to find a relationship between conspicuity and subtype would strengthen the hypothesis that subtype has truly changed over time, as has previously been suggested. If, on the other hand, a relationship between tumor subtype and conspicuity does exist, then that hypothesis may be incomplete or incorrect.
The Ukrainian-American Cohort Study of Thyroid Cancer and Other Thyroid Diseases Following the Chornobyl Accident (UkrAm Study) was reviewed and approved by the Institutional Review Boards of the U.S. National Cancer Institute and the Institute of Endocrinology and Metabolism (IEM) of the Academy of Medical Sciences of Ukraine, and all participants signed an informed consent form.
Details of the design of the UkrAm study, and of a parallel study in Belarus, have been published previously (REFS 2, 3). Briefly, the cohort includes subjects who had direct thyroid radioactivity measurements made in May or June 1986, lived in the most heavily contaminated regions of Chernihiv, Zhytomyr, and Kyiv oblasts or Kyiv city, Ukraine, in 1998, and were younger than 18 years of age on 26 April 1986. An oblast is an administrative subdivision similar in size to a state or province.
The source population for the Ukrainian cohort consisted of 75,349 individuals born between April 26, 1968 and April 26, 1986 who were stratified according to preliminary 131I thyroid dose estimates obtained from direct thyroid dosimetry measurements. All those from the highest dose group (1 gray or more) and a random sample of those from lower dose groups were selected, resulting in 32,385 potential study participants. From the subjects originally selected, 2,466 (8%) were not eligible for a variety of reasons, including military service or incarceration, and 10,307 (32%) could not be located, primarily due to resettlement or migration after the accident. We invited 19,612, of whom 6,369 (32%) refused to participate or failed to attend the initial screening, leaving a screened cohort of 13,243.
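The cohort attrition described above can be checked with a few lines of arithmetic. The following is a minimal sketch in Python; the counts are taken from the paragraph above, while the variable names and the `pct` helper are illustrative assumptions and not part of the study's own tooling.

```python
# Cohort attrition check. All counts come from the text above;
# the variable names and the pct() helper are illustrative only.
source_population = 75_349   # born April 26, 1968 to April 26, 1986
selected = 32_385            # highest dose group plus random sample of lower groups
not_eligible = 2_466         # e.g. military service or incarceration
not_located = 10_307         # mostly resettlement or migration

invited = selected - not_eligible - not_located
refused_or_absent = 6_369    # refused or failed to attend initial screening
screened = invited - refused_or_absent


def pct(part: int, whole: int) -> int:
    """Percentage of `whole` represented by `part`, rounded to an integer."""
    return round(100 * part / whole)


print(invited)                           # 19612
print(screened)                          # 13243
print(pct(not_eligible, selected))       # 8
print(pct(not_located, selected))        # 32
print(pct(refused_or_absent, invited))   # 32
```

Note that the first two percentages are relative to the 32,385 selected subjects, whereas the refusal percentage is relative to the 19,612 who were invited.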
The cohort was screened at least biennially either by a mobile team assigned to a regional hospital or at the IEM in Kyiv, Ukraine. Examinations included thyroid palpation and ultrasound examination by a physician specializing in ultrasound (ES, YN) and independent clinical examination and palpation by an endocrinologist. The thyroid was imaged with the subject supine and neck extended. From 1998–2002, screening was done using 7.5 MHz probes, either an electronic linear transducer (Hitachi Medical Systems, Tokyo, Japan; GE Logiq a100, General Electric Company, Milwaukee, WI, USA) or a mechanical sector probe with water bag kit (Tosbee SSA 240s with 7.5 MHz SM-708A probes; Toshiba Corporation, Tokyo, Japan). In 2002, this equipment was replaced with a laptop-based mobile system that used a 10 MHz linear probe (Terason Ultrasound, Burlington, MA, USA). Detailed information about the location and characteristics of nodules and regional lymph nodes, and about thyroid size and echostructure were recorded on a standardized ultrasound form. Early images were recorded on thermal paper or a Camtronics magneto-optical disk (Camtronics Medical Systems, Birmingham, AL, USA) and later were scanned into a central database as part of the patient record. The images obtained with the laptop-based Terason system were stored initially on a hard drive and transferred directly to the central image database.
Ultrasound-guided fine needle aspiration (FNA) was performed by a physician specializing in ultrasound at the IEM on all palpable and ultrasound-detected nodules that were ≥ 10 mm in greatest dimension and on all nodules 5–10 mm in greatest dimension that had one or more of the following features: were hypoechoic, or had microcalcifications, an irregular contour, extension through thyroid capsule, interval growth, or suspicious adenopathy. In glands containing multiple nodules, up to three were sampled. Suspicious lymph nodes were also biopsied, regardless of the presence of a thyroid abnormality. Patients were referred to surgery if their FNA cytology was interpreted as suspicious for, or diagnostic of, malignancy or follicular neoplasm.
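The size and feature criteria for FNA described above can be written as a simple decision rule. This is a minimal sketch, not the project's actual software: the `Nodule` class, its field names, and the function name are hypothetical, and the treatment of nodules smaller than 5 mm (not sampled) is an assumption inferred from the text rather than stated in it.

```python
from dataclasses import dataclass


@dataclass
class Nodule:
    """Hypothetical record of one nodule; field names are illustrative."""
    max_dimension_mm: float
    hypoechoic: bool = False
    microcalcifications: bool = False
    irregular_contour: bool = False
    extension_through_capsule: bool = False
    interval_growth: bool = False
    suspicious_adenopathy: bool = False


def meets_fna_criteria(nodule: Nodule) -> bool:
    """Return True if the nodule meets the size/feature criteria for FNA."""
    # Nodules of 10 mm or more in greatest dimension were always sampled.
    if nodule.max_dimension_mm >= 10:
        return True
    # Nodules of 5-10 mm were sampled only with at least one suspicious feature.
    if 5 <= nodule.max_dimension_mm < 10:
        return any([
            nodule.hypoechoic,
            nodule.microcalcifications,
            nodule.irregular_contour,
            nodule.extension_through_capsule,
            nodule.interval_growth,
            nodule.suspicious_adenopathy,
        ])
    # Assumption: nodules under 5 mm were not sampled (not stated explicitly).
    return False


print(meets_fna_criteria(Nodule(12.0)))                  # True
print(meets_fna_criteria(Nodule(7.0)))                   # False
print(meets_fna_criteria(Nodule(7.0, hypoechoic=True)))  # True
```

The sketch does not model the limit of three sampled nodules per gland or the separate biopsy of suspicious lymph nodes, since the text does not say how nodules were prioritized.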
All surgical specimens were examined by two Ukrainian pathologists (TIB and LYZ) and classified as papillary, follicular, or medullary cancer according to the World Health Organization histological system (REF 6); reviewed by one of the US authors (EG); and confirmed by an International Pathology Panel established by the Chernobyl Tissue Bank Project (REF 7).
The papillary thyroid cancers were further separated into subtypes depending on the dominant histological component: papillary, follicular, or solid if > 80% of the tumor had the corresponding structure and mixed when nearly equal parts of different components were present, according to the system described by Bogdanova and colleagues (REF 4; Figure 1). Note that this classification refers only to the histologic appearance of the cancers and does not correspond to whether a nodule appears solid or cystic on ultrasound. Each cancer was characterized as “completely encapsulated,” “partly encapsulated,” or “not encapsulated.”
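The dominant-component rule can be illustrated with a short sketch. The function name and its percentage inputs are hypothetical, and one simplification is an explicit assumption: the text defines "mixed" as nearly equal parts of different components, whereas the sketch labels every tumor without a component above 80% as mixed. How intermediate cases were actually handled is not stated here.

```python
def classify_subtype(papillary_pct: float, follicular_pct: float, solid_pct: float) -> str:
    """Assign a subtype from the percentage of each structural component.

    Assumption: any tumor with no component above 80% is labeled "mixed".
    """
    components = {
        "papillary": papillary_pct,
        "follicular": follicular_pct,
        "solid": solid_pct,
    }
    for name, share in components.items():
        # Strictly greater than 80%, as in the text ("> 80%").
        if share > 80:
            return name
    return "mixed"


print(classify_subtype(85, 10, 5))   # papillary
print(classify_subtype(5, 5, 90))    # solid
print(classify_subtype(40, 35, 25))  # mixed
```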
Digital images from the last ultrasound examination before surgery were available for all papillary cancers identified in the screened cohort for the first four screening cycles. After removal of all identifying information, the images were randomized and placed on CD ROM, and case ID, date of examination and surgery, type of equipment, number of nodules described by pathologist, nodule location and maximum dimension, and tumor histology were kept in a separate reference database by an author (VS) who did not participate in the image review.
A subjective ordinal conspicuity scale was developed by POK, with the scale defined as follows: Grade 1, “Almost invisible, could easily be missed.” Grade 2, “Subtle, could miss in some cases.” Grade 3, “Fairly visible, should see in most cases.” Grade 4, “Clearly visible, should rarely be missed.” Grade 5, “Completely obvious, should never be missed.” A reference set of images that contained 6–10 examples from each category (Figure 2) was created and reviewed both independently and in consensus by four expert readers (POK, ES, RJ McC, YN). Following the training session, the readers viewed the study ultrasound images projected on a screen and assigned a subjective visibility grade to each. Readers were blinded to clinical information and tumor histology. Readers graded images both individually and then, in a separate session, by consensus. Evaluation of each image was limited to 30 seconds. In glands containing more than one nodule, each nodule was scored separately. Responses were recorded on a preprinted form. Readers also recorded the type of ultrasound equipment used and the number of images available for each nodule.
Only nodules that fulfilled the ultrasound screening criteria for eligibility were included. A total of 107 histologically categorized, prospectively identified nodules were available for analysis. From this, all nodules other than papillary cancers were excluded (6 follicular carcinomas, 1 medullary carcinoma, 12 noncarcinomas), yielding a total of 88 papillary cancers. Of these, 7 were of the solid subtype, 22 of the papillary, 24 of the follicular, and 35 of the mixed subtype. In four cases (2 papillary, 2 follicular subtype), no nodule could be identified on the ultrasound images available, and these cases were excluded. In one case, one reader failed to record a score. The final dataset for analysis thus consisted of 335 separate individual scores and 84 consensus judgment scores for 84 papillary cancers.
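The case counts above can be reconciled step by step. This is a minimal arithmetic sketch using only the numbers given in the paragraph; the variable names are illustrative.

```python
# Reconcile the nodule and score counts reported in the text.
total_nodules = 107
excluded = {"follicular carcinoma": 6, "medullary carcinoma": 1, "noncarcinoma": 12}
papillary_cancers = total_nodules - sum(excluded.values())

subtypes = {"solid": 7, "papillary": 22, "follicular": 24, "mixed": 35}
# Cases in which no nodule could be identified on the available images.
not_identified = {"papillary": 2, "follicular": 2}
analyzed = {name: n - not_identified.get(name, 0) for name, n in subtypes.items()}

readers = 4
missing_scores = 1  # one reader failed to record a score in one case
individual_scores = sum(analyzed.values()) * readers - missing_scores

print(papillary_cancers)       # 88
print(sum(subtypes.values()))  # 88
print(sum(analyzed.values()))  # 84
print(individual_scores)       # 335
```

By this arithmetic, the analyzed set contained 7 solid, 20 papillary, 22 follicular, and 35 mixed tumors.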
We first examined bivariate relationships between conspicuity and cancer subtype, encapsulation, machine type, and cycle, using Kruskal-Wallis one-way analysis of variance (ANOVA), and between conspicuity and reader, using Wilcoxon matched-pairs signed ranks test. The bivariate relationship between conspicuity and number of images and nodule size was explored with Spearman rank-order correlations. Our main analysis was a repeated measures (repeated readers) ANOVA of subtype on conspicuity, along with any other factors that had a significant bivariate relationship with conspicuity. We also evaluated for possible confounding by gender and age, using Kruskal-Wallis tests for gender and Spearman correlations for age, and added these variables into our final ANOVA. Multiple comparisons between levels of the factors were tested using the Student-Newman-Keuls (SNK) method for each reader and for the consensus judgment. We examined the relationship of encapsulation and cancer subtype with one-way ANOVA. To determine the reliability of the conspicuity judgments, the intraclass correlation (ICC) was calculated, using the method developed by Shrout and Fleiss (REF 8). We did not measure test-retest reliability for several reasons: It would have been logistically difficult; the consensus judgment following individual judgment would quite likely produce a training bias in the test-retest coefficient; and inter-reader reliability seemed a sufficient estimate of true reliability, and likely to be a more conservative one than intra-reader reliability. All analyses were performed using SAS Windows V 9.1.3.
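To make the reliability measure concrete, the sketch below computes two of the single-rater intraclass correlations defined by Shrout and Fleiss from a complete nodules-by-readers score matrix. It is an illustration only: the analyses in this study were run in SAS, the text does not state which ICC form was used, and the function name and the tiny score matrix are invented for the example and are not study data.

```python
def icc_single_rater(scores):
    """Return (ICC(2,1), ICC(3,1)) for a complete targets x raters matrix.

    ICC(2,1): two-way random effects, absolute agreement.
    ICC(3,1): two-way mixed effects, consistency.
    Assumes no missing scores.
    """
    n = len(scores)        # number of rated targets (nodules)
    k = len(scores[0])     # number of raters (readers)
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    bms = ss_rows / (n - 1)               # between-targets mean square
    jms = ss_cols / (k - 1)               # between-raters mean square
    ems = ss_error / ((n - 1) * (k - 1))  # residual mean square

    icc_2_1 = (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)
    icc_3_1 = (bms - ems) / (bms + (k - 1) * ems)
    return icc_2_1, icc_3_1


# Invented example: the second reader scores every nodule one grade higher.
example = [[1, 2], [2, 3], [3, 4]]
agreement, consistency = icc_single_rater(example)
print(round(agreement, 3))    # 0.667
print(round(consistency, 3))  # 1.0
```

In the invented example, the two readers rank the nodules identically but use different "center points" on the scale, so the consistency form equals 1 while the absolute-agreement form is lower. This is the kind of pattern in which readers' mean scores differ even though they agree on relative conspicuity.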
Kruskal-Wallis one-way ANOVA showed that cancer subtype had a significant effect on conspicuity at the p<0.05 level or better both for each reader and for the consensus judgment. Encapsulation was significantly related to conspicuity in one reader (p<0.01), but not in the other three readers or in the consensus judgment. Pairwise comparisons of readers’ conspicuity scores using Wilcoxon signed rank tests were significant for all readers at the p<0.0001 level; that is, different readers tended to score the same nodules differently. Number of images was positively correlated with conspicuity (p<0.01). Machine type, screening cycle, nodule size, age, and gender were not significantly related to conspicuity (data not shown).
A repeated measures analysis of variance (repeated readers) of cancer subtype and number of images was performed on conspicuity. Cancer subtype (p<0.01) was significant and number of images (p<0.052) approached significance. Reader (p<0.0001) was also significant. Encapsulation was not significantly related to conspicuity.
SNK tests of pairwise comparisons of conspicuity score showed that the solid subtype was significantly different (p<0.05) from all other subtypes for two of the readers, and for all except the papillary subtype for the other two. Table 1 shows the mean conspicuity scores; the consensus judgment is presented for simplicity. The overall mean conspicuity score for each reader differed significantly. That is, each reader had a different “center point” in his use of the rating scale. A repeated measures analysis of variance showed that readers differed in conspicuity judgment (p<0.0001). Despite the difference in center point, inter-reader reliability was fair (ICC= 0.50). Figure 3 shows each reader’s mean conspicuity score for each subtype. With only one exception (reader 2, mixed to papillary), scores monotonically increase for each reader from follicular to mixed to papillary to solid. The reader by cancer subtype interaction term of the ANOVA tests whether there is a difference across readers in this pattern of rating the subtypes: It is not significant (p>0.39), indicating that there is no difference in the rating pattern.
The relation between histologic subtype and nodule size as determined at pathology was also evaluated. A one-way ANOVA showed that cancer subtype had a significant effect on nodule size (p<0.05). Follicular and solid subtypes had larger mean sizes than papillary and mixed subtypes. However, none of the pairwise comparisons of type reached significance.
Another one-way ANOVA showed a significant relationship between cancer subtype and encapsulation (p<0.005). SNK tests showed that solid was different (p<0.05) from mixed, but that none of the other multiple comparisons were significant.
Over 95% of the approximately 2000 thyroid cancers detected so far in populations exposed to radiation from Chornobyl are papillary cancers (REF 1). Various histologic subtypes of papillary carcinoma have been described (REF 5,9,10). The less differentiated, solid subtype has been associated with more aggressive behavior, with greater propensity for intrathyroidal, extrathyroidal, and vascular invasion than other more differentiated subtypes. Overall survival from papillary cancer remains high, however, and little or no difference related to subtype has been found to date (REF 3,5,10).
Two studies have found a relation between papillary cancer subtype and latency after exposure to radiation. Williams and colleagues in a study of children exposed at Chornobyl found that less differentiated forms of papillary cancer had a shorter latency (REF 5). Their report does not specify in detail how the tumors were initially detected, though it does comment that many of the early, aggressive cancers were identified clinically, not by screening. In our cohort, which was repeatedly screened by ultrasound, Bogdanova and colleagues found a trend for less differentiated solid subtypes to have a shorter latency (REF 4). Although the trend reported in these studies may reflect true biologic behavior, it could also be influenced by differences in the detectability of different subtypes. The goal of this study was to assess whether there are any differences in the ultrasound conspicuity of the subtypes that could potentially affect detection rates.
We looked at a number of factors that might influence conspicuity, including tumor subtype, degree of encapsulation, reader, nodule size, age, gender, and machine type. Of all these, only subtype and reader were significant.
This study found that the solid histologic subtype of papillary carcinoma is more conspicuous on ultrasound than any other subtype. A natural history of tumor development that places the solid subtype as the earliest to develop may therefore be confounding the easier detectability of that subtype with a shorter latency.
The mean conspicuity scores in our study were significantly different for all four readers, despite the initial training session. The readers tended to agree, however, about the relative conspicuity of individual nodules (Figure 3). This agreement on general trend in conspicuity is probably of more relevance than the disagreements about mean conspicuity: The agreement on which nodules are more or less conspicuous suggests that this is a reproducible perception. The mean conspicuity scores, on the other hand, merely reflect the readers’ opinions about how difficult the nodules are to see: It is uncertain how closely these opinions relate to how well these readers, or others, would perform in practice. That is, it is not at all certain that a reader who gave higher scores would actually detect more nodules in a screening session than a reader who gave lower scores.
Although the solid and follicular subtypes tended to be larger than the papillary and mixed subtypes, we were unable to show that the difference had an effect on conspicuity. We found a difference in the degree of encapsulation among subtypes, but found no relation between encapsulation and lesion conspicuity. As might be expected, nodules that had more images had a slight tendency to score as more conspicuous. However, there were no significant differences among the subtypes in the mean number of images per nodule. Even though papillary carcinomas having a marked solid component are more likely to demonstrate extrathyroidal spread and regional and distant metastases (REF 4), there was no correlation of these clinical characteristics with nodule conspicuity.
Many studies have evaluated various ultrasound characteristics of papillary thyroid cancer (REFS 11–20). Because the goal of this study was to identify potential differences in detectability, it focused on conspicuity rather than a more specific ultrasound descriptor, such as echogenicity: Although any one characteristic might contribute to conspicuity, it might not necessarily be the sole or even the most important contributor. We have found no studies in the peer-reviewed literature that evaluate the conspicuity of thyroid nodules either in relation to benign vs malignant etiology or in relation to papillary carcinoma subtype. A textbook published in Russian by Ephstein in Ukraine described some structural changes related to histologic subtype, but is not widely available (REF 19). Lyshchik and colleagues, in an oral presentation at the 2007 ARRS Annual Meeting, showed data suggesting that more aggressive papillary cancers were more hypoechoic than other subtypes (Lyshchik A, Moses R, Barnes SL, Higashi T, Miga MI, Fleischer AC. Sonographic appearance of thyroid tumors with different appearance and metastatic potential. 2007 ARRS meeting). It is certainly possible that the increased conspicuity found in our study for the solid histologic subtype is a result of more pronounced hypoechogenicity, which one would expect in a tumor with fewer internal interfaces. However, the reason for the greater conspicuity of the solid subtype was not investigated in this study.
The study reported here has some limitations: It includes a relatively small number of cancers (n=84). It is the greater conspicuity of the solid subtype that drives the difference in conspicuity that we found, and there are only 7 of these tumors. Only gray-scale static images were evaluated: No Doppler or elastographic information was captured in the screening process. It is possible that the nodules would have been scored as more conspicuous in a real-time setting than on the static images. On the other hand, the fact that the readers knew in every case that the image contained a nodule might tend to increase the nodule’s apparent conspicuity. In addition, not all those referred for surgery after a suspicious FNA actually agreed to have the surgery. We assume that no bias was introduced as a result, since FNA conclusions contained no information about subtype.
Our findings suggest that it is possible that an ultrasound-based screening process may favor early detection of the solid subtype of papillary carcinoma. The study, however, simply identifies a difference in conspicuity: It does not address the question of whether the difference in conspicuity translates into a difference in detection rates. Because of the potential impact of conspicuity differences on the apparent developmental trajectory of cancer, the issue deserves further study.
This study found that the more aggressive solid subtype of papillary thyroid cancer appears to be more conspicuous on ultrasound than other subtypes. Therefore the shift in subtype over time seen in Chornobyl-exposed populations may be at least in part an effect of differences in nodule conspicuity.
1. Note that although “Chernobyl” is the more widely accepted spelling in the English literature, “Chornobyl” is the preferred form in Ukraine.