The Ukrainian-American Cohort Study of Thyroid Cancer and Other Thyroid Diseases Following the Chornobyl Accident (UkrAm Study) was reviewed and approved by the Institutional Review Boards of the U.S. National Cancer Institute and the Institute of Endocrinology and Metabolism (IEM) of the Academy of Medical Sciences of Ukraine, and all participants signed an informed consent form.
Details of the design of the UkrAm study, and of a parallel study in Belarus, have been published previously (REFS 2). Briefly, the cohort includes subjects who had direct thyroid radioactivity measurements made in May or June 1986, lived in the most heavily contaminated regions of Chernihiv, Zhytomyr, and Kyiv oblasts or Kyiv city, Ukraine, in 1998, and were younger than 18 years of age on April 26, 1986. An oblast is an administrative subdivision similar in size to a state or province.
The source population for the Ukrainian cohort consisted of 75,349 individuals born between April 26, 1968 and April 26, 1986 who were stratified according to preliminary 131I thyroid dose estimates obtained from direct thyroid dosimetry measurements. All those from the highest dose group (1 gray or more) and a random sample of those from lower dose groups were selected, resulting in 32,385 potential study participants. Of the subjects originally selected, 2,466 (8%) were not eligible for a variety of reasons, including military service or incarceration, and 10,307 (32%) could not be located, primarily because of resettlement or migration after the accident. We invited the remaining 19,612, of whom 6,369 (32%) refused to participate or failed to attend the initial screening, leaving a screened cohort of 13,243.
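The recruitment arithmetic above can be checked with a short script. This is only a bookkeeping sketch: the counts are those quoted in the text, while the variable names, the helper `pct`, and the rounding of percentages to whole numbers are assumptions made for illustration.

```python
# Cohort attrition, using the counts quoted in the text.
# Variable names are illustrative and not taken from the study.
selected = 32_385      # potential participants after stratified sampling
not_eligible = 2_466   # e.g. military service or incarceration
not_located = 10_307   # mainly resettlement or migration
declined = 6_369       # refused or did not attend the initial screening

invited = selected - not_eligible - not_located
screened = invited - declined


def pct(part: int, whole: int) -> int:
    """Percentage of `whole`, rounded to the nearest integer."""
    return round(100 * part / whole)


print(f"invited:  {invited:,}")
print(f"screened: {screened:,}")
print(f"not eligible: {pct(not_eligible, selected)}% of selected")
print(f"not located:  {pct(not_located, selected)}% of selected")
print(f"declined:     {pct(declined, invited)}% of invited")
```

Under this reading, the first two percentages are taken relative to the 32,385 selected subjects and the third relative to the 19,612 who were invited, which reproduces the figures quoted in the text.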
The cohort was screened at least biennially either by a mobile team assigned to a regional hospital or at the IEM in Kyiv, Ukraine. Examinations included thyroid palpation and ultrasound examination by a physician specializing in ultrasound (ES, YN) and independent clinical examination and palpation by an endocrinologist. The thyroid was imaged with the subject supine and neck extended. From 1998 to 2002, screening was done using 7.5 MHz probes, either an electronic linear transducer (Hitachi Medical Systems, Tokyo, Japan; GE Logiq a100, General Electric Company, Milwaukee, WI, USA) or a mechanical sector probe with water bag kit (Tosbee SSA 240s with 7.5 MHz SM-708A probes; Toshiba Corporation, Tokyo, Japan). In 2002, this equipment was replaced with a laptop-based mobile system that used a 10 MHz linear probe (Terason Ultrasound, Burlington, MA, USA). Detailed information about the location and characteristics of nodules and regional lymph nodes, and about thyroid size and echostructure, was recorded on a standardized ultrasound form. Early images were recorded on thermal paper or a Camtronics magneto-optical disk (Camtronics Medical Systems, Birmingham, AL, USA) and were later scanned into a central database as part of the patient record. The images obtained with the laptop-based Terason system were stored initially on a hard drive and then transferred directly to the central image database.
Ultrasound-guided fine needle aspiration (FNA) was performed by a physician specializing in ultrasound at the IEM on all palpable and ultrasound-detected nodules that were ≥ 10 mm in greatest dimension and on all nodules 5–10 mm in greatest dimension that had one or more of the following features: hypoechogenicity, microcalcifications, an irregular contour, extension through the thyroid capsule, interval growth, or suspicious adenopathy. In glands containing multiple nodules, up to three were sampled. Suspicious lymph nodes were also biopsied, regardless of the presence of a thyroid abnormality. Patients were referred to surgery if their FNA cytology was interpreted as suspicious for, or diagnostic of, malignancy or follicular neoplasm.
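The size-and-feature rule for selecting nodules for FNA can be written as a small predicate. The sketch below is an illustration, not the study's software: the `Nodule` class, the feature labels, and the function name are hypothetical, and it assumes that a nodule of exactly 10 mm falls under the "≥ 10 mm" branch and that the lower bound of 5 mm is inclusive. It does not model the limit of three sampled nodules per gland or the separate biopsy of suspicious lymph nodes.

```python
from dataclasses import dataclass, field

# Labels paraphrase the feature list in the text; the identifiers are hypothetical.
SUSPICIOUS_FEATURES = frozenset({
    "hypoechoic",
    "microcalcifications",
    "irregular_contour",
    "capsular_extension",
    "interval_growth",
    "suspicious_adenopathy",
})


@dataclass
class Nodule:
    max_dimension_mm: float
    features: frozenset = field(default_factory=frozenset)


def meets_fna_criteria(nodule: Nodule) -> bool:
    """Return True if the nodule qualifies for FNA under the stated rule.

    Assumptions: exactly 10 mm counts as ">= 10 mm", and the 5 mm lower
    bound of the smaller size class is inclusive.
    """
    if nodule.max_dimension_mm >= 10:
        return True
    if 5 <= nodule.max_dimension_mm < 10:
        # Smaller nodules qualify only with at least one suspicious feature.
        return bool(nodule.features & SUSPICIOUS_FEATURES)
    return False


print(meets_fna_criteria(Nodule(12.0)))
print(meets_fna_criteria(Nodule(7.0, frozenset({"microcalcifications"}))))
print(meets_fna_criteria(Nodule(7.0)))
```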
All surgical specimens were examined by two Ukrainian pathologists (TIB and LYZ) and classified as papillary, follicular, or medullary cancer according to the World Health Organization histological system (REF 6); reviewed by one of the US authors (EG); and confirmed by an International Pathology Panel established by the Chernobyl Tissue Bank Project (REF 7).
The papillary thyroid cancers were further separated into subtypes depending on the dominant histological component: papillary, follicular, or solid if > 80% of the tumor had the corresponding structure, and mixed when nearly equal parts of different components were present, according to the system described by Bogdanova and colleagues (REF 4). Note that this classification refers only to the histologic appearance of the cancers and does not correspond to whether a nodule appears solid or cystic on ultrasound. Each cancer was characterized as “completely encapsulated,” “partly encapsulated,” or “not encapsulated.”
Histologic subtypes of papillary carcinoma
Digital images from the last ultrasound examination before surgery were available for all papillary cancers identified in the screened cohort during the first four screening cycles. After removal of all identifying information, the images were randomized and placed on CD-ROM. The case ID, dates of examination and surgery, type of equipment, number of nodules described by the pathologist, nodule location and maximum dimension, and tumor histology were kept in a separate reference database by an author (VS) who did not participate in the image review.
A subjective ordinal conspicuity scale was developed by POK, with the scale defined as follows: Grade 1, “Almost invisible, could easily be missed.” Grade 2, “Subtle, could miss in some cases.” Grade 3, “Fairly visible, should see in most cases.” Grade 4, “Clearly visible, should rarely be missed.” Grade 5, “Completely obvious, should never be missed.” A reference set of images that contained 6–10 examples from each category was created and reviewed both independently and in consensus by four expert readers (POK, ES, RJ McC, YN). Following the training session, the readers viewed the study ultrasound images projected on a screen and assigned a subjective visibility grade to each. Readers were blinded to clinical information and tumor histology. Readers graded images first individually and then, in a separate session, by consensus. Evaluation of each image was limited to 30 seconds. In glands containing more than one nodule, each nodule was scored separately. Responses were recorded on a preprinted form. Readers also recorded the type of ultrasound equipment used and the number of images available for each nodule.
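For data entry or analysis, the five-point scale could be encoded as an ordered enumeration. The following is a minimal sketch under stated assumptions: the descriptor strings are quoted from the scale definition, but the member names, the `Score` record, and the example identifier are hypothetical and do not come from the study.

```python
from dataclasses import dataclass
from enum import IntEnum


class Conspicuity(IntEnum):
    """Five-point ordinal conspicuity scale; member names are hypothetical."""
    ALMOST_INVISIBLE = 1
    SUBTLE = 2
    FAIRLY_VISIBLE = 3
    CLEARLY_VISIBLE = 4
    COMPLETELY_OBVIOUS = 5


# Descriptors quoted from the scale definition in the text.
DESCRIPTORS = {
    Conspicuity.ALMOST_INVISIBLE: "Almost invisible, could easily be missed.",
    Conspicuity.SUBTLE: "Subtle, could miss in some cases.",
    Conspicuity.FAIRLY_VISIBLE: "Fairly visible, should see in most cases.",
    Conspicuity.CLEARLY_VISIBLE: "Clearly visible, should rarely be missed.",
    Conspicuity.COMPLETELY_OBVIOUS: "Completely obvious, should never be missed.",
}


@dataclass(frozen=True)
class Score:
    nodule_id: str       # hypothetical anonymized identifier
    reader: str          # reader initials, or "consensus"
    grade: Conspicuity


score = Score("N001", "consensus", Conspicuity(4))
print(score.grade.name, "-", DESCRIPTORS[score.grade])
```

Using an integer-backed enumeration keeps the grades ordered, so rank-based comparisons remain straightforward, while rejecting values outside 1 to 5 at construction time.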
Reference examples for grading ultrasound conspicuity of thyroid nodules
Only nodules that fulfilled the ultrasound screening criteria for eligibility were included. A total of 107 histologically categorized, prospectively identified nodules were available for analysis. From these, all nodules other than papillary cancers were excluded (6 follicular carcinomas, 1 medullary carcinoma, 12 noncarcinomas), yielding a total of 88 papillary cancers. Of these, 7 were of the solid subtype, 22 of the papillary, 24 of the follicular, and 35 of the mixed subtype. In four cases (2 papillary, 2 follicular subtype), no nodule could be identified on the available ultrasound images, and these cases were excluded. In one case, one reader failed to record a score. The final dataset for analysis thus consisted of 335 separate individual scores and 84 consensus judgment scores for 84 papillary cancers.
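The counts in this paragraph can be reconciled with a few lines of arithmetic. The sketch below uses only the numbers quoted in the text; the variable names are illustrative, and the per-subtype totals after exclusion are derived here rather than stated by the authors.

```python
# Reconcile the nodule and score counts quoted in the text.
categorized = 107
excluded_non_papillary = {
    "follicular carcinoma": 6,
    "medullary carcinoma": 1,
    "noncarcinoma": 12,
}
papillary_total = categorized - sum(excluded_non_papillary.values())

subtypes = {"solid": 7, "papillary": 22, "follicular": 24, "mixed": 35}
not_identified = {"papillary": 2, "follicular": 2}  # no nodule visible on images

# Derived (not stated in the text): cases left per subtype after exclusion.
analyzed_by_subtype = {
    name: count - not_identified.get(name, 0) for name, count in subtypes.items()
}
analyzed = sum(analyzed_by_subtype.values())

readers = 4
missing_scores = 1  # one reader failed to record one score
individual_scores = analyzed * readers - missing_scores

print("papillary cancers:", papillary_total)
print("subtype counts sum to total:", sum(subtypes.values()) == papillary_total)
print("analyzed cancers:", analyzed, analyzed_by_subtype)
print("individual scores:", individual_scores)
```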
We first examined bivariate relationships between conspicuity and cancer subtype, encapsulation, machine type, and cycle, using Kruskal-Wallis one-way analysis of variance (ANOVA), and between conspicuity and reader, using the Wilcoxon matched-pairs signed-rank test. The bivariate relationships of conspicuity with number of images and with nodule size were explored with Spearman rank-order correlations. Our main analysis was a repeated measures (repeated readers) ANOVA of subtype on conspicuity, along with any other factors that had a significant bivariate relationship with conspicuity. We also evaluated possible confounding by gender and age, using Kruskal-Wallis tests for gender and Spearman correlations for age, and added these variables to our final ANOVA. Multiple comparisons between levels of the factors were tested using the Student-Newman-Keuls (SNK) method for each reader and for the consensus judgment. We examined the relationship of encapsulation and cancer subtype with one-way ANOVA. To determine the reliability of the conspicuity judgments, the intraclass correlation (ICC) was calculated, using the method developed by Shrout and Fleiss (REF 8). We did not measure test-retest reliability for several reasons: it would have been logistically difficult; the consensus judgment following individual judgment would quite likely produce a training bias in the test-retest coefficient; and inter-reader reliability seemed a sufficient estimate of true reliability, and likely to be a more conservative one than intra-reader reliability. All analyses were performed using SAS for Windows, version 9.1.3.
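To make the reliability calculation concrete, the sketch below computes one intraclass correlation from a targets-by-raters table of scores. The text does not say which of the Shrout and Fleiss forms was used, so the choice of ICC(2,1) (two-way random effects, single rater, absolute agreement) is an assumption, as are the function name and the illustrative ratings, which are not the study's data. The sketch also assumes a complete table, whereas the study had one missing score. The analyses themselves were run in SAS; this pure-Python version is only meant to show the arithmetic.

```python
def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, single rater, absolute agreement.

    `ratings` is a list of rows, one per target (e.g. nodule), each holding
    one score per rater. Every target must be scored by every rater.
    """
    n = len(ratings)       # number of targets
    k = len(ratings[0])    # number of raters
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Two-way ANOVA sums of squares.
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_targets = k * sum((m - grand) ** 2 for m in row_means)
    ss_raters = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_targets - ss_raters

    bms = ss_targets / (n - 1)             # between-targets mean square
    jms = ss_raters / (k - 1)              # between-raters mean square
    ems = ss_error / ((n - 1) * (k - 1))   # residual mean square

    return (bms - ems) / (bms + (k - 1) * ems + k * (jms - ems) / n)


# Illustrative ratings: 6 targets scored by 4 raters (not the study's data).
example = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
print(round(icc_2_1(example), 2))
```

Because ICC(2,1) penalizes systematic differences between raters, it tends to be lower than consistency-type coefficients computed from the same table, which fits the authors' preference for a conservative reliability estimate.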