Aims: To assess inter/intraobserver variability in the interpretation of a series of digitised images of columnar cell lesions (CCLs) of the breast.
Methods: After a tutorial on breast CCL, 39 images were presented to seven staff pathologists, who were instructed to categorise the lesions as follows: 0, no columnar cell change (CCC) or ductal carcinoma in situ (DCIS); 1, CCC; 2, columnar cell hyperplasia; 3, CCC with architectural atypia; 4, CCC with cytological atypia; 5, DCIS. Concordance with the tutor’s diagnosis and degree of agreement among pathologists for each image were determined. The same set of images was re-presented to the pathologists one week later, their diagnoses collated, and inter/intraobserver reproducibility and level of agreement for individual images analysed.
Results: Diagnostic reproducibility with the tutor ranged from moderate to substantial (κ values, 0.439–0.697) in the first exercise. At repeat evaluation, intraobserver agreement was fair to perfect (κ values, 0.271–0.832), whereas concordance with the tutor varied from fair to substantial (κ values, 0.334–0.669). There was unanimous agreement on more images during the second exercise, mainly because of agreement on the diagnosis of DCIS. The lowest agreement was seen for CCC with cytological atypia.
Conclusions: Interobserver and intraobserver agreement is good for DCIS, but more effort is needed to improve diagnostic consistency in the category of CCC with cytological atypia. Continued awareness and study of these lesions are necessary to enhance recognition and understanding.
Columnar cell lesions (CCLs) of the breast comprise a spectrum of benign to atypical entities that have in common variably dilated terminal duct lobular units lined by columnar epithelial cells with prominent apical cytoplasmic snouts.1,2 They are increasingly being encountered in breast biopsies because their associated microcalcifications are detected on mammographic screening.2
CCLs present a new challenge in breast pathology. Many are benign histologically and biologically, others display cytological atypia, and yet others show both cytological and architectural alterations that place them into the category of ductal carcinoma in situ (DCIS).3,4 For CCL with cytological atypia, a major issue is the reproducible recognition of the degree of atypia on core biopsies that would warrant a further excision biopsy, because this has implications for cost and anxiety in affected women.
A uniform approach to the diagnosis of this relatively new entity is necessary for rational treatment strategies. Here, we assess interobserver and intraobserver reproducibility in the evaluation of breast CCL, which to the best of our knowledge has not previously been reported.
After a didactic tutorial on CCL of the breast (including an explanation of a classification scheme accompanied by images) conducted by pathologist AH, 39 representative digitised images were presented sequentially to a panel of staff pathologists from two institutions who regularly interpret breast surgical specimens. They were instructed to categorise the lesions as follows: 0, no columnar cell change (CCC) or DCIS; 1, CCC; 2, columnar cell hyperplasia (CCH); 3, CCC with architectural atypia; 4, CCC with cytological atypia; 5, DCIS. Each image was displayed for approximately one minute. There was no review option or discussion among participating pathologists during the process. Several repeat images were interspersed among others during the exercise, some were of identical magnification, whereas others were identical histological lesions at different magnifications. A discussion of the images and their diagnoses (given by pathologist AH) followed. Diagnostic agreement between the tutor and the staff pathologists was determined using the κ statistic. The degree of agreement among pathologists for each image was also ascertained using the highest proportion of identical answers for each case, expressed as a percentage of the total number of participating pathologists (including AH). The same set of images was circulated to the same staff pathologists one week later for their individual evaluation; their diagnoses were collated, the interobserver concordance (compared with the diagnoses given by the tutor), intraobserver reproducibility, and the level of agreement for individual images were determined.
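The per-image level of agreement described above (the highest proportion of identical answers, expressed as a percentage of all participating pathologists including the tutor) can be illustrated with a minimal Python sketch. The function name `image_agreement` and the example diagnoses are hypothetical and are not taken from the study data; this is an illustration of the calculation, not the software used in the study.

```python
from collections import Counter


def image_agreement(diagnoses):
    """Summarise agreement for one image.

    `diagnoses` holds one category code (0-5) per reader, tutor included.
    Returns (modal categories, number of readers giving the modal answer,
    that number as a percentage of all readers).
    """
    if not diagnoses:
        raise ValueError("at least one diagnosis is required")
    counts = Counter(diagnoses)
    top = max(counts.values())
    # More than one category can share the highest count, so keep all of them.
    modal = sorted(cat for cat, k in counts.items() if k == top)
    return modal, top, 100.0 * top / len(diagnoses)


# Hypothetical answers from eight readers for a single image
# (1 = CCC, 2 = columnar cell hyperplasia, 4 = CCC with cytological atypia).
example = [1, 1, 4, 4, 4, 1, 2, 1]
modal, top, percent = image_agreement(example)
print(modal, top, percent)  # [1] 4 50.0
```

Returning every tied category, rather than a single one, avoids implying a majority diagnosis where the readers were evenly split.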
SPSS 11.5 statistical software (SPSS Inc, Chicago, Illinois, USA) was used for statistical analysis. Interobserver and intraobserver agreement was analysed using the χ² test and pairwise κ statistic, and κ values were interpreted according to the guidelines of Landis and Koch.5 Briefly, the greater the κ value, the stronger the agreement between the tests (variables). When the κ value ranged from 0.81 to 1.0, 0.61 to 0.8, 0.41 to 0.6, 0.21 to 0.4, or 0 to 0.2, the strength of agreement was perfect, substantial, moderate, fair, or slight, respectively.
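For readers who wish to see how a pairwise κ value and its interpretation arise, the following is a minimal Python sketch of the unweighted Cohen's κ for two readers, with the interpretation bands given above. It is not the SPSS procedure used in the study. The function names and the example ratings are hypothetical, and the label for negative κ values ("poor") is an assumption, because the bands quoted in the text start at 0.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of category codes."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed proportion of images on which the two readers agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each reader's category frequencies.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both readers used one and the same category throughout
    return (observed - expected) / (1.0 - expected)


def strength_of_agreement(kappa):
    """Map a kappa value to the labels used in the text."""
    if kappa > 0.80:
        return "perfect"
    if kappa > 0.60:
        return "substantial"
    if kappa > 0.40:
        return "moderate"
    if kappa > 0.20:
        return "fair"
    if kappa >= 0.0:
        return "slight"
    return "poor"  # assumption: negative values are not covered in the text


# Hypothetical ratings for eight images (categories 0-5), not study data.
tutor = [0, 1, 1, 2, 4, 5, 5, 3]
reader = [0, 1, 2, 2, 1, 5, 5, 3]
k = cohen_kappa(tutor, reader)
print(round(k, 3), strength_of_agreement(k))  # 0.692 substantial
```

The same function can be applied to one reader's first and second readings to obtain an intraobserver κ, because the calculation only requires two sets of ratings on the same images.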
Of the seven staff pathologists who participated in the initial exercise, agreement with the tutor for all 39 images ranged from moderate to substantial (κ values, 0.439–0.708; mean, 0.597; median, 0.608). At the second evaluation, concordance with the tutor varied from fair to substantial (κ values, 0.334–0.699; mean, 0.519; median, 0.479), whereas intraobserver agreement ranged from fair to perfect (κ values, 0.271–0.832; mean, 0.520; median, 0.482). Table 1 shows the individual κ values of the participating pathologists.
When individual images were considered in the initial exercise, the number of pathologists giving identical answers for an image ranged from three to all. In the repeat exercise, two to all pathologists agreed for each image. Table 2 shows the distribution of pathologists giving similar diagnoses for the 39 images in the initial and repeat evaluations. Of note, unanimous agreement among pathologists was reached for more images in the second exercise than in the first (10 v six images), mainly because of agreement in the diagnosis of DCIS (fig 1).
Table 3 shows the agreement of pathologists stratified according to the diagnosis of CCL given by AH. The highest number of common answers for each image did not necessarily mirror the tutor’s diagnosis, although they agreed in most cases. In the first exercise, for four images the majority diagnosis differed from the tutor’s: one case was diagnosed by the tutor as CCC, whereas three pathologists considered the image to show CCC with cytological atypia (fig 2); the tutor’s diagnosis in two cases was CCC with architectural atypia, but four pathologists thought that the images showed CCC (fig 3); in the fourth case, the diagnosis given was CCC with cytological atypia, but participating pathologists felt that the image showed CCH (fig 4). In the second exercise, the number of images in which the majority answer differed from the tutor’s increased to six. Three of these were repeats of those in the first exercise, whereas the other three were: a case of CCH thought by three pathologists to be CCC with architectural atypia; a case of CCC with architectural atypia, regarded as CCH by four pathologists; and a case of CCC with cytological atypia, viewed by four pathologists as CCH.
Table 4 shows the sets of repeated images/lesions and the number of pathologists in each exercise whose answers were individually at variance for the same image/lesion. When images were identical (same magnification), fewer pathologists gave different diagnoses for these repeat images than when the images represented different magnifications of the same lesion. Of note, for images 9, 10, and 12, which showed the same lesion at different magnifications (fig 5), there was greatest individual variation in both exercises. In the first exercise, three pathologists changed their diagnosis from CCC (image 9) to CCC with cytological atypia (images 10 and 12); one pathologist changed his diagnosis from a lesion that was not CCC/DCIS to CCC with cytological atypia; one pathologist called image 9 not CCC/DCIS but images 10 and 12 CCC; another pathologist proffered diagnoses of not CCC/DCIS to CCC with cytological atypia to CCC with architectural atypia; and the last pathologist rendered diagnoses of CCC to CCH to CCC with cytological atypia. Similarly, in the second exercise, most of the pathologists concluded that image 12 (at high magnification) represented CCC with cytological atypia, whereas cytological atypia was not noticed in the images shown at lower magnification.
Reproducibility studies are useful in determining the degree of agreement among pathologists in morphological diagnoses, and in the assessment of the universal applicability of histological criteria in classification schemes. The grading of invasive breast cancer,6–11 the evaluation of different DCIS classifications,12–16 and the assessment of diagnostic agreement between general and expert breast pathologists in core biopsy interpretation17,18 and among community based surgical pathologists19 have been the subject of several published reports. Results from these studies have provided insight into problem issues in diagnostic breast pathology and identified areas that require further refinement in classification schemes, particularly in honing objective histological criteria to allow improved reproducibility and correlation with clinical outcome.
Rosai in 1991 defined borderline epithelial lesions of the breast as “a type of proliferative process placed somewhere between the usual type of hyperplasia and carcinoma in situ, both in terms of morphological features and propensity for the development of invasive carcinoma”, and found an “unacceptably high” degree of interobserver variability among a group of experienced pathologists in the interpretation of such lesions.20 Although most CCLs do not fall into the “borderline” category, those with cytological atypia, also referred to as flat epithelial atypia,21 present similar diagnostic problems, with a suggestion that these may either be precursors of or the earliest morphological manifestation of DCIS. It has also been recommended that the identification of CCL with cytological atypia on core biopsies should lead to an open excision, because a more advanced lesion is seen in about a third of cases.21
Therefore, it is of paramount importance that the spectrum of CCL should be recognised by practising surgical pathologists who interpret breast specimens, specifically in identifying CCL with cytological atypia, so that proper management can be instituted. This is particularly relevant for Singapore, because we have a National Breast Screening Programme (BreastScreen Singapore) and a proportion of core biopsies carried out for mammographic calcifications contain CCL.
In our study, interobserver agreement with the tutor’s diagnoses immediately after a didactic session on CCL varied from moderate to substantial, indicating that an acceptable degree of agreement can be achieved through an educational session incorporating digitised images. We believe that the results from this first assessment provide an unbiased reflection of interobserver variability, because all answers proffered by the participating pathologists were independently concluded, and an equal amount of time was given to the presentation of each image for their analysis. In the second exercise, the interobserver agreement ranged from fair to substantial, with the κ values of four of the pathologists being lower than in the first exercise, confirming that the recollection of histological criteria and diagnostic reproducibility are better when criteria have been just expounded. These results also imply that recurrent educational sessions may be needed to maintain reproducible diagnoses, particularly for the relatively new and emerging entity of CCL. The intraobserver agreement varied from fair to perfect, which can also be explained by the diminishing consistency of application of histological criteria with increasing time since the educational session.
When we assessed the degree of agreement among all pathologists for individual images, there was complete diagnostic agreement in only six of the 39 images during the first exercise, which improved to 10 in the second exercise. This improvement was mainly because of agreement among all pathologists with regard to DCIS images during the second exercise, suggesting that for the extreme end of the spectrum, where the impact on management is greatest, pathologists learn and retain diagnostic criteria more effectively, especially after a post exercise discussion. It could also be that the severe degree of cytological atypia and the characteristic architectural abnormalities of the presented images were more readily appreciated after they had been specifically pointed out during the discussion after the first exercise. Overall, however, agreement among pathologists decreased during the second exercise, with at least seven of the eight pathologists agreeing in 12 and 17 images in the second and first exercises, respectively. In fact, in the second exercise, there were two images in which common answers were given by only two pathologists.
When stratified according to the type of CCL, the lowest numbers of complete agreement for individual images were in the categories of CCC with cytological atypia and CCC, in both exercises. This underscores the fact that cytological atypia is subjective, and the threshold between pure CCC and that with cytological atypia is difficult to delineate. A concerted effort should be made to define what constitutes cytological atypia in CCL in a semiquantitative manner, similar to the way that nuclear grading of DCIS is taught, applied,22,23 and validated.24 This is so that CCL with cytological atypia can be diagnosed more reproducibly and consistently, especially because this lesion has management implications for breast core biopsies. Interestingly, among images/lesions that were repeated during the sequential presentation of all 39 images, the greatest internal variation in pathologists’ diagnoses occurred when the same lesion was presented at different magnifications, with cytological atypia being noticed only at medium to high magnification. This corroborates the opinion expressed in a review by Schnitt that cytological atypia in these lesions becomes evident only at high magnification,21 so that all CCLs should not just be accorded a cursory view, but should be subjected to a high magnification examination.
The limitations of our study include the fact that digitised images were used instead of histological sections on glass slides, which represent the real practical situation. However, it can be argued that for teaching, focusing on a specific image may be advantageous to the learning process. Even in past studies using circulated glass slides, an area of interest is circled or marked on the slide for participating pathologists.12,13,20 Other potentially contentious issues that may have affected our findings are: the use of the same series of images for the second evaluation (such that the inter/intraobserver reproducibility results from this second exercise may be that of pure memory rather than true assimilation of learnt criteria); no time limitations imposed on the second evaluation (with possible review option by pathologists, thereby allowing comparison of and realisation that there were repeated images/lesions that were interspersed within the series of images); an unequal number of images from each category; and differing levels of interest in breast pathology and motivation among participating pathologists.
Nevertheless, our study shows that moderate to substantial interobserver reproducibility for breast CCL can be achieved after a tutorial, although it is recommended that follow up educational sessions should be conducted to maintain satisfactory consistency in the evaluation of these lesions. At the DCIS end of the spectrum, a diagnostic reproducibility of 100% can be achieved. The category that requires more attention in terms of refining diagnostic criteria is that of CCL with cytological atypia, where more objective guidelines as to what constitutes cytological atypia should be reached. Closer scrutiny of all CCLs at medium to high magnification is necessary to recognise any accompanying cytological atypia that may have implications for management. This difficulty is similar to that seen in studies of reproducibility in the diagnosis of atypical ductal hyperplasia which, unfortunately, has remained refractory to efforts to improve diagnostic consistency.25 Indeed, the most recent (as yet unpublished) version of the UK breast screening pathology guidelines recommends that CCL with atypia should be categorised as equivalent to atypical ductal hyperplasia, that is, B3.
In conclusion, further reproducibility and follow up studies will assist in arriving at a clinically relevant consensus classification for CCL of the breast. Nuclear morphometry can help define thresholds of nuclear atypia, and molecular studies on microdissected lesions may provide important biological clues beyond morphology. From a practical standpoint, multiple step sections and consultation with a breast pathologist may be prudent when there is uncertainty about the presence or otherwise of cytological atypia in CCL, particularly in core biopsies.
We thank the staff pathologists who participated in both exercises.