Most screening mammography in the United States differs from that in countries with formal screening programs by having a shorter screening interval and interpretation by a single reader vs independent double reading. We examined how these differences affect early detection of breast cancer by comparing performance measures and histopathologic outcomes in women undergoing opportunistic screening in Vermont and organized screening in Norway.
We evaluated recall, screen detection, and interval cancer rates and prognostic tumor characteristics for women aged 50–69 years who underwent screening mammography in Vermont (n = 45050) and in Norway (n = 194430) from 1997 through 2003. Rates were directly adjusted for age by weighting the rates within 5-year age intervals to reflect the age distribution in the combined data and were compared using two-sided Z tests.
The age-adjusted recall rate was 9.8% in Vermont and 2.7% in Norway (P < .001). The age-adjusted screen detection rate per 1000 woman-years after 2 years of follow-up was 2.77 in Vermont and 2.57 in Norway (P = .12), whereas the interval cancer rate per 1000 woman-years was 1.24 and 0.86, respectively (P < .001). Larger proportions of invasive interval cancers in Vermont than in Norway were 15 mm or smaller (55.9% vs 38.2%, P < .001) and had no lymph node involvement (67.5% vs 57.0%, P = .01). The prognostic characteristics of all invasive cancers (screen-detected and interval cancer) were similar in Vermont and Norway.
Screening mammography detected cancer at about the same rate and at the same prognostic stage in Norway and Vermont, with a statistically significantly lower recall rate in Norway. The interval cancer rate was higher in Vermont than in Norway, but tumors that were diagnosed in the Vermont women tended to be at an earlier stage than those diagnosed in the Norwegian women.
Opportunistic screening mammography in Vermont differs from organized screening mammography in Norway in several respects, including the screening interval, which is longer in Norway than in Vermont. These differences make it challenging to compare parameters that are associated with the provision and quality of the screening process (ie, performance measures).
An evaluation of recall, screen detection, and interval cancer rates and prognostic tumor characteristics for women aged 50–69 years who underwent screening mammography in Vermont (n = 45050) and in Norway (n = 194430) from 1997 through 2003.
Screening mammography detected cancer at about the same rate and at the same prognostic stage in Norway and Vermont, but with a lower recall rate in Norway. The interval cancer rate was higher in Vermont than in Norway, but the Vermont women with interval cancers were diagnosed with earlier-stage tumors than the Norwegian women.
Despite its longer screening interval, the organized population-based screening program in Norway achieved outcomes similar to those of the opportunistic screening in Vermont. Norwegian women were exposed to half as many screening mammograms and fewer recall examinations than the Vermont women, yet the tumor characteristics for all invasive cancers diagnosed were not different between the two screened populations.
Subtle differences in the Vermont and Norwegian data definitions and collection procedures may have influenced the findings. The effects of differences in screening interval could not be distinguished from those of other potential differences in mammography performance between Vermont and Norway.
From the Editors
Systems for the delivery of screening mammography vary among countries, and these differences can influence the effectiveness of screening. For example, women in the United States usually begin undergoing screening in response to a recommendation from their primary care provider, and they are generally able to choose the mammography facility that they would like to use (1,2). This approach is often referred to as “opportunistic screening.” By contrast, most European countries, including Norway, have organized population-based screening programs in which women within a specified age range regularly receive a personal letter that invites them to undergo mammographic screening (3,4). Opportunistic and organized screening may also differ in the screening interval, age of the target population, and health service algorithms that are used to manage mammographic abnormalities (5–11). These variations make it challenging to compare parameters that are associated with the provision and quality of the screening process (ie, performance measures). It is also difficult to compare outcome measures that may be associated with a reduction in mortality.
Comparing performance measures and tumor characteristics among countries that have different approaches to the screening process and relating the similarities and differences to specific aspects of the screening can provide valuable insights about the impact of screening mammography on the early detection of cancer. In particular, such comparisons can identify areas that can be targeted to improve the quality of both opportunistic and organized screening and the health outcomes of women who undergo screening. However, a limited number of studies have examined screening mammography in different health care systems (7–11), and only Smith-Bindman et al. (7,8) examined both performance and outcome measures. They used data from the Breast Cancer Surveillance Consortium and the National Breast and Cervical Cancer Early Detection Program in the United States, and the National Health Service Breast Cancer Screening Program in the United Kingdom to compare performance measures, and they also examined the characteristics of screen-detected tumors in the two countries. The United States had a higher recall rate than the United Kingdom, but the two countries had similar screen detection rates per 1000 women.
The aim of our study was to identify the strengths and weaknesses of opportunistic and organized screening mammography in terms of their impact on the early detection of breast cancer. To do so, we compared recall rates, rates of screen-detected and interval cancers, and prognostic tumor characteristics of women aged 50–69 years who underwent opportunistic screening mammography in Vermont from 1997 through 2003 with those of comparably aged women who participated in an organized population-based screening program in Norway during the same time.
The data sources for this study were the Vermont Breast Cancer Surveillance System (VBCSS) and the Norwegian Breast Cancer Screening Program (NBCSP). Both the VBCSS and the NBCSP collect data on radiological performance (ie, recall rates, mammogram interpretation, and radiological features), cancer detection, and histopathologic characteristics of tumors.
The VBCSS is a passive surveillance system that captures community mammography as it routinely occurs; it is funded by the US National Cancer Institute as part of the Breast Cancer Surveillance Consortium (12). Since 1994, the VBCSS has collected patient risk factors and breast imaging and breast pathology data for all breast imaging and breast biopsies performed in Vermont (13). The data are linked to the breast cancer cases reported to the Vermont Cancer Registry and the New Hampshire Tumor Registry for nearly complete cancer follow-up. All mammography and pathology facilities in Vermont send standardized data on paper forms or electronic files to the VBCSS. Data collection forms and variables collected are available at http://www.uvm.edu/~vbcss/. The Institutional Review Board for the Protection of Human Subjects at the University of Vermont approved our use of the VBCSS data. Approximately 5% of the women in the VBCSS did not provide permission to allow the use of their data for research.
The NBCSP is a governmentally organized screening program run by the Cancer Registry of Norway (3). Data collection and quality assurance are integrated parts of the administration of the program. This study is considered a part of the evaluation and scientific activities of the NBCSP and is thus covered by the general ethical approval of the Cancer Registry of Norway (14).
Most women who live in Vermont undergo regular screening mammography every 1 or 2 years beginning at age 40 years, as recommended by the US Preventive Services Task Force (2), and choose when and where the screening will take place (Table 1). All mammography facilities in Vermont are accredited by the US Food and Drug Administration, and they operate under the rules and regulations of the Mammography Quality Standards Act (15). Bilateral two-view mammography is performed routinely at dedicated screening units. Mammograms are assessed according to the American College of Radiology Breast Imaging Reporting and Data System categories (16). Most mammograms performed in Vermont were read by one radiologist; however, some of the data come from facilities that double read mammograms or used computer-aided detection, but we were unable to discern the details for individual mammograms. Although some facilities do the screening and any necessary additional imaging on the same day, most contact women who need additional imaging and ask them to return on a later date.
The NBCSP started at the end of 1995 as a 4-year pilot project in 4 of the 19 counties in Norway and gradually expanded to become nationwide in 2004 (3). Today, the target population consists of approximately 520000 women aged 50–69 years who are invited every 2 years to undergo bilateral two-view mammography (Table 1). Each woman is assigned a prescheduled time and place for the screening examination. Two independent radiologists read the screening mammograms according to a five-point interpretation scale that reflects the probability of cancer (3,17). The final decision about whether or not a woman should be recalled for further imaging is made in a consensus or arbitration meeting. The recall examination takes place 5–15 working days after the screening examination. A recall examination includes all procedures that are deemed necessary for a definitive assessment, including additional imaging, ultrasound, magnetic resonance imaging (MRI), and, if recommended, a needle biopsy. These procedures are usually performed on the same day in one setting. Cancers are reported to the Cancer Registry of Norway. Table 1 contrasts the main features of screening mammography in Vermont and Norway.
This study included women with no history of breast cancer who were aged 50–69 years at some time during the period from 1997 through 2003. Women's first screening examinations (ie, the prevalent screens) often detect more and larger breast cancers than subsequent screening examinations, which only have the opportunity to detect cancers that were not visible on the previous screen. Therefore, prevalent and subsequent screens vary substantially with respect to their performance measures and goals (3,4,18), and previous studies that have compared screening outcomes between countries have excluded prevalent screens (7–9,11). Because most women receive their prevalent screen before age 50 years in Vermont and at age 50 years or older in Norway, the Vermont women aged 50–69 years had fewer prevalent screens than the Norwegian women aged 50–69 years. For these reasons, only data from subsequent screens in Vermont and Norway were used in this study. Subsequent screens include all screening examinations following the first mammogram a woman received after age 49 years that were registered in the screening database in either of the two regions. The subsequent screens were classified by the time since the previous screen as 1 year (range = 10–19 months since the previous screen), 2 years (range = 20–27 months since the previous screen), or more than 2 years (more than 27 months since the previous screen). There were no 1-year screens for Norwegian women because the NBCSP invites women to attend screening every 2 years. If a woman does not attend a biennial screen, she does not receive another invitation for 2 years, so the screening interval for her next screen will be at least 4 years.
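The interval classification described above can be illustrated with a short sketch. This is not the study's code: the function name is hypothetical, and the sketch assumes that the time since the previous screen is recorded in whole months, so that the stated ranges (10–19, 20–27, and more than 27 months) leave no gaps. Screens less than 10 months after the previous one fall outside the stated ranges and are returned as unclassified.

```python
from typing import Optional


def classify_screening_interval(months_since_previous: int) -> Optional[str]:
    """Map whole months since the previous screen to an interval category.

    Hypothetical helper; the cut points are the ranges stated in the text.
    """
    if 10 <= months_since_previous <= 19:
        return "1 year"
    if 20 <= months_since_previous <= 27:
        return "2 years"
    if months_since_previous > 27:
        return ">2 years"
    # Fewer than 10 months: not covered by the ranges given in the text.
    return None
```

Under this rule, a Norwegian woman who misses one biennial invitation and returns about 48 months after her previous screen is classified in the more-than-2-years category.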
We carefully reviewed all data definitions for this study and transformed variables to maximize comparability between the two countries. In Vermont, a screening mammogram was defined as a bilateral mammogram that was performed on an asymptomatic woman who had no abnormalities previously noted by a clinician. In Norway, all mammograms performed as a consequence of regular invitations from the NBCSP were defined as screening mammograms. Recall for further imaging included additional mammographic views, ultrasound, and MRI, and the final assessment was made after all imaging was completed. The recall rate was defined as the number of screens requiring a recall to complete the final assessment divided by the total number of screens within a given time period and is expressed per 100 examinations. The screen detection rate was defined as the number of breast cancers diagnosed within a specific time after a positive final assessment divided by the total number of screens. The interval cancer rate was estimated as the number of breast cancers diagnosed within a specified time after a negative assessment divided by the total number of screens. The screen detection and interval cancer rates are expressed per 1000 screens. Because Vermont and Norway had different screening intervals, we provide results for both 1 year and 2 years of follow-up. The follow-up time for computing 2-year screen detection and interval cancer rates in Vermont was censored at the time of a woman's next screening mammogram if it occurred less than 2 years after the previous screen.
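As a rough illustration of the per-screen definitions above, the following sketch computes crude (not age-adjusted) rates. The function names are hypothetical. The example counts are the numbers of screens and of screen-detected cancers with 2 years of follow-up reported in the Results; because no age adjustment is applied, the crude values are expected to differ slightly from the age-adjusted rates in the tables.

```python
def recall_rate_per_100(recalled_screens: int, total_screens: int) -> float:
    """Screens requiring recall divided by all screens, per 100 examinations."""
    return 100.0 * recalled_screens / total_screens


def cancer_rate_per_1000_screens(cancers: int, total_screens: int) -> float:
    """Cancers in the follow-up window divided by all screens, per 1000 screens.

    The same formula serves for screen detection (cancers after a positive
    final assessment) and for interval cancer (cancers after a negative one).
    """
    return 1000.0 * cancers / total_screens


# Crude screen detection rates, 2 years of follow-up (counts from the Results).
vermont_crude = cancer_rate_per_1000_screens(569, 141284)
norway_crude = cancer_rate_per_1000_screens(1848, 360872)
print(f"Vermont: {vermont_crude:.2f}, Norway: {norway_crude:.2f} per 1000 screens")
```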
We also expressed the screen detection and interval cancer rates per 1000 woman-years of follow-up. The number of woman-years of follow-up was estimated by computing the time between the first and last subsequent screen in the study period plus 2 years of follow-up or the time to cancer detection if it occurred within 2 years of the last screen for each woman and summing these values for all women in the study. No information was available on loss to follow-up due to a change of residence or death during the 2 years after a woman's last screen, so loss to follow-up was not taken into account in computing the number of woman-years. Rates based on the number of woman-years are estimates of outcomes for the women being screened, whereas rates based on the number of screens reflect the outcomes of the screening examinations. Rates per 1000 screens and rates per 1000 woman-years differed more for Norway than for Vermont because most women were screened annually in Vermont and biennially in Norway.
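The woman-years estimate described above can be sketched for a single woman as follows. This is an illustrative reading of the definition, not the study's code: the function name, the 365.25-day year, and the handling of dates are assumptions, and the sketch does not truncate follow-up at the end of the study period. Summing the result over all women gives the denominator for rates per 1000 woman-years.

```python
from datetime import date
from typing import Optional, Sequence

DAYS_PER_YEAR = 365.25    # assumed conversion from days to years
FOLLOW_UP_YEARS = 2.0     # follow-up after the last screen, as in the text


def woman_years(screen_dates: Sequence[date],
                cancer_date: Optional[date] = None) -> float:
    """Follow-up contributed by one woman (hypothetical helper).

    Time from first to last subsequent screen, plus 2 years, or plus the time
    to cancer detection if that occurred within 2 years of the last screen.
    """
    first, last = min(screen_dates), max(screen_dates)
    between_screens = (last - first).days / DAYS_PER_YEAR
    tail = FOLLOW_UP_YEARS
    if cancer_date is not None:
        years_to_cancer = (cancer_date - last).days / DAYS_PER_YEAR
        if 0 <= years_to_cancer < FOLLOW_UP_YEARS:
            tail = years_to_cancer
    return between_screens + tail
```

For example, a woman screened on 1 January 2000 and 1 January 2002 with no cancer contributes about 4 woman-years, whereas the same woman diagnosed 6 months after her last screen contributes about 2.5.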
All rates were directly adjusted for age by using 5-year age intervals and adjusting to the age distribution for the combined Vermont and Norway data. For rates per 1000 screens, the woman's age at the time of the mammogram was used for the age classification. Woman-years were allocated to the appropriate age categories as women aged during follow-up to obtain the adjusted rates per 1000 woman-years. Differences in age-adjusted rates were assessed by a Z test using the variances of the weighted proportions for each country. Logistic regression was used to assess relationships between screen detection rates and screening intervals after adjustment for age and country and to assess differences in screen-detection rates between Vermont and Norway after adjustment for screening interval. To examine the risk of interval cancer over follow-up time, we computed the hazard functions for Vermont and Norway using the actuarial life-table method (19) and assessed the statistical significance of their difference by the Mantel–Cox log-rank test. Tumor characteristics are presented as the average sizes of invasive lesions in millimeters and as distributions of histological types, size categories, and lymph node involvement. Differences in average tumor size were assessed by a t test, and differences in proportions were assessed using a χ² test. For polychotomous variables, proportions were compared only if the overall χ² test indicated a difference in the distribution between countries, and the degrees of freedom from the overall χ² were used to adjust for multiple testing. All statistical tests were two-sided. A P value less than or equal to .05 was considered statistically significant. The analyses were conducted using SPSS (Version 12.0.1 for Windows, SPSS Inc., Chicago, IL), R Statistical Computing (Version 2.0.1), and SAS (Version 8, SAS Institute, Cary, NC) software.
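The direct age adjustment and the two-sided Z test described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the analysis code used in the study: the function names are hypothetical, the weights are taken to be the combined Vermont and Norway denominators in each 5-year age group, and a binomial variance is assumed for each stratum-specific proportion (the text specifies only that the variances of the weighted proportions were used).

```python
import math
from typing import Sequence, Tuple


def direct_adjusted_rate(events: Sequence[int],
                         denominators: Sequence[float],
                         standard_weights: Sequence[float]) -> Tuple[float, float]:
    """Directly standardized proportion and its variance (hypothetical helper).

    Each argument holds one entry per age group (e.g., 50-54, ..., 65-69).
    Assumes a binomial variance p(1 - p)/n within each age group.
    """
    total_weight = sum(standard_weights)
    rate = 0.0
    variance = 0.0
    for d, n, w in zip(events, denominators, standard_weights):
        share = w / total_weight      # age group's share of the standard
        p = d / n                     # age-specific proportion
        rate += share * p
        variance += share ** 2 * p * (1.0 - p) / n
    return rate, variance


def two_sided_z_test(rate_a: float, var_a: float,
                     rate_b: float, var_b: float) -> Tuple[float, float]:
    """Z statistic and two-sided P value for two independent adjusted rates."""
    z = (rate_a - rate_b) / math.sqrt(var_a + var_b)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return z, p_value
```

In use, the standard weights would be the element-wise sum of the two regions' age-specific denominators, and the resulting proportions would be multiplied by 100 or 1000 to match the rates reported in the tables.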
In Vermont, 45050 women contributed 141284 subsequent screening examinations, of which 130978 (93%) had a screening interval of 1–2 years and 10306 (7%) had an interval of more than 2 years. The average number of screens for these women was 3.1. A total of 116996 screens (83%) were performed at a 1-year interval. On the basis of 2000 US Census data, we estimated that 81% of Vermont women aged 50–69 years had a screening mammogram during the study period.
In Norway, 194430 women contributed 360872 subsequent screening examinations, of which 350202 (97%) had a screening interval of 2 years and 10670 (3%) an interval of more than 2 years. The average number of screens for these women was 1.9. The coverage of screening mammography performed in the NBCSP during the study period was 83%, based on data from the Norwegian Population Registry and the NBCSP database.
Approximately 95% of the women in the Vermont and Norway study populations were white. The Norwegian women were older than the Vermont women (Table 2), reflecting the later age at prevalent screens for women who were older than 50 years at the time the NBCSP was initiated (percentage of women in the study populations aged 50–54 years old in Vermont vs Norway: 52.8% vs 29.6%). The Vermont women reported a higher educational level compared with the Norwegian women (58.2% vs 21.8%, respectively, at college or university level). A larger proportion of Vermont women than Norwegian women reported menarche at age 13 years or younger (76.6% vs 49.3%). Greater proportions of Vermont women than Norwegian women were nulliparous (12.6% vs 9.3%) or younger than 20 years at first birth (20.2% vs 12.1%). The proportions of women in Vermont and Norway who had ever used hormonal therapy were 46.8% and 45.2%, respectively.
The age-adjusted recall rates for all screening mammograms during the study period were 9.8% in Vermont and 2.7% in Norway (P < .001; Table 3). Logistic regression indicated that the odds of being recalled increased with increasing screening interval (1, 2, and >2 years for Vermont; 2 and >2 years for Norway) independent of country (P < .001 for the linear effect of screening interval when country is in the multivariable model [data not shown]). The age-adjusted screen detection rates for a 1-year follow-up of all screens were 4.01 per 1000 screens in Vermont and 5.08 per 1000 screens in Norway (P < .001). This difference was due to the lower screen detection rate for invasive cancer in Vermont (3.05 per 1000 screens) than in Norway (4.17 per 1000 screens) (P < .001). Logistic regression indicated that the odds of a screen-detected cancer (invasive, ductal carcinoma in situ [DCIS], and total) increased with increasing screening interval, independent of country (P < .001 for the linear effect of screening interval when country is in the multivariable model [data not shown]), and that the odds of invasive cancer and total cancers did not differ statistically significantly between Vermont and Norway after adjustment for screening interval. However, the odds of DCIS were statistically significantly lower in Norway after adjustment for screening interval (P = .002). There were no statistically significant differences in tumor size or lymph node involvement between Vermont and Norway among the different screening intervals for each country (Table 3).
Table 4 shows the age-adjusted screen detection rates that were obtained when 2 years of follow-up was used to identify cancers diagnosed after a positive screen. The number of screen-detected cancers in Vermont increased from 556 (Table 3) with 1 year of follow-up to 569 (Table 4) with 2 years of follow-up; in Norway, they increased from 1841 to 1848. When expressed per 1000 screens, the screen detection rates for 2 years of follow-up were similar to those based on 1 year of follow-up (4.10 vs 4.01 in Vermont and 5.10 vs 5.08 in Norway), which was expected because most screen-detected cancers were diagnosed shortly after screening. Screen detection rates were lower in Vermont than in Norway for invasive cancers (3.12 vs 4.17 per 1000 screens, P < .001) and for total cancers (4.10 vs 5.10 per 1000 screens, P < .001). However, when the screen detection rate was calculated based on woman-years of follow-up rather than the number of screens, the rates were similar in Vermont and Norway for invasive cancer (2.11 vs 2.10 per 1000 woman-years; P = .97) and total cancers (2.77 vs 2.57 per 1000 woman-years; P = .12), whereas the rate for DCIS was statistically significantly higher in Vermont than in Norway (0.66 vs 0.46 per 1000 woman-years; P < .001).
Table 4 also shows the age-adjusted interval cancer rate for 2 years of follow-up. The rate of interval DCIS per 1000 screens was higher in Vermont than in Norway (0.2 vs 0.11, P = .008), but the rates for interval invasive cancer and total interval cancers did not differ statistically significantly between Vermont and Norway. However, when calculated on a woman-year basis, all interval cancer rates were statistically significantly higher in Vermont than in Norway (P < .001 for all). These findings are influenced not only by the probability of interval cancer at varying time points following a negative screen but also by the number of women at risk over the follow-up time, both of which differ between Vermont and Norway because of the differing screening intervals. The cumulative probability of an interval cancer was statistically significantly higher in Vermont than in Norway over the entire 2-year follow-up (P < .001) (Figure 1). In Norway, few interval cancers were detected during the first 3 months after screening, and from months 9 to 24 after screening the probability of an interval cancer remained constant, as indicated by the straight line for the cumulative probability. In Vermont, the probability of an interval cancer increased during the month after screening and again at about 6–8 months after screening, probably because of recommendations for short-term follow-up mammography, which usually occurs approximately 6 months after screening, and then was fairly constant over the remainder of the 2-year follow-up. Figure 1 also shows the proportion of women who remained at risk for an interval cancer over time. In Vermont, this proportion dropped steeply at 1 year after screening because the majority of women returned for their annual screen. In Norway, few women returned for biennial screens before 22 months.
Thus, although the risk of interval cancer was higher in Vermont than in Norway regardless of follow-up time, the majority of interval cancers in Vermont (67%) were detected within 1 year of screening because fewer women remained at risk after this time. By contrast, the majority of interval cancers in Norway (69%) were detected 12–24 months after screening because the probability of cancer was higher during this period than earlier after screening and a large number of women remained at risk. These findings suggest that the difference in interval cancer rates between the two countries might be larger if women were screened more frequently in Norway or less frequently in Vermont.
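The interplay between the probability of interval cancer and the number of women remaining at risk can be illustrated with a generic actuarial life-table calculation. This is a sketch of the standard textbook method, not the study's implementation: the function name and inputs are hypothetical, and women who return for their next screen during a period are treated as withdrawals who were at risk for half of that period.

```python
from typing import List, Sequence


def actuarial_cumulative_probability(at_risk_start: float,
                                     events: Sequence[int],
                                     withdrawals: Sequence[int]) -> List[float]:
    """Cumulative probability of an event by the end of each period.

    events[i] and withdrawals[i] are the counts of interval cancers and of
    censored women (e.g., those rescreened) in follow-up period i.
    """
    cumulative = []
    event_free = 1.0
    n = at_risk_start
    for d, c in zip(events, withdrawals):
        effective_at_risk = n - c / 2.0          # actuarial adjustment
        q = d / effective_at_risk if effective_at_risk > 0 else 0.0
        event_free *= 1.0 - q
        cumulative.append(1.0 - event_free)
        n -= d + c                               # carry survivors forward
    return cumulative
```

With many withdrawals at 12 months, as in Vermont, later periods have small denominators, so relatively few interval cancers accrue after the first year even when the conditional probability per period is unchanged.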
We also examined the overall detection rates (ie, the screen detection and interval cancer rates combined). The age-adjusted overall detection rates per 1000 screens were statistically significantly higher in Norway than in Vermont for invasive and total cancers (invasive cancers: 5.80 vs 4.77, P < .001; total cancers: 6.82 vs 5.96, P < .001), but the rates for DCIS did not differ statistically significantly between the two countries (Table 4). However, when expressed on a woman-year basis, the overall detection rates for invasive, DCIS, and total cancers were all lower in Norway than in Vermont (invasive cancers: 2.9 vs 3.2 per 1000 woman-years, P = .026; DCIS: 0.5 vs 0.8 per 1000 woman-years, P < .001; total cancers: 3.4 vs 4.0 per 1000 woman-years, P < .001).
Table 5 summarizes the histological type, tumor size, and lymph node involvement for all invasive cancers diagnosed in Vermont and Norway during the 2-year follow-up. The distribution of histological types of screen-detected invasive cancers differed between Vermont and Norway (P < .001), with Vermont having a statistically significantly higher proportion of invasive ductal carcinomas than Norway (92.1% vs 85.0%; P < .001). There were no differences between Vermont and Norway in tumor size or in lymph node involvement for screen-detected cancers. For interval cancers, the distribution of histological type did not differ between Vermont and Norway. However, interval cancers in Vermont were statistically significantly smaller than those in Norway (mean tumor size for Norway vs Vermont: 22.0 vs 19.1 mm, P = .019) and were more likely to be 15 mm or smaller (55.9% vs 38.2%, P < .001). In addition, a statistically significantly higher proportion of the interval cancers in Vermont than in Norway had no lymph node involvement (67.5% vs 57.0%, P = .01).
There were statistically significant differences in the distribution of histological types between screen-detected and interval cancers both in Vermont (P < .001) and in Norway (P = .005); fewer interval cancers than screen-detected cancers were invasive ductal carcinomas (Vermont: 81.0% vs 92.1%, P < .001; Norway: 79.6% vs 85.0%, P = .01), whereas more interval cancers than screen-detected cancers were invasive lobular carcinomas (Vermont: 16.0% vs 6.0%, P < .001; Norway: 14.6% vs 10.2%, P < .001). In both Vermont and Norway, invasive cancers that were screen detected had more favorable prognostic characteristics than invasive interval cancers. For example, screen-detected invasive cancers were smaller than invasive interval cancers in both Vermont (mean tumor size: 14.0 vs 19.1 mm, P < .001) and Norway (mean tumor size: 14.2 vs 22.0 mm, P < .001). In addition, more screen-detected invasive cancers than invasive interval cancers had no lymph node involvement in both Vermont (77.5% vs 67.5%, P = .011) and Norway (74.9% vs 57.0%, P < .001).
For all invasive cancers (both screen detected and interval), there were no statistically significant differences between Vermont and Norway in either tumor size or lymph node involvement. There was, however, a statistically significant difference in the distribution of all histological types (P = .003): a lower proportion of the tumors in Norway than in Vermont were classified as invasive ductal carcinoma (83.5% vs 88.2%, P = .01).
Our study revealed a number of important differences and similarities in the performance and effectiveness of screening mammography in Vermont and Norway. Opportunistic screening in Vermont was associated with a considerably higher recall rate and a lower screen detection rate compared with the organized screening program in Norway. Analyses that were based on woman-years of follow-up revealed statistically significantly higher interval cancer rates and overall detection rates (interval and screen detection) in Vermont than in Norway for invasive, DCIS, and total cancer. However, tumor size and lymph node involvement characteristics were more favorable for invasive interval cancers in Vermont than for those in Norway. Despite these differences, Vermont and Norway were similar with respect to the prognostic characteristics of all invasive cancers (screen-detected and interval cancer) that were diagnosed in women who had undergone screening during the study period.
The recall rate in Vermont was nearly four times that of Norway, regardless of the screening interval that was examined. The higher recall rate in Vermont could have been driven by radiologists' concerns about malpractice lawsuits (8). However, Elmore et al. (20) found no association between medical malpractice experience and concerns and recall rates among radiologists in three regions of the United States (Washington, Colorado, and New Hampshire). The lower recall rates in Norway and other European countries may instead reflect a standard for recall that is set by the screening programs and regular monitoring of the recall rate to assure compliance (4). Contrary to expectation, the lower recall rate in Norway was not associated with either a lower screen detection rate or a higher interval cancer rate than that of Vermont.
Screening performance measures are influenced by several factors. The screen detection and interval cancer rates are interrelated; both depend on the frequency of screening, the cancer incidence, and the accuracy of the mammographic assessment. The lower rate of screen-detected cancers per 1000 screens in Vermont than in Norway was expected because the majority of the Vermont women were screened twice as often as the Norwegian women and because longer intervals between examinations generally increase screen detection by improving sensitivity (18). Results of our logistic regression analysis that controlled for screening interval suggest that differences in the screen detection rates between Vermont and Norway were due primarily to the difference in screening interval. This logistic regression analysis assumed that the relationship between screening interval and detection rate was the same in Vermont and Norway. Although this assumption appears to be valid based on the results for the 2-year and greater than 2-year screening intervals, we were unable to test it because there were no 1-year screens in Norway.
Other factors may have contributed to the differences in recall and screen detection rates between Vermont and Norway that we observed. For example, all screening mammograms in Norway are independently double read, and discrepancies are decided by arbitration. By contrast, during the study period, the majority of mammograms in Vermont were single read, although some were augmented by computer-aided detection and double reading either with or without arbitration. Independent double reading has previously been reported to be associated with a higher screen-detected cancer rate and, depending on the recall policy, a lower recall rate (21–24). Despite the evidence in favor of double reading, health insurance companies in Vermont and Medicare do not reimburse for double reading but do reimburse for computer-aided detection, even though the evidence that computer-aided detection is effective in improving mammography accuracy is equivocal (25,26). A second factor that could have affected screening accuracy in our study was the experience of the radiologists interpreting mammograms (27–29). All radiologists in the NBCSP are mammography specialists and are required to read at least 5000 screening mammograms each year (4,17). By contrast, most radiologists in Vermont are generalists who read all types of radiological images, and very few read as many as 5000 screening mammograms per year (15,21). At this time, there is no consensus about how radiologist volume, experience, and training influence the accuracy of screening mammography (26–28). Although some studies (4,27,30) suggest that specialists are more accurate mammography readers than generalists, the term “specialist” was not clearly defined in those studies.
The factors described above that potentially lowered the rate of screen-detected cancer in Vermont compared with Norway also may have contributed to Vermont's higher interval cancer rate. However, the effects of these factors should be offset, to a large extent, by more frequent screening. With a shorter screening interval, cancers that are not detectable at screening have less time to be clinically detected before the next screen. It was therefore surprising that women in Vermont had a higher interval cancer rate per 1000 woman-years of follow-up and a higher probability of interval cancer regardless of the time since screening than women in Norway. These findings suggest that the Vermont women and/or their health care providers may more readily pursue evaluation of symptoms and clinical findings than their Norwegian counterparts. The predetermined 24-month screening interval and the scheduled examinations in Norway may result in women being more likely to wait until their next personal invitation, even if they have symptoms. This possibility is supported by our finding that the interval cancers diagnosed in Vermont were smaller and less likely to have lymph node involvement than those diagnosed in Norway.
Although the more favorable tumor characteristics observed in Vermont are advantageous for the women diagnosed with interval cancer, these characteristics may not have a sizable impact on the overall effectiveness of screening in terms of mortality reduction because interval cancers account for only 20%–30% of the breast cancers detected in screened women (31,32). The majority of interval cancers in Norway were diagnosed during the second year after screening, which suggests that a shorter screening interval might lead to earlier detection and, thus, more prognostically favorable tumor characteristics. However, a previous study from Norway that examined the characteristics of invasive interval cancers by time since last mammogram found that although tumor size increased with time since the last screen, there were no substantial differences in other tumor characteristics, such as tumor grade, lymph node involvement, or estrogen or progesterone receptor status (33).
The tumor characteristics of all invasive cancers (screen-detected plus interval cancers) did not differ between Vermont and Norway, even though the Norwegian women had a longer screening interval and invasive interval cancers with less favorable tumor characteristics than the Vermont women. This finding is consistent with a previous study by White et al. (34), who found no additional risk of late-stage breast cancer in US women 50 years or older who were screened biennially vs annually, and is further supported by a study by Wai et al. (35), which showed no difference in 5-year survival among women aged 50–74 years who underwent annual vs biennial screening mammography.
All DCIS rates (screening detection, interval cancer, and overall) computed per 1000 woman-years were lower in Norway than in Vermont. A similar finding of lower detection of DCIS with longer screening intervals has been previously reported (35), as has a higher proportion of DCIS among young women (36). One possible explanation for the higher detection of DCIS in Vermont is that, since 2001, approximately one-third of screening mammograms in Vermont were performed using digital imaging, which provides improved image contrast that may increase the detection of cancers, particularly DCIS, in dense breasts (37). Only about 8% of the screens from Norway that were included in this study were by digital mammography. The higher rate of DCIS in Vermont may also be due to the use of computer-aided detection, which can increase the detection of DCIS (25). Computer-aided detection has not been implemented in the screening program in Norway. The somewhat higher proportion of biopsies among women screened in the United States compared with women screened in Norway (3,38) may also have contributed to the higher proportion of DCIS in Vermont. The detection of DCIS, a preinvasive lesion, is controversial: some believe that low-grade DCIS lesions are being overdiagnosed and, consequently, that women are being treated for disease that is not life threatening or clinically relevant (39,40). However, some DCIS progresses to invasive cancer (40,41). Without knowing which cases of DCIS are likely to progress, it is impossible to determine whether the higher rate of DCIS in Vermont was a beneficial or adverse outcome for the women who were screened.
Our study has several limitations. First, although great attention was paid to creating variables that were comparable, it is possible that some results were influenced by subtle differences in the Vermont and Norwegian data definitions and collection procedures. However, our results for both Vermont and Norway were similar to those of other studies within each respective country or continent (3,8,35,38), which suggests that the variables were measuring what they were designed to measure. A second limitation is that not all variables that influence mammography accuracy were collected in both countries. For example, mammographic breast density was collected only in Vermont, and we were therefore unable to adjust for this possible confounder. However, we have no reason to believe that the distribution of breast density differs between the two populations. Finally, the absence of 1-year screens in Norway limited our ability to fully distinguish the effects of screening interval from other potential differences in mammography performance between Vermont and Norway.
In conclusion, screening in Vermont and Norway yielded comparable overall results. However, what works in one country may not work in the other. For example, it is unclear how the effectiveness of biennial opportunistic screening in Vermont would compare with that of the biennial organized screening in the Norwegian program. Implementation of biennial screening mammography in Vermont with no reduction in the interval cancer rate could have a negative impact on the prognosis of future interval cancers. Adoption of biennial screening in the United States would reduce the number of mammograms being performed, which might give radiologists more time to perform independent double reading and possibly offset the financial cost associated with double reading. Independent double reading with consensus probably accounts for the fewer interval cancers and lower recall rate in Norway.
Our results demonstrate that despite its longer screening interval, the organized population-based screening program in Norway achieved outcomes similar to those of the opportunistic screening in Vermont. The Norwegian women were exposed to half as many screening mammograms as the Vermont women, and the recall rate in Norway was statistically significantly lower than that in Vermont, yet the tumor characteristics for all invasive cancers diagnosed in the screened Norwegian women were not statistically significantly different from those diagnosed in the screened Vermont women. Although more frequent screening in Norway might lead to interval cancers that have more prognostically favorable tumor characteristics, it is unclear whether a shorter screening interval would decrease breast cancer mortality among screened Norwegian women.
National Cancer Institute (U01 CA070013 to P.M.V., J.S., D.L.W., B.M.G.); Cancer Registry of Norway (S.H.).
The study design, analysis, and interpretation of the data are the sole responsibility of the authors.