Follow-up procedures vary among cancer registries in North America. US registries are funded by the Surveillance, Epidemiology, and End Results (SEER) Program and/or the National Program of Cancer Registries (NPCR). SEER registries ascertain vital status and date of last contact to meet follow-up standards. NPCR and Canadian registries primarily conduct linkages with local and national death records to ascertain deaths. Data on patients diagnosed from 2002 through 2006 and followed through 2007 were obtained from 51 registries. Registries that met follow-up standards or, at a minimum, conducted linkages with local and national death records had comparable age-standardized five-year survival estimates (all sites and races combined): 63.9% SEER, 63.1% NPCR, and 62.6% Canada. Estimates varied by cancer site. Survival data from registries using different follow-up procedures are comparable if death ascertainment is complete and all nondeceased patients are presumed to be alive to the end of the study period.
Population-based cancer survival is a necessary component to fully understand the burden of cancer in the population (1). Survival data can help measure access to and utilization of health-care services (2) and inform the need for cancer control activities aimed at reducing the cancer burden (3,4). Comparative analyses of survival have been used to motivate and evaluate cancer control activities between and within countries (4,5). Similar analyses in North America may inform cancer control activities at the national and local levels.
Population-based cancer registries throughout North America collect information on all residents diagnosed with cancer in their catchment areas including demographic data (birth date, race/ethnicity [United States only], sex), case data (date of diagnosis, primary site, histology, behavior, sequence number), and follow-up data (vital status [alive and dead], date of last contact, source of follow-up information, and cause of death for deceased patients) using standard procedures (6).
Supported financially by their health agencies, cancer registries operate in all 10 Canadian provinces and 3 territories (7). All Canadian registries report demographic and case data to the Canadian Cancer Registry, a patient-based surveillance system maintained by Statistics Canada (8). In the United States, registries are supported through the National Cancer Institute’s Surveillance, Epidemiology, and End Results (SEER) Program (9) and the Centers for Disease Control and Prevention’s National Program of Cancer Registries (NPCR) (10). US registries report data to their respective federal funding programs. Between SEER and NPCR, there is a cancer registry in all 50 states and the District of Columbia (11). All Canadian and US registries are members of the North American Association of Central Cancer Registries (NAACCR).
Registries routinely perform death clearance by linking their incidence data with local death certificate records to identify cancer patients who were missed at the time of their diagnosis and follow back to obtain information relating to the decedent’s diagnosis. In addition, these linkages are also intended to give the registries the opportunity to record date and cause of death for deceased cancer patients (12). If no information related to an earlier diagnosis can be found, the cancer patient is registered as a death certificate-only (DCO) case and the date of death is recorded as the date of diagnosis.
Activities undertaken to obtain follow-up data can range from direct contact with cancer patients, their families, or physicians to indirect verification of vital status through computerized linkages with administrative databases such as those maintained by (in the United States) the Centers for Medicare and Medicaid Services, the Social Security Administration, licensing bureaus and voter registration offices (13).
Deaths are recorded in the province or state where the death occurs, and this information is forwarded if necessary to the vital records office in the jurisdiction where the person resided at the time of their death. However, the province or state where the death occurred may not be the same as where the patient was diagnosed and registered by the cancer registry. In this situation, linkage between the registry and their national repository of death certificate information [Statistics Canada’s Death Database (14) and Centers for Disease Control and Prevention’s National Death Index (NDI) (15)] can ascertain deaths that occur within the country. These linkages do not provide follow-up information on living patients or patients who moved out of the country between the time of their diagnosis and death. Annually, Statistics Canada links the Canadian Cancer Registry to its Death Database (except for Quebec deaths due to legal limitations) and notifies the reporting registry of out-of-province deaths. The SEER Program requires their registries to meet follow-up standards by reporting vital status and a date of last contact that is within 22 months of the date of their annual data submission to SEER for a minimum of 90% of all registered cancer patients, living and deceased (16). NPCR registries record vital status and the date of last contact as available, but the registries are not funded or required to meet follow-up standards. However, in 2008, NPCR and SEER arranged for NDI linkage services to be available at no additional cost to their registries to encourage the registries to comprehensively ascertain deaths within the United States. Most but not all NPCR and SEER registries are performing these linkages.
NAACCR compiles incidence data from all member registries through their annual Call for Data and publishes these data in Cancer in North America (17). NAACCR has developed criteria for evaluating and certifying cancer registries as having high-quality cancer incidence data (18). These data are aggregated for combined estimates of cancer incidence at the national level.
The extent and method of follow-up activities have been shown to affect the completeness of death ascertainment and the period that a cancer patient is at known or assumed risk of dying (19,20). Missing deaths can result in spuriously high survival estimates, whereas over- or under-estimating the time that a cancer patient is at risk of dying can raise or lower survival estimates (20). Therefore, different follow-up procedures may bias the comparison of survival estimates among registries.
The purpose of this study was to assess the effect of different follow-up procedures on five-year survival estimates and develop criteria for identifying high-quality cancer survival data for inclusion in Cancer in North America.
In 2011, NAACCR requested follow-up information for all cases of invasive cancer diagnosed 1995 through 2009. Site and histology information were coded to the third edition of the International Classification of Diseases for Oncology and recoded to SEER site recodes (21,22). Registries were eligible for inclusion in this analysis if their incidence data met Cancer in North America publication criteria for diagnosis years 2002 through 2006 (17), they consented to have their data included in this analysis, and they submitted follow-up information through December 31, 2007. These years were selected to optimize the number of eligible registries. Cases diagnosed from July 1 through December 31, 2005 in the states of Alabama, Louisiana, Mississippi, and Texas were excluded due to problems with data collection following Hurricanes Katrina and Rita (23).
Record-level data were extracted for all patients diagnosed with an invasive primary cancer between the age of 15 and 99 years using the case listing function in SEER*Stat software (Version 7.04, Information Management Services, Inc., Silver Spring, MD) and processed using SAS (Version 9.2, Cary, NC). In addition to the date of last contact reported by the cancer registry, an end-of-study date (December 31, 2007) was assigned to all nondeceased cases. Date of last contact was not altered for deceased cases. One Canadian registry (CAN01) assigned a reported follow-up date of December 31, 2009 for all nondeceased cancer patients.
The data were reloaded into SEER*Stat for analysis using SEER*Prep (Version 2.4.6, Information Management Services, Inc., Silver Spring, MD).
Registry data were grouped by country (Canada and United States) and within the United States (NPCR and SEER). Registries funded by both SEER and NPCR (California, Georgia, Kentucky, Louisiana, and New Jersey) were categorized as SEER because these registries are funded to meet SEER follow-up standards and their survival data have been published (24). The 51 participating registries were categorized as follows: Canada (Alberta, Manitoba, Nova Scotia, and Ontario); NPCR (Alabama, Alaska, Arizona, Arkansas, Colorado, Delaware, Florida, Idaho, Illinois, Indiana, Maine, Massachusetts, Michigan, Minnesota, Missouri, Montana, Nebraska, Nevada, New Hampshire, New York, North Carolina, North Dakota, Ohio, Oklahoma, Oregon, Pennsylvania, Rhode Island, South Carolina, South Dakota, Texas, Virginia, Washington, West Virginia, Wisconsin, and Wyoming); and SEER (California, Connecticut, Georgia, Hawaii, Iowa, Kentucky, Louisiana, New Jersey, New Mexico, Utah, and metropolitan-area registries in Detroit and Seattle). To maintain confidentiality, registries were identified by randomly assigned numeric values within country and by program. Data from NPCR01 to NPCR16 contained information from NDI for deaths that occurred in 2002 through 2007 (n = 14) and/or met SEER follow-up standards (n = 3).
Survival analyses were restricted to the first primary cancer diagnosed (only cancer diagnosed or the first of multiple primary [MP] cancers diagnosed) for each cancer patient, and excluded DCO and autopsy-only cases because these cases had no calculable survival interval. To estimate the percentage of cases that were excluded from the survival analyses, we evaluated the percentage of MP cancers using SEER MP coding rules (25) and the percentage of DCO (including autopsy-only) cases among first primary cancers. To evaluate the breadth of case finding activities conducted by each cancer registry, we estimated the percentage of microscopically confirmed (MC) cases among first primary non-DCO cases. A patient was considered to have complete follow-up information if they were deceased (any date) or alive with last follow-up date of January 1, 2008 or later. Data from SEER registries were considered the gold standard when evaluating data from Canada and NPCR because SEER registries are funded and required to meet follow-up standards.
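The complete follow-up rule described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' SAS or SEER*Stat processing; the function names, the `"dead"`/`"alive"` labels, and the tuple layout are assumptions made for the example. Only the January 1, 2008 cutoff comes from the text.

```python
from datetime import date

# Cutoff taken from the text: alive patients need a last follow-up
# date of January 1, 2008 or later.
CUTOFF = date(2008, 1, 1)


def has_complete_follow_up(vital_status, last_contact):
    """Deceased (any date) counts as complete; alive needs contact on/after CUTOFF.

    vital_status is "dead" or "alive" (hypothetical labels for this sketch).
    """
    if vital_status == "dead":
        return True
    return last_contact is not None and last_contact >= CUTOFF


def percent_complete(patients):
    """Percentage of (vital_status, last_contact) records with complete follow-up."""
    patients = list(patients)
    if not patients:
        return 0.0
    complete = sum(has_complete_follow_up(v, d) for v, d in patients)
    return 100.0 * complete / len(patients)
```

Applied to one registry's first primary, non-DCO cases, `percent_complete` would yield a figure comparable to the complete follow-up percentages reported by registry.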
Cases with missing age or sex were excluded. The survival interval for decedents was computed from the date of diagnosis to the date of death. Nondeceased cases contributed person-time at risk and were censored at the date of last contact as reported by the cancer registry (reported alive survival) and at the end-of-study date (presumed alive survival). To obtain more up-to-date survival estimates in months, the multiple-year cohort method was used, which included all patients diagnosed in the most recent years spanning the maximum duration (60 months) to be estimated (26).
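The difference between the two censoring rules can be made concrete with a small Python sketch. It is a simplified illustration under stated assumptions: the function name, the status labels, and the use of days rather than months are choices made for the example, and the date of last contact is taken to be the date of death for decedents. The December 31, 2007 end-of-study date comes from the text.

```python
from datetime import date

# End-of-study date assigned to all nondeceased cases in the presumed alive analysis.
END_OF_STUDY = date(2007, 12, 31)


def follow_up_days(diagnosis, vital_status, last_contact, presumed_alive):
    """Return (days_at_risk, died) for one patient.

    Decedents: interval runs from diagnosis to death in both analyses.
    Nondeceased: censored at the registry's last contact date (reported alive)
    or at END_OF_STUDY (presumed alive).
    """
    if vital_status == "dead":
        return (last_contact - diagnosis).days, True
    censor_date = END_OF_STUDY if presumed_alive else last_contact
    return (censor_date - diagnosis).days, False
```

For a patient diagnosed on January 1, 2005 and last contacted on January 1, 2006, the reported alive analysis credits 365 days at risk, whereas the presumed alive analysis credits 1094 days.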
All-cause (observed) survival is the percent of patients alive at some specified time after diagnosis following the identification of deaths from all causes. Relative survival (RS) estimates the probability of survival related to a specific cancer in the absence of other causes of death and is calculated as the ratio of all-cause survival among cancer patients with a specific cancer to the expected survival among the population with similar demographic characteristics [eg, age, sex, calendar year of death, race, and geographic area (27)]. For US registries, expected survival was derived from national life tables available in SEER*Stat. Provincial life tables were used for Canadian registries (28). Confidence intervals were calculated using a log (−[log]) transformation.
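As a rough illustration of these two calculations, the sketch below computes relative survival as a ratio and a confidence interval on the log(−log) scale. It is not SEER*Stat's implementation; the function names are hypothetical, the standard error is assumed to be supplied by the caller (for example from Greenwood's formula), and the interval is shown for a survival proportion strictly between 0 and 1.

```python
import math


def relative_survival(observed, expected):
    """Ratio of all-cause survival among patients to expected survival
    in a demographically similar general population."""
    return observed / expected


def loglog_ci(s, se, z=1.96):
    """Confidence interval for a survival proportion s (0 < s < 1) with
    standard error se, built on the log(-log(s)) scale and back-transformed.

    On that scale the standard error is se / (s * |ln s|); back-transforming
    gives the bounds s ** exp(+/- z * that value).
    """
    if not 0.0 < s < 1.0:
        raise ValueError("s must lie strictly between 0 and 1")
    half_width = z * se / (s * abs(math.log(s)))
    lower = s ** math.exp(half_width)
    upper = s ** math.exp(-half_width)
    return lower, upper
```

With an observed survival of 0.60 and an expected survival of 0.80, `relative_survival` returns approximately 0.75. Because the interval is built on the transformed scale, its bounds always stay between 0 and 1, which a symmetric interval on the raw scale does not guarantee.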
Survival estimates were age-standardized to the International Cancer Survival Standard using five age groups (29). For some combinations of registry, primary site, and age group, survival estimates could not be calculated because no cases were reported alive at the start of the 60th month. To avoid presenting missing survival estimates in this situation, unstandardized estimates were used for both reported and presumed alive survival. These estimates are identified by a caret (^) in the figures.
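Direct age-standardization of this kind is a weighted average of age-specific estimates. The sketch below shows the arithmetic and the fallback signal for a missing age group; the function name is hypothetical, and the weights are passed in by the caller rather than hard-coded, since the actual International Cancer Survival Standard weights are not reproduced here.

```python
def age_standardize(survival_by_age, weights):
    """Weighted average of age-specific survival estimates.

    survival_by_age: one estimate per age group, or None where no estimate
                     could be calculated.
    weights:         standard-population weights in the same order
                     (supplied by the caller; any positive scale).
    Returns None if any age-specific estimate is missing, signalling that
    the caller should fall back to the unstandardized estimate.
    """
    if len(survival_by_age) != len(weights):
        raise ValueError("need one weight per age group")
    if any(s is None for s in survival_by_age):
        return None
    total_weight = sum(weights)
    return sum(s * w for s, w in zip(survival_by_age, weights)) / total_weight
```

Returning `None` instead of raising mirrors the approach described above, where the unstandardized estimate is substituted and flagged with a caret when an age group has no calculable estimate.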
Table 1 presents data quality indicators for all cancers combined by registry. Values that fell plus or minus 1% outside the SEER range were noted by a double dagger symbol (‡) in the table. This error allowance is consistent with NAACCR evaluation methods (18). The percentage of MP cases reported (sequence number: ≥02) ranged from 13.6% to 19.1% among SEER registries; eight NPCR registries and two Canadian registries reported values below the SEER range. Among first primary cancers, the percentage of DCO cases ranged from 0.4% to 3.7% for SEER; no NPCR or Canadian registry had a value outside the SEER range. The percentage of total cases excluded ranged from 14.9% to 20.5% among SEER registries; five NPCR and two Canadian registries reported values below the SEER range. The percentage of MC first primary, non-DCO cases ranged from 93.3% to 96.9% for SEER; one NPCR registry (NPCR15) reported a value (100.0%) above the SEER range, and two Canadian registries reported values below the SEER range. The percentage of cases with complete follow-up information was 97.5%–99.6% for SEER, 30.8%–96.1% for NPCR, and 54.8%–100.0% among Canadian registries. All NPCR and two Canadian cancer registries reported lower percentages of complete follow-up than SEER registries. Three NPCR registries (NPCR06, NPCR10, and NPCR13) and two Canadian registries (CAN01 and CAN02) reported follow-up that met SEER follow-up standards, although CAN01 appeared to have assigned a uniform follow-up date to all nondeceased patients based on when the registry submitted their data to NAACCR.
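The flagging rule used in Table 1 can be expressed as a short Python sketch. It assumes that the "1%" allowance means one percentage point beyond the minimum or maximum of the SEER values; the function name and return labels are hypothetical.

```python
def flag_outside_range(value, reference_values, allowance=1.0):
    """Compare a registry's indicator (in percent) with the reference range.

    Returns "below" or "above" when the value lies more than `allowance`
    percentage points outside the min-max range of reference_values,
    otherwise None.
    """
    low = min(reference_values) - allowance
    high = max(reference_values) + allowance
    if value < low:
        return "below"
    if value > high:
        return "above"
    return None
```

Using the SEER range for microscopically confirmed cases (93.3% to 96.9%), a registry reporting 100.0% would be flagged as above the range, while a registry at 92.5% would fall within the allowance and not be flagged.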
Stacked bar charts (Figures 1A–D) show age-standardized five-year RS for four leading cancers by registry. The lighter shading shows the reported alive survival percent and the final (black) bar width shows the presumed alive survival percent. Values greater than 1% above the SEER range for presumed alive survival were noted by an asterisk (*) in the figures. The range of presumed alive survival was as follows: female breast cancer (Figure 1A) was 82.6%–94.1% for SEER, 83.3%–109.1% for NPCR, and 85.7%–86.3% for Canadian registries; prostate cancer (Figure 1B) was 94.7%–101.5% for SEER, 92.5%–113.5% for NPCR, and 92.8%–97.7% for Canadian registries; cancers of the lung and bronchus (Figure 1C) were 13.4%–20.5% for SEER, 14.0%–46.0% for NPCR, and 13.9%–20.7% for Canadian registries; and cancers of the colon and rectum (Figure 1D) were 61.2%–69.2% for SEER, 58.9%–96.3% for NPCR, and 61.6%–64.1% for Canadian registries. Among registries that reported higher survival than SEER, all were NPCR. Of these registries, only two of seven for colorectal cancers, zero of three for breast cancer, zero of four for prostate cancer, and zero of nine for lung and bronchus cancers met SEER follow-up standards and/or included NDI linkage results in their datasets.
Visual inspection of Figures 1A–D shows that two NPCR registries (NPCR26 and NPCR31) reported the highest five-year presumed alive survival for all cancers examined. Visual inspection also suggests the extent of follow-up activities performed by each registry. For all SEER registries, reported alive survival nearly equaled the presumed alive survival. Two Canadian registries (CAN01 and CAN02) and three NPCR registries (NPCR06, NPCR10, and NPCR13) reported similar follow-up patterns, suggesting that they performed complete follow-up on living patients or assigned follow-up dates assuming that cases not known to be deceased were still alive. Data from two of these registries (NPCR10 and NPCR13) did not include NDI linkage results. For all remaining NPCR registries, variable differences were observed between reported alive and presumed alive survival.
RS (%) estimates in Tables 2 and 3 were based on data from 32 registries (all SEER and Canadian registries and NPCR01–NPCR16). These data covered 53.2% of the US population (statewide coverage) and 55.4% of the Canadian population (province-wide coverage), respectively. Table 2 shows age-standardized five-year, presumed alive RS and data quality indicators (%DCO and %MC) by registry. Survival estimates were noted in the table if they were 1% above the SEER range. The range of survival for female breast cancer was 82.6%–94.1% for SEER, 83.3%–94.1% for NPCR, and 85.7%–86.3% for Canadian registries. Survival for prostate cancer ranged from 94.7% to 101.5% for SEER, 92.5% to 102.5% for NPCR, and 92.7% to 97.9% for Canadian registries. For six registries (three SEER and three NPCR), presumed alive RS exceeded 100.0% for prostate cancer. The range of survival for cancer of the lung and bronchus was 13.4%–20.5% for SEER, 14.0%–20.1% for NPCR, and 13.9%–20.7% for Canadian registries. Survival for colorectal cancer ranged from 61.2% to 69.2% for SEER, 58.9% to 71.0% for NPCR, and 61.6% to 64.1% for Canadian registries. One NPCR registry (NPCR15) reported the highest (colorectal) or among the highest (breast, prostate, and lung) survival estimates of all registries and 100% MC cases for all four cancers. The %DCOs for this registry were within the SEER range for all four cancer sites.
Table 3 shows age-standardized five-year RS for all sites combined and 23 common cancers. SEER survival estimates are shown using both the reported alive and presumed alive survival. Survival for NPCR and Canada was computed using the presumed alive survival estimates. For NPCR and SEER survival combined, reported survival time was used for all SEER registries and for three NPCR registries that met the SEER follow-up standard. Presumed alive survival estimates were used for the remaining 13 NPCR registries.
Among SEER registries, presumed alive survival estimates were marginally and consistently higher than reported alive survival estimates for all cancer sites examined; differences ranged from 0.1 (testis) to 1.5 (cervix) percentage points. Differences were higher for cancer sites with poor survival (stomach, pancreas, and liver cancer). Presumed alive survival estimates for all sites combined were 63.9% for SEER, 63.1% for NPCR, and 62.6% for Canada. Rates varied by cancer type and country and within the United States, by program. The absolute difference between SEER and NPCR ranged from 0.1 (non-Hodgkin lymphoma) to 2.2 (cancers of the brain and other nervous system) percentage points and between the United States (SEER and NPCR combined) and Canada from 0.1 (brain and other nervous system) to 5.7 (liver and intrahepatic bile duct) percentage points.
This study investigated the impact of different follow-up procedures on survival estimates using data from 51 NAACCR member registries with high-quality cancer incidence data. The analysis showed that for Canadian and NPCR registries that either conducted national death linkages or met SEER follow-up standards, survival estimates for all races combined appeared to be in the range of the SEER registries, which were considered the gold standard because these registries are funded and required to meet standards for reporting timely and complete follow-up data, and their survival data are routinely published (24).
However, ensuring the accuracy and comparability of population-based cancer survival can be challenging for reasons related to the completeness and quality of the incidence data; the procedures used to collect follow-up information; and the assumptions underlying the analyses, particularly as relates to known or presumed period of risk of death among cancer survivors. In addition, a close examination of survival data can further reveal data quality issues not apparent in an examination of incidence data alone.
First, to ensure accurate and unbiased survival estimates, registries must ascertain all, or nearly all, patients diagnosed with cancer in their catchment area (30). In 1997, NAACCR instituted a program to evaluate and certify incidence data from member registries. Information on evaluation methods and certification criteria has been published (18). Considerable progress has been made in increasing the number of registries reporting high-quality incidence data (31).
Population-based incidence data reflect a mixture of cancers that are diagnosed through microscopic confirmation, radiologic imaging, clinical examination, or laboratory procedures. As seen in this study, the majority of cases were MC. However, at the very least, death clearance should identify a small percentage of clinically confirmed cases, particularly among elderly patients and patients with highly fatal cancers, who upon follow back, are found to have a date of diagnosis that precedes their date of death and thus have a calculable survival interval. One NPCR registry (NPCR15) reported 100% MC non-DCO cases and somewhat higher survival estimates for all four cancers investigated in this study. This registry’s higher survival estimates do not appear to result from a lack of death ascertainment (they conducted death clearance and NDI linkages and their percentage DCO cases was comparable with other registries); instead, these survival estimates may reflect selection bias due to an overreliance on pathology laboratories as a case finding source and/or the systematic exclusion of clinically confirmed cases with short survival times. This registry should review their death clearance procedures and case finding sources and include all patients diagnosed with cancer within their state, as is characteristic of a population-based cancer registry (30). Systematically excluding clinically confirmed cases diminishes the registry’s ability to inform population-based cancer control efforts.
Death clearance is primarily intended to identify incident cases that were missed at diagnosis and is thus performed throughout or immediately following the end of the current diagnosis year. Death clearance also provides an opportunity to update vital status (ie, deaths) among registered cancer patients when the linkages are performed using all death certificates, and not just those coded to cancer as the underlying cause of death. All cancer registries are required to perform death clearance and to update vital status of decedents (12). Linkage with local death files identifies the majority of deaths (32). The unusually and consistently high survival estimates observed in two NPCR registries (NPCR26 and NPCR31), neither of which conducted NDI linkages, suggest that these registries are missing a substantial number of deaths, including deaths that could have been found through routine linkage with state death records. These registries should review their death clearance procedures to ensure that the vital status of registered incident cases that link with state death records (all causes) is updated accordingly.
Linkage with local death records may yet miss some deaths that are reported late to the state vital statistics office through interstate data exchange agreements, and among patients who move out of the state between the time of their diagnosis and their death. Interstate and interprovincial migration rates are known to vary by jurisdiction (33,34). For this reason, it is important for all cancer registries to link with their national death databases. Missing only a small number of deaths can result in an overestimation of survival (19,20). This study confirmed this finding and showed the importance of ascertaining all, or nearly all, deaths, particularly for fatal cancers such as cancers of the lung and bronchus (Figure 1C). NPCR registries whose data did not include NDI linkage results tended to report higher survival rates than SEER registries, which performed complete follow-up, or Canadian registries, which linked with their national death database. Even among NPCR and Canadian registries that conducted national death linkages, there was the suggestion that some deaths may yet be missing as evidenced by somewhat higher survival estimates for highly fatal cancers such as liver and pancreas (Table 3).
Further work is needed to ensure the accuracy of death record linkages. Evaluation of “immortals”—patients who are diagnosed with a highly fatal and/or late stage cancer and who are reported alive well beyond their natural life span—may elucidate potential problems with death ascertainment, including those resulting from linkage errors owing to missing, incomplete, or nonspecific linkage variables. The reluctance of some medical facilities to report Social Security numbers and/or birth dates to cancer registries may impede the registries’ ability to identify deaths through linkages with state and national death certificate files (35). It should also be kept in mind that deaths among patients who leave the country will not be ascertained through linkage with local or national death certificate databases.
Once deaths have been comprehensively ascertained, the question becomes what period at risk should be used to calculate survival for nondeceased patients: the period that the patient is reported to be alive (from the date of diagnosis to the date of last contact as reported by the cancer registry) or the period the patient is presumed to be alive (from the date of diagnosis to the end of the study period). Systematically underestimating the time that a cancer patient is at risk of dying can lower survival estimates (20). As seen in this study, reported alive survival estimates were slightly lower than presumed alive estimates for all cancers among the SEER registries (Table 3). This was because the survival interval in the reported alive survival analysis could be shorter than the survival interval in the presumed alive survival analysis.
Cancer registries that conduct complete follow-up to meet SEER follow-up standards are expending a great deal of resources updating follow-up information on all patients surviving a diagnosis of cancer. As the number of these prevalent cases increases, due to improving survival and a growing elderly population in North America (36,37), the burden and cost to the cancer registry to maintain complete and timely follow-up information will increase. Focusing resources on first identifying all deaths among registered patients before following cancer survivors may be a more effective and efficient use of limited resources, especially for NPCR registries that are neither required nor funded to conduct follow-up activities.
SEER registries have additional information on a somewhat different type of “immortals”—cancer patients who have a diagnosis date but who immediately became lost to follow-up following their diagnosis. This situation can occur if a cancer patient leaves the United States immediately following their diagnosis (35). Reported alive survival analyses exclude these cases (they were documented with zero follow-up time), whereas presumed alive survival analyses include these cases and consider them to be alive at the end of the study period. These cases contribute no time at risk of death in the reported alive analysis and maximum time at risk of death in the presumed alive analysis. Cancer registries that do not conduct complete follow-up will not have knowledge of these “immortal” cancer patients and must consider all nondeceased cancer patients to be alive as of the end of the study period. For this reason, registries must pay close attention to where the cancer patient resided when they were diagnosed to identify nonresidents who were evaluated or treated in a medical facility in the registry’s catchment area. Cancer patients who leave the country immediately following a diagnosis of cancer (to be with family and/or to obtain treatment not available to them where they reside) will be more difficult to identify.
Incomplete follow-up, including death ascertainment, among certain racial and ethnic groups can bias survival estimates (38). In addition, a bias may result among younger cancer patients due to lower follow-up rates among these patients (20). The extent and direction of these biases may vary according to primary site, age, and stage.
The proportion of cases that were excluded from these analyses ranged from 6.9% (CAN03) to 20.5% (SEER04), with the majority of exclusions in all registries due to MP cancers, or subsequent cancers. Although it has been routine practice to estimate survival based on first primary, non-DCO cancers, this practice may introduce a bias in comparative studies. To minimize this bias, Brenner and Hakulinen (39) have suggested that all primary cancers, regardless of their sequence number, be included in comparative survival analyses. Work is underway to assess the impact of this recommendation on data collected using SEER MP rules (40).
RS estimates were calculated for the purpose of comparing different methods of follow-up. Survival estimates may over- or under-estimate survival if there is a mismatch between the life table and the cancer patient cohort (41). For example, six US registries (three NPCR and three SEER) reported prostate survival in excess of 100%, suggesting that survival among these cancer patients was higher than the expected survival in the general population. This can happen when information on deaths is missing or when cancer patients are successfully treated for their cancer and other comorbidities and adopt a healthy lifestyle. It can also occur when the life tables do not accurately reflect the background mortality experience of the cancer patients. As life expectancy in the United States has been shown to vary by state, using national life tables may not be appropriate for all states (42). A comparative analysis of national- versus state-specific life tables is presented elsewhere in this monograph (43).
In conclusion, survival data from registries utilizing different follow-up procedures may be aggregated and compared across programs and jurisdictions. The choice between using the reported alive survival interval or the presumed alive survival interval, or a combination of the two, will depend on the groups to be compared and the objective of the analysis.
NAACCR survival data, which currently cover more than 50% of the populations of the United States and Canada, respectively, can make a significant contribution to understanding the cancer burden and evaluating cancer control efforts aimed at reducing the burden at the national, provincial, state, and local levels.
As NAACCR prepares to include population-based cancer survival estimates in Cancer in North America, the following points should be considered, based upon our study results:
There are no financial disclosures from any of the authors.
The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention.