1.  Use of Imputed Population-based Cancer Registry Data as a Method of Accounting for Missing Information: Application to Estrogen Receptor Status for Breast Cancer 
American Journal of Epidemiology  2012;176(4):347-356.
The National Cancer Institute's Surveillance, Epidemiology, and End Results (SEER) Program provides a rich source of data stratified according to tumor biomarkers that play an important role in cancer surveillance research. These data are useful for analyzing trends in cancer incidence and survival. These tumor markers, however, are often prone to missing observations. To address the problem of missing data, the authors employed sequential regression multivariate imputation for breast cancer variables, with a particular focus on estrogen receptor status, using data from 13 SEER registries covering the period 1992–2007. In this paper, they present an approach to accounting for missing information through the creation of imputed data sets that can be analyzed using existing software (e.g., SEER*Stat) developed for analyzing cancer registry data. Bias in age-adjusted trends in female breast cancer incidence is shown graphically before and after imputation of estrogen receptor status, stratified by age and race. The imputed data set will be made available in SEER*Stat to facilitate accurate estimation of breast cancer incidence trends. To ensure that the imputed data set is used correctly, the authors provide detailed, step-by-step instructions for conducting analyses. This is the first time that a nationally representative, population-based cancer registry data set has been imputed and made available to researchers for conducting a variety of analyses of breast cancer incidence trends.
PMCID: PMC3491971  PMID: 22842721
breast neoplasms; imputation; incidence; missing data; receptors, estrogen
2.  Use of Colony-Stimulating Factors With Chemotherapy: Opportunities for Cost Savings and Improved Outcomes 
Myeloid colony-stimulating factors (CSFs) decrease the risk of febrile neutropenia (FN) from high-risk chemotherapy regimens administered to patients at 20% or greater risk of FN, but little is known about their use in clinical practice. We evaluated CSF use in a multiregional population-based cohort of lung and colorectal cancer patients (N = 1849). Only 17% (95% confidence interval [CI] = 8% to 26%) of patients treated with high-risk chemotherapy regimens received CSFs, compared with 18% (95% CI = 16% to 20%) and 10% (95% CI = 8% to 12%) of patients treated with intermediate- (10%–20% risk of FN) and low-risk (<10% risk of FN) chemotherapy regimens, respectively. Using a generalized estimating equation model, we found that enrollment in a health maintenance organization (HMO) was strongly associated with a lower adjusted odds of discretionary CSF use, compared with non-HMO patients (odds ratio = 0.44, 95% CI = 0.32 to 0.60, P < .001). All statistical tests were two-sided. Overall, 96% (95% CI = 93% to 98%) of CSFs were administered in scenarios where CSF therapy is not recommended by evidence-based guidelines. This finding suggests that policies to decrease CSF use in patients at lower or intermediate risk of FN may yield substantial cost savings without compromising patient outcomes.
PMCID: PMC3119647  PMID: 21670423
3.  Improved Estimates of Cancer-Specific Survival Rates From Population-Based Data 
Accurate estimates of cancer survival are important for assessing optimal patient care and prognosis. Evaluation of these estimates via relative survival (a ratio of observed and expected survival rates) requires a population life table that is matched to the cancer population by age, sex, race and/or ethnicity, socioeconomic status, and ideally risk factors for the cancer under examination. Because life tables for all subgroups in a study may be unavailable, we investigated whether cause-specific survival could be used as an alternative for relative survival.
We used data from the Surveillance, Epidemiology, and End Results Program for 2 330 905 cancer patients from January 1, 1992, through December 31, 2004. We defined cancer-specific deaths according to the following variables: cause of death, only one tumor or the first of multiple tumors, site of the original cancer diagnosis, and comorbidities. Estimates of relative survival and cause-specific survival that were derived by use of an actuarial method were compared.
Among breast cancer patients who were white, black, or of Asian or Pacific Islander descent and who were older than 65 years, estimates of 5-year relative survival (107.5%, 106.6%, and 103.0%, respectively) were higher than estimates of 5-year cause-specific survival (98.6%, 95% confidence interval [CI] = 98.4% to 98.8%; 97.4%, 95% CI = 96.2% to 98.2%; and 99.2%, 95% CI = 98.4% to 99.6%, respectively). Relative survival methods likely underestimated rates for cancers of the oral cavity and pharynx (eg, for white cancer patients aged ≥65 years, relative survival = 54.2%, 95% CI = 53.1% to 55.3%, and cause-specific survival = 60.1%, 95% CI = 59.1% to 60.9%) and the lung and bronchus (eg, for black cancer patients aged ≥65 years, relative survival = 10.5%, 95% CI = 9.9% to 11.2%, and cause-specific survival = 11.9%, 95% CI = 11.2% to 12.6%), largely because of mismatches between the population with these diseases and the population used to derive the life table. Socioeconomic differences between groups with low and high status in relative survival estimates appeared to be inflated (eg, corpus and uterus socioeconomic status gradient was 13.3% by relative survival methods and 8.8% by cause-specific survival methods).
Although accuracy of the cause of death on a death certificate can be problematic for cause-specific survival estimates, cause-specific survival methods may be an alternative to relative survival methods when suitable life tables are not available.
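The two quantities compared in this abstract can be illustrated with a minimal sketch. It assumes the standard actuarial (life-table) convention that withdrawals are at risk for half of each interval; the function names and all cohort numbers below are hypothetical and are not taken from the study. For cause-specific survival, deaths from other causes are treated as withdrawals; relative survival is the ratio of observed all-cause survival to the expected survival from a life table, so it can exceed 100% when the life table understates survival in the matched population.

```python
from typing import List, Tuple

# One interval: (alive_at_start, deaths, withdrawals). Hypothetical layout.
Interval = Tuple[int, int, int]


def actuarial_survival(intervals: List[Interval]) -> float:
    """Cumulative survival by the actuarial (life-table) method.

    Withdrawals (censored patients) are assumed to be at risk for
    half of the interval in which they withdraw.
    """
    cumulative = 1.0
    for alive, deaths, withdrawn in intervals:
        effective_at_risk = alive - withdrawn / 2.0
        if effective_at_risk <= 0:
            break  # nobody left at risk; stop accumulating
        cumulative *= 1.0 - deaths / effective_at_risk
    return cumulative


def relative_survival(observed: float, expected: float) -> float:
    """Ratio of observed survival to life-table expected survival."""
    return observed / expected


# Hypothetical two-interval cohort.
# Cause-specific view: only cancer deaths count as events, and
# other-cause deaths are folded into the withdrawals column.
cause_specific = actuarial_survival([(1000, 100, 20), (880, 80, 20)])

# Hypothetical observed and expected survival proportions; an expected
# value that is too low pushes the ratio above 1 (that is, above 100%).
ratio = relative_survival(observed=0.82, expected=0.80)
```

Here `ratio` is greater than 1 even though observed survival is only 82%, which mirrors how a mismatched life table can yield relative survival estimates above 100%.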
PMCID: PMC2957430  PMID: 20937991
4.  The Impact of Underreported Veterans Affairs Data on National Cancer Statistics: Analysis Using Population-Based SEER Registries 
Reduced cancer reporting by the US Department of Veterans Affairs (VA) hospitals in 2007 (for patients diagnosed through 2005) impacted the most recent US cancer surveillance data. To quantify the impact of the reduced VA reporting on cancer incidence and trends produced by the Surveillance, Epidemiology, and End Results Program, we estimated numbers of missing VA patients in 2005 by sex, age, race, selected cancer sites, and registry and calculated adjustment factors to correct the 2005 incidence rates and trends. Based on our adjustment factors, we estimated that as a result of the underreporting, the overall cancer burden was underestimated by 1.6% for males and 0.05% for females. For males, the percentage of patients missing ranged from 2.5% for liver cancer to 0.4% for melanoma of the skin. For age-adjusted male overall cancer incidence rates, the adjustment factors were 1.015, 1.012, and 1.035 for all races, white males, and black males, respectively. Modest changes in long-term incidence trends were observed, particularly in black males.
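As a rough illustration of how such adjustment factors might be used, the sketch below assumes each factor is applied multiplicatively to a reported age-adjusted rate. That reading is an assumption, not something the abstract states. The three factor values are the ones quoted above; the function name and the reported rate of 500.0 per 100,000 are hypothetical.

```python
# Adjustment factors for age-adjusted male overall cancer incidence (2005),
# as quoted in the abstract.
ADJUSTMENT_FACTORS = {
    "all races": 1.015,
    "white": 1.012,
    "black": 1.035,
}


def adjust_rate(reported_rate: float, factor: float) -> float:
    """Scale a reported rate by an adjustment factor.

    Assumes the factor is multiplicative (adjusted = reported * factor).
    """
    return reported_rate * factor


# Hypothetical reported rate per 100,000; not a figure from the study.
reported = 500.0
adjusted = {
    group: adjust_rate(reported, factor)
    for group, factor in ADJUSTMENT_FACTORS.items()
}
```

Under this assumption, the same reported rate is raised most for black males, consistent with the abstract's note that trend changes were most visible in that group.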
PMCID: PMC2720708  PMID: 19318639

