Adverse socioeconomic conditions, at both the individual and the neighborhood level, increase the risk of colorectal cancer (CRC) death, but little is known regarding whether CRC survival varies geographically and the extent to which area-level socioeconomic deprivation affects this geographic variation. Using data from the National Institutes of Health (NIH)-AARP Diet and Health Study, the authors examined geographic variation and the role of area-level socioeconomic deprivation in CRC survival. CRC cases (n = 7,024), identified during 1995–2003, were followed for their CRC-specific vital status through 2005 and overall vital status through 2006. Bayesian multilevel survival models showed that there was significant geographic variation in overall (variance = 0.2, 95% confidence interval (CI): 0.1, 0.2) and CRC-specific (variance = 0.3, 95% CI: 0.1, 0.4) risk of death. More socioeconomically deprived neighborhoods had a higher overall risk of death (most deprived quartile vs. least deprived: hazard ratio = 1.2, 95% CI: 1.1, 1.4) and a higher CRC-specific risk of death (most deprived quartile vs. least deprived: hazard ratio = 1.2, 95% CI: 1.1, 1.5). However, neighborhood socioeconomic deprivation did not account for the geographic variation in overall and CRC-specific risks of death. In future studies, investigators should evaluate other neighborhood characteristics to help explain geographic heterogeneity in CRC survival. Such research could facilitate interventions for reducing geographic disparity in CRC survival.
Colorectal cancer (CRC) is the third most common cancer in both men and women, accounting for 10% of cancer incidence and 9% of cancer mortality in the United States (1). Eliminating health disparities in CRC, including geographic disparity, is one of the overarching goals of Healthy People 2010 and is part of the National Cancer Institute's strategic plan (2, 3). Geographic disparity in CRC survival might reflect inequities in small-area socioeconomic conditions and/or prevalence of individual prognostic factors for CRC survival, such as cancer characteristics, patient characteristics, health conditions, and health behaviors (4). Although Aarts et al. (5) reported that socioeconomic conditions at the individual and neighborhood levels were associated with CRC survival, few studies have quantified the geographic variation in CRC survival and examined the extent to which area-level socioeconomic conditions accounted for geographic variation in CRC survival. The previously reported associations between neighborhood-level socioeconomic conditions and CRC survival were based on estimation of the usual hazard ratio (fixed effect), which cannot ideally assess the role of neighborhood-level factors in any geographic heterogeneity because the variability in the random effects among neighborhoods is ignored (6).
Thus, our main purpose in this study was to estimate the extent of the geographic variation in all-cause and CRC-specific survival and to examine whether small-area socioeconomic deprivation accounted for any geographic variation after adjustment for individual characteristics, using data from a large, population-based prospective cohort study, the National Institutes of Health (NIH)-AARP Diet and Health Study. Identification of potential reasons for any geographic disparity may help provide opportunities for intervention to reduce such disparities.
The NIH-AARP Diet and Health Study is a prospective cohort study resulting from collaboration between the NIH and the AARP (formerly known as the American Association of Retired Persons). In 1995–1996, baseline questionnaires were mailed to 3.5 million AARP members living in one of 6 US states (California, Florida, Louisiana, New Jersey, North Carolina, and Pennsylvania) or 2 metropolitan areas (Atlanta, Georgia; Detroit, Michigan). This questionnaire included a 124-item food frequency questionnaire and questions on demographic and lifestyle factors. The study participants were mailed risk-factor questionnaires in 1996–1997 and follow-up questionnaires in 2002–2004. Of the 566,402 eligible subjects, we identified 7,024 primary CRC cases between study enrollment (1995–1996) and December 31, 2003 (International Classification of Diseases for Oncology, Third Edition (ICDO-3), codes C180–C189, C199, and C209). The cancer cases were identified through linkage with state cancer registry databases, which detects approximately 90% of all cancer cases (7). Details on the NIH-AARP Diet and Health Study are provided elsewhere (8). The NIH-AARP Diet and Health Study was approved by the Special Studies Institutional Review Board of the National Cancer Institute, and the current study was approved by the Institutional Review Board of Washington University School of Medicine.
Study participants were followed for CRC-specific vital status until December 31, 2005, and for overall vital status until December 31, 2006. Vital status was ascertained by annual linkage of the cohort to the Social Security Administration Death Master File. Decedents were linked to the National Death Index to ascertain information about the cause of death.
Each patient's residential address from the baseline questionnaire was geocoded using geographic information systems to obtain geographic coordinates (latitude and longitude), which were matched to 2000 US Census TIGER/Line files to locate the residential census tracts of study participants.
We developed a census-tract-level socioeconomic deprivation index score using 21 variables from the 2000 US Census, which were identified from previous studies (9–12), in 6 domains based on the census tracts that contained at least 1 study participant. The domains included 1) education (percentage of the total adult population with less than a high school education and percentage of the total adult population with a college degree); 2) employment and occupation (percentage of unemployed males aged 20 years or more, percentage of unemployed females aged 20 years or more, percentage white-collar, and percentage with low social class); 3) housing conditions (percentage of households with ownership, percentage of vacant households, percentage of households with no less than 1 person per room, percentage of female-headed households with dependent children, median value of all owner-occupied households, percentage of households receiving public assistance, and percentage of households without a car); 4) income and poverty (percentage of households with a low income, percentage of households with an income no less than 400% of the US median household income, median household income in 1999, and percentage of the population below the federal poverty line); 5) racial composition (percentage of the population non-Hispanic black and percentage of the population Hispanic); and 6) residential stability (percentage of residents aged 65 years or more and percentage of persons living in the same residence since 1995) (Table 1). Principal-components factor analysis with varimax rotation was used for variable reduction to evaluate the component structure of the 21 census variables. Cronbach's alpha coefficient was used to evaluate the internal consistency of the selected census variables included in the deprivation index. The index scores were categorized into quartiles based on the distribution among census tracts.
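The internal-consistency check and the quartile categorization described above can be illustrated with a minimal sketch. The tract values below are made up, the function names are hypothetical, and equal weights stand in for the factor scoring coefficients, so this is not the index actually used in the study.

```python
from statistics import mean, pstdev, pvariance, quantiles


def standardize(col):
    """Convert one census variable to z-scores across tracts."""
    m, s = mean(col), pstdev(col)
    return [(x - m) / s for x in col]


def cronbach_alpha(items):
    """Cronbach's alpha; items is a list of columns, one per census variable."""
    k = len(items)
    totals = [sum(row) for row in zip(*items)]  # per-tract sum across variables
    item_var = sum(pvariance(col) for col in items)
    return k / (k - 1) * (1.0 - item_var / pvariance(totals))


def quartile_labels(scores):
    """Label each score 1 (lowest quartile) to 4 (highest quartile)."""
    q1, q2, q3 = quantiles(scores, n=4)
    return [1 + (s > q1) + (s > q2) + (s > q3) for s in scores]


# Hypothetical percentages for 8 census tracts (illustration only).
unemployment = [4.0, 6.5, 9.0, 12.5, 15.0, 7.0, 5.5, 11.0]
poverty = [8.0, 11.0, 17.5, 22.0, 30.0, 12.0, 9.5, 20.0]
no_car = [3.0, 5.0, 9.5, 14.0, 21.0, 6.0, 4.0, 12.0]

items = [standardize(col) for col in (unemployment, poverty, no_car)]
# Assumption: equal weights; the study weighted by factor scoring coefficients.
scores = [mean(row) for row in zip(*items)]

print("alpha:", round(cronbach_alpha(items), 3))
print("deprivation quartile per tract:", quartile_labels(scores))
```

With a higher score meaning more deprivation, label 4 corresponds to the most deprived quartile and label 1 to the least deprived.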
Six blocks of individual-level variables were considered as potential confounders when examining the geographic variation in CRC survival and the role of neighborhood socioeconomic deprivation (Table 2). The baseline questionnaire was used to obtain information on individual demographic and lifestyle factors, including age, sex, race/ethnicity, education, marital status, smoking, heavy alcohol drinking (≥15 g/day), vigorous physical activity (any physical activity that lasted at least 20 minutes and led to increases in breathing or heart rate or working up a sweat in the past 12 months), and self-rated health. Because some of these variables were reported prior to CRC diagnosis, it is possible that they may have changed over time after diagnosis.
Age was categorized as <65 years, 65–69 years, or ≥70 years. Race was dichotomized as non-Hispanic white or other. Educational level was grouped as less than high school, high school or some college, or college graduate. Marital status was dichotomized as married/living as married versus never married, widowed, separated, or divorced. CRC site was categorized as proximal (ICDO-3 codes C180–C184), distal (ICDO-3 codes C185–C187), colon with site unspecified (ICDO-3 codes C188–C189), or rectum (ICDO-3 codes C199 and C209). CRC stage was combined into 4 categories (in situ or local, regional, distant metastases/systemic disease, or unstaged/missing) based on the summary staging system, which was used because of its stability over time (13). CRC grade was categorized as well-differentiated, moderately differentiated, or poorly differentiated/undifferentiated. Body mass index (weight (kg)/height (m)²) was categorized as <25.0, 25.0–29.9, 30.0–34.9, or ≥35.0. Patients were designated as having comorbid conditions if they reported having ever been diagnosed with diabetes (no distinction was made for gestational diabetes), heart disease, or stroke.
Frequency of vigorous physical activity in the past 12 months was categorized as never/rarely, 1–3 times/month, 1–2 times/week, 3–4 times/week, or ≥5 times/week. To reflect exposure history and dosage, current smoking status was categorized as never smoker, former smoker with no more than 20 cigarettes/day, former smoker with more than 20 cigarettes/day, current smoker with no more than 20 cigarettes/day, or current smoker with more than 20 cigarettes/day. Current alcohol consumption, which included beer, wine, and liquor, was computed and grouped into 2 categories: ≥15 g/day versus <15 g/day. Adherence to a Mediterranean dietary pattern during the past 12 months was measured using a composite score (ranging from 0 to 9), which was divided into 3 categories: 0–3, 4–5, or 6–9. Computation of the Mediterranean dietary score has been described in detail elsewhere (14). Higher Mediterranean diet scores indicate healthier dietary behavior. Self-rated health was categorized as excellent or very good, good, or fair or poor.
We applied the Kaplan-Meier product-limit estimator to calculate all-cause and CRC-specific survival probabilities. The log-rank test was used to determine whether survival differed statistically across census-tract socioeconomic deprivation quartiles. We applied a multilevel Weibull survival model with a census-tract-level random intercept to estimate the geographic variation in, and the fixed effects of census-tract socioeconomic deprivation on, overall and CRC-specific survival (15). Time-to-event was defined as the elapsed time from the date of diagnosis to the date of death or the end of study follow-up (December 31, 2005, for CRC-specific death and December 31, 2006, for all-cause death), whichever came first. Six blocks of individual characteristics were used to adjust the multilevel survival models.
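For readers unfamiliar with the product-limit estimator, a minimal sketch is given here. The follow-up times and event indicators are invented for illustration, and the function name is hypothetical; it is not the code used in the study.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  : follow-up time for each patient
    events : 1 if the patient died at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)  # deaths plus censorings leaving the risk set at t
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the conditional survival at t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical follow-up in years (illustration only).
toy_times = [2, 3, 3, 5, 8]
toy_events = [1, 1, 0, 1, 0]
for t, s in kaplan_meier(toy_times, toy_events):
    print(f"t = {t}: S(t) = {s:.2f}")
```

Patients censored at a given time are counted in the risk set at that time and removed afterward, which is the usual convention for tied death and censoring times.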
Although geographic variation in CRC survival can be quantified directly through census-tract-level variance from the multilevel survival model, this variance is difficult to interpret since it is based on the residual variance for the log hazard ratio and has no meaningful unit. Therefore, we applied 2 measures of heterogeneity, the median hazard ratio (MHR) and the interquartile hazard ratio (IqHR), to describe the magnitude of geographic variation (16, 17). The MHR is a median value which reflects the central tendency of the hazard ratios for CRC survival by comparing 2 study subjects randomly chosen from 2 different census tracts (6). The MHR is always greater than or equal to 1, and a larger value indicates more geographic variation in CRC survival (6). The IqHR is a measure of dispersion of the hazard ratios, which reflects the difference in all-cause or CRC-specific survival between the 25% of all CRC patients who lived in census tracts with the highest risk of death and the 25% of all CRC patients who lived in census tracts with the lowest risk (18). The MHR (6, 19) and IqHR (18) are calculated as

$$\mathrm{MHR} = \exp\left(\sqrt{2\sigma^{2}} \times Z_{0.75}\right), \qquad \mathrm{IqHR} = \exp\left(2 \times Z_{0.875} \times \sqrt{\sigma^{2}}\right),$$

where Z is the z value of the Gaussian distribution at a specified percentage and σ² is the census-tract-level variance from the multilevel model.
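As a rough illustration of how both heterogeneity measures follow from a single census-tract-level variance, the sketch below assumes the commonly used definitions MHR = exp(√(2σ²) × Z at 75%) and IqHR = exp(2 × Z at 87.5% × √σ²). The function names are hypothetical. The inputs are the rounded variances from model 1 (0.2 and 0.3), so the output will not exactly reproduce the reported MHR and IqHR values, which were presumably computed from unrounded estimates.

```python
from math import exp, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard Gaussian, mean 0 and SD 1


def median_hazard_ratio(variance):
    """MHR from the tract-level variance of the random intercept."""
    z75 = STD_NORMAL.inv_cdf(0.75)
    return exp(sqrt(2.0 * variance) * z75)


def interquartile_hazard_ratio(variance):
    """IqHR from the same variance (assumed 12.5th vs 87.5th percentile tracts)."""
    z875 = STD_NORMAL.inv_cdf(0.875)
    return exp(2.0 * z875 * sqrt(variance))


# Rounded variances from model 1 (overall and CRC-specific risk of death).
for label, var in (("overall", 0.2), ("CRC-specific", 0.3)):
    mhr = median_hazard_ratio(var)
    iqhr = interquartile_hazard_ratio(var)
    print(f"{label}: MHR = {mhr:.2f}, IqHR = {iqhr:.2f}")
```

A variance of 0 gives MHR = IqHR = 1, that is, no geographic variation; both measures increase monotonically with the variance.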
First, we added the quartiles of census-tract deprivation to the multilevel model to calculate their fixed effects on overall and CRC-specific survival. We also used this model to calculate the 80% interval hazard ratio (IHR) (6):

$$\mathrm{IHR}_{\text{lower}} = \exp\left(\beta (X_1 - X_2) + Z_{0.10} \times \sqrt{2\sigma^{2}}\right), \qquad \mathrm{IHR}_{\text{upper}} = \exp\left(\beta (X_1 - X_2) + Z_{0.90} \times \sqrt{2\sigma^{2}}\right),$$

where β is the parameter estimate of the census-tract-level deprivation, with X1 and X2 denoting a higher quartile and the reference quartile of the deprivation, respectively; Z is the z value of the Gaussian distribution at the specified percentage; and σ² is the census-tract-level variance. The IHR denotes the importance of census-tract deprivation relative to the geographic variation remaining among census tracts. If the IHR interval does not include 1.0, the effect of census-tract-level deprivation is large relative to the remaining geographic variation, and deprivation substantially accounts for that variation; if the interval includes 1.0, it does not. A wider IHR range indicates a larger amount of geographic variation (6). Next, we adjusted multilevel survival models for more blocks of individual-level variables to examine the potential alterations of random and fixed effects.
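The 80% IHR can be sketched numerically as follows, assuming it takes the usual interval odds ratio form carried over to hazard ratios, exp(β(X1 − X2) ± Z × √(2σ²)) with Z taken at the 10th and 90th percentiles. The function name is hypothetical, and the inputs are the rounded model 1 estimates for overall risk of death (hazard ratio 1.2 for the most versus least deprived quartile, variance 0.2), so the printed interval is illustrative rather than a reported result.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def interval_hazard_ratio(beta, variance, x1=1.0, x2=0.0, coverage=0.80):
    """Return (lower, upper) bounds of the interval hazard ratio.

    beta     : log hazard ratio for the tract-level covariate
    variance : tract-level variance of the random intercept
    x1, x2   : covariate values being compared (indicator coding by default)
    """
    tail = (1.0 - coverage) / 2.0
    z_low = NormalDist().inv_cdf(tail)         # negative, about -1.28 for 80%
    z_high = NormalDist().inv_cdf(1.0 - tail)  # positive, about +1.28 for 80%
    centre = beta * (x1 - x2)
    spread = sqrt(2.0 * variance)
    return exp(centre + z_low * spread), exp(centre + z_high * spread)


# Rounded model 1 estimates: HR = 1.2, tract-level variance = 0.2.
lower, upper = interval_hazard_ratio(beta=log(1.2), variance=0.2)
print(f"80% IHR: {lower:.2f} to {upper:.2f}")
print("interval contains 1:", lower < 1.0 < upper)
```

With these inputs the interval spans 1, which is in line with the interpretation that a modest fixed effect is small relative to the remaining between-tract variation.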
All data were managed in SAS (version 9.1; SAS Institute Inc., Cary, North Carolina). The multilevel survival models were constructed using a Bayesian approach with Markov chain Monte Carlo simulations in WinBUGS (version 1.4.3; Medical Research Council, London, United Kingdom). After the model convergence (20,000 burn-in iterations), parameter estimates were obtained from 20,000 additional iterations. Model-fitting was evaluated using the deviance information criterion, with a smaller value indicating better model fit (20).
Finally, we performed a sensitivity analysis to examine the geographic variation in and effect of neighborhood deprivation through adjustment for type of treatment in 5,683 CRC cases that had treatment information available. Receipt of chemotherapy, radiation therapy, and surgery were dichotomized as yes or no.
Principal-components common factor analysis indicated that the first common factor explained 43.5% of the total variance (Table 1). Eight census variables, including percentage of unemployed males aged 20 years or more, percentage of unemployed females aged 20 years or more, percentage of female-headed households with dependent children, percentage of households on public assistance, percentage of households without a car, percentage of households with low income, percentage of the population below the federal poverty line, and percentage of the population non-Hispanic black, had substantially higher factor loadings on this first factor (Table 1). These 8 census variables showed high internal consistency (Cronbach's α = 0.92). They were standardized and combined, weighted by their factor scoring coefficients, into the census-tract-level socioeconomic deprivation score.
Table 2 shows that higher percentages of females; nonwhites; less educated persons; persons who were unmarried, widowed, or living alone; persons with distant CRC; obese persons; persons with diabetes, heart disease, or stroke; persons with lower physical activity levels; current smokers; persons with lower Mediterranean dietary scores; and persons with poorer self-rated health resided in more socioeconomically deprived census tracts.
Among 7,024 primary CRC cases, 2,468 deaths (1,440 from CRC and 1,028 from other causes) were observed during the study period. Kaplan-Meier analysis (Figure 1) indicated that 11-year overall survival rates for the deprivation quartiles were statistically different (log-rank test, P = 0.001), with survival for the least deprived quartile (60.9%) being higher than that for the more deprived quartiles (53.4%, 54.0%, and 52.4%). The 10-year CRC-specific survival rates for the deprivation quartiles also were statistically different, although differences were relatively small (76.1% for the least deprived quartile and 73.5%, 73.2%, and 74.1%, respectively, for the more deprived quartiles; log-rank test, P = 0.008).
Model 1 in Table 3 shows that CRC patients who lived in census tracts characterized by deprivation quartiles 2–4 were more likely to die from any cause than patients in the least deprived census tracts after adjustment for age and sex. For example, persons who lived in census tracts with the most deprivation were 1.2 times (95% confidence interval (CI): 1.1, 1.4) more likely to die from any cause than persons in the least deprived census tracts. A similar association was found for CRC-specific risk of death. Model 1 also shows that overall risk of death (variance = 0.2) and CRC-specific risk of death (variance = 0.3) varied geographically. The MHR indicated that the overall risk of death was 1.5 times (95% CI: 1.3, 1.6) higher and the risk of CRC-specific death was 1.6 times (95% CI: 1.4, 1.9) higher, in the median case, when comparing a CRC patient who lived in a higher-risk census tract with another CRC patient with the same individual characteristics (age and sex, in this model) who lived in a lower-risk census tract. The IqHR in model 1 indicated that the overall risk of death was 2.5 times (95% CI: 1.9, 3.1) higher and the risk of CRC-specific death was 3.3 times (95% CI: 2.2, 4.4) higher when comparing the 25% of all CRC patients who lived in census tracts with the highest mortality risk to the 25% of CRC patients who lived in census tracts with the lowest mortality risk.
Next, we added various blocks of individual-level characteristics to model 1 (Table 3). Results showed that the association between census-tract deprivation and the risk of death was attenuated when models were adjusted for the individual-level characteristics (models 2–7), suggesting that the 6 groups of individual-level factors partially explained the effect of census-tract deprivation on overall risk of death. Similar results were found for CRC-specific risk of death. For both overall and CRC-specific risks of death, the significance of geographic variations was not altered, as evidenced by the stability of the MHR and the IqHR across the 7 models.
All of the 80% IHR ranges contained the value of 1 (data not shown). This indicates that census-tract-level socioeconomic deprivation did not account for a significant amount of census-tract heterogeneity in all-cause and CRC-specific survival.
Sensitivity analysis indicated that adding type of CRC treatment to the models did not alter the findings regarding the geographic variation in and the effect of census tract deprivation on all-cause or CRC-specific survival (data not shown). Participants who were aged ≥70 years or had poorer self-rated health were more likely to be without treatment data (χ² test, P < 0.001). Additionally, missing treatment data partly resulted from unavailability of information from Michigan.
Our main purpose in this study was to 1) examine the effect of small-area socioeconomic deprivation, 2) quantify the geographic variation in CRC survival, and 3) determine whether small-area deprivation accounts for geographic variation in CRC survival. Our results showed that census-tract-level socioeconomic deprivation increased the mortality risk of CRC patients regardless of random effects. Previous studies have consistently indicated that survival rates were lower among CRC patients living under more deprived socioeconomic conditions (21–25). These studies suggest that area-level socioeconomic effects may result from the geographic distribution of individual prognostic factors, including cancer stage, comorbidity, and type of treatment received (5). In another study, Singh et al. (26) suggested that CRC stage at diagnosis could partially explain the disparity in CRC survival. Other studies showed that comorbid illnesses and lifestyle characteristics contributed to most of the excess 30-day mortality risk following CRC surgery among patients with low socioeconomic status (27, 28). Meanwhile, some investigators also reported significant associations between low socioeconomic status and worse CRC survival even after controlling for cancer stage and selected comorbid illnesses and treatments (29–31). In a study using data on the NIH-AARP cohort, Mitrou et al. (14) reported that the Mediterranean dietary pattern was strongly associated with all-cause mortality. Since the association of census-tract socioeconomic deprivation with CRC survival was attenuated after adjustment for individual characteristics, the census-tract socioeconomic deprivation effect could partially result from individual characteristics, including CRC site, stage, grade, comorbid illnesses, physical activity, smoking, Mediterranean dietary pattern, and self-rated health status. Our study also indicated that all-cause and CRC-specific survival varied geographically. 
Based on the computed 80% IHRs, small-area socioeconomic deprivation did not substantially contribute to the geographic variation in risks of all-cause and CRC-specific death. This implies that other, unmeasured area-level characteristics may play a role, including spatial accessibility to medical services, neighborhood disorganization (32), perceived neighborhood safety, and neighborhood social capital (33).
There are at least 2 major strengths of this study. First, most previous studies applied a single neighborhood-level deprivation indicator, such as poverty rate and education. Selecting different socioeconomic indicators may produce different findings in the relation between neighborhood socioeconomic conditions and CRC survival (34). We constructed a more comprehensive composite index of neighborhood socioeconomic deprivation. This index captures a broader concept of deprivation than an existing deprivation indicator alone, such as poverty rate (35). Since this deprivation index is based on the specific study region, it is typically applied to assess the neighborhoods in which study participants are residing. For a study with different neighborhoods, the deprivation index should be rebuilt, because different areas may have different characteristics in deprivation structure. Second, in most previous studies, investigators applied a single-level Cox proportional hazards model which cannot model the random effect (e.g., geographic variation) and ignored the intercorrelation of CRC patients within neighborhoods. We overcame these limitations by using a multilevel survival model. A multilevel design can help investigators quantify the geographic variation, examine relevant neighborhood-level characteristics that may be associated with CRC-related mortality, and obtain evidence for targeting disadvantaged neighborhoods to improve the survival of CRC patients. Additionally, we applied 2 measures of heterogeneity (MHR and IqHR) to facilitate quantifying geographic variation and comparing the neighborhood deprivation effect with the magnitude of geographic variation in a meaningful way. 
In previous multilevel studies, fixed effects of neighborhood-level factors were frequently used to assess their effects on the outcomes; however, it is unsatisfactory to use a usual hazard ratio interpretation, which is typically applied to assess individual factors that vary within neighborhood (with a constant random effect), since neighborhood-level effect includes the difference in random effects between neighborhoods, which cannot be taken into account by the usual hazard ratio interpretations. The complementary application of the 80% IHR overcomes this difficulty through integration of fixed and random effects (6). It makes it more reasonable to directly compare 2 persons from any 2 different neighborhoods.
Our study had some limitations. The data on treatment did not include specific details about type of surgery, chemotherapy, and radiation, which may account for differences in survival. Misclassification may have resulted from the lack of centralized pathology review. This limitation may be outweighed by the benefit of using a population-based study. Furthermore, CRC patients may have moved after diagnosis. This is unlikely to have affected the findings to any large extent because the NIH-AARP Diet and Health Study only enrolled persons aged 50–71 years at baseline, and older persons are less likely to move than younger persons. The extent to which geographic variation in misclassification of the underlying cause of death may have affected our findings cannot be determined. However, coding of CRC as an underlying cause of death on death certificates is over 90% accurate (36). Additionally, there was a higher proportion of non-Hispanic whites among study participants compared with the general population in the study areas, because this racial group was overrepresented in the AARP population. Our findings can be generalized to the AARP population. Our study area included 8 sites, and an analysis stratified by site could provide insights into site-specific scenarios. Unfortunately, we were unable to perform such a sensitivity analysis because of small sample sizes in some of the study regions.
In conclusion, important small-area geographic variations in all-cause and CRC-specific risks of death were present among CRC patients in the NIH-AARP Diet and Health Study cohort. Census-tract-level deprivation was associated with higher mortality risk, but it did not account for the geographic variation in CRC survival. This suggests that, assuming a causal association, public health interventions targeting patients who live in socioeconomically deprived neighborhoods might reduce the risk of death among CRC patients when the influence of random effects is ignored, but such interventions would not substantially reduce the geographic disparity in mortality risk that appears to exist. Future studies should investigate other area-level characteristics that might explain the geographic disparity in CRC survival.
Author affiliations: Department of Medicine, School of Medicine, Washington University, St. Louis, Missouri (Min Lian, Mario Schootman); Department of Family Medicine and Community Health, Medical School, University of Massachusetts, Worcester, Massachusetts (Chyke A. Doubeni); Division of Cancer Epidemiology and Genetics, National Cancer Institute, Bethesda, Maryland (Yikyung Park, Jacqueline M. Major, Barry I. Graubard, Arthur Schatzkin); Department of Psychiatry, Medical School, University of Massachusetts, Worcester, Massachusetts (Rosalie A. Torres Stone); Department of Internal Medicine, College of Medicine, Howard University, Washington, DC (Adeyinka O. Laiyemo); and AARP, Washington, DC (Albert R. Hollenbeck).
This research was supported in part by the Intramural Research Program of the National Cancer Institute, National Institutes of Health. The work of Drs. Min Lian and Mario Schootman was supported in part by a research award (R01CA137750) and a cancer center support award (P30CA91842) from the National Cancer Institute. Dr. Chyke A. Doubeni's work was supported by career development awards (K01CA127118 and R01CA151736) from the National Cancer Institute.
Cancer incidence data from the Atlanta, Georgia, metropolitan area were collected by the Georgia Center for Cancer Statistics, Department of Epidemiology, Rollins School of Public Health, Emory University. Cancer incidence data from California were collected by the California Department of Health Services, Cancer Surveillance Section. Cancer incidence data from the Detroit, Michigan, metropolitan area were collected by the Michigan Cancer Surveillance Program, Community Health Administration, State of Michigan. The Florida cancer incidence data used in this report were collected by the Florida Cancer Data System under contract with the Florida Department of Health. (The views expressed herein are solely those of the authors and do not necessarily reflect those of the Florida Cancer Data System or the Florida Department of Health.) Cancer incidence data from Louisiana were collected by the Louisiana Tumor Registry, Louisiana State University Medical Center in New Orleans. Cancer incidence data from New Jersey were collected by the New Jersey State Cancer Registry, Cancer Epidemiology Services, New Jersey State Department of Health and Senior Services. Cancer incidence data from North Carolina were collected by the North Carolina Central Cancer Registry. Cancer incidence data from Pennsylvania were supplied by the Division of Health Statistics and Research, Pennsylvania Department of Health, Harrisburg, Pennsylvania. (The Pennsylvania Department of Health specifically disclaims responsibility for any analyses, interpretations, or conclusions.) Cancer incidence data from Arizona were collected by the Arizona Cancer Registry, Division of Public Health Services, Arizona Department of Health Services. Cancer incidence data from Texas were collected by the Texas Cancer Registry, Cancer Epidemiology and Surveillance Branch, Texas Department of State Health Services. 
Cancer incidence data from Nevada were collected by the Nevada Central Cancer Registry, Center for Health Data and Research, Bureau of Health Planning and Statistics, State Health Division, State of Nevada Department of Health and Human Services.
The authors thank Sigurd Hermansen and Kerry Grace Morrissey of Westat, Inc. (Rockville, Maryland) for study outcomes ascertainment and management and Leslie Carroll of Information Management Services (Silver Spring, Maryland) for data support and analysis. They also thank the Alvin J. Siteman Cancer Center at Barnes-Jewish Hospital and Washington University School of Medicine in St. Louis, Missouri, for use of the Health Behavior, Communication and Outreach Core shared resource.
Conflict of interest: none declared.