Med Care. Author manuscript; available in PMC 2010 July 1.
Published in final edited form as:
PMCID: PMC2701975

The Contribution of Longitudinal Comorbidity Measurements to Survival Analysis

C. Y. Wang, Ph.D., Laura-Mae Baldwin, M.D., M.P.H., Barry G. Saver, M.D., M.P.H., Sharon A. Dobie, M.C.P., M.D., Pamela K. Green, Ph.D., Yong Cai, Ph.D., and Carrie N. Klabunde, Ph.D.



Background

Many clinical and health services research studies are longitudinal, raising questions about how best to use an individual’s comorbidity measurements over time to predict survival.


Objective

To evaluate the performance of different approaches to longitudinal comorbidity measurement in predicting survival, and to examine strategies for addressing the inevitable issue of missing data.

Research Design

Retrospective cohort study using Cox regression analysis to examine the association between various Romano-Charlson comorbidity measures and survival.


Subjects

50,000 cancer-free individuals aged 66 or older enrolled in Medicare between 1991 and 1999 for at least 1 year.


Results

The best fitting model combined both time independent baseline comorbidity and the time dependent prior year comorbidity measure. The worst fitting model included baseline comorbidity only. Overall, the models fit best when using the “rolling” comorbidity measures that assumed chronic conditions persisted rather than measures using only prior year’s recorded diagnoses.


Conclusions

Longitudinal comorbidity is an important predictor of survival, and investigators should make use of individuals’ longitudinal comorbidity data in their regression modeling.

Keywords: Claims data, comorbidity, missing data, Medicare


Comorbidity, a measure of an individual’s underlying health status, is a key variable in health services research because of its important role in health care utilization, prognosis, and outcomes. Numerous comorbidity measures have been developed for research with administrative claims data. The Charlson comorbidity index and its derivatives consolidate individual comorbid conditions into a single, predictive variable.1–4 Other indices consist of individual condition indicators predictive of outcomes such as cost or mortality.5–7 If a research study’s timeframe is short, a point in time (time independent) comorbidity measure is sufficient. Many studies are longitudinal, raising questions about how best to measure comorbidity over time.

Simple strategies for using point in time comorbidity measures in longitudinal analyses include time independent baseline comorbidity measured prior to the first observation period, and time dependent comorbidity measured just prior to each observation period.8 These comorbidity measures do not reflect an individual’s comorbid disease history, however. The influence of comorbidity on outcome may differ for an individual who enters a study without comorbidity, then has a catastrophic event resulting in multiple comorbid conditions, than for someone with initial substantial comorbidity compounded with subsequent disease over the study period.

To date, only a few studies have included longitudinal comorbidity measures in their data analyses.9–15 In survival analysis, though investigators routinely use time dependent covariates, comorbidity is most commonly used as a time independent baseline measure.11–15 This study supplements the toolbox of researchers who use secondary databases by 1) proposing several approaches to longitudinal comorbidity measurement, such as average comorbidity scores, “rolling” comorbidity scores in which chronic conditions are carried forward from the initial qualifying year, even if a subsequent year’s claims do not include the diagnosis, and inclusion of baseline and subsequent comorbidity variables, and 2) examining the performance of these measures in predicting survival, a commonly used outcome measure in longitudinal studies. It also examines strategies for addressing the inevitable issue of missing data in a longitudinal data set. We hypothesized that longitudinal comorbidity measures that account for an individual’s progression of disease would be more effective predictors of survival than point in time measures.



We used 1991–1999 Medicare claims data for control subjects from the Surveillance, Epidemiology, and End Results (SEER)-Medicare database.16 Control subjects were individuals in the annual 5% random sample of Medicare beneficiaries who lived in SEER program areas and had no cancers reported to the SEER program during the study years. Individuals in SEER areas are generally comparable to the U.S. population, though somewhat more urban and more likely to be foreign-born.17 Medicare claims were obtained from Medicare Provider Analysis and Review (MedPAR, Part A claims), Carrier Claims (physician/supplier Part B bills) and Outpatient Claims (institutional outpatient Part B bills). When this study began, the SEER program included 5 state registries (Connecticut, Hawaii, Iowa, New Mexico, and Utah) and 7 county-based registries (Atlanta, Detroit, rural Georgia, Los Angeles, San Francisco, San Jose, and Seattle/Puget Sound) in 4 other states. Medicare data include patient sociodemographic characteristics, enrollment dates, health maintenance organization (HMO) membership, date of death, if applicable, and, for fee-for-service beneficiaries, billed claims that include the diagnoses for care provided in hospitals, physician offices, and clinics.

Study Population

We randomly selected 50,000 individuals ages 66 or older and enrolled in Parts A and B fee-for-service Medicare for a full calendar year at any time between 1991 and 1999. All 50,000 individuals had at least 1 year of comorbidity data and were included in the study. Comorbidity was measured in subsequent calendar years if an individual had a full 12 months of enrollment in Parts A and B fee-for-service Medicare in that year. The sample sizes for each year varied depending on the availability of comorbidity data—from 31,058 to 37,227. Nearly 10% of the original sample of 50,000 contributed only 1 year of comorbidity data or had at least 1 year of missing data prior to death or the end of the study period. To ensure that our results represented a population with longitudinal data, we created a subsample in which each individual had at least 2 consecutive years of comorbidity data and no subsequent missing years of comorbidity data. This subsample contained 44,016 individuals.

Measurement of Comorbidity

The primary independent variable of interest in this study is comorbidity. The Romano adaptation of the Charlson weighted comorbidity index was used to calculate annual comorbidity scores for each year between 1991 and 1999 in which a study subject was enrolled for all 12 months in fee for service Medicare parts A and B.4 We chose the Charlson index because it is widely used by health services researchers, is available in the public domain, and permits calculation of rolling comorbidity scores.18 Baldwin et al. demonstrated that the Charlson index, Adjusted Clinical Groups (ACGs), and Diagnostic Cost Groups (DxCGs) were essentially equivalent in their ability to predict mortality and treatment type in colon cancer patients.19 The Charlson index consists of 19 comorbid conditions weighted according to the degree to which they predicted mortality among an inpatient cohort, then summed to produce an index score. These conditions included, in order of frequency during the middle year of the study (1995), diabetes (8.2%), chronic pulmonary disease (6.7%), congestive heart failure (6.0%), cerebrovascular disease (3.9%), peripheral vascular disease (3.7%), dementia (2.5%), diabetes with chronic complications (1.7%), prior myocardial infarction (1.5%), peptic ulcer (1.1%), connective tissue disease (1.1%), renal disease (0.9%), acute myocardial infarction (0.8%), hemiplegia or paraplegia (0.3%), mild liver disease (0.2%), moderate or severe liver disease (0.1%), and AIDS (0.03%). Romano adapted this index for use with administrative claims by identifying the ICD-9-CM codes corresponding to the measure’s comorbid conditions.
The index has been further adapted to incorporate outpatient as well as inpatient-only diagnoses.3 Because we were using a non-cancer control population from the SEER-Medicare claims database, we excluded “any tumor,” “metastatic solid tumor,” “lymphoma,” and “leukemia” from the calculation of the Romano-Charlson index score as these likely represented rule out diagnoses or cancers diagnosed prior to living in a SEER registry area. A comorbid condition was identified if its corresponding ICD-9-CM codes appeared more than once in the Carrier (physician/supplier) and Outpatient claims at least 31 days apart, or appeared at least once in the MedPAR (inpatient) claims. Comorbidity rates calculated using this “rule-out” algorithm were more comparable to national estimates for selected conditions, and corresponded more closely with hospital record review than rates calculated without claims restrictions.20
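The claims rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `has_condition`, its parameters, and the idea of passing in one year's claim dates for a single condition are hypothetical and are not taken from the study's actual programs.

```python
from datetime import date

def has_condition(inpatient_dates, outpatient_dates, min_gap_days=31):
    """Return True if a condition qualifies under the claims rule sketched here.

    inpatient_dates: dates of MedPAR (inpatient) claims carrying the condition's codes.
    outpatient_dates: dates of Carrier or Outpatient claims carrying the codes.
    Both lists are assumed to cover a single calendar year for one person.
    """
    # One inpatient claim is sufficient on its own.
    if inpatient_dates:
        return True
    # Otherwise require at least two physician/outpatient claims.
    if len(outpatient_dates) < 2:
        return False
    # If any pair of claims is min_gap_days apart, the earliest and latest are too,
    # so comparing the extremes is enough.
    return (max(outpatient_dates) - min(outpatient_dates)).days >= min_gap_days
```

For example, two outpatient claims on January 1 and February 1 of the same year are 31 days apart and would qualify, whereas claims on January 1 and January 20 would not.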

We calculated 2 Romano-Charlson annual comorbidity scores—one using only the claims data from the prior year, the other as a “rolling” comorbidity score in which chronic medical conditions (except peptic ulcer disease, counted as a comorbidity only in the years it was diagnosed) were considered ongoing once identified. In the rolling comorbidity definition, a patient with a diagnosis of diabetes in 1992 was considered to have this diabetes diagnosis in each ensuing year regardless of whether the diagnosis reappeared in the claims data. In this study, 33.7% of subjects had at least one rolling comorbidity score that differed from the standard prior year comorbidity scores.
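A minimal sketch of the two annual scores follows. It assumes each person's data are already reduced to a set of identified conditions per year. The weights shown are the classic Charlson weights for a small illustrative subset of conditions, and the condition names, `WEIGHTS`, `NOT_CARRIED_FORWARD`, and `annual_scores` are hypothetical names chosen for this example.

```python
# Illustrative subset of conditions with classic Charlson weights (assumption:
# the adapted index used in the study may define conditions differently).
WEIGHTS = {"diabetes": 1, "chf": 1, "peptic_ulcer": 1, "renal_disease": 2}

# Conditions counted only in the years they are recorded.
NOT_CARRIED_FORWARD = {"peptic_ulcer"}

def annual_scores(conditions_by_year):
    """Map year -> (score from that year's claims only, rolling score).

    conditions_by_year: dict of year -> set of condition names recorded that year.
    """
    carried = set()  # chronic conditions identified in any earlier year
    scores = {}
    for year in sorted(conditions_by_year):
        recorded = conditions_by_year[year]
        active = recorded | carried  # rolling definition: earlier diagnoses persist
        scores[year] = (
            sum(WEIGHTS[c] for c in recorded),
            sum(WEIGHTS[c] for c in active),
        )
        carried |= recorded - NOT_CARRIED_FORWARD
    return scores

example = annual_scores({
    1992: {"diabetes", "peptic_ulcer"},
    1993: set(),
    1994: {"chf"},
})
# example == {1992: (2, 2), 1993: (0, 1), 1994: (1, 2)}
```

In the example, diabetes recorded in 1992 keeps contributing to the rolling score in 1993 and 1994 although it does not reappear, while peptic ulcer counts only in 1992.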

General Modeling Approach

The study’s outcome of interest is all-cause death. The day, month, and year of each individual’s death were available between January 1, 1992, and December 31, 2000. We used Cox regression models with time to death through December 31, 2000 as the dependent variable to examine the influence of various comorbidity measures. Age, gender, and race/ethnicity were included as potential confounding variables. Age was treated as a time-dependent continuous variable. Race/ethnicity, available from Medicare files, was categorized as African American, Asian, Caucasian, Hispanic, or Other.

Modeling Longitudinal Comorbidity Measures

We developed 11 approaches to incorporating comorbidity as a covariate in our survival analysis. They include the following measures for a subject in year t:

  (A) Baseline comorbidity: the time independent comorbidity score in the year prior to the first observation year.
  (B) Prior year’s comorbidity: the time dependent comorbidity score in year (t−1).
  (C) Prior year’s rolling comorbidity: the time dependent rolling comorbidity score in year (t−1).
  (D) Baseline and prior year’s comorbidity: the time independent baseline comorbidity score and the time dependent comorbidity score in year (t−1).
  (E) Baseline and prior year’s rolling comorbidity: the time independent baseline comorbidity score and the time dependent rolling comorbidity score in year (t−1).

In (A), all observation years for each subject can be included in the models because each subject had an initial year of comorbidity data. In (B) – (E), subjects’ observation years were dropped if their comorbidity measures were missing in the prior year due to lack of eligibility in fee for service parts A and B Medicare. In models (F) – (G), described below, missing data were imputed with the most recent available mean comorbidity score. In models (H) – (I), missing data were imputed using the most recent available comorbidity or rolling comorbidity scores using the last observation carry forward (LOCF) method. In models (J) – (K), missing data were imputed using the regression calibration (RC) imputation method.21 The RC method has been shown to perform well with missing data and measurement error if the relative hazard parameter for the missing covariate is not large22,23, which is the case in our application. The RC estimator assumes that data are missing at random, namely that the missing data probability is a function of the observed data.

  (F) Mean comorbidity: a time dependent function of the average of the available comorbidity scores before year t.
  (G) Baseline and mean comorbidity: the baseline comorbidity score and the average of the available comorbidity scores before year t.
  (H) Baseline and the most recent comorbidity: the baseline comorbidity score and the comorbidity score in most recent year for which a comorbidity score was available.
  (I) Baseline and the most recent rolling comorbidity: the baseline comorbidity score and the rolling comorbidity score in the most recent year a comorbidity score was available.
  (J) Baseline and prior year comorbidity with RC imputation: the baseline comorbidity score and the comorbidity score from the prior year, or, if prior year comorbidity score is missing, imputation of the missing prior year comorbidity score using its conditional expectation given observed covariates and the last observed comorbidity.
  (K) Baseline and prior year rolling comorbidity with RC imputation: the baseline comorbidity score and the rolling comorbidity score from the prior year, or, if prior year rolling comorbidity score is missing, imputation of the missing prior year rolling comorbidity score using its conditional expectation given observed covariates and the last observed rolling comorbidity.
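To make the difference between the prior year, mean, and most recent (LOCF) measures concrete, the following sketch derives all three for a given observation year t. It is an illustration only: the function `covariates_for_year` and the convention of marking a missing year with `None` are assumptions made for this example, and the RC imputation of models (J) and (K) is not shown.

```python
def covariates_for_year(scores, t):
    """Derive comorbidity covariates for observation year t.

    scores: dict of year -> comorbidity score, with None for years without data.
    """
    # Prior year's score, as in model (B); None means year t would be dropped.
    prior_year = scores.get(t - 1)

    # All scores actually observed before year t, in chronological order.
    available = [scores[y] for y in sorted(scores) if y < t and scores[y] is not None]

    # Mean of the available scores, as in model (F).
    mean = sum(available) / len(available) if available else None

    # Most recent available score, i.e. last observation carried forward, as in (H).
    locf = available[-1] if available else None

    return {"prior_year": prior_year, "mean": mean, "locf": locf}

# A subject with scores in 1991 and 1992 but no data for 1993:
result = covariates_for_year({1991: 1, 1992: 2, 1993: None}, 1994)
# result == {"prior_year": None, "mean": 1.5, "locf": 2}
```

In this example the 1994 observation year would be excluded under the prior year approach, but retained under the mean and LOCF approaches.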

Relative Hazard Function

We developed the following relative hazard functions for modeling and comparing our longitudinal comorbidity measures. For subject i = 1, …, 50,000, let the longitudinal comorbidity be denoted by Si(t). Let the first year of a subject i be denoted by ti0, at which a subject is selected into the study cohort. More than 50% of the study subjects had 1991 as their first year. The baseline comorbidity for subject i is Si(ti0), which is available for the entire cohort of 50,000 subjects. Let Xi be the vector of age, gender and race/ethnicity for subject i. The Cox proportional hazards function consists of a baseline hazard function and a relative hazard function. When we visually examined the Kaplan-Meier survival curves for different subgroups, there was no clear violation of the proportional hazards model assumption. However, given the very large study population, the assumption violation was statistically significant. To ensure that the proportional hazards model is used appropriately, we included interactions of all the covariates with time in our models. For the 11 comorbidity modeling approaches above, their main difference is the relative hazard function. Model (A)’s simple model for the relative hazard function is


The second comorbidity model, (B), is based on the following relative hazard function:


The relative hazard function for model (C) is similar to (B) above, but with the comorbidity score at year (t−1) being replaced by the rolling comorbidity score at year (t−1). The relative hazard function for models (D) and (E) can be expressed similarly to (B) and (C) with the addition of baseline comorbidity. Some subjects may have missing comorbidity scores in certain years, and under (B) – (E), the relative hazard parameters are estimated based on available comorbidity scores only, with no imputation of data. The relative hazard functions for models (F) to (K) are variations on the above noted functions, though models (F) – (G) use mean rather than prior year comorbidity scores. For instance, the relative hazard function for model (G) can be written as


where S̄i(t−1) is the average of available comorbidity scores between years ti0 and year (t−1). An important difference between (B) to (E) versus (F) to (G) is that the former do not include year t in the survival analysis if the comorbidity score at year (t−1) is not available, while the latter include year t in the survival analysis even if some prior comorbidity scores in year (t−1) are missing, by using the average of the available comorbidity scores before time t.
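As a sketch consistent with the definitions above, the relative hazard functions for models (A), (B), and (G) might be written as follows. The coefficient symbols \beta_1, \beta_2, and \gamma are notation assumed here for illustration, the bar denotes the average of the available scores from year t_{i0} through year (t−1), and the covariate-by-time interaction terms are omitted for brevity.

```latex
\begin{align*}
\text{(A)}\quad & \exp\{\beta_1 S_i(t_{i0}) + \gamma^{\top} X_i\} \\
\text{(B)}\quad & \exp\{\beta_2 S_i(t-1) + \gamma^{\top} X_i\} \\
\text{(G)}\quad & \exp\{\beta_1 S_i(t_{i0}) + \beta_2 \bar{S}_i(t-1) + \gamma^{\top} X_i\}
\end{align*}
```

Under this notation, models (D) and (E) would have the same form as (G), with the prior year score or prior year rolling score in place of the average.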

We consider models (H) and (I) as modified approaches to models (D) and (E) by imputing missing data using the LOCF method. Similarly, models (J) and (K) are modified approaches to models (D) and (E) by imputing missing data using the RC method. They are important and practical models in dealing with incomplete longitudinal data.


We first calculated summary statistics to describe the demographic characteristics, mean comorbidity scores, and mean number of observation years for our study population. We computed annual death rates for our annual cohorts, stratified by presence or absence of baseline comorbidity. We then used Cox regression analysis to assess the contribution of each comorbidity measure to predicting survival after controlling for age, gender, and race/ethnicity. The likelihood ratio test and the Akaike Information Criterion (AIC) were used to evaluate the overall goodness of fit of the Cox regression models with different comorbidity measures. The AIC allows descriptive comparison of non-nested models, and accounts for the number of covariates in the model. Models with larger likelihood ratio scores and smaller AIC scores were considered to have better fit. As the p-values for all the models were ≤ 0.0001, they did not provide useful information for assessing model fit. We conducted the Cox regression analysis on our original study sample, and the subsample with individuals continuously enrolled in fee-for-service parts A and B Medicare and with at least 2 consecutive years of comorbidity. Using the many years of longitudinal comorbidity measures in our large study cohort, we found that baseline comorbidity does not fully predict the most recent available comorbidity. Although each individual in our study may have many comorbidity measures, there were at most 2 comorbidity measures used to predict survival for each observation year in the estimating procedure.
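For readers less familiar with these fit statistics, the sketch below shows the standard definitions: the likelihood ratio statistic compares a model's maximized log likelihood with that of a reference model, and the AIC penalizes the log likelihood by the number of estimated parameters. The function names and the numeric inputs are hypothetical and are not values from this study.

```python
def likelihood_ratio(loglik_model, loglik_reference):
    """Likelihood ratio statistic: larger values indicate better fit."""
    return 2 * (loglik_model - loglik_reference)

def aic(loglik_model, n_params):
    """Akaike Information Criterion: smaller values indicate better fit."""
    return 2 * n_params - 2 * loglik_model

# Hypothetical maximized log (partial) likelihoods for two non-nested models.
aic_baseline_only = aic(loglik_model=-1000.0, n_params=4)        # 2008.0
aic_baseline_plus_prior = aic(loglik_model=-990.0, n_params=5)   # 1990.0
lr_example = likelihood_ratio(-1000.0, -1050.0)                  # 100.0
```

In this made-up comparison the second model has the smaller AIC and would be preferred, even though it uses one more parameter.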


Our sample had a mean age of 72.1 years; 61.6% were female, and 82.9% were Caucasian (Table 1). The survival rates were similar for men (70%) and women (71%). Asians and Hispanics had the highest survival rates, likely because they were younger. African American and “other” study subjects had the lowest survival rates, and Caucasians had intermediate rates. Survivors throughout the study period were younger than non-survivors (mean 69.8 vs. 77.5), and had lower comorbidity scores (mean score 0.23 vs. 0.69). Figure 1 presents comorbidity and rolling comorbidity scores for 8 randomly selected subjects who died during the study period.

Figure 1
Longitudinal Comorbidity and Rolling Comorbidity Measurements for 8 Selected Non-survivors by Year.
Table 1
Characteristics of Survivors Versus Non-Survivors

In Table 2, we present the annual numbers of subjects at risk of death, the number of subjects who died each year, and the unadjusted annual death rates. Annual death rates increased in later years, as subjects aged. We also present annual death rates for the group with no comorbidity at baseline, and for the group with comorbidity at baseline. Overall, the unadjusted death rate of subjects with comorbidity was 2 to 3 times that of subjects without comorbidity.

Table 2
Unadjusted Death Rates by Year and Comorbidity

Table 3 shows the results from the Cox regression models, with death as the outcome, the 11 different comorbidity measures as the variables of interest, and age, gender, and race/ethnicity as the covariates. The coefficient estimates and their standard errors, the relative hazard estimates and their confidence intervals, the likelihood ratio scores, and the AICs are reported for each model. The likelihood ratio scores can be compared between models (A) – (E) and between models (F) – (K). The difference in total person-years [286,265 for (A) – (E) and 313,160 for (F) – (K)] limits comparison across these model sets. The larger sample size for models (F) – (K) reflects imputed missing comorbidity scores. Generally, increasing the sample size increases the likelihood ratio score and the AIC, even if the model fit is equivalent, as is evident in Table 3.

Table 3
Association of Various Measures of Comorbidity with Survival Using Cox Regression Modeling

Table 3 demonstrates that (A), which includes only baseline comorbidity and does not account for comorbidity change over time, is the least well fitting model, with the smallest likelihood ratio score and the largest AIC. Models including a comorbidity measure from the year prior to observation [(B) and (C)] fit much better than those with baseline comorbidity, and models combining both baseline and the prior year’s comorbidity [(D) and (E)] improve the fit over models with only the prior year’s comorbidity [(B) and (C)]. Overall, the models fit best when using rolling comorbidity measures. Therefore, among (A) – (E), the best model for comorbidity is (E). This is consistent with findings from models (F) – (K), in which the best fitting model is (I), which uses baseline and the most recent available rolling comorbidity. The LOCF imputation (I) and the RC imputation (K) models produce nearly identical results. Models using mean comorbidity scores do not fit as well as models with comorbidity measures using diagnoses from the year prior to observation only.

Table 4 presents results for the 44,016 study subjects with continuous enrollment and at least 2 years of comorbidity data. Consistent with the findings in Table 3, the best fitting model is (E). Because this analysis includes only subjects with complete data, the original models (J) and (K) in Table 3 would be redundant with models (D) and (E). Therefore, (H) – (K) are not included in Table 4.

Table 4
Association of Various Measures of Comorbidity with Survival Using Cox Regression Modeling Among Subjects with at Least 2 Years of Continuous Enrollment


This research demonstrates that using longitudinal comorbidity data, when available, is associated with the best fit of Cox regression models predicting survival. Our analyses suggest that a combination of baseline comorbidity and the last year’s rolling comorbidity is the best longitudinal comorbidity measure. These findings are consistent with a limited literature examining use of longitudinal comorbidity data for predicting survival.9,10,24 Grunau et al. found that among individuals with acute myocardial infarction, adjusting for comorbidity after the index hospitalization improved the prediction of survival more than solely adjusting for comorbidity identified at the index hospitalization. Stukenborg et al. found that for five medical conditions, adjustment for comorbidity prior to the index hospitalization and comorbidity at the time of the index hospitalization somewhat improved the regression model’s ability to predict probability of death.

In this study, the difference in the likelihood ratio test and AIC between the regression models with and without baseline comorbidity is small. Differences are even smaller for the models restricted to individuals with at least 2 years of continuous enrollment. One could argue that using only the prior year rolling comorbidity variable is sufficient, though the addition of an easily computed baseline comorbidity requires minimal effort.

The “rolling” version of the prior year comorbidity variable performs better in predicting survival than the comorbidity variable based only on the prior year’s diagnoses. This suggests that many chronic conditions are not consistently recorded in administrative data, yet have an influence on outcomes. Among the 44,016 individuals with continuous enrollment, 1,578 had a recorded diabetes diagnosis in 1992 but only 70% had this recorded diagnosis in 1993. Thus, using a “rolling” comorbidity variable for chronic conditions can improve on the inconsistencies of administrative data.

Another important methodological question is whether to impute missing comorbidity data, as in model (I), which used a LOCF imputation method, or (K), which uses a RC imputation method. Imputation is justified if the imputation method is valid and the standard deviation is properly adjusted. An alternative method is to project a linear trajectory for comorbidity and impute using the predicted values.25 We did not use this strategy since the longitudinal comorbidity measures do not appear to fit a linear model well. While improper imputation may cause bias, ignoring missing comorbidity can cause bias as well. This study’s results suggest that imputing missing comorbidity data via either the LOCF or RC imputation method for a sample with non-continuous enrollment is an effective data analysis method. Research using RC as an imputation method21–23 suggests that the LOCF and RC perform equally well due largely to the moderate relative hazard parameter for comorbidity. When the relative hazard parameter is large, other more complicated statistical methods26,27 should be considered.

An alternative modeling approach uses baseline and change in comorbidity, or change in rolling comorbidity, as covariates. We did not include this modeling approach because it is a reparameterization of model (D). The coefficient estimate for the change in comorbidity is equivalent to the coefficient estimate for the last year’s comorbidity, while the coefficient estimate of the baseline comorbidity for the change model is the summation of the coefficient estimates of the baseline comorbidity and last year’s comorbidity from model (D). Likewise, modeling longitudinal comorbidity using baseline and change in rolling comorbidity will be a reparameterization of model (E).
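The reparameterization can be verified in one line. Writing \beta_1 and \beta_2 for the coefficients of the baseline and prior year comorbidity in model (D) (symbols assumed here for illustration), the linear predictor can be regrouped as:

```latex
\begin{align*}
\beta_1 S_i(t_{i0}) + \beta_2 S_i(t-1)
  &= \beta_1 S_i(t_{i0}) + \beta_2 S_i(t_{i0}) + \beta_2\,[S_i(t-1) - S_i(t_{i0})] \\
  &= (\beta_1 + \beta_2)\, S_i(t_{i0}) + \beta_2\,[S_i(t-1) - S_i(t_{i0})]
\end{align*}
```

The term in square brackets is the change in comorbidity since baseline, so its coefficient is \beta_2 and the baseline coefficient in the change model is \beta_1 + \beta_2, as stated above.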

Our ability to compare across all of our modeling approaches was limited. Because models (B) – (E) exclude observation years for which prior year comorbidity is unavailable, they include fewer observation years than models (F) – (K), making it difficult to compare the likelihood ratios and AICs between these model sets. Second, we chose to examine only a few approaches to modeling the comorbidity-survival relationship. There are many more complicated approaches to modeling longitudinal comorbidity, but we focused on practical models that can be readily applied in health services research.

We chose to use the Charlson index in this study because it is commonly used by health services researchers and widely available. However, use of other comorbidity measures, such as ACGs and DxCGs,5,7,28 would allow modeling with continuous comorbidity variables in addition to discrete comorbidity groups to determine the robustness of our findings. Other comorbidity measures may not permit calculation of rolling comorbidity scores, though. This study is also limited in its use of the less intuitive likelihood ratio and AIC measures for discriminating between different Cox regression models. Further research developing more intuitive measures of model discrimination for survival analysis would be of benefit to health services researchers.

This research provides health services researchers with guidance on the most appropriate methods for including longitudinal comorbidity data in models predicting survival outcomes. We conclude that investigators should make use of the longitudinal data available to them.8 This study’s findings suggest that a combination of baseline comorbidity and time-dependent comorbidity variables should be included in survival models. We identify how researchers can improve on less consistent administrative data by acknowledging the nature of chronic conditions through development of a “rolling” comorbidity measure, and through imputation of missing data by carrying forward the most recent available information. Further research using different study populations, modeling additional comorbidity measures, and extending to outcomes other than survival is needed to confirm these findings and expand the literature in the largely understudied area of longitudinal comorbidity measurement.


The authors thank Denise M. Lishner, MSW, who helped edit the final version of the manuscript.

Support: This project was funded by grants 1 R01 CA104935 and CA53996 from the National Cancer Institute.


References

1. Charlson ME, Pompei P, Ales KL, et al. A new method of classifying prognostic comorbidity in longitudinal studies: development and validation. J Chronic Dis. 1987;40:373–383.
2. Deyo RA, Cherkin DC, Ciol MA. Adapting a clinical comorbidity index for use with ICD-9-CM administrative databases. J Clin Epidemiol. 1992;45:613–619.
3. Klabunde CN, Potosky AL, Legler JM, et al. Development of a comorbidity index using physician claims data. J Clin Epidemiol. 2000;53:1258–1267.
4. Romano PS, Roos LL, Jollis JG. Adapting a clinical comorbidity index for use with ICD-9-CM administrative data: differing perspectives. J Clin Epidemiol. 1993;46:1075–1079; discussion 1081–1090.
5. DxCG, Inc. Analytic guide release 6.1. Boston, MA: Author; 2002. DxCG risk adjustment software.
6. Elixhauser A, Steiner C, Harris DR, et al. Comorbidity measures for use with administrative data. Med Care. 1998;36:8–27.
7. Weiner JP, Abrams C, editors. Documentation & application manual. Baltimore, MD: Johns Hopkins University; 2001. The Johns Hopkins ACG case-mix system.
8. Cox DR. Regression models and life tables (with discussion). J R Stat Soc Ser B. 1972;34:187–220.
9. Grunau GL, Sheps S, Goldner EM, et al. Specific comorbidity risk adjustment was a better predictor of 5-year acute myocardial infarction mortality than general methods. J Clin Epidemiol. 2006;59:274–280.
10. Stukenborg GJ, Wagner DP, Connors AF Jr. Comparison of the performance of two comorbidity measures, with and without information from prior hospitalizations. Med Care. 2001;39:727–739.
11. Rius C, Perez G, Martinez JM, et al. An adaptation of Charlson comorbidity index predicted subsequent mortality in a health survey. J Clin Epidemiol. 2004;57:403–408.
12. Colinet B, Jacot W, Bertrand D, et al. A new simplified comorbidity score as a prognostic factor in non-small-cell lung cancer patients: description and comparison with the Charlson’s index. Br J Cancer. 2005;93:1098–1105.
13. Byers TE, Wolf HJ, Bauer KR, et al. The impact of socioeconomic status on survival after cancer in the United States: findings from the National Program of Cancer Registries Patterns of Care Study. Cancer. 2008;113:582–591.
14. Gomez SL, O’Malley CD, Stroup A, et al. Longitudinal, population-based study of racial/ethnic differences in colorectal cancer survival: impact of neighborhood socioeconomic status, treatment and comorbidity. BMC Cancer. 2007;7:193.
15. Crew KD, Neugut AI, Wang X, et al. Racial disparities in treatment and survival of male breast cancer. J Clin Oncol. 2007;25:1089–1098.
16. Potosky AL, Riley GF, Lubitz JD, et al. Potential for cancer related health services research using a linked Medicare-tumor registry database. Med Care. 1993;31:732–748.
17. National Cancer Institute. SEER registries. [Accessed April 6, 2008]. Available at:
18. Klabunde CN, Warren JL, Legler JM. Assessing comorbidity using claims data: an overview. Med Care. 2002;40:IV-26–35.
19. Baldwin LM, Klabunde CN, Green P, et al. In search of the perfect comorbidity measure for use with administrative claims data: does it exist? Med Care. 2006;44:745–753.
20. Klabunde CN, Harlan LC, Warren JL. Data sources for measuring comorbidity: a comparison of hospital records and Medicare claims for cancer patients. Med Care. 2006;44:921–928.
21. Wang CY, Hsu L, Feng ZD, et al. Regression calibration in failure time regression with surrogate variables. Biometrics. 1997;53:131–145.
22. Wang CY, Xie SX, Prentice RL. Recalibration based on an approximate relative risk estimator in Cox regression with missing covariates. Stat Sin. 2001;11:1081–1104.
23. Xie SX, Wang CY, Prentice RL. A risk set calibration method for failure time regression using a covariate reliability sample. J R Stat Soc Ser B. 2001;63:855–870.
24. Buch P, Rasmussen S, Gislason GH, et al. Temporal decline in the prognostic impact of a recurrent acute myocardial infarction 1985 to 2002. Heart. 2007;93:210–215.
25. Wang CY, Wang N, Wang S. Regression analysis when covariates are regression parameters of a random effects model for observed longitudinal measurements. Biometrics. 2000;56:487–495.
26. Wang CY, Chen HY. Augmented inverse probability weighted estimator for Cox missing covariate regression. Biometrics. 2001;57:414–419.
27. Qi L, Wang CY, Prentice RL. Weighted estimators for proportional hazards regression with missing covariates. J Am Stat Assoc. 2005;100:1250–1263.
28. DxCG, Inc. User’s guide release 6.1. Boston, MA: Author; 2002. DxCG risk adjustment software.