Comorbidity measures are designed to exclude complications when they map International Classification of Diseases (ICD-9-CM) codes to diagnostic categories. The use of data fields that indicate whether each secondary diagnosis was present at the time of hospital admission may lead to more accurate identification of preexisting conditions.
To examine the rate of misclassification of ICD-9-CM codes into diagnostic categories by the Dartmouth–Manitoba adaptation of the Charlson index and by the Elixhauser comorbidity algorithm.
Analysis of 178,838 patients in the California State Inpatient Database (CA SID) admitted in 2000 for one of seven major medical and surgical conditions. The CA SID includes a condition present at admission (CPAA) modifier for each ICD-9-CM code.
The Dartmouth/Charlson index and the Elixhauser comorbidity measure were used to map the ICD-9-CM codes into diagnostic categories for patients in each study population. We calculated the misclassification rate for each mapping algorithm, using information from the CPAA as the “gold standard.”
The Dartmouth/Charlson index underestimated the prevalence of hemiplegia/paraplegia by 70 percent, cerebrovascular disease by 70 percent, myocardial infarction by 65 percent, congestive heart failure (CHF) by 45 percent, and peptic ulcer disease by 34 percent. The Elixhauser algorithm misclassified complications as preexisting conditions for 43 percent of the coagulopathies, 25 percent of the fluid and electrolyte disorders, 18 percent of the cardiac arrhythmias, and 9 percent of the cases of CHF.
Adding the CPAA modifier to administrative data would significantly enhance the ability of the Dartmouth/Charlson index and of the Elixhauser algorithm to map ICD-9-CM codes to diagnostic categories accurately.
Hospital report cards have become an integral part of the evolving health care landscape. One of the principal barriers to the widespread implementation of health outcomes measurement is the cost of collecting the clinical data required for risk adjustment (Chassin and Galvin 1998). To circumvent this problem, many performance-profiling systems rely on administrative rather than clinical data. Third-party payers are making hospital report cards based on such administrative data available to patients. “Hospital Comparison Tools,” which uses administrative data to benchmark hospital performance, is an example of a web-based service offered to members of many major health care plans such as Blue Cross/Blue Shield and Aetna. The Institute of Medicine has identified the importance of these data for assessing health care quality in Envisioning the National HealthCare Quality Report, by stating that “Administrative data, such as Medicare claims, represent one of the most practical and cost-effective data sources on selected components of healthcare quality available today” (Hurtado, Swift, and Corrigan 2001).
Hospital report cards are based on risk-adjusted mortality rates that are calculated from administrative data. The clinical information used in risk adjustment is captured in the primary and secondary diagnoses coded using the International Classification of Diseases (ICD-9-CM) system. These administrative data sets, however, fail to distinguish between conditions present at admission (preexisting conditions) and conditions that developed subsequent to admission (complications). This distinction is critically important because misclassifying complications as preexisting conditions can lead to the overestimation of the risk of mortality, effectively giving lower quality hospitals “credit for the complications that occurred under their care” (Jollis and Romano 1998). For example, a hospital with a high postoperative myocardial infarction rate after coronary artery bypass grafting will have an inappropriately high predicted mortality rate and, therefore, an inappropriately low risk-adjusted mortality rate if patients with postoperative myocardial infarctions are wrongly assumed to have had their myocardial infarction prior to hospital admission. Inaccurate risk adjustment may yield incorrect conclusions regarding hospital quality. The inability to distinguish accurately between preexisting conditions and complications in administrative data may greatly reduce the face validity and value of hospital quality report cards.
Unfortunately, ICD-9-CM codes in most administrative data sets are not “date stamped” to indicate whether they represent secondary diagnoses that were present prior to hospital admission or complications that developed subsequent to hospital admission. Only two states, California and New York, use a “condition present at admission” (CPAA) field to indicate, for each recorded diagnosis, whether it was present at the time of admission. Although, in theory, the information from the CPAA field should lead to fewer errors, the extent to which complications are misclassified as preexisting conditions is largely unknown.
In practice, the large number of ICD-9-CM codes (over 14,000) makes it impossible to perform risk adjustment using administrative data without first mapping ICD-9-CM codes to a smaller number of diagnostic categories (e.g., congestive heart failure [CHF], myocardial infarction, renal disease). The best-known mapping algorithms used for this purpose are the Deyo (1992) and Dartmouth–Manitoba (Romano, Roos, and Jollis 1993) adaptations of the Charlson index (Charlson et al. 1987), and the Elixhauser comorbidity measure (Elixhauser et al. 1998). These mapping algorithms are designed to exclude ICD-9-CM codes in a patient record that are likely to represent complications. The Deyo and Dartmouth–Manitoba adaptations of the Charlson index rely on the linkage of hospital data “across multiple episodes of care” to differentiate between preexisting conditions and complications. The Elixhauser algorithm, on the other hand, uses information only from the current admission: by design, many ICD-9-CM codes that could represent either complications or preexisting conditions were excluded from the Elixhauser algorithm in an effort to avoid inadvertently identifying a complication as a preexisting condition.
Whether the addition of CPAA modifiers for each secondary diagnosis leads to more accurate identification of preexisting conditions versus complications is not well understood. Previous studies have been limited by small sample sizes and the use of narrowly defined population groups. Roos et al. (1997) investigated the extent to which the Dartmouth–Manitoba adaptation of the Charlson index misclassified complications in patients undergoing coronary artery bypass grafting surgery (CABG), pacemaker, and hip fracture surgery (n=7,187) using an administrative data set with date stamp information. These investigators found that the proportion of diagnoses—myocardial infarction, CHF, and cerebrovascular disease, etc.—that were correctly mapped, varied between 9.5 and 100 percent. The same study showed that in a much larger study population based on patients undergoing 17 surgical procedures, complications represented 11.1 percent of all of the diagnoses. However, in this larger patient group, findings were not reported for specific diagnostic categories. Southern, Quan, and Ghali (2004) showed, using an administrative data set with date stamp information, that the prevalence of specific diagnoses using the Deyo adaptation of the Charlson index was very similar regardless of whether or not date stamp information was used in patients with acute myocardial infarctions (n=4,833). However, prevalence rates across diagnoses could be similar if complications misclassified as preexisting diagnoses were offset by “missed diagnoses.”1 Neither of these studies explored the potential for diagnoses to be “missed” by the Charlson index—which can occur when some ICD-9-CM codes for a preexisting condition are present only on the current admission record, and not on a record from a previous hospitalization.
The goal of our study was to quantify the misclassification rate of the Dartmouth–Manitoba adaptation of the Charlson index and of the Elixhauser algorithm using date stamp information as the “gold standard.” Our study was based on a cohort of 178,838 patients admitted for one of seven major surgical procedures or medical conditions: coronary artery bypass grafting, coronary angioplasty, carotid endarterectomy, abdominal aortic aneurysm (AAA) repair, total hip replacement, acute myocardial infarction, and stroke. The Dartmouth–Manitoba adaptation of the Charlson index and the Elixhauser algorithm were used to map ICD-9-CM codes to diagnostic categories. We compared the results of using these mapping algorithms with and without the use of date stamp information. We estimated both the proportion of complications that were misclassified as preexisting conditions and the proportion of preexisting conditions that were “missed” using these mapping algorithms. This study was conducted using the California State Inpatient Database (SID) because for each recorded diagnosis, California data indicates whether or not it was present on admission through the use of a “CPAA” field.
Evaluating the value of the date stamp is important if other states and the Medicare program are to consider adding date stamps to their administrative data. Currently, date-stamped diagnoses are not present in any of the state discharge databases other than California and New York, nor are they present in the Medicare or Medicaid databases. Adding date stamp information to hospital discharge data sets will be expensive because every secondary diagnosis will have to be evaluated by hospital coders to determine whether it was present on admission. However, misclassifying complications as preexisting conditions may seriously bias hospital quality measurement and may compromise our ability to improve health care quality in this country. If our findings show that the addition of “date stamp” information to ICD-9-CM codes leads to more accurate identification of preexisting conditions, health care policy makers will need to consider mandating date stamping of ICD-9-CM codes by the states and by the federal government.
This study is based on the 1998–2000 California SID, which contains 100 percent of the state's inpatient discharge records. The data are made available through a Federal-State-Industry partnership sponsored by the Agency for Healthcare Research and Quality (AHRQ) as part of the Healthcare Cost and Utilization Project (HCUP). The California SID was chosen for this study because California is one of only two states that date stamp secondary diagnoses.
The data for the California SID were provided by the California Office of Statewide Health Planning and Development (OSHPD) to HCUP and are based on data abstracted from medical charts by hospital data coders. ICD-9-CM coding slots for up to 30 diagnoses and 21 procedures are available for each patient record in the California SID. The CPAA field—that is, date stamp—indicates whether a diagnosis was present at admission. It is coded for all primary and secondary diagnoses with the exception of E-codes. Coding data quality has been found to be higher when the controlling entity that submits the data to HCUP is a government agency, as opposed to a private hospital association (Berthelsen 2000). The discharge data reports submitted by individual hospitals are checked for errors using software edit tools. In addition, trend analysis is performed by analysts to detect any large unexplained variation in the data. Discharge data reports that do not meet error tolerance levels established by the state (see Table 1) are sent back to the reporting institution for correction (California Patient Discharge Data Reporting Manual 2000). For discharge data reports not exceeding allowable error tolerance levels, a missing value is assigned to data elements that are incorrectly coded. The accuracy of the California discharge-based data has been previously validated by comparing coded data and information obtained by reabstracting medical records (Stukenborg, Wagner, and Connors 2001). However, the CPAA modifier has not been validated using reabstraction studies.
We conducted exploratory analyses on the CPAA modifier using the entire CA SID between 1999 and 2000. The first goal of these analyses was to determine the proportion of missing CPAA modifiers. We then looked at CPAA modifiers for ICD-9-CM codes for secondary diagnoses that most likely represent preexisting conditions to determine what proportion were coded as preexisting conditions. Finally, we looked at CPAA modifiers for some ICD-9-CM codes that were likely to represent complications.
We evaluated the importance of the date stamp in seven study populations: coronary artery bypass grafting, coronary angioplasty, carotid endarterectomy, AAA surgery, hip replacement, myocardial infarction, and stroke. These primary diagnoses were chosen because they represent common surgical and medical diagnoses, and are associated with significant mortality. For each study population, we constructed an index data set that included all the inpatient admissions during the 2000 calendar year that met the ICD-9-CM coding criteria in Table 2. If a patient was admitted more than once in 2000 with the same primary diagnosis, only the initial admission was included in the analysis. We then created longitudinal data sets by linking patient records in the index data sets to records from previous inpatient admissions between 1998 and 2000 using encrypted social security numbers and gender. In order to have a uniform “look back” period, we only included information from prior admissions if they occurred within 2 years of the index admission. When the index admission occurred in the same month and year as a “linked” admission in the longitudinal data set, we used transfer status, discharge quarter, and live/die status to determine which admission came first. Of note, since the CA SID is a limited data set, only the month and year of admission are provided—exact admission dates and discharge dates are not available. We excluded patients if the time sequence of the index admission and the “linked” admission could not be determined; 10.7 percent of the patients were excluded.
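The look-back rule described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' STATA code: the `Admission` record, the `patient_id` field (standing in for the encrypted linkage key), and the function names are hypothetical. Because only month and year are available, the sketch simply leaves out same-month admissions instead of reproducing the tie-breaking on transfer status, discharge quarter, and live/die status.

```python
from typing import List, NamedTuple


class Admission(NamedTuple):
    patient_id: str  # hypothetical stand-in for the encrypted linkage key
    year: int
    month: int


def months_between(earlier: Admission, later: Admission) -> int:
    """Whole calendar months from `earlier` to `later` (month/year only)."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def prior_admissions(index: Admission, records: List[Admission],
                     lookback_months: int = 24) -> List[Admission]:
    """Earlier admissions of the same patient inside the look-back window."""
    linked = []
    for rec in records:
        if rec.patient_id != index.patient_id:
            continue
        gap = months_between(rec, index)
        # gap == 0: same month and year, so the order cannot be decided
        # from dates alone; this sketch skips such records.
        if 0 < gap <= lookback_months:
            linked.append(rec)
    return linked
```

For an index admission in June 2000, a June 1998 record falls inside the window (24 months), while a May 1998 record (25 months) does not.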
STATA 8/SE (2003) programming language was used to map ICD-9-CM codes to the diagnostic categories in the Elixhauser algorithm (Elixhauser et al. 1998) and to the diagnostic categories in the Dartmouth–Manitoba adaptation of the Charlson index (Dartmouth/Charlson index) (Romano, Roos, and Jollis 1993).2–4 We did not implement the DRG screen that is used by the Elixhauser algorithm to exclude secondary diagnoses that are related to the primary diagnosis. Elixhauser et al. assumed that other measures, such as disease staging (Gonnella, Louis, and Gozum 1994), could be used to characterize the severity of disease of the principal diagnosis. However, in practice, most investigators use comorbidity measures to identify all secondary diagnoses, regardless of whether the secondary diagnoses represent comorbidities or conditions that characterize severity of disease.
The original versions of the Elixhauser algorithm and of the Charlson index were developed for use with administrative data that do not contain date stamp information. We constructed two versions of each of these algorithms. The first version ignored the presence of the date stamp associated with each ICD-9-CM code (no date stamp version). The second version used the date stamp associated with each ICD-9-CM code to determine whether an ICD-9-CM code represented a preexisting condition (date stamp version). We used these algorithms to map the ICD-9-CM codes to diagnostic categories for patients in each study population. The two versions of the Elixhauser algorithm were used to map ICD-9-CM codes in the index data set to diagnostic categories, whereas the two versions of the Dartmouth/Charlson index were applied to the longitudinal data set.
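A minimal Python sketch of the two versions, applied to a single index admission, may help make the contrast concrete. The prefix table below is a deliberately tiny illustration (428.x for CHF, 286.x for coagulopathy) and is not the published code list; the function name and the boolean `present_at_admission` flag are assumptions made for this example.

```python
# Illustrative prefix table only; the published algorithms use much
# longer and more specific ICD-9-CM code lists.
CATEGORY_PREFIXES = {
    "CHF": ("428",),
    "coagulopathy": ("286",),
}


def map_categories(diagnoses, use_date_stamp):
    """Map one admission's secondary diagnoses to diagnostic categories.

    diagnoses: iterable of (icd9_code, present_at_admission) pairs,
               with the code written without the decimal point.
    use_date_stamp: False ignores the flag (no date stamp version);
                    True drops codes not present at admission.
    """
    found = set()
    for code, present_at_admission in diagnoses:
        if use_date_stamp and not present_at_admission:
            continue  # date stamp version treats this as a complication
        for category, prefixes in CATEGORY_PREFIXES.items():
            if code.startswith(prefixes):
                found.add(category)
    return found
```

A patient with CHF present at admission and a coagulopathy that developed in hospital is mapped to both categories by the no date stamp version and to CHF alone by the date stamp version.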
We defined the false-positive error rate (FPER) as the number of complications identified as preexisting conditions (false positives) divided by the total number of cases mapped to a diagnostic category (true positives plus false positives). A patient who is mapped to the “coagulopathy” diagnostic category by the Elixhauser algorithm and for whom the CPAA modifier indicates that this secondary diagnosis was not present at admission (i.e., a complication) is counted as a false positive.
We defined the false negative error rate (FNER) as the number of “missed” diagnoses (false negatives) divided by the total number of diagnoses (false negatives plus true positives).
In order to avoid coding complications as preexisting conditions, the Dartmouth/Charlson index will only map some ICD-9-CM codes to a diagnostic category if they are present on a prior hospital record. Thus, a false negative may occur when an ICD-9-CM code present on the index admission is not present on a prior hospital record. For example, a patient with a history of myocardial infarction 3 months prior to admission would not be mapped to the “myocardial infarction” diagnostic category by the Dartmouth/Charlson index if his most recent previous hospital admission was 6 months prior to the index admission. Since, by construction, the Elixhauser algorithm does not use information from prior hospitalizations, the FNER for the Elixhauser algorithm must be zero. It is probable that the Elixhauser algorithm also underestimates the prevalence of some of the diagnostic categories by excluding ICD-9-CM codes that are likely to represent complications (a problem that is exacerbated when using the DRG screen). Expanding the Elixhauser algorithm to include such codes, and thus quantitate the “true” FNER for the Elixhauser algorithm, was beyond the scope of this present study.
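The way a prior-record requirement produces a “missed” diagnosis can be sketched as follows. This is a simplified illustration for a single category, not the Dartmouth–Manitoba code list: the split into `AMBIGUOUS` and `UNAMBIGUOUS` prefixes (410.x and 412 here) and the exact form of the date stamp rule are assumptions made for the example.

```python
# Hypothetical split for one category (myocardial infarction):
AMBIGUOUS = ("410",)    # could be a complication or a preexisting condition
UNAMBIGUOUS = ("412",)  # assumed always to denote a preexisting condition


def mapped_no_date_stamp(index_codes, prior_codes):
    """Ambiguous codes count only when found on a prior hospital record."""
    on_prior = any(c.startswith(AMBIGUOUS + UNAMBIGUOUS) for c in prior_codes)
    on_index = any(c.startswith(UNAMBIGUOUS) for c in index_codes)
    return on_prior or on_index


def mapped_date_stamp(index_dx, prior_codes):
    """Index codes count whenever they are flagged present at admission.

    index_dx: iterable of (icd9_code, present_at_admission) pairs.
    """
    on_prior = any(c.startswith(AMBIGUOUS + UNAMBIGUOUS) for c in prior_codes)
    on_index = any(
        present and code.startswith(AMBIGUOUS + UNAMBIGUOUS)
        for code, present in index_dx
    )
    return on_prior or on_index
```

If an ambiguous code is flagged present at admission on the index record but appears on no linked prior record, the date stamp version maps the patient to the category and the no date stamp version does not, which is a false negative in the sense defined above.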
The formulas for the FPER and for the FNER are shown below:
$$\mathrm{FPER} = \frac{\text{false positives}}{\text{false positives} + \text{true positives}}$$
where a false positive is a patient classified as having a condition (e.g., CHF) by the no date stamp version and not by the date stamp version, and a true positive is a patient classified as having the condition by both the date stamp and the no date stamp versions.
$$\mathrm{FNER} = \frac{\text{false negatives}}{\text{false negatives} + \text{true positives}}$$
where a false negative is a patient classified as having a condition (e.g., CHF) by the date stamp version and not by the no date stamp version, and a true positive is a patient classified as having the condition by both the date stamp and the no date stamp versions.
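As a rough illustration, both error rates for one diagnostic category can be computed from the two sets of patients identified by each version. The function name and the use of patient-identifier sets are assumptions of this sketch; returning 0.0 when a denominator is empty is a convention chosen here, not something stated in the study.

```python
def error_rates(no_stamp, stamp):
    """Return (FPER, FNER) for one diagnostic category.

    no_stamp: set of patient ids mapped by the no date stamp version.
    stamp:    set of patient ids mapped by the date stamp version,
              treated as the gold standard.
    """
    true_pos = len(no_stamp & stamp)   # mapped by both versions
    false_pos = len(no_stamp - stamp)  # complications counted as preexisting
    false_neg = len(stamp - no_stamp)  # preexisting conditions that were missed
    # Sketch convention: report 0.0 when there is nothing to divide by.
    fper = false_pos / (false_pos + true_pos) if false_pos + true_pos else 0.0
    fner = false_neg / (false_neg + true_pos) if false_neg + true_pos else 0.0
    return fper, fner
```

With 100 patients in each set and 75 in common, both rates are 25 percent even though the two versions report the same prevalence, which is the offsetting-error situation described in footnote 1.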
We calculated the FPER and the FNER of the “no date stamp” version of the Dartmouth/Charlson index using the “date stamp” version as the “gold standard.” We calculated the FPER and the FNER for each of the 16 diagnostic categories in the Dartmouth/Charlson index (we excluded AIDS since HIV status is not coded in the CA SID). These calculations were first performed for each of the seven patient groups separately, and then for all of the patients together. We repeated this analysis for the Elixhauser algorithm.
After excluding E-codes, which are not coded with the CPAA modifier, out of over 55 million ICD-9-CM codes, 92.99 percent of CPAA modifiers indicated that the corresponding ICD-9-CM codes designated diagnoses present at the time of admission and 6.48 percent conditions that developed following hospital admission. The percent of missing CPAA modifiers was 0.54 percent.
To examine the face validity of the date stamp, we first examined its distribution with respect to diagnosis types, that is, diagnoses that are likely to be preexisting conditions and those that are likely to be complications. For ICD-9-CM codes that are very likely to represent preexisting conditions (ectopic pregnancy, old myocardial infarction, chronic ischemic heart disease, etc.), the CPAA fields indicated that these secondary diagnoses were present at admission 98 to 99 percent of the time (Table 3). On the other hand, for ICD-9-CM codes that designate secondary diagnoses very likely to represent complications, the CPAA fields indicated that these secondary diagnoses were not present at admission between 69 and 80 percent of the time (Table 3). The likely explanation for why this latter percentage was not higher is that some of these patients were readmitted with complications that had occurred during a previous admission. These data suggest that the date stamp in the CA data set is meaningful and could potentially be used to improve the accuracy of the Elixhauser algorithm and of the Charlson index.
The FNER—preexisting conditions that were “missed” by the Dartmouth/Charlson index—ranged between 0 and 70 percent in the combined data set (all patient populations). The prevalence of myocardial infarctions was underestimated by 65 percent, CHF by 45 percent, cerebrovascular disease by 70 percent, peptic ulcer disease by 34 percent, and hemiplegia/paraplegia by 70 percent. The FPER—complications that were misclassified as preexisting conditions—ranged between 0 and 11 percent. Three percent of the myocardial infarctions, 11 percent of the cases of renal disease, and 7 percent of the cases of moderate-to-severe liver disease were misclassified as preexisting conditions. These results are shown in Table 4. For some of the diagnostic conditions, there is considerable variability across different patient populations (Table 5). For example, although the FPER for renal disease was only 11 percent overall, it was 70 percent for patients undergoing AAA repair and 32 percent in patients undergoing CABG. Similarly, although the FPER for myocardial infarction was only 3 percent overall, it was 23 percent in AAA patients.
For the Elixhauser algorithm, the proportion of complications misclassified as preexisting conditions ranged between 0 and 43 percent. The results are shown in Table 6. Nine percent of the cases of CHF, 18 percent of the cardiac arrhythmias, 8 percent of the “other” neurologic diseases, 43 percent of the coagulopathies, 25 percent of the fluid and electrolyte disorders, 20 percent of the cases of weight loss, 13 percent of the cases of blood loss anemia, and 16 percent of the cases of deficiency anemia represented complications and not preexisting conditions. As discussed in the Methods section, the FNER for the Elixhauser algorithm was zero by construction. As with the Charlson index, there was considerable variability across different patient populations (Table 7). Although the FPER for CHF was 9 percent overall, it rose to 40 percent in AAA patients and 18 percent in CABG patients. Similarly, the FPER for paralysis was only 4 percent overall, but was 65 percent in the AAA group and 67 percent in the CABG group.
In a recent study, the Institute of Medicine emphasized the critical role that “comparative quality data” can play in promoting health care quality (Corrigan, Eden, and Smith 2002). The goal of quality reporting is “[drawing] attention to best practices in the hope of driving patient volume to the higher-quality performers, and spurring action on the part of poor and average performers to enhance their knowledge and skills or limit their scope of practice” (Corrigan, Eden, and Smith 2002). Aside from some notable exceptions like the VA National Surgical Quality Improvement Program (NSQIP) (Khuri, Daley, and Henderson 2002) and the New York State Cardiac Surgery Reporting System (Hannan et al. 1994), current efforts to measure health care quality are still dependent on “20th century measurement technology …[for the] culling of information from administrative data sets” (Corrigan, Eden, and Smith 2002). Although clinical data are clearly preferable to administrative data for performing the risk adjustment required for quality measurement, they are still largely unavailable for the vast majority of patients. Until the necessary infrastructure for collecting computerized clinical data becomes widespread, we must continue to rely on administrative data for risk adjustment. Two states, California and New York, have created enhanced administrative data sets that include a date stamp to signify whether a secondary diagnosis was present at the time of hospital admission. This enhancement may help to mitigate one of the major drawbacks to using administrative data sets for performance profiling: the difficulty of distinguishing between preexisting conditions and complications. The goal of this study was to determine whether the addition of date stamp information to administrative data improves the ability of two well-known mapping algorithms to accurately identify preexisting conditions.
This study shows that applying the Dartmouth–Manitoba adaptation of the Charlson index to administrative data severely underestimates the prevalence of myocardial infarction, CHF, cerebrovascular disease, peptic ulcer disease, and hemiplegia/paraplegia: between 34 and 70 percent of the patients with these conditions were “missed.” The rate at which complications are misclassified as preexisting conditions was much lower: 3 percent for myocardial infarctions, 0 percent for CHF, 11 percent for renal disease, and 7 percent for moderate-to-severe liver disease. The high incidence of “false negatives”—missed diagnoses—is a function of how this algorithm is constructed. The Dartmouth–Manitoba adaptation of the Charlson index uses information from prior hospital records to avoid misclassifying complications as preexisting conditions: ICD-9-CM codes that could represent either preexisting conditions or complications must be present on a prior admission (and not just on the current admission) before they can be mapped to a diagnostic category. Thus, a patient who experiences CHF 6 weeks before admission will not be considered to have CHF as a preexisting condition if his prior hospitalization(s) did not include an ICD-9-CM code for CHF. This approach to excluding complications has a major impact on the ability of the Dartmouth–Manitoba adaptation of the Charlson index to accurately identify preexisting conditions. Adding the CPAA modifier to administrative data would significantly enhance the ability of the Dartmouth–Manitoba adaptation of the Charlson index to accurately map ICD-9-CM codes to diagnostic categories.
The Elixhauser algorithm, on the other hand, does not use information from prior hospitalizations to distinguish between complications and preexisting conditions. Not surprisingly, we found a much higher misclassification rate for complications using the Elixhauser algorithm as compared with the Dartmouth–Manitoba adaptation of the Charlson index. For example, 18 percent of the cardiac arrhythmias and 43 percent of the coagulopathies identified as preexisting conditions were in fact complications that developed subsequent to hospital admission. In some study populations, the misclassification rate was much higher. Forty percent of the cases of CHF in AAA patients and 18 percent of the cases of CHF in CABG patients identified as preexisting conditions were in fact complications. Similarly, 65 percent or more of the cases of “paralysis” identified as a preexisting condition using the Elixhauser algorithm in patients undergoing AAA repair or CABG turned out to be complications.
Since the Elixhauser algorithm was designed to be used only with data from the current hospital record, ICD-9-CM codes that were likely to represent complications were excluded from the mapping methodology in order to avoid classifying complications as preexisting conditions. Our study did not assess the extent to which the Elixhauser algorithm underestimates the prevalence of the diagnostic categories. However, our study does demonstrate that the addition of CPAA modifiers to administrative data would significantly reduce the misclassification of complications as preexisting conditions for some of the diagnostic categories in the Elixhauser algorithm.
Our analysis has several strengths. First, it is the single largest study—178,838 patients with one of seven primary diagnoses—designed to analyze the potential for misclassifying secondary diagnoses using two widely recognized ICD-9-CM mapping algorithms. The large sample size made it possible to assess the extent of misclassification of secondary diagnoses across specific primary diagnoses. Second, this study was population based and, in contrast to many studies based on the Medicare database, was not limited to patients over the age of 65. Third, the California SID includes 30 slots for diagnosis codes, compared with the Medicare database, which only has coding slots for up to nine discharge diagnoses. Limiting the number of coding slots can potentially cause some chronic diagnoses to be truncated from the administrative record (Iezzoni 2003).
Several potential limitations and caveats are noteworthy. First, this study assumes that the date stamp provides accurate information on whether an ICD-9-CM code represents a condition present at the time of admission to the hospital. The accuracy of the date stamp has not been previously validated in any study using chart reabstraction or clinical data. However, the fact that the date stamp identified chronic conditions (ectopic pregnancy, old myocardial infarction, chronic ischemic heart disease, etc.) as present on admission >98 percent of the time suggests that the CPAA modifier has face validity.
Second, because of the limitations of the data set, we were forced to exclude some patients with a previous admission in the same year and month as the index admission. It is likely that the Dartmouth/Charlson index would have underestimated the prevalence of preexisting conditions to a lesser extent if these patients had not been excluded.5 However, excluding these patients would not be expected to decrease the error rate of the Dartmouth/Charlson index for classifying complications.6 Since the Elixhauser algorithm is applied only to the index admission, the excluded patients would not be expected to have an effect on the study results.
Third, our study could be criticized for using comorbidity measures to identify all preexisting conditions, as opposed to identifying only comorbidities and ignoring conditions related to disease severity. We believe that for the purpose of creating prediction models, the distinction between secondary diagnoses that represent comorbidities versus those that describe severity of disease is arbitrary. That is, whether a secondary diagnosis is a comorbidity or a measure of disease severity makes no difference in risk adjustment. This lack of distinction between comorbidities and disease severity is underscored by the fact that comorbidity measures are widely used to identify all diagnostic conditions regardless of whether or not they relate to the principal diagnosis.
Fourth, in order to include all preexisting conditions, we did not implement an integral component of the Elixhauser algorithm—the DRG screen. Our study design increased the rate at which the Elixhauser algorithm identified complications as preexisting conditions. The purpose of this study, however, was not to identify the “limitations” of the Dartmouth/Charlson index and of the Elixhauser algorithm, but rather to better understand the extent to which the clinical content derived by applying these mapping algorithms to administrative data could be enhanced by the addition of date stamp information. Finally, this study does not examine the effect of misclassifying secondary diagnoses on hospital performance ranking. The impact of date stamping on the identification of hospital quality outliers will be the subject of future investigations.
Adding date stamp information to administrative data will substantially improve the ability of the Dartmouth–Manitoba adaptation of the Charlson index and of the Elixhauser algorithm to accurately identify preexisting conditions in administrative data. This has important health policy implications. First, health outcomes report cards are only as good as the data on which they are based. This study demonstrates that ICD-9-CM codes are often mapped inaccurately into diagnostic categories when date stamp information is not available. A priori, it is reasonable to assume that poor data quality will lead to inaccurate report cards. Second, most health care outcomes studies based on administrative data use the Charlson index as a measure of the intensity of patient disease. Some of these observational studies have become important “drivers” in the effort to reform health care. For example, the Leapfrog Initiative (Birkmeyer, Finlayson, and Birkmeyer 2001), which seeks to use market forces to regionalize health care for high-risk surgery to high-volume centers, illustrates the potential of health policy research to shape health care policy. This initiative is grounded in the large body of research showing that higher volumes lead to lower mortality. Most of the studies examining the volume–outcome association are based on administrative data (Halm, Lee, and Chassin 2002). The largest, and possibly most influential study to examine the impact of hospital and surgeon volume on mortality was based on the Medicare data set and used the Charlson index to adjust for patient severity of disease (Birkmeyer et al. 2002, 2003). Thus, the drive to institute “selective referral” of patients undergoing high-risk surgery to high-volume centers (Dudley et al. 2000) may be based largely on studies with imperfect risk adjustment due, in part, to the absence of date stamp information.
Our findings suggest that adding date stamp information to administrative data will lead to more accurate mapping of ICD-9-CM codes to diagnostic categories. This study highlights the potential risk of relying on conventional administrative data that are not date stamped when constructing hospital report cards. Future studies are needed to investigate how date stamp information affects the evaluation of hospital quality. A relatively simple addition to administrative data sets, the CPAA modifier, may greatly improve risk adjustment. The findings of this study may prove useful to health care policy makers exploring the value and feasibility of instituting date stamping of ICD-9-CM codes in the other 48 states and in the Medicare/Medicaid programs.
The following supplementary material is available for this article online:
Prevalence of diagnostic categories based on the “no date stamp” version of the Dartmouth-Manitoba adaptation of the Charlson index by patient population.
Prevalence of diagnostic categories based on the “no date stamp” version of the Elixhauser algorithm by patient population.
This project was supported by a grant from the Agency for Healthcare Research and Quality (RO1 HS 13617).
1. Example: Suppose 100 patients in the “date stamp” group are identified as having a diagnosis of myocardial infarction (MI). In the “no date stamp” group, 100 patients might also be identified with a diagnosis of MI, but 25 of these could be complications, and 25 patients with a “true” history of MI could be missed.
2. We incorporated the additional codes for hypertensive heart and renal disease with congestive heart failure listed in the footnote in the Romano et al. (1993) description of the Dartmouth–Manitoba adaptation of the Charlson index.
3. We also considered complicated and uncomplicated hypertension as two separate diagnostic categories.
4. We included the ICD-9-CM code for tobacco abuse in the Elixhauser diagnostic category for drug abuse.
5. Preexisting conditions are “missed” by the Dartmouth/Charlson index when the ICD-9-CM codes mapped to a diagnostic category are counted only if they are present on a previous admission. Prior admissions that are closer in time to the index admission are more likely to code for an “acute” condition.
6. Complications are incorrectly classified as preexisting conditions by the Dartmouth/Charlson index when the ICD-9-CM codes can be mapped to a diagnostic category if they are present either on the index admission or on a previous admission.
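The hypothetical counts in note 1 can be worked through in a few lines. The sketch below is illustrative only: the function name and the three summary quantities are assumptions introduced here, not the formal misclassification definitions used in the analysis. It shows how two kinds of error (complications counted as preexisting, and true preexisting conditions missed) can offset each other so that the apparent prevalence is unchanged.

```python
def misclassification_summary(true_preexisting, flagged_complications, missed_preexisting):
    """Summarize a hypothetical "no date stamp" mapping against the date-stamped count.

    true_preexisting      -- patients with the condition according to the date stamp
    flagged_complications -- complications wrongly counted as preexisting
    missed_preexisting    -- true preexisting conditions the mapping failed to flag
    (All three names are hypothetical and exist only for this illustration.)
    """
    correctly_flagged = true_preexisting - missed_preexisting
    flagged_total = correctly_flagged + flagged_complications
    return {
        "flagged_total": flagged_total,
        # Net change in apparent prevalence relative to the date-stamped count.
        "net_prevalence_change": (flagged_total - true_preexisting) / true_preexisting,
        # Share of flagged patients whose condition was actually a complication.
        "share_flagged_complications": flagged_complications / flagged_total,
        # Share of truly preexisting conditions that went unflagged.
        "share_true_missed": missed_preexisting / true_preexisting,
    }


# Counts taken from the example in note 1: 100 true cases, 25 complications, 25 missed.
summary = misclassification_summary(100, 25, 25)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With these counts the flagged total stays at 100 and the net prevalence change is zero, even though a quarter of the flagged patients are complications and a quarter of the true cases are missed. Comparing prevalence alone can therefore understate misclassification.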