To examine the validity of International Classification of Diseases, Ninth Revision (ICD-9) and Current Procedural Terminology (CPT) codes for knee replacement and hip replacement in Veterans Affairs (VA) databases.
From a cohort of veterans who received health care at Minneapolis VA Medical Center and/or affiliated medical facilities, we obtained 4 random samples of 50 patients each with: neither hip nor knee replacement code, knee replacement code only, hip replacement code only and both knee and hip replacement codes. The gold standard was documentation of knee or hip replacement surgery in patient medical records. Accuracy of ICD-9 or CPT code for knee and hip replacement was assessed by calculating sensitivity, specificity, positive and negative predictive values (PPV and NPV).
Of the 200 patients, medical records were available for 166: 140 (70%) had complete medical records and 26 (13%) had incomplete medical records. Knee replacement codes were accurate with excellent PPV of 95%, sensitivity of 95%, specificity of 96% and NPV of 96%. Hip replacement codes were accurate with excellent PPV of 98%, sensitivity of 96%, specificity of 99% and NPV of 96%. Sensitivity analyses that included incomplete charts had little impact on these estimates. The procedure dates found in VA databases matched exactly with medical records in 96%.
The ICD-9 and CPT codes for knee replacement and hip replacement in VA databases are valid. These codes may be used to identify cohorts of veterans with knee replacement and hip replacement for research studies.
Knee and hip replacements are among the most common surgeries performed in the U.S. and in VA hospitals (1) (2). The Department of Veterans Affairs (VA) is the largest health care delivery system in the U.S. VA databases have been used in a variety of high-quality health services research and outcomes studies (3–6). Although diagnostic inaccuracy (under/over-coding) (7) and incomplete documentation (8) are limitations of these databases, administrative/clinical database codes have been found to be valid for acute myocardial infarction (9), hepatitis C (10) and spondyloarthritis (11). Our objective was to assess the validity of International Classification of Diseases, Ninth Revision (ICD-9) or Current Procedural Terminology (CPT) codes for knee and hip replacement and dates of surgery in Minneapolis VA databases.
For this validation study, we used data from our survey of all veterans in the Upper Midwest Network with a VA health care encounter (n=70,334), details of which are described elsewhere (12). Of these, 1,241 had undergone prior knee or hip replacement, as documented by the presence of an ICD-9 or CPT code for knee or hip replacement (00.70–00.76, 00.8–00.84, 81.51–81.55; 27437, 27438, 27440–27443, 27445–27447, 27486, 27487, 27125, 27130, 27132, 27134, 27137, 27138 and 27236). We obtained 4 random samples of 50 patients each with: neither hip nor knee replacement code, knee replacement code only, hip replacement code only and both knee and hip replacement codes. This combined list of 200 patients, with names in alphabetic order, was provided to a physician (S.A.) trained in chart abstraction, who was blinded to the ICD-9 and CPT codes as well as to how the sample was obtained. The Institutional Review Board at the Minneapolis VA Medical Center approved the study.
We used a standardized data extraction form to abstract demographics (age and gender), the date of knee or hip replacement, laterality (right, left), type of replacement (total or partial, primary or revision), underlying diagnosis and the procedure details from the operative and other clinical notes. Chart documentation of knee or hip replacement surgery in the patient's medical records was the gold standard for a patient having undergone a knee or a hip replacement. All medical records (in- and out-patient visits) were reviewed starting from the first available encounter. We retrieved complete VA medical records, including paper and computerized records, for 140 patients (70%) and incomplete medical records for 26 patients (13%). No medical records were available for 34 patients (17%). Thus, the study sample consisted of 166 patients (83%). Main analyses were performed for the 140 patients with complete charts, with sensitivity analyses including all 166 patients. This chart retrieval rate is similar to that in other studies of validity of diagnoses in the VA health care system (7).
We compared the administrative data definition of presence of ICD-9 or CPT code to the gold standard of chart documentation of knee or hip replacement for each patient. We calculated sensitivity, specificity, positive and negative predictive values and the kappa statistic for administrative data. Sensitivity was the fraction of those with knee replacement or hip replacement, respectively, according to the gold standard that were correctly identified as positive by the data definition. Specificity was the fraction of those without joint replacement (knee or hip) according to the gold standard that were correctly identified as negative by the data definition. Positive predictive value was the proportion of those with a positive test definition that met the gold standard definition of medical chart documentation. Negative predictive value was the proportion of those with a negative test definition that did not meet the gold standard definition. The kappa coefficient was used to describe agreement (beyond chance) between the rheumatologist’s diagnosis in the chart (gold standard) and the 4 database definitions (13).
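The definitions above can be illustrated with a short sketch that computes each measure from a 2×2 table of database code versus chart review. The cell counts are an assumption: they are not given in the text, and were back-calculated from the knee replacement figures reported in the Results (63 database-positive and 63 chart-positive patients among 140, with sensitivity and PPV of 95%). The class and variable names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Database code (test) cross-tabulated against chart review (gold standard)."""
    tp: int  # code present, replacement documented in chart
    fp: int  # code present, no replacement documented in chart
    fn: int  # code absent, replacement documented in chart
    tn: int  # code absent, no replacement documented in chart

    @property
    def n(self) -> int:
        return self.tp + self.fp + self.fn + self.tn

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)

    def kappa(self) -> float:
        """Cohen's kappa: agreement beyond that expected by chance."""
        observed = (self.tp + self.tn) / self.n
        expected = (
            (self.tp + self.fp) * (self.tp + self.fn)
            + (self.fn + self.tn) * (self.fp + self.tn)
        ) / self.n ** 2
        return (observed - expected) / (1 - expected)


# Assumed counts, back-calculated from the reported knee replacement results;
# they are illustrative and are not copied from Table 1.
knee = TwoByTwo(tp=60, fp=3, fn=3, tn=74)

for name in ("sensitivity", "specificity", "ppv", "npv", "kappa"):
    print(f"{name}: {getattr(knee, name)():.3f}")
```

With these assumed counts the sketch reproduces the rounded knee replacement estimates reported in the Results, which is the only sense in which the counts are supported by the text.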
A receiver operating characteristic (ROC) curve analysis plotted the true positive rate against the false positive rate for the different possible cut-points of a case definition (14). A ROC curve shows how any increase in sensitivity is accompanied by a decrease in specificity. The area under the ROC measured discrimination of the database test definition, i.e., its ability to correctly classify those with and without respective joint replacement status. The 45-degree diagonal line represents the null hypothesis and a test definition that is no better than random will overlap the diagonal.
Because one may arrive at different conclusions depending on the relative importance given to sensitivity or specificity using the classic methods described above, we performed Bayesian analysis. Specificity and sensitivity can be regarded as utility measures of a test procedure under 2 unknown states of nature, i.e., having or not having the disease. A weighted average of these 2 quantities is the Bayes utility of a test. Bayes values for each diagnosis definition were calculated by giving a range of importance (P; value ranging from 0 to 1) to sensitivity and (1−P) to specificity, where 0 indicates the least importance and 1 indicates the maximum importance. For example, if sensitivity is most critical, we choose the method with the highest sensitivity, i.e., P of 1. However, in various situations sensitivity and specificity have different weights of importance. Linear combinations of sensitivity and specificity for different values of P were graphed. The analyses were performed using SPSS software, version 11.5 (SPSS, Chicago, IL) and S-plus 2000 (Mathsoft, Seattle, WA).
To examine whether the dates of knee or hip replacement surgery in VA administrative databases are accurate, we identified the cohort of patients who underwent knee or hip replacement during the fiscal years 1992–1998, since the surgery dates were available for this time period in the administrative databases. Of the original cohort of 140 patients with complete charts, 94 qualified; the remaining 46 were excluded (41 with no knee/hip replacement procedures and 5 with procedures outside the study period 1992–1998). The date difference was calculated in days between the date in the administrative database and the date in the gold standard chart documentation (most often from the operative or anesthesia note).
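As a minimal sketch of the date comparison, the snippet below computes a signed difference in days for each procedure. The sign convention (database date minus chart date) is an assumption, the dates are hypothetical, and the function name is illustrative only.

```python
from datetime import date


def date_difference_days(database_date: date, chart_date: date) -> int:
    """Signed difference in days, database date minus chart date (assumed convention)."""
    return (database_date - chart_date).days


# Hypothetical (database date, chart date) pairs, not patient data.
examples = [
    (date(1995, 3, 14), date(1995, 3, 14)),  # exact match
    (date(1996, 7, 3), date(1996, 7, 1)),    # database date 2 days later
    (date(1997, 1, 6), date(1997, 1, 20)),   # database date 14 days earlier
]

differences = [date_difference_days(db, chart) for db, chart in examples]
exact_matches = sum(d == 0 for d in differences)

print(differences, exact_matches)
```

A difference of zero counts as an exact match; any non-zero value is a discrepancy whose sign shows whether the database date falls after or before the chart date.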
The sample comprised 200 patients, of whom 150 had an ICD-9 or CPT code for knee and/or hip replacement and 50 had neither code. As described above, main analyses were done for patients with complete charts (n=140). The mean age in this cohort of 140 patients was 69.3 years (standard deviation, 12.8), and 98.6% were men (138/140).
Among the 140 patients with complete medical records, 63 had knee replacement and 64 had hip replacement according to the VA electronic databases (Table 1: sum of columns for true and false positives). Using the medical chart as the gold standard, of these 140 patients, 63 had knee replacement and 66, hip replacement (numbers derived from cross-tabulations; not depicted in Table 1).
ICD-9 and CPT codes for knee replacement were accurate with excellent PPV of 95%, sensitivity of 95%, specificity of 96% and NPV of 96%, Kappa of 0.91 (95% CI, 0.84–0.98). ICD-9 and CPT codes for hip replacement were accurate with excellent PPV of 98%, sensitivity of 96%, specificity of 99% and NPV of 96%, Kappa of 0.94 (95% CI, 0.89–1.00) (Table 1). Sensitivity analyses that included incomplete charts had little impact on these estimates. The area under the ROC curve (95% CI) for database definition of knee replacement was 0.875 (0.824, 0.926) and for hip replacement was 0.882 (0.832, 0.931), compared to the medical record documentation (Figure 1A and B).
The weighted averages of sensitivity and specificity from the Bayesian approach are plotted against various weights in Figure 1C. For example, in Figure 1C, the line for knee replacement represents the values of 0.952 × P + 0.961 × (1−P) and the line for hip replacement represents the values of 0.955 × P + 0.986 × (1−P) [sensitivity × P + specificity × (1−P)] for different values of P. If we give the most importance to sensitivity, then P=1, and if we give the most importance to specificity, then P=0. Interpretation for all other values of P can be made from the graph. In general, the Bayesian value is high for almost the entire spectrum of P from 0 to 1.
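The weighted average described above can be sketched in a few lines. The sensitivity and specificity values are those quoted in the text; the function name, the grid of weights and the variable names are assumptions made for illustration.

```python
def bayes_utility(sensitivity: float, specificity: float, p: float) -> float:
    """Weighted average: sensitivity * P + specificity * (1 - P), with P in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    return sensitivity * p + specificity * (1.0 - p)


# Weights from 0 (only specificity matters) to 1 (only sensitivity matters).
weights = [i / 10 for i in range(11)]

# Sensitivity and specificity as quoted in the text for each code definition.
knee_line = [bayes_utility(0.952, 0.961, p) for p in weights]
hip_line = [bayes_utility(0.955, 0.986, p) for p in weights]

for p, knee, hip in zip(weights, knee_line, hip_line):
    print(f"P={p:.1f}  knee={knee:.4f}  hip={hip:.4f}")
```

Because each line is a linear interpolation between specificity (at P=0) and sensitivity (at P=1), its value can never fall below the smaller of the two, which is why both lines stay high across the whole range of P.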
Of the 94 patients who had knee and/or hip replacement surgery during 1992–1998, there was a perfect match for the date(s) of surgery for 89 patients (95%). Of these 89 patients with a perfect match, 57 (64%) had a single replacement surgery and 32 (36%) had multiple replacement surgeries: 24 patients with 2 procedures, 7 patients with 3 procedures and one patient with 5 procedures.
There were 5 patients whose replacement procedure dates did not match completely with VA databases. For patient #1, the dates of the first and fourth procedures matched perfectly, but the second (left total knee replacement) and third (revision of left infected knee) procedure dates in the databases differed by +303 and −210 days compared to medical records, respectively. For patient #2, the discrepancy for the partial left hip replacement procedure was −154 days. For patient #3, the first and third replacement surgery dates matched perfectly; for the second surgery (right total hip replacement), the date discrepancy was +2 days. For patient #4, the second replacement procedure date matched perfectly, but the first replacement date (left total knee replacement) was discrepant by −14 days. For patient #5, the second procedure date matched; for the first procedure (left total knee replacement), the date was missing in VA databases. Thus, of a total of 143 knee or hip procedures, 137 (96%) procedure dates matched perfectly.
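The procedure-level total can be checked by tallying the counts given in the two preceding paragraphs. This is a sketch under stated assumptions: the number of procedures for each of the 5 discrepant patients is read off the narrative (for example, patient #2 is assumed to have had a single procedure), and the variable names are hypothetical.

```python
# Patients with a perfect date match: procedures per patient -> number of patients.
perfect_match_patients = {1: 57, 2: 24, 3: 7, 5: 1}

# Patients with at least one discrepancy: (total procedures, procedures that matched).
# Totals are assumed from the narrative description of each patient.
discrepant_patients = {
    "patient 1": (4, 2),
    "patient 2": (1, 0),
    "patient 3": (3, 2),
    "patient 4": (2, 1),
    "patient 5": (2, 1),
}

procedures_in_perfect_group = sum(k * n for k, n in perfect_match_patients.items())
total_procedures = procedures_in_perfect_group + sum(
    total for total, _ in discrepant_patients.values()
)
matched_procedures = procedures_in_perfect_group + sum(
    matched for _, matched in discrepant_patients.values()
)

print(total_procedures, matched_procedures,
      f"{matched_procedures / total_procedures:.1%}")
```

Under these assumptions the tally agrees with the 143 procedures and 137 exact matches stated in the text.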
In this study, we found excellent sensitivity, specificity, PPV and NPV of ICD-9 and CPT codes for knee and hip replacements in VA databases. The dates of surgery for knee and hip replacements in VA databases were also found to be accurate 96% of the time. To our knowledge, there are no published validation studies of ICD-9 and CPT codes for knee and hip replacements in VA databases or other U.S. databases for comparison. Thus, our study adds significantly to the current knowledge.
Several findings deserve further discussion. The high accuracy of codes for knee and hip replacement surgery in VA databases tested against the medical record gold standard is reassuring (ROC curve areas of 0.875 and 0.882, respectively). This is similar to the high accuracy for spondyloarthritis (11) and in contrast to the low accuracy rates for rheumatoid arthritis (7) in VA databases. In conjunction with our other finding that dates in VA databases were 96% accurate, this means that one can identify cohorts of patients with knee and hip replacement using a simple, valid database approach based on ICD-9 and CPT codes in VA databases. These cohorts can be used for performing research studies of comparative effectiveness and/or of post-procedure complications.
Strengths of our study include selection of a random sample, good inter-observer agreement, and standardized data abstraction by a blinded physician. Our study findings may only be applicable to VA databases and accuracy of codes in other databases such as Medicare may differ. Similar validity studies need to be performed in other databases. Charts were available only for 83% of the sample and accuracy may have been different had all charts been available; residual bias is possible.
In conclusion, ICD-9 and CPT codes for knee and hip replacements in VA administrative databases are accurate, with excellent sensitivity, specificity and positive and negative predictive values when compared with the gold standard of medical record documentation. Dates of surgery also had high accuracy rates. The findings of this study imply that cohorts of VA patients with these procedures can be identified, which can allow epidemiological and outcome studies in these populations.
Grant support: NIH CTSA Award 1 KL2 RR024151-01 (Mayo Clinic Center for Clinical and Translational Research).
J.A.S.’s time was protected for research through an NIH CTSA Award 1 KL2 RR024151-01 (Mayo Clinic Center for Clinical and Translational Research) grant. The funding source made no contribution to the design or conduct of the study, preparation of the manuscript or the decision to submit it for publication. This material is the result of work supported by the resources and the use of facilities at the Birmingham VA Medical Center, Alabama, USA.
Author Contributions: S.A.- Data abstraction, assistance in data analyses, preparation and revision of manuscript. J.A.S.- Study concept and design, Ethics committee approval, Design of data abstraction form, Data analyses, Preparation and revision of manuscript.
Financial Conflict: There are no financial conflicts related to this work. J.A.S. has received speaker honoraria from Abbott; research and travel grants from Allergan, Takeda, Savient, Wyeth and Amgen; and consultant fees from Savient and URL pharmaceuticals. S.A. has no financial conflicts.