Procedure length is a fundamental variable associated with quality of care, though seldom studied on a large scale. We sought to estimate procedure length through information obtained in the anesthesia claim submitted to Medicare to validate this method for future studies.
The Obesity and Surgical Outcomes Study enlisted 47 hospitals located across New York, Texas and Illinois to study patients undergoing hip, knee, colon and thoracotomy procedures. 15,914 charts were abstracted to determine body mass index and initial patient physiology. Included in this abstraction were induction, cut, close and recovery room times. This chart information was merged to Medicare claims which included anesthesia Part B billing information. Correlations between chart times and claim times were analyzed, models developed, and median absolute differences in minutes calculated.
Of the 15,914 eligible patients, there were 14,369 where both chart and claim times were available for analysis. In these 14,369, the Spearman correlation between chart and claim time was 0.94 (95% CI 0.94, 0.95) and the median absolute difference between chart and claim time was only 5 minutes (95% CI: 5.0, 5.5). The anesthesia claim can also be used to estimate surgical procedure length, with only a modest increase in error.
The anesthesia bill found in Medicare claims provides an excellent source of information for studying operative time on a vast scale throughout the United States. However, errors in both chart abstraction and anesthesia claims can occur. Care must be taken in the handling of outliers in this data.
Procedure length is a fundamental variable utilized to describe surgical performance and even quality of care, as it has been shown to be associated with postoperative complications,1-17 and is an integral part of any measurement of efficiency.18-28 In previous research we reported on a method to estimate both anesthesia and surgical procedure length using the anesthesia Medicare claim based on 1,931 high-risk general surgery and orthopedics cases performed during 1995 and 1996 in Pennsylvania. We found that we could achieve an excellent prediction of anesthesia chart time using anesthesia claims data (R2 = 0.89).29 We subsequently utilized the method to estimate procedure times in the 20 most frequent orthopedic and 20 most frequent general surgical procedures in Pennsylvania during that period.30 Other investigators have now used this technique to answer questions regarding procedure length, yet no new validations with large-scale chart abstraction have been attempted.10,31-38
Obtaining the anesthesia chart time from the Medicare claim is not straightforward, as the Medicare variable was not developed with this purpose in mind. As will be described, there is considerable opportunity for the anesthesia claim time to diverge from the chart time for a number of reasons: (1) the anesthesia claims do not always specify the exact surgical procedure associated with that claim, so matching anesthesia procedure to surgical procedure is not simple; (2) there may be mistakes in the claim; (3) there may be mistakes in the chart abstraction; (4) there may be confusion concerning times when more than one anesthesia provider was involved with the same operative case and billed for overlapping time periods (such as a physician and nurse anesthetist billing for the same case as part of the anesthesia team). For all these reasons, the claim derived time may not necessarily be correct. The intent of this paper is to demonstrate that using our proposed algorithm, the Medicare anesthesia claim can be utilized to accurately obtain procedure time information.
In this report we present chart abstraction data on a far larger data set (over 7-fold larger than our previous study), over three different states, across four types of surgery. Our original report, which was based on a case control study of mortality in Pennsylvania, analyzed patients that were uniformly very ill (all cases died within 60 days of admission and controls were matched to these cases based on similar comorbidities and age). The present report, using data that is a decade more recent than the original study, provides an update using a population far more representative of patients undergoing the procedures studied, and utilizes a new and better methodology to estimate procedure length from Medicare claims while providing a more detailed account of the errors in measurement.
Establishing that anesthesia procedure time can be accurately estimated from Medicare claims may facilitate study of both surgical and anesthesia quality throughout the entire Medicare system. It also may aid in studying important clinical questions concerning cumulative anesthesia exposure time and its relationship to outcomes.
The aim of this study is to inform researchers and policy analysts about the validity of using anesthesia claims data to determine procedure time when chart data is not available. We take the perspective that we wish to evaluate the proposed claims algorithm using chart data as a “gold” standard. However, we have seen that both claims and charts have errors in measurement and recording. In an ideal world, one could have both claims and charts to inform the presence of errors from either source. For example, chart data may have transcription errors. If a chart time was 5 min for a colectomy, it is likely an error, especially if the claim time was 205 min. Because we wish to evaluate these two measures, in reporting these data we generally do not use chart data to inform or correct claims data, and we do not use claims data to inform or correct chart data, unless specifically stated.
The Obesity and Surgical Outcomes Study (OBSOS) is a study of surgery at 47 hospitals located throughout Illinois, New York, and Texas (appendix 1). Using Medicare claims, patients who underwent one of five types of surgery between 2002 and 2006 were identified in each study hospital: (1) hip replacement or revision excluding fracture (ICD9CM Principal Procedure codes 81.51-81.53); (2) knee replacement or revision (ICD9CM Principal Procedure codes 81.54, 81.55); (3) colectomy for cancer (ICD9CM Principal Procedure codes 45.7-45.79, 45.8) and (ICD9CM Principal Diagnosis codes 153-153.9, 154-154.8, 230.3-6); (4) colectomy not for cancer (ICD9CM Principal Procedure codes 45.7-45.79, 45.8) and (ICD9CM Principal Diagnosis codes 562.1-562.13); and (5) thoracotomy (ICD9CM Principal Procedure codes 32-32.9).
Hospitals were approached by the Oklahoma Foundation for Medical Quality, and requested to abstract between 300 and 400 charts in order to collect baseline information including body mass index, admission vital signs and laboratory tests, and information on the surgical procedure including time of induction, initial surgery, closure, and time to recovery room. All data collected was deidentified and merged with encrypted Medicare files and sent to the study investigators for analysis. Approval was obtained from The Children’s Hospital of Philadelphia Institutional Review Board (IRB) (Philadelphia, Pennsylvania), the IRB associated with the PI of the study, as well as hospital specific IRB approval when requested.
The fundamental question we seek to answer is whether our algorithm for anesthesia claims data can provide valid anesthesia times. In this study we have the luxury of collecting chart derived anesthesia time to aid in validation. However, even chart derived anesthesia times are not perfect. In a study of this size, there are occasional mistakes in the abstraction of chart information when collecting anesthesia time and these mistakes may contribute to the appearance of mistakes in the claims algorithm. To make sense of these potential errors, we developed definitions for chart and claim derived variables based on an algorithm used by each definition. We define three times:
(a) “Isolated” Chart Time: A chart time cleaned in isolation from the claim time information. Obviously incorrect dates (say, off by a year or a month) were corrected using only chart information, not claims information. If times were obviously incorrect (≤ 30 min or ≥ 24 h), these were either fixed with internal information from the chart or, if no time could be determined using only the chart, the chart time was coded as missing.
(b) “Isolated” Claim Time: A time derived using the claim only, without any chart information. To clean the isolated claim information, we used our claim algorithm (described below). Where the claim time was obviously too long (≥ 24 h) or obviously too short (≤ 30 min) for the procedures studied, we coded the time as missing.
(c) “Best” Chart Time: A time derived using the best information available. Isolated chart time information was augmented with claim information. Correlating claim time with the “best” chart time is obviously tautological, but there will be instances where we provide this information in order to place an informal upper bound on the quality of the claims information. The terms “chart time” and “claim time” always refer to “isolated chart” and “isolated claim” times unless otherwise noted.
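The plausibility screen shared by the isolated chart and isolated claim definitions can be sketched as follows. This is a minimal illustration rather than the study's code: the function name `screen_time` and the use of `None` for a missing value are assumptions, and the sketch omits the earlier step in which chart times were first repaired from internal chart information where possible.

```python
from typing import Optional

MIN_PLAUSIBLE_MIN = 30        # times of 30 min or less are treated as errors
MAX_PLAUSIBLE_MIN = 24 * 60   # times of 24 h or more are treated as errors


def screen_time(minutes: Optional[float]) -> Optional[float]:
    """Return the time if it is plausible, otherwise None (coded as missing).

    Hypothetical helper: the thresholds follow the text, the name does not.
    """
    if minutes is None:
        return None
    if minutes <= MIN_PLAUSIBLE_MIN or minutes >= MAX_PLAUSIBLE_MIN:
        return None
    return minutes
```

Under these rules a recorded time of 5 min would be coded as missing, whereas a time of 205 min would be retained.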
Chart data on induction, incision, closure and recovery room times were defined for the principal procedure in a standard manner as reported previously.29 We collected chart time and date for start of induction, start of incision, end of closure and entrance to the recovery room. Claim time can only directly provide anesthesia time, as that is how anesthesiologists bill Medicare. Anesthesia time refers to time from induction to recovery room. Surgical time is only available from the chart, and is defined as cut to close time, as surgeons do not bill Medicare by the minute.39
The correlations between chart and claim times were assessed with Pearson, Spearman and Kendall correlation coefficients.40 When performing multiple regression models, we used Huber’s robust m-estimation as implemented in SAS Version 9 (SAS Institute, Inc., Cary, NC) using the bisquare weight function.41-43 In our robust regressions, we report as R2 (or rank R2) the square of the Spearman rank correlation between the observed and expected y’s, which is analogous to the square of the Pearson correlation between the observed and predicted ranks of y = chart time. This prevents one or two peculiar claims from greatly increasing or decreasing the R2.
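The rank R2 described above can be illustrated with a short sketch that ranks the observed and predicted values (averaging tied ranks) and squares the Pearson correlation of the ranks. The function names are hypothetical and the code is an illustration in plain Python, not the SAS code used in the study.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def rank_r2(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Square of the Spearman rank correlation between observed and predicted."""
    rho = pearson(average_ranks(observed), average_ranks(predicted))
    return rho ** 2
```

Because only ranks enter the calculation, a single grossly wrong time changes the statistic no more than any other observation that shifts rank by the same amount.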
In order to ascertain the anesthesia time from the Medicare claims, we linked to the index admission in the Inpatient file all the claims in Part B that pertain to that patient. We then selected only bills that identified an anesthesia service; these are bills with HCPCS codes in the range of 00100 to 01999. We applied the following algorithm (see fig. 1), which ensured that we matched the principal surgical procedure in the hospital’s inpatient claim with the appropriate anesthesia bill from the provider in Part B. The first step was to align the dates in the Inpatient and Part B files. We tried to match the anesthesia date in Part B to the surgical procedure date in the Inpatient file by choosing the anesthesia bill in Part B whose interval between the “first expense date” and the “last expense date” included the “procedure date” of the principal procedure. If there was no overlap between any of the expense dates and the procedure date of the principal procedure (step 2), we used the interval between the “from date” and the “through date” in Part B that included the “procedure date” of the principal procedure. If there was no overlap between the “from” and “through” dates of anesthesia bills and the procedure date of the principal procedure (step 3), we broadened the time frame in the hospital file so that the index admission and discharge dates in the hospital bill would overlap the “from”-“through” date interval in the provider bill. If multiple anesthesia bills were found that matched in terms of the time frame, we calculated each length and chose the bill with the longest time. If more than one provider reported the same longest time, we counted it only once to avoid double counting. However, it was possible that the providers traded off times and did not perform services concurrently, as assumed by our algorithm.
As will be seen, the algorithm performs well despite the potential undercounting of time when anesthesia providers traded off times, leading us to conclude that for the vast majority of cases the longest anesthesia time billed reflects the total time needed for the entire anesthesia procedure.
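The three-step date matching and the longest-bill rule can be sketched as follows. This is an illustration under stated assumptions, not the study's implementation: the record layout (`AnesthesiaBill` and its field names) and the function names are hypothetical, and each bill is assumed to already carry its billed time in minutes.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional


@dataclass
class AnesthesiaBill:
    # Hypothetical record layout; field names are illustrative only.
    hcpcs: str
    first_expense: date
    last_expense: date
    from_date: date
    through_date: date
    minutes: float


def is_anesthesia(hcpcs: str) -> bool:
    """True for HCPCS codes in the anesthesia range 00100 to 01999."""
    return len(hcpcs) == 5 and hcpcs.isdigit() and 100 <= int(hcpcs) <= 1999


def select_bill(bills: List[AnesthesiaBill], procedure_date: date,
                admit: date, discharge: date) -> Optional[AnesthesiaBill]:
    """Pick the anesthesia bill for the principal procedure, or None."""
    bills = [b for b in bills if is_anesthesia(b.hcpcs)]
    steps = [
        # Step 1: expense dates include the procedure date.
        lambda b: b.first_expense <= procedure_date <= b.last_expense,
        # Step 2: from/through dates include the procedure date.
        lambda b: b.from_date <= procedure_date <= b.through_date,
        # Step 3: from/through interval overlaps the admission interval.
        lambda b: b.from_date <= discharge and b.through_date >= admit,
    ]
    for matches in steps:
        candidates = [b for b in bills if matches(b)]
        if candidates:
            # The longest bill is used; tied bills are counted only once.
            return max(candidates, key=lambda b: b.minutes)
    return None
```

Each later step is attempted only when the preceding, stricter step finds no bill, so a bill matched on expense dates always takes precedence over one matched only on the admission interval.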
The length of the anesthesia is calculated by multiplying the anesthesia time unit variable in the Physician Part B file by 15 min per unit. Time units are identified by the variable “mile/time/units/services indicator code”; when this variable equals 2, the units are anesthesia time units. For example, a time unit value of “25” implies 2.5 time units (CMS reports units starting at the tenths place and does not provide the decimal point). We therefore multiply 2.5 units by 15 min/unit to get 37.5 min billed by the anesthesia provider.
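The unit conversion can be written out as a small sketch. The function and argument names are hypothetical, and the indicator is assumed to arrive as a string or integer that can be compared with "2".

```python
from typing import Optional, Union

ANESTHESIA_INDICATOR = "2"   # indicator value marking anesthesia time units
MINUTES_PER_UNIT = 15        # one anesthesia time unit equals 15 min


def billed_minutes(raw_units: int,
                   indicator: Union[str, int]) -> Optional[float]:
    """Convert the raw unit field to minutes, or None if not anesthesia time.

    The raw field carries an implied decimal point at the tenths place,
    so a stored value of 25 means 2.5 units.
    """
    if str(indicator).strip() != ANESTHESIA_INDICATOR:
        return None
    return raw_units / 10 * MINUTES_PER_UNIT
```

For the example in the text, `billed_minutes(25, "2")` returns 37.5.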
Table 1 displays the distribution of patient and hospital characteristics in the OBSOS study population. The OBSOS study was not a random sample of hospitals in the three states, but did provide a representative cross-section of hospitals and patients.
Table 2 describes comorbidities in each of the five procedure categories by study and non-study hospital groups. Again, the sample of 47 study hospitals is comprised of patients that look fairly similar to nonstudy hospital patients.
Table 3 provides a comparison of missing data as defined by the claim time and chart time variables from the 15,914 patients evaluable in the OBSOS study. Here we see that there were missing data in both the claims and chart derived times. There were 1,187 patients or 7.5% of the claims with missing times, and 385 patients or 2.4% of abstracted charts with missing times, but there was almost no overlap between the patients with missing claim time and those with missing chart time, with only 27 patients missing time data from both claim and chart. The distribution of anesthesia chart times in those patients who were missing anesthesia claim times was almost identical to the distribution of anesthesia chart times in those not missing anesthesia claim times. Similarly, the distribution of anesthesia claim times in those patients who were missing anesthesia chart times was almost identical to the distribution of anesthesia claim times in those patients who were not missing anesthesia chart times (see fig. 2). Figure 2 suggests no interesting relationship between missing times on one variable and recorded times on another.
Using the 14,369 patients who had both chart and claim times, we next studied the correlations between chart and claim times. Table 4 provides these correlations using the Spearman, Pearson and Kendall τ statistics with their confidence intervals and p-values. The correlations between chart and claim times were very high, ranging from 0.85 for Kendall’s τ to 0.94 for the Spearman. We also provide the probability of concordance associated with the Kendall coefficient τ, which is equal to (τ+1)/2; for two patients, it is the probability that the chart and claim will agree about which patient had the longer anesthesia time. The probability of concordance was 0.93. The median absolute difference between chart and claim was only 5 min (95% CI 5.0, 5.5), and the median difference was 4.5 min.
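The probability of concordance and the median absolute difference can be illustrated on a toy data set. The sketch below uses the simple tau-a form of Kendall's coefficient, which ignores tie corrections; whether the study used a tie-corrected variant is not stated here, so this choice, like the function names, is an assumption.

```python
from itertools import combinations
from statistics import median
from typing import Sequence


def kendall_tau_a(x: Sequence[float], y: Sequence[float]) -> float:
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


def concordance_probability(tau: float) -> float:
    """Probability that chart and claim agree on which patient took longer."""
    return (tau + 1) / 2


def median_absolute_difference(chart: Sequence[float],
                               claim: Sequence[float]) -> float:
    """Median of |chart - claim| over paired observations."""
    return median(abs(a - b) for a, b in zip(chart, claim))
```

With τ = 0.85, `concordance_probability` gives 0.925, which rounds to the reported 0.93.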
Table 5 provides similar information as table 3 but compares claim time to “best” chart time. As expected, because of the definition of “best” chart time, there is better correlation between the claim time and the “best” chart time—although this is presented just to help bound the correlation, as for this calculation information from the claim was used to correct the chart time as was described in methods.
Table 6 presents the distribution of claim times, chart times and best chart times for each of the five procedure groups in the study (hip replacement or revision, knee replacement or revision, colectomy for cancer, colectomy not for cancer, and thoracotomy).
A Bland-Altman plot is presented in figure 3, which displays the difference between anesthesia chart and anesthesia claim times versus the average value of each pair. The vast majority of points show little difference between chart and claim, but there are a few outliers with respect to both measures. As there are 14,369 points on this graph, it should be remembered that outliers represent only a very small fraction of patients. The “wings” of the Bland-Altman plot do show rare large outliers. Eighty percent of the pairs showed differences between −16 and 0.5 min, and 95% of the pairs show differences between −49 and 16 min.
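The quantities behind a Bland-Altman plot are simple to compute: each pair contributes its average (horizontal axis) and its difference (vertical axis), and the central 80% range is bounded by the 10th and 90th percentiles of the differences. The sketch below takes the difference as chart minus claim, which is an assumption consistent with the mostly negative range reported; the function names and the percentile interpolation method are also illustrative choices.

```python
from statistics import quantiles
from typing import List, Sequence, Tuple


def bland_altman_points(chart: Sequence[float],
                        claim: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (average, chart - claim) for each chart/claim pair."""
    return [((a + b) / 2, a - b) for a, b in zip(chart, claim)]


def central_80_interval(differences: Sequence[float]) -> Tuple[float, float]:
    """Return the 10th and 90th percentiles of the differences."""
    deciles = quantiles(differences, n=10, method="inclusive")
    return deciles[0], deciles[-1]
```

Reporting percentile ranges rather than mean ± 2 standard deviations keeps the summary from being dominated by the rare large outliers in the wings of the plot.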
We next asked whether we could detect any appreciable difference in the discrepancy between claim and chart depending on type of surgical procedure or on the specific study hospital. For each regression using m-estimation our dependent variable is chart time, and the independent variables are claim time as well as hospital identifiers and/or procedure types, depending on the model. We use m-estimation because we observe that there are some extreme outliers in both the claim and the chart, and as our work has suggested in the past29, m-estimation is less sensitive to such errors.41,42 Table 7 displays four models. Model 1 simply predicts chart time using claim time. Model 2 adds into Model 1 individual hospitals. Model 3 adds procedure type to Model 1. Finally Model 4 adds both hospital and procedure variables to Model 1.
Model 1 suggests that we can estimate chart time very well with claim time. The coefficient on the claim time was nearly 1, and the intercept was −1.21 min. The model R2 was 0.89. Model 2 asks whether the relationship between claim and chart changes with the hospital. We do observe that the hospital does have a significant influence on the model, but the effects were extremely small. The hospital with the largest effect only increased the difference between the claim and the chart by 15 min (results not shown). We next asked if the individual procedure influenced the relationship between claim and chart. Again, we observed statistically significant but clinically insignificant effects from procedure, with effects on the order of only 1 to 2 min. Finally, Model 4 includes both hospital and procedure variables. Again, we see no appreciable difference in the estimates. Hence, a single formula that is not adjusted for procedure or hospital appears reasonable for this data set.
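To illustrate why a robust fit is preferred for a model such as Model 1, the following sketch fits chart time on claim time by iteratively reweighted least squares with bisquare weights. It is a simplified stand-in, not the SAS procedure used in the study: the tuning constant 4.685, the scale estimate based on the median absolute residual, and all names are conventional or hypothetical choices made for this example.

```python
from statistics import median
from typing import Sequence, Tuple


def weighted_line(x: Sequence[float], y: Sequence[float],
                  w: Sequence[float]) -> Tuple[float, float]:
    """Weighted least-squares line; returns (intercept, slope)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def bisquare_fit(x: Sequence[float], y: Sequence[float], c: float = 4.685,
                 max_iter: int = 50, tol: float = 1e-8) -> Tuple[float, float]:
    """Fit y = intercept + slope * x, down-weighting large residuals."""
    intercept, slope = weighted_line(x, y, [1.0] * len(x))  # start from OLS
    for _ in range(max_iter):
        resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
        scale = median(abs(r) for r in resid) / 0.6745
        if scale == 0:
            break
        # Bisquare weights: zero beyond c * scale, smoothly decreasing inside.
        w = [(1 - (r / (c * scale)) ** 2) ** 2 if abs(r) < c * scale else 0.0
             for r in resid]
        new_intercept, new_slope = weighted_line(x, y, w)
        done = (abs(new_slope - slope) < tol
                and abs(new_intercept - intercept) < tol)
        intercept, slope = new_intercept, new_slope
        if done:
            break
    return intercept, slope
```

In a toy data set where chart and claim agree to within 2 min except for one case with a claim of 330 min and a chart of 45 min, ordinary least squares yields a slope near 0.5, whereas the bisquare fit stays close to 1.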
In a related set of analyses we ran a series of models to explore whether patient procedure, patient characteristics and hospital characteristics could predict the difference between anesthesia chart time and anesthesia claim time. While some results were statistically significant, all effects were very small and not of clinical interest. The median absolute error for the model using procedure as an independent variable was 4.39 min, with only 2.39 min separating the most extreme procedures. Adding patient characteristics did not improve the median absolute error, and further adding hospital characteristics only reduced the median absolute error to 3.88 min. Hence, patient, procedure and hospital characteristics did not influence the errors between anesthesia chart time and claim time in this data set.
One very likely application of the algorithm we use to derive information from the anesthesia claim is the estimation of surgical time. Since only anesthesiologists, and not surgeons, bill by the minute, we do not have a direct bill for surgical time. We can, however, observe how well the Medicare anesthesia claim information can describe surgical time. We might imagine that the difference between the anesthesia claim time and the surgical chart time may be more susceptible to the influence of procedure type and hospital than when using anesthesia claim time to predict anesthesia chart time. This is because the style of practice in a hospital may dictate different styles of coordination between the surgeon and the anesthetist. Table 8 displays the same models as seen in table 7, but here we have substituted surgical chart time for anesthesia chart time. There are some immediate differences between table 7 and table 8. First, we see that there is generally a 22-min gap between the total surgical time and the total anesthesia time. Luckily for the patient, the intercept term is negative, suggesting anesthesia time is longer than surgical time. We also observe that the influence of hospital style does play a slightly larger role in the regression, as does the influence of the procedure. However, as before, both effects were quite small, typically amounting to only a few minutes difference by institution or procedure. When we reran all models, substituting anesthesia chart time for anesthesia claim time, we obtained almost identical coefficients, with slightly smaller median errors. Figure 4 describes the relationship between surgical chart time and predicted surgical time derived from Model 1 in table 8. Using anesthesia claims to predict surgical time was not as accurate as using anesthesia claims to predict anesthesia chart time, yet 80% of paired differences were between −24 and 19 min.
Finally, we wish to describe the relationship between the anesthesia claim time and the anesthesia chart time that is corrected by claims when the chart time is missing. While the relationship is tautological, in that we used some claim time information to “correct” obvious chart errors, we only corrected these errors when we had no consistent chart information to make a judgment. In other words, for 146 patients we corrected the charts by using the claims, and the odds are great that these were fairly close (as there is only a 5-min median absolute time difference). We present this in order to better describe how well an individual using the anesthesia claims could mimic the actual anesthesia time as determined as best as possible. As can be seen in table 9, the results were quite similar to the previous findings. Hence, using the claim time does an excellent job at predicting chart time. Figure 5 displays a Bland-Altman plot for the Best Anesthesia Chart Time and the Anesthesia Claim Time. These plots look almost identical to the Anesthesia Chart Time versus Anesthesia Claim Time displayed in figure 3.
The OBSOS provided us with a unique opportunity to study how Medicare claims can be used to estimate procedure length, because the study was designed to measure operative time and entailed the merging of chart information with Medicare claims. Procedure length is a fundamental variable associated with quality and outcomes. Many have published on procedure length,1-17 often using chart reviews at single institutions.1-7 If anesthesia claims could be utilized to reliably provide valid information on procedure length, then many questions now relying on single institution studies with relatively small data sets could be answered with much larger and more representative samples. For example, large-scale nationwide studies of anesthesia claim time can be utilized to study a vast assortment of questions involving both clinical and health services research in anesthesiology and surgery. On the clinical side, better measures of anesthesia cumulative exposure may provide methods to study potential toxicities associated with anesthetic agents, and may provide us with a better way to study and develop models that assess postoperative risk due, in part, to deviations from the expected anesthesia time for the actual procedure performed. On the health services side, questions of quality can be studied, with benchmarking across all hospitals that care for Medicare patients. Examples include the study of racial disparities in procedure length inside and between hospitals throughout the United States, again, based on the actual procedures performed.
The results provided in the present report give the potential investigator a higher degree of confidence that anesthesia claims can be utilized to derive anesthesia time. The data presented in this study represents far more observations than those we reported on three years ago. Previously, using data from 1995-1996, we had analyzed 1,931 Medicare patients in 187 hospitals in the state of Pennsylvania. When we compared the chart to the claim, we observed a median absolute error of 5.49 min.29 In the present study, we report on the abstraction of 14,369 Medicare charts in 3 states over 47 hospitals. We find a median absolute difference that was very small, only 5.0 min. In other words, we can be quite certain that for the vast majority of cases, anesthesia claims work well at estimating anesthesia time.
In the present study, like our original study, we did observe occasional errors that were substantial. Therefore, as in the past report, we suggest the use of regression techniques that down-weight outliers when fitting models. Such techniques are ideally suited for problems such as ours, where claims information is usually correct but may occasionally fail to reflect the true procedure length due to mistakes in the algorithm that links claim to procedure, mistakes in the algorithm identifying whether anesthesiologists worked sequentially or concurrently, or mistakes in coding. As it stands, in situations where there is no single member of the anesthesia team that bills for the entire procedure, the claim may underestimate the chart. Furthermore, we may observe situations where the claim overestimates the chart information. These instances may reflect mistaken linkages between the specific procedure for which the claim was made. As anesthesia bills often use a “from-through” date that encompasses multiple procedures, one may mistakenly assign excess time to a single procedure that mistakenly reflects other procedures’ time.
Though this paper has focused on the potential use of anesthesia claim time as a dependent variable (an outcome variable) for many analyses, anesthesia claim time can also be utilized as an independent variable in models designed to predict outcomes. When a claim time is used as the dependent (y) variable in regression, it is important to fit these models using a robust method such as m-estimation,44 because claim times closely reproduce chart times with rare but large errors. When a claim time is used as an independent (x) variable in a model, it is similarly important to fit these models using bounded-influence methods.45,46
While we want investigators to be aware of the potential pitfalls in using claims to determine anesthesia and surgical time, we do not want to overstate these problems. The correlations we report, now in two separate studies spanning over 8 yr of data, and close to 16,000 observations, are high and will be useful for applying the claims estimates to many important questions being studied concerning procedure time.
It is also interesting to note that billing styles were fairly similar across hospitals. We generally found only small differences between hospitals, with the exception of a few that were associated with 10 to 15 min claim-chart time differences. Furthermore, the median difference between the claim time and the chart time was 5 min. This number would not appear to be a coincidence. As one anesthesia time unit equals 15 min, a policy of always rounding up to the higher unit would lead to about a 5 min difference on average (assuming a uniform distribution for the fraction of units remaining before round-up).
In summary, we have demonstrated that the Medicare anesthesia claim can be utilized to construct an excellent measure of procedure time. Future investigators can feel confident that they may utilize our algorithm to better study procedure length through using the Medicare claim, without the need to collect procedure length information directly from the chart.
Procedure length from charts was compared to minutes billed in 15,914 Medicare anesthesia claims over 47 hospitals. There was very good concordance between chart and claim, with a median absolute difference of only 5 min.
We thank Traci Frank, A.A., Administrative Coordinator1, Rebecca Jones, M.S.N., R.N., Measures Project Coordinator7, and Min Wang, M.H.S., Project Manager1 for their assistance with this manuscript. Individuals who assisted in the acquisition of data at the study hospitals are acknowledged in appendix 2.
Funding Source: National Institute of Diabetes and Digestive and Kidney Disease, Bethesda, Maryland (Grant #R01-DK07-3671)
Received from the Center for Outcomes Research, The Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania.