To analyze the relationship between evidence-based imaging guidelines and Medicare payment determination, we mapped the clinical conditions and imaging procedures in the ACR Appropriateness Criteria (2005) onto the ICD-9-CM and CPT code sets, respectively (International Classification of Diseases 2005; American Medical Association 2005). The imaging appropriateness criteria come from summary documents, each covering a general clinical condition such as back pain, head trauma, or epileptic seizures. Each guideline document includes a summary of the relevant literature, along with a series of tables organized by clinical condition and variant. The accompanying table is a reproduction of one of the tables in the head trauma guideline (ACR 2005). Row entries in the tables list specific imaging procedures and the associated appropriateness rating from 1 to 9, with higher numbers being more appropriate.
To keep the analysis manageable, we chose to focus on neurologic imaging, which includes 16 guidelines containing a total of 139 individual clinical condition-variant tables. Using the row entries of these tables, we constructed a data set with 2,804 observations, each of which is a medical condition/imaging procedure combination. For each observation, independent variables include: medical condition (ICD-9-CM), imaging procedure (CPT code), appropriateness rating (1–9), modality (e.g., computed tomography), part of the body being imaged (e.g., head), and relative value units (RVU) for performing and interpreting the test.
The dependent variable, Medicare Part B payment determination for a given medical condition/imaging procedure combination, was obtained from a commercial web-based service used by many imaging providers (http://www.codecorrect.com). The coding service uses the publicly available official Medicare Part B database of payment policies to determine the coverage category, based on the carrier, procedure, and medical condition. Coverage categories include: payable, no local medical review policy (payable), not payable, and not covered (not payable). In this study, the Medicare payment variable indicates the final determination of payable or not payable. Appendix A of the Supplementary Material contains additional detail regarding construction of the data set.
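The collapse of the four coverage categories into the binary payment variable can be sketched as follows. This is only an illustration: the category strings follow the wording above, but the exact labels returned by the coding service are an assumption, and `payment_indicator` is a hypothetical helper name.

```python
# Map each coverage category to the binary payment variable
# (1 = payable, 0 = not payable). The label strings are assumed;
# the service's actual output format is not described in the text.
COVERAGE_TO_PAYABLE = {
    "payable": 1,
    "no local medical review policy": 1,  # treated as payable
    "not payable": 0,
    "not covered": 0,                     # treated as not payable
}


def payment_indicator(coverage_category: str) -> int:
    """Return 1 if the coverage category is payable, otherwise 0."""
    key = coverage_category.strip().lower()
    if key not in COVERAGE_TO_PAYABLE:
        raise ValueError(f"unknown coverage category: {coverage_category!r}")
    return COVERAGE_TO_PAYABLE[key]
```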
Before analyzing the data, we deleted 294 observations. The majority of the deleted observations (N=231) were cases in which the appropriateness rating was 0, indicating that the expert panel could come to no consensus. Other observations deleted were for procedures rarely used for neurologic imaging: thermography, to detect changes in skin surface temperature (N=12), and magnetoencephalography, which measures magnetic fields generated by neuronal activity of the brain (N=6). Finally, 45 observations were deleted because the listed imaging technique was so new that it had not yet been assigned a CPT code (e.g., functional brain magnetic resonance imaging [MRI]).
The analytic data set has 2,510 observations for 139 different medical conditions and 65 imaging procedures. Modalities include ANG (catheter angiography), CT (computed tomography), MRI, NUC (nuclear imaging), RAD (standard X-rays), SPEC (special invasive procedures, such as myelography), and US (diagnostic ultrasound, including vascular diagnostic procedures). Body part categories are Head, Neck, and Spine.
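The exclusion step can be sketched as a simple filter. The field names, the `Observation` record, and the order in which the rules are checked are assumptions made for illustration; only the three exclusion rules and the reported counts come from the text. The final assertions confirm that the reported counts are internally consistent (231 + 12 + 6 + 45 = 294, and 2,804 − 294 = 2,510).

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

# Procedures described in the text as rarely used for neurologic imaging.
RARE_PROCEDURES = {"thermography", "magnetoencephalography"}


@dataclass
class Observation:
    condition: str       # medical condition (hypothetical field name)
    procedure: str       # imaging procedure as named in the guideline table
    cpt: Optional[str]   # None when no CPT code had been assigned
    rating: int          # 0 = no consensus, otherwise 1-9


def exclusion_reason(obs: Observation) -> Optional[str]:
    """Return the reason an observation is dropped, or None to keep it.

    The order of the checks is an assumption; the text does not say how
    an observation meeting more than one rule was counted.
    """
    if obs.rating == 0:
        return "no_consensus"
    if obs.procedure.lower() in RARE_PROCEDURES:
        return "rare_procedure"
    if obs.cpt is None:
        return "no_cpt_code"
    return None


def apply_exclusions(rows):
    """Split rows into kept observations and a tally of exclusion reasons."""
    kept, reasons = [], Counter()
    for obs in rows:
        reason = exclusion_reason(obs)
        if reason is None:
            kept.append(obs)
        else:
            reasons[reason] += 1
    return kept, reasons


# Consistency check on the counts reported in the text.
REPORTED_DELETIONS = {
    "no_consensus": 231,
    "thermography": 12,
    "magnetoencephalography": 6,
    "no_cpt_code": 45,
}
assert sum(REPORTED_DELETIONS.values()) == 294
assert 2804 - sum(REPORTED_DELETIONS.values()) == 2510
```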
In interpreting the results of this study, it is important to keep in mind that the unit of analysis is a medical condition/imaging procedure combination from the ACR guidelines for neurologic imaging. Consequently, this data set contains no information about the extent to which any given medical condition/imaging procedure combination actually is submitted to Medicare for reimbursement or utilized in practice.
Descriptive Statistics
The accompanying table presents descriptive statistics for the variables used in the analysis. For Medicare payment determination, 66.2 percent of the medical condition/imaging procedure combinations were identified as payable, with the remaining 33.8 percent being not payable.
Although the appropriateness ratings in the ACR tables range from 1 (least appropriate) to 9 (most appropriate), preliminary analysis using the full range of scores showed a distribution of predominantly even-numbered scores, suggesting that the actual number of meaningful categories is fewer than 9. Consequently, we chose to collapse the appropriateness ratings into three categories: 1–3 (low), 4–6 (middle), and 7–9 (high). As shown in the descriptive statistics table, 47.0 percent of the medical condition/procedure combinations had a low appropriateness rating, with 35.7 and 17.3 percent having middle and high appropriateness ratings, respectively.
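The three-way collapse of the rating scale is a direct lookup on the cut points given above. A minimal sketch, with `appropriateness_category` as a hypothetical function name:

```python
def appropriateness_category(rating: int) -> str:
    """Collapse a 1-9 ACR appropriateness rating into low, middle, or high."""
    if not 1 <= rating <= 9:
        # A rating of 0 (no consensus) is excluded before this step.
        raise ValueError(f"rating must be between 1 and 9, got {rating}")
    if rating <= 3:
        return "low"
    if rating <= 6:
        return "middle"
    return "high"
```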
The descriptive statistics table also shows that, for modality, CT and MRI account for the majority of medical condition/imaging procedure combinations: 29.9 and 31.3 percent, respectively. Head procedures account for approximately three-quarters of the medical condition/imaging procedure combinations, reflecting the fact that the data set is restricted to neurologic imaging.
Although the RVU values were entered in numerical form, a frequency distribution indicated that they were strongly clustered at certain values. For example, 67 observations had RVU=0.47, but no observations had RVU values equal to 0.43, 0.44, 0.45, 0.46, 0.48, or 0.49; this pattern probably reflects the fact that RVU values are administratively assigned. Consequently, we created a categorical variable that assigned observations to one of five RVU ranges. The frequency distribution for RVU category, also included in the descriptive statistics table, shows that RVU category 3, which includes the intermediate values (1.5–2.5), is most common: approximately 67 percent of the imaging procedures fall into this category.
Statistical Analysis
The primary research question of this study is whether Medicare is more likely to pay for imaging procedures as the level of appropriateness increases. To investigate this question, we used both bivariate and multivariate analyses.
First, we cross-tabulated the data according to appropriateness category (low, middle, and high) and Medicare payment determination (payable, not payable). A Cochran–Armitage test for trend was used to determine whether the percentage of payable observations exhibited a significant trend across the three appropriateness categories.
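For readers unfamiliar with the trend test, the sketch below computes the standard Cochran–Armitage statistic for a 2 × k table with ordered columns. It assumes equally spaced scores (0, 1, 2) for the low, middle, and high categories and a two-sided normal approximation; the scores and any counts passed in are illustrative choices, not the study's data or its exact software settings.

```python
import math


def cochran_armitage_trend(successes, totals, scores=None):
    """Cochran-Armitage test for trend in proportions across ordered groups.

    successes[i] is the number of payable observations in group i and
    totals[i] is the group size. Returns (z, two_sided_p).
    """
    k = len(totals)
    if len(successes) != k:
        raise ValueError("successes and totals must have the same length")
    if scores is None:
        scores = list(range(k))  # equally spaced scores (an assumption)

    n = sum(totals)
    p_bar = sum(successes) / n  # pooled proportion payable

    # Score-weighted departure of observed from expected successes.
    t_stat = sum(s * (r - m * p_bar)
                 for s, r, m in zip(scores, successes, totals))

    # Variance of t_stat under the null hypothesis of no trend.
    sum_s = sum(m * s for s, m in zip(scores, totals))
    sum_s2 = sum(m * s * s for s, m in zip(scores, totals))
    variance = p_bar * (1.0 - p_bar) * (sum_s2 - sum_s ** 2 / n)

    z = t_stat / math.sqrt(variance)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_sided
```

A positive z indicates that the proportion payable rises with the appropriateness category; a z of zero corresponds to identical proportions in every category.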
Then we used logistic regression analysis to examine the effect of appropriateness category on the likelihood of Medicare payment, holding constant other factors expected to affect Medicare payment determination. The dependent variable is the Medicare payment determination (P): 1=payable, 0=not payable. The independent variable of primary interest is the appropriateness category (APPR).
Control variables included in the regression analysis to account for other factors that may affect payment coverage are modality (MOD), body part (BOD), and RVU category (RVUCAT). For a given medical condition, payment may differ between standard X-rays or angiography, which have been in use longer, and CT or MRI. Imaging modality may also control for differences in lobbying intensity among physician subspecialty groups and equipment vendors. The degree of consensus regarding diagnosis and treatment also may affect the probability of being payable. For example, the choice of imaging procedure (and whether one should be done at all) is less well defined for low back pain than for severe head trauma, where CT scanning is universally considered to be necessary. Body part is included as a control variable to serve as a proxy for this effect. The final control variable included in the regression analysis is RVU category, because RVU is directly related to the dollar amount of payment.
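The general form of the model is

$$P = f(\mathrm{APPR}, \mathrm{MOD}, \mathrm{BOD}, \mathrm{RVUCAT})$$

where P is Medicare payment determination, APPR is appropriateness category, MOD is modality, BOD is body part, and RVUCAT is RVU category.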
We specify the equation using a logistic link function, because the dependent variable is dichotomous. For the dependent variable of Medicare payment determination (P=1: payable, or P=0: not payable), the linear logistic model takes the form

$$\operatorname{logit}(p) = \ln\left(\frac{p}{1-p}\right) = \alpha + \beta_1\,\mathrm{APPR} + \beta_2\,\mathrm{MOD} + \beta_3\,\mathrm{BOD} + \beta_4\,\mathrm{RVUCAT} + \varepsilon$$

where $p$ is the probability of being payable, $\alpha$ is the intercept parameter, $\beta = (\beta_1, \beta_2, \beta_3, \beta_4)$ is the vector of slope parameters, APPR is appropriateness category (Low being the reference category), MOD is modality (CT being the reference category), BOD is body part (Head being the reference category), RVUCAT is RVU category (RVU<0.5 being the reference category), and $\varepsilon$ is the error term. We used standard maximum likelihood methods (SAS PROC LOGISTIC) to estimate the parameters. For ease of interpretation, the results are reported in terms of odds ratio estimates, along with confidence intervals.
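To make the reference-category coding and the odds ratio reporting concrete, the sketch below builds indicator variables for one observation and converts a fitted coefficient into an odds ratio with a Wald-type 95 percent confidence interval. It is an illustration in Python rather than the SAS code used in the study. The function names, the numbering of RVU categories with category 1 taken as RVU<0.5, and the choice of a Wald interval are assumptions, and any coefficient values passed in are placeholders rather than estimates from the paper.

```python
import math

# Levels of each categorical predictor; the first level listed is the
# reference category named in the text. Numbering RVU categories 1-5,
# with category 1 as RVU < 0.5, is an assumption.
LEVELS = {
    "APPR": ["low", "middle", "high"],
    "MOD": ["CT", "ANG", "MRI", "NUC", "RAD", "SPEC", "US"],
    "BOD": ["Head", "Neck", "Spine"],
    "RVUCAT": [1, 2, 3, 4, 5],
}


def dummy_code(obs: dict) -> dict:
    """Return 0/1 indicators for every non-reference level of each predictor."""
    row = {}
    for var, levels in LEVELS.items():
        if obs[var] not in levels:
            raise ValueError(f"unknown level {obs[var]!r} for {var}")
        for level in levels[1:]:  # skip the reference level
            row[f"{var}={level}"] = 1 if obs[var] == level else 0
    return row


def predicted_probability(alpha: float, betas: dict, row: dict) -> float:
    """Invert the logit: p = 1 / (1 + exp(-(alpha + sum of beta * x)))."""
    eta = alpha + sum(betas.get(name, 0.0) * x for name, x in row.items())
    return 1.0 / (1.0 + math.exp(-eta))


def odds_ratio_with_ci(beta: float, se: float, z: float = 1.959964):
    """Odds ratio exp(beta) with a Wald interval exp(beta +/- z * se)."""
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))
```

Under this coding, an odds ratio above 1 for the "APPR=high" indicator would mean that the odds of being payable are higher for highly appropriate combinations than for low-appropriateness combinations, holding modality, body part, and RVU category constant.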