This secondary data analysis found that persons with more severe pre-treatment back dysfunction demonstrated the greatest benefits from acupuncture or simulated acupuncture treatment, as measured by changes on the Roland score. Few other significant interactions emerged and none were consistent for both short- and long-term follow-ups. Regression to the mean is probably responsible for some of the greater improvement after 8 weeks in members of the usual care group with the worst back pain, given our finding that individuals in the usual care group improved more if their baseline dysfunction scores were worse. However, this phenomenon is unlikely to explain why the difference between usual care and acupuncture at 8 weeks increased as the baseline dysfunction scores increased. Thus, we have demonstrated interaction on an additive scale with the measurement of the Roland scores as absolute changes from baseline. This may merely reflect the greater opportunity for absolute change in those with higher baseline Roland scores. However, viewed in terms of relative percentage change from baseline, which is consistent with working on a multiplicative scale, the acupuncture (and simulated acupuncture) groups reduced their dysfunction approximately 30% more than did the usual care group, no matter what the baseline dysfunction score actually was. Thus, there was no interaction on the multiplicative scale. As is commonly found, whether an interaction is detected here depends on the scale that is used.
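The dependence on scale can be illustrated with a small numerical sketch. The baseline scores and the relative reductions below (15% for usual care, 45% for acupuncture, a gap of 30 percentage points) are hypothetical values chosen only for illustration. They are not estimates from the trial, and the function names are ours.

```python
# Illustrative sketch only: all numbers are hypothetical assumptions,
# not estimates from the trial.
USUAL_CARE_REDUCTION = 0.15   # assumed relative reduction from baseline
ACUPUNCTURE_REDUCTION = 0.45  # assumed relative reduction from baseline


def absolute_change(baseline, relative_reduction):
    """Absolute improvement when a score falls by a fixed fraction of baseline."""
    return baseline * relative_reduction


def scale_comparison(baseline):
    """Return the between-group difference on the additive and relative scales."""
    uc_change = absolute_change(baseline, USUAL_CARE_REDUCTION)
    acu_change = absolute_change(baseline, ACUPUNCTURE_REDUCTION)
    additive_difference = acu_change - uc_change            # points on the score
    relative_difference = additive_difference / baseline    # fraction of baseline
    return additive_difference, relative_difference


if __name__ == "__main__":
    for baseline in (6, 12, 18):  # hypothetical baseline dysfunction scores
        additive, relative = scale_comparison(baseline)
        print(f"baseline={baseline:2d}  "
              f"additive difference={additive:.1f} points  "
              f"relative difference={relative:.0%}")
```

Under these assumptions the absolute between-group difference grows in proportion to the baseline score, which appears as an interaction on the additive scale, while the relative difference stays the same at every baseline, which appears as no interaction on the multiplicative scale.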
Our findings are generally consistent with those of the few prior studies attempting to identify subgroups of individuals who respond best to specific treatments for back pain. Typically, these studies report few strong and consistent characteristics that identify subgroups of treatment responders for a specific intervention [15]. However, most of these studies are large enough to identify only the strongest interactions. In one of the largest pragmatic trials of acupuncture for chronic back pain, which included over 2000 patients, Witt et al. [28] found that 3 of 9 evaluated patient characteristics (younger age, worse baseline back dysfunction, more than 10 years of education) were effect modifiers indicating better response to acupuncture. One of the challenges in comparing results across studies is that each study typically assesses a somewhat different list of characteristics as potential moderators of response to treatment.
Our finding that pre-treatment expectations did not predict response to specific types of acupuncture differs from the findings of previous researchers. Kalauokalani [22] and Linde [23] both found that more optimistic expectations of treatment led to better outcomes from acupuncture. Thomas's [29] results were more complex. She found no benefit of acupuncture over usual care for persons with positive beliefs about acupuncture, but found acupuncture more effective for those who were agnostic about its benefits.
The results of these trials cannot be directly compared to our study because of differences in the way the data were collected and analyzed. However, our finding that individuals who could not provide a rating of their expectation of acupuncture's effectiveness did not have worse outcomes clearly demonstrates that our findings differ from those of Linde [23]. In that study, participants who could not rate their expectation of acupuncture's effectiveness did worse than the others, who nearly always believed that acupuncture would be "effective" or "very effective". Given the variability in the findings across these studies, further research is needed to understand the different effects of pre-treatment expectations on outcomes of acupuncture care.
Our study has a number of limitations. For one thing, our study only explored characteristics of individuals that were predictive of superior outcomes for acupuncture (or a type of acupuncture) versus usual care. Conceivably, our findings may have differed had we used a different comparison group. We did not collect data on fear avoidance, which is associated with poor prognosis in some data sets [27]. If patients with higher levels of fear avoidance were less likely to improve from acupuncture, our lack of information on this variable would be a limitation of our study.
Our study was large and had high follow-up rates. However, the sample size required to detect an interaction is about four times that required to detect a main effect of similar magnitude [30]. Thus, we would be able to detect only large interactions.
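One common way to arrive at a factor of four is sketched below. It is not necessarily the argument given in [30]. It assumes a balanced two-by-two design (two treatment arms crossed with a binary patient characteristic), equal outcome variance in every cell, and the same number of participants per cell. The function names and input values are illustrative.

```python
# Sketch under stated assumptions: balanced 2x2 design, common variance
# sigma2 in every cell, n_per_cell participants in each of the four cells.

def var_main_effect(sigma2, n_per_cell):
    """Variance of the treatment main effect: a difference of two arm means.

    Each arm pools two cells, so each arm mean is based on 2 * n_per_cell people.
    """
    n_per_arm = 2 * n_per_cell
    return sigma2 / n_per_arm + sigma2 / n_per_arm


def var_interaction(sigma2, n_per_cell):
    """Variance of the interaction: a difference of differences of cell means.

    Four independent cell means each contribute sigma2 / n_per_cell.
    """
    return 4 * sigma2 / n_per_cell


def sample_size_multiplier(sigma2=1.0, n_per_cell=50):
    """Factor by which n must grow for the interaction estimate to match
    the precision of the main-effect estimate."""
    return var_interaction(sigma2, n_per_cell) / var_main_effect(sigma2, n_per_cell)


if __name__ == "__main__":
    print(f"variance ratio: {sample_size_multiplier():.2f}")
```

Because variance falls in proportion to sample size, a variance ratio of four means that roughly four times as many participants are needed to estimate an interaction with the same precision as a main effect of the same size. If the interaction of interest is smaller than the main effect, the required multiple is larger still.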
Finally, as with all post-hoc analyses, the results must be interpreted with caution and need to be replicated in other data sets. We suspect that replication would best be undertaken in the context of a meta-analysis using individual patient level data from all included studies, as that would increase the sample size substantially [31]. The Acupuncture Trialists' Collaboration, a new international collaboration among researchers to share data and conduct meta-analyses from large trials of acupuncture for pain, may be well-suited to conduct such analyses.
Researchers have employed two different approaches in their attempts to identify sub-groups of persons with low back pain that would benefit from specific treatments. Some studies, including ours, have searched for sub-groups using regression analyses to see what characteristics are associated with superior outcomes for specific treatments. Others have developed "clinical prediction rules" wherein patients are initially categorized into more homogeneous groups based on clinical findings and pain history. Such rules can then be tested in studies where patients are given the type of treatment that is believed better able to address their underlying problem [32]. For example, Childs [11] used this approach to validate a clinical prediction rule for spinal manipulation.
Clinical prediction rules have yet to be identified for acupuncture. In principle, various Chinese medicine findings, including Chinese medicine diagnosis, might be useful for developing such a rule. In practice, however, progress in this area has been limited because there is typically poor diagnostic concordance among TCM practitioners [33] and because individual patients with chronic low back pain are often given multiple TCM diagnostic labels [34
Thoughtful collaboration among practitioners and researchers may ultimately lead to the development of prediction rules that match patients to the most appropriate health care provider. Such collaborations are most likely to be fruitful if they initially focus on developing comprehensive models that incorporate the physiological underpinnings of the biopsychosocial model [36