The comprehensive network meta-analysis reported here, in which physical treatments for osteoarthritis of the knee were compared with each other within a coherent framework, provides the first estimate of the relative effect of these treatments, which is essential for decision makers. A network meta-analysis provides a basis for synthesising all the available evidence in a consistent framework, obviating the need to make decisions by subjective inferences from disparate data. Numerous systematic reviews, some summarised in a review of reviews,29 have evaluated the interventions (or classes of interventions) included in this review. However, our analysis uses the most practical methods currently available for comparing a large number of different types of treatment, enabling us to compare the physical treatments (including acupuncture) with each other.
Of the 22 interventions evaluated, eight – interferential therapy, acupuncture, TENS, pulsed electrical stimulation, balneotherapy, aerobic exercise, sham acupuncture, and muscle-strengthening exercise – produced a statistically significant reduction in pain, compared with standard care. Of these, only acupuncture and muscle-strengthening exercise were represented by more than three trials in the sensitivity analysis of better-quality studies, with acupuncture (11 trials) being statistically significantly better than muscle-strengthening exercise (9 trials). Acupuncture and balneotherapy (1 trial) were the highest-ranked interventions, although there is some uncertainty around these rankings. For the better-quality placebo-controlled studies, interferential therapy (1 trial) showed a strong effect when compared to placebo.
Like a standard meta-analysis, a network meta-analysis requires an assumption of exchangeability between the trials. We sought to minimise concerns which might arise from within- or between-intervention heterogeneity by using an age restriction as part of our inclusion criteria, and by excluding interventions consisting of more than one physical treatment. The patient characteristics appeared broadly comparable across interventions. Some clinical heterogeneity is inevitable in a wide-ranging study such as this, but baseline pain did not appear to vary systematically between interventions, as far as it was possible to tell – given the wide variation of scales used. We conducted sensitivity analyses excluding trials causing heterogeneity. Our analyses used a random effects model to incorporate heterogeneity, and we undertook an evaluation of levels of inconsistency and model fit. Despite this, it is possible there are unknown confounding factors affecting the results of indirect comparisons, although in our results heterogeneity is accounted for in the credible intervals. The majority of trials which used placebo interventions studied electrical or electromagnetic interventions; it is not unreasonable to assume the placebo effects were similar (since the interventions were similar). A further strength of our review is that trials covering a diverse range of interventions were all assessed using the same quality assessment tools; this enabled fair comparisons to be made by evaluating the reliability of the evidence base for each intervention.
However, although we conducted a sensitivity analysis of the better-quality studies, this resulted in fewer trials per comparison, and fewer network loops, meaning there is greater uncertainty about the true heterogeneity and about the differences between the direct and indirect evidence. Fewer loops in relation to the size of the network means there are fewer data with which to quantify inconsistency, and so it is possible that uncertainty associated with inconsistency is not captured in the results. Further limitations are that we could not include all studies in our analyses due to the variable reporting of pain results, and that the end-of-treatment data available were mostly short-term: of the trials which did investigate medium- or long-term effectiveness, only a few could provide the data required by our analyses. However, given that the treatments under consideration are not intended as cures, and that any treatment effect is expected to attenuate over time, a comparison of their maximum effect is not without merit.
It is important that our results are evaluated in context. Methodological limitations exist which are often inherent and unavoidable in clinical trials of physical treatments. Additionally, we found flaws which trialists could have avoided by using better methodology and reporting practices. Most of the studies in our review were rated as being of poor quality, and even many of the better-quality studies were pragmatic trials, where blinding of patients was not possible, i.e., most studies are likely to have been subject to some form of bias. For the trials where patients were not blinded, and treatments were compared with standard care, the overall treatment effect is likely to incorporate non-specific (placebo) effects. We assumed that such non-specific effects were similar across all interventions, but variation may in fact be present.
In light of our results, consideration of what might be the true (or specific) effect of acupuncture is warranted. A Cochrane review reported a statistically significant, clinically relevant, short-term improvement in pain, similar to our findings (acupuncture vs waiting list control, SMD −0.96, 95% CI: −1.19 to −0.72)30. The comparison of acupuncture with sham acupuncture also showed a similar effect to ours, and was described as being clinically irrelevant (SMD −0.35, 95% CI: −0.55 to −0.15). However, the largest study in this Cochrane analysis, which indicated no significant difference, conducted the primary pain assessment 7 weeks after the end of treatment for many participants, and was one of two trials which used an intensive sham needling technique, which may have had physiologic effects. Also, our analysis included a recent large trial (discussed below) which used what appeared to be a very active sham. It is therefore possible that the pooled results from both reviews underestimate the short-term effect of acupuncture. It is also worth noting that the effect size of acupuncture vs sham is of the same order as that seen for NSAIDs vs placebo (SMD 0.32, 95% CI 0.24–0.39), which has also been described as being too small to be clinically significant31. An analysis of individual patient data on patients with knee osteoarthritis was recently reported for acupuncture studies (in which the allocation concealment methods had to be unambiguously adequate)32. These results also indicated acupuncture to be more effective than sham acupuncture, and also found a smaller effect size than when acupuncture was compared with no acupuncture (usual care) controls. Non-specific effects therefore seem to play an important role in the pain-alleviating effects of acupuncture. However, for our comparisons, the lack of blinding in trials of the other interventions (where blinding was not possible) in our network of better-quality studies would also be likely to result in non-specific effects contributing to results; it is reasonable to assume that fair comparisons between treatments have therefore been made.
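To make the role of non-specific effects concrete, the two Cochrane estimates quoted above can be combined in a simple adjusted indirect comparison (the Bucher method), giving a rough estimate of sham acupuncture vs waiting list control. This is a minimal illustrative sketch, not the Bayesian random effects network model used in our analyses. It assumes that the two pooled estimates are independent (which would not hold if multi-arm trials contributed to both), that the reported intervals are symmetric normal-theory intervals, and that the trials are exchangeable. The function names `se_from_ci` and `indirect_smd` are hypothetical helpers introduced only for this illustration.

```python
import math

Z95 = 1.96  # normal quantile for a two-sided 95% interval


def se_from_ci(lower, upper, z=Z95):
    """Back-calculate a standard error from a symmetric 95% interval."""
    return (upper - lower) / (2.0 * z)


def indirect_smd(d_ac, se_ac, d_ab, se_ab, z=Z95):
    """Bucher adjusted indirect comparison of B vs C via common arm A.

    d_ac: effect of A vs C; d_ab: effect of A vs B.
    Assumes the two estimates are independent (an assumption here).
    """
    d_bc = d_ac - d_ab                         # B vs C = (A vs C) - (A vs B)
    se_bc = math.sqrt(se_ac ** 2 + se_ab ** 2)  # variances add
    return d_bc, se_bc, (d_bc - z * se_bc, d_bc + z * se_bc)


# Pooled SMDs quoted in the text (A = acupuncture, B = sham, C = waiting list)
d_acu_wl, se_acu_wl = -0.96, se_from_ci(-1.19, -0.72)
d_acu_sham, se_acu_sham = -0.35, se_from_ci(-0.55, -0.15)

d_sham_wl, se_sham_wl, ci_sham_wl = indirect_smd(
    d_acu_wl, se_acu_wl, d_acu_sham, se_acu_sham
)
print(f"sham vs waiting list: SMD {d_sham_wl:.2f}, "
      f"95% CI {ci_sham_wl[0]:.2f} to {ci_sham_wl[1]:.2f}")
```

Under these assumptions the indirect estimate for sham acupuncture vs waiting list is an SMD of about −0.61, which suggests that a substantial part of the effect seen against unblinded controls may be non-specific. The figure is indicative only and is not a result of our network meta-analysis.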
Studies have presented evidence suggesting that sham acupuncture is associated with larger treatment effects than pharmacological and other physical placebos12,13. However, one of two opposing factors – inadequacy of patient blinding by using unsuitable shams, or the use of physiologically active shams – may impact on the effect of sham acupuncture in a given trial; the former may result in an overestimation of the true effect of acupuncture, while the latter may result in an underestimation. In our review important details about sham acupuncture (e.g., depth of insertion) were sometimes poorly reported, or were not reported, so the possibility of further clinical heterogeneity remains. One study in particular had a very active sham: the depth of needle insertion was similar to depths used for (active) acupuncture in some of the other trials, and different needle placement formed a large component of the sham. This large study, which found no difference in pain between acupuncture and sham, partly explains the relatively large effect estimate seen for sham acupuncture in our analyses (when compared with standard care)33.
Several quantifications of the clinical relevance of improvements in knee pain scores exist [see (d) footnote]. In this context, our results (derived from the better-quality trials) indicate that acupuncture produces both a ‘minimal perceptible clinical improvement’ (MPCI)34 and quite possibly a ‘minimal clinically important change’34,35, but may only yield a ‘minimal clinically important improvement’36 for patients with low levels of pain. For muscle-strengthening exercise (with evidence from nine trials) an MPCI remains a possibility. Overall, our results suggest that few physical treatments are likely to have a clinically relevant pain-relieving effect. Other factors to consider when interpreting effectiveness results are safety, the rapidity of onset – and durability – of treatment benefit, and the convenience, cost, and likelihood of patient adherence to treatment37; these factors would clearly differ across the diverse range of interventions we studied, or when comparing them with pharmacological treatments.
Our analyses of the better-quality studies suggest that acupuncture should be considered as one of the physical treatment options for relieving pain due to osteoarthritis of the knee in the short term. They indicate that balneotherapy, interferential therapy, and heat treatment may also be effective, but the results for all three interventions were informed by single small studies, so a cautious interpretation is warranted. It is worth noting that some of our results on effectiveness do not concur with existing guidance on physical treatments, specifically: EULAR (for insoles, braces, and weight loss), NICE (for TENS, insoles, braces, weight loss, manual therapy, and heat or cooling treatment), ACR (for weight loss, insoles, thermal agents, and Tai Chi), AAOS (for weight loss), and OARSI (for insoles, braces, heat or cooling treatment, TENS, and weight loss). Our analyses found little evidence (of significant differences from standard care, let alone clinically relevant differences) to support such guidance with respect to treating pain, other than for TENS, where the evidence was of poor quality and likely to be unreliable. It should be remembered, though, that our review was focused on pain outcomes, rather than on function, disability, or cost-effectiveness.
Larger RCTs, with risk of bias reduced to a minimum and with longer treatment periods, which also examine the effectiveness of re-treatment following treatment cessation (to evaluate durability and attenuation effects) are needed in order to comprehensively assess the value of many of these interventions. The optimum timing and parameters of treatment for both acupuncture and muscle-strengthening exercise also need to be more clearly defined by future studies.
The evidence available for our network meta-analyses, in which physical interventions for osteoarthritis of the knee were compared with each other within a coherent framework, suggests that overall effectiveness is limited but that acupuncture can be considered as one of the more effective physical treatments for alleviating pain in the short term. However, despite the large evidence base found, the methodological limitations associated with many of the trials indicate that high-quality trials of many of the physical treatments are still required.