To compare the effectiveness of acupuncture with other relevant physical treatments for alleviating pain due to knee osteoarthritis.
Systematic review with network meta-analysis, to allow comparison of treatments within a coherent framework. Comprehensive searches were undertaken up to January 2013 to identify randomised controlled trials in patients with osteoarthritis of the knee, which reported pain.
Of 156 eligible studies, 114 trials (covering 22 treatments and 9,709 patients) provided data suitable for analysis. Most trials studied short-term effects and many were classed as being of poor quality with high risk of bias, commonly associated with lack of blinding (which was sometimes impossible to achieve). End of treatment results showed that eight interventions (interferential therapy, acupuncture, TENS, pulsed electrical stimulation, balneotherapy, aerobic exercise, sham acupuncture, and muscle-strengthening exercise) produced a statistically significant reduction in pain when compared with standard care. In a sensitivity analysis of satisfactory and good quality studies, most studies were of acupuncture (11 trials) or muscle-strengthening exercise (9 trials); both interventions were statistically significantly better than standard care, with acupuncture being statistically significantly better than muscle-strengthening exercise (standardised mean difference: 0.49, 95% credible interval 0.00–0.98).
As a summary of the current available research, the network meta-analysis results indicate that acupuncture can be considered as one of the more effective physical treatments for alleviating osteoarthritis knee pain in the short-term. However, much of the evidence in this area of research is of poor quality, meaning there is uncertainty about the efficacy of many physical treatments.
The objective of treating osteoarthritis of the knee is usually the alleviation of pain and improving quality of life. Failure to control pain may result in reduced mobility and reduced participation in daily activities, which may further exacerbate symptoms. The regular use of pharmacological agents for pain may be associated with significant side effects (such as gastrointestinal bleeding)1, and many patients want non-pharmacological treatments for pain relief2,3. Effective alternatives to pharmacological pain relief are therefore desirable.
Five guidelines (ACR4, AAOS5, OARSI6, EULAR7, and NICE8) have evaluated treatment effects on key outcomes of knee osteoarthritis (including pain, function, and disability). All recommend treatment with muscle-strengthening and aerobic exercise, education, weight loss (if required), and, where necessary, paracetamol and/or topical NSAIDs; when these are ineffective, a choice of one or more options from a range of pharmacological and non-pharmacological treatments is sometimes recommended, including transcutaneous electrical nerve stimulation (TENS), thermal (heat/cooling) treatments, insoles, and braces. The OARSI guideline recommended using acupuncture, AAOS found the acupuncture evidence to be inconclusive, and the ACR conditionally recommended acupuncture only for patients with moderate-to-severe pain who are unable or unwilling to undergo total knee arthroplasty. EULAR and NICE did not recommend use of acupuncture; one of the reasons for the commissioning of this review – as part of a programme of projects on acupuncture and chronic pain, funded by the National Institute for Health Research (NIHR) under its Programme Grants for Applied Research Programme – was the uncertainty within the NICE decision-making process with regard to the level of evidence on acupuncture for osteoarthritis relative to other physical treatments. The rationale for this systematic review was to compare acupuncture with available alternative physical treatments that might be prescribed by a GP, or used by a physiotherapist, since uncertainty exists regarding which treatments are best.
Although numerous reviews have evaluated individual types of physical treatment, few randomised trials have directly compared these treatments. One way to overcome this limitation is to use network meta-analysis, which allows assessment of relative efficacy when direct treatment comparisons are insufficient or unavailable. In the context of the present review it should enable all relevant physical treatments to be compared with each other. The purpose of this systematic review, therefore, was to conduct a comprehensive synthesis using network meta-analysis methods in order to compare the effectiveness of acupuncture with other relevant physical treatments for alleviating pain due to osteoarthritis of the knee.
A systematic review was conducted following the general principles outlined in the Centre for Reviews and Dissemination (CRD) Guidance9 and the PRISMA statement10. This paper reports an update of a systematic review and network meta-analysis conducted in 2011, which is available on the CRD website11.
A range of resources was searched for published and unpublished studies, grey literature, and on-going research (see eMethods 1). We searched 17 electronic databases from inception to January 2013, without language restrictions. A combination of relevant free text terms, synonyms and subject headings relating to osteoarthritis of the knee and named physical treatments were included in the strategy. Bibliographies of relevant reviews and guidelines were also checked, and Internet searches were made of websites relating to osteoarthritis.
Two reviewers independently screened all abstracts and full papers, with disagreements resolved by discussion, or a third reviewer. We included randomised controlled trials (RCTs) assessing pain (as a primary or secondary outcome) in adults with knee osteoarthritis (with a population mean age of ≥55 years). Eligible treatments were any of the following: acupuncture, balneotherapy, braces, aerobic exercise, muscle-strengthening exercise, heat treatment, ice/cooling treatment, insoles, interferential therapy, laser/light therapy, manual therapy, neuromuscular electrical stimulation (NMES), pulsed electrical stimulation (PES), pulsed electromagnetic fields (PEMF), static magnets, Tai Chi, TENS, and weight loss. The following were excluded: predominantly home-based and unsupervised exercise interventions, surgical interventions, pharmaceutical interventions, interventions which combined two or more physical treatments, and studies comparing only different regimens/durations/modalities of the same intervention. Populations with varus/valgus malalignment were excluded as were studies which did not report data in a format suitable for network meta-analysis (see Outcomes section).
We classified adjunctive components of the experimental interventions into five categories, based on what was reported in the trials: ‘treatment as usual’, ‘treatment as usual’ plus specified home exercise or education, ‘treatment as usual’ plus specified (trial-specific) analgesics, no medication, and no medication plus specified home exercise or education. Eligible comparators included any form of standard/usual care or waiting list control (which could incorporate analgesics, education, and exercise advice) all of which we called ‘standard care’. Placebo interventions, no intervention, and sham acupuncture were also eligible. Sham acupuncture was treated as a separate comparator because of evidence suggesting it is more active than an inert ‘placebo’12,13. All pain scales were eligible.
Trial quality was assessed using an adaptation of a checklist (14 questions) from a previous review by CRD14. Using an algorithm, studies were then graded as excellent, good, satisfactory or poor, and also given an assessment based on the Cochrane risk of bias tool15 [see eTables I(a and b)]. Data extraction and quality assessments were performed by one reviewer and independently checked by a second. Disagreements were resolved by discussion or a third reviewer.
WOMAC pain (using a VAS or Likert scale) was the preferred pain measure. When studies did not measure WOMAC pain, another pain scale was included in the analysis with prioritisation of scales made on a clinical, or prevalence, basis (further details in the 2011 report)11. Hedges-g standardised mean differences (SMDs) were calculated for the meta-analyses (studies reporting medians could not be analysed). Results for different doses/regimens of the same type of treatment within a study were pooled. In an initial analysis only final values were used. However, we included more studies by calculating final values for trials reporting change from baseline data, provided trial baseline data together with variance estimates (e.g., standard deviations) were also reported. In order to present more clinically meaningful results, we present both SMDs, and SMDs converted to the WOMAC pain VAS 0-100 scale.
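As a rough illustration of the effect-size calculation described above, the Python sketch below computes a Hedges' g SMD from the final pain scores of two trial arms and rescales it to a 0-100 scale. The function names, the sign convention (a negative value means lower pain in the treatment arm), and the reference standard deviation used for rescaling are assumptions made for illustration; they are not taken from the review, and the conversion actually used may have differed.

```python
import math

def hedges_g(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Return (g, variance of g) for treatment vs control final pain scores."""
    df = n_t + n_c - 2
    # Pooled standard deviation across the two arms
    sd_pooled = math.sqrt(((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / df)
    d = (mean_t - mean_c) / sd_pooled
    # Small-sample correction factor (the usual approximation)
    j = 1 - 3 / (4 * df - 1)
    # Large-sample approximation to the variance of d
    var_d = (n_t + n_c) / (n_t * n_c) + d**2 / (2 * (n_t + n_c))
    return j * d, j**2 * var_d

def smd_to_scale(smd, reference_sd):
    """Rescale an SMD to scale units; reference_sd is a hypothetical
    typical standard deviation on the target scale (an assumption)."""
    return smd * reference_sd
```

For example, two arms of 30 patients with mean pain of 40 and 50 (both with a standard deviation of 20) give an uncorrected d of -0.5, which the correction shrinks slightly towards zero.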
A network meta-analysis draws on both direct evidence (treatments compared in the same trial) and indirect evidence (different treatments studied in separate trials, but compared when they use a common comparator), with the benefit of randomisation in each study retained. For indirect and direct evidence to be consistent, population and intervention characteristics must be similar across comparisons16–21. Inconsistency between direct and indirect evidence was assessed using the node-splitting method17,22. The SMD was assumed to be normally distributed and a random effects network meta-analysis model was selected since clinical and methodological heterogeneity within treatments appeared likely23. Analyses were conducted using WinBUGS software (version 1.4). Further method detail can be found in eMethods 2.
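The idea of indirect evidence through a common comparator can be sketched with a simple frequentist calculation. The Python below is a simplified analogue (an adjusted indirect comparison followed by a z-test of direct against indirect evidence), not the Bayesian random effects model or the node-splitting analysis run in WinBUGS; the function names and the example inputs are hypothetical.

```python
import math

def indirect_estimate(d_ac, var_ac, d_bc, var_bc):
    """Indirect A vs B effect via common comparator C.

    d_ac and d_bc are SMDs of A vs C and B vs C from separate trials;
    the effects subtract and the variances add.
    """
    return d_ac - d_bc, var_ac + var_bc

def inconsistency_check(d_direct, var_direct, d_indirect, var_indirect):
    """Difference between direct and indirect estimates, with a z-score
    and a two-sided p-value under a normal approximation."""
    diff = d_direct - d_indirect
    z = diff / math.sqrt(var_direct + var_indirect)
    p = math.erfc(abs(z) / math.sqrt(2))
    return diff, z, p

# Hypothetical inputs: acupuncture vs standard care, exercise vs standard care
d_ab, var_ab = indirect_estimate(-0.9, 0.04, -0.5, 0.05)
```

A large difference between the direct and indirect estimates for the same comparison is what an inconsistency assessment is designed to flag; a full network model combines both sources of evidence across all comparisons simultaneously.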
We conducted analyses with interventions categorised both with, and without, any adjunct treatments. Furthermore, in order to attempt to assess both the immediacy and durability of effects, we planned analyses for three time points: end of treatment (our primary time point) as defined in the studies; 3 months from the start of treatment (the time point closest to 3 months from the start of treatment, excluding outcomes recorded at less than 4 weeks from the start of treatment); and three months after the end of treatment (the time point closest to 3 months, but between 8 and 16 weeks, from end of treatment). However, due to a lack of medium- and long-term data, we report here results for the end of treatment time point only.
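The time point rules above can be expressed as a small selection routine. This Python sketch is only an interpretation: it assumes that 3 months equals 12 weeks, that the 8 to 16 week window is inclusive, and that ties go to the earlier assessment, none of which is stated in the text; the function names are hypothetical.

```python
TARGET_WEEKS = 12  # assumption: "3 months" treated as 12 weeks

def pick_three_months_from_start(weeks_from_start):
    """Assessment closest to 3 months from the start of treatment,
    ignoring outcomes recorded at less than 4 weeks from the start."""
    eligible = sorted(w for w in weeks_from_start if w >= 4)
    if not eligible:
        return None
    return min(eligible, key=lambda w: abs(w - TARGET_WEEKS))

def pick_three_months_after_end(weeks_after_end):
    """Assessment closest to 3 months after the end of treatment,
    restricted to 8-16 weeks (inclusive, by assumption) after the end."""
    eligible = sorted(w for w in weeks_after_end if 8 <= w <= 16)
    if not eligible:
        return None
    return min(eligible, key=lambda w: abs(w - TARGET_WEEKS))
```

A trial with no assessment inside the permitted window returns `None`, so it contributes no data to that time point.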
To evaluate the impact of study quality on the results, two sets of analyses were performed: one including all studies regardless of quality (‘any-quality’), and a primary sensitivity analysis including studies of satisfactory, or better, quality (‘better-quality’). Studies with atypical populations, interventions, or results were excluded in a second sensitivity analysis. When possible, examination of funnel plots was used to assess for publication bias.
156 original trials (of 22 distinct interventions and comparators) met the inclusion criteria. Four of 10 foreign language papers which appeared eligible based on their English abstracts could not be translated, so had to be excluded from our analyses24–27. One retracted study was removed from all analyses28. Twenty-two new studies were identified from the 2013 update searches. A study selection flow diagram is presented in eFig. 1.
An overview of all eligible studies – regardless of whether they reported data suitable for network meta-analysis – is presented in Table I. The range of mean treatment durations (and timing of end of treatment assessment) varied widely from just a single session (TENS) to 69.3 weeks (weight loss interventions), although a majority of interventions were administered over a 2–6 week period. Most studies were classified as having recruited a general knee osteoarthritis population, although weight loss trials (as expected) recruited only overweight or obese participants. The mean BMIs of some studies recruiting a general population fell into the overweight or obese classification, although most studies did not report BMI.
Around three-quarters of the studies were classed as being of poor quality (110 of 152). The remainder were ‘satisfactory’ (33 studies) or ‘good’ (9 studies), together classed as ‘better-quality’. In the network meta-analyses only 12 trials were considered to be at low risk of bias. Most trials were hampered by a lack of adequate blinding, and small sample sizes (which limited the effectiveness of randomisation, resulting in baseline imbalances). Full study quality assessment results are presented in eTable I(a and b). Study quality did vary by intervention, making the evidence base more robust in some areas than in others [see Table II(a)]. No evidence was found for publication bias (only assessable for muscle-strengthening exercise). Individual study characteristics and a reference list of all studies included in the systematic review can be found in eTable II.
Overall, 114 trials (9,709 patients) reported data suitable for the end of treatment analyses. In addition to the 22 new studies identified from the update searches, nine studies – excluded from the original review analyses – were included in this updated analysis by calculating final values using change from baseline data. Our original analyses (based on searches up to 2010) provided no indication of a treatment effect difference related to the majority of adjunctive components of the experimental interventions (see eFig. 2). The exception was that standard care incorporating active analgesia was more effective than standard care with ‘treatment as usual’ (with or without home exercise/education). However, analgesic adjuncts were used in only eight trials. Furthermore, most studies were classified as using the ‘treatment as usual’ adjunct, where little adjunct detail was defined. We therefore focussed on comparing the interventions categorised without adjuncts.
Tables II(a and b) and Fig. 1(a and b) (caterpillar plots) present the primary results, with interventions ordered by treatment effect. The network is illustrated in eFig. 3. When compared with standard care, eight physical treatments had a mean effect suggesting benefit, namely interferential therapy, acupuncture, TENS, pulsed electrical stimulation, balneotherapy, aerobic exercise, sham acupuncture, and muscle-strengthening exercise [Fig. 1(a), Table II(a)]. When acupuncture (rather than standard care) was the comparator, acupuncture was significantly better at reducing pain than sham acupuncture, muscle-strengthening exercise, weight loss, PEMF, placebo, insoles, NMES, and no intervention [Fig. 1(b), Table II(b)]. Across all comparisons, inconsistency at a P-value less than 0.05 was only identified for the two comparisons involving PES.
The primary sensitivity analysis of only better-quality studies involved 35 trials, nine types of intervention and 3,499 patients. A small study of muscle-strengthening exercise vs PES was excluded as it was identified as causing inconsistency in the main analysis. The network is illustrated in Fig. 2. The reduction in the number of studies per comparison, as well as loops in the network, increased uncertainty around the true between-study variance. Some interventions were represented by few studies, although there were 11 acupuncture studies and nine muscle-strengthening exercise studies. There was a statistically significant reduction in pain compared with standard care for acupuncture, balneotherapy, sham acupuncture, and muscle-strengthening exercise [Fig. 1(c), Table II(c)]. Acupuncture was statistically significantly better at a 95% level of credibility than sham acupuncture, muscle-strengthening exercise, weight loss, aerobic exercise, and no intervention when the analysis of better-quality studies was presented as a comparison with acupuncture [Fig. 1(d), Table II(d)]. We found that acupuncture and balneotherapy were the two interventions with the highest rank, a probability statistic calculated from the treatment effect distributions (Table III), although there is uncertainty around these rankings as reflected in the overlapping credible intervals with sham acupuncture, muscle-strengthening exercise and Tai Chi.
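Rank probabilities of this kind are typically obtained by ranking the treatments within each posterior draw and counting how often each treatment occupies each rank. The Python sketch below shows that counting step under stated assumptions: the draws are supplied as plain lists, lower values mean less pain, and the function name and example numbers are hypothetical rather than taken from Table III.

```python
def rank_probabilities(samples):
    """Estimate P(rank = k) for each treatment from posterior draws.

    samples maps a treatment name to a list of draws of its effect
    against a common reference; all lists have the same length, and
    lower values are taken to mean less pain (rank 1 is best).
    """
    names = list(samples)
    n_draws = len(samples[names[0]])
    counts = {name: [0] * len(names) for name in names}
    for i in range(n_draws):
        # Order treatments from best to worst within this draw
        ordered = sorted(names, key=lambda name: samples[name][i])
        for rank, name in enumerate(ordered):
            counts[name][rank] += 1
    return {name: [c / n_draws for c in counts[name]] for name in names}

# Tiny hypothetical example with three draws for two treatments
example = rank_probabilities({
    "A": [-1.0, -0.2, -0.8],
    "B": [-0.5, -0.6, -0.1],
})
```

In this toy example treatment A is best in two of the three draws. With few or imprecise studies the rank distribution spreads across several positions, which is one way the uncertainty around the rankings shows up.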
Several trials were excluded in a secondary sensitivity analysis based on population or intervention differences, or on extreme data: the results were not sensitive to these changes, although the model fit improved (see eResults 1).
For the analysis of better-quality studies, no network link could be made with the placebo-controlled studies. We therefore conducted a separate network meta-analysis for these studies. The results, and network, are presented in eResults 1. Both interferential therapy and heat treatment were statistically significantly more effective than placebo, but laser therapy, PES, and insoles were not.
The comprehensive network meta-analysis reported here, in which physical treatments for osteoarthritis of the knee were compared with each other within a coherent framework, provides the first estimate of the relative effect of these treatments, which is essential for decision makers. A network meta-analysis provides a basis for synthesising all the available evidence in a consistent framework, obviating the need to make decisions by subjective inferences from disparate data. Numerous systematic reviews, some summarised in a review of reviews29, have evaluated the interventions (or classes of interventions) included in this review. However, our analysis represents the use of the most practical methods currently available to compare a large number of different types of treatment, i.e., enabling us to compare the physical treatments (including acupuncture) with each other.
Of the 22 interventions evaluated, eight – interferential therapy, acupuncture, TENS, pulsed electrical stimulation, balneotherapy, aerobic exercise, sham acupuncture, and muscle-strengthening exercise – produced a statistically significant reduction in pain, compared with standard care. Of these, only acupuncture and muscle-strengthening exercise were represented by more than three trials in the sensitivity analysis of better-quality studies, with acupuncture (11 trials) being statistically significantly better than muscle-strengthening exercise (9 trials). Acupuncture, and balneotherapy (1 trial) were the interventions with the highest rank, although there is some uncertainty around these. For the better-quality placebo-controlled studies, interferential therapy (1 trial) showed a strong effect when compared to placebo.
Like a standard meta-analysis, a network meta-analysis requires an assumption of exchangeability between the trials. We sought to minimise concerns which might arise from within- or between-intervention heterogeneity by using an age restriction as part of our inclusion criteria, and by excluding interventions consisting of more than one physical treatment. The patient characteristics appeared broadly comparable across interventions. Some clinical heterogeneity is inevitable in a wide-ranging study such as this, but baseline pain did not appear to vary systematically between interventions, as far as it was possible to tell – given the wide variation of scales used. We conducted sensitivity analyses excluding trials causing heterogeneity. Our analyses used a random effects model to incorporate heterogeneity, and we undertook an evaluation of levels of inconsistency and model fit. Despite this, it is possible there are unknown confounding factors affecting the results of indirect comparisons, although in our results heterogeneity is accounted for in the credible intervals. The majority of trials which used placebo interventions studied electrical or electromagnetic interventions; it is not unreasonable to assume the placebo effects were similar (since the interventions were similar). A further strength of our review is that trials covering a diverse range of interventions were all assessed using the same quality assessment tools; this enabled fair comparisons to be made by evaluating the reliability of the evidence base for each intervention.
However, although we conducted a sensitivity analysis of the better-quality studies, this resulted in fewer trials per comparison, and fewer network loops, meaning there is greater uncertainty about the true heterogeneity and about the differences between the direct and indirect evidence. Fewer loops in relation to the size of the network means there is less data to quantify inconsistency and so it is possible that uncertainty associated with inconsistency is not captured in the results. Further limitations are that we could not include all studies in our analyses due to the variable reporting of pain results, and the end of treatment data available was mostly short-term: of the trials which did investigate medium- or long-term effectiveness only a few could provide the data required by our analyses. However, given that the treatments under consideration are not intended as being cures, and that any treatment effect is expected to attenuate over time, a comparison of their maximum effect is not without merit.
It is important that our results are evaluated in context. Methodological limitations exist which are often inherent and unavoidable in clinical trials of physical treatments. Additionally, we found flaws which trialists could have avoided by using better methodology and reporting practices. Most of the studies in our review were rated as being of poor quality, and even many of the better-quality studies were pragmatic trials, where blinding of patients was not possible, i.e., most studies are likely to have been subject to some form of bias. For the trials where patients were not blinded, and treatments were compared with standard care, the overall treatment effect is likely to incorporate non-specific (placebo) effects. We assumed that such non-specific effects were similar across all interventions, but variation may in fact be present.
In light of our results, consideration of what might be the true (or specific) effect of acupuncture is warranted. A Cochrane review reported a statistically significant, clinically relevant, short-term improvement in pain, similar to our findings (acupuncture vs waiting list control, SMD −0.96, 95% CI: −1.19 to −0.72)30. The comparison of acupuncture with sham acupuncture also showed a similar effect to ours, and was described as being clinically irrelevant (SMD −0.35, 95% CI: −0.55 to −0.15). However, the largest study in this Cochrane analysis, indicating no significant difference, had, for many participants, the primary pain assessment 7 weeks after the end of treatment, and was one of two trials which used an intensive sham needling technique, which may have had physiologic effects. Also, our analysis included a recent large trial (discussed below) which used what appeared to be a very active sham. It is therefore possible that the pooled results from both reviews underestimate the short-term effect of acupuncture. It is also worth noting that the effect size of acupuncture vs sham is of the same order as that seen for NSAIDs vs placebo (SMD 0.32, 95% CI 0.24–0.39), which has also been described as being too small to be clinically significant31. An analysis of individual patient data on patients with knee osteoarthritis was recently reported for acupuncture studies (in which the allocation concealment methods had to be unambiguously adequate)32. These results also indicated acupuncture to be more effective than sham acupuncture, and also found a smaller effect size than when acupuncture was compared with no acupuncture (usual care) controls. Non-specific effects therefore seem to play an important role in the pain-alleviating effects of acupuncture. 
However, for our comparisons, the lack of blinding in trials of the other interventions (where blinding was not possible) in our network of better-quality studies would also be likely to result in non-specific effects contributing to results; it is reasonable to assume that fair comparisons between treatments have therefore been made.
Studies have presented evidence suggesting that sham acupuncture is associated with larger treatment effects than pharmacological and other physical placebos12,13. However, one of two opposing factors – inadequacy of patient blinding by using unsuitable shams, or the use of physiologically active shams – may impact on the effect of sham acupuncture in a given trial; the former may result in an overestimation of the true effect of acupuncture, while the latter may result in an underestimation. In our review important details about sham acupuncture (e.g., depth of insertion) were sometimes poorly reported, or were not reported, so the possibility of further clinical heterogeneity remains. One study in particular had a very active sham: the depth of needle insertion was similar to depths used for (active) acupuncture in some of the other trials, and different needle placement formed a large component of the sham. This large study, which found no difference in pain between acupuncture and sham, partly explains the relatively large effect estimate seen for sham acupuncture in our analyses (when compared with standard care)33.
Several quantifications of the clinical relevance of improvements in knee pain scores exist [see Table II(d) footnote]. In this context, our results (derived from the better-quality trials) indicate that acupuncture produces both a ‘minimal perceptible clinical improvement’ (MPCI)34 and quite possibly a ‘minimal clinically important change’34,35, but may only yield a ‘minimal clinically important improvement’36 for patients with low levels of pain. For muscle-strengthening exercise (with evidence from nine trials) a MPCI remains a possibility. Overall, our results suggest that few physical treatments are likely to have a clinically-relevant pain-relieving effect. Other factors to consider when interpreting effectiveness results are safety, the rapidity of onset – and durability – of treatment benefit, and the convenience, cost, and likelihood of patient adherence to treatment37; these factors would clearly differ across the diverse range of interventions we studied, or when comparing them with pharmacological treatments.
Our analyses of the better-quality studies suggest that acupuncture should be considered as one of the physical treatment options for relieving pain due to osteoarthritis of the knee in the short-term. They indicate that balneotherapy, interferential therapy, and heat treatment may also be effective, but the results for all three interventions were informed by single small studies, so a cautious interpretation is warranted. It is worth noting that some of our results on effectiveness do not concur with existing guidance on physical treatments, specifically: EULAR (for insoles, braces, and weight loss), NICE (for TENS, insoles, braces, weight loss, manual therapy, and heat or cooling treatment), ACR (for weight loss, insoles, thermal agents, and Tai Chi), AAOS (for weight loss), and OARSI (for insoles, braces, heat or cooling treatment, TENS, and weight loss). Our analyses found little evidence (of significant differences from standard care, let alone clinically-relevant differences) to support such guidance with respect to treating pain, other than for TENS, where the evidence was of poor quality and likely to be unreliable. It should be remembered though that our review was focused on pain outcomes, rather than on function, disability, or cost-effectiveness.
Larger RCTs, with risk of bias reduced to a minimum and with longer treatment periods, which also examine the effectiveness of re-treatment following treatment cessation (to evaluate durability and attenuation effects) are needed in order to comprehensively assess the value of many of these interventions. The optimum timing and parameters of treatment for both acupuncture and muscle-strengthening exercise also need to be more clearly defined by future studies.
The evidence available for our network meta-analyses, in which physical interventions for osteoarthritis of the knee were compared with each other within a coherent framework, suggests that overall effectiveness is limited but that acupuncture can be considered as one of the more effective physical treatments for alleviating pain in the short-term. However, despite the large evidence base found, the methodological limitations associated with many of the trials indicate that high quality trials of many of the physical treatments are still required.
HM and NW conceived the study along with Lesley Stewart and Mark Sculpher. NW, MC, VM, AS and HM developed the protocol. MH performed the searches. SR performed the analyses in collaboration with VM and AS. MC wrote the first draft of the manuscript. MC, NW, VM and RS were responsible for the acquisition of data. All authors critically revised the manuscript for important intellectual content and approved the final version of the manuscript. HM obtained public funding. NW is guarantor.
The authors declare that they have no competing interests.
This article presents independent research funded by the NIHR under its Programme Grants for Applied Research Programme (Grant Reference Number RP-PG-0707-10186) titled, “Acupuncture for chronic pain and depression in primary care”. The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health. The funders had no role in study design, data collection, data synthesis, data interpretation, or writing the report.
Thanks to Philip Conaghan, Mark Roman, Peter Hall, Mark Sculpher, Lesley Stewart, Andrea Manca, Cynthia Iglesias, Tony Danso-Appiah and Ann Hopton for their help at various stages of the review, particularly during protocol development.
This is an open-access article distributed under the terms of the Creative Commons Attribution-NonCommercial-ShareAlike License, which permits non-commercial use, distribution, and reproduction in any medium, provided the original author and source are credited.
Appendix A. Supplementary data related to this article can be found at http://dx.doi.org/10.1016/j.joca.2013.05.007.