Although acupuncture is widely used for chronic pain, there remains considerable controversy as to its value. We aimed to determine the effect size of acupuncture for four chronic pain conditions: back and neck pain, osteoarthritis, chronic headache, and shoulder pain.
We conducted a systematic review to identify randomized trials of acupuncture for chronic pain where allocation concealment was determined unambiguously to be adequate. Individual patient data meta-analyses were conducted using data from 29 of 31 eligible trials, with a total of 17,922 patients analyzed.
In the primary analysis including all eligible trials, acupuncture was superior to both sham and no acupuncture control for each pain condition (all p<0.001). After exclusion of an outlying set of trials that strongly favored acupuncture, the effect sizes were similar across pain conditions. Patients receiving acupuncture had less pain, with scores 0.23 (95% C.I. 0.13, 0.33), 0.16 (95% C.I. 0.07, 0.25) and 0.15 (95% C.I. 0.07, 0.24) standard deviations lower than sham controls for back and neck pain, osteoarthritis, and chronic headache respectively; the effect sizes in comparison to no acupuncture controls were 0.55 (95% C.I. 0.51, 0.58), 0.57 (95% C.I. 0.50, 0.64) and 0.42 (95% C.I. 0.37, 0.46). These results were robust to a variety of sensitivity analyses, including those related to publication bias.
Acupuncture is effective for the treatment of chronic pain and is therefore a reasonable referral option. Significant differences between true and sham acupuncture indicate that acupuncture is more than a placebo. However, these differences are relatively modest, suggesting that factors in addition to the specific effects of needling are important contributors to the therapeutic effects of acupuncture.
Acupuncture is the insertion and stimulation of needles at specific points on the body to facilitate recovery of health. Although it was initially developed as part of traditional Chinese medicine, some contemporary acupuncturists, particularly those with medical qualifications, understand acupuncture in physiologic terms, without reference to pre-modern concepts [1].
An estimated 3 million American adults receive acupuncture treatment each year [2], and chronic pain is the most common presentation [3]. Acupuncture is known to have physiologic effects relevant to analgesia [4, 5], but there is no accepted mechanism by which it could have persisting effects on chronic pain. This lack of biological plausibility, and its provenance in theories lying outside of biomedicine, makes acupuncture a highly controversial therapy.
A large number of randomized trials of acupuncture for chronic pain have been conducted. Most have been of low methodologic quality and, accordingly, meta-analyses based on these trials are of questionable interpretability and value [6]. Here we present an individual patient data meta-analysis of randomized trials of acupuncture for chronic pain, where only high quality trials were eligible for inclusion. Individual patient data meta-analysis is superior to the use of summary data in meta-analysis as it enhances data quality, enables different forms of outcome to be combined, and allows use of statistical techniques of increased precision.
The full protocol of the meta-analysis has been published [6]. In brief, the study was conducted in three phases: identification of eligible trials; collection, checking and harmonization of raw data; and individual patient data meta-analysis.
To identify papers, we searched MEDLINE, the Cochrane Collaboration Central Register of Controlled Trials and the citation lists of systematic reviews (full search strategy in Appendix). There were no language restrictions. The initial search, current to November 2008, was used to identify studies for the individual patient data meta-analysis; a second search was conducted in December 2010 for summary data to use in a sensitivity analysis.
Two reviewers applied inclusion criteria for potentially eligible papers separately, with disagreements about study inclusion resolved by consensus. Randomized trials were eligible for analysis if they included at least one group receiving acupuncture needling and one group receiving either sham (placebo) acupuncture or no acupuncture control. Trials must have accrued patients with one of four indications - non-specific back or neck pain, shoulder pain, chronic headache or osteoarthritis - with the additional criterion that the current episode of pain must be of at least four weeks duration for musculoskeletal disorders. There was no restriction on the type of outcome measure, although we specified that the primary endpoint must be measured more than four weeks after the initial acupuncture treatment.
It has been demonstrated that unconcealed allocation is the most important source of bias in randomized trials [7] and, as such, we included only those trials where allocation concealment was determined unambiguously to be adequate (further detail in the review protocol [6]). Where necessary, we contacted authors for further information concerning the exact logistics of the randomization process. Trials were excluded if there was any ambiguity about allocation concealment.
The principal investigator of eligible studies was contacted and asked to provide raw data from the trial. To ensure data accuracy, all results reported in the trial publication, including baseline characteristics and outcome data, were then replicated.
Reviewers assessed the quality of blinding for eligible trials with sham acupuncture control. Trials were graded as having a low likelihood of bias if either the adequacy of blinding was checked by direct questioning of patients (e.g. by use of a credibility questionnaire) and no important differences were found between groups, or the blinding method (e.g. the Streitberger sham device [8]) had previously been validated as able to maintain blinding. Trials with a high likelihood of bias from unblinding were excluded from the meta-analysis of acupuncture versus sham; a sensitivity analysis included only trials with a low risk of bias.
Each trial was reanalyzed by analysis of covariance with the standardized principal endpoint (scores divided by pooled standard deviation) as the dependent variable, with the baseline measure of the principal endpoint and variables used to stratify randomization as covariates. This approach has been shown to have the greatest statistical power for trials with baseline and follow-up measures [9, 10]. The effect size for acupuncture from each trial was then entered into a meta-analysis using the metan command in Stata 11 (Stata Corp., College Station, TX): the meta-analytic statistics were created by weighting each coefficient by the reciprocal of the variance, summing and dividing by the sum of the weights. Meta-analyses were conducted separately for comparisons of acupuncture with sham and no acupuncture control, and within each pain type. We pre-specified that the hypothesis test would be based on the fixed effects analysis as this constitutes a valid test of the null hypothesis of no treatment effect.
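The pooling step described above (weight each trial's coefficient by the reciprocal of its variance, sum, and divide by the sum of the weights) can be sketched in a few lines. This is a minimal Python illustration, not the Stata metan code used in the study; the function name `fixed_effect_pool` and the three effect sizes and variances are hypothetical values chosen for illustration, not trial data.

```python
import math


def fixed_effect_pool(effects, variances):
    """Inverse-variance (fixed effect) pooled estimate with a 95% interval.

    effects:   per-trial effect sizes in standard deviation units
    variances: per-trial variances of those effect sizes
    """
    if not effects or len(effects) != len(variances):
        raise ValueError("effects and variances must be non-empty and equal length")
    # Each trial is weighted by the reciprocal of its variance.
    weights = [1.0 / v for v in variances]
    total_weight = sum(weights)
    # Weighted sum of coefficients divided by the sum of the weights.
    pooled = sum(w * e for w, e in zip(weights, effects)) / total_weight
    # Standard error of the pooled estimate under the fixed effect model.
    se = math.sqrt(1.0 / total_weight)
    ci = (pooled - 1.96 * se, pooled + 1.96 * se)
    return pooled, se, ci


# Hypothetical inputs for illustration only (not data from the trials).
example_effects = [0.20, 0.30, 0.10]
example_variances = [0.01, 0.04, 0.02]
pooled, se, ci = fixed_effect_pool(example_effects, example_variances)
print(f"pooled effect {pooled:.3f}, SE {se:.3f}, 95% CI {ci[0]:.3f} to {ci[1]:.3f}")
```

Because the weights are reciprocal variances, larger and more precise trials dominate the pooled estimate, which is why two very large trials can drive both the estimate and the heterogeneity in an analysis.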
We identified 82 trials (see figure 1 for flowchart) of which 31 were eligible (Table 1 and Appendix online). Four of the studies were organized as part of the German Acupuncture Trials (GERAC) initiative [11–14]; 4 were part of the Acupuncture Randomized Trials (ART) group [15–18]; 4 were Acupuncture in Routine Care (ARC) studies [19–22]; 3 were UK National Health Service acupuncture trials [23–25]. Eleven studies were sham controlled, 10 had no acupuncture control and 10 were three-armed studies including both sham and no acupuncture control. The second search for subsequently published studies identified an additional four eligible studies [26–29], with a total of 1,619 patients.
An important source of clinical heterogeneity between studies concerns the control groups. In the sham controlled trials, the type of sham included acupuncture needles inserted superficially [13], sham acupuncture devices with needles that retract into the handle rather than penetrate the skin [30] and non-needle approaches such as deactivated electrical stimulation [31] or detuned laser [32]. Moreover, co-interventions varied, with no additional treatment other than analgesics in some trials [15], whereas in other trials, both acupuncture and sham groups received a course of additional treatment, such as exercise led by physical therapists [25]. Similarly, the no acupuncture control groups varied between usual care, such as a trial in which control group patients were merely advised to “avoid acupuncture” [23]; attention control, such as group education sessions [33]; and guidelined care, where patients were given advice as to specific drugs and doses [13].
Usable raw data were obtained from 29 of the 31 eligible trials, including a total of 17,922 patients from the US, UK, Germany, Spain and Sweden. For one trial, the study database had become corrupted [34]; in another case, the statisticians involved in the trial failed to respond to repeated enquiries despite approval for data sharing being obtained from the principal investigator [35].
The 29 trials comprised 18 comparisons (14,597 patients) of acupuncture with a no acupuncture control group and 20 comparisons (5,230 patients) of acupuncture with sham acupuncture. Patients in all trials had access to analgesics and other standard treatments for pain. Four sham-controlled trials were determined to have an intermediate likelihood of bias from unblinding [13, 32, 36, 37]; the 16 remaining sham-controlled trials were graded as having a low risk of bias from unblinding. On average, drop-out rates were low (weighted mean 10%). Drop-out rates were above 25% for only four trials: Molsberger 2002 [35] and 2010 [27] (33% and 27%, but raw data not received and neither trial included in main analysis); Carlsson 2001 [37] (46%, trial excluded in a sensitivity analysis for blinding); and Berman 2004 [33] (31%). The Berman trial had a high drop-out rate amongst no acupuncture controls (43%); drop-out rates were close to 25% in the acupuncture and sham groups. The Kerr trial had a large difference in drop-out rates between groups (acupuncture 13%, control 33%) but was excluded in the sensitivity analysis for blinding [36].
Forest plots for acupuncture against sham acupuncture and against no acupuncture control are shown separately for each of the four pain conditions in figures 2 and 3. Meta-analytic statistics are shown in table 2. Acupuncture was statistically superior to control for all analyses (p<0.001). Effect sizes are larger for the comparison between acupuncture and no acupuncture control than for the comparison between acupuncture and sham: 0.37, 0.26 and 0.15 in comparison with sham versus 0.55, 0.57 and 0.42 in comparison with no acupuncture control for musculoskeletal pain, osteoarthritis and chronic headache respectively.
For five of the seven analyses, the test for heterogeneity was statistically significant. In the case of comparisons with sham acupuncture, the trials by Vas et al are clear outliers. For example, the effect size of the Vas trial for neck pain is about 5 times greater than the meta-analytic estimate. One effect of excluding these trials in a sensitivity analysis (table 3) is that there is no significant heterogeneity in the comparisons between acupuncture and sham. Moreover, the effect size for acupuncture becomes relatively similar for the different pain conditions: 0.23, 0.16 and 0.15 against sham, and 0.55, 0.57 and 0.42 against no acupuncture control for back and neck pain, osteoarthritis, and chronic headache respectively (fixed effects; results similar for the random effects analysis).
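To illustrate how an outlying trial can produce significant heterogeneity and pull fixed and random effects estimates apart, the sketch below computes Cochran's Q, the I² statistic and a DerSimonian-Laird random effects estimate. The text does not state which heterogeneity test or random effects estimator was used, so these choices are assumptions; the function name `heterogeneity` and all input values are hypothetical, and the p-value for Q (a chi-square test on k−1 degrees of freedom) is omitted to keep the example dependency-free.

```python
def heterogeneity(effects, variances):
    """Cochran's Q, I-squared, and DerSimonian-Laird random effects estimate.

    Assumed estimators for illustration; not confirmed as the study's method.
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    fixed = sum(w * e for w, e in zip(weights, effects)) / total
    # Q: weighted squared deviations of each trial from the fixed effect estimate.
    q = sum(w * (e - fixed) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of total variation attributed to between-trial differences.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    # DerSimonian-Laird estimate of between-trial variance, truncated at zero.
    c = total - sum(w * w for w in weights) / total
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    re_weights = [1.0 / (v + tau2) for v in variances]
    random_est = sum(w * e for w, e in zip(re_weights, effects)) / sum(re_weights)
    return {"fixed": fixed, "random": random_est, "q": q, "df": df,
            "i2": i2, "tau2": tau2}


# Hypothetical trials: three similar results plus one small outlying trial.
with_outlier = heterogeneity([0.20, 0.15, 0.25, 1.00], [0.01, 0.01, 0.01, 0.04])
without_outlier = heterogeneity([0.20, 0.15, 0.25], [0.01, 0.01, 0.01])
print(with_outlier)
print(without_outlier)
```

In this hypothetical example Q falls below its degrees of freedom once the outlying trial is removed, so the between-trial variance estimate is zero and the fixed and random effects estimates coincide.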
To give an example of what these effect sizes mean in real terms, baseline pain score on a 0 – 100 scale for a typical trial might be 60. Given a standard deviation of 25, follow-up scores might be 43 in a no acupuncture group, 35 in sham acupuncture and 30 in patients receiving true acupuncture. If response were defined in terms of a pain reduction of 50% or more, response rates would be approximately 30%, 42.5% and 50%, respectively.
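The arithmetic behind this example can be checked with a short calculation. The sketch below assumes that follow-up scores in each group are normally distributed with the stated mean and a standard deviation of 25; that distributional assumption is ours, introduced for illustration, and is not stated in the text. A response is a follow-up score at or below half of the baseline score of 60.

```python
import math


def normal_cdf(x, mean, sd):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


BASELINE = 60.0  # baseline pain on a 0-100 scale, from the worked example
SD = 25.0        # standard deviation, from the worked example
FOLLOW_UP = {"no acupuncture": 43.0, "sham": 35.0, "acupuncture": 30.0}

# A 50% reduction from baseline means a follow-up score of 30 or lower.
threshold = 0.5 * BASELINE

# Assumption: follow-up scores are normally distributed within each group.
response_rates = {group: normal_cdf(threshold, mean, SD)
                  for group, mean in FOLLOW_UP.items()}

# Standardized differences implied by the example follow-up scores.
effect_vs_sham = (FOLLOW_UP["sham"] - FOLLOW_UP["acupuncture"]) / SD
effect_vs_none = (FOLLOW_UP["no acupuncture"] - FOLLOW_UP["acupuncture"]) / SD

for group, rate in response_rates.items():
    print(f"{group}: {100 * rate:.1f}% responders")
print(f"effect size vs sham: {effect_vs_sham:.2f} SD")
print(f"effect size vs no acupuncture: {effect_vs_none:.2f} SD")
```

Under this assumption the response rates come out at roughly 30%, 42% and 50%, in line with the approximate figures quoted above, and the implied effect sizes of 0.20 and 0.52 standard deviations match the order of magnitude of the meta-analytic estimates.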
The comparisons with no acupuncture control show evidence of heterogeneity. This appears largely explicable in terms of differences between the control groups used. In the case of osteoarthritis, the largest effect is for Witt 2005 [17], where patients in the waiting list control received only rescue pain medication, and the smallest for Foster 2007 [25], which involved a program of exercise and advice led by physical therapists. For the musculoskeletal analyses, heterogeneity is driven by two very large trials [19, 20] (n=2565 and n=3118) for back and neck pain. If only back pain is considered (table 3), heterogeneity is dramatically reduced and is again driven by one trial, Brinkhaus 2006 [15], with waiting list control. In the headache meta-analysis, Diener 2006 [13] had much smaller differences between groups. This trial involved providing drug therapy according to national guidelines in the no acupuncture group, including initiation of beta-blockers as migraine prophylaxis. There was disagreement within the collaboration about whether this constituted active control. Excluding this trial reduced evidence of heterogeneity (p=0.04) but had little effect on the effect size (0.42 to 0.45).
Table 3 shows several pre-specified sensitivity analyses. Neither restricting the sham control trials to those with low likelihood of unblinding nor adjustment for missing data had any substantive effect on our main estimates. Inclusion of summary data from trials for which raw data were not obtained (2 trials) or which were published recently (4 trials) also had little impact on either the primary analysis (table 3) or the analysis with the outlying Vas trials excluded (data not shown).
To estimate the potential impact of publication bias, we entered all trials into a single analysis and compared the effect sizes from small and large studies [38]. We saw some evidence that small studies had larger effect sizes for the comparison with sham (p=0.023) but not for the comparison with no acupuncture control (p=0.7). However, these analyses are influenced by the outlying Vas trials, which were smaller than average, and by indication, as the shoulder pain trials were small and had large effect sizes. Tests for asymmetry were non-significant when we excluded Vas and shoulder pain studies (n=15; p=0.065) and when small studies were also excluded (n<100, n=12; p=0.3). Nonetheless, we repeated our meta-analyses excluding trials with a sample size less than 100. This had essentially no effect on our results. As a further test of publication bias, we considered the possible effect on our analysis if we had failed to include high-quality, unpublished studies. Only if there were 47 unpublished trials with n=100 showing an advantage to sham of 0.25 standard deviations would the difference between acupuncture and sham lose significance.
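The logic of the final calculation, asking how many unpublished trials favoring sham would be needed to remove statistical significance, can be sketched as follows. This is an illustration under stated assumptions, not a reconstruction of the authors' computation: the pooled estimate and standard error passed in are hypothetical, the variance of each added trial's effect size is approximated as 4/n (equal allocation, small effect), and the function names are our own. It is not expected to reproduce the figure of 47.

```python
import math


def z_after_adding(pooled, se, k, extra_effect, extra_variance):
    """z statistic of the fixed effect estimate after adding k identical trials."""
    w_obs = 1.0 / se ** 2        # total weight of the observed trials
    w_new = k / extra_variance   # total weight of the k hypothetical trials
    combined = (w_obs * pooled + w_new * extra_effect) / (w_obs + w_new)
    # Dividing by the combined standard error 1/sqrt(total weight).
    return combined * math.sqrt(w_obs + w_new)


def failsafe_trials(pooled, se, extra_effect, n_per_trial,
                    z_crit=1.96, max_k=10000):
    """Smallest number of added trials that makes the pooled effect non-significant.

    Returns None if significance is not lost within max_k added trials.
    """
    # Assumption: variance of a standardized difference is roughly 4/n.
    extra_variance = 4.0 / n_per_trial
    for k in range(max_k + 1):
        if z_after_adding(pooled, se, k, extra_effect, extra_variance) < z_crit:
            return k
    return None


# Hypothetical pooled estimate and standard error (not values from the study).
k_needed = failsafe_trials(pooled=0.20, se=0.02, extra_effect=-0.25, n_per_trial=100)
print(f"unpublished trials needed: {k_needed}")
```

A negative `extra_effect` encodes an advantage to sham; each added trial both lowers the pooled estimate and adds weight, so the z statistic falls steadily as `k` grows.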
A final sensitivity analysis examined the effect of pooling different endpoints measured at different periods of follow-up. We repeated our analyses including only pain endpoints measured at 2 – 3 months after randomization. There was no material effect on results: effect sizes increased by 0.05 to 0.09 SD for musculoskeletal and osteoarthritis trials and were stable otherwise.
As an exploratory analysis, we compared sham to no acupuncture control. In a meta-analysis of 9 trials [11–13, 15–18, 25, 33], the effect size for sham was 0.33 (95% C.I. 0.27, 0.40) and 0.38 (95% C.I. 0.20, 0.56) for fixed and random effects models respectively (p<0.001 for tests of both effect and heterogeneity).
In an analysis of patient-level data from 29 high quality randomized trials, including 17,922 patients, we found statistically significant differences between both acupuncture versus sham and acupuncture versus no acupuncture control for all pain types studied. After excluding an outlying set of studies, meta-analytic effect sizes were similar across pain conditions.
The effect size for individual trials comparing acupuncture to no acupuncture control did vary, an effect that appears at least partly explicable in terms of the type of control used. As might be expected, acupuncture had a smaller benefit in patients who received a program of ancillary care, such as physical therapist led exercise [25], than in patients who continued on usual care. Nonetheless, the average effect, as expressed in the meta-analytic estimate of approximately 0.5 standard deviations, is of clear clinical relevance whether considered as a standardized difference [39] or when converted back to a pain scale. The difference between acupuncture and sham is of lesser magnitude, 0.15 to 0.23 standard deviations.
Neither study quality nor sample size appears to be a problem for this meta-analysis, on the grounds that only high quality studies were eligible and the total sample size is large. Moreover, we saw no evidence that publication bias, or failure to identify published eligible studies, could affect our conclusions.
As the comparisons between acupuncture and no acupuncture cannot be blinded, both performance and response bias are possible. Similarly, while we considered the risk of bias of unblinding low in most studies comparing acupuncture and sham acupuncture, providers obviously were aware of the treatment provided and, as such, a certain degree of bias of our effect estimate for specific effects cannot be entirely ruled out. However, it should be kept in mind that this problem applies to almost all studies on non-drug interventions. We would argue that the risk of bias in the comparison between acupuncture and sham acupuncture is low compared to other non-drug treatments for chronic pain, such as cognitive therapies, exercise or manipulation, which are rarely subject to placebo control.
Another possible critique is that the meta-analyses combined different endpoints, such as pain and function, measured at different times. However, results did not change when we restricted the analysis to pain endpoints measured at a specific follow-up time, 2 – 3 months after randomization.
Many prior systematic reviews of acupuncture for chronic pain have had liberal eligibility criteria, accordingly included trials of low methodologic quality, and then came to the circular conclusion that weaknesses in the data did not allow conclusions to be drawn [40, 41]. Other reviews have not included meta-analyses, apparently due to variation in study endpoints [42, 43]. We have avoided both problems by including only high quality trials and obtaining raw data for individual patient data meta-analysis. Some more recent systematic reviews have published meta-analyses [44–47] and reported findings that are broadly comparable to ours, with clear differences between acupuncture and no treatment control and smaller differences between true and sham acupuncture. Our findings have greater precision: all prior reviews have analyzed summary data, an approach of reduced statistical precision when compared to individual patient data meta-analysis [6, 48]. In particular, we have demonstrated a robust difference between acupuncture and sham control that can be distinguished from bias. This is a novel finding that moves beyond the prior literature.
We believe that our findings are both clinically and scientifically important. They suggest that the total effects of acupuncture, as experienced by the patient in routine clinical practice, are clinically relevant, but that an important part of these total effects is not due to issues considered to be crucial by most acupuncturists, such as the correct location of points and depth of needling. Several lines of argument suggest that acupuncture (whether real or sham) is associated with more potent placebo or context effects than other interventions [49–52]. Yet many clinicians would feel uncomfortable in providing or referring patients to acupuncture if it were merely a potent placebo. Similarly, it is questionable whether national or private health insurance should reimburse therapies that do not have specific effects. Our finding that acupuncture has effects over and above sham acupuncture is therefore of major importance for clinical practice. Even though on average these effects are small, the clinical decision made by doctors and patients is not between true and sham acupuncture, but between a referral to an acupuncturist or avoiding such a referral. The total effects of acupuncture, as experienced by the patient in routine practice, include the specific effects associated with correct needle insertion according to acupuncture theory, non-specific physiologic effects of needling, and non-specific psychological (placebo) effects related to the patient’s belief that treatment will be effective.
We found acupuncture to be superior to both no acupuncture control and sham acupuncture for the treatment of chronic pain. Although the data indicate that acupuncture is more than a placebo, the differences between true and sham acupuncture are relatively modest, suggesting that factors in addition to the specific effects of needling are important contributors to therapeutic effects. Our results from individual patient data meta-analyses of nearly 18,000 randomized patients on high quality trials provide the most robust evidence to date that acupuncture is a reasonable referral option for patients with chronic pain.
The Acupuncture Trialists’ Collaboration is funded by an R21 grant (AT004189I) from the National Center for Complementary and Alternative Medicine (NCCAM) at the National Institutes of Health (NIH) to Dr Vickers and by a grant from the Samueli Institute. Dr MacPherson’s work has been supported in part by the UK National Institute for Health Research (NIHR) under its Programme Grants for Applied Research scheme (RP-PG-0707-10186). Eric Manheimer’s work on the Acupuncture Trialists’ Collaboration was supported by grant number R24 AT001293 from NCCAM. The views expressed in this publication are those of the author(s) and not necessarily those of the NCCAM, the NHS, the NIHR or the Department of Health in England. No sponsor had any role in the design and conduct of the study; collection, management, analysis, and interpretation of the data; or preparation, review, or approval of the manuscript.
The Acupuncture Trialists' Collaboration includes physicians, clinical trialists, biostatisticians, practicing acupuncturists and others. The list of collaborators is as follows.
Claire Allen is the consumer representative ('patient advocate'). Mrs Allen is the Deputy Administrator at the Cochrane Collaboration Secretariat.
Mac Beckner, MIS, is Vice President of the Information Technology and Data Management Center at the Samueli Institute.
Brian Berman, MD, is Professor of Family & Community Medicine at the University of Maryland School of Medicine and Director of the Center for Integrative Medicine.
Benno Brinkhaus, MD, is professor at the Institute for Social Medicine, Epidemiology and Health Economics, Charité - University Medical Center, Berlin, Germany.
Remy Coeytaux, MD, PhD, is Associate Professor, Community and Family Medicine, Duke University.
Angel M. Cronin, MS, is a biostatistician at the Dana-Farber Cancer Institute.
Hans-Christoph Diener, MD, PhD, is Professor of Neurology and Chairman of the Department of Neurology at the University of Duisburg-Essen, Germany.
Heinz G. Endres, MD, is a senior research assistant and lecturer at the Ruhr-University Bochum, Germany.
Nadine Foster, DPhil, BSc(Hons), is Professor of Musculoskeletal Health in Primary Care, Arthritis Research UK Primary Care Centre, Keele University, UK.
Juan Antonio Guerra de Hoyos, MD, is director of the Andalusian Integral Plan for Pain Management and coordinator of the Andalusian Health Service Project for Improving Primary Care Research.
Michael Haake, MD, PhD is an orthopedic surgeon who directs the Department of Orthopedics and Traumatology of the SLK-Hospitals in Heilbronn, Germany.
Richard Hammerschlag, PhD, is the Emeritus Dean of Research at the Oregon College of Oriental Medicine in Portland, Oregon.
Dominik Irnich, MD, is head of the Interdisciplinary Pain Centre at the University of Munich, Germany.
Wayne B. Jonas, MD, is the president and chief executive officer of the Samueli Institute.
Kai Kronfeld, PhD, is a clinical trialist at the Interdisciplinary Centre for Clinical Trials (IZKS Mainz), University Medical Centre Mainz, Germany.
Lixing Lao, PhD, is professor at the University of Maryland, and director of Traditional Chinese Medicine Research at the Center for Integrative Medicine at that institution.
George Lewith, MD, FRCP, is a Professor of Health Research directing the Complementary and Integrated Medicine Research Unit at Southampton Medical School, UK.
Klaus Linde, MD, is research coordinator at the Institute of General Practice, Technische Universität München.
Hugh MacPherson, PhD, is a Senior Research Fellow who heads the Complementary Medicine Research Group at the University of York, UK.
Eric Manheimer, MS, is a research associate at the University of Maryland School of Medicine Center for Integrative Medicine.
Alexandra Maschino, BS, is a data analyst at Memorial Sloan-Kettering Cancer Center.
Dieter Melchart, MD, PhD, is a Professor directing the Centre for Complementary Medicine Research (Znf) at the Technische Universität München.
Albrecht Molsberger, MD, PhD, Prof. is an orthopedic surgeon, a practicing acupuncturist and the president of the German acupuncture research group.
Karen J. Sherman, PhD, MPH, is Senior Scientific Investigator at the Group Health Research Institute, Seattle WA.
Hans Trampisch, PhD, is Chair of the Department of Medical Statistics and Epidemiology at Ruhr-University Bochum, Germany.
Jorge Vas, MD, PhD, is the Chief Medical Officer of the Pain Treatment Unit, Dos Hermanas Primary Care Health Center (Andalusia Public Health System), Spain.
Andrew J. Vickers (collaboration chair), DPhil, is Assistant Attending Research Methodologist at Memorial Sloan-Kettering Cancer Center.
Norbert Victor, PhD, is Professor Emeritus at the University of Heidelberg, where he was previously Chair of Medical Biometry, and Director of the Institute for Medical Biometry and Informatics.
Peter White, PhD, is a lecturer in research methodology at the School of Health Sciences, University of Southampton, UK.
Lyn Williamson, MD, MA (Oxon), MRCGP, FRCP, is Consultant Rheumatologist at the Great Western Hospital, Swindon, UK and Honorary Senior Lecturer in Clinical Medicine at Oxford University.
Stefan N. Willich, MD, MPH, MBA, is Professor and Director of the Institute for Social Medicine, Epidemiology and Health Economics, Charité University Medical Center, Berlin, Germany.
Claudia M. Witt, MD, MBA, is Professor for Complementary Medicine at the University Medical Center Charité and Vice Director of the Institute for Social Medicine, Epidemiology and Health Economics, Berlin, Germany.
An ethics statement was not required for this work.
Conflicts of Interest
The authors declare that they have no competing interests.
Authors’ contributions
The study was conceived by AV, GL, CW, and KL. AV was responsible for the overall study design with input from AC for the statistical analysis; AM for the systematic review; GL and HM with respect to acupuncture analyses; NV, CW, NF, KS and KL with respect to clinical trial methodology and meta-analysis. Statistical analyses were conducted by AV, AC and AM. The first draft of the manuscript was written by AV and AM. All authors gave comments on early drafts and approved the final version of the manuscript. AV had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Andrew J. Vickers, Memorial Sloan-Kettering Cancer Center, New York, NY.
Angel M. Cronin, Memorial Sloan-Kettering Cancer Center, New York, NY; now at Dana-Farber Cancer Center, Boston, MA.
Alexandra C. Maschino, Memorial Sloan-Kettering Cancer Center, New York, NY.
George Lewith, University of Southampton, UK.
Hugh MacPherson, University of York, UK.
Norbert Victor, University of Heidelberg, Germany.
Nadine E. Foster, Keele University, UK.
Karen J. Sherman, Group Health Research Institute, Seattle, WA.
Claudia M. Witt, Charité Universitätsmedizin, Berlin, Germany.
Klaus Linde, Technische Universität München, Germany.