Acupuncture is often used for migraine prophylaxis but its effectiveness is still controversial. This review (along with a companion review on ’Acupuncture for tension-type headache’) represents an updated version of a Cochrane review originally published in Issue 1, 2001, of The Cochrane Library.
To investigate whether acupuncture is a) more effective than no prophylactic treatment/routine care only; b) more effective than ’sham’ (placebo) acupuncture; and c) as effective as other interventions in reducing headache frequency in patients with migraine.
The Cochrane Pain, Palliative & Supportive Care Trials Register, CENTRAL, MEDLINE, EMBASE and the Cochrane Complementary Medicine Field Trials Register were searched to January 2008.
We included randomized trials with a post-randomization observation period of at least 8 weeks that compared the clinical effects of an acupuncture intervention with a control (no prophylactic treatment or routine care only), a sham acupuncture intervention or another intervention in patients with migraine.
Two reviewers checked eligibility; extracted information on patients, interventions, methods and results; and assessed risk of bias and quality of the acupuncture intervention. Outcomes extracted included response (outcome of primary interest), migraine attacks, migraine days, headache days and analgesic use. Pooled effect size estimates were calculated using a random-effects model.
Twenty-two trials with 4419 participants (mean 201, median 42, range 27 to 1715) met the inclusion criteria. Six trials (including two large trials with 401 and 1715 patients) compared acupuncture to no prophylactic treatment or routine care only. After 3 to 4 months patients receiving acupuncture had higher response rates and fewer headaches. The only study with long-term follow up saw no evidence that effects dissipated up to 9 months after cessation of treatment. Fourteen trials compared a ’true’ acupuncture intervention with a variety of sham interventions. Pooled analyses did not show a statistically significant superiority for true acupuncture for any outcome in any of the time windows, but the results of single trials varied considerably. Four trials compared acupuncture to proven prophylactic drug treatment. Overall in these trials acupuncture was associated with slightly better outcomes and fewer adverse effects than prophylactic drug treatment. Two small low-quality trials comparing acupuncture with relaxation (alone or in combination with massage) could not be interpreted reliably.
In the previous version of this review, evidence in support of acupuncture for migraine prophylaxis was considered promising but insufficient. Now, with 12 additional trials, there is consistent evidence that acupuncture provides additional benefit to treatment of acute migraine attacks only or to routine care. There is no evidence for an effect of ’true’ acupuncture over sham interventions, though this is difficult to interpret, as exact point location could be of limited importance. Available studies suggest that acupuncture is at least as effective as, or possibly more effective than, prophylactic drug treatment, and has fewer adverse effects. Acupuncture should be considered a treatment option for patients willing to undergo this treatment.
Migraine patients suffer from recurrent attacks of mostly one-sided, severe headache. Acupuncture is a therapy in which thin needles are inserted into the skin at defined points; it originates from China. Acupuncture is used in many countries for migraine prophylaxis - that is, to reduce the frequency and intensity of migraine attacks.
We reviewed 22 trials which investigated whether acupuncture is effective in the prophylaxis of migraine. Six trials investigating the addition of acupuncture to basic care (which usually involves only treating acute headaches) found that patients who received acupuncture had fewer headaches. Fourteen trials compared true acupuncture with inadequate or fake acupuncture interventions in which needles were either inserted at incorrect points or did not penetrate the skin. In these trials both groups had fewer headaches than before treatment, but there was no difference between the effects of the two treatments. In the four trials in which acupuncture was compared to a proven prophylactic drug treatment, patients receiving acupuncture tended to report more improvement and fewer side effects. Collectively, the studies suggest that migraine patients benefit from acupuncture, although the correct placement of needles seems to be less relevant than is usually thought by acupuncturists.
Migraine is a disorder with recurrent headaches manifesting in attacks lasting 4 to 72 hours. Typical characteristics of the headache are unilateral location, pulsating quality, moderate or severe intensity, aggravation by routine physical activity and association with nausea and/or photophobia and phonophobia (IHS 2004). Epidemiological studies have consistently shown that migraine is a common disorder with a 1-year prevalence of around 10% to 12% and a lifetime prevalence of between 15% and 20% (Oleson 2007). In Europe, the economic cost of migraine is estimated at 27 billion Euro per year (Andlin-Sobocki 2005). Most migraine patients can be adequately treated with treatment of acute headaches alone, but a relevant minority need prophylactic interventions, as their attacks are either too frequent or are insufficiently controlled by acute therapy. Several drugs, such as propranolol, metoprolol, flunarizine, valproic acid and topiramate, have been shown to effectively reduce attack frequency in some patients (Dodick 2007). However, all these drugs are associated with adverse effects. Dropout rates in most clinical trials are high, suggesting that the drugs are not well accepted by patients. There is some evidence that behavioral interventions such as relaxation or biofeedback are beneficial (Holroyd 1990; Nestoriuc 2007), but additional effective, low-risk treatments are clearly desirable.
Acupuncture in the context of this review is defined as the needling of specific points of the body. It is one of the most widely used complementary therapies in many countries (Bodeker 2005). For example, according to a population-based survey in the year 2002 in the United States, 4.1% of respondents reported lifetime use of acupuncture, and 1.1% recent use (Burke 2006). A similar survey in Germany performed in the same year found that 8.7% of adults between 18 and 69 years of age had received acupuncture treatment in the previous 12 months (Härtel 2004). Acupuncture was originally developed as part of Chinese medicine wherein the purpose of treatment is to bring the patient back to the state of equilibrium postulated to exist prior to illness (Endres 2007). Some acupuncture practitioners have dispensed with these concepts and understand acupuncture in terms of conventional neurophysiology. Acupuncture is often used to treat headache, especially migraine. For example, 9.9% of the acupuncture users in the U.S. survey mentioned above stated that they had been treated for migraine or other headaches (Burke 2006). Practitioners typically claim that a short course of treatment, such as 12 sessions over a 3-month period, can have a long-term impact on the frequency and intensity of headache episodes.
Multiple studies have shown that acupuncture has short-term effects on a variety of physiological variables relevant to analgesia (Bäcker 2004; Endres 2007). However, it is unclear to what extent these observations from experimental settings are relevant to the long-term effects reported by practitioners. It is assumed that a variable combination of peripheral effects; spinal and supraspinal mechanisms; and cortical, psychological or ’placebo’ mechanisms contribute to the clinical effects in routine care (Carlsson 2002). While there is little doubt that acupuncture interventions cause neurophysiological changes in the organism, the traditional concepts of acupuncture involving specifically located points on a system of ’channels’ called meridians are controversial (Kaptchuk 2002).
As in many other clinical areas, the findings of controlled trials of acupuncture for migraine and other headaches have not been conclusive in the past. In 1999 we published a first version of our review on acupuncture for idiopathic headache (Melchart 1999), and in 2001 we published an updated version in The Cochrane Library (Melchart 2001). In our 2001 update, we concluded that “overall, the existing evidence supports the value of acupuncture for the treatment of idiopathic headaches. However, the quality and the amount of evidence are not fully convincing.” In recent years several rigorous, large trials have been undertaken. Due to the increasing number of studies, and for clinical reasons, we decided to split our previous review on idiopathic headache into two separate reviews on migraine and tension-type headache (Linde 2009) for the present update.
We aimed to investigate whether acupuncture is a) more effective than no prophylactic treatment/routine care only; b) more effective than ’sham’ (placebo) acupuncture; and c) as effective as other interventions in reducing the frequency of headaches in patients with migraine.
We included controlled trials in which allocation to treatment was explicitly randomized, and in which patients were followed up for at least 8 weeks after randomization. Trials in which a clearly inappropriate method of randomization (for example, open alternation) was used were excluded.
Study participants had to be diagnosed with migraine. Studies focusing on migraine but including patients with additional tension-type headache were included. Studies including patients with headaches of various types (for example, some patients with migraine, some with tension-type headache) were included only if findings for migraine patients were presented separately or if more than 90% of patients suffered from migraine.
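The participant criterion for studies with mixed headache populations can be written as a small predicate. This is only an illustrative sketch: the function name and its arguments are hypothetical and do not come from the review.

```python
def mixed_population_eligible(n_migraine, n_total, migraine_results_separate):
    """Check the participant criterion for a study with mixed headache types.

    A study qualifies if findings for migraine patients are presented
    separately, or if more than 90% of its patients suffered from migraine.
    All names here are hypothetical and used for illustration only.
    """
    if n_total <= 0 or n_migraine < 0 or n_migraine > n_total:
        raise ValueError("patient counts are inconsistent")
    if migraine_results_separate:
        return True
    # Integer arithmetic avoids floating-point edge cases at exactly 90%.
    return n_migraine * 10 > n_total * 9
```

For example, a study in which 94 of 100 patients have migraine passes without separate reporting, whereas one with exactly 90 of 100 does not, because the criterion requires more than 90%.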
The treatments considered had to involve needle insertion at acupuncture points, pain points or trigger points, and had to be described as acupuncture. Studies investigating other methods of stimulating acupuncture points without needle insertion (for example, laser stimulation or transcutaneous electrical stimulation) were excluded.
Control interventions considered were:
Trials that only compared different forms of acupuncture were excluded.
Studies were included if they reported at least one clinical outcome related to headache (for example, response, frequency, pain intensity, headache scores, analgesic use). Trials reporting only physiological or laboratory parameters were excluded, as were trials with outcome measurement periods of less than 8 weeks (from randomization to final observation).
(See also: Pain, Palliative & Supportive Care Group methods used in reviews.)
For our previous versions of the review on idiopathic headache (Melchart 1999; Melchart 2001), we used a very broad search strategy to identify as many references on acupuncture for headaches as possible, as we also aimed to identify non-randomized studies for an additional methodological investigation (Linde 2002). The sources searched for the 2001 version of the review were:
The search terms used for the electronic databases were ’(acupuncture or acupressure)’ and ’(headache or migraine)’. In the years following publication of the 2001 review, the first authors regularly checked PubMed and CENTRAL using the same search terms. For the present update, detailed search strategies were developed for each database searched (see Appendix 1). These were based on the search strategy developed for MEDLINE, revised appropriately for each database. The MEDLINE search strategy combined a subject search strategy with phases 1 and 2 of the Cochrane Sensitive Search Strategy for RCTs (as published in Appendix 5b2 of the Cochrane Handbook for Systematic Reviews of Interventions, version 4.2.6 (updated Sept 2006)). Detailed strategies for each database searched are provided in Appendix 1.
The following databases were searched for this update:
In addition to the formal searches, one of the reviewers (KL) regularly checked (last search 15 April 2008) all new entries in PubMed identified by a simple search combining acupuncture AND (migraine OR headache), checked available conference abstracts and asked researchers in the field about new studies. Ongoing or unpublished studies were identified by searching three clinical trial registries (http://clinicaltrials.gov/, http://www.anzctr.org.au/, and http://www.controlled-trials.com/mrct/; last update 15 April 2008).
All abstracts identified by the updated search were screened by one reviewer (KL), who excluded those that were clearly irrelevant (for example, studies focusing on other conditions, reviews, etc.). Full texts of all remaining references were obtained and were again screened to exclude clearly irrelevant papers. All other articles and all trials included in our previous review of acupuncture for idiopathic headache were then formally checked by at least two reviewers for eligibility according to the above-mentioned selection criteria. Disagreements were resolved by discussion.
Information on patients, methods, interventions, outcomes and results was extracted independently by at least two reviewers using a specially designed form. In particular, we extracted exact diagnoses; headache classifications used; number and type of centers; age; sex; duration of disease; number of patients randomized, treated and analyzed; number of, and reasons for dropouts; duration of baseline, treatment and follow-up periods; details of acupuncture treatments (such as selection of points; number, frequency and duration of sessions; achievement of de-chi (an irradiating feeling considered to indicate effective needling); number, training and experience of acupuncturists); and details of control interventions (sham technique, type and dosage of drugs). For details regarding methodological issues and study results, see below.
Where necessary, we sought additional information from the first or corresponding authors of the included studies.
For the assessment of study quality, the new risk of bias approach for Cochrane reviews was used (Higgins 2008). We used the following six separate criteria:
We did not include the item ’other potential threats to validity’ in a formal manner, but noted if relevant flaws were detected.
In a first step, information relevant for making a judgment on a criterion was copied from the original publication into an assessment table. If additional information from study authors was available, this was also entered in the table, along with an indication that this was unpublished information. At least two reviewers independently made a judgment whether the risk of bias for each criterion was considered low, high or unclear. Disagreements were resolved by discussion.
For the operationalization of the first five criteria, we followed the recommendations of the Cochrane Handbook for Systematic Reviews of Interventions (Higgins 2008). For the ’selective reporting’ item, we decided to use a more liberal definition following discussion with two persons (Julian Higgins and Peter Jüni) involved in the development of the Handbook guidelines. Headache trials typically measure a multiplicity of headache outcomes at several time points using diaries, and there is a plethora of slightly different outcome measurement methods. While a single primary end-point is sometimes predefined, the overall pattern of a variety of outcomes is necessary to get a clinically interpretable picture. If the strict Handbook guidelines had been applied, almost all trials would have been rated ’unclear’ for the ’selective reporting’ item. We considered trials as having a low risk of bias for this item if they reported the results of the most relevant headache outcomes assessed (typically a frequency measure, intensity, analgesic use and response) for the most relevant time points (end of treatment and, if done, follow-up), and if the outcomes and time points reported made it unlikely that study investigators had picked them out because they were particularly favorable or unfavorable.
Trials that met all criteria, or all but one criterion, were considered to be of higher quality. Some trials had both blinded sham control groups and unblinded comparison groups receiving no prophylactic treatment or drug treatment. In the risk of bias tables, the ’Judgement’ column always relates to the comparison with sham interventions. In the ’Description’ column, we also include the assessment for the other comparison group(s). As the risk of bias table does not include a ’not applicable’ option, the item ’incomplete follow-up outcome data addressed (4 to 12 months after randomization)?’ was rated as ’unclear’ for trials that did not follow patients longer than 3 months.
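The ’higher quality’ rule can be illustrated with a short sketch. It assumes that a criterion counts as ’met’ only when the risk of bias was judged low, so that ’unclear’ and ’high’ both count as not met; this reading is an assumption, and the function name is hypothetical.

```python
ALLOWED_JUDGEMENTS = {"low", "high", "unclear"}


def is_higher_quality(judgements):
    """Classify a trial from its risk of bias judgements, one per criterion.

    Assumption (not stated explicitly in the text): a criterion is 'met'
    only if the risk of bias was judged 'low'. A trial counts as higher
    quality if it meets all criteria or all but one.
    """
    if not judgements:
        raise ValueError("at least one judgement is required")
    for judgement in judgements:
        if judgement not in ALLOWED_JUDGEMENTS:
            raise ValueError(f"unknown judgement: {judgement!r}")
    met = sum(1 for judgement in judgements if judgement == "low")
    return met >= len(judgements) - 1
```

Under this reading, a trial that did not follow patients longer than 3 months, and therefore received ’unclear’ on the long-term follow-up item, can still be classified as higher quality if the other five criteria were judged low risk.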
We also attempted to provide a crude estimate of the quality of acupuncture. Two reviewers (mostly GA and BB, or, for trials in which one of these reviewers was involved, AW) who are trained in acupuncture and have several years of practical experience answered two questions. First, they were asked how they would treat the patients included in the study. Answer options were ’exactly or almost exactly the same way’, ’similarly’, ’differently’, ’completely differently’ or ’could not assess’ due to insufficient information (on acupuncture or on the patients). Second, they were asked to rate their degree of confidence that acupuncture was applied in an appropriate manner on a 100-mm visual scale (with 0% = complete absence of evidence that the acupuncture was appropriate, and 100% = total certainty that the acupuncture was appropriate). The latter method was proposed by a member of the review team (AW) and has been used in a systematic review of clinical trials of acupuncture for back pain (Ernst 1998). In the Characteristics of included studies table, the acupuncturists’ assessments are summarized under ’Methods’ (for example, ’similarly/70%’ indicates a trial where the acupuncturist-reviewer would treat ’similarly’ and is ’70%’ confident that acupuncture was applied appropriately).
For the purposes of summarizing results, the included trials were categorized according to control groups: 1) comparisons with no acupuncture (acute treatment only or routine care); 2) comparisons with sham acupuncture interventions; 3) comparisons with prophylactic drug treatment; and 4) comparisons with other treatments.
We defined four time windows for which we tried to extract and analyze study findings:
In all included studies acupuncture treatment started immediately or very soon after randomization.
If more than one data point were available for a given time window, we used: for the first time window, preferably data closest to 8 weeks; for the second window, data closest to the 4 weeks after completion of treatment (for example, if treatment lasted 8 weeks, data for weeks 9 to 12); for the third window, data closest to 6 months; and for the fourth window, data closest to 12 months.
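The selection rule for multiple data points within a time window can be sketched as follows. The sketch makes several assumptions that are not in the text: each data point is represented by the week (counted from randomization) in which its measurement period ends, 6 and 12 months are approximated as 26 and 52 weeks, ties are resolved in favor of the earlier data point, and the caller passes only data points that already fall within the window in question. Function names are hypothetical.

```python
def window_targets(treatment_weeks):
    """Target week for each of the four time windows.

    Assumptions: the second window targets the end of the 4 weeks after
    completion of treatment; 6 and 12 months are taken as 26 and 52 weeks.
    """
    return {1: 8, 2: treatment_weeks + 4, 3: 26, 4: 52}


def closest_data_point(available_weeks, target_week):
    """Pick the data point closest to the target week.

    available_weeks: end weeks of the measurement periods reported by a
    trial within one time window. Ties go to the earlier data point
    (an assumption made for this sketch).
    """
    if not available_weeks:
        raise ValueError("no data points available for this window")
    return min(available_weeks, key=lambda week: (abs(week - target_week), week))
```

For a trial with 8 weeks of treatment, the second-window target is week 12, which matches the example of using data for weeks 9 to 12.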
We extracted data for the following outcomes:
For continuous measures we used, if available, the data from intention-to-treat analyses with missing values replaced; otherwise, we used the presented data on available cases.
All these outcomes rely on patient reports, mainly collected in headache diaries.
Post hoc we decided also to extract the number of patients reporting adverse effects and dropping out due to adverse effects for the trials comparing acupuncture and prophylactic drug treatment.
Although we consider measures such as number of migraine days to be preferable - because they are more informative and less subject to random variation - we decided to use the proportion of responders as the main outcome measure simply because this was most often reported in the studies in a manner that allowed effect size calculation. We chose the 3- to 4-month time window as the primary measure because this a) is typically close to the end of the treatment cycle, and b) is a time point for which outcome data are often available.
Pooled random-effects estimates, their 95% confidence intervals, the Chi² test for heterogeneity and the I² statistic were calculated for each time window for each of the outcomes listed above. Given the strong clinical heterogeneity, pooled effect size estimates can be considered to be only very crude indicators of the overall evidence. For this reason we also refrained from calculating numbers needed to treat to benefit (NNTBs).
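To make the pooled quantities concrete, the following is a minimal sketch of a random-effects meta-analysis. The text does not state which estimator was used; the sketch assumes the DerSimonian-Laird method and takes per-trial effect estimates (for example, log risk ratios for response) and their variances as given. The function name and return keys are hypothetical, and no trial data from the review are reproduced here.

```python
import math


def pool_random_effects(effects, variances):
    """DerSimonian-Laird random-effects pooling (assumed estimator).

    effects:   per-trial effect estimates on an additive scale
    variances: per-trial sampling variances (all positive)
    Returns the pooled estimate, its 95% confidence interval, Cochran's Q
    (the heterogeneity chi-square statistic), its degrees of freedom,
    the between-trial variance tau2 and the I2 statistic in percent.
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two trials with matching variances")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = k - 1

    # Method-of-moments estimate of the between-trial variance.
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # I2: share of total variation attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    # Random-effects weights add tau2 to each trial's variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    sum_re = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum_re
    se = math.sqrt(1.0 / sum_re)

    return {
        "pooled": pooled,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
        "q": q,
        "df": df,
        "tau2": tau2,
        "i2": i2,
    }
```

A p-value for the heterogeneity test would additionally require the chi-square distribution with `df` degrees of freedom, which is omitted here to keep the sketch free of external dependencies. If effects are pooled on the log scale, the estimate and interval limits need to be exponentiated before being reported as ratios.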
In our previous review on idiopathic headache (Melchart 2001), we evaluated 26 trials that included 1151 participants with various types of headaches. The search update identified a total of 251 new references. Full reports for three migraine trials (Alecrim 2005; Alecrim 2008; Jena 2008) that were reported only as abstracts at the time of completion of the literature search (January 2008) were later identified through personal contacts with study authors.
Most of the references identified by the search update were excluded at the first screening step by one reviewer, as they were clearly irrelevant. The most frequent reasons for exclusion at this level were: article was a review or a commentary; studies of non-headache conditions; clearly non-randomized design; and investigation of an intervention which was not true acupuncture involving skin penetration.
A total of 70 full-text papers were then formally assessed by at least two reviewers for eligibility. Thirty-two studies reported in 33 publications did not meet the selection criteria (see Characteristics of excluded studies). Common reasons for exclusion included: study group had non-migraine headache or included mixed pain populations without reporting data separately for the migraine subgroup (8 trials); interventions did not meet our definition of acupuncture (for example, laser acupuncture or transcutaneous electrical stimulation at acupuncture points; 6 trials); comparison of acupuncture with laser acupuncture or other acupuncture-like interventions (5 trials); and questionable random allocation (5 trials).
Twenty-two trials described in 37 publications (including published protocols, abstracts of trials otherwise not available at all or not available in English language, papers reporting additional aspects such as treatment details or cost-effectiveness analyses) met all selection criteria and were included in the review. The total number of study participants was 4419. One large study (n = 401) in which 6% of patients suffered from tension-type headache only was included, as 94% of patients had migraine as a primary diagnosis (Vickers 2004). Two studies with a larger proportion of patients with tension-type headache were also included because separate subgroup data for migraine patients were available (Jena 2008; Wylie 1997). Patients included in these two studies who had only tension-type headache are not included in the number of patients and other figures below. Ten of the 22 included trials (Baust 1978; Ceccherelli 1992; Doerr-Proske 1985; Dowson 1985; Henry 1985; Hesse 1994; Vincent 1989; Weinschütz 1993; Weinschütz 1994; Wylie 1997) had been included in our previous review; the remaining 12 trials (Alecrim 2005; Alecrim 2006; Alecrim 2008; Allais 2002; Diener 2006; Facco 2008; Jena 2008; Linde K 2005; Linde M 2000; Linde M 2005; Streng 2006; Vickers 2004) are new.
Searches in the clinical trial registers identified four ongoing trials (Liang; Vas; Wang; Zheng; see Characteristics of ongoing studies).
A total of 4419 migraine patients participated in the included studies. The mean number of patients in each trial was 201, with a median of 42. The smallest trial included 27 patients and the largest 1715. Five trials had between 114 and 401 participants (Allais 2002; Facco 2008; Linde K 2005; Streng 2006; Vickers 2004); the two largest trials had 960 (Diener 2006) and 1715 participants (Jena 2008). Five of the larger trials were multicenter studies; all others were performed in a single center. The 10 older trials included in the previous version of our review had included a total of 407 migraine patients.
Eight trials originated from Germany, four from the UK, three each from Italy and Brazil, two from Sweden and one each from Denmark and France. We were able to obtain additional information from the authors of 16 trials; however, for most older trials the amount of additional information was very limited. Detailed additional data relevant for the calculation of effect size measures were received for eight trials (Alecrim 2005; Alecrim 2006; Alecrim 2008; Diener 2006; Jena 2008; Linde K 2005; Streng 2006; Vincent 1989).
All trials used parallel-group designs; no trial had a cross-over design. Eighteen trials had two groups (one acupuncture group and a control group), three trials were three-armed (Diener 2006; Doerr-Proske 1985; Linde K 2005) and one trial had four groups (Facco 2008). Six trials included a group which either received treatment of acute attacks only (Doerr-Proske 1985; Facco 2008; Linde K 2005; Linde M 2000) or ’routine care’ that was not specified by protocol (Jena 2008; Vickers 2004), while the experimental group received acupuncture in addition. Fourteen trials had a sham control group. Sham techniques varied considerably. In three trials existing acupuncture points considered inadequate for the treatment of migraine were needled superficially (Alecrim 2005; Alecrim 2006; Alecrim 2008); in five trials superficial needling of non-acupuncture points at variable distance from true points was used (Diener 2006; Linde K 2005; Vincent 1989; Weinschütz 1993; Weinschütz 1994); and in a further two trials close non-acupuncture points were needled without indication of needling depth (Baust 1978; Henry 1985). In two trials (Linde M 2005; Facco 2008) ’placebo’ needles (telescope needles with blunt tips not penetrating the skin) were used. In Linde M 2005 these were placed at the same predefined points as in the true treatment group. Facco 2008 had two sham groups: in one group the placebo needles were placed at correct, individualized points after the same full process of Chinese diagnosis as in the true treatment group. In the second group placebo needles were placed at standardized points without the ’Chinese ritual’ (to investigate whether the different interaction and process affected outcomes). In the remaining two trials (Ceccherelli 1992; Dowson 1985) other sham interventions without skin penetration were applied. 
Four trials compared acupuncture to prophylactic drug treatment with metoprolol (Hesse 1994; Streng 2006), flunarizine (Allais 2002) or individualized treatment according to guidelines (Diener 2006). In three of these trials participants were unblinded, while one blinded trial used a double-dummy approach (true acupuncture + metoprolol placebo vs. metoprolol + sham acupuncture; Hesse 1994). One trial compared acupuncture to a specific relaxation program (and a waiting list; Doerr-Proske 1985), and one to a combination of massage and relaxation (Wylie 1997).
Most trials included patients diagnosed as having migraine with or without aura, or reported only that they included patients with migraine. One trial was restricted to women with migraine without aura (Allais 2002), one recruited only women with menstrually related migraine (Linde M 2005) and a third recruited only patients with migraine without aura (Linde M 2000). Two older, small trials explicitly stated that included patients had been non-responders to previous treatments (Baust 1978; Doerr-Proske 1985).
It is likely that there is some diagnostic inaccuracy in several trials. In two older trials (Dowson 1985; Vincent 1989) the high number of headache days during the baseline phase makes it seem likely that a relevant proportion of participants had additional tension-type headache. In two large, recent pragmatic, multicenter trials investigating the addition of acupuncture to routine care in primary care (Jena 2008; Vickers 2004), baseline headache frequency and the reported diagnoses make it likely that, in spite of the use of the criteria of the International Headache Society, there was some diagnostic misclassification. This applies to a minor extent also to three other recent multicenter trials (Diener 2006; Linde K 2005; Streng 2006). In the two large, pragmatic, routine care studies (Jena 2008; Vickers 2004), which left non-acupuncture treatment completely to the individual practitioner, it also seems likely that treatment of acute attacks was suboptimal in a relevant proportion of patients.
The acupuncture interventions tested in the included trials also varied to a great extent. In four trials (Allais 2002; Ceccherelli 1992; Doerr-Proske 1985; Henry 1985) acupuncture treatments were standardized (all patients were treated at the same points); in six (Alecrim 2006; Baust 1978; Diener 2006; Linde K 2005; Linde M 2000; Linde M 2005) treatments were semi-standardized (either all patients were treated at some basic points and additional individualized points, or there were different predefined needling schemes depending on symptom patterns); and in 12 trials the selection of acupuncture points was individualized (Alecrim 2005; Alecrim 2008; Dowson 1985; Facco 2008; Hesse 1994; Jena 2008; Streng 2006; Vickers 2004; Vincent 1989; Weinschütz 1993; Weinschütz 1994; Wylie 1997). In four trials treatment consisted of six acupuncture sessions (Baust 1978; Dowson 1985; Vincent 1989; Wylie 1997), which must be considered a low number for a chronic condition. In four trials 16 to 20 sessions were provided (Alecrim 2005; Alecrim 2006; Alecrim 2008; Facco 2008), while the remaining trials included between 7 and 15 sessions. In most trials reporting the duration of sessions, needles were left in place between 20 and 30 minutes; in one trial (Dowson 1985) needles were inserted for 10 minutes only, and one trial (Hesse 1994) investigated brief needling for a few seconds. In the case of one trial (Doerr-Proske 1985), both assessing acupuncturists had very little confidence that acupuncture was performed in an adequate manner and would have treated the patients in a completely different manner.
All but three trials (Facco 2008; Henry 1985; Jena 2008) used a headache diary for measuring primary outcomes. Two trials (Baust 1978; Ceccherelli 1992) did not include a pre-treatment baseline period. Twelve trials followed patients for 6 months or more after randomization. The complex headache data on frequency, intensity, medication use and response were presented in a highly variable manner, making systematic extraction difficult. Particularly, most small, older trials (Baust 1978; Ceccherelli 1992; Doerr-Proske 1985; Henry 1985; Hesse 1994; Weinschütz 1993; Weinschütz 1994; Wylie 1997) presented the findings in a way precluding effect size estimation for migraine days, migraine attacks, headache days, intensity and analgesic use.
We discuss the methodological quality of trials (risk of bias) for the four comparisons separately, as problems differ according to control groups.
The four largest trials (Facco 2008; Jena 2008; Linde K 2005; Vickers 2004) all used adequate methods for allocation sequence generation and concealment of allocation. For one trial (Linde M 2000) sequence generation was adequate but concealment was inadequate. One trial (Doerr-Proske 1985) did not report any details on randomization, and we were not able to obtain additional information. Given the comparison between acupuncture and no acupuncture, the patients (who were also assessing all relevant outcomes) were unblinded in all six trials. In consequence, bias cannot be ruled out. The use of headache diaries to monitor symptoms closely over a long period of time (in Doerr-Proske 1985; Linde K 2005; Linde M 2000; Vickers 2004) might be less prone to bias than the use of questionnaires with retrospective assessment of symptoms for the preceding weeks. Attrition in the first 3 months was high in Linde M 2000 and minor to moderate in the remaining trials. The analyses of Jena 2008, Linde K 2005 and Vickers 2004 took account of attrition (primary or sensitivity analysis with missing values replaced that confirmed available data analyses), suggesting a low risk of bias. This applies also to the long-term follow-up in Vickers 2004, while Facco 2008 presented only a per protocol analysis. Although presentation of results was not always optimal, we considered the risk of selective reporting to be low as the most important outcome measures were always presented and consistent.
While comparisons with no acupuncture cannot be blinded and, therefore, bias cannot be ruled out in the patient assessment of the (subjective) headache outcomes in any trial, we consider the trials of Jena 2008, Linde K 2005 and Vickers 2004 to have a lower risk of bias compared to the other three trials.
We did not formally assess the quality of Alecrim 2005, for which only an abstract and additional unpublished information provided by the authors were available. Unpublished information provided by the authors and published information from the two other trials (Alecrim 2006; Alecrim 2008) conducted by the same group suggest that the risk of bias in this trial is low. Among the 13 trials formally assessed, the risk of bias regarding sequence generation was low for eight (Alecrim 2006; Alecrim 2008; Ceccherelli 1992; Diener 2006; Dowson 1985; Facco 2008; Linde K 2005; Linde M 2005) and unclear in five. Publications for four trials reported adequate methods of allocation concealment (Alecrim 2006; Alecrim 2008; Diener 2006; Linde K 2005); for a further two trials, such information was provided by the authors (Ceccherelli 1992; Facco 2008). In all trials there were attempts to blind patients. Several trials that used sham interventions which were not strictly indistinguishable from ’true’ acupuncture (Ceccherelli 1992; Diener 2006; Facco 2008; Linde K 2005) did not mention the use of a sham or placebo control in the informed consent procedure. This is ethically problematic, but enhances the credibility of the sham interventions. Taking into account also the results of the trials, we considered the risk of bias to be low in all trials except in one that used a distinguishable sham procedure and for which we could not obtain information on the method of informed consent (Dowson 1985). Reporting of dropouts was insufficient in several older trials. We considered the risk of bias to be low regarding short-term outcomes (up to 3 months) in seven trials (Alecrim 2006; Alecrim 2008; Diener 2006; Dowson 1985; Linde K 2005; Linde M 2005; Vincent 1989), and low regarding long-term outcomes in four (Alecrim 2008; Diener 2006; Linde K 2005; Linde M 2005).
For four trials (Baust 1978; Dowson 1985; Weinschütz 1993; Weinschütz 1994) outcomes were reported so insufficiently that selective reporting cannot be ruled out.
One trial (Hesse 1994) did not describe the methods for sequence generation and concealment, while these were adequate in the other three trials (Allais 2002; Diener 2006; Streng 2006). These three trials compared acupuncture and drug treatment in an open manner, which implies that bias on this level cannot be ruled out. The use of a double-dummy technique allowed patient blinding in Hesse 1994, but this approach might be associated with other problems (see Discussion). While there is little risk of bias due to low attrition rates in Allais 2002 and Hesse 1994, a relevant problem occurred in the two German trials (Diener 2006; Streng 2006). The recruitment situation for these trials made it likely that participants had a preference for acupuncture. This resulted in a high proportion of patients allocated to drug treatment withdrawing informed consent immediately after randomization (34% in Diener 2006 and 13% in Streng 2006), as well as high treatment discontinuation (18% in Diener 2006) or dropout rates due to adverse effects (16% in Streng 2006). These trials did not include patients refusing informed consent immediately after randomization in analyses, and one (Streng 2006) also excluded early dropouts. Such analyses should normally tend to favor drug treatment. Both trials presented additional analyses restricted to patients complying with the protocol. All four trials presented the most important outcomes measured, so we considered the risk of bias of selective reporting to be low.
The two small trials comparing acupuncture with relaxation (Doerr-Proske 1985) or a combination of relaxation and massage (Wylie 1997) did not report on the methods used for generation of the allocation sequence, on concealment or on dropouts. Therefore, the risk of bias is unclear for these aspects. Patients were not blinded. Although the reporting of outcomes was suboptimal (no standard deviations, etc.), the most relevant outcomes measured were presented, and we considered the risk of bias of selective reporting to be low.
The six trials comparing acupuncture with a control group receiving either treatment of acute migraine attacks only or routine care are clinically very heterogeneous. Doerr-Proske 1985 is a very small older trial investigating a probably inadequate acupuncture treatment (see assessments by acupuncturists in Characteristics of included studies) compared to both a relaxation control and a waiting list control. Facco 2008 performed a four-armed trial in which patients in the control group all received acute treatment with rizatriptan. Linde M 2000 was a small pilot trial (n = 39) performed in a specialized migraine clinic in Sweden in which control patients continued with their individualized treatment of acute attacks but did not receive additional acupuncture. A similar approach was used for the waiting-list control group in the three-armed (also sham control group) Linde K 2005 (n = 302) trial. Jena 2008 is a very large, highly pragmatic study which included a total of 15,056 headache patients recruited by more than 4000 physicians in Germany. A total of 11,874 patients not giving consent to randomization received up to 15 acupuncture treatments within 3 months and were followed for an additional 3 months. This was also the case for 1613 patients randomized to immediate acupuncture, while the remaining 1569 patients remained on routine care (not further defined) for 3 months and then received acupuncture. The published analysis of this trial is on all randomized patients, but the authors provided us with unpublished results of subgroup analyses on the 1715 patients with migraine. Finally, in the Vickers 2004 trial (n = 401), acupuncture in addition to routine care in the British National Health Service was compared to a strategy, ’avoid acupuncture.’ In addition to the strong clinical heterogeneity, the methods and timing of outcome measurement in these trials also differed considerably. 
Therefore, any pooled effect size measures in the forest plots should be interpreted only as very crude indicators of the overall direction of the findings. Nevertheless, the findings clearly show that response, headache frequency, headache days and headache scores 3 to 4 months after randomization are more favorable in patients receiving acupuncture (see Figure 1; Figure 2; Analysis 1.5; Analysis 1.8). Responder rate ratios 3 to 4 months after randomization in the four trials reporting this outcome varied between 1.43 and 3.53. For analgesic use, the findings differed strongly across studies (Analysis 1.7). Migraine attacks and migraine days were adequately measured in only two trials (Linde K 2005; Linde M 2000). Only Vickers 2004 included a long-term follow-up. In this study, patients who had received acupuncture still did significantly better than those receiving routine care 9 months after completion of treatment.
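As an illustration of how a responder rate ratio of the kind reported above is obtained from a single two-arm trial, the following minimal Python sketch divides the proportion of responders in the acupuncture group by the proportion in the control group and adds a log-scale Wald 95% confidence interval. The function name and the counts are hypothetical and are not taken from any included trial.

```python
import math

def rate_ratio(resp_treat, n_treat, resp_ctrl, n_ctrl, z=1.96):
    """Responder rate ratio with a log-scale Wald confidence interval.

    resp_* are numbers of responders, n_* are group sizes.
    """
    rr = (resp_treat / n_treat) / (resp_ctrl / n_ctrl)
    # Standard error of log(RR) for two independent binomial proportions
    se = math.sqrt(1 / resp_treat - 1 / n_treat + 1 / resp_ctrl - 1 / n_ctrl)
    lower = math.exp(math.log(rr) - z * se)
    upper = math.exp(math.log(rr) + z * se)
    return rr, lower, upper

# Hypothetical counts for illustration only (not data from the review):
# 60 of 100 responders with acupuncture, 30 of 100 with routine care.
rr, lo, hi = rate_ratio(60, 100, 30, 100)
print(f"RR = {rr:.2f} (95% CI {lo:.2f} to {hi:.2f})")
```

A ratio above 1 with a confidence interval excluding 1 indicates a higher response rate in the acupuncture group; given the unblinded design of these comparisons, such a figure describes the direction of the finding rather than a bias-free effect.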
The clinical heterogeneity of the 14 sham-controlled trials is less extreme than in the case of comparisons with no acupuncture, but is still considerable. Due to the variability of treatment and sham interventions, here too any pooled effect size estimates must be interpreted with caution. Furthermore, despite the very limited power (low number of trials), the Chi² test for statistical heterogeneity was statistically significant (P < 0.05) in 9 of the 25 analyses and was close to significance (0.05 < P < 0.1) in a further three. I² values were above 50% (indicating strong statistical heterogeneity) in 13 comparisons, and between 25% and 50% in a further four. Response measures were reported by 7 trials for the period up to 2 months after randomization, by 11 for 3 to 4 months, by 6 at 5 to 6 months, and by 3 after 6 months. Pooled responder rate ratios were not statistically significant at any period (see Figure 3). The same applies to mixed headache frequency measures (six, eight, five and four trials at the four different periods; see Figure 4), migraine attacks (four, five, four and four trials; Analysis 2.3), migraine days (five, six, five and four trials; Analysis 2.4), headache days (two, two, two and zero trials; Analysis 2.5), headache intensity (zero, three, three and one trial; Analysis 2.6), analgesic use (four, six, five and four trials; Analysis 2.7) and headache scores (one, three, two and zero trials; Analysis 2.8). There was some evidence of group differences (0.05 < P < 0.1) in four analyses (responder rate ratio, headache frequency, migraine days and migraine attacks up to 2 months after randomization).
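The heterogeneity statistics referred to above can be sketched as follows: Cochran's Q (the Chi² statistic for heterogeneity) is the inverse-variance weighted sum of squared deviations of the trial effects from their fixed-effect pooled mean, and I² expresses the share of Q that exceeds its degrees of freedom. The Python sketch below assumes effect estimates and their variances are already available; the function name and the three input values are hypothetical and do not correspond to any analysis in this review.

```python
def heterogeneity(effects, variances):
    """Return Cochran's Q, its degrees of freedom and I-squared (in percent)."""
    weights = [1.0 / v for v in variances]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Q: weighted squared deviations from the fixed-effect pooled estimate
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I-squared is truncated at zero when Q is smaller than its df
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    return q, df, i2

# Hypothetical standardized mean differences and variances (illustration only)
q, df, i2 = heterogeneity([0.1, 0.5, 0.9], [0.04, 0.04, 0.04])
print(f"Q = {q:.2f} on {df} df, I2 = {i2:.1f}%")
```

With only a handful of trials per analysis, Q has little power, which is why a significant result despite the small number of trials is noteworthy.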
When restricted to the five studies of higher quality (Alecrim 2006; Alecrim 2008; Diener 2006; Linde K 2005; Linde M 2005), analyses of response and headache frequency also failed to yield significant differences between acupuncture and sham acupuncture.
The results of Hesse 1994 regarding treatment effectiveness were not reported in a manner that allowed effect size estimation. Overall, the findings of this trial, which used a double-dummy design (true acupuncture + metoprolol placebo vs. metoprolol + sham acupuncture), show similar improvements in both groups, slightly favouring the metoprolol + sham acupuncture group. The acupuncture technique used in this trial (very brief needling of individual trigger points) is rather unusual. The remaining three trials all reported at least some frequency data (migraine attacks and/or migraine days). Findings were consistent among trials, and the pooled standardized mean differences were statistically significant in favour of acupuncture in the first three time periods (none of the trials had a follow-up beyond 6 months; see Figure 5). For response (see Figure 6), migraine attacks (Analysis 3.3), migraine days (Analysis 3.4), headache intensity (Analysis 3.6) and analgesic use (Analysis 3.7), effect size estimates could be calculated for at least two trials. The reduction of analgesic use was similar in patients receiving acupuncture and prophylactic drug treatment, but for several time windows, results for response, migraine attacks, migraine days and intensity were statistically significant in favour of the acupuncture groups.
All four trials described the number of patients reporting adverse effects. In all four, more patients receiving drug treatment reported adverse effects than patients receiving acupuncture, but the difference was less pronounced in the largest trial (Diener 2006) compared to the other three trials (test for heterogeneity P = 0.01, I² = 73.1%). The pooled odds ratio was 0.47 (95% confidence interval 0.34 to 0.65; Analysis 3.9). In the two trials reporting the number of dropouts due to adverse effects, this was lower in patients receiving acupuncture (Analysis 3.10).
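For readers unfamiliar with how a pooled odds ratio under a random-effects model is formed, the following Python sketch combines per-trial log odds ratios with the DerSimonian-Laird estimator. The choice of this particular estimator is an assumption (the review states only that a random-effects model was used), and the function names and event counts are hypothetical, not the data behind Analysis 3.9.

```python
import math

def log_odds_ratio(events_a, total_a, events_b, total_b):
    """Log odds ratio (group A vs. group B) and its variance from a 2x2 table."""
    a, b = events_a, total_a - events_a
    c, d = events_b, total_b - events_b
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d

def dersimonian_laird(effects, variances):
    """Random-effects pooled estimate, its standard error and tau-squared."""
    w = [1.0 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    c = sw - sum(wi ** 2 for wi in w) / sw
    tau2 = max(0.0, (q - df) / c)  # between-trial variance, truncated at zero
    w_re = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    return pooled, math.sqrt(1.0 / sum(w_re)), tau2

# Hypothetical trials: (patients with adverse effects, group size) for
# acupuncture, then for drug treatment. Illustration only.
trials = [(10, 50, 20, 50), (8, 40, 18, 40), (30, 100, 40, 100)]
stats = [log_odds_ratio(*t) for t in trials]
pooled, se, tau2 = dersimonian_laird([s[0] for s in stats], [s[1] for s in stats])
pooled_or = math.exp(pooled)
ci_low, ci_high = math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)
print(f"Pooled OR = {pooled_or:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
```

An odds ratio below 1 in this orientation means fewer patients reporting adverse effects with acupuncture than with drug treatment; when heterogeneity is high, the random-effects interval widens accordingly.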
The two small trials comparing acupuncture with relaxation (Doerr-Proske 1985) and a combination of relaxation and massage (Wylie 1997) did not report any outcome measures in a manner usable for calculation of effect size estimates. In Doerr-Proske 1985 overall results suggest short- and long-term superiority of the relaxation program compared to the probably inadequate acupuncture intervention. Wylie 1997 reported a significantly larger short-term (no follow-up beyond 2 months) reduction of pain total and headache scores in the group receiving massage and relaxation, but baseline values were much lower in the acupuncture group (189 vs. 326 for the pain total score, and 23 vs. 38 for the headache index). The mean number of migraine days decreased from 7.1 to 1.7 in the acupuncture group, and from 7.5 to 2.7 in the massage and relaxation group.
In recent years, the evidence base for acupuncture as a prophylactic treatment for headache has grown considerably due to the publication of several large trials of high quality. Still, the results are challenging and not easy to interpret. Several trials using quite variable methods and interventions consistently show that the addition of acupuncture to treatment of acute migraine attacks or to routine care is beneficial for at least 3 months. Compared to routine care, which includes treatment of acute migraine attacks and possibly other interventions, the size of the effect seems to be small to moderate (according to usual standards for classifying effect size measures such as standardized mean differences); it seems to be larger compared to acute treatment only. The only trial which investigated long-term effects showed a sustained moderate response to acupuncture in addition to routine care provided by a GP. There is currently no evidence that the acupuncture interventions tested had relevant effects over their sham comparators, although a number of single trials report significant findings. At the same time, the pooled analyses of the available trials comparing acupuncture interventions with evidence-based prophylactic drug treatment found a superiority of acupuncture. The findings from two small older trials comparing acupuncture and relaxation interventions are not reliably interpretable.
The findings of our review seem contradictory: on the one hand, the available evidence suggests that acupuncture is an effective adjunct to routine care and at least as effective as prophylactic treatment with drugs that have been shown to be superior to placebo (Schürks 2008). On the other hand, ’true’ acupuncture interventions do not seem to be superior to sham interventions. Three factors could explain these findings (possibly in combination): 1) Acupuncture might be a particularly potent placebo; 2) sham acupuncture might have direct physiological effects affecting mechanisms relevant for migraine symptoms; 3) due to the lack of blinding, comparisons with routine care and prophylactic drug treatment might be biased.
We consider each of these possible explanations in turn:
A fourth possible explanation for the lack of effects of true acupuncture over sham comes from the perspective of acupuncture practitioners. The quality of acupuncture interventions in clinical trials is often disputed. Study protocols often limit the flexibility of treatment procedures, particularly in sham-controlled trials, and it is argued that better acupuncturists would have achieved better results. However, response rates in sham-controlled trials were on average similar to those in pragmatic trials with flexible treatments. Furthermore, while there is always the possibility that some expert acupuncturists are particularly successful, in several of the larger trials included in this review the training of treatment providers was at least comparable to that of the average acupuncturists in their country. Still, it cannot be ruled out that inadequate study interventions contribute to the lack of differences compared to sham interventions.
It should be noted that a statistically significant difference between ’true’ and sham acupuncture interventions was found in our systematic review on trials in patients with tension-type headache (Linde 2009). This review, however, included a smaller number of studies, and pooled effect estimates were strongly influenced by one large trial.
The quality of clinical trials of acupuncture for headache has clearly improved since the last version of our review. Methods for sequence generation, allocation concealment, handling of dropouts and withdrawals and reporting of findings were adequate in most of the recent trials. Still, designing and performing clinical trials of acupuncture is a challenge, particularly with respect to blinding and selection of control interventions. We have mentioned that bias cannot be ruled out in the unblinded studies, and that comparisons with prophylactic drug treatment have to be interpreted with caution due to high dropout rates in two of the trials. Blinding in comparisons with drug treatment could be achieved by double-dummy designs (drug + sham acupuncture vs. acupuncture + drug placebo) as in the trial by Hesse 1994. However, if it is the case that sham acupuncture interventions might be strong placebos and not physiologically inert, this approach would also be problematic.
Acupuncture is a therapy which is applied in a variable manner in different countries and settings. For example, in Germany, where the majority of the large trials included in this review were performed, acupuncture is mainly provided by general practitioners and other physicians. Their approach to acupuncture is based on the theories of traditional Chinese medicine, although the amount of training they receive in traditional Chinese medicine is limited (Weidenhammer 2007). In the UK, the providers are likely to be non-medical acupuncturists with a comparatively intense traditional training, physiotherapists or medical doctors with a more ’Western’ approach (Dale 1997). The trials included in our review come from a variety of countries, and study designs range from very pragmatic (Jena 2008; Vickers 2004) to more experimental (Linde M 2005). Despite this strong heterogeneity, within comparisons the findings are quite consistent. Large-scale observational studies (Jena 2008; Melchart 2006), a review of smaller observational studies (Linde 2002) and a systematic comparison of findings from a randomized and an observational study (Linde 2007a) suggest that the response rates observed in clinical trials are also seen in conditions similar to routine practice. However, as the overall evidence also suggests that factors other than the correct selection of acupuncture points and needling procedures play an important role in outcomes, treatment setting and patient selection could have a strong impact and might vary considerably. For example, a pooled analysis of four trials on chronic pain (including Linde K 2005) found that even 4 months after completion of treatment, patients who had started acupuncture with a positive attitude and expectation had significantly better outcomes than patients with lower expectations (Linde 2007b).
We are confident that we have identified the existing large clinical trials relevant to our question, but we cannot rule out the possibility that there are additional small trials which are unpublished or published in sources not accessible to our search. We have not systematically searched Chinese databases for this version of the review, but we assume that Chinese trials meeting our selection criteria exist. The few Chinese trials identified through our literature search did not meet the inclusion criteria. There is considerable skepticism toward clinical trials from China, as in the past results were almost exclusively positive (Vickers 1998). However, the quality and number of randomized trials published in Chinese have improved over the last years (Wang 2007), and it seems inadequate to neglect this evidence without examining it critically. For the next update of this review we plan to include researchers and evidence from China to overcome this shortcoming.
A relevant problem for systematic reviews on prophylactic treatments of migraine is the highly variable outcome measurement and the often insufficient reporting of results. Various measures of frequency, intensity, analgesic use and other outcomes are used, and as these measures have to be observed over longer time periods, the amount of data needed to obtain a good overview of the course of symptoms is considerable. Most trials in our review reported several outcome measures at different time points without evidence that these were selected in a biased way. Nevertheless, we were confronted with a complex mosaic of data. Several authors kindly provided unpublished data. Some sort of response and frequency measure was available for almost all trials, although the timing of the measurement and details of the measure often differed. As overall results are rather consistent, it seems unlikely that our results would have changed in a relevant manner if missing data had been available.
Four members of the review team were involved in at least one of the included trials. These trials were assessed by other members of the review team. All reviewers currently have affiliations to a CAM (complementary and alternative medicine) research center, or have had such an affiliation in the past.
Our findings are in good accordance with a recent systematic review published in an acupuncture journal (Scott 2006). Using slightly wider inclusion criteria regarding methodology and condition, the Scott review summarized a total of 25 trials. Another systematic review published in 2006 (Griggs 2006) did not include trials published after 2004, excluded trials published in languages other than English, and included trials on other headaches, although the title suggests a focus on migraine. Its conclusion that large trials are needed is therefore not based on the most current evidence. Only five of the trials included in our review were included in the Griggs 2006 review. The remaining trials included in Griggs 2006 were either on tension-type headache (n = 6) or mixed populations (n = 1) or, in one case, a migraine trial that was excluded by us (Liguori 2000) because we had severe doubts that allocation was truly randomized. A large narrative review focusing on recent trials (Endres 2007) also draws conclusions similar to ours.
The assessment of safety was not a predefined objective of this review. Post-hoc analyses for comparisons with prophylactic drug treatments found fewer patients reporting adverse effects and fewer dropouts due to adverse effects in the acupuncture groups. We will include a more formal assessment of safety in future versions of this review. Several large-scale observational studies have provided good evidence that acupuncture is a comparatively safe intervention (White 2001; MacPherson 2001; Weidenhammer 2007; Witt 2006). Severe adverse effects such as pneumothorax are very rare. However, between 8% and 11% of patients report minor adverse effects such as fatigue or temporary aggravations (Witt 2006; Melchart 2006).
For the two large pragmatic trials included in our review (Vickers 2004; Jena 2008), detailed cost-effectiveness analyses are available (Wonderling 2004; Witt 2008). Both analyses show that costs within the study periods (12 months in Vickers 2004 and 3 months in Jena 2008) were higher in the groups receiving acupuncture than in those receiving routine care because of acupuncture practitioners’ costs. Cost-effectiveness was assessed by calculating incremental costs per quality-adjusted life year. The resulting estimates were 13,600 Euro in the analysis by Wonderling 2004 and 11,700 Euro in the analysis by Witt 2008. Both groups concluded that according to international threshold values, acupuncture seems to be a cost-effective treatment.
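The incremental cost per quality-adjusted life year (QALY) used in these analyses is the difference in mean costs between the two strategies divided by the difference in mean QALYs. The Python sketch below shows the arithmetic; the function name, the costs, the QALY values and the threshold are all hypothetical and are not the inputs used by Wonderling 2004 or Witt 2008.

```python
def incremental_cost_per_qaly(cost_new, cost_old, qaly_new, qaly_old):
    """Incremental cost-effectiveness ratio: extra cost per extra QALY gained."""
    delta_cost = cost_new - cost_old
    delta_qaly = qaly_new - qaly_old
    if delta_qaly <= 0:
        # The ratio is not meaningful when the new strategy gains no QALYs
        raise ValueError("new strategy must gain QALYs for the ratio to apply")
    return delta_cost / delta_qaly

# Hypothetical per-patient means (illustration only):
# acupuncture plus routine care vs. routine care alone.
icer = incremental_cost_per_qaly(cost_new=900.0, cost_old=500.0,
                                 qaly_new=0.72, qaly_old=0.70)
threshold = 50000.0  # hypothetical willingness-to-pay threshold per QALY
print(f"ICER = {icer:.0f} per QALY; below threshold: {icer <= threshold}")
```

A strategy is usually described as cost-effective when this ratio falls below the threshold value a health system is willing to pay per QALY; the threshold itself varies between countries.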
Although the available results suggest that the selection of specific points is not as important as had been thought by providers, acupuncture should be considered as a treatment option for migraine patients needing prophylactic treatment due to frequent or insufficiently controlled migraine attacks, particularly in patients refusing prophylactic drug treatment or experiencing adverse effects from such treatment.
There is a clear need for further studies. A priority, in our opinion, should be to investigate whether the high response rates observed in conditions similar to routine care in Germany and the UK are reproducible elsewhere. As migraine is a chronic condition, it would be important for clinicians to know how long improvements associated with acupuncture treatment last and whether a further treatment cycle again leads to improvement. These latter questions might be best investigated in cohort studies. Available studies have been rather unsuccessful at identifying reliable predictors for treatment response (Jena 2008; Weidenhammer 2006); these issues could also be investigated in observational studies. For decision makers it would be important to know who is sufficiently qualified to deliver acupuncture. Studies from Germany did not find an association between factors such as amount of training or professional experience and treatment response (Jena 2008; Weidenhammer 2006), but these studies were limited to physicians. Randomized trials comparing outcomes after treatment by different types of practitioners are desirable, although large sample sizes would be needed. Such studies would also be interesting from a more scientific perspective because it is unclear to what extent the effects of acupuncture are mainly mediated by context variables and generalised (i.e., not specific to traditional points) needling effects, and what contribution correct point location makes. Although future sham-controlled trials might find ’specific’ effects over sham interventions, we think that such studies should not have the highest priority unless they also address other important questions. Other aspects that deserve further research include questions such as which types of acupuncture work best, what is the optimal frequency and duration of sessions, and so on. Future comparisons with other non-drug interventions (such as relaxation) should have sufficient sample size. 
To facilitate future meta-analyses, it would be helpful if some standards for reporting outcome data were established.
We would like to thank the study authors who provided additional information on their trials.
Dieter Melchart, Patricia Fischer and Brian Berman were involved in previous versions of the review. Eva Israel helped assess eligibility. Lucia Angermayer helped with data checks and quality assessments. Sylvia Bickley performed search updates, and Becky Gray helped in various ways.
SOURCES OF SUPPORT
PaPaS trials register search strategy
((acupunctur* OR electroacupunct* or electro-acupunct*) AND (headache* OR migrain* OR hemicrania OR cephalgi* or cephalalgi*))
CENTRAL search strategy
MEDLINE via OVID subject search strategy
The above subject search was linked to the following MEDLINE via OVID Cochrane sensitive search strategy for RCTs
(Revised SRB Jan 07)
EMBASE via OVID subject search strategy
The above subject search was linked to the following Study design filter for EMBASE via OVID
COCHRANE Complementary Medicine Field trials register
This register was searched via CENTRAL using the search strategy described above.
CONTRIBUTIONS OF AUTHORS
All reviewers participated in the development of the protocol, the extraction and assessment of the primary studies and the review of the final manuscript. KL coordinated the review process and wrote the draft of the review.
DECLARATIONS OF INTEREST
This review includes trials in which some of the reviewers were involved, as follows: Allais 2002 - Gianni Allais; Jena 2008 - Benno Brinkhaus; Linde K 2005 - Benno Brinkhaus and Klaus Linde; Streng 2006 - Klaus Linde; and Vickers 2004 - Andrew Vickers. These trials were reviewed by at least two other members of the review team. Gianni Allais, Benno Brinkhaus and Adrian White use acupuncture in their clinical work. Gianni Allais receives fees for teaching acupuncture in private schools. Klaus Linde has received travel reimbursement and, in two cases, fees from acupuncture societies (British, German and Spanish Medical Acupuncture Societies; Society of Acupuncture Research) for speaking about research at conferences. Eric Manheimer and Andrew Vickers both received an honorarium for preparing and delivering presentations on acupuncture research at the 2007 meeting of the Society for Acupuncture Research. Adrian White is employed by the British Medical Acupuncture Society as journal editor and has received fees and travel reimbursements for lecturing on acupuncture on several occasions. Benno Brinkhaus has received travel reimbursement and fees for presenting research findings at meetings of acupuncture societies (British, German and Spanish Medical Acupuncture Societies).