Exogenous melatonin has been increasingly used in the management of sleep disorders.
To conduct a systematic review of the efficacy and safety of exogenous melatonin in the management of primary sleep disorders.
A number of electronic databases were searched. We reviewed the bibliographies of included studies and relevant reviews and conducted hand-searching.
Randomized controlled trials (RCTs) were eligible for the efficacy review, and controlled trials were eligible for the safety review.
One reviewer extracted data and a second reviewer verified the extracted data. A random effects model was used to analyze the data.
Melatonin decreased sleep onset latency (weighted mean difference [WMD]: −11.7 minutes; 95% confidence interval [CI]: −18.2, −5.2); the decrease was greater in people with delayed sleep phase syndrome (WMD: −38.8 minutes; 95% CI: −50.3, −27.3; n=2) than in people with insomnia (WMD: −7.2 minutes; 95% CI: −12.0, −2.4; n=12). The former result appears to be clinically important. There was no evidence of adverse effects of melatonin.
There is evidence to suggest that melatonin is not effective in treating most primary sleep disorders with short-term use (4 weeks or less); however, additional large-scale RCTs are needed before firm conclusions can be drawn. There is some evidence to suggest that melatonin is effective in treating delayed sleep phase syndrome with short-term use. There is evidence to suggest that melatonin is safe with short-term use (3 months or less).
Sleep disorders affect approximately 20% of the American population.1 A sleep disorder exists whenever a lower quality of sleep leads to impaired functioning or excessive sleepiness.2 Although sleep disorders may be accompanied by other medical and/or psychiatric conditions, in many cases, sleep disorders exist in the absence of these other conditions, and are considered to be primary sleep disorders.3 The most common sleep disorder is insomnia,4 which is classified as a dyssomnia by the International Classification of Sleep Disorders (ICSD).5 Delayed sleep phase syndrome (DSPS) is another sleep disorder that is classified as a dyssomnia by the ICSD. Individuals with DSPS complain of difficulty falling asleep and difficulty waking up at desired bedtimes and wake times, respectively. This disorder is thought to result from an out-of-phase endogenous circadian pacemaker that is displaced to a later-than-normal phase.6,7 Although DSPS may present as insomnia, it is a distinct disorder.
Current management of sleep disorders depends on the type and etiology of the disorder.8 The first line of treatment for sleep disorders is the improvement of sleep hygiene, which may consist of such strategies as strict adherence to a consistent routine 7 days per week, a quiet and comfortable sleep environment, wind-down time before bed, stimulus control, avoidance of alcohol and caffeine, and properly timed exercise.8 Similarly, the treatment of sleep disorders may include behavioral therapy, such as biofeedback and sleep restriction,8 chronotherapy,9 and light therapy,10 which are used in the treatment of circadian rhythm disorders; and pharmacotherapy with sedatives and/or hypnotics.11
Endogenous melatonin exists as a hormone; it is secreted by the pineal gland and is linked to the circadian rhythm.11 Studies of melatonin in the 1970s and 1980s revealed sedative/hypnotic effects of the compound,12–15 which have led to its use as a treatment for sleep disorders.
We conducted a systematic review of the efficacy and safety of melatonin in the management of primary sleep disorders. Our findings can help to guide clinicians and patients in treatment decisions regarding the use of exogenous melatonin in the management of these disorders.
A health sciences librarian conducted a comprehensive search to identify relevant English-language studies. We searched 13 electronic databases (Table 1) using the terms listed in the Appendix (available online). We searched for both published and unpublished literature. The reference lists of relevant reviews and included studies were reviewed to identify other potentially relevant studies. We hand-searched Associated Professional Sleep Society Abstracts covering 1999 to 2003. Finally, we searched MEDLINE and EMBASE again in early 2004 to identify recently published studies.
All titles and abstracts identified through the above search strategies were screened independently by 2 reviewers for potential relevance. The full text of all articles deemed potentially relevant was retrieved. Two reviewers independently assessed the manuscripts for inclusion using predetermined criteria. To assess the efficacy of exogenous melatonin, we included randomized controlled trials (RCTs) that (1) involved human participants who suffered from a primary sleep disorder, (2) compared melatonin with placebo, and (3) reported on either sleep onset latency (amount of time between lying down to sleep and the onset of sleep); sleep efficiency (amount of time spent asleep as a percentage of the total time spent in bed); sleep quality (the perceived quality of sleep); wakefulness after sleep onset (amount of time spent awake in bed following the first attainment of sleep); total sleep time (total time spent asleep while in bed); or percent time in REM sleep (percent time spent dreaming). A study population was considered to have a primary sleep disorder unless the participants, as a group, were defined by a specific chronic medical and/or psychiatric disorder, and this disorder was likely to be the cause of the sleep disorder (e.g., depression). To assess the safety of exogenous melatonin, we included both RCTs and non-RCTs meeting criteria 1 and 2 above and reporting on adverse events and/or adverse effects. Disagreements regarding inclusion were resolved by discussion.
For the efficacy review, RCTs were assessed for methodological quality using the validated Jadad scale.16 This scale assesses randomization, blinding, and reporting of dropouts and withdrawals, and provides an overall score of 0 to 5, with 5 indicating highest quality. In addition, we assessed concealment of allocation as “adequate,” “inadequate,” or “unclear.”17 For the safety review, which relied on evidence from both RCTs and non-RCTs, the Downs and Black 18 Checklist was used. This checklist is partially validated and measures quality in terms of reporting, external validity, internal validity (bias and confounding), and power. The checklist was slightly modified such that a maximum of 2 points could be allotted to item 27 rather than 5 points. Thus, the maximum quality index was 29, rather than 32. Two reviewers assessed study quality independently, and resolved disagreements by discussion.
Data were extracted using a standardized data extraction form that captured details of study design; inclusion/exclusion criteria; population characteristics (e.g., gender, age, ethnicity and type of sleep disorder); the number of individuals eligible and enrolled; the comparison groups and participants allocated to each group; number of withdrawals; details of the intervention including the formulation, dosage, timing, frequency and duration of melatonin administration; type and frequency of concurrent medication use; and results. Additional information that was extracted included: authors and year of publication; country where the study took place; source of funding; authors' objectives and conclusions; and whether an intention-to-treat analysis was planned and/or performed. A trained reviewer extracted relevant data and a second reviewer checked data extraction for accuracy and completeness. Disagreements were resolved by discussion.
A priori, we listed our outcomes in order of importance with sleep onset latency as most important (primary) followed by sleep efficiency, sleep quality, wakefulness after sleep onset, total sleep time, and percent time in REM sleep. Continuous outcomes (e.g., sleep onset latency) were combined using a weighted mean difference (WMD) with the exception of sleep quality, where studies were combined using a standardized mean difference (SMD). The inverse variance method 19 was used to weight the studies. All meta-analyses were performed using a random effects model. A point estimate with corresponding 95% confidence interval (CI) was computed for each outcome using the generic inverse variance function in RevMan 4.2.5 (Update Software 2004).
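The pooling step described above can be sketched in a few lines. This is a minimal illustration, not the RevMan implementation: it assumes the DerSimonian-Laird estimator for the between-study variance (the text says only "random effects model"), and the function name `pool_random_effects` is hypothetical.

```python
import math

def pool_random_effects(effects, std_errors, z=1.959964):
    """Pool per-study effects (e.g., mean differences in minutes).

    Uses inverse-variance weights and, as an assumption, the
    DerSimonian-Laird estimate of between-study variance (tau^2).
    Returns (pooled_effect, (ci_lower, ci_upper), tau2).
    """
    k = len(effects)
    # Fixed-effect (inverse-variance) weights and pooled estimate
    w = [1.0 / se ** 2 for se in std_errors]
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    # Cochran's Q measures dispersion of studies around the fixed estimate
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each study's variance
    w_re = [1.0 / (se ** 2 + tau2) for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se_pooled = math.sqrt(1.0 / sum(w_re))
    return pooled, (pooled - z * se_pooled, pooled + z * se_pooled), tau2
```

When the studies agree closely, tau² is zero and the result reduces to the ordinary inverse-variance (fixed-effect) estimate; as studies disagree more, the weights become more equal and the confidence interval widens.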
Dichotomous outcomes (i.e., safety outcomes) were combined using a risk difference (RD) with corresponding 95% CIs. Many studies stated that there were no reported adverse events. These studies were included in the analysis, but a sensitivity analysis excluding them was also performed, as the lack of reporting on adverse events does not necessarily indicate that they did not occur in the study.
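As a sketch of how a risk difference and its 95% CI can be computed for one study, the snippet below uses the usual large-sample standard error. The function name and the example counts in the usage note are hypothetical, and the handling of studies with no events is an assumption rather than the authors' procedure.

```python
import math

def risk_difference(events_trt, n_trt, events_ctl, n_ctl, z=1.959964):
    """Risk difference (treatment minus control) with a Wald-type 95% CI.

    Returns (rd, se, (ci_lower, ci_upper)).
    Note: if both arms have zero events, se is 0 under this simple
    formula, so such studies need separate handling before pooling.
    """
    p_trt = events_trt / n_trt
    p_ctl = events_ctl / n_ctl
    rd = p_trt - p_ctl
    se = math.sqrt(p_trt * (1 - p_trt) / n_trt + p_ctl * (1 - p_ctl) / n_ctl)
    return rd, se, (rd - z * se, rd + z * se)
```

For example, `risk_difference(5, 50, 3, 50)` describes a hypothetical trial with 5 of 50 events on melatonin and 3 of 50 on placebo; the resulting interval spans zero, which corresponds to "no significant difference".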
In most cases, we were able to calculate the efficacy estimates for each study exactly (i.e., WMD, SMD, RD), but occasionally, estimates had to be made by extracting from graphs or using medians. Standard errors of the differences were calculated exactly from available data (i.e., individual patient data or exact P-values) whenever possible. For studies with a parallel design, this calculation was usually accomplished with the standard formula for the variance of a difference of independent variables: var(A−B) = var(A) + var(B), where A was the melatonin effect and B was the placebo effect. For studies with a crossover design, the standard error was estimated using the formula for the variance of a difference of dependent variables: var(A−B) = var(A) + var(B) − 2ρ√(var(A)·var(B)), where A was the melatonin effect, B was the placebo effect and ρ was the correlation estimate of 0.5. In cases where this calculation could not be performed, standard errors were estimated using conservative P-values (i.e., P=.05 if P<.05), interquartile ranges (IQRs; i.e., SD=IQR/1.35),20 and extracting from graphs. As a last resort, an average of standard deviations of all other studies providing this information was used to impute standard deviations of a study.
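The variance rules in this paragraph can be written out as a short sketch. For the crossover case the usual covariance term, 2ρ·sd(A)·sd(B), is subtracted, with ρ = 0.5 as stated in the text. The function names are hypothetical, and the conversion from a P-value uses a two-sided normal approximation, which is an assumption (the authors may have used a t distribution).

```python
import math
from statistics import NormalDist

def var_diff_parallel(var_a, var_b):
    """Variance of A - B when A and B come from independent groups."""
    return var_a + var_b

def var_diff_crossover(var_a, var_b, rho=0.5):
    """Variance of A - B when A and B are measured on the same
    participants, with assumed correlation rho between them."""
    return var_a + var_b - 2 * rho * math.sqrt(var_a * var_b)

def sd_from_iqr(iqr):
    """Approximate SD from an interquartile range (SD = IQR / 1.35)."""
    return iqr / 1.35

def se_from_p(difference, p=0.05):
    """Back-calculate a standard error from a two-sided P-value,
    assuming a normal test statistic. Passing p=0.05 when only
    'P < .05' is reported gives a conservative (larger) SE."""
    z = NormalDist().inv_cdf(1 - p / 2)
    return abs(difference) / z
```

With equal variances in both periods and ρ = 0.5, the crossover variance is half the parallel-design variance, which is why the choice of ρ matters for the weight each crossover trial receives.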
For studies with a parallel design, change from baseline data was used if available; otherwise final data were used. For studies with a crossover design, final data were always used.
When continuous data were presented for multiple conditions (e.g., different doses of melatonin), which we wished to combine, a new mean and standard deviation were computed. If the study had a parallel design, the new mean and standard deviation could be computed exactly, while for studies with a crossover design, the standard deviation was calculated using a correlation estimate of 0.5.
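The text does not give the formulas used to combine conditions, so the following is a sketch under stated assumptions. For parallel groups it uses the standard exact formula for merging two independent groups. For crossover data it assumes that "combining" means averaging the two conditions within each participant, with ρ = 0.5 between them; this is one possible reading, not necessarily the authors' method. Both function names are hypothetical.

```python
import math

def combine_parallel(n1, m1, sd1, n2, m2, sd2):
    """Merge two independent groups into one; exact given n, mean, SD.

    Returns (n, mean, sd) of the combined group.
    """
    n = n1 + n2
    mean = (n1 * m1 + n2 * m2) / n
    # Within-group sums of squares plus the between-group component
    ss = (n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2 \
        + (n1 * n2 / n) * (m1 - m2) ** 2
    return n, mean, math.sqrt(ss / (n - 1))

def combine_crossover(m1, sd1, m2, sd2, rho=0.5):
    """Average two conditions measured on the same participants
    (assumed interpretation), with assumed correlation rho.

    Returns (mean, sd) of the per-participant average.
    """
    mean = (m1 + m2) / 2
    var = (sd1 ** 2 + sd2 ** 2 + 2 * rho * sd1 * sd2) / 4
    return mean, math.sqrt(var)
```

The parallel formula reproduces exactly what would be obtained by pooling the raw observations, which is why no correlation estimate is needed in that case.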
All estimates of efficacy (WMDs, SMDs, and RDs) were assessed for heterogeneity using the I² statistic.21 Based on this statistic, heterogeneity for each outcome was classified as negligible (I²=0%), minimal (I²<20%), moderate (20%<I²<50%), or substantial (I²>50%). For our primary outcome, we planned to explore heterogeneity in subgroup analyses using the following variables: age, dosage, timing of administration, study duration, primary diagnosis, study design, quality score, and allocation concealment. Deeks' χ² statistic 22 was used to test for significant heterogeneity reduction in partitioned subgroups.
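A minimal sketch of the heterogeneity statistic and the classification above follows. It computes I² from Cochran's Q in the usual way, as 100 × (Q − df)/Q truncated at zero. The function names are hypothetical, and because the text leaves the boundary values of exactly 20% and 50% unassigned, placing both in "moderate" is an assumption.

```python
def i_squared(effects, std_errors):
    """I^2 (percent) from per-study effects and standard errors."""
    w = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    q = sum(wi * (yi - pooled) ** 2 for wi, yi in zip(w, effects))
    df = len(effects) - 1
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0

def classify_heterogeneity(i2):
    """Map an I^2 value (percent) to the categories used in the text."""
    if i2 == 0:
        return "negligible"
    if i2 < 20:
        return "minimal"
    if i2 <= 50:  # boundaries at exactly 20 and 50 are an assumption
        return "moderate"
    return "substantial"
```

For instance, `classify_heterogeneity(81.6)` returns `"substantial"`, matching how the primary outcome is described.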
Figure 1 describes the flow of studies through the study selection process. Fourteen RCTs were relevant to the efficacy review, encompassing 279 participants (Table 2, available online). The median quality score, based on the Jadad scale, was 4 (IQR, 3 to 4). Concealment of allocation was unclear in all studies except 3, which had adequate concealment.23–25 Seven studies described a funding source 25–31; in all but 1 study,27 funding was received from public sources. In 4 studies, there was a discrepancy in the number of participants enrolled and the number analyzed 25,30–32; it was either not specified or unclear whether an intention-to-treat analysis was planned or conducted in these studies. For all studies, the duration of melatonin administration was 4 weeks or less.
For 11 of the 14 studies, the exclusion criteria were designed to minimize and/or eliminate comorbid medical and psychiatric conditions in the study population. The reports of many of these studies did not provide a description of comorbid conditions in the study population. For 1 study, the inclusion criteria required the absence of psychological, psychiatric, or medical factors sufficient to explain the sleep disorder symptoms; however, comorbid conditions were present in the population, although their severity was unclear.33 In 3 studies using exclusion criteria designed to minimize and/or eliminate comorbid conditions, 1 author reported the presence of a depressive episode in 2 participants in the preceding year,34 and another author of studies involving children reported the presence of asthma and attention deficit hyperactivity disorder in a minority of the children.25,31 Three studies did not report on exclusion criteria designed to minimize and/or eliminate the presence of comorbid conditions in the study population, and described the distribution of a variety of medical and psychiatric disorders affecting the population.23,24,27 The latter studies were included in the analysis, as the populations were not defined by a specific chronic medical and/or psychiatric disorder that was likely to be the cause of the sleep disorders suffered by these populations.
The combined WMD of the 14 trials showed that those in the melatonin group had shorter sleep onset latency than those in the placebo group (WMD: −11.7 minutes; 95% CI: −18.2, −5.2) (Table 3). Although the result was statistically significant, the effect appears to be clinically unimportant. There was substantial heterogeneity among the studies (I²=81.6%): 12 of the 14 studies showed a difference that favored melatonin (Figure 2).
We conducted subgroup analyses for age, dosage, duration, and primary diagnosis (Table 4). Three of these subgroupings significantly reduced heterogeneity despite retaining a substantial heterogeneity statistic in at least 1 subgroup. It should be noted that these categorizations included a subgroup with only 2 or fewer studies; therefore, the results are not surprising. The 1 subgrouping that is noteworthy is that of primary diagnosis, which substantially reduced heterogeneity and is the only subgrouping that gave results with nonoverlapping confidence intervals: the effect of melatonin was much more pronounced for those with delayed sleep phase syndrome (WMD −38.8 minutes; 95% CI −50.3, −27.3; n=2) than for those with insomnia (WMD −7.2 minutes; 95% CI −12.0, −2.4; n=12). This effect appears to be clinically important. This variable appears to explain much of the heterogeneity in the primary analysis. A subgroup analysis could not be conducted for timing (all participants received melatonin just before bedtime).
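The nonoverlap of the two subgroup intervals can be checked directly from the reported numbers. The sketch below also back-calculates standard errors from the 95% CIs and forms a rough z statistic for the subgroup difference. That z-test is an illustration only and is not the Deeks' test used in the review; it assumes symmetric normal-based intervals and independent subgroups. All function names are hypothetical.

```python
import math

Z_95 = 1.959964

def se_from_ci(lower, upper, z=Z_95):
    """Standard error implied by a symmetric normal-based 95% CI."""
    return (upper - lower) / (2 * z)

def intervals_overlap(ci_a, ci_b):
    """True if two (lower, upper) intervals share any point."""
    return ci_a[0] <= ci_b[1] and ci_b[0] <= ci_a[1]

def z_for_difference(est_a, ci_a, est_b, ci_b):
    """Rough z statistic for the difference between two independent
    estimates, using standard errors recovered from their CIs."""
    se_a = se_from_ci(*ci_a)
    se_b = se_from_ci(*ci_b)
    return (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)

# Values reported in the text (minutes of sleep onset latency)
dsps_est, dsps_ci = -38.8, (-50.3, -27.3)
insomnia_est, insomnia_ci = -7.2, (-12.0, -2.4)

print(intervals_overlap(dsps_ci, insomnia_ci))
print(z_for_difference(dsps_est, dsps_ci, insomnia_est, insomnia_ci))
```

Nonoverlapping 95% intervals are a stricter condition than a significant difference, so the z statistic here is well beyond the conventional threshold of 1.96 in absolute value.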
We conducted sensitivity analyses based on study design, Jadad score and allocation concealment (Table 4). The subgroupings for study design and allocation concealment significantly reduced heterogeneity; heterogeneity was negligible in the subgroupings of trials of a parallel design (3 studies) and adequate allocation concealment (3 studies).
The pooled estimate from 10 studies that reported sleep efficiency favored melatonin. Although the result was not statistically significant, it was close to significance (WMD 2.5%; 95% CI: −0.2, 5.2) (Table 3). Heterogeneity among the studies was substantial (I²=80.7%): 7 of the 10 studies showed a point estimate that favored melatonin (Figure 3).
Subgroup analyses were conducted for age, dosage, duration, and primary diagnosis (data not shown). The only subgroup that significantly reduced heterogeneity was age; sleep efficiency was greater in the elderly population (66 years and older; WMD 5.3; 95% CI: 0.7, 9.8), compared to the adult population (19 to 65 years; WMD −0.0; 95% CI: −1.6, 1.5). A subgroup analysis could not be conducted for timing (all participants received melatonin just before bedtime).
We conducted sensitivity analyses based on Jadad score and allocation concealment. As all studies were crossover trials, we were not able to do a subgroup analysis for study design. The only analysis that was noteworthy was that of allocation concealment. The subgroup with the 2 studies 23,24 with adequate allocation concealment had negligible heterogeneity and gave results that were statistically significant in favor of melatonin. This finding may be due to chance, as allocation concealment has been associated with smaller, rather than larger, effect sizes.17 The remaining 8 studies showed no effect of melatonin on sleep efficiency.
We conducted meta-analyses for sleep quality (n=2), wakefulness after sleep onset (n=6), total sleep time (n=13), and percentage time spent in REM sleep (n=3). All results favored melatonin but were not statistically significant (Table 3).
Ten studies were relevant to the safety review, encompassing approximately 222 participants (Table 2, available online). Nine studies were RCTs and 1 study was a non-RCT. The median quality score, based on the Downs and Black Checklist, was 21.5 out of 29 (IQR, 12 to 25). The duration of melatonin administration was 3 months or less. There were few reports of adverse events accompanying melatonin administration. The most common adverse events reported were headaches (13 events), dizziness (10 events), nausea (3 events), and drowsiness (3 events). In all cases, there was no significant difference between melatonin and placebo (Table 3).
This review is focused on populations with primary sleep disorders and does not include a review of populations of people with sleep disorders secondary to other disorders (e.g., depression), jet lag or shift-work disorder. The review is derived from a more comprehensive Evidence Report.35 To our knowledge, there are no other meta-analyses of RCTs of melatonin for primary sleep disorders. Despite its popularity, evidence suggests that melatonin may be of limited clinical use. Our primary analysis showed an average reduction in sleep onset latency of 11.7 minutes. This finding does not support the use of melatonin for the management of primary sleep disorders. However, in our secondary analysis of a subpopulation with delayed sleep phase syndrome, the average reduction in sleep onset latency increased to 38.8 minutes, which is both clinically and statistically significant. This result is based on only 2 studies involving fewer than 30 participants, necessitating further research to confirm the results.
Despite concerns about the quality of studies of complementary and alternative medicine (CAM) therapies,36,37 the 16 studies included in this review were of moderate to high quality. Moreover, in light of the potential relationship between study results and funding sources, it is somewhat reassuring that the majority of studies that reported funding source used public, rather than industry, funding.38,39 However, it should be noted that half of the studies reviewed did not report funding source.
The basic mechanism by which melatonin induces sleepiness in humans is unclear, although 3 main hypotheses have been proposed. The mechanism may involve a phase-shift of the endogenous circadian pacemaker, a reduction in core body temperature, and/or a direct action on somnogenic structures of the brain. Timing of melatonin administration has been shown to predict its effect on circadian rhythm, such that melatonin delays circadian rhythm following morning administration, and advances circadian rhythm following afternoon or evening administration.40
Our literature review indicated that melatonin reduced sleep onset latency to a greater extent in people with delayed sleep phase syndrome than in people with insomnia. This finding suggests that the effects of melatonin are mediated by a direct resetting of the endogenous circadian pacemaker rather than via a direct action on somnogenic structures of the brain, given that individuals with delayed sleep phase syndrome are distinguished from individuals with insomnia by the presence of a circadian abnormality.
It is noteworthy that the trials that examined the efficacy of melatonin in people with primary sleep disorders are of relatively short trial duration (4 weeks or less), as were the trials examining the safety of melatonin in this population (3 months or less). Therefore, the efficacy and safety of melatonin reported here may reflect only the short-term effects of melatonin on this population. It is necessary that trials of longer duration be conducted in order to determine the long-term effects of melatonin on this population.
There was considerable heterogeneity in study results, and many factors may have contributed to this finding. Like other natural health products, melatonin is subject to variation in product quality. Details of content and quality of melatonin formulations and verification of doses were not adequately described in the published reports. The formulations of melatonin varied from slow to fast release. A wide range of doses was used and the doses were administered for days to up to 4 weeks. The presence, distribution and severity of comorbid conditions in the study population may also contribute to heterogeneity. Although many of the studies reported inclusion/exclusion criteria designed to minimize comorbid conditions in the study population, details of comorbid conditions present in the population were often not clearly described. Heterogeneity may also be explained by the age of the population, duration of melatonin administration or primary diagnosis. Indeed, our analysis suggests that the diagnosis of insomnia versus delayed sleep phase syndrome explains much of the heterogeneity in study results.
Melatonin appears to be safe in the population, doses and timeframe studied. The most commonly reported adverse effects of melatonin were nausea, headache, dizziness, and drowsiness. One must be cautious when interpreting adverse event data: not only are adverse events underreported in clinical trials generally,41–43 but there is also reason to suspect that adverse reactions related to natural health products may be systematically underreported when compared with conventional medications.44 Consumers may equate “natural” with “safe” and therefore not recognize or attribute an adverse reaction to a natural health product. The safety of exogenous melatonin when used in the long term, over months and years, remains unclear.
A limitation of our review is the exclusion of non-English language reports. There is evidence of “reverse” publication bias, in that negative CAM studies are overrepresented in English-language journals and positive CAM studies are overrepresented in non-English language journals.45 We searched a number of electronic databases for non-English language literature on melatonin and sleep disorders. Based on a screen of titles and abstracts resulting from the search, we did not identify any studies that were RCTs of the use of melatonin in people with primary sleep disorders. Thus, it is unlikely that we omitted potentially relevant non-English language reports from the review.
Future studies on the effect of melatonin in the management of primary sleep disorders should clearly define the formulation and pharmacology of the melatonin product under investigation. It remains unclear how the efficacy of melatonin varies by age, dosage, timing, treatment duration, and primary diagnosis. Moreover, the long-term effects of melatonin on people with primary sleep disorders remain to be determined.
Our literature review indicates that there is no evidence that melatonin is effective in the management of most primary sleep disorders with short-term use; however, additional large-scale RCTs are needed before firm conclusions can be drawn. Although some evidence suggests that melatonin may reduce sleep onset latency for persons with delayed sleep phase syndrome, additional research is required to confirm these results. The evidence suggests that short-term use of melatonin is safe; more research is required to determine its long-term safety.
We thank The National Centre for Complementary and Alternative Medicine, National Institutes of Health for sponsoring this research, through the Agency for Healthcare Research and Quality. We are grateful to members of our technical expert panel for providing input on the direction and scope of the review. We are especially grateful to Dr. Manisha Witmans for her valuable feedback on the manuscript. We thank Ms. Michelle Tubman, Dr. Mia Lang, Ms. Maria Ospina, Mr. Victor Juorio, and Ms. Ellen Crumley for their contribution to the systematic review process.
Investigating authors acknowledge the following financial support: Dr. Sunita Vohra is supported by: Agency for Healthcare Research and Quality (USA); Canadian Institutes of Health Research; Change Foundation; Department of Pediatrics; National Health Products Directorate, Health Canada; Ontario Mental Health Foundation; Stollery Children's Hospital and Foundation; The Hospital for Sick Children Foundation; and the University of Alberta. Dr. Glen Baker is supported by: Agency for Healthcare Research and Quality (USA); Canadian Institutes of Health Research; Canada Research Chairs Program; Stanley Foundation; University of Alberta Hospital Foundation; Bebensee Schizophrenia Research Fund; Davey Endowment; and Zyprexa Research Foundation.
This article is based on research conducted by the University of Alberta Evidence-based Practice Center under contract to the Agency for Healthcare Research and Quality (Contract No. 290-02-0023), Rockville, Md, with support from the National Center for Complementary and Alternative Medicine, National Institutes of Health, Bethesda, Md.
Financial support: This study was conducted under contract to the Agency for Healthcare Research and Quality (Contract No. 290-02-0023), Rockville, Md, with support from the National Center for Complementary and Alternative Medicine, National Institutes of Health, Bethesda, Md.
Disclaimer: The authors of this article are responsible for its contents, including any clinical or treatment recommendations. No statement in this article should be construed as an official position of the Agency for Healthcare Research and Quality, the National Center for Complementary and Alternative Medicine or the U.S. Department of Health and Human Services.
Authors' contributions: N.B. planned, oversaw and participated in all steps of the systematic review process and in writing and editing the manuscript.
B.V. performed all statistical analyses and participated in writing and editing the manuscript.
N.H. participated in most steps of the systematic review process and in writing and editing the manuscript.
R.P. participated in all steps of the systematic review process and reviewed the manuscript.
L.T. conducted the literature search, provided technological expertise for the inclusion process, and participated in editing the manuscript.
L.H. participated in writing the proposal, provided methodological expertise, and participated in writing and editing the manuscript.
G.B. participated in writing the proposal, provided content expertise and participated in editing the manuscript.
T.K. participated in writing the proposal, provided methodological expertise and provided feedback on the manuscript.