This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Data comparing duloxetine with existing antidepressant treatments are limited. A comparison of duloxetine with fluoxetine has been performed, but no comparison with venlafaxine, the other antidepressant in the same therapeutic class with a significant market share, has been undertaken. In the absence of direct data with which to assess the place that duloxetine should occupy in the therapeutic arsenal, indirect comparisons are the most rigorous approach available.
We conducted a systematic review of the efficacy of duloxetine, fluoxetine and venlafaxine versus placebo in the treatment of Major Depressive Disorder (MDD), and performed indirect comparisons through meta-regressions.
The bibliography of the Agency for Health Care Policy and Research and the CENTRAL, Medline, and Embase databases were interrogated using advanced search strategies based on a combination of text and index terms. The search focused on randomized placebo-controlled clinical trials involving adult patients treated for acute phase Major Depressive Disorder. All outcomes were derived so as to account for placebo responses that vary across studies. The primary outcome was treatment efficacy as measured by Hedges' g effect size. Secondary outcomes were response and dropout rates as measured by log odds ratios. Meta-regressions were run to indirectly compare the drugs. Sensitivity analyses assessed the influence of individual studies and of patients' characteristics on the results.
22 studies involving fluoxetine, 9 involving duloxetine and 8 involving venlafaxine were selected. Using indirect comparison methodology, estimated effect sizes for efficacy compared with duloxetine were 0.11 [-0.14;0.36] for fluoxetine and 0.22 [0.06;0.38] for venlafaxine. Response log odds ratios were -0.21 [-0.44;0.03] for fluoxetine and 0.70 [0.26;1.14] for venlafaxine. Dropout log odds ratios were -0.02 [-0.33;0.29] for fluoxetine and 0.21 [-0.13;0.55] for venlafaxine. Sensitivity analyses showed that results were consistent.
Fluoxetine was not statistically different in either tolerability or efficacy when compared with duloxetine. Venlafaxine was significantly superior to duloxetine in all analyses except dropout rate. In the absence of relevant data from head-to-head comparison trials, results suggest that venlafaxine is superior to duloxetine and that duloxetine does not differ from fluoxetine.
Duloxetine is a selective serotonin and norepinephrine reuptake inhibitor (SNRI) reported to have greater affinity for the serotonin and norepinephrine transporters than venlafaxine [1,2]. The efficacy and safety of duloxetine in the treatment of major depressive disorder (MDD) in adults (18–65 years) have been evaluated in 9 phase II and III clinical trials [3-5]. All were randomized, double-blind, placebo-controlled studies with doses ranging from 40 to 120 mg/day in the acute treatment of MDD. Results have shown that duloxetine provided relief from psychological symptoms of depression compared with placebo. Six of the above studies used an active comparator: either fluoxetine or paroxetine. None, however, was designed and powered for direct head-to-head comparison between duloxetine and the active comparator. Inclusion of a selective serotonin reuptake inhibitor (SSRI) was intended only to show non-inferiority of duloxetine. No trial has used venlafaxine, the other marketed SNRI, as an active comparator.
The amount of data comparing duloxetine with existing antidepressant treatments is limited. The lack of direct comparisons between the recommended daily dose (60 mg) and an active comparator was criticised in a recent evaluation of duloxetine by the Committee for Medicinal Products for Human Use (CHMP). Assessments of the benefit/risk ratio of a new drug compared with a standard drug at an adequate dose are generally required, and it is recommended that clinical trials be conducted not only against placebo but also against active comparators. The aim of such studies may be to show superiority over the active comparator or to demonstrate that at least a similar balance between benefit and risk exists when the drug of interest is compared with another acknowledged standard antidepressant.
In the absence of head-to-head randomized studies, indirect comparisons can be made between molecules. Clinical trials frequently compare efficacy of a drug versus placebo in the treatment of MDD. Less frequent, however, are head-to-head comparisons. Indirect comparisons taking into account all available placebo-controlled studies are capable of obtaining an effect size and a confidence interval of the difference between two compounds. The algorithm used gives results adjusted for discrepancies in sociodemographics, settings and designs.
After conducting a systematic review of the efficacy of duloxetine, fluoxetine and venlafaxine versus placebo in the treatment of MDD we performed an indirect comparison of the benefits of duloxetine versus fluoxetine and venlafaxine. We used meta-regression analysis to test whether or not differences in effectiveness (which cannot be explained by the differences in settings only) exist between fluoxetine and duloxetine on one hand and venlafaxine and duloxetine on the other.
We used advanced search strategies based on a combination of text and index terms to interrogate the CENTRAL, Medline and Embase databases as well as the bibliography of the US Agency for Health Care Policy and Research (AHCPR). The bibliography from the AHCPR is an exhaustive literature search (both published and non-published) of trials in depression up to 1999.
Selection criteria were: randomised trials with a placebo arm reporting HAMD results, involving adult patients suffering from MDD (as assessed by DSM-III, DSM-III-R or DSM-IV) and treated in the acute phase with fluoxetine, venlafaxine or duloxetine. Exclusion criteria were: presence of comorbidities; absence of the HAMD scale; inclusion of adolescents, children or elderly patients; absence of randomisation; and absence of a placebo arm.
These criteria were considered sufficient to retrieve all studies of interest to be included in the analysis set.
Two research assistants independently selected papers by reading the abstract and, if necessary, the entire article to assess eligibility and to extract data. Differences between the two resulting analysis sets were resolved by careful re-reading of the papers, and letters were sent to corresponding authors in an attempt to reduce missing data.
Publication bias was assessed by drawing funnel plots, and the Egger test was used to test funnel plot asymmetry.
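As an illustration of the Egger test mentioned above, the following is a minimal sketch, not the authors' implementation. It assumes the usual formulation in which the standardized effect (effect divided by its standard error) is regressed on precision (one over the standard error) and the intercept is tested against zero. The function name `egger_regression` and its inputs are hypothetical; the p-value would be obtained from a t distribution with the returned degrees of freedom using a statistics library.

```python
import math


def egger_regression(effects, std_errors):
    """Regress standardized effects on precision.

    Returns (intercept, t_statistic, degrees_of_freedom). An intercept far
    from zero suggests funnel plot asymmetry.
    """
    if len(effects) != len(std_errors) or len(effects) < 3:
        raise ValueError("need at least three studies with matching standard errors")

    precision = [1.0 / se for se in std_errors]                    # x values
    standardized = [e / se for e, se in zip(effects, std_errors)]  # y values
    n = len(precision)

    mean_x = sum(precision) / n
    mean_y = sum(standardized) / n
    sxx = sum((x - mean_x) ** 2 for x in precision)
    if sxx == 0:
        raise ValueError("all standard errors are equal; the regression is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(precision, standardized))

    # Ordinary least squares fit of y = intercept + slope * x.
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Standard error of the intercept from the residual variance (n - 2 df).
    residual_ss = sum(
        (y - (intercept + slope * x)) ** 2 for x, y in zip(precision, standardized)
    )
    residual_var = residual_ss / (n - 2)
    se_intercept = math.sqrt(residual_var * (1.0 / n + mean_x ** 2 / sxx))
    t_statistic = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, t_statistic, n - 2
```

With only a handful of studies per drug, as here, such a test has low power, which is consistent with the cautious reading of the funnel plots given in the results.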
Because different trials do not necessarily use the same scale and/or version for assessing efficacy, an effect size was derived from the primary outcome of each study (HAM-D 17, 21 or 24). This enabled deriving a common effect measure across studies that used different scales. The effect size was Hedges' g (a Standardised Response Mean estimator), which was corrected for small sample size bias. To compute an effect size, both the mean and an estimate of dispersion (variance, standard deviation) have to be present. When the dispersion was missing, it was imputed using the sample size weighted method. If both mean and dispersion were missing, the study was removed from the analysis set.
The computed effect sizes were adjusted for severity at baseline to account for differences in patients' groups (selection bias).
The effect size was defined as the difference between the mean change in depression scale score from baseline to end-of-study in the active arm and the mean change in depression scale score from baseline to end of study in the placebo arm; divided by the standard deviation of the difference.
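The definition above can be sketched in code as follows. This is a minimal illustration under stated assumptions, not the authors' computation: it uses the pooled standard deviation of the two arms as the denominator, applies the common small-sample correction factor J = 1 - 3/(4·df - 1), and does not reproduce the adjustment for baseline severity. The function name and arguments are hypothetical.

```python
import math


def hedges_g(mean_change_active, sd_active, n_active,
             mean_change_placebo, sd_placebo, n_placebo):
    """Return (g, variance_of_g) for active versus placebo mean change scores."""
    df = n_active + n_placebo - 2
    # Pooled standard deviation of the two arms (an assumption of this sketch).
    pooled_sd = math.sqrt(
        ((n_active - 1) * sd_active ** 2 + (n_placebo - 1) * sd_placebo ** 2) / df
    )
    d = (mean_change_active - mean_change_placebo) / pooled_sd

    # Small-sample bias correction factor.
    j = 1.0 - 3.0 / (4.0 * df - 1.0)
    g = j * d

    # Large-sample approximation to the variance of d, scaled by j squared.
    n_total = n_active + n_placebo
    var_d = n_total / (n_active * n_placebo) + d ** 2 / (2.0 * n_total)
    return g, j ** 2 * var_d
```

For example, `hedges_g(-10.0, 8.0, 50, -7.0, 8.0, 50)` describes a hypothetical trial in which the active arm improves by 3 more HAM-D points than placebo; the negative sign of the result reflects a larger score reduction in the active arm, matching the negative effect sizes versus placebo reported in the results.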
Other endpoints were response and dropout rates. Response was defined as a reduction of at least 50% in the HAM-D score from baseline. Dropouts were considered regardless of cause, which gave a rough indicator of the tolerability and safety and efficacy of the treatment. In other words, dropouts were an indicator of failures of the present therapy.
The response and dropout rates were analysed using log odds ratios. A log odds ratio equal to zero indicated no difference between the two compared groups. Considering the response rate, a value greater than zero indicated that more patients in the treatment group were classified as responders, and therefore that the treatment was better than the reference (placebo or duloxetine). A value lower than zero indicated that the reference (placebo or duloxetine) was better. Regarding dropouts, a value greater than zero indicated that more patients in the reference group (placebo or duloxetine) withdrew, and therefore that the treatment was better (in terms of efficacy and/or safety) than the reference (placebo or duloxetine). A value lower than zero indicated that the treatment was less effective or less well tolerated than the reference (placebo or duloxetine).
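A log odds ratio and its standard error can be computed from the counts of a two-arm trial as sketched below. This is an illustrative sketch, not the authors' code; the function name is hypothetical, and the 0.5 continuity correction for empty cells is an assumption the paper does not mention. To match the sign convention described above for dropouts (positive values favour the treatment), one would pass the number of completers rather than the number of dropouts as the event count.

```python
import math


def log_odds_ratio(events_treatment, n_treatment, events_reference, n_reference):
    """Return (log_odds_ratio, standard_error) for treatment versus reference."""
    a = events_treatment
    b = n_treatment - events_treatment
    c = events_reference
    d = n_reference - events_reference
    if min(a, b, c, d) < 0:
        raise ValueError("event counts must lie between 0 and the arm size")

    # Continuity correction when a cell is empty (assumption of this sketch).
    if 0 in (a, b, c, d):
        a, b, c, d = a + 0.5, b + 0.5, c + 0.5, d + 0.5

    log_or = math.log((a * d) / (b * c))
    std_error = math.sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    return log_or, std_error
```

For instance, `log_odds_ratio(60, 100, 40, 100)` corresponds to a hypothetical trial with 60% responders on treatment and 40% on the reference, giving an odds ratio of 2.25 on the natural scale.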
Random-effects meta-analyses were computed for each outcome and each treatment compared with placebo. Mean age, mean percentage of males, mean study duration and range of dosage were computed for each treatment.
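A random-effects pooled estimate of the kind described above can be sketched as follows. The paper does not state which between-study variance estimator was used for these per-drug analyses, so the DerSimonian-Laird moment estimator below is an assumption chosen for its simplicity; the function name is hypothetical.

```python
import math


def random_effects_pool(effects, variances):
    """Return (pooled_effect, standard_error, tau_squared).

    effects: per-study effect estimates (e.g. Hedges' g or log odds ratios).
    variances: the corresponding within-study variances.
    """
    if len(effects) != len(variances) or len(effects) < 2:
        raise ValueError("need at least two studies with matching variances")

    # Fixed-effect (inverse-variance) estimate, needed for Cochran's Q.
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))

    # DerSimonian-Laird estimate of the between-study variance, floored at 0.
    k = len(effects)
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau_squared = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Re-weight each study by the sum of within- and between-study variance.
    re_weights = [1.0 / (v + tau_squared) for v in variances]
    sum_re = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum_re
    return pooled, math.sqrt(1.0 / sum_re), tau_squared
```

When the studies agree closely, the estimated between-study variance is zero and the result coincides with the fixed-effect estimate; heterogeneous studies widen the standard error.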
Following recommendations by Glenny et al. and van Houwelingen et al., a mixed procedure was run. This enabled handling studies with more than two arms (typically when different dosages are included in the same study), as well as studies presenting two drugs in the same trial (two trials assessed the effectiveness of duloxetine versus placebo and were fluoxetine controlled). The method used is a weighted least squares algorithm which iteratively computes a between-study variance while keeping each within-study variance constant. Therefore, what is modelled by default (when no adjustment is made) is a drug effect (the antidepressant effect common to the drugs) and a drug-specific effect. The drug-specific effect is the effect tested between the two treatments compared.
The models were computed with SAS PROC MIXED. According to van Houwelingen et al., this procedure also gives good coverage for confidence intervals. As in van Houwelingen et al., Wald confidence intervals were used.
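The idea of estimating a drug-specific effect from placebo-controlled studies can be illustrated with a deliberately simplified sketch. It is not the mixed model used in the paper: it assumes two-arm studies only, no covariate adjustment, and a between-study variance `tau_squared` supplied by the caller rather than estimated iteratively. Under those assumptions, a weighted least squares meta-regression with a single drug indicator reduces to the difference between two inverse-variance weighted means. All names are hypothetical.

```python
import math


def indirect_difference(effects_a, variances_a, effects_b, variances_b,
                        tau_squared=0.0, z=1.96):
    """Return (difference, (lower, upper)) for drug A minus drug B.

    Each effect is a drug-versus-placebo estimate from one study; the sign
    of the difference follows whatever sign convention those effects use.
    """
    def weighted_mean(effects, variances):
        weights = [1.0 / (v + tau_squared) for v in variances]
        total = sum(weights)
        mean = sum(w * y for w, y in zip(weights, effects)) / total
        return mean, 1.0 / total  # estimate and its variance

    mean_a, var_a = weighted_mean(effects_a, variances_a)
    mean_b, var_b = weighted_mean(effects_b, variances_b)

    # The two sets of studies are independent, so the variances add.
    difference = mean_a - mean_b
    std_error = math.sqrt(var_a + var_b)
    # Wald confidence interval.
    return difference, (difference - z * std_error, difference + z * std_error)
```

Because the variances of the two placebo-controlled estimates add, an indirect comparison is always less precise than either of its components, which is one reason the confidence intervals reported for the between-drug comparisons are wide.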
Sensitivity analyses were planned a priori and included several adjustments. The variables chosen a priori as having a potential influence on the outcome of a study were age, percentage of males, duration of study and dosage. Robustness was then assessed by observing the variation in the estimate of the outcome, its corresponding confidence interval, and the size of the estimated residual between-study variance. An adjustment for whether the effect size had been imputed was also run (for cases in which the dispersion had to be imputed to compute an effect size). To assess the influence of each study on the results, studies were removed from the analysis set one at a time. A post hoc sensitivity analysis was run on a subgroup of fluoxetine studies, excluding the studies in which the number of patients was below 20.
The following rules were applicable for all computed models:
• In case an adjustment factor was missing, it was imputed by the corresponding weighted mean computed with available data.
• Influence of missing data was computed through sensitivity analyses by removing the studies where the data was missing.
• In the event that an outcome was missing and no reply was received from the letters sent, the study was removed from the analysis set for the particular analysis for which the outcome was missing.
No precise answers were received from the letters sent to corresponding authors; therefore, the number of missing data remained unchanged.
For duloxetine, 8 publications showing results for 9 trials (each with varying characteristics) were selected (Figure 1). Table 1 matches the publications with the information available from each trial. Mean age varied from 41 to 45 and the percentage of males varied from 25 to 40%. Duration of treatment varied from 8 to 9 weeks and dosages (fixed or variable) were from 40 to 120 mg per day. The effect size comparing duloxetine to placebo was -0.29 (0.15). The response and dropout log odds ratios were 0.58 (0.18) and -0.02 (0.32), respectively. The funnel plot shape cannot rule out the possibility of a publication bias (Figure 4). The funnel plot was not statistically significantly asymmetrical according to the Egger test (p = 0.9).
For fluoxetine, 22 papers were selected (Figure 2), presenting a rather heterogeneous picture (Table 2). Mean age varied from 33 to 47 and the percentage of males varied from 26 to 57%. Duration of treatment varied from 5 to 12 weeks and dosages (fixed or variable) were from 20 to 80 mg per day. It is worth noting that some studies included few patients (sample sizes ranged from 5 to 169). The effect size comparing fluoxetine to placebo was -0.46 (0.52). The response and dropout log odds ratios were 0.37 (0.32) and -0.02 (0.23), respectively. Publication bias appears to be minimal (Figure 4): the funnel plot shows the typical conic shape centred on the estimated value, which indicates little or no bias. The funnel plot was not statistically significantly asymmetrical according to the Egger test (p = 0.4).
For venlafaxine, 8 papers were selected (Figure 3), with the characteristics shown in Table 3. Mean age varied from 40 to 46 and the percentage of males varied from 31 to 60%. Duration of treatment varied from 6 to 12 weeks and the dosages (fixed or variable) were from 75 to 225 mg per day. The effect size comparing venlafaxine to placebo was -0.51 (0.20). The response and dropout log odds ratios were 1.28 (0.64) and -0.25 (0.32), respectively. The funnel plot shape cannot rule out the possibility of a publication bias (Figure 4). The funnel plot was not statistically significantly asymmetrical according to the Egger test (p = 0.1).
For duloxetine compared with fluoxetine, the estimated effect size was 0.11 [-0.14;0.36] for the treatment effect (Figure 5a). The estimated response log odds ratio was -0.21 [-0.44;0.03] (Figure 5b), with a corresponding odds ratio of 0.81 (only 13 fluoxetine studies and 6 duloxetine studies were included because of missing data). The estimated dropout log odds ratio was -0.02 [-0.33;0.29] (Figure 5c) (only 8 fluoxetine studies and 5 duloxetine studies were included because of missing data). None of these results vs. fluoxetine was significant, although a trend can be seen in favour of duloxetine in terms of the number of responders.
For duloxetine compared with venlafaxine, the estimated effect size was 0.22 [0.06;0.38] for the treatment effect (Figure 5a), demonstrating significantly better efficacy of venlafaxine compared with duloxetine. The estimated response log odds ratio was 0.70 [0.26;1.14], also significantly in favour of venlafaxine (Figure 5b) (only 6 venlafaxine studies and 6 duloxetine studies were included because of missing data). The estimated dropout log odds ratio was 0.21 [-0.13;0.55] (Figure 5c) (only 7 venlafaxine and 5 duloxetine studies were included because of missing data). Venlafaxine seems more efficacious both in reduction of symptoms and in terms of the number of responders (the corresponding odds ratio is 2.01), for a similar safety profile.
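The odds ratios quoted alongside the response log odds ratios follow from exponentiating the estimate, and the same transformation applies to the interval bounds. The short sketch below shows the arithmetic; the function name is hypothetical, and rounding to two decimals is an assumption about how the figures were reported.

```python
import math


def to_odds_ratio(log_or, lower, upper, digits=2):
    """Convert a log odds ratio and its interval bounds to the odds ratio scale."""
    return tuple(round(math.exp(value), digits) for value in (log_or, lower, upper))


# Response: fluoxetine versus duloxetine, then venlafaxine versus duloxetine.
fluoxetine_response = to_odds_ratio(-0.21, -0.44, 0.03)
venlafaxine_response = to_odds_ratio(0.70, 0.26, 1.14)
```

The first element of each tuple reproduces the reported odds ratios of 0.81 and 2.01. An interval that contains 0 on the log scale contains 1 on the odds ratio scale, which is why the fluoxetine comparison is not significant while the venlafaxine comparison is.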
For duloxetine compared with fluoxetine (Table 4), whether investigating the primary outcome (efficacy as measured by the derived HAMD scale) or the response factor, the results were stable across adjustments: no adjustment improved the fit (the residual between-study variance estimate remained approximately constant), and confidence intervals remained wide and stable. The effect size of the best prediction (smallest residual between-study variance) was 0.12 [-0.14;0.38]. The odds ratio of the response factor varied from 0.81 to 0.95, numerically favouring duloxetine in every analysis and reaching borderline significance when the estimate was close to 0.81. The residual between-study variance was constant. Concerning the dropout factor, the odds ratio varied from 1.21 to 1.40, numerically favouring fluoxetine in every analysis. Adjusting for duration of the study revealed a significant advantage in favour of fluoxetine (corresponding odds ratio 1.40). This advantage is borderline significant when adjusting for duration (corresponding odds ratio 1.36). The residual between-study variance was constant.
Whatever the parameter of interest or the adjustment factor considered, the fact that variances were imputed did not change the conclusions.
When studies were removed one at a time from the analysis set, the conclusions did not change, except when either of two particular studies was removed, in which case statistical significance was reached in favour of duloxetine: -0.27 [-0.50; -0.01] (odds ratio 0.76).
Analyses of the subgroup of fluoxetine studies in which the number of analysed patients was greater than 20 gave similar results: 0.09 [-0.09;0.26] for efficacy (13 fluoxetine studies), still favouring fluoxetine; -0.22 [-0.46;0.02] for the response factor (10 fluoxetine studies), still favouring duloxetine; and -0.02 [-0.33;0.28] for the dropout factor (7 fluoxetine studies).
For duloxetine compared with venlafaxine (Table 4), the efficacy effect size varied from 0.16 to 0.25, significantly favouring venlafaxine in all analyses except when adjusting for sex distribution, where the result was borderline significant (0.16 [-0.01;0.33]) though still numerically favouring venlafaxine. The residual between-study variance was small in all analyses. The model with the best fit (smallest residual between-study variance) gave an estimated effect size of 0.25 [0.11;0.40], significantly in favour of venlafaxine. For the response factor, the odds ratio varied from 1.75 to 2.46, significantly favouring venlafaxine in all analyses. The residual between-study variance remained stable; the best fit (smallest residual between-study variance) corresponded to an odds ratio of 1.75. For dropouts, the odds ratio varied from 1.14 to 1.30 across adjustments, numerically favouring venlafaxine in all analyses. The residual between-study variance remained stable and small.
When studies were removed one at a time from the analysis set, the conclusions did not change, which supports the robustness of the results.
The use of the meta-regression method to indirectly compare duloxetine with each active comparator revealed no significant difference from fluoxetine in either efficacy or safety. Findings only suggest that more patients might respond to duloxetine. Results suggest that duloxetine might be significantly less effective than venlafaxine (in terms of treatment effect and number of responders), with similar dropout rates.
Results of the sensitivity analyses showed relatively good consistency, as no analysis changed the conclusions. The results became nonsignificant in one analysis comparing venlafaxine with duloxetine, but the estimated value barely moved. When either of two particular studies was removed from the analysis set, duloxetine-treated patients had a statistically greater chance of responding than fluoxetine-treated patients. These findings were obtained by removing the studies least favourable to duloxetine, and we found no differences in design or patients' characteristics that might explain why. These tests showing significance (when comparing fluoxetine to duloxetine) or non-significance (when comparing venlafaxine to duloxetine) may, as in every study where multiple testing is performed, be due to a drop in statistical power, which can bias the conclusions. As consistent trends were found between the different drugs, the findings are considered robust to the confounding factors that were investigated.
Our findings should, however, be interpreted with caution. Findings of superior efficacy by indirect comparisons are observational and therefore vulnerable to bias. Yet, several articles have recently shown that indirect comparisons adjusted at the aggregate level usually agree with direct comparisons. An indirect meta-analysis of studies comparing olanzapine with haloperidol and risperidone with haloperidol yielded conclusions similar to those found in a direct comparative randomized clinical trial of olanzapine and risperidone. Song et al. demonstrated that the results of adjusted indirect comparisons were usually similar to those of direct comparisons. In their study, there were a few significant discrepancies between the direct and the indirect estimates, although the direction of discrepancy was unpredictable. The authors concluded that empirical evidence presented in their study clearly indicates that in most cases, results of adjusted indirect comparisons are not significantly different from those of direct comparisons.
While we recognize that none of the trials involving duloxetine used venlafaxine as an active comparator, our results are in accordance with a recent meta-analysis comparing duloxetine and venlafaxine in the treatment of MDD and a review comparing second-generation antidepressants.
Vis et al. used results of 6 trials with duloxetine and 4 with venlafaxine to report the efficacy and safety of either venlafaxine or duloxetine compared with placebo. They found that venlafaxine rates for remission and response were respectively 17.8% (CI95% 9.0–26.5) and 24.4% (CI95% 15.0–37.7) greater than placebo, compared with 14.2% (CI95% 8.9–26.5) and 18.6% (CI95% 13.0–24.2) for duloxetine. Reported adverse events were comparable between active drugs. The authors concluded that venlafaxine showed a favorable trend in remission and response rates compared with duloxetine, but that no significant between-drug differences were observed for dropout rates and adverse events. Due to the nature of the methodology used, no objective evidence concerning how venlafaxine performs when compared with duloxetine can be drawn. Nonetheless, the numerical trend seen in this paper is in accordance with the ones found here.
A review of second-generation antidepressants' efficacy in the treatment of MDD by Hansen et al. found that significantly more patients responded to venlafaxine than to fluoxetine. The relative benefit, 1.12 (CI95% 1.02–1.23), favoured venlafaxine. This result suggests the same pattern found here: response rates for venlafaxine are superior to those for duloxetine, which are equal to those for fluoxetine.
Concerning available comparisons with fluoxetine, of the 9 randomized clinical trials that evaluated the efficacy and safety of duloxetine, only two used fluoxetine as an active comparator [4,9]. Neither of these studies was specifically designed and powered to facilitate head-to-head comparisons between duloxetine and fluoxetine. The primary goal was comparison of duloxetine vs. placebo. These two studies (65% power) were identical parallel-group, double-blind, forced-titration, active- and placebo-controlled studies comparing duloxetine titrated from 20 mg to 60 mg BID with placebo over 8 weeks of acute treatment. A fluoxetine 20 mg QD arm was used as an internal active comparator standard. In these studies, duloxetine was statistically significantly superior to placebo on the primary analysis (mean change from baseline in the HAMD-17 total score) and for some of the secondary endpoints. There was no statistically significant difference between fluoxetine and placebo for mean change in HAMD-17 total score in either study. The fluoxetine treatment groups were underpowered qualitative control arms: (i) they included half as many patients as the duloxetine and placebo arms, reaching low numbers (33 and 37); (ii) they compared a fixed dose at the minimum of the recommended range for fluoxetine (20 mg/day) with the highest tested dose of duloxetine (120 mg/day). Higher doses of fluoxetine might have proven more effective, and a more robust comparison of duloxetine and fluoxetine should include a broader and more optimal dose range. Furthermore, as fluoxetine has proven to have an effect when compared with placebo [47,48], these direct comparisons are not sufficient to draw conclusions about duloxetine's superiority over fluoxetine.
Superiority of one antidepressant medication relative to another needs to be established by means of prospectively designed, adequately powered, head-to-head clinical trials. As the results of placebo-controlled trials are often sufficient to obtain regulatory approval of new drugs, pharmaceutical companies may not be motivated to support trials that compare new drugs with existing active treatments. Lack of evidence from direct comparison between active interventions makes it difficult for clinicians to choose the most effective treatment for patients. Because of the lack of direct evidence, indirect comparisons have been recommended. Adjusted indirect comparison is a way to compare two compounds through their relative effects vs. a common comparator (placebo in our study). The indirect approach to meta-analysis requires certain conditions to yield optimal results. Differences in study designs, inclusion/exclusion criteria and patients' characteristics at baseline, as well as differences in drug dosage and publication bias, are limitations that may lead to unbalanced conclusions and merit discussion.
Our study had some limitations. First, the time frame differs between active drugs. Because fluoxetine is older than venlafaxine and duloxetine, inclusion criteria for MDD were based on DSM-III or DSM-III-R criteria (not DSM-IV) in the majority of the fluoxetine studies, unlike the venlafaxine and duloxetine studies. Secondly, sample sizes seem to be smaller for the fluoxetine studies, which also include patients with lower HAM-D scores (14 to 19). Thirdly, the patients' characteristics, even if they vary only slightly, can act as confounding factors and bias the results. Fourthly, dosages varied between studies and between drugs. Lastly, the missing data might not be balanced between treatments. All these sources of heterogeneity could lead to bias. Considering that the computation of an effect size included adjustment for baseline severity differences and that the influence of patient characteristics and study designs was assessed through sensitivity analyses, some confidence can be placed in the results if they show stability over the different analyses. Also, the random-effects nature of the model used here should be able to deal with the remaining bias that could not be measured or properly modelled. Finally, the other major issue in any meta-analysis is potential publication bias. Publication bias is a major source of systematic bias in overviews, where trials with positive results are more likely to be published than those with neutral or negative results, especially if the trials are small. We therefore tested for publication bias using the Egger test for funnel plot asymmetry. Completely ruling out publication bias is nearly impossible. Even so, any bias would most likely be in favour of the newer drug, and its existence would not undermine the results presented here.
In the absence of a well-powered randomised placebo-controlled direct comparison trial, meta-regression analysis offers the most rigorous evidence available. Although the level of evidence provided by indirect comparisons is lower than that provided by direct comparisons, in some cases indirect comparisons have been able to predict the results of head-to-head clinical trials. The capacity for prediction is nonetheless directly linked to the quality of the methodology used and the information available. Both have been discussed in the body of this paper, and in this context the results seem stable enough to be confident that biases are controlled and that the results provide valuable additional information to health care professionals, health economists and the pharmaceutical industry. These results suggest superiority of venlafaxine over duloxetine and absence of a difference between fluoxetine and duloxetine. In any case, a well-designed trial directly comparing the efficacy of duloxetine with that of other existing antidepressants, particularly venlafaxine, would be welcome to challenge or reinforce our findings.
Each author has made substantial contributions at every phase in the planning and writing of the manuscript. All authors contributed equally to the drafting and critical revision of this work.
H. Lundbeck A/S provided funding for this research, which is part of the doctoral thesis of Laurent Eckert.
Christophe Lançon declares no conflict of interest or receipt of funding from any source.