Ioannidis labels the effect size he reported as small, and we will not argue about the label "small". We present data in the accompanying tables, based on a range of studies of the most widely used antidepressants. These include hundreds of placebo-controlled, randomized, double-blind trials (the best controlled studies in the evidence hierarchy), conducted throughout the world by scientists in industry as well as in academic settings, although we cannot cover all of Ioannidis's 1000 randomized controlled trials. The efficacy shown is consistent with pragmatic, epidemiological, service, and clinical research. Note the agreement between the studies Ioannidis quotes and ours for the newer studies, as summarized in the tables. His argument is that the effect size would be reduced to essentially zero (i.e., antidepressants are not materially superior to placebo) because the data are biased by industry's suppression of negative studies, that it would be smaller still if negative unpublished studies were counted, and that his six biases would surely decrease it further. We question the logic of Ioannidis's assertion that industry bias could reduce efficacy to zero. For that to happen, there would have to be an equal number of studies showing drug worse than placebo, and Ioannidis fails to show any evidence that this is the case. He starts with an effect size of 0.31 and assumes that suppression of negative studies would reduce it still more, but his effect size of 0.31 is based on all studies in this FDA report, not just published studies. Because industry is required by law to report all studies to the FDA for registration, this is another reason to doubt his premise.
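The arithmetic behind this objection can be sketched in a few lines. The following is a minimal illustration, not a meta-analytic method: it assumes every study carries equal weight (real meta-analyses weight studies by precision), and the function name and the study counts are hypothetical. Only the effect size of 0.31 comes from the text above.

```python
def required_hidden_mean(observed_mean, n_observed, n_hidden, target=0.0):
    """Mean effect the hidden studies would need so that the simple
    (equal-weight) pooled mean of all studies equals `target`."""
    if n_hidden <= 0:
        raise ValueError("n_hidden must be positive")
    total = n_observed + n_hidden
    # Solve (observed_mean * n_observed + hidden_mean * n_hidden) / total = target
    return (target * total - observed_mean * n_observed) / n_hidden


# 0.31 is the effect size quoted above; the study counts are hypothetical.
equal_count = required_hidden_mean(0.31, n_observed=50, n_hidden=50)
half_count = required_hidden_mean(0.31, n_observed=50, n_hidden=25)

print(f"Hidden studies equal in number: mean effect needed = {equal_count:.2f}")
print(f"Hidden studies half as many:    mean effect needed = {half_count:.2f}")
```

Under these assumptions, hidden studies with merely null results could only dilute the pooled effect toward zero, never reach it. Reaching zero would require hidden studies in which the drug performs worse than placebo, by as much as the observed studies favor it if the two sets are equal in number.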
Table 2 summarizes the percent of patients relapsing on placebo or drug in several meta-analyses and in one individual NIMH-supported 5-year study.
Ioannidis states, and we agree, that current evidence suggests that more severely depressed patients show a larger absolute degree of improvement relative to placebo controls than do more mildly depressed patients and symptomatic volunteers [35-38], but he does not report the effect size in moderately and severely depressed patients. He criticizes the use of exclusion criteria, which do reduce the ability to generalize to a broader range of patients. However, excluding seriously depressed hospitalized or suicidal patients limits generalization to precisely the patients who need antidepressants and benefit from them the most. As a result, the effect size of antidepressants might be greater than reported.
Ioannidis also makes a few technical methodological criticisms of clinical trials, which would apply to trials of medical drugs as well, and we agree that these are problem areas for all drugs. We agree with Ioannidis when he notes that the drug-placebo difference has decreased in recent years [26, 27]. Some of the reasons for this are:
(a) More recent studies exclude suicidal patients and hospitalized patients with more severe depression, who show a larger drug-placebo difference, although an exact comparison of effect sizes should not be made because of methodological differences between earlier and more recent trials;
(b) Many of those with low baseline rating scores do not have the type of depression helped by drugs;
(c) Patients who were helped by drugs in the past no longer volunteer for placebo-controlled trials;
(d) Participants in the more recent trials are not depressed patients seeking consultation from a physician, but rather symptomatic volunteers who answer an ad for a clinical trial run by clinical trial companies working on contract with pharmaceutical companies and paid per case;
(e) The clinical trial companies have difficulty finding patients and may inflate the baseline rating to ensure that the patient is enrolled in the study, introducing a false improvement in both the drug and placebo groups (baseline inflation has been well documented in recent studies [39-41]);
(f) Volunteers may collect their payment but not actually take their pills, further reducing drug-placebo differences.
We question his generalization to virtually all depressed patients of data from a limited number of studies of a few antidepressants, drawn mostly from mild cases and volunteers who answered an advertisement.
Psychiatry has undergone a paradigmatic shift over the years in how it conceptualizes depression. An earlier version of the DSM (DSM-II) viewed mild depression as a psychoneurosis (neurosis with depressed mood), for which psychotherapy was indicated, and considered only severe depressions as manic-depressive disease. At that time, research was just beginning to distinguish bipolar disease from unipolar depression. Also, psychotic depression was not distinguished as a distinct clinical entity from severe non-psychotic depression. Ioannidis states that antidepressants may be useful in a few cases of severe depression, but antidepressant monotherapy is not very effective in treatment-resistant depression, severe psychotic depression, or bipolar depression, where antidepressants must be combined with other types of treatment. The newer DSM-III and DSM-IV placed milder depression in the category of depression, not neurosis.
We next examine Ioannidis's six criticisms, which suggest that biases reduce the effect size to zero. We believe that these factors generally operate in the opposite direction and, if taken into consideration, would be expected to increase the effect size.
i) Studies have "non-relevant outcomes", that is, the average change on rating scales is too small an improvement to be clinically relevant:
We use the clinically important definition that, to be a responder, a patient must show an improvement of 50% or greater. His assumption that a numerically small average difference on a rating scale is clinically irrelevant is flawed, because not all patients had the same degree of improvement. Some remitted, even though most did not.
ii) Studies are too short:
The long-term studies, including the meta-analysis he quoted, generally showed larger effect sizes than the shorter studies he noted.
iii) The statistics used falsely inflated drug-placebo differences:
Ioannidis says that including patients in the analysis when they dropped out of the study (as with last-observation-carried-forward analyses) "may lead to overestimates of treatment efficacy in some circumstances." The primary reason depressed patients drop out of the placebo arm is that their depression worsened [42]. In many cases, the clinician has concerns such as the risk of suicide, worsening depression, suffering, and suicidal ideation, which lead the clinician, for ethical reasons, to withdraw the patient from the trial and to initiate non-blinded treatment. Overall, dropout from clinical trials for poor efficacy is 5 times more frequent in the placebo arm [43]. If such patients are eliminated from the analysis, as suggested by Ioannidis, drug efficacy will be underestimated. Studies using the newer, favored statistical models that are recommended instead of last-observation-carried-forward techniques show a greater drug-placebo difference, which is the opposite of what Ioannidis asserts [42-47].
iv) Too many exclusion criteria might inflate drug-placebo differences:
It is true that exclusion of patients often reduces generalization, but the exclusion of suicidal and seriously depressed patients reduces drug-placebo differences, which is the opposite of what Ioannidis concludes.
v) Placebo lead-in periods falsely inflate the drug-placebo differences:
A lead-in period, usually lasting a few days or a week, during which placebo may be given, might have the opposite effect of what Ioannidis asserts [35, 45-48]. If previous drug treatments are not washed out completely with sufficient lead-in periods, this would make the placebo effect greater, the opposite of Ioannidis's suggestion. In any case, eliminating the washout period affects the drug and placebo groups equally in a double-blind trial, holding the lead-in effect constant for the trial itself. Insofar as there is an effect, it seems to be in the opposite direction from that postulated by Ioannidis [49, 50].
vi) Use of multiple groups (3, 4, or 5 groups) versus one placebo group is unethical and might reduce drug-placebo difference:
Ioannidis criticizes studies with multiple drug comparisons, but studies with multiple experimental drug arms are generally dose-finding studies. He does not recognize that studies of this type are important for establishing the therapeutic dose range of a drug. Using a dose too low for full clinical efficacy would expose patients to side effects without the benefit of efficacy, and using too high a dose would expose patients to unnecessary side effects with no greater efficacy. Both issues demand dose-ranging clinical trials and satisfy equipoise concerns. These trials are necessary to find the best dose for efficacy while exposing the patient to the least risk of side effects, an essential component of balancing the risk-benefit ratio for any medical therapy. Furthermore, most of the meta-analyses Ioannidis quotes and the other recent meta-analyses he cites used relatively low treatment doses, and too low a dose results in an underestimate of the true effect size [1, 35-37]. In addition, dose-ranging studies empirically have a lower effect size than two-arm studies.
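The distinction drawn under point (i), between an average rating-scale difference and the proportion of responders, can be made concrete with a small sketch. The numbers below are invented purely for illustration and are not trial data. The 50% threshold follows the responder definition given under point (i), and the names `RESPONSE_THRESHOLD` and `summarize` are hypothetical.

```python
from statistics import mean

# Responder definition from point (i): at least 50% improvement from baseline.
RESPONSE_THRESHOLD = 50.0


def summarize(improvements):
    """Return (mean percent improvement, fraction of responders) for one arm."""
    responders = [x for x in improvements if x >= RESPONSE_THRESHOLD]
    return mean(improvements), len(responders) / len(improvements)


# Hypothetical percent improvements for ten patients per arm (illustrative only).
drug = [80, 70, 60, 55, 20, 15, 10, 10, 5, 5]
placebo = [60, 55, 30, 25, 20, 20, 15, 15, 10, 10]

drug_mean, drug_rate = summarize(drug)
placebo_mean, placebo_rate = summarize(placebo)

print(f"Mean improvement: drug {drug_mean}%, placebo {placebo_mean}%")
print(f"Responder rate:   drug {drug_rate:.0%}, placebo {placebo_rate:.0%}")
```

In this made-up example the arms differ by only a few points in mean improvement, yet the drug arm contains twice as many responders, because improvement is unevenly distributed across patients rather than shared equally.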
Beyond these issues, one can speculate that antidepressants are even more efficacious than can be documented, owing to limitations of current clinical trial design. For example, many patients may respond to a second drug when the first does not work. For ethical reasons, we cannot study the full degree to which antidepressants shorten an episode, as this would require keeping patients on placebo for several years.
Ioannidis recognizes that unknown side effects could produce harm but does not recognize that unknown benefits could also occur. One example is the benefit of treatment shortening an episode. Because it is generally considered unethical to withhold drug treatment for acute depression beyond a placebo period of 4-6 weeks, we have little placebo-controlled information about the period after that. However, there is good data from a large NIMH-funded treatment-resistant depression trial, the STAR*D trial, showing that continued treatment produces an improvement rate of 67%, whereas the initial antidepressant treatment of these patients produced an improvement rate of about 33%. There is further benefit from prevention of relapse, and from other outcomes not measured in most trials or in any trial. One can speculate that real efficacy is less or more than that which is measured. Our argument is not that antidepressants really have greater benefit than reported, but rather that it is not valid to conclude that they have no efficacy on the basis of speculation without specific evidence, when the existing evidence shows the opposite.