In 2003, Caspi and colleagues (17) reported an increasingly positive relationship between number of self-reported stressful life events and depression risk among individuals having more short alleles at the serotonin transporter (5-HTTLPR) polymorphism. Their study has been extremely influential, having tallied over 3,000 citations and a large number of replication attempts. We review this specific cG×E hypothesis and the attempts to replicate it because it highlights the important issue of direct compared with indirect replications and because it potentially illustrates the issues surrounding publication bias and false discovery rates discussed above.
Both direct cG×E replications, which use the same statistical model on the same outcome variable, genetic polymorphism, and environmental moderator tested in the original report, and indirect cG×E replications, which replicate some but not all aspects of an original report, exist in the cG×E literature. Indirect replications might sometimes be conducted to help understand the generalizability of an original report (1) and might in other cases be conducted out of necessity because available variables do not match those in the original report. However, it is also possible that in an unknown number of cases, a positive indirect replication was discovered by testing additional hypotheses after a direct replication test was negative. Sullivan (36) showed that when replications in candidate gene association (main effect) studies are defined loosely, the type I error rate can be very high (up to 96% in his simulations). The possibilities for loosely defined, indirect replications are even more extensive in cG×E research than in candidate gene main effect research because of the additional environmental and moderator predictors. Thus, we believe it is important that only direct replications are considered when gauging the validity of the original cG×E finding (see also Chanock et al. [37]). Once an interaction is supported by direct replications, indirect replications can gauge the generalizability of the original finding, but until then they should be considered novel reports, not replications.
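The inflation of type I error under loose replication criteria can be illustrated with a minimal sketch. It is not a reconstruction of Sullivan's simulations: it assumes, purely for illustration, that each alternative analysis is an independent test of a true null hypothesis at the same significance threshold, and the function name is hypothetical.

```python
def familywise_false_positive_rate(alpha: float, n_tests: int) -> float:
    """Probability that at least one of n_tests independent tests of a
    true null hypothesis is significant at threshold alpha.

    Assumption (not from the source): the tests are independent.
    Correlated analyses of the same data set would inflate the rate
    more slowly than this formula suggests.
    """
    return 1.0 - (1.0 - alpha) ** n_tests


if __name__ == "__main__":
    # Illustrative numbers of alternative analyses an investigator might try.
    for n_tests in (1, 5, 20, 50):
        rate = familywise_false_positive_rate(0.05, n_tests)
        print(f"{n_tests:>2} candidate analyses: "
              f"P(at least one p < .05) = {rate:.2f}")
```

Under these assumptions, the chance of obtaining at least one nominally significant "replication" grows quickly with the number of outcome, genotype, environment, and model variants that are allowed to count as a replication.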
The decision of how indirect a replication attempt can be in order to be included in a review or meta-analysis is critical for gauging whether a finding has been supported in the literature. With respect to the interaction of 5-HTTLPR and stressful life events on depression, a meta-analysis by Munafo et al. (38) and subsequent meta-analysis by Risch et al. (5) examined results and/or data from 14 overlapping but not identical replication attempts and failed to find evidence supporting the original interaction reported by Caspi et al. (17). However, a much more inclusive meta-analysis by Karg et al. (39) looking at 56 replication attempts found evidence that strongly supports the general hypothesis that 5-HTTLPR moderates the relationship between stress and depression. Karg et al. argue that these contradictory conclusions were mainly caused by the different sets of studies included in the three analyses. Karg et al. included studies that Munafo et al. (38), Risch et al. (5), and we, in this report, consider to be indirect replications. For example, Karg et al. included studies investigating a wide range of alternative environmental stressors (e.g., hip fractures), alternative outcome measures (e.g., physical and mental distress), and alternative statistical models (e.g., dominant genetic models). Furthermore, 11 studies included in the Karg et al. analysis used “exposure only” designs that investigate only those individuals who have been exposed to the stressor. We excluded such designs in this review because they do not actually test interactions; rather, interactions must be inferred by assuming an opposite or no relationship between the risk allele and the outcome in nonexposed individuals. Additionally, the result from at least one of the studies deemed supportive of the interaction in Karg and colleagues' meta-analysis (40) is actually in the opposite direction of the original finding when the same statistical model employed in the original report is used (5). Taken together, the pattern of results emerging from these three meta- and mega-analyses is surprisingly consistent: direct replication attempts of the original finding have generally not been supportive, whereas indirect replication attempts generally have.
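The point about "exposure only" designs can be made explicit with a generic moderated regression model. The notation below is an assumption introduced for illustration and is not taken from the original reports: Y is the outcome, G the risk-allele count, E a 0/1 indicator of exposure to the stressor, and β₃ the interaction coefficient of interest.

```latex
\begin{align}
\text{Full design:}\quad
  & Y = \beta_0 + \beta_1 G + \beta_2 E + \beta_3 (G \times E) + \varepsilon \\
\text{Exposure only } (E = 1):\quad
  & Y = (\beta_0 + \beta_2) + (\beta_1 + \beta_3)\, G + \varepsilon
\end{align}
```

When every participant is exposed, the genotype slope estimates only the sum β₁ + β₃. Concluding that β₃ differs from zero therefore requires an untested assumption about β₁, the genotype effect among nonexposed individuals, for example that it is zero or opposite in sign.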
There also appears to be evidence of publication bias among the studies included in the Karg et al. (39) article. As we have shown to be the case in the broader cG×E literature, larger studies included in the Karg et al. meta-analysis were less likely to yield significant results. A logistic model regressing replication status (significant replication compared with not) on sample size among studies included in their meta-analysis found that the odds of a significant replication of Caspi's original finding decreased by 10% for every additional 100 participants (β=–0.001, p=0.02).
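The step from the reported coefficient to the stated change in odds can be checked with a short calculation. This is a sketch that assumes β is expressed per additional participant on the log-odds scale; the function name is hypothetical, and because the reported β is rounded, the result only approximately matches the 10% figure.

```python
import math


def percent_change_in_odds(beta_per_unit: float, units: float) -> float:
    """Percent change in the odds implied by a logistic regression
    coefficient when the predictor increases by `units`.

    Assumption: beta_per_unit is on the log-odds scale, per one unit
    of the predictor (here, per one additional participant).
    """
    odds_ratio = math.exp(beta_per_unit * units)
    return (odds_ratio - 1.0) * 100.0


if __name__ == "__main__":
    # Coefficient reported in the text, scaled to 100 additional participants.
    change = percent_change_in_odds(beta_per_unit=-0.001, units=100)
    print(f"Change in odds per 100 participants: {change:.1f}%")
```

With the rounded coefficient the odds ratio per 100 participants is exp(−0.1), which is roughly a one-tenth reduction in the odds of a significant replication.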
Karg et al. (39) touch on the possibility for publication bias to affect their results by calculating the fail-safe ratio. They note that 14 studies would have to have gone unpublished for every published study in order for their meta-analytic results to be nonsignificant. While this ratio is intended to seem unreachably high, a couple of points should be kept in mind. First, the fail-safe ratio speaks not to unpublished studies but rather to unpublished analyses. As discussed above, possibilities for alternative analyses (i.e., indirect cG×E replications) abound: alternative outcome, genotypic, and environmental variables can be investigated; covariates or additional moderators can be added to the model; additive, recessive, and dominant genetic models can be tested; phenotypic and environmental variables can be transformed; and the original finding can be tested in subsamples of the data. We observed each of these situations at least once among studies consistent with or replicating the original 5-HTTLPR-by-stressful life event interaction, and such indirect replications can have a high false positive rate. Second, and most importantly, Karg et al. used extremely liberal inclusion criteria, analyzing many indirect replications that we either classified as novel studies or excluded completely. Thus, the findings of Karg et al. and the findings we present here recapitulate one another; almost all novel studies (our review) and indirect replications (the Karg et al. meta-analysis) are positive, whereas most direct replications are not. This suggests that positive meta-analytic findings become more likely as study heterogeneity increases. Notably, this is exactly the opposite of what would be expected if the original results were true. Stricter replication attempts should be more likely, not less likely, to be significant. Rather than interpreting the fail-safe ratio as evidence that most cG×E findings are true, this ratio might be better interpreted as providing a rough estimate of how large the “file drawer problem” is in the cG×E field.
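For readers unfamiliar with the statistic, the following sketch shows one common way a fail-safe ratio can be computed. It assumes Rosenthal's fail-safe N (the number of additional null results needed to bring a combined one-tailed test to nonsignificance) divided by the number of included studies; the source does not state which variant Karg et al. used, and the z-scores below are hypothetical placeholders, not data from any study.

```python
from statistics import NormalDist
from typing import Sequence


def fail_safe_n(z_scores: Sequence[float], alpha: float = 0.05) -> float:
    """Rosenthal's fail-safe N: how many unseen results averaging z = 0
    would be needed for the combined (Stouffer) test to lose significance
    at one-tailed level alpha."""
    k = len(z_scores)
    z_crit = NormalDist().inv_cdf(1.0 - alpha)  # about 1.645 for alpha = .05
    return (sum(z_scores) ** 2) / (z_crit ** 2) - k


def fail_safe_ratio(z_scores: Sequence[float], alpha: float = 0.05) -> float:
    """Fail-safe N expressed per included study (assumed definition)."""
    return fail_safe_n(z_scores, alpha) / len(z_scores)


if __name__ == "__main__":
    hypothetical_z = [2.0, 2.0, 2.0, 2.0]  # placeholder values only
    print(f"fail-safe N:     {fail_safe_n(hypothetical_z):.1f}")
    print(f"fail-safe ratio: {fail_safe_ratio(hypothetical_z):.1f}")
```

Note that the calculation counts null results, not studies: each unreported alternative analysis with a null result contributes to the file drawer in the same way as an unpublished study would.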