The recent epidemiologic data are most consistent with genetic effects on preterm birth that act through either the mother or the fetus, with fetal effects being due to imprinted genes where only the maternal copy is expressed. Svensson et al. (20) concluded that maternal genes are important, with negligible effects of fetal or paternal genes; however, as they pointed out, their analysis did not distinguish between effects of maternal genes and maternal-copy–expressed genes in the fetus. A third mechanism, which we mentioned earlier as also consistent with the evidence, involves a mitochondrial gene. Any of our 3 designs could also be used to address that hypothesis: One can compare mothers of cases with mothers of controls or compare the fathers with the mothers.
We have focused on which design is the most informative for detecting the effects of nuclear maternal or fetal genetic variants. If one assumes that such effects act directly through either the mother or the fetus via maternally inherited genes (or both), then all of those parameters can in principle be estimated by using any of the 3 designs. However, very specialized software would be required to accomplish this with a case-mother/control-mother design. By contrast, both the case-parent and the case-base alternatives enable straightforward estimation of all relative risk parameters and also permit inclusion of incompletely genotyped pairs. In practice, investigators have typically used the case-mother/control-mother design with logistic regression, sometimes fitting 2 separate models, one for the mothers and one for the babies. Both resulting sets of estimates are then confounded: fetal effects by maternal effects, and vice versa. A better approach fits a single multivariate logistic case-control model that includes both maternal and fetal genetic terms. Even with that approach, however, imprinting effects cannot be assessed.
Suppose that the genetic mechanism is entirely fetal, involving expression of the maternally derived allelic variant, and one looks only at mothers. The marginal maternal relative risks (unadjusted for fetal effects) will be (R1M + 1)/2 and R1M for 1 and 2 maternal copies, respectively. These mirroring patterns can potentially lead investigators astray. For example, a study (8) that compared only case and control mothers and reported S1 = 1.7 and S2 = 2.7 for a variant of a gene affecting vitamin C transport might actually have been detecting the effects of a fetal gene with relative risk near 3.
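The arithmetic behind this mirroring can be checked with a short sketch. It assumes Mendelian transmission (a mother carrying 1 copy passes it on with probability 1/2), no direct maternal effect, no effect of the paternally derived allele, and mothers with 0 copies as the reference group. The function name `marginal_maternal_rr` is hypothetical and introduced only for illustration.

```python
def marginal_maternal_rr(r1m: float) -> tuple[float, float]:
    """Apparent maternal relative risks for 1 and 2 maternal copies when the
    true effect is purely fetal and acts through the maternally derived allele.

    Assumptions (illustrative): Mendelian transmission, no direct maternal
    effect, reference group is mothers carrying 0 copies.
    """
    # A mother with 1 copy transmits the variant with probability 1/2, so her
    # offspring's risk is an equal mix of R1M and the baseline risk of 1.
    one_copy = 0.5 * r1m + 0.5 * 1.0
    # A mother with 2 copies always transmits the variant.
    two_copies = r1m
    return one_copy, two_copies


if __name__ == "__main__":
    s1, s2 = marginal_maternal_rr(3.0)
    print(f"apparent S1 = {s1:.1f}, apparent S2 = {s2:.1f}")
```

With R1M = 3, the apparent maternal relative risks are 2 and 3, which is roughly the pattern of the reported S1 = 1.7 and S2 = 2.7.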
A fourth design type could also be considered in certain settings. In place of the “base” sample of baby-mother pairs, one could sample parents. In the scenarios we simulated, the power for such an approach was better than that shown for the case-base design (data not shown). However, baby-mother pairs are often more recruitable than are mother-father pairs, and the latter approach could raise additional concerns related to paternity.
Which design should one use? Features of the 3 approaches we considered are summarized in the accompanying table. The case-mother/control-mother design is vulnerable to population stratification and bias from self-selection, and it provides no ready way to distinguish maternal from fetal effects or to evaluate possible parent-of-origin effects. The proposed case-base design has some of the same vulnerabilities but, if the required assumptions hold, it offers excellent power with an analysis that permits estimation of effects of exposures, an opportunity to characterize and differentiate maternal effects from fetal effects, and a way to include incompletely genotyped pairs. The case-parent design offers many of the same advantages. Somewhat counterintuitively, our power calculations suggest that, even if one assumes that the paternal genome is not relevant to risk and that there is only a fetal effect of an imprinted gene, under application of model 1, the case-parent design offered markedly better efficiency than did a case-base approach. Thus, even if fathers are not biologically important to preterm delivery, they can contribute much to its study. The case-parent design also enables an analysis that is more robust than, but just as flexible as, that for the case-base design for discriminating among genetic mechanisms. Finally, as mentioned above, the clinical dichotomy for preterm delivery as less than 37 completed weeks' gestation is arbitrary, and gestational length can be thought of as a quantitative trait. Methods exist (31) to use cases and parents to take advantage of the added information implicit in the actual length of gestation among babies born preterm.
Comparison of Features of the Designs Considered, Other Than Statistical Power for Hypothesis Testing
Suppose that an investigator has already used a case-mother/control-mother approach and wishes to estimate S1, S2, and R1M. A slight revision of the data can make this possible. Randomly sample M of the cases, where M is chosen so that M/(N + M) equals the rate of preterm delivery in the population under study and N is the number of participating controls. Now mix those M cases into the control group to form a pseudorandom baby-mother sample. This sample can then serve as the base sample for a case-base analysis, which uses LEM to maximize the likelihoods and to estimate relative risks with their associated confidence intervals.
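The resampling step can be sketched as follows. Writing p for the population rate of preterm delivery, the condition M/(N + M) = p gives M = pN/(1 - p). This is a minimal sketch under stated assumptions: the function names `n_cases_to_mix` and `build_pseudo_base` are hypothetical, each baby-mother pair is represented by an arbitrary Python object, M is rounded to the nearest integer, and the likelihood fitting in LEM is not shown.

```python
import random
from typing import Optional, Sequence


def n_cases_to_mix(n_controls: int, preterm_rate: float) -> int:
    """Number of cases M to add so that M / (N + M) equals preterm_rate."""
    if not 0.0 < preterm_rate < 1.0:
        raise ValueError("preterm_rate must lie strictly between 0 and 1")
    # Solving M / (N + M) = p for M gives M = p * N / (1 - p).
    return round(n_controls * preterm_rate / (1.0 - preterm_rate))


def build_pseudo_base(cases: Sequence, controls: Sequence,
                      preterm_rate: float,
                      seed: Optional[int] = None) -> list:
    """Mix a random subset of case pairs into the control pairs to form a
    pseudorandom baby-mother base sample. The input sequences are not modified."""
    rng = random.Random(seed)
    m = n_cases_to_mix(len(controls), preterm_rate)
    if m > len(cases):
        raise ValueError("not enough cases to reach the target preterm rate")
    sampled_cases = rng.sample(list(cases), m)  # sampling without replacement
    base = list(controls) + sampled_cases
    rng.shuffle(base)  # order carries no information about case status
    return base


if __name__ == "__main__":
    # Toy data: 200 case pairs and 900 control pairs, assumed preterm rate 10%.
    cases = [("case", i) for i in range(200)]
    controls = [("control", i) for i in range(900)]
    base = build_pseudo_base(cases, controls, preterm_rate=0.10, seed=1)
    n_preterm = sum(1 for label, _ in base if label == "case")
    print(len(base), n_preterm)
```

In this toy example, 100 of the 200 cases are mixed with the 900 controls, so the pseudo-base sample has 1,000 pairs, 10% of them preterm. The value of the preterm rate must come from external knowledge of the population under study.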
In summary, well-chosen approaches to design and analysis can detect and characterize the likely roles of genetic variants in the etiology of preterm birth. The commonly applied case-mother/control-mother approach carries major limitations when studying an outcome that is not rare, but fortunately alternative designs exist that are at least as powerful, more robust, and more informative.