Small babies from a population with higher infant mortality often have better survival than small babies from a lower-risk population. This phenomenon can in principle be explained entirely by the presence of unmeasured confounding factors that increase mortality and decrease birth weight. Using a previously developed model for birth weight-specific mortality, the authors demonstrate specifically how strong unmeasured confounders can cause mortality curves stratified by known risk factors to intersect. In this model, the addition of a simple exposure (one that reduces birth weight and independently increases mortality) will produce the familiar reversal of risk among small babies. Furthermore, the model explicitly shows how the mix of high- and low-risk babies within a given stratum of birth weight produces lower mortality for high-risk babies at low birth weights. If unmeasured confounders are, in fact, responsible for the intersection of weight-specific mortality curves, then they must also (by virtue of being confounders) contribute to the strength of the observed gradient of mortality by birth weight. It follows that the true gradient of mortality with birth weight would be weaker than what is observed, if indeed there is any true gradient at all.
Birth weight is a powerful predictor of infant mortality, although whether birth weight itself is causally related to survival is subject to debate (1–8). The strong gradient of mortality at lower weights, seen even among babies born at term (1), could suggest a causal role of birth weight. On the other hand, the observation that small babies from a population with lower weights usually have better survival than small babies from a heavier population is not what one would predict if birth weight were a simple cause of mortality. Sometimes called the “pediatric” or “low birth weight paradox,” this phenomenon has been widely discussed (2, 3, 5, 9–16). A well-known example is the comparison of babies of smokers and nonsmokers. Infants of smokers weigh less on average than babies of nonsmokers and have higher infant mortality. However, at low birth weights, babies of smokers have lower mortality than infants of nonsmokers. Smoking has many effects on pregnancy, beyond that of decreasing birth weight and increasing perinatal mortality. It is not known how much (if any) of the smoking effect on mortality operates through birth weight (11, 12). If a hypothetical intervention could specifically erase only the effect of smoking on birth weight, any resulting improvement in mortality would be evidence of a causal effect of birth weight on mortality.
The phenomenon of intersecting mortality curves is perplexing only if we assume that, after removing known confounders, the whole gradient of mortality with decreasing birth weight is causal. Unmeasured confounding can, as previously argued (3, 15, 16), explain why mortality curves intersect. Strong unmeasured confounders can, at least in theory, also fully explain the gradient of mortality with birth weight (1). In this paper, we show that the strong unmeasured confounders that we have previously proposed can produce intersecting mortality curves. Our calculations explicitly demonstrate how confounding produces a mix of high- and low-risk babies that changes with birth weight and how this mix also produces changes in relative mortality across the spectrum of birth weight.
We use a previously described model (1) based on the following assumptions:
We then add a simple dichotomous exposure (F) that decreases birth weight and increases mortality. We show that stratifying by F results in intersecting mortality curves, with babies exposed to F having lower mortality at low birth weights (thus demonstrating a simple underlying mechanism for intersecting mortality curves). We replicate a few empirical observations using this model. We fit all the models using a standardized normal distribution (mean = 0, standard deviation (SD)=1), but we present all results using absolute birth weights.
Our empirical examples are based on US linked birth and death certificates between 1995 and 2002, compiled by the National Center for Health Statistics (http://www.nber.org/data/lbid.html). We excluded nonresidents (0.1%); twins, triplets, and so on (3.0%); and missing birth weight (<0.1%). We further excluded births from California (13.5%), which lack data on smoking and clinical gestation.
For births at term (37 weeks and higher), we used gestational age based on the date of the last menstrual period, and we estimated the parameters of the empirical “predominant” birth weight distributions within specified intervals of gestational age using the method proposed by Wilcox (http://eb.niehs.nih.gov/bwt/index.htm). Preterm births defined by the date of the last menstrual period yield heavily right-tailed birth weight distributions, presumably reflecting errors in gestational age. Defining preterm with the clinical estimates of gestation attenuates this problem. For the examples involving 33 and 35 weeks of gestation, we used the birth weight means and standard deviations proposed by Kramer et al. (17), obtained from a large population-based sample in which gestational age was generally assessed by early ultrasound. Their results are stratified by sex, and so we pooled the sex-specific estimates. We additionally excluded babies with a birth weight of >3,250 g at 33 weeks (4.9%) and of >4,250 g at 35 weeks (0.5%).
All birth weights were grouped into 250-g categories. We fit the model at the midpoint of each interval (thus introducing some imprecision), using only those categories with at least 4 neonatal deaths (deaths from birth to 28 days). We do not have a formal algorithm to optimize the choice of model parameters in fitting the model to the empirical data, nor to assess the goodness of fit. As an alternative, we report how many model-predicted points fell outside the 95% confidence interval of the empirical rate. We calculated confidence intervals using the normal approximation to the binomial distribution, which holds for a sample proportion when N is large.
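The interval check described above can be sketched as follows. This is a minimal illustration that assumes a Wald-type interval with z = 1.96; the function names and the counts are hypothetical and are not taken from the study data.

```python
import math

def wald_ci(deaths, births, z=1.96):
    """Approximate 95% CI for a proportion (normal approximation to the binomial)."""
    p = deaths / births
    se = math.sqrt(p * (1 - p) / births)
    return max(0.0, p - z * se), min(1.0, p + z * se)

def outside_ci(predicted_rate, deaths, births):
    """True if a model-predicted rate falls outside the empirical interval."""
    lo, hi = wald_ci(deaths, births)
    return not (lo <= predicted_rate <= hi)

# Hypothetical stratum: 40 neonatal deaths among 10,000 births.
lo, hi = wald_ci(deaths=40, births=10_000)
print(f"{lo * 1e4:.1f} to {hi * 1e4:.1f} per 10,000")
```

Counting the strata for which `outside_ci` returns `True` gives the informal measure of fit reported in place of a formal goodness-of-fit statistic.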
We assume a population of babies born at 40 weeks. The “target” birth weight has no impact on neonatal mortality, constant at 4 per 10,000 births (Figure 1A). We assume 2 rare factors, X1 and X2, which affect birth weight and mortality. We use the parameters reported in the paper by Basso et al. (1), after translating the relative risk into the odds ratio. We assume the mean target birth weight at 40 weeks to be 3,500 g, with a standard deviation of 460 g. X1 decreases birth weight by 782 g (SD=−1.7) and increases mortality to about 640 per 10,000 (i.e., with an odds ratio of 171). X2 increases birth weight by 782 g (SD=1.7) and increases mortality to about 64 per 10,000 births (odds ratio = 16). These factors are extremely rare at term (0.5% for X1 and 0.3% for X2). The 2 populations are shown, with their respective mortality—still entirely independent of birth weight—in Figure 1B. (The figure does not show the unusual situation in which both X1 and X2 are present, although all calculations account for these babies.) When the weight-specific mortality curve is plotted across these categories, it produces the familiar pattern of a strong gradient between birth weight and mortality (Figure 1C), which is, however, purely the result of confounding. Mortality increases at progressively lower birth weights because an increasing proportion of babies have X1.
We now add an exposure (F), which decreases birth weight by 230 g (SD=−0.5) and increases mortality with an odds ratio of 1.5 (Figure 2A). Otherwise, there are no differences in the population exposed to F. (The frequency of F is unspecified, as all calculations are conditional on F, and the conclusions are independent of its frequency.) X1 and X2 have the same parameters as before (Figure 2B). The weight-specific mortality curves for babies with and without F now intersect (Figure 2C).
Under our simple assumptions, babies can be small because 1) their target birth weight is small, 2) F made them smaller, 3) X1 made them smaller, or 4) F and X1 together made them smaller. A small target weight (scenario 1) does not confer any additional risk. In contrast, scenarios 2, 3, and 4 are all associated with higher mortality but at quite different levels. The odds ratio with F alone (scenario 2) is 1.5, while the odds ratio with X1 is 171 or 256.5 (scenario 3 or 4). Conditional on F, 4 possible combinations of factors are defined by the presence or absence of X1 and X2 within each birth weight stratum.
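To make these levels concrete, the sketch below converts each odds ratio into a mortality rate, starting from the model's baseline of 4 per 10,000. The helper name `risk_from_odds_ratio` is ours, introduced only for illustration.

```python
BASE_RISK = 4 / 10_000  # baseline neonatal mortality in the model

def risk_from_odds_ratio(base_risk, odds_ratio):
    """Apply an odds ratio to a baseline risk and return the resulting risk."""
    odds = base_risk / (1 - base_risk) * odds_ratio
    return odds / (1 + odds)

scenarios = {
    "small target weight only": 1.0,
    "F alone": 1.5,
    "X1 alone": 171.0,
    "F and X1": 171.0 * 1.5,  # 256.5, as stated in the text
}
for label, odds_ratio in scenarios.items():
    risk = risk_from_odds_ratio(BASE_RISK, odds_ratio)
    print(f"{label}: {risk * 10_000:.0f} per 10,000")
```

Under these assumptions, X1 alone gives roughly 640 deaths per 10,000, which matches the figure quoted for X1 in the model description.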
Consider the mix of these combinations in 1 small stratum of birth weight (Figure 3). Among babies with an observed birth weight of between 2,200 and 2,299 g, those unexposed to F have a 10.6% chance of having X1, while babies at this same weight who are exposed to F have only a 4.8% chance of having X1. In the simple universe represented by our model, small babies with F have a lower probability of having X1 than do the small babies without F. Even though babies with F have higher mortality at all weights, the different proportions of X1 in those with and without F at this birth weight create the appearance of an advantage among small babies with F.
More generally, the intersection of mortality curves under our model results from the varying mix of factors that affect birth weight and mortality—a mix that is weighted differently at each birth weight. This phenomenon is illustrated in detail in Table 1, which shows the frequency of X1 and X2, given the birth weight and F in 3 birth weight strata (details about these calculations are provided in the Appendix). Within each example, all babies have the same observed birth weight. The first 2 columns of data show the target birth weight (in 100-g intervals) prior to the action of F, X1, and X2. The target birth weight is achieved by babies with none of the factors and by the tiny fraction of babies with both X1 and X2—for whom the 2 birth weight shifts cancel out but who nonetheless have extremely high mortality. The last column in Table 1 shows the death rate (per 10,000 births) for each category and the overall mortality, given F, in the specified birth weight stratum.
In this example, babies with F have lower mortality than do babies without F between about 1,500 and 2,950 g. At higher birth weights, the pattern reverses, and babies without F have lower mortality. The relative difference in mortality increases further at higher birth weights, because babies with F are much more likely to have X2 than those without F.
In the above scenario, we have assumed that X1 and X2 have the same effects on babies regardless of whether they are exposed to F. If this is the case, the use of relative birth weight (or z score-adjusted birth weight), as suggested by Wilcox and Russell (18), will remove the intersection of mortality curves by, in effect, removing the connection between F and birth weight. In terms of directed acyclic graphs (3), this means that birth weight is no longer a collider, and the weight-specific contrast in mortality can be displayed without the distortion that comes with adjustment for a collider.
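Both the crossing of the curves and its removal under relative birth weight can be sketched numerically. The code uses the parameters stated for the model, with two assumptions of ours: odds ratios multiply when factors coincide, and mortality is evaluated at a single weight rather than over an interval. All function names are hypothetical.

```python
import math

BASE_RISK = 4 / 10_000
MEAN, SD = 3500.0, 460.0           # target birth weight at 40 weeks (g)
P_X1, P_X2 = 0.005, 0.003          # prevalence of X1 and X2
X_SHIFT, F_SHIFT = 782.0, 230.0    # weight shifts (g)
OR_X1, OR_X2, OR_F = 171.0, 16.0, 1.5

def risk(odds_ratio):
    odds = BASE_RISK / (1 - BASE_RISK) * odds_ratio
    return odds / (1 + odds)

def density(x, mean):
    z = (x - mean) / SD
    return math.exp(-0.5 * z * z)  # normalizing constant cancels in the ratio

def mortality(weight, f):
    """Mortality at an observed weight, mixing the 4 (X1, X2) groups given F."""
    num = den = 0.0
    for x1 in (0, 1):
        for x2 in (0, 1):
            prev = (P_X1 if x1 else 1 - P_X1) * (P_X2 if x2 else 1 - P_X2)
            mean = MEAN - X_SHIFT * x1 + X_SHIFT * x2 - F_SHIFT * f
            w = prev * density(weight, mean)
            # Assumption: odds ratios combine multiplicatively.
            num += w * risk(OR_X1 ** x1 * OR_X2 ** x2 * OR_F ** f)
            den += w
    return num / den

def mortality_at_z(z, f):
    """Mortality at a weight expressed relative to the group's own mean."""
    return mortality(MEAN - F_SHIFT * f + z * SD, f)

for w in (2250, 3500):
    print(w, round(mortality(w, 0) * 1e4, 1), round(mortality(w, 1) * 1e4, 1))
```

At 2,250 g the exposed group shows the lower rate, whereas at 3,500 g it shows the higher one. Comparing `mortality_at_z(z, 1)` with `mortality_at_z(z, 0)` at any common z score gives higher mortality with F, because the mix of X1 and X2 is then the same in both groups.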
Reality may be more complex if, for example, babies with X1 do not survive to term if they also have F, or in the presence of an interaction between F and X1 on birth weight or on mortality.
We now show how this model behaves in reproducing empirical weight-specific mortality curves stratified by a given factor. The parameters used to fit the empirical examples are reported in Table 2. We present examples with smoking and selected categories of gestational age.
Given that babies of smokers are lighter than babies of nonsmokers, it is possible that, at any given birth weight, they will have a higher gestational age. To minimize this potential problem, we derived the empirical parameters for smoking from US babies born between 39 and 41 weeks in 1995–2002. (Estimates for smokers would have been too unstable with a narrower interval). The calculated standard deviations of the predominant distributions were virtually identical (453 g and 452 g) for smokers and nonsmokers, while the mean birth weight was 193 g smaller for smokers. We used 452 g as the common standard deviation and assigned babies of smokers to have higher baseline mortality (with an odds ratio of 1.4 compared with nonsmokers). To obtain a reasonable fit, we allowed the effect of X1 on birth weight to be smaller in smokers than in nonsmokers (Table 2), since the shape of the mortality curve is strongly influenced by this parameter. Overall, we obtain a reasonable approximation of the empirical curves by assuming equal frequencies and equal mortality effects (in relative terms) for X1 and X2 in smokers and nonsmokers (Figure 4). (Fitted and empirical rates are reported in Appendix Table 2).
Intersecting weight-specific mortality curves also occur when different gestational ages are compared. In Figure 5A, the empirical mortality curves are shown for births at 33, 35, and 37 weeks. At some low birth weights, babies born at 33 weeks have lower mortality than do babies born at 35 or 37 weeks. This may occur through the mechanism illustrated above. To fit the curves, we allowed X1 and X2 to have a different prevalence at each gestational week, as well as different effects on birth weight and mortality (Table 2). We reproduced the general pattern of the mortality curves reasonably well (Figure 5B) by assigning higher proportions of X1 and X2 at earlier gestations. The fit is worst for the heaviest categories of weight at 33 weeks, most likely because of the contamination of the observed data by babies born at later gestations. Table 3 shows how, on the basis of the modeled curves, the different proportions of X1 and X2 at 1,500–1,599 g can lead babies born at 33 weeks to have lower mortality than their more mature counterparts do. (Calculations in the table are exact, unlike those in the figures, where mortality was estimated at the midpoint for each 250-g category).
Our model of birth weight-specific mortality yields curves very close to the empirically observed ones by making only a few simple—if radical—assumptions. The model assumes that birth weight is itself not causal and that the observed gradient of mortality with birth weight is due to unmeasured but exceedingly strong confounding factors with strong effects on birth weight and mortality. Although these assumptions may seem extreme, effects of this magnitude are, in fact, required to produce the empirical curves in the absence of a causal effect of birth weight (1). When assuming that birth weight has no causal effect, we can explain the optimum birth weight (the weight associated with the lowest mortality) as that at which the mix of confounders allows for the lowest mortality (1, 4).
There may be circumstances in which a baby's size directly causes mortality (e.g., macrosomia as a cause of birth injury or extreme thinness as a factor in neonatal starvation). Nonetheless, it is also apparent that reductions in birth weight can be irrelevant to mortality (as seen in babies born at high altitude) (5). In the case of smoking, the crossing of the mortality curves can be parsimoniously explained by assuming that birth weight has no direct or indirect effects on mortality. Because babies of smokers are smaller, they will be more mature than babies of nonsmokers at any given birth weight. We restricted our example to the interval of 39–41 weeks of gestation, which should ensure a relatively uniform level of developmental maturity in the 2 groups. Although this amounts to stratifying on a collider, we expect X1 and smoking to remain approximately independent among babies born at 39–41 weeks, because the association between smoking and preterm birth was modest in this data set (odds ratio = 1.25).
The empirical mortality curves at preterm weeks were reproduced fairly accurately, despite the limitations imposed by errors in gestational age, and the calculations showed how small babies at 33 weeks may have lower mortality than more mature babies with the same birth weight.
It is also interesting that, at preterm weeks, X2 appears to act independently of actual birth weight, with the upturn of the mortality curve happening at lower birth weights than among babies at term. The comparisons further suggest that both X1 and X2 are more prevalent at early gestational ages and may thus cause preterm birth, as well as growth restriction and mortality.
As previously discussed (1), the gradient of mortality with birth weight might be explained entirely by rare, potent, and as yet unidentified causes of death that independently affect birth weight. The entities “X1” and “X2” may not be single factors but combinations of very rare factors and, perhaps, very different combinations at different gestations. Some components of X1 (and X2) may already be clinically recognized but—because of their rarity—not acknowledged as an important source of the birth weight–mortality relation.
The model parameters for reproducing the empirical examples were chosen somewhat arbitrarily within the constraints of the model assumptions. We attempted to fit the curves with the same frequency of X1 and X2 for smoking to show that, in theory, these factors could be equally prevalent despite the differences in the weight-specific mortality. (We could have fit the model with a slightly weaker interaction by allowing babies of smokers to have a lower prevalence of X1 at 39–41 weeks). Even though we allowed for an interaction between X1 and smoking on birth weight, we did not need an interaction involving mortality, and the effect of birth weight was the same (i.e., none) in smokers and nonsmokers. These demonstrations are intended to provide a proof of principle rather than to argue that this model fully explains reality. Still, it is notable how well the model reproduces the empirical curves, especially given such simple assumptions and crude fitting procedure.
The phenomenon of intersecting curves is consistent with the presence of at least 1 unmeasured confounder that reduces birth weight and increases mortality. The presence of such confounding does not preclude the possibility that birth weight per se could be causally linked to mortality. If such confounders exist, however, the true biologic effect of birth weight on mortality would be weaker than what we observe.
The presence of unknown confounding factors as an explanation for intersecting mortality curves can also be applied to gestational age. Although fetal immaturity (birth at an early gestational age) is a plausible direct cause of death, some of the causes of preterm birth probably also contribute independently to mortality. In the presence of such confounding factors, the lower mortality of twins compared with singletons at early gestations can be explained with a mechanism analogous to that illustrated here, as previously suggested (19).
Our explanation for intersecting curves is parsimonious and biologically plausible. Any alternative explanation would seem to require complex interactions among the causal factors at play, birth weight, and mortality. If X1-like factors do not exist, then one would have to propose, for example, that the underlying effect of birth weight on mortality differs between smokers and nonsmokers. Additionally, there would have to be similar interactions between birth weight and mortality operating separately for each of the many factors that affect birth weight and produce intersecting curves.
When discussing intersecting gestational age-specific mortality curves, Klebanoff and Schoendorf (14) suggested that the underlying cause of preterm birth may influence the baby's chances of survival at a given gestational age. Here, we demonstrate this mechanism for birth weight. Under our model, the intersection of mortality curves is explained by the presence of confounding variables and the resulting unequal mix of those variables across the birth weight distribution. For those who would use birth weight as a general-purpose marker of risk, the inconvenient truth is that birth weight is likely affected by many unmeasured factors that carry very different risks of death. As a consequence, birth weight is—at best—an unreliable mediating variable or surrogate endpoint in the study of neonatal and infant mortality.
Author affiliation: Epidemiology Branch, National Institute of Environmental Health Sciences, National Institutes of Health, Department of Health and Human Services, Research Triangle Park, North Carolina (Olga Basso, Allen J. Wilcox).
This research was supported by the Intramural Program of the National Institute of Environmental Health Sciences, National Institutes of Health (Z01 ES044003).
The authors thank Dr. Clarice R. Weinberg and Dr. David M. Umbach for their comments on an earlier version of this manuscript.
Conflict of interest: none declared.
All babies in this example have an observed birth weight between 2,200 and 2,300 g. We calculated the target birth weight (the weight they would have had without X1, X2, and F) for each combination of X1, X2, and F. A baby with only X1 would have weighed 782 g more, that is, 2,982–3,082 g. A baby with F and X1 would have weighed 1,012 (782 + 230) g more, and so on. (Refer to “tBW [target birth weight] Interval” in Appendix Table 1).
The overall relative frequencies of each combination of X1 and X2 are determined by the parameters of the model. Because we condition on F, the sum of the 4 probabilities will be 1 within each category of F (“p(X1, X2|F)” in Appendix Table 1). For example, the probability of having neither X1 nor X2 is (1 − 0.005) × (1 − 0.003); that of having only X1 will be (0.005) × (1 − 0.003).
The probabilities of each of the 8 categories of target birth weight depend upon the normal distribution (“p(tBW|X1, X2, F)” in Appendix Table 1). Each of the target birth weight categories has a different probability and contributes proportionally to the category of observed birth weight. We obtained these probabilities by taking the difference between the values of the cumulative distribution function at the 2 ends of the interval, using the NORMDIST function in Microsoft Excel software (Microsoft Corporation, Redmond, Washington).
We multiplied p(X1, X2|F) by p(tBW|X1, X2, F). The resulting values represent the absolute probabilities of each combination of X1, X2, and target birth weight interval among babies with and without F, respectively. Dividing each row value by the sum of the 4 probabilities within the corresponding stratum of F, we obtain the X1, X2 composition in the 2,200–2,300-g weight stratum, as shown in the last column of the table.
|Factors||tBW Interval, g||p(X1, X2|F)||p(tBW|X1, X2, F)||p(tBW, X1, X2|F)||Interval Composition|
|No X1, no X2||2,200–2,300||0.99202||0.00219||0.0022||0.89388|
|X1 and X2||2,200–2,300||0.00002||0.00219||0.0000||0.00001|
|No X1, no X2||2,430–2,530||0.99202||0.00748||0.0074||0.95153|
|X1 and X2||2,430–2,530||0.00002||0.00748||0.0000||0.00001|
Abbreviation: tBW, target birth weight.
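The appendix calculation for the 2,200–2,300-g stratum can be reproduced with a short script. This is a sketch under the stated model parameters; it uses `math.erf` in place of the Excel NORMDIST function, and the function names are ours.

```python
import math

MEAN, SD = 3500.0, 460.0           # target birth weight distribution (g)
P_X1, P_X2 = 0.005, 0.003          # prevalence of X1 and X2
X_SHIFT, F_SHIFT = 782.0, 230.0    # weight shifts (g)

def normal_cdf(x):
    """Cumulative distribution function of the target birth weight."""
    return 0.5 * (1 + math.erf((x - MEAN) / (SD * math.sqrt(2))))

def composition(lo, hi, f):
    """Share of each (X1, X2) combination among babies observed in [lo, hi]."""
    joint = {}
    for x1 in (0, 1):
        for x2 in (0, 1):
            # p(X1, X2|F): the factors are independent of each other and of F.
            p_x = (P_X1 if x1 else 1 - P_X1) * (P_X2 if x2 else 1 - P_X2)
            # Undo the shifts to recover the target birth weight interval.
            undo = X_SHIFT * x1 - X_SHIFT * x2 + F_SHIFT * f
            p_tbw = normal_cdf(hi + undo) - normal_cdf(lo + undo)
            joint[(x1, x2)] = p_x * p_tbw
    total = sum(joint.values())
    return {key: value / total for key, value in joint.items()}

for f in (0, 1):
    shares = composition(2200, 2300, f)
    print(f"F={f}: X1 only {shares[(1, 0)]:.3f}, neither {shares[(0, 0)]:.3f}")
```

The share with neither factor should agree with the "Interval Composition" column to within rounding, and the X1-only share should be close to the 10.6% and 4.8% quoted in the main text for babies without and with F.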
|Weight, g||z Score||Calculated||Empirical||95% Confidence Interval||Deaths, no.||No.|