Little is known about how the miscarriage rate has changed over the past few decades in the United States. Data from Cycles IV to VI of the National Survey of Family Growth (NSFG) were used to examine trends from 1970 to 2000. After accounting for abortion availability and the characteristics of pregnant women, the rate of reported miscarriages increased by about 1.0% per year. This upward trend is strongest in the first seven weeks and absent after 12 weeks of pregnancy. African American and Hispanic women report lower rates of early miscarriage than do whites. The probability of reporting a miscarriage rises by about 5% per year of completed schooling. The upward trend, especially in early miscarriages, suggests awareness of pregnancy rather than prenatal care to be a key factor in explaining the evolution of self-reported miscarriages. Any beneficial effects of prenatal care on early miscarriage are obscured by this factor. Differences in adoption of early-awareness technology, such as home pregnancy tests, should be taken into account when analyzing results from self-reports or clinical trials relying on awareness of pregnancy in its early weeks.
Since the 1970s, there has been a significant improvement in technology and utilization of maternal and prenatal care (Kiely and Kogan 1994), including early and continuous risk assessment, promotion of healthy behaviors, medical and psychological interventions, and follow-ups (Klerman 1990). Improved prenatal care and increased information about the risks associated with certain behaviors might be expected to have reduced the frequency of fetal loss.1
At the same time, the introduction of home pregnancy tests allowed women, for the first time, to obtain highly accurate confirmation of their pregnancy at a very early stage without the need to see a doctor. However, the availability of home pregnancy tests might have a perverse effect on the measured frequency of spontaneous abortion (hereafter referred to as “miscarriage”). The risk of miscarriage is highest in the very early stages of pregnancy (Wilcox et al. 1999) and may be as high as 25% during the first six weeks following the last menstrual period. At this stage, miscarriages are often asymptomatic (Pandya et al. 1996), and awareness of pregnancy is a key factor in determining whether a woman recognizes a miscarriage. Although it is implausible that home pregnancy tests directly affect the risk of miscarriage, their use could affect the frequency with which women recognize miscarriage and therefore report it in survey data.
Trends driven by increased awareness of pregnancy should manifest themselves primarily in early pregnancy, while medical care will (if anything) have a stronger effect on late miscarriages.2 We use self-reported data on pregnancy outcomes from the National Survey of Family Growth (NSFG) to examine the evolution of rates of reported fetal loss in the first 22 weeks of pregnancy (which we loosely term “miscarriage”) from 1970 to 2000. This period includes the expansion of prenatal care and the introduction of home pregnancy tests. We disaggregate miscarriages into “early” (7 weeks or less), “middle” (8 to 12 weeks), and “late” (more than 12 weeks) miscarriages. There is a clear upward trend in self-reported early miscarriages, with a weaker trend in self-reported miscarriages occurring in the middle period and no trend in self-reported late miscarriage.
Our results provide detailed information about the evolution of miscarriage rates over time. National Vital Statistics Reports (Ventura et al. 2000, 2008) also provided estimates of all forms of fetal loss (including stillbirths and ectopic pregnancies) from 1976 to 2004. However, their approach assumes that fetal loss rates remain constant within each five-year age/race/Hispanic-origin group except when a new survey is used to compute these within-group rates. Between surveys, all trends in fetal loss reflect only changes in the expected rate of fetal loss attributable to changes in the composition of pregnant women or to changes in the frequency of induced abortion. In contrast, our approach captures changes between surveys and accounts for variation in the incidence and timing of induced abortion. Because some induced abortions preempt a miscarriage that would otherwise have occurred, addressing the statistical complications caused by induced abortions is essential.
The data used in the analysis are extracted from Cycles IV to VI of the NSFG, administered by the National Center for Health Statistics in 1988, 1995, and 2002. The NSFG collects data on family life, infertility, use of contraception, and women’s health. The fourth cycle was the first to record length of gestation in weeks for all pregnancies regardless of outcome and is therefore the first that we can use for our purposes.
The NSFG interviewed a nationally representative sample of noninstitutionalized women aged 15–44 at the time of the interview. Sample size varies by survey,3 totaling 26,940 women across the three cycles, of whom 10,959 had been pregnant at least once. In 1988, the surveys were administered in person, using a paper-and-pencil questionnaire. In 1995 and 2002, the surveys were administered using a computer-aided interview, which was able to detect inconsistent answers. All three cycles included self-administered sections that allowed the respondent to provide information privately on sensitive matters such as abortion.
All three cycles included information on each woman’s pregnancies and their self-reported outcomes (birth, abortion, miscarriage, ectopic pregnancy, or stillbirth). The different cycles of the NSFG are generally comparable, although the exact information gathered and its coding varies. The NSFG also consistently collected information on other characteristics of the respondent, such as educational attainment, marital status, race, religion, family income, and parental education, all measured at the time of the interview. Unfortunately, data on some important risk factors, such as smoking during pregnancy (Anokute 1986; Dominguez-Rojas et al. 1994), were not collected in all cycles and therefore cannot be included in our analysis.
We organize the data by the year in which the pregnancy occurred. Because the NSFG is conducted at infrequent and irregular intervals, it is important to select the sample carefully to ensure consistency across pregnancy years. Without restrictions, the sample of women potentially experiencing pregnancy would be, for example, 12–41 years old in 1999 (15–44 years old in 2002), 8–37 years old in 1995, and some odd weighting of women 5–34 years old (15–44 in 2002) and 12–41 years old (15–44 in 1995) in 1992. Unless we control perfectly for age and there is no variation in trends by age, true trends may be confounded with changes in the age composition of the sample by year.
To avoid this potential problem, we choose age groups and pregnancy years in a manner that ensures that the age group studied remains constant over time. We focus on a sample of pregnancies occurring between 1970 and 2000 among women aged 13 to 25 because this is the largest sample for which we can construct an age-consistent sample back to 1970. We experimented with a second age-consistent sample of pregnancies between 1980 and 2000 among women aged 26 to 35, but the time period was too short to produce meaningful results. The trends that we uncovered were similar to those reported in this article but were never statistically significant at conventional levels.
As a compromise and further check on our results, we analyze an unbalanced sample that includes all pregnancies between 1970 and 2000 among women aged 13 to 35. Although the estimates using this sample are subject to the concern that the age distribution of the sample varies across years, we reduce this concern by restricting the upper end of the age range. To further minimize this concern, we include dummy variables for each one-year age-at-pregnancy group. As we shall see, the results are similar to those obtained using the age-consistent sample.
To ensure that our data do not include any incomplete pregnancies or pregnancies that would have been incomplete at the time of the survey had they gone to term, we also restrict our analysis in both samples to pregnancies occurring at least two calendar years prior to the survey.4
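The age arithmetic behind these sample restrictions can be sketched as follows. This is a minimal illustration, not the authors' code: the function names are hypothetical, ages are computed from whole calendar-year differences (ignoring birthdays and interview months), and a pregnancy year is treated as covered when a single survey, fielded at least two calendar years later, observes the whole age range.

```python
SURVEY_YEARS = (1988, 1995, 2002)  # NSFG Cycles IV to VI (from the text)
MIN_AGE, MAX_AGE = 15, 44          # respondent ages at interview (from the text)


def observable_ages(pregnancy_year, survey_year):
    """Approximate ages in pregnancy_year of women aged 15-44 in survey_year."""
    lag = survey_year - pregnancy_year
    return (MIN_AGE - lag, MAX_AGE - lag)


def covered(pregnancy_year, low, high, min_lag=2):
    """True if some survey, at least min_lag years later, observes ages low..high."""
    for survey_year in SURVEY_YEARS:
        if survey_year - pregnancy_year < min_lag:
            continue  # too recent (or the survey predates the pregnancy year)
        youngest, oldest = observable_ages(pregnancy_year, survey_year)
        if youngest <= low and high <= oldest:
            return True
    return False


# Ages observable in 1992 from the 1995 and 2002 surveys.
print(observable_ages(1992, 1995), observable_ages(1992, 2002))
# Check whether every pregnancy year from 1970 to 2000 covers ages 13-25.
print(all(covered(year, 13, 25) for year in range(1970, 2001)))
```

Under these simplifying assumptions, ages 13 to 25 are observable in every pregnancy year from 1970 to 2000, whereas ages 26 to 35 are not observable in 1970 but are by 1980.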
We include mother’s education (high school dropout, high school graduate, some college, college graduate or more) in our set of control variables. Mother’s education is imputed for a significant minority of observations. In Cycle IV, mother’s education is imputed whenever it is missing. In Cycles V and VI, there are a small number of respondents for whom mother’s education is missing and not imputed, and we drop these observations. The miscarriage rate for those who were dropped for this reason is virtually identical to that for the sample as a whole. We also drop two pregnancies reported as having lasted zero weeks.
The three surveys recorded information about each pregnancy respondents had experienced. The 1970–2000 sample (13–25 years old) has data on 24,544 pregnancies, including 2,897 miscarriages. The length of gestation was recorded in weeks. However, some women reported duration in months, causing small spikes in the duration distribution at weeks 4, 9, 13, and so on. The (age-inconsistent) sample of 13- to 35-year-olds contains 38,122 pregnancies, of which 4,855 ended in miscarriage.
When asked the outcome of their pregnancy, the NSFG respondents could choose among birth, abortion, miscarriage, or stillbirth.5 The option of ectopic pregnancy was added in Cycles V and VI, although in Cycle IV, women could volunteer that the pregnancy was ectopic. There are some indications that ectopic pregnancies were simply less likely to be reported in the earlier cycle. For a given year of pregnancy, the miscarriage rate excluding ectopic pregnancies is independent of the survey year. However, for a given year of pregnancy, ectopic pregnancies were more likely to be reported in the later surveys because in those surveys, ectopic pregnancy was explicitly presented as a possible pregnancy outcome. The differences in reporting are most easily explained by nonreporting of ectopic pregnancies rather than their misreporting as miscarriages.
We do not want to report an increase in fetal loss rates that may simply reflect an improvement in the questionnaire. Moreover, ectopic pregnancies are conceptually distinct from miscarriages. We therefore drop ectopic pregnancies from the sample, grouping reported miscarriages and stillbirths occurring in the first 22 weeks (5 months) of pregnancy and referring to them loosely as miscarriages. We choose 22 weeks, rather than the more standard 20 weeks used to define miscarriages, to capture those respondents rounding their response to 5 months. All pregnancies lasting more than 22 weeks, regardless of reported outcome, are treated as nonmiscarriages.
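The spikes produced by month-based answers, and the choice of a 22-week cutoff, can be illustrated with a small calculation. This sketch assumes, purely for illustration, that a duration reported in months is converted at 52/12 weeks per month and rounded to the nearest week; the conversion actually applied to the NSFG data is not described here, and the function name is hypothetical.

```python
WEEKS_PER_MONTH = 52 / 12  # assumed conversion factor, not taken from the NSFG


def months_to_weeks(months):
    """Convert a duration reported in whole months to the nearest week."""
    return round(months * WEEKS_PER_MONTH)


# Weeks at which answers of one to five months would pile up.
heaping_weeks = [months_to_weeks(m) for m in range(1, 6)]
print(heaping_weeks)
```

Under this assumption, answers of one to three months land on weeks 4, 9, and 13, and an answer of five months lands on week 22, which a 20-week cutoff would exclude.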
Fortunately, because the number of ectopic pregnancies is small (187 in the smaller sample and 406 in the larger), the results are similar regardless of whether we include or exclude ectopic/tubal pregnancies. Based on hospital discharges, the Centers for Disease Control and Prevention (CDC) reported an upward trend in ectopic pregnancies during the 1970s and 1980s (Goldner et al. 1993), but no such trend is present in our data. As a result, the choice between the broad measure of fetal loss and the narrower “reported miscarriage or stillbirth in first 22 weeks” has no effect on any of our principal results and little effect on the others. The narrower definition results in somewhat smaller point estimates of the time trend, but significance levels are rarely affected. Despite this generally positive assessment, there is one important caveat: when we include ectopic pregnancies in a general measure of fetal loss, for some specifications using the unbalanced sample, we cannot conclude that we should exclude survey dummy variables from the specification. When both survey dummy variables and pregnancy year are included in the specification, our results are too imprecise to be useful. Ectopic pregnancies are sufficiently infrequent in our younger sample that, in practice, the differential treatment of ectopic pregnancy in different cycles turns out not to be a problem for this group.
Duration of pregnancy is self-reported. A small number of women reported a miscarriage in the first two weeks of pregnancy. About 10% of miscarriages are recorded as having occurred in the fourth week, many of which were undoubtedly reported as occurring in the “first month.” Presumably, these are mostly miscarriages that occurred in the first month following the first missed period, not in the first month after the last period (as physicians prefer to measure pregnancy duration). We have not tried to adjust the reported durations.6
Finally, as discussed earlier, we divide miscarriages into early (7 weeks or less), middle (8 to 12 weeks), and late (more than 12 weeks) miscarriages. These categories each account for roughly one-third of the miscarriages in the unweighted data.
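The outcome coding described in this section can be summarized in a short function. It is a sketch of the stated rules only; the outcome labels and the function name are hypothetical and do not correspond to actual NSFG variable codes.

```python
def classify_pregnancy(outcome, weeks):
    """Apply the sample rules described in the text to one pregnancy record.

    Returns None for records dropped from the sample, otherwise one of
    'early', 'middle', 'late', or 'not_miscarriage'.
    """
    if outcome == "ectopic" or weeks == 0:
        return None  # ectopic and zero-week pregnancies are dropped
    if outcome in ("miscarriage", "stillbirth") and weeks <= 22:
        if weeks <= 7:
            return "early"
        if weeks <= 12:
            return "middle"
        return "late"
    # Births, abortions, and any loss after 22 weeks are nonmiscarriages.
    return "not_miscarriage"


print(classify_pregnancy("miscarriage", 9))
```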
Although we provide information on individual-level risk factors for miscarriage, our focus is on estimating the time trend in miscarriage while controlling for some of these factors. We use a hazard model (Lancaster 1990) in which each pregnancy is considered a spell and the length of the pregnancy is the duration variable:

$$\lambda(t \mid X) = \lambda_0(t)\,\exp(X'\beta).$$

The model specifies the risk of miscarriage, $\lambda$, as proportional to some unknown baseline hazard, $\lambda_0$, common to all women, and depending exponentially on a vector of time-invariant covariates,7 $X$ (Cox 1972). It facilitates nonparametric estimation of the baseline hazard and provides a straightforward interpretation of the parameters: a one-unit increase in one of the covariates will cause a percentage increase or decrease in the risk of miscarriage approximately equal to 100 times the value of the corresponding parameter. The probability of a pregnancy lasting until time $t + 1$ given that it has lasted until $t$ is given by

$$P(T \ge t + 1 \mid T \ge t) = \exp\left[-\exp\left(X'\beta + \gamma(t)\right)\right],$$

where $\gamma(t) = \ln \int_t^{t+1} \lambda_0(u)\,du$. Therefore, the log-likelihood function of a sample of $N$ individuals will be

$$\ln L = \sum_{i=1}^{N} \left\{ (1 - d_i)\,\ln\left[1 - \exp\left(-\exp\left(X_i'\beta + \gamma(T_i)\right)\right)\right] - \sum_{t=0}^{T_i - 1} \exp\left(X_i'\beta + \gamma(t)\right) \right\},$$

where $d_i = 1$ if the spell is censored, and $T_i$ is the time at which the pregnancy ends or is censored.
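To make the estimation target concrete, the following sketch evaluates a grouped-duration proportional hazards log-likelihood of the kind described above. It is an illustration under stated assumptions, not the authors' Stata code: the function and variable names are hypothetical, durations are in whole weeks, interval t runs from week t to week t + 1, and `gamma[t]` stands for the log of the baseline hazard integrated over that interval.

```python
import math


def log_likelihood(spells, beta, gamma):
    """Sum log-likelihood contributions over pregnancy spells.

    spells: iterable of (x, T, censored), where x holds the covariates,
            T is the interval in which the spell ends, and censored is True
            when the spell ends for a reason other than miscarriage.
    beta:   coefficients, one per covariate.
    gamma:  gamma[t] = log of the baseline hazard integrated over interval t.
    """
    total = 0.0
    for x, T, censored in spells:
        xb = sum(b * v for b, v in zip(beta, x))
        # Surviving each of the intervals 0..T-1 adds -exp(xb + gamma[t]).
        total -= sum(math.exp(xb + gamma[t]) for t in range(T))
        if not censored:
            # A miscarriage in interval T adds the log of the event probability.
            total += math.log(1.0 - math.exp(-math.exp(xb + gamma[T])))
    return total


# Hypothetical check: a constant 10% miscarriage risk per interval.
g = math.log(-math.log(0.9))
gamma = [g] * 5
print(log_likelihood([([0.0], 2, False)], [0.0], gamma))  # log(0.9 * 0.9 * 0.1)
```

A censored spell contributes only the survival terms, which is what allows pregnancies ending in birth or induced abortion to remain in the sample for as long as they were at risk.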
The hazard model framework can be applied to the competing risks scenario, where a spell (pregnancy) can end with the realization of different risks (induced abortion, miscarriage, birth). The availability of induced abortion necessitates the use of a competing risks model. An increase in the rate of induced abortion will mechanically reduce the miscarriage rate by reducing the amount of time that pregnant women are at risk for miscarriage. At the same time, simply eliminating pregnancies ending in induced abortions would have the opposite problem; pregnancies that would have ended in birth are disproportionately eliminated from the data.8 Moreover, women having abortions are not a random sample of pregnant women. Therefore, failing to account for the presence of abortion will not only reduce our estimate of the risk of miscarriage but will do so particularly for those women who are more likely to have an abortion, thereby biasing our estimates of the effects of other factors on miscarriage risk. Given independence of risks, the competing risks model allows us to treat observations ending in an outcome other than miscarriage as censored, simplifying the estimation procedure.
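A small numerical illustration shows why pregnancies ending in induced abortion can be neither ignored nor simply dropped. The per-period risks below are invented for the example and the function name is hypothetical; the two risks are assumed independent, with the miscarriage risk applied first within each period.

```python
def outcome_shares(misc_risk, abort_risk):
    """Return the shares of pregnancies ending in miscarriage, abortion, and birth."""
    at_risk, misc, abort = 1.0, 0.0, 0.0
    for m, a in zip(misc_risk, abort_risk):
        misc += at_risk * m      # miscarriages among pregnancies still at risk
        at_risk *= 1.0 - m
        abort += at_risk * a     # abortions among the remaining pregnancies
        at_risk *= 1.0 - a
    return misc, abort, at_risk


# Two periods with a 10% miscarriage risk each (hypothetical numbers).
no_abortion, _, _ = outcome_shares([0.1, 0.1], [0.0, 0.0])
misc, abort, birth = outcome_shares([0.1, 0.1], [0.5, 0.0])

raw_rate = misc                      # abortions kept in the denominator
dropped_rate = misc / (1.0 - abort)  # aborted pregnancies removed from the data
print(no_abortion, raw_rate, dropped_rate)
```

With these numbers, the miscarriage share in the absence of abortion is 0.19. Keeping abortions in the denominator gives 0.145, and dropping aborted pregnancies gives about 0.264, so the two naive calculations miss the underlying risk in opposite directions.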
The controls (X) are factors that are (a) consistently available across all three cycles for all pregnancies and either (b) likely to affect miscarriage or abortion or (c) likely to serve as proxies for such variables.
Thus, for example, we expect that Catholics will be less likely to make use of induced abortion. If so, in the absence of controls for religion, we will overestimate the risk of abortion for Catholics, which could, in turn, incorrectly lead us to observe a relation between miscarriage and variables correlated with being Catholic.
We control for race, not because we have any reason to believe that miscarriage is physically related to race, but because use of induced abortion differs by race and factors, such as health and smoking, that are candidates for sources of miscarriage risk are correlated with race. Education and mother’s education are also related to health and smoking, as well as alcohol and drug use, and are therefore included as controls. We include three age variables (age, age squared, and younger than 15) to capture the J-shaped relation between age and miscarriage. We also include whether the woman had a prior miscarriage. This will capture any physical or health conditions that make it difficult for a woman to give birth, as well as any persistent behavioral factors not captured by our other controls. Finally, we control for whether the woman reports that the pregnancy happened at the “right time.” Women who do not wish to give birth may take actions, short of induced abortion, that increase the likelihood of a spontaneous miscarriage.
These last two variables are both subject to concerns about potential endogeneity. For example, if the rate of miscarriage increases over time, then women who became pregnant in, say, 2000 will be more likely to have had a prior miscarriage than those who became pregnant in 1970. The prior miscarriage variable would capture part of the trend; thus, including it is a conservative strategy. It is also possible that women’s assessments of whether the timing of pregnancy was “right” are influenced by the outcome of the pregnancy. Women who give birth may be more reluctant to report that the pregnancy was badly timed. Fortunately, an earlier draft of this article excluded these two variables with no notable effect on the results.
All analyses were conducted using Stata 11. The Stata do-files and data can be found in Online Resource 1. The proportional hazard models were estimated using the stcox command.
We do not have sufficient data to estimate an accurate (relative) miscarriage rate for each year in our sample. Our focus is therefore on whether there is a trend (linear, quadratic, spline). If some factor, other than random sampling, causes the miscarriage rate in a year to deviate from the trend, then the standard errors reported using standard software packages may be severely biased downward. Following Donald and Lang (2007), we implement a two-step procedure in which we first estimate the hazard model with a set of year of pregnancy dummy variables, Di, in addition to controls for race, religion, and so on.
In a second stage, the estimated coefficients on these dummy variables are then regressed on the time trend. In practice, the coefficient and standard errors are similar regardless of whether we use a one-step or two-step method, but the two-step method facilitates examining more complex trends. The second stage is conducted using the reg, qreg, and rreg commands.
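The second stage can be sketched in a few lines. This is a plain-Python stand-in for an ordinary least squares fit, not the authors' do-file: the year-dummy coefficients below are synthetic placeholders rather than estimates from the paper, and the function name is hypothetical.

```python
import math


def ols_trend(years, coefs):
    """Regress first-stage year-dummy coefficients on year; return (slope, se)."""
    n = len(years)
    year_mean = sum(years) / n
    coef_mean = sum(coefs) / n
    sxx = sum((y - year_mean) ** 2 for y in years)
    sxy = sum((y - year_mean) * (c - coef_mean) for y, c in zip(years, coefs))
    slope = sxy / sxx
    intercept = coef_mean - slope * year_mean
    rss = sum((c - intercept - slope * y) ** 2 for y, c in zip(years, coefs))
    se = math.sqrt(rss / (n - 2) / sxx)  # one observation per pregnancy year
    return slope, se


# Synthetic coefficients: a 0.01 trend plus alternating deviations of +/-0.05.
years = list(range(1971, 2001))
coefs = [0.01 * (y - 1970) + (0.05 if y % 2 else -0.05) for y in years]
slope, se = ols_trend(years, coefs)
print(round(slope, 4), round(se, 4))
```

The point of the design is that the second-stage standard error is based on the number of pregnancy years rather than the number of pregnancies, so year-specific deviations from the trend are reflected in the reported precision.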
Table 1 presents descriptive statistics for all pregnancies and for miscarriages for our two samples. In the younger (balanced) sample, based on the self-reports, about 12% of pregnancies ended in a miscarriage. This is somewhat lower than the generally accepted rate for all pregnancies but is in the standard range for this age group (Andersen et al. 2000). The self-reported miscarriage rate rises to 13% for the sample including older women. Miscarriages occurred, on average, in the 10th week of pregnancy, although this timing varies considerably.
Not controlling for characteristics and changes in the incidence of abortion, pregnancies occurring early in the sample period are less likely to end in miscarriage. Although 13% of pregnancies to women 13–25 years old that did not end in abortion ended in miscarriage in the 1970–1979 period, this figure was 15% during the final years of our sample (1990–2000). A similar pattern is observed when older women are included in the sample, for whom the miscarriage rate rose from 13% to 16%.
Given pregnancy, whites are, on average, noticeably more likely to have a miscarriage than are African Americans or Hispanics. Women having miscarriages are somewhat more educated than the average, and they are less likely to have mothers who were high school dropouts. Our sample means for females 13–25 years old do not show a consistent relation between age and miscarriage, which is not surprising given the age group (Wood 1994:250–252; see also Abdullah et al. 1993; Smith and Buyalos 1996). As noted, there is a higher rate of miscarriage in the unbalanced sample, although this is partially explained by the lower frequency of abortion. There is no consistent relation between religion and miscarriage in the summary statistics.
Average age at pregnancy in our sample is around 21 for the younger sample and 25 in the unbalanced sample. Finally, having had a miscarriage before the current pregnancy increases the probability of this pregnancy ending in miscarriage, confirming the well-documented fact that some women have a greater tendency to miscarry (Kutteh 2005).
The data used in the analysis were collected up to roughly 30 years after the pregnancy (a woman aged 44 at the 2002 interview was 13 in 1971). This raises concerns that trends might be driven by recall bias. Although up to 30 years later, women have good recall of pregnancy and related events, such as personal characteristics at the time of pregnancy or medication taken (Tomeo et al. 1999), women might fail to report pregnancies, especially miscarriages, that occurred long ago. Gestational age is the major determinant of recall (Wilcox and Horney 1984), but time since pregnancy loss might also affect recall rates.
To assess the importance of recall bias, we estimate a basic model in which, in addition to our standard control variables, we include a time trend for year of pregnancy and dummy variables for each survey except the earliest included in the sample. If recall bias is important, we would expect, for example, that a 1990 miscarriage would be more likely to be reported in the 1995 survey than in the 2002 survey. Thus, the coefficients on the survey dummy variables should be negative and monotonically decreasing with recentness of the survey.
It is evident that this pattern does not arise in the first column of Table 2. Both coefficients are positive, and the coefficient on 2002 is more positive than the coefficient on 1995. The coefficients are neither individually nor jointly significant. In short, there is no evidence of recall bias in these data. We confirm that this is not due to imposing a linear time trend by repeating the exercise with dummy variables for each year of pregnancy. In this case (not shown), the coefficients on the survey dummy variables are tiny, with one positive and one negative.
A related concern is that small differences in the survey cycles might generate spurious trends. The lack of significance of the coefficients on the survey dummy variables makes this unlikely. As a further check, we run a “horse race” between a model with only a time trend and a model with only survey dummy variables. The Akaike information criterion (AIC) selects the model with only the time trend.9 We replicated Table 2 (not shown) for each race (white, African American, Hispanic, and other) and for our sample of “early miscarriages” with similar results. We are therefore confident that we can merge the data from the three cycles.
In the unbalanced sample, there is no evidence of recall bias, but consistent with our concerns about this sample, the preferred specification depends on the choice of information criterion. The AIC favors including the survey dummy variables, whereas under a criterion that penalizes extra variables more harshly, the specification without survey dummy variables is preferred. To maintain consistency with our treatment of the younger sample, we exclude the survey dummy variables. However, including them would strengthen our conclusion of an upward trend in early miscarriages but not in other miscarriages.
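How two criteria can disagree is easy to see from their standard definitions. The text does not say which harsher criterion was used; the Bayesian information criterion (BIC) is shown here only as one example. The log-likelihoods and parameter counts are invented for the illustration, and using the number of pregnancies as the sample size in the BIC is likewise an assumption.

```python
import math


def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return k * math.log(n) - 2 * log_lik


n = 38122  # pregnancies in the unbalanced sample (from the text)
trend_only = {"log_lik": -1000.0, "k": 1}    # hypothetical fit: time trend only
with_dummies = {"log_lik": -997.0, "k": 3}   # hypothetical fit: trend + 2 survey dummies

print(aic(**trend_only), aic(**with_dummies))
print(bic(n=n, **trend_only), bic(n=n, **with_dummies))
```

In this invented case, the AIC is lower for the specification with survey dummy variables, while the BIC, which charges about 10.5 points per parameter at this sample size instead of 2, is lower for the trend-only specification.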
Figure 1 presents the evolution of the relative rate of miscarriage for the sample of pregnancies among 13- to 25-year-old women from 1970 to 2000. The dots are the point estimates of the relative rate (log odds ratio), normalized to zero in 1970, and the lines are the smoothed estimates and the (separately) smoothed confidence intervals. The rate shows a general upward trend that is sharper during the 1970s and during the early 1990s. This sharp increase in the 1990s seems to be driven by a single outlier, but even without this outlier, there is a strong upward trend during the entire post-1970 period.
Table 3, which gives the results of the multivariate estimation, is divided into two panels corresponding to the samples we use. In each panel, the first line presents estimates of the time trend for all miscarriages, and the next three lines provide these estimates for early, middle, and late miscarriages. Although Fig. 1 suggests a more complicated pattern, the data are inadequate to estimate a higher order trend. We are never able to identify separate linear and quadratic components and therefore show only linear trends in the table.
The top panel examines the younger sample. When we include year of pregnancy in the hazard model (“one-step”), we estimate that miscarriages increased somewhat, by about 1.0% per year during the period. Two-stage estimation gives a slightly higher estimate when the second step is conducted using ordinary least squares (OLS). Second-step methods designed to reduce the influence of outliers (quantile and robust regression) do not change the substance of the results.
Finally, we experiment with a spline that allows the trend to begin at a later date (not shown). The best fit occurs when the trend begins at the beginning of the sample period. However, we cannot reject trends that begin before 1993.10 Thus, while we have strong evidence of an upward trend in miscarriage rates, we are not able to determine its timing precisely.
Our results for the other sample are broadly similar. As noted, the results for this sample must be treated with caution because the age distribution varies over time. We address this concern only partially by including a set of dummy variables for age at pregnancy. If miscarriage trends vary by age, using the unbalanced sample is problematic. Nevertheless, the similarity of the results in the two panels is reassuring.
Figure 2 shows the relative miscarriage rates, again measured by the log odds ratio, by timing using the younger sample. When we restrict miscarriages to those occurring early in pregnancy, there is a strong trend over the sample period. This is confirmed by formal statistical analysis (see the second line of the first panel of Table 3). Depending on the method used, we find that miscarriage in the first seven weeks of pregnancy rose by 1.2% to 1.6% per year. When we allow for a spline (not shown), the point estimate is that the trend began in 1986, but the confidence interval includes the entire period through 1992. As in the case of all miscarriages, we have strong evidence of a trend, yet we are unable to pinpoint its timing. Again, the trend estimates for the other sample are quite similar to those obtained using the younger sample.
In the younger sample, miscarriage occurring at 8 to 12 weeks (see the third line of Table 3) also trended upward. Depending on the choice of technique, however, the upward trend (0.7% to 0.9%) is not always statistically significant at the .05 level, which is consistent with the fact that it is less visually clear in Fig. 2. The best-fitting spline begins in 1972, but as with all and early miscarriages, the confidence interval is large and includes any start through 1994. Again, the broad pattern is found in the other sample. The estimates may be somewhat smaller than in the top panel but also tend to be more statistically significant.
Finally, we detect no evidence of a trend in miscarriages occurring after the first 12 weeks. The point estimates are almost all small, are negative in some cases, and do not approach statistical significance at conventional levels.
Before moving on, it is worth contrasting our results with those provided in National Vital Statistics Reports (Ventura et al. 2000, 2008). Although obtained from the same data source, their estimates were for all fetal losses (including those occurring after 22 weeks) and for all women aged 15–44. Therefore, we expect some discrepancies, but these differences do not account for all that we observe. When we calculate fetal loss as a fraction of pregnancies, as shown in Fig. 3, their estimates imply no trend from 1976 to 1987. These estimates jump very sharply from 1987 to 1990, presumably because the 1987 estimates rely on the 1982 and 1988 cycles of the NSFG, while the 1988 and 1989 estimates rely on the 1988 and 1995 cycles, and the estimates after 1990 rely on the 1995 and 2002 cycles. After 1990, there is no clear trend. Because the approach in those reports assumes constant within-group fetal loss rates for those not having abortions, it is also useful to look at fetal loss as a percentage of pregnancies not ending in abortion. This approach, also shown in Fig. 3, reveals a very slight upward trend throughout the 1980s and 1990s with a small jump in 1982, the first year that uses the 1988 as well as the 1982 cycle, and larger jumps in 1988 and 1990, again reflecting changes in the surveys used to calculate fetal loss rates.
Despite the different definition of the key variable and sample and differences in method that produce differences in timing, the broad message is similar. The ordinary least squares estimate of the trend in fetal loss is 1.2% per year with abortions in the denominator, and 0.8% per year when abortions are dropped.11
Table 4 presents the first stage results for the remaining covariates for all miscarriages and for the subperiods for the younger sample. When interpreting these results, it is important to remember that although we generally use the short-hand “miscarriage risk,” we are, in fact, capturing the risk of reporting having miscarried conditional on awareness of pregnancy. We focus on this important distinction in the discussion.
The risk of a reported miscarriage increases significantly with respondent’s education, although this effect is clearer in the age-consistent sample. For this group, the risk of miscarriage increases by about 5% with each additional year of own education. Although average levels of education have risen over the period we study, this effect is not sufficient to account for much of the increase in miscarriage rate. When a wider age range is included, the importance of education declines. This is consistent with formal education being a less important source of pregnancy information among older women.
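Because the model is exponential in the covariates, a coefficient translates into a proportional change in the hazard, and effects compound across units. The sketch below assumes, for illustration only, a coefficient of 0.05 per year of schooling (the text reports the effect only as about 5%); the function name is hypothetical.

```python
import math


def percent_change(beta, units=1.0):
    """Percentage change in the hazard for a `units` increase in a covariate."""
    return 100.0 * (math.exp(beta * units) - 1.0)


one_year = percent_change(0.05)       # one additional year of schooling
four_years = percent_change(0.05, 4)  # four additional years
print(round(one_year, 2), round(four_years, 2))
```

Under that assumption, one additional year of schooling raises the hazard of a reported miscarriage by about 5.1%, and four additional years raise it by about 22%.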
We also include mother’s education in four categories. Although the point estimates suggest that women with more-educated mothers are more likely to report a miscarriage, none of the coefficients approach statistical significance.
Also, in the age-consistent sample, African American and Hispanic women have a lower risk of miscarriage than do white women for all miscarriages and those in the first 12 weeks, and the estimated differences are large—in some estimates, as much as 50%—although the difference is significant for Hispanics only for the early weeks.12 The difference between whites and African Americans is statistically significant even in the late miscarriage estimates. This difference also tends to be smaller in the unbalanced sample but becomes statistically insignificant only for late miscarriages.
For the younger sample, the age variables are jointly significant only for all miscarriages taken together and approach significance (p = .07) for early miscarriages. Age is associated with diminishing miscarriage risk over most of the relevant age range. Given that we consider only pregnancies occurring before age 25, the beneficial effect of being older is consistent with the standard results, but the negative point estimates for very early conception are surprising. However, in the sample of pregnancies among women 13 to 35 years of age, the estimated effect of age at conception on miscarriage follows the standard J-shape, with the point estimates suggesting that miscarriage risk is minimized at age 21.
Women who had a prior miscarriage are substantially more likely to have a miscarriage than are those who never experienced one. Recurrent pregnancy losses are a well-documented phenomenon for a small percentage of women and undoubtedly reflect the persistence of biological, genetic, and environmental factors, but the possible importance of the persistence of factors predicting pregnancy awareness in contributing to this pattern should not be ignored. Moreover, prior miscarriage may make women more aware of early pregnancy.
The increase in the rate of miscarriage over the past decades is a surprising finding given advances in prenatal care. Before presenting our preferred explanation for this increased incidence, we consider some alternatives that have been suggested to us.
First, changes in health insurance might have reduced access to prenatal care. The best evidence suggests the opposite. The proportion of pregnant women receiving early prenatal care increased in the 1970s (Kiely and Kogan 1994) and by some measures continued to increase through 2000 (Kogan et al. 1998; Lauderdale et al. 2010). Although the literature on the effects of prenatal care shows mixed evidence (Fiscella 1995), prenatal care would be expected, if anything, to help reduce the risk of miscarriage by increasing early detection of risk factors and providing women with information to help them avoid risky behaviors during pregnancy.
Second, smoking and/or alcohol and drug abuse might have risen over the 1970–2000 period. However, reported smoking during pregnancy fell sharply between 1967 and 1980 (Kleinman and Kopstein 1987) and from 1989 to 2000 (Ventura et al. 2003). Although we have not been able to locate trend data for the 1980s, among women as a whole, smoking rates fell sharply between 1979 and 1990 (NCHS 2010: table 60). Similarly, alcohol use decreased among pregnant women between 1988 and 1995 (Ebrahim et al. 1998) and has not increased since that time (CDC 2009). We were unable to locate earlier data on alcohol consumption among pregnant women, but alcohol consumption per capita in the United States increased between 1970 and 1980 before trending downward. Per capita consumption was about one-eighth lower in 2000 than in 1970 (NIAAA 2009). Although getting accurate information on illegal drug use is obviously difficult, and we are aware of no data on trends in their use by pregnant women, there has been no trend during the 1970–2000 period in illegal drug use in the general population (Basov et al. 2001).
Third, any of a number of environmental factors could be to blame. The most obvious of these is sexually transmitted diseases (STDs), but the recorded prevalence of syphilis, gonorrhea, and chancroid was notably lower in 2000 than in 1970 in each case (CDC 2008). In contrast, the prevalence of chlamydia has increased, but this appears to reflect reporting rather than prevalence. Data on chlamydia were first collected in 1984, and the disease was not reportable in all U.S. states until 2000 (CDC 2008). Stillerman et al. (2008:643) concluded, “The available scientific evidence suggests a variety of links between environmental pollutants and a range of adverse birth and pregnancy outcomes. Some links, such as evidence of neurodevelopmental effects of lead, mercury, and PCBs in humans, are established, some are likely, such as occupational exposure to solvents and birth defects, others are likely though some uncertainty remains on the nature and extent, such as air pollution and adverse birth outcomes, and some are suggestive, with further study required, such as water contamination from DBPs and pregnancy loss.” Mendola et al. (2008) reached similar conclusions. The presence of most, but certainly not all, known pollutants declined over this period. Among leading air pollutants, amounts of nitrogen dioxide, volatile organic compounds (VOCs), sulfur dioxide, particulate matter, carbon monoxide, and lead all decreased, but nitrogen oxides increased (US Environmental Protection Agency 2001). Similarly, levels of contaminants in water generally fell following passage of the Safe Drinking Water Act (SDWA) in 1974 (US Environmental Protection Agency 1999). The removal of lead from gas and paint and the end of DDT-spraying should have decreased, not increased, these risks.
It is, of course, impossible to rule out environmental factors not currently linked to miscarriage. For example, home computers were introduced in 1974 and became increasingly common throughout the rest of the century. But we believe that with the exception of chlamydia and thus pelvic inflammatory disease (PID), for which data do not exist, we have addressed the principal contenders. Still, known or suspected risk factors may have been reduced precisely because they were known or suspected; given the number of currently suspected or known environmental risk factors for miscarriage, it would be foolish to argue that we can eliminate such factors as an explanation.
Nevertheless, we suggest instead that a likely explanation for the increased incidence of miscarriage is the development of better and easier pregnancy tests. The home pregnancy tests available on the market from 1977 were the first ready-to-use, at-home, over-the-counter tests available, enabling women to privately confirm their pregnancy at a very early stage. Even the first versions of such home tests were very precise (96% accuracy for positive results and 80% for negative ones) and could be taken approximately 10 days after a woman missed her period. Therefore, home pregnancy tests undoubtedly confirmed pregnancies and miscarriages that would previously have been attributed to being “late.” The early-awareness effect of home pregnancy tests could impact the evolution of miscarriage rates by making women aware of pregnancy losses that would otherwise have gone unnoticed or been dismissed.
A post-1980 trend in miscarriage rates—particularly early miscarriages—would be most consistent with this hypothesis. Our best point estimate of the beginning of the trend in early miscarriages is 1986, but unfortunately, as we have noted, this start date is measured too imprecisely to be compelling. However, this hypothesis gains further support from the large trend in early miscarriage and the absence of a trend in late miscarriages. Awareness of miscarriages late in pregnancy is unlikely to have been greatly affected by home pregnancy tests because most women would be aware of being pregnant after having missed several periods, and they would usually have needed to seek medical attention when the miscarriage occurred.
There is also direct evidence that women were becoming aware of pregnancy sooner. Cycles V and VI of the NSFG asked women who had been pregnant in the last several years when they became aware of the pregnancy. For our sample of 13- to 35-year-olds, we can calculate the mean response for 1990–1993 and 1996–2000. We estimate that mean reported timing of awareness was decreasing at the rate of about one-half week per decade over this period. Future work can better address this question by adding the Cycle VII data and determining whether both the trend in early miscarriage and earlier awareness continued into this millennium.
Although the main focus of this research is on the evolution of miscarriage rates during the past decades, the results for the first stage covariates are nevertheless relevant. African Americans and Hispanics present a lower risk of miscarriage than do whites, especially early in pregnancy. This finding is surprising because many health outcomes are worse for these groups (American College of Physicians 2004) and is likely to be driven by early awareness of pregnancy. Given that women tend to learn more about pregnancy as they get older, the fact that the racial and ethnic differences are smaller in the sample that includes the older women is also suggestive of the importance of awareness of pregnancy for self-reported miscarriage. Therefore, we see our results by race as evidence of differential awareness of pregnancy, reflecting either a difference in knowledge of pregnancy-related events or a difference in the use of home pregnancy tests.
The woman’s own education is also a relevant factor in determining the risk of miscarriage. Although more-educated women may tend to take better care of themselves and access early prenatal care more frequently, the incidence of miscarriage increases with education. Once again, this suggests that early recognition of pregnancy and, perhaps, greater use of home pregnancy tests are at least partly responsible for this result, although the continued significance of this variable even for late miscarriages is surprising.
The hypothesis that awareness of pregnancies and miscarriages is a highly relevant factor is consistent with most of our results and should be taken into account when using the NSFG for epidemiological studies. For example, it suggests caution in interpreting the relation between age and miscarriage revealed in our estimates. Similarly, differences in awareness of pregnancies or knowledge about reproduction by race or education level should be taken into account when performing and analyzing the results of clinical trials.
This paper was written in part while Lang was a visiting fellow at the Collegio Carlo Alberto and the University of New South Wales. He gratefully acknowledges their support and hospitality. The research was funded in part by Grant Number R03 HD05605601 from the National Institute of Child Health and Human Development, NIH. We have benefited from helpful discussions with and comments from Michael Greene, Karen Norberg, Mark Pasternack, Fred Wang, and the referees and editors. The responsibility for any errors of fact or interpretation in this article is ours alone.
1For example, the National Library of Medicine of the National Institutes of Health advises, “Miscarriages are less likely if you receive early, comprehensive prenatal care and avoid environmental hazards….” (MedlinePlus n.d.).
2Although we do not address fetal death, defined as fetal loss after 20 weeks of pregnancy, the rate of fetal death among pregnant women receiving no prenatal care is more than five times that among those receiving at least some such care (Hoyert 1996).
3The sample size was 8,450 in 1988. It increased to 10,847 in 1995 and decreased in 2002 to 7,643. We use sampling weights but rescale the weights so that the weighted number of observations for each survey equals the actual number.
4Although incomplete pregnancies are recorded separately, pregnancies ending in miscarriage are significantly shorter than those ending in abortion or birth. Given that the last year is not fully included (surveys took place in the first part of the year), including the last two years of each survey would cause miscarriages to be overrepresented and would bias our results upward. This is particularly a concern regarding pregnancies occurring in 2001 and 2002 because these years are recorded in a single survey. As expected, including the last two years of each survey increases the occurrence of miscarriages and produces a significantly steeper estimate of the trend for middle miscarriages.
5“Still pregnant” is not applicable in our case because we restrict the sample to pregnancies at least two calendar years prior to the survey.
6A small number of births are also reported at implausibly early dates. We have not removed them from the data.
7We restrict our analysis to time-invariant variables because the NSFG does not collect information on characteristics that can change during pregnancy.
8A simple example may help. Suppose that of every four pregnancies, in the absence of induced abortion, one would end in an early miscarriage, one in a late miscarriage, and two in a live birth. The true miscarriage rate is therefore 50%. Now suppose that women would choose to terminate half of pregnancies and that all such terminations occur in the middle of pregnancy (after early miscarriages would occur but before late miscarriages would occur) and that the probability of an induced abortion is unrelated to miscarriage risk. Of every four pregnancies, on average, one will end in early miscarriage, one in birth, one and a half in induced abortion, and one-half in late miscarriage. The presence of induced abortion reduces the miscarriage rate to three-eighths. Conditional on no abortion, the miscarriage rate is three-fifths. Neither of these captures medical risk.
9The AIC is defined as 2k – 2L, where k is the number of parameters in the model and L is the log-likelihood of the statistical model. Lower values of the AIC indicate better fit.
10This confidence interval and its counterparts for different durations of miscarriage do not correct for clustering.
11The sample period is 1976–2004. Standard errors are each less than 0.1.
12The coefficients shown are relative to “other,” which is primarily, but not exclusively, Asians.
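The arithmetic in the worked example of note 8 can be checked with exact fractions. The short Python sketch below is only an illustration of that hypothetical four-pregnancy case; the function name `observed_outcomes` and its parameters are illustrative labels, not part of the NSFG analysis. It assumes, as the note does, that induced abortions occur after early miscarriages but before late ones and are unrelated to miscarriage risk.

```python
from fractions import Fraction


def observed_outcomes(early, late, births, abortion_share):
    """Split pregnancies into observed outcomes (hypothetical helper).

    early, late, births: counts that would occur with no induced abortion.
    abortion_share: share of pregnancies women would choose to terminate;
    terminations happen after early miscarriages, so they only remove
    pregnancies that would otherwise end in late miscarriage or birth.
    """
    share = Fraction(abortion_share)
    return {
        "early_miscarriage": Fraction(early),
        "late_miscarriage": (1 - share) * late,
        "birth": (1 - share) * births,
        "induced_abortion": share * (late + births),
    }


# Note 8: of four pregnancies, one early miscarriage, one late, two births,
# with half of pregnancies terminated in mid-pregnancy.
outcomes = observed_outcomes(early=1, late=1, births=2,
                             abortion_share=Fraction(1, 2))
total = sum(outcomes.values())
miscarriages = outcomes["early_miscarriage"] + outcomes["late_miscarriage"]

true_rate = Fraction(1 + 1, 4)                  # no induced abortion
observed_rate = miscarriages / total            # all pregnancies
conditional_rate = miscarriages / (total - outcomes["induced_abortion"])

print(true_rate, observed_rate, conditional_rate)  # 1/2 3/8 3/5
```

The three printed values correspond to the true rate (one-half), the rate among all pregnancies (three-eighths), and the rate conditional on no abortion (three-fifths). Neither observed rate equals the underlying medical risk.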
Kevin Lang, Department of Economics, Boston University, 270 Bay State Road, Boston, MA 02215, USA, Email: lang@bu.edu.
Ana Nuevo-Chiquero, Faculty of Economics and Business, IEB, Universitat de Barcelona, Av. Diagonal, 690, 08034, Barcelona, Spain, Email: ananuevo@ub.edu.