Demographic and health surveys data
The Demographic and Health Surveys (DHS) are standard surveys based on representative samples of national populations. Among other information, they provide maternity histories for women aged 15 to 49 with details on the date of each birth and on the age of the mother at the time of each birth. This is enough information to compute person-years lived and births by age and time period and therefore age-specific fertility rates by period for the years preceding the survey. Note that since these maternity histories are closed cohorts, these calculations can be done with basic tabulation and simple tools such as a spreadsheet or basic computer programming. The only limitation of this type of information is the truncation effect: one knows the fertility "x" years ago only up to women aged "50 - x" years, because women who were interviewed were required to be under 50 years of age. Therefore, we limited our analysis to fertility up to age 40 and to the 10 years preceding each survey, for which information is complete. This truncation of retrospective surveys is inevitable; for example, a woman who gave birth nine years ago at age 41 was not interviewed, as she was 50 at the time of the survey. This effect is explained in more detail in other documents [20].
DHS record all births that occurred prior to the survey, starting at age 12 years. Cumulative fertility up to age 40 is noted as TFR(40) in this study and represents the average number of children ever born per woman from age 12 to age 40 years, which represents about 90% of the total fertility up to age 50 (the classic TFR).
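As a minimal sketch of the tabulation described above, the following Python fragment computes age-specific fertility rates as births divided by person-years and sums them, weighted by the width of each age group, into TFR(40). The age groups are those used later in the text (12-14 through 35-39). The births and person-years figures are invented for illustration, and the function names are hypothetical.

```python
# Age groups as (label, width in years); 12-14 spans 3 years, the others 5.
AGE_GROUPS = [("12-14", 3), ("15-19", 5), ("20-24", 5),
              ("25-29", 5), ("30-34", 5), ("35-39", 5)]

def age_specific_rates(births, person_years):
    """Rate per woman-year in each age group: births / person-years."""
    return {age: births[age] / person_years[age] for age, _ in AGE_GROUPS}

def tfr40(rates):
    """Cumulative fertility from age 12 to age 40: sum of width * rate."""
    return sum(width * rates[age] for age, width in AGE_GROUPS)

# Illustrative (made-up) tabulation for one period, not DHS data.
births = {"12-14": 6, "15-19": 150, "20-24": 250,
          "25-29": 240, "30-34": 180, "35-39": 110}
person_years = {"12-14": 600, "15-19": 1000, "20-24": 1000,
                "25-29": 1000, "30-34": 900, "35-39": 800}

rates = age_specific_rates(births, person_years)
print(round(tfr40(rates), 2))
```

Each rate is multiplied by the number of years a woman spends in the age group, so the sum is the number of children a woman would have by age 40 if she experienced these rates throughout.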
Merging data from several surveys in the same country
If several surveys were available in a country, events and person-years were cumulated in order to provide annual fertility rates for longer periods. Cumulating several surveys reduces fluctuations due to sample size and on average tends to compensate for minor biases associated with sampling. When displayed in a figure, estimates of cumulative fertility by age 40 grouped by two-year period tend to be quite regular and reveal the major trends in fertility, whether increasing, decreasing, or remaining steady. Of course, only formal statistical testing of slopes allows one to demonstrate an increase in fertility, a fertility decline, or a fertility stall. This method has been explained in more detail in other documents [20].
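The cumulation of events and person-years across surveys can be sketched as follows. This is an illustration under simplifying assumptions: sampling weights are ignored, each survey is represented as a hypothetical dictionary keyed by (calendar year, age group), and all figures are invented.

```python
from collections import defaultdict

def pool_surveys(surveys):
    """Sum births and person-years over surveys, keyed by (year, age group)."""
    births = defaultdict(float)
    exposure = defaultdict(float)
    for survey in surveys:
        for cell, (b, py) in survey.items():
            births[cell] += b
            exposure[cell] += py
    # Pooled rate = pooled births / pooled person-years, cell by cell.
    return {cell: births[cell] / exposure[cell] for cell in births}

# Two made-up surveys whose retrospective windows overlap on 1995.
survey_a = {(1995, "20-24"): (50, 200.0), (1996, "20-24"): (45, 190.0)}
survey_b = {(1995, "20-24"): (70, 300.0)}

pooled = pool_surveys([survey_a, survey_b])
print(pooled[(1995, "20-24")])  # (50 + 70) births over (200 + 300) person-years
```

Summing numerators and denominators before dividing gives each survey a weight proportional to its exposure, which is why pooling dampens the fluctuations of the smaller samples.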
For this study, we used the following DHS: Ghana: 1988, 1993, 1999, 2003, 2008; Kenya: 1989, 1993, 1998, 2003, 2008; Madagascar: 1992, 1997, 2003; Nigeria: 1990, 1999, 2003, 2008; Rwanda: 1992, 2000, 2005; Senegal: 1986, 1993, 1997, 2005; Tanzania: 1991, 1996, 1999, 2004, 2007; Zambia: 1992, 1996, 2001, 2007.
All computations were done separately for urban and rural areas, since the trends were often divergent at the beginning of the transition, with an earlier decline in urban areas while fertility continued to rise or stagnated in rural areas.
Case definition of fertility stall
The criteria used for defining a fertility stall were similar to those proposed by Gendell (1985): fertility decline must have been under way for some years, then the decline must stop for a few years, and if the stall has come to an end, the fertility decline must have resumed. In terms of slopes, this requires an initial period with a significant negative slope, a second period with a net zero or positive slope, a significant change in slopes between the first and second periods, and, when applicable, a third period with a significant negative slope, with a significant change in slopes between the second and third periods (p < 0.05 using 2-tailed tests). We used only linear trends for testing the changes in slopes, since these summarize the changing slopes well and are easy to compute. Note that in any country, changes in total fertility rates can almost always be approximated by linear trends over monotonic periods. The knots defining monotonic periods were chosen visually after plotting the yearly cumulative fertility on a graph and then fine-tuned by computing the intersection points of the two adjacent regression lines. The slopes were calculated one by one, on each monotonic segment, with the same linear trend model.
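The fine-tuning of a knot can be illustrated with a short sketch: fit one least-squares line to each monotonic segment, then solve for the year at which the two lines intersect. The series below is invented, the helper names are hypothetical, and the fit uses yearly aggregate points for simplicity.

```python
def ols_line(xs, ys):
    """Ordinary least squares fit y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

def knot(line1, line2):
    """Year at which two fitted lines intersect (slopes must differ)."""
    (a1, b1), (a2, b2) = line1, line2
    return (a2 - a1) / (b1 - b2)

# Made-up series: decline of 0.1 per year, then a flat segment at 5.0.
years_1 = [1990, 1991, 1992, 1993, 1994]
tfr_1 = [5.5, 5.4, 5.3, 5.2, 5.1]
years_2 = [1996, 1997, 1998, 1999, 2000]
tfr_2 = [5.0, 5.0, 5.0, 5.0, 5.0]

decline = ols_line(years_1, tfr_1)
stall = ols_line(years_2, tfr_2)
print(round(knot(decline, stall), 1))
```

In this constructed example the declining line reaches 5.0 in 1995, so the refined knot falls between the two visually chosen segments.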
Point estimates versus slopes
Our methods focused on slopes computed over periods for which annual fertility rates were available. These methods are far more stable than simply comparing point estimates. For example, in a DHS based on 6,000 women, a TFR of 3.50 over the three years preceding the survey could be given with a confidence interval of 3.15 to 3.85 (about 0.25 due to sample size and 0.10 due to design effect). If two surveys are available five years apart, it is almost impossible to test a trend from those two points, unless the difference is very large (> 0.50). Even if the second survey indicates a TFR of 3.10, one cannot rigorously conclude whether fertility declined or stayed constant. The testing of slopes is very different, since it includes all points over the period covered, 10 years before each survey, totaling 15 years if two surveys five years apart are available. Furthermore, merging datasets for computing slopes allows one to smooth out erratic values of point estimates: these erratic fluctuations include fluctuations due to sample size and design effect, so that a simple test is enough to establish the slope or a change in slopes.
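The arithmetic behind the 0.50 threshold can be checked with a short calculation. The sketch assumes that the interval quoted above is a 95% interval, that the second survey has the same half-width (0.35), that the two surveys are independent, and that a normal approximation applies; none of these assumptions is stated in the text.

```python
from math import sqrt
from statistics import NormalDist

def compare_point_estimates(tfr1, tfr2, half_width1, half_width2):
    """Two-tailed z-test for the difference of two independent estimates."""
    z95 = NormalDist().inv_cdf(0.975)          # about 1.96
    se1, se2 = half_width1 / z95, half_width2 / z95
    se_diff = sqrt(se1 ** 2 + se2 ** 2)
    z = (tfr1 - tfr2) / se_diff
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    # Smallest difference that would reach p < 0.05, and the p-value observed.
    return z95 * se_diff, p

min_detectable, p_value = compare_point_estimates(3.50, 3.10, 0.35, 0.35)
print(round(min_detectable, 2), round(p_value, 2))
```

Under these assumptions the smallest detectable difference is 0.35 × √2, just under 0.50, and a drop from 3.50 to 3.10 does not reach the 5% level.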
Method 1 for testing changing slopes: demographic approach (linear regression)
The first method used for testing slopes and changes in slopes of fertility trends follows a demographic approach and uses the property of period fertility rates. The concept of TFR is abstract and refers to what is called a "synthetic cohort." In other words, it computes what would be the cumulative fertility of a real cohort if it had the same age-specific fertility rates as those observed over a given period. Here, of course, one ignores mortality, as if all women survived up to age 40, as one would do in a real cohort of women who had already reached age 40. One could test the trends in period cumulative fertility (period TFR) as if they were trends in cohort cumulative fertility (i.e., equal to completed family size) with the same level and the same number of births. For example, a period TFR(40) of 5.0 based on 1,000 births is considered to be equivalent to a cohort cumulative fertility of 5.0 among 200 women, who would have had 1,000 births by age 40. Testing trends in cohort fertility therefore requires the distribution of completed family size by parity. As in the real world, when the average completed family size is 5.0, the sample includes women with 0, 1, 2 ... 16+ children ever born, with an average of 5.0. Here the period TFR(40) was simply distributed accordingly, by assuming that at the same level of cumulative fertility, the distribution of women by parity was the same in a period and in a cohort (from 0 to 16+ children ever born). This procedure allows one to obtain a direct measure of the slope and its variance, based on individual women, as one would do in a cohort.
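The equivalence in the example above (1,000 births at a TFR(40) of 5.0 corresponding to 200 women) can be sketched as follows. The parity proportions are hypothetical, chosen only so that they average 5.0; they are not the relationships estimated from DHS cohort data, and the function names are illustrative.

```python
def synthetic_cohort_size(births, tfr40):
    """Number of women whose completed fertility would yield these births."""
    return births / tfr40

def distribute_by_parity(n_women, parity_shares):
    """Expected number of women at each parity, given shares summing to 1."""
    return {parity: n_women * share for parity, share in parity_shares.items()}

# Hypothetical parity proportions with mean 5.0 (not taken from any DHS).
shares = {0: 0.05, 2: 0.10, 4: 0.20, 5: 0.30, 6: 0.20, 8: 0.10, 10: 0.05}

n_women = synthetic_cohort_size(1000, 5.0)
cohort = distribute_by_parity(n_women, shares)
total_births = sum(parity * women for parity, women in cohort.items())
print(n_women, round(total_births))
```

Because the shares average 5.0 children, the synthetic cohort reproduces both the level of fertility and the number of births, while supplying the between-woman variance needed to test a slope.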
In practice, in Method 1 one proceeds the following way. First, one computes the cumulative fertility, TFR(40), from age-specific fertility rates by single calendar year and five-year age group. Then, one computes the corresponding number of women in the synthetic cohort. These women are distributed by parity using a simple relationship linking the proportion of women with (i) children to the completed family size. These relationships were computed from cohort data using the same DHS, from parity 0 to parity 16+. Then, the sample is analyzed as a cohort sample, and cumulative fertility is related to time in a straightforward linear regression:
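Written out in assumed notation (the symbols below are illustrative and not necessarily the authors'), a linear regression matching this description takes the form:

```latex
\mathrm{CEB}_{j} = a + b\, t_{j} + \varepsilon_{j}
```

Here CEB_j would be the number of children ever born by age 40 to woman j of the synthetic cohort, t_j the calendar year to which she is attached, a the intercept, b the annual slope of cumulative fertility, and ε_j the residual.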
The model allows us to calculate the fitted cumulative fertility by year and the precise fertility trend, to provide confidence intervals for slopes (positive, negative, or zero), and to test for changing slopes using standard Student's t-tests. This method requires no hypothesis other than the equivalence between period and cohort, which is the rationale for computing period fertility rates. A separate regression is calculated for each monotonic period.
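A test for a change in slope between two adjacent periods can be sketched as the difference between the two fitted slopes divided by the standard error of that difference. The sketch below is a simplification: it fits yearly aggregate points rather than individual women, uses invented data, and substitutes a normal approximation for the Student's t distribution because the Python standard library has no t distribution. With only a handful of points a t distribution with the appropriate degrees of freedom would be needed; with the large synthetic cohorts of Method 1 the two are close.

```python
from math import sqrt
from statistics import NormalDist

def slope_and_se(xs, ys):
    """OLS slope of y on x and its standard error."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    rss = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))
    return b, sqrt(rss / (n - 2) / sxx)

def slope_change_test(seg1, seg2):
    """Statistic and two-tailed p-value (normal approximation) for b2 - b1."""
    (b1, se1), (b2, se2) = slope_and_se(*seg1), slope_and_se(*seg2)
    stat = (b2 - b1) / sqrt(se1 ** 2 + se2 ** 2)
    return stat, 2 * (1 - NormalDist().cdf(abs(stat)))

# Made-up yearly TFR(40): a decline followed by a plateau.
decline = ([0, 1, 2, 3, 4, 5], [6.0, 5.7, 5.6, 5.2, 5.1, 4.8])
stall = ([6, 7, 8, 9, 10, 11], [4.8, 4.9, 4.7, 4.8, 4.9, 4.8])

stat, p = slope_change_test(decline, stall)
print(stat > 0, p < 0.05)
```

A positive statistic with p < 0.05 corresponds to the first condition for a stall: a significant change from a negative slope to a slope near zero.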
Method 2 for testing changing slopes: statistical approach (logistic regression)
The second approach focuses on age-specific fertility rates and is based on the fact that women are likely to have only one or no delivery over a period of one year, depending on age and period. Therefore, the method chosen is a linear-logistic model, or logistic regression, where the dependent variable is 1 for a birth and 0 for no birth, and the weights are proportionate to the exact person-years lived over the period. The age pattern of fertility is complex and not easy to parameterize, so age groups (12-14, 15-19, 20-24, 25-29, 30-34, and 35-39) are introduced as dummy variables, with the 25-29 age group taken as the reference category because it has the largest number of births and is therefore the most stable. The model is:
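In assumed notation (the symbols p, t, a, b and c_i are illustrative and not necessarily the authors'), a logistic model matching this description is:

```latex
\operatorname{logit}(p) = \ln\frac{p}{1-p} = a + b\, t + \sum_{i \neq 4} c_{i} X_{i}
```

Here p would be the probability of a birth in one person-year, t the calendar year, a the intercept for the reference age group, b the trend on the logit scale, and c_i the coefficient of the dummy variable for age group i.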
where i is the age group and the X_i are the dummy variables associated with each age group, numbered from 1 = 12-14 and 2 = 15-19 up to 6 = 35-39, with the fourth group (ages 25-29) omitted as the reference category.
This model allows one to compute age-specific fertility rates by period, to recalculate the cumulative fertility by age 40, and to estimate the trends. As in the first model, it provides a confidence interval for the slopes and allows simple testing for fertility stalls. This method requires only two basic hypotheses: homogeneity in risk of bearing a child over a short period of time and a constant age pattern of fertility over short periods of time, both of which appear realistic.
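The step from fitted coefficients back to cumulative fertility can be sketched as follows: the inverse logit gives a fitted rate for each age group and year, and the rates are summed with age-group widths to give TFR(40). The coefficients below are hypothetical values chosen for illustration, not estimates from any survey, and the year is measured from an arbitrary origin.

```python
from math import exp

# Age groups as (label, width in years); 25-29 is the reference category.
AGE_GROUPS = [("12-14", 3), ("15-19", 5), ("20-24", 5),
              ("25-29", 5), ("30-34", 5), ("35-39", 5)]

def fitted_rate(intercept, slope, age_effect, year):
    """Inverse logit of the linear predictor: annual probability of a birth."""
    eta = intercept + slope * year + age_effect
    return 1.0 / (1.0 + exp(-eta))

def fitted_tfr40(intercept, slope, age_effects, year):
    """Sum of width * fitted rate over the age groups 12-14 to 35-39."""
    return sum(
        width * fitted_rate(intercept, slope, age_effects.get(age, 0.0), year)
        for age, width in AGE_GROUPS
    )

# Hypothetical coefficients; the reference group 25-29 has no entry (effect 0).
intercept, slope = -1.10, -0.02
age_effects = {"12-14": -3.5, "15-19": -0.6, "20-24": 0.05,
               "30-34": -0.2, "35-39": -0.6}

for year in (0, 5, 10):
    print(year, round(fitted_tfr40(intercept, slope, age_effects, year), 2))
```

Because the age effects are held constant, a negative slope on the logit scale lowers every age-specific rate and therefore the fitted TFR(40), which reflects the constant age pattern hypothesis stated above.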