In this section, we apply the multiple imputation model for timing of MTCT of HIV to data collected in HIV Prevention Trials Network (HPTN) 024 (Taha et al., 2006). Although HIV testing was initially scheduled to occur at birth, 4–6 weeks and 3, 6, 9 and 12 months, the majority of 4–6 week visits occurred between 6 and 8 weeks, and the three month visit was dropped early in the study. Samples collected at 3, 6 and 9 months were only tested if the 12 month sample was positive or missing.
Infants born to HIV-infected mothers are only at risk for MTCT of HIV while breastfeeding. At one site, mothers were counseled to stop breastfeeding by the time their infants reached 6 months of age, and, by 6 months of age, over 90% of the infants at this site had been weaned. In contrast, over 90% of the infants at the 3 remaining sites were still breastfeeding at six months. This difference in the underlying hazard between the sites will be accounted for by performing a stratified proportional hazards analysis.
We performed the multiple imputation as described previously. The values for Li and Ri were discussed in general in the Methods section. Here, we discuss how they were set more specifically for HPTN 024. If the ith infant never had a negative test, we set Li = 0 and Ri equal to the time of the first positive test. Because the earliest detection time is birth, setting Li = 0 is equivalent in implementation to setting Li = −∞. If the first positive test occurred on the day of birth, we set Li = Ri = 0. For these infants, we know that di = 1 and si = 0. If the infant had both a negative and positive test before weaning, we set Li equal to the time of the last negative test and Ri equal to the time of the first positive test. If weaning occurred before the first positive test, we set Ri equal to the time of weaning plus 30 days (due to the sensitivity issue discussed previously). For subjects who have both a positive and negative test, di is known to be 2. For subjects with only negative tests and no positive tests, we set Li equal to the time of the last negative test unless weaning occurred more than 30 days before the negative test. In that case, we set Li equal to the time of weaning plus 30 days. Additionally, because follow-up was limited to approximately one year and there was therefore no information on observed events past this point, Ri was set to 400 days. This does not affect the final analysis, in which the imputed data were censored at one year.
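The rules above for Li and Ri can be sketched in code. The following is a minimal illustration, not the implementation used in the study: the function name `interval_bounds`, its arguments, and the use of days as the time unit are assumptions made for this example. The weaning rule for Ri is applied literally as stated, and cases the text does not address (for example, no test results at all, or a negative test more than 30 days after weaning followed by a positive test) are not resolved here.

```python
from typing import Optional, Sequence, Tuple

WEAN_LAG_DAYS = 30    # days added after weaning, as described in the text
R_NO_POSITIVE = 400   # right bound (days) when no positive test was observed


def interval_bounds(
    neg_times: Sequence[float],
    pos_times: Sequence[float],
    wean_time: Optional[float] = None,
) -> Tuple[float, float]:
    """Return (L, R) in days for one infant (hypothetical helper)."""
    if not neg_times and not pos_times:
        raise ValueError("at least one test result is required")

    if pos_times:
        first_pos = min(pos_times)
        # Negative tests that precede the first positive test.
        prior_negs = [t for t in neg_times if t < first_pos]
        # No prior negative test: L = 0 (a positive test at birth gives L = R = 0).
        left = max(prior_negs) if prior_negs else 0
        right = first_pos
        # Weaning before the first positive test: R = weaning + 30 days.
        if wean_time is not None and wean_time < first_pos:
            right = wean_time + WEAN_LAG_DAYS
        return left, right

    # Only negative tests were observed.
    last_neg = max(neg_times)
    left = last_neg
    # Weaning more than 30 days before the last negative test: L = weaning + 30 days.
    if wean_time is not None and last_neg - wean_time > WEAN_LAG_DAYS:
        left = wean_time + WEAN_LAG_DAYS
    return left, R_NO_POSITIVE
```

For instance, under these assumptions an infant with a negative test at day 42, a positive test at day 365 and weaning at day 180 would be assigned the interval (42, 210).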
The following auxiliary variables were used in the imputation procedure: maternal CD4 count, hemoglobin, viral load, weight and age at 32 weeks gestation; enrollment site; whether the mother took nevirapine; an indicator of whether the infant was delivered at the study clinic; whether the infant took nevirapine; the duration of ruptured membranes; and the infant’s birth weight and sex.
In each augmented data set, every infant has an imputed value for si. This si reflects the true time of detectable infection if other events, such as death or weaning, did not intervene. Also, because there was little information past one year in the original data set, we censor the infants’ times to event at one year in the final analyses. Because si is now on a continuous scale in the augmented data set, we can analyze time to breastfeeding transmission by subsetting to those subjects whose si is greater than 6 weeks. In contrast, the observed analysis must define the subset of interest as those infants with a negative test after 4 weeks and no positive test before 8 weeks. This misclassifies infants tested at 8 weeks who might have tested negative at 6 weeks, as well as infants who tested negative at 4 weeks but might have tested positive by 6 weeks. Therefore, we expect some bias in the baseline number at risk. Additionally, when performing the observed data analysis, we assumed only right censoring and set the time to event for any infant with a positive test to be the midpoint between the last negative test and the first positive test. For the sensitivity adjusted analysis, we subsetted the data to those infants with an imputed time of infection greater than 0. In the proportional hazards model, we studied the association of maternal CD4 count and viral load with the time to event, stratified by site.
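The midpoint rule used in the observed data analysis can be illustrated with a short sketch. The function name `midpoint_event_time` and the example test days (a negative test at 42 days and a positive test at 365 days) are assumptions chosen for illustration, not values taken from the study data.

```python
DAYS_PER_MONTH = 365.25 / 12  # average month length, an assumption for unit conversion


def midpoint_event_time(last_neg: float, first_pos: float) -> float:
    """Midpoint (days) between the last negative and first positive test."""
    if first_pos < last_neg:
        raise ValueError("first positive test precedes last negative test")
    return (last_neg + first_pos) / 2


# Hypothetical infant: negative at about 6 weeks, next tested (positive) at 12 months.
event_day = midpoint_event_time(42, 365)      # 203.5 days
event_month = event_day / DAYS_PER_MONTH      # about 6.7 months
```

With a long gap between tests, the midpoint places the event several months after the last negative test, regardless of when detectable infection actually occurred within the interval.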
Censoring can be complex in these studies because it has several causes: death, weaning and loss to follow-up. We therefore propose examining different censoring rules in the analyses and in later simulations to determine which censoring approach produces the least biased and most efficient estimate of the survival distribution or association parameter of interest. If an infant dies, is lost to follow-up or reaches the end of the study without having a positive HIV test, his/her time to event is censored at the time of the last negative test. If an infant is weaned, there are three censoring options:
C1 An infant’s event time is censored at his last negative test. This is a common approach that does not require information on weaning.
C2 An infant’s event time is censored at the end of follow-up if there is a negative test after weaning in the observed data. This censoring approach reflects that these infants are no longer at risk after weaning and should produce an estimate of the distribution of time to first positive test in the population under study.
C3 In the observed data analysis, if an infant has a negative test after weaning, his event time is censored at the time of weaning. Otherwise, it is censored at the time of his last negative test. In the imputed data, an infant is censored at the time of weaning if he has not already experienced the event. This approach estimates the late postnatal time to first positive distribution as if no weaning occurred.
For the MI analysis, Scenarios C1 and C2 result in the same censoring scheme: no censoring except at the end of the follow-up time. MI results under these scenarios will therefore be presented as coming from Scenario C2. Under a frequent testing schedule, there should be little difference between Scenarios C1 and C3.
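The three censoring options for the observed data analysis can be summarized in a small sketch. This is an illustration under stated assumptions rather than the study's code: the function name `censor_time` and its arguments are hypothetical, times are in days, "a negative test after weaning" is taken to mean that the last negative test is strictly later than the weaning time, and the function applies only to infants with no positive test.

```python
from typing import Optional


def censor_time(
    rule: str,
    last_neg: float,
    end_followup: float,
    wean_time: Optional[float] = None,
) -> float:
    """Censoring time (days) for an infant with no positive test (hypothetical helper)."""
    if rule not in ("C1", "C2", "C3"):
        raise ValueError("rule must be 'C1', 'C2' or 'C3'")

    # Assumption: a negative test after weaning means last_neg > wean_time.
    neg_after_wean = wean_time is not None and last_neg > wean_time

    if rule == "C2" and neg_after_wean:
        return end_followup   # no longer at risk, so carried to end of follow-up
    if rule == "C3" and neg_after_wean:
        return wean_time      # censored at the time of weaning
    return last_neg           # C1, and the fallback for C2 and C3
```

For a hypothetical infant weaned at day 180 whose last negative test was at day 270, with follow-up ending at day 365, the three rules give 270 (C1), 365 (C2) and 180 (C3).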
Overall for the observed analysis, of the 1977 potential infants, 1317 tested negative at or after the 4–8 week visit and were still breastfeeding at 8 weeks. Infants were excluded because they were known to be positive by 8 weeks (N=298), were weaned before 8 weeks and therefore not at risk (N=70), had unknown infection status at 4–8 weeks due to missing 4–8 week test and later positive test (N=22), or had no test results after the 4–8 week visit (N=270). Analyses on the observed data were carried out under all three censoring scenarios (C1–C3). Analyses on the augmented data sets were carried out under censoring scenarios C2 and C3.
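The exclusion counts can be checked with a simple tally. The dictionary keys below are labels invented for this example; the counts are those reported in the text.

```python
total_infants = 1977

# Reasons for exclusion from the observed data analysis, with reported counts.
exclusions = {
    "positive_by_8_weeks": 298,
    "weaned_before_8_weeks": 70,
    "missing_4_8_week_test_later_positive": 22,
    "no_results_after_4_8_week_visit": 270,
}

retained = total_infants - sum(exclusions.values())

# Infants excluded because of missing tests rather than known status.
missing_test_exclusions = (
    exclusions["missing_4_8_week_test_later_positive"]
    + exclusions["no_results_after_4_8_week_visit"]
)
```

The tally reproduces the 1317 infants retained in the observed analysis, and the two missing-test categories together account for 292 infants.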
Figure 1 plots estimates of the cumulative rates of MTCT of HIV based on the Kaplan-Meier analysis of the observed data, the MI (m=2) analysis of time to detection and the MI (m=2) analysis of timing of infection. The observed data analysis starts at 0 at birth and does not reach the MI estimate of cumulative detection rate at birth until approximately 1 week. The Kaplan-Meier analysis also estimates a higher cumulative infection rate at one year than either of the MI approaches. The sensitivity adjusted analysis estimates a slightly higher in utero/delivery transmission rate than the unadjusted MI analysis; however, they both converge to approximately the same estimate by about 6 months. The MI curves reflect a hazard that begins to increase soon after birth and then levels off again around 2 months. This has also been observed in clinical trials (Taha et al., 2007).
Figure 1 Cumulative rates of MTCT of HIV (solid black) and detection of MTCT of HIV. The dashed lines represent the time to detection based on MI with m=2. The step curve represents time to detection based on the observed data with cross hairs indicating censoring.
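For readers unfamiliar with how a cumulative rate is obtained from right-censored data, the following is a minimal sketch of the Kaplan-Meier (product-limit) estimator, reported as one minus the survival estimate. It is a generic textbook construction, not the software used for the analyses here; the function name `km_cumulative_rate` and the toy data in the usage note are assumptions for illustration.

```python
from typing import List, Sequence, Tuple


def km_cumulative_rate(
    times: Sequence[float], events: Sequence[int]
) -> List[Tuple[float, float]]:
    """Return (time, 1 - S(time)) at each distinct event time.

    events[i] is 1 if the event was observed at times[i], 0 if censored.
    """
    ordered = sorted(zip(times, events))
    n_at_risk = len(ordered)
    surv = 1.0
    curve = []
    i = 0
    while i < len(ordered):
        t = ordered[i][0]
        n_events = 0   # events observed at time t
        n_leaving = 0  # events plus censorings at time t
        while i < len(ordered) and ordered[i][0] == t:
            n_events += ordered[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            # Product-limit update using those still at risk just before t.
            surv *= 1 - n_events / n_at_risk
            curve.append((t, 1 - surv))
        n_at_risk -= n_leaving
    return curve
```

As a usage example, `km_cumulative_rate([1, 2, 2, 3, 4], [1, 0, 1, 0, 1])` returns cumulative rates at times 1, 2 and 4; the censored observation at time 3 reduces the number at risk without adding a step.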
The accompanying table shows results from the Kaplan-Meier (KM) and proportional hazards (PH) analyses on both the observed and imputed data. For the observed data, C1 and C2 produce similar results. If all infants were tested at the end of the study, we would expect these results to be identical because all weaned infants who did not experience MTCT of HIV would be censored after the end of follow-up. For the observed data, censoring scenario C3 resulted in higher KM estimates of the cumulative infection rates than C1 or C2. Because C3 treats weaned infants as if they were still at risk at the time of weaning and therefore assumes that some would have experienced the event, we would expect the proportion to be higher. The same differences between C2 and C3 are seen in the multiple imputation analysis. Because censoring under C2 (not at risk after weaning) is usually of interest, we focus the comparison between the observed and MI analyses on censoring scenario C2. Many infants who tested negative at 4–8 weeks did not have another test result available until 12 months. If that test result was positive, the time of the first positive test was imputed to be approximately 7 months in the observed analysis; therefore, we expect the observed data analyses to underestimate the transmission rate at earlier times. The results indicate that MI may be correcting this, producing higher estimates at 3 and 6 months than the observed analysis. However, MI produces lower estimates of transmission rates at 9 and 12 months. The simulations summarized in the next section show that we expect the observed analysis to overestimate the transmission rate at 12 months. Also, the MI analyses include the 292 infants whose HIV infection status is indeterminate at 4–8 weeks due to missing tests. Potentially, these infants were less likely to have experienced MTCT, thus increasing the number at risk disproportionately to the number of events.
The MI results adjusted for imperfect sensitivity are higher at all time points, reflecting that the 6 week cut point for breastfeeding transmission misclassifies approximately 1/4–1/3 of the breastfeeding transmissions as in utero/delivery. The MI results do not vary substantially over m.
Results from breastfeeding transmission analyses (C1=censored at last negative; C2=censored after end of follow-up if weaned before last negative; C3=censored at time of weaning).
To better understand the variability between imputations and how the estimates from the augmented data sets compare to the observed analysis, we plotted the KM estimates of the cumulative detection rate curves for each augmented data set (m = 2) and for the observed data (Figure 2). There is variability between the estimates from the augmented data sets. At most points of interest, the estimates are all contained within an interval of width approximately equal to 0.02. Before five months, the observed data analysis estimates of the survival curve are higher than all the estimates from the augmented data sets. From 5 to 8 months, the observed curve crosses all the augmented data set estimates. After 8 months, the observed curve is at the lower end of the augmented data estimates.
Figure 2 Curves of the cumulative proportion of late postnatal infections for the observed data analysis (black) and each of the augmented data sets with time to first positive test (red) and timing of infection adjusted for imperfect sensitivity (grey).
The table also lists the results from the proportional hazards regression models fit to the observed and MI data. The estimate of association was higher in the observed analyses than in the MI analyses. Additionally, the standard errors were lower for viral load and higher for CD4 count in the MI analyses. The MI results varied little over m or the censoring scenario. However, the observed analysis results varied more over censoring scenarios (C3 vs. C2 or C1), suggesting some interplay between timing of weaning and CD4 count and viral load.