Correction for exposure misclassification in a confounder-adjusted Cox regression model with a time-varying dichotomous exposure can be implemented with relative ease. In our example of prenatal influenza vaccination and the risk of preterm birth, the bias-corrected AHR was slightly higher and less precise than the AHR obtained using conventional analysis. While in this instance both AHR estimates were essentially null findings, correction for this bias could result in a much greater change in effect estimates depending on the magnitude and pattern of exposure misclassification.
A variety of sensitivity analyses of maternal recall of exposures during pregnancy have been implemented, as exposure misclassification is recognized as a potentially large source of error. In particular, misclassification of prenatal exposures such as environmental toxicants (33), smoking (36), vasoactive exposures (39), antibiotic use (40), vitamin use (41), genital tract infections (43), and prepregnancy body mass index (BMI) (44) has been examined. Various methods for evaluating bias have been employed, including assessing the statistical significance of adding validity data to multiple linear regression models (36) or stratifying study results by length of recall period (39), interview method (35), or level of knowledge regarding the hypothesis under study (41). Effect estimates have also been presented after correcting for inaccurate self-reports using single parameter estimates from internal and/or external validation study data (34). Some studies have estimated an unadjusted measure of association after performing simulation trials assuming different misclassification scenarios (33), or an adjusted one using Bayesian methods (38). The perinatal epidemiology study that comes closest to the one we present here is a confounder-adjusted probabilistic bias analysis of prepregnancy BMI and adverse pregnancy outcomes (44). However, that study evaluated pregnancy outcomes using logistic regression; the study we present is, to our knowledge, the first to extend this type of bias analysis to survival analysis with a time-varying exposure. While regression calibration remains a popular method for correcting for continuous covariates measured with error, even in Cox regression models with time-varying exposures, its applicability to categorical time-varying exposures is limited (45), and it has not been implemented in studies of pregnancy outcomes.
When inferring etiologic relations between prenatal exposures and pregnancy outcomes, Cox regression may be more appropriate than logistic regression (10). For many adverse pregnancy outcomes, such as preterm birth, low birth weight, and stillbirth, average pregnancy length is shorter in cases than in controls. Survival analysis methods, such as Cox regression, can account for that discrepancy in pregnancy length and the reduced opportunity for exposure among cases. Further, for exposures that can change over time, time-varying exposure analysis measures the actual time exposed more accurately than simply assuming the exposure lasted for the duration of the pregnancy. In our analysis, the OR and the time-invariant HR were approximately 7% lower than the time-varying HR, which was not a meaningful difference. However, others have shown the potential for substantial bias resulting from incorrect model selection (27). This may partly explain the strongly protective effect of influenza vaccination on preterm birth observed in a recent population-based Pregnancy Risk Assessment Monitoring System study that used logistic regression analysis (16).
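To make the time-varying coding concrete, each pregnancy can be split into unexposed and exposed person-time at the vaccination date, with exposure as a step function from 0 to 1, in the counting-process (start, stop) format accepted by standard Cox regression software. The sketch below is illustrative only; the function name, row layout, and example values are our own and are not taken from the study's analysis code.

```python
def split_pregnancy(gest_days, vacc_day=None, preterm=False):
    """Split one pregnancy into counting-process rows
    (start, stop, exposed, event) for a time-varying Cox model.
    Exposure is a step function: 0 before vaccination, 1 after."""
    if vacc_day is None or vacc_day >= gest_days:
        # never vaccinated during follow-up: one unexposed interval
        return [(0, gest_days, 0, int(preterm))]
    return [
        (0, vacc_day, 0, 0),                     # unexposed person-time
        (vacc_day, gest_days, 1, int(preterm)),  # exposed person-time
    ]

# A hypothetical pregnancy ending preterm at 250 days, vaccinated on day 100:
rows = split_pregnancy(250, vacc_day=100, preterm=True)
# → [(0, 100, 0, 0), (100, 250, 1, 1)]
```

A time-invariant analysis would instead code this pregnancy as exposed for all 250 days, overstating exposed person-time by the 100 pre-vaccination days.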
Our study imputed vaccination dates for records in the simulated datasets that were assigned to prenatal influenza vaccination. Date assignment has been performed in epidemiological studies before, particularly in longitudinal studies where outcomes may be missing (47) or in infectious disease models where dates of serological events, such as seroconversion or immunological progression to more severe disease, are not observed (49). Our method of date assignment followed a similar approach, basing the probability distribution for date assignment on the distribution of observed data. Our assumption that the vaccination date distribution for women we observed to have been vaccinated was the same as that for women who falsely reported not receiving a vaccination may have been in error and is a limitation of our analysis.
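The simplest version of this date-assignment approach is to resample, with replacement, from the empirical distribution of dates among women observed to have been vaccinated. The following is a minimal sketch under that assumption; the function name and the example dates are hypothetical.

```python
import random

def impute_vaccination_dates(observed_dates, n_needed, seed=None):
    """Assign vaccination dates to records reclassified as vaccinated
    in a simulated dataset by sampling (with replacement) from the
    empirical distribution of dates among women observed vaccinated."""
    rng = random.Random(seed)
    return [rng.choice(observed_dates) for _ in range(n_needed)]

observed = [45, 60, 62, 70, 88, 95, 110]   # gestational days, illustrative
imputed = impute_vaccination_dates(observed, n_needed=3, seed=1)
# every imputed date comes from the observed distribution
assert all(d in observed for d in imputed)
```

This carries the stated assumption directly: women who falsely reported no vaccination receive dates drawn from the same distribution as observed vaccinees.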
There were several other limitations to our bias analysis. First, we used an external study conducted among elderly persons in the United Kingdom to estimate the sensitivity and specificity of self-reported influenza vaccination (31). Ideally, estimates of these classification parameters would have come from an internal validation study, but that was infeasible, as true influenza vaccination status could not be determined for all subjects in our study population. Failing that, an external validation study conducted among pregnant women in the United States would have been preferable, but none was available. That being said, calculation of the PPV using our study's influenza vaccination exposure prevalence and the 2007 UK study's sensitivity and specificity estimates produced a value (0.83 [95% CI 0.79, 0.85]) (7) that was similar to the overall PPV we found (0.85 [0.83, 0.86]), helping to justify the transportability of the UK study. Second, we modeled influenza vaccination over time as a step function, changing from 0 to 1 on the day of vaccination; alternative functions could have been applied to model the change in influenza vaccination status over time (52). Third, our choice of beta distributions to model the observed vaccination date distributions may be questioned given the statistical lack of fit. A larger data set, collected by self-report during Behavioral Risk Factor Surveillance System interviews in 2008–2010, showed a clearer right-skewed unimodal distribution, with vaccinations beginning in August, peaking in late fall, and diminishing by January (53). Despite our sparse data, we still felt it was appropriate to create separate beta distributions based on the timing of the first prenatal visit because of the differential availability of vaccinations by calendar month. Modeling the influenza vaccination dates using beta distributions resulted in the same bias-corrected AHR as an empirical distribution did, leading us to believe that the choice of vaccination date distribution probably had little effect on the final results. Indeed, on average, only 1.4% of records in the simulated data contained assigned vaccination dates. However, the effect of the exposure distribution model choice could be much greater in analyses involving a higher proportion of imputed data.
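A beta distribution can generate vaccination dates by drawing a value on [0, 1] and rescaling it to the vaccination season. The sketch below illustrates the mechanics only; the shape parameters and season window are hypothetical, not the values fitted in our analysis.

```python
import random

def sample_vaccination_day(alpha, beta_param, season_start, season_end, rng):
    """Draw one vaccination date from a beta distribution rescaled to the
    vaccination season (expressed in calendar days)."""
    u = rng.betavariate(alpha, beta_param)  # value in (0, 1)
    return season_start + u * (season_end - season_start)

rng = random.Random(0)
# hypothetical right-skewed season: day 213 (Aug 1) to day 397 (Jan 31)
days = [sample_vaccination_day(2.0, 3.0, 213, 397, rng) for _ in range(1000)]
assert all(213 <= d <= 397 for d in days)
```

Because the beta family can take right-skewed unimodal shapes on a bounded interval, it is a natural candidate for a vaccination season that peaks in late fall and ends by winter.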
Additional bias study limitations include the use of the observed prevalence of exposure in the NPV calculations, even when it was that very prevalence that we were evaluating in the bias analysis, and calculating PPV estimates solely using preterm vs. full-term status, rather than a more complicated model that incorporated other maternal factors. Although our study included over 2000 mothers of non-malformed infants, we were limited in power by the small number of preterm births and were concerned about over-specifying our models.
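The PPV and NPV calculations referred to above follow the standard relationship between sensitivity, specificity, and exposure prevalence via Bayes' rule. The sketch below uses purely hypothetical parameter values, not the estimates from the UK validation study or our data.

```python
def ppv_npv(sensitivity, specificity, prevalence):
    """Positive and negative predictive values of self-reported exposure
    from sensitivity, specificity, and true exposure prevalence."""
    tp = sensitivity * prevalence              # true positives
    fp = (1 - specificity) * (1 - prevalence)  # false positives
    tn = specificity * (1 - prevalence)        # true negatives
    fn = (1 - sensitivity) * prevalence        # false negatives
    return tp / (tp + fp), tn / (tn + fn)

# hypothetical values only, for illustration
ppv, npv = ppv_npv(sensitivity=0.95, specificity=0.90, prevalence=0.35)
assert 0 < ppv < 1 and 0 < npv < 1
```

Because prevalence enters both formulas, using the observed (possibly misclassified) prevalence in the NPV calculation, as noted above, introduces a degree of circularity into the bias analysis.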
In conclusion, our study is the first to implement probabilistic bias analysis to address exposure misclassification in a Cox regression model with a time-varying exposure. Other studies examining exposures during pregnancy assessed by retrospective maternal recall could consider using this approach to evaluate the quantitative impact of exposure misclassification. In particular, our methods are applicable to one-time or intermittent exposures during pregnancy and pregnancy outcomes that differ in average gestational length, such as preterm birth, low birth weight or stillbirth.