We used data from the PIN prospective cohort study and North Carolina birth records to assess the reliability of the information obtained on the birth certificate. As demonstrated in previous studies, we found high agreement among maternal demographic and birth outcome variables.9,10,14
In addition, we found moderate agreement for behavioural risk factors and medical events variables, except for alcohol consumption, anaemia and gestational diabetes. This level of agreement is similar to some research assessing vital record reliability7,9,10,12,16 but better than others.8,13,14 Like previous research,7–9 alcohol consumption showed low correlation between the two data sources; however, the prevalence of women reporting that they consumed at least five drinks per week while pregnant was <1%, which had an effect on the correlation results.
Overall, anaemia showed poor percentage agreement and kappa. This could be due to the way the variable was constructed. For the PIN study, women’s medical records were checked for any report of anaemia in each trimester of pregnancy. For the birth records, anaemia was recorded only at the end of pregnancy. Women may not remember to report a brief period early in their pregnancy when they were anaemic. Therefore, we found that anaemia during pregnancy, as reported on the birth record, was not a reliable variable. Gestational diabetes is a rare event, with <4% of the sample having reported it, and this low prevalence contributed to the poor agreement.
The only variable that showed a difference in reliability by race among our study cohort was maternal weight gain. Whites had a higher correlation between weight gain reported in the PIN study and on the birth records than Blacks. Maternal weight gain was also reported with differential agreement by stratum of education, with women of ≤12 years of education having a lower correlation than women with >12 years of education.
The majority of variables in this study had no apparent difference in reporting by education level, and generally, we found similar patterns of agreement among all categories of education. Women with higher education had a better correlation for reporting of their marital status but a lower correlation for the number of cigarettes smoked. They also had a lower correlation for the reporting of their number of previous pregnancies. This may be due to the exclusion of stillbirths from the count of previous pregnancies in the birth records variable, as women with higher education may be waiting longer to become pregnant and thus increasing their chances of having difficulties with the pregnancy. Finding differential reporting of birth record elements by educational strata is consistent with other reported research.17
Some variables had high percentage agreement values but low kappa scores, which indicates that they have very high agreement by chance alone, with little room for agreement beyond what one would expect by random assignment. This generally occurs for variables with high prevalence. For example, consider a binary variable with 90% of values equal to 1 in both data sources, and suppose that these values are assigned completely at random (i.e. the null value of the kappa statistic holds, because the outcome in one data source is completely independent of the outcome in the other). The proportion in agreement will be (0.9*0.9) + (0.1*0.1) = 0.82 even when the kappa statistic equals zero.
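The arithmetic in this example can be checked with a short sketch. The function name, its arguments and the 2x2 counts below are illustrative assumptions, not values from the PIN data; the counts are simply chosen so that 90% of values equal 1 in each source and the two sources are independent.

```python
def agreement_and_kappa(both_yes, a_only, b_only, both_no):
    """Return (observed agreement, chance-expected agreement, Cohen's kappa)
    for a 2x2 table comparing two data sources, A and B.

    Hypothetical helper for illustration only.
    """
    n = both_yes + a_only + b_only + both_no
    observed = (both_yes + both_no) / n      # proportion of records that agree
    p_a = (both_yes + a_only) / n            # prevalence of 1 in source A
    p_b = (both_yes + b_only) / n            # prevalence of 1 in source B
    # Agreement expected if the two sources were independent
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)
    # Undefined when expected == 1 (no variation in either source)
    kappa = (observed - expected) / (1 - expected)
    return observed, expected, kappa


# Made-up counts for 100 records: 90% prevalence in each source,
# with cell counts equal to those expected under independence.
observed, expected, kappa = agreement_and_kappa(81, 9, 9, 1)
print(f"observed agreement = {observed:.2f}")
print(f"expected agreement = {expected:.2f}")
print(f"kappa is close to zero: {abs(kappa) < 1e-9}")
```

With these counts the observed and chance-expected agreement are both approximately 0.82, so kappa is zero up to floating-point rounding, even though 82% of records agree.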
More information is being collected on birth records than ever before, and there continues to be interest among perinatal researchers in using these data for surveillance purposes and estimating health associations. The additional variables collected on birth records may allow researchers to begin exploring possible mechanisms from maternal demographics, health behaviours and pregnancy events to birth outcomes. As interest in contextual and neighbourhood-level analyses has grown, vital records have increasingly become recognised as a source of readily available geocodable data. The intersection of geocoded addresses and sensitive data, however, is a potent combination and calls for careful consideration of privacy and confidentiality, not only of individual women, but also of their neighbourhoods. We do not argue that the quantity and nature of the data collected on today’s birth certificate are a negative; rather, we want to stress the importance of keeping individual and neighbourhood information confidential.
This study has several strengths. We were able to examine correlations stratified by race and highest level of education achieved. We included counties with urban, suburban and rural areas. Unlike previous research linking birth certificate data with hospital discharge summaries, which are also rife with challenges, we linked our birth certificates with data sources in which we have considerable confidence. Interviewers for the PIN study received substantial training in how to reliably collect sensitive and other data and built a rapport with the women they interviewed.
One important limitation to the study reported here relates to PIN participants’ ability to represent the general population in this area of North Carolina.25 Although we compare PIN cohort data only with PIN participants’ birth records, the cohort’s lack of generalisability may hinder our ability to make broad inferences regarding vital record reliability for all women. Additionally, only 87% of the women in the PIN study were matched with birth certificates and included in the analysis presented here. Some variables had low prevalence that hindered our assessment of agreement. Specifically, the low prevalence of alcohol consumption and gestational diabetes greatly contributed to the poor agreement for those variables. Further, data abstracted from medical records may not necessarily have perfect validity or reliability. Therefore, correlation between medically abstracted and birth records data may be less informative if the former does not constitute an ideal gold standard. In the case of the PIN study, trained study personnel abstracted the relevant information from medical charts, thereby reducing the likelihood of transcription errors and reproduction of questionable values.
In conclusion, for most variables, birth records appear to be a good source of reliable information. The majority of variables showed no difference in agreement when stratified by race, which demonstrates that differential reporting does not contribute meaningfully to the racial disparity in maternal health behaviours, medical events and birth outcomes. Results also illustrated similar agreement across strata of education, with the exception of variables for maternal weight gain, cigarette smoking and marital status. We support the use of birth records for studying how individual sociodemographic and health behaviour characteristics are influenced by social and environmental factors.