A study of pregnancy outcomes related to pregnancy in prison in New South Wales, Australia, designed a two-stage linkage to add maternal history of incarceration and serious mental health morbidity, neonatal hospital admission and infant congenital anomaly diagnosis to birth data. Linkage was performed by a dedicated state-wide data linkage authority. This paper describes use of the linked data to determine prison exposure during pregnancy for a representative population of mothers.
Researchers assessed the quality of linked records; resolved multiple-matched identities; and transformed event-based incarceration records into person-based prisoner records and birth records into maternity records. Inconsistent or incomplete records were censored. Interrogation of the temporal relationships of all incarceration periods from the prisoner record with pregnancies from birth records identified prisoner maternities. Interrogation of maternities for each mother distinguished prisoner mothers who were incarcerated during pregnancy from prisoner control mothers with pregnancies wholly in the community, and identified a subset of prisoner mothers with both types of maternity. Standard descriptive statistics were used to provide population prevalence of exposures and compare data quality across study populations stratified by mental health morbidity.
Women incarcerated between 1998 and 2006 accounted for less than 1 % of the 404,000 women who gave birth in NSW between 2000 and 2006, while women with serious mental health morbidity accounted for 7 % overall and 68 % of prisoners. Rates of false positive linkage were within the predicted limits set by the linkage authority for non-prisoners, but were tenfold higher among prisoners (RR 9.9; 95%CI 8.2, 11.9) and twice as high for women with serious mental health morbidity (RR 2.2; 95%CI 1.9, 2.6). This case series of 597 maternities for 558 prisoners pregnant while in prison (of whom 128 gave birth in prison) and 2,031 contemporaneous prisoner control mothers is one of the largest available.
Record linkage, properly applied, offers the opportunity to extend knowledge about vulnerable populations not amenable to standard ascertainment. Dedicated linkage authorities now provide linked data for research. The data are not research ready. Perinatal exposures are time-critical and require expert processing to prepare the data for research.
The online version of this article (doi:10.1186/s12874-016-0174-7) contains supplementary material, which is available to authorized users.
The Mothers and Gestation in Custody (MAGIC) cohort study was set up to assess incarceration effects on pregnancy outcomes. The study used linked records to identify women pregnant while in prison and overcome the lack of pregnancy outcome data for prisoners in the state of New South Wales (NSW), Australia. History of imprisonment is not systematically recorded in pregnancy records. Information about pregnancy is recorded in NSW prison health services paper-based medical records, but this record is not updated with details about the birth or the condition of the baby if the delivery took place after release. Psychiatric illness and substance use were recognised as important confounders of the relationship between incarceration and pregnancy outcomes. Information about these conditions may be available in medical records but, smoking apart, is not included in perinatal data collected at state level in NSW. Serious psychiatric illness and substance use result in inpatient hospital stays, and NSW inpatient data include detailed diagnostic data.
Record linkage had been used elsewhere to obtain information about pregnancy outcomes among prisoners [2, 3]. NSW has appropriate infrastructure to support data linkage: a single computerised record system for managing offenders in the criminal justice system across the state; well-developed state-wide health and vital statistics collections; a jurisdictional register of persons authorised to receive opiate substitution therapy; and, since 2006, a dedicated population health data linkage infrastructure. Dedicated record linkage authorities are increasingly being used to obtain data for observational and health services research. These authorities facilitate the use of linked population data by applying complex population data linkage and best practice principles to protect patient privacy and confidentiality. Researchers are spared the task of linkage, but are responsible for design of the linkage and assessing the quality of the linked data provided to them. NSW accounts for almost one-third of Australia’s births annually and 40 % of the Australian female prisoner population.
The NSW Centre for Health Record Linkage (CHeReL) is a secure linkage facility that uses probabilistic methods to link person identifiers extracted from NSW health data collections. The CHeReL promotes the use of linked data by supporting researchers, and works closely with the NSW Population Health Ethics Committee and data custodians. Metadata for these NSW Health data collections are published along with other routinely or commonly linked collections.
Five state government-maintained population databases provided data for this study.
Person-based record linkage was undertaken by the CHeReL. PDC and APDC are two of the core population health datasets that contribute to the master linkage key (MLK). Each MLK record comprises a unique person number and an encrypted record number for each linked record. The MLK is updated each time new data or a new data source is added. Data from other sources, such as OIMS and RoCC, can be linked with MLK records. The CHeReL generates project-specific person numbers (PPN) for each linkage, which are returned with the relevant encrypted record numbers to the source data custodians. The CHeReL reviews a sample of 1,000 linked project records to assure a false positive rate of ≤0.3 % and a false negative rate of ≤0.5 %. A report of the linkage was provided to researchers before finalising the linked data [see Additional file 1].
The MAGIC study set out to examine pregnancy outcomes. PDC records were therefore the primary data source to which all other data were linked. Three data sources added information about maternal history of incarceration, maternal admissions for psychiatric illness, substance use and self-harm and maternal history of OST. The linkage also identified mothers with no history of incarceration or serious mental health morbidity. Two data sources added information about baby outcomes: neonatal admissions; and congenital anomalies diagnosed up to 1 year of age.
PDC records were the primary data source to which all other data were linked. Each PDC record includes identifying data for the mother and the baby. The linkage design specified three steps: (1) linkage of PDC mother data with data from OIMS, APDC mental health admissions and PDAS records; (2) retention of records for all PDC records linked by mother and a random 10 % sample of unlinked PDC mother records; and (3) linkage of records for the babies from the selected PDC records with data from APDC records of neonatal admissions and congenital condition registrations (RoCC). Selection criteria specifying records requested from each collection for data linkage have been included in Table 1.
Both OIMS (prisoner) and PDAS (OST authority) data custodians were requested to provide the CHeReL with files containing all permutations of the primary and alias identities.
Ethics approval for the data linkage study was provided by the NSW Population and Health Services Research Ethics Committee (EC00410). Approval for release of prisoner data for linkage was obtained from Justice Health & Forensic Mental Health Network Human Research Ethics Committee (EC00119) and later ratified by the NSW Department of Corrective Services Ethics Committee. Approval to undertake analyses by Indigenous status was obtained from the Aboriginal Health & Medical Research Council Ethics Committee in NSW (EC00342).
In NSW the provision of health data to researchers about individuals without their consent is conditional on protection from spontaneous recognition of their identities [15, 16]. Additional restrictions are to be expected when the data relates to uncommon and sensitive events such as imprisonment or admissions for psychiatric illness. On advice from data custodians, we did not request dates for key events, but sought instead the age in days of the data subject and the year for all events: birthing; hospital admission; hospital discharge; entry into prison; and release from prison. Further, we agreed to limit the request for population control data to a random unexposed sample rather than whole population data.
The aim of this study was to describe the processing of linked data to make it fit for purpose. This involved data cleaning, preparation of new data to identify incarceration exposure status for each maternity and each mother, identification of the index maternity for each mother and selection of control mothers to enable reassembly of linked data for population research.
The event at which a baby of at least 400 g birthweight or at least 20 weeks gestational age is born.
The event at which a woman gives birth to one baby (singleton birth) or several babies (multiple births).
Estimated age at conception (days) was calculated as maternal age at birth (days) − gestational age (weeks) × 7 + 17. The 17 day correction takes into account that gestational age is measured from the first day of the last menstrual period, which is on average 14 days before conception, and is reported as completed weeks, which discounts up to six additional days.
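The calculation above can be written as a one-line function. This is a minimal sketch in Python; the function and parameter names are illustrative assumptions and are not taken from the study's own code.

```python
def estimated_age_at_conception(age_at_birth_days: int, gestational_age_weeks: int) -> int:
    """Estimate maternal age (days) at conception.

    Applies the formula given in the text: age at birth (days) minus
    gestational age (completed weeks) times 7, plus the 17 day correction.
    """
    return age_at_birth_days - gestational_age_weeks * 7 + 17
```

Because the linked data held ages in days rather than calendar dates, every later temporal comparison can be made on this integer scale.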
1st July 2000 to 31st December 2006.
1st January 1998 to 31st December 2006.
APDC record including diagnosis of a psychiatric disorder (F00-F09, F20-F99), self-harm (X60-X84, Y10-Y19, Y87.0, Z91.5), drug use (F11-F19, T40, T42, T43), or alcohol use (E24.4, F10, G31.2, G62.1, G72.1, I42.6, K29.2, K70, K86.0, O35.4, R78.0, T51, X45, X65, Y15, Y57.3, Y90, Y91, Z50.2, Z71.4, Z72.1) or a flag indicating admission to a psychiatric ward; or PDAS record authorising opiate substitution therapy.
Hospital episode of a person aged less than 28 days at admission.
Six de-identified data sets were prepared for researchers by source data custodians comprising the PPNs and the study data requested from each source (Table 1).
Five steps were used to process and assemble the linked data:
The OIMS Data Custodian provided researchers with a ‘unique’ person number (UPN) for each prisoner with the data. Multiple-matched identities were sets of records with one UPN associated with more than one PPN or vice versa. These were resolved by assuming each set was truly a single person (Fig. 1) and testing the validity of this assumption with the validation rules. The PDAS data manager resolved records with multiple-matched identities before sending data to researchers.
Person-based data can be generated by selecting one event record per person. This simple method was used to generate maternity data from birth data because only maternal data were required to assess maternal pregnancy outcomes and to check data quality, and multiple birth was a planned exclusion factor in the subsequent analysis of baby outcomes. Had information from each baby been needed, the more complex transformation described below would have been required.
A comprehensive person-based record used information from every incarceration event. The event history was important, so these were arranged chronologically. Incarceration order (first, second, etcetera) was added to incarceration records, arranged by episode start age, and the maximum incarceration count per person (N in Table 1) was found. A macro was applied to select and rename the set of selected original or derived data items from each incarceration record to include the event order. The revised incarceration records were then merged by person to form prisoner records consisting of sequentially numbered series of N data items. Thus, 9,042 incarceration records were transformed into 3,087 prisoner records with 30 data items for incarceration start ages (start-age1, start-age2 … start-age30), 30 data items for incarceration end ages (end-age1, end-age2 … end-age30), and so forth.
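Finding multiple-matched identities amounts to looking for one-to-many relationships in either direction between the two identifiers. The sketch below is an illustration only: it assumes the linked prisoner data can be reduced to a list of (PPN, UPN) pairs, and the function name is hypothetical.

```python
from collections import defaultdict

def multiple_matched(pairs):
    """Return PPNs linked to several UPNs, and UPNs linked to several PPNs.

    pairs: iterable of (ppn, upn) tuples, one per linked record.
    """
    ppn_to_upn = defaultdict(set)
    upn_to_ppn = defaultdict(set)
    for ppn, upn in pairs:
        ppn_to_upn[ppn].add(upn)
        upn_to_ppn[upn].add(ppn)
    # Keep only identifiers associated with more than one counterpart.
    multi_ppn = {p: sorted(u) for p, u in ppn_to_upn.items() if len(u) > 1}
    multi_upn = {u: sorted(p) for u, p in upn_to_ppn.items() if len(p) > 1}
    return multi_ppn, multi_upn
```

Each set returned would then be treated provisionally as one person and passed to the validation checks, as described above.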
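The event-to-person transformation described above is a long-to-wide reshape. The study applied a macro in its own statistical software; the Python below is only a sketch of the same idea, and the field names (`person`, `start_age`, `end_age`) are assumptions made for the example.

```python
from collections import defaultdict

def to_prisoner_records(incarcerations):
    """Reshape event-based incarceration records into one record per person.

    incarcerations: list of dicts with keys 'person', 'start_age', 'end_age'.
    Returns (records keyed by person, N), where N is the maximum number of
    incarcerations held by any one person.
    """
    by_person = defaultdict(list)
    for rec in incarcerations:
        by_person[rec["person"]].append(rec)
    max_n = max((len(recs) for recs in by_person.values()), default=0)

    wide = {}
    for person, recs in by_person.items():
        recs.sort(key=lambda r: r["start_age"])  # chronological order
        row = {"n_incarcerations": len(recs)}
        for i in range(1, max_n + 1):
            rec = recs[i - 1] if i <= len(recs) else None
            # Unused slots are padded with None so every record has N items.
            row[f"start_age{i}"] = rec["start_age"] if rec else None
            row[f"end_age{i}"] = rec["end_age"] if rec else None
        wide[person] = row
    return wide, max_n
```

Padding every record to the same N is what allows the start and end ages to be treated as fixed-length arrays in later steps.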
Mother records for prisoners were not generated until pregnancy incarceration status for maternities had been assigned (see below).
The rationale and methods used to identify inconsistencies are described below. All maternities for each mother were censored if it was not possible to distinguish between an error in an individual record and a linkage error, or if the error could affect temporal relationships.
The estimated age (days) at conception and the prisoner record were added to each maternity record. Conditional logic was applied to arrays of the ages at the start and end of each incarceration episode. The outcomes, recorded in a series of binary (zero or one) variables, were summed to count the number of incarcerations fulfilling each of the following conditions: (1) incarceration ended before conception; (2) incarceration started after the birth; (3) incarceration started after conception and ended before the birth; (4) incarceration started after conception and ended after the birth; or (5) incarceration started before conception but had not ended by conception.
Maternities with pregnancy incarceration were those with non-zero counts in categories 3 or 4 (incarceration during pregnancy), while prisoner control maternities had non-zero counts in categories 1 or 2. Maternities with a non-zero count for the final category (conceptions in prison) were censored.
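The five conditions and the resulting maternity status can be sketched as follows. This is an illustration under stated assumptions, not the study's code: the handling of boundary days (an episode starting or ending on the day of conception or birth), the reading of category 4 as starting before the birth, the precedence given to censoring, and the status labels are all choices made for the example.

```python
def count_incarceration_categories(conception_age, birth_age, episodes):
    """Count incarceration episodes falling in each of the five categories.

    episodes: list of (start_age, end_age) tuples in days, with start <= end.
    """
    counts = {1: 0, 2: 0, 3: 0, 4: 0, 5: 0}
    for start, end in episodes:
        counts[1] += int(end < conception_age)                      # ended before conception
        counts[2] += int(start > birth_age)                         # started after the birth
        counts[3] += int(start >= conception_age and end <= birth_age)
        counts[4] += int(conception_age <= start <= birth_age and end > birth_age)
        counts[5] += int(start < conception_age <= end)             # in prison at conception
    return counts

def maternity_status(counts):
    """Assign a maternity status from the category counts."""
    if counts[5] > 0:
        return "censored"                 # conception in prison
    if counts[3] > 0 or counts[4] > 0:
        return "pregnancy incarceration"
    if counts[1] > 0 or counts[2] > 0:
        return "prisoner control"
    return "unclassified"
```

With these boundary choices each valid episode falls in exactly one category, so the five counts sum to the number of episodes.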
Maternities for each prisoner mother specifying pregnancy incarceration status were transformed into a prisoner record, which was interrogated to identify pregnant prisoners as those with one or more maternities with a prison pregnancy. Prisoner controls were prisoner mothers with no prison pregnancies. Prisoner mothers with incarceration during pregnancy included a subset with both types of maternity. A flag for prisoner incarceration status was added to each maternity record.
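The mother-level classification can be expressed as a small function over the statuses of a mother's maternities. The sketch assumes censored maternities have already been removed and uses hypothetical status labels.

```python
def mother_status(maternity_statuses):
    """Classify a prisoner mother from the statuses of her maternities.

    Returns (group, own_control), where own_control is True for pregnant
    prisoners who also have a maternity with no pregnancy incarceration.
    """
    has_prison_pregnancy = "pregnancy incarceration" in maternity_statuses
    has_community_pregnancy = "prisoner control" in maternity_statuses
    if has_prison_pregnancy:
        return ("pregnant prisoner", has_community_pregnancy)
    return ("prisoner control", False)
```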
The data provided to researchers included birth records for all women with matched incarceration records, all women with matched records for serious mental health morbidity (a hospital admission record that included diagnosis of a mental health condition, or authority to receive OST) and a 10 % sample of women with no matched records, indicating a history of neither incarceration nor serious mental health morbidity. The data therefore over-sampled mental health conditions. A population-based random 10 % community control sample comprised the random 10 % sample of mothers with no linked records selected by the CHeReL plus a random 10 % sample of non-prisoner mothers whose records had been linked with a record indicating mental health morbidity (Fig. 1).
The index maternity for pregnant prisoners was the first maternity with a pregnancy incarceration. For all prisoner controls and community controls, the index maternity was the first maternity in the study period.
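Selection of the index maternity can be sketched as below. The record layout (`age_at_birth`, `status`) and the status label are assumptions made for the example, and maternities are assumed to be already restricted to the study period.

```python
def index_maternity(maternities, is_pregnant_prisoner):
    """Pick the index maternity for one mother.

    maternities: list of dicts with keys 'age_at_birth' (days) and 'status'.
    For pregnant prisoners the index is the first maternity with a pregnancy
    incarceration; for all other mothers it is the first maternity.
    """
    ordered = sorted(maternities, key=lambda m: m["age_at_birth"])
    if is_pregnant_prisoner:
        ordered = [m for m in ordered if m["status"] == "pregnancy incarceration"]
    return ordered[0] if ordered else None
```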
An estimate of the number of women aged 18 to 44 years who gave birth in NSW between July 2000 and December 2006 was generated for the study by weighting the validated unlinked control sample count of persons by a factor of 10 and adding the count of validated women with a linked prisoner (OIMS), mental health admission (APDC) or OST authority (PDAS) record.
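The population estimate is a weighted sum: the unlinked control sample is scaled up by the inverse of its sampling fraction and the fully ascertained linked women are added. A minimal sketch follows; the function name is illustrative and the counts in the usage line are made-up values, not study data.

```python
def estimate_population(unlinked_sample_count, linked_count, weight=10):
    """Estimate the number of women who gave birth.

    unlinked_sample_count: validated women in the random 10 % unlinked sample.
    linked_count: validated women with a linked OIMS, APDC or PDAS record.
    weight: inverse of the sampling fraction (10 for a 10 % sample).
    """
    return unlinked_sample_count * weight + linked_count

# Hypothetical counts, for illustration only.
example_total = estimate_population(37_000, 31_000)
```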
The CHeReL linkage report [see Additional file 1] noted that 15,995 PDAS identities were supplied for 12,526 women and 64,961 OIMS identities were supplied for 10,372 women. The final linked OIMS records supplied to researchers contained 3,087 different project person numbers (PPNs) and 3,260 OIMS person numbers (UPNs). Fig. 1 summarises the multiple-matched identities: two PPNs each appeared twice, while the same PPN was associated with 2, 3, 4 or 5 UPNs in 115, 18, 2 and 4 records respectively.
Records for 624 women and 1,214 maternities were censored. Of these, records for 578 women were censored because across multiple records their data were inconsistent with being a single individual and 46 because there were no available data to determine temporal relationships between incarceration and pregnancy. Censored women accounted for 0.9 % of all study women, but 16 % of prisoners, 1.7 % of women with mental health morbidity and 0.2 % of non-prisoners with no mental health morbidity (Table 2).
Table 2 shows the total number and proportion (per cent) of person records censored and the number and proportion (per 1,000) of persons in each individual censoring category. Some persons had more than one reason for censoring. Inconsistent maternity data applied to all study women, whereas inconsistent incarceration data applied only to prisoners. Women with MHM were over twice as likely (RR 2.2; 95%CI 1.9, 2.6) and prisoners nearly ten times more likely (RR 9.9; 95%CI 8.2, 11.9) to have had their records censored because of inconsistent maternity data than were women with no linked prison or MHM records.
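A relative risk and its confidence interval of the kind reported above can be computed from four counts. The paper does not state how its intervals were derived, so the log-based (Katz) interval used here is an assumption, as are the function and parameter names.

```python
import math

def relative_risk(exposed_events, exposed_total, unexposed_events, unexposed_total, z=1.96):
    """Relative risk with an approximate 95 % confidence interval.

    Uses the standard error of log(RR); all four counts must be positive.
    """
    rr = (exposed_events / exposed_total) / (unexposed_events / unexposed_total)
    se_log_rr = math.sqrt(
        1 / exposed_events - 1 / exposed_total
        + 1 / unexposed_events - 1 / unexposed_total
    )
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper
```

Here the "events" would be the numbers of women censored for inconsistent maternity data in the exposed group (prisoners, or women with MHM) and in the reference group (women with no linked prison or MHM records).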
Inconsistent incarceration data were the most common reason overall for censoring, but applied only to prisoner records. Most invalid incarceration data (96 %) were records with incarceration periods that overlapped, the remaining records having inconsistent ages (incarceration start ages larger than the end age) or duplicated incarcerations. Multiple matched prisoners (two or more DCSIDs associated with one PPN) accounted for 153 (43 %) of the individuals censored for inconsistent incarceration data. An additional file shows censored records for prisoners with incarcerations lasting less than 5 days and those with one or more periods of incarceration of 5 or more days [see Additional file 2].
There were 3,896 maternities in the study period for the 2,589 prisoner mothers included in the study. Of these, 597 maternities had a period of incarceration that coincided with the pregnancy and were further stratified according to incarceration status at the time of giving birth: 128 maternities with a prison pregnancy where birth took place in prison and 469 where the birth took place in the community after release from prison (Table 3).
Pregnant prisoners and prisoner controls are represented by their index maternity in Table 3. The mother-based records identified 558 pregnant prisoners with one or more maternities where incarceration coincided with the pregnancy and 2,031 prisoner control mothers with maternities following pregnancies wholly within the community. The 283 prisoners with one or more maternities with a pregnancy incarceration and one or more maternities with no pregnancy incarceration are presented as ‘Own controls’. This subset of pregnant prisoners did not contribute independently to the total number of prisoners.
Figure 2, which is not to scale, shows how the 2,589 prisoner mothers were distributed among study mothers with mental health admissions and mothers authorised to receive OST. Overall the MAGIC study estimated that less than 1 % of 403,047 mothers who gave birth in NSW between July 2000 and December 2006 spent some time in prison between 1998 and 2006. Just over 7 % of the mothers who gave birth were either admitted to hospital with a mental health condition or to a psychiatric ward between July 2000 and December 2006 or were authorised to receive OST between 1998 and 2006 (Fig. 1). The population estimate from final study data represents 99.7 % of the 404,144 women who actually gave birth in NSW.
Institutionalised linkage of jurisdictional population data sources is advancing rapidly in Australia and worldwide. This improves the availability and quality of linked data, but the governance and privacy requirements effectively separate researchers from access to the original source data and the linkage process. Researchers are freed from the onerous and highly specialised task of record linkage, but need to specify the linkage design, understand the source data and the limitations of the methods used for linkage, and consider the likely impacts these could have on the data linked for their research.
The NSW Perinatal Data Collection has been audited for the completeness and accuracy of data reported [19, 20] and the coverage has been independently assessed in relation to birth registration data for the state. The quality of hospital episode data is closely scrutinised as these administrative data are the basis for federal funding of state hospitals. There have been several independent studies confirming good linkage between maternity and hospital data in NSW [23–25]. There has been less publicly available information about the quality of corrective services data in NSW, but publication of data from the OIMS suggests confidence in the data quality.
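The three kinds of invalid incarceration data named above (overlapping periods, start age larger than end age, and duplicated incarcerations) can be checked with a short routine. This is a sketch under assumptions: episodes are (start age, end age) pairs in days, and an episode starting on the same day the previous one ended is not treated as an overlap.

```python
def incarceration_problems(episodes):
    """Return the set of inconsistencies found in one person's episodes.

    episodes: list of (start_age, end_age) tuples in days.
    """
    problems = set()
    if len(set(episodes)) < len(episodes):
        problems.add("duplicate")
    if any(start > end for start, end in episodes):
        problems.add("start after end")
    latest_end = None
    # Sort distinct episodes by start age and track the latest end seen so far.
    for start, end in sorted(set(episodes)):
        if latest_end is not None and start < latest_end:
            problems.add("overlap")
        latest_end = end if latest_end is None else max(latest_end, end)
    return problems
```

Under the censoring rule described earlier, any non-empty result would lead to all maternities for that person being censored.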
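The coverage figure quoted above follows directly from the two population counts in the text. As a quick arithmetic check (the function name is illustrative):

```python
def coverage_percent(estimated, actual):
    """Estimated population as a percentage of the actual population."""
    return round(100 * estimated / actual, 1)

# Counts taken from the text: study estimate and actual number of women.
coverage = coverage_percent(403_047, 404_144)
```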
Researchers have a responsibility to independently test data quality. Unacceptably high rates of conceptions in prison alerted researchers to the erroneous data from the first linkage and triggered the investigation by Corrective Services NSW and resupply of the data for this research. The CHeReL supported re-linkage. This highlights the importance of good collaborative relationships between linkage authorities, data custodians and researchers.
The use of aliases and the high level of unstable and transient accommodation among people involved with the criminal justice system are common [26, 27] and complicate data linkage. Including alias identities for record linkage of prisoner data increased linkage sensitivity and generated a more inclusive sample for a small study population with a relatively high matching prevalence. The MAGIC study was not designed to test the effect of including alias identities on linkage quality. However, a substantially higher rate of false positive linkage was found among prisoner maternities. This suggests that sensitivity could be compromised for larger studies, particularly where the linkage prevalence is low. This underlines the importance of careful scrutiny of linkage quality when alias identities are used.
Absence of ‘gold standard’ data against which validation could be carried out is a limitation of this study. The data checks carried out were restricted to scrutiny of the data provided. External validation of data linkage requires complex arrangements and resources for investigation of original source records by separate investigators that were not available for this study. However, researchers flagged source records with inconsistent data and, provided that these did not breach privacy, returned these to the source data provider. The checks that have been carried out were able to find false linkages, but there is no ready means to identify linkage failure. Available prison statistics in NSW reported cross-sectional data from which it is impossible to assess the number of women who have spent time in prison, let alone how many were pregnant. The MAGIC study was one of the first to use OIMS data for population linkage and health research.
The MAGIC study produced the first population data from Australia to enable study of the effect of incarceration on pregnancy outcomes. Studies that seek to assess the effect of prison on pregnancy among incarcerated women are relatively sparse because of the difficulties in case finding, the challenges of selecting appropriate comparison groups and the extensive data required to control for socio-economic confounders. This cohort of 597 maternities for 558 pregnant prisoners, of whom 128 gave birth in prison, and 2,031 prisoner peers with contemporaneous maternities is one of the largest available series of prison pregnancies. The use of prisoners with contemporaneous pregnancies in the community as a peer control group is a pragmatic and efficient alternative to selecting controls matched on socio-demographic variables.
This was the first data linkage study by the CHeReL to use two-stage matching of PDC data. Mechanisms for dual matching of mother and baby data for perinatal studies have since been formalised. This was also the first CHeReL linkage to use data from the NSW Department of Corrective Services and valuable lessons were learned in the process.
The capacity to report results for prisoners against the whole population increases their utility. The ideal linked population for longitudinal follow-up should include both linked and unlinked data related to the primary exposures for the whole population. Where whole population data cannot be used, and particularly for relatively rare exposures such as female incarceration, a random sample of unlinked data is a pragmatic and effective alternative that can be used to estimate population rates with a high degree of accuracy. The generation and inclusion of pregnancy incarceration status, and the allocation of each prisoner as either a pregnant prisoner with or without own control status or a prisoner control for validated maternities, avoided duplication of effort and provided coherence for all researchers using the data to investigate outcomes.
Record linkage, properly applied, offers the opportunity to extend knowledge and monitor the effect of interventions aimed at improving health outcomes. Population data linked by dedicated linkage authorities to the highest standard is not research ready and additional effort is needed on the part of researchers to validate and prepare the data for epidemiological analysis.
APDC, admitted patient data collection; CHeReL, centre for health record linkage; MAGIC, mothers and gestation in custody; MLK, master linkage key; N, maximum event/episode count per person; NSW, New South Wales; OIMS, offender integrated management system; OST, opioid substitution therapy; PDAS, pharmaceutical drugs of addiction system; PDC, perinatal data collection; RoCC, register of congenital conditions; UPN, ‘unique’ person number provided in prisoner data
Record linkage, preparation of the study data and analysis were undertaken with funding from the National Health and Medical Research Council of Australia. Project Grant ID 457515.
Data for this study were provided by the NSW Ministry of Health and Corrective Services NSW. Data linkage was undertaken by the NSW Centre for Health Record Linkage (CHeReL).
Ms Naomi Radom assisted LH with revision of the linked data to include all incarceration durations and Ms Elizabeth Moore from the NSW Centre for Health Record Linkage (CHeReL) commented on an earlier draft of this paper.
Data will not be shared as this was a condition of release of data to researchers by source data custodians.
The authors declare that they have no competing interests.
MHL and EAS conceived and supervised the study, contributed to the interpretation of data and reviewed the manuscript. LH designed the linkage strategy, liaised with data custodians, obtained the linked data, prepared the linked data sets for analysis, and wrote the manuscript. JRW contributed to the interpretation of data and reviewed the manuscript. All authors read and approved the final manuscript.
Additional file 1. CHeReL linkage summary. This is a copy of the final linkage summary provided to researchers by the Centre for Health Records Linkage (CHeReL) for the MAGIC project. (PDF 112 kb)
Additional file 2. Expanded Table S2. This is an expanded version of Table S2 that includes details of prisoner records with incarcerations of less than 5 days’ duration and prisoner records with one or more incarcerations of 5 or more days’ duration. The latter prisoner records were used for the analysis of pregnancy outcomes reported in 2014. (PDF 264 kb)