To identify genetic variants contributing to preterm birth using a linkage candidate gene approach.
We studied 99 single nucleotide polymorphisms for 33 genes in 257 families with preterm births segregating. Nonparametric and parametric analyses were used. Premature infants and mothers of premature infants were defined as affected cases in independent analyses.
Analyses with the infant as the case identified two genes with evidence of linkage: CRHR1 (p=0.0012) and CYP2E1 (p=0.0011). Analyses with the mother as the case identified four genes with evidence of linkage: ENPP1 (p=0.003), IGFBP3 (p=0.006), DHCR7 (p=0.009), and TRAF2 (p=0.01). DNA sequence analysis of the coding exons and splice sites for CRHR1 and TRAF2 identified no new likely etiologic variants.
These findings suggest the involvement of six genes acting through the infant and/or the mother in the etiology of preterm birth.
Preterm birth (PTB) is a major public health issue accounting for three million deaths worldwide each year. Despite a slight decrease in the incidence recently, PTB has increased from about 10% to over 12.5% of births over the last two decades in the United States (1). Improvements in neonatal care have contributed to an increase in survival rates of preterm infants (2) in countries with optimal infant healthcare delivery. However, despite these advances, PTB is still associated with substantial rates of morbidities, especially in extremely preterm infants, including chronic lung disease, patent ductus arteriosus, retinopathy of prematurity, intracranial hemorrhage, and cerebral palsy (3). These complications, in addition to PTB itself, are the largest risk factors for infant mortality in the United States (4).
The majority (72%) of PTB is spontaneous (5) with unknown etiology (6). One substantial risk factor for PTB is genetic predisposition (7,8). An infant has an increased risk of being premature if the mother was born prematurely (9), if a maternal aunt had a premature infant (10), and especially if the mother had a prior PTB (11). Twin studies suggest that the heritability of PTB ranges from 15 to 40% (6). A few other maternal risk factors implicated in spontaneous PTB include low socioeconomic status, black race, younger age, intrauterine infection, inflammation (6), low pre-pregnancy weight (12), cholesterol levels (13,14), and substance abuse (15,16). Conditions, such as preeclampsia or fetal distress, may lead to induction of labor or cesarean delivery prior to 37 weeks gestation, resulting in an indicated PTB. Studies have shown the mother transmits much of the genetic risk for spontaneous PTB with smaller contributions from the father and fetus (17-19).
There are a variety of approaches to identify genes associated with a complex trait. A candidate gene approach takes advantage of the known biology associated with labor and delivery whereas a genome wide approach can implicate new physiologic pathways. In addition to known biology, conservation of evolutionary mechanisms can also be applied to human parturition timing to suggest additional genes (20). There are strong arguments for candidate gene studies to continue being used in the study of complex disease (21). Linkage studies have the ability to detect rare, higher risk variants and can identify causal genes when allelic heterogeneity prevents genome wide association from succeeding (22). We hypothesized that using a candidate gene linkage approach we would identify new genes containing variants that contribute to familial cases of PTB.
A total of 257 extended families were chosen, including 492 premature infants forming 297 affected relative pairs (260 infant affected pairs and 37 mother affected pairs); see Table 1 for a summary by study site. The mean family size for typed members was 10.9±3.4 (median=10; range=6-27) with a mean of 2.0±0.9 for typed premature infants (median=2; range=1-5) and a mean of 1.2±0.4 for typed mothers of premature infants (median=1; range=1-3) per pedigree. An initial power analysis indicated the sample size for this study was adequate to detect evidence of linkage with modest locus heterogeneity and Mendelian models. The data for all genes, looking for either a fetal or maternal effect, are summarized in Figure 1 for the nonparametric linkage analysis and in Figure 2 for the transmission disequilibrium test (TDT). The full dataset, along with parametric linkage results, which did not reveal any significant findings, is available in Supplemental Tables 1A and 1B, online. Two single nucleotide polymorphisms (SNPs) violated Hardy-Weinberg equilibrium and were not included in the analysis: rs1876831 (CRHR1, p=5E-11) and rs573549 (APOA1, p=0.0004).
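The Hardy-Weinberg exclusion described above can be illustrated with a minimal sketch. It assumes a one-degree-of-freedom chi-square goodness-of-fit test on genotype counts from unrelated individuals; the text does not state which test or software was used, and the function name and example counts are hypothetical.

```python
import math

def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Takes observed genotype counts for one biallelic SNP and returns
    (chi-square statistic, p-value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    # Allele frequencies estimated from the observed genotype counts.
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expectation (monomorphic markers).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts: a marker with no heterozygotes deviates strongly.
chi2, p_value = hwe_chi_square(50, 0, 50)
print(f"chi2={chi2:.1f}, p={p_value:.2e}")
```

A marker whose p-value falls below a chosen cutoff would be dropped before linkage analysis, as was done for the two SNPs listed above.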
With the preterm infant as the affected case, a nonparametric linkage analysis of all candidate genes revealed two linkage peaks. These multipoint linkage peaks were on chromosome 10 (CYP2E1, p=0.0011-0.002) and chromosome 17 (CRHR1, p=0.0012-0.002). CRHR1 also had significant singlepoint linkage peaks (p=0.05-0.0014). With the mother of a premature infant as the affected case, six linkage peaks were identified. Four were multipoint linkage peaks on chromosome 6 (ENPP1, p=0.003), chromosome 7 (IGFBP3, p=0.006), chromosome 9 (TRAF2, p=0.01), and chromosome 11 (DHCR7, p=0.009). Three were singlepoint linkage peaks on chromosome 5 (HAVCR2, rs12654265, p=0.002), chromosome 10 (MBL2, rs2136892, p=0.001), and chromosome 11 (DHCR7, rs1790318, p=0.008).
Using TDT, we identified eight nominally significant associated SNPs (p<0.05). However, none fell within the most significant linkage peaks based on affection status, and none were significant when accounting for multiple comparisons using a Bonferroni correction. With the mother as case, there was one suggested association with rs10878774 in IFNG (p=0.047). With the infant as case, a suggested association was seen for rs2303152 in HMGCR (p=0.046); rs605203 (p=0.047), rs7746553 (p=0.019), and rs592229 (p=0.034) in C2; rs44589901 in DEFA6 (p=0.036); rs11003136 in MBL2 (p=0.003); and rs4760648 in VDR (p=0.010).
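The multiple-comparison reasoning above can be sketched as follows. The number of tests used for the correction is not stated in the text; this example assumes one test per analyzed SNP (99), which gives a threshold of 0.05/99, or about 0.0005. The function name is hypothetical, and the p-values are those reported in the paragraph above.

```python
def bonferroni_significant(p_values, n_tests, alpha=0.05):
    """Return the entries whose p-value passes a Bonferroni-corrected threshold."""
    threshold = alpha / n_tests
    return {name: p for name, p in p_values.items() if p < threshold}

# Nominally significant TDT results as reported in the text.
tdt_p = {
    "rs10878774": 0.047,
    "rs2303152": 0.046,
    "rs605203": 0.047,
    "rs7746553": 0.019,
    "rs592229": 0.034,
    "rs44589901": 0.036,
    "rs11003136": 0.003,
    "rs4760648": 0.010,
}

# Assumption: 99 tests, one per SNP retained in the analysis.
print(bonferroni_significant(tdt_p, n_tests=99))
```

Under this assumption even the smallest reported p-value (0.003) is above the corrected threshold, which is consistent with none of the associations surviving correction.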
An analysis was performed that stratified premature individuals based on type of labor. There were 251 individuals with spontaneous, 40 with induced, 36 with no labor (cesarean section), and all others with unknown type of labor. The reason for induction was not known for all individuals in the dataset. There was no significant difference in mean (p=0.07) or median (p=0.19) gestational age between those with spontaneous, induced, and no labor. When the unknown group was added, a statistically significant difference was seen when compared with the no labor group (ANOVA mean p=0.02, median p=0.01), but no difference was seen comparing with the spontaneous and induced groups. Using the preterm infant with spontaneous labor as the affected case, a nonparametric linkage analysis revealed two multipoint peaks and six singlepoint peaks. The multipoint peaks were on chromosome 9 (TRAF2, p=0.03) and chromosome 20 (BPI, p=0.03). The singlepoint peaks were in HMGCR (rs3931914, p=0.04), PTGS1 (rs10513401, p=0.03), TRAF2 (rs10781522, p=0.04), CRHR1 (rs7225082, p=0.014), and BPI (rs5743507, p=0.04, and rs4358188, p=0.04). TDT identified eight suggested associations: rs7746553 in C2 (p=0.02), rs4458901 in DEFA6 (p=0.003), rs2515617 in ABCA1 (p=0.04), rs10781522 (p=0.04) and rs4880166 (p=0.04) in TRAF2, rs11003136 in MBL2 (p=0.02), rs1630498 in DHCR7 (p=0.01), and rs1893505 in PGR (p=0.02).
Linkage haplotype analysis was performed for CRHR1 as it had the strongest linkage signal. The 3 SNPs in CRHR1 generated 7 of a possible 8 haplotypes. These haplotypes were treated as “super alleles” numbered 1 through 8. Because haplotypes 6 and 7 had low frequencies, they were pooled and redefined. Both parametric and nonparametric analyses were performed. The nonparametric analysis gave p=0.0024. The dominant model had a LOD score of 0.82 and a heterogeneity LOD score (HLOD) of 1.32 (alpha=0.630). The recessive model was not significant (LOD=-18.1, HLOD=0.329 with alpha=0.121). With TDT haplotype association analysis, no individual haplotype was significant (p>0.28), with a global p=0.83.
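The gap between the LOD and HLOD values above (for example, a strongly negative LOD alongside a small positive HLOD for the recessive model) follows from how a heterogeneity LOD is defined. The sketch below assumes the standard admixture formulation, in which a proportion alpha of families is linked and the HLOD is the maximum over alpha of the summed log10(alpha * 10^LOD_i + 1 - alpha). The per-family LOD scores are hypothetical, and the grid search is a simplification rather than the procedure Merlin uses internally.

```python
import math

def hlod(family_lods, steps=1000):
    """Grid-search the heterogeneity LOD over alpha in [0, 1].

    family_lods: per-family LOD scores at one position.
    Returns (HLOD, alpha at the maximum). The HLOD is never negative
    because alpha = 0 always gives a score of 0.
    """
    best_score, best_alpha = 0.0, 0.0
    for i in range(steps + 1):
        alpha = i / steps
        score = sum(
            math.log10(alpha * 10 ** lod + (1 - alpha)) for lod in family_lods
        )
        if score > best_score:
            best_score, best_alpha = score, alpha
    return best_score, best_alpha

# Hypothetical families: two support linkage, two argue against it.
lods = [1.0, 1.0, -3.0, -3.0]
print("total LOD:", sum(lods))
print("HLOD, alpha:", hlod(lods))
```

With these illustrative values the summed LOD is negative while the HLOD is positive at an intermediate alpha, the same qualitative pattern reported for the recessive model.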
An initial analysis of 33 SNPs in OXTR, PGR, VDR, CRHR1, PTGS1, KCNN3, TRAF2, IGF1R, and NR3C1 was performed on a subset of the population (412 premature infants forming 230 affected relative pairs) in which CRHR1 and TRAF2 showed evidence of linkage (p<0.01). Because of strong evidence in the literature to support a causal genetic variant within these genes, we sequenced their coding regions rather than the 10-20 centimorgan chromosomal region surrounding the linkage peaks identified. While previously reported SNPs were identified in the coding regions, no novel missense, frame-shift, or nonsense mutations were detected.
Identification of a genetic contribution to PTB would allow detection of at-risk pregnancies and might also suggest environmental contributors to PTB. This could provide tools to prolong gestation by tailoring obstetrical management to individual genetic susceptibilities. In the past, interventions for preventing PTB have proven largely unsuccessful (23), but by identifying specific individual pathophysiologic mechanisms of PTB, new strategies can be developed. In this study, we used a linkage candidate gene and sequencing approach in an attempt to identify chromosomal regions that may contain genes involved in the etiology of PTB. Genome wide association studies have had enormous success recently in identifying genes associated with complex traits, but they have not been reported for PTB. In addition, association will not identify those genes where allelic heterogeneity is responsible for the heritability, even though those alleles might have greater impact on the phenotype in a given family than common variants typically have. Linkage is the best approach to detect this class of variant. While previous studies have supported a stronger maternal contribution to PTB, it is also thought that PTB may be due to the role of genes present in the mother/uterus, baby/placenta, or a combination of both (8,24). This is the first large linkage study looking at both the mother and the fetus as potential risk cases, so the potential linkages identified will be signals to be examined in larger studies using more markers and a greater number of families than were available in this study. Our findings suggest the involvement of CRHR1 or CYP2E1 mediated by the infant and/or ENPP1, IGFBP3, DHCR7, or TRAF2 mediated by the mother in the etiology of PTB. CRHR1, DHCR7, and TRAF2, in particular, are members of pathways identified in prior studies and have biologic plausibility for playing a role in PTB.
CRHR1 encodes one of the two receptors found in humans to which corticotropin-releasing hormone (CRH) binds (25). CRHR1 is expressed in the pituitary (26), endometrium (27), myometrium, and placenta (25), among other locations. The coding sequence of CRHR1 is highly conserved with only 6 missense variants reported in the NHLBI/GO database (http://evs.gs.washington.edu/EVS/) of over 1000 sequenced Europeans. Placental CRH is part of a feed-forward loop in both mother and fetus. CRH stimulates the release of ACTH from the pituitary, which leads to the release of glucocorticoids from the adrenal glands promoting production of more CRH (26,27). Plasma CRH levels undergo an exponential increase during pregnancy peaking at the time of delivery (26) due to increased placental production and decreased CRH binding protein concentrations (27). Women having a PTB undergo a more rapid increase (26) establishing different patterns of CRH levels as early as the end of the first trimester, suggesting that the length of gestation is predetermined and the onset of parturition is triggered when CRH levels peak (26). Genetic variants in CRHR1 have also been shown to have an association with susceptibility to bacterial vaginosis, which is a risk factor for PTB (28).
7-dehydrocholesterol reductase catalyzes the final step in the synthesis of cholesterol. Cholesterol is an important substrate in the synthesis of many hormones including placental progesterone (13), which is critical for successful reproduction. A physiologic hypercholesterolemia has been demonstrated to occur later in pregnancy that is thought to be a mechanism for pregnancy maintenance (29). A study by Steffen, et al. showed fetal polymorphisms in DHCR7, as well as other genes involved in cholesterol metabolism, to be associated with birth weight and PTB. In addition, a strong association was seen between low total cholesterol levels during pregnancy in Caucasian women and PTB (13).
TRAF2 plays a role in the TNF signal transduction pathway. In this pathway, TNF binds its receptor to recruit caspase 8, initiating apoptosis (30). Activation of this pathway via binding to TNF receptor 1 on fetal membranes has been implicated in the etiology of premature rupture of membranes (31). TNF also activates the transcription factor nuclear factor kappa B (NF-kappa B), which interacts with inhibitor-of-apoptosis proteins to block caspase 8 (32). TNF binding to TNF receptor 2 on fetal membranes activates NF-kappa B via TNF receptor-associated factor 2 (TRAF2) leading to increased production of inflammatory cytokines. This subsequently increases production of prostaglandins, which can initiate preterm labor (31).
There is less evidence to support CYP2E1, ENPP1, and IGFBP3 in the etiology of PTB, but they may play a role based on known maternal risk factors. Members of the cytochrome P450 family are involved in the detoxification and metabolism of a variety of substrates as well as synthesis of cholesterol, steroids, and other lipids. CYP2E1, specifically, encodes a protein that is induced by ethanol and pathologic states like fasting, diabetes, obesity, and high fat diet (33). It also metabolizes specific substrates including ethanol and nitrosamines (34), premutagens found in cigarette smoke. ENPP1 encodes a protein responsible for cleaving pyrophosphate and phosphodiester bonds of nucleotides and nucleotide sugars, which are a source of chemical energy and play an important role in metabolism. Phosphate removal can interrupt the activity of nucleotides resulting in deranged metabolism. IGFBP3 encodes a protein that binds the majority of circulating insulin-like growth factors, which are thought to play a role in fetal and postnatal growth (35). Increases in maternal serum levels have been associated with increasing gestational age (36), and decreased levels have been shown to be present in deliveries prior to 32 weeks gestation (35). An increase in inflammatory cytokines has been shown to decrease levels of insulin-like growth factor-binding protein 3 (IGFBP3) (37), and inflammation is a well-known pathway implicated in PTB (6).
For this study, the number of affected relative pairs was limiting, particularly for mothers of premature infants. Additional families could provide power to make a genome wide linkage analysis practical. A second limitation was that roughly one-third of our cohort had unknown type of labor. Therefore, even though a separate analysis was performed looking at spontaneous PTB only, the results should be interpreted keeping in mind that the sample size as well as the power of the study were both significantly reduced. In addition, there were only 40 infants with known induced PTB and the reason for augmentation was not known for all these infants preventing us from performing a separate analysis of these individuals. For future studies, it would be important to recruit families for which the type of labor and reasons of augmentation, if applicable, are known in order to have a more informative cohort to use for stratified analyses.
Additional sequencing could better characterize the significant linkage peaks identified in this study. It would be important to look at regulatory elements for CRHR1 and TRAF2 since only coding regions were examined. In addition, the coding regions and regulatory elements for the other genes with significant linkage peaks (CYP2E1, ENPP1, IGFBP3, and DHCR7) should be sequenced. An alternative approach would be to saturate the chromosomal regions surrounding these genes with additional markers and include additional samples for a fine-mapping genetic association study with increased power. Whole exome sequencing using familial cases may also provide valuable insight. Once we have further defined genetic variants and likely environmental contributors, analyses can be performed to look at the interactions and to adjust for confounding variables.
In summary, we have identified several candidate genes/regions that may harbor rare variants contributing to PTB. Of these, CRHR1 has the strongest data, with its effect modulated via the fetus.
Cases were defined as singleton preterm infants (delivery at <37 completed weeks of gestation) admitted to one of our centers in Iowa City IA, Pittsburgh PA, Rochester NY, Wake Forest NC, or the island of Funen in Denmark. We included both indicated and spontaneous deliveries. Gestational age was estimated by the first day of the last menstrual period and confirmed by obstetrical examination, including ultrasound when indicated. Signed informed consent, approved by the Institutional Review Board (#199911068) at the University of Iowa, was obtained from all families. DNA was extracted from cord blood or buccal swabs collected for the infants and venous blood, saliva samples, or buccal swabs collected for relatives. Demographic information and additional phenotype data was collected through an interview with the mother and medical chart review.
Families were included if samples were available for a minimum of two premature individuals or mothers of premature infants, excluding multiples, or one premature infant with at least one full term sibling or cousin. An infant affected relative pair was defined as any pair of premature individuals, excluding infant/parent pairs, within a family. A mother affected relative pair was defined as a pair of sisters both having premature infants.
Candidate genes were selected based on biologic plausibility, a review of current literature (14,20,26,31,34-36,38-40), and previous association study findings from our lab. The group included 8 genes (BPI, C2, DEFA6, DEFA4, DEFA5, IFNG, MBL2, TRAF2) in inflammatory pathways, 12 genes involved in hormonal regulation (ABCA1, APOA1, APOA5, CRHR1, CYP1B1, DHCR7, HMGCR, LNPEP, NR3C1, OXTR, PGR, PTGS1), and 13 other genes (CYP24A1, CYP2E1, DACH2, ENPP1, EPHX1, HAVCR2, IGF1R, IGFBP3, KCNN3, TP53, UGT1A1, VDR, ZIC3). All 33 genes and their respective SNPs are listed in Table 2; 99 of 101 SNPs are reported after excluding the two that were not in Hardy-Weinberg equilibrium.
Genotyping of SNP markers was performed using Applied Biosystems (Foster City, CA) TaqMan chemistry. Within each gene, 2-4 on-demand SNP genotyping assays were chosen based on linkage disequilibrium data available from the International Hapmap project (www.hapmap.org) as well as for their haplotype characteristics, such as high heterozygosity and low correlation coefficient, to maximize heterozygosity. The average heterozygosity per locus was 0.85. Applied Biosystems provided standard conditions under which reactions were run. Thermocycling was performed with conditions of 95°C for 10 minutes followed by 50 cycles alternating between 92°C for 15 seconds and 60°C for 1 minute. Allele determination was done in the endpoint analysis mode on an Applied Biosystems 7900 HT Sequence Detection System machine with SDS 2.3 software. Mendelian errors were checked and individuals with greater than 10% error rates were excluded from analyses. Genotypes were entered into Progeny (South Bend, IN), a laboratory database, and files were generated in linkage format for analysis.
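The Mendelian error check described above can be sketched as follows, assuming a simple trio-based test that asks whether a child's genotype can be formed from one allele of each parent. The text does not say which tool performed the check, so the function names and example genotypes are hypothetical; missing genotypes and extended-pedigree inference are deliberately not handled.

```python
def is_mendelian_consistent(child, father, mother):
    """True if the child could have inherited one allele from each parent.

    Each genotype is a 2-tuple of alleles at a single SNP, e.g. ("A", "G").
    """
    a, b = child
    return (a in father and b in mother) or (b in father and a in mother)

def mendelian_error_rate(trios):
    """Fraction of (child, father, mother) genotype trios that are inconsistent."""
    if not trios:
        return 0.0
    errors = sum(not is_mendelian_consistent(*trio) for trio in trios)
    return errors / len(trios)

# Hypothetical genotypes for one individual across ten SNPs: two are inconsistent.
good = (("A", "G"), ("A", "A"), ("G", "G"))
bad = (("G", "G"), ("A", "A"), ("G", "G"))
rate = mendelian_error_rate([good] * 8 + [bad] * 2)
print(rate, "-> exclude" if rate > 0.10 else "-> keep")
```

The 10% cutoff in the usage line mirrors the exclusion rule stated in the paragraph above.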
Two different phenotypic outcomes, premature individuals and mothers of premature individuals, were defined in independent analyses. Singlepoint and multipoint nonparametric and parametric linkage analyses as well as linkage haplotype analysis were performed using the Merlin 1.1.2 software package (http://www.sph.umich.edu/csg/abecasis/Merlin/index.html). Two parametric linkage analysis models, autosomal recessive and autosomal dominant, were used assuming a disease allele frequency of 0.11 for both. We used penetrances of 0.80, 0.02, and 0.02 for the recessive model and 0.20, 0.20, and 0.02 for the dominant model for the wild-type homozygotes, heterozygotes, and homozygotes, respectively. The values selected were based on the rates of preterm birth in US Caucasian populations during the time frame of this study, with the penetrances chosen as arbitrary but midrange values based on other complex trait models. Changes in penetrances did not greatly affect results.
Association testing was performed on the families with DNA samples available to form case-parent triads using the preterm infant or the mother of a preterm infant as case. We used the Family Based Association Test (http://www.biostat.harvard.edu/~fbat/fbat.htm), a family-based TDT, to look for nonrandom allele transmission from parents to offspring.
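The idea of testing for nonrandom allele transmission can be illustrated with the classic transmission disequilibrium statistic, which compares how often heterozygous parents transmit versus do not transmit a given allele to an affected child. This is a minimal sketch of that textbook statistic, not of the FBAT implementation, which generalizes it to other family structures; the function name and counts are hypothetical.

```python
import math

def tdt(transmitted, untransmitted):
    """Classic TDT from heterozygous-parent transmissions.

    transmitted:   times the allele of interest was passed to an affected child
    untransmitted: times the other allele was passed instead
    Returns (chi-square statistic, p-value) with 1 degree of freedom.
    """
    b, c = transmitted, untransmitted
    if b + c == 0:
        raise ValueError("no informative transmissions")
    chi2 = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts: 60 transmissions versus 40 non-transmissions.
chi2, p_value = tdt(60, 40)
print(f"chi2={chi2:.2f}, p={p_value:.4f}")
```

Under the null hypothesis of no linkage or association, each allele is transmitted half the time, so equal counts give a statistic of zero.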
Primers were designed from public sequence to amplify coding regions of CRHR1 and TRAF2, and are available upon request. We sequenced 190 preterm infants from the linkage cohort and 105 mothers of preterm infants from a population-matched cohort in Helsinki, Finland. In addition, 162 parents of term infants from Iowa City, 29 CEPH parents, and 85 mothers of term infants from Helsinki, Finland were used as controls. PCR products were sent to Functional Biosciences (Madison, WI) for sequencing. Chromatograms were transferred to a UNIX workstation, base called with PHRED (v.0.961028), assembled with PHRAP (v. 0.960731), scanned by POLYPHRED (v. 0.970312), and viewed with CONSED (v. 4.0).
The authors would like to thank all the families participating in the study as well as the staff responsible for consenting and obtaining samples for participants. In particular, Laura Knosp, Sarah Wente, Gretchen Cress, Ruthann Schrock, Sara Scott, and Nancy Krutzfield at the University of Iowa; Karen Debes at Magee Womens Hospital; Lauren Smith and Eileen Blakely at the University of Rochester; Kristi Lanier at Wake Forest University; and Dorthe Grosen and Bibi Anshoj at the University of Southern Denmark. Susan Berends helped identify extended families and Dina Ahram provided student support in the lab.
Statement of financial support: This work was supported by a grant from the Doris Duke Charitable Foundation to the University of Iowa to fund Clinical Research Fellow Elise Bream, as well as funding from the National Institutes of Health (grants RO1 HD-052953 and RO1 HD-057192-01A2) and March of Dimes (grants MOD 21-FY10-180 and MOD 6-FY11-261).