The incidence of recurrent miscarriage (RM; ≥3 consecutive pregnancy losses) is estimated as 1%-2% in fertile couples. Familial clustering of RM has suggested the contribution of a genetic component.
Low level of HCG in maternal serum during the first trimester of the pregnancy is a clinically accepted risk factor for miscarriage. We sought to study whether variation in Chorionic Gonadotropin beta subunit genes (CGB) expressed in placenta may contribute to the risk of RM.
Resequencing of CGB5 and CGB8, the two most actively transcribed loci of the four HCG beta duplicate genes.
A case-control study involving two sample sets, from Estonia (n=194) and from Finland (n=185).
RM patients (n=184) and fertile controls (n=195).
Of the 71 variants identified in CGB5 and CGB8, 48 SNPs were novel. A significant protective effect was associated with two SNPs located at identical positions in intron 2 of both CGB5 (p=0.007, OR=0.53) and CGB8 (p=0.042, OR=0.15), and with four CGB5 promoter variants (p<0.03; OR=0.54-0.58). The carriers of minor alleles had a reduced risk of RM. The haplotype structure of the CGB8 promoter was consistent with balancing selection; a rare mutation in the CGB8 initiator element was detected only among patients (n=3). In addition, three rare non-synonymous substitutions were identified among RM cases as possible variants increasing the risk of recurrent pregnancy loss.
The findings encourage studying the functional effect of the identified variants on CGB expression and HCG hormone activity to further elucidate the role of CGB variation in RM.
Recurrent miscarriage (RM) or habitual abortion is defined as three or more consecutive pregnancy losses before 22 gestational weeks or the spontaneous abortion of an embryo/fetus weighing less than 500 g. The occurrence of RM is estimated as 1%-2% of fertile couples (1, 2). Although patients with RM undergo multiple diagnostic tests to detect parental chromosomal anomalies and maternal thrombophilic, endocrine, or immunological disorders, over 50% of RM cases are classified as idiopathic (3). An increased prevalence of miscarriage among first-degree relatives of women suffering from RM (4) suggests a genetic contribution to recurrent pregnancy loss. Possible candidates include genes regulating the development of maternal immunotolerance and inflammatory response, coagulation, angiogenesis, vascular tone, and apoptosis. Prime candidates for the molecular causes of RM have been various thrombophilic gene mutations (5-7). Convincing data have also been reported on the association between miscarriage rates and polymorphisms in the HLA-G gene expressed on the surface of the invading cytotrophoblasts (8).
So far, major interest has focused on the physiological response of the mother to the pregnancy. Less attention has been paid to the placental proteins coded by the fetal genome with contribution from both maternal and paternal genes and their variants. One of the first proteins produced by the conceptus is human chorionic gonadotropin (HCG), also known as “the pregnancy hormone” due to its essential role in human reproduction. The main function of HCG is to delay the apoptosis of the corpus luteum during the first trimester of pregnancy. HCG has several paracrine effects in the process of implantation (9), angiogenesis and placentation (10, 11), and development of maternal immunotolerance (12). Low level and non-exponential increase of HCG in maternal serum during the first trimester of the pregnancy is a clinically accepted risk factor for miscarriage (13-15).
The hormone-specific HCG beta-subunit is expressed by syncytiotrophoblasts of the placenta and is encoded by four Chorionic Gonadotropin Beta genes (CGB, CGB5, CGB7 and CGB8) located within the LHB/CGB gene cluster at 19q13.3 (Fig. 1B). Among the four HCG beta duplicate genes, CGB8 and CGB5 are the most actively transcribed and together contribute 62-82% of the total pool of beta-subunit mRNA transcripts (Fig. 1A; 16-18). Our previous data on the HCG beta genes showed that (i) their diversity level is one of the highest reported for human genes; (ii) there is high interindividual and intergenic difference in expression; and (iii) the mRNA transcription level is significantly lower in cases of RM compared to normal first trimester pregnancies (18-20). Here we addressed the question of whether particular variants in these genes may contribute to pregnancy failure. High genetic variation in the LHB/CGB region and the aim to capture both rare and common variation prompted us to choose resequencing instead of traditional genotyping. We analyzed CGB5 and CGB8 in Estonian and Finnish RM cases (n=184) and fertile women (n=195) by comparing variation and haplotype patterns between the two groups. Consistent with the hypothesis of the study, we identified genetic variants in HCG beta genes either significantly increasing or reducing a subject’s risk of experiencing recurrent pregnancy loss.
The study was approved by the Ethics Committees of the University of Tartu, Estonia (protocols no 117/9, 16.06.03, 126/14, 26.04.2004) and the Department of Obstetrics and Gynecology, Helsinki University Central Hospital Outpatient Clinic for women with recurrent miscarriage (protocol no 298/E2/2000). Subjects were recruited and blood samples for DNA extraction were collected at the Women’s Clinic of Tartu University Hospital and Nova Vita Clinic, Centre for Infertility Treatment and Medical Genetics, Tallinn, Estonia in 2003-2007; and in the Department of Gynaecology and Obstetrics of the Helsinki University Hospital in Finland during 2001-2004. Written informed consent was obtained from every study participant. In both participating centers patients with ≥3 abortions during the first trimester of pregnancy were recruited (n=184; age 18-40 yrs). As maternally and paternally derived gene variants contribute equally to the function of a fetal genome, the patient group included both the women and their partners who had experienced recurrent pregnancy losses. In the Estonian sample collection the patient group consisted of 32 couples and 29 females with RM, and an additional 3 couples with ≥3 unsuccessful in vitro fertilization procedures. In the Finnish sample collection the RM group consisted of 40 couples and 5 females with RM (detailed description in 21, 22). The control group (n=195) consisted of age-matched fertile women with no history of miscarriage and either at least one normal pregnancy (the Finnish subjects, n=100) or, more stringently, ≥3 successful deliveries (the Estonian subjects, n=95). The control group was designed under the assumption that fertile women with no history of spontaneous abortions carry gene variants supporting successful pregnancies. Their male partners were not recruited into the control group as detailed reliable information on their past reproductive history was unavailable.
All patients had a normal karyotype tested from peripheral blood lymphocyte cultures. Female patients having uterine anomalies were excluded by ultrasonography or hysterosonogram.
DNA was extracted from peripheral blood using a protocol based on the salting-out method. The CGB5 (~1.7 kb fragment) and CGB8 (long-range PCR ~8.3 kb; nested PCR ~2.5 kb fragment) genomic regions (Fig. 1C) were amplified and resequenced using previously described primers and conditions (19). The resequenced region involving CGB8 covered 2050 bp, including the entire CGB8 gene (1474 bp) and 400 bp of the 5′ upstream region. The resequenced region for CGB5 (1468 bp) covered the full genic region and part of the 3′ downstream region (Fig. 1C). Additional primers were designed for the analysis of the 5′ upstream region of the CGB5 gene (450 bp) using the Primer3 software (http://frodo.wi.mit.edu/cgi-bin/primer3/primer3_www.cgi). Specificity of the PCR products was verified in three steps: (1) design of unique primer pairs for specific amplification of only one of the seven duplicated genes; (2) verification of monomorphic status of gene-specific positions used as markers for each individual gene (Supplementary Fig. S1); (3) test for Hardy-Weinberg Equilibrium (HWE) for each identified SNP. Primer sequences for PCR and resequencing are listed in Supplementary Table S1. The sequences were resolved using either ABI 3730 X1 or ABI 3730 XL DNA Analyzer (Applied Biosystems) and assembled into a contig as described (19). Polymorphisms were identified using the PolyPhred program (Version 6.02) (http://www.phrap.org/phredphrapconsed.html) (23) and confirmed by manual checking. A genetic variant was verified only if it was observed in both forward and reverse orientations. In case of indel heterozygosity the genotype of the subject was confirmed using two independent forward and two reverse primers. The nomenclature of the identified polymorphisms was based on the GenBank reference sequences: NM_033043.1 GI:15451747 for CGB5, NM_033183.2 GI:146229337 for CGB8.
Allele frequencies were estimated and conformance to HWE was calculated (α = 0.05). In total, 8 rare SNPs in the 5′ upstream region of CGB5 deviated from HWE because one individual was homozygous for the minor allele of all of these SNPs.
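As an illustration of the HWE check, the following minimal sketch applies a one-degree-of-freedom chi-square goodness-of-fit test to the genotype counts of a single biallelic SNP. The text does not state which HWE test was used, so the chi-square form, the function name `hwe_chi_square`, and the example counts are assumptions made for illustration only.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square (1 df) test of Hardy-Weinberg proportions for one biallelic SNP.

    n_aa, n_ab, n_bb: observed counts of the three genotypes.
    Returns (chi2, p_value).
    """
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele B
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Skip classes with zero expectation (monomorphic site).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 379 subjects, one minor-allele homozygote, no heterozygotes.
chi2, p_value = hwe_chi_square(378, 0, 1)
print(f"chi2 = {chi2:.1f}, p = {p_value:.3g}")
```

With counts of this kind, a single minor-allele homozygote at an otherwise monomorphic rare site is enough to produce a formal deviation from HWE. For rare alleles the chi-square approximation is poor, so an exact test would normally be preferred in practice.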
Haplotypes were inferred from unphased genotype data using the Bayesian statistical method in the program PHASE 2.1.1 (http://www.stat.washington.edu/stephens/; 24), applying the model allowing recombination. The running parameters were: number of iterations = 1000, thinning interval = 1, burn-in = 100; the -X10 parameter was used to increase the number of iterations of the final run of the algorithm.
Sequence diversity parameters and neutrality tests were calculated using DnaSP (ver. 4.0; http://www.ub.es/dnasp/; 25) with the most probable phased haplotypes as the input sequences. The direct estimate of per-site heterozygosity (π) was derived from the average pairwise sequence differences, while Watterson’s θ represents an estimate of the expected per-site heterozygosity based on the number of segregating sites (S). The basis of Tajima’s D statistic (26) is the difference between the π and θ estimates: under neutral conditions π = θ and DT = 0. The Ewens-Watterson homozygosity test implemented in Arlequin 2.000 software (http://cmpg.unibe.ch/software/arlequin3/; 27) was used to test the hypothesis that haplotypes are selectively neutral. An excess of rare variants (= homozygosity excess) indicates directional selection, while an excess of intermediate frequency variants (= homozygosity deficiency) indicates balancing selection. The relationship between inferred haplotypes was analyzed with NETWORK 4.201 software (http://www.fluxus-technology.com) using the Median-Joining network algorithm (28). Haplotype networks of CGB5 and CGB8 were calculated using (i) SNPs located in the genic region from the transcription initiation site until the end of the mRNA and (ii) promoter SNPs located 5′ upstream of the genic region. Singleton polymorphisms, which cannot be reliably phased, were excluded from the network calculations, which were performed with default parameters. The descriptive statistic of linkage disequilibrium (LD), r², was calculated for pairs of markers and summarized using Haploview software (29).
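The relationship between π, Watterson’s θ and Tajima’s D can be made concrete with a short sketch. It uses the standard textbook formulas from Tajima's original definition and is not DnaSP's implementation; the function name `diversity_stats` and the toy haplotypes are hypothetical, and the input is assumed to be a list of equal-length phased haplotype strings.

```python
import math
from itertools import combinations


def diversity_stats(haplotypes):
    """Return S, per-site pi, per-site Watterson's theta and Tajima's D.

    haplotypes: list of equal-length strings, one per phased chromosome.
    """
    n = len(haplotypes)
    length = len(haplotypes[0])
    # S: number of segregating (polymorphic) sites.
    seg = sum(1 for i in range(length) if len({h[i] for h in haplotypes}) > 1)
    # k: mean number of pairwise differences between haplotypes.
    pairs = list(combinations(haplotypes, 2))
    k = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    theta_w = seg / a1  # Watterson's estimator for the whole region
    # Constants for the variance of (k - theta_w).
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    var = e1 * seg + e2 * seg * (seg - 1)
    d = (k - theta_w) / math.sqrt(var) if var > 0 else float("nan")
    return {"S": seg, "pi": k / length, "theta": theta_w / length, "D": d}


# Toy data: intermediate-frequency variants push D above zero,
# a lone singleton pushes it below zero.
balanced = diversity_stats(["AA", "AT", "TA", "TT"])
singleton = diversity_stats(["AA", "AA", "AA", "AT"])
print(balanced["D"] > 0, singleton["D"] < 0)
```

The sign of D mirrors the interpretation given in the text: a positive value reflects an excess of intermediate-frequency variants, a negative value an excess of rare variants.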
The significance of the association between the identified SNPs in the CGB5 and CGB8 genes and the occurrence of RM was tested using the Cochran-Armitage test for trend implemented in the statistical analysis package JMP® 6.0.3 with Genomics module 2.0.6 (http://www.jmp.com/software/genomics/). The same test was applied to address the interpopulation (Estonians, Finns) differentiation. Odds ratios (OR) with 95% confidence intervals (CI) were calculated to show the strength and direction of the association. In all tests, p<0.05 was considered statistically significant.
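The two statistics named above can be sketched in a few lines. This is a minimal illustration using the standard formulas (normal approximation for the trend test, Wald interval for the OR), not the JMP Genomics implementation. The function names and genotype counts are hypothetical. The allele counts in the OR example are an assumption: they are reconstructed from the minor allele frequencies reported in the Results for c5EF1038 (about 30 of 368 alleles in cases and 56 of 390 in controls), on the further assumption that the OR was computed on alleles.

```python
import math


def cochran_armitage_trend(cases, controls, weights=(0, 1, 2)):
    """Two-sided Cochran-Armitage trend test (normal approximation).

    cases, controls: counts of subjects with 0, 1 and 2 minor alleles.
    Returns (z, p_value).
    """
    r_total, s_total = sum(cases), sum(controls)
    n_total = r_total + s_total
    cols = [r + s for r, s in zip(cases, controls)]
    sum_wr = sum(w * r for w, r in zip(weights, cases))
    sum_wn = sum(w * n for w, n in zip(weights, cols))
    sum_w2n = sum(w * w * n for w, n in zip(weights, cols))
    u = n_total * sum_wr - r_total * sum_wn
    var = (r_total * s_total / n_total) * (n_total * sum_w2n - sum_wn ** 2)
    if var <= 0:
        raise ValueError("no genotype variation; the test is undefined")
    z = u / math.sqrt(var)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


def odds_ratio_wald(a, b, c, d, z_crit=1.96):
    """OR with Wald 95% CI; a/b = minor/major alleles in cases, c/d = in controls."""
    odds_ratio = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z_crit * se)
    upper = math.exp(math.log(odds_ratio) + z_crit * se)
    return odds_ratio, lower, upper


# Hypothetical genotype counts (0, 1, 2 minor alleles).
z, p = cochran_armitage_trend(cases=(80, 18, 2), controls=(60, 30, 10))
print(f"z = {z:.2f}, p = {p:.4f}")

# Allele counts reconstructed from reported MAFs (assumption, see lead-in).
or_, lo, hi = odds_ratio_wald(30, 338, 56, 334)
print(f"OR = {or_:.2f} [95% CI {lo:.2f}-{hi:.2f}]")
```

An OR below 1 with a confidence interval that excludes 1 corresponds to a protective minor allele. The reconstructed example gives an OR of about 0.53, in line with the published estimate, although the interval need not match the published one exactly because the original software and counts are not available here.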
We sequenced the entire genic and 5′ upstream regions of the CGB5 and CGB8 genes in a sample collection consisting of Finnish and Estonian patients with recurrent miscarriage (RM) (n=184; n=85 Finns, n=99 Estonians) and fertile controls (n=195; n=100 Finns, n=95 Estonians). For every subject the entire sequenced region covered 4.3 kbp (Fig. 1B-C). In total 71 variants were identified: 29 and 19 SNPs in the genic part of CGB5 and CGB8, respectively; 18 and 3 SNPs in the 5′ upstream regions of CGB5 and CGB8, respectively; and 2 SNPs 3′ downstream of CGB5 (Table 1). Among the 71 detected SNPs, 48 (68%) were novel variants not previously described in the dbSNP database (http://www.ncbi.nlm.nih.gov/SNP/) or the literature. Neither CGB5 nor CGB8 has been covered by the most recent version of HAPMAP (http://www.hapmap.org/; release March 2008). The diversity parameter π, which describes the mean nucleotide diversity per bp, differed between the genic and 5′ upstream regions (Table 2). Among fertile women the diversity of the CGB5 (π=2.71×10⁻³) and CGB8 (~π=2.01×10⁻³) 5′ upstream regions was approximately twofold higher compared to the genic regions of CGB5 (~π=1.69×10⁻³) and CGB8 (~π=9×10⁻⁴) (Table 2).
More than half of the identified variants (n=41; 58%) were shared by the Estonian and Finnish sample collections. In both sample collections there were 15 population-specific SNPs represented as single or low-frequency variants (<2%). The majority of the shared SNPs showed no differences (p>0.05) between the two study populations. A significant difference in allele frequencies was detected for 8 out of 71 SNPs, most being rare variants (Table 1). Linkage disequilibrium (LD) between the identified SNPs in the resequenced region was nearly absent in both population samples (Fig. 2).
Four SNPs represented non-synonymous amino acid changes: CGB5 p.Val76Leu in a single Finnish RM patient, CGB8 p.Arg28Trp and CGB8 p.Pro93Arg in single Estonian patients, and CGB8 p.Val49Ile in one Finnish and two Estonian patients, and also seven Estonian fertile women (Table 1). Further experimental studies have to be conducted before drawing any conclusions about their effect on the hormone function.
A case-control study targeting the association of identified CGB5 and CGB8 genetic variants with RM was carried out separately for the Estonian (RM cases n=99; fertile women defined as controls n=95) and the Finnish subjects (cases n=85; controls n=100) as well as for the joint dataset. The comparison of single marker and haplotype distribution in the two sample sets revealed low population stratification (Table 1, Supplementary Fig. S2) facilitating the joint analysis in order to increase the statistical power of the study.
In the full case-control sample set a significant association with RM was detected for the CGB5 5′ upstream polymorphisms (c5EF-155, c5EF-147, c5EF-144, c5EF-142) (p<0.03; OR=0.58 [95% CI 0.35-0.93]; Cochran-Armitage trend test) (Table 3). Analysis of the Estonian (p=0.083; OR=0.54 [95% CI 0.27-1.1]) and the Finnish (p≤0.131; OR=0.58 [95% CI 0.29-1.19]) subsamples supported a trend for association in both study populations independently, but the p-values did not reach statistical significance (p>0.05) due to reduced sample sizes. The significant association with all four CGB5 promoter polymorphisms results from a higher minor allele frequency (MAF) in fertile women (12.05%-13.08%) compared to the RM group (7.10%-7.92%). This difference between the control group and RM cases was consistent in both study populations: 13.16% compared to 8.08% in Estonians and 11-13% compared to 5.95-7.74% in Finns (Table 3). On the haplotype network of the CGB5 upstream region the promoter variants carrying the minor alleles of the four polymorphisms form a remote clade (H1-H2, H10-H11; Fig. 3A).
Among the CGB5 genic SNPs a strong protective effect was detected for the minor allele of intron 2 c5EF1038 (p<0.007; OR=0.53 [95% CI 0.32-0.85]), represented with the frequency 14.36% in fertile women compared to 8.15% in the RM group (Table 3). This effect reached statistical significance in the separate analysis of the Finnish subjects (p=0.036; OR=0.48 [95% CI 0.24-0.97]) and showed a trend for association in the Estonian (p=0.079; OR=0.57 [95% CI 0.30-1.08]) subsample. No increase in protection towards RM was detected for the combination of the minor alleles of the CGB5 5′upstream and the intronic SNP (data not shown).
Notably, the association of four CGB5 SNPs with the protective effect towards pregnancy loss was sufficiently robust to remain significant even when only the female RM patients (n=109) were considered as cases (Table 4). A separate analysis of male RM patients revealed similar trends for association and protective effect sizes as compared to female RM cases, although the p-values were non-significant, possibly due to the smaller sample size (n=75) that reduced the statistical power. However, the inclusion of both sexes gave a stronger effect than the gender-specific analysis in all but one SNP (c5EF1038; Tables 3, 4), further supporting the contribution of both maternal and paternal genes to reproductive success.
Population-specific associations were detected in the Finnish sample collection with two rare SNPs (MAF <10%) in CGB8: c8EF301 (p=0.034) and c8EF1045 (p=0.025) (Table 3). Interestingly, the protective variant in the intron 2 of CGB8 (c8EF1045) is located at the same position within the gene as the CGB5 intronic variant (c5EF1038).
The resequenced CGB8 5′upstream region stands out with only 3 SNPs (two common and one rare) compared to the respective region for CGB5 with 18 SNPs (Table 1). The rare allele A of SNP c8EF-4 was solely represented in patients, one from Finland and two from Estonia (Cochran-Armitage trend test, p=0.071). This polymorphism is located within the AP1-like sequence overlapping the HCG beta initiator element critical for basal transcription and downstream of the Ets-2 binding site acting as a major enhancer of HCG beta gene expression (30).
We applied two neutrality tests to explore the observed versus expected distribution of SNPs and haplotypes in the 5′ upstream region of CGB8. Both Tajima’s D statistic (DT=2.29, p<0.05; Table 2) and the Ewens-Watterson homozygosity test (p=0.007) indicated a possible scenario of balancing selection driving the three apparently most efficient CGB8 promoter variants (H1, H3, H4) to high frequency in both populations (Fig. 3B; Supplementary Table S2; Supplementary Fig. S2). The rare variant H2 carried the minor allele of c8EF-4, identified solely in patients. Notably, the haplotype combining the minor alleles of c8EF-287 (C; MAF=25.2%) and c8EF-186 (T; MAF=39.7%) is expected to be present at a frequency of 10%, but was not observed in the current study (Fig. 3B; Supplementary Table S2).
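The 10% expectation quoted above is consistent with multiplying the two minor allele frequencies, which is what one would expect if the two sites were statistically independent (linkage equilibrium). The text does not spell out how the expected value was derived, so that reading is an assumption, and the function names below are hypothetical.

```python
def expected_haplotype_frequency(allele_freqs):
    """Expected haplotype frequency if the listed alleles combine independently."""
    expected = 1.0
    for freq in allele_freqs:
        expected *= freq
    return expected


def ld_coefficient(observed_hap_freq, freq_a, freq_b):
    """Classical LD coefficient D = P(AB) - P(A) * P(B)."""
    return observed_hap_freq - freq_a * freq_b


# MAFs of c8EF-287 (C) and c8EF-186 (T) as reported in the text.
maf_287, maf_186 = 0.252, 0.397
expected = expected_haplotype_frequency([maf_287, maf_186])
# The minor-minor haplotype was not observed, so its observed frequency is 0.
d = ld_coefficient(0.0, maf_287, maf_186)
print(f"expected = {expected:.3f}, D = {d:.3f}")
```

Under this reading the product rounds to the 10% given in the text, and the complete absence of the haplotype corresponds to a negative D of the same magnitude.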
Here we report the first case-control study targeting the variation in HCG beta genes in association with recurrent miscarriages (RM). Most association studies on RM have so far focused on susceptibility variants of maternal genes involved in physiological adaptation to pregnancy, such as development of immunotolerance at feto-maternal interface or alterations in fibrinolytic and coagulation pathways. As these genes also contribute to complex diseases, the role of their variants in susceptibility to RM may not be specific (5, 8, 22, 31). HCG beta genes are expressed in blastocysts shortly after fertilization (20, 32) and are essential for successful implantation. Thus, a genetic variant of these genes is more likely to have an effect on pregnancy outcome. Our study focused on CGB8 and CGB5 that provide the major fraction of HCG beta mRNA transcripts and the resequencing method was chosen instead of genotyping.
The human CGB8 and CGB5 genes are located among the seven duplicate genes within the LHB/CGB gene cluster. Major complications in targeting duplicated genes in association studies are high sequence similarity (>92%), high diversity, a large number of population-specific variants and low LD due to high gene conversion activity (19, 33). These characteristics make it technically challenging to select reliable tag-SNPs and establish genotyping methods capable of targeting unique SNPs in duplicated genes. In the 379 subjects we identified only 14 out of 30 (47%) SNPs in CGB5 present in the public SNP database NCBI dbSNP and 9 out of 44 (20%) in CGB8. Several of the variants not observed in our study have been predicted in silico or by using high-throughput methods and may actually be multisite or paralogous gene variants (34). Alternatively, some of these SNPs could indeed represent variants specific to populations other than Estonian or Finnish. For example, an amino acid substitution Val79Met (nomenclature based on mature protein; from ATG p.Val99Met) in CGB5 exon 3 has been reported at a carrier frequency of 4.2% in a random population from the Midwest of the United States (35) but was absent in 580 DNA samples originating from five European populations (36). In the current study, in relatively large sample sets drawn from two neighboring populations, one third of the identified variants (MAF<2%) were found in only one population, although the sample size was sufficient to identify all common variants (MAF>5%) originally described in a large mutation screening of LHB/CGB genes (Table 1; 19). The full resequencing data collected in this study enabled us to identify several rare non-synonymous and promoter variants and to conduct haplotype analysis.
Consistent with the hypothesis of the study, we identified genetic variants in HCG beta genes significantly increasing or reducing the risk of RM. A protective effect was detected for the minor alleles of two SNPs (c5EF1038 and c8EF1045) located at identical positions in intron 2 of both CGB5 and CGB8 and for four CGB5 promoter variants (c5EF-155; c5EF-147; c5EF-144; c5EF-142). The carrier status of the minor alleles of these six SNPs reduced the risk of RM 1.7-fold in comparison to the wildtype carriers. Interestingly, the “protective” alleles of the CGB5 promoter SNPs form a motif (C-del-C-A; H2 on Fig. 3A) identical to the promoter sequence of CGB8 (Fig. 1D), which has been shown to be the most actively transcribed HCG beta gene (18). The actual contribution of these sequence variants to mRNA transcription and splicing efficiency is still to be explored.
The current data suggest that CGB8, and especially its promoter region, is under stronger functional constraint compared to CGB5 in spite of the high DNA sequence similarity (98-99%) between the two genes (Supplementary Fig. S1). Firstly, we detected less than half as many polymorphisms in the CGB8 genomic region (n=22) compared to CGB5 (n=49). Secondly, three rare CGB8 variants that may exhibit an effect on hormone action were present exclusively in RM patients (p.Arg28Trp, p.Pro93Arg and c8EF-4 within the proximal promoter) compared to only one such SNP in CGB5 (p.Val76Leu). Thirdly, the applied neutrality tests indicated balancing selection in the promoter region of CGB8, but not of CGB5. Additionally, we identified only three of the four predicted major CGB8 promoter haplotypes (Fig. 3B). The haplotype combining the minor alleles of c8EF-287 (MAF 25.2%) and c8EF-186 (MAF 39.7%) was absent in the current dataset in spite of the relatively high minor allele frequencies. The discrepancy between the observed (0%) and expected (10%) frequency may be explained by the localization of these SNPs within Sp1/AP-2 binding sites (37) residing in the region critical for the trophoblast-specific expression as well as the cAMP-responsiveness of HCG beta gene transcription (38, 39). Functional studies should reveal whether these sequence variants indeed possess a combinatory effect influencing the binding of the AP-2 and Sp1 transcription factors to the promoter of HCG beta and altering the transcription of the genes.
One of the key factors in obtaining reliable results in a case-control study is a clearly defined study group and replication of the results in an independent dataset. We applied parallel analysis of case-control sample sets collected from two neighboring countries in order to confirm the robustness of the association across populations. As the stratification was low between these populations, we also conducted a joint analysis of the two sample sets in order to raise the statistical power. Although there were minor differences in subject recruitment, the obtained results were concordant in the two populations and the strength of the detected associations increased in the analysis of the pooled dataset. In addition, we identified two gene variants lowering the risk of RM in the Finnish dataset only, possibly owing to the specific demographic history of the Finnish population (40).
In conclusion, these data from two populations provide the first evidence for the role of the variation in HCG beta genes in contributing to the susceptibility of RM. The findings encourage further studies addressing the functional effect of the identified promoter, intronic and rare protein-altering variants on HCG beta gene expression and HCG hormone activity. The diagnostic application of our findings may facilitate the improvement of early and preventive treatment of RM.
We thank the participating patients, Prof. Helle Karro and Dr. Andres Salumets for providing facilities for patient material collection, and Dr. Pille Hallast for advice on resequencing of the LHB/CGB genes.
Funding: M.L. is a Howard Hughes Medical Institute International Scholar (grant #55005617) and Wellcome Trust International Senior Research Fellow in Biomedical Science in Central Europe (grant 070191/Z/03/Z). In addition, funding for the work was partially provided by Estonian Science Foundation grants no. 5796 and 7471 (K.R., L.N., P.K., M.L.) and Estonian Ministry of Education and Science core grants no. 0182721s06 (M.L., K.R., L.N., T.M., P.K.) and 0182641s04 (K.R.); as well as by Sigrid Juselius Foundation & Finnish State Fund (V.M.U., M.K., K.A.) and Sanofi-Aventis (V.M.U.). K.R. and L.N. have been awarded fellowships from the Graduate School in Biomedicine and Biotechnology (1.0101-0167).
Author disclosure statement: The authors have nothing to disclose.