Uterine leiomyomata (UL) are the most common female pelvic tumors and the primary indication for hysterectomy in the United States. We assessed genetic liability for UL conferred by a known embryonic proliferation modulator, HMGA2, in 248 families ascertained through medical record-confirmed affected sister-pairs. Using a (TC)n repeat in the 5’ UTR and 17 SNPs spanning HMGA2, permutation-based association tests identified a significant increase in transmission of a single TC repeat allele (TC227) with UL (allele-specific p = 0.00005, multiple testing corrected min-p = 0.0049). The hypothesis that TC227 is a pathogenic variant is supported by a trend towards higher HMGA2 expression in TC227 allele-positive compared to non-TC227 UL tissue as well as by absence of culpable exonic sequence variants. HMGA2 has also been suggested recently by three genome-wide SNP studies to influence human height variation, and our examination of the affected sister-pair families revealed a significant association of TC227 with decreased height (allele-specific p = 0.00033, multiple testing corrected min-p = 0.016). Diminished stature and elevated risk of UL development have both been correlated with an earlier age of menarche, which may be the biological mechanism for TC227 effects, as women with TC227 in our study population tended to have an earlier onset of menarche. These results indicate that HMGA2 has a role in two growth-related phenotypes, UL predisposition and height, of which the former may affect future medical management decisions for many women.
UL are benign tumors, commonly referred to as fibroids, which arise from smooth-muscle cells of the uterine myometrium. Prevalence estimates of UL are as high as 77% based on serial sections of consecutive autopsy uterine specimens, and approximately 25% of reproductive-age women have clinically apparent tumors, many of which result in symptoms including menorrhagia, urinary dysfunction, constipation, abdominal discomfort and infertility (Buttram and Reiter 1981; Cramer and Patel 1990). Consequently, UL are the primary indication for hysterectomy, account for approximately 1 in 5 visits to a gynecologist, and require expenditures of greater than 2.1 billion health care dollars annually in the United States (Flynn et al. 2006; Hartmann et al. 2006; Lepine et al. 1997).
UL development has a genetic liability as evinced by monozygotic female twins being nearly twice as likely to be concordant for hysterectomy or for hospitalization due to UL as dizygotic twins (Luoto et al. 2000; Treloar et al. 1992). Heritability of UL is moderate to high with Finnish, UK, and Russian population estimates of 0.26, 0.69, and 0.79, respectively (Kurbanova et al. 1989; Luoto et al. 2000; Snieder et al. 1998). Two syndromes inherited in a Mendelian fashion, hereditary leiomyomatosis and renal cell cancer (HLRCC, OMIM 605839) as well as multiple cutaneous and uterine leiomyomata 1 (MCUL1, OMIM 150800), have UL as a diagnostic component (Launonen et al. 2001; Reed et al. 1973). Furthermore, approximately 25-40% of UL have simple and non-random cytogenetic abnormalities, many of which are recurrent and used to classify these tumors into subgroups (Rein et al. 1991).
The UL subgroup defined by the presence of a translocation between chromosomes 12 and 14, specifically t(12;14)(q15;q23-24), is associated with elevated HMGA2 expression (Gattas et al. 1999; Gross et al. 2003). HMGA2 is a non-histone component of chromatin that acts as an architectural factor to modulate transcription and plays a fundamental role in proliferation of mesenchymal tissues, including the myometrium from which UL arise. Normal adult human and mouse tissues have significantly less expression of HMGA2 than their proliferating embryonic counterparts, and a pygmy phenotype manifests in transgenic mice null for Hmga2 (Gattas et al. 1999; Zhou et al. 1995). In addition, during mouse embryogenesis, Hmga2 has an expression pattern similar to the distribution of connective tissue (a major component of UL), correlates directly with expression of the proliferation marker Hist4, and is notably absent from nonproliferative tissues (Hirning-Folz et al. 1998).
HMGA2 is not only associated with UL predisposition but has recently been implicated in another growth-related phenotype, human stature. Height in humans is a complex trait with a normal distribution and a high heritability estimated at 80% (Perola et al. 2007). Although linkage analyses have yielded multiple suggestive loci, neither HMGA2 nor, more broadly, the 12q15 region has been among them. Recently, a genome-wide SNP study of 4,921 individuals and an additional set of 19,064 persons identified a significant association of rs1042725 in the 3’ UTR of HMGA2 with human height. The only other SNP significantly associated with height, rs7968682, was 12 kb downstream of the 3’ UTR of HMGA2 and in linkage disequilibrium (LD) with rs1042725 (Weedon et al. 2007). Subsequent independent studies confirmed the association of these same two SNPs with height (Lettre et al. 2008; Sanna et al. 2008). Another 3’ UTR SNP, rs8756, is in strong LD with rs1042725 and rs7968682 and was shown to be associated with human stature in an Icelandic population (Gudbjartsson et al. 2008). The variability of SNPs associated with height in this region may be due to different levels of genetic isolation.
Based on this compelling biology, we evaluated HMGA2 as a potential modifier for UL predisposition and human stature in a population of medical record-confirmed sister-pairs affected with UL and their family members (ASF). We demonstrate significant association of a specific TC dinucleotide repeat allele (TC227) in the 5’ UTR of HMGA2 with both predisposition to UL and decreased height in White women. A trend of increased HMGA2 expression in UL tissue was also discovered in the presence of the TC227 allele. These findings raise the possibility of a common mechanism for the effect of TC227 on both UL development and height through induction of an earlier age of menarche.
Sister-pairs affected with UL and their family members were recruited domestically and internationally through medical and community advertisements and referrals, and consented to participate in the “Finding Genes For Fibroids” study (www.fibroids.net). All aspects of the study were reviewed and approved by the Human Research Committee of Partners HealthCare System. Study procedures included submission of a blood sample for DNA isolation and completion of detailed epidemiological surveys ascertaining clinical, reproductive, sexual, dietary, and family history (Huyck et al. 2008). Diagnosis of UL was confirmed through medical record review.
UL were collected to develop a tissue bank from consenting, premenopausal, 25-50 year-old women who underwent myomectomy or hysterectomy at Brigham and Women's Hospital from 2003 to 2007. RNA isolated from these tissue samples was used for analysis of HMGA2 expression.
DNA from each affected sister-pair study participant was isolated using a Puregene Blood Kit (Gentra, Minneapolis, MN, USA). Samples were genotyped for the TC repeat polymorphism of HMGA2 at the Massachusetts General Hospital Genomics Core Facility and at Boston University by PCR amplification of an ~220 bp region followed by gel fractionation with addition of the internal size standard GS500 TAMRA on an ABI 377 DNA sequencer and data analysis with GeneScan 3.1.2 and Genotyper 2.5. Two CEPH reference samples were run in triplicate by both genotyping facilities to assess genotyping quality and consistency. TC repeat genotype calls were based on those of a previous study that used the same primers (Ishwad et al. 1997). The samples were also genotyped at the Harvard Partners Center for Genetics and Genomics for 17 SNPs encompassing HMGA2 using iPLEX technology (Sequenom, San Diego, CA, USA). SNPs were selected with the goal of capturing large areas of LD and being representative of the regions with short stretches of LD.
A total of 248 affected sister-pair families were genotyped, and participants were categorized as White based on self-report. Transmission disequilibrium test-based family association evaluations between markers and UL or height were carried out using FBAT (version 1.7.3) (Rabinowitz 1997; Rabinowitz and Laird 2000). One family was excluded from further analysis due to Mendelian inconsistency. Also excluded were families self-reported as Black, based on a low frequency of the TC227 allele in this ethnic group. The global tests were implemented using the FBAT Monte Carlo-based min-p test, with min-p ≤ 0.10 as a screening criterion. Subsequent allele-specific association tests of single markers and haplotypes were restricted to variants with a Bonferroni corrected p-value of ≤ 0.05 under an additive genetic model. Association tests were calculated with no offset for the binary UL measure and with a sample mean offset for the height measure. Ancillary tests for association of additional outcomes and risk alleles were assessed using the FBAT Z statistic. Pairwise r2 measures of LD between biallelic SNPs were also calculated to identify and exclude from further analysis those SNPs providing redundant information based on an r2 ≥ 0.80. LD plots were generated using D’ in GOLD software (Abecasis and Cookson 2000) to allow direct incorporation of the multiallelic TC repeat marker into the SNP display. It is of note that the D’ pattern of LD for the SNPs tested is similar to that found using r2 measures. Haplotype association tests with two and three markers were performed, focusing on the LD region containing the TC repeat (NCBI HapMap build 36) to reduce the number of tests.
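The r2 redundancy screen described above can be illustrated with a minimal sketch. It assumes phased two-SNP haplotype counts are already available; the function names and the example counts are hypothetical and are not study data, and the software actually used for this step is not specified beyond what the text states.

```python
def pairwise_r2(n_ab, n_aB, n_Ab, n_AB):
    """r^2 between two biallelic SNPs from phased haplotype counts.

    The four arguments are counts of the two-locus haplotypes
    (hypothetical inputs; the study's haplotype counts are not reported).
    """
    total = n_ab + n_aB + n_Ab + n_AB
    p_ab = n_ab / total                # frequency of haplotype a-b
    p_a = (n_ab + n_aB) / total        # frequency of allele a at SNP 1
    p_b = (n_ab + n_Ab) / total        # frequency of allele b at SNP 2
    d = p_ab - p_a * p_b               # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic")
    return d * d / denom


R2_REDUNDANT = 0.80  # threshold stated in the text


def is_redundant(r2, threshold=R2_REDUNDANT):
    """True when two SNPs carry largely the same information."""
    return r2 >= threshold


# Hypothetical counts: strong but incomplete LD between two SNPs.
example_r2 = pairwise_r2(45, 5, 5, 45)
print(round(example_r2, 2), is_redundant(example_r2))
```

With these made-up counts the pair falls below the 0.80 threshold, so both SNPs would be retained for allele-specific testing.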
Height was further evaluated using a two sample t test to compare the mean between the TC227- and TC227+ groups.
A portion of each UL was placed directly in RNAlater solution (QIAGEN) or frozen in liquid nitrogen immediately after surgical removal. RNA was isolated using the RNeasy Fibrous Tissue kit (QIAGEN, Valencia, CA, USA). Real-time PCR was performed as previously described (Gross et al. 2003), using the standard curve method and normalizing the level of HMGA2 in each tissue to that of GAPDH. UL with t(12;14)(q15;q23-24), an abnormality which has been shown to significantly upregulate HMGA2 expression relative to karyotypically normal UL (Gattas et al. 1999; Gross et al. 2003), were excluded from the analysis. These t(12;14) UL were identified by FISH performed with probes RP11-185D13 located at 12q15 and CTD-3225F7 at 14q24 as previously described (Moore et al. 2004).
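The standard curve normalization mentioned above can be sketched as follows. This is a generic illustration of the method rather than the protocol of Gross et al. (2003): the curve parameters, Ct values and function names are hypothetical, and each standard curve is assumed to have the usual linear form Ct = slope × log10(quantity) + intercept.

```python
def quantity_from_ct(ct, slope, intercept):
    """Invert a standard curve Ct = slope * log10(quantity) + intercept."""
    return 10 ** ((ct - intercept) / slope)


def normalized_expression(ct_target, ct_reference, target_curve, reference_curve):
    """Target quantity divided by reference-gene quantity.

    Each curve is a (slope, intercept) pair fitted from a dilution series
    of known template amounts (hypothetical values below).
    """
    q_target = quantity_from_ct(ct_target, *target_curve)
    q_reference = quantity_from_ct(ct_reference, *reference_curve)
    return q_target / q_reference


# Hypothetical curves for HMGA2 (target) and GAPDH (reference).
hmga2_curve = (-3.32, 38.0)
gapdh_curve = (-3.32, 38.0)

ratio = normalized_expression(34.68, 31.36, hmga2_curve, gapdh_curve)
print(f"HMGA2 relative to GAPDH: {ratio:.3f}")
```

Dividing by the GAPDH quantity corrects for differences in RNA input and reverse transcription efficiency between tissue samples, which is why the ratio, not the raw HMGA2 quantity, is compared across tumors.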
The median and interquartile range of HMGA2 expression in the single largest, non-t(12;14) UL without evidence of either necrotic or admixed tissue from 58 independent subjects were compared between those with versus those without allele TC227.
The ASF study population included 248 self-reported White families with a mean ± standard deviation for height of 165.1 ± 8 cm, UL diagnosis age of 38 ± 5 years, and menarche age of 12 ± 1 years. We examined the association of HMGA2 with UL predisposition using a TC dinucleotide repeat ~550 bp upstream of the transcription start site in the 5’ UTR of HMGA2 and 17 flanking SNPs. The allelic distribution of the TC repeat in our ASF population was compared to that of a U.S. sample from a previous study which identified 19 polymorphic alleles comprising 18-37 TC repeats through alignment to an M13 sequencing ladder (Ishwad et al. 1997). Using the same genotyping assay in our ASF, we detected 19 alleles corresponding to 18-39 TC repeats. A similar allele frequency distribution was found between the two populations (Fig. 1).
Global tests for family-based association were used to identify those markers with a multiple testing corrected significant minimal p-value (min-p) of ≤ 0.10 (Fig. 2). Allele-specific tests of markers meeting this screening criterion revealed that only a single TC repeat allele, TC227 corresponding to 27 TC repeats, was significantly associated with UL development (allele-specific p = 0.00005, multiple testing corrected min-p = 0.0049, large sample Z = +4.052) (Table 1). The min-p method used for these analyses inherently corrects for multiple testing. All other markers that were initially nominally associated with UL presence in the global screening tests were not significant in the allele-specific testing (p-value ≥ 0.05). SNPs in strong LD with each other as determined by r2 ≥ 0.80 (SNPs rs2612060, rs1480475, rs1563834 and rs867633), which would thus provide redundant information, were excluded from allele-specific testing. Sequencing of the five exons of HMGA2 from 15 individuals with (n = 13) or without (n = 2) TC227 did not identify any additional suggestive variants.
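As a quick consistency check on the reported figures, a large-sample Z statistic can be converted to a two-sided p-value under the standard normal distribution. The sketch below is not FBAT code; the function name is hypothetical, and only the Z value is taken from the text.

```python
import math


def two_sided_p_from_z(z):
    """Two-sided p-value for a standard normal Z statistic."""
    # P(|Z| >= |z|) = erfc(|z| / sqrt(2)) for a standard normal variable.
    return math.erfc(abs(z) / math.sqrt(2))


p_ul = two_sided_p_from_z(4.052)  # Z reported for TC227 and UL
print(f"{p_ul:.5f}")
```

For Z = +4.052 the result rounds to 0.00005, in agreement with the allele-specific p-value reported for TC227 and UL.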
HMGA2 expression in the UL was measured by real-time PCR (Fig. 3). Tumor tissue from TC227 positive subjects had slightly higher HMGA2 expression (median = 5.39; interquartile range = [1.57, 8.58]; n = 11) compared to TC227 negative subjects (median = 2.15; interquartile range = [0.08, 5.03]; n = 47) (one-sided p-value = 0.16).
HMGA2 was also evaluated with height as the outcome in our ASF study population. After screening for markers with a global test for association using min-p (Fig. 2), the allele specific tests indicated that only the TC227 allele was significantly associated with height (allele-specific p = 0.0021, multiple testing corrected min-p = 0.016, large sample Z = −3.079) (Table 1), specifically decreased stature. Among women affected with UL, TC227 positive subjects were significantly shorter (mean = 162.8 cm, std. dev. = 8.0 cm, n = 152) than those who did not have the TC227 allele (mean = 164.3 cm, std. dev. = 6.8 cm, n = 544) at p-value = 0.0212. The direct relationship between decreased height and earlier age of menarche reported previously (Onland-Moret et al. 2005) was also examined in our ASF population; TC227 positive women showed a tendency to have an earlier onset of menstruation than non-TC227 women (large sample Z statistic = −0.95), but this was not statistically significant.
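The two-sample comparison of height can be reproduced from the summary statistics alone. The sketch below assumes an equal-variance (pooled) t test with a two-sided alternative; the text does not state which variant was used, so this is an assumption, and the function names are hypothetical. The p-value is obtained by numerically integrating the t density, which avoids any dependency beyond the standard library.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_norm = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_t_pvalue(t, df, steps=2000):
    """P(|T| >= |t|) using Simpson's rule on the density over [0, |t|]."""
    upper = abs(t)
    h = upper / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central = total * h / 3  # P(0 <= T <= |t|)
    return max(0.0, 1.0 - 2.0 * central)


def pooled_t_test(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance two-sample t test from means, SDs and group sizes."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / se
    return t, df, two_sided_t_pvalue(t, df)


# Summary statistics from the text: TC227 positive vs. TC227 negative.
t_stat, dof, p_value = pooled_t_test(162.8, 8.0, 152, 164.3, 6.8, 544)
print(f"t = {t_stat:.2f}, df = {dof}, p = {p_value:.4f}")
```

Under the pooled-variance assumption the test has 694 degrees of freedom and yields a p-value of about 0.021, in line with the reported value; an unequal-variance (Welch) test would give a somewhat larger p-value.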
UL represent a major health issue due to significant morbidity and high population prevalence. These tumors affect up to 77% of women and cause noticeable symptoms in approximately 25% of those of reproductive age (Buttram and Reiter 1981; Cramer and Patel 1990), so the control series in a case-control study is likely to contain many undiagnosed subjects, a potential design weakness. We therefore selected an affected sister-pair design (ASF) for a candidate gene approach to evaluate HMGA2, an embryonic proliferation modulator of mesenchymal tissue, which includes the myometrium from which UL arise. Examination of the ASF population using global testing for multiple markers (a multiallelic TC dinucleotide repeat in the 5’ UTR of HMGA2 and 17 surrounding SNPs) identified a highly significant association of the HMGA2 TC repeat with development of UL and decreased stature in White women; subsequent allele-specific tests narrowed this association to TC227, corresponding to 27 TC repeats.
As with any marker association study, these results raise the question of whether the TC repeat is a causal factor for UL susceptibility or is in LD with an unidentified pathogenic variant. Multiple lines of evidence suggest the former interpretation is highly plausible. First, the TC repeat marker in the 5’ UTR of HMGA2 is flanked upstream by SNP rs2261181 and downstream by the intron 3 SNP rs2854603. While the TC227 allele alone was significantly associated with UL development, haplotypes of any combination of the repeat and flanking SNPs were not significant (rs2261181-TC repeat p-value = 0.054; rs2854603-TC repeat p-value = 0.067; rs2261181-TC repeat-rs2854603 p-value = 0.18). Second, no additional suggestive variants were identified by sequencing of the five exons of HMGA2 from 15 individuals with or without TC227. Third, the same direction of association of TC227 and UL predisposition was observed in a separate smaller sample of 86 women collected for expression analysis, although this sample and the ASF sample could have arisen from the same underlying population, making it difficult to rule out LD on the basis of this evidence alone. Fourth, transient transfection of HeLa cells with three constructs containing 5’ sequences of HMGA2 with varying numbers of TC repeats upstream of a luciferase reporter showed that the TC repeat is a positive regulator of HMGA2 and that its length correlates with transcriptional activity (Borrmann et al. 2003). This type of functionality of simple sequence repeats in 5’ noncoding regions is supported by a growing body of literature (Chiba-Falek and Nussbaum 2001; Okladnova et al. 1998; Searle and Blackwell 1999; Shimajiri et al. 1999; Yamada et al. 2000).
HMGA2 expression analysis of tumors by real-time PCR further suggests that the TC227 allele may be causative of UL development. A trend of higher HMGA2 expression was found in tumor tissue from TC227 positive subjects compared to TC227 negative subjects (median two sample test: one-sided p-value = 0.16; n227+ = 11, n227− = 47). The ability to detect statistically significant differences was limited by the small number of available TC227 positive tissues. Further, tumor heterogeneity is of particular importance in this circumstance because the degree of cellularity in UL is inconsistent; HMGA2 is only expressed by UL smooth muscle cells and not by the fibrous connective tissue or vascular smooth muscle cells present in variable amounts in each tumor (Klotzbucher et al. 1999). Despite these caveats, the expression medians and ranges are consistent with a functional role for TC227.
HMGA2 also has an effect on another phenotype involving growth, human height. Allele specific tests in our ASF population of the TC repeat marker and surrounding SNPs demonstrated that only the TC227 allele was significant for association with height, particularly decreased stature. An earlier age of menarche has also been correlated with decreased height (Onland-Moret et al. 2005), and a tendency of women with TC227 to have an earlier onset of menstruation was found in the ASF population but was not statistically significant. This, in conjunction with the fact that women with early age of menarche have an elevated risk of developing UL (Marshall et al. 1998), raises the interesting possibility that TC227 may underlie UL predisposition and decreased height through an effect on age of menarche.
Two recent genome-wide SNP studies identified an association between human stature and SNPs in or near HMGA2 in LD with each other (rs1042725 in the 3’ UTR of HMGA2 and rs7968682 12 kb downstream) (Sanna et al. 2008; Weedon et al. 2007). The negligible LD between regions containing the TC repeat and SNPs rs7968682 or rs1042725 (data not shown), combined with a lack of local SNPs representing the TC repeat region in the SNP panel employed by the recent studies, raises the possibility that different areas of HMGA2 could affect height through independent mechanisms. There is also the potential, although no current evidence, that an unknown variant in long range LD with HMGA2 is responsible for the observed associations. Interestingly, the 3’ UTR containing the significant SNPs from the genome-wide studies is known to encode multiple let-7 miRNA target sites that function to regulate HMGA2 expression (Lee and Dutta 2007).
This study suggests that HMGA2 has a broad spectrum of effects on growth from UL tumorigenesis to height through the action of the TC227 allele. These initial findings ultimately will need to be confirmed in independent populations of UL subjects as they become available, highlighting the urgency for collection of additional medically-confirmed UL study populations. Phenotypic effects of such 5’ UTR dinucleotide repeats are increasingly recognized for their influence on gene expression. A mechanistic hypothesis for TC227 is that it promotes the formation of non-B-DNA triple helical structures, creating a subtle but constitutively increased level of HMGA2 in adult tissues that affects height and UL growth through aberrant gene expression in susceptible mesenchymal tissues.
In summary, genetic and functional data presented in this study suggest HMGA2 is a UL predisposition gene and identify a variant that may contribute to formation of these tumors as well as influence stature in women. Whether the association with TC227 is significant in influencing height in men remains to be studied. The present findings in UL pathogenesis could inform future medical choices of women (Stewart and Morton 2006).
The authors thank Weining Lu for real-time PCR instruction, Rita Cantor for helpful comments on the manuscript, Efthymia Melista and Alison Brown for assistance with genotyping, and all the women and their families who participated in this study. This work was supported by NIH grants RO1HD046226 and RO1CA78895 (to CCM). JCH was supported by T32GM007748 and KLH by a Howard Hughes Medical Institute Predoctoral Fellowship in Biological Sciences.