Menin is the protein product of the MEN1 tumor-suppressor gene; one allele of MEN1 is inactivated in the germ line of patients with “multiple endocrine neoplasia type 1” (MEN1) cancer syndrome. Menin interacts with several proteins involved in transcriptional regulation. RNA expression analyses have identified several menin-regulated genes that could represent proximal or distal interaction sites for menin. This report presents a substantial and unbiased sampling of menin-occupied chromatin regions using Serial Analysis of Chromatin Occupancy; this method combines chromatin immunoprecipitation with Serial Analysis of Gene Expression. Hundreds of menin-occupied genomic sites were identified in promoter regions (32% of menin-occupied loci), near the 3′ end of genes (14%), or inside genes (21%), extending other data about menin recruitment to many sites of transcriptional activity. A large number of menin-occupied sites (33%) were located outside known gene regions. Additional annotation of the human genome could help in identifying genes at these loci, or these might be gene-free regions of the genome where menin occupancy could play some structural or regulatory role. Menin occupancy at many intragenic positions distant from the core promoter reveals an unexpected type of menin target region at many loci in the genome. These unbiased data also suggest that menin could play a broad role in transcriptional regulation.
Multiple endocrine neoplasia type 1 (MEN1) is a cancer syndrome predisposed by heterozygous germ-line mutations in the MEN1 tumor-suppressor gene (OMIM no. 131100). Somatic inactivation of the normal MEN1 allele in a predisposed cell initiates a clonal tumor of parathyroid, enteropancreatic neuroendocrine, anterior pituitary, or other tissues. Biallelic somatic loss of MEN1 has also been detected commonly in sporadic tumors of similar tissues. The MEN1-encoded menin protein is expressed in all normal tissues and is predominantly nuclear. Menin interacts with a variety of transcription factors and chromatin-modifying proteins: AP1 transcription factor JunD; NFκB proteins p50, p52, and p65; homeobox-containing protein Pem; TGFβ-induced protein Smad3; BMP-2-induced proteins Smad1, Smad5, and Runx2; corepressor mSin3A; and the MLL1/MLL2-containing COMPASS-like protein complex (reviewed in Agarwal et al.). These interactions of menin with transcriptional regulatory proteins can produce either a suppressing effect or an enhancing effect on gene expression. Therefore, transcriptional regulation (without menin necessarily binding directly to a specific DNA sequence) seems to be an important physiological activity of menin.
Analysis of menin target genes is an obvious avenue to understanding menin's function. Direct participation of menin in the regulation of some genes (Hoxa7, Hoxa9, Hoxa10, Hoxc8, FoxC1, FoxC2, hTERT, IGFBP2, Meis1, p18, and p27) has been suggested by chromatin immunoprecipitation (ChIP) analyses [3–10]. Furthermore, in specific promoter-based luciferase assays, overexpression of menin modulated the promoter activity of p18, p27, rat insulin, human prolactin, human cFos, human PAI2, and mouse IGFBP2 [3,11]. In addition, cDNA or oligonucleotide microarray techniques have revealed menin-regulated genes by comparing gene expression in cell lines (vector-transfected versus MEN1-transfected) [5,7,12–14], in Men1+/+ versus Men1-/- mouse embryos, or in human MEN1 tumors [15,16]. Menin target genes in one tissue have shown minimal overlap with menin target genes in other tissues.
Recent advances in applying ChIP with DNA microarray (ChIP chip) or cloning techniques have been helpful in identifying novel target genes or DNA-binding sites of several proteins in the context of the whole genome [17–21]. These methods have shown that selected proteins with binding-site specificity occupy far more DNA sites than previously suspected. Serial Analysis of Chromatin Occupancy (SACO) is one such approach that combines ChIP with the Serial Analysis of Gene Expression (SAGE) technique, and it has been successfully used to identify genomewide cAMP-responsive element-binding protein (CREB) targets. Unlike CREB, menin does not possess any obvious DNA-binding domain, nor are there any specific recognized DNA sequences that bind menin. Thus, menin's binding to DNA may be facilitated indirectly by partnership with other transcriptional regulators. In the current report, SACO is used to survey menin-binding sites in human genomic DNA.
HeLa-S3 cells (ATCC, Manassas, VA) were grown in complete Dulbecco's modified Eagle's medium (supplemented with 10% fetal calf serum, 2 µM glutamine, and 100 mg/ml penicillin-streptomycin) at 5% CO2. Normal rabbit IgG was purchased from Santa Cruz Biotechnologies (Santa Cruz, CA), and antimenin (BL342) was obtained from Bethyl Laboratories (Montgomery, TX).
HeLa cells were fixed at room temperature in 1% formaldehyde/1x phosphate-buffered saline (PBS) for 20 minutes. Cells were scraped in cold harvesting buffer (100 mM Tris-HCl pH 9.4 and 10 mM DTT) and pelleted by centrifugation at 3000g for 5 minutes at 4°C. Cell pellets were washed with cold 1x PBS, and 10⁷ cells were lysed in 0.6 ml of lysis buffer [20 mM Tris-HCl pH 8.0, 150 mM NaCl, 0.1% sodium dodecyl sulfate (SDS), 0.5% Triton X-100, and protease inhibitors (Roche Molecular Biochemicals, Indianapolis, IN)]. Chromatin lysates were sonicated with an ultrasonic processor (Model GE 750; PGC Scientific, Frederick, MD) to an approximate DNA size of 1000 bp and below, then centrifuged for 10 minutes at 13,000 rpm at 4°C. Supernatants were transferred to fresh tubes, and each 0.6-ml aliquot of the lysate was precleared with 80 µl of washed and bovine serum albumin (BSA)-blocked 50% protein A-Sepharose (Amersham Pharmacia, Piscataway, NJ) by rocking for 1 hour at 4°C. Immunoprecipitation was performed overnight at 4°C with 4 mg of antimenin antibody or normal rabbit IgG as control. Immune complexes were captured with 80 µl of 50% protein A-Sepharose slurry for 1 hour at 4°C. Beads were collected by centrifugation at 8000 rpm for 1 minute and washed as follows: four times in lysis buffer for 10 minutes, once in LiCl buffer (0.25 M LiCl, 1% NP-40, 1% deoxycholate, 1 mM EDTA pH 8.0, and 10 mM Tris-HCl pH 8.0), once with 1x TE pH 8.0 for 30 minutes, and then once with 1x TE for 5 minutes. Chromatin protein/DNA complexes were eluted from the beads twice by adding 100 µl of elution buffer (1% SDS and 0.1 M NaHCO3 pH 8.0) at room temperature for 15 minutes each. The beads were collected by centrifugation at 13,000 rpm for 1 minute, and eluates were pooled and heated at 65°C overnight to reverse crosslinks. DNA fragments were purified using the QIAquick PCR purification kit (Qiagen, Valencia, CA).
A SACO library was prepared using antimenin ChIP DNA obtained from 6 × 10⁷ HeLa cells. An outline of the menin-SACO library construction is shown (Figure W1). A modified version of the Long-SAGE protocol was used to create ditags. Ditag concatemers were cloned into pZErO (Invitrogen, Carlsbad, CA) kanamycin vector and transformed by electroporation into E. cloni 10G electrocompetent cells (Lucigen, Middleton, WI). This antimenin plasmid SACO library was titered, and glycerol stocks were prepared from transformed bacteria. The average number of ditags in plasmids was analyzed by polymerase chain reaction (PCR) using vector primers flanking the insert.
Sequencing of SACO library plasmids was performed at Rexagen/Regulome (Seattle, WA). Approximately 5000 plasmids were sequenced to obtain the sequence of at least 40,000 tags. Concatemer sequences were extracted from chromatograms with the base caller “phred” using recommended settings. A custom “perl” script separated ditags at all CATGs. Duplicate sequences were removed from the analysis (same set of concatemerized tags). The resulting 21-bp SACO genomic signature tags (GSTs) were matched to genomic CATG sites using a C program. GSTs with exact matches or matches with one substitution error that were uniquely assignable to a genomic location were considered as positives. GSTs without a unique genomic match or with multiple unique matches were not considered. GSTs within 2 kb of each other were taken to be associated with the same locus. A set of scripts that automate the analysis of SACO data is available online at http://genome.bnl.gov/SACO/. Menin-SACO raw data are available at http://saco.ohsu.edu/. Additional details on the analysis of SACO loci have been published.
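The locus-assignment rule described above (uniquely mapped GSTs within 2 kb of each other belong to the same locus) can be sketched as follows. This is a minimal illustration and not the published Perl/C pipeline: the function name `cluster_gsts`, the `(chromosome, position)` tuple layout, and the single-linkage reading of “within 2 kb of each other” are all assumptions made for the example.

```python
from collections import defaultdict

def cluster_gsts(gsts, max_gap=2000):
    """Group uniquely mapped GSTs into loci.

    gsts: iterable of (chromosome, position) tuples (hypothetical layout).
    Neighbouring GSTs on the same chromosome whose positions differ by at
    most max_gap bp are chained into one locus (single linkage, an assumed
    interpretation of the 2-kb rule).
    Returns a list of (chromosome, [sorted positions]) loci.
    """
    by_chrom = defaultdict(list)
    for chrom, pos in gsts:
        by_chrom[chrom].append(pos)

    loci = []
    for chrom in sorted(by_chrom):
        positions = sorted(by_chrom[chrom])
        current = [positions[0]]
        for pos in positions[1:]:
            if pos - current[-1] <= max_gap:
                current.append(pos)            # still within 2 kb: same locus
            else:
                loci.append((chrom, current))  # gap too large: close the locus
                current = [pos]
        loci.append((chrom, current))
    return loci
```

For example, GSTs at positions 100 and 1500 on one chromosome would fall into a single locus, whereas a GST at position 10,000 on the same chromosome would form a separate one.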
The genomic sequence (2 kb) flanking the GST median was copied from ENSEMBL. Primers were designed using MIT's (Whitehead Institute, Cambridge, MA) Primer3 software (http://frodo.wi.mit.edu/cgi-bin/primer3/primer3_www.cgi) such that primer pairs would amplify 200- to 300-bp products located close to the GST. Primer sequences are available on request. For GST confirmation, independent ChIP assays were performed using 6 × 10⁷ HeLa cells. Quantitative PCR (qPCR; 25 µl) reactions were performed in duplicate using Mx3000P (Stratagene, La Jolla, CA) and the SYBR-green qPCR kit (Stratagene). PCR conditions were as follows: 95°C for 10 minutes; 40 cycles of 95°C for 30 seconds, 55°C for 60 seconds, and 72°C for 30 seconds. Antimenin and rabbit IgG ChIPs were expressed as nanograms of gel-purified (Qiagen) amplicon. Products showing a greater-than-two-fold enrichment relative to an IgG control were considered confirmed. Each amplicon was analyzed by agarose gel electrophoresis, and those yielding multiple products or no products were discarded and primers were redesigned. Other primer pairs used for ChIP-PCR were as follows: positive control primer pairs for Hoxc8 and Hoxa9 promoters, and seven negative control primer pairs (ch12:2997, H2A, H2B, ch12:6151, LRP, MycP2, and Tubulin).
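The confirmation criterion above (more than two-fold enrichment of the antimenin ChIP over the IgG control) amounts to a simple ratio test. The sketch below is illustrative only: the function names are hypothetical, and averaging the duplicate qPCR reactions before taking the ratio is an assumption, since the text does not say how duplicates were combined.

```python
from statistics import mean

def fold_enrichment(menin_ng, igg_ng):
    """Ratio of mean antimenin signal to mean IgG signal.

    Both arguments are lists of amplicon quantities (ng) from replicate
    qPCR reactions; averaging replicates is an assumption of this sketch.
    """
    control = mean(igg_ng)
    if control <= 0:
        raise ValueError("IgG control quantity must be positive")
    return mean(menin_ng) / control

def is_confirmed(menin_ng, igg_ng, threshold=2.0):
    """True when enrichment over IgG is strictly greater than the threshold."""
    return fold_enrichment(menin_ng, igg_ng) > threshold
```

With duplicate readings of 4.0 and 6.0 ng for antimenin and 2.0 and 2.0 ng for IgG, the enrichment is 2.5-fold and the locus would count as confirmed; an enrichment of exactly 2.0 would not.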
Total RNA was isolated from two independent culture dishes of exponentially growing HeLa cells with Trizol (Invitrogen) and further purified using RNeasy (Qiagen). Each sample was analyzed at the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) microarray core facility using the Affymetrix microarray platform (Affymetrix, Santa Clara, CA). Labeled samples were hybridized to the Affymetrix human genome U133 Plus 2.0 array. Microarray data were normalized and analyzed using Affymetrix GeneChip software Microarray Analysis Suite 5.0. Expression levels were assigned based on positive or negative signals for gene expression, as per Affymetrix's detection call of “present” or “absent” (a signal intensity of > 200 arbitrary units was considered “present”).
Menin expression has been detected in all tissues and cell lines studied so far, except in menin-null tumor tissues or menin-null mouse embryo fibroblasts (MEFs). HeLa cells have been used for performing transcriptional reporter assays involving menin-interacting proteins and for isolating a menin-containing COMPASS-like protein complex. Hence, menin activities and targets almost certainly exist in HeLa cells. Thus, HeLa cells seemed appropriate for the initial analysis of menin-associated targets at a genomic scale. A SACO library was prepared from a HeLa cell chromatin extract after ChIP with an antimenin antibody. The menin-SACO library contained in excess of 3 × 10⁶ GSTs. Close to 5000 plasmids containing concatemerized ditags were sequenced, and after removing duplicate sequences (possessing the same order of tags in concatemers), the sequence from 38,743 GSTs was analyzed. This may not be sufficient for an exhaustive analysis of menin targets in the entire genome, but a representative sampling of menin occupancy across the genome was accomplished. Of the 38,743 GSTs, 32,956 were distinct, of which 22,191 (67%) could be mapped to unique loci in the current build of the human genome sequence (Hg 17). A total of 8369 (25%) GSTs mapped to multiple locations belonging to repetitive sequences, and 2396 (8%) GSTs could not be located in the human genome sequence. The specificity of SAGE-type library analysis is increased by considering loci represented by multiple tags or GSTs. Including single hits would increase the number of menin targets eight-fold. The number of likely false positives makes analysis of this data set of single hits difficult and unreliable. Other groups working with techniques similar to SACO also do not consider single hits in their analysis of target loci [e.g., in the analysis of p53 target loci (by ChIP PET) using two statistical analysis methods, Wei et al. concluded that singletons were most likely background, and Kim et al. have shown that identification of chromosomal targets by sequence tag analysis of genomic enrichment (STAGE) could be validated by independent methods when target genes were designated by multiple occurrences of STAGE tags]. Therefore, all analyses of the menin-SACO library focused on loci represented by more than one GST (Figure 1). These include 2616 GSTs representing 1162 unique loci (unique locus = GSTs mapping within 2 kb of each other). The 2-kb interval was chosen based on a natural cutoff in the distribution of GSTs.
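The tag accounting in this paragraph can be checked with a few lines of arithmetic. The sketch below uses only the counts quoted in the text; the variable and function names are illustrative.

```python
# Counts reported in the text for the sequenced menin-SACO library.
distinct_gsts = 32_956
uniquely_mapped = 22_191
repetitive = 8_369
unmapped = 2_396

def percent(part, whole):
    """Share of `whole` represented by `part`, in percent."""
    return 100.0 * part / whole

# The three mapping outcomes should partition the distinct GSTs.
partition_ok = uniquely_mapped + repetitive + unmapped == distinct_gsts

shares = {
    "unique": percent(uniquely_mapped, distinct_gsts),
    "repetitive": percent(repetitive, distinct_gsts),
    "unmapped": percent(unmapped, distinct_gsts),
}

# Loci retained for analysis: those represented by more than one GST.
multi_hit_gsts = 2_616
multi_hit_loci = 1_162
mean_gsts_per_locus = multi_hit_gsts / multi_hit_loci

print(partition_ok, {k: round(v, 1) for k, v in shares.items()},
      round(mean_gsts_per_locus, 2))
```

The three categories sum exactly to the 32,956 distinct GSTs, and the retained loci carry a little over two GSTs each on average. The unmapped share computes to roughly 7%, slightly below the 8% quoted; the quoted percentages may have been adjusted to sum to 100.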
Among the 1162 menin-occupied loci represented by multiple GSTs, 778 (67%) loci mapped to at least one mRNA or gene predicted by ECgene EST and mRNA clustering annotation (UCSC genome browser). These consisted of 371 (32%) loci located within 2 kb of the 5′ end of annotated genes (defined as the most 5′ region of UCSC “known gene” or ENSEMBL gene annotation), 163 (14%) loci located within 2 kb of the 3′ end of annotated genes, and 244 (21%) loci located inside genes (not within 2 kb of the 5′ and 3′ ends of a gene). The remaining 384 (33%) menin-occupied loci were located > 2 kb away from any known or predicted gene. The 778 loci represented 635 characterized genes. These 635 genes included 157 genes that encoded hypothetical/predicted proteins. Note that a single gene could be represented by multiple menin-occupied loci if the “unique loci” mapped within or > 4 kb from each other in the same gene. This was observed for 70 genes in the current data set. The properties of these 70 genes were unremarkable.
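The four location categories used above can be expressed as a small classification rule. The following is a sketch under stated assumptions rather than the annotation procedure actually used: the function name, the coordinate convention (`gene_start < gene_end` on the forward strand), and the choice to test the 5′ window before the 3′ window for very short genes are all hypothetical.

```python
def classify_locus(pos, gene_start, gene_end, strand, window=2000):
    """Classify a locus position relative to one annotated gene.

    gene_start < gene_end are genomic coordinates; strand is '+' or '-'.
    The 5' window is checked first (an assumption for genes shorter than
    the combined windows).
    """
    five_prime = gene_start if strand == "+" else gene_end
    three_prime = gene_end if strand == "+" else gene_start
    if abs(pos - five_prime) <= window:
        return "5' end"
    if abs(pos - three_prime) <= window:
        return "3' end"
    if gene_start < pos < gene_end:
        return "intragenic"
    return "outside"

# Category counts quoted in the text; they should add up to all 1162 loci.
counts = {"5' end": 371, "3' end": 163, "intragenic": 244, "outside": 384}
near_gene = counts["5' end"] + counts["3' end"] + counts["intragenic"]
total = near_gene + counts["outside"]
```

The quoted counts are internally consistent: the three gene-associated categories sum to 778 loci, and adding the 384 loci outside genes gives 1162.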
Thus, as expected, menin was predominantly located near the 5′ ends of annotated genes; at the same time, the surprisingly high frequency of menin occupancy at the 3′ end and inside genes could be explained by the location of as yet unknown regulatory elements or unknown genes in these regions.
ChIP followed by PCR (ChIP-PCR) was performed for 51 menin-occupied loci represented by more than one GST and located within 2 kb of the 5′ end of annotated genes. qPCR of antimenin ChIP DNA from HeLa cells confirmed menin occupancy at 94% (48 of 51) of loci, specifically enriched by more than two-fold over anti-IgG ChIP control (range of fold change, 0.12–10.0) (Figure 2). qPCR data also verified menin occupancy at the Hoxc8 and Hoxa9 promoter regions that have been previously reported as menin targets [5,9,10,25]. Negative control primer pairs (n = 7) did not show a > 1.6-fold enrichment over anti-IgG ChIP control (range of fold change, 1.1–1.6). Also tested were 26 menin-occupied loci that mapped inside genes (Figure 2), of which 24 showed a more-than-two-fold enrichment in the antimenin ChIP over IgG control ChIP (range of fold change, 1.7–16.5). Therefore, nearly all of the menin-occupied genomic loci identified by sequencing the menin-SACO library and tested were also enriched in independent ChIP-PCR analysis, confirming the robustness of menin-binding sites identified by SACO.
The current study has identified many menin-occupied DNA sites outside the 5′ end of genes. The SACO analysis reported here is consistent with an independent approach that we have used with ChIP chip and arrays containing 20,000 human promoters, an end-to-end coverage of 381 genes, and an additional 20 Mb on chromosome 7. Both analyses showed many interaction loci of menin in promoter regions, but also many other menin interaction loci outside these regions. The sensitivity of detecting weaker direct protein-DNA interactions or weaker indirect protein-protein-DNA interactions would depend on the quality of the ChIP step that is common to both SACO and ChIP chip. Therefore, both techniques would be equally limited in their sensitivity to weak interactions. With the current menin-SACO data set, a representative sampling of menin occupancy across the genome was accomplished. From ChIP chip menin data, there were at least 1706 promoters in HeLa cells that bound menin. In the present report, after partial sequencing of the menin-SACO library, 371 menin-occupied loci were found near promoter regions. The ChIP chip approach may indeed have missed some targets. Therefore, we estimate that approximately 22% (or less) of the genome is being interrogated with the current menin-SACO data set. A comparison has not yet been performed between SACO-identified menin-occupied sites and those sites identified by ChIP chip analysis in HeLa cells.
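The 22% estimate appears to follow from the ratio of the two promoter counts given above; the short calculation below makes that reading explicit. Treating the ChIP chip promoter count as the denominator is an assumption inferred from the surrounding sentences, and the variable names are illustrative.

```python
# Promoter-level counts quoted in the text.
chip_chip_promoters = 1706   # promoters bound by menin in HeLa cells (ChIP chip)
saco_promoter_loci = 371     # menin-occupied loci near promoters (SACO)

# Fraction of ChIP chip promoter targets recovered by partial SACO sequencing.
coverage = saco_promoter_loci / chip_chip_promoters
coverage_percent = round(100 * coverage)
```

Because ChIP chip may itself have missed targets, the true number of menin-bound promoters could exceed 1706, which is why the resulting figure is best read as an upper bound ("or less").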
A location map of the 1162 menin-occupied loci using “ENSEMBL KaryoView” (http://www.ensembl.org/Homo_sapiens/karyoview) showed even distribution among subchromosomal loci (data not shown). No significant association of menin GSTs at regions of gene clustering was observed when we examined this subchromosomal distribution of menin-occupied loci.
It is important to identify the nature of menin-occupied DNA sites and to understand menin's functions at sites of menin occupancy. Menin is known to regulate AP1-activated transcription at AP1-binding sites [27,28]. A directed search for AP1-binding sites “TGAGTCA” or “TGACTAA” near the 1162 menin-occupied loci (using a 1-kb flanking sequence) revealed that these sites occurred at a frequency of 7.14 × 10⁻⁵ and 6.28 × 10⁻⁵, respectively. These frequencies were not significantly higher than the frequency in the human genome of “TGAGTCA” (8.08 × 10⁻⁵) or the frequency of a random heptanucleotide (6.52 × 10⁻⁵) with the same GC content as the AP1-binding site.
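A motif-frequency comparison of this kind can be sketched as below. This is not the search actually performed (which was run through the NCBI User Service): the function names are hypothetical, frequency is assumed to mean matches per scanned start position on one strand, and the background model assumes independent bases with a genome-wide GC content of about 41%, a commonly cited approximation that is not stated in the text.

```python
def motif_frequency(sequences, motif):
    """Matches of `motif` per scanned start position (overlaps counted)."""
    motif = motif.upper()
    k = len(motif)
    hits = 0
    windows = 0
    for seq in sequences:
        seq = seq.upper()
        n = len(seq) - k + 1
        if n <= 0:
            continue  # sequence shorter than the motif
        windows += n
        hits += sum(1 for i in range(n) if seq[i:i + k] == motif)
    return hits / windows if windows else 0.0

def expected_frequency(motif, gc_content=0.41):
    """Chance frequency of `motif` under an independent-base model.

    gc_content=0.41 is an assumed genome-wide value, not taken from the text.
    """
    p_gc = gc_content / 2.0          # probability of G, and of C
    p_at = (1.0 - gc_content) / 2.0  # probability of A, and of T
    prob = 1.0
    for base in motif.upper():
        prob *= p_gc if base in "GC" else p_at
    return prob
```

Under these assumptions, `expected_frequency("TGAGTCA")` comes out close to the 6.52 × 10⁻⁵ background quoted above, which suggests that the quoted figure reflects a similar GC-adjusted chance expectation.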
Sequence analysis of the DNA-binding site motifs of other menin-interacting transcription factors was not performed because either their consensus-binding sites were not known (Pem, Ches1), the sequence was small (CAGA for Smad proteins), or the consensus sequence was variable and sequence subunit specificity for the binding site was not known (GGGRNNYYCCC for NFκB-p50, NFκB-p52, and NFκB-p65). We are interested in performing these analyses when more extensive data are available. In addition, Smad proteins have been reported to regulate transcription from AP1 sites. Furthermore, the AP1 transcription factor JunD is, so far, the most promising candidate as a valid menin partner. We have also tried to find a common menin-DNA interaction motif at the 1162 menin-occupied loci represented by multiple GSTs, but searches have so far not been successful (unpublished data).
It is not known if any of the 244 intragenic menin-occupied regions participates in transcriptional regulation. The functional significance of factor occupancy at intragenic loci and the role of intragenic loci in regulating transcription are being actively pursued in several laboratories [18,21,31]. Therefore, further studies might also shed light on the role of menin occupancy at intragenic loci. Given menin's presence in protein complexes that modify transcriptionally active chromatin, one possibility is that menin occupancy identified at intragenic regions may coincide with regions where menin could track the transcription process along the gene as a component of protein complexes.
Therefore, the current analysis further highlights recent evidence of the broad role of menin in transcriptional regulation.
Menin-occupied loci were examined for the presence of CpG islands within 2 kb of GSTs. Among the 1162 loci that could be located near mRNA, CpG islands were found near 61% of menin-occupied loci that were near the 5′ end of genes, 5% of the 3′-end loci were near CpG islands, and 6% of the loci inside genes were near CpG islands. For loci that could not be located to mRNA, only 8% were near CpG islands. Therefore, the possibility of a gene(s) being located near these orphan loci could be very low. They might end up being “inside” genes (intragenic) based on additional annotations of the human genome.
When transcriptional regulatory proteins are shown to occupy loci near the 3′ end of genes or inside genes, bidirectional or antisense transcription is generally suspected to occur near such regions. Based on the less abundant occurrence of CpG islands near the menin-occupied loci that were represented by GSTs mapping near the 3′ end (5% near CpG islands) or inside genes (6% near CpG islands) compared to those near the 5′ ends (61%), it is possible that these loci may not occur near sites of bidirectional or antisense transcription because such regulatory regions are reported to be associated with CpG islands.
To analyze the types of genes near menin-occupied loci, Gene Ontology (GO) biological process categories were assigned to the 635 menin target genes by using GOstat. Hypothetical genes (n = 157) were not considered for GO assignment. Functional categories of 478 menin target genes are summarized (Table 1). In comparison to the categories of all genes, the most overrepresented categories are genes important or predicted to be important in cellular metabolism (51%), macromolecule metabolism (33%), and cell cycle (7%). In the GO function hierarchy, a few genes belong to multiple categories and were thus scored more than once. The large number of genes without annotations or functions did not allow a thorough classification of the entire sample.
The occupancy of a locus by menin does not indicate whether the gene is expressed, nor does it specify the direction of any regulation by menin. To find out the expression status of menin-occupied target genes, total RNA preparations isolated from HeLa cells were analyzed for expression using oligonucleotide arrays. Among the 635 menin target genes identified by SACO, a comparison of HeLa RNA expression data and SACO data gave 614 genes for which both menin occupancy and gene expression status were available. Sixty-two percent of the menin-occupied genes were expressed, compared to 40% of genes expressed for the entire microarray, indicating that menin-occupied genes are more likely to be transcribed (P < .003). When restricted to expressed genes in HeLa cells, the representation of genes where menin occupied the 5′ end was similar to that observed for the genomic occupancy of menin GSTs identified near the 5′ end of genes (48% in genomic occupancy vs 52% in expressed gene list). But the representation of genes where menin occupied intragenic loci or 3′ ends was different from the genomic occupancy of menin at these sites (intragenic loci: 31% in genomic occupancy vs 40% in expressed gene list; 3′ end loci: 21% in genomic occupancy vs 9% in expressed gene list).
To further evaluate how the genes occupied by menin in HeLa cells are modulated as a consequence of menin loss in MEN1-associated human tumors or Men1-null MEFs, a comparison with published gene expression array data was made. The analysis showed that among the 15 genes that were common between the menin targets identified by SACO and the published gene expression data (Table W1), eight genes were downregulated in islet tumors (that lack menin). This correlation of menin-occupied genes and their downregulation on menin loss in islet tumors is not significant because the “islet tumor downregulated genes” category was overrepresented in the analysis compared to other categories (143 of 319 genes that were considered for comparison).
Menin shows a differential effect on transcription activated by JunD versus c-Jun, two members of the AP1 family of transcription factors [27,28]. To assess whether any of the menin-occupied loci identified in this study correlates with AP1-regulated genes, a list of 249 AP1-regulated genes was compared to menin-occupied target genes. The analysis showed 10 genes that were common between the menin targets identified by SACO and the published data about AP1-regulated genes (Table W2). This correlation did not yield any important data about menin interaction and AP1-regulated genes except for Hoxa10, which has been previously shown by ChIP-PCR as a menin target (Table W3), and DDR2, a gene involved in osteoarthritis that is upregulated in menin-complemented menin-null MEFs (Table W1). The 10 genes that were common targets were identified as AP1-regulated genes using various methods and various cells (as reported in Hayakawa et al.). Information about the directionality of their AP1-regulated transcription is not available, and it is also not known whether they are JunD-regulated or c-Jun-regulated.
Future analysis of SACO-identified menin targets and their transcriptional regulation under physiological and/or pathological conditions should help extend our findings.
The many interactions of menin with proteins and with “genes” suggest two alternate models of menin action. Protein interactions, mainly with JunD, indicate that menin might initiate its action through only one pathway, for example, by suppressing the activity of an oncogenic substrate JunD. In contrast, the current study, together with another recent study, opens a new paradigm for the normal and abnormal actions of menin. The genomic interactions identified by SACO and ChIP chip point to the possibility of a more complex nature of menin action. The new paradigm involves action beginning at many genomic pathways. Further work should explore whether these two models of menin action are complementary.
A substantial sampling of the menin-SACO library for menin-occupied loci in genomic chromatin showed that menin was not confined to promoter regions, although a large number of occupied loci lay there, but could also occupy many other regions inside genes and at the 3′ ends of genes. These data suggest that new, as yet unidentified regulatory sequences could be present at those intragenic loci occupied by menin, or they could be regions where menin tracks with transcriptional progression. Menin's functions at the 1162 genomic loci and the nature of the DNA sequence at these loci remain to be established.
We thank MEN1 collaborators in National Institutes of Health intramural laboratories and members of the Goodman laboratory for helpful discussions. We thank Margaret Cam of the NIDDK microarray core facility for HeLa expression analysis, and Tao Tao of the NCBI User Service for AP1-binding site search.
This research was supported, in part, by the Intramural Research Program of the National Institutes of Health (National Institute of Diabetes and Digestive and Kidney Diseases, National Human Genome Research Institute, and National Institute of Deafness and Communication Disorders) and by an extramural National Institutes of Health grant DK45423 (to R.H.G.).