Given the inherent limitations of in silico studies relying solely on DNA sequence analysis, the functional characterization of mammalian promoters and associated cis-regulatory elements requires experimental support, which demands cloning and analysis of putative promoter regions. Focusing on human chromosome 21, we cloned 182 gene promoters of 2500 bp in length and conducted reporter gene assays on transfected-cell arrays. We found 56 promoters that were active in HEK293 cells, while another 49 promoters could be activated by treatment of cells with Trichostatin A or depletion of serum. We observed high correlations between promoter activities and endogenous transcript levels, RNA polymerase II occupancy, CpG islands and core promoter elements. Truncation of a subset of 62 promoters to ~500 bp revealed that truncation rarely resulted in loss of activity, but rather in loss of responses to external stimuli, suggesting the presence of cis-regulatory response elements within distal promoter regions. In these regions, we found a strong enrichment of transcription factor binding sites that could potentially activate gene expression in the presence of stimuli. This study illustrates the modular functional architecture of chromosome 21 promoters and helps to reveal the complex mechanisms governing transcriptional regulation.
Gene expression in eukaryotic organisms requires coordinated regulation of thousands of genes. The challenge is to unravel the components and function of complex genetic networks and underlying regulatory processes. Although these processes are integrated at many different levels of the cellular machinery, the regulation of the initiation of transcription is essential and often the rate-limiting step (1) and involves mainly promoter regions located immediately upstream of the gene transcription start sites (TSSs). These regions can integrate various signals to control the transcription rates of associated genes, such as spatial and temporal cues during development, or in response to hormonal, physiological and environmental signals (2). Promoter regions usually entail a core promoter within 50–100 bp surrounding the TSS (3), proximal response elements located up to 250 bp upstream of the TSS, and distal response elements, which can reside several kilobases upstream and downstream of the TSS. The core promoter contains transcription factor binding sites (TFBSs) recognized by the general transcription factors and regulates basal transcription levels, whereas proximal and distal promoter regions are believed to harbor gene-specific TFBSs integrating additional signals for fine-tuning of transcription rates (4).
Mammalian promoter regions have been investigated mainly on a gene-by-gene basis using various reporter gene assays. Attempts to map promoter elements by genome-wide computational analyses of the DNA sequence made use of TFBS predictions and evolutionary conservation of short DNA stretches (5), but these approaches have been limited by the heterogeneous nature of promoter regions, and to some extent by the sequence divergence of regulatory regions among mammals (2). Large-scale experiments addressing the architecture and activity of predicted promoters remain essential. Of the few studies available so far, transient reporter assays testing several hundred putative human promoters detected activity for 60–68% of those in the given experimental conditions (6,7). In a study by the ENCODE consortium, 25% of putative promoters predicted de novo from cDNA analysis were reported to be functional (6). These and other studies converge in postulating that mammalian upstream regulatory regions represent a heterogeneous group of modular elements with disparate structural features and cell-type specific activities (2,7,8). Additional systematic experimental analyses are necessary to gain broader insight into promoter structure and function.
We recently established a procedure based on transfected-cell arrays for the functional characterization of promoters (9). Here, we expanded this approach to characterize the promoters of the human chromosome 21 (HSA21) genes, since HSA21 serves as a model for pilot genomic studies due to its small size (48 Mb) and because of its association with Down syndrome. Based on the 231 well-annotated protein-coding genes on HSA21, we cloned 182 promoter fragments of 2.5 kb in size upstream of the predicted TSS, including the TSS itself, as well as a set of shorter fragments (500 bp upstream of the TSS) for 62 promoters. We used transfected-cell arrays (9–15) to carry out promoter reporter assays in HEK293 cells under normal growth conditions and after treatments known to alter gene expression. We correlated the measured activities with the presence of core promoter elements, RNA polymerase II occupancy (16), endogenous transcript levels, and expression profiles derived from EST data for 45 different human tissues. We show that data collected after treatment of cells with different stimuli can provide insight into the presence and identity of functional cis-regulatory elements. Taken together, we generated the first chromosome-scale reference data set on the structure, function and responses of human gene promoters.
Promoter annotation and primer design were based on human gene annotations from the Ensembl database v30. Promoter regions were defined relative to the most 5′ TSS of all annotated transcripts of a gene. PCR primers for all 231 genes of chromosome 21 were designed using the software PRIDE (17). The optimal target region for the design was defined as ranging from −2450 to +50 bp relative to the respective TSS. Each primer pair was required to flank the TSS. The downstream part was shortened if an ATG appeared within the +50 bp downstream region. In total, 223 primer pairs were obtained, to which 12 bases of adapter sequences were added for recombination cloning after two-step PCR amplification of the fragments (Gateway technology, Invitrogen). The same approach was used for cloning of truncated promoter fragments of ~500 bp upstream of the TSS.
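The target-region definition above can be illustrated with a minimal sketch. The function names, the 1-based inclusive coordinate convention and the exact trimming rule are assumptions made for illustration; the text states only that the region ranged from −2450 to +50 bp relative to the TSS and that the downstream part was shortened if an ATG appeared, and it does not describe how PRIDE implements this.

```python
def promoter_window(tss, strand, upstream=2450, downstream=50):
    """Return (start, end) genomic coordinates of the primer target region.

    Assumes 1-based inclusive coordinates; the region spans -upstream to
    +downstream relative to the TSS on the gene's own strand.
    """
    if strand == "+":
        return tss - upstream, tss + downstream
    if strand == "-":
        # On the minus strand, "upstream" means higher genomic coordinates.
        return tss - downstream, tss + upstream
    raise ValueError("strand must be '+' or '-'")


def trim_downstream_at_atg(downstream_seq):
    """Return how many downstream bases to keep so the fragment stops
    before the first ATG (hypothetical rule; keeps everything if no ATG).

    downstream_seq is the sequence after the TSS, read 5'->3' on the
    gene strand.
    """
    pos = downstream_seq.upper().find("ATG")
    return len(downstream_seq) if pos == -1 else pos


if __name__ == "__main__":
    print(promoter_window(10000, "+"))
    print(promoter_window(10000, "-"))
    print(trim_downstream_at_atg("GGCCATGAAA"))
```

Keeping strand handling in one place avoids the common error of taking the upstream region on the wrong side of the TSS for minus-strand genes.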
Touch-down PCR from genomic DNA was performed according to a protocol optimized for amplification of GC-rich promoter regions (18), using as templates genomic DNA as well as available genomic BAC and fosmid clones. Then, a secondary PCR was performed with Gateway adapter primers (Invitrogen), followed by PEG-8000 precipitation of PCR products. The modified reporter gene vector pZsGreen1-1 and the control plasmid pHcRed1-N1 were used as described before (9). PCR products were cloned into the pZsGreen vector using Gateway BP Clonase II Enzyme Mix (Invitrogen) and transformed into competent TOP10 cells following the manufacturer’s recommendations. Resulting colonies were screened by colony PCR, plasmids were isolated from positive clones using a QIAprep Spin Miniprep Kit (Qiagen), and inserts were confirmed by 5′ and 3′ end sequencing. Promoter coordinates and primer sequences for 182 cloned promoter fragments are listed in Supplementary Table S1.
Samples for array spotting were prepared as previously described (9). Effectene reagent (Qiagen) with Enhancer (Qiagen) was used as transfection agent. Spotting solutions containing 32 ng/µl of promoter construct and 7.5 ng/µl of reference plasmid were kept at 4°C until arraying. Automated spotting was performed with a high-speed non-contact dispensing system (instrumentONE, M2 Automation). Arrays were printed onto home-made poly-L-lysine (Sigma) coated microscope glass slides using a 500 µm outlet port solenoid valve, which delivered 20 nl of sample per spot. Average spot to spot center distance was 1.5 mm. Samples were arrayed in triplicates. After arraying, slides were maintained in low humidity condition at 4°C. Human embryonic kidney cells (HEK293T from ATCC) were cultured in Dulbecco’s modified Eagle’s medium (DMEM, Gibco Invitrogen) supplemented with 10% (v/v) fetal calf serum (Biochrom) at 37°C in a humidified 6% CO2 incubator. One day prior to transfection, cells were seeded in a 60 cm² culture plate in 10 ml of medium. On the day of transfection, cells were washed with PBS, detached with Accutase (PAA Laboratories) and seeded at 3.5 × 10⁶ cells per slide onto printed slides, which were placed into a QuadriPerm chamber (Greiner) for reverse transfection. For each treatment, two slides were used in parallel, so that for each construct, six replicate spots could be analyzed. For treatments after 24 h of incubation at 37°C with 6% CO2 in DMEM supplemented with 10% fetal calf serum, the medium was changed to DMEM/FCS with 200 nM Trichostatin A (Sigma) or fetal calf serum-free DMEM. After 48 h of transfection, slides were washed with PBS, fixed in 3.7% formaldehyde with 4 M sucrose in PBS for 30 min, stained with DAPI and mounted with Fluoromount-G (Southern Biotech). The slides were kept in the dark at 4°C until analysis.
Microscopy images were acquired and fluorescent objects were detected as previously described (9). The average total number of cells per image frame was 649 cells, as controlled by DAPI staining. In this area, the maximum number of cells that could theoretically be transfected, i.e. cells found in the area of the spotted DNA, was measured to be 370 cells. For each scanning position, the number of HcRed-expressing cells, ZsGreen-expressing cells and co-transfected cells (HcRed- and ZsGreen-positive) was determined. The average transfection efficiency for HcRed alone was 14.1%, while an average of 7.8% of the cells expressed ZsGreen, depending on promoter reporter activity. The average total number of transfected cells (positive for either fluorophore) was 68 cells, resulting in 18.4% combined average transfection efficiency. To determine promoter reporter activities from numbers of fluorescent cells, two selection criteria were taken into account. First, the fraction of green-fluorescent cells among all red cells in a spot had to exceed 16% (transfection threshold). Second, the number of cells both green and red had to exceed the number of cells both green and red in the negative control spots (empty pZsGreen1-1 spotted in 10 replicates) by three standard deviations (reporter activity threshold). A promoter was classified as active if both thresholds were exceeded in at least four out of six replicates on two different cell array slides. Thus, a binary promoter activity index (with 0 for inactive and 1 for active promoters) was generated for each promoter region under investigation.
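The two-threshold scoring scheme described above can be sketched as follows. This is an illustrative reading rather than the authors' analysis code: the function name and data layout are hypothetical, the sample standard deviation is assumed, and "exceed by three standard deviations" is interpreted as exceeding the negative-control mean plus three standard deviations.

```python
from statistics import mean, stdev


def is_active(replicates, negative_controls,
              min_fraction=0.16, n_sd=3.0, min_passing=4):
    """Return the binary activity index (1 active, 0 inactive).

    replicates: list of (red_cells, double_positive_cells), one per spot.
    negative_controls: double-positive counts from empty-vector spots.
    """
    # Reporter activity threshold: control mean + n_sd standard deviations
    # (assumed interpretation of the text).
    cutoff = mean(negative_controls) + n_sd * stdev(negative_controls)
    passing = 0
    for red, both in replicates:
        # Transfection threshold: green fraction among red cells.
        fraction_ok = red > 0 and both / red > min_fraction
        if fraction_ok and both > cutoff:
            passing += 1
    return 1 if passing >= min_passing else 0


if __name__ == "__main__":
    negatives = [0, 1, 2, 3, 1, 2, 0, 4, 2, 2]  # made-up control counts
    strong = [(50, 30), (45, 33), (60, 35), (40, 28), (55, 2), (50, 31)]
    weak = [(50, 6), (45, 4), (60, 7), (40, 3), (55, 5), (50, 7)]
    print(is_active(strong, negatives), is_active(weak, negatives))
```

The counts in the usage example are invented for illustration and are not taken from the study.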
We used known position-weight matrices for TATA box, INR and DPE elements (19) together with the TransFac MATCH tool (20) for detection of common promoter motifs under default parameters. Genome-wide coordinates of CpG islands (21) were intersected with the coordinates of cloned promoters to identify CpG islands. In this, we required 500 bp immediately upstream of the TSS to overlap with at least 10% of the total sequence of a CpG island. Genome-wide coordinates of RNA polymerase IIA-bound regions in HEK293 (16) were intersected with the coordinates of cloned promoters to assess occupancy of the hypophosphorylated form of Pol IIA in promoter regions.
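The CpG island criterion above reduces to an interval-overlap check, sketched here under stated assumptions: coordinates are treated as 0-based half-open intervals, the TSS is a single position, and the function names are hypothetical. The source does not specify which tool or coordinate convention was used for the intersection.

```python
def overlap_length(a_start, a_end, b_start, b_end):
    """Length of the overlap between half-open intervals [start, end)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def has_cpg_island(tss, strand, islands, window=500, min_fraction=0.10):
    """True if the window immediately upstream of the TSS overlaps at
    least min_fraction of the total length of any CpG island.

    islands: iterable of (start, end) half-open genomic intervals.
    """
    if strand == "+":
        w_start, w_end = tss - window, tss
    elif strand == "-":
        # Upstream of a minus-strand TSS lies at higher coordinates.
        w_start, w_end = tss + 1, tss + window + 1
    else:
        raise ValueError("strand must be '+' or '-'")
    for start, end in islands:
        length = end - start
        if length > 0 and overlap_length(w_start, w_end, start, end) >= min_fraction * length:
            return True
    return False


if __name__ == "__main__":
    print(has_cpg_island(1000, "+", [(900, 1100)]))
    print(has_cpg_island(1000, "+", [(990, 1990)]))
```

Note that the threshold is expressed relative to the island length, not the window length, following the wording of the criterion.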
We retrieved associations of expressed sequence tag (EST) identifiers to UniGene cluster identifiers in 45 tissues generated for 5 799 931 human ESTs clustered into 116 190 UniGene clusters from the UniGene FTP site (Hs.profiles.gz for Homo sapiens build #207). EST expression profiles for these UniGene clusters were extracted from the ‘Body Sites’ category of the original file. The resulting EST set for 156 HSA21 genes consisted of 32 450 ESTs from 45 tissues. We then calculated for each gene with corresponding cloned promoter the number of different tissues where corresponding ESTs could be found.
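Counting the tissues in which a gene has EST support is a simple aggregation, sketched below. The record layout `(gene, tissue, est_count)` and the function names are assumptions for illustration and do not reflect the actual format of the UniGene profile file; the cutoff of more than 25 tissues is the one applied to broadly expressed genes in the results.

```python
from collections import defaultdict


def tissues_per_gene(est_records):
    """Count distinct tissues with at least one EST for each gene.

    est_records: iterable of (gene, tissue, est_count) tuples
    (hypothetical layout). Genes without any EST are not returned.
    """
    tissues = defaultdict(set)
    for gene, tissue, count in est_records:
        if count > 0:
            tissues[gene].add(tissue)
    return {gene: len(found) for gene, found in tissues.items()}


def is_broadly_expressed(n_tissues, threshold=25):
    """Broad expression as used in the results: ESTs in >25 tissues."""
    return n_tissues > threshold


if __name__ == "__main__":
    records = [
        ("GENE_A", "brain", 3),
        ("GENE_A", "liver", 0),
        ("GENE_A", "kidney", 1),
        ("GENE_B", "brain", 2),
    ]
    print(tissues_per_gene(records))
```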
For the set of promoter sequences that showed specific response patterns in our experiments, we searched for common TFBSs that might explain these responses. To score transcription factor binding, we used a physical affinity-based model described in previous publications (22,23), and matrices describing 610 vertebrate transcription factor binding preferences from TRANSFAC version 12.1 (20). For each binding matrix, we calculated the affinity of the matrix for each sequence, and then transformed these affinities into P-values as described before (22). Each P-value represents the probability that a random sequence drawn from a human-promoter-based background model would show a binding affinity at least as high as the one observed. The P-values for the individual sequences were then combined using Fisher’s method. P-values were not corrected for multiple testing and were used only to rank the factors. Each binding matrix was ranked according to its combined P-value, giving a natural ranking of the transcription factors with the most enriched binding within the sequence set as a whole.
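The combination step can be made concrete. Fisher's method computes X = −2 Σ ln p over k P-values and compares it with a chi-square distribution with 2k degrees of freedom; for even degrees of freedom the upper tail has a closed form, so no statistics library is needed. The sketch below is illustrative only: the function names and the input layout are assumptions, and the affinity-to-P-value transformation of the cited model is not reproduced.

```python
import math


def fisher_combined_pvalue(pvalues):
    """Combine independent P-values with Fisher's method.

    Uses the closed-form chi-square survival function for 2k degrees of
    freedom: exp(-X/2) * sum_{i=0}^{k-1} (X/2)^i / i!.
    """
    k = len(pvalues)
    if k == 0:
        raise ValueError("at least one P-value is required")
    if any(p <= 0.0 or p > 1.0 for p in pvalues):
        raise ValueError("P-values must lie in (0, 1]")
    half_stat = -sum(math.log(p) for p in pvalues)  # equals X / 2
    term = 1.0
    total = 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total


def rank_matrices(pvalues_by_matrix):
    """Rank binding matrices by combined P-value, most enriched first.

    pvalues_by_matrix: dict mapping a matrix name to the per-sequence
    P-values for that matrix (hypothetical layout).
    """
    combined = {name: fisher_combined_pvalue(ps)
                for name, ps in pvalues_by_matrix.items()}
    return sorted(combined.items(), key=lambda item: item[1])


if __name__ == "__main__":
    print(rank_matrices({"M1": [0.5, 0.5], "M2": [0.01, 0.02]}))
```

Because the combined values are used only for ranking, no multiple-testing correction is applied in this sketch either.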
Enriched gene ontology (GO) terms were identified using the DAVID functional annotation tool (24). Entrez GeneIDs for 40 promoters activated by serum depletion and 28 promoters activated by Trichostatin A were compared to a background set of 126 inactive promoters within the GO category ‘biological process’.
Promoter fragments were selected for 231 genes on HSA21, based on their most upstream annotated TSS (see ‘Materials and methods’ section). Primer pairs encompassing 2.5 kb of DNA sequence upstream of the TSS could be designed for 223 promoter fragments. Of these, 182 fragments were successfully amplified and cloned into a reporter vector upstream of the ZsGreen fluorescent reporter gene. Promoter coordinates, primer sequences and supporting data for each TSS from RNA-seq and Pol IIA ChIP-seq data (16) as well as CAGE and EST/cDNA data (25,26) are listed in Supplementary Table S1. Retrospective comparison with endogenous TSS coordinates determined by RNA-seq from HEK293 indicated that the use of Ensembl TSS annotations yielded slightly more correct TSS positions than the use of predominant TSSs derived from CAGE/EST data from other tissues or cell types. Screenshots from the UCSC genome browser showing all data sets for genes with contradictory TSS data can be found in Supplementary Figure S2. HEK293 cells were co-transfected on cell arrays spotted with promoter reporter constructs and a normalization plasmid expressing red fluorescent protein HcRed. The combined mean transfection efficiency was calculated to be 18.4% (see ‘Materials and methods’ section). For assessing reporter gene activity, we measured the ZsGreen and HcRed fluorescence signals in a cell-number-based quantification approach with stringent thresholds ensuring a reliable readout (see ‘Materials and methods’ section and Supplementary Table S1). The mean numbers of co-transfected cells were 32.9 ± 5.9 and 6.2 ± 4.8 for promoter fragments scored as active and inactive, respectively, whereas we found 1.7 ± 2.0 transfected cells in the negative controls. Figure 1A shows that this scoring scheme allowed a clear-cut distinction to be made between the active cloned promoter fragments (56 out of 182 tested) and the silent fragments (126 out of 182).
We compared the capacity of the cloned promoters to drive the transcription of a reporter gene with the endogenous transcript levels previously determined for HEK293 cells by RNA-seq (16). For the 56 active promoters, 50 corresponding genes were found expressed (Figure 1B), indicating a rate of 89% true positives in the assay (Supplementary Table S1). In contrast, only 37 of 126 inactive promoters were associated with expressed genes. The enrichment of expressed genes in the set of active promoters was highly significant (P = 1.2 × 10⁻¹⁴).
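An enrichment of this kind can be tested with a one-sided hypergeometric test, equivalent to a one-sided Fisher's exact test on the 2 × 2 table. The text does not state which test produced the reported P-value, so the sketch below is an assumption and is not expected to reproduce that figure exactly; the function name is hypothetical.

```python
from math import comb


def enrichment_pvalue(hits_in_set, set_size, hits_total, population):
    """One-sided hypergeometric P-value, P(X >= hits_in_set).

    population: all promoters tested; hits_total: promoters with the
    feature; set_size: promoters in the set of interest; hits_in_set:
    promoters in that set with the feature.
    """
    upper = min(set_size, hits_total)
    tail = sum(
        comb(hits_total, k) * comb(population - hits_total, set_size - k)
        for k in range(hits_in_set, upper + 1)
    )
    return tail / comb(population, set_size)


if __name__ == "__main__":
    # Counts from the text: 50 of 56 active and 37 of 126 inactive
    # promoters were associated with expressed genes (87 of 182 overall).
    print(enrichment_pvalue(50, 56, 50 + 37, 182))
```

Exact integer arithmetic via `math.comb` avoids the underflow that floating-point factorials would cause for tails this small.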
In order to distinguish promoters of ubiquitously expressed genes from those which might confer tissue specificity, we made use of expression data compiled in UniGene EST clusters (27). We observed a strong correlation between promoter reporter activity in HEK293 cells and the number of different tissues in which the associated genes were transcribed (Figure 1C). Of the 56 active promoters, 38 were associated with broadly expressed genes (ESTs found in >25 different tissues). In contrast, only 29 of the 126 inactive promoters controlled genes with a broad expression pattern (Supplementary Table S1). The enrichment of broadly expressed genes among active promoters was highly significant (P = 1.1 × 10⁻⁸).
For all genes analyzed, we correlated promoter and gene activities in HEK293 cells with the presence or absence of hallmarks of TSSs and of key regulatory elements. An overview of the different features analyzed for the HSA21 promoters is summarized in Figure 2.
To correlate promoter activity with the presence of functional TSSs, we relied on ChIP-seq data that we previously reported for hypophosphorylated RNA polymerase II polypeptide A (Pol IIA) used as a landmark of transcription initiation in HEK293 cells (16). A large fraction of active promoter fragments (35 out of 56) contained or overlapped Pol IIA-bound regions (Figures 2 and 3A), whereas inactive promoter fragments were strongly depleted of Pol IIA (12 of 126; Supplementary Table S1). The enrichment of Pol IIA occupancy in active promoters was highly significant (P = 2.9 × 10⁻¹³).
Core promoters are known to be associated with promoter-specific sequence elements controlling the initiation of transcription of downstream genes. For instance, CpG islands, the TATA box, initiator (INR) and downstream promoter elements (DPE) are functionally important, although their presence is not always required for promoter activity (7,8,28). We analyzed the occurrence of these four elements within the 500 bp near the TSSs of all 182 cloned promoter fragments (Supplementary Table S1). Based on a genome-wide reference map of CpG islands (21), we observed that 83 of the 182 cloned promoters (46%) overlapped with a CpG island over a sequence length of at least 50 bp, and that these islands were almost always located at the TSS (98% of the cases). Active promoters were highly enriched for CpG islands (P = 2.3 × 10⁻¹¹) when compared to the silent ones (46 out of 56, and 37 out of 126, respectively). The TATA box, located 28 to 34 bp upstream of the TSS (29), is the best-known core promoter element. TATA boxes are often associated with strong tissue-specific promoters and result in clearly defined TSSs (8). TATA boxes were present in only 14 of the 182 cloned promoters (7.7%), which is slightly below the previously reported proportion of TATA-containing promoters among all known human promoters, estimated to be from 10 to 20% (30). TATA boxes occurred almost twice as frequently in silent promoters (9% with TATA) as in active promoters (5% with TATA), but this difference was not significant. No trend was observed for the INR element, which was present in 7% of the active fragments (4 of 56) and in 6% of inactive promoters (8 of 126). On the other hand, a DPE element was found in half of the active promoters (28 of 56) but in only one-third of silent promoters (42 of 126; P = 0.025; Figure 3A). Three elements occurred together in a significant number of cases. Of the 47 promoters with Pol IIA occupancy, 45 contained a CpG island and 24 a DPE element. Lastly, it is notable that 53 inactive fragments did not overlap with any of the promoter elements or Pol IIA-bound regions.
To assess further the functionality of the cloned promoter fragments, we monitored promoter activities after challenging the cells by treatment with Trichostatin A (TSA) or by depletion of fetal calf serum (FCS) from the culture medium. The effects of TSA (an inhibitor of class I and II histone deacetylases) on cell function are complex and include triggering the activation of transcription from repressed regions of the chromosomes (31). Here, TSA treatment activated 28 of the 126 inactive promoters, whereas only three of the 56 previously active promoters were silenced (Figure 2). The genes activated by TSA did not belong to any specific category of biological function (data not shown). A large fraction of the TSA-activated genes (15/28) exhibited a broad expression profile (ESTs in >25 tissues), in contrast to those which remained silent (14/98; P = 5 × 10⁻⁵, see Figure 3B). Similarly, we observed significant enrichments of endogenously expressed genes and Pol IIA binding regions (Figure 3B). Interestingly, CpG islands represented the most frequently observed class of elements, present in 68% of TSA-activated promoters (Figure 3B), whereas no enrichment for TATA, INR or DPE elements was detected.
Serum depletion from the cell culture medium is known to elicit stress responses and apoptosis through activation of several factors, such as NFκB and CREB (32–34). Here, after the cells were deprived of serum for 24 h, no promoter was silenced. However, 40 promoters were activated, of which 19 were also found to be activated by TSA (Figure 2). Comparison of the specific features of promoters activated by TSA or serum depletion (Figure 3B and C) showed, in both cases, enrichment of genes with broad expression patterns and with CpG islands, albeit the latter feature was more pronounced among promoters responding to TSA. However, in contrast to the TSA treatment, promoters activated by serum depletion were not enriched for genes endogenously expressed in HEK293 cells or marked by RNA Pol IIA occupancy. Instead, a significant enrichment of DPE elements was observed. CpG islands and DPE elements occurred together in 28% of promoters activated by serum depletion (11/40), but only in 8% of promoters that remained silent (7/86). We found that 20% of the activated promoters corresponded to genes associated with cellular responses to the environment (Supplementary Table S1). In contrast, only 8% of the promoters remaining inactive belonged to this category.
Altogether, monitoring HSA21 promoter gene activity on transfected-cell arrays revealed that 56/182 promoters of 2500 bp in length were able to drive reporter gene expression in HEK293 cells under normal growth conditions. Assays in the presence of different external stimuli showed that an additional 49 promoter fragments have the capacity to induce reporter gene expression. Figure 4 shows the associated EST data from 45 different tissues and gene expression levels in HEK293 cells. Those promoters that were active under standard conditions were expressed in a large number of tissues and exhibited relatively high endogenous expression levels in HEK293 cells. The set of promoters that could be activated by TSA exhibited slightly lower expression levels in both data sets, while the promoters activated through serum depletion tend to be expressed in significantly fewer tissues and showed only weak expression in HEK293 cells. For the 77 silent fragments that could not be activated, only sparse EST data were available, and only 17 promoters were associated with genes expressed in HEK293 cells. A closer inspection of these 17 inactive fragments, together with integration of Pol IIA ChIP-seq and RNA-seq data, revealed that in four cases the core promoter was missed by 10–30 bp (for C21orf19, C21orf90, HEMK2 and PFKL), and that in five cases an alternative TSS was used for these genes in HEK293 cells (ABCG1, MRPS6, NCAM2, NRIP1 and PCBP3). Apart from these cases, the majority of cloned HSA21 promoters recapitulated their function in living cells.
To investigate the influence of distal promoter regions on transcription, we cloned a subset of 62 truncated promoter fragments of ~500 bp, thus removing the distal ~2000 bases. Under standard conditions, 29 of the 62 short fragments could drive transcription in reporter assays (Figure 5). Compared to assays performed with the long fragments, truncation of promoter length resulted in loss of activity for only three cases (DSCR2, OLIG1 and SIM2). However, six promoters gained activity in their truncated form, while their longer version was inactive (MRPL39, RBM11, CHAF1B, HLCS, C21orf45 and SH3BGR), suggesting the presence of inhibitory regulatory regions in the distal ~2000 bp sequences.
Regarding the responses of short fragments to treatments with TSA and depletion of serum, we found that 40 short promoters (66%) recapitulate, under all conditions, the activity patterns observed for the long fragments (Figure 5, lower part), while the remaining 21 behaved differently (Figure 5, upper part). In contrast to their longer counterpart, 14 truncated promoters could not be activated by any treatment, indicating the loss of activating cis-regulatory upstream elements. Conversely, seven truncated promoters could be activated under more conditions than their longer counterparts (C21orf66, C21orf45, CHAF1B, MRPL39, HLCS, RBM11 and SH3BGR), suggesting that they have lost inhibitory elements located in the distal ~2000 bp sequences.
We aimed to identify cis-regulatory elements that might contribute to the response patterns observed after treatment with external stimuli. We ranked affinities for 610 known vertebrate transcription factor (TF) binding matrices (20) for the 2.5 kb promoters activated by serum depletion or TSA treatment, and for the distal 2000 bp region of promoters which had lost their capacity to be activated by either of those stimuli upon truncation (see ‘Materials and methods’ section). Of the TF binding matrices enriched in the sequences from each class of promoter, we retained only data pertaining to TFs expressed in HEK293 cells. We found that the highest ranked factors tend to be related to serum responses or TSA treatment (Table 1). All binding sites detected for these TFs in the analyzed promoter fragments are listed in Supplementary Table S1. Results are in accordance with previous reports that demonstrated the importance of USF1, NFκB, MYC and ETS1 activities in serum response. For the TSA responses and associated histone deacetylase inhibition, we found reports describing responses to TSA treatment for MAFG, AP1 (FOS/JUN), p53 and OCT1. Thus, four of seven TFs with enriched binding sites in serum-sensitive promoters and four of eight TFs with enriched sites in TSA-responsive promoters have been previously implicated in corresponding signal transduction pathways.
Using a transfected-cell array procedure, we monitored the activities of 182 cloned promoters corresponding to ~80% of all HSA21 genes in HEK293 cells. Compared to previous studies, where the length of promoters was no longer than 1000 bp (6,35,36), we aimed here at a more comprehensive coverage of potential upstream regulatory elements by cloning 2500 bp fragments. In addition, we challenged the cells with external stimuli to identify the regulatory nature of the elements in the cloned promoters. In these experiments, the cell array format proved reliable and represents a cost-efficient alternative to conventional reporter gene assays in microtiter plates.
Our data showed that promoter reporter activities recapitulated endogenous gene expression to a great extent (89% concordance). Nevertheless, no transcripts were detected for six genes whose promoters were active on the cell arrays (C21orf13, C21orf115, DSCR4, DSCR8, KRTAP21-2 and RSPH1). Four of these genes are expressed in only a few tissues according to EST data. The activity of these promoters in our assay might indicate that elements repressing the expression of those genes in HEK293 were not included in the promoter reporter constructs, or that tight chromatin structures or DNA methylation occurring in the natural genomic context were not recapitulated in the reporter constructs. Conversely, for 37 out of 126 promoters inactive under standard conditions, we found expression of the corresponding genes in HEK293 cells. In fact, the majority of these promoters were activated by treatment with TSA or serum depletion, suggesting that the corresponding cloned fragments contained a functional core promoter but lacked binding sites for activating factors essential to lift reporter gene expression above the detection threshold. The larger part of the remaining 17 inactive promoters lacked the core promoter required for transcription, either due to inadequate TSS annotation or to use of an alternative TSS in HEK293 cells (see Supplementary Figure S2). Future promoter studies in HEK293 and other cultured cells will benefit from the wealth of data that is now becoming available from RNA-seq and TSS-seq (37). The TSA-activated promoters generally exhibited broader expression patterns and higher expression levels in HEK293 cells than the promoters activated after serum depletion, suggesting that TSA enhanced the activity of promoters that were already active at low levels, while serum depletion led to activation of previously inactive promoters.
We observed that the co-occurrence of core promoter elements and Pol IIA occupancy were hallmarks of promoter activity. The finding of significant enrichments of CpG islands and DPE elements, but not TATA or INR elements in active promoters confirmed previous observations (35,38). CpG islands are present within more than 80% of the promoters active in HEK293 cells, a feature previously reported in other cell types, such as primary fibroblasts (39). Moreover, the strong correlation between gene expression levels and promoter activity in our reporter assays concerned mostly genes containing CpG islands. RNA polymerase IIA-binding could either correspond to Pol II stalling at genes poised for activation (40,41), where CpG islands are also known to be enriched (42), or to active TSSs (16,43). As expected, we found the presence of Pol IIA-bound regions strongly associated with active promoter fragments.
We observed that a subset of the silent promoters was activated by treatment with Trichostatin A. An expected effect of TSA, a specific inhibitor of mammalian class I and II histone deacetylase enzymes (31), is the activation of transcription from repressed chromosomal regions through chromatin remodeling (44). Transiently transfected plasmids are not entirely subjected to the same regulatory mechanisms that affect native chromatin, but it has been shown that chromatin structures can be formed on plasmid DNA (45). Consequently, it should be possible to reverse histone deacetylase-dependent silencing mechanisms by TSA (46,47). Indeed, TSA was able to activate 28 of the silent cloned promoters. Apart from the effects of TSA on histone acetylation, it is also possible that TSA has an indirect influence on CpG methylation. We found that 68% of the TSA-activated promoters contain a CpG island. It has been shown that in some cell lines, TSA downregulates the expression of the DNA methyltransferase DNMT1 (48), and that CpG demethylation occurs at genes with low expression levels, which are then fully activated upon demethylation (49). Taken together, it is likely that the promoter-reporter constructs are sensitive to histone deacetylation, or to DNA methylation-mediated silencing (50,51), or to both mechanisms.
The comparison of the activities of long and short promoter fragments showed that only three promoters lost their activity upon truncation. This finding implies that in general, proximal promoter regions are sufficient to drive gene expression. A different picture emerges when activity changes through external stimuli are taken into account. Here, 29% of all tested long fragments changed their activity upon stimulation by TSA or serum depletion, while only 5% of the tested short fragments responded to these stimuli. The observed difference is evidence for the presence of cis-regulatory response elements in the distal promoter regions of these genes.
Depletion of serum triggers cell type-specific responses affecting cell cycle regulation, cell growth, differentiation and apoptosis (33,52,53). Indeed, among the 40 promoters activated by serum depletion, one-fifth was annotated within the GO category ‘response to stimulus’. Analysis of TFBSs in the promoters of the genes activated by serum depletion showed strong enrichment of binding sites for seven transcription factors. Notably, four of the top-ranking factors (USF1, NFκB, ETS1 and MYC) have already been shown to be involved in responses to serum starvation. For instance, this treatment enhanced USF1 expression and binding of USF1 in the promoter of the target gene lipocalin-type PGD synthase in brain-derived cells (54). NFκB has been found potently activated upon serum starvation in HEK293 cells, leading to apoptosis (32). Ets domain-containing transcription factors, such as ETS1 identified here, are implicated in the response to serum in endothelial cells (55). Finally, MYC has previously been implicated in responses to growth factor-deprived conditions, where it is involved in induction of apoptosis (56).
Analysis of TFBSs in the promoters of the genes activated by TSA showed strong enrichment of binding sites for eight transcription factors. Remarkably, as with serum depletion, four of these factors (MAFG, p53, OCT1 and AP1) have previously been shown to be responsive to TSA treatment. TSA can abolish MAFG-mediated repression of gene expression via Maf recognition elements in reporter gene assays in HEK293 cells (57), leading to gene activation. TSA can induce p53-mediated cell cycle arrest or apoptosis, depending on the cell type (58,59). TSA can also induce gene expression via OCT1, independently of p53 (60). In addition, TSA can promote the binding of AP1 to a recognition site and activate the expression of the osteopontin gene in a mouse mesenchymal cell line (61). The activating effects of TSA, originally shown for specific genes as reported above, seem to represent a more general mechanism targeting the genes whose promoters contain binding sites for these transcription factors.
We observed similarities between the responses of promoters to serum depletion and to TSA treatment, given that 19 promoters were activated by both types of treatment in our reporter assays. One explanation for this overlap might involve the TSA-responsive AP1 complex, which is composed of members of the JUN, FOS and CREB/ATF families. JUN is also a mediator of the MYC-induced apoptotic signaling following serum starvation (56). Thus, JUN is involved in mediating the responses to both TSA treatment and serum depletion, which might explain the observed overlap between the sets of promoters activated by these stimuli.
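Whether an overlap of 19 promoters exceeds what chance alone would produce can be gauged with a hypergeometric tail probability. A minimal sketch follows, using the 28 TSA-activated and 40 serum depletion-activated promoters reported above. The background of 126 promoters (182 cloned minus 56 active under standard conditions) is an assumption made for illustration, as is the function name `overlap_p_value`. This is not an analysis reported in the study.

```python
from math import comb


def overlap_p_value(universe, size_a, size_b, overlap):
    """P(X >= overlap) when size_a items are drawn at random from a
    universe that contains size_b marked items (hypergeometric tail)."""
    denom = comb(universe, size_a)
    upper = min(size_a, size_b)
    favourable = sum(
        comb(size_b, k) * comb(universe - size_b, size_a - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / denom


tsa_activated = 28      # reported: silent promoters activated by TSA
serum_activated = 40    # reported: promoters activated by serum depletion
both = 19               # reported: promoters activated by both treatments
background = 182 - 56   # ASSUMPTION: promoters not active under standard conditions

# Inclusion-exclusion: promoters activated by at least one treatment.
union = tsa_activated + serum_activated - both

# Overlap expected if the two responses were independent of each other.
expected_overlap = tsa_activated * serum_activated / background

p_overlap = overlap_p_value(background, tsa_activated, serum_activated, both)
print(f"union = {union}, expected overlap = {expected_overlap:.1f}, p = {p_overlap:.2e}")
```

The union computed by inclusion-exclusion equals the 49 promoters reported as activated by TSA or serum depletion, which serves as a consistency check on the three counts.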
Using reporter gene assays on transfected-cell arrays, we showed that cloned promoter fragments largely recapitulate gene expression driven by the endogenous promoters, with integration of endogenous signaling pathways into reporter gene expression. Under standard conditions, cloned promoters of ~500 bp in size, spanning core and proximal promoter regions, are generally sufficient to drive gene expression. Comparative analysis of the activities of long and short promoters highlighted the presence of inhibitory cis-regulatory elements within −500 to −2500 bp for six promoters. Extended promoter regions were found to be necessary in many cases to integrate cellular signaling into reporter gene expression. Upon truncation, a significant number of promoters lost their response to cellular signaling. Monitoring the activity of long and short promoter fragments upon stimulation showed evidence for both inhibitory and activating response elements upstream of the 500 bp proximal promoter. Although it has been reported that promoter truncation reveals predominantly the presence of negative regulatory elements within 1000 to 500 bp upstream of the TSS (35,62), we found here that activating distal promoter elements were significantly more frequent than negative elements in our experimental set-up.
This chromosome-scale promoter study sheds some light upon both general and specific regulatory response elements controlling gene activity. The identified promoter activities and responses to stimuli constitute a valuable resource for further investigations. This collection of cloned HSA21 promoters could be used in future promoter activity studies in different cell lines, or in combination with transcription factor overexpression or knock-down, thereby allowing researchers to build a detailed understanding of the mechanisms of transcriptional regulation across an entire chromosome.
Supplementary Data are available at NAR Online.
The Max Planck Society for the Advancement of Science, Munich, Germany, and the German Federal Ministry of Education and Research (BMBF) in the framework of the National Genome Research Network 2 (NGFN2, SMP-DNA) [grant number 01GR0414]. Funding for open access charge: The Max Planck Society for the Advancement of Science, Munich, Germany.
Conflict of interest statement. None declared.
The authors thank Sabine Thamm and Irina Girnus for assistance in preparation of plasmid constructs.