In this study, we have undertaken a comprehensive global analysis of the time course of gene induction following growth factor stimulation of quiescent human cells. As expected, we identified both rapid and delayed gene inductions resulting from PDGF stimulation. Forty-nine genes were induced within 30 minutes of stimulation, as expected for immediate-early genes, whereas 84 genes required 2-4 hours of PDGF stimulation for maximum induction. Surprisingly, we found that the majority of the genes induced with delayed kinetics (58/84) were primary response genes, since their induction was not inhibited by cycloheximide. The transcriptional program induced by growth factor stimulation thus involved three distinct classes of genes: immediate-early genes, delayed primary response genes, and secondary response genes, which accounted for approximately 37%, 44% and 19% of the genes induced within 4 hours of PDGF stimulation, respectively. Similar kinetics of induction of representative delayed primary response genes were observed in response to the alternative mitogens EGF and serum, suggesting that these induction kinetics are not PDGF-specific. Examples of delayed primary response genes have been observed by others in primary human fibroblasts (3), rat arterial smooth muscle cells (4) and mouse 3T3 cells (5), but the large number of primary response genes we found to be induced with such delayed kinetics was unexpected, suggesting a more complex regulatory landscape in mammalian cells.
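The three-way classification and the class proportions quoted above can be illustrated with a minimal sketch. The gene counts (49 rapid, 84 delayed, 58 of those insensitive to cycloheximide) come from the text; the 0.5 hour cutoff, the 3.0 hour placeholder peak time, and the function name `classify` are assumptions made only for this illustration and are not the criteria used in the study.

```python
from collections import Counter

RAPID_CUTOFF_HOURS = 0.5  # assumed cutoff standing in for "within 30 minutes"


def classify(peak_hours, chx_sensitive):
    """Assign a response class from induction timing and cycloheximide sensitivity."""
    if peak_hours <= RAPID_CUTOFF_HOURS:
        return "immediate-early"
    # Delayed genes split on whether induction needs new protein synthesis.
    return "secondary" if chx_sensitive else "delayed primary"


# Counts taken from the text; the 3.0 h peak time is a placeholder for "2-4 hours".
genes = (
    [(0.5, False)] * 49    # rapidly induced
    + [(3.0, False)] * 58  # delayed, not inhibited by cycloheximide
    + [(3.0, True)] * 26   # delayed, inhibited by cycloheximide (84 - 58)
)

counts = Counter(classify(t, s) for t, s in genes)
total = sum(counts.values())
fractions = {name: n / total for name, n in counts.items()}

for name, n in counts.items():
    print(f"{name}: {n}/{total} = {100 * n / total:.1f}%")
```

The computed shares agree with the approximate percentages quoted above to within about half a percentage point.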
Transcriptional programs are often represented as gene networks, where products of expressed genes activate or repress secondary downstream gene targets. Many analyses assume temporal regulation according to the canonical immediate-early/secondary response gene paradigm to infer protein-gene interactions from correlations in gene expression data (33). By highlighting the unexpectedly high incidence of delayed primary response genes, our results have broad implications for analyses that infer regulatory interactions from temporal correlations in gene expression. Since many genes that are induced with a significant lag after growth factor stimulation are still primary response genes, it cannot be assumed that temporally delayed gene expression requires the prior induction of upstream transcriptional regulators.
Because delayed primary response genes represented a major component of the transcriptional response to growth factor stimulation, we used both computational and experimental tools to elucidate the properties of this group of genes. We first sought to determine whether the delayed primary response genes shared similar functions with the immediate-early genes. We therefore compared the functional classifications of the immediate-early and delayed primary response genes using the Gene Ontology (GO) database. The immediate-early genes were enriched in Molecular Function terms related to transcriptional regulation. This corresponded well with their recognized role as transcriptional effectors in the induction of secondary response genes. In contrast, the delayed primary response genes were not enriched in functions related to transcriptional regulation and had no significant functional overlap with the immediate-early genes. These comparisons suggest that the products of immediate-early genes may have unique functions in regulating the transcriptional response to growth stimulation, while the delayed primary and secondary response genes may function as effectors of this transcriptional program. In this regard, it is noteworthy that cyclin D1 was initially described as a secondary response gene in macrophages, whose induction linked cell cycle proliferation to growth factor stimulation (34). However, cyclin D1 behaved as a delayed primary response gene in the present study, as well as in 3T3 cells (8) and human fibroblasts (6
We also examined the basis for the distinct kinetics of induction of immediate-early and delayed primary response gene mRNAs. Analysis of hnRNA demonstrated that both immediate-early and delayed primary response genes were induced at the transcriptional level. The hnRNAs of immediate-early genes were rapidly induced, coincident with the rapid inductions of their mRNAs. For a number of delayed primary response genes, the lag in mRNA induction appeared to result from a delay in either transcription initiation or the start of productive elongation, as suggested by the delayed inductions of their hnRNAs. In contrast, hnRNAs of other delayed primary response genes were rapidly induced, suggesting that the lag in mRNA induction resulted from delays in subsequent stages of transcriptional elongation or processing. These differences between the kinetics of induction of immediate-early and delayed primary response gene mRNAs appear to be associated with a combination of factors, including the over-representation of upstream binding sites for shared transcription factors, core promoter elements, gene length, and exon frequency.
Computational comparisons revealed striking differences in the prevalence of predicted binding sites for shared transcription factors in the upstream regions of immediate-early and delayed primary response genes. Binding sites for several known regulators, including SRF, AP-1, CREB, KROX and NF-κB, were over-represented in the upstream regions of immediate-early genes compared to other genes that were expressed in T98G cells but not induced by PDGF. In contrast, binding sites for either these or other transcription factors were not significantly over-represented upstream of the delayed primary response genes. The absence of predicted binding site enrichment upstream of the delayed primary response genes may indicate that, whereas immediate-early genes are activated by a shared set of transcription factors, the delayed primary response genes are controlled by a more diverse set of regulators, which would not be identified as over-represented in the gene set. Alternatively, it is possible that delayed primary response genes contain fewer clusters of transcription factor binding sites near their promoters than immediate-early genes, or that the transcription factor binding sites upstream of delayed primary response genes are lower affinity sites than those upstream of immediate-early genes, since lower affinity sites that are divergent from the binding site matrix might not be scored in the computational analysis. Both of these factors could reduce the affinity of transcription factor binding to the promoter regions of delayed primary response genes, correspondingly reducing their rates of transcriptional activation.
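As an illustration of what over-representation of a binding site in a gene set means statistically, the following is a minimal sketch of a one-sided hypergeometric test. This is a commonly used stand-in, not necessarily the statistic applied in the analysis described above, and every count in the example is hypothetical. The function name `hypergeom_sf` is likewise introduced only for this sketch.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """Return P(X >= k) for n genes drawn from N, of which K carry the site."""
    upper = min(K, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / comb(N, n)


# Hypothetical counts, chosen only to show the calculation:
N = 1000  # induced genes plus expressed, non-induced background genes
K = 100   # genes in that universe with a predicted site upstream
n = 49    # size of the gene set being tested (e.g. immediate-early genes)
k = 15    # genes in the set with a predicted site

expected = n * K / N
p_value = hypergeom_sf(k, N, K, n)
print(f"observed {k}, expected {expected:.1f}, one-sided p = {p_value:.2e}")
```

If a gene set is controlled by a diverse collection of regulators, each factor's sites appear in only a few members, so `k` stays near the expected value and no single factor reaches significance, which is one of the interpretations offered above for the delayed primary response genes.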
The core promoters of the immediate-early genes also differed from those of the delayed primary response genes. In particular, promoters of the immediate-early genes contained higher affinity TATA boxes than those of the delayed primary response genes. Similarly, the prevalence of TATA boxes in the promoters of immediate-early genes (59%) was significantly higher than in the promoters of delayed primary response genes (34%) or in all genes in the genome (22%). This may have important implications in transcription initiation, with higher affinity TATA boxes conferring greater transcriptional activity on the promoters of immediate-early genes. Reinforcing the notion that the immediate-early genes have stronger, more defined initiation is the demonstration that these genes also have a significant bias for the SP, or single peak, promoter class defined by CAGE analysis (25). Moreover, because some components of the transcription initiation complex, including TBP, remain bound to DNA following pol II promoter clearance, the stability of these factors may modulate the transcription reinitiation rate. Thus, high scoring TATA boxes present in immediate-early promoters may represent higher affinity TBP binding sites that confer rapid reinitiation (35). Indeed, previous work demonstrated instability of TBP-TATA interactions following the first round of transcription (36) and non-canonical TATA box sequences diminish binding of TFIIA (37), a general transcription factor that is thought to stabilize the TBP-TATA complex (38
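To make the idea of a TATA box "score" concrete, here is a minimal sketch of log-odds scoring with a position weight matrix. The count matrix below is hypothetical (a 6-bp TATAAA-like motif invented for illustration), as are the uniform background, the pseudocount and the two test sequences; none of these is the matrix or threshold used in the study.

```python
import math

# Hypothetical per-position base counts for a TATAAA-like motif (illustration only).
COUNTS = {
    "A": [2, 90, 3, 85, 80, 75],
    "C": [5, 3, 2, 3, 5, 8],
    "G": [3, 3, 2, 2, 5, 7],
    "T": [90, 4, 93, 10, 10, 10],
}
BACKGROUND = 0.25   # assumed uniform base composition
PSEUDOCOUNT = 1.0   # avoids log(0) for unobserved bases


def log_odds_matrix(counts):
    """Convert base counts into per-position log2 odds against the background."""
    width = len(counts["A"])
    matrix = []
    for pos in range(width):
        col_total = sum(counts[b][pos] + PSEUDOCOUNT for b in "ACGT")
        matrix.append({
            b: math.log2((counts[b][pos] + PSEUDOCOUNT) / col_total / BACKGROUND)
            for b in "ACGT"
        })
    return matrix


def best_score(sequence, matrix):
    """Return the highest-scoring window of the matrix width within the sequence."""
    width = len(matrix)
    return max(
        sum(matrix[i][sequence[start + i]] for i in range(width))
        for start in range(len(sequence) - width + 1)
    )


PWM = log_odds_matrix(COUNTS)
canonical = best_score("GGCTATAAAGGC", PWM)  # contains the consensus TATAAA
divergent = best_score("GGCTAGCAAGGC", PWM)  # two mismatches to the consensus
print(f"canonical: {canonical:.2f}, divergent: {divergent:.2f}")
```

A sequence that matches the consensus scores higher than a divergent one, which is the sense in which a higher-scoring TATA box is read above as a predicted higher affinity TBP binding site.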
The differences in both upstream transcription factor binding sites and core promoters are also consistent with differences in the binding of RNA polymerase II to the promoter regions of immediate-early and delayed primary response genes. Chromatin immunoprecipitation indicated that pol II was bound to the promoters of both immediate-early and delayed primary response genes in unstimulated cells, and that pol II occupancy increased on the promoters of about one-third of the genes in both groups following growth factor stimulation. Thus, transcriptional induction of the majority of immediate-early and delayed primary response genes may result from the start of productive elongation by a paused polymerase, rather than by recruitment of pol II to the preinitiation complex. These findings are consistent with previous demonstrations of paused polymerases near the transcription start sites of immediate-early genes, including FOS, as well as with global analyses that have detected preinitiation complexes at the promoters of many non-transcribed genes in human cells (39). Importantly, however, the amount of pol II bound to the promoters of immediate-early genes was significantly greater than that bound to the promoters of delayed primary response genes. These differences in pol II occupancy highlight a key distinction between the immediate-early and delayed primary response gene promoters. Together with the differences in both upstream transcription factor binding sites and TATA boxes, these findings point to transcription initiation, and perhaps reinitiation, as one of the primary mechanisms for rapid responses of immediate-early genes to growth factor stimulation relative to the delayed primary response genes.
Our analysis also revealed significant differences between the immediate-early and delayed primary response genes in both primary transcript lengths and exon frequencies. The immediate-early genes tend to be shorter and contain fewer exons than the delayed primary response genes, which are similar in length and exon frequency to other genes in the genome. These transcript features may contribute significantly to the lag in mRNA expression of delayed primary response genes, particularly for those genes that displayed a rapid induction of transcription, as detected by hnRNA. VCL provides an extreme example of the possible effect of primary transcript length and exon frequency on kinetics of mRNA expression. Analysis of hnRNA established that transcription of VCL was rapidly initiated, similar to immediate-early genes, such as FOS. Consistent with its rapid transcriptional induction, SRF has been reported to be a key inducer of VCL. However, the accumulation of VCL mRNA was delayed by 2-3 hours compared to the hnRNA. This lag in mature VCL mRNA production may be explained by the 122 kb primary transcript length, which is more than six times the average immediate-early gene primary transcript length, and the presence of 22 exons, which is almost four times the average number of immediate-early gene exons. At the other extreme, DKK (another delayed primary response gene) has a primary transcript of only 3.3 kb containing 4 exons, comparable to that of the shortest immediate-early genes. In contrast to VCL, transcriptional induction of DKK is delayed for 2-3 hours after growth factor stimulation, coincident with increased pol II occupancy at its promoter. DKK may therefore represent an example of a gene whose delayed induction results primarily from a lag in pol II recruitment and transcription initiation.
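The contrast between these two genes can be put in rough numbers with a back-of-the-envelope sketch of the minimum time pol II needs to traverse each primary transcript. The transcript lengths are taken from the text; the elongation rate of 2 kb per minute is an assumption chosen for illustration (it was not measured in this study), and splicing time is ignored.

```python
ASSUMED_ELONGATION_KB_PER_MIN = 2.0  # assumption for illustration, not a measured value


def min_transcription_minutes(length_kb, rate_kb_per_min=ASSUMED_ELONGATION_KB_PER_MIN):
    """Lower bound on the time to transcribe one primary transcript end to end."""
    return length_kb / rate_kb_per_min


# Primary transcript lengths (kb) as given in the text.
transcripts_kb = {"VCL": 122.0, "DKK": 3.3}

estimates = {gene: min_transcription_minutes(kb) for gene, kb in transcripts_kb.items()}
for gene, minutes in estimates.items():
    print(f"{gene}: at least {minutes:.1f} min to transcribe {transcripts_kb[gene]} kb")
```

Under this assumed rate, elongation alone would take on the order of an hour for the long transcript but only a couple of minutes for the short one, so transcript length could plausibly contribute to the lag for the former and not for the latter. A slower rate or additional time for processing 22 exons would lengthen the estimate further.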
Multiple differences between immediate-early and delayed primary response genes thus appear to contribute to the distinct kinetics of induction of their mRNAs. The immediate-early genes are characterized by over-representation of binding sites for several transcription factors in their upstream regions, promoters with high affinity TATA boxes, and short primary transcripts containing relatively few exons. In all of these respects, the delayed primary response genes are similar to other genes in the genome. Additional features, such as chromatin structure, may also distinguish immediate-early from delayed primary response genes, as has been reported for genes displaying rapid versus delayed inductions in response to other stimuli (41
To determine whether these characteristics of immediate-early genes were consistent in other cell types, we analyzed the features of immediate-early genes induced by the mitogenic stimuli EGF in HeLa cells and serum in MCF10A cells (normal human breast epithelial cells) in published data sets (44). As in T98G cells, the immediate-early genes induced in both HeLa and MCF10A cells showed an over-representation of transcription factor binding sites, including sites for SRF, AP-1, CREB, KROX and NF-κB, that were conserved in mouse and dog (Supplementary Table 7; for complete Transfac output, see Supplementary Table 8). Likewise, immediate-early genes in HeLa and MCF10A cells had significantly higher TATA scores, lower exon frequencies and shorter transcript lengths as compared to the genome as a whole (Supplementary Figures 3-4). Thus, the immediate-early genes induced in T98G, HeLa, and MCF10A cells by three different mitogens share common characteristics of genomic organization.
The multiple features associated with rapid induction of immediate-early genes may have been selected for based on the functions of immediate-early gene products as transcriptional regulators that mediate subsequent alterations in gene expression in response to growth factor stimulation. The rapid induction of immediate-early genes might be expected to play an important role in achieving a robust cellular response to extracellular signals. In contrast, the lag in induction of both delayed primary and secondary response genes is consistent with the apparent functions of these genes as effectors rather than mediators of growth factor signaling. Thus, immediate-early genes are not only characterized by a lack of requirement for new protein synthesis prior to their transcriptional induction; they also possess distinct genomic features that may have been selected to confer rapid inducibility.