We set out to examine the organization and function of genomic elements responsible for transcriptional regulation by GR. Our study yielded five conclusions: (1) GR occupancy at a GRE is generally a limiting determinant of glucocorticoid response in A549 cells; (2) the core GR binding sequences conform to a consensus that displays substantial GRE-to-GRE variation as anticipated, but the precise binding sequences at individual GREs are highly conserved through evolution; (3) GREs appear to be evenly distributed upstream and downstream of their target genes; (4) most GREs are positioned at locations remote from the TSSs of their target genes; and (5) native GREs are commonly composite elements, composed of multiple factor binding sites, and they are individually conserved in position and architecture yet very different from each other. We shall consider the implications of these conclusions in turn.
We began by surveying more than 1,000 genes, with half of them candidates for steroid regulation, and a specific subset known to be GR-regulated in A549 cells. We found that GR occupancy of A549 GREs correlated strongly (nearly 90%) with genes that are glucocorticoid responsive in A549, suggesting that GR binding is generally a limiting determinant for response in these cells. In a small number of cases, we observed GR occupancy close to genes that were GR-unresponsive in A549 cells, but were steroid regulated in other cells [
4] (E. C. Bolton and K. R. Yamamoto, unpublished results). This implies that GR occupancy at these genes likely reflects bona fide response element binding, but that GR binding is not a limiting factor for glucocorticoid regulation of this minority class of genes in A549 cells. Collectively, our data suggest that restriction of GR occupancy may be responsible for much of the cell-specific GR-mediated regulation in A549 cells. Occupancy could be restricted by positive or negative mechanisms, such as accessory factors that stabilize GR binding or chromatin packaging that precludes it. Although the strong correlation between GR occupancy and glucocorticoid responsiveness in A549 cells seems likely to hold in other cell types, it is conceivable that responsiveness is determined differently elsewhere. Thus, it will be interesting to examine cell-specific GR regulation in other cell types to complement the observations made in A549 cells. It is intriguing that one component, GR, within such varied and complex machineries would so strongly predominate as a determinant of transcriptional regulation in A549 cells. It will be interesting to examine regulatory complexes that mediate other types of responses (e.g., heat shock and DNA damage) to assess whether response element occupancy by a single factor in each class is a dominant determinant of responsiveness.
We examined sequence conservation of a set of GREs that are occupied by GR both in human lung epithelial cells and in mouse mesenchymal stem cells. We found that the 15-bp core GR binding sequences varied greatly among the different GREs (B), whereas the sequences of the individual binding sites were nearly fully conserved across four mammalian species (C). Crystallographic studies demonstrate that GR makes specific contacts with only four bases of the 15-bp core binding sequence [
35], yet every position, including the “spacer” between the hexameric half sites, appears to be equivalently conserved. This indicates that the binding sequences do more than merely localize GR to specific genomic loci and may carry a regulatory code that affects GR function. Leung et al. reported similarly strong evolutionary conservation of individual κB binding sequences [
36]. Indeed, Luecke and Yamamoto showed that GR directs distinct regulatory effects when tethered to NFκB at two κB response elements that differ by only one base pair [
7]. Thus, one interpretation of our findings is that factor binding sites may serve as allosteric effectors [
19] in which individual binding sequences convey subtle conformational differences to specify distinct factor functions. Conceivably, this hypothesis might also explain why GR predominates as a limiting determinant of responsiveness, because factors that read allosteric regulatory codes might specify the rules for assembly of GRE-specific and thus gene-specific regulatory complexes.
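The comparison of binding sequences across species described above can be illustrated with a minimal sketch that scores, for each position of a 15-bp site, the fraction of orthologous sequences matching the human base. The function name `per_position_identity` and the sequences are hypothetical placeholders, not data from this study, and the sketch assumes the orthologous sites are already aligned without gaps.

```python
def per_position_identity(reference, orthologs):
    """Fraction of ortholog sequences matching the reference base at each position.

    Assumes all sequences are pre-aligned, gap-free, and of equal length.
    """
    if any(len(seq) != len(reference) for seq in orthologs):
        raise ValueError("all sequences must have the same length as the reference")
    return [
        sum(seq[i] == ref_base for seq in orthologs) / len(orthologs)
        for i, ref_base in enumerate(reference)
    ]


# Hypothetical 15-bp site: two hexameric half sites separated by a 3-bp spacer
# (positions 6-8, zero-based). These are illustrative sequences only.
human = "AGAACATTCTGTTCT"
others = [
    "AGAACATTCTGTTCT",  # placeholder ortholog 1, identical to the reference
    "AGAACATTCTGTTCT",  # placeholder ortholog 2, identical to the reference
    "AGAACATCCTGTTCT",  # placeholder ortholog 3, one spacer substitution
]

identity = per_position_identity(human, others)
spacer_identity = identity[6:9]
```

In this toy example only the middle spacer position falls below full identity; a site that is conserved in the sense discussed here would score at or near 1.0 at every position, spacer included.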
To characterize the architecture of GREs, we took several approaches. In unbiased computational analyses, we identified enriched sequence motifs within 500-bp segments encompassing core GR binding sites. Sequence motifs resembling binding sites for GR, AP-1, ETS, SP1, C/EBP, and HNF4 were overrepresented relative to a background of regions not bound by GR, consistent with the notion that native GREs are composite elements. For most of these GREs, the role of these factors in GR transcriptional regulation remains to be tested, but it is notable that ETS-1, SP1, and HNF4 have been shown at other genes to augment glucocorticoid responses [
37–
39]. Moreover, Phuc Le et al. [
40] described motifs resembling AP-1 and C/EBP binding sites within certain mouse GREs and showed that nearly half of the GREs predicted to encompass C/EBP binding sites did indeed bind C/EBPβ [
40]. These findings support the view that our computational analysis can infer factors that potentially interact with GR at GREs. Using a similar approach, Carroll et al. [
9] and Laganiere et al. [
41] have interrogated estrogen response elements and identified FOXA1 as a factor that plays an important role in both estrogen receptor binding and transcriptional activity. Thus, we anticipate that the factors that occupy the GR composite elements may interact physically, functionally, or both, thereby affecting binding as well as regulatory activity. Indeed, an averaged comparison of human and mouse sequences flanking core GR binding sites revealed that a region of approximately 1 kb was conserved above the background level (C), suggesting that native composite GREs are extensive and typically may contain numerous factor binding sites. Interestingly, individual GREs displayed distinctive patterns of sequence conservation extending from the core GR binding sites (D;
Figure S3). These GRE signatures likely reflect conservation of various sequence motifs at different positions within each element, producing GRE-specific (and therefore gene-specific) architecture that likely creates distinct regulatory effects.
To investigate the distribution of regulatory elements relative to their target genes, we monitored GR occupancy across 100-kb regions centered on the TSSs of glucocorticoid-responsive genes. We found that GREs were evenly distributed upstream and downstream of their target genes, with the majority located >10 kb from their target promoters; other metazoan regulatory factors, such as estrogen receptor (ER) and STAT1, have similarly been reported to act from sites remote from their target genes [
9,
42–
45]. In contrast to these factors, E2F1 was shown to mainly bind promoter proximal regions [
42]; others have used computational approaches to infer factor binding sites close to promoters, but these have not been experimentally confirmed [
46]. Consistent with our findings, Carroll et al. reported that only 4% of ER binding regions mapped within −800 bp to +200 bp of the TSSs of known RefSeq genes [
43]. Our data demonstrated that 9% of GBRs were positioned at this location. These studies together imply that steroid receptors, which include estrogen receptor and GR, in general regulate transcription from remote locations. Interestingly, we found that the positions of individual GREs were generally conserved across species (
Table S2), implying that GRE position may be functionally important for target gene regulation. In any case, our findings differ dramatically from those in prokaryotes and fungi, where transcriptional regulatory elements are promoter proximal. It has been suggested that these two broad classes of regulatory mechanisms, so-called long range and short range, are mechanistically and evolutionarily related, and that long range control might facilitate regulatory evolution [
11]. As predicted by that model, distal elements, far from target genes as measured by linear DNA distance, may operate in close proximity with their target promoters in 3-D space. For example, Carroll et al. detected an interaction between the
NRIP-1 promoter and its distal estrogen response element [
9]. It will be interesting to determine whether response element location (i.e., promoter proximal versus distal) is somehow related to mechanism or to physiological network.
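The positional categories discussed above (the −800 bp to +200 bp promoter window and the >10 kb remote class) can be made concrete with a short sketch. The helper names, the "intermediate" label for everything in between, and the coordinates are illustrative assumptions rather than the procedure used in this study; the key point is that offsets must be computed relative to gene strand so that "upstream" has a consistent meaning.

```python
def signed_offset(site_pos, tss_pos, strand):
    """Offset of a binding site from a TSS in bp; negative values are upstream.

    For genes on the minus strand the sign is flipped, so that upstream always
    means 5' of the TSS with respect to the direction of transcription.
    """
    delta = site_pos - tss_pos
    return delta if strand == "+" else -delta


def classify_position(offset, proximal_window=(-800, 200), remote_cutoff=10_000):
    """Label a site by its offset from the TSS.

    The window and cutoff follow the values quoted in the text; the
    'intermediate' class is an assumption added for completeness.
    """
    if proximal_window[0] <= offset <= proximal_window[1]:
        return "promoter-proximal"
    if abs(offset) > remote_cutoff:
        return "remote"
    return "intermediate"


# Hypothetical (site position, TSS position, strand) triples.
examples = [
    (10_500, 10_000, "+"),  # 500 bp downstream of a plus-strand TSS
    (10_500, 10_000, "-"),  # 500 bp upstream of a minus-strand TSS
    (45_000, 10_000, "+"),  # 35 kb downstream
    (7_000, 10_000, "+"),   # 3 kb upstream
]
labels = [classify_position(signed_offset(*ex)) for ex in examples]
```

The first two examples share identical coordinates but fall into different classes once strand is taken into account, which is why an asymmetric window such as −800 to +200 cannot be applied to raw genomic distances.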
Remote response element locations can complicate assignment of cognate target genes. An extreme example is olfactory receptor gene expression, which is governed by a regulatory element that can operate on target genes located on different chromosomes [
47]. In this study, we assigned the GREs to the nearest RefSeq gene responsive to dex in A549 cells. In other contexts, these GREs may be nonfunctional or may operate on genes other than those assigned in A549 cells. Clearly, unequivocal assignment of a GRE to a given target gene will require genetic manipulations not readily accessible in mammalian cells at present. It is encouraging, however, that GR occupancy of GREs correlated strongly with glucocorticoid responsiveness of adjacent genes, supporting the view that these are bona fide direct GR targets. In fact, when these genes were subjected to Gene Ontology analysis, we found that they were enriched in cell growth and immune responses (unpublished data), two biological processes regulated by GR in A549 cells [
48,
49]. We found GR occupancy at genes up- and down-regulated in response to dex, consistent with GR serving either as activator or repressor in different contexts. At present, we cannot assess the significance of the finding that GR was detected at GREs adjacent to activated genes versus repressed genes at a 6:1 ratio in A549 cells; whether this difference reflects differences in GRE occupancy, epitope accessibility, crosslinking efficiency, or other variables has not been determined.
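The assignment rule described above, linking each GRE to the nearest dex-responsive gene, can be sketched as follows. This is a simplified illustration under stated assumptions: distance is measured to the TSS, candidates are restricted to the same chromosome, and the gene names, coordinates, and the `responsive` flag are hypothetical.

```python
def nearest_responsive_gene(gre, genes):
    """Return the name of the closest responsive gene on the same chromosome.

    gre   -- (chromosome, position) tuple for the response element
    genes -- list of dicts with keys 'name', 'chrom', 'tss', 'responsive'
    Returns None when no responsive gene lies on the GRE's chromosome.
    """
    chrom, pos = gre
    candidates = [g for g in genes if g["chrom"] == chrom and g["responsive"]]
    if not candidates:
        return None
    return min(candidates, key=lambda g: abs(g["tss"] - pos))["name"]


# Hypothetical annotation; 'responsive' marks genes regulated by hormone.
genes = [
    {"name": "geneA", "chrom": "chr1", "tss": 100_000, "responsive": True},
    {"name": "geneB", "chrom": "chr1", "tss": 130_000, "responsive": False},
    {"name": "geneC", "chrom": "chr1", "tss": 200_000, "responsive": True},
    {"name": "geneD", "chrom": "chr2", "tss": 50_000, "responsive": True},
]

assigned = nearest_responsive_gene(("chr1", 128_000), genes)
unassigned = nearest_responsive_gene(("chr3", 1_000), genes)
```

Here the GRE lies only 2 kb from geneB, yet it is assigned to geneA because geneB is unresponsive. This illustrates the caveat raised in the text: the rule yields a plausible target in one cell context, not an unequivocal one.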
Genomic response elements orchestrate transcriptional networks to mediate cellular processes in single-celled and multicellular organisms. The present study advanced our understanding of the organization, evolution, and function of GREs and at the same time raised a series of interesting questions. Among the more intriguing: How is GR occupancy restricted to a small subset of potential GREs in a given cell context? What is driving the strong conservation of virtually every base pair within the core GR binding sequence at individual GREs? Addressing these and other questions raised in our study will contribute new insights into gene regulation by GR and by other regulatory factors.