In order for transcription to start, the transcription machinery must first overcome the negative effects of chromatin, the highly ordered compact form in which native DNA exists inside every cell [28]. The fundamental structural units of chromatin are nucleosomes, which consist of 146 base pairs of DNA wrapped around a single histone octamer composed of two histone H2A–H2B dimers and one H3–H4 histone tetramer [29]. The compaction of DNA into chromatin prevents the protein–DNA interactions required for transcription, unless these chromatin structures are decondensed and altered in ways to make the underlying DNA sequence available to transcription factors and RNA polymerase II (RNAPII).
Enhancer elements can be defined as DNA sequences that serve to recruit transcription factors which promote the decondensation of repressed chromatin and/or facilitate the assembly of the transcription machinery at gene promoters [30]. The human genome encodes approximately 1700–1900 sequence-specific transcription factors [31••]. These proteins usually contain two distinct domains, one responsible for the recognition of specific DNA sequences (DNA-binding domain), the other carrying out a regulatory function (regulatory domain). One primary function of the regulatory domain is to recruit cofactors that carry chromatin-remodeling activities or can directly interact with the RNAPII transcriptional machinery [32
Several classes of protein complexes are recruited to specific enhancer elements to remodel the local chromatin structures [32] (Figure 2). One class of proteins, represented by the SWI/SNF complexes, modifies the chromatin structure noncovalently in an ATP-dependent fashion [34]. These proteins, once recruited to enhancer elements, can reposition specific nucleosomes along the DNA. Consequently, core promoters may be exposed to allow transcription to start [35]. Alternatively, key transcription factor target sites may also be exposed to allow the assembly of functional enhancer complexes. Another class of cofactors remodels chromatin structure by introducing covalent modifications to the N-terminal tails of histones [8]. One of the well-known modifications involves the acetylation of histones H3 and H4 at the N-terminal domains. Such modifications may directly induce the decondensation of packed nucleosomes, or serve as a platform for the recruitment of additional chromatin-remodeling factors. A number of histone acetyltransferases (HATs) have been identified that catalyze the acetylation of histones at specific residues. The protein complexes that catalyze histone acetylation include PCAF, CBP, p300, GCN5, TRRAP, and others, which are also known to function as cofactors for many transcriptional activators [38
Figure 2 Three mechanisms by which enhancers act to enhance transcription at target promoters. (a) Transcription factors recruit nucleosome-remodeling complexes containing SWI/SNF proteins (green ovals), ATPases that can slide nucleosomes along the DNA in an ATP-dependent …
The third class of cofactors that can be recruited to enhancer elements includes so-called mediator complexes [39]. These proteins facilitate transcription by serving as interfaces between sequence-specific transcription factors and the general transcription apparatus in eukaryotes. Transcriptional coactivators in this category, including MED1, p160, Asc2, and others, have been shown to be recruited to specific enhancer sequences to promote the assembly of functional transcription initiation complexes [40
One type of experimental evidence that suggests a DNA sequence is an enhancer element is its association with an activator protein that binds to specific DNA sequences. Although this strategy has been successfully carried out for a number of transcription factors in a variety of cell types (for example, see Refs. [10]), it is not feasible for the determination of all enhancer elements, because of the large number of transcription factors encoded by the human genome and the number of cell types needed. Further, the mere binding of a sequence-specific DNA-binding protein could lead to activation, repression, or no transcriptional consequence. Therefore, an alternative approach has been used to determine the binding sites of coactivator proteins, such as p300, binding of which is more closely related to transcriptional activation. Using this strategy, Visel et al. recently determined the p300-binding sites in the mouse genome in forebrain, midbrain, and limb of E11.5 mouse embryos [43••]. Between 500 and 2500 binding sites were identified in each of the embryonic tissues. That these elements function as tissue-specific enhancers was confirmed by mouse transgenic reporter assays, which showed that over 80% of the tested elements drive reporter gene expression in the tissue where p300 binding was detected. Because virtually all known transcription factors function by recruiting transcriptional coregulators, and because the number of chromatin-remodeling complexes or mediators in the genome is much smaller than the number of sequence-specific transcription factors, the strategy of using cofactors as one ‘marker’ for enhancer elements is a more practical approach to identify all enhancer elements in the genome. The main hurdle, however, is the availability of suitable antibodies against each of the known coactivator proteins.
Another strategy to experimentally determine enhancers stems from the initial observation that distal p300-binding sites are associated with a unique combination of chromatin modifications that involves, among others, the presence of mono-methylated histone H3 lysine 4 (H3K4me1) and the absence of the tri-methylated form of this lysine (H3K4me3) [22] (Figure 3). Indeed, when this chromatin modification signature was used to search for additional similar genomic regions in a 1% sampling of the human genome, approximately 400 putative enhancers were identified that included 85% of the p300-binding sites and ~300 other sequences. Importantly, the majority of these putative enhancers are associated with DNaseI hypersensitivity, bound by coactivators p300 or MED1, and associated with additional ‘active’ chromatin marks such as histone acetylation, making them likely enhancers. When tested in reporter assays, the predicted enhancers can indeed support transcriptional activation, providing preliminary evidence for their function. With the enhancer-specific chromatin signatures, we have generated a list of more than 90 000 potential enhancers in four types of human cells ([22]; Hawkins et al., unpublished data). To date, a total of 26 predicted enhancers have been tested by reporter assay in transient transfection in vitro, and over 80% (21 out of 26) of the tested fragments were shown to possess enhancer activity, supporting the validity of this enhancer-finding method ([22]; Hawkins et al., unpublished data). It is worth noting that although many chromatin modification marks are found at enhancers and can be used to predict such elements in the genome, Heintzman et al. found that profiles of just two chromatin marks, H3K4me1 and H3K4me3, are sufficient to achieve excellent specificity and sensitivity [22]. Additionally, this minimal chromatin signature has been used to identify enhancers in a variety of different cell types, in both humans and mice (Ren et al., unpublished data).
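The logic of the two-mark signature can be illustrated with a minimal sketch. It is a deliberate simplification: regions are labelled by fixed cutoffs on H3K4me1 and H3K4me3 enrichment, rather than by comparison with averaged modification profiles derived from known enhancers. The `Region` record, the score scale, and both cutoff values are hypothetical and are not taken from the cited studies.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """A candidate genomic interval with two hypothetical enrichment scores."""
    chrom: str
    start: int
    end: int
    h3k4me1: float  # assumed ChIP enrichment score for H3K4me1
    h3k4me3: float  # assumed ChIP enrichment score for H3K4me3


def classify(region: Region, me1_min: float = 1.0, me3_max: float = 0.5) -> str:
    """Label a region from the presence or absence of the two marks.

    Both cutoffs are illustrative assumptions, not published values.
    """
    if region.h3k4me3 > me3_max:
        # H3K4me3 present: does not fit the enhancer signature
        return "promoter-like"
    if region.h3k4me1 >= me1_min:
        # H3K4me1 present and H3K4me3 absent: enhancer signature
        return "putative enhancer"
    return "unclassified"


# Toy regions with made-up coordinates and scores
regions = [
    Region("chr1", 1000, 3000, h3k4me1=2.1, h3k4me3=0.1),
    Region("chr1", 50000, 52000, h3k4me1=0.8, h3k4me3=3.0),
    Region("chr2", 7000, 9000, h3k4me1=0.2, h3k4me3=0.1),
]
labels = [classify(r) for r in regions]
```

In practice the scores would come from genome-wide ChIP profiles, and any cutoff would need to be calibrated against a set of known enhancers and promoters.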
Figure 3 A strategy to map enhancers based on their chromatin signatures. (a) Derivation of the average chromatin modification profiles from known enhancers. Enhancers (green DNA) are flanked by nucleosomes containing mono-methylated histone H3 lysine 4 (H3K4me1, …
Cell type specific activity of enhancers
One of the most prominent features displayed by enhancers, compared with promoters and insulator elements, is their cell type specific activity. While previous work on classical enhancers such as those in beta-globin genes has suggested such properties of enhancers, recent genome-wide studies have confirmed this on a global scale. Among the p300-binding sites identified in three embryonic tissues, the majority are occupied by the coactivator in only one of the tissues, and when tested in mouse transgenic assays exhibited tissue-specific enhancer activities [43]. Similarly, p300-binding sites found in three human cell lines demonstrated highly cell type specific occupancy by the factor [22•]. Furthermore, the enhancers identified in different cell types are associated with cell type dependent chromatin modification patterns. The cell type specific presence of chromatin marks, such as H3K4me1, at enhancers is closely correlated with cell type specific expression of the putative targets of these enhancers. These findings indicate that enhancers are more dynamically regulated across cell types than promoters or insulators, suggesting that these elements are of primary importance in driving cell type specific gene expression.
Enhancer-specific transcription factors
Computational analysis of the putative enhancers discovered in the human genome has revealed a number of over-represented DNA motifs, with some matching the recognition sites of known transcription factors [22•]. Interestingly, of the 41 motifs identified in these enhancer sequences, over 90% appear to be unique to enhancers, and exhibit no enrichment at promoters, suggesting that some transcription factors may function exclusively through these distal cis-regulatory elements. Indeed, recent investigations into the genomic binding sites of 14 sequence-specific transcription factors in mouse embryonic stem cells revealed two classes of in vivo binding sites by these factors: nearly half of them, including Oct4, Sox2, and Nanog, appear to bind preferentially to distal regulatory sequences, while the rest, including cMyc, prefer to occupy promoters [42
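The idea of a motif being enriched at enhancers but not at promoters can be sketched with a simple occurrence count. This is only a toy illustration: the cited analyses are not described in enough detail here to reproduce, and real motif discovery typically relies on position weight matrices and statistical testing rather than exact string matches. The sequences, the motif string, and the pseudocount below are all hypothetical.

```python
def motif_fraction(sequences, motif):
    """Fraction of sequences containing the motif at least once (exact match)."""
    if not sequences:
        return 0.0
    hits = sum(1 for seq in sequences if motif in seq.upper())
    return hits / len(sequences)


def enrichment(foreground, background, motif, pseudocount=0.01):
    """Ratio of motif-containing fractions, foreground over background.

    The pseudocount (an assumption) avoids division by zero when the
    motif is absent from the background set.
    """
    fg = motif_fraction(foreground, motif)
    bg = motif_fraction(background, motif)
    return (fg + pseudocount) / (bg + pseudocount)


# Made-up sequences standing in for enhancer and promoter sets
enhancers = ["ACGTTTGCATAA", "GGATTTGCATCC", "CCCCCCCC"]
promoters = ["GGGCGGGGC", "TATAAAAGG", "CCGCGCGC"]

ratio = enrichment(enhancers, promoters, "TTTGCAT")
```

A ratio well above 1 for the enhancer set and near 1 for the promoter set would correspond to the enhancer-specific pattern described above, although a proper analysis would also control for sequence length and base composition.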
Target genes of enhancers
One of the challenges in characterizing enhancer function is determining which genes they control. The issue arises because these distal cis-regulatory elements are frequently located tens or hundreds of kilobases away from their target genes, and could be located within the gene body of nearby genes. Further complicating the issue, there have also been reports that enhancers could activate target genes located on different chromosomes [45
To resolve the target genes of enhancers, researchers frequently assign the enhancers to the nearest genes as a first-order approximation [43••]. While in most cases such an assignment would sufficiently explain cell type specific expression of genes, there has not been any report on the rate of false positives by this strategy. A variation of the above strategy is to assign enhancers to the genes located within the same genomic segments bounded by the enhancer-blocking insulator elements, which can be experimentally determined as CTCF binding sites [22•]. This strategy appears to capture nicely the correlation between chromatin modification patterns at enhancers and the differential gene expression at the presumed target genes. Consistent with this model, upon depletion of CTCF and presumably loss of enhancer-blocking function by the insulator elements, a significant number of genes located near previously shielded enhancers become activated (Figure 4). While this strategy is conceptually simple, the limitation has been a lack of understanding of the functional mechanism for enhancer blocking by insulators. As discussed above, an emerging consensus is that CTCF binding sites act to establish long-range chromosomal interactions that would lead to the formation of local topographical constraints. Depending on the way such topographical constraints are formed, the enhancer/promoter interactions may be restricted in different ways, and therefore different assignments may be made for the enhancers.
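The two assignment strategies described above can be contrasted in a short sketch. It assumes a single chromosome, genes represented only by a transcription start site, and CTCF sites treated as point boundaries that fully block enhancer action; all gene names and coordinates are made up for illustration.

```python
from bisect import bisect_right


def nearest_gene(enhancer_pos, genes):
    """Assign the enhancer to the gene with the closest start site.

    genes: list of (name, tss) tuples on the same chromosome.
    """
    return min(genes, key=lambda g: abs(g[1] - enhancer_pos))[0]


def genes_in_insulated_domain(enhancer_pos, genes, ctcf_sites):
    """Return genes that share a CTCF-bounded segment with the enhancer."""
    sites = sorted(ctcf_sites)
    i = bisect_right(sites, enhancer_pos)
    # Flanking boundaries; open-ended if no CTCF site lies on that side
    left = sites[i - 1] if i > 0 else float("-inf")
    right = sites[i] if i < len(sites) else float("inf")
    return [name for name, tss in genes if left < tss < right]


# Hypothetical gene start sites and CTCF sites (bp)
genes = [("geneA", 10_000), ("geneB", 120_000), ("geneC", 300_000)]
ctcf_sites = [100_000, 250_000]
enhancer = 95_000

by_distance = nearest_gene(enhancer, genes)
by_domain = genes_in_insulated_domain(enhancer, genes, ctcf_sites)
```

In this toy case the nearest-gene rule selects `geneB`, whereas the insulator-bounded rule selects only `geneA`, because a CTCF site lies between the enhancer and `geneB`. The example shows how the two strategies can disagree; it does not indicate which assignment is correct.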
Figure 4 Enhancer activity in the absence of CTCF. The effect of CTCF knockdown in human cells is depicted in this model, illustrating that activation of transcription (green block arrows) occurs due to loss of the enhancer-blocking insulator function, as described …
In principle, a more direct approach for assigning enhancers to target genes is to experimentally determine the long-range chromosomal interactions between enhancers and target promoters [46]. This can be accomplished by the Chromosome Conformation Capture (3C) method [47] or its high-throughput variants, including 4C (circular 3C) [48] or 5C (3C carbon copy) [45]. This strategy is based on the observation that active enhancers are brought into close proximity to target promoters through DNA looping [47]. In many cases, this method has helped to define unexpected target genes, such as those located on different chromosomes [45]. The future will see more enhancer/target relationships defined using this strategy.