Histone modifications play a key role in regulating transcription and thus ultimately regulate cellular development and differentiation. To understand how histone modifications influence normal development and disease states, a global catalogue of histone modifications and modifying enzymes in normal and disease states is necessary. The first such systematic mapping experiments using the recently developed ChIP-Sequencing technique have revealed a combinatorial modification “backbone” consisting of multiple histone modifications associated with active transcription. The human epigenomic datasets that are now being produced provide valuable resources for a better understanding of the functional regulatory elements of transcription and the pathways necessary for normal cellular development and pathological conditions.
The term “epigenetics” commonly refers to the study of mitotically and/or meiotically heritable changes in gene function that are not attributable to a change in DNA sequence. An “epigenome” is an –omic representation of all epigenetic phenomena across the genome. The best studied epigenetic phenomenon is DNA methylation [2–5]. It is currently debated whether other chromatin modifications also belong to the class of “epigenetic” modifications due to the lack of evidence of heritability. However, DNA methylation and histone methylation can influence each other. Furthermore, many histone modifications have been shown to play important functional roles in genomic regulation; their importance is unquestioned even if their label as true “epigenetic” marks may be uncertain. For practical purposes, we will define an epigenome as the genome-wide distribution of all modifications to chromatin in a cell. While there is only a single genome for each organism, each cell type or developmental state within an organism can have its own epigenome.
Although a number of techniques are available to analyze chromatin modifications at a genomic scale, including ChIP-chip and ChIP-SAGE, the combination of ChIP with massively parallel sequencing (ChIP-Seq) now allows for the genome-wide analysis of epigenetic modifications in mammalian systems [9••,10••]. The application of these techniques has led to great advances in our understanding of how epigenetic modifications contribute to important biological processes. Because the DNA methylome has been the subject of several recent reviews [5,11], we will focus our discussion here on the genome-scale studies of histone modifications.
Histone methylation has been extensively studied in recent years and is implicated in crucial functions in many cellular pathways. Once thought to be a stable modification, histone methylation is now known to be dynamically regulated by a variety of histone methyltransferases and demethylases [13,14]. These enzymes play a role in transcriptional regulation and are linked to many human diseases [5,15].
The trimethylation of histone H3 lysine 4 (H3K4me3) is highly enriched in the promoter regions of active genes in yeast, fly, and human cells [18–19, 20•]. Our recent profiling of 20 lysine and arginine methylation marks in human T cells found that H3K4me3 is highly localized around the transcription start site (TSS), whereas H3K4me2 and H3K4me1 gradually spread towards the transcribed regions of active genes (Figure 2) [9••], which is similar to observations in yeast. The high-resolution nature of the analysis revealed a decrease of the H3K4me3 signals immediately upstream of TSSs. This is due to a decrease in nucleosome occupancy just upstream of the TSS, a feature that correlated with RNA polymerase II binding. Although both H3K36me3 and H3K79 methylation are targeted to the transcribed regions of active genes, H3K36me3 peaks near the 3′ ends whereas H3K79 methylation peaks at the 5′ ends of genes (Figure 2) [22••]. The methylation marks associated with gene silencing include H3K9me2, H3K9me3, H3K27me2, H3K27me3 and H4K20me3 (Figure 2) [9••]. One surprising discovery is that monomethylation of the H3K9, H3K27 and H4K20 residues is in each case associated with actively transcribed regions, even though their di- and trimethylated states correspond to silent genes [9••]. These results suggest that all the monomethylated forms of lysines are associated with active transcription.
Abnormalities in methylation patterns, resulting from chromosomal translocation of methylation enzymes, are associated with many types of cancers, including leukemia, lymphomas and carcinomas [5,23]. However, the underlying mechanisms that determine the translocation hot spots are largely unknown. Genomic breakpoints tend to colocalize with open chromatin regions, such as DNase I hypersensitive (HS) sites. Indeed, comparison between active marks and somatic breakpoints revealed a predominant co-localization in T cell cancers (62%) that was less pronounced (26%) in non-T cell cancers [9••]. This may partially explain why certain subtypes of cancers are associated with specific genomic breakages. In addition to providing functional information for the methylation of histone lysine or arginine residues, these data sets constitute a reference epigenome for characterizing epigenomic defects in T cell-related diseases.
Another critical histone modification is acetylation. Genome-wide studies in yeast have revealed that both hyper- and hypoacetylation of lysine residues are associated with transcription [16,24–27]. In addition, H4K12 acetylation has been shown to have a role in transcriptional silencing [25,28]. Previous functional studies of histone acetylation in human and other mammalian organisms mainly focused on the diacetylation of H3 lysines 9 and 14 (H3K9ac/14ac) as well as the hyperacetylation of H4. The first genome-wide acetylation mapping in human T cells found that H3K9ac/K14ac is highly elevated in promoter regions and its level is correlated with gene expression. Additional studies on chromosomes 21 and 22 of human HepG2 cells and across 1% of the human genome in five cell lines reached similar conclusions.
Our recent profiling of 18 acetylated lysine residues on four core histones in human CD4+ T cells using ChIP-Seq revealed that all were correlated with active transcription [22••]. In addition, three gene-centric patterns of histone acetylation can be described: (1) acetylation highly elevated at the transcription start sites (TSSs) but low at the transcribed region as exemplified by H3K9ac; (2) acetylation elevated at the TSSs and gradually decreasing across the transcribed region (H3K23ac); (3) acetylation elevated both at the TSSs and across the transcribed region (H4K12ac). The acetylated residues of the first pattern are probably involved in transcriptional initiation, whereas the residues with the last two patterns are probably involved in transcriptional elongation. These observations are consistent with biochemical data on histone acetyltransferases (HATs). For example, H3K14ac (pattern 2) and H4K8ac (pattern 3) are affected by deletion of PCAF/GCN5, but not by deletion of CBP/p300. It is known that PCAF associates with the elongating form of RNA Pol II whereas p300 interacts with the initiation form of Pol II.
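To make the three gene-centric patterns concrete, the following Python sketch assigns a single acetylation profile to one of them. It is an illustration only, not the procedure used in the cited study: the function name, the input format (a mean tag density at the TSS plus densities in consecutive gene-body bins) and the two-fold-over-background threshold are all assumptions introduced here.

```python
def classify_acetylation_pattern(tss_signal, body_bins, background, fold=2.0):
    """Assign a gene-centric acetylation profile to pattern 1, 2 or 3.

    tss_signal : mean tag density around the TSS
    body_bins  : mean tag densities in consecutive gene-body bins, 5' to 3'
    background : genome-wide mean tag density
    fold       : fold over background counted as "elevated" (assumed value)
    """
    threshold = fold * background
    if tss_signal < threshold:
        return "not elevated at TSS"
    elevated = [signal >= threshold for signal in body_bins]
    if not any(elevated):
        # High at the TSS, low across the transcribed region.
        return "pattern 1"
    if all(elevated):
        # High at the TSS and across the whole transcribed region.
        return "pattern 3"
    # Elevated near the 5' end and never rising again toward the 3' end.
    decreasing = all(a >= b for a, b in zip(body_bins, body_bins[1:]))
    if elevated[0] and decreasing:
        return "pattern 2"
    return "unclassified"


# Toy profiles (hypothetical numbers, background density of 1.0).
print(classify_acetylation_pattern(10.0, [1.0, 1.0, 1.0], 1.0))
print(classify_acetylation_pattern(10.0, [8.0, 4.0, 1.0], 1.0))
print(classify_acetylation_pattern(10.0, [9.0, 8.0, 9.0], 1.0))
```

In practice the bins would be averaged over many genes before classification, and the threshold would need to be calibrated against the sequencing depth of each library.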
The mechanisms by which histone modifications modulate gene expression remain unresolved. The histone code hypothesis suggests that different histone modifications on the same tail or different tails can act sequentially or form a combinatorial pattern to specify a biological event [33,34]. These codes are interpreted by protein factors with specific binding domains such as bromodomains for acetylation marks or chromodomains for methylation marks to execute a biological process [33,34]. However, different models including charge neutralization or the signaling pathway model have been proposed to explain the effects of histone acetylation. An extensive study using mass spectrometry revealed more than 150 combinatorial patterns within amino acids 1–50 of histone H3.2 in HeLa cells. An apparent lack of hierarchy among these modification patterns prompted the authors to suggest a model in which corresponding enzymes that methylate or demethylate act autonomously without much cross-talk. Until recently, however, the lack of high-resolution, genome-scale maps of histone modifications made it difficult to determine whether gene-specific combinatorial patterns of histone modifications exist.
By characterizing the genome-wide distribution of 38 histone methylation and acetylation marks at mono-nucleosome resolution in human resting CD4+ T cells, we were able to analyze combinatorial patterns of histone modification across the human genome and found numerous distinct combinatorial patterns [22••]. Further analysis of the patterns revealed that: (1) the H3K27me3 modification appears to be dominant because all patterns containing this modification tend to be repressive; (2) the H3K4me3 modification alone is not sufficient to support active transcription because the genes associated with H3K4me3 alone tend to be silent; (3) the histone modification pattern alone does not determine the expression level; genes associated with many patterns show an extremely broad range of expression from silent to active. A striking discovery from this analysis was the identification of a common pattern (a “backbone” consisting of 17 modifications) associated with 3,286 promoters [22••]. The number of promoters associated with this pattern decreases by over 90% if any one of the modifications is removed from the pattern.
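The idea of a combinatorial pattern can be sketched in a few lines of Python. The example below treats each promoter as the set of modifications detected on it, counts how many promoters share each exact combination, and lists the promoters that carry every member of a chosen "backbone". It is a toy illustration under stated assumptions: the gene names, the mark assignments and the two-mark backbone are hypothetical and do not reproduce the 17-modification backbone or the analysis of the cited study.

```python
from collections import Counter


def count_patterns(promoter_marks):
    """Count promoters per exact combination of modifications."""
    return Counter(frozenset(marks) for marks in promoter_marks.values())


def promoters_with_backbone(promoter_marks, backbone):
    """Return promoters whose marks include every backbone modification."""
    required = frozenset(backbone)
    return sorted(
        promoter
        for promoter, marks in promoter_marks.items()
        if required <= frozenset(marks)
    )


# Hypothetical promoters and marks, for illustration only.
toy_promoters = {
    "geneA": {"H3K4me3", "H3K9ac"},
    "geneB": {"H3K4me3", "H3K9ac", "H4K12ac"},
    "geneC": {"H3K4me3"},
    "geneD": {"H3K4me3", "H3K27me3"},
    "geneE": {"H3K4me3", "H3K9ac"},
}

patterns = count_patterns(toy_promoters)
print(len(patterns))  # number of distinct combinations in the toy data
print(promoters_with_backbone(toy_promoters, {"H3K4me3", "H3K9ac"}))
```

Representing each pattern as a `frozenset` makes it usable as a dictionary key, so exact combinations can be tallied directly, while the subset test (`<=`) captures promoters that carry the backbone together with any additional modifications.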
It is not clear why histone tails are subject to so many modifications simultaneously. One possible explanation is that these modifications may function together to provide a robust mechanism of maintaining a chromatin conformation compatible with gene activation by acting sequentially, cooperatively and/or redundantly. It appears that the addition of active modifications to this backbone is cumulative, with genes displaying higher levels of transcription being associated with a larger number of active modifications, including both acetylation and methylation. For further refinement in our understanding of the function of histone modifications, it is clearly important to analyze other modifications such as phosphorylation, sumoylation and ubiquitination in these cells.
Embryonic stem (ES) cells serve as an excellent system to study the contribution of histone modifications and their corresponding enzymes to differentiation. For example, a systematic RNAi screen revealed that Tip60, a subunit of a HAT complex, is required to maintain ES cell identity. Another approach, involving the profiling of H3K27me3, revealed that it is associated with critical differentiation genes in both human and mouse ES cells [40,41], and is suggested to repress the expression of these genes before differentiation. Indeed, when murine ES cells are differentiated to terminal neuronal cells, many neuron-specific developmental genes that are targets of Polycomb group proteins at the progenitor stage lose H3K27me3 and become fully activated after terminal differentiation.
One of the most interesting observations in ES cells is the co-existence of H3K4me3 and H3K27me3, termed “bivalent domains”, detected at many critical developmental genes [10••,43, 44•, 45, 46]. Although proposed to play a key role for the pluripotency of ES cells, these domains also exist in other cell types, including hematopoietic stem cells (HSCs) [47••], neuronal stem cells, human CD4+ T cells [20•] and various T helper cells.
Our recent profiling of histone methylation and H2A.Z in human CD133+ HSCs and the cells differentiated to erythrocyte precursors revealed 2,910 bivalent promoters in the HSCs [47••]. Among these bivalent promoters, the majority (53%) lose H3K4me3 while only a small fraction (19%) lose H3K27me3 after differentiation. This observation suggests that most of the bivalent genes lose their activation potential by losing H3K4me3 during differentiation. However, it is not clear how these modifications are resolved during differentiation. Examination of other modifications revealed that the bivalent promoters that lose H3K27me3 after differentiation are already associated with increased levels of H2A.Z and several monomethylation marks including H3K4me1, H3K9me1 and H4K20me1 as well as Pol II in the HSC stage [47••]. These data suggest that the fate of bivalent genes and their expression potential during differentiation are linked to the chromatin modifications in the stem cell stage. Our recent mapping of H3K4me3 and H3K27me3 in various T helper cells also indicated that the bivalent modification of critical transcription factor genes provides a mechanism of reprogramming their expression states and underlies the plasticity of differentiated T cells. Collectively, the data from the various systems discussed above suggest that changes in gene expression during differentiation are associated with epigenetic changes. Thus, epigenetic changes may serve to facilitate either activation or repression and provide a robust mechanism for stabilizing the changed gene expression state.
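The bookkeeping behind such a comparison can be sketched as follows. The Python example below calls a promoter bivalent when both H3K4me3 and H3K27me3 are present, and then records which of the two marks is lost after differentiation. It is a minimal illustration under assumptions made here: the gene names and mark calls are hypothetical, and real data would first require peak calling to decide whether a mark is present at a promoter.

```python
from collections import Counter

ACTIVE_MARK = "H3K4me3"
REPRESSIVE_MARK = "H3K27me3"


def is_bivalent(marks):
    """A promoter is called bivalent when it carries both marks."""
    return ACTIVE_MARK in marks and REPRESSIVE_MARK in marks


def bivalent_fates(before, after):
    """Map each promoter that is bivalent in `before` to its fate in `after`."""
    fates = {}
    for promoter, marks in before.items():
        if not is_bivalent(marks):
            continue
        later = after.get(promoter, set())
        keeps_active = ACTIVE_MARK in later
        keeps_repressive = REPRESSIVE_MARK in later
        if keeps_active and keeps_repressive:
            fates[promoter] = "remains bivalent"
        elif keeps_repressive:
            fates[promoter] = "lost H3K4me3"
        elif keeps_active:
            fates[promoter] = "lost H3K27me3"
        else:
            fates[promoter] = "lost both"
    return fates


# Hypothetical mark calls in stem cells and in differentiated cells.
stem = {
    "gene1": {"H3K4me3", "H3K27me3"},
    "gene2": {"H3K4me3", "H3K27me3", "H3K4me1"},
    "gene3": {"H3K4me3"},
    "gene4": {"H3K4me3", "H3K27me3"},
}
differentiated = {
    "gene1": {"H3K27me3"},
    "gene2": {"H3K4me3", "H3K4me1"},
    "gene3": {"H3K4me3"},
    "gene4": {"H3K4me3", "H3K27me3"},
}

fates = bivalent_fates(stem, differentiated)
print(fates)
print(Counter(fates.values()))
```

Tallying the fates with `Counter` gives the kind of summary reported above, namely the fraction of bivalent promoters that lose one mark or the other.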
The mechanisms by which the apparently opposing modifications H3K4me3 and H3K27me3 come to be detected at the same genomic locus remain unresolved. There are five scenarios that could explain the detection of co-occurring H3K4me3 and H3K27me3 in the genome (Figure 1).
Although these scenarios consider only the methylases, demethylases for H3K4me3 and H3K27me3 may also be involved in the generation of the bivalent modifications. Each of these scenarios could result from a dynamic equilibrium of two opposing modifications that is regulated by the corresponding enzymes. These processes provide an opportunity to generate either an “active” or a “repressive” chromatin environment by shifting the equilibrium of the modifying enzymes via different mechanisms such as regulating the protein level or location, or altering the level of an interacting protein as we proposed previously [20•].
Although both active and inactive promoters can be associated with H3K4me3 [9••,49••] and histone acetylation [22••], a set of distinct epigenetic features can distinguish active from inactive promoters [9••]. Actively transcribed genes have elevated levels of H2BK5me1, H3K9me1, H3K27me1, H4K20me1, H3K36me3 and H3K79me1,2,3 as well as the acetylation of H4K5/8/12/16 in their gene body regions (Figure 2).
Although transcription start sites (TSSs) can be predicted with some success, it is often difficult to predict enhancers solely based on sequence information [50,51]. Our previous data have suggested that histone acetylation islands outside of promoter regions may indicate the presence of functional enhancers and can be used to predict enhancers with both conserved and non-conserved sequences. A recent study by Heintzman et al. using a ChIP-chip assay suggested that the epigenetic signature for enhancers is the presence of H3K4me1 but not H3K4me3 [53•]. However, using the ChIP-Seq technique, our lab has found that enhancers can also be associated with H3K4me3 and H2A.Z in addition to H3K4me1 [9••]. The discrepancy between our study and the Heintzman et al. study could be attributed to the fact that different criteria were used to identify enhancers. Heintzman et al. used p300 binding sites to identify enhancers while our work has used DNase HS sites to identify putative enhancers. Since DNase HS sites are detected at both promoters and enhancers [54,55], we restricted our analysis to only those in the intergenic regions [22••]. Our genome-wide survey of 39 histone modifications on 4,179 potential enhancers has revealed that five histone modifications (H3K4me1, H3K4me2, H3K4me3, H3K9me1 and H3K18ac) and the histone variant H2A.Z are detected at more than 20% of enhancers [22••]. However, because enhancers and inactive promoters can be associated with a similar set of histone modifications including all three states of H3K4 methylation and histone variant H2A.Z [22••], it is difficult to distinguish between them solely based on histone modification patterns. Furthermore, there may be un-annotated TSSs of non-coding RNAs or other unknown transcriptional units, which may be modified in a way similar to the promoters of protein coding genes.
Identification of all transcriptional units, particularly their TSSs, in various tissues using newly developed techniques such as RNA-Seq may help to clarify this issue. Therefore, a comprehensive strategy combining chromatin modification patterns, nuclease hypersensitivity, and gene annotation is necessary to more accurately predict functional regulatory elements of transcription. Additionally, the development of high-throughput techniques for assaying enhancer activities will also be necessary for genome-wide enhancer identification and confirmation.
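Combining nuclease hypersensitivity with gene annotation and modification calls, as discussed above, can be illustrated with a short Python sketch. It keeps only the DNase HS sites that overlap no annotated gene and then reports which modifications are detected at more than a chosen fraction of the remaining sites. The coordinates, the half-open interval convention, the function names and the toy mark calls are assumptions made for illustration; this is not the pipeline of the cited studies, which may, for example, also exclude promoter-proximal regions.

```python
from collections import Counter


def overlaps(a_start, a_end, b_start, b_end):
    """Half-open intervals [start, end) overlap if each starts before the other ends."""
    return a_start < b_end and b_start < a_end


def intergenic_sites(hs_sites, genes):
    """Keep HS sites (chrom, start, end) that overlap no gene on the same chromosome."""
    kept = []
    for chrom, start, end in hs_sites:
        in_gene = any(
            chrom == g_chrom and overlaps(start, end, g_start, g_end)
            for g_chrom, g_start, g_end in genes
        )
        if not in_gene:
            kept.append((chrom, start, end))
    return kept


def marks_above_fraction(site_marks, min_fraction=0.2):
    """Return the marks detected at more than `min_fraction` of the sites."""
    if not site_marks:
        return []
    counts = Counter(mark for marks in site_marks.values() for mark in set(marks))
    total = len(site_marks)
    return sorted(mark for mark, n in counts.items() if n / total > min_fraction)


# Hypothetical coordinates and mark calls, for illustration only.
hs_sites = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
genes = [("chr1", 150, 400)]
print(intergenic_sites(hs_sites, genes))

site_marks = {
    "site1": {"H3K4me1", "H2A.Z"},
    "site2": {"H3K4me1"},
    "site3": {"H3K4me1", "H3K27me3"},
    "site4": {"H2A.Z"},
    "site5": set(),
}
print(marks_above_fraction(site_marks))
```

The linear scan over genes is adequate for a toy example; a genome-wide analysis would normally use sorted intervals or an interval tree to keep the overlap test efficient.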
As discussed in this review, we have witnessed remarkable progress in our understanding of human epigenomes. The emerging picture is that many histone modification patterns, including both acetylation and methylation, are associated with promoters and enhancers and many others are enriched in actively transcribed regions. Many interesting questions have arisen based on the new knowledge. We still do not know what roles these histone modification patterns are playing in transcriptional regulation, how these patterns are established during development or how they are maintained from one generation to the next. The histone modification pattern observed at any time during development results from balanced activities of HATs/HDACs and HMTs/HDMs. Therefore, the systematic mapping of locations of these enzymes will shed light on the mechanisms of pattern establishment. However, histone modifying enzymes do not necessarily recognize specific DNA sequences and they can likely be recruited by transcription factors or even RNA polymerases. To fully elucidate these networks, it will be necessary to map the binding sites of transcription factors active during development. In combination with other strategies, these genomic mapping experiments will provide critical information to clarify many interesting questions.
Next-generation sequencing based “Seq” assays, such as ChIP-Seq, MeDIP-Seq, DNase-Seq and RNA-Seq, are rapidly expanding our knowledge of human genome function and epigenomics. In combination with other computational, genetic and biochemical strategies, we expect these techniques will lead to major advances in the understanding of human epigenomes and their contribution to normal development and disease conditions.
The work in the Zhao lab was supported by the Intramural Research Program of the National Heart, Lung, and Blood Institute, NIH.
Papers of particular interest, published within the period of review, have been highlighted as:
• of special interest
•• of outstanding interest