In metazoans, transcription is regulated by promoters and additional elements, which may be located far from their target gene(s). Moreover, genes (including those encoding cytokines and cytokine receptors) are commonly clustered in the genome, providing the opportunity for the shared, competitive or sequential use of regulatory elements. New techniques, discussed here, are generating an avalanche of high-resolution genome-wide data through which candidate regulatory elements have been identified in specific cell types (including T cells), their functions inferred, and their physical interactions in 3-dimensional space demonstrated. As a result, a nearly comprehensive list of regulatory elements in the Th2 cytokine locus, a growing list of elements in the interferon-γ gene locus, and maps of their 3-dimensional interactions are now available, though much remains to be learned about the molecular mechanisms at play, the dynamics of these interactions and their functional importance.
Effector T lymphocytes are characterized by the cytokines they produce and the effector functions they exert. Th1 cells produce IFN-γ as their signature cytokine and are important for host defense against intracellular pathogens; Th2 cells produce the cytokines IL-4, IL-5 and IL-13 and protect against helminths; and cells of the more recently identified Th17 lineage produce IL-17a, IL-17f, IL-21, and IL-22 and provide immunity against extracellular bacterial and fungal pathogens for which Th1 or Th2 cells are not sufficiently protective. In contrast to these protective functions, Th2 cells orchestrate allergic responses, while Th17 and Th1 cells are involved in autoimmune diseases.
Th1 development is specified by the cytokines IL-12 and IFN-γ, which signal through the transcription factors (TFs) STAT4 and STAT1 and induce the Th1 ‘master regulator’ TF T-bet, whereas IL-4 instructs Th2 commitment, signaling via STAT6 and inducing the Th2 ‘master regulator’ TF GATA-3. In the past two years, Runx3 and Runx1 have been shown to collaborate with T-bet to augment IFN-γ expression, silence IL-4 expression and reinforce Th1 lineage commitment (1, 2). At the same time, work from several groups has shown that IL-6 and TGF-β initiate Th17 differentiation via STAT3, the orphan nuclear receptor RORγ(t) and its paralog RORα (3), inducing these cells to produce IL-21, which acts in an obligatory and autocrine manner to reinforce the Th17 fate, while IL-23 produced by antigen presenting cells sustains expression of these TFs and Th17 survival and cytokine secretion thereafter.
But where and how do these TFs act? TFs control cytokine expression in part by binding to and recruiting RNA polymerase II-containing complexes to gene promoters, but promoters alone are insufficient for proper lineage-specific cytokine expression. Additional and often quite distal regulatory elements are required for proper expression, as Agarwal and Rao discussed in their review of cytokine gene expression, which appeared in this journal ten years ago (4). Work done subsequently has sought to identify the complete set of regulatory elements in cytokine gene loci and to assess their contribution to cytokine gene expression in vitro and in vivo. The current review discusses this work, focusing in particular on the last several years. During this time, technological innovations have permitted distal regulatory elements to be mapped more comprehensively and more globally, yielding an increasingly complete view of the true dimensions of cytokine gene loci, the numbers of regulatory elements within them, and the dynamic evolution of their epigenetic marks and physical and functional relationships in response to environmental cues and T cell differentiation.
Eukaryotic promoters are located directly upstream of the transcribed region, providing a site for recruitment of RNA polymerases and subsequent initiation of transcription. Promoter-driven transcription is influenced by cis-regulatory elements – sequences located on the same chromosome anywhere from a few kb to tens or hundreds of kb from the gene(s) they help to regulate. Cis-regulatory elements that modulate gene expression include enhancers, silencers, locus control regions (LCRs), insulator/boundary elements and matrix attachment regions (MARs). Enhancers greatly increase the basal level of gene transcription regardless of their orientation and location upstream, downstream or even within introns. In contrast, silencers repress expression, either actively or in a ‘hit-and-run’ fashion by inducing the formation of heritable alterations to chromatin (5). LCRs contain both enhancer and insulator activity and are defined as cis-regulatory elements conferring tissue-specific, copy-number-dependent gene expression (6). Insulators/boundary elements may demarcate genomic loci and establish or maintain the organization of chromatin into domains that allow genes to be regulated independently of the influence of neighboring loci (7). Some insulators function as barriers - protecting a genomic locus from silencing by surrounding heterochromatin - while others block the activity of neighboring enhancers. MARs are thought of as sequences through which chromatin loops are tethered - perhaps to the nuclear matrix (8).
Gene expression is governed by these regulatory elements through the binding of TFs and other regulatory proteins, with the probability of binding determined by factor abundance and ability to access target regulatory elements in chromatin. Chromatin accessibility at regulatory elements is influenced by nucleosome composition, position and interactions with DNA, post-translational histone modifications, and methylation of cytosines within CpG dinucleotides. Such modifications also serve as landmarks for the presence and activity of regulatory elements in a given cell type (9-10,11•) and techniques for their detection are essential tools for finding candidate regulatory elements. Regulatory elements often show evolutionary conservation, which means that they may be predicted by computational searches for evolutionarily conserved non-coding sequences (CNSs) (Box 1).
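The computational search for conserved non-coding sequences mentioned above can be illustrated with a minimal sketch. It assumes a pre-computed pairwise alignment of two orthologous regions (two equal-length strings with `-` for gaps) and flags windows whose identity meets a threshold. The function names, the 100 bp window and the 70% identity cutoff are illustrative assumptions rather than parameters taken from the studies cited here, and exons would need to be masked beforehand for the result to be restricted to non-coding sequence.

```python
def window_identity(a, b):
    """Fraction of alignment columns that are identical, non-gap matches."""
    matches = sum(1 for x, y in zip(a, b) if x == y and x != "-")
    return matches / len(a)


def conserved_windows(aln_a, aln_b, window=100, min_identity=0.70):
    """Return merged (start, end) alignment intervals whose sliding-window
    identity is at least min_identity. Coordinates are 0-based, end-exclusive.
    The default window and threshold are assumptions for illustration."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    regions = []
    for start in range(0, len(aln_a) - window + 1):
        end = start + window
        if window_identity(aln_a[start:end], aln_b[start:end]) >= min_identity:
            if regions and start <= regions[-1][1]:
                # Overlaps or abuts the previous hit: extend it.
                regions[-1][1] = end
            else:
                regions.append([start, end])
    return [tuple(r) for r in regions]


# Toy alignment (made-up sequences): the first 8 columns are identical.
print(conserved_windows("ACGTACGTTTTT", "ACGTACGTAAAA", window=4, min_identity=1.0))
```

Candidate CNSs found this way are only predictions; as the text notes, epigenetic marks and functional assays are needed to establish whether they act as regulatory elements in a given cell type.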
This box describes complementary experimental approaches by which to detect candidate gene regulatory elements. By comparing findings in cell types that do or do not express the gene(s) of interest (e.g., Th1 vs. Th2 cells) or that do or do not have the potential to do so (e.g., naïve CD4 T cells vs. fibroblasts) element detection is enhanced and, in some cases, the function of elements can be imputed (Figure 1), thus helping to prioritize and to inform the design of experiments to test function directly.
ChIP-chip and ChIP-seq – This assay detects modified histones, transcription factors and other regulatory proteins by immunoprecipitation followed by analysis of the associated DNA; it provides information regarding the location of regulatory elements, genes/gene loci, locus boundaries and their activity in that cell type.
Immunoprecipitation of methylated DNA – This assay assesses DNA cytosine methylation. DNA methylation of regulatory elements, particularly at promoters and the proximal transcribed regions, typically inhibits transcription. Cytosine methylation is assessed as for ChIP-chip or ChIP-seq using an antibody specific for methylated cytosine. The approach allows genome-wide, semi-quantitative assessment of DNA methylation, but it may underestimate or fail to detect CpG methylation in regions in which the density of CpGs is low, and its resolution is lower than that of sequencing of bisulfite-modified DNA (58).
DNase I hypersensitive (HS) site mapping – HS sites are regions where the density of nucleosomes is reduced or the association of DNA with nucleosomes is otherwise altered so that the DNA is more sensitive to digestion with DNase I (and/or other nucleases) than surrounding regions. HS sites are present at all or nearly all gene regulatory elements, including promoters, enhancers, silencers, boundary elements, and locus control regions, that are active or poised for activity in the cell type evaluated.
Chromatin conformation capture (3C-5C) – These assays assess the physical proximity between DNA sequences as they occur in the nucleus. As noted in the text, distal regulatory elements are often approximated to the genes they regulate and active or repressed genes are often approximated to each other at transcription factories or in heterochromatin, respectively.
Fluorescence in situ hybridization (FISH) – This assay assesses the sub-nuclear location of genes/loci/chromosome territories and the physical proximity of two or more genomic regions to each other. The information is complementary to that obtained by 3-5C; resolution of sequences separated by as little as 90 kb of linear DNA can be achieved (62).
3D-FISH – FISH is performed on cells processed in a manner that preserves the 3-dimensional structure of the nucleus, with or without 3-D reconstruction of images obtained by confocal fluorescence microscopy.
Cryo-FISH – A variation of 3D-FISH in which 100-200 nm cryosections are made of paraformaldehyde-fixed cells, which improves resolution by removing out-of-focus light that would otherwise be reflected by objects outside the section in the z axis (63).
Immuno-FISH – Immunofluorescence and FISH are done together to detect protein co-localization with specific genes/loci/chromosomes.
Nucleosomes are typically displaced or altered in conformation at functional regulatory elements, thereby rendering the DNA at these sites hypersensitive to digestion by DNase I. Thus, DNase hypersensitive (HS) sites denote the presence of regulatory elements functional in the cell type studied (12). Patterns of histone modifications, DNA methylation and binding of specific transcription and regulatory factors, when correlated to gene expression patterns in specific cell types, provide additional information by which the function of specific elements may be inferred. Technological advances have markedly accelerated element discovery, by providing high-resolution, genome-wide profiles of epigenetic marks in embryonic stem cells and certain differentiated cell types, including primary human CD4 T cells (12, 13••, 14-15, 16•, 17-18, 19•).
Chromatin immunoprecipitation (ChIP) paired with genome tiling arrays (ChIP-chip) or high-throughput sequencing (ChIP-seq) has revealed that in active or poised genes histone H3 mono- or di-methylated on lysine 4 (H3-K4me1 or H3-K4me2) marks the transcriptionally permissive chromatin of distal regulatory elements, transcribed regions and promoters (Figure 1); H3-K4me3 is markedly enriched on nucleosomes flanking the transcription start site; H2A.Z (a variant of H2A) is present at promoters and distal regulatory elements but not within transcribed regions; RNA polymerase II (Pol II) and TAF1 are bound to transcription start sites and nucleosomes are displaced from them (20); DNase HS sites, detected using DNase-chip or DNase-seq, are found at the promoters and at intronic and distal regulatory elements of genes with these permissive chromatin marks (11•, 19•); these marks and Pol II binding are most intense at actively transcribed genes, are commonly detected though less intense at primed and poised ‘bivalent’ genes, but are not detected at poised ‘null’ and silenced genes (see below and Figure 1). H3-K4me3 coordinates proper transcription initiation by docking TFIID, CHD1, BPTF/NURF and MLL complexes, which in turn facilitate chromatin remodeling, transcript elongation, splicing and histone acetylation, sustain H3-K4 methylation, and remove repressive H3-K9 and H3-K27 methylation (20-22). H3-K36me3 is present throughout the transcribed region of genes undergoing active transcription. Thus, focal H3-K4me3 followed by a region of H3-K36me3 identifies active promoters and their transcribed regions, with the magnitude of these marks correlating directly with transcription. Conversely, H3-K27me3 is not present in genes marked by H3-K36me3, but rather at silent genes, consistent with its role in Polycomb-mediated repression (23). H3K9me3/2 is an alternative mark associated with the silent heterochromatin associated with repetitive elements and transposons.
However, discrete peaks of H3K9me3/2 are present in some active genes where they are thought, like H3-K36me3, to inhibit inappropriate transcription initiation (24). In contrast to the repressive nature of H3-K27me3/2 and H3K9me3/2, H3K27me1 and H3K9me1 are found at active genes. Promoters of some genes are ‘bivalent’ - marked both by H3-K4me3 and H3-K27me3 (13••, 16•, 25). ‘Bivalent’ marks are most commonly found at CpG-rich promoters of developmentally regulated genes that are inactive but poised for induction or silencing on differentiation, suggesting that lack of expression in this context is an active process. By contrast, CpG-poor promoters of tissue-specific genes that are rapidly and transiently activated in response to environmental stimuli, i.e., immune response genes, often have neither mark in precursors but gain H3-K4me3 or H3-K27me3 as they differentiate into expressing or non-expressing cell types, respectively (13•, 16•, 25). Boundary elements are suggested by the presence of an HS site separating a domain of accessible chromatin from repressive chromatin, where CCCTC-binding factor (CTCF) and cohesin mark and create chromatin domain boundaries (13••, 16•, 19•, 25-26, 27•). However, CTCF and cohesin may also bind within transcribed regions, and at these sites their binding does not necessarily impede transcription and chromatin remodeling, as is the case for a binding site in the first intron of Ifng, for reasons as yet unclear (26•, 27•). Though not defined by such global studies, silencers may have ‘bivalent’ marks that resolve with further differentiation, as shown for the Il4 silencer (28).
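The mark combinations described above can be summarized as a simple decision rule. The sketch below is a deliberate simplification: it treats each mark as present or absent at a gene, whereas real ChIP-seq data are quantitative and require peak calling and thresholds that are not specified here. The function name, the state labels and the choice to call H3-K4me3 without H3-K36me3 ‘primed’ are assumptions made for illustration, not a published classification scheme.

```python
def classify_promoter(marks):
    """Assign a coarse chromatin state from the set of histone marks
    detected at a gene (promoter plus transcribed region).

    Presence/absence calls are assumed to have been made upstream;
    labels are illustrative only.
    """
    k4me3 = "H3K4me3" in marks    # focal at transcription start sites
    k27me3 = "H3K27me3" in marks  # Polycomb-associated repression
    k36me3 = "H3K36me3" in marks  # transcribed region of active genes

    if k4me3 and k27me3:
        return "bivalent (poised)"
    if k4me3 and k36me3:
        return "active"
    if k4me3:
        return "primed"           # assumption: permissive but not elongating
    if k27me3:
        return "silenced"
    return "null (neither mark)"


# Hypothetical mark calls for a few loci in different cell types.
examples = {
    "expressed gene": {"H3K4me3", "H3K36me3"},
    "developmental gene in precursor": {"H3K4me3", "H3K27me3"},
    "cytokine gene in naive precursor": set(),
    "cytokine gene in non-expressing lineage": {"H3K27me3"},
}
for name, marks in examples.items():
    print(f"{name}: {classify_promoter(marks)}")
```

Comparing such calls between cell types that do or do not express a gene is what allows the function of candidate elements to be inferred rather than merely their presence detected.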
In organisms from S. cerevisiae to humans, gene organization is non-random, in that co-regulated genes are frequently located in linear proximity on the same chromosome (29-31). In some cases these clusters arose through gene duplication, which is an important substrate for evolutionary change because one copy is free to acquire coding and/or regulatory mutations that if advantageous will be positively selected and if deleterious will be lost. Examples include the β-globin, Hox, immunoglobulin, T cell receptor and MHC/HLA loci. In other cases, genes related in function but not in sequence are clustered, e.g., imprinted genes, genes in a specific metabolic pathway or whose expression is restricted to a particular cell type. Regardless of origin, evolutionarily conserved gene clusters are likely to be advantageous. Clustering may allow common regulatory elements or particular TFs to be shared, facilitate alternative or serial expression or serial rearrangement of genes through competition for regulatory elements, co-ordinate remodeling of chromatin, position as yet inactive developmental or environmentally-induced genes in accessible chromatin associated with nearby housekeeping genes, or facilitate formation of chromatin loops and their approximation to transcriptional hubs.
Many cytokine and cytokine receptor genes are clustered. The best-characterized of these is the evolutionarily conserved Th2 cytokine locus, containing Il4, Il5, Il13, and two housekeeping genes, Rad50 and Kif3a (Figure 2). This locus contains an array of known cis-regulatory elements, including enhancers (Il13 HS1,2; Il4 HSII/III; CNS1/HSS1,2; CNS2/HSV,VA), a silencer (Il4 HSIV) and an LCR located in Rad50 introns (RHS 4-7) (32). These elements collaborate to assure proper Th2 cytokine expression, which appears to be dependent on their being approximated to each other through intrachromosomal looping (33). In a variety of cell types - even those that do not have the potential to express these cytokines - the Il13 and Il5 promoters are approximated to the Il4 promoter, creating a ‘pre-poised’ conformation. In naïve, Th1 and Th2 CD4 T cells and NK cells but not in B cells and fibroblasts these promoters are drawn more closely together and approximated to CNS1 and CNS2, and the Th2 LCR is approximated to the Il4 and Il13 promoters, creating a ‘poised’ conformation. This conformation is dependent on STAT6 and the LCR and can be induced in fibroblasts by enforced expression of GATA3 in concert with Ca++-signaling. As there are few changes in conformation unique to Th2 cells, what allows for Th2 lineage-specific transcription of this locus? One possible answer is the restricted availability of STAT6 and GATA3. A second candidate is SATB1, which organizes the genome in a cell-specific manner by anchoring DNA to the nuclear matrix (8). In Th2 cells, activation upregulates SATB1, which then binds to CNS1, CNS2 and 9 other sites extending from Il5 past Kif3a, driving the formation of additional, smaller loops (34•). Knockdown of SATB1 prevents formation of this ‘active’ locus conformation and markedly impairs Il4, Il5, and Il13 expression.
These findings demonstrate an important role for SATB1-induced loop formation in Th2 cytokine expression and suggest that a Th2 chromatin domain extends to include Kif3a. Genome-wide analysis of human CD4 T cells has shown that CTCF binds between IL5 and its neighbor IRF1 and between KIF3a and its neighbor SEPT8 (13••), perhaps helping to segregate a Th2 domain from surrounding regions; consistent with this notion, DNase HS sites that differ in presence or magnitude between human Th1 and Th2 cells are evident throughout this region (authors' unpublished observations). The genes encoding IL-3 and GM-CSF are ~500 kb downstream of Il5 and can be co-expressed by Th2 cells and eosinophils, raising the possibility of even more extended intrachromosomal interactions.
Although the Ifng locus is less well characterized than the Th2 locus, recent studies have identified multiple cis-regulatory elements within ~110 kb surrounding Ifng (35-37, 38•, 39), including enhancers and boundary elements, many of which bind the Th1-TFs T-bet and STAT4 (Figure 2). Of these, IfngCNS-22 is an enhancer marked by permissive histone modifications in naïve, Th1 and Th2 CD4 T cells, suggesting that it may poise this gene for induction in the early stages of Th1 differentiation (36, 38•). By contrast, IfngCNS+46 has insulator activity in vitro (38•), which might limit effects of potentially repressive downstream regions in Th1 cells and/or orchestrate spreading of repressive H3-K27me3 modifications in Th2 cells. The IfngCNS-6 (previously referred to as CNS1) enhancer is approximated to the Ifng promoter in naïve, Th1 and Th2 cells (33), although whether this reflects approximation through looping or merely linear proximity on the chromosome is unclear, while IfngCNS+18/20 (CNS2) is approximated to the Ifng promoter only in Th1 cells, presumably through looping (33). Additional loops approximate other distal elements to the Ifng promoter; some of these are common, perhaps representing a basal ‘poised conformation’, whereas others differ between Th1 and Th2 cells (authors' unpublished observations). The factors driving looping in the Th1 locus are unknown, but may include T-bet and STAT4. SATB1 is expressed by human naïve, Th1 and Th2 cells, but is most highly expressed by Th2 cells (40), suggesting that other architectural proteins may be at work in Ifng locus organization.
Unlike the Th2 locus in which the three cytokine genes are co-expressed and likely arose through gene duplication, the nearest neighbors of Ifng are two members (IL26 and IL22) of the IL-10 family (38•), which are expressed predominantly by Th17 rather than Th1 cells (41-46); a housekeeping gene, MDM1, lies further upstream (Figure 2). These genomic relationships are conserved from bony fish to humans, with the exception of rodents, in which Il26 has been lost; in some mouse strains, the position occupied by IL26 in humans instead contains an inverted and non-expressed duplication of Il22 (Iltifb) (38•). Although Ifng and Il22 (and IL26 in humans) are typically expressed by different T effector subsets, they can be co-expressed by T and NK cells in humans and mice (43-45). Yet unanswered is whether TFs critical for driving expression of Ifng or Il22 shut off one another through active repression or by altering chromatin conformation, or whether chromatin loops promote competition between these genes for shared regulatory elements in some contexts while facilitating their coordinate expression in others.
There are a number of other evolutionarily conserved cytokine and cytokine receptor gene clusters. One cluster contains the other four members of the IL-10 family – IL10, IL19, IL20, and IL24. Originally considered to be a Th2 cytokine, IL-10 can be produced by all types of CD4 effectors and by CD8 T cells, macrophages, and B, mast and dendritic cells (47, 48). By contrast, IL-19, IL-20 and IL-24 are produced primarily by macrophages (49, 50). A number of DNase HS sites have been identified within 30 kb of murine Il10, some of which can act as enhancers (50), but their role in the regulation of Il10, whether any act on neighboring genes, and the 3-dimensional architecture of this locus are unknown. In human CD4 T cells, the IL10 gene is bracketed by strong CTCF binding sites (13••), perhaps allowing it to be regulated independently of neighboring family members. Another example is the locus containing IL12rb2 and IL23r. IL-12 is a cytokine critical for Th1 polarization, while IL-23 has been shown to augment the Th17 response. Receptors for these cytokines share a common subunit IL-12Rβ1, thus responsiveness of a T cell to one or the other is determined by which gene is transcribed from the IL23r/Il12rb2 locus, raising the possibility that their clustering facilitates alternative expression through competitive utilization of a nearby regulatory element.
While intrachromosomal interactions appear to be common, genome-wide 4C analyses (Box 1) show that interactions between regions on different chromosomes also occur (51•, 52•, 53). In principle, looping that approximates regions from different chromosomes could allow the transcription of a gene on one chromosome to be regulated in trans by regulatory elements from another. However, such trans-regulation has been tested for only a few interchromosomal interactions, and the results do not always suggest that interactions denote function. For example, each olfactory neuron expresses only one of many olfactory receptor genes scattered across the genome, and expression correlates strongly with spatial approximation of the expressed gene to the H enhancer on chromosome 14, suggesting that approximation permits trans-regulation by this enhancer (54). However, disruption of the H enhancer only affected expression of nearby olfactory receptor genes on chromosome 14 (55). Better evidence for trans-regulation was obtained from studies in naïve CD4 T cells. The Ifng promoter on murine chromosome 10 was reported to interact in trans with the Il5 promoter, Rad50 promoter and Th2 LCR RHS6 on murine chromosome 11 (56). Mutation of RHS7 within the Th2 LCR (Figure 2) not only abrogated these interactions, but also led to delayed Ifng and decreased Il5 expression when naïve T cells were differentiated under Th1 and Th2 conditions, respectively (56). These results suggest that proximity in naïve T cells helps to poise both loci. When naïve T cells adopt the Th2 fate, intrachromosomal interactions within the Th2 and Th1 loci increase and interchromosomal interactions between these loci appear to be abolished, leading to the proposal that this switch contributes to lineage specification.
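The distinction between cis (intrachromosomal) and trans (interchromosomal) contacts drawn above can be made concrete with a small sketch of how 4C-style output for a single bait region might be tallied. The input format, the function name and all numbers below are hypothetical; real analyses also require normalization for restriction fragment density and for the strong distance-dependent bias toward nearby cis contacts, neither of which is attempted here.

```python
from collections import defaultdict


def summarize_contacts(bait_chrom, contacts):
    """Tally read counts for contacts on the bait's own chromosome (cis)
    versus other chromosomes (trans).

    contacts: iterable of (chromosome, position, read_count) tuples.
    Returns (totals_by_kind, totals_by_chromosome).
    """
    totals = {"cis": 0, "trans": 0}
    per_chrom = defaultdict(int)
    for chrom, _position, count in contacts:
        kind = "cis" if chrom == bait_chrom else "trans"
        totals[kind] += count
        per_chrom[chrom] += count
    return totals, dict(per_chrom)


# Made-up contact list for a bait on chr10; values are not experimental data.
contacts = [
    ("chr10", 100, 30),
    ("chr11", 500, 5),
    ("chr11", 900, 5),
    ("chr2", 1, 2),
]
totals, per_chrom = summarize_contacts("chr10", contacts)
print(totals)
print(per_chrom)
```

A tally of this kind only describes proximity; as the olfactory receptor example shows, whether a trans contact is functional must be tested by perturbing the element involved.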
The availability of complete genome sequences and new experimental tools has allowed candidate gene regulatory elements to be identified and their function to be inferred on a genome-wide scale and at high resolution. Similarly, chromatin conformation capture assays (3C-5C) have been scaled to map all interactions within a defined region or the interactions of one region with the rest of the genome, while advances in microscopy have refined the resolution of FISH, permitting interactions identified by 3-5C to be independently validated (Box 1). These innovations can be expected to provide us with a comprehensive knowledge of candidate gene regulatory elements and their interactions in T cell subsets and other cell types of interest in the near future. A challenge thereafter will be to extend these findings, which will be derived from bulk populations, to understand the dynamics of TF interactions with regulatory elements, chromatin modifications and chromatin looping in individual cells. Studies in the Th2 locus have provided evidence that TFs we consider Th2 lineage-specific (STAT6 and GATA-3) bring together distal regulatory elements to facilitate Th2 gene expression, but what role do the Th1-specific TFs T-bet and STAT4 play in the interactions of critical cis-regulatory elements with the Ifng promoter? Similarly, trans-interactions between the Th2 and Ifng loci have been reported in naïve T cells, but confirmation of these findings and elucidation of the mechanisms and regulatory proteins involved are yet needed. And whether looping-induced physical interactions between regulatory elements are the driving force behind changes in gene expression or are more often merely coincidental due to the spatial constraints of the nucleus remains to be seen.
Work in the authors' laboratories was supported by the National Institutes of Health (T32-AI07411, R01-AI071272, N01-AI40069, R01-HD18184 - ER and CBW) and the Medical Research Council, UK (MM).