Eukaryotic genomes are packaged into chromatin, where diverse histone modifications can demarcate chromatin domains that are amenable to or block gene expression. While silent chromatin has been associated with long noncoding RNAs (lncRNAs) for some time, new studies suggest that noncoding RNAs also modulate the active chromatin state. Divergent, antisense, and enhancer-like intergenic noncoding RNAs can either activate or repress gene expression by altering histone H3 lysine 4 methylation. An emerging class of enhancer-like lncRNAs may link chromosome structure to chromatin state and establish active chromatin domains. The confluence of several new technologies promises to rapidly expand this fascinating topic of investigation.
Diverse sets of cellular mechanisms are employed to properly control gene expression under normal and stressed states. Organization of eukaryotic DNA into chromatin represents a significant layer of gene regulation, and active chromatin (representing sites that are available for transcription and other DNA-templated processes) is marked by specific histone variants and histone modifications [1–3]. For instance, histone H3 has many characterized sites for covalent modification; in particular, lysine 4 on its N-terminal tail (H3K4) can be mono-, di-, or trimethylated, states that generally correlate with enhancers, active genes, and promoters, respectively [4–7]. Indeed, H3K4 methylation is required for the cellular memory of the active gene state and is mediated by a conserved family of histone methylases named SET1, Trithorax, and MLL in yeast, flies, and mammals, respectively. Enhancer elements and promoters are dispersed throughout the genome, and yet histone methyltransferases (such as MLL and DOT family proteins [9–12]) and histone demethylases (LSD1, JARID1A, and UTX) are able to localize to these specific regions and, in a cell-type-specific manner, exert their enzymatic function. Thus, the ubiquitous yet specific nature of these interactions creates an important biological paradox: how do these complexes know which histones to modify and which ones to leave alone?
Characterization of the chromatin landscape revealed that much of the genome is pervasively transcribed [14–16]. Initial efforts to explore the functional consequences of this transcription have revealed long noncoding RNAs (lncRNAs, defined as >200 nt in length) as mainly repressive players in gene regulation. XIST, HOTAIR, and lincRNA-p21 are among the best-studied lncRNAs and have been shown to be involved in X-chromosome inactivation, breast cancer metastasis, and p53-dependent gene repression, respectively [17–21]. These functions occur through interaction with chromatin complexes such as Polycomb Repressive Complex 2 (PRC2) in the case of XIST and HOTAIR. These observations suggest that RNA can provide a gene-specific targeting mechanism for otherwise non-specific enzymatic activity, but until recently the ability of RNA to coordinate activation of gene expression had not been well explored. A notable exception is the roX RNAs in Drosophila, which mark the male X chromosome together with the Male Specific Lethal (MSL) complex to enhance transcription twofold.
In this review, we focus on the identification and characterization of novel ncRNAs and how they act to affect active chromatin and gene expression. First, we discuss new methods that have uncovered novel ncRNA “markers” of active gene states and regulatory elements. Second, we highlight recent mechanistic studies that elucidate the connection between ncRNAs and H3K4 methylation. Collectively, these studies suggest a general role in gene regulation where ncRNAs can mark and often modulate the active chromatin state, both positively and negatively.
Discovering novel ncRNAs is critical to expanding the catalog of known transcripts. The advent of global run-on sequencing (GRO-Seq) has recently enabled the systematic identification of nascent RNA transcripts across the genome. Initial application of this method revealed many short RNA transcripts (less than 250 nt) around gene promoters in both sense and antisense orientations, termed divergent transcripts, and characterized genome-wide polymerase pausing just downstream of the transcription start site. More recently, GRO-Seq was used to characterize androgen-responsive transcription in prostate cancer cells. Interestingly, enhancer-templated RNAs (eRNAs) were responsive to androgen treatment. Active enhancer elements are marked with H3K4me1 and histone H3 lysine 27 acetylation (H3K27ac) [25,26], are depleted of H3K27me3, and show characteristic binding of p300 (a histone acetyltransferase) and Med12 (a subunit of the Mediator complex). Analysis of chromatin modifications and transcription factor binding has recently allowed for identification of novel enhancer elements. Intriguingly, after comparing how well local eRNAs, histone modifications, and regulatory protein (p300 or Med12) binding each correlate with enhancer activity, several groups observed that a defining characteristic of functionally active enhancers is the production of divergent eRNAs [24,25,29]. The marking of enhancer elements by eRNAs as well as H3K4me1 hints at a scenario where eRNAs could interact with chromatin to establish or maintain H3K4me1, or where H3K4me1 promotes eRNA transcription. Looping of enhancer and promoter regions together offers a potential mechanism for eRNAs to affect coding-gene expression (Figure 1A). Further work will be needed to assign a specific function to eRNAs and define what role they play in establishing the chromatin signature at enhancers and promoters.
If enhancers can be transcribed, it may come as no surprise that promoters can also produce noncoding transcripts. A recent study systematically examined the transcriptional landscape of promoters of cell cycle genes over a large number of conditions, including phases of the cell cycle, oncogenic pathway activation, and stem cell differentiation. This study revealed that cell cycle promoters, marked by domains of H3K4me3, indeed produce long noncoding transcripts (>200 nt) in a periodic fashion over the cell cycle, but in a manner that is out of phase with the expression of the neighboring coding genes. Thus, two parallel synchronous waves of gene expression, one coding and one noncoding, underlie human cell cycle progression. Some of these promoter lncRNAs are likely functional; one lncRNA, termed PANDA, is specifically induced by p53 during DNA damage and manages the balance between cell cycle arrest and cell death. Interestingly, PANDA modulates cell death by binding to the transcription factor NF-YA and titrating it away from chromatin, indicating a trans rather than cis mode of action.
Transcription around promoters has recently been reexamined: studies in mammalian and yeast systems have described the process of divergent transcription, in which two distinct RNAPII complexes initiate in opposite directions to produce RNA transcripts [23,31–33]. Analysis of cryptic unstable transcripts (CUTs) and stable unannotated transcripts (SUTs) offered the first evidence of the widespread nature of divergent transcription in yeast. Divergent CUTs were found to be correlated with the expression of their sense protein-coding genes while sense CUTs were anti-correlated, suggesting that sense CUTs may interfere with transcription of the coding gene [32,33] (Figure 1B). A new method termed native elongating transcript sequencing (NET-Seq), based on deep sequencing of nascent RNA fragments bound to RNA polymerase II, has revealed divergent transcripts in more detail. This analysis revealed that, for most promoters, the ratio of antisense to sense transcription was less than 0.25, suggesting that while promoters have the capacity to generate divergent transcripts, at least in yeast, there is a strong directional preference towards the sense orientation. This directional bias was shown to be regulated by Rpd3S, an H4 deacetylase complex, which has activity at the 3' end of coding genes. Interestingly, many of the divergent RNA transcripts in yeast overlap with the 3' ends of coding genes, suggesting that, given the compact nature of the yeast genome, Rpd3S has evolved to control divergent RNA transcription to allow proper transcription of protein-coding genes (Figure 1B).
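The antisense-to-sense ratio discussed above can be illustrated with a minimal sketch. The read counts, promoter names, and function names below are hypothetical and chosen only for illustration; they are not data or code from the NET-Seq study, and a real analysis would also need normalization and strand-specific read assignment. Only the 0.25 cutoff is taken from the text.

```python
from typing import Dict, Tuple


def antisense_sense_ratios(
    counts: Dict[str, Tuple[int, int]]
) -> Dict[str, float]:
    """Return antisense/sense read ratios per promoter.

    `counts` maps a promoter name to (sense_reads, antisense_reads).
    Promoters with zero sense reads are skipped, because the ratio
    is undefined for them.
    """
    ratios = {}
    for promoter, (sense, antisense) in counts.items():
        if sense > 0:
            ratios[promoter] = antisense / sense
    return ratios


def fraction_sense_biased(
    ratios: Dict[str, float], threshold: float = 0.25
) -> float:
    """Fraction of promoters whose antisense/sense ratio is below `threshold`."""
    if not ratios:
        return 0.0
    below = sum(1 for r in ratios.values() if r < threshold)
    return below / len(ratios)


# Hypothetical read counts: (sense, antisense) per promoter.
example_counts = {
    "promoter_A": (400, 40),
    "promoter_B": (200, 30),
    "promoter_C": (100, 80),
    "promoter_D": (0, 5),  # no sense signal; excluded from the ratios
}

ratios = antisense_sense_ratios(example_counts)
biased_fraction = fraction_sense_biased(ratios)
```

In this toy input, two of the three promoters with measurable sense transcription fall below the 0.25 cutoff, which is the kind of directional preference the text describes.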
While divergent transcripts may be at the mercy of sense RNAs in yeast, the mammalian situation is more complicated. Divergent transcription was initially described in mouse embryonic stem cells (mESCs) and human lung fibroblasts, revealing that most genes exhibited divergent transcripts and that these were highly correlated with protein-coding gene expression [23,31]. Examination of the chromatin marks at these promoters revealed unexpected characteristics: the active H3K4me3 mark colocalized with both the sense and antisense RNAPII regions, whereas H3K79me2, a mark for productive transcriptional elongation, was only present downstream in the sense direction, suggesting a mechanism of divergent initiation but unidirectional elongation. Divergent pausing of the RNAPII complexes was recently shown to occur at promoters with divergent RNAs, suggesting that the antisense RNAPII complex forms with many of the same factors as the sense complex. Further work exploring the activity of positive transcription elongation factor b (P-TEFb), a master regulator of RNAPII pause release and productive elongation, on the antisense RNAPII complex revealed that the upstream-antisense RNAs (uaRNAs) are unexpectedly sensitive to its activity. These studies have established divergent transcription as a pervasive and well-conserved process that yields divergent transcripts as well as divergent histone methylation marks (Figure 1A). In yeast, divergent transcripts appear to be involved in a repressive pathway, while in mammalian cells no function has yet been assigned to these RNAs. The fact that uaRNAs in mammalian cells are regulated by the canonical transcription machinery, yet their expression is tightly controlled, could hint at some functional consequence of their transcription, perhaps as genomic anchors for recruitment of transcription regulatory factors.
Functional studies of ncRNAs in yeast have revealed genetic interactions between certain lncRNAs and the methylation status of H3K4. The PHO84 antisense transcript runs through the body and promoter of the PHO84 gene, and can inhibit sense PHO84 transcription in a SET1-dependent manner in cis (the endogenous configuration) or in trans (when the antisense is transcribed from a plasmid) [38,39]. The Ty1 retrotransposon has an antisense CUT RNA named RTL whose expression is anti-correlated with the Ty1 transcript. Interestingly, silencing of the Ty1 transcript is accomplished in trans by the RTL RNA through the Set1 methyltransferase complex. More recent work has classified RTL RNA as a member of the Xrn1-sensitive unstable transcripts (XUTs), given their sensitivity to the cytoplasmic 5' to 3' exonuclease Xrn1. Many of these XUTs are antisense to coding genes and have properties similar to RTL. The widespread nature of these antisense XUTs in yeast and their connection to the Set1 complex suggest an important link between lncRNAs and H3K4 methylation in yeast, specifically that the antisense XUTs may be positively regulated by Set1 and in turn repress the sense transcript (Figure 1B). One aspect of the XUT mechanism that is not clear is how XUTs, which act at the chromatin level to affect gene expression, are signaled for export from the nucleus for Xrn1 degradation. Focused studies exploring the specific mechanism for RNA-mediated gene control between cellular compartments will be of interest in the future.
While most of the mammalian lncRNAs characterized to date repress gene activity, recent studies revealed that perhaps many lncRNAs enhance gene expression. The GENCODE annotation identified numerous new lncRNAs, several of which proved to be activating lncRNAs (ncRNA-a). Depletion of these lncRNAs resulted in decreased expression of nearby protein-coding genes, and experiments fusing cDNAs encoding ncRNAs to reporter genes suggested that several ncRNAs activate gene expression in cis in an enhancer-like manner, although the mechanism remains unclear. Separately, studies of a novel lncRNA termed HOTTIP have provided a paradigm for how enhancer-like ncRNAs may act. HOTTIP is located at the very 5' end of the HOXA homeotic gene cluster. In cells where 5' HOXA genes are active, the locus adopts a highly compact form, looping to bring HOTTIP and the nascent HOTTIP RNA into proximity to multiple 5' HOXA genes. In turn, HOTTIP RNA directly binds to WDR5, a subunit of the WDR5-Ash2L-RbBP5 (WAR) complex associated with all MLL H3K4 methylases, and recruits it to 5' HOXA to enforce H3K4 methylation and gene expression (Figure 1A). Ectopic expression and RNA tethering experiments showed that HOTTIP can only act in cis, and thus is likely strictly dependent on the endogenous chromosome looping to dictate its target genes. Thus, lncRNAs such as HOTTIP translate the spatial information in chromosomal looping into biochemical information in histone modifications. The recent identification of another lincRNA that activates HoxA6 and HoxA7 in cis via direct interaction with MLL1 suggests a common theme.
Broad domains of active chromatin are hallmarks of developmental genes under epigenetic maintenance. Once a lncRNA targets the MLL complex, the relatively non-sequence-specific DNA binding activity of Ash2L, newly recognized through structural studies, spreads the complex across chromatin. At the HOXA locus, point mutations that abolish Ash2L DNA-binding activity in vitro allow the complex to target the HOTTIP element, but the complex fails to spread properly across HOXA. These findings support the notion that lncRNAs have the capacity to enhance gene expression by nucleating H3K4 methylase complexes at specific genes; the complexes then spread or stabilize via a separate DNA binding activity in a manner dependent on chromosomal looping. We await future work that delineates in detail the RNA-protein interactions required to specify these events.
Establishing a more complete catalog of lncRNA transcripts will speed the discovery of novel ncRNAs with functional roles (such as HOTTIP), and recent work in mouse and human cells has revealed large numbers of previously unannotated lncRNAs [47,48]. Extensive deep sequencing of transcriptomes from diverse human tissues annotated more than 8000 lncRNAs, and importantly revealed that, on average, lncRNAs are more tissue-specific than protein-coding genes, suggesting that lncRNAs may play a significant role in determining cellular fate. Guttman and colleagues characterized mESC-specific lncRNAs and, through functional studies, showed that many of these lncRNAs are regulated by pluripotency factors (Oct4, Sox2, and Nanog) and that almost 30% of the ESC-specific lncRNAs appear to interact with at least one repressive chromatin protein complex. Towards addressing the potential functions of lncRNAs at chromatin in an unbiased manner, a new method termed Chromatin Isolation by RNA Purification (ChIRP) was developed. Analogous to ChIP-seq, ChIRP-seq provides genome-wide identification of the specific sites of RNA occupancy on chromatin, and has been applied to the telomerase RNA TERC, roX2, and HOTAIR. These lncRNAs bound hundreds to thousands of sites throughout the genome, had focal footprints (similar to transcription factors), and showed sequence specificity in their localization. Together, shRNA screens and ChIRP should enable high-throughput functional studies of the lncRNA landscape.
Efforts to understand the interplay between ncRNAs and chromatin have established that RNA and chromatin can each affect the other. While features of the active chromatin state have been well studied, the interactions between RNA and active chromatin are only beginning to be explored. Given the pervasive nature of transcription and the near-ubiquitous nature of histone methylation, future work is likely to uncover mechanisms that employ both ncRNAs and the state of chromatin methylation to regulate gene expression.