It is widely accepted that the genome is regulated by histone modifications that induce epigenetic changes on the genome. However, it is still not understood how ubiquitously expressed chromatin modifying complexes are “guided” to specific genomic sites to induce intricate patterns of epigenetic modifications. Large non-coding RNAs were previously believed to represent “genome junk”, but it is becoming increasingly clear that they associate with chromatin modifying complexes. Here we explore an intriguing hypothesis that large non-coding RNA molecules might represent a molecular trafficking system that modulates chromatin modifying complexes to establish specific epigenetic landscapes.
During development, the genome undergoes a complex choreography to establish distinctive gene expression patterns that define cellular identity. These changes are mediated through the presence of specific histone modifications and DNA methylation patterns, which are established by ubiquitously expressed chromatin modifying complexes with unknown specificity. However, what guides these complexes to distinct and specific sites under different cellular contexts is not understood. Almost 35 years ago, a first clue that RNA may play a role in this process came from the observation that chromatin was associated with several unknown RNAs. Two key studies further demonstrated that RNA was a critical component in the global localization of chromatin modifying complexes [2,3]. For example, depletion of single-stranded (ss)RNA, but not ssDNA, was shown to disrupt the localization of key histone modifications [2,3].
Indeed, several recent studies have begun to unravel the association of large non-coding RNAs with enzymatic complexes that establish these epigenetic landscapes [4, 5**,6**]. These studies suggest a potential role for large non-coding RNAs in regulating chromatin state. Specifically, large non-coding RNA molecules might be required for the specificity of chromatin formation across the genome [4, 5**,6**,7,8,9**]. Thus, expression patterns of non-coding RNAs may influence specific epigenetic states by interfacing with chromatin modifying complexes and thereby imparting specificity.
Although these examples suggest a key role for RNA in epigenetic regulation, it is not understood how RNA imparts specificity to otherwise ubiquitous chromatin modifying complexes. There is a wealth of information about small non-coding RNAs regulating chromatin [2,3,10–14]; in this review, however, we specifically focus on large non-coding RNAs in mammalian systems. Here, we discuss several recent studies that have gleaned insights into possible roles for large non-coding RNAs in modulating chromatin modifying complexes. By using examples of X inactivation, HOX gene regulation and imprinting, we propose putative models of how large non-coding RNAs could, in part, serve as a genetic trafficking system.
X chromosome inactivation is a classic and dramatic example of RNA based establishment of epigenetic regulation. Briefly, X chromosome inactivation is a process in female mammalian cells in which one copy of the X chromosome is inactivated. This ensures that females produce the same dosage of X-linked genes as males, which have only one X chromosome [15,16]. Remarkably, a multi-exonic, spliced, capped and poly-adenylated large non-coding RNA known as Xist (X inactive specific transcript) is expressed on only one female X chromosome and induces the entire chromosome to become transcriptionally inactive [17–23]. This RNA was the first example to illustrate that a large non-coding RNA can be required for epigenetic changes. The coating of the future inactive X chromosome by Xist ultimately changes the epigenetic program of an entire chromosome [16,24–26].
Polycomb group complexes (PcG), such as PRC2 (comprised of the proteins EZH2, EED and SUZ12), are chromatin-modifying complexes that are conserved from flies to mammals and are required for proper establishment of heterochromatic regions genome-wide. As first demonstrated with the large non-coding RNA HOTAIR (see HOX section), PRC2 was also found to colocalize with Xist at the inactive X chromosome [4,5**,26,28]. This binding was shown to occur via a 1.6 kb non-coding RNA encoded within Xist, named RepA [5**,29*]. Furthermore, a highly conserved double-stem loop structure in RepA was found to directly bind components of the PRC2 complex, most likely Ezh2 [5**]. More recently, another study validated the binding between RepA and PRC2 and found that the full-length RepA sequence (consisting of two double-stem loop structures) binds to Suz12 via structural interactions between the stem-loop repeats and spacers that stabilize the binding interaction [29*]. Although it is too early to determine the exact interactions between RepA and PRC2, it is clear that PRC2 forms a ribonucleoprotein interaction with RepA that is critical for the deposition of histone 3 lysine 27 trimethylation (H3K27me3), leading to chromosome-wide heterochromatin formation [5**].
A further layer of RNA dependent recruitment of PRC2 involves a large non-coding RNA transcribed antisense to Xist, namely Tsix. In contrast to Xist, Tsix ceases to be expressed on the future inactive X chromosome and instead is increasingly expressed on the future active X chromosome. Hence, it was also suggested that Tsix could compete for the binding of PRC2, and in this way might reduce PRC2 binding to RepA [5**]. One possibility is that Tsix prevents Xist function by reducing heterochromatin formation on the active X [5**]. All these findings demonstrate that physical association of PRC2 with large non-coding RNAs such as RepA is required to bestow PRC2 localization on the future inactive X chromosome. Hence, X inactivation represents a powerful model of RNA dependent cis recruitment of chromatin modifying complexes.
The Homeobox transcription factors (HOX genes) were famously discovered for their ability to transform the identities of body segments in fruit flies. In mammals, 39 HOX genes are encoded across four loci (HOX-A to HOX-D) on different chromosomes. The relative position of each HOX gene within a cluster is reflective of its spatial and temporal expression along the proximal-distal and anterior-posterior axes in developing embryos that define a unique positional cellular identity [32,33]. HOX gene expression is coordinately regulated by the dynamic interplay between PcG and Trithorax complexes, which establish heritable collinear domains of heterochromatin and euchromatin, respectively [34–36].
Interestingly, recent studies have revealed numerous large intergenic non-coding RNAs (lincRNAs) within the HOX clusters, which exhibit spatial and temporal patterns of expression similar to those of the HOX genes [4,37,38]. It has been demonstrated that one lincRNA encoded in the HOX-C cluster, termed HOTAIR, physically associates with PRC2. It is also required for the proper maintenance of a large domain of heterochromatin, but in a different cluster, namely HOX-D. It is not known how HOTAIR interacts with HOX-D since it has very little sequence homology, yet there are several possible models for how this regulation could occur. For example, HOTAIR could bind to PRC2, facilitating ribonucleic interactions with sequence specific adaptor proteins that recognize sequences in the HOX-D locus (Figure 2c). Consistent with this hypothesis, HOTAIR was recently determined to interact with COREST (RCOR1) [7, M.C. Tsai and H.Y. Chang personal communication], which serves as a molecular beacon for silencing of neuronal specific genes, several of which flank HOX-D.
Alternatively, HOTAIR could form triple helix interactions with the HOX-D cluster, which are known not to follow Watson-Crick base pairing (Figure 2b). Regardless, loss of HOTAIR function results in decreased levels of H3K27me3 and PRC2 binding, leading to the activation of HOX-D genes in trans. Hence, HOTAIR helps to maintain characteristic HOX gene expression patterns, and shows RNA based recruitment of PRC2 in trans, revealing a potentially new form of genomic crosstalk. It is likely that numerous other HOX large non-coding RNAs bind PRC2 and assist in the proper orchestration of HOX gene expression.
In addition to repressive polycomb protein regulation, HOX genes are also epigenetically regulated by trithorax proteins (TRX). TRX proteins catalyze histone H3 lysine 4 trimethylation (H3K4me3), thereby establishing the positional patterns of euchromatic HOX domains. In one study it was demonstrated that a Drosophila large non-coding RNA Ultrabithorax (Ubx) binds to the methyltransferase trithorax (Ash1) and guides the complex to properly regulate the collinear expression of Ubx. However, a contradictory observation reported that this large non-coding RNA serves as a repressor for Ubx. Nonetheless, both studies underscore the role of a lincRNA in the establishment of epigenetic states. Another study found that the TRX protein Mll1 physically associates with two antisense HOX large non-coding RNAs, Evx1as and HoxB5/6as, during mammalian embryoid body (EB) differentiation. Overall, these studies demonstrate that several large non-coding RNAs bind TRX group proteins in both flies and mammals.
Despite representing a fraction of the genome, HOX clusters encode a bounty of large non-coding RNAs. Many of these HOX large non-coding RNAs share a common role in interacting with chromatin modifying complexes such as PRC2 and TRX to coordinate a dynamic regulation of epigenetic states and establish proper domains of HOX gene expression.
In mammals, somatic cells possess two copies of a gene (alleles), one inherited from the mother and the other from the father. Most alleles are expressed simultaneously. However, a small fraction, referred to as imprinted genes, are differentially expressed depending on whether the gene was maternally or paternally inherited [45,46]. Imprinted genes are regulated in cis by imprinting control regions (ICR), which can repress adjacent genes by utilizing large non-coding RNAs. Despite regulating diverse genomic loci, these RNAs share an emerging mechanistic theme: antisense large non-coding RNAs that interact with chromatin modifying complexes result in silencing of maternal or paternal alleles. Recently, two large non-coding RNAs involved in imprinting were shown to bind chromatin modifying complexes and to be required for the proper localization of histone modifications [6**,8,9**].
For example, most of the genes near the Kcnq1 gene are expressed from the maternal chromosome, except the paternally expressed antisense non-coding RNA termed Kcnq1ot1. Recently, it was shown that Kcnq1ot1 represses the paternal allele of Kcnq1 by interacting with the histone methyltransferases G9a and PRC2 [6**,8,49]. A similar observation was reported with the large non-coding RNA Airn, which in the placenta imprints the Igf2R, Slc22a2 and Slc22a3 genes on the paternal allele [9**]. Interestingly, truncation of Airn results in the loss of G9a accumulation at the Slc22a3 promoter region and in a loss of Slc22a3 silencing. This suggests that Airn helps to accumulate G9a at the promoter region to silence Slc22a3 [9**]. Although Airn and Kcnq1ot1 may or may not directly interact with the chromatin modifiers, it is clear that depletion of these large non-coding RNAs results in the loss of histone modifications at imprinted loci, presumably through displacement of key chromatin modifying complexes [4,6**,7].
Here we have surveyed several recent studies that demonstrate a common theme: large non-coding RNAs bind to chromatin modifying complexes such as PRC2, TRX and G9a and impart specific silencing of genomic loci both in cis and trans [4,8,9**,42,44]. Are these only distinct examples or can they be generalized to a common theme? A recent study demonstrated that numerous lincRNAs bind to PRC2 and multiple other chromatin modifying complexes. This suggests a more global role of lincRNAs, and we hypothesize that these might guide chromatin-modifying complexes to specific target sites. In support of this notion, the above study also demonstrated that lincRNAs are required for proper PRC2 mediated gene repression at gene loci that are normally repressed by PRC2. Collectively, these studies have revealed a global theme: large non-coding RNAs are required for the proper establishment of chromatin domains, possibly by steering epigenetic modifying complexes to their specific destinations. However, it is crucial to note that the models presented here do not exclude other proposed or plausible mechanisms. Here we only focus on a few possible RNA based mechanisms that may impart specificity to otherwise ubiquitous chromatin modifying complexes.
First, large non-coding RNAs could serve as RNA based tethers to recruit, either directly or indirectly, chromatin modifying complexes to the specific site of RNA transcription. Thus, a code of large non-coding RNAs could be expressed in a cell specific manner, providing a map for proper epigenetic landscaping in cis (Figures 1 and 2a). In contrast to the cis model, large non-coding RNAs could also act in trans, and could target multiple sites of action (Figure 1b).
There are several mechanistic models by which large non-coding RNAs could act in trans (Figure 2b–d). One possibility is the formation of an RNA-DNA triplex, which could serve as a beacon to recruit epigenetic modifying complexes either directly or indirectly (Figure 2b). This may be more likely than an alternative RNA-DNA duplex model, considering that RNA sequence complementarity to DNA sequence has not yet been observed for trans acting large non-coding RNAs. In contrast, RNA-DNA triplexes do not rely on Watson-Crick base pairing and may be an alternative explanation of RNA-DNA interactions.
Another possibility is that large non-coding RNAs modulate chromatin modifying complexes by serving as an architectural scaffold that directly or indirectly links chromatin complexes to DNA binding proteins or transcription factors (Figure 2c). Alternatively, RNA could bind directly or indirectly to chromatin remodeling proteins and thereby change the shape of the protein or protein complex. This structural change might modulate the activity of the complex, confer site specificity or serve as an allosteric regulator (Figure 2d).
Finally, we propose that ribonucleoprotein interactions with chromatin modifying complexes such as PRC2 could mediate long-range interactions (both intra- and inter-chromosomal) between DNA elements (Figure 2e). This raises an intriguing possibility that enhancer elements may contain large non-coding RNAs that mediate multi-lateral interactions with protein coding genes that result in “looping” of DNA to establish epigenetic control of neighboring genes. Consistent with this idea, the interaction of Kcnq1ot1 with chromatin modifying complexes does not silence a continuous domain of neighboring genes. Since several genes are skipped, perhaps Kcnq1ot1 mediates three-dimensional DNA-RNA-protein interactions that precisely define regulated genes in the imprinted locus. It is important to note that numerous other enhancer-like control regions may be transcribed in a manner similar to Kcnq1ot1, Airn and Xist. For these RNAs, the studies discussed above demonstrate that silencing is not due to passive transcription; rather, the large non-coding RNA molecules themselves are required to facilitate the recruitment of chromatin modifying complexes and the silencing of these loci [5**,6**,9**].
Further investigation is required to resolve the mechanism by which large non-coding RNAs and proteins such as chromatin-modifying complexes interact, and whether this interaction is important for the establishment of distinctive epigenetic states. However, it is clear that ncRNA molecules play a key role in regulating epigenetic landscapes in X inactivation, HOX gene regulation, imprinting and possibly numerous other biological processes.
In the 19th century, Lamarck’s idea that organisms inherit beneficial, environmentally acquired characteristics was overshadowed by Darwin’s theory of natural selection. However, Lamarck’s idea may still apply to non-coding RNA. Large non-coding RNAs might represent a way by which characteristics are propagated from mother to daughter cell and from generation to generation, perhaps in response to environmental cues. As we have summarized here, numerous large non-coding RNA molecules can attract chromatin-modifying complexes and might help establish heritable chromatin states. Thus, we envision the possibility that large non-coding RNAs could be distributed between dividing cells and ensure epigenetic memory.
Consistent with this idea, it was recently demonstrated that the injection of a non-coding RNA (mir-124) into fertilized mouse eggs yielded a stable and heritable epigenetic change to the Sox9 locus that further led to a transmissible growth phenotype across several generations of progeny. Paramutation is another such example of RNA based genome molding; it describes the non-Mendelian interaction between two alleles of a single locus that results in heritable changes of one allele by the other allele [52–54]. Several studies have demonstrated that RNA is required to ensure that these changes can be inherited from one generation to the next [55,56]. Collectively, these studies support our hypothesis of the importance of large non-coding RNAs in genome plasticity and epigenetic memory.
By way of analogy, we propose an “RNA air traffic control” model to impart epigenetic inheritance. Daily, thousands of planes fly across the globe without inherent specificity, but are guided to their destinations by air traffic control signals. Similarly, there are many distinct chromatin-modifying complexes and other proteins traveling around the nucleus without inherent specificity. Perhaps large non-coding RNAs serve, at least in part, as an adaptable genetic air traffic control system to bring chromatin complexes to unique epigenetic destinations. If true, we could imagine reverse engineering the code of large non-coding RNAs to restore misregulated epigenetic states in disease models. Although this model is far from established, numerous recent studies are shedding new light on an emerging role of large non-coding RNAs in epigenetic regulation.
We would like to thank Sigrid Hart from the Broad Institute for the illustrations, M. Guttman, M. Cabili, M. Huarte, L. Goff, A.K. Khalil and J.S. Mattick for critical comments on the manuscript.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
Papers of particular interest, published within the period of review, have been highlighted as:
* of special interest
** of outstanding interest