Cellular pathways must be synergized, controlled and organized to manage homeostasis. To achieve high selectivity within the crowded cellular milieu the cell utilizes scaffolding complexes whose role is to bring molecules in proximity thereby controlling and enhancing intermolecular interactions and signaling events. To date, scaffolds have been shown to be composed of proteinaceous units; however, recent evidence has supported the idea that non-coding RNAs may also play a similar role. In this Point-of-View article we discuss recent data on ncRNA scaffolds, with particular focus on ncRNA HOTAIR. Using our current knowledge of signaling networks we discuss the role that RNA may play in writing and regulating histone modifications and the information needed for correct gene expression. Further, we speculate on additional, yet undiscovered, roles that ncRNAs may be playing as molecular scaffolds.
The inside of the cell is composed of complex, interwoven networks that must be organized and act synergistically to ensure survival. For the cell, organization is further complicated by the crowded nature of the cellular milieu, with the concentration of its constituents rivaling those found in protein crystals. Thus, the cell must overcome a high degree of molecular crowding to achieve extraordinary spatial organization. Understanding how the cell utilizes its molecular components to guide the correct interactions and control signaling cascades is a fundamental challenge for biologists.
One way in which the cell can overcome the challenge of organizing specific interactions is to utilize scaffolding molecules that can bring several components together, guiding them to enhance their activities. Scaffolding proteins have been shown to act as molecular landing pads for metabolic enzymes, trans-membrane receptors and ion channels.1 Specifically, scaffolding proteins can act by providing platforms on which multiple signaling molecules can assemble, coordinating feedback pathways, modulating the modification of their binding partners and coordinating the localization of their substrates.2 The importance of scaffolding proteins has been further underscored by the discovery that they play critical roles in immune response, mating and G-protein signaling.3–8
Proteins have historically been viewed as the powerhouses of the cell. Nevertheless, molecular evolution studies suggest that RNAs contributed significantly to the development of modern organisms by doing more than mediating the flow of genetic information between DNA and proteins. RNA can be found at the heart of complex machines such as the ribosome, RNaseP, telomerase and spliceosome.9–12 Structural studies on these indispensable RNA-protein complexes have revealed a recurring theme: RNA can play a key scaffolding role that is critical for biological activity. These are classic examples of how RNA can play a structural role in essential biological processes, but they represent a minute portion of the non-coding transcriptome.
The human genome contains only ~20,000 protein-coding genes, representing <2% of the total bases. Interestingly, a substantial fraction of the human genome is transcribed to yield many short or long non-coding RNAs (ncRNAs) with limited protein-coding capacity.13 Long non-coding RNAs are a set of non-coding transcripts that are greater than 200 nt in length. Long intergenic non-coding RNA (lincRNA) molecules are one such example of long non-coding RNAs. LincRNAs are transcribed by RNA Polymerase II, capped, spliced and polyadenylated.14,15 Originally, the majority of these transcripts were disregarded as nonfunctional transcriptional noise, due to their reduced sequence conservation. In addition, recent studies have proposed that the majority of “dark matter” transcripts are associated with known genes, or are the consequence of transcriptional truncations or error.16–19 Nevertheless, the utilization of unique chromatin marks (K4-K36 bivalency) and the reconstruction of RNA-Seq maps without relying on previous annotations have further supported the idea that lincRNAs are unique transcripts.20,21 These studies have revealed thousands of lincRNAs differentially expressed in several mouse and human cell types.20–22 Furthermore, molecular profiling and genetic screens have identified lincRNAs that play a role in dosage compensation, imprinting, developmental gene expression and reprogramming of human induced pluripotent stem cells in an allele- and cell-type specific manner.23,24 These studies hinted that lincRNAs may be regulators of epigenetic mechanisms, actively participating in pathways that would control gene expression. Despite these exciting developments, the mechanistic basis of how these RNAs work is still mostly unknown.
In 2007, Rinn et al. identified the lincRNA HOTAIR and showed that it associates with and targets Polycomb Repressive Complex 2 (PRC2) to distantly located genes. PRC2 is a chromatin modifying complex consisting of H3K27 methylase EZH2, SUZ12 and EED. This work therefore provided evidence that lincRNAs may globally affect gene expression.25 This observation has been extended by others to show that many well-characterized intergenic RNAs are indeed binding to these complexes,26–28 thus revealing a long-sought-after missing puzzle piece. Following these observations, a prevailing model suggests that lincRNAs interact with and guide chromatin-remodeling complexes to specific regions on the genome, thereby controlling gene expression.22,29 The majority of these lincRNAs are believed to target these complexes in cis, tethering them to adjoining regions on the chromosome.30 Recent studies of several chromatin modification complexes revealed that thousands of lincRNAs as well as antisense and promoter-associated ncRNAs bind to chromatin modifying complexes.22,28,31 These results further support the notion that lincRNAs are key players in epigenetic regulation.
The lincRNA HOTAIR is transcribed from the HOXC locus and targets Polycomb to silence HOXD and hundreds of select genes on other chromosomes.25 This observation drew much excitement, as it was the first example of a lincRNA that acts in trans to regulate genes at a distance, suggesting that lincRNAs may play more complex roles in gene regulation. Subsequent experiments demonstrated that the HOTAIR-PRC2 interaction could promote the onset of metastasis and that PRC2 and HOTAIR are functionally dependent on each other for promoting tumor invasiveness.32 Despite these exciting developments, critical fundamental questions remain regarding the properties, biological activity and functional role of the RNA-protein complexes involving HOTAIR. Could there be more proteins binding to HOTAIR? What regions of HOTAIR are important to binding these complexes? Are there mutually exclusive pieces of HOTAIR that select for specific protein complexes?
The silent chromatin state of HOX genes is regulated by synergistic actions of PRC2 and the LSD1-CoREST complexes, the latter being an H3K4me2 demethylase that removes an active chromatin mark.33,34 Thus, it was hypothesized that HOTAIR may also bind the LSD1-CoREST complex. Immunoprecipitation (IP) of LSD1 retrieved endogenous HOTAIR with comparable enrichment to that of PRC2 IP. Knockdown of HOTAIR resulted in genome-wide changes in the occupancy of PRC2 and LSD1, and concurrent loss of H3K27me3 and gain of H3K4me2.35 These results gave important insight into how RNA could bridge two independent complexes together on chromatin. Does HOTAIR contain specific pieces of information that mediate coordination of PRC2 and LSD1?
Deletion analysis of HOTAIR demonstrated that independent pieces of the RNA bind to PRC2 and LSD1 in a mutually exclusive manner. The first 300 nucleotides bind to PRC2, and the last ~600 nucleotides to the LSD1 complex. Extensive structural analysis by nuclease footprinting revealed that each domain is composed of unique secondary structure elements.35 Thus, these structural units may provide the correct information to select the right protein complex in a sea of others. These studies demonstrated that a lincRNA can function as a scaffold, akin to proteins, to control cellular signaling events (Fig. 1). Further biochemical evaluation of lincRNAs will be necessary to provide evidence that lincRNAs, like other elaborate ncRNAs, contain specific structural and informational pieces that impart a scaffolding function.
Indeed, RNA is poised to be a unique scaffolding molecule. RNA structure is malleable and dynamic. This would make RNA, unlike its protein counterparts, suited to interact with a certain set of binding partners under one structural state and, under another, bind to a whole new subset of interactors. The simplicity of this transition is allowed by RNA's unique ability to adopt several structural states that have similar energetic properties.36 An example of such an interplay was recently observed with the discovery that the human vascular endothelial growth factor-A (VEGFA) 3′-UTR undergoes a binary conformational change in response to an environmental signal, thereby controlling a binding switch between the IFNgamma-activated translation inhibitor complex and heterogeneous nuclear ribonucleoprotein L (HNRNPL, also known as hnRNP L).37 This is just one example where the structural characteristics of RNA are exploited to employ differential scaffolding roles, leading to control over a gene expression event.
As illustrated in Figure 1, lincRNAs like HOTAIR can serve as the informational template to program the chromatin state at target genes. In the familiar role of messenger RNAs (mRNA) as the template for protein synthesis, sequential codons on the mRNA are recognized by transfer RNA and the ribosome, leading to the incorporation of specific amino acids that correspond to the sequence of the mRNA. By analogy, we suggest that lincRNA may have a similar role in templating the epigenome. HOTAIR, and likely other lincRNAs, contain multiple binding sites for distinct histone modification enzymes that direct specific combinations of histone modifications on target gene chromatin.38 Based on their dynamic patterns of expression, specific lincRNAs can potentially direct complex patterns of chromatin states at specific genes in a spatially and temporally organized manner during development and disease states. The highly regulated patterns of lincRNA expression and their control by key transcription factors involved in lineage-specific development and human disease states are consistent with this idea.20 Another possible consequence of induced proximity by lincRNAs is that the enzymes may act on one another, thereby modifying the activity of one or more of the interacting partners. It is well known that histone modification enzymes, such as lysine methylases, lysine demethylases, lysine acetyltransferases and lysine deacetylases can act on nonhistone substrates, including each other.39,40 In the most general sense, a lincRNA can be viewed as a molecular matchmaker that can be used to program a complex series of chemical events. The recent illustration of several lincRNAs that can bind diverse chromatin and transcriptional regulators suggests rich combinatorial possibilities for this mode of regulation.41 Studies focused on the informational content contained within lincRNAs are sure to reveal additional levels of programmed sophistication.
The diversity of histone modifications has suggested the concept of a “histone code”, but in vivo patterns of histone modifications are dominated by a remarkably small number of patterns out of all combinatorial possibilities.42,43 Protein-protein interactions between histone modification enzymes, as well as coupled histone “reader” modules, enforce the cross-talk and segregation of specific histone modifications. LincRNAs may well represent an additional mechanism to program and refine the chromatin landscape.
So far, a significant portion of identified lincRNAs have not been shown to perform specific biological functions. However, a substantial subset has been documented to associate physically with chromatin modification complexes, suggesting that they may play roles in gene expression or other chromatin-templated processes.20,25,35,44 Nevertheless, the large number and extensive length of lincRNAs suggest that they may be capable of serving additional functions. Inside the cell, RNA is rarely found without a protein partner, and it has been shown that mRNAs can interact with several RNA-binding proteins (RBPs) throughout their lifecycle.45,46 Interestingly, proteins originally disregarded as RBPs have been shown to moonlight as RNA binders. One such group of newly classified RNA interactors is composed of metabolic enzymes.47 Classic examples are iron regulatory proteins, glyceraldehyde-3-phosphate dehydrogenase, thymidylate synthase and dihydrofolate reductase.48–51 The entire catalog of these interactions remains to be discovered. Nevertheless, it is conceivable that lincRNAs may act as scaffolding units to enforce local proximity between an enzyme and its substrate, thereby increasing the local concentration, similar to the protein scaffold Ste5.52 Further, lincRNAs may act as direct regulators of enzymatic activity by binding to the enzyme to either activate or suppress its activity. Such action has been observed with the mitochondrial enzyme NAD+-specific isocitrate dehydrogenase (IDH). IDH binds to the 5′-end of mitochondrial mRNAs, and this binding inhibits its activity.53,54 Newly developed whole-transcriptome approaches to identifying RNA-protein interactions will surely aid in identifying new RNA-binding enzymes and potential lincRNA-enzyme complexes.55–57 LincRNAs such as HOTAIR and Xist bind directly to chromatin remodeling enzyme complexes; whether or not they directly modulate their activity remains to be discovered.
It is stimulating to speculate that lincRNAs may provide scaffolding to complex enzymatic structures outside of the nucleus, and that they may act as stoichiometric modulators of enzymatic activity.
RNA sequences have been shown to bind small molecules and their interactions can have adverse effects on gene expression. The classic examples of such interactions are riboswitches.58,59 Riboswitches are RNA sequences that act in cis by binding to small molecules to modulate transcription and translation.60,61 Identified small molecules that bind RNAs are ions, protein enzyme cofactors and nucleotide analogs.62 They act by tuning or radically changing the structure of the RNA, usually to modulate protein-binding sites downstream of the aptamer region. Most signaling cascades involving protein scaffolds portray classical input-output behavior in response to feedback loops or switching due to stimuli.63 LincRNAs, similar to their protein-coding cousins, may bind to small molecules and influence transcription-centric feedback loops. Chemical donors utilized by enzymes that modify chromatin, S-adenosyl methionine for example, have been shown to bind to RNA. The logic of such a binding event in this case may be to modulate RNA recognition of chromatin or chromatin remodeling enzymes, thereby providing a new mechanism in which RNA recognition of small molecules would modulate transcription. These activities would also further underscore RNA's role as a modular scaffold whereby inputs could come from a variety of sources.
Recent genomic experiments have generated a plethora of sequencing information describing the abundance and composition of long ncRNAs. However, this large amount of information has created a bottleneck in moving from lists of transcripts to mechanistic insights. The long ncRNA field is now poised to benefit from targeted and systematic analyses of the pathways and components involving lincRNA binding partners and complexes. Extensive interrogation of protein scaffolds has revealed a broad array of functionalities, demonstrating their necessity to the organization and homeostasis of the cell. We await similar findings to improve our understanding of lincRNA biology, ultimately expanding the role that non-coding RNA plays in the cell. Just like an iceberg, the information that lies beneath these initial layers of complexity may be vast.
R.C.S. is supported by the NIH T32(AR-007422). H.Y.C. is supported by the California Institute for Regenerative Medicine and M.T. by the Susan G. Komen Foundation. H.Y.C. is an Early Career Scientist of the Howard Hughes Medical Institute.