Non-protein-coding (nc) RNAs are diverse in their modes of synthesis, processing, assembly, and function. The inventory of transcripts known or suspected to serve their biological roles as RNA has increased dramatically in recent years. Although studies of ncRNA function are only beginning to match the pace of ncRNA discovery, some principles are emerging. Here we focus on a framework for understanding functions of ncRNAs that have evolved in a protein-rich cellular environment, as distinct from ncRNAs that arose originally in the ancestral RNA World. The folding and function of ncRNAs in the context of ribonucleoprotein (RNP) complexes provide myriad opportunities for ncRNA gain of function, leading to a modern-day RNP Renaissance.
Many eukaryotic genomes have a relatively sparse content of protein-coding loci. However, substantial regions beyond those devoted to protein-coding mRNA production are transcribed. The rapid increase in ncRNA discovery has outpaced functional studies or even a system of annotation based on transcript biogenesis, processing, or fate. Still, broad classes of ncRNAs have emerged. Even the most short-lived transcripts, those degraded at the site of synthesis, can have significant biological activities. Indeed, at least some of the cis-acting natural antisense RNAs and ncRNAs that regulate gene expression, chromatin dynamics, and chromosome structure could function primarily as nascent transcripts [2–4]. Other transcripts, produced from a single DNA strand or synthesized in a bidirectional manner, form inter- or intramolecular duplexes to yield the ~20–30 nucleotide small RNA molecules that direct RNA silencing-related pathways.
This review focuses on a third class of functional ncRNA, which for simplicity here will be designated as structured ncRNA. The term “structured” is intended in a broad sense, referring to folded motifs that may be interspersed within a larger, mostly unstructured transcript or that may fold only in association with a particular ligand. The function of ncRNAs in this category is generally envisioned to require biologically stable accumulation, conferred by association with one or more proteins. RNPs that harbor ncRNAs have many possible forms, with static or dynamic subunit compositions and RNA-protein interactions of varying degrees of sequence- or structure-specificity.
Some structured ncRNAs emerged as functional molecules relatively early in evolution and in their extant versions can be considered descendants of an ancestral RNA World. However, the majority of ncRNAs appear to have evolved more recently. Here, we explore the properties of ncRNAs that evolved in a protein-rich cellular environment, in contrast with the ancestral ncRNAs that developed function prior to their acquisition of protein partners (or the very existence of proteins). This evolutionary distinction has implications for ncRNA function, as discussed in the first section below. Subsequent sections illuminate distinct mechanisms of ncRNA function, including ncRNA roles as scaffolds of macromolecular complex assembly, hybridization guides, templates for polymer synthesis, and beyond. Importantly, the selected ncRNA examples illustrate a range of evolutionary scenarios: acquisition of function by a novel transcript, diversification of RNPs assembled by a single ncRNA, and diversification of a ncRNA family that retains shared protein partners.
Current versions of the structured ncRNAs that debuted early, arising in a RNA-dominated World, have been paradigms for insights about fundamental principles of RNA folding and catalytic mechanisms. These ncRNAs are proposed to have evolved their protein interaction partners in an early RNP World (Figure 1), driven by the ability of short peptides to improve and regulate RNA folding and function. Following the transition to a largely Protein World, rich with protein enzyme active sites, the selective pressure for evolution of new catalytic RNAs would have been low. However, predominance of protein-based catalysis has not halted the acquisition of new functions by ncRNAs. Instead, we propose that the rich diversity of RNA-protein interaction modes evident in modern organisms facilitates ncRNA gain of function in the context of RNPs. To emphasize the important distinctions between ancestral and ‘modern-day’ evolution of function by ncRNAs, the post-protein explosion of ncRNA complexity can be termed the RNP Renaissance (Figure 1).
A well-documented, logical role for structured ncRNAs of the RNP Renaissance is to act as versatile platforms for protein assembly (Figure 2). This role capitalizes on the prior evolution of a large inventory of protein folds and an enormous structural diversity of protein-RNA interactions. Transcription can support synthesis of extremely long RNA polymers in vivo, and yet only short motifs are required for highly specific protein interactions. These features make RNA well suited for gain of function as a protein-bridging platform of macromolecular assembly. Indeed, the functions of most structured ncRNAs are likely to depend at least in part on their protein-scaffolding abilities.
RNA scaffolds have distinctive structural properties. Constraints on the relative positioning of proteins bridged by RNA can be tight (if the scaffold is rigid), rotationally flexible (if the otherwise rigid scaffold has hinges), or variable over great distance (if the scaffold harbors segments of duplex RNA with many hinges, has extended single-stranded as well as duplex regions, or undergoes conformational dynamics on a biological time scale). Proteins may exchange from the scaffold with different kinetics, through mutually exclusive, independent, or coordinate interactions. While bridging of several different proteins may be the most common function served by RNA scaffolds, structured ncRNAs can also bridge or nucleate the assembly of multiple subunits of the same protein. For example, heat shock RNA 1 appears to function by trimerizing the heat shock transcription factor HSF1, which stimulates HSF1 function as a transcriptional activator.
The use of RNA-scaffolded assemblies for chromatin specialization has evolved independently in numerous biological contexts. The Drosophila roX1 and roX2 RNAs are notable examples of this ncRNA function. RNPs assembled on functionally redundant roX1 or roX2 accomplish dosage compensation in males by increasing gene expression from the singleton X chromosome. The roX RNAs bind and bridge numerous proteins (Figure 2A), including chromatin-modifying enzymes. These RNPs are preferentially recruited to specific chromosome loci and then spread to flanking binding sites, eventually generating a RNP-coated chromosome with a characteristic banding pattern. In addition to their scaffolding role, the roX RNAs either directly or indirectly serve as allosteric activators of RNP enzyme activity in chromatin modification. Although chromatin surrounding a roX RNA expression site preferentially recruits roX RNPs, this feature is not a fundamental requirement for ncRNA function: sites of roX RNA expression and function can be physically unlinked. Mammalian X-chromosome inactivation also involves spreading of ncRNA on chromatin, albeit in a manner more strictly cis-linked to the ncRNA expression locus. These and other examples highlight the theme of ncRNA function in chromatin specification and raise the prospect that long, partially structured ncRNAs are particularly well suited to mediate protein spreading along the length of a chromosome.
The scaffolding function of ncRNAs can be exploited to regulate protein activities in response to changing cellular conditions, as demonstrated by the human 7SK RNA. This abundant nuclear ncRNA forms distinct RNP assemblies that are in dynamic, stress-regulated exchange (Figure 2B). In one RNP form, 7SK RNA negatively regulates the transcription elongation factor P-TEFb by sequestering it in a multisubunit RNP complex [14,15]. In alternate RNP(s), 7SK RNA interacts with a distinct set of proteins to form complexes predicted to have reciprocally related function [16–18]. 7SK RNA has long been thought to be vertebrate-specific, but recent evidence suggests that it has a wider evolutionary distribution. This finding presents an opportunity to investigate the phylogenetic diversification of 7SK RNP composition and biological regulation as a model for understanding structured ncRNA gain of function in the RNP Renaissance.
Another role for structured ncRNAs that has been particularly well-characterized is to act as a guide for Watson-Crick base-pairing. Even nascent transcripts can base-pair with a complementary nucleic acid target, if a suitable region of sequence is accessible for hybridization. But the RNP context of a structured ncRNA provides opportunities for improved hybridization specificity and expanded diversity in the biological outcome of hybrid formation. Large families of structured ncRNAs can share a conserved motif architecture that establishes both the specificity of ncRNA assembly with protein partners and the position of the sequence that will function as a hybridization guide (Figure 3). Gene duplication followed by guide-sequence divergence can expand the scope of RNP hybridization targets (Figure 3, right), while sequence changes in other regions of the ncRNA can expand the scope of RNP function in a different manner, giving entirely new biological roles to the ncRNA (Figure 3, left). In these examples, the RNA motifs and protein interaction partners that protect ncRNA biological stability are conserved while additional ncRNA motifs are reshaped through natural selection. Evolutionary adaptation of ncRNA in RNP context thus provides enormous potential for ncRNA-mediated diversification of RNP architecture and function.
The small nucleolar (sno) RNAs are representative structured ncRNA families that share motifs for protein interaction but possess divergent guide sequences for target hybridization. SnoRNAs typically guide base and sugar modifications of ribosomal RNA. The shared snoRNP proteins include an enzyme that catalyzes the modification reaction, but its activity and substrate specificity depend on base-pairing of the snoRNA internal guide with the intended RNA target of modification. A subset of snoRNA family members have an additional motif that mediates preferential association with Cajal bodies rather than the nucleolus; these so-called scaRNAs modify small nuclear RNA rather than ribosomal RNA targets. Newly evolved snoRNA family members have also been proposed to regulate adenosine-to-inosine editing and pre-mRNA splicing [22–24]. The snoRNA structural platform has even been appropriated to fulfill functions beyond target hybridization, for example acting within the vertebrate telomerase RNA to direct precursor processing, mature RNA accumulation as RNP, and RNP localization.
Beyond the biological exploitation of structured ncRNAs as specificity factors for hybridization to a target nucleic acid, structured ncRNAs can present an internal single-stranded region for use as template. Telomerase RNAs harbor a template for synthesis of the telomeric-repeat DNA at eukaryotic chromosome ends. Across evolution the family of telomerase RNAs has retained the presence of a template and the ability to recruit telomerase reverse transcriptase to copy it, but otherwise family members are highly divergent in RNA and RNP architecture. Divergent telomerase RNA structures adapt the enzyme to particular, organism-specific strategies of active RNP assembly, localization, and regulation. The bacterial 6S RNA demonstrates that ncRNA-templated nucleic acid synthesis can be co-opted as a mechanism to regulate ncRNA function. The 6S RNA binds to a subset of RNA Polymerase holoenzymes to impose promoter-specific transcriptional inhibition in stationary phase. Remarkably, beyond mediating polymerase inhibition by ncRNA-protein interaction, 6S RNA also harbors an internal region that functions as a template for RNA synthesis [28,29]. 6S RNA-directed RNA synthesis allows the polymerase to escape from 6S association during outgrowth from stationary phase.
A final example here illustrates that the templating function of ncRNA is not restricted to directing the synthesis of nucleic acids. The bacterial tmRNA instead provides an mRNA-like template for protein synthesis. A region of tmRNA enters the decoding site of a ribosome stalled in translation and is paired with charged tRNAs to template the addition of a C-terminal peptide tag. The fusion protein is marked as a product of abortive translation and rapidly degraded. Even from the few examples described above, a remarkable breadth in the ability of structured ncRNAs to exploit Watson-Crick base-pairing for recognition of dNTPs, NTPs, or tRNA anticodons is evident.
The great heterogeneity of possible RNA folds and RNP architectures suggests that a much wider scope of ncRNA function should be possible, beyond the use of modular RNA motifs for protein binding or base-pairing. It is less straightforward to define novel biochemical properties or functions of ncRNA than to uncover new examples from the known ncRNA playbook, particularly given the relatively few methods available for studying ncRNA versus protein folding, interactions, localization, and dynamics in vivo. Many structured ncRNAs with currently unknown roles will eventually be shown to function primarily by bridging associated proteins and/or base-pairing with nucleic acids. However, we suggest that especially in the context of biologically stable, structured ncRNAs already playing these roles, modular RNP architectures present opportunities for ncRNAs to explore additional mechanisms of function.
Expansion of the Y RNA family provides a potential example of novel ncRNA gain of function. Many organisms encode a single Y RNA, but higher eukaryotes have diversified a Y RNA family (Figure 4) with at least some members under positive selection for sequence divergence [31,32]. Y RNA biological stability requires association with its partner protein, Ro. Ro does not have enzymatic activity, but enzymes recruited transiently to the Ro platform are thought to mediate degradation of misfolded ncRNAs recognized by Ro as substrates for quality control. A bound Y RNA is predicted to occlude the Ro surface required for high-affinity binding to misfolded RNA. Thus, the ancestral Y RNA could have been a negative regulator of Ro association with potential ncRNA quality control substrates.
Y RNA family members share the motifs necessary for Ro binding, but they are highly divergent in other sequence features (Figure 4). Like snoRNA families, the Y RNA family could have gained function by beneficial expansion of the scope of targets recruited to the shared protein platform. Unlike snoRNAs, however, Y RNAs lack single-stranded regions of conserved length and positioning that would be candidate motifs for mediating hybridization to RNA targets. Instead, the family of Y RNAs may specialize Ro function by recruiting particular misfolded ncRNA targets in their endogenous RNP context, via a RNP-RNP mode of recognition rather than simple RNA-RNA base-pairing. Motifs specific to an individual Y RNA could be subject to ongoing selection for improved recognition of the evolving misfolded ncRNA targets of quality control surveillance.
Much remains to be understood about Y RNA and Y RNP function. Independent of the eventual roles defined for individual members of the Y RNA family, it will be interesting to uncover how expansion of this ncRNA family in shared RNP protein context led to gain of function compared to how gain of function was accomplished by assembly of a single ncRNA with alternate RNP protein partners, as discussed above for human 7SK RNA.
Knowledge about ncRNA has reached only the tip of an iceberg, but already the diversity of ncRNA functions resists easy categorization. Beyond elucidating additional details of specific modes of ncRNA function, further investigation will expand our knowledge of the scope of ncRNA roles in the RNP Renaissance and mechanisms by which ncRNA evolution generates functional complexity. These efforts will be facilitated by methods that can assess ncRNA localization and interactions in vivo. Innovative use of modified nucleic acids as hybridization probes greatly advanced the study of ncRNPs. New tag-based ncRNA localization and RNP affinity purification methods [16,38,39] should enable a broader scope of analysis, including integration of ncRNAs into the current protein-centric versions of cellular interaction networks.
Are there discernable themes for the biological roles of structured ncRNAs arising in this modern-day RNP Renaissance? As a final point of speculation, we note that many structured ncRNAs function in a manner linked to cellular stress. Several independently evolved ncRNAs have been shown to be stabilized by stress and to act under these conditions to inhibit transcription [27,40–42]. Even ancestral ncRNAs may have gained roles in stress response: for example, under stress, mature tRNAs are processed by cleavage in the anticodon loop to generate transiently accumulating tRNA halves. These and other examples of stress-associated ncRNAs could have been preferentially characterized due to their abundance or readily discernable regulation; in eukaryotes, these ncRNAs are often produced by RNA Polymerase III, facilitating their detection using bioinformatic analysis. However, there may be a robust biological rationale for evolution of ncRNA roles during stress conditions. The minimal lag and low energetic cost of RNA synthesis relative to protein synthesis would be advantageous for a stress response, and, in fact, many forms of cellular stress disfavor a response requiring protein synthesis due to stress-induced translational repression. The repeating theme of roles for structured ncRNAs in cellular stress responses, along with a growing appreciation of extraordinary ncRNA diversity, suggests that there are important and widespread implications of the RNP Renaissance.
We thank Harry Noller for inspiration with the term RNP Renaissance and Suzanne Lee for discussion. Funding for non-coding RNA research in the Collins lab has been provided by the Cancer Research Committee of California and the National Institutes of Health.
J. Robert Hogg (JRH): Department of Biochemistry and Molecular Biophysics, Howard Hughes Medical Institute, College of Physicians and Surgeons, Columbia University, New York, NY 10032; Email: jh2721@columbia.edu.
Kathleen Collins (KC): Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, CA 94720-3200; Email: kcollins@berkeley.edu.