Dedicated to the memories of Leslie E. Orgel (1927–2007) and Stanley L. Miller (1930–2007).
Ribonucleic acids are structurally and functionally sophisticated biomolecules and the use of models, frequently truncated or modified sequences representing functional domains of the natural systems, is essential to their exploration. Functional non-coding RNAs such as miRNAs, riboswitches, and, in particular, ribozymes, have changed the view of RNA’s role in biology and its catalytic potential. The well-known truncated hammerhead model has recently been refined and new data provide a clearer molecular picture of the elements responsible for its catalytic power. A model for the spliceosome, a massive and highly intricate ribonucleoprotein, is also emerging, although its true utility is yet to be cemented. Such catalytic model systems could also serve as “chemo-paleontological” tools, further refining the RNA world hypothesis and its relevance to the origin and evolution of life.
Ribonucleic acids (RNAs) are now known to perform diverse cellular functions, far beyond their “classical” roles as mediators of protein synthesis. Importantly, the role of non-coding regions, including microRNAs and riboswitches, continues to be the subject of intense current research [1–4]. RNA-based enzymes, or ribozymes, were the first functional non-coding RNAs to be discovered and have intrigued the scientific community ever since [5,6]. In particular, understanding the molecular intricacies of RNA catalysis remains of contemporary interest and relevance to molecular and chemical biology, as well as to prebiotic chemistry and evolution [7,8]. Given the large sizes of the first ribozymes, it initially appeared daunting to study them from physical organic chemistry and mechanistic perspectives. Subsequent discoveries of smaller ribozymes triggered, however, the development of model systems, which allowed chemists to advance the fundamental understanding of RNA as a genuine catalytic molecule.
The term ‘model systems’ embraces numerous and somewhat ambiguous meanings and might elicit different connotations among scientists in distinct disciplines. For an RNA biochemist, a model system could imply the truncation of a large RNA to a smaller and manageable core sequence largely capable of mimicking the structure and function of the biological macromolecule. Despite inherent challenges, such minimized constructs have found great utility in RNA biochemistry. Alternatively, one might view the fabrication of RNA sequences that contain modified or reporter nucleobases as model systems that, in addition to mimicking the unmodified truncated RNA, come with the added value of built-in probes. The fabrication of such systems could be more challenging, but has the potential to provide information that is not accessible by the more classical means of traditional biochemistry. Yet, model systems could also deviate further from their natural counterparts. Such reductionism, popularized in the zenith of biomimetic chemistry, would view, for example, bis-dinitrophenyl phosphate as a model substrate for exploring RNA hydrolysis. Here we consider model systems that are truncated or isolated parts of the natural entities. They offer the advantage of retaining biological relevance and function, while at the same time serving the chemists’ need for facile synthesis and chemical modification.
In this review we discuss two biologically relevant catalytic RNA systems that are understood to very different degrees. The first is the well-known hammerhead ribozyme, for which biochemical data from model systems have only recently become congruent with updated structural work. We highlight recent studies, using both the original and the updated model systems, that are finally producing detailed mechanistic views of the hammerhead’s catalysis. The second system is the spliceosome. This ribonucleoprotein complex dwarfs the hammerhead in size, and its RNA components are the least understood in terms of their role in catalysis. Given the spliceosome’s large size and dependence on many protein components, one wonders whether or not simple RNA model systems can be considered valid. We look at a recently emerging model and its ability to shed light on the catalytic role of the RNA component.
The hammerhead ribozyme is a small motif within plant RNA viroids that is responsible for processing the genome via a self-cleaving phosphodiester transesterification reaction [9–11]. Understanding how the hammerhead increases the rate of this transformation a million-fold compared to the corresponding uncatalyzed background reaction continues to be of fundamental interest. This section focuses on the evolution of hammerhead model systems, largely favored due to their manageable size, and not on detailed analyses of self-cleaving ribozymes and their mechanisms, which have been reviewed [12,13].
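For a sense of scale, a rate enhancement can be converted into an apparent transition-state stabilization using transition state theory. The worked estimate below is only a sketch: it assumes a temperature of 25 °C (298 K), which is not specified in the text, and treats the million-fold figure as the ratio of the catalyzed to the uncatalyzed rate constant (the symbols k_cat and k_uncat are introduced here for illustration).

```latex
\Delta\Delta G^{\ddagger}
  = RT \ln\!\left(\frac{k_{\mathrm{cat}}}{k_{\mathrm{uncat}}}\right)
  = \left(1.987\times10^{-3}\ \mathrm{kcal\,mol^{-1}\,K^{-1}}\right)(298\ \mathrm{K})\,\ln\!\left(10^{6}\right)
  \approx 8.2\ \mathrm{kcal\,mol^{-1}}
```

Under these assumptions, the ribozyme lowers the effective activation barrier by roughly 8 kcal/mol relative to the background reaction.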
Boundary experiments defining the essential regions of the hammerhead motif, coupled with Uhlenbeck’s hammerhead generated by hybridizing two oligonucleotides, ultimately led to a working model system known as the “minimal” (45 nt) hammerhead (Figure 1A). While the natural transformation takes place intramolecularly (or in cis), the bimolecular, trans-cleaving version shown in Figure 1 became the commonly used one. In addition to the conserved 15 nt core, the ribozyme requires surrounding sequences, shown as helices I, II, and III, that presumably provide structural and conformational support. These sequences are adjustable in both length and nucleotide composition, but require Watson-Crick complementarity. The loop capping off helix II, while not conserved, was identified in all natural motifs and thus included in the minimal hammerhead construct. This synthetically simple model system was used to generate most of the early biochemical functional data. In particular, mutagenesis studies showed that replacing any core residue, except for nucleotide 7, impacts reaction rates, confirming their role in the assembly of the active site.
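The requirement that the helix-forming arms be Watson-Crick complementary, while otherwise free in length and composition, can be illustrated with a short check. This is a minimal sketch: the function name and the example sequences are hypothetical and are not taken from any published hammerhead construct, and G-U wobble pairs are deliberately treated as mismatches.

```python
# Canonical Watson-Crick RNA pairs only (G-U wobble pairs are excluded here).
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}


def is_watson_crick_duplex(strand_a: str, strand_b: str) -> bool:
    """Return True if two RNA strands, both written 5'->3', can form a
    fully Watson-Crick paired antiparallel helix."""
    if len(strand_a) != len(strand_b):
        return False
    # Antiparallel pairing: base i of strand_a faces base i of reversed strand_b.
    return all(
        WATSON_CRICK.get(a) == b
        for a, b in zip(strand_a.upper(), reversed(strand_b.upper()))
    )


# Hypothetical helix arms, for illustration only.
print(is_watson_crick_duplex("GGCAU", "AUGCC"))  # fully complementary arms
print(is_watson_crick_duplex("GGCAU", "AUGCU"))  # one mismatched position
```

Any pair of arms that passes such a check could, in principle, be swapped into a trans-cleaving design without touching the conserved core, which is what makes the construct easy to adapt to different substrates.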
A unique predicament, not encountered with other ribozymes, emerged with the publication of the first hammerhead crystal structures [17,18], in which the structural and biochemical data were in conflict. In particular, the hammerhead conformation observed in the solid-state structures appeared inconsistent with the proposed mechanism and with the involvement of key residues as established by systematic biochemical analysis. An especially troubling issue was the role of residue G12 in hammerhead catalysis. Modifications to G12 were known to drastically affect cleavage rates, suggesting its participation in assembling the transition state. Yet, G12 was nowhere near the cleavage site in the crystal structure of the minimal hammerhead.
A potential resolution to this problem surfaced when a larger version of the minimal hammerhead, which contained an additional loop off helix I, was shown to display enhanced ribozyme activity (Figure 1B). This key discovery followed observations showing that the minimal hammerhead was not active in vivo while the natural motifs were. Expanding helix I with a domain first thought to be inconsequential resurrected ribozyme activity. The expanded hammerhead was indeed 100 to 1000 times faster than the minimal hammerhead due to tertiary interactions that populate the proper conformation necessary for cleavage. An essential crystal structure of the hammerhead containing the additional loop, which is now termed an “extended” hammerhead (Figure 1B), showed a dramatically different hammerhead transition state-like conformation in the crystalline state and offered a solution to the structure–function dilemma.
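The idea that tertiary interactions speed up cleavage by populating the active conformation can be sketched with a simple two-state pre-equilibrium (inactive ⇌ active → product). This is an illustrative model only: the function name, the equilibrium constants, and the assumption that both constructs share the same intrinsic chemical step are hypothetical and are not values from the studies cited above.

```python
def observed_rate(k_chem: float, K_fold: float) -> float:
    """Observed cleavage rate for a two-state pre-equilibrium.

    k_chem -- rate constant of the chemical step from the active conformation
    K_fold -- equilibrium constant [active]/[inactive] (hypothetical values)
    """
    active_fraction = K_fold / (1.0 + K_fold)
    return k_chem * active_fraction


# Assumed, illustrative numbers: same chemistry, different conformational bias.
k_chem = 1.0                                      # arbitrary units
minimal = observed_rate(k_chem, K_fold=0.01)      # active state rarely populated
extended = observed_rate(k_chem, K_fold=10.0)     # active state mostly populated

print(f"extended/minimal rate ratio: {extended / minimal:.0f}")
```

In this picture, a shift of the conformational equilibrium by three orders of magnitude is enough to give a roughly 100-fold difference in observed rate without any change in the chemistry itself.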
The refined view of the hammerhead raised a crucial question regarding the validity of the biochemical data gathered with the minimized motif over decades of explorations. Nelson and Uhlenbeck have recently shown that essentially all previous cleavage kinetics data generated by modifying the active site nucleotides are consistent with the proposed transition state structure of the extended hammerhead. Even the troubling quandary with residue G12 mentioned above was resolved, as this residue is perfectly positioned near the cleavage site in the extended hammerhead structure.
So what does the crystal structure of the minimal hammerhead show? It is now established that the minimal hammerhead has at least two primary conformations, the first being the active cleaving conformation that is representative of natural hammerheads and a second, more prevalent inactive conformation that had consistently been observed in crystal structures [24,25]. Does this imply that the minimal hammerhead is no longer a good model system? This is likely to be the case if one is concerned with transition state structures. It remains, however, quite informative for biochemical investigations. It was, after all, results generated with the minimal hammerhead that suggested nucleotides G8 and G12 as the general acid and base residues, respectively (Figure 2A). Furthermore, a new tertiary base pair was identified between G8 and C3 and was suspected to facilitate the correct orientation of the 2’-OH of G8 for general acid catalysis using the minimal hammerhead construct (Figure 2). Indeed, mutations of G8 and C3 residues were shown to impact cleavage rates in both the minimal and expanded hammerhead constructs.
While the minimal hammerhead still remains a relevant model system for biochemical studies, there has been a definite shift towards the use of the expanded construct. This has particularly been the case for exploring the role of residue G12 as a general base (Figure 2). An extended hammerhead, modified at the C17 cleavage site, was recently used in affinity labeling studies. Specifically, replacing the 2’-OH with a 2’-bromoacetamide residue created a potent electrophilic trap for the suspected anionic G12 general base (Figure 2B). The pH dependence of the alkylation reaction rate mirrored that of the hammerhead cleavage reaction rate, supporting the proposal that G12 acts as a general base and suggesting that its pKa is significantly modulated in the context of the ribozyme.
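The logic of comparing pH-rate profiles can be made concrete with the textbook model for general acid-base catalysis, in which the rate scales with the fraction of molecules that have a deprotonated base and a protonated acid at the same time. The sketch below is illustrative only: the function names and both pKa values are assumptions chosen for the example, not measured values for G12 or G8.

```python
def fraction_deprotonated(pH: float, pKa: float) -> float:
    """Henderson-Hasselbalch fraction of a titratable group in its deprotonated form."""
    return 1.0 / (1.0 + 10.0 ** (pKa - pH))


def relative_rate(pH: float, pKa_base: float, pKa_acid: float) -> float:
    """Relative rate (0 to 1), assuming catalysis needs a deprotonated
    general base and a protonated general acid simultaneously."""
    base_ready = fraction_deprotonated(pH, pKa_base)
    acid_ready = 1.0 - fraction_deprotonated(pH, pKa_acid)
    return base_ready * acid_ready


# Hypothetical pKa values, for illustration only.
PKA_BASE, PKA_ACID = 8.0, 10.0
for pH in (6.0, 7.0, 8.0, 9.0, 10.0, 11.0):
    print(f"pH {pH:4.1f}  relative rate {relative_rate(pH, PKA_BASE, PKA_ACID):.3f}")
```

Below the pKa of the base, the rate in this model rises about tenfold per pH unit. Two reactions that depend on the same deprotonated group would be expected to show the same rise and the same inflection point, which is the kind of correspondence the affinity labeling experiment relies on.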
The specific roles of G8 and Mg2+ in acid catalysis have also been recently addressed. As shown in Figure 2, it is the 2’-OH of G8 and not the nucleobase itself that serves as a general acid. Even with pKa perturbations within the folded ribozyme, this hydroxyl is unlikely to become acidic enough unless influenced by a metal ion cofactor. Divalent metal ions such as Mg2+ have long been known to facilitate catalysis. Less clear are their specific effects as general or Lewis acids, or their impact on structural stability and global folding [29,30]. In a recent FRET study with the extended hammerhead, different metal ions produced minimal variability in global folding but caused dramatically different cleavage rates, suggesting metal ion participation in the transition state (Figure 2C). The metal ion, however, could also be serving as a Lewis acid coordinated to the 5’ scissile oxygen (labeled X in Figure 2C). To resolve this ambiguity, Thomas and Perrin used a modified extended hammerhead containing a 5’-sulfur-based leaving group (called the S-link substrate) along with modifications to the 2’-OH of G8 (deoxyG8 and 2’-OMe versions). It was demonstrated that divalent metals lower the pKa of the G8 2’-OH via direct coordination, as shown in Figure 2C, and not by acting on the 5’-scissile residue [25,32].
Eukaryotic RNA transcripts contain exons (coding regions) separated by introns (non-coding regions). These precursor-mRNAs (pre-mRNAs) are processed to the corresponding mature mRNAs by the spliceosome [34–38]. This massive ribonucleoprotein (RNP) complex, containing hundreds of proteins and five small nuclear RNAs (snRNAs), catalyzes the two transesterification reactions that encompass the splicing and ligation processes [39–44]. It is certainly justifiable to ask whether or not RNA models can be considered biologically relevant when the native system depends on both proteins and RNA. Here we discuss attempts to model individual events of this inherently complex process using truncated RNA systems.
The spliceosome is an assembly of five RNP complexes (U1, U2, U4, U5, and U6) that interact with pre-mRNAs at different stages (Figure 3). It was shown that the catalytically essential complexes were the U2, U5 and U6 RNPs, which specifically bind to reactive regions. More importantly, the snRNAs of U2 and U6 were discovered to be necessary for catalysis in addition to their base-pairing recognition and binding roles [45–49]. Exactly how they participate in catalysis is the subject of current research, but these early observations led to the hypothesis that the spliceosome is a ribozyme [50–52]. The similarity of the spliceosome to group II intron ribozymes furthered this view. Both follow nearly identical splicing mechanisms. Also, the Intramolecular Stem-Loop (ISL), which contains a highly conserved AGC sequence in the U6 snRNA, is strikingly similar to Domain V of the ribozyme [53–56]. Further evidence supporting the hypothesis that snRNAs are reactive without the assistance of proteins came from in vitro evolution studies. Ribozyme activity was identified when a library of various RNAs based on the core sequences of U2 and U6 was subjected to rounds of selection and amplification [57,58]. But it was the publication of a protein-free, catalytically active, minimal U2–U6 snRNA complex that proved most compelling.
The minimal splicing complex is able to assemble and recognize a small branch point oligonucleotide in vitro and catalyze a phosphoesterification reaction using a branch point adenosine similar to the natural system (Figure 4A). It consists of two RNA strands (70–80 nt) containing the conserved ACAGAGA box, AGC triad and ISL that are essential for catalysis. While attempts were made to include a model 5’ splice site intron, catalysis was only observed with the branch point RNA strand. Consequently, this system does not catalyze the native reaction; instead the branch point strand attaches to the conserved G of the AGC triad, forming a non-natural phosphotriester bond and eliminating water. Despite this difference, the minimal complex appears to be structurally and functionally similar to the natural system, since mutagenesis and other biochemical data match those of the natural system. To further support the inherent reactivity and relevance of the U2–U6 complex, a follow-up study used purified natural snRNAs to demonstrate the production of the same phosphotriester product.
A relevant model needs to demonstrate that a truncated system can utilize the 5’ splice site and branch point pre-mRNA regions to catalyze the native reaction. The minimal U2–U6 splicing complex was therefore modified to include a 5’ splice site intron covalently attached to the 5’ domain of U6 via a stabilized linker to ensure correct folding and reactivity (Figure 4B). This proved necessary because, in the spliceosome, the U5 snRNA and proteins facilitate binding of the intron and the U6 domains. This model system catalyzes a reaction resembling the one in vivo, and was also shown to be sequence dependent at the 5’-splice site, as in the natural system. The product results from the addition of the branch point oligonucleotide via a covalent 2’,5’ bond of the adenosine to the splice site of the intron, concomitant with the loss of the small exon sequence. Mutagenesis of the conserved and non-conserved regions provided evidence that the complex adopts an active site conformation similar to the natural system. While it appears that a biologically relevant model has been identified, a recent study demonstrated that caution must be taken when such modifications are made.
Another attempt applied the minimal U2–U6 complex to the well-studied yeast spliceosome by constructing a modified version in which U2 and U6 were embedded in one RNA strand, separated by a polyU loop and stabilized with additional GC residues (Figure 4C). The branch-site strand was extended for enhanced base pair stabilization to U2. New to this construct was the incorporation of a third RNA strand to represent the 5’ splice site intron. While a reaction took place, it did not produce the expected product. It was discovered that the complex did not use the 5’ splice site substrate, nor did it use any of the conserved sequences, including the ISL in U6, in its catalysis. Systematic truncation and labeling studies indicated that the system shown in Figure 4C could be reduced to a minimal active version, which catalyzes a unique phosphodiester-based reaction (Figure 4D). The complete branch site oligonucleotide inserts itself into the U6 region, but not with the usual branching A residue. Instead it utilizes the 2’ OH of the residue on the 5’ end, forming a unique 2’–3’ phosphodiester bond while excising the remaining U6 construct. Do the consequences of the seemingly conservative modification made to the U2–U6 minimal system foreshadow future problems with this model? A discussion concerning the validity of protein-free splicing systems has recently surfaced, and a more thorough approach to fabricating U2–U6 models might be needed [63,64].
This year marks the 150th anniversary of Darwin’s publication On the Origin of Species. While he was largely concerned with the evolutionary mechanisms of new species, his interest in the origin of life is famously documented in the 1871 letter to his friend, botanist Joseph Hooker:
“It is often said that all the conditions for the first production of a living organism are now present which could ever have been present. If (and oh! What a big if!) we could conceive in some warm little pond, with all sorts of ammonia and phosphoric salts, light, heat, electricity present, that a protein compound was chemically formed, ready to undergo still more complex changes. At the present day, such matter would be instantly devoured or absorbed, which would not have been the case before living creatures were formed”.
While our understanding of the molecular basis of biology has since been significantly advanced, shedding light on the origin of life remains, as Darwin knew it, a fascinating and frequently frustrating endeavor.
Over the last three decades RNA has become the most relevant biomolecule for understanding Life’s origin. Discoveries that point to RNA’s diverse roles in biology, advances made in its prebiotic synthesis [68,69] and the ongoing study of catalytic RNAs continue to provide compelling evidence for the RNA World hypothesis [70–73]. The hammerhead and the U2–U6 model systems discussed above are good examples of catalytic models that serve as “chemo-paleontological” tools in learning how RNA could have sustained an ancient biochemical world. Specifically, the discovery that nucleobases serve as catalytic moieties in small nucleolytic ribozymes such as the hammerhead beautifully demonstrates that chemically limited biomolecules (compared to the diverse building blocks found in proteins) can be quite resourceful in performing functional and genetic roles. In terms of evolution, the spliceosome, while not likely to have originated in an RNA world, is still considered to have originally been an RNA machine that was either a relative of the group II intron or a product of evolutionary convergence [75,76]. It is likely that the U2–U6 minimal complex, in some way, represents a vestige of the early RNA-based spliceosome. Utilizing it as a starting point for directed evolution experiments, or exploring the impact of systematic additions of RNA and/or protein components on catalysis, could provide a glimpse into the chemical evolution of this minimal system and the complexity of contemporary spliceosomes.
Interestingly, the ribosome, a more ancient RNP that carries out the process of translation in contemporary cells, has been hypothesized to have emerged from an RNA-only core structure termed a proto-ribosome [78–81]. At the heart of the peptidyl transferase center is a ribozyme that undoubtedly originated in the RNA world, albeit with more primitive functionalities. Identifying and experimentally corroborating the proto-ribosome hypothesis is likely to illuminate how RNA-catalyzed peptide bond formation took place over 3.8 billion years ago. Using a proto-ribosome as an RNA model system could also clarify how proteins have contributed to and refined this machine over the course of evolution.
While RNA model systems have been widely exploited, their use with catalytically active RNA molecules is particularly informative. RNA domains, primarily involved in ligand binding, can be frequently truncated with little loss of function. In contrast, catalytically active RNAs could be exquisitely sensitive to modifications. Substrate specificity and product identity, as well as stereochemical and kinetic features, can all be monitored and used to assess the validity and quality of a proposed model system. Indeed, the minimal hammerhead has served as the workhorse model system for over two decades. Its size and simplicity facilitated the probing of virtually every aspect of its function. Even though early structural work is now considered to reveal inactive conformations, most, if not all, previous biochemical data has been reconciled with the updated extended hammerhead structure. The extended variant is presently taking over as the system of choice for mechanistic studies on ribozyme catalysis.
Applying RNA model systems to the spliceosome is, in comparison, in its embryonic stages. Given the relatively recent identification of the U2–U6 complex as a model system, its true utility for exploring the first splicing reaction and its possible participation in the ligation step [38,65] remains to be established. It appears to be sufficiently relevant to the natural complex, provided that modifications are critically investigated. Since this RNA model lacks any protein counterparts, certain aspects, most notably its kinetic parameters, are unlikely to match the natural spliceosome.
The study of non-coding RNAs using model systems will undoubtedly continue to shed light on our biochemical origins. It is important to appreciate how young this field actually is and the likely existence of other biologically important non-coding RNAs we have yet to discover. Future findings of RNA’s role in biology may prove more beneficial to our understanding of early and extant life. While 150 years have passed since the birth of a major biological revolution, RNA is clearly providing us with another.
We thank the National Institutes of Health (grant GM069773 to YT) and the National Science Foundation (GK-12 Socrates Fellowship to ACR) for generous support.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.