Coronaviruses have the potential to cause significant economic, agricultural and health problems. The severe acute respiratory syndrome (SARS) associated coronavirus outbreak in late 2002, early 2003 called attention to the potential damage that coronaviruses could cause in the human population. The ensuing research has enlightened many to the molecular biology of coronaviruses. A programmed -1 ribosomal frameshift is required by coronaviruses for the production of the RNA dependent RNA polymerase which in turn is essential for viral replication. The frameshifting signal encoded in the viral genome has additional features that are not essential for frameshifting. Elucidation of the differences between coronavirus frameshift signals and signals from other viruses may help our understanding of these features. Here we summarize current knowledge and add additional insight regarding the function of the programmed -1 ribosomal frameshift signal in the coronavirus lifecycle.
The Coronaviridae family is comprised of toroviruses and at least three groups of coronaviruses which, together with the Arteriviridae and Roniviridae, belong to the order Nidovirales (1; http://www.ncbi.nlm.nih.gov/ICTVdb/Ictv/index.htm). The arteriviruses and coronaviruses can cause enteric and respiratory tract infections in mammals and birds while the roniviruses infect crustaceans. The severity of pathogenesis varies depending on viral genotype. Outbreaks sometimes result in diarrhea and loss of livestock. After the SARS (severe acute respiratory syndrome) associated coronavirus epidemic in 2002–2003, interest and research in coronaviruses increased dramatically. Unlike the previously identified human coronaviruses (HCoV-229E and HCoV-OC43), which are more commonly associated with inconsequential respiratory infection, SARS-CoV had a mortality rate of 9.6% (2; http://www.who.int/csr/sars/country/table2004_04_21/en/index.html). The resulting resurgence in interest has added to decades of work from a few groups and has advanced our knowledge of the coronavirus lifecycle considerably. This article focuses on how these viruses manipulate the host protein synthetic machinery to produce proteins essential for viral replication via a mechanism called programmed -1 ribosomal frameshifting (-1 PRF). We describe salient features and how study of the frameshifting mechanism may enhance our understanding of other parts of the viral lifecycle.
Successful infection of host cells requires many steps which are common to all coronaviruses (Figure 1). Typically the coronavirus spike (S) glycoprotein and/or the haemagglutinin proteins interact with a cellular receptor to mediate entry into the host cell. Coronaviruses can also enter the cell via endocytosis. While many group 1 coronaviruses interact with an aminopeptidase, the SARS coronavirus uses the angiotensin converting enzyme 2 (ACE2) and/or the C-type lectins DC/L-SIGN as the host receptors (reviewed in 3). Cathepsin L cleaves the S protein and the viral envelope fuses with the host cell membrane. The virus disassembles, releasing the genomic RNA after which a replication-transcription complex forms on double membraned vesicles (4 and references within). New genomic and subgenomic RNA (sgRNA) is produced by the unique mechanism of discontinuous transcription during which negative-strand RNA intermediates are produced (reviewed in 5). Structural and accessory proteins are translated from the plus-strand sgRNA (Figure 2). Nucleocapsid proteins (N) package the genomic RNA and are met by envelope proteins which accumulate in the ER-to-Golgi intermediate compartment (ERGIC) for assembly. After virus particles assemble they egress from the cell via exocytosis.
Viruses by definition are dependent on the host cellular machinery for replication. However, as the viral lifecycle progresses inside the cell, the virus not only usurps certain host enzymes, but it also generates proteins that are not available from the host repertoire. For example, translation of viral proteins from the initial infectious RNA utilizes the host ribosomes while generation of new virus RNA requires a virally encoded RNA dependent RNA polymerase (RDRP). Many coronavirus proteins are translated from subgenomic RNAs (sgRNAs) rather than the genomic RNA (Figure 2). One result of this is that translation of plus-strand viral message RNA into proteins must theoretically occur in at least two phases: first the genomic RNA serves as a template for production of nonstructural proteins including the RDRP; then the RDRP uses the genomic RNA as a template for the production of sgRNAs (Figure 1). The structural and accessory proteins are produced from the sgRNAs in the next phase of translation. Thus, the second phase of translation cannot occur without the production of enzymes from the first phase. It is not known if successful infection requires the presence of more than one copy of the genomic RNA.
ORF1a/b, which is translated in the first phase, is a polyprotein that is cleaved into 16 non-structural proteins (Figure 2). The nonstructural domains in ORF1a (nsp1–11) and ORF1b (nsp12–16) are defined by proteolytic cleavage sites (reviewed in 6). The functional domains suggest that most of these proteins are involved in proteolytic cleavage and production or modification of RNA. The nonstructural proteins form replication complexes on the double membraned vesicles (DMV). During RNA replication negative strand copies of the genomic RNA are made along with negative strand sgRNAs. These in turn serve as templates for the production of positive strand genomic and sgRNAs (reviewed in 5). The ORFs encoding structural and accessory proteins are translated from sgRNAs. The nonstructural proteins remain with the replication complex on the DMV while the structural proteins migrate for assembly into viral particles (7).
Production of proteins from ORF1a/b does not follow the usual rules of translation. Two polyproteins are produced during the translation of one disjointed open reading frame. The first polyprotein is encoded entirely within ORF1a and translation terminates at the stop codon that defines ORF1a, as is typical in normal translation. However, there are signals contained within the RNA prior to the stop codon that direct a fraction of elongating ribosomes into an alternative reading frame, allowing them to bypass the ORF1a termination codon and continue translation into ORF1b, creating a larger polyprotein. This redirection of the ribosomes to create two polyproteins has been demonstrated for many viruses including arteriviruses (8), roniviruses (9) and a number of coronaviruses (10–13). The mechanism by which the ribosomes are redirected is called programmed -1 ribosomal frameshifting (-1 PRF). It is often at least two orders of magnitude more efficient than baseline rates of ribosomal error. The efficiency of -1 PRF may range from 15–60% depending on the assay system and the amount of RNA flanking the core sequence (14–16). This suggests that the flanking sequences are of some importance; however, codon and reading frame constraints pose some limitations on analyses of these flanking sequences using current in vitro assay systems. Until the relatively recent emergence of a variety of molecular and viral tools specific to coronaviruses, the pursuit of these issues has been limited. The following section describes how our understanding of -1 PRF has advanced, with particular emphasis on the many contributions made by analysis of coronavirus frameshift signals.
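The consequence of a given frameshifting efficiency for the relative output of the two polyproteins can be illustrated with a small calculation. The sketch below is not taken from the cited studies. It assumes, for simplicity, that every ribosome translating ORF1a reaches the frameshift signal and that a fixed fraction of them shifts. The function names are hypothetical.

```python
def polyprotein_fractions(frameshift_efficiency):
    """Return (ORF1a-only fraction, ORF1a/b fusion fraction) of ribosomes.

    Simplifying assumption: every ribosome reaches the frameshift signal,
    and frameshift_efficiency (0 to 1) is the fraction that shifts frame.
    """
    if not 0.0 <= frameshift_efficiency <= 1.0:
        raise ValueError("frameshift_efficiency must be between 0 and 1")
    return 1.0 - frameshift_efficiency, frameshift_efficiency


def orf1a_to_orf1b_ratio(frameshift_efficiency):
    """Copies of ORF1a-encoded sequence made per copy of ORF1b-encoded sequence.

    Every ribosome translates ORF1a, but only shifted ribosomes go on to
    translate ORF1b, so the ratio is 1 / efficiency.
    """
    if not 0.0 < frameshift_efficiency <= 1.0:
        raise ValueError("frameshift_efficiency must be in (0, 1]")
    return 1.0 / frameshift_efficiency


if __name__ == "__main__":
    # The 15% and 60% values are the ends of the range quoted in the text.
    for efficiency in (0.15, 0.60):
        only_1a, fusion = polyprotein_fractions(efficiency)
        ratio = orf1a_to_orf1b_ratio(efficiency)
        print(f"efficiency={efficiency:.2f}  ORF1a only={only_1a:.2f}  "
              f"ORF1a/b={fusion:.2f}  ORF1a:ORF1b={ratio:.1f}:1")
```

Under these assumptions the ORF1a-encoded domains are always in excess over the ORF1b-encoded domains, and the size of that excess is set entirely by the frameshifting efficiency.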
Programmed -1 ribosomal frameshifting (-1 PRF) is a mechanism used to regulate gene expression at the level of protein synthesis. As ribosomes translate one ORF they encounter a signal in the mRNA that directs a fraction of them to shift into an alternative downstream ORF which is in the -1 phase relative to the initiating upstream ORF (Figure 3). In viruses -1 PRF usually results in a C-terminally extended polyprotein containing additional functions not present in the upstream ORF. The use of a -1 PRF mechanism for expression of a viral gene was first published in 1985 for the Rous sarcoma virus (17) and subsequently for other retroviruses (18). The first complete coronavirus sequence was published in 1987 (IBV; 19) and later that same year an in vitro translation system was used to demonstrate that a -1 PRF mechanism was used to translate ORF1ab (10). In subsequent years, the IBV frameshift signal has been extensively analyzed by Brierley and co-workers, making it one of the best characterized -1 PRF signals.
-1 PRF signals are usually composed of a “slippery site” followed by a stimulatory structure. These two elements are typically separated by a short spacer region. The slippery site is composed of a heptameric sequence such that the A- and P-site tRNAs can un-pair from the mRNA and re-pair in the -1 reading frame (20; Figure 3). The nucleotides surrounding the heptameric slippery site have been shown to have a limited effect on frameshifting efficiencies. Experiments altering the spacer region between the slippery site and stimulatory element reduced frameshifting efficiency suggesting that there might be some optimal spacer sequence (27–30). The three nucleotides 5′ of the heptameric sequence also affect -1 PRF efficiency suggesting a role for the exiting tRNA in the ribosomal E-site (24, 31). The stimulatory element has been shown to contribute significantly to -1 PRF efficiencies.
While the stimulatory structure was initially postulated to be a simple mRNA stem-loop, studies of the IBV -1 PRF signal provided the first evidence for the requirement of a more complex mRNA pseudoknot (27; Figure 4). Subsequently, mRNA pseudoknots were identified in the frameshift signals of a wide variety of plant and animal viruses. As additional viral sequences became available, more elaborate stimulatory structures were identified in coronaviruses. These include “kissing loops” (32) and three-stemmed mRNA pseudoknots, which were predicted for the coronavirus and the related torovirus Berne virus (33–34), and subsequently demonstrated by nuclease mapping for the SARS coronavirus (15–16). The variation in these stimulatory elements suggests that the additional features might be required for fine-tuning frameshifting efficiency or, alternatively, involved in additional viral functions. Interestingly, efficient frameshifting was observed when the third stem was deleted from the SARS-CoV pseudoknot, or when a similar region was deleted from the IBV stimulatory structure, suggesting that these regions are not required to modulate -1 PRF (15, 35). However, it is clear from mutational analyses that the third stem, when present, has an effect on -1 PRF (14–15). Furthermore, additional sequence upstream of the core frameshift signal has been shown to affect -1 PRF efficiency in SARS-CoV (16). Thus, although core essential elements of the frameshift signal have been defined, the full scope of factors, either cis- or trans-acting, has not yet been revealed.
A number of models have been proposed to describe the mechanism by which -1 PRF occurs (20–24). All the models posit that the stimulatory element causes a pause in translation and that base-pairing is required at the non-wobble positions of at least two tRNA molecules to the mRNA after the frameshift (Figure 3). Differences among the models are centered on the timing of the frameshift within the context of the elongation cycle. The detection of two different frameshift products by protein sequencing (18, 20) suggests that the different models may not be mutually exclusive. Analysis of frameshifting is complicated somewhat by the limited availability of malleable experimental systems imitating the appropriate host cell. It has been shown that prokaryotic ribosomes decipher coronavirus frameshift signals quite differently from yeast, plant or mammalian ribosomes (25–26). Thus a suitable system must be used to draw meaningful conclusions from in vitro analyses of -1 PRF. The prevalence of coronaviruses and their spread among a wide range of mammals including bats (1) suggests that analyses in mammalian cells are appropriate in most instances.
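The base-pairing requirement shared by these models can be expressed as a simple check on a heptamer. The sketch below is a deliberate simplification. It compares codons rather than anticodons, treats the first two codon positions as the non-wobble positions, and ignores non-canonical pairing. The function name is hypothetical.

```python
def nonwobble_pairing_preserved(heptamer):
    """Check whether both tRNAs can re-pair at non-wobble positions after a -1 shift.

    For a heptamer N1..N7, the zero-frame P- and A-site codons are
    N2N3N4 and N5N6N7. After slipping back one nucleotide the same tRNAs
    face N1N2N3 and N4N5N6. Pairing at the first two (non-wobble) codon
    positions is preserved when those positions are unchanged by the shift.
    """
    if len(heptamer) != 7:
        raise ValueError("a slippery site is seven nucleotides long")
    site = heptamer.upper().replace("T", "U")
    p_zero, a_zero = site[1:4], site[4:7]    # codons before the shift
    p_minus, a_minus = site[0:3], site[3:6]  # codons after the shift
    return p_zero[:2] == p_minus[:2] and a_zero[:2] == a_minus[:2]


if __name__ == "__main__":
    for candidate in ("UUUAAAC", "GCUAGCA"):
        print(candidate, nonwobble_pairing_preserved(candidate))
```

For UUUAAAC the P-site tRNA moves from UUA to UUU and the A-site tRNA from AAC to AAA, so the first two positions of each codon are unchanged. For an arbitrary heptamer such as GCUAGCA they are not.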
While progress has been made on elucidating the mechanism of programmed -1 ribosomal frameshifting and the RNA sequences involved in coronaviruses, the requirement for -1 PRF in the lifecycle of this class of viruses remains obscure. For other viruses, such as HIV and the yeast totivirus L-A, frameshifting regulates the relative ratios of structural to enzymatic proteins. The relative abundance of the coat proteins to viral RNA affects packaging and deviations from the optimal ratio result in a loss of infectivity (36–38). In contrast, -1 PRF in coronaviruses modulates the relative ratios of two different classes of enzymatic proteins: proteases (and other uncharacterized proteins) encoded by the upstream ORF1a, and RDRPs and RNA modifying enzymes encoded by ORF1b. As coronavirus structural proteins are encoded on sgRNAs (the transcription of which is dependent on the frameshift), the role of -1 PRF on both the levels and timing of their synthesis, and on virus propagation in general, has not yet been characterized. More specifically, the functional domains of the predicted proteases are encoded in nsp3 and nsp5 within ORF1a, prior to the -1 PRF site. The RNA modifying functions (RNA dependent RNA polymerase, helicase, exoribonuclease, uridylate-specific endoribonuclease and S-adenosylmethionine-dependent ribose 2′-O-methyltransferase) are encoded in nsp12–16 after the -1 PRF site (Figure 2). The reason for regulating the abundance of these proteins relative to one another is unknown. Nsp8 was recently described as a second RDRP, raising the possibility that nsp8 and nsp12 ratios are important for controlling the amount of different RNA transcripts made during replication (39). The mechanism by which 100-fold more plus-strand RNA is made relative to the negative strand, and the mechanism that directs production of sgRNA rather than genomic RNA, are not known (5).
While it is possible that the ratio of nsp8 to nsp12 protein products may affect one of these mechanisms, this suggestion does not take into account the relative ratios of the other 14 proteins encoded in ORF1a/b.
The genomic RNA from plus-strand RNA viruses serves at least two functions during infection: 1) it acts as the mRNA from which viral proteins are translated, and 2) it is the template from which new genomic and subgenomic RNA is transcribed. How the infectious RNA transitions from one function to the other remains unanswered. Some progress has been made in our understanding of how another plus-strand RNA virus, the Barley yellow dwarf virus (BYDV), regulates ribosome and replicase traffic on an RNA template (40). Unlike the SARS coronavirus, which utilizes a pseudoknot for frameshifting, BYDV requires a kissing-loop interaction similar to that described for the human coronavirus 229E frameshifting (32). The model requires disruption of long range RNA:RNA interactions which, if they do not reform, allow a switch between translation and transcription. In BWYV, interactions between a sgRNA and the genomic RNA also inhibit translation of the genomic RNA, leaving it available for transcription or packaging (41). Such long range RNA:RNA interactions or interactions between gRNA and sgRNAs have not yet been identified in coronaviruses. An RNA switch in the 3′ UTR of MHV has been characterized and found to be essential (42). This motif is found in all group 2 coronavirus sequences but only the 5′ or 3′ portion appears to be conserved in group 1 or group 3 coronaviruses. Interestingly, a third stem-loop in the pseudoknot of the -1 PRF signal is predicted to be conserved among the group 2 coronaviruses but not in the group 1 coronaviruses, which utilize a kissing-loop for frameshifting (15). Some alterations to the third stem in the SARS coronavirus pseudoknot result in a loss of infectivity without dramatically affecting frameshifting, and a subset of viral proteins encoded by the subgenomic RNAs have been identified that bind to the pseudoknot in the SARS -1 PRF signal (Plant and Dinman, unpublished data).
These findings suggest that this region of the SARS (+) strand is vital for an aspect of the virus lifecycle other than -1 PRF.
One current research challenge lies in producing mutations having only moderate effects on -1 PRF so that more meaningful virology can be pursued. As these mutant viruses and replicons become available we will be able to correlate the efficiency of frameshifting with production of genomic and subgenomic RNAs, and with viral titers. It is expected that some of these mutations will result in defects that will give insight into the function of the internal stem loop (stem 3) of the frameshift signal, and that the insight thus gained will provide an alternative starting point for dissecting the coronavirus replication system.
The prospects for studies of both -1 PRF and coronaviruses are encouraging. A number of synergistic advances are being made in both areas. The constantly expanding number of coronavirus sequences is enhancing the ability of researchers to identify conserved RNA sequence and structural motifs with a greater degree of confidence. As critical elements are identified experiments can be thoughtfully designed to generate mutant viruses and replicons from which useful information can be acquired. The size of the coronavirus genome and the lack of unique restriction sites for cloning pose some difficulties in manipulation of the virus. However recent advances in the stability of plasmid vectors and the availability of class II restriction endonucleases or restriction enzymes that cleave adjacent to the recognition sequence have circumvented some of these difficulties (43). The available clones and replicons (44–45) have allowed many groups to readily investigate different aspects of the coronavirus lifecycle. Developments in the NMR field are also enabling the solution of larger RNA structures such as those that direct -1 PRF. In addition, advances are being made in the design of algorithms able to predict tertiary RNA structures such as frameshift-promoting mRNA pseudoknots (46–48). As more structural data are generated the computational algorithms will be refined which in turn will provide enhanced tools for experimental design by bench-based researchers.
The coronavirus frameshift signals are complex and diverse and, as described above, have yielded a wealth of molecular biological data describing features important for -1 PRF. As noted previously, there are several limitations to -1 PRF studies, the most pertinent being that there are two overlapping open reading frames to maintain. Silent protein coding mutations are preferable so that only recoding events are analyzed rather than protein function. The termination codon for the first ORF is very early in the SARS -1 PRF signal compared to the position of stop codons in other frameshift signals and this has increased the variety of mutations that can be sustained. A further limitation is that a frameshifting event must occur for production of the RNA dependent RNA polymerase, which is essential for virus production. Thus mutations which abolish frameshifting completely will not produce replicative or infectious virus. It has been shown for some other viruses that there is an apparent threshold level of frameshifting required for competent virus production (36–38). Mutations that subtly alter the frequency of -1 PRF in the coronavirus context are being discovered and these will lead to a greater understanding of the role of -1 PRF in coronavirus replication.
Many of the published research articles would not have been possible without the free exchange of reagents and collaborations set up between groups with different expertise. The rapid emergence and severity of the SARS associated coronavirus led to the sharing of unpublished information at conferences and meetings, which in turn added vigor to the field. Because of the competitive nature of research, this sharing of resources is dependent on the ethical behavior of the researchers. We thank all those who are part of the community and have contributed important information that has the potential to prevent or control virus spread should a similar outbreak occur in the future. This work was supported by NIH grant R01 AI064307.
The findings and conclusions in this article have not been formally disseminated by the Food and Drug Administration and should not be construed to represent any Agency determination or policy.