The redundancy of the genetic code means that a typical 300–amino acid protein can be encoded in about 10^151 ways, raising the question of to what extent the actual encoding is optimal. Actual encodings are biased to use some synonymous codons more frequently than others (the “codon bias”). For instance, in humans, the Ala codon GCC is used four times as frequently as the synonymous codon GCG. Similarly, but independently, some synonymous codon pairs are used more or less frequently than expected (the “codon pair bias”) (1). For instance, on the basis of codon frequencies, the amino acid pair Ala-Glu is expected to be encoded by GCCGAA and GCAGAG about equally often. In fact, the codon pair GCCGAA is strongly underrepresented, even though it contains the most frequent Ala codon, such that it is used only one-seventh as often as GCAGAG (2) (table S1). Although it is not clear why some codon pairs are under- or overrepresented, it is possible that codon pair usage affects translation (3).
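The size of the synonymous encoding space mentioned above can be sketched in a few lines. This is an illustrative calculation only, assuming the standard genetic code; the helper name `log10_encodings` is hypothetical and not from the original work. The number of encodings of a protein is the product, over its residues, of the number of codons available for each residue, so the exponent depends on amino acid composition.

```python
import math
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Number of synonymous codons per amino acid (stop codons excluded).
DEGENERACY = Counter(aa for aa in CODE.values() if aa != "*")


def log10_encodings(protein: str) -> float:
    """Return log10 of the number of distinct synonymous encodings of `protein`.

    `protein` is a string of one-letter amino acid codes. Working in log space
    avoids building an enormous integer for long sequences.
    """
    return sum(math.log10(DEGENERACY[aa]) for aa in protein)


if __name__ == "__main__":
    # Ala has 4 codons and Glu has 2, so the dipeptide Ala-Glu has 8 encodings.
    print(round(10 ** log10_encodings("AE")))
```

As a sanity check on the order of magnitude, a 300-residue protein made only of twofold-degenerate residues would have 2^300 (about 10^90) encodings, and one made only of sixfold-degenerate residues would have 6^300 (about 10^233); a figure near 10^151 for a typical composition falls between these bounds.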
We previously reported the generation of poliovirus de novo in the absence of natural template (4), using reverse genetics and the ability to synthesize large DNAs. We and others recently synthesized novel polioviruses encoding precisely the same amino acid sequences as wild-type poliovirus, but using rare codons (5); these viruses were attenuated. Here, we used poliovirus as a model system to explore the consequences of genome-scale manipulation of codon pair bias. We call the process of designing such viruses “synthetic attenuated virus engineering” or SAVE.
Fig. 1 (A) The poliovirus genome. Shown is the viral RNA with its covalently linked 5′ viral protein VPg, the 5′ nontranslated region [consisting of cloverleaf and internal ribosomal entry site (IRES)]; the long open reading frame (open box) ...
We developed a computer algorithm that recodes a given amino acid sequence using different codon pairs while controlling other features of the sequence, such as the codon bias and the folding free energy of the RNA (2) (figs. S1 and S2). This algorithm was used to design two new polioviruses, PV-Min and PV-Max, with a P1 region (encoding the viral capsid, 2643 nucleotides) containing under- or overrepresented codon pairs, respectively. The P1 region is suitable for such experiments because it can be deleted or substituted without affecting genome replication (7). Virus PV-Min was recoded to use codon pairs that are underrepresented relative to the human genome, and it contains 631 synonymous mutations. Virus PV-Max was recoded to use overrepresented codon pairs, and it contains 566 synonymous mutations. PV-Min has a codon pair bias score (CPB score) much lower than that of normal human genes. Both PV-Min and PV-Max encode precisely the same amino acid sequences as the wild type, but they use different pairwise arrangements of synonymous codons [for calculation of CPB scores, see (2)]. These P1 fragments were synthesized, sequenced, and incorporated into a full-length cDNA construct of poliovirus (2).
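The recoding idea described above can be illustrated with a small sketch. It is not the authors' algorithm, whose details are given in their supporting material (2). It rests on several assumptions that are labeled here and in the comments: a codon pair score is taken to be the natural log of the observed count of a codon pair over the count expected from the individual codon and amino acid pair frequencies; the CPB score of a sequence is taken to be the mean score over its adjacent codon pairs; codon pairs absent from the reference set score 0; and recoding is done by a simple greedy search that swaps synonymous codons between positions, which keeps the protein and the codon usage unchanged. RNA folding free energy is not controlled in this sketch. All function names are hypothetical.

```python
import math
import random
from collections import Counter
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ("*" marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codons(seq):
    """Split an in-frame DNA sequence into codons, ignoring a trailing partial codon."""
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def codon_pair_scores(reference_orfs):
    """Assumed definition: score = ln(observed / expected) for each codon pair."""
    n_codon, n_aa, n_pair, n_aa_pair = Counter(), Counter(), Counter(), Counter()
    for orf in reference_orfs:
        cs = codons(orf)
        for c in cs:
            n_codon[c] += 1
            n_aa[CODE[c]] += 1
        for a, b in zip(cs, cs[1:]):
            n_pair[a, b] += 1
            n_aa_pair[CODE[a], CODE[b]] += 1
    scores = {}
    for (a, b), observed in n_pair.items():
        x, y = CODE[a], CODE[b]
        # Expected count if codons were chosen independently for this amino acid pair.
        expected = n_codon[a] * n_codon[b] / (n_aa[x] * n_aa[y]) * n_aa_pair[x, y]
        scores[a, b] = math.log(observed / expected)
    return scores


def cpb(seq, scores):
    """Mean codon pair score over adjacent codon pairs (unseen pairs count as 0)."""
    cs = codons(seq)
    pairs = list(zip(cs, cs[1:]))
    if not pairs:
        raise ValueError("sequence must contain at least two codons")
    return sum(scores.get(p, 0.0) for p in pairs) / len(pairs)


def recode_min(seq, scores, steps=2000, seed=0):
    """Greedy sketch: swap synonymous codons between positions to lower the CPB score.

    Swapping (rather than substituting) preserves both the amino acid sequence
    and the exact codon usage of the input.
    """
    rng = random.Random(seed)
    cs = codons(seq)
    best = cpb("".join(cs), scores)
    for _ in range(steps):
        i, j = rng.randrange(len(cs)), rng.randrange(len(cs))
        if cs[i] == cs[j] or CODE[cs[i]] != CODE[cs[j]]:
            continue  # only swaps of different codons for the same amino acid
        cs[i], cs[j] = cs[j], cs[i]
        trial = cpb("".join(cs), scores)
        if trial < best:
            best = trial
        else:
            cs[i], cs[j] = cs[j], cs[i]  # undo a swap that does not help
    return "".join(cs)
```

A maximizing variant would accept swaps that raise the score instead. A greedy search can stall in local optima, so a practical implementation would likely use a more robust optimizer and add the constraints omitted here.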
In vitro transcribed RNAs of PV-Max, PV-Min, and wild-type virus were transfected into HeLa R19 cells to assess virus production (5). PV-Max produced 90% cytopathic effect within 24 hours after RNA transfection, similar to the transfection of wild-type RNA (8). The PV-Max virus generated plaques identical in size to those of the wild type. In contrast, the PV-Min RNA produced no visible cytopathic effect after 96 hours, and no viable virus could be isolated even after four blind passages of the supernatant from transfected cells.
We subcloned portions of the PV-Min P1 region into an otherwise wild-type virus to reduce the number of underrepresented codon pairs. These subclones yielded viruses with varying degrees of attenuation. Viruses containing P1 fragments X and Y were each slightly attenuated; however, when combined they yielded virus PV-MinXY, which was substantially attenuated. Virus PV-MinZ was about as attenuated as PV-MinXY. Construct PV-YZ did not yield viable virus. We conclude that the inviability of PV-Min was due to the sum of defects in the various subportions.
One-step growth kinetics were examined. Like the wild-type virus, the chimeric viruses had an eclipse phase followed by exponential growth. However, as measured by plaque-forming units (PFUs), the final titer of PV-Min constructs was decreased by up to a factor of 1000 with respect to wild-type viruses. This low plaque titer could have resulted from lower production of virions (i.e., cells infected with PV-Min constructs produced fewer virus particles), or from lower specific infectivity of those virions (i.e., the particles that were produced were less efficient in establishing a plaque), or both. We examined both possibilities (2). When the number of viral particles produced per infected cell was measured, we found that cells infected with PV-MinXY or PV-MinZ produced fewer viral particles than did the wild type, but the effect was only about a factor of 10 or slightly less. When the number of particles per PFU was measured, we found a more striking effect: PV-MinXY and PV-MinZ each required about 100 times as many viral particles as the wild type to generate a plaque. For wild-type virus, the number of virions applied per plaque generated was about 137, whereas for PV-MinZ it was 13,500; hence, the main defect was reduced specific infectivity of the virions. The total attenuation from both effects together was a factor of ~1000.
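The decomposition of the titer defect into two factors can be checked with the numbers quoted above. The sketch below uses only values stated in the text (137 and 13,500 particles per PFU, and a particle yield reduced by about a factor of 10); the function name and the assumption that the two effects multiply independently are illustrative.

```python
def fold_attenuation(wt_particles_per_pfu, mutant_particles_per_pfu, yield_reduction):
    """Return (specific infectivity defect, total PFU titer defect) as fold changes.

    Assumes the total defect is the product of the loss in particles produced
    per cell (`yield_reduction`) and the loss in infectivity per particle.
    """
    infectivity_defect = mutant_particles_per_pfu / wt_particles_per_pfu
    return infectivity_defect, infectivity_defect * yield_reduction


if __name__ == "__main__":
    # Values from the text: wild type ~137 particles/PFU, PV-MinZ 13,500 particles/PFU,
    # and roughly 10-fold fewer particles produced per infected cell.
    infectivity, total = fold_attenuation(137, 13_500, 10)
    print(f"specific infectivity defect: ~{infectivity:.0f}-fold")
    print(f"combined titer defect: ~{total:.0f}-fold")
```

The ratio 13,500/137 is close to 100, and multiplying by the roughly 10-fold lower particle yield gives a combined defect close to the factor of ~1000 reported for the PFU titer.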
Poliovirus specific infectivity and attenuation. PLD50 is the amount of virus that caused paralysis in 50% of infected mice.
The heat stability of PV-MinXY and PV-MinZ was identical to that of the wild type. This observation suggests that their low specific infectivity is not a result of gross defects in the capsid (2) (fig. S3).
To measure the possible effect of codon pair bias on translation, we used a dicistronic reporter encoding both R-Luc and F-Luc (2). Because the F-Luc reporter is translated as a fusion protein with the proteins of the P1 region, the translatability of the P1 region directly affects the amount of F-Luc protein produced. Thus, the ratio of F-Luc luminescence to R-Luc luminescence is a measure of the translatability of the various P1 encodings.
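The readout described above reduces to a ratio of two luminescence values. The sketch below shows one way to express it; the additional normalization to the wild-type construct, the function name, and the numbers in the usage example are assumptions for illustration and are not data from the study.

```python
def relative_translatability(f_luc, r_luc, wt_f_luc, wt_r_luc):
    """F-Luc/R-Luc ratio of a construct, expressed relative to the wild-type ratio.

    R-Luc (first cistron) serves as the internal control; F-Luc is fused to the
    P1 region under test. Dividing by the wild-type ratio is an assumed
    convenience so that the wild-type construct scores 1.0.
    """
    if r_luc <= 0 or wt_r_luc <= 0 or wt_f_luc <= 0:
        raise ValueError("control and wild-type luminescence values must be positive")
    return (f_luc / r_luc) / (wt_f_luc / wt_r_luc)


if __name__ == "__main__":
    # Placeholder luminescence readings (arbitrary units), invented for this example.
    print(relative_translatability(f_luc=50.0, r_luc=100.0, wt_f_luc=100.0, wt_r_luc=100.0))
```

Under this convention, values below 1 would indicate a P1 encoding that is translated less efficiently than the wild type, and values above 1 one that is translated more efficiently.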
Fig. 2 Effect of altered codon pair bias on translation. (A) Structure of a dicistronic reporter (5). The first cistron uses the hepatitis C virus (HCV) IRES to initiate translation of Renilla luciferase (R-Luc). This first cistron provides an internal control ...
The variously encoded P1 regions were tested. PV-MinXY, PV-MinZ, and PV-Min produced much less F-Luc per unit of R-Luc than did the wild-type P1 region, which strongly suggests that the underrepresented codon pairs reduced translation. The reduced translation is probably sufficient to explain the attenuated phenotype, because smaller reductions in translation caused by other methods have been observed to attenuate poliovirus; apparently, poliovirus has a fairly high threshold requirement for translation (5). In contrast, PV-Max P1 produced more F-Luc per unit of R-Luc than did the wild type, consistent with enhanced translation.
PV-MinXY and PV-MinZ each contain hundreds of mutations (407 and 224, respectively). If the attenuation of these viruses were due to hundreds of small defects, it should be difficult for these viruses to revert to wild-type virulence. Alternatively, if most mutations are neutral, with a small minority contributing to attenuation, then reversion should occur. To distinguish these possibilities, we serially passaged viruses PV-MinXY and PV-MinZ in HeLa R19 cells 17 and 19 times, respectively, at a multiplicity of infection (MOI) of 0.5. The titer was monitored for phenotypic reversion, and the passaged virus was sequenced. After 17 or 19 passages of PV-MinXY or PV-MinZ, respectively, no phenotypic change was detected (i.e., same titer, induction of cytopathic effect) and no nucleotide changes were seen in the synthetic region, supporting the idea that there are many small defects.
We next tested whether the synthetic viruses were also attenuated in animals. Viruses were administered to CD155 transgenic (tg) mice (which express the poliovirus receptor) via intracerebral injection (10), allowing direct exposure to the central nervous system, the ultimate target of poliovirus pathogenesis (11). PV-MinXY and PV-MinZ viruses were attenuated by a factor of 1000 (as measured by particles) or a factor of 10 (as measured by PFUs) (2). PV-Max virulence was identical to that of the wild type.
Because PV-MinZ and PV-MinXY encode exactly the same proteins as wild-type virus, they might provoke a protective immune response. Alternatively, the relatively poor translation of the mutant mRNAs might prevent such a response. To distinguish these possibilities, we administered PV-MinZ and PV-MinXY to groups of eight CD155 tg mice at a dose of 10^8 particles once a week for 3 weeks via intraperitoneal injection. Ten days after the final injection, the protective antibodies of the seven surviving mice in each group were measured via microneutralization assay, and a robust immune response was detected (fig. S3). Subsequent challenge of the immunized mice with an otherwise lethal dose of wild-type poliovirus via intramuscular injection did not lead to death or signs of paralysis or paresis; in contrast, all mock-immunized mice died.
Technology for the synthesis of large DNAs and for the redesign of living systems (5) allows the reengineering of viruses for specific purposes such as vaccines. We have used these approaches to generate polioviruses that use a large proportion of under- or overrepresented codon pairs. Although it has been known for many years that codon pair usage is biased (1), this phenomenon has previously been studied primarily by informatics (13). We now find that underrepresented codon pairs cause poor translation and attenuation in poliovirus. One theory for the existence of codon pair bias is that certain tRNAs interact poorly on the ribosome (3), and so the codon pairs causing the juxtaposition of such tRNAs are underrepresented; our translation data are consistent with this theory.
We note that attenuation is not caused by random changes in synonymous codons if those changes do not systematically reduce codon bias or codon pair bias. Previously we created the virus PV-SD, with 937 mutations in synonymous codons in the P1 region (5). In PV-SD, neither codon bias nor codon pair bias was changed, and that virus was not attenuated (2) (table S2). Here, we created virus PV-Max, which similarly contained 566 mutations in synonymous codons, and it was also not attenuated. It is noteworthy that even though PV-Max contains overrepresented codons, it is not more virulent than the wild type, possibly because evolution has already effectively optimized encoding.
The correlation between the degree of codon pair deoptimization and the degree of viral attenuation, as well as the lack of viral reversion upon passaging, are consistent with the idea that many of the 631 mutations in PV-Min cause small, additive defects. Thus, these attenuated viruses should be stable genetically, because an increase in virulence might require dozens or hundreds of reversions. This genetic stability is important; for example, the oral poliovirus vaccine (OPV) has 51 mutations, but only 5 have been shown to contribute to attenuation (15). Thus, OPV can (rarely) revert to neurovirulence in vaccine recipients, causing vaccine-associated paralytic poliomyelitis. Even more seriously, the vaccine strains may evolve into highly virulent circulating vaccine–derived polioviruses by mutation and recombination with related Coxsackie A viruses (15). Such viruses have caused small epidemics of poliomyelitis (15).
Finally, these results suggest that synthetic attenuated virus engineering (SAVE) could play a role in creating new vaccines for various types of viruses. By deoptimizing codon pair bias, one could systematically attenuate a virus to variable but controllable and predictable extents. This approach has four key features: (i) It produces a virus encoding precisely the same amino acid sequences as the wild-type virus, and therefore eliciting the same immune response. (ii) It is a systematic method apparently applicable to many viruses, and possibly not requiring detailed, virus-specific research. (iii) The attenuation is not subject to reversion, simply because of the sheer number of mutations. (iv) It can be combined with other attenuating changes (such as amino acid changes from adaptation of the virus to low temperatures or alternative species) or with other synthetic biology approaches to attenuation (18), thus taking advantage of additional modes of attenuation while providing the unique advantage of limited reversion. This “death by a thousand cuts” strategy is in contrast to existing methods of attenuation, which typically depend on a small number of mutations, and which can revert. Even for an inactivated rather than live virus approach, these features would allow a vaccine to be made from a safer starting material than the corresponding wild-type virus.