Smallpox was eradicated using variant forms of vaccinia virus-based vaccines. One of these was Dryvax, a calf lymph vaccine derived from the New York City Board of Health strain. We used genome-sequencing technology to examine the genetic diversity of the population of viruses present in a sample of Dryvax. These studies show that the conserved cores of these viruses exhibit a lower level of sequence variation than do the telomeres. However, even though the ends of orthopoxviruses are more genetically plastic than the cores, there are still many telomeric genes that are conserved as intact open reading frames in the 11 genomes that we, and 4 genomes that others, have sequenced. Most of these genes likely modulate inflammation. Our sequencing also detected an evolving pattern of mutation, with some genes being highly fragmented by randomly assorting mutations (e.g., M1L), while other genes are intact in most viruses but have been disrupted in individual strains (e.g., I4L in strain DPP17). Over 85% of insertion and deletion mutations are associated with repeats, and a rare new isolate bearing a large deletion in the right telomere was identified. All of these strains cluster in dendrograms consistent with their origin but which also surprisingly incorporate horsepox virus. However, these viruses also exhibit a “patchy” pattern of polymorphic sites characteristic of recombinants. There is more genetic diversity detected within a vial of Dryvax than between variola virus major and minor strains, and our study highlights how propagation methods affect the genetics of orthopoxvirus populations.
Smallpox was eradicated in the 1970s through the use of intensive vaccination in combination with campaigns designed to discover and isolate residual pockets of disease (10). The vaccines used in many of these campaigns were composed of live vaccinia virus (VACV) cultured in large quantities on the skins of animals, usually cows. Many different vaccinia virus strains were used as vaccines toward the end of this era, including a strain that was distributed in a lyophilized formulation called Dryvax (DVX) produced by Wyeth Laboratories (24, 37). This calf lymph vaccine is derived from the New York City Board of Health VACV strain and shares this origin with the most commonly studied VACV research strain, Western Reserve (WR). However, the two viruses have long been propagated in isolation. The last stocks of Dryvax were produced after passaging the virus 22 to 28 times in cows (24), while the sequenced strain of WR (NC_006998) has a complex 70-year history of passage, first in rabbits, followed by mice and, in more recent decades, by extensive passage in cell culture (26; R. Condit, personal communication).
These old smallpox vaccines were rarely subjected to clonal purification; in fact, the methods used to propagate them would have readily produced mixtures of viruses that are commonly called quasispecies. They were also contaminated with adventitious agents, including bacteria and bacterial debris (10). This situation is considered intolerable for modern licensure requirements and created problems when the need arose to produce new smallpox vaccine supplies in the late 1990s. This led to the development of ACAM-2K (Acambis clone 2000), a licensed vaccine comprising a VACV strain cloned from Dryvax and cultured on Vero cells (11, 14, 22, 37). ACAM-2K was one of seven viruses originally cloned from a pool of Dryvax production lots and shown to replicate the immunogenicity of Dryvax in humans while exhibiting a seemingly comparable (albeit still not ideal) safety profile (11).
Genome sequencing has suggested that vaccines like Dryvax are composed of a complex mixture of viruses. The Esposito laboratory reported that there are 573 single-nucleotide polymorphisms (SNPs) and 53 insertions-deletions (indels) of various sizes that differentiate ACAM-2K (originally called clone 2, or CL2) from a more neurovirulent sister clone, clone 3 (CL3) (24). Similar degrees of sequence difference are observed when these viruses are compared in a pairwise manner with other independently isolated Dryvax subclones, including VACV-3737 and VACV-Duke. VACV-Duke is of special interest because it was isolated from a patient who developed progressive vaccinia after being vaccinated with Dryvax (18). Thus, it may illustrate an example of clonal selection operating in vivo upon a virus presumed to have been a component of the original inoculum. Dryvax is, of course, not the only old smallpox vaccine that comprises a quasispecies. Garcel et al. have documented a diversity of phenotypes exhibited by clones isolated from a stock of the VACV strain Lister (13), and shotgun sequencing of unpurified stocks identified >1,200 polymorphic sites distributed across a mix of Lister genomes (23).
These observations raise intriguing questions about the degree of genome diversity that can be found in old smallpox vaccines. In this communication, we have taken advantage of recent advances in DNA-sequencing technologies to explore this question in greater detail. Our results illustrate the remarkable complexity of the quasispecies that characterize stocks of old, unpurified smallpox vaccines and suggest that the viruses that have been isolated to date represent only a small fraction of the diversity of viruses in these preparations. These genomic studies also provide insights into the origin of viruses like VACV-Duke and of orthopoxvirus evolution under the selection processes associated with classical VACV propagation methods.
Viruses were isolated from a stock vial of Dryvax (lot 1556-14) and propagated on mycoplasma-free monkey kidney epithelial (BSC-40) cells in modified Eagle's medium (MEM) supplemented with 5% fetal bovine serum, 1% nonessential amino acids, 1% L-glutamine, and 1% antibiotic at 37°C in a 5% CO2 atmosphere. BSC-40 cells were grown to 80% confluence in 24-well plates and then infected with virus at a multiplicity of infection of approximately 1 PFU/well in 100 μl of phosphate-buffered saline (PBS) for 1 h at 37°C. The viruses were cultured for 3 days and then harvested from wells containing only one plaque. These viruses were then cloned twice more by limiting dilution as described above. Plaque images (see Fig. S1 in the supplemental material) were processed with ImageJ (1).
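The limiting-dilution step can be illustrated with a short calculation. The following is a minimal sketch, assuming (this is not stated in the protocol) that infectious particles are distributed among wells according to a Poisson distribution with a mean of 1 PFU/well; the function name is hypothetical.

```python
import math


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k events when the expected count is `mean`."""
    return math.exp(-mean) * mean**k / math.factorial(k)


# Assumption: plaques per well follow a Poisson distribution, mean 1 PFU/well.
mean_pfu_per_well = 1.0
p_empty = poisson_pmf(0, mean_pfu_per_well)
p_single = poisson_pmf(1, mean_pfu_per_well)
p_multiple = 1.0 - p_empty - p_single

print(f"empty: {p_empty:.3f}, single plaque: {p_single:.3f}, multiple: {p_multiple:.3f}")
```

Under this assumption roughly one-third of the wells would be expected to contain a single plaque, which is consistent with harvesting only from such wells.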
Each stock of plaque-purified virus was bulked up using sequential passages on BSC-40 cells and then purified by centrifugation through sucrose gradients as described previously (16). The DNA was extracted from each purified virus using proteinase K digestion followed by phenol-chloroform extraction. The amount of DNA was determined by spectrophotometry, and then 5 μg of each specimen was sequenced at the Genome Québec Innovation Centre (Montréal, Québec) using a high-throughput pyrosequencing approach on a Roche 454 GS FLX Titanium sequencer platform. A total of 12 viruses were sequenced using this approach, and 11 were successfully assembled into complete genomes.
Different-size contigs were assembled from the raw sequencing data using Newbler, and then CLC Genomics Workbench 4 software was used to inspect the trace data and complete the assembly of nearly full-length genomes. Conflicts between the reference sequences and our assemblies were resolved by using PCR to amplify the region of interest, followed by Sanger sequencing of the amplicons. Bioinformatic analyses were performed using Viral Genome Organizer (17, 34) and Viral Orthologous Clusters (9, 35; http://www.virology.ca). The program LAGAN (6; http://genome.lbl.gov) was used to produce alignments of multiple genomic sequences, and Base-by-Base software (5) was used to fine-tune the alignments and to produce a visual summary of the whole-genome alignments. To explore the phylogenetic arrangements, 98.8 kb of conserved DNA sequences (spanning genes DVX_058 to DVX_155) was extracted from the multiple-genome alignment and analyzed using a maximum-likelihood analysis with the Recombination Detection Program (RDP) (21) and 1,000 bootstrap replicates. Phylogenetic trees were plotted using TreeView (25). The plot of putative recombination sites (see Fig. 9) was produced using the program Simplot/Bootscan (20) with a 200-nucleotide (nt) window, 20-nt steps, gap stripping “on,” and 100 replicates and employed a neighbor-joining method of tree calculation. The Genome Annotation Transfer Utility (GATU) (32) was used to initially transfer a reference annotation to our Dryvax-derived viral genome sequences. Artemis (27) was used to visualize and edit the annotation. Table 1 lists the accession numbers for the VACV genomes cited in this communication.
To facilitate gene annotation, all of the complete experimental genomes were aligned and used to create a synthetic genome including a collective of all the open reading frames (ORFs). This synthetic, or “master,” DVX genome was used as a reference for annotation purposes.
Quantitative PCR (q-PCR) was used to determine the relative abundances of the virus types discovered through genome sequencing. The primers used in this experiment are shown in Table S1A in the supplemental material and are named according to the genes they target. A pool of virus DNA was prepared by boiling the Dryvax vaccine in 5% (vol/vol) ion-exchange resin (Sigma; C7901) for 30 min, followed by centrifugation for 20 min at 10,000 × g. The supernatant was transferred to a clean tube and used as a source of DNA for the q-PCRs. The gene designated DVX_209 was carried by all four of the virus variants and was therefore used as a standard to normalize the amount of virus DNA. The q-PCRs were assembled using a SYBR green “supermix” (Bio-Rad; 170-8882) and processed in a Bio-Rad Min-Opticon cycler according to the manufacturer's directions. Cloned virus DNAs were prepared as described above for use in ordinary PCRs.
Virus DNA was digested with SalI (Fermentas) and size fractionated by electrophoresis through 0.7% agarose gels. The DNA was fragmented in situ with 0.2 M HCl, denatured with 0.4 M NaOH and 1 M NaCl, transferred to a nylon membrane (Pall Corporation; B60207), washed, and then UV cross-linked. A 445-bp biotin-labeled probe was prepared using PCR in reaction mixtures containing biotin-16-dUTP (Roche; 1093070), two oligonucleotide primers (5′-GACTTAAACAACGGACAC-3′ and 5′-GGCATAAAACACGAAGAGAA-3′), and Taq DNA polymerase (Fermentas). After hybridizing the probe to the DNA, the membrane was stained with IRDye 800CW-coupled streptavidin (Li-Cor; 926-32230) and imaged using a Li-Cor infrared imager as recommended by the manufacturer.
A single round of passage by limiting dilution on BSC-40 cells was used to isolate >50 different randomly selected Dryvax clones. A total of 25 viruses were then chosen for further passage by limiting dilution two more times on BSC-40 cells. These viruses were separately “bulked up,” the high-titer stocks were purified using sucrose gradients, and the virus DNA was isolated using phenol-chloroform extractions. This method produced various yields of virus DNA, and we arbitrarily elected to sequence the 12 viruses that yielded the greatest amounts of DNA for 454 sequencing. Although this may have biased the selection in favor of viruses that replicate most efficiently in BSC-40 cells, the 11 viruses that were eventually sequenced and assembled into complete contigs produced plaques that were not obviously any different from the range of plaque sizes produced by the original pool of 25 viruses (see Fig. S1 in the supplemental material). The choice of viruses thus represents a modestly representative sample of the diversity of viruses in Dryvax, although any virus exhibiting a profound replication defect in BSC-40 cells would probably not have been sequenced.
The viruses were sequenced using a multiplex approach and a Roche 454 GS FLX Titanium sequencer. Table S2 in the supplemental material summarizes the sequencing statistics. The sequence reads were automatically assembled into initial contigs and then manually assembled into nearly complete final contigs using several different sequencing tools. Because the average read length was only ~320 nt, these sequencing methods do not provide accurate insights into the structure of the highly repeated elements located in the virus telomeres (2, 3). We therefore elected to define the left and right ends of each genome as each comprising four copies of the 54-bp repeats located proximal to the boundaries of the terminal inverted repeats (TIR). DNA sequencing data and Southern blots showed that these viruses exhibit the variable numbers (54, 69, and 125 nt) of telomeric repeat elements that have previously been reported (2, 18) to characterize vaccinia virus strains (data not shown).
Pyrosequencing methods are prone to producing indel-type sequencing errors within homopolymeric base runs. In many cases, these mistakes were easily spotted wherever the consensus produced frameshift errors within normally intact VACV ORFs, and the true sequence could be deduced from visual inspection of the sequences of one or more of the aligned high-quality replicate reads. We also PCR amplified six of the sites where it was not possible to deduce the true sequence from the replicate reads and resequenced them. In all six instances, the correct sequence was the one that supported the original reference sequence and an intact ORF. This led us to assume that “mutations” in homopolymeric runs of more than 4 bases were artifacts of the sequencing technology, and they were thus edited to maintain previously described open reading frames. It is possible that a few true mutations were missed due to this assumption.
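The editing rule described above can be expressed as a simple filter. The following is a minimal sketch, not the procedure actually used (the edits were made by visual inspection); the function names and the toy sequences are hypothetical, and positions are 0-based.

```python
def homopolymer_run_length(seq: str, pos: int) -> int:
    """Length of the run of identical bases that contains index `pos`."""
    base = seq[pos]
    start = pos
    while start > 0 and seq[start - 1] == base:
        start -= 1
    end = pos
    while end < len(seq) - 1 and seq[end + 1] == base:
        end += 1
    return end - start + 1


def is_probable_artifact(seq: str, pos: int, min_run: int = 5) -> bool:
    """Flag an apparent indel at `pos` if it lies in a run of more than 4 identical bases."""
    return homopolymer_run_length(seq, pos) >= min_run


# Toy sequences (hypothetical): a run of six T's versus a run of three T's.
print(is_probable_artifact("ACGTTTTTTAGC", 5))
print(is_probable_artifact("ACGTTTAGC", 4))
```

The threshold `min_run=5` encodes "more than 4 bases"; an apparent indel in a shorter run would be retained as a candidate true mutation.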
Four other Dryvax-derived whole-genome sequences had been assembled and annotated prior to starting this project. To facilitate the direct comparison of these viruses, we first produced a master genome that contains all of the ORFs that have been identified as being carried by one or more Dryvax derivatives and then transferred the earlier annotations to our sequences. The genes were numbered according to the system used to annotate strains ACAM-2K and CL3 (24), although we added four additional genes (DVX_063.5, DVX_080.5, DVX_164.5, and DVX_192.5), which are widely conserved among VACV strains, including VACV strain WR. This system generally defines an ORF as comprising at least 50 amino acids, and where a gene was identified as being fragmented, it referred to the fact that a much larger contiguous ORF is seen in one of the Dryvax clones (e.g., M1L or DVX_041), or that it has been truncated by at least one-third of its original length or split into two or more pieces in another vaccinia virus strain (e.g., strain Copenhagen). In some cases, it is not clear which of several ATG codons encoded the initiating MET; thus, the different annotations can create the false appearance of variably sized ORFs. Where possible, we used the known transcription start site to identify the most likely start codon (39). This was not possible with intermediate and late genes, so where these discrepancies were noted, we identified the most highly conserved consensus ATG as the probable start codon. Table S3 in the supplemental material summarizes all of the genes and other large ORFs carried by these cloned viruses, along with the reported gene complements for strains ACAM-2K, CL3, Duke, and 3737.
An important feature of the data summarized in Table S3 in the supplemental material is the large number of genes that seem to be conserved between the different viruses, insofar as they comprise nearly identical ORFs. This is not a surprising feature of the genes within the conserved central core of these viruses (see below), but it does suggest that many genes in the relatively unstable telomere region also provide some selective advantage under the conditions in which these viruses have been propagated. Many of the conserved genes appear to regulate inflammatory (and other) host antiviral processes, such as DVX_001 (a chemokine-binding protein), DVX_015 (an interleukin 1 [IL-1] receptor antagonist), DVX_013 (SPI-1 serpin), DVX_034 (a secreted complement binding protein), and DVX_042 (an NF-κB inhibitor). Kretzschmar et al. (15) have noted that vaccines derived from the New York City Board of Health strain produced ~10-fold less postvaccinial encephalitis than other once widely used smallpox vaccines (e.g., Bern, Lister, and Copenhagen). It is possible that the conservation of such anti-inflammatory genes is a contributing factor behind this phenotype, although this is clearly not a sufficient explanation, as most of these genes are still also carried by strains Lister and Copenhagen (Bern has not been sequenced). It may also confer a replicative advantage for viruses that can delay the induction of a sterilizing immune response in the animals in which these vaccines were propagated. Several genes of still unknown function (e.g., DVX_009, DVX_010, and DVX_012) have also been maintained intact within the telomeric regions and thus may be deserving of further investigation. DVX_027 bears some resemblance to the tumor necrosis factor (TNF)-binding proteins of other poxviruses.
Table 2 summarizes the major differences in gene complement between the 15 sequenced Dryvax clones. A great many genes differ only slightly in length, and this is commonly due to the acquisition of one or more in-frame deletions. One of the more interesting examples of these genes is DVX_204 (B11R), which includes anywhere from 1 to 12 copies of a 6-nucleotide repeat (5′-ACAGAT-3′) in the 5′ end of the gene (Fig. 1). The in-frame deletion (or insertion) of this 6-bp sequence creates ORFs ranging in length from 219 to 285 bp among these VACV clones (Table 2). This variable repeat is conserved among orthopoxviruses, and the longest reported set of these repeats (23 copies) is carried by the B11R homolog of monkeypox virus, strain Zaire. A more typical example of this pattern of mutagenesis is seen in DVX_142 (A12L). This gene ranges in length from 570 to 579 bp due to combinations of 0-, 3-, and/or 6-bp in-frame indels near the middle of the gene (Fig. 2). A notable feature of the generation of the deletion variants is that over long periods of time it could rationalize how poxviruses have evolved to encode some of the smallest known examples of several different enzymes.
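The B11R length range can be checked with simple arithmetic. The following is a minimal sketch, assuming that the shortest reported ORF (219 bp) corresponds to 1 copy of the repeat, that the longest (285 bp) corresponds to 12 copies, and that nothing else in the gene varies in length; the names are hypothetical.

```python
REPEAT_UNIT = "ACAGAT"   # 6-nucleotide repeat reported for DVX_204 (B11R)
SHORTEST_ORF_BP = 219    # assumed to correspond to a single repeat copy


def orf_length(copies: int) -> int:
    """Predicted ORF length (bp) for a given number of repeat copies."""
    return SHORTEST_ORF_BP + len(REPEAT_UNIT) * (copies - 1)


lengths = [orf_length(n) for n in range(1, 13)]
print(lengths[0], lengths[-1])                    # shortest and longest predicted ORFs
print(all(length % 3 == 0 for length in lengths)) # every variant stays in frame
```

Because the repeat unit is a multiple of 3 nt, each added or removed copy changes the protein by two amino acids without shifting the reading frame.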
Besides an accumulation of nonframeshifting indels, several genes have been disrupted by a varying pattern of frameshift mutations. For example, most Dryvax clones carry intact homologs of DVX_084 (I4L) and DVX_177 (A41L), but these genes are disrupted by single frameshifts in DPP17 and DPP13, respectively. The DVX_084 gene is of some interest, as it encodes the large subunit of the ribonucleotide reductase. The gene is not essential (12), and although most poxviruses encode the small subunit of the ribonucleotide reductase, only a subset of orthopoxviruses and the suipoxvirus swinepox virus carry a gene for the large subunit. An interesting feature of the frameshift mutation in I4L is that it is linked to a number of nearby point mutations that are unique to DPP17 (Fig. 3). Clusters of DNA damage and mutations are a hallmark of exposure to ionizing radiation (31), although this pattern of mutations could also be a consequence of “patchy” recombination. Interestingly, horsepox virus (HSPV) includes several frameshift mutations in the I4L homolog, one of which is identical to the ΔT mutation in DPP17 (33). This peculiar feature of the horsepox genome is discussed further below.
Several genes also show a classic pattern of accumulating frameshift and point mutations that seem to be progressively degrading the residual ORFs. This is presumably a consequence of the fact that, once a gene has been inactivated by an initial mutation, there is no longer any further selection for the maintenance of gene function, and sequence drift can occur. For example, the DVX_039-041 ORFs derive from a single larger gene (M1L) that is intact in several of the strains, including ACAM-2K and WR (DVX_041). However, the length of the gene varies due to a combination of in-frame indels in some viruses, and the gene is disrupted by a 2-bp frameshift and/or C-to-A nonsense mutations in other viruses (Fig. 4). These mutations appear to be assorting independently between different viruses and create at least eight different alleles of the one original gene.
It has been previously noted that many of the indels that differentiate VACV from variola virus (VARV) are associated with small duplications, and this led Coulson and Upton (8) to suggest that poxvirus replication is susceptible to strand slippage errors. Our sequencing of a group of much more closely related and cocultivated VACV strains provides many further examples in support of this hypothesis. We found that >85% of the indel mutations are associated with the presence of repeated sequences in one or more of the cloned viruses. (Note that one cannot determine with certainty which sequence is ancestral from these data alone, although if only one virus carries a particular sequence, it most likely derives from a virus resembling the consensus sequence.) Figure 5 illustrates the manner in which duplications of nucleotide sequence are associated with indels. It is important to note that many of the insertions and deletions associated with 1 to 3 nt of identity occur at sites of repeated sequence (e.g., a T from TTT or an AT from ATAT) and thus tend to skew these statistics, even when we discount deletions occurring in longer patches of homopolymeric repeated sequence as being probable artifacts of the sequencing technology. A similar pattern of Streisinger frameshifting within short homopolymer repeats has been previously noted in recombinant VACV MVA strains carrying cloned HIV genes (38). The number of events declines as the length of the duplication increases, possibly due to the greater instability of longer repeats and thus their deletion from the virus pool. We did look closely at sites where there was no apparent sequence duplication(s) associated with particular indels, i.e., the “0-nt” class of events (Fig. 5). One unique putative insertion mutation was located adjacent to a classic VACV topoisomerase recognition site (29), which is conserved in all of the other viruses (Fig. 5, bottom), but no other distinguishing features were noted regarding these sites.
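The association between an indel and a flanking duplication can be scored as follows. This is a minimal sketch of one common way to count the nucleotides of identity at a deletion site, offered as an assumption about how such classes might be defined rather than as the method used to prepare Fig. 5; the function name and toy sequences are hypothetical, and positions are 0-based.

```python
def flanking_identity(seq: str, start: int, length: int) -> int:
    """Count bases of identity between a deleted segment and its adjoining sequence.

    `seq` is the longer allele; the deletion removes seq[start:start + length].
    The result does not depend on where the deletion is placed within a repeat.
    """
    # Slide to the right: compare the deleted segment with what follows it.
    right = 0
    while (start + length + right < len(seq)
           and seq[start + right] == seq[start + length + right]):
        right += 1
    # Slide to the left: compare what precedes the segment with its end.
    left = 0
    while (start - 1 - left >= 0
           and seq[start - 1 - left] == seq[start + length - 1 - left]):
        left += 1
    return left + right


print(flanking_identity("ACTTTG", 3, 1))    # a T deleted from TTT
print(flanking_identity("GGATATCC", 2, 2))  # an AT deleted from ATAT
print(flanking_identity("GGACTCC", 2, 2))   # no flanking duplication ("0-nt" class)
```

A score of zero corresponds to the “0-nt” class, whereas any positive score indicates that the same product could have arisen by slippage between short repeats.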
It has previously been noted that the junction regions, where the boundaries of the TIR are located, are prone to high frequencies of mutation and rearrangement (18, 24). This is true of these different clones, where several large rearrangements create significant alterations in the gene structure surrounding the junction with the right-hand TIR (Fig. 6, top). Besides discovering additional examples of viruses bearing telomeres resembling Duke (DPP13 and DPP21) and ACAM-2K/3737 (DPP09-12, -15, -16, -19, and -20), we also detected an additional 11.7-kb deletion in isolate DPP17. This deletion completely excised the DVX_212 (B19R) gene, an interferon binding protein (7, 36) that is partially deleted in ACAM-2K (24), as well as the proteins of unknown function encoded by DVX_210 and DVX_211. We did not initially sequence an example of a virus bearing the deletion characteristic of strain CL3, although it is present among the clones and in the vaccine stock (clone DPP25 [see below]). Such significant alterations in the gene complement of these strains might be expected to alter the abundances of the strain types in the original virus stock. To test this, we designed sets of PCR primers that can differentiate the four major “deletion types” of these viruses (see Table S1B in the supplemental material). PCR analysis showed that of the 25 cloned viruses, 13 carried the deletion characteristic of ACAM-2K, nine resembled Duke, one looked like CL3, and two carried the largest deletion characteristic of DPP17 (Fig. 7A and B). Because picking plaques biases the recovery of viruses, we also used q-PCR to measure the abundances of the different deletion alleles in a pool of virus DNA extracted directly from the stock of vaccine. These studies confirmed that (as defined by these large deletions) the ACAM-2K-like viruses were the dominant form (~60%), followed by Duke-like viruses (~40%), and <1% of the viruses resembled CL3 and DPP17 in these stocks. 
Southern blotting of DNA extracted from the Dryvax pool produced a pattern of hybridization signals with intensities that were also consistent with these measurements (Fig. 7C).
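The plaque-based frequencies can be tabulated for comparison with the q-PCR estimates. The following is a minimal sketch that uses only the counts reported above for the 25 cloned viruses; the dictionary keys and function name are hypothetical labels.

```python
# Deletion types among the 25 plaque-purified clones, as reported above.
plaque_counts = {
    "ACAM-2K-like": 13,
    "Duke-like": 9,
    "CL3-like": 1,
    "DPP17-like": 2,
}


def fractions(counts: dict) -> dict:
    """Convert raw clone counts into fractions of the total."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


for name, fraction in fractions(plaque_counts).items():
    print(f"{name}: {fraction:.0%}")
```

With only 25 clones, a single plaque represents 4% of the sample, so the rare types cannot be resolved below that level by plaque picking; the q-PCR measurement on pooled vaccine DNA is not limited in this way.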
An intriguing aspect of the discovery of the different viruses is that they may reflect the past evolutionary history of the telomere junctions. These seemingly complicated structures can be explained if it is assumed that a virus originally resembling CL3 was subjected to a rearrangement that transposed a copy of the left telomere sequence into the right telomere. This would create a family of viruses resembling ACAM-2K. A simple process of additional deletions could then produce the Duke- and DPP17-like viruses from the CL3- and ACAM-2K-like viruses, respectively (Fig. 6, bottom). The relative rarity of viruses resembling CL3 and DPP17 suggests that some of the genes in this interval may have adaptive value in competitive growth environments, but it is not possible to say with certainty which ones.
Figure 8 shows a plot of the distribution of sequence differences across the different genomes. Inspection of this plot suggests that the density of sequence differences varies across the genomes, with fewer polymorphic sites in the region between nucleotides ~40000 and ~150000. More quantitatively, we detected a density of about 0.8 SNP per 100 bp between nucleotides 40000 and 150000 and 1.3 SNPs per 100 bp in the 30-kb segments on either side of this central region. Most of the SNPs (74%) comprise transition substitutions, i.e., pyrimidine for pyrimidine and purine for purine. The central interval is bounded by the genes DVX_053/F4L and DVX_170/A36R and encompasses the F9R-A32L region that has been previously identified as including the highly conserved core of poxvirus genes (35). The lower rate of accumulation of mutations would be consistent with the large number of essential and highly conserved proteins encoded within this region.
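The two summary statistics used here can be computed as follows. This is a minimal sketch with hypothetical function names; the SNP count in the example is an illustrative value chosen to reproduce a density of 0.8 per 100 bp and is not a figure taken from the data.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """True for purine-to-purine or pyrimidine-to-pyrimidine substitutions."""
    if ref == alt:
        return False
    return {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES


def snp_density_per_100bp(n_snps: int, start: int, end: int) -> float:
    """SNPs per 100 bp over the interval [start, end)."""
    return 100.0 * n_snps / (end - start)


print(is_transition("A", "G"), is_transition("C", "T"), is_transition("A", "C"))
# Illustrative only: 880 SNPs across a 110-kb interval.
print(snp_density_per_100bp(880, 40000, 150000))
```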
The SNPs located between nucleotides 42240 and 141145 (aligned position) were used to examine the relationship between the different cloned isolates. Figure 8 shows a phylogenetic tree that clearly demonstrates clustering of the viruses that share a historical origin as the Dryvax vaccine. A notable feature of this tree is that most of the viruses that we have isolated cluster as one group, and separately from the ACAM-2K and CL3 strains that were independently isolated from another stock of the same vaccine. However, the association is not absolute, with DPP21 and DPP17 falling elsewhere within the “Dryvax cluster.” It is also curious that horsepox virus falls into this grouping, providing some support for the hypothesis that HSPV derives from a feral vaccine strain (33) in much the same manner as VACV appears to have established new zoonotic infections in Asian water buffalo and South American cattle (4, 30). It has previously been noted that there is more sequence diversity among all the sequenced VACV strains (including Tian Tan and Copenhagen) than among extant variola virus strains (24). Judging by the branch lengths, our phylogenetic analysis shows that VACV, which can be isolated from a single vial of Dryvax stock, also exhibits more sequence diversity than is seen in strains of variola major and variola minor viruses.
Within the cluster of Dryvax clones, the branching is not securely supported by the bootstrap values. This most likely reflects the fact that these viruses are genomic mosaics generated by multiple recombination events at some point in their history, and this obscures any clear relationship between the different isolates (8). This hypothesis is supported by an analysis of how the patterns of SNPs and indels are shared between the different genomes. For example, the widely used program Bootscan (20, 28) calculates how often a bootstrapping algorithm assigns two viruses to a common branch of several possible trees and how this relationship changes as one scans across a window encompassing different polymorphic sites. Bootscan discovered evidence that each virus has many short patches of sequence that closely resemble portions of other “sister” viruses. For example, within an interval toward the left end of the virus containing a relatively high density of informative polymorphic sites, DPP13 includes blocks of sequence that resemble portions of homologous loci in DPP11, -16, -20, and -21 (Fig. 9). Different patterns are detected in different viruses, although the reciprocal signals are readily detected (i.e., DPP15 includes a patch of sequence resembling DPP17 and vice versa [Fig. 9B and C]).
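The idea of scanning a window across an alignment can be illustrated with a much simpler calculation than Bootscan performs. The following sketch is a stand-in, not Bootscan: it builds no trees and does no bootstrapping, and simply reports which relative is most similar to the query in each window. It assumes prealigned, gap-free sequences of equal length; the function names and toy sequences are hypothetical.

```python
def window_starts(length: int, window: int = 200, step: int = 20) -> list:
    """Start coordinates of successive windows across an alignment."""
    return list(range(0, length - window + 1, step))


def identity(a: str, b: str) -> float:
    """Fraction of aligned positions at which two sequences agree."""
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


def closest_relative_by_window(query: str, relatives: dict,
                               window: int = 200, step: int = 20) -> list:
    """For each window, name the relative with the highest identity to the query."""
    results = []
    for s in window_starts(len(query), window, step):
        q = query[s:s + window]
        best = max(relatives, key=lambda name: identity(q, relatives[name][s:s + window]))
        results.append((s, best))
    return results


# Toy mosaic: the left half of the query matches "x", the right half matches "y".
query = "AAAAAAAACCCCCCCC"
relatives = {"x": "AAAAAAAAGGGGGGGG", "y": "TTTTTTTTCCCCCCCC"}
print(closest_relative_by_window(query, relatives, window=8, step=8))
```

A change in the closest relative from one window to the next is the kind of signal that a “patchy” recombinant genome would be expected to produce, although statistical support requires the bootstrapped analysis described above.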
How much recombination these viruses have been subjected to is more difficult to determine. Previous studies from our laboratory have shown that replicating poxviruses can very efficiently recombine DNA under certain special circumstances, but there are also some physical constraints operating within a cell that, in combination with issues relating to the multiplicity of infection, limit recombination between coinfecting viruses (19). This dichotomous situation is also seen in the Dryvax stocks. On one hand, many mutations show a complete loss of linkage, as is illustrated by the A12L and M1L genes (Fig. 2 and 4, respectively), and appear to be assorting randomly among the genomes. On the other hand, there has not been enough recombination to fully obscure the patchwork patterns of closely linked polymorphic sites that are detected by a recombination detection algorithm (Fig. 9). The simplest explanation for this situation is that these stocks are composed of a mixture of recombinant viruses, but there has not been sufficient recombination to completely obscure the presence of a few still linked markers.
There is one fascinating illustration of this patchy pattern of recombination. Most of the mutations in the DPP17 DVX_084 gene are tightly linked and seemingly unique to that virus (Fig. 3). We have found no other Dryvax strains with a similar pattern of markers. However, what is exceedingly curious about the DPP17 DVX_084 ΔT frameshift mutation is that it lies in the center of a 200-nt patch of DPP17 DNA that, aside from a new 2-bp deletion, closely resembles the homologous locus in the horsepox R1 gene. The remainder of the DPP17 DVX_084 gene is clearly more closely related to those of viruses like ACAM-2K, as judged by the surrounding pattern of polymorphic sites (Fig. 10). This looks like a “molecular fossil” and provides further evidence of a shared origin (or at least some cocultivation) of horsepox and vaccinia viruses.
These studies provide new insights into the population structure and evolutionary trajectories of classical smallpox vaccines. Within a single vial of Dryvax, we have identified at least four different variants of VACV, as defined by the pattern of large deletions in the right-hand telomere. By this crude definition, viruses resembling ACAM-2K comprise about 60% of the viruses in this population, and this observation provides additional retrospective support for the wisdom of selecting this isolate to serve as a clonal representative of the VACV in Dryvax. These stocks also contain a substantial fraction of viruses (~40%) bearing the right telomeric deletion characteristic of strain Duke, as well as sufficient other genetic commonalities to cluster Duke in a phylogenetic tree with DPP12 (Fig. 8). This suggests that the Duke strain was not a novel form of spontaneously arising virulent VACV, but rather represents a preexisting strain type that was subjected to clonal selection in a patient susceptible to vaccinia necrosum (18). Rather surprisingly, we could detect only small numbers of viruses resembling the virulent CL3 strain (≤4%), even though it represented one of the seven viruses originally characterized during the development of ACAM-2K (24). Whether this reflects some differences between different lots of virus or experimental protocols or simply random chance is difficult to say. We have also identified a new variant strain of VACV, DPP17, which was probably produced by the deletion of 7 kb from the right telomere of a virus resembling ACAM-2K (Fig. 6). Given this lineage and the evidence suggesting that ACAM-2K represents a less virulent form of VACV (compared with CL3), it would be interesting to test the safety of a DPP17-based vaccine. 
It is possible we have missed viruses bearing other large deletions, because the Poisson 95% confidence interval is 0.0 to 3.7, with zero observed events, and thus a screen of 25 plaques could have easily missed any viruses comprising less than 3.7/25 (i.e., 15%) of the population. However, the true abundance would have to be far less than 15%, because no other variants were detected by Southern blotting. We found no viruses carrying previously unknown genes, although again, the small sample size makes it impossible to conclude there are no other genes remaining to be discovered in rare DVX clones.
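The sampling limit quoted above can be reproduced with a short calculation. The following is a minimal sketch, assuming the standard exact two-sided Poisson interval, for which the upper limit with zero observed events is the mean λ satisfying exp(−λ) = α/2; the function name is hypothetical.

```python
import math


def poisson_upper_limit_zero_events(confidence: float = 0.95) -> float:
    """Upper limit of the two-sided exact Poisson interval when no events are observed."""
    alpha = 1.0 - confidence
    return -math.log(alpha / 2.0)


plaques_screened = 25
upper = poisson_upper_limit_zero_events()
max_fraction = upper / plaques_screened

print(f"upper limit: {upper:.1f} events")
print(f"largest undetected fraction: {max_fraction:.0%}")
```

The same function shows how the limit would tighten with a larger screen; for example, dividing the upper limit by 100 plaques rather than 25 would bound an undetected variant at roughly 4% of the population.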
Finally, these studies also provide some interesting insights into the behavior of VACV in the face of the selective forces imposed by classical calf lymph culture methods. It is not surprising that the central core of conserved genes appears to resist mutation, but selection also appears to favor the retention of at least some genes in the classically “unstable” telomeres. As a number of authors have previously noted, the viruses in vaccine stocks differ greatly due to an abundance of single-nucleotide and other polymorphisms (13, 24). However, our sequencing data also highlight the natural instabilities associated with sites bearing small duplications. We have previously noted that only limited sequence identity is needed to support recombinational repair in VACV-infected cells (40), and this pattern of repeat instability is perhaps reflective of this process. Recombination also appears to be rearranging the different genomes, but not to such an extent as to completely unlink all of the mutations. These viruses have clearly been subjected to a long history of mutational drift and periodic rearrangement by recombination in the absence of severe selection pressure. This has created a much greater degree of genetic diversity in a VACV vaccine than is seen in viruses like variola virus, which would have been subjected to very different evolutionary pressures, especially bottlenecks, during the natural passage of smallpox from person to person.
We thank Nicole Favis for excellent technical support and Wendy Magee and other members of the Evans laboratory for their helpful advice. We also thank the staff at the Genome Québec Innovation Centre for their guidance and the reviewers for helpful comments.
This work was supported by operating grants from the Canadian Institutes for Health Research and the Alberta Cancer Foundation and by an infrastructure award from the Canada Foundation for Innovation.
†Supplemental material for this article may be found at http://jvi.asm.org/.
Published ahead of print on 5 October 2011.