It is becoming increasingly apparent that single-stranded DNA (ssDNA) viruses such as the anelloviruses [1], geminiviruses [4], parvoviruses [10] and microviruses [13] are probably evolving as rapidly as many RNA viruses [15]. While the inherent infidelities of RNA polymerases and reverse transcriptases drive the high rates of evolution seen in RNA viruses, all known ssDNA viruses replicate using presumably high-fidelity host DNA polymerases. It is surprising, therefore, that the basal mutation rates of ssDNA viruses are orders of magnitude higher than those of their hosts [15].
The best-supported, non-exclusive theories put forward so far to explain the discrepancies between the basal mutation rates of ssDNA viruses and those of their hosts are that: (1) when in a ssDNA state, the genomes of these viruses are subject to mutagenic processes that are less frequently experienced by dsDNA [4]; (2) geminivirus genomes, and those of some other ssDNA viruses, are not sufficiently methylated, such that normal host mechanisms of mismatch repair may not function during their replication [16]; and (3) when replicating, ssDNA virus genomes are only transiently double-stranded, such that errors, when they occur, are not efficiently repaired by host base-excision pathways [4].
Evidence is mounting that the rapid evolution of geminiviruses is, at least in part, driven by mutational processes that act specifically on ssDNA. Controlled evolution experiments involving Maize streak virus (MSV), a geminivirus in the Mastrevirus genus, have revealed a strand-specific G → T mutation bias that is possibly attributable to oxidative damage to guanines [9]. Similarly, analyses of nucleotide substitution biases in natural tomato- and cassava-infecting geminivirus isolates (in the Begomovirus genus) have, in addition to similar G → T mutation biases, identified overrepresentations of C → T and G → A transitions. These biases indicate that geminivirus DNA may experience elevated rates of spontaneous damage while in a single-stranded state [4]. Although it remains to be determined in a larger-scale study whether an excess of C → T and G → A transitions has occurred during mastrevirus evolution, all these studies are consistent with the hypothesis that viral ssDNA is subjected to greater oxidative stresses (such as oxidative deamination of guanine and cytosine or oxidation of guanine to 8-oxoguanine) than host dsDNA.
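As a purely illustrative aside, a substitution spectrum of the kind discussed above can be tallied by comparing an ancestral sequence with a derived one. The following minimal Python sketch assumes two pre-aligned sequences of equal length read from the same strand; the function name `substitution_spectrum` and the toy sequences are hypothetical and are not taken from any of the cited studies.

```python
from collections import Counter


def substitution_spectrum(ancestral: str, derived: str) -> Counter:
    """Count each type of single-nucleotide difference, e.g. 'G->T'.

    Assumes the two sequences are already aligned and of equal length
    (an assumption of this sketch, not a method from the cited work).
    """
    if len(ancestral) != len(derived):
        raise ValueError("sequences must be aligned to the same length")
    spectrum = Counter()
    for a, d in zip(ancestral.upper(), derived.upper()):
        # Skip identical sites, alignment gaps and ambiguous bases.
        if a != d and a in "ACGT" and d in "ACGT":
            spectrum[f"{a}->{d}"] += 1
    return spectrum


# Toy, made-up sequences (not real viral data).
ancestral = "ATGGCGTACGGA"
derived = "ATTGCGTATGGA"
spectrum = substitution_spectrum(ancestral, derived)
print(spectrum)
```

Because the counts are relative to the strand supplied, a strand-specific bias would only be visible in such a tally if every sequence is read from the same strand (for example, the virion strand).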
High geminivirus basal mutation rates do not, however, necessarily imply that these viruses are also evolving rapidly. Evolutionary rates reflect not only the rate at which mutations occur but also (1) the rate at which deleterious mutations are purged from a population by negative, or purifying, selection, (2) the efficiency with which advantageous adaptive mutations are fixed in a population by positive, or diversifying, selection and (3) the rate at which neutral mutations (i.e. those mutations with no effect on fitness) are fixed in or lost from a population by random genetic drift. Adopting the convention of Duffy et al., we differentiate between the biochemical or basal rate at which mutations arise (mutation rate, measured per round of genomic replication or per unit of time), and the usually slower rate at which mutations accumulate in wild populations evolving under natural selection (substitution rate, usually measured per year).
Geminiviruses have either one (monopartite, species in the Begomovirus, Mastrevirus, Topocuvirus and Curtovirus genera) or two (bipartite, species in the Begomovirus genus) ~2.7 kb genome components. These compact genomes are among the smallest of any known viruses and encode only a small number of usually multifunctional and often overlapping genes [18]. Mastreviruses such as MSV and Wheat dwarf virus (WDV), for example, express only four distinct proteins: a movement protein (MP), a coat protein (CP), a replication-associated protein (Rep) and a RepA protein, expressed from an alternative spliceform of the rep gene transcript such that it shares ~70% of its amino acid sequence with Rep [18]. The compactness of mastrevirus genomes is further emphasised by the fact that, with the exception of MP, these proteins have multiple known functions [18]. Given that many, if not most, mutations that occur in such compact genomes will be at least slightly deleterious and therefore subject to negative selection, it is expected that mastrevirus nucleotide substitution rates will be at least slightly lower than their basal mutation rates.
How much lower geminivirus substitution rates are than their basal mutation rates is currently a matter of dispute. Experimental analyses of highly adaptive point mutations [19] and mutation frequencies in genomes sampled after 30–60 days of replication within infected plants [6] imply that the basal mutation rates of geminiviruses are in excess of 10⁻³ mutations per site per year (mut/site/year). Correspondence between the phylogenies of certain mastrevirus species and those of their grass hosts has, however, prompted speculation that mastreviruses may have co-diverged with grasses and that their substitution rates may therefore be as low as 10⁻⁸ substitutions per site per year (subs/site/year; [23]) – i.e. ten thousand times lower than their basal mutation rates. It is possible that very short-term evolution experiments (<0.2 years) produce inflated estimates of long-term substitution rates, either because they are measuring adaptation (positive selection) to a novel host or because they have not allowed sufficient time for negative selection to have effectively purged mildly deleterious mutations [24]. However, the co-divergence hypothesis demands a long-term substitution rate four orders of magnitude lower than both the approximately 2 × 10⁻⁴ to 7 × 10⁻⁴ subs/site/year rates that have been estimated in short-term (<5 years) evolution experiments [7] and the longer-term (over tens of years) substitution rates estimated from temporally structured tomato- and cassava-infecting begomovirus datasets sampled from nature [4].
The ten-thousand-fold discrepancy between directly calculated geminivirus substitution rate estimates and those implied by the co-divergence hypothesis is difficult to reconcile. It has been suggested that different evolutionary forces operate over short- (less than one year), long- (tens of years) and very long-term (thousands of years) evolutionary timescales: even though point mutations rapidly accumulate in geminiviruses over observable timescales, over the millennia mastreviruses experience an almost complete absence of positive selection and neutral genetic drift, coupled with almost unfalteringly efficient negative selection [23]. This argument relies on the strange circumstance of mastrevirus species having had long co-evolutionary histories within their hosts, but without their having engaged in arms races with those hosts.
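The size of this discrepancy follows from simple arithmetic on the rates quoted above. The sketch below, in which the helper `fold_difference` is a hypothetical name introduced only for illustration, compares the directly estimated rates (2 × 10⁻⁴ and 7 × 10⁻⁴ subs/site/year) with the roughly 10⁻⁸ subs/site/year implied by co-divergence.

```python
import math


def fold_difference(observed_rate: float, implied_rate: float) -> float:
    """Return how many times larger the observed rate is than the implied one."""
    return observed_rate / implied_rate


# Rates as quoted in the text, in substitutions per site per year.
codivergence_rate = 1e-8
observed_rates = (2e-4, 7e-4)

for rate in observed_rates:
    fold = fold_difference(rate, codivergence_rate)
    # log10 of the ratio gives the gap in orders of magnitude.
    print(f"{rate:.0e} vs {codivergence_rate:.0e}: "
          f"{fold:,.0f}-fold ({math.log10(fold):.1f} orders of magnitude)")
```

Both ratios fall between 10⁴ and 10⁵, i.e. a little over four orders of magnitude.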
Here we describe a series of evolution experiments involving MSV and Sugarcane streak Réunion virus (SSRV – a mastrevirus species closely related to MSV [25]) that lasted between 6 and 32 years. Our results provide extensive additional support for the hypothesis that, as with other geminiviruses, MSV and SSRV basal mutation rates are possibly elevated by unrepaired oxidative damage inflicted on ssDNA. We additionally show that, contrary to expectations under the co-divergence hypothesis, neutral genetic drift and not negative selection appears to be a dominant process determining the fate of new mutations.