Reviewer's report 1: Dr Eugene Koonin (National Center for Biotechnology Information, USA)
This is a truly exciting paper that reports the discovery of a completely unexpected entity, an apparent hybrid between a ssDNA virus related to circoviruses and an RNA virus related to tombusviruses. This finding is of great interest on two levels. First, to my knowledge, such a chimera between RNA and DNA viruses – not only of these particular families but in general - has never been observed before. Of course, there are many examples of mixing and matching in the virus world, but somehow they so far have been confined to the same type of nucleic acid. Second, this work highlights the new route to discovery in virology – the metagenomic path. This is literally a fishing expedition, with all its advantages and drawbacks. The main advantage is the capacity to discover essentially everything that is ‘out there’, even at low abundance, without the need for the laborious and biased procedures of virus and host growth. But, here is also the severe limitation of metagenomics: neither the host nor, strictly speaking, the virus is identified to the regular standards of microbiology and virology. In any case like this, but most especially when a bizarre chimera was discovered, it is crucial to show as convincingly as possible that the presented sequence is indeed the virus genome rather than some assembly artefact or chimeric clone. I think this is done in a satisfactory manner in this paper, by inverse PCR from an independent environmental sample. So I believe this is a real virus. Moreover, it is remarkable that the closest homologues of both the Rep protein and the capsid protein were detected in other metagenomic samples, those from the GOS. It is extremely intriguing whether these represent the same kind of chimeric genomes or the proposed RNA-DNA recombination event is relatively recent, and these neighbors are the closest relatives from the respective families of RNA and DNA viruses. With the genome of BSL-RDHV released, this should not be too hard to test. 
In a more general plane, one cannot help wondering how many such unexpected wonders of the virus world await in all kinds of environments and, more practically, whether the criteria for recognizing a new virus are going to change any time soon.
I have some minor specific issues with the paper.
-The title may be construed as a bit misleading as ‘evolutionary link’ seems to imply that ssDNA virus(es) evolved from ssRNA virus(es) or vice versa. I would suggest mentioning the chimeric genome in the title itself.
Author’s response: The title has been revised.
-I am surprised by the methodology employed for building the trees (‘rough-cluster cladograms’) in Figure . Why use this crude approach instead of a regular maximum likelihood method (RAxML) and perhaps even a Bayesian method in addition? Not that I expect the result to change dramatically, but the new virus is interesting and unusual enough to justify a reasonable effort to make the phylogenetic analysis as robust as possible.
Author’s response: This section has been revised and much more extensive alignments are presented and phylogenetic analysis performed (Figures,and).
-I find the emphasis on the similarity in genome organization between the circular ssDNA virus which BSL-RDHV apparently is and ssRNA tombusviruses to be rather strange. Isn’t the similarity with circoviruses much more straightforward? To me, this looks like a circovirus in which the capsid protein was displaced by one from a tombus-like virus.
Author’s response: This has been revised throughout the text. However, we find the genome arrangement to be strikingly different from most circoviruses, thus have retained Figure.
Reviewer's report 2: Dr. Mart Krupovic (nominated by Dr. Patrick Forterre) (Institut Pasteur, France)
Diemer and Stedman report on the characterization of a putative viral genome, which was obtained in the course of a metagenomic analysis of virome samples collected at Boiling Springs Lake. The putative viral genome (BSL-RDHV) encodes four proteins, two of which share sequence similarity with proteins from previously characterized viruses. One of these proteins is related to typical superfamily II rolling circle replication initiation proteins that are abundantly found in DNA viruses and plasmids. Strikingly, the other one is most similar to capsid proteins of eukaryotic icosahedral positive-sense RNA viruses. The observation that genes for two key viral functions (virion formation and genome replication) are apparently derived from unrelated RNA and DNA viruses/replicons to form a new chimeric viral entity is exciting, although not entirely novel (see below). The findings presented in this paper substantially advance our understanding not only of the genetic diversity in the virosphere but also of the potential mechanisms responsible for the emergence of novel viral types. I therefore think that the paper is definitely worth publishing. However, some parts of the manuscript can still be improved as detailed below.
Background: This section consists of five lines praising the usefulness of metagenomics in studying virus evolution, followed by a few paragraphs that resemble Results rather than an Introduction. Given that the paper is about virus evolution, the Background section could provide some information on the current hypotheses on the origin of viruses and the mechanisms of their evolution. This would allow the readers to more fully appreciate the significance of the findings presented in the Results section. The authors might find useful the recent reviews on this subject (Dolja and Koonin, 2011; Krupovic et al., 2011; Forterre and Prangishvili, 2009). Dolja VV, Koonin EV: Common origins and host-dependent diversity of plant and animal viromes. Curr Opin Virol 2011, 1(5):322–31. Krupovic M, Prangishvili D, Hendrix RW, Bamford DH: Genomics of bacterial and archaeal viruses: dynamics within the prokaryotic virosphere. Microbiol Mol Biol Rev 2011, 75(4):610–35. Forterre P, Prangishvili D: The origin of viruses. Res Microbiol 2009, 160(7):466–72.
Author’s response: This section has been extensively revised.
Results: I. Capsid protein: Similarity of the BSL-RDHV capsid protein to those of RNA viruses appears to be highly significant (especially to the CP of Sclerophthora macrospora virus A). However, the similarity is confined to domains S and P of the RNA virus CPs, which cover only the central region of the BSL-RDHV capsid protein (residues 156–302). The BSL-RDHV CP is 542 aa. Could the authors comment on the N- and C-terminal regions of the BSL-RDHV CP, which are not shown in the alignment presented in Figure S1?
Author’s response: We have revised the text to discuss these aspects and have included tables of BLASTp hits and extensive alignments (Figures–).
Do these regions share sequence similarity to proteins in the databases? What is their predicted secondary structure? Are they likely to fold into independent functional domains? How might this affect capsid formation? In addition, the authors should provide more information on Sclerophthora macrospora virus A (SmV-A) and Plasmopara halstedii virus A (PhV-A), the two viruses sharing the highest sequence similarity with the CP of BSL-RDHV. Stating that they are unclassified ssRNA viruses is not enough. For example, what is the host range of SmV-A and PhV-A (if known), what is the genomic relationship between these viruses and tombusviruses, etc.?
Author’s response: This section has also been revised and we hope that this work will stimulate research on the under-studied SmV-A and PhV-A viruses, since they may also provide insight into the mechanism of formation of the BSL RDHV-like virus genomes.
Perhaps this information might provide some hints about the origin of BSL-RDHV? The S-P domain organization is not typical for all icosahedral (+)ssRNA viruses. The information on how widespread this CP architecture is among RNA viruses would be very interesting. Is it only found in Tombusviridae and a few unclassified viruses?
Author’s response: This S-P configuration is only known and demonstrated by X-ray crystallography in the “carmovirus-like” group of Tombusviridae.
From the alignment (Figure S1) it seems that the S domain is considerably more conserved between BSL-RDHV and tombusviruses. Does the same hold true when BSL-RDHV CP is compared with SmV-A and PhV-A only?
Author’s response: As above, this section has been considerably revised.
Besides, the S-P organization is not called “double jelly-roll configuration”, as the authors state on page 5. Double jelly roll fold is found in diverse dsDNA viruses and is structurally quite different from that of the CP of tombusviruses (Krupovic and Bamford, 2008). Krupovic M, Bamford DH: Virus evolution: how far does the double beta-barrel viral lineage extend? Nat Rev Microbiol 2008, 6(12):941–8.
Author’s response: This has been corrected.
In addition, the Qres colouring of the CP model in Figure is not very meaningful and can be eliminated.
Author’s response: We find that, since the alignment does not indicate a high degree of amino acid sequence similarity in the P domain of the CP proteins, a structural assessment is warranted to better substantiate claims of interviral transfer and homology of the BSL and S-P-type CPs of tombusviruses. That the structural congruency extends over the whole structure is best displayed with a Qres score.
II. Rep protein: The authors could briefly introduce the rolling circle replication initiation proteins (RCR Reps). RCR Reps contain three conserved motifs (not just active site Tyr): Ilyina TV, Koonin EV: Conserved sequence motifs in the initiator proteins for rolling circle DNA replication encoded by diverse replicons from eubacteria, eucaryotes and archaebacteria. Nucleic Acids Res 1992, 20(13):3279–85. Are all three motifs conserved in the BSL-RDHV? Figure S2 shows an alignment between the nuclease domains of RCR Reps from BSL-RDHV and PCV2 (by the way, the legend does not correspond to this figure). A more inclusive set of RCR Reps could be compared (and not only for the nuclease, but for the helicase domain as well).
Author’s response: See revised Figure.
In addition, the fact that there is a stem loop preceding the Rep gene does not necessarily suggest the single-stranded nature of the BSL-RDHV genome in the virion (page 6, second paragraph). dsDNA viruses also use RCR Reps for replication (e.g., corticovirus PM2).
Author’s response: While they do not completely rule out the possibility that the BSL RDHV virus harbors a double-stranded genome within the virion, the stem-loop, the sequence similarity to the PCV Rep, and the Rep structural assessment all strongly indicate a single-stranded circovirus-like genome and replication cycle. Until virions can be produced and DNA extracted for analysis, this cannot be definitively shown. Experiments to detect ssDNA in BSL samples are underway. Moreover, no sequence similarity was detected between the BSL/circoviral and PM2 Reps, and no nucleic acid sequence similarity was detected between the BSL and PM2 origins of replication, indicating that the BSL virus is not likely to be related to the PM2 corticovirus.
III. The trees: I suggest replacing the rough-clustering trees (Figure ) with corresponding alignments, since such trees are not very meaningful. Figure shows the CP tree of BSL-RDHV, tombusviruses, satellite viruses, geminiviruses and nanoviruses. The authors say that BSL-RDHV clusters with tombusviruses, “to the exclusion of capsid proteins found in ssDNA plant-infecting viruses that also encode Rep”. None of these other proteins (for which information on the structure is available) possess both S and P domains, while the information on the nanovirus CP, to my knowledge, is not available at all. It therefore makes no sense to put on the same tree proteins that might not even be homologous. The same applies to Figure , which shows the tree of RCR Reps: the similarity between the Reps of microviruses and circoviruses is confined to the three motifs of the nuclease domain (the microviral Rep also does not have the helicase domain). Supplementary files 2 (Blast scores) and 3 (accession numbers) should be combined. It would also be useful if the authors could supplement the table with the pairwise identity values.
Author’s response: This has been done.
Conclusions: “…RNA-DNA recombination has only been inferred”: Perhaps it could be mentioned here that numerous RNA virus genomes (from different families) were recently discovered in the genomes of various eukaryotic hosts, which suggests that RNA-DNA recombination might not be as uncommon as previously believed.
Author’s response: This has been added; see also the author’s response to Reviewer 3.
The authors point out “that lateral transfer of capsid genes occurred between an ancestor of ssRNA satellite viruses and a circular, ssDNA geminivirus or nanovirus during co-infection [32]”. However, what we suggested in ref [32] is that geminiviruses originated from plasmids of phytopathogenic bacteria (phytoplasma) by acquiring the capsid-coding gene from a plant-infecting RNA virus, i.e., recombination occurred between two unrelated DNA (plasmid) and RNA (virus) replicons to give rise to a novel element – the ancestor of geminiviruses. The claim that “ssRNA satellite virus capsid proteins are found exclusively in ssDNA genomes of the large and well characterized Geminiviridae and Nanoviridae families” is also not supported: (i) there is no evidence that nanoviral CP adopts the jelly-roll fold (even though this is probably true), (ii) among DNA viruses this fold is not restricted to geminiviruses as it is also found in CPs of parvoviruses and microviruses (and certain dsDNA viruses), (iii) most importantly, the single jelly roll fold is most widespread in viruses with RNA genomes (12 different families!). The suggestion that “ssRNA satellite viruses most likely acquired their capsid proteins from gemini- and nanoviruses” has no ground. The fact that “Satellite, gemini- and nanoviruses often co-infect the same hosts” per se is not a proof, especially considering that the primary partner during coinfection for ssRNA satellite viruses are other ssRNA viruses (with jelly roll CPs).
Author’s response: We have elected to remove this particular example as a possible precedent for interviral RNA-DNA recombination because the claims asserted in Krupovic et al., 2009 have not yet been substantiated. We agree that the jelly-roll fold itself probably originated in RNA viruses, and that a CP gene phylogeny indicates a common ancestry amongst the RNA satellite-, DNA gemini- and nanovirus CPs. However, we find the assertion that the gemini- and nanovirus CPs were directly and recently obtained from an RNA satellite-like virus to be speculative. While investigating the evolutionary trajectories of the jelly-roll fold and determining its ultimate origin in DNA virus groups is certainly an intriguing prospect, such an endeavour is beyond the scope of this report.
The authors prefer a scenario according to which “the capsid gene was transferred from a ssRNA virus to a ssDNA virus in the predecessor of the putative RDHV family”. However, can the authors be sure that at the origin of the RDHV ancestor was a virus and not a plasmid? In principle, the acceptor of the tombusvirus-like capsid gene could have been any kind of a replicon (e.g., a plasmid) with a circovirus-like RCR Rep. Besides, plasmids could have also been at the origin of circoviruses, as we have pointed out previously.
Author’s response: The BSL Rep protein sequence bears little resemblance to plasmid Reps, while demonstrating a substantial similarity to circovirus-like Reps. Unless there are other uncharacterized plasmids with circovirus-like Reps, the data indicate that it is more likely that the recombination occurred in a circovirus-like genome. While it is conceivable that circoviruses ultimately originated from plasmids, the low level of sequence divergence between the BSL RDHV Rep, CP and other related proteins indicates a recent acquisition of the CP protein by an already circovirus-like ancestor. The alternative hypothesis would require the convergent evolution of the BSL and tombusvirus-like CPs, which we consider highly unlikely.
Last paragraph of the Conclusions: In my opinion, it is an overstatement to say that the observations presented in this paper implicate viruses in the transition from the RNA-World to the DNA World.
Author’s response: This section of the conclusion has been modified for clarity, but we would like to confirm our difference of opinion on this subject.
However, I certainly agree that the findings “extend the modular theory of virus evolution to encompass a much broader range of possibilities”. What I also find intriguing about such chimeric viruses is how their discovery might impact our views on the timeline of virus origins as well as our attempts to devise higher levels of virus classification. It is often assumed that viruses emerged around the same time or even before the cellular organisms while the possibility that new groups of viruses might be emerging in the contemporary biosphere is rarely discussed. Building on the hypothesis by Koonin and Ilyina (1992), we have suggested that geminiviruses might represent one such group of “new” viruses [32]. Koonin EV, Ilyina TV: Geminivirus replication proteins are related to prokaryotic plasmid rolling circle DNA replication initiator proteins. J Gen Virol 1992, 73:2763–6. RDHV might be an even more convincing example in support of the on-going emergence of novel virus groups from pre-existing mobile genetic elements (viruses and plasmids).
Author’s response: We very much agree with your assessment.
For higher-order virus classification, I personally favour the capsido-centric view (Krupovic and Bamford, 2009, 2010), according to which determinants for virion architecture are inherited in a given viral group from their common ancestor, while genetic determinants for other functional modules (e.g., for genome replication proteins) move relatively freely in and out of these viral genomes. In other words, the movement of functional modules occurs relative to the capsid-encoding genes. Krupovic M, Bamford DH: Does the evolution of viral polymerases reflect the origin and evolution of viruses? Nat Rev Microbiol 2009, 7(3):250. Krupovic M, Bamford DH: Order to the viral universe. J Virol 2010, 84(24):12476–9. In contrast, according to another line of thought, different functional modules in the viral genomes deserve equal weight when considering relationships between viruses: Koonin EV, Wolf YI, Nagasaki K, Dolja VV: The complexity of the virus world. Nat Rev Microbiol 2009, 7(3):250. Lawrence JG, Hatfull GF, Hendrix RW: Imbroglios of viral taxonomy: genetic exchange and failings of phenetic approaches. J Bacteriol 2002, 184(17):4891–905. Therefore, depending on the viewpoint, RDHV can be considered a relative of tombusviruses that had its original genome replication machinery (RdRp) replaced with a gene for RCR Rep. On the other hand, it might also be seen as a circovirus in which the ancestral CP gene was replaced by a gene from a tombusvirus. What do the authors think about the classification (and affiliation to the existing viral taxa) of RDHV and other chimeric viruses, which are likely to be discovered in the future?
Author’s response: These points are highly intriguing to consider and this commentary is very much appreciated. First, the continued use of metagenomics promises to have a marked effect on current schemes of virus taxonomy. We may only guess at what effects the BSL RDHV virus and its relatives will have on these taxonomic frameworks. Second, this issue pertaining to the trajectories of the Rep and CP modules brings to the fore an important question regarding the origin of the linear and circular ssDNA viruses. It is unlikely that the BSL RDHV-like genome evolved incrementally from an RdRp-containing RNA virus. However, the notion that linear and circular ssDNA viruses first evolved from ssRNA viruses in such a manner, first by conversion to DNA and then via the acquisition of an RCRE domain (the Rep S3H domain also being derived from an RNA virus), as opposed to having emerged largely via modular exchanges is certainly a topic very worthy of investigation.
Reviewer's report 3: Dr. Arcady Mushegian (University of Kansas School of Medicine, USA)
The manuscript by Diemer and Stedman reports the existence of a new virus, characterized by a circular single-strand DNA genome and a novel configuration of two genes, i.e., 1. a nanovirus- or circovirus-like replication protein with the usual predicted DNA-nicking and NTPase domains and 2. a jelly-roll capsid protein clearly related to capsid proteins of positive-strand RNA viruses (Tombusviridae) and two unclassified RNA viruses of fungi. The experiments indicate that the metagenomic sample from the hot lake contains the full circular genome, and that similar genomes most likely exist in the ocean samples (in that case, their circular form was not shown, but most likely will be). This is a fascinating discovery of a novel virus group, suggesting an ancient act of exchange of genetic material between RNA and DNA virus genomes. I fully support the publication of this study, but must request that some of the more sweeping statements in the paper be moderated, in order to better agree with the evidence. Abstract: “little is known about their collective origin and evolutionary history” --- see next comment. Ibid. “it is not currently possible to determine whether the principal virus groups arose independently, or whether they have a shared evolutionary history” --- the hypothesis that RNA viruses arose before the advent of DNA genomes, when the protein-encoding genomes were made of RNA, is not unreasonable. This would argue for the independent, or at least separate in time, origins of DNA and RNA virus genomes. Therefore, the word 'collective' in the first sentence is doing some heavy lifting that it probably should not. On the other hand, retro-transcribing viruses and RNA viruses seem to satisfy anyone's definition of two 'principal virus groups', and yet there is plenty of evidence that they have a shared evolutionary history, at least in their replication enzyme.
Author’s response: This section has been extensively revised.
Ibid. “no mechanism for RNA-DNA recombination has yet been identified” --- what about retrohoming of group II introns?
Author’s response: The following passage was added to the conclusions section based on the suggestions made by Mushegian and Krupovic: “The presence of non-retroviral RNA virus genes in cellular genomes suggests that some cellular mechanism exists that allows RNA-DNA recombination in lieu of a virus-derived RT. Although the group II intron retro-homing phenomenon and transposon-mediated exchanges have not been observed to mediate interviral lateral gene transfer, these or similar host cell-based mechanisms may have facilitated the formation of the BSL RDHV-like viruses.”
p. 5: The moniker “RNA-DNA hybrid virus” (RDHV) must go. This is a thoroughly misleading name. The authors show abundant evidence of a circovirus-like or nanovirus-like virus with a single-strand DNA genome that, in the past, has acquired a capsid protein from an RNA virus. Nonetheless, it is a DNA virus now. This is not even the first example of that kind of mosaicism – BL1/BC1 proteins of bipartite geminiviruses are similar to the 30K family of movement proteins of plant RNA viruses, but no one calls bipartite geminiviruses “DNA-RNA viruses” because of that. RNA genomes of closteroviruses encode homologs of cellular HSP70 proteins, but these viruses are not RNA-DNA viruses either. A descriptive name such as “Boiling Spring Lake Virus 1” or something of this kind should do just fine. Note that this objection to “RDHV” is not a nomenclature war, but rather aims at setting the molecular record straight.
Author’s response: The moniker “RDHV” is mentioned in the text as provisional. We feel that a succinct descriptive name for this new virus genome-type is warranted, at least temporarily. Other conceivable names seem insufficient to describe a novel and probably wide-spread virus group and its ancestry, and would be significantly more confusing or excessively complicated (e.g. “a Boiling Springs Lake virus from the Sargasso Sea”). We completely agree that the genome discovered represents a DNA virus. Once we have identified the host and/or structure for the virus we will propose a taxonomically appropriate name through the ICTV (and let the nomenclature wars rage).
p. 5 and later: I am sure that there is a straightforward sequence-similarity argument on the evolutionary relatedness of “RDHV” capsid protein and tombusviruses. I could obtain statistically significant similarity between the former and the latter by PSI-BLAST and HHPred approaches. I recommend that the authors do the same. Instead, we are reading “The predicted structure of the BSL RDHV capsid protein is congruent to the S-P domain double jelly-roll configuration found in the ssRNA Tomato Bushy Stunt (TBSV) and Melon Necrotic Spot (MNSV) tombusviruses [12]. Amino acid sequences are moderately conserved amongst the three proteins based on BLOSUM80 [14], while percent sequence identity is low (Figure ) (see Additional Figure for alignment).” This is ambiguous: if the sequence-similarity/database search statistics arguments (not the same as sequence identity!) are not sufficient to establish the evolutionarily significant similarity, then there is no basis for threading and structure modeling; and if sequence-similarity arguments were used, why not say so?
Author’s response: This section has been extensively revised and Figures added (Figures–).
p. 7: “The most parsimonious scenario” --- more parsimonious than which other scenarios?
Author’s response: This section has also been revised. See reply to Krupovic in regard to the origin of linear and circular ssDNA viruses.
pp. 7–8: Several mentions of satellite RNA viruses seem out of place – tombusviruses are not satellites and neither are fungal viruses discussed in the paper?
Author’s response: These references have been clarified.
pp. 8–9: (last paragraph of the paper) “Assuming that RNA viruses evolutionarily preceded all DNA virus groups [33], evidence of gene transfer from RNA to DNA viruses complements the RNA-first theory [35].” --- I do not understand what this means. First, if we assume that RNA viruses evolutionarily preceded all DNA virus groups, then we do have a partial answer to the question that was said to be currently impossible to answer in the Abstract (see above). Second, “to complement” more or less means to provide a missing part or an additional, compatible line of argument, correct? I am not sure what the virus described in this study has to do with the evolutionary precedence of RNA viruses over DNA viruses: surely, in order for this virus to emerge, both RNA viruses and DNA viruses have to be around already?
Author’s response: This final paragraph has been revised and clarified.