Reviewer's report 1
Dr. Arcady R. Mushegian (Stowers Institute for Medical Research, Kansas City, MO, USA)
1. I agree that there seems to be no evidence of complete virus genomes integrated into Vitis genome, but what is the evidence that none of the virus reverse transcriptase-related inserts encode a complete ORF? Also, I have searched the NR protein database with the movement protein sequences (ORF1 in caulimoviruses) and can see many predicted proteins in Vitis – perhaps some of them may be expressed too?
2. The authors used a 30% identity cutoff to distinguish between pararetrovirus-like sequences and retrotransposons that also have reverse transcriptases (and often gag/ORF4-like sequences in addition). Perhaps using the aforementioned movement proteins as a marker would be more specific. It gives me an impression of a much higher copy number of such inserts in grapevine compared to other completely sequenced genomes. A putative selective advantage, i.e., harnessing the spuriously transcribed inserts to protect from viruses, may perhaps explain the persistence of some inserts but not necessarily this high copy number. Can the accumulation of virus-like inserts in Vitis be a consequence of perennial lifestyle and clonal propagation (non-integrated viruses are typically excluded from seeds, limiting exposure in annuals)? I searched the whole-genome assembly of poplar P. trichocarpa for sequences related to caulimovirus movement proteins, and there appear to be two dozen matches at least, which seems compatible with this hypothesis. Discuss?
Minor: Ln 69–70 "unknown" includes "extinct" in this context, doesn't it?
The manuscript was largely rewritten in response to these very useful comments, with an emphasis on the potential selective advantage of the pararetroviral inserts for perennial plants. The final version of the work includes a preliminary survey of the viral inserts not only in the grapevine genome, but also in the poplar genome. In accord with the reviewer's proposal, we have also assessed the presence of inserts related to the pararetroviral movement protein gene. As to the terms 'unknown' versus 'extinct', the former means extant yet unidentified viruses, whereas the latter applies to viruses that no longer exist as functional, infectious entities.
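To make the role of an identity cutoff such as the 30% threshold raised above more concrete, the following minimal Python sketch computes percent identity for two pre-aligned protein sequences and applies a cutoff. It is an illustration only: the function names, the toy sequences, and the choice to count identity over ungapped alignment columns are assumptions, not the procedure used in the manuscript.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over alignment columns where neither sequence has a gap.

    Assumes both strings come from the same pairwise alignment
    (equal length, '-' marking gaps).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


def passes_cutoff(aligned_a: str, aligned_b: str, cutoff: float = 30.0) -> bool:
    """True if the pair reaches the (assumed) percent-identity cutoff."""
    return percent_identity(aligned_a, aligned_b) >= cutoff


if __name__ == "__main__":
    # Toy aligned fragments, invented for illustration.
    query = "MKV-LDEG"
    subject = "MKIALD-G"
    print(round(percent_identity(query, subject), 1), passes_cutoff(query, subject))
```

A cutoff of this kind says nothing by itself about statistical significance; in practice it would be read together with the alignment length and the E-value reported by the search program.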
1. Ln 38 in Abstract: change "active viral integration mechanisms" to "virus-encoded integrases".
2. Ln 41. Delete "Bioinformatics"?
3. Ln 44. Change "caulimo-" to "caulimoviruses". Can we confidently say that these sequences are not from other groups of Caulimoviridae?
Ln 73 "DNA-containing" is a bit ambiguous, change to "virion DNA"?
Ln 89 change to "made the pararetroviruses.....woody plants exceptionally rare if not extinct".
Ln 92 and elsewhere: change "homology hits" and "hits" to "sequence matches" or "matches".
Lns 93–94 "hits with the putative viral nucleotide sequences" vs. "at the protein level": what was compared to what – details?
Lns 99–101: movement proteins are usually encoded on the same (35S) transcript as reverse transcriptases, so perhaps the fact that the two classes of matches are found separately is another indication that essentially random fragments of viral mRNA are incorporated into essentially random genomic locations (cf. Lns 144–146)? Also, change "less significant" to "lower" and delete "therefore".
Ln 104: explain the significance of the 30% identity cutoff: was it used to exclude retroelements, and how do we know this has been accomplished?
Ln 108: change "apparently derived" to "that were reported to originate"
Lns 112–113 and Ln 125: delete "(not shown)".
Delete Ln 120.
Ln 136: delete "exceedingly"?
Ln 139: put "potential" in front of "origins" or, better, delete the word.
Lns 140–141: the sentence seems to be redundant with the following one
Ln 160: change "no" to "hardly any"
Ln 180: change "the most parsimonious" to "one" – I am not sure how much more parsimonious is this hypothesis over any other.
Ln 195: consider deleting "reverse-transcribing".
Ln 198: virtual reality is not a reality; is virtual extinction an extinction?
Lns 206–207. More potent than what? And what is "more sophisticated"?
For Discussion: badnaviruses are pararetroviruses, and yet they infect trees and shrubs (cacao, raspberry, spirea...).
We have accommodated virtually (meaning 'nearly') all editorial changes proposed by Dr. Mushegian.
He has also raised an important discussion point: if, according to our hypothesis, woody plants such as grapevine evolved resistance to pararetroviruses via exaptation of pararetroviral inserts, why do fully infectious pararetroviruses occur in at least some species of woody plants? Certainly, this implies that our hypothesis is not universally applicable to all woody plants. This, however, is hardly a surprise given that 'woodiness' has evolved independently in many families of gymnosperms and angiosperms that likely had varying initial levels of both exposure and resistance to diverse virus lineages. It is also possible that the early domestication and vegetative propagation of grapevine have increased the virus pressure and accelerated the emergence of resistance to pararetroviruses.
Reviewer's report 2
Dr. I. King Jordan (Georgia Institute of Technology, Atlanta, GA, USA)
The authors of this paper report the discovery of partial reverse transcriptase encoding open reading frames, of apparent pararetroviral origins, in two grapevine genomes. The authors also demonstrate that several previous claims for the positive strand RNA viral origins of grapevine genome sequences actually represent experimental and/or annotation artifacts. The discovery of pararetroviral genomes is interesting since grapevine plants do not appear to be subject to infection by pararetroviruses, unlike numerous herbaceous plants. The authors hypothesize that the pararetroviral sequence inserts reported here confer immunity to pararetrovirus infection via an RNA interference (RNAi) like mechanism based on stochastic transcription of the integrated viral sequences.
This is an intriguing hypothesis and the authors lay out two lines of research that can be used to test their idea: 1) a census of all grapevine viruses using metagenomics and 2) an investigation of the mechanisms of RNAi in woody plants. I would like to suggest two other tests of their hypothesis, which while less direct may be easier to carry out.
First, it would be interesting to know if these pararetroviral sequence inserts actually encode small RNAs and/or if they are expressed as RNA at all. This could be addressed computationally as with the work presented here. For instance, are there small RNA libraries for grapevine that could be queried? Are there ESTs that support the expression of these pararetroviral like inserts? This could also be assessed experimentally with RT-PCR for example.
Second, the authors suggest that the presence of pararetroviral sequences in the grapevine genomes is consistent with the idea that 'heritable maintenance of pararetroviral sequences has potential benefits for the host plants.' If this is indeed the case, then one may expect that the pararetroviral sequence inserts are conserved over evolutionary time. As with the expression of pararetroviral sequences, this point could be addressed computationally and/or experimentally. Two genome sequences are analyzed here but it is unclear if the pararetroviral sequences discovered are conserved at orthologous positions in the two genomes. If so, are these sequences conserved more or less than protein coding gene sequences between the plants? A PCR survey of multiple Vitaceae related strains and species could be conducted to look for orthologous pararetroviral sequences as was done for the previously misidentified GLRaV-8 sequences.
The defense hypothesis would seem to suggest that the pararetroviral sequence inserts uncovered here are still effective at guarding against infection by pararetroviruses. Yet the authors propose that the pararetroviral sequences in the grapevine are "derived from the currently extinct, grapevine-specific pararetroviruses." Given the need for sequence identity between small RNAs and RNA/DNA targets in RNAi systems, how is it that these ancient inserts could still be effective at maintaining immunity against pararetrovirus insertions?
Integrated pararetrovirus sequences are also found in the genomes of herbaceous plants, but herbaceous plants are susceptible to pararetrovirus infection. The authors propose that long-lived woody plants, such as grapevine, evolved more vigorous RNAi defense mechanisms than more short lived herbaceous plants. Is it known that herbaceous plants mount less effective responses to foreign agents in general? Are there any other lines of evidence or references in support of this idea?
From a technical perspective, it would help to have a bit more detail on the methods of sequence analysis used here. It is not possible to understand what kind of analysis was conducted based on the information provided. For instance, what program was used to compare sequences? Were comparisons done using nucleotide (as stated) or protein (as implied) sequences or both? In addition, the authors refer to "highly significant similarity" between grapevine genome sequences and pararetroviruses but no statistics are shown.
Minor point: It would be helpful if the URL listed for the Grape Genome Browser pointed straight to the browser (http://www.genoscope.cns.fr/vitis) as opposed to the front page of Genoscope.
We very much appreciate Dr. Jordan's idea to investigate, either computationally or experimentally, the possible transcription of the pararetroviral inserts in grapevine into siRNAs or other small RNAs that may enable an RNAi-based antipararetroviral response. Although we are not aware of grapevine small RNA or EST databases, deep sequencing of the grapevine transcriptome in general, and of small RNAs in particular, would be certain to generate the required data. Similarly, testing conservation of the pararetroviral inserts in different grapevine varieties and species is a promising idea that can reveal when the virus sequence insertions occurred relative to the diversification of the family Vitaceae or the genus Vitis. In fact, this will be more doable when several homozygous grapevine genomes are available. The existing genome sequences of two cultured variants of Pinot Noir are more problematic to compare because one of these is highly homozygous whereas the other is highly heterozygous.
The question of how pararetroviral inserts may still be effective against the challenge of extant viruses if they are derived from now extinct viruses is very intriguing. One possible scenario is that there are no pararetroviruses left that are capable of infecting grapevine. This would be analogous to the extinction of the smallpox virus due to the global vaccination program. If this were true, the pararetroviral inserts themselves could have lost their selective advantage and would gradually deteriorate. An alternative scenario is that grapevine-specific pararetroviruses closely related to the viral inserts found in the grapevine genome still lurk in wild plant host species. If this were the case, gradual sequence evolution of such viruses could result in eventual escape from the antiviral control mediated by the existing inserts. The census of extant pararetroviruses, as well as the investigation of viral insert-derived small RNAs discussed above, could help to distinguish between these scenarios.
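The dependence of RNAi on sequence identity, which underlies the escape scenario above, can be illustrated with a small Python sketch that lists the exact k-nucleotide stretches shared between an insert and a viral sequence. This is a deliberately crude proxy: the function names are hypothetical, the default of 21 nt is an assumed small-RNA length, and the requirement of perfect identity is a simplification, since real targeting rules are more nuanced.

```python
def kmers(seq: str, k: int = 21) -> set:
    """All distinct substrings of length k (empty set if seq is shorter than k)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    table = str.maketrans("ACGT", "TGCA")
    return seq.upper().translate(table)[::-1]


def shared_exact_kmers(insert_seq: str, virus_seq: str, k: int = 21) -> set:
    """k-mers from either strand of the insert that occur exactly in the virus sequence."""
    virus = kmers(virus_seq, k)
    insert = kmers(insert_seq, k) | kmers(reverse_complement(insert_seq), k)
    return insert & virus


if __name__ == "__main__":
    # Toy sequences, invented for illustration; a short k keeps the example readable.
    insert = "AAAACCCC"
    virus = "GGGGTTTT"  # reverse complement of the toy insert
    print(sorted(shared_exact_kmers(insert, virus, k=4)))
```

Under this simplification, a diverging virus would share progressively fewer exact k-mers with the insert, which is one way to picture the gradual escape described in the preceding paragraph.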
The next question concerns the evidence for more vigorous antiviral defenses in woody versus herbaceous plants. To the best of our knowledge, this remains just a plausible hypothesis. The only circumstantial supporting evidence we are aware of is that many RNA viruses infecting woody or perennial plants, but not those infecting annual herbaceous plants, have acquired an AlkB domain apparently involved in repairing methylation damage to the viral RNA (ref. [31]). This observation can be interpreted to suggest that woody plants have evolved an additional line of antiviral defense via targeted methylation of viral RNAs. This possibility, as well as a better understanding of the RNAi machinery in woody plants, are very promising directions for future research.
We have also made several modifications to account for the raised technical issues.
Reviewer's report 3
Dr. Eugene V. Koonin (National Institutes of Health, Bethesda, MD, USA)
In this interesting Hypothesis paper Bertsch et al. demonstrate the presence in the grapevine genome of multiple sequence segments homologous to parts of pararetrovirus genomes and hypothesize that these integrated virus-derived sequences confer resistance to the respective viruses via an RNAi mechanism. The observation is not entirely novel because integrated pararetrovirus sequences have been reported previously in other plant genomes (petunia and rice) as duly cited in this manuscript. There is, however, a very interesting new point here, namely, that those other, herbaceous plants are susceptible to pararetroviruses, whereas no pararetroviruses of grapevine are known in spite of exhaustive virological study of this plant. So the authors reasonably conjecture that the RNAi-based defense mechanisms are particularly important and hence especially efficient in a woody plant like grapevine. In addition to these findings and ideas, this work puts to rest the previous erroneous reports on the presence, in the grapevine genome, of segments homologous to certain positive-strand RNA viruses. Unlike pararetroviruses, positive-strand RNA viruses have no reverse transcription step in their reproduction, so the possibility of integration is dubious – and would be a sensation of sorts if confirmed. I think it is important to dispel such myths.
This is not an earth-shattering discovery but the paper is interesting, has valuable biological implications, and will attract the reader's attention to a remarkable phenomenon. I would like to make two comments, one of a fundamental nature, the other one more on the technical/presentational side.
1. The authors suggest that there are no extant pararetroviruses of grapevine and moreover that the virus-derived sequences in the grapevine genome represent extinct viruses. I think one should be more cautious on these issues because known RNAi-based antiviral defense mechanisms require exact complementarity between the siRNA and the target. So if those target viruses are indeed extinct, that extinction must have been a recent event. Else, the viruses still might be around but cannot replicate in grapevine owing to the interference from the endogenous homologous sequences. This deserves a more careful and nuanced discussion. Along more or less related lines, I would be quite interested to know whether or not the viral inserts are homologous in other grapevine varietals. Is there any chance to find out?
2. I am not sure the current manuscript is documented/illustrated as fully as possible. Table is a rather sketchy characterization of the viral-like sequences in the grape genome. I would rather see at least a couple of alignments, and perhaps, a schematic showing the chromosomal location of these sequences. Is there anything unusual in their genomic surroundings? Any ideas on how they could be expressed?
We are grateful to Dr. Koonin for his incisive comments. The first of these comments resonates with one by Dr. Jordan. Indeed, it is perfectly feasible that pararetroviruses that were formerly capable of infecting grapevine are currently surviving in different host plants. It will also be important to learn how conserved the pararetroviral inserts are in a wide variety of grapevine cultivars, and if and how they are expressed. However, we agree that both the acquisition of pararetroviral inserts by grapevine and the apparent extinction of the parental pararetroviruses could be relatively recent events, perhaps associated with the domestication of grapevine in the Caucasus 8–10 thousand years ago.
As requested, we have included a multiple alignment of the conserved regions of the pararetroviral inserts with those in the most closely related infectious pararetroviruses (Fig. ) and a chromosomal map showing location of the inserts.