To test the feasibility of virus identification using DNase SISPA-next generation sequencing (NGS) and to obtain a first estimate of its sensitivity on field samples, we selected both strong and weak positive SBV-infected field samples. For viral RNA quantification, in vitro transcribed RNA was used to generate a standard curve for the previously described L gene real-time RT-PCR. The curve was linear from at least 2.75 to 7.75 log10 RNA copies per µL, with a detection limit below 2.75 log10 RNA copies per µL.
Two field samples from aborted lambs, with Cp values of 20.59 and 20.65 corresponding to 7.66 and 7.64 log10 RNA copies per µL, allowed unambiguous identification of Schmallenberg virus. One sample (BE/12-2478) yielded 2 S segment sequences (covering about 13% of the S segment), 81 M segment sequences (covering about 56% of the M segment) and 109 L segment sequences (covering about 81% of the L segment). The other strong positive sample (BE/12-2068) yielded a total of 8 SBV-specific sequence reads distributed over the L and M genomic segments. This difference in sequence coverage between two samples with a comparable SBV RNA load is most likely due to the difference in the amount of raw sequence data. While sequencing of BE/12-2478 yielded about 95000 reads, BE/12-2068 yielded only about 25000 reads, probably because of DNA library quantification issues at the sequencing facility. Moreover, more tissue was available for the DNase SISPA protocol from sample BE/12-2478, although the ratio of viral reads to total raw reads was higher in sample BE/12-2068 (0.01) than in sample BE/12-2478 (0.002).
| Table 1. SBV virus quantification and confirmation by DNase SISPA-NGS in selected field samples from Belgium. |
| Table 2. Output of the metagenomic analysis on raw sequence data from the sequencing libraries from SBV-positive samples. |
Three weak positive field samples (one from an aborted calf, two from aborted lambs) containing virus quantities equivalent to 4.27–4.63 log10 RNA copies per µL did not allow identification using DNase SISPA-NGS. This is consistent with an approximate sensitivity of 10⁴–10⁶ virions per ml estimated in previous studies using in vitro virus dilutions [11] or tissue biopsy samples [12]. However, it should be noted that the exact ratio between viral RNA quantities and intact virion quantities (the functional unit detected by this method) in field samples remains to be determined. Our preliminary data indicate that RNA extracted from an SBV isolate containing 10⁵ TCID50/ml may contain up to 8.46 log10 RNA copies per µL. Caution is therefore needed when interpreting RNA quantities in terms of DNase SISPA-NGS sensitivity, which is determined by the amount of viral nucleic acid that remains protected in intact virions during nuclease treatment. Moreover, comparing approximate sensitivity with other viral discovery methods is almost impossible, as the sequencing effort varies from a few hundred Sanger sequencing reads [11], [13] to about 30 million Illumina GAII reads [12], and different sample types (targeted tissue selection, freshness of the sample) may result in different levels of host and contaminating nucleic acids. Given the limited approximate sensitivity of DNase SISPA-NGS, as with any virus discovery method, careful selection of properly targeted and fresh field samples is necessary.
As expected in any metagenomic approach, a considerable part of the sequence reads represented diverse bacterial species and host nucleic acids. It should be noted that the field samples were stored for a considerable time before pretreatment and RNA extraction, during which opportunistic bacteria probably grew in the samples. Although we used a 0.22 µm filter to remove remaining cell fragments and bacteria, nucleic acids from disrupted cells can pass through the filter.
The other viral reads were identified mainly as bacteriophages belonging to the families Myoviridae, Siphoviridae and Podoviridae. Three reads showed partial similarity to a virus belonging to the Phycodnaviridae and two reads showed some similarity to viruses of the Mimiviridae family. None of these viruses has relatives known to infect animals. These sequences most likely represent contamination of the tissue samples during storage before analysis.
Compared to the metagenomic approach used by Hoffmann and colleagues [1], who initially identified this novel Orthobunyavirus by shotgun sequencing of total RNA extracted from clinical samples, our virus discovery protocol attempts to enrich viral nucleic acids by selective filtration and nuclease treatment. A direct comparison between the two data sets is impossible, as we treated limited amounts of tissue samples from a different host species. Moreover, the sequencing effort per library was not identical. Both studies indicate that high sequence throughput and proper sample selection are critical factors for successful virus discovery using metagenomics.
The partial SBV sequence information we obtained was compared to the whole genome sequence determined from the virus originally isolated from diseased bovines in Germany. Several coding and noncoding mutations were observed. Together, the partial data from the two Belgian ovine field samples showed 16 nucleotide differences (9 of which were well supported by the sequence data), corresponding to 9 amino acid differences (5 of which were well supported). Although this can be expected for an RNA virus that has now spread throughout a large part of Western Europe, this is to our knowledge the first documentation of genetic diversity within the Schmallenberg virus outbreak. Based on the 8134 nucleotides in common between our partial sequence (BE/12-2478) and the genome of the virus isolated from diseased bovines in Germany (accession codes HE649912-HE649914), a mutation frequency of 1.7 × 10⁻³ mutations per site was observed. The time frame between the two samples was about three months and the geographical distance between the sampling sites roughly 325 km. Although based on only a three-month period, the observed mutation frequency is within the documented range of Bunyavirus variability (10⁻² to 10⁻⁴ mutations/site/year documented for Hantavirus; [14]) and within the documented range of RNA virus variability (10⁻³ to 10⁻⁴ mutations/site/year; [15], [16]). It should be noted that, while the samples in Germany were taken from acutely infected bovines, the ovine samples from Belgium are from aborted lambs, making it impossible to estimate the time of infection of the maternal animal. Future targeted molecular epidemiological studies including samples from the complete geographic range of the virus may shed light on the origin and time of introduction of this novel virus in Europe.
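The per-site mutation frequency quoted above is a simple ratio of differing positions to positions compared. The following minimal Python sketch illustrates that arithmetic using only the two figures given in the text (8134 shared nucleotides and 1.7 × 10⁻³ mutations per site). The function names are hypothetical, and the rounded count of differing positions is back-calculated here as an assumption; it is not a value stated in the text.

```python
import math

# Figures quoted in the text.
SITES_COMPARED = 8134        # nucleotides shared by the two sequences
REPORTED_FREQUENCY = 1.7e-3  # mutations per site


def mutation_frequency(differences: int, sites: int) -> float:
    """Per-site frequency: differing positions divided by positions compared."""
    if sites <= 0:
        raise ValueError("sites must be positive")
    return differences / sites


def implied_differences(frequency: float, sites: int) -> float:
    """Number of differing positions implied by a per-site frequency."""
    return frequency * sites


def order_of_magnitude(value: float) -> int:
    """Power of ten of the leading digit of a positive value."""
    return math.floor(math.log10(value))


if __name__ == "__main__":
    implied = implied_differences(REPORTED_FREQUENCY, SITES_COMPARED)
    # Assumption: the rounded count is back-calculated, not stated in the text.
    assumed_count = round(implied)
    print(f"implied differences: {implied:.1f}")
    print(f"frequency from {assumed_count} differences: "
          f"{mutation_frequency(assumed_count, SITES_COMPARED):.2e}")
    print(f"order of magnitude: 10^{order_of_magnitude(REPORTED_FREQUENCY)}")
```

Note that this is a frequency accumulated between two samples taken about three months apart, whereas the literature ranges are expressed per site per year, so the comparison is only approximate.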
| Table 3. Differences observed in Belgian SBV sequences in comparison with the German genome sequence of isolate BH80/11-4. |
Our data show that DNase SISPA-NGS viral discovery technology can be used on limited amounts of field tissue samples to identify emerging diseases. However, the sensitivity of the method seems to limit its applicability to samples containing about 10⁴ to 10⁶ virions per ml. Consequently, when applying this methodology to a cluster of cases of an undiagnosed disease, it is important to select properly targeted and fresh samples as well as to test multiple diseased animals to allow correct identification of an associated virus.