Viral Culture, Immunoassay, and Molecular Analyses
Inoculation of the clinical specimens led to the development of cytopathic effect in C6/36 cells and Vero cells. Viral cultures of both isolates (IQE7620 and IQE7743) were reactive with CARV and MURV antibodies in standard, indirect immunofluorescence assays, indicating the presence of an orthobunyavirus in the clinical isolates. Amplification of viral RNA was attempted using previously described primer pairs targeting S and M segment sequences; however, no cDNA was generated, despite repeated attempts under varied RT-PCR and PCR conditions. The inconclusive results suggested the probable presence of an orthobunyavirus with a novel genome sequence, sufficiently different from published sequences to escape amplification with the existing primers.
Viral Genome Sequence Identification by Pyrosequencing
A new approach was attempted to overcome the technical difficulty in identifying the viral genome RNA. The approach comprised two major experimental processes: (1) First Strand cDNA synthesis by random reverse transcription, followed by random PCR amplification for synthesis of dsDNA amplicons (random RT-PCR), and (2) ultrahigh-throughput pyrosequencing of the amplicons. Two random RT-PCR methods were tested for comparison. Both the modified, anchored, random RT-PCR and NuGEN Technologies' Ovation RNA-Seq system were productive in amplification, but the anchored, random RT-PCR approach generated larger fragments than NuGEN Technologies' method (Figure 1A). The result was not surprising, as NuGEN Technologies' Ovation RNA-Seq system was developed mainly for downstream application on the Illumina next-generation sequencing platform, which optimally requires small DNA fragments for its short-read sequencing. Anchored RT-PCR generated amplicons with a broad size distribution from 200 bp to >1 kb. No prominent DNA bands were observed, suggesting that the amplification was unbiased. The relatively larger amplicons are suited for efficient and productive pyrosequencing on the Roche GS FLX Titanium system, which typically generates read lengths of ~350 bp. The small genome segments of many RNA viruses are barely 1 kb or shorter, often resulting in small fragments in random RT-PCR amplification. Therefore, in most cases, we chose to bypass a size-selection or small-fragment removal step to avoid biased selection and loss of sequences from small genome segments or, in particular, from structural regions that are amplified less efficiently.
However, in emulsion PCR, the small fragments in the amplicon library were amplified on the sequencing beads much more efficiently than the large fragments. This caused highly variable signal intensity from well to well during pyrosequencing and, in turn, poor raw images and low-quality sequencing reads. Among several feasible approaches, we chose to suppress emulsion PCR amplification of small fragments by reducing the primer concentration by 75%. As a result, from a pyrosequencing run with multiple samples on an eight-region PicoTiter Plate (PTP), we obtained ~150,000 reads totaling 30 Mb of sequence for isolate IQE7620 from an area equivalent to 15.6% of a PTP.
FIGURE 1 Random RT-PCR amplification of total RNA extract and PCR amplification of RL library for quality assessment. (A) Total RNA extract from IQE7620 viral culture supernatant was amplified using an anchored, random amplification method (Lane 1) or NuGEN Technologies' ...
To determine whether an orthobunyavirus sequence was present, we used an unbiased approach: all individual reads were first compared against the nucleotide collection (nr/nt) in GenBank using mpiBLAST, as described above. The BLAST hits were then examined for viral sequences using a Perl script developed in-house. Thousands of viral sequence hits were identified, mostly belonging to the genus Orthobunyavirus, with the highest sequence similarity to CARV [15].
Approximately 65% identity was observed with the CARV L segment, which was too low for the CARV genome sequence to serve as a reference for mapping the raw reads with the GSMapper software to assemble the genome sequence of IQE7620. This result suggested that IQE7620 might be a novel orthobunyavirus and that a de novo approach was needed to assemble its genome sequence.
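The hit-screening step can be illustrated with a minimal Python sketch. It is not the in-house Perl script, whose criteria are not described here. It assumes a hypothetical tab-separated hit table with four columns (read ID, subject ID, percent identity, subject title), a hypothetical keyword list, and invented toy rows.

```python
import csv
import io
from collections import Counter

# Hypothetical keyword list; the criteria of the in-house script are not given.
VIRAL_KEYWORDS = ("virus", "viral")


def viral_hits(hit_table, keywords=VIRAL_KEYWORDS):
    """Yield (read_id, subject_title, percent_identity) for viral-looking hits.

    Assumed column order (tab-separated): read id, subject id,
    percent identity, subject title.
    """
    for row in csv.reader(hit_table, delimiter="\t"):
        if len(row) < 4:
            continue  # skip malformed or empty lines
        read_id, _subject_id, pident, title = row[:4]
        if any(k in title.lower() for k in keywords):
            yield read_id, title, float(pident)


def tally_by_title(hits):
    """Count viral hits per subject title to see which virus dominates."""
    return Counter(title for _read, title, _pident in hits)


# Invented toy rows, for illustration only (not data from this study).
toy_table = io.StringIO(
    "read1\tsubj1\t65.2\tCaraparu virus segment L\n"
    "read2\tsubj2\t99.0\tHomo sapiens mitochondrion\n"
    "read3\tsubj3\t64.8\tCaraparu virus segment L\n"
)
counts = tally_by_title(viral_hits(toy_table))
print(counts.most_common(1))
```

Tallying by subject title is one simple way to see which taxon attracts the most reads; a taxonomy lookup on the subject ID would be more robust than keyword matching.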
Sequence Assembly and Finishing of the Novel Viral Genome
The GS FLX system data analysis software, GSAssembler, was used to assemble raw reads without a reference genome sequence (i.e., de novo assembly). In total, 668 contigs (150 kb in total) were obtained from de novo assembly. The contigs were aligned with the CARV L segment sequence to identify homologous sequences (Figure 2A). Seven contigs (5378 bases in total) were identified, which covered 96.8% of the 5555-bp partial sequence of the CARV L segment. The size of a complete L segment for an orthobunyavirus is ~6900 bp [4].
To identify additional sequences between contigs and at both ends, the National Center for Biotechnology Information (NCBI) Megablast alignment tool was used to search for individual reads that overlapped the contigs and extended into adjacent sequence. In addition, two contigs for isolate IQE7743, which were 96% identical to IQE7620 in overlap regions, were also used to fill the gaps (Figure 2A). The process was repeated until the assembly reached its maximal length, and no additional reads containing extra sequences were found (Figure 2B). The resulting assembly was a 6917-bp draft genome sequence, the expected size of a full orthobunyavirus L segment. Finally, we used this assembled draft genome sequence as a tentative reference to map all pyrosequencing reads, ~150,000 reads in total, with GSMapper. A total of 5524 reads (1.46 Mb in total) were mapped, and a single contig of 6913 bp was obtained. The alignment coverage ranged from 23 to 1627, with an average coverage of 210 (Figure 2C). The mapped reads constituted only 3.1% of all reads, suggesting a low percentage of IQE7620 viral nucleic acids in the total RNA isolated from the viral culture.
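As a quick consistency check on the mapping figures, average coverage can be estimated as mapped bases divided by contig length. The sketch below uses only the rounded values reported in the text; the function name is illustrative and is not part of GSMapper.

```python
def mean_coverage(mapped_bases, reference_length):
    """Average depth: total mapped bases divided by reference length."""
    return mapped_bases / reference_length


mapped_bases = 1.46e6    # 1.46 Mb of mapped sequence (rounded, from the text)
mapped_reads = 5524      # number of mapped reads (from the text)
contig_length = 6913     # bp, single contig produced by mapping

depth = mean_coverage(mapped_bases, contig_length)
mean_read_length = mapped_bases / mapped_reads

print(f"estimated average coverage: {depth:.0f}")        # about 211
print(f"mean mapped read length: {mean_read_length:.0f} bp")  # about 264 bp
```

The estimate of about 211 agrees with the reported average coverage of 210, given that the 1.46-Mb total is itself rounded.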
FIGURE 2 Sequence assembly process for generating the complete L segment sequence for orthobunyavirus IQE7620. (A) Pyrosequencing reads were de novo-assembled using GSAssembler. Contigs, matched with partial sequence to the CARV segment L, were aligned and assembled ...
The resulting nucleotide sequence contained an open reading frame encoding 2248 aa. The deduced, putative protein was approximately the same size as the 2250-aa RNA polymerase protein of OROV, which is one of the orthobunyavirus reference sequences chosen by NCBI [16].
Interestingly, amino acid identity with OROV was low (54%). Although the generated L segment sequence already contained the complete coding region for an RNA polymerase, it did not have the terminal consensus sequence typical of orthobunyaviruses and was thus incomplete. Additionally, it was necessary to confirm the accuracy of the sequence because of its low identity to the reference and its novelty. A set of primers was used to amplify sequences in the terminal regions, as well as in several intermediate regions. Instead of using a random RT-PCR product or the final random amplicon library, we used the reverse-transcription product (i.e., First Strand cDNA) as the template in PCR, in case amplification errors had been introduced in the anchored, random PCR. Purified PCR fragments were sequenced using the Sanger method. No mismatch was found. From the Sanger sequencing results, 12 and 11 bases, including terminal consensus sequences, were added to the 5′ and 3′ ends of the sequence, respectively. The final, complete L segment sequence for the IQE7620 genome was found to be 6936 bp, encoding a 2248-aa putative orthobunyavirus RNA polymerase.
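The reported lengths can be checked with simple arithmetic. The sketch below assumes that the Sanger-derived terminal bases were added to the 6913-bp contig obtained from read mapping (an inference from the numbers, since 6913 + 12 + 11 = 6936) and that the open reading frame ends with a single stop codon. Function names are illustrative.

```python
def finished_length(contig_length, added_5p, added_3p):
    """Length after adding terminal bases to both ends of a contig."""
    return contig_length + added_5p + added_3p


def orf_nucleotides(protein_length_aa, include_stop=True):
    """Nucleotides needed to encode a protein, optionally with a stop codon."""
    return protein_length_aa * 3 + (3 if include_stop else 0)


# Assumption: the 12 (5' end) and 11 (3' end) bases extend the 6913-bp contig.
total = finished_length(6913, 12, 11)
orf_nt = orf_nucleotides(2248)
noncoding = total - orf_nt  # bases left for the two untranslated ends

print(total)      # 6936, matching the reported final length
print(orf_nt)     # 6747
print(noncoding)  # 189
```

Under these assumptions, the 2248-aa open reading frame fits within the 6936-bp segment and leaves 189 bases for the two noncoding ends combined.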
Because of the significant nucleotide divergence from other reported orthobunyavirus sequences, the sequence was provisionally designated as the Zungarococha virus (ZUNV) L segment (in reference to the lake near where it was isolated) and submitted to GenBank (accession JN157805). Based on sequence identity, it was more closely related to OROV than to two other NCBI reference orthobunyaviruses, La Crosse virus and Bunyamwera virus. ZUNV was closely related to two Group C complex orthobunyaviruses, CARV and APEUV [17].
It should be noted that few South American orthobunyavirus sequences, particularly L segment sequences, were available for comparison. Prior to this report, no full-length Group C orthobunyavirus L segment had been published.
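For readers unfamiliar with how identity values of this kind are derived, the following sketch computes percent identity over two pre-aligned sequences. It is not necessarily the calculation used in this study: the convention shown (gaps count as mismatches, columns gapped in both sequences are ignored) is one of several in use, and the toy sequences are invented.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two aligned sequences of equal length.

    Convention assumed here: a gap opposite a residue counts as a mismatch;
    columns that are gaps in both sequences are skipped.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap and b == gap:
            continue  # column carries no information for this pair
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable alignment columns")
    return 100.0 * matches / compared


# Invented toy alignment: 6 of 8 columns match.
identity = percent_identity("ATGCA-GT", "ATGTAAGT")
print(identity)  # 75.0
```

Because the denominator differs between conventions (alignment length, shorter sequence length, or ungapped columns only), identity values from different tools are not always directly comparable.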
L Segment Sequence Similarity of ZUNV to the Orthobunyaviruses Bunyamwera Virus, La Crosse Virus, OROV, CARV, and APEUV
The low sequence identity of IQE7620 with available orthobunyavirus sequences in GenBank may help to explain the unsuccessful RT-PCR amplification using primers designed from known sequences. The low purity of viral nucleic acids in the total RNA extract, as suggested by the low percentage of mapped reads, was also consistent with the unsuccessful search for orthobunyavirus sequences by cDNA cloning and Sanger sequencing (data not shown). The results demonstrated the effectiveness of a massively parallel pyrosequencing approach for identification of unknown viruses. Notably, our methodology yielded the entire high-quality L segment sequence directly from total RNA extracts. Traditionally, viral particle purification by ultracentrifugation or other sample preprocessing is used to enrich the sample and facilitate the analysis. However, these pretreatments may introduce bias or cause a loss of low-abundance fractions, which may hinder or prevent the discovery of unexpected agents. In this study, a random RT-PCR method was used to amplify all sequences in the specimen in an unbiased manner. Interestingly, we found that a large portion of sequences belonged to a few classes of nucleic acids, including mitochondrial DNA and ribosomal RNA from the host and DNA from mycoplasma, an organism common in clinical specimens and tissue culture (data not shown). Specific removal of these major contaminants will likely increase the efficiency of the method.
The methodology outlined here will be used in further aspects of the project, including completion of S and M segment sequences for the ZUNV isolate IQE7620. More isolates from different geographic sites and time periods will be sequenced for phylogenetic analyses and studies of orthobunyavirus evolution [5,9,18].
The novel orthobunyavirus will be subjected to further virological investigations, such as antigenic studies, to reveal its functional attributes and to define more precisely its genetic relationship with other orthobunyavirus group members.