The complete positive-sense single-stranded RNA genome of Cassava brown streak virus (CBSV; genus Ipomovirus; Potyviridae) was found to consist of 9,069 nucleotides and is predicted to produce a polyprotein of 2,902 amino acids. It lacked helper-component proteinase but contained a single P1 serine proteinase that strongly suppressed RNA silencing. Besides the exceptional structure of the 5′-proximal part of the genome, CBSV also contained a Maf/HAM1-like sequence (678 nucleotides, 226 amino acids) recombined between the replicase and coat protein domains in the 3′-proximal part of the genome, a region that is highly conserved in Potyviridae. HAM1 was flanked by consensus proteolytic cleavage sites for ipomovirus NIaPro cysteine proteinase. Homology of CBSV HAM1 with cellular Maf/HAM1 pyrophosphatases suggests that it may intercept noncanonical nucleoside triphosphates to reduce mutagenesis of viral RNA.
Cassava (Manihot esculenta Crantz; Euphorbiaceae) is an important tropical subsistence crop that is affected by cassava brown streak disease (CBSD) in the Indian Ocean coastal lowlands of East Africa (16, 35). There was a recent outbreak of CBSD at higher altitudes around Lake Victoria in Uganda and Tanzania (2, 26). The disease is caused by Cassava brown streak virus (CBSV), a whitefly-transmitted member of genus Ipomovirus that belongs to the family Potyviridae, which contains the largest number (ca. 200) of positive single-stranded RNA viruses infecting plants (13, 25, 27, 28). This virus family is divided into the genus Bymovirus with bipartite genomes and the genera Ipomovirus, Macluravirus, Potyvirus, Rymovirus, and Tritimovirus, containing monopartite viruses that encode a large polyprotein autoproteolytically cleaved into 10 mature proteins (Fig. 1) (13, 40). Additionally, a small open reading frame (ORF) created by frameshifting was recently detected in the P3 protein encoding region (10). Among members of the Potyviridae, ipomoviruses are exceptional in variability of protein-encoding sequences at the 5′ end of the genome (Fig. 1). Sweet potato mild mottle virus (SPMMV) (11) contains a single P1 serine proteinase at the polyprotein N terminus, whereas Cucumber vein yellowing virus (CVYV) (17, 22, 41) and Squash vein yellowing virus (SqVYV) (23) contain two P1 proteinases (P1a and P1b) that are evolutionarily diversified (40). In addition, SqVYV and CVYV lack the multifunctional helper component proteinase (HC-Pro) (17, 23, 40), which is located second in the polyprotein (Fig. 1) in other monopartite Potyviridae members (1, 13) and which acts as a suppressor of RNA silencing (3, 7, 19, 46). This function has been adopted by P1b in CVYV (39, 41).
The genome structure of CBSV has hitherto not been known. Only partial coat protein (CP)-encoding sequences of coastal lowland isolates (27, 28) and complete CP sequences of highland isolates from East Africa are available, revealing that these isolates belong to two phylogenetically different strains (26). The complete sequence of the emergent highland strain of CBSV was determined in this study. Isolate MLB3 (26) from the Kagera region, Northwestern Tanzania, was mechanically transmitted to Nicotiana rustica, particles were purified from systemically infected leaves, and RNA was extracted from virions and used for cDNA synthesis as described previously (26). Overlapping fragments of the viral genome were amplified by PCR initially using degenerate primers designed according to the most-conserved regions of the SPMMV, CVYV, and SqVYV genomes and sequenced. Larger genomic segments were subsequently amplified using CBSV-specific primers (Fig. 1) and the high-fidelity Phusion DNA polymerase (Finnzymes, Espoo, Finland). At least two independently amplified fragments corresponding to each genomic segment (Fig. 1) were sequenced in both directions or sequenced directly without cloning as described previously (26). The 5′ end of the viral genome was determined using rapid amplification of cDNA ends (RACE) as described previously (32), by using SuperScript III reverse transcriptase (Invitrogen, Life Science Technologies, United Kingdom) for cDNA synthesis according to the manufacturer's instructions.
Nucleotide sequences were assembled by SeqMan (version 5.03; DNAStar, Madison, WI). Sequences of other viruses of Potyviridae were retrieved from the validated DPVweb database (http://www.dpvweb.net./seqs/plantviruses.php) and NCBI database (http://www.ncbi.nlm.nih.gov/). Polyprotein cleavage sites were predicted according to the previous comprehensive studies (1, 23, 40). Multiple sequence alignments of deduced amino acid sequences were done with ClustalX (version 1.83) using default settings. The sequences were exported to GeneDoc (version 2.6.002) for manual adjustment. The percent identities of nucleotides and deduced amino acid sequences were determined using the sequence distances option in the MegAlign program (version 5.03) from DNAStar, with default settings. Phylogenetic analyses were carried out according to the neighbor-joining method in MEGA4 (38). Vector NTI Advance software from Invitrogen (version 10.3.0) was used to estimate the molecular weights of polypeptides.
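As an illustration of what a pairwise percent identity expresses, the following minimal sketch counts identical residues over the aligned, ungapped columns of two sequences. It is not the MegAlign procedure, whose exact formula is not given here; the function name `percent_identity`, the gap-handling rule, and the two toy sequences are assumptions made only for this example.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: both inputs come from the same alignment and therefore have
    equal length. Other tools may normalize differently (for example by the
    length of the shorter sequence).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    # Keep only the columns where both sequences carry a residue.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Hypothetical toy alignment, not taken from any virus sequence.
toy_identity = percent_identity("MKT-LLA", "MRTALL-")
print(toy_identity)
```

In the toy alignment, five columns are free of gaps and four of them are identical, so the function reports 80% identity.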
The genome of CBSV highland isolate MLB3 was 9,069 nucleotides (nt) long, excluding the poly(A) tail, and hence shorter than the genomes of ipomoviruses SPMMV (10,818 nt), SqVYV (9,836 nt), and CVYV (9,734 nt) (Fig. 1). The 5′ untranslated region (5′UTR; 134 nt) was followed by a single ORF (start codon AUG) terminated by the stop codon UAA at positions 8841 to 8843. The 3′UTR was 226 nt. Prediction of the proteolytic cleavage sites and motifs conserved in the polyproteins of Potyviridae members (1, 10, 17, 23, 33) revealed that CBSV encodes only 9 of the 10 expected proteins of Potyviridae, namely, P1, P3, 6K1, CI, 6K2, VPg, NIaPro, NIb, and CP (Fig. 1). No sequence homologous to HC-Pro was detected. PCR amplification of five additional highland isolates (BSA4, IGA8, LWR2, MLB9, and NTG10; see reference 26) of CBSV with primers CBSV2F1 and CBSV2R1 (Fig. 1) also resulted in a 2.7-kb PCR product in these isolates, consistent with a lack of the HC-Pro encoding sequence (data not shown). CBSV is hence the first member of Potyviridae that encodes a single P1 serine proteinase but lacks HC-Pro. Two other ipomoviruses, CVYV and SqVYV, lack HC-Pro but have two P1 proteinases (Fig. 1).
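The genome coordinates reported above can be cross-checked with simple arithmetic. The sketch below assumes 1-based, inclusive nucleotide positions and that the ORF begins immediately after the 134-nt 5′UTR; the variable names are illustrative only.

```python
# Values reported in the text for CBSV isolate MLB3 (1-based, inclusive positions).
GENOME_LEN = 9069                  # nt, excluding the poly(A) tail
UTR5_LEN = 134                     # nt
STOP_START, STOP_END = 8841, 8843  # UAA stop codon

orf_start = UTR5_LEN + 1              # first nt of the AUG start codon (assumption)
orf_len = STOP_END - orf_start + 1    # ORF length, stop codon included
coding_len = STOP_START - orf_start   # nt translated into amino acids
polyprotein_aa = coding_len // 3      # number of codons before the stop codon
utr3_len = GENOME_LEN - STOP_END      # nt downstream of the stop codon

print(orf_start, orf_len, polyprotein_aa, utr3_len)
```

Under these assumptions the ORF spans nt 135 to 8843, the coding part is a multiple of three, and the derived polyprotein length (2,902 aa) and 3′UTR length (226 nt) agree with the values stated in the text.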
The P1 of CBSV (362 amino acids [aa]) is most closely related to the P1 of SPMMV and P1b of CVYV and SqVYV (Fig. 2A), which all are related to the P1 of tritimoviruses (Fig. 2B) (40). The conserved histidine, aspartic acid, and serine of the catalytic triad HDS (H-7X-D-34X-S) of P1 (44, 45) were observed at positions 265, 273, and 308 in CBSV. This spacing matches that in P1b of CVYV and SqVYV, although the amino acid sequences of the proteins per se are only 31% and 30% identical, respectively (Fig. 2A). High divergence of the P1 proteins is characteristic of Potyviridae (40). The identity between P1a of CVYV and SqVYV and P1 of CBSV was very low (Fig. 2A).
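The H-7X-D-34X-S notation means that 7 residues separate the histidine from the aspartic acid and 34 residues separate the aspartic acid from the serine. A minimal sketch, using a hypothetical helper named `triad_positions`, shows how the three reported positions follow from the histidine position and the two gaps.

```python
def triad_positions(his_pos: int, gap_his_asp: int, gap_asp_ser: int) -> tuple:
    """Return (His, Asp, Ser) positions given the His position and the number
    of intervening residues (the 'X' counts in the H-nX-D-mX-S notation)."""
    asp_pos = his_pos + gap_his_asp + 1   # skip the intervening residues, land on Asp
    ser_pos = asp_pos + gap_asp_ser + 1   # skip the intervening residues, land on Ser
    return his_pos, asp_pos, ser_pos


# His at position 265 with the H-7X-D-34X-S spacing reported for CBSV P1.
cbsv_triad = triad_positions(265, 7, 34)
print(cbsv_triad)
```

With the histidine at position 265, the spacing places the aspartic acid at 273 and the serine at 308, matching the positions given in the text.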
The deduced P3 sequence (294 aa) of CBSV shared higher identity with the P3 of CVYV, SqVYV, and SPMMV than with the P3 proteins of other Potyviridae members (Fig. 2A). The small ORF PiPo created by a +2 frameshift (10) was identified in the P3 encoding region of CBSV (positions 1607 to 1852) and consisted of 82 codons, compared to 99 and 79 codons in SPMMV and CVYV, respectively (10). P3 and the other mature proteins, 6K1 and 6K2 (52 aa each), CI (628 aa), VPg (185 aa), NIaPro (234 aa), NIb (502 aa), and CP (367 aa), showed the highest sequence identities with SqVYV and CVYV, followed by SPMMV (Fig. 2A), and much lower amino acid sequence identities and more distant phylogenetic relatedness with other members of the Potyviridae (Fig. 2A and B).
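The PiPo coordinates can be checked against the stated codon count and reading frame. The sketch below assumes 1-based, inclusive positions and that the polyprotein ORF starts at nt 135 (directly after the 134-nt 5′UTR); both are assumptions of this example rather than statements from the original analysis.

```python
ORF_START = 135                      # assumed first nt of the polyprotein AUG
PIPO_START, PIPO_END = 1607, 1852    # PiPo positions reported in the text

pipo_len = PIPO_END - PIPO_START + 1             # length of the PiPo region in nt
pipo_codons = pipo_len // 3                      # number of complete codons
frame_offset = (PIPO_START - ORF_START) % 3      # 0 = polyprotein frame, 1 or 2 = shifted

print(pipo_len, pipo_codons, frame_offset)
```

The region is 246 nt long, which corresponds to 82 codons, and it starts two nucleotides out of phase with the polyprotein frame, consistent with the +2 frameshift described above.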
Surprisingly, alignment of the complete nucleotide and polyprotein amino acid sequences of CBSV, CVYV, SPMMV, and SqVYV revealed that CBSV contains an insertion (678 nt) resulting in a novel polypeptide of 226 aa between the replicase (NIb) and CP flanked by the predicted proteolytic cleavage sites VDTQ2309/T and IDVQ2535/A for the main viral proteinase, NIaPro (Fig. 1) (17, 23). BLASTN and BLASTP searches indicated that the novel sequence is homologous and shares the conserved amino acid motifs with the Maf/HAM1 superfamily of proteins known for prokaryotic and eukaryotic organisms, including bacteria, fungi, plants, insects, frogs, fishes, warm-blooded animals, and humans (Fig. 3A and B). The Maf/HAM1 proteins are nucleoside triphosphate (NTP) pyrophosphatases that reduce mutagenesis by intercepting noncanonical NTPs and preventing their incorporation into DNA or RNA (14). HAM1 of yeast (Saccharomyces cerevisiae) reduces sensitivity to 6-N-hydroxylaminopurine (HAP), which causes a hypermutable phenotype in phages, bacteria, yeast, and other eukaryotic organisms (29). HAM1 also has purine nucleoside triphosphatase activity on ITP and HAP triphosphate (8) and an ability to decompose and detoxify abnormal pyrimidine and purine nucleotides (37). The HAM1 homolog (HAM1h) was detected for all eight highland isolates of CBSV tested from Uganda and Tanzania (Fig. 3A) and shared 87.6 to 99.6% and 86.3 to 100% nucleotide and amino acid identity, respectively, among isolates (Fig. 3C). That HAM1h is an integral part of the CBSV genome was verified by mechanical transmission of the CBSV isolates to Nicotiana benthamiana, Nicotiana tabacum, and Nicotiana occidentalis and an analysis of virus progeny from the systemically infected leaves by reverse transcription-PCR using primer pairs in which one primer targeted HAM1h and the other primer the NIb- or CP-encoding region (data not shown).
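The residue numbers in the two cleavage sites (Q2309 and Q2535) can be related to the mature protein lengths reported in the text. The following sketch assumes that the listed proteins are contiguous, in the order shown, and together cover the whole polyprotein; the list name `MATURE_PROTEINS` is illustrative only.

```python
from itertools import accumulate

# Mature protein lengths (aa) as reported in the text, in polyprotein order.
# Assumption: the proteins are contiguous and HAM1h lies between NIb and CP.
MATURE_PROTEINS = [
    ("P1", 362), ("P3", 294), ("6K1", 52), ("CI", 628), ("6K2", 52),
    ("VPg", 185), ("NIaPro", 234), ("NIb", 502), ("HAM1h", 226), ("CP", 367),
]

names = [name for name, _ in MATURE_PROTEINS]
lengths = [length for _, length in MATURE_PROTEINS]

# Cumulative sum gives the position of the last residue of each mature protein.
last_residue = dict(zip(names, accumulate(lengths)))

for name in names:
    print(f"{name:7s} ends at residue {last_residue[name]}")

# The 678-nt insertion should encode exactly the 226-aa HAM1h polypeptide.
ham1h_nt = 226 * 3
```

Under these assumptions NIb ends at residue 2309 and HAM1h at residue 2535, matching the glutamine positions in VDTQ2309/T and IDVQ2535/A, and the total reaches 2,902 aa, the predicted polyprotein length.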
A HAM1-like sequence was also detected for Euphorbia ringspot virus (EuRSV; 00.057.0.81.036 in the ICTVdB database, version 4; NCBI database accession number AY397600), belonging to the genus Potyvirus (13). Also in EuRSV, HAM1h was situated between NIb and CP and flanked by the predicted proteolytic cleavage sites of Potyvirus NIaPro (1) (Fig. 3A). No other viruses carrying HAM1-like sequences were found in sequence databases.
P1, P3, and HAM1h of CBSV were tested for their ability to suppress RNA silencing using the previously described agroinfiltration assay (7, 18, 21). Agrobacterium tumefaciens (strain C58c1; pGV3850) was transformed with binary vectors (pA35Shp200) each expressing one of the proteins under Cauliflower mosaic virus 35S promoter, and the 5′UTR of Potato virus A (PVA) was used for translation enhancement (21). Beta-glucuronidase (GUS) and PVA HC-Pro were included as negative and positive controls, respectively (21). Leaves of N. benthamiana were coinfiltrated with Agrobacterium strains for expression of the gfp gene for green fluorescent protein (GFP), gfp-specific double-stranded (hairpin) RNA to induce “strong gfp silencing” (18), and a third construct to express the putative silencing suppressor, as described previously (21). Silencing was suppressed by CBSV P1 and PVA HC-Pro, allowing continued GFP expression, whereas GFP fluorescence faded out by 5 days postinfiltration following infiltration with other constructs (Fig. 4A), including mutated P1-, P3-, and HAM1h-encoding sequences containing a stop codon and frameshift in the beginning of the ORF to prevent protein expression. Hence, the P1 protein was required for silencing suppression. It contains the conserved LXKA (aa 109 to 120) and zinc finger motif (cysteine residues 128, 131, 145, 148) required for efficient silencing suppression by CVYV P1b (39). Second, gfp transcripts were overexpressed by agroinfiltration in leaves of transgenic N. benthamiana constitutively expressing gfp (line 16c, courtesy of D. C. Baulcombe) (7) to induce gene cosuppression or “weak gfp silencing” (18). CBSV P1 and PVA HC-Pro suppressed gfp silencing, in contrast to the other constructs (Fig. 4B).
Visual observation of GFP fluorescence correlated with the accumulation of gfp mRNA, whereas a lack of fluorescence correlated with a high accumulation of gfp-specific small interfering RNA (siRNA) detected by Northern blot analysis (Fig. 4C) using an [α-32P]rUTP-labeled gfp RNA probe (21) as described previously (30). CBSV P1 exhibited highly efficient suppression of gfp silencing and siRNA accumulation, whereas HC-Pro only partially prevented siRNA accumulation as reported previously (21, 46). Results of the three experiments were similar.
Valli et al. (40) have proposed scenarios on how the hypothesized ancestor polyprotein structure P1a-P1b-HCPro-P3 has evolved to the P1a-P1b-P3 structure of CVYV and SqVYV, which has involved the adoption of silencing suppression functions by P1b and the loss of HC-Pro. Evolution in this direction is observed for SPMMV, in which the large P1 protein (82 kDa) suppresses silencing, while HC-Pro contributes only to the durability of silencing suppression (15). Furthermore, in the tritimovirus Wheat streak mosaic virus, HC-Pro does not suppress silencing and is redundant for infectivity and symptom induction (34). Our data suggest that CBSV represents the latest step of evolution which shapes the polyprotein N terminus in ipomo- and tritimo-like viruses, because CBSV lacks both P1a and HC-Pro and uses the P1b-like P1 protein for suppression of silencing.
Intriguingly, CBSV has recombined a Maf/HAM1-like sequence to the 3′-proximal part of the viral genome, which is highly conserved in Potyviridae (13). The NIb/CP junction can accommodate heterologous genes in engineered potyvirus clones without compromising infectivity (see references 4, 20, and 43 and the references therein), but natural integration of foreign sequences has not been reported. While HAM1h could not suppress RNA silencing, homology of the protein with cellular Maf/HAM1 NTP pyrophosphatases suggests that HAM1h might intercept noncanonical NTPs to reduce mutation rates of viral RNA. When mutation rates exceed a critical threshold, the virus may experience an “error catastrophe,” i.e., decreased infectivity and extinction of the virus population (9). For example, the presence of ribavirin, an IMP dehydrogenase inhibitor, causes depletion of the canonical GTP pool and increases the mutation rate 10-fold in poliovirus (12), which is related to viruses of the family Potyviridae (13). Suppression of the mutation rates might provide a particular advantage to the virus under conditions that are stressful to the host plant. For example, oxidative stress increases mutation rates (5) and causes early senescence (31), which is observed with older leaves of CBSD-affected cassava plants (16, 26, 35). EuRSV also infects plants of the Euphorbiaceae family (ICTVdB database, version 4) and may encode HAM1h for similar reasons. Another strategy to counteract deleterious mutations is exhibited by replicases of many viruses of the Flexiviridae and some viruses of the Closteroviridae that contain an AlkB domain similar to the AlkB proteins in plant-infecting bacteria (6). However, Blackberry virus Y, which represents a putative new genus of Potyviridae, contains the AlkB domain inside the P1 proteinase (36). This virus is the only Potyviridae member besides CBSV and EuRSV that is known to carry an insertion of cellular origin.
AlkB domains in the aforementioned viral proteins are functional in repairing methylation damage of nucleic acids by oxidative demethylation, as also shown for the homologous proteins in cellular organisms (42). The AlkB proteins of viruses have not been found to suppress RNA silencing (42), which was also the case with CBSV HAM1h in this study. Since a large proportion of the AlkB-containing viruses infect perennial, often woody, host plants, it is anticipated that RNA repair by AlkB proteins may be advantageous for the stability of viruses in hosts infected for long periods of time under various environmental conditions (24). It has also been proposed that conditions enhancing the methylation of nucleic acids might promote the incorporation and maintenance of AlkB sequences by viruses (6). The benefits for viral fitness from incorporation of HAM1h in CBSV and EuRSV deserve to be investigated thoroughly.
The CBSV genome sequence determined in this study was submitted to GenBank under accession number FJ039520.
We thank Minna-Liisa Rajamäki and Wilmer Cuellar for technical advice and Alois Kullaya, Fred Tairo, and Samuel Kyamanywa for supporting this study.
This work was part of the BIO-EARN program funded by Sida, Sweden.
Published ahead of print on 22 April 2009.