We determined the complete nucleotide sequence of wt100, a laboratory strain of SA12, by sequencing the viral genome carried in plasmid pUC119-SA12. This plasmid contains a complete infectious copy of the wt100 SA12 genome inserted at the unique PstI site (8).
Nucleotide sequence of SA12. The complete nucleotide sequence reveals that SA12 has a genome organization typical of the Polyomaviridae (Fig. ). The viral regulatory region containing the promoter for early region transcription, the origin of viral DNA replication (ori), and the promoter for late region transcription and transcriptional enhancer sequences is flanked by coding sequences for large T and small t antigen and for an agnoprotein and the capsid proteins VP1, VP2, and VP3. We found no evidence for a middle T antigen, a protein encoded by murine polyomavirus and hamster polyomavirus. We also did not find any evidence for the expression of a T*-like protein; however, the sequence analysis and limited RNA studies that we performed do not exclude the possibility that one or more such proteins are produced. The organization of the late region is the same as is found in SV40, BKV, LPV, and JCV. This includes the position of the agnoprotein coding sequences, the overlap of VP2 and VP3 sequences, and the overlap of VP2/VP3 with VP1 coding sequences. Like SV40, SA12 encodes microRNAs on the late coding strand, overlapping the 3′ end of early mRNA (this study).
We had previously reported the sequence of a portion of SA12, including a region encompassing the viral regulatory sequences, small t antigen coding sequences, and the amino-terminal 163 amino acids of large T antigen (8). In our current study we found that 822 out of 822 bp agreed with the published sequence.
Regulatory region. For the purposes of discussion, we define the viral regulatory region as the sequences that lie between the large T/small t antigen start codon and the agnoprotein initiation codon. This encompasses SA12 nucleotides 5124 to 294 (Fig. ). Our previous studies revealed that the organization of the SA12 regulatory region is typical of polyomaviruses (8). The core ori contains four potential T antigen binding sequences, two on each strand, separated from each other by a single base pair (nucleotides 5219 to 11). Three of these pentanucleotides match the consensus T antigen binding sequence 5′-GAGGC-3′, while the fourth has the sequence 5′-GTGGC-3′. A 27-bp imperfect palindrome lies to the early side of these T antigen binding sequences, while the late side is flanked by a 20-bp A/T-rich region. Based on studies with SV40, we hypothesize that these three elements (imperfect palindrome, T antigen binding pentanucleotides, and A/T-rich region) constitute the minimal ori (10). Two additional T antigen binding sites lie to the early side of the core ori. These correspond to the SV40 site I regulatory sequences involved in the autoregulation of T antigen transcription in SV40. The 263 bp that lie to the late side of the A/T-rich region contain several repeated sequences and transcription factor binding sites. Based on similarity to other polyomaviruses, these sequences most likely contain the late promoter and transcriptional enhancers. As we previously reported, the SA12 regulatory region is most closely related to that of BKV.
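The coordinates quoted for the regulatory region wrap past the end of the circular genome, so the spans are easy to miscount. The following is a minimal sketch of that bookkeeping, not part of the original analysis. The genome length of 5,230 bp is an assumption (it is not stated in this section; it is the value at which the interval 5219 to 11 equals the 23 bp expected for four pentanucleotides separated by single base pairs), and the placement of the A/T-rich region immediately after nucleotide 11 is likewise assumed. The helper name `circular_span` is hypothetical.

```python
# Sketch: inclusive span of a 1-based coordinate interval on a circular genome.

GENOME_LENGTH = 5230  # ASSUMPTION: not stated in this section of the text


def circular_span(start: int, end: int, genome_length: int = GENOME_LENGTH) -> int:
    """Count positions from start to end inclusive, wrapping from genome_length back to 1."""
    if not (1 <= start <= genome_length and 1 <= end <= genome_length):
        raise ValueError("coordinates must lie within 1..genome_length")
    return (end - start) % genome_length + 1


# Four 5-bp T antigen binding sequences separated by three single base pairs.
pentanucleotide_block = 4 * 5 + 3 * 1

# Interval reported for the four pentanucleotides (nucleotides 5219 to 11).
ori_block_span = circular_span(5219, 11)

# ASSUMPTION: the 20-bp A/T-rich region directly follows nucleotide 11 (12..31),
# so the remaining late-side sequence runs from 32 to the regulatory region end (294).
at_rich_span = circular_span(12, 31)
late_side_span = circular_span(32, 294)

print(pentanucleotide_block, ori_block_span)  # 23 23
print(at_rich_span, late_side_span)           # 20 263
```

Under these assumptions the reported interval matches the expected 23 bp, and the late-side remainder matches the 263 bp quoted in the text.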
Early region. Putative coding sequences for the SA12 large T and small t antigens are shown in Fig. . The 172-amino-acid small t antigen consists of an 82-amino-acid region that is common with large T antigen and includes the J domain and a unique 90-amino-acid region that contains cysteine repeats typical of other polyomavirus small t antigens. This region of the SV40 small t antigen binds to the cellular phosphatase pp2A. We have not assessed the ability of SA12 small t antigen to bind to pp2A. SA12 large T antigen consists of 699 amino acids. Sequence comparisons suggest that like other polyomavirus large T antigens, the SA12 protein consists of a J domain, an LXCXE motif and nuclear localization signal, a DNA-binding domain, a Zn-binding domain, and an ATPase domain. Sequence alignments indicate that, in each of these domains, the SA12 large T antigen is closely related to the large T antigens encoded by SV40, JCV, and BKV and is most closely related to that of BKV.
The large T antigens of SV40, JCV, and BKV have a host range (HR) domain at their carboxy termini. In the case of SV40 LT antigen, the HR domain is attached to the carboxy terminus of the helicase/ATPase domain by what appears to be a 45-amino-acid flexible linker. Deletion of the linker has no effect on virus growth or transformation, while mutants that alter the HR domain are defective for productive infection (6, 26, 34). The SA12 large T antigen also contains a putative linker region and host range domain at its carboxy terminus. Sequence alignment reveals that the putative HR domains are conserved among the T antigens (Fig. ). In contrast, the SV40 linker region shows limited alignment with the corresponding regions of SA12, BKV, and JCV (Fig. ). However, the putative linker regions of the SA12, BKV, and JCV T antigens show excellent alignment (Fig. ). Perhaps this region of the SA12, BKV, and JCV T antigens has conserved a function that has been either lost or replaced by a different function in the SV40 LT antigen.
Late region. Putative coding sequences for agnoprotein and the capsid proteins VP1, VP2, and VP3 (Fig. ) are located in the viral late region. The organization of these genes is essentially the same as seen for SV40, BKV, and JCV. Again, sequence comparisons indicate that SA12 is most closely related to BKV.
Construction of an SA12 mutant with a consensus polyomavirus origin of DNA replication sequence. The large T antigen encoded by SV40 includes a DNA binding domain (origin binding domain) which recognizes the sequence 5′-GAGGC-3′. Four copies of this sequence are present in the SV40 core ori. Similarly, 9 of the previously sequenced 11 polyomaviruses have four copies of this pentanucleotide in their core ori regions, oriented in the same manner as SV40. The remaining two polyomaviruses, BFDV and GHFV, do not have this sequence in their ori regions. Presumably, the large T antigens encoded by these viruses recognize a different DNA sequence.
We previously reported that SA12 has three 5′-GAGGC-3′ pentanucleotides in its ori region but that the fourth has the sequence 5′-GTGGC-3′ (8). We hypothesized that the lower rate of viral DNA replication and small plaque size exhibited by SA12 might be due to the presence of this nonconsensus pentanucleotide. To test this hypothesis, we constructed a mutant, SA12-T10A, in which this pentanucleotide was changed to the consensus 5′-GAGGC-3′ sequence. As shown in Fig. , SA12-T10A exhibited the same plaque size (data not shown), growth kinetics, and virus yield as wild-type SA12. Furthermore, the time course of viral DNA replication was the same for SA12-T10A and wild-type SA12. However, SA12-T10A replicated DNA to a somewhat lower level than wild-type SA12. Thus, conversion of the SA12 ori to the consensus does not enhance viral DNA replication or virus yield.
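Counting T antigen binding pentanucleotides requires scanning both strands, because a 5′-GAGGC-3′ on the bottom strand appears as 5′-GCCTC-3′ on the top strand. The following is a minimal sketch of such a scan. The two toy sequences are hypothetical constructs that only mimic the described arrangement (four pentanucleotides, two per strand, separated by single base pairs); they are not the SA12 or SV40 ori sequences, and the function names are illustrative.

```python
# Sketch: locate a short motif on both strands of a DNA sequence.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase or lowercase DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_motif_both_strands(seq: str, motif: str) -> list:
    """Return (0-based top-strand start, strand) for every motif occurrence."""
    seq, motif = seq.upper(), motif.upper()
    motif_rc = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        if window == motif_rc:
            hits.append((i, "-"))
    return hits


# HYPOTHETICAL toy sequences, not real viral ori sequences.
toy_consensus = "GAGGC" + "T" + "GAGGC" + "A" + "GCCTC" + "T" + "GCCTC"
toy_variant = "GTGGC" + "T" + "GAGGC" + "A" + "GCCTC" + "T" + "GCCTC"

print(find_motif_both_strands(toy_consensus, "GAGGC"))
# [(0, '+'), (6, '+'), (12, '-'), (18, '-')]
print(len(find_motif_both_strands(toy_variant, "GAGGC")))  # 3
print(find_motif_both_strands(toy_variant, "GTGGC"))       # [(0, '+')]
```

The toy variant reproduces the pattern described for SA12 (three consensus pentanucleotides plus one 5′-GTGGC-3′), while the toy consensus reproduces the four-copy pattern described for SV40.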
Mapping early-region splice junctions. Polyomaviruses express an early-region precursor RNA that is differentially spliced to create multiple mRNAs. Two of the resulting mRNAs encode large T antigen or small t antigen. Several polyomaviruses have been shown to express mRNAs encoding smaller T antigens termed tiny T, 17K T, or T* (30, 39, 43). In addition, murine polyomavirus and hamster polyomavirus express an mRNA encoding a middle T antigen. We predicted the splice junctions of the SA12 large T antigen mRNA and small t antigen mRNAs by sequence alignment with SV40, BKV, and JCV (Fig. ).
We confirmed these splice junctions by sequence analysis. BSC40 cells were infected with SA12 at an MOI of 5, and total RNA was extracted at 24 h and 72 h postinfection. Next, cDNA was synthesized using this RNA as a template. To sequence across the splice junctions, we generated two pairs of PCR primers that flanked the putative early-region splice sites (Fig. ). PCR was performed on the cDNA from both time points, and pUC119-SA12 served as a positive control. The PCR products were resolved through an agarose gel and visualized (Fig. ).
The predicted size of the PCR fragment from genomic SA12 is 549 bp, which agrees with the PCR product of pUC119-SA12, where one band about this size is visible (Fig. , lane 2). The large T antigen cDNA PCR product is visible in the 24- and 72-h cDNA reactions, but the small t antigen cDNA PCR product is visible only in the 72-h cDNA reaction (Fig. , lanes 4 and 6). The PCR amplification of the 24- and 72-h cDNA revealed a band that comigrated with the band in the pUC119-SA12 reaction. This fragment was later shown to be contaminating SA12 genomic DNA through sequencing (data not shown). Since subsequent sequencing of the small t antigen PCR product using primer pair 15/16 was ineffective, we performed PCR on 72-h cDNA using primer pair 15/17 (Fig. , lane 3).
Since the 72-h cDNA amplification yielded the strongest signals for the small t and large T antigen PCR products, the bands corresponding to the small t (Fig. , lane 3) and large T (Fig. , lane 6) antigens from these amplifications were excised, purified, and sequenced. Sequencing revealed that both share the same 3′ acceptor splice site, which ends with the sequence AG (where the G is nucleotide position 4533). Because the early region is transcribed in the direction of decreasing nucleotide number, the donor sites lie at higher positions than the acceptor: the 5′ donor splice site for large T antigen starts with a GT (where the G is nucleotide position 4880), and the 5′ donor splice site for small t antigen starts with a GT (where the G is nucleotide position 4603). Therefore, processing of large T antigen and small t antigen pre-mRNAs results in the removal of 348 and 71 bases of RNA, respectively (Fig. ).
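The intron lengths follow directly from the splice-site coordinates. As a minimal sketch of the arithmetic: the reported values of 348 and 71 bases are reproduced when the shared acceptor G is placed at nucleotide 4533 and the large T and small t donor Gs at 4880 and 4603, counting inclusively along the early strand (decreasing nucleotide number). The predicted cDNA amplicon sizes at the end are an assumption, namely that the 549-bp genomic amplicon spans both introns; they are not values reported in the text.

```python
# Sketch: intron lengths from splice-site coordinates on the early strand,
# which is read in the direction of decreasing nucleotide number.


def intron_length(donor_g: int, acceptor_g: int) -> int:
    """Inclusive count from the G of the GT donor down to the G of the AG acceptor."""
    if donor_g < acceptor_g:
        raise ValueError("on the early strand the donor lies at the higher coordinate")
    return donor_g - acceptor_g + 1


ACCEPTOR_G = 4533       # shared 3' acceptor (last base of the intron)
LARGE_T_DONOR_G = 4880  # first base of the large T intron
SMALL_T_DONOR_G = 4603  # first base of the small t intron

large_t_intron = intron_length(LARGE_T_DONOR_G, ACCEPTOR_G)
small_t_intron = intron_length(SMALL_T_DONOR_G, ACCEPTOR_G)
print(large_t_intron, small_t_intron)  # 348 71

# ASSUMPTION: the 549-bp genomic amplicon spans both introns, so each spliced
# cDNA product is shorter than the genomic product by its intron length.
GENOMIC_AMPLICON = 549
predicted_large_t_product = GENOMIC_AMPLICON - large_t_intron
predicted_small_t_product = GENOMIC_AMPLICON - small_t_intron
print(predicted_large_t_product, predicted_small_t_product)  # 201 478
```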
SA12-encoded microRNAs. It is becoming increasingly clear that miRNAs, originally identified in Caenorhabditis elegans, play a broad and diverse role in the control of gene function in most eukaryotes (1). miRNAs are approximately 22-nucleotide RNAs that function by binding to the mRNAs of target genes with various degrees of antisense complementarity. miRNAs inhibit protein expression of their target mRNAs by driving cleavage or translational inhibition (37). Several viruses have recently been shown to encode microRNAs; however, as with their cellular counterparts, little is known about their function (2, 28, 29). We have recently identified a virally encoded pre-microRNA expressed late in SV40 infection that is processed into several miRNAs (36). These miRNAs function to down-regulate early gene expression and reduce T-cell-mediated cellular lysis of SV40-infected cells in vitro, perhaps by reducing the amount of T antigen presented by major histocompatibility complex class I. Importantly, this pre-miRNA is computationally predicted to be conserved in other polyomaviruses, including JCV, BKV, and SA12 (36) (Fig. ), which suggests a conserved function for the miRNAs generated among these viruses. Thus, determining whether SA12 actually expresses virally encoded miRNAs would have important implications for the replication cycle of SA12 as well as for the validity of the in silico engine that generated these predictions.
To determine whether SA12 expresses an miRNA, BSC40 monkey kidney epithelial cells were infected with SA12, and total RNA was harvested late in infection. Northern blot analysis was conducted with radiolabeled oligonucleotide probes directed to either the 5′ arm or the 3′ arm of the predicted miRNA (Fig. ). In infected cells, we detected a prominent band migrating at approximately 60 nt, corresponding well with the pre-miRNA reported for SV40. Importantly, we also detected an approximately 22-nt band in all lanes that contain the 60-nt band, consistent with a model in which SA12, like SV40, derives ~22-nt miRNAs from its abundant pre-miRNA precursor. Thus, we conclude that SA12 is the second member of the Polyomaviridae confirmed to encode miRNAs.
Quantitative PCR assay that distinguishes SA12 from other primate polyomaviruses. To quantify SA12 DNA for in vitro and in vivo studies, we developed real-time, quantitative PCR assays for this viral genome. Two sets of primers and probes were chosen using the Primer Express software (Table ), one set for each of the early and late regions. When tested against the SA12 genome, the reactions were sensitive down to between 1 and 10 copies per reaction (Table ). When tested against 10^7 copies of the related primate polyomaviruses BKV, JCV, SV40, and LPV, no reaction was detected, demonstrating the specificity of the assay (Table ).
TABLE 2. Sensitivities and specificities of the SA12 primer/probe sets