The Polyomaviridae have small icosahedral virions that contain a genome of approximately 5,000 bp of circular double-stranded DNA. Polyomaviruses infect hosts ranging from humans to birds, and some members of this family induce tumors in test animals or in their natural hosts. We report the complete nucleotide sequence of simian agent 12 (SA12), whose natural host is thought to be Papio ursinus, the chacma baboon. The 5,230-bp genome has a genetic organization typical of polyomaviruses. Sequences encoding large T antigen, small t antigen, agnoprotein, and the viral capsid proteins VP1, VP2, and VP3 are present in the expected locations. We show that, like its close relative simian virus 40 (SV40), SA12 expresses microRNAs that are encoded by the late DNA strand overlapping the 3′ end of large T antigen coding sequences. Based on sequence comparisons, SA12 is most closely related to BK virus (BKV), a human polyomavirus. We have developed a real-time PCR test that distinguishes SA12 from BKV and the other closely related polyomaviruses JC virus and SV40. The close relationship between SA12 and BKV raises the possibility that these viruses circulate between human and baboon hosts.
The polyomaviruses are a family of viruses characterized by small circular double-stranded DNA genomes and by 45-nm icosahedral particles consisting only of protein and nucleic acid. Some members of the family, such as budgerigar fledgling disease virus (BFDV), goose hemorrhagic polyomavirus (GHFV), and pneumotropic polyomavirus, cause lethal diseases, while others, such as hamster polyomavirus and murine polyomavirus, induce tumors in their natural hosts (11, 13, 14, 15, 17, 18, 21, 31, 35). In other cases, exposure to polyomaviruses results in a lifelong persistent infection with no clear symptoms. This is the case for the two known human polyomaviruses, BK virus (BKV) and JC virus (JCV), and for the monkey virus simian virus 40 (SV40) (12). In fact, a majority of humans harbor asymptomatic BKV and JCV infections. However, JCV can cause progressive multifocal leukoencephalopathy in humans that are immunocompromised, and several reports have linked BKV with hemorrhagic and nonhemorrhagic cystitis and nephropathy that are linked to immunosuppressive transplant therapy (4, 9, 16).
Although over 11 polyomaviruses have been identified thus far, SV40 and murine polyomavirus are the best characterized. This is because they were the first identified members of the family and because both viruses are easily propagated in culture, where they undergo a well-defined productive infectious cycle. Following attachment to a specific receptor and transport of virions to the nucleus, viral chromatin is released and the early viral promoter is activated by the cellular transcription apparatus. This results in the synthesis of the viral early proteins, the tumor antigens or T antigens. The multifunctional T antigens function to drive the infected host cells into S phase, activate transcription of the viral late promoter, initiate and maintain viral DNA replication, and inactivate host defense systems. Consequently, new viral genomes are replicated and the virion proteins (VP1, VP2, and VP3) are expressed. The result is the assembly of about 300 infectious virions per cell.
SV40 DNA is about 50 to 60% identical to murine polyomavirus DNA, and protein coding region identity ranges from 50% to 70% for the two viruses. SV40 is most closely related to BKV and JCV, with protein coding region identity ranging from 60 to 70% for JCV and 70 to 80% for BKV. While the protein coding regions of SV40, BKV, and JCV are very similar, each of these viruses has a distinct regulatory region. Specifically, the transcriptional enhancers that lie to the late side of the viral origin of DNA replication (ori) are virus specific. This observation leads to the speculation that polyomavirus host range might be determined by the compatibility of viral promoter elements and the cellular transcription apparatus.
Simian agent 12 (SA12) was first isolated from cultured kidney cells derived from Cercopithecus pygerythrus, the South African vervet monkey (20). Subsequently, it was found that a high percentage of Papio ursinus monkeys, chacma baboons, were seropositive for SA12 (41). Like other polyomaviruses, SA12 transforms cells in culture (40). Early studies demonstrated that SA12 is a polyomavirus closely related to SV40, JCV, and BKV (3, 19, 23, 32, 33). SA12 is more distantly related to a second nonhuman primate polyomavirus, lymphotropic polyomavirus (LPV) (24). In fact, limited sequence analysis indicated that SA12 is most closely related to BKV (8). Since that time, BKV and JCV have emerged as important human pathogens. Thus, as a step toward gaining a better understanding of the genetic elements that govern polyomavirus host range and to develop reagents that distinguish SA12 DNA from other polyomaviruses, especially BKV, we determined the complete sequence of a laboratory strain of SA12.
All sequencing reactions were carried out on an ABI PRISM 377 machine in the DNA facility in the Department of Biological Sciences, University of Pittsburgh. The genome of SA12 was sequenced, with threefold coverage, using the following primers (listed 5′ to 3′): CTCTGGTTTGGATAGATTGC, complement(4716.4735); GTAGACACTCCAGTCATGGGC, complement(4196.4216); GCAGTGGATACAGTATTAGC, complement(3733.3752); GGCAGAATCAAAAGATTTGC, complement(3314.3333); GGATGAAATATAACATTTGC, complement(2910.2929); ATGCACCAGCCTCTCAAAC, complement(2764.2782); ATCCAGCAGCACAGTTGTGG, complement(2195.2214); AGTTTTGGAACTCGCACGG, complement(1535.1553); GATTAAACAGCTCCAAAGCC, complement(887.906); TTACCAACTTTCACAGAGGC, complement(322.341); TGACAAAGGGGGAGACGAAG, complement(4976.4995); and GGATGTAAAGGTAGCCCATC, complement(4913.4932). The templates for sequencing SA12 included SA12 viral DNA and two preparations of pUC119-SA12. pUC119-SA12 was constructed by inserting SA12 genomic DNA into the unique PstI site of pUC119 (8).
Nucleotide (nt) 10 of the SA12 genome was converted from an A to a T by using the QuikChange XL site-directed mutagenesis kit (Stratagene, La Jolla, CA) and by following the manufacturer's directions. pUC119-SA12 acted as the template in the mutagenesis reaction using the primers 5′-GAGGCGGCCTCGGCCTCTTATATATTATAAAAAAAAAGGC and 5′-GCCTTTTTTTTTATAATATATAAGAGGCCGAGGCCGCCTC. The base pair change in the resulting plasmid called pUC119-SA12-T10A was verified by sequencing.
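Mutagenesis of this kind is normally done with a pair of fully complementary primers, so the two oligonucleotides listed above would be expected to be reverse complements of each other. The following is a minimal Python sketch of that check; the helper name `reverse_complement` is ours for illustration and is not part of any kit software.

```python
# Illustrative check only; reverse_complement is a hypothetical helper name.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Mutagenic primers as listed in the text (written 5' to 3').
forward = "GAGGCGGCCTCGGCCTCTTATATATTATAAAAAAAAAGGC"
reverse = "GCCTTTTTTTTTATAATATATAAGAGGCCGAGGCCGCCTC"

# Report whether the pair is fully complementary.
print("primers are reverse complements:",
      reverse_complement(forward) == reverse)
```

The same helper shows why the consensus pentanucleotide 5′-GAGGC-3′ appears as 5′-GCCTC-3′ when it lies on the opposite strand.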
SA12 stocks were prepared from cloned viral DNA following transfection of BSC40 cells as described previously (8, 25). Briefly, pUC119-SA12 was digested with PstI to liberate the viral genome from the vector. Following inactivation of the restriction enzyme, the digest mixture was diluted in ligation buffer to a concentration of 3 μg/ml, and 1 unit of T4 DNA ligase was added. The ligation reaction was carried out at 37°C for 4 h. Next the ligated DNA mixture was introduced into BSC40 cells by DEAE-dextran-mediated transfection as previously described (38). Transfected cells were maintained at 37°C in minimal essential medium supplemented with 10% fetal bovine serum. At 7 days posttransfection, the dishes were placed at −20°C until frozen. Following three freeze-thaw cycles and clarification by low-speed centrifugation, the virus stock was serially diluted and subjected to a plaque assay (8). In some cases, transfected cells were directly overlaid with agar for a plaque assay. Virus stocks were prepared by amplification of the virus present in individual plaques in BSC40 cells. Typically, titers of 3 × 10^7 PFU/ml were obtained.
Freshly confluent BSC40 cells in a 6-cm dish were infected with SV40, SA12, or SA12-T10A at a multiplicity of infection (MOI) of 10. At various times postinfection, viral DNA was isolated by following the QIAprep spin miniprep kit protocol as described previously (44). After the monolayer was washed once with phosphate-buffered saline, the cells were lysed by adding 300 μl of buffers P1 and P2 (QIAGEN) to the dish and incubating it for 5 min at room temperature. With a rubber policeman, the cell lysate was transferred to a 1.5-ml polypropylene microcentrifuge tube. Next the lysate was digested by adding proteinase K to a final concentration of 800 μg/ml and incubating the tube at 55°C for 2 h. After the addition of 350 μl of buffer N3 and incubation on ice for 5 min, cellular DNA was precipitated by centrifugation for 10 min at 17,900 × g. Next the supernatant was applied to a QIAprep spin column and centrifuged for 1 min. The column was washed with 0.5 ml and 0.75 ml of buffers PB and PE, respectively, and samples were centrifuged after each addition of buffer for 1 min. Finally, DNA was eluted from the column by addition of 50 μl of elution buffer, incubation for 5 min, and centrifugation for 1 min. The DNA was then digested with BamHI or PstI to linearize the viral DNA. Following gel electrophoresis (0.7% agarose) the viral DNA was visualized by staining with GelStar (Cambrex Bio Science).
BSC40 cells were infected with SA12 at an MOI of 5 and maintained in minimal essential medium-10% fetal bovine serum at 37°C. At 24 and 72 h postinfection, cells were harvested using trypsin, and the cell pellet was washed three times with cold phosphate-buffered saline-EDTA. The cell pellet was frozen at −80°C until used. Total RNA was isolated from the cells by using the RNeasy kit (QIAGEN) with optional on-column DNase digestion. The isolation yielded on average 50 μg of total RNA per 10-cm tissue culture dish. Next, 1 μg of total RNA was used in a first-strand cDNA synthesis reaction mixture using Superscript II reverse transcriptase (Invitrogen) by following the recommendations of the manufacturer except that 50 units of enzyme was used. Primer 15 (5′-GGTGGGGTTGAGTGTTGAGAATC) was used in conjunction with primer 16 (5′-GATGGAGCAGGATGTAAAGGTAGC) and primer 17 (5′-TGGAGAAACACCCTTCAGAGA), in separate reactions, to flank the putative large T (LT) and small t (ST) early-region splice sites, respectively. Thirty-five cycles of PCR were performed on 1 μl of cDNA using 0.75 units of Taq DNA polymerase (Invitrogen) in a buffer containing 1.5 mM MgCl2, 0.2 mM deoxynucleoside triphosphates, and 0.4 μM of each primer. PCR products were resolved through a 1.5% agarose gel in 1× Tris-acetate-EDTA and stained with GelStar (Cambrex Bio Science). PCR bands were purified from the gel by using the QIAQuick gel extraction kit (QIAGEN) according to the manufacturer's protocol and sequenced using primer 15.
RNA secondary-structure prediction showing predicted pre-microRNA (pre-miRNA) (36) was performed with mFOLD (45). Total RNA was isolated using RNABee (Tel-Test Inc., Friendswood, TX). RNA was fractionated on a denaturing urea 15% polyacrylamide gel, electroblot transferred to a ZetaProbe-GT membrane (Bio-Rad), hybridized in 12 ml of hybridization buffer (0.2 M sodium phosphate, 7% sodium dodecyl sulfate), and then probed overnight at 42°C. Membranes were washed, exposed to autoradiography film with an intensifying screen (Kodak), and developed after 1 to 3 days. Partially double- or single-stranded, radioactive oligonucleotide probes were generated by performing a Klenow fill-in reaction (Klenow exo-; New England Biolabs) with radioactive dCTP. Prior to the fill-in reaction, a universal linker [5′-(G)36ACCTGC] was annealed to the following oligonucleotides, which are antisense to portions of the predicted SA12 pre-miRNA hairpin: the 5′ probe, 5′-CAGTGCTTTTCCCAAGCCTCAGATACCTCAGGCTCTGGCAGGTCG, and the 3′ probe, 5′-CAGAAACTGAAGACTCTGGACATGGATCAAGCACTGGCAGGTCG.
The sequence of SA12 was aligned using the ClustalW program with those of the primate polyomaviruses (SV40, BKV, JCV, and LPV) to identify genomic regions that were not conserved among these viruses so that primers and probes developed for SA12 would not be expected to react with the genomes of the other viruses. Two regions were identified: one in the T antigen gene and the other in the region common to the VP2 and VP3 genes. Primers and probes were selected from these two regions using the Primer Express program (PE Applied Biosystems, Foster City, CA). The sequences of the primer/probe sets are shown in Table 1.
TaqMan reagents were purchased from PE Applied Biosystems. The primers and probes were synthesized by the Facility for Biotechnology Resources, CBER, FDA. The probes were tagged with 6-carboxyfluorescein as the reporter dye at the 5′ end and the quencher dye 6-carboxytetramethylrhodamine at the 3′ end. TaqMan PCR assays were performed by following the manufacturer's recommendations using TaqMan Universal PCR 2× master mix (Applied Biosystems; catalog no. 4304437). The reactions were carried out in a total volume of 25 μl in the presence of 1 μM of the primers and 0.6 μM of the probe. Amplification was carried out in 0.2 ml TaqMan optical tubes using the Applied Biosystems SDS 7700 machine. We used an incubation of 2 min at 50°C followed by 10 min at 95°C, and subsequently, 60 cycles of denaturation at 95°C for 15 s and annealing and extension at 60°C for 1 min were carried out.
Standard curves were generated using cloned SA12 DNA. A standard curve was prepared for each experiment using 10-fold dilutions of purified viral DNA, representing 10^1 to 10^8 copies of DNA in 10 μl. All dilutions were prepared in 1× Tris-EDTA containing pUC19 DNA at 100 ng/ml as a stabilizer. TaqMan data were analyzed using Sequence Detection Systems v.1.7.0 (PE Applied Biosystems). Threshold values and baseline parameters for analysis of the raw data were selected according to the cycle threshold values for each experiment and by following the manufacturer's guidelines. The number of copies of viral genomes in a given sample was calculated by reference to the standard curve.
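Reading a copy number off a standard curve usually amounts to fitting the cycle threshold (Ct) against log10 of the input copies and inverting the fitted line. The Python sketch below illustrates that calculation under stated assumptions: the function names are hypothetical, the Ct values are invented and perfectly linear for demonstration only, and the fit is an ordinary least-squares line rather than the routine used by the Sequence Detection Systems software.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the fitted line to estimate copies in an unknown sample."""
    return 10 ** ((ct - intercept) / slope)


# Ten-fold dilution series from 10^1 to 10^8 copies, as in the text.
standards = [10 ** k for k in range(1, 9)]
# INVENTED Ct values (not measured data): 3.3 cycles per 10-fold step.
example_ct = [40.0 - 3.3 * k for k in range(1, 9)]

slope, intercept = fit_standard_curve(standards, example_ct)
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print(f"estimated copies at Ct 26.8: {copies_from_ct(26.8, slope, intercept):.0f}")
```

A slope near −3.3 cycles per 10-fold dilution corresponds to roughly a doubling of product per cycle; real curves deviate from this idealized series.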
The sequence of SA12 was submitted to GenBank and was assigned accession number AY614708.
We determined the complete nucleotide sequence of wt100, a laboratory strain of SA12. We sequenced the SA12 genome present in plasmid pUC119-SA12. This plasmid contains a complete infectious genome of the wt100 SA12 genome inserted at the unique PstI site (8).
The complete nucleotide sequence reveals that SA12 has a genome organization typical of the Polyomaviridae (Fig. 1). The viral regulatory region containing the promoter for early region transcription, the origin of viral DNA replication (ori), and the promoter for late region transcription and transcriptional enhancer sequences is flanked by coding sequences for large T and small t antigen and for an agnoprotein and the capsid proteins VP1, VP2, and VP3. We found no evidence for a middle T antigen, a protein encoded by murine polyomavirus and hamster polyomavirus. We also did not find any evidence for the expression of a T*-like protein; however, the sequence analysis and limited RNA studies that we performed do not exclude the possibility that one or more such proteins are produced. The organization of the late region is the same as is found in SV40, BKV, LPV, and JCV. This includes the position of the agnoprotein coding sequences, the overlap of VP2 and VP3 sequences, and the overlap of VP2/VP3 with VP1 coding sequences. Like SV40, SA12 encodes microRNAs on the late coding strand, overlapping the 3′ end of early mRNA (this study).
We had previously reported the sequence of a portion of SA12, including a region encompassing the viral regulatory sequences, small t antigen coding sequences, and the amino-terminal 163 amino acids of large T antigen (8). In our current study we found that 822 out of 822 bp agreed with the published sequence.
For the purposes of discussion, we define the viral regulatory region as the sequences that lie between the large T/small t antigen start codon and the agnoprotein initiation codon. This encompasses SA12 nucleotides 5124 to 294 (Fig. 2). Our previous studies revealed that the organization of the SA12 regulatory region is typical of polyomaviruses (8). The core ori contains four potential T antigen binding sequences, two on each strand, separated from each other by a single base pair (nucleotides 5219 to 11). Three of these pentanucleotides match the consensus T antigen binding sequence 5′-GAGGC-3′, while the fourth has the sequence 5′-GTGGC-3′. A 27-bp imperfect palindrome lies to the early side of these T antigen binding sequences, while the late side is flanked by a 20-bp A/T-rich region. Based on studies with SV40, we hypothesize that these three elements (imperfect palindrome, T antigen binding pentanucleotides, and A/T-rich region) constitute the minimal ori (10). Two additional T antigen binding sites lie to the early side of the core ori. These correspond to the SV40 site I regulatory sequences involved in the autoregulation of T antigen transcription in SV40. The 263 bp that lie to the late side of the A/T-rich region contain several repeated sequences and transcription factor binding sites. Based on similarity to other polyomaviruses, these sequences most likely contain the late promoter and transcriptional enhancers. As we previously reported, the SA12 regulatory region is most closely related to BKV.
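Because the genome is circular, both coordinate ranges given above run past the last nucleotide (5230) and continue from nucleotide 1. A short Python sketch of the length calculation follows; the function name `circular_span` is ours, and the sketch assumes inclusive endpoints read in the direction of increasing coordinates.

```python
GENOME_LENGTH = 5230  # SA12 genome size reported in the text


def circular_span(start: int, end: int,
                  genome_length: int = GENOME_LENGTH) -> int:
    """Count nucleotides from start to end inclusive on a circular map,
    wrapping from the last base back to base 1 when end < start."""
    if end >= start:
        return end - start + 1
    return (genome_length - start + 1) + end


# Regulatory region as defined in the text: nucleotides 5124 to 294.
print(circular_span(5124, 294))  # → 401

# Pentanucleotide cluster in the core ori: nucleotides 5219 to 11.
print(circular_span(5219, 11))   # → 23
```

The 23-nucleotide result for the second range matches four 5-bp pentanucleotides separated by three single base pairs (4 × 5 + 3 = 23).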
Putative coding sequences for the SA12 large T and small t antigens are shown in Fig. 3A and B. The 172-amino-acid small t antigen consists of an 82-amino-acid region that is common with large T antigen and includes the J domain and a unique 90-amino-acid region that contains cysteine repeats typical of other polyomavirus small t antigens. This region of the SV40 small t antigen binds to the cellular phosphatase pp2A. We have not assessed the ability of SA12 small t antigen to bind to pp2A. SA12 large T antigen consists of 699 amino acids. Sequence comparisons suggest that like other polyomavirus large T antigens, the SA12 protein consists of a J domain, an LXCXE motif and nuclear localization signal, a DNA-binding domain, a Zn-binding domain, and an ATPase domain. Sequence alignments indicate that, in each of these domains, the SA12 large T antigen is closely related to the large T antigens encoded by SV40, JCV, and BKV and is most closely related to that of BKV.
The large T antigens of SV40, JCV, and BKV have a host range (HR) domain at their carboxy termini. In the case of SV40 LT antigen, the HR domain is attached to the carboxy terminus of the helicase/ATPase domain by what appears to be a 45-amino-acid flexible linker. Deletion of the linker has no effect on virus growth or transformation, while mutants that alter the HR domain are defective for productive infection (6, 26, 34). The SA12 large T antigen also contains a putative linker region and host range domain at its carboxy terminus. Sequence alignment reveals that the putative HR domains are conserved among the T antigens (Fig. 4A and B). In contrast, the SV40 linker region shows limited alignment with the corresponding regions of SA12, BKV, and JCV (Fig. 4C). However, the putative linker regions of the SA12, BKV, and JCV T antigens show excellent alignment (Fig. 4D). Perhaps this region of the SA12, BKV, and JCV T antigens has conserved a function that has either been lost or replaced by a different function in the SV40 LT antigen.
Putative coding sequences for agnoprotein and the capsid proteins VP1, VP2, and VP3 (Fig. 3C to E) are located in the viral late region. The organization of these genes is essentially the same as seen for SV40, BKV, and JCV. Again, sequence comparisons indicate that SA12 is most closely related to BKV.
The large T antigen encoded by SV40 includes a DNA binding domain (origin binding domain) which recognizes the sequence 5′-GAGGC-3′. Four copies of this sequence are present in the SV40 core ori. Similarly, 9 of the previously sequenced 11 polyomaviruses have four copies of this pentanucleotide in their core ori regions, oriented in the same manner as SV40. The remaining two polyomaviruses, BFDV and GHFV, do not have this sequence in their ori regions. Presumably, the large T antigens encoded by these viruses recognize a different DNA sequence.
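Counting copies of the pentanucleotide in an ori region means scanning both strands, since a site on the opposite strand reads 5′-GCCTC-3′ on the strand being searched. The Python sketch below shows one way to do this; the function name is hypothetical and the input is a short toy sequence made up for illustration, not the SA12 or SV40 ori.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def find_motif_both_strands(seq: str, motif: str = "GAGGC"):
    """Return (position, strand) hits using 1-based top-strand coordinates.

    '+' means the motif reads 5' to 3' on the given strand; '-' means its
    reverse complement is present, so the motif lies on the other strand.
    """
    rc_motif = motif.translate(COMPLEMENT)[::-1]
    k = len(motif)
    hits = []
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window == motif:
            hits.append((i + 1, "+"))
        elif window == rc_motif:
            hits.append((i + 1, "-"))
    return hits


# Toy sequence (invented): one site on each strand, one base pair apart.
toy = "GAGGCGGCCTCG"
print(find_motif_both_strands(toy))  # → [(1, '+'), (7, '-')]
```

Passing `motif="GTGGC"` to the same function would locate a nonconsensus site of the kind described for SA12.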
We previously reported that SA12 has three 5′-GAGGC-3′ pentanucleotides in its ori region but that the fourth has the sequence 5′-GTGGC-3′ (8). We hypothesized that the lower rate of viral DNA replication and small plaque size exhibited by SA12 might be due to the presence of this nonconsensus pentanucleotide. To test this hypothesis, we constructed a mutant, SA12-T10A, in which this pentanucleotide was changed to the consensus 5′-GAGGC-3′ sequence. As shown in Fig. 5, SA12-T10A exhibited the same plaque size (data not shown), growth kinetics, and virus yield as wild-type SA12. Furthermore, the time course of viral DNA replication was the same for SA12-T10A and wild-type SA12. However, SA12-T10A replicated DNA to a somewhat lower level than wild-type SA12. Thus, conversion of the SA12 ori to the consensus does not enhance viral DNA replication or virus yield.
Polyomaviruses express an early-region precursor RNA that is differentially spliced to create multiple mRNAs. Two of the resulting mRNAs encode large T antigen or small t antigen. Several polyomaviruses have been shown to express mRNAs encoding smaller T antigens termed tiny T, 17K T, or T* (30, 39, 43). In addition, murine polyomavirus and hamster polyomavirus express an mRNA encoding a middle T antigen. We predicted the splice junctions of the SA12 large T antigen mRNA and small t antigen mRNAs by sequence alignment with SV40, BKV, and JCV (Fig. 6A).
We confirmed these splice junctions by sequence analysis. BSC40 cells were infected with SA12 at an MOI of 5, and total RNA was extracted at 24 h and 72 h postinfection. Next, cDNA was synthesized using this RNA as a template. To sequence across the splice junctions, we generated two pairs of PCR primers that flanked the putative early-region splice sites (Fig. 6B). PCR was performed on the cDNA from both time points, and pUC119-SA12 served as a positive control. The PCR products were resolved through an agarose gel and visualized (Fig. 6C and D).
The predicted size of the PCR fragment from genomic SA12 is 549 bp, which agrees with the PCR product of pUC119-SA12, where one band about this size is visible (Fig. 6C, lane 2). The large T antigen cDNA PCR product is visible in the 24- and 72-h cDNA reactions, but the small t antigen cDNA PCR product is visible only in the 72-h cDNA reaction (Fig. 6C, lanes 4 and 6). The PCR amplification of the 24- and 72-h cDNA revealed a band that comigrated with the band in the pUC119-SA12 reaction. This fragment was later shown to be contaminating SA12 genomic DNA through sequencing (data not shown). Since subsequent sequencing of the small t antigen PCR product using primer pair 15/16 was ineffective, we performed PCR on 72-h cDNA using primer pair 15/17 (Fig. 6D, lane 3).
Since the 72-h cDNA amplification yielded the strongest signals for the small t and large T antigen PCR products, the bands corresponding to the small t (Fig. 6D, lane 3) and large T (Fig. 6C, lane 6) antigens from these amplifications were excised, purified, and sequenced. Sequencing revealed that both share the same 3′ acceptor splice site, which ends with the sequence AG (where the G is nucleotide position 4533). The 5′ donor splice site for large T antigen starts with a GT (where the G is nucleotide position 4880), and the 5′ donor splice site for small t antigen starts with a GT (where the G is nucleotide position 4603). Therefore, processing of large T antigen and small t antigen pre-mRNAs results in the removal of 348 and 71 bases of RNA, respectively (Fig. 6E).
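The intron sizes quoted above can be reproduced from the splice-site coordinates by inclusive counting. The Python sketch below assumes that the early region is read toward decreasing nucleotide numbers and that each intron runs from the G of the donor GT to the G of the acceptor AG; pairing positions 4880 and 4603 with position 4533 is the combination that gives the stated sizes of 348 and 71 bases. The function name is ours.

```python
def intron_length(first_g: int, last_g: int) -> int:
    """Inclusive number of bases between the two boundary nucleotides
    of an intron, independent of which strand carries the transcript."""
    return abs(first_g - last_g) + 1


# Boundary shared by both early-region introns (assumed from the arithmetic).
SHARED_BOUNDARY = 4533

print(intron_length(4880, SHARED_BOUNDARY))  # large T antigen intron → 348
print(intron_length(4603, SHARED_BOUNDARY))  # small t antigen intron → 71
```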
It is becoming increasingly clear that miRNAs, originally identified in Caenorhabditis elegans, play a broad and diverse role in the control of gene function in most eukaryotes (1). miRNAs are approximately 22-nucleotide RNAs that function by binding to the mRNAs of target genes with various degrees of antisense complementarity. miRNAs inhibit protein expression of their target mRNAs by driving cleavage or translational inhibition (37). Several viruses have recently been described to encode microRNAs; however, as with their cellular counterparts, little is known about their function (2, 28, 29). We have recently identified a virally encoded pre-microRNA expressed late in SV40 infection that is processed into several miRNAs (36). These miRNAs function to down-regulate early gene expression and reduce T-cell-mediated cellular lysis of SV40-infected cells in vitro, perhaps by reducing the amount of T antigen presented by major histocompatibility complex class I. Importantly, this pre-miRNA is computationally predicted to be conserved in other polyomaviruses, including JCV, BKV, and SA12 (36) (Fig. 7A), which suggests a conserved function for the miRNAs generated among these viruses. Thus, determining whether SA12 actually expresses virally encoded miRNAs would have important implications for the replication cycle of SA12 as well as the validity of the in silico engine that generated these predictions.
To determine if SA12 expresses an miRNA, BSC40 monkey kidney epithelial cells were infected with SA12, and total RNA was harvested late in infection. Northern blot analysis was conducted with radiolabeled oligonucleotide probes directed to either the 5′ arm or 3′ arm of the predicted miRNA (Fig. 7B and C). In infected cells, we detected a prominent band migrating at ~60 nt, corresponding well with the observed pre-miRNA reported for SV40. Importantly, we also detected an approximately 22-nt band in all lanes that express the 60-nt band, consistent with a model in which SA12 derives ~22-nt miRNAs from its abundant pre-miRNA precursor, similarly to SV40. Thus, we conclude that SA12 is the second member of the Polyomaviridae confirmed to encode miRNAs.
In order to be able to quantify SA12 DNA for in vitro and in vivo studies, we have developed real-time, quantitative PCR assays for this viral genome. Two sets of primers and probes were chosen based on the Primer Express software (Table 1), one set for each of the early and late regions. When tested against the SA12 genome, the reactions were sensitive down to between 1 and 10 copies per reaction (Table 2). When tested against 10^7 copies of the related primate polyomaviruses BKV, JCV, SV40, and LPV, no reaction was found, demonstrating the specificity of the assay (Table 2).
Members of the Polyomaviridae have been found in birds, rodents, and human and nonhuman primates. Most studies of this viral family have focused on SV40 and murine polyomavirus because they were the first members of the group to be discovered and because both grow well in cell culture. Little is known about polyomavirus transmission in nature, including that in humans. Furthermore, it is not known how many different polyomaviruses exist in nature or whether specific polyomaviruses are restricted to a single host or circulate among multiple hosts. As a start towards studying these issues, we have characterized the primate polyomavirus SA12.
As with other members of the Polyomaviridae, the early and late genes are encoded on opposite DNA strands and are separated by a noncoding regulatory region that includes the transcriptional promoters and enhancers and ori. The polyadenylation signals for early and late transcripts are located near each other at the opposite side of the circular genome from the ori region.
We have previously reported the sequence of the SA12 regulatory region (8). The ori regions of most characterized polyomaviruses consist of four GAGGC pentanucleotides flanked by A/T-rich sequences on the late side and an imperfect palindrome on the early side. The transcriptional enhancers lie to the late side of the A/T-rich sequences. Each of these features is present in the SA12 regulatory region, except that SA12 contains only three GAGGC repeats. The position where the fourth pentanucleotide should be is occupied by a GTGGC pentanucleotide instead. As we previously observed, SA12 is very closely related to the human virus BKV, including in the enhancer, a region thought to be important in determining host range specificity. Analysis of the complete SA12 sequence reveals a close relationship between SA12 and BKV in all coding regions of the viral genome. This is evident at both the DNA and protein level as calculated by global sequence alignment.
SA12 is the third example of a polyomavirus that does not contain four GAGGC pentanucleotides within its core ori. The other exceptions are BFDV and GHFV, which do not contain any GAGGC pentanucleotides, suggesting that these viral T antigens have entirely different DNA binding specificities than SA12 and the other polyomaviruses. We considered the possibility that the SA12 strain that we sequenced acquired a mutation in one pentanucleotide during laboratory passage. To test this, we generated a mutant of SA12 that converted the GTGGC sequence to a consensus GAGGC pentanucleotide. The time courses of viral DNA replication were similar for the mutant and the laboratory isolate, although the mutant did not replicate DNA to the same extent. In addition, the presence of four consensus pentanucleotides did not alter SA12 plaque size or virus yield. We still cannot exclude the possibility that the GTGGC pentanucleotide is an artifact of laboratory passage. Sequencing of natural SA12 isolates will be needed to address this issue.
We detected coding sequences for large T antigen and small t antigen in the viral early region. SA12 does not contain sequences related to the middle T antigens encoded by the murine or hamster polyomaviruses. We also did not detect any evidence for T*- or tiny T-related proteins encoded by SV40, JCV, or murine polyomavirus; however, our data do not exclude the possibility that such proteins are made.
We previously reported that SA12 encodes an ST protein that consists of two regions: an amino-terminal J domain and a carboxy-terminal domain that contains spatially conserved cysteine repeats (8). This is typical of the ST antigens encoded by other polyomaviruses.
SA12 encodes an LT protein that consists of multiple domains that are related to the LT proteins encoded by other polyomaviruses. The amino-terminal J domain is followed by a putative flexible region (56 amino acids in SA12) that includes an LXCXE motif that governs interaction with the retinoblastoma family of tumor suppressor proteins and a nuclear localization signal. Amino acids that are phosphorylated in SV40 LT antigen as well as residues shown to be important for the interaction of SV40 LT antigen with CUL7 and BUB1 are conserved in SA12. These sequences are followed by the putative origin-binding domain, Zn-binding domain, and ATPase domain.
The LT proteins of many polyomaviruses terminate at the end of the ATPase domain. However, like SV40, BKV, and JCV, the SA12 LT antigen includes additional sequences at its carboxy terminus. These sequences consist of a variable region and the HR domain (27). The SV40 HR function maps to the carboxy-terminal 38 amino acids of LT antigen. There is good alignment between this region of SV40 LT antigen and the LT proteins of SA12, BKV, and JCV. The variable region of SV40 is not well conserved with the corresponding regions of the SA12, BKV, and JCV T antigens. Surprisingly we found an excellent alignment of the putative variable regions of SA12, BKV, and JCV. Thus, this region of the SV40 LT antigen has diverged significantly from the corresponding region of these other polyomavirus T antigens. The close sequence conservation of the variable regions of the SA12, BKV, and JCV T antigens makes it likely that this part of LT antigen serves an as-yet-unknown function.
Like SV40, SA12 encodes microRNAs on the late coding strand opposite the LT variable region (36). BKV and JCV are also predicted to encode microRNAs in this region, while other polyomaviruses are not predicted to encode microRNAs in this general region. So far all polyomaviruses that possess a variable region and an HR domain encode microRNAs. One intriguing possibility that would explain the close genetic linkage between the presence of a variable region/HR domain and microRNAs is that this region of LT antigen performs some function associated with perturbing microRNA biogenesis or function.
The SA12 late-coding region encodes an agnoprotein, as well as VP2, VP3, and VP1. The organization of the late coding sequences is similar to that found in other polyomaviruses, including overlapping VP2 and VP3 coding sequences, and the overlap of VP2/VP3 coding sequences with those of VP1. Known functional elements of the capsid proteins, such as the VP2/VP1 interaction surface, and key VP1 cysteine residues are conserved in SA12.
We have found that SA12, like SV40, encodes microRNAs. The SA12 miRNAs share a remarkable degree of similarity with the SV40 miRNAs. First, the miRNAs derived from both viruses are complementary to the early transcripts, are encoded in the late orientation past the 3′ polyadenylation cleavage site, and are expressed at higher levels as productive infection progresses late into the viral life cycle (data not shown). Second, both SA12 and SV40 display a pre-miRNA/miRNA ratio that is significantly higher than for most other reported viral or cellular microRNAs, with the pre-miRNA being more abundant than the miRNA (Fig. 7C). Third, the miRNAs derived from the pre-miRNA are readily detected from both arms of the pre-miRNA (Fig. 7C; note the ~22-nt band in both the left and right panels), which is an atypical although not completely uncommon finding. This is consistent with miRNAs from both arms having complementarity to, and thus the ability to target, the early transcript. The congruity of miRNA attributes between SV40 and SA12 strongly supports a conserved function. Thus, it is likely that SA12, like SV40 (36), utilizes miRNAs to autoregulate early gene expression late in infection. Confirming that SA12 expresses this pre-miRNA lends credence to our earlier in silico predictions that also suggested that JCV and BKV express homologs of this pre-miRNA. Expression of these miRNAs may lead to reduced antigen presentation of the early gene products and perhaps contribute to less T-cell activation, as has been demonstrated for SV40 in vitro (36). In vivo, it is possible that miRNA-mediated autoregulation contributes to tamping down virus production during the course of the lifelong, persistent, subclinical infection that is common among these polyomaviruses.
The known primate polyomaviruses (SV40, BKV, JCV, LPV, and SA12) are closely related, with BKV, SV40, JCV, and SA12 all being remarkably similar to one another. The amino acid sequences of SA12 and BKV VP1 are 83% identical, and the overlapping VP2/VP3 coding sequences of SA12 and BKV are 82% identical. Therefore, it may be very difficult to distinguish these viruses by serology, as has been the case for SV40 and BKV (5, 22, 42). SV40 and BKV are related to a comparable degree: the amino acid sequences of SV40 and BKV VP1 are 82% identical, and the overlapping VP2/VP3 coding sequences of SV40 and BKV are 72% identical.
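Percent identity figures such as those above are derived from pairwise alignments. As a minimal sketch of one common convention (matches divided by the number of aligned, ungapped columns), the following could be used on two already-aligned sequences. The function name and toy sequences are hypothetical, and the exact convention and software behind the values reported here are not specified in this section, so results from this sketch need not match them.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over columns where neither aligned sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns (an assumed convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared

# Made-up aligned fragments for illustration only; not real VP1 sequences.
print(round(percent_identity("MKAP-LVE", "MKGPTLIE"), 1))
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, which yields lower values whenever gaps are present.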
Little is known about SA12 infections in nature. The fact that SA12 is so closely related to BKV raises the possibility that SA12 might infect humans. In order to follow the course of SA12 infection in baboons, the presumed natural host, and to distinguish SA12 infections from BKV infections, we developed a quantitative PCR assay that distinguishes SA12 from the other primate polyomaviruses. In an earlier study, we had developed quantitative PCR assays against SV40, BKV, and JCV (A. Pal et al., submitted for publication) and against LPV (unpublished). These primers and probes did not react against SA12, and thus specific quantitative PCR assays exist for all four of these primate polyomavirus genomes.
The known primate polyomaviruses encode very closely related proteins. However, each possesses a unique regulatory region. Thus, it is hypothesized, though not yet proven, that the cell type tropism and host range exhibited by these viruses stem largely from their regulatory regions. While SA12 possesses a distinct regulatory region, it is closely related to the regulatory region of the human virus BKV. Based on the similarity of these two viruses, it will be interesting to determine if SA12 can infect humans. One intriguing possibility is that some polyomaviruses are continually crossing between human and nonhuman primates. However, the controversy over whether SV40 is a human infectious agent may indicate how difficult this will be to document. Finally, we note that only 11 polyomaviruses have been sequenced to date and that each possesses common as well as unique features. This variation in polyomavirus sequences suggests that many other members of this family are circulating in the biosphere.
This work was supported by NIH grant CA40586 to J.M.P. Support to K. W. C. Peden and A. M. Lewis was from the National Vaccine Program Office. A. Pal was supported in part by a fellowship from the National Vaccine Program Office. C. S. Sullivan was supported by a G. W. Hooper Research Foundation Fellowship.
We thank Don Ganem for miRNA reagents and helpful discussions.