Exploration of the genetic diversity of WU polyomavirus (WUV) has been limited in terms of the specimen numbers and particularly the sizes of the genomic fragments analyzed. Using whole-genome sequencing of 48 WUV strains collected in four continents over a 5-year period and 16 publicly available whole-genome sequences, we identified three main WUV clades and five subtypes, provisionally termed Ia, Ib, Ic, II, IIIa, and IIIb. Overall nucleotide variation was low (0 to 1.2%). The discriminatory power of the previous VP2 fragment typing method was found to be limited, and a new, larger genotyping region within the VP2/1 interface was proposed.
In 2007, two new human polyomaviruses isolated from respiratory samples of pediatric patients suffering from respiratory disease were discovered, with one being KI polyomavirus (KIV) (2) and the other being WU polyomavirus (WUV) (8).
WU polyomavirus shares most of the genomic characteristics of other polyomaviruses, with a noncoding control region (NCCR) separating the early and late coding regions on opposite strands. However, unlike for JCV and BKV, but similar to what was observed for KIV, a late-region-residing agnoprotein gene has not been identified in WUV (8).
Although WUV is frequently detected in respiratory samples of ill patients, no distinct disease associations have so far been conclusively identified for it (1, 2, 4, 8, 10, 27). There have been some suggestions that sequence variation plays a role in disease severity and pathogenesis in other polyomaviruses (6, 24). Unfortunately, because research into WUV is at an early stage, few complete genomic sequences have been available.
In this study, we set out to investigate a large sample set of whole WUV genomes from diverse geographical, temporal, and clinical origins. Incorporating existing WUV genomes with this data set allowed us to investigate global WUV genomic diversity, to characterize the WUV genome, and to propose a new robust typing scheme.
The study sample set was obtained from both published and undocumented sources (see Table S1 in the supplemental material). All candidate samples were either detected with or confirmed by the WU-B and WU-C real-time PCR assays (3). Of the 48 samples chosen, only 4 (B38 to -41) had previously been subjected to sequencing, and these were limited to the NCCR and VP1 regions (5).
A total of 33 samples in which WUV had been detected by PCR in previously published study populations in Australia (n = 19) (4, 5), Canada (n = 4) (1), Netherlands (n = 4) (27), South Korea (n = 3) (10), and Sweden (n = 3) (13) were selected (see Table S1 in the supplemental material).
WUV-positive samples from ongoing studies were also used (n = 15) and included respiratory samples collected during 2008 from South-East Queensland (Australia) hospital patients suffering from respiratory disease (n = 5), nose and throat swabs from hematology patients (Westmead Hospital, NSW, Australia) (including hematopoietic stem cell transplant recipients [n = 4]), and nasopharyngeal swabs from pediatric otitis media (OM) patients in an isolated Northern Territory (Australia) indigenous community (n = 6) (see Table S1 in the supplemental material).
Overlapping regions spanning the entire WUV genome were amplified utilizing 10 primer pairs (see Table S2 in the supplemental material). cDNA fragments were sequenced bidirectionally with BigDye 3.1 chemistry (Applied Biosystems Pty. Ltd., Australia), with anomalous sequencing results between overlapping regions reamplified and resequenced.
Alignments, entropy plots, and contig assembly were performed using BioEdit (9). The GenBank accession numbers for all generated genomes are shown in Table S1 in the supplemental material. All numbering conventions follow the prototype B0 (GenBank accession number NC_009539) WUV sequence.
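The per-position entropy underlying such plots can be sketched as follows. This is a minimal illustration and not BioEdit's own implementation; the function name, the use of the natural logarithm, and the toy alignment are assumptions made for the example.

```python
import math
from collections import Counter

def column_entropy(alignment):
    """Return the Shannon entropy (natural log) of each alignment column."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")
    entropies = []
    for i in range(length):
        counts = Counter(seq[i] for seq in alignment)
        total = sum(counts.values())
        # H = sum over observed characters of p * ln(1 / p)
        h = sum((c / total) * math.log(total / c) for c in counts.values())
        entropies.append(h)
    return entropies

# Hypothetical toy alignment, not WUV data.
toy_alignment = ["ACGT", "ACGA", "ACTA", "ACTA"]
entropy_profile = column_entropy(toy_alignment)
```

Fully conserved columns score zero, and a column split evenly between two bases scores ln 2, so peaks in the profile mark the variable positions.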
The overall genomic variability of the 64 WUV strains investigated in this study was low (0 to 1.2%), with several islands of dense diversity in the VP2, VP1, and LTAg N-terminal-end regions (Fig. 1). Variation was greatest in the VP1 region at both the nucleic acid and the amino acid levels and, to a lesser extent, in the VP2 and STAg genes (see Table S3 in the supplemental material).
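A range of this kind can be obtained from pairwise comparisons of aligned sequences. The sketch below assumes a simple uncorrected percent difference over ungapped positions; the text does not state exactly how the range was computed, and the function names and toy sequences are hypothetical.

```python
from itertools import combinations

def percent_difference(a, b):
    """Percent of aligned, ungapped positions at which two sequences differ."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no ungapped positions to compare")
    diffs = sum(1 for x, y in pairs if x != y)
    return 100.0 * diffs / len(pairs)

def variation_range(alignment):
    """Return (min, max) percent difference over all sequence pairs."""
    values = [percent_difference(a, b) for a, b in combinations(alignment, 2)]
    return min(values), max(values)

# Hypothetical toy alignment, not WUV data.
toy_alignment = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT"]
lowest, highest = variation_range(toy_alignment)
```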
Recurring variant positions were noted in the core NCCR, including several type-specific conserved changes (Fig. 1). No variation was observed in the NCCR past position 286, regardless of genotype, patient origin, or patient immune status, making the late-NCCR-to-early-VP2 stretch comprising nucleotides (nt) 287 to 722 the largest fully conserved region within the WUV genome. This conservation was also found in the variation-rich NCCR sequences described by Sharp et al. (21), implying that a strong negative selective pressure is being exerted on that region. It is tempting to speculate that the lack of variation in the region which typically codes for the agnoprotein in other polyomaviruses could be an indication of a yet-to-be-discovered regulatory or coding function critical to WUV viability or fitness.
After removal of the LTAg and VP2 start codon-framed NCCR, genomic sequences from this study, along with the 16 WUV genomes available in GenBank (under accession numbers EU711058, EU711057, EU711056, EU711055, EU711054, FJ890982, FJ890981, NC_009539, EU358769, EU358768, EU296475, EF444554, EF444553, EF444552, EF444551, and EF444550 as of 17 September 2009), were analyzed using neighbor-joining (NJ) analyses with 1,000 bootstraps and the Tamura-Nei substitution method (Mega4.1) (26). The best available nucleotide substitution model was chosen with the help of the FindModel application (http://www.hiv.lanl.gov/content/sequence/findmodel/findmodel.html).
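Bootstrap support values for such trees come from resampling alignment columns with replacement and rebuilding the tree on each replicate. A minimal sketch of the resampling step alone follows; tree construction is omitted, and the function name, seed, and toy alignment are hypothetical and do not reproduce MEGA's implementation.

```python
import random

def bootstrap_replicate(alignment, rng):
    """Resample alignment columns with replacement, keeping rows aligned."""
    n_cols = len(alignment[0])
    picks = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(seq[i] for i in picks) for seq in alignment]

# Hypothetical toy alignment, not WUV data.
toy_alignment = ["ACGTAC", "ACGTAT", "ATGTAT"]
rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_replicate(toy_alignment, rng) for _ in range(1000)]
```

Each replicate has the same dimensions as the original alignment; the bootstrap value of a clade is the percentage of replicate trees in which that clade is recovered.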
Three clear clades supported by high bootstrap values (≥99) were evident and, in accordance with BKV typing terminology, were putatively named genotypes I, II, and III (Fig. 2). Parallel likelihood heuristic (PAUP* 4.0b10) (25) and maximum parsimony (MEGA 4.1) (26) analyses confirmed the distinct nature of the 3 clades, producing equally high bootstrap values for all 3 genotypes (data not shown). On the basis of the genomic NJ tree, further subdivisions were evident within genotypes I and III. Genotype I could be split into what could be considered the main body of the group, or subtype Ia, followed by the major branch Ib and the divergent subtype Ic (Fig. 2). Genotype III could be further divided into two groups, subtype IIIa and subtype IIIb (Fig. 2). Network phylogeny generated in the SplitsTree4 software package (http://www.splitstree.org/) (11) by the Neighbor-Net method confirmed the distinct separation of the three main genotypes as well as the five subtypes (see Fig. S1 in the supplemental material). Additionally, subtype Ic retained an association with the main body of genotype I while at the same time illustrating its divergent nature (see Fig. S1 in the supplemental material). All individual gene NJ tree analyses retained the distinction between genotypes; however, further subtype and topographic resolutions showed marked variability (Table 1; see also Fig. S2 in the supplemental material).
Of particular note was the repositioning of genotype II closer to genotype III in both the VP2 and the VP1 trees (see Fig. S2 in the supplemental material), which would suggest a potential recombination event. Two methods were used to investigate this possibility; however, no evidence of recombination was detected with the RDP3beta application (http://darwin.uvigo.es/rdp/rdp.html) (16, 17), and no significant recombination events (P = 0.90) were identified by the PHI test in SplitsTree4 (11).
Since its description in the original WUV article written by Gaynor et al., the 207-bp-long VP2 typing region has been the typing target of choice in the majority of subsequent WUV studies (1, 8, 10, 28). This region was originally chosen based on sequence data from six whole WUV genomes; however, no subsequent evaluation of the accuracy of such a typing scheme has been performed, partially due to the limited availability of full WUV genomes. Of additional concern to us was the short length of the typing region, as similar-sized regions for BKV have been found to lack the discriminatory power to adequately separate all known genotypes (14). When applied to the full genomic alignment, the WUV VP2 typing region was sufficient to discriminate between types I and III but lacked resolution for most subtypes and type II and gave substantially lower bootstrap values for many of the clades (Table 1 and Fig. 3). Several discrepancies were noted with the genotyping scheme of Venter et al. (28), which utilizes the short VP2 typing region, although a thorough comparison could not be performed because full genomes from the appropriate isolates were unavailable. Briefly, genotypes 1, 3, and 4 as described by Venter et al. generally correlated with our genomic types I, IIIa, and IIIb, respectively. No equivalents could be found for genotype 2 described by Venter et al. and our genome type II and subtype Ic sequences in each other's classification schemes. Subtype Ia of Venter et al. was in overall agreement (excluding isolate v367) with our subtype Ib, although their remaining subtypes, Ib to -e, did not provide sufficient differentiation or bootstrap values under the VP2 typing region to allow for further comparison with this study's proposed classification scheme.
Thus, we set out to design a more robust typing scheme which aimed to accurately reflect the genomic NJ tree. In brief, regions of genetic variability which also contained a high proportion of informative single nucleotide polymorphisms (SNPs) (Fig. 1) were used to generate potential candidate WUV typing regions. Consensus NJ trees were generated for each candidate typing region, and candidate typing region trees were judged for clade and outer taxonomic unit fidelity as well as node strength.
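One way to shortlist such regions is to count informative sites in sliding windows along the alignment. The sketch below assumes that "informative" means parsimony informative (at least two different bases, each present in at least two sequences), which the text does not state explicitly; the function names, window size, and toy alignment are hypothetical.

```python
from collections import Counter

def is_informative(column):
    """True if at least two different bases each occur in two or more sequences."""
    counts = Counter(c for c in column if c in "ACGT")
    return sum(1 for n in counts.values() if n >= 2) >= 2

def informative_sites(alignment):
    """Return 0-based indices of parsimony-informative columns."""
    return [i for i in range(len(alignment[0]))
            if is_informative([seq[i] for seq in alignment])]

def window_counts(alignment, window, step=1):
    """Return (start, informative-site count) for each sliding window."""
    sites = set(informative_sites(alignment))
    n_cols = len(alignment[0])
    return [(start, sum(1 for i in range(start, start + window) if i in sites))
            for start in range(0, n_cols - window + 1, step)]

# Hypothetical toy alignment, not WUV data.
toy_alignment = ["AACGTA", "AACGTA", "ATCGCA", "ATCACA"]
sites = informative_sites(toy_alignment)
counts = window_counts(toy_alignment, window=4)
```

Windows with the highest counts would then be candidates for tree building and comparison against the whole-genome tree.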
Several promising sites across the four main genes and of lengths ranging from 300 to 800 bp were assessed; however, most were found to be unable to discriminate between all of the genotypes and subtypes (data not shown). A 679-bp region spanning the VP2/VP1 genes at positions 1620 to 2299 provided better overall clade separation and bootstrap values than the other analyzed genomic regions as well as the original VP2 typing region (Table 1 and Fig. 3).
On the basis of the VP2/1 typing region comprising nt 1620 to 2299, typing primers WUT-F (5′-GGTACTCCCCATTATGCAGCC-3′) and WUT-R (5′-GGTTGGAGGGGCTGCAA-3′) were designed. Both primers were completely conserved between all genotypes and flanked the typing region, producing an 806-bp amplicon. The suggested typing target is much larger than the original VP2 typing region; however, larger typing fragment sizes have been recognized to confer a greater ability to hold discriminatory information and to achieve greater fidelity with the true tree (14, 18).
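The relationship between a primer pair and its amplicon can be illustrated with a small in silico check. In the sketch below the primer sequences are those given above, while the template, its filler sequence, and the function names are hypothetical and serve only to show how the forward primer and the reverse complement of the reverse primer bound the product.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def gc_percent(seq):
    """Return the percentage of G and C bases in a sequence."""
    return 100.0 * sum(seq.count(base) for base in "GC") / len(seq)

def amplicon(template, forward, reverse):
    """Return the product bounded by both primers on the given strand, or None."""
    start = template.find(forward)
    if start == -1:
        return None
    site = reverse_complement(reverse)  # reverse primer binds the opposite strand
    end = template.find(site, start + len(forward))
    if end == -1:
        return None
    return template[start:end + len(site)]

WUT_F = "GGTACTCCCCATTATGCAGCC"
WUT_R = "GGTTGGAGGGGCTGCAA"

# Hypothetical template: the two primer sites around 40 nt of filler, not WUV sequence.
toy_template = WUT_F + "ACGT" * 10 + reverse_complement(WUT_R)
product = amplicon(toy_template, WUT_F, WUT_R)
```

Applied to a real genome sequence in place of the toy template, the same function would return the typing amplicon, primers included.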
From available clinical records, no apparent correlation between genotype and immune status or clinical features was noted outside the OM-originating sequences (see Table S1 in the supplemental material). The inability to identify distinct clinical associations with genotype may be due to the disproportionate number of respiratory sample-based WUV sequences used, although the one sequence obtained from feces clustered with the general respiratory-oriented population. WUV sequences from all chronic OM samples clustered and were the sole representatives of groups II and IIIb (Fig. 2). Due to the small sample size, it is not known if types II and IIIb are associated with chronic OM or if they merely reflect geographic diversification. Further studies are under way to clarify this issue.
Our results suggest a trend between geographical origin and genotype (see Table S1 in the supplemental material), similar to that found with JCV and BKV (23, 29), although more sequence data are needed to confirm such a correlation. In particular, genotype II and subtype IIIb samples originated from a small indigenous island community and may represent genotypes unique to that specific geographical region or to indigenous Australians more generally.
Of the three identical genotype II WUV sequences originating from children with OM, two (O61 and O63) were collected from the same patient, 3 months apart. The third, O3, was collected from a second patient, but within the same isolated island community approximately 3 years prior, which suggests that genetically stable WUV populations can circulate within communities.
Several WUV isolates were obtained sequentially from immunosuppressed patients 1, 7, 15, or 112 days apart (see Table S1, boxed sequences, in the supplemental material). Sequences from samples collected over a short time span (1, 7, and 15 days) did not change, while the WUV sequences from the sample collected 112 days apart contained one nonsynonymous change (830C → T) within the VP2 gene. The collection timing of these two samples correlated with the progression from flu-induced symptoms to community-acquired pneumonia of unknown origin; however, we were not able to determine if the residue change happened in vivo in response to the patient's disease progression or if the acquisition of pneumonia facilitated reinfection with a new WUV strain.
Overall, there was little to no variation seen in predicted early and late protein features throughout the genotypes (see Table S4 in the supplemental material) (2, 7, 8, 12, 15, 19, 22). Within VP2, all genotype II and subtype IIIa sequences contained a corrupted stop codon (TAA → TCA), allowing for the C-terminal-end extension of two additional serines. Two conserved point deletions were observed outside the LTAg coding region, a T deletion in all genotype III and Chinese FZTF and FZ18 sequences within the LTAg intron and a T deletion immediately downstream of the LTAg stop codon within genotypes II and III (Fig. 1). The conservation of the deletions within genotypes and across continents suggests regulatory or processing functions within those areas. Indeed, one of the deletions resides in an analogous area which has been shown to produce regulatory microRNAs (miRNAs) in other polyomaviruses (20) and thus may betray likely sites of miRNA production in the WUV genome. These sites are currently being investigated for their potential miRNA production using independent experimental infectious system methods.
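The effect of the TAA → TCA change on the VP2 reading frame can be illustrated with the standard genetic code, in which TAA is a stop codon and TCA encodes serine. In the sketch below the flanking codons are hypothetical and were chosen only so that the read-through adds two serines before the next stop; they are not the actual WUV sequence.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna):
    """Translate in frame from the first base, stopping at the first stop codon."""
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)

# Hypothetical C-terminal codons, not the actual WUV VP2 sequence.
intact_tail = "AAGTAATCTTAG"    # AAG TAA ...      -> stop at TAA
mutated_tail = "AAGTCATCTTAG"   # AAG TCA TCT TAG  -> read-through to TAG
intact_protein = translate(intact_tail)
mutated_protein = translate(mutated_tail)
```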
Of particular interest were the observed type-specific changes within VP1's predicted antigenic loops (see Table S4 in the supplemental material) (12), which could indicate that genotypes II and III may also be serologically distinct, in a fashion similar to that observed for BKV. The antigenic qualities of the type-specific VP1 sequences therefore need further investigation to determine if such antigenic diversity exists.
Accurate classification and typing methods will be critical in the future for cataloguing WUV isolates and investigating what role these genetic groups play in human biology and disease. This study has used full genomic sequencing of samples collected from four continents and various patient groups to characterize the diversity of WU polyomavirus and to propose a new classification scheme. Finally, a typing method was rationally designed based on genome-wide informative variables, which accurately represented the genomic phylogeny.
This study was supported by Royal Children's Hospital Foundation grant I 922-034, sponsored by the Woolworths Fresh Futures Appeal.
We thank Cheryl Blechly and the Molecular Diagnostics Unit staff at the Royal Brisbane Hospital for their assistance in obtaining clinical samples, Tae-Hee Han for providing WUV-positive material, and Marieke van der Zalm for sharing samples from the WHISTLER study.
This study was conducted predominantly at the Sir Albert Sakzewski Virus Research Centre, Queensland Children's Medical Research Institute, Children's Health Service District, Brisbane, Queensland, Australia. Additional screening was performed at Le Centre Hospitalier Universitaire de Québec and Laval University, Québec City, Quebec, Canada; the Department of Microbiology, Tumor and Cell Biology, Karolinska Institutet and Department of Clinical Microbiology, Karolinska University Hospital, SE-17176 Stockholm, Sweden; the Department of Pediatrics, Sanggyepaik Hospital, Inje University College of Medicine, Obang-dong, Kyungnam, Republic of Korea; and the Laboratory of Medical Microbiology and Immunology, St Elisabeth Hospital, Tilburg, The Netherlands.
Published ahead of print on 31 March 2010.
†Supplemental material for this article may be found at http://jvi.asm.org/.