Untreated sewage samples from 12 cities in the United States were screened for the presence of recently characterized RNA and DNA viruses found at high prevalence in the stool specimens of South Asian children. Genetic variants of human cosaviruses and cardioviruses in the Picornaviridae family and of DNA circoviruses and human bocaviruses were detected, expanding the known genetic diversity and geographic range of these newly identified viruses. All four virus groups were detected in sewage samples of less than a milliliter from multiple U.S. cities. PCR screening of particle-protected viral nucleic acid in sewage samples could therefore rapidly establish the presence and determine the diversity of four newly described enteric viruses in large urban populations. More frequent and deeper sampling of viral nucleic acids in sewage samples could be used to monitor changes in the prevalence and genetic composition of these and other novel enteric viruses.
Screening of sewage water for human viruses has a long tradition, starting with the detection and monitoring of polioviruses (9, 10, 12, 21, 25, 52). Untreated sewage water can harbor a diverse mixture of enteric viruses (8, 18, 19, 22, 42, 55). Screening of sewage from the United States, Europe, Australia, and South Africa has revealed numerous human viral pathogens including members of the family Picornaviridae (e.g., polioviruses, coxsackieviruses, and echoviruses), adenoviruses, noroviruses, reoviruses, rotaviruses, and picobirnaviruses (18, 34, 37, 41, 42, 46, 55, 58). In this study, we tested raw sewage samples from the United States for the presence of recently characterized RNA and DNA viruses, including human cardioviruses, cosaviruses, bocaviruses, and circoviruses, whose geographic distribution throughout the United States is largely unknown (14). Three of the four viral groups (human cosaviruses [HCoSV], human bocavirus 2 [HBoV2], and circoviruses) were initially discovered in stool samples from children with nonpolio acute flaccid paralysis (AFP) collected as part of the poliovirus eradication campaign (31, 32, 60) (L. Li, unpublished data). Because these viruses were also present in equal proportions in stool samples from demographically matched healthy children, it is unlikely that they are highly neurovirulent (7, 31, 32) (Li, unpublished), but their involvement in other diseases or their possible pathogenicity in subsets of infections remains to be determined. This study is the first to evaluate the prevalence and genetic diversity of these four novel viral groups in the sewage of cities in the United States.
Human cardioviruses, also known as Saffold viruses (SAFV) or human Theiler's murine encephalomyelitis virus-like cardioviruses, are a recently characterized, highly diverse monophyletic group of picornaviruses closely related to the rodent theiloviruses (7, 36). The first human cardiovirus sequence was reported in 2007 from the archived cell culture supernatant of fetal diploid kidney cells exhibiting unexplained cytopathic effects after inoculation with stool collected from a U.S. child with fever of unknown origin in 1981 (28). Human cardioviruses have since been found in respiratory secretions and stool samples from both healthy children and children with gastroenteritis, influenza-like symptoms, or non-polio-associated AFP in North America, Europe, and South Asia (1, 7, 14, 17). Using a VP1 genetic distance criterion corresponding to neutralization serotypes in members of the Enterovirus genus (another genus of the family Picornaviridae) (43), the known human cardioviruses could be organized into eight VP1 genotypes (7). With increased sampling, it is expected that additional genotypes will be identified. In the United States, in addition to the 1981 strain, seven human cardioviruses have been detected from stool and respiratory samples collected between 2000 and 2007 (14). The highest prevalence of human cardioviruses has been found in stool samples from Pakistani children under 15 years of age (9 to 12%) followed by children in day care in Germany (7.8%) (7, 17). A recent study measuring neutralizing antibodies to a human cardiovirus isolate belonging to the most common of eight known genotypes (gt2) showed seroconversion rates approaching 100% in 2-year-old children in Europe, Africa, and Southeast Asia, reflecting a high rate of early human exposure (61).
Cosavirus, a novel genus in the Picornaviridae family, consists of a highly diversified group of human viruses and includes at least four species based on the genetic distance criteria used for Enterovirus species (24, 32). Human cosaviruses (HCoSV) were recently identified in stool samples from children in Pakistan and Afghanistan, where they were found in roughly equal proportions, nearing 40%, in both healthy children and in children with non-polio AFP (32). In contrast, cosaviruses were detected at much lower frequencies in clinical stool samples from the United Kingdom (1/1,000 samples tested) (32). A cosavirus species was also found recently in a stool sample of an Australian infant with acute diarrhea (24). Although cosaviruses have been detected in South Asia, the United Kingdom, and Australia, the prevalence of this group of viruses in the United States is currently unknown.
Human bocavirus (HBoV), a common respiratory pathogen of children, is a single-stranded linear DNA virus belonging to the genus Bocavirus in the family Parvoviridae, subfamily Parvovirinae (3, 29, 50, 53). HBoV was first identified in 2005 in respiratory tract samples from Sweden (4) and was later found, although less frequently, in stool samples from children with gastroenteritis in South and North America, Europe, Africa, Asia, and Australia (6, 13, 27, 30, 54, 59). A related species, human bocavirus 2 (HBoV2), was recently identified in approximately 5% of stool samples from South Asian children, both healthy and with non-polio-associated AFP, and in about 0.4% of clinical stool samples from United Kingdom residents (two children and one adult) (31). HBoV2 and a new species, HBoV3, were also recently reported in stool samples from Australian children with diarrhea as well as from healthy children with a borderline positive association with gastrointestinal problems (5).
Circoviruses comprise one of two genera within the Circoviridae family having small circular single-stranded DNA genomes. Circoviruses have been reported in birds and mammals and are associated with potentially fatal diseases causing lymphoid tissue damage and immunosuppression in infected animals (48, 56, 57). Circovirus pathogenicity in pigs has been studied extensively, since frequent infections have a significant economic impact (11, 26, 45, 51). Porcine circovirus infections have been reported worldwide, including many countries in Europe and Asia and in Canada, United Kingdom, New Zealand, and the United States (2). We recently identified highly divergent circoviruses in stool samples from humans in Pakistan, Nigeria, and Tunisia and from chimpanzees (Li et al., unpublished). In addition, novel circovirus-like genomes have been identified in environmental samples through metagenomic sequencing; the host range and geographic distribution of these novel viruses remain to be determined (33, 49).
In order to evaluate the prevalence and genetic diversity of newly identified human cardioviruses, cosaviruses, bocaviruses, and circoviruses in the U.S. population, we analyzed sewage samples from 12 cities in 11 states by PCR and sequencing. We show that all four viral groups could be found throughout the United States and that a high degree of genetic diversity was present within each group. Untreated sewage water therefore represents a readily available source of material to rapidly assess the presence and genetic variation of newly discovered enteric viruses in large urban populations.
Raw sewage samples were collected in the fall of 2007 from 12 wastewater treatment facilities from the coastal United States: Alabama, California, Oregon, Washington, Louisiana, Maine, Maryland, New Jersey, North Carolina, Wisconsin, and Florida (including two different treatment facilities in Florida, one in western Florida and one in the Florida Keys). Samples were also collected at multiple times from the two sites in Florida (Table 1).
Purification of viral particles, extraction of viral nucleic acid, and synthesis of cDNA were performed as described earlier (55). Briefly, for each sample, 10 ml was filtered through a 0.45-μm polyether sulfone membrane filter cartridge (Millipore, Billerica, MA). The degree of viral concentration in each sample before nucleic acid extraction ranged from 2- to 167-fold and is shown in Table 1 (55). Nucleic acid was extracted using the QIAamp MinElute virus spin kit (Qiagen, Valencia, CA), and cDNA was synthesized using the first-strand synthesis SuperScript III reverse transcription kit (Invitrogen, Carlsbad, CA) with random hexamers (55). After considering the dilutions of nucleic acids, the equivalent of 76 μl of raw sewage was tested in each PCR.
The nested PCR products were analyzed by electrophoresis on a 1% agarose gel stained with ethidium bromide. The PCRs were treated with ExoSap-IT exonuclease (USB, Cleveland, OH) and then directly sequenced by dideoxy sequencing.
A nested PCR targeting a 265-base pair (bp) region in the conserved 2C (helicase) region was used to test for SAFV. Primers HelF1 and HelR1 were used in the first round, and primers HelF2 and HelR2 were used in the second round. PCR conditions were as previously described (7).
A nested PCR targeting 316 bp in the 5′ untranslated region (5′UTR) was used to test for HCoSV. Primers DKV-N5U-F1 and DKV-N5U-R2 were used in the first round of PCR and primers DKV-N5U-F2 and DKV-N5U-R3 were used in the second round as previously described (32).
A nested PCR targeting 455 bp within the VP1 region was used to test for HBoV1- and HBoV2-related viruses (A. Kapoor, unpublished data). Primers BocaF1 (5′-CGCCGTGGCTCCTGCTCT-3′) and BocaR1 (5′-TGTTCGCCATCACAAAAGATGTG-3′) were used in the first round, and primers BocaF2 (5′-GGCTCCTGCTCTAGGAAATAAAGAG-3′) and BocaR2 (5′-CCTGCTGTTAGGTCGTTGTTGTATGT-3′) were used in the second round. In the first round of PCR, the reaction mixture included 1× ThermoPol reaction buffer with 2 mM MgSO4 (New England Biolabs), 0.25 mM of each deoxynucleoside triphosphate (dNTP), 20 pmol of each primer (BocaF1 and BocaR1), 0.75 μl of Taq DNA polymerase (New England Biolabs), and 1 μl of template DNA in a 50-μl mix. Amplification cycles were as follows: (i) denaturation at 95°C for 2 min; (ii) 6 cycles of amplification, with 1 cycle consisting of 35 s at 95°C, 45 s at 58°C, and 45 s at 72°C, with a decrease of 0.5°C per cycle in the annealing temperature; (iii) 34 cycles of amplification, with 1 cycle consisting of 30 s at 95°C, 30 s at 54°C, and 45 s at 72°C; and (iv) a final extension step of 7 min at 72°C. In the second round, the reaction mixture included 1× ThermoPol reaction buffer with 2 mM MgSO4 (New England Biolabs), 0.25 mM of each dNTP, 20 pmol of each primer (BocaF2 and BocaR2), 0.75 μl of Taq DNA polymerase (New England Biolabs), and 1.5 μl of first-round product as template in a 50 μl mix. The cycle for the second round was as follows: (i) denaturation at 95°C for 2 min; (ii) 10 cycles of amplification, with 1 cycle consisting of 35 s at 95°C, 45 s at 60°C, and 45 s at 72°C, with a decrease of 1°C per cycle in the annealing temperature; (iii) 30 cycles of amplification, with 1 cycle consisting of 30 s at 95°C, 30 s at 58°C, and 45 s at 72°C; and (iv) a final extension step of 7 min at 72°C.
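The touchdown cycling described above can be written out cycle by cycle. The following is a minimal sketch, not the authors' protocol file: the function name `touchdown_schedule` is hypothetical, and it assumes that the first cycle of each touchdown phase anneals at the stated starting temperature and that each later cycle is lowered by the stated step.

```python
def touchdown_schedule(start_temp, step, touchdown_cycles, plateau_temp, plateau_cycles):
    """Return the annealing temperature (in degrees C) for every cycle of one PCR round."""
    # Touchdown phase: assume cycle 1 anneals at start_temp and each
    # following cycle anneals `step` degrees lower than the previous one.
    touchdown = [start_temp - step * i for i in range(touchdown_cycles)]
    # Plateau phase: the remaining cycles all anneal at one fixed temperature.
    plateau = [plateau_temp] * plateau_cycles
    return touchdown + plateau


# Values taken from the bocavirus PCR description in the text.
first_round = touchdown_schedule(58.0, 0.5, 6, 54.0, 34)
second_round = touchdown_schedule(60.0, 1.0, 10, 58.0, 30)

print("First round, touchdown cycles:", first_round[:6])
print("Second round, touchdown cycles:", second_round[:10])
print("Total cycles per round:", len(first_round), len(second_round))
```

Under these assumptions, each round comprises 40 amplification cycles, excluding the initial denaturation and final extension steps.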
A nested PCR targeting a 500-bp region of the replicase gene was used to test for circoviruses. The first-round PCR primers were CV-F1 (GTIMGIMGITTYTGYTTYACITGG) and CV-R1 (CCICCYTTIAYYTGIACYTTRTA), and the second-round primers were CV-F2 (GGIGARGARYTIGCICCIACIAC) and CV-R2 (GGICCCCARTARTARTAIACYTC). In the first round, the PCR mix included 1× ThermoPol reaction buffer with 2 mM MgSO4 (New England Biolabs), 0.25 mM of each dNTP, 50 pmol of each primer, 0.75 μl of Taq DNA polymerase (New England Biolabs), and 2 μl of template DNA in a 50-μl mix. The reactions were carried out with the following cycling profile: (i) 95°C for 5 min; (ii) 40 cycles, with 1 cycle consisting of 1 min at 95°C, 1 min at 52°C, and 1 min at 72°C; and (iii) a final incubation for 10 min at 72°C. The second-round PCR conditions were identical to those of the first round, except that the annealing temperature was set at 56°C. One microliter of the first-round PCR product was used as the template for the second PCR round.
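The circovirus primers are degenerate: they contain IUPAC ambiguity codes (M, R, Y) and inosine (I). As an illustration only, the sketch below counts how many distinct oligonucleotides each primer represents. The helper `degeneracy` is hypothetical, and inosine is counted as a single universal base, which is an assumption about how the primers were synthesized rather than something stated in the text.

```python
# Number of alternative bases represented by each IUPAC nucleotide code.
# Assumption: inosine ("I") is one universal base, so it adds no degeneracy.
IUPAC_CHOICES = {
    "A": 1, "C": 1, "G": 1, "T": 1, "I": 1,
    "R": 2, "Y": 2, "M": 2, "K": 2, "S": 2, "W": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def degeneracy(primer):
    """Return the number of distinct sequences encoded by a degenerate primer."""
    total = 1
    for base in primer.upper():
        total *= IUPAC_CHOICES[base]
    return total


# Primer sequences as given in the text.
CIRCOVIRUS_PRIMERS = {
    "CV-F1": "GTIMGIMGITTYTGYTTYACITGG",
    "CV-R1": "CCICCYTTIAYYTGIACYTTRTA",
    "CV-F2": "GGIGARGARYTIGCICCIACIAC",
    "CV-R2": "GGICCCCARTARTARTAIACYTC",
}

for name, sequence in CIRCOVIRUS_PRIMERS.items():
    print(name, degeneracy(sequence))
```

Pairing inosine with a small number of two-fold codes keeps each primer pool small while still tolerating sequence variation at wobble positions.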
For each group of viruses, the sequences generated in this study were aligned to available sequences of related viruses from GenBank. Alignments were performed using CLUSTAL W with default settings (23) and then used to generate phylogenetic trees using neighbor joining with bootstrap values calculated from 1,000 replicates in MEGA 4 (35). Nucleotide sequences were used for cardioviruses, bocaviruses, and cosaviruses, and amino acid alignments were used for the more diverse circoviruses.
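Neighbor-joining trees are built from a matrix of pairwise distances between aligned sequences. The sketch below shows one simple way such a distance could be computed, the uncorrected p-distance (the proportion of differing sites). It is illustrative only: the text does not state which distance model was selected in MEGA 4, the function `p_distance` is hypothetical, and the three short sequences are invented toy data, not sequences from this study.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences, ignoring gapped sites."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped sites to compare")
    return differences / compared


# Invented toy alignment (hypothetical names and sequences).
alignment = {
    "toy_A": "ACGTACGT",
    "toy_B": "ACGTACGA",
    "toy_C": "AC-TTCGA",
}

names = list(alignment)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        distance = p_distance(alignment[first], alignment[second])
        print(first, second, round(distance, 3))
```

A tree-building program would take the full matrix of such distances as input; bootstrap support is then estimated by resampling alignment columns and rebuilding the tree for each replicate.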
The nucleotide sequence accession numbers of sequences used are listed in the GenBank database as follows: GQ243574 to GQ243692.
The presence of human cardioviruses, cosaviruses, bocaviruses, and circoviruses was analyzed in sewage waters (Table 1). At least one of the four viral groups was detected in 9 out of 11 states. Cardioviruses were found in 9 of 21 (43%) samples, cosaviruses were found in 8 out of 21 (38%) samples, circoviruses were found in 6 out of 14 samples (43%), and bocaviruses, the most frequently detected virus, were found in 17 of 21 samples (81%).
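The detection rates quoted above follow directly from the counts of positive and tested samples. As a quick arithmetic check, the sketch below recomputes them; the names `DETECTIONS` and `percent_positive` are hypothetical, and the counts are those given in the text.

```python
# (positive samples, samples tested) for each viral group, as reported in the text.
DETECTIONS = {
    "cardioviruses": (9, 21),
    "cosaviruses": (8, 21),
    "circoviruses": (6, 14),
    "bocaviruses": (17, 21),
}


def percent_positive(positive, tested):
    """Return the detection rate as a whole-number percentage."""
    return round(100 * positive / tested)


for group, (positive, tested) in DETECTIONS.items():
    print(f"{group}: {positive}/{tested} = {percent_positive(positive, tested)}%")
```

Note that circoviruses were screened in 14 samples rather than 21, so their rate is not directly comparable in denominator to the other three groups.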
In neighbor-joining trees, all the human cardiovirus helicase regions amplified from U.S. sewage samples clustered with those of VP1 genotype 2 sequences (Fig. 1A) (7). Genotype 2 human cardioviruses have been previously reported in the United States and Canada (SAFV-UC1 and SAFV-Can112051-06) (1, 14). Only a single U.S. sequence belonging to VP1 genotype 1, which had been cultured from a Californian child in 1981 (28), did not cluster with the cardiovirus helicase sequences derived from the U.S. sewage samples. Unexpectedly, five out of six cardiovirus sequences that were derived from sewage samples collected in a treatment facility over a 2-week period were identical (Table 1 and Fig. 1A, SAFL-FL2-3 to SAFL-FL2-7).
Direct sequencing of PCR product from a subset of cosavirus amplicons resulted in ambiguous sequences due to mixed nucleotide bases, indicating that multiple genetic variants had been simultaneously amplified. In these cases, the PCR product was cloned and sequenced. Sequencing of multiple plasmids confirmed the coamplification of distinct sequence variants. A subset of bocavirus and circovirus amplicons were similarly subcloned into plasmids prior to sequencing. According to the phylogenetic analysis, the majority of cosavirus 5′UTR sequences (18 of 29, or 62%) could be classified within cosavirus species D (Fig. 1B) (32). However, some sequences were more closely related to cosavirus species A and the recently identified species F (Kapoor, unpublished). In addition, some sequences were identified in samples from Florida and New Jersey that did not cluster with known Cosavirus species (see the clusters marked with question marks in Fig. 1B). Further genetic analyses of other loci will be required to determine whether these divergent cosavirus sequences represent new species of human cosaviruses.
Phylogenetic analysis of the VP1 region indicated that 4/67 sequences (6%) were closely related to HBoV, 45/67 (67%) were related to HBoV2, and 18/67 (26%) were related to the recently identified HBoV3 (5) (Fig. 1C). None of the sequences was related to the new HBoV4 species recently identified in African stool samples (Kapoor, unpublished). Most of the sequences from U.S. sewage samples clustered with HBoV2 genotype A, a variant previously reported in Pakistan, the United Kingdom, and, recently, Australia (5, 31) (Kapoor, unpublished).
A tBLASTx analysis revealed that 9 out of 13 sequences obtained from PCRs that target the circovirus replicase gene were 35 to 58% identical to viruses from the Circoviridae family. In neighbor-joining trees, these sequences clustered with porcine and bird circoviruses (Fig. 1D). In contrast, the remaining four circovirus-like sequences fell outside of the circovirus clade but within a larger cluster including the plant geminiviruses and nanoviruses. Definitive classification of the viruses with the divergent circovirus-like replicase sequences will require amplification and sequencing of their entire genomes. For three of these four circovirus-like sequences, the 20 best tBLASTx matches were to circoviruses, but for one sequence (WA-1), the best tBLASTx hits were to both bird circoviruses and plant pathogens from the Nanoviridae family (i.e., faba bean necrotic yellow virus, milk vetch dwarf virus, and banana bunchy top virus). It is possible that this group of divergent circovirus-like replicase sequences represents a new viral family. In our neighbor-joining tree, these sequences are not included in either the circovirus or nanovirus clade and might represent either animal or plant viruses.
We report here the detection and genetic diversity of cardioviruses, cosaviruses, bocaviruses, and circoviruses in untreated sewage water samples from different U.S. cities. Human bocaviruses were the most commonly detected viruses and were found in all but two of the analyzed states (Wisconsin and Louisiana) (Table 1). HBoV1 has a very broad geographical distribution when respiratory samples are analyzed but is less commonly detected in stool samples (3, 6, 13, 27, 29, 30, 50, 53, 54, 59). In this study, HBoV1 was detected in only one city in Florida, while HBoV2 and HBoV3 were detected in sewage samples from all bocavirus-positive cities (Fig. 1C). This result might reflect enhanced human gut tropism for HBoV2 and HBoV3 relative to HBoV1. To our knowledge, this is the first report of HBoV2 or HBoV3 in the United States. HBoV2 genotype A dominated the HBoV2 sequences from U.S. sewage samples, while HBoV2 genotype B dominated in Africa (Kapoor, unpublished). A divergent sequence (HBoV?-CA-1-C1) was also detected and may reflect another genotype of HBoV3 or possibly another species of HBoV.
Human cosaviruses were detected in 38% of the samples analyzed (Table 1). This is the first documentation of cosaviruses in the United States. Sequencing of the conserved 5′UTR showed that some variants clustered with the known species A and D, while others formed separate clusters unrelated to currently known species (Fig. 1B). The predominance of species D distinguishes the United States from what has been observed in other countries (Pakistan, Nigeria, and Tunisia), where species A predominates (32) (Kapoor, unpublished).
The human cardiovirus helicase sequences from sewage samples were found to cluster weakly with two of the three available sequences from the United States (1, 14) (Fig. 1A). Since the cardiovirus helicase sequences found in the United States are distinct from sequences obtained from Pakistani samples, it is likely that, as for cosavirus species and HBoV2 genotypes, the prevalence of human cardiovirus genotypes differs between geographic regions. Further studies are needed to obtain the VP1 genotypes of these strains from U.S. sewage samples to determine whether they correspond to genotype 2, as do the recently acquired U.S. isolates (1, 14).
The observation that 5 of 6 samples collected days apart in Bradenton, Florida, yielded identical human cardiovirus sequences was unexpected in light of the diverse sequences acquired from the other sites. Possible explanations for the genetically homogeneous cardioviruses sampled in this town include differences in the geographic distribution of cardiovirus strains, a recent spread in that Florida community of the repeatedly sampled strain, and even the contribution to the local sewage of cardioviruses from a single individual shedding very high levels of virus.
Circoviruses are currently known to infect birds and swine, and as of yet, no human circoviruses have been demonstrated. Circoviruses present in U.S. sewage water samples were phylogenetically diverse, falling into two main groups distinct from known circoviruses. Nine sequences clustered into four groups related to pig and bird circoviruses but were sufficiently distinct to represent possible new species of circoviruses. Four other replicase sequences fell outside of the known animal circovirus clade in our neighbor-joining trees but were included in a larger clade with both circoviruses and geminiviruses. The N-terminal region of circovirus replicase proteins has been shown to be related to those of nanoviruses, while its C-terminal region is more closely related to the 2C RNA binding protein of picornaviruses, indicating a possible chimeric origin from plant and animal viruses (20). Whether these sequences represent novel circoviruses, nanoviruses, or a new family of replicase-containing viruses will require further analyses, including sequencing of entire genomes.
In conclusion, newly characterized enteric viruses (human bocaviruses, cardioviruses, and cosaviruses) as well as distant relatives of animal viruses (circoviruses) could be readily detected in untreated sewage water samples from different U.S. cities. An abundance of viral diversity can therefore be readily accessed in untreated sewage samples. Since the nucleic acids derived from less than 100 μl equivalent of untreated sewage were examined in each PCR, it is conceivable that testing larger volumes would have revealed an even greater prevalence of these enteric viruses. As has been previously shown, sewage water represents a convenient material to test large populations for the presence of enteric viruses (8-10, 18, 19, 21, 22, 25, 42, 55). Modeling studies have indicated that a single poliovirus toilet “flush” could contain enough enterovirus to be detected in a city's sewage system for up to 4 days (47). Studies of untreated sewage waters collected at different times might show seasonal fluctuation and yield further insight into the epidemiology of these viruses. Similarities have also been detected in the poliovirus strains in sewage samples and in those in the stool specimens of local children (15).
More detailed genetic analyses of the viruses in untreated sewage might also show how rapidly new variants can colonize populations in different cities. Such a phenomenon might have been reflected in the identical cardioviruses sampled over several weeks from the same city in Florida. Long-standing infections in regions where the viruses are endemic would be expected to yield viral populations with more genetic diversity among strains, due to introduction of new strains and evolution of local strains.
Efficient and reproducible sampling of the complex viral nucleic acid populations in sewage will need to be demonstrated before solid conclusions regarding the frequency of particular viral genetic variants can be drawn (25). Increasing the starting volume of sewage water, and therefore the subsequent nucleic acid input, might provide a better and deeper sampling of the replicating viral populations. Quantitative PCR assays can also be developed to measure fluctuations in viral loads in sewage. Detailed analyses of PCR amplicons using ultradeep sequencing technologies may also allow a more accurate view into the composition and change in the frequency of circulating variants (39, 40, 60). Untreated sewage therefore provides a readily available source to test for the presence of newly identified viruses and to monitor their genetic diversity on a population level scale. Monitoring viral concentration in sewage might also be used for surveillance and, if done in real time, to provide early warning signs of impending outbreaks (16, 38, 52).
Published ahead of print on 30 September 2009.