In this study, we examined the viral nucleic acids in stool samples from 35 South Asian children with nonpolio AFP and 6 healthy contacts. Viruses were detected in 29 of 35 children with AFP and in all six healthy contacts. We also used 454 pyrosequencing and compared this technique to traditional shotgun subcloning and Sanger sequencing. Pyrosequencing provided superior genomic coverage (41% versus the 19% average coverage of all detected viral genomes) and more sensitive viral detection (average, 2.6 versus 1.4 viruses per patient sample) (Fig. ). Consistent with the data in previous reports (40), the shorter reads associated with pyrosequencing nearly doubled the portion of unclassifiable sequences compared to the portion obtained by Sanger sequencing (51 versus 29%) (Fig. ). It appears that this increase in unclassifiable sequences is due primarily to a reduced ability to classify diverse bacterial sequences, resulting in a reduction in the portion of sequences classified as bacterial from 28% by Sanger sequencing to 7.9% by 454 pyrosequencing. Viral sequences accounted for ~23% of the total, regardless of the sequencing method. Despite the problems associated with shorter sequence reads, pyrosequencing was superior in both viral detection and genome coverage for the 10 patients tested, at approximately the same financial cost. As pyrosequencing technology improves to generate longer sequence reads, it is likely to supplant Sanger shotgun sequencing as a method of viral identification and discovery.
Prior viral metagenomic studies of feces utilized Sanger shotgun sequencing of 532 (8), 4,600 (13), and 36,769 (41) plasmid subclones at the cost of analyzing fewer samples (1, 12, and 3, respectively). Two of these studies detected primarily plant viruses (41) or bacteriophage (8) (in this case, likely the result of focusing on dsDNA viruses), while the third study, using diarrhea samples, detected known viral pathogens as well as sequences divergent enough to potentially belong to two new viral species (astrovirus and nodavirus) (13). In our study, the most common plant virus detected was pepper mild mottle virus, which has also been reported to occur at high frequencies in North American and Singaporean human stool samples (41). In addition to bacteriophage and plant viruses, we detected known pathogenic enteric viruses, including rotavirus, adenovirus, picobirnavirus, and numerous members of the Picornaviridae family, including parechovirus, Aichi virus, rhinovirus, cardioviruses, and HEV-A to HEV-C, as well as several new viral species. The high proportion of healthy children with viruses in their stool samples (six of six) (Table ) underlines the often asymptomatic nature of many enteric viral infections whose clinical outcomes are likely dictated by a combination of viral and host genetics, active and passive immunity (i.e., maternal antibodies), and overall health (26).
Specific nested panenterovirus PCR primers detected HEV infection in 23 of the 35 AFP cases (20), while 17 of 35 AFP samples exhibited at least one HEV sequence in the viral metagenomic analysis. Both metagenomic analysis and pan-HEV PCR detected HEV infection in all six healthy contacts. This correlation was less pronounced for members of the new candidate picornavirus genus Cosavirus than for HEV, as cosavirus sequences were found in only 9 of 41 samples from AFP patients and healthy contacts by shotgun sequencing, compared to 19 of these 41 by nested PCR (20). Similarly, human cardiovirus SAFV was found in 3 of 57 nonpolio AFP children using shotgun Sanger sequencing and in 9 of 57 patients using RT-nested PCR (5). It is possible that the cosavirus loads in stool samples are generally lower than HEV loads, thereby making detection using limited shotgun sequencing less likely. Indeed, in-depth 454 sequencing of 8,276 clones from the sample from patient 5550 and 25,516 clones from the sample from patient 6572 revealed the presence of cosaviruses missed by Sanger sequencing. These results indicate that while a wide range of distinct viruses (belonging to different and in some cases new viral species) can be detected using low-level Sanger subclone sequencing, the very high sensitivity of nested PCR still allows more cases of presumably low-level infections with known viruses to be detected.
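The detection counts quoted in the preceding paragraph can be put side by side with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis: the dictionary and function names are hypothetical, and the shotgun-to-PCR ratio is only a ratio of positive counts. It should not be read as a formal sensitivity, because the text does not state that every shotgun-positive sample was also PCR positive.

```python
# Counts copied from the text: (shotgun-positive, nested-PCR-positive, samples tested).
# The labels and the structure are illustrative assumptions, not the authors' code.
COUNTS = {
    "HEV (AFP cases)": (17, 23, 35),
    "Cosavirus (AFP cases and contacts)": (9, 19, 41),
    "SAFV cardiovirus (AFP cases)": (3, 9, 57),
}


def detection_summary(counts):
    """Return per-virus detection rates and the shotgun/PCR ratio of positives."""
    summary = {}
    for virus, (shotgun, pcr, total) in counts.items():
        summary[virus] = {
            "shotgun_rate": shotgun / total,   # fraction positive by shotgun sequencing
            "pcr_rate": pcr / total,           # fraction positive by nested PCR
            "shotgun_vs_pcr": shotgun / pcr,   # ratio of positive counts only
        }
    return summary


if __name__ == "__main__":
    for virus, row in detection_summary(COUNTS).items():
        print(
            f"{virus}: shotgun {row['shotgun_rate']:.0%}, "
            f"PCR {row['pcr_rate']:.0%}, ratio {row['shotgun_vs_pcr']:.2f}"
        )
```

Under these counts the ratio is highest for HEV and lower for cosavirus and SAFV, which is in line with the suggestion that lower viral loads make detection by limited shotgun sequencing less likely.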
We detected at least five novel viruses or groups of viruses: a new human bocavirus (18), members of a new Picornaviridae genus (20), a new circovirus (unpublished results), a new nodavirus (unpublished results), and new dicistroviruses (unpublished results). Sequences from divergent viruses that may represent new genotypes of enteroviruses, parechoviruses (23), cardioviruses (5), and picobirnaviruses (unpublished data) were also found. The novel nodavirus sequences were clearly distinguishable from the nodavirus sequences recently generated from diarrhea samples, overall exhibiting less than 41% amino acid identity to the previously generated sequences (13). The most diverse viral sequences detected and reported here belonged to the dicistrovirus-like category, in which polymerase and other enzymatic regions exhibited less than 35% amino acid identity to dicistrovirus sequences currently in GenBank. Dicistrovirus-like sequences were detected in samples from three patients, two of which, patients 6178 and 6344, were coinfected with members of the new Picornaviridae genus, Cosavirus. The dicistrovirus-like sequences exhibited 70 to 75% nucleotide identity to one another, a level of divergence otherwise seen among different species of dicistroviruses.
It remains to be determined which of these novel viruses are capable of replication in the human gut, as it is conceivable that some were consumed and their nucleic acids traveled through the digestive tract intact, as attested to by the detection of nucleic acids from plant viruses which have previously been shown to remain infectious (41).
Nodaviruses are small, single-stranded, bipartite RNA viruses that to date have been shown to naturally infect only insects and fish. Nodaviruses have been detected previously in human stool (13) and are semipermissive of replication in mammalian tissues (4). Dicistroviruses have been shown to replicate and be pathogenic in insects (10). The internal ribosomal entry site between the two cistronic segments can act as a powerful promoter in mammalian cells (28). However, reports of viral replication within mammalian cell lines are contradictory; one group has demonstrated the replication of a dicistrovirus, Taura syndrome virus, in human cell lines (3), while another has failed to reproduce Taura syndrome virus growth in mammalian cell lines (27). The ability of pathogenic porcine circovirus 2 to replicate in pigs is well established (1). Whether the circovirus detected in the sample from patient 5006 represents the first human circovirus or a circovirus from ingested meat remains unknown. In vitro replication as well as serological and larger epidemiological studies will be necessary to determine the range of host species tropisms and pathogenic potentials of these new viruses.
Three of the 35 AFP cases were fatal: the sample from child 5550, in which six distinguishable eukaryotic viruses (adenovirus, cosavirus, HEV-B, HEV-C, rhinovirus, and cucumber mosaic virus) were observed, exhibited the highest level of coinfection; patient 2296 was coinfected with HEV-B and a cosavirus; and patient 6178 exhibited coinfection with dicistrovirus and cosavirus. Stool from patient 6178 was likely to contain a high titer of dicistrovirus based on the large fraction of sequence from this virus derived by random amplification (Fig. ) and dilution end point PCR, which indicated a viral load of approximately 10^6 genome copies per ml of stool supernatant (data not shown). While cosaviruses were present in all three fatal cases, the difference in cosavirus prevalence among all AFP patients combined and healthy controls was not statistically significant (20). Since even clearly pathogenic picornaviruses, such as poliovirus, typically produce no clinical manifestations in 99 to 99.9% of infections (26), failure to detect a significant association with disease in this small cohort does not absolve cosaviruses, cardioviruses, or other new viruses of possible pathogenic roles.
In summary, we have used limited Sanger sequencing of stool samples from children with AFP to detect both known and novel viruses. By increasing the depth of the nucleic acid sampling using 454 pyrosequencing, we detected more viruses likely present at lower viral loads. These studies provide a framework for further studies that can be applied to numerous cases of AFP reported by the Global Polio Laboratory Network; of the 700,000 cases reported since 1997, only ~6.5% have been attributed to poliovirus and 15 to 30% have been attributed to nonpolio enteroviruses (9). PCR studies of stool and tissue samples from subjects of different ages and geographic origins, both with and without diseases, as well as serological testing, will be required to determine the epidemiology and pathogenicity of these new viruses. The numerous known and new viruses in stool samples from developing countries, a likely result of limited access to adequate sanitary conditions resulting in frequent enteric infections, also indicate that such samples provide readily accessible material for further viral discovery.
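The AFP figures quoted in the summary imply that a large share of reported cases has no attributed viral cause. A minimal worked calculation follows, assuming that the poliovirus and nonpolio enterovirus percentages refer to the same 700,000 cases and do not overlap; neither assumption is stated in the text, and the function name is hypothetical.

```python
TOTAL_AFP_CASES = 700_000        # cases reported since 1997, as quoted in the text
POLIO_SHARE = 0.065              # ~6.5% attributed to poliovirus
NPEV_SHARE_RANGE = (0.15, 0.30)  # 15 to 30% attributed to nonpolio enteroviruses


def unattributed_cases(total, polio, npev_low, npev_high):
    """Return (min, max) counts of cases attributed to neither virus group.

    Assumes the two attributed groups are disjoint and share one denominator.
    """
    low_share = 1.0 - polio - npev_high   # smallest unattributed share
    high_share = 1.0 - polio - npev_low   # largest unattributed share
    return round(total * low_share), round(total * high_share)


if __name__ == "__main__":
    low, high = unattributed_cases(TOTAL_AFP_CASES, POLIO_SHARE, *NPEV_SHARE_RANGE)
    print(f"Unattributed AFP cases: {low:,} to {high:,}")
```

Under these assumptions roughly 63.5 to 78.5% of reported cases remain without an attributed virus, which is the pool of samples that the further studies proposed above would address.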