Although widely used for the characterization of the transmission of intestinal Cryptosporidium spp., genotyping tools are not available for C. muris and C. andersoni, two of the most common gastric Cryptosporidium spp. infecting mammals. In this study, we screened the C. muris whole-genome sequencing data for microsatellite and minisatellite sequences. Among the 13 potential loci (6 microsatellite and 7 minisatellite loci) evaluated by PCR and DNA sequencing, 4 were eventually chosen. DNA sequence analyses of 27 C. muris and 17 C. andersoni DNA preparations showed the presence of 5 to 10 subtypes of C. muris and 1 to 4 subtypes of C. andersoni at each locus. Altogether, 11 C. muris and 7 C. andersoni multilocus sequence typing (MLST) subtypes were detected among the 16 C. muris and 12 C. andersoni specimens successfully sequenced at all four loci. In all analyses, the C. muris isolate (TS03) that originated from an East African mole rat differed significantly from other C. muris isolates, approaching the extent of genetic differences between C. muris and C. andersoni. Thus, an MLST technique was developed for the high-resolution typing of C. muris and C. andersoni. It should be useful for the characterization of the population genetics and transmission of gastric Cryptosporidium spp.
Cryptosporidium muris and Cryptosporidium andersoni are gastric Cryptosporidium species of various mammals. Cryptosporidium muris was first identified in the gastric glands of mice over a century ago, whereas C. andersoni was long considered C. muris because of the genetic and biological similarity between the two and was only recently established as a separate species (25, 42). Cryptosporidium muris is well known to have broad host specificity. In addition to various rodent species, natural C. muris infections have been documented for pigs, Bactrian camels, giraffes, dogs, cats, nonhuman primates, seals, bilbies, and tawny frogmouths (3, 16, 17, 20, 22, 26, 27, 29, 32, 34, 37, 38, 40, 44). In contrast, C. andersoni is mostly a parasite of cattle, having been found only occasionally in other animals such as Bactrian camels, sheep, and goats. In recent years, C. muris or C. andersoni infection has been reported in a few human cases (1, 8, 11, 13, 15, 19, 24, 30, 31, 33, 39). Thus, both C. muris and C. andersoni are considered zoonotic Cryptosporidium species. Because of its biological uniqueness and zoonotic potential, the complete genome of C. muris is being sequenced.
Recently, various molecular diagnostic tools have been used for the characterization of the transmission of human-pathogenic intestinal Cryptosporidium spp. such as C. hominis and C. parvum (35, 41). These tools have proven to be especially useful for comparisons of parasite population genetics among hosts or Cryptosporidium species, characterization of host specificity of Cryptosporidium spp., identification of infection sources in humans, tracking of the temporal and geographical spread of pathogens, and investigation of outbreaks and endemicity. One such high-resolution subtyping tool is multilocus sequence typing (MLST). For C. hominis and C. parvum, MLST tools have recently been developed using the polymorphic microsatellite and minisatellite markers identified in the recently published whole-genome sequencing data (9, 10, 12).
In this study, we screened the C. muris genome for microsatellite and minisatellite sequences and developed an MLST technique for the high-resolution typing of C. muris and C. andersoni isolated from humans and various animals.
A total of 27 DNA extractions from 25 C. muris specimens and 17 extractions from 17 C. andersoni specimens were used in the study (Table 1). The C. muris specimens were from humans, various rodents, Bactrian camels, and one mountain goat, snake, dog, cat, and tawny frogmouth each in the Czech Republic, Egypt, Kenya, and Peru. In contrast, all C. andersoni specimens were from cattle in the United States, Canada, the Czech Republic, and China. A few of the isolates were maintained in laboratory rodents (mice, SCID mice, and Mastomys coucha) (Table 1). Prior to experimental infection, these animals were maintained in breeding facilities designed for experimental animals and determined to be free of Cryptosporidium infection by microscopy of consecutive fecal samples and, in the case of SCID mice and Mastomys coucha, by PCR analysis. Three of the DNA preparations were from the same C. muris specimen from a child in Kenya, extracted at different storage times. These specimens were diagnosed as being positive for C. muris or C. andersoni by DNA sequence analysis of an ~830-bp fragment of the small-subunit (SSU) rRNA gene (43). Seven C. muris DNA preparations (preparations 6853, 7379, 9412, 14260, 14272, 14713, and 14714) and two C. andersoni preparations (preparations 1808 and 11084) were used for initial primer evaluations and the selection of markers. These DNA extractions were chosen for the initial screening of PCR primers and targets because of their various efficiencies of amplification at the SSU rRNA locus, with DNAs 9412 and 14272 producing amplification in only 1 of 3 PCR runs. The remaining specimens were analyzed only at the selected microsatellite and minisatellite loci.
A search for microsatellite and minisatellite sequences in the C. muris genome was conducted on 8 March 2007. The first 2,500 sequences (traces 1649410657 to 1649413156) in the C. muris whole-genome sequencing database were retrieved from the website of the Institute for Genomic Research (TIGR, now the J. Craig Venter Institute [http://gsc.jcvi.org/projects/msc/Cryptosporidium_muris/]). Microsatellite and minisatellite sequences in the retrieved sequences were identified by using the software Tandem Repeats Finder (http://tandem.bu.edu/trf/trf.html).
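The idea of scanning short reads for tandem repeats can be illustrated with a minimal sketch. It is not the algorithm used by Tandem Repeats Finder, which also detects imperfect repeats; this version finds only perfect repeats with a regular expression. The function name, the unit-length range, and the minimum copy number are assumptions chosen for illustration.

```python
import re

def find_tandem_repeats(seq, min_unit=2, max_unit=30, min_copies=3):
    """Return perfect tandem repeats as sorted (start, unit, copies) tuples.

    Illustrative only: the thresholds are assumptions, and imperfect
    repeats (which dedicated repeat finders can report) are ignored.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # A unit of unit_len bases followed by at least (min_copies - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip units that are themselves built from a shorter motif
            # (e.g. CAACAA is reported once, as CAA).
            if any(unit == unit[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            copies = len(match.group(0)) // unit_len
            hits.append((match.start(), unit, copies))
    return sorted(hits)

# Hypothetical read containing 12 copies of the trinucleotide CAA.
read = "CCGT" + "CAA" * 12 + "GGTC"
print(find_tandem_repeats(read))
```

Each hit gives the 0-based start position, the repeat unit, and the copy number, which is the information needed to judge a candidate locus by repeat length.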
Nested PCR was used for the amplification of microsatellite and minisatellite targets. Primary and secondary PCR primers were designed based on nucleotide sequences of the potential microsatellite and minisatellite loci. The potential targets were amplified by nested PCR, using 1 μl of DNA in the primary PCR and 2 μl of primary PCR products in the secondary PCR. For both the primary and secondary PCRs, the PCR mixture consisted of 200 μM (each) deoxynucleotide triphosphate, 1× PCR buffer (Perkin-Elmer, Foster City, CA), 3.0 mM MgCl2, 5.0 U of Taq polymerase (Promega, Madison, WI), and 100 nM primers in a total volume of 100 μl. The reactions were performed with a GeneAmp PCR 9700 thermocycler (Perkin-Elmer) for 35 cycles of 94°C for 45 s, the annealing temperatures specified in Table 2 for 45 s, and 72°C for 60 s, with an initial denaturation step (94°C for 5 min) and a final extension step (72°C for 10 min). To neutralize PCR inhibitors, 400 ng/μl of nonacetylated bovine serum albumin (Sigma-Aldrich, St. Louis, MO) was used in the primary PCR. The secondary PCR products were detected by agarose gel electrophoresis and ethidium bromide staining.
The secondary PCR products were sequenced in both directions with an ABI 3130 genetic analyzer (Applied Biosystems, Foster City, CA) using the secondary primers and the BigDye Terminator v3.1 cycle sequencing kit (Applied Biosystems). The sequences obtained were aligned with each other and reference sequences downloaded from the C. muris whole-genome database using ClustalX (http://www.clustal.org/). To assess the genetic relatedness of various C. muris and C. andersoni subtypes, neighbor-joining trees were constructed by using the program TreeconW (http://bioinformatics.psb.ugent.be/software/details/3), based on the evolutionary distances calculated by the Kimura two-parameter model.
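For readers unfamiliar with the Kimura two-parameter model, the distance between two aligned sequences is d = -1/2 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of sites showing transitions and transversions, respectively. The following is a minimal sketch of that formula; the function name is hypothetical, and the choice to skip gapped or ambiguous sites is an assumption that may differ from how TreeconW treats insertions and deletions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(seq_a, seq_b):
    """Kimura two-parameter distance between two aligned sequences.

    Assumption: sites with a gap or a non-ACGT symbol in either
    sequence are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        sites += 1
        if a == b:
            continue
        # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if sites == 0:
        raise ValueError("no comparable sites")
    p = transitions / sites
    q = transversions / sites
    # math.log raises ValueError if the sequences are too divergent
    # for the model (argument not positive).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

# Hypothetical aligned fragments differing by one A<->G transition in 10 sites.
print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

A matrix of such pairwise distances is the input a neighbor-joining method works from.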
Unique sequences generated in this study have been deposited in the GenBank database under accession numbers HM565066 to HM565101.
The search for tandem repeats in the 2,500 short sequences retrieved from the C. muris whole-genome sequencing project identified 101 sequences with microsatellites and minisatellites. Based on the nature (largely an absence of imperfect repeats) and length (>6 copies for minisatellite targets and >10 copies for microsatellite targets) of the repeats and the availability of suitable sequences for primer design (excluding those with short or AT-rich 5′- or 3′-flanking nucleotide sequences), 13 loci were chosen from the 101 potential targets, including 6 microsatellite loci and 7 minisatellite loci. The location of the loci was not considered, as the C. muris genome had not been assembled and annotated at the time of the study. Primers for nested PCR were designed for each locus (Table 2). Because DNA sequencing rather than fragment length measurement was used for the determination of polymorphism, the final PCR product was larger than that normally used for microsatellite and minisatellite analysis, with expected PCR products ranging from 307 bp to 751 bp.
The amplification efficiency of the 13 sets of nested PCR primers was initially evaluated by using seven C. muris DNA preparations (preparations 6853, 7379, 9412, 14260, 14272, 14713, and 14714) and two C. andersoni DNA preparations (preparations 1808 and 11084) (Fig. 1). The primers of three loci (CM-MS8, CM-MS13, and CM-MS14) did not produce the expected PCR products. Primers of the remaining loci produced the expected PCR products for 2 to 8 of the DNA preparations used in the analysis, with one locus (CM-MS6) generating only light bands in gel electrophoresis analyses of the PCR products (Fig. 1, Table 2). DNA preparations 9412 (except for CM-MS1 and CM-MS16) and 14272 (except for CM-MS18) were not amplified at most loci. Positive PCR products of the amplified loci were sequenced mostly successfully, with the exception of two loci, CM-MS12 and CM-MS18, which produced unreadable sequences with numerous underlying signals in the electropherogram.
Four loci, CM-MS1, CM-MS2, CM-MS3, and CM-MS16, were chosen for sequence polymorphism evaluations using a total of 44 DNA extractions from C. muris and C. andersoni specimens. Of these, 41, 39, 24, and 38 DNA preparations were amplified at the CM-MS1, CM-MS2, CM-MS3, and CM-MS16 loci, respectively (Table 1). These PCR products were sequenced successfully, except for 2 to 3 DNA preparations at each locus, which produced mixed signals in electropherograms. The nucleotide sequences generated at each locus were homologous to those downloaded from the database of the C. muris whole-genome sequencing project. A BLAST analysis of the CM-MS3 sequences against the GenBank sequence database identified three sequencing errors in the primary reverse primer (CM-MS3-R1) designed based on initial sequences obtained from the TIGR project database. A replacement primer (CM-MS3-R1r) was then designed, which led to the PCR amplification of nine additional DNA preparations at the CM-MS3 locus.
Multiple-sequence alignment analysis of the acquired sequences grouped the parasites into three groups at each of the four loci under analysis. One major group was formed by most C. muris specimens, including the C. muris reference sequence; one group (referred to hereafter as the C. muris variant) was formed by three Czech C. muris specimens, 14242, 14243, and 14260; and one was formed by C. andersoni. The formation of the three sequence groups was supported by results of phylogenetic analyses, as three distinct groups were seen in neighbor-joining trees constructed with the sequences (Fig. 2).
The three groups of parasites identified differed from each other by having numerous nucleotide substitutions in the nonrepeat region. Within each group, sequences differed from each other only in the number of microsatellite and minisatellite repeats. As the microsatellite and minisatellite repeats occurred in the coding region of the genes, the insertions and deletions were always in trinucleotides. The only exceptions were reference sequences of C. muris from the genome sequencing project, which differed in the nonrepeat region (by single-nucleotide deletions/insertions and substitutions) from the C. muris sequences acquired in this project at three of the four loci: CM-MS2, CM-MS3, and CM-MS16. These nucleotide deletions, insertions, and substitutions, however, were likely due to sequencing errors, as indicated by a comparison of the original reference sequences retrieved from the C. muris genome sequence project website and their respective sequences (GenBank accession no. XM_002141771 for CM-MS1, XM_002141007 for CM-MS2, XM_002142635 for CM-MS3, and XM_002141705 for CM-MS16) recently downloaded from the GenBank database.
The three groups of parasites further differed from each other in the nature of microsatellite and minisatellite repeats at each locus. With the exception of CM-MS16, the C. muris variant had repeat sequences different from those of C. andersoni and C. muris. Cryptosporidium andersoni also had repeat sequences different from those of C. muris at the CM-MS2 and CM-MS3 loci. Sometimes, the difference was just a slight modification of the repeat, such as one of the two minisatellite regions in the C. muris variant at the CM-MS1 locus, whereas sometimes the repeat sequences were totally different, such as those at the CM-MS3 locus (Table 3).
Altogether, there were 10, 5, 5, and 6 subtypes for C. muris and 2, 3, 4, and 1 subtypes for C. andersoni at the CM-MS1, CM-MS2, CM-MS3, and CM-MS16 loci, respectively (Table 1). A total of 16 C. muris specimens and 12 C. andersoni specimens were subtyped successfully at all four genetic loci, forming 11 C. muris and 7 C. andersoni multilocus sequence typing (MLST) subtypes. Most of the MLST subtypes had only one specimen, with the exception of four C. muris MLST subtypes and three C. andersoni MLST subtypes, which had two or three specimens (Table 1). A neighbor-joining tree was constructed with concatenated sequences from the four genes. The tree topology obtained was identical to that obtained with sequences from individual loci, with the formation of three distinct groups (Fig. 3).
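The way single-locus subtypes combine into MLST subtypes can be sketched as follows: each specimen's profile is the tuple of its subtypes at the four loci, and specimens sharing a profile belong to the same MLST subtype. The function name, specimen identifiers, and subtype labels below are hypothetical and are not taken from Table 1.

```python
from collections import defaultdict

LOCI = ("CM-MS1", "CM-MS2", "CM-MS3", "CM-MS16")

def assign_mlst_subtypes(specimens):
    """Group specimens by their four-locus subtype profile.

    specimens maps a specimen id to a dict of locus -> subtype label
    (None or a missing key means the locus was not typed). Specimens
    lacking data at any locus are excluded, as only fully typed
    specimens can be given an MLST subtype.
    """
    groups = defaultdict(list)
    for specimen_id, calls in specimens.items():
        profile = tuple(calls.get(locus) for locus in LOCI)
        if None in profile:
            continue  # incomplete profile
        groups[profile].append(specimen_id)
    return dict(groups)

# Hypothetical specimens and subtype labels, for illustration only.
example = {
    "s1": {"CM-MS1": "A1", "CM-MS2": "B1", "CM-MS3": "C1", "CM-MS16": "D1"},
    "s2": {"CM-MS1": "A1", "CM-MS2": "B1", "CM-MS3": "C1", "CM-MS16": "D1"},
    "s3": {"CM-MS1": "A2", "CM-MS2": "B1", "CM-MS3": "C1", "CM-MS16": "D1"},
    "s4": {"CM-MS1": "A1", "CM-MS2": "B1", "CM-MS3": None, "CM-MS16": "D1"},
}
for profile, members in assign_mlst_subtypes(example).items():
    print(profile, members)
```

In this toy input, s1 and s2 share one MLST subtype, s3 forms a second because it differs at a single locus, and s4 is left out because it was not typed at CM-MS3.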
In this study, an MLST tool targeting microsatellite and minisatellite sequences was developed for C. muris and C. andersoni. This tool allowed the identification of at least 11 MLST subtypes of C. muris and 7 MLST subtypes of C. andersoni. Unlike what was previously observed for C. hominis and C. parvum (9, 10, 12), the sequence polymorphism in C. muris and C. andersoni was largely in the form of differences in the copy numbers of the microsatellite and minisatellite repeats. In contrast, both C. hominis and C. parvum have extensive single-nucleotide substitutions in the nonrepeat regions of most microsatellite and minisatellite targets. The coding nature of the targets was probably not responsible for the differences observed between the gastric and intestinal Cryptosporidium spp., as most microsatellites and minisatellites in Cryptosporidium occur in coding regions of protein genes because of the presence of few introns and short intergenic regions as the result of a compact genome. This difference might be a reflection of intrinsic biological and genetic differences between gastric and intestinal Cryptosporidium species, as indicated by previous data on the phylogenetic relationships and G/C contents of the SSU rRNA genes between the two groups of Cryptosporidium spp. (42).
Two groups of C. muris were identified in this study. Most C. muris specimens had sequences similar to each other at the four genetic loci examined, with differences only in the copy number of microsatellite and minisatellite repeats. They formed a cluster in the phylogenetic analysis. In contrast, three C. muris isolates from the Czech Republic, including an isolate (TS03) that originated from an East African mole rat (Tachyoryctes splendens) and was maintained in Mastomys coucha, had sequences very different from those of most C. muris specimens. This isolate was previously shown to have different infectivity and/or host specificity in experimental animal models (23). It was also shown previously that the East African mole rat C. muris isolate had sequence differences in the SSU rRNA gene that were comparable to those between C. andersoni and most C. muris isolates. Surprisingly, two other C. muris isolates maintained in laboratory animals by the same research group, RN66 and Bactrian camel isolate CB03, also had sequences identical to those of the East African mole rat C. muris isolate. The genetic similarity of the three C. muris isolates was also confirmed by the sequencing of the SSU rRNA gene, which produced sequences identical to the one from the East African mole rat isolate. A contamination of C. muris isolates could have happened during the animal passage of C. muris isolates RN66 and CB03. The substantial sequence differences between the C. muris East African mole rat isolate and other C. muris isolates in the SSU rRNA gene and four loci in this study indicate that the East African mole rat isolate could represent a different species. A parasite identical to the C. muris East African mole rat isolate was previously identified in an eastern gray squirrel in New York (6), suggesting that this parasite is probably widespread. A C. muris isolate from Japanese field mice was also shown to have minor differences in mouse infectivity and the sequence of the SSU rRNA gene from other C. muris isolates (16). Thus, genetic and biological diversities exist in C. muris, and with the inclusion of more genetic loci and samples from a wide range of hosts and geographical areas, the MLST approach developed in this study should be useful in elucidating the genetic basis for the difference in host specificity among C. muris isolates and in examining the spread of the parasite in geographically isolated areas such as the continent of Australia.
The genetic diversity of C. andersoni appears to be much lower than that of C. muris. Only three of the four loci examined in this study were polymorphic in C. andersoni, and only 2 to 4 subtypes of C. andersoni were seen at each polymorphic locus. The low genetic diversity of C. andersoni in comparison with that of C. muris is expected, as the domestication of cattle is a recent event. Modern cattle are thought to have originated from a few places in the Near East and Europe and to have been introduced to the rest of the world during the last 15,000 years (2, 14). The narrow host specificity of C. andersoni has probably further reduced its genetic diversity. The recent introduction of C. andersoni into many areas is supported by the finding of one MLST subtype of C. andersoni in the United States, Canada, and the Czech Republic (Table 2 and Fig. 3). In contrast, rodents are abundant and widespread in distribution and consist of numerous species living in diverse ecological niches. The broad host specificity of C. muris and the geographical isolation of some rodent species have probably led to the emergence of host-adapted subtypes, as seen for the better-known C. parvum (41). Nevertheless, biological differences are known to exist in C. andersoni. Isolates of C. andersoni in Japan, the so-called strain Kawatabi, differ from C. andersoni isolates in other areas in their ability to infect SCID mice (28). The genetic difference between the two biological types of C. andersoni is not yet clear. Again, the inclusion of more genetic loci and a large number of samples from different geographical areas is needed before we can have a better understanding of the geographical spread of C. andersoni and genetic determinants of host specificity.
Among the 18 MLST subtypes identified, 7 were found in multiple specimens. The number of specimens with complete data for all four loci was limited by the fact that some DNA preparations were not amplified at all loci, especially in the initial PCR analysis of the CM-MS3 locus. The poor initial PCR amplification of CM-MS3 (only 24 of 44 preparations produced the expected PCR product) was due largely to sequencing errors in the primary reverse primer region in the original C. muris sequence downloaded from the C. muris genome sequencing project. The primer sequence used was 5′-TTGCTTTAAGTGTAGAGCATAGAA-3′. A comparison with the corrected sequence (GenBank accession no. XM_002142635) recently downloaded from the GenBank database indicated that the primer had three nucleotide errors, and the corrected primer sequence should be 5′-TTGCTTTAAGTGTAAATAATACAA-3′. Because the correct primer sequence had a much lower annealing temperature, a new primary reverse primer was designed using a sequence 17 nucleotides downstream of the original location: 5′-TCAAGTACAGCAGTCTATTGCTT-3′. It is not clear whether the poor amplification efficiency of other loci was also caused by the sequencing errors in the genome sequencing project. More microsatellite and minisatellite targets may also be needed to increase the differentiation power of the MLST tool.
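The effect of base composition on primer melting temperature can be illustrated with the Wallace rule, Tm ≈ 2(A + T) + 4(G + C) in °C. This is a rough rule of thumb intended for short oligonucleotides and is not necessarily how the annealing temperatures in this study were derived; the function name is hypothetical, and the three sequences are the primers quoted above.

```python
def wallace_tm(primer):
    """Rough melting temperature (deg C) by the Wallace rule: 2(A+T) + 4(G+C).

    Illustrative approximation only; it ignores salt and primer
    concentration and is least reliable for longer primers.
    """
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    return 2 * at + 4 * gc

primers = {
    "original (with sequencing errors)": "TTGCTTTAAGTGTAGAGCATAGAA",
    "corrected at the same position": "TTGCTTTAAGTGTAAATAATACAA",
    "redesigned replacement": "TCAAGTACAGCAGTCTATTGCTT",
}
for label, sequence in primers.items():
    print(label, wallace_tm(sequence))
```

Under this approximation, the corrected sequence at the original position scores 58, lower than the 64 obtained for both the original and the redesigned primers, which is consistent with the decision to move the primer rather than simply correct it.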
In conclusion, an MLST tool for subtyping C. muris and C. andersoni was developed. With further refinement, especially the inclusion of more loci, the tool should be useful for the characterization of the population genetics and the dispersal of the two parasites and especially the potential role of either host species or geography in genetic structuring. It should also be useful for the epidemiological investigation of cryptosporidiosis outbreaks caused by C. muris in some animals (26) and the public health significance of parasites of animal origin. These studies should analyze a larger number of specimens from more diverse regions and assess the relationship among MLST subtypes, host specificity, virulence or clinical presentations, and risk factors.
We thank Wangeci Gatei, Mosaad Hilali, and Carol Palmer for providing specimens.
This work was supported in part by the Shanghai Science and Technology Committee (grant no. 09540704400), the National Natural Science Foundation of China (grant no. 30771881 and 30928019), and Fundamental Research Funds for the Central Universities, China (grant no. WB0914044).
The findings and conclusions in this report are those of the authors and do not necessarily represent the views of the Centers for Disease Control and Prevention.
Published ahead of print on 27 October 2010.