Clostridium botulinum type A strains are known to be genetically diverse and widespread throughout the world. Genetic diversity studies have focused mainly on strains harboring one type A botulinum toxin gene, bont/A1, although all reported bont/A gene variants have been associated with botulism cases. Our study provides insight into the genetic diversity of C. botulinum type A strains that contain bont/A2 (n = 42) or bont/A3 (n = 4) genes and were isolated from diverse samples and geographic origins. Genetic diversity was assessed by using bont nucleotide sequencing, content analysis of the bont gene clusters, multilocus sequence typing (MLST), and pulsed-field gel electrophoresis (PFGE). Sequences of bont genes obtained in this study showed 99.9 to 100% identity with other bont/A2 or bont/A3 gene sequences available in public databases. The neurotoxin gene clusters of the subtype A2 and A3 strains analyzed in this study were similar in gene content. C. botulinum strains harboring bont/A2 and bont/A3 genes were divided into six and two MLST profiles, respectively. Four groups of strains shared a similarity of at least 95% by PFGE; the largest group included 21 out of 46 strains. The strains analyzed in this study showed relatively limited genetic diversity using either MLST or PFGE.
Botulinum neurotoxins (BoNTs) are produced principally by Clostridium botulinum, but rare strains of Clostridium butyricum and C. baratii can also produce BoNT serotypes E and F, respectively. There are seven serologically distinct BoNTs (designated serotypes A through G), which were originally defined by the neutralization of toxicity by specific polyclonal antibodies. The amino acid sequences of these seven toxin serotypes differ by 35 to 70% (22). Nucleotide sequences of the botulinum toxin genes (bont) also show diversity within a single serotype, and several variants of bont genes have been identified over the past 5 years in serotypes A, B, E, and F (1, 3, 5, 8, 16, 20). These bont variants are often referred to as toxin subtypes, although little is known about the effect of these genetic variations on toxin structure or function. In addition to the diversity observed among the bont gene sequences, C. botulinum strains can be grouped into four phylogenetically distinct lineages, and it was proposed previously that these groups are in fact four different clostridial species (4).
Type A is one of the most common serotypes that cause human botulism in the United States and other countries. From 2001 to 2010, 52% (727/1,404) of botulism cases reported by the Centers for Disease Control and Prevention (CDC) were due to serotype A strains (http://www.cdc.gov/nationalsurveillance/botulism_surveillance.html). In addition, Koepke et al. (14) reported previously that 51% (1,516/2,943) of infant botulism cases reported worldwide from 1976 to 2006 were due to serotype A strains. Consequently, the genetic diversity within this serotype has been extensively investigated. Based on phylogenetic analyses, five subtypes of the bont/A gene that differ by 2.9 to 16% at the amino acid level have been reported (subtypes A1 through A5) (1, 5, 8). The genes encoding components of the BoNT/A complexes can also differ in sequence and organization; i.e., bont/A1 genes can be associated with an ha cluster (ha70, ha17, ha33, botR, ntnh, and bont genes) or with an orfX cluster (orfX3, orfX2, orfX1, botR, p47, ntnh, and bont genes). Studies of bont/A2, bont/A3, and bont/A4 genes so far have shown these gene variants to be associated with the orfX cluster only (10).
There is no known correlation between a particular bont gene variant and the pathogenicity of the host strain. Moreover, similar bont genes are found in strains with different genomic backgrounds, and diverse bont genes are present in strains that have similar genomic backgrounds (9). Therefore, it is important to understand the genomic diversity of the organism as a whole in addition to the variability of the bont gene. To estimate the genetic diversity of C. botulinum type A strains, several strains have been analyzed by different molecular subtyping methods, such as amplified fragment length polymorphism (AFLP), DNA microarrays, multilocus sequence typing (MLST), multiple-locus variable-number tandem-repeat analysis (MLVA), and pulsed-field gel electrophoresis (PFGE) (1, 8, 10, 13, 15, 18, 19, 21). Those studies focused mainly on strains harboring bont/A1 genes, although all reported bont/A gene variants have been associated with botulism cases worldwide (1, 5, 6, 8, 16, 18). Therefore, our knowledge about the genetic diversity among strains containing bont/A2 or bont/A3 genes is substantially limited. In this study, we used PCR analysis of the bont gene cluster, nucleotide sequencing of the bont genes, MLST, and PFGE to assess the genetic diversity among 42 strains of C. botulinum harboring a bont/A2 gene and 4 strains harboring a bont/A3 gene.
At least 1,000 nucleotides were amplified and sequenced to screen bont gene subtypes of 46 C. botulinum strains; the sequencing of bont genes was then completed for 30 strains (Table 1). Overlapping fragments of the bont/A gene were amplified and sequenced by using previously reported primers (16, 21). PCR was performed as described previously (20). BioNumerics 5.10 (Applied Maths, Austin, TX) was used for sequence assembly and phylogenetic analysis of the bont genes. Multiple-sequence alignments were generated with MULTALIN (http://bioinfo.genotoul.fr/multalin/). Neighbor-joining (NJ) trees were created with MEGA4 (23), using full-length bont gene sequences generated in this study and previously reported sequences. GenBank accession numbers for nucleotide sequences are shown in Table 1.
Seven genes were used for MLST analysis: recA, rpoB, oppB, hsp60, aceK, mdh, and aroE. These genes were amplified and sequenced by using primers and thermocycling conditions described previously by Jacobson et al. (10). PCR amplicons were purified by using the UltraClean PCR Cleanup kit (MoBio, Carlsbad, CA) and then sequenced by using an Applied Biosystems 3730 DNA analyzer. Sequences were assembled with BioNumerics 5.10 (Applied Maths, Austin, TX). Allelic numbers and sequence types (STs) were identified by querying the Clostridium botulinum MLST Database (http://pubmlst.org/cbotulinum/) (11). New alleles and new STs were submitted to the Clostridium botulinum MLST Database (http://pubmlst.org/cbotulinum/) (11). Neighbor-joining trees were constructed by using concatenated sequences of the seven genes with MEGA4 software (23); the inferred clusters were tested with 500 bootstrap replications.
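The assignment of a sequence type from the seven allele numbers can be illustrated with a minimal sketch. It assumes that an ST is simply a unique combination of allele numbers at the seven loci; the function names, the placeholder ST labels, and all allele numbers below are hypothetical and are not entries from the Clostridium botulinum MLST Database.

```python
from typing import Dict, Optional, Tuple

# Locus order follows the seven genes listed in the text.
LOCI = ("recA", "rpoB", "oppB", "hsp60", "aceK", "mdh", "aroE")

Profile = Tuple[int, ...]


def profile_key(alleles: Dict[str, int]) -> Profile:
    """Return the allele numbers as a tuple in the fixed locus order."""
    missing = [locus for locus in LOCI if locus not in alleles]
    if missing:
        raise ValueError(f"missing loci: {missing}")
    return tuple(alleles[locus] for locus in LOCI)


def lookup_st(alleles: Dict[str, int], known: Dict[Profile, str]) -> Optional[str]:
    """Return the ST label for an allelic profile, or None if the combination is new."""
    return known.get(profile_key(alleles))


# Hypothetical profiles for illustration only; NOT real database entries.
KNOWN_PROFILES: Dict[Profile, str] = {
    (1, 1, 1, 1, 1, 1, 1): "ST-X",
    (1, 2, 1, 1, 3, 1, 1): "ST-Y",
}

# An isolate whose seven alleles match a known combination.
isolate = {"recA": 1, "rpoB": 2, "oppB": 1, "hsp60": 1, "aceK": 3, "mdh": 1, "aroE": 1}
# The same isolate with a different (hypothetical) mdh allele: an unseen combination.
novel = dict(isolate, mdh=9)

print(lookup_st(isolate, KNOWN_PROFILES))
print(lookup_st(novel, KNOWN_PROFILES))
```

A lookup that returns no match corresponds to the situation described above in which a new allele or a new combination of known alleles is submitted to the database as a new ST.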
C. botulinum strains were analyzed by PFGE according to PulseNet protocols (http://www.pulsenetinternational.org/protocols). Isolates that were not typeable by that method were treated with formaldehyde, as described previously by Hielm et al. (7). TIFF images of gels were analyzed with BioNumerics 5.10 (Applied Maths, Austin, TX). PFGE patterns were compared by unweighted pair group method with arithmetic mean (UPGMA) analysis using the Dice coefficient with a tolerance window of 1.5% and an optimization of 1.5%.
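The band-based Dice coefficient used for the PFGE comparison can be sketched as follows. This is a simplified illustration and not the BioNumerics implementation: it assumes that band positions are expressed as a percentage of the normalized run length, that two bands match when their positions differ by no more than the tolerance, and that bands are paired greedily in sorted order. The optimization (pattern shift) setting and the UPGMA clustering step are not modeled, and the band positions are invented.

```python
from typing import Sequence


def count_matches(a: Sequence[float], b: Sequence[float], tol: float) -> int:
    """Count one-to-one band matches between two lanes within a position tolerance."""
    a, b = sorted(a), sorted(b)
    i = j = matches = 0
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= tol:
            # Bands are close enough: pair them and advance both lanes.
            matches += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1  # band a[i] has no partner in lane b
        else:
            j += 1  # band b[j] has no partner in lane a
    return matches


def dice_similarity(a: Sequence[float], b: Sequence[float], tol: float = 1.5) -> float:
    """Dice coefficient (in percent): 2 * shared bands / total bands in both lanes."""
    if not a and not b:
        return 100.0
    shared = count_matches(a, b, tol)
    return 100.0 * 2 * shared / (len(a) + len(b))


# Invented band positions (percent of run length) for two hypothetical lanes.
lane_1 = [10.0, 25.0, 40.0, 60.0, 85.0]
lane_2 = [10.5, 25.0, 43.0, 60.9, 85.0]

print(dice_similarity(lane_1, lane_2))  # 4 shared bands out of 5 + 5 -> 80.0
```

In this toy comparison the bands at 40.0 and 43.0 fall outside the 1.5 tolerance and are therefore counted as different, which lowers the similarity from 100% to 80%.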
Sequences of the bont genes from strains CDC52353, CDC53119, CDC53120, CDC53123, CDC53125, CDC53126, CDC53127, CDC53174, CDC54053, CDC54054, CDC54055, CDC54056, CDC54057, CDC54059, CDC54060, CDC54061, CDC54062, CDC54063, CDC54064, CDC54065, CDC54066, CDC54067, CDC54069, CDC54070, CDC54071, CDC54082, CDC54094, CDC54095, CDC57329, and CDC65104 were deposited in the GenBank database under accession numbers JX110946 to JX110975. New allele sequences of the mdh, aceK, rpoB, oppB, and aroE genes were deposited in the GenBank database under accession numbers JX193722, JX193723, JX193724, JX193725, and JX193726, respectively.
The full-length coding sequences of bont/A genes from 30 C. botulinum strains were compared with other bont/A gene sequences available in public databases. Sequences from 23 C. botulinum strains showed 100% identity with the previously reported bont/A2 nucleotide sequence of strain Kyoto-F (GenBank accession number X73423) (Table 1). The nucleotide sequences from four other strains differed from the bont/A2 nucleotide sequence of that strain by only 1 nucleotide (strains CDC54053 and CDC54066) or 3 nucleotides (strains CDC54060 and CDC54082). Three strains (CDC54054, CDC54059, and CDC54064) harbored bont nucleotide sequences that differed from the bont/A3 gene sequence of strain CDC40234 by 8 nucleotides (99.9% identity). Interestingly, 6 of those 8 different nucleotides are identical to those found in bont/A2 sequences. The dendrogram resulting from the analysis of 32 bont/A nucleotide sequences, including sequences from public databases and sequences obtained in this study, showed the previously described subtypes A1 through A5 (Fig. 1). Nucleotide identities among the bont/A sequences represented in Fig. 1 ranged between 92 and 100%.
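The identity values reported above are of the form (matching positions)/(aligned length). A minimal sketch of that calculation is given here, assuming two sequences that have already been aligned to the same length and treating any gap character as a mismatch; the sequences are arbitrary toy strings, not bont sequences.

```python
def count_differences(seq_a: str, seq_b: str) -> int:
    """Count positions at which two pre-aligned sequences differ (case-insensitive)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(seq_a.upper(), seq_b.upper()) if x != y)


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions that are identical."""
    if not seq_a:
        raise ValueError("sequences must not be empty")
    diffs = count_differences(seq_a, seq_b)
    return 100.0 * (len(seq_a) - diffs) / len(seq_a)


# Arbitrary 20-nucleotide toy sequences with a single substitution at index 10.
reference = "ATGCATGCATGCATGCATGC"
variant = "ATGCATGCATTCATGCATGC"

print(count_differences(reference, variant))  # 1
print(percent_identity(reference, variant))   # 95.0
```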
The organization of the neurotoxin cluster genes in the 46 C. botulinum strains was analyzed by using PCR assays designed to target internal fragments of each gene and specific intergenic regions. Although all strains contained ntnh, p21, p47, orfX1, orfX2, and orfX3 genes, the sizes of the intergenic spacing between the p21 and orfX1 genes varied among them. Most of the strains had the 1.2-kb insertion sequence between p21 and orfX1 reported previously for strains of subtypes A2, A3, and A4 (10); however, strains CDC53160, CDC54060, CDC54073, and CDC2171 did not have that insertion (data not shown).
Seven genes were sequenced (recA, rpoB, oppB, hsp60, aceK, mdh, and aroE), and the allelic numbers and STs were identified by querying the Clostridium botulinum MLST Database (http://pubmlst.org/cbotulinum/) (11). Strains harboring bont/A2 genes were divided into three previously known STs (ST-2, ST-7, and ST-22) and three new STs (ST-26, ST-27, and ST-28) (Table 1). ST-2 was the most common ST among those strains, containing 27 out of 42 strains. ST-7 and ST-26 included six strains each, and ST-22, ST-27, and ST-28 had one strain each. Strains harboring bont/A3 genes were divided into two STs: ST-3 and a new ST, ST-25 (Table 1). Notably, all eight STs reported here included strains isolated from specimens associated with botulism cases.
The newly described ST-25, ST-26, and ST-27 represented new combinations of known alleles, while ST-28 corresponded to new alleles of the mdh, aceK, rpoB, oppB, and aroE genes (Table 2). To determine the genetic relatedness of the newly described STs, an NJ dendrogram was constructed by using concatenated sequences of the seven genes (Fig. 2). Compared to previously reported STs, ST-25 and ST-27 clustered with ST-2 and ST-22; ST-26 clustered with ST-4, ST-6, and ST-12; and ST-28 did not cluster with other STs, although it was not an outlier either.
The 42 strains containing a bont/A2 gene and the 4 strains containing a bont/A3 gene were analyzed by PFGE. Twenty-seven SmaI profiles were identified among those strains (Table 1); six profiles were represented by two or more strains (Fig. 3). PFGE profile 9 was the largest and included 13 strains that originated from samples collected in Argentina: 6 strains isolated from stool specimens linked to infant botulism cases, 2 strains isolated from soil, 3 strains isolated from foods associated with food-borne botulism cases, and 2 strains isolated from chamomile. Interestingly, this profile also included one strain isolated from infant formula in the United States. Similarly, PFGE profiles 10 and 11 included strains originating from samples collected in the United States and Argentina.
Three strains containing a bont/A3 gene (CDC54054, CDC54059, and CDC54064) showed 90% similarity by PFGE, although their patterns were not identical. Those strains had very distinct PFGE SmaI profiles compared to that of the other strain harboring a bont/A3 gene, strain CDC40234 (Fig. 3).
Interestingly, PFGE profiles seemed to correspond with MLST types in most cases. For instance, although only three ST-7 strains were 100% identical, all six ST-7 strains were grouped by PFGE with 80% similarity. In addition, all three ST-25 strains were 90% similar by PFGE, and the six ST-26 strains were grouped by PFGE with 80% similarity.
Type A is one of the most common serotypes that cause human botulism in several countries, including the United States (2, 14), and numerous investigations have focused on the genetic diversity of this serotype. All previously reported bont/A gene variants have been associated with botulism cases worldwide (1, 5, 6, 8, 16, 18); however, most of the studies assessing genetic diversity examined strains harboring bont/A1 genes. Here, we analyzed the genetic diversity among 42 C. botulinum strains containing a bont/A2 gene and 4 strains harboring a bont/A3 gene.
Sequences of bont genes obtained in this study showed 99.9 to 100% identity with other bont/A2 or bont/A3 gene sequences available in public databases. Previously described subtypes A1 through A5 were also noticeable in the dendrogram shown in Fig. 1. It is worthwhile to mention that bont sequences from strains CDC2171, CDC54068, CDC41376, and CDC41370 could represent new subtypes, although the value of naming bont sequences as subtypes is unclear. The bont gene subtypes have been defined as bont sequences differing by at least 2.6% at the amino acid level (22); this definition was proposed in 2005 based on the differences observed among 49 sequences of bont/A through bont/G genes available at that time. Currently, numerous bont gene sequences are available in public databases, and many differ by less than 2.6% at the amino acid level. For instance, five variants of bont/E differ from other bont/E subtypes by 0.9 to 2.2% at the amino acid level (17), and a new bont/B variant (subtype B3) differs from the closest sequence by 2% at the amino acid level (8). The reporting of additional unique sequences will likely increase as more strains are analyzed, making the classification of bont gene sequences into subtypes more intricate. Although it is not clear if all amino acid variations will result in changes to toxin properties, evidence exists to demonstrate that such variations can affect toxin function; e.g., BoNT/A1 and BoNT/A2 show differential monoclonal antibody binding, and BoNT/F5 cleaves synaptobrevin-2 in a different location from the other BoNT/F subtypes (12, 22). We propose that nonidentical bont gene sequences within a single toxin subtype be identified as gene variants and that those that result in a toxin characteristic that differs from other variants (e.g., differences in antigenicity, toxicity, potency, binding to receptors, substrate cleavage site, or pathogenicity) be classified as bont subtypes.
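The 2.6% amino acid criterion discussed above can be applied to the divergence values quoted in this report with a short sketch. The function name is hypothetical, and the code checks only the numerical threshold; it does not model the functional criteria proposed here.

```python
# Amino acid divergence (percent) proposed in 2005 as the subtype cutoff (22).
SUBTYPE_THRESHOLD_PCT = 2.6


def meets_subtype_threshold(aa_difference_pct: float,
                            threshold: float = SUBTYPE_THRESHOLD_PCT) -> bool:
    """Return True if the amino acid difference is at least the threshold."""
    return aa_difference_pct >= threshold


# Divergence values quoted in the text.
quoted_values = {
    "bont/A subtypes A1 through A5, lower bound": 2.9,
    "bont/E variants, lower bound": 0.9,
    "bont/E variants, upper bound": 2.2,
    "bont/B subtype B3 vs. closest sequence": 2.0,
}

results = {label: meets_subtype_threshold(value) for label, value in quoted_values.items()}
for label, passes in results.items():
    print(f"{label}: {'meets' if passes else 'below'} the 2.6% criterion")
```

Only the bont/A value meets the cutoff; the bont/E and bont/B variants fall below it even though they have been reported as subtypes, which is the inconsistency that motivates the proposal above.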
The neurotoxin gene clusters of the strains analyzed here were similar in gene content, since all strains harbored ntnh, p21, p47, orfX1, orfX2, and orfX3 genes; however, the intergenic spacing between the p21 and orfX1 genes varied among some of them. Most of the strains had the 1.2-kb insertion sequence between p21 and orfX1 reported previously for bont/A2, bont/A3, and bont/A4 (10), but four strains did not have that insertion sequence. Raphael et al. also reported the absence of that insertion sequence in strain CDC2171 (20), and Franciosa et al. reported the lack of the same insertion sequence in another strain harboring a bont/A2 gene (strain Mascarpone) (6). It is unclear if recombination has occurred within the toxin gene cluster of those strains, resulting in the absence of the partial insertion sequence.
C. botulinum is a very diverse bacterial group. In addition to the diversity among the bont gene sequences and toxin gene cluster arrangements, the genomic backgrounds that contain those genes vary as well. Horizontal transfer events and recombination events have been identified among C. botulinum strains, and the variability observed among the bont gene clusters does not always correspond to a particular genomic background (9). Thus, we used MLST and PFGE to provide a comprehensive assessment of the genetic diversity among these type A strains harboring bont/A2 and bont/A3 genes.
Compared to strains harboring bont/A1 genes, the C. botulinum strains with bont/A2 and bont/A3 genes used in this study showed a limited diversity by MLST, since they were divided into six and two STs, respectively, while strains with bont/A1 genes were divided into 20 STs (10). The dendrogram in Fig. 2 shows that ST-2, ST-22, ST-25, ST-26, and ST-27 are grouped together, and thus, the majority of the strains analyzed in this study were related by MLST, regardless of their source or origin. This may be due to the selection of housekeeping genes used for the MLST analysis, based primarily on strains with bont/A1 genes (10). Alternatively, this limited diversity by MLST could be due to a more conserved genetic background among strains harboring bont/A2 or bont/A3 genes. This idea is also supported by the PFGE results. Although the 46 C. botulinum strains were divided into 27 PFGE profiles, the dendrogram in Fig. 3 shows that four groups of strains shared a similarity of ≥95%, and the largest group included 21 out of the 46 strains analyzed. These groups included strains isolated from diverse sources from Argentina and the United States, including clinical and food specimens associated with botulism cases as well as environmental samples unrelated to botulism cases. MLST profiles were represented by strains isolated from environmental samples as well as strains isolated from stool or food samples associated with botulism cases. These observations indicate that strains that are indistinguishable by subtyping methods are not necessarily epidemiologically related. For instance, although strains FRI Honey and CDC53140 produced identical PFGE and MLST profiles, one was isolated from honey in the United States and the other one was isolated in Argentina from a stool sample from an infant with botulism. It is tempting to suggest that these strains are somehow related; however, the worldwide distribution of PFGE and MLST profiles remains unknown. Thus, it is not currently possible to draw any conclusions about the relatedness of these strains. Additional studies are needed to determine the genetic variability among C. botulinum strains from diverse sources and geographical areas to completely understand their relatedness.
It is interesting to note that strain CDC53174, which originated from a sample collected in Africa, had unique MLST and SmaI profiles and did not cluster with any of the other strains. Unfortunately, this was the only strain from that continent available in the CDC culture collection, and thus, it is not possible to draw further conclusions.
The only two strains harboring bont/A2 previously available in the C. botulinum MLST database produced an ST-2 profile (10), as did the majority of the strains analyzed in our study. The six strains in our study that produced an ST-7 profile had bont/A2 genes, whereas the only ST-7 profile previously available in the C. botulinum MLST database corresponded to a single strain (CDC657) which harbors a bont/A4 gene (10). Similarly, the only ST-22 profile in the C. botulinum MLST database was a strain of Clostridium sporogenes (10), while in our study, one C. botulinum strain with a bont/A2 gene produced an ST-22 profile.
The strains analyzed in this study showed limited genetic diversity using either MLST or PFGE. Other molecular subtyping methods, such as AFLP, DNA microarrays, and MLVA, have also shown limited genetic diversity among C. botulinum strains harboring a bont/A2 gene, although the number of strains analyzed in those studies was very limited (1, 8, 10, 18). This apparently highly conserved genetic background could represent an obstacle for laboratory investigations of botulism outbreaks; therefore, methods with higher resolution, such as whole-genome sequencing, may be required to distinguish unrelated strains.
This study enhances our understanding of the genetic diversity of C. botulinum type A strains by providing information about strains that harbor bont/A2 or bont/A3 genes. Our results highlight the significance of determining the genetic diversity of C. botulinum as a whole in addition to sequencing bont genes, in particular during botulism outbreak investigations, when it may be important to determine the relatedness of isolates from different patients or to identify the sources of BoNT or the origin of C. botulinum spores. In some investigations, epidemiologic information will be critical to establish links between isolates that appear to be genetically similar by currently available methods.
DNA sequencing was performed at the Division of High-Consequence Pathogens and Pathology Genomics Unit, Centers for Disease Control and Prevention.
This publication was supported by funds made available from the Centers for Disease Control and Prevention Office of Public Health Preparedness and Response.
The findings and conclusions in this report are those of the authors and do not necessarily represent the views of the Centers for Disease Control and Prevention.
Published ahead of print 5 October 2012