Serogroup classifications based upon the O-somatic antigen of Shiga toxin-producing Escherichia coli (STEC) provide significant epidemiological information on clinical isolates. Each O-antigen determinant is encoded by a unique cluster of genes present between the gnd and galF chromosomal genes. Alternatively, serogroup-specific polymorphisms might be present in loci located outside of the O-antigen gene cluster. Segments of the core bacterial loci mdh, gnd, gcl, ppk, metA, ftsZ, relA and metG for 30 O26 STEC strains have previously been sequenced, and comparative analyses to O157 distinguished these two serogroups. To screen these loci for serogroup-specific traits within a broader range of clinically significant serogroups, DNA sequences were obtained for 19 strains of 10 additional STEC serogroups. Unique alleles were observed at the gnd locus for each examined STEC serogroup, and this correlation persisted when comparative analyses were extended to 144 gnd sequences from 26 O-serogroups (comprising 42 O:H-serotypes). These included O157, O121, O103, O26, O5:non-motile (NM), O145:NM, O113:H21, O111:NM and O117:H7 STEC; furthermore, non-toxin encoding O157, O26, O55, O6 and O117 strains encoded distinct gnd alleles compared to STEC strains of the same serogroup. DNA sequencing of a 643 bp region of gnd was, therefore, sufficient to minimally determine the O-antigen of STEC through molecular means, and the location of gnd next to the O-antigen gene cluster offered additional support for the co-inheritance of these determinants. The gnd DNA sequence-based serogrouping method could improve the typing capabilities for STEC in clinical laboratories, and was used successfully to characterize O121:H19, O26:H11 and O177:NM clinical isolates prior to serological confirmation during outbreak investigations.
Shiga toxin-producing Escherichia coli (STEC) are bacterial pathogens that result in both outbreak and sporadic occurrences of human mortality and disease. Symptoms can include bloody and non-bloody diarrhoea, and children are susceptible to renal failure due to haemolytic uraemic syndrome. STEC are transmitted to humans by consumption of contaminated food or water, person-to-person contact or animal-to-person contact, where natural reservoirs include cattle, pigs and sheep (Karch et al., 2005). Serogroup classifications based upon the O-somatic or H-flagellar antigens of STEC provide significant epidemiological information on clinical isolates, and this measure can provide the first indication of relatedness between strains during outbreak investigations. The serogroup is also indicative of the overall genetic relatedness between E. coli strains, including virulence gene content, such as the locus for enterocyte effacement (LEE) pathogenicity island, and the stx1 and stx2 loci encoding Shiga toxins (Prager et al., 2005; Girardeau et al., 2005; Karmali et al., 2003).
The predominant O-serogroup of STEC that is observed clinically in North America is O157 (Johnson et al., 2006); however, biased sampling likely results from the availability of clinical media and detection reagents that target this serogroup. Directed studies for the isolation and characterization of both O157 and non-O157 STEC from clinical samples have indicated that the proportion of non-O157 in North America is likely higher than clinical records have indicated (Thompson et al., 2005; Jelacic et al., 2003; Fey et al., 2000). In Canada, over 90% of STEC strains detected are serotype O157:H7 or O157:non-motile (NM) (Woodward et al., 2002). The global prevalence of non-O157 includes significant outbreaks of O26, O121, O103, O111 and O145, and in some countries it is recognized that these serogroups exceed the prevalence of O157 STEC (Karch et al., 2005). Furthermore, non-O157 strains have been identified along with O157 strains in clinical samples (Paton et al., 1996), so it is possible that a diagnostic bias towards O157 may prevent the detection of the aetiological STEC serogroup during human illness.
Molecular methods for the characterization and identification of O-antigen determinants have been devised using restriction profiling and allele-specific PCR. The entire O-antigen-encoding gene cluster could be amplified using primers that targeted conserved regions in the neighbouring gnd sequence (encoding 6-phosphogluconate dehydrogenase) and JUMPstart sequence, and enzymic digestion of this amplicon identified RFLPs correlating to O-antigen determinants (Coimbra et al., 2000). This method was problematic due to the length of the amplicon (upwards of 20 kbp) and the absence of unique restriction profiles for all serotypes. Within the O-antigen gene cluster the wzx and wzy loci encode the O-antigen flippase and polymerase, respectively, and distinct alleles corresponding to each O-serogroup have been used for molecular serogrouping of O103, O157, O26, O113 and O111 strains (Perelle et al., 2005; DebRoy et al., 2004; Paton & Paton, 1999a; Fratamico et al., 2005; D'Souza et al., 2002). It has been suggested that these assays could replace traditional serological methods (DebRoy et al., 2005); however, the individual tests currently detect only one to three O-serogroups. In the absence of a priori knowledge of a serogroup, a large number of reagents may be required to confirm serogroup identity with these methods. Robust platforms such as DNA microarrays containing wzx and wzy probes targeting up to four E. coli serogroups are currently being investigated (Liu & Fratamico, 2006), and broad subtyping of STEC has been achieved using allelic variants of a LEE-encoded determinant (Gilmour et al., 2006).
Multilocus sequence typing has been attempted for each of the STEC serotypes O26:H11, O121:H19, O103:H2 and O157:H7, but this method was not appropriate for subtyping because very few polymorphisms were observed between strains of the same serotype (Gilmour et al., 2005; Tarr et al., 2002; Noller et al., 2003; Beutin et al., 2005). The genetic differentiation and subtyping of E. coli serotype O26:H11 was attempted by sequencing 10 loci for 30 strains encoding stx1, or both stx1 and stx2 (Gilmour et al., 2005). Amongst the O26:H11 strains all loci were identical, with the exception of three alleles of mdh and two alleles of ppk that each differed by a single point mutation. Notably, comparative analyses of the mdh, gnd, gcl, ppk, metA, ftsZ, relA and metG alleles encoded by O26:H11 STEC cumulatively distinguished this serotype from O157:H7 (Gilmour et al., 2005). The conservation of these loci between O26:H11 strains, and the genetic distance from the other E. coli serotypes, suggested that sequence-based typing of additional STEC might reveal serotype-specific alleles. In this study, additional DNA sequence data at these loci were obtained for a range of STEC, and a single locus was observed to encode allelic variants correlating to individual STEC O-serogroups. We therefore present a simple molecular method for the identification of STEC serogroups, including both O157 and non-O157 strains.
STEC strains (Table 1) were obtained from the reference stocks of the Enteric Diseases Program at the National Microbiology Laboratory that originated from human sources at various Canadian provincial health laboratories during 1985–2005, or were recent clinical isolates obtained from the Alberta Provincial Laboratory for Public Health (nomenclature XX-YYYY, where XX generally refers to the year of isolation). During the course of these studies, five outbreak-associated STEC isolates were provided by Nova Scotia Public Health, Halifax, Nova Scotia, Canada. Confirmation of O:H serotype was completed with antisera prepared at the National Microbiology Laboratory (Ewing, 1986).
Template DNA was prepared by centrifuging 1 ml exponential phase culture grown in brain heart infusion broth, resuspending the pellet in 1 ml TE buffer (Sigma; 10 mM Tris/HCl, 1 mM EDTA, pH 8.0) and boiling the cells for 15 min. Boiled cells were pelleted, and the supernatant was removed and used as the DNA template in PCR.
Oligonucleotide primers used to amplify segments of mdh, gnd, gcl, ppk, metA, ftsZ, relA and metG are presented in Table 2. PCR was performed with high fidelity Platinum Taq (Invitrogen), following the manufacturer’s directions. The thermocycling parameters for ftsZ, relA and metG included an initial denaturation at 94°C for 5 min, 35 cycles of denaturation at 94°C for 40 s, annealing at 50°C for 45 s and extension at 68°C for 45 s, with a final extension at 68°C for 5 min. The annealing temperature for metA, mdh, gcl and ppk was 58°C, and 52°C for gnd. PCR products were purified using the QIAquick PCR purification kit (Qiagen) and sequenced using the same primers that generated these amplicons. Sequencing was performed on an ABI3730 (Applied Biosystems) and the data were deposited in GenBank with accession nos DQ472524–DQ472651. Existing genomic sequence data for E. coli K-12, O157:H7 Sakai, O157:H7 EDL933 and O6:H1 CFT073 (GenBank accession nos NC_000913, BA000007, NC_002655 and NC_004431, respectively) were included in our dataset for each of the above loci. From directed studies against the gnd locus (Tarr et al., 2000; Paton & Paton, 1999b; Wang et al., 1998), we included sequence data from O157:H7 and O157:NM (GenBank accession nos AF176359, AF176358, AF176357, AF176356, AF176360, AF176361 and AB008676), O113:H2 (AF172324), O111 (AF078736) and non-toxin encoding O157 and O55 (AF176368, AF176367, AF176366, AF176363, AF176362, AF176369 and AF176373). Our previously acquired sequence data from O26:H11, O26:H6 and O26:H32 strains were also included (GenBank accession nos AY973395–AY973421; Gilmour et al., 2005).
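The thermocycling parameters described above can be written down as a small data structure, which makes the per-locus differences easier to audit. The following is a minimal sketch rather than an instrument protocol: the names ANNEALING_C, thermocycling_steps and total_hold_seconds are hypothetical, and it assumes that every step other than annealing is the same for all eight loci, which the text implies but does not state outright.

```python
# Annealing temperatures (°C) per locus, as given in the text.
ANNEALING_C = {
    "ftsZ": 50, "relA": 50, "metG": 50,
    "metA": 58, "mdh": 58, "gcl": 58, "ppk": 58,
    "gnd": 52,
}
CYCLES = 35


def thermocycling_steps(locus):
    """Return the (label, temperature in °C, seconds) holds for one locus.

    Assumption: only the annealing temperature varies between loci.
    """
    anneal = ANNEALING_C[locus]
    steps = [("initial denaturation", 94, 300)]
    for _ in range(CYCLES):
        steps.append(("denaturation", 94, 40))
        steps.append(("annealing", anneal, 45))
        steps.append(("extension", 68, 45))
    steps.append(("final extension", 68, 300))
    return steps


def total_hold_seconds(locus):
    """Sum of programmed hold times; ramping between temperatures is ignored."""
    return sum(seconds for _, _, seconds in thermocycling_steps(locus))
```

Under these assumptions the programmed hold time is 5150 s (about 86 min) for every locus, excluding ramp time.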
Multiple sequence alignments were completed using ClustalW (www.ebi.ac.uk/clustalw/), neighbour-joining trees were constructed with Hasegawa–Kishino–Yano (HKY85) distance correction using SplitsTree4 (Huson, 1998), and genetic diversity statistics were calculated using DnaSP 4.10.3 (Rozas et al., 2003). Pairwise global alignments were calculated using Align (www.ebi.ac.uk/emboss/align/#).
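The percent identity values reported in this study come from pairwise global alignments. As an illustration of the final counting step only, the sketch below computes percent identity for two sequences that have already been aligned to equal length. It is not the algorithm used by Align or ClustalW; the function name percent_identity and the choice to skip gapped columns are assumptions made for this example.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # assumption: gapped columns are excluded from the count
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

For scale, a single mismatch across a 643-column ungapped alignment corresponds to 642/643, or roughly 99.84% identity.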
The alleles of mdh, gnd, gcl, ppk, metA, ftsZ, relA and metG encoded by O26:H11 STEC cumulatively distinguished this serotype from O157:H7 (Gilmour et al., 2005), and the corresponding segments of these loci were sequenced for STEC serotypes O111:NM, O113:H21, O157:NM, O145:NM, O91:H21, O121:H19, O121:NM, O103:H2, O165:H25 and O5:NM. This panel of STEC strains included isolates from each of the most predominant O-serogroups and O:H-serotypes observed in Canada (Gilmour et al., 2005, 2006), and amongst individual serotypes, strains with different stx genotypes were included when available (Table 1). This sequence dataset was compared to previously published sequence data for STEC serotypes O157:H7 and O26:H11, as well as non-toxin producing O26:H32, O26:H6, K12 and O6:H1 (strain CFT073) strains using the 4464 nucleotide concatenate of the eight genetic determinants (Fig. 1). Each of the examined serogroups had distinct sequence types, and the NM STEC strains of O121 and O157 were 99.8 and 99.9% identical to O121:H19 and O157:H7 strains, respectively. The observed phylogenetic separation between serogroups, and homogeneity within strains of the same serogroup, indicated that these genetic traits have been acquired by and vertically inherited within individual STEC serogroup lineages.
Additional sequencing was performed at selected loci in an expanded panel of strains to determine if the phylogenetic separation observed between serogroups was maintained in a larger dataset (Table 1). The genetic determinants that contributed the majority of the observed genetic diversity (gnd and gcl; Table 3) or encoded putative serogroup-specific regions (ppk and relA; data not shown) were selected for further study. This panel included further strains from the serotypes represented in Fig. 1, as well as seropathotype D and non-toxin encoding E. coli strains recovered from paediatric stool samples (L. Chui, unpublished data). The overall genetic distinction between STEC serogroups (as determined in the eight locus scheme) was also represented amongst these four loci, and the additional strains and serogroups (Fig. 2).
The gnd locus was the most genetically diverse of all examined loci (Table 3), and notably, this determinant is immediately adjacent to the O-antigen gene cluster. Additional sequencing of the 643 bp region of gnd was performed (Table 1), and gnd sequence data available in GenBank for O157, O113 and O111 STEC, as well as non-toxin encoding O157 and O55 strains, were also included in comparative analyses. In total, gnd DNA sequences were collected from 144 strains and 26 O-serogroups (comprising 42 O:H-serotypes). The overall genetic distinction between serogroups (as determined in the eight and four loci schemes) was also represented in this single locus, as each examined STEC O-serogroup encoded a unique gnd allele (Fig. 3). For some of the most clinically significant STEC serogroups (O157, O26, O121, O145, O111 and O103) the gnd DNA sequences were compared between multiple strains (from 5 to 43 sequences), and for each serogroup all STEC strains encoded an identical gnd allele (Fig. 3). The only exception was O157:H7 strain 87-16 (GenBank accession no. AF176360), which encoded a single nucleotide polymorphism compared to the other O157 strains, but otherwise the gnd alleles were conserved within STEC serogroup classifications. Furthermore, non-toxin encoding strains of O157, O26, O55, O6 and O117 encoded distinct gnd alleles compared to STEC strains of the same serogroup. Sequence typing of gnd was, therefore, a promising molecular method correlating minimally with the O-serogroup of clinical STEC strains. The O111:NM STEC and non-toxin-producing O55 strains encoded gnd sequences outlying from the main cluster (Fig. 3) and these were homologous to Citrobacter spp. gnd alleles (Nelson & Selander, 1994). However, since pure bacterial isolates are preferred for preparation of DNA sequencing template, all isolates undergoing gnd DNA sequence-based serogrouping should previously be classified as STEC.
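The correlation between gnd alleles and O-serogroups described above suggests a simple lookup: compare a query gnd sequence against one reference allele per serogroup and report the closest match. The sketch below illustrates that idea only. The function names, the one-mismatch tolerance and the requirement for equal-length, pre-aligned sequences are assumptions made for this example, and any reference sequences supplied to it in testing are toy placeholders, not real gnd alleles.

```python
def hamming(seq_a, seq_b):
    """Number of mismatching positions between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    return sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)


def assign_serogroup(query, references, max_mismatches=1):
    """Return (serogroup, mismatches) for the closest reference allele.

    references maps a serogroup label to its reference allele sequence.
    The serogroup is None when the closest allele exceeds max_mismatches
    (a hypothetical tolerance, not a threshold taken from the study).
    Ties are resolved in favour of the first label in the mapping.
    """
    if not references:
        raise ValueError("at least one reference allele is required")
    label, distance = min(
        ((name, hamming(query, allele)) for name, allele in references.items()),
        key=lambda pair: pair[1],
    )
    if distance > max_mismatches:
        return None, distance
    return label, distance
```

A result of None indicates that no reference allele is close enough, in which case serological confirmation would still be needed; as noted in the text, isolates should also be classified as STEC before this kind of comparison is applied.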
During the course of this study, outbreak-related isolates of non-O157 STEC were sent to the National Microbiology Laboratory for serotyping and genetic characterization. The gnd sequence data for each of isolates 05-6541 to 05-6543 clustered with known O121 strains (Fig. 3). A concurrent non-O157 sporadic isolate (05-6544) was also examined at gnd and this sequence clustered with known O26:H11 strains (Fig. 3). Strain 06-5121 was isolated from a hospitalized patient with haemolytic uraemic syndrome and the gnd sequence of this strain was 99.8% identical to a known O177:NM isolate (Fig. 3). In correlation with these molecular data, subsequent serotyping using traditional methodologies characterized these isolates as O121:H19, O26:H11 and O177:NM. The gnd DNA sequence-based serogrouping method therefore provided an advantageous alternative to O-specific immunoreagents during these crises. Over 55 serogroups of STEC have been reported to be associated with human disease (Johnson et al., 2006), and an international panel of STEC strains from each serogroup, including the emerging sorbitol-fermenting O157, will be required to further validate this method.
The proportion of synonymous and nonsynonymous mutations was calculated for each locus from the accumulated DNA sequence data (Table 3). As expected for core loci, the majority of mutations were synonymous (dN/dS <1), but the gnd locus had the greatest number of nonsynonymous sites. This locus has already been identified as a polymorphic E. coli locus compared to other core loci (Bisercic et al., 1991; Nelson & Selander, 1994; Dykhuizen & Green, 1991). A comparable ratio of synonymous versus nonsynonymous mutations was also reported by Bisercic et al. (1991). Genetic diversity at gnd arose in parallel to the extensive diversity and recombination that occurred at the neighbouring O-antigen gene cluster, and it is likely that these two genetic traits were co-inherited between lineages (Tarr et al., 2000; Nelson & Selander, 1994). To our knowledge, there is no indication that O-serogroups that encode similar gnd alleles (e.g. STEC O121 and O55) also encode similar O-antigen gene clusters, nor are the antigens themselves similar. The utility of a locus subject to recombination between genera might seem limited for the purpose of molecular-based serogrouping; however, in this study we observed conserved STEC serogroup-specific genetic polymorphisms at the gnd locus. Between strains of an individual STEC O-serogroup the gnd alleles were conserved, and no serogroup encoded a gnd allele that was identical to that of another serogroup. This study provides a simple method for molecular-based serogrouping of E. coli strains encoding stx, which can be detected by a wealth of molecular reagents (Gilmour et al., 2006; Hsu et al., 2005; Nielsen & Andersen, 2003; Reischl et al., 2002; Wang et al., 2002). This method was used to characterize O121:H19, O26:H11 and O177:NM clinical isolates prior to serological confirmation during an outbreak investigation, and could, therefore, improve the scope of STEC molecular diagnostics beyond the O157 serogroup.
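The distinction between synonymous and nonsynonymous mutations discussed above can be illustrated with a short sketch that compares two in-frame, aligned coding sequences codon by codon. This is a raw count of differing codons and not the dN/dS calculation performed by DnaSP, which normalizes by the number of synonymous and nonsynonymous sites. The function name count_codon_differences is hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_codon_differences(seq_a, seq_b):
    """Return (synonymous, nonsynonymous) counts of differing codons.

    Both sequences must be aligned, equal in length and start in frame.
    Codons containing gaps or ambiguous bases are skipped, as is any
    trailing partial codon.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    synonymous = 0
    nonsynonymous = 0
    usable = len(seq_a) - len(seq_a) % 3
    for i in range(0, usable, 3):
        codon_a = seq_a[i:i + 3].upper()
        codon_b = seq_b[i:i + 3].upper()
        if codon_a == codon_b:
            continue
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue
        if CODON_TABLE[codon_a] == CODON_TABLE[codon_b]:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous
```

For example, GCT and GCC both encode alanine, so that difference counts as synonymous, whereas AAA (lysine) versus AGA (arginine) counts as nonsynonymous.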
We thank Tim Mailman and Nova Scotia Public Health for providing outbreak-associated strains, and John Wylie at the Manitoba Cadham Provincial Laboratory, Winnipeg, Manitoba, Canada, Ana Paccagnella at the British Columbia Centre for Disease Control, Vancouver, British Columbia, Canada, and Yvonne Yaschuk at New Brunswick Public Health, Saint John, New Brunswick, Canada, for providing strains. We also thank Julie Walsh, Helen Tabor, Dobryan Tracz and Clifford Clark for helpful discussions. Oligonucleotide synthesis and DNA sequencing were performed by the DNA core facility, and serology was performed by the Serotyping and Identification Unit at the National Microbiology Laboratory. This work was supported by the Office of Biotechnology and Science.
The GenBank/EMBL/DDBJ accession numbers for the nucleotide sequences reported in this paper are DQ472524–DQ472651.