Strain typing methods that compare electrophoresis banding patterns are commonly used but are difficult to standardize and poorly portable. Multilocus sequence typing (MLST) is a sequence-based alternative, but it is not practical for large-scale epidemiological studies. In the present study, the usefulness of fimH single-nucleotide polymorphisms (SNPs) for Escherichia coli typing was explored. fimH SNPs were determined for 345 E. coli clinical isolates (including 3 reference strains) and compared to PCR-based ECOR (E. coli reference collection) phylogrouping. The fimH gene could be amplified for 316 (92%) of the 345 isolates. fimH SNP analysis found 46 distinct terminal groups in the nucleotide sequence-based phylogenetic tree (fimH types). A subset of the E. coli isolates (162 clinical isolates and the 3 reference strains) were compared by fimH type, PCR phylogroup, and MLST. These isolates fell into 27 fimH types and 18 MLST clonal complexes (CCs) that contained 2 to 28 isolates per complex. The combination of PCR phylogroup and fimH type corresponded to a single CC for 113 (68%) isolates and 2 or 3 CCs for the other 52 (32%) isolates. We propose that the combination of PCR phylogrouping and fimH SNP analysis may be a useful method to type a large collection of clinical E. coli isolates for epidemiologic studies.
Molecular strain typing methods have enhanced our understanding of the epidemiology of many infectious diseases by contributing to the characterization of new modes of transmission, vehicles, and risk factors for infections. Recently, a new understanding about the epidemiology of community-acquired urinary tract infections (UTI) emerged from the systematic strain typing analysis of uropathogenic Escherichia coli (UPEC). Community-acquired UTI typically occurs as an endemic infection. In 2001, Manges and colleagues identified clusters of UTI on 3 different college campuses in the United States caused by an E. coli clonal group that was designated CgA based on a characteristic enterobacterial repetitive intergenic consensus (ERIC)-PCR electrophoresis banding pattern (14). Additional strain typing by E. coli reference collection (ECOR) phylogenetic grouping, pulsed-field gel electrophoresis (PFGE), multilocus sequence typing (MLST), serotyping, and virulence profiling revealed that the CgA strains belonged to an identical or a very closely related lineage of E. coli (9, 14, 20). In 2009, in Rio de Janeiro, Brazil, CgA was also found to comprise the largest clonal group observed among women with community-acquired UTI (4).
Although reference laboratory surveillance systems using PFGE, such as PulseNet, have been highly successful, the analysis of multiple electrophoresis banding patterns is subject to interlaboratory variability and is not amenable to simple standardization for portability and the creation of public databases. MLST was developed as an attractive sequence-based genotyping technique because it provides reproducibility, comparability, and transferability between laboratories. Although MLST seems to be an excellent technique, it is still impractical for large-scale epidemiological studies. In a previous study, Tartof et al. explored fimH single-nucleotide polymorphism (SNP) analysis as a screening test for the epidemiological study of UPEC (21). FimH is a specific adhesin located at the tip of type 1 fimbriae that determines mannose-sensitive binding of bacteria to target cells (11). fimH SNPs were compared with MLST for 34 E. coli isolates from urine, blood, animals, and water, belonging to 14 distinct sequence types (STs), including 22 CgA isolates and strains K-12 and CFT073 (21). The two techniques (MLST versus fimH SNP analysis) showed similar discriminatory powers; however, the study used a collection of well-defined, selected E. coli isolates. The performance of fimH SNP analysis on population-based clinical isolates of E. coli is not known. In the present study, we explored fimH SNP analysis as a typing tool for a large collection of community- and hospital-associated extraintestinal E. coli isolates and compared this typing method to ECOR phylogrouping and MLST.
(This work was presented in part at the 49th Interscience Conference on Antimicrobial Agents and Chemotherapy, San Francisco, CA, 12 to 15 September 2009.)
Between 2001 and 2006, 342 E. coli isolates from clinical samples were collected in Rio de Janeiro, Brazil. One hundred twenty-seven and 12 isolates were recovered from urine samples of women and men, respectively, with UTI in the community of Rio de Janeiro (4, 5); 42 hospital isolates were recovered from different clinical specimens (including blood, lower respiratory tract, surgical site, normally sterile secretions and tissues, catheter tip, and other sources) of patients (22 men and 20 women) at a public university-affiliated hospital (3); and 120 and 41 isolates were recovered from urine samples of hospitalized female and male patients, respectively (unpublished data). Only one isolate per patient was included. E. coli reference strains ATCC BAA-457 (CgA), CFT073, and K-12 were also studied for comparison.
E. coli isolates were obtained from strains stored in suspensions containing skim milk (10%) and glycerol (10%) at −80°C. Isolates were incubated at 37°C on nutrient agar plates overnight. Single colonies were picked and inoculated into 2 ml of nutrient broth and further incubated in a shaking incubator for 16 to 18 h at 37°C. A 500-μl suspension of bacteria was centrifuged, and chromosomal DNA was extracted from the bacterial pellet suspended in 100 μl of sterile ultrapure water by boiling for 10 min. After thermal lysis, the supernatant was transferred to a new microtube and kept at −20°C.
fimH SNP analysis was performed as reported previously (21), with modifications. The primers used for PCR amplification and partial fimH gene sequencing were FimH-f (5′-CGAGTTATTACCCTGTTTGCTG-3′) and FimH-r (5′-ACGCCAATAATCGATTGCAC-3′). Both strands of the 878-bp PCR-amplified fragment (located at bp 7 to 884 of E. coli sequence NC_000913 [GenBank]) were sequenced. After visual inspection and editing with BioEdit (version 126.96.36.199), fragments of 424-bp fimH sequences (located at bp 401 to 824 of E. coli sequence NC_000913 [GenBank]) were compared to the sequence of E. coli K-12.
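The comparison of each 424-bp fragment to the K-12 sequence can be sketched as a simple position-by-position scan. This is a minimal illustration and not the procedure used in the study: the function name `find_snps`, the `start` parameter, and the toy sequences are hypothetical, and the sequences are assumed to be already aligned and of equal length.

```python
def find_snps(reference, query, start=1):
    """Return (position, reference base, query base) for each mismatch.

    `start` is the coordinate assigned to the first aligned base, so
    start=401 would report positions on the gene scale used in the text.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    snps = []
    pairs = zip(reference.upper(), query.upper())
    for pos, (ref_base, qry_base) in enumerate(pairs, start=start):
        # Ignore gaps and ambiguous base calls; report only clean substitutions.
        if ref_base in "ACGT" and qry_base in "ACGT" and ref_base != qry_base:
            snps.append((pos, ref_base, qry_base))
    return snps


# Toy 12-bp sequences (hypothetical, not real fimH data).
k12_reference = "ATGGCCTTAGCA"
isolate = "ATGGTCTTAGCG"
print(find_snps(k12_reference, isolate))
```

Isolates that yield the same list of substitutions relative to the reference would share a fimH type under this scheme.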
We performed MLST using a standardized protocol for E. coli maintained at the MLST databases at the ERI website (http://mlst.ucc.ie/) of University College Cork, based on the seven housekeeping genes adk, fumC, icd, purA, gyrB, recA, and mdh. The primers used for amplification and sequencing of both DNA strands were the same as those described on the MLST website, except for the adk forward primer. A 704-bp adk fragment was obtained with primer adkF-f2 (5′-CTCGCCATTAACCGTTTCA-3′), instead of the 583-bp fragment described on the MLST website. Briefly, amplifications were carried out in a total volume of 50 μl. Each reaction mixture contained 0.2 mM each deoxynucleoside triphosphate, 1.5 mM MgCl2, 0.2 μM each primer, 2.5 U AmpliTaq Gold (Applied Biosystems), and 2 μl template DNA. The reaction conditions were 2 min of initial denaturation at 95°C, followed by 30 cycles of 1 min of denaturation at 95°C, 1 min of annealing at 54°C (for adk, fumC, icd, and purA) or 60°C (for gyrB, mdh, and recA), and 2 min of extension at 72°C, and then a final extension step of 5 min at 72°C.
Phylogenetic grouping of E. coli isolates was assessed by a previously reported triplex-PCR-based assay (2, 6). The results allowed the classification of isolates into one of the four major phylogroups (A, B1, B2, or D). E. coli strains ATCC BAA-457 (CgA), CFT073, and K-12 were used as controls for groups D, B2, and A, respectively.
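The triplex PCR result can be read as a two-step decision tree. The sketch below assumes the commonly described scheme based on the presence or absence of three markers (chuA, yjaA, and the TspE4.C2 fragment); these marker names and the function name `assign_phylogroup` are not given in the text above and are included only for illustration.

```python
def assign_phylogroup(chua, yjaa, tspe4c2):
    """Assign an ECOR phylogroup from three presence/absence PCR results.

    Assumed decision tree (marker names are not stated in this paper):
      chuA present -> B2 if yjaA is present, otherwise D
      chuA absent  -> B1 if TspE4.C2 is present, otherwise A
    """
    if chua:
        return "B2" if yjaa else "D"
    return "B1" if tspe4c2 else "A"


# Each tuple is (chuA, yjaA, TspE4.C2) scored as band present/absent.
for bands in [(True, True, False), (True, False, False),
              (False, False, True), (False, False, False)]:
    print(bands, assign_phylogroup(*bands))
```

Because each step depends on the presence or absence of a single band, the assignment needs no comparison of banding patterns between gels.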
Amplicons were purified with a QIAquick gel extraction kit (Qiagen, Inc., Valencia, CA), and DNA sequencing of both strands was carried out at the DNA Sequencing Facility of the University of California, Berkeley, CA. The forward and reverse PCR primers were used for sequencing of the fimH and housekeeping genes. The facility runs a 25-cycle sequencing reaction with the following program: 10 s at 96°C, 5 s at 50°C, and 4 min at 60°C. Forward- and reverse-strand DNA sequence traces were visually inspected and edited with BioEdit Sequence Alignment Editor, version 188.8.131.52 (7).
The finished sequences were aligned by using ClustalW Multiple Alignment in BioEdit. The alignment data were assessed by bootstrap analyses based on 1,000 resamplings. The 424-bp fimH gene fragment alignment of the 342 clinical E. coli isolates and E. coli reference strains was imported into MEGA4 (19) for phylogenetic analysis and construction of trees. For the same strains, an MLST alignment was made from the concatenated sequences of the seven housekeeping genes (3,423 bp). To conserve computing time, duplicate concatenated sequences were removed from the input sample. Aligned sequences were examined for molecular evolutionary relationships by the neighbor-joining method (17). The significance of the branching order was evaluated by bootstrap analysis with 500 replicates. The evolutionary distances were computed by the Kimura 2-parameter method (10).
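For readers unfamiliar with the Kimura 2-parameter model, the distance between two aligned sequences is d = −(1/2) ln[(1 − 2P − Q) √(1 − 2Q)], where P and Q are the proportions of sites showing transitions and transversions, respectively. The following is a minimal sketch of that formula; it is not the MEGA4 implementation, and the function name and the handling of gaps (sites with non-ACGT characters are skipped) are assumptions made for illustration.

```python
import math

PURINES = {"A", "G"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumption: skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        # A substitution within purines or within pyrimidines is a transition.
        if (a in PURINES) == (b in PURINES):
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    term = (1 - 2 * p - q) * math.sqrt(max(1 - 2 * q, 0.0))
    if term <= 0:
        raise ValueError("sequences too divergent for the K2P correction")
    return -0.5 * math.log(term)


# Toy example: one transition (A->G) in 10 sites, so P = 0.1 and Q = 0.
print(round(k2p_distance("AAAAACCCCC", "GAAAACCCCC"), 4))
```

A matrix of such pairwise distances is the input to a neighbor-joining tree.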
The fimH gene sequences analyzed (partial coding sequence [CDS] located at bp 44 to 862 of E. coli sequence NC_000913 [GenBank]) were deposited in GenBank under accession nos. GQ486876 to GQ486913 and GQ487043 to GQ487188 (hospital isolates), GQ486914 to GQ487042 (community isolates), and GQ487189 to GQ487191 (reference strains ATCC BAA-457 [CgA], K-12, and CFT073, respectively).
We first examined SNPs of the 424-bp fimH sequence of all E. coli isolates to estimate more precisely the potential use of this gene as a typing tool. Twenty-nine (8%) of the 345 E. coli isolates included in the study were fimH negative (10 [7%] community isolates and 19 [9%] hospital isolates).
Each DNA sequence was compared to that of E. coli K-12. Among the 345 isolates, 46 distinct fimH allelic variants were observed (Fig. 1), with 53 unique mutations (SNPs) at 49 polymorphic sites. All mutations were point substitutions; 14 (26%) were transversions, 39 (74%) were transitions, 16 resulted in amino acid replacements, and 37 were silent substitutions. Among the amino acid replacements, only four were caused by transversion. Seventeen of the 53 SNPs were singletons (observed in only one fimH type), with 8 amino acid replacements.
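The two classifications used above (transition versus transversion, and silent versus amino acid replacement) can be illustrated for a single substituted codon. This sketch assumes the standard genetic code and that the codon is read in the correct frame; the function name `classify_snp` and the example codons are hypothetical and are not taken from the fimH data.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
PURINES = {"A", "G"}


def classify_snp(ref_codon, alt_codon):
    """Classify a single-base change between two in-frame codons."""
    if len(ref_codon) != 3 or len(alt_codon) != 3:
        raise ValueError("codons must be 3 bases long")
    diffs = [(r, a) for r, a in zip(ref_codon, alt_codon) if r != a]
    if len(diffs) != 1:
        raise ValueError("codons must differ at exactly one position")
    ref_base, alt_base = diffs[0]
    # Purine<->purine or pyrimidine<->pyrimidine changes are transitions.
    same_class = (ref_base in PURINES) == (alt_base in PURINES)
    kind = "transition" if same_class else "transversion"
    same_aa = CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]
    effect = "silent" if same_aa else "replacement"
    return kind, effect


print(classify_snp("GCT", "GCC"))  # third-position T->C in an alanine codon
print(classify_snp("GCT", "GAT"))  # second-position C->A, alanine to aspartate
```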
We analyzed the distribution of fimH variants with an unrooted phylogram (Fig. 1). In this tree, 46 distinct terminal groups representing distinct allelic variants were named f-1 (E. coli K-12 type) to f-46. GenBank accession numbers for each fimH type are shown in Table 1. No specific node was observed on the tree according to the source or date of isolation. However, some fimH types were found only among hospital (f-16, f-18, f-22 to f-28, and f-36 to f-46) or community (f-17 and f-29 to f-35) isolates. The fimH-negative phenotype was named f-0.
The correlation between fimH type and MLST was further explored in a subset of 165 isolates. This subset comprised the more diverse collection of UPEC isolates obtained in the community (139 isolates, including 10 fimH-negative strains), the three reference strains, and all isolates in the largest group sharing identical fimH sequences (29 isolates already included in the collection of community isolates and 23 additional hospital isolates). In this way, we selected the isolates with the highest possible diversity together with the largest possible cluster.
The 165 isolates were classified into 51 distinct STs, including 16 single-locus (13) or double-locus (3) variants (SLV and DLV, respectively) of known STs. Five new STs were observed (ST697, ST706, ST827, ST828, and ST1393). Of these five STs, three included one new allele each (mdh101, icd154, or icd215). These data were deposited into the E. coli MLST website database (http://web.mpiib-berlin.mpg.de). The isolates were then classified into clonal complexes (CCs), defined as groups of isolates with an identical ST or an SLV or DLV from another ST within the CC (12). A total of 149 (90%) of the 165 isolates grouped into 18 CCs with 2 to 28 isolates per CC, numbered from “I” (CC with 28 isolates) to “XVIII” (CC with two isolates) (Fig. 2). The remaining 16 (10%) isolates belonged to a distinct ST each. The 10 fimH-negative strains fell into 3 MLST CCs and 3 STs not included in a CC.
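The CC definition above (an ST joins a complex if it is identical to, or an SLV or DLV of, another ST in that complex) amounts to single-linkage grouping of seven-locus allelic profiles that differ at no more than two loci. The sketch below illustrates that idea with a small union-find structure; it is not the software used in the study, and the function name, the ST labels, and the allele numbers are all hypothetical.

```python
from itertools import combinations


def group_clonal_complexes(profiles, max_diff=2):
    """Group STs whose allelic profiles differ at <= max_diff loci (chained)."""
    parent = {st: st for st in profiles}

    def find(st):
        # Follow parent links to the group representative, compressing the path.
        while parent[st] != st:
            parent[st] = parent[parent[st]]
            st = parent[st]
        return st

    for st_a, st_b in combinations(profiles, 2):
        n_diff = sum(x != y for x, y in zip(profiles[st_a], profiles[st_b]))
        if n_diff <= max_diff:
            parent[find(st_a)] = find(st_b)  # merge the two groups

    groups = {}
    for st in profiles:
        groups.setdefault(find(st), set()).add(st)
    # Largest group first, mirroring the "I" to "XVIII" numbering by size.
    return sorted(groups.values(), key=lambda g: (-len(g), sorted(g)))


# Hypothetical seven-locus profiles (adk, fumC, icd, purA, gyrB, recA, mdh).
toy_profiles = {
    "ST_a": (1, 1, 1, 1, 1, 1, 1),
    "ST_b": (1, 1, 1, 1, 1, 1, 2),  # SLV of ST_a
    "ST_c": (1, 1, 1, 1, 4, 3, 2),  # DLV of ST_b, joined to ST_a through ST_b
    "ST_d": (9, 9, 9, 9, 9, 9, 9),  # unrelated singleton
}
print(group_clonal_complexes(toy_profiles))
```

In this toy input, ST_c differs from ST_a at three loci but still falls in the same group because it is a DLV of ST_b, which shows how chaining can extend a complex.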
The 165 isolates were discriminated into 27 fimH types. The fimH type and MLST assignments were then compared. Each of 19 fimH types corresponded to a single CC or an ST not included in a CC. The other 8 fimH types corresponded to multiple CCs or STs; these CCs and STs also included several different fimH types.
The PCR phylogrouping discriminated the 165 isolates into 75 B2, 59 D, 20 A, and 11 B1 ECOR groups. Of 27 fimH types, 20 were observed to belong to a single PCR phylogroup (f-20 and f-32 to phylogroup A; f-10, f-11, f-15, f-21, and f-35 to B1; f-5, f-13, f-19, f-33, and f-34 to D; and f-2, f-3, f-8, f-14, f-17, f-29, f-30, and f-31 to B2); the other 7 fimH types belonged to 2 (f-7, f-9, and f-12), 3 (f-6), or all 4 PCR phylogroups (f-0, f-1, and f-4) (Table 2). All isolates within each ST belonged to a single PCR phylogroup (Fig. 2). In addition, fimH type-MLST combinations within each phylogroup were also unique (Table 2). On the other hand, the fimH type-PCR phylogroup combinations corresponded to a single CC (or an ST not included in a CC) for 113 (68%) isolates and to 2 or 3 CCs (or STs not included in a CC) for the other 52 (32%) isolates.
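The proportion of isolates whose fimH type-PCR phylogroup combination points to a single CC can be computed by collecting the set of CCs seen for each combination. The following is a minimal sketch under the assumption that each isolate is recorded as a (phylogroup, fimH type, CC) tuple; the function name and the toy records are hypothetical and do not reproduce the study data.

```python
from collections import defaultdict


def single_cc_fraction(isolates):
    """Return the fraction of isolates whose (phylogroup, fimH type)
    combination maps to exactly one CC, plus the combination-to-CC map."""
    ccs_per_combination = defaultdict(set)
    for phylogroup, fimh_type, cc in isolates:
        ccs_per_combination[(phylogroup, fimh_type)].add(cc)
    unambiguous = sum(
        1 for phylogroup, fimh_type, _ in isolates
        if len(ccs_per_combination[(phylogroup, fimh_type)]) == 1
    )
    return unambiguous / len(isolates), dict(ccs_per_combination)


# Hypothetical records: (PCR phylogroup, fimH type, CC or singleton ST label).
toy_isolates = [
    ("B2", "f-2", "I"),
    ("B2", "f-2", "I"),
    ("D", "f-5", "II"),
    ("A", "f-1", "III"),
    ("A", "f-1", "IV"),  # same combination as the previous isolate, different CC
]
fraction, mapping = single_cc_fraction(toy_isolates)
print(fraction)
print(mapping)
```

Combinations that map to more than one CC identify the isolates that would still need MLST to be resolved.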
A new phylogenetic tree was built with the 155 concatenated MLST sequences (corresponding to 46 STs) of the fimH-positive isolates. This new tree was then rebuilt with the 26 fimH type sequences added to the concatenated MLST sequences. As expected, the tree containing fimH was more discriminatory (with 57 terminal groups) than the tree without fimH (with 46 terminal groups). The nodes corresponding to the PCR phylogroups in the tree containing fimH overlapped the respective nodes in the tree without fimH. The internal nodes showed a few changes, but, with only one exception (isolates included in CCII), the arrangements within CCs were maintained (data not shown).
In a previous study, fimH SNP analysis was explored as a possible screening tool to type UPEC (21). For a select and small collection of E. coli isolates, the fimH SNP analysis showed a discriminatory power similar to that of MLST (21). In the present study, we explored the usefulness of this typing method for a larger and population-based collection of E. coli isolates. Here we found that fimH SNP analysis showed a lower discriminatory power than MLST. Furthermore, the fimH type assignments were inconsistent with those made by MLST. This finding indicates that fimH SNP analysis is not useful as a screening tool in a population-based sample of clinical isolates of E. coli. Nevertheless, when the fimH type was combined with the PCR phylogroup, the combination formed unambiguous types which correlated well with the MLST CCs. Specific fimH type-PCR phylogroup combinations corresponded to 1 to 3 CCs (or single STs not included in CCs) for the entire collection of isolates studied. Thus, although the combination was still less discriminatory than MLST, the application of this combination test could substantially reduce the number of isolates that need to be tested by MLST. On the other hand, MLST itself may be less discriminatory than PFGE. In the study of UPEC, isolates included in one ST are usually also clustered by PFGE with at least 68% similarity (4, 15, 20).
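Discriminatory power is compared qualitatively here. One common way to express it as a number is the Hunter-Gaston index, D = 1 − [1/(N(N − 1))] Σ n_j(n_j − 1), where N is the number of isolates and n_j is the number of isolates of type j. The present study does not report this index; the sketch below is offered only to show how such a comparison could be made, with a hypothetical function name and toy labels.

```python
from collections import Counter


def hunter_gaston_index(type_labels):
    """Probability that two isolates drawn without replacement differ in type."""
    n = len(type_labels)
    if n < 2:
        raise ValueError("at least two isolates are required")
    counts = Counter(type_labels).values()
    return 1 - sum(c * (c - 1) for c in counts) / (n * (n - 1))


# Toy labels (hypothetical): the same six isolates typed by two methods.
coarse_method = ["x", "x", "x", "y", "y", "y"]
fine_method = ["p", "p", "q", "r", "s", "s"]
print(hunter_gaston_index(coarse_method))
print(hunter_gaston_index(fine_method))
```

A method that splits the same isolates into more, smaller types gives a value closer to 1.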
In conclusion, fimH SNP analysis together with PCR phylogrouping seems to be a simple strain typing tool for epidemiological studies of E. coli isolates. The sequence analysis involves a single small (424 bp) DNA fragment. The phylogrouping analysis involves comparison of only up to three bands of known molecular weight; it does not involve analysis of electrophoresis banding patterns, as does PFGE or PCR fingerprinting. This analysis requires only 2 sets of primers as opposed to 7 with MLST. Although this typing tool may not replace MLST, this method is suitable for screening large collections of E. coli isolates, allowing for the rapid identification of STs or CCs. Furthermore, concatenated MLST and fimH sequences were found to be more discriminatory than concatenated sequences based on MLST alone. However, the epidemiological usefulness of the clusters formed by fimH typing needs further studies that include detailed patient clinical and epidemiologic information.
This study was supported by Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq), Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES), and Fundação de Amparo à Pesquisa do Estado do Rio de Janeiro (FAPERJ), Brazil; the Fogarty International Program in Global Infectious Diseases (grant TW006563); and grant AI059523 from the National Institutes of Health.
Published ahead of print on 16 December 2009.