Separating UPEC strains is contingent on the phylogenetic background, the acquisition of antibiotic resistance, and the presence or absence of virulence factor genes. In the last 4 years, the genomes of three UPEC strains have been sequenced (
Brzuszkiewicz et al., 2006;
Chen et al., 2006b;
Welch et al., 2002). Together, these sequencing endeavours have added a tremendous amount of information about the genetic makeup of UPEC strains, but have done little to sort
E. coli into distinct subgroups because of the lack of comparative analysis. Genome sequencing of more
E. coli strains would close this gap, but optical mapping is a more rapid and cost-effective technique that could provide the necessary information to assess the prevalence of virulence factor genes and changes that have led to antibiotic resistance, as well as help group the UPEC into distinct subgroups.
In this study, we have optically mapped 33 clinical isolates and compared their maps with
in silico maps of sequenced strains of
E. coli, representing what is believed to be the first study in which optical mapping has served as the basis for clustering bacterial species as shown in Fig. . From our analysis, the
E. coli isolates divided into five subgroups using UPGMA alignments and a cut-off of 30 % dissimilarity. All of the sequenced UPEC strains (536, CFT073, UTI89) fell into the first subgroup, which also held most of the recent UPEC isolates and was the more diverse of the two UPEC subgroups. Our optical mapping scheme placed the APEC O1 strain in the same subgroup as the UTI89 strain. Both strains have similar MLST patterns (
Johnson et al., 2007). It is possible that some minimal set of virulence factors is found in the genomes of these strains that allows them to initiate infections outside of the intestinal tract. Within the second subgroup, the seven isolates exhibited less diversity at the genetic level than the first subgroup. However, none of the isolates were identical. Past studies have not shown the separation of UPEC strains into major subgroups that we have demonstrated in this study. Subgroup three had the K-12 laboratory substrains of
E. coli, W3110 and DH10B. The K-12 strain has lost many of the virulence factors associated with
E. coli and has been altered to accept foreign DNA more readily. EHEC strains were found in the fourth subgroup. Two groups have successfully used optical mapping to distinguish between individual strains of ETEC H10407 and EHEC O157:H7, but strain clustering was not performed (
Chen et al., 2006a;
Kotewicz et al., 2007,
2008). The fifth subgroup harboured an environmental strain of
E. coli isolated from a heavy metal-contaminated site in South Carolina. Surprisingly, the UPEC isolates differed more from the non-UPEC isolates than the
Shigella isolates differed from the non-UPEC strains.
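As an illustration of the clustering step described above, the following is a minimal sketch of UPGMA (average-linkage) grouping with a dissimilarity cut-off. It is not the software used in this study: the function name `upgma_clusters`, the isolate labels and the pairwise dissimilarity values are hypothetical, and the way dissimilarities are derived from optical map alignments is not reproduced here.

```python
from itertools import combinations


def upgma_clusters(names, dist, cutoff):
    """Group items by average-linkage (UPGMA) clustering, stopping at `cutoff`.

    names  : list of isolate labels
    dist   : dict mapping frozenset({a, b}) -> pairwise dissimilarity (0..1)
    cutoff : merging stops once the closest pair of clusters exceeds this value
    """
    clusters = [(name,) for name in names]

    def average_distance(c1, c2):
        # UPGMA: unweighted mean of all pairwise distances between members
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Find the closest pair of current clusters
        (i, j), d = min(
            (((i, j), average_distance(clusters[i], clusters[j]))
             for i, j in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if d > cutoff:
            break
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return sorted(tuple(sorted(c)) for c in clusters)


# Hypothetical isolates and made-up dissimilarities, for illustration only
names = ["iso1", "iso2", "iso3", "iso4"]
dist = {
    frozenset(("iso1", "iso2")): 0.10,
    frozenset(("iso1", "iso3")): 0.50,
    frozenset(("iso1", "iso4")): 0.60,
    frozenset(("iso2", "iso3")): 0.55,
    frozenset(("iso2", "iso4")): 0.65,
    frozenset(("iso3", "iso4")): 0.20,
}

subgroups = upgma_clusters(names, dist, cutoff=0.30)
print(subgroups)
```

With these toy values, a 30 % cut-off leaves two subgroups, because the average dissimilarity between the two merged pairs (0.575) exceeds the threshold. Raising the cut-off merges them into one group, and lowering it splits the isolates further.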
The only significant difference between the tree based upon sequence (
Henz et al., 2005) and the tree based upon optical maps was the relative position of the
Shigella strains. A sequence-based tree will focus exclusively upon relative changes in conserved genes, whereas an optical map-based tree examines the overall structure of the genome. Since there are numerous genomic rearrangements between
E. coli and
Shigella (
Yang et al., 2005), focusing only on the conserved genes will tend to mask many of the differences between these bacteria. On the other hand, optical mapping gives an unbiased look at the whole genome as compared with a sequence-based clustering and will pick up the structural changes in the genome more readily.
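The point that an ordered restriction map responds to rearrangements even when sequence content is conserved can be illustrated with a toy in silico digest. This is a simplified sketch under stated assumptions: the recognition site (GGATCC, the BamHI site) is chosen only for illustration and is not necessarily the enzyme used in this study, the toy sequence is invented, the molecule is treated as linear, and the cut is placed at the centre of each site.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def in_silico_map(sequence, site):
    """Return ordered fragment lengths from cutting `sequence` at every `site`.

    Simplifications (assumptions): linear molecule, cut placed at the centre
    of each recognition site.
    """
    cuts = []
    pos = sequence.find(site)
    while pos != -1:
        cuts.append(pos + len(site) // 2)
        pos = sequence.find(site, pos + 1)
    boundaries = [0] + cuts + [len(sequence)]
    return [b - a for a, b in zip(boundaries, boundaries[1:]) if b > a]


SITE = "GGATCC"  # illustrative palindromic site, an assumption

# Invented toy "genome" with three recognition sites
genome = "A" * 5 + SITE + "C" * 12 + SITE + "T" * 3 + SITE + "A" * 8

# The same genome with the region spanning the three sites inverted
start, end = 5, 38
inverted = genome[:start] + reverse_complement(genome[start:end]) + genome[end:]

original_map = in_silico_map(genome, SITE)
inverted_map = in_silico_map(inverted, SITE)
print(original_map)
print(inverted_map)
```

In this toy case the two maps contain the same set of fragment sizes but in a different order, so a comparison of ordered fragments registers the inversion even though no sequence has been gained or lost.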
A high degree of similarity of UPEC isolates from disparate locations was found across the dataset, as isolates from Boston, Massachusetts, clustered closely with isolates collected in Wisconsin. From this optical mapping dataset, the geographical location did not correlate with the backbone structure of the
E. coli isolates. Thus, geographical structure appears to account for only a small fraction of the genomic variation among our isolates, which is consistent with an earlier study (
Selander et al., 1987).
Because optical mapping shows a genome-wide snapshot of each isolate at a fraction of the cost and time needed to sequence a genome, this technology could potentially be used to sort UPEC isolates into pathotypes. Prior attempts at pathotyping have been hampered by the limited genetic information available and the laborious process of screening for specific virulence factor genes (
Marrs et al., 2005;
Foxman et al., 1995). A confounding factor for UPEC is their high mutation rate (
Guttman & Dykhuizen, 1994;
Denamur et al., 2002), so virulence factor genes may be missed when mutations prevent primer binding at the target site. Because of these genetic limitations, several UPEC strains have been sequenced to study their pathogenesis: CFT073 (
Mobley et al., 1990;
Welch et al., 2002), UTI89 (
Mulvey et al., 2001;
Chen et al., 2006b) and 536 (
Hacker et al., 1983). In this study, we have shown that
E. coli isolates sort into distinct subgroups with similar haemolytic profiles. Our optical maps also show that PAIs can be identified among the isolates, including differences in PAI positioning within the genome. Acquisition of virulence factors, including those found on PAIs, is independent of the main backbone structure of
E. coli (
Lloyd et al., 2007,
2009;
Pupo et al., 1997,
2000), and the optical mapping illustrates that fact. The sorting of the
E. coli isolates by optical mapping into a few distinct groups offers some hope that the technique will assist in the pathotyping of UPEC strains. In this regard, the screening for virulence factor genes among the 33 UPEC isolates also showed strong correlations with the subgrouping represented by the optical mapping analysis. An examination of virulence factor gene distribution has been applied before to UPEC strains (
Bidet et al., 2005;
Bingen-Bidois et al., 2002;
Johnson & Stell, 2000;
Sabate et al., 2006;
Takahashi et al., 2006). These studies have shown a correlation between specific virulence factor genes and PAIs. We have also demonstrated this same correlation when comparing the data from Fig. with the data provided in Table . Subgroups of UPEC sorted by the optical mapping technology also sorted along similar lines based on the presence or absence of nine virulence factor genes. Although the UPEC isolates were quite diverse, an even more striking observation was that the one major subgroup based on a 30 % dissimilarity cut-off was very uniform for the absence of most of these key virulence factor genes. Thus, optical mapping separated the UPEC isolates along similar lines to those that would be associated with their pathotype.
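The comparison of subgroup membership with virulence factor gene screening can be summarised as the fraction of isolates in each subgroup that carry each gene. The sketch below shows one way such a tally might be computed. The function name `presence_by_subgroup`, the isolate labels, the subgroup assignments and the gene names are all hypothetical placeholders and are not data from this study.

```python
from collections import defaultdict


def presence_by_subgroup(subgroup_of, genes_of):
    """Return {subgroup: {gene: fraction of isolates carrying that gene}}.

    subgroup_of : dict mapping isolate -> subgroup label
    genes_of    : dict mapping isolate -> set of genes detected in that isolate
    Genes absent from every isolate of a subgroup are omitted for that subgroup.
    """
    counts = defaultdict(lambda: defaultdict(int))
    sizes = defaultdict(int)
    for isolate, subgroup in subgroup_of.items():
        sizes[subgroup] += 1
        for gene in genes_of.get(isolate, set()):
            counts[subgroup][gene] += 1
    return {
        sg: {gene: n / sizes[sg] for gene, n in counts[sg].items()}
        for sg in sizes
    }


# Placeholder data, for illustration only
subgroup_of = {"iso1": 1, "iso2": 1, "iso3": 2, "iso4": 2}
genes_of = {
    "iso1": {"geneA", "geneB"},
    "iso2": {"geneA"},
    "iso3": set(),
    "iso4": {"geneB"},
}

table = presence_by_subgroup(subgroup_of, genes_of)
print(table)
```

A subgroup that is uniform for the absence of most genes would show few entries, or low fractions, in such a table.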
Besides an assessment of PAIs and other virulence factor-related genetic elements, this study also points to the utility of using optical mapping for tracking antibiotic-resistance patterns among UPEC strains. The emergence and spread of antibiotic-resistant
E. coli strains is a serious health concern, particularly with respect to resistance to front-line drugs such as fluoroquinolones. Although plasmids are one route by which resistance genes are disseminated, transposon or prophage insertion and the integration of the mobile genetic elements known as integrons, which have built-in site-specific recombination systems, are more common (
Rowe-Magnus et al., 2002). Several groups have tried to correlate the phylogenetic background with antibiotic resistance, with mixed results (
Barl et al., 2008;
Bruant et al., 2006;
Graziani et al., 2009;
Piatti et al., 2008;
Rijavec et al., 2006;
Yu et al., 2004). The most successful techniques used DNA microarrays (
Barl et al., 2008;
Bruant et al., 2006;
Yu et al., 2004), but they are limited to the known genetic sequences available. In this study, all of the ciprofloxacin-resistant strains clustered together. A recent study has also shown that ciprofloxacin-resistant strains possess fewer virulence factor genes than ciprofloxacin-sensitive strains (
Graziani et al., 2009). These results demonstrate the potential utility of optical mapping for tracking antibiotic-resistant outbreak strains.
Although optical mapping revealed considerable commonality among the isolates, the technique was still able to distinguish each individual isolate. The relative ease with which
E. coli is able to gain and lose genetic elements, such as prophages and PAIs, means that each strain can be tagged as unique. Because optical mapping can track the large-scale changes mentioned above, it is capable of both clustering
E. coli strains into subgroups based on greater commonality, as shown in this study, and distinguishing between different isolates, as described elsewhere (
Kotewicz et al., 2007,
2008). Thus, we have shown for the first time that optical mapping is a viable way to subgroup bacteria and that it could be used as an alternative methodology to ribotyping or MLST analyses. In fact,
Kotewicz et al. (2007) used optical mapping to track mobile genetic elements. Overall, we see optical mapping as a powerful tool able to analyse whole-genome structure for a fraction of the cost and time needed for whole-genome sequencing.