Infect Immun. 2009 November; 77(11): 4696–4703.
Published online 2009 August 31. doi:  10.1128/IAI.00522-09
PMCID: PMC2772509

Genetic Structure and Distribution of the Colibactin Genomic Island among Members of the Family Enterobacteriaceae


A genomic island encoding the biosynthesis and secretion pathway of putative hybrid nonribosomal peptide-polyketide colibactin has been recently described in Escherichia coli. Colibactin acts as a cyclomodulin and blocks the eukaryotic cell cycle. The origin and prevalence of the colibactin island among enterobacteria are unknown. We therefore screened 1,565 isolates of different genera and species related to the Enterobacteriaceae by PCR for the presence of this DNA element. The island was detected not only in E. coli but also in Klebsiella pneumoniae, Enterobacter aerogenes, and Citrobacter koseri isolates. It was highly conserved among these species and was always associated with the yersiniabactin determinant. Structural variations between individual strains were only observed in an intergenic region containing variable numbers of tandem repeats. In E. coli, the colibactin island was usually restricted to isolates of phylogenetic group B2 and inserted at the asnW tRNA locus. Interestingly, in K. pneumoniae, E. aerogenes, C. koseri, and three E. coli strains of phylogenetic group B1, the functional colibactin determinant was associated with a genetic element similar to the integrative and conjugative elements ICEEc1 and ICEKp1 and to several enterobacterial plasmids. Different asn tRNA genes served as chromosomal insertion sites of the ICE-associated colibactin determinant: asnU in the three E. coli strains of ECOR group B1, and different asn tRNA loci in K. pneumoniae. The detection of the colibactin genes associated with an ICE-like element in several enterobacteria provides new insights into the spread of this gene cluster and its putative mode of transfer. Our results shed light on the mechanisms of genetic exchange between members of the family Enterobacteriaceae.

Horizontal gene transfer between bacteria—even between different species—has been shown to be an important mechanism for exchange of genetic material. This confers a selective advantage to the recipient, e.g., the rapid acquisition of gene clusters coding for pathogenicity or fitness factors. The colibactin genomic island previously discovered in Escherichia coli (17, 23) displays several features of a horizontally acquired genomic region: (i) the chromosomal insertion into the asnW tRNA locus, (ii) the presence of a P4-like integrase gene, (iii) the presence of flanking 16-bp direct repeats, and (iv) an elevated G+C content relative to the E. coli core genome. This genomic island is ~54 kb in size and consists of 20 open reading frames (ORFs), of which 8 code for putative polyketide synthases, nonribosomal peptide synthetases, and hybrids thereof. Until the discovery of this island the only known nonribosomal peptide and polyketide/nonribosomal peptide hybrids in Enterobacteriaceae have been the iron chelators enterobactin and yersiniabactin, respectively (13, 29). In contrast to these iron chelators, the synthesized hybrid nonribosomal peptide-polyketide colibactin exerts a cytopathic effect on eukaryotic cells in vitro. Upon cocultivation of colibactin island-positive bacteria with eukaryotic cells, DNA double-strand breaks are induced, and the cells are arrested in the G2 phase of the cell cycle and exhibit megalocytosis and cell death (23). These effects are comparable to the effects of the cyclomodulin cytolethal distending toxin (27, 36), but the biological function of colibactin in vivo is still unknown.

An important mechanism during the evolution of bacteria is horizontal gene transfer. This contributes to the variability of bacterial genomes by enabling bacteria to acquire and incorporate genetic material into their genome, where it may form genomic islands (14). Such genetic material may not always be advantageous to the host and is therefore a genetic and metabolic burden for the bacteria. In this case bacterial genomes tend to lose this excessive information (1, 21). On the other hand, genetic material coding for pathogenicity or fitness factors confers a selective advantage to the host. In the case of pathogenic bacteria, this horizontally acquired genetic material may contribute to the colonization and invasion of host tissue. Increased bacterial fitness or pathogenicity promotes the stabilization of the corresponding determinants in the recipient's genome, and the stable integration of horizontally acquired DNA is most frequently connected to a distinct biological function (25).

Until now, the colibactin island had been detected only in E. coli isolates of phylogenetic lineage ECOR-B2 (23), where it was significantly associated with other virulence gene clusters among extraintestinal pathogenic E. coli (ExPEC) isolates of ECOR group B2 from diverse clinical sources and with a high virulence potential (18, 19). To learn more about the capacity of this genomic island to disseminate, we investigated its distribution, genetic conservation, and structural organization among members of the Enterobacteriaceae.


Bacterial strains and culture conditions.

A total of 640 enterohemorrhagic E. coli strains, 205 extraintestinal pathogenic E. coli isolates, 135 E. coli fecal isolates from healthy volunteers, and 56 E. coli isolates from diverse sources were included in the present study. Furthermore, 287 Klebsiella isolates were tested. This group was composed of 141 clinical Klebsiella pneumoniae isolates from France and Germany; 103 K. oxytoca strains (including a well-characterized collection from Sweden) (38); 14 K. terrigena, 11 K. planticola, 1 K. edwardsii, 1 K. rhinoscleromatis, and 1 K. ozaenae isolate; and 15 Klebsiella strains that were not further typed. A total of 114 Salmonella enterica isolates from different subspecies and serovars, including the SARC collection and 13 yersiniabactin-positive isolates of subspecies III and VI (26), and 40 Yersinia strains (including multiple Y. pestis, Y. pseudotuberculosis, Y. enterocolitica, and Y. kristensenii isolates) were also tested for the presence of the colibactin island. In addition, 33 Proteus strains (including several P. mirabilis, P. morganii, and P. penneri isolates), 17 Serratia strains, 12 Enterobacter strains (including 11 E. aerogenes and 1 E. cloacae strain), and 10 Shigella isolates (including 4 S. dysenteriae, 2 S. flexneri, 2 S. sonnei, and 2 S. boydii strains) were included in the present study. In addition to the sequenced Citrobacter koseri strain ATCC BAA-895, four C. freundii isolates were screened for the colibactin gene cluster. We also examined two Photorhabdus luminescens, three Xenorhabdus spp., two Pantoea agglomerans, one Providencia sp., and one Erwinia herbicola isolate, as well as one Escherichia hermannii and one E. fergusonii isolate, for the presence of the colibactin island. E. coli JM109 [recA1 endA1 gyrA96 thi-1 hsdR17(rK− mK+) e14−(mcrA−) supE44 relA1 Δ(lac-proAB)/F′ (traD36 proAB+ lacIq lacZΔM15)] was used to prepare competent cells. A deletion mutant of colibactin-producing E. coli strain Nissle 1917 that carries a 29.5-kb deletion comprising the yersiniabactin determinant (ybtS-fyuA) was used to investigate the dependence of colibactin expression on the presence of the yersiniabactin determinant. Depending on the experiment, the strains were grown in Luria broth or on Mueller-Hinton agar at 37°C for 18 to 24 h.

Detection of the colibactin island in different isolates.

The presence of the colibactin island among enterobacterial isolates was determined by PCR using primers published previously (23). The initial PCR screening was performed with the primers ORF 1907-1908 and ORF 1919-1920 using E. coli strain Nissle 1917 genomic DNA as a positive control. Intra-colibactin island-specific PCRs were then performed using the primers ORF 1911-1912, ORF 1913-1914, ORF 1915-1918, and ORF 1920-1922. The primers asnW-PAIleftend and asnW-PAIrightend specific for the left and right junctions of the colibactin island, respectively, were also used.


The allocation of the E. coli isolates to different clonal lineages was performed as described elsewhere. Sequence types (STs) were assigned using the E. coli multilocus sequence typing (MLST) database hosted at the University College Cork, Cork, Ireland. Information regarding new STs was deposited at the E. coli MLST database.

Sequencing of the clbA gene and the left junction of the colibactin island.

According to the published sequence of the colibactin island (accession no. AM229678), primers ClbA 1F and ClbA 1R (see Table S1 in the supplemental material) were used to amplify a 735-bp fragment from the K. pneumoniae strain CF1 genomic DNA. Amplification was performed using Platinum Taq DNA polymerase (Invitrogen) according to the manufacturer's instructions. The PCR fragment was purified with the NucleoSpin Extract II kit from Macherey-Nagel. The primers ClbA 1F and asnW-PAIleftend2 (see Table S1 in the supplemental material) were used to amplify a 3,303-bp fragment from strain CF1. Amplification was performed by using the Expand Long Template PCR system (Roche Diagnostics, Meylan, France) according to the manufacturer's instructions. The PCR fragment was purified, ligated into the pDrive cloning vector (Qiagen PCR cloning kit), and transformed into JM109 competent cells. Plasmid DNA was isolated by using NucleoSpin plasmid (Macherey-Nagel), and the left-hand junction of the colibactin determinant was sequenced by using the universal primers SP6 and T7 promoter. Sequences were commercially obtained from Cogenics (Meylan, France). Sequence homology was analyzed by using the BLAST 2.0 search algorithm at the National Center for Biotechnology Information (2). The sequence has been submitted to the National Center for Biotechnology Information database (accession no. FJ899134). Putative ORFs were identified by using Vector NTI (InforMax, Oxford, United Kingdom) and Artemis (30). The Artemis Comparison Tool was used as a DNA sequence comparison viewer (10).

Characterization of the colibactin determinant left-hand sequence context.

The left-hand junction of the colibactin gene cluster in other colibactin-positive K. pneumoniae and E. aerogenes strains was amplified with the primers pksp F and asnW-PAIleftend2 (see Table S1 in the supplemental material). In addition, ICEKp1-like elements adjacent to the colibactin determinant were searched for among the colibactin-positive K. pneumoniae and E. aerogenes strains by PCR using primers targeting the fyuA (5′ region) and virB1 (3′ region) genes as described by Lin et al. (20). Primers outside the middle region (HPI 3′-F and virB1-F inverse) and primers derived from the left (orf3-R) and right (orf16-F) parts of the middle region were also used. Analysis of the C. koseri strain ATCC BAA-895 genome sequence (accession no. CP000822) indicated the insertion of a DNA region with similarity to ICEEc1 and ICEKp1 between the clbQ gene and the intP4 homolog of the colibactin gene cluster. To examine this 85-kb DNA region between the colibactin biosynthesis determinant and the next asn tRNA gene of C. koseri strain ATCC BAA-895 in more detail in other enterobacterial isolates, 102 primer pairs (see Table S1 in the supplemental material) were designed to screen for the presence of this genomic region by overlapping PCRs.

Determination of chromosomal integration sites.

To determine the chromosomal integration site of the colibactin island in Klebsiella and E. coli strains, restriction fragments of genomic DNA were sequenced by inverse PCR (24). Alternatively, genomic DNA was directly sequenced by primer walking starting from the regions with known nucleotide sequences. In case of direct sequencing of genomic DNA, a Qiagen genomic DNA isolation kit was used to isolate the genomic DNA. A total of 6 μg of DNA was used as a template for direct sequencing of genomic DNA using an ABI 310 sequencer. Primer concentrations ranged from 0.5 to 1 μM. For destabilization of DNA secondary structures, betaine was added to a concentration of 0.25 M.

Analysis of the variable-number tandem repeat (VNTR) region.

The number of variable repeats between clbB and clbR was determined by DNA sequence analysis upon amplification of the corresponding genomic region by PCR using the primers varregionPKS.for and varregionPKS.rev. The resulting PCR product was then purified by using a QIAquick PCR purification kit (Qiagen) and directly used as a template for DNA sequencing with the primer pair varregionPKSseq.for and varregionPKSseq.rev using an ABI 310 sequencer.

Cell culture, bacterial infections, γH2AX staining, and cell cycle analysis.

For bacterial infections, overnight Luria broth cultures of bacteria were diluted in interaction medium (Dulbecco modified Eagle medium, 5% fetal calf serum, 25 mM HEPES [Invitrogen]), and ~50% confluent HeLa cell cultures (ATCC CCL2) were infected at a multiplicity of infection of 100. Cells were washed three to six times 4 h after inoculation and incubated in Dulbecco modified Eagle medium, 10% fetal calf serum, and 200 μg of gentamicin or 500 μg of streptomycin/ml until analysis as described previously (23). Briefly, for γH2AX staining the cells were fixed in 95% methanol-5% acetic acid and then incubated with anti-phospho(Ser139)-H2AX antibodies (JBW301; Upstate), followed by fluorescein isothiocyanate-conjugated secondary antibodies. DNA was stained with TO-PRO-3 (Invitrogen), and images were acquired with an Olympus IX70 laser scanning confocal microscope, objective PlanApo ×60 (NA 1.4), and Fluoview FV500 software, the confocal aperture being set to achieve a z optical thickness of ~0.5 μm. For cell cycle analysis, nuclear suspensions were made directly from adherent cells with 0.1% sodium citrate, 1% NP-40, 50 μg of propidium iodide/ml, and 250 μg of RNase/ml. Nuclear DNA content data were acquired with a FACSCalibur (Becton Dickinson) and analyzed with FlowJo software (Tree Star).


Prevalence of the colibactin island among different genera and species of the Enterobacteriaceae.

A total of 1,565 bacterial isolates of different enterobacterial genera and species (Table 1) were screened by PCR (Fig. 1) for the presence of the colibactin determinant.

FIG. 1.
Genetic structure of the colibactin island of E. coli strain IHE3034. This genomic island is flanked by direct repeats (DR) and is inserted at asnW into the bacterial chromosome. The colibactin biosynthesis gene cluster is indicated by gray arrows. DNA ...
TABLE 1.
Presence of colibactin island in different members of the family Enterobacteriaceae and related genera

In E. coli, 104 of 1,092 (9.5%) isolates tested harbored the island. The majority (73.1%) of the colibactin-positive E. coli strains were clinical ExPEC, and 26.9% were commensal E. coli strains isolated from healthy volunteers. In contrast, none of the 689 intestinal pathogenic E. coli strains tested harbored the colibactin island. Interestingly, this gene cluster was only present in E. coli strains of phylogenetic lineage B2, except for three extended-spectrum β-lactamase-positive E. coli O153:H31 isolates of ECOR group B1 (U12633, U19010, and U15156). According to MLST, these three strains belong to ST 101.

The colibactin island was detected not only in E. coli but also in 5 of 141 (3.5%) K. pneumoniae strains (strains CF1, CF44, 41, Kp52145, and SB3431), 3 of 11 (27.3%) E. aerogenes strains (strains 20, 50, and 64), and a C. koseri isolate (ATCC BAA-895). These colibactin island-positive strains are clinical extraintestinal pathogenic isolates, which are frequently resistant to several antibiotics.
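The prevalence figures quoted above follow directly from the reported counts. As a minimal sketch (the helper name `prevalence_percent` is ours and is not part of the study), the percentages can be recomputed from the positive and tested isolate numbers given in the text:

```python
# Counts as reported in the text: (colibactin-positive isolates, isolates tested).
reported = {
    "E. coli": (104, 1092),
    "K. pneumoniae": (5, 141),
    "E. aerogenes": (3, 11),
}


def prevalence_percent(positive: int, tested: int) -> float:
    """Return the share of positive isolates as a percentage, rounded to one decimal."""
    if tested <= 0:
        raise ValueError("number of tested isolates must be positive")
    return round(100.0 * positive / tested, 1)


for species, (positive, tested) in reported.items():
    print(f"{species}: {positive}/{tested} = {prevalence_percent(positive, tested)}%")
```

The single C. koseri isolate is left out because a percentage based on one strain carries no information about prevalence.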

The functionality of the polyketide gene cluster in K. pneumoniae, E. aerogenes, E. coli B1, and C. koseri strains was confirmed on HeLa cell cultures (Fig. 2). The cells exposed to each colibactin island-positive strain exhibited histone H2AX phosphorylation, cell body and nucleus enlargement (megalocytosis), G2 cell cycle arrest, and DNA fragmentation (sub-G1 peak indicative of cell death), findings indicating that all colibactin-producing enterobacteria induced host DNA double-strand breaks similarly to colibactin-positive E. coli strains of phylogenetic group B2 (23). Thus, the cytopathic phenotype associated with the colibactin island was fully conserved in the different enterobacterial isolates.

FIG. 2.
Phenotypic analysis of colibactin expression in different enterobacteria. HeLa cells were infected for 4 h with C. koseri, K. pneumoniae, E. aerogenes, and ECOR-B1 E. coli representative isolates or with DH10B pBACpks as a positive control (23). In the ...

A positive correlation was observed between the presence of the colibactin determinant and the high pathogenicity island (HPI) coding for the siderophore system yersiniabactin (9): all colibactin-positive strains were also yersiniabactin positive. Nevertheless, the presence of the HPI was not always associated in E. coli, K. pneumoniae, and E. aerogenes with that of the colibactin genes. Similarly, in S. enterica isolates of subspecies IIIa, IIIb, and VI that have been previously described to be HPI positive (26), the colibactin gene cluster could not be detected by PCR screening. Expression of colibactin was independent of yersiniabactin expression. This was corroborated by the fact that colibactin could be expressed in yersiniabactin-negative E. coli K-12 strain DH10B (23) and also by the comparison of the cytopathic effect of strain Nissle 1917 and its HPI-negative mutant since both strains induced the cytopathic effect characteristic for colibactin expression (data not shown).

Size of the VNTR of the colibactin island.

A DNA sequence comparison of the available genome sequences of E. coli strains 536 (accession no. NC_008253), UTI89 (NC_007946), CFT073 (NC_004431) and the colibactin island of E. coli isolate IHE3034 (accession no. AM229678), as well as sample sequencing of clbA of K. pneumoniae strain CF1, indicated that the colibactin biosynthesis and secretion determinant was generally highly conserved (>98% nucleotide sequence identity) in the different enterobacterial isolates. As an exception, a VNTR between clbB and clbR exhibited marked size variations in individual isolates (Fig. 1). DNA sequence analysis of this region in 99 different colibactin-positive extraintestinal pathogenic or commensal E. coli and Klebsiella isolates revealed that the repeat region comprises between 2 and 20 repeats of the octanucleotide sequence 5′-ACAGATAC-3′ and thus typically represents a VNTR (22). The most prevalent variants of the repeat region consist of 6 to 10 repeats. In the case of the C. koseri isolate ATCC BAA-895, 11 repeat units were found. VNTRs have been used as DNA markers for molecular typing of several bacterial species (11). A correlation between (i) the size of the region, (ii) even or odd numbers of repeats, (iii) relatedness of strains (as determined by MLST), (iv) the pathotype, and (v) colibactin activity could not be detected thus far (data not shown).
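Counting the octanucleotide repeats in a sequenced amplicon is a simple string operation. The following is a minimal sketch, not the procedure used in the study: the function name `longest_tandem_run` and the flanking bases in the example are made up for illustration, and only the repeat unit 5′-ACAGATAC-3′ is taken from the text.

```python
import re

REPEAT_UNIT = "ACAGATAC"  # octanucleotide repeat unit reported in the text


def longest_tandem_run(sequence: str, unit: str = REPEAT_UNIT) -> int:
    """Return the largest number of back-to-back copies of `unit` in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    runs = [len(m.group(0)) // len(unit) for m in pattern.finditer(sequence.upper())]
    return max(runs, default=0)


# Hypothetical read: invented flanking bases around 11 copies of the repeat unit.
example_read = "GGCTTA" + REPEAT_UNIT * 11 + "TTGACC"
print(longest_tandem_run(example_read))
```

The sketch only counts perfect copies on the given strand. A real analysis would also have to handle the reverse complement, sequencing errors, and imperfect repeat units.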

Sequence context of the tRNA-proximal junction of the colibactin determinant.

In ECOR group B2 E. coli, the colibactin gene cluster was found to be integrated at the asnW tRNA gene. To characterize the genetic context of the colibactin gene cluster in K. pneumoniae, a 3,303-bp PCR fragment covering the left junction was amplified in strain CF1 with the primers ClbA 1F and asnW-PAIleftend2 and subsequently sequenced (accession no. FJ899134). Database comparison of this sequence revealed a 3,243-bp fragment with 99% identity to the genome sequence of C. koseri ATCC BAA-895 (accession no. NC_009792). In this fragment, the E. coli IHE3034 1,841-bp region upstream of ORF2 comprising the asnW tRNA locus and the P4-like integrase-encoding gene intP4 (accession no. AM229678) was replaced in K. pneumoniae CF1 and C. koseri ATCC BAA-895 by a 1,174-bp DNA stretch without significant homology, followed by a 1,134-bp region with 83% identity to ORFs coding for the putative MobC and MobB proteins of pCRY of Yersinia pestis biovar Microtus strain 91001 (NC_005814) and pMET-1 of K. pneumoniae FC1 (EU383016), respectively (Fig. 3 and 4). The sequence context of the putative mobC (CKO_00879) and mobB (CKO_00880) genes in C. koseri BAA-895 (CKO_00917 to CKO_00879) exhibits 91 to 98% nucleotide identity to the asn tRNA gene-associated integrative and conjugative elements ICEEc1 (AY233333) and ICEKp1 (AB298504), respectively.

FIG. 3.
Association of an ICE-like DNA region with the colibactin determinant in different members of the family Enterobacteriaceae. The tRNA-proximal sequence context of the integrative element comprising the colibactin gene cluster in C. koseri ATCC strain ...
FIG. 4.
Genetic structure of the sequence context of the colibactin determinant in C. koseri strain ATCC BAA 895 and its comparison to other corresponding enterobacterial genome regions. Nucleotide sequence homology between different DNA regions is indicated ...

Detection of the ICE element and identification of its chromosomal insertion site in enterobacterial isolates.

Colibactin-positive enterobacterial isolates were screened by PCR with 102 primer pairs (see Table S1 in the supplemental material) for the presence of the 85-kb chromosomal segment of C. koseri BAA-895 that covers the ICE-like region located between the colibactin determinant and the asn tRNA gene (Fig. 3). Interestingly, the colibactin gene cluster present in K. pneumoniae, E. aerogenes, and ECOR-B1 E. coli isolates was part of an ICE similar to that of C. koseri BAA-895. The genetic organization of the ICE-like element containing the colibactin genes in C. koseri was structurally conserved in these isolates (Fig. 3). Certain regions including genes involved in DNA transfer and mobilization and yersiniabactin biosynthesis, as well as some hypothetical ORFs further upstream of the yersiniabactin determinant, exhibited minor DNA sequence variation (Fig. 3 and Table 2), as evidenced by the necessity to amplify these regions with different combinations of individual primers located further up- or downstream (Fig. 3). Additional PCR screenings with the primers fyuA-F, fyuA-R, virB1-F, virB1-R, HPI 3′-F, virB1-F inverse, orf3-R, and orf16-F (see Table S1 in the supplemental material) designed on the basis of the ICEKp1 nucleotide sequence further supported our results (data not shown). One region, located within the yersiniabactin determinant of K. pneumoniae Kp52145, differed in size from the C. koseri sequence because of a 331-bp deletion of the C. koseri chromosomal region from positions 875542 to 875873 that affected ORFs CKO_00887 to CKO_00889 coding putatively for hypothetical proteins. In K. pneumoniae SB3431 and CF44 and in E. aerogenes strain 50, a 100- to 200-bp insertion was detected at positions 910971 to 912006 of the C. koseri chromosome. Finally, a 1,330-bp insertion sequence (IS2), integrated at base 950500 of the reference C. koseri ATCC BAA-895 genome, was found to interrupt ORF CKO_00952 (Fig. 3 and Table 2).

TABLE 2.
Variable regions of ICECk1 in K. pneumoniae, E. aerogenes, and E. coli

Inverse PCR and direct primer walking on the chromosome were performed to identify the ICE chromosomal insertion site in all of the colibactin-positive E. coli and K. pneumoniae isolates. Whereas the colibactin island has been exclusively detected in the asnW tRNA locus in ExPEC and commensal E. coli, the ICE-like region containing the colibactin genes was inserted at the asnU locus in the three E. coli strains of ECOR group B1. In K. pneumoniae, the integrative element was inserted into different asn tRNA genes: in asn-1 for strains CF1 and 41, in asn-2 for strain Kp52145, and in asn-3 for strain SB3431. In K. pneumoniae CF44 the chromosomal insertion site of the ICE-like element differs from that found in the other K. pneumoniae isolates but is not adjacent to the asn-4 tRNA gene.

Comparative analysis of the ICE-like element in C. koseri and other related enterobacterial sequences.

In C. koseri, the putative ICE comprising the colibactin determinant was chromosomally located in close vicinity to asparagine (asn) tRNA genes (between CKO_0838 and CKO_0953) and was flanked by 17-bp direct repeats. One 17-bp direct repeat sequence was found upstream of the integrase gene (CKO_00917) of the yersiniabactin determinant, and another direct repeat was downstream of CKO_0855 at the right end of the colibactin gene cluster.
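Flanking direct repeats of this kind can be located by indexing every fixed-length window of a sequence and reporting the windows that occur more than once. The sketch below illustrates that idea under stated assumptions: the function name `find_direct_repeats` and the toy sequence are hypothetical, and only the 17-bp repeat length comes from the text. It is not the method used in the study.

```python
from collections import defaultdict


def find_direct_repeats(sequence: str, length: int = 17) -> dict[str, list[int]]:
    """Map each `length`-mer that occurs more than once to its 0-based start positions."""
    positions: dict[str, list[int]] = defaultdict(list)
    seq = sequence.upper()
    for start in range(len(seq) - length + 1):
        positions[seq[start:start + length]].append(start)
    return {kmer: starts for kmer, starts in positions.items() if len(starts) > 1}


# Toy example: an invented 17-bp repeat flanking an invented 16-bp "insert".
toy_repeat = "ATGCCGTTAGCAGGTCA"
toy_sequence = "GGA" + toy_repeat + "TTTTGGGGAAAACCCC" + toy_repeat + "GGG"
print(find_direct_repeats(toy_sequence).get(toy_repeat))
```

Only exact repeats on the same strand are reported. Degenerate repeats or inverted repeats would need an alignment-based search instead.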

The yersiniabactin gene cluster located in this ICE-like element showed 98% DNA sequence identity to its counterparts in ICEEc1 and ICEKp1. In the latter two elements, this siderophore determinant is followed by DNA sequences (regions I and II of ICEEc1 or segments 3′-1 and 3′-2 of ICEKp1, respectively) that are involved in conjugative transfer and mating-pair formation, as well as by a third variable segment (region III of ICEEc1 or segment 3′-3 of ICEKp1) that comprises hypothetical genes (20, 33).

Overall, the right-hand region of the yersiniabactin gene cluster of C. koseri ATCC BAA-895 exhibited a similar structural organization of the segments 3′-1 (virB1-virB11), 3′-2 (mobB, mobC, and a putative origin of transfer [oriT] located upstream of mobB at positions 872973 to 872893 on the C. koseri chromosome), and 3′-3 (colibactin gene cluster) (Fig. 4). On the other hand, comparative analysis of the different ICEs revealed a modular organization with homologous regions (the yersiniabactin gene cluster and the genes involved in mating-pair formation and DNA mobilization) separated by variable DNA stretches: in ICEKp1, the yersiniabactin determinant is followed by a so-called “middle region” that is similar to a portion of the large virulence plasmid pLVPK of K. pneumoniae CG43 (NC_005249). In C. koseri, however, fragments of this middle region lacking the virulence-associated genes vagC-vagD, iroN-iroB-iroC-iroD, and rmpA are located at both ends of the ICE-like element and exhibit 76 to 100% nucleotide identity to parts of plasmid pLVPK (Fig. 4). The conserved region required for mating-pair formation and DNA mobilization is similar to a region of the K. pneumoniae multiresistance plasmid pMET1 (87 to 93% identity), to a region of the Y. pestis plasmid pCRY (88 to 91%), and to the Enterobacter sakazakii plasmid pESA2 (77 to 79%). In C. koseri strain ATCC BAA-895 and in the colibactin-positive K. pneumoniae, E. aerogenes, and E. coli ECOR-B1 isolates, this region is then followed by the colibactin determinant, whereas other variable DNA sequences can be found at the right-hand end of this region in ICEKp1 and ICEEc1 (Fig. 4).


In this study, we report on the prevalence, genetic structure, and sequence context of the colibactin island in Enterobacteriaceae. We show that this polyketide determinant is present on the chromosome of various coliform enterobacterial species such as E. coli, C. koseri, K. pneumoniae, and E. aerogenes and that its presence is associated with that of the yersiniabactin gene cluster, which also encodes a polyketide (28). The distribution of the colibactin determinant in Enterobacteriaceae resembles that of the HPI coding for yersiniabactin as described previously (3, 31) and thus further corroborates our observation that the colibactin gene cluster is linked to the yersiniabactin determinant. In analogy to the HPI, the colibactin genes seem to be more widely distributed among certain lineages of E. coli than among other coliform enterobacteria: the colibactin genes have been exclusively detected in E. coli isolates of phylogenetic lineage ECOR-B2 thus far (23). This island was also found to be significantly associated with multiple other virulence gene clusters, including the HPI among ExPEC, as well as with high virulence potential (18, 19). Furthermore, in E. coli the colibactin gene cluster, together with other ExPEC virulence genes, has been reported to be more frequently detected in mucosa-associated isolates of patients with colon cancer than in strains isolated from healthy individuals (5). The higher prevalence of colibactin genes in pathogenic isolates relative to commensal variants is probably generally true in enterobacteria since the other colibactin-positive enterobacterial strains described in the present study are also clinical pathogenic isolates. Whether colibactin expression contributes to an increased pathogenic potential remains to be elucidated. On the other hand, the colibactin determinant has been detected in probiotic E. coli strain Nissle 1917 (15) but not in probiotic E. coli strain O83:K24:H31 (Colinfant) (16). This observation indicates that colibactin expression is not a prerequisite for efficient intestinal colonization and the increased competitiveness that may be associated with the probiotic character of these strains.

Another striking similarity between the colibactin and yersiniabactin polyketide determinants is their localization within two different forms of genetic elements usually associated with asn tRNA genes. Thus far, these polyketide determinants have been described as part of individual genomic or pathogenicity islands, in which they are, together with a P4-like bacteriophage integrase gene, flanked by direct repeats (6, 23). Alternatively, both polyketide gene clusters together can be part of a genomic region with similarity to integrative and conjugative elements that comprises additional genes coding for a conjugative DNA transfer and mobilization system (7, 8). These integrative elements can transfer the yersiniabactin and probably also the colibactin determinants to new recipients, and their transfer activity and plasticity are thus crucial for the genetic diversity observed at asn tRNA loci in yersiniabactin- and colibactin-positive enterobacteria. The presence of a second yersiniabactin determinant (CKO_01243 to CKO_01253) located in a different genomic context ~243 kb away from the ICE-like element harboring the yersiniabactin and colibactin genes in the C. koseri strain BAA-895 genome also mirrors the considerable genome plasticity involved in the spread of this siderophore determinant. Interestingly, this copy of the yersiniabactin gene cluster lacks an integrase gene and the flanking 17-bp direct repeats.

Based on amino acid sequence comparison between integrases encoded by different integrative elements including genomic islands, integrons, ICEs, conjugative transposons and bacteriophages, it has been recently suggested that genomic islands form a distinct group of ancient genetic elements unrelated to the other integrative elements (4). The analysis of the integrase proteins CKO_0917 and CKO_0953 encoded by the integrative element in C. koseri comprising the colibactin determinant indicates that they are 95% identical to each other. In addition, they are highly similar to the integrases (IntG) of other archetypal genomic islands but not to those of well-described ICEs (IntC). Similarly, the asn tRNA gene-associated integrase gene of the HPI in ECOR-B2 E. coli isolates codes for an IntG-like integrase (4). The ICEEc1-encoded integrase has been shown to differ from other HPI-encoded integrases, and Boyd et al. hypothesized that an ICE used the same chromosomal insertion site as the HPI in this strain, thereby displacing the HPI integrase (4).

The similarity between the integrative element in C. koseri and other chromosomal integrative elements such as ICEEc1 or ICEKp1 (20, 33, 34), as well as their partial homology to regions of enterobacterial plasmids such as pCRY (35), pMET1 (34), pESA2 (CP000784), and pLVPK (12) involved in DNA transfer and mobilization, indicates that chromosomally and plasmid-encoded virulence and resistance-associated gene clusters can efficiently recombine. Our results also demonstrate that active genetic exchanges occur between different members of the Enterobacteriaceae (Fig. 4). It can be assumed that the HPI and the colibactin island, as they have been described in the archetypal ECOR-B2 strains, may have evolved from an integrative element by DNA rearrangements and subsequent loss of ICE regions between the colibactin and yersiniabactin determinants. This could be in line with a selective pressure exerted on the yersiniabactin and colibactin determinants, a finding indicative of a role in bacterial fitness.

ICEEc1 and ICEKp1 can excise themselves from the chromosome and, upon circularization and subsequent transfer into a suitable host, integrate into asn tRNA loci (20, 33). The colibactin determinant might spread among different enterobacteria by such a mechanism. Our finding that this integrative element is inserted at different asn tRNA genes in E. coli, K. pneumoniae, and E. aerogenes supports this hypothesis. These asn tRNA locus-associated genetic elements seem to represent vehicles of variable modular composition that can be transmitted among different members of the Enterobacteriaceae. This group of enterobacterial ICE-like genetic elements seems to be responsible for the dissemination of the yersiniabactin gene cluster and of other DNA regions colocalized with it, represented by the different region 3 (ICEEc1) and segment 3′-3 (ICEKp1) variants or by the colibactin determinant. Comparison of the average G+C content of the genomes of C. koseri ATCC BAA-895 (53% G+C), K. pneumoniae MGH 78578 (57% G+C), and E. coli (50% G+C) with that of the colibactin gene cluster described here (53% G+C) suggests that this polyketide determinant may originate from C. koseri. It is, nevertheless, still unclear from which source C. koseri acquired the colibactin gene cluster.
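The G+C comparison in the preceding paragraph can be sketched in a few lines of Python. The percentages are the values quoted in the text; the helper `gc_percent` and the dictionary layout are illustrative assumptions, not part of the original analysis. A matching G+C content is only a suggestive indicator of origin, in keeping with the hedged conclusion above.

```python
def gc_percent(sequence: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T) of a DNA sequence."""
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A, C, G, or T bases")
    gc_count = sum(1 for b in bases if b in "GC")
    return 100.0 * gc_count / len(bases)


# Average genome G+C values (percent) as quoted in the text.
genome_gc = {
    "C. koseri": 53.0,
    "K. pneumoniae": 57.0,
    "E. coli": 50.0,
}
island_gc = 53.0  # colibactin gene cluster, as quoted in the text

# Absolute difference between each genome and the gene cluster.
differences = {name: abs(gc - island_gc) for name, gc in genome_gc.items()}
closest = min(differences, key=differences.get)

for name, diff in sorted(differences.items(), key=lambda item: item[1]):
    print(f"{name}: {diff:.0f} percentage points from the colibactin cluster")
print(f"Closest match: {closest}")
```

For an actual sequence, `gc_percent` would be applied to the nucleotide string of the gene cluster or genome in place of the quoted averages.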

Interestingly, the four asn tRNA genes in E. coli, and thus the yersiniabactin and colibactin gene clusters, are located within a “hot spot of phylogenetic incongruence” that is characterized by a high frequency of DNA insertion and recombination (37). It has also been recently suggested that the HPI has been propagated in E. coli by homologous recombination after a unique and recent chromosomal integration event (32). The correlation between the presence of the yersiniabactin and colibactin determinants, together with their genetic linkage, implies that the high conservation of both polyketide gene clusters in E. coli is a result of this unique and recent acquisition followed by their dissemination via homologous recombination. Although the origin of the colibactin determinant remains unknown, our findings demonstrate that coliform enterobacteria acquired this gene cluster recently together with the HPI and that mobile DNA elements such as plasmids and ICE-like elements are involved in its horizontal transfer.

Supplementary Material

[Supplemental material]


We thank B. Plaschke (Würzburg, Germany), M. Boury (Toulouse, France), and N. Charbonnel (Clermont-Ferrand, France) for excellent technical assistance. We thank L. Wieler (Berlin, Germany) for kindly providing the ECOR group B2 enteropathogenic E. coli strains IHIT0304, IHIT Z311-94, IHIT Z412-94, 549-00, and 312-00; L. Emődy (Pécs, Hungary) for providing 30 Proteus isolates; and H. Sahly (Kiel, Germany) for Klebsiella and Serratia isolates.

The Würzburg group was supported by the German Research Foundation (SFB479, TP A1). J.P. received a Ph.D. scholarship from the French-German Graduate College (Nice-Würzburg) funded by the German Research Foundation. This study was carried out with the support of the European Virtual Institute for Functional Genomics of Bacterial Pathogens (CEE LSHB-CT-2005-512061) and the ERA-NET project Deciphering the Intersection of Commensal and Extraintestinal Pathogenic E. coli.


Editor: A. J. Bäumler


Published ahead of print on 31 August 2009.

Supplemental material for this article may be found at


1. Ahmed, N., U. Dobrindt, J. Hacker, and S. E. Hasnain. 2008. Genomic fluidity and pathogenic bacteria: applications in diagnostics, epidemiology, and intervention. Nat. Rev. Microbiol. 6:387-394.
2. Altschul, S. F., T. L. Madden, A. A. Schaffer, J. Zhang, Z. Zhang, W. Miller, and D. J. Lipman. 1997. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res. 25:3389-3402.
3. Bach, S., A. de Almeida, and E. Carniel. 2000. The Yersinia high-pathogenicity island is present in different members of the family Enterobacteriaceae. FEMS Microbiol. Lett. 183:289-294.
4. Boyd, E. F., S. Almagro-Moreno, and M. A. Parent. 2009. Genomic islands are dynamic, ancient integrative elements in bacterial evolution. Trends Microbiol. 17:47-53.
5. Bronowski, C., S. L. Smith, K. Yokota, J. E. Corkill, H. M. Martin, B. J. Campbell, J. M. Rhodes, C. A. Hart, and C. Winstanley. 2008. A subset of mucosa-associated Escherichia coli isolates from patients with colon cancer, but not Crohn's disease, share pathogenicity islands with urinary pathogenic E. coli. Microbiology 154:571-583.
6. Buchrieser, C., R. Brosch, S. Bach, A. Guiyoule, and E. Carniel. 1998. The high-pathogenicity island of Yersinia pseudotuberculosis can be inserted into any of the three chromosomal asn tRNA genes. Mol. Microbiol. 30:965-978.
7. Burrus, V., J. Marrero, and M. K. Waldor. 2006. The current ICE age: biology and evolution of SXT-related integrating conjugative elements. Plasmid 55:173-183.
8. Burrus, V., G. Pavlovic, B. Decaris, and G. Guedon. 2002. Conjugative transposons: the tip of the iceberg. Mol. Microbiol. 46:601-610.
9. Carniel, E. 2001. The Yersinia high-pathogenicity island: an iron-uptake island. Microbes Infect. 3:561-569.
10. Carver, T. J., K. M. Rutherford, M. Berriman, M. A. Rajandream, B. G. Barrell, and J. Parkhill. 2005. ACT: the Artemis Comparison Tool. Bioinformatics 21:3422-3423.
11. Chang, C. H., Y. C. Chang, A. Underwood, C. S. Chiou, and C. Y. Kao. 2007. VNTRDB: a bacterial variable number tandem repeat locus database. Nucleic Acids Res. 35:D416-D421.
12. Chen, Y. T., H. Y. Chang, Y. C. Lai, C. C. Pan, S. F. Tsai, and H. L. Peng. 2004. Sequencing and analysis of the large virulence plasmid pLVPK of Klebsiella pneumoniae CG43. Gene 337:189-198.
13. Crosa, J. H., and C. T. Walsh. 2002. Genetics and assembly line enzymology of siderophore biosynthesis in bacteria. Microbiol. Mol. Biol. Rev. 66:223-249.
14. Dobrindt, U., B. Hochhut, U. Hentschel, and J. Hacker. 2004. Genomic islands in pathogenic and environmental microorganisms. Nat. Rev. Microbiol. 2:414-424.
15. Grozdanov, L., C. Raasch, J. Schulze, U. Sonnenborn, G. Gottschalk, J. Hacker, and U. Dobrindt. 2004. Analysis of the genome structure of the nonpathogenic probiotic Escherichia coli strain Nissle 1917. J. Bacteriol. 186:5432-5441.
16. Hejnova, J., U. Dobrindt, R. Nemcova, C. Rusniok, A. Bomba, L. Frangeul, J. Hacker, P. Glaser, P. Sebo, and C. Buchrieser. 2005. Characterization of the flexible genome complement of the commensal Escherichia coli strain A0 34/86 (O83:K24:H31). Microbiology 151:385-398.
17. Homburg, S., E. Oswald, J. Hacker, and U. Dobrindt. 2007. Expression analysis of the colibactin gene cluster coding for a novel polyketide in Escherichia coli. FEMS Microbiol. Lett. 275:255-262.
18. Johnson, J. R., B. Johnston, M. A. Kuskowski, J. P. Nougayrède, and E. Oswald. 2008. Molecular epidemiology and phylogenetic distribution of the Escherichia coli pks genomic island. J. Clin. Microbiol. 46:3906-3911.
19. Le Gall, T., O. Clermont, S. Gouriou, B. Picard, X. Nassif, E. Denamur, and O. Tenaillon. 2007. Extraintestinal virulence is a coincidental by-product of commensalism in B2 phylogenetic group Escherichia coli strains. Mol. Biol. Evol. 24:2373-2384.
20. Lin, T. L., C. Z. Lee, P. F. Hsieh, S. F. Tsai, and J. T. Wang. 2008. Characterization of integrative and conjugative element ICEKp1-associated genomic heterogeneity in a Klebsiella pneumoniae strain isolated from a primary liver abscess. J. Bacteriol. 190:515-526.
21. Moran, N. A. 2002. Microbial minimalism: genome reduction in bacterial pathogens. Cell 108:583-586.
22. Nakamura, Y., M. Leppert, P. O'Connell, R. Wolff, T. Holm, M. Culver, C. Martin, E. Fujimoto, M. Hoff, E. Kumlin, et al. 1987. Variable number of tandem repeat (VNTR) markers for human gene mapping. Science 235:1616-1622.
23. Nougayrède, J. P., S. Homburg, F. Taieb, M. Boury, E. Brzuszkiewicz, G. Gottschalk, C. Buchrieser, J. Hacker, U. Dobrindt, and E. Oswald. 2006. Escherichia coli induces DNA double-strand breaks in eukaryotic cells. Science 313:848-851.
24. Ochman, H., F. J. Ayala, and D. L. Hartl. 1993. Use of polymerase chain reaction to amplify segments outside boundaries of known sequences. Methods Enzymol. 218:309-321.
25. Ochman, H., J. G. Lawrence, and E. A. Groisman. 2000. Lateral gene transfer and the nature of bacterial innovation. Nature 405:299-304.
26. Oelschlaeger, T. A., D. Zhang, S. Schubert, E. Carniel, W. Rabsch, H. Karch, and J. Hacker. 2003. The high-pathogenicity island is absent in human pathogens of Salmonella enterica subspecies I but present in isolates of subspecies III and VI. J. Bacteriol. 185:1107-1111.
27. Oswald, E., J. P. Nougayrède, F. Taieb, and M. Sugai. 2005. Bacterial toxins that modulate host cell cycle progression. Curr. Opin. Microbiol. 8:83-91.
28. Pelludat, C., A. Rakin, C. A. Jacobi, S. Schubert, and J. Heesemann. 1998. The yersiniabactin biosynthetic gene cluster of Yersinia enterocolitica: organization and siderophore-dependent regulation. J. Bacteriol. 180:538-546.
29. Pfeifer, B. A., C. C. Wang, C. T. Walsh, and C. Khosla. 2003. Biosynthesis of yersiniabactin, a complex polyketide-nonribosomal peptide, using Escherichia coli as a heterologous host. Appl. Environ. Microbiol. 69:6698-6702.
30. Rutherford, K., J. Parkhill, J. Crook, T. Horsnell, P. Rice, M. A. Rajandream, and B. Barrell. 2000. Artemis: sequence visualization and annotation. Bioinformatics 16:944-945.
31. Schubert, S., S. Cuenca, D. Fischer, and J. Heesemann. 2000. High-pathogenicity island of Yersinia pestis in Enterobacteriaceae isolated from blood cultures and urine samples: prevalence and functional expression. J. Infect. Dis. 182:1268-1271.
32. Schubert, S., P. Darlu, O. Clermont, A. Wieser, G. Magistro, C. Hoffmann, K. Weinert, O. Tenaillon, I. Matic, and E. Denamur. 2009. Role of intraspecies recombination in the spread of pathogenicity islands within the Escherichia coli species. PLoS Pathog. 5:e1000257.
33. Schubert, S., S. Dufke, J. Sorsa, and J. Heesemann. 2004. A novel integrative and conjugative element (ICE) of Escherichia coli: the putative progenitor of the Yersinia high-pathogenicity island. Mol. Microbiol. 51:837-848.
34. Soler Bistué, A. J., D. Birshan, A. P. Tomaras, M. Dandekar, T. Tran, J. Newmark, D. Bui, N. Gupta, K. Hernandez, R. Sarno, A. Zorreguieta, L. A. Actis, and M. E. Tolmasky. 2008. Klebsiella pneumoniae multiresistance plasmid pMET1: similarity with the Yersinia pestis plasmid pCRY and integrative conjugative elements. PLoS ONE 3:e1800.
35. Song, Y., Z. Tong, J. Wang, L. Wang, Z. Guo, Y. Han, J. Zhang, D. Pei, D. Zhou, H. Qin, X. Pang, Y. Han, J. Zhai, M. Li, B. Cui, Z. Qi, L. Jin, R. Dai, F. Chen, S. Li, C. Ye, Z. Du, W. Lin, J. Wang, J. Yu, H. Yang, J. Wang, P. Huang, and R. Yang. 2004. Complete genome sequence of Yersinia pestis strain 91001, an isolate avirulent to humans. DNA Res. 11:179-197.
36. Taieb, F., J. P. Nougayrède, C. Watrin, A. Samba-Louaka, and E. Oswald. 2006. Escherichia coli cyclomodulin Cif induces G2 arrest of the host cell cycle without activation of the DNA-damage checkpoint-signaling pathway. Cell Microbiol. 8:1910-1921.
37. Touchon, M., C. Hoede, O. Tenaillon, V. Barbe, S. Baeriswyl, P. Bidet, E. Bingen, S. Bonacorsi, C. Bouchier, O. Bouvet, A. Calteau, H. Chiapello, O. Clermont, S. Cruveiller, A. Danchin, M. Diard, C. Dossat, M. E. Karoui, E. Frapy, L. Garry, J. M. Ghigo, A. M. Gilles, J. Johnson, C. Le Bouguenec, M. Lescat, S. Mangenot, V. Martinez-Jehanne, I. Matic, X. Nassif, S. Oztas, M. A. Petit, C. Pichon, Z. Rouy, C. S. Ruf, D. Schneider, J. Tourret, B. Vacherie, D. Vallenet, C. Medigue, E. P. Rocha, and E. Denamur. 2009. Organized genome dynamics in the Escherichia coli species results in highly diverse adaptive paths. PLoS Genet. 5:e1000344.
38. Tullus, K., B. Ayling-Smith, I. Kuhn, W. Rabsch, R. Reissbrodt, and L. G. Burman. 1992. Nationwide spread of Klebsiella oxytoca K55 in Swedish neonatal special care wards. APMIS 100:1008-1014.

Articles from Infection and Immunity are provided here courtesy of American Society for Microbiology (ASM)