General structure of pCoo.
The pCoo plasmid with a kanamycin resistance gene inserted into cooB was obtained from the ETEC-derived strain JEF100 (Table ) and its sequence determined. Analysis of the sequence of this 98,396-bp plasmid revealed 129 predicted CDSs, including the inserted Km gene (see Table S1 in the supplemental material for the locations and homologies of all CDSs of pCoo). Like many other virulence plasmids, pCoo appears to be a mosaic consisting of regions derived from multiple sources. DNA homologous to IS elements is scattered throughout the plasmid and comprises 24% of the total pCooKm DNA (Fig. ). There are 11 regions between 195 and 7,612 bp in length that contain intact and partial IS elements. Many of these regions contain fragments of several different IS elements, which suggests that they are not the result of single insertion events, although some recent simple insertions were identified.
FIG. 1. Circular representation of pCooKm. The concentric circles represent, from the outside in, the following: 1 + 2, coding sequences on the forward and reverse strands (red, plasmid functions [replication, partition, maintenance, etc.]; green, cell ...
The G+C content of the pCoo DNA sequences between these IS elements ranges from 35% to 57%. This wide range is consistent with pCoo having arisen by the amalgamation of DNA from a variety of sources. The pCoo sequences between the IS fragments are labeled with bold lowercase letters in Fig. , and they have G+C contents of 49%, 40%, 49%, 46%, 42%, 45%, 53%, 57%, 36%, 35%, and 36% for regions a to k, respectively. These regions are described below.
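As a minimal sketch of how such figures are obtained and summarized, the snippet below computes the G+C percentage of a sequence and finds the extremes among the per-region values quoted above. The function names and the `REGION_GC` table are illustrative assumptions, not part of the original analysis.

```python
def gc_percent(seq: str) -> float:
    """Return the G+C content of a DNA sequence as a percentage."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in "GC")
    return 100.0 * gc / len(seq)


# Per-region G+C values (percent) as quoted in the text for regions a to k.
REGION_GC = dict(zip("abcdefghijk", [49, 40, 49, 46, 42, 45, 53, 57, 36, 35, 36]))


def gc_extremes(region_gc):
    """Return ((region, value) of the lowest, (region, value) of the highest)."""
    lowest = min(region_gc, key=region_gc.get)
    highest = max(region_gc, key=region_gc.get)
    return (lowest, region_gc[lowest]), (highest, region_gc[highest])
```

With the quoted values, the lowest region is j (35%) and the highest is h (57%).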
A large contiguous part of pCoo region a (bp 83057 through 98396 and 1 through 2250) is 95% identical at the DNA level to a contiguous region of plasmid R64 (bp 105487 through 120826; accession no. AP005147). Part of region a of pCoo (pCoo bp 72009 through 78749) is similar to the closely related plasmids R64 and ColIb-P9 (accession no. AP005147 and NC_002122, respectively). Although these plasmids are in the same incompatibility group, there are DNA segments that differ in sequence between them. This part of pCoo appears to be a hybrid between these two plasmids because it has one segment (bp 72009 through 72759) similar to a unique region in ColIb-P9 and another (bp 75370 through 77265) similar to a unique region in R64. Within the part similar to both R64 and ColIb-P9, there are two genes with no similarities to genes of known function. The rest of the CDSs in this section have homologies to genes of known function and will be discussed later.
A homologue of the AraC family regulatory gene csvR (accession no. X60106) is located in a part of pCoo (bp 65421 through 69147; spanning regions i, the IS element between them, and portions of the flanking IS elements) that is 96% identical to a section of a plasmid from an ETEC strain of serotype O167 H5 (6). Although CsvR is 75% identical to Rns, the positive regulator for the coo operon present on another plasmid in the same ETEC strain, the pCoo copy of CsvR contains a deletion leading to a frameshift, so it is not expected to be expressed.
A number of conserved hypothetical genes in pCoo show similarity to genes from several other different gram-negative chromosomes and plasmids, contributing to the impression of a mosaic origin for pCoo.
Potential virulence genes encoded on pCoo.
In addition to the four coo genes (Fig. , region b) required for CS1 pilus production, several predicted gene products of pCoo resemble proteins implicated in pathogenesis in other bacteria, and these may be important for the virulence of an ETEC strain containing pCoo.
A segment of pCoo (Fig. , between bp 32036 and 36029) is 95% identical to the shf operon of the Shigella flexneri virulence plasmid pINV (accession no. AY206446). The last two proteins of the pCoo shf locus are 97% identical to their counterparts in pINV of Shigella flexneri, and these genes have been implicated in virulence. The first of these proteins, VirK (pCoo045), mediates intracellular spreading of Shigella flexneri by regulating the expression of VirG (IcsA), an outer membrane protein, and the last protein encoded in this locus, MsbB2 (pCoo044), is required for the full virulence of a septicemic E. coli.
The protein encoded by pCoo056 (Fig. , region e) is 94% identical to the secreted autotransporter EatA of an ETEC strain that produces CFA/I pili, another member of the CS1 pilus family. EatA is a serine protease that increases fluid accumulation in a rabbit ileal loop model of ETEC infection (25). PCR analysis of 41 human clinical ETEC isolates showed that about 61% contain the eatA gene, suggesting that it may be important in human disease as well. Directly downstream of the pCoo eatA gene and transcribed convergently is a member of the AraC family of transcription regulators (Fig. , region e) (pCoo055), which, because of its proximity, might regulate the expression of eatA. The sequence of the corresponding region in the CFA/I plasmid is not yet available.
A second AraC family member is pCoo089, which is 70% identical to cafR, the regulator of the Yersinia pestis operon that encodes the f1 nonfimbrial adhesin (accession no. P26950). This protein is encoded in region k (Fig. ). If it is expressed from pCoo, it might regulate other virulence factors on plasmids and/or the host chromosome.
Several CDSs of pCoo are homologous to genes whose functions do not suggest roles in pathogenesis. A segment homologous to the conjugative transfer region of the IncFV plasmid pED208 (TraM, TraJ, TraA, TraL, TraE, and a TraK/X fusion [pCoo28-33] are 65%, 27%, 83%, 86%, 92%, and 85%/73% identical, respectively; accession no. AF411480) is found at region d of pCoo. In pED208, there are 20 transfer genes between TraK and TraX which are absent from pCoo, suggesting that the fusion TraK/X protein was formed as a result of a deletion of a large portion of this transfer region.
Surprisingly, pCoo region d encodes a protein 98% identical to the E. coli peptide deformylase (pCoo040; accession no. P71251), which is essential for bacterial protein maturation. To our knowledge, this is the first report of this essential enzyme being encoded on a plasmid, although homologues of other essential chromosomal genes, often involved in DNA metabolism, have been found on other plasmids.
Two other predicted pCoo proteins (pCoo071 and pCoo072, region h) have homology to R100 proteins thought to contribute to the conjugative processing of DNA (accession no. P18148). Both of these are only partial genes and probably represent remnants of genes from other plasmids since pCoo is not conjugative (9).
Plasmid pCoo has two functional origins of replication.
In addition to the functional replication region homologous to the conjugative IncI1 plasmid R64 (9), pCoo contains a second replication region (pCoo bp 27285 through 29452) (Fig. ) that is closely related to that of the IncFII plasmid R100. This region contains all of the gene products and cis-acting sites required for the replication of R100, including the replication initiation protein RepA1 (pCoo036). This contiguous region of pCoo is organized identically to that of R100. The expression of repA1 is positively regulated by RepA6 (pCoo35a) and negatively regulated by RepA2 and by the antisense RNA, RNA-I (7). The incompatibility determinant, RNA-I, of pCoo is 90% identical to that of R100. The RepA1 and RepA6 proteins of pCoo are 99% identical to the R100 RepA1 and RepA6 proteins, and pCoo RepA2 (pCoo0035) is 71% identical and 90% similar to the RepA2 of R100 (accession no. AP000342). The pCoo origin is 99% identical to the origin region, oriR, of R100. There are additional cis-acting sites downstream of the origin in R100 that are important for stable inheritance of the plasmid, and these are also present in pCoo (16). These homologies strongly suggest that the R100 origin in pCoo is functional (see below).
It seems likely that pCoo is a cointegrate formed by the recombination of two independent replicons because the two pCoo origins are separated by long (1,953-bp) direct repeats composed of recent IS100 insertions. Analysis of the patterns of target site duplications, and of the genes into which the IS100s have inserted, suggests that they transposed into separate genes and subsequently recombined (Fig. , direct repeat I [DRI] and DRII). Recombination to resolve the cointegrate is expected to be frequent because of the length of the repeated region. To detect the presence of the independent replicons that would be generated by recombination, we performed PCR analysis using primers that flank the direct repeats (Fig. ). Primers 1 and 2, which flank DRII, and primers 3 and 4, which flank DRI, will amplify PCR products from unresolved intact pCoo but not from the two independent plasmids (Fig. ). Primer pairs 2 and 3 and 1 and 4 will amplify PCR products from the separate replicons, R64 and R100, respectively, but not from the pCoo cointegrate. By using total plasmid DNA from the pCooKm-containing strain JEF100 as template, PCR products of the expected sizes were obtained with all four primer pairs (1 + 2, 3 + 4, 3 + 2, and 1 + 4) (data not shown). This demonstrates that the pCoo cointegrate plasmid coexists with the two separate replicons in a single culture.
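The logic of this junction PCR can be written as a small lookup. The sketch below is only an illustration of the reasoning in the text: the table `JUNCTIONS` and the function `molecules_detected` are hypothetical names, and product sizes are not modeled.

```python
# Primer pair -> (molecule carrying that junction, repeat copy flanked),
# following the description in the text. Pairs are stored in sorted order.
JUNCTIONS = {
    (1, 2): ("cointegrate", "DRII"),
    (3, 4): ("cointegrate", "DRI"),
    (2, 3): ("R64 replicon", "resolved repeat"),
    (1, 4): ("R100-like replicon", "resolved repeat"),
}


def molecules_detected(positive_pairs):
    """Return the set of molecules implied by the primer pairs that gave a product."""
    normalized = {tuple(sorted(pair)) for pair in positive_pairs}
    unknown = normalized - set(JUNCTIONS)
    if unknown:
        raise ValueError(f"unrecognized primer pairs: {sorted(unknown)}")
    return {JUNCTIONS[pair][0] for pair in normalized}
```

Passing all four pairs, as observed for JEF100, returns the cointegrate together with both separate replicons; passing only `(1, 4)` returns the R100-like replicon alone.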
FIG. 2. DNA region containing the direct repeats of whole pCoo and repeats after resolution of the pCoo cointegrate. The repeats (arrows) in whole pCoo are labeled DRI and DRII. The repeat regions contained within the products of recombination are labeled pCoo ...
We have shown previously that the R64 origin is functional (9); however, it is possible that the R100 origin is not functional. The R100-related small plasmid may be continuously generated from the cointegrate replicon by recombination. Assuming that the pCoo origins have low copy numbers like both R100 and R64, a nonreplicating R100 plasmid would not be expected to reside in many cells containing the cointegrate because of incompatibility with the cointegrate. However, a nonreplicative R100 might be present transiently. To determine whether the R100-related plasmid is a functional replicon, we used incompatibility to isolate colonies that had lost the R64 portion of pCoo (9) and asked whether an R100 replicon was still present.
A culture of strain EU2574, which carries both R64Tc and pCooKm (the Km marker is in cooB, which is in the R64 part of the large plasmid) (Fig. ), was grown for 24 generations without antibiotic selection, and colonies were scored for Km and Tc. We found that all 643 of the EU2574 colonies scored still contained the Tc marker of R64. In 24 generations, 80% of the colonies had lost the Km marker, indicating a loss of both pCooKm and its smaller R64Km derivative. If the R100-like plasmid could replicate independently, cells that had lost pCooKm should still contain DNA from the R100-like replicon. Plasmid DNA was isolated from 7 Kms colonies, and the presence of the R100-like replicon was assessed by PCR using primers 1 and 4, which flank the repeat in the R100-like replicon (Fig. , pCoo R100 DR). A PCR product of the expected size was amplified from all of the colonies tested, demonstrating the presence of the R100-like replicon (data not shown). As expected, fragment G within the R64 part of pCoo (Fig. ) was not amplified from these Kms colonies. Thus, the R100-like replicon is stable in the absence of the R64 replicon of pCoo, indicating that the R100-like origin, like the R64 origin, is functional. It remains formally possible that the R100 (and R64) resolution products undergo chromosomal integration in such a way that the recombinant junctions are intact, since this could not be detected by our PCR analysis.
Since the incompatibility determinant, RNA-I, of the R100-like part of pCoo is only 90% identical to that of R100, it seemed possible that the two plasmids might be compatible. We showed above that the pCoo cointegrate plasmid resolves to form the two separate replicons, and since introduction of an IncI1 plasmid leads to loss of the R64 replicon of pCoo (9), we applied the same test for R100 incompatibility. We investigated whether the introduction of R100 (Tcr) into a strain carrying pCooKm (Table , JEF100) leads to loss of the R100-like part of the pCoo cointegrate. After 25 generations in the absence of selection, 300 colonies were found to grow on plates containing both Km and Tc, indicating that both the R64 replicon of pCooKm (Kmr) and R100 (Tcr) were present. Plasmid DNA was isolated from 35 of these colonies, and the presence of the R100-like replicon of pCooKm was assessed by PCR, using primers within the eatA gene and the araC homolog directly downstream of eatA (Fig. ). A PCR product of the expected size was amplified from all colonies tested (data not shown). Using the same primers, no PCR product was amplified from plasmid isolated from the negative control (Table , K1250/R100). These results show that the R100-like pCoo replicon can coexist with R100 in the absence of selection, indicating that the two belong to different incompatibility groups.
Plasmid stability regions of pCoo.
Since both replicons of pCoo can function independently, they might be expected to be lost independently in the absence of known selection. The strain analyzed was selected for production of the CS1 pili encoded by the R64 homologue, but there was no obvious selection for the R100 homologue. Thus, the continued presence of both replicons suggests that they are both inherited very stably. Several different mechanisms are used by plasmids to ensure stable and equal distribution into daughter cells following replication (for a review, see reference 38). More than one of these mechanisms is often present in the same plasmid. Often plasmid replication results in the production of multimers that would be segregated as one molecule into the same daughter cell at division unless they were resolved. Thus, production by the plasmid of a site-specific recombinase to resolve these multimers into monomers is an important stability mechanism. A second mechanism is an active centromere-like partition process (encoded by par genes), which precisely distributes copies of the plasmid to each daughter cell. Third, and perhaps most effective (4), is the postsegregation killing or plasmid addiction mechanism. In the last case, plasmid-free segregants are killed because of the production of a plasmid-encoded stable toxin that is counteracted in plasmid-containing cells by the production of an unstable antidote.
The plasmid stability region in the R64 portion of pCoo (Fig. , region a) contains genes for the first two types of stability mechanisms: site-specific recombination and active partition (9). The resolvase ResD (pCoo097; 91% identical to the Rsv protein required for maintenance of the F plasmid), ParA (pCoo98; 61% identical to a member of the ParA family from Xanthomonas citri), StbA (pCoo100; 66% identical to StbA of EPEC plasmid pB171), and StbB (pCoo101; 46% identical to StbB of pB171) are all likely to contribute to the stable inheritance of pCoo and the pCoo R64 replicon.
The R100 part of pCoo also has genes homologous to those that encode all three stability mechanisms described above. This region encodes two postsegregation killing systems: ccdA/B (Fig. , region e) and hok/sok/mok (Fig. , region h). CcdB (pCoo051) is the cytotoxic protein that kills plasmid-free segregants, and CcdA (pCoo050) is the antitoxin protein that protects the plasmid-containing cells (13). In plasmid pCoo, CcdA and -B are 100% identical to the CcdA and -B proteins found in E. coli plasmid pB171 (accession no. BAA84907, respectively [35]). The pCoo Hok protein (pCoo075), which is the stable toxin that kills plasmidless cells, is predicted to be 74% identical to Hok of R100, and pCoo Mok (pCoo074), the modulator of Hok, is 51% identical to Mok of R100 (accession no. AP000342).
In addition to the postsegregation killing systems, the R100 half of pCoo encodes two proteins homologous to those of an active partitioning system. ParA (pCoo064; region f) is 87% identical to ParA from Citrobacter freundii (accession no. AF550415) and 53% identical to StbA from E. coli plasmid R100 (accession no. AP000342). ParB (pCoo065; region f) is 75% identical to ParB of Citrobacter freundii (accession no. AF550415) and 39% identical to StbB of E. coli plasmid R100 (accession no. AP000342).
Both replicons of pCoo also encode putative resolvases. The R100 part encodes two resolvase homologues. One, Res (pCoo053; region e), is 94% identical to a Proteus vulgaris resolvase (accession no. AP004237), and the second, ResD (pCoo052; region e), is 96% identical to the Rsv protein of the F plasmid (8).
There is also a ResD homologue in the R64 part of pCoo (pCoo097; region a). The two ResD proteins are oriented in the same direction, and a 400-bp segment of the two CDSs is 95% identical. These direct repeats, like the repeats between the two replicons, could serve as a substrate for recombination. However, in this case, one recombinant would contain a replication origin and the other would not. That pCoo has not lost the region between the two ResD homologues is probably due to the presence of the mok/hok postsegregation killing system on the nonreplicating recombinant circle. The presence of the stable Hok protein in the culture would ensure the killing of any cell lacking this region. Because so many plasmid stability genes are present, it is not surprising that both of the smaller replicons, as well as the pCoo cointegrate, are stably maintained.
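The outcome of recombination between two direct repeats on a circular molecule can be illustrated with a toy model. The sketch below is a simplification with entirely hypothetical coordinates and a single origin; it is not based on the actual positions of the ResD homologues, and `resolve` is an illustrative name.

```python
def resolve(plasmid_len, cut1, cut2, features):
    """Split a circular molecule at two direct-repeat positions.

    Recombination between direct repeats at cut1 < cut2 yields one circle
    carrying the interval [cut1, cut2) and a second circle carrying the rest.
    `features` maps a feature name to a single coordinate on the parent circle.
    Returns (features on the inner circle, features on the outer circle).
    """
    if not 0 <= cut1 < cut2 < plasmid_len:
        raise ValueError("expected 0 <= cut1 < cut2 < plasmid_len")
    inner = sorted(n for n, pos in features.items() if cut1 <= pos < cut2)
    outer = sorted(n for n, pos in features.items() if not (cut1 <= pos < cut2))
    return inner, outer


# Hypothetical toy coordinates on a 100-unit circle (NOT pCoo positions).
toy_features = {"ori": 10, "hok/mok": 60}
inner_circle, outer_circle = resolve(100, 50, 80, toy_features)
```

In this toy case the inner circle carries hok/mok but no origin, which mirrors the argument above: a cell that keeps only the replicating circle loses the region encoding the toxin system and would be killed by the stable Hok protein.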
Structure of pCoo plasmids from other CS1-expressing ETEC strains.
Although both parts of the pCoo cointegrate are likely to be stably inherited both because of stability functions and possibly because of genes providing a selective advantage, the advantage to maintaining them on one large cointegrate plasmid is not obvious. Therefore, one might expect to find some ETEC strains that have only the two smaller plasmids and have lost the direct repeat region required for recombination, leading to formation of the cointegrate.
To learn about the presence and configuration of the pCoo-related plasmids of other ETEC strains, we used PCR to probe unrelated clinical ETEC isolates that produce CS1 pili (Table ). With the exception of strains 294, 295, and 296, which had not been studied as completely, these strains were chosen for the study because they differed in one or more known genetic traits, like serotype, drug resistance, or type of pilus (in addition to CS1) produced. Primer pairs (Table S2) were selected to amplify the pCoo regions shown as uppercase letters in Fig. . As expected from the presence of CS1 pili on their surfaces, region E, containing cooA, and region A, which is specific to pCoo and within the R64 replicon, were amplified from all of the strains (data summarized in Table ).
PCR analysis of plasmid DNA isolated from CS1-positive ETEC strains
At least a part of the region encoding the R64 thin pilus (Fig. ; Table , regions B, C, and D) was amplified from all but two of the strains (TW03923 and TW03677). From both of these, a product was obtained using primers within the R64 origin region (E), which is just upstream of the R64 thin pilus region. This suggests that most, but not all, of the R64 homology found in pCoo is absent from these strains. Deletions of parts of the thin pilus operon in some of the ETEC strains, suggested by the inability to amplify regions B, C, and/or D, indicate that this operon may not be essential for ETEC infection. This is supported by the nonfunctionality of this system in pCoo.
All of the ETEC strains have the R100-like origin region (Table , amplified region I), and primers within eatA (region J), which is also in the R100 part of pCoo, generated a PCR product from all but one of the strains (TW03677). Thus, all of the strains have some regions from both the R64 and R100 halves of pCoo.
To determine whether the cointegrate structure found in pCoo of C921b-1 is retained in the plasmids of the other strains, we used primers flanking the direct repeats (DRI and DRII) together with primers within the repeat. If DRI is present, amplified fragments G and H will be obtained, and if DRII is present, K and L should be seen. The presence or absence of DRI and DRII divides these strains into two classes.
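The scoring rule just described can be sketched as follows. This is a simplified illustration: `classify_strain` is a hypothetical helper, the fragment letters follow the text, and the check covers only presence or absence of fragments, not repeats of altered size.

```python
def classify_strain(amplified_fragments):
    """Assign a strain to class A or B from the set of amplified fragments.

    DRI is scored as present when both G and H amplify; DRII is scored as
    present when both K and L amplify. Class A has both repeats intact;
    every other pattern is treated here as class B.
    """
    fragments = set(amplified_fragments)
    dri_present = {"G", "H"} <= fragments
    drii_present = {"K", "L"} <= fragments
    return "A" if dri_present and drii_present else "B"
```

For example, a strain yielding fragments G, H, K, and L is scored as class A, whereas a strain lacking L is scored as class B.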
The largest class, A, with 10 members (Table ), contains both repeats and includes our reference strain, C921b-1. Where known, the serotype of all class A strains examined is O6. To determine whether the cointegrate plasmid is resolved during growth of these strains, we sought to amplify the expected junctions of the cointegrate and of the individual small plasmids from three of these strains (1392-75-2A, 60R70, and 296) as described above for C921b-1. As we had found for C921b-1, the two smaller replicons were present in the same culture as the whole cointegrate plasmid in all three strains examined (data not shown).
The other class (Table , class B) of pCoo-related plasmids in these ETEC strains has eight members, all of which lack parts of one or both of the repeats or have repeats of sizes different from those in pCoo from strain C921b-1. The members of class B have a variety of serotypes and are probably more distantly related than the O6 group of strains. In summary, we find that several segments of pCoo, such as region A of R64, some portion of the R64 thin pilus, the R64 and R100 origin regions, and eatA, are common to the plasmids in most of the ETEC strains of all serotypes. However, only from the ETEC strains of serotype O6 could we amplify fragments with all the pCoo primer pairs used to identify both DRI and DRII. Thus, it appears that the pCoo cointegrate structure may be characteristic of CS1-producing strains of serotype O6. The traits that distinguish the O6 strains we examined include the production of LT and ST and production of additional pili (CS2, -3, -4, and/or -6), which are all plasmid encoded. Thus, these strains may be very close relatives that differ only in plasmid content. The strains with other O serotypes may be more distantly related, and thus their pCoo-related plasmids may have had more time to diverge from that in C921b-1.
Comparison of the pCoo plasmids from the CS1-producing ETEC strains suggests that they were all derived from an ancestor plasmid that has diverged over time. They all contain both the R64- and R100-like origins of replication, the pCoo-specific region A (that encodes the CS1 pilus genes) in the R64 half of pCoo, and at least some portion of one of the direct repeats, DRI and DRII, that delimit the individual replicons in the cointegrate plasmid. The extent of the R64 thin pilus operon present in the ETEC strains varies; deletions in this region suggest that it is not essential. All ETEC strains that we examined with the O6 serotype have both of the direct repeats, but for the more distantly related strains with other serotypes, the primers we used did not always amplify all parts of the direct repeats. In some of the non-O6 strains, one of the direct repeats was present, suggesting that the cointegrate was formed. For the rest of the strains, we were unable to determine whether the two parts of pCoo formed a cointegrate and, if so, whether the cointegrate would resolve. It is not clear whether there is selective value to maintenance of the cointegrate plasmid over that of the two independent replicons which constitute it, but its continued presence suggests some type of advantage.
The presence of at least part of both the R100-like and R64-like halves of pCoo in all of the CS1-piliated ETEC strains suggests that there might be essential functions on both. On the R64 part, the coo genes are probably required for virulence. On the R100 portion of pCoo, candidates for functions important for pathogenesis include the shf operon, which is almost identical to that found on the pINV virulence plasmid of Shigella flexneri, and eatA, which encodes the secreted serine protease that enhances the virulence in a CFA/I-pilus-producing ETEC strain (25).
In the CS1-producing ETEC strains, replicons of three incompatibility groups are apparently necessary for pathogenesis. An IncFI plasmid contains the positive regulator, rns, required for expression of the coo genes, LT and ST, and, in some strains, the colonization factor CS3. The pCoo cointegrate plasmid is made up of an IncI1 (R64) replicon and an R100-like replicon, both of which may carry genes important for virulence. A better understanding of the role of these genes and of contributions to virulence from the IncFI plasmid and from the chromosome awaits further sequence analysis and studies on the pathogenesis of ETEC.