CS1 is the prototype of a class of pili of enterotoxigenic Escherichia coli (ETEC) associated with diarrheal disease in humans. The genes encoding this pilus are carried on a large plasmid, pCoo. We report the sequence of the complete 98,396-bp plasmid. Like many other virulence plasmids, pCoo is a mosaic consisting of regions derived from multiple sources. Complete and fragmented insertion sequences (IS) make up 24% of the total DNA and are scattered throughout the plasmid. The pCoo DNA between these IS elements has a wide range of G+C content (35 to 57%), suggesting that these regions have different ancestries. We find that the pCoo plasmid is a cointegrate of two functional replicons, related to R64 and R100, which are joined at a 1,953-bp direct repeat of IS100. Recombination between these repeats in the cointegrate generates the two smaller replicons, which coexist with the cointegrate in the culture. Both of the smaller replicons have plasmid stability genes as well as genes that may be important in pathogenesis. Examination by PCR of 17 other unrelated CS1 ETEC strains with a variety of serotypes demonstrated that all contained at least parts of both replicons of pCoo and that strains of the O6 serotype appear to contain a cointegrate very similar to pCoo. The results suggest that this family of CS1-encoding plasmids is evolving rapidly.
Enterotoxigenic Escherichia coli (ETEC) is an important cause of diarrheal diseases in humans (1, 2, 14, 20). Although ETEC infections are usually self-limiting in adults, the disease results in death in infants and young children in developing countries. Following colonization of the small intestine, these bacteria produce the heat-labile (LT) and heat-stable (ST) toxins, which are responsible for the extensive fluid secretion from the intestine that characterizes this disease.
An important first step in ETEC infection, colonization of the host, requires attachment of the bacteria to cells of the small intestine. In gram-negative bacteria, such attachment is usually mediated by pili. In ETEC strains associated with human disease, at least 21 serologically distinct pili have been identified (12, 27). CS1, the best characterized of these, is the prototype of a family of ETEC pili that includes CS1, CS2, CS4, CS14, CS17, CS19, and colonization factor antigen I (CFA/I) (12), whose structural and assembly proteins have similar sequences.
For production of functional CS1 pili, only four genes (cooB, -A, -C, and -D) are necessary when cloned in E. coli K-12 (11, 26, 30). Like many other virulence genes, the coo genes are located on a plasmid, called pCoo (9, 26). In ETEC strains, a positive regulator, rns, is required for expression of the coo operon. Rns is encoded on another plasmid, which has IncFI incompatibility (3, 5, 10, 22). In some CS1-producing strains, this plasmid also encodes LT and ST and the CS3 colonization factor (22).
A partial sequence of the pCoo plasmid (36 kb) from the ETEC-derived strain C921b-1 revealed regions homologous to parts of the IncI1 plasmid R64 (9). This homology includes the R64 replication region, and, as expected, pCoo is incompatible with R64. In addition, this homology includes most of the genes required for synthesis of the R64 thin pilus, required for conjugation in liquid. However, because this locus is not complete, pCoo is not self-conjugative.
In this work, we report the complete sequence of the 98,396-bp pCooKm plasmid. Like many other virulence plasmids, pCoo is a mosaic made up of DNA from various sources. Surprisingly, we found that pCoo is a cointegrate with two functional replication regions (R64- and R100-like). The two smaller replicons are separated by 1,953-bp direct repeats of insertion sequence 100 (IS100) that serve as homologous recombination sites leading to resolution into the individual small plasmids. Each of the smaller replicons has genes important for stable plasmid inheritance. We also examined pCoo plasmids from 17 CS1-producing ETEC strains with a variety of serotypes and found that all contain portions of both replicons.
The bacterial strains used are listed in Table 1. Bacteria were grown in Luria-Bertani medium at 37°C with aeration. The plasmid pHSG576 is a pSC101 replicon (Cmr) (33). R64colK is a conjugative IncI1 Tcr plasmid (36) and will be referred to as R64 in the rest of this paper. Antibiotics used were kanamycin (50 μg/ml) and tetracycline (10 μg/ml).
About 36 kb of pCoo has been previously sequenced (9). Plasmid DNA isolated from JEF100 (Table 1) was used for sequencing the rest of the pCooKm plasmid. The DNA was fragmented by sonication and size fractionated before constructing libraries in pUC19. The sequence was generated from 821 paired end-reads from two pUC19 libraries with insert sizes of 1.4 to 2 kb and 713 paired end-reads from two pUC19 libraries with insert sizes of 2 to 4 kb. These 1,534 sequencing reads were generated with ABI BigDye Terminator chemistry on ABI3730 sequencing machines and gave a total coverage for the plasmid of 9.94-fold. All identified repeats were bridged with read pairs or end-sequenced PCR products. Error checking and finishing of the sequence were performed to standard criteria, and the final sequence had a quality score of >30 at each base (equivalent to an estimated error rate of <1 bp per 1.73 Mb). The sequence was annotated using Artemis (28). Predicted coding sequences (CDSs) were identified manually with reference to positional base composition and amino acid usage plots. The entire sequence was searched in all six reading frames against the nonredundant TrEMBL database using BLASTX to ensure that no genes were missed. Each CDS was searched against the nonredundant databases using FASTA and BLASTP and against the PFAM and Prosite databases of protein motifs. Transmembrane helices were identified with TMHMM (18) and signal sequences with SignalP (24). Repeats were identified using Dotter (32).
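As a rough consistency check on the sequencing figures above, fold coverage can be related to read count and read length by coverage = (number of reads × mean read length) / plasmid length. The following sketch solves this relation for the mean read length implied by the reported numbers. It assumes that all 1,534 reads contribute evenly to the 98,396-bp plasmid, which the text does not state explicitly; the function name is illustrative only.

```python
def implied_mean_read_length(plasmid_bp: int, n_reads: int, fold_coverage: float) -> float:
    """Solve coverage = n_reads * mean_length / plasmid_bp for mean_length (in bp)."""
    if plasmid_bp <= 0 or n_reads <= 0:
        raise ValueError("plasmid_bp and n_reads must be positive")
    return fold_coverage * plasmid_bp / n_reads


# Values reported in the text: 821 + 713 reads, 98,396 bp, 9.94-fold coverage.
n_reads = 821 + 713
mean_len = implied_mean_read_length(98_396, n_reads, 9.94)
print(f"{n_reads} reads imply a mean usable read length of about {mean_len:.0f} bp")
```

Under this assumption the implied mean read length is roughly 640 bp, which is a plausible figure for the capillary sequencing chemistry named above.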
The nucleotide sequence of the pCoo plasmid has been deposited in EMBL/GenBank under accession no. CR942285.
The pCoo plasmid with a kanamycin resistance gene inserted into cooB was obtained from the ETEC-derived strain JEF100 (Table 1) and its sequence determined. Analysis of the sequence of this 98,396-bp plasmid revealed 129 predicted CDSs, including the inserted Km gene (see Table S1 in the supplemental material for the locations and homologies of all CDSs of pCoo). Like many other virulence plasmids, pCoo appears to be a mosaic consisting of regions derived from multiple sources. DNA homologous to IS elements is scattered throughout the plasmid and comprises 24% of the total pCooKm DNA (Fig. 1). There are 11 regions between 195 and 7,612 bp in length that contain intact and partial IS elements. Many of these regions contain fragments of several different IS elements, which suggests that they are not the result of single insertion events, although some recent simple insertions were identified.
The G+C content of the pCoo DNA sequences between these IS elements ranges from 35% to 57%. This wide range of G+C content is consistent with pCoo being derived by the amalgamation of DNA from a variety of sources. The pCoo sequences between the IS fragments are labeled with a bold lowercase letter in Fig. 1, and they have G+C contents of 49%, 40%, 49%, 46%, 42%, 45%, 53%, 57%, 36%, 35%, and 36% for regions a to k, respectively. These regions are described below.
A large contiguous part of pCoo region a (bp 83057 through 98396 and 1 through 2250) is 95% identical at the level of DNA to a contiguous region of plasmid R64 (bp 105487 through 120826; accession no. AP005147) (9). Part of region a of pCoo (pCoo bp 72009 through 78749) is similar to the closely related plasmids R64 and ColIb-P9 (accession no. AP005147 and NC_002122, respectively). Although these plasmids are in the same incompatibility group, there are DNA segments that differ in sequence between them. This part of pCoo appears to be a hybrid between these two plasmids because it has one segment (bp 72009 through 72759) similar to a unique region in ColIb-P9 and another (bp 75370 through 77265) similar to a unique region in R64. Within the part similar to both R64 and ColIb-P9, there are two genes with no similarities to genes of known function. The rest of the CDSs in this section have homologies to genes of known function and will be discussed later.
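The G+C content of a region is simply the percentage of its bases that are G or C. The sketch below shows one way to compute it and tabulates the per-region values quoted above to confirm the stated 35% to 57% range. The helper name `gc_content` and the short test sequence are illustrative assumptions; only the eleven percentages come from the text.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among the unambiguous bases (A, C, G, T) of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


# G+C percentages reported in the text for inter-IS regions a to k.
REGION_GC = dict(zip("abcdefghijk", [49, 40, 49, 46, 42, 45, 53, 57, 36, 35, 36]))

lowest = min(REGION_GC, key=REGION_GC.get)
highest = max(REGION_GC, key=REGION_GC.get)
print(f"lowest: region {lowest} ({REGION_GC[lowest]}%), "
      f"highest: region {highest} ({REGION_GC[highest]}%)")
print(gc_content("GGCCAATT"))  # toy sequence, not pCoo DNA
```

In this tabulation the extremes fall in region j (35%) and region h (57%).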
A homologue of the AraC family regulatory gene csvR (accession no. X60106) is located in a part of pCoo (bp 65421 through 69147; spanning regions i and j, the IS element between them, and portions of the flanking IS elements) that is 96% identical to a section of a plasmid from an ETEC strain of serotype O167 H5 (6). Although CsvR is 75% identical to Rns, the positive regulator for the coo operon present on another plasmid in the same ETEC strain, the pCoo copy of CsvR contains a deletion leading to a frameshift, so it is not expected to be expressed.
A number of conserved hypothetical genes in pCoo show similarity to genes from several other different gram-negative chromosomes and plasmids, contributing to the impression of a mosaic origin for pCoo.
In addition to the four coo genes (Fig. 1, region b) required for CS1 pilus production, several predicted gene products of pCoo resemble proteins implicated in pathogenesis in other bacteria, and these may be important for the virulence of an ETEC strain containing pCoo.
Region e of pCoo (Fig. 1, between bp 32036 and 36029) is 95% identical to the shf operon of the Shigella flexneri virulence plasmid pINV (accession no. AY206446). The last two proteins of the pCoo shf locus are 97% identical to their counterparts in pINV of Shigella flexneri, and these genes have been implicated in virulence. The first of these proteins, VirK (pCoo045), mediates intracellular spreading of Shigella flexneri by regulating the expression of VirG (IcsA), an outer membrane protein, and the last protein encoded in this locus, MsbB2 (pCoo044), is required for the full virulence of a septicemic E. coli strain (17, 23).
The protein encoded by pCoo056 (Fig. 1, region e) is 94% identical to the secreted autotransporter EatA of an ETEC strain that produces CFA/I pili, another member of the CS1 pilus family. EatA is a serine protease that increases fluid accumulation in a rabbit ileal loop model of ETEC infection (25). PCR analysis of 41 human clinical ETEC isolates showed that about 61% contain the eatA gene, suggesting that it may be important in human disease as well. Directly downstream of the pCoo eatA gene and transcribed convergently is a member of the AraC family of transcription regulators (Fig. 1, region e) (pCoo055), which, because of its proximity, might regulate the expression of eatA. The sequence of the corresponding region in the CFA/I plasmid is not yet available.
A second AraC family member is pCoo089, which is 70% identical to cafR, the regulator of the Yersinia pestis operon that encodes the f1 nonfimbrial adhesin (accession no. P26950). This protein is encoded in region k (Fig. 1). If it is expressed from pCoo, it might regulate other virulence factors on plasmids and/or the host chromosome.
Several CDSs of pCoo are homologous to genes whose functions do not suggest roles in pathogenesis. A segment homologous to the conjugative transfer region of the IncFV plasmid pED208 (TraM, TraJ, TraA, TraL, TraE, and a TraK/X fusion [pCoo28-33] are 65%, 27%, 83%, 86%, 92%, and 85%/73% identical, respectively; accession no. AF411480) is found at region d of pCoo. In pED208, there are 20 transfer genes between TraK and TraX which are absent from pCoo, suggesting that the fusion TraK/X protein was formed as a result of a deletion of a large portion of this transfer region.
Surprisingly, pCoo region d encodes a protein 98% identical to the E. coli peptide deformylase (pCoo040; accession no. P71251), which is essential for bacterial protein maturation. To our knowledge, this is the first report of this essential enzyme being encoded on a plasmid, although homologues of other essential chromosomal genes, often involved in DNA metabolism, have been found on other plasmids.
Two other predicted pCoo proteins (pCoo071 and pCoo072, region h) have homology to R100 proteins thought to contribute to the conjugative processing of DNA (accession no. P18148 and P28044). Both of these are only partial genes and probably represent remnants of genes from other plasmids since pCoo is not conjugative (9).
In addition to the functional replication region homologous to the conjugative IncI1 plasmid R64 (9), pCoo contains a second replication region (pCoo bp 27285 through 29452) (Fig. 1) that is closely related to that of the IncFII plasmid R100. This region contains all of the gene products and cis-acting sites required for the replication of R100, including the replication initiation protein RepA1 (pCoo036). This contiguous region of pCoo is organized identically to that of R100. The expression of repA1 is positively regulated by RepA6 (pCoo35a) and negatively regulated by RepA2 and by the antisense RNA, RNA-I (7). The incompatibility determinant, RNA-I, of pCoo is 90% identical to that of R100. The RepA1 and RepA6 proteins of pCoo are 99% identical to the R100 RepA1 and RepA6 proteins, and pCoo RepA2 (pCoo0035) is 71% identical and 90% similar to the RepA2 of R100 (accession no. AP000342). The pCoo origin is 99% identical to the origin region, oriR, of R100. There are additional cis-acting sites downstream of the origin in R100 that are important for stable inheritance of the plasmid, and these are also present in pCoo (16). These homologies strongly suggest that the R100 origin in pCoo is functional (see below).
It seems likely that pCoo is a cointegrate formed by the recombination of two independent replicons because the two pCoo origins are separated by long (1,953-bp) direct repeats composed of recent IS100 insertions. Analysis of the patterns of target site duplications, and the genes into which the IS100s have inserted, suggests that they transposed into separate genes and subsequently recombined (Fig. 1, direct repeat I [DRI] and DRII). Recombination to resolve the cointegrate is expected to be frequent because of the length of the repeated region. To detect the presence of the independent replicons that would be generated by recombination, we performed PCR analysis using primers that flank the direct repeats (Fig. 2). Primers 1 and 2, which flank DRII, and primers 3 and 4, which flank DRI, will amplify PCR products from unresolved intact pCoo but not from the two independent plasmids (Fig. 2). Primer pairs 2 and 3 and 1 and 4 will amplify PCR products from the separate replicons, R64 and R100, respectively, but not from the pCoo cointegrate. By using total plasmid DNA from the pCooKm-containing strain JEF100 as template, PCR products of the expected sizes were obtained with all four primer pairs (1 + 2, 3 + 4, 3 + 2, and 1 + 4) (data not shown). This demonstrates that the pCoo cointegrate plasmid coexists with the two separate replicons in a single culture.
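The logic of this primer design can be made explicit with a small model. In the sketch below, each plasmid is a circular list of features, and a primer pair is counted as yielding a product only if the two primers sit immediately on either side of a direct repeat. The feature order of the cointegrate is inferred from the description above (primers 2 and 3 on the R64 side, primers 1 and 4 on the R100 side); the labels, function names, and the simplification to a single generic "DR" in each resolved circle are assumptions made for illustration.

```python
# Circular feature order of the cointegrate, inferred from the text:
# primers 1/2 flank DRII, primers 3/4 flank DRI.
COINTEGRATE = ["P1", "DRII", "P2", "R64", "P3", "DRI", "P4", "R100"]


def resolve(circle):
    """Recombine the two direct repeats of a circle into two smaller circles,
    each retaining one copy of the repeat (labeled simply 'DR')."""
    i, j = [k for k, f in enumerate(circle) if f.startswith("DR")]
    inner = circle[i + 1:j] + ["DR"]
    outer = circle[j + 1:] + circle[:i] + ["DR"]
    return inner, outer


def flanking_pairs(circle):
    """Primer pairs that sit directly on either side of a repeat in this circle."""
    pairs = set()
    n = len(circle)
    for k, feature in enumerate(circle):
        if feature.startswith("DR"):
            left, right = circle[(k - 1) % n], circle[(k + 1) % n]
            if left.startswith("P") and right.startswith("P"):
                pairs.add(frozenset((left, right)))
    return pairs


r64_circle, r100_circle = resolve(COINTEGRATE)
for name, circle in [("cointegrate", COINTEGRATE),
                     ("R64 replicon", r64_circle),
                     ("R100 replicon", r100_circle)]:
    print(name, sorted(sorted(p) for p in flanking_pairs(circle)))
```

In this model the cointegrate supports pairs 1 + 2 and 3 + 4 only, the R64 circle supports 2 + 3 only, and the R100 circle supports 1 + 4 only, so obtaining products with all four pairs requires a mixture of all three molecules.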
We have shown previously that the R64 origin is functional (9); however, it is possible that the R100 origin is not functional. The R100-related small plasmid may be continuously generated from the cointegrate replicon by recombination. Assuming that the pCoo origins have low copy numbers like both R100 and R64, a nonreplicating R100 plasmid would not be expected to reside in many cells containing the cointegrate because of incompatibility with the cointegrate. However, a nonreplicative R100 might be present transiently. To determine whether the R100-related plasmid is a functional replicon, we used incompatibility to isolate colonies that had lost the R64 portion of pCoo (9) and asked whether an R100 replicon was still present.
A culture of strain EU2574, which carries both R64Tc and pCooKm (the Km marker is in cooB, which is in the R64 part of the large plasmid) (Fig. 1), was grown for 24 generations without antibiotic selection, and colonies were scored for Km and Tc. We found that all 643 of the EU2574 colonies scored still contained the Tc marker of R64. In 24 generations, 80% of the colonies had lost the Km marker, indicating a loss of both pCooKm and its smaller R64Km derivative. If the R100-like plasmid could replicate independently, cells that had lost pCooKm should still contain DNA from the R100-like replicon. Plasmid DNA was isolated from 7 Kms colonies, and the presence of the R100-like replicon was assessed by PCR using primers 1 and 4, which flank the repeat in the R100-like replicon (Fig. 2, pCoo R100 DR). A PCR product of the expected size was amplified from all of the colonies tested, demonstrating the presence of the R100-like replicon (data not shown). As expected, fragment G within the R64 part of pCoo (Fig. 1) was not amplified from these Kms colonies. Thus, the R100-like replicon is stable in the absence of the R64 replicon of pCoo, indicating that the R100-like origin, like the R64 origin, is functional. It remains formally possible that the R100 (and R64) resolution products undergo chromosomal integration in such a way that the recombinant junctions are intact, since this could not be detected by our PCR analysis.
Since the incompatibility determinant, RNA-I, of the R100-like part of pCoo is not 100% (but only 90%) identical to that of R100, it seemed possible that the two plasmids might be compatible. We showed above that the pCoo cointegrate plasmid resolves to form the two separate replicons, and since introduction of an IncI1 plasmid leads to loss of the R64 replicon of pCoo (9), we applied the same test for R100 incompatibility. We investigated whether the introduction of R100 (Tcr) into a strain carrying pCooKm (Table 1, JEF100) leads to loss of the R100-like part of the pCoo cointegrate. After 25 generations in the absence of selection, 300 colonies were found to grow on plates containing both Km and Tc, indicating that both the R64 replicon of pCooKm (Kmr) and R100 (Tcr) were present. Plasmid DNA was isolated from 35 of these colonies and the presence of the R100-like replicon of pCooKm was assessed by PCR, using primers within the eatA gene and the araC homolog directly downstream of eatA (Fig. 1). A PCR product of the expected size was amplified from all colonies tested (data not shown). Using the same primers, no PCR product was amplified from plasmid isolated from the negative control (Table 1, K1250/R100). These results demonstrate that the R100-like pCoo replicon can coexist with R100 in the absence of selection, indicating that they are in different incompatibility groups.
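To put the observed 80% loss in 24 generations on a per-generation scale, one can assume, purely for illustration, that a constant fraction p of marker-carrying cells loses the Km marker in each generation and that marker-free and marker-carrying cells grow equally well. Neither assumption is tested in the text. Under this simple model the retained fraction after n generations is (1 − p)^n, which the sketch below inverts; the function name is hypothetical.

```python
def per_generation_loss(fraction_lost: float, generations: int) -> float:
    """Constant per-generation loss p such that (1 - p) ** generations
    equals the fraction of cells still carrying the marker."""
    if not 0.0 <= fraction_lost < 1.0:
        raise ValueError("fraction_lost must be in [0, 1)")
    if generations <= 0:
        raise ValueError("generations must be positive")
    retained = 1.0 - fraction_lost
    return 1.0 - retained ** (1.0 / generations)


p = per_generation_loss(0.80, 24)   # values reported in the text
print(f"implied loss per generation: {p:.1%}")
print(f"colonies expected to be Km-sensitive out of 643: {round(0.80 * 643)}")
```

With these inputs the model gives a loss of about 6.5% per generation, a rate consistent with strong incompatibility between R64 and the R64 replicon of pCoo.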
Since both replicons of pCoo can function independently, they might be expected to be lost independently in the absence of known selection. The strain analyzed was selected for production of the CS1 pili encoded by the R64 homologue, but there was no obvious selection for the R100 homologue. Thus, the continued presence of both replicons suggests that they are both inherited very stably. Several different mechanisms are used by plasmids to ensure stable and equal distribution into daughter cells following replication (for a review, see reference 38). More than one of these mechanisms is often present in the same plasmid. Often plasmid replication results in the production of multimers that would be segregated as one molecule into the same daughter cell at division unless they were resolved. Thus, production by the plasmid of a site-specific recombinase to resolve these multimers into monomers is an important stability mechanism. A second mechanism is an active centromere-like partition process (encoded by par or stb genes), which precisely distributes copies of the plasmid to each daughter cell. Third, and perhaps most effective (4), is the postsegregation killing or plasmid addiction mechanism. In the last case, plasmid-free segregants are killed because of the production of a plasmid-encoded stable toxin that is counteracted in plasmid-containing cells by the production of an unstable antidote.
The plasmid stability region in the R64 portion of pCoo (Fig. 1, region a) contains genes for the first two types of stability mechanisms: site-specific recombination and active partition (9). The resolvase ResD (pCoo097; 91% identical to the Rsv protein required for maintenance of the F plasmid), ParA (pCoo98; 61% identical to a member of the ParA family from Xanthomonas citri), StbA (pCoo100; 66% identical to StbA of EPEC plasmid pB171), and StbB (pCoo101; 46% identical to StbB of pB171) are all likely to contribute to the stable inheritance of pCoo and the pCoo R64 replicon.
The R100 part of pCoo also has genes homologous to those that encode all three stability mechanisms described above. This region encodes two postsegregation killing systems: ccdA/B (Fig. 1, region e) and hok/sok/mok (Fig. 1, region h). CcdB (pCoo051) is the cytotoxic protein that kills plasmid-free segregants, and CcdA (pCoo050) is the antitoxin protein that protects the plasmid-containing cells (13). In plasmid pCoo, CcdA and -B are 100% identical to the CcdA and -B proteins found in E. coli plasmid pB171 (accession no. BAA84907 and BAA84908, respectively). The pCoo Hok protein (pCoo075), which is the stable toxin that kills plasmidless cells, is predicted to be 74% identical to Hok of R100, and pCoo Mok (pCoo074), the modulator of Hok, is 51% identical to Mok of R100 (accession no. AP000342).
In addition to the postsegregation killing systems, the R100 half of pCoo encodes two proteins homologous to those of an active partitioning system. ParA (pCoo064; region f) is 87% identical to ParA from Citrobacter freundii (accession no. AF550415) and 53% identical to StbA from E. coli plasmid R100 (accession no. AP000342). ParB (pCoo065; region f) is 75% identical to ParB of Citrobacter freundii (accession no. AF550415) and 39% identical to StbB of E. coli plasmid R100 (accession no. AP000342).
Both replicons of pCoo also encode putative resolvases. The R100 part encodes two resolvase homologues. One, Res (pCoo053; region e), is 94% identical to a Proteus vulgaris resolvase (accession no. AP004237), and the second, ResD (pCoo052; region e), is 96% identical to the Rsv protein of the F plasmid (8).
There is also a ResD homologue in the R64 part of pCoo (pCoo097; region a). The two ResD proteins are oriented in the same direction, and a 400-bp segment of the two CDSs is 95% identical. These direct repeats, like the repeats between the two replicons, could serve as a substrate for recombination. However, in this case, one recombinant would contain a replication origin and the other would not. That pCoo has not lost the region between the two ResD homologues is probably due to the presence of the mok/hok postsegregation killing system on the nonreplicating recombinant circle. The presence of the stable Hok protein in the culture would ensure the killing of any cell lacking this region. Because so many plasmid stability genes are present, it is not surprising that both of the smaller replicons, as well as the pCoo cointegrate, are stably maintained.
Although both parts of the pCoo cointegrate are likely to be stably inherited both because of stability functions and possibly because of genes providing a selective advantage, the advantage to maintaining them on one large cointegrate plasmid is not obvious. Therefore, one might expect to find some ETEC strains that have only the two smaller plasmids and have lost the direct repeat region required for recombination, leading to formation of the cointegrate.
To learn about the presence and configuration of the pCoo-related plasmids of other ETEC strains, we used PCR to probe unrelated clinical ETEC isolates that produce CS1 pili (Table 1). With the exception of strains 294, 295, and 296, which had not been studied as completely, these strains were chosen for the study because they differed in one or more known genetic traits, like serotype, drug resistance, or type of pilus (in addition to CS1) produced. Primer pairs (Table S2) were selected to amplify the pCoo regions shown as uppercase letters in Fig. 1. As expected from the presence of CS1 pili on their surfaces, region E, containing cooA, and region A, which is specific to pCoo and within the R64 replicon, were amplified from all of the strains (data summarized in Table 2).
At least a part of the region encoding the R64 thin pilus (Fig. 1; Table 2, regions B, C, and D) was amplified from all but two of the strains (TW03923 and TW03677). From both of these, a product was obtained using primers within the R64 origin region (E), which is just upstream of the R64 thin pilus region. This suggests that most, but not all, of the R64 homology found in pCoo is absent from these strains. Deletions of parts of the thin pilus operon in some of the ETEC strains, suggested by the inability to amplify regions B, C, and/or D, indicate that this operon may not be essential for ETEC infection. This is supported by the nonfunctionality of this system in pCoo.
All of the ETEC strains have the R100-like origin region (Table 2, amplified region I), and primers within eatA (region J), which is also in the R100 part of pCoo, generated a PCR product from all but one of the strains (TW03677). Thus, all of the strains have some regions from both the R64 and R100 halves of pCoo.
To determine whether the cointegrate structure found in pCoo of C921b-1 is retained in the plasmids of the other strains, we used primers flanking the direct repeats (DRI and DRII) together with primers within the repeat. If DRI is present, amplified fragments G and H will be obtained, and if DRII is present, K and L should be seen. The presence or absence of DRI and DRII divides these strains into two classes.
The largest class, A, with 10 members (Table 2), contains both repeats and includes our reference strain, C921b-1. Where known, the serotype of all class A strains examined is O6. To determine whether the cointegrate plasmid is resolved during growth of these strains, we sought to amplify the expected junctions of the cointegrate and of the individual small plasmids from three of these strains (1392-75-2A, 60R70, and 296) as described above for C921b-1. As we had found for C921b-1, the two smaller replicons were present in the same culture with the whole cointegrate plasmid in all three strains examined (data not shown).
The other class (Table 2, class B) of pCoo-related plasmids in these ETEC strains has eight members, all of which lack parts of one or both of the repeats or have repeats of sizes different from those in pCoo from strain C921b-1. The members of class B have a variety of serotypes and are probably more distantly related than the O6 group of strains. In summary, we find that several segments of pCoo, such as region A of R64, some portion of the R64 thin pilus, the R64 and R100 origin regions, and eatA, are common to the plasmids in most of the ETEC strains of all serotypes. However, only from the ETEC strains of serotype O6 could we amplify fragments with all the pCoo primer pairs used to identify both DRI and DRII. Thus, it appears that the pCoo cointegrate structure may be characteristic of CS1-producing strains of serotype O6. The traits that distinguish the O6 strains we examined include the production of LT and ST and production of additional pili (CS2, -3, -4, and/or -6), which are all plasmid encoded. Thus, these strains may be very close relatives that differ only in plasmid content. The strains with other O serotypes may be more distantly related, and thus their pCoo-related plasmids may have had a greater time to diverge from that in C921b-1.
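The class assignment described above reduces to a simple rule on the PCR results: DRI is scored as present when fragments G and H are both amplified, DRII when K and L are both amplified, and a strain falls in class A only when both repeats are present. The following sketch encodes that rule. The function names and the example fragment sets are hypothetical and do not reproduce the data in Table 2; the rule also does not capture class B strains whose repeats differ only in size.

```python
DRI_FRAGMENTS = {"G", "H"}    # fragments diagnostic for DRI, per the text
DRII_FRAGMENTS = {"K", "L"}   # fragments diagnostic for DRII, per the text


def repeats_present(amplified: set) -> dict:
    """Score each direct repeat as present only if both of its fragments amplified."""
    return {
        "DRI": DRI_FRAGMENTS <= amplified,
        "DRII": DRII_FRAGMENTS <= amplified,
    }


def assign_class(amplified: set) -> str:
    """Class A: both repeats detected. Class B: at least one repeat not fully detected."""
    status = repeats_present(amplified)
    return "A" if status["DRI"] and status["DRII"] else "B"


# Hypothetical amplification profiles, for illustration only.
examples = {
    "example_1": {"A", "E", "G", "H", "I", "J", "K", "L"},
    "example_2": {"A", "E", "G", "H", "I", "K"},
    "example_3": {"A", "E", "I"},
}
for name, fragments in examples.items():
    print(name, repeats_present(fragments), "class", assign_class(fragments))
```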
Comparison of the pCoo plasmids from the CS1-producing ETEC strains suggests that they were all derived from an ancestor plasmid that has diverged over time. They all contain both the R64- and R100-like origins of replication, the pCoo-specific region A (that encodes the CS1 pilus genes) in the R64 half of pCoo, and at least some portion of one of the direct repeats, DRI and DRII, that delimit the individual replicons in the cointegrate plasmid. The extent of the R64 thin pilus operon present in the ETEC strains varies; deletions in this region suggest that it is not essential. All ETEC strains that we examined with the O6 serotype have both of the direct repeats, but for the more distantly related strains with other serotypes, the primers we used did not always amplify all parts of the direct repeats. In some of the non-O6 strains, one of the direct repeats was present, suggesting that the cointegrate was formed. For the rest of the strains, we were unable to determine whether the two parts of pCoo formed a cointegrate and, if so, whether the cointegrate would resolve. It is not clear whether there is selective value to maintenance of the cointegrate plasmid over that of the two independent replicons which constitute it, but its continued presence suggests some type of advantage.
The presence of at least part of both the R100-like and R64-like halves of pCoo in all of the CS1-piliated ETEC strains suggests that there might be essential functions on both. On the R64 part, the coo genes are probably required for virulence. On the R100 portion of pCoo, candidates for functions important for pathogenesis include the shf operon, which is almost identical to that found on the pINV virulence plasmid of Shigella flexneri, and eatA, which encodes the secreted serine protease that enhances the virulence in a CFA/I-pilus-producing ETEC strain (25).
In the CS1-producing ETEC strains, replicons of three incompatibility groups are apparently necessary for pathogenesis. An IncFI plasmid contains the positive regulator, rns, required for expression of the coo genes, LT and ST, and, in some strains, the colonization factor CS3. The pCoo cointegrate plasmid is made up of an IncI1 (R64-like) replicon and an R100-like replicon, both of which may contain genes that are important for virulence. A better understanding of the role of these genes and of contributions to virulence from the IncFI plasmid and from the chromosome awaits further sequence analysis and studies on the pathogenesis of ETEC.
This work was supported by grant AI24870 from the National Institutes of Health and by the Wellcome Trust. We gratefully acknowledge the support of the Sanger Institute core sequencing and informatics teams.
We thank Myron Levine for ETEC strain 1391-75-2A; Eileen Barry for ETEC strains 60T75 and E4377/O/A; The Foodborne and Diarrheal Disease Laboratory at the CDC for ETEC strains 294, 295, 296; and Tom Whittam for ETEC strains TW03577, TW03604, TW03923, TW03875, TW04205, TW04215, TW04237, TW04144, TW04316, TW04244, and TW04211.
†Supplemental material for this article may be found at http://jb.asm.org/.