Summary: Bacterial plasmids are self-replicating, extrachromosomal elements that are key agents of change in microbial populations. They promote the dissemination of a variety of traits, including virulence, enhanced fitness, resistance to antimicrobial agents, and metabolism of rare substances. Escherichia coli, perhaps the most studied of microorganisms, has been found to possess a variety of plasmid types. Included among these are plasmids associated with virulence. Several types of E. coli virulence plasmids exist, including those essential for the virulence of enterotoxigenic E. coli, enteroinvasive E. coli, enteropathogenic E. coli, enterohemorrhagic E. coli, enteroaggregative E. coli, and extraintestinal pathogenic E. coli. Despite their diversity, these plasmids belong to a few plasmid backbones that present themselves in a conserved and syntenic manner. Thanks to some recent research, including sequence analysis of several representative plasmid genomes and molecular pathogenesis studies, the evolution of these virulence plasmids and the implications of their acquisition by E. coli are now better understood and appreciated. Here, work involving each of the E. coli virulence plasmid types is summarized, with the available plasmid genomic sequences for several E. coli pathotypes being compared in an effort to understand the evolution of these plasmid types and define their core and accessory components.
Bacterial plasmids are self-replicating, extrachromosomal replicons that are key agents of change in microbial populations (1, 63). Naturally occurring plasmids are able to promote the dissemination of a variety of traits including drug resistance, virulence, and the metabolism of rare substances (63a). Recombinant plasmids have been essential to the field of molecular biology, but the wild-type plasmids from which these tools were derived are often underappreciated. Escherichia coli, perhaps the most-studied microorganism, has been found to possess a variety of plasmid types including those associated with virulence (91). Several types of E. coli virulence plasmids exist, including those essential for the virulence of enterotoxigenic E. coli (ETEC), enteroinvasive E. coli (EIEC), enteropathogenic E. coli (EPEC), enterohemorrhagic E. coli (EHEC), enteroaggregative E. coli (EAEC), and extraintestinal pathogenic E. coli (ExPEC). Despite the large number of plasmid types known to occur among E. coli strains, plasmids encoding virulence-associated traits fall almost exclusively within a single incompatibility family known as IncF. Thanks to recent genome sequencing efforts, the evolution of these IncF-type virulence plasmids and the implications of their acquisition by a host E. coli strain are now beginning to be better understood.
E. L. Tatum and Joshua Lederberg first described their work involving genetic recombination in Escherichia coli in 1947 (180). In 1953, they described “sex in bacteria,” a body of work that was among the earliest to use the terms “episome” and “plasmid” to describe extranuclear structures capable of reproducing in an autonomous state (99). The earliest studies involving bacterial plasmids were focused on those encoding antimicrobial resistance, known as R factors (190). From those studies, it was found that plasmids possessed certain properties including autonomous replication, mobility, host range, and incompatibility with other plasmids (63a). Work involving the fertility factor, or “F factor,” in E. coli further broadened our understanding of plasmid transfer and replication. For example, studies of the F factor provided insight into the mechanisms of gene transfer between donor and recipient bacteria and led to the discovery that plasmids contain “replicons” responsible for the control of their own replication (82). Thanks to the extensive work of many researchers involving R plasmids, F factors, and colicin-encoding (Col) plasmids, the core attributes of plasmids were established in the 1960s. The early identification of an association between plasmids and the transfer of multidrug resistance (MDR) by several Japanese laboratories accelerated interest in plasmid genetics and led to an improved understanding of bacterial conjugation (50). For over 20 years, Pierre Fredericq studied Col factors, which he described as being episomal traits conferring a competitive advantage in microbial populations through the production of substances known as colicins. He discovered that some of these traits were self-transmissible while others were not. 
He also reported that a nontransmissible Col plasmid, when placed into the same cell as a conjugative plasmid, was transferred to recipient cells; such plasmids were deemed to be mobilizable, or capable of being moved from a donor to a recipient cell in the presence of another plasmid's conjugative apparatus. Fredericq also determined that these plasmids could be integrated into the host chromosome, and much subsequent work involved Hfr strains, which were shown in vitro to be capable of such integration (56-59, 96).
It was later found that properties conferred by bacterial plasmids, such as autonomous replication, transmissibility, stability, and drug resistance, could be localized to individual genetic modules within a plasmid. Such modules could be dissociated and transformed independently into recipient bacteria (63a). The recognition that insertion sequences (ISs) often flank these modules suggested mechanisms of underlying plasmid plasticity; that is, transposition events mediating deletions or introductions of single or large blocks of genes could result in plasmid remodeling (106).
In the 1970s, small high-copy-number plasmids of E. coli were modified for use as core tools in recombinant DNA technology. The work of Cohen et al. firmly established plasmids as tools for DNA cloning (33). The development of stable plasmid vectors and the use of restriction enzymes and antibiotics as selective agents contributed to the development of DNA cloning in E. coli (63a). Subsequently, other cloning vectors with broad host range and compatibility were developed to suit the needs of research involving other organisms. These fundamental concepts of plasmid biology continue to be used today toward developing the molecular tools needed to study bacteria of emerging interest.
The first attempts to classify plasmids in the 1960s were derived from the discovery that certain plasmids could inhibit F-mediated conjugation when present in the same cell. Based on this property, plasmids were initially characterized as fi+ (fertility inhibition) or fi−. These categories reflected similarities in conjugal transfer systems, where fi+ plasmids were those that were able to inhibit the transfer of a similar conjugative system (191). Because of its limitations, this scheme was replaced by incompatibility typing in the 1970s. This classification scheme was based upon the finding that some plasmids could coexist in the same bacterial host cell, while others could not (63a). Coresident plasmids are defined as being incompatible when they share the same replication mechanisms, and over time, one of these plasmids will be lost by the host strain due to the unstable nature of their coexistence. Since the plasmid replicon type determines the Inc group, the terms “Inc” and “Rep” are used interchangeably to describe plasmid types (34). The work of Couturier et al. resulted in the development of an invaluable set of DNA probes specific for each known incompatibility group of the Enterobacteriaceae that allowed screening for replicon-associated regions that were unique to each known plasmid type (34). Based on these probes, others have devised PCR-based replicon typing schemes (25, 51, 67, 89, 107, 173), which identify the major Inc types occurring among the Enterobacteriaceae. Further refinements of replicon-based plasmid classification are likely, both because the growing volume of genome sequencing data is clarifying our understanding of plasmid structure and because novel plasmid groups continue to be identified (186). Although recent reviews have acknowledged that as many as 26 incompatibility groups are known, this number will undoubtedly continue to increase in the future (63) (Table 1).
Despite this ever-growing number of known plasmid replicon types, only a few have been associated with E. coli virulence. The vast majority of virulence-associated plasmids in E. coli belong to the F incompatibility group (91).
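The incompatibility rule described above can be illustrated with a minimal sketch. The Python snippet below treats each plasmid as a set of replicon types and flags a pair as potentially incompatible when the two sets overlap. The plasmid names and replicon assignments are hypothetical placeholders, and the shared-replicon rule is a deliberate simplification: real incompatibility behavior depends on finer replicon subtypes and replication control details that this sketch ignores.

```python
from itertools import combinations

# Hypothetical plasmids and replicon assignments, for illustration only.
PLASMIDS = {
    "pExampleA": {"RepFIIA"},
    "pExampleB": {"RepFIIA", "RepFIA"},  # a multireplicon plasmid
    "pExampleC": {"RepI1"},
}


def shared_replicons(a, b, plasmids=PLASMIDS):
    """Return the replicon types carried by both named plasmids."""
    return plasmids[a] & plasmids[b]


def potentially_incompatible_pairs(plasmids=PLASMIDS):
    """List plasmid pairs that share at least one replicon type."""
    pairs = []
    for a, b in combinations(sorted(plasmids), 2):
        common = shared_replicons(a, b, plasmids)
        if common:
            pairs.append((a, b, sorted(common)))
    return pairs


if __name__ == "__main__":
    for a, b, common in potentially_incompatible_pairs():
        print(f"{a} and {b} share {', '.join(common)}")
```

In this toy data set, only pExampleA and pExampleB are flagged, because both carry a RepFIIA replicon. A plasmid with more than one replicon, such as pExampleB, can overlap with several groups at once, which is one reason replicon-based classification is not always clear-cut.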
In an effort to find an alternative to replicon-based plasmid typing, Francia et al. and Garcillan-Barcia et al. recently proposed the use of conjugative relaxases as a plasmid-typing tool (55, 65). There are several potential advantages to using this typing scheme over traditional replicon-based typing. First, it is more extensive, covering a wider range of plasmids. Traditional plasmid-typing schemes have been limited to replication genes of the known incompatibility groups, although these groups are only the tip of the iceberg with regard to plasmid diversity. Relaxase-based typing can theoretically be applied to any plasmid with either a mobilization region or a type IV secretion system (T4SS). Also, most plasmids contain only one relaxase, whereas plasmids frequently contain multiple plasmid replicons (65). Finally, relaxases have been shown to evolve congruently and are thus stable markers of plasmid evolution. They were hypothesized to be under less evolutionary pressures than replication regions (65). Relaxase-based plasmid typing is still in its infancy but holds promise as a future classification schema. Since replicon-based typing is still the most widely used method for plasmid typing and holds historical importance, it will be used here for our comparisons.
Pathogenic E. coli strains have been divided into different pathotypes based upon the diseases that they cause, the virulence factors that they possess, and their host of isolation (90, 91). The types of disease caused by E. coli are grossly classified as being either intestinal or extraintestinal. Six pathotypes of intestinal pathogenic E. coli strains are recognized (Table 2), all of which possess virulence-associated plasmids (90, 91). ExPEC strains are a broad group of pathogens that colonize the extraintestinal compartment of animal and human hosts, resulting in such diverse conditions as urinary tract infections (UTIs), meningitis, peritonitis, and sepsis (171). Like the intestinal pathotypes, many ExPEC strains also contain virulence plasmids. As we will see in this review, the vast majority of these E. coli virulence plasmids have evolved from a single plasmid backbone type through the acquisition of traits that are essential for and specific to their respective pathotypes. Some of these pathotype-specific virulence plasmids are highly conserved; others are extremely diverse.
Plasmid replication, mobility, and stability have been subjected to intense scrutiny since the early days of cloning. Indeed, there are many who have dedicated their entire scientific careers toward an improved understanding of plasmid biology. Today, there are journals (e.g., Plasmid) and scientific organizations (http://www.ispb.org) devoted to the study of plasmids and mobile genetic elements. The advent of rapid and affordable genome sequencing, coupled with sustained and intense interest in this field, has resulted in a great increase in our understanding of plasmid structure and biology. However, this rapid increase in knowledge comes at a cost, as many of these sequences get “lost in the shuffle” because of the large number of draft genome sequences in the NCBI database. Perhaps this is especially true for E. coli plasmids, as they are the most represented plasmids in the database. Here, the available virulence plasmid sequences from each E. coli pathotype are analyzed, and literature related to these sequences is reviewed in an effort to better understand E. coli virulence plasmid composition, evolution, and diversity.
ETEC strains are important agents of traveler's diarrhea in humans, neonatal diarrhea in production animals, and postweaning diarrhea in swine (39, 120, 185). Collectively, these diseases account for much human and animal misery and can be economically devastating in terms of lost productivity and health care costs. To cause disease, ETEC strains must first colonize the host's intestinal epithelium. Intestinal colonization is mediated by pili/fimbriae acting as adhesins and promoting adherence (39, 120, 185). Adhesins of human and porcine ETEC strains are encoded by plasmids, as are many other known ETEC virulence factors (39, 120, 185). These adhesins will be referred to throughout as colonization factors (CFs). A second defining trait of ETEC is enterotoxin production, and these toxins can also be plasmid encoded. Highlighted below are the major ETEC types involved in human and porcine disease, their CFs and other virulence factors, and the plasmids encoding these traits.
Arguably, human diarrhea caused by ETEC is the most common disease caused by pathogenic E. coli strains. It is estimated that there are more than 650 million cases of ETEC infection each year, resulting in nearly 800,000 deaths (147). The majority of these cases occur in underdeveloped countries. Thus, ETEC strains pose a significant threat to the indigenous populations of these countries as well as travelers and military personnel visiting them (185). Human ETEC strains are acquired via the ingestion or handling of contaminated food and water. Following infection, a rapid onset of watery diarrhea ensues, which is usually self-limiting but can cause life-threatening dehydration.
For human ETEC strains to cause disease, they must attach to the small-intestinal epithelia, replicate on mucosal surfaces, evade host defenses, and cause damage to the host (185). Human ETEC strains produce either heat-labile enterotoxin (LT), heat-stable enterotoxin (ST), or both. LT and ST traits occur uniformly among human ETEC isolates, with about 35% of isolates expressing ST, 35% expressing ST and LT (ST-LT), and the remainder expressing only LT (64). Greater than 90% of the strains expressing ST-LT have a CF, while significantly fewer strains expressing ST or LT alone also have CFs. Human ETEC strains belong to a large number of serogroups, a typing scheme based upon O-antigen cross-reactivity. The most commonly occurring ETEC serogroups are O6, O8, O25, O78, O128, and O153. These comprise ~60 to 70% of the isolates examined worldwide, with the remaining 30 to 40% belonging to a large number of different serogroups (195). While diversity does exist with regard to serogroup, it appears that a certain chromosomal background is required for an ETEC strain to be capable of causing disease.
Perhaps more important to ETEC strains than their chromosomes, however, are the plasmids that they possess. These plasmids encode CFs of ETEC strains, their toxins, and other adhesins. ETEC strains attach to the intestinal mucosa via CFs, which are proteinaceous structures with a high specificity for the host intestinal epithelia. Gaastra and Svennerholm described the broader human ETEC CF groups in an excellent review article (64). It is thought that CFs are involved in the initial attachment of ETEC to the host cells and that other ETEC virulence factors strengthen this attachment and promote tissue invasion (64). Human ETEC CFs can be either plasmid or chromosome encoded (185), typically by a polycistronic operon that includes the fimbrial subunit genes, chaperones, and ushers. However, the majority of human ETEC CFs are plasmid encoded and appear to have been horizontally acquired via flanking ISs and transposons. These CFs have undergone extensive evolutionary modification, resulting in a number of genetic variants. In fact, there are more than 20 known human ETEC CFs that are genetically distinct, with many also possessing distinct serological properties, further demonstrating that these CFs have the ability to move and rapidly evolve (64).
The diversity of the human ETEC CFs complicates our understanding of ETEC pathogenesis. In most of the studies that have characterized human ETEC populations for known CFs, substantial subpopulations of isolates that lacked known CFs were identified, suggesting that the extent of CF diversity is not fully appreciated and that novel CF types remain to be identified (195). Also, a complicating factor in an understanding of the role of CFs in ETEC pathogenesis is the evolving nature of CF nomenclature, with the result being that the literature can be confusing regarding the classification of the different human ETEC CFs. The original CF groups, known as CF antigen (CFA) groups I through IV, are the most prevalent CF types worldwide. In 1975, the first human ETEC CF was described because of its similarities to porcine CF type K88. This plasmid-encoded CF, named CFA/I, was isolated from archetypical human ETEC strain H10407 (47). Plasmids encoding these fimbriae are positively regulated by trans-acting regulatory genes on the same plasmids or other plasmids, such as rns and cfaR (14, 26). Evans and Evans subsequently described CFA/II, the second described human CF (46). CFA/II was originally described as being a distinct CF type with similarities to CFA/I (46). However, it was later observed that CFA/II actually consists of three antigenically distinct components, CS1, CS2, and CS3 (172). These can occur as CS3 alone or in combination with CS1 or CS2 fimbriae (64). Similarly, CFA/IV was initially discovered to be a single CF type, but it was later found that the CFA/IV antigen has multiple components, expressing CS6 alone or in combination with CS5. Since the discovery of CFA types I through IV, many other antigenically and genetically distinct CF types have been discovered. Also, the nomenclature has shifted: CFA was replaced by putative CF (PCF), and PCF was subsequently replaced by colonization surface antigen (CS) type (64). 
Some of the original designations have been replaced by newer nomenclature (e.g., CFA/III is now CS8, and the PCF types have all been replaced by CS types), whereas others have retained their original names (e.g., CFA/I) (64).
Plasmid pCoo was the first completely sequenced human ETEC CF-encoding plasmid (62). pCoo was isolated from human ETEC strain C921b-1, which is known to express CS1 and CS3 (143). Sequencing of this plasmid revealed that it was cointegrate in nature, containing regions sharing homology with RepI1 plasmid R64 from Salmonella enterica serovar Typhimurium (92, 94) and RepFIIA plasmid R100 from Shigella spp. (124). The composite regions of pCoo are separated by IS100-associated direct repeats, suggesting that a recombination event occurring between regions of two different plasmids resulted in the formation of the cointegrate pCoo (62). The key feature identified on this plasmid was the polycistronic coo operon, which contains four genes encoding the CS1 pilus. Additionally, pCoo encodes a predicted protein similar to the serine protease autotransporter EatA, which was implicated in ETEC virulence (141). The coo genes fall within the RepI1-like portion of pCoo, while eatA is within the RepFIIA-like portion. Interestingly, this cointegrate plasmid appears to be stable, since an analysis of clinical CS1+ isolates revealed that they all contain both cointegrate portions of pCoo (62).
ETEC strain H10407 is a prototypical CFA/I strain isolated from a patient in Bangladesh in 1973 (48). Oral challenge studies with this strain in human volunteers confirmed its ability to cause diarrhea (177). CFA/I is a single fimbrial structure that has been associated with ST enterotoxin on a single mobilizable plasmid (153). Strain H10407 (O78:K80:H11) was recently sequenced by the Wellcome Trust Sanger Institute. At the time of this review, its sequence was publicly available but not yet published (http://www.sanger.ac.uk/Projects/E_coli_H10407/). Like most ETEC strains, H10407 contains multiple plasmids possessing a variety of virulence-associated genes (Table 3). These plasmids include a 95-kb plasmid, pH10407_95, encoding CFA/I, EtpABC, and EatA, and a 66-kb plasmid, pH10407_66, encoding a conjugative transfer system and LT (Fig. 1). Both of these plasmids appear to possess a RepFIIA-like replication region. pH10407_66 contains an F-like plasmid transfer region, while pH10407_95 contains only remnants of this system.
Recently, the genome of archetypical ETEC strain E24377A was sequenced and analyzed (150). This strain belongs to the O139:H28 serotype and produces LT, ST, CS1, and CS3. Strain E24377A contains six plasmids, ranging in size from 5 to 80 kb. The CS1 antigen, as with strain E1392/75, is encoded on a single plasmid, pETEC_73. A comparison of pETEC_73 with pCoo (strain C921b-1) and pH10407_95 (strain H10407) reveals that both pCoo and pETEC_73 possess the CS1 operon within a RepI1 backbone region, while pH10407_95 contains CFA/I within a RepFIIA backbone (Fig. 1). The RepI1 backbone regions of pCoo and pETEC_73 include the RepI1 replication gene (repZ), genes encoding RepI1 stability (sopAB- and psiAB-like), and genes encoding the production of the R64-like thin pilus (Tra). The primary difference between pCoo and pETEC_73 is the presence of a RepFIIA-like module in pCoo that is absent from pETEC_73. Also, these plasmids differ slightly within the CS1-encoding region, as they are flanked by differing IS elements in the two plasmids. Thus, it appears that the CS1 operon was introduced into an ancestral RepI1 plasmid, and this occurred prior to pCoo's integration of RepFIIA backbone components and eatA.
CFA/I-encoding plasmid pH10407_95 does not possess any RepI1-related regions but does contain a RepFIIA-like replicon and backbone (Fig. 1). This plasmid contains eatA (141); the two-partner secretion locus etpABC, involved in adhesion to intestinal epithelia (54); and the genes encoding CFA/I. The presence of CFA/I on pH10407_95, which is a RepFIIA plasmid, suggests that the ETEC CS operons have been acquired on multiple occasions on multiple plasmid backbones. Like pCoo, pH10407_95 contains a truncated F transfer region. In fact, all three CF-encoding plasmids have incomplete transfer regions, leaving them to rely on other plasmids for mobilization and cotransfer (150, 153). This is a possibility for strain E24377A, which has apparently functional transfer regions on its coresident plasmids (Fig. 1). In addition to the core components and virulence factors mentioned above, many of the sequenced ETEC plasmids and other sequenced E. coli virulence plasmids contain a group II intron-encoded reverse transcriptase/maturase in proximity to fimbrial operons or other horizontally acquired genetic regions. Plasmids containing this region include pETEC_73, pETEC_74, pETEC_80, pCoo, pB171 (176), pUTI89 (31), p55989, and the K88+ and K99+ plasmids. This intron, E.c.I4, has apparently been inserted into IS629 and IS911, accounting for its mobility and distribution among E. coli and Shigella sp. strains (112). This intron is widely distributed among E. coli populations, but its specific function and importance are not known. Whatever the case, it appears that this intron was inserted into these plasmid backbones prior to the evolution of these pathotype-associated plasmids.
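The comparison of the three CF-encoding plasmids discussed above can be restated as simple set operations. The sketch below encodes only features named in the text for pCoo, pETEC_73, and pH10407_95. The feature labels are informal shorthand chosen here rather than annotation terms, the sets are not complete gene inventories, and the absence of a label means only that the feature is not mentioned for that plasmid in this section.

```python
# Features as described in the text; labels are informal shorthand.
FEATURES = {
    "pCoo": {
        "RepI1 backbone", "RepFIIA backbone", "CS1 operon",
        "eatA", "incomplete transfer region", "E.c.I4 intron",
    },
    "pETEC_73": {
        "RepI1 backbone", "CS1 operon",
        "incomplete transfer region", "E.c.I4 intron",
    },
    "pH10407_95": {
        "RepFIIA backbone", "CFA/I operon", "etpABC",
        "eatA", "incomplete transfer region",
    },
}


def shared(*names):
    """Features listed for every one of the named plasmids."""
    return set.intersection(*(FEATURES[n] for n in names))


def unique_to(name):
    """Features listed for one plasmid and for none of the others."""
    others = set().union(*(f for n, f in FEATURES.items() if n != name))
    return FEATURES[name] - others


if __name__ == "__main__":
    print("pCoo / pETEC_73:", sorted(shared("pCoo", "pETEC_73")))
    print("pCoo / pH10407_95:", sorted(shared("pCoo", "pH10407_95")))
    print("all three:", sorted(shared(*FEATURES)))
    print("only pH10407_95:", sorted(unique_to("pH10407_95")))
```

With these inputs, every feature listed for pCoo is also listed for one of the other two plasmids, which mirrors its description as a cointegrate of RepI1-like and RepFIIA-like portions.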
Despite the known genetic diversity of human ETEC plasmids with regard to CF types, only a few of these plasmids have been sequenced. Therefore, our full understanding of the evolution of these plasmids and its impact on virulence capabilities remains to be elucidated. Future comparative genomic studies will aid in our understanding of this dynamic and interesting group of plasmids that have relevance to human and animal health.
ETEC strains are routinely isolated from cases of diarrheal disease occurring in neonatal and postweaning pigs. These diseases account for significant losses to producers worldwide (39). Interestingly, porcine ETEC CFs are distinct from human CFs, and the binding specificities of animal and human ETEC strains are often quite different (120). However, both human and porcine ETEC strains possess ST and LT together with their fimbrial adhesins (CFs). In addition to these traits, porcine ETEC strains may also harbor toxin-encoding genes such as EAST1 (120). All of these traits can be plasmid encoded. Porcine ETEC strains are perhaps the best-studied ETEC type, and there are several commonly occurring CFs implicated in ETEC-caused porcine disease. Porcine ETEC strains differ among themselves in the CF types that they contain, which relates to the age of the animal that they infect (39).
Neonatal diarrhea in piglets is caused primarily by ETEC strains possessing K99 fimbriae (also known as F5 fimbriae) (39). The K99 antigen, encoded by a ~78-kb conjugative plasmid, was identified by Smith and Linggood (169). The K99 antigen is expressed as mannose-resistant fimbriae of ~9 nm in diameter (39). K99 fimbriae mediate bacterial binding to the small intestinal glycolipid ganglioside of newborn pigs. The K99 trait is transmissible, and these plasmids have been found to be highly conserved among K99+ isolates with regard to size and restriction patterns (81). The K99 operon (fanABCDEFGH) includes positive regulators, a major pilin subunit, an usher, a chaperone, and pilus assembly and elongation genes (5, 97, 154, 163). The expression of K99 is temperature dependent, inhibited at lower temperatures, and activated at body temperature (187). Bradley demonstrated that K99+-encoding plasmids possess a repressible conjugation system similar to that of the F plasmid (17), and Harnett and Gyles showed that the K99 and STa genetic machineries were linked to a single plasmid of ~80 MDa (70). Conversely, later work by Harnett and Gyles showed that STa and STb occurred on heterogeneous plasmids among porcine ETEC strains independent of the K99 plasmid (71).
Currently, no completed K99 plasmid sequences are available. However, draft sequencing has been performed on the plasmid content of a porcine ETEC K99+ isolate (http://www.umn.edu/~joh04207). In this isolate, the K99 operon was located on a conjugative plasmid containing the RepFIA and RepFIIA replicons and the porcine attaching-and-effacing (A/E)-associated (paa) gene (3). This strain also harbored a separate RepI1 plasmid (our unpublished results). These results are in agreement with the results of a previously reported replicon typing study using K99+ isolates (107) and a pilus typing study (17).
987P, or F6, fimbriae are also found in porcine ETEC strains that cause neonatal diarrhea in pigs, with 987P acting to mediate adhesion to intestinal cells (38). 987P pili are subject to phase variation, and this is dependent on growth conditions (80, 121, 122). The operon encoding 987P fimbriae can be located on plasmids or on the bacterial chromosome, although this has been subject to debate (27). Plasmids containing the 987P operon range in size from 35 to 40 MDa (159). The 987P fimbrial gene cluster contains eight genes, fasABCDEFGH, adjacent to a Tn1681-like transposon containing genes encoding the heat-stable enterotoxin STIa (160). No 987P-encoding plasmids have been sequenced, and little is known regarding the genetic makeup of these plasmids.
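Plasmid sizes in this section are quoted both in kilobases and in megadaltons (MDa), which makes direct comparison awkward. As a rough aid, the sketch below converts MDa to kb, assuming an average mass of about 660 Da per base pair of double-stranded DNA. That average is a common rule of thumb rather than a value taken from the studies cited here, and the function name is ours.

```python
# Assumed average mass of one base pair of double-stranded DNA, in daltons.
DA_PER_BP = 660.0


def mda_to_kb(mass_mda, da_per_bp=DA_PER_BP):
    """Convert a plasmid mass in megadaltons to an approximate size in kb."""
    base_pairs = mass_mda * 1_000_000 / da_per_bp
    return base_pairs / 1000


if __name__ == "__main__":
    for mda in (35, 40, 80):
        print(f"{mda} MDa is roughly {mda_to_kb(mda):.0f} kb")
```

Under this assumption, the 35- to 40-MDa range reported for 987P plasmids corresponds to roughly 53 to 61 kb, and the ~80-MDa plasmid linked to K99 and STa corresponds to roughly 121 kb.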
Postweaning diarrhea in pigs is also caused by ETEC (120). This disease usually occurs within 1 week after weaning, resulting in decreased weight gain and often death. The classical ETEC type implicated in postweaning diarrhea is characterized by the possession of the K88 CF, but recently, ETEC types possessing the F18 CF and other novel CFs have emerged as porcine pathogens. Complicating the control of this disease is a rapid increase in the percentage of multidrug-resistant porcine ETEC strains, which is thought to be due to the extensive use of antibiotics as growth promoters and therapeutic agents (15, 41, 114). While vaccination strategies have been effective in preventing neonatal diarrhea, those aimed at protecting weaned pigs have had mixed results (49). These results are likely multifactorial due to difficulties in stimulating a protective mucosal response and the vast diversity of ETEC strains capable of causing this disease. Classically, however, the key virulence factors that have been implicated in ETEC-caused postweaning diarrhea are hemolysins and K88 or F18 fimbriae (39).
The K88 (F4) antigen was first described in 1961, the first CF of the porcine ETEC types to be discovered (137). K88+ ETEC strains are the most common etiological agents of neonatal and postweaning diarrhea in pigs, and these strains belong primarily to the O8, O45, O138, O141, O147, O149, and O157 serogroups (65). The K88 antigen is encoded by transmissible plasmids (136) and has been associated with the ability to utilize raffinose as a sole carbon source (170). Shipley et al. showed that the genes encoding K88 production and raffinose fermentation were linked to a ~75-kb nonconjugative plasmid (162). Some larger variants of the K88 plasmid also exist, which do contain a functional transfer region. K88 antigen has been associated with enteritis and edema disease in swine, and it was found that these strains possessed an enhanced ability to adhere to porcine intestinal epithelia (120). K88 fimbriae, which mediate adherence to the epithelial mucosa, are classified into three genetic variants based upon their antigenic regions, K88ab, K88ac, and K88ad, with K88ac being the most common variant found among porcine ETEC strains (49). These three variants all bind to carbohydrates or glycoconjugates present on intestinal epithelial cells, intestinal mucus, or red blood cells. However, they have different binding specificities for porcine tissue. The operon encoding the K88 CF includes 10 different genes (faeABCDEFGHIJ) (78). These include regulatory genes (faeAB), a major subunit (faeG), minor subunits (faeCFHIJ), and ushers and chaperones (faeDE). The mapping of this operon showed that it is surrounded by ISs, with multiple copies of IS1 separating faeA and faeB, the 5′ end of the operon being flanked by IS91, and the 3′ end being flanked by IS629 (78). Bradley demonstrated that K88+ plasmids possessed conjugation systems resembling those of the IncI1 incompatibility group (17). Mainil et al. demonstrated that many K88 plasmid-containing isolates possess an F-type replicon, but only F-type replicons were probed in this study, leaving the possibility that K88+ plasmids might belong to either the IncF or IncI1 group (108).
No completed sequence of a K88-encoding plasmid is currently available, but draft sequencing has been performed on plasmid preparations of K88ab+ and K88ac+ isolates (http://www.umn.edu/~joh04207). The K88ab and K88ac operons from these plasmids are highly similar to one another and are located on plasmids with a similar arrangement and a similar core backbone. These sequences are also identical to the K88ab operon described by Huisman et al. (78). These operons occur in close proximity to a raffinose fermentation operon, as previously described in the literature (162), and they appear to be located on a RepFIIA plasmid backbone. The virulence and accessory components of these plasmids have evolved in an IS-mediated fashion typical for this plasmid type.
As mentioned above, F18+ ETEC strains are important agents of porcine postweaning diarrhea. The F18 fimbrial adhesins, which are plasmid encoded, display a zigzag pattern when imaged with electron microscopy (123). In addition to their occurrence among porcine ETEC strains causing postweaning diarrhea, F18-encoding plasmids are also found among verotoxin-producing isolates implicated in edema diseases of pigs (29, 30, 37, 73). Typically, F18+ isolates also possess other plasmids encoding STa and STb. Olasz et al. studied prototypical F18ac+ porcine ETEC O147 strain Ec2173 (134). They found that the F18-encoding genes fedABCEF appear to be located on a 200-kb transmissible plasmid, whereas STa and STb were encoded on a separate plasmid. The F18-encoding plasmid in this study, pF18, was also found to contain a hemolysin determinant. A later study involving a 120-kb plasmid known as pTC identified that it encoded STa, STb, and tetracycline resistance (134). Later analysis of pTC found that it contained a putative pathogenicity island (PAI) with a 10-kb fragment containing genes encoding the ST enterotoxins (52). Mainil et al. later identified the AIDA adhesin as being an additional component of the F18 plasmid (110), and Fekete et al. demonstrated that F18-encoding plasmids possessed a single known plasmid replicon, RepFIC (51). Those authors also observed that variants of the F18 adhesin exist: F18ab-encoding plasmids, which were highly variable in their sizes within a population examined, and F18ac-encoding plasmids, which were found to be of a more consistent size.
While no completed F18-encoding plasmid sequences are available, draft sequencing has been performed on the plasmid complement of an F18+, multidrug-resistant ETEC strain, UMNVDL (http://www.umn.edu/~joh04207). Within this strain's plasmid complement are at least seven plasmids, ranging in size from ~1 to over 120 kbp. The largest plasmid sequenced contained drug resistance determinants on a Rep1 plasmid backbone. This plasmid contains a sul3-associated class 1 integron with adjacent genes encoding resistance to macrolides, streptomycin-spectinomycin, mercury, silver, and quaternary ammonium compounds similar to the novel region described by Liu et al. (103). A second plasmid harbors the sepA gene, which was previously implicated in porcine postweaning diarrhea (66). This plasmid contains a RepFII-type plasmid backbone with an intact F transfer region. A third plasmid of at least 60 kb in size has a RepFIIA/RepFIC-like backbone and contains the F18 fimbrial operon. Additionally, this plasmid possesses a hemolysin operon and the aidA adhesin gene, which was shown to play a role in porcine ETEC pathogenesis (152). Data from the sequencing of these plasmids agree with previous observations regarding plasmid sizes, numbers, replicon types, and virulence factors of F18+ porcine ETEC strains. The completion of these and additional F18-related plasmid sequences will aid in our understanding of essential components of these strains.
EAEC strains are the most recently described of the E. coli intestinal pathotypes (77). These bacteria were first described by Nataro et al. in 1987, based upon their distinct aggregative adherence phenotype, which is seen as a brick-like pattern when the bacteria adhere to cultured HEp-2 cells (129). EAEC strains are considered to be an emergent diarrheal pathotype implicated in traveler's diarrhea and affecting immunocompromised children in developing countries (77). In fact, EAEC strains are second only to ETEC strains as the most common agent of traveler's diarrhea. It is thought that food and water are the most likely means of transmission (77). Epidemiological studies involving this pathotype have demonstrated that EAEC virulence is heterogeneous, complex, and likely dependent on multiple bacterial factors and host immune status (126). EAEC pathogenesis is thought to involve three primary steps. First, the bacteria adhere to the intestinal mucosa using aggregative adherent fimbriae (AAF). Second, the bacteria produce a mucus-mediated biofilm on the enterocyte surface. Finally, the bacteria release toxins that affect the inflammatory response, intestinal secretion, and mucosal toxicity (77). Aspects of each of these steps involve plasmid-encoded traits.
A primary virulence factor of EAEC is the one conferring the aggregative adherence phenotype (72). This trait was found to be associated with AAF (127) and is localized to a 55- to 65-MDa plasmid, termed the “pAA plasmid” (129). Like ETEC CFs, allelic variants of AAF have been identified. AAF from prototypical EAEC strain 17-2 (127) is genetically distinct from AAF from prototypical strain O42 (126), and their respective allelic variants are named types AAF/I and AAF/II. Other allelic variants of AAF have been described, including AAF/III from prototypical strain 55989 (9) and AAF/IV from strain C1010-00 (16). All the identified AAF allelic types appear to be plasmid encoded, and most of the strains analyzed tend to possess only a single AAF allelic type (72). AAF genes are regulated by an AraC-like transcriptional activator, AggR, and strains containing AggR have been termed “typical” EAEC strains (131). The AAF regulon contains both fimbrial genes and a regulator linked to one another on the pAA-type plasmid. There is evidence that AggR is a global regulator of EAEC virulence, as it exhibits effects on a number of chromosomal virulence factors as well (125). The major AAF pilins regulated by AggR include aggA (AAF/I), aafA (AAF/II), and agg3 (AAF/III) (9, 35, 131). AggR also regulates the expression of aap, a dispersin that is highly prevalent among EAEC isolates and facilitates the movement of EAEC across the intestinal mucosa for subsequent aggregation and adherence (77). This dispersin is exported out of the EAEC cell via the antiaggregation protein transporter system, encoded by the genes aatPABCD (8). This ABC transporter system is highly prevalent among EAEC populations, highly conserved, and regulated by AggR (77). While few studies have involved large numbers of EAEC isolates, recent work by Jenkins et al. found that two groups of EAEC exist based upon gene clustering.
They are distinguished by the presence or absence of genes encoded on plasmid pAA and en bloc sets of genes located on genomic islands near the pheU and glyU loci (83). The definition of “typical” versus “atypical” EAEC strains has thus been supported by such results, with typical EAEC strains possessing pAA-associated genes and certain chromosomal islands, apparently coinherited.
The EAEC plasmids also encode toxins such as the plasmid-encoded toxins Pet and EAST1 (45). Pet appears to belong to the serine protease autotransporter family and has been shown to confer cytoskeletal rearrangements, suggesting a role for Pet in EAEC pathogenesis (24, 188). EAST1 has been found to activate guanylate cyclase, resulting in ion secretion (128). However, relatively few EAEC strains actually possess the genes encoding Pet and EAST1, so their role in EAEC pathogenesis may be limited (36).
Three EAEC plasmids have been completely sequenced: pO42, belonging to AAF/II+ strain O42; 55989p, belonging to AAF/III+ strain 55989; and pO86A1, containing a novel AAF-like operon. All three of these plasmids are F-type plasmids with stability, maintenance, and transfer regions (Fig. 2). Plasmid 55989p is considerably smaller than plasmids pO42 and pO86A1, which is due to truncations in the F transfer region. This plasmid also differs from pO86A1 and pO42 in that it contains a RepFIC replication region instead of RepFIIA, although all three plasmids also contain a second replication region known as RepFIB (Fig. 2). All three plasmids encode their respective AAF types, and each contains the AAF regulatory gene aggR. While the AAF types possess considerable genetic diversity, aggR is generally highly conserved among the plasmid sequences available. A phylogenetic comparison based upon a nucleotide alignment of available aggR sequences revealed that aggR genes from AAF types I and III appear to be most closely related, whereas other AAF types are more divergent (Fig. 3). The AraC-type transcriptional regulator rns of human ETEC plasmid types also shares nucleotide similarity with aggR (26).
The features common to all three sequenced EAEC plasmids are the AAF operons, aggR, and aatPABCD (Fig. 2). In all three plasmids, these sequences are present on a RepFIB/FIIA-type backbone. Each of these plasmids also has unique regions not present in the other two sequenced plasmids, including the pet gene in pO42, the ipd gene in pO86A1 encoding an extracellular serine protease, and the Eit iron transport system in pO42 (Fig. 2). The acquisition of Eit by pO42 is particularly interesting because it was previously found only within ExPEC ColV and ColBM plasmids on a RepFIB/FIIA plasmid backbone (87, 88). Although the EAEC plasmids share a common plasmid backbone and core EAEC-associated virulence genes, the gross genetic composition and synteny of these three plasmids are quite different from one another. This would suggest either that a significant amount of gene shuffling and rearrangement has occurred since the introduction of their virulence-associated module or that this module has been introduced on different occasions.
EIEC and Shigella strains are known for their ability to cause shigellosis or bacillary dysentery in human hosts (140). Worldwide, this disease is estimated to cause nearly 1 million deaths per year from over 150 million cases (95). EIEC strains are so named for their ability to invade HeLa cells, and in vivo, they can infect human mucosa of the colon and invade M cells, macrophages, and epithelial cells. These interactions with the intestinal mucosa can result in a watery diarrhea that may contain mucus and blood.
EIEC and Shigella strains are characterized by their possession of a large plasmid, pINV, which encodes the ability to invade host cells (140). The Inv plasmid is perhaps the most mosaic of the E. coli virulence plasmids; in fact, nearly one-third of the Inv plasmid encodes IS elements. Previous analysis of the first two sequenced Inv plasmids revealed a 30-kb region responsible for entry into epithelial cells (22). This region included transcriptional activators (virB and mxiEab), effectors (ipaADCB, ipgB1D, and icsB), chaperones (ipgACE and spa15), components of the needle complex (mxiGHIJMD), and inner membrane protein-encoding genes (mxiA, spa24, spa9, spa29, and spa40). While some other components of the pInv type III secretion system are scattered throughout the plasmid, the expression of Inv-encoded virulence factors is globally regulated by VirB and MxiE (22, 100).
Multilocus sequence analysis of Inv plasmids was previously performed by using the ipgD, mxiA, and mxiC genes (98). This analysis resulted in the separation of the Inv plasmid into two distinct but closely related clusters, and the two Inv plasmid types identified from these clusters exhibited different incompatibility properties, even though they all contain the same RepFIIA plasmid replicon. This is likely reflected in the historical difficulties in classifying the IncFIIA incompatibility group and its known genetic diversity (156). Recent genome sequencing efforts have resulted in the completed sequences of eight Inv-type plasmids, and these results further underscore the genetic diversity of these plasmids and the presence of a variety of Inv evolutionary intermediates (Fig. 4). Overall, the Inv plasmids appear to have evolved via a series of deletion/acquisition events, with a high level of nucleotide homology (>95%) within their conserved genetic regions. The lack of polymorphic sites within these regions suggests a recent ancestry among these plasmids. Based upon comparative analysis of the eight sequenced Inv plasmids, the core components of these plasmids lie within syntenic blocks of DNA containing a RepFIIA-like plasmid replicon and the ospB, ospD2D1, sopAB, ospC1D3, ipaH9.8, and traI genes (Fig. 4). Gross comparisons and sequence alignments of these plasmids suggest that the Shigella Inv plasmids have evolved from an EIEC Inv plasmid predecessor via a series of deletions and additions of colinear blocks of DNA (Fig. 4 and 5). It appears that the EIEC Inv ancestral plasmid contained a functional F-type transfer region, the sopAB stability genes adjacent to a group II intron-encoding reverse transcriptase, and a RepFII-like replicon.
All of the sequenced Shigella Inv plasmids contain similar remnants of the F transfer region, including its flanking regions and a truncated traI, suggesting that a deletion of this region occurred after its introduction into Shigella species. In addition to an intact F transfer region, EIEC p53638_226 also contains a fimbrial operon and a phosphoglycerate transport (Pgt) system that are not present in the Shigella Inv plasmids, suggesting that these elements were later acquired by EIEC plasmids or lost by Shigella plasmids. p53638_226 also contains a genetic region, bounded by ISs, encoding iron- and temperature-regulated sensitivity to colicin Js and the enterotoxin SenB (165). This region has been retained by some of the Shigella Inv plasmids but lost in others (Fig. 4).
In addition to the loss of portions of the F transfer region, Shigella plasmids appear to have evolved from an EIEC plasmid ancestor through the acquisition of type III secretion system components. Of the Shigella Inv plasmids sequenced, pSF5 and pWR501 are most closely related to one another based upon alignments of conserved gene sequences (Fig. 5). However, pWR501 contains nearly 90 kb of genetic information more than pSF5, including ipaH2.5, shET2, ipaJ, virB, ipaADCB, ipgCB1ADEF, mxiGHJKNLMEDCA, virAG, and ushA. Many of these genes are components of the pInv type III secretion system. Falling in between the apparent minimal Shigella Inv plasmid pSF5 and Shigella Inv plasmids pCP301 and pWR501, which contain the largest genetic loads, are intermediates characterized by genetic rearrangements, additions, and deletions (Fig. 4). Data from sequence analysis of conserved plasmid genes (Fig. 5) are congruent with data from global comparisons of the sequenced Inv plasmids (Fig. 4); that is, pCP301, pWR501, and pSF5 have similar arrangements of colinear DNA segments, even though pSF5 lacks regions that pCP301 and pWR501 have apparently acquired. In contrast, pSB4_227 and pBS512_211 have similar genetic arrangements that are different from those of pCP301, pWR501, and pSF5, suggesting an independent evolution. pSs046 is divergent from both of the above-described clusters, although it has an arrangement resembling portions of pCP301 and pWR501. Overall, the Inv plasmids have undergone a stepwise evolution characterized by additions and deletions of large blocks of DNA within an apparently short evolutionary time frame. This evolution has been shaped by an abundance of mobile elements within this plasmid type.
EHEC strains are a subset of Shiga toxin-producing E. coli (STEC) strains responsible for hemorrhagic colitis (HC) and hemolytic-uremic syndrome (HUS) in humans (197). Although E. coli strains of several serogroups are members of the EHEC pathotype, E. coli strains of the O157 serogroup are responsible for the most severe cases of these diseases (91, 128). EHEC strains are typically known for their ability to produce Shiga toxin and to induce A/E lesions in the host's gut epithelium. While these strains are pathogenic for humans, they can reside peacefully as reservoirs of infection in many production animal types including cattle, swine, and poultry (197). Disease outbreaks in humans are usually associated with the ingestion of some type of food product such as undercooked beef, fresh vegetables, unpasteurized milk and cider, and salami. While the annual health care burden associated with EHEC-caused infections is not overwhelming ($0.3 to $0.7 billion per year in the United States), the severity of the disease is alarming. In young children and the elderly particularly, this disease can progress to HUS, a development that is accompanied by a dramatic increase in rates of morbidity and mortality (197).
The primary virulence determinants of EHEC strains are chromosomally encoded. These include a number of variants in terms of the Shiga toxin and the locus of enterocyte effacement PAI encoding A/E lesion production. However, plasmids may play an important role in the pathogenesis of O157-caused disease. Plasmid pO157 is found in 99 to 100% of clinical O157:H7 isolates from humans (102, 138, 151). However, the role of pO157 in EHEC pathogenesis has not been clearly defined. Some reports have correlated pO157 with hemolytic activity and adherence to intestinal epithelial cells, but the absence of a reliable model of human infection has hindered the progress of our overall understanding of pO157 related to EHEC pathogenesis (23).
pO157, from U.S. HC outbreak-associated strain EDL933, was the first such plasmid sequenced (23). The authors of that study found that pO157 was a 92-kb, F-like plasmid containing previously identified EHEC virulence factors including a hemolysin operon (ehx), a type II secretion system (etpC to etpO), an extracellular protease (espP), and a toxB homolog. Shortly after the EDL933 pO157 sequence was published, a second pO157 sequence from a strain implicated in an outbreak in Japan was published (111). This plasmid was found to be highly similar to pO157 from strain EDL933, differing essentially only at the single nucleotide polymorphism (SNP) level. Subsequent sequences of multiple pO157 plasmids have been obtained through genome sequencing efforts; these sequences further support initial comparisons suggesting that the pO157 plasmids are essentially identical. More sensitive means have subsequently been applied to study the evolution of plasmid pO157. By using pO157 from the Sakai strain as a reference for the resequencing and mapping of pO157 from multiple strains, Zhang et al. found that differences between pO157 plasmids ranged from few to many at the SNP level (198). Such results reinforce the idea that SNP analysis can be applied on a plasmid genome scale as a more sensitive means for examining the evolution of closely related plasmid genomes.
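The kind of SNP-level comparison described above can be illustrated with a minimal sketch. It assumes that two plasmid sequences have already been aligned to equal length (for example, by mapping reads to a reference such as the Sakai pO157 sequence); the function name `find_snps` and the toy sequences are hypothetical and are not taken from Zhang et al. or from any real plasmid.

```python
def find_snps(reference, query):
    """List single-nucleotide differences between two pre-aligned sequences.

    Both inputs must have the same length. Positions where either sequence
    carries a gap ('-') or an ambiguous base (anything other than A, C, G, T)
    are skipped, so only true base substitutions are reported.
    Returns a list of (1-based position, reference base, query base).
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    snps = []
    for pos, (ref_base, qry_base) in enumerate(
            zip(reference.upper(), query.upper()), start=1):
        if ref_base in "ACGT" and qry_base in "ACGT" and ref_base != qry_base:
            snps.append((pos, ref_base, qry_base))
    return snps


# Toy, invented sequences standing in for two aligned plasmid fragments.
reference_fragment = "ATGGCGTTAACC"
query_fragment = "ATGGCATTA-CC"
print(find_snps(reference_fragment, query_fragment))
```

Counting such positions across whole plasmids gives a simple pairwise distance; because the plasmids share almost all of their gene content, this is the level at which they can still be told apart.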
While O157:H7 strains have classically been characterized by their inability to ferment sorbitol, sorbitol-fermenting O157:H− strains have recently emerged as important etiological agents of diarrhea and HUS (21). Brunder et al. published the first plasmid sequence from a sorbitol-fermenting O157:H− isolate (20). This plasmid, pSFO157, was significantly larger than pO157, at 121 kb in size. This size difference was due primarily to the presence of an intact F transfer region and sfp fimbriae in pSFO157 (absent in pO157), while pSFO157 was found to lack the katP, espP, and toxB genes present in pO157 (Fig. 2). Similar to the EIEC/Shigella plasmids, it was hypothesized that pSFO157 is an ancestor of pO157, which has evolved via reductive evolution involving a loss-of-transfer function (20). It is also thought that the acquisition of these plasmids played a major role in the evolution of the hypothesized ancestral O55 strain, A3, to the extant O157 human pathogens (53, 193).
Large hemolysin-encoding plasmids are found in the majority of STEC isolates, including those not belonging to the O157 serogroup (133). For example, STEC strains belonging to the O113 serogroup can cause sporadic cases of disease that are indistinguishable from some O157-caused diseases. A study by Newton et al. compared the completed sequence of pO113, an STEC plasmid, to plasmids of O157 isolates (133). Between the two plasmid types, genes shared included the ehx (hly) hemolysin operon, espP, and iha (133). However, pO113 also contained genes sharing similarity with the IncI1 transfer region and several putative adhesins and toxins, but it lacked the toxB region found in pO157. Those authors also found that pO113 is highly conserved among the O113 STEC isolates that they examined. Furthermore, analysis of the ehxA virulence gene and the repA gene of the RepFIB replicon demonstrated the evolutionary divergence of plasmids pO157 and pO113 from a common ancestor. Interestingly, phylogenetic analyses using ehxA, a virulence factor, and repA, a replication gene, were incongruent. This could reflect differences in selective pressures between virulence genes and constitutive genes, but it also emphasizes the difficulties in examining the phylogeny of plasmid genomes, which contain a high degree of plasticity and mobility. Also interesting in this regard is work by Suzuki et al. examining the genomic signatures of all published genomes compared to the signature of pO157 using Mahalanobis distance (175). They used this approach to generate a hypothesis about the evolutionary history of pO157. It was determined that pO157 shared the closest genomic signature with the Yersinia pestis chromosome, closer than that of its natural host, the EHEC O157:H7 chromosome. pO157 also shared similar signatures with Y. pestis plasmid pCD1. Overall, these results led those authors to hypothesize that pO157 was acquired from Y. pestis, which is supported by experimental evidence demonstrating the in vivo transfer of genes and plasmids between E. coli and Y. pestis (75).
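As a rough illustration of how a genomic signature can be compared by Mahalanobis distance, the sketch below computes relative k-mer frequencies for a sequence and measures how far that signature lies from a set of reference signatures (for example, windows of a candidate host chromosome). This is a simplified sketch and not the procedure of Suzuki et al.: the choice of dinucleotides as the default signature, the function names, and the small `ridge` term added to keep the covariance matrix invertible are all assumptions made for illustration.

```python
from itertools import product


def signature(seq, k=2):
    """Relative frequency of every k-mer over ACGT, in a fixed order."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if word in counts:  # skip words containing N or other symbols
            counts[word] += 1
    total = sum(counts.values())
    return [counts[w] / total if total else 0.0 for w in kmers]


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def mahalanobis(x, samples, ridge=1e-6):
    """Mahalanobis distance of vector x from the distribution of samples.

    The sample covariance is regularized by adding `ridge` to its diagonal
    (an assumption of this sketch) because k-mer frequencies sum to one and
    the raw covariance matrix is therefore singular.
    """
    n, d = len(samples), len(x)
    mean = [sum(s[j] for s in samples) / n for j in range(d)]
    cov = [[sum((s[i] - mean[i]) * (s[j] - mean[j]) for s in samples) / (n - 1)
            + (ridge if i == j else 0.0)
            for j in range(d)] for i in range(d)]
    diff = [x[j] - mean[j] for j in range(d)]
    weighted = solve(cov, diff)  # cov^-1 * diff without forming the inverse
    return max(0.0, sum(a * b for a, b in zip(diff, weighted))) ** 0.5
```

In use, `samples` would hold the signatures of many windows from one candidate host genome and `x` the signature of the plasmid; repeating the calculation for each candidate genome and ranking the distances gives the kind of comparison described above. The ridge value would need tuning for real data, since it is not small relative to the variance of k-mer frequencies in long windows.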
EPEC is a common etiological agent of acute and persistent diarrhea in infants aged less than 5 years (28). EPEC strains were first described in the 1940s for their association with infantile diarrhea during summer outbreaks in developed countries. While this phenomenon has apparently subsided in developed countries, EPEC still presents a major problem in developing countries, where frequent outbreaks can have mortality rates approaching 30% (28). The EPEC pathotype was so named by Neter et al. in 1955 while describing primary intestinal pathogens not typically present in the feces of healthy individuals (132). The primary serogroups identified among EPEC isolates include O26, O55, O86, O111, O114, O119, O125, O126, O127, O128ab, O142, and O158 (28, 184). Signs and symptoms caused by EPEC infection include diarrhea, vomiting, fever, and malaise (28). EPEC strains are identified by their trademark A/E histopathology (32). In the gut, they can also cause a decrease in the number and size of microvilli and a loss of the mucosal surface layer. Another property exhibited by EPEC is localized adherence to human cell lines such as HEp-2 (130). Studies by Baldini et al. (6, 7) and McConnell et al. (115) demonstrated that localized adherence could be attributed to a 60-MDa plasmid, and these plasmids were thus named EPEC adherence factor (EAF) plasmids. By using human volunteer challenge studies, a wild-type EPEC strain was found to be significantly more likely to cause diarrhea than was its EAF plasmid-cured derivative, further suggesting a role for the EAF plasmid in EPEC pathogenesis (101). Subsequent studies have shown that the EAF plasmid is responsible for localized adherence but is not required for the formation of A/E lesions (93). Later work identified a chromosomal PAI, the locus of enterocyte effacement PAI, responsible for the production of A/E lesions (116). 
Two categories of EPEC have been described: typical EPEC, containing the EAF plasmid, and atypical EPEC, lacking this plasmid. It is currently not completely understood if atypical EPEC strains are less virulent than typical EPEC strains or if they possess an alternative set of virulence factors accounting for the absence of the EAF plasmid.
Three completed EAF plasmid sequences are publicly available. The first plasmid sequenced, pB171, was described by Tobe et al. (182). This plasmid was isolated from EPEC strain B171-8, belonging to the O111 serogroup and implicated in human diarrhea. This plasmid was found to be 68,817 bp in size and to possess the RepFIB and RepFIIA replicons. Within this plasmid was a cluster of genes encoding bundle-forming pili (bfp genes), which are responsible for the localized adherence patterns exhibited by EPEC, and the perABC genes, involved in the transcriptional activation of bfp and other chromosomally encoded virulence factors. The sequencing of a second plasmid, pMAR7, revealed a high degree of colinearity between pB171 and pMAR7 (19) (Fig. 2). pMAR7 was derived from pMAR2 from EPEC strain E2348/69 belonging to the O127:H6 serotype. The primary differences between pMAR7/pMAR2 and pB171 are the presence of an intact F transfer region in pMAR7/pMAR2, which is completely absent from pB171, and 16 ORFs in pB171 not present in pMAR7/pMAR2, which are mostly mobile elements (19). pMAR2 was sequenced as a part of a recent genome sequencing effort involving strain E2348/69, and pMAR7 and pMAR2 were reported to be identical outside of three SNPs and two single-base insertions/deletions within intergenic regions (79). Although the sequenced EAF plasmids shared very strong similarities with one another, further plasmids from isolates representing different EPEC clonal types should be examined to better understand the overall evolution of the EAF plasmid.
Some E. coli strains can cause disease in extraintestinal locations. Like the intestinal pathotypes, ExPEC strains carry a distinct set of virulence genes enabling their extraintestinal life-style and pathogenicity (84). Infections due to ExPEC carry with them a high cost in terms of both human and animal morbidity and mortality and annual costs to the human health care system and to animal owners. A great deal of work has focused on the virulence mechanisms and molecular characterization of ExPEC, leading to a better appreciation of the great diversity within this pathotype. Within this group are ExPEC strains that are adapted for life in the urinary tract (uropathogenic E. coli [UPEC]), bloodstream and meninges of neonates (neonatal meningitis E. coli), or animal hosts such as birds (avian pathogenic E. coli [APEC]). Although we know much about the molecular mechanisms of chromosomally encoded virulence factors of human ExPEC (85), ExPEC virulence factors can also be carried on plasmids. In particular, colicin-encoding plasmids, particularly ColV plasmids, have long been associated with ExPEC virulence (168). In addition to the ColV plasmid, Smith and Huggins identified a second plasmid type associated with invasive E. coli of human and animal origin known as the Vir plasmid (166-168). For an excellent and prescient review of ColV plasmids and their properties, see a review by Waters and Crosa (192). Here, a brief history of the ColV and Vir plasmids will be given, and the impact of genome sequencing on our understanding of these plasmid types will be examined.
ColV plasmids have a history in the scientific literature dating back to the discovery of the phenomenon of “principle V” by Gratia in 1925, who described a substance produced by a “colibacillus strain” capable of lysing other cells (68). Nearly 40 years later, Nagel de Zwaig demonstrated that the colicin V phenotype was transferable (119). It was later found that this transmissibility was linked to F-type plasmids and that these plasmids conferred a virulence phenotype (192). However, work by Quackenbush and Falkow (148) and Williams and Warner (194) demonstrated that colicin V itself, the namesake of the ColV plasmid, was not responsible for the virulence phenotype conferred by these plasmids. Their findings indicated that other traits encoded by ColV plasmids must be responsible for their contributions to virulence. Indeed, Binns et al. reported that iron acquisition mechanisms and serum resistance were linked to a ColV plasmid (11, 12). Iron acquisition in ColV-containing strains was initially attributed to the presence of the aerobactin siderophore system, and serum resistance was attributed to the presence of the iss and traT genes (12, 194), both of which Johnson et al. confirmed as occurring on ColV plasmids of APEC (86). Dozois et al. described an additional iron acquisition operon on ColV plasmids (42), and Provence and Curtiss described tsh, the first serine protease autotransporter described for the Enterobacteriaceae (146), which Dozois et al. later reported to be linked to ColV plasmids (42). In addition to ColV production, iron acquisition, and adherence, ColV plasmids have been associated with resistance to chlorine and disinfectants (74), growth in human urine (164), improved growth under acidic pH conditions (149), bacteriophage resistance (76), and the establishment of avian colibacillosis (164, 196) and murine septicemia, meningitis (2), and UTI (164).
In 2006, Johnson et al. published the first complete sequence of a ColV plasmid, pAPEC-O2-ColV, from an APEC isolate belonging to the O2 serogroup (88). Analysis of its sequence revealed that many of the virulence factors known to contribute to APEC pathogenesis were located within a plasmid-encoded PAI. This sequence was interpreted along with gene prevalence data in an effort to identify core ColV components. The results of that study confirmed the hypothesis set forth by Waters and Crosa (192), that “constant” and “variable” portions of the ColV plasmid PAI exist. The constant region contained the RepFIB replicon (145); the aerobactin operon (189); the Sit iron and manganese transport system (199); a putative outer membrane protease gene, ompT; an avian hemolysin gene, hlyF (118); a novel ABC transport system known as Ets (88); the salmochelin siderophore system (69); and iss (12). The variable portion of this PAI contained the temperature-sensitive hemagglutinin gene tsh (43) and another novel transport system known as Eit (88).
Since the completion of the sequence of pAPEC-O2-ColV, three additional ColV plasmid sequences have become available (Fig. 6). Like pAPEC-O2-ColV, pAPEC-1 was isolated from a prototypical APEC isolate, in this case one belonging to the O78 serogroup (117). pEcoS88 originated from an emergent O45:K1:H7 ExPEC strain causing neonatal meningitis (183). pCVM29188_146 originated from a Salmonella enterica subsp. enterica serovar Kentucky isolate from retail chicken breast (60). Despite the diverse sources from which these plasmids originated, their core genetic constitution remains remarkably conserved (Fig. 6). Similar to the EIEC/Shigella plasmids, it appears that ColV plasmids have evolved primarily via additions and deletions of large blocks of DNA, since the regions common to the plasmids share very high nucleotide similarities (>95%) and very few SNPs. As mentioned above, the core components of these plasmids are found surrounding the RepFIB replicon, and they include hlyF and ompT, salmochelin and iss, Sit and aerobactin, and a colicin-encoding region. These components are common to all sequenced ColV-type virulence plasmids and are highly syntenic. ColV plasmids also contain an F transfer region and regions encoding plasmid maintenance and stability (Fig. 6). In pAPEC-1 and pAPEC-O103-ColBM, the F transfer region is truncated but in a different manner in each plasmid. Remaining variations in ColV plasmid sizes (ranging from 80 to 180 kb in the literature) reflect other genetic additions or deletions (192).
Besides ColV, other colicin-encoding plasmids were previously identified among ExPEC strains. These plasmids were shown to encode colicins B and M but were not known to possess virulence factors (113, 158). In 2006, Johnson et al. described an APEC plasmid, pAPEC-O1-ColBM, encoding colicins B and M in combination with a PAI similar to that of pAPEC-O2-ColV (87). Such “ColBM” plasmids appear to have evolved from ColV plasmids, since the ColBM regions are embedded with remnants of a truncated ColV operon (Fig. 6). To date, four ColBM plasmids have been sequenced, including three from APEC strains (pAPEC-O1-ColBM, pVM01, and pAPEC-O103-ColBM) (Table 3) (87, 181) and one from an MDR E. coli strain isolated from a coastal environment (pSMS35_130) (Table 3) (61). Interestingly, three ColV/ColBM plasmids (pAPEC-O103-ColBM, pSMS35_130, and pCVM29188_146) have acquired MDR-encoding regions, conferring resistance to drugs such as tetracycline, sulfisoxazole, ampicillin, streptomycin, and trimethoprim (60, 61). The linkage of MDR-encoding regions with the ColV-encoded PAI is particularly disturbing, since it provides a means for the selection of highly virulent strains through the use of antibiotics. pCVM29188_146 harbors typical ColV-associated virulence genes but also contains Tn10 downstream of the F transfer region, an atypical region for accessory components (Fig. 6). pSMS35_130 was found to contain an MDR-encoding island encoding high-level resistance to at least six antimicrobial agents in total. While pSMS35_130 harbors an MDR-encoding region and hlyF, ompT, and sitABCD, it is an atypical ColV/ColBM plasmid because it lacks many of the other typical ColV PAI virulence factors, including iss, iroBCDEN, and the salmochelin and aerobactin operons. In contrast, pAPEC-O103-ColBM possesses both the typical ColV PAI virulence factors and an MDR island.
Still, both of these plasmids contain only minimal accessory components compared to the ColBM plasmids pAPEC-O1-ColBM and pVM01, which are from typical APEC isolates and contain a larger number of additional uncharacterized hypothetical genes (Fig. 6). While the ColV and ColBM plasmids might be considered a monophyletic group of plasmids with a highly conserved core genetic makeup, diversity within this group is rampant due to apparently frequent IS-mediated recombinational events. This is supported by our analysis of the eight sequenced ColV and ColBM plasmids, where none are identical with regard to gene content. As a result, the noncore components of these sequenced plasmids are highly heterogeneous and result in a large ColV plasmid “pangenome,” or a collective set of all genes present in ColV plasmids. Most of these accessory components appear to have been introduced into their respective plasmids via recombination events mediated by IS elements such as IS1.
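The distinction between core components, accessory components, and the pangenome can be expressed as simple set operations on per-plasmid gene lists. The sketch below is illustrative only: the function name `partition_pangenome`, the plasmid labels, and the gene assignments are invented placeholders and do not represent the actual content of any sequenced ColV or ColBM plasmid.

```python
def partition_pangenome(gene_content):
    """Split plasmid gene content into core, accessory, and pangenome sets.

    gene_content maps a plasmid name to an iterable of gene names.
    core      = genes present in every plasmid
    pangenome = genes present in at least one plasmid
    accessory = pangenome minus core
    """
    gene_sets = [set(genes) for genes in gene_content.values()]
    if not gene_sets:
        return set(), set(), set()
    pangenome = set().union(*gene_sets)
    core = set.intersection(*gene_sets)
    accessory = pangenome - core
    return core, accessory, pangenome


# Invented example data: plasmid labels and gene assignments are placeholders.
example = {
    "pExample1": ["repFIB", "hlyF", "ompT", "iss", "tsh"],
    "pExample2": ["repFIB", "hlyF", "ompT", "iss", "eitA"],
    "pExample3": ["repFIB", "hlyF", "ompT", "tetA"],
}
core, accessory, pangenome = partition_pangenome(example)
print(sorted(core))
print(sorted(accessory))
```

With real data, gene names would first have to be normalized across annotations (or replaced by ortholog cluster identifiers), since the same gene annotated under two names would otherwise be counted as accessory.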
Necrotoxigenic Escherichia coli (NTEC) strains are E. coli strains that produce cytotoxic necrotizing factor (CNF) (40). CNFs are strong toxins that cause tissue damage and host disease. NTEC strains are responsible for various diseases of humans and animals, including UTI, septicemia, and diarrhea (4, 13, 18, 39, 40, 135, 174). Thus, NTEC strains are both intestinal and extraintestinal pathogens. A subgroup of NTEC strains is known as NTEC-2, characterized by possession of the CNF variant known as CNF-2. The defining traits of NTEC-2 are encoded on a virulence plasmid. In 1974, Smith first described this virulence plasmid, which later came to be known as Vir (166). The Vir plasmid was described for its association with a surface antigen, ability to produce the Vir cytotoxin, and lethality in chickens. In 1980, Lopez-Alvarez and Gyles described the Vir plasmid in E. coli strains causing human and bovine septicemia (104). Lopez-Alvarez et al. also determined that the NTEC-2 Vir plasmid was transmissible to avirulent E. coli recipients and that it was F-like in nature (105). Later, it was shown that these plasmids encoded CNF-2, which was responsible for the cytopathology and lethality in mice caused by certain NTEC-2 strains (40). The Vir plasmid was also found to encode an F17-like fimbria known as F17b, although this fimbrial operon was not found on all cnf2+ isolates examined (10, 44, 109).
Previous work identified and sequenced some apparent core components of the NTEC-2 Vir plasmid, including regions encoding CNF-2 (139), type III cytolethal distending toxin (142), and F17b fimbriae (44). A completed sequence of an NTEC-2 Vir plasmid is available (unpublished data) (GenBank accession number CP001162). Analysis of this plasmid, pVir68, revealed that it is 138,362 bp in size and contains RepFIB and RepFIIA replicons. As expected, pVir68 contains cdtABC, cnf2, and the F17b fimbrial operon, all of which are identical to previously published sequences (Fig. 2). pVir68 also contains a functional F transfer region and plasmid stability and maintenance regions. Furthermore, this plasmid contains two previously unidentified regions: a putative novel fimbrial operon and the tibAC adhesin/invasin genes, shown elsewhere to induce bacterial aggregation and biofilm formation (161). Overall, the NTEC-2 Vir plasmid appears to have evolved in a typical fashion compared to other virulence plasmids of RepFIB/FIIA origin, with the acquisition of virulence factors via recombination events occurring within a basic RepFIB/FIIA backbone.
Gross plasmid comparisons can be used in combination with sequence analysis in an effort to understand how genomes have evolved. By using all of the sequenced RepFIB/FIIA plasmids, some insights can be achieved (Fig. 2, 7, and 8). First, all RepFIB/FIIA plasmids possess some core components, including their two respective replicons, the sopAB and psiAB regions, ensuring their stability, and the F transfer region, ensuring their self-transmissibility (Fig. 2). It is evident that two “genetic load” regions exist, directly upstream and downstream of the RepFIB locus. The upstream genetic load region tends to contain most of the acquired virulence factors for each plasmid type and tends to have the greater load, whereas the downstream genetic load region contains the stability regions, with most genetic acquisitions occurring in between RepFIB and psiAB. Exceptions to this observation are pO42, which has acquired much of its virulence-associated genetic load downstream of the F transfer region and upstream of the RepFIIA replicon, and pCVM29188_146, which has acquired Tn10 in the same location. Nevertheless, acquisitions within this region appear to be highly unusual. RepFIB/FIIA plasmids have the ability to acquire both virulence factors and MDR-encoding islands, as illustrated by plasmids pAPEC-O103-ColBM, pSMS35_130, pRSB107, and pCVM29188_146 (Fig. 2). The NCBI database also contains numerous examples of possible evolutionary intermediates of these plasmids. For example, p1658/97 (200) is a plasmid from a human clinical outbreak strain that contains the hlyF gene adjacent to the RepFIB region, an arrangement that has been found only for ColV and ColBM plasmids. However, this plasmid lacks other ColV-associated genes and instead contains a class 1 integron. A second example of a possible evolutionary intermediate is pRSB107, which was isolated from a sewage treatment plant (176). 
This plasmid contains the aerobactin operon, which is a typical ColV component, but lacks other ColV-associated genes and instead contains a large MDR-encoding island (Fig. 2). As the representation of F-type plasmids in the NCBI database increases, so will examples such as those highlighted above.
The underlying mechanisms controlling the apparently stable genetic makeup of these plasmids (i.e., the clearly defined genetic load regions and conserved core components) are not completely understood, nor is it completely understood what drives the evolution of the genetic load regions. Since these regions surround the RepFIB replication gene and its adjacent site-specific recombinase, both of which are conserved among these plasmids, it is possible that this region plays a role in such evolution. Perhaps the RepFIB site-specific recombinase acts upon IS elements present within these regions during plasmid replication and segregation. Also, some of the sequenced RepFIB/FIIA plasmids have far fewer IS elements than others, suggesting that the ancestral RepFIB/FIIA plasmid backbone lacked IS elements. Thus, these plasmids have been greatly affected by the acquisition of IS elements over time, which in turn introduced the virulence factors and resistance-encoding genes that now characterize these plasmid types. Also interesting is the loss of a functional F transfer region by a number of sequenced virulence plasmids. Intuition would suggest that these plasmids do not want to render themselves transfer deficient; thus, a truncation in the F transfer region would seem undesirable. However, when one considers that most wild-type E. coli strains harbor multiple plasmids that might contain more than one T4SS, the truncation of the F transfer region might be more easily explained. Perhaps multiple T4SSs in the same bacterium actually interfere with the conjugative process. In this case, it would be beneficial for the duplicate transfer regions to be deleted. This would result in the formation of plasmids with truncated transfer regions, still capable of conjugation and dissemination via the T4SSs of other plasmids within the host complement.
The alignment and phylogenetic analysis of the RepFIB repA gene show that it is quite conserved (Fig. 7). As expected, the ColV/ColBM plasmids cluster together with pRSB107 and p1658/97, both mentioned above as being possible ColV/ColBM intermediates. EHEC, EPEC, and EAEC plasmids also cluster with members of their own respective groups. Interestingly, EPEC plasmids are distinct from the other RepFIB/FIIA plasmids and actually cluster closest to outgroup plasmid pSLT from S. Typhimurium, suggesting that EPEC plasmids do not share a recent ancestry with other RepFIB/FIIA plasmids. An alignment of RepFIIA repA1 replication genes from the RepFIB/FIIA plasmids suggests that the addition of RepFIB to RepFIIA to form the RepFIB/RepFIIA plasmid type might have occurred on multiple occasions (Fig. 8). The resulting dendrogram from this alignment depicts RepFIIA and RepFIB/FIIA plasmids clustering together within multiple lineages, suggesting that plasmids within these lineages shared a common RepFIIA ancestor lacking the RepFIB replicon in each case and that RepFIB was subsequently acquired during multiple recombinational events. A second possibility is that of reductive evolution, where the shared ancestor contained both replicons and evolved via the loss of the RepFIB replicon by RepFIIA plasmids. However, the former hypothesis is better supported by a similar situation in the Salmonella virulence plasmids, with some plasmids containing both RepFIB and RepFIIA and others containing only RepFIIA (155). The evolution of the RepFIC replicon from RepFIIA, or vice versa, suggests that RepFIB was present on these plasmids prior to the change within the RepFIIA region. This can be observed for plasmid 55989p, where RepFIC resides in a location similar to that of RepFIIA, and RepFIB is also present (Fig. 2). 
However, the lack of a sequenced RepFIB+ RepFIC− RepFIIA− plasmid makes it difficult to draw conclusions on the overall evolution of this plasmid type.
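The idea of asking which sequences cluster closest to one another, including closest to an outgroup, can be illustrated with a pairwise distance calculation on aligned sequences. The sketch below uses a simple proportion-of-differences (p-distance) measure; this is a stand-in for illustration only and is not the alignment or tree-building method behind Fig. 7 and 8, which the text does not specify. The four short sequences are hypothetical toy data, not real repA or repA1 sequences.

```python
def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for base_a, base_b in zip(seq_a, seq_b):
        if base_a == "-" or base_b == "-":
            continue
        compared += 1
        if base_a != base_b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differences / compared


def nearest_neighbor(name, aligned):
    """Return (closest_name, distance) for one sequence in the alignment."""
    others = [(other, p_distance(aligned[name], seq))
              for other, seq in aligned.items() if other != name]
    return min(others, key=lambda pair: pair[1])


# Hypothetical toy alignment; names and sequences are assumptions.
toy_alignment = {
    "seq_1":    "ATGACCGTTA",
    "seq_2":    "ATGACCGTTG",
    "seq_3":    "ATGTCCGATA",
    "outgroup": "TTGTCAGATC",
}

for seq_name in toy_alignment:
    print(seq_name, "->", nearest_neighbor(seq_name, toy_alignment))
```

A real analysis would build a tree from a full-length alignment with an explicit substitution model; the nearest-neighbor lookup here only shows the kind of comparison that a statement such as "clusters closest to the outgroup" summarizes.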
With the ColV and ColBM plasmids, the presence of either colicin-encoding operon is incongruent with the alignments of their replication genes (Fig. 2, 5, 7, and 8); that is, multiple phylogenetic groups emerge based upon replication gene alignments that contain both ColV and ColBM plasmids. This supports the idea that these plasmids have arisen from a RepFIB/FIIA ancestral plasmid on multiple occasions. This is also supported by the findings by Christenson and Gordon showing that ColBM plasmids have arisen on at least three separate occasions (31). The coevolution of similar plasmid types on multiple occasions also provides further evidence that the possession of a particular set of core virulence factors provides the host bacterium with the tools that it needs to cause a specific disease type. Since there are fewer examples of EAEC, EHEC, and EPEC plasmids in the database, it is difficult to determine if they have evolved in a fashion similar to that of the ColV and ColBM plasmids. However, the examples available for these pathotypes thus far suggest that multiple plasmids from each of these pathotypes share a common ancestor (Fig. 7).
From a genetic backbone standpoint, the RepFIIA plasmids are more variable than the RepFIB/FIIA plasmids (Fig. 1). The core components of these plasmids still include the RepFIIA region, psiAB, and sopAB but in a less conserved manner. The sopAB and psiAB genes are not present on every sequenced RepFIIA plasmid, and their genetic arrangement is sometimes in a reverse orientation within the plasmid. Still, these plasmids generally contain these core elements and have a genetic load region between the RepFIIA replicon and psiAB. The larger number of sequenced RepFIIA plasmids provides us with a glimpse into the possible range of this genetic load, with plasmids ranging in size from 35 kb to greater than 100 kb. As with the RepFIB/FIIA plasmids, the RepFIIA plasmids appear to have evolved from an ancestral RepFIIA+ sopAB+ psiAB+ Tra+ plasmid that acquired IS elements, which subsequently facilitated the acquisition of virulence factors and drug resistance-encoding genes. Like the RepFIB/FIIA plasmids, some RepFIIA plasmids have truncated F transfer regions, rendering them apparently unable to self-transfer. Also worth noting is the presence of three plasmids in ETEC strain E24377A (pETEC_35, pETEC_74, and pETEC_80) with similar RepFIIA replication regions. The presence of multiple plasmids with similar replicons challenges the idea that they are incompatible with one another and should not coexist in a stable manner. While the stability of these plasmids in their wild-type host is not known, Rasko et al. noted that an analysis of strain E24377A suggests that ETEC is a “pathovar in flux” (150). Quite possibly, the plasmid sequences of strain E24377A in the NCBI database are simply a snapshot in time of these plasmids (or cointegrates thereof) in flux. This is particularly evident when E24377A's plasmids are compared to those of strains C921b-1 (pCoo) and H10407 (pH10407_95 and pH10407_66). 
pCoo forms a cointegrate between the RepFIIA and RepI1 plasmid backbones and contains eatA and the CS1 operon (Fig. 1), while pH10407_95 is a RepFIIA plasmid containing eatA, CS1, aatPABCD (typically found among EAEC plasmids), and CFA/I. In E24377A, CS1 is encoded on a RepI1 backbone, and eatA is encoded on a separate RepFIIA plasmid. Certainly, the RepFIIA plasmids are in constant flux, and recombination with their coresident plasmids is inevitable, as illustrated here. Given that there are more than 20 known CF types in human ETEC strains alone, it will be very interesting to see how additional sequenced ETEC plasmids compare to what is currently available.
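The observation about coresident plasmids with similar replicons can be expressed as a simple check: given the replicon types assigned to each plasmid in one host, list every pair that shares a type and would therefore be expected, under the classical incompatibility model, not to coexist stably. The sketch below is illustrative only. The three RepFIIA assignments follow the text; the fourth entry is a hypothetical placeholder added for contrast, and the function name is an assumption.

```python
from itertools import combinations


def shared_replicon_pairs(replicons_by_plasmid):
    """List coresident plasmid pairs that carry at least one common replicon type.

    replicons_by_plasmid maps a plasmid name to the set of replicon types
    assigned to it. Returns (plasmid_1, plasmid_2, shared_types) tuples.
    """
    pairs = []
    for first, second in combinations(sorted(replicons_by_plasmid), 2):
        shared = set(replicons_by_plasmid[first]) & set(replicons_by_plasmid[second])
        if shared:
            pairs.append((first, second, sorted(shared)))
    return pairs


# Replicon assignments for three plasmids of strain E24377A as described
# in the text; "hypothetical_plasmid" is a placeholder, not a real plasmid.
e24377a_plasmids = {
    "pETEC_35": {"RepFIIA"},
    "pETEC_74": {"RepFIIA"},
    "pETEC_80": {"RepFIIA"},
    "hypothetical_plasmid": {"RepI1"},
}

for first, second, shared in shared_replicon_pairs(e24377a_plasmids):
    print(f"{first} and {second} share {', '.join(shared)}")
```

Replicon type alone is a coarse proxy; similar but nonidentical replication regions may still be compatible, which is one reason the coexistence noted above is not necessarily a contradiction.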
Overall, the alignment of RepFIIA's repA1 genes from the sequenced plasmids distinguishes the EIEC plasmids from other E. coli virulence and MDR plasmids (Fig. 8). Typical MDR-encoding plasmids (pC15-1a and pO26-L) seem to be related to the prototypical MDR-encoding plasmid R100. As expected, p1658/97 seems to be a relative of the ColV and ColBM plasmids, since it contains hlyF and a RepFIB region most similar to these plasmids, and possibly represents a predecessor to such plasmids. ETEC plasmids belong to multiple lineages, further emphasizing their diversity and likely emergence on multiple occasions.
Since several E. coli virulence plasmids form cointegrates with RepI1 plasmid sequences or contain solely a RepI1 backbone, they should also be mentioned here. Fewer completed plasmid sequences belonging to the RepI1 group are available (Fig. 9). pSL476_91 is a 91.4-kb plasmid from an S. enterica subsp. enterica serovar Heidelberg isolate containing a functional transfer region and genes encoding ColIb activity and immunity. pCVM29188_101 is a 101-kb plasmid isolated from S. enterica subsp. enterica serovar Kentucky containing a transfer region, a composite transposon-like element (containing blaCMY-2-blc-sugE), and ColIb activity and immunity genes. Other plasmids closely related to the RepI1-ColIb plasmids according to RepI1 alignments are pCoo and pETEC_73, CS1-encoding human ETEC plasmids. These plasmids lack the ColIb genes but still possess a RepI1 backbone with Tra genes. Their similarity based upon the repZ gene alignment (RepI1) suggests that an ancestral CS1-containing RepI1 unit merged with a RepFIIA plasmid to form cointegrate plasmids pCoo and pETEC_73. Apparently divergent from the above-mentioned plasmids, but sharing common ancestry with one another, are pColIB-P9, a 93.4-kb prototypical ColIb plasmid with a Tra region (90); pNF1358, a 99.3-kb plasmid similar in content to pCVM29188_101; and R64, a 120-kb prototypical RepI1 MDR-encoding plasmid (96). Finally, most divergent from the above-mentioned plasmids is 55989p, an EAEC AAF/III RepFIB/FIC plasmid, with its RepFIC region sharing homology with RepI1. RepI1 plasmids represent an additional plasmid type capable of possessing and disseminating E. coli virulence factors.
With over 40 E. coli plasmid sequences now completed and multiple plasmid sequences available for each E. coli pathotype, the pathogenomics of these diverse plasmids are only now beginning to be fully understood (Table 4). Comparisons made at the plasmid genome-wide level emphasize their plasticity, which has been daunting to analyze, particularly with regard to plasmid evolution. Despite this diversity, E. coli virulence plasmids are restricted to a few plasmid backbones that have conservation and synteny with regard to their core components. These plasmids contain distinct genetic load regions, which appear to evolve via IS-mediated site-specific recombination. There are now numerous examples in the database of E. coli virulence plasmids that have also acquired MDR-encoding islands. The ease with which this might occur is particularly disturbing, as is the means by which these elements might disseminate. Certainly, these plasmid sequences are only a snapshot view of bacterial genomes in flux. As illustrated with ETEC plasmid comparisons, multiple plasmids within a single bacterial host are under constant pressure and are more amenable to recombination than their host chromosomes. Therefore, rapid changes in bacterial populations brought forth via changes in plasmid content should be of primary concern to those involved in human health and food safety.
The advent of pyrosequencing, its application to bacterial plasmid complements, and the evolving comparative tools for plasmid analysis will facilitate more meaningful comparisons in the future. Efforts are currently under way to improve the ability to perform meaningful annotation and analyses of bacterial plasmids, including a plasmid-focused annotation and analysis site (http://www.theseed.org) and an E. coli plasmid genome database dedicated to the implementation of user-friendly comparative genomics tools (http://www.ecoli.cvm.iastate.edu). The modular and plastic nature of bacterial plasmids surely makes the study of their evolution more complicated than that of bacterial chromosomes. However, the future use of unique and multifactorial approaches (i.e., replicon typing, relaxase typing, multilocus sequence analysis, and analysis of whole-genome content and arrangement) for these problems will aid in our future ability to understand virulence plasmid evolution. The use of such improved tools will also enhance the consistency of plasmid nomenclature and annotation. This, combined with additional plasmid sequences and a refinement of plasmid-typing protocols, will undoubtedly enable a better understanding of the evolution of these dynamic elements.
Preparation of this review was supported by the University of Minnesota College of Veterinary Medicine, the University of Minnesota Supercomputing Institute, and grant EFO062666 from the National Science Foundation.
We give special thanks to Richard Isaacson (University of Minnesota) for critique of this review.
Timothy J. Johnson received his Ph.D. in 2004 from North Dakota State University in Fargo, North Dakota. His postdoctoral work was performed at the College of Veterinary Medicine at Iowa State University (2004 to 2007). Since 2007, he has worked as an Assistant Professor of Microbiology within the Department of Veterinary and Biomedical Sciences, College of Veterinary Medicine, University of Minnesota. Dr. Johnson's research interests involve genomics-based approaches towards understanding the evolution of virulence and antimicrobial resistance in enteric bacteria of production animals and humans. He is particularly interested in the mobile genetic elements contained by these organisms and the means by which they are disseminated among bacterial populations. His research also strives toward the development of means to control diseases of poultry, swine, and humans.
Lisa K. Nolan received D.V.M. and Ph.D. degrees from the University of Georgia. Her early career was spent at North Dakota State University, where she rose through the ranks from Assistant to Full Professor and served as the founding Director of the Great Plains Institute of Food Safety. In 2003, she became Professor and Chair of the Department of Veterinary Microbiology and Preventive Medicine in the College of Veterinary Medicine at Iowa State University and now serves as this college's Associate Dean of Research and Graduate Studies. The enduring focus of her research has been on the virulence mechanisms that extraintestinal pathogenic Escherichia coli (ExPEC) uses to cause disease in animal and human hosts, the role of plasmids in the pathogenesis of these diseases, and the relationship between ExPEC diseases in animals and those of human hosts.