The IncQ plasmids have a broader host-range than any other known replicating element in bacteria. Studies on the replication and conjugative mobilization of these plasmids, which have mostly focused on the nearly identical RSF1010 and R1162, are summarized with a view to understanding how this broad host-range is achieved. Several significant features of IncQ plasmids emerge from these studies: (1) Initiation of replication, involving DnaA-independent activation of the origin and a dedicated primase, is strictly host-independent. (2) The plasmids can be conjugatively mobilized by a variety of different type IV transporters, including those engaged in the secretion of proteins involved in pathogenesis. (3) Stability is ensured by a combination of high copy-number and modulated gene expression to reduce metabolic load.
In the 1960s and early 1970s, small R factors with linked resistances to streptomycin and sulfonamides were being isolated from different Gram-negative bacteria (Barth and Grinter, 1974). These plasmids were not members of the incompatibility groups identified at the time and a physical analysis was undertaken to determine their relatedness. It turned out that many of the nonself-transmissible plasmids had similar molecular weights (about 5.5 × 10⁶ daltons) and extensively cross-hybridized (Barth and Grinter, 1974), indicating that they were closely related. Their relatedness was confirmed by the demonstration that they were in the same incompatibility group (termed IncQ) (Grinter and Barth, 1976). At about the same time, the laboratory of Stanley Falkow identified a similar plasmid from E. coli and named it RSF1010 (Guerry et al., 1974). It quickly became clear that these plasmids had an extended host-range (Nagahari and Sakaguchi, 1978). In addition, although they were not self-transmissible, they could be mobilized by different conjugative plasmids, often at high frequencies (Guerry et al., 1974; Willetts and Crowther, 1981).
Three plasmids in the IncQ group have been studied extensively: RSF1010, and R300B and R1162, isolated from S. enterica serovar Typhimurium and P. aeruginosa, respectively (Barth and Grinter, 1974). Since they are essentially identical, they will be discussed interchangeably in this review. These plasmids, and close relatives sharing a common core, continue to be found in a wide range of bacteria occupying different habitats (Fig. 1A) (Bönemann et al., 2006; Ito et al., 2004; Kehrenberg et al., 2008; Palmer et al., 1997; Recchia and Hall, 1995; Smalla et al., 2000). With its frequent appearance on different stages taken as a criterion (albeit in slightly different costumes), RSF1010/R1162/R300B is one of the most successful actors in the plasmid world. Underlying this success is, most obviously, the ability of these plasmids to be replicated and maintained in different hosts. These include not only the enterics, pseudomonads and their relatives in the proteobacteria, but also more distant species in the firmicutes, actinomycetes and cyanobacteria (Gormley and Davies, 1991; Nesvera et al., 1994; Sode et al., 1992). Less obvious perhaps is the success of RSF1010 in using different type IV secretion systems to bring about its own mobilization. Mobilization by the broad host-range IncP1 plasmids, such as RK2 and R751, has almost certainly contributed substantially to the dissemination of RSF1010. However, RSF1010 is also mobilized by other groups of self-transmissible plasmids (Willetts and Crowther, 1981), the integrating conjugative element SXT in Vibrio cholerae (Hochhut et al., 2000), the Dot/Icm virulence system of Legionella (Segal and Shuman, 1998; Vogel et al., 1998), as well as uncharacterized type IV secretion systems in Salmonella (Baker et al., 2008) and Agrobacterium tumefaciens (Chen et al., 2002). The ability of the RSF1010 replication system to establish in different environments is exemplified by the IncQ2 group of plasmids (Fig. 1B), which are found in acidophilic bacteria associated with mining environments (Dorrington and Rawlings, 1990; Gardner et al., 2001). Not surprisingly, RSF1010 has been extensively exploited in the development of both mobilizable and non-mobilizable plasmid vectors (Bagdasarian et al., 1981; Bagdasarian et al., 1982; Priefer et al., 1985; Sharpe, 1984). RSF1010-based vectors have led to the development of genetic systems in bacteria where such tools have been limited or absent (Coppi et al., 2001; Frey, 1992).
In this review, studies on the molecular mechanisms of replication and transfer will be summarized with a view to understanding how broad host-range is achieved by the RSF1010 plasmid group. Some of the major results of this work are schematically outlined in Fig. 2. A comprehensive review comparing the structural relationships of the different IncQ-like plasmids and the phylogenetic relatedness of their genes has been published recently (Rawlings and Tietze, 2001).
The replicative origin of RSF1010 (oriV, Fig. 1) was first mapped by electron microscopic examination of partially replicated molecules extracted from E. coli (de Graaff et al., 1978). The locations of the replicative forks formed on partially replicated molecules indicated that synthesis could be either bidirectional or unidirectional. RSF1010 uses the same origin in Pseudomonas aeruginosa (Scholz et al., 1985), and thus probably in all bacteria within its host-range. The origin was later mapped to the same location by cloning different DNA fragments and testing for replication when the R1162 proteins were provided in trans (Lin and Meyer, 1984; Meyer et al., 1985; Scholz et al., 1985). The essential DNA is located in two nearly adjacent HpaII fragments (Fig. 3). One of these fragments contains three and one-half perfectly conserved direct repeats (DRs), or iterons. The other fragment is made up mostly of a large inverted repeat (IR).
The functionally active regions within the two HpaII fragments containing oriV have been defined more exactly by deletion analysis. Deletions extending from the left in Fig. 3, up to the beginning of the IR, had no effect on replication, but larger deletions into the IR inactivated the origin or, at best, resulted in very low activity (Haring and Scherzinger, 1989; Meyer et al., 1985). From the right, all the DNA up to and including the half DR is dispensable (Haring and Scherzinger, 1989; Lin and Meyer, 1986). However, the DNA adjacent to the DRs on the other side is required: only about 60 bp rightward from the internal HpaII site can be deleted (Lin and Meyer, 1986). Deletions extending leftward from this HpaII site are tolerated only up to the internal arm of the IR (Lin and Meyer, 1987). Thus, oriV is made up of two noncontiguous essential regions, labeled I and II in Fig. 3. The tolerance of oriV to internal deletions between I and II showed that the spacing between them did not need to be conserved. In fact, the spacing between the two domains could be increased to at least 2000 base-pairs without complete loss of activity (Kim et al., 1987). Domain I could also be inverted with respect to domain II (Haring and Scherzinger, 1989; Meyer et al., 1985), showing that its structural symmetry reflects functional symmetry as well.
The replication of RSF1010 DNA in a cell-free extract was first reported in 1982 (Diaz and Staudenbauer, 1982). Scherzinger and his colleagues (Scherzinger et al., 1984) used a refined version of this system to show that RSF1010 encoded three proteins required for its replication. Their approach was to partially purify proteins from strains containing different cloned fragments of RSF1010 DNA, then add these proteins in different combinations to an extract from a plasmid-free strain and test for replication of RSF1010 DNA. Replication in vitro was initiated from the same region identified in vivo as the origin (Haring and Scherzinger, 1989). At the time, the three replication genes (named repA, repB and repC) could only be localized approximately by deletion analysis, but they mapped at positions remote from the origin (Scherzinger et al., 1984; Scholz et al., 1985). Shortly thereafter, completion of the RSF1010 base sequence (Scholz et al., 1989) allowed unambiguous mapping of the genes as shown in Fig. 1. In addition, all three replication proteins have been purified in active form (Scherzinger et al., 1991).
The properties of the in vitro system, and the behavior of R1162 in vivo as well, gave clues to the possible functions of the plasmid Rep genes. RSF1010 DNA was replicated in an extract containing a defective DnaA protein (Scherzinger et al., 1991), and in vivo dnaA(ts) could be integratively suppressed by R1162 (Brasch and Meyer, 1988). In addition, antibodies directed against DnaG and DnaB were ineffective at inhibiting replication in vitro (Scherzinger et al., 1991). The clear indication, borne out by subsequent studies, was that RSF1010 provided key proteins for initiation of its own replication, thus maintaining greater independence from the host.
The presence of direct repeats in oriV indicated that the replicative origin of R1162 belongs to the iteron-type, common for a large group of different plasmids and phages including F, P1, λ, R6K, RK2 and pPS10, as well as for bacterial chromosomes. In these origins, a plasmid-encoded initiation protein binds to a set of direct repeats, inducing torsional stress to disrupt the helicity of the adjacent DNA and allowing entry of a helicase and perhaps other components of the initiation machinery. For RSF1010, this protein is RepC. The protein was shown by electrophoretic mobility-shift to bind to a 200 bp DdeI fragment containing the DRs of domain II (Fig. 3), but not to the IR of domain I (Haring and Scherzinger, 1989). In addition, the DRs were required for expression of incompatibility in vivo, presumably by binding RepC (Lin and Meyer, 1984, 1986; Persson and Nordström, 1986). Binding of RepC to oriV is thought to be cooperative and result in the formation of a large complex (Haring and Scherzinger, 1989). Consistent with this, RepC binding causes static bending of DNA containing one or more iterons (Miao et al., 1995). However, even a single DR exerts some incompatibility and interferes with replication of R1162 DNA in a cell extract (Lin and Meyer, 1984). Binding of the protein to the DRs causes localized strand separation, detectable by probing with P1 nuclease, in the adjacent, AT-rich region in domain II (Kim and Meyer, 1991). Mutations in the AT-rich region (Fig. 3) reduced P1 sensitivity and interfered with replication in vivo. It is likely that RepC is directly contacting this region, since one of the mutations could be suppressed by a point mutation in repC. The P1-sensitive region itself contains directly repeated DNA, reminiscent of the 13 bp contact sites for activated DnaA-ATP in the E. coli origin (Bramhill and Kornberg, 1988; Speck and Messer, 2001), but the function or significance of these DRs, if any, is unknown.
In any case, these repeats are in a region that is highly conserved in the origins of the IncQ plasmids (Rawlings and Tietze, 2001).
The RSF1010-like plasmids pIE1107, pIE1115 and pIE1130 contain a secondary origin that is non-functional for replication but exerts incompatibility toward RSF1010 (Rawlings and Tietze, 2001; Tietze, 1998). It has been proposed that the inactive origin conveys a competitive advantage for the plasmid since it will be able to exclude a wider variety of IncQ-like plasmids from the cytoplasm. Consistent with this idea, the narrow host-range plasmid R89S contains an RSF1010 oriV-like sequence that exerts incompatibility toward RSF1010, but the two plasmids replicate by completely different mechanisms (Saano and Zinchenko, 1987).
RepA is a DNA helicase (Haring and Scherzinger, 1989; Scherzinger et al., 1997) that was shown to be a hexamer by gel filtration chromatography and cross-linking. Because it is in the same group of helicases as DnaB (Gorbalenya and Koonin, 1993), attention has been paid to its structure and function. Reconstruction of electron micrographic images suggested a hexagonal ring of subunits, in agreement with the structure deduced from X-ray crystallographic data (Niedenzu et al., 2001). The hole in the center of the ring has a diameter of 17 Å, large enough to accommodate a strand of DNA. While detailed consideration of the RepA structure is beyond the scope of this review, it is worthwhile noting that the structure in solution is significantly different from the crystal structure, and the shape of the protein is altered significantly by binding ATP (Marcinowicz et al., 2008). In particular, sequential binding of ATP causes the flattened hexamer to assume a more rounded shape, which in turn increases its affinity for single-stranded DNA, thus promoting strand separation. Subsequent hydrolysis of the bound co-factors then causes the reverse conformational change, readying the protein for the next cycle of interaction.
The protein RepB' was shown to be a primase by adding different combinations of the Rep proteins to a cell-free extract and monitoring complementary strand synthesis of an M13 derivative containing the R1162 oriV (Haring and Scherzinger, 1989). RepB' was the only plasmid protein required for initiation on this substrate. The R1162 oriV was also essential, indicating that the primase is highly specific. In addition, since oriV was active in either orientation, there must be priming sites on both strands. A parallel series of studies was done in vivo, by testing for complementation of an oriC-defective M13 (Honda et al., 1989). A deletion analysis showed that RepB' was required for complementation, consistent with the biochemical data. The primase is also made as the C-terminal domain of a larger protein, MobA (also called RepB) (Scholz et al., 1989). The N-terminal part of the protein is a DNA relaxase active at the origin of transfer (oriT), with the two domains linked by translation through mobB (Figs. 1, 2), in a different reading frame than the one that encodes MobB itself. Both RepB' and the larger RepB/MobA are active in priming (Scherzinger et al., 1991). The RSF1010 primase can initiate replication by incorporation of either rNTPs or dNTPs (Haring and Scherzinger, 1989), but at some point synthesis shifts to DNA polymerase III, since DnaZ (the gamma subunit of polIII holoenzyme) is required for replication of RSF1010 DNA (Scherzinger et al., 1991).
The crystal structure of RepB' has been recently solved (Geibel et al., 2009). The protein consists of two distinct domains, joined together by a long α-helix and 14-amino acid linker. Both domains are required for activity in vitro, although they can be provided as separate fragments. The N-terminal domain, which contains the catalytic center, is closely similar in structure to the catalytic domains of archaeal primases.
The single-strand initiation sites for R1162 replication were identified by allowing plasmid DNA to replicate in a cell extract in the presence of dideoxyCTP (Lin and Meyer, 1987). The 5′ ends of the prematurely terminated nascent strands were then mapped by digestion with restriction enzymes and gel electrophoresis. Synthesis within oriV was initiated at each end of the large inverted repeat in domain I (Fig. 3), at sites called oriL (ssiA) and oriR (ssiB). The two sites face inward within the domain, so that the two nascent strands would quickly pass each other. The locations of the initiation sites were also roughly mapped by the ability of cloned fragments to complement the oriC-defective M13 (Honda et al., 1988). The positions determined in this way were consistent with those identified in vitro. Adjacent to each start site is a small hairpin loop. The secondary structure of these loops is important for initiation: mutations in one arm impaired initiation, but these were partially suppressed by mutations in the other arm restoring base complementarity (Miao et al., 1993). In a co-crystal, the RepB' catalytic domain interacts with an ssiA oligonucleotide at the bases underlined in Fig. 3 (Geibel et al., 2009). Formation of a hairpin loop by the inverted repeat thus brings the subunit close to the location where synthesis is initiated (starred in the figure). The other subunit of the protein also binds ssiA DNA, but the binding site has not been determined (Geibel et al., 2009). The two initiation sites for replication are interchangeable, and can be outwardly, rather than inwardly facing (Zhou and Meyer, 1990). Since the sites themselves are small, domain I would appear to be much larger than required. The spacing between oriL and oriR is not required: neither an 81 bp insertion nor a 69 bp deletion between these sites affects replication (Lin et al., 1987).
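The logic of the compensatory-mutation experiment can be sketched in a few lines of Python. This is only an illustration of the base-pairing argument, not a model of the actual ssi hairpins: the function names and the six-base arm sequences are hypothetical, and only Watson-Crick pairs between the two arms of a stem are counted.

```python
# Hypothetical illustration: count Watson-Crick pairs between two hairpin arms.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def paired_positions(left_arm: str, right_arm: str) -> int:
    """Number of stem positions that can base-pair when the arms fold back.

    Both arms are written 5'->3'. Position i of the left arm faces position
    n-1-i of the right arm, so the left arm is compared with the reverse
    complement of the right arm.
    """
    if len(left_arm) != len(right_arm):
        raise ValueError("arms must have the same length")
    partner = reverse_complement(right_arm)
    return sum(a == b for a, b in zip(left_arm.upper(), partner))


# Made-up arm sequences, not taken from R1162.
wild_type = ("GCCGTA", "TACGGC")      # fully complementary stem
single_mutant = ("GCAGTA", "TACGGC")  # mutation in the left arm breaks one pair
double_mutant = ("GCAGTA", "TACTGC")  # second mutation in the right arm restores it

for name, (left, right) in [("wild type", wild_type),
                            ("single mutant", single_mutant),
                            ("double mutant", double_mutant)]:
    print(name, paired_positions(left, right))
```

In this toy stem the single mutant loses one of six pairs and the second-site change restores all six, which is the pattern expected if secondary structure rather than primary sequence is what matters for initiation.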
Examination of replicative intermediates formed both in vivo (de Graaff et al., 1978) and in a cell extract (Scherzinger et al., 1991) suggested that oriL and oriR do not necessarily initiate synthesis simultaneously. In fact, about 30-40% of the replication bubbles are made up of one single-stranded and one double-stranded arm. An interesting question is whether such intermediates complete a round of synthesis, generating a single-stranded molecule with no net gain of plasmid copies. In vivo, at least, there is only a small production of single strands during replication of RSF1010 (Tanaka et al., 1994), indicating that in most cases the second origin fires after a delay.
Plasmids that lack oriL or oriR are replicated poorly in vivo (Tanaka et al., 1994; Zhou and Meyer, 1990) and are poor templates for RSF1010-specific synthesis in cell extracts (Haring and Scherzinger, 1989). Thus, there are no secondary sites for initiation of replication of RSF1010 that can be efficiently utilized by the plasmid. Nevertheless, plasmids lacking one ssi site can still be replicated and maintained in the host, although they are unstable and present at less than half the normal copy-number (Tanaka et al., 1994; Zhou and Meyer, 1990). Cells with plasmids containing one ssi accumulate single-stranded plasmid DNA, indicating unbalanced replication with initiation on the strand lacking an origin failing to keep up with initiation at the remaining ori site (Tanaka et al., 1994). In cell extracts, only synthesis from the remaining origin is detectable (Zhou and Meyer, 1990). In addition, multimers of the defective plasmids accumulate during growth of cells. This would be expected since multimers would have a replication advantage because of the additional copies of the weak initiation sites. A reasonable model is that complementary strand synthesis is initiated on the accumulated population of single-stranded DNA, formed by repeated initiation at the remaining initiation site. Although the initiation frequency per molecule on the strand lacking a bona fide origin might be low, the large number of molecules would result in a frequency of replication capable of sustaining the plasmid. However, this model might be incorrect. R1162 and a ΔoriL deletion derivative were uniformly density labeled, then shifted to light medium and pulsed for less than a generation time with radioactive thymidine (Becker et al., 1996). The amounts of radiolabel incorporated into each strand of those molecules with hybrid density were similar for both plasmids.
This result suggests that initiation of synthesis on the ori-containing strand favors initiation on the other strand, and that the completed single strands are much less active. Whatever the mechanism of initiation on ssi-less DNA, it is favored by SOS induction (Becker et al., 1996), presumably brought about by the accumulation of single-stranded DNA in the cell. In addition, the ssi signals can be substituted by a DnaA box (Taguchi et al., 1996), confirming that there are potential but normally cryptic sites for initiation of strand synthesis on the plasmid.
By analogy with other iteron-containing origins, it is thought that the RepA helicase enters at domain II (Fig. 3) within the AT-rich region, and then migrates to domain I to cause strand separation and thus activation of oriL and oriR. Presumably, the site of localized strand separation within this region, induced by RepC binding to the iterons (Kim and Meyer, 1991), also marks the position of helicase entry. Consistent with this idea, placing a helicase termination sequence from the E. coli terminus between domains I and II lowers activation of oriL and oriR (Zhou et al., 1991). On the other hand, the helicase and primase do not detectably interact (Scherzinger et al., 1997). Rather, activation appears nonspecific: when ssi sequences from φX174 and pACYC184 were used to replace oriL and oriR, these sites were activated as well (Honda et al., 1991). In this case, the plasmid primase was no longer required. Interestingly, the substitute priming sites were inactive in Pseudomonas aeruginosa (Higashi et al., 1994). These artificial substitutions are mirrored by the independence of RepAC from RepB in nature: the IncQ-like plasmid pGNB2, isolated from bacteria in wastewater, contains a repAC module but no gene related to repB (Bönemann et al., 2006). Activation of domain I by helicases other than RepA has not been reported. However, it is clear that oriL and oriR can contribute to strand synthesis during conjugation, under conditions where domain II is absent (section 3.3).
RSF1010 is maintained at a copy-number of about 10-12 per chromosome in E. coli, P. aeruginosa and S. enterica sv. Typhimurium (Frey and Bagdasarian, 1989). There are several regulatory circuits in the plasmid that could potentially contribute to copy-number control (Fig. 2). The repA and repC genes are co-transcribed from promoter p4 (Bagdasarian et al., 1981; Scholz et al., 1989). The second gene in the RepAC operon, repF (also known as cac), encodes a small repressor active at operators adjacent to p4 (Maeser et al., 1990) (Fig. 4A). There is another gene in the operon, repE, that is translated but does not appear to be active in regulation (Maeser et al., 1990; Scholz et al., 1989). RepF could be controlling copy-number by regulating the amounts of RepC in the cell: overproduction of RepC from an expression vector resulted in as much as a six-fold increase in copy-number (Haring et al., 1985; Kim and Meyer, 1985). In contrast, overproduction of the helicase RepA had no effect on plasmid copy-number. Nonetheless, a small, 75-nt antisense RNA is made which overlaps the ribosome binding site and initiation codon of repA (Kim and Meyer, 1986). A mutation lowering the amount of this RNA also increases plasmid copy-number, presumably due to a downstream effect on repC.
Transcription from the promoters p1-p3 (Fig. 4B) is negatively regulated by two components of the relaxosome, MobA and MobC (Frey et al., 1992). Since a transcription termination signal has not been identified at the end of MobA/RepB (Scholz et al., 1989), transcripts originating from these promoters, in addition to being responsible for synthesis of RepB, could also pass through repEFAC (Fig. 2). Deletions in mobA and mobC led to up-regulation of transcription from p1-p3, and up to a four-fold increase in plasmid copy-number (Frey et al., 1992). Since overproduction of the primase leads only to a 1.6-fold increase in copy-number (Haring et al., 1985), some of the effect of increased transcription from p1 and p3 (Fig. 2) must be due to transcription through the RepAC operon. In agreement with copy-number regulation at p1 and p3, deletion of the entire mob region of RSF1010, including these promoters, resulted in a twofold decrease in copy-number (Katashkina et al., 2007).
If p4 and p1-p3 both initiate transcription through repA and repC, and both are negatively regulated, it would seem that regulation of these genes, and copy-number control, is over-determined. In this connection, the related plasmids pTF-FC2 and pTC-F14, isolated from Acidithiobacillus ferrooxidans and Acidithiobacillus caldus, respectively (Gardner et al., 2001; Rawlings et al., 1984), have IncQ-like replication systems, with the Rep genes arranged similarly to those in RSF1010 (Fig. 1). However, instead of repF (and repE), these plasmids encode toxin-antitoxin systems that can stabilize low copy-number plasmids (Gardner et al., 2001; Smith and Rawlings, 1997). It is not obvious why pTF-FC2 and pTC-F14 carry these addiction systems, since the copy-numbers of these plasmids and RSF1010, which lacks any known plasmid stability system, are similar (Dorrington and Rawlings, 1989; Gardner et al., 2001; Haring and Scherzinger, 1989). In addition, the RSF1010-like plasmids pDN1 from Dichelobacter nodosus and pIE1107, from an uncharacterized organism, lack repF as well (Tietze, 1998; Whittle et al., 2000). These observations suggest that RepF is not a required regulatory element. We have deleted repF and p4, so that all transcription of repA and repC originates from p1 and p3. The copy-number of the plasmid is essentially unchanged (Zhang and Meyer, unpublished). Interestingly, the transformation efficiency of the plasmid is reduced, suggesting that the p4 control system might have evolved to allow rapid expression of replication genes when the plasmid enters a new host, thus increasing the chance of establishment.
Many iteron-containing plasmids, such as F, P1, RK2 and R6K, form “handcuffing” complexes to regulate the rate of initiation at the origin (Blasina et al., 1996; McEachern et al., 1989; Park et al., 2001). In these complexes, plasmid molecules are noncovalently bound at the replicative origin by the iteron-binding proteins, an association that dramatically interferes with localized melting at the origin (Zzaman and Bastia, 2005) and therefore reduces the rate of initiation. One of the hallmarks of such regulation is that an increase in the iteron-binding protein does not produce a large increase in copy-number, since handcuffing prevents a greater rate of initiation by the additional initiator (Durland and Helinski, 1990; Filutowicz et al., 1986; Pal and Chattoraj, 1988). A second characteristic is that the dimer form of the initiator is primarily responsible for handcuffing, whereas the monomer is active in initiation of replication; as a result, mutations causing defects in dimerization result in high copy-number (Das and Chattoraj, 2004; Kunnimalaiyaan et al., 2005; Toukdarian and Helinski, 1998). There have been no reports of handcuffing at the iterons of IncQ plasmids. Since overproduction of RepC results in as much as a six-fold increase in copy-number (Haring et al., 1985), any inhibition of initiation by handcuffing must be only partially effective. On the other hand, RepC forms dimers in solution (Scherzinger et al., 1984) and, more interestingly, binds simultaneously in vitro at oriV and at a secondary site, resulting in looping of the DNA (Haring and Scherzinger, 1989). This behavior, observed by electron microscopy, might reflect a propensity of the protein to handcuff DNA in vivo. In addition, two DRs, cloned into R1162, lower the copy-number of the plasmid (Becker and Meyer, 1997) and exert incompatibility when cloned in trans (Lin et al., 1987).
Since transcription from p1 and p3 is normally at about one-fourth the unregulated level (Frey et al., 1992), up-regulation of transcription should be able to maintain copy number at a normal level if the additional DRs were simply titrating RepC.
A large number of plasmids, including RSF1010, can be conjugatively mobilized by other, self-transmissible plasmids. Mobilizable (Mob) plasmids encode proteins to process their DNA for transfer and recognize the conjugative pore, but parasitize the type IV secretion systems encoded by the larger, self-transmissible elements. In 2004 the Mob regions of plasmids were classified in several different groups (Francia et al., 2004). RSF1010 was taken as the archetype of one group, named mobQ. However, it is important to note that plasmid members of this group, which are distributed widely in the bacterial world, are quite diverse, and may contain replicons and other properties different from those of the IncQ plasmids.
The Mob region of RSF1010/R1162 was localized to a single region of the plasmid and the genes shown to have a complex, overlapping arrangement (Brasch and Meyer, 1986; Derbyshire et al., 1987) (Figs. 1, 2). The organization of the Mob region was clarified by completion of the sequence and identification of the active ORFs (Scholz et al., 1989). As described in section 2.2 and shown in Fig. 2, MobA consists of an N-terminal DNA processing domain linked by a polypeptide bridge to the primase. Fusion of the relaxase to another protein is not an invariable property of MobA-like relaxases, but is characteristic for those plasmids using the RSF1010-type replicon. When the Mob region is cloned into a different replicon, the primase part of MobA is not required for mobilization (Derbyshire et al., 1987).
The R1162 oriT is small, consisting of no more than 38 bp (Brasch and Meyer, 1987). The site is made up of an inverted repeat and an adjacent region consisting of three AT base-pairs and a GC-rich region containing the nick site (Fig. 5). For the large population of MobQ plasmids, the IR at oriT varies extensively in size and sequence, whereas the rest of oriT is more highly conserved and has been referred to as the “core” (Becker and Meyer, 2003). For a few MobQ plasmids, such as pSC101, DNA adjacent to the IR arm furthest from the core has a sequence very similar to an inverted, second copy of the core (Becker and Meyer, 2003). The core-like DNA is inactive for transfer, but raises the possibility that MobQ oriTs are generated by duplication and inversion of a core sequence and some adjacent DNA. Transfer is in the direction shown in Fig. 5, so that the majority of oriT is at the end of the transferred strand (Kim and Meyer, 1989). The location and orientation of oriT with respect to MobA appear to be highly conserved among the diverse MobQ group. Since oriT is functional when cloned into other plasmids, this conservation probably reflects its role in regulation (section 2.4) rather than a requirement for transfer.
Three proteins encoded by RSF1010 are required for its mobilization: MobA (MW 77,945) and two smaller proteins, MobB (MW 15,097) and MobC (MW 10,867). [In this review, all molecular-weight calculations include the N-terminal methionine, which is removed from MobA (Scholz et al., 1989).] The three proteins have been purified and used to assemble in vitro a complex (the relaxosome) on the plasmid oriT (Scherzinger et al., 1992). In this complex, MobA reversibly nicks one of the DNA strands at the location shown in Fig. 5. The nicked intermediate contains MobA covalently linked to the 5′ end of the cleaved strand. The optimal stoichiometry for the reaction is 1-2 copies of MobA per plasmid molecule. MobC is also required for cleavage of plasmid DNA in vitro, whereas MobB is stimulatory.
The interaction of single-stranded oriT DNA with MobA has also been examined in vitro. The protein binds only to the strand that is normally transferred, and forms a very stable complex with a half-life of about 90 min (Bhattacharjee and Meyer, 1993). Unlike the interaction with duplex DNA, MobC is not required. Strong binding requires both the intact IR and the adjacent core. However, an oligonucleotide consisting only of the core is cleaved at the correct location in vitro by MobA (Scherzinger et al., 1993). Thus, the IR is important for strong binding, but not for recognition of the cleavage site.
The size and location of the MobA domain required for binding in vitro to single-stranded oriT DNA has been determined in two ways (Becker and Meyer, 2002). Binding fragments of the protein were enriched by phage display, and also by gel shift following partial proteolysis. The smallest fragments identified in these procedures were the N-terminal 184 and 188 amino acids, respectively. Since no smaller fragments with activity were identified, the implication is that a rather large region of MobA is required to bind oriT, presumably because of multiple contacts along the 38 bp oriT. The 184 amino-acid fragment, termed minMobA, has been purified and shown to cleave specifically at oriT. In addition, essentially the same fragment (amino acids 1-204) is active for DNA processing in vivo. When an M13 phage derivative that contained two, directly repeated copies of oriT was used to infect a cell producing this fragment, the two oriTs recombined at the nick site to generate a single copy (Meyer, 1989). MobB and MobC were not required for this reaction, and larger fragments, including full-length MobA, were also active. Only the normally transferred strand was acted on by the protein. Presumably, the MobA fragment was cleaving and then rejoining the single-stranded oriT DNA, either just after entry of phage DNA into the cell or following its replication in the host.
Although MobA alone cleaves and rejoins single-stranded oriT DNA, nicking of double-stranded DNA requires MobC as well (Scherzinger et al., 1992; Scherzinger et al., 1993). Within the relaxosome, base-pairing between the strands of oriT DNA is disrupted within the AT-rich region of the core (Zhang and Meyer, 1995). MobC enhances this strand separation, causing it to extend through the nick site (Zhang and Meyer, 1997). It is likely that the small MobC protein, which is present in multiple copies in the relaxosome (Scherzinger et al., 1992), forms a structure that wraps or otherwise distorts the DNA, thus introducing sufficient single-stranded character to allow MobA to cleave the active strand. If the DNA is already single-stranded, MobC would not be required, explaining why MobA alone is active on oriTs carried by M13.
An important characteristic of the RSF1010 Mob system is that transfer can be initiated at one oriT and terminated at another, directly repeated oriT on the same molecule (Kim and Meyer, 1989). Normally, about 50% of the transferred molecules are terminated at the second oriT. Cleavage and rejoining is at the nick site determined in vitro (Brasch and Meyer, 1987; Scherzinger et al., 1992). Mutations in oriT have different effects, depending on whether they are in the initiating or terminating copy. This is presumably because initiation involves cleavage of a single strand within a relaxosome bound to double-stranded DNA, whereas termination involves rejoining (and in this case, at least, cleavage as well) of the transferred single strand. In agreement with this interpretation, an intact IR is required for the termination step, to provide a strong binding site for capture of the covalently bound MobA, and is also required for oriT recombination on single-stranded M13 DNA (Barlett et al., 1990). However, the arm of the IR distal to the core is not required for initiation (Brasch and Meyer, 1987; Kim and Meyer, 1989). This has led to a model in which the purpose of the IR is to form a hairpin, reproducing the double-stranded character of this part of the MobA binding site (Zhang and Meyer, 1995). In addition to the IR, the adjacent TAA in the core is critical for binding, with the core consensus being TAARTGYGY (Becker and Meyer, 2000). If an extra base is inserted between the IR and the core, the cleavage site is shifted by one base. This shows that it is the core that determines the cleavage position, consistent with the correct cleavage in vitro of a core oligonucleotide lacking an IR (Becker and Meyer, 2000; Scherzinger et al., 1993).
Initiation of transfer at one oriT and termination at the other requires a second DNA cleavage. Since plasmids containing two oriT copies resemble intermediates generated by rolling-circle replication during transfer, proposed as a mechanism for strand replacement, it is interesting to ask how the second cleavage occurs. A nonsense mutation close to the N terminus of MobA lowers the transfer frequency several orders of magnitude (the residual transfer is probably due to misreading) but does not change the recombination frequency, suggesting that a single molecule of MobA is involved in both the initiating and terminating cleavage. The crystal structure of minMobA has been solved (Monzingo et al., 2007). The active tyrosine Y25 protrudes into a pocket that also contains a second tyrosine (Y32) and three glutamates, including the highly conserved E74. Any one of these could be involved in the second cleavage (Noirot-Gros et al., 1994), but directed mutagenesis of these residues causes little or no change in recombination frequency. Although the protein is unrelated at the amino acid level to TraI and TrwC, relaxases encoded by the plasmids F and R388, respectively, all three proteins have similar structures (Datta et al., 2003; Guasch et al., 2003). TrwC can also recombine directly-repeated oriTs (Llosa et al., 1994), and in this case a second tyrosine, remote from the active site deduced from the crystal structure, is thought to be involved (Gonzalez-Perez et al., 2007; Grandoso et al., 2000).
The R1162 Mob proteins are also active on the oriT of pSC101, another member of the MobQ group (Meyer, 2000). The cores of the two oriTs are essentially the same, but the IRs are significantly different. It was subsequently found that substantial sequence degeneracy was permissible in the inner arm for nicking and initiation of transfer (the outer arm of the IR is not required at this step) (Becker and Meyer, 2003; Jandle and Meyer, 2006). AT or TA base-pairs at positions 17-19 (Fig. 5) were required for function. The importance of A/T at these positions suggested that MobA was fitting into the minor groove of the helix at this location (Seeman et al., 1976); re-creating this groove might be why an IR is required for the rejoining step at termination. In addition, the tolerance to TA transversions explains why an A to T base change in the inner arm of the IR interferes with MobA binding, but this is suppressed by a mutation restoring complementarity (Bhattacharjee and Meyer, 1993). At the other positions of the inner arm, any of the four bases could be present, although with varying effects on activity [(Jandle and Meyer, 2006) and unpublished data]. Thus, considerable base degeneracy is tolerated throughout the R1162 oriT, even though it is small, and DNA with the consensus sequence WWWNNNNTAARTGCGC would be expected to have some activity for initiation of transfer. Interestingly, the pSC101 MobA, which is closely related to the R1162 protein, is much more specific: this specificity is determined primarily by the sequence CGTC in the pSC101 oriT, at the location corresponding to base-pairs 20-23 in the R1162 origin (Fig. 5).
Why R1162 has evolved to be so permissive is an unanswered question, since the contrasting behavior of the pSC101 MobA shows that this is not an invariable feature of the IncQ group of MobA proteins. Also unknown is whether the MobAs of the other IncQ plasmids, proteins which as a group are highly conserved (Rawlings and Tietze, 2001), are similarly permissive. In any case, an implication of this property is that MobA might be active on ectopic sequences that conform by chance to the oriT consensus. When DNA from the genome of Pectobacterium atrosepticum (formerly Erwinia carotovora v. atrosepticum), containing the sequence CAATAAGCTTAAGTGCGC, was cloned into a plasmid, it could initiate transfer, although the DNA is not part of any known mobilizable element (Jandle and Meyer, 2006). It has also been recently shown that the R1162 Mob proteins can initiate transfer from such sequences in the chromosomes of E. coli and Pectobacterium atrosepticum (Meyer, 2009).
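The idea that ectopic sequences can conform by chance to the oriT consensus can be illustrated with a short scan for the degenerate motif. The sketch below is only an illustration, not the procedure used in the cited studies: it translates the consensus WWWNNNNTAARTGCGC into a regular expression using standard IUPAC codes and applies it to the Pectobacterium atrosepticum sequence quoted above. The function names are hypothetical, and only the given strand is searched.

```python
import re

# Standard IUPAC nucleotide codes needed for the consensus in the text.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "W": "[AT]", "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}

def consensus_to_pattern(consensus):
    """Translate an IUPAC consensus string into a regex pattern string."""
    return "".join(IUPAC[base] for base in consensus.upper())

def find_sites(sequence, consensus):
    """Return (start, matched_text) for every match, including overlapping
    ones, on the given strand only (0-based start positions)."""
    # A lookahead with a capture group lets finditer report overlapping hits.
    pattern = re.compile("(?=(" + consensus_to_pattern(consensus) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]

ORIT_CONSENSUS = "WWWNNNNTAARTGCGC"   # consensus quoted in the text
P_ATROSEPTICUM = "CAATAAGCTTAAGTGCGC"  # genomic sequence quoted in the text

sites = find_sites(P_ATROSEPTICUM, ORIT_CONSENSUS)
print(sites)  # [(2, 'ATAAGCTTAAGTGCGC')]
```

The single hit begins at the third base of the quoted sequence. A genome-wide search would also need to scan the reverse complement, and a match to the consensus only suggests, rather than demonstrates, activity for initiation of transfer.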
Fusion of the primase to the oriT-nicking protein is found not only for RSF1010, but also for other IncQ plasmids including pIE1107 (from an uncharacterized strain) (Tietze, 1998), pCCK1900 (from Pasteurella multocida) (Kehrenberg et al., 2008) and pDN1 (from Dichelobacter nodosus) (Whittle et al., 2000). In addition, RepB of plasmid pTF-FC2 is a homolog of the RSF1010 RepB protein, but is fused to the MobA of a transfer system related to that of RP4 (Rohrer and Rawlings, 1992). Plasmid pGNB2 contains the IncQ replication genes repA and repC but encodes an unrelated primase that is nevertheless fused to the relaxase protein (Bönemann et al., 2006). Thus there appears to be strong selection for the fusion between primase and relaxase when the IncQ genes for initiation of replication are present. A plausible explanation is that the fusion is selected because it facilitates complementary strand synthesis after transfer, by tethering the primase at a location (oriT) close to the priming sites. Where plasmids have other, efficient mechanisms for complementary strand synthesis, such as primosome assembly sites, rapid delivery of the primosome might be less important. The linked primase, together with one of its initiation sites appropriately oriented for synthesis of the complementary strand, does increase the probability of successful transfer under conditions where it is infrequent (Henderson and Meyer, 1996). To explore the contribution of the priming system to transfer in greater detail, a system was developed where transfer could be measured in the absence of ongoing vegetative replication. In brief, plasmids were transformed into donor cells where they could not replicate; the resulting cells were then immediately mated. The non-replicating, transferred molecules were then captured in the recipient by lambda att-mediated integration into the chromosome (Henderson and Meyer, 1999).
Of the three replication proteins, only the MobA-linked primase affected the frequency of transfer under these conditions. However, molecules deleted for oriV, and thus lacking oriL and oriR, were transferred into an E. coli recipient at the same rate as oriV+ plasmids (Parker and Meyer, 2005). Thus, E. coli has a system for complementary strand synthesis of incoming, newly transferred DNA. When the recipient was Salmonella enterica, recovery was less efficient, and here the plasmid priming system increased the number of successful transconjugants. Thus, the fused primase might be a back-up system in recipients where an endogenous mechanism of complementary strand synthesis is unavailable. Finally, although it seems likely that the RSF1010 MobA was formed by the fusion of two, preexisting proteins, it also might be possible that in some cases the fusion is secondarily lost as well. In this connection, it is interesting that the 371-amino acid pSC101 MobA contains, near the C-terminal end, a sequence homologous to a region within the primase domain of the RSF1010 MobA. In this case, pSC101 might have acquired the fused protein, then lost the primase portion since it was no longer useful for replication.
Complementary strand synthesis in the donor, to replace the outgoing strand, is not obviously required, since in the case of RSF1010 copies of the plasmid lost by mating could be easily restored by the normal mechanism of copy-control. Nevertheless, under conditions where vegetative replication is absent, the plasmid primase is involved in strand replacement synthesis in the donor (Parker and Meyer, 2002). The restored copies are then able to undergo a second round of replication. One possibility is that the transferring DNA remains at the conjugative pore. Replacement of the outgoing strand at that site, by the MobA-linked priming system, would allow rapid regeneration of the donor DNA and facilitate additional rounds of transfer. However, as pointed out in section 3.2, rolling-circle replication, with extension from the 3′-OH at the nick site, could also rapidly generate new strands. Intermediate forms probably generated by rolling-circle replication during transfer have been detected (Erickson and Meyer, 1993). However, direct evidence for rolling circle replication as a major mode of strand replacement synthesis is lacking.
Although minMobA can process single-stranded oriT DNA both in vitro and in vivo (section 3.2), it is inactive for transfer. However, a slightly larger N-terminal fragment of MobA (amino acids 1-285) is functional for mobilization when the other Mob proteins are provided (Brasch and Meyer, 1986). Two nonexclusive explanations are that the additional residues are required to form a relaxosome able to cleave double-stranded DNA, or that they are required to engage the type IV secretion machinery. An assay was developed to detect proteins that are substrates for type IV secretion (Vergunst et al., 2000). In this assay, Cre is first fused to the test protein in the donor. The recipient cell is genetically altered so that if Cre is transported, a subsequent site-specific recombination at lox sites will result in the appearance of an easily detectable property, such as drug-resistance. With this system, the Agrobacterium tumefaciens Vir system relaxase, VirD2, as well as other Vir proteins, were shown to be transported into plant cells (Vergunst et al., 2005). In addition, a signal for type IV transport by the VirB/D4 system was localized to the C-terminal 48 amino acids of MobA (i.e., in the primase domain). A weak consensus signal rich in arginines suggests that positive charge at the C-terminal end of a protein is required for transport.
A variation of the Cre system was used to identify signals for mobilization by the IncP-1 plasmid R751 (Parker and Meyer, 2007). One signal was found in the N-terminal half of the protein, and includes some of the additional residues required for conjugal mobilization (Fig. 6). This might be the ancestral transport signal that was used prior to fusion to the primase. There was also a second signal, associated with the primase, but oddly it is not the same as the signal used by VirB/D4. Both signals on MobA contribute to transfer of the plasmid during conjugation. Both also require MobB, which along with MobA and MobC is a component of the relaxosome. In contrast to these other proteins, MobB has only a small effect on the fraction of nicked plasmid DNA, either in vivo or in vitro (Perwez and Meyer, 1996; Scherzinger et al., 1992), an effect insufficient to explain the importance of the protein for conjugative transfer (Perwez and Meyer, 1999). MobB contains a putative membrane-spanning domain, and might instead be involved in anchoring or presenting MobA, and the covalently-linked plasmid DNA, to the conjugative pore for subsequent export. In agreement with this, MobB has been shown to be associated with the membrane (Parker and Meyer, 2007). Deletion of the membrane-spanning domain disrupts this association and decreases the frequency of both type IV transport and plasmid mobilization [(Parker and Meyer, 2007) and unpublished].
The Cre fusion assay shows that residues 185-285 contain at least part of a type IV secretion signal. This region contains several highly conserved residues, both for the relaxases of the IncQ plasmids and also for other IncQ-type relaxases, including that encoded by pSC101. As with the Vir secretion system, the basic amino acid arginine is prominent at the conserved locations.
RSF1010/R1162/R300B is a remarkably successful molecular parasite, in terms of both the number of hosts it can successfully inhabit and the number of type IV systems it can exploit to insure its infectivity. What are the properties that contribute to this success, and what are their limitations? It is useful to compare the RSF1010 group with the IncP1 plasmids such as RK2, which are also broad host-range.
RSF1010 can be replicated in many cytoplasms first because it initiates replication by a mechanism that is host-independent. By contrast, the RK2 oriV requires host DnaA protein to load the essential helicase DnaB, and this limits its host-range since the DnaAs of some species are inactive (Caspi et al., 2000). Thus, RSF1010 can replicate in Streptomyces lividans, but RK2 cannot (Caspi et al., 2000; Gormley and Davies, 1991). In addition, although RepC binding at the R1162 oriV and loading of the plasmid helicase RepA can activate different priming systems (Bönemann et al., 2006; Honda et al., 1991), the one used by the IncQ plasmids is highly specific and essentially invisible to the host. Providing a dedicated priming system seems to be unusual among plasmids; the exceptions are plasmids such as pT181, which replicate by a rolling-circle mechanism. Here, a plasmid-specific protein generates a nick to provide a primer for strand extension. However, these plasmids invariably fall back on host mechanisms to generate the lagging strand (Khan, 1997).
Recruitment of a priming system for plasmid-specific replication depends on the availability of such systems. At present, the origin of the RepB priming system is mysterious. Certain IncP6 plasmids contain a related protein, also fused to a relaxase, but replicate by another mechanism and do not encode homologs of RepA or RepC (Haines et al., 2005; Schlüter et al., 2007). Most likely, the RepB-like proteins in these plasmids were acquired secondarily and reflect the modular nature of the IncP6 group.
There are no efficiently utilized, secondary sites for initiation of vegetative R1162 replication. Thus, if one of the primary initiation sites is deleted, single strands generated by initiation at the other site accumulate in the cell (Tanaka et al., 1994). The lack of secondary sites capable of utilizing chromosomally-encoded proteins emphasizes the independence of the plasmid from its host. Another possible advantage is that the apparent lack of Okazaki fragment synthesis might mean a simpler replication fork, with coordinated synthesis on both strands by the DNA polymerase III replication complex no longer required. This in turn could make the plasmid more tolerant of the different replication machines in its host range. A penalty imposed by this mode of replication of R1162 might be a limitation on size (Rawlings and Tietze, 2001). Large plasmids would result in large amounts of single-stranded DNA, which might induce the SOS response, and this in turn limits the number of genes carried by the plasmid. A recent survey revealed that the naturally-occurring plasmids with an RSF1010-like replicon have sizes of 14,000 bp or less [(Rawlings and Tietze, 2001) and Fig. 1].
A lack of dependence on host-activated sites for replication could create another problem as well. Such sites could be used after conjugative transfer into recipient cells for rapid complementary strand synthesis, improving the chance of establishment. However, strands transferred into E. coli are recovered as plasmids even in the absence of oriV, indicating that some efficient system for initiation of complementary strand synthesis is active under these circumstances (Parker and Meyer, 2005). Initiation of synthesis is not due to the primase encoded by the mobilizing IncP1 vector and is indeed probably due to host-encoded proteins, since it is less active in Salmonella. Why then has fusion of the primase to the relaxase been selected? In cases where a host system for strand synthesis is unavailable, delivery of the primase via the covalently linked relaxase might serve as another way of restoring the duplex molecule (Parker and Meyer, 2005).
The stability of RSF1010 is determined by its copy-number: when this number is lowered, the plasmid becomes unstable (Becker and Meyer, 1997). It has been shown in numerous studies that a plasmid introduces a metabolic burden on its host, primarily due to the demands of synthesizing plasmid-encoded proteins (Bentley et al., 1990; Lenski et al., 1994). This metabolic burden is minimized by elimination of plasmid genes, strict regulation of those that remain, and the lowering of copy-number, made possible by the acquisition of mechanisms to insure plasmid maintenance (Thomas, 2004). The IncQ plasmids are not excepted from the problem of metabolic burden. The chloramphenicol-resistance gene from Tn9 was cloned into R1162. In the presence of chloramphenicol, cells containing plasmids with lower copy-number (due to the presence of additional DRs) outgrew those having plasmids with the normal copy-number. However, in the absence of selection, the lower copy-number plasmids were gradually lost from the cell, as expected. As just described, one way to resolve the conflict between the burden of gene expression and the requirement for stability is to have a system that insures stable plasmid inheritance. There is no intrinsic problem with the IncQ plasmids acquiring such a system: the partitioning system of P1 stabilizes a low copy-number R1162 derivative in E. coli (Becker and Meyer, 1997). However, true par modules are absent from the IncQ plasmids. This is not due to the broad host-range of the plasmid, since RK2 has a partitioning system that is active in a number of different hosts (Siddique and Figurski, 2002). On the other hand, the IncQ2 group of plasmids (Deane and Rawlings, 2004) (Fig. 1) do encode toxin-antidote systems, which have the potential to contribute to plasmid stability in situations where the copy-number of the plasmid is low.
The IncQ plasmids are thus maintained primarily by high copy-number. This presumably limits which genes are carried by the plasmid and the level of their expression. Two genes for streptomycin-resistance are carried by RSF1010 and many other IncQ plasmids (Fig. 1). In contrast to chloramphenicol-resistance from Tn9, lowering the copy number of RSF1010 reduces resistance to streptomycin and leads to the selection of higher copy-number variants (Becker and Meyer, 1997). Thus, the resistance-level encoded by each copy of the plasmid is matched to the high copy-number, preventing the appearance of low copy-number, unstable variants and reducing plasmid loss during periods of non-selection. Since cells containing IncQ plasmids are exposed to a variety of antibiotics, it might be expected that the level of expression, rather than the range of antibiotic-resistance genes acquired, would be affected by the high copy-number. It has been observed that the accessory genes of IncQ plasmids seem to have weak promoters and ribosome binding sites (Rawlings and Tietze, 2001). The immensely complex regulatory circuitry of IncP1 plasmids might reflect the need to regulate tightly but flexibly gene expression in different hosts. The apparent multiple control of repC expression (section 2.4) might reflect the same requirement, although on a far less grand scale.
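The dependence of stability on copy-number can be made concrete with a simple random-partitioning model. The sketch below assumes that, in the absence of an active partitioning system, each plasmid copy segregates independently and with equal probability to either daughter cell at division; this idealization and the copy-numbers used are illustrative assumptions and are not taken from the studies cited above.

```python
def plasmid_free_probability(copies_at_division):
    """Probability that a division yields a plasmid-free daughter when each
    copy goes independently to either daughter with probability 1/2."""
    if copies_at_division < 1:
        return 1.0  # no copies to inherit
    # Either daughter can be the empty one: 2 * (1/2)**n = 2**(1 - n)
    return 2.0 * 0.5 ** copies_at_division

# Hypothetical copy-numbers at the time of division, for illustration only.
for n in (2, 4, 10, 20):
    print(n, plasmid_free_probability(n))
```

In this model the chance of generating a plasmid-free cell falls by half with each additional copy, so a modest reduction in copy-number can greatly increase the rate of loss.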
RSF1010 is also a successful parasite of the Type IV secretion systems encoded by bacteria and their plasmids (Introduction, Section 1). The plasmid competes successfully with the endogenous, natural substrates of these systems, whether they are protein-DNA complexes of other plasmids or proteins involved in pathogenesis. As a result, RSF1010 can reduce the oncogenicity of the Agrobacterium tumefaciens Ti plasmid by competing for the VirD4 docking protein (Cascales et al., 2005). The plasmid inhibits the virulence of Legionella pneumophila as well, probably by a similar mechanism (Segal and Shuman, 1998). Since type IV secretion systems are well-distributed, the ability to use many of these for transfer might represent an adaptation responsible in part for the broad host-range of RSF1010. An interesting exception is that RSF1010 does not compete well for the type IV secretion system encoded by the F factor (Willetts and Crowther, 1981). It was shown that highly efficient F transfer, and exclusion of RSF1010, were due to a carboxy-terminal tail on the coupling protein (Sastre et al., 1998). Thus the tail served two purposes, increasing efficiency of coupling to the F factor relaxase, and offering immunity to competition from broad host-range parasites.
How is RSF1010 able to use a variety of different type IV transporters? Part of the answer might be that it has simply evolved a general signal, recognized by the coupling proteins of different type IV systems. However, the signals for export of RSF1010 by the Agrobacterium tumefaciens Vir system and IncP1 plasmids appear to be different (Section 3.4). Thus, RSF1010 might contain different signals active with different secretion machines. This seems like a limiting strategy, however, and part of the explanation might be that a weak, general transport signal is amplified. In this connection, the role of the small protein MobB might be to anchor the plasmid to the membrane, thus increasing the chance of interaction with the secretory apparatus.
In summary, to continue the analogy put forward at the beginning of this review, the IncQ plasmids share many of the attributes of successful actors: