Summary: A classical feature of the tyrosine recombinase family of proteins catalyzing site-specific recombination, as exemplified by the phage lambda integrase and the Cre and Flp recombinases, is the ability to recombine substrates sharing very limited DNA sequence identity. Decades of research have established the importance of this short stretch of identity within the core regions of the substrates. Since then, several new enzymes that challenge this paradigm have been discovered and require the role of sequence identity in site-specific recombination to be reconsidered. The integrases of the conjugative transposons such as Tn916, Tn1545, and CTnDOT recombine substrates with heterologous core sequences. The integrase of the mobilizable transposon NBU1 performs recombination more efficiently with certain core mismatches. The integration of CTX phage and capture of gene cassettes by integrons also occur by altered mechanisms. In these systems, recombination occurs between mismatched sequences by a single strand exchange. In this review, we discuss literature that led to the formulation of the current strand-swapping isomerization model for tyrosine recombinases. The review then focuses on recent developments on the recombinases that challenged the paradigm that was derived from the studies of early systems.
Integrative conjugative elements (ICEs), also called conjugative transposons, are making a major contribution to bacterial evolution by moving antibiotic resistance genes, virulence genes, and metabolic genes across species and genus lines. ICEs further extend their genomic reach by mobilizing unrelated plasmids and integrated mobilizable elements (13, 72). Integrated mobile elements have now been found in a number of different phylogenetic groups, including the Bacteroidetes, the Firmicutes, and the proteobacteria (13, 14, 27, 43). The integrases of ICEs and integrated mobilizable elements are usually tyrosine recombinases related to those of lambdoid phages; however, important differences are being discovered, which are challenging the mechanistic paradigm based on the previously studied recombinases.
Tyrosine recombinases make up one of two classes of proteins catalyzing site-specific recombination (35, 56). Unlike general recombination that requires large segments of homologous DNA, tyrosine recombinases catalyze recombination between substrates that share limited sequence identity (traditionally referred to as homology). The sequence identity usually extends over the short strand exchange region and flanking recombinase binding sites. DNA homology within the 6- to 8-bp region between the strand cleavage sites, called the overlap, spacer, or crossover region, is critical to the recombination reaction in most cases studied. This article provides a concise review of the literature related to the recombination mechanism of tyrosine recombinases that established the current homology paradigm, with focus on new developments regarding the role of sequence identity.
There are more than a hundred members of the tyrosine recombinase family, and they perform a variety of DNA rearrangement reactions while sharing the same active-site chemistry, in which a tyrosine nucleophile attacks the scissile phosphate (62). Some of the best-studied members of the family are phage lambda integrase (Int) and the Flp, Cre, and Xer recombinases. Lambda Int is required for the integration and excision of phage lambda into and out of the Escherichia coli chromosome (7). Int is a heterobivalent protein that recognizes two classes of DNA sequences: the arm and core sites (Fig. 1A). Recombination occurs within the identical overlap sites found in both phage attP and bacterial attB sites. The arm sites also bind accessory DNA-bending proteins (IHF, Fis, and Xis) that govern the directionality of the reaction. The Cre recombinase is encoded by phage P1. Cre maintains the genome of phage P1 in the monomeric state by mediating recombination between two 34-bp loxP sites (Fig. 1B) (82). The Flp recombinase maintains the high copy number of the Saccharomyces cerevisiae 2μm plasmid by recombining two identical 34-bp minimal frt sites (Fig. 1C). In contrast to the lambda system, both Cre and Flp are simple recombination systems requiring no accessory proteins or sequences. The Xer system is a conserved feature of most bacteria, and it ensures proper chromosome segregation during cell division by acting on the dif site to convert chromosomal dimers to monomers. It is also responsible for converting ColE1 and pSC101 plasmid multimers into monomers by acting on the cer and psi sites, respectively (Fig. 1D). The Xer system is unusual in that it utilizes two tyrosine recombinases, XerC and XerD. The requirement for accessory proteins and sequences in XerC/XerD recombination varies with the substrate.
Recombination catalyzed by most tyrosine recombinases occurs by sequential strand exchanges between partner sites. One strand in each site is first cleaved by the recombinase to form a 3′-phosphotyrosyl intermediate and a free 5′-hydroxyl (83) (Fig. 2). The next step is a strand exchange and rejoining reaction in which the 5′-hydroxyl groups attack the phosphotyrosyl bonds of the partner sites to form phosphodiester bonds. This reaction generates a Holliday junction (HJ) intermediate. The opposite strands of each site are then exchanged to form recombinant products. The distances between the sites of exchange are typically 6, 7, or 8 bp, depending upon the system.
In the Int, Cre, and Flp systems, it was established very early on that the identity between the two recombining substrates within the overlap region was necessary for recombination to occur (Fig. 1). This was first demonstrated for lambda Int (8, 9, 24, 57, 85) for both integrative and excisive recombination. Several site affinity, or saf, mutants that contained single or multiple mutations within the overlap region of attP or attB were isolated. They blocked recombination both in vivo and in vitro. When identical saf mutations were made in both att sites, the recombination efficiency was similar to that of crosses with wild-type sites. The effect of heterology was limited to the 7-bp overlap sequence. A mismatch at the position just outside the overlap region (to the left of the top cleavage site in Fig. 1A) did not affect the recombination reaction (8). Sequence identity within the spacer region was shown to be essential in the Cre-loxP and Flp-frt systems as well (3, 37, 53, 75).
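The identity requirement described above can be illustrated with a minimal Python sketch. The function names and the 7-bp example strings are hypothetical placeholders and are not the sequences of any real att site; the sketch only reports where two overlap regions differ and whether they satisfy the classical rule of complete identity.

```python
def overlap_mismatches(overlap_a, overlap_b):
    """Return 1-based positions at which two overlap regions differ."""
    a, b = overlap_a.upper(), overlap_b.upper()
    if len(a) != len(b):
        raise ValueError("overlap regions must have the same length")
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]


def satisfies_identity_rule(overlap_a, overlap_b):
    """Classical rule: recombination is expected only if the overlaps are identical."""
    return not overlap_mismatches(overlap_a, overlap_b)


# Placeholder 7-bp overlaps (hypothetical, for illustration only).
wild_type = "ACGTACG"
saf_like = "ACGAACG"  # single substitution at position 4

print(overlap_mismatches(wild_type, saf_like))      # [4]
print(satisfies_identity_rule(wild_type, saf_like))  # False: mutant x wild-type cross
print(satisfies_identity_rule(saf_like, saf_like))   # True: same mutation in both sites
```

The last two calls mirror the saf experiments: a mutation in only one site breaks identity, whereas the same mutation in both sites restores it.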
The homology rule was violated by the discovery of the coupling sequence mechanism for the conjugative transposons Tn916 and Tn1545 found in Enterococcus faecalis and Streptococcus pneumoniae, respectively (15, 66). The overlap regions of the att sites of these elements were nonhomologous. Subsequently, other mobile elements encoding tyrosine recombinases like Bacteroides CTnDOT and NBU1 were discovered. CTnDOT has a coupling sequence mechanism similar to that of Tn916 and Tn1545, and NBU1 integrase has interesting homology requirements that were not apparent from initial analyses of recombination sites (19, 49, 68, 73). The integration of CTX phage and the capture of gene cassettes by integrons use a single strand exchange mechanism to recombine nonhomologous core sequences (11, 81). With new insights into the mechanism of these integrases, it is time to revisit the paradigm of the importance of homology within the crossover region.
Early studies of the Int, Cre, and Flp systems showed that homology in the overlap region promoted recombination, which indicated that the overlap regions interact with each other and that DNA-DNA interactions, in addition to DNA-protein interactions, were involved in determining site specificity during recombination (85). To study the role of homology, it was necessary to first understand what strand exchange mechanisms were involved in recombination. Two models were proposed. The “cohesive-end model” stated that the recombinase made concerted double-stranded cuts at both recombining sites and that homology was needed for annealing the cohesive ends together before ligation (58). The reaction would be analogous to the cleavage of DNA sites by a restriction enzyme to generate cohesive ends that anneal together and become sealed by DNA ligase. Bauer et al. (7) showed that lambda Int-mediated recombination occurred efficiently even when only one of the two strands of attB was complementary to attP. The cohesive-end model predicted that in crosses with wild-type attP and heteroduplex attB sites, only nonreciprocal products containing one completed product and a broken site would be formed. However, many crosses resulted in reciprocal products, arguing against the model (58).
The “branch migration” model proposed that the first strand exchange occurs at one end of the overlap or crossover region and that the branch point of the resulting HJ intermediate migrates to the other end of the crossover region, where it is resolved by the second strand exchange (85). Homology between the two sites would be necessary for branch migration. Although genetic evidence suggested that at least a fraction of Int-mediated recombination proceeded via HJ intermediates (28, 29), and Int could efficiently resolve synthetic att-site HJ structures with the branch point in the core region (38), HJs were not initially observed in vitro. It took ingenious methods that blocked the second strand exchange to show that the HJ was an intermediate in Int recombination (42, 64). An att site with a single-stranded nick (64) or a phosphorothioate linkage (42) at the bottom-strand cleavage site accumulated a reaction intermediate. Analysis of this intermediate revealed that it was always formed by the exchange of top strands. A minor amount of this intermediate also accumulated in unblocked reactions when a variant of Int (Int-h) was used (40). Synthetic peptides were developed recently as a more efficient tool for trapping HJs with natural substrates (10, 16, 17). Thus, it was established that Int carries out two sequential strand exchanges in a defined order (40, 41, 64).
When a covalent cross-link was introduced between complementary bases within the overlap region of lambda att sites, recombination was blocked, although HJ intermediates were accumulated (22). This result indicated that the dissociation of strands was essential for recombination, thus providing further support for the branch migration model.
HJs were also found in other systems. HJs were observed in the Flp system, and Flp was able to resolve the isolated HJ intermediates (55). Cre mutants that accumulated HJ intermediates were identified, and it was shown that wild-type Cre could resolve the HJ intermediate to recombinant products (36). Cre appeared to have a preference for exchanging the bottom strands first to form the intermediate (36), and recent work has confirmed this observation (30, 31, 46). Flp, on the other hand, carries out nonbiased sequential cleavages: the HJs obtained had an equal probability of being formed by top- or bottom-strand exchanges (2). The strand exchanges in the XerCD system depend on the sites involved. Recombination at cer proceeds by a single strand exchange performed by XerC to yield an HJ, which is resolved by cellular processes (20, 51). At psi sites, there are ordered strand exchanges, with XerC cutting first to form the HJ, which is then resolved by XerD cleavage (5). Recombination at dif requires the FtsK protein, which switches the catalytic state of XerC and XerD so that recombination is initiated by XerD (6).
Initial models proposed by Weisberg et al. (85) and Bauer et al. (9) suggested that homology plays a role in synapsis and/or during strand exchange for branch migration of the junction. However, Nash and Pollock argued against the role of homology in branch migration since until then, no HJ intermediates were observed in heterologous crosses where the overlap regions were different (59). Instead, they proposed that the requirement for homology involved DNA interactions during synapsis that mediated the formation of a four-stranded DNA helix (39). The requirement for homology would fit neatly into such a model since the formation of a four-stranded helix would be possible only if the two DNA duplexes were homologous to each other (52). However, in reactions mediated by Int, partner sites with overlap mismatches on the right side of the overlap region accumulated HJ intermediates (41, 64), indicating that heterology did not affect synapsis or the first strand exchange. Later, it was shown that attB has a weak affinity for Int and that Int and IHF assemble on attP and that synapsis occurs by the capture of the naked attB DNA by the attP-intasome (70). Richet et al. devised an attB cleavage assay using heteroduplex attB sites and attP-saf variants with overlap regions that were not homologous to either strand of the heteroduplex attB and showed that homology was not required for the capture and cleavage of attB (70). Other experiments using half-att sites carrying either a P or P′ arm and heterology in the overlap region showed that heterology did not affect synapsis or initial strand exchange (63).
The location of the heterology appeared to determine the outcome of the reaction. In Int-mediated recombination, Holliday intermediates accumulated when mismatches were present on the right side of the overlap or crossover region but not when mismatches were closer to the site of top-strand exchange (41, 64). This was explained by the effect of heterology on branch migration. Following the first strand exchange, any heterology close to this site would block branch migration and reverse the strand exchange, but heterologies distal to the site would not destabilize the Holliday intermediate and instead would block its resolution to products. An attP with a flipped core shows a reversed order of strand exchanges in recombination with an attB site. Thus, the strand that is normally exchanged first (the original top strand) is exchanged second and the strand that is normally exchanged second (the original bottom strand) is exchanged first. It was found that mismatches between attP and attB had the opposite effect, with heterologies to the right blocking the reaction and those to the left accumulating the intermediate (40). Experiments with artificial HJs showed that heterologies near the second strand exchange site did not inhibit resolution but affected the directionality of the reaction. Lambda Int resolves wild-type junctions equally to parental and recombinant products, but HJs with mismatches on the right side of the overlap region were resolved almost entirely to parental products (23). Interestingly, the order of strand exchanges and the effect of heterology also govern the choice of secondary sites into which phage lambda inserts. All the secondary sites examined have the first three bases of the left side of the overlap region conserved. 
The remaining bases to the right are not conserved, probably because during phage integration into a secondary site, the HJ intermediate that is formed by top-strand exchange is resolved by chromosome replication or by RuvC instead of Int (71). Thus, it was accepted for several years that sequence identity was essential for branch migration of the junction.
Synapsis was not affected by heterology in the Flp system either. Synaptic intermediates were observed even when the two frt sites had heterologous core sequences (2). In crosses between half and full frt sites, synapsis was seen to occur in both homologous and nonhomologous alignments. Strand exchange could also be initiated in either alignment, but the product was greatly favored in the homologous alignment (76). Mismatches did not prevent the formation of the HJ intermediate. When the heterology was closer to the top-strand exchange site, the HJ was formed by the exchange of bottom strands alone (2). This was explained by the presumed ability of the branch to migrate through more base pairs when formed by bottom-strand exchange than by top-strand exchange.
Similar to results found with Int, the position of the mismatch dictated the outcome of the reaction for Cre recombinase. Substitutions in the loxP site at bases close to the first strand exchange site completely blocked the reaction, but those distal to it trapped the Holliday intermediate (44).
The importance of homology has not been clearly established for the XerC and XerD recombinases. Recombination at cer was not affected by mismatches on the right side of the core (51). This was expected since only a single top-strand exchange mediated by XerC occurs at cer (20). Although mismatches on the left side of the core abolished recombination, homologous substitutions did not restore it. It is possible that the first two bases on the left side of the overlap region are important since they are present in all cer homologues (51).
Doubts about the branch migration model were raised when it was observed that Int (61) and Flp (26) did not require branch migration to resolve synthetic HJs. Nunes-Duby et al. constructed HJs that had limited branch mobility within the center of the overlap region. They found that resolution by Int was most efficient when the mobility was limited to the central base pair (61). Resolution of immobile junctions with a fixed branch point was best when the crossover point was fixed at 2 to 3 bp within the spacer and was poor at the ends of the spacer region. Additionally, a minor shift of the branch point around the center of the overlap switched the resolution bias from top-strand to bottom-strand resolution. These experiments indicated that the branch point is at the center of the crossover region and not at the ends, as previously supposed. Those authors proposed a novel strand-swapping isomerization model in which there are two sequential symmetrical swaps of three bases each between partners. The homology-sensitive step would be at the annealing stage before strand joining. An isomerization step that may involve a limited branch migration of the central 1 to 2 bp would promote the second strand exchange (61).
The same model was proposed for Flp by Lee and Jayaram (45). They measured cleavage and strand joining in half-frt sites and the amount of strand transfer product formed in half-site-by-full-site crosses and full-site-by-full-site crosses. They showed that strand rejoining, but not cleavage, requires homology between the first two to three bases adjacent to the site of ligation. Ligation was insensitive to heterology beyond the first three positions. They concluded that the functional role of homology was to align the 5′-hydroxyl of the cleaved strand by Watson-Crick base pairing for the strand-joining reaction. They proposed that branch migration would be limited to the central 2 bp of the core (45). This finding also supported data from previous reports showing that Flp required 2 bp of homology at either end of the overlap region for efficient strand transfer product formation in full-site-by-half-site crosses (1, 67). The results described previously by Zhu et al. also supported the role of homology in promoting efficient ligation in Flp-mediated recombination. They measured strand rejoining using several assays and observed that complementarity at the position immediately adjacent to the ligation site was most important (86).
Experiments using the XerCD system also did not favor the branch migration model. XerC cleavage of synthetic cer HJ structures with the junction position constrained to different points by heterology suggested that the optimal position was at least two bases from the XerC cleavage site. During recombination of these HJs, strand rejoining was most inefficient when the mismatch was at the site of cleavage and was normal when the mismatch was four bases away, supporting a requirement for sequence identity in the ligation step (4).
Burgin and Nash (12) observed that although cleavage by lambda Int at the B core site of attB in the presence of an attP intasome was insensitive to homology, cleavage at the B′ core site depended on homology between the first 3 bp closest to the B site. Since the second cleavage occurred in the absence of strand joining at the B site, they argued against strand rejoining being the homology-sensing step.
Nunes-Duby et al. extended these and their own previous observations by employing various heterologous half-site-by-full-site crosses in lambda excisive recombination. They followed the accumulation of covalent Y junctions formed by double-strand transfer that is analogous to a top-strand exchange. They showed that homology is sensed prior to ligation and that strand transfer occurred even when ligation was blocked by a 5′-PO4 (65). However, homology stimulated ligation, and ligation stabilized the Y-junction products. Single-base mismatches at both ligation sites prevented strand transfer products from accumulating. Moreover, they found that complementary base pairing at one site stimulated strand transfer at the other mismatched site. This explained the results described previously by Bauer et al. (7), where it was shown that a heteroduplex attB could recombine with wild-type attP when only one of the strands of attB was complementary. Correct annealing of either strand was necessary and sufficient for a complete strand exchange (65). Nash and Robertson had observed different products with a reaction containing a saf+ attL and heteroduplex saf+/safG attR than with a heteroduplex safG/saf+ attR (60); their results were explained by taking into account reverse alignments of the overlap regions. The attR site with the safG bottom strand has homology in reverse with the saf+ attL site and produced Y junctions, while the attR with the saf+ bottom strand has complementarity only in parallel and gave truncated linear products.
Although wild-type Int is very sensitive to homology at the sites of ligation, Int R169D, which is defective in forming a proper dimer/tetramer interface, is insensitive to mismatches (47). It appears that multimeric Int complexes have greater specificity for base pairing within the active site, and this could explain why isolated Int-mediated hairpin loop formation can occur in the absence of homology.
Studies performed previously by Azam et al. addressed the possibility that mismatches did not inhibit Flp-mediated recombination but inhibited the appearance of recombinant products. They used topological analysis of supercoiled substrates to show that in the presence of heterology, there were two rounds of recombination, which yielded the nonrecombinant configuration (6a).
The strand-swapping-isomerization model gained strong support from the various Cre-loxP crystal structures described previously by Gopaul et al. and Guo et al. (32-34). Cre forms a cyclic tetramer that holds the DNA in a square planar conformation, with the ends to be exchanged lying near each other at the center of the complex. The crystal structure did not support any large-scale protein-DNA rearrangements that would be required for branch migration and instead suggested that complete strand exchange could occur within the synaptic architecture. Crystal structures of the Flp-HJ intermediate also supported a model with minimal branch migration and relatively modest movements needed for the isomerization of the HJ (18, 21).
The Cre-nicked HJ complex trapped the intermediate where three bases had been swapped but ligation had not yet taken place (32). This confirmed that it was possible for strand exchange to occur in the absence of ligation. The structures of both the cleaved Cre-DNA intermediate and Flp-nicked HJ provided additional support for the requirement for homology in the strand ligation step. The recombinase does not contact the swapping strand but makes a number of contacts to the sugar phosphate backbone of the complementary strand. Since access to the phosphotyrosine linkage is stereochemically limited, the incoming 5′-OH would have to occupy a base-pairing position to be able to attack it.
A general model for the tyrosine recombinase family described by Voziyanov et al. explains how the spacer length affects the amount of branch migration (84). In the case of Cre recombinase reactions with a 6-bp spacer, the two scissile phosphates are in the same binding plane, and the angle between them is zero, thus requiring minimal branch migration. For Int and Flp, the binding planes of the two phosphates differ by 36° and 72°, respectively, so each phosphate must rotate through 18° and 36°, respectively, to reach the catalytically active state. These rotations correspond to branch migrations of 1 and 2 bp greater than that for Cre. Since in each case the spacer length exceeds the branch migration range by about 6 bp, branch migration alone cannot account for the strand exchange, and the remaining base pairs must be exchanged by two swaps of 3 bp each. The size of each swap (3 nucleotides) appears to be the same for Int, Cre, and Flp.
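The arithmetic behind this model can be reproduced in a few lines of Python. This is a sketch under stated assumptions rather than the published formulation: the value of 36° per base pair is taken here as the nominal helical twist implied by the angles quoted above, the 8-bp spacer for Flp is inferred from the statement that its branch migration exceeds that of Cre by 2 bp, and the function and constant names are hypothetical.

```python
SWAP_SIZE_BP = 3          # bases exchanged in each of the two swaps (from the text)
TWIST_DEG_PER_BP = 36     # assumption: nominal helical twist per base pair

# Spacer lengths; the Flp value is inferred, see the lead-in above.
SPACER_BP = {"Cre": 6, "Int": 7, "Flp": 8}


def spacer_geometry(spacer_bp):
    """Return (extra branch migration in bp, angle between the phosphate
    binding planes in degrees, rotation required per phosphate in degrees)."""
    extra_bp = spacer_bp - 2 * SWAP_SIZE_BP      # bp not covered by the two swaps
    plane_angle = extra_bp * TWIST_DEG_PER_BP    # 0, 36, 72 for 6, 7, 8 bp
    rotation_each = plane_angle / 2              # each phosphate rotates half the angle
    return extra_bp, plane_angle, rotation_each


for name, spacer in SPACER_BP.items():
    extra, angle, rotation = spacer_geometry(spacer)
    print(f"{name}: spacer {spacer} bp, plane angle {angle} deg, "
          f"rotation {rotation} deg, extra branch migration {extra} bp")
```

In every case the spacer minus the extra branch migration equals 6 bp, which is the part attributed to the two 3-bp swaps.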
The absolute dependence of tyrosine recombinases on overlap sequence homology was first challenged by the discovery of the closely related conjugative transposons Tn916 and Tn1545 (15, 66, 74). Conjugative transposons exist integrated in the bacterial chromosome. Under the appropriate conditions, they excise to form covalently closed circular intermediates from which a single strand is transferred into the recipient cell by conjugation. Complementary-strand synthesis then occurs, and the double-stranded element integrates into the recipient chromosome. The excision of Tn916 and Tn1545 occurs when the integrase makes 6-bp staggered cuts at the ends of the transposon. Since the two ends are not complementary, the 6-bp “coupling sequences” form a heteroduplex joint in the circular intermediate. During integration, the integrase makes 6-bp staggered cuts on the transposon and at the target site, and integration produces heteroduplexes at the left and right junctions that are resolved by chromosome replication or mismatch repair (15, 74, 80). Since recombination always occurs between 6 bp of heterologous sequences, it is clear that recombination cannot involve branch migration or strand swapping of homologous sequences through the entire coupling sequence.
More recently, it was discovered that the Vibrio cholerae phage CTX uses a novel mechanism to integrate its genome into the host chromosome. It uses the host-encoded XerC and XerD recombinases to integrate its plus single-stranded genome into the chromosomal dif1 site (Fig. 3). It was initially assumed that upon the entry of the phage genome into the bacterial cell, complementary-strand synthesis would generate the active double-stranded form that undergoes recombination. However, the spacing between the XerC and XerD cleavage sites differed between the phage attP and the chromosomal dif1 sites (54). This suggested that recombined DNA would have nonhomologous sequences and lengths, which were never observed in vivo. The phage solves this problem by folding its plus single strand, which has a second XerCD binding site separated from the first by 90 bp, into a forked hairpin structure (Fig. 3) such that the stem forms a dif site with a bulge in the central region (81). In this structure, three bases immediately 3′ to the XerC cleavage site are shared between attP and attB (dif1). In vitro and in vivo experiments have confirmed that only a single strand exchange mediated by XerC occurs, and the resulting branched intermediate (HJ) is resolved by DNA replication. The minus strand cannot promote integration because the hairpin structure that it forms lacks the necessary homology in the central region. This model also explains why the CTX phage never excises, since neither of the two dif-like sites in the prophage can recombine with dif1 due to a lack of homology (81).
A similar strategy has been employed for the capture of gene cassettes by integrons. The attI site in the integron is a typical recombination site, but the attC sites in the gene cassettes show poor sequence conservation and vary in length, resulting in nonhomologous sequences in the region of recombination (79). The attC bottom strand can fold upon itself to form a double-stranded structure that almost resembles a canonical site but with an unpaired central segment. Cassette integration occurs between the attC bottom strand and a double-stranded attI by a single-strand exchange mediated by the IntI recombinase (11). Unlike that seen for the CTX phage attP, the hairpin structure does not result in homology between attC and attI. IntI recombinase employs a sequence-dependent recognition of the attI site and a unique sequence-independent method to recognize attC sites by interacting with an extrahelical base on one of the attC arms (48). Excision of the gene cassettes occurs by attC × attC recombination.
The Bacteroides conjugative transposon CTnDOT has a life cycle similar to that of Tn916 except that its excision and transfer are stimulated 1,000-fold by tetracycline (72). It uses a coupling-sequence mechanism reminiscent of that of Tn916. CTnDOT integrates into several sites, but all known attB target sites and the attDOT site have a 10-bp sequence (5′-GTAANTTTGC-3′) within the B- or D-core-type site (Fig. 4). Originally, the coupling sequences were thought to be 5 bp in length, since only five bases varied between sites (19). In vitro cleavage assays using phosphorothioate substrates revealed that the cleavage site on the top strand was located two bases to the left (between the T and G) within the 10-bp region of homology. Thus, the cleavage sites for IntDOT are 7 bp apart, and the attDOT- and attB-site overlap regions share 2 bp of identity on the left side (19, 49) (Fig. 4).
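As an illustration of how the conserved 10-bp sequence could be located in a candidate target, the following Python sketch matches 5′-GTAANTTTGC-3′ against a DNA string, treating N as any base. The function names and the test sequences are hypothetical and are not taken from any real attB site; the sketch does not model cleavage positions or strand polarity.

```python
CONSENSUS = "GTAANTTTGC"  # conserved 10-bp sequence quoted in the text; N = any base


def matches_consensus(site, consensus=CONSENSUS):
    """True if `site` has the consensus length and agrees at every non-N position."""
    site = site.upper()
    if len(site) != len(consensus):
        return False
    return all(c == "N" or c == s for c, s in zip(consensus, site))


def find_consensus(seq, consensus=CONSENSUS):
    """Return 0-based start positions of every consensus match in `seq`."""
    seq = seq.upper()
    n = len(consensus)
    return [i for i in range(len(seq) - n + 1)
            if matches_consensus(seq[i:i + n], consensus)]


# Hypothetical sequences, for illustration only.
print(matches_consensus("GTAACTTTGC"))    # True: N position filled by C
print(matches_consensus("GTAACTTTGA"))    # False: last base differs
print(find_consensus("AAGTAAGTTTGCTT"))   # [2]
```

Only the top strand as written is scanned; searching the reverse complement as well would be a natural extension.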
To explain how recombination occurs through 5 bp of heterology, it was proposed that CTnDOT may use its single-stranded DNA and integrate by a single strand exchange step similar to that seen for CTX phage and integron gene cassettes. However, it is unlikely that CTnDOT uses this strategy since in vitro systems for both integration and excision use double-stranded substrates. The in vitro integration reaction between a circular attDOT and a linear attB results in a linear recombinant that can be formed only by a double-strand exchange mechanism (25, 50). The HJ intermediate was trapped using nicked attDOT substrates and was found to be preferentially formed by top-strand exchange (50). This places the 2-bp region of identity next to the first strand exchange site. Substitutions of these two bases in either attDOT or attB so that these positions were mismatched severely depressed the reaction. However, sites with identical substitutions allowed the reaction to proceed efficiently (50). Thus, homology was important for the first strand exchange. However, heterology at the remaining five positions in the crossover region does not affect the reaction. Making the two 7-bp overlap regions completely identical does not stimulate the reaction, indicating that heterology does not inhibit the second strand exchange. Apparently, the first strand exchange of IntDOT is homology dependent, while the second strand exchange is homology independent. Among the insertion sites that were analyzed for Tn916 and Tn1545, two bases outside the 6-bp overlap region (TT to the right and AA to the left) are conserved. We suggest that it is possible that, similar to IntDOT, the first cleavage site for Tn916 and Tn1545 may be in a region of homology.
NBU1 is a Bacteroides mobilizable transposon that relies upon CTnDOT for its excision and transfer to a new host but encodes its own integrase, IntN1 (77). The integration of NBU1 has several parallels with that of phage lambda. Integration is highly site specific, with a single target site (attBT1-1) in the Bacteroides host chromosome lying at the 3′ end of a leu-tRNA gene (78). The attBT1-1 site shares a 14-bp region of identity with the NBU1 attN1 site, and the cleavage sites for IntN1 are spaced 7 bp apart (Fig. 5) (68, 78). An important difference between IntN1 and the rest of the lambda Int family lies in its lack of dependence on overlap sequence homology. Although strand exchange normally occurs between completely identical sites, fortuitous observations showed that mismatches at positions −2 and −3, next to the bottom-strand cleavage site, drastically stimulated the integration reaction (73) (Fig. 5). The G(−2)C substitution in attN1 and the C(−3)G substitution in attBT1-1 enhance the reaction rates both in vivo and in vitro. Identical mutations in both sites, which restore homology, result in reaction frequencies below the wild-type level, unlike that seen for lambda Int, Cre, or Flp. IntN1 is the only recombinase known for which heterologies can stimulate the reaction. NBU1 does not employ the strategy of folding its single-stranded DNA into a recombination substrate (69). An in vitro integration system was used to show that IntN1 performs ordered sequential strand exchanges via an HJ intermediate that is formed by the exchange of bottom strands (69). With the C(−3)G attBT1-1, the HJ intermediate was formed in spite of a mismatch one base from the site of the first strand exchange. In contrast, the second strand exchange appears to be homology dependent, with very little recombinant being formed in the in vitro reaction in the presence of mismatches at position −7 or −8 (69). 
IntN1 is unique in the family of tyrosine recombinases because it can perform the first strand exchange in the absence of homology at the sites of cleavage and exchange. It is likely that the attN1 and attBT1-1 sites are complementary to maintain the sequence of the leu-tRNA gene after recombination.
Since only the G(−2)C substitution in attN1 and the C(−3)G substitution in attBT1-1 result in enhanced activity, these substituted bases may make favorable contacts with IntN1 in the intasome. There is evidence suggesting that the HJ formed with the C(−3)G attBT1-1 has a different conformation than that formed with the wild-type substrate (69). The enhanced reaction rates could be due to a combination of particular protein-DNA contacts made in the intasome by the substituted bases and a favorable HJ conformation. The opposite effects seen with the attN1 and attBT1-1 mutations may be a result of the difference in binding sites for IntN1 on the right side of the crossover region.
Studies of other tyrosine recombinases suggested that the role of homology is to facilitate strand ligation and stabilize the strand exchange products. Most of the tyrosine recombinases mentioned in this section can perform a strand exchange in the absence of a complementary base at the site of ligation. In an experiment that directly compared ligation reactions of IntDOT and lambda Int, IntDOT could ligate substrates with all 7 bp mismatched in the overlap region, whereas lambda Int could not tolerate a single base pair mismatch next to the site of ligation (50). There must be important differences in the architectures of the protein-DNA complexes formed by these recombinases that allow the strand exchange and ligation to take place in spite of mismatches.
To conclude, overlap sequence identity is not an absolute necessity in all site-specific recombination systems. Some tyrosine recombinases have relaxed requirements for homology at the site of ligation and can join DNA in the absence of base pairing. There must be catalytic or structural diversity that can account for the ability of these enzymes to perform strand exchanges in the absence of homology. Perhaps the catalytic sites of Tn916 and Tn1545 Int, IntDOT, and IntN1 proteins better accommodate mispaired helices in the catalytic pocket. For example, the enzymes could make contacts with the incoming DNA strand such that orienting the attacking 5′-OH for the ligation reaction does not require correct base pairing. Crystal structures of these proteins complexed to their DNA substrates would lead to useful insights into their mechanisms.
Paradigms are subject to the progression of discoveries in a field. The original paradigm for site-specific recombination reactions proposed simple rules in which sites with identical sequences were aligned and recombined by sequential pairs of strand exchanges. It is now clear that some systems are more complex with regard to the structures of their substrates and requirements for DNA sequence homology. It is interesting to speculate about the course of events if elements like conjugative transposons, NBU1, phage CTX, and integrons had been discovered and characterized first. With the later discovery of lambda Int, Flp, and Cre, it would then have been a novelty that some systems require strict DNA homology for strand exchanges.
We thank Anca Segall, Art Landy, and Abigail Salyers for suggestions on the manuscript.
The work done in our laboratory was supported by NIH grant GM28717.