CTnDOT integrase (IntDOT) is a member of the tyrosine family of site-specific DNA recombinases. IntDOT is unusual in that it catalyzes recombination between nonidentical sequences. Previous mutational analyses centered on mutants with substitutions of conserved residues in the catalytic (CAT) domain or residues predicted by homology modeling to be close to DNA in the core-binding (CB) domain. That work suggested that a conserved active-site residue (Arg I) of the CAT domain is missing and that some residues in the CB domain are involved in catalysis. Here we used a genetic approach and constructed an Escherichia coli indicator strain to screen for random mutations in IntDOT that disrupt integrative recombination in vivo. Twenty-five IntDOT mutants were isolated and characterized for DNA binding, DNA cleavage, and DNA ligation activities. We found that mutants with substitutions in the amino-terminal (N) domain were catalytically active but defective in forming nucleoprotein complexes, suggesting that they have altered protein-protein interactions or altered interactions with DNA. Replacement of Ala-352 of the CAT domain disrupted DNA cleavage but not DNA ligation, suggesting that Ala-352 may be important for positioning the catalytic tyrosine (Tyr-381) during cleavage. Interestingly, our biochemical data and homology modeling of the CAT domain suggest that Arg-285 is the missing Arg I residue of IntDOT. The predicted position of Arg-285 shows it entering the active site from a position on the polypeptide backbone that is not utilized in other tyrosine recombinases. IntDOT may therefore employ a novel active-site architecture to catalyze recombination.
Conjugative transposons (CTns) are mobile DNA segments that use conjugation and site-specific recombination to transfer a copy of their DNA from a donor to a recipient strain. CTnDOT was originally discovered in a strain of Bacteroides thetaiotaomicron that was capable of transferring resistance to tetracycline and erythromycin (4, 35). Upon exposure to tetracycline, CTnDOT excises from the donor chromosome, copies its DNA by rolling-circle replication, and transfers its DNA to the recipient cell, where it circularizes and is integrated into the recipient chromosome by site-specific recombination. In the past 30 years, the frequency of tetracycline-resistant Bacteroides isolates has risen dramatically, to around 80% of isolates (35). Much of the spread of tetracycline resistance is due to the conjugative transposon CTnDOT and its close relatives (37).
Previous work has shown that the integration and excision reactions require the CTnDOT-encoded integrase (IntDOT) and an uncharacterized Bacteroides host factor (8, 9, 30, 39). Analysis of the IntDOT amino acid sequence indicated that it was a member of the tyrosine recombinase family. It contains five of the six signature residues required for catalysis of the tyrosine recombination reactions (8, 30, 33). We previously constructed and characterized mutants containing alanine substitutions of residues in the catalytic (CAT) domain that are conserved among tyrosine recombinases. The results supported the inclusion of IntDOT within the tyrosine family of recombinases. However, the catalytic core seemed to have an organization somewhat different from those of other tyrosine recombinases (30). In addition, we used a homology modeling method to identify residues in the core-binding (CB) domain that are predicted to be near the DNA (29). The results of alanine substitutions of several residues indicated that some residues in the CB domain are likely involved in catalysis.
The information-directed mutagenesis approaches that we used previously with IntDOT are useful for producing amino acid substitutions at positions predicted to be important for protein function on the basis of methods such as sequence analysis or homology modeling. Because IntDOT has an arm-binding (N) domain in the N terminus of the protein about which relatively little is known, and because the CAT domain appeared to have an unusual structure not found in other family members, the in vitro approach is limited in its ability to produce useful substitution mutations affecting the functions of these domains. In order to complement our earlier work, we chose to use a structure-function approach similar to one used previously with λ Int (21). The strategy we used was to isolate substitution mutants of IntDOT generated by random mutagenesis using an in vivo screen for recombination activity. This approach produced amino acid substitutions in all three of the domains of IntDOT. Analysis of the mutants has uncovered novel amino acid substitutions that cause defects in different steps in the recombination pathway, such as DNA binding, DNA cleavage, and DNA ligation.
The strains, plasmids, and oligonucleotides used in this study are listed in Table 1. Escherichia coli strains were grown in Luria-Bertani (LB) medium (Difco). E. coli DH5α was used for cloning and plasmid maintenance. E. coli W3110 Δ(lacI) (described below) was used for construction of the indicator strain. E. coli MC1061 strains containing pSK2 and pSK2 derivatives were used for production of wild-type and mutant IntDOT proteins. Antibiotics, arabinose, and hydroxylamine were purchased from Sigma. Antibiotic concentrations used were 100 μg/ml augmentin (Aug), 100 μg/ml ampicillin (Amp), 10 μg/ml chloramphenicol (Cam), and 50 μg/ml kanamycin (Kan). X-Gal (5-bromo-4-chloro-3-indolyl-β-d-galactopyranoside) was obtained from RPI and was used at a concentration of 80 μg/ml. DNA-modifying enzymes were supplied by New England Biolabs or Invitrogen, and reactions were performed as recommended by the manufacturers. [γ-32P]ATP was purchased from Perkin-Elmer, Inc., and T4 DNA kinase from Fermentas. All oligonucleotides were synthesized by Integrated DNA Technologies.
A pir-dependent plasmid, pJMD100 (12), containing the CTnDOT attB and attDOT sites, the phage λ attP site containing the P′3 ten mutation (32), a chloramphenicol resistance gene (camR), the oriR6K origin of replication, and the lacZ+ gene, was constructed. In order to construct pSK1, two DNA fragments produced by PCR amplification were ligated using XbaI restriction sites introduced at the ends of oligonucleotide primers (Table 1). The first DNA fragment contained the lacI+ gene amplified from pAH55 (20). The second DNA fragment contained the CTnDOT attB and attDOT sites, the phage λ attP P′3 ten site, the camR gene, and the oriR6K origin amplified from pJMD100. The DNA sequence of the λ attP P′3 ten site in pSK1 was changed to the wild-type attP P′3 DNA sequence by site-directed mutagenesis (Stratagene QuikChange kit) so that the pSK1 plasmid could integrate into the λ attB site in E. coli W3110 Δ(lacI) through recombination catalyzed by λ Int (see below). In order to construct pSK2, the wild-type intDOT gene from plasmid pT7 int (11) was subcloned into pBAD18 between the XbaI and HindIII sites.
The lacI+ gene in E. coli W3110 was replaced with a gene encoding kanamycin resistance as described by Datsenko and Wanner (10) by using oligonucleotides listed in Table 1 to produce strain JG19000. The deletion of the lacI gene was confirmed by the ability of the mutant strain to form dark blue colonies on LB agar plates supplemented with X-Gal in the absence of isopropyl-β-d-thiogalactopyranoside (IPTG). JG19001 carries plasmid pBMS2, which contains a pBR322 origin of replication, a gene encoding ampicillin resistance, and the λ int gene under the control of the arabinose-inducible PBAD promoter as a source of λ Int. Plasmid pSK1, which contains the lacI+ gene flanked by the CTnDOT attB and attDOT sites (see above), was integrated into the JG19001 chromosome by site-specific recombination between the λ attP site on pSK1 and the attB site in the chromosome of JG19001. Chloramphenicol-resistant colonies containing the integrated pSK1 were white on LB X-Gal plates in the absence of IPTG due to expression of the Lac repressor carried by pSK1. The integrated plasmid was transduced by phage P1 vir into JG19000. A white P1 transductant (JG19002) that was resistant to chloramphenicol and kanamycin and sensitive to ampicillin was selected as the indicator strain used to screen for IntDOT mutants.
Plasmid pSK2 DNA (10 μg) was incubated with 7 M hydroxylamine in sodium phosphate-EDTA buffer (0.1 M potassium phosphate, 1 mM EDTA [pH 6]) (21) at 37°C for 7 h. The reaction was stopped by drop dialysis for 1 h against TE buffer (10 mM Tris-HCl, 1 mM EDTA [pH 8]) on 0.025-μm-pore-size Millipore filters. Competent E. coli JG19002 cells were electroporated with the hydroxylamine-treated pSK2 DNA. After 2 h of incubation at 37°C in the presence of 0.02% arabinose to induce IntDOT expression, cells were plated onto LB plates containing X-Gal, 0.02% (wt/vol) arabinose, and Aug and were grown at room temperature for 48 h. Loss of the integrated lacI+ gene due to IntDOT-catalyzed recombination was determined by colony color and PCR using oligonucleotides listed in Table 1. Expression of lacZ was monitored by screening for blue or white colonies on a medium containing X-Gal. Colonies expressing active IntDOT protein were blue and gave a PCR product of approximately 500 bp due to loss of the lacI+ gene. Colonies expressing inactive IntDOT protein were white and produced 1.8-kb PCR products because of the presence of the lacI+ gene. White colonies with blue papillae yielded both PCR products. The intDOT genes from white colonies containing putative mutants were subcloned into a fresh parental vector and electroporated into fresh JG19002 cells to confirm the recombination-defective phenotype. The intDOT gene from each putative mutant subclone was sequenced to confirm the presence of the mutation and to identify the amino acid substitutions.
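The PCR readout described above can be summarized as a small decision rule. The following Python sketch is illustrative only and is not part of the original protocol: the function name, the category labels, and the 15% size tolerance are assumptions introduced here, while the expected product sizes (approximately 500 bp after loss of lacI+, 1.8 kb when lacI+ is retained) come from the text.

```python
# Expected colony-PCR product sizes taken from the text.
EXCISED_BP = 500    # lacI+ lost through IntDOT-catalyzed recombination
INTACT_BP = 1800    # lacI+ still present in the chromosome


def _matches(observed_bp, expected_bp, tolerance):
    """Return True if an observed band is within a fractional tolerance."""
    return abs(observed_bp - expected_bp) <= tolerance * expected_bp


def classify_colony(bands_bp, tolerance=0.15):
    """Classify a colony from its PCR band sizes (in bp).

    The tolerance (assumed, not from the source) allows for the
    imprecision of sizing bands on an agarose gel.
    """
    excised = any(_matches(b, EXCISED_BP, tolerance) for b in bands_bp)
    intact = any(_matches(b, INTACT_BP, tolerance) for b in bands_bp)
    if excised and intact:
        return "mixed"          # e.g., white colony with blue papillae
    if excised:
        return "recombined"     # active IntDOT, blue colony
    if intact:
        return "unrecombined"   # inactive IntDOT, white colony
    return "uninterpretable"    # no expected band observed


if __name__ == "__main__":
    for bands in ([510], [1750], [500, 1800], []):
        print(bands, "->", classify_colony(bands))
```

In this sketch a colony that yields both products is reported as mixed rather than forced into either class, which mirrors the way papillated colonies were treated as a separate phenotype.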
Three pSK2 derivatives, each containing the R285A, R285D, or R285K substitution, were constructed using the Stratagene QuikChange site-directed mutagenesis kit. Oligonucleotides containing specific mutations are listed in Table 1. Mutagenized plasmids were sequenced to confirm the presence of the desired mutation. The plasmids were transformed into JG19002 for in vivo integration assays and into MC1061 for overproduction of mutant proteins for biochemical assays.
To express wild-type and defective IntDOT proteins, MC1061 cells containing pSK2 or pSK2 derivatives were induced with arabinose. A 50-ml culture was grown in LB medium supplemented with ampicillin to an optical density (OD) of 0.4 at 37°C, and arabinose was added to a final concentration of 0.4%. The cells were induced for 4 h at room temperature. Cells were harvested and suspended in 0.5 ml of low-salt IntDOT lysis buffer (50 mM NaHPO4, 1 mM EDTA, 50 mM NaCl, 10% glycerol, 1 mM dithiothreitol [DTT] [pH 8]). After sonication and centrifugation at 4,000 × g for 20 min at 4°C, the supernatants containing IntDOT protein were used for in vitro assays (described below). Quick Start Bradford protein assays (Bio-Rad) were used to determine the amounts of protein in cell extracts. The IntDOT proteins expressed from the wild-type or mutant genes were not detected on sodium dodecyl sulfate (SDS)-polyacrylamide gels.
A 324-bp fragment containing the functional attDOT site was amplified from plasmid pJDE2.3 (8) using oligonucleotides listed in Table 1. The DNA was labeled as described previously (29) and was gel purified. Thirteen microliters of each binding reaction mixture, containing 25 nM labeled attDOT DNA, 15 nM purified E. coli integration host factor (IHF), and 15 μg of total-cell extract containing wild-type or mutant IntDOT proteins and IHF, was used in reactions performed at room temperature for 20 min in GSBA75 buffer (50 mM Tris-HCl [pH 8], 1 mM EDTA, 50 mM NaCl, 10% glycerol, 75 ng/ml herring sperm DNA). The mixtures were loaded on native 5% polyacrylamide gels and were subjected to electrophoresis for 2 h at 200 V. The gels were dried and exposed to phosphorimager screens, and results were analyzed using Fujifilm Image Gauge software (Macintosh, version 3.4). Symbols were used to represent the results as follows: ++ indicates that complexes formed as efficiently or nearly as efficiently as those with the wild-type protein; + indicates decreased efficiency; and − indicates severely decreased efficiency.
Phosphorothiolate cleavage substrates (30) containing a bridging sulfur at the site of cleavage were prepared by annealing the radiolabeled JG01T-ops and unlabeled JG01B at a 1:10 ratio in an annealing buffer (0.1 M KCl, 10 mM Tris-HCl, 5 mM EDTA [pH 8]) by heating to 90°C and cooling to 22°C at a rate of 5°C min−1 using a thermocycler (MJ Research, Inc.). A double-stranded cleavage substrate (43 pmol) was incubated with 15 μg of cell extracts containing wild-type or mutant IntDOT proteins in 12 μl of cleavage reaction mixture (20 mM Tris-HCl [pH 8], 5 mM DTT, 0.05 mg/ml bovine serum albumin [BSA], 1% glycerol) at 37°C for 2 h. Reactions were stopped by addition of 4 μl of 4× sample buffer (200 mM Tris-HCl [pH 6.8], 400 mM DTT, 8% SDS, 0.4% bromophenol blue, 40% glycerol). Samples were boiled for 5 min, and protein-DNA complexes were resolved on 4-to-20% Tris-glycine gels (30). The gels were exposed to phosphorimager screens and were analyzed as described for the DNA ligation assay in the next section. The assays were performed at least in triplicate.
The attDOT ligation substrate (30) was double-stranded DNA that mimics the intermediate produced by DNA cleavage performed by IntDOT. The oligonucleotides used (see Table 1) were annealed to form a double-stranded substrate with 3′ para-nitrophenol (pNP) and 5′-OH at the site of ligation. The end-labeled JG02 DNA contains a pNP group at the 3′ end and was annealed with JG02B and JG02T at a ratio of 1:5:10 as described for the cleavage assay in the preceding section. Assays were performed in 14-μl reaction mixtures containing 6.5 pmol of a 32P-labeled attDOT ligation substrate, 50 mM Tris-HCl (pH 8), 1 mM EDTA, 65 mM KCl, 10% glycerol, 200 μg/ml BSA, and 15 μg of cell extract as a source of IntDOT proteins. The reactions were stopped after 1 h of incubation at 37°C by addition of 14 μl of 90% deionized formamide in 1× TE. Samples were heated for 5 min, and the ligated DNA products were detected after electrophoresis through 15% polyacrylamide-TBE (Tris-borate-EDTA) buffer-urea gels at 600 V for 2 h. Gels were exposed to phosphorimager screens and were analyzed as described below. The results for each sample were calculated by dividing the total number of counts present in the cleavage or ligation product band by the total number of counts in the substrate and product bands. Results were reported as the averages for at least three independent experiments and are presented as follows: ++, the cleavage or ligation product represents ≥20% of the total; +, the cleavage or ligation product represents 0.1 to 20% of the total; −, no product was detectable.
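The quantification and scoring scheme described above can be written out as a short calculation. The Python sketch below is an illustration under stated assumptions, not the analysis software used in the study: the function names are hypothetical, an ASCII "-" stands in for the − symbol, and a product fraction below 0.1% is treated as "not detectable", which is an interpretation of the thresholds given in the text.

```python
from statistics import mean


def product_fraction(product_counts, substrate_counts):
    """Counts in the product band divided by counts in substrate plus product."""
    total = product_counts + substrate_counts
    if total <= 0:
        raise ValueError("total counts must be positive")
    return product_counts / total


def score(fraction):
    """Map a product fraction (0 to 1) to the ++ / + / - symbols."""
    percent = 100.0 * fraction
    if percent >= 20.0:
        return "++"
    if percent >= 0.1:
        return "+"
    return "-"   # assumed: below 0.1% counts as no detectable product


def score_replicates(replicates):
    """Average the product fraction over (product, substrate) count pairs.

    The text reports averages of at least three independent experiments.
    """
    if len(replicates) < 3:
        raise ValueError("at least three independent experiments expected")
    return score(mean(product_fraction(p, s) for p, s in replicates))


if __name__ == "__main__":
    # Hypothetical phosphorimager counts, for illustration only.
    print(score_replicates([(250, 750), (300, 700), (280, 720)]))
    print(score_replicates([(50, 950), (40, 960), (60, 940)]))
```

Averaging the fractions before assigning a symbol, rather than scoring each replicate separately, is a design choice made here for simplicity; the source does not specify the order of these steps.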
A 3-dimensional structure for the CAT domain of IntDOT was produced essentially as described previously (29, 40). Briefly, the mGenTHREADER fold recognition algorithm (24, 31) was used to identify sequence-structure alignments between the amino acid sequence of the IntDOT CAT domain and experimentally determined structures in the Protein Data Bank (PDB). Significant structural conservation with numerous tyrosine recombinases was detected. The most closely related proteins were integron IntI (25.3% identical; PDB 2a3v) (28), HP1 Int (20.6% identical; PDB 1aih) (23), λ Int (18.4% identical; PDB 1ae9) (27), XerD (17.1% identical; PDB 1aop) (38), and Cre (12.1% identical; PDB 1xo0) (16). Each pairwise alignment produced by mGenTHREADER was generally consistent and similarly aligned the active-site residues of each crystallized protein with IntDOT. Each sequence-structure alignment was evaluated for use as a modeling template by threading the IntDOT amino acid sequence onto each crystal structure using the Deep View Swiss-PdbViewer and analyzing its calculated threading energy. The 4crx Cre structure (18) was selected for model building (see also Results). Because one short region of the alignment (IntDOT residues 284 to 292) yielded high threading energy, we produced a hybrid template structure for model building as described previously (40), in which the high-energy portion of the 4crx template structure was replaced with the low-energy portion of the 1p7d structure. This yielded a much lower threading energy for IntDOT in this region, while also preserving the interactions of the template structure with DNA. 
A 3-dimensional all-atom model for the IntDOT CAT domain bound to a core-type DNA site was then constructed from the hybrid template structure using Swiss Model (36), which incorporates molecular dynamics simulations to minimize the energy of polypeptide backbone and side chain conformations, as well as a loop-building process to predict the best position and orientation of amino acid residues in flexible protein regions. All model calculations were performed using default parameters.
In previous studies, we used the conservation of catalytic residues or homology modeling of tyrosine recombinases as guides to construct IntDOT mutants in vitro. Although these methods can be used to test specific hypotheses, these approaches do not uncover mutants with interesting phenotypes that cannot be predicted. In order to circumvent this limitation, we chose to complement the in vitro studies by using a structure-function approach, based on random mutagenesis, that we have used previously with λ Int (21). In order to perform a structure-function analysis of the IntDOT protein, we first needed to isolate mutants generated by random mutagenesis that were deficient in recombination by an in vivo integration assay. Subsequent analysis of the mutants could identify those that are defective in different steps in the recombination pathway, such as DNA binding, DNA cleavage, or DNA ligation.
Because of the technical difficulties involved with genetic manipulations of Bacteroides, we developed a genetic screen for CTnDOT integrative recombination that functions in E. coli. Since the in vitro reaction functions with IntDOT, IHF, and attDOT and attB substrates (9), we anticipated that the reaction would also function in E. coli cells in vivo. Using the Lac repressor (lacI+) gene in E. coli as a reporter, we developed an indicator strain (JG19002) (see Materials and Methods) to assay for integrative recombination in vivo (Fig. 1). The indicator strain was engineered to have a single copy of the lacI+ gene flanked by direct repeats of the attDOT and attB sites. Introduction of a plasmid containing a functional intDOT gene into the indicator strain results in recombination between attDOT and attB. This reaction excises the lacI+ gene as a circular product that cannot replicate. The lacI+ gene is lost from the population as the cells divide to form a colony. β-Galactosidase is expressed constitutively in colonies with cells that have lost the lacI+ allele and can be detected by plating cells onto LB plates supplemented with X-Gal. Because the E. coli in vivo integration assay uses indicator plates to screen for mutants, it is practical to screen a large population of colonies for mutants.
We used the in vivo genetic screen to isolate a large collection of recombination-defective IntDOT mutants produced by random mutagenesis with hydroxylamine. After electroporation of the JG19002 indicator strain with mutagenized plasmid DNA, a wide range of colony phenotypes was observed: blue colonies, white colonies, and white colonies containing various amounts of blue papillae. Colonies containing wild-type IntDOT were blue or white with blue papillae. Colonies containing severely defective IntDOT proteins were white or mostly white, because the lacI+ gene remained in the genomes of all cells in the colony and the Lac repressor repressed the synthesis of β-galactosidase. As expected, a plasmid containing an intDOT gene with a substitution of phenylalanine for the catalytic tyrosine (Y381F) produced white colonies in this assay. We picked the white regions of colonies that contained few or no papillae, streaked them onto plates supplemented with X-Gal, and observed the color patterns of several independent colonies. Colonies that yielded white colonies or a mixture of white and papillated colonies when streaked were saved as potential mutants. Colonies that produced both blue and sectored colonies were assumed to contain plasmids with wild-type intDOT genes and were not analyzed further. To ensure that these phenotypes resulted from IntDOT recombination, the presence or absence of the lacI+ gene was confirmed by colony PCR. All putative mutants were backcrossed into JG19002 to confirm that the phenotypes were associated with the plasmid.
Using this approach, we isolated 25 mutants with single substitution mutations in the intDOT gene from approximately 3,000 transformants screened. This resulted in a 0.83% mutation frequency. As predicted by the known in vitro specificity of hydroxylamine, all of the mutants contained C·G-to-T·A transition mutations. Twenty-four of the mutants contained a single missense mutation, and one contained an amber mutation. Two mutants, the R348H and H372Y mutants, affected residues that are conserved in the catalytic domains of tyrosine family recombinases (13, 30, 33). The other 23 substitution mutations were distributed throughout the three domains of the protein. The mutants and their amino acid substitutions are listed in Table 2.
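The arithmetic behind the reported frequency, and the strand logic of a C·G-to-T·A transition, can be checked with a few lines of Python. This is a minimal sketch; the function names are hypothetical, and the only inputs taken from the text are the counts (25 mutants, approximately 3,000 transformants) and the mutational specificity of hydroxylamine.

```python
def mutation_frequency_percent(mutants, transformants):
    """Recombination-defective mutants as a percentage of colonies screened."""
    if transformants <= 0:
        raise ValueError("transformants must be positive")
    return 100.0 * mutants / transformants


def is_cg_to_ta_transition(ref_base, alt_base):
    """True if a single-base change is a C.G-to-T.A transition.

    Read on one strand, this change appears as C->T; read on the
    complementary strand, the same event appears as G->A.
    """
    change = (ref_base.upper(), alt_base.upper())
    return change in {("C", "T"), ("G", "A")}


if __name__ == "__main__":
    # 25 mutants from approximately 3,000 transformants screened.
    print(round(mutation_frequency_percent(25, 3000), 2))  # 0.83
    print(is_cg_to_ta_transition("C", "T"))  # True
    print(is_cg_to_ta_transition("A", "G"))  # False (T.A-to-C.G transition)
```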
Tyrosine recombinases often contain three domains: an arm-binding (N) domain, a core-binding (CB) domain, and a catalytic (CAT) domain. Because the integration and excision reactions of IntDOT are regulated, we expected that IntDOT would also contain an N domain. We predicted the secondary structure of IntDOT using PROF sec (34).
This prediction agreed with the secondary-structure prediction that we reported previously for the IntDOT CB domain using the PSIPRED algorithm (29). Figure 2 summarizes the predicted secondary-structure elements of IntDOT, the organization of these elements into the three domains typically found in several tyrosine recombinases, and the location in the protein of each amino acid substitution shown in Table 2. The putative IntDOT N domain spans approximately 90 residues and is predicted to contain a three-stranded sheet (β1, β2, β3) followed by an α-helix (H1). This arrangement is also found in both λ Int and Tn916 Int, although the primary sequences of this region of the three proteins are not related. The protein also contains a second putative α-helix (H2) extending from residues 92 to 107, which is analogous to the “coupler” region of λ Int (5).
We previously predicted that the IntDOT CB domain comprises residues 108 through 220 and contains four major α-helices (A, B, C, D) (29), arranged in the orthogonally crossed conformation that is typical of tyrosine recombinases (40). It is not clear whether the IntDOT CB domain also contains a fifth α-helix (E, near residues 209 to 213) similar to those found in Cre (19) and Flp (7); therefore, in this report we have not labeled any of the helices “E,” in order to avoid conflicts in nomenclature.
The predicted CAT domain of IntDOT contains residues 222 to 411. It contains nine helices (F, G, H, I, J, K, L, M, and N) and three β-strands (β3, β4, and β5). The λ Int (27), Cre (19), and Flp (7) proteins contain 7 to 14 helices and 5 to 7 strands in their CAT domains, depending on the protein. Through multiple-sequence alignment, we previously identified five of the six signature active-site residues of tyrosine recombinases within the CAT domain of IntDOT (8, 30). These include K287 in the loop between strands β-4 and β-5, H345 and R348 in the loop between helices J and K, H372 in the loop between helices L and M, and the catalytic tyrosine Y381 in helix M. However, IntDOT is missing the first conserved arginine (Arg I), which is present in other tyrosine recombinases (30). The residue at the equivalent position (259) in IntDOT is serine. Replacement of the serine with alanine has no effect on recombination (30). Our experimental and modeling results suggest that the Arg I function of IntDOT is provided by an arginine from another location in the CAT domain (see below).
A homology model for the IntDOT CAT domain was constructed based on the structural conservation we detected between IntDOT and several other tyrosine recombinases. Cre was selected for model building because (i) it provided a low overall threading energy, (ii) known active-site residues were aligned correctly (except for Arg I; see below), (iii) insertions and gaps in the alignment did not disrupt secondary-structure elements that are strongly conserved among tyrosine recombinases (33), (iv) cocrystal structures with Cre allow interactions between the modeled protein and a bound DNA site to be investigated, and (v) we previously obtained good results using Cre to model the IntDOT CB domain (29). Although Cre shares low sequence identity with the IntDOT CAT domain (12.1%), we found previously that modeling of tyrosine recombinases is useful at this level of identity (29, 40). However, the model has certain limitations; for example, the true path of the polypeptide backbone may differ by as much as 3 to 4 Å (on average) from the model, and exact conformations of amino acid side chains in variable protein regions (e.g., on DNA-binding surfaces) cannot be predicted reliably (40). Nevertheless, the strong conservation of CAT domain structure and active-site configurations among tyrosine recombinases (33, 45) suggests that the model can provide a useful approximate representation of the overall IntDOT CAT domain and of the positions of its active-site residues within this structure.
We analyzed the mutant proteins for their abilities to bind DNA and to perform different steps in the recombination reaction by using assays described previously (30). In order to undergo recombination, IntDOT, the host factor, and attDOT DNA form a nucleoprotein structure called the integrative intasome. The host factor is an unknown Bacteroides protein, but we showed that E. coli IHF substitutes for the host factor in an in vitro recombination assay (8, 9). IHF likely binds nonspecifically and bends attDOT DNA to act in concert with IntDOT to form the integrative intasome. We also showed previously that IntDOT and IHF form a complex with attDOT DNA that can be detected by a gel shift assay (29) (Fig. 3). IntDOT alone does not shift attDOT efficiently. In the presence of both IntDOT and IHF, a supershift of both IntDOT and IHF is seen (Fig. 3). Because the complex requires IHF, which bends DNA, it is possible that the complex contains a monomer of IntDOT that interacts with a single DNA molecule containing attDOT through the binding of its N domain to an arm-type site and the binding of its CB domain to a core-type site (see below). The complex could contain one or more other monomers of IntDOT. The results of the gel shift assays performed with the mutant proteins and IHF are shown in Fig. 3.
To analyze the abilities of the proteins to carry out catalytic steps in the recombination reaction, we used DNA cleavage and ligation assays. The cleavage assay utilizes a suicide substrate containing a bridging phosphorothiolate at one of the cleavage sites where strand exchange is initiated (Fig. 4A). Cleavage of the substrate by IntDOT produces a 3′ phosphotyrosine linkage with the protein, leaving behind a 5′-SH group that is a poor nucleophile. Thus, it cannot attack the 3′ phosphotyrosyl bond to release the enzyme from the DNA. This reaction results in the irreversible covalent attachment of the protein to DNA through its catalytic Tyr-381 (30). The results of the cleavage assays performed with the wild-type and mutant IntDOT proteins are shown in Fig. 4B.
The ligation assay utilizes an activated substrate containing a 3′ para-nitrophenol group adjacent to a free hydroxyl at the site of catalysis in the DNA (Fig. 5A). Ligation of the substrate releases the para-nitrophenol group and forms a covalent phosphodiester bond in the DNA. This ligation product can be resolved from unreacted DNA in a denaturing gel. This ligation reaction does not require the catalytic Tyr-381 (30). The results of the ligation reactions performed with the wild-type and mutant IntDOT proteins are shown in Fig. 5B.
Among mutants with substitutions in the N domain, the R13C protein shows a strong phenotype and the S38N protein shows a leaky phenotype in the in vivo integration assay. Although the R13C and S38N proteins show a diminished ability to form the supershifted complex, the mutants retain substantial cleavage and ligation activities, indicating that they are still catalytically active. The observation that these mutants are defective in forming the supershifted complex with IHF and attDOT DNA is consistent with a defect in DNA binding or protein-protein interactions. Nuclear magnetic resonance (NMR) studies of the λ Int N domain show that it contains an unstructured N-terminal tail and three β-strands that interact with an arm-type site (14, 44) by making sequence-specific contacts with the major groove of the DNA. Our secondary-structure prediction for the N domain of IntDOT suggests that it contains a similar three-β-strand motif but lacks an N-terminal tail. Thus, it is possible that the β1 strand containing R13 and the β3 strand containing S38 may have a role in binding to arm-type sites by IntDOT.
The V95M mutant is defective and the G101R mutant is leaky in the in vivo integration assay. The V95M substitution is located in a predicted α-helix (H2) that joins the N domain to the CB domain, and the G101R substitution also lies within the predicted span of that helix (residues 92 to 107). In λ Int, this region contains an α-helix and is known as the “coupler” region (5). The V95M and G101R proteins also showed a diminished ability to shift attDOT DNA in the presence of IHF. However, the V95M and G101R proteins did form complexes with mobilities similar to that of the complex with wild-type protein and IHF, and they were catalytically active in the cleavage and ligation assays. Warren et al. (42) showed that residues in the coupler region of λ Int are involved in protein-protein interactions that are essential for cooperative binding to arm sites. Taken together, the positions of V95M and G101R in a putative “coupler” region of IntDOT and their diminished ability to form nucleoprotein complexes suggest that these mutants are defective in protein-protein interactions or in interactions with DNA.
The T184I and P209L proteins contain substitutions that introduce hydrophobic residues into the CB domain of IntDOT. The T184I and P209L mutant proteins are partially defective in all of the in vitro assays. However, they are distinguished by their effects on in vivo integration; the T184I mutant is a strong mutant, while the P209L mutant is a leaky mutant. T184 lies in helix D of the CB domain and is equivalent to residue S139 of λ Int (40). This position is highly conserved among tyrosine recombinases, where the preferred residues are serine or threonine (40). In our homology-based model of the λ Int CB domain, λ Int S139 is very close to the DNA backbone directly across from the active site, and this position was predicted to play an important mechanistic role during the recombination process (40). The strong effect of the T184I substitution on recombination catalyzed by IntDOT supports this prediction and provides additional experimental evidence for the importance of this conserved residue in recombination.
The P209L protein is also defective in all of the in vivo and in vitro assays. Interestingly, the P209 residue is located in the “linker” region connecting the CB and CAT domains. In the XerD crystal structures (38), this linker region is disordered and may differ in structure depending on interactions with DNA or other proteins in the intasome. The removal of a proline in this region is expected to shift the path of the peptide backbone significantly and may thus have a significant effect on the structure of the linker region.
The W280 amber mutant is predicted to form a truncated protein that contains both the N and CB domains but lacks a large portion of the CAT domain. The mutant is completely inactive in the in vivo integration reaction as well as the cleavage and ligation reactions, as expected. The protein did not form the supershifted complex with IHF and attDOT DNA (Fig. 3), which is consistent with the predicted requirement for binding of the CAT domain to a core-type DNA site in the supershifted complex. Since the complex requires IHF, which bends DNA, we previously suggested that bending of the DNA by IHF allows a single IntDOT monomer to simultaneously form an intramolecular bridge with an arm-type site through the N domain and with a core-type site through interactions with the CB and CAT domains (29). However, we cannot rule out the possibility that the truncated protein is rapidly degraded by cellular proteases.
All of the remaining substitutions, which constitute the majority of the mutants isolated in this study, lie in the CAT domain. The T256I, V319I, C325Y, L326F, R330H, and S347N proteins have defects in all of the in vivo and in vitro assays. These proteins are defective in forming specific complexes with IHF, and some (T256I, W280, C325Y, L326F, and R330H) actually appear to prevent the formation of the specific complex that contains only IHF and attDOT DNA (Fig. 3). This result is reproducible and has been obtained for all five mutants with several independent extracts. We do not know the mechanistic basis for these results, but the proteins could interact nonspecifically with attDOT DNA to inhibit the binding of IHF or to form complexes that are not stable. However, the V319I and S347N mutant proteins were defective to some extent in all functions assayed, suggesting that these proteins might not be expressed well or might be misfolded or degraded.
The substitutions in the A352T and T354I proteins lie within putative α-helix K. The A352T protein is defective and the T354I protein is leaky in the in vivo integration assay, but both of these mutant proteins formed supershift complexes with IHF and ligated DNA. Interestingly, the A352T protein is defective in cleavage activity without apparent alteration of its ligation activity, whereas the T354I protein cleaves DNA as well as wild-type protein. In the model, the A352 residue is located in the hydrophobic core of the protein. Therefore, it appears that the nonconservative polar A352T substitution alters the CAT domain structure by interfering with hydrophobic packing in the domain core. Because A352 is located near the active site, the A352T substitution probably disrupts recombination by altering the position of active-site residues, such as H345 and R348, that reside nearby in the same α-helix. The T354I substitution is also located in the same α-helix and may have a similar effect; however, the effect of T354I may be diminished because this position is located farther from the active site. It is unlikely that substitutions in the A352T and T354I proteins destabilize the CAT domain, since the mutant proteins bind DNA as well as the wild-type protein.
The T365M, S367N, and S368F substitutions change residues that are located in putative α-helix L. The T365M protein is defective and the S367N and S368F proteins are leaky in the in vivo integration assay. The T365M and S368F proteins are active in the gel shift, cleavage, and ligation assays. The observation that these mutants retain significant activity in most assays indicates that the substitutions are not severely disruptive. The S367N protein forms a supershifted complex but shows a significant reduction of cleavage and ligation activity. In the model, these three residues are located proximal to the conserved catalytic H372 residue, and thus, the substituted residues may affect recombination and catalysis by altering the position of H372. Additionally, the three residues are located on or near the exterior surface of the protein, and the substituted residues may therefore interfere with interprotein interactions within the intasome.
The G371E and H372Y proteins are defective in most assays and contain substitutions that affect adjacent residues in IntDOT. This glycine-histidine dyad, located in a loop between helices L and M in IntDOT, is conserved in λ Int and other tyrosine recombinases (33), and in λ Int the glycine-histidine dyad is known to interact with DNA (27). H372 aligns with the conserved His II residues in other tyrosine recombinases and thus is likely to be a catalytic residue located in the active site, as shown in the model (see Fig. 7). We showed previously that the H372A mutant protein is also inactive in recombination and lacks detectable cleavage and ligation activities (30).
Substitution of the G371 residue is expected to reduce the flexibility of the protein backbone at this location. Due to the proximity of G371 to the active-site residue H372, it seems probable that the glycine-to-glutamic acid mutation disrupts recombination by altering the position of H372. However, it is also possible that the introduction of the glutamic acid residue, and not the loss of the glycine, is responsible for the mutant phenotype. Our model suggests that the mutant glutamic acid residue is proximal to the DNA and may thus interfere with DNA binding through electrostatic repulsion with the negatively charged DNA backbone. Alternatively, the negatively charged glutamic acid might interact with the nearby positively charged residues of the active site to disrupt reaction chemistry.
The substitution in the A382V protein affects the residue adjacent to the catalytic tyrosine Y381 in helix M. The A382V mutant is defective for recombination in vivo but is proficient in most of the in vitro assays. In the model, A382 is located on the back side of the helix containing Y381, and the valine substitution could alter the positioning of this critical residue without disrupting DNA binding by other parts of the protein.
The L389F substitution affects a residue in the C-terminal tail of the protein. The L389F protein showed reduced recombination in the in vivo integration assay and normal activity in the DNA binding, cleavage, and ligation assays. In our model, L389F is located on the extensible “tail” that interacts with another recombinase monomer in the IntDOT tetramer, so the L389F substitution appears unlikely to have a direct effect on catalytic activity. In λ Int, a mutant protein lacking the last 8 C-terminal residues is defective for recombination but has increased topoisomerase activity, suggesting that the C-terminal tail is involved in the regulation of catalysis (26). Subsequent work (22) on the C-terminal tail of λ Int showed that this region is essential for intermolecular protein-protein interactions required for coordinated Holliday junction (HJ) resolution.
The active sites of tyrosine recombinases contain six signature residues (RKHRHY) that are involved in catalysis. The two conserved arginines (Arg I and Arg II) stabilize an intermediate in the cleavage reaction and are essential to recombinase activity (3, 41). Our previous attempt to identify the Arg I residue through multiple-sequence alignment was unsuccessful because the position that corresponds to Arg I in other tyrosine recombinases is a serine (S259) in IntDOT (30). The S259 residue does not appear to be involved in catalysis, because an alanine substitution (S259A) at this position did not affect recombination in vivo, while an arginine substitution (S259R) interfered with recombination (30). A second nearby arginine residue was also identified (R247), but an alanine substitution at this position showed that R247 also was not required for recombination (30). The lack of a conserved Arg I residue in IntDOT was particularly interesting because Arg I substitution mutants in λ Int, Cre, and Flp are all defective in catalysis (1, 2, 6, 21, 30, 33, 43). Our conclusion was that an arginine residue in another region of the CAT domain might substitute for the missing arginine (30). In this study, we identified two nonconserved arginine residues in our mutant collection, R295 and R285, that are candidates for the missing catalytic arginine of IntDOT.
The R295H protein was defective for integration in vivo but was proficient in the DNA binding, cleavage, and ligation activities in vitro. Since the catalytic activities of the R295H mutant seem to be as functional as those of the wild-type protein, it is unlikely that R295 is involved in catalysis. This prediction is supported by the model, which shows that R295 is on the opposite side of the CAT domain from the active site and the DNA binding interface.
On the other hand, the R285H mutant protein was defective for recombination, cleavage, and ligation. However, it formed the supershift complex like the wild-type protein. These results suggest that R285 is involved in catalysis and could be functioning in a manner similar to that of the Arg I residues in other tyrosine recombinases. To investigate the effects of other residues at position 285, we constructed and characterized proteins with substitutions of alanine (R285A), aspartic acid (R285D), or lysine (R285K) (Fig. 6). All three mutant proteins were defective in recombination in vivo and were unable to cleave DNA. However, the mutant proteins were able to form the IHF-dependent complexes in the gel shift experiment similarly to the wild-type protein, indicating that they are capable of binding DNA and forming higher-order nucleoprotein structures.
Mutant proteins with replacements of the Arg I residues of Cre (at position 173) and Flp (at position 191) have been constructed and analyzed. Most of the phenotypes were similar to those of the IntDOT R285K protein. The Cre R173K protein was defective in recombination and was able to bind to DNA (1). The Flp R191K protein bound DNA but was defective in ligation and in completing recombination (15). However, the Flp R191K protein was able to cleave a full site efficiently (6), while the IntDOT R285K protein cleaved a full site inefficiently. Since the substrates used in the Flp studies did not contain a nick, while the substrate used in our assay with IntDOT R285K contained a nick in the core, the results cannot be directly compared. Thus, with the possible exception of cleavage activity, the IntDOT R285K protein has a phenotype similar to those of the Cre R173K and Flp R191K proteins.
The Arg I residues of Cre (R173), Flp (R191), and λ Int (R212) are located in the loop between α-helices G and H (Fig. 2). In contrast, R285 of IntDOT is predicted to be at the end of β4 in the CAT domain. Despite the unusual position of R285 in the polypeptide backbone, our homology model predicts that R285 can enter the active site and interact with bound core-type DNA in a manner similar to that of the Arg I residues of other tyrosine recombinases (Fig. 7). The position and predicted role of R285 provided independently by the model agree with our biochemical evidence demonstrating the importance of R285 for catalysis by IntDOT. If our model is correct and R285 substitutes for Arg I of other tyrosine recombinases, this might be an example of a fundamental difference in the architecture of the catalytic site of IntDOT relative to other tyrosine recombinases.
It is not unexpected to find major structural differences between IntDOT and other recombinases in the region containing IntDOT R285. A previous alignment of tyrosine recombinase sequences showed weak amino acid conservation in this region (33), despite the proximity of a conserved active-site residue, λ Int K235 (IntDOT K287). Inspection of cocrystal structures containing recombinase bound to DNA (e.g., Cre and IntI) shows that this region comprises a pair of antiparallel β-strands (β4 and β5 in Fig. 2) that directly interact with DNA near the active site. This protein structure is apparently flexible, because it can assume different conformations in different protein-DNA complexes. For example, IntI K160 (which corresponds to λ Int K235 and IntDOT K287) is found either interacting with DNA or exposed to solvent, depending on the recombinase monomer that is examined (28). In another example, this region of Cre was disordered in a cocrystal with a Holliday junction (17). Collectively, this information indicates that the IntDOT region containing residues 279 to 297 is likely to diverge from other recombinase sequences and may therefore contribute to the novel active-site configuration proposed in this report.
Another example of variation in active-site architecture is displayed by the Flp protein. In most tyrosine recombinases, the six conserved active-site residues are located on the same monomer, and the protein cleaves DNA “in cis” during the recombination reaction. In Flp, the tyrosine nucleophile of the active site is located on a monomer different from that of the other five conserved residues. Thus, the active site is formed by residues from two different monomers, and the DNA is cleaved “in trans” during the recombination reaction (7).
This study presents a genetic approach to identify functional residues that are essential to various steps of recombination catalyzed by IntDOT. Using random mutagenesis, we have identified residues throughout the protein that are important for IntDOT function. Surprisingly, the function of the Arg I active-site residue, which is absolutely conserved in other tyrosine recombinases, could be provided by a nonconserved arginine residue, R285, that is located in a different part of the protein. This intrinsic difference in protein structure between IntDOT and other tyrosine family members could contribute to the unique mechanism of IntDOT-mediated recombination. It will be interesting to solve the structure of the IntDOT catalytic site so as to better understand the role of the unusual R285 residue in catalysis.
We thank Scott Silverman for providing a phosphorothiolate cleavage substrate, Alex Burgin for providing pNP ligation substrates, Jerry Miner for technical assistance, and members of the Gardner lab for comments on the manuscript.
This work is supported by U.S. National Institutes of Health grant GM28717.
Published ahead of print on 13 November 2009.