Genetic screen for improved integrases
We previously identified a DNA sequence on human chromosome 8 as the most prevalent genomic integration site for φC31 integrase-mediated genomic integration of a plasmid bearing attB. On this basis, the ψA site appeared to be relatively well recognized by the φC31 integrase and was chosen as the DNA substrate for this directed evolution study. As shown in Figure C, ψA shares the TTG common core and is identical at 44% of the positions in the 39-bp attP site that we previously determined to be the minimal site for obtaining full reaction in Escherichia coli.
The primary property we sought to improve was the ability of the integrase to perform a recombination reaction between attB and the ψA pseudo attP site. For convenience, we developed a screen in E.coli, with the expectation that some of the improved integrases would also show enhanced function in the human cell environment. To identify improved integrases from a shuffled library of integrase mutants, we devised a genetic screen in which the ability of the enzyme to complete a reaction at ψA could be read off Xgal indicator plates (17) as blue colony color. As shown in Figure A, we constructed an assay plasmid, pRES-ψA, in which the attB and ψA sites were separated by a stuffer sequence that blocked transcription from a promoter upstream of ψA into the lacZ gene positioned downstream of attB. Only after an intramolecular integration event between attB and ψA would transcription of lacZ occur.
A 1.9-kb fragment containing the φC31 integrase gene was subjected to DNA shuffling (9). The shuffled products were cloned into the pINT-T plasmid, and the shuffled library was transformed into E.coli cells carrying pRES-ψA. Colonies were allowed to form at 30°C on Xgal plates. Under these conditions, the pRES-ψA plasmid replicated and lacI was expressed, keeping most transcription of the integrase gene repressed. Once a moderately large colony size was attained, induction of the integrase was started. Induction was achieved simply by moving the plates to 37°C. At this temperature, most of the lac repressor product of lacI was inactive, enabling transcription and expression of the library of mutant integrases. In addition, replication of pRES-ψA was inhibited because of the temperature-sensitive replication origin on the plasmid. Thus, during the induction period at 37°C, colonies underwent little expansion, but expression of integrase and any consequent intramolecular integration reactions proceeded. When recombination occurred between attB and the ψA site, transcription of lacZ was activated, resulting in hydrolysis of Xgal to indigo, read as blue color on Xgal agar plates.
The timing, pattern and degree of blue color gave a measure of the integration activity characteristic of the integrase gene present in that bacterial colony. With an induction period of 24 h, we found that the wild-type integrase produced essentially no blue color in the colonies (Fig. ). Therefore, any mutant integrase producing blue color in 24 h or less probably had increased ability to perform recombination between attB and ψA. By reducing the time allowed for induction, we increased the stringency of the screen.
Figure 2 Colony color screen for improved integrases. Color photographs indicate the appearance of bacterial colonies carrying wild-type (WT INT) and shuffled integrases. Blue color on plates containing Xgal reflects the occurrence of intramolecular integration.
DNA shuffling cycle 1
For the first cycle of shuffling, we subjected the wild-type φC31 integrase gene to the DNA shuffling protocol, then transformed the shuffled library into E.coli carrying pRES-ψA. After colony formation at 30°C, the plates were incubated for 24 h at 37°C to allow integrase expression and subsequent recombination. The plates were scored for blue color, with the result that 12 of ~10 000 colonies screened showed specks of blue in the colony. Plasmid DNA was purified from these colonies and retested in the assay to confirm the color phenotype. Mini-prep DNA from blue colonies was analyzed by restriction mapping to verify that the expected intramolecular integration reaction had occurred in each case. For the most promising mutants, i.e. the ones that generated the most blue color (1C1, 12C1, 17C1), the attR junction generated by recombination was sequenced from several blue colonies. The attR junction generated by recombination between attB and ψA was perfect to the base in each case. This result confirmed that the mutant integrases were performing the appropriate integration reaction, were precise, and were not simply mediating promiscuous integration at a variety of sequences. Instead, the mutant integrases appeared to have improved recognition of and activity at the ψA attP site.
The best candidates from cycle 1 were arranged in a tentative order of ability to perform recombination at ψA, based on the degree of blueness developed by the bacterial colony at a particular time of induction. By this measure, 17C1 appeared to be the most effective, since it formed colonies that turned uniformly pale blue after only 6 h of integrase expression (Fig. ). The other cycle 1 mutants required longer induction times to develop blue color.
Five of the cycle 1 integrases were completely analyzed at the DNA sequence level in order to characterize what mutations were present. This information is depicted schematically in Figure and reported completely in Table S1 (see Supplementary Material). In each case, three to six mutations were present. About half of the mutations were silent with respect to amino acid change, while the others led to amino acid substitutions. Most of the mutations were single base changes, although a single base deletion in 1C1 led to a frameshift near the C-terminus of the protein (Table S2). Each of the mutations was distinct. This outcome is typical of cycle 1 shuffling (7
Figure 3 Schematic diagram of mutational changes present in shuffled integrases. The 613 amino acid length of the φC31 integrase is demarcated by dots at 100 amino acid intervals. The approximate location of the catalytic domain, in analogy with other ...
DNA shuffling cycle 2
Plasmid pINT-T DNAs purified from the best 12 candidates from cycle 1 were mixed together in equal proportion and subjected to DNA shuffling to produce cycle 2. In this way, the distinct mutations present in the cycle 1 mutants could be recombined combinatorially to produce new configurations in which favorable features were brought together, yielding integrases with additional benefits. Because the cycle 1 mutants all gave evidence of blue color after 24 h of induction at 37°C, the induction period was reduced to 6 h to increase the stringency of the cycle 2 assay. None of the cycle 1 mutants showed more than a pale degree of blueness at this time point, so the appearance of deep blue colonies at 6 h would indicate a further gain in integration efficiency.
The cycle 2 library was transformed into E.coli, and from a screen of ~10 000 colonies we obtained 11 candidates that produced blue colonies after 6 h of integrase induction (Fig. ). Based on the amount of blue color present, 1C2, 2C2 and 11C2 appeared to be the most efficient integrases. The mutant integrases were re-tested and the DNA of pRES-ψA from blue colonies was examined by restriction mapping and DNA sequencing. Again, the attR recombination junction was found to be perfect to the base in each case, indicating that the mutant integrases mediated a precise recombination reaction, had not become promiscuous and appeared to have an elevated ability to perform intramolecular integration between attB and ψA.
The complete DNA sequences of four of the best cycle 2 integrases were determined and are reported schematically in Figure and in detail in Table S1. We found evidence that efficient DNA shuffling had occurred, because many of the mutations present in the cycle 2 mutants had already been seen in cycle 1 and were now present in new combinations. Since not all of the cycle 1 mutants were sequenced, we could not determine whether new mutations appeared in cycle 2. The Gln to Pro mutation at amino acid 134, derived from 17C1, appeared in all four cycle 2 mutants analyzed at the sequence level (Fig. and Table S1). A single base deletion distinct from the one in 1C1 was present in 11C2 and caused a different frameshift that changed the amino acid sequence of the C-terminus of the protein (Fig. and Table S2).
Mammalian integration frequency analyzed by quantitative PCR
Our genetic screen in E.coli effectively identified mutant enzymes that could perform more efficient recombination between attB and ψA in bacterial cells. Our goal, however, was to obtain mutants that were more effective at performing integration at the ψA target site in its native context in the human genome. Because this reaction probably has requirements that would not be optimized in E.coli, not all of the mutant integrases we isolated in E.coli were expected to perform well in the mammalian context. The most direct measure of the ability of the mutants to mediate the desired integration event in human cells was to monitor the frequency of recombination at the ψA chromosomal position using quantitative PCR (15). This assay directly monitored the creation of the desired recombination junction and was free of artifacts, such as silencing of the integrated gene, which might be present in a genetic assay that relied on selection. To perform quantitative PCR, we used primers that flanked the expected recombination junction. One primer was located in the integrated plasmid and the other in flanking genomic sequences downstream of attL. Detection of this PCR band could only occur if the integration event we desired had occurred.
To perform the quantitative PCR assay, human 293 cells were transfected with a plasmid carrying the mutant integrase gene to be tested and an attB plasmid, pHZ-attB. Integration was assayed 48 h after transfection: cellular DNA was purified, the PCR primers and Taqman probe were added, and PCR was performed. Each of the cycle 1 and cycle 2 integrase mutants was tested in the quantitative PCR assay. None of the cycle 1 mutants showed an improvement in absolute integration frequency in human cells, despite their improved performance in E.coli. However, two of the eleven cycle 2 mutants, 1C2 and 11C2, showed a significant increase of 2–3-fold in absolute integration efficiency at ψA (Fig. A). Wild-type integrase mediated site-specific integration at ψA at a frequency of 0.21%, whereas the frequency mediated by 1C2 was 0.43% (P < 0.05) and that mediated by 11C2 was 0.55% (P < 0.01). These frequencies are uncorrected for transfection efficiency.
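The 2–3-fold range follows directly from the three reported frequencies. As a minimal sketch (the dictionary and function names are illustrative and not part of the original analysis), the ratios to wild-type can be recomputed as follows:

```python
# Integration frequencies at psiA as reported in the text (percent,
# uncorrected for transfection efficiency).
freq_percent = {"wild-type": 0.21, "1C2": 0.43, "11C2": 0.55}


def fold_over_reference(freqs, reference="wild-type"):
    """Return each frequency divided by the reference frequency."""
    ref = freqs[reference]
    return {name: value / ref for name, value in freqs.items() if name != reference}


folds = fold_over_reference(freq_percent)
for name, fold in folds.items():
    print(f"{name}: {fold:.1f}-fold over wild-type")
```

With the values above, this gives roughly 2.0-fold for 1C2 and 2.6-fold for 11C2.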
Figure 4 Integration frequency assays for shuffled integrases in human cells. (A) Quantitative PCR assay for integration frequency at ψA. Human 293 cells were transfected with pHZ-attB and either no integrase, the wild-type φC31 integrase, or ...
Overall genomic integration frequency and specificity
The above quantitative PCR data showed that the 1C2 and 11C2 integrases directed an increased absolute integration frequency at ψA. However, the PCR data did not reveal whether these evolved integrases also displayed an elevated integration frequency at other genomic sites. To get a sense of the relative specificity of the evolved integrases, we followed overall integration frequency by using a genetic assay, then determined what fraction of the integrants were located at ψA.
To determine overall integration frequency, we transfected the pHZ-attB plasmid carrying attB and the hygromycin resistance gene, along with an expression plasmid encoding the integrase to be tested, into human 293 cells. After 3 weeks of hygromycin selection, colonies were counted. This analysis (Fig. B) revealed that the shuffled integrases did not mediate an increase in overall integration frequency. Rather, one mutant was indistinguishable from wild-type integrase and the other 10 had significantly lower overall integration frequencies. In particular, the 1C2 and 11C2 mutants, which had higher integration frequencies specifically at ψA, had the lowest overall integration frequencies, as reflected by numbers of hygromycin-resistant colonies ~10-fold below that of the wild-type integrase.
This result could be explained if the shuffled integrases now possessed an elevated specificity for ψA that decreased the background of integration into other genomic sequences, such as the other ~100–1000 pseudo attP sequences thought to be present in the human genome (6). To measure the integration specificity, 20 randomly chosen hygromycin-resistant colonies generated by the 1C2 and 11C2 integrases were expanded. Genomic DNA was prepared and analyzed by PCR using primers that would detect integration at ψA. The results of this PCR analysis (Fig. S) demonstrated that 30% (6/20) of integration events mediated by these two shuffled integrases now occurred at ψA. This result was in contrast to the previously determined figure of 5.2% (6) for the wild-type φC31 integrase. These data suggested that an increase of ~6-fold in specificity for ψA accompanied the 2–3-fold increase in absolute integration frequency at ψA.
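The ~6-fold figure is the ratio of the two fractions quoted above. A short sketch of that arithmetic, with variable names chosen here for illustration and the 6/20 count treated as pooled across the two mutants, as the text reports it:

```python
# Fraction of hygromycin-resistant colonies with integration at psiA.
colonies_at_psiA = 6        # PCR-positive colonies (1C2 and 11C2, pooled)
colonies_tested = 20        # randomly chosen colonies analyzed
wild_type_fraction = 0.052  # previously determined 5.2% for wild-type

mutant_fraction = colonies_at_psiA / colonies_tested
specificity_fold = mutant_fraction / wild_type_fraction

print(f"mutant fraction at psiA: {mutant_fraction:.0%}")
print(f"fold change in specificity: {specificity_fold:.1f}")
```

The ratio comes out just under 6. With only 20 colonies sampled, it is best read as an approximate estimate.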
The 2–3-fold higher integration frequency at ψA measured for 1C2 and 11C2 in Figure A predicts that, for these mutants, we should have seen approximately 10 colonies with integrations at ψA per 100 total integrations by the wild-type integrase in Figure B, rather than the three we observed. We do not have a full explanation for this result, but note that the assays measured two different things. The quantitative PCR assay directly measured recombination events at ψA at an early time, with no selection. The selection assay measured sustained gene expression at the composite of all integrated loci. It is possible that integration events were more poorly recovered in the selection assay in the case of the shuffled integrases simply because of the low colony numbers, which can depress growth on the plates. The lower colony numbers obtained with 1C2 and 11C2 suggested that the mutants did not confer hyper recombination or relaxed specificity, but rather represented altered specificity mutants.
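One way to reconstruct the comparison of about 10 expected versus three observed colonies is sketched below. It assumes the calculation is normalized to 100 wild-type colonies and combines the 5.2% wild-type fraction, the 2–3-fold increase, the ~10-fold drop in colony number, and the 30% fraction at ψA. This is an interpretation of the text's arithmetic, not a calculation given by the authors, and the variable names are illustrative.

```python
# Normalize to 100 hygromycin-resistant colonies for the wild-type integrase.
wt_total_colonies = 100
wt_fraction_at_psiA = 0.052          # 5.2% of wild-type integrants at psiA
fold_low, fold_high = 2, 3           # qPCR fold increase at psiA for 1C2/11C2

# Expected psiA integrants if the qPCR increase carried over to the colony assay.
wt_at_psiA = wt_total_colonies * wt_fraction_at_psiA
expected_low = wt_at_psiA * fold_low
expected_high = wt_at_psiA * fold_high

# Observed: ~10-fold fewer colonies overall, 30% (6/20) of them at psiA.
mutant_total_colonies = wt_total_colonies / 10
mutant_fraction_at_psiA = 6 / 20
observed = mutant_total_colonies * mutant_fraction_at_psiA

print(f"expected at psiA: {expected_low:.1f} to {expected_high:.1f} colonies")
print(f"observed at psiA: {observed:.1f} colonies")
```

Under these assumptions the expected count is about 10 to 16 colonies, against roughly three observed, which matches the discrepancy described above.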
To analyze the precision of the integration events mediated by the shuffled recombinases, we performed DNA sequence analysis on fragments containing the attL recombination junctions resulting from genomic integration. The attL junction was obtained by PCR from individual colonies representing several independent integration events at ψA. The sequence analysis revealed small deletions of 6 to 17 bp in each case, overlapping the 3-bp cross-over region. This result is similar to results reported for the wild-type φC31 integrase, where most recombination junctions contained small deletions (6). The small deletions may indicate an impaired ability of both the wild-type and shuffled integrases to complete the recombination process on pseudo attP sites and possible involvement of host repair enzymes to complete the reaction.