A tetramer model for HIV-1 IN with DNA representing the LTR termini was previously assembled to predict the IN residues that interact with the LTR termini; these predictions were experimentally verified for nine amino acid residues (Chen et al, J. Biol. Chem. 281, 4173-4182 (2006)). In a similar strategy the unique amino acids found in ASV IN rather than HIV-1 or MPMV IN were substituted into the structurally related positions of HIV-1 IN. Substitutions of six additional residues (Q44, L68, E69, D229, S230 and D253) showed changes in the 3′ processing specificity of the enzyme verifying their predicted interaction with the LTR DNA. The newly identified residues extend interactions along a sixteen base pair length of the LTR termini and are consistent with known LTR DNA: HIV-1 IN cross-links. The tetramer model for HIV-1 IN with LTR termini was modified to include two IN binding domains for LEDGF/p75. The target DNA was predicted to bind in a surface trench perpendicular to the plane of the LTR DNA binding sites of HIV-1 IN and extending alongside LEDGF. This hypothesis is supported by the in vitro activity phenotype of HIV-1 IN mutant with a K219S substitution showing loss in strand transfer activity while maintaining 3′ processing on a HIV-1 substrate. Mutations at seven other residues reported in the literature have the same phenotype and all eight residues align along the length of the putative target DNA binding trench.
Integrase (IN) of human immunodeficiency virus type 1 (HIV-1) is an attractive target for therapeutic development as it is essential for early steps in viral replication and there are no homologues in the eukaryotic system for which inhibitors would negatively affect host viability. This enzyme is both necessary and sufficient to catalyze the insertion of viral into host DNA1-3. In the first step or 3′ processing reaction, two deoxyribonucleotides are removed from the 3′ end of the LTR strands containing the highly conserved CA dinucleotides. In the second step or strand transfer reaction, the newly created 3′ ends undergo a staggered nucleophilic attack on the two strands of the target DNA. These structures are resolved and repaired by host cell enzymes resulting in an integrated copy of the viral DNA with the gene encoding sequence co-linear to the viral RNA and flanked by a four to six base pair duplication of the target DNA, depending upon the viral IN.
Design of inhibitors has been hampered by the lack of crystal structures available for full-length IN monomers or higher-order IN oligomers, let alone in complex with DNA. Partial structures with two of the three domains have been reported and these were used to assemble a model of a tetramer HIV-1 IN with bound LTR DNAs4. The model was then used to predict residues that were in close proximity to the viral DNA. To verify these predictions, a structural alignment of the primary sequences of HIV-1, SIV, and ASV INs5 was used to identify residues that were unique to each virus enzyme, because viral IN specifically recognizes its cognate LTR end substrates. The unique amino acids from ASV IN were substituted into the equivalent structural position of HIV-1 IN. Substitution of the ASV IN residues conferred on the HIV-1 IN mutants the partial ability to cleave an ASV substrate. Multiple residues of HIV-1 IN were demonstrated to alter specificity: V72, S153, K160, I161, G163, Q164, V165, H171, and L172 (ref. 4), shown in red in Figure 1 for only one of the two LTR ends. In this report we have identified six additional HIV-1 IN residues that influence the selection of the LTR end substrates for 3′ processing. These include Q44, L68, E69, D229, S230, and D253, which align along the two LTR binding grooves with the previous residues that alter recognition.
In addition, the structural model was modified to include the host protein LEDGF/p75 integrase binding domains and predict a trench on the HIV-1 IN surface that may accommodate the target DNA. The putative binding site for target DNA is positioned roughly perpendicular to the LTR binding sites. Consistent with this interpretation, we have identified in the literature a series of amino acids along the length of one side of this trench where point amino acid substitutions result in enzymes that lose the ability to strand transfer with little or no effect on 3′ processing. In addition, we have identified a residue on the opposite wall of the trench, K219, where serine substitution displays the same phenotype. This residue is found in a peptide that was previously demonstrated to cross-link to the target DNA6.
In the original selection of amino acids that could affect the recognition of LTR ends, we used the structural alignment of SIV, HIV-1, and ASV INs to identify those residues that were unique. We subsequently observed that HIV-1 IN was capable of 3′ processing a U5 SIV but not a MPMV LTR DNA substrate (data not shown). Therefore, we examined the structural alignment of HIV-1 and ASV INs with the MPMV IN sequence5. This analysis identified additional unique residues in the IN model near the LTR DNA ends: Ser39, Lys42, Gln44, Leu68, Glu69, Leu74, Lys156, Glu170, Tyr227, Asp229, Ser230, Asp253, Asn254, Lys258, and Arg262 (HIV-1 IN numbering). To test whether any of these residues were involved in recognition of the LTR ends, eleven HIV-1/ASV IN chimeras were constructed that substituted the amino acid from ASV IN into the structurally equivalent position of HIV-1 IN as described in the Materials and Methods. A list of chimeras is presented in Table 1.
HIV-1 IN mutants were constructed and purified as described in Materials and Methods. The chimeras were assembled in a 3CSF185H background, an enzyme with four amino acid substitutions (C56S, C65S, C280S, and F185H) to improve solubility. This enabled purification of the chimeras from the soluble fraction. Individually these amino acid substitutions have little or no effect on viral replication7-9. Enzymes purified by this protocol were free of detectable non-specific nuclease4. When a HIV-1 G197I substitution, which caused a significant decrease in 3′ processing, was combined into chimeras that alter LTR recognition, this resulted in purified enzymes that did not efficiently 3′ process HIV-1 or ASV substrates. As further evidence for the purity of INs prepared by this protocol, 5′ 33P-end labeled DNA substrates representing the HIV-1 U5, ASV U3, and MLV U5 LTR termini were individually incubated with wild type or selected IN chimeras. The 3CSF185H HIV-1 and ASV INs cleave their respective homologous substrates, but not heterologous substrates including the MLV sequence (Figure 2). The MLV U5 LTR DNA sequence is cleaved by the wild type MLV IN10. As previously reported4, the G163R Q164V V165L chimera in the 3CSF185H background cleaved both HIV-1 U5 and ASV U3 LTR end substrates (Figure 2). However, the chimera did not cleave the MLV U5 substrate, indicating that the ability to cleave the ASV substrate was not due to a non-specific nuclease activity. A second HIV-1 IN that contains a K211S substitution, also in the 3CSF185H background, cleaved the HIV-1 U5 but not the ASV U3 or MLV U5 substrates. This amino acid is positioned in the structural model at a distance from the LTR binding sites so that it was expected to maintain HIV-1 substrate specificity for 3′ processing.
We screened each HIV-1/ASV IN chimera listed in Table 1 for their ability to 3′ process HIV-1 and ASV LTR duplex DNA substrates. Of the chimeras tested, S39T K42H, Y227I, N254D, K258T R262S showed decreased or no activity with HIV-1 substrates and did not cleave the ASV substrates (data not shown). As such, these residues were not further considered. In contrast, Q44N, L68E E69P, D229I S230E, and D253N maintained the ability to cleave the HIV-1 substrate, and gained the ability to cleave the ASV substrates to different extents (Figure 3). D229I S230E was unique among these mutants in that it possessed more 3′ processing activity towards the HIV-1 substrate than 3CSF185H IN (Figure 3A). Thus, the residues Q44, L68, E69, D229, S230, and D253 have been verified to interact with the viral LTR and are highlighted on the structural model in Figure 1 (magenta residues) along with the previously identified residues that affect 3′ processing specificity (red residues)4. Taken together, these residues strikingly define two linear trenches on the HIV-1 IN molecular surface that accommodate the two LTR ends. The binding trenches are asymmetrically positioned along the two strands of a 16 base pair length of the LTR ends. Twelve of the fifteen amino acid exchanges that affect LTR recognition in the processing reaction were combined into a single construct (V72W S153R K160D I161R G163R Q164V V165L H171K L172Q D229I S230E D253N) and purified from the soluble fraction. As shown in Figure 3B, this enzyme designated S7C is active and has substantially more specific 3′ processing activity towards the ASV than the HIV-1 DNA substrate. We also assembled a S7C IN chimera in which the substitutions at positions 185 and 280 were restored to wild-type residues. Unfortunately this resulted in an enzyme that was insoluble; 3′ processing activity, however, could be recovered by renaturation from urea and the activity of this enzyme was qualitatively similar to that shown for S7C. 
This indicates that these two substitutions did not affect the specificity of the 3′ processing reaction. Taken together, these results suggest that the sum of IN interactions along a 16 base pair length of the viral DNA ends determines its specificity.
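As a bookkeeping aid, the substitution list for the S7C construct can be parsed into (wild-type residue, position, replacement) records. The sketch below is illustrative only: the function name `parse_substitutions` and the constant names are hypothetical and are not part of the original work; the substitution strings themselves are copied from the text above.

```python
import re

# Substitutions combined in S7C, as listed in the text (HIV-1 IN numbering).
S7C_SUBSTITUTIONS = (
    "V72W S153R K160D I161R G163R Q164V V165L H171K L172Q D229I S230E D253N"
)
# Solubility background shared by all chimeras (3CSF185H).
BACKGROUND_SUBSTITUTIONS = "C56S C65S C280S F185H"


def parse_substitutions(text):
    """Parse tokens such as 'V72W' into (wild_type, position, replacement)."""
    parsed = []
    for token in text.split():
        match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", token)
        if match is None:
            raise ValueError(f"not a point substitution: {token!r}")
        parsed.append((match.group(1), int(match.group(2)), match.group(3)))
    return parsed


s7c = parse_substitutions(S7C_SUBSTITUTIONS)
background = parse_substitutions(BACKGROUND_SUBSTITUTIONS)

print(len(s7c), "specificity substitutions at positions",
      [position for _, position, _ in s7c])
print(len(background), "background substitutions at positions",
      [position for _, position, _ in background])
```

Keeping the specificity substitutions separate from the solubility background makes it easy to confirm that the two sets do not overlap in position.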
While we have been able to alter the specificity for 3′ processing of LTR substrates, we have not been able to demonstrate specificity changes in strand transfer activity. Several amino acid changes introduced into the HIV-1 IN (L68E and E69P, V72W, and H171K and L172Q) that alter 3′ processing disrupt the strand transfer activity towards the HIV-1 substrates. The reason for this is not known. Substitutions at the other ten residues that affect 3′ processing support strand transfer activity with HIV-1 substrates. None of the chimeras had strand transfer activity using the ASV substrate. When we combined multiple substitutions into the soluble form of S7C, it too was unable to support a strand transfer reaction with either the HIV-1 U5 LTR or ASV U3 LTR preprocessed end substrates. One possible explanation for this behavior might be that because IN is a tetramer, single substitutions of one residue will necessarily change all four subunits, which might be responsible for the loss in strand transfer activity.
When HIV-1 is replicated in the presence of diketo-acid based compounds, a number of escape mutants are selected that are believed to act against the strand transfer reaction. We previously reported that V72 and S153 were among the residues that influence LTR selection and these residues were mutated in drug resistant INs4. Changes at position 230 in HIV-1 IN are also found in drug resistant INs11. When a substitution in integrase occurs that disrupts its catalytic activity (3′ processing or strand transfer), we hypothesize that second site mutations in IN are selected to compensate for the lost activity caused by the original mutation. In the case of the V72W IN mutant, which has reduced 3′ processing of HIV-1 substrates (79% compared to wild-type, SD=8.6), second site substitutions at F121, T125, and V151 have been reported12. HIV-1 IN with a T125S substitution resulted in an enzyme that increased its 3′ processing reaction relative to 3CSF185H (126%, SD=11.7). When the V72W and T125S substitutions were combined into the same enzyme, the resultant chimera had a 3′ processing activity equivalent to 3CSF185H (97%, SD=2.0). This result demonstrates that at least one mutation at a second site associated with substitutions at position 72 can compensate for its decreased 3′ processing activity.
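The compensation can be put in rough numerical terms. The sketch below compares the measured double-mutant activity with the value expected if the two substitutions acted independently and multiplicatively. That independence model is an assumption introduced here purely for illustration; it is not a claim made in this study, and the variable names are hypothetical.

```python
# Relative 3' processing activities quoted in the text (3CSF185H = 1.00).
V72W_ACTIVITY = 0.79
T125S_ACTIVITY = 1.26
DOUBLE_MUTANT_ACTIVITY = 0.97
DOUBLE_MUTANT_SD = 0.02

# ASSUMPTION (not from the source): independent effects multiply.
expected_if_independent = V72W_ACTIVITY * T125S_ACTIVITY

# Distance between the illustrative expectation and the measurement,
# expressed in units of the reported standard deviation.
deviation_in_sd = (
    abs(expected_if_independent - DOUBLE_MUTANT_ACTIVITY) / DOUBLE_MUTANT_SD
)

print(f"expected if independent: {expected_if_independent:.1%}")
print(f"measured double mutant:  {DOUBLE_MUTANT_ACTIVITY:.1%}")
print(f"difference: {deviation_in_sd:.2f} SD")
```

Under this simple model the gain from T125S is of about the right size to offset the loss from V72W, which is in line with the compensation described above.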
The model of the IN tetramer with viral LTR ends was augmented with the host protein LEDGF. The LEDGF integrase binding domain from the crystal structure of LEDGF bound to the catalytic domain of IN (2B4J) was docked to the HIV-1 IN tetramer model. Two LEDGF subunits were accommodated, bound at opposite ends (Figure 4A). Structural data show the binding pocket for LEDGF integrase binding domain is formed by residues 102, 128, 129 and 132 in one IN subunit and residues 174 and 178 in a second subunit13. These residues, colored in red, are at the LEDGF/HIV-1 IN interface. Additional residues that may be involved in the interaction with LEDGF, colored in yellow, based on mutagenesis and structural data include 131, 161, 165, 166, 168, and 170-17313-15. These residues are also located at or near the interaction interface in our model. Therefore, the new model is consistent with information for the residues implicated in the IN interaction with LEDGF.
A groove is observed on the IN tetramer model between the two LTR ends and approximately perpendicular to the long axis of the LTR DNA that could accommodate the target DNA. The positioning of LEDGF as an extension of the target DNA binding trench (indicated by arrows, Figure 4A) would be consistent with its role in interacting with chromatin to influence target site selection16-20. We predict from the structural model that mutations introduced into the target DNA binding site of IN, but distant from the catalytic and LTR binding sites, would have a phenotype where strand transfer would be inactivated without impairment of the 3′ processing reaction. Within the putative DNA binding trench, mutations of a series of residues including S119 (ref. 21), N120 (ref. 22), C130 (refs. 23, 24), W132 (ref. 24), F181 and F185 (ref. 25) display this activity phenotype. These residues align along one wall of the trench (Figure 4B, green residues). If this trench is the binding site for the target DNA, we would predict that point mutations introduced at residues on the opposite wall would have the same phenotype. We therefore assembled HIV-1 IN mutants with K211S, K219S, and Q221S substitutions, respectively. Each was tested for 3′ end processing and strand transfer against the homologous HIV-1 DNA substrates. The K211S mutant showed near wild-type 3′ processing while the K219S mutant still had 3′ processing activity towards a HIV-1 substrate though less than wild type (Figure 5). When tested in the strand transfer assay using pre-processed HIV-1 LTR DNA, we observed that the K211S mutant was as active as wild-type while the K219S mutant was inactive. The activity of the Q221S mutant in both 3′ end processing and strand transfer was similar to that of K211S (data not shown). Thus, the K219S enzyme lost the ability to strand transfer with a decrease in 3′ processing while the K211S and the Q221S mutants had no detectable effect on either activity. These results suggest K219 is involved in binding to the target DNA.
The homotetramer form of IN catalyzes all of its known enzymatic activities. While dimers of IN are capable of catalyzing 3′ processing and strand transfer reactions, they do not support a concerted DNA integration reaction26. For this reason, a homotetramer model was assembled. In the model, two of the four subunits are depicted with major contacts with the LTR and target DNAs. The remaining two subunits are available for contacts with proteins that interact with integrase27,28 including the host protein LEDGF as shown in our new augmented IN tetramer model. There are at least fifteen residues in the LTR binding groove on the IN surface that are associated with LTR specificity. They are predicted to interact asymmetrically along a 15-16 base length of DNA duplex (see Figure 1). When LTR DNA is bound to IN and the complex treated with DNase, a 16 base pair duplex length of the LTR is protected from digestion29,30. Thus there is agreement between the size of the protected DNA and residues in close proximity to the LTR DNA that change specificity for substrate (Figure 1). Moreover, when mutations are introduced into HIV-1 LTR duplex DNA substrates at 16 base pairs from the ends, changes in concerted DNA integration in vitro are detected61. Interactions between viral DNA and HIV-1 IN have also been demonstrated in cross-linking studies for residues 143, 148, 156, 159, 160, 230, 246, 262, 263, and 26431-35. These residues are highlighted in Figure 6. A group cluster in and around the catalytic site in close proximity to the first 6 bases/base-pairs of the processed LTR ends. A second group which includes residues 246, 262, 263, and 264 are in close proximity to base-pairs 15-16 of the LTR DNA that interact with residues 229, 230 and 253 identified in this study. Finally, Agapkina, et al. used substrate analogs to probe contacts between HIV-1 IN and LTR DNA substrates36. 
In this study they identified eleven contacts with the sugar phosphate backbones from residues 5-9 and interactions with four bases asymmetrically distributed between the two strands of the LTR ends. The fifteen residues that influence specificity for 3′ processing reactions are spatially in close proximity to all of these sugar phosphate backbone contacts and most of the base contacts.
There are a series of naphthyridine carboxamide and diketo acid related drugs that act in the nanomolar range to inhibit HIV-1 IN11,12,37-44. Drug resistant enzymes with changes at more than ten different sites were identified. Five of these residues were unique in the structural alignment of different INs and were located near the LTR ends in the structural model. While these drugs were thought to act at strand transfer and not 3′ processing, we found that HIV-1 IN residues S153 and V72 (ref. 4) as well as S230 were among those positions involved in LTR end recognition and 3′ processing. Because drug resistant sites affect specific recognition of the viral DNA ends and change the rate of processing of HIV-1 substrates, we predict that amino acid changes at some of these sites will lead to partially defective INs in cells. This should subsequently result in selection of second site substitutions that compensate for the loss in 3′ processing activity caused by the initial drug resistant amino acid substitutions. If correct, we predict that depending upon the extent of change in 3′ processing observed towards HIV-1 duplex substrates4, our data will be correlated to the appearance of individual or multiple residue substitutions detected in drug resistant enzymes. For example, in the case of a position 153 chimera, it gains the ability to 3′ process the ASV LTR end duplex with only a small decrease in its ability to 3′ process the HIV-1 duplex substrate4. As such, we would predict that this mutation would be found by itself and should have only a small effect on replication of HIV-1 in cells, as observed44,45. In contrast, the substitution at position 72 (V72W) caused a larger decrease in the ability to process the HIV-1 U5 duplex substrate4. On this basis, we would predict that the V72I drug resistant mutation would appear in the presence of other substitutions that compensate for the loss in its 3′ processing towards HIV-1 substrates. 
Second site mutations of F121Y and T125K subsequently appear in HIV-1 IN containing the V72I mutation12. A T125S substitution increases the 3′ processing of U5 HIV-1 duplex4 as well as joining of a HIV-1 preprocessed substrate. When combined with the V72W mutation, this produces an enzyme with near wild type levels of 3′ processing, suggesting that a second site mutation compensates for the decrease in 3′ processing caused by the initial drug resistant mutation. Another illustrative example involves position 230 where substitutions at this residue also affect recognition of the viral DNA ends. With the caveat that the exchange of S230E was analyzed as a double mutant in combination with D229I, it gained the ability to cleave the ASV substrate, but in contrast to chimeras with changes at positions 72 or 153, it displayed an increase in activity towards HIV-1 substrates. Changes at position 230 are reported to appear in conjunction with T66I and M74L substitutions in cells11. We have not analyzed substitutions at position 66 in vitro, but Lee and Robinson reported that the T66I substitution caused a small decrease in 3′ processing44. We tested a M74A substitution that resulted in a significant loss of 3′ processing of HIV-1 substrates. Taken together, this suggests that substitution at position 230 might compensate for the loss in 3′ processing caused by mutations at positions 74 and 66.
In examining the structural model, we identified a trench on the HIV-1 IN surface with its long axis almost perpendicular to those accommodating the viral DNA ends4. We speculate that the target DNA fits into this groove. Moreover, interactions with LEDGF will further stabilize this IN: target DNA complex and would be consistent with the role for LEDGF in promoting the interaction of the integration complex with host chromosomal DNA14,17,28,46-48. The target DNA is positioned between the viral DNA ends and this location will facilitate the nucleophilic attack of the 3′ hydroxyl ends of the respective CA strands into each strand of the target DNA. There are several lines of evidence that support this hypothesis. First, amino acid residues S119, N120, C130, W132, and K159 are reported to interact with target DNA based upon activity and drug sensitivity data21,22,24,45,49. These residues strikingly align along one surface wall of the proposed target DNA binding trench. A N120S mutant is reported to increase 3′ processing and strand transfer activities while N120Q and N120K mutants show little effect on processing but some decrease in strand transfer22,45. The C130S and W132A/G/R substitutions are reported to have normal 3′ processing but little or no joining activity24. A C130S substitution in combination with three other mutations shows loss in strand transfer but with some decrease in 3′ processing9. More recently, four HIV-1 IN mutants with W132Y, M178C, F181G, and F185G substitutions, respectively, were constructed25. The enzymes with mutations at positions W132, F181, and F185 did not support a strand transfer reaction but had wild type or near wild type levels of 3′ processing. In contrast, the mutation at M178 showed decreases in both activities25. This latter residue lies below the surface of the target DNA binding trench (data not shown). 
An ASV IN mutant, structurally equivalent to HIV-1 IN S119, has normal 3′ processing but barely detectable strand transfer activity21. Similarly, Asp substitutions in ASV IN equivalent to HIV-1 IN G94 and S123 show the same activity phenotype (Michael Katzman, Penn State College of Medicine, personal communication). Finally, HIV-1 IN substitutions of I141K, I203P or I203K (Corinne Ronfort, Universite de Lyon, personal communication) also show a loss in strand transfer with little effect on 3′ processing. As shown in Figure 4B, G94, S119, S123, W132, I141, F181, F185 and I203 lie on the IN surface in the putative target DNA binding trench (green residues) and are aligned with other residues that cause similar activity defects. As reported here, the K219S mutation also loses strand transfer but maintains 3′ processing activity. In contrast to the above residues, K219 is found on the opposite wall of the trench (see Figure 4B, red residues). A K219A mutation has been analyzed for its effect on HIV-1 replication and was reported to have a limited effect50. It is not known why it did not show a stronger phenotype. This may be because alanine, rather than the serine used in this study, was substituted, or it may reflect differences in sensitivity between in vitro and in-cell assays.
Second, the target DNA binding site contains peptides previously shown to be cross-linked to the target DNA portion of a disintegration substrate modified with an azidophenacyl group6. After UV photoactivation, cross-links were established between the DNA substrate and six endoproteinase Glu-C-digested peptides. In terms of our model, the peptide 139-152 represents the active site between the two LTR ends and contains Q148 as well as Q137, Q146, and N144. A second peptide, 213-247, is found at a distance from the catalytic site in the putative target DNA binding site and contains K219 but not K211. The K211S mutation has no effect on activity of IN in vitro. The other peptides identified in that report were implicated in binding both viral and target DNA substrates or only to the viral DNA substrate, in agreement with the model's predictions.
Third, we find a series of residues (Arg, Lys, Gln, and Asn) that appear along the length of the putative DNA binding pocket, which are found in known DNA binding sites of other enzymes51-53 and could therefore be involved in binding to the target DNA. In contrast to the residues interacting with the LTR ends, these residues are conserved among INs to different extents. This would be consistent with IN inserting the viral DNA into many sites in the target DNA. Five of these residues (Q62, N117, Q148, N155, K159) have been mutated and cause defects to 3′ processing, strand transfer, and disintegration8,22,33-35. These residues are predicted to lie in the catalytic site between the ends of the two LTRs so they could interact with both the viral and the target DNAs. This conclusion is supported by a recent study that showed that Q148 cross-linked to the ends of the LTRs54.
Fourth, when we examined the positions in the structural model of 50 amino acid residues described in the literature8,22,35,55-58 where mutations result in either little or no effect on or decreases in both 3′ processing and joining activities, none lay in the proposed target DNA binding trench. The only exception is when the targeted amino acids were positioned between the two LTR ends where they could interact with both viral and target DNAs. Additionally, Puglia et al.59 reported an analysis of HIV-1 IN where in-frame insertions of small peptides were placed at 56 sites. The mutants were analyzed for changes in the joining reaction (but not 3′ processing because a preprocessed substrate was used). We examined the positions of these mutations in our structural model and can interpret their reported activity changes. For example, when the bulky insertions are near the enzyme surface but not near the viral DNA ends or the proposed target DNA binding site, the model predicts and Puglia et al.59 report that there is no effect on activity. In contrast, when the peptide insertions are at the surface near either the viral DNA or target DNA binding sites, we predict that there should be a disruption to the joining reaction as observed. When insertions are buried within the structure we predict distortions that disrupt all activities and this too is seen.
Finally, we mapped the naphthyridine carboxamide and diketo acid related drug resistant sites on the structural model. The drug resistant sites associated with diketo acid compounds (residues 66, 74, 92, 143, 148, and 151-155) map in the active site region of the target DNA binding trench near or between the two LTRs. The naphthyridine carboxamide related drug resistant sites (residues 121 and 125) map in the target DNA binding trench near the LEDGF binding sites shown in Figure 4. This observation suggests that the naphthyridine carboxamide related drugs might interfere with formation of the LEDGF-HIV-1 IN complex. Others (residues 72 and 150) map in the target DNA binding site near the LTRs. Taken together, these results are consistent with the model and the hypothesis for the binding of target DNA.
[γ-33P]-ATP (2500 Ci/mmole) was purchased from Perkin Elmer Life Sciences. HiTrap™ Chelating HP resin and HiTrap™ Heparin HP resin were purchased from GE Healthcare Life Sciences (Piscataway, NJ). T4 polynucleotide kinase was from USB (Cleveland, Ohio). IPTG was from Roche (Indianapolis, IN). The Slide-A-Lyzer Dialysis cassette (10kD MWCO) was obtained from Pierce (Rockford, IL). CentriPrep centrifugal filter devices with YM-10 MW membranes were from Millipore (Bedford, MA). Acrylamide and bisacrylamide solutions were from Bio-Rad (Hercules, CA). SimplyBlue Safe Stain was from Invitrogen (Carlsbad, CA). DE81 filters were purchased from Whatman International Ltd (Kent, UK). Unless specified, all restriction enzymes were purchased from New England Biolabs (Beverly, MA). ASV IN was provided by Dr. Ann Skalka (Fox Chase Cancer Center, Philadelphia, PA). An expression construct for HIV-1 IN residues 1-288 (p28bIN-3CS-F185H) was also obtained from the laboratory of Dr. Ann Skalka and contains the wild type NY5 HIV-1 sequence (Parke Davis clone) from the NdeI to the HindIII site in the pET28b plasmid vector. The IN sequence encodes 4 substitutions (C56S, C65S, C280S, and F185H) to increase solubility and a six amino acid His-tag separated from the N-terminus of IN by a thrombin cleavage site. Two translation stop codons were added after residue D288. MLV oligo substrates were a gift from Monica Roth (UMDNJ).
The protein expression host strain BL21(DE3) was purchased from Novagen (Madison, WI). Selection of mutagenized plasmids for chimera construction was performed in Supercompetent XL1-Blue cells from Stratagene (La Jolla, CA). Storage of confirmed clone DNA was carried out in DH5α at −80°C using competent cells from Invitrogen (Carlsbad, CA). Unless otherwise noted, bacteria were selected using LB+Kan50 media at 37°C.
The following oligos were used in the integrase 3′ processing activity assay:
The following oligos were used in the integrase strand transfer activity assay to simulate pre-processed LTR ends:
The plus-strand substrates (100pmoles, containing the conserved ‘CA’ dinucleotides) were 5’ end labeled using T4 polynucleotide kinase (30U) and [γ-33P]-ATP as previously described4. The specific activity of the radiolabeled substrates was diluted to 105 cpm/pmol using unlabeled plus-strand oligo and the mixture was purified and recovered from a 20% denaturing polyacrylamide gel. Duplex oligos were formed by annealing to a molar excess of unlabeled complementary strand as described4.
Mutagenesis oligos were obtained from Integrated DNA Technologies Inc (Coralville, Iowa) and are listed in Supplementary Data Table 1. The mutations were constructed using QuikChange® Site-Directed Mutagenesis Kit from Stratagene (La Jolla, CA) according to manufacturer's directions. Codon preferences for E. coli were used in the oligo design. The presence of all mutations was confirmed by sequencing the complete individual DNA clones. The Wizard® Plus SV Miniprep DNA Purification System (Promega, Madison, WI) was used to prepare DNA for cloning.
His-tagged HIV-1 IN chimeras were purified from the soluble fraction as previously described60 with some modification. Briefly, proteins were induced in BL21 (DE3) cells at 20°C by adding IPTG to 0.5mM after the bacteria had grown to optical density at 600nm of 0.8. The bacteria were lysed in 25mM Bis Tris, pH 6.1, 1M KCl, 1M urea, 1% thiodiglycol, 5mM imidazole and then filtered through 0.22μm membrane from Millipore (Billerica, MA). The lysate fraction was applied to a HiTrap™ Chelating HP Ni-affinity column (5ml) and IN was eluted with a 5mM to 1.0M linear imidazole gradient. Fractions containing IN, as detected by absorbance at 280nm and confirmed by SDS-PAGE with staining using the SimplyBlue Safe Stain, were applied to a HiTrap™ Heparin HP column (5ml) and eluted with a 0.25 - 1.0M linear KCl gradient. Selected fractions were concentrated using a Centriprep filter with YM-10 MW membrane and then dialyzed against 25mM Bis Tris, pH 6.1, 0.5M KCl, 1% Thiodiglycol, 1mM DTT, 0.1mM EDTA, and 40% glycerol. The purified protein was aliquoted and stored at −80°C. The protein concentration was determined using a Bio-Rad protein assay as described by the manufacturer.
The processing reactions for the HIV-1 U5 or ASV U3 LTR substrates were carried out as previously described61. These two substrates were used because they are the “dominant” LTR ends where substitutions into these sequences caused large decreases in the rate of concerted integration in vitro62-64. Substitutions introduced in the HIV-1 U3 or ASV U5 LTR substrates change the mechanism of concerted DNA integration in vitro from two to one-ended insertion events. Reactions were in a volume of 12μl with 25mM MOPS, pH 7.2, 10mM DTT, 15mM potassium glutamate, 5% PEG8000, 5% DMSO, 500ng of HIV-1 IN or HIV-1 IN chimeras and 1pmol of labeled duplex substrate as indicated. Reaction mixtures were assembled from individual components and preincubated overnight at 4°C. To start the processing reaction, MgCl2 was added to a final concentration of 10mM and reaction mixtures were incubated at 37°C for 90 minutes. The reactions were stopped by the addition of 3μl of Stop buffer (95% formamide, 20 mM EDTA, 0.1% xylene cyanol, 0.1% bromphenol blue), heated to 95°C for 5 min, and then placed on ice. Products of the reaction were separated through a 20% polyacrylamide denaturing sequencing gel. Labeled reaction products were visualized using KODAK MR film by exposure overnight. For reactions containing ASV IN the final reaction mixture contained 20mM MOPS, pH 7.2, 3mM DTT, 100μg/ml BSA, 500ng ASV IN, and 1pmol labeled duplex substrates as indicated. For the strand transfer assay, the reaction conditions are identical to those used for 3′ end processing; however, the substrates used mimic the preprocessed LTR end (designed with 5′-CA dinucleotide overhang) in order to promote strand transfer. The reaction products were analyzed by denaturing gel electrophoresis4,65.
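For orientation, the amounts quoted above can be converted to molar concentrations. The sketch below is a rough calculation, not part of the original protocol. It assumes that the stated 12 μl is the final reaction volume and that the His-tagged IN monomer has a mass of about 32 kDa; that mass is an approximate, assumed value that is not given in the text. All names are hypothetical.

```python
REACTION_VOLUME_UL = 12.0   # reaction volume from the text (assumed final)
SUBSTRATE_PMOL = 1.0        # labeled duplex substrate per reaction
INTEGRASE_NG = 500.0        # IN protein per reaction
# ASSUMPTION: approximate monomer mass; not stated in the source.
INTEGRASE_MONOMER_KDA = 32.0


def nanomolar(amount_pmol, volume_ul):
    """Convert pmol in a volume of microliters to nM (pmol/ul equals uM)."""
    return amount_pmol / volume_ul * 1000.0


# 1 ng of a 1 kDa species is 1 pmol, so ng / kDa gives pmol directly.
integrase_pmol = INTEGRASE_NG / INTEGRASE_MONOMER_KDA

substrate_nm = nanomolar(SUBSTRATE_PMOL, REACTION_VOLUME_UL)
integrase_nm = nanomolar(integrase_pmol, REACTION_VOLUME_UL)

print(f"substrate: {substrate_nm:.1f} nM")
print(f"IN monomer: {integrase_nm:.0f} nM "
      f"(~{integrase_pmol / SUBSTRATE_PMOL:.1f} monomers per duplex)")
```

Under these assumptions the enzyme is in molar excess over the DNA substrate; the exact ratio shifts with the true molecular mass of the tagged protein.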
The crystal structure of LEDGF (residues 345-426) bound to the IN catalytic domain dimer (2B4J) was superimposed on the IN catalytic domains in the tetramer model. Then, the program AMMP66 was used to produce and minimize hydrogen atom positions for the two LEDGF monomers and the monomers were minimized using conjugate gradients with all non-bonded and geometric terms. The all-atom sp4 potential set67,68 was used with the charge generation parameters from Bagossi et al.69 and a dielectric of 1.0. The LEDGF monomers were combined with the model consisting of the tetramer of full-length IN with Zn and Mg atoms, two 20-mer LTRs, as described in Chen et al.4. The new model with LEDGF was optimized by 500 cycles of conjugate gradients minimization in AMMP to ensure good non-bonded interactions. The standard AMMP conjugate gradients algorithm with the Polak-Ribière beta and inexact line search was used for optimization. The total non-bonded energy was minimized from 11,754,460.0 kcal/mol to −20,446.6 kcal/mol, and the maximum magnitude (l1 norm) of the derivative of the energy (gradient) was reduced from 1,734,389,397.9 kcal/(mol Å) to 50.03 kcal/(mol Å) in the final model comprising 22,162 atoms. Figures of the model were made using PyMol70.
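To illustrate the type of optimizer named above, the following is a minimal sketch of nonlinear conjugate gradients with a Polak-Ribière beta and an inexact line search, applied to a two-variable toy function. It is not the AMMP implementation. The backtracking (Armijo) line search, the clipping of beta at zero, the restart rule, the toy energy function and all names are assumptions chosen for illustration.

```python
def minimize_cg(f, grad, x0, max_cycles=500, tol=1e-6):
    """Nonlinear conjugate gradients with a Polak-Ribiere beta (sketch)."""
    x = list(x0)
    g = grad(x)
    d = [-gi for gi in g]                    # first direction: steepest descent
    for _ in range(max_cycles):
        if sum(abs(gi) for gi in g) < tol:   # l1 norm of the gradient
            break
        slope = sum(gi * di for gi, di in zip(g, d))
        if slope >= 0.0:                     # not a descent direction: restart
            d = [-gi for gi in g]
            slope = -sum(gi * gi for gi in g)
        # Inexact line search: halve the step until the Armijo
        # sufficient-decrease condition holds (or the step underflows).
        step, fx = 1.0, f(x)
        x_new = [xi + step * di for xi, di in zip(x, d)]
        while f(x_new) > fx + 1e-4 * step * slope and step > 1e-12:
            step *= 0.5
            x_new = [xi + step * di for xi, di in zip(x, d)]
        g_new = grad(x_new)
        # Polak-Ribiere beta, clipped at zero as a common safeguard.
        beta = max(0.0, sum(gn * (gn - go) for gn, go in zip(g_new, g))
                   / sum(go * go for go in g))
        d = [-gn + beta * di for gn, di in zip(g_new, d)]
        x, g = x_new, g_new
    return x


def toy_energy(p):
    """Hypothetical stand-in for a potential energy; minimum at (1, -2)."""
    x, y = p
    return (x - 1.0) ** 2 + 10.0 * (y + 2.0) ** 2


def toy_gradient(p):
    x, y = p
    return [2.0 * (x - 1.0), 20.0 * (y + 2.0)]


minimum = minimize_cg(toy_energy, toy_gradient, [0.0, 0.0])
print([round(v, 4) for v in minimum])
```

The cap of 500 cycles mirrors the number of minimization cycles reported for the model; in a real force field the energy and gradient would be evaluated over all atoms rather than two variables.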
We thank Yuan-Fang Wang for assistance with the modeling and preparing structural Figures. This work was supported in part by United States Public Health Service Grants AI054143 (to J.L.), GM6290 and GM065762 (to I.T.W. and R.W. H.). J.D. is supported in part by the Training Program in Viral Replication T32 AI060523.