The homotetramer form of IN catalyzes all of its known enzymatic activities. While dimers of IN are capable of catalyzing 3′ processing and strand transfer reactions, they do not support a concerted DNA integration reaction
26. For this reason, a homotetramer model was assembled. In the model, two of the four subunits are depicted with major contacts with the LTR and target DNAs. The remaining two subunits are available for contacts with proteins that interact with integrase
27,28 including the host protein LEDGF as shown in our new augmented IN tetramer model. There are at least fifteen residues in the LTR binding groove on the IN surface that are associated with LTR specificity. They are predicted to interact asymmetrically along a 15-16 base length of DNA duplex (see ). When LTR DNA is bound to IN and the complex treated with DNase, a 16 base pair duplex length of the LTR is protected from digestion
29,30. Thus there is agreement between the size of the protected DNA and residues in close proximity to the LTR DNA that change specificity for substrate (). Moreover, when mutations are introduced into HIV-1 LTR duplex DNA substrates at 16 base pairs from the ends, changes in concerted DNA integration in vitro are detected
61. Interactions between viral DNA and HIV-1 IN have also been demonstrated in cross-linking studies for residues 143, 148, 156, 159, 160, 230, 246, 262, 263, and 264
31-35. These residues are highlighted in . One group clusters in and around the catalytic site, in close proximity to the first 6 bases/base pairs of the processed LTR ends. A second group, which includes residues 246, 262, 263, and 264, is in close proximity to base pairs 15-16 of the LTR DNA that interact with residues 229, 230, and 253 identified in this study. Finally, Agapkina et al. used substrate analogs to probe contacts between HIV-1 IN and LTR DNA substrates
36. In this study they identified eleven contacts with the sugar phosphate backbones from residues 5-9 and interactions with four bases asymmetrically distributed between the two strands of the LTR ends. The fifteen residues that influence specificity for 3′ processing reactions are spatially in close proximity to all of these sugar phosphate backbone contacts and most of the base contacts.
There are a series of naphthyridine carboxamide and diketo acid related drugs that act in the nanomolar range to inhibit HIV-1 IN
11,12,37-44. Drug resistant enzymes with changes at more than ten different sites were identified. Five of these residues were unique in the structural alignment of different INs and were located near the LTR ends in the structural model. While these drugs were thought to act at strand transfer and not 3′ processing, we found that HIV-1 IN residues S153 and V72
4 as well as S230 were among those positions involved in LTR end recognition and 3′ processing. Because drug resistant sites affect specific recognition of the viral DNA ends and change the rate of processing of HIV-1 substrates, we predict that amino acid changes at some of these sites will lead to partially defective INs in cells. This should subsequently result in selection of second site substitutions that compensate for the loss in 3′ processing activity caused by the initial drug resistant amino acid substitutions. If correct, we predict that depending upon the extent of change in 3′ processing observed towards HIV-1 duplex substrates
4, our data will be correlated to the appearance of individual or multiple residue substitutions detected in drug resistant enzymes. For example, a position 153 chimera gains the ability to 3′ process the ASV LTR end duplex with only a small decrease in its ability to 3′ process the HIV-1 duplex substrate
4. As such, we would predict that this mutation would be found by itself and should have only a small effect on replication of HIV-1 in cells, as observed
44,45. In contrast, the substitution at position 72 (V72W) caused a larger decrease in the ability to process the HIV-1 U5 duplex substrate
4. On this basis, we would predict that the V72I drug resistant mutation would appear in the presence of other substitutions that compensate for the loss in its 3′ processing towards HIV-1 substrates. Second site mutations of F121Y and T125K subsequently appear in HIV-1 IN containing the V72I mutation
12. A T125S substitution increases the 3′ processing of U5 HIV-1 duplex
4 as well as joining of an HIV-1 preprocessed substrate. When combined with the V72W mutation, this produces an enzyme with near wild type levels of 3′ processing, suggesting that a second site mutation compensates for the decrease in 3′ processing caused by the initial drug resistant mutation. Another illustrative example involves position 230, where substitutions also affect recognition of the viral DNA ends. With the caveat that the S230E exchange was analyzed only as a double mutant in combination with D229I, this enzyme gained the ability to cleave the ASV substrate but, in contrast to chimeras with changes at positions 72 or 153, displayed an increase in activity towards HIV-1 substrates. Changes at position 230 are reported to appear in conjunction with T66I and M74L substitutions in cells
11. We have not analyzed substitutions at position 66
in vitro, but Lee and Robinson reported that the T66I substitution caused a small decrease in 3′ processing
44. We tested an M74A substitution that resulted in a significant loss of 3′ processing of HIV-1 substrates. Taken together, these observations suggest that substitution at position 230 might compensate for the loss in 3′ processing caused by mutations at positions 74 and 66.
In examining the structural model, we identified a trench on the HIV-1 IN surface with its long axis almost perpendicular to those accommodating the viral DNA ends
4. We speculate that the target DNA fits into this groove. Moreover, interactions with LEDGF would further stabilize this IN:target DNA complex, consistent with the role for LEDGF in promoting the interaction of the integration complex with host chromosomal DNA
14,17,28,46-48. The target DNA is positioned between the viral DNA ends and this location will facilitate the nucleophilic attack of the 3′ hydroxyl ends of the respective CA strands into each strand of the target DNA. There are several lines of evidence that support this hypothesis. First, amino acid residues S119, N120, C130, W132, and K159 are reported to interact with target DNA based upon activity and drug sensitivity data
21,22,24,45,49. These residues strikingly align along one surface wall of the proposed target DNA binding trench. A N120S mutant is reported to increase 3′ processing and strand transfer activities while N120Q and N120K mutants show little effect on processing but some decrease in strand transfer
22,45. The C130S and W132A/G/R substitutions are reported to have normal 3′ processing but little or no joining activity
24. A C130S substitution in combination with three other mutations shows loss in strand transfer but with some decrease in 3′ processing
9. More recently, four HIV-1 IN mutants with W132Y, M178C, F181G, and F185G substitutions, respectively, were constructed
25. The enzymes with mutations at positions W132, F181, and F185 did not support a strand transfer reaction but had wild type or near wild type levels of 3′ processing. In contrast, the mutation at M178 showed decreases in both activities
25. This latter residue lies below the surface of the target DNA binding trench (data not shown). An ASV IN mutant, structurally equivalent to HIV-1 IN S119, has normal 3′ processing but barely detectable strand transfer activity
21. Similarly, Asp substitutions in ASV IN equivalent to HIV-1 IN G94 and S123 show the same activity phenotype (Michael Katzman, Penn State College of Medicine, personal communication). Finally, HIV-1 IN substitutions of I141K, I203P or I203K (Corinne Ronfort, Universite de Lyon, personal communication) also show a loss in strand transfer with little effect on 3′ processing. As shown in , G94, S119, S123, W132, I141, F181, F185 and I203 lie on the IN surface in the putative target DNA binding trench (green residues) and are aligned with other residues that cause similar activity defects. As reported here, the K219S mutant also loses strand transfer activity but maintains 3′ processing activity. In contrast to the above residues, K219 is found on the opposite wall of the trench (see , red residues). A K219A mutation has been analyzed for its effect on HIV-1 replication and was reported to have a limited effect
50. It is not known why it did not show a stronger phenotype. This may be related to alanine, rather than the serine used in this study, being substituted, or may reflect differences in sensitivity between
in vitro and in cell assays.
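The activity phenotypes cited in this paragraph can be tabulated to make the pattern explicit. The following is a minimal sketch, not part of the original analysis: the qualitative labels ("normal", "reduced", "lost", "increased") and the names `PHENOTYPES` and `strand_transfer_selective` are assumptions introduced here for illustration, and "little or no joining" is simplified to "lost".

```python
# Qualitative phenotypes transcribed from the paragraph above as
# (3' processing, strand transfer). The labels are a simplification
# introduced for this sketch; they are not measured values.
PHENOTYPES = {
    "N120S": ("increased", "increased"),
    "N120Q": ("normal", "reduced"),
    "N120K": ("normal", "reduced"),
    "C130S": ("normal", "lost"),
    "W132A": ("normal", "lost"),
    "W132G": ("normal", "lost"),
    "W132R": ("normal", "lost"),
    "W132Y": ("normal", "lost"),
    "M178C": ("reduced", "reduced"),
    "F181G": ("normal", "lost"),
    "F185G": ("normal", "lost"),
    "I141K": ("normal", "lost"),
    "I203P": ("normal", "lost"),
    "I203K": ("normal", "lost"),
    "K219S": ("normal", "lost"),
}


def strand_transfer_selective(phenotypes):
    """Return mutants with normal 3' processing but impaired strand transfer."""
    return sorted(
        name
        for name, (processing, strand_transfer) in phenotypes.items()
        if processing == "normal" and strand_transfer in ("reduced", "lost")
    )


selective = strand_transfer_selective(PHENOTYPES)
print(selective)
```

Under these simplified labels, every listed mutant except N120S and M178C falls into the strand-transfer-selective class, which is the pattern expected for residues lining a target DNA binding site.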
Second, the target DNA binding site contains peptides previously shown to be cross-linked to the target DNA portion of a disintegration substrate modified with an azidophenacyl group
6. After UV photoactivation, cross-links were established between the DNA substrate and six endoproteinase Glu-C-digested peptides. In terms of our model, the peptide 139-152 represents the active site between the two LTR ends and contains Q148 as well as Q137, Q146, and N144. A second peptide, 213-247, is found at a distance from the catalytic site in the putative target DNA binding site and contains K219 but not K211. The K211S mutation has no effect on activity of IN
in vitro. The other peptides identified in that report were implicated in binding both viral and target DNA substrates or only to the viral DNA substrate in agreement with the model's predictions.
Third, we find a series of residues (Arg, Lys, Gln, and Asn) that appear along the length of the putative DNA binding pocket, which are found in known DNA binding sites of other enzymes
51-53 and could therefore be involved in binding to the target DNA. In contrast to the residues interacting with the LTR ends, these residues are conserved among INs to different extents. This would be consistent with IN inserting the viral DNA into many sites in the target DNA. Five of these residues (Q62, N117, Q148, N155, K159) have been mutated, and the mutations cause defects in 3′ processing, strand transfer, and disintegration
8,22,33-35. These residues are predicted to lie in the catalytic site between the ends of the two LTRs so they could interact with both the viral and the target DNAs. This conclusion is supported by a recent study that showed that Q148 cross-linked to the ends of the LTRs
54.
Fourth, when we examined the positions in the structural model of 50 amino acid residues described in the literature
8,22,35,55-58 where mutations either have little or no effect on, or decrease, both 3′ processing and joining activities, none lay in the proposed target DNA binding trench. The only exceptions were targeted amino acids positioned between the two LTR ends, where they could interact with both viral and target DNAs. Additionally, Puglia et al.
59 reported an analysis of HIV-1 IN where in-frame insertions of small peptides were placed at 56 sites. The mutants were analyzed for changes in the joining reaction (but not 3′ processing because a preprocessed substrate was used). We examined the positions of these mutations in our structural model and can interpret their reported activity changes. For example, when the bulky insertions are near the enzyme surface but not near the viral DNA ends or the proposed target DNA binding site, the model predicts and Puglia et al.
59 report that there is no effect on activity. In contrast, when the peptide insertions are at the surface near either the viral DNA or target DNA binding sites, we predict that there should be a disruption to the joining reaction, as observed. When insertions are buried within the structure, we predict distortions that disrupt all activities, and this too is seen.
Finally, we mapped the naphthyridine carboxamide and diketo acid related drug resistant sites on the structural model. The drug resistant sites associated with diketo acid compounds (residues 66, 74, 92, 143, 148, and 151-155) map in the active site region of the target DNA binding trench near or between the two LTRs. The naphthyridine carboxamide related drug resistant sites (residues 121 and 125) map in the target DNA binding trench near the LEDGF binding sites shown in . This observation suggests that the naphthyridine carboxamide related drugs might interfere with formation of the LEDGF-HIV-1 IN complex. Others (residues 72 and 150) map in the target DNA binding site near the LTRs. Taken together, these results are consistent with the model and the hypothesis for the binding of target DNA.
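The mapping of drug resistant sites described above can be summarized as a simple lookup. This is an illustrative sketch only: the region labels paraphrase the text, and the names `SITE_REGIONS` and `region_of` are hypothetical, introduced here rather than taken from the modeling work.

```python
# Drug resistant residue positions grouped by where the text says they map
# in the structural model. Region labels paraphrase the paragraph above.
SITE_REGIONS = {
    "active site region, near or between the two LTRs": [
        66, 74, 92, 143, 148, 151, 152, 153, 154, 155,
    ],
    "target DNA binding trench, near the LEDGF binding sites": [121, 125],
    "target DNA binding site, near the LTRs": [72, 150],
}


def region_of(residue):
    """Return the mapped region for a residue position, or None if unlisted."""
    for region, residues in SITE_REGIONS.items():
        if residue in residues:
            return region
    return None


for position in (66, 125, 72):
    print(position, "->", region_of(position))
```

Positions not named in the paragraph return `None`, so the lookup makes no claim about residues the text does not discuss.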