We performed a fragment screen of 500 compounds against the His-tagged IN-CCD protein expressed in E. coli, using surface plasmon resonance (SPR), NMR and crystallography (data not shown). The initial screen of a commercial library (Maybridge Ro3) was run by both SPR and NMR, and crystallographic analysis was used to confirm the initial hits. Based on these hits, several analogues were chosen from the CSIRO compound library and tested by SPR for binding affinity and by crystallography for the location of binding. One of the hits that showed good density in the LEDGF site was lactone 1, which was found by SPR to have better affinity for the core4H construct of the CCD than for the core3H construct (750 µM vs 1570 µM). Core4H has a Trp131Asp mutation that is not present in core3H, and this residue forms part of the wall of the LEDGF pocket, suggesting that this compound would not interact as well with wild type IN. Additionally, it was found in the crystal and in assays that 1 existed as the ring-opened form, and this form (compound 2) was as active as the original sample in the SPR assays. The 1H
)-one of 2 occupies the same position in the IN pocket (PDB 3ZT3) as residue Ile365 in the LEDGF loop. In addition, the carboxylic acid of the compound makes a virtually identical interaction to that of Asp366 of the LEDGF loop. This charge interaction is key to the series developed here, to the series of peptides that have been shown to interact with the LEDGF binding site on HIV IN, and to the binding of small molecules developed by other groups. This was the basis for our first analogue 3, which demonstrated similar affinity to 2 in the AS assay (270 µM and 200 µM respectively) and in the SPR core3H assay (1435 µM vs 1375 µM), and showed clear density in the LEDGF site.
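For a sense of scale, the construct preference reported for lactone 1 can be expressed as a free-energy difference. The sketch below assumes that the SPR affinities are equilibrium dissociation constants (K_D) and that the measurements were made near 25 °C; neither is stated in the text, and the function name is purely illustrative.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(kd_micromolar, temperature_k=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation constant.

    Uses dG = RT * ln(K_D / 1 M). Treating the reported SPR affinity as a
    K_D, and the temperature of 298.15 K, are assumptions of this sketch.
    """
    kd_molar = kd_micromolar * 1e-6
    return R_KCAL * temperature_k * math.log(kd_molar)


# SPR affinities reported for lactone 1 (µM)
dg_core4h = binding_free_energy(750.0)
dg_core3h = binding_free_energy(1570.0)

# Positive value: binding to core3H is weaker than to core4H
ddg = dg_core3h - dg_core4h

print(f"core4H: {dg_core4h:.2f} kcal/mol")
print(f"core3H: {dg_core3h:.2f} kcal/mol")
print(f"difference: {ddg:.2f} kcal/mol")
```

Under these assumptions the roughly twofold difference in affinity between the two constructs corresponds to about 0.4 kcal/mol.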
All values in the table are in micromolar (µM).
Chemical compounds explored to determine structure activity relationship with IN.
Overlay of structures in the LEDGF site.
Data and model statistics for the 10 co-crystal complexes of compounds with HIV integrase.
Inspection suggested that a seven-membered ring would fill this pocket more effectively, and accordingly several 2,3,4,5-tetrahydro-1H
diazepin-1-one based analogues were synthesised (examples are compounds 4
). Several compounds in this series were active in the AS assay (e.g. compound 4 at 110 µM) but did not bind in the SPR assays. However, when 4 was tested in a virus infectivity assay, its activity was of the same order of magnitude as that in the cell toxicity counter screen (275 µM and 265 µM, respectively), indicating that true activity could not be determined. This suggested a possible problem with cell permeability for this series. Upon analysis of the crystal of IN soaked with 5, it was found that the ring-opened structure, compound 6 (a side product of the synthetic pathway), was bound (PDB 3ZSZ). To confirm this result, compound 6 was isolated directly from the original preparative reaction mixture; this sample afforded the identical crystal complex and an AS activity of 270 µM, but binding could not be detected in the SPR assay.
We noted that compound 6 contained a secondary amino group, and several N-alkylated derivatives of 6 were subsequently prepared; encouragingly, this modification restored binding in the SPR assay. Alkylation of the secondary amine of 6 led to 7
Me) which displayed similar levels of AS activity. Thus, tertiary amine 7 displayed an affinity of 595 µM in the SPR core3H assay, compared to >2000 µM for secondary amine 6. From 7, replacement of the p-methoxyphenyl group led to 8
cyclohexyl) which gave 100 µM inhibition in the AS activity assay but with loss of SPR activity, whereas 9
allyl) retained SPR activity. Analysis of the crystal complexes of compounds 6
suggested that branched amide analogues could fill the pocket more effectively, so we synthesized compounds 10 and 11. Both had better activity in the AS assay (29 µM and 8 µM respectively), with 11 having the best activity in the series. To confirm that compounds 10 and 11 were not giving a false positive reading in the AS assay, they were tested in a counter screen using the Flag-His6 fusion protein and showed 6-fold and almost 20-fold less activity respectively (175 µM and 145 µM). Both compounds show the desired selectivity in the SPR assay for the core3H over the core4H CCD construct. A cell-based HIV-1 infection assay was performed to obtain EC50 values; 11 returned an EC50 of 29 µM and, in the counter screen, a CC50 of >100 µM. To provide further evidence that compound 11 did not interact at the IN active site, the compound was assayed in the cell infectivity assay using IN double active site mutants, either Q148H/G140S (QHGS) or N155H/E92Q (NHEQ), and returned a similar EC50 of 54 µM (±4 SD) or 37 µM (±4 SD). In the same assay raltegravir (the clinically approved IN inhibitor Isentress™) has an EC50 of 10 nM against virus with wild type IN, but was essentially inactive (EC50 >1 µM) against virus with either the QHGS or NHEQ mutations. Raltegravir has been confirmed by crystallography to bind at the active site of IN.
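The fold changes behind these comparisons can be checked with a few divisions. The snippet below is only an illustrative sketch using the values quoted in the text; the helper name is hypothetical, the pairing of 54 µM with QHGS and 37 µM with NHEQ is assumed from the order in which they are listed, and the raltegravir figure is a lower bound because its mutant EC50 is reported only as >1 µM.

```python
def fold_change(reference_um, test_um):
    """How many times higher the test concentration is than the reference."""
    return test_um / reference_um


# AS assay vs. Flag-His6 counter screen for compounds 10 and 11 (µM)
as_assay = {"10": 29.0, "11": 8.0}
counter_screen = {"10": 175.0, "11": 145.0}
counter_fold = {
    cpd: fold_change(as_assay[cpd], counter_screen[cpd]) for cpd in as_assay
}

# Compound 11 in the cell infectivity assay (µM).
# Assumption: values are paired with the mutants in the order listed.
ec50_wild_type = 29.0
ec50_mutants = {"QHGS": 54.0, "NHEQ": 37.0}
mutant_fold = {
    name: fold_change(ec50_wild_type, value)
    for name, value in ec50_mutants.items()
}

# Raltegravir: 10 nM (0.010 µM) wild type vs. >1 µM for either mutant,
# so this ratio is a minimum fold shift, not a measured value.
raltegravir_fold_min = fold_change(0.010, 1.0)

for cpd, fold in counter_fold.items():
    print(f"compound {cpd}: {fold:.1f}-fold weaker in counter screen")
for name, fold in mutant_fold.items():
    print(f"compound 11, {name}: {fold:.1f}-fold EC50 shift")
print(f"raltegravir: at least {raltegravir_fold_min:.0f}-fold EC50 shift")
```

On these numbers compound 11 shifts by less than twofold against either double mutant, whereas raltegravir shifts by at least a hundredfold, which is the contrast the text relies on.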
SPR sensorgrams showing three compounds binding to immobilized HIV integrase core3H (left panels) and core4H (right panels).
The crystal structure with our initial compound 2 bound was superposed onto the crystal structure with compound 11 for comparison. The amine of 11 sits deeper in the LEDGF hydrophobic pocket and makes another hydrogen bond to the HIV IN backbone (the carbonyl of Gln168). This hydrogen bond recapitulates the one seen in the crystal structures to the backbone amine of Ile365 of the IBD (PDB 2B4J) and of cyclic peptides (e.g. PDB 3AVB) bound to this site on HIV IN. A similar orientation of 11 is also shown with the Christ et al. LEDGIN-6 (CX04328, PDB code 3LPU) superposed in two positions. Because crystallographic packing differs between the two structures (3LPU has a monomer in the asymmetric unit, whereas the structures presented here have dimers), superposition of the protein structures (the monomer in 3LPU onto one of the monomers in the dimer of 3ZSO; 144 residues align with an r.m.s.d. of ~0.8 Å) does not align the key carboxylic acid motifs of the compounds. One alignment, 4A, is therefore based on the superposition of the protein, and the other, 4B, on superposition of the compounds. In both alignments, 11 delves deeper into the LEDGF pocket and makes additional hydrophobic interactions as well as the additional hydrogen bond to Gln168.
Superposition of structures with 2 (cyan) and 11 (magenta) in the LEDGF site.
Superposition of 3LPU structure over structure with compound 11.