The human immunodeficiency virus envelope glycoproteins, gp120 and gp41, function in cell entry by binding to CD4 and a chemokine receptor on the cell surface and orchestrating the direct fusion of the viral and target cell membranes. On the virion surface, three gp120 molecules associate noncovalently with the ectodomain of the gp41 trimer to form the envelope oligomer. Although an atomic-level structure of a monomeric gp120 core has been determined, the structure of the oligomer is unknown. Here, the orientation of gp120 in the oligomer is modeled by using quantifiable criteria of carbohydrate exposure, occlusion of conserved residues, and steric considerations with regard to the binding of the neutralizing antibody 17b. Applying similar modeling techniques to influenza virus hemagglutinin suggests a rotational accuracy for the oriented gp120 of better than 10°. The model shows that CD4 binds obliquely, such that multiple CD4 molecules bound to the same oligomer have their membrane-spanning portions separated by at least 190 Å. The chemokine receptor, in contrast, binds to a sterically restricted surface close to the trimer axis. Electrostatic analyses reveal a basic region which faces away from the virus, toward the target cell membrane, and is conserved on core gp120. The electrostatic potentials of this region are strongly influenced by the overall charge, but not the precise structure, of the third variable (V3) loop. This dependence on charge and not structure may make electrostatic interactions between this basic region and the cell difficult to target therapeutically and may also provide a means of viral escape from immune system surveillance.
The human immunodeficiency viruses (types 1 [HIV-1] and 2 [HIV-2]) and related simian viruses (SIVs) cause the depletion and functional dysregulation of CD4+ lymphocytes, resulting in the development of AIDS. The HIV envelope contains two glycoproteins: gp120, the exterior receptor-binding component, and its noncovalently interacting partner, gp41, the transmembrane envelope glycoprotein. Only half of gp41 is exposed in the ectodomain; the other half, separated by a transmembrane region, is thought to anchor the envelope complex to the underlying matrix. New infections are initiated by interaction of gp120 with the N-terminal membrane-distal domain of CD4, a glycoprotein on the surface of specific cells of the immune system (12, 29). A second interaction of gp120, with a member of the chemokine receptor family, primarily CCR5 or CXCR4, is believed to trigger conformational changes in gp41, which ultimately mediates virus-cell membrane fusion (20, 38).
HIV receptor binding takes place in the context of an oligomeric viral spike. Atomic-level structures have been determined for many of the component molecules: the entire extracellular portion of CD4 (60), the complex of monomeric core gp120 (a truncated version of gp120 with deletions of the gp41-interactive region at the N and C termini as well as of the variable V1/V2 and V3 loops) with the two N-terminal domains of CD4 and the antigen binding fragment (Fab) of the neutralizing antibody 17b (32), and a final fusion-active state of the gp41 trimer (10, 52, 56). Despite extensive effort, the structure of the oligomeric spike has resisted atomic-level investigation and is only known from electron microscopy (21, 22).
Accumulating evidence suggests that the HIV viral spike is a trimer of gp120-gp41 heterodimers. The most convincing evidence comes from the structural resemblance of the fusion-active state of gp41 to other fusion-active trimeric coiled-coils, including the equivalent transmembrane envelope proteins from Moloney murine leukemia virus (19), influenza virus (6), and Ebola virus (55). Other suggestive evidence comes from the introduction of cysteines into the coiled-coil to create disulfide-stabilized trimers (16), the trimeric nature of the underlying HIV matrix which interacts with gp41 (25), the trimerization of various ectodomain constructs of gp120-gp41 (X. Yang, L. Florin, M. Farzan, P. Kolchinsky, P. D. Kwong, J. Sodroski, and R. Wyatt, submitted for publication) and the therapeutic success of a peptide which appears to work by stabilizing an intermediate trimeric state in the gp120-gp41 fusion process (11, 28, 58).
Although the conformation of core gp120 is known by antigenic studies to be similar in the gp120-gp41 complex, molecular docking of gp120 onto the trimeric gp41 is not feasible because gp41 undergoes large conformational changes (6, 10, 56). Nonetheless, by correlating the antigenic map of gp120 with its atomic structure, a preliminary model of oligomeric gp120 was defined (62). Because this model was based on antibody binding, which occurs in the context of a ~600-Å² epitope, it was of relatively low resolution. Still, it showed that the regions of the oligomer facing the target cell after CD4 binding consisted of two components: a conserved portion of the core and a sequence-variable excursion, the V3 loop. These components have been shown by mutational analysis to interact with the chemokine receptor CCR5 (45).
Here we more precisely model the HIV-1 envelope glycoprotein oligomer, using quantifiable criteria based on carbohydrate exposure, occlusion of conserved surface residues that are solvent exposed on the gp120 protomer, and steric constraints imposed by the binding of the 17b antibody. We have applied the same modeling techniques to influenza virus hemagglutinin to estimate the modeling precision, since the structures of both the monomeric “HA top” of hemagglutinin (the gp120 core equivalent) and the HA1-HA2 heterotrimer (the gp120-gp41 complex equivalent) have been determined (4, 59). We have also investigated the electrostatic nature of the gp120 region facing the target cell, examining in particular the dependence of the potential on the structure and overall charge of the V3 loop. In a companion study (39), we tested the electrostatic predictions of our model on the binding of heparan sulfate and dextran sulfate to different variants of gp120. Finally, we examine the consequences of our results with regard to initial virus-cell attachment, viral mechanisms of immune evasion, and the feasibility of anion-based therapeutic strategies.
Superpositions, center-of-mass calculations, model rotations and translations, radius of gyration analysis, and solvent exposure calculations were performed with the software XPLOR (5). The structure of the monomeric core gp120 complexed with the two membrane-distal domains of CD4 (D1D2) and the neutralizing antibody 17b (Protein Data Bank [PDB] accession code 1gc1) was superimposed onto the corresponding D1D2 domains (residues 1 to 178) of the structure of the four-domain CD4 (the entire extracellular region) (PDB accession code 1wio) (60). The superposition gave an RMS deviation of 2.12 Å for all atoms in residues 1 to 178.
The coordinate system used for modeling the gp120 oligomer (Fig. 1) had the following properties: (i) the z axis was coincident with the gp120 trimer axis; (ii) all three rotational degrees of freedom were permitted; (iii) since the coordinate system was threefold symmetric, only one other translation axis, chosen here to be the x axis, was independent (translations along the y axis were thus related by a Z rotation and x axis translation); and (iv) the initial position of the gp120 protomer was oriented by the superposition described above and placed with its center of mass at x = 35 Å and y = 0. (To distinguish between translations and rotations, lowercase letters x, y, and z are used to specify both the axis and the translational position along each axis and uppercase letters X, Y, and Z are used to specify the rotations about each axis.)
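As a minimal illustration of the threefold-symmetric coordinate system, the sketch below generates the three symmetry-related copies of a point by rotating it 0°, 120°, and 240° about the z axis. The function names are hypothetical, and the z coordinate of the center of mass is set to 0 purely for illustration, since the text specifies only x = 35 Å and y = 0.

```python
import math

def rotate_z(point, degrees):
    """Rotate an (x, y, z) point about the z (trimer) axis."""
    x, y, z = point
    a = math.radians(degrees)
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a),
            z)

def trimer_copies(point):
    """Return the three symmetry-related copies of a point."""
    return [rotate_z(point, k * 120.0) for k in range(3)]

# Protomer center of mass at the starting position (z = 0 is an assumption).
centers = trimer_copies((35.0, 0.0, 0.0))

# Separation of two protomer centers; geometrically this is 35 * sqrt(3) Å.
spacing = math.dist(centers[0], centers[1])
```

Because every copy is produced by a Z rotation, a translation of one protomer along y can always be re-expressed as a Z rotation plus an x translation, which is why only one in-plane translation is independent.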
The gp120 protomer was rotated around its center of mass. The distance from the z axis of each of the various criteria chosen for quantification was calculated at 10° intervals. These calculations were performed with the program GRASP, with the normal to the z axis being determined by calculating the minimum of the distance matrix between the criterion being analyzed and a set of pseudoatoms placed at 1-Å intervals along the z axis.
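The rotational scan can be sketched as follows. This is not the XPLOR/GRASP procedure itself: the distance to the z axis is computed directly as sqrt(x² + y²) rather than through a distance matrix to pseudoatoms, and the single-point "criterion surface" is a hypothetical stand-in for a real molecular surface.

```python
import math

def rotate_about_center_z(points, center, degrees):
    """Rotate points about an axis through `center` parallel to the z axis."""
    a = math.radians(degrees)
    c, s = math.cos(a), math.sin(a)
    cx, cy, _ = center
    rotated = []
    for x, y, z in points:
        dx, dy = x - cx, y - cy
        rotated.append((cx + c * dx - s * dy, cy + s * dx + c * dy, z))
    return rotated

def min_axis_distance(points):
    """Closest approach of any point to the z (trimer) axis."""
    return min(math.hypot(x, y) for x, y, _ in points)

def scan_z(points, center, step=10):
    """Tabulate the closest approach to the trimer axis at each Z rotation."""
    return {angle: min_axis_distance(rotate_about_center_z(points, center, angle))
            for angle in range(0, 360, step)}

# Hypothetical criterion: one surface point 10 Å outboard of the protomer center.
center = (35.0, 0.0, 0.0)
patch = [(45.0, 0.0, 0.0)]
profile = scan_z(patch, center)

# Rotation that brings the patch nearest to the trimer axis.
best = min(profile, key=profile.get)
```

For a criterion that should be buried (conserved residues) one would look for the minimum of such a profile, and for a criterion that should be exposed (carbohydrate) the maximum.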
Three criteria were chosen for quantification: carbohydrate exposure, conserved and exposed residues, and steric constraint of the 17b Fab fragment. For calculations of carbohydrate exposure, the structure of gp120 used was a model described previously (62), which comprised the crystallographic core (32), the modeled residues 88 and 89 and the V4 loop, and the central (N-acetylglucosamine)2(mannose)3 glycan moiety of all N-linked glycosylations. Because the positions of the mannose residues were poorly defined in the model (62), the N-acetylglucosamine residues, which are more proximal to the protein and appear less conformationally flexible, were used to represent the carbohydrate. The molecular surface of the N-acetylglucosamine residues was constructed, and the distance of this surface to the z axis was calculated as a function of gp120 orientation.
Conserved residues were conserved across all HIV-1 isolates (32). The fractional solvent accessibility for individual amino acids of core gp120 which were extracted from the 1gc1 complex was calculated as the ratio of the solvent-accessible surface area for atoms of an amino acid residue X in the protein to that area obtained after reducing the structure to a Gly-X-Gly tripeptide (47). Residues were considered exposed if they had a solvent accessibility of more than 40%. The resultant set of conserved and exposed residues on the gp120 core (33 residues) was further delineated by removal of those within van der Waals radii of either CD4 or the 17b Fab (10 residues excluded) or those that were glycosylated (2 residues excluded). Finally, a clustering analysis which excluded outliers more than 5 Å from the main cluster of conserved solvent-exposed residues (six residues excluded) was performed. The residues chosen by imposing the above criteria were 102, 103, 113, 114, 204, 208, 209, 211, 213, 214, 216, 221, 250, 439, and 491. The molecular surface of these residues was constructed, and the distance of this surface to the z axis was calculated as a function of the gp120 orientation.
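The exposure test and the residue bookkeeping described above can be checked with a short sketch. The surface-area values passed to `is_exposed` are hypothetical; the residue counts and the final residue list are taken from the text.

```python
def fractional_accessibility(area_in_protein, area_in_tripeptide):
    """Ratio of a residue's accessible area in the protein to that in Gly-X-Gly."""
    return area_in_protein / area_in_tripeptide

def is_exposed(area_in_protein, area_in_tripeptide, threshold=0.40):
    """A residue counts as exposed when its accessibility exceeds 40%."""
    return fractional_accessibility(area_in_protein, area_in_tripeptide) > threshold

# Hypothetical areas in square angstroms, for illustration only.
exposed_example = is_exposed(90.0, 200.0)   # 45% accessible
buried_example = is_exposed(60.0, 200.0)    # 30% accessible

# Counts reported in the text.
conserved_exposed = 33
near_cd4_or_17b = 10
glycosylated = 2
cluster_outliers = 6
remaining = conserved_exposed - near_cd4_or_17b - glycosylated - cluster_outliers

# Final residue list reported in the text.
selected = [102, 103, 113, 114, 204, 208, 209, 211, 213, 214,
            216, 221, 250, 439, 491]
```

The successive exclusions leave 15 residues, matching the length of the reported list.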
To quantify steric constraint of the 17b Fab, its molecular surface was calculated as oriented by the position of the gp120 protomer, and the distance from this surface to the z axis was calculated. This criterion was less strict than the others since it was dependent on translational positioning, for which there was little constraint. To account for this, an orientation was considered sterically forbidden only if the distance from the 17b surface to the z axis was less than 3 Å for both the orientation being considered and the previous 10° rotation, effectively adding a 10° buffer zone to the sterically forbidden region.
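The buffered steric rule can be written as a small filter. This is a sketch under two assumptions that the text does not spell out: the "previous" rotation is taken to be the angle 10° lower, and an angle with no recorded predecessor is treated as unconstrained. The distance profile is hypothetical.

```python
def forbidden_angles(distance_by_angle, cutoff=3.0, step=10):
    """Return angles whose 17b-to-axis distance is below `cutoff` both at
    that angle and at the preceding rotation step."""
    forbidden = []
    for angle in sorted(distance_by_angle):
        previous = (angle - step) % 360
        here = distance_by_angle[angle]
        before = distance_by_angle.get(previous, float("inf"))
        if here < cutoff and before < cutoff:
            forbidden.append(angle)
    return forbidden

# Hypothetical distances (Å) from the 17b surface to the z axis.
profile = {0: 12.0, 10: 2.5, 20: 1.0, 30: 0.5, 40: 4.0}
blocked = forbidden_angles(profile)
```

In this example the 10° orientation falls below the 3-Å cutoff but is not rejected, because the preceding orientation is clear; only 20° and 30° are marked as forbidden.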
The accuracy of the surface criterion optimization procedure was tested with the monomeric HA top of influenza virus hemagglutinin complexed with Fab HC19 (PDB accession code 2vir) (4). The “correct” oligomeric orientation was defined by the HA1-HA2 heterotrimer (PDB accession code 5hmg) (54, 59), positioned with its trimer axis coincident with the z axis. Superposition of the HA top onto a protomer of the oriented oligomeric HA1-HA2 heterotrimer gave an RMS deviation of 1.09 Å for all atoms in residues 43 to 309.
Quantification of surface criteria described above for gp120 was performed on the HA top structure with several modifications. For the carbohydrate criterion, a BLAST search (1) of the GenBank database (release 113.0) with the HA top sequence enabled 485 HA1 sequences to be aligned. All sites of potential N-linked glycosylation in the aligned sequences were identified. These mostly nonconserved sites of glycosylation were at residues 45, 63, 81, 122, 126, 133, 144, 165, 246, 276, and 285. A model of the molecular surface of the nonbackbone portions of these residues in the HA top structure (2vir) was constructed, and the distance from this surface to the z axis was calculated as a function of the HA top orientation.
For the conserved exposed residue criterion, a threshold of 99% identity was used with the same 485 sequences. A 40% solvent exposure criterion was used, as calculated for the HA top (2vir) structure. Because the receptor (sialic acid) is small and no main cluster of conserved exposed residues could be identified, receptor distance exclusion and outlier rejection were not used. Conserved exposed residues identified were 55, 57, 104, 107, 110, 129, 165, 169, 187, 208 to 210, 212, 221, 222, 225, 238, 240, 263, 269, 271, 285, 289 to 291, 293, 304, and 308. A model of the molecular surface of the nonbackbone portions of these residues in the HA top structure was constructed, and the distance of this surface to the z axis was calculated as a function of the HA top orientation.
Finally, for the steric constraint criterion, the Fab fragment of the influenza virus-neutralizing antibody HC19 was used in place of the HIV-neutralizing antibody 17b.
In addition to the typical modeling criteria, such as avoiding steric clashes and maximizing hydrogen bonding, the position of the 17b antibody was used to provide additional constraints since it is known that this antibody binds to both native gp120 and V3 loop-truncated gp120. Three different models of the V3 loop were constructed by using the program O (27). For the helix model, called alpha, the g and h helices of sperm whale myoglobin, residues 106 to 137 of 1mbo, were extracted and substituted for the GAG residues of the V3 loop in 1gc1. The myoglobin sequence was replaced by the HXBc2 sequence, and successive rounds of stereochemical optimization coupled to manual rebuilding were performed to remove steric clashes. For the beta-strand model, called beta, the f and g strands of the immunoglobulin Bence-Jones protein, residues 80 to 107 of 1rei, were grafted onto the core gp120 in the same manner as described for the alpha model. The nmr model was derived from nuclear magnetic resonance (NMR) analyses of V3 loops from several different HIV-1 isolates (7, 8) and consisted primarily of random-coil secondary structure. Five different NMR structures were examined in relation to the core gp120. Only two, from the mn and Haiti isolates, both refined from H2O-trifluoroethanol (TFE) mixtures, could be grafted onto the core without extensive clashes. Upon completion of structure building, clashes with the carbohydrate at position 332 of the gp120 core could not be resolved with the model derived from the Haiti isolate, and so this isolate was not used in further analyses. The nmr model thus derives solely from the NMR analysis of the mn isolate in H2O-TFE (8).
Electrostatic analyses were performed with the program DelPhi (41) with a protein dielectric of 2.0, a solvent dielectric of 80, an ion exclusion radius of 2.0 Å, a probe radius of 1.4 Å, and an ionic strength of 0.14 M. For pictorial display, the precise DelPhi potentials were read into the program GRASP (42), with the local potentials displayed at the solvent-accessible surface.
Homology modeling was carried out with the program PrISM (64), with sequence alignments and homologous models constructed based on the 1gc1 gp120 structure.
A three-dimensional model of the gp120 portion of the trimeric HIV envelope glycoprotein complex was developed subject to the conditions that the surface of gp120 that is occluded in the oligomer interface should (i) maximize carbohydrate exclusion, (ii) minimize conserved residue exposure, and (iii) be sterically compatible with binding of the 17b antibody. Since gp120 interacts with gp41, which is a symmetric trimer in its isolated form and remains trimeric when in the complex, the model was constrained to be threefold symmetric. The occluded interface in the oligomer is expected to comprise gp120-gp41 contacts, gp120-gp120 contacts, and occluded surfaces that large ligands such as antibodies cannot access. The constraining conditions for the model were chosen because, a priori, they represent reasonable expectations about the oligomer interface and because they could be reduced to quantifiable criteria.
Glycosylation sites occur almost exclusively on the exposed surfaces of protein molecules, and they occur near protein interfaces only at the periphery. Carbohydrate residues in N-linked glycans tend to be both flexible and highly hydrated; although they can be secured by protein contacts, the resultant entropic loss is large, making such interactions generally unlikely. As a consequence, we could expect the oligomeric interface to be free of glycosylation.
We would also expect exposed surface residues on gp120 to be variable, a consequence of immune pressure. Possible factors of conservation are limited primarily to occlusion at the oligomer interface, involvement with receptor binding, or constraints of folding topology. Conservation due to the last two criteria could be eliminated by examining residues for proximity to either the CD4 or the 17b Fab binding site (the 17b site here served as a surrogate for the chemokine receptor binding site) and by performing a clustering analysis to remove statistical outliers (constraints on exposed residues for structural purposes should be rare).
Several ligands are known to bind to gp120 in the context of the oligomeric complex as well as to isolated gp120. The sites for such ligands must be free on the oligomer surface, outside the oligomer interface, and also appropriately oriented for biological interactions. The 17b antibody is known to bind to oligomeric gp120 (53) and does not cause appreciable dissociation of the gp120 from the oligomer (44). Therefore, any valid oligomer model must be sterically compatible with 17b binding. Similarly, productive binding of HIV to CD4-positive cells is known to involve intact glycoprotein oligomers. Therefore, one expects the CD4 binding site to be both free on the oligomer surface and appropriately oriented for attachment to CD4 on the cell surface.
The crystal structure of core gp120 was first elaborated with a modeled completion of two N-terminal residues (one of which is glycosylated), of the V4 loop, and of the two N-acetylglucosamine residues at each site of N-linked glycosylation. This elaborated core was then oriented as a rigid body relative to a coordinate system established with a threefold-symmetry axis perpendicular to the viral surface (Fig. 1). The initial orientation of gp120 was set such that the D1D2 portion of CD4 in the core gp120-CD4 complex would be superimposed onto a protomer of D1D2 in the structure of dimeric soluble CD4, oriented with its diad axis perpendicular to the hypothetical cell surface. (The dimeric soluble CD4, which consists of the entire extracellular portion of CD4, crystallizes as a dimer in three different space groups; its orientation is physiologically relevant and thus serves to position the hypothetical cell surface.) The gp120 protomer was then reoriented about its center of mass, displaced from the triad axis sufficiently to avoid collisions with other protomers. First, all rotational orientations about the z axis were tested with respect to the quantifiable criteria, and the optimal value was found to be at 30° (Fig. 2a, left panel). Then, with the Z orientation at the optimum, rotations were made successively about the X and Y rotational axes (Fig. 2a, middle and right panels). A protomer in the optimized model can be obtained from the 1gc1 PDB coordinates by the Euler rotation (Θ1 = 6.90, Θ2 = 112.34, Θ3 = 22.60) followed by the translation (tx = −25.91, ty = −71.24, tz = 30.90). This procedure of successive rotations, which was used for computational economy, does not sample all of the rotational space; nevertheless, we expect the result to be close to the optimum for the three conditions since it preserves the initial orientation relative to CD4 on the cell surface.
In addition, visual inspection of the final orientation confirms that it is close to, if not at, the global optimum.
The carbohydrate criterion and the exposed and conserved surface residue criterion were completely independent. Nonetheless, quantification of these two criteria led to maxima and minima within 20° of each other for all independent axial rotations (with the 17b steric criterion used bifunctionally to eliminate incompatible orientations) (Fig. 2). Taken together, the three criteria produced well-defined peaks with all three independent rotational parameters, thereby allowing a “best” orientation to be derived at an X rotation of 0°, a Y rotation of 0°, and a Z rotation of 30° (0° 0° 30°). This best orientation actually corresponds to two possible alignments with respect to the viral membrane, “up” and “down.” One of these alignments could be eliminated due to CD4 (or 17b) steric constraints; the binding of CD4 (or 17b) in this orientation would bury it in the viral membrane (Fig. 3), thus permitting a single unique best orientation to be derived.
The resultant model of the core oligomer (Fig. 3) displayed several relevant features that were not included in the modeling criteria. The observed and deduced binding sites for CD4 and neutralizing antibodies were all on exposed surfaces. The variable loops (V1 to V5) were all well exposed on the oligomer. The N and C termini were clustered at the end pointing toward the viral membrane (with the C terminus proximal to the oligomer axis poised to interact with the trimeric gp41). Additionally, the region identified by substitutional mutagenesis as interacting with chemokine receptors (45) was free of carbohydrate and pointed directly at the target membrane. These features are consistent with the biology of the gp120 trimer and with the qualitative characteristics of the rough model obtained previously (62).
To estimate the precision of the surface criterion quantification procedure, we turned to the influenza virus hemagglutinin system. Atomic-level structures are known for both the HA1-HA2 heterotrimer (59) (equivalent to the gp120-gp41 oligomeric complex) and a monomeric proteolytic fragment, the HA top, complexed with the Fab of a neutralizing antibody (4) (equivalent to the gp120-17b complex). In addition, the fusion-activated forms of gp41 and HA2 are structurally similar, and although core gp120 and the HA top show no sequence similarity, they have almost identical radii of gyration (20.6 and 20.8 Å, respectively), making rotations about a position displaced the same distance from the trimer axis reasonable.
The HA top contains much less glycosylation than core gp120. Only three sites are present, although no carbohydrate is modeled in the 2vir coordinates (X31 isolate). In contrast, 18 sites are present on the gp120 core model (HXBc2 isolate). To increase the accuracy of this criterion for the HA top, all nonconserved sites of glycosylation were identified and then used to enhance overall surface coverage, increasing the residues used in the carbohydrate criterion to 11. Even so, this was less than two-thirds of that used for gp120 (compare the structures in Fig. 2c). In addition, because side chains, instead of glycan moieties, were used to mark the positions of the carbohydrates, criterion uncertainty increased. Nevertheless, the residues identified tended to be on the outside of the trimer, and for rotations in Y and Z this criterion produced well-defined maxima close to the known trimer orientation (Fig. 2b and c).
The hemagglutinin sequence is much less divergent than the gp120 sequence. This makes it difficult to find a reasonable set of conserved exposed residues. (If there were no divergence, for example, this criterion would be meaningless.) Using 485 aligned HA1 sequences and a minimum solvent exposure of 40%, 39 residues showed 98% identity, 28 showed 99% identity, and 12 showed 100% identity. The 100% criterion was judged too strict because sequencing errors might account for some divergence, and so a 99% criterion was used. This stringent 99% criterion selected approximately the same number of surface residues as the less-restrictive gp120 criterion (for which all single-atom substitutions [e.g., Gln to Glu] were included, as well as larger substitutions as long as they did not change the character of the amino acid [e.g., Lys to Arg or Tyr to Trp]). While these residues tended to cluster at the known oligomer interface, the minima produced by this criterion were not well-defined (Fig. 2b and c). For example, the minimum Z rotation of the furthest conserved residues appears close to the maximum for the nearest (Fig. 2b, left panel), and the X rotation showed very little change in parameter distance (Fig. 2b, middle panel). This lack of definition may be related to the packing of the hemagglutinin trimer, with conserved residues clustering at two discrete interfaces, as opposed to gp120, for which one central cluster was observed (Fig. 2c). Such clustering would account, for example, for the local maximum at the origin of the Z rotation for the minimum conserved residue distance.
Finally, because the HC19 Fab binds further from the trimer axis than 17b, it does not produce as much of a steric clash. Indeed, for rotations in X, no angles are restricted.
Judging from the poor shape of the quantification curves, the surface criteria optimization did not work as well with the HA top as with the gp120 core. Overall results of the optimization procedures are shown in Table 1. The internal agreement of the criterion optimizations, as judged by the peak orientational difference between the carbohydrate and conserved-residue extrema, was much better for gp120. Despite the poor agreement, the mean of the extrema for the HA top was within 20° of the correct orientation, suggesting that averaging these independent criteria reduced the overall error. In the case of gp120, the 17b steric criterion reduced the mean deviation even further, to one-fourth that observed for the HA top. This suggested that the error associated with the rotational parameters of the resultant gp120 oligomeric model was only 5°.
While it is conceivable that the deletion of the gp120 N and C termini (57 and 19 amino acids, respectively) could alter the modeling results, we feel that this is unlikely. With respect to the carbohydrate and 17b steric criteria, the missing termini are carbohydrate free and 17b binds to core gp120. Thus, both criteria should be unaffected. With respect to the conserved and exposed amino acid criterion, similar extrema were observed for both the furthest and closest residues (Fig. 2a, left panel). If some of the 15 exposed and conserved amino acids are covered by the termini, a subset should give similar results. In the event that all or nearly all 15 are covered, this would localize much of the missing termini in the same region as the previous exposed amino acids; since the missing termini correspond to the epitopes of virtually all of the nonneutralizing antibodies, this would again place the expected oligomer interface in a similar region. Finally, the agreement between the carbohydrate and exposed-residue criteria was so good that even if the exposed-residue criterion were omitted, the extrema would change by only 5° on average. Thus, we feel that the presence of the missing termini is unlikely to substantially alter the results obtained here.
Although the modeling precisely defined the rotational parameters of the trimer, the translational parameter was only partially determined. Steric constraints defined a minimum approach, and distance constraints between the C terminus of gp120 and the N terminus of gp41 defined a maximum, but these two criteria did not discriminate sharply. With the center of rotation for a protomer placed at 35 Å from the trimer axis, which was close to the sterically constrained minimum approach, the Cα-Cα diameter of the model was ~110 Å; if the (mannose)3 carbohydrate extensions were included, the diameter increased to ~150 Å. These dimensions agreed with electron microscopic observations of the viral spike, although these observations vary widely, from 100 Å (by negative staining of gp120 from SIV) to 150 Å (by ultrathin section tannic acid cytochemistry and surface replica electron microscopy of HIV-1). If the distance from the trimer axis is proportional to the dimensions of the protomer in the x and y directions, using the relative proportions of hemagglutinin as a guide would place the gp120 core 30.5 Å from the trimer axis (Fig. 2c). A protomer of this hemagglutinin proportionate model can be obtained from the 1gc1 coordinates with the Euler rotation specified previously followed by the translation (tx = −30.43, ty = −71.24, tz = 30.90). At this distance, steric clashes occur between neighboring gp120 protomers with the carbohydrate at position 197, although these are easily resolved by minor movements of this flexible portion of the model. (This proportionate model is shown in Fig. 2c and 7; the rest of the figures depict the gp120 protomer with a center of mass at x = 35 Å.)
Electrostatic analysis of our model of the oligomeric core gp120 defined an electropositive surface which would face the target cell membrane. We tested the robustness of this basic region to changes in the modeling parameters. As can be seen in Fig. 4, rotations of ±30° and translations of ±5 Å maintained the electropositivity of the region.
We used sequence analysis combined with homology modeling to test the conservation of this basic region throughout the primate immunodeficiency viruses. Although the charge of the region differed among viruses, its basic nature was conserved in different clades as well as with HIV-2 and SIV (Fig. 5).
The V3 loop comprises roughly 30 amino acids. It is too large to be modeled correctly without experimental constraints. We used the steric constraints inherent in connecting the V3 loop to core gp120 as well as those consistent with 17b Fab binding. Nevertheless, very different V3 loop structures could be successfully built (Fig. 6a).
The overall charge of the V3 loop ranges from +2 to +10, with that of a CCR5-using isolate generally in the range of +3 to +5 and that of a CXCR4-using isolate (often a T-cell-line-adapted strain) being from +7 to +10. With the HXBc2 sequence (+9 charge on the V3 loop), we analyzed the electrostatics of three very different V3 loop structures, as well as the effect of varying the overall net V3 loop charge (Fig. 6c).
Analysis of the potential near the cell-facing region of the oligomer showed a correlation between the potential and the overall V3 loop charge, especially at distances further than 20 Å from the gp120 core (Fig. 6c, middle and right panels). In contrast, the potential did not seem to correlate with the precise structure of the V3 loop (Fig. 6c, left panels).
The relatively low electrostatic potential for a zero V3 loop net charge suggested that the contribution of the core to the overall potential was small. We tested this explicitly by calculating the electrostatic potential as a function of only the V3 loop (Fig. 6c). At long distances and with a high charge (for example, CXCR4-using isolates), the V3 loop potential dominated, approximating closely the potential for the entire core with V3 loop. At short distances and with a low V3 loop charge, the core contributed significantly to the electrostatic potential. This was especially true at x = 35, y = 0, directly below the protomer, close to the above-described basic conserved region on the core.
From the earliest days of structural biology, for example, with myoglobin and hemoglobin, biologically important results have been extracted from low-resolution oligomeric models of more highly resolved protomers. The construction of such models often requires little additional information; if the protomer conformation does not change, only three angles and one distance are needed to specify a symmetric oligomer. Here, optimization of quantifiable surface criteria was used to precisely determine the orientation of gp120 in the oligomeric viral spike. The orientational precision of better than 10° corresponded to an average positional error of less than 3.5 Å for a molecule with a 20-Å radius of gyration. Because we did not have precise experimental limits on the translation component, however, our analysis of the model was limited to properties that depend primarily on rotational parameters. Still, the resultant model defined several interesting features, and here we explore their functional, immunological, and therapeutic implications.
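The link between a 10° rotational uncertainty and a positional error under 3.5 Å can be checked with elementary geometry. The sketch below assumes that the "average positional error" is the displacement of a point lying one radius of gyration (20 Å) from the center of rotation; the text does not state whether a chord or an arc was intended, so both are computed.

```python
import math

def chord_displacement(radius, degrees):
    """Straight-line displacement of a point at `radius` after a rotation."""
    return 2.0 * radius * math.sin(math.radians(degrees) / 2.0)

def arc_displacement(radius, degrees):
    """Path length travelled by a point at `radius` during a rotation."""
    return radius * math.radians(degrees)

radius_of_gyration = 20.0   # Å, as quoted in the text
rotation_error = 10.0       # degrees, upper bound quoted in the text

chord = chord_displacement(radius_of_gyration, rotation_error)
arc = arc_displacement(radius_of_gyration, rotation_error)
```

Both measures come out just under 3.5 Å, consistent with the figure quoted above.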
Studies of the effect of soluble CD4 on virus entry suggest that more than one CD4 molecule must bind the envelope glycoprotein oligomer to initiate virus-cell fusion (33, 37, 50). The model derived here demonstrates that three CD4 molecules can bind without CD4:CD4 steric interference (in the hemagglutinin proportionate model, the closest CD4-to-CD4 contact is over 35 Å). CD4 binds obliquely to the sides of the oligomer, with the third and fourth domains of CD4 being almost parallel to the cell membrane. Because CD4 is an extended molecule and it binds gp120 at the N-terminal membrane-distal domain, the membrane-spanning portion of CD4 is positioned far from the oligomer axis. With the hemagglutinin proportionate model, the end of the second CD4 domain (residue 178) would be 127 Å from the same position on an adjacent CD4 and the last ordered extracellular residue (363) would be 198 Å away (Fig. 7). Even accounting for a tighter protomer packing or for the known segmental flexibility between domains 2 and 3 of CD4, membrane-spanning portions of adjacent CD4 molecules would be separated by at least 190 Å.
The chemokine receptor, in contrast, binds close to the trimer axis, at a gp120 surface which is almost 20 Å closer to the target cell membrane. This arrangement is consistent with simultaneous CD4 and chemokine receptor binding to a protomer. In terms of multiple chemokine receptors binding to the oligomer, our data show that the 17b Fab fragment, whose epitope overlaps the chemokine receptor binding site (45), lies at the edge of the sterically restricted “forbidden zone” in all three rotational parameters (Fig. 2). This suggests that multiple chemokine receptors bound to the same oligomer may be sterically strained. Since the chemokine receptor is an integral membrane protein and z translations are constrained, such strain could only be relieved by movement of the gp120 protomer away from the trimer axis. Such movement may serve to signal the binding state of gp120 at the target cell membrane to the gp41-interactive regions, roughly 50 Å distal, triggering gp41 fusion-related conformational changes at the appropriate moment. In this regard, it may be relevant that some of the substitutions that affect chemokine receptor binding map closer to the trimer axis than the 17b epitope, and many of these axis-proximal substitutions induce dissociation of gp120 from the oligomer (45).
Multivalent CD4 molecules have been constructed with the intent of binding multiple gp120 molecules to enhance avidity. Our results show that this is only possible with an extremely long and flexible linker, thus ruling out enhanced affinity with, for example, the two N-terminal domains of CD4 in an antibody format (9). Indeed, the dimensions are such that although the position of viral spikes is fixed by the underlying matrix, binding to a neighboring spike (which is roughly 380 Å away, assuming 72 viral spikes per virion and a virion diameter of 1,000 Å) would be almost equally feasible. (Such orientational steric restrictions should also apply to antibodies that bind to the CD4 binding site.)
Finally, with regard to the known segmental flexibility of CD4, the optimal orientations of the x and y axes of gp120 in the quantitative modeling were found to coincide with the dimer orientation of CD4. Given that the orientations were based on completely different criteria, this coincidence suggests that the orientation of CD4 on the cell surface may be more rotationally constrained with respect to the plane of the membrane than previously thought (60).
A central mechanistic question is how CD4 activates gp120 on the virion, transforming the stable envelope into a reactive fusogen. CD4 is known to cause conformational changes in gp120, including most prominently the formation of the bridging sheet (32, 53). Interestingly, the edge of this conserved sheet can be seen as a pivotal site of contact in the oligomer; although the specific details of the contact are dependent on the ill-defined translation component, it is the well-defined rotation parameters which position the bridging sheet in its prominent location. In the hemagglutinin proportionate model, steric clashes are found with residue 197 in this region. Moreover, deletion of the outer two strands of the bridging sheet generates an oligomer to which CD4 binds without inducing shedding (63), and mutations in this region, especially of the glycosylation site at residue 197, correlate with a CD4-independent infection phenotype (30). All of these lines of evidence are consistent with the notion that activation of the trimer may involve close contact of the bridging sheet with the neighboring protomer (Fig. 7).
Another region of potential protomer:protomer contact involves the sequence-variable V1/V2 and V3 loops, which were truncated in the core gp120 structure. In the V3 loop modeling it was clear that 17b acts as a barrier between the V1/V2 stem and the V3 loop (Fig. 6a). Unless 17b binding substantially alters the positioning of the V1/V2 and/or V3 loop, this suggests that these loops probably do not contact each other within a protomer. Data from experiments in which revertants were obtained subsequent to mutation of the V3 loop, however, show that changes in the V1/V2 loop can rescue changes in the V3 loop (46). Moreover, some neutralizing antibodies in simian-human immunodeficiency virus-infected monkeys apparently recognize a discontinuous V2-V3 determinant (15), and a functional interaction of the V2 and V3 loops is able to determine coreceptor choice and neutralization resistance (50). These data suggest that, in the context of the oligomer, the V1/V2 loop and the V3 loop are near one another. In this regard, it may be relevant that the model derived here for the trimer displayed the base of the V3 loop juxtaposed to the V1/V2 stem from the neighboring protomer.
Nonspecific electrostatic interactions of the basic cell-facing surface may affect entry of virus into a cell in several ways. With respect to initial attachment, these interactions may play a role in binding of virus to polyanions such as heparan sulfate. This conjecture agrees with biochemical results which show that heparan sulfate influences the binding of HIV virions to some cells (36). In addition, the model derived here allows us to analyze the electrostatic properties of a wide variety of HIV strains. In a companion study (39), we found that the observed binding of polyanions to different isolates of gp120 as well as to mutants of gp120 with and without the V3 loop correlates extremely well with model-based electrostatic predictions.
Other acidic surfaces may interact with the basic cell-facing region described here. For example, negatively charged lipid head groups may create an acidic zone in the target membrane itself. The degree of membrane acidity is dependent on the composition of the lipid head groups, which may differ in different cells as well as locally, from patch to patch, on the surface of an individual cell. The target membrane compositional variation may explain the ability of heparinase to affect initial binding in some cell types and not in others (36, 43).
Numerous studies have documented the binding of membrane-associated proteins through both a hydrophobic interaction and an electrostatic attraction to membrane head groups (34, 40). A theoretical and experimental analysis of the binding of the polylysine peptides (Lys)3, (Lys)5, and (Lys)7 to a 33% acidic membrane (a 2:1 mixture of phosphatidylcholine and phosphatidylglycerol) showed that each charge enhances the binding of the polymer to the membrane by roughly 1 order of magnitude (3). While it is difficult to correlate the polylysine results with the V3 loop/core basicity determined here, that analysis does suggest that if the cell-facing surface of oligomeric gp120 is highly basic, direct interactions with the membrane will be enhanced. Moreover, our results suggest that the wide variation in overall charge on the V3 loop will be a primary determinant of the magnitude of this electrostatic interaction. (We note, however, that the relatively poor correlations between potential and charge at shorter distances suggest that the precise structure of the V3 loop will influence short-range interactions.)
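To give a sense of scale for the polylysine result, the rule of thumb of roughly 1 order of magnitude per charge can be turned into a fold difference in binding and an equivalent free energy per charge. This is only an illustrative sketch: the factor of 10 per charge is the approximate value quoted above, the temperature of 298.15 K is an assumption, and the function names are hypothetical. It is not a model of gp120 binding.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def fold_enhancement(extra_charges, per_charge_factor=10.0):
    """Fold increase in membrane binding for a number of added charges.

    Assumes each charge multiplies binding by `per_charge_factor`
    (about 10, the rough value reported for polylysine peptides).
    """
    return per_charge_factor ** extra_charges


def free_energy_per_charge(temperature_k=298.15, per_charge_factor=10.0):
    """Binding free energy change per charge, in kcal/mol (negative = favorable)."""
    return -R_KCAL * temperature_k * math.log(per_charge_factor)


# (Lys)7 carries four more lysine side chains than (Lys)3.
print(fold_enhancement(7 - 3))
print(round(free_energy_per_charge(), 2))
```

Under these assumptions, the four additional lysines of (Lys)7 relative to (Lys)3 correspond to roughly a 10,000-fold difference in binding, or about 1.4 kcal/mol per charge, which illustrates why the wide variation in V3 loop charge could have a large effect on the magnitude of such interactions.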
After gp120 is bound to CD4, nonspecific electrostatics may play a role in virus entry and chemokine receptor binding. CD4 itself is quite basic, with a net overall charge for the first two domains of +5, which should serve to increase the overall basicity of the CD4-gp120 complex. In addition to the membrane head group interactions detailed above, nonspecific electrostatics may affect binding to the sulfonated, acidic N terminus of the chemokine receptor (particularly CXCR4) (14, 17, 18). Relevant to this, mutagenesis of the chemokine receptor binding site on gp120 showed that substitutions of negatively charged amino acids for positively charged ones often have substantial effects on binding (45).
Our results suggest that the biological effects of polyanions are primarily the consequence of nonspecific electrostatic interactions. If so, this may explain the dramatic differences in in vitro and in vivo effectiveness of such polyanions as dextran sulfate. Since the primary interaction appears to be dependent on charge density rather than a particular structural motif (57), it seems unlikely that the specificity essential for effective therapeutic treatment is present. This does not bode well for drug development based on polyanions such as sulfated polysaccharides (2, 24, 35), carboxylated albumins (26, 51), porphyrins (13, 49), or DNA or RNA derivatives (31). Nevertheless, the unusual basicity observed for the CXCR4-using viruses may, in this particular case, permit anion-based intervention.
Because generation of broadly neutralizing antibodies depends on detection of conserved sequences and nonspecific electrostatic interactions can be achieved through variable regions like the V3 loop, the nonspecific binding predicted here, which does not appear to be sensitive to the precise structure of the V3 loop, may serve as a mechanism of gp120 immune system evasion. Additionally, the structures of several neutralizing antibodies complexed with their V3 loop epitopes revealed that the V3 loop exists in at least two quite different conformations (48). This suggests that the V3 loop may adopt different structures, adding yet another layer of difficulty to its recognition by the immune system.
Initial nonspecific electrostatic attachment may help to conceal the conserved chemokine receptor site, which is on the cell-facing surface. Studies have shown that polyanions compete effectively with antibodies for binding to the V3 loop (23, 39). The chemokine receptor binding site, which in any case may only be exposed after a conformational change is induced by CD4 (32, 53), may thus be multiply shielded from immune system surveillance.
Last, we note that although electrostatic potentials are calculated from the exact positions of the atoms in the atomic structure, the effect of electrostatics on biological processes such as initial viral attachment and virus-cell membrane fusion depends on much-lower-resolution parameters. Since the modeling criteria used here are not well constrained with respect to protomer translations, the relative insensitivity of the electrostatics to this parameter bodes well for the analysis. Indeed, the basic surface defined here appears to be robust to relatively large changes in modeling parameters, both for the core (Fig. 4) and for the V3 loop (Fig. 6). This robustness, coupled with the low resolution of the effects, increases our confidence in the biological significance of our conclusions.
We thank An-Suei Yang for assistance with the program PrISM and homology modeling, Barry Honig and Ben Hitz for help with the programs DelPhi and GRASP, Lawrence Shapiro for a thorough reading of the manuscript, and a reviewer for suggesting the use of influenza virus hemagglutinin to test the modeling procedure.
This work was supported by grants from the National Institutes of Health (AI 31783 and AI 39420); by the ANRS, INSERM, and CNRS of France; and by a Center for AIDS Research grant to the Dana-Farber Cancer Institute (AI 28691). The Dana-Farber Cancer Institute is also the recipient of a Cancer Center Grant from the National Institutes of Health (CA 06516). Columbia University is a participant in the Center for AIDS Research (AI 42848). This work was made possible by gifts from the late William McCarty-Cooper, from the G. Harold and Leila Y. Mathers Charitable Foundation, and from the Aaron Diamond Foundation as well as Douglas and Judi Krupp. R.W. was a fellow of the American Foundation for AIDS Research, and P.D.K. is a recipient of a Burroughs Wellcome Career Development award.