The identification of vaccine immunogens able to elicit broadly neutralizing antibodies (bNAbs) is a major goal in HIV vaccine research. Although it has been possible to produce recombinant envelope glycoproteins able to adsorb bNAbs from HIV-positive sera, immunization with these proteins has failed to elicit antibody responses effective against clinical isolates of HIV-1. Thus, the epitopes recognized by bNAbs are present on recombinant proteins, but they are not immunogenic. These results led us to consider the possibility that changes in the pattern of antigen processing might alter the immune response to the envelope glycoprotein to better elicit protective immunity. In these studies, we have defined protease cleavage sites on HIV gp120 recognized by three major human proteases (cathepsins L, S, and D) important for antigen processing and presentation. Remarkably, six of the eight sites identified in gp120 were highly conserved and clustered in regions of the molecule associated with receptor binding and/or the binding of neutralizing antibodies. These results suggested that HIV may have evolved to take advantage of major histocompatibility complex (MHC) class II antigen processing enzymes in order to evade or direct the antiviral immune response.
A major goal of HIV vaccine development is the development of immunogens that elicit protective antiviral antibody and cellular immune responses. However, after more than 25 years of research, vaccine immunogens able to elicit protective immunity in humans have yet to be described (11, 31). Although it has been possible to produce recombinant envelope proteins (gp120 and gp140) with many of the features of native virus proteins (e.g., complex glycosylation and the ability to bind CD4, chemokine receptors, and neutralizing antibodies), these antigens have not been able to elicit broadly neutralizing antibodies (bNAbs) or protective immune responses when used as immunogens (11, 32, 43, 50, 56, 74, 79). The fact that recombinant proteins can adsorb virus bNAbs from HIV-1-positive sera (59, 91) indicates that many recombinant envelope proteins are correctly folded but that the epitopes recognized by bNAbs are simply not immunogenic. Over the last decade, several different approaches have been employed to create immunogens able to elicit broadly neutralizing antibodies. These strategies have included efforts to duplicate and/or stabilize the oligomeric structure of HIV envelope proteins (5, 26, 87), the creation of minimal antigenic structures lacking epitopes that conceal important neutralizing sites (27, 46, 70, 89), and prime/boost strategies combining protein immunization with DNA immunization or infection with recombinant viruses in order to stimulate the endogenous synthesis and presentation of HIV immunogens (15, 29, 30, 83). However, none of these approaches has resulted in a clinically significant improvement in antiviral immunity or HIV vaccine efficacy. Efforts to elicit protective cellular immune responses (e.g., cytotoxic lymphocytes) by use of recombinant virus vaccines have likewise been disappointing (10, 61). In fact, such vaccines may have promoted HIV infection rather than inhibiting it (22, 23).
In the present study, we describe the first steps in a new approach to reengineering the immunogenicity of HIV envelope proteins in order to improve the potency and specificity of humoral and cellular immune responses. The approach is based on defining the determinants of antigen processing and presentation of HIV envelope glycoproteins. Both humoral and cellular immune responses depend on proteolytic degradation of protein antigens prior to antigen presentation, mediated by professional antigen-presenting cells (APCs) such as macrophages, dendritic cells, and B cells (97). Normally, proteins of intracellular origin are processed by the proteasome, a 14- to 17-subunit protein complex located in the cytosol. Proteins of extracellular origin are processed in lysosomes or late endosomes of APCs. The resulting peptide epitopes are then loaded into major histocompatibility complex (MHC) class I or class II molecules and presented on the surfaces of APCs to CD8 or CD4 T cells, respectively. The endosomes and lysosomes of APCs contain cathepsins, acid thiol reductase, and aspartyl endopeptidase. These enzymes perform two functions: degrading endocytosed protein antigens to liberate peptides for MHC class II binding (99) and removing the invariant chain chaperone (6, 94). Although all cathepsins can liberate epitopes from a diverse range of antigens (16), only cathepsins S and L have nonredundant roles in antigen processing in vivo (reviewed by Hsing and Rudensky). Cathepsin L is expressed in thymic cortical epithelial cells but not in B cells or dendritic cells, while cathepsin S is found in all three types of APCs. Unlike cathepsins L and S, which are cysteine proteases and active at neutral pH, cathepsin D is an aspartic protease, is active at acidic pH, and participates in proteolysis and antigen presentation in both the MHC class I and class II pathways, which present antigen to CD8 and CD4 T cells, respectively.
In considering the use of envelope proteins as potential vaccines, the route of immunization, formulation (e.g., adjuvants), protein folding, disulfide bonding, and glycosylation pattern all determine which peptides are available for MHC-restricted presentation.
Previous studies provided evidence that gp120 was sensitive to digestion by cathepsins B, D, and L, but the specific cleavage sites were not defined (18). In the present study, we (i) describe the locations of eight protease cleavage sites on HIV-1 gp120 recognized by cathepsins L, S, and D, involved in antigen processing; (ii) determine the extent to which they are conserved; and (iii) evaluate the effect of cathepsin cleavage on the binding of gp120 to CD4-IgG and neutralizing antibodies. The results obtained provide new insights into the basis of envelope immunogenicity that may prove to be useful in the development of HIV vaccine antigens.
Recombinant gp120 from the MN strain of HIV-1 (MN-rgp120) was produced in Chinese hamster ovary (CHO) cells by Genentech, Inc. (South San Francisco, CA). MN-rgp120 was a major component of the candidate HIV vaccine AIDSVAX B/B (32). This protein differs from native gp120 in that the first 12 amino acids from the mature form of gp120 were deleted and replaced with a 30-amino-acid flag epitope from the amino terminus of herpes simplex virus (HSV) glycoprotein D. Purified human cathepsins L, S, and D as well as the cathepsin L and D inhibitors N-acetyl-Leu-Leu-methional calpain inhibitor II (ALLM) and pepstatin A were obtained from Biomol (Philadelphia, PA).
The broadly neutralizing, CD4-blocking monoclonal antibody (MAb) b12 (12, 70) was obtained from Polymun Scientific (Vienna, Austria). The virus entry inhibitor CD4-IgG (13) and MAbs to the MN and IIIB strains of HIV, reactive with the V3 domain (1026), the C4 domain (13H8), and the V2 domain (6E10 and 1088), were obtained from Genentech, Inc. (South San Francisco, CA), and have been described previously (66, 67). The neutralizing MAb 447-52D (24) was obtained through the AIDS Research and Reference Reagent Program, Division of AIDS, NIAID, NIH, from Susan Zolla-Pazner. Polyclonal antibody D7324 was purchased from Aalto Bio Reagents Ltd. (Dublin, Ireland). Horseradish peroxidase (HRP)-labeled goat anti-human IgG and goat anti-mouse IgG+M were obtained from American Qualex Antibodies (San Clemente, CA).
Fifty micrograms of MN-rgp120 in 25 μl of 100 mM sodium acetate, pH 5.5, digestion buffer was mixed with 0.5 μg cathepsin L (protease-to-protein ratio, 1:100). The reaction mix was incubated at 37°C. Aliquots of 3 μl were taken at 15 min, 30 min, 60 min, 2 h, 4 h, and 7 h, and the digestion was stopped by rapid cooling in liquid nitrogen. An additional 3-μl aliquot was taken after overnight incubation at room temperature. The aliquots from cathepsin L digestion were mixed with 3 μl of 3× reducing polyacrylamide gel electrophoresis (PAGE) sample buffer (5% SDS, 5% β-mercaptoethanol, 40% glycerol, and 200 mM Tris, pH 6.8) and boiled for 2 min. The collected samples were run in two 4 to 12% Bis-Tris precast gels (Invitrogen, Carlsbad, CA). Digested fragments were visualized either by direct Coomassie blue staining or on immunoblots after electrophoresis onto a polyvinylidene difluoride (PVDF) membrane (Millipore Immobilon PSQ). For sequencing of peptides on PVDF membranes, bands were cut out and transferred to the Molecular Structure Facility at the University of California, Davis, for N-terminal protein sequencing by Edman degradation. The same experimental procedure was carried out for cathepsin S and D digestion, except for the digestion buffer, which was 50 mM sodium phosphate, pH 6.5, with 50 mM sodium chloride for cathepsin S and 100 mM sodium acetate, pH 3.3, for cathepsin D.
To prepare cathepsin L-digested MN-rgp120 for enzyme-linked immunosorbent assay (ELISA), 25 μg of MN-rgp120 in 50 μl of 100 mM sodium acetate, pH 5.5, digestion buffer was mixed with 1 μg cathepsin L (protease-to-protein ratio, 1:25) at 37°C for overnight incubation, followed by 1 μl ALLM (25 mg/ml in dimethyl sulfoxide [DMSO]) solution to stop the digestion reaction. For cathepsin D-treated MN-rgp120, 25 μg of MN-rgp120 was mixed with 1 μg of cathepsin D in 50 μl buffer (100 mM sodium acetate, pH 3.3) at 37°C for 1 h. Pepstatin A (1 μl at 25 mg/ml in DMSO) solution was added to stop cathepsin D activity.
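The protease-to-protein ratios quoted for the two digestion setups follow directly from the stated masses. The short sketch below reproduces that arithmetic; the function name is illustrative and not part of the original protocol.

```python
def protease_to_protein_ratio(protease_ug, protein_ug):
    """Express a mass ratio as 'protease:protein', normalized so protease = 1."""
    parts_protein = protein_ug / protease_ug
    return f"1:{parts_protein:g}"

# Time course digestion: 0.5 ug cathepsin L added to 50 ug MN-rgp120
time_course_ratio = protease_to_protein_ratio(0.5, 50)

# ELISA preparation: 1 ug cathepsin L (or D) added to 25 ug MN-rgp120
elisa_ratio = protease_to_protein_ratio(1, 25)

print(time_course_ratio)  # 1:100
print(elisa_ratio)        # 1:25
```

The ELISA preparation therefore uses four times more enzyme per unit of substrate than the time course experiment, consistent with the aim of driving the digestion to completion.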
Wells of microtiter plates (Immulon II; Thermo-Fisher Scientific) were coated with 100 μl of polyclonal antibody D7324 solution (2 μg/ml in phosphate-buffered saline [PBS]) overnight at 4°C. The wells were blocked with 200 μl of blocking buffer (1% bovine serum albumin [BSA] in PBS) and incubated at 37°C for 1 h. After rinsing of the plates with washing buffer (0.05% Tween 20 in PBS), 100 μl of cathepsin L-treated, cathepsin D-treated, or undigested MN-rgp120 solution was added to each well (2 μg/ml in blocking buffer) and incubated for 1 h at 37°C. After washing of the plates, MAbs and CD4-IgG were added and fivefold serial dilutions were carried out, starting with 25 μg/ml of MAb b12 and 5 μg/ml of all other MAbs and CD4-IgG, and incubated for 1 h at 37°C. After washing of the plates three times with washing buffer, 100 μl of HRP-labeled goat anti-human IgG or goat anti-mouse IgG+M solution (diluted 1:10,000 in blocking buffer) was added and incubated for 1 h at 37°C. Finally, after plate washing, 100 μl of 0.4-mg/ml o-phenylenediamine dihydrochloride (Sigma-Aldrich Chemicals, St. Louis, MO) solution was added and incubated at room temperature for 10 min, followed by 100 μl of 3 M sulfuric acid to stop the reaction. The absorbance was measured by a SpectraMax 190 plate reader (Molecular Devices) at 490 nm.
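The fivefold serial dilution scheme can be written out explicitly. The sketch below lists the concentrations in each series; the number of dilution steps is not stated in the text, so the value of eight used here is an assumption for illustration only.

```python
def serial_dilution(start_ug_per_ml, fold=5, steps=8):
    """Concentrations (ug/ml) for a serial dilution, starting with the undiluted well.

    steps=8 is a hypothetical plate layout, not a value taken from the protocol.
    """
    return [start_ug_per_ml / fold**i for i in range(steps)]

b12_series = serial_dilution(25.0)    # MAb b12 starts at 25 ug/ml
other_series = serial_dilution(5.0)   # all other MAbs and CD4-IgG start at 5 ug/ml

for label, series in (("b12", b12_series), ("others", other_series)):
    print(label, [f"{c:.4g}" for c in series])
```

Because b12 starts exactly one fivefold step above the other reagents, the second well of the b12 series matches the first well of every other series.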
Envelope glycoprotein sequences were obtained from various sources (described below) and aligned using MAFFT (52). The sequence for gp120 from the MN strain of HIV-1 used in these studies, MNGNE, differs from the sequence reported by Gurgo et al. (41) and has been published previously (71). To determine the locations of predicted cathepsin cleavage sites in MN-rgp120, we used the PoPs program developed by Boyd et al. (8), cleavage specificity algorithms for cathepsins L, S, and B generated by Choe et al. (19), and the cathepsin D recognition sequence of Scarborough et al. (85).
Three data sets were used to investigate the sequence conservation of cathepsin cleavage sites. The VAX004 data set (32) was obtained from the Global Solutions for Infectious Diseases (GSID) HIV data browser (http://www.gsid.org/), which includes 1,047 clade B envelope glycoprotein sequences from 349 individuals with recent HIV infections. A data set of acute and recent clade B infections containing 2,908 envelope glycoprotein sequences from 102 infected individuals was obtained from the studies of Keele et al. (53). Finally, a listing of clade-specific reference sequences, as well as a data set containing 1,766 envelope glycoprotein sequences from isolates collected worldwide at various undefined times after HIV infection, was obtained from the Los Alamos HIV Sequence Database (http://www.hiv.lanl.gov/). The sequences from all three databases were aligned using MAFFT (52).
Cathepsins L, S, and D are known to play an important role in antigen processing and presentation (28, 44, 45, 94). In initial studies, we used computational methods (see Materials and Methods) to determine whether gp120 was likely to possess cleavage sites recognized by cathepsins known to be important for antigen processing. For these studies, we examined sequences with the prediction algorithm (8) set for maximum stringency. The results of these studies (see Table S1 in the supplemental material) suggested that MN-rgp120 was likely to possess multiple cathepsin cleavage sites. However, because cathepsin cleavage sites are difficult to predict, we reasoned that actual protease digestion studies would be required to reliably identify the number and locations of these sites.
Initially, we examined the sensitivity of MN-rgp120 to digestion by cathepsin L. A time course experiment is shown in Fig. 1A. We found that cathepsin L digestion resulted in six proteolytic fragments. Because of their size, these fragments could not be analyzed by mass spectrometry but, rather, required analysis by Edman sequence degradation. The sizes, locations, and experimentally determined N-terminal sequences of the peptides isolated are shown in Table 1. The scissile peptide bond represents that joining the P1 and P1′ residues, according to the nomenclature of Schechter and Berger (86). A listing of flanking residues likely to include the enzyme recognition residues and ranging from P4 to P4′ is provided in Table S2 in the supplemental material. We found that digestion with cathepsin L resulted in a 70-kDa fragment and a 50-kDa fragment that appeared within 15 min of treatment. Edman degradation showed that the first five amino acids in the N terminus of the 50-kDa fragment are GTIRQ, which revealed that the cleavage site is located between the K327 and G328 residues in the V3 domain. The N terminus of the 70-kDa fragment is derived from cleavage of the A3-L4 bond in the glycoprotein D flag epitope at the N terminus of MN-rgp120, resulting in the L4-A5-D6 N-terminal sequence. The kinetics of the appearance of these two fragments indicate that MN-rgp120 was first attacked at the V3 domain cleavage site, K327-G328, resulting in the 70-kDa fragment and the 50-kDa fragment. The 50-kDa fragment was subsequently degraded with longer digestion times to yield additional fragments (Fig. 1A and 2). Edman degradation confirmed that the resulting 45-kDa and 35-kDa fragments were originally from the 50-kDa fragment, because both included the same N-terminal sequence, GTIRQ, as that of the 50-kDa fragment.
Although the C-terminal sequences of the 50-kDa, 45-kDa, and 35-kDa fragments are not known, at least two cathepsin L cleavage sites in the 50-kDa fragment are indicated, which result in the 45-kDa and 35-kDa fragments. The N-terminal amino acid sequences of the resulting 20-kDa fragment and 14-kDa fragment prove that there are two cathepsin cleavage sites within the C4 domain. The first four amino acids of the N termini of the 20-kDa fragment and the 14-kDa fragment are KAMY and APPI, respectively. Thus, two cathepsin L cleavage sites were identified, located at the G431-K432 and Y435-A436 bonds. However, because the molecular mass difference between the 20-kDa and 14-kDa fragments is about 6 kDa, while the N-terminal sequence difference between the 20-kDa and 14-kDa fragments is only four amino acids, we deduced that another cathepsin L cleavage site must be present between A436 and the C terminus of MN-rgp120.
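The mass argument for a further cleavage site downstream of A436 can be checked with a rough calculation. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not taken from this study, and it ignores glycosylation and anomalous SDS-PAGE migration.

```python
AVG_RESIDUE_MASS_DA = 110.0  # assumption: rule-of-thumb average, not a measured value

def peptide_mass_kda(n_residues, avg_da=AVG_RESIDUE_MASS_DA):
    """Approximate mass (kDa) of an unmodified peptide of n_residues."""
    return n_residues * avg_da / 1000.0

# Apparent sizes of the two C4-derived fragments on the gel
observed_gap_kda = 20.0 - 14.0

# N termini: K432 (KAMY...) for the 20-kDa band, A436 (APPI...) for the 14-kDa band
n_terminal_offset = 436 - 432

explained_kda = peptide_mass_kda(n_terminal_offset)
unexplained_kda = observed_gap_kda - explained_kda

print(f"Explained by N-terminal offset: {explained_kda:.2f} kDa")
print(f"Left unexplained: {unexplained_kda:.2f} kDa")
```

Four residues account for well under 1 kDa of the roughly 6-kDa difference, so most of the difference must arise elsewhere in the fragment, in line with the deduction that the 14-kDa fragment is also truncated by cleavage between A436 and the C terminus.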
We next examined the ability of cathepsin S to digest gp120, using the same methods. The results of a time course experiment are shown in Fig. 1B. Six degradation products were visible on SDS-PAGE gels. The sizes of the peptides isolated, the N-terminal sequences, and the locations within gp120 are shown in Table 1. Compared to cathepsin L digestion, the kinetics of cathepsin S digestion were much more rapid and indicated significantly increased sensitivity to cathepsin S. Six major digestion fragments appeared on the SDS-PAGE gel within 15 min of cathepsin S digestion, indicating greater exposure or accessibility of cathepsin S cleavage sites than of cathepsin L cleavage sites. Because of the rapid digestion by cathepsin S, it was not possible to determine whether there was a kinetically distinct, ordered degradation of gp120, as seems to be the case with cathepsin L. Rather, cathepsin S appears to follow a different digestion pathway, where the protease generates multiple fragments in a very short time frame. Analysis of six cathepsin S digestion fragments (60 kDa, 50 kDa, 38 kDa, 26 kDa, 18 kDa, and 12 kDa) identified four distinct cathepsin S cleavage sites (Fig. 2). The 26-kDa fragment was derived from cathepsin S. Two of the cleavage sites were located in the C2 domain, at Q208-A209 (60-kDa fragment) and S261-T262 (50-kDa fragment). The third cathepsin S cleavage site occurred in the V3 domain and involved the bond joining T322 and T323 (38-kDa fragment). Finally, an additional cleavage site was located in the C4 domain and occurred at Y435-A436 (18-kDa and 12-kDa fragments), which is also a cathepsin L cleavage site. Fragments located N-terminal to the C2 domain were not recovered, suggesting that this region of the molecule contains multiple, as-yet-undefined cathepsin S cleavage sites.
It is possible that some of these yield 3.5-kDa fragments, since the final 3.5-kDa band on the PAGE gels appeared to be heterogeneous, with multiple fragments migrating at the same position.
A complicated digestion pattern was observed in the digestion of MN-rgp120 with cathepsin D (Fig. 1C). Eleven digestion fragments were visualized on the SDS-PAGE gel, but only eight fragments were able to be characterized by Edman degradation, due to heterogeneity in bands and/or their low abundance. Four fragments (55 kDa, 52 kDa, 30 kDa, and 12 kDa) share the N-terminal sequence VVIRS, which is located in the C2 domain, suggesting the cleavage site E274-V275. Based on differences in molecular masses, we deduced that additional cathepsin D cleavage sites occur in the V3, C3, V4, and C4 domains (Fig. 2). N-terminal sequencing of the 20-kDa and 70-kDa fragments indicated another cathepsin D cleavage site, in the V2 domain, at the bond between residues L181 and Y182. The 4-kDa and 5-kDa fragments were from the N terminus and shared the KYAL sequence derived from the HSV-1 gD flag epitope.
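The way an N-terminal sequence defines a scissile bond in Schechter and Berger notation can be illustrated with the E274-V275 site. In the sketch below, only the residues E (P1) and VVIRS (the sequenced N terminus) come from the text; the upstream residues are not given here, so they are shown as the placeholder X, and the function name is hypothetical.

```python
def schechter_berger_window(seq, p1_index, flank=4):
    """Label residues around the scissile bond between seq[p1_index] and seq[p1_index + 1].

    P1, P2, ... run N-terminal from the bond; P1', P2', ... run C-terminal.
    """
    window = {}
    for k in range(1, flank + 1):
        n_pos = p1_index - (k - 1)   # position of Pk
        c_pos = p1_index + k         # position of Pk'
        if n_pos >= 0:
            window[f"P{k}"] = seq[n_pos]
        if c_pos < len(seq):
            window[f"P{k}'"] = seq[c_pos]
    return window

# 'X' marks upstream residues not stated in the text (placeholders, not real data)
local_seq = "XXXEVVIRS"
site = schechter_berger_window(local_seq, p1_index=3)

print(site["P1"], "|", site["P1'"])  # E | V
print(site)
```

The N-terminal residue obtained by Edman degradation is always P1′, so each sequenced fragment fixes the position of one scissile bond even when the C terminus of the fragment is unknown.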
The locations of cathepsin cleavage sites relative to the disulfide bonds and conserved (C) and variable (V) regions of gp120 were mapped onto the two-dimensional structure of Leonard et al. (58) (Fig. 3) and the three-dimensional structure of Huang et al. (48) (Fig. 4). Also shown in Fig. 3 are the locations of amino acids reported to be involved in receptor binding and the binding of neutralizing antibodies. In total, nine cathepsin cleavage sites were identified, with one in the N-terminal flag sequence and eight in gp120. One cleavage site was found in the V2 domain, three in the C2 domain, two in the V3 domain, and two in the C4 domain.
It was clear from these studies that the cathepsin cleavage sites are not randomly distributed. Remarkably, they appeared to cluster in regions of functional significance, often in close proximity to the binding sites for the CD4 and chemokine coreceptors and/or epitopes recognized by neutralizing antibodies (Table 1). For example, the cathepsin S cleavage sites in the C2 (Q208-A209) and C4 (Y435-A436) domains and the cathepsin L sites at G431-K432 and Y435-A436 are located in close proximity in the three-dimensional structure of gp120. The K432 residue and the Y435 residue are reported to be contact residues for chemokine receptor binding, and the Q208-A209 cleavage site is three amino acids away from K212, which is also thought to be a chemokine receptor contact residue (25, 78, 79). Additionally, the G431 residue is located within a string of eight amino acids (425 to 432) known to be contact residues for CD4 binding (55). G431 and K432, along with V429, are known to be contacts for CD4, chemokine receptors, and the broadly neutralizing b12 MAb (105). Two additional cathepsin cleavage sites occur in the C2 domain. Of these, position T262 in the S261-T262 cathepsin S cleavage site is known to be a contact residue for the broadly neutralizing b12 MAb (105), whereas the cathepsin D cleavage site (E274-V275) was the only cathepsin cleavage site that was not part of or adjacent to a receptor or neutralizing antibody binding site. Two cathepsin cleavage sites were identified in the V3 domain. The V3 domain is thought to be an important determinant of chemokine receptor usage (21, 101) and is known to possess epitopes recognized by a variety of neutralizing antibodies. A cathepsin S site (T322-T323) is located one amino acid away from the crown of the V3 loop containing the GPGRAF sequence, important for the binding of multiple neutralizing antibodies (24, 66, 81).
The cathepsin L cleavage site at K327-G328 is four amino acids from the cathepsin S site between the stem and the crown of the V3 loop. Finally, a single cathepsin D site involving residues L181 and Y182 is located in the V2 domain. The V2 domain is known to possess multiple epitopes for neutralizing antibodies (62, 72) and contains the newly described receptor binding site for the α4β7 integrin (2). Interestingly, the L181-Y182 cleavage site is located one amino acid away from the LDI/LDV recognition sequence required for α4β7 binding to gp120.
An important question in these studies was to determine whether any of the cathepsin protease sites were conserved. In view of the high degree of sequence variation within HIV and the fact that the envelope protein is the most variable of all of the HIV proteins, it was uncertain whether any of the sites would be conserved. In initial studies, we aligned the MN-rgp120 and HXB2 gp120 sequences with 12 reference sequences: 2 from each of four major group clades, A, C, D, and E (crf A/E), plus 2 from the chimpanzee isolate HIVcpz and 2 simian immunodeficiency virus (SIV) sequences (SIVMac251 and SIVMac239). The results of this analysis are shown in Fig. S1 in the supplemental material, where both the locations and conservation of the sites recognized by cathepsins L, S, and D can be seen along with the locations of predicted cathepsin cleavage sites. This analysis of the residues occurring at the P1 and P1′ positions showed a high level of conservation at six of the eight cathepsin cleavage sites. Remarkably, two sites, including one cathepsin S site, S261-T262, in the C2 domain, and one cathepsin L site, G431-K432, in the C4 domain, were conserved in the reference strains of the major group HIV clades, the HIVcpz strains, and the SIV strains. A high level of conservation (~98%) was also noted at the Q208-A209 cleavage site in the C2 domain and the Y435-A436 site in the C4 domain. A somewhat lower (81 to 92%) level of conservation was also noted at the L181-Y182 site in the V2 domain; however, in this case, the MN strain is unusual in that F is replaced with L at position 181. Comparison of sequence alignments showed that in most cases the cathepsin L site I327-G328 is conserved; however, MN-rgp120 is unusual in this respect, with K rather than I at position 327. The highly conserved nature of these sites suggests that they are important for virus function or survival and thus have been preserved by positive selection across species and time.
To further explore the conservation of cathepsin cleavage sites, we examined three independent HIV sequence data sets. One data set (GSID HIV Sequence Database) included 1,047 gp120 sequences from 349 individuals with new and recent HIV infections (less than 6 months postinfection) from different cities throughout North America (32). A second data set was obtained from the studies of Keele et al. (53) and consisted of 2,908 sequences from 102 new and acute infections collected in the United States. The third HIV data set examined was the Los Alamos HIV Sequence Database, comprising 1,766 gp120 sequences collected from worldwide isolates that include sequences from the 1980s through the present, mostly from chronic HIV infections. The results of this analysis are presented in Table 2. We found very high levels of conservation (i.e., >96%) for the Q208-A209 and S261-T262 cathepsin cleavage sites, located in the C2 domain, and for the G431-K432 and Y435-A436 cleavage sites, located in the C4 domain of gp120. In the case of the 431-432 cleavage site, a significant discrepancy was noted between the Los Alamos data set and the VAX004 and Keele data sets. Further analysis indicated that this result could be attributed to clade-specific polymorphism, as clade B viruses typically possessed K at position 432, while other clades typically possessed R or Q at this position. The L181-Y182 cleavage site in the V2 domain was less conserved (i.e., >80%); however, as noted earlier, the sequence of HIVMN was unusual in that F was replaced with L at position 181. This replacement probably preserved the cleavage site, since leucine or aromatic residues are known to be favored by cathepsin D (96).
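The conservation figures above amount to counting, across an alignment, how often the P1 and P1′ residues of a site match those found in MN-rgp120. A minimal sketch of that calculation follows. The toy alignment, the column index, and the function name are all hypothetical; the sketch also assumes that P1 and P1′ occupy adjacent alignment columns, which need not hold where a real MAFFT alignment introduces gaps.

```python
def site_conservation(aligned_seqs, p1_col, p1, p1_prime):
    """Percentage of aligned sequences carrying the given residues at P1 and P1'.

    p1_col is the 0-based alignment column of P1; P1' is taken as the next column.
    """
    matches = sum(
        1
        for seq in aligned_seqs
        if seq[p1_col] == p1 and seq[p1_col + 1] == p1_prime
    )
    return 100.0 * matches / len(aligned_seqs)

# Hypothetical four-sequence alignment around a G-K site (not real HIV data).
# The third sequence carries R at P1', mimicking a clade-specific polymorphism.
toy_alignment = [
    "IGKAMY",
    "IGKAMY",
    "IGRAMY",
    "IGKAMY",
]

pct = site_conservation(toy_alignment, p1_col=1, p1="G", p1_prime="K")
print(f"G-K conserved in {pct:.0f}% of sequences")  # 75%
```

Scoring both residues together, as done here, means that a substitution at either position counts as loss of the site. A conservative substitution that preserves cleavage, such as F for L at a cathepsin D site, would need a residue-class comparison rather than an exact match.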
Based on the locations of cathepsin cleavage sites at or near receptor binding sites and epitopes recognized by neutralizing antibodies, it was of interest to determine whether cathepsin cleavage actually affected the binding of antibodies to these sites. The binding of monoclonal antibodies to cathepsin-treated and untreated MN-rgp120 was investigated by ELISA (Fig. 5). One concern in performing this assay was the possibility that enzyme cleavage would release small peptide fragments that would not be captured on the microtiter plate. Examination of the protease cleavage sites in relation to the disulfide structure showed that proteolysis of the peptide backbone would not necessarily release multiple peptide fragments, since most would remain associated by virtue of disulfide bonds. Thus, treatment with cathepsin L should release only a small, 4-amino-acid peptide, K432 to Y435, from the C4 domain. Treatment with cathepsin D might split the molecule into two large fragments by virtue of the cleavage site located at position 274 in the C2 domain and might also result in the release of an undefined, 4- to 5-kDa fragment from the C1 domain. Treatment with cathepsin S should have the largest effect and should result in the loss of the C1, V1, V2, and C2 domains, rendering the assays difficult. For this reason, we studied antibody binding to only cathepsin L- and cathepsin D-treated molecules. The panel of MAbs used for this study included the 13H8 and 6E10 antibodies, which recognize the C4 and V2 domains, respectively, and the 1026 and 1088 antibodies, which recognize the V3 and V2 domains, respectively (66, 67). In addition, we included the V3 domain-specific MAb 447-52D (24), the broadly neutralizing, CD4-blocking MAb b12 (12, 70), and CD4-IgG (13). Of these, 6E10, b12, and CD4-IgG are known to bind to conformation-dependent sites involving sequences from several regions of the molecule.
Using a standard ELISA, we compared antibody binding to cathepsin L-treated and untreated MN-rgp120. The digestions ran to completion, as judged by the absence of intact gp120 when products were resolved by Coomassie blue-stained PAGE. We found that cathepsin L digestion of gp120 destroyed the ability to bind both the V3 domain-specific, virus-neutralizing 1026 and 447-52D MAbs and the C4 domain-specific, CD4-blocking 13H8 MAb. However, most of the binding of b12 and CD4-IgG was preserved, as well as much of the binding to the V2-specific antibodies 6E10 and 1088, although there appeared to be a significant reduction in binding affinity. This result can be explained by the fact that the two C4 sites, G431-K432 and Y435-A436, and one V3 site, K327-G328, are located in close proximity to the epitopes recognized by the 13H8, 1026, and 447-52D MAbs, while the V2 domain is remote from the cathepsin L site. A different pattern of binding was observed with cathepsin D-treated gp120. In these experiments, binding to 13H8, 447-52D, and 1026 was preserved, although there appeared to be some reduction in binding affinity of the 1026 MAb. However, there was a large reduction in binding of the b12 and 6E10 MAbs and CD4-IgG. The inability of cathepsin D treatment to inhibit the binding of the 13H8 MAb can be attributed to the fact that the cathepsin D cleavage sites are located in the V2 and C2 domains, remote from the conformation-independent 13H8 epitope in the C4 domain. The large decrease in binding affinity of the b12 MAb and CD4-IgG for cathepsin D-treated gp120 might be explained by the fact that sequences in the C2 domain are known to be important for maintaining the structure of the CD4 binding site and that binding of the b12 MAb is dependent on contact sites in this region (55, 105). 
The greater sensitivity of 6E10 binding to cathepsin D treatment, compared to that of 1088, may reflect the fact that 6E10 recognizes a conformation-dependent epitope that requires sequences in the C2, V3, and C3 domains, whereas 1088 recognizes a conformation-independent epitope (G. R. Nakamura et al., unpublished results). Preservation of significant binding to 1088 in cathepsin D-treated gp120 suggested that despite cleavage between residues 274 and 275 in the C2 domain, the N-terminal and C-terminal portions of the envelope protein remain bound together in a noncovalent complex. Together, these results demonstrate that cathepsin cleavage sites are located in regions of gp120 recognized by neutralizing MAbs and CD4-IgG and that cleavage by cathepsins L and D differentially alters antibody and CD4 binding to these sites.
In these studies, we identified the locations of cleavage sites on MN-rgp120 recognized by three proteases (cathepsins L, S, and D) thought to be important in antigen processing and presentation. We found that these sites were not randomly distributed but, rather, occurred in regions of the envelope glycoprotein known to possess receptor binding sites and epitopes recognized by neutralizing antibodies. Comparative sequence analysis showed that the majority of these sites were highly conserved in the major clades of HIV, with some being conserved in both the chimpanzee form of HIV and SIV. Finally, we showed that cleavage by cathepsins L and D diminished the binding of neutralizing antibodies and CD4-IgG. We found that none of the experimentally determined cathepsin cleavage sites matched the cathepsin cleavage sites predicted by enzyme cleavage site prediction programs (see Tables S1 and S2 and Fig. S1 in the supplemental material), thus emphasizing the need for experimental studies. To some extent, the ability to predict cathepsin cleavage sites has been limited by the availability of experimental data, as evidenced by the paucity of cathepsin cleavage data in the MEROPS Peptidase Database (76). Moreover, there is uncertainty as to the extent to which cathepsin recognition sequences extend upstream and downstream of the scissile bond. The listing of N-terminal and C-terminal flanking sequences for the sites defined in this study (see Tables S1 and S2 in the supplemental material) will contribute to our knowledge of cathepsin recognition motifs.
Remarkably, seven of the eight cathepsin cleavage sites in gp120 identified in this study were located in regions of the envelope protein known to be associated with receptor binding or the binding of neutralizing monoclonal antibodies. For example, the V2 domain is known to contain epitopes recognized by virus neutralizing antibodies (36, 39, 62, 100) and has been termed the global regulator of virus neutralization (72). Moreover, the L181-Y182 cathepsin D cleavage site is located just one amino acid away from the α4β7 receptor binding site (LDI/V) recently reported by Arthos et al. (2). The V3 domain is known as the principal neutralizing determinant, contains epitopes recognized by a variety of neutralizing antibodies (24, 66, 81), and is a key determinant of chemokine receptor tropism (14, 21, 101). Sequences in the C2 domain have been reported to be important for CD4 binding (55, 105), chemokine receptor binding (78, 79), and the binding of the broadly neutralizing b12 MAb (105). It is remarkable that one of the cathepsin S sites identified in the C2 domain is located adjacent to a chemokine receptor contact residue and that another is located at b12 MAb contact residues. The C4 domain is known to possess multiple contact residues for CD4 binding (55, 105), chemokine receptor binding (78, 79), and the binding of CD4-blocking neutralizing antibodies (56, 66, 67, 70). The importance of the CD4 binding site in antigen processing was noted previously (18, 95), where antibodies to the CD4 binding site inhibited cleavage by antigen processing enzymes and subsequent MHC class II antigen presentation. It is difficult to understand how this remarkable correspondence between receptor and neutralizing antibody binding sites and cathepsin cleavage sites could occur by chance. 
This correspondence is particularly significant given that several regions of gp120 (the C1, V1, C3, V4, and V5 domains) appear to be devoid of cathepsin cleavage sites, indicating that cleavage sites are not randomly distributed.
The functional importance of the cathepsin cleavage sites identified above was further supported by the observation that six of the eight cathepsin cleavage sites in gp120 were highly conserved, with one (G431-K432 in the C4 domain) found in HIV, HIVcpz, and SIV and another (S261-T262) conserved in HIV and SIV. Previous studies have suggested that many viruses, including HIV, have evolved mechanisms to alter antigen processing to their advantage, as a way to escape or direct the immune response (103). Most of these mechanisms affect MHC class I-restricted cellular immune responses; however, mechanisms that alter MHC class II antigen presentation have also been reported (54).
HIV has developed a variety of mechanisms to evade the immune response. The hallmark of HIV infections is destruction of CD4 helper T cells, required to initiate and sustain effective antiviral immune responses (40) and to control virus replication (73, 80). Another mechanism to evade the immune response is the high level of sequence variation seen in all HIV proteins, particularly in the envelope protein, which typically exhibits multiple insertions and deletions. The virus also appears to have evolved epitope concealment mechanisms, especially in the unbound trimeric form of the envelope protein (prior to receptor ligation), that restrict access to antibody binding at neutralizing sites in the V3 domain, CD4 binding site, and membrane-proximal external region (MPER) (7, 34, 84, 90). Finally, the many N-linked glycosylation sites on gp120 are thought to form a protective “glycan shield” that provides yet another level of protection from the binding of neutralizing antibodies (98).
Our results suggest that HIV may have evolved another mechanism of immune escape involving insertion of protease cleavage sites in regions important for receptor binding and the binding of neutralizing antibodies. Cleavage at these sites may direct or modulate the immune response in a way that benefits survival of the virus. Because of the extraordinarily high level of sequence variation in HIV-1, resulting from high mutation and replication rates as well as immune selection, it is unlikely that these cleavage sites could be preserved unless they provided a significant fitness advantage for the virus. Recent studies by Tenzer et al. (92) suggested that the immunodominance of cytotoxic T-lymphocyte (CTL) epitopes is determined by proteasome digestion profiles and trimming by endoplasmic reticulum aminopeptidases. They further showed that CTL escape mutations involved amino acid substitutions that affected proteasome cleavages directly or sequences flanking cleavage sites in p17 and p24. The results from the present studies are consistent with the possibility that HIV might similarly regulate the immunodominance of MHC class II-restricted immune responses by tightly controlling proteolysis by antigen processing enzymes. The observation that the antigen processing sites are highly conserved is itself remarkable and consistent with this hypothesis. The additional observation that these sites are located in regions associated with receptor binding and neutralizing antibody binding is noteworthy and suggestive of important functional significance.
One potential explanation for the conservation of cathepsin cleavage sites at receptor binding sites is that the receptor binding sites are among the few sites on virion-associated envelope proteins that are not protected by the glycan shield, and thus, they may be the only sites on the trimeric envelope protein accessible to proteases. However, it is unlikely that this can explain the data, since gp120 is readily shed from viruses and monomeric gp120 has multiple exposed regions that are not glycosylated. An alternative explanation may relate to an immune escape mechanism first described for poliovirus. Studies with poliovirus type 3 have shown that a major neutralizing epitope (antigenic site 1) contains a protease cleavage site and that cleavage at this site prevents the binding of neutralizing antibodies to an immunodominant epitope (63). It was suggested that this protease site may have evolved as a means by which the virus could escape from neutralizing antibodies directed to this site. The incorporation of protease cleavage sites at neutralizing epitopes, in effect, causes neutralizing epitopes to “self-destruct” after coming into contact with serum or cellular proteases. The effect of such cleavage could be to prevent either formation of neutralizing antibodies or binding of existing neutralizing antibodies to these sites. The possibility that key epitopes on the HIV envelope protein are labile and subject to destruction by extracellular proteases before they can stimulate antibody responses is intriguing. It could explain why it has been so difficult to elicit neutralizing antibodies with recombinant envelope proteins, despite the fact that they clearly possess the capacity to adsorb bNAbs from HIV-positive sera (4, 59, 91).
For this mechanism to be operative, two conditions must be met: cathepsin proteolysis must destroy the epitopes recognized by neutralizing antibodies, and the proteolysis must occur extracellularly, prior to binding of gp120 to antigen receptors on B cells. The antibody binding studies described in this paper showed that the binding of neutralizing antibodies and CD4-IgG was significantly reduced or eliminated by cathepsin cleavage, thus fulfilling the first condition. Since cathepsins are best known as lysosomal and endosomal enzymes, it was uncertain whether the second condition could be met. However, examination of the literature revealed that several cathepsins (e.g., cathepsins L, B, S, and K) can be secreted; these are known to play an important role in cancer biology, tissue remodeling, and inflammatory diseases (17, 57, 69, 102). The release of these enzymes has not been studied in the course of HIV infection or vaccination; however, cathepsin S has been reported to be secreted from activated macrophages (77) and is induced by treatment with adjuvants such as MF59 (65). Because cathepsin S is secreted and active at neutral pH and because cathepsin S sites are located at a variety of neutralizing epitopes in the C2, V3, and C4 domains, this enzyme appears to be a good candidate to mediate the destruction of neutralizing epitopes in gp120. While proteolysis of virion-associated envelope proteins would be expected to reduce virus infectivity, it is doubtful that this cleavage would be 100% effective. The high levels of plasma viremia and integrated provirus that occur in HIV infection would likely ensure that infection is sustained even if a large percentage of virus is inactivated by protease cleavage. Studies to assess the sensitivity of virions to cathepsin digestion are in progress.
Several recent papers have suggested that HIV may enter cells by endocytosis as well as by virus-to-cell and cell-to-cell fusion (20, 49, 64). Although the relative importance of this pathway remains to be clarified, cleavage at the conserved cathepsin cleavage sites would be expected to occur upon virion uptake; however, the implications for modulation of the immune response by a virus-infected cell are uncertain.
While the role of cathepsins in MHC class II immune responses is undisputed, they may also play an important role in MHC class I responses to HIV. A variety of MHC class I-restricted CTL epitopes and MHC class II-restricted T-cell epitopes occur at “hot spots” that appear to be located in close proximity (see Fig. S2 in the supplemental material) to the cathepsin cleavage sites identified in this paper (9, 33, 104). These include the cathepsin S sites in the C2 domain (35), the cathepsin S and L sites in the V3 domain (1, 82), and the cathepsin L sites in the C4 domain (42, 93). The colocation of these CTL epitopes with the cathepsin cleavage sites identified in this paper may result from the TAP-independent “cross-presentation” pathway that has been documented for dendritic cells and macrophages and is known to require cathepsin S (88). This pathway enables proteasome-independent MHC class I-restricted presentation of peptides generated by cathepsin S cleavage.
Identification of antigen processing sites promises to provide a new understanding of the molecular basis of the immune response to HIV envelope glycoprotein. Inactivation or relocation of cleavage sites recognized by antigen-processing proteases may provide a new approach to refocus the specificity of humoral and cellular responses to viral proteins. Although the functional target of bNAbs is the membrane-bound trimeric envelope protein on the surfaces of virions, we do not know whether the bNAbs in HIV-positive sera are elicited against virion-associated trimers or against monomeric envelope proteins shed into humoral circulation (37, 38, 68). Since a large percentage of bNAbs in HIV-positive sera can be adsorbed with monomeric gp120 (4, 59, 91) and since neither disulfide bond-stabilized trimers nor monomeric gp120 is effective in eliciting bNAbs (3, 51, 60), it appears that factors other than the oligomerization state must limit the neutralizing antibody response to HIV. Proteolytic cleavage of HIV envelope proteins may provide an explanation for why neither of these forms of the envelope protein has been effective in eliciting bNAbs.
Protease genes are estimated to account for ~2% of the genes in the human genome (75), and it would not be surprising if HIV had evolved strategies to use cellular proteases to its advantage. The studies described here will contribute to our understanding of the specificity of antiviral immune responses and will add to our knowledge of the role of proteases in HIV biology.
This work was supported by funding provided by the University of California, Santa Cruz.
We thank Ann Durbin for expert technical assistance in the preparation of the manuscript.
Published ahead of print on 25 November 2009.
†Supplemental material for this article may be found at http://jvi.asm.org/.