To overcome the problem of HIV-1 variability, candidate vaccine antigens have been designed to be composed of conserved elements of the HIV-1 proteome. Such candidate vaccines could be improved with a better understanding of both HIV-1 evolutionary constraints and the fitness cost of specific mutations. We evaluated the in vitro fitness cost of 23 mutations engineered in the HIV-1 subtype B Gag-p24 Center-of-Tree (COT) protein through fitness competition assays. While some mutations at conserved sites exacted a high fitness cost, as expected under the assumption that the most conserved residue confers the highest fitness, there was no overall strong relationship between sequence conservation and replicative capacity. By comparing sites that have evolved since the beginning of the epidemic to those that have remained unchanged, we found that sites that have evolved over time were more likely to correspond to HLA-associated sites and that their mutation had limited fitness costs. Our data showed no consistent link between high conservation and high fitness cost, indicating that merely focusing on conserved segments of HIV-1 would not be sufficient for a successful vaccine strategy. Nonetheless, a subset of sites exacted a high fitness cost upon mutation. These sites have been under selective pressure to change since the beginning of the epidemic but have proved virtually nonmutable and could constitute preferred targets for vaccine design.
The extreme diversity of HIV-1 underlies some of the challenges in making an effective vaccine against HIV/AIDS. Within hosts, the propensity of mutant forms of the virus to emerge is echoed in its many mutational pathways of escape from antiretroviral drug and immune pressures—mutations that progressively permeate the population of HIV-1 strains circulating worldwide. Globally, a successful vaccine will have to control a growing variety of strains corresponding to multiple subtypes, ever-increasing circulating recombinant forms, and population-adapted viruses.
To cope with HIV-1 diversity in vaccine design, one approach is to omit variable segments of HIV-1 to focus on the most conserved elements of the HIV-1 genome (1–4). Conservation-restricted vaccines are designed to focus immune responses onto conserved elements of the HIV-1 proteome. Some of these candidate immunogens specifically exclude variable HIV-1 segments under the rationale that targeting variable, cycling segments could in effect cancel out the benefits of more durable responses against virtually nonmutable segments of HIV-1 (3). Antigens based on conserved elements of HIV-1 are being developed under the premise that mutations at conserved sites would have a high fitness cost, potentially driving the virus to less fit forms that would be better controlled by immune responses.
Escape mutations with associated fitness costs have been described, primarily in Gag, with the mutations T242N in the TW10 epitope (associated with HLA-B57 and B*5801) (5, 6), R264K in the KK10 epitope (B27) (7), and A163G in KF11 (B*5703) (8), as well as in the A*2501 epitopes QW11 and EW10 (9). Understanding of how well these cases translate at the population level has been sought with comparisons of the relative fitness, measured by replicative capacity or competitive growth assays in cell culture (10–14), of HIV-1 strains with gag-pro or pol inserts from both elite controllers and viremic subjects (15–20). While the growth in any particular cell culture does not fully recapitulate the environment for virus survival in vivo, the replicative capacity of viruses from controllers was lower than that of viruses with inserts from chronically infected individuals. This effect was more pronounced with inserts from acutely infected subjects than with inserts from chronically infected subjects and has been attributed in part to the presence of HLA-associated polymorphisms in controllers and the development of compensatory mutations that restore fitness over time (15, 20). These studies were performed using a subject-derived gene integrated in a common HIV-1 backbone (9, 15–24). The results are thus contingent on mutations that are private to each subject's viruses, and the replicative capacity of these viruses reflects a combination of mutations that differs from one subject to the next. More recently, infectious molecular clones have been developed from a number of subjects (25–29), and these infectious molecular clones provide the best opportunity to study interactions between different mutations and subject-specific mutational pathways that are evidenced during an HIV-1 infection.
We decided to use an alternate approach to explore the fitness cost of specific mutations—all mutations were tested independently in the context of HIV-1 subtype B Gag-p24 Center-of-Tree (COT) sequence (30); a COT, which is a centralized ancestral sequence, is more likely to retain covarying residues (not necessarily the case with consensus sequences) and is not biased toward outlier sequences (unlike the most recent common ancestor) (30, 31).
We focused on mutations at 23 residues in Gag-p24 to evaluate aspects of our initial design of a Gag-p24 “conserved-element” (CE) vaccine (3, 32, 33). This immunogen includes 7 segments that are composed of sites conserved in ~98% of circulating sequences, allowing one site per element to be more variable (“toggle” site) if the two residues together represent >99% of the known viral genetic variation. We tested the fitness cost of mutations occurring within CE sites and additional surrounding mutations—some chosen to potentially bridge conserved elements to create longer colinear elements, a potentially desirable feature of vaccine antigens. Our goal was to assess the fitness cost of mutations at sites corresponding to a range of conservation in circulating viruses, including sites known to escape due to cytotoxic T lymphocyte (CTL) pressure, to better assess the relationship between sequence conservation and viral fitness. Particularly, we wanted to interrogate the notion that the most conserved residues would incur the largest fitness cost upon mutation under the hypothesis that a consensus is the fittest sequence.
The Center-of-Tree (COT) Gag-p24 HIV-1 subtype B sequence (30) was placed in an NL4-3 backbone using the restriction sites BstEII and SfiI. Point mutations were engineered using the QuikChange XL site-directed mutagenesis kit (Stratagene). The complete HIV-1 coding regions of variant viruses were sequenced to confirm the presence of only the desired mutations.
CEMx174 cells and peripheral blood mononuclear cells (PBMC) were cultured in medium supplemented with 10% fetal bovine serum; all experiments were performed using frozen PBMC from a single donor. Titers were determined after propagation in CEMx174 cells using the method of Reed and Muench (34). Viral stocks were generated by transfection of HEK293T cells with 1 μg of plasmid DNA using Fugene (Roche). Supernatants were harvested 48 h after transfection, and frozen aliquots were stored at −80°C. The capsid concentration of the viral stocks was quantified by p24 enzyme-linked immunosorbent assay (ELISA).
Cells were infected with COT or variant viruses or both; mono and dual infections were done in triplicate (E. C. Lanxon-Cookson, J. V. Swain, S. Manocheewa, R. A. Smith, B. Maust, M. Kim, D. Westfall, M. Rolland, and J. I. Mullins, unpublished data). In brief, viruses were added at a multiplicity of infection (MOI) of 0.005 to 10⁵ PBMC or 10³ CEMx174 cells and washed 16 to 24 h postinfection. Viral production was monitored with p24 ELISA, and aliquots of supernatants were sampled for 6 (CEMx174) to 11 (PBMC) days. Viral RNA was extracted from supernatant aliquots, cDNA synthesis was done using SuperScript III (Invitrogen), and genes of interest were amplified by PCR and fully sequenced. Following propagation, viruses were again fully sequenced to verify that no reversion or additional mutations occurred. Values reported are the average of 3 replicates per experiment, and at least three experiments were performed for each competition. To quantitate the proportion of the variant and COT viruses, we measured peak heights on chromatograms using an in-house web tool (http://indra.mullins.microbiol.washington.edu/cgi-bin/chromatquant.cgi). The fitness of the variants relative to the COT viruses was calculated based on a mathematical model conforming to the definition of fitness in population biology using a linear regression method (least squares) based on constant exponential growth rates with multiple data points assessed during this period of growth as described by Wu and colleagues (35, 36), using days of culture as the unit for fitness calculations. Fitness estimates were calculated with the web tool developed by Hulin Wu's group (35, 36), using as input the proportion of each variant as measured based on peak heights on chromatograms.
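The regression step described above can be sketched as follows. This is a minimal illustration, not the web tool of Wu and colleagues: it assumes that relative fitness is the exponential of the least-squares slope of ln(variant/COT) plotted against days of culture, and the function name and the input proportions are hypothetical.

```python
import math


def relative_fitness(days, variant_fraction):
    """Estimate relative fitness of a variant from a dual-infection time course.

    days: sampling times in days of culture.
    variant_fraction: proportion of the variant at each time (strictly between 0 and 1).
    Assumption: both viruses grow exponentially, so ln(variant/reference) is linear
    in time and relative fitness per day is exp(slope).
    """
    if len(days) != len(variant_fraction) or len(days) < 2:
        raise ValueError("need at least two paired time points")
    if any(p <= 0.0 or p >= 1.0 for p in variant_fraction):
        raise ValueError("proportions must lie strictly between 0 and 1")

    # Log ratio of variant to reference (COT) virus at each time point.
    log_ratios = [math.log(p / (1.0 - p)) for p in variant_fraction]

    # Ordinary least-squares slope of log ratio against time.
    n = len(days)
    mean_t = sum(days) / n
    mean_y = sum(log_ratios) / n
    sxx = sum((t - mean_t) ** 2 for t in days)
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(days, log_ratios))
    slope = sxy / sxx

    return math.exp(slope)


# Synthetic proportions built so that the variant/COT ratio shrinks by 10% per day.
days = [2, 4, 6, 8]
fractions = [0.9 ** t / (1 + 0.9 ** t) for t in days]
print(round(relative_fitness(days, fractions), 3))  # 0.9
```

Under this assumed model, a value below 1 indicates that the variant is outcompeted by the COT virus, and a value above 1 indicates that it outgrows it.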
HIV-1 gag nucleotide sequences were downloaded from the HIV Database (37) and curated to retain only phylogenetically unlinked full-length Gag coding sequences that were not hypermutated and had the correct open reading frames. A total of 995 HIV-1 group M sequences, including 411 subtype B sequences, were aligned. Additional HIV-1 subtype B alignments consisted of Gag-p24 nucleotide sequences sampled in the United States in the 1980s (n = 44) and 2000s (n = 459). These were derived from the United States to limit potential confounders linked to the geographic origin of the isolate (an alignment and phylogenetic tree are available as supplemental material). Tests for evidence of selection pressure were based on the measure of nonsynonymous − synonymous substitutions (dN − dS) in codon-aligned nucleotide sequences using MEME, FEL, and SLAC with the general reversible nucleotide substitution model implemented in HyPhy (38–40; http://www.hyphy.org/). For each amino acid (aa) site in Gag, we compared residue frequencies between U.S. sequences obtained in the 1980s and 2000s using the permutation procedure of the numerator-type t statistic of Gilbert, Wu, and Jobes (41) to compute an unadjusted P value for each position. To account for the multiplicity of hypothesis tests, the Holm-Bonferroni multiplicity adjustment procedure was applied. Signature sites with different residues over time are those with an adjusted P value below the significance threshold; sites above the threshold are considered nonsignature sites.
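The multiplicity adjustment can be illustrated with a short sketch of the standard Holm-Bonferroni step-down procedure. The function name, the example P values, and the 0.05 cutoff are assumptions made for illustration; they are not values from this study.

```python
def holm_bonferroni(p_values):
    """Return Holm-Bonferroni adjusted P values in the original order.

    The i-th smallest P value (i = 0, 1, ...) is multiplied by (m - i), capped
    at 1, and forced to be at least as large as the previous adjusted value.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # enforce monotonicity
        adjusted[idx] = running_max
    return adjusted


# Hypothetical unadjusted per-site P values (illustrative only).
unadjusted = [0.01, 0.04, 0.03, 0.005]
adjusted = holm_bonferroni(unadjusted)

# Sites whose adjusted P value falls below the assumed 0.05 cutoff
# would be called signature sites.
signature_sites = [i for i, p in enumerate(adjusted) if p < 0.05]
print(signature_sites)  # [0, 3]
```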
We performed fitness competition assays between a chimeric NL4-3 virus encoding a Center-of-Tree (COT) Gag-p24 protein from clade B (Gag-p24-COT-B) and its variants mutated at sites of interest for conserved-element (CE) vaccine designs (3, 32, 33) that encapsulate segments of HIV-1 that are maximally conserved among HIV-1 group M sequences. The Gag-p24 CE vaccine is composed of seven segments at least 12 aa long, and each element includes one “toggle” amino acid site at which one of only two residues is found in >99% of the HIV-1 sequences in the HIV database. We created 23 viruses with amino acid substitutions corresponding to the CE toggles and additional sites, including those with known HLA associations. The substitutions corresponded to the second most common amino acid found at each site among circulating HIV-1 group M strains; the consensus residue was conserved in 63% to 100% of sequences among circulating HIV-1 subtype B sequences (Fig. 1).
Mutations at some highly conserved sites yielded no virus production: this was the case for the T186S, T190I, and F293S mutations, where the consensus residues were conserved in 97.1%, 97.8%, and 100% of circulating HIV-1 subtype B sequences, respectively. This was possibly due to a defect in particle formation, since an expression plasmid encoding only the Gag protein showed intracellular production of proteins detected by Western blotting, while no extracellular release was evidenced (unlike other mutated Gag proteins which retained the ability to form virus-like particles) (data not shown). In contrast, other mutations at highly conserved sites resulted in infectious viruses, including a substitution at a residue conserved in 99.3% of subtype B viruses (V159I).
Relative fitness was measured by performing dual infections and quantifying the proportion of p24-COT-B and each of its mutants at different times following infection. Repeated sampling of cell supernatants allowed monitoring of newly produced viral particles over the course of the experiment.
Fitness assays were performed both in CEMx174 cells and in PBMC. While the range of fitness estimates was wider with CEMx174 cells, there was a positive relationship between results from the two cell types (r² = 0.70, P < 0.0001; Spearman's ρ = 0.91, P < 0.0001) (see Fig. S1 in the supplemental material). There were nonetheless some apparent discrepancies in the estimates of replicative capacities: the V159I substitution was associated with an increase in relative fitness (r) in CEMx174 cells (r = 1.37) but with no change in PBMC (r = 1.01). Conversely, a V-to-I change at position 143 was associated with a decrease in relative fitness in CEMx174 cells (r = 0.78) but with almost no change in PBMC (r = 0.96). Notably, the Y301F mutation had the strongest fitness cost among the viable mutants in both cell types, with r = 0.36 in CEMx174 cells and r = 0.64 in PBMC. Table S1 in the supplemental material provides the relative fitness values and standard deviation for each mutant.
Among the viable mutants, the mutation Y301F corresponded to the most conserved site (found in 99.3% of circulating subtype B sequences) and showed the most drastic fitness cost. In contrast, mutations at sites where the second-most-frequent residue was found in at least 20% of circulating subtype B sequences (G248A, T280V, and R286K) did not exact a fitness cost on the virus, and the mutants sometimes showed a relative fitness similar to or slightly higher than that of the unmutated virus (in 5 of 23 mutants tested in CEMx174 cells and in 7 of 23 mutants tested in PBMC). To place our results in perspective, the previously studied T242N mutation (associated with the B*57-TW10 epitope) (5, 6) had a fitness cost in the middle range of what we observed in our data set: the relative fitness value was 0.87 in CEMx174 cells (median, 0.85) and it was equal to the median (0.96) in PBMC.
Because consensus sequences correspond to the viruses that are the most frequently observed, the most frequent viruses can be considered to be the most fit. We wanted to test how this relationship translated at the amino acid level (i.e., whether the most conserved amino acid residues would exact the strongest fitness cost upon mutation). By analyzing competitive fitness as a function of the database frequency of the altered residues among circulating HIV-1 subtype B sequences, we did not observe the predicted strong positive relationship between sequence conservation and fitness cost. Figure 2 shows that the fitness impact was not significantly stronger when a conserved site residue was mutated either in PBMC (r² = 0.18, P = 0.05; Spearman's ρ = −0.40, P = 0.06) or in CEM (r² = 0.06, P = 0.30; Spearman's ρ = −0.38, P = 0.10). Similar results were obtained if we used the database frequency of the mutant amino acid or the switch in database frequency between the consensus and mutated residues instead of the database frequency of the consensus amino acid (data not shown). Because conserved residues in subtype B sequences are often conserved in group M sequences (including subtype C), we looked at the relationship between sequence conservation in group M sequences and viral fitness and also found no evidence of a relationship (in PBMC, r² = 0.06 [P = 0.23] and Spearman's ρ = −0.13 [P = 0.55]; in CEM cells, r² = 0.00 [P = 0.98] and Spearman's ρ = −0.11 [P = 0.65]).
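The two association measures reported above, the squared Pearson correlation and Spearman's ρ, can be computed as in the following sketch. The conservation and fitness values are made-up numbers for illustration, not data from this study, and the helper names are hypothetical; P values are not computed here.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical per-site values: percent conservation and relative fitness.
conservation = [63.0, 75.5, 88.0, 95.2, 99.3]
fitness = [1.02, 0.97, 0.99, 0.85, 0.64]

r_squared = pearson_r(conservation, fitness) ** 2
rho = spearman_rho(conservation, fitness)
print(f"r^2 = {r_squared:.2f}, Spearman rho = {rho:.2f}")
```

Spearman's ρ depends only on the ordering of the values, so it is less sensitive than r² to a single extreme mutant.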
Given that our results did not show an evident relationship between the population-level conservation of a mutated residue in Gag and the relative fitness of the mutated virus, we asked whether factors affecting the sequence conservation of HIV-1 could influence the expected positive relationship between sequence conservation and viral fitness. In particular, we sought to distinguish sites that have evolved since the beginning of the epidemic from sites that have shown no significant sign of HIV-1 evolution over the last 3 decades. We generated alignments of Gag-p24 sequences sampled in the United States in the 1980s (n = 44) and 2000s (n = 459). Phylogenetic analyses showed a star-like phylogeny without evidence of a temporal structure (see Fig. S2 in the supplemental material). Analysis of selection pressure showed that most of the 231 sites in Gag-p24 were significantly under negative selection, i.e., a purifying pressure to leave the residues unchanged (negative dN − dS with P < 0.05 for 194 sites based on FEL or 183 sites based on SLAC) (38, 40). In contrast, a minority of sites were under significant positive selective pressure, i.e., under pressure to mutate (positive dN − dS with P < 0.05 for 5 sites under FEL and 8 sites under SLAC). (MEME identified 17 sites under diversifying selection.) By comparing amino acid sequences from the 1980s to those from the 2000s, we identified 12 temporally specific signature sites that distinguished sequences from the two time periods; nine of them were under positive selection based on dN − dS estimates. At each signature site, the frequency of the consensus residue varied significantly since the beginning of the U.S. epidemic, and sometimes the consensus amino acid residue that was found in the 1980s was later replaced.
Temporal signature sites were more likely to correspond to HLA-associated sites (i.e., associated with CTL escape), as 10 of the 12 temporal signatures (83%) were known HLA-associated sites, while only 32 HLA-associated sites have been reported in the remainder of Gag-p24 (42) (i.e., 32 of 219 residues = 15%). Due to the diversification of HIV-1 subtype B since the beginning of the epidemic, the frequency of the consensus residue at each site would be expected to decrease over time; yet the decrease in frequency of the consensus residue was significantly sharper for HLA-associated sites (n = 42) than for non-HLA-associated sites (n = 189): median of −4.56% versus −0.18%, respectively (Fig. 3, left panel).
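The enrichment of HLA-associated sites among temporal signatures can be checked arithmetically from the counts given above. The sketch below recomputes the two proportions and, as an assumption on our part (the text does not name a test for this comparison), evaluates a one-sided hypergeometric tail probability of the kind used in Fisher's exact test. The function name is hypothetical.

```python
from math import comb


def hypergeom_upper_tail(total, successes, draws, observed):
    """P(X >= observed) when `draws` sites are sampled without replacement
    from `total` sites, of which `successes` carry the property of interest."""
    denominator = comb(total, draws)
    upper = min(successes, draws)
    numerator = sum(
        comb(successes, k) * comb(total - successes, draws - k)
        for k in range(observed, upper + 1)
    )
    return numerator / denominator


# Counts taken from the text: 231 Gag-p24 sites, 42 HLA-associated sites,
# 12 temporal signature sites, 10 of which are HLA associated.
signature_share = 10 / 12   # HLA-associated fraction among signature sites
remainder_share = 32 / 219  # HLA-associated fraction among the other sites
print(f"{signature_share:.0%} versus {remainder_share:.0%}")  # 83% versus 15%

# Probability of seeing at least 10 HLA-associated sites among 12 sites
# drawn at random, under the assumed null of no enrichment.
p_enrichment = hypergeom_upper_tail(total=231, successes=42, draws=12, observed=10)
print(p_enrichment)
```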
Among the p24 mutations that we tested, seven sites were signature sites, i.e., the frequency of the consensus amino acid has changed since the 1980s: the frequency dropped by more than 20% at six sites, while one residue saw its frequency increase by 17.8% among circulating sequences. As with the larger data set, the decrease in frequency of the consensus residue of the tested sites was significantly sharper for HLA-associated sites (n = 13) compared to non-HLA-associated sites (n = 10): median of −12.10% versus −0.77%, respectively (Fig. 3, right panel).
We next interrogated the relationship between sequence conservation and relative fitness by focusing on signature sites. For the eight sites at which the frequency of the consensus residue had decreased by a median of 27% since the 1980s (consistent with the consensus residue being replaced by a CTL escape mutation), there was no or only a minor fitness cost associated with the mutations (median relative fitness = 0.99 in PBMC and 0.94 in CEM cells). In contrast, for the 15 sites with stable amino acid frequency over time, the associated fitness cost was significantly higher (median relative fitness = 0.94 in PBMC and 0.80 in CEM cells) (P = 0.06 in PBMC and P = 0.04 if zero values are included; P = 0.02 in CEM cells) (Fig. 4). Hence, sites that have shown signs of HIV-1 adaptation since the 1980s, which are likely to have occurred primarily as a result of HLA-mediated selection within CTL epitopes, were associated with only a slight fitness cost. Sites at which the consensus residue had remained stable over time showed more substantial fitness costs.
Our data better define the interrelationships between HIV-1 sequence conservation, CTL responses, and viral fitness in Gag-p24 and provide guidelines toward the design of a fitness-informed HIV-1 CE vaccine, specifically to delineate longer HIV-1 segments for CE vaccine constructs.
Under the hypothesis of a high fitness cost for mutations at conserved sites, our CE vaccine insert was designed by setting a strict sequence conservation threshold: i.e., only sites that are conserved in more than ~98% of circulating HIV-1 sequences were included. We reasoned that an efficacious vaccine must elicit responses toward HIV-1 segments that, if mutated, severely compromise viral viability. At the same time, the vaccine must not elicit responses against variable, immunodominant “decoy” epitopes. However, a stringent 98% threshold means that only a small fraction of HIV-1 can be included in the CE insert. Our results allow us to relax our inclusion criteria on an empirical basis and thereby extend CE segments based on the fitness cost of mutations at residues adjacent to CE. This can enable us to create longer CE vaccine candidates in order to potentially broaden vaccine-elicited responses.
Our results conformed to the relationship between HIV-1 sequence conservation and viral fitness in a more limited way than expected, as our data did not replicate at the amino acid level the assumption that the consensus is necessarily the most fit virus. While we did not see an overall strong relationship between sequence conservation and viral fitness for the 23 mutants that we tested, we observed that some mutations at very conserved sites had a high fitness cost. This had been noted previously (9; R. M. Troyer, J. McNevin, Y. Liu, R. W. Krizan, A. Abraha, D. M. Tebit, H. Zhao, S. Avila, M. A. Lobritz, M. J. McElrath, J. I. Mullins, and E. J. Arts, presented at the XVI International AIDS Conference, Toronto, Canada, 2006), and a recent study showed that rare HIV-1 subtype C Gag mutations had the greatest impact on replicative capacity (22).
Our results also show that HIV-1 Gag-p24 subtype B has evolved since the 1980s, with significant changes likely driven by CTL immune pressure, as previously described (43, 44). Although most HIV-1 p24 sites were under significant purifying pressure (i.e., under fitness constraints), thereby resulting in the conservation of the residues found at these sites, several sites in Gag-p24 reflect the imprinting of HLA-associated mutations when sequences from the 1980s are compared to contemporary circulating strains. We found that mutations at most HLA-associated sites did not overall have a high fitness impact on the virus, implying that findings of high fitness cost for certain HLA-associated mutations (e.g., in the TW10 or KK10 epitopes in Gag-p24) cannot be extrapolated to interpret the fitness cost of other escape mutations. Importantly, the fitness cost of mutations at sites that remained conserved was greater than the fitness cost at the evolved sites. This implies that the fitness cost of mutations at these conserved sites may be too high for the mutations to be disseminated at the population level: they are probably transient within a host or occur only in the presence of a set of facilitating/compensatory mutations and thus are likely to revert rapidly upon transmission to a new host. In contrast, the CTL escape mutations that are progressively becoming imprinted in the HIV-1 genome had only modest effects on replicative fitness. Given that conserved segments of Gag-p24 are targeted by CTL responses (45) and are thus not conserved by virtue of being “invisible” to the host immune response, the less mutable sites may be important for inclusion in a conserved sequence vaccine. An immune response toward these segments might be durable because there might be no costless path to escape for the virus, as none has yet occurred widely.
Our data suggest that conserved sequence vaccine strategies should focus on elements of HIV-1 that are not merely conserved but conserved despite being under selective pressure.
We thank Joshua Herbeck, Laura Heath, Wenjie Deng, and Brandon Maust for help with preparation of sequence alignments.
This work was supported by awards to J.I.M. from the Bill and Melinda Gates Foundation (A39748) and the NIH (P01 AI057005 and AI47734).
Published ahead of print 6 March 2013
Supplemental material for this article may be found at http://dx.doi.org/10.1128/JVI.03033-12.