In our study, HLA allele frequency was found to be inversely associated with HIV-1 terminal branch lengths for env, nef, and pol,
suggesting that rare HLA alleles may be associated with additional viral evolution in these HIV-1 genes. This supports previous reports suggesting that HIV escape variants can be transmitted1,27–31
and may become common in the interhost viral population.14
Furthermore, some escape variants can impart a replication cost to the virus3,9–12
and these variants may be lost during early stages of infection.1,32–34
In fact, the proportion of such events has been approximated in previous studies and ranges from 8 to 42%.32,34
Together these investigations suggest that the frequency of the HLA allele in the host population impacts the evolution of HIV by influencing the amount of escape and reversion during early stages of infection. Genetic drift also affects viral evolution but, by definition, it is not impacted by the immune response. The passage of time is associated with genetic drift and for this reason we adjusted for time in our multivariate analysis, reducing the potential for confounding due to this variable. Furthermore, the duration of infection will also impact the external branch lengths, due to genetic bottlenecking that takes place at the time of transmission. For this reason, we adjusted for CD4 count as a measure of disease progression. Neither genetic drift nor duration of infection was likely to be associated with HLA allele frequency, reducing the likelihood that these variables confounded our results.
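The adjustment described above can be illustrated with a minimal sketch. The code below is not the model fitted in this study; it assumes an ordinary least squares regression of terminal branch length on HLA allele frequency, with sampling time and CD4 count as covariates. The data are synthetic, and every name and coefficient (`ols`, `simulate`, the effect sizes) is a hypothetical choice made for illustration only.

```python
import random

def ols(rows, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y.

    rows: list of predictor lists (no intercept column).
    Returns [intercept, b1, b2, ...].
    """
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * yi for x, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]

def simulate(n=200, seed=0, noise_sd=0.002):
    """Synthetic data only: all coefficients here are invented for illustration."""
    rng = random.Random(seed)
    rows, y = [], []
    for _ in range(n):
        freq = rng.uniform(0.01, 0.30)   # hypothetical HLA allele frequency
        years = rng.uniform(0.0, 20.0)   # hypothetical sampling time offset
        cd4 = rng.uniform(50.0, 800.0)   # hypothetical CD4 count
        branch = 0.05 - 0.04 * freq + 0.001 * years - 0.00001 * cd4
        branch += rng.gauss(0.0, noise_sd)
        rows.append([freq, years, cd4])
        y.append(branch)
    return rows, y

rows, y = simulate()
intercept, b_freq, b_years, b_cd4 = ols(rows, y)
print(f"adjusted slope for allele frequency: {b_freq:.4f}")
```

A negative fitted `b_freq` corresponds to an inverse association between allele frequency and branch length with the two covariates held fixed, which is the kind of relationship reported here.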
One of the most daunting challenges in the design of an effective vaccine against HIV-1 is the enormous amount of genetic diversity observed among sequences identified worldwide. Developing a vaccine that targets elements common to all HIV-1 infections is a challenging task. One approach has been to identify sequences present in early infection with the hope that the factors selecting for the transmitted variants will result in common sequences or sequence patterns that can be incorporated into effective immunogens. An encouraging idea is that viruses undergo a procession toward an ancestral state of higher fitness prior to the evolution of escape mutations.35
A vaccine targeting this ancestral state, theoretically common to all infections, could prevent the virus from achieving high replication capacity very early after transmission, possibly resulting in eradication of the virus or reduced disease progression. Our study suggests that individuals with rare HLA alleles will have a greater number of evolutionary events, possibly due to increased reversions toward this ancestral state. Thus, sequences from individuals with rare HLA alleles may optimally assist in the identification of this ancestral state.
Viruses infecting individuals with rare HLA alleles would likely have both reduced replication capacity, due to preexisting escape mutations, and susceptibility to the immune response, because the preexisting escape mutations would not protect against the immune response mediated by the rare HLA allele. This could lead to reduced viral load among those with rare HLA alleles. Supporting this hypothesis, two recent studies have identified an indirect association of viral load with the presence of transmitted escape variants selected for by discordant HLA earlier in the transmission chain.8,36
While we were able to detect an association between HLA frequency and viral evolution, no significant association was found in our analysis when HLA frequency was evaluated in relation to viral load. We may not have observed an association in this study because the difference in viral load between those with and without the escape variants may be most striking during the initial stages of infection. Because our study was cross-sectional, the majority of the study participants were likely to be in the asymptomatic stage of the infection. In addition, different HLA alleles have been shown to be associated with both increased and decreased disease progression,37
suggesting that the HLA-mediated immune response may have a complex set of effects, preventing the observation of an overall association of HLA-driven evolution with disease outcome. These and other unidentified misclassifications could explain why the clinical impact of a putative advantage of rare HLA alleles, possibly detectable in prospective analyses, may have been obscured in this cross-sectional study.
Although we evaluated the full HIV proteome, including all nine genes, no correction for multiple comparisons was performed because these genes were not independent, complicating the identification of an appropriate test. Given a p-value cutoff of 0.05 and the conservative Bonferroni method of correction for multiple comparisons, one would expect ≤1 association due to chance. Because our results identified three of nine associations, we conclude that our analysis provides significant evidence that population HLA class I allelic frequency has a substantial impact on viral evolution. Furthermore, this analysis evaluated a linear relationship between HLA frequency and HIV evolution as a logical first-order approximation and not necessarily the best approximation of the true relationship between these variables.
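The arithmetic behind this argument can be sketched as follows. The calculation assumes nine independent tests at a cutoff of 0.05, an assumption that, as noted above, does not strictly hold for these genes, so the numbers are a rough guide rather than a formal test. The function names are illustrative and not part of the original analysis.

```python
from math import comb

def expected_false_positives(n_tests, alpha):
    """Expected number of chance associations if every null hypothesis is true."""
    return n_tests * alpha

def prob_at_least(k, n_tests, alpha):
    """P(X >= k) for X ~ Binomial(n_tests, alpha), assuming independent tests."""
    return sum(
        comb(n_tests, i) * alpha**i * (1 - alpha) ** (n_tests - i)
        for i in range(k, n_tests + 1)
    )

def bonferroni_threshold(n_tests, alpha):
    """Per-test cutoff that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests

n_genes, cutoff = 9, 0.05
print(expected_false_positives(n_genes, cutoff))  # fewer than one chance association
print(prob_at_least(3, n_genes, cutoff))          # chance of three or more, roughly 0.008
print(bonferroni_threshold(n_genes, cutoff))      # stricter per-gene cutoff
```

Under these assumptions, observing three or more associations by chance alone would be unlikely, which is consistent with the reasoning given above.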
The three genes identified in this study are likely to reflect those with the strongest associations, allowing their identification in this analysis. It is not surprising to find Env evolution associated with HLA frequency, because this highly variable protein may allow substantial epitope escape in response to HLA-mediated immunity without substantial loss of viral fitness. Because Nef may be most targeted by the CTL response38
and may have the most amino acid variation driven by HLA selection pressure,19
it is also not surprising to find that Nef is likely to demonstrate differential evolution in response to varying CTL-based selection pressures. Finally, the association with Pol, which is neither highly variable nor the most targeted protein, was an unanticipated result of this analysis. Pol is highly conserved and may not tolerate variation without loss of function, allowing variation only in the presence of strong CTL selection. In this case, a large proportion of the most recent evolution in Pol should be due to evolution toward a more fit form.
It remains unclear why we found associations with HLA-A only. However, this does not rule out the possibility that CTL responses mediated by HLA-B and HLA-C are also involved. Because the HLA genes are located in the same genomic region, they cosegregate, particularly HLA-A and HLA-B due to their proximity. In fact, two HLA alleles found to be associated with phylogenetic branch length also had significant linkage disequilibrium with HLA-B and HLA-C alleles (). Thus the associations with HLA-A may simply reflect linked associations with the other HLA genes.
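As a reminder of how linkage disequilibrium between two alleles is commonly quantified, the sketch below computes the standard coefficients D, D′, and r² from a haplotype frequency and the two allele frequencies. It is not the procedure used in this study, and the frequencies in the example are hypothetical.

```python
def linkage_disequilibrium(p_ab, p_a, p_b):
    """Return (D, D', r^2) for alleles A and B.

    p_ab: frequency of the haplotype carrying both A and B
    p_a, p_b: marginal frequencies of allele A and allele B
    """
    d = p_ab - p_a * p_b
    # D' rescales D by its maximum attainable magnitude given the allele frequencies
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = d / d_max if d_max > 0 else 0.0
    # r^2 is the squared correlation between the two allele indicators
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2

# Hypothetical frequencies for an HLA-A allele and an HLA-B allele
d, d_prime, r2 = linkage_disequilibrium(p_ab=0.08, p_a=0.10, p_b=0.12)
print(f"D={d:.3f}  D'={d_prime:.3f}  r2={r2:.3f}")
```

Values of D′ or r² near 1 would indicate that an association attributed to one allele could equally reflect the linked allele at the neighboring locus.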
There are several limitations to this study that may have prevented the identification of associations with some HIV-1 proteins. As mentioned earlier, this was a cross-sectional study and the precise time of infection was unknown. Longitudinal analysis including extensive viral sequencing during the acute stage of infection may be required to thoroughly evaluate viral evolution in response to HLA-mediated immunity. Because this ideal dataset is currently unavailable, our phylogenetic analysis was the best available method to approximate those sequence changes. In addition, the external branch lengths we observed likely included evolutionary events that occurred in prior hosts, diluting our ability to identify an association between divergence and HLA frequency. Furthermore, HIV-1 evolves due to other factors, including other adaptive immune responses and protein function, possibly reducing associations evaluated with the CTL response. Finally, the sample size, although large for the extensive and high-quality sequence data generated, may be too small to distinguish the impact of the CTL response on viral evolution in every HIV-1 gene when multiple factors are involved. These potential limitations are likely to bias our results toward the null hypothesis, making the identification of significant associations even less likely. Thus, our findings may be limited to only the strongest associations.
We have shown differential evolution of HIV-1 based on HLA frequency, suggesting additional HIV evolutionary events among individuals with rare HLA alleles. These findings support the idea that the virus follows a regressive path, reverting to states of greater fitness, prior to forward evolution driven by escape from the CTL response and other factors.35
These findings also suggest that CTL vaccine designs based on consensus sequences may include multiple escape variants to common HLA alleles, potentially reducing their effectiveness. Finally, focusing on infected individuals with rare HLA alleles may allow the identification of reversion variants that can be used as candidates in HIV-1 vaccine designs.