We hypothesized that the genetic variants found to be associated with HIV-1 VL control were subject to recent natural selection and population differentiation. This natural selection could have resulted in the observed population differences in HIV-1 pathogenicity among contemporary populations. We examined the most significant HLA and non-HLA variants associated with HIV-1 VL in GWAS among EA and AA, and found that the top associated loci in the HLA region are located in a sub-region of the HLA that shows very little relative differentiation between the considered sub-Saharan African group and other groups, compared to the relative levels of differentiation among the Eurasian groups considered. We also confirm that the patterns observed in this sub-region are not generalizable across all HLA sub-regions.
Considering all 53 populations, we find the greatest degree of differentiation at the rs2395029 locus for many pairwise comparisons with the Makrani and Sindhi in Pakistan, as well as the Bantu in Africa, and the Druze in Israel. Further studies are needed to confirm and gain a better understanding of the possible reasons for this pattern. Although we take into account a total of 253 SNPs in this locus, these results should be interpreted with some caution given the small sample sizes for some populations.
We find that the differentiation pattern observed in the HLA sub-region also applies to the non-HLA 'top hits', but only among those identified in the HIV-1 VL GWAS among AA. Averaging GSFST over the top ten regions shows a general trend of a smaller degree of differentiation between the sub-Saharan African group and other groups. The lack of differentiation that we observe between the sub-Saharan African and other groups at the HLA sub-region could be due to a high degree of conservation. It is plausible that as each population in Eurasia was subjected to unique selection pressures, a greater degree of differentiation occurred among Eurasian groups than between each of these groups and sub-Saharan Africans. These results should be interpreted with caution because some of the 'top hits' that we examine do not reach genome-wide statistical significance in the respective GWAS.
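To make the averaging of differentiation over several regions concrete, the following is a minimal sketch only. It assumes Wright's FST for a single biallelic locus with two equally weighted populations, which is not necessarily the estimator used in this study; the function names `pairwise_fst` and `mean_fst` are hypothetical.

```python
from statistics import mean


def pairwise_fst(p1: float, p2: float) -> float:
    """Wright's FST at one biallelic locus for two equally weighted populations.

    p1 and p2 are the frequencies of the same allele in each population.
    """
    p_bar = (p1 + p2) / 2
    h_total = 2 * p_bar * (1 - p_bar)  # expected heterozygosity in the pooled sample
    if h_total == 0:
        return 0.0  # locus is fixed for the same allele in both populations
    h_within = (2 * p1 * (1 - p1) + 2 * p2 * (1 - p2)) / 2  # mean within-population value
    return (h_total - h_within) / h_total


def mean_fst(freqs_a: list[float], freqs_b: list[float]) -> float:
    """Unweighted mean of per-locus FST over a set of loci (an assumed averaging scheme)."""
    return mean(pairwise_fst(a, b) for a, b in zip(freqs_a, freqs_b))
```

Averaging per-locus values in this way weights every locus equally. A ratio of summed variance components is a common alternative and can give different values when loci differ greatly in heterozygosity.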
Paralleling the pattern in differentiation, we observed evidence of extreme relative extended haplotype homozygosity (REHH) among Eurasian groups but not among sub-Saharan Africans or Native Americans. On the basis of our finding of multiple haplotypes with extreme REHH and at relatively low frequencies (i.e. all below 37%), our results appear to be more consistent with a mode of recent evolution characterized by multiple soft sweeps as opposed to single hard sweeps [16]. They also suggest that the patterns of differentiation and REHH are not uniform and homogenous across all Eurasians. Instead, it is possible that population-specific patterns of genetic change, perhaps in response to region-specific selection pressures, resulted in unique localized adaptations in different Eurasian populations. Our REHH results suggest that the evidence for selection in the HLA among Eurasian groups is one in which several different haplotypes of low to moderate frequency have spread through populations. In this context, it should be noted that REHH may not be powerful enough to detect very low frequency haplotypes [17]. Other tests of selection such as XP-EHH (using the HGDP browser: http://hgdp.uchicago.edu/cgi-bin/gbrowse/HGDP/) do not appear to show similar trends as those we obtained using FST and REHH. This may be due to the fact that XP-EHH is most powerful for cases of selection in which the selected haplotype reaches a very high frequency in one population but not another, which does not appear to be the situation for the loci that we have examined.
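The REHH statistic discussed above can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in this study: it follows a commonly used definition in which EHH for a core haplotype is the probability that two chromosomes carrying it are identical over the extended region, and REHH divides that value by the pooled EHH of all other core haplotypes. The input layout and the names `pair_counts`, `ehh` and `rehh` are hypothetical.

```python
from collections import Counter, defaultdict
from math import comb


def pair_counts(haplotypes: list[tuple[str, str]]) -> dict[str, tuple[int, int]]:
    """Map each core haplotype to (identical pairs, total pairs).

    Each input item is (core, extended): the core haplotype carried by one
    chromosome and its allele string out to a fixed distance from the core.
    """
    by_core: dict[str, Counter] = defaultdict(Counter)
    for core, extended in haplotypes:
        by_core[core][extended] += 1
    counts = {}
    for core, ext_counter in by_core.items():
        n = sum(ext_counter.values())
        identical = sum(comb(k, 2) for k in ext_counter.values())
        counts[core] = (identical, comb(n, 2))
    return counts


def ehh(haplotypes: list[tuple[str, str]], core: str) -> float:
    """Probability that two chromosomes carrying `core` match over the extended region."""
    identical, total = pair_counts(haplotypes)[core]
    return identical / total if total else float("nan")


def rehh(haplotypes: list[tuple[str, str]], core: str) -> float:
    """EHH of `core` divided by the pooled EHH of every other core haplotype."""
    counts = pair_counts(haplotypes)
    other_identical = sum(i for c, (i, _) in counts.items() if c != core)
    other_total = sum(t for c, (_, t) in counts.items() if c != core)
    if other_identical == 0:
        return float("nan")  # undefined: other cores have no matching pairs
    return ehh(haplotypes, core) / (other_identical / other_total)
```

Because the denominator pools pairs across all other core haplotypes, a core carried by only a few chromosomes contributes few pairs and yields a noisy EHH estimate, in line with the caveat above about limited power for very low frequency haplotypes.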
Among Europeans in the HGDP, all rs2395029 protective alleles (G) are on a haplotype that has high REHH, suggesting that this allele or one linked to it may have quickly arisen in response to a selective pressure such as an infectious disease in Europe. However, among South Asians, East Asians and Oceanians, it is the susceptibility allele at this SNP that lies on the haplotype with high REHH. This inconsistency could signify, among other things, that there are many different polymorphisms in this region that could have functional significance, that the genetic basis of adaptation to similar pressures could differ among groups, and finally, that the genetic basis of HIV-1 control could differ among racial/ethnic groups.
Our findings are consistent with recent findings that signatures of selection among Europeans are enriched for immunity related genes in the HLA [18]. In fact, Kudaravalli et al. [20] recently found that SNPs associated with gene expression levels of HLA-C also show evidence of selection in both Europeans and East Asians in the HapMap samples. Along with these and other studies (see [21]), our findings suggest the presence of selection pressures on the immune system, possibly due to geographic, demographic, cultural, or environmental factors related to subsistence. For example, our fine-scaled analysis of all 53 HGDP populations shows that among the sub-Saharan populations, those populations that show elevated FST between each other (Bantu, Mandenka, and Yoruba) are also the groups that have a history of practicing agriculture. It should be noted that we have not examined populations from East Africa, as these may show very unique patterns, especially considering that the frequency of the rs2395029 protective allele among the Maasai in the HapMap3 sample is quite elevated (14%), and that Henn et al. have found that this allele is also at a relatively high frequency among the Sandawe (17%) in East Africa, and that it exhibits evidence of recent selection in this group [22].
While we have found evidence suggesting the action of natural selection on the genomic regions associated with HIV-1 VL among Eurasian populations, our results must be interpreted cautiously. In all of our analyses, we controlled for the genomic background by comparing FST and REHH at the locus of interest to a distribution of other loci on the respective chromosome. However, we have not directly tested whether the patterns we observe are consistent with natural selection as opposed to more stochastic evolutionary forces such as genetic drift. Although it is difficult to distinguish selection from drift with certainty, the greater extended haplotype homozygosity among Eurasians, coinciding with elevated differentiation among these groups, makes the scenario we have outlined (i.e. positive selection among Eurasians and balancing selection among some sub-Saharan Africans) more compelling.
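The comparison against the chromosome-wide background described above can be illustrated with a small sketch. It assumes the background is simply a list of the same statistic (FST or REHH) computed at other loci on the chromosome; the function name `empirical_tail_fraction` is hypothetical, and the study's actual matching criteria for background loci are not reproduced here.

```python
def empirical_tail_fraction(observed: float, background: list[float]) -> float:
    """Fraction of background loci whose statistic is at least as large as `observed`.

    A small value means the locus of interest is extreme relative to the
    chromosome-wide distribution of the same statistic.
    """
    if not background:
        raise ValueError("background distribution must not be empty")
    return sum(1 for value in background if value >= observed) / len(background)
```

A small tail fraction marks the locus as an outlier against its genomic background. It does not by itself distinguish selection from drift, which is the limitation acknowledged above.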