Signatures of natural selection occur throughout the human genome and can be detected at the sequence level. We have re-sequenced ABCE1, a host candidate gene essential for HIV-1 capsid assembly, in European- (n=23) and African-descent (Yoruban; n=24) reference populations for genetic variation discovery. We identified an excess of rare genetic variation in Yoruban samples, and the resulting Tajima’s D was low (−2.27). The trend of excess rare variation persisted in flanking candidate genes ANAPC10 and OTUD4, suggesting that this pattern of positive selection can be detected across the 184.5kb examined on chromosome 4. Because of ABCE1’s role in HIV-1 replication, we re-sequenced the candidate gene in three small cohorts of HIV-1-infected or resistant individuals. We were able to confirm the excess of rare genetic variation among HIV-1 positive African-American individuals (n=53; Tajima’s D = −2.34). These results highlight the potential importance of ABCE1’s role in infectious diseases such as HIV-1.
The candidate gene ABCE1, located on chromosome 4q31 [1], is a member of the ATP-binding cassette (ABC) family. ABCE1 has two main isoforms and is widely expressed [1]. Unlike other ABC family members, ABCE1 maintains an ATP-binding cassette lacking a transmembrane domain [2]. The product of ABCE1 was first described as an inhibitor of ribonuclease (RNase) L [3], but more recently it has been shown to be involved in eukaryotic translational initiation [4–9], and is essential in the assembly of immature HIV-1 capsids [10, 11].
In human populations, little is known about the patterns of genetic variation in ABCE1. To date, four variation discovery efforts have been published for ABCE1 [12–14]; however, none provide a comprehensive reference of common variation across multiple populations or across both intronic and exonic sequence. Given the potential role of ABCE1 in HIV-1 replication, we, as part of the Program for Genomic Applications SeattleSNPs, re-sequenced both ABCE1 and the flanking sequence in 23 Centre d’Etude du Polymorphisme Humain samples (CEPHs) and 24 Yoruban samples using standard dye terminator sequencing technology [15] to identify single nucleotide polymorphisms for future genetic association studies. In a parallel study approved by the University of Washington’s Human Subjects Review Committee, we also re-sequenced ABCE1 in three small populations ascertained for features of HIV-1 resistance or disease progression. In both the variation discovery dataset and the HIV-1 infection-related population dataset, we observed an excess of rare genetic variation in ABCE1 that extends to its neighboring genes ANAPC10 and OTUD4. These data suggest that the genomic region containing ABCE1 may have been a target for positive selection that has affected present day genetic diversity in human populations.
We successfully re-sequenced 30,773 bp of ABCE1 (all introns and exons) as well as 1,870 bp and 2,073 bp of 5′ and 3′ flanking sequence, respectively, in 23 European-descent (CEPH) and 24 African-descent (Yoruban) samples. We identified 125 SNPs in these 47 reference DNA samples: 40 SNPs in the European-descent and 93 SNPs in the African-descent discovery panels (Table 1). Nucleotide diversity (π) was lower in the European- and African-descent discovery panels (5.5×10⁻⁴ and 7.6×10⁻⁴, respectively; Table 1) compared to the average nucleotide diversity based on 180 candidate genes re-sequenced in similar DNA discovery panels (7.0×10⁻⁴ and 9.0×10⁻⁴, respectively) [16]. In other words, based on 180 genes, one SNP is expected per 1,100 bp and 1,435 bp in African- and European-descent populations, respectively, compared to our observed rate of one SNP per 1,315 bp and 1,818 bp, respectively, in ABCE1.
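The conversion from nucleotide diversity to SNP spacing can be checked with a few lines of arithmetic. The sketch below assumes that "one SNP per N bp" is simply the reciprocal of π; that is our reading of the figures, not a method stated in the text, and the names `PI` and `bp_per_snp` are illustrative.

```python
# Nucleotide diversity (pi, per site) values quoted in the text.
PI = {
    "ABCE1, European-descent": 5.5e-4,
    "ABCE1, African-descent": 7.6e-4,
    "180-gene average, European-descent": 7.0e-4,
    "180-gene average, African-descent": 9.0e-4,
}


def bp_per_snp(pi):
    """Assumed conversion: one SNP is expected every 1/pi base pairs."""
    return 1.0 / pi


if __name__ == "__main__":
    for label, pi in PI.items():
        print(f"{label}: one SNP per {bp_per_snp(pi):,.0f} bp")
```

Under this assumption the ABCE1 values come to roughly 1,316 and 1,818 bp, in line with the observed rates quoted above. The 180-gene averages come to roughly 1,111 and 1,429 bp, close to the quoted 1,100 and 1,435 bp, which may have been rounded or derived from unrounded π values.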
In addition to the lower-than-expected nucleotide diversity, we observed low Tajima’s D statistics [17, 18], a common test for deviation from the expected rate of mutation and genetic drift in the absence of selection, for both the European- and African-descent discovery panels (−1.36 and −2.27, respectively; Table 1). Based on 323 candidate genes re-sequenced in a similar DNA discovery panel, the average Tajima’s D is 0.14 and −0.49 in European- and African-descent populations, respectively [16, 19]. Lower Tajima’s D statistics have been observed for European-descent populations in candidate genes such as TRPV6 [20, 21]; however, lower Tajima’s D statistics have not yet been reported for African-descent populations for any one re-sequenced candidate gene [19]. The Tajima’s D statistic observed here is 2.65 standard deviations from the mean Tajima’s D statistic calculated for 323 candidate genes in African-descent samples, indicating that the pattern of ABCE1 genetic diversity is an extreme outlier and adding weight to the suggestive evidence of positive selection [22].
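To make the statistic concrete, the following is a minimal sketch of the standard Tajima's D calculation from per-site minor allele counts. It is not the software used in this study; the function name `tajimas_d` and the toy inputs are hypothetical and chosen only to show how the sign of D tracks the allele frequency spectrum.

```python
import math


def tajimas_d(minor_allele_counts, n):
    """Tajima's D from per-site minor allele counts among n chromosomes.

    minor_allele_counts: one integer per biallelic segregating site.
    n: number of sampled chromosomes (twice the number of diploid samples).
    """
    counts = [c for c in minor_allele_counts if 0 < c < n]
    s = len(counts)  # number of segregating sites
    if n < 4 or s == 0:
        raise ValueError("need at least 4 chromosomes and 1 segregating site")

    # Mean number of pairwise differences, summed over sites.
    k = sum(2.0 * c * (n - c) / (n * (n - 1)) for c in counts)

    # Constants from Tajima's variance derivation.
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    # Difference between the two diversity estimators, scaled by its SD.
    return (k - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


if __name__ == "__main__":
    n = 48  # e.g. 24 diploid samples
    # Toy spectra (NOT the study data): all singletons vs. all at 50%.
    print("rare-skewed:", tajimas_d([1] * 93, n))
    print("intermediate-skewed:", tajimas_d([n // 2] * 93, n))
```

A spectrum dominated by rare alleles drives the pairwise-difference estimate below the estimate based on the number of segregating sites, so D is negative; an excess of intermediate-frequency alleles gives a positive D.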
Recent genome-wide screens for extreme Tajima’s D statistics based on Perlegen [23, 24] data identified several genomic regions as extreme outliers in African-descent populations [25]. These analyses did not identify the region of chromosome 4 containing ABCE1 as an outlier of this statistic for either population. This is likely due to the ascertainment bias toward high-frequency alleles present in the Perlegen dataset [24, 26]. Carlson and colleagues [25] noted that this bias is evident from the higher mean value for Tajima’s D for 178 candidate genes in the Perlegen data (0.94 for African-descent) compared to the SeattleSNPs data (−0.54 for African-descent). For ABCE1, the Perlegen dataset contains data for only nine SNPs in 23 African-American samples, in contrast to the 93 SNPs in the 24 Yoruban samples described here (http://gvs.gs.washington.edu/GVS/). Of the nine SNPs in the Perlegen dataset, three are common (>5% minor allele frequency), and all three are in high linkage disequilibrium with one another (r²=1 for all pair-wise combinations). Perlegen [23], like HapMap [27], is biased towards common variation, and this bias can impact statistics such as Tajima’s D, which aim to summarize the natural allele frequency distribution of variations in the gene or region of interest.
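For readers unfamiliar with the r² measure of linkage disequilibrium quoted for the three common Perlegen SNPs, here is a minimal sketch of how it is computed for one pair of biallelic SNPs from phased haplotypes. The function name `r_squared` and the example haplotypes are hypothetical; they are not drawn from the Perlegen data.

```python
def r_squared(haplotypes):
    """r^2 between two biallelic SNPs from phased two-site haplotypes.

    haplotypes: list of (a, b) tuples with alleles coded 0 or 1.
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n  # frequency of allele 1 at SNP A
    p_b = sum(b for _, b in haplotypes) / n  # frequency of allele 1 at SNP B
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both SNPs must be polymorphic in the sample")
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / denom


if __name__ == "__main__":
    # Hypothetical sample of 46 chromosomes in which the two minor
    # alleles always occur together (complete correlation).
    linked = [(1, 1)] * 5 + [(0, 0)] * 41
    print(r_squared(linked))
```

When r² equals 1 for every pair, the SNPs carry the same information, so three such SNPs contribute effectively one independent observation of the frequency spectrum.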
Our variation discovery dataset, in contrast to the genome-wide datasets such as Perlegen [23] and HapMap [27], is based on re-sequencing that is less likely to be influenced by ascertainment bias [28]. The extremely low Tajima’s D signal detected in these re-sequencing data may be due to positive selection, but the possibility of population expansion cannot be formally excluded in this dataset. By comparison, it has been postulated that the signature of selection in and around TRPV6 in European-descent populations is more likely to be due to positive selection than population demography based on extensive simulations [21]. Also, that signature is observed only among European-descent populations, which suggests local adaptation [21]. Despite the uncertainty in the interpretation of the Tajima’s D statistic for ABCE1, it is notable that other host candidate genes associated with HIV-1, such as CCR5 (Tajima’s D = 2.2 in Europeans [29]) and APOBEC3G [30], exhibit genetic signatures of natural selection in human populations. Indeed, pathogens in general are thought to have had a major impact on the landscape of the host genome over the course of human history [31, 32].
Based on the preliminary evidence that ABCE1 may be subject to positive selection, and given its putative involvement in HIV-1 replication, we expanded our re-sequencing efforts to include clinical samples collected from volunteers enrolled in three separate study cohorts from the Seattle area: African-American HIV-1 positive individuals (n=53), HIV-1 positive, long-term non-progressing individuals (n=28), and HIV-1 high-risk seronegative individuals (n=10) [33, 34]. Overall, the pattern of genetic diversity in these populations was similar to that observed for the variation discovery dataset. That is, Tajima’s D is similar between the African-American HIV-1 infected individuals and the African population re-sequenced for SNP discovery: −2.34 and −2.27, respectively (Table 1). The other patient populations also have negative Tajima’s D statistics that are consistent with a trend toward an excess of rare alleles despite the small sample sizes.
In addition to similar patterns of overall genetic diversity, both the HIV-1 infection-related populations and the SNP discovery panels have similar patterns of intronic and exonic diversity (Tables 1 and 2). Overall, we identified almost twice as many SNPs among the African-American HIV-1 infected individuals compared with the Yoruban sample (170 vs. 93), but this was expected given that the sample size for the HIV-1 infected individuals was twice as large as the SNP discovery set (Table 1). In fact, of the SNPs not in common between these two sample sets, ~90% were rare (<5% minor allele frequency). Of the common SNPs not shared between the two African-descent samples, six were found only in the CEPH variation discovery and the African-American HIV-1 positive individuals datasets but not among the Yoruban sample dataset, suggesting that these variations represent a genetic admixture typical of African-descent populations ascertained in the United States [35]. Among the common SNPs shared between the two African-descent populations, only one (intronic rs34492893) was significantly less frequent among HIV-1 positive African-American samples (MAF=5%) compared to the samples from presumably healthy Yorubans (MAF=15%; p=0.05; Table 3). No differences in allele frequencies were observed when the CEPH data were compared with the data from the European-descent individuals of the long-term non-progressing cohort (data not shown). It is possible that the allele frequency difference observed for rs34492893 is due to statistical fluctuation from small sample sizes and/or to differences in European admixture in African-Americans and Yorubans [36]. Additional tests of association with larger sample sizes are needed to confirm this potential difference between HIV-1 positive and general population samples of African-descent.
For exonic diversity, no nonsynonymous variation was identified in re-sequencing any of the patient populations or the SNP discovery panels. We identified a total of five synonymous SNPs and eight diallelic variants in the untranslated regions (Table 2). All identified exonic variations had minor allele frequencies of 7% or less in their respective samples.
The genetic profile of ABCE1 observed among African-descent populations suggests positive selection, and it is intriguing to speculate that the gene’s function is associated with this selection event. It is possible, however, that ABCE1 is contained within the region that exhibits the signature of selection but is not the locus under selection. To better define the boundaries of the signature of selection on chromosome 4, we merged the SeattleSNPs discovery dataset with the NIEHS Environmental Genome Project discovery dataset [37] and surveyed the genetic diversity of a 184.5kb region that contains the candidate genes ANAPC10, ABCE1, and OTUD4 in 12 overlapping Yoruban samples (Figure 1). ANAPC10 (anaphase promoting complex subunit 10) has been reported to be essential for mitosis [38, 39]. Recent expression studies suggest that ANAPC10 is highly expressed in glioblastoma endothelial cells compared with normal brain and other tissues [40]. Interestingly, the neighboring gene OTUD4 (OTU domain containing 4) was identified in a screen for HIV-associated chimeric provirus-host gene transcripts [41]. Tajima’s D for this region in the Yoruban samples was low (−1.72), suggesting the signature for positive selection is contained within this expanded region surveyed. This region also exhibits relatively low levels of linkage disequilibrium (Figure 2), a finding which may be expected given the excess of rare variation.
Based on our variation discovery efforts, we show here that the genomic region containing ABCE1 has an excess of rare variation, resulting in a signature of natural selection in African-descent populations. The product of ABCE1 is highly conserved across species and is essential for life [4]. However, given that our data suggest positive selection across ABCE1 and surrounding genomic regions in African-descent compared to European-descent populations, it is possible that some factor separate from basic biological conservation is responsible for this signature. It is intriguing to speculate on the trigger of the selection event, given ABCE1’s role in HIV-1 assembly. Notably, an ABCE1 insertion/deletion variant (rs9333571) has been reported as associated with reduced HIV-1 replication [13]. This genetic variant was rare in its respective cohort (MAF=1%) and, as such, not identified in our small cohorts. Rare genetic variations such as this indel will require deep re-sequencing efforts in patient populations to identify associations with complex phenotypes [42]. These rare variations will likely be missed by current genome-wide association studies, which rely on common variation and linkage disequilibrium to detect associations for variations not directly genotyped in the experiment [43].
The advent of HIV-1/AIDS is likely too recent in human history to have left a detectable footprint in present day populations, and we cannot formally exclude the possibility that the signature we are observing is due to recent population expansion. Further work with larger cohorts is needed to better understand potential sources of ABCE1 positive selection and to better characterize any possible association of rare variation within the ABCE1 gene in African-Americans with HIV-1 infection.
We would like to thank Dr. Jaisri Lingappa (University of Washington) for constructive comments, Devon Livingston-Rosanoff (VIDI) for technical assistance and Reneé Ireton for technical editing of the report. The work was funded by NIH grants 1 R21 A1073115-01 (JL), U01 HL66682 (DAN), U01 HL66642 (DAN), N01 ES-15478 (DAN), AI47806 (MJM), AI35605 (MJM), and AI057005 (MJM). ECS was supported by T32 grants AI007140 and GM0726 from National Institutes of Health, funding from the Seattle Chapter of ARCS (Achievement Rewards for College Scientists), and the Poncin Scholarship Fund. M.J.M. is a recipient of the Burroughs Wellcome Clinical Scientist Award for Translational Research. This publication/presentation/grant proposal was made possible with help from the University of Washington Center for AIDS Research (CFAR), an NIH funded program (P30 AI027757), which is supported by the following NIH Institutes and Centers (NIAID, NCI, NIMH, NIDA, NICHD, NHLBI, and NCCAM).
The authors have no conflict of interest to report.