We have tested the hypothesis that some groups of humans have recently experienced more evolutionary change at loci found to be associated with obesity and T2D compared to the rest of the genome. We have examined FST, a measure of population differentiation, and measures of shared extended haplotypes indicative of recent positive selection on new variation. Although our findings are not entirely consistent across tests, they have uncovered general as well as population- and gene-specific patterns.
First, with respect to the derived vs. ancestral status of the risk alleles, we find no evidence that the risk alleles tend to be either ancestral or derived for either the obesity or the T2D loci. We expected that if the thrifty genotype hypothesis applied specifically to the entire human species as an outlier among other primates, a majority of risk alleles would be derived. However, it is difficult to draw any firm conclusions on the basis of this finding, since we are only considering markers that are still polymorphic in humans, and since the risk alleles that are reported in GWASs are unlikely to be the causative alleles, and are instead likely to only be associated with the causative variants.
Given the above-mentioned limitation, and the fact that these specific variants have been found to explain a very small proportion of the expected genetic variance (Hofker et al., 2009; Willer et al., 2009), we chose to examine average FST and haplotype patterns in the regions surrounding each reported risk SNP (up to 800 kb in either direction). This enabled us to take into account more of the variation that is associated with any particular SNP, and may give some indication as to the timing and strength of selection. For example, depending on the population, an elevated GSFST that extends over a long stretch of DNA may indicate more recent positive selection. Also, averaging FST over many SNPs in a region may be a more sensitive approach given the highly variable nature of FST across neighboring loci.
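The windowed averaging described above can be illustrated with a minimal sketch. It assumes per-SNP FST values have already been computed and uses a simple unweighted mean; the function name `mean_fst_in_window`, the toy positions, and the toy FST values are hypothetical and are not taken from the study, whose exact averaging scheme is not specified in this section.

```python
from statistics import mean


def mean_fst_in_window(positions, fst_values, focal_pos, half_width):
    """Unweighted mean of per-SNP FST within +/- half_width bp of focal_pos.

    Assumption: FST has already been estimated for each SNP; this helper
    only averages those values over a window centred on the risk SNP.
    """
    in_window = [
        fst
        for pos, fst in zip(positions, fst_values)
        if abs(pos - focal_pos) <= half_width
    ]
    # Return None when no genotyped SNP falls inside the window
    return mean(in_window) if in_window else None


# Toy data (hypothetical): base-pair positions and per-SNP FST values
positions = [100_000, 450_000, 500_000, 520_000, 900_000, 1_400_000]
fst_values = [0.02, 0.30, 0.10, 0.20, 0.08, 0.50]
focal = 500_000  # position of the reported risk SNP

# Average over progressively larger windows around the risk SNP
for half_width in (50_000, 400_000, 800_000):
    print(half_width, mean_fst_in_window(positions, fst_values, focal, half_width))
```

Comparing the mean across window sizes gives a rough picture of how far an elevated signal extends from the risk SNP.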
We have found that the regions harboring T2D loci, as an ensemble, have experienced unusually high levels of differentiation compared to random regions of the genome, as assessed by the genotyped SNPs. Differentiation decays with distance as expected. Obesity loci, as an ensemble, also show unusually high levels of differentiation, but to a lesser extent than T2D loci. Our results further suggest that East Asians and sub-Saharan Africans have experienced higher levels of group-specific differentiation than other groups at the ensemble of T2D loci. We also find, as expected, that the degree of differentiation quickly decays with larger window sizes for sub-Saharan Africans, given overall reduced LD in these groups.
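One common way to judge whether a candidate region is unusually differentiated relative to random regions of the genome is an empirical p-value. The sketch below is an illustration under that assumption only; the function name `empirical_p_value` and the toy values are hypothetical, and the procedure actually used in the study may differ.

```python
def empirical_p_value(observed, null_values):
    """Share of random-region values at least as large as the observed one.

    Assumption: null_values holds the same statistic (e.g. windowed mean
    FST) computed for randomly chosen genomic regions. One is added to the
    numerator and denominator so the estimate is never exactly zero.
    """
    exceed = sum(1 for value in null_values if value >= observed)
    return (exceed + 1) / (len(null_values) + 1)


# Toy data (hypothetical): mean FST in nine random regions and one candidate
random_region_fst = [0.05, 0.08, 0.10, 0.12, 0.07, 0.09, 0.11, 0.06, 0.13]
candidate_fst = 0.125

print(empirical_p_value(candidate_fst, random_region_fst))
```

A small value indicates that few random regions are as differentiated as the candidate, which is an outlier statement and not in itself proof of selection.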
Pickrell et al. (2009) used the same dataset to examine the single SNP with the highest FST in each T2D-associated region (within a 100 kb window) and found that sub-Saharan Africans are significantly differentiated from East Asians and Europeans at these loci. Our results confirm this finding and also uncover a high degree of differentiation among East Asians at larger window sizes. Our results also confirm the findings of Pickrell et al. and others (Helgason et al., 2007; Southam et al., 2009) that the loci TCF7L2, JAZF1, and TSPAN8 show signatures of natural selection.
A high degree of differentiation is usually interpreted as being due to natural selection. However, various types of natural selection could explain any given pattern of differentiation: purifying selection in one or several groups, or positive selection in one group but not others, or in all groups except one. These could represent one of several evolutionary or historical scenarios. One is that there was selection either for or against insulin resistance among East Asians and sub-Saharan Africans. Another possibility is that East Asians and sub-Saharan Africans underwent a relaxation of selection pressures at these genes due to a diet that did not select for insulin resistance and gluconeogenesis.
Among the T2D-associated loci, we find that HHEX is the most strongly differentiated between groups, specifically between East Asians and other groups. HHEX (hematopoietically-expressed homeobox protein) encodes a transcriptional regulator involved in pancreatic development (Bort et al., 2004). The risk allele has been found to be associated with reduced pancreatic β-cell function (Pascoe et al., 2007), and there is evidence that HHEX belongs to a highly conserved “genomic regulatory block” (Ragvin et al., 2010). Among sub-Saharan Africans, no single locus explains the overall T2D trend, suggesting that it is the effect of many moderately differentiated loci that contributes to the overall pattern for the ensemble of T2D loci.
Among the obesity loci, NEGR1 (Neuronal growth regulator 1) was found to be highly differentiated among sub-Saharan Africans. This gene has a role in neuronal outgrowth (Schafer et al., 2005) and is highly expressed in the hypothalamus (Willer et al., 2009). The region surrounding this allele has also been found to contain a large copy number polymorphism that could be a causal variant (Willer et al., 2009).
Whereas FST is most powerful for detecting selection on standing genetic variation that is present on multiple haplotype backgrounds and becomes favoured in one geographic region, REHH is most powerful for detecting recent strong positive selection on a novel mutation that has reached an intermediate haplotype frequency in the population. Our results show that there are differences among groups in the number of obesity loci that show evidence of recent positive selection according to the REHH test. It appears that South Asians and Europeans exhibit more such loci compared, most notably, to sub-Saharan Africans and Native Americans.
An interesting result from the FST and REHH tests is the case of the region near NCR3. The risk SNP is near both NCR3 (natural cytotoxicity triggering receptor 3 precursor) and AIF1 (allograft inflammatory factor 1). Being in the HLA region of chromosome 6, this region appears to be highly conserved among different human populations, since it shows very little differentiation compared to the rest of the genome (see Supplementary Figure 2). However, this same region has among the strongest evidence of extended haplotypes among South and East Asian populations. In instances of extended haplotypes containing the risk SNP, we have determined that it is the risk allele that is contained in these haplotypes, suggesting that selection has recently favored variation in this region that enables individuals to avoid an energy deficit.
Finally, for the XP-EHH test, we do not observe major differences in signals between groups for the obesity loci. However, for the T2D loci, we find that East Asians exhibit more evidence of recent positive selection, most notably at HHEX and THADA, a result that converges with the extreme differentiation that East Asians exhibit at HHEX.
Although we do observe some overlap of genetic and geographic regions identified by the three tests considered (FST, REHH, XP-EHH), the lack of complete overlap could be due to several factors. Haplotype-based measures, such as those based on EHH, test for a very restricted and likely rare set of cases: positive selection acting on newly arisen variation that is relatively recent, and in which the haplotype quickly rises to an intermediate frequency in a given population (best seen by EHH) or to a high frequency in one population but not another (best seen by XP-EHH). Therefore, the congruence of these various tests will depend on the type, timing, and strength of selection for each particular genetic and geographical region.
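For readers unfamiliar with haplotype-based tests, the basic EHH quantity can be sketched as follows. This is a minimal illustration of the standard definition (the probability that two randomly chosen chromosomes carrying the core allele are identical from the core out to a given marker), not the pipeline used in this study. The function name `ehh` and the toy haplotypes are hypothetical, and the normalisations that turn EHH into REHH or XP-EHH are deliberately omitted.

```python
from collections import Counter
from math import comb


def ehh(haplotypes, core_index, core_allele, end_index):
    """Extended haplotype homozygosity from the core marker out to end_index.

    Assumption: each haplotype is a string of phased alleles, one character
    per marker, and all haplotypes cover the same markers.
    """
    # Keep only chromosomes that carry the core allele
    carriers = [h for h in haplotypes if h[core_index] == core_allele]
    n = len(carriers)
    if n < 2:
        return None  # homozygosity is undefined for fewer than two carriers
    lo, hi = sorted((core_index, end_index))
    # Group carriers by their allele string over the interval of interest
    class_counts = Counter(h[lo:hi + 1] for h in carriers)
    # Fraction of carrier pairs that are identical over the whole interval
    return sum(comb(c, 2) for c in class_counts.values()) / comb(n, 2)


# Toy phased haplotypes (hypothetical), five markers each
haplotypes = ["10110", "10111", "10110", "00100", "10010", "00101"]

# EHH for core allele "1" at marker 0, at increasing distance from the core
for end in range(5):
    print(end, ehh(haplotypes, core_index=0, core_allele="1", end_index=end))
```

EHH starts at 1 at the core and decays with distance as recombination and mutation break up the haplotype; a slow decay around a common allele is the pattern that the haplotype-based tests look for.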
Our findings, along with other published evidence, appear to be slightly more consistent with the hypothesis that cycles of feast and famine were as or more severe among agricultural populations (Benyshek et al., 2006; Cordain et al., 1999). The REHH results among South Asians, which suggest recent natural selection favoring obesity risk alleles, are consistent with evidence of major famines in South Asia (Wells, 2007). It may be that the adoption of agriculture, along with its associated features such as sedentary life-ways, resulted in an inflexible over-reliance on a more highly variable food supply. We find that while Eurasian populations show REHH signatures of selection at several obesity loci, American and sub-Saharan African populations show signs of selection at only one locus each (). This is consistent with the fact that the relatively isolated sub-Saharan and Native American populations adopted agriculture later than Eurasian populations. These findings should be interpreted with caution, since the loci that we have examined have been associated with body weight only among European or European-derived populations. If agriculture did indeed select for thrifty genes, we are left with the puzzle of explaining why rates of obesity and T2D are relatively low among individuals of European ancestry, for example. It suggests that non-genetic factors could more readily explain population differences in obesity and T2D prevalence. These could be environmental factors that are not yet well understood (Gravlee et al., 2009; McAllister et al., 2009), including infectious agents (Ley et al., 2006; Vijay-Kumar et al., 2010; Wells, 2009; Whigham et al., 2006) that would perfectly track genetic admixture proportions.
A limitation of our findings is that we have tested whether these candidate loci are outliers compared to the rest of the genome with respect to population differentiation and extended haplotype homozygosity. It is presently difficult to determine with certainty whether such outlier loci are the result of natural selection, as opposed to other evolutionary forces such as genetic drift. Another limitation of our findings, as mentioned above, is that we have examined loci that have been found to be associated with these traits among Europeans. Although several GWASs have recently been conducted in other populations (Cho et al., 2009; Liu et al., 2010; Tsai et al., 2010), there is still some uncertainty as to whether the same loci explain variation in these traits in different populations. If, as we have shown, there has been differentiation and selection at these loci, it may be that the genetic architecture of these traits is different in different populations. There may be loci that affect these traits in non-Europeans that we have not considered in this analysis. Our findings of greater evidence of thrifty genes among Eurasians may therefore be biased by the possibility that these loci are found to be associated in GWASs in Europeans, precisely because they underwent recent selection in those groups. It should also be noted that our results could be influenced by a subset of the several populations within each broader group that we are using.
In conclusion, our results have shown that genetic regions surrounding loci associated with T2D, and to a lesser extent, obesity, have been subject to unusually high levels of change in the last 50,000 to 100,000 years. Most notably, sub-Saharan Africans and East Asians appear to have undergone selection at T2D loci. Identifying specific targets of recent selection in the human genome can aid in determining population-specific risk variants, especially insofar as prevalence differs between populations (Ayodo et al., 2007). We anticipate that future studies will be at a finer scale at the population, genetic, and phenotypic levels, potentially further elucidating the genetic basis of obesity and T2D, and the population-specific genetic or non-genetic mechanisms that lead to different rates, types, and consequences of obesity and T2D.