Although recent studies have shown that human genomes contain hundreds of loci that exhibit signatures of positive selection, variants that are associated with adaptation in energy-balance regulation remain elusive. We reasoned that the difficulty in identifying such variants could be due to heterogeneity in selection pressure and that an integrative approach that incorporated experiment-based evidence and population genetics-based statistical judgments would be needed to reveal important metabolic modifiers in humans.
To identify common metabolic modifiers that underlie phenotypic variation in diabetes-associated or obesity-associated traits in humans, or both, we screened 207 candidate loci for regulatory single nucleotide polymorphisms (SNPs) that exhibited evidence of gene–environmental interactions.
Three SNPs (rs3895874, rs3848460, and rs937301) at the 5′ gene region of human GIP were identified as prime metabolic-modifier candidates at the enteroinsular axis. Functional studies have shown that GIP promoter reporters carrying derived alleles of these three SNPs (haplotype GIP−1920A) have significantly lower transcriptional activities than those with ancestral alleles at corresponding positions (haplotype GIP−1920G). Consistently, studies of pregnant women who have undergone a screening test for gestational diabetes have shown that patients with a homozygous GIP−1920A/A genotype have significantly lower serum concentrations of glucose-dependent insulinotropic polypeptide (GIP) than those carrying an ancestral GIP−1920G haplotype. After controlling for a GIPR variation, we showed that serum glucose concentrations of GIP−1920A/A homozygotes are significantly higher than those of patients carrying an ancestral GIP−1920G haplotype (odds ratio 3.53 for a glucose level exceeding the 140 mg/dL screening threshold).
Our proof-of-concept study indicates that common regulatory GIP variants impart a difference in GIP and glucose metabolism. The study also provides a rare example of the common variant–common phenotypic variation pattern identified on the basis of evidence of moderate gene–environmental interactions.
Taking advantage of the availability of genome information from diverse human populations, recent studies have revealed that human genomes contain hundreds of loci that exhibit signatures of positive selection (1,2). A number of these variants have been shown to describe phenotypic variations in appearance, physiologic adaptations, and pathologic responses to diseases, thereby opening doors to human history, unexpected involvement of risk alleles, and novel prognosis power for select diseases (3–10). These recent findings on gene selection have partially affirmed the “common variants underlie common phenotypic variation” paradigm.
Complex multifactorial diseases such as diabetes and obesity have been hypothesized to be a reflection of maladaptation of previously advantageous alleles to current environments, and the divergent manifestation of different metabolic syndromes could be associated with adaptation in response to heterogeneous changes in human culture (11,12). Therefore, the revelation of adaptive variants that are associated with energy-balance regulation could reveal novel regulatory mechanisms underlying diverse metabolic syndromes and how such variants contribute to phenotypic variation. However, such variants could be subject to selection pressures that fluctuate over time or reverse in a short time in response to shifting human culture; in this way, these variants are more likely to be associated with incomplete signatures of selection (13,14). Thus, approaches for detecting signatures of complete selection may lack the power to detect metabolic modifiers that confer recent adaptations in energy-balance regulation or that are associated with phenotypic variation in diabetes- and obesity-related traits. We consequently reasoned that an integrative approach that embraces both wet-laboratory experiment-based evidence and statistical judgments of newly available population genetics data is needed to reveal important metabolic modifiers.
With this understanding, we applied two intersecting criteria to explore metabolic modifiers that have been subject to selection since Eurasians split from Africans. First, we generated a set of a priori diabetes-related and/or obesity-related candidate loci based on previous clinical investigations, genome-wide association (GWA) studies, and quantitative trait loci analyses (6,15–20).
Second, we conducted a variety of genomic scans of the candidate genic regions for evidence of integrative signals of positive selection. Genetic variants in genic regions are major focuses of the scans because most contributions of common variation to complex traits are regulatory or nonsynonymous in nature (21). In addition, we focused the analysis on variants that have a derived allele frequency >30% in the overall HapMap population because these variants presumably would affect most of the population and have a greater power for the detection of phenotypic association.
Using this two-step approach, we identified common variants in the 5′ gene region of CDKAL1, CYB5R4, GAD2, PPARG, and GIP as candidates for metabolic modifiers. Among these variants, those in GIP are of particular interest because they are highly linked to each other as well as to a functional nonsynonymous mutation (rs2291725) in exon 4 of GIP; that is, in a high linkage-disequilibrium (LD) block (22).
Glucose-dependent insulinotropic polypeptide (GIP) secreted from duodenal and jejunal K-cells is one of the two incretin hormones (i.e., glucagon-like peptide-1 [GLP-1] and GIP) that stimulate insulin release after food intake in humans (16). In addition, we know that patients with type 2 diabetes or in late pregnancy have a depressed β-cell response to GIP compared with healthy individuals, and GIP antagonism has been proposed as a strategy for the treatment of obesity (23). Thus, the characterization of potential modifiers resulting from prior GIP–environment interactions is relevant to a better understanding of molecular mechanisms that underlie diabetes and obesity as well as phenotypic variations that are associated with these diseases. Here, based on an integrative screening approach and proof-of-concept study of GIP variants, we show that it is possible to identify metabolic modifiers by studying genes that exhibit evidence of moderate gene–environmental interactions, and that regulatory GIP variants impart phenotypic variation in GIP response and glucose metabolism.
This study was approved by the institutional ethics committee review boards of Chang Gung Memorial Hospital Linkou Medical Center and Chang Gung University. We recruited 123 patients, and all patients gave written informed consent to participate in the study. The screening glucose challenge test for gestational diabetes was performed as previously described (24).
Genomic DNA samples of participants were extracted and purified from anticoagulated blood with the DNeasy Blood & Tissue Kit (Qiagen, Venlo, Netherlands). Genotyping of SNPs was performed using the TaqMan Validated SNP Genotyping Assays (Applied Biosystems, Foster City, CA). The genotyping analysis had a >97% success rate and >99% reproducibility.
Blood samples were collected from patients for hormone measurement 1 h after administration of the oral glucose tolerance test. Total GIP, insulin, and C-peptide levels in human serum were measured using sandwich ELISA kits (Millipore, Billerica, MA; Mercodia, Uppsala, Sweden; and Calbiotech, Spring Valley, CA).
A 2.15-kb fragment of human GIP promoter with the A allele at rs3895874, rs3809770, rs3848460, and rs937301 (−2073 bp to +77 bp) was chemically synthesized (Genescript Inc., Piscataway, NJ) and subcloned into the pGL4.2 luciferase reporter vector (Promega Corp., Madison, WI). Promoter fragments with ancestral haplotypes were obtained using site-directed mutagenesis. In the promoter reporter study, each experiment was conducted at least three times with three or four replicates for each treatment.
Patients’ glucose challenge responses and serum hormone profiles were compared with the χ2 test or the Student t test with Welch correction. Ratios of patients with glucose levels exceeding the threshold were analyzed with the χ2 test. All P values were two-sided. All data were presented as mean ± SEM, and the statistical significance cutoff value was 0.05.
To systematically identify putative metabolic modifiers, we curated and screened SNPs in 207 gene loci that have previously been implicated in the regulation of diabetes-related and/or obesity-related traits using FST (Supplementary Table 1). The empiric distribution of the FST statistic has been used to detect genomic regions that have rapidly increased in frequency as a result of local selective pressures (27). Of these 207 genes, 59 carried genic variants with FST values in the top 5% bracket in comparisons between the corresponding HapMap II populations YRI (Yoruba from Ibadan), CEU (U.S. residents with northern and western European ancestry), and ASN (pooled samples of Chinese from Beijing [CHB] and Japanese from Tokyo [JPT]; Supplementary Table 1). The genic region in 29 of these 59 genes also contained another indication of local selection: long haplotypes in one of the HapMap populations.
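As a minimal sketch of what an FST value measures for a single biallelic SNP, the following Python snippet applies the basic Wright definition, FST = (HT − HS)/HT, to the allele frequencies of two equally weighted populations. The text does not state which FST estimator was used in the screen, so this estimator, the function name `wright_fst`, and the example frequencies are assumptions for illustration only.

```python
def wright_fst(p1: float, p2: float) -> float:
    """Wright's FST for one biallelic SNP in two equally weighted populations.

    p1, p2: frequency of the same allele in population 1 and population 2.
    NOTE: illustrative estimator; not necessarily the one used in the study.
    """
    for p in (p1, p2):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")

    # Expected heterozygosity of the pooled population (HT).
    p_bar = (p1 + p2) / 2.0
    h_total = 2.0 * p_bar * (1.0 - p_bar)
    if h_total == 0.0:
        return 0.0  # SNP is monomorphic in both populations

    # Mean expected heterozygosity within populations (HS).
    h_within = (2.0 * p1 * (1.0 - p1) + 2.0 * p2 * (1.0 - p2)) / 2.0

    return (h_total - h_within) / h_total


# Hypothetical frequencies: an allele that is rare in one population
# and common in the other yields a high FST.
fst_example = wright_fst(0.2, 0.8)
print(f"FST = {fst_example:.2f}")
```

In a screen such as the one described here, a value like this would be compared against the empiric genome-wide distribution of FST rather than interpreted on its own.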
On the basis of the presence of highly divergent allele frequencies between Eurasians and Africans, the presence of LD and extended haplotypes in Eurasians, and a >30% minor allele frequency in the overall Eurasian population, we identified seven SNPs in the 5′ gene region of CDKAL1, CYB5R4, GAD2, GIP, and PPARG as potential common metabolic modifiers (Table 1). In earlier GWA studies, select variants in CDKAL1, CYB5R4, GAD2, and PPARG were associated with type 2 diabetes-related or obesity-related traits (28–31). By contrast, there has been no report of an association of GIP variants with these traits in GWA studies. Because the three GIP variants (rs3895874, rs3848460, and rs937301) are highly linked compared with those that appeared alone in other candidate genes (32), and because these GIP variants are partially linked with a nonsynonymous GIP SNP (rs2291725) (22), these GIP variants have a low likelihood of being false positives (Table 1). Hence, we focused subsequent functional analysis on GIP variants.
Fine mapping of the GIP locus showed that these variants are linked with more than three dozen neighboring SNPs (from rs9904761 to rs3895874 on chr17: 44,311–44,402 kb), and these linked SNPs exhibited FST values in the top 2–10% bracket in comparisons between YRI and ASN (Supplementary Fig. 1) (33). A plot of a 250-kb region of genotypes around GIP in the three HapMap II populations showed that whereas genotypes surrounding GIP in YRI exhibited a high degree of homozygosity for ancestral alleles (Fig. 1A, upper panel), in the same region, genotypes of ASN and CEU exhibited a high degree of homozygosity for the derived alleles (Fig. 1A, middle and lower panels). Consistently, analysis of genic SNPs at the GIP locus between 11 populations from the HapMap III project (34) showed that FST values were the highest for comparisons between East Asian and African populations (Fig. 1B). This result is consistent with our recent finding that a nonsynonymous variant in exon 4 of GIP (rs2291725) and a 71-kb haplotype block surrounding this variant were positively selected in Eurasians in the last 2,000 to 11,800 years (22).
Interestingly, analyses of LD and haplotype block diversity of the genomic region that encompassed the three GIP variants at the 5′ gene region showed that a 91-kb LD block was represented by five and four inferred haplotypes in CEU and ASN chromosomes, respectively (Fig. 2; Supplementary Fig. 2, left panel; Supplementary Fig. 3, SNP No. 36–79). By contrast, the same region was represented by 58 haplotypes in YRI. In addition, these analyses showed that the high FST values and the major shift in allele frequencies of GIP variants at the 5′ gene region between Eurasian and African populations could be attributed to the increase of a derived haplotype (haplotype 50 in Fig. 2) from less than 5% in Africans to more than 50% in Eurasians. Because the three 5′ gene region variants are completely linked with the positively selected rs2291725 in ASN (22), we inferred that rs3895874, rs3848460, and rs937301 at the human GIP promoter region were positively selected in East Asian populations as well.
Importantly, we found that the 5′ gene region variants (i.e., rs3895874, rs3809770, rs3848460, and rs937301 at positions −1920, −1650, −1158, and −320 of GIP, respectively) are located in a haplotype block that is separated from the one containing the nonsynonymous variant rs2291725 by a hotspot for recombination in CEU and YRI (Supplementary Fig. 2, right panel, and Supplementary Fig. 3). In this haplotype block, a tetranucleotide polymorphism was represented by three inferred haplotypes in all three populations (referred to as derived GIP−1920AAAA, ancestral GIP−1920GAGG, and ancestral GIP−1920GGGG haplotypes in the following text; Table 2). The derived GIP−1920AAAA haplotype, which was found in 18.3% of YRI chromosomes, has become the dominant haplotype and has a frequency exceeding 50 and 75% in CEU and ASN, respectively (Table 2). By contrast, the dominant ancestral haplotype in YRI chromosomes (GIP−1920GAGG, 50.8%) was only found in 1.1% of ASN chromosomes. Therefore, these 5′ gene region variants and the nonsynonymous variant rs2291725 could be selected differentially in the overall HapMap populations and represent causal mutations independent of the nonsynonymous variant rs2291725.
Because only functional investigations can convincingly demonstrate causal mutations, we sought to obtain direct evidence that these 5′ gene region SNPs represent causal mutations for the population genetics observation. We constructed and tested GIP promoter reporters with each of the three major haplotypes in transfected human embryonic kidney (HEK) 293T cells (Fig. 3A). Measurement of promoter reporter activities showed that constructs with an ancestral haplotype (GIP−1920GAGG or GIP−1920GGGG) exhibited luciferase reporter activity 25–45% higher than that of a derived haplotype (GIP−1920AAAA, P < 0.01; Fig. 3B). Because the GIP promoter region contains elements that are important for regulation by transcription factors, including PAX6 and GATA4 (23), we also determined whether the GIP promoter reporter activity was haplotype-dependent in the presence of these transcription factors. As expected, coexpression of PAX6 or GATA4 increased the basal reporter activities by 1.5–2.5-fold and 0.7–0.8-fold, respectively (Fig. 3B). Importantly, we found that reporters with an ancestral haplotype consistently exhibited significantly higher activities than those containing the derived haplotype in the presence of PAX6 or GATA4 (P < 0.01). Together, these results suggested that derived alleles at rs3895874, rs3848460, and rs937301, but not rs3809770, represent functional mutations.
Given that earlier genome studies have not reported an association between GIP variants and any trait, we speculated that the partial selection of GIP variants in Eurasian populations could be associated with adaptation at a life stage that is vulnerable to environmental changes and that has not been specifically studied. Because pregnancy represents a critical life stage that subjects individuals to excessive metabolic load, and because its success has a major impact on reproductive fitness, we hypothesized that studies of phenotypic variation during pregnancy may provide a sensitive model to investigate the role of GIP haplotypes.
To test for an allelic effect of GIP variants, we studied East Asian patients for proof-of-concept testing because selection pressures are most likely ongoing in populations that exhibit the most significant evidence. We genotyped a panel of 123 unrelated Han-Chinese women who underwent a screening glucose challenge test for gestational diabetes during the 23rd to the 29th weeks of pregnancy, and these patients were assigned to three genotype groups (GIP−1920G/G, GIP−1920G/A, or GIP−1920A/A) based on alleles at rs3895874, rs3848460, and rs937301. The frequency of rs3895874 in these patients was similar to that of the ASN population and was in Hardy–Weinberg equilibrium; frequencies of GIP−1920G/G, GIP−1920G/A, and GIP−1920A/A genotypes were 0.089, 0.480, and 0.431, respectively. In addition, alleles at rs3895874 in these patients were in absolute LD with those at rs3848460 and rs937301.
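The Hardy–Weinberg statement above can be checked with a one-degree-of-freedom χ2 goodness-of-fit test, sketched below in Python using only the standard library. The genotype counts (11, 59, and 53) are not given in the text; they are back-calculated here by multiplying the reported genotype frequencies by the 123 participants, and should be read as an assumption. The function name `hwe_chi_square` is likewise hypothetical, and the text does not say which test the authors used.

```python
import math


def hwe_chi_square(n_gg: int, n_ga: int, n_aa: int) -> tuple[float, float]:
    """Chi-square test (1 df) of Hardy-Weinberg proportions for a biallelic SNP.

    Returns (chi-square statistic, two-sided P value).
    """
    n = n_gg + n_ga + n_aa
    p_g = (2 * n_gg + n_ga) / (2 * n)  # frequency of the G allele
    p_a = 1.0 - p_g                    # frequency of the A allele

    # Genotype counts expected under Hardy-Weinberg equilibrium.
    expected = (n * p_g ** 2, n * 2 * p_g * p_a, n * p_a ** 2)
    observed = (n_gg, n_ga, n_aa)

    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# ASSUMPTION: counts inferred from 0.089, 0.480, and 0.431 of 123 patients.
chi2_hwe, p_hwe = hwe_chi_square(n_gg=11, n_ga=59, n_aa=53)
print(f"chi2 = {chi2_hwe:.3f}, P = {p_hwe:.3f}")
```

A P value above the 0.05 cutoff would indicate no detectable departure from Hardy–Weinberg proportions in this panel.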
Measurements of serum GIP and glucose levels showed that circulating GIP and glucose at 1 h after the challenge test ranged from 20.6 to 219.9 pg/mL and from 72 to 230 mg/dL, respectively. Consistent with in vitro promoter reporter assays, GIP levels in patients carrying the ancestral GIP−1920G haplotype (GIP−1920G/A heterozygote and GIP−1920G/G homozygote) were significantly higher than those with a homozygous GIP−1920A/A genotype (Fig. 4A, Table 3). By contrast, serum levels of glucose, insulin, and C-peptide, as well as age and BMI, were not significantly different among patients (Table 3).
Because two linked GIP receptor (GIPR) variants rs10423928 and rs1800437 (referred to as the GIPR1159C/G mutation in Table 3) have recently been shown to be associated with glucose and insulin levels after oral glucose challenge tests as well as the incretin effect in nondiabetic individuals in GWA studies (35,36), we also genotyped these variants and sought to isolate the potential confounding effect of GIPR variants. We found no association between GIPR SNPs and serum glucose or hormone levels. However, within the pool of patients with the dominant GIPR1159G/G genotype, levels of serum glucose, in addition to GIP, were significantly different between GIP−1920A/A homozygotes and the combined group of heterozygotes and GIP−1920G/G homozygotes (Fig. 4B). Moreover, we noticed that after controlling for the variation at GIPR1159, the derived GIP−1920A/A genotype was associated with increased risk of having a glucose level that exceeded the 140 mg/dL threshold (odds ratio 3.53 [95% CI 1.25–9.92]; P = 0.015; Table 3). Among patients with a GIPR1159G/G genotype, 48.3% of GIP−1920A/A homozygotes exhibited glucose levels that exceeded the threshold, whereas only 20.9% of the remaining patients did. Therefore, the homozygous GIP−1920A/A genotype could be associated with a reduced GIP response and reduced capability of maintaining glucose homeostasis.
On the basis of studies of signatures of selection, in vitro promoter assays, and glucose challenge tests in humans, we show that it is possible to identify causal variants related to energy-balance regulation by focusing on genic SNPs that were subject to environmental selection in a subset of candidate genes. Specifically, we demonstrated that GIP variants at the 5′ gene region represent metabolic modifiers that contribute to phenotypic variation in GIP response and glucose metabolism. Further characterization of these causal variants would open a new avenue for understanding the molecular mechanisms that underlie phenotypic variations in energy-balance regulation and improve our ability to stratify and interpret clinical outcomes associated with the GIP signaling pathway.
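To make the odds-ratio arithmetic concrete, the sketch below computes an odds ratio, a Woolf (log-normal) 95% CI, and a Pearson χ2 P value from a 2×2 table in Python. The cell counts are not stated in the text: 14 of 29 GIP−1920A/A homozygotes and 9 of 43 remaining patients above the threshold are back-calculated from the reported 48.3% and 20.9%, and are an assumption. The choice of the Woolf interval is also an assumption, since the text names only the χ2 test; the function name `odds_ratio_2x2` is hypothetical.

```python
import math


def odds_ratio_2x2(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio, Woolf 95% CI, and Pearson chi-square P value (1 df).

    Table layout:
                      above threshold   at or below threshold
        exposed              a                    b
        unexposed            c                    d
    """
    odds_ratio = (a * d) / (b * c)

    # Woolf method: normal approximation on the log odds ratio.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci_low = math.exp(math.log(odds_ratio) - z * se_log_or)
    ci_high = math.exp(math.log(odds_ratio) + z * se_log_or)

    # Pearson chi-square without continuity correction.
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return odds_ratio, ci_low, ci_high, p_value


# ASSUMPTION: counts inferred from 48.3% (14/29) and 20.9% (9/43).
or_est, or_low, or_high, or_p = odds_ratio_2x2(a=14, b=15, c=9, d=34)
print(f"OR = {or_est:.2f} (95% CI {or_low:.2f}-{or_high:.2f}), P = {or_p:.3f}")
```

Under these inferred counts the sketch closely reproduces the reported estimate, interval, and P value, which suggests, but does not confirm, that the reported values were derived in a similar way.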
For decades, adaptive selection was assumed to be rare; however, recent studies have demonstrated that adaptive substitution is pervasive in human genomes (1,7,37). Despite this progress, it is obvious that population differentiation characteristics of human genomes have yet to be fully explored because physiologic consequences of almost all of these past gene–environmental interactions remain to be verified experimentally (1,2). On the other hand, because many selection pressures could be heterogeneous or reversible in a short time, the signature of selection may have eroded in genes that were responsive to cultural selection pressures (12,38) compared with those shielded from heterogeneous selection (e.g., the adaptation to environmental oxygen levels and altitude) (5,39). We therefore reasoned that important metabolic modifiers could be hidden in the trove of SNPs that showed limited evidence of positive selection and that this limitation could be particularly pertinent to modifiers associated with adaptations in response to shifts of subsistence cultures.
Consistent with this hypothesis, a survey of earlier studies of genome-wide or chromosome-wide positive selection using the so-called outlier approaches, in which candidate loci are identified in the extreme tails of empiric distributions (40), showed that GIP variants have not been reported as positively selected (1,7). The selection of GIP variants was inferred after we focused the analysis on local genomic regions and assessed the significance of integrated haplotype score using coalescent simulations (22). Therefore, our investigation provided a proof-of-concept study for identifying causal mutations that underlie phenotypic variation of complex disease-related traits. This approach could open new avenues for improving the translation of common variant association signals into biologic mechanisms that underlie physiologic variability or disease risk.
Neel hypothesized that mismatches between prior adaptations and new environments, or a “conflict of adaptations,” could lead to changes in fitness or health risks (11). Because ancient variants could have been selected for the organism’s reproductive success but not for its health or longevity, the ancient alleles could confer disease risks as selection pressures change. Therefore, studies of positively selected variants that are associated with adaptations in energy-balance regulation could point not only to novel genotype–phenotype relationships but also to novel molecular mechanisms that mediate the potential phenotypic variation, thereby providing much-needed insight into how and which phenotypic variations in energy-balance regulation can be attributed to the selected variants. In support of the thrifty genotype hypothesis, human CAPN10 and house-mouse insulin genes have been shown to exhibit characteristics of adaptive evolution after the emergence of agricultural societies (41,42). Conceptually, the high GIP response associated with the ancestral GIP−1920G haplotype could have been a beneficial energy-conserving mechanism when the food supply was irregular. The ancestral haplotype could have become deleterious in the last 10 millennia as agricultural practice became widespread. One possible deleterious effect of the ancestral GIP−1920G haplotype in an environment that supplies abundant high-starch food resources is the hypersecretion of insulin and insulin resistance (43,44).
On the other hand, a reduced GIP response associated with the derived GIP−1920A haplotype could be protective by decreasing the extent of insulin secretion in the face of oversupply of energy inputs (45,46). In support of this speculation, it has been well documented that in the absence of modern medicine, diabetes-associated complications and, possibly, obesity posed detrimental effects on survival when human culture shifted (47), even though type 2 diabetes is generally considered a chronic disease in modern society. Alternatively, an elevated glucose level associated with the derived haplotype may have improved the survival of fetuses if the population faced serious famine—an event frequently experienced by agricultural societies (48)—despite the reality that an impaired glucose-tolerance response represents a risk to both the mother and fetus under normal circumstances. Moreover, the derived haplotype could have been beneficial by reducing the obesity-promoting effect of GIP (16). In vitro and in vivo studies have shown that GIP promotes fatty liver and other obesity-associated metabolic disorders, whereas GIP antagonists suppress lipid accumulation induced by a high-fat diet (23). Therefore, the derived GIP−1920A haplotype could be selected for its effects on the enteroadipocyte or the enteroinsular axis, or both.
Although we speculated that the derived GIP−1920A haplotype may have provided protective effects in famine-plagued agricultural societies, the observation that the derived haplotype has not been fixed in any population suggests that selection for the derived GIP haplotype(s) (e.g., during cycles of famine) could have been opposed (balancing selection) as populations experienced temporal changes in selection pressure (e.g., resumption of population growth with a stable food supply). Alternatively, the derived GIP haplotype could simply be too young to become fixed, or the spread could be limited by the transgenerational effects associated with abnormal gestational glucose metabolism, which raise the risk of macrosomia and diabetes in the offspring (11).
Although an association between GIP variants and glucose-metabolism regulation has not been reported, GIPR variants were associated with glucose and insulin levels after challenge tests as well as with BMI in GWA studies that evaluated >29,000 individuals (35,36). The finding that patients with a homozygous GIP−1920A/A genotype have significantly higher glucose levels compared with those carrying an ancestral GIP−1920G haplotype within the pool of GIPR1159G/G homozygotes suggested that there is a confounding effect stemming from interactions of GIP and GIPR variants, and that GIP and GIPR variants represent novel markers for the stratification of the capability to maintain glucose homeostasis during pregnancy.
We also speculate that the significant results observed in pregnant women could be related to the fact that the success of pregnancy has a significant impact on reproductive fitness and that a major fraction of gene–environmental selections probably occurred before birth (49). Recent studies have corroborated this idea by showing that associations between many risk alleles and type 2 diabetes can be replicated with smaller sample sizes in patients with gestational diabetes mellitus (50,51).
Furthermore, given the evolutionary signatures at the GIP locus, the plausible molecular mechanism, and the significant results in East Asian women, we speculate that the GIP variant–mediated phenotypic divergence could also exist in most human populations. It is also important to note that the selection of GIP variants represents a unique example in which the selection process involves regulatory variants that alter the glucose-induced GIP response as well as a nonsynonymous variant that affects peptide bioactivity (22). Thus, the GIP signaling pathway could represent a hotspot for selection in recent human history and play an important role in the manifestation of phenotypic variation in energy-balance regulation among individuals.
In addition to GIP variants, our study identified several CDKAL1, CYB5R4, GAD2, and PPARG variants as potential metabolic modifiers. Recent studies have shown that variants in CDKAL1, PPARG, and more than two dozen genes are associated with glycemic traits in diabetic patients (10). Surprisingly, none of the CDKAL1, CYB5R4, GAD2, and PPARG variants identified here have been implicated in earlier GWA studies, which suggests that these variants could be related to novel energy-balance regulatory mechanisms that operate at certain life stages or under specific physiologic conditions that have not been specifically investigated. Future investigations of these variants could reveal additional metabolic modifiers that have arisen recently and their contributions to phenotypic variation in normal human physiology and metabolic syndrome–related traits.
In conclusion, our data demonstrated a strong association between regulatory GIP variants and both GIP response and glucose metabolism, reinforcing the indication from earlier GWA studies of GIPR that GIP signaling plays an important role in diabetes-related traits. Importantly, our study also provided a novel approach to reveal metabolic modifiers by studying consequences of previous mismatches of physiologic capabilities and environments.
S.Y.T.H. received support from a National Institutes of Health award (DK-70652) and the Avon Foundation (02-2009-054). C.L.C. received support from the Chang Gung Memorial Hospital (CMRPG34002).
No potential conflicts of interest relevant to this article were reported.
C.L.C. was involved in DNA and functional testing, was responsible for the sample collection, and primarily wrote the manuscript. J.J.C. processed the raw genome data and performed haplotype analyses, and primarily wrote the manuscript. P.J.C. and H.Y.C. were responsible for the sample collection. S.Y.T.H. was involved in DNA and functional testing, primarily wrote the manuscript, and conceived and supervised the study.
The authors thank Shripa Patel (Stanford University Pan Facility), Wei Yi (Department of Obstetrics/Gynecology, Stanford University), and An Shine Chao and Yao Lung Chang (Department of Obstetrics/Gynecology, Chang Gung Memorial Hospital Linkou Medical Center) for technical assistance. The authors thank Drs. Aaron J.W. Hsueh and Renee A. Reijo Pera (Department of Obstetrics/Gynecology, Stanford University) for helpful comments on the manuscript. C.L.C. further expresses gratitude to Dr. Yung-Kuei Soong (Department of Obstetrics/Gynecology, Chang Gung Memorial Hospital, Taiwan). J.J.C. thanks Dr. Dmitri Petrov (Department of Biology, Stanford University) for his invaluable advice and long-lasting support.
This article contains Supplementary Data online at http://diabetes.diabetesjournals.org/lookup/suppl/doi:10.2337/db10-1331/-/DC1.