Haploinsufficiency of SIM1 is a cause of rare monogenic obesity. To assess the role of SIM1 in polygenic obesity, this gene was analyzed in the Pima Indian population, which has a high prevalence of obesity.
SIM1 was sequenced in 96 individuals. Variants (n = 46) were genotyped in a population-based sample of 3,250 full-heritage Pima Indians and in a separate replication sample of 2,944 predominately non–full-heritage subjects from the same community.
Variants spanning the upstream region of SIM1 through intron 8 were associated with BMI in the full-heritage Pima Indians, where the strongest associations (P ~ 10⁻⁴ to 10⁻⁶) were with common variants (risk allele frequency 0.61–0.67). The difference in mean BMI between homozygotes for the major allele and homozygotes for the minor allele was ~2.2 kg/m2 (P = 2 × 10⁻⁵ for rs3213541). These associations replicated in the separate sample of subjects from the same community (P = 5 × 10⁻³ for rs3213541). The strongest associations (P = 4 × 10⁻⁷, controlled for age, sex, birth year, and heritage) were seen in the combined sample (n = 6,194). The risk allele for obesity was more common in full-heritage Pimas than in the mixed-heritage subjects. Two variants (rs3734353 and rs3213541) were also genotyped in 1,275 severely obese and 1,395 lean control subjects of French European ancestry. The Pima risk alleles were the minor alleles in the European samples, and these variants did not display any significant association (P > 0.05).
Common variation in SIM1 is associated with BMI on a population level in Pima Indians where the risk allele is the major allele.
Obesity is a major cause of health disparities among minority populations. The few major reports of genes contributing to polygenic obesity have predominately focused on populations of European descent. The Pima Indians of Arizona have a high rate of obesity and, as a consequence, have an extraordinarily high prevalence of type 2 diabetes (1). To identify genetic variation that contributes to obesity in Pimas, genes that have a potential role in body weight regulation are being analyzed, and ~50 biologic candidate genes have been analyzed to date with no conclusive association with obesity reported up to now in this specific population. The most significant findings are with the transcription factor SIM1, which is the human homolog of the Drosophila single-minded (sim) developmental gene (2–4). SIM1 is essential for the development of the hypothalamic paraventricular nucleus (5), which plays a central role in the regulation of appetite and body weight (6). Several chromosomal deletions and a balanced translocation involving chromosome 6q16–6q21, which encompasses the SIM1 locus, have been identified in children with monogenic severe obesity (7–14). Mice haploinsufficient for SIM1 are hyperphagic and are sensitive to diet-induced obesity (15–17), whereas mice overexpressing SIM1 have decreased food intake (18,19). Therefore, based on the biological role of SIM1 and the human and mouse genetic studies, SIM1 was analyzed as a candidate gene for obesity in the Pima Indians of Arizona.
All subjects are part of an ongoing population-based longitudinal study of the etiology of type 2 diabetes in the Gila River Indian Community in Central Arizona, where most of the residents are Pima Indians (20). For the initial studies of SIM1, all individuals (n = 3,250) were selected who had available DNA and measures of diabetes and BMI after the age of 15 years and whose heritage was reported as full Pima and/or Tohono O'odham (a closely related tribe). For subsequent replication studies, all participants (n = 2,944) were selected who had available DNA and measures of BMI regardless of heritage (mixed-heritage replication set). Thus, most of the individuals in the replication set were not of full Pima heritage (on average their reported heritage was 1/2 Pima and 3/4 American Indian, which may include other tribes), and there were 72 residents who reported no Native American heritage.
However, 140 full-heritage Pima individuals whose DNA was not available when the initial full-heritage samples were collected were also included in the replication study. BMI was computed for all exams at age ≥15 years and the maximum BMI was used for analysis. Because BMI can be influenced by diabetes progression and disease treatment, the maximum BMI measured at a nondiabetic exam was additionally analyzed. Subjects who did not have a BMI recorded from a nondiabetic exam at age ≥15 years (i.e., developed diabetes before age 15 years or were first seen at our clinic when they were older and had already developed diabetes) were excluded from these analyses; therefore, the nondiabetic BMI analyses were restricted to 2,789 subjects for the full-heritage sample and 2,647 subjects for the replication sample. Thus, the combined sample including the initial full-heritage individuals and those included in the mixed-heritage replication study included 6,194 individuals in total and 5,436 who had been examined when nondiabetic. All subjects provided written informed consent before participation. This study was approved by the Institutional Review Board of the National Institute of Diabetes and Digestive and Kidney Diseases.
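The two BMI phenotypes described above (maximum BMI at any exam at age ≥15 years, and maximum BMI at a nondiabetic exam at age ≥15 years) can be sketched as follows. This is a minimal illustration only; the `Exam` record layout and the function name `max_bmi` are assumptions made for the example and do not come from the study's data files.

```python
from typing import NamedTuple, Optional, Sequence


class Exam(NamedTuple):
    """Hypothetical record for one clinic visit (layout assumed for illustration)."""
    age: float       # age at exam, in years
    bmi: float       # body mass index, kg/m^2
    diabetic: bool   # diabetes status at this exam


def max_bmi(exams: Sequence[Exam], min_age: float = 15.0,
            nondiabetic_only: bool = False) -> Optional[float]:
    """Return the maximum BMI over qualifying exams, or None if no exam qualifies.

    A subject with no qualifying exam (None) would be excluded from that analysis.
    """
    eligible = [
        e.bmi for e in exams
        if e.age >= min_age and not (nondiabetic_only and e.diabetic)
    ]
    return max(eligible) if eligible else None


# Toy subject: one exam before age 15, one nondiabetic exam, one diabetic exam.
exams = [Exam(14, 31.0, False), Exam(22, 34.5, False), Exam(40, 38.2, True)]
overall = max_bmi(exams)                             # 38.2
nondiabetic = max_bmi(exams, nondiabetic_only=True)  # 34.5
```

Returning `None` rather than raising keeps the exclusion step explicit: subjects without a nondiabetic exam at age ≥15 years simply drop out of the nondiabetic analysis.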
French obese children (BMI greater than the 97th percentile for sex and age) were recruited at the CNRS-UMR8090 Unit in Lille (n = 420), at the Children's Hospital, Toulouse (n = 92), and at the Trousseau Hospital (n = 90). The obese adult subgroup consisted of 673 morbidly obese (BMI ≥40 kg/m2) adults collected at the Department of Nutrition of the Hôtel Dieu Hospital in Paris or at the CNRS-UMR8090 Unit in Lille. The control group consisted of 1,395 nonobese (BMI <27 kg/m2) normoglycemic (fasting glycemia <5.56 mmol/l) French Caucasian adults pooled from four separate studies: 394 subjects were recruited at the CNRS-UMR8090, 265 were recruited by the Fleurbaix-Laventie Ville Santé study (21), 365 from the HAGUENEAU study (22), and 371 from the SUVIMAX study (23).
To identify novel variants, the coding regions, 5′- and 3′-untranslated regions (UTRs), 2.1 kb upstream of the promoter, and a 2-kb conserved region in intron 8 of SIM1 were sequenced in 96 obese Pima subjects (BMI 50–80 kg/m2). Sequencing was done using Big Dye terminator (Applied Biosystems) on an automated DNA capillary sequencer (model 3730; Applied Biosystems). In addition to the variants identified by sequencing (n = 16), 30 tag SNPs with a minor allele frequency ≥0.1 and an r2 ≥ 0.8 that cover the unsequenced intronic regions were selected from the HapMap Chinese (CHB) population. The 46 variants were genotyped in the full-heritage Pima Indians, and most of the variants (n = 43) were genotyped in the mixed-heritage replication sample using the SNPlex Genotyping System 48-plex (Applied Biosystems) on an automated DNA capillary sequencer (model 3730; Applied Biosystems). All Pima Indian genotypic data passed our quality control criteria, which require a successful genotypic call rate on >85% of the samples, a deviation from Hardy-Weinberg equilibrium of P > 0.001, and a blind duplicate genotyping of 330 samples with a discrepancy rate of <2.5%. The chromosomal locations and flanking sequence for the five novel variants (SIM1–1, Arg665His, Thr361Ile, SIM1–2, and SIM1–3) that were genotyped are shown in an online-only appendix (Table A1, available at http://diabetes.diabetesjournals.org/cgi/content/full/db09-0028/DC1). Variants rs3734353 and rs3213541 were genotyped in the French cohort using Taqman Technology (Applied Biosystems). Genotyping error rate calculated from duplicate genotypes of 250 individuals was 0% for both variants. In addition, the two SNPs were in Hardy-Weinberg equilibrium.
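As an illustration of the quality control thresholds stated above (call rate >85% and Hardy-Weinberg P > 0.001), the sketch below applies a standard 1-degree-of-freedom chi-square test for Hardy-Weinberg equilibrium to genotype counts. The text does not say which Hardy-Weinberg test was used for the Pima data, so the chi-square form is an assumption, and the function names and example counts are hypothetical.

```python
import math


def hwe_chi_square_p(n_aa: int, n_ab: int, n_bb: int) -> float:
    """P value of the 1-df chi-square test for Hardy-Weinberg equilibrium."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes called")
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        return 1.0                    # monomorphic marker: test is uninformative
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def passes_qc(n_aa: int, n_ab: int, n_bb: int, n_attempted: int,
              min_call_rate: float = 0.85, min_hwe_p: float = 0.001) -> bool:
    """Apply the call-rate and Hardy-Weinberg thresholds quoted in the text."""
    call_rate = (n_aa + n_ab + n_bb) / n_attempted
    return call_rate > min_call_rate and hwe_chi_square_p(n_aa, n_ab, n_bb) > min_hwe_p
```

The duplicate-discrepancy criterion (<2.5%) is not covered here because it requires paired genotype calls rather than summary counts.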
Statistical analyses were performed using SAS software (SAS Institute, Cary, NC). For continuous variables, linear regression models were used to assess the association between BMI and genotype (assuming an additive model) with adjustment for covariates including age, sex, and birth year; BMI was log-transformed in these analyses to reduce skewness. In the mixed-heritage replication group, the individual estimate of European admixture was also used as a covariate; these estimates were derived by the method of Hanis et al. (24) from 32 markers with large differences in allele frequency between populations (25). The generalized estimating equation procedure was used to account for family membership, since some subjects were siblings. Although the generalized estimating equation procedure accounts for the familial nature of the data, this test is not robust to population stratification. To provide a test that is robust to stratification, a modification of the method of Abecasis et al. (26) was used in which the association is partitioned into between- and within-family components.
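The additive model on log-transformed BMI can be illustrated with a deliberately simplified sketch: an ordinary least-squares fit of ln(BMI) on the number of risk alleles (0, 1, or 2). It omits the covariates (age, sex, birth year, admixture) and the generalized estimating equation adjustment for family membership described above, so it is not the analysis that was run in SAS. The function name and toy data are hypothetical.

```python
import math
from typing import Sequence, Tuple


def additive_log_bmi_fit(allele_counts: Sequence[int],
                         bmis: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of ln(BMI) = intercept + slope * (risk allele count).

    Returns (slope, intercept). exp(slope) is the multiplicative change in BMI
    per additional copy of the risk allele under the additive model.
    """
    if len(allele_counts) != len(bmis) or len(bmis) < 2:
        raise ValueError("need two or more paired observations")
    y = [math.log(b) for b in bmis]
    n = len(y)
    mean_x = sum(allele_counts) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in allele_counts)
    if sxx == 0:
        raise ValueError("genotype does not vary in this sample")
    sxy = sum((x - mean_x) * (yi - mean_y) for x, yi in zip(allele_counts, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data constructed so that each risk allele multiplies BMI by exactly 1.03.
genotypes = [0, 0, 1, 1, 2, 2]
toy_bmis = [30.0 * 1.03 ** g for g in genotypes]
slope, intercept = additive_log_bmi_fit(genotypes, toy_bmis)
per_allele_ratio = math.exp(slope)
```

Because the outcome is on the log scale, the slope is read as a proportional effect per allele rather than a fixed difference in kg/m2.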
A test of the equivalence of the between- and within-family effects provides a test of the null hypothesis of no population stratification effect for the marker in question. A combined test of association for the full-heritage and replication groups was conducted by the inverse variance method (27). Linkage disequilibrium (D′ and r2) was analyzed using the Haploview program (http://www.broad.mit.edu/mpg/haploview). Haplotype “blocks” were defined using the four-gamete method, with a gametic frequency >0.01 taken as indicative of significant recombination (28), and analyses of the association of haplotypes within each block with BMI were conducted. In these analyses, one to four variants were determined to define the common haplotypes (frequency >0.01) within each block. The probability that an individual carried one or two copies of each of the common haplotypes within a block was calculated by modification of the zero-recombinant haplotype method as previously described (29). The MLINK program (30) was used to assign a probability of carriage of a given haplotype, and these probabilities were analyzed in a fashion analogous to that for individual variants. An “exhaustive” analysis was conducted in which all common haplotypes for all possible combinations of one to four variants within each block were analyzed. For the French Caucasian cohort, tests for deviation from the Hardy-Weinberg equilibrium and for association were performed with the De Finetti program (http://ihg.gsf.de/cgi-bin/hw/hwa1.pl).
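The combined test by the inverse variance method can be sketched as a fixed-effect pooling of per-sample effect estimates, each weighted by the reciprocal of its squared standard error. The sketch assumes the inputs are regression coefficients and standard errors from the full-heritage and replication analyses; the numbers in the example are hypothetical and are not taken from the study.

```python
import math
from typing import Sequence, Tuple


def inverse_variance_combine(estimates: Sequence[float],
                             std_errors: Sequence[float]
                             ) -> Tuple[float, float, float, float]:
    """Fixed-effect inverse-variance pooling.

    Returns (pooled estimate, pooled standard error, z statistic, two-sided P).
    """
    if len(estimates) != len(std_errors) or not estimates:
        raise ValueError("need matching, non-empty estimates and standard errors")
    weights = [1.0 / se ** 2 for se in std_errors]
    total_weight = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled / pooled_se
    # Two-sided normal P value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return pooled, pooled_se, z, p_value


# Hypothetical per-allele effects on ln(BMI) in two samples.
pooled, pooled_se, z, p_value = inverse_variance_combine([0.020, 0.018], [0.005, 0.006])
```

The more precise sample receives the larger weight, so the pooled estimate lies between the two inputs and closer to the one with the smaller standard error.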
To identify novel variants in SIM1, all 11 exons, the 5′- and 3′-UTRs, 2.1 kb upstream of the 5′-UTR, and a 2-kb highly conserved region in intron 8 immediately adjacent to exon 8 were sequenced in 96 obese (BMI ≥50 kg/m2) Pima Indians. Sequencing of these regions identified several variants, including two previously identified missense substitutions in exon 9, Pro352Thr (rs3734354), and Ala371Val (rs3734355), and five novel polymorphisms (sequences are shown in online Table A1), which included two rare nonsynonymous amino acid changes in exons 9 and 11 (Thr361Ile and Arg665His, respectively). Variants (n = 16) identified by sequencing along with an additional 30 tag SNPs chosen from the dbSNP public database spanning 35 kb upstream of SIM1 to 25 kb downstream of SIM1 were genotyped in a population-based sample of 3,250 full-heritage Pima Indians (online Table A2). The 46 variants fell into seven haplotype blocks (A–G, defined by the four-gamete method; Fig. 1A). The two largest blocks were block B, which spans much of the 3′ region of SIM1, and block F, which spans half of intron 8 through the 5′ region flanking SIM1 (Fig. 1A). Several variants in blocks E and F were significantly associated with BMI in the full-heritage Pima Indians (P values ranging from 5 × 10⁻³ to 7 × 10⁻⁶; adjusted for age, sex, and birth year; Fig. 1B). These variants were associated with the maximum recorded BMI from any exam after the age of 15 years (n = 3,250) as well as the maximum BMI recorded at a nondiabetic exam after the age of 15 years (n = 2,789) (online Table A2; representative variants for blocks E and F are shown in Table 1). The differences in mean BMIs for the individuals homozygous for the major allele (M/M) versus individuals homozygous for the minor allele (m/m) for either analysis were ~2.2 kg/m2 (Table 1 and online Table A2). The four missense mutations, Pro352Thr (rs3734354), Thr361Ile, Ala371Val (rs3734355), and Arg665His, were not associated with BMI (online Table A2).
To assess whether the association with BMI could be replicated in a separate group of subjects, the variants were further genotyped in a population-based sample of individuals from the same longitudinal study, most of whom were of mixed heritage (n = 2,944, Pima heritage ranging from 0/8th to 8/8th). The variants in blocks E and F were reproducibly associated with maximum BMI from any exam, as well as maximum BMI from a nondiabetic exam in this group (mixed-heritage and mixed-heritage nondiabetic replication sets; online Table A2; representative variants are shown in Table 1). Combining the initial full-heritage Pima Indians set with the mixed-heritage replication set (n = 6,194) provided the strongest associations with BMI (e.g., rs3213541 P = 4 × 10⁻⁷; Table 1 and online Table A2). When the mean maximum BMIs based on genotypes for rs3213541 were stratified by age, an increase in BMI among the risk (G) allele carriers for rs3213541 was observed at nearly all ages (Fig. 2A), and this increase appears to be consistent for both men and women (Fig. 2B and C). There was no significant interaction with age, suggesting that the fluctuation observed in men (Fig. 2B) after the age of 45 years is likely due to a smaller sample size. The magnitude of the BMI difference is similar among individuals who are predominately of Pima heritage (more than half Pima heritage) compared with individuals who are predominately of different heritage (less than half Pima heritage) (Fig. 3A and B, respectively), except at older ages (>45 years), where the number of subjects becomes small, making the values of mean BMI somewhat less reliable. However, the alleles associated with high BMI (risk alleles) are less common among the mixed-heritage individuals compared with the full-heritage Pima Indians (Table 1 and online Table A2). For example, the risk (G) allele for increased BMI for rs3213541 has a frequency of 0.62 among the full-heritage Pima subjects and 0.54 among mixed-heritage subjects.
Among the 72 individuals for whom there was no reported American Indian heritage, the frequency of the rs3213541 G allele was 0.40. HapMap data for rs3213541 show that the G allele is the minor allele for all four populations (G allele frequencies: Caucasians, 0.36; Chinese, 0.34; Japanese, 0.44; and Africans, 0.13 [International HapMap Project]); therefore, the risk allele for obesity in Pima Indians is the major allele but it is the minor allele in non-Native American populations. Given the allele frequency differences of rs3213541 (and other variants in linkage disequilibrium) between American Indians and other populations, it is possible that these associations with BMI could be influenced by admixture even though our analyses controlled for heritage. Therefore, within-family association tests that are robust to population stratification were also done. The within-family analyses were less significant but consistent with the overall general associations (P = 0.04–0.1), while tests for population stratification were not statistically significant. Given the strong overall association, these results suggest that these associations are not solely the result of admixture.
There was little, if any, association of these variants with type 2 diabetes, with only a few variants achieving nominal statistical significance (P < 0.05; see online Table A2). Most of the nominal associations were no longer significant after controlling for BMI. For example, the diabetes odds ratio for rs3213541 is 1.09 per copy of the G allele (95% CI 0.99–1.19, P = 0.08); after control for BMI, the odds ratio is attenuated to 1.05 per copy of the G allele (95% CI 0.96–1.16, P = 0.31).
To examine whether any of the variants had an effect in addition to the most strongly associated variants, associations were further analyzed conditional on that observed for rs3213541 for each of the 45 other variants. For variants that were not highly concordant with rs3213541 (r2 < 0.78), the effect of rs3213541 remained significant (P < 0.001), whereas in most cases, the other variant was not significant (P > 0.05). The exceptions were the rare novel SIM1–2 variant located in intron 8 and the rare variant rs7766596, both of which remained significant despite controlling for rs3213541 (SIM1–2, P = 0.01, and rs7768342, P = 0.03); however, these associations are modest and may reflect chance findings. These results are consistent with the hypothesis that the primary association reflects the effect of rs3213541 or a strongly concordant variant.
To assess whether a specific haplotype provided a stronger association than a single variant, tag SNPs that captured the common variation (r2 > 0.8) within each of the seven haplotype blocks (Fig. 1A–G) were determined for the 46 variants spanning SIM1. The lowest P values obtained from tag SNP combinations within each block for the initial full-heritage set are listed in Table 2 and are also shown in Fig. 1B (plotted at the haplotype block's midpoint; black triangles). Consistent with the single variant analysis, the haplotypes providing the strongest associations with BMI were in haplotype blocks E and F (spanning intron 8 through the 5′ region of SIM1) and were highly concordant with rs3213541 (Table 2).
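The concordance measures used for tag SNP selection (r2) and reported alongside it (D′) can be computed from two-locus haplotype frequencies as in the sketch below. It uses the textbook definitions for two biallelic loci and assumes the haplotype frequency is already known; it does not reproduce Haploview's estimation of phase from unphased genotypes. The function name and the example frequencies are hypothetical.

```python
from typing import Tuple


def ld_stats(p_ab: float, p_a: float, p_b: float) -> Tuple[float, float]:
    """Return (|D'|, r2) for two biallelic loci.

    p_ab: frequency of the haplotype carrying allele A at locus 1 and B at locus 2
    p_a, p_b: frequencies of allele A at locus 1 and allele B at locus 2
    """
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b                       # raw disequilibrium coefficient D
    r2 = d * d / denom
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    return d_prime, r2


# Hypothetical pair of variants with allele frequencies 0.62 and 0.60.
d_prime, r2 = ld_stats(p_ab=0.58, p_a=0.62, p_b=0.60)
is_tagged = r2 > 0.8   # threshold quoted in the text
```

D′ reaches 1 whenever one of the four possible haplotypes is absent, while r2 reaches 1 only when the two variants are fully interchangeable, which is why r2 is the measure used to decide whether one SNP can stand in for another.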
To determine whether common variation in SIM1 had a significant effect on obesity in a non–Native American population, two common variants (rs3734353 and rs3213541) were also genotyped in French Caucasian case/control samples consisting of 602 unrelated severely obese children, 673 unrelated morbidly obese adults, and 1,395 unrelated normoglycemic nonobese control subjects. The French Caucasians were selected for replication because in a prior genome-wide linkage study in this population, the most significant linkage for obesity was on chromosome 6q22.31–6q23.2, which contains the SIM1 locus (31). The two variants were in high linkage disequilibrium among the Caucasians (r2 = 0.87 in HapMap), and neither was associated with obesity in the case/control samples (e.g., rs3734353: P = 0.70 for obese children/control subjects, P = 0.38 for obese adults/control subjects, P = 0.43 for obese children + obese adults/control subjects). The allele frequencies for these two variants differed between the two populations; for example, the risk allele (C) for rs3734353 is the major allele (frequency = 0.62) in Pima Indians but the minor allele (frequency = 0.30) in French Caucasians. However, the overall pattern of linkage disequilibrium across the SIM1 locus is quite similar between these two ethnic groups (online Fig. A1). Because of the different study designs, the results in the French Caucasian subjects are not directly comparable with those in the Pima Indians. To obtain a comparable OR estimate, the predominately Native American samples (full-heritage Pima Indian and mixed-heritage samples) were classified into case (BMI >40 kg/m2; n = 1,694) and control (nondiabetic and BMI <30 kg/m2; n = 1,272) subjects.
With this classification, the OR for severe obesity is 1.26 per copy of the G allele for rs3213541 compared with the estimate of 0.95 for French Caucasian subjects, and Cochran's Q test for homogeneity indicates significant heterogeneity between Pima Indian and French subjects (P = 0.001). However, there was no statistically significant interaction with self-identified American Indian heritage among the Pima Indian families (P = 0.80).
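Cochran's Q test for homogeneity between two odds ratios can be sketched as below. With two studies, Q has 1 degree of freedom and its P value has a closed form in terms of the complementary error function. The odds ratios 1.26 and 0.95 are those quoted above, but the standard errors in the example are hypothetical placeholders because the text does not report them, so the resulting P value is not expected to reproduce the published one.

```python
import math
from typing import Tuple


def cochran_q_two_studies(log_or_1: float, se_1: float,
                          log_or_2: float, se_2: float) -> Tuple[float, float]:
    """Cochran's Q and its P value (1 df) for two log odds ratios."""
    w1, w2 = 1.0 / se_1 ** 2, 1.0 / se_2 ** 2
    pooled = (w1 * log_or_1 + w2 * log_or_2) / (w1 + w2)
    q = w1 * (log_or_1 - pooled) ** 2 + w2 * (log_or_2 - pooled) ** 2
    # Survival function of a chi-square with 1 df: P(X > q) = erfc(sqrt(q / 2)).
    p_value = math.erfc(math.sqrt(q / 2))
    return q, p_value


# Odds ratios from the text; the standard errors (0.06) are assumed for illustration.
q_stat, p_het = cochran_q_two_studies(math.log(1.26), 0.06, math.log(0.95), 0.06)
```

For exactly two studies, Q reduces to the squared difference of the log odds ratios divided by the sum of their variances, which makes the test equivalent to a z test on that difference.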
In this study, noncoding variants spanning intron 8 through to the 5′ region of SIM1 showed the strongest association with BMI in full-heritage Pima Indians, and these associations also replicated in a group of mixed-heritage individuals from the same community. Meyre et al. (31) had previously reported linkage to obesity in French Caucasians in the region of chromosome 6 that includes the SIM1 locus, but the variants most strongly associated with BMI in Pima Indians did not replicate in the French population. Linkage to BMI on chromosome 6q was not identified in our prior genome-wide linkage scan in Pima Indians (32). Hung et al. (33) previously reported that the coding Ala371Val polymorphism (rs3734355) was modestly associated with BMI in Caucasian males; however, none of the coding variants—Pro352Thr (rs3734354), Ala371Val (rs3734355), Thr361Ile, and Arg665His—were associated with BMI in our study of Pima Indians. Ahituv et al. (34) also found no association between either the Ala371Val (rs3734355) or the Pro352Thr (rs3734354) in their case/control study for obesity in Caucasians.
Few genetic variants with reproducible associations with obesity have been identified, and most of those that have been identified have been studied primarily in European populations. The present study identifies several variants in SIM1 that are reproducibly associated with BMI in individuals from a longitudinal study, most of whom have Pima Indian heritage. The issue of statistical significance in genetic association studies is complicated. Given the potential for artifactual associations and the fact that most variants are unlikely a priori to be associated with a given trait, most statisticians recommend a stringent threshold for declaring an association “significant.” The exact P value threshold is a matter of controversy, but for genome-wide studies, a value in the vicinity of P < 5 × 10⁻⁸ (35,36) is generally considered significant. Whether the same level of stringency is required for a candidate gene, such as SIM1, is debatable, but the poor record of reproducibility for such variants suggests similar criteria may be appropriate. Such stringency, however, comes at the cost of potentially missing true associations, particularly in small populations in whom the potential for additional replication studies is limited. The present study achieves P values approaching levels of genome-wide significance for the association of SIM1 variants with BMI, with replication observed in two separate samples of Pima Indians, comprising >6,000 individuals in toto. The effect of these common variants on BMI is notable, where the difference in mean BMI by genotype is ~2.2 kg/m2 among the initial sample of full-heritage Pimas and ~2.4 kg/m2 in the combined sample of full-heritage and mixed-heritage subjects, an effect corresponding to ~1% of the variance in BMI in the population. In comparison, the difference in mean BMI by genotype for FTO, the most highly replicated gene for polygenic obesity identified to date, is ~1 kg/m2 among Caucasians (37) and 1.6 kg/m2 among Pimas (38).
Our results also reflect a genetic heterogeneity between the French Caucasians and the Pima Indians. Indeed, two common variants (rs3734353 and rs3213541) associated with obesity in the Pima Indians are not associated in French European subjects. Moreover, different allele frequencies are observed for these two variants among European and Pima Indian backgrounds (the major allele in Pima is the minor allele in Europeans), the obesity risk allele being more prevalent in the obesity-prone Pima Indian population.
Kublaoui et al. (17) examined the role of Sim1 in hyperphagic obesity in mice and reported that Sim1 and Mc4r are both expressed in the paraventricular nucleus; that overexpression of Sim1 in agouti yellow (Ay) mice inhibited hyperphagia and reduced fat mass in the Ay/Sim1 transgenic mice (19); that Sim1+/− mice had higher levels of food intake than wild-type mice after treatment with melanotan II (an α-melanocyte–stimulating hormone analog and melanocortin 4 receptor agonist) (17); and that oxytocin mRNA and protein levels were notably reduced in Sim1+/− mice (39). Based on these findings, they proposed a simplified model for hyperphagic obesity that involves a signaling pathway from Mc4r to Sim1 to oxytocin in the paraventricular nucleus neurons (17,19,39).
In conclusion, SIM1 is emerging as a potentially critical component of the MC4R signaling pathway known to regulate appetite in humans. The present study shows that common variation in this gene provides strong replicated associations with BMI in Pima Indians, where the common allele is the risk allele for obesity.
This study was supported by the intramural research program of the National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health. M.T. was supported by a grant from the American Diabetes Association.
No potential conflicts of interest relevant to this article were reported.
We gratefully acknowledge the volunteers from the Gila River Indian Community, whose cooperation made these studies possible.
The costs of publication of this article were defrayed in part by the payment of page charges. This article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.