Sequencing and genotyping.
To identify novel variants in SIM1, all 11 exons, the 5′- and 3′-UTRs, 2.1 kb upstream of the 5′-UTR, and a 2-kb highly conserved region in intron 8 immediately adjacent to exon 8 were sequenced in 96 obese (BMI ≥50 kg/m²) Pima Indians. Sequencing of these regions identified several variants, including two previously identified missense substitutions in exon 9, Pro352Thr (rs3734354) and Ala371Val (rs3734355), and five novel polymorphisms (sequences are shown in online Table A1), which included two rare nonsynonymous amino acid changes in exons 9 and 11 (Thr361Ile and Arg665His, respectively). Variants (n = 16) identified by sequencing, along with an additional 30 tag SNPs chosen from the dbSNP public database spanning 35 kb upstream of SIM1 to 25 kb downstream of SIM1, were genotyped in a population-based sample of 3,250 full-heritage Pima Indians (online Table A2). The 46 variants fell into seven haplotype blocks (A–G, defined by the four-gamete method; Fig. 1A). The two largest blocks were block B, which spans much of the 3′ region of SIM1, and block F, which spans half of intron 8 through the 5′ region flanking SIM1 (Fig. 1A). Several variants in blocks E and F were significantly associated with BMI in the full-heritage Pima Indians (P values ranging from 5 × 10⁻³ to 7 × 10⁻⁶; adjusted for age, sex, and birth year; Fig. 1B). These variants were associated with the maximum recorded BMI from any exam after the age of 15 years (n = 3,250) as well as the maximum BMI recorded at a nondiabetic exam after the age of 15 years (n = 2,789) (online Table A2; representative variants for blocks E and F are shown in the accompanying table). The difference in mean BMI between individuals homozygous for the major allele (M/M) and individuals homozygous for the minor allele (m/m) was ~2.2 kg/m² in either analysis (accompanying table and online Table A2). The four missense mutations, Pro352Thr (rs3734354), Thr361Ile, Ala371Val (rs3734355), and Arg665His, were not associated with BMI (online Table A2).
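The four-gamete method mentioned above can be illustrated with a minimal sketch. It assumes phased, biallelic haplotypes coded as strings of 0/1 alleles, and it uses a simple greedy left-to-right partition; the text does not say which software or settings produced blocks A–G, so the function names, the `min_freq` parameter, and the toy data are all hypothetical.

```python
from collections import Counter

def has_four_gametes(haplotypes, i, j, min_freq=0.0):
    """True if all four two-site haplotypes (00, 01, 10, 11) are seen at sites i, j.

    min_freq is a hypothetical threshold: a gamete counts only if its
    sample frequency exceeds it (0.0 means any observation counts).
    """
    counts = Counter((h[i], h[j]) for h in haplotypes)
    total = len(haplotypes)
    observed = [g for g, c in counts.items() if c / total > min_freq]
    return len(observed) == 4

def four_gamete_blocks(haplotypes, min_freq=0.0):
    """Greedy partition of sites into contiguous blocks.

    A new block starts whenever the next site shows all four gametes
    with any site already in the current block, which is taken as
    evidence of historical recombination.
    """
    n_sites = len(haplotypes[0])
    blocks, current = [], [0]
    for site in range(1, n_sites):
        if any(has_four_gametes(haplotypes, prev, site, min_freq) for prev in current):
            blocks.append(current)
            current = [site]
        else:
            current.append(site)
    blocks.append(current)
    return blocks

# Hypothetical toy data: five chromosomes, four biallelic sites.
toy = ["0000", "0011", "1100", "1111", "1101"]
blocks = four_gamete_blocks(toy)
print(blocks)
```

In the toy data, sites 0 and 2 show all four gametes, so the partition breaks between sites 1 and 2. Real genotype data are unphased, so haplotype frequencies would first have to be estimated before a test of this kind could be applied.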
FIG. 1. Linkage disequilibrium plot (A) and association analyses (B) for the 46 variants genotyped in SIM1 and adjacent 3′ and 5′ regions. A: Linkage disequilibrium is shown as D′. The 46 variants separate into seven haplotype blocks (A–G).
Representative variants from linkage disequilibrium blocks E and F and their associations with BMI in the full-heritage Pima Indian, mixed-heritage replication, and combined (full-heritage + mixed-heritage) population sets
To assess whether the association with BMI could be replicated in a separate group of subjects, the variants were further genotyped in a population-based sample of individuals from the same longitudinal study, most of whom were of mixed heritage (n = 2,944, Pima heritage ranging from 0/8th to 8/8th). The variants in blocks E and F were reproducibly associated with maximum BMI from any exam, as well as maximum BMI from a nondiabetic exam, in this group (mixed-heritage and mixed-heritage nondiabetic replication sets; online Table A2; representative variants are shown in the accompanying table). Combining the initial full-heritage Pima Indian set with the mixed-heritage replication set (n = 6,194) provided the strongest associations with BMI (e.g., rs3213541, P = 4 × 10⁻⁷; accompanying table and online Table A2). When the mean maximum BMIs based on genotypes for rs3213541 were stratified by age, an increase in BMI among the risk (G) allele carriers was observed at nearly all ages (Fig. 2A), and this increase appeared to be consistent for both men and women (Fig. 2B and C). There was no significant interaction with age, suggesting that the fluctuation observed in men (Fig. 2B) after the age of 45 years is likely due to a smaller sample size. The magnitude of the BMI difference is similar among individuals who are predominately Pima heritage (more than half Pima heritage) compared with individuals who are predominately of different heritage (less than half Pima heritage) (Fig. 3A and B, respectively), except at older ages (>45 years), where the number of subjects becomes small, making the values of mean BMI somewhat less reliable. However, the alleles associated with high BMI (risk alleles) are less common among the mixed-heritage individuals than among the full-heritage Pima Indians (accompanying table and online Table A2). For example, the risk (G) allele for increased BMI for rs3213541 has a frequency of 0.62 among the full-heritage Pima subjects and 0.54 among mixed-heritage subjects.
Among the 72 individuals for whom there was no reported American Indian heritage, the frequency of the rs3213541 G allele was 0.40. HapMap data for rs3213541 show that the G allele is the minor allele for all four populations (G allele frequencies: Caucasians, 0.36; Chinese, 0.34; Japanese, 0.44; and Africans, 0.13 [International HapMap Project]); therefore, the risk allele for obesity is the major allele in Pima Indians but the minor allele in non-Native American populations. Given the allele frequency differences of rs3213541 (and other variants in linkage disequilibrium) between American Indians and other populations, it is possible that these associations with BMI could be influenced by admixture even though our analyses controlled for heritage. Therefore, within-family association tests that are robust to population stratification were also performed. The within-family analyses were less significant but consistent with the overall general associations (P = 0.04–0.1), while tests for population stratification were not statistically significant. Given the strong overall association, these results suggest that these associations are not solely the result of admixture.
FIG. 2. Mean maximum BMIs based on genotypes for rs3213541 in the combined (full-heritage + mixed-heritage) population set stratified by age (A) or age and sex (B and C). The BMI is the maximum BMI recorded within the specified age range.
FIG. 3. Mean maximum BMIs based on genotypes for rs3213541 among subjects who are predominately Pima heritage (more than half Pima heritage) (A) and who are predominately of different heritage (less than half Pima heritage) (B) stratified by age.
There was little, if any, association of these variants with type 2 diabetes, with only a few variants achieving nominal statistical significance (P < 0.05; see online Table A2). Most of the nominal associations were no longer significant after controlling for BMI. For example, the diabetes odds ratio for rs3213541 is 1.09 per copy of the G allele (95% CI 0.99–1.19, P = 0.08); after control for BMI, the odds ratio is attenuated to 1.05 per copy of the G allele (95% CI 0.96–1.16, P = 0.31).
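As a rough consistency check on the odds ratios above, a Wald z statistic and two-sided P value can be back-calculated from an odds ratio and its 95% CI. The sketch below assumes the CI is symmetric on the log scale and that a normal approximation applies; it is not the model used in the study, and `wald_from_ci` is a hypothetical helper name. Because the published values are rounded, the recovered P values only approximate the reported ones.

```python
import math

def wald_from_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Back-calculate SE(log OR), Wald z, and two-sided P from an OR and its 95% CI."""
    log_or = math.log(odds_ratio)
    # The CI spans 2 * z_crit standard errors on the log-odds scale.
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = log_or / se
    # Two-sided P value from the standard normal distribution.
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Values quoted in the text for rs3213541 (per copy of the G allele).
se_unadj, z_unadj, p_unadj = wald_from_ci(1.09, 0.99, 1.19)  # before BMI adjustment
se_adj, z_adj, p_adj = wald_from_ci(1.05, 0.96, 1.16)        # after BMI adjustment
print(round(p_unadj, 3), round(p_adj, 3))
```

The BMI-adjusted estimate recovers a P value close to the reported 0.31. The unadjusted estimate gives a value somewhat below the reported 0.08, which is within what rounding of the odds ratio and CI limits to two decimals can produce.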
To examine whether any of the variants had an effect in addition to that of the most strongly associated variant, each of the 45 other variants was analyzed conditional on rs3213541. For variants that were not highly concordant with rs3213541 (r² < 0.78), the effect of rs3213541 remained significant (P < 0.001), whereas in most cases, the other variant was not significant (P > 0.05). The exceptions were the rare novel SIM1–2 variant located in intron 8 and the rare variant rs7766596, both of which remained significant despite controlling for rs3213541 (SIM1–2, P = 0.01, and rs7768342, P = 0.03); however, these associations are modest and may reflect chance findings. These results are consistent with the hypothesis that the primary association reflects the effect of rs3213541 or a strongly concordant variant.
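The concordance measure used above, together with the D′ shown in Fig. 1, can be computed from two-site haplotype frequencies. The following is a minimal sketch assuming phased biallelic haplotypes coded as 0/1 strings; the function name and toy data are hypothetical, and unphased genotype data would first require haplotype frequency estimation.

```python
def ld_stats(haplotypes, i, j):
    """Return (D', r^2) for sites i and j from phased 0/1 haplotype strings."""
    n = len(haplotypes)
    p_a = sum(h[i] == "1" for h in haplotypes) / n    # allele 1 frequency at site i
    p_b = sum(h[j] == "1" for h in haplotypes) / n    # allele 1 frequency at site j
    p_ab = sum(h[i] == "1" and h[j] == "1" for h in haplotypes) / n
    d = p_ab - p_a * p_b                              # raw disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    r2 = d * d / denom
    # D' scales |D| by its maximum possible value given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    return abs(d) / d_max, r2

# Hypothetical toy data: five chromosomes, two sites.
toy_pair = ["11", "11", "00", "00", "10"]
d_prime, r2 = ld_stats(toy_pair, 0, 1)
print(round(d_prime, 3), round(r2, 3))
```

In this toy example D′ equals 1 because only three of the four possible haplotypes occur, yet r² is well below 0.8 because the allele frequencies differ. This is why two variants can sit in the same D′-defined block without one serving as an adequate proxy for the other.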
To assess whether a specific haplotype provided a stronger association than a single variant, tag SNPs that captured the common variation (r² > 0.8) within each of the seven haplotype blocks (A–G) were determined for the 46 variants spanning SIM1. The lowest P values obtained from tag SNP combinations within each block for the initial full-heritage set are listed in the accompanying table and are also shown in Fig. 1B (plotted at the haplotype block's midpoint; black triangles). Consistent with the single variant analysis, the haplotypes providing the strongest associations with BMI were in haplotype blocks E and F (spanning intron 8 through the 5′ region of SIM1) and were highly concordant with rs3213541.
Tag SNP combinations providing the strongest association within each of seven haplotypes with BMI
To determine whether common variation in SIM1 had a significant effect on obesity in a non–Native American population, two common variants (rs3734353 and rs3213541) were also genotyped in French Caucasian case/control samples consisting of 602 unrelated severely obese children, 673 unrelated morbidly obese adults, and 1,395 unrelated normoglycemic nonobese control subjects. The French Caucasians were selected for replication because in a prior genome-wide linkage study in this population, the most significant linkage for obesity was on chromosome 6q22.31–6q23.2, which contains SIM1. The two variants were in high linkage disequilibrium among the Caucasians (r² = 0.87 in HapMap), and neither was associated with obesity in the case/control samples (e.g., rs3734353: P = 0.70 for obese children/control subjects, P = 0.38 for obese adults/control subjects, P = 0.43 for obese children + obese adults/control subjects). The allele frequencies for these two variants differed between the two populations; for example, the risk allele (C) for rs3734353 is the major allele (frequency = 0.62) in Pima Indians but the minor allele (frequency = 0.30) in French Caucasians. However, the overall pattern of linkage disequilibrium across the SIM1 locus is quite similar between these two ethnic groups (online Fig. A1). Because of the different study designs, the results in the French Caucasian subjects are not directly comparable with those in the Pima Indians. To obtain a comparable odds ratio (OR) estimate, the predominately Native American samples (full-heritage Pima Indian and mixed-heritage samples) were classified into case (BMI >40 kg/m²; n = 1,694) and control (nondiabetic and BMI <30 kg/m²; n = 1,272) subjects. With this classification, the OR for severe obesity is 1.26 per copy of the G allele for rs3213541 compared with the estimate of 0.95 for French Caucasian subjects, and Cochran's Q test for homogeneity indicates significant heterogeneity between Pima Indian and French subjects (P = 0.001). However, there was no statistically significant interaction with self-identified American Indian heritage among the Pima Indian families (P