Obesity is a heritable trait and a major risk factor for highly prevalent common diseases such as hypertension, cardiac diseases and type 2 diabetes. Obesity is a major public health concern worldwide. Previously we showed that BMI was positively correlated with African ancestry among the African American (AA) participants in the NHLBI’s Family Blood Pressure Program (FBPP). Using Individual Ancestry (IA) estimates at 284 marker locations across the genome, we now present a Quantitative Admixture Mapping (QAM) analysis of body mass index (BMI) in the same population. We used a set of unrelated individuals from Nigeria to represent the African ancestral population and the European Americans in the FBPP as the European ancestral population. The analysis was based on a common set of 284 microsatellite markers genotyped in all three groups. We considered the quantitative trait, BMI, as the response variable in a regression analysis with the marker location specific excess European ancestry as the explanatory variable. After suitably adjusting for different covariates such as sex, age and study center, we found strong evidence for a positive association with European ancestry at chromosome locations 3q29 and 5q14 and a negative association on chromosome 15q26. These results suggest that these regions may harbor genes influencing BMI in the AA population.
Although obesity is an individual clinical condition, it is increasingly viewed as a serious and growing worldwide public health problem. Obesity is believed to predispose to all the major killer diseases such as type 2 diabetes, cardiac disorders, hypertension, stroke, metabolic diseases and even some forms of cancer1. The prevalence of obesity has been continually rising for two decades2. A common and inexpensive surrogate to measure obesity is body mass index (BMI), defined as the ratio of weight in kilograms to squared height in meters. Another advantage to analyzing BMI is that it is a continuously measured trait, which generally provides greater power in searching for explanatory covariates than overweight or obesity defined by arbitrary cutoffs.
Though it is widely believed that excessive nutrient intake and the sedentary lifestyle of the developed world have been the major causes of the obesity epidemic3, other factors, including genetic predisposition, are also deemed responsible. Genetic factors contribute significantly to obesity4,5, with heritability estimates of BMI ranging from 30 to 70%6–9. There have been numerous efforts to identify genes and chromosomal regions responsible for BMI, using genome-wide linkage and association analysis10. A number of genes with rare mutations, such as the melanocortin 4 receptor gene11, are known to lead to increased BMI and obesity. Most recently, convincing evidence has been presented that a common polymorphism in the FTO gene has a modest effect on BMI12.
New world admixed populations provide unique opportunities for genetic admixture mapping studies 13–21. The AA population of the United States is typically represented by admixture of European and African ancestral genomes in different proportions with some spatial variation22–24. Several studies have examined the correlation between European (or African) ancestry in African Americans and BMI or obesity24–27. One study found a positive association between African ancestry and BMI25, another a positive association of European ancestry with obesity related traits26, and a third no correlation of ancestry with BMI27. In a previous analysis of the AA participants in the Family Blood Pressure Program (FBPP)28, we found a positive correlation between BMI and individual African ancestry estimated from genome-wide microsatellite markers24. In the current study, we now present results of an analysis examining the correlation of BMI with estimated ancestry proportions at each of 284 marker loci among 1344 unrelated AA subjects from the same FBPP population, in a search for potential locus-specific effects.
The FBPP is a large multicenter genetic study of high blood pressure and related conditions in multiple racial/ethnic groups, including European Americans (EA), African Americans, Mexican Americans and Asians and Asian Americans. It includes four component networks: GenNet, GENOA, HyperGEN and SAPPHIRE. GenNet, GENOA and HyperGEN independently collected samples from EA and AA families. GenNet sampled AA and EA nuclear families in Maywood, Illinois and Tecumseh, Michigan, respectively, through identification of a young to middle-aged proband with elevated blood pressure. GENOA sampled AA sibships containing sibling pairs with hypertension from Jackson, Mississippi and EA sibships with an affected proband from Rochester, Minnesota. HyperGEN recruited AA and EA hypertensive siblings and random unrelated individuals from five field centers (AA from Birmingham, Alabama and Forsyth County, North Carolina; EA from Salt Lake City, Utah, Minneapolis, Minnesota, Framingham, Massachusetts, and Forsyth County, North Carolina).
All the individuals we included in the study were unrelated AA from field centers of GenNet, GENOA and HyperGEN. Race/ethnicity information was obtained by self-description. To maximize the number of unrelated individuals in our sample, whenever possible we selected unrelated founder individuals, otherwise one randomly selected individual per family. Our final sample of 1344 individuals consisted of 280 individuals who were sampled by the GenNet network, 349 individuals sampled by the GENOA network and 715 individuals sampled by the HyperGEN network.
DNA was extracted from whole blood by standard methods by each of the four FBPP networks and was sent to the US National Heart, Lung, and Blood Institute (NHLBI)’s Mammalian Genotyping Service in Marshfield, Wisconsin, for genotyping. Screening set 8 (372 highly polymorphic microsatellite markers with an average map distance of 10 cM) was used for all four networks.
We used the computer program Structure29 to estimate genome-wide, as well as site-specific ancestries in all African American participants. The linkage model was used, with genetic distance between markers specified according to the Marshfield map. In each analysis, the MCMC algorithm was run for 100,000 steps of burn-in followed by another 100,000 steps.
For the analysis of 1344 FBPP African Americans, we assumed a two-ancestral populations model. We also included 1378 unrelated non-Hispanic white participants from the FBPP as well as 127 African individuals from the Human Genome Diversity Project (HGDP)30. This latter set of individuals had been genotyped at more than three hundred STRs at the time of our analysis, and we included genotypes at 284 markers which were also genotyped in the FBPP individuals.
At each locus we calculated, for each individual, an ancestry deviation defined as the estimated ancestry at that location minus the background ancestry estimated from the genome-wide markers for that individual.
Specifically, let $q_{li}$ be the locus-specific ancestry of individual $i$ ($i = 1, 2, \ldots, N$) at marker locus $l$ ($l = 1, 2, \ldots, L$) estimated from Structure. We compute the overall (genomewide) individual admixture ($\bar{q}_i$) for individual $i$ as the average individual admixture per locus:
$$\bar{q}_i = \frac{1}{L} \sum_{l=1}^{L} q_{li}$$
We computed the ancestry difference at each locus, using the genomewide IA as baseline. Specifically, for individual $i$, the ancestry difference for ancestral population $k$ at marker $l$ is defined as:
$$x_{li}^{k} = q_{li}^{k} - \bar{q}_{i}^{k}$$
where $q_{li}^{k}$ and $\bar{q}_{i}^{k}$ are the locus-specific and genomewide ancestry from population $k$. In what follows we drop the suffix $k$ pertaining to the population from which the ancestry coefficients are derived. The variable $x_{li}$ was then used as the primary independent variable in a linear regression model with BMI (transformed) as the dependent variable. Age, sex and network were also included as covariates in this analysis, if significant. The standardized regression coefficient of $x_{li}$, defined as $r_l = b_l / s_l$ (the estimated slope $b_l$ divided by its standard error $s_l$), is asymptotically normally distributed and was used to assess statistical significance.
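The two steps just described (subtracting each individual's genomewide average ancestry from the locus-specific estimates, then taking the slope of a per-locus regression divided by its standard error) can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the array layout (individuals in rows, loci in columns) are assumptions, and covariates are taken to have been adjusted out of the phenotype beforehand.

```python
import numpy as np

def excess_ancestry(q):
    """q: (N individuals, L loci) array of locus-specific ancestry estimates.
    Returns x with x[i, l] = q[i, l] - (mean of q[i, :] over loci)."""
    q = np.asarray(q, dtype=float)
    return q - q.mean(axis=1, keepdims=True)

def slope_z_scores(x, y):
    """Simple linear regression of phenotype y on each column of x.
    Returns, for every locus, the estimated slope divided by its standard error."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = x.shape[0]
    xc = x - x.mean(axis=0)            # centre each locus
    yc = y - y.mean()                  # centre the phenotype
    sxx = (xc ** 2).sum(axis=0)
    b = (xc * yc[:, None]).sum(axis=0) / sxx           # least-squares slopes
    resid = yc[:, None] - xc * b                       # residuals per locus
    se = np.sqrt((resid ** 2).sum(axis=0) / (n - 2) / sxx)
    return b / se
```

Each locus is fitted separately, so the fact that an individual's deviations sum to zero across loci does not cause collinearity problems in this sketch.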
To account for multiple testing (284 markers) we performed a permutation analysis in which we randomly reassigned the genetic ancestry estimates for the 284 marker locations to individuals, whose BMI and covariate data remained intact. This procedure preserved the correlation structure of the markers and the correlation structure of BMI and covariates, but dissociated the relationship between the markers and phenotypes. For each permuted data set, we performed the same regression analysis of BMI on excess ancestry at each marker location, as was done for the original data, and obtained the most extreme values (positive and negative) of rl (the Z-score statistics). One thousand permutations were performed. To derive P values adjusted for multiple testing, we determined the percentage of times out of 1,000 permutations that an observed value of rl was exceeded in the permuted data analysis.
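A permutation scheme of this kind might look like the sketch below. It is an illustration under stated assumptions rather than the procedure as actually implemented: the names `z_scores` and `permutation_adjusted_p` are hypothetical, the phenotype `y` is assumed to be already adjusted for covariates, and ties are counted as exceedances, which is slightly conservative.

```python
import numpy as np

def z_scores(x, y):
    """Slope / standard error from a simple regression of y on each column of x."""
    n = x.shape[0]
    xc = x - x.mean(axis=0)
    yc = y - y.mean()
    sxx = (xc ** 2).sum(axis=0)
    b = xc.T @ yc / sxx
    resid = yc[:, None] - xc * b
    se = np.sqrt((resid ** 2).sum(axis=0) / (n - 2) / sxx)
    return b / se

def permutation_adjusted_p(x, y, n_perm=1000, seed=0):
    """Adjusted P values from the permutation distribution of the extreme Z-scores."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    rng = np.random.default_rng(seed)
    observed = z_scores(x, y)
    max_null = np.empty(n_perm)
    min_null = np.empty(n_perm)
    for p in range(n_perm):
        # Shuffle whole rows: each individual receives another individual's
        # complete set of ancestry estimates, so marker correlations are kept.
        perm = rng.permutation(x.shape[0])
        z = z_scores(x[perm], y)
        max_null[p] = z.max()
        min_null[p] = z.min()
    # Fraction of permutations whose most extreme value reaches the observed one.
    p_pos = (max_null[None, :] >= observed[:, None]).mean(axis=1)
    p_neg = (min_null[None, :] <= observed[:, None]).mean(axis=1)
    return observed, np.where(observed >= 0, p_pos, p_neg)
```

Comparing each observed value against the maximum (or minimum) over all loci in every permuted data set is what makes the resulting P values adjusted for the number of markers tested.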
Sample demographics are given in Table 1. There were 878 males and 466 females. The subjects from the GenNet network were the youngest (average age 41) and subjects from GENOA the oldest (average age 60), while the HyperGEN subjects were in between (average age 51). Average BMIs were generally higher in males than females, and there was little variation in BMI among networks. Average European ancestry varied modestly among the three networks, as was previously observed23.
The distribution of BMI for the 1344 unrelated individuals in this study was positively skewed. Neither the 1/BMI nor the Log(BMI) transformation, both commonly used in the literature, provided a satisfactory normalization of the data. However, a loglog transformation of BMI did make the distribution approximately normal (Supplementary Figure 1). The trait LLBMI, which was defined as the loglog transformation of the original BMI, was strongly affected by sex and study center but not by age (or age squared) when tested by analysis of variance (Table 2).
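As an illustration of why a second logarithm can help with a right-skewed trait, the sketch below applies the transformations to a few made-up BMI values and compares the sample skewness. The use of natural logarithms and the helper names `llbmi` and `sample_skewness` are assumptions; the text does not state the base of the logarithm.

```python
import numpy as np

def sample_skewness(v):
    """Moment-based skewness: third central moment over variance**1.5."""
    v = np.asarray(v, dtype=float)
    d = v - v.mean()
    return (d ** 3).mean() / (d ** 2).mean() ** 1.5

def llbmi(bmi):
    """Log-log transformation of BMI (natural logs assumed); requires BMI > 1."""
    bmi = np.asarray(bmi, dtype=float)
    return np.log(np.log(bmi))

# Hypothetical, right-skewed BMI values used only for illustration.
bmi = np.array([20.0, 25.0, 30.0, 40.0, 60.0])
for name, values in [("BMI", bmi), ("log(BMI)", np.log(bmi)), ("LLBMI", llbmi(bmi))]:
    print(name, round(float(sample_skewness(values)), 3))
```

Each additional logarithm is a concave, increasing transformation, so it pulls in the long right tail further; whether the result is close enough to normal still has to be checked on the real data, for example with a Q-Q plot.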
After adjusting the LLBMI values for sex, study network and the interaction between sex and network, we regressed it on excess European Ancestry (xl) at each locus l (l= 1 to 284). The ratio (rl) of the estimated slope of the regression (bl) divided by its standard error (sl) is asymptotically normal. We looked at regions with high absolute value of rl. A positive value of the gradient bl (and hence rl) at the marker locus l implies that BMI is positively correlated with excess European ancestry (and negatively correlated with African ancestry) at that locus while a negative value of rl implies that BMI is negatively correlated with European Ancestry (and positively correlated with African ancestry) at that locus.
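One simple way to carry out an adjustment for sex, network and their interaction is to subtract the mean of each sex-by-network cell, since a model with both main effects and the interaction is equivalent to fitting one mean per cell. The sketch below does this with a least-squares fit; the function name and the string coding of the factors are assumptions made for illustration, not details taken from the study.

```python
import numpy as np

def adjust_for_sex_and_network(y, sex, network):
    """Return residuals of y after fitting sex, network and sex-by-network effects.

    One indicator column per sex-by-network cell spans the same model space
    as the two main effects plus their interaction.
    """
    y = np.asarray(y, dtype=float)
    cells = np.array([f"{s}|{n}" for s, n in zip(sex, network)])
    levels = np.unique(cells)
    design = (cells[:, None] == levels[None, :]).astype(float)
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)   # fitted cell means
    return y - design @ coef

# Tiny made-up example: residuals are deviations from each cell's mean.
y = [1.0, 3.0, 10.0, 14.0, 5.0, 7.0]
sex = ["M", "M", "F", "F", "M", "M"]
network = ["A", "A", "A", "A", "B", "B"]
print(adjust_for_sex_and_network(y, sex, network))
```

The residuals returned here would then play the role of the adjusted phenotype in the per-locus regression on excess European ancestry.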
A Quantile-Quantile (Q-Q) plot of the rl values against a normal distribution reveals that the fit is good except for the tails. There is a significant bulge in the right tail and at least 3 points in the left tail of the distribution also appear to be outliers (Figure 1). Table 3 lists all markers for which the absolute value of the ratio rl was larger than 1.96 (corresponding to a two-sided unadjusted P value of .05). There are 11 points in the left tail and 13 in the right tail with absolute values larger than 1.96, compared to 7 expected in each tail by chance. The three most extreme points in the left tail of the distribution are three consecutive markers, all from chromosome region 15q25.3–26.2. The next locus lying posterior to 15q26.2 at 15q26.3 (D15S642 or GATA27A03) also has a low rl value of −2.12. The points that constitute the bulge in the right tail of the distribution are markers primarily from chromosome regions 3q28–29 and 5q14–23. There are also several points from the region 16p11.2–13.1.
To check the significance of our findings we ran a permutation test in which we randomly assigned the adjusted LLBMI values with covariates to an individual, and regressed it on the locus specific excess European Ancestry (xl) (see Methods). In 1000 permutations, the minimum of 284 rl scores only once crossed −3.14 and never crossed −3.43. The maximum also crossed 2.95 only once. Hence the results associated with the markers D15S816, D15S652 and D3S1311 have empirical adjusted P values less than .002. We have done some additional analyses to show that these values are not due to ‘outlier’ effects. Specifically, we looked at the scatter plot of individual excess European ancestry and LLBMI to search for outlier points with very high (or low) individual excess European Ancestry at a locus coupled with a very high (or low) BMI value, which could distort the results. However, no such ‘outlier’ points were identified (Supplementary Figure 2).
Figure 1 reveals outliers in both tails of the ancestry Z-score distribution, but otherwise a good fit to a normal distribution. Table 3 lists 24 markers out of 284 (8.5%), compared to the expected 14 (5%). If we take |rl| > 2.5 as our cutoff instead of 1.96, there are 9 (3.2%) markers above that threshold, while we expect only 3.5 (1.24%). Looking at the markers in Table 3 in further detail, we find that they are mostly clustered into 6 different regions of the genome (Table 4): 1q, 7p and 15q (excess African ancestry) and 3q, 5q and 16p (excess European ancestry). Eight of the 9 markers for which the rl values are larger than 2.5 (shown in bold) are from the three regions 15q25.3–q26.3, 3q28–q29 and 5q14.1–5q32.
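The expected counts quoted here follow from the standard normal tail probabilities multiplied by the number of markers. A short check, assuming the 284 Z-scores behave as independent standard normal draws under the null (an idealization, since neighbouring markers are correlated):

```python
import math

def two_sided_tail(z):
    """P(|Z| > z) for a standard normal variable Z."""
    return math.erfc(z / math.sqrt(2.0))

n_markers = 284
for cutoff in (1.96, 2.5):
    p = two_sided_tail(cutoff)
    # Tail probability, expected markers beyond the cutoff, and expected per tail.
    print(cutoff, round(p, 4), round(n_markers * p, 1), round(n_markers * p / 2, 1))
```

This gives roughly 14 markers expected beyond ±1.96 (about 7 in each tail) and about 3.5 beyond ±2.5, in line with the figures in the text.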
Among the six identified locations, the regions 7p12.3–7p14.3 and 16p11.2–16p13.1 have the lowest statistical significance. However, the region on 7p is known to harbor the growth-related Growth hormone-releasing hormone receptor (GHRHR) and Isolated growth hormone deficiency (IGHD) genes. Mutations in these genes have been generally associated with lower BMI31. Mouse polygenic models of obesity studying the QTL for abdominal fat have found a putative human homologue in region 7p13–p12 (ref. 32). It may be noted here that the rl values in the 7p region are negative, implying a positive association of BMI with excess African ancestry at that locus. In a recent linkage study among 769 subjects from 182 families in Africa, the marker D7S817 was linked to BMI with a LOD score of 3.83 (ref. 33). In another linkage study consisting of 342 families, D7S1818 was associated with a LOD score of 2.2 for BMI trends from childhood to adulthood34. The marker D16S764, which has the highest rl value (2.41) among all the markers in the 16p region, has been shown previously to be both modestly linked (LOD 2.45) and associated (p<0.0006) with BMI35. The sample in that study was composed of 893 white sibpairs, which may be consistent with the high positive value of rl. Several markers in this region have also been shown to be linked to BMI and other obesity related phenotypes with LOD scores ranging from 1.7 to 3.21. All these studies were reported among different populations of European descent 35–38.
The region 1q32.2–1q42.3 contains the Angiotensinogen (AGT) gene. Numerous previous studies30–41 have found association between this gene and obesity related phenotypes, including BMI. Keeping in mind the negative rl values of markers in this region (African excess), the fact that linkage of this region with obesity related phenotypes was found only among black families may be of particular interest42.
The 6 consecutive markers on chromosome 5 with high positive rl values span a large region from 78 Mb to 144 Mb (5q14.1–5q32). In a study involving 321 sibpairs, the second marker in this region, D5S1725 (with an rl value of 2.37), was previously found to be linked to body fat and fat mass with LOD scores of 2.56 and 2.25, respectively43. It is worth mentioning that these findings are from a study in Western Africa. However, in two different linkage studies among white families, markers in this region, including D5S1505, have been found to be linked with more direct BMI related traits34,44. Chen et al34 analyzed 342 sibships and found D5S1453 at 5q21.3 to be linked with trends in BMI from childhood to adulthood with a LOD score of 2 and D5S1505 to be linked with long term burden in BMI with a LOD score of 2.2. The French study44 was based on 447 subjects in 109 pedigrees chosen through a proband with BMI>27. They found a linkage peak with LOD score 2.68 at marker D5S1463, located at 5q14.3. Rice et al45 studied abdominal fat and high BMI, involving 453 subjects in 99 white families, and found two peaks in this region. Of these two locations, marker D5S658 at 5q31.3 had a LOD score of 2.06 and D5S1480 had a LOD score of 2.1. In another study involving 88 families, BMI was shown to be associated with polymorphisms in the gene Nuclear Receptor subfamily 3, group C, member 1 (glucocorticoid receptor) (NR3C1) at 5q31 with a p-value of 0.009 (ref. 46).
The chromosome 3 markers D3S2418 and D3S1311, spanning a 5 Mb region, show the strongest association of excess European ancestry with BMI in our study. Associations of markers in the candidate gene Apolipoprotein D (3q26.2-qter) with BMI were reported by Vijayraghaban et al47. Marker D3S1311, which was statistically significant after the permutation test, also lies in a promising linkage region (LOD score 2.5) in the study of Rice et al45. The region 15q25.3–15q26.3 harbors the 2 markers (D15S652 and D15S816) with the most significant rl values in our study. It is also the site of the gene neuromedin B (NMB) (15q22-qter), which is a candidate for type 1 diabetes, obesity and hunger disorder48. Polymorphisms in NMB have been shown to be associated with BMI and other obesity related phenotypes46. In a recent study, Bouchard et al49 fine mapped a 20-megabase region around a quantitative trait locus on chromosome 15q26 for abdominal subcutaneous fat (ASF) in an extended sample of 707 subjects from 202 families from the Quebec Family Study. Chagnon et al50 studied 336 sibpairs and 609 relative pairs and found D15S652 linked with fat free mass with a LOD score of 3.56. Another marker in this region, D15S657 at 15q26.2, also had a LOD score of 2. Using a subset of the data that we have analyzed here, Lewis et al51 have reported linkage (LOD score 3) of body fat (%) with marker D15S655 among males in the HyperGEN network.
To our knowledge, this is the largest Quantitative admixture mapping effort in terms of sample size and marker locus involvement. We took care to eliminate possible errors in the locus-wise ancestry estimates, because any systematic bias could affect the final results. Statistical variation in locus-wise ancestry estimates was kept to a minimum by running the MCMC for a long period. To further check the robustness of our results, we also examined results of analyses based on different random selections of unrelated individuals from the AA families and obtained very similar results. Overall our findings are encouraging and provide regions for follow up analyses of genes influencing BMI in these and other African American families.
Figure S1. Histograms and normal Q-Q plots of BMI, Log(BMI) and Log(log(BMI)).
Figure S2. Scatter plots of excess European ancestry at three loci versus Log(log(BMI)).