To date, few GWA studies have been attempted in African-derived populations, despite the success of GWA studies based primarily on European-derived participants (19). In the present study, we describe the first GWA study assessing the association of imputed genotypes and CNVs with height and BMI in two African-derived populations. We applied imputation methods to test variation at up to 2.9 million HapMap SNPs for association. We note that the optimal approach to imputing African-derived genomes is still under development; possible approaches include equal weighting of the YRI and CEU panels, using all HapMap reference panels, or sequential imputation using separate reference panels (31). Although a rigorous comparison of these approaches is beyond the scope of this paper, our assessment of imputation quality by concordance scores (see Materials and Methods) indicated that, at least for relatively common SNPs, our imputation results were of high quality and appropriate for use in association testing.
Using information from both imputed and chip-genotyped SNPs, we discovered no common SNPs showing consistent strong evidence of association with either height or BMI in our study. We have previously reported a region near 3q27 showing evidence of linkage to BMI from a cohort sampled from Maywood, IL (23), but no SNP in this region showed elevated association with BMI in our current study (Supplementary Material, Fig. S6). We also examined a total of 53 loci for height (n = 41) (6) and BMI (n = 12) (17) previously reported to be associated in European-derived populations. Among them, we were able to provide some evidence of replication in African-derived populations for IHH for height and MC4R for BMI. Finally, no strong association of common CNVs with height or BMI was detected. Although lean individuals appeared to have a heavier burden of rare CNVs compared with the heavier individuals, we were not able to replicate this finding in a European data set.
Like many other individual GWA studies, our data set is substantially underpowered to detect novel associations with modest effect sizes, having ~3% power to detect an SNP with an effect size similar to that of ZBTB38. Nonetheless, at this significance level, our GWA discovery panel would have ~50% power to detect SNPs explaining ~1% of the phenotypic variation or ~80% power to detect SNPs explaining ~1.5% of the phenotypic variation, and our replication panels would have >90% power to replicate variants with such strong effects (Supplementary Material, Fig. S4). As such, it appears that there are no common variants with extremely large effects (>1.5% of phenotypic variance explained) or multiple common variants with large effects (>1% of phenotypic variance explained) in the African-derived genome for the anthropometric traits analyzed here. These represent initial estimates of the upper bound of effect sizes for variants associated with anthropometric traits in African-derived populations. Increasing sample size and meta-analysis with other data sets of African-derived populations will be necessary to improve power to identify novel associations as well as to refine the estimates of effect sizes present in African-derived populations.
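The relationship between sample size, variance explained and power described above can be illustrated with a short calculation. The following is a minimal sketch, not the power analysis used in this study: it assumes a 1-df additive test on a quantitative trait, approximates the test statistic as normal with noncentrality sqrt(N h²/(1 − h²)), and uses a hypothetical sample size and significance threshold (`n_samples=2000`, `alpha=1e-6`) that are placeholders rather than values taken from this paper. The function name `power_quantitative_trait` is likewise introduced here for illustration.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1


def power_quantitative_trait(n_samples, variance_explained, alpha, one_sided=False):
    """Approximate power of a 1-df test for a SNP explaining a given
    fraction of phenotypic variance (normal approximation)."""
    if not 0.0 <= variance_explained < 1.0:
        raise ValueError("variance_explained must be in [0, 1)")
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must be in (0, 1)")

    # Noncentrality on the z scale: sqrt(N * h2 / (1 - h2)).
    ncp = (n_samples * variance_explained / (1.0 - variance_explained)) ** 0.5

    if one_sided:
        z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha)
        return 1.0 - _STD_NORMAL.cdf(z_crit - ncp)

    # Two-sided test: reject in either tail.
    z_crit = _STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)
    upper = 1.0 - _STD_NORMAL.cdf(z_crit - ncp)
    lower = _STD_NORMAL.cdf(-z_crit - ncp)
    return upper + lower


# Hypothetical inputs (ASSUMPTIONS, not values reported in this paper).
for h2 in (0.002, 0.01, 0.015):
    power = power_quantitative_trait(n_samples=2000, variance_explained=h2, alpha=1e-6)
    print(f"variance explained {h2:.3f}: power {power:.3f}")
```

Under this approximation, power depends on sample size and effect size only through the product N h²/(1 − h²), which is why variants explaining a small fraction of the variance require proportionally larger samples.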
Examples exist where the variants associated with phenotype were shown to have large allele frequency differences between populations [e.g. MYBPC3 for cardiomyopathies (34), [...] for type 2 diabetes (34) and MYH9 for focal segmental glomerulosclerosis (36)]. Such variants could be missed by GWA studies in a single population, but may be highlighted by admixture mapping, because admixture association signals arise from alleles with large differences in ancestral population frequencies (37). Thus, an approach that combines admixture mapping with the GWA data in larger samples may be a fruitful route to identifying variants that were not highlighted by studies in European-derived populations because of low allele frequency (and hence low power). Further, such an approach allows for testing to determine whether an associated variant is able to account for the admixture mapping signal, thus potentially testing for the causality of the variant.
Despite our insufficient power to detect novel associations, we have >60% power to detect true associations explaining ~0.2% of the variance in our GWA panel at a one-tailed significance level of 0.05. As such, among the 53 confirmed loci for height and BMI previously reported in Europeans, we expected to see nominal associations at ~22 loci if effect sizes are similar in European- and African-derived populations. In total, we observed 17 loci with at least one independent SNP associated at one-tailed P < 0.05, which is slightly below a priori expectations. However, the estimated percent variance explained is likely inflated by the ‘winner's curse’ for some of the reported loci, as effect sizes from independent European-derived replication samples have not been published. Moreover, less than optimal coverage of variants, as well as lower correlation between the HapMap SNPs and any actual causal variant in African-derived genomes, would also cause us to overestimate our expectation. Thus, it appears reasonable to anticipate that the effect sizes of variants associated with anthropometric traits in African-derived populations will be similar to those observed in European-derived populations thus far. Therefore, assuming equal coverage of variation, at least comparable sample sizes would need to be enrolled in studies conducted in African-ancestry populations to achieve the success of European GWA studies.
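The comparison between the ~22 loci expected and the 17 observed can be made concrete with a simple calculation. The sketch below is illustrative only and is not the procedure used in this study: it assumes that the expected count is the sum of per-locus power, that every locus has the same power (set here to 22/53 so that the expectation matches the figure in the text) and that loci are independent. The helper names `expected_replications` and `binomial_lower_tail` are introduced here for illustration.

```python
from math import comb


def expected_replications(per_locus_power):
    """Expected number of nominally associated loci: the sum of per-locus power."""
    return sum(per_locus_power)


def binomial_lower_tail(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


n_loci = 53      # loci examined (41 height + 12 BMI), from the text
expected = 22    # expected nominal associations, from the text
observed = 17    # observed nominal associations, from the text

# ASSUMPTION: equal power at every locus and independence between loci.
avg_power = expected / n_loci
tail = binomial_lower_tail(observed, n_loci, avg_power)

print(f"expected count: {expected_replications([avg_power] * n_loci):.1f}")
print(f"P(X <= {observed}) under equal per-locus power: {tail:.3f}")
```

In practice, power differs across loci with allele frequency and effect size, so per-locus values would replace the single average used here.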
Lastly, it was a curious finding that lean individuals in our data set appeared to have a heavier burden of rare CNVs compared with the obese individuals (Table ). We first observed the finding in the African-American panel, and it was supported by data from the Nigerian cohort, but not by data from a European-derived cohort that also had Affymetrix 6.0 genotypes and CNV calls (28). Therefore, our finding of a heavier CNV burden among lean individuals could represent an effect specific to cohorts of African ancestry, or a statistical fluctuation. If true, and assuming that rare CNVs are generally deleterious from an evolutionary standpoint, this finding could support the thrifty gene hypothesis that heavier individuals are evolutionarily better suited to store energy (39). However, this result would need to be confirmed in additional African-derived cohorts of comparable or larger sample size genotyped on a similar platform.
In summary, we have described the first GWA study of anthropometric traits in African-derived populations. Although we were not able to identify replicable novel SNPs or common CNVs associated with height or BMI, we were able to replicate three of the loci identified in European-derived populations at nominal levels of significance. On the basis of the analysis of power, we have ruled out the existence of multiple variants with large effects (~1% variance explained) in African-derived samples and presented an initial estimate for the upper bound of effect sizes for anthropometric traits, which will improve with a more comprehensive assessment of signal fine-mapping in African-derived genomes. Given the need for larger sample sizes, the data set described here would represent a useful component of a meta-analysis of African-derived populations, for both SNPs and CNVs, to facilitate mapping of genes for anthropometric traits in African-derived populations.