In the Jackson Heart Study, we confirmed that five SNPs discovered in European-derived populations were associated with lipid traits in African Americans at a significant or suggestive level. Furthermore, we identified three SNPs with stronger associations in African Americans than the index discovery SNPs at their respective loci. Taken together, our findings suggested that the same genomic regions that predict lipid traits in European-derived populations are likely relevant in African Americans, but that the specific variants within each locus may differ among ethnic groups. Finally, we found that fine-mapping genomic regions on an African ancestral background has limited utility to narrow genomic regions of association in an admixed cohort of this size.
Three index GWA SNPs did not meet our significance threshold in JHS. This failure to replicate may have occurred for a variety of reasons. Our analysis has modestly reduced power (Supplemental Table) due to sample size relative to the original reports, which were meta-analyses of ~8000 individuals. If the effect sizes in Kathiresan et al4, which were typically less than 0.12 standard units for all SNPs, are applicable to JHS, we had 80% power to detect associations with three variants at p<0.004: rs1260326 (2p23, GCKR) with TG; rs646776 (1p13, CELSR2/PSRC1/SORT1); and rs17321515 (8q24, TRIB1). In addition to a smaller sample size, the reduced power arises in part from the fact that minor allele frequencies at 4 of the 8 loci were lower in African Americans as compared to the discovery cohorts of European ancestry, further decreasing power to confirm association (rs4846914: MAF=0.12 vs 0.40; rs17145738, 0.08 vs 0.13; rs17321515, 0.44 vs 0.49; and rs12130333, 0.10 vs 0.22). Interestingly, although we replicated associations at 2p23 and 1p13, we saw no association at 8q24, despite adequate power. Although the reason for this result is unclear, a possible explanation could be differing patterns in linkage disequilibrium in African Americans, which would weaken the observed association with a non-causal SNP. Additionally, unidentified gene-x-gene or gene-x-environment interactions may cause fundamental differences in effect sizes between European and African Americans.2
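To illustrate how a lower minor allele frequency reduces power at a fixed per-allele effect, the following is a minimal sketch using a normal approximation for a two-sided, 1-degree-of-freedom test of an additive effect on a standardized quantitative trait. It is not the power calculation used in this study. The function name `power_additive_qt` and the sample size `N_HYPOTHETICAL` are assumptions introduced for illustration only; the effect size (0.12 standard units), the threshold (p<0.004), and the two allele frequencies (0.40 vs 0.12 for rs4846914) are taken from the text above.

```python
from statistics import NormalDist


def power_additive_qt(n, maf, beta_sd, alpha):
    """Approximate power to detect an additive SNP effect on a standardized trait.

    n       : number of individuals (hypothetical input)
    maf     : minor allele frequency
    beta_sd : per-allele effect in trait standard deviation units
    alpha   : two-sided significance threshold
    """
    std_normal = NormalDist()
    # Trait variance explained by the SNP, assuming Hardy-Weinberg proportions.
    var_explained = 2.0 * maf * (1.0 - maf) * beta_sd ** 2
    # Expected z-statistic (square root of the non-centrality parameter).
    expected_z = (n * var_explained / (1.0 - var_explained)) ** 0.5
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    # Probability that the test statistic falls in either rejection tail.
    return std_normal.cdf(expected_z - z_crit) + std_normal.cdf(-expected_z - z_crit)


# Hypothetical sample size, chosen only to make the comparison concrete.
N_HYPOTHETICAL = 4000

for maf in (0.40, 0.12):
    power = power_additive_qt(N_HYPOTHETICAL, maf, beta_sd=0.12, alpha=0.004)
    print(f"MAF={maf:.2f}: approximate power={power:.2f}")
```

Under these assumptions, the same per-allele effect explains less trait variance at the lower allele frequency, so the approximate power is lower for MAF=0.12 than for MAF=0.40.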
We confirmed that local ancestry data at each locus provides additional insight into genetic associations in African Americans. For 4 of the 5 associated index GWA SNPs, the effect sizes on the European ancestry background exceeded those on the African background. Although the effect size differences for any single SNP can only be considered suggestive, this finding supports a pattern of systematically weaker effect sizes on the African ancestral background. Because members of the JHS are from the same community, it is unlikely that unmeasured differential environmental exposures contributed to the apparent systematic difference in effect sizes. A more likely explanation is that the index SNPs are markers for the major causal SNP at each locus and that smaller effect sizes in the African local ancestry subgroup result from the weaker SNP-SNP correlations seen in the ancestral West African population.
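The reasoning that weaker SNP-SNP correlation attenuates the effect observed at a marker can be made explicit with a simplified worked relation. This is a sketch under stated assumptions, not a model fitted in this study: a single causal variant per locus, genotypes standardized to unit variance, and a residual uncorrelated with the marker. The symbols $g_c$ (causal genotype), $g_m$ (marker genotype), $\beta_c$, $\beta_m$, and $r$ are introduced here for illustration.

```latex
\begin{align}
y &= \beta_c\, g_c + \varepsilon,
  \qquad \operatorname{Var}(g_c) = \operatorname{Var}(g_m) = 1,
  \qquad \operatorname{Corr}(g_c, g_m) = r, \\
\beta_m &= \frac{\operatorname{Cov}(y, g_m)}{\operatorname{Var}(g_m)} = r\,\beta_c, \\
\text{variance explained by } g_m &= r^2 \beta_c^2 .
\end{align}
```

Under these assumptions, an identical causal effect $\beta_c$ on both ancestral backgrounds would still yield a smaller marker effect $\beta_m$ wherever $|r|$ is smaller, which is consistent with the pattern described above.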
Fine-mapping yielded two variants with consistent effects on plasma TG levels irrespective of local ancestry - rs780093 in GCKR and rs636523 near DOCK7/ANGPTL3 - making them credible candidates as the causal variants at their respective loci. Neither SNP is a nonsynonymous coding variant: rs780093 is intronic within the GCKR gene, and rs636523 is near DOCK7 and ANGPTL3. It currently remains unclear how these noncoding variants might influence phenotypic variation. Ultimately, experimental validation in a faithful disease model will be required to further elucidate their mechanistic action. While further replication of these SNPs in additional African American samples is needed, our findings suggest that these SNPs may have predictive value in African Americans at these two loci. In contrast, while rs629301 was nominally more strongly associated with plasma LDL cholesterol than its corresponding index SNP rs646776, there remained a disparity in the effect on the African background compared to the European background. A suitable genetic marker at this locus for LDL in African Americans thus remains to be identified.
Fine-mapping genomic regions is one proposed next step to further characterize loci discovered through GWA.14 It has been suggested that the shorter linkage disequilibrium (LD) patterns in historically larger populations such as West Africans may allow narrowing the interval in which causal variants may lie.3 However, fine-mapping does not necessarily provide an independent pattern of LD because of the large number of European chromosomal segments arising from admixture. Therefore, we performed a stratified association analysis based upon local ancestry, limiting ourselves to individuals with a high probability of carrying 2 alleles of African ancestry at the locus (JHS_AFR2). In principle, analysis in this subpopulation will provide an entirely independent pattern of LD as there is no contribution from European ancestral segments. For each of the eight loci, this resulted in a subgroup of ~2500 individuals. Using this approach, we did not find any significant associations for LDL, HDL, or TG.
While theoretically attractive, our work highlights several challenges for tag-based fine-mapping in African Americans. First, power to detect novel associations, even in previously defined loci, is limited. Power calculations for SNP discovery within the African Ancestry subpopulations (Supplemental Table) show that we are substantially underpowered to detect the typical effect sizes for SNPs seen in European populations. Genome-wide association studies have shown that common variants impart modest effects on their associated traits; thus, identifying these variants often requires meta-analyses of large cohort consortia. Unfortunately, there are fewer available cohorts of African ancestry, limiting the number of individuals available for genetic analysis. Furthermore, analyses in admixed populations such as African Americans are complicated by the mixture of African and European segments across the genome. By excluding individuals with European ancestry at a given locus from analysis in order to narrow the association signal, sample size is further diminished (by approximately 45% in our study). In addition, it is plausible that the genetic loci discovered in European-derived populations in fact harbor multiple functional loci along a single haplotype and that in the absence of similar LD structure, the association signal may be much weaker. Finally, tag-based strategies typically utilize highly correlated common variants from the HapMap to capture genetic variation across a genomic region. The second version of HapMap contains only ~30% of the common SNPs that are present in the genome,14 so coverage is currently incomplete. The 1000 Genomes Project, an effort currently underway to resequence the entire genome of >1000 individuals, should significantly improve upon the currently available datasets of common variants. Next generation genomic resequencing technology holds great promise by providing the ability to resequence genomic regions, identifying both common and rare variants in a large number of individuals at a relatively modest financial cost.
Our study has several strengths and limitations. First, the Jackson Heart Study is a well-phenotyped, community-based study and is one of the largest available cohorts of African American individuals. Next, we accounted for the genetic architecture of both African and European-derived populations when we selected our tag SNPs. In our analyses, we accounted for population stratification, which can confound genotype-phenotype associations, by adjusting for both global and local ancestry. In addition, we performed stratified analyses by local ancestry and examined whether genetic association signals could be narrowed on the African ancestry background. While many have postulated that this may be a useful approach, few have tested this hypothesis. A key limitation of our study is statistical power. As mentioned earlier, most common variants discovered to date impart small effects, mandating large sample sizes to detect genetic associations. Unfortunately, genetic research in African Americans is currently limited by the relatively few cohorts available as compared to groups of European descent. In addition, tag SNP selection is limited to known common variants.
In conclusion, we investigated eight genomic regions recently reported as associated with lipids in populations of European descent to determine their relevance in African Americans. We confirmed that five of the eight index SNPs were associated at a significant or suggestive level with plasma levels of LDL, HDL, and/or TG in African Americans. Moreover, at three of the five confirmed loci, we found SNPs with stronger associations with lipids in African Americans compared to the index SNPs discovered in European-derived cohorts, demonstrating that although associated regions may be relevant to many ethnic groups, specific markers of association can be unique to individual groups. Finally, we demonstrated the limitations of using a cross-ethnic fine-mapping approach to narrow association signals and suggest resequencing these loci in multi-ethnic groups as an alternate approach.