Subject and genotyping panel selection for the PAGE consortium have been described elsewhere. In brief, a panel of 68 common polymorphisms previously reported to associate with body mass index (BMI), type 2 diabetes (T2D), or lipid levels was genotyped in up to 14,492 self-reported African Americans (AA), 8,202 Hispanic Americans (HA), 5,425 Asian Americans (AS), 6,186 Native Americans (NA), 1,801 Pacific Islanders (PI), and 37,061 European Americans (EA) (for details, see Materials and Methods, Table S1, and Table S2). We also analyzed a subset of 5,863 AA from PAGE who were genotyped on the Illumina Metabochip, which contains approximately 200,000 SNPs densely focused on 257 regions with reported GWAS associations to traits that include lipids, BMI, and T2D.
For a replication analysis it would be overly conservative to use the Bonferroni correction, so the Benjamini-Hochberg method was applied to assess replication of previous EA reports in the PAGE EA population. Reported effects in EA were replicated for 51 of the 68 index SNPs at a 5% FDR. Power to replicate at most of these 68 SNPs far exceeded 80%; 16 of the 17 SNPs that did not replicate had more than 80% power to replicate the reported effect size, and the 17th had more than 70% power, as described previously. The originally reported effect sizes tend to be less extreme for these 17 index SNPs, but in 63 of 79 comparisons between non-EA and EA populations involving these 17 SNPs, the direction of effect was the same in EA and non-EA groups (p for the null hypothesis of random effects in either direction; data in column “Index SNPs Not Replicated in EA”). Only 79 of the 85 possible pairwise comparisons against EA were assessed, because some of the 17 SNPs were not genotyped in all five non-EA populations. Thus, it appears likely that most of the 17 failures to replicate represent weak effects that were underpowered in PAGE EA, rather than false-positive primary reports. Therefore, all 68 index SNPs were carried forward in the generalization analysis.
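The contrast between the Bonferroni correction and the Benjamini-Hochberg step-up procedure can be sketched as follows. This is a minimal illustration, not the PAGE analysis code: the function names and the example p-values are hypothetical, and the only settings taken from the text are the 5% FDR and the general choice of procedure.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return booleans marking which hypotheses are rejected at the given FDR."""
    m = len(p_values)
    # Rank p-values from smallest to largest, remembering original positions.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Step-up rule: find the largest rank k with p_(k) <= (k / m) * fdr.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_k = rank
    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


def bonferroni(p_values, alpha=0.05):
    """Return booleans marking which hypotheses pass a Bonferroni threshold."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


# Hypothetical p-values for illustration only (not PAGE data).
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
bh_calls = benjamini_hochberg(example_p)
bonf_calls = bonferroni(example_p)
print(sum(bh_calls), "rejected by Benjamini-Hochberg;", sum(bonf_calls), "by Bonferroni")
```

With these made-up values the step-up procedure rejects one more hypothesis than Bonferroni, which is the sense in which Bonferroni is the more conservative choice for a replication analysis.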
Summary of direction and strength of β relative to EA.
In all non-EA groups, we observe significantly more effects in the same direction as in EA than expected under the null hypothesis, ranging from 68% in Asians to 88% in Hispanics (p<0.001 in all non-EA groups). Even in the relatively small Pacific Islander population (N = 1,801), where only four index SNPs were significantly associated with reported traits, 48 of 62 effects were in the same direction as EA (p<0.001), so in larger samples from this population we would expect additional loci to generalize. Although a higher proportion of effects in the opposite direction of EA was observed in Asians and Pacific Islanders, the opposite effects were neither significantly different from no effect, nor significantly different from the observed effect in the EA population. This suggests that the greater number of effects in the opposite direction observed in these smallest groups simply reflects greater uncertainty in estimating effect sizes for these populations, rather than any true trend toward opposite effects. The proportion of effects in the same direction as EA was similar across all non-EA populations, suggesting that for at least 70% of index SNPs, a significant effect in a consistent direction will ultimately be observed in non-EA populations of adequate size.
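The direction-of-effect comparisons can be checked with an exact binomial sign test. The sketch below assumes a one-sided test against a 50% chance of matching direction; the text does not state whether the reported p-values are one- or two-sided, and the function name is hypothetical. The counts (48 of 62 and 63 of 79) are those quoted in the text.

```python
from math import comb


def sign_test_p(same, total):
    """One-sided exact p-value: P(X >= same) for X ~ Binomial(total, 0.5)."""
    tail = sum(comb(total, k) for k in range(same, total + 1))
    return tail / 2 ** total


# Pacific Islanders: 48 of 62 effects in the same direction as EA.
p_pacific_islander = sign_test_p(48, 62)
# Index SNPs not replicated in EA: 63 of 79 comparisons in the same direction.
p_not_replicated = sign_test_p(63, 79)
print(p_pacific_islander < 0.001, p_not_replicated < 0.001)
```

Under this assumption both counts fall below the p<0.001 level quoted for the Pacific Islander comparison.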
Generalization analysis in the PAGE populations.
Whereas the direction of effect was consistent between EA and non-EA populations, the magnitude of effect varied considerably, consistent with prior meta-analyses of generalization. Because effect sizes were correlated among non-EA populations, we applied the Benjamini-Hochberg method within each population to identify index SNPs with significantly inconsistent effects between EA and non-EA populations. Inconsistent effects (βpop≠βEA at 5% FDR) were observed for 17 of 68 index SNPs in at least one non-EA population (Table S2; see Box 1 for definitions). Inconsistent effects were most frequent in the AA population (12 of 68 loci), but examples were also observed in Pacific Islanders and Native Americans. Although most effects were consistent between EA and non-EA populations, the relatively high frequency with which differential effects were observed in non-EA populations suggests that genetic risk models derived from GWAS in EA will predict risk less reliably in non-EA populations, particularly AA. Consequently, caution should be exercised in applying risk models based upon risk variants genotyped outside of the ethnic background in which they were derived, regardless of the factors causing the observed variation between populations.
Summary of generalization results.
Box 1. Definitions
βpop: The effect size of a given SNP in linear or logistic regression models for a specific PAGE population. Where available and when allowed by the informed consent protocols, effect sizes were estimated in models that included estimated genetic ancestry, as previously reported (see Text S1).
βEA: The effect size of a given SNP in the PAGE EA population. We use the PAGE EA effect size for comparisons to PAGE non-EA populations rather than the original report for two reasons: to minimize the impact of winner's curse on these comparisons, and because several of the SNPs genotyped in PAGE were proxies strongly correlated with the original tagSNP, and might not match the reported effect size.
We define replicated SNPs as SNPs with direction of effect consistent with the original report in EA, and significant βEA in PAGE (using α = 0.05 as the threshold for hypothesis rejection, unadjusted for multiple testing, as these are considered specific prior hypotheses being validated).
When comparing two populations, the direction of effect can be either the same (βpop and βEA are either both positive, or both negative) or opposite (one is positive, and the other is negative). Magnitude of effect was evaluated only for SNPs that replicated in EA and can be either stronger (|βpop|>|βEA|), the same (|βpop| = |βEA|), or weaker (|βpop|<|βEA|).
In order to describe the generalization of EA findings to non-EA populations, SNPs are categorized in terms of (a) significance in the non-EA population and (b) consistency between non-EA and EA populations. Here we use the Benjamini-Hochberg procedure to adjust for testing up to 68 SNPs in each non-EA population.
Significant SNPs reject the null hypothesis of no effect in the non-EA population (βpop≠0) at q = 0.05.
Inconsistent SNPs reject the hypothesis of equal effect size in EA and non-EA populations (βpop≠βEA) at q = 0.05.
Combining these parameters yields four categories of generalization:
- Ambiguous SNPs are neither significant in non-EA, nor inconsistent between non-EA and EA.
- Differential SNPs are not significant in non-EA, but are inconsistent between non-EA and EA.
- Differentially Generalized SNPs are significant in non-EA, and inconsistent between non-EA and EA.
- Strictly Generalized SNPs are significant in non-EA, and consistent between non-EA and EA.
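The four categories amount to a two-by-two table over the "significant" and "inconsistent" calls. A minimal sketch of that mapping is given here; the function name and the boolean inputs are hypothetical, and the calls themselves are assumed to come from the Benjamini-Hochberg adjusted tests described above.

```python
def generalization_category(significant, inconsistent):
    """Map the two per-SNP test outcomes onto the four Box 1 categories.

    significant:  True if the SNP rejects beta_pop = 0 in the non-EA population.
    inconsistent: True if the SNP rejects beta_pop = beta_EA.
    """
    if significant and inconsistent:
        return "Differentially Generalized"
    if significant:
        return "Strictly Generalized"
    if inconsistent:
        return "Differential"
    return "Ambiguous"


# Enumerate every combination of the two calls.
for sig in (True, False):
    for inc in (True, False):
        print(sig, inc, generalization_category(sig, inc))
```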
Four index SNPs showed differentially generalized effects (βpop≠βEA and βpop≠0). Two of these did not replicate in EA (rs7578597 and rs7961581 for T2D in NA), so consistency of direction cannot accurately be inferred. Direction of effect in EA and non-EA was the same for the remaining two index SNPs: rs3764261 was significantly weaker for HDL in AA, and rs28927680 was significantly stronger for TG in Pacific Islanders. There were no observations of opposite effects where both the EA effect and the non-EA effect were significant.
Considering only the 15 SNPs with a significantly inconsistent effect between EA and at least one non-EA population, 14 of 15 diluted toward the null (p<0.01), a trend driven by the AA population, where all 12 significant inconsistencies were diluted. Expanding analysis to all 51 loci replicated in EA, regardless of whether a significant difference was observed between EA and non-EA at a given SNP, we observed a significant excess of effects diluted toward the null (βpop/βEA<1) in AA, HA, and NA populations (Table S5). Comparisons between non-EA populations revealed that diluted effect sizes were significantly more likely in AA than in any other non-EA population.
Given that differential effect sizes were observed for many tagSNPs, we sought to leverage the data in order to assess the relative contributions of several factors that might contribute to the significant trend toward diluted effects, including gene–environment interaction with an exposure that varies across populations (differential environment), differences in the correlation between the index SNP and the functional variant across populations (differential tagging), modulation of the index SNP effect by additional, population-specific polymorphisms (differential genetic background), population-specific synthetic alleles (combinations of rare, functional alleles tagged by a single common tagSNP), or some combination of these factors. It seems unlikely that differential environments would be much more frequent in AA than other non-EA populations, or that differential environment would consistently bias toward the null within AA. Differential tagging is consistent with differentially diluted effects in AA; because linkage disequilibrium extends over significantly shorter distances in African populations than in non-African populations, common functional variants (or synthetic alleles) are likely to be less strongly tagged by the index tagSNPs in AA. Differential genetic background effects in AA would also be consistent with the high nucleotide diversity known to exist in this population. The rare functional variants contributing to synthetic alleles will tend to be younger than common variants, and therefore are more likely to be population-specific, so synthetic alleles are compatible with the trend toward dilution. Thus, although differential environmental effects cannot be excluded, the observed data are more consistent with differential tagging and/or differential genetic background effects, and synthetic alleles cannot be excluded.
Genetic background effects can be subdivided into modifying effects, where variants elsewhere in the genome directly alter the effect associated with a given index SNP, and interference effects, where secondary variants change the proportion of variance explained by the index SNP. Interfering functional variants with effects in the same direction as the index SNP would tend to dilute the apparent effect size at the index variant. The most likely source of such variants is the region surrounding an index SNP, as demonstrably functional variants already exist in that region. Although examples have been described of genes carrying both risk and protective mutations, others clearly exhibit trends toward risk alleles with similar effects (e.g., preferentially toward breast cancer risk alleles at BRCA1). If the direction of effect for functional variants in a given region is consistently biased, then an increase in the number of interfering variants within a given population would be consistent with a trend toward dilution of index effects. The higher nucleotide diversity observed in African populations relative to non-African populations would be consistent with a greater burden of secondary functional variants in AA than other populations.
In order to assess the contribution of the factors outlined above to differential effect sizes between EA and AA in the index tagSNP associations, high density genotype data were collected from a subset of the PAGE African American sample. The number of AA individuals used for index tagSNP analyses varied by phenotype, with an average of 7,501 (Table S3). Similar data on other populations are currently unavailable, so only loci showing differential effects between EA and AA could be analyzed. Genotype data were collected using the Metabochip, a high density genotyping array commercially available from Illumina. Detailed methods for the Metabochip genotype data collection, calling, and quality control are available elsewhere.
In order to measure the contribution of differential LD to dilution, we need a model of how changes in LD between a tagSNP and a functional variant would be expected to alter the observed effect size at the tagSNP, assuming that the effect size at the functional variant is the same in both populations. Given a functional SNP (fSNP) and an associated tagSNP, linkage disequilibrium between the two SNPs can be described as the measurement error introduced by genotyping the tagSNP, rather than genotyping the fSNP directly. As such, by appealing to prior work on regression dilution bias, it can be shown that the effect size $\beta'$ at the tagSNP is related to the effect size $\beta$ at the fSNP by the following equation:
$$\beta' = r^2 \beta$$
(see Text S1 for details), where $r^2$ is the squared correlation between the tagSNP and the fSNP. Thus, assuming that the effect size at the fSNP is constant between populations, when linkage disequilibrium between tagSNP and fSNP is weaker in a given population, we expect to see a greater degree of dilution bias for the estimated tagSNP effect size. Rearranging this equation,
$$r^2 = \frac{\beta'}{\beta}$$
Extrapolating to compare the degree of dilution bias between AA and EA populations, we expect changes in linkage disequilibrium across populations to be reflected by changes in relative effect size:
$$\frac{r^2_{AA}}{r^2_{EA}} = \frac{\beta'_{AA}/\beta_{AA}}{\beta'_{EA}/\beta_{EA}}$$
Assuming the effect size of the functional variant is the same in both populations, this reduces to:
$$\frac{r^2_{AA}}{r^2_{EA}} = \frac{\beta'_{AA}}{\beta'_{EA}}$$
The above equation allows us to directly compare the observed distribution of relative effect sizes at the tagSNPs in AA and EA ($\beta'_{AA}/\beta'_{EA}$) against the relative strength of tagging in AA and EA ($r^2_{AA}/r^2_{EA}$). Considering the subset of index tagSNPs in regions that were present on the Metabochip, we observed 51 index tagSNPs that fell into 47 independent loci on the Metabochip. We identified the set of SNPs tagged by each index tagSNP at $r^2 > 0.8$ in an EA population, yielding a total of 1,093 tagged SNPs for the 51 index tagSNPs. For each of these 1,144 SNPs (the 51 index tagSNPs plus the 1,093 SNPs they tag), we then calculated $r^2_{AA}/r^2_{EA}$. Let this represent the expected distribution of differential LD between AA and EA. Next, we calculated $\beta'_{AA}/\beta'_{EA}$ for the subset of 40 of the 51 index tagSNPs that replicated at q = 0.05 in EA, truncating at 0 if the signs were opposite between populations. These two distributions ($r^2_{AA}/r^2_{EA}$ in all 1,144 SNPs versus $\beta'_{AA}/\beta'_{EA}$ for the 40 index tagSNPs) were not significantly different by two-tailed t test. Thus, we cannot reject the hypothesis that the observed dilution bias in AA effect sizes at the index tagSNPs is consistent with the observed distribution of differential LD between the two populations. A single-locus example of the potential for differential LD to contribute to diluted effect sizes is shown for LDL at the PSRC1 locus.
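The regression dilution argument can be illustrated with a small simulation. This is a sketch under stated assumptions, not the derivation in Text S1: it uses the classical measurement-error model, in which the proxy equals the true exposure plus independent noise, and it uses continuous Gaussian variables rather than genotypes. Under that model the slope estimated at the proxy is expected to approach the true slope multiplied by the squared correlation between proxy and true exposure. All names and parameter values are hypothetical.

```python
import random


def mean(values):
    return sum(values) / len(values)


def slope(x, y):
    """Ordinary least-squares slope of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def r_squared(x, y):
    """Squared Pearson correlation between x and y."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


def simulate_dilution(beta=0.5, proxy_noise_sd=0.5, n=20000, seed=1):
    """Return (slope at true exposure, slope at noisy proxy, r^2 between them)."""
    rng = random.Random(seed)
    # "Functional" exposure, standing in for the fSNP.
    functional = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Noisy proxy, standing in for the tagSNP (classical measurement error).
    proxy = [f + rng.gauss(0.0, proxy_noise_sd) for f in functional]
    # Trait depends only on the functional exposure.
    trait = [beta * f + rng.gauss(0.0, 1.0) for f in functional]
    return slope(functional, trait), slope(proxy, trait), r_squared(functional, proxy)


beta_f, beta_tag, r2 = simulate_dilution()
print(f"slope at exposure {beta_f:.3f}, slope at proxy {beta_tag:.3f}, r^2 * slope {r2 * beta_f:.3f}")
```

Raising `proxy_noise_sd` weakens the correlation between proxy and exposure and shrinks the proxy slope further toward zero, which mirrors the expectation that weaker tagging in one population produces a more diluted tagSNP effect.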
Dilution of effect size at PSRC1 for LDL.
Considering the 12 SNPs showing differential effect size in AA, regions spanning 11 were present on the Metabochip (Table S3). Before comparison with EA, we compared the observed effect sizes at the index tagSNPs in the full AA sample and the subsample of AAs genotyped on the Metabochip (AAmchip). Three of the index tagSNPs failed to genotype on the Metabochip, leaving eight index tagSNPs for this direct comparison (Table S4). No significant allele frequency differences were observed between the AAmchip subset and the full AA population, consistent with AAmchip being a representative subsample. A significantly inconsistent and diluted effect size in AAmchip compared to EA was still observed for five of these eight tagSNPs (p<0.05, Table S4). The index tagSNPs without a significant difference likely reflect reduced power to detect the differential effect size in the AAmchip subsample, as these three index tagSNPs also had the least significant differential effect when comparing the full PAGE AA subpopulation against EA.
The Metabochip genotype data allowed us to evaluate regions spanning each of the 11 variants for the underlying contributions of population-specific alleles, differential tagging, and secondary alleles to differential effect sizes. Detailed discussion of each locus is provided in Text S1. In summary, the 11 SNPs fell in 10 Metabochip regions, so all SNPs in each of the 10 regions were assessed for association with the reported trait in AAmchip. The threshold level for significance within each region was conservatively adjusted for multiple testing by Bonferroni adjustment for the number of SNPs successfully genotyped on the Metabochip within the region, with minor allele frequency greater than 1% in the AAmchip sample. For example, the Metabochip region spanning CETP contained 84 SNPs, so our significance threshold for that region was p<0.05/84 (approximately 6×10⁻⁴). One locus (APOE) could not be dissected confidently as LD data for the index tagSNP were not available in EA, and two loci were underpowered to draw strong conclusions, as evidenced by the failure of any variant in the region to show a significantly inconsistent effect with the index tagSNP effect in EA. Among the remaining seven loci, we observed one clear example of a diluted signal consistent with EA-specific functional alleles, either common or synthetic, and five loci showed patterns consistent with fine-mapping of the index tagSNP bin. One of these fine-mapped the EA association to a variant that was not strongly associated with the index tagSNP in EA (r²<0.5), potentially consistent with a synthetic allele in EA. We also observed statistically significant secondary functional alleles at three loci.
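The region-specific Bonferroni threshold described above can be sketched as a short function. The family-wise α of 0.05 is an assumption (the text does not state it for this analysis), and the function name and the minor allele frequencies in the example are hypothetical; only the 84-SNP count for the CETP region and the 1% frequency filter come from the text.

```python
def region_threshold(minor_allele_freqs, alpha=0.05, min_maf=0.01):
    """Bonferroni threshold for one region, counting only SNPs above min_maf."""
    n_tested = sum(1 for maf in minor_allele_freqs if maf > min_maf)
    if n_tested == 0:
        raise ValueError("no SNPs pass the minor allele frequency filter")
    return alpha / n_tested


# Hypothetical region: 84 SNPs above 1% frequency plus three rarer SNPs
# that are excluded from the multiple-testing count.
cetp_like_region = [0.10] * 84 + [0.005] * 3
threshold = region_threshold(cetp_like_region)
print(f"per-SNP threshold: {threshold:.2e}")
```

Because the count of tested SNPs differs between regions, each region gets its own threshold rather than a single array-wide cutoff.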
Examples of loci without evidence of association in AAmchip or fine mapping EA signal.
Examples of secondary alleles in the AA population.
Thus, although the overall pattern of effect dilution in AA is consistent with expectations on the basis of differential LD patterns between AA and EA populations, putative examples of EA-specific alleles and secondary alleles in AA were also observed. A contribution from synthetic alleles cannot be excluded, and may well account for the EA-specific allele at CILP2. However, at half of the 10 loci we observed at least one of the tagged SNPs in EA that showed an effect size in AA consistent with the effect size at the tagSNP in EA. These examples of fine-mapping EA signal suggest that at least half of EA GWAS signals tag a common, functional variant. The observed excess of dilution effects in AA (as compared to other non-EA populations) suggests that African-descended populations will be the most useful single subpopulation for fine-mapping of EA GWAS associations, although the significant trend toward excess dilution in HA and NA populations (Table S5) suggests that trans-ethnic fine-mapping may prove more powerful than fine-mapping with any single non-EA population.
In conclusion, we have assessed the generalization of GWAS associations from EA populations across five clinically relevant traits, in five non-EA populations. Our results demonstrate that although most EA GWAS findings can be expected to show an effect in the same direction for non-EA populations, a significant fraction of GWAS-identified variants from EA will exhibit differential effect sizes in at least one non-EA population, and these differential results will be far more frequent in the AA population. These findings suggest that expanded GWAS and fine-mapping efforts focused on non-EA populations, especially AA, will substantially enhance our understanding of the genetic architecture of common traits within non-EA populations. It will be particularly important to extend GWAS discovery efforts to non-EA populations if genetic risk prediction models using tagSNP genotypes demonstrate clinical utility, because risk estimates derived from European GWAS clearly generalize imperfectly to non-EA populations. Our analyses suggest that variable LD in its many guises accounts for much of the heterogeneity of effect size at index tagSNPs, rather than any “true” differences in effect size between populations for the functional variants that were tagged. Thus, risk models derived directly from genotypes at functional variants (rather than tagSNPs) may generalize more effectively to non-EA populations.