Genetic association studies conducted in admixed populations may be confounded by population stratification resulting in spurious associations. The purpose of this pilot study was to determine the presence and effect of population stratification in a case-control study of brain arteriovenous malformation (BAVM).
We tested 83 ancestry informative markers in BAVM cases and healthy controls of self-reported Latino race/ethnicity (n = 294). Individual ancestry estimates (IAE) were obtained using the Structure program, assuming 3 underlying subpopulations. Summary χ2 tests comparing genotype frequency of ancestry informative markers were used to detect stratification and IAE were included as covariates in logistic regression analysis to account for differences in genetic background.
Admixture estimates for Latinos (overall 47% Native American, 45% European and 8% African ancestry) revealed heterogeneity between individuals within ancestral groups. The summary χ2 test was significant (p = 0.005), suggesting ancestral differences between cases and controls. Furthermore, genetic ancestry was associated with frequency differences in a promoter variant in the IL-6 gene (IL-6 −174G>C). On average, subjects with the IL-6 −174 GG genotype had 6% greater Native American ancestry (p = 0.023). The age- and sex-adjusted OR for BAVM associated with the IL-6 −174 GG genotype was 1.85 (95% CI 0.99–3.48, p = 0.055), and further adjustment for IAE yielded an OR of 1.96 (95% CI 1.03–3.72, p = 0.039).
The IL-6 −174G>C polymorphism was associated with increased risk of BAVM among Latinos after accounting for differences in ancestral background. These results suggest subtle, negative confounding and illustrate the importance of addressing population stratification in case-control studies conducted in admixed populations.
Case-control studies in admixed populations, such as Latinos or African Americans, are susceptible to genetic confounding by population stratification, which can result in either false-positive or false-negative associations. If the risk of disease varies with ancestry, then admixture will confound associations of disease with genotypes at any locus where allele frequencies also vary between ancestral populations [1, 2]. Several studies have demonstrated that confounding by population stratification can exist despite stringent recruitment of ethnically matched cases and controls from the same clinics [2,3,4,5].
We recently reported that, compared to Caucasians, Latinos were at a 2-fold increased risk of hemorrhage in the untreated course of 1,464 brain arteriovenous malformation (BAVM) patients. The incidence of primary spontaneous intracerebral hemorrhage is also known to be higher among Latinos and African Americans. Therefore, the purpose of this pilot study was to determine if population stratification exists in a case-control study of BAVM conducted among self-reported Latinos. We illustrate the effect of population stratification on a promoter polymorphism in the IL-6 gene (IL-6 −174G>C), which is known to vary by race/ethnicity and has been associated with increased risk of subarachnoid hemorrhage and intracerebral hemorrhage presentation in BAVM patients.
BAVM cases were recruited from the University of California, San Francisco or the Northern California Kaiser Permanente Medical Care Program. Healthy controls from 4 racial/ethnic groups (Chinese, African American, Mexican and Caucasian) were recruited for a pharmacogenetics study of membrane transporters at the University of California, San Francisco. Self-reported race/ethnicity was recorded from the medical records (cases) or medical history questionnaire (controls). For this study, we focused on subjects who self-reported as Latino (n = 294), including 79 BAVM cases and 215 controls.
We selected 96 ancestry informative markers (AIMs) with large absolute allele frequency differences (δ > 0.5) between African, European or Native American populations (online suppl. table, www.karger.com/doi/10.1159/000160215) that were not known to be associated with BAVM. The unlinked genetic markers were selected from a large pool of AIMs identified by genotyping DNA from 3 ancestral populations (42 European Caucasians, 30 Native Americans from Mexico and 37 West Africans) using the Affymetrix GeneChip Human Mapping 100K array: 8,242 AIMs for African-Native American, 5,781 AIMs for African-European and 5,040 AIMs for European-Native American populations were identified on the 100K array with δ > 0.5. Selected AIMs were genotyped using a template-directed dye-terminator incorporation assay on the Beckman Coulter SNPstream 48-plex platform.
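The δ > 0.5 selection criterion can be sketched in a few lines of Python. This is only an illustration of the filtering rule: the marker names and allele frequencies below are hypothetical, and the helper names `max_delta` and `select_aims` are assumptions introduced here, not part of the original pipeline.

```python
from itertools import combinations

# Hypothetical reference-allele frequencies in three ancestral samples
# (illustrative values only, not data from the study).
freqs = {
    "snp_A": {"African": 0.92, "European": 0.15, "NativeAmerican": 0.20},
    "snp_B": {"African": 0.40, "European": 0.55, "NativeAmerican": 0.35},
    "snp_C": {"African": 0.10, "European": 0.12, "NativeAmerican": 0.81},
}


def max_delta(pop_freqs):
    """Largest absolute allele-frequency difference over all population pairs."""
    return max(abs(pop_freqs[a] - pop_freqs[b])
               for a, b in combinations(pop_freqs, 2))


def select_aims(marker_freqs, threshold=0.5):
    """Keep markers whose delta exceeds the threshold for at least one pair."""
    return [snp for snp, f in marker_freqs.items() if max_delta(f) > threshold]


selected = select_aims(freqs)
print(selected)
```

With these made-up frequencies, `snp_B` is dropped because no pair of populations differs by more than 0.5, while the other two markers are retained.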
The Cochran-Armitage χ2 trend test was used to evaluate differences in genotype frequency distributions between BAVM cases and controls for each AIM. To test for population stratification, a global χ2 test of association was performed by summing the χ2 values (Σχ2) and degrees of freedom (d.f.) over all tests. A summary p < 0.05 was taken to indicate that cases and controls have different genetic backgrounds. Quantile-quantile plots were used to compare the observed χ2 distribution for unlinked AIMs to the expected distribution with 1 d.f. under the null hypothesis of no stratification.
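The two steps described above (a per-marker trend test, then a summed χ2 statistic) can be sketched as follows. This is a minimal illustration, not the analysis code used in the study. It assumes additive genotype scores of 0, 1 and 2, and the genotype counts are hypothetical. The function names are introduced here for illustration. The upper-tail probability uses the standard recurrence for the regularized incomplete gamma function, which is exact for integer degrees of freedom.

```python
import math


def trend_chi2(case_counts, control_counts, weights=(0, 1, 2)):
    """Cochran-Armitage trend chi-square (1 d.f.) for one marker.

    Counts are ordered by genotype (e.g. AA, AB, BB). A monomorphic marker
    gives a zero denominator and must be excluded beforehand.
    """
    totals = [r + s for r, s in zip(case_counts, control_counts)]
    n_all = sum(totals)
    n_cases = sum(case_counts)
    n_controls = n_all - n_cases
    sum_wr = sum(w * r for w, r in zip(weights, case_counts))
    sum_wn = sum(w * n for w, n in zip(weights, totals))
    sum_w2n = sum(w * w * n for w, n in zip(weights, totals))
    numerator = n_all * (n_all * sum_wr - n_cases * sum_wn) ** 2
    denominator = n_cases * n_controls * (n_all * sum_w2n - sum_wn ** 2)
    return numerator / denominator


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer d.f."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        q, a = math.exp(-half), 1.0                # closed form for 2 d.f.
    else:
        q, a = math.erfc(math.sqrt(half)), 0.5     # closed form for 1 d.f.
    # Recurrence Q(a + 1, h) = Q(a, h) + h**a * exp(-h) / Gamma(a + 1)
    while 2 * a < df:
        q += math.exp(a * math.log(half) - half - math.lgamma(a + 1))
        a += 1.0
    return q


def summary_test(chi2_values):
    """Sum 1-d.f. chi-square values and return (sum, d.f., p-value)."""
    total = sum(chi2_values)
    df = len(chi2_values)
    return total, df, chi2_sf(total, df)


# Hypothetical genotype counts (cases, controls) for three markers.
markers = [((10, 20, 30), (30, 20, 10)),
           ((25, 30, 24), (70, 80, 65)),
           ((12, 40, 27), (50, 100, 65))]
per_marker = [trend_chi2(cases, controls) for cases, controls in markers]
total, df, p_value = summary_test(per_marker)
print(per_marker, total, df, p_value)

# Summary statistic reported in the results: sum of chi-square = 119.99, 83 d.f.
reported_p = chi2_sf(119.99, 83)
print(round(reported_p, 4))
```

Because each trend test contributes 1 d.f., the number of markers tested equals the degrees of freedom of the summed statistic.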
Individual admixture estimates (IAE) were obtained using a Bayesian Markov Chain-Monte Carlo method implemented in the Structure 2.1 program under an admixture model, assuming 3 underlying subpopulations (K = 3) and using prior ancestral population genotype information (described above). The Markov Chain-Monte Carlo scheme was run for 10,000 iterations to obtain proportions of Native American, European and African ancestry for each individual; individual proportions summed to 1.0. IAE values can then be included as covariates in the regression analysis to adjust for differences in genetic background.
To determine the effect of population stratification in our study, we investigated the association between an IL-6 −174G>C promoter polymorphism and risk of BAVM among Latinos. Fourteen subjects with AIM data were missing IL-6 −174G>C genotypes, leaving 280 Latinos (75 cases and 205 controls) available for analysis. Odds ratios (OR) and 95% confidence intervals (CI) were calculated using logistic regression analysis, including age, sex and IAE to account for differences in genetic background.
Of 96 AIMs, 7 failed 48-plex genotyping and 6 were monomorphic (online suppl. table), leaving a total of 83 AIMs available for association analysis in 79 BAVM cases and 215 healthy controls, all of self-reported Latino race/ethnicity. Descriptive characteristics of BAVM cases and controls are summarized in table 1. Mean age at diagnosis for cases and at study enrollment for controls was similar; however, a greater percentage of cases were male (57%). Among cases, 41% presented with hemorrhage, 18% had deep-only venous drainage and 64% had BAVM in eloquent location.
Group and individual admixture estimates are shown in table 2 and figure 1, respectively. The majority of Latinos in our study population are primarily of European and Native American ancestry (fig. 1a, b). However, there is dramatic heterogeneity in admixture levels, with some individuals being of almost 100% European ancestry while others are of almost 100% Native American ancestry. Overall, Latinos in our study population were of 47% Native American, 45% European and 8% African ancestry. These admixture estimates are consistent with previous studies conducted among Latinos recruited based on stringent ethnicity criteria [3, 4].
Next, we compared the genotype frequency of AIMs between BAVM cases and controls. The mean and median χ2 values for Latinos were 1.45 and 0.55, respectively. The summary χ2 test was significant (Σχ2 = 119.99, d.f. = 83, p = 0.005), suggesting differences in ancestral background between Latino cases and controls. In figure 2, we plotted quantiles of the observed against the expected χ2 distribution with 1 d.f. under the null hypothesis of no stratification. The observed distribution clearly deviates from the expected distribution. That is, a greater number of observed χ2 tests had higher values than expected for unlinked AIMs.
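For readers who want to reproduce the expected axis of such a quantile-quantile plot, the sketch below computes expected χ2 quantiles with 1 d.f. for 83 markers. It is an assumption-laden illustration: the plotting positions (i − 0.5)/m are one common convention and may differ from those used for the published figure, and the function name is hypothetical. It relies on the fact that a χ2 variable with 1 d.f. is the square of a standard normal variable.

```python
from statistics import NormalDist


def expected_chi2_1df_quantiles(m):
    """Expected quantiles of m chi-square (1 d.f.) statistics under the null.

    Uses plotting positions p_i = (i - 0.5) / m (an assumed convention) and
    the identity F(x) = 2 * Phi(sqrt(x)) - 1 for the 1-d.f. chi-square CDF.
    """
    std_normal = NormalDist()
    quantiles = []
    for i in range(1, m + 1):
        p = (i - 0.5) / m
        z = std_normal.inv_cdf((1.0 + p) / 2.0)
        quantiles.append(z * z)
    return quantiles


expected = expected_chi2_1df_quantiles(83)
# The middle order statistic approximates the null median of about 0.455.
null_median = expected[len(expected) // 2]
print(round(null_median, 3))
```

To draw the plot, the observed per-marker χ2 values would be sorted in ascending order and paired with `expected`. Under the null hypothesis, the median of a 1-d.f. χ2 variable is about 0.455 and its mean is 1, which gives a reference point for the observed median of 0.55 and mean of 1.45 reported above.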
Next, we investigated the effect of population stratification on a promoter polymorphism in the IL-6 gene (−174G>C) and BAVM. In our Latino cohort, genetic ancestry proportions were associated with the IL-6 −174G>C polymorphism; subjects with the IL-6 −174 GG genotype had, on average, 5.5% higher Native American ancestry (p = 0.023).
The genotype frequency and association between IL-6 −174G>C promoter polymorphism and risk of BAVM among Latinos are summarized in table 3. The univariate OR for BAVM associated with the IL-6 −174 GG versus any C reference genotype was 1.77 (95% CI 0.96–3.26, p = 0.068). This estimate increased after adjusting for the effects of age and sex (OR = 1.85, 95% CI 0.99–3.48, p = 0.055). Further adjustments for IAE resulted in the highest OR (1.96, 95% CI 1.03–3.72, p = 0.039) for the IL-6 −174 GG genotype.
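The reported ORs, confidence intervals and p values can be checked against one another. The sketch below assumes the intervals are Wald intervals, that is, symmetric on the log-odds scale with a critical value of about 1.96. The text does not state this, although it is the usual default for logistic regression output. The function name is hypothetical; the numeric inputs are the values reported above.

```python
import math


def wald_from_ci(or_est, lower, upper, z_crit=1.959964):
    """Recover the log-OR standard error, z statistic and two-sided p value.

    Assumes a Wald interval: log(OR) +/- z_crit * SE.
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(or_est) / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail area
    return se, z, p


# (OR, lower 95% limit, upper 95% limit, reported p) from the text above.
reported = {
    "univariate": (1.77, 0.96, 3.26, 0.068),
    "age and sex adjusted": (1.85, 0.99, 3.48, 0.055),
    "age, sex and IAE adjusted": (1.96, 1.03, 3.72, 0.039),
}

recomputed = {}
for model, (or_est, lower, upper, p_reported) in reported.items():
    se, z, p = wald_from_ci(or_est, lower, upper)
    recomputed[model] = p
    print(f"{model}: SE = {se:.3f}, z = {z:.2f}, "
          f"p = {p:.3f} (reported {p_reported})")
```

Under this assumption the recomputed p values agree with the reported ones to within rounding of the published estimates, which suggests the three sets of figures are internally consistent.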
In summary, we found evidence for population stratification in BAVM cases and controls of self-reported Latino race/ethnicity. After accounting for differences in genetic background, a common variant in the promoter region of the IL-6 gene (−174G>C) was associated with an approximately 2-fold increased risk of BAVM among Latinos (p = 0.039).
The proinflammatory cytokine IL-6 may play a key role in intracranial vessel dysplasia and rupture. We previously reported an almost 3-fold increased risk of hemorrhagic presentation in BAVM patients with the IL-6 −174 GG genotype, which further correlated with the highest IL-6 mRNA and protein levels in BAVM tissue. The same polymorphism has also been associated with increased risk of subarachnoid hemorrhage among Caucasians, as well as a number of other neurological diseases. Additionally, the distribution of IL-6 −174 alleles is known to vary between populations: the −174 G and C alleles are equally common in Caucasians (approximately 50:50), whereas the C allele is extremely rare (present at <5% frequency) among Africans and Asians genotyped in the International HapMap project (dbSNP build 128). These findings suggest that underlying genetic differences may play a role in BAVM pathogenesis and progression.
Our admixture estimates obtained from the Structure program are similar to those reported in other genetic association studies conducted among Latinos [3, 4], which used both Bayesian and maximum likelihood methods to infer individual ancestry proportions. Previous work has suggested that approximately 50–100 AIMs are sufficient to accurately infer individual ancestry for Latinos. In our study, genetic ancestry was associated with both case-control status (outcome) and IL-6 −174G>C genotype (predictor) among Latinos, meeting the classic epidemiological definition for a potential confounder. The increase in OR from 1.85 to 1.96 after accounting for IAE suggests subtle negative confounding here. However, the adverse effects of even small amounts of admixture increase markedly with larger sample sizes, which illustrates the importance of addressing population stratification.
In conclusion, the IL-6 −174G>C polymorphism may be a risk factor for BAVM among Latinos. This association was observed after accounting for differences in genetic ancestry, and illustrates the importance of addressing population stratification in genetic association studies conducted in admixed populations. Further work in population stratification methods will allow for more accurate risk estimates and a better understanding of the influence of race/ethnicity in genetic association studies.
This work was supported by grants to H.K. (AHA Scientist Development Grant, 0735242N), W.L.Y. (NIH NS34949, NS41877), E.G.B. (NIH HL078885, American Lung Association of California, RWJ Amos Medical Faculty Development Award, NCMHD Health Disparities Scholar, Extramural Clinical Research Loan Repayment Program for Individuals from Disadvantaged Backgrounds, 2001–2003), S.C. (ATS BOLD Award, ATS-05-078) and K. Giacomini (NIH GM61390).