Our results support the notion that a well-matched case-control study design is a feasible way to overcome population stratification confounding when initiating genetic association studies in admixed populations (Cardon and Palmer 2003). To match admixture background, we recruited self-identified African American cases and controls from three clinics, two of which are in the same census tract. As expected, recruiting subjects in this highly specific way minimized differences in the degree of admixture. Our results show that the subjects are highly similar in genetic background and SES, and that the minimal differences in admixture do not confound the genetic association tests. Our results demonstrate that an admixture-matched case-control study design among African Americans can avoid inflation of the type I error rate in genetic association tests. The recruitment strategy achieved the goal of matching cases and controls on admixture background and SES.
In a well-designed case-control study, the source population from which cases are ascertained should be the same one from which controls are ascertained (Schlesselman and Stolley 1982). Our strategy for matching admixture is to recruit cases and controls on the basis of geographic proximity, self-reported ancestry and similar SES background. In terms of cost-effectiveness, this admixture-matched study design is less expensive, is practical for late age-of-onset diseases, and is capable of minimizing confounding due to population stratification. Our findings provide evidence that it is feasible to control population stratification confounding at the study-design stage.
According to our previous work, three main factors affect the accuracy of admixture estimates: the number of markers, the informativeness of the markers and the number of ancestral subjects. Of these, the most important is the number of AIMs (Tsai et al. 2005, in press). Although we used only 30 AIMs to obtain admixture estimates, the group admixture estimates from our African American cohort agree with previous reports (Hoggart et al. 2003; Parra et al. 1998; Reiner et al. 2005; Shriver et al. 2003). The European admixture proportion in our cohort is approximately 20%, which is consistent with the European admixture proportion reported for northern or western African American populations in other studies. In addition, we applied two different programs, ADMIXMAP and Structure2.1, to estimate individual admixture proportions. Admixture estimates from the two programs were very highly correlated. Even though the admixture estimates here cannot be 100% accurate, they should be highly correlated with the underlying individual admixture proportions. Detailed information is available elsewhere (Tsai et al. 2005, in press).
It has been reported that genetic background from different ancestral populations may be associated with socio-economic status (SES) (Burchard et al. 2003b). SES is considered an important indicator related to all-cause mortality within and across racial groups (Lin et al. 2003). We examined the interaction between asthma-related phenotypes, SES and ancestry in our cohort but did not find any association. One possible explanation is that admixture proportion and SES were well matched between our African American cases and controls.
We observed an association of pre-FEV1 with ancestry ( and ). Higher African ancestry proportions among asthmatics were associated with more severe asthma, as defined by lower pre-FEV1 values. Because pre-FEV1 is measured at least eight hours after the use of inhaled beta-agonists, this value is presumably an acceptable index of asthma severity. NHLBI guidelines currently use pre-FEV1 as an objective measurement in grading asthma severity. Pre-FEV1 has been validated as a measure of airway obstruction, as it closely correlates with pathologic scores of airway diameter (Hogg et al. 1968). Decreased pre-FEV1 has been shown to be associated with the risk of future attacks and with response to therapy among children with asthma (Enright et al. 1994; Fuhlbrigge et al. 2001). A previous study in Latino Americans showed that asthma severity might be influenced by ancestry in Mexican Americans (Salari et al. 2005). Although no single measure accurately captures all facets of asthma severity, pre-FEV1 percent predicted has several advantages as a marker of asthma severity, including its objectivity and reproducibility (Enright et al. 1991; Enright et al. 1994; Kitch et al. 2004).
Previous studies based on U.S. vital statistics collected from the Third National Health and Nutrition Examination Survey (NHANES III) have reported that African Americans have a higher prevalence of asthma than European Americans (Rodriguez et al. 2002; Romieu et al. 2004). We observed a minor excess of false positives in the association tests of pre-FEV1 with AIMs before adjusting for ancestry (). The type I error rate returned to the expected level after ancestry was included in the models. It will be important to determine whether the association between ancestry and asthma severity is reproducible in African American asthmatics across the United States. It will also be important to explore gene-environment interactions of ancestry with SES and/or environmental factors.
A known limitation of genetic association studies in admixed populations is the difficulty of recruiting subjects from appropriate ancestral populations. Recent studies have shown that the principal component approach may be an appealing alternative for dealing with population stratification confounding, especially when investigators have no data from ancestral populations (Zhang et al. 2003). However, a significant limitation of this approach is its handling of missing data. Because many study participants commonly lack complete genotype information for all markers, the power of the principal component approach may be limited. We approached this issue by incorporating only partial prior information from ancestral populations into ADMIXMAP and Structure2.1. The results in indicated that ADMIXMAP provided similar admixture estimates whether it used all priors or only data from African subjects. In contrast, Structure2.1 gave comparable estimates whether it used all priors or only data from European subjects (). The difference was likely due to the two programs weighting ancestral information differently when inferring admixture proportions. In future work, we will assess the difference in performance between ADMIXMAP and Structure2.1 through realistic simulations. In addition, both programs provided poor admixture estimates when no priors from ancestral populations were used. If investigators do not have genotype data from appropriate ancestral populations, we recommend using the ‘genome-control’ approach to adjust for population stratification confounding rather than including poor estimates in the model.
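As an illustration of the kind of correction recommended above, the following minimal sketch assumes that the ‘genome-control’ approach refers to the usual inflation-factor correction, in which 1-df chi-square statistics are divided by a factor λ estimated from unlinked markers. The function name `genomic_control` and the test statistics are hypothetical and are not taken from our analysis.

```python
import statistics

# Approximate median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549

def genomic_control(chi2_stats):
    """Return (lambda, corrected statistics) for 1-df chi-square tests."""
    lam = statistics.median(chi2_stats) / CHI2_1DF_MEDIAN
    lam = max(lam, 1.0)  # by convention, the statistics are never inflated
    return lam, [s / lam for s in chi2_stats]

# Hypothetical test statistics from unlinked markers (illustrative values only).
null_stats = [0.2, 0.5, 1.0, 2.0, 4.0]
lam, corrected = genomic_control(null_stats)
print(round(lam, 3), [round(s, 3) for s in corrected])
```

A value of λ close to 1 would suggest little inflation, whereas larger values shrink every test statistic by the same factor.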
Population stratification occurs when mating is nonrandom. This allows marker allele frequencies to vary among segments of the population as a result of genetic drift or founder effects (Slatkin 1991). As a consequence, a disease with high prevalence in one subpopulation will also be associated with any alleles that are at high frequency in that subpopulation. Since we detected two subgroups in the cohort, it was of interest to explore whether ‘group membership’ was correlated with asthma disease status or asthma-related traits. We grouped the subjects into two clusters based on their IAEs using k-means cluster analysis (data not shown). We then checked the correlation between group membership and disease status, and between group membership and asthma-related traits. The absolute values of the correlation coefficients were less than 0.1 for disease status and asthma-related phenotypes, except for pre-FEV1 (ρ = 0.25). Taken together, the substructure observed in our cohort was not correlated with asthma disease status or with asthma-related traits, apart from pre-FEV1.
With our admixture-matched study design, we did not observe inflation of type I error in our association tests, even without adjustment for ancestry (). To demonstrate that population stratification can be a confounding factor if investigators do not match admixture background during recruitment, we deliberately created a subset of our 352 SAGE subjects. For this subset, we selected the 100 asthmatic cases with the highest African ancestral proportions and the 100 healthy controls with the lowest African ancestral proportions in the cohort. We then performed association tests of disease status with the 30 AIMs in this subset. The results in Table S2 showed an excess of false positives when ancestral information was not adjusted for in the analysis. These results reinforce that our recruitment scheme, matching admixture background in the study design, can efficiently control population stratification confounding in admixed populations.
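The effect of such a deliberately mismatched subset can be sketched with a small simulation. The example below is illustrative only and does not use SAGE data: the cohort size, the ancestry range, the allele-frequency model and all function names are assumptions. Disease status is generated independently of genotype and ancestry, so every nominally significant marker is a false positive.

```python
import random

def allelic_chi2(case_genos, control_genos):
    """1-df chi-square on the 2x2 table of allele counts (cases vs controls)."""
    a = sum(case_genos)                  # test alleles in cases
    b = 2 * len(case_genos) - a          # other alleles in cases
    c = sum(control_genos)               # test alleles in controls
    d = 2 * len(control_genos) - c       # other alleles in controls
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0

def simulate_cohort(rng, n_subjects=1000, n_markers=30):
    """Subjects with ancestry-dependent genotypes; disease status is pure noise."""
    cohort = []
    for _ in range(n_subjects):
        ancestry = rng.uniform(0.5, 1.0)   # hypothetical African proportion
        freq = 0.1 + 0.6 * ancestry        # hypothetical AIM allele frequency
        genos = [(rng.random() < freq) + (rng.random() < freq)
                 for _ in range(n_markers)]
        is_case = rng.random() < 0.5       # independent of genotype and ancestry
        cohort.append((ancestry, is_case, genos))
    return cohort

def count_false_positives(cases, controls, n_markers=30, threshold=3.841):
    """Number of markers whose chi-square exceeds the 5% critical value (1 df)."""
    hits = 0
    for m in range(n_markers):
        stat = allelic_chi2([s[2][m] for s in cases],
                            [s[2][m] for s in controls])
        hits += stat > threshold
    return hits

rng = random.Random(2005)
cohort = simulate_cohort(rng)
cases = sorted((s for s in cohort if s[1]), key=lambda s: s[0])
controls = sorted((s for s in cohort if not s[1]), key=lambda s: s[0])

# Mismatched subset: highest-ancestry cases versus lowest-ancestry controls.
extreme_fp = count_false_positives(cases[-100:], controls[:100])
# Comparison subset: 100 cases and 100 controls drawn at random.
matched_fp = count_false_positives(rng.sample(cases, 100),
                                   rng.sample(controls, 100))
print("mismatched:", extreme_fp, "random:", matched_fp)
```

Under these assumptions, the mismatched subset should flag most of the 30 null markers, while the randomly drawn subset should flag only a few, close to the nominal 5% rate.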
To adjust for potential confounding due to population stratification in the analysis, we applied a two-step approach: we first estimated IAEs using ADMIXMAP or Structure2.1 and then included these estimates as a covariate in a conventional regression model. The ADMIXMAP program also provides a one-step approach, in which inference of admixture proportions, regression modeling and testing for association are combined in a single model. We compared the two-step and one-step approaches across comprehensive simulation scenarios that are described elsewhere (Tsai et al. 2005, in press). That work showed that the most important factor in determining the accuracy of IAEs and in minimizing the type I error rate was the number of AIMs used to estimate ancestry. For both the one-step and two-step approaches, after accounting for precise ancestry information in the association tests, the type I error rate was controlled at the 5% level when 100 AIMs were used to calculate IAEs.
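The second step of the two-step approach can be illustrated with a minimal sketch. Several simplifications are assumed here: the simulated ancestry values stand in for IAEs that would come from ADMIXMAP or Structure2.1 (neither program is called), the outcome is a hypothetical quantitative trait fitted by ordinary least squares, and the helper `ols` and all numerical values are illustrative rather than the models used in our analysis. The marker has no direct effect on the trait, so any nonzero slope reflects confounding by ancestry.

```python
import random

def ols(columns, y):
    """Least-squares coefficients [intercept, b1, ...] via the normal equations."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * p for x, p in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]

rng = random.Random(1)
n = 10000
# Step 1 stand-in: simulated ancestry proportions play the role of the IAEs.
ancestry = [rng.uniform(0.5, 1.0) for _ in range(n)]
# Marker genotype (0, 1 or 2 copies) whose allele frequency tracks ancestry.
genotype = [(rng.random() < 0.1 + 0.8 * a) + (rng.random() < 0.1 + 0.8 * a)
            for a in ancestry]
# Hypothetical trait that depends on ancestry only, not on the marker.
trait = [100.0 - 40.0 * a + rng.gauss(0.0, 10.0) for a in ancestry]

# Step 2: regress the trait on genotype, without and with the ancestry covariate.
unadjusted = ols([genotype], trait)[1]
adjusted = ols([genotype, ancestry], trait)[1]
print("unadjusted slope:", round(unadjusted, 2),
      "adjusted slope:", round(adjusted, 2))
```

In this setting the unadjusted genotype slope is clearly negative although the marker is null, and it moves close to zero once ancestry enters the model as a covariate.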
In summary, our study demonstrates that an admixture-matched case-control study design is capable of controlling for confounding due to population stratification in admixed populations. Our results indicate that recruiting admixed subjects in a very specific way, such as from the same clinic or from nearby geographic locations, can minimize differences in the degree of admixture. The minimal differences in admixture in our SAGE cohort do not confound the genetic association tests. The genetic background of our cohort is similar to that previously reported for northern and western African Americans. Ancestry is likely to be associated with asthma severity. We do not observe an excess of false positives in our genetic association tests, and population stratification does not confound the genetic association tests of β2AR SNPs and asthma in our cohort. Our work supports the admixture-matched case-control study design as a promising strategy for studying genetic association in admixed populations.