Asthma is a common disease with a complex risk architecture including both genetic and environmental factors. We performed a meta-analysis of North American genome-wide association studies (GWAS) of asthma in 5,416 asthma cases representing European Americans, African Americans/African Caribbeans, and Latinos, and replicated five regions among the most significant signals in 12,649 individuals from the same ethnic groups. Four were at previously reported loci on 17q21 and near the IL1RL1, TSLP, and IL33 genes, but we report for the first time that these loci are associated with asthma risk in three ethnic groups. In addition, we identified a novel association with asthma in the PYHIN1 gene that was specific to individuals of African descent (p=3.9×10−9). These results suggest that some asthma susceptibility loci are robust to differences in ancestry when sufficiently large sample sizes are investigated, and that ancestry-specific associations also contribute to the complex genetic architecture of asthma.
Asthma is a common, complex disease that affects over 300 million people worldwide1. In the United States, the prevalence of asthma in 2001–2003 varied between ethnic groups: 7.7% in European Americans, 12.5% in African Americans, and 3.9–14.5% in Latino Americans2. Estimates of heritability indicate that 35–80% of the variation in risk is attributable to genetic variation3,4, yet attempts to identify asthma susceptibility alleles by candidate gene and positional cloning studies had relatively few successes5,6. Most genome-wide association studies (GWAS) of asthma were conducted in relatively small samples of Europeans7, European Americans 8–10, Mexicans11, Puerto Ricans12, and African Americans and African Caribbeans13 and did not reveal significant associations or associations that replicated across studies. In contrast, the association at the IKZF3/ZPBP2/GSDMB/ORMDL3 locus on chromosome 17q21 (henceforth referred to as the 17q21 locus) has been consistently replicated as an asthma susceptibility locus in ethnically diverse subjects from Europe, North America, and Asia14–22. Recently, a large meta-analysis of GWAS in European populations, called the GABRIEL Study23, reported significant associations with single nucleotide polymorphisms (SNPs) at six loci, including the 17q21 locus and five other regions not identified in previous GWAS of asthma. However, it is currently unknown whether the additional associations, implicating HLA-DQ, IL1RL1/IL18R1, IL33, SMAD3, and IL2RB, play a significant role in asthma risk in European populations only or whether the earlier studies were simply underpowered to detect associations with these loci.
The combined results of the previous GWAS for asthma highlight the challenges in the search for genetic risk factors for asthma. These include the large number of statistical tests performed (with millions of polymorphisms), missing genotype information (especially for rare variants, and due to inadequately designed genotyping platforms for non-European populations), modest effect sizes for associated alleles, heterogeneity in the clinical definition of disease, and the effects of a large number of potentially important environmental exposures, all of which will reduce power to detect associations. The simplest solution for increasing power is to pool data from many studies in a meta-analysis of genome-wide data sets for asthma, as in the GABRIEL Study in Europeans23. The goal of the EVE Consortium, which includes GWAS datasets from nine research groups in the United States (Table 1), is not only to increase the power to identify SNPs, genes and pathways associated with asthma risk by combining studies, but also to provide a better understanding of the patterns of variation in asthma risk genes or variants in the three major ethnic groups in the United States and to provide a more comprehensive understanding of the differences in genetic risk patterns between European American, African American/African Caribbean, and Latino individuals. Only studies of diverse populations will allow for the discovery of both robust associations that replicate across ethnic groups and unique associations that contribute to the heterogeneity in disease prevalence across different ethnic groups. Moreover, studying populations with diverse ancestries can increase the resolution of associated regions as a result of different patterns of linkage disequilibrium (LD) between racial/ethnic groups. 
The results reported here are based on analyses of >2 million SNPs in 3246 asthma cases, 3385 non-asthmatic controls, 1702 asthma case-parent trios, and 355 family-based cases and 468 family-based controls, comprising three ethnic groups: European American, African American/African Caribbean, and Latino (Table 1).
We performed four genome-wide investigations: one meta-analysis in each of three ethnic groups and one in the combined sample. The quantile-quantile (QQ) plots of the results (see Supplementary Figs. 1–4 online) indicate that none of the studies showed inflation in test statistics, but also revealed an abundance of small p-values, especially in the combined sample. In each of the four meta-analyses, we expected (on average) approximately two p-values to be smaller than 10−6 by chance alone. We observed 34 SNPs with p-values smaller than 10−6 in the European American, four in the African American/African Caribbean, 32 in the Latino, and 75 in the combined meta-analyses (see Supplementary Fig. 5 online). In the European Americans, 33 of the SNPs are at the 17q21 asthma locus; one additional SNP on chromosome 17 (rs9891949) is 27 Mb from the 17q21 locus. In the African Americans/African Caribbeans, two SNPs are in the PYHIN1 gene on chromosome 1q23 and two are in the intergenic region between the NNMT and C11orf71 loci on chromosome 11q23. In the Latinos, 13 SNPs are on chromosome 3q27 around the RTP2 gene, one is on chromosome 5q33 in the GALNT10 gene, 12 are at the chromosome 17q21 locus, and two are on chromosome 19q12 between the CCNE1 and C19orf2 loci. One SNP in an intron of RTP2, rs2017908, reached genome-wide significance in the Latino samples (p=4.4×10−9). The 75 SNPs with p<10−6 in the combined meta-analysis occur in 15 chromosomal regions; the most significant SNP in each region with at least one p<10−6 in the combined sample or in one of the ethnic groups is shown in Table 2, with risk allele frequencies shown in Supplementary Table 1 online. Among the 75 SNPs, those at the 17q21, IL1RL1, and TSLP loci reached genome-wide significance in the combined sample (p<2×10−8). The results for all SNPs with p<10−6 can be found in Supplementary Tables 2 and 3 online.
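The expectation of roughly two chance findings per meta-analysis follows from the number of tests multiplied by the p-value threshold. The short sketch below illustrates that arithmetic; the SNP count of exactly 2,000,000 is an assumption (the text states only ">2 million SNPs"), and the function name is hypothetical.

```python
def expected_chance_hits(n_tests: int, threshold: float) -> float:
    """Expected number of p-values below `threshold` if every test is null.

    Under the null hypothesis each p-value is uniform on [0, 1], so each
    test falls below the threshold with probability `threshold`. By
    linearity of expectation this holds even when SNPs are correlated.
    """
    return n_tests * threshold


# Assumption: 2,000,000 SNPs; the text gives only ">2 million".
N_SNPS = 2_000_000
expected = expected_chance_hits(N_SNPS, 1e-6)

# Observed counts of SNPs with p < 1e-6, as reported in the text.
observed = {
    "European American": 34,
    "African American/African Caribbean": 4,
    "Latino": 32,
    "Combined": 75,
}

for group, count in observed.items():
    print(f"{group}: observed {count}, expected by chance about {expected:.0f}")
```

Every observed count exceeds this expectation, which is consistent with the excess of small p-values seen in the QQ plots.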
We selected one SNP from each of the 15 regions with at least one p-value <10−6 for replication studies (as described in Table 2). The samples used for replication are described in the Supplementary Note and Supplementary Table 4 online, and the results of those studies are shown in Table 3. Two SNPs (rs4653433 near the SRP9 gene on chromosome 1q and rs9891949 near the AURK gene on chromosome 17p) could not be assayed in the replication samples. Using a Bonferroni-corrected (for 13 tests) p-value of p<0.0038 as the threshold for significance in the replication studies, SNPs in five regions were significantly associated with asthma in the replication samples (Table 3).
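The replication threshold can be reproduced with a one-line Bonferroni calculation. In the sketch below, the family-wise error rate of 0.05 is an assumption: the text does not state it, but it is the value consistent with the reported threshold for 13 tests.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold that controls the family-wise error rate."""
    return alpha / n_tests


# 15 regions were selected, but 2 SNPs could not be assayed, leaving 13 tests.
N_REPLICATION_TESTS = 15 - 2
FAMILY_WISE_ALPHA = 0.05  # assumption: not stated explicitly in the text

threshold = bonferroni_threshold(FAMILY_WISE_ALPHA, N_REPLICATION_TESTS)
print(f"Bonferroni threshold for {N_REPLICATION_TESTS} tests: {threshold:.4f}")
```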
SNPs near the 17q21 locus and the IL1RL1, TSLP, and IL33 genes were associated with asthma in all three ethnic groups in the replication studies, whereas the PYHIN1 association was specific to the African American replication samples. Interestingly, the associated SNPs in PYHIN1 occur with minor allele frequencies of 0.26–0.29 in the African American/African Caribbean controls, but they are not polymorphic in European Americans and occur at low frequencies (<0.05) in the Latino populations. However, multiple other SNPs in the PYHIN1 region showed evidence for association only in the African American/African Caribbean sample (Figure 1), whereas none showed evidence of association with asthma in Latinos or European Americans (see Supplementary Fig. 6 online), suggesting that this association may be specific to populations of African descent. In fact, rs1102000 in PYHIN1 (also known as IFIX) has a relatively large effect size (OR=1.34 in the GWAS meta-analysis, OR=1.23 in the replication samples), suggesting an important role for this gene in risk for asthma in African American and African Caribbean populations.
The associations of SNPs in PYHIN1 (pyrin and HIN domain family member 1; IFIX, interferon inducible nuclear protein X) with asthma are the first genome-wide significant associations reported in African Americans or African Caribbeans, and PYHIN1 may be the first asthma susceptibility gene specific to populations of African descent. The associated SNPs are in an LD bin in HapMap YRI samples, spanning ~30kb within the seventh intron of the gene, suggesting that the causal variation is contained within this intron. Imputation of additional SNPs in PYHIN1 using pilot data from the 1000 Genomes Project yielded no additional signals of association (see Supplementary Fig. 7 online). However, the incomplete coverage of PYHIN1 makes any conclusions on causal variation inaccurate (see Supplementary Fig. 8 online). Interestingly, the most strongly associated variants discovered in this study are not present in populations of European descent. Although it is possible that rare variants in PYHIN1 that are not tagged by the SNPs included in this study are risk alleles for asthma in European American and Latino populations, none of the multiple SNPs in this gene that were associated with asthma in the African American/African Caribbean populations showed evidence for association in the European Americans or Latinos. This further suggests that the African-specific variants may be causal or in LD with other African-specific causal variants in this gene. Moreover, estimates of local ancestry at PYHIN1 in the African American/African Caribbean samples did not differ between cases and controls for any of the studies (combined p=0.77; see Supplementary Table 6 online for more details), indicating that the African American-specific association is present in the absence of an admixture signal and that the observed association is not due to uncorrected local ancestry.
At present, little is known about the function of PYHIN1, although it is expressed in both adult leukocytes and lung tissues, and the pyrin domain is a protein-protein interaction domain that is present in many interferon-inducible proteins that function in both apoptotic and inflammatory pathways (see URL). This family of genes has been previously associated with autoimmunity24, but to date PYHIN1 has not been implicated in asthma pathogenesis.
Lastly, we examined the evidence for association with SNPs that were associated with asthma in previous GWAS other than those included in the EVE meta-analysis (see Supplementary Table 5 online). These included SNPs in or near HLA-DQ9,23, the 17q21 asthma locus23, IL3323, IL1RL123, SMAD323, IL2RB23, RORA23, SLC22A523, IL139,23, RAD509, and DENND1B10. We were able to replicate at p<0.05 associations with SNPs at the HLA-DQ locus in all three ethnic groups and in the combined sample, although no single SNP was associated in all three groups. The two SNPs associated at the 17q21 asthma locus in the GABRIEL Study were also associated with asthma in our study, although one (rs2894194) showed little evidence for association in the African American/African Caribbean samples. Among the remaining SNPs associated in the GABRIEL Study, we replicated associations with the same SNPs in or near the IL18RL1, IL33, SLC22A5, SMAD3, and RORA genes. SNPs at the latter three loci were associated only in the European American sample. We also replicated associations with two SNPs in RAD50, with the signals coming largely from the Latino and African American/African Caribbean samples. A SNP at the IL2RB locus that was associated with asthma in the GABRIEL Study was nearly significant in the European American EVE sample (p=0.06). We did not replicate associations with SNPs at the IL139,23 and DENND1B10 loci.
The results reported here highlight the importance of studying large datasets of diverse populations in several ways. First, the large sample size allowed us to discover and then replicate loci with modest effects. For example, SNPs at the IL1RL1, TSLP, and IL33 loci did not reach genome-wide significance in any of the ethnic-specific meta-analyses, but SNPs in IL1RL1 and TSLP reached this threshold of significance, and IL33 approached genome-wide significance, in the overall analysis (Table 2; see Supplementary Figs. 10–12 online). The modest, but real, effect of these variants on asthma risk likely explains why these loci were not identified as asthma susceptibility loci in previous genome-wide association studies in individual samples7–11,13, but were among the most significant associations in the EVE and GABRIEL meta-analyses.
Second, the diverse ancestries of the EVE Consortium samples provided a more accurate and more complete picture of asthma risk by identifying common variants at four loci (17q21 locus, IL1RL1, TSLP, and IL33) that increase risk in all ethnic groups and at least one locus (PYHIN1) that may contribute to risk only in populations of African descent. Replication of previously associated variants in the EVE samples may also provide additional insights into those genes. For example, the SNPs in or near SMAD3 and RORA reported in the GABRIEL Study meta-analysis were modestly associated with asthma in the European American EVE samples but showed no association in the African American/African Caribbean samples, suggesting that these may be risk variants only in populations of European descent. We anticipate that ongoing studies in the EVE Consortium data sets will identify other loci contributing to asthma risk and ultimately provide a better understanding of the molecular pathways and networks that are common to the risk architecture of asthma in diverse populations, and those that are specific to certain groups.
Asthma cases, unaffected controls, asthma case-parent trios, and extended families were recruited in clinics in the U.S., Mexico, and Barbados. Twelve samples with GWAS data were included in this study (Table 1). Detailed descriptions of the individual studies, ascertainment schemes, genotyping platforms, quality control (QC) protocols, and statistical analyses for the primary association testing are provided in the Supplementary Note online.
Summary files on a common set of SNPs were shared among the EVE investigators. The common set of SNPs consisted of all Phase 2, Release 21 consensus HapMap variants. Prior to genotype imputation, each center oriented its SNPs to the plus strand, and filtered for call rates (> 95%) and consistency with Hardy-Weinberg expectations (p > 10−5 for case/control studies, p > 10−6 for trio studies). Genotype imputation using HapMap reference panels was performed separately in each sample with the program MACH25, and associations were tested using the genotype dosages that are part of the imputation algorithm's output, with adjustments for admixture in the African American and Latino case-control samples (see Supplementary Note online). The shared summary files contained the SNP identifiers (rs number, chromosome, position, alleles), SNP QC metrics (call rate, Hardy-Weinberg equilibrium p-value, imputation quality metrics), and information related to the test for association (allele frequencies in cases and controls or in the transmitted and untransmitted alleles in trios, association p-value, odds ratios with standard errors). Reference alleles were assigned as the allele coded 0 in the HapMap release 21 phased consensus haplotypes during genotype imputation in MACH. The QC checks performed on summary file data included consistency in reference allele and strand orientation (see Supplementary Figs. 13−15 online), and imputation quality (see Supplementary Fig. 16 online). SNPs with imputation quality scores below a threshold were removed from the analysis (Rsq < 0.3, or Rsq < 0.5 for the Barbados study). QQ plots were visually inspected to compare the distribution of association p-values for genotyped and imputed markers separately for each cohort.
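The QC thresholds described above can be summarized as two simple filters, one applied to genotyped SNPs before imputation and one applied to imputed SNPs afterwards. The following is a minimal sketch using only the thresholds stated in the text; the function names, the `design` labels, and the `study` string are hypothetical and do not correspond to any tool used by the consortium.

```python
# Hardy-Weinberg p-value minimums by study design, as stated in the text.
HWE_P_MIN = {"case_control": 1e-5, "trio": 1e-6}
CALL_RATE_MIN = 0.95


def passes_genotype_qc(call_rate: float, hwe_p: float, design: str) -> bool:
    """Pre-imputation filter: call rate > 95% and HWE p above the design-specific minimum."""
    return call_rate > CALL_RATE_MIN and hwe_p > HWE_P_MIN[design]


def passes_imputation_qc(rsq: float, study: str) -> bool:
    """Post-imputation filter: drop SNPs with Rsq < 0.3 (Rsq < 0.5 for the Barbados study)."""
    rsq_min = 0.5 if study == "Barbados" else 0.3
    return rsq >= rsq_min


# Hypothetical example values, for illustration only.
print(passes_genotype_qc(call_rate=0.99, hwe_p=1e-3, design="case_control"))
print(passes_imputation_qc(rsq=0.4, study="Barbados"))
```

Keeping the two filters separate mirrors the order of operations in the text: genotype-level QC precedes imputation, and the Rsq filter is applied to the shared summary files.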
The meta-analysis searched for asthma susceptibility variants for which the same allele was associated with asthma in the different studies. For each study, we constructed a test statistic that has a standard normal distribution under the null hypothesis of no association and captures the direction of the effect (i.e., the statistic was positive if the reference allele was associated with an increased risk of asthma). The meta-analysis test statistic was calculated as a linear combination of the individual study scores with weights proportional to the square root of the number of cases (or trios). P-values were obtained using normal approximations. Odds ratios were calculated by combining linearly log odds ratios with weights reflecting the standard errors from the genome-wide association studies.
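The combination scheme described above can be sketched as a weighted sum of per-study Z-scores. Two details in the sketch are assumptions rather than statements from the text: the weighted sum is divided by the square root of the sum of squared weights so that the combined statistic is again standard normal under the null, and the log odds ratios are pooled with inverse-variance weights (one common way for weights to "reflect the standard errors"). Function names are hypothetical, and the p-value is computed as two-sided.

```python
import math
from typing import Sequence


def combined_z(z_scores: Sequence[float], n_cases: Sequence[int]) -> float:
    """Weighted Z-score with weights proportional to sqrt(number of cases or trios).

    Assumption: normalizing by sqrt(sum of squared weights) keeps the
    combined statistic standard normal under the null hypothesis.
    """
    weights = [math.sqrt(n) for n in n_cases]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    return numerator / math.sqrt(sum(w * w for w in weights))


def two_sided_p(z: float) -> float:
    """Two-sided p-value from the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def pooled_odds_ratio(log_ors: Sequence[float], std_errs: Sequence[float]) -> float:
    """Pool log odds ratios with inverse-variance weights (an assumed weighting)."""
    weights = [1.0 / se ** 2 for se in std_errs]
    pooled_log_or = sum(w * b for w, b in zip(weights, log_ors)) / sum(weights)
    return math.exp(pooled_log_or)


# Hypothetical per-study inputs, for illustration only.
z = combined_z(z_scores=[2.1, 1.4, 2.8], n_cases=[500, 300, 900])
print(f"combined Z = {z:.3f}, p = {two_sided_p(z):.2e}")
print(f"pooled OR = {pooled_odds_ratio([0.18, 0.10, 0.22], [0.08, 0.11, 0.06]):.3f}")
```

Because each Z-score carries the sign of the reference-allele effect, studies with opposite directions of effect cancel rather than reinforce each other, which matches the stated goal of finding variants where the same allele is associated across studies.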
The replication cohorts and sample sizes are shown in Supplementary Table 4 online. Detailed descriptions of these samples, genotyping technologies, QC protocols, and statistical analyses are provided in the Supplementary Note online.
The combined analysis of the replication samples was performed in a similar manner to the meta-analysis. For each study, we constructed a test statistic that had standard normal distribution under the null hypothesis of no association, and that also captured the direction of the effect. A combined test statistic was calculated as a linear combination of the individual study scores with weights proportional to the square root of the number of cases, and p-values were obtained using normal approximations.
Pilot data from the 1000 Genomes Project (August 2010 haplotypes) were used to impute a 2Mb region surrounding the PYHIN1 gene in all studies using impute226. The EUR (European) haplotypes were used as a reference for the European American studies (assuming an effective population size [Ne] of 11418), the EUR and AFR (African) haplotypes were used as a reference for the African American studies (assuming Ne=15000), and the EUR, AFR, and ASN (Asian) haplotypes were used as a reference for the Latino studies (assuming Ne=15000). Individual genotypes were filtered for probabilities > 90%, and only SNPs with call rates > 90% were included in allelic tests of association. For family-based studies, SNPs with Mendelian errors were removed.
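The two-stage filter on imputed genotypes (per-genotype probability, then per-SNP call rate) can be sketched as follows. This is an illustration under stated assumptions, not the pipeline used in the study: it assumes each individual has three posterior genotype probabilities per SNP, and all function names are hypothetical.

```python
from typing import Optional, Sequence, Tuple

GenotypeProbs = Tuple[float, float, float]  # P(0 copies), P(1 copy), P(2 copies)


def call_genotype(probs: GenotypeProbs, min_prob: float = 0.9) -> Optional[int]:
    """Return the most probable genotype if its probability exceeds min_prob, else None."""
    best = max(range(3), key=lambda g: probs[g])
    return best if probs[best] > min_prob else None


def snp_call_rate(all_probs: Sequence[GenotypeProbs], min_prob: float = 0.9) -> float:
    """Fraction of individuals whose genotype at this SNP passes the probability filter."""
    calls = [call_genotype(p, min_prob) for p in all_probs]
    return sum(c is not None for c in calls) / len(calls)


def keep_snp(all_probs: Sequence[GenotypeProbs],
             min_prob: float = 0.9, min_call_rate: float = 0.9) -> bool:
    """Keep the SNP only if its call rate after filtering exceeds min_call_rate."""
    return snp_call_rate(all_probs, min_prob) > min_call_rate


# Hypothetical probabilities for three individuals at one SNP.
example = [(0.97, 0.02, 0.01), (0.05, 0.93, 0.02), (0.50, 0.45, 0.05)]
print(snp_call_rate(example), keep_snp(example))
```

In this example the third individual's genotype is set to missing, so the SNP's call rate is 2/3 and the SNP would be excluded.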
This work was supported by grants from the National Heart, Lung, and Blood Institute (HL101651 to C.O. and D.L.N.; HL087665 to D.L.N.; HL70831, HL087665, HL072414, and HL49596 to C.O.; HL064307 and HL064313 to F.D.M.; HL075419, HL65899, HL083069, HL066289, HL087680, HL101543, and HL101651 to S.T.W.; HL079055 to L.K.W.; HL087699, HL49612, HL075417, HL04266, and HL072433 to K.C.B.; HL061768 and HL076647 to F.D.G.; HL087680 to W.J.G.; HL078885 and HL088133 to E.G.B.; HL87665 to D.A.M.; and HL69116, HL69130, HL69149, HL69155, HL69167, HL69170, HL69174, and HL69349 to D.A.M., E.R.B., W.W.B., W.J.C., M.C., K.F.C., S.C.E., E.I., and S.E.W.), the National Institute of Allergy and Infectious Diseases (AI070503 to C.O.; AI079139 and AI061774 to L.K.W.; AI50024, AI44840, and AI41040 to K.C.B.; and AI077439 to E.G.B.), the National Institute of Diabetes and Digestive and Kidney Diseases (DK064695 to L.K.W.), the National Institute of Environmental Health Sciences (ES09606, ES018176, and ES015903 to K.C.B.; ES007048, ES009581, R826708, RD831861, and ES011627 to F.D.G.; ES015794 to E.G.B.; and the Division of Intramural Research, Z01 ES049019, to S.J.L.), the National Center for Research Resources (RR03048 to K.C.B.), the Environmental Protection Agency (83213901 and R-826724 to K.C.B.), the American Asthma Foundation and the Fund for Henry Ford Hospital (to L.K.W.), Mary Beryl Patch Turnbull Scholar Program (to K.C.B.), the National Council of Science and Technology (Mexico) (26206-M to I.R.), the Centers for Disease Control, U.S. (to I.R.), the Eudowood Foundation (to N.N.H.), and the Flight Attendant Medical Research Institute (FAMRI), RWJF Amos Medical Faculty Development Award, the American Asthma Foundation, and the Sandler Foundation (to E.G.B.).
The authors gratefully acknowledge the support from Dr. James Kiley, Dr. Susan Banks-Schlegel, and Dr. Weiniu Gan at the National Heart, Lung, and Blood Institute, and all the patients and families that participated in these studies.