Rationale: Association studies have implicated many genes in asthma pathogenesis, with replicated associations between single-nucleotide polymorphisms (SNPs) and asthma reported for more than 30 genes. Genome-wide genotyping enables simultaneous evaluation of most of this variation, and facilitates more comprehensive analysis of other common genetic variation around these candidate genes for association with asthma.
Objectives: To use available genome-wide genotypic data to assess the reproducibility of previously reported associations with asthma and to evaluate the contribution of additional common genetic variation surrounding these loci to asthma susceptibility.
Methods: Illumina Human Hap 550Kv3 BeadChip (Illumina, San Diego, CA) SNP arrays were genotyped in 422 nuclear families participating in the Childhood Asthma Management Program. Genes with at least one SNP demonstrating prior association with asthma in two or more populations were tested for evidence of association with asthma, using family-based association testing.
Measurements and Main Results: We identified 39 candidate genes from the literature, using prespecified criteria. Of the 160 SNPs previously genotyped in these 39 genes, 10 SNPs in 6 genes were significantly associated with asthma (including the first independent replication for asthma-associated integrin β3 [ITGB3]). Evaluation of 619 additional common variants included in the Illumina 550K array revealed additional evidence of asthma association for 15 genes, although none were significant after adjustment for multiple comparisons.
Conclusions: We replicated asthma associations for a minority of candidate genes. Pooling genome-wide association study results from multiple studies will increase the power to appreciate marginal effects of genes and further clarify which candidates are true “asthma genes.”
Many genes have been implicated in asthma pathogenesis, yet most prior studies assessed only a few loci within a gene.
Using genome-wide data in a well-characterized cohort of children with asthma, we performed a comprehensive replication study of 39 candidate genes previously associated with asthma. We found evidence for variant- and gene-level replication for 17 genes, a minority of candidate genes.
Asthma is a disease with a strong genetic component, with heritability estimates ranging from 36 to 75% (1). A thorough review of the literature in 2006 highlighted more than 100 susceptibility genes for atopy and asthma identified through candidate gene studies, positional cloning, or linkage studies (2). Yet for nearly every gene putatively associated with asthma, multiple publications have failed to replicate the original findings. Differentiating false positives (related to factors such as multiple comparisons, random chance, and population stratification) from false negatives (resulting from underpowered replication studies, random chance, or population stratification) is difficult.
Until recently, candidate gene studies were constrained largely by limitations of genotyping: technical and monetary considerations limited the number of single-nucleotide polymorphisms (SNPs) that could reasonably be evaluated per study. For example, in 2001, deCODE genetics (Reykjavik, Iceland) published the finding that 42 SNPs within 24 “asthma genes” did not differ between cases and controls (3). Because only 94 cases and controls were studied, they could only conclude that those mutations were unlikely to confer odds ratios (ORs) greater than 2. After eight more years of studying asthma genetics, additional candidate loci have been identified, and we recognize that contributions of individual asthma genes are almost certainly more modest, yet no additional systematic assessments of previously identified asthma genes have been performed.
The cost of SNP genotyping has fallen dramatically, enabling more thorough assessment of candidate loci, both by testing a sufficient number of SNPs to cover common variation within the gene and by genotyping cohorts sufficiently large to detect modest genetic associations. In this work, we used data from a genome-wide association study (GWAS) to evaluate previously replicated asthma genes. The primary analysis of these data identified phosphodiesterase 4D (PDE4D) as a novel asthma susceptibility gene that replicated in seven populations (4); that work will not be discussed further here. However, this genotyping array also provides coverage of about 75% of common genetic variation in white individuals (5), thus enabling assessment of all known asthma genes in a single large, well-phenotyped population, in this case children with mild to moderate asthma who participated in the Childhood Asthma Management Program (CAMP). This data set also enables us to comprehensively assess the common genetic variation surrounding putative asthma susceptibility genes. Herein we describe the results of our analysis of more than 600 SNPs in 39 replicated asthma candidate genes, using GWAS. Some of the results of this analysis have been previously reported in the form of an abstract (6).
CAMP is a multicenter North American clinical trial designed to investigate the long-term effects of inhaled antiinflammatory medications in children with mild to moderate asthma (7, 8). Inclusion criteria and protocols for collection of baseline phenotypic data have been described in detail elsewhere (7).
Of the 1,041 children enrolled in the original clinical trial, 968 children and 1,518 of their parents contributed DNA samples to the CAMP Genetics Ancillary Study. Our only selection criteria for GWAS genotyping were (1) self-described non-Hispanic white ethnicity and (2) availability of sufficient DNA for microarray hybridization. Four hundred and twenty-two subjects and their parents met these criteria. The Institutional Review Boards of the Brigham and Women's Hospital (Boston, MA) and of the other CAMP study centers approved this study. Informed assent and consent were obtained from the study participants and their parents to collect DNA for genetic studies.
We developed a list of candidate asthma genes on the basis of a literature review of asthma genetic association studies published through 2005 (2), supplemented with a PubMed (www.ncbi.nlm.nih.gov/pubmed/) search limited to articles published between September 1, 2005 and July 1, 2008 and using the terms “genetic association AND asthma” and “case control AND asthma.” Candidate genes were included if the following criteria were met: (1) significant association with asthma affection status in at least two populations, (2) at least one significant association study with no fewer than 150 cases and 150 controls or 150 trios, and (3) asthma association with SNPs (as opposed to haplotypes, microsatellite markers, or structural genetic variants) in at least one population (9, 10). Genes were excluded if the associations were with asthma-associated phenotypes rather than asthma or if inclusion criteria were met only through association studies in the CAMP cohort (11–13).
A Human Hap 550Kv3 BeadChip SNP array (Illumina, San Diego, CA) was used for SNP genotyping. Please see Table E1 in the online supplement for detailed genotyping and quality control methods. Of the SNPs genotyped, 97.5% passed our quality control checks, leaving 534,290 SNPs for analysis. Genotype completion rates for these SNPs averaged 99.75%.
Association testing was performed with PBAT version 5.3.0 (Golden Helix, Inc., Bozeman, MT). The primary association analysis assumed an additive genetic model, with additional testing under dominant or recessive models performed for those SNPs with prior association under those models.
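As a rough illustration of how a family-based test uses parent-to-child transmissions, the sketch below implements the classic transmission disequilibrium test (TDT) with one- and two-tailed P values from a normal approximation. It is a simplified stand-in chosen for illustration, not the PBAT procedure used in this analysis; the function names and the counts in the usage note are hypothetical.

```python
import math

def tdt_z(transmitted: int, untransmitted: int) -> float:
    """Signed TDT statistic for one allele, counted in heterozygous parents.

    Positive values mean the allele reaches affected children more often
    than the 50:50 transmission expected under the null hypothesis.
    """
    n = transmitted + untransmitted
    if n == 0:
        raise ValueError("no informative (heterozygous) parents")
    return (transmitted - untransmitted) / math.sqrt(n)

def one_sided_p(z: float) -> float:
    """Upper-tail normal P value: counts only over-transmission as evidence."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def two_sided_p(z: float) -> float:
    """Two-tailed normal P value: ignores the direction of the effect."""
    return math.erfc(abs(z) / math.sqrt(2.0))
```

For example, `one_sided_p(tdt_z(130, 100))` evaluates a hypothetical allele transmitted 130 times and not transmitted 100 times (a transmitted:untransmitted ratio of 1.3). Squaring `tdt_z` gives the familiar 1-df chi-square form (b − c)²/(b + c).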
We performed two sets of association analyses. First, we assessed SNP-level replication, testing SNPs previously associated with asthma that were either themselves represented on the Illumina array or were tagged with an r² ≥ 0.80 (as calculated from the CEPH [Centre d'Etude du Polymorphisme Humain] from Utah [CEU] HapMap genotype data) (14, 15). SNP-level analysis tests were one-tailed, because significance was declared only if exact replication was observed (i.e., same risk allele). Second, we assessed gene-level replication, by testing whether other common genetic variants in these candidate genes were associated with asthma. We considered all additional SNPs on the Illumina 550K array with minor allele frequency ≥ 0.05 mapping within 50 kb up- or downstream of the RefSeq mRNA sequence of candidate genes (based on the extent of CEU linkage disequilibrium and haplotype block patterns). Association tests in this second analysis were two-tailed. Power was calculated with Quanto, assuming 400 trios and 15% disease prevalence (17).
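The SNP selection rule for the gene-level analysis (minor allele frequency ≥ 0.05, within 50 kb of the transcript) can be sketched as a simple filter. This is a minimal illustration under assumed data structures: the `Snp` and `Gene` records, their field names, and the coordinates in the usage note are hypothetical and do not describe the actual pipeline.

```python
from dataclasses import dataclass

FLANK_BP = 50_000  # 50 kb up- and downstream of the transcript
MIN_MAF = 0.05     # minor allele frequency threshold

@dataclass(frozen=True)
class Snp:
    rsid: str    # hypothetical identifier
    chrom: str
    pos: int     # base-pair position
    maf: float   # minor allele frequency

@dataclass(frozen=True)
class Gene:
    symbol: str
    chrom: str
    start: int   # transcript start (bp)
    end: int     # transcript end (bp)

def snps_for_gene(gene, snps, flank=FLANK_BP, min_maf=MIN_MAF):
    """Return SNPs on the gene's chromosome that lie within the flanked
    transcript interval and meet the minor allele frequency threshold."""
    lo, hi = gene.start - flank, gene.end + flank
    return [
        s for s in snps
        if s.chrom == gene.chrom and lo <= s.pos <= hi and s.maf >= min_maf
    ]
```

With a hypothetical gene spanning 1,000,000 to 1,020,000 bp, a SNP at 960,000 bp with MAF 0.10 would be kept, whereas one at 940,000 bp, or one inside the gene with MAF 0.02, would be dropped.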
Review of the medical literature through July 2008 identified 39 genes with SNP association with asthma in at least two populations. A complete list of genes and prior publications is in the online supplement (see Table E2). All these genes harbor SNPs within 50 kb of transcript that were represented on the Illumina 550K array. We note that because lymphotoxin-α (LTA) and tumor necrosis factor-α (TNF-α) lie within a 7-kb region of chromosome 6, they were considered jointly (LTA/TNF) for all analyses.
Genotyping was performed for 422 CAMP participants and their parents, using the Illumina 550K array. After quality control analysis and removal of 43 individuals with inadequate genotype data, 1,169 members of 403 nuclear families were available for analysis (see Tables E1 and E3 for details). Of the 403 probands, 63% were male, with a mean age of 8.8 (±2) years and baseline FEV1 of 93 (±14)% predicted. The 403 probands analyzed were more frequently male and had slightly more severe asthma than the remaining white probands in the CAMP cohort (see Table E3 for full comparison).
We first evaluated variants previously associated with asthma that were testable, using the Illumina 550K array (see Table E4 in the online supplement). One hundred and sixty SNPs within 39 genes were previously associated with asthma. Of these 160 SNPs, 93 (58%) were amenable to testing with the Illumina array: 39 SNPs were present on the array (directly tested) and 54 were in high linkage disequilibrium (LD) (r² > 0.8 based on HapMap CEU data) with an SNP on the array (indirectly tested). The remaining SNPs previously associated with asthma were not amenable to testing, because they were either not taggable (18 SNPs) or unknown (49 SNPs with no dbSNP accession [rs] number, or not in HapMap).
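For readers unfamiliar with LD tagging, the sketch below computes the standard pairwise r² between two biallelic SNPs from haplotype and allele frequencies, and applies a tagging threshold. It is illustrative only: the function names are hypothetical, the default threshold uses the r² ≥ 0.80 rule from the Methods, and the actual values in this study were derived from HapMap CEU genotype data.

```python
def ld_r2(p_ab: float, p_a: float, p_b: float) -> float:
    """Pairwise r^2 between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A and allele B
    p_a:  frequency of allele A at the first SNP
    p_b:  frequency of allele B at the second SNP
    """
    denom = p_a * (1.0 - p_a) * p_b * (1.0 - p_b)
    if denom == 0.0:
        raise ValueError("r^2 is undefined for a monomorphic SNP")
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / denom

def is_tagged(p_ab: float, p_a: float, p_b: float, threshold: float = 0.80) -> bool:
    """True if the array SNP is a usable proxy for the reported SNP."""
    return ld_r2(p_ab, p_a, p_b) >= threshold
```

Two SNPs whose minor alleles always travel on the same haplotype give r² = 1 (a perfect proxy), whereas independent SNPs, for which the haplotype frequency equals the product of the allele frequencies, give r² = 0.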
Of the 93 SNPs tested, 10 SNPs in 6 candidate genes were nominally associated with asthma in our population, with directionality consistent with prior association studies (one-sided P < 0.05; see Table 1). Although multiple SNP associations were replicated in both IRAK-3 and ORMDL3, these likely reflect a single disease susceptibility locus in each gene. In ORMDL3, for example, the r² between all tagged SNPs is greater than 0.9, and thus all SNPs could have been tagged had we tested only a single marker in CAMP. Direct replication (meaning the SNP reported in the literature was genotyped) was observed for three genes (IRAK-3, ORMDL3, and IL4R), whereas only indirect evidence of replication (using LD-tagging SNPs) was observed for three genes (PHF11, IL10, and ITGB3). To our knowledge, this is the first independent replication of association with asthma and ITGB3 subsequent to the two populations analyzed in the initial publication (18). We note that we required consistent directionality to call replication (i.e., minor allele is risk allele in both studies). We did see evidence of a so-called flip-flop association for two SNPs, rs946263 within CHI3L1 (tags rs4950928, minor allele is risk allele in our cohort; P = 0.04) and rs5065 in NPPA (directly tested, minor allele is protective in our cohort; P = 0.03). The significance of such associations, both biologically and clinically, is unclear (19).
SNP-level replication was observed for 6 of 39 genes. Among possible reasons for failing to replicate the remaining SNP associations is allelic heterogeneity, whereby more than one variant per gene contributes independently to disease risk. To evaluate this possibility, we next assessed whether the 39 genes harbor other common asthma-associated variants. The Illumina 550Kv3 array includes 619 additional SNPs with minor allele frequency (MAF) at or exceeding 0.05 mapping within or near these loci (see Table E5).
We found additional association in 15 genes, each with at least one SNP associated with asthma in an additive model (54 SNPs with uncorrected P < 0.05; see Table 2). The complex nature of the associations is evident, for example, in the gene ITGB3, as shown in Figure 1 (see Figures E1–E17 in the online supplement for images of all genes with association). The minor allele of rs11869835 (Figure 1A) is associated with asthma and tags a previously identified SNP with an r² of 1.0. Four previously identified SNPs were directly tested in CAMP and showed no association in CAMP (Figure 1B). Finally, a SNP upstream of ITGB3 has not been previously tested, yet was associated with asthma here (Figure 1C); this locus demonstrates the capacity of GWAS surveys to identify novel susceptibility SNPs within the candidate genes. Only one SNP, located in NPPA, met significance after gene-level Bonferroni correction (P < 0.013 for the four SNPs tested); none met the more stringent Bonferroni correction for all SNPs tested in this analysis (Bonferroni-corrected P < 8 × 10⁻⁵ for 619 SNPs).
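The two Bonferroni thresholds quoted above follow directly from dividing the nominal α of 0.05 by the number of tests. The short sketch below reproduces that arithmetic; the function and variable names are illustrative.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests

ALPHA = 0.05

# Gene-level correction: the four SNPs tested in NPPA.
gene_level = bonferroni_threshold(ALPHA, 4)

# Analysis-wide correction: all 619 additional SNPs tested.
analysis_wide = bonferroni_threshold(ALPHA, 619)
```

Here `gene_level` is 0.0125, reported in the text as P < 0.013, and `analysis_wide` is approximately 8.1 × 10⁻⁵, reported as 8 × 10⁻⁵.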
We performed several analyses to investigate potential contributors to our replication of only a minority of previous asthma associations (in 10 of 93 SNPs and 15 of 39 genes), using GWAS. LD coverage of common variation was adequate for most genes, using the Illumina 550K array (89% of HapMap SNPs with MAF ≥ 0.05 were directly tested or tagged), although coverage within genes varied widely, ranging from 21% in CCL24 to 98% in IL4.
We assessed whether using the Affymetrix genome-wide human SNP array 6.0 (Affymetrix 6.0, with >900K SNPs represented) would have increased SNP coverage and thus increased our ability to replicate SNP associations. Coverage was not improved with the Affymetrix 6.0 array, with only 83% of HapMap SNPs tagged; this is a best-case estimate of Affymetrix SNP 6.0 array coverage, as genotype completion rates of 100% were assumed for this analysis. As shown in Figure 2, of the 160 SNPs previously associated with asthma, fewer would be directly tested, and more would be untaggable with Affymetrix 6.0 rather than the Illumina 550K array.
The 403 trios included here represent a substantially larger population than the vast majority (>90%) of prior publications (Table E3). We had 80% power to identify ORs ranging from 1.4 to 1.7 for genes and from 1.35 to 1.8 for SNPs. Despite our larger population, these ORs are high for complex trait genetics; thus low power may have contributed to our failure to replicate the majority of SNPs and genes tested.
We assessed whether genes that successfully replicated differed from those that did not in any important ways. We found that genes that were larger or contained more SNPs were more likely to be significant: genes with at least one significant SNP were a median 73 kb long and contained 11 SNPs, whereas those without significant associations were 42 kb and contained 7 SNPs (Wilcoxon P = 0.07 and 0.11, respectively). In contrast, the number of published replications did not vary between the two groups of genes (median, three populations in each; P = 0.43). A higher effect estimate in previous publications likewise was not associated with replication in this study (median OR, 2.82 in replicated genes vs. 2.78 in nonreplicated genes; Wilcoxon P = 0.87). Among the 619 Illumina 550K SNPs tested for gene-level replication, minor allele frequency was higher in the 54 significant SNPs (MAF, 0.29 vs. MAF 0.25 in nonsignificant SNPs; P = 0.03).
The focus of most previous GWAS publications has been on discovery of novel biology, tested in large association studies in a “hypothesis-free” manner. Yet by virtue of their broad genomic coverage, GWASs also include loci that have been previously associated with phenotypes in candidate gene or fine-mapping association studies. We thus have the opportunity to evaluate many previously reported asthma associations in a well-characterized asthma cohort using GWAS data, assess the reproducibility of these associations, test additional common variants in these loci, and ask the question: “Are these genes truly associated with asthma?” In this work, we used GWAS data to systematically test the most promising asthma candidate genes (i.e., those previously associated in two or more cohorts). We were able to replicate findings at the SNP and gene level for multiple genes, but failed to replicate many others; both points merit discussion.
We found evidence for SNP-level replication for six genes in this cohort. All six associations are modest, with transmitted:untransmitted ratios of 1.18–1.43, leading to P values in the 0.01–0.05 range (see Table 1). They were appreciated here only because of a focus on candidate genes; none would have been detected in the context of typical GWAS analysis, where stringent multiple comparisons correction is required (i.e., α < 10⁻⁶). This is the first independent replication for integrin β3 (ITGB3). Two previous case–control analyses in white children showed association with ITGB3, but directionality of the minor allele differed between the two studies (18). Our study adds support to the finding that the minor allele is actually the risk allele at this locus. In addition, the family-based nature of our cohort makes population stratification an unlikely false-positive cause of these asthma associations, thus increasing the likelihood that these associations are real. Another convincing replication is in ORMDL3, a candidate gene discovered via a GWAS (20). All associated SNPs identified in CAMP are in strong LD with SNPs from the initial report of association (Figure 1); although a precise causative locus for asthma risk is uncertain, our results are certainly consistent with the findings of others that this region harbors an asthma susceptibility locus (20–23).
We focused not only on SNP-level replication, but also more broadly assessed each gene and its LD-flanking region for evidence of association with asthma. We found an additional 54 SNPs in 15 genes that were significantly associated, with P < 0.05. As shown in Figure 1, the SNPs associated with asthma in our study were frequently distant from those noted in original publications. Given that none of these SNPs met correction for multiple comparisons testing (P < 8 × 10⁻⁵), it is certainly possible that some of these findings represent false positives. Alternatively, these new associations may represent differences between populations in LD structure or may be due to allelic heterogeneity, with two or more causative variants resulting in similar phenotype.
Our results may be most striking for our failure to find more evidence of replication, at both the SNP and gene levels. Only a small minority of previously identified SNPs (10 of 93 SNPs either tested directly or LD-tagged) could be replicated. Only 17 of the 39 genes showed any evidence of association with asthma (11 with association at the gene level only, 2 with only SNP-level replication, and 4 with both). Thus, even using the liberal standard of “any P < 0.05” to suggest replication, we failed to find any association with asthma in the majority of previously identified genes. Why did we not find more evidence of replication? One potential answer is inadequate coverage of common variation using a GWAS genotyping platform. The Illumina 550K array tagged 70% or more of HapMap SNPs with r² greater than 0.8 for 29 of the 39 genes, and SNP-level coverage would not have improved using an alternative GWAS platform (the Affymetrix genome-wide human SNP array 6.0). However, these estimates are derived with HapMap data (from which LD data were used to inform SNP content on SNP arrays) and may thus overestimate genetic coverage on current GWAS microarray platforms. More recent estimates suggest substantially lower coverage, approaching 50% (24). This sparseness would diminish power (and thus sensitivity) to detect true associations.
Poor coverage is clearly not the answer for the 93 SNPs that were directly tested or tagged; even among those SNPs, we found an association for only 10. Lack of statistical power could further explain our negative associations. We studied 403 trios, a substantially larger population than the vast majority of previous publications. Thus, we had 80% power to detect an OR of 1.4–1.7 in these genes. It is certainly possible that we missed smaller effects, however; genetic studies suffer a well-described “winner's curse” phenomenon, with initial publications tending to overestimate effect estimates (10, 25). Genes were more likely to replicate if they were larger and more SNPs were tested; in both analyses, SNPs with higher MAF were more likely to be positive. These findings suggest that higher power would have increased our significant findings. Thus, for the 22 genes without evidence of replication, it may be that their true effect size is 1.3 or less, and thus even larger sample sizes are needed to identify a true association.
Another potential cause of failure to replicate involves heterogeneity between studies. Definitions of asthma vary widely. Although we used an extraordinarily well-phenotyped cohort of children with documented doctor-diagnosed asthma and a positive methacholine challenge test, and who were participating in a clinical trial, many studies rely on self-reported asthma or wheezing. Heterogeneity in the age and race of subjects could reduce power as well. We note that although we ran our analyses in an additive model, we also assessed for association in a recessive or dominant model if these were supported in the literature. In no case did we find additional significant SNPs; thus, model misspecification is unlikely to explain our negative results. Finally, we note that evidence of replication at either the SNP or gene level was observed for five of six genes identified by position-based genetic mapping approaches (i.e., linkage analysis or GWAS, including GPR154, ORMDL3, CHI3L1, PHF11, and DPP10, but not ADAM33). This compares with replication for less than 50% of primarily biological candidates. It is interesting to speculate that susceptibility variants identified by hypothesis-free gene mapping more consistently contribute to disease liability across populations.
So what does GWAS add to our understanding of candidate genes? It is most obviously useful when previously identified SNPs are directly represented (the 93 SNPs testable here, for example). As the number of available asthma GWASs increases, we may soon be able to pool results from multiple cohorts and thus have sufficient power to more definitively answer whether those SNPs are truly asthma susceptibility loci. GWAS also facilitates broader surveys of common variation within these candidate genes and can reveal novel candidate susceptibility SNPs, as illustrated in 15 genes in this study. However, where genetic coverage is sparse, GWAS is less well suited for “ruling out” candidate genes. In these instances, negative studies not only require larger samples, but also direct genotyping of candidate variants not represented (either directly or indirectly) on the arrays.
In summary, we have performed the first systematic assessment of asthma genes by GWAS technology. We found evidence of SNP-level replication in 6 genes, and gene-level replication for 15. We anticipate that GWAS data will continue to be used to evaluate candidate genes. As results from additional asthma GWASs become available and investigators pool their results, we will be able to more definitively resolve which of these candidates are truly asthma genes.
The authors thank all subjects for their ongoing participation in this study. The authors acknowledge the CAMP investigators and research team, supported by the National Heart, Lung, and Blood Institute (NHLBI), for collection of CAMP Genetic Ancillary Study data. All work on data from the CAMP Genetic Ancillary Study was conducted at the Channing Laboratory and the Brigham and Women's Hospital under appropriate CAMP policies and human subject protections.
Supported by the National Institutes of Health and the National Heart, Lung, and Blood Institute, NO1 HR16049 (CAMP Genetics Ancillary Study). Additional support for this research came from grants U01 HL065899, U01 HL075419, P01 HL083069, and T32 HL07427, also from the NHLBI. B.A.R. is a recipient of a Mentored Clinical Scientist Development Award from NIH/NHLBI (K08 HL074193).
This article has an online supplement, which is accessible from this issue's table of contents at www.atsjournals.org
Originally Published in Press as DOI: 10.1164/rccm.200812-1860OC on March 5, 2009
Conflict of Interest Statement: A.J.R. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. B.A.R. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. J.A.L-S. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. A.M. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. R.L. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. B.J.K. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. J.S.S. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. J.P.Z. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. C.L. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. J.C.C. does not have a financial relationship with a commercial entity that has an interest in the subject of this manuscript. E.K.S. received an honorarium for a talk on COPD genetics in 2006, grant support for two studies of COPD genetics (2004–2008), and consulting fees (2005–2008) from GlaxoSmithKline, an honorarium from Wyeth for a talk on COPD genetics in 2004, an honorarium from Bayer for a symposium at the ERS meeting in 2005, and honoraria for talks in 2007 and 2008 and consulting fees in 2008 from AstraZeneca. S.T.W. has been a consultant to the TENOR Study for Genentech and has received $20,000 for years 2007 and 2008.