Asthma is a heterogeneous disease that is caused by the interaction of genetic susceptibility with environmental influences. Genome-wide association studies (GWAS) represent a powerful approach to investigate the association of DNA variants with disease susceptibility. To date, few GWAS for asthma have been reported.
GWAS was performed on a population of severe or difficult-to-treat asthmatics to identify genes that are involved in the pathogenesis of asthma.
292,443 SNPs were tested for association with asthma in 473 TENOR cases and 1,892 Illumina general population controls. Asthma-related quantitative traits (total serum IgE, FEV1, FVC, and FEV1/FVC) were also tested in identified candidate regions in 473 TENOR cases and 363 phenotyped controls without a history of asthma to further analyze GWAS results. Imputation was performed in identified candidate regions for analysis with denser SNP coverage.
Multiple SNPs in the RAD50-IL13 region on chromosome 5q31.1 were associated with asthma: rs2244012 in intron 2 of RAD50 (P = 3.04E-07). The HLA-DR/DQ region on chromosome 6p21.3 was also associated with asthma: rs1063355 in the 3’ UTR of HLA-DQB1 (P = 9.55E-06). Imputation identified several significant SNPs in the TH2 locus control region (LCR) 3’ of RAD50. Imputation also identified a more significant SNP, rs3998159 (P = 1.45E-06), between HLA-DQB1 and HLA-DQA2.
This GWAS confirmed the important role of TH2 cytokine and antigen presentation genes in asthma at a genome-wide level and the importance of additional investigation of these two regions to delineate their structural complexity and biologic function in the development of asthma.
Asthma is a complex disease that is caused by the interaction of genetic susceptibility with environmental influences. Genome-wide linkage studies, candidate-gene association studies, and genome-wide association studies (GWAS) represent three major approaches to investigate the association between genetic variants and disease development.
Genome-wide linkage studies have consistently identified regions linked to asthma or asthma-related traits on chromosomes 2q, 5q, 6p, 12q, and 13q. The most highly replicated regions with obvious candidate genes are chromosome 5q31-33 (including interleukin (IL)5, IL13, IL4, CD14, and adrenergic beta-2-receptor (ADRB2)) and 6p21 (including lymphotoxin alpha (LTA or TNFB), tumor necrosis factor (TNF), major histocompatibility complex, class II, DQ beta 1 (HLA-DQB1), and DR beta 1 (HLA-DRB1)). In addition, a recent meta-analysis of genome-wide linkage studies of asthma, bronchial hyperresponsiveness (BHR), positive allergen skin prick test (SPT), and total immunoglobulin E (IgE) identified overlapping regions for multiple phenotypes on chromosomes 5q and 6p as well as 3p and 7p. Unfortunately, genome-wide linkage studies can only identify genes with relatively strong effects, and only within broad regions that include many genes. Positional cloning studies have identified six genes for asthma: a disintegrin and metalloprotease domain 33 (ADAM33) on chromosome 20p13, dipeptidyl-peptidase 10 (DPP10) on 2q14.1, PHD finger protein 11 (PHF11) on 13q14.11, neuropeptide S receptor 1 (NPSR1 or GPRA) on 7p14.3, major histocompatibility complex, class I, G (HLA-G) on 6p21.3, and cytoplasmic FMR1 interacting protein 2 (CYFIP2) on 5q33.3.
Candidate-gene association studies have identified over 100 genes for asthma and asthma-related traits [2, 10, 11]. Although these studies have identified many genes, only a few have been replicated extensively: only 14 genes, including genes on 5q and 6p (ADRB2, interleukin 4 receptor (IL4R), HLA-DRB1, IL13, CD14, TNF, membrane-spanning 4-domains, subfamily A, member 2 (MS4A2 or FCER1B), IL4, ADAM33, signal transducer and activator of transcription 6, interleukin-4 induced (STAT6), IL10, HLA-DQB1, glutathione S-transferase pi 1 (GSTP1), and LTA), have been replicated in more than 20 independent studies. Even for highly replicated genes, replication might be due to ‘winner’s bias’ and/or a loose replication standard (the gene taken as the unit of replication, and related phenotypes accepted).
A GWAS is a hypothesis-free approach able to identify novel genes with mild/moderate effects and thus has become the best approach for studying association between genes and common disease phenotypes. To date, only four GWAS have been performed for asthma and asthma-related traits. The first GWAS of childhood asthma identified ORM1-like 3 (ORMDL3) on chromosome 17q12. The second GWAS of serum YKL-40 levels identified chitinase 3-like 1 (CHI3L1) on 1q32. The third GWAS was for a related trait, total serum IgE levels; the most significant SNPs were in the Fc fragment of IgE, high affinity I, receptor for alpha polypeptide gene (FCER1A) on chromosome 1q23, and the second-ranked region was RAD50 on 5q31. The fourth GWAS of childhood asthma implicated phosphodiesterase 4D, cAMP-specific (phosphodiesterase E3 dunce homolog, Drosophila) (PDE4D) on chromosome 5q12.
In this study, we performed GWAS of asthma in The Epidemiology and Natural History of Asthma: Outcomes and Treatment Regimens (TENOR) population of severe or difficult to treat asthmatics to search for novel genes and to confirm previously identified genes involved in asthma. The purpose of the TENOR study was to investigate the natural history of asthma in a large cohort of well characterized asthmatics with severe or difficult to treat asthma; no treatment intervention was involved and patients continued to be treated by their asthma specialist [17–19].
The TENOR study was a multi-center observational and longitudinal cohort study of 4,756 asthmatics described as “severe or difficult-to-treat” by their physicians, sponsored by Genentech and Novartis . Subjects were included if they had physician-characterized difficult-to-treat asthma, and met additional criteria based on frequency of urgent care visits and/or the use of multiple controller medications. The clinical sites from the original TENOR study were contacted and invited to participate in this study. Sites that agreed were mailed Oragene DNA saliva collection kits (DNA Genotek, Inc), labeled with the TENOR participant ID. Sites then mailed the kits to participating individuals, who sent their collected samples to the Center for Human Genomics at Wake Forest University School of Medicine. This process was required to maintain anonymity between investigators at Wake Forest University and the study participants. Unfortunately the TENOR study had ended (end of 2004) before this project started so it was difficult to re-contact participants. 607 samples had sufficient DNA for successful SNP genotyping. Table 1 shows the demographic data for the TENOR cases and the two control populations. The TENOR asthmatics genotyped were similar in characteristics to the larger TENOR cohort.
General population controls were obtained using the Illumina iControlDB client to download genotypes for 3,294 Caucasian individuals with genotype data available from any of the three available HumanHap550k products (v1, v3, and −2v3). As shown in Table 1, only age and gender data are available. Additional control samples for asthma-related quantitative traits were obtained from a separate GWAS for asthma. These 363 phenotyped controls had no personal or family history of asthma and had normal pulmonary function including lack of bronchial hyperresponsiveness or bronchodilator reversibility. Testing also included measures of atopy including total serum IgE levels (Table 1). HapMap samples (N = 262) to be used for genetic ancestry check were also downloaded from the iControlDB database (Illumina, Inc.) after selecting the HumanHap300_v1 genotyping product.
DNA was isolated using the protocol described by DNA Genotek, and SNP genotyping was performed using the Illumina HumanCNV370 BeadChip. The samples were clustered by first applying Illumina’s cluster definition, removing samples with call rates less than 0.90, and then re-clustering using the samples themselves.
Quality control (QC) was applied to cases and controls separately since they were genotyped using slightly different Illumina products. Genetic ancestry of the TENOR cases was determined using the HapMap 300k dataset as a reference. Clustering fixed at three groups and a pairwise population concordance (PPC) threshold of 1.0E-05 based on identity-by-state (IBS) were used to cross-validate ethnic group identity. Subjects were removed if they 1) were not of European white descent, 2) had low genotyping call rates (< 95%), 3) were discrepant or ambiguous for genetic sex (heterozygous haploid genotype percentage ≥ 0.01 or X chromosome homozygosity F ≥ 0.9), 4) failed the cryptic relatedness check (PI_HAT > 0.125), or 5) were detected as an outlier (> 6 standard deviations for the first or second principal component). After subjects meeting these criteria were deleted, SNPs were deleted if the call rates were low (< 95%) or if genotypes were inconsistent with Hardy-Weinberg Equilibrium (HWE) (P < 10E-04). The same QC was then applied to the subjects and SNPs of the merged case-control dataset. SNPs were also deleted if the minor allele frequency (MAF) was less than 0.05 in cases and controls or the HWE P value was less than 0.01 in controls only.
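The SNP-level filters in this paragraph (call rate, minor allele frequency, and HWE) can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names are hypothetical, the HWE test shown is a 1-df chi-square approximation chosen for brevity (the text does not state which HWE test was applied), and the HWE cutoff is a required argument rather than a default because the threshold is given in the text only in E-notation.

```python
import math


def call_rate(n_called, n_missing):
    """Fraction of samples with a non-missing genotype at one SNP."""
    total = n_called + n_missing
    return n_called / total if total else 0.0


def minor_allele_freq(n_aa, n_ab, n_bb):
    """Minor allele frequency from genotype counts (AA, AB, BB)."""
    n = n_aa + n_ab + n_bb
    freq_a = (2 * n_aa + n_ab) / (2 * n)
    return min(freq_a, 1.0 - freq_a)


def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P value of a 1-df chi-square goodness-of-fit test against HWE.

    Assumption: a chi-square approximation, used here only for illustration.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a 1-df chi-square: P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def snp_passes_qc(n_aa, n_ab, n_bb, n_missing, hwe_cutoff,
                  min_call_rate=0.95, min_maf=0.05):
    """Return True if a SNP survives the call-rate, MAF, and HWE filters."""
    n_called = n_aa + n_ab + n_bb
    if n_called == 0:
        return False
    if call_rate(n_called, n_missing) < min_call_rate:
        return False
    if minor_allele_freq(n_aa, n_ab, n_bb) < min_maf:
        return False
    return hwe_chi2_pvalue(n_aa, n_ab, n_bb) >= hwe_cutoff
```

The genotype counts passed to `snp_passes_qc` would come from whichever group the filter is meant for, for example controls only when applying the stricter controls-only HWE cutoff.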
Asthma susceptibility was analyzed by comparing the non-Hispanic white TENOR cases to the general population Illumina controls. To reduce population stratification, four controls were matched to each case based on pairwise IBS. Principal components were generated using principal components analysis (PCA) in EIGENSTRAT (version 3.0, URL: http://genepath.med.harvard.edu/~reich/Software.htm). Sex, age, and significant principal components were used as covariates in the logistic additive model. Genomic control (GC) was applied to P values to reduce population stratification further. A linear model was analyzed in GWAS-identified candidate regions in 473 TENOR cases and 363 phenotyped controls for asthma-related quantitative traits (total serum IgE, % predicted FEV1, FVC, and FEV1/FVC).
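As an illustration of the genomic control step, the following Python sketch shows one common way to estimate an inflation factor from a set of P values and to deflate a single P value by it. It is a sketch under stated assumptions rather than the software used in this study: the helper names are hypothetical, test statistics are treated as 1-df chi-square values, and inflation factors below 1 are clamped to 1.

```python
import math
from statistics import NormalDist, median

CHI2_1DF_MEDIAN = 0.4549364  # median of a 1-df chi-square distribution
_STD_NORMAL = NormalDist()


def p_to_chi2(p):
    """Convert a two-sided P value (0 < p <= 1) to a 1-df chi-square value."""
    # Using the lower tail (p / 2) keeps precision for very small P values.
    z = _STD_NORMAL.inv_cdf(p / 2.0)
    return z * z


def chi2_to_p(chi2):
    """Upper-tail P value of a 1-df chi-square statistic."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def inflation_factor(p_values):
    """Genomic inflation factor: median observed chi-square over expected."""
    return median(p_to_chi2(p) for p in p_values) / CHI2_1DF_MEDIAN


def gc_adjust(p, lam):
    """Deflate one P value by lambda (assumption: no change if lambda <= 1)."""
    return chi2_to_p(p_to_chi2(p) / max(lam, 1.0))


# Values reported in this paper: top SNP P = 3.04E-07, inflation factor 1.073.
adjusted = gc_adjust(3.04e-07, 1.073)
print(f"GC-adjusted P = {adjusted:.2e}")
```

With the reported unadjusted P value and inflation factor as inputs, the result should land close to the reported GC-adjusted P of 7.69E-07; any small gap is attributable to rounding of the inflation factor.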
Haploview (URL: http://www.broad.mit.edu/mpg/haploview/) was used to generate linkage disequilibrium plots. The 95% confidence intervals on D’ were used to define blocks. SNAP (version 2.0, URL: http://www.broad.mit.edu/mpg/snap/) was used to generate the association plots. Imputation was performed based on HapMap II CEU genotype data using MACH (version 1.0, URL: http://www.sph.umich.edu/csg/abecasis/MaCH/index.html). Association of candidate SNPs with nearby gene expression data in lymphocytes was tested using the GENEVAR dataset (URL: http://www.sanger.ac.uk/humgen/genevar/) with WGAViewer.
A total of 607 TENOR cases were genotyped with the HumanCNV370 BeadChip. After removal of non-white samples (see Figure E1 in the Online Repository) and removal based on the QC criteria described above, data from 474 asthmatics were carried forward to analysis. Of the 3,294 Illumina Caucasian controls downloaded from iControlDB, 3,141 passed QC. After merging the 474 TENOR cases with the 3,141 Illumina controls and evaluating the combined QC metrics, 473 cases and 3,106 controls were retained. To reduce population stratification, four controls were matched to each case based on pairwise IBS; thus 473 cases and 1,892 Illumina controls were used for the GWAS (see Table I for demographics and Figure E2 in the Online Repository). After QC analysis of the 318,075 common SNPs, 292,443 SNPs were retained for the GWAS.
The GWAS of asthma was performed on 292,443 SNPs in 473 TENOR cases and 1,892 Illumina controls with sex, age, and significant principal components as covariates in the logistic additive model (see Figure 1). Genomic control (GC) was applied to P values to reduce population stratification (genomic inflation factor = 1.073 and 1.000 before and after adjustment, see Figure E3 in the Online Repository). In total, 248 SNPs had GC-adjusted P values ≤ 1.0E-03 (see Table E1 in the Online Repository). Focusing on SNPs with a GC-adjusted P value ≤ 1.0E-04 and at least two neighboring SNPs (+/− 100 kb) with GC-adjusted P values ≤ 1.0E-03, six regions were identified: RAD50-IL13 on chromosome 5q31.1, HLA-DR/DQ on 6p21.32, low density lipoprotein-related protein 1B (LRP1B) on 2q22.1-22.2, sorting nexin 10 (SNX10) on 7p15.2, carbonic anhydrase X (CA10) on 17q21.33, and potassium inwardly-rectifying channel, subfamily J, member 2 (KCNJ2) on 17q24.3 (see Figure 1 and Table E1 in the Online Repository).
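The region-selection rule described above (a lead SNP at P ≤ 1.0E-04 supported by at least two neighbors within 100 kb at P ≤ 1.0E-03) can be written as a short Python sketch. The function name, the tuple layout, and the toy SNP identifiers and values are all hypothetical and exist only to show the logic; they are not data from this study.

```python
def candidate_lead_snps(snps, lead_p=1.0e-4, support_p=1.0e-3,
                        window_bp=100_000, min_support=2):
    """Return IDs of lead SNPs that have enough supporting neighbors.

    snps: iterable of (snp_id, chromosome, position_bp, gc_adjusted_p).
    A neighbor is another SNP on the same chromosome within window_bp.
    """
    snps = list(snps)
    leads = []
    for snp_id, chrom, pos, p in snps:
        if p > lead_p:
            continue
        support = sum(
            1
            for other_id, other_chrom, other_pos, other_p in snps
            if other_id != snp_id
            and other_chrom == chrom
            and abs(other_pos - pos) <= window_bp
            and other_p <= support_p
        )
        if support >= min_support:
            leads.append(snp_id)
    return leads


# Hypothetical toy input, not results from this study.
toy = [
    ("snpA", "5", 1_000_000, 5e-5),   # lead with two supporting neighbors
    ("snpB", "5", 1_050_000, 8e-4),
    ("snpC", "5", 950_000, 9e-4),
    ("snpD", "6", 2_000_000, 2e-5),   # lead without enough support
    ("snpE", "6", 2_300_000, 5e-4),   # too far away (300 kb)
    ("snpF", "6", 2_020_000, 0.2),    # close but not significant
]
leads = candidate_lead_snps(toy)
```

The pairwise scan is quadratic in the number of SNPs, which is acceptable for a pre-filtered list such as the 248 SNPs at P ≤ 1.0E-03 but would call for a sorted, windowed pass on a full genome-wide set.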
The RAD50-IL13 region had the strongest evidence for association (see Table II and Figure 2A) with multiple SNPs in this region strongly associated with asthma susceptibility (rs2244012, rs6871536, and rs2897443 in RAD50 ranked highly as 1, 2, and 4, respectively in this study). rs2244012 in intron 2 of RAD50 had an odds ratio of 1.64 (95% CI: 1.36 – 1.97; P = 3.04E-07; GC-adjusted P = 7.69E-07). Three SNPs in or near IL13 (rs2243204 (3’ downstream), rs20541 (Arg130Gln), and rs1295686 (intron 3)) were also associated with asthma (P < 0.001), but in weak LD (0.2 < r2 < 0.3) with SNPs in RAD50 (see Figure 2A and 2B). rs2243300, which is ~5kb upstream of IL4, was weakly associated with asthma (P = 0.0032). Six SNPs downstream of IL5 were not associated with asthma, although they were in weak LD with SNPs in RAD50 (see Table II, Figure 2A and 2B). Four LD blocks were identified based on 95% confidence interval of D’ (see Figure 2B) . Blocks 1 and 2 were each composed of three SNPs downstream of IL5. Block 3 was composed of four SNPs in the intron of RAD50. Block 4 was composed of two SNPs in IL13.
Linear model analysis was performed with the 14 SNPs of the TH2 cytokine locus in 473 TENOR cases and 363 phenotyped controls for asthma-related quantitative traits (total serum IgE, FEV1, FVC, and FEV1/FVC) (see Table II). Multiple SNPs in each of RAD50, IL13, and IL4, but not IL5, showed significant association (P ≤ 0.05) with asthma-related quantitative traits.
The HLA-DR/DQ region (see Table III and Figure 3A) also showed consistent association with asthma. rs1063355 in the 3’ UTR of HLA-DQB1 had an odds ratio of 0.68 (95% CI: 0.58 – 0.81; P = 9.55E-06; GC-adjusted P = 1.93E-05). Ten of the 46 SNPs in the HLA-DR/DQ region had P ≤ 0.001 (see Table III and Figure 3A). Multiple SNPs in or near butyrophilin-like 2 (BTNL2), HLA-DRA, HLA-DRB1, HLA-DQB1, and HLA-DQA2 were strongly associated with asthma (P < 10E-04). LD is complicated in this region when considering all 46 SNPs (data not shown). One LD block composed of three SNPs upstream of HLA-DQB1 was formed based on 95% confidence interval of D’ of these 10 SNPs (see Figure 3B).
Linear model analysis was performed with the 10 SNPs of the HLA-DR/DQ region for asthma-related quantitative traits (see Table III). A single SNP, rs1063355, on HLA-DQB1 showed significant association with asthma-related quantitative traits (P = 0.01, 0.001, 0.007, and 0.05 for total serum IgE, FEV1, FVC, and FEV1/FVC, respectively).
The most strongly associated SNP identified in this study was rs2244012 in intron 2 of RAD50 (P = 3.04E-07). In addition, evidence was observed for association of multiple SNPs in the RAD50-IL13 region with asthma susceptibility and asthma-related quantitative traits. The protein encoded by RAD50 is involved in DNA double-strand break repair and its expression level is constitutively low in most tissues; thus it has no known function directly related to asthma, although the MRE11-RAD50-NBS1 complex has been shown to be involved in somatic hypermutation and gene conversion of immunoglobulin regions. In contrast, other genes (IL4, IL5, and IL13) in the TH2 cytokine locus are better candidates based on their biologic functions. Three SNPs in IL13 in this study were associated with asthma. IL13 is critical to the pathogenesis of allergen-induced asthma and is thus one of the most highly studied and replicated genes in both genome-wide linkage and candidate-gene association studies. rs20541 (Arg130Gln or IL13+4257GA), in the coding region of IL13, was also analyzed in this study and has been shown to be associated with asthma and total serum IgE levels. rs1800925 (IL13-1111CT), in the promoter region of IL13, has been shown to be associated with asthma and total serum IgE levels. In a GWAS of total serum IgE levels, four SNPs in RAD50 (rs2706347, rs3798135, rs2040704, and rs7737470) were identified (P < 10E-04). These four SNPs in RAD50 were in strong LD with rs1800925 (0.7 < r2 < 0.8) and in weak LD with rs20541 (0.2 < r2 < 0.3) in IL13. These results are consistent with the results of this study, since many of the TENOR asthmatics were recruited from allergists’ offices and the population has increased IgE levels. Since the actual functional SNPs cannot be determined purely by their P values, it is difficult to dissect the association data of RAD50 from IL13 in this study or other genetic studies due to the degree of LD present in this chromosomal region.
In a transgenic mouse study, a TH2 locus control region (LCR) was identified as the 25 kb fragment at the 3’ end of Rad50. An LCR is defined experimentally as regulating the expression of linked genes in a copy number dependent and tissue-specific manner. The TH2 LCR is involved in the chromatin configuration that brings the promoters of IL4, IL5, and IL13 into proximity and in the co-regulation of TH2 cytokine expression. Seven Rad50 DNase I hypersensitive sites (RHS1-7) were identified, where RHS4-7 formed the core of the LCR. LCR-C (RHS7) and LCR-B (RHS6) were possible TH2 cytokine expression enhancers; LCR-A (RHS6) and LCR-O (RHS5) were likely insulators. RHS7 is essential for TH2 cytokine expression, as shown by TH2-specific demethylation after allergen stimulation and by intrachromosomal interactions between the LCR and the promoters of TH2 cytokines. Furthermore, RHS6, the Rad50 promoter (RHS2), and the IL5 promoter interacted with interferon gamma (Ifng) on a different chromosome, which suggests an interchromosomal regulation of the expression of TH1/TH2 cytokines. Although all the above experiments were done in mice, the RAD50 sequence is highly conserved in the LCR between human and mouse. With imputation, multiple significant SNPs were found in the LCR (see Table E2 in the Online Repository): rs3798135 (P = 1.49E-06, in RHS5/LCR-O), rs12653750 (P = 1.49E-06, in RHS6/LCR-A), rs2040704 (P = 1.33E-06, in RHS6/LCR-B), and rs2240032 (P = 6.68E-06, in RHS7/LCR-C). The association of rs2244012 with the expression levels of IL13 in lymphocytes from white adults based on the GENEVAR dataset was not significant (P = 0.176), which may be due to the small sample size.
Since both a previous GWAS of total serum IgE levels and our GWAS of asthma identified RAD50, it appears to be a new candidate gene for asthma. Although it is still possible that the signal from RAD50 is purely due to its LD with the promoter of IL13, RAD50 deserves careful study when considering the TH2 cytokine locus.
HLA-DR/DQ also showed consistent association with asthma, for example, rs1063355 in the 3’ UTR of HLA-DQB1 (P = 9.55E-06), rs2239804 in an intron of HLA-DRA (P = 2.80E-05), and rs2516049 5’ upstream of HLA-DRB1 (P = 2.62E-05). HLA-DR/DQ is part of the HLA class II region, which is one of the most gene/variant dense regions in the human genome and is associated with many diseases. HLA-DQB1 and HLA-DRB1 have been shown to be associated with asthma in multiple independent studies [42–44]. Genetic variants in the HLA-DR/DQ region have also been shown to be highly associated with HLA-DR/DQ gene expression, indicating that the association of HLA-DR/DQ with disease might be due to gene expression levels in addition to antigen recognition [45, 46]. The association of rs2516049 with asthma in our study and with the expression levels of HLA-DRB1 (P = 1.25E-04) in lymphocytes from white adults based on the GENEVAR dataset indicated that the variant might function through expression level changes (see Figure E4 in the Online Repository) [28, 29]. Imputation identified a SNP with a more significant P value, rs3998159 (P = 1.45E-06), between HLA-DQB1 and HLA-DQA2. It is difficult to determine the functional genes/SNPs in the HLA-DR/DQ region in our study due to the complicated LD pattern in this region. The long-range LD and haplotype analysis based on the MHC Haplotype Project may resolve this issue.
Using a GWAS approach, this study is the first to confirm the association of the RAD50-IL13 and HLA-DR/DQ regions with asthma susceptibility, regions which have been identified by multiple candidate-gene association studies and one genome-wide association study on total serum IgE levels. Our results weakly replicated the findings of the other GWAS: ORMDL3 and gasdermin B (GSDML) (rs7216389) with asthma (P = 0.057); FCER1A (rs2251746) with total serum IgE (P = 0.040); CHI3L1 (rs880633) with FEV1 (P = 0.003), FVC (P = 0.031), and FEV1/FVC (P = 0.040). The associations of rs1588265 (P = 0.507) and rs1544791 (P = 0.678) in PDE4D with asthma were not replicated. The GWAS of total serum IgE by Weidinger identified several SNPs in RAD50 (P < 10E-04). In our study, the most significant SNP in RAD50 for total serum IgE is rs6871536 (P = 2.61E-03). The geometric mean of total serum IgE in Weidinger’s study is 42.41 (95% CI: 39.56 – 45.47). In our study, the geometric mean of total serum IgE is higher, 48.94 (95% CI: 43.04 – 55.65). The difference in the total serum IgE distribution and the relatively small sample size in our study may explain the difference in significance levels between these two studies.
False negative results could not be ruled out in this study because of the relatively small sample size (473 cases). This may also explain why, although P values on the order of 1.0E-07 were observed, no SNP reached the Bonferroni-adjusted multiple-testing criterion (P = 0.05/292,443 = 1.71E-07). However, evidence for association at multiple SNPs was observed in this comprehensively phenotyped, relatively homogeneous cohort of difficult-to-treat asthmatics from the larger TENOR study. Our control datasets (general population and phenotyped controls) both have some limitations. They were both significantly younger (see Table I) than the TENOR cases, making our results somewhat conservative because some controls might become asthma cases in the future. Genotyping confirmation and fine-mapping of candidate regions were impossible since the Illumina controls were from a public database, but our approach compensated for this by using imputation. Population stratification was relatively strong between TENOR cases and Illumina 550k controls.
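The Bonferroni criterion quoted in this paragraph is a single division, shown here as a small Python check against the top association reported in this study. The function and variable names are illustrative only.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold for a family-wise error rate alpha."""
    return alpha / n_tests


# Numbers taken from the text: 292,443 SNPs tested, top SNP P = 3.04E-07.
threshold = bonferroni_threshold(0.05, 292_443)
top_p = 3.04e-07
passes = top_p <= threshold

print(f"threshold = {threshold:.2e}, top SNP passes: {passes}")
# prints "threshold = 1.71e-07, top SNP passes: False"
```

The top P value misses the threshold by less than a factor of two, which is consistent with the authors' point that the shortfall may reflect sample size rather than absence of a true signal.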
This GWAS confirmed the important role of TH2 cytokine and antigen presentation genes in asthma at a genome-wide level. Furthermore, these findings will stimulate more comprehensive research (e.g., re-sequencing, long-range LD, epistasis, epigenetics, copy number variant, and function) on these two regions due to their functional importance and structural complexity.
GWAS of asthma identifies RAD50-IL13 and HLA-DR/DQ. These findings will stimulate more comprehensive research on these genes because of their structural complexity and functional importance in the pathogenesis of asthma.
We would like to thank Dr. Elizabeth J. Ampleford for analytical assistance. We would also like to acknowledge the TENOR/SARP/CSGA/STAMPEED Study Group and the TENOR/SARP/CSGA/STAMPEED participants who contributed DNA samples.
Declaration of all sources of funding: The clinical TENOR study was supported by Genentech, Inc. and Novartis Pharmaceuticals Corporation, and this genetic study was funded by NIH HL76285 and HL87665.
Publisher's Disclaimer: This is a PDF file of an unedited manuscript that has been accepted for publication. As a service to our customers we are providing this early version of the manuscript. The manuscript will undergo copyediting, typesetting, and review of the resulting proof before it is published in its final citable form. Please note that during the production process errors may be discovered which could affect the content, and all legal disclaimers that apply to the journal pertain.
This study confirmed the association of the candidate genes: RAD50-IL13 and HLA-DR/DQ with asthma susceptibility at the genome-wide level and provides confirmation of the important role of TH2 cytokine and antigen presentation genes in asthma.